📄 target2bioformat.java

📁 Common machine learning algorithms implemented in Java, including common classification algorithms, with documentation
💻 JAVA
/* Copyright (C) 2002 Univ. of Massachusetts Amherst, Computer Science Dept.
   This file is part of "MALLET" (MAchine Learning for LanguagE Toolkit).
   http://www.cs.umass.edu/~mccallum/mallet
   This software is provided under the terms of the Common Public License,
   version 1.0, as published by http://www.opensource.org.  For further
   information, see the file `LICENSE' included with this distribution. */

/**
   @author Aron Culotta <a href="mailto:culotta@cs.umass.edu">culotta@cs.umass.edu</a>
 */

package edu.umass.cs.mallet.base.pipe.tsf;

import edu.umass.cs.mallet.base.types.*;
import edu.umass.cs.mallet.base.pipe.*;
import java.io.*;

/**
	 Creates a {@link LabelSequence} out of a {@link TokenSequence} that
	 is the target of an {@link Instance}. Labels are constructed out of
	 each Token in the TokenSequence to conform with BIO format (Begin,
	 Inside, Outside of Segment). Prepends a "B-" to non-background Tokens
	 whose tag differs from the previous Token's tag (including Tokens that
	 leave the background state) and an "I-" to Tokens that have the same
	 tag as the previous Token. NOTE: This class assumes that subsequent
	 identical tags belong to the same Segment. This means that you
	 cannot have B B I, only B I I.
 */
public class Target2BIOFormat extends Pipe implements Serializable
{
	String backgroundTag;

	public Target2BIOFormat ()
	{
		super (null, LabelAlphabet.class);
		backgroundTag = "O";
	}

	/**
		 @param background represents Tokens that are not part of a target
		 Segment.
	 */
	public Target2BIOFormat (String background)
	{
		super (null, LabelAlphabet.class);
		this.backgroundTag = background;
	}

	public Instance pipe (Instance carrier)
	{
		Object target = carrier.getTarget();
		if (target instanceof TokenSequence) {
			Alphabet v = getTargetAlphabet ();
			TokenSequence ts = (TokenSequence) target;
			int[] indices = new int[ts.size()];
			// Start in the background state so that the first
			// non-background Token receives a "B-" prefix.
			String previousString = this.backgroundTag;
			for (int i = 0; i < ts.size(); i++) {
				String s = ts.getToken (i).getText ();
				String tag = s;
				// Background Tokens are kept unchanged.
				if (!tag.equals (this.backgroundTag)) {
					if (tag.equals (previousString))
						tag = "I-" + tag;
					else
						tag = "B-" + tag;
				}
				indices[i] = v.lookupIndex (tag);
				// Compare against the raw (unprefixed) tag on the next iteration.
				previousString = s;
			}
			LabelSequence ls = new LabelSequence ((LabelAlphabet)getTargetAlphabet(), indices);
			carrier.setTarget(ls);
		} else {
			throw new IllegalArgumentException ("Unrecognized target type.");
		}
		return carrier;
	}

	// Serialization

	private static final long serialVersionUID = 1;
	private static final int CURRENT_SERIAL_VERSION = 0;

	private void writeObject (ObjectOutputStream out) throws IOException {
		out.writeInt (CURRENT_SERIAL_VERSION);
		out.writeObject (backgroundTag);
	}

	private void readObject (ObjectInputStream in) throws IOException, ClassNotFoundException {
		int version = in.readInt ();
		backgroundTag = (String) in.readObject ();
	}
}
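
The tagging rule in pipe() can be tried without MALLET on the classpath. The following is a minimal sketch, assuming the tags arrive as plain strings rather than as a TokenSequence; the class BioTagSketch and the method toBio are hypothetical names used only for illustration and are not part of MALLET.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; not part of MALLET.
final class BioTagSketch {

    // Applies the same rule as the loop in Target2BIOFormat.pipe,
    // but to plain strings instead of Tokens and alphabet indices.
    static List<String> toBio(List<String> tags, String backgroundTag) {
        List<String> out = new ArrayList<>(tags.size());
        // Start in the background state, as pipe() does.
        String previous = backgroundTag;
        for (String tag : tags) {
            if (tag.equals(backgroundTag)) {
                out.add(tag);            // background tags stay unchanged
            } else if (tag.equals(previous)) {
                out.add("I-" + tag);     // same tag as the previous token
            } else {
                out.add("B-" + tag);     // tag differs from the previous token
            }
            previous = tag;              // compare against the raw tag next time
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("O", "PER", "PER", "O", "LOC", "PER");
        System.out.println(toBio(raw, "O"));
        // prints [O, B-PER, I-PER, O, B-LOC, B-PER]
    }
}
```

Because adjacent identical tags are always merged into one segment, two consecutive PER entities with no background token between them come out as B-PER I-PER rather than B-PER B-PER, which is the limitation the class comment notes.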
