wordscorefeatures.java
From "CRF classifier, a very good research tool; a very good tool for Chinese information processing" · Java code · 50 lines
package iitb.Model;

import iitb.CRF.*;
import java.util.*;
import java.io.*;

/**
 * These return one feature per state. The value of the feature is the
 * log of the fraction of training instances passing through this state
 * that contain the word.
 *
 * @author Sunita Sarawagi
 */
public class WordScoreFeatures extends FeatureTypes {
    int stateId;
    int wordPos;
    WordsInTrain dict;

    public WordScoreFeatures(FeatureGenImpl m, WordsInTrain d) {
        super(m);
        dict = d;
    }

    private void nextStateId() {
        stateId = dict.nextStateWithWord(wordPos, stateId);
    }

    public boolean startScanFeaturesAt(DataSequence data, int prevPos, int pos) {
        stateId = -1;
        if (dict.count(data.x(pos)) > WordFeatures.RARE_THRESHOLD) {
            Object token = (data.x(pos));
            wordPos = dict.getIndex(token);
            stateId = -1;
            nextStateId();
            return true;
        }
        return false;
    }

    public boolean hasNext() {
        return (stateId < model.numStates()) && (stateId >= 0);
    }

    public void next(FeatureImpl f) {
        setFeatureIdentifier(stateId, stateId, "S", f);
        f.yend = stateId;
        f.ystart = -1;
        f.val = (float) Math.log(((double) dict.count(wordPos, stateId)) / dict.count(stateId));
        // System.out.println(f.toString());
        nextStateId();
    }
}
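The feature value computed in next() is the log of a count ratio: how often the word was seen in a state, divided by how often that state was seen at all. The following is a minimal standalone sketch of that arithmetic only. WordScoreSketch, observe and score are hypothetical names introduced for illustration; they are not part of the iitb package, and the real WordsInTrain class may store and index its counts differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two counts the feature value relies on:
// count(word, state) and count(state). Not the real WordsInTrain API.
class WordScoreSketch {
    private final Map<String, int[]> wordStateCounts = new HashMap<>();
    private final int[] stateCounts;

    WordScoreSketch(int numStates) {
        stateCounts = new int[numStates];
    }

    // Record one training token labelled with the given state.
    void observe(String word, int state) {
        wordStateCounts.computeIfAbsent(word, w -> new int[stateCounts.length])[state]++;
        stateCounts[state]++;
    }

    // log( count(word, state) / count(state) ), mirroring the expression in next().
    // Assumes the word was seen at least once in this state; otherwise the
    // result is -Infinity (or NaN if the state itself was never seen).
    float score(String word, int state) {
        int[] counts = wordStateCounts.get(word);
        int joint = (counts == null) ? 0 : counts[state];
        return (float) Math.log(((double) joint) / stateCounts[state]);
    }
}
```

For example, if state 0 was observed four times and one of those tokens was "paris", score("paris", 0) is log(1/4). The cast to double before dividing matters: with integer division the ratio would truncate to 0 for any word that does not account for every token in the state.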