// PerceptronClassifier.java
package com.aliasi.classify;

import com.aliasi.corpus.ClassificationHandler;
import com.aliasi.corpus.Corpus;

import com.aliasi.matrix.DenseVector;
import com.aliasi.matrix.KernelFunction;
import com.aliasi.matrix.SparseFloatVector;
import com.aliasi.matrix.Vector;

import com.aliasi.symbol.MapSymbolTable;
import com.aliasi.symbol.SymbolTable;

import com.aliasi.util.AbstractExternalizable;
import com.aliasi.util.Arrays;
import com.aliasi.util.Compilable;
import com.aliasi.util.FeatureExtractor;
import com.aliasi.util.ObjectToDoubleMap;
import com.aliasi.util.SmallObjectToDoubleMap;

import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.Serializable;

import java.util.ArrayList;
import java.util.List;
import java.util.HashMap;
import java.util.Map;

/**
 * A <code>PerceptronClassifier</code> implements a binary classifier
 * based on an averaged kernel-based perceptron.  These classifiers
 * are large margin (discriminative) linear classifiers in a feature
 * space expanded by a plug-and-play kernel implementing
 * {@link KernelFunction}.
 *
 * <p>A perceptron classifier may be applied to any type of object.
 * An appropriately typed {@link FeatureExtractor} is used to map
 * these objects to feature vectors for use in the perceptron
 * classifier.
 *
 * <h3>Corpus Training</h3>
 *
 * <p>Unlike the language-model-based classifiers, for which training,
 * classification and compilation may be interleaved, averaged
 * perceptron-based classifiers require batch training.  This requires
 * the entire training corpus to be available in one shot.  In
 * particular, training will iterate over a fixed corpus multiple
 * times before producing a completely trained classifier.
 *
 * <p>The constructor will do the training using the supplied instance
 * of {@link Corpus}.  The constructor will store the entire corpus in
 * memory in the form of {@link SparseFloatVector} and boolean
 * polarities for the accept/reject decision.  The corpus will only be
 * held locally in the constructor; it is available for garbage
 * collection, as are all intermediate results, as soon as the
 * constructor is done training.
 *
 * <h3>Kernel Function</h3>
 *
 * <p>The basic (non-kernel) perceptron is equivalent to using the
 * kernel function {@link com.aliasi.matrix.DotProductKernel}.  A good
 * choice for most classification tasks is the polynomial kernel,
 * implemented in {@link com.aliasi.matrix.PolynomialKernel}.
 * Higher polynomial kernel degrees usually perform dramatically
 * better than plain dot products.  Degree 3 is a good general
 * starting point, as performance sometimes degrades at higher
 * degrees.  In some cases, the Gaussian radial basis kernel
 * implemented in {@link com.aliasi.matrix.GaussianRadialBasisKernel}
 * works well.
 *
 * <p>If the kernel function is neither serializable nor compilable,
 * then the resulting perceptron classifier will not be serializable.
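 *
 * <p>As a concrete illustration of these two plug-in points, the
 * following minimal sketch sets up a feature extractor and a degree-3
 * polynomial kernel.  The anonymous whitespace-token extractor and
 * its token-presence features are illustrative assumptions, not part
 * of this class's API:
 *
 * <blockquote><pre>
 * // hypothetical extractor: one presence feature per whitespace token
 * FeatureExtractor<String> extractor = new FeatureExtractor<String>() {
 *     public Map<String,? extends Number> features(String in) {
 *         Map<String,Integer> feats = new HashMap<String,Integer>();
 *         for (String token : in.split("\\s+"))
 *             feats.put(token, Integer.valueOf(1));
 *         return feats;
 *     }
 * };
 * KernelFunction kernel = new PolynomialKernel(3); // degree 3, as suggested above
 * </pre></blockquote>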
 *
 * <h3>Training Iterations</h3>
 *
 * <p>More training iterations are usually better for accuracy.  As
 * more basis vectors are added to the perceptron, more memory is
 * needed for the model and more time is needed for classification.
 * Typically, the amount of memory required at run time will
 * stabilize after a few training iterations.
 *
 * <h3>Memory Usage: Training and Compiled</h3>
 *
 * <p>The memory usage of a perceptron classifier may be very high.
 * The perceptron must store every feature vector in the input on
 * which the classifier being trained made a mistake in some
 * iteration.  During training, every feature vector in the input
 * corpus must be stored.  These are stored in memory as instances of
 * {@link SparseFloatVector}.
 *
 * <p>If the data are linearly separable in the kernel space, the
 * training process will converge to the point where no additional
 * basis feature vectors are needed and the result will converge to
 * using the single final perceptron (which still requires all the
 * intermediate basis vectors to be stored, given the demands of a
 * non-linear kernel calculation).
 *
 * <h3>Serialization</h3>
 *
 * <p>After a perceptron classifier is constructed, it may be
 * serialized to an object output stream.  If the underlying feature
 * extractor is compilable, it is compiled; if it is not compilable,
 * it is serialized.  To be serializable, a perceptron classifier
 * requires both its feature extractor and kernel function to
 * implement one of {@link Serializable} or
 * {@link com.aliasi.util.Compilable}.
 *
 * <p>The object read back in after serialization will be an instance
 * of <code>PerceptronClassifier</code>.
 *
 * <h3>About Averaged Kernel Perceptrons</h3>
 *
 * <p><a href="http://en.wikipedia.org/wiki/Perceptron">Perceptrons</a>
 * are a kind of large-margin linear classifier.  The polynomial
 * <a href="http://en.wikipedia.org/wiki/Kernel_trick">kernel trick</a>
 * is used to embed the basic feature vector in a higher-degree vector
 * space in which the data are more separable.
 *
 * <p>An average of all of the perceptrons created during training is
 * used for the final prediction.  The factored formulation of the
 * algorithm allows the perceptrons to be expressed as linearly
 * weighted training samples.
 *
 * <p>Although theoretical bounds are almost equivalent, in practice
 * averaged perceptrons slightly underperform
 * <a href="http://en.wikipedia.org/wiki/Support_vector_machine">support
 * vector machine</a> (SVM) learners over the same polynomial kernels.
 * The advantage of perceptrons is that they are much more efficient
 * in time and slightly more efficient in space in practice.
 *
 * <h3>Averaged Perceptron Model with Polynomial Kernel</h3>
 *
 * <p>The model used for runtime predictions by the averaged
 * perceptron is quite straightforward, consisting of a set of
 * weighted feature vectors (represented as parallel arrays of basis
 * vectors and weights) and a kernel degree:
 *
 * <blockquote><pre>
 * Vector[] basisVectors;
 * int[] weights;
 * int degree;</pre></blockquote>
 *
 * The basis vectors are all vectors derived from single training
 * examples by the specified feature extractor.  The weights may be
 * positive or negative and represent the cumulative voted weight of
 * the specified basis vector.
 *
 * <p>The kernel function computes a distance
 * <code>kernel(v1,v2)</code> between vectors <code>v1</code> and
 * <code>v2</code> in an enhanced feature space defined by the
 * particular kernel employed.
 *
 * <p>A new input to classify is first converted to a feature vector
 * by the feature extractor.  Classification is then based on the sign
 * of the following score:
 *
 * <blockquote><pre>
 * score(Vector v) = <big><big><big>Σ</big></big></big><sub><sub>i</sub></sub> weights[i] * kernel(basisVectors[i],v)</pre></blockquote>
 *
 * An example is accepted if the score of its feature vector is
 * greater than zero and rejected otherwise.
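 *
 * <p>In code, the score is just the weighted sum of kernel values
 * against the stored basis vectors.  The following minimal sketch
 * assumes the kernel value is evaluated through a method
 * <code>proximity(v1,v2)</code> on the kernel function:
 *
 * <blockquote><pre>
 * double score(Vector v) {
 *     double sum = 0.0;
 *     for (int i = 0; i < basisVectors.length; ++i)
 *         sum += weights[i] * kernel.proximity(basisVectors[i], v);
 *     return sum;  // accept if sum > 0, reject otherwise
 * }</pre></blockquote>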
 *
 * <h3>Estimating the Perceptron Model</h3>
 *
 * <p>To estimate the perceptron model, we will assume that we have a
 * training corpus consisting of an array of vectors with boolean
 * polarities indicating whether they are positive (to accept) or
 * negative (to reject) examples.  We also assume we have a fixed
 * kernel function.  The training method iterates over the corpus a
 * specified number of times.
 *
 * <blockquote><pre>
 * Vector[] basisVectors;
 * int[] incrementalWeights;
 * boolean[] polarities;
 * int degree;
 * int index = -1;
 * for (# iterations)
 *     for (vector,polarity) in training corpus
 *         yHat = scoreIntermediate(vector);
 *         if ((yHat > 0 && polarity) || (yHat < 0 && !polarity))
 *             ++incrementalWeights[index];
 *         else
 *             ++index;
 *             basisVectors[index] = vector;
 *             polarities[index] = polarity;
 *             incrementalWeights[index] = 1;</pre></blockquote>
 *
 * The intermediate score is defined as:
 *
 * <blockquote><pre>
 * scoreIntermediate(vector)
 *     = <big><big><big>Σ</big></big></big><sub><sub>i <= index</sub></sub> polarities[i] * kernel(basisVectors[i],vector)
 * </pre></blockquote>
 *
 * The final weight for a vector is the cumulative weight computed as
 * follows:
 *
 * <blockquote><pre>
 * cumulativeWeight(j) = <big><big><big>Σ</big></big></big><sub><sub>k >= j</sub></sub> incrementalWeights[k]
 * </pre></blockquote>
 *
 * The actual implementations of these methods involve considerably
 * more indirection and index chasing to avoid copies and duplication
 * in the final vectors.
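 *
 * <p>Computing the cumulative weights amounts to a single
 * right-to-left pass over the incremental weights.  The following
 * sketch (with hypothetical local array names) also folds in the
 * polarities to produce the signed weights used in the runtime
 * model:
 *
 * <blockquote><pre>
 * int[] weights = new int[incrementalWeights.length];
 * int sum = 0;
 * for (int j = incrementalWeights.length - 1; j >= 0; --j) {
 *     sum += incrementalWeights[j];             // sum of incrementalWeights[k] for k >= j
 *     weights[j] = polarities[j] ? sum : -sum;  // sign from the example's polarity
 * }</pre></blockquote>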
 *
 * <h3>Historical Notes</h3>
 *
 * <p>The averaged kernel perceptron implemented here was introduced
 * in the following paper, which also provides error bounds for
 * learning and evaluations with polynomial kernels of various
 * degrees:
 *
 * <blockquote>
 * Freund, Yoav and Robert E. Schapire (1999)
 * Large margin classification using the perceptron algorithm.
 * <i>Machine Learning</i> <b>37</b>(3):277-296.
 * </blockquote>
 *
 * The basic perceptron model was introduced in:
 *
 * <blockquote>
 * Block, H.D. (1962) The perceptron: a model for brain functioning.
 * <i>Reviews of Modern Physics</i> <b>34</b>:123-135.
 * </blockquote>
 *
 * <p>The kernel-based perceptron was introduced in:
 *
 * <blockquote>
 * Aizerman, M.A., E.M. Braverman, and L.I. Rozonoer (1964)
 * Theoretical foundations of the potential function method in pattern
 * recognition learning.  <i>Automation and Remote Control</i>
 * <b>25</b>:821-837.
 * </blockquote>
 *
 * The basis of the voting scheme is a deterministically averaged
 * version of the randomized approach of adapting online learners to a
 * batch setup described in the following paper:
 *
 * <blockquote>
 * Helmbold, D.P. and M.K. Warmuth (1995)
 * On weak learning.  <i>Journal of Computer and System Sciences</i>
 * <b>50</b>:551-573.
 * </blockquote>
 *
 * @author Bob Carpenter
 * @version 3.1
 * @since LingPipe 3.1
 */
public class PerceptronClassifier<E>
    implements Classifier<E,ScoredClassification>, Serializable {

    final FeatureExtractor<? super E> mFeatureExtractor;
    final MapSymbolTable mSymbolTable;
    final KernelFunction mKernelFunction;
    final SparseFloatVector[] mBasisVectors;
    final int[] mBasisWeights;
    final String mAcceptCategory;
    final String mRejectCategory;

    PerceptronClassifier(FeatureExtractor<? super E> featureExtractor,
                         KernelFunction kernelFunction,
                         MapSymbolTable symbolTable,
                         SparseFloatVector[] basisVectors,
                         int[] basisWeights,
                         String acceptCategory,
                         String rejectCategory) {
        mFeatureExtractor = featureExtractor;
        mKernelFunction = kernelFunction;
        mSymbolTable = symbolTable;
        mBasisVectors = basisVectors;
        mBasisWeights = basisWeights;
        mAcceptCategory = acceptCategory;
        mRejectCategory = rejectCategory;
    }

    /**
     * Construct a perceptron classifier from the specified feature
     * extractor, kernel function, corpus with designated accept
     * category, and number of training iterations, using the default
     * runtime accept and reject categories.
     *
     * <p>The default runtime accept and reject categories are
     * {@link BinaryLMClassifier#DEFAULT_ACCEPT_CATEGORY} and
     * {@link BinaryLMClassifier#DEFAULT_REJECT_CATEGORY},
     * respectively.
     *
     * @param featureExtractor Feature extractor for objects.
     * @param kernelFunction Kernel function for expanding the vector basis.
     * @param corpus Corpus to use for training.
     * @param corpusAcceptCategory Category in the training data to treat as positive.
     * @param numIterations Number of iterations to carry out during training.
     */
    public PerceptronClassifier(FeatureExtractor<? super E> featureExtractor,
                                KernelFunction kernelFunction,
                                Corpus<ClassificationHandler<E,Classification>> corpus,
                                String corpusAcceptCategory,
                                int numIterations)
        throws IOException {

        this(featureExtractor, kernelFunction, corpus,
             corpusAcceptCategory, numIterations,
             BinaryLMClassifier.DEFAULT_ACCEPT_CATEGORY,
             BinaryLMClassifier.DEFAULT_REJECT_CATEGORY);
    }

    /**
     * Construct a perceptron classifier from the specified feature
     * extractor, kernel function, corpus with designated accept
     * category, number of training iterations, and output accept and
     * reject categories.
     *
     * @param featureExtractor Feature extractor for objects.
     * @param kernelFunction Kernel function for expanding the vector basis.
     * @param corpus Corpus to use for training.
     * @param corpusAcceptCategory Category in the training data to treat as positive.
     * @param numIterations Number of iterations to carry out during training.
     * @param outputAcceptCategory Category with which to label accepted instances.
     * @param outputRejectCategory Category with which to label rejected instances.
     */
    public PerceptronClassifier(FeatureExtractor<? super E> featureExtractor,
                                KernelFunction kernelFunction,
                                Corpus<ClassificationHandler<E,Classification>> corpus,
                                String corpusAcceptCategory,
                                int numIterations,
                                String outputAcceptCategory,
                                String outputRejectCategory)
        throws IOException {

        mFeatureExtractor = featureExtractor;
        mKernelFunction = kernelFunction;
        mAcceptCategory = outputAcceptCategory;
        mRejectCategory = outputRejectCategory;
        mSymbolTable = new MapSymbolTable();

        // collect training vectors and categories
        CorpusCollector collector = new CorpusCollector();
        corpus.visitCorpus(collector);
        SparseFloatVector[] featureVectors = collector.featureVectors();
        boolean[] polarities = collector.polarities();