// SvdMatrix.java
 * moves from its initial size to one tenth (<code>1/10</code>) of
 * its initial size after <code>9 * annealingRate</code> epochs,
 * and one hundredth of its initial size after
 * <code>99 * annealingRate</code> epochs.
 *
 * <h3>Parameter Choices</h3>
 *
 * <p>The previous discussion has introduced a daunting list of
 * parameters required for gradient descent for singular value
 * decomposition. Unfortunately, the stochastic gradient descent
 * solver requires a fair amount of tuning to recover a low mean
 * square error factorization. The optimal settings also depend
 * on the input matrix; for example, very sparse partial matrices are
 * much easier to fit than dense matrices.
 *
 * <h4>Maximum Order</h4>
 *
 * <p>Determining the order of the decomposition is mostly a matter
 * of determining how many orders are needed for the amount of
 * smoothing required. Low order reconstructions are sufficient for
 * most applications. One way to determine the maximum order is to
 * evaluate on held out data for an application. Another is to look
 * for the point at which the singular values become (relatively)
 * insignificant.
 *
 * <h4>Maximum Epochs and Early Stopping</h4>
 *
 * <p>For low mean square error factorizations, many epochs may be
 * necessary. Lowering the maximum number of epochs leads to what is
 * known as early stopping. Sometimes early stopping leads to
 * more accurate held-out predictions, but it will always raise
 * the factorization error (training data predictions). In general,
 * the maximum number of epochs needs to be set empirically.
 * Initially, try fewer epochs and gradually raise the number of
 * epochs until the desired accuracy is achieved.
 *
 * <h4>Minimum Improvement</h4>
 *
 * <p>By setting the minimum improvement to 0.0, the algorithm is
 * forced to use the maximum number of epochs. By setting it above
 * this level, a form of early stopping is achieved when the benefit
 * of another epoch of refitting falls below a given improvement
 * threshold.
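 * As a sketch (with illustrative variable names, not the solver's
 * actual internals), a relative minimum-improvement test amounts to:
 *
 * <blockquote><pre>
 * // relative change in mean square error between epochs
 * double relativeImprovement
 *     = (lastMse - mse) / (Math.abs(lastMse) + Math.abs(mse));
 * boolean converged
 *     = epoch &gt;= minEpochs
 *     &amp;&amp; relativeImprovement &lt; minImprovement;</pre></blockquote>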
 * This value may also be set on an application basis
 * using held out data, or it may be fit to achieve a given level of
 * mean square error on the training (input) data. The minimum
 * improvement needs to be set fairly low (less than 0.000001) to
 * achieve reasonably precise factorizations. Note that minimum
 * improvement is defined relatively, as noted above.
 *
 * <h4>Initial Parameter Values</h4>
 *
 * <p>Initial values of the singular vectors are not particularly
 * sensitive, because we are using multiplicative updates. A good
 * rule of thumb for starting values is the inverse square root of
 * the maximum order:
 *
 * <blockquote><pre>
 * initVal ~ 1 / sqrt(maxOrder)</pre></blockquote>
 *
 * <h4>Learning Rate, Maximum Epochs and Annealing</h4>
 *
 * <p>A good starting point for the learning rate is 0.01. The
 * annealing parameter should be set so that the learning rate is cut
 * by at least a factor of 10 in the final rounds. This calls for a
 * value that's roughly one tenth of the maximum number of epochs. If
 * the initial learning rate is set higher, then the annealing
 * schedule should be more aggressive so that it spends a fair amount
 * of time in the 0.001 to 0.0001 learning rate range.
 *
 * <h3>References</h3>
 *
 * <p>There are thorough Wikipedia articles on singular value
 * decomposition and gradient descent, although the SVD entry focuses
 * entirely on complete (non-partial) matrices:
 *
 * <ul>
 * <li><a href="http://en.wikipedia.org/wiki/Singular_value_decomposition">Wikipedia: Singular Value Decomposition</a></li>
 * <li><a href="http://en.wikipedia.org/wiki/Gradient_descent">Wikipedia: Gradient Descent</a></li>
 * </ul>
 *
 * <p>Both of the standard machine learning textbooks have good
 * theoretical explanations of least squares, regularization,
 * gradient descent, and singular value decomposition, but not
 * all together:
 *
 * <ul>
 * <li>Chris Bishop. 2007.
 * <i>Pattern Recognition and Machine Learning</i>. Springer.</li>
 * <li>Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2001.
 * <i>The Elements of Statistical Learning</i>. Springer.</li>
 * </ul>
 *
 * <p>The following is a particularly clear explanation of many of
 * the issues involved in the context of neural networks:
 *
 * <ul>
 * <li>Genevieve Orr, Nici Schraudolph, and Fred Cummins. 1999.
 * <a href="http://www.willamette.edu/~gorr/classes/cs449/intro.html">CS-449: Neural Networks</a>. Willamette University course notes.</li>
 * </ul>
 *
 * <p>Our partial SVD solver is based on C code from Timely Development
 * (see license below). Timely Development based their code on Simon
 * Funk's blog entry. Simon Funk's approach was itself based on
 * Genevieve Gorrell's 2006 EACL paper.
 *
 * <ul>
 * <li><a href="http://www.timelydevelopment.com/Demos/NetflixPrize.htm">Timely Development's Netflix Prize Page</a>.</li>
 * <li>Simon Funk (pseudonym for Brandyn Webb). 2007.
 * <a href="http://sifter.org/~simon/journal/20061211.html">Gradient Descent SVD Algorithm</a>. <i>The Evolution of Cybernetics</i> blog.</li>
 * <li>Genevieve Gorrell. 2006.
 * <a href="http://acl.ldc.upenn.edu/E/E06/E06-1013.pdf">Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing</a>. EACL 2006.</li>
 * <li>Genevieve Gorrell. 2006.
 * <a href="http://www.dcs.shef.ac.uk/~genevieve/gorrell_thesis.pdf">Generalized Hebbian Algorithm for Dimensionality Reduction in Natural Language Processing</a>. Ph.D. Thesis.
 * Linköping University. Sweden.</li>
 * </ul>
 *
 * <h3>Acknowledgements</h3>
 *
 * <p>The singular value decomposition code is rather loosely based on
 * a C program developed by <a
 * href="http://www.timelydevelopment.com/">Timely Development</a>.
 * Here is the copyright notice for the original code:
 *
 * <blockquote><pre style="font-size:80%">
 * // SVD Sample Code
 * //
 * // Copyright (C) 2007 Timely Development (www.timelydevelopment.com)
 * //
 * // Special thanks to Simon Funk and others from the Netflix Prize contest
 * // for providing pseudo-code and tuning hints.
 * //
 * // Feel free to use this code as you wish as long as you include
 * // these notices and attribution.
 * //
 * // Also, if you have alternative types of algorithms for accomplishing
 * // the same goal and would like to contribute, please share them as well :)
 * //
 * // STANDARD DISCLAIMER:
 * //
 * // - THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY
 * // - OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
 * // - LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR
 * // - FITNESS FOR A PARTICULAR PURPOSE.
 * </pre></blockquote>
 *
 * @author Bob Carpenter
 * @version 3.2
 * @since LingPipe3.2
 */
public class SvdMatrix extends AbstractMatrix {

    private final double[][] mRowVectors;
    private final double[][] mColumnVectors;
    private final int mOrder;

    /**
     * Construct a factored matrix using the specified row and column
     * vectors at the specified order. Each vector in the arrays of
     * row and column vectors must be of the same dimension as the
     * order.
     *
     * <p>See the class documentation above for more information
     * on singular value decomposition.
     *
     * @param rowVectors Row vectors, indexed by row.
     * @param columnVectors Column vectors, indexed by column.
     * @param order Dimensionality of the row and column vectors.
     */
    public SvdMatrix(double[][] rowVectors,
                     double[][] columnVectors,
                     int order) {
        verifyDimensions("row",order,rowVectors);
        verifyDimensions("column",order,columnVectors);
        mRowVectors = rowVectors;
        mColumnVectors = columnVectors;
        mOrder = order;
    }

    /**
     * Construct an SVD matrix using the specified singular vectors
     * and singular values. The order of the factorization is equal to
     * the length of the singular values.
     * Each singular vector must
     * be of the same dimensionality as the array of singular values.
     *
     * <p>See the class documentation above for more information
     * on singular value decomposition.
     *
     * @param rowSingularVectors Row singular vectors, indexed by row.
     * @param columnSingularVectors Column singular vectors, indexed by column.
     * @param singularValues Array of singular values, in decreasing order.
     */
    public SvdMatrix(double[][] rowSingularVectors,
                     double[][] columnSingularVectors,
                     double[] singularValues) {
        mOrder = singularValues.length;
        verifyDimensions("row",mOrder,rowSingularVectors);
        verifyDimensions("column",mOrder,columnSingularVectors);
        mRowVectors = new double[rowSingularVectors.length][mOrder];
        mColumnVectors = new double[columnSingularVectors.length][mOrder];
        double[] sqrtSingularValues = new double[singularValues.length];
        for (int i = 0; i < sqrtSingularValues.length; ++i)
            sqrtSingularValues[i] = Math.sqrt(singularValues[i]);
        scale(mRowVectors,rowSingularVectors,sqrtSingularValues);
        scale(mColumnVectors,columnSingularVectors,sqrtSingularValues);
    }

    /**
     * Returns the number of rows in this matrix.
     *
     * @return The number of rows in this matrix.
     */
    public int numRows() {
        return mRowVectors.length;
    }

    /**
     * Returns the number of columns in this matrix.
     *
     * @return The number of columns in this matrix.
     */
    public int numColumns() {
        return mColumnVectors.length;
    }

    /**
     * Returns the order of this factorization. The order
     * is the number of dimensions in the singular vectors
     * and the singular values.
     *
     * @return The order of this decomposition.
     */
    public int order() {
        return mRowVectors[0].length;
    }

    /**
     * Returns the value of this matrix at the specified row and column.
     *
     * @param row Row index.
     * @param column Column index.
     * @return The value of this matrix at the specified row and
     * column.
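     *
     * <p>For example, for a hypothetical order-1 factorization (the
     * vectors below are purely illustrative, not drawn from any real
     * decomposition), each value is the dot product of the
     * corresponding row and column vectors:
     *
     * <blockquote><pre>
     * double[][] rowVecs = { { 1.0 }, { 2.0 } };
     * double[][] colVecs = { { 3.0 }, { 4.0 } };
     * SvdMatrix m = new SvdMatrix(rowVecs, colVecs, 1);
     * m.value(1, 0);  // 2.0 * 3.0 = 6.0</pre></blockquote>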
     */
    public double value(int row, int column) {
        double[] rowVec = mRowVectors[row];
        double[] colVec = mColumnVectors[column];
        double result = 0.0;
        for (int i = 0; i < rowVec.length; ++i)
            result += rowVec[i] * colVec[i];
        return result;
    }

    /**
     * Returns the array of singular values for this decomposition.
     *
     * @return The singular values for this decomposition.
     */
    public double[] singularValues() {
        double[] singularValues = new double[mOrder];
        for (int i = 0; i < singularValues.length; ++i)
            singularValues[i] = singularValue(i);
        return singularValues;
    }

    /**
     * Returns the singular value for the specified order.
     *
     * @param order Dimension of the singular value.
     * @return The singular value at the specified order.
     */
    public double singularValue(int order) {
        if (order >= mOrder) {
            String msg = "Maximum order=" + (mOrder-1)
                + " found order=" + order;
            throw new IllegalArgumentException(msg);
        }
        return columnLength(mRowVectors,order)
            * columnLength(mColumnVectors,order);
    }

    /**
     * Returns a matrix in which the left singular vectors make up the
     * columns. The first index is for the row of the original matrix
     * and the second index is for the order of the singular vector.
     * Thus the returned matrix is <code>m×k</code>, if the
     * original input was an <code>m×n</code> matrix and SVD was
     * performed at order <code>k</code>.
     *
     * @return The left singular vectors of this matrix.
     */
    public double[][] leftSingularVectors() {
        return normalizeColumns(mRowVectors);
    }

    /**
     * Returns a matrix in which the right singular vectors make up
     * the columns. The first index is for the column of the original
     * matrix and the second index is for the order of the singular
     * vector. Thus the returned matrix is <code>n×k</code>, if
     * the original input was an <code>m×n</code> matrix and SVD
     * was performed at order <code>k</code>.
     *
     * @return The right singular vectors.
     */
    public double[][] rightSingularVectors() {
        return normalizeColumns(mColumnVectors);
    }

    /**
     * Returns the singular value decomposition of the specified
     * complete matrix of values.
     *
     * <p>For a full description of the arguments and values, see
     * the method documentation for
     * {@link #partialSvd(int[][],double[][],int,double,double,double,double,double,int,int,PrintWriter)}
     * and the class documentation above.
     *
     * <p>The two-dimensional array of values must be an
     * <code>m × n</code> matrix. In particular, each row
     * must be of the same length. If this is not the case, an illegal
     * argument exception will be raised.
     *
     * @param values Array of values.
     * @param maxOrder Maximum order of the decomposition.
     * @param featureInit Initial value for singular vectors.
     * @param initialLearningRate Incremental multiplier of error determining how
     * fast learning occurs.
     * @param annealingRate Rate at which annealing occurs; higher values
     * provide more gradual annealing.
     * @param regularization A regularization constant to damp learning.
     * @param minImprovement Minimum relative improvement in mean square error required
     * to finish an epoch.
     * @param minEpochs Minimum number of epochs for training.
     * @param maxEpochs Maximum number of epochs for training.
     * @param writer Print writer to which output is directed during
     * training, or null if no output is desired.
     * @return Singular value decomposition for the specified complete matrix
     * at the specified order.
     * @throws IllegalArgumentException Under conditions listed in the
     * method documentation above.
     */
    public static SvdMatrix svd(double[][] values,
                                int maxOrder,
                                double featureInit,
                                double initialLearningRate,
                                double annealingRate,
                                double regularization,
                                double minImprovement,
                                int minEpochs,
                                int maxEpochs,
                                PrintWriter writer) {
        int m = values.length;
        int n = values[0].length;