/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 *    Optimization.java
 *    Copyright (C) 2003 University of Waikato, Hamilton, New Zealand
 *
 */

package weka.core;

import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;

/**
 * Implementation of the active-sets method with BFGS update to solve an
 * optimization problem with bound constraints only, in multiple dimensions.
 * This implementation considers both lower and upper bound constraints. <p/>
 *
 * Here is a sketch of the search strategy; a detailed description of the
 * algorithm can be found in the Appendix of Xin Xu's MSc thesis:<p/>
 * Initialize everything, incl. initial value, direction, etc.<p/>
 *
 * LOOP (main algorithm):<br/>
 *
 * 1. Perform the line search using the directions for free variables<br/>
 * 1.1 Check all the bounds that are not "active" (i.e. binding variables)
 *     and compute the feasible step length to the bound for each of them<br/>
 * 1.2 Pick the smallest of these feasible step lengths, say \alpha, and set
 *     it as the upper bound of the current step length, i.e.
 *     0 < \lambda <= \alpha<br/>
 * 1.3 Search for a step length <= \alpha that yields "sufficient function
 *     decrease" (\alpha condition) AND a "positive definite inverse Hessian"
 *     (\beta condition), if possible, using SAFEGUARDED polynomial
 *     interpolation. This step length is "safe" and thus is used to compute
 *     the next value of the free variables.<br/>
 * 1.4 Fix the variable(s) that are newly bound to their constraint(s).<p/>
 *
 * 2. Check whether all variables or their gradients have converged. If they
 *    have, check whether any of the current bindings of variables to their
 *    bounds can be released, based on the "reliable" second-order Lagrange
 *    multipliers where available: if such a multiplier is available and
 *    negative for a variable, release that variable; if not available, use
 *    the first-order Lagrange multiplier to test for release. If no variable
 *    is released, STOP the loop. Otherwise update the inverse Hessian matrix
 *    and the gradient for the newly released variables and CONTINUE the
 *    LOOP.<p/>
 *
 * 3. Use the BFGS formula to update the inverse Hessian matrix. Note that the
 *    already-fixed variables must have zeros in the corresponding entries of
 *    the inverse Hessian.<p/>
 *
 * 4. Compute the new (Newton) search direction d = H^{-1}*g, where H^{-1} is
 *    the inverse Hessian and g is the gradient. Note that, again, the
 *    already-fixed variables have zero direction.<p/>
 *
 * ENDLOOP<p/>
 *
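 * In schematic form the main loop can be pictured as follows. This outline is
 * only a sketch of the description above, not the actual implementation, and
 * the helper names in it are illustrative rather than methods of this
 * class:<p/>
 * <pre>
 * while (true) {
 *   // 1. Line search for the free variables, capped by the smallest
 *   //    feasible step to a non-working-set bound (steps 1.1 to 1.3)
 *   alpha  = smallestFeasibleStepToBounds(x, d);
 *   lambda = safeguardedLineSearch(x, d, alpha);
 *   x      = x + lambda * d;       // fixed variables have d = 0
 *   fixNewlyBoundVariables(x);     // step 1.4
 *
 *   // 2. On convergence, try to release bound variables via Lagrange
 *   //    multiplier estimates; stop if nothing can be released
 *   if (converged(x, g) && !releasedAnyVariable()) break;
 *
 *   // 3. BFGS update of the inverse Hessian H
 *   //    (zero entries for the fixed variables)
 *   updateInverseHessian(H);
 *
 *   // 4. New Newton direction d = H^{-1}*g (zero for fixed variables)
 *   d = newtonDirection(H, g);
 * }
 * </pre>
 *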
 * A typical usage of this class is to create your own subclass of it and
 * provide the objective function and gradients as follows:<p/>
 * <pre>
 * class MyOpt extends Optimization {
 *   // Provide the objective function
 *   protected double objectiveFunction(double[] x) {
 *     // How to calculate your objective function...
 *     // ...
 *   }
 *
 *   // Provide the first derivatives
 *   protected double[] evaluateGradient(double[] x) {
 *     // How to calculate the gradient of the objective function...
 *     // ...
 *   }
 *
 *   // If possible, provide the index^{th} row of the Hessian matrix
 *   protected double[] evaluateHessian(double[] x, int index) {
 *     // How to calculate the index^th variable's second derivative
 *     // ...
 *   }
 * }
 * </pre>
 *
 * When it is time to use it, in some routine(s) of another class...
 * <pre>
 * MyOpt opt = new MyOpt();
 *
 * // Set up initial variable values and bound constraints
 * double[] x = new double[numVariables];
 * // Lower and upper bounds: 1st row is lower bounds, 2nd is upper
 * double[][] constraints = new double[2][numVariables];
 * ...
 *
 * // Find the minimum, 200 iterations as default
 * x = opt.findArgmin(x, constraints);
 * while (x == null) {          // 200 iterations are not enough
 *   x = opt.getVarbValues();   // Try another 200 iterations
 *   x = opt.findArgmin(x, constraints);
 * }
 *
 * // The minimal function value
 * double minFunction = opt.getMinFunction();
 * ...
 * </pre>
 *
 * It is recommended that Hessian values be provided so that the second-order
 * Lagrange multiplier estimate can be calculated. However, if they are not
 * provided, there is no need to override the <code>evaluateHessian()</code>
 * function.<p/>
 *
 * REFERENCES (see also the <code>getTechnicalInformation()</code> method):<br/>
 * The whole model algorithm is adapted from Chapter 5 and other related
 * chapters in Gill, Murray and Wright (1981) "Practical Optimization",
 * Academic Press, and Gill and Murray (1976) "Minimization Subject to Bounds
 * on the Variables", NPL Report NAC72, while Chong and Zak (1996) "An
 * Introduction to Optimization", John Wiley & Sons, Inc. provides a brief but
 * helpful introduction to the method.<p/>
 *
 * Dennis and Schnabel (1983) "Numerical Methods for Unconstrained Optimization
 * and Nonlinear Equations", Prentice-Hall Inc. and Press et al. (1992)
 * "Numerical Recipes in C", Second Edition, Cambridge University Press, were
 * consulted for the polynomial interpolation used in the line search
 * implementation.<p/>
 *
 * The Hessian modification in the BFGS update uses Cholesky factorization and
 * two rank-one modifications:<br/>
 * Bk+1 = Bk + (Gk*Gk')/(Gk'*Dk) + (dGk*(dGk)')/[alpha*(dGk)'*Dk],<br/>
 * where Gk is the gradient vector, Dk is the direction vector and alpha is
 * the step rate.<br/>
 * This method is due to Gill, Golub, Murray and Saunders (1974) ``Methods for
 * Modifying Matrix Factorizations'', Mathematics of Computation, Vol. 28,
 * No. 126, pp. 505-535.
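 * <p/>
 * As a concrete sketch (illustrative only; <code>QuadOpt</code> is a
 * hypothetical name, not a class in this package), a subclass minimizing the
 * separable quadratic f(x) = sum_i (x_i - 2)^2 over the box 0 <= x_i <= 1
 * could look like:
 * <pre>
 * class QuadOpt extends Optimization {
 *   // f(x) = sum_i (x_i - 2)^2
 *   protected double objectiveFunction(double[] x) {
 *     double f = 0.0;
 *     for (int i = 0; i < x.length; i++)
 *       f += (x[i] - 2.0) * (x[i] - 2.0);
 *     return f;
 *   }
 *
 *   // df/dx_i = 2*(x_i - 2)
 *   protected double[] evaluateGradient(double[] x) {
 *     double[] g = new double[x.length];
 *     for (int i = 0; i < x.length; i++)
 *       g[i] = 2.0 * (x[i] - 2.0);
 *     return g;
 *   }
 *
 *   // The Hessian is the constant diagonal matrix 2*I, so row #index
 *   // has a single non-zero entry
 *   protected double[] evaluateHessian(double[] x, int index) {
 *     double[] row = new double[x.length];
 *     row[index] = 2.0;
 *     return row;
 *   }
 * }
 * </pre>
 * Since the unconstrained minimum x_i = 2 lies outside the box, every upper
 * bound should become active, and findArgmin should return x_i = 1 for all i.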
 * <p/>
 *
 * @author Xin Xu (xx5@cs.waikato.ac.nz)
 * @version $Revision: 1.8 $
 * @see #getTechnicalInformation()
 */
public abstract class Optimization implements TechnicalInformationHandler {

  protected double m_ALF = 1.0e-4;

  protected double m_BETA = 0.9;

  protected double m_TOLX = 1.0e-6;

  protected double m_STPMX = 100.0;

  protected int m_MAXITS = 200;

  protected static boolean m_Debug = false;

  /** function value */
  protected double m_f;

  /** G'*p */
  private double m_Slope;

  /** Test if zero step in lnsrch */
  private boolean m_IsZeroStep = false;

  /** Used when iteration overflow occurs */
  private double[] m_X;

  /** Compute machine precision */
  protected static double m_Epsilon, m_Zero;
  static {
    m_Epsilon = 1.0;
    while (1.0 + m_Epsilon > 1.0) {
      m_Epsilon /= 2.0;
    }
    m_Epsilon *= 2.0;
    m_Zero = Math.sqrt(m_Epsilon);
    if (m_Debug)
      System.err.print("Machine precision is " + m_Epsilon +
                       " and zero set to " + m_Zero);
  }
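
  // On IEEE-754 doubles with round-to-nearest-even arithmetic, the static
  // block above stops once m_Epsilon reaches 2^-53 (adding it no longer
  // changes 1.0), so after the final doubling m_Epsilon == 2^-52 (about
  // 2.22e-16) and m_Zero == 2^-26 (about 1.49e-8). An illustrative check,
  // not part of the original class and assuming Java 5+ for Math.ulp:
  private static boolean epsilonEqualsUlpOfOne() {
    return m_Epsilon == Math.ulp(1.0); // both are 2^-52 on IEEE-754 doubles
  }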

  /**
   * Returns an instance of a TechnicalInformation object, containing
   * detailed information about the technical background of this class,
   * e.g., paper reference or book this class is based on.
   *
   * @return the technical information about this class
   */
  public TechnicalInformation getTechnicalInformation() {
    TechnicalInformation result;
    TechnicalInformation additional;

    result = new TechnicalInformation(Type.MASTERSTHESIS);
    result.setValue(Field.AUTHOR, "Xin Xu");
    result.setValue(Field.YEAR, "2003");
    result.setValue(Field.TITLE, "Statistical learning in multiple instance problem");
    result.setValue(Field.SCHOOL, "University of Waikato");
    result.setValue(Field.ADDRESS, "Hamilton, NZ");
    result.setValue(Field.NOTE, "0657.594");

    additional = result.add(Type.BOOK);
    additional.setValue(Field.AUTHOR, "P. E. Gill and W. Murray and M. H. Wright");
    additional.setValue(Field.YEAR, "1981");
    additional.setValue(Field.TITLE, "Practical Optimization");
    additional.setValue(Field.PUBLISHER, "Academic Press");
    additional.setValue(Field.ADDRESS, "London and New York");

    additional = result.add(Type.TECHREPORT);
    additional.setValue(Field.AUTHOR, "P. E. Gill and W. Murray");
    additional.setValue(Field.YEAR, "1976");
    additional.setValue(Field.TITLE, "Minimization subject to bounds on the variables");
    additional.setValue(Field.INSTITUTION, "National Physical Laboratory");
    additional.setValue(Field.NUMBER, "NAC 72");

    additional = result.add(Type.BOOK);
    additional.setValue(Field.AUTHOR, "E. K. P. Chong and S. H. Zak");
    additional.setValue(Field.YEAR, "1996");
    additional.setValue(Field.TITLE, "An Introduction to Optimization");
    additional.setValue(Field.PUBLISHER, "John Wiley and Sons");
    additional.setValue(Field.ADDRESS, "New York");

    additional = result.add(Type.BOOK);
    additional.setValue(Field.AUTHOR, "J. E. Dennis and R. B. Schnabel");
    additional.setValue(Field.YEAR, "1983");
    additional.setValue(Field.TITLE, "Numerical Methods for Unconstrained Optimization and Nonlinear Equations");
    additional.setValue(Field.PUBLISHER, "Prentice-Hall");

    additional = result.add(Type.BOOK);
    additional.setValue(Field.AUTHOR, "W. H. Press and B. P. Flannery and S. A. Teukolsky and W. T. Vetterling");
    additional.setValue(Field.YEAR, "1992");
    additional.setValue(Field.TITLE, "Numerical Recipes in C");
    additional.setValue(Field.PUBLISHER, "Cambridge University Press");
    additional.setValue(Field.EDITION, "Second");

    additional = result.add(Type.ARTICLE);
    additional.setValue(Field.AUTHOR, "P. E. Gill and G. H. Golub and W. Murray and M. A. Saunders");
    additional.setValue(Field.YEAR, "1974");
    additional.setValue(Field.TITLE, "Methods for modifying matrix factorizations");
    additional.setValue(Field.JOURNAL, "Mathematics of Computation");
    additional.setValue(Field.VOLUME, "28");
    additional.setValue(Field.NUMBER, "126");
    additional.setValue(Field.PAGES, "505-535");

    return result;
  }

  /**
   * Subclasses should implement this procedure to evaluate the objective
   * function to be minimized.
   *
   * @param x the variable values
   * @return the objective function value
   * @throws Exception if something goes wrong
   */
  protected abstract double objectiveFunction(double[] x) throws Exception;

  /**
   * Subclasses should implement this procedure to evaluate the gradient
   * of the objective function.
   *
   * @param x the variable values
   * @return the gradient vector
   * @throws Exception if something goes wrong
   */
  protected abstract double[] evaluateGradient(double[] x) throws Exception;

  /**
   * Subclasses are recommended to override this procedure to evaluate the
   * second-order gradient of the objective function. If it is not provided,
   * it returns null.
   *
   * @param x the variables
   * @param index the row index in the Hessian matrix
   * @return one row (row #index) of the Hessian matrix, null by default
   * @throws Exception if something goes wrong
   */
  protected double[] evaluateHessian(double[] x, int index) throws Exception {
    return null;
  }

  /**
   * Get the minimal function value.
   *
   * @return minimal function value found
   */
  public double getMinFunction() {
    return m_f;
  }

  /**
   * Set the maximal number of iterations in searching (default 200).
   *
   * @param it the maximal number of iterations
   */
  public void setMaxIteration(int it) {
    m_MAXITS = it;
  }

  /**
   * Set whether to run in debug mode.
   *
   * @param db whether to output debug information
   */
  public void setDebug(boolean db) {
    m_Debug = db;
  }

  /**
   * Get the variable values. Only needed when the number of iterations
   * exceeds the maximum threshold.
   *
   * @return the current variable values
   */
  public double[] getVarbValues() {
    return m_X;
  }
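
  // The line search below caps the step length by the distance to the nearest
  // non-working-set bound (steps 1.1 and 1.2 of the algorithm sketch). This
  // illustrative helper (hypothetical; lnsrch inlines the same logic rather
  // than calling it) computes that feasible step for one free variable,
  // returning infinity when no bound lies in the search direction:
  private static double feasibleStepToBound(double xi, double di,
                                            double lower, double upper) {
    if ((di < -m_Epsilon) && !Double.isNaN(lower))
      return (lower - xi) / di;        // heading down towards the lower bound
    if ((di > m_Epsilon) && !Double.isNaN(upper))
      return (upper - xi) / di;        // heading up towards the upper bound
    return Double.POSITIVE_INFINITY;   // no bound restricts this direction
  }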

  /**
   * Find a new point x in the direction p from a point xold at which the
   * value of the function has decreased sufficiently, the positive
   * definiteness of the B matrix (approximation of the inverse of the
   * Hessian) is preserved and no bound constraints are violated. For details
   * see "Numerical Methods for Unconstrained Optimization and Nonlinear
   * Equations". "Numerical Recipes in C" was also consulted.
   *
   * @param xold old x value
   * @param gradient gradient at that point
   * @param direct direction vector
   * @param stpmax maximum step length
   * @param isFixed indicates whether a variable has been fixed
   * @param nwsBounds non-working-set bounds; these variables are free and
   *        subject to the bound constraints in this step
   * @param wsBdsIndx indices of variables that have working-set bounds; these
   *        variables are already fixed and no longer subject to the
   *        constraints
   * @return new value along direction p from xold, null if no step was taken
   * @throws Exception if an error occurs
   */
  public double[] lnsrch(double[] xold, double[] gradient, double[] direct,
                         double stpmax, boolean[] isFixed,
                         double[][] nwsBounds, DynamicIntArray wsBdsIndx)
    throws Exception {

    int i, k, len = xold.length,
        fixedOne = -1;       // Index of the variable to be fixed
    double alam, alamin;     // Lambda to be found, and its lower bound

    // For convergence and bound test
    double temp, test, alpha = Double.POSITIVE_INFINITY, fold = m_f, sum;

    // For cubic interpolation
    double a, alam2 = 0, b, disc = 0, maxalam = 1.0, rhs1, rhs2, tmplam;

    double[] x = new double[len]; // New variable values

    // Scale the step
    for (sum = 0.0, i = 0; i < len; i++) {
      if (!isFixed[i]) // For fixed variables, direction = 0
        sum += direct[i] * direct[i];
    }
    sum = Math.sqrt(sum);

    if (m_Debug)
      System.err.println("fold: " + Utils.doubleToString(fold, 10, 7) + "\n" +
                         "sum: " + Utils.doubleToString(sum, 10, 7) + "\n" +
                         "stpmax: " + Utils.doubleToString(stpmax, 10, 7));
    if (sum > stpmax) {
      for (i = 0; i < len; i++)
        if (!isFixed[i])
          direct[i] *= stpmax / sum;
    } else
      maxalam = stpmax / sum;

    // Compute initial rate of decrease, g'*d
    m_Slope = 0.0;
    for (i = 0; i < len; i++) {
      x[i] = xold[i];
      if (!isFixed[i])
        m_Slope += gradient[i] * direct[i];
    }

    if (m_Debug)
      System.err.print("slope: " + Utils.doubleToString(m_Slope, 10, 7) + "\n");

    // Slope too small
    if (Math.abs(m_Slope) <= m_Zero) {
      if (m_Debug)
        System.err.println("Gradient and direction orthogonal -- " +
                           "Min. found with current fixed variables" +
                           " (or all variables fixed). Try to release" +
                           " some variables now.");
      return x;
    }

    // Err: slope > 0
    if (m_Slope > m_Zero) {
      if (m_Debug)
        for (int h = 0; h < x.length; h++)
          System.err.println(h + ": isFixed=" + isFixed[h] + ", x=" + x[h] +
                             ", grad=" + gradient[h] + ", direct=" + direct[h]);
      throw new Exception("g'*p positive! -- Try to debug from here: line 327.");
    }

    // Compute LAMBDAmin and upper bound of lambda -- alpha
    test = 0.0;
    for (i = 0; i < len; i++) {
      if (!isFixed[i]) { // No need for fixed variables
        temp = Math.abs(direct[i]) / Math.max(Math.abs(x[i]), 1.0);
        if (temp > test)
          test = temp;
      }
    }

    if (test > m_Zero) // Not converged
      alamin = m_TOLX / test;
    else {
      if (m_Debug)
        System.err.println("Zero directions for all free variables -- " +
                           "Min. found with current fixed variables" +
                           " (or all variables fixed). Try to release" +
                           " some variables now.");
      return x;
    }

    // Check whether any non-working-set bounds are "binding"
    for (i = 0; i < len; i++) {
      if (!isFixed[i]) { // No need for fixed variables
        double alpi;
        if ((direct[i] < -m_Epsilon) && !Double.isNaN(nwsBounds[0][i])) { // Not feasible
          alpi = (nwsBounds[0][i] - xold[i]) / direct[i];
          if (alpi <= m_Zero) { // Zero
            if (m_Debug)
              System.err.println("Fix variable " + i +
                                 " to lower bound " + nwsBounds[0][i] +
                                 " from value " + xold[i]);
            x[i] = nwsBounds[0][i];
            isFixed[i] = true; // Fix this variable
            alpha = 0.0;
            nwsBounds[0][i] = Double.NaN; // Add constraint to working set
            wsBdsIndx.addElement(i);
          } else if (alpha > alpi) { // Fix one variable in one iteration
            alpha = alpi;
            fixedOne = i;
          }
        } else if ((direct[i] > m_Epsilon) && !Double.isNaN(nwsBounds[1][i])) { // Not feasible
          alpi = (nwsBounds[1][i] - xold[i]) / direct[i];
          if (alpi <= m_Zero) { // Zero
            if (m_Debug)
              System.err.println("Fix variable " + i +
                                 " to upper bound " + nwsBounds[1][i] +
                                 " from value " + xold[i]);
            x[i] = nwsBounds[1][i];