Optimization.java
/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 *    Optimization.java
 *    Copyright (C) 2003 Xin Xu
 */

package weka.core;

import java.util.*;
import java.io.*;

/**
 * Implementation of the active-sets method with BFGS update to solve an
 * optimization problem with bound constraints only, in multiple dimensions.
 * This implementation considers both lower and upper bound constraints. <p>
 *
 * Here is a sketch of the search strategy; a detailed description of the
 * algorithm can be found in the Appendix of Xin Xu's MSc thesis:<p>
 * Initialize everything, incl. initial value, direction, etc.<p>
 * LOOP (main algorithm):<br>
 *
 * 1. Perform the line search using the directions for the free variables<br>
 * 1.1 Check all the bounds that are not "active" (i.e. not currently binding
 * any variable) and compute the feasible step length to each bound<br>
 * 1.2 Pick the smallest feasible step length, say \alpha, and set it as the
 * upper bound of the current step length, i.e. 0<\lambda<=\alpha (a sketch
 * of this computation follows the loop description below)<br>
 * 1.3 Search for a step length <=\alpha that achieves "sufficient function
 * decrease" (\alpha condition) AND a "positive definite inverse Hessian"
 * (\beta condition), if possible, using SAFEGUARDED polynomial
 * interpolation. This step length is "safe" and thus is used to compute the
 * next value of the free variables.<br>
 * 1.4 Fix the variable(s) that are newly bound to their constraint(s).<p>
 *
 * 2. Check whether all variables or their gradients have converged. If so,
 * check whether any of the currently fixed variables can be released from
 * their bounds, based on the "reliable" second-order Lagrange multipliers
 * if available: if such a multiplier is available and negative for a
 * variable, release it; if not available, use the first-order Lagrange
 * multiplier to test for release. If no variable can be released, STOP the
 * loop. Otherwise update the inverse Hessian matrix and the gradient for
 * the newly released variables and CONTINUE the LOOP.<p>
 *
 * 3. Use the BFGS formula to update the inverse Hessian matrix. Note that
 * the already-fixed variables must have zeros in the corresponding entries
 * of the inverse Hessian.<p>
 *
 * 4. Compute the new (Newton) search direction d=H^{-1}*g, where H^{-1} is
 * the inverse Hessian and g is the gradient. Note again that the already-
 * fixed variables have zero direction.<p>
 *
 * ENDLOOP<p>
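 *
 * To illustrate steps 1.1-1.2: along direction d, a free variable x_i hits
 * its bound b_i after a step of (b_i - x_i)/d_i, and \alpha is the smallest
 * such step over the free variables. A minimal sketch, mirroring the bound
 * check in <code>lnsrch()</code> below (the names <code>lower</code>,
 * <code>upper</code>, <code>isFixed</code>, <code>x</code> and
 * <code>direct</code> are local to this example): <p>
 * <code>
 * double alpha = Double.POSITIVE_INFINITY;
 * for(int i = 0; i < x.length; i++){
 *     if(isFixed[i] || direct[i] == 0.0) continue; // only moving free variables
 *     // Moving down hits the lower bound, moving up hits the upper bound
 *     double bound = (direct[i] < 0.0) ? lower[i] : upper[i];
 *     double alpi = (bound - x[i]) / direct[i];    // step length to the bound
 *     if(alpi < alpha) alpha = alpi;               // keep the smallest one
 * }
 * </code>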
 *
 * A typical usage of this class is to create your own subclass of this
 * class and provide the objective function and gradients as follows:
 * <code>
 * class MyOpt extends Optimization{
 *     // Provide the objective function
 *     protected double objectiveFunction(double[] x){
 *         // How to calculate your objective function...
 *         // ...
 *     }
 *
 *     // Provide the first derivatives
 *     protected double[] evaluateGradient(double[] x){
 *         // How to calculate the gradient of the objective function...
 *         // ...
 *     }
 *
 *     // If possible, provide the index^{th} row of the Hessian matrix
 *     protected double[] evaluateHessian(double[] x, int index){
 *         // How to calculate the index^th variable's second derivatives
 *         // ...
 *     }
 * }
 *
 * // When it's time to use it, in some routine(s) of another class...
 * MyOpt opt = new MyOpt();
 *
 * // Set up the initial variable values and bound constraints
 * double[] x = new double[numVariables];
 * // Lower and upper bounds: 1st row is the lower bounds, 2nd the upper
 * double[][] constraints = new double[2][numVariables];
 * ...
 *
 * // Find the minimum, 200 iterations by default
 * x = opt.findArgmin(x, constraints);
 * while(x == null){  // 200 iterations were not enough
 *     x = opt.getVarbValues();  // Try another 200 iterations
 *     x = opt.findArgmin(x, constraints);
 * }
 *
 * // The minimal function value
 * double minFunction = opt.getMinFunction();
 * ...
 * </code>
 * It is recommended that Hessian values be provided so that the
 * second-order Lagrange multiplier estimate can be calculated. However, if
 * they are not provided, there is no need to override the
 * <code>evaluateHessian()</code> function.<p>
 *
 * REFERENCES:<br>
 * The overall algorithm is adapted from Chapter 5 and other related
 * chapters of Gill, Murray and Wright (1981) "Practical Optimization",
 * Academic Press, and Gill and Murray (1976) "Minimization Subject to
 * Bounds on the Variables", NPL Report NAC72, while Chong and Zak (1996)
 * "An Introduction to Optimization", John Wiley & Sons, Inc. provides a
 * brief but helpful introduction to the method. <p>
 *
 * Dennis and Schnabel (1983) "Numerical Methods for Unconstrained
 * Optimization and Nonlinear Equations", Prentice-Hall Inc., and Press et
 * al. (1992) "Numerical Recipes in C", Second Edition, Cambridge University
 * Press, were consulted for the polynomial interpolation used in the line
 * search implementation. <p>
 *
 * The Hessian modification in the BFGS update uses Cholesky factorization
 * and two rank-one modifications:<br>
 * Bk+1 = Bk + (Gk*Gk')/(Gk'*Dk) + (dGk*(dGk)')/[alpha*(dGk)'*Dk], <br>
 * where Gk is the gradient vector, dGk is the change in the gradient, Dk is
 * the direction vector and alpha is the step rate. <br>
 * This method is due to Gill, Golub, Murray and Saunders (1974) "Methods
 * for Modifying Matrix Factorizations", Mathematics of Computation,
 * Vol. 28, No. 126, pp. 505-535.
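 * <p>
 * For illustration only, the two rank-one corrections above can be written
 * out directly for a dense matrix as follows. This is a minimal sketch, not
 * the Cholesky-based update this class performs internally, and the names
 * <code>B</code>, <code>g</code>, <code>dg</code>, <code>d</code> and
 * <code>alpha</code> are local to this example: <p>
 * <code>
 * // B: n x n Hessian approximation (updated in place); g: gradient;
 * // dg: gradient change (new minus old); d: direction; alpha: step rate
 * static void bfgsUpdate(double[][] B, double[] g, double[] dg,
 *                        double[] d, double alpha){
 *     int n = g.length;
 *     double gd = 0.0, dgd = 0.0;  // Gk'*Dk and (dGk)'*Dk
 *     for(int i = 0; i < n; i++){
 *         gd  += g[i]*d[i];
 *         dgd += dg[i]*d[i];
 *     }
 *     for(int i = 0; i < n; i++)
 *         for(int j = 0; j < n; j++)
 *             B[i][j] += g[i]*g[j]/gd + dg[i]*dg[j]/(alpha*dgd);
 * }
 * </code>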
 * <p>
 *
 * @author Xin Xu (xx5@cs.waikato.ac.nz)
 * @version $Revision: 1.1.1.1 $
 */
public abstract class Optimization{

    protected double m_ALF = 1.0e-4;

    protected double m_BETA = 0.9;

    protected double m_TOLX = 1.0e-6;

    protected double m_STPMX = 100.0;

    protected int m_MAXITS = 200;

    protected static boolean m_Debug = false;

    // Function value
    protected double m_f;

    // G'*p
    private double m_Slope;

    // Test if zero step in lnsrch
    private boolean m_IsZeroStep = false;

    // Used when iteration overflow occurs
    private double[] m_X;

    // Compute machine precision
    protected static double m_Epsilon, m_Zero;
    static {
        m_Epsilon = 1.0;
        while(1.0 + m_Epsilon > 1.0){
            m_Epsilon /= 2.0;
        }
        m_Epsilon *= 2.0;
        m_Zero = Math.sqrt(m_Epsilon);
        if (m_Debug)
            System.err.print("Machine precision is " + m_Epsilon +
                             " and zero set to " + m_Zero);
    }

    /**
     * Subclasses should implement this method to evaluate the objective
     * function to be minimized.
     *
     * @param x the variable values
     * @return the objective function value
     */
    protected abstract double objectiveFunction(double[] x);

    /**
     * Subclasses should implement this method to evaluate the gradient
     * of the objective function.
     *
     * @param x the variable values
     * @return the gradient vector
     */
    protected abstract double[] evaluateGradient(double[] x);

    /**
     * Subclasses are recommended to override this method to evaluate one
     * row of the Hessian (the second-order gradient) of the objective
     * function. If it is not provided, it returns null.
     *
     * @param x the variables
     * @param index the row index in the Hessian matrix
     * @return one row (the row #index) of the Hessian matrix, null by default
     */
    protected double[] evaluateHessian(double[] x, int index){
        return null;
    }

    /**
     * Get the minimal function value.
     *
     * @return minimal function value found
     */
    public double getMinFunction(){ return m_f; }

    /**
     * Set the maximal number of iterations in searching (default 200).
     *
     * @param it the maximal number of iterations
     */
    public void setMaxIteration(int it){ m_MAXITS = it; }

    /**
     * Set whether to run in debug mode.
     *
     * @param db whether to output debug information
     */
    public void setDebug(boolean db){ m_Debug = db; }

    /**
     * Get the variable values. Only needed when the number of iterations
     * exceeds the maximum threshold.
     *
     * @return the current variable values
     */
    public double[] getVarbValues(){ return m_X; }
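    // For illustration, a minimal concrete subclass (hypothetical, not part
    // of weka.core) that minimizes the separable quadratic
    // f(x) = sum_i (x_i - 1)^2 could implement the two abstract methods
    // above like this:
    //
    //   class QuadOpt extends Optimization{
    //       protected double objectiveFunction(double[] x){
    //           double f = 0.0;
    //           for(int i = 0; i < x.length; i++)
    //               f += (x[i] - 1.0)*(x[i] - 1.0);   // (x_i - 1)^2
    //           return f;
    //       }
    //       protected double[] evaluateGradient(double[] x){
    //           double[] g = new double[x.length];
    //           for(int i = 0; i < x.length; i++)
    //               g[i] = 2.0*(x[i] - 1.0);          // df/dx_i
    //           return g;
    //       }
    //   }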
    /**
     * Find a new point x in the direction p from a point xold at which the
     * value of the function has decreased sufficiently, the positive
     * definiteness of the B matrix (approximation of the inverse of the
     * Hessian) is preserved and no bound constraints are violated. For
     * details see "Numerical Methods for Unconstrained Optimization and
     * Nonlinear Equations". "Numerical Recipes in C" was also consulted.
     *
     * @param xold old x value
     * @param gradient gradient at that point
     * @param direct direction vector
     * @param stpmax maximum step length
     * @param isFixed indicating whether a variable has been fixed
     * @param nwsBounds non-working-set bounds. Means these variables are
     *        free and subject to the bound constraints in this step
     * @param wsBdsIndx index of variables that have working-set bounds.
     *        Means these variables are already fixed and no longer subject
     *        to the constraints
     * @return new value along direction p from xold, null if no step was taken
     * @exception Exception if an error occurs
     */
    public double[] lnsrch(double[] xold, double[] gradient, double[] direct,
                           double stpmax, boolean[] isFixed,
                           double[][] nwsBounds, FastVector wsBdsIndx)
        throws Exception {

        int i, j, k, len = xold.length,
            fixedOne = -1;                  // Idx of variable to be fixed
        double alam, alamin;                // Lambda to be found, and its lower bound

        // For convergence and bound test
        double temp, test, alpha = Double.POSITIVE_INFINITY, fold = m_f, sum;

        // For cubic interpolation
        double a, alam2 = 0, b, disc = 0, maxalam = 1.0, rhs1, rhs2, tmplam;

        double[] x = new double[len];       // New variable values

        // Scale the step
        for (sum = 0.0, i = 0; i < len; i++){
            if(!isFixed[i])                 // For fixed variables, direction = 0
                sum += direct[i]*direct[i];
        }
        sum = Math.sqrt(sum);

        if (m_Debug)
            System.err.println("fold:   " + Utils.doubleToString(fold,10,7) + "\n" +
                               "sum:    " + Utils.doubleToString(sum,10,7) + "\n" +
                               "stpmax: " + Utils.doubleToString(stpmax,10,7));
        if (sum > stpmax){
            for (i = 0; i < len; i++)
                if(!isFixed[i])
                    direct[i] *= stpmax / sum;
        }
        else
            maxalam = stpmax / sum;

        // Compute the initial rate of decrease, g'*d
        m_Slope = 0.0;
        for (i = 0; i < len; i++){
            x[i] = xold[i];
            if(!isFixed[i])
                m_Slope += gradient[i]*direct[i];
        }

        if (m_Debug)
            System.err.print("slope:  " + Utils.doubleToString(m_Slope,10,7) + "\n");

        // Slope too small
        if(Math.abs(m_Slope) <= m_Zero){
            if (m_Debug)
                System.err.println("Gradient and direction orthogonal -- " +
                                   "Min. found with current fixed variables" +
                                   " (or all variables fixed). Try to release" +
                                   " some variables now.");
            return x;
        }

        // Err: slope > 0
        if(m_Slope > m_Zero){
            if(m_Debug)
                for(int h = 0; h < x.length; h++)
                    System.err.println(h + ": isFixed=" + isFixed[h] +
                                       ", x=" + x[h] +
                                       ", grad=" + gradient[h] +
                                       ", direct=" + direct[h]);
            throw new Exception("g'*p positive! -- Try to debug from here: line 263.");
        }

        // Compute LAMBDAmin and the upper bound of lambda -- alpha
        test = 0.0;
        for(i = 0; i < len; i++){
            if(!isFixed[i]){                // No need for fixed variables
                temp = Math.abs(direct[i]) / Math.max(Math.abs(x[i]), 1.0);
                if (temp > test) test = temp;
            }
        }

        if(test > m_Zero)                   // Not converged
            alamin = m_TOLX / test;
        else{
            if (m_Debug)
                System.err.println("Zero directions for all free variables -- " +
                                   "Min. found with current fixed variables" +
                                   " (or all variables fixed). Try to release" +
                                   " some variables now.");
            return x;
        }

        // Check whether any non-working-set bounds are "binding"
        for(i = 0; i < len; i++){
            if(!isFixed[i]){                // No need for fixed variables
                double alpi;
                if((direct[i] < -m_Epsilon) && !Double.isNaN(nwsBounds[0][i])){ // Not feasible
                    alpi = (nwsBounds[0][i] - xold[i]) / direct[i];
                    if(alpi <= m_Zero){     // Zero
                        if (m_Debug)
                            System.err.println("Fix variable " + i +
                                               " to lower bound " + nwsBounds[0][i] +
                                               " from value " + xold[i]);
                        x[i] = nwsBounds[0][i];
                        isFixed[i] = true;  // Fix this variable
                        alpha = 0.0;
                        nwsBounds[0][i] = Double.NaN; // Add cons. to working set
                        wsBdsIndx.addElement(new Integer(i));
                    }
                    else if(alpha > alpi){  // Fix one variable in one iteration
                        alpha = alpi;
                        fixedOne = i;
                    }
                }
                else if((direct[i] > m_Epsilon) && !Double.isNaN(nwsBounds[1][i])){ // Not feasible
                    alpi = (nwsBounds[1][i] - xold[i]) / direct[i];
                    if(alpi <= m_Zero){     // Zero
                        if (m_Debug)
                            System.err.println("Fix variable " + i +
                                               " to upper bound " + nwsBounds[1][i] +
                                               " from value " + xold[i]);
                        x[i] = nwsBounds[1][i];
                        isFixed[i] = true;  // Fix this variable
                        alpha = 0.0;
                        nwsBounds[1][i] = Double.NaN; // Add cons. to working set
                        wsBdsIndx.addElement(new Integer(i));
                    }
                    else if(alpha > alpi){
                        alpha = alpi;
                        fixedOne = i;
                    }
                }
            }
        }

        if (m_Debug){
            System.err.println("alamin: " + Utils.doubleToString(alamin,10,7));