/*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/*
* Optimization.java
* Copyright (C) 2003 Xin Xu
*
*/
package weka.core;
/**
* Implementation of the active-set method with BFGS updates
* to solve a multi-dimensional optimization problem subject to
* bound constraints only. In this implementation both the lower and upper
* bound constraints are considered. <p>
*
* Here is a sketch of the search strategy (a compact pseudocode rendering
* follows the loop description below); a detailed description of the
* algorithm can be found in the Appendix of Xin Xu's MSc thesis:<p>
* Initialize everything, incl. initial value, direction, etc.<p>
* LOOP (main algorithm):<br>
*
* 1. Perform the line search using the directions for the free variables<br>
* 1.1 Check all the bounds that are not "active" (i.e. not currently binding
* a variable) and compute the feasible step length to the bound for each
* of them<br>
* 1.2 Pick the smallest of these feasible step lengths, say \alpha, and set
* it as the upper bound of the current step length, i.e. 0<\lambda<=\alpha<br>
* 1.3 Search for a step length<=\alpha that results in the "sufficient
* function decrease" (\alpha condition) AND a "positive definite inverse
* Hessian" (\beta condition), if possible, using SAFEGUARDED polynomial
* interpolation. This step length is "safe" and thus
* is used to compute the next values of the free variables.<br>
* 1.4 Fix the variable(s) that are newly bound to their constraint(s).<p>
*
* 2. Check whether all the variables or their gradients have converged.
* If they have, check the possibility of releasing any current bindings of
* the fixed variables to their bounds, based on the "reliable" second-order
* Lagrange multipliers if available. If such a multiplier is available and
* negative for a variable, release that variable. If not available, use the
* first-order Lagrange multiplier to test for release. If no variable can be
* released, STOP the loop. Otherwise update the inverse Hessian matrix and
* the gradient for the newly released variables and CONTINUE the LOOP.<p>
*
* 3. Use the BFGS formula to update the inverse Hessian matrix. Note that the
* already-fixed variables must have zeros in the corresponding entries
* of the inverse Hessian.<p>
*
* 4. Compute the new (Newton) search direction d=-H^{-1}*g, where H^{-1} is the
* inverse Hessian and g is the gradient. Note that, again, the already-
* fixed variables will have zero direction.<p>
*
* ENDLOOP<p>
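*
* In compact pseudocode (an illustrative sketch only; these helper names are
* hypothetical and are not members of this class), the loop above is roughly:<p>
* <code>
* while(true){<br>
* alpha = feasibleStepToNearestBound(x, d); // steps 1.1-1.2<br>
* lambda = safeguardedLineSearch(x, d, alpha); // step 1.3<br>
* fixNewlyBoundVariables(x); // step 1.4<br>
* if(converged(x, g) && !releaseAnyBinding()) break; // step 2<br>
* H = bfgsUpdate(H, deltaX, deltaG); // step 3<br>
* d = timesNegative(H, g); // step 4: d = -H^{-1}*g<br>
* }<br>
* </code><p>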
*
* A typical usage of this class is to create a subclass of it that provides
* the objective function and gradients as follows:<p>
* <code>
* class MyOpt extends Optimization{ <br>
* // Provide the objective function <br>
* protected double objectiveFunction(double[] x){<br>
* // How to calculate your objective function...<br>
* // ...<br>
* }<p>
*
* // Provide the first derivatives<br>
* protected double[] evaluateGradient(double[] x){<br>
* // How to calculate the gradient of the objective function...<br>
* // ...<br>
* } <p>
*
* // If possible, provide the index^{th} row of the Hessian matrix<br>
* protected double[] evaluateHessian(double[] x, int index){<br>
* // How to calculate the index^th variable's second derivative<br>
* // ... <br>
* }<br>
* } <p>
*
* // When it is time to use it, in some routine(s) of another class...<br>
* MyOpt opt = new MyOpt();<p>
*
* // Set up initial variable values and bound constraints<br>
* double[] x = new double[numVariables];<br>
* // Lower and upper bounds: 1st row is lower bounds, 2nd is upper<br>
* double[][] constraints = new double[2][numVariables];<br>
* ...<p>
*
* // Find the minimum; runs up to 200 iterations by default<br>
* x = opt.findArgmin(x, constraints); <br>
* while(x == null){ // 200 iterations were not enough<br>
* x = opt.getVarbValues(); // Try another 200 iterations<br>
* x = opt.findArgmin(x, constraints);<br>
* }<p>
*
* // The minimal function value<br>
* double minFunction = opt.getMinFunction();<br>
* ...<p>
* </code>
* It is recommended that Hessian values be provided so that the second-order
* Lagrange multiplier estimate can be calculated. If the Hessian is not
* available, simply do not override the <code>evaluateHessian()</code>
* function; its default implementation returns null.<p>
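*
* For instance, a minimal concrete subclass (an illustrative sketch; the class
* name, objective function and bounds below are made up for this example) that
* minimizes f(x) = (x[0]-1)^2 + (x[1]-2)^2 subject to 0<=x[i]<=5 could be:<p>
* <code>
* class QuadOpt extends Optimization{<br>
* protected double objectiveFunction(double[] x){<br>
* return (x[0]-1.0)*(x[0]-1.0) + (x[1]-2.0)*(x[1]-2.0);<br>
* }<br>
* protected double[] evaluateGradient(double[] x){<br>
* return new double[]{2.0*(x[0]-1.0), 2.0*(x[1]-2.0)};<br>
* }<br>
* }<br>
* ...<br>
* QuadOpt opt = new QuadOpt();<br>
* double[] x = opt.findArgmin(new double[]{0.0, 0.0},<br>
* new double[][]{{0.0, 0.0}, {5.0, 5.0}});<br>
* // x should now be close to {1.0, 2.0}<br>
* </code><p>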
*
* REFERENCES:<br>
* The overall algorithm is adapted from Chapter 5 and other related chapters of
* Gill, Murray and Wright (1981) "Practical Optimization", Academic Press,
* and Gill and Murray (1976) "Minimization Subject to Bounds on the Variables", NPL
* Report NAC72, while Chong and Zak (1996) "An Introduction to Optimization",
* John Wiley & Sons, Inc. provides a brief but helpful introduction to the method. <p>
*
* Dennis and Schnabel (1983) "Numerical Methods for Unconstrained Optimization and
* Nonlinear Equations", Prentice-Hall Inc. and Press et al. (1992) "Numerical
* Recipes in C", Second Edition, Cambridge University Press were consulted for the
* polynomial interpolation used in the line search implementation. <p>
*
* The Hessian modification in the BFGS update uses Cholesky factorization and two
* rank-one modifications:<br>
* Bk+1 = Bk + (Gk*Gk')/(Gk'*Dk) + (dGk*(dGk)')/[alpha*(dGk)'*Dk], <br>
* where Gk is the gradient vector, dGk is the change in the gradient, Dk is the
* direction vector and alpha is the step rate. <br>
* This method is due to Gill, Golub, Murray and Saunders (1974) ``Methods for Modifying
* Matrix Factorizations'', Mathematics of Computation, Vol. 28, No. 126, pp. 505-535.
* <p>
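*
* For illustration only, the dense (unfactored) equivalent of these two
* rank-one corrections is sketched below; the actual implementation works on
* the Cholesky factors instead, and the array names (B, g, dg, d) are
* hypothetical:<p>
* <code>
* double gd = 0.0, dgd = 0.0;<br>
* for(int i = 0; i < n; i++){ gd += g[i]*d[i]; dgd += dg[i]*d[i]; }<br>
* for(int r = 0; r < n; r++)<br>
* for(int c = 0; c < n; c++)<br>
* B[r][c] += g[r]*g[c]/gd + dg[r]*dg[c]/(alpha*dgd);<br>
* </code><p>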
*
* @author Xin Xu (xx5@cs.waikato.ac.nz)
* @version $Revision$
*/
public abstract class Optimization{
protected double m_ALF = 1.0e-4;   // Constant for the sufficient function decrease (alpha) condition
protected double m_BETA = 0.9;     // Constant for the positive definite inverse Hessian (beta) condition
protected double m_TOLX = 1.0e-6;  // Convergence tolerance on the variable values
protected double m_STPMX = 100.0;  // Maximum step length in the line search
protected int m_MAXITS = 200;      // Maximum number of iterations (default 200)
// Whether to output debugging information
protected static boolean m_Debug = false;
// function value
protected double m_f;
// G'*p
private double m_Slope;
// Test if zero step in lnsrch
private boolean m_IsZeroStep = false;
// Holds the variable values when the iteration limit is exceeded
private double[] m_X;
// Compute machine precision
protected static double m_Epsilon, m_Zero;
static {
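// Halve m_Epsilon until 1.0+m_Epsilon is indistinguishable from 1.0, then
// back off one step: m_Epsilon is the machine epsilon of double arithmetic,
// and m_Zero (its square root) is used as the effective zero threshold.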
m_Epsilon=1.0;
while(1.0+m_Epsilon > 1.0){
m_Epsilon /= 2.0;
}
m_Epsilon *= 2.0;
m_Zero = Math.sqrt(m_Epsilon);
if (m_Debug)
System.err.print("Machine precision is "+m_Epsilon+
" and zero set to "+m_Zero);
}
/**
* Subclasses should implement this method to evaluate the objective
* function to be minimized.
*
* @param x the variable values
* @return the objective function value
* @exception Exception if an error occurs
*/
protected abstract double objectiveFunction(double[] x) throws Exception;
/**
* Subclasses should implement this method to evaluate the gradient
* of the objective function.
*
* @param x the variable values
* @return the gradient vector
* @exception Exception if an error occurs
*/
protected abstract double[] evaluateGradient(double[] x) throws Exception;
/**
* Subclasses are recommended to override this method to evaluate the
* second-order derivatives (one row of the Hessian matrix) of the objective
* function. If it is not overridden, it returns null.
*
* @param x the variables
* @param index the row index in the Hessian matrix
* @return one row (row #index) of the Hessian matrix, null by default
* @exception Exception if an error occurs
*/
protected double[] evaluateHessian(double[] x, int index) throws Exception{
return null;
}
/**
* Get the minimal function value
*
* @return minimal function value found
*/
public double getMinFunction(){ return m_f;}
/**
* Set the maximum number of iterations in the search (default 200)
*
* @param it the maximum number of iterations
*/
public void setMaxIteration(int it){ m_MAXITS=it; }
/**
* Set whether debug mode is on
*
* @param db whether to print debugging information
*/
public void setDebug(boolean db){ m_Debug = db; }
/**
* Get the variable values. Only needed when the number of iterations
* exceeds the maximum.
*
* @return the current variable values
*/
public double[] getVarbValues(){ return m_X; }
/**
* Find a new point x in the direction p from a point xold at which the
* value of the function has decreased sufficiently, the positive
* definiteness of the B matrix (the approximation of the inverse Hessian)
* is preserved, and no bound constraints are violated. For details see
* "Numerical Methods for Unconstrained Optimization and Nonlinear
* Equations". "Numerical Recipes in C" was also consulted.
*
* @param xold old x value
* @param gradient gradient at that point
* @param direct direction vector
* @param stpmax maximum step length
* @param isFixed indicates whether each variable has been fixed
* @param nwsBounds non-working-set bounds; these variables are free and
* subject to the bound constraints in this step
* @param wsBdsIndx indices of the variables that have working-set bounds; these
* variables are already fixed and no longer subject to
* the constraints
* @return new value along direction p from xold, null if no step was taken
* @exception Exception if an error occurs
*/
public double[] lnsrch(double[] xold, double[] gradient,
double[] direct, double stpmax,
boolean[] isFixed, double[][] nwsBounds,
FastVector wsBdsIndx)
throws Exception {
int i, j, k,len=xold.length,
fixedOne=-1; // idx of variable to be fixed
double alam, alamin; // lambda to be found, and its lower bound
// For convergence and bound test
double temp,test,alpha=Double.POSITIVE_INFINITY,fold=m_f,sum;
// For cubic interpolation
double a,alam2=0,b,disc=0,maxalam=1.0,rhs1,rhs2,tmplam;
double[] x = new double[len]; // New variable values
// Scale the step
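// If the proposed step is longer than stpmax, shrink the free components of
// the direction so its norm equals stpmax; otherwise record how far the step
// may be extended (maxalam = stpmax/||direct||).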
for (sum=0.0,i=0;i<len;i++){
if(!isFixed[i]) // For fixed variables, direction = 0
sum += direct[i]*direct[i];
}
sum = Math.sqrt(sum);
if (m_Debug)
System.err.println("fold: "+Utils.doubleToString(fold,10,7)+"\n"+
"sum: "+Utils.doubleToString(sum,10,7)+"\n"+
"stpmax: "+Utils.doubleToString(stpmax,10,7));
if (sum > stpmax){
for (i=0;i<len;i++)
if(!isFixed[i])
direct[i] *= stpmax/sum;
}
else
maxalam = stpmax/sum;
// Compute initial rate of decrease, g'*d
m_Slope=0.0;
for (i=0;i<len;i++){
x[i] = xold[i];
if(!isFixed[i])
m_Slope += gradient[i]*direct[i];
}
if (m_Debug)
System.err.print("slope: " + Utils.doubleToString(m_Slope,10,7)+ "\n");
// Slope too small
if(Math.abs(m_Slope)<=m_Zero){
if (m_Debug)
System.err.println("Gradient and direction orthogonal -- "+
"Min. found with current fixed variables"+
" (or all variables fixed). Try to release"+
" some variables now.");
return x;
}
// Err: slope > 0
if(m_Slope > m_Zero){
if(m_Debug)
for(int h=0; h<x.length; h++)
System.err.println(h+": isFixed="+isFixed[h]+", x="+
x[h]+", grad="+gradient[h]+", direct="+
direct[h]);
throw new Exception("g'*p positive! -- Try to debug from here.");
}
// Compute LAMBDAmin and upper bound of lambda--alpha
test=0.0;
for(i=0;i<len;i++){
if(!isFixed[i]){// No need for fixed variables
temp=Math.abs(direct[i])/Math.max(Math.abs(x[i]),1.0);
if (temp > test) test=temp;
}
}
if(test>m_Zero) // Not converged
alamin = m_TOLX/test;
else{
if (m_Debug)
System.err.println("Zero directions for all free variables -- "+
"Min. found with current fixed variables"+
" (or all variables fixed). Try to release"+
" some variables now.");
return x;
}
// Check whether any non-working-set bounds are "binding"
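// For each free variable moving toward a finite bound, alpi is the step
// length at which that bound would be hit; the smallest such alpi caps the
// step (alpha), and a variable already at its bound is fixed immediately.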
for(i=0;i<len;i++){
if(!isFixed[i]){// No need for fixed variables
double alpi;
if((direct[i]<-m_Epsilon) && !Double.isNaN(nwsBounds[0][i])){//Not feasible
alpi = (nwsBounds[0][i]-xold[i])/direct[i];
if(alpi <= m_Zero){ // Zero
if (m_Debug)
System.err.println("Fix variable "+i+
" to lower bound "+ nwsBounds[0][i]+
" from value "+ xold[i]);
x[i] = nwsBounds[0][i];
isFixed[i]=true; // Fix this variable
alpha = 0.0;
nwsBounds[0][i]=Double.NaN; //Add cons. to working set
wsBdsIndx.addElement(new Integer(i));
}
else if(alpha > alpi){ // Fix one variable in one iteration
alpha = alpi;
fixedOne = i;
}
}
else if((direct[i]>m_Epsilon) && !Double.isNaN(nwsBounds[1][i])){//Not feasible
alpi = (nwsBounds[1][i]-xold[i])/direct[i];
if(alpi <= m_Zero){ // Zero
if (m_Debug)
System.err.println("Fix variable "+i+
" to upper bound "+ nwsBounds[1][i]+
" from value "+ xold[i]);
x[i] = nwsBounds[1][i];
isFixed[i]=true; // Fix this variable
alpha = 0.0;
nwsBounds[1][i]=Double.NaN; //Add cons. to working set
wsBdsIndx.addElement(new Integer(i));
}
else if(alpha > alpi){
alpha = alpi;
fixedOne = i;
}
}
}
}
if (m_Debug){
System.err.println("alamin: " + Utils.doubleToString(alamin,10,7));
System.err.println("alpha: " + Utils.doubleToString(alpha,10,7));
}
if(alpha <= m_Zero){ // Zero
m_IsZeroStep = true;
if (m_Debug)
System.err.println("Alpha too small, try again");
return x;
}
alam = alpha; // Always try full feasible newton step
if(alam > 1.0)
alam = 1.0; // Clamp the initial trial step to the full Newton step
// ...