📄 lbfgs.c

📁 NIST Handwriting OCR Testbed
💻 C
📖 第 1 页 / 共 3 页
字号:
12 3 下一页
/*# proc: lbfgs - Limited Memory BFGS Method for Large Scale Optimization by# proc:         Jorge Nocedal# proc: lb1 - Prints monitoring information w.r.t. lbfgs optimization.# proc: mcsrch - Finds a step which satisfies a sufficient decrease condition# proc:          and a curvature condition.# proc: mcstep - Computes a safeguarded step for a linesearch and to update an# proc:          interval of uncertainty for a minimizer of the function.*//* This is a C version of Nocedal's LBFGS routine, Limited-Memory BFGSoptimizer, and its supporting routines LB1, MCSRCH, and MCSTEP.  Itwas translated by gtc from a provided Fortran version (which hadalready been modified by jlb), by a combination of editing, andrunning the f2c Fortran-to-C converter program.(The parm EPS which was in the Fortran version of LBFGS() is notpresent in this version.  It was removed because, in the Fortranversion (as modified by jlb), it is not used.) */#include <stdio.h>#include <math.h>#include <mlp/defs.h>#include <mlp/macros.h>#include <mlp/blas.h>/* To activate surveying, uncomment following line: *//* #define SURVEY *//*******************************************************************/  /* --------------     function lbfgs     --------------     Limited Memory BFGS Method for Large Scale Optimization                        Jorge Nocedal                      *** July 1990 ***  This function solves the unconstrained minimization problem            min f(x),    x = (x[0],x[1],...,x[n-1]),  using the limited memory BFGS method.  The routine is especially  effective on problems involving a large number of variables.  In  a typical iteration of this method an approximation Hk to the  inverse of the Hessian is obtained by applying m BFGS updates to  a diagonal matrix Hk0, using information from the previous m steps.  The user specifies the number m, which determines the amount of  storage required by the routine.  The user may also provide the  diagonal matrices Hk0 if not satisfied with the default choice.  The algorithm is described in "On the limited memory BFGS method  for large scale optimization", by D. Liu and J. Nocedal,  Mathematical Programming B 45 (1989) 503-528.  The user is required to calculate the function value f and its  gradient g.  In order to allow the user complete control over these  computations, reverse communication is used.  The routine must be  called repeatedly under the control of the parameter iflag.  The steplength is determined at each iteration by means of the  line search routine mcsrch, which is a slight modification of  the routine CSRCH written by More' and Thuente.  The calling statement is:    lbfgs(n, m, x, f, g, diagco, diag, iprint, xtol, work, iflag,      info, fp_mon, fp_err, gtol, stpmin, stpmax, itopt, iter, ierr,      stp);  where:  [(input) int]  n       is an int variable that must be set by the user to the          number of variables.  It is not altered by the routine.          Restriction: n > 0.  [(input) int]  m       is an int variable that must be set by the user to the          number of corrections used in the bfgs update.  It is not          altered by the routine. Values of m less than 3 are not          recommended; large values of m will result in excessive          computing time.  3 <= m <= 7 is recommended.          Restriction: m > 0.  [float vector]  x       is a float array of length n.  On initial entry it must be          set by the user to the values of the initial estimate of the          solution vector.  On exit with *iflag = 0, it contains the          values of the variables at the best point found (usually a          solution).  [(input) float]  f       is a float variable.  Before initial entry and on a re-entry          with *iflag = 1, it must be set by the user to contain the          value of the function f at the point x.  [float vector]  g       is a float array of length n.  Before initial entry and on a          re-entry with *iflag = 1, it must be set by the user to          contain the components of the gradient at the point x.  [(input) int]  diagco  is an int variable that must be set to TRUE if the user          wishes to provide the diagonal matrix Hk0 at each iteration.          Otherwise it should be set to FALSE, in which case lbfgs          will use a default value described below.  If diagco is set          to TRUE the routine will return at each iteration of the          algorithm with *iflag = 2, and the diagonal matrix Hk0 must          be provided in the array diag.  [float vector]  diag    is a float array of length n.  If diagco is TRUE, then on          initial entry or on re-entry with *iflag = 2, diag must be          set by the user to contain the values of the diagonal matrix          Hk0.  Restriction: all elements of diag must be positive.  [int vector]  iprint  is an int array of length two which must be set by the          user as follows:          iprint[0] specifies the frequency of the output:             iprint[0] < 0 : no output is generated.             iprint[0] = 0 : output only at first and last iteration.             iprint[0] > 0 : output every iprint[0] iterations.          iprint[1] specifies the type of output generated:             iprint[1] = 0 : iteration count, number of function                             evaluations, function value, norm of the                             gradient, and steplength.             iprint[1] = 1 : same as iprint[1] = 0, plus vector of                             variables and  gradient vector at the                             initial point.             iprint[1] = 2 : same as iprint[1] = 1, plus vector of                             variables.             iprint[1] = 3 : same as iprint[1] = 2, plus gradient                             vector.  [(input) float]  xtol    is a positive float variable that must be set by the user to          an estimate of the machine precision (e.g. 10^(-7) on a SUN          station 3/60).  The line search routine will terminate if          the relative width of the interval of uncertainty is less          than xtol.  [float vector]  work    is a float array of length n(2m+1)+2m used as workspace for          lbfgs.  This array must not be altered by the user.  [(input/output) int-pointer]  iflag   is an int-pointer, used for input and output of an int          defined in the calling routine; *iflag must be 0 on initial          entry to lbfgs.  A return with *iflag < 0 indicates an error,          and *iflag = 0 indicates that the routine has terminated          without detecting errors.  On a return with *iflag = 1, the          user must compute the function value f and gradient g.  On          a return with *iflag = 2, the user must provide the diagonal          matrix Hk0.          The following negative values of *iflag, detecting an error,          are possible:          *iflag = -1  The line search routine mcsrch failed.  The                       output parameter "info" provides more detailed                       information (see also the documentation of                       mcsrch):                       *info = 0  Improper input parameters.                       *info = 2  Relative width of the interval of                                  uncertainty is at most xtol.                       *info = 3  More than maxfev function evaluations                                  were required at the present                                  iteration.                       *info = 4  The step is too small.                       *info = 5  The step is too large.                       *info = 6  Rounding errors prevent further                                  progress.  There may not be a step                                  which satisfies the sufficient                                  decrease and curvature conditions.                                  Tolerances may be too small.          *iflag = -2  The i-th diagonal element of the diagonal                       inverse Hessian approximation, given in diag,                       is not positive.          *iflag = -3  Improper input parameters for lbfgs (n or m is                       not positive).  [Parms info through stpmax added by gtc: info was always returned  by mcsrch but was not returned by lbfgs, and fp_mon, fp_err, gtol,  stpmin, and stpmax were global variables in the LB3 common block  in the Fortran version.]  [(output) int-pointer]  info   returns the info value returned by mcsrch.  This is of         interest only if *iflag = -1.  [(input) FILE-pointer]  fp_mon will be used for the writing of monitoring information, as         controlled by iprint.  [(input) FILE-pointer]  fp_err will be used for the writing of error messages.  This writing         may be suppressed by setting fp_err to (FILE *)NULL.  [(input) float]  gtol   controls the accuracy of the line search routine mcsrch.  If         the function and gradient evaluations are inexpensive with         respect to the cost of the iteration (which is sometimes the         case when solving very large problems) it may be advantageous         to set gtol to a small value.  A typical small value is 0.1.         Restriction: gtol should be greater than 1.e-04.  [(input) float]  stpmin and stpmax         are nonnegative variables which specify lower and upper         bounds for the step in the line search.  The recommended         values (in original Fortran version, default values) are         1.e-20 and 1.e+20.  These values need not be modified unless         the exponents are too large for the machine being used, or         unless the problem is extremely badly scaled (in which case         the exponents should be increased).   [Remaining parms added by jlb:]  [int-pointer]  itopt  is ? [This is just passed to optchk as its second arg,    which is supposed to be how many iterations the optimization    has used so far.  Should eventually be changed to an int, not    int-pointer.]  [int-pointer]  iter   is ?  [(output) int-pointer]  ierr   returns the ierr value returned by optchk().  [Note that         depending on *ierr, *iflag can (?) be set differently than         indicated in its comment above, which therefore should be         modified.]  [float-pointer]  stp    is ?  Machine Dependencies:    The only variables that are machine-dependent are xtol, stpmin    and stpmax.   General Information:    Other routines called directly:  saxpy (BLAS), sdot (BLAS),      lb1 (code in this file), mcsrch (code in this file).    Input/Output:  no input; diagnostic messages to fp_mon and      error messages to fp_err. */voidlbfgs(n, m, x, f, g, diagco, diag, iprint, xtol, work, iflag, info,  fp_mon, fp_err, gtol, stpmin, stpmax, itopt, iter, ierr, stp)int n, m, diagco, *iprint, *iflag, *info, *itopt, *iter, *ierr;FILE *fp_mon, *fp_err;float *x, f, *g, *diag, xtol, *work, gtol, stpmin, stpmax, *stp;{#ifdef SURVEY  char str[50];  void fsaso(), survey();#endif    static int inmc, iscn, nfev, iycn, nfun, ispt, iypt, i, bound,    point, cp, finish, maxfev, npt, one = 1;  float r1;  static float beta, ftol, gnorm, xnorm, sq, yr, ys, yy, pctmin_junk,    stp1;  void mcsrch(), optchk(), lb1();  /* parameter adjustments */  --work;  --iprint;  --diag;  --g;  --x;  /* Initialize     ---------- */  if(*iflag == 0)    goto L10;  switch ((int)(*iflag)) {  case 1:  goto L172;  case 2:  goto L100;  } L10:  *iter = 0;  if(n <= 0 || m <= 0)    goto L196;  if(gtol <= 1.e-4) {    if(fp_err != (FILE *)NULL)      fprintf(fp_err, "\n gtol is less than or equal to 1.e-04 ;\n\ it has been reset to 9.e-01 .\n");    gtol = .9;  }  nfun = 1;  point = 0;  finish = FALSE;  if(diagco) {    for(i = 1; i <= n; i++)      if(diag[i] <= 0.)	goto L195;  }  else    for(i = 1; i <= n; i++)      diag[i] = 1.;  /* The work vector w is divided as follows:     ----------------------------------------     Locations 1,...,n are used to store the gradient and other       temporary information.     Locations n+1,...,n+m store the scalars rho.     Locations n+m+1,...,n+2m store the numbers alpha used in the       formula that computes h*g.     Locations n+2m+1,...,n+2m+nm store the last m search steps.     Locations n+2m+nm+1,...,n+2m+2nm store the last m gradient       differences.     The search steps and gradient differences are stored in a     circular order controlled by the parameter point. */  ispt = n + (m << 1);  iypt = ispt + n * m;  for(i = 1; i <= n; i++)    work[ispt + i] = -g[i] * diag[i];  gnorm = snrm2_(&n, &g[1], &one);  stp1 = 1. / gnorm;  /* parameters for line search routine */  /* ftol = 1.e-4; */  /* maxfev = 20; */  /* jlb: */  ftol = 1.e-10;  maxfev = 5;  if(iprint[1] >= 0)    lb1(&iprint[1], *iter, nfun, gnorm, n, m, &x[1], f, &g[1], *stp,      finish, fp_mon);  /* -------------------     Main iteration loop     ------------------- */ L80:  (*iter)++;  *info = 0;  bound = *iter - 1;  if(*iter == 1)    goto L165;  if(*iter > m)    bound = m;  ys = sdot_(&n, &work[iypt + npt + 1], &one, &work[ispt + npt + 1],    &one);  if(!diagco) {    yy = sdot_(&n, &work[iypt + npt + 1], &one, &work[iypt + npt + 1],      &one);    for(i = 1; i <= n; i++)      diag[i] = ys / yy;  }  else {    *iflag = 2;    return;  } L100:  if(diagco)    for(i = 1; i <= n; i++)      if(diag[i] <= 0.)	goto L195;  /* -----------------------------------------------------------     Compute -H*G using the formula given in:  Nocedal, J. 1980,     "Updating quasi-Newton matrices with limited storage",     Mathematics of Computation, Vol. 24, No. 151, pp. 773-782.     ----------------------------------------------------------- */  cp = point;  if(point == 0)    cp = m;  work[n + cp] = 1. / ys;  for(i = 1; i <= n; i++)    work[i] = -g[i];  cp = point;  for(i = 1; i <= bound; i++) {    --cp;    if(cp == -1)      cp = m - 1;    sq = sdot_(&n, &work[ispt + cp * n + 1], &one, &work[1], &one);    inmc = n + m + cp + 1;    iycn = iypt + cp * n;    work[inmc] = work[n + cp + 1] * sq;
12 3 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -