📄 lbfgs.h

📁 最大熵等模型使用的 lbfgs 训练源代码。
💻 H
📖 第 1 页 / 共 4 页
字号:
12 3 4 下一页
/* Copyright (c) 2001-2002 The Regents of the University of California   through E.O. Lawrence Berkeley National Laboratory, subject to   approval by the U.S. Department of Energy.   See files COPYRIGHT.txt and LICENSE.txt for further details.   Revision history:     2002 Aug: Copy of cctbx/lbfgs.h (rwgk)     2002 Mar: Created (R.W. Grosse-Kunstleve) */#ifndef SCITBX_LBFGS_H#define SCITBX_LBFGS_H#include <stdio.h>#include <cstddef>#include <cmath>#include <stdexcept>#include <algorithm>#include <vector>#include <string>#include <fstream>namespace scitbx {//! Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) %minimizer./*! Implementation of the    Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS)    algorithm for large-scale multidimensional minimization    problems.    This code was manually derived from Java code which was    in turn derived from the Fortran program    <code>lbfgs.f</code>.  The Java translation was    effected mostly mechanically, with some manual    clean-up; in particular, array indices start at 0    instead of 1.  Most of the comments from the Fortran    code have been pasted in.    Information on the original LBFGS Fortran source code is    available at    http://www.netlib.org/opt/lbfgs_um.shar . The following    information is taken verbatim from the Netlib documentation    for the Fortran source.    <pre>    file    opt/lbfgs_um.shar    for     unconstrained optimization problems    alg     limited memory BFGS method    by      J. Nocedal    contact nocedal@eecs.nwu.edu    ref     D. C. Liu and J. Nocedal, ``On the limited memory BFGS method for    ,       large scale optimization methods'' Mathematical Programming 45    ,       (1989), pp. 503-528.    ,       (Postscript file of this paper is available via anonymous ftp    ,       to eecs.nwu.edu in the directory pub/%lbfgs/lbfgs_um.)    </pre>    @author Jorge Nocedal: original Fortran version, including comments    (July 1990).<br>    Robert Dodier: Java translation, August 1997.<br>    Ralf W. Grosse-Kunstleve: C++ port, March 2002. */namespace lbfgs {  //! Generic exception class for %lbfgs %error messages.  /*! All exceptions thrown by the minimizer are derived from this class.   */  class error : public std::exception {    public:      //! Constructor.      error(std::string const& msg) throw()        : msg_("lbfgs error: " + msg)      {}      //! Access to error message.      virtual const char* what() const throw() { return msg_.c_str(); }    protected:      virtual ~error() throw() {}      std::string msg_;#ifndef DOXYGEN_SHOULD_SKIP_THIS    public:      static std::string itoa(unsigned long i) {        char buf[80];        sprintf(buf, "%lu", i); // FUTURE: use C++ facility        return std::string(buf);      }#endif  };  //! Specific exception class.  class error_internal_error : public error {    public:      //! Constructor.      error_internal_error(const char* file, unsigned long line) throw()        : error(            "Internal Error: " + std::string(file) + "(" + itoa(line) + ")")      {}  };  //! Specific exception class.  class error_improper_input_parameter : public error {    public:      //! Constructor.      error_improper_input_parameter(std::string const& msg) throw()        : error("Improper input parameter: " + msg)      {}  };  //! Specific exception class.  class error_improper_input_data : public error {    public:      //! Constructor.      error_improper_input_data(std::string const& msg) throw()        : error("Improper input data: " + msg)      {}  };  //! Specific exception class.  class error_search_direction_not_descent : public error {    public:      //! Constructor.      error_search_direction_not_descent() throw()        : error("The search direction is not a descent direction.")      {}  };  //! Specific exception class.  class error_line_search_failed : public error {    public:      //! Constructor.      error_line_search_failed(std::string const& msg) throw()        : error("Line search failed: " + msg)      {}  };  //! Specific exception class.  class error_line_search_failed_rounding_errors  : public error_line_search_failed {    public:      //! Constructor.      error_line_search_failed_rounding_errors(std::string const& msg) throw()        : error_line_search_failed(msg)      {}  };  namespace detail {    template <typename NumType>    inline    NumType    pow2(NumType const& x) { return x * x; }    template <typename NumType>    inline    NumType    abs(NumType const& x) {      if (x < NumType(0)) return -x;      return x;    }    // This class implements an algorithm for multi-dimensional line search.    template <typename FloatType, typename SizeType = std::size_t>    class mcsrch    {      protected:        int infoc;        FloatType dginit;        bool brackt;        bool stage1;        FloatType finit;        FloatType dgtest;        FloatType width;        FloatType width1;        FloatType stx;        FloatType fx;        FloatType dgx;        FloatType sty;        FloatType fy;        FloatType dgy;        FloatType stmin;        FloatType stmax;        static FloatType const& max3(          FloatType const& x,          FloatType const& y,          FloatType const& z)        {          return x < y ? (y < z ? z : y ) : (x < z ? z : x );        }      public:        /* Minimize a function along a search direction. This code is           a Java translation of the function <code>MCSRCH</code> from           <code>lbfgs.f</code>, which in turn is a slight modification           of the subroutine <code>CSRCH</code> of More' and Thuente.           The changes are to allow reverse communication, and do not           affect the performance of the routine. This function, in turn,           calls <code>mcstep</code>.<p>           The Java translation was effected mostly mechanically, with           some manual clean-up; in particular, array indices start at 0           instead of 1.  Most of the comments from the Fortran code have           been pasted in here as well.<p>           The purpose of <code>mcsrch</code> is to find a step which           satisfies a sufficient decrease condition and a curvature           condition.<p>           At each stage this function updates an interval of uncertainty           with endpoints <code>stx</code> and <code>sty</code>. The           interval of uncertainty is initially chosen so that it           contains a minimizer of the modified function           <pre>                f(x+stp*s) - f(x) - ftol*stp*(gradf(x)'s).           </pre>           If a step is obtained for which the modified function has a           nonpositive function value and nonnegative derivative, then           the interval of uncertainty is chosen so that it contains a           minimizer of <code>f(x+stp*s)</code>.<p>           The algorithm is designed to find a step which satisfies           the sufficient decrease condition           <pre>                 f(x+stp*s) &lt;= f(X) + ftol*stp*(gradf(x)'s),           </pre>           and the curvature condition           <pre>                 abs(gradf(x+stp*s)'s)) &lt;= gtol*abs(gradf(x)'s).           </pre>           If <code>ftol</code> is less than <code>gtol</code> and if,           for example, the function is bounded below, then there is           always a step which satisfies both conditions. If no step can           be found which satisfies both conditions, then the algorithm           usually stops when rounding errors prevent further progress.           In this case <code>stp</code> only satisfies the sufficient           decrease condition.<p>           @author Original Fortran version by Jorge J. More' and             David J. Thuente as part of the Minpack project, June 1983,             Argonne National Laboratory. Java translation by Robert             Dodier, August 1997.           @param n The number of variables.           @param x On entry this contains the base point for the line             search. On exit it contains <code>x + stp*s</code>.           @param f On entry this contains the value of the objective             function at <code>x</code>. On exit it contains the value             of the objective function at <code>x + stp*s</code>.           @param g On entry this contains the gradient of the objective             function at <code>x</code>. On exit it contains the gradient             at <code>x + stp*s</code>.           @param s The search direction.           @param stp On entry this contains an initial estimate of a             satifactory step length. On exit <code>stp</code> contains             the final estimate.           @param ftol Tolerance for the sufficient decrease condition.           @param xtol Termination occurs when the relative width of the             interval of uncertainty is at most <code>xtol</code>.           @param maxfev Termination occurs when the number of evaluations             of the objective function is at least <code>maxfev</code> by             the end of an iteration.           @param info This is an output variable, which can have these             values:             <ul>             <li><code>info = -1</code> A return is made to compute                 the function and gradient.             <li><code>info = 1</code> The sufficient decrease condition                 and the directional derivative condition hold.             </ul>           @param nfev On exit, this is set to the number of function             evaluations.           @param wa Temporary storage array, of length <code>n</code>.         */        void run(          FloatType const& gtol,          FloatType const& stpmin,          FloatType const& stpmax,          SizeType n,          FloatType* x,          FloatType f,          const FloatType* g,          FloatType* s,          SizeType is0,          FloatType& stp,          FloatType ftol,          FloatType xtol,          SizeType maxfev,          int& info,          SizeType& nfev,          FloatType* wa);        /* The purpose of this function is to compute a safeguarded step           for a linesearch and to update an interval of uncertainty for           a minimizer of the function.<p>           The parameter <code>stx</code> contains the step with the           least function value. The parameter <code>stp</code> contains           the current step. It is assumed that the derivative at           <code>stx</code> is negative in the direction of the step. If           <code>brackt</code> is <code>true</code> when           <code>mcstep</code> returns then a minimizer has been           bracketed in an interval of uncertainty with endpoints           <code>stx</code> and <code>sty</code>.<p>           Variables that must be modified by <code>mcstep</code> are           implemented as 1-element arrays.           @param stx Step at the best step obtained so far.             This variable is modified by <code>mcstep</code>.           @param fx Function value at the best step obtained so far.             This variable is modified by <code>mcstep</code>.           @param dx Derivative at the best step obtained so far.             The derivative must be negative in the direction of the             step, that is, <code>dx</code> and <code>stp-stx</code> must             have opposite signs.  This variable is modified by             <code>mcstep</code>.           @param sty Step at the other endpoint of the interval of             uncertainty. This variable is modified by <code>mcstep</code>.           @param fy Function value at the other endpoint of the interval             of uncertainty. This variable is modified by             <code>mcstep</code>.           @param dy Derivative at the other endpoint of the interval of             uncertainty. This variable is modified by <code>mcstep</code>.           @param stp Step at the current step. If <code>brackt</code> is set             then on input <code>stp</code> must be between <code>stx</code>             and <code>sty</code>. On output <code>stp</code> is set to the             new step.           @param fp Function value at the current step.           @param dp Derivative at the current step.           @param brackt Tells whether a minimizer has been bracketed.             If the minimizer has not been bracketed, then on input this             variable must be set <code>false</code>. If the minimizer has             been bracketed, then on output this variable is             <code>true</code>.           @param stpmin Lower bound for the step.           @param stpmax Upper bound for the step.           If the return value is 1, 2, 3, or 4, then the step has           been computed successfully. A return value of 0 indicates
12 3 4 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -