small variations in @math{x}. The relationship between these quantities
is given by @c{$\delta{f} = g\,\delta{x}$}
@math{\delta f = g \delta x}.
@end deftypefun
@deftypefun int gsl_multimin_test_size (const double @var{size}, double @var{epsabs})
This function tests the minimizer-specific characteristic
size (if applicable to the minimizer in use) against the absolute tolerance @var{epsabs}.
The test returns @code{GSL_SUCCESS} if the size is smaller than the tolerance,
and @code{GSL_CONTINUE} otherwise.
@end deftypefun
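As a sketch of how these tests are typically used inside an iteration loop
(assuming @code{s} is an already-allocated minimizer; complete programs appear
in the Examples section below):

@smallexample
/* For a simplex (fminimizer) search: stop once the characteristic
   size falls below the absolute tolerance 1e-3.  */
double size = gsl_multimin_fminimizer_size (s);
int status = gsl_multimin_test_size (size, 1e-3);

/* For a gradient-based (fdfminimizer) search the corresponding test
   checks the gradient norm against an absolute tolerance instead:

   status = gsl_multimin_test_gradient (s->gradient, 1e-3);  */
@end smallexample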
@node Multimin Algorithms
@section Algorithms
There are several minimization methods available. The best choice of
algorithm depends on the problem. All of the algorithms use the value
of the function at each evaluation point, and most of them also use its
gradient.
@deffn {Minimizer} gsl_multimin_fdfminimizer_conjugate_fr
@cindex Fletcher-Reeves conjugate gradient algorithm, minimization
@cindex Conjugate gradient algorithm, minimization
@cindex Minimization, conjugate gradient algorithm
This is the Fletcher-Reeves conjugate gradient algorithm. The conjugate
gradient algorithm proceeds as a succession of line minimizations. The
sequence of search directions is used to build up an approximation to the
curvature of the function in the neighborhood of the minimum. An
initial search direction @var{p} is chosen using the gradient and line
minimization is carried out in that direction. The accuracy of the line
minimization is specified by the parameter @var{tol}. At the minimum
along this line the function gradient @var{g} and the search direction
@var{p} are orthogonal. The line minimization terminates when
@c{$p\cdot g < tol |p| |g|$}
@math{dot(p,g) < tol |p| |g|}. The
search direction is updated using the Fletcher-Reeves formula
@math{p' = g' - \beta g} where @math{\beta=-|g'|^2/|g|^2}, and
the line minimization is then repeated for the new search
direction.
@end deffn
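The magnitude of the coefficient @math{\beta} above is simply the ratio of
squared gradient norms. The following is a minimal sketch of that ratio (the
helper name is ours and is purely illustrative; the minimizer computes the
update internally during @code{gsl_multimin_fdfminimizer_iterate}):

@smallexample
#include <gsl/gsl_vector.h>
#include <gsl/gsl_blas.h>

/* Fletcher-Reeves ratio |g_new|^2 / |g_old|^2, shown only to
   illustrate the coefficient used in the update above.  */
static double
fletcher_reeves_ratio (const gsl_vector *g_old, const gsl_vector *g_new)
@{
  double num, den;
  gsl_blas_ddot (g_new, g_new, &num);   /* |g_new|^2 */
  gsl_blas_ddot (g_old, g_old, &den);   /* |g_old|^2 */
  return num / den;
@}
@end smallexample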
@deffn {Minimizer} gsl_multimin_fdfminimizer_conjugate_pr
@cindex Polak-Ribiere algorithm, minimization
@cindex Minimization, Polak-Ribiere algorithm
This is the Polak-Ribiere conjugate gradient algorithm. It is similar
to the Fletcher-Reeves method, differing only in the choice of the
coefficient @math{\beta}. Both methods work well when the evaluation
point is close enough to the minimum of the objective function that it
is well approximated by a quadratic hypersurface.
@end deffn
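The standard Polak-Ribiere choice replaces the Fletcher-Reeves ratio with
@math{g'\cdot(g'-g) / (g\cdot g)}. A corresponding sketch, again purely
illustrative and with a helper name of our own:

@smallexample
#include <gsl/gsl_vector.h>
#include <gsl/gsl_blas.h>

/* Polak-Ribiere coefficient g_new.(g_new - g_old) / (g_old.g_old),
   shown for comparison with the Fletcher-Reeves ratio above.  */
static double
polak_ribiere_beta (const gsl_vector *g_old, const gsl_vector *g_new)
@{
  double num = 0.0, den;
  size_t i;
  for (i = 0; i < g_new->size; i++)
    num += gsl_vector_get (g_new, i)
           * (gsl_vector_get (g_new, i) - gsl_vector_get (g_old, i));
  gsl_blas_ddot (g_old, g_old, &den);   /* |g_old|^2 */
  return num / den;
@}
@end smallexample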
@deffn {Minimizer} gsl_multimin_fdfminimizer_vector_bfgs
@cindex BFGS conjugate gradient algorithm, minimization
@cindex Minimization, BFGS conjugate gradient algorithm
This is the vector Broyden-Fletcher-Goldfarb-Shanno (BFGS) conjugate gradient
algorithm. It is a quasi-Newton method which builds up an approximation
to the second derivatives of the function @math{f} using the difference
between successive gradient vectors. By combining the first and second
derivatives the algorithm is able to take Newton-type steps towards the
function minimum, assuming quadratic behavior in that region.
@end deffn
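Selecting this minimizer only changes the type passed to the allocator; a
minimal sketch (the driver loop from the Examples section below is unchanged):

@smallexample
const gsl_multimin_fdfminimizer_type *T
  = gsl_multimin_fdfminimizer_vector_bfgs;
gsl_multimin_fdfminimizer *s
  = gsl_multimin_fdfminimizer_alloc (T, 2);

/* s is then configured with gsl_multimin_fdfminimizer_set and driven
   with gsl_multimin_fdfminimizer_iterate exactly as in the example
   program below.  */
@end smallexample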
@deffn {Minimizer} gsl_multimin_fdfminimizer_steepest_descent
@cindex Steepest descent algorithm, minimization
@cindex Minimization, steepest descent algorithm
The steepest descent algorithm follows the downhill gradient of the
function at each step. When a downhill step is successful the step-size
is increased by a factor of two. If the downhill step leads to a higher
function value then the algorithm backtracks and the step size is
decreased using the parameter @var{tol}. A suitable value of @var{tol}
for most applications is 0.1. The steepest descent method is
inefficient and is included only for demonstration purposes.
@end deffn
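As a schematic of the rule just described (illustration only, not the
library's internal code; the helper name is ours):

@smallexample
/* Step-size rule used by steepest descent: grow the step after a
   successful downhill move, shrink it by tol after an uphill trial.  */
static double
update_step (double step, double f_trial, double f_current, double tol)
@{
  if (f_trial < f_current)
    return 2.0 * step;    /* downhill: double the step size */
  else
    return tol * step;    /* uphill: backtrack, e.g. tol = 0.1 */
@}
@end smallexample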
@deffn {Minimizer} gsl_multimin_fminimizer_nmsimplex
@cindex Nelder-Mead simplex algorithm for minimization
@cindex Simplex algorithm, minimization
@cindex Minimization, simplex algorithm
This is the Simplex algorithm of Nelder and Mead. It constructs
@math{n+1} vectors @math{p_i} from the
starting vector @var{x} and the vector @var{step_size} as follows:
@tex
\beforedisplay
$$
\eqalign{
p_0 & = (x_0, x_1, \cdots , x_n) \cr
p_1 & = (x_0 + step\_size_0, x_1, \cdots , x_n) \cr
p_2 & = (x_0, x_1 + step\_size_1, \cdots , x_n) \cr
\dots &= \dots \cr
p_n & = (x_0, x_1, \cdots , x_n+step\_size_n) \cr
}
$$
\afterdisplay
@end tex
@ifinfo
@example
p_0 = (x_0, x_1, ... , x_n)
p_1 = (x_0 + step_size_0, x_1, ... , x_n)
p_2 = (x_0, x_1 + step_size_1, ... , x_n)
... = ...
p_n = (x_0, x_1, ... , x_n+step_size_n)
@end example
@end ifinfo
@noindent
These vectors form the @math{n+1} vertices of a simplex in @math{n}
dimensions. On each iteration step the algorithm tries to improve
the parameter vector @math{p_i} corresponding to the highest
function value by simple geometrical transformations. These
are reflection, reflection followed by expansion, contraction and multiple
contraction. Using these transformations the simplex moves through
the parameter space towards the minimum, where it contracts itself.
After each iteration the best vertex is returned. Note that, due to
the nature of the algorithm, not every step improves the current
best parameter vector; usually several iterations are required.
The routine calculates the minimizer-specific characteristic size as the
average distance from the geometrical center of the simplex to all its
vertices. This size can be used as a stopping criterion, as the simplex
contracts itself near the minimum. The size is returned by the function
@code{gsl_multimin_fminimizer_size}.
@end deffn
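The vertex construction above can be written out explicitly. The sketch
below is illustration only (the function name and row-per-vertex layout are
ours); the library performs this construction internally when
@code{gsl_multimin_fminimizer_set} is called.

@smallexample
#include <gsl/gsl_vector.h>
#include <gsl/gsl_matrix.h>

/* Build the n+1 simplex vertices described above: row 0 is the
   starting point x, and row i (i = 1..n) offsets component i-1 of x
   by step_size[i-1].  */
static gsl_matrix *
build_simplex (const gsl_vector *x, const gsl_vector *step_size)
@{
  size_t n = x->size, i;
  gsl_matrix *p = gsl_matrix_alloc (n + 1, n);  /* one vertex per row */

  for (i = 0; i <= n; i++)
    gsl_matrix_set_row (p, i, x);

  for (i = 1; i <= n; i++)
    gsl_matrix_set (p, i, i - 1,
                    gsl_vector_get (x, i - 1)
                    + gsl_vector_get (step_size, i - 1));
  return p;
@}
@end smallexample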
@node Multimin Examples
@section Examples
This example program finds the minimum of the paraboloid function
defined earlier. The location of the minimum is offset from the origin
in @math{x} and @math{y}, and the function value at the minimum is
non-zero. The main program requires the example functions defined
earlier in this chapter.
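For reference, those callbacks have the following shape. This is a sketch
consistent with the output shown further below; the definitive definitions
appear earlier in this chapter.

@smallexample
#include <gsl/gsl_multimin.h>

/* Paraboloid f(x,y) = 10 (x - p[0])^2 + 20 (y - p[1])^2 + 30,
   with minimum value 30 at (p[0], p[1]).  */
double
my_f (const gsl_vector *v, void *params)
@{
  double *p = (double *) params;
  double x = gsl_vector_get (v, 0);
  double y = gsl_vector_get (v, 1);
  return 10.0 * (x - p[0]) * (x - p[0])
         + 20.0 * (y - p[1]) * (y - p[1]) + 30.0;
@}

/* Gradient: df/dx = 20 (x - p[0]), df/dy = 40 (y - p[1]).  */
void
my_df (const gsl_vector *v, void *params, gsl_vector *df)
@{
  double *p = (double *) params;
  gsl_vector_set (df, 0, 20.0 * (gsl_vector_get (v, 0) - p[0]));
  gsl_vector_set (df, 1, 40.0 * (gsl_vector_get (v, 1) - p[1]));
@}

/* Compute f and its gradient in a single call.  */
void
my_fdf (const gsl_vector *v, void *params, double *f, gsl_vector *df)
@{
  *f = my_f (v, params);
  my_df (v, params, df);
@}
@end smallexample

@noindent
The main program is given below.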
@smallexample
int
main (void)
@{
size_t iter = 0;
int status;
const gsl_multimin_fdfminimizer_type *T;
gsl_multimin_fdfminimizer *s;
/* Position of the minimum (1,2). */
double par[2] = @{ 1.0, 2.0 @};
gsl_vector *x;
gsl_multimin_function_fdf my_func;
my_func.f = &my_f;
my_func.df = &my_df;
my_func.fdf = &my_fdf;
my_func.n = 2;
my_func.params = &par;
/* Starting point, x = (5,7) */
x = gsl_vector_alloc (2);
gsl_vector_set (x, 0, 5.0);
gsl_vector_set (x, 1, 7.0);
T = gsl_multimin_fdfminimizer_conjugate_fr;
s = gsl_multimin_fdfminimizer_alloc (T, 2);
gsl_multimin_fdfminimizer_set (s, &my_func, x, 0.01, 1e-4);
do
@{
iter++;
status = gsl_multimin_fdfminimizer_iterate (s);
if (status)
break;
status = gsl_multimin_test_gradient (s->gradient, 1e-3);
if (status == GSL_SUCCESS)
printf ("Minimum found at:\n");
printf ("%5d %.5f %.5f %10.5f\n", iter,
gsl_vector_get (s->x, 0),
gsl_vector_get (s->x, 1),
s->f);
@}
while (status == GSL_CONTINUE && iter < 100);
gsl_multimin_fdfminimizer_free (s);
gsl_vector_free (x);
return 0;
@}
@end smallexample
@noindent
The initial step-size is chosen as 0.01, a conservative estimate in this
case, and the line minimization parameter is set at 0.0001. The program
terminates when the norm of the gradient has been reduced below
0.001. The output of the program is shown below,
@example
x y f
1 4.99629 6.99072 687.84780
2 4.98886 6.97215 683.55456
3 4.97400 6.93501 675.01278
4 4.94429 6.86073 658.10798
5 4.88487 6.71217 625.01340
6 4.76602 6.41506 561.68440
7 4.52833 5.82083 446.46694
8 4.05295 4.63238 261.79422
9 3.10219 2.25548 75.49762
10 2.85185 1.62963 67.03704
11 2.19088 1.76182 45.31640
12 0.86892 2.02622 30.18555
Minimum found at:
13 1.00000 2.00000 30.00000
@end example
@noindent
Note that the algorithm gradually increases the step size as it
successfully moves downhill, as can be seen by plotting the successive
points.
@iftex
@sp 1
@center @image{multimin,3.4in}
@end iftex
@noindent
The conjugate gradient algorithm finds the minimum on its second
direction because the function is purely quadratic. Additional
iterations would be needed for a more complicated function.
Here is another example, using the Nelder-Mead Simplex algorithm to
minimize the same objective function as above.
@smallexample
int
main(void)
@{
size_t np = 2;
double par[2] = @{1.0, 2.0@};
const gsl_multimin_fminimizer_type *T =
gsl_multimin_fminimizer_nmsimplex;
gsl_multimin_fminimizer *s = NULL;
gsl_vector *ss, *x;
gsl_multimin_function minex_func;
size_t iter = 0, i;
int status;
double size;
/* Initial vertex size vector */
ss = gsl_vector_alloc (np);
/* Set all step sizes to 1 */
gsl_vector_set_all (ss, 1.0);
/* Starting point */
x = gsl_vector_alloc (np);
gsl_vector_set (x, 0, 5.0);
gsl_vector_set (x, 1, 7.0);
/* Initialize method and iterate */
minex_func.f = &my_f;
minex_func.n = np;
minex_func.params = (void *)&par;
s = gsl_multimin_fminimizer_alloc (T, np);
gsl_multimin_fminimizer_set (s, &minex_func, x, ss);
do
@{
iter++;
status = gsl_multimin_fminimizer_iterate(s);
if (status)
break;
size = gsl_multimin_fminimizer_size (s);
status = gsl_multimin_test_size (size, 1e-2);
if (status == GSL_SUCCESS)
@{
printf ("converged to minimum at\n");
@}
printf ("%5d ", iter);
for (i = 0; i < np; i++)
@{
printf ("%10.3e ", gsl_vector_get (s->x, i));
@}
printf ("f() = %7.3f size = %.3f\n", s->fval, size);
@}
while (status == GSL_CONTINUE && iter < 100);
gsl_vector_free(x);
gsl_vector_free(ss);
gsl_multimin_fminimizer_free (s);
return status;
@}
@end smallexample
@noindent
The minimization stops when the simplex size drops below 0.01. The output is
shown below.
@example
1 6.500e+00 5.000e+00 f() = 512.500 size = 1.082
2 5.250e+00 4.000e+00 f() = 290.625 size = 1.372
3 5.250e+00 4.000e+00 f() = 290.625 size = 1.372
4 5.500e+00 1.000e+00 f() = 252.500 size = 1.372
5 2.625e+00 3.500e+00 f() = 101.406 size = 1.823
6 3.469e+00 1.375e+00 f() = 98.760 size = 1.526
7 1.820e+00 3.156e+00 f() = 63.467 size = 1.105
8 1.820e+00 3.156e+00 f() = 63.467 size = 1.105
9 1.016e+00 2.812e+00 f() = 43.206 size = 1.105
10 2.041e+00 2.008e+00 f() = 40.838 size = 0.645
11 1.236e+00 1.664e+00 f() = 32.816 size = 0.645
12 1.236e+00 1.664e+00 f() = 32.816 size = 0.447
13 5.225e-01 1.980e+00 f() = 32.288 size = 0.447
14 1.103e+00 2.073e+00 f() = 30.214 size = 0.345
15 1.103e+00 2.073e+00 f() = 30.214 size = 0.264
16 1.103e+00 2.073e+00 f() = 30.214 size = 0.160
17 9.864e-01 1.934e+00 f() = 30.090 size = 0.132
18 9.190e-01 1.987e+00 f() = 30.069 size = 0.092
19 1.028e+00 2.017e+00 f() = 30.013 size = 0.056
20 1.028e+00 2.017e+00 f() = 30.013 size = 0.046
21 1.028e+00 2.017e+00 f() = 30.013 size = 0.033
22 9.874e-01 1.985e+00 f() = 30.006 size = 0.028
23 9.846e-01 1.995e+00 f() = 30.003 size = 0.023
24 1.007e+00 2.003e+00 f() = 30.001 size = 0.012
converged to minimum at
25 1.007e+00 2.003e+00 f() = 30.001 size = 0.010
@end example
@noindent
The simplex size first increases while the simplex moves towards the
minimum. After a while the size begins to decrease as the simplex
contracts around the minimum.
@node Multimin References and Further Reading
@section References and Further Reading
@noindent
A brief description of multidimensional minimization algorithms and
further references can be found in the following book and paper,
@itemize @asis
@item C.W. Ueberhuber,
@cite{Numerical Computation (Volume 2)}, Chapter 14, Section 4.4
"Minimization Methods", p. 325---335, Springer (1997), ISBN
3-540-62057-5.
@item J.A. Nelder and R. Mead,
@cite{A simplex method for function minimization}, Computer Journal
vol. 7 (1965), 308--315.
@end itemize