
optimtips_16_31.m

Small examples using the MATLAB optimization toolbox, mainly introducing optimization-related topics. Everyone is welcome to discuss them.
optimizer may often step lightly over the bounds.

The simple trick is to use a transformation of variables. Allow
the optimizer to leave your variable unconstrained, but then
inside the objective function you perform a transformation of
the variable. If a variable must be positive, then square it. If
it must lie in a closed interval, then use a sine transformation
to transform an unbounded variable into one which is implicitly
bounded.

I'll give an example of such a transformation solution, then suggest
that you look at fminsearchbnd from Matlab Central. Here is a url:
(as with consolidator, I'd prefer that you download the file exchange
copy, as it will always be the current version.)

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8277&objectType=file
%}

%%
% As an example, let's find the minimum of a simple function of
% two variables. This simple function, z = x^2 + y^2, is a
% paraboloid of revolution. It is a simple conic form.
fun = @(vars) vars(1).^2 + vars(2).^2

% It is easy enough to minimize with fminsearch. The result is zero,
% at least as well as fminsearch will solve it.
format short g
minxy = fminsearch(fun,rand(1,2))

%%
% Now let's constrain the first variable to be >= 1. I'll define
% a transformation of x, such that x = u^2 + 1.
fun2 = @(vars) (vars(1).^2+1).^2 + vars(2).^2;
minuy = fminsearch(fun2,rand(1,2));

% And after fminsearch is done, transform u back into x.
% y was not transformed at all.
minxy = [minuy(1).^2+1, minuy(2)]

% This is the minimizer of our original function fun, subject to the
% bound constraint that x >= 1.

%%
% These transformations have been implemented in fminsearchbnd, which
% is really just a transparent wrapper on top of fminsearch. Try this
% on the (well known, even famous) Rosenbrock function.
rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;

% With no constraints, fminsearch finds the point [1 1]. This is
% the global minimizer, with no constraints.
xy = fminsearch(rosen,[3 4])

%%
% Minimize the Rosenbrock function, subject to the constraints
%   2 <= x <= 5
%   1.5 <= y
LB = [2 1.5];
UB = [5 inf];
xy = fminsearchbnd(rosen,[3 4],LB,UB)

%%
%{
---------------------------------------------------------------
19. Inclusive versus exclusive bound constraints

Why do we employ constraints at all in an optimization?

A not uncommon reason is to enhance robustness of the optimizer
to poor starting values. By fencing in the optimizer to the
domain where we know the solution must exist, we improve the odds
that it will find our preferred solution.

A second reason is if some calculation in the objective function
will suddenly begin to produce complex numbers as results. I.e.,
if x is a parameter in the optimization, and the first thing we
do in the objective function is compute sqrt(x), then we want to
avoid negative values for x. As soon as complex results are
returned to an optimizer that expects to see only real values,
you should expect trouble.

Note that in this example, x = 0 is a perfectly valid parameter
value, since sqrt(0) is still real. We may not be able to tolerate
any penetration of the boundary at all, but the boundary itself
is acceptable. This is what I would call a closed boundary, or
an inclusive bound. Fminsearchbnd implements boundary constraints
of this form and should not allow any excess. Beware, however, when
using constraints. Some optimizers will allow the constraint to be
violated by a tiny amount. Given the combination of linear algebra
and floating point mathematics in an optimizer, it is sometimes a
hard thing to require that a bound never be exceeded.

Sometimes, however, we cannot even tolerate the boundary itself.
A common example involves log(x). Whereas sqrt(0) still returns
a usable result, log(0) returns -inf. The presence of an inf
in your computation may well upset a less than clever optimizer.
These boundaries could be called variously open, exclusive, or
strict.

The simplest solution to these problems is to offset your bounds
slightly. If x cannot go negative, then simply supply a boundary
value of some small positive value that is acceptably close to
zero.

There are also transformations one can use to enforce an open
boundary. Thus u = x.^2 allows u to attain a value of 0, whereas
u = exp(x) will never attain that value. For variables which
have both lower and upper bounds, an atan transformation works
reasonably well.

In fact, experimentation on my part has shown that some such
transformations work better than others. In my experience, the
exponential transformation can sometimes cause numerical underflow
or overflow problems. An atan transformation seems to approach
its boundary more softly, avoiding many of those problems. A small
example of such a transformation follows.
%}
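%%
% A minimal sketch of the atan idea above (the interval and the
% objective are arbitrary choices, made only for illustration). The
% variable x is kept strictly inside the open interval (a,b) by
% optimizing over an unbounded variable u, with
%   x = a + (b-a)*(atan(u)/pi + 1/2)
% Both boundaries can be approached, but never attained.
a = 0;
b = 2;
u2x = @(u) a + (b-a)*(atan(u)/pi + 1/2);

% An objective whose unconstrained minimum (x = -1) lies outside
% (a,b), so the constrained solution is pushed toward the lower
% boundary without ever reaching it.
objx = @(x) (x+1).^2;

% Minimize over u, then transform back to x.
ustar = fminsearch(@(u) objx(u2x(u)),0);
xstar = u2x(ustar)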
%{
---------------------------------------------------------------
20. Mixed integer/discrete problems

Many problems are inherently discrete. I won't get into that class
of problems at all. However, there are also problems of mixed class.
Here one or more of the variables in the optimization are limited
to a discrete set of values, while the rest live in a continuous
domain. There are mixed solvers available, both on Matlab Central
and from other sources.

What do you do when there are only a few possible discrete states
to investigate? Probably best here is to simply fix the discrete
variables at each possible state, then solve for the continuous
variables. Then choose the solution which is best overall, as in
the sketch below.
%}
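%%
% A minimal sketch of that enumerate-and-solve idea (the objective and
% the set of allowed states below are invented purely for illustration).
% One parameter, k, is restricted to a few discrete values, while x is
% continuous. Fix k at each allowed value, solve the continuous
% subproblem, then keep the best combination overall.
kset = 1:4;                            % the few allowed discrete states
objmix = @(x,k) (x - 1./k).^2 + k/10;  % objective in both variables

% solve the continuous subproblem for each fixed k
xbest = zeros(size(kset));
fbest = zeros(size(kset));
for i = 1:numel(kset)
  [xbest(i),fbest(i)] = fminsearch(@(x) objmix(x,kset(i)),0);
end

% The overall solution is the best of the per-state solutions.
[fsol,ind] = min(fbest);
ksol = kset(ind)
xsol = xbest(ind)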
%{
---------------------------------------------------------------
21. Understanding how they work

While I will not even try to give a full description of how any
optimizer works, a little bit of understanding is worth a tremendous
amount when there are problems. True understanding can only come
from study in some depth of an algorithm. I can't offer that here.
Instead, I'll try to show the broad differences between some of the
methods, and suggest when one might use one method over another.

Previously in this text I've referred to optimizers as tools that
operate on a black box, or compared them to a blindfolded individual
on a hillside. These are useful analogies, but they don't really
tell enough to visualize how the tools work. We'll start by talking
about the method used in fminsearch, since it may well be one of
the tools which is used most often.

Fminsearch is a Nelder/Mead polytope algorithm. Some call it a
simplex method, but this may confuse things with linear programming.
It works by evaluating the objective function over a polytope of
points in the parameter space. In 2 dimensions, this will be a
triangle; in n dimensions, the polytope (or simplex) will be
composed of n+1 points. The basic algorithm is simple. Compare
these n+1 points, and choose the WORST one to delete. Replace this
bad point with its reflection through the remaining points in
the polytope. You can visualize this method as flopping a triangle
around the parameter space until it finds the optimum. Where
appropriate, the polytope can shrink or grow in size.

I'll note that this basic Nelder/Mead code is simple, it requires
no gradient information at all, and it can even survive an occasional
slope or function discontinuity. The downside of a polytope method
is how slowly it will converge, especially for higher dimensional
problems. I will happily use fminsearch for 2 or 3 variable problems,
especially if it is something I will need to use infrequently. I will
rarely if ever consider fminsearch for more than 5 or 6 variables.
(Feel free to disagree. You may have more patience than I do.)

The other point to remember about a polytope method is the stopping
criterion. These methods are generally allowed to terminate when
all the function values over the polytope have the same value
within the function tolerance. (I have also seen the standard
deviation of the function values over the polytope used to compare
to the function tolerance.)

We can contrast the polytope algorithm with the more traditional
Newton-like family of methods, combined with a line search. These
methods include many of the optimizers in the optimization toolbox,
such as fminunc, lsqcurvefit, fsolve, etc. Whereas a polytope method
lays down a polytope on the objective function surface, then moves
the polytope around, these line-search methods attempt to be more
intelligent in their operation. The basic idea is to form a locally
linear approximation to the problem at hand, then solve the linearized
problem exactly. This determines a direction to search along, plus
a predicted step length in that direction from your current point
in the parameter space. (I already discussed line searches in section
9.) This is why these tools require a gradient (or Jacobian as
appropriate) of the objective. This derivative information is used
to compute the necessary locally linear approximation.

The other feature of interest about these methods is how they "learn"
about the surface in question. The very first iteration taken might
typically be a steepest descent step. After the first iteration, the
optimizer can gradually form an adaptive approximation to the local
Hessian matrix of the objective. This, in effect, tells the optimizer
what the local curvature of the surface looks like. A small
side-by-side comparison on the Rosenbrock function follows below.

=======================================================
I'd like to expand on this topic in future releases of this
document. My plans are to show explicitly the sequence of
steps several different optimizers choose on the same function.
=======================================================
%}
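%%
% To get some feel for the difference between the two families, watch
% them work on the same surface. With 'disp','iter' the iteration
% history is printed: the polytope method (fminsearch) makes many small
% reflect/contract moves, while a quasi-Newton line-search method
% (fminunc, from the optimization toolbox) typically needs noticeably
% fewer iterations once it has built up its picture of the local
% curvature. (The exact counts will vary.)
rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;

% the polytope method
xy = fminsearch(rosen,[3 4],optimset('disp','iter'))

%%
% a Newton-like method with a line search
xy = fminunc(rosen,[3 4],optimset('disp','iter','largescale','off'))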
%{
---------------------------------------------------------------
22. Wrapping an optimizer around quad

Nested optimizations, or nesting an integration inside an
optimization, are similar problems. Both are really just a problem
of managing parameter spaces. I.e., you must ensure that each
objective function or integrand sees the appropriate set of
parameters.

In this example, I'll formulate an optimization which has an
integration at its heart.
%}

%%
% Choose some simple function to integrate.
%
%   z = (coef(1)^2)*x^2 + (coef(2)^2)*x^4
%
% I picked this one since its integral over the
% interval [-1 1] is clearly minimized at coef = [0,0].
K = @(x,coef) coef(1).^2*x.^2 + coef(2).^2*x.^4;

% As far as the integration is concerned, coef is a constant
% vector. It is invariant. And the optimization really does
% not want to understand what x is; as far as the optimizer
% is concerned, its objective is a function of only the
% parameters in coef.

% We can integrate K easily enough (for a given set of
% coefficients coef) via
coef = [2 3];
quad(K,-1,1,1e-13,[],coef)

% We could also have embedded coef directly into the
% anonymous function K, as I do below.

%%
% As always, I like to make sure that any objective function
% produces results that I'd expect. Here we know what to
% expect, but it never hurts to test our knowledge. I'll
% change the function K this time to embed coef inside it.
% If we try it at [0 0], we see the expected result, i.e., 0.
coef = [0 0];
K = @(x) coef(1).^2*x.^2 + coef(2).^2*x.^4;
quad(K,-1,1,1e-13)

%%
% I used a tight integration tolerance because we will
% put this inside an optimizer, and we want to minimize
% any later problems with the gradient estimation.

% Now, can we minimize this integral? That's easy enough.
% Here is an objective function, to be used with a
% minimizer. As far as the optimizer is concerned, obj is
% a function only of coef.
obj = @(coef) quad(@(x) coef(1).^2*x.^2+coef(2).^2*x.^4,-1,1,1e-13);

% Call fminsearch, or fminunc
fminsearch(obj,[2 3],optimset('disp','iter'))

%%
% or
fminunc(obj,[2 3],optimset('disp','iter','largescale','off'))

% Both will return a solution near [0 0].

%%
%{
---------------------------------------------------------------
23. Graphical tools for understanding sets of nonlinear equations

Pick some simple set of nonlinear equations. Perhaps this pair
will do as well as any others.

  x*y^3 = 2
  y - x^2 = 1

We will look for solutions in the first quadrant of the x-y plane,
so x >= 0 and y >= 0.
%}

%%
% Now consider each equation independently from the other. In the x-y
% plane, these equations can be thought of as implicit functions, thus
% ezplot can come to our rescue.
close all
ezplot('x*y.^3 - 2',[0 5])
hold on
ezplot('y - x.^2 - 1',[0 5])
hold off
grid on
title 'Graphical (ezplot) solution to a nonlinear system of equations'
xlabel 'x'
ylabel 'y'

% The intersection of the two curves is our solution; zooming in
% will find it quite nicely.

%%
% An alternative approach is to think of the problem in terms of
% level sets. Given a function of several variables, z(x,y), a
% level set of that function is the set of (x,y) pairs such that
% z is constant at a given level.

% In Matlab, there is a nice tool for viewing level sets: contour.
% See how it allows us to solve our system of equations graphically.

% Form a lattice over the region of interest in the x-y plane
[x,y] = meshgrid(0:.1:5);

% Build our functions on that lattice. Be careful to use .*, .^ and
% ./ as appropriate.
z1 = x.*y.^3;
z2 = y - x.^2;

% Display the level sets
contour(x,y,z1,[2 2],'r')
hold on
contour(x,y,z2,[1 1],'g')
hold off
grid on
title 'Graphical (contour) solution to a nonlinear system of equations'
xlabel 'x'
ylabel 'y'
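%%
% Once the curves have located the crossing roughly, a root finder can
% polish it numerically. As a sketch: fsolve (from the optimization
% toolbox) wants one function returning the residuals of both
% equations, plus a starting guess read loosely off the figure.
F = @(v) [v(1).*v(2).^3 - 2; v(2) - v(1).^2 - 1];
xy = fsolve(F,[0.7 1.5])

% The residuals at the computed solution should be essentially zero.
F(xy)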
%%
%{
---------------------------------------------------------------
24. Optimizing non-smooth or stochastic functions

A property of the tools in the optimization toolbox is that they
all assume that their objective function is a continuous,
differentiable function of its parameters. (With the exception
of bintprog.)

If your function does not have this property, then look
elsewhere for optimization. One place might be the Genetic
Algorithms and Direct Search toolbox.

http://www.mathworks.com/products/gads/description4.html

Simulated annealing is another tool for these problems. You
will find a few options on the file exchange.
%}

%{
---------------------------------------------------------------
25. Linear equality constraints

This section is designed to help the reader understand how one
solves linear least squares problems subject to linear equality
constraints. It is written for the reader who wants to understand
how to solve this problem in terms of linear algebra. (Some may
ask where do these problems arise: i.e., linear least squares with
many variables and possibly many linear constraints. Spline models
are a common example, functions of one or several dimensions.
