% optimtips.m
% define a batched objective function
batchfun = @(x,y) besselj(0,x) - y;
options = optimset('lsqnonlin');
options.LargeScale = 'on';
options.TolX = 1.e-13;
options.TolFun = 1.e-13;
% I'll just put the first p=10 problems in a batch
p = 10;
start = ones(p,1);
LB = repmat(0,p,1);
UB = repmat(4,p,1);
tic
xb = lsqnonlin(batchfun,start,LB,UB,options,y(1:p));
toc
% This took .1 seconds on my computer, roughly the same amount
% of time per problem as did the loop, so no gain was achieved.
%%
% Why was the call to lsqnonlin so slow? Because I did not tell
% lsqnonlin to expect that the Jacobian matrix would be sparse.
% How sparse is it? Recall that each problem is really independent
% of every other problem, even though they are all related by
% a common function we are inverting. So the Jacobian matrix
% here will be a diagonal matrix. We tell MATLAB the sparsity
% pattern to expect with JacobPattern.
% Let's add that information and redo the computations. This time
% for all 1000 points.
tic
p = 1000;
options.JacobPattern = speye(p,p);
start = ones(p,1);
LB = repmat(0,p,1);
UB = repmat(4,p,1);
xb = lsqnonlin(batchfun,start,LB,UB,options,y(1:p));
toc
% The batched solution took only 0.76 seconds on my computer for
% ALL 1000 subproblems. Compare this to the 9 seconds it took when
% I put a loop around fzero.
% Why is there such a significant difference? Some of the gain
% comes from general economies of scale. Another part of it is
% due to the efficient computation of the finite difference
% approximation for the Jacobian matrix.
%%
% We can test the question easily enough by formulating a problem
% with simple derivatives. I'll kill two birds with one stone by
% showing an example of a batched least squares problem for
% lsqcurvefit to solve.
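%%
% An aside (my addition, not part of the original demo): the reason a
% diagonal JacobPattern helps so much is that when no two unknowns share
% a residual, ALL of the partial derivatives can be estimated from a
% single perturbed function evaluation, instead of one evaluation per
% column of the Jacobian. A minimal sketch of that grouping idea:
p = 5;
f = @(x) besselj(0,x);   % each residual depends on only one unknown
x = ones(p,1);
h = sqrt(eps);
% one extra call recovers the entire diagonal of the Jacobian
Jdiag = (f(x+h) - f(x))/h;
% a dense Jacobian pattern would have required p separate perturbed calls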
% This example will be a simple, single
% exponential model.
% One important point: when batching subproblems together, be careful
% of the possibility of divergence of a few of the subproblems,
% since the entire system won't be done until all have converged.
% Some data sets may have poor starting values, allowing divergence
% otherwise. The use of bound constraints is a very good aid to
% avoid this bit of nastiness.
n = 100; % n subproblems
m = 20;  % each with m data points
% some random coefficients
coef = rand(2,n);
% negative exponentials
t = rand(m,n);
y = repmat(coef(1,:),m,1).*exp(-t.*repmat(coef(2,:),m,1));
y = y+randn(size(y))/10;
% first, solve it in a loop
tic
expfitfun1 = @(coef,tdata) coef(1)*exp(-coef(2)*tdata);
options = optimset('disp','off','tolfun',1.e-12);
LB = [-inf,0];
UB = [];
start = [1 1]';
estcoef = zeros(2,n);
for i=1:n
  estcoef(:,i) = lsqcurvefit(expfitfun1,start,t(:,i), ...
    y(:,i),LB,UB,options);
end
toc
%%
% next, leave in the loop, but provide the gradient.
% expfitfun2 computes the lsqcurvefit objective and the Jacobian.
tic
expfitfun2 = @(coef,tdata) deal(coef(1)*exp(-coef(2)*tdata), ...
  [exp(-coef(2)*tdata), -coef(1)*tdata.*exp(-coef(2)*tdata)]);
options = optimset('disp','off','jacobian','on');
LB = [-inf,0];
UB = [];
start = [1 1]';
estcoef2 = zeros(2,n);
for i=1:n
  estcoef2(:,i) = lsqcurvefit(expfitfun2,start,t(:,i),y(:,i),LB,UB,options);
end
toc
% I saw a 40% gain in speed on my computer for this loop.
%%
% A partitioned least squares solution might have sped it up too.
% Call expfitfun3 to do the partitioned least squares. lsqnonlin
% seems most appropriate for this fit.
tic
options = optimset('disp','off','tolfun',1.e-12);
LB = 0;
UB = [];
start = 1;
estcoef3 = zeros(2,n);
for i=1:n
  nlcoef = lsqnonlin('expfitfun3',start,LB,UB,options,t(:,i),y(:,i));
  % Note the trick here. I've defined expfitfun3 to return
  % a pair of arguments, but lsqnonlin was only expecting one
  % return argument. The second argument is the linear coefficient.
  % Make one last call to expfitfun3, with the final nonlinear
  % parameter to get our final estimate for the linear parameter.
  [junk,lincoef] = expfitfun3(nlcoef,t(:,i),y(:,i));
  estcoef3(:,i) = [lincoef,nlcoef];
end
toc
%%
% In an attempt to beat this problem to death, there is one more
% variation to be applied. Use a batched, partitioned solution.
tic
% Build a block diagonal sparse matrix for the Jacobian pattern
jp = repmat({sparse(ones(m,1))},1,n);
options = optimset('disp','off','TolFun',1.e-13, ...
  'JacobPattern',blkdiag(jp{:}));
start = ones(n,1);
LB = zeros(n,1);
UB = [];
nlcoef = lsqnonlin('expfitfun4',start,LB,UB,options,t,y);
% one last call to provide the final linear coefficients
[junk,lincoef] = expfitfun4(nlcoef,t,y);
estcoef4 = [lincoef';nlcoef'];
toc
%%
%{
---------------------------------------------------------------
17. Global solutions & domains of attraction

Global optimization is a relatively hard problem to solve. Visualize
an optimizer as a blindfolded person on the surface of the earth.
Imagine that you have directed this individual to find the highest
point on the surface of the earth. The individual can do so only
by wandering around. He is given an altimeter to consult, so at
any point he can compare his elevation with any previous point. And
to make things harder, you have chosen to place this poor fellow on
the Galapagos Islands to start. What are the odds that your subject
would ever manage to discover the peak of Mount Everest?

Clearly this is an impossible task. Global optimization of a
general function in n dimensions is no easier, and often harder
when the dimensionality gets large.
%}
%%
% So consider a smaller problem. Find all locally minimum values
% of the function sin(x).
close
ezplot(@(x) sin(x),[-50,50])
title 'Global optimization when there are infinitely many solutions?'
xlabel 'x'
ylabel 'y'
% This is an easier problem if we recall enough high school trig
% to know that sin(x) is minimized at the points (n+1/2)*pi for
% all odd integers n.
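%%
% A quick numeric check of that claim (my addition, not in the
% original text): started near one of the minima, fminsearch should
% return a point of the form (n+1/2)*pi for some odd n.
xmin = fminsearch(@sin,4);
% xmin should be close to 3*pi/2 = (1+1/2)*pi, roughly 4.7124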
% For an optimizer it's not so easy to find
% an infinite number of minima.
% Even better, all of these local minimizers are equivalent, in
% the sense that they all yield the same function value.
%%
% Perhaps not so easy for you is to find all local minimizers on
% the real line of the first order Bessel function.
ezplot(@(x) besselj(1,x),[-20,20])
title 'Global optimization when there are infinitely many solutions?'
xlabel 'x'
ylabel 'y'
% The minima that we see in this plot are not all equivalent.
%%
% Remember our blindfolded friend? Suppose that we felt some
% compassion in his plight, instead setting him down on the
% north face of Mt. Everest (please do it on an especially warm
% day). Clearly, given better "starting values", our poor frozen
% subject may indeed even have a chance at success.
% This brings to mind an interesting idea though. It is a concept
% called a basin of attraction. For any given optimization problem
% and specific optimizer, each local minimizer will have its own
% basin of attraction. A basin of attraction is the set of all
% points which, when used as a starting value for your chosen
% optimizer, will converge to a given solution. Just for kicks,
% let's map out the basins of attraction for fminsearch
% when applied to the first order Bessel function.
% Do it using a simple loop over a variety of starting values.
xstart = -20:.1:20;
xfinal = zeros(size(xstart));
bessfun = @(x) real(besselj(1,x));
for i = 1:length(xstart)
  xfinal(i) = fminsearch(bessfun,xstart(i));
end
subplot(1,2,1)
plot(xstart,xfinal,'.')
xlabel 'Starting point'
ylabel 'Terminal point'
grid on
subplot(1,2,2)
plot(xstart,bessfun(xfinal),'.')
xlabel 'Starting point'
ylabel 'Minimum value'
grid on
%%
%{
We can use clustering techniques to group each of these solutions
into clusters. k-means is one clustering method, although it requires
advance knowledge of how many clusters to expect.
If you choose to
download my function "consolidator" from the file exchange, you can
do a simple clustering without that knowledge in advance. (I'm
tempted to include a copy with this text. However, I'd prefer that
you download the copy from the file exchange. In the event of any
enhancements or bug fixes, the online copy will be maintained at my
most current version.)

Here is the url to consolidator:
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8354&objectType=file
%}
% consolidator(xfinal',[],'',.001)
%%
% Why all this talk about basins of attraction? I want to highlight
% this basic behavior of most optimizers. They are simple search
% tools that can do little more than investigate their local
% environment to decide where to look next.
%
% What can we do to improve the success rate of an optimizer on a
% nasty function? Suppose we have a very flat function, with a deep
% valley with steep sides at some location. The basin of attraction
% for that local minimizer may well be very small. A trick that
% works nicely in a probabilistic sense is to evaluate the objective
% function at some random set of points.
% Choose some subset of the
% best of these points as starting values for your favorite optimizer.
% An example of using Monte Carlo starting values
fun = @(p) -peaks(p(1),p(2));
n0 = 200;
x0 = rand(n0,2)*6 - 3;
z0 = zeros(n0,1);
for i = 1:n0
  z0(i) = fun(x0(i,:));
end
% sort to find the best points from the set
[z0,tags] = sort(z0);
x0 = x0(tags,:);
n1 = 25;
xfinal = zeros(n1,2);
for i = 1:n1
  xfinal(i,:) = fminsearch(fun,x0(i,:));
end
% now cluster the solutions.
xfinal = consolidator(xfinal,[],'',.001)
zfinal = zeros(size(xfinal,1),1);
for i = 1:size(xfinal,1)
  zfinal(i) = peaks(xfinal(i,1),xfinal(i,2));
end
% Consolidator identified 3 distinct clusters.
%%
% plot these solutions on top of the peaks surface
figure
peaks(50)
hold on
plot3(xfinal(:,1),xfinal(:,2),zfinal,'bo')
hold off
%%
%{
A pretty example of fractal basins can be found at:
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=3235&objectType=FILE

Try this example to start:
Pctr = nrfrac(-1,1,-1,1,.005,50);
%}
%{
---------------------------------------------------------------
18. Bound constrained problems

Optimization subject to constraints is nicely covered by a variety
of tools and texts. I won't even try to do a better job. There are
some days when you just don't want to roll out fmincon. Perhaps
you have already got your problem working with fminsearch, but it
occasionally tends to wander into a bad place. Bound constraints
are the most common class of constraint to employ, and they are
quite easy to implement inside an optimization problem.

I could spend a fair amount of time here showing how to use Lagrange
multipliers to solve constrained problems. You can find them in
textbooks though. And some problems do well with penalties imposed
on the objective function. I won't get into penalty function methods.
Personally, I've never really liked them. They often introduce
discontinuities into the objective function. Even so, the
optimizer may often step lightly over the bounds.
The simple trick is to use a transformation of variables. Allow
the optimizer to leave your variable unconstrained, but then
inside the objective function you perform a transformation of
the variable. If a variable must be positive, then square it. If
it must lie in a closed interval, then use a sine transformation
to transform an unbounded variable into one which is implicitly
bounded.

I'll give an example of such a transformation solution, then suggest
that you look at fminsearchbnd from MATLAB Central. Here is a url:
(as with consolidator, I'd prefer that you download the file exchange
copy, as it will always be the current version.)
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8277&objectType=file
%}
%%
% As an example, let's find the minimum of a simple function of
% two variables. This simple function, z = x^2 + y^2, is a
% paraboloid of revolution. It's a simple conic form.
fun = @(vars) vars(1).^2 + vars(2).^2
% It's easy enough to minimize with fminsearch. The result is zero,
% at least as well as fminsearch will solve it.
format short g
minxy = fminsearch(fun,rand(1,2))
%%
% Now let's constrain the first variable to be >= 1. I'll define
% a transformation of x, such that x = u^2 + 1.
fun2 = @(vars) (vars(1).^2+1).^2 + vars(2).^2;
minuy = fminsearch(fun2,rand(1,2));
% And after fminsearch is done, transform u back into x.
% y was not transformed at all.
minxy = [minuy(1).^2+1, minuy(2)]
% This is the minimizer of our original function fun, subject to the
% bound constraint that x >= 1.
%%
% These transformations have been implemented in fminsearchbnd, which
% is really just a transparent wrapper on top of fminsearch. Try it
% on the (well known, even famous) Rosenbrock function.
rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;
% With no constraints, fminsearch finds the point [1 1]. This is
% the global optimizer, with no constraints.
xy = fminsearch(rosen,[3 4])
%%
% minimize the Rosenbrock function, subj
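%%
% A sketch of the sine transformation described earlier (my addition;
% the interval [2,5] and the test function here are made up purely for
% illustration). To keep x in a closed interval [a,b], optimize over
% an unconstrained u, mapping x = a + (b-a)*(sin(u)+1)/2 inside the
% objective.
a = 2; b = 5;
xofu = @(u) a + (b-a)*(sin(u)+1)/2;
% minimize (x-10)^2 subject to 2 <= x <= 5
fofu = @(u) (xofu(u)-10).^2;
ufinal = fminsearch(fofu,0);
xsol = xofu(ufinal)
% the constrained minimizer should land at the upper bound, x = 5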