
optimtips_16_31.m

A few small examples using the MATLAB Optimization Toolbox, mainly introducing ideas from optimization. Discussion is welcome.
Under some circumstances there may also be linear inequality
constraints.)

We'll start off with a simple example. Assume a model of

  y = a + b*x1 + c*x2

with the simple equality constraint

  c - b = 1

How would we solve this problem on paper? (Yes, lsqlin solves
these problems in its sleep. We'll do it the hard way instead.)
Simplest is to eliminate one of the unknowns using the constraint.
Replace c with 1+b in the model.

  c = 1 + b

So

  y = a + b*x1 + (1+b)*x2

Rearrange terms to get this

  (y - x2) = a + b*(x1+x2)

Estimate the unknowns (a,b), then recover c once b is known.

A point to note is I suggested we eliminate c in the model.
As easily, we could have eliminated b, but we could not have
chosen to eliminate a, because a did not enter into our
constraint at all. Another thing to think about is if the
constraint had some coefficients with values very near zero.
Elimination of those variables would then cause significant
instabilities in the computations. It's similar to the reason
why pivoting is useful in the solution of systems of equations.

Let's try it on some data:
%}
%%
n = 500;
x1 = rand(n,1);
x2 = rand(n,1);
% yes, I know that these coefficients do not actually satisfy the
% constraint above. They are close though.
y = 1 + 2*x1 + pi*x2 + randn(n,1)/5;

% solve the reduced problem above
ab = [ones(n,1), x1+x2]\(y - x2)

% recover c
c = 1 + ab(2)

% Note that the coefficients were originally (1,2,pi), but application
% of the constraint c - b = 1 pulls the estimates slightly away from
% those values, since (1,2,pi) does not satisfy it exactly.

%%
% Had we never applied any constraint, the estimates would be closer
% to the (1,2,pi) values than the constrained solution is. This is
% easy to verify.
[ones(n,1),x1,x2]\y

%%
% We may use lsqlin to verify our constrained solution
lsqlin([ones(n,1),x1,x2],y,[],[],[0 -1 1],1,[],[],[],optimset('largescale','off'))
% As expected, the two constrained solutions agree.

%%
% The real question in this section is not how to solve a linearly
% constrained problem, but how to solve it programmatically, and
% how to solve it for possibly multiple constraints.

% Start with a completely random least squares problem.
n = 20; % number of data points
p = 7;  % number of parameters to estimate
A = rand(n,p);

% Even generate random coefficients for our ground truth.
coef0 = 1 + randn(p,1)
y = A*coef0 + randn(n,1);

% Finally, choose some totally random constraints
m = 3;  % The number of constraints in the model
C = randn(m,p);
D = C*coef0;

%%
% Again, compute the simple (unconstrained) linear regression
% estimates for this model
coef1 = A\y

% verify that the unconstrained model fails to satisfy the
% constraints, but that the original (ground truth) coefficients
% do satisfy them: D-C*coef0 should be zero, D-C*coef1 should not.
[D-C*coef0, D-C*coef1]

%%
% How does one solve the constrained problem? There are at least
% two ways to do so (if we choose not to resort to lsqlin.) For
% those devotees of pinv and the singular value decomposition,
% one such approach would involve a splitting of the solution to
% A*x = y into two components: x = x_u + x_c. Here x_c must lie
% in the row space of the matrix C, while x_u lies in its null
% space. The only flaw with this approach is it will fail for
% sparse constraint matrices, since it would rely on the singular
% value decomposition.
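%%
% A minimal sketch of that splitting idea, using null() and pinv(), so
% it assumes a dense constraint matrix C with full row rank. The names
% N, x_p, z and x_ns are only for this illustration.
N = null(C);            % orthonormal basis for the null space of C
x_p = pinv(C)*D;        % particular solution of C*x = D, in the row space of C
z = (A*N)\(y - A*x_p);  % ordinary least squares for the null space component
x_ns = x_p + N*z        % constrained solution: C*x_ns = D holds by construction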
%%
% I'll discuss an approach that is based on the qr factorization
% of our constraint matrix C. It is also nicely numerically stable,
% and it offers the potential for use on large sparse constraint
% matrices.
[Q,R,E] = qr(C,0)

% First, we will ignore the case where C is rank deficient. (High
% quality numerical code would not ignore that case, and the QR
% allows us to identify and deal with that event. It is merely a
% distraction in this discussion however.)
%
% We transform the constraint system C*x = D by left multiplying
% by the inverse of Q, i.e., its transpose. Thus, with the pivoting
% applied to x, the constraints become
%
%  R*x(E) = Q'*D
%
% In effect, we wanted to compute the Q-less QR factorization,
% with pivoting.
%
% Why did we need pivoting? As I suggested above, numerical
% instabilities may result otherwise.
%
% We will reduce the constraints further by splitting them into
% two fragments. Assuming that C had fewer rows than columns,
% then R can be broken into two pieces:
%
%  R = [R_c, R_u]
R_c = R(:,1:m);
R_u = R(:,(m+1):end);

% Here R_c is an mxm, upper triangular matrix, with non-zero
% diagonals. The non-zero diagonals are ensured by the use of
% pivoting. In effect, column pivoting provides the means by
% which we choose those variables to eliminate from the regression
% model.
%
% The pivoting operation has effectively split x into two pieces,
% x_c and x_u. The variables x_c will correspond to the first m
% pivots identified in the vector E.
%
% This split can be mirrored by breaking the matrices into pieces
%
%  R_c*x_c + R_u*x_u = Q'*D
%
% We will use this version of our constraint system to eliminate
% the variables x_c from the least squares problem. Break A into
% pieces also, mirroring the qr pivoting:
A_c = A(:,E(1:m));
A_u = A(:,E((m+1):end));

%%
% So the least squares problem, split in terms of the variables
% as we have reordered them, is:
%
%  A_c*x_c + A_u*x_u = y
%
% We can now eliminate the appropriate variables from the linear
% least squares.
%
%  A_c*inv(R_c)*(Q'*D - R_u*x_u) + A_u*x_u = y
%
% Expand and combine terms. Remember, we will not use inv()
% in the actual code, but instead use \. The \ operator, when
% applied to an upper triangular matrix, is very efficient
% compared to inv().
%
%  (A_u - A_c*(R_c\R_u)) * x_u = y - A_c*(R_c\(Q'*D))
x_u = (A_u - A_c*(R_c\R_u)) \ (y - A_c*(R_c\(Q'*D)))

%%
% Finally, we recover x_c from the constraint equations
x_c = R_c\(Q'*D - R_u*x_u)

%%
% And we put it all together in the unpivoted solution vector x:
xfinal = zeros(p,1);
xfinal(E(1:m)) = x_c;
xfinal(E((m+1):end)) = x_u

%%
% Were we successful? How does this result compare to lsqlin?
% The two are identical (as usual, only to within floating
% point precision irregularities.)
lsqlin(A,y,[],[],C,D,[],[],[],optimset('largescale','off'))

%%
% Verify that the equality constraints are satisfied to within
% floating point tolerances.
C*xfinal - D
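%%
% A minimal sketch of how the steps above might be wrapped into a
% reusable helper. The name lse_qr, and the assumption of a dense
% constraint matrix C with full row rank and m < p, are only for this
% illustration.
%
%   function x = lse_qr(A,y,C,D)
%   % Minimize norm(A*x - y) subject to C*x = D, by eliminating
%   % variables through a column-pivoted QR factorization of C.
%   [m,p] = size(C);
%   [Q,R,E] = qr(C,0);
%   R_c = R(:,1:m);
%   R_u = R(:,(m+1):end);
%   A_c = A(:,E(1:m));
%   A_u = A(:,E((m+1):end));
%   QD = Q'*D;
%   x_u = (A_u - A_c*(R_c\R_u)) \ (y - A_c*(R_c\QD));
%   x_c = R_c\(QD - R_u*x_u);
%   x = zeros(p,1);
%   x(E(1:m)) = x_c;
%   x(E((m+1):end)) = x_u;
%   end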
%%
%{
---------------------------------------------------------------
26. Sums of squares surfaces and the geometry of a regression

I'll admit in advance that this section has no tips or tricks
to look for. But it does have some pretty pictures, and I hope
it leads into the next two sections, where I talk about confidence
limits and error estimates for parameters.

Consider the very simple (linear) regression model:

  y = a0 + a1*x + error

The sum of squares surface as a function of the parameters
(a0,a1) will have ellipsoidal contours. Theory tells us this,
but it's always nice to back up theory with an example. Then
I'll look at what happens in a nonlinear regression.
%}
%%
% A simple linear model
n = 20;
x = randn(n,1);
y = 1 + x + randn(n,1);

close
figure
plot(x,y,'o')
title 'Linear data with noise'
xlabel 'x'
ylabel 'y'

%%
% linear regression estimates
M = [ones(n,1),x];
a0a1 = M\y

% look at various possible values for a0 & a1
v = -1:.1:3;
nv = length(v);
[a0,a1] = meshgrid(v);
a0 = a0(:)';
a1 = a1(:)';
m = length(a0);
SS = sum(((ones(n,1)*a0 + x*a1) - repmat(y,1,m)).^2,1);
SS = reshape(SS,nv,nv);

surf(v,v,SS)
title 'Sum of squares error surface'
xlabel 'a0'
ylabel 'a1'

figure
contour(v,v,SS)
hold on
plot(a0a1(1),a0a1(2),'rx')
plot([[-1;3],[1;1]],[[1;1],[-1;3]],'g-')
hold off
title 'Linear model: Sum of squares (elliptical) error contours'
xlabel 'a0'
ylabel 'a1'
axis equal
axis square
grid on
% Note: the min SSE will occur roughly at (1,1), as indicated
% by the green crossed lines. The linear regression estimates
% are marked by the 'x'.
% As predicted, the contours are elliptical.

%%
% Next, we will do the same analysis for a nonlinear model
% with two parameters:
%
%   y = a0*exp(a1*x.^2) + error
%
% First, we will look at the error surface.

% A simple nonlinear model
n = 20;
x = linspace(-1.5,1.5,n)';
y = 1*exp(1*x.^2) + randn(n,1);

figure
plot(x,y,'o')
title 'Nonlinear data with noise'
xlabel 'x'
ylabel 'y'

%%
% Nonlinear regression estimates. Use pleas for this one.
a1_start = 2;
[a1,a0] = pleas({@(a1,xdata) exp(a1*xdata.^2)},a1_start,x,y)
a0a1 = [a0,a1];

%%
% look at various possible values for a0 & a1
v = .5:.01:1.5;
nv = length(v);
[a0,a1] = meshgrid(v);
a0 = a0(:)';
a1 = a1(:)';
m = length(a0);
SS = sum((((ones(n,1)*a0).*exp(x.^2*a1)) - repmat(y,1,m)).^2,1);
SS = reshape(SS,nv,nv);
minSS = min(SS(:));

surf(v,v,SS)
title 'Nonlinear model: Sum of squares error surface'
xlabel 'a0'
ylabel 'a1'

figure
contour(v,v,SS,minSS + (0:.25:2))
hold on
plot(a0a1(1),a0a1(2),'rx')
plot([[-1;3],[1;1]],[[1;1],[-1;3]],'g-')
hold off
title 'Nonlinear model: Sum of squares error contours'
xlabel 'a0'
ylabel 'a1'
axis equal
axis square
grid on
% Note: This time, the sums of squares surface is not as
% neatly elliptical.

%%
%{
---------------------------------------------------------------
27. Confidence limits on a regression model

There are various things that people think of when the phrase
"confidence limits" arises.

- We can ask for confidence limits on the regression parameters
themselves. This is useful to decide if a model term is statistically
significant; if not, we may choose to drop it from the model.

- We can ask for confidence limits on the model predictions (yhat).

These goals are related of course. They are obtained from the
parameter covariance matrix from the regression. (The reference to
look at here is again Draper and Smith.)

If our regression problem is to solve for x, such that A*x = y,
then we can compute the covariance matrix of the parameter vector
x by the simple formula

  V_x = inv(A'*A)*s2

where s2 is the error variance. Typically the error variance is
unknown, so we would use a measure of it from the residuals:

  s2 = sum((y-yhat).^2)/(n-p);

Here n is the number of data points, and p the number of parameters
to be estimated. This presumes little or no lack of fit in the model.
Of course, significant lack of fit would invalidate any confidence
limits of this form.

Often only the diagonal of the covariance matrix is used. This
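(A minimal added sketch of the two formulas above, for a generic linear
problem A*x = y with n data points and p estimated parameters; the variable
names are only for illustration:

  xhat = A\y;                         % least squares estimates
  yhat = A*xhat;                      % model predictions
  s2   = sum((y - yhat).^2)/(n - p);  % residual estimate of the error variance
  V_x  = inv(A'*A)*s2;                % parameter covariance matrix
  se   = sqrt(diag(V_x));             % standard errors from the diagonal

The last line is the "diagonal only" use just mentioned.)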
