⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 lsselect.m

📁 一个非常实用的统计工具箱
💻 M
字号:
function [Q, I, B, BB] = lsselect(y,x,crit,how,pmax,level);%LSSELECT Select a predictor subset for regression%%     	  [Q, I, B, BB] = lsselect(y,x,crit,how,pmax,level)%%	  Selects a good subset of regressors in a multiple linear %	  regression model. The criterion is one of the following %	  ones, determined by the third argument, crit. %%	    'HT'   Hypothesis Test (default level = 0.05)%	    'AIC'  Akaike's Information Criterion %	    'BIC'  Bayesian Information Criterion%	    'CMV'  Cross Model Validation (inner criterion RSS)%%	  The forth argument, how, choses between %%	    'AS'   All Subsets%	    'FI'   Forward Inclusion%	    'BE'   Backward Elimination%%	  The fifth argument, pmax, limits the number of included%	  parameters. The returned Q is the criterion as a function of %	  the number of parameters; it might be interpreted as an %	  estimate of the prediction standard deviation. For the method%	  'HT' the reported Q is instead the successive p-values for%	  inclusion or elimination.%%	  The last column of the prediction matrix x must be an intercept %	  column, ie all elements are ones. This column is never excluded %	  in the search for a good model. If it is not present it is added.  %	  The output I contains the index numbers of the included columns.%	  For the method 'HT' the optional input argument "level" is the %	  p-value reference used for inclusion or deletion. Output B is%	  the vector of coefficients, ie the suggested model is %	  Y = X*B. Column p of BB is the best B of parameter size p. %%	  This function is not highly optimized for speed but rather for%	  flexibility. It would be faster if 'all subsets' were in a %	  separate routine and 'forward' and 'backward' were in another%	  routine, especially for CMV.%%         See also LSFIT and LINREG.%       Anders Holtsberg, 14-12-94%       Copyright (c) Anders Holtsbergn = length(y);nc = size(x,2);if nargin<5    pmax = nc;elseif isempty(pmax)   pmax = nc;endif nargin<4   how = 'FI';   fprintf('   Default is forward selection\n');endif nargin<3   crit = 'CMV';   fprintf('   Default criterion is CMV\n');endif strcmp(how,'BE')   pmax = nc;endif nargin<6 & strcmp(crit,'HT')    level = 0.05;   fprintf('   Default level is 0.05\n');endif any(x(:,nc)~=1)   fprintf('   An intercept column added')   x = [x ones(n,1)];   nc = nc + 1;   pmax = pmax + 1;endif nc<2, disp('only one model'), return, end if ~(strcmp(crit,'HT')|strcmp(crit,'AIC')|...     strcmp(crit,'BIC')|strcmp(crit,'CMV'))  error('Third argument error');endif ~(strcmp(how,'BE')|strcmp(how,'FI')|strcmp(how,'AS'))  error('Forth argument error');end      Qsml = NaN*ones(pmax,1);XX = x'*x;XY = x'*y; YY = y'*y;  % === If all subsets then set up an all-subsets-indicator-matrix ====if strcmp(how,'AS')   C = [];   for i = 1:nc-1      d = max(1,size(C,1));      C = [zeros(d,1) C; ones(d,1) C];      H = C * ones(size(C,2),1);      J = find(H<pmax);      C = C(J,:);   end   H = H(J);   [S,I] = sort(H);   C = C(I,:);   C = [C ones(size(C,1),1)];   AllSubsets = C;   AllSubsetsH = sum(C')';end% === This is for CMV ===============================================if strcmp(crit,'CMV')   dataloopend = n + 1;   Qcmv = zeros(pmax,1);   XXs = XX;   XYs = XY;   YYs = YY;else   dataloopend = 1;endfor idata = 1:dataloopend      if strcmp(crit,'CMV')      fprintf('\n %3.0f:', idata*(idata ~= dataloopend))      if idata == dataloopend         XX = XXs;         XY = XYs;         YY = YYs;      else         xi = x(idata,:);         yi = y(idata);         XX = XXs - xi'*xi;         XY = XYs - xi'*yi;         YY = YYs - yi^2;       end   end% === Now we begin to loop over model sizes =========================   if strcmp(how,'BE')      p = nc;      loopto = 1;      C = ones(1,nc);   else      p = 1;      loopto = pmax;      C = [zeros(1,nc-1) 1];   end   BB = zeros(nc,pmax);   fprintf('  ');   while 1;% === The whole loop over the models of size p  ======================           fprintf('%2.0f ', p)      qmin = 1e99;      for k = 1:size(C,1)         Jk = find(C(k,:));         q = YY - XY(Jk)'*(XX(Jk,Jk)\XY(Jk));                        if q < qmin            qmin = q;            Cbest = C(k,:);         end      end      Qsml(p) = qmin;      I = find(Cbest);      BB(I,p)  = XX(I,I)\XY(I);   % === And a piece for CMV only ======================================      if strcmp(crit,'CMV') & idata < n + 1         Jk = find(Cbest);         q = yi - xi(Jk)*(XX(Jk,Jk)\XY(Jk));                        Qcmv(p) = Qcmv(p) + q^2;      end% === Next parameter size ===========================================         if p == loopto, break, end      if strcmp(how,'FI')         p = p + 1;         C = [];         for i = 1:nc-1            if Cbest(i)==0               Cnew = Cbest;               Cnew(i) = 1;               C = [C; Cnew];            end         end      elseif strcmp(how,'BE')         p = p - 1;         C = [];         for i = 1:nc-1            if Cbest(i)==1               Cnew = Cbest;               Cnew(i) = 0;               C = [C; Cnew];            end         end         elseif strcmp(how,'AS')         p = p + 1;         C = AllSubsets(find(AllSubsetsH==p),:);      end      endend% === Finished at last ===============================================if strcmp(crit,'CMV')    Q = sqrt(Qcmv/n);endif strcmp(crit,'AIC')    Q = sqrt(Qsml/n).*exp((1:pmax)'/n);end;if strcmp(crit,'BIC')   Q = sqrt(Qsml/n).*exp((1:pmax)'/n*log(n)/2);endif strcmp(crit,'HT')   Qsml = sqrt(Qsml/n);   Fvals = (Qsml(1:pmax-1).^2 - Qsml(2:pmax).^2) ...            ./ (Qsml(2:pmax).^2 ./ (n-(2:pmax))' );   Q = 1-pf(Fvals,1,(n-(2:pmax))');   i = find(Q>level);   if strcmp(how,'BE')      i = [1; Q<level];      i = find(i);      i = i(length(i));   else      i = [Q>level; 1];      i = find(i);      i = i(1);   endelse   [S,i] = sort(Q);   i = i(1);endB = BB(:,i);I = find(B);fprintf('\n');

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -