
📄 svm_multi_pred.m

📁 A concise, easy-to-use SVM code
💻 MATLAB
function [beta, bo] = svm_multi_pred(X, Y, C, varargin)
% SVM_MULTI_PRED
%
% Support Vector Multi Classification
%
% USAGE: [beta, bo] = svm_multi_pred(X, Y, C, varargin)
%
% PARAMETERS:  X      - (m,d) matrix of m training inputs in R^d
%              Y      - m-vector of training targets in {1,..,Q}
%              C      - trade-off between regularization and empirical error
%              beta   - coefficient matrix of the expansion of the
%                       vectors w_i over the x_p's
%              bo     - bias terms
%
% SUBROUTINES: svm_multi_init.m -> initialization of the optimization problem
%              squadsolve.m     -> optimization
%              compute_kernel.m -> computation of the Gram matrix
%
% DESCRIPTION:
%
%   This procedure implements a multiclass SVM corresponding to the
%   following problem:
%
%     \min_{w,b,\xi}  \sum_{i \neq j} \|w_i - w_j\|^2 + C \sum_{p,j} \xi_{pj}
%
%   subject to, for all p = 1,..,m and all j:
%
%     (w_{c(p)} - w_j).x_p + b_{c(p)} - b_j >= 1 - \xi_{pj}
%     \xi_{pj} >= 0
%
%   where:  m is the number of training points
%           Q is the number of classes
%           i and j are indices between 1 and Q
%           p is an index between 1 and m
%           c(p) is the class of x_p
%           w.x is the dot product between w and x
%
%   To solve this problem, a direct approach is used. The whole Gram matrix
%   K (K_{i,j} = x_i.x_j) is computed before the optimization begins, so the
%   memory cost is proportional to Q^2*m^2. The solution is not approximated
%   but is provided by the quadratic programming method implemented in
%   squadsolve.m. Instead of minimizing the primal objective function, we
%   maximize the dual; its formulation is not given here.
%
% EXTRA PARAMETERS:
%   The 'varargin' in the declaration means that extra parameters can be
%   added to specify different kernels. One can add three parameters:
%   kernel, param1, param2. They correspond to:
%     kernel = 0 -> linear SVM
%     kernel = 1 -> polynomial SVM
%                   k(x_i,x_j) = (x_i.x_j + param2)^param1
%     kernel = 2 -> Gaussian SVM
%                   k(x_i,x_j) = exp(-||x_i - x_j||^2 / param1^2)
%   If only two parameters are given, then C is assumed to be infinite. If
%   no extra parameters are added, kernel is assumed to be 0 (linear SVM).
%
% ERRORS AND BUGS:
%   It is recommended not to use this code for large problems: unless your
%   computer has a huge memory, the procedure will stop with the message
%   'memory exhausted'. Use the code msvm_fw.m or chunk_msvm.m instead.
%   Sometimes the bias may not be computable because the term 'bias_eps'
%   (see below) is too large; in that case, decrease 'bias_eps' and it
%   should work. In all cases, the program will output the bias and try to
%   give the best solution.
%
% EXAMPLE OF USE:
%
%   > [beta,bo] = svm_multi_pred(inputs,targets,10,2,0.1);
%   > out = output(beta,bo,inputs,X,2,0.1);
%
%   -> solves the multiclass optimization problem for a learning set
%      S = \cup_i {(inputs(i,:), targets(i))} with C = 10 and a Gaussian
%      kernel with sigma = 0.1. It outputs the coefficients beta and the
%      bias bo, which can be used directly in the procedure output.m to
%      yield the output of the SVM on different data (here X).
%
% NOTES: The dual formulation can be found in my thesis, pp. 56-57.
%
% Andre Elisseeff, Sep. 2000
% aelissee@eric.univ-lyon2.fr

% Description of variables:
%
%   kernel = 0,1,2 (integer)  -> type of kernel
%   m            (integer)    -> number of input points
%   Q            (integer)    -> number of classes
%   bias_eps     (real)       -> precision for the computation of the bias
%   reg_eps      (real)       -> to avoid ill conditioning, a diagonal
%                                matrix is added to the Hessian of the
%                                minimization problem; reg_eps scales this
%                                diagonal matrix (usually < 10^(-6))
%   A        ((Q,Q*m) matrix) -> equality constraint matrix
%   H      ((Q*m,Q*m) matrix) -> Hessian of the objective function
%   c          (Q*m vector)   -> linear part of the objective function
%   K          ((m,m) matrix) -> Gram matrix
%   alpha      (Q*m vector)   -> variables of the optimization problem
%   constraints (Q*m vector)  -> stores 1 - (w_{c(p)} - w_j).x_p
%   beta       ((m,Q) matrix) -> vector w_i is computed as
%                                w_i = sum_p beta(p,i) x_p
%   b          ((Q,Q) matrix) -> stores the differences between biases:
%                                b(i,j) = b_i - b_j

% Avoid warning messages
warning off;

% Test whether nargin is correct
if (nargin < 2) || (nargin > 6)
  help svm_multi_pred;
else

  % Init
  empty_list_elements_ok = 1;   % Octave compatibility flag
  kernel = 0;
  m = size(X,1);
  Q = max(Y);
  if (nargin < 3)
    C = Inf;
  end
  if (C == Inf)
    tol = 1e-5;
  else
    tol = C*1e-6;
  end
  bias_eps = 10^(-8);
  reg_eps  = 10^(-6);
  sv_eps   = 10^(-6);

  % Message
  disp(sprintf('\nMulti Support Vector Classification\n'));
  disp(sprintf('-----------------------------------\n'));

  % Construct the kernel matrix
  disp(sprintf('Constructing the gram matrix...\n'));
  K = compute_kernel(X,X,varargin);

  % Set up the parameters of the optimization problem
  disp(sprintf('Initialization...\n'));
  [H,A,c] = svm_multi_init(Y,K);

  % Add a small amount of zero-order regularization to
  % avoid problems when the Hessian is badly conditioned.
  H = 1/(2*Q^2)*H + reg_eps*eye(size(H));

  % The problem is to minimize (1/(4*Q^2)) x'Hx - c.x
  c = -c;

  % Some variables are irrelevant.
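  % (Why: for each point p, the component attached to the pair (p, c(p))
  % compares the class of x_p with itself; its constraint involves none of
  % the w_i or b_i, so these m variables are dropped and only the remaining
  % (Q-1)*m components are passed to the solver.)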
  % Keep only the relevant variables
  mul  = Q*(0:m-1);
  indx = mul' + Y;
  s1 = (1:Q*m);
  s2 = indx;
  indx = setdiff(s1,s2)';
  clear s1; clear s2; clear mul;

  % Solve the optimization problem;
  % the solution is stored in alpha
  disp(sprintf('Begin the optimization ...\n'));
  bo = zeros(size(A,1),1);
  [x1] = squadsolve(H(indx,indx),c(indx),A(:,indx),bo,C);
  obj = 0.5*x1'*(H(indx,indx) - reg_eps*eye(length(indx)))*x1 + c(indx)'*x1;
  clear H; clear c;
  alpha = zeros(Q*m,1);
  alpha(indx) = x1;
  clear x1;

  % Output the objective value
  disp(sprintf('Final objective function: %f \n',-obj));

  % Compute the number of support vectors
  % (can be removed if not desired)
  M = max(alpha);
  nsv = 0;
  for i=1:m
    if any(alpha(Q*(i-1)+1:Q*i) > M*10^(-3)/m)
      nsv = nsv + 1;
    end
  end
  disp(sprintf('Support vectors : %d (%3.1f%%)\n',nsv,100*nsv/m));

  % Compute the coefficients beta(p,i) of the vectors w_i = sum_p beta(p,i) x_p

  % Clean alpha
  beta = zeros(m,Q);
  notvoid = find(alpha >= M*10^(-3)/m);
  alpha_tmp = zeros(Q*m,1);
  alpha_tmp(notvoid) = alpha(notvoid);
  alpha = alpha_tmp;
  clear notvoid; clear alpha_tmp;

  % Compute beta
  for i=1:Q
    for p=1:m
      tmp = 0;
      if (i == Y(p))
        tmp = tmp + sum(alpha((p-1)*Q + 1 : (p-1)*Q + Q))/Q;
      end
      beta(p,i) = tmp - alpha((p-1)*Q + i)/Q;
    end
  end

  % Computation of the bias
  disp(sprintf('\n Computation of the bias........\n'));

  % Computation of the outputs (w_{c(p)} - w_i).x_p
  errorcache = zeros(Q*m,1);
  for p=1:m
    for i=1:Q
      temp = (beta(:,Y(p)) - beta(:,i))'*K(:,p);
      errorcache((p-1)*Q+i) = temp;
    end
  end

  % Computation of alpha_bias.
  % Here we compute the bias that minimizes \sum_{pi} \xi_{pi}
  % subject to the linear constraints:
  %   (w_{c(p)} - w_i).x_p + b_{c(p)} - b_i >= 1 - \xi_{pi}
  %   \xi_{pi} >= 0
  % This is a linear program whose dual is easy to compute. It can be
  % solved with the slinearsolve method.
  b = zeros(Q,Q);   % matrix of the differences b_i - b_j

  % Computation of the outputs 1 - (w_{c(p)} - w_i).x_p
  constraints = 1 - errorcache;

  if C ~= Inf
    alpha_bias = zeros(Q*m,1);
    c_un = ones(Q*m,1);
    [alpha_bias(indx)] = slinearsolve(errorcache(indx)-c_un(indx)+1,A(1:(Q-1),indx),zeros(Q-1,1),1);
  else
    b = -Inf*ones(Q,Q);
    for i=1:m
      for k=1:Q
        b(Y(i),k) = max([b(Y(i),k), constraints(Q*(i-1)+k) - 0.05]);
      end
    end
    for k=1:Q
      b(k,k) = 0;
    end
  end

  % Consider the alpha(p,i) such that 0 < alpha_bias(p,i) < 1.
  % Here alpha_bias is considered because we compute the bias that
  % minimizes the l_1 error of the current solution, not of the optimal one.
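  % (KKT reasoning behind the averaging below: for a dual variable strictly
  % between its bounds, the corresponding primal constraint is active, i.e.
  % (w_{c(p)} - w_i).x_p + b_{c(p)} - b_i = 1, so the difference
  % b_{c(p)} - b_i equals 1 - (w_{c(p)} - w_i).x_p = constraints(ind).
  % Each such index therefore yields one estimate of b_{c(p)} - b_i.)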
  if C == Inf
    Cm = 1;
  else
    Cm = C;
  end

  % If there are no such alpha -> problem.
  % The biases are computed by averaging;
  % the matrix 'compteur' counts when b_i - b_j is computed.
  compteur = zeros(Q,Q);
  if C ~= Inf
    indice_fix = find(alpha_bias < C*(1-bias_eps) & alpha_bias > Cm*bias_eps);
    if isempty(indice_fix)
      disp('WARNING : Unable to compute the bias, use another method');
    end
    for ind=indice_fix'
      % Find the indices indQ and indp corresponding to ind
      indQ = rem(ind,Q);
      if (indQ == 0), indQ = Q; end
      indp = (ind-indQ)/Q;
      compteur(Y(indp+1),indQ) = compteur(Y(indp+1),indQ) + 1;
      b(Y(indp+1),indQ) = b(Y(indp+1),indQ) + constraints(ind);
    end

    % Average by dividing by compteur
    for i=1:Q
      b(i,i) = 0;
      for j=1:Q
        if (compteur(i,j))
          b(i,j) = b(i,j)/compteur(i,j);
        end
      end
    end
  end  % if C ~= Inf

  clear indice_fix; clear constraints;

  % Transform b so that it is antisymmetric
  for i=1:Q
    for j=i+1:Q
      if (b(i,j) == 0)
        b(i,j) = -b(j,i);
      end
      if (b(j,i) == 0)
        b(j,i) = -b(i,j);
      end
    end
  end

  % Average between both antisymmetric elements
  for i=1:Q
    for j=i+1:Q
      b(i,j) = (b(i,j)-b(j,i))/2;
      b(j,i) = -b(i,j);
    end
  end

  % Compute the values of b_i such that sum_i b_i = 0
  ok = 0;               % ok > 0 if the computation is possible, otherwise ok = 0
  b_guess = zeros(Q,1); % if no computation is possible, still give a guess
  bo = zeros(Q,1);      % the bias vector
  for i=1:Q
    z = find(b(i,:) ~= 0);
    if (length(z) == Q-1)
      ok = ok + 1;
      temp = sum(b(i,:))/Q;
      for j=1:Q
        bo(j) = bo(j) + temp - b(i,j);
      end
    else
      disp(sprintf('Warning : bias %d -> problem... %d\n',i,length(z)));
      if length(z)
        b_guess(i) = sum(b(i,:))/length(z);
      end
      for j=1:length(z)
        b_guess(z(j)) = b_guess(i) - b(i,z(j));
      end
    end
  end

  if ~ok
    % Then one should compute the bias with another method
    disp('WARNING : Problem with the bias, use another method');
    bo = b_guess;
  else
    bo = bo/ok;
  end

end  % test on nargin
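As a complement to the EXAMPLE OF USE in the header, here is a small self-contained sketch of a training run. The data, the value of C, and the kernel width are arbitrary toy choices, and the snippet assumes the companion routines (svm_multi_init.m, squadsolve.m, compute_kernel.m, output.m) are on the MATLAB path; the call to output follows the signature shown in the header.

% Toy three-class problem in R^2 (hypothetical data).
inputs  = [randn(20,2) + 2; randn(20,2) - 2; 0.5*randn(20,2)];
targets = [ones(20,1); 2*ones(20,1); 3*ones(20,1)];

% Train with C = 10 and a Gaussian kernel of width sigma = 0.5.
[beta, bo] = svm_multi_pred(inputs, targets, 10, 2, 0.5);

% Evaluate on a few test points (signature from the EXAMPLE OF USE above).
Xtest = [0 0; 2 2; -2 -2];
out = output(beta, bo, inputs, Xtest, 2, 0.5);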

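compute_kernel.m itself is not listed on this page. Based on the kernel definitions in the header and on the call compute_kernel(X,X,varargin), which forwards the extra arguments as a single cell array, a minimal sketch could look like the following; the argument handling and the defaults are assumptions, not the original implementation.

function K = compute_kernel(X1, X2, args)
% Sketch of a Gram-matrix routine matching the kernels documented in
% svm_multi_pred.m. 'args' is assumed to be the cell array {kernel,
% param1, param2} forwarded from varargin; this interface is a guess.
kernel = 0; param1 = 1; param2 = 0;        % defaults: linear kernel
if length(args) >= 1, kernel = args{1}; end
if length(args) >= 2, param1 = args{2}; end
if length(args) >= 3, param2 = args{3}; end
switch kernel
  case 0  % linear: k(x_i,x_j) = x_i.x_j
    K = X1*X2';
  case 1  % polynomial: k(x_i,x_j) = (x_i.x_j + param2)^param1
    K = (X1*X2' + param2).^param1;
  case 2  % Gaussian: k(x_i,x_j) = exp(-||x_i - x_j||^2 / param1^2)
    sq = repmat(sum(X1.^2,2), 1, size(X2,1)) ...
       + repmat(sum(X2.^2,2)', size(X1,1), 1) - 2*(X1*X2');
    K  = exp(-sq / param1^2);
  otherwise
    error('compute_kernel: unknown kernel type %d', kernel);
end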