📄 parafac.m

📁 多维数据处理：MATLAB源程序用于处理多维数据
💻 M
📖 第 1 页 / 共 2 页
字号:
12 下一页
function [Factors,it,err,corcondia]=parafac(X,DimX,Fac,Options,const,OldLoad,FixMode,Weights);

% PARAFAC
% See also:
% 'npls' 'tucker' 'dtld' 'gram'
%
% Copyright, 1998 - 
% This M-file and the code in it belongs to the holder of the
% copyrights and is made public under the following constraints:
% It must not be changed or modified and code cannot be added.
% The file must be regarded as read-only. Furthermore, the
% code can not be made part of anything but the 'N-way Toolbox'.
% In case of doubt, contact the holder of the copyrights.
%
% Rasmus Bro
% Chemometrics Group, Food Technology
% Department of Food and Dairy Science
% Royal Veterinary and Agricultutal University
% Rolighedsvej 30, DK-1958 Frederiksberg, Denmark
% Phone  +45 35283296
% Fax    +45 35283245
% E-mail rb@kvl.dk
%
%
%
%     ___________________________________________________
%
%                  THE PARAFAC MODEL
%     ___________________________________________________
% 
% [Factors,it,err,corcondia,Weights] = parafac(X,DimX,Fac,Options,const,OldLoad,FixMode,Weights);
%
% or skipping optional in/outputs
%
% Factors = parafac(X,DimX,Fac);
%
% Algorithm for computing an N-way PARAFAC model. Optionally
% constraints can be put on individual modes for obtaining 
% orthogonal, nonnegative, or unimodal solutions. The algorithm
% also handles missing data. For details of PARAFAC 
% modeling see R. Bro, Chemom. Intell. Lab. Syst., 1997.
%
% Several possibilities exist for speeding up the algorithm. 
% Compressing has been incorporated, so that large arrays can be
% compressed by using Tucker (see Bro & Andersson, Chemom. 
% Intell. Lab. Syst., 1998).
% Another acceleration method incorporated here is to 
% extrapolate the individual loading elements a number of 
% iterations ahead after a specified number of iterations.
%
% A temporary MAT-file called TEMP.mat is saved for every 
% 50 iterations. IF the computer breaks down or the model 
% seems to be good enough, one can break the program and 
% load the last saved estimate. The loadings in TEMP.MAT
% are given in one matrix as described below and can be 
% converted to A, B, C etc. by FAC2LET.M
% 
% All loading vectors except in first mode are normalized, 
% so that all variance is kept in the first mode (as is 
% common in two-way PCA). The components are arranged as
% in PCA. After iterating, the most important component is
% made the first component etc.
%
%
%
% ----------------------INPUT---------------------
%
% X          X is the input array, which can be from three- to N-way (also
%            twoway if the third mode is interpreted as a onedimensional
%            mode). The dimensions of the array is I x J x K x .... R, and X
%            is inputted as an I x JK ....R matrix. For an 10 x 3 x 2 threeway
%            array, X = [X1 X2], where X1 is the 10 x 3 matrix corresponding
%            to k=1, and X2 is an 10 x 3 matrix corresponding to k=2.
%
% DimX 		 A vector of the dimensions of the different orders. For a
%            3 x 10 x 2 x 55 fourway array, DimX would be [3 10 2 55].
%
% Fac		    No of factors/components sought.
%
%
% ----------------OPTIONAL INPUT---------------------
%
% Options    Optional parameters. If not given or set to zero or [], 
%            defaults will be used. If you want Options(5) to be 2 and
%            not change others, simply write Options(5)=2. Even if Options
%            hasn't been defined Options will contain zeros except its
%            fifth element.
%
%            Options(1) - Convergence criterion
%            The relative change in fit for which the algorithm stops.
%            Standard is 1e-6, but difficult data might require a lower value.
%  
%            Options(2) - Initialization method
%            This option is ignored if PARAFAC is started with old values.
%            If no default values are given the default Options(2) is 0.
%            The advantage of using DTLD or SVD for initialization is that
%            they often provide good starting values. However, since the 
%            initial values are then fixed, repeating the fitting will give
%            the exact same solution. Therefore it is not possible to substantiate
%            if a local minimum has been reached. To avoid that use an initialization
%            based on random values (2).
%
%            0  = fit using DTLD/GRAM for initialization (default if three-way and no missing)
%            1  = fit using SVD vectors for initialization (default if higher than three-way or missing)
%            2  = fit using random orthogonalized values for initialization
%            10 = fit using the best-fitting models of several models
%            fitted using a few iterations
%
%            Options(3) - Plotting options
%            2=produces several graphical outputs (loadings shown during iterations)
%            1=as above, but graphics only shown after convergence
%            0=no plots
%
%            Options(4) - Not user-accesible
% 
%            Options(5) - How often to show fit
%            Determines how often the deviation between the model and the data
%            is shown. This is helpful for adjusting the output to the number
%            of iterations. Default is 10. If showfit is set to NaN, almost no
%            outputs are given 
%
%            Options(6) - Maximal number of iterations
%            Maximal number of iterations allowed. Default is 2500.
%
% const      A vector telling type of constraints put on the loadings of the
%            different modes. Same size as DimX but the i'th element tells
%            what constraint is on that mode.
%            0 => no constraint,
%            1 => orthogonality
%            2 => nonnegativity
%            3 => unimodality (and nonnegativitiy)
%            If const is not defined, no constraints are used.
%            For no constraints in a threeway problem const = [0 0 0]
%
% OldLoad    If initial guess of the loadings is available. OldLoad should be
%            given as one matrix as shown below. If the existing estimate is
%            given as a set of matrices (A,B,C...), OldLoad should be given
%            [A(:);B(:);C(:);...].
%
% FixMode    FixMode is a binary vector of same sixe as DimX. If 
%            FixMode(i) = 1 => Mode i is fixed (requires old values given)
%            FixMode(i) = 0 => Mode i is not fixed hence estimated
%            Ex.: FixMode = [0 1 1] find the scores of a data set given the loadings.
%            When some modes are fixed, the numbering of the components will 
%            also be fixed. Normally components are sorted according to variance
%            as in PCA, but this will not be performed if some modes are fixed.
%
% Weights    If a matrix of the same size as X is given, weighted regression
%            is performed using the weights in the matrix Weights.
%
% ---------------------OUTPUT---------------------
%
% Factors    PARAFAC estimate of loadings in one matrix. For a 3 component
%            solution to a 4 x 3 x 3 array the loadings A, B & C will be
%            stored in a 30 x 1 vector as indicated below. This way of saving
%            the loadings have been chosen to simplify the extension to multiway 
%            arrays. Using the optional output stated above, the loadings will 
%            be given separately for each mode.
%
%            Factors = [vecA;vecB; ...] = [A(:);B(:); ...]
%
%            Use FAC2LET.M for converting to "normal" output.
%
% it         Number of iterations used. Can be helpful for checking if the algorithm
%            has converged or simply hit the maximal number of iterations (default 2500).
%
% err        The fit of the model = the sum of squares of errors (not including missing
%            elements).
%
% Corcondia  Core consistency test. Should ideally be 100%. If significantly below
%            100% the model is not valid
%
%
%       Copyright
%       Rasmus Bro 1995/1998
%       Chemometrics Group, Food Technology
%       Department of Dairy- and Food Science
%       Royal Veterinary & Agricultural University
%       Copenhagen
%       Rolighedsvej 30, iii
%       DK-1958 Frederiksberg C
%       Denmark
%	
%       Phone +45 35283296
%       Fax   +45 35283265
%       E-mail rb@kvl.dk
% 
%
%  %%%%%%%%OTHER STUFF%%%%%%%%
%  
%  Missing values
%  Missing values are handled by expectation maximization only. Set all 
%  missing values to NaN
%
%  COMMAND LINE (SHORT)
%
%  Factors = parafac(X,DimX,Fac);
%

% $ Version 1.03 $ Date 1. October   1998 $ Not compiled $ Changed sign-convention because of problems with centered data
% $ Version 1.04 $ Date 18. February 1999 $ Not compiled $ Removed auxiliary line
% $ Version 1.06 $ Date 1. December  1999 $ Not compiled $ Fixed bug in low fit error handling
% $ Version 1.07 $ Date 17. January  2000 $ Not compiled $ Fixed bug in nnls handling so that the algorithm is not stopped until nonnegative appear
% $ Version 1.08 $ Date 21. January  2000 $ Not compiled $ Changed init DTLD so that primarily negative loadings are reflected if possible
% $ Version 1.09 $ Date 30. May 2000 $ Not compiled $ changed name noptioPF to noptiopf


NumbIteraInitia=20;

if nargin==0
   disp(' ')
   disp(' ')
   disp(' THE PARAFAC MODEL')
   disp(' ')
   disp(' Type <<help parafac>> for more info')
   disp('  ')
   disp(' [Factors,it,err,Corcondia] = parafac(X,DimX,Fac,Options,const,OldLoad,FixMode,Weights);')
   disp(' or short')
   disp(' Factors = parafac(X,DimX,Fac);')
   disp(' ')
   disp(' Options=[Crit Init Plot NotUsed ShowFit MaxIt]')
   disp(' ')
   disp(' ')
   disp(' EXAMPLE:')
   disp(' To fit a four-component PARAFAC model to X of size 6 x 2 x 200 x 3 type')
   disp(' Factors=parafac(X,[6 2 200 3],4)')
   disp(' and to obtain the scores and loadings from the output type')
   disp(' [A,B,C,D]=fac2let(Factors,[6 2 200 3],4);')
   break
elseif nargin<3
   error(' The inputs X, DimX, and Fac must be given')
end
if nargin<4
   load noptiopf
   OptionsDefault=Options;
else
   
      % Call the current Options OptionsHere and load default to use if some of the current settings should be default
   Options=Options(:);
   I=length(Options);
   if I==0
      Options=zeros(8,1);
   end
   I=length(Options);
   if I<8
      Options=[Options;zeros(8-I,1)];
   end
   OptionsHere=Options;
   load noptiopf
   OptionsDefault=Options;
   Options=OptionsHere;
end

if ~exist('OldLoad')==1
   OldLoad=0;
elseif length(OldLoad)==0
   OldLoad=0;
end
% Convergence criteriaif Options(1,1)==0
   Options(1,1)=OptionsDefault(1,1);
end
crit=Options(1);

% Initialization
if ~any(Options(2))
   Options(2)=OptionsDefault(2);
end
Init=Options(2);
% Interim plotting
Plt=Options(3,1);
if ~any([0 1 2]==Plt)
   error(' Options(3,1) - Plotting - not set correct; must be 0,1, or 2')
end

if Options(5,1)==0
   Options(5,1)=OptionsDefault(5,1);
end
showfit=Options(5,1);
if isnan(showfit)
   showfit=-1;
end
if showfit<-1|round(showfit)~=showfit
   error(' Options(5,1) - How often to show fit - not set correct; must be positive integer or -1')
end
if Options(6,1)==0
   Options(6,1)=OptionsDefault(6,1);
   maxit=Options(6,1);
elseif Options(6)>0&round(Options(6))==Options(6)
   maxit=Options(6,1);
else
   error(' Options(6,1) - Maximal number of iterations - not set correct; must be positive integer')
end
ShowPhi=0; % Counter. Tuckers congruence coef/Multiple cosine/UUC shown every ShowPhiWhen'th time the fit is shown
ShowPhiWhen=10;
MissConvCrit=1e-4; % Convergence criterion for estimates of missing values
NumberOfInc=0; % Counter for indicating the number of iterations that increased the fit. ALS algorithms ALLWAYS decrease the fit, but using outside knowledge in some sense (approximate equality or iteratively reweighting might cause the algorithm to diverge

% INITIALIZE 
if showfit~=-1
   disp(' ') 
   disp(' PRELIMINARY')
   disp(' ')
end
ord=chkpfdim(X,DimX,Options(5));

if showfit~=-1
   disp([' A ',num2str(Fac),'-component model will be fitted'])
end
if ord==2
   if showfit~=-1
      disp(' ')
      disp(' Error in PARAFAC.M')
      disp(' If a two-way array (a matrix) is to be modelled please express')
      disp(' it as a three-way array with a fixed third mode, i.e., let the')
      disp(' third dimension be one (DimX = [I J 1])')
      break
   end
end

if exist('const')~=1
   const=zeros(size(DimX));
elseif length(const)~=ord
   const=zeros(size(DimX));
   if showfit~=-1
      disp(' Constraints are not given properly')
   end
end

if showfit~=-1
   for i=1:ord
      if const(i)==0
         disp([' No constraints on mode ',num2str(i)])
      elseif const(i)==1
         disp([' Orthogonality on mode ',num2str(i)])
      elseif const(i)==2
         disp([' Nonnegativity on mode ',num2str(i)])
      elseif const(i)==3
         disp([' Unimodality on mode ',num2str(i)])
      end
   end
end

% Check if orthogonality required on all modes
if max(max(const))==1
   if min(min(const))==1,disp(' ')
      disp(' Not possible to orthogonalize all modes in this implementation.')
      error(' Contact the authors for further information')
   end
end
if exist('FixMode')==1
   if length(FixMode)~=ord
      FixMode = zeros(1,ord);
   end
else
   FixMode = zeros(1,ord);
end

if showfit~=-1
   if any(FixMode)
      disp([' The loadings of mode : ',num2str(find(FixMode(:)')),' are fixed']) 
   end
end
if any(FixMode)&length(OldLoad)~=sum(DimX*Fac)
   disp(' You must input loadings that are consistent with the NEW array.') 
   error(' This will be handled more beautiful in the 5.x-compatible toolbox')
end
if exist('Weights')~=1
   Weights=[];
end
% Display convergence criterion
if showfit~=-1
   disp([' The convergence criterion is ',num2str(crit)]) 
end

% Define loading as one ((r1*r2*r3*...*r7)*Fac x 1) vector [A(:);B(:);C(:);...].
% The i'th loading goes from lidx(i,1) to lidx(i,2)
lidx=[1 DimX(1)*Fac];
for i=2:ord
   lidx=[lidx;[lidx(i-1,2)+1 sum(DimX(1:i))*Fac]];
end
% Check if weighted regression required
if size(Weights,1)==size(X,1)&size(Weights,2)==size(X,2)
   if showfit~=-1
      disp(' Given weights will be used for weighted regression')
   end
   DoWeight=1;
else
   if showfit~=-1
      disp(' No weights given')
   end   DoWeight=0;
end
% Make idx matrices if missing values
if any(isnan(X(:)))
   MissMeth=1;
else
   MissMeth=0;
end
if MissMeth
   id=sparse(find(isnan(X)));
   id2=sparse(find(~isnan(X)));
   if showfit~=-1
      disp([' ', num2str(100*(length(id)/prod(DimX))),'% missing values']);
      disp(' Expectation maximization will be used for handling missing values')
   end
   SSX=sum(sum(X(id2).^2)); % To be used for evaluating the %var explained
   % If weighting to zero should be used
   % Replace missing with mean values or model estimates initially
   if length(OldLoad)==sum(DimX)*Fac
      model=nmodel(OldLoad(:),DimX,Fac);
      X(id)=model(id);
   else
      meanX=mean(X(find(~isnan(X))));
      meanX=mean(meanX);
      X(id)=meanX*ones(size(id));
   end
else
   if showfit~=-1
      disp(' No missing values')
   end
   SSX=sum(sum(X.^2)); % To be used for evaluating the %var explained
end

% Check if weighting is tried used together with unimodality or orthogonality
if any(const==3)|any(const==1)
   if DoWeight==1
      disp(' ')
      disp(' Weighting is not possible together with unimodality and orthogonality.')
      disp(' It can be done using majorization, but has not been implemented here')
      disp(' Please contact the authors for further information')
      error
   end
end
% Acceleration
acc=-5;     
do_acc=1;   % Do acceleration every do_acc'th time
acc_pow=2;  % Extrapolate to the iteration^(1/acc_pow) ahead
acc_fail=0; % Indicate how many times acceleration have failed 
max_fail=4; % Increase acc_pow with one after max_fail failure
if showfit~=-1
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -