📄 parafac.m
字号:
function [Factors,it,err,corcondia]=parafac(X,Fac,Options,const,OldLoad,FixMode,Weights);
% PARAFAC multiway parafac model
%
% See also:
% 'npls' 'tucker' 'dtld' 'gram'
%
%
% ___________________________________________________
%
% THE PARAFAC MODEL
% ___________________________________________________
%
% [Factors,it,err,corcondia] = parafac(X,Fac,Options,const,OldLoad,FixMode,Weights);
%
% or skipping optional in/outputs
%
% Factors = parafac(X,Fac);
%
% Algorithm for computing an N-way PARAFAC model. Optionally
% constraints can be put on individual modes for obtaining
% orthogonal, nonnegative, or unimodal solutions. The algorithm
% also handles missing data. For details of PARAFAC
% modeling see R. Bro, Chemom. Intell. Lab. Syst., 1997.
%
% Several possibilities exist for speeding up the algorithm.
% Compressing has been incorporated, so that large arrays can be
% compressed by using Tucker (see Bro & Andersson, Chemom.
% Intell. Lab. Syst., 1998).
% Another acceleration method incorporated here is to
% extrapolate the individual loading elements a number of
% iterations ahead after a specified number of iterations.
%
% A temporary MAT-file called TEMP.mat is saved for every
% 50 iterations. IF the computer breaks down or the model
% seems to be good enough, one can break the program and
% load the last saved estimate. The loadings in TEMP.MAT
% are given a cell array as described below and can be
% converted to A, B, C etc. by FAC2LET.M typing
% [A,B,C]=fac2let(Factors,size(X));
%
% All loading vectors except in first mode are normalized,
% so that all variance is kept in the first mode (as is
% common in two-way PCA). The components are arranged as
% in PCA. After iterating, the most important component is
% made the first component etc.
%
%
%
% ----------------------INPUT---------------------
%
% X X is the input array, which can be from three- to N-way (also
% twoway if the third mode is interpreted as a onedimensional
% mode).
%
% Fac No of factors/components sought.
%
%
% ----------------OPTIONAL INPUT---------------------
%
% Options Optional parameters. If not given or set to zero or [],
% defaults will be used. If you want Options(5) to be 2 and
% not change others, simply write Options(5)=2. Even if Options
% hasn't been defined Options will contain zeros except its
% fifth element.
%
% Options(1) - Convergence criterion
% The relative change in fit for which the algorithm stops.
% Standard is 1e-6, but difficult data might require a lower value.
%
% Options(2) - Initialization method
% This option is ignored if PARAFAC is started with old values.
% If no default values are given the default Options(2) is 0.
% The advantage of using DTLD or SVD for initialization is that
% they often provide good starting values. However, since the
% initial values are then fixed, repeating the fitting will give
% the exact same solution. Therefore it is not possible to substantiate
% if a local minimum has been reached. To avoid that use an initialization
% based on random values (2).
%
% 0 = fit using DTLD/GRAM for initialization (default if
% three-way and no missing and if sizes are
% largere than number of factors at least
% in two modes)
% 1 = fit using SVD vectors for initialization (default if higher than three-way or missing)
% 2 = fit using random orthogonalized values for initialization
% 10 = fit using the best-fitting models of several models
% fitted using a few iterations
%
% Options(3) - Plotting options
% 0 = no plots
% 1 = produces several graphical outputs
% 2 = produces several graphical outputs (loadings also shown during iterations)
% 3 = as 2 but no core consistency check (very slow for large arrays and/or many components)
%
% Options(4) - Scaling
% 0 or 1 = default scaling (columns in mode one carry the variance)
% 2 = no scaling applied (hence fixed elements will not be modified
%
% Options(5) - How often to show fit
% Determines how often the deviation between the model and the data
% is shown. This is helpful for adjusting the output to the number
% of iterations. Default is 10. If showfit is set to NaN, almost no
% outputs are given
%
% Options(6) - Maximal number of iterations
% Maximal number of iterations allowed. Default is 2500.
%
% const A vector telling type of constraints put on the loadings of the
% different modes. Same size as DimX but the i'th element tells
% what constraint is on that mode.
% 0 => no constraint,
% 1 => orthogonality
% 2 => nonnegativity
% 3 => unimodality (and nonnegativitiy)
% If const is not defined, no constraints are used.
% For no constraints in a threeway problem const = [0 0 0]
%
% OldLoad If initial guess of the loadings is available. OldLoad should be
% given a cell array where OldLoad{1}=A,OldLoad{2}=B etc.
%
% FixMode FixMode is a binary vector of same sixe as DimX. If
% FixMode(i) = 1 => Mode i is fixed (requires old values given)
% FixMode(i) = 0 => Mode i is not fixed hence estimated
% Ex.: FixMode = [0 1 1] find the scores of a data set given the loadings.
% When some modes are fixed, the numbering of the components will
% also be fixed. Normally components are sorted according to variance
% as in PCA, but this will not be performed if some modes are fixed.
%
% Weights If a matrix of the same size as X is given, weighted regression
% is performed using the weights in the matrix Weights. Statistically
% the weights will usually contain the inverse error standard
% deviation of the particular element
%
% ---------------------OUTPUT---------------------
%
% Factors PARAFAC estimate of loadings in one matrix. For a 3 component
% solution to a 4 x 3 x 3 array the loadings A, B & C will be
% stored in a 3 element cell vector:
% Factors{1}=A,
% Factors{2}=B
% Factors{3}=C
% etc.
%
% Use FAC2LET.M for converting to "normal" output or simply extract the
% components as e.g. A = Factors{1};
%
% it Number of iterations used. Can be helpful for checking if the algorithm
% has converged or simply hit the maximal number of iterations (default 2500).
%
% err The fit of the model = the sum of squares of errors (not including missing
% elements).
%
% Corcondia Core consistency test. Should ideally be 100%. If significantly below
% 100% the model is not valid
%
%
%
% OTHER STUFF
%
% Missing values are handled by expectation maximization only. Set all
% missing values to NaN
%
% COMMAND LINE (SHORT)
%
% Factors = parafac(X,Fac);
%
% Copyright, 1998 -
% This M-file and the code in it belongs to the holder of the
% copyrights and is made public under the following constraints:
% It must not be changed or modified and code cannot be added.
% The file must be regarded as read-only. Furthermore, the
% code can not be made part of anything but the 'N-way Toolbox'.
% In case of doubt, contact the holder of the copyrights.
%
% Rasmus Bro
% Chemometrics Group, Food Technology
% Department of Food and Dairy Science
% Royal Veterinary and Agricultutal University
% Rolighedsvej 30, DK-1958 Frederiksberg, Denmark
% Phone +45 35283296
% Fax +45 35283245
% E-mail rb@kvl.dk
% $ Version 1.03 $ Date 1. October 1998 $ Not compiled $ Changed sign-convention because of problems with centered data
% $ Version 1.04 $ Date 18. February 1999 $ Not compiled $ Removed auxiliary line
% $ Version 1.06 $ Date 1. December 1999 $ Not compiled $ Fixed bug in low fit error handling
% $ Version 1.07 $ Date 17. January 2000 $ Not compiled $ Fixed bug in nnls handling so that the algorithm is not stopped until nonnegative appear
% $ Version 1.08 $ Date 21. January 2000 $ Not compiled $ Changed init DTLD so that primarily negative loadings are reflected if possible
% $ Version 1.09 $ Date 30. May 2000 $ Not compiled $ changed name noptioPF to noptiopf
% $ Version 2.00 $ May 2001 $ Changed to array notation $ RB $ Not compiled $
% $ Version 2.001 $ June 2001 $ Fixed error in weighted regression $ RB $ Not compiled $
% $ Version 2.002 $ Jan 2002 $ Fixed scaling problem due to non-identifiability of DTLD(QZ) by scaling and normalizing after each iteration $ RB $ Not compiled $
% $ Version 2.003 $ Jan 2002 $ Fixed negative solutions when nonneg imposed $ RB $ Not compiled $
% $ Version 2.004 $ Jan 2002 $ Changed initialization when many components used $ RB $ Not compiled $
% $ Version 2.005 $ Jan 2002 $ Changed absolute fit criterion (approacing eps) into relative sse/ssx$ RB $ Not compiled $
% $ Version 2.006 $ Jan 2002 $ Fixed post-scaling when fixed loadings $ RB $ Not compiled $
% $ Version 2.01 $ Jan 2003 $ Removed corcondia for two-way data (doesn't work) and fixed a bug for data with dimension 2 $ RB $ Not compiled $
% $ Version 2.011 $ feb 2003 $ Added an option (4) for not post scaling components $ RB $ Not compiled $
% $ Version 2.10 $ jan 2004 $ Fixed a plotting error occuring when fitting model to old data $ RB $ Not compiled $
% $ Version 2.11 $ jan 2004 $ Fixed that PCA can be fitted $ RB $ Not compiled $
% $ Version 2.12 $ Jul 2004 $ Fixed initialization bug $ RB $ Not compiled $
% $ Version 2.13 $ Jan 2005 $ Modified sign conventions of scores and loads $ RB $ Not compiled $
% $ Version 2.14 $ Feb 2006 $ Fixed bug in sign-swicth when loadings are fixed $ RB $ Not compiled $
NumbIteraInitia=20;
if nargin==0
disp(' ')
disp(' ')
disp(' THE PARAFAC MODEL')
disp(' ')
disp(' Type <<help parafac>> for more info')
disp(' ')
disp(' [Factors,it,err,Corcondia] = parafac(X,Fac,Options,const,OldLoad,FixMode,Weights);')
disp(' or short')
disp(' Factors = parafac(X,Fac);')
disp(' ')
disp(' Options=[Crit Init Plot NotUsed ShowFit MaxIt]')
disp(' ')
disp(' ')
disp(' EXAMPLE:')
disp(' To fit a four-component PARAFAC model to X of size 6 x 2 x 200 x 3 type')
disp(' Factors=parafac(X,4)')
disp(' and to obtain the scores and loadings from the output type')
disp(' [A,B,C,D]=fac2let(Factors);')
return
elseif nargin<2
error(' The inputs X, and Fac must be given')
end
DimX = size(X);
X = reshape(X,DimX(1),prod(DimX(2:end)));
nonneg_obeyed = 1; % used to check if noneg is ok
if nargin<3
load noptiopf
OptionsDefault=Options;
else
% Call the current Options OptionsHere and load default to use if some of the current settings should be default
Options=Options(:);
I=length(Options);
if I==0
Options=zeros(8,1);
end
I=length(Options);
if I<8
Options=[Options;zeros(8-I,1)];
end
OptionsHere=Options;
load noptiopf
OptionsDefault=Options;
Options=OptionsHere;
end
if ~exist('OldLoad')==1
OldLoad=0;
elseif length(OldLoad)==0
OldLoad=0;
end
% Convergence criteria
if Options(1,1)==0
Options(1,1)=OptionsDefault(1,1);
end
crit=Options(1);
% Initialization
if ~any(Options(2))
Options(2)=OptionsDefault(2);
end
Init=Options(2);
% Interim plotting
Plt=Options(3,1);
if ~any([0 1 2 3]==Plt)
error(' Options(3,1) - Plotting - not set correct; must be 0,1,2 or 3')
end
if Options(5,1)==0
Options(5,1)=OptionsDefault(5,1);
end
showfit=Options(5,1);
if isnan(showfit)
showfit=-1;
end
if showfit<-1|round(showfit)~=showfit
error(' Options(5,1) - How often to show fit - not set correct; must be positive integer or -1')
end
if Options(6,1)==0
Options(6,1)=OptionsDefault(6,1);
maxit=Options(6,1);
elseif Options(6)>0&round(Options(6))==Options(6)
maxit=Options(6,1);
else
error(' Options(6,1) - Maximal number of iterations - not set correct; must be positive integer')
end
ShowPhi=0; % Counter. Tuckers congruence coef/Multiple cosine/UUC shown every ShowPhiWhen'th time the fit is shown
ShowPhiWhen=10;
MissConvCrit=1e-4; % Convergence criterion for estimates of missing values
NumberOfInc=0; % Counter for indicating the number of iterations that increased the fit. ALS algorithms ALLWAYS decrease the fit, but using outside knowledge in some sense (approximate equality or iteratively reweighting might cause the algorithm to diverge
% INITIALIZE
if showfit~=-1
disp(' ')
disp(' PRELIMINARY')
disp(' ')
end
ord=length(DimX);
if showfit~=-1
disp([' A ',num2str(Fac),'-component model will be fitted'])
end
if exist('const')~=1
const=zeros(size(DimX));
elseif length(const)~=ord
if length(DimX)==2 & length(const)==3
const = const(1:2);
else
const=zeros(size(DimX));
if showfit~=-1
disp(' Constraints are not given properly')
end
end
end
if showfit~=-1
for i=1:ord
if const(i)==0
disp([' No constraints on mode ',num2str(i)])
elseif const(i)==1
disp([' Orthogonality on mode ',num2str(i)])
elseif const(i)==2
disp([' Nonnegativity on mode ',num2str(i)])
elseif const(i)==3
disp([' Unimodality on mode ',num2str(i)])
end
end
end
% Check if orthogonality required on all modes
DoingPCA= 0;
if max(max(const))==1
if min(min(const))==1,
if length(DimX)>2
disp(' ')
disp(' Not possible to orthogonalize all modes in this implementation.')
error(' Contact the authors for further information')
else
const = [1 0]; % It's ok for PCA but do in one mode to get LS and then orthogonalize afterwards
DoingPCA = 1;
end
end
end
if exist('FixMode')==1
if length(FixMode)~=ord
FixMode = zeros(1,ord);
end
else
FixMode = zeros(1,ord);
end
if showfit~=-1
if any(FixMode)
disp([' The loadings of mode : ',num2str(find(FixMode(:)')),' are fixed'])
end
end
if exist('Weights')~=1
Weights=[];
end
% Display convergence criterion
if showfit~=-1
disp([' The convergence criterion is ',num2str(crit)])
end
% Define loading as one ((r1*r2*r3*...*r7)*Fac x 1) vector [A(:);B(:);C(:);...].
% The i'th loading goes from lidx(i,1) to lidx(i,2)
lidx=[1 DimX(1)*Fac];
for i=2:ord
lidx=[lidx;[lidx(i-1,2)+1 sum(DimX(1:i))*Fac]];
end
% Check if weighted regression required
if size(Weights,1)==size(X,1)&prod(size(Weights))/size(X,1)==size(X,2)
Weights = reshape(Weights,size(Weights,1),prod(size(Weights))/size(X,1));
if showfit~=-1
disp(' Given weights will be used for weighted regression')
end
DoWeight=1;
else
if showfit~=-1
disp(' No weights given')
end
DoWeight=0;
end
% Make idx matrices if missing values
if any(isnan(X(:)))
MissMeth=1;
else
MissMeth=0;
end
if MissMeth
id=sparse(find(isnan(X)));
idmiss2=sparse(find(~isnan(X)));
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -