📄 parafac2.m
字号:
function [A,H,C,P,fit,AddiOutput]=parafac2(X,F,Constraints,Options,A,H,C,P);
% $ Version 1.01 $ Date 28. December 1998 $ Not compiled $ RB
% $ Version 1.02 $ Date 31. March 1999 $ Added X-validation and added function $ Not compiled $ RB
% $ Version 1.03 $ Date 20. April 1999 $ Cosmetic changes $ Not compiled $ RB
% $ Version 1.04 $ Date 25. April 1999 $ Cosmetic changes $ Not compiled $ RB
% $ Version 1.05 $ Date 18. May 1999 $ Added orthogonality constraints $ Not compiled $ RB
% $ Version 1.06 $ Date 14. September1999 $ Changed helpfile $ Not compiled $ RB
% $ Version 1.07 $ Date 20. October 1999 $ Added unimodality $ Not compiled $ RB
% $ Version 1.08 $ Date 27. March 2000 $ Optimized handling of missing dat $ Not compiled $ RB
%
% This M-file and the code in it belongs to the holder of the
% copyrights and is made public under the following constraints:
% It must not be changed or modified and code cannot be added.
% The file must be regarded as read-only. Furthermore, the
% code can not be made part of any toolbox or similar.
% In case of doubt, contact the holder of the copyrights.
%
% Rasmus Bro
% Chemometrics Group, Food Technology
% Department of Food and Dairy Science
% Royal Veterinary and Agricultutal University
% Rolighedsvej 30, DK-1958 Frederiksberg, Denmark
% Phone +45 35283296
% Fax +45 35283245
% E-mail rb@kvl.dk
%
% ___________________________________________________
%
% THE PARAFAC2 MODEL
% ___________________________________________________
%
%
% Algorithm to fit the PARAFAC2 model which is an advanced variant of the
% normal PARAFAC1 model. It handles slab-wise deviations between components
% in one mode as long as the cross-product of the components stays
% reasonably fixed. This can be utilized for modeling chromatographic
% data with retention time shifts, modeling certain batch data of
% varying length etc. See Bro, Kiers & Andersson, Journal of Chemometrics,
% 1999, 13, 295-309 for details on application and Kiers, ten Berge &
% Bro, Journal of Chemometrics, 1999, 13, 275-294, for details on the algorithm
%
%
% The PARAFAC2 model is given
%
% Xk = A*Dk*(Pk*H)' + Ek, k = 1, .., K
%
% Xk is a slab of data (I x J) in which J may actually vary with K. K
% is the number of slabs. A (I x F) are the scores or first-mode loadings. Dk
% is a diagonal matrix that holds the k'th row of C in its diagonal. C
% (K x F) is the third mode loadings, H is an F x F matrix, and Pk is a
% J x F orthogonal matrix (J may actually vary from k to k. The output here
% is given as a cell array of size J x F x K. Thus, to get e.g. the second P
% write P(:,:,2), and to get the estimate of the second mode loadings at this
% second frontal slab (k = 2), write P(:,:,2)*H. The matrix Ek holds the residuals.
%
% INPUT
%
% X
% Holds the data.
% If all slabs have similar size, X is an array:
% X(:,:,1) = X1; X(:,:,2) = X2; etc.
% If the slabs have different size X is a cell array (type <<help cell>>)
% X{1} = X1; X{2} = X2; etc.
% If you have your data in an 'unfolded' two-way array of size
% I x JK (the three-way array is I x J x K), then simply type
% X = reshape(X,[I J K]); to convert it to an array.
%
% F
% The number of components to extract
%
% Constraints
% Vector of length 2. The first element defines constraints
% imposed in the first mode, the second defines contraints in
% third mode (the second mode is not included because constraints
% are not easily imposed in this mode)
%
% If Constraints = [a b], the following holds. If
% a = 0 => no constraints in the first mode
% a = 1 => nonnegativity in the first mode
% a = 2 => orthogonality in the first mode
% a = 3 => unimodality (and nonnegativity) in the first mode
% same holds for b for the third mode
%
% Options
% An optional vector of length 3
% Options(1) Convergence criterion
% 1e-7 if not given or given as zero
% Options(2) Maximal iterations
% default 2000 if not given or given as zero
% Options(3) Initialization method
% A rather slow initialization method is used per default
% but it pays to investigate in avoiding local minima.
% Experience may point to faster methods (set Options(3)
% to 1 or 2). You can also change the number of refits etc.
% in the beginning of the m-file
% 0 => best of 10 runs of maximally 80 iterations (default)
% 1 => based on SVD
% 2 => random numbers
% Options(4) Cross-validation
% 0 => no cross-validation
% 1 => cross-validation splitting in 7 segments
% Options(5) show output
% 0 => show standard output on screen
% 1 => hide all output to screen
%
% AUXILIARY
% - Missing elements: Use NaN for missing elements
% - You can input initial values by using the input argument
% (X,F,Constraints,Options,A,H,C,P);
%
% OUTPUT
% See right above INPUT
%
% I/O
%
% Demo
% parafac2('demo')
%
% Short
% [A,H,C,P]=parafac2(X,F);
%
% Long
% [A,H,C,P,fit]=parafac2(X,F,Constraints,Options);
%
% Copyright
% Rasmus Bro
% KVL, DK, 1998
% rb@kvl.dk
%
% Reference to algorithm
% Bro, Kiers & Andersson, PARAFAC2 - Part II. Modeling chromatographic
% data with retention time shifts, Journal of Chemometrics, 1999, 13, 295-309
% TO DO:
% Set the algorithm to handle fixed modes as in PARALIN
% Make it N-way
% Incorporate ulsr
if nargin==0
disp(' ')
disp(' ')
disp(' THE PARAFAC2 MODEL')
disp(' ')
disp(' Type <<help parafac2>> for more info')
disp(' ')
disp(' I/O ')
disp(' [A,H,C,P]=parafac2(X,F);')
disp(' ')
disp(' Or optionally')
disp(' ')
disp(' [A,H,C,P,fit]=parafac2(X,F,Constraints,Options);')
disp(' ')
disp(' Options=[Crit MaxIt Init Xval Show]')
disp(' ')
disp(' ')
break
elseif nargin<2&~all(X=='demo')
error(' The inputs X and F must be given')
end
if isstr(X) & all(X=='demo')
F=3;
n=1:30;
disp(' ')
disp(' %%%%% PARAFAC2 DEMO %%%%%%')
disp(' ')
disp(' Generating simulated data')
disp(' Note that the third mode loadings change from slab to slab')
disp(' hence the ordinary PARAFAC model is not valid')
disp(' ')
subplot(2,2,1)
A=[exp(-((n-15)/5).^2);exp(-((n-1)/10).^2);exp(-((n-21)/7).^2)]';
plot(A),title(' First mode loadings')
subplot(2,2,2)
C=rand(4,3);
plot(C),title(' Third mode loadings')
H=orth(orth(rand(F))');
P=[];X=[];
for i=1:size(C,1),
subplot(2,4,4+i)
P(:,:,i)=orth(rand(7,F));
plot(P(:,:,i)*H),eval(['title([''2. mode, k = '',num2str(i)])'])
end,
disp(' Press key to continue'),pause
for i=1:size(C,1),
X(:,:,i)=A*diag(C(i,:))*(P(:,:,i)*H)';
end,
X = X + randn(size(X))*.01;
disp(' Adding one percent noise and fitting model')
disp(' Several initial models will be fitted and the best used')
[a,h,c,p]=parafac2(X,F);
disp(' ')
disp(' Results shown in plot')
subplot(2,2,1)
plot(A*diag(sum(A).^(-1)),'r'),
hold on,
plot(a*diag(sum(a).^(-1)),'g'),title(' First mode (red true,green estimated)')
hold off
subplot(2,2,2)
plot(C*diag(sum(C).^(-1)),'r')
hold on,
plot(c*diag(sum(c).^(-1)),'g'),title(' Third mode (red true,green estimated)')
hold off
for i=1:size(C,1),
subplot(2,4,4+i)
ph=P(:,:,i)*H;
plot(ph*diag(sum(ph).^(-1)),'r'),
hold on
ph=p{i}*h;
plot(ph*diag(sum(ph).^(-1)),'g'),
eval(['title([''2. mode, k = '',num2str(i)])'])
hold off
end,
break
end
ShowFit = 100; % Show fit every 'ShowFit' iteration
NumRep = 10; %Number of repetead initial analyses
NumItInRep = 80; % Number of iterations in each initial fit
if ~(length(size(X))==3|iscell(X))
error(' X must be a three-way array or a cell array')
end
%set random number generators
randn('state',sum(100*clock));
rand('state',sum(100*clock));
if nargin < 4
Options = zeros(1,5);
end
if length(Options)<5
Options = Options(:);
Options = [Options;zeros(5-length(Options),1)];
end
% Convergence criterion
if Options(1)==0
ConvCrit = 1e-7;
else
ConvCrit = Options(1);
end
if Options(5)==0
disp(' ')
disp(' ')
disp([' Convergence criterion : ',num2str(ConvCrit)])
end
% Maximal number of iterations
if Options(2)==0
MaxIt = 2000;
else
MaxIt = Options(2);
end
% Initialization method
initi = Options(3);
if nargin<3
Constraints = [0 0];
end
if length(Constraints)~=2
Constraints = [0 0];
disp(' Length of Constraints must be two. It has been set to zeros')
end
% Modify to handle GPA (Constraints = [10 10]);
if Constraints(2)==10
Constraints(1)=0;
ConstB = 10;
else
ConstB = 0;
end
ConstraintOptions=[ ...
'Fixed ';...
'Unconstrained ';...
'Non-negativity constrained';...
'Orthogonality constrained ';...
'Unimodality constrained ';...
'Not defined ';...
'Not defined ';...
'Not defined ';...
'Not defined ';...
'Not defined ';...
'Not defined ';...
'GPA '];
if Options(5)==0
disp([' Maximal number of iterations : ',num2str(MaxIt)])
disp([' Number of factors : ',num2str(F)])
disp([' Loading 1. mode, A : ',ConstraintOptions(Constraints(1)+2,:)])
disp([' Loading 3. mode, C : ',ConstraintOptions(Constraints(2)+2,:)])
disp(' ')
end
% Make X a cell array if it isn't
if ~iscell(X)
for k = 1:size(X,3)
x{k} = X(:,:,k);
end
X = x;
clear x
end
I = size(X{1},1);
K = max(size(X));
% CROSS-VALIDATION
if Options(4)==1
Opt = Options;
Opt(4) = 0;
splits = 7;
while rem(I,splits)==0 % Change the number of segments if 7 is a divisor in prod(size(X))
splits = splits + 2;
end
AddiOutput.NumberOfSegments = splits;
if Options(5)==0
disp(' ')
disp([' Cross-validation will be performed using ',num2str(splits),' segments'])
disp([' and using from 1 to ',num2str(F),' components'])
XvalModel = [];
end
SS = [];
for f = 1:F
Arep = [];Hrep = [];Crep = [];clear Prep;
for s = 1:splits
Xmiss = X;
for k = 1:K
Xmiss{k}(s:splits:end)=NaN;
end
[a,h,c,p]=parafac2(Xmiss,f,Constraints,Opt);
Arep(:,:,s)=a;Hrep(:,:,s)=h;Crep(:,:,s)=c;Prep(s,:)=p;
M=[];
for k = 1:K
m = a*diag(c(k,:))*(p{k}*h)';
M{k} = m;
end
XvalModel{f} = M;
end
AddiOutput.XvalModels=XvalModel;
ss = 0;
for k = 1:K
x = X{k};m = M{k};
ss = ss + sum((x(s:splits:end)-m(s:splits:end)).^2);
end
SS = [SS ss];
AddiOutput.SS = SS;
AddiOutput.A_xval{f}=Arep;
AddiOutput.H_xval{f}=Hrep;
AddiOutput.C_xval{f}=Crep;
AddiOutput.P_xval{f}=Prep;
end
clf
plot([1:F],SS),title(' Residual sum-squares - cross-validation')
xlabel('Number of components')
disp(' ')
disp(' The total model has NOT been fitted.')
disp(' You must refit the model with the number of ')
disp(' components you judge necessary.')
disp(' ')
disp(' You can also check the outputted struct array')
disp(' It contains loadings estimated from different')
disp(' subsets and stability of subsets indicates validity.')
disp(' (e.g. if name of struct array is Output then the file')
disp(' AX=Output.A_xval{3}is a three-way array holding all A')
disp(' loadings estimated with 3 components. AX(:,:,1) is the ')
disp(' estimate of A obtained from the first subset etc.')
[a,b]=min(SS);
figure
subplot(2,1,1)
a=AddiOutput.A_xval{b};
for i=2:splits
plot(a(:,:,i),'r'),hold on
end
title([' A resampled during Xval for ',num2str(b),' comp.'])
hold off
subplot(2,1,2)
c = AddiOutput.C_xval{b};
for i=2:splits
plot(c(:,:,i),'r'),hold on
end
title([' C resampled during Xval for ',num2str(b),' comp.'])
hold off
break
end
% Find missing and replace with average
MissingElements = 0;
MissNum=0;AllNum=0;
for k = 1:K
x=X{k};
miss = sparse(isnan(x));
MissingOnes{k} = miss;
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -