echo on
%PLSDEMO Demonstrates PLS and PCR functions
%
% This demonstration illustrates the use of the PLS and
% PCR functions in the PLS_Toolbox.
echo off
% Copyright
% Barry M. Wise
% 1992
% Modified June 1994
echo on
% The data we are going to work with is from a Liquid-Fed
% Ceramic Melter (LFCM). We will develop a correlation between
% temperatures in the molten glass tank and the tank level.
% Let's start by loading and plotting the data. Hit a key when
% you are ready.
pause
echo off
load plsdata
subplot(2,1,1)
plot(xblock1);
title('X-block Data (Predictor Variables) for PLS Demo');
xlabel('Sample Number');
ylabel('Temperature (C)');
subplot(2,1,2)
plot(yblock1)
title('Y-Block Data (Predicted Variable) for PLS Demo');
xlabel('Sample Number');
ylabel('Level (Inches)');
echo on
% You can probably already see that there is a very regular
% variation in the temperature data and that it appears to
% correlate with the level data. This is because there is a
% steep temperature gradient in the molten glass, and when
% the level changes, glass at different temperatures passes
% by the location of the thermocouples.
pause
% Let's use the fact that temperature correlates with
% level to build PLS and PCR models that use temperature
% to predict level. We will start by mean-centering the data.
% Here mean-centering makes sense because all of the variables
% are of the same type, and we have reason to expect that
% the temperatures with the most variance will also be the
% most predictive for level.
pause
[mxblock1,mx] = mncn(xblock1);
[myblock1,my] = mncn(yblock1);
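% (For reference, MNCN here just removes each column's mean and
% returns those means so that new data can be scaled the same
% way later. A plain-MATLAB sketch of the same operation,
% assuming samples are rows and variables are columns:
%
%   mx = mean(xblock1);
%   mxblock1 = xblock1 - ones(size(xblock1,1),1)*mx;
%
% This sketch is only illustrative of what MNCN computes.)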
% Now that the data is scaled we can use the PLS and PCR routines
% to make a calibration. Let's start by using all the data to
% make models and see how much variance they capture. We'll also
% make a model using MLR and compare it to the PLS and PCR models.
pause
[p,q,w,t,u,b,ssqdif] = pls(mxblock1,myblock1,10);
% Separate output names are used for PCR1 so its scores, loadings
% and regression matrix do not overwrite the PLS results above.
[tpcr,ppcr,bpcr10] = pcr1(mxblock1,myblock1,10);
mlrmod = mxblock1\myblock1;
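% (The backslash operator above solves the least-squares problem
% min ||mxblock1*b - myblock1||. When mxblock1'*mxblock1 is
% invertible, this is equivalent to the familiar normal-equations
% form:
%
%   mlrmod = inv(mxblock1'*mxblock1)*mxblock1'*myblock1;
%
% though backslash is the numerically preferable way to compute it.)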
pause
% Take a close look at the variance captured by the PLS and PCR
% models. Notice that for any particular number of LVs or PCs,
% the PLS model always captures just a bit more Y-Block
% (predicted variable) variance, while the PCR model always
% captures just a bit more X-Block (predictor variable)
% variance. This is because the principal components
% decomposition of the X-Block in PCR captures the maximum amount
% of variation that can be explained with linear factors without
% regard to how well they correlate with the Y-Block (in this
% case they do correlate quite well). PLS, on the other hand,
% tries to capture more Y-Block variance as well as describing
% X-Block variance. Thus, PLS always gets more Y-Block variance
% and less X-Block variance than PCR.
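% (As an illustration of the decompose-then-regress idea behind
% PCR, here is a plain-MATLAB sketch using an SVD-based PCA with
% k components; this is only a sketch, not necessarily what PCR1
% does internally:
%
%   [u,s,v] = svd(mxblock1,0);
%   tk = u(:,1:k)*s(1:k,1:k);        % scores on first k PCs
%   bk = v(:,1:k)*(tk\myblock1);     % regression vector mapped
%                                    % back to the original
%                                    % (mean-centered) variables
%
% Prediction is then yhat = mxblock1*bk.)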
pause
% We can also see from the variance captured by the PLS and
% PCR models that 1 latent variable or principal component
% is pretty good and anything after 4 doesn't really add much.
% However, we really need to cross validate to determine the
% optimum number of latent variables and principal components.
% For this we will use the PLSCVBLK and PCRCVBLK functions.
% The reason for using these routines is that the data is
% serially correlated, so we should really split it into
% contiguous blocks. This method decreases the correlation
% between any serially correlated noise in the training and
% test sets.
pause
% Before we use PLSCVBLK and PCRCVBLK we must decide how many
% times to rebuild and test the model. I'll choose 5, which
% splits the 300 calibration samples into contiguous test blocks
% of 60 samples each; it is reasonable to expect that any
% disturbance in this system would have died away after 60
% samples. The maximum number of LVs and PCs is set to 10 since
% it doesn't look like any more than that would be of any use.
% As the functions
% run you will see the PRESS plots for each time the model is
% rebuilt and tested. After all the trials the function finds
% the number of LVs or PCs for minimum PRESS. (Note that the
% user prompt to override the chosen number of LVs or PCs
% has been turned off for this demo.) The function then
% calculates the regression vector with the optimum number
% of LVs or PCs.
pause
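% (PRESS is the prediction error sum of squares accumulated over
% the cross-validation test blocks. For a model with k LVs or
% PCs it is, roughly:
%
%   press(k) = sum( (ytest - yhat_cv).^2 )
%
% summed over all left-out samples, where yhat_cv is predicted
% from a model built without those samples.)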
subplot(1,1,1)
[plsss,cplsss,mlv,bpls] = plscvblk(mxblock1,myblock1,5,10,1);
drawnow
[pcrss,cpcrss,mpc,bpcr] = pcrcvblk(mxblock1,myblock1,5,10,1);
% You may have noticed that for PLS it was determined that
% 5 LVs was optimal, while for PCR 7 PCs was optimal. This
% is typical of PLS relative to PCR. Because PLS finds factors
% that correlate with the predicted variable, it generally
% reaches the minimum in prediction error with fewer factors
% than PCR.
pause
% It is also interesting to look at how the Cumulative PRESS
% plots compare with each other. Note that the PCR model appears
% to have a better PRESS value at the minimum than the PLS
% model.
echo off
plot(1:10,cplsss,'-y',1:10,cplsss,'+y'), hold on
plot(1:10,cpcrss,'-g',1:10,cpcrss,'og'), hold off
title('Comparison of PRESS for PLS (+) and PCR (o) Models')
xlabel('Number of Latent Variables or Principal Components')
ylabel('Model Prediction Error - PRESS')
echo on
% By plotting the regression vectors we can see which variables
% were important in predicting the level. We can also compare
% them to the MLR model.
pause
echo off
plot(1:20,bpls,'-y',1:20,bpls,'+y',[1 20],[0 0],'-r'), hold on
plot(1:20,bpcr,'-g',1:20,bpcr,'og')
plot(1:20,mlrmod,'-c',1:20,mlrmod,'*c'), hold off
title('PLS (+), PCR (o) and MLR (*) Regression Vector Coefficients For Level Prediction');
xlabel('Variable Number');
ylabel('Coefficient');
pause
echo on
% Notice how the MLR model is more "spikey". This "ringing"
% in the coefficients is typical of models identified with
% MLR when there is a great deal of correlation structure in
% the data, as we have here.
% Now let's use the regression vectors to calculate the fitted
% level for the training (calibration) data and compare it to
% the actual level.
pause
ypls = mxblock1*bpls;
ypcr = mxblock1*bpcr;
ymlr = mxblock1*mlrmod;
sypls = rescale(ypls,my);
sypcr = rescale(ypcr,my);
symlr = rescale(ymlr,my);
echo off
s = 1:300;
plot(s,sypls,'-y',s,sypls,'+y'), hold on
plot(s,sypcr,'-g',s,sypcr,'og')
plot(s,symlr,'-c',s,symlr,'*c')
plot(s,yblock1,'-r',s,yblock1,'xr'), hold off
title('Actual (x) and Fitted Level by PLS (+), PCR (o) and MLR')
xlabel('Sample Number');
ylabel('Level (Inches)');
pause
echo on
% This looks pretty good, but let's try the models on a new data
% set to see how well they predict. We start by scaling the
% new data using the same factors we used to scale the original
% data.
pause
sxblock2 = scale(xblock2,mx);
syblock2 = scale(yblock2,my);
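% (SCALE applies the calibration means to the new data, and
% RESCALE undoes the centering on the predictions. A plain-MATLAB
% sketch of both, assuming mean-centering only:
%
%   sxblock2 = xblock2 - ones(size(xblock2,1),1)*mx;
%   sypls    = newypls + my;    % my is the (scalar) mean level
%
% Note that syblock2 is computed for completeness, but the
% comparisons below use the raw yblock2 directly.)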
% Now we just multiply the new xblock by the regression vectors
% to get the new prediction. After rescaling we can compare the
% predicted and actual data.
pause
newypls = sxblock2*bpls;
newypcr = sxblock2*bpcr;
newymlr = sxblock2*mlrmod;
sypls = rescale(newypls,my);
sypcr = rescale(newypcr,my);
symlr = rescale(newymlr,my);
echo off
s = 1:200;
plot(s,sypls,'-y',s,sypls,'+y'), hold on
plot(s,sypcr,'-g',s,sypcr,'og')
plot(s,symlr,'-c',s,symlr,'*c')
plot(s,yblock2,'-r',s,yblock2,'xr'), hold off
title('Actual (x) and Predicted Level by PLS (+), PCR (o) and MLR')
xlabel('Sample Number');
ylabel('Level (Inches)');
pause
echo on
% We can also calculate the total sum of squared prediction error
% for the PLS, PCR and MLR models as follows:
echo off
plsssq = sum((yblock2-sypls).^2);
pcrssq = sum((yblock2-sypcr).^2);
mlrssq = sum((yblock2-symlr).^2);
disp(' PLS error PCR error MLR error'),
disp([plsssq pcrssq mlrssq])
echo on
% So here we see that the PLS and PCR models are slightly
% better than the MLR model, as expected.