# Principal Component Analysis and Partial Least Squares

Principal Component Analysis (PCA) and Partial Least Squares (PLS) are widely used tools. This code shows their relationship through the Nonlinear Iterative PArtial Least Squares (NIPALS) algorithm.

## Contents

- The Eigenvalue Problem and the Power Method
- The NIPALS Algorithm for PCA
- The NIPALS Algorithm for PLS
- Example: Discriminant PLS using the NIPALS Algorithm

## The Eigenvalue Problem and the Power Method

The NIPALS algorithm can be derived from the power method for solving the eigenvalue problem. Let x be an eigenvector of a square matrix A corresponding to the eigenvalue s:

$$Ax = sx$$

Multiplying both sides by A repeatedly, k times in total, leads to

$$A^k x = s^k x$$

Now consider another vector y, which can be represented as a linear combination of all n eigenvectors:

$$y = \sum_{i=1}^{n} b_i x_i = Xb$$

where

$$X = [x_1, x_2, \ldots, x_n]$$

and

$$b = [b_1, b_2, \ldots, b_n]^T$$

Multiplying y by A, k times, gives

$$A^k y = A^k X b = X S^k b$$

where S is a diagonal matrix consisting of all eigenvalues. Therefore, if the eigenvalues are ordered so that |s_1| is the largest, then for a large enough k,

$$A^k y \approx b_1 s_1^k x_1$$

That is, the iteration converges to the direction of x_1, the eigenvector corresponding to the eigenvalue with the largest modulus. This leads to the following power method for solving the eigenvalue problem.
</p><pre class="codeinput">A=randn(10,5);<span class="comment">% sysmetric matrix to ensure real eigenvalues</span>B=A'*A;<span class="comment">%find the column which has the maximum norm</span>[dum,idx]=max(sum(A.*A));x=A(:,idx);<span class="comment">%storage to judge convergence</span>x0=x-x;<span class="comment">%convergence tolerant</span>tol=1e-6;<span class="comment">%iteration if not converged</span><span class="keyword">while</span> norm(x-x0)>tol <span class="comment">%iteration to approach the eigenvector direction</span> y=A'*x; <span class="comment">%normalize the vector</span> y=y/norm(y); <span class="comment">%save previous x</span> x0=x; <span class="comment">%x is a product of eigenvalue and eigenvector</span> x=A*y;<span class="keyword">end</span><span class="comment">% the largest eigen value corresponding eigenvector is y</span>s=x'*x;<span class="comment">% compare it with those obtained with eig</span>[V,D]=eig(B);[d,idx]=max(diag(D));v=V(:,idx);disp(d-s)<span class="comment">% v and y may be different in signs</span>disp(min(norm(v-y),norm(v+y)))</pre><pre class="codeoutput"> 2.7960e-012 6.0941e-007</pre><h2>The NIPALS Algorithm for PCA<a name="2"></a></h2> <p>The PCA is a dimension reduction technique, which is based on the following decomposition:</p> <p><img vspace="5" hspace="5" src="learningpcapls_eq1475.png"> </p> <p>Where X is the data matrix (m x n) to be analysed, T is the so called score matrix (m x a), P the loading matrix (n x a) and E the residual. For a given tolerance of residual, the number of principal components, a, can be much smaller than the orginal variable dimension, n. The above power algorithm can be extended to get T and P by iteratively subtracting A (in this case, X) by x*y' (in this case, t*p') until the given tolerance satisfied. This is the so called NIPALS algorithm. 
</p><pre class="codeinput"><span class="comment">% The data matrix with normalization</span>A=randn(10,5);meanx=mean(A);stdx=std(A);X=(A-meanx(ones(10,1),:))./stdx(ones(10,1),:);B=X'*X;<span class="comment">% allocate T and P</span>T=zeros(10,5);P=zeros(5);<span class="comment">% tol for convergence</span>tol=1e-6;<span class="comment">% tol for PC of 95 percent</span>tol2=(1-0.95)*5*(10-1);<span class="keyword">for</span> k=1:5 <span class="comment">%find the column which has the maximum norm</span> [dum,idx]=max(sum(X.*X)); t=A(:,idx); <span class="comment">%storage to judge convergence</span> t0=t-t; <span class="comment">%iteration if not converged</span> <span class="keyword">while</span> norm(t-t0)>tol <span class="comment">%iteration to approach the eigenvector direction</span> p=X'*t; <span class="comment">%normalize the vector</span> p=p/norm(p); <span class="comment">%save previous t</span> t0=t; <span class="comment">%t is a product of eigenvalue and eigenvector</span> t=X*p; <span class="keyword">end</span> <span class="comment">%subtracing PC identified</span> X=X-t*p'; T(:,k)=t; P(:,k)=p; <span class="keyword">if</span> norm(X)<tol2 <span class="keyword">break</span> <span class="keyword">end</span><span class="keyword">end</span>T(:,k+1:5)=[];P(:,k+1:5)=[];S=diag(T'*T);<span class="comment">% compare it with those obtained with eig</span>[V,D]=eig(B);[D,idx]=sort(diag(D),<span class="string">'descend'</span>);D=D(1:k);V=V(:,idx(1:k));fprintf(<span class="string">'The number of PC: %i\n'</span>,k);fprintf(<span class="string">'norm of score difference between EIG and NIPALS: %g\n'</span>,norm(D-S));fprintf(<span class="string">'norm of loading difference between EIG and NIPALS: %g\n'</span>,norm(abs(V)-abs(P)));</pre><pre class="codeoutput">The number of PC: 4norm of score difference between EIG and NIPALS: 3.20008e-012norm of loading difference between EIG and NIPALS: 1.89247e-006</pre><h2>The NIPALS Algorithm for PLS<a name="3"></a></h2> <p>For PLS, we will have two sets of data: the independent X and dependent Y. The NIPALS algorithm can be used to decomposes both X and Y so that </p> <p><img vspace="5" hspace="5" src="learningpcapls_eq29314.png"> </p> <p>The regression, U=TB is solved through least sequares whilst the decompsition may not include all components. That is why the approach is called partial least squares. This algorithm is implemented in the PLS function. </p> <h2>Example: Discriminant PLS using the NIPALS Algorithm<a name="4"></a></h2> <p>From Chiang, Y.Q., Zhuang, Y.M and Yang, J.Y, "Optimal Fisher discriminant analysis using the rank decomposition", Pattern Recognition, 25 (1992), 101--111. 
</p><pre class="codeinput"><span class="comment">% Three classes data, each has 50 samples and 4 variables.</span>x1=[5.1 3.5 1.4 0.2; 4.9 3.0 1.4 0.2; 4.7 3.2 1.3 0.2; 4.6 3.1 1.5 0.2;<span class="keyword">...</span> 5.0 3.6 1.4 0.2; 5.4 3.9 1.7 0.4; 4.6 3.4 1.4 0.3; 5.0 3.4 1.5 0.2; <span class="keyword">...</span> 4.4 2.9 1.4 0.2; 4.9 3.1 1.5 0.1; 5.4 3.7 1.5 0.2; 4.8 3.4 1.6 0.2; <span class="keyword">...</span> 4.8 3.0 1.4 0.1; 4.3 3.0 1.1 0.1; 5.8 4.0 1.2 0.2; 5.7 4.4 1.5 0.4; <span class="keyword">...</span> 5.4 3.9 1.3 0.4; 5.1 3.5 1.4 0.3; 5.7 3.8 1.7 0.3; 5.1 3.8 1.5 0.3; <span class="keyword">...</span> 5.4 3.4 1.7 0.2; 5.1 3.7 1.5 0.4; 4.6 3.6 1.0 0.2; 5.1 3.3 1.7 0.5; <span class="keyword">...</span> 4.8 3.4 1.9 0.2; 5.0 3.0 1.6 0.2; 5.0 3.4 1.6 0.4; 5.2 3.5 1.5 0.2; <span class="keyword">...</span> 5.2 3.4 1.4 0.2; 4.7 3.2 1.6 0.2; 4.8 3.1 1.6 0.2; 5.4 3.4 1.5 0.4; <span class="keyword">...</span> 5.2 4.1 1.5 0.1; 5.5 4.2 1.4 0.2; 4.9 3.1 1.5 0.2; 5.0 3.2 1.2 0.2; <span class="keyword">...</span> 5.5 3.5 1.3 0.2; 4.9 3.6 1.4 0.1; 4.4 3.0 1.3 0.2; 5.1 3.4 1.5 0.2; <span class="keyword">...</span> 5.0 3.5 1.3 0.3; 4.5 2.3 1.3 0.3; 4.4 3.2 1.3 0.2; 5.0 3.5 1.6 0.6; <span class="keyword">...</span> 5.1 3.8 1.9 0.4; 4.8 3.0 1.4 0.3; 5.1 3.8 1.6 0.2; 4.6 3.2 1.4 0.2; <span class="keyword">...</span> 5.3 3.7 1.5 0.2; 5.0 3.3 1.4 0.2];x2=[7.0 3.2 4.7 1.4; 6.4 3.2 4.5 1.5; 6.9 3.1 4.9 1.5; 5.5 2.3 4.0 1.3; <span class="keyword">...</span> 6.5 2.8 4.6 1.5; 5.7 2.8 4.5 1.3; 6.3 3.3 4.7 1.6; 4.9 2.4 3.3 1.0; <span class="keyword">...</span> 6.6 2.9 4.6 1.3; 5.2 2.7 3.9 1.4; 5.0 2.0 3.5 1.0; 5.9 3.0 4.2 1.5; <span class="keyword">...</span> 6.0 2.2 4.0 1.0; 6.1 2.9 4.7 1.4; 5.6 2.9 3.9 1.3; 6.7 3.1 4.4 1.4; <span class="keyword">...</span> 5.6 3.0 4.5 1.5; 5.8 2.7 4.1 1.0; 6.2 2.2 4.5 1.5; 5.6 2.5 3.9 1.1; <span class="keyword">...</span> 5.9 3.2 4.8 1.8; 6.1 2.8 4.0 1.3; 6.3 2.5 4.9 1.5; 6.1 2.8 4.7 1.2; <span class="keyword">...</span> 6.4 2.9 4.3 1.3; 6.6 3.0 4.4 1.4; 6.8 2.8 4.8 1.4; 6.7 3.0 5.0 1.7; <span class="keyword">...</span> 6.0 2.9 4.5 1.5; 5.7 2.6 3.5 1.0; 5.5 2.4 3.8 1.1; 5.5 2.4 3.7 1.0; <span class="keyword">...</span> 5.8 2.7 3.9 1.2; 6.0 2.7 5.1 1.6; 5.4 3.0 4.5 1.5; 6.0 3.4 4.5 1.6; <span class="keyword">...</span> 6.7 3.1 4.7 1.5; 6.3 2.3 4.4 1.3; 5.6 3.0 4.1 1.3; 5.5 2.5 5.0 1.3; <span class="keyword">...</span> 5.5 2.6 4.4 1.2; 6.1 3.0 4.6 1.4; 5.8 2.6 4.0 1.2; 5.0 2.3 3.3 1.0; <span class="keyword">...</span> 5.6 2.7 4.2 1.3; 5.7 3.0 4.2 1.2; 5.7 2.9 4.2 1.3; 6.2 2.9 4.3 1.3; <span class="keyword">...</span> 5.1 2.5 3.0 1.1; 5.7 2.8 4.1 1.3];x3=[6.3 3.3 6.0 2.5; 5.8 2.7 5.1 1.9; 7.1 3.0 5.9 2.1; 6.3 2.9 5.6 1.8; <span class="keyword">...</span> 6.5 3.0 5.8 2.2; 7.6 3.0 6.6 2.1; 4.9 2.5 4.5 1.7; 7.3 2.9 6.3 1.8; <span class="keyword">...</span> 6.7 2.5 5.8 1.8; 7.2 3.6 6.1 2.5; 6.5 3.2 5.1 2.0; 6.4 2.7 5.3 1.9; <span class="keyword">...</span> 6.8 3.0 5.5 2.1; 5.7 2.5 5.0 2.0; 5.8 2.8 5.1 2.4; 6.4 3.2 5.3 2.3; <span class="keyword">...</span> 6.5 3.0 5.5 1.8; 7.7 3.8 6.7 2.2; 7.7 2.6 6.9 2.3; 6.0 2.2 5.0 1.5; <span class="keyword">...</span> 6.9 3.2 5.7 2.3; 5.6 2.8 4.9 2.0; 7.7 2.8 6.7 2.0; 6.3 2.7 4.9 1.8; <span class="keyword">...</span> 6.7 3.3 5.7 2.1; 7.2 3.2 6.0 1.8; 6.2 2.8 4.8 1.8; 6.1 3.0 4.9 1.8; <span class="keyword">...</span> 6.4 2.8 5.6 2.1; 7.2 3.0 5.8 1.6; 7.4 2.8 6.1 1.9; 7.9 3.8 6.4 2.0; <span class="keyword">...</span> 6.4 2.8 5.6 2.2; 6.3 2.8 5.1 1.5; 6.1 2.6 5.6 1.4; 7.7 3.0 6.1 2.3; <span class="keyword">...</span> 6.3 3.4 5.6 2.4; 6.4 3.1 5.5 
1.8; 6.0 3.0 4.8 1.8; 6.9 3.1 5.4 2.1; <span class="keyword">...</span> 6.7 3.1 5.6 2.4; 6.9 3.1 5.1 2.3; 5.8 2.7 5.1 1.9; 6.8 3.2 5.9 2.3; <span class="keyword">...</span> 6.7 3.3 5.7 2.5; 6.7 3.0 5.2 2.3; 6.3 2.5 5.0 1.9; 6.5 3.0 5.2 2.0; <span class="keyword">...</span> 6.2 3.4 5.4 2.3; 5.9 3.0 5.1 1.8];<span class="comment">%Split data set into training (1:25) and testing (26:50)</span>idxTrain = 1:25;idxTest = 26:50;<span class="comment">% Combine training data with normalization</span>X = [x1(idxTrain,:);x2(idxTrain,:);x3(idxTrain,:)];<span class="comment">% Define class indicator as Y</span>Y = kron(eye(3),ones(25,1));<span class="comment">% Normalization</span>xmean = mean(X);xstd = std(X);ymean = mean(Y);ystd = std(Y);X = (X - xmean(ones(75,1),:))./xstd(ones(75,1),:);Y = (Y - ymean(ones(75,1),:))./ystd(ones(75,1),:);<span class="comment">% Tolerance for 90 percent score</span>tol = (1-0.9) * 25 * 4;<span class="comment">% Perform PLS</span>
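The listing breaks off at the call to the author's PLS function, which is not reproduced here. A hedged sketch of how the example could continue, using the nipalspls sketch above with an assumed choice of two components (the original instead selects the number of components from the tolerance tol):

```matlab
% perform PLS; retaining 2 components is an illustrative assumption
[T,P,U,Q,W,b] = nipalspls(X,Y,2,1e-6);
% overall regression matrix: scores are T = X*W/(P'*W), so Yhat = X*Bmat
Bmat = W/(P'*W)*diag(b)*Q';
% normalize the test data with the training statistics
Xt = [x1(idxTest,:); x2(idxTest,:); x3(idxTest,:)];
Xt = (Xt - xmean(ones(75,1),:))./xstd(ones(75,1),:);
% predict the class indicators and undo the Y normalization
Yp = Xt*Bmat;
Yp = Yp.*ystd(ones(75,1),:) + ymean(ones(75,1),:);
% assign each test sample to the class with the largest indicator
[dum,cls] = max(Yp,[],2);
clsTrue = kron((1:3)',ones(25,1));
fprintf('misclassified test samples: %i of 75\n',sum(cls ~= clsTrue));
```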