<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0060)http://www-stat.stanford.edu/~susan/courses/s208/node18.html -->
<!--Converted with LaTeX2HTML 2002-2 (1.70)original version by:  Nikos Drakos, CBLU, University of Leeds* revised and updated by:  Marcus Hennecke, Ross Moore, Herb Swan* with significant contributions from:  Jens Lippmann, Marek Rouchal, Martin Wilck and others --><HTML><HEAD><TITLE>Bootstrapping a Principal Component Analysis</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="Bootstrapping a Principal Component Analysis" name=description>
<META content=web1 name=keywords>
<META content=document name=resource-type>
<META content=global name=distribution>
<META content="MSHTML 6.00.2900.2523" name=GENERATOR>
<META http-equiv=Content-Style-Type content=text/css><LINK 
href="Bootstrapping a Principal Component Analysis.files/web1.css" 
rel=STYLESHEET><LINK href="node19.html" rel=next><LINK href="node17.html" 
rel=previous><LINK href="node6.html" rel=up><LINK href="node19.html" 
rel=next></HEAD>
<BODY bgColor=#ffffff><!--Table of Child-Links--><A 
name=CHILD_LINKS><STRONG>Subsections</STRONG></A> 
<UL>
  <LI>
  <UL>
    <LI><A 
    href="http://www-stat.stanford.edu/~susan/courses/s208/node18.html#SECTION002120100000000000000" 
    name=tex2html399>Description of Singular Value Decomposition</A> 
</LI></UL><BR>
  <LI><A 
  href="http://www-stat.stanford.edu/~susan/courses/s208/node18.html#SECTION002121000000000000000" 
  name=tex2html400>Principal Components</A> 
  <UL>
    <LI><A 
    href="http://www-stat.stanford.edu/~susan/courses/s208/node18.html#SECTION002121100000000000000" 
    name=tex2html401>Matlab for the Scores Example -in handout 4/27/99</A> 
    <LI><A 
    href="http://www-stat.stanford.edu/~susan/courses/s208/node18.html#SECTION002121200000000000000" 
    name=tex2html402>How to generate a multivariate normal</A> </LI></UL></LI></UL><!--End of Table of Child-Links-->
<HR>

<H1><A name=SECTION002120000000000000000>Bootstrapping a Principal Component 
Analysis</A> </H1>
<P>The scores data are the first example in chapter 7 of the text. The analysis 
performed on them is called a principal components analysis; it is built on the 
singular value decomposition, which is described first. 
<P>
<H3><A name=SECTION002120100000000000000>Description of Singular Value 
Decomposition</A> </H3>This is the most important matrix decomposition in 
statistics. 
<P>As a first introduction, here is a small made-up example in Matlab: <PRE>&gt;&gt; u=[3 1 -1 2]'
u =
     3
     1
    -1
     2
&gt;&gt; v=(1:4)'
v = 
     1
     2
     3
     4
&gt;&gt; X=u*v'
X =
     3     6     9    12
     1     2     3     4
    -1    -2    -3    -4
     2     4     6     8
&gt;&gt; svd(X)
ans =
   21.2132
    0.0000
         0
         0
&gt;&gt; E=10^(-3)*randn(4);
&gt;&gt; XE=X+E 
XE =
    3.0012    5.9993    9.0003   12.0012
    1.0006    2.0017    3.0009    3.9994
   -0.9999   -1.9999   -3.0014   -3.9994
    2.0004    4.0018    5.9993    7.9996
&gt;&gt; svd(XE) 
ans =
   21.2143
    0.0028
    0.0019
    0.0005
&gt;&gt; cond(X)
Condition is infinite
ans =
   Inf 
&gt;&gt; cond(XE)
ans =
   4.2567e+04
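% An example whose rank you cannot see with the naked eye. Here u and v must have
% been redefined (the step is not shown); the output below is consistent with
% random 1-by-4 row vectors, e.g. u=rand(1,4); v=rand(1,4);
&gt;&gt; X2=u'*v 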
X2 =
    0.2468    0.2499    0.0166    0.2487
    0.3225    0.3266    0.0218    0.3250
    0.1495    0.1514    0.0101    0.1506
    0.4367    0.4423    0.0295    0.4401
&gt;&gt; svd(X2) 
ans =
    1.0730
    0.0000
         0
         0
&gt;&gt; [U,S,V]=svd(X2);
&gt;&gt; U*S*V' 
ans =
    0.2468    0.2499    0.0166    0.2487
    0.3225    0.3266    0.0218    0.3250
    0.1495    0.1514    0.0101    0.1506
    0.4367    0.4423    0.0295    0.4401
&gt;&gt; 10000*(X2-U*S*V') 
ans =
    1.0e-11 *
   -0.0278    0.0833   -0.0035    0.0555
   -0.0555    0.0555         0    0.0555
         0    0.0555    0.0017    0.0278
   -0.0555    0.1665         0    0.1110
</PRE><PRE>&gt;&gt; A=rand(4)
A = 
    0.5045    0.4940    0.0737    0.9138
    0.5163    0.2661    0.5007    0.5297
    0.3190    0.0907    0.3841    0.4644
    0.9866    0.9478    0.2771    0.9410
&gt;&gt; flops(0)   % reset the flop counter (flops exists only in old Matlab versions)
&gt;&gt; [L,U,P]=lu(A) 
L =
     1.0000         0         0         0
    0.5233    1.0000         0         0
    0.5114   -0.0406    1.0000         0
    0.3234    0.9388    0.7363    1.0000
U =
     0.9866    0.9478    0.2771    0.9410
         0   -0.2298    0.3557    0.0373
         0         0   -0.0535    0.4342
         0         0         0   -0.1945
&gt;&gt; flops 
ans =
    34 
&gt;&gt; P*A
ans =
    0.9866    0.9478    0.2771    0.9410
    0.5163    0.2661    0.5007    0.5297
    0.5045    0.4940    0.0737    0.9138
    0.3190    0.0907    0.3841    0.4644
&gt;&gt; L*U
ans =
     0.9866    0.9478    0.2771    0.9410
    0.5163    0.2661    0.5007    0.5297
    0.5045    0.4940    0.0737    0.9138
    0.3190    0.0907    0.3841    0.4644
&gt;&gt; P*A-L*U
ans =
    1.0e-15 *
         0         0         0         0
   -0.1110         0         0         0
         0         0         0         0
   -0.0555   -0.0139         0         0
</PRE>
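<P>The transcript does not say how <TT>cond</TT> relates to the singular values. 
As a minimal sketch, assuming the default 2-norm condition number, it is the 
ratio of the largest to the smallest singular value. This is why the exactly 
rank-one <TT>X</TT> gives <TT>Inf</TT>, while the perturbed <TT>XE</TT> gives a 
large but finite value (roughly 21.2143/0.0005 in the run above). The names 
<TT>kappa</TT> and <TT>relDiff</TT> are introduced here for illustration only. 
<PRE>u = [3 1 -1 2]';
v = (1:4)';
X  = u*v';                    % exactly rank one
XE = X + 1e-3*randn(4);       % small random perturbation, as in the transcript
s  = svd(XE);                 % singular values in decreasing order
kappa = s(1)/s(end);          % largest over smallest singular value
relDiff = abs(kappa - cond(XE))/cond(XE);   % should be at rounding level
</PRE>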
<P>Here is what we need to remember: <FONT size=+1><A name=923></A></FONT><BR>
<P></P>
<DIV align=center><!-- MATH \begin{displaymath}X=USV', V'V=I, U'U=I, S\; diagonal\; s_i\end{displaymath} --><IMG 
height=32 
alt="\begin{displaymath}&#10;X=USV', V'V=I, U'U=I, S\; diagonal\; s_i&#10;\end{displaymath}" 
src="Bootstrapping a Principal Component Analysis.files/img238.png" width=360 
border=0> </DIV><BR clear=all>
<P></P>The singular values s_i, the diagonal entries of S, are the square roots of the eigenvalues of 
<IMG height=17 alt="$X'X$" 
src="Bootstrapping a Principal Component Analysis.files/img239.png" width=43 
align=bottom border=0>. 
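<P>These identities can be checked numerically. The following is a minimal 
sketch on the rank-one matrix from the example above; the tolerance 
<TT>tol</TT> and the error variable names are assumptions made for 
illustration. The squared singular values are compared with the eigenvalues 
of X'X, rather than taking square roots, because rounding can make a zero 
eigenvalue come out slightly negative. 
<PRE>u = [3 1 -1 2]';
v = (1:4)';
X = u*v';                          % rank-one 4-by-4 matrix
[U,S,V] = svd(X);                  % X = U*S*V'
tol = 1e-8;                        % illustrative tolerance (assumption)
errRecon = norm(X - U*S*V');       % reconstruction error
errU = norm(U'*U - eye(4));        % U'U = I
errV = norm(V'*V - eye(4));        % V'V = I
s = diag(S);                       % singular values, decreasing
lam = sort(eig(X'*X),'descend');   % eigenvalues of X'X, decreasing
errEig = norm(s.^2 - lam);         % s_i^2 equals the i-th eigenvalue
</PRE>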
<H2><A name=SECTION002121000000000000000>Principal Components</A> </H2>
<UL>
  <LI>Start by recentring <IMG height=16 alt=$X$ 
  src="Bootstrapping a Principal Component Analysis.files/img162.png" width=22 
  align=bottom border=0>; from now on <IMG height=16 alt=$X$ 
  src="Bootstrapping a Principal Component Analysis.files/img162.png" width=22 
  align=bottom border=0> is taken to be centered, i.e. its column sums are zero, <IMG height=35 alt="$1_n   X=0$" 
  src="Bootstrapping a Principal Component Analysis.files/img240.png" width=78 
  align=middle border=0>, 
  <LI>Columns of <IMG height=16 alt=$V\rightarrow$ 
  src="Bootstrapping a Principal Component Analysis.files/img241.png" width=44 
  align=bottom border=0> are new variables, 
  <LI>Principal Components 
  <!-- MATH $C=US\quad C^\prime C=S^2\quad C=XV$ --><IMG height=19 
  alt="$C=US\quad C^\prime C=S^2\quad C=XV$" 
  src="Bootstrapping a Principal Component Analysis.files/img242.png" width=260 
  align=bottom border=0>, 
  <LI>Principal axes <!-- MATH $Z=U^\prime X=SV^\prime$ --><IMG height=17 
  alt="$Z=U^\prime X=SV^\prime$" 
  src="Bootstrapping a Principal Component Analysis.files/img243.png" width=137 
  align=bottom border=0>, 
  <LI>Distance between two points, <BR>
  <P></P>
  <DIV align=center><!-- MATH \begin{displaymath}\left( x_{k\cdot}-x_{\ell\cdot} \right)^\prime \left( x_{k\cdot}-x_{\ell\cdot} \right) = \sum^T_{j=1} \left( c_{kj}-c_{\ell j} \right)^2,\end{displaymath} --><IMG 
  height=61 
  alt="\begin{displaymath}&#10;\left( x_{k\cdot}-x_{\ell\cdot} \right)^\prime \left( x_{k\c...&#10;...ot} \right) = \sum^T_{j=1} \left( c_{kj}-c_{\ell j} \right)^2,&#10;\end{displaymath}" 
  src="Bootstrapping a Principal Component Analysis.files/img244.png" width=315 
  border=0> </DIV><BR clear=all>
  <P></P>
  <LI>Transition Formulae <!-- MATH $Z=S^{-1}C^\prime X\quad C=XZ^\prime S^{-1}$ --><IMG height=19 
  alt="$Z=S^{-1}C^\prime X\quad C=XZ^\prime S^{-1}$" 
  src="Bootstrapping a Principal Component Analysis.files/img245.png" width=236 
  align=bottom border=0>, </LI></UL>
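<P>The relations in the list above can be sketched in Matlab as follows. The 
data matrix <TT>X0</TT> is made up for illustration (6 observations in rows, 
3 variables in columns) and is not the scores data. The economy-size SVD is 
used so that S is square; the transition formulae additionally assume that S 
is invertible, i.e. that the centered matrix has full column rank. 
<PRE>X0 = [1 2 0; 2 1 1; 3 5 2; 4 3 5; 5 6 3; 6 4 7];   % made-up data
n = size(X0,1);
X = X0 - ones(n,1)*mean(X0);       % recentre: column sums are now zero
[U,S,V] = svd(X,0);                % economy size: U is 6-by-3, S and V are 3-by-3
C = U*S;                           % principal components
Z = S*V';                          % principal axes
err1 = norm(C - X*V);              % C = XV
err2 = norm(C'*C - S^2);           % C'C = S^2
err3 = norm(Z - U'*X);             % Z = U'X
% squared distance between observations k and m, in X and in C
k = 1; m = 4;
dX = (X(k,:)-X(m,:))*(X(k,:)-X(m,:))';
dC = sum((C(k,:)-C(m,:)).^2);
% transition formulae
err4 = norm(Z - S\(C'*X));         % Z = S^{-1} C' X
err5 = norm(C - (X*Z')/S);         % C = X Z' S^{-1}
</PRE>
<P>All five error norms, and the difference between <TT>dX</TT> and 
<TT>dC</TT>, should be at rounding level. 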
<P><IMG height=16 alt=$X$ 
src="Bootstrapping a Principal Component Analysis.files/img162.png" width=22 
align=bottom border=0> is centered, and all points (observations) have the same weight. <BR>
<P></P>
<DIV align=center><!-- MATH \begin{displaymath}1 \, X=0\quad\mbox{and}\quad x_{ij} = \sum^r_{t=1}u_{it}s_t v_{jt},\end{displaymath} --><IMG 
height=55 
alt="\begin{displaymath}&#10;1   X=0\quad\mbox{and}\quad x_{ij} = \sum^r_{t=1}u_{it}s_t v_{jt},&#10;\end{displaymath}" 
src="Bootstrapping a Principal Component Analysis.files/img246.png" width=264 
border=0> </DIV><BR clear=all>
<P></P><IMG height=33 alt=$p$ 
src="Bootstrapping a Principal Component Analysis.files/img247.png" width=14 
align=middle border=0> variables can be replaced by the <IMG height=16 alt=$r$ 
src="Bootstrapping a Principal Component Analysis.files/img248.png" width=14 
align=bottom border=0> columns of <IMG height=16 alt=$v$ 
src="Bootstrapping a Principal Component Analysis.files/img249.png" width=14 
align=bottom border=0>. <BR>
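<P>A minimal sketch of this rank-r representation, under stated assumptions: 
the matrix is built from two made-up rank-one terms so that it has rank two 
after centering, and the tolerance passed to <TT>rank</TT> is an illustrative 
choice. Keeping only the first r columns of U and V and the first r singular 
values reproduces the centered matrix up to rounding, and the r columns of 
<TT>Cr</TT> stand in for the p original variables. 
<PRE>u1 = [3 1 -1 2]';  v1 = (1:4)';
u2 = [1 -1 1 0]';  v2 = [1 0 -1 0]';     % made-up second term
X0 = u1*v1' + u2*v2';                    % 4 observations, p = 4 variables
X  = X0 - ones(4,1)*mean(X0);            % recentre the columns
[U,S,V] = svd(X);
r  = rank(X, 1e-8);                      % numerical rank, illustrative tolerance
Xr = U(:,1:r)*S(1:r,1:r)*V(:,1:r)';      % sum over t = 1..r of u_it s_t v_jt
err = norm(X - Xr);                      % rounding level if r is the true rank
Cr = X*V(:,1:r);                         % n-by-r matrix of new variables
</PRE>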
<P></P>
<DIV align=center><!-- MATH
