⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 tp5.html

📁 流形学习算法库
💻 HTML
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=iso-8859-1"><link rel=File-List href="index_files/filelist.xml"><title>Lecture 5 - Graph-Based data Processing</title><link href="styles.css" rel="stylesheet" type="text/css"></head><body bgcolor="#FFFFFF" lang=FR><div >   <h1>Lecture 5 - Graph-based <br>    data processing</h1></div><blockquote>   <blockquote>     <blockquote>       <blockquote>         <blockquote>          <p align="justify"><strong>Abstract : </strong>The goal of this lecture             is to manipulate data in arbitrary dimensions using graph-based method.             The points is the data are linked together in a graph structure.             Geodesic computations can be performed to compute distance on the             dataset.</p>        </blockquote>      </blockquote>    </blockquote>  </blockquote></blockquote><h2> Setting up Matlab. </h2><ul>  <li>First download the Matlab toolbox <a href="matlab/toolbox_dimreduc.zip"><font face="Courier New, Courier, mono">toolbox_dimreduc.zip</font></a>     and <a href="matlab/toolbox_graph.zip"><font face="Courier New, Courier, mono">toolbox_graph.zip</font></a>.     Unzip it into your working directory. You should have directories <font face="Courier New, Courier, mono">toolbox_dimreduc/</font>     and <font face="Courier New, Courier, mono">toolbox_graph</font>/ in your     path. </li>  <li>The first thing to do is to install this toolbox in your path.</li>  <blockquote>     <p><font face="Courier New, Courier, mono">path(path, 'toolbox_dimreduc');<br>      path(path, 'toolbox_dimreduc/toolbox/');<br>      path(path, 'toolbox_dimreduc/data/');<br>      path(path, 'toolbox_graph');</font></p>  </blockquote>  <li>Recompile the mex file for your machine (this can produce some warning).     If it does not work, either use the already compiled mex file (they should     be available for MacOs and/or Unix) or try to set up matlab with a C compiler     (e.g. gcc) using '<font face="Courier New, Courier, mono">mex -setup</font>'.   </li>  <blockquote>     <p><font face="Courier New, Courier, mono">cd toolbox_graph<br>      compile_mex;<br>      cd ..</font></p>  </blockquote></ul><h2> Distance Computation on Graphs. </h2><ul>  <li>You can load synthetic and real datasets (only frey_rawface is included     in the distribution however)</li>  <blockquote>     <p> <font face="Courier New, Courier, mono">name = 'swissroll'; % you can       also try with 'scurve'<br>      n = 800; % number of points<br>      [X,col] = load_points_set( name, n );<br>      clf; plot_scattered(X,col);<br>      axis equal; axis off;</font></p>  </blockquote>  <li>From points in arbitrary dimension, you can create a graph data structure     using the nearest-neighbor rule (you can also try the alternative epsilon     rule). </li>  <blockquote>     <p> <font face="Courier New, Courier, mono">options.use_nntools = 0; % set       to 1 if you have compile TStool for your machine, this will speed up computations<br>      options.nn_nbr = 7; % number of nearest neighbor<br>      % compute NN graph and the distance between nodes<br>      [D,A] = compute_nn_graph(X,options);<br>      % display the graph<br>      clf; plot_graph(A,X, col);<br>      axis tight; axis off; </font></p>  </blockquote>  <li>You can now compute the geodesic distance on this graph using the Dijkstra     algorithm. </li>  <blockquote>     <p> <font face="Courier New, Courier, mono">% set up the length of the edges       (0 for no edges)<br>      D(D==Inf) = 0; % weight on the graph<br>      % find some cool location for starting point<br>      [tmp,start_point] = min( abs(col(:)-mean(col(:)))); % starting point<br>      % compute the geodesic distance<br>      [d,S] = perform_dijkstra(D, start_point);<br>      % display the graph with distance colors<br>      clf; hold on;<br>      plot_scattered(X, d);<br>      h = plot3(X(1,start_point), X(2,start_point), X(3,start_point), 'k.');<br>      set(h, 'MarkerSize', 25);<br>      axis tight; axis off; hold off;</font></p>  </blockquote>  <table width="1000" border="0" cellspacing="0" cellpadding="0" align="center">    <tr align="center">       <td> <img src="images/tp4/scurve-display.jpg" width="300"/> <img src="images/tp4/scurve-graph.jpg" width="300"/>         <img src="images/tp4/scurve-dist.jpg" width="300"/> <br/> <img src="images/tp4/swissroll-display.jpg" width="300"/>         <img src="images/tp4/swissroll-graph.jpg" width="300"/> <img src="images/tp4/swissroll-dist.jpg" width="300"/>       </td>    </tr>    <tr>       <td align="center">Two examples of 3D point clouds (left) ; corresponding         NN-graph (center) ;<br>        geodesic distance to the point in black (right), blue means close.<br></td>    </tr>  </table></ul><h2> PCA and Isomap Flattening. </h2><ul>  <li>The principal component analysis realize a linear dimensionality reduction     by projecting the data set on the axes of largest variance.</li>  <blockquote>     <p><font face="Courier New, Courier, mono">% dimension reduction using PCA<br>      [Y,xy] = pca(X,2);<br>      % display<br>      clf; plot_scattered(xy,col);<br>      axis equal; axis off;</font></p>  </blockquote>  <li>In order to better capture the manifold geometry of the data, Isomap compute     the geodesic distance between pair of points, and find a low-dimensional layout     that best respect these geodesic distance.</li>  <blockquote>     <p><font face="Courier New, Courier, mono">% dimension reduction using Isomap<br>      options.nn_nbr = 7; % number of NN for the graph<br>      xy = isomap(X,2, options);<br>      % display<br>      clf; plot_scattered(xy,col);<br>      axis equal; axis off;</font></p>  </blockquote>  <table width="1100" border="0" cellspacing="0" cellpadding="0" align="center">    <tr align="center">       <td> <img src="images/tp4/scurve-display.jpg" width="300"/> <img src="images/tp4/scurve-dred-pca.jpg" width="300">         <img src="images/tp4/scurve-dred-isomap.jpg" width="300"><br/> <img src="images/tp4/swissroll-display.jpg" width="300"/>         <img src="images/tp4/swissroll-dred-pca.jpg" width="300"> <img src="images/tp4/swissroll-dred-isomap.jpg" width="300">       </td>    </tr>    <tr>       <td align="center">Original 3D data set (left) ; 2D flattening using PCA         (center) and using Isomap (right).<br>        Note how the PCA does not recover the simple 2D parameterization of the         <br>        manifold since it is a linear process. In contrast, Isomap is able <br>        to &quot;unfold&quot; the curvy surface.</td>    </tr>  </table></p>  </ul><h2> Dimension Reduction for Image Libraries. </h2><ul>  <li>We can use the same process of flattening for data of arbitrary dimension.     We first use a simple library of translating disks.</li>  <blockquote>     <p><font face="Courier New, Courier, mono">name = 'disks';<br>      options.nbr = 1000;<br>      % Read database<br>      M = load_images_dataset(name, options);<br>      % turn it into a set of points<br>      a = size(M,1);b = size(M,2);n = size(M,3);<br>      X = reshape(M, a*b, n);<br>      % perform isomap<br>      options.nn_nbr = 7;<br>      options.use_nntools = 0;<br>      xy = isomap(X,2, options);<br>      k = 30;<br>      clf; plot_flattened_dataset(xy,M,k);<br>      % perform pca<br>      [tmp,xy] = pca(X,2);<br>      clf; plot_flattened_dataset(xy,M,k); </font></blockquote>  <li>We can do the same computation on a more complex library of faces.</li>  <blockquote>     <p><font face="Courier New, Courier, mono">name = 'frey_rawface';<br>      options.nbr = 2000;<br>      % Read database<br>      M = load_images_dataset(name, options);<br>      % subsample at random<br>      n = 1000;<br>      sel = randperm(size(M,3));<br>      M = M(:,:,sel(1:n)); <br>      M = permute(M,[2 1 3]); % fix x/y orientation of the faces<br>      % turn it into a set of points<br>      a = size(M,1);b = size(M,2);n = size(M,3);<br>      X = reshape(M, a*b, n);<br>      % perform isomap<br>      options.nn_nbr = 7;<br>      options.use_nntools = 0;<br>      xy = isomap(X,2, options);<br>      k = 30;<br>      clf; plot_flattened_dataset(xy,M,k);<br>      % perform pca<br>      [tmp,xy] = pca(X,2);<br>      clf; plot_flattened_dataset(xy,M,k); </font></blockquote>  <blockquote>     <table width="920" border="0" cellspacing="0" cellpadding="0" align="center">      <tr align="center">         <td> <img src="images/tp4/disks-dataset.jpg" width="400"> <img src="images/tp4/frey_rawface-dataset.jpg" width="400">           <br/> <img src="images/tp4/disks-pca.jpg" width="400"> <img src="images/tp4/frey_rawface-pca.jpg" width="400">           <br/> <img src="images/tp4/disks-isomap.jpg" width="400"> <img src="images/tp4/frey_rawface-isomap.jpg" width="400">         </td>      </tr>      <tr>         <td align="center"><p>Dimension reduction for two different libraries             of images.<br>            Left: translating disks, right: face images.<br>            Although the disks data set depends on 2D translation, this is not<br>            a flat (euclidean) manifold (it is a bit curvy due to the disk shape).<br>            This is why the PCA mapping does not recover exactly a 2D square.<br>            The face database exhibits a more complex embedding.</p></td>      </tr>    </table>    <br/>    Copyright &copy; 2006 <a href="http://www.ceremade.dauphine.fr/%7Epeyre/">Gabriel     Peyr&eacute;</a> </blockquote></ul></body></html>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -