<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head>  <title>GraphDemo</title>  <meta content="Evrsoft First Page" name="GENERATOR"></head><body>  <table style="WIDTH: 100%; HEIGHT: 20%" align="center" bgcolor="#00FFFF">    <tbody>      <tr>        <td width="10%"><img height="150" src="GraphPicture.png" width="190" border="0"></td>        <td width="5%"></td>        <td width="80%">          <p align="left"><font face="Verdana" size="6">GraphDemo</font></p>          <p align="left"><font face="Verdana" size="4">a Matlab GUI to explore similarity graphs and their use in machine learning</font></p>          <p align="left">&nbsp;</p>          <p align="left"><font face="Verdana">by <a href="http://www.ml.uni-saarland.de/index.html">Matthias Hein</a> and <a href="http://www.kyb.mpg.de/~ule">Ulrike von Luxburg</a></font></p>        </td>      </tr>    </tbody>  </table>  <h2><font face="Verdana">Overview</font></h2>  <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <p><font face="Verdana">Many machine learning algorithms model local neighborhoods using similarity graphs: manifold methods for dimensionality reduction or data denoising, spectral clustering,    label propagation for semi-supervised learning, and so on. However, for most of those algorithms it is pretty unclear which kind of similarity graph one should use, and how its parameters have to    be chosen.<br>    <br>    To get some intuition about those questions we wrote the GraphDemo package. It aims to highlight the behavior of different kinds of similarity graphs and to demonstrate their influence on the    outcome of machine learning algorithms. 
The package currently contains three different parts:</font></p>  </blockquote>  <table width="100%">    <tbody>      <tr>        <td>          <p align="center"><img height="200" src="fig_DemoSimilarityGraphs.png" width="300" border="0"></p>        </td>        <td>          <p align="center"><img height="200" src="fig_DemoSpectralClustering.png" width="300" border="0"></p>        </td>        <td>          <p align="center"><img height="200" src="fig_DemoSSL.png" width="300" border="0"></p>        </td>      </tr>      <tr>        <td>          <p align="center"><a href="DemoSimilarityGraphs.html"><font face="Verdana" size="5">DemoSimilarityGraphs</font></a></p>        </td>        <td>          <p align="center"><a href="DemoSpectralClustering.html"><font face="Verdana" size="5">DemoSpectralClustering</font></a></p>        </td>        <td>          <p align="center"><a href="DemoSSL.html"><font face="Verdana" size="5">DemoSSL</font></a></p>        </td>      </tr>    </tbody>  </table>  <h2><font face="Verdana">Purpose</font></h2>  <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <p><font face="Verdana">GraphDemo was originally written for teaching purposes. It was first used in the practical sessions of the <a href="http://www.mlss.cc/tuebingen07">Machine Learning Summer School 2007</a>.</font></p>  </blockquote>  <h2><font face="Verdana">Download and License</font></h2>  <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <p><font face="Verdana">GraphDemo was written by <a href="http://www.ml.uni-saarland.de/index.html">Matthias Hein</a>, Department of Computer Science, Saarland University, and <a href=    "http://www.kyb.mpg.de/~ule">Ulrike von Luxburg</a>, Max-Planck-Institute for Biological Cybernetics, Tuebingen, Germany. 
GraphDemo is released under the GNU public license.</font></p>  </blockquote>  <p align="center"><font face="Verdana"><a href="GraphDemos.zip"><font face="Verdana" size="5">Download GraphDemos.zip</font></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (Size: 12.6 MB)&nbsp;  &nbsp;(Version: 1.0.1)</font></p>  <h2><font face="Verdana">History</font></h2>  <p align="center"><font face="Verdana"><blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <table>      <tbody>        <tr>          <td>26.10.2007,</td>          <td>Version 1.0:</td>          <td>Initial release</td>        </tr>        <tr>          <td>10.04.2008,</td>          <td>Version 1.0.1:</td>          <td>Bug fix in the spectral clustering demo: a problem in the case of small disconnected components has been resolved</td>        </tr>      </tbody>    </table>  </blockquote></font></p>  <h2><font face="Verdana">Installation</font></h2>  <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <p><font face="Verdana">To run GraphDemo, download the file GraphDemos.zip and unzip it in some convenient directory. Add this directory to the Matlab path variable or cd to    this directory within Matlab. The three demos can then be started from within Matlab with the commands 'DemoSimilarityGraphs', 'DemoSpectralClustering', and 'DemoSSL'. The GUI should be compatible with    Matlab 7.x. If you find bugs or have other comments, contact <a href="http://www.ml.uni-saarland.de/index.html">Matthias Hein</a>.</font></p>  </blockquote>  <h2><font face="Verdana">Documentation</font></h2>  <h3><font face="Verdana">Data sets and noise models</font></h3>  <blockquote>    <p><font face="Verdana">All three demos use the same data sets. In all demos, the user can select the data set in a popup menu and adjust its dimensionality using a slider. All data sets (except    USPS, see below) have two dimensions which contain the data structure; the other dimensions are pure noise. 
The data sets and the noise have been generated according to the following data    models:</font></p>    <ol>      <li><font face="Verdana"><strong><em>Two moons balanced:</em></strong> The first two dimensions are two half-circles forming a two-moons shape (as shown in the screenshot). In all dimensions,      Gaussian noise with mean 0 and variance 0.01 has been added. Both moons have the same weight (probability mass), that is, in expectation they contain the same number of data      points.</font></li>      <li><font face="Verdana"><em><strong>Two moons unbalanced:</strong></em> As "Two moons balanced", but the two classes have unequal weights of 0.2 and 0.8, respectively. That is, in expectation      only 20% of the points come from the first moon, while 80% of the points come from the second moon.</font></li>      <li><font face="Verdana"><strong><em>Two Gaussians balanced:</em></strong> The data are generated from the two points (-1.1,-1.1) and (1,1) in two-dimensional space. Both points are perturbed by Gaussian      noise of variance 0.36, whose dimension is given by the "dimension" parameter. Both classes have equal weight.</font></li>      <li><font face="Verdana"><strong><em>Two Gaussians unbalanced:</em></strong> As "Two Gaussians balanced", but with unbalanced class weights of 0.2 and 0.8.</font></li>      <li><font face="Verdana"><strong><em>Two Gaussians different variance:</em></strong> As "Two Gaussians balanced", but the two points are perturbed by Gaussian noise of different variances.</font></li>      <li><font face="Verdana"><strong><em>Three Gaussians:</em></strong> The data are generated from the three points (1,1), (-1.1,-1.1) and (2,-2) in two-dimensional space. All three points are perturbed by      Gaussian noise of variance 0.36. 
The weights of the Gaussians are 0.3, 0.3 and 0.4.</font></li>      <li><font face="Verdana"><strong><em>Two Gaussians, no clusters (in DemoSSL only):</em></strong> As "Two Gaussians balanced", but the decision boundary goes through the middle of the two      Gaussians.</font></li>      <li><font face="Verdana"><strong><em>USPS (in DemoSSL only):</em></strong> This is the well-known USPS data set of handwritten digits, which is available online. It is the test set containing      2007 digits. No noise dimensions are added in this case, so the sliders "dimension" and "number of points" have no effect.</font></li>    </ol><font face="Verdana">Note that all data sets (except USPS) are ordered according to classes. This means that the heat maps of the similarity matrix and of the adjacency matrices show a block    structure.</font>  </blockquote>  <h3><font face="Verdana">Distance and similarity functions</font></h3>  <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">    <p><font face="Verdana">As distance function, we use the Euclidean distance d(x,y) = ||x - y|| for all data sets.<br>    <br>    As similarity function, we use the Gaussian kernel s(x,y) = exp(-||x - y||^2 / (2 sigma^2)) for all data sets. The kernel width sigma can be adjusted using a slider. Note that the range of    "reasonable" values of sigma changes with the number of noise dimensions. The panels "Heat map of the similarity matrix" and "Histogram of the similarity values" in DemoSimilarityGraphs are    meant to help the user choose a good value of sigma. This value should then be used in DemoSpectralClustering and DemoSSL.<br>    <strong>Warning:</strong> Due to limited numerical precision, for high-dimensional data sets a value of sigma that is too small can lead to zero weights. 
Then the graph is completely    disconnected.</font></p>  </blockquote>  <h3><font face="Verdana">Similarity graphs and their parameters</font></h3>  <blockquote>    <p><font face="Verdana">The purpose of a similarity graph is to connect data points in "local neighborhoods". Each data point corresponds to a vertex in the neighborhood graph. Which vertices count as    "close" and are therefore connected depends on the type of similarity graph.</font></p>    <ul>      <li><font face="Verdana"><strong><em>Epsilon-neighborhood graph:</em></strong> Two vertices are connected if the distance between the corresponding data points is less than epsilon. In machine      learning, the epsilon-neighborhood graph is used either as a weighted or as an unweighted graph. In the weighted case, the edge weights are the similarity (!) values of the adjacent points (note that      edge weights in similarity graphs should always be similarities, not distances!). The parameter of this graph is epsilon; it can be set by a slider in all demos.</font></li>      <li><font face="Verdana">Note that a <strong><em>completely connected graph</em></strong> (as it is sometimes used in connection with the Gaussian kernel) can be obtained by using an      epsilon-neighborhood graph and setting epsilon to its maximal value.</font></li>      <li><font face="Verdana">The <strong><em>symmetric k-nearest neighbor graph</em></strong>: Two vertices x, y are connected if x is among the k nearest neighbors of y <strong>or</strong> vice      versa. In this demo, we use a weighted symmetric k-nearest neighbor graph, that is, all edges are weighted by the similarity of the adjacent points. The parameter of this graph is k; it can be      set by a slider in all demos. [The name "symmetric kNN graph" is not standard; in many papers this graph is simply called "the kNN graph". We use the term "symmetric" to distinguish it from the      mutual kNN graph, see below. 
]</font></li>      <li><font face="Verdana">The <strong><em>mutual k-nearest neighbor graph:</em></strong> Two vertices x, y are connected if x is among the k nearest neighbors of y <strong>and</strong> vice      versa. Similarly, we use a weighted mutual k-nearest neighbor graph, that is, all edges are weighted by the similarity of the adjacent points. The parameter of this graph is k; it can be set by a      slider in all demos.</font></li>    </ul>  </blockquote>  <h3><font face="Verdana">Descriptions of the individual demos</font></h3>  <blockquote>    <font face="Verdana">... can be found on the pages <a href="DemoSimilarityGraphs.html"><font face="Verdana">DemoSimilarityGraphs</font></a>, <a href="DemoSpectralClustering.html"><font face=    "Verdana">DemoSpectralClustering</font></a>, and <a href="DemoSSL.html"><font face="Verdana">DemoSSL</font></a>.</font>  </blockquote><br>  <br></body></html>