📄 cluster.html

📁 非线性时间学列分析工具
💻 HTML
字号:
<html> <head><title>Clustering a dissimilarity matrix</title></head><body bgcolor="#ffffff"><h3>Clustering a dissimilarity matrix</h3><font color=blue><tt>cluster <font color=red>-##</font>    [-= -X </tt><em>xfile</em><tt>] </tt><em>file</em></font><blockquote>   <br>     <font color=red><tt> -#  </tt></font>number of clusters   <br>     <font color=blue><tt> -=  </tt></font>if set, bias towards similar size clusters   <br>     <font color=blue><tt> -X  </tt><em>ffile</em></font>list of indices with fixed cluster assignments   <br>     <font color=blue><tt> -o  </tt></font><a href=../general.html#outfile>output file name</a>, just <font color=blue><tt> -o  </tt></font>means <font color=blue><em>file</em><tt>_clust</tt></font>   <br>     <font color=blue><tt> -V  </tt></font><a href=../general.html#verbosity>verbosity level</a> (0 = only fatal errors)   <br>     <font color=blue><tt> -h  </tt></font>show this message<p><a href=../general.html#verbosity>verbosity level</a> (add what you want): <p>      <font color=blue> 	1</font> = input/output<br>   <font color=blue> 	2</font> = state of clustering<br>   <font color=blue> 	8</font> = temperature / cost at cooling</blockquote>The program reads a dissimilarity matrix of the form <font color=blue>i, j,d<sub>i,j</sub></font> (columns 1,2,3 of the input file). Any missing valuesare filled in by the mean of the given values. Now <font color=red><tt> -#</tt></font> clusters are formed by minimising the average dissimilarity ofeach entity to all the entities within each cluster. The method is described in<a href="../chaospaper/citation.html#cluster">Schreiber and Schmitz</a>.Certain indices may be assigned to a cluster by listing the index and thecluster number in <font color=blue><em>ffile</em></font> 9the argument of the<font color=blue><tt> -X  </tt></font> option).<p>Progress is monitored by a string printed at brief intervals. Here, clustersare lettered by A, B,&nbsp;... On output, the clustering is described by givingfor each index the cluster number and the average dissimilarities of that itemto each cluster.<p>As an example, consider four time series 1,2,3,4 where 1 and 2 are verysimilar, 3 and 4 as well, but teh two groups are quite dissimilar. This may bereflected in the dissimilarity matrix<blockquote><table noborder cellpadding=0 cellspacing=0><tr><th>i</th><th>j</th><th>d<sub>i,j</sub></th></tr><tr><td colspan=3><hr with=100%></td></tr><tr><td>     1 </td><td>1 </td><td>0 </td></tr>	<tr><td>     1 </td><td>2 </td><td>0 </td></tr>	<tr><td>     1 </td><td>3 </td><td>1 </td></tr>	<tr><td>     1 </td><td>4 </td><td>1 </td></tr>	<tr><td>     2 </td><td>1 </td><td>0 </td></tr>	<tr><td>     2 </td><td>2 </td><td>0 </td></tr>	<tr><td>     2 </td><td>3 </td><td>1 </td></tr>	<tr><td>     2 </td><td>4 </td><td>1 </td></tr>	<tr><td>     3 </td><td>1 </td><td>1 </td></tr>	<tr><td>     3 </td><td>2 </td><td>1 </td></tr>	<tr><td>     3 </td><td>3 </td><td>0 </td></tr>	<tr><td>     3 </td><td>4 </td><td>0 </td></tr>	<tr><td>     4 </td><td>1 </td><td>1 </td></tr>	<tr><td>     4 </td><td>2 </td><td>1 </td></tr>	<tr><td>     4 </td><td>3 </td><td>0 </td></tr>	<tr><td>     4 </td><td>4 </td><td>0 </td></tr>	</table></blockquote>Running<blockquote>   <font color=blue><tt>&gt; cluster -#2</tt></font> </blockquote>on this data will yield as output<pre> 1  0.  1. 1  0.  1. 2  1.  0. 2  1.  0.</pre>This means that, as expected, set 1 and two are in cluster 1. Also, 3 and 4 arein cluster 2. They all have average distance 0. to their home cluster and 1. tothe other cluster. <p>Dissimilarity matrices for time series can be produced either using <aherf="../docs_c/nstat_z.htm">nstat_z</a> or by computing any otherdissimilarity measure (<aherf="../docs_c/xc2.html">xc2</a>, <aherf="../docs_c/xzero.html">xzero</a>, <aherf="../docs_c/xcor.html">xcor</a>, with appropriate settings)in a loop.<p>Here one more example for UNIX users using pipelines:<blockquote>   <font color=blue><tt>&gt;  ( henon -l 1000 ; ikeda -l 1000 ) | nstat_z -#10 | cluster -#2</tt></font> </blockquote>A time series of 2000 points is produced, the first 1000 from the H&eacute;nonmap, the second from the Ikeda map. Splitting it into 10 segments, <ahref="../docs_c/nstat_z.htm">nstat_z</a> produces a 10 by 10 matrix which isthen used to form 2 clusters:<pre> 1  0.510285676  1.51835787 1  0.515677989  1.47538877 1  0.505068302  1.50731277 1  0.526149631  1.50477791 1  0.54192245  1.5086087 2  1.49558449  0.900290847 2  1.49753571  0.912206411 2  1.50899839  0.89825052 2  1.49374235  0.903614342 2  1.51858497  0.915635824</pre>Indeed, the first 5 segments form one cluster and segments 6-10 the other.<p><a href="../contents.html">Table of Contents</a> * <a href="../../index.html" target="_top">TISEAN home</a></body></html>
💿 文件大小 1539 K
👤 上传用户 delsboy
📂 所属分类数学计算
🏷️ 相关标签

#非线性 #分
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -