<chapter id="tutorial">
<title>Tutorial</title>
<para>This short tutorial shows how to use <application>Select</application> to run some email classification tests. It does not, however, show how to use it in practice for classifying your incoming email. In fact, the current version of <application>Select</application> is not very suitable for that purpose. I encourage everyone who is interested, though, to look at <xref linkend="select_ifile"/> and <xref linkend="natural"/>, and to send me feedback on how to improve <application>Select</application> in this area.</para>

<sect1>
<title>Preparations</title>
<para>First you need to compile <application>Select</application>; see <xref linkend="install"/>. Note, however, that this tutorial is designed to work without actually installing <application>Select</application>, so it is not necessary to run <userinput>make install</userinput>.</para>
<para>The rest of this tutorial assumes that you are in the directory <filename>tutorial</filename>.
<screen><prompt>$</prompt> <userinput>cd tutorial</userinput></screen></para>
<para>You also need some data to use for training and testing. This tutorial uses a subset of the <ulink url="http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.html">20-Newsgroups</ulink> collection, which can be downloaded from <ulink url="http://kdd.ics.uci.edu/databases/20newsgroups/mini_newsgroups.tar.gz">here</ulink> (1.8MB). Unpack the file like this:
<screen><prompt>$</prompt> <userinput>gzip -cd mini_newsgroups.tar.gz | tar xf -</userinput></screen></para>
</sect1>

<sect1>
<title>Running selectd</title>
<para><application>Select</application> uses a separate process to do all classification. This process is started by running <application>selectd</application>, the Select daemon. <application>selectd</application> is configured using a configuration file. By default this file is assumed to be called <filename>selectd.conf</filename> and to be located in the directory <filename>.select</filename> in your home directory. The <option>-f</option> switch can be used to select another configuration file. See <xref linkend="selectd"/> for more information.
<screen><prompt>$</prompt> <userinput>../selectd -f selectd.conf &amp;</userinput></screen></para>
</sect1>

<sect1>
<title>Running select_test</title>
<para>Run <application>select_test</application> and redirect its output into a file.
<screen><prompt>$</prompt> <userinput>../select_test -f select_test.conf &gt; TEST</userinput></screen></para>
<para>When <application>select_test</application> has finished, you can kill <application>selectd</application>, using the process ID that <command>ps</command> reports on your system.
<screen><prompt>$</prompt> <userinput>ps | grep selectd</userinput>
 4711 pts/0    00:00:10 selectd
<prompt>$</prompt> <userinput>kill 4711</userinput></screen></para>
<para>Now view the last lines of the file. You can do this with the command <command>tail</command>:
<screen><prompt>$</prompt> <userinput>tail -n 12 TEST</userinput>
<computeroutput>## Classifier 0 (plotted)
##   Correct: 1383  Top-3-Correct: 1750  Covered: 1999  Total: 2000
##   Accuracy: 0.692  Top-3-Accuracy: 0.875  Coverage: 1.000
##   Total-Accuracy: 0.692  Total-Top-3-Accuracy: 0.875
## Classifier 1
##   Correct: 129  Top-3-Correct: 129  Covered: 169  Total: 2000
##   Accuracy: 0.763  Top-3-Accuracy: 0.763  Coverage: 0.085
##   Total-Accuracy: 0.065  Total-Top-3-Accuracy: 0.065
## Classifier 2
##   Correct: 178  Top-3-Correct: 178  Covered: 200  Total: 2000
##   Accuracy: 0.890  Top-3-Accuracy: 0.890  Coverage: 0.100
##   Total-Accuracy: 0.089  Total-Top-3-Accuracy: 0.089</computeroutput></screen></para>

<sect2>
<title>Making a Plot</title>
<para>If you have <application>gnuplot</application> installed, you can make a plot of the test run.
<screen><prompt>$</prompt> <userinput>gnuplot</userinput>
<prompt>gnuplot&gt;</prompt> <userinput>plot "TEST" notitle with lines</userinput></screen>
<figure><title>Test plot</title><mediaobject><imageobject><imagedata fileref="test.png" format="PNG"/></imageobject><textobject><phrase>Test plot</phrase></textobject></mediaobject></figure>
The plot shows how the classification performance varies over time.</para>
</sect2>
</sect1>
</chapter>
