<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE> MATLAB Classification Wrapper 0.99 (Debug Version) </TITLE>
<META http-equiv=Content-Type content="text/html; charset=windows-1252">
<META content="MSHTML 6.00.2800.1400" name=GENERATOR>
<META content=8.0.3514 name=Version>
<META content=11/26/96 name=Date>
<META content="C:\Programme\Microsoft Office\Office\HTML.DOT"
name=Template></HEAD>
<BODY vLink=#800080 link=#0000ff bgColor=#ffffff>
<TABLE cellSpacing=0 cellPadding=5 border=0 width="916">
<TBODY>
<TR>
<TD vAlign=top width="9%">
<H2> </H2>
</TD>
<TD vAlign=top width="89%">
<H1 align=center>MATLABArsenal</H1>
<H1 align=center>A MATLAB Wrapper for Classification</H1>
<P align=center> <FONT color=#000000>Developed at: <BR>
</font><A href="http://www.informedia.cs.cmu.edu" target=_top>Informedia</A><br>
<A href="http://www.cs.cmu.edu" target=_top>School of Computer Science</A><br>
<A href="http://www.cmu.edu" target=_top>Carnegie Mellon University</A></P>
<FONT color=#000000>
<P align=center>Version: 0.99 Debug Version<BR>
Date: 05.03.2004</P>
</FONT></TD>
<TD vAlign=top width="2%">
<H2> </H2>
</TD></TR></TBODY></TABLE>
<H2>Overview</H2><FONT color=#000000>
<P>MATLABArsenal is an open-source classification wrapper written
in MATLAB. The main features of the program are the following: </P>
</FONT>
<UL>
<FONT color=#000000></font>
<LI>include many popular classification algorithms
<ul>
<li>(Transductive) Support Vector Machines (SVMs)
<ul>
<li>SVM_light, libSVM, mySVM</li>
</ul>
</li>
<li>Logistic Regression / Maximum Entropy</li>
<li>Linear/Quadratic Discriminant Analysis</li>
<li>Kernel Logistic Regression</li>
<li>Gaussian Mixture Models</li>
<li>k Nearest Neighbor </li>
<li>Multi-Layer Perceptron</li>
<li>Naive Bayes </li>
<li>Decision Tree </li>
<li>TO DO: Bayes Net </li>
<li>TO DO: HMM,MEMM, CRF</li>
</ul>
<li>incorporate several machine learning packages
<ul>
<li><a href="http://www.cs.waikato.ac.nz/ml/weka/">WEKA</a></li>
<li><a href="http://www.ncrg.aston.ac.uk/netlab/">NETLAB</a></li>
<li><a href="http://svmlight.joachims.org/%20">SVMLight</a>, <a href="http://www-ai.cs.uni-dortmund.de/SOFTWARE/MYSVM%20">mySVM</a>,
<a href="http://www.csie.ntu.edu.tw/%7Ecjlin/libsvm/%20">libSVM</a></li>
</ul>
</li>
<li>implement the following ensemble schemes
<ul>
<li>AdaBoostM1</li>
<li>Bagging</li>
<li>Up Sampling </li>
<li>Down Sampling</li>
<li>Stacking / Hierarchical Classification<br>
</li>
<li>Stacking with Feature Subspaces</li>
<li>Majority Voting</li>
<li>Multi-Class Classification Wrapper using Output Coding</li>
<li>Multi-Label Classification Wrapper</li>
</ul>
</li>
<li>evaluation methods
<ul>
<li>Cross Validation</li>
<li>Train Test Validation</li>
<li>Train Only</li>
<li>Test Only<br>
</li>
</ul>
</li>
<LI>support binary/multi-class active learning with best-worst case strategies
<LI>support full/sparse vector representation
<LI>allow feature selection using SVD or FLD
<LI>scalable to new learning algorithms
<LI>include learning algorithms with pairwise constraints / side information
<LI>support shot-level classification<br>
<LI>TO DO: visualization<br>
</UL>
<H2>Description</H2>
<P> <FONT color=#000000>MATLAB<i>Arsenal</i> is an open-source wrapper written
in MATLAB for the problem of supervised learning / classification. </FONT></P>
<H2>Source Code<FONT color=#000000> </FONT></H2>
<FONT color=#000000>
<P>The program is free for scientific use. Please contact me if you are planning
to use the software for commercial purposes. The software must not be modified
and distributed without prior permission of the author. If you use MATLAB<i>Arsenal</i>
in your scientific work, please cite as </P>
</FONT>
<UL>
<LI>N/A yet</LI>
</UL>
<p><a href="http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenal.zip">http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenal.zip</a>
</p>
<P>This wrapper is powered by several other machine learning packages. Their binaries
are already included in this release. If necessary, you can download their
latest versions from the following websites: </P>
<ul>
<li>N/A yet</li>
</ul>
<H2>Installation for MATLAB source code</H2>
<ol>
<li>You can skip this step if you already have a MATLAB environment; otherwise, you
must install MATLAB on your machine. <br>
<br>
</li>
<li>Download the package from <a href="http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenal.zip">http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenal.zip</a>
<br>
<br>
</li>
<li>Unzip the .zip file into an arbitrary directory, say $MATLABArsenalRoot<br>
<br>
</li>
<li>Add the path $MATLABArsenalRoot and its subfolders in MATLAB. Use the addpath
command or the menu File-&gt;Set Path. <br>
</li>
</ol>
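<p>Step 4 can be sketched as follows. This is a minimal sketch, not part of the package: the
install location <tt>C:\MATLABArsenal</tt> is a hypothetical placeholder for your own $MATLABArsenalRoot,
and the final check only confirms that a file named test_classify is visible on the MATLAB path.</p>
<pre>% Hypothetical install location; replace with your own $MATLABArsenalRoot.
arsenalRoot = 'C:\MATLABArsenal';

if exist(arsenalRoot, 'dir') == 7
    % genpath expands the root folder and all of its subfolders.
    addpath(genpath(arsenalRoot));
end

% exist(..., 'file') returns 2 when a matching file is on the path.
isInstalled = (exist('test_classify', 'file') == 2);
if isInstalled
    disp('test_classify found on the MATLAB path.');
else
    disp('test_classify not found; check arsenalRoot.');
end</pre>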
<p>Then it is ready to go.</p>
<p> </p>
<H2>Installation for binary code<br>
</H2>
<ol>
<li>You can skip this step if you already have a Java Runtime Environment; otherwise,
it is better to install Java on your machine (for WEKA). <br>
<br>
</li>
<li>Download the package from <a href="http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenalExec.zip">http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenalExec.zip</a>
<br>
<br>
</li>
<li>Unzip the .zip file into an arbitrary directory, say $MATLABArsenalRoot<br>
<br>
</li>
<li>Add the path $MATLABArsenalRoot/bin/win32 to the system path.<br>
<br>
</li>
<li>Run demo1.bat in $MATLABArsenalRoot<br>
</li>
</ol>
<p>Then it is ready to go.</p>
<p> </p>
<h2></h2>
<H2>How to use</H2>
<P> <FONT color=#000000>This section explains how to use the MATLAB<i>Arsenal</i>
software.</font></P>
<FONT color=#000000>
<P>The main module of MATLAB<i>Arsenal</i> is called "test_classify".
</P>
<P> It can be called in the following ways: </P>
<P>(1) At the MATLAB command line, type</P>
<pre> test_classify('classify -t input_file [options] [--@Evaluation [options]] ...
-- Classifier [param] [-- Classifiers]');
</pre>
For example, the following command uses SVM_LIGHT with an RBF kernel ('-Kernel 2')
to classify, where the kernel parameter is 0.01 and the cost factor is 3. The first 100 examples in TREC03_com.CNN.hstat1
are used as training data, and the rest are the testing data. The data features ('-n')
will be normalized to [0,1] before classification. <br>
<pre> test_classify(strcat( ...
'classify -t TREC03_com.CNN.hstat1 -n 1', ...
' -- train_test_validate -t 100 ', ...
' -- train_test_multiple_class ', ...
' -- SVM_LIGHT -Kernel 2 -KernelParam 0.01 -CostFactor 3'));
</pre>
<P>You might need to run </P>
<pre>clear global preprocess; </pre>
<P>before classification to clean the global variable "preprocess" </P>
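<P>The command string can also be assembled from variables with sprintf. The following is a
sketch that reuses only the options shown in the example above; the variable names are illustrative
and not part of the package. Note that strcat drops trailing white space from character-array
arguments, so with strcat each fragment should carry its separating space at the front rather than
at the end.</P>
<pre>% Illustrative variable names; the option names come from the example above.
inputFile   = 'TREC03_com.CNN.hstat1';
nTrain      = 100;    % first nTrain examples are used for training
kernelParam = 0.01;   % RBF kernel parameter
costFactor  = 3;

% Adjacent character arrays inside [ ] are concatenated as written.
cmd = sprintf(['classify -t %s -n 1' ...
               ' -- train_test_validate -t %d' ...
               ' -- train_test_multiple_class' ...
               ' -- SVM_LIGHT -Kernel 2 -KernelParam %g -CostFactor %g'], ...
               inputFile, nTrain, kernelParam, costFactor);
disp(cmd);

% Once the package is on the path, the string is passed on unchanged:
%   clear global preprocess;
%   test_classify(cmd);</pre>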
</FONT>
<dir>
<p></p>
</dir>
<FONT color=#000000>
<P>(2) Write a .m file in the current directory as follows, and run it. </P>
</FONT>
<pre> global preprocess; </pre>
<pre> %Normalize the data
preprocess.Normalization = 1;</pre>
<pre> %Evaluation Method
%0: Train-Test Split
%1: Cross Validation
preprocess.Evaluation = 0;</pre>
<pre> preprocess.TrainTestSplitBoundary = 100;
% Multi-class classification
% 0: Classification
% 1: Multi-class Classification Wrapper
% 2: Multi-label Classification Wrapper
% 3: Multi-class Active Learning Wrapper
preprocess.MultiClassType = 1;</pre>
<pre> preprocess.root = '.'; % Change to your files
preprocess.output_file = sprintf('%s/_Result', preprocess.root);
preprocess.input_file = sprintf('%s/TREC03_com.CNN.hstat1', preprocess.root);</pre>
<pre> run = test_classify('SVM_LIGHT -Kernel 2 -KernelParam 0.01 -CostFactor 3');</pre>
<p>This example gives the same results as above. </p>
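<p>Because the classifier is passed as a plain string in this approach, trying several settings is a
matter of generating several strings. The sketch below assumes that the global variable preprocess has
already been filled in as in the script above; the candidate values and variable names are
illustrative, and the structure of the value returned by test_classify is not assumed, so the results
are simply stored in a cell array.</p>
<pre>% Candidate RBF kernel parameters to try (illustrative values).
kernelParams   = [0.001 0.01 0.1];
classifierCmds = cell(size(kernelParams));
runs           = cell(size(kernelParams));

for k = 1:numel(kernelParams)
    classifierCmds{k} = sprintf( ...
        'SVM_LIGHT -Kernel 2 -KernelParam %g -CostFactor 3', kernelParams(k));
    % Only call the wrapper when it is actually on the MATLAB path.
    if exist('test_classify', 'file') == 2
        runs{k} = test_classify(classifierCmds{k});
    end
end</pre>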
<p><font color="#000000">(3) The binary version is run from the DOS command line.
Suppose the current directory is $MATLABArsenalRoot; type</font></p>
<pre><font color="#000000"> ./test_classify.exe "classify -t input_file [options] [--@Evaluation [options]] ...
-- Classifier [param] [-- Classifiers]"</font></pre>
<font color="#000000"></font>with the same parameters as before.
<p>More detailed documentation is available at <a href="http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenalDoc/">http://finalfantasyxi.inf.cs.cmu.edu/tmp/MATLABArsenalDoc/</a></p>
<FONT color=#000000> </FONT>
<h2></h2>
<h2>Input &amp; Output Formats</h2>
<FONT color=#000000>
<P>Available options are: (To be continued) </P>
</FONT>
<DIR><FONT color=#000000></font>
<PRE>Output options:
-o OUTPUT_FILE - The file name of the result output file
-of [{'a'};'w'] - Append to ('a') or overwrite ('w') the output file
Input options:
-if 0/1 - Use the first type or the second type of input formats</PRE>
</DIR>
<FONT
color=#000000>
<P>The input file <TT>example_file</TT> contains the training examples. Currently
two types of input formats are accepted. The first type is, </P>
</FONT>
<dir>
<p><font color="#000000"><tt>&lt;line&gt; .=. &lt;value&gt; &lt;value&gt; ...
&lt;value&gt; &lt;target&gt;<br>
&lt;target&gt; .=. &lt;integer&gt;</tt><br>
<tt>&lt;value&gt; .=. &lt;float&gt;</tt> </font></p>
</dir>
<p></p>
<FONT
color=#000000>
<P><font color="#000000">Sample input format (each line represents one training
example; the last number is the label and all others are the features):</font></P>
</FONT>
<pre><font color="#000000"> </font><font color="#000000">0, 0, 0.40, 0
0, 1, 0.10, 0
0, 0, 0.05, 1
0, 0, 0.10, 1
0, 0, 0.15, 0
0, 0, 0.05, 0</font></pre>
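<p>A file in this first format can be produced from a feature matrix and a label vector with a few
lines of MATLAB. This is a minimal sketch: the variable names and the output file name are
hypothetical, and the comma-plus-space separator simply mirrors the sample above.</p>
<pre>% Toy data: one row per example, one column per feature.
X = [0 0 0.40; 0 1 0.10; 0 0 0.05];
y = [0; 0; 1];                                  % integer labels

outFile = fullfile(tempdir, 'toy_dense.txt');   % hypothetical file name
fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing.', outFile);
end
for i = 1:size(X, 1)
    fprintf(fid, '%g, ', X(i, :));   % features; the format is reused per value
    fprintf(fid, '%d\n', y(i));      % the label is the last number on the line
end
fclose(fid);

% Read the first line back to check the layout.
fid = fopen(outFile, 'r');
firstLine = fgetl(fid);
fclose(fid);
disp(firstLine);</pre>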
<FONT
color=#000000>
<P>For the second type of input format, each of the following lines represents
one training example and is of the following format: </P>
<DIR>
<P> <TT>&lt;line&gt; .=. &lt;target&gt; &lt;feature&gt;:&lt;value&gt; &lt;feature&gt;:&lt;value&gt;
... &lt;feature&gt;:&lt;value&gt;<BR>
&lt;target&gt; .=. &lt;integer&gt;</tt> <BR>
<TT>&lt;feature&gt; .=. &lt;integer&gt; | "qid"</TT> <BR>
<TT>&lt;value&gt; .=. &lt;float&gt;</TT> </P>
</DIR>
<P><font color="#000000">Sample input format (each line represents one training
example; the first number is the label and all others are the features; zeros can be
omitted):</font></P>
</FONT>
<table width="49%" border="0">
<tr>
<td width="37%">
<pre><font color="#000000">0 1:0 2:0 3:0.40
0 1:0 2:1 3:0.10
1 1:0 2:0 3:0.05
1 1:0 2:0 3:0.10
0 1:0 2:0 3:0.15
0 1:0 2:0 3:0.05</font></pre>
</td>
<td width="23%">
<pre><b><font size="6">OR</font></b></pre>
</td>
<td width="40%">
<pre><font color="#000000">0 3:0.40
0 2:1 3:0.10
1 3:0.05
1 3:0.10
0 3:0.15
0 3:0.05</font></pre>
</td>
</tr>
</table>
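<p>The relationship between the two layouts above can be sketched in MATLAB: each row of a dense
feature matrix becomes the label followed by index:value pairs for the nonzero features only. The
variable names are illustrative and not part of the package.</p>
<pre>% Toy data: one row per example, one column per feature.
X = [0 0 0.40; 0 1 0.10; 0 0 0.05];
y = [0; 0; 1];

sparseLines = cell(size(X, 1), 1);
for i = 1:size(X, 1)
    idx = find(X(i, :));              % indices of the nonzero features
    if isempty(idx)
        featText = '';                % an all-zero row keeps only its label
    else
        pairs = [idx; X(i, idx)];     % 2-by-k, read column by column
        featText = sprintf(' %d:%g', pairs);
    end
    sparseLines{i} = sprintf('%d%s', y(i), featText);
end

for i = 1:numel(sparseLines)
    disp(sparseLines{i});
end</pre>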
<p><FONT
color=#000000>There are two major output files. By default, their names are set
to $(input_file).pred and $(input_file).result. The file $(input_file).pred
contains the prediction results for each test instance. The sample output format
is, </FONT> </p>
<table width="36%" border="1" height="163">
<tr>
<td>
<pre>Index</pre>
</td>
<td>
<pre>Prob</pre>
</td>
<td>
<pre>Pred</pre>
</td>
<td>
<pre>Truth</pre>
</td>
</tr>
<tr>
<td>
<pre>1</pre>
</td>
<td>
<pre>0.98</pre>
</td>
<td>
<pre>0</pre>
</td>