http:^^www.cs.wisc.edu^~olvi^uwmp^msmt.html

Date: Tue, 05 Nov 1996 20:59:49 GMT
Server: NCSA/1.5
Content-type: text/html
Last-modified: Thu, 06 Jul 1995 16:04:38 GMT
Content-length: 9460

<HTML>
<HEAD><TITLE>Multisurface Method Tree with MATLAB</TITLE></HEAD>
<BODY>
<HR>
<H1>Multisurface Method Tree with MATLAB</H1>
<HR>
<H1>Brief Overview of the MSM-T Algorithm</H1>
<P>Let <B>A</B> and <B>B</B> be finite, disjoint point sets in <EM>n</EM>-dimensional Euclidean space, represented by the <EM>m</EM> x <EM>n</EM> and <EM>k</EM> x <EM>n</EM> matrices <EM>A</EM> and <EM>B</EM>, respectively. The MSM-T algorithm generates a decision tree representing the planes needed to separate the sets <B>A</B> and <B>B</B>. Each non-leaf node in the tree is a plane that further separates <B>A</B> and <B>B</B>. Each leaf node contains points of either <B>A</B> or <B>B</B> exclusively (or to within a prescribed tolerance). For a detailed description of the MSM-T algorithm, see:
<DL>
  <DT> <B>O. L. Mangasarian.</B>
  <DD> Mathematical Programming in Neural Networks. <EM>ORSA Journal on Computing,</EM> Vol. 5, No. 4, Fall 1993, pages 349-360.
</DL>
<HR>
<H2>Table of Contents</H2>
<UL>
  <LI> <A HREF="#tree_gen">Generation of the Decision Tree</A>
  <LI> <A HREF="#disp_tree">Displaying the Decision Tree</A>
  <LI> <A HREF="#prune">Pruning</A>
  <LI> <A HREF="#cross_val">Cross-Validation</A>
  <LI> <A HREF="#storage">Storage of Decision Trees</A>
  <LI> <A HREF="http://www.cs.wisc.edu/~olvi/uwmp/msmt_ex.html">Examples</A>
</UL>
<HR>
<A NAME="tree_gen"><H1>Generation of the Decision Tree</H1></A>
<P>The MATLAB representation of the matrices <EM>A</EM> and <EM>B</EM> (from now on denoted by A and B) must be placed in the MATLAB environment.
This can be done either by entering them by hand or by placing them in an M-file and loading the M-file into the MATLAB environment.
<P>The decision tree is represented as a matrix in the MATLAB environment. To generate this matrix, call (in the MATLAB environment):
<P>T = msmt_tree(A,B,max_depth,tolerance,certainty_factor,min_points)
<P>In the above expression the various symbols are defined as follows:
<UL>
  <LI> A, B: MATLAB representation of the matrices <EM>A</EM> and <EM>B</EM>.
  <LI> max_depth: maximum allowable depth of the decision tree (must be greater than or equal to 1). If this argument is not given, max_depth defaults to a very large positive integer.
  <LI> tolerance: fraction of allowable error in a leaf node (must be between 0.0 and 1.0). If this argument is not given, tolerance defaults to 0.0.
  <LI> certainty_factor: used in the C4.5 pruning algorithm (see <A HREF="#prune"><B>Pruning</B></A>). If this argument is not given, or certainty_factor = 0.0, the tree is not pruned by this algorithm.
  <LI> min_points: used in the point pruning algorithm (see <A HREF="#prune"><B>Pruning</B></A>). If this argument is not given, or min_points = 0, the tree is not pruned by this algorithm.
</UL>
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/msmt_tree.m">msmt_tree.m</A> file.
<HR>
<A NAME="disp_tree"><H1>Displaying the Decision Tree</H1></A>
<P>The decision tree generated by the call above can be displayed graphically by calling the following routine (within the MATLAB environment):
<P>disp_tree(T,A,B)
<P>where:
<UL>
  <LI> T: matrix representing the decision tree in the MATLAB environment.
  <LI> A: matrix representing the point set <B>A</B> in the MATLAB environment.
  <LI> B: matrix representing the point set <B>B</B> in the MATLAB environment.
</UL>
<P>The following is an example of the graphical representation of a decision tree, generated from the Wisconsin Breast Cancer data.
<IMG SRC="http://www.cs.wisc.edu/~olvi/uwmp/tree.gif">
<P>Each node in the tree is numbered. In the MATLAB environment, the following information is provided:
<UL>
  <LI> For each non-leaf node:
  <UL>
    <LI> Equation of the plane, given as: <EM>wx = theta</EM>.
    <LI> Number of points of set <B>A</B> at this node.
    <LI> Number of points of set <B>B</B> at this node.
  </UL>
  <LI> For each leaf node:
  <UL>
    <LI> Identification that the node is a leaf node.
    <LI> Number of points of set <B>A</B> at this node.
    <LI> Number of points of set <B>B</B> at this node.
  </UL>
</UL>
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/disp_tree.m">disp_tree.m</A> file.
<HR>
<A NAME="prune"><H1>Pruning</H1></A>
<P>Pruning removes potentially unnecessary subtrees from the decision tree. This MATLAB implementation allows for pruning using two different algorithms: (1) error-based pruning from C4.5: Programs for Machine Learning, and (2) the minimum misclassified points algorithm.
<H3>Error-Based Pruning</H3>
<P>To prune the given decision tree using the error-based pruning algorithm (outlined in C4.5: Programs for Machine Learning), call (in the MATLAB environment):
<P>T = prune_tree_C45(T,A,B,certainty_factor)
<P>where:
<UL>
  <LI> T: matrix representing the decision tree in the MATLAB environment.
  <LI> A, B: MATLAB representation of matrices <EM>A</EM> and <EM>B</EM>.
  <LI> certainty_factor: real number between 0.0 and 1.0, inclusive. Smaller values of certainty_factor result in more pruning; larger values result in less.
NOTE: Suggested value for certainty_factor is 0.25.
</UL>
<P>The decision tree may also be pruned by this algorithm at generation time by giving a value for certainty_factor in the call:
<P>T = msmt_tree(A,B,max_depth,tolerance,certainty_factor,min_points)
<P>For a detailed description of the pruning algorithm, see:
<DL>
  <DT> <B>J. Ross Quinlan.</B>
  <DD> C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, California.
</DL>
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/prune_tree_C45.m">prune_tree_C45.m</A> file.
<P>
<H3>Minimum Misclassified Points Pruning</H3>
<P>The minimum misclassified points algorithm works as follows. An integer number of allowable misclassified points is given. If a plane generated to split a node has fewer than this number of misclassified points, the decision node is made into a leaf node. If the plane has more than this number of misclassified points, the decision node remains. This pruning algorithm can be called (in the MATLAB environment) by:
<P>T = prune_tree_points(T,A,B,min_points)
<P>where:
<UL>
  <LI> T: matrix representing the decision tree in the MATLAB environment.
  <LI> A, B: MATLAB representation of matrices <EM>A</EM> and <EM>B</EM>.
  <LI> min_points: the minimum allowable number of misclassified points at a decision node.
</UL>
<P>The decision tree may also be pruned by this algorithm at generation time by giving a value for min_points in the call:
<P>T = msmt_tree(A,B,max_depth,tolerance,certainty_factor,min_points)
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/prune_tree_points.m">prune_tree_points.m</A> file.
<P>
<HR>
<A NAME="cross_val"><H1>Cross-Validation</H1></A>
<P>The performance of the MSM-T algorithm on a given data set may be tested by cross-validation.
The cross-validation procedure works as follows. The sets <B>A</B> and <B>B</B> are divided equally into a given number of groups (num_groups); for each group, a decision tree is constructed using the remaining (num_groups - 1) groups. The tree is then tested using the group set aside. The algorithm returns percent (the percentage of points correctly classified by the MSM-T algorithm), confusion_matrix (see the <A HREF="http://www.cs.wisc.edu/~olvi/uwmp/msmt/cross_val.m">cross_val.m</A> file), and ave_planes (the average number of planes needed in the separation).
The cross-validation procedure can be initiated by making the following call (in the MATLAB environment):
<P>[percent,confusion_matrix,ave_planes] = cross_val(A,B,num_groups,max_depth,tolerance,certainty_factor,min_points)
<P>where:
<UL>
  <LI> A, B: MATLAB representation of matrices <EM>A</EM> and <EM>B</EM>.
  <LI> num_groups: number of groups to use in cross-validation.
  <LI> max_depth, tolerance, certainty_factor, min_points: see <A HREF="#tree_gen"><B>Generation of the Decision Tree</B></A>.
</UL>
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/cross_val.m">cross_val.m</A> file.
<HR>
<A NAME="storage"><H1>Storage of Decision Trees</H1></A>
<P>A decision tree may be written to any specified file using the following call (in the MATLAB environment):
<P>msmt_write_file(T,'filename')
<P>where:
<UL>
  <LI> T: matrix representing the decision tree in the MATLAB environment.
  <LI> filename: name of the file to be written (NOTE: this name must be enclosed in single quotes).
</UL>
<P>Once a tree has been written, it can be read back into the MATLAB environment with the following call:
<P>T = msmt_read_file('filename')
<P>where:
<UL>
  <LI> filename: name of the file to read (NOTE: this name must be enclosed in single quotes).
  <LI> T: the decision tree is placed in the MATLAB environment as matrix T.
</UL>
<P>See the <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/msmt_write_file.m">msmt_write_file.m</A> and <A HREF="file://www.cs.wisc.edu/~olvi/uwmp//afs/cs.wisc.edu/u/p/a/paulb/public/msmt/msmt_read_file.m">msmt_read_file.m</A> files.
<HR>
Last modified: Thu Jul  6 11:04:38 1995 by Paul Bradley
<ADDRESS>
  <A HREF="http://www.cs.wisc.edu/~paulb/paulb.html">paulb@cs.wisc.edu</A>
</ADDRESS>
</BODY>
</HTML>
