<p class="Blurb">Decision tree</p>
<pre>
class CvDTree : public CvStatModel
{
public:
    CvDTree();
    virtual ~CvDTree();

    virtual bool train( const CvMat* _train_data, int _tflag,
                        const CvMat* _responses, const CvMat* _var_idx=0,
                        const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                        const CvMat* _missing_mask=0,
                        CvDTreeParams params=CvDTreeParams() );

    virtual bool train( CvDTreeTrainData* _train_data, const CvMat* _subsample_idx );

    virtual CvDTreeNode* predict( const CvMat* _sample, const CvMat* _missing_data_mask=0,
                                  bool raw_mode=false ) const;

    virtual const CvMat* get_var_importance();
    virtual void clear();

    virtual void read( CvFileStorage* fs, CvFileNode* node );
    virtual void write( CvFileStorage* fs, const char* name );

    // special read & write methods for trees in the tree ensembles
    virtual void read( CvFileStorage* fs, CvFileNode* node,
                       CvDTreeTrainData* data );
    virtual void write( CvFileStorage* fs );

    const CvDTreeNode* get_root() const;
    int get_pruned_tree_idx() const;
    CvDTreeTrainData* get_data();

protected:

    virtual bool do_train( const CvMat* _subsample_idx );

    virtual void try_split_node( CvDTreeNode* n );
    virtual void split_node_data( CvDTreeNode* n );
    virtual CvDTreeSplit* find_best_split( CvDTreeNode* n );
    virtual CvDTreeSplit* find_split_ord_class( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_cat_class( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_ord_reg( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_split_cat_reg( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_surrogate_split_ord( CvDTreeNode* n, int vi );
    virtual CvDTreeSplit* find_surrogate_split_cat( CvDTreeNode* n, int vi );
    virtual double calc_node_dir( CvDTreeNode* node );
    virtual void complete_node_dir( CvDTreeNode* node );
    virtual void cluster_categories( const int* vectors, int vector_count,
                                     int var_count, int* sums, int k, int* cluster_labels );

    virtual void calc_node_value( CvDTreeNode* node );

    virtual void prune_cv();
    virtual double update_tree_rnc( int T, int fold );
    virtual int cut_tree( int T, int fold, double min_alpha );
    virtual void free_prune_data(bool cut_tree);
    virtual void free_tree();

    virtual void write_node( CvFileStorage* fs, CvDTreeNode* node );
    virtual void write_split( CvFileStorage* fs, CvDTreeSplit* split );
    virtual CvDTreeNode* read_node( CvFileStorage* fs, CvFileNode* node, CvDTreeNode* parent );
    virtual CvDTreeSplit* read_split( CvFileStorage* fs, CvFileNode* node );
    virtual void write_tree_nodes( CvFileStorage* fs );
    virtual void read_tree_nodes( CvFileStorage* fs, CvFileNode* node );

    CvDTreeNode* root;
    int pruned_tree_idx;
    CvMat* var_importance;
    CvDTreeTrainData* data;
};
</pre>
<hr><h3><a name="decl_CvDTree_train">CvDTree::train</a></h3>
<p class="Blurb">Trains decision tree</p>
<pre>
bool CvDTree::train( const CvMat* _train_data, int _tflag,
                     const CvMat* _responses, const CvMat* _var_idx=0,
                     const CvMat* _sample_idx=0, const CvMat* _var_type=0,
                     const CvMat* _missing_mask=0,
                     CvDTreeParams params=CvDTreeParams() );

bool CvDTree::train( CvDTreeTrainData* _train_data, const CvMat* _subsample_idx );
</pre><p>
There are two <code>train</code> methods in <code>CvDTree</code>.<p>
The first method follows the generic <a href="#decl_CvStatModel_train">CvStatModel::train</a>
conventions and is the most complete form. Both data layouts
(<code>_tflag=CV_ROW_SAMPLE</code> and <code>_tflag=CV_COL_SAMPLE</code>) are supported,
as well as sample and variable subsets, missing measurements, arbitrary combinations
of input and output variable types, etc. The last parameter contains all
the necessary training parameters; see the <a href="#decl_CvDTreeParams">CvDTreeParams</a> description.
<p>The second method <code>train</code> is mostly used for building tree ensembles.
It takes the pre-constructed <a href="#decl_CvDTreeTrainData">CvDTreeTrainData</a> instance and
the optional subset of the training set. The indices in <code>_subsample_idx</code> are counted
relative to the <code>_sample_idx</code> passed to the <code>CvDTreeTrainData</code> constructor.
For example, if <code>_sample_idx=[1, 5, 7, 100]</code>, then
<code>_subsample_idx=[0,3]</code> means that the samples <code>[1, 100]</code> of the original
training set are used.
</p>
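<p>Below is a minimal sketch of the first <code>train</code> form. The data matrices are
hypothetical and assumed to be filled by the caller; all optional index and mask arguments
are left at their default 0 values except <code>var_type</code>:</p>
<pre>
// Hypothetical training set: 100 samples (rows) with 10 ordered features each
// and one categorical response per sample.
CvMat* train_data = cvCreateMat( 100, 10, CV_32FC1 );
CvMat* responses  = cvCreateMat( 100, 1, CV_32SC1 );
// ... fill train_data and responses here ...

// var_type has (number_of_variables + 1) elements; mark all inputs as ordered
// and the response (the last element) as categorical.
CvMat* var_type = cvCreateMat( train_data->cols + 1, 1, CV_8UC1 );
cvSet( var_type, cvScalarAll(CV_VAR_ORDERED) );
CV_MAT_ELEM( *var_type, uchar, train_data->cols, 0 ) = CV_VAR_CATEGORICAL;

CvDTree dtree;
dtree.train( train_data, CV_ROW_SAMPLE, responses,
             0, 0, var_type, 0,
             CvDTreeParams( 8,     // max_depth
                            10,    // min_sample_count
                            0,     // regression_accuracy (ignored for classification)
                            true,  // use_surrogates (needed to handle missing data)
                            15,    // max_categories
                            10,    // cv_folds (cross-validation pruning)
                            true,  // use_1se_rule
                            true,  // truncate_pruned_tree
                            0 ));  // priors (0 = equal class priors)
</pre>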
<hr><h3><a name="decl_CvDTree_predict">CvDTree::predict</a></h3>
<p class="Blurb">Returns the leaf node of decision tree corresponding to the input vector</p>
<pre>
CvDTreeNode* CvDTree::predict( const CvMat* _sample, const CvMat* _missing_data_mask=0,
                               bool raw_mode=false ) const;
</pre>
<p>The method takes the feature vector and the optional missing measurement mask on input,
traverses the decision tree and returns the reached leaf node on output.
The prediction result, either the class label or the estimated function value, may
be retrieved as the <code>value</code> field of the <a href="#decl_CvDTreeNode">CvDTreeNode</a>
structure, for example: <code>dtree->predict(sample,mask)->value</code>.</p>
<p>The last parameter is normally set to <code>false</code>, which implies a regular input.
If it is <code>true</code>, the method assumes that all the values of the discrete input variables
have already been normalized to the <code>0..<num_of_categories<sub>i</sub>>-1</code> ranges
(as the decision tree uses such a normalized representation internally). This is useful for faster
prediction with tree ensembles. For ordered input variables the flag is not used.
</p>
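<p>A short usage sketch, assuming a trained <code>CvDTree</code> instance <code>dtree</code> and
a sample row with the same number and order of features as the training data:</p>
<pre>
CvMat* sample = cvCreateMat( 1, 10, CV_32FC1 );
// ... fill sample here ...

// Predict without a missing-data mask; the returned leaf node carries the result.
double prediction = dtree.predict( sample )->value;  // class label (or value for regression)

// Predict with a mask of missing measurements (non-zero element = value is missing).
CvMat* missing = cvCreateMat( 1, 10, CV_8UC1 );
cvZero( missing );
CV_MAT_ELEM( *missing, uchar, 0, 3 ) = 1;            // pretend the 4th feature was not measured
double prediction2 = dtree.predict( sample, missing )->value;
</pre>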
<hr>
<h3>Example. Building Tree for Classifying Mushrooms</h3>
<p>See <a href="../../samples/c/mushroom.cpp">mushroom.cpp</a> sample that
demonstrates how to build and use the decision tree.</p>
<!-- *****************************************************************************************
*****************************************************************************************
***************************************************************************************** -->
<hr><h2><a name="ch_boosting">Boosting</a></h2>
<!--Boosting works by sequentially applying a classification algorithm to reweighted versions
of the training data, and then taking a weighted majority vote of the sequence of
classifiers thus produced.-->
<p>
A common machine learning task is supervised learning of the following kind:
Predict the output <var>y</var> for an unseen input sample <var>x</var>, given a training set
consisting of inputs and their desired outputs. In other words, the goal is to learn
the functional relationship <var>F</var>: <var>y</var> = <var>F</var>(<var>x</var>)
between input <var>x</var> and output <var>y</var>.
Predicting qualitative output is called classification, while predicting
quantitative output is called regression.</p>
<p>
Boosting is a powerful learning concept that provides a solution
to the supervised classification learning task.
It combines the performance of many "weak" classifiers
to produce a powerful 'committee' [<a href="#HTF01">HTF01</a>].
A weak classifier is only required to be better
than chance, and thus can be very simple and computationally inexpensive.
Many of them, smartly combined, however, result in a strong classifier,
which often outperforms most 'monolithic' strong classifiers such as SVMs and Neural Networks.
</p>
<p>
Decision trees are the most popular weak classifiers used in boosting schemes.
Often the simplest decision trees with only a single split node per tree (called stumps)
are sufficient.</p>
<p>
Learning of the boosted model is based on <var>N</var> training examples
{(<var>x<sub>i</sub></var>,<var>y<sub>i</sub></var>)}<span class=subs>1</span><span class=sups><var>N</var></span>
with <var>x<sub>i</sub></var> ∈ <var>R<sup>K</sup></var> and
<var>y<sub>i</sub></var> ∈ {−1, +1}.
<var>x<sub>i</sub></var> is a <var>K</var>-component vector.
Each component encodes a feature relevant for the learning task at hand.
The desired two-class output is encoded as −1 and +1.
</p>
<p>
Different variants of boosting are known such as Discrete Adaboost, Real AdaBoost, LogitBoost,
and Gentle AdaBoost [<a href="#FHT98">FHT98</a>].
All of them are very similar in their overall structure.
Therefore, we will look only at the standard two-class Discrete AdaBoost algorithm
as shown in the box below.
Each sample is initially assigned the same weight (step 2).
Next a weak classifier <var>f<sub>m</sub></var>(<var>x</var>)
is trained on the weighted training data (step 3a).
Its weighted training error and scaling factor <var>c<sub>m</sub></var> are computed (step 3b).
The weights are increased for the training samples that have been misclassified (step 3c).
All weights are then normalized, and the process of finding the next weak classifier continues
for another <var>M</var>−1 times. The final classifier <var>F</var>(<var>x</var>) is the sign
of the weighted sum over the individual weak classifiers (step 4). A small code sketch of this
training loop is given right after the box.</p>
<div class=alg>
<div class=box>
<ol>
<li>Given <var>N</var> examples
{(<var>x<sub>i</sub></var>,<var>y<sub>i</sub></var>)}<span class=subs>1</span><span class=sups><var>N</var></span>
with <var>x<sub>i</sub></var> ∈ <var>R<sup>K</sup></var>,
<var>y<sub>i</sub></var> ∈ {−1, +1}.
</li>
<li>Start with weights <var>w<sub>i</sub></var> = 1/<var>N</var>,
<var>i</var> = 1,…,<var>N</var>.
</li>
<li>Repeat for <var>m</var> = 1,2,…,<var>M</var>:
<ol>
<li>Fit the classifier <var>f<sub>m</sub></var>(<var>x</var>) ∈ {−1,1},
using weights <var>w<sub>i</sub></var> on the training data.
</li>
<li>Compute err<var><sub>m</sub></var> = <var>E<sub>w</sub></var>
[1<sub>(<var>y</var> ≠ <var>f<sub>m</sub></var>(<var>x</var>))</sub>],
<var>c<sub>m</sub></var> = log((1 − err<var><sub>m</sub></var>)/err<var><sub>m</sub></var>).
</li>
<li>Set <var>w<sub>i</sub></var> ← <var>w<sub>i</sub></var>
exp[<var>c<sub>m</sub></var>
1<sub>(<var>y<sub>i</sub></var> ≠ <var>f<sub>m</sub></var>(<var>x<sub>i</sub></var>))</sub>],
<var>i</var> = 1,2,…,<var>N</var>,
and renormalize so that ∑ <var><span class=subs>i</span> w<sub>i</sub></var> = 1.
</li>
</ol>
</li>
<li>Output the classifier sign[∑ <span class=subs><var>m</var> = 1</span><span class=sups><var>M</var></span>
<var>c<sub>m</sub> f<sub>m</sub></var>(<var>x</var>)].
</li>
</ol>
</div>
<div class=cap>
Two-class Discrete AdaBoost Algorithm: Training (steps 1 to 3)
and Evaluation (step 4)
</div>
</div>
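<p>The following is a minimal sketch of steps 2 to 4 of the box in plain C++, independent of the
<code>CvBoost</code> class itself. The weak learner is abstracted as a hypothetical
<code>Stump</code> type with <code>train_stump</code>/<code>stump_predict</code> helpers
(these are not part of the library); standard math headers and error handling are omitted:</p>
<pre>
// Discrete AdaBoost (sketch): y[i] in {-1,+1}, N samples, M weak classifiers.
double* w = new double[N];
double* c = new double[M];
Stump*  f = new Stump[M];

for( int i = 0; i < N; i++ )
    w[i] = 1./N;                                   // step 2: uniform initial weights

for( int m = 0; m < M; m++ )
{
    f[m] = train_stump( x, y, w, N );              // step 3a: fit weak classifier on weighted data

    double err = 0;                                // step 3b: weighted training error and c_m
    for( int i = 0; i < N; i++ )
        if( stump_predict( f[m], x[i] ) != y[i] )
            err += w[i];
    c[m] = log( (1 - err)/err );

    double sum = 0;                                // step 3c: increase the weights of
    for( int i = 0; i < N; i++ )                   // misclassified samples ...
    {
        if( stump_predict( f[m], x[i] ) != y[i] )
            w[i] *= exp( c[m] );
        sum += w[i];
    }
    for( int i = 0; i < N; i++ )                   // ... and renormalize so they sum to 1
        w[i] /= sum;
}

// step 4: evaluate the committee sign[sum_m c_m*f_m(x)] on a new sample x_new
double s = 0;
for( int m = 0; m < M; m++ )
    s += c[m]*stump_predict( f[m], x_new );
int label = s >= 0 ? +1 : -1;
</pre>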
<p><b>NOTE. </b>Like the classical boosting methods, the current implementation
supports 2-class classifiers only. For M>2 classes there is the <em>AdaBoost.MH</em> algorithm,
described in [<a href="#FHT98">FHT98</a>], which reduces the problem to the 2-class case,
at the cost of a much larger training set.</p>
<p>In order to reduce the computation time for boosted models without substantially losing
accuracy, the influence trimming technique may be employed.
As the training algorithm proceeds and the number of trees in the ensemble grows,
a larger number of the training samples are classified correctly
and with increasing confidence, and thus those samples receive
smaller weights on the subsequent iterations. Examples with a very low relative weight have
little impact on the training of the weak classifier, so such examples may be excluded during
the weak classifier training without much effect on the induced classifier. This process is
controlled via the <span class=par>weight_trim_rate</span> parameter:
only the examples whose summary weight amounts to the fraction <span class=par>weight_trim_rate</span>
of the total weight mass are used in the weak classifier training.
Note that the weights for <em>all</em> training examples are recomputed at each
training iteration, so examples excluded at a particular iteration may
be used again when training subsequent weak classifiers
[<a href="#FHT98">FHT98</a>].</p>
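<p>A minimal sketch of how such a trimming threshold can be applied, assuming the per-sample
weights <code>w[0..N-1]</code> are already normalized to sum to 1. The helper below is purely
illustrative and is not part of the <code>CvBoost</code> API; standard library headers are
omitted:</p>
<pre>
// Select the samples used to train the next weak classifier: keep the most
// influential samples, whose weights add up to (at least) weight_trim_rate of
// the total weight mass, and skip the rest.
void trim_by_weight( const double* w, int N, double weight_trim_rate, bool* use_sample )
{
    // Copy and sort the weights in ascending order.
    double* sorted_w = new double[N];
    memcpy( sorted_w, w, N*sizeof(w[0]) );
    std::sort( sorted_w, sorted_w + N );

    // Walk from the smallest weights up: everything strictly below the found
    // threshold contributes less than (1 - weight_trim_rate) of the mass.
    double tail = 0, threshold = 0;
    for( int i = 0; i < N; i++ )
    {
        tail += sorted_w[i];
        if( tail > 1 - weight_trim_rate )
        {
            threshold = sorted_w[i];
            break;
        }
    }

    for( int i = 0; i < N; i++ )
        use_sample[i] = w[i] >= threshold;

    delete[] sorted_w;
}
</pre>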
<b>
<p><a name="HTF01">[HTF01]</a> Hastie, T., Tibshirani, R., Friedman, J. H.
The Elements of Statistical Learning:
Data Mining, Inference, and Prediction. Springer Series in Statistics. 2001.</p>
<p><a name="FHT98">[FHT98]</a> Friedman, J. H., Hastie, T. and Tibshirani, R. Additive Logistic
Regression: a Statistical View of Boosting. Technical Report, Dept. of Statistics, Stanford
University, 1998.</p></b>
<hr><h3><a name="decl_CvBoostParams">CvBoostParams</a></h3>
<p class="Blurb">Boosting training parameters</p>
<pre>
struct CvBoostParams : public CvDTreeParams
{
int boost_type;
int weak_count;
int split_criteria;
double weight_trim_rate;
CvBoostParams();
CvBoostParams( int boost_type, int weak_count, double weight_trim_rate,
int max_depth, bool use_surrogates, const float* priors );
};
</pre>
<p><dl>
<dt>boost_type<dd>Boosting type, one of the following:<br>
<code>CvBoost::DISCRETE</code> - Discrete AdaBoost<br>