<area shape="rect" coords="30,334,187,378" href="#detail-eval_prediction"><area shape="rect" coords="296,110,415,153" href="#detail-read_struct_model"><area shape="rect" coords="284,175,429,219" href="#detail-read_struct_examples"><area shape="rect" coords="275,293,437,337" href="#detail-print_struct_testing_stats"></map><img src="testing-tree.gif" alt="Flow Chart of the Classification Program" width="442" height="381" align="right" usemap="#classificationmap"><p>Pictured is a diagram illustrating the flow of execution within <code>svm_python_classify</code>. The color coding of the boxes is the same as that in the high level description of the <a href="#learning">learning program</a>.</p><p>The <code>svm_python_classify</code> program first checks whether the command line arguments are valid, and exits if they are not. Otherwise, the indicated Python module is loaded. Then the learned model is read, and the testing pattern-label example pairs are loaded from the indicated example file. The program then iterates over all the testing examples: it classifies each example, writes the label to a file, finds the loss of the example, and may evaluate the prediction and accumulate statistics. Once every example has been processed, some summary statistics are printed out and the program exits.</p><a class="bookmark" name="objects"><h2>Objects</h2></a><p>The functions a user writes for the Python module will accept some objects as arguments, and return other objects. These objects correspond more or less to structures in C code: their intended use is that they only contain data. Though knowledge of SVM<sup><i>struct</i></sup>'s peculiarities is not strictly required to know how to use SVM<sup><i>python</i></sup>, attention was given to make SVM<sup><i>python</i></sup> resemble SVM<sup><i>struct</i></sup> to as great a degree as seemed sensible, including the names of functions and how different types of objects are structured.</p><p>In this section we go over the types of these objects that a user needs to be aware of in order to interface successfully with SVM<sup><i>python</i></sup>. Note that if you change a value in a Python object, the change does not copy over to the corresponding C structure, except in the case where you initialize <code>size_psi</code>, and during classification where you read the model and synchronize the Python object to the C structures. This disparity between the two may change in future releases, provided the performance hit for copying everything over is not too great.</p><h3>Structure Model (sm)</h3><img src="object-sm.gif" alt="Diagram Showing SM" width="152" height="360" align="left"><p>Many of the module functions get the structure model as input. In the documentation, the structure model argument is called <code>sm</code> in a function's argument list. This type corresponds to the C data type <code>STRUCTMODEL</code> that is passed into many functions. In nearly every case, the only attributes you need to know about are the <font color="red">red ones</font>, but we describe the others as well.</p><p>The <font color="red">red attributes</font> correspond to those that appear within a <code>STRUCTMODEL</code> C structure. If we are learning or classifying with a linear kernel, <code>w</code> is the linear weight vector of length <code>size_psi+1</code>, indexed from 1 through <code>size_psi</code> inclusive. <code>size_psi</code> contains the maximum feature index for our examples, which in the linear case is also equal to the number of weights we are learning.</p>
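<p>For example, here is a minimal sketch of scoring a sparse example against <code>sm.w</code>, assuming a linear kernel; the <code>linear_score</code> helper is purely illustrative, and the example is represented as a word sequence (described under <i>Word Sequences</i> below):</p><pre>
# A hypothetical helper: score a sparse example against the linear
# weight vector sm.w.  Feature indices count from 1, which matches
# w's 1-through-size_psi indexing directly.
def linear_score(words, sm):
    # words is a sequence of (index, value) tuples, e.g. [(1, 2.3), (5, -6.1)]
    return sum(value * sm.w[index] for index, value in words)
</pre>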
<p>The <font color="green">green attributes</font> correspond to those that appear within a <code>STRUCTMODEL</code> C structure's <code>svm_model</code> field. <code>sv_num</code> holds the number of support vectors plus one. <code>supvec</code> is a sequence of document objects (described later) encoding the support vectors, while <code>alpha</code> holds the multiplier associated with each support vector, where entry <code>alpha[i]</code> corresponds to entry <code>supvec[i-1]</code>. The <code>b</code> parameter is the bias (threshold) term used by the <code>svmlight.classify_example</code> function. I am less familiar with the role the rest of these play in SVM<sup><i>python</i></sup>'s learning model, as many of them never seem to be set to anything but a default value, but they are copied to the structure model anyway.</p><p>The <font color="blue">blue attributes</font> correspond to those that appear within a <code>STRUCTMODEL</code> C structure's <code>svm_model.kernel_parm</code> field, holding attributes relating to the kernel. The <code>kernel_type</code> parameter is an integer holding the type of kernel: linear (0), polynomial (1), RBF (2), sigmoid (3), or user defined (4). For the polynomial kernel, <code>coef_lin</code> and <code>coef_const</code> hold the coefficient for the inner product of the two vectors and the constant term, while <code>poly_degree</code> holds the degree to which the sum of the inner product and constant term is raised. For the RBF kernel, <code>rbf_gamma</code> holds the gamma parameter. The <code>custom</code> parameter is a string holding information that may be of use to a user defined kernel.</p><p>Finally, the <code>cobj</code> attribute holds the C <code>STRUCTMODEL</code> structure corresponding to the Python structure model object. It is of no use within Python itself, and exists in the event that you call some function of the <code>svmlight</code> package that requires a structure model.</p><p>Note that, while learning, anything you store in the structure model will eventually be written out to the model so it can be restored in the classifier, excepting entries that are deleted or overwritten. So, if you want to pass any information from the learner to the classifier, store it in the structure model. For example, if you at some point set <code>sm.foo = 10</code> while learning, then during classification <code>sm.foo</code> will evaluate to the integer 10.</p><p>The Python code never needs to create structure model objects.</p>
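<p>To make this persistence concrete, here is a minimal sketch; the <code>vocabulary</code> attribute is purely hypothetical:</p><pre>
# In the module's learning-side code: stash extra data on the
# structure model.  It is written out along with the learned model.
sm.vocabulary = {'spam': 1, 'ham': 2}   # hypothetical attribute

# In the classification-side code: once the model has been read back
# in, the attribute is available again exactly as it was stored.
labels = sm.vocabulary
</pre>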
<h3>Structure Learning Parameters (sparm)</h3><img src="object-sparm.gif" alt="Diagram Showing Sparm" width="151" height="150" align="left"><p>Many of the module functions for learning get a structure learning parameter object, identified as <code>sparm</code> in a function's argument list, which holds many attributes related to structured learning.</p><p>Some attributes control how the program optimizes. Recall that the learning process adds a constraint only if the constraint is sufficiently violated; the <code>epsilon</code> attribute controls how much a constraint may be violated before it is added to the model. As learning proceeds, the quadratic program is not necessarily reoptimized after <em>every</em> constraint is added; it may wait until as many as <code>newconstretrain</code> constraints have been added before it reoptimizes.</p><p>For attributes relating directly to the quadratic program, the <code>C</code> attribute is the usual SVM regularization parameter that controls the tradeoff between low slack (high C) and a simple model (low C). The <code>slack_norm</code> is 1 or 2 depending on which norm is used on the slack vector in the quadratic program. The <code>loss_type</code> is an integer indicating whether loss is introduced into the constraints by rescaling the slack terms (slack rescaling, <code>loss_type=1</code>) or by rescaling the margin (margin rescaling, <code>loss_type=2</code>).</p><p>Other attributes are more for the benefit of the user code, including <code>loss_function</code>, an integer indicating which loss function to use. The <code>custom_argv</code> and <code>custom_argd</code> attributes hold the custom command line arguments. In SVM<sup><i>python</i></sup>, as in SVM<sup><i>struct</i></sup>, custom command line argument flags are prefixed with two dashes, while the universal command line argument flags are prefixed with one dash. <code>custom_argv</code> holds the list of all the custom arguments, while <code>custom_argd</code> is a dictionary mapping each "<code>--key</code>" argument to the "<code>value</code>" argument following it. For example, if the command line arguments "<code>--foo bar --biz bam</code>" are processed, <code>custom_argv</code> would hold the Python sequence <code>['--foo', 'bar', '--biz', 'bam']</code>, while <code>custom_argd</code> would hold the Python dictionary <code>{'foo':'bar', 'biz':'bam'}</code>.</p><p>The Python code never needs to create structure learning parameter objects.</p><h3>Word Sequences (words)</h3><p>In SVM<sup><i>light</i></sup> and SVM<sup><i>struct</i></sup>, the basic feature vector is represented as an array of <code>WORD</code> objects, each of which encodes a feature index (an integer counting from 1) and the feature value for that index (a floating point number). In the Python code of SVM<sup><i>python</i></sup>, the structure corresponding to these word arrays is a sequence of tuples. Each tuple has two elements: the first is the index of the feature, and the second is the value of the feature. So, the sequence <code>[(1,2.3), (5,-6.1), (8,0.5)]</code> has features 1, 5, and 8 with values 2.3, -6.1, and 0.5 respectively; all other features implicitly have value 0. Note that, as in SVM<sup><i>light</i></sup>, word arrays start counting feature indices from 1, and the features must be listed in increasing feature index order: if a tuple <var>(a,b)</var> occurs before a tuple <var>(c,d)</var>, it must be that <var>a < c</var>.</p>
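<p>As a minimal illustration of building such a sequence from a dense feature list (the dense values below are just those of the example above):</p><pre>
# Convert a dense feature list into a word sequence.  Indices count
# from 1, and enumerate() naturally yields them in increasing order.
dense = [2.3, 0.0, 0.0, 0.0, -6.1, 0.0, 0.0, 0.5]
words = [(i + 1, v) for i, v in enumerate(dense) if v != 0.0]
# words == [(1, 2.3), (5, -6.1), (8, 0.5)]
</pre>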
<h3>Support Vector (sv)</h3><img src="object-sv.gif" alt="Diagram Showing SV" width="152" height="80" align="left"><p>A support vector structure corresponds to the <code>SVECTOR</code> C structure, which holds information relevant to a support vector, but it is used more generally simply as a feature vector. The <code>words</code> attribute holds a word sequence, as described earlier, to encode the feature values. The <code>userdefined</code> attribute holds a string presumably relevant to user defined kernels, but in most cases it is the empty string. The <code>kernel_id</code> attribute is relevant to kernels, as only vectors with the same <code>kernel_id</code> have their kernel product taken. The <code>factor</code> attribute is the coefficient for this vector's term in the sum of kernel function evaluations.</p><p>The <code>SVECTOR</code> C structure also holds a <code>next</code> field, allowing for linked lists of vectors. To get this functionality in the Python code, whenever a support vector object is expected or asked for, you may instead pass in or return a sequence of support vector objects; accordingly, every structure described here as having an attribute that holds a support vector actually has an attribute that holds a sequence of support vectors.</p><p>You can create support vector objects with the <code>svmlight.create_svector</code> function. Support vectors are useful as arguments to the <code>svmlight.classify_example</code> function, are returned from the <code>psi</code> user function, and are contained within document objects, described below.</p><h3>Document (doc)</h3><img src="object-doc.gif" alt="Diagram Showing Doc" width="151" height="80" align="left"><p>A document structure corresponds to the <code>DOC</code> C structure, which holds information relevant to a document example in SVM<sup><i>light</i></sup>, but within SVM<sup><i>struct</i></sup> and SVM<sup><i>python</i></sup> is used for encoding constraints. The <code>fvec</code> attribute holds a sequence of support vector objects. The <code>costfactor</code> attribute indicates how important it is not to misclassify this example; I'm unclear on the importance of this attribute to SVM<sup><i>struct</i></sup>. The <code>slackid</code> attribute indicates which slack ID is associated with this constraint; if two constraints have the same slack ID, then they share the same slack variable. Finally, SVM<sup><i>struct</i></sup> appears to use <code>docnum</code> as the position of the constraint in the constraint set.</p><p>You can create document objects with the <code>svmlight.create_doc</code> function. Document objects appear in several places: the <code>init_struct_constraints</code> user function returns a list of them to encode initial constraints, the <code>sm.supvec</code> list consists of document objects, and <code>print_struct_learning_stats</code> receives an argument holding a list of constraints encoded as document objects.</p><h3>Patterns and Labels (x, y)</h3><p>In SVM<sup><i>struct</i></sup>'s C API, patterns and labels must be declared as structures. In SVM<sup><i>python</i></sup>, because patterns and labels only interact with the code in the Python module, the underlying code does not need to know anything about them, so they may be any Python objects. Their types do not have to be explicitly created, and they do not have to have any particular attributes beyond what is used by the user created Python module.</p>
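<p>For example, here is a minimal sketch of a <code>read_struct_examples</code> user function (the name is taken from the flow chart above; the file format, the exact signature, and the choice of a word sequence for <var>x</var> and an integer for <var>y</var> are illustrative assumptions only):</p><pre>
# Read pattern-label pairs from a file whose lines are assumed to
# look like:  label index:value index:value ...
def read_struct_examples(filename, sparm):
    examples = []
    for line in open(filename):
        tokens = line.split()
        y = int(tokens[0])                  # the label may be any Python object...
        x = [(int(i), float(v)) for i, v in
             (t.split(':') for t in tokens[1:])]   # ...and so may the pattern
        examples.append((x, y))
    return examples
</pre>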
<a class="bookmark" name="details"><h2>Details of User Functions</h2></a><p>In this part, a detailed description of each of the user functions is listed, along with what SVM<sup><i>python</i></sup> expects of each function.</p><dl> <dt><a class="bookmark" name="detail-classify_struct_example"><code><b>classify_struct_example</b></code></a>(<i>x, sm, sparm</i>)</dt><dd>Given a pattern <var>x</var>, return the predicted label.</dd> <dt><a class="bookmark" name="detail-eval_prediction"><code><b>eval_prediction</b></code></a>(<i>exnum, x, y, ypred, sm, sparm, teststats</i>)</dt><dd>Accumulate statistics about a single test example.<p>Allows accumulating statistics regarding how well the predicted label <var>ypred</var> for pattern <var>x</var> matches the true label <var>y</var>. The first time this function is called, <var>teststats</var> is <code>None</code>. This function's return value is passed along to the next call to <code>eval_prediction</code>. After all test predictions are made, the last value returned is passed along to <code>print_struct_testing_stats</code>.<p>If this function is not implemented, the default behavior is equivalent to initializing <var>teststats</var> as an empty list on the first example, then for each prediction appending the loss between <var>y</var> and <var>ypred</var> to <var>teststats</var> and returning it.</dd> <dt><a class="bookmark" name="detail-find_most_violated_constraint"><code><b>find_most_violated_constraint</b></code></a>(<i>x, y, sm, sparm</i>)</dt><dd>Return <var>ybar</var> associated with <var>x</var>'s most violated constraint.<p>Returns the label <var>ybar</var> for pattern <var>x</var> corresponding to the most violated constraint according to the SVM<sup><i>struct</i></sup> cost function. To find which cost function you should use, check <code>sparm.loss_type</code> for whether this is slack or margin rescaling (1 or 2 respectively), and check <code>sparm.slack_norm</code> for whether the slack vector is in an L1-norm or L2-norm in the QP (1 or 2 respectively). If there is no incorrect label, return <code>None</code>.<p>If this function is not implemented, it is equivalent to <code>classify_struct_example</code>(<i>x, sm, sparm</i>). The guarantees of optimality of Tsochantaridis et al. no longer hold, since this does not take the loss into account at all, but it is not always a terrible approximation; indeed, empirically speaking, on many clustering problems I have looked at it does not yield a statistically significant difference in performance on a test set.</dd> <dt><a class="bookmark" name="detail-init_struct_constraints"><code><b>init_struct_constraints</b></code></a>(<i>sample, sm, sparm</i>)</dt>