OUTPUT_FILE     me.output
LOG_FILE        me.log
ESTIMATION_ALGORITHM    GIS
NUM_ITERATIONS  200
NUM_NEWTON_ITERATIONS   20
REPORT_INTERVAL 1
PRECISION       6
</pre></p>

<p>The following options are available.  Running amis with the "--help"
option prints the full description of the available options.
<table border>
<tr><th>Option<th>Default<th>Valid values<th>Effect</tr>
<tr><td>DATA_FORMAT<td>Amis<td>Amis AmisTree AmisFix
    <td>Specify the data file format</tr>
<tr><td>FEATURE_TYPE<td>binary<td>binary integer real
    <td>Specify the type of features</tr>
<tr><td>MODEL_FILE<td>amis.model<td>file names
    <td>Specify the names of model files</tr>
<tr><td>EVENT_FILE<td>amis.event<td>file names
    <td>Specify the names of event files</tr>
<tr><td>OUTPUT_FILE<td>amis.output<td>file name
    <td>Specify the name of an output file</tr>
<tr><td>LOG_FILE<td>amis.log<td>file name
    <td>Specify the name of a log file</tr>
<tr><td>ESTIMATION_ALGORITHM<td>IIS<td>IIS GIS BFGS GISMAP BFGSMAP
    <td>Specify the algorithm for parameter estimation</tr>
<tr><td>MAP_SIGMA<td>1.0<td>real value
    <td>Specify the deviation of the Gaussian prior distribution used in MAP
    estimation</tr>
<tr><td>FEATURE_COUNT_HASH<td>FALSE<td>TRUE FALSE
    <td>Specify the class ("map" or "vector") used for the factoring
    optimization of IIS algorithms.  By default, "vector" is
    used.</tr>
<tr><td>NUM_ITERATIONS<td>200<td>positive value
    <td>Specify the number of iterations</tr>
<tr><td>MEMORY_SIZE<td>5<td>positive value
    <td>Specify the memory size for limited-memory BFGS</tr>
<tr><td>REPORT_INTERVAL<td>1<td>positive value
    <td>Specify the interval at which the progress of the computation is reported</tr>
<tr><td>PRECISION<td>6<td>positive value
    <td>Specify the number of significant digits</tr>
<tr><td>EVENT_ON_FILE<td>FALSE<td>TRUE FALSE
    <td>Specify whether the event data is kept in memory (default) or
    on a file.
    This option is useful when the event data is too large
    to fit in the available memory.</tr>
<tr><td>EVENT_ON_FILE_NAME<td>amis.event.tmp<td>file name
    <td>Specify the file in which event data is stored.  Effective only when
    the EVENT_ON_FILE option is enabled</tr>
</table>
Options specified in the configuration file can be overridden by
startup options.</p>

<h4><a name="amis_model_file">Model file</a></h4>

<p>A model file gives a set of feature functions and their corresponding
initial parameters.  See the following example.</p>
<pre>
feature1    1.0
feature2    2.0
feature3    0.3
...
</pre>

<p>Each line corresponds to one feature.  First, specify the name of
the feature.  Feature names may contain any characters except
spaces, tabs, colons (:), and pound signs (#).  Next, after one or more spaces or
tabs, specify the initial parameter of the feature.
Initial values are written as C-style floating point values and
can be any positive value (usually 1.0).  Each line may contain only two tokens,
the feature name and the parameter.</p>

<h4><a name="amis_event_file">Event file</a></h4>

<p>An event file gives events as training data.  Specifically, you specify the
activated features for an observed event and its complement events.
Here, an observed event is an event observed in the training data.  A
complement event is an event obtained by substituting the target event of
the observed event.  As described in <a
href="#introduction">Introduction</a>, probabilities must be normalized
over all events that can be observed under the same
history event as the observed event.  Such events are given as
complement events.  To sum up, each event is represented by an
observed event and the set of events observable under the same history
event.</p>

<p>See the following example.
<pre>
event_1
1    feature1:2 feature3
0    feature1
0    feature2:3

event_2
0    feature2 feature3:5
1    feature1
0    feature3
...
</pre></p>

<p>Each block separated by blank lines corresponds to one event.
In the first line, specify the name of the event.  You can use any
characters except the special characters mentioned above.  The other lines
represent an observed event or complement events.  At the beginning of
each line, specify the number of times the event was observed.  For an
observed event, it must be a positive value, and for a
complement event it must be zero.  When ambiguous
events are enabled, you can specify positive real values.  Otherwise, the number
of observations must be an integer and only one observed event
is permitted per event description.  Next, enumerate the activated features
of the event.  Each feature must be defined in the model file;
specifying a feature that is not found in the model file is an error.
The value of a feature function can be specified after the feature
name.  As in the above example, write the feature value after a
colon (:).  When omitted, the value is 1.</p>

<p>Event descriptions are separated by blank lines.  Note that a line
containing only comments is also treated as a blank line.</p>

<h3><a name="amistree_input">Input (AmisTree format)</a></h3>

<p>When you use an estimation algorithm for feature forests, the event
file must be written in AmisTree format.  Configuration and model
files are the same as in <a href="#amis_input">Amis format</a>.  This
section describes the format of event files.</p>

<p>An event file in AmisTree format looks like the following.
<pre>
event_1  2
feature1:2 feature2:3 feature3
{ dnode_1 ( node_1 feature1:2 { dnode_2 ( node_2 feature2:3 ) ( node_3 ) } { dnode_3 $node_2 ( node_4 feature3 ) } ) }

event_2  1
feature2:3
{ dnode_1 ( node_1 feature1 ) ( node_2 { dnode_2 ( node_3 feature2:3 ) ( node_4 feature3 ) } ) }
...
</pre></p>
Note: The third line of each event description (the feature forest)
must be written on a single line, however long it is.

<p>As in the Amis format, blank lines separate event descriptions.
In the AmisTree format, an event is represented with three lines.
The first line specifies the name of the event and the number of times
the event was observed.  In the above example, event_1 is observed twice,
and event_2 once.  In the second line, enumerate the activated
features of the observed event.  As in the Amis format, specify the
name of each feature together with its value.  The third line represents
the observed event and its complement events as a feature forest.
Disjunctive nodes are written with curly braces.  Between the
curly braces, the name of the node comes first, followed by its conjunctive
nodes.  Conjunctive nodes are written with round brackets.
Between the round brackets, the name of the node comes first, followed by
its activated features.  Feature descriptions are the same as in Amis
format.  Disjunctive nodes can also appear as daughter nodes.
Node names are used to represent structure sharing.  A node that has
already appeared can be referred to by "$" followed by the node name.  In event_1 in
the above example, $node_2 shares the node_2 that has already
appeared.  Sharing nodes packs the feature forest into a smaller structure,
which reduces the computational complexity and speeds up the
computation.  Node names and feature names may contain any characters
except the special characters.</p>

<p>Do not forget to put spaces before and after curly braces and round
brackets.  Without spaces, they will be treated as a part of node
names or feature names.</p>

<h3><a name="amis_output">Output</a></h3>

<p>The output format is common to the Amis and AmisTree formats.</p>

<p>Amis outputs pairs of feature and parameter.  The output format is the
same as <a href="#model_file">a model file</a>, so the output file
can be reused as the input of a new computation.
That is, parameter estimation can be continued
from already estimated parameters.</p>

<p>The parameters <span class=math>a_i</span> corresponding to the feature
functions are output.  The probability of an unknown
event is computed from the product of <span class=math>a_i</span> over all activated
features, normalized over the events that share the same history.</p>

<hr>

<h2><a name="example">Example</a></h2>

<p>This section describes how to use the estimator through a simple example.
Other examples can be found in the "test/" directory.  Suppose we
are building a maximum entropy model for POS tagging.  For example, for the
sentence "I like handball", we can assign a POS tag to each word, as in
"I/Noun like/Verb handball/Noun".  We treat the tagging task as a
sequence of <em>events</em>, each of which is the task of assigning one
of the tags to a word given the previous word and tag.  For example, when
we are looking at "like", we select a tag (<em>target</em>) for "like"
under the context (<em>history</em>) of having "I/Noun" as the previous
word.  In the Amis format, tagging events are expressed as follows
("BOS" stands for "Beginning Of Sentence").
<pre>
event_BOB/BOS-I/Noun
1  BOS/BOS-I/Noun */*-I/Noun */*-*/Noun
0  */*-*/Verb
0  */*-*/Prep
0  */*-*/Modif

event_I/Noun-like/Verb
0  I/Noun-like/Noun */Noun-like/Noun */*-like/Noun */*-*/Noun
1  I/Noun-like/Verb */Noun-like/Verb */*-like/Verb */*-*/Verb
0  */Noun-like/Prep */*-like/Prep */*-*/Prep
0  */*-*/Modif

event_like/Verb-handball/Noun
...
</pre>

<p>Each block separated by a blank line corresponds to one event.  The
first line of each block is the name of the event, which in fact has no
effect on the estimator (it is only for human readability).  The other lines
describe the features activated for the event.  Each line corresponds
to one target: in this example, Noun, Verb, Prep, and Modif.  Every
event description must contain one line for each possible target under
the history.  The beginning of each line indicates whether the event is
"observed" or not.
If the pair of target and history corresponding to
the line is observed in the training data, the number is positive.
If it is not observed, for example "assign Verb for I", it is 0.

<p>The model file will be as follows.
<pre>
BOS/BOS-I/Noun  1.0
*/*-I/Noun      1.0
*/*-*/Noun      1.0
*/*-*/Verb      1.0
*/*-*/Prep      1.0
*/*-*/Modif     1.0
...
</pre>

<p>Running the estimator produces an output file like this:
<pre>
BOS/BOS-I/Noun  8.03
*/*-I/Noun      1.45
*/*-*/Noun      0.84
*/*-*/Verb      0.72
*/*-*/Prep      0.54
*/*-*/Modif     0.48
...
</pre>
Using these parameter values, we can compute the probability of an
event.  For example, the unnormalized score for the event "assign Noun for I
under the context BOS/BOS" is
<blockquote>
<div class=math>q(Noun|BOS/BOS-I/*)</div>
<div class=math>= prod( a_i^f_i(Noun|BOS/BOS-I/*) )</div>
<div class=math>= 8.03 * 1.45 * 0.84</div>
<div class=math>= 9.78</div>
</blockquote>
Similarly, <span class=math>q(Verb|...)=0.72, q(Prep|...)=0.54,
q(Modif|...)=0.48</span>.  Hence, the probability of the event of
assigning Noun is
<blockquote>
<div class=math>p(Noun|BOS/BOS-I/*)</div>
<div class=math>= q(Noun|BOS/BOS-I/*) / Z_y</div>
<div class=math>= 9.78 / (9.78+0.72+0.54+0.48)</div>
<div class=math>= 0.849</div>
</blockquote>

<hr>

<h2><a name="internal">Internal specs</a></h2>
Coming soon...

<hr>

<h2><a name="others">Misc. (restrictions and known bugs)</a></h2>
<ul>
  <li>Model files and event files should not contain features that do
  not appear in any observed event.  Such features have no
  impact on the model and inherently have an alpha value of 1.0.  Amis
  ignores such features, but the estimation becomes slower.
  <li>The tractable number of events and features depends on the available
  system resources (memory and hard disk).
  <li>When memory is insufficient, try the EVENT_ON_FILE option.
  <li>If the number of targets is fixed (as in the tagger example),
  try the AmisFix format.
  The required memory will be reduced.
</ul>

<hr>

<h2><a name="references">References</a></h2>

<p><a name="Berger1996">
[1] Adam L. Berger, Stephen A. Della Pietra and Vincent J. Della
Pietra.  A maximum entropy approach to natural language processing.
Computational Linguistics, 22(1):39-71, 1996.
</a>

<p><a name="Miyao2002">
[2] Yusuke Miyao and Jun'ichi Tsujii.  Maximum entropy estimation for
feature forests.  In Proc. HLT2002.
</a>

<p><a name="Pietra1997">
[3] Stephen A. Della Pietra, Vincent J. Della Pietra and John
Lafferty.  Inducing features of random fields.  IEEE Transactions on
Pattern Analysis and Machine Intelligence, 19(4):380-393, 1997.
</a>

<p><a name="Nocedal1980">
[4] Jorge Nocedal.  Updating quasi-Newton matrices with limited
storage.  Mathematics of Computation, 35:773-783, 1980.
</a>

</body>
</html>