<DT><B>-debug</B><I> level</I>
<DD>Set debugging output from the estimated LM at <I>level</I>.
Level 0 means no debugging.
Debugging messages are written to stderr.
<DT><B>-gt<I>n</I>min</B><I> count</I>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Set the minimal count of N-grams of order <I>n</I> that will be included in the LM.
All N-grams with frequency lower than that will effectively be discounted to 0.
If <I>n</I> is omitted the parameter for N-grams of order > 9 is set.
<BR>
NOTE: This option affects not only the default Good-Turing discounting
but the alternative discounting methods described below as well.
<DT><B>-gt<I>n</I>max</B><I> count</I>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Set the maximal count of N-grams of order <I>n</I> that are discounted under Good-Turing.
All N-grams more frequent than that will receive maximum likelihood estimates.
Discounting can be effectively disabled by setting this to 0.
If <I>n</I> is omitted the parameter for N-grams of order > 9 is set.
</DD>
</DL>
<P>
In the following discounting parameter options, the order <I>n</I> may be omitted,
in which case a default for all N-gram orders is set.
The corresponding discounting method then becomes the default method for all orders,
unless specifically overridden by an option with <I>n</I>.
If no discounting method is specified, Good-Turing is used.
<DL>
<DT><B>-gt<I>n</I></B><I> gtfile</I>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Save or retrieve Good-Turing parameters (cutoffs and discounting factors) in/from <I>gtfile</I>.
This is useful as GT parameters should always be determined from unlimited vocabulary counts,
whereas the eventual LM may use a limited vocabulary.
The parameter files may also be hand-edited.
If an <B>-lm</B> option is specified the GT parameters are read from <I>gtfile</I>,
otherwise they are computed from the current counts and saved in <I>gtfile</I>.
<DT><B>-cdiscount<I>n</I></B><I> discount</I>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Use Ney's absolute discounting for N-grams of order <I>n</I>,
using <I>discount</I> as the constant to subtract.
<DT><B>-wbdiscount<I>n</I></B>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Use Witten-Bell discounting for N-grams of order <I>n</I>.
(This is the estimator where the first occurrence of each word is
taken to be a sample for the ``unseen'' event.)
<DT><B>-ndiscount<I>n</I></B>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Use Ristad's natural discounting law for N-grams of order <I>n</I>.
<DT><B>-kndiscount<I>n</I></B>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Use Chen and Goodman's modified Kneser-Ney discounting for N-grams of order <I>n</I>.
<DT><B>-kn-counts-modified</B>
<DD>Indicates that input counts have already been modified for Kneser-Ney smoothing.
If this option is not given, the KN discounting method modifies counts
(except those of highest order) in order to estimate the backoff distributions.
When using the <B>-write</B> and related options the output will reflect the modified counts.
<DT><B>-kn-modify-counts-at-end</B>
<DD>Modify Kneser-Ney counts after estimating discounting constants,
rather than before, which is the default.
<DT><B>-kn<I>n</I></B><I> knfile</I>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Save or retrieve Kneser-Ney parameters (cutoff and discounting constants) in/from <I>knfile</I>.
This is useful as smoothing parameters should always be determined from unlimited vocabulary counts,
whereas the eventual LM may use a limited vocabulary.
The parameter files may also be hand-edited.
If an <B>-lm</B> option is specified the KN parameters are read from <I>knfile</I>,
otherwise they are computed from the current counts and saved in <I>knfile</I>.
<DT><B>-ukndiscount<I>n</I></B>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Use the original (unmodified) Kneser-Ney discounting method for N-grams of order <I>n</I>.
</DD>
</DL>
<P>
In the above discounting options, if the parameter <I>n</I> is omitted the option sets
the default discounting method for all N-grams of length greater than 9.
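<P>
As a rough illustration of how the per-order cutoff and discounting options combine,
the following invocation is a minimal sketch only: the file names <I>corpus.txt</I> and
<I>corpus.3bo.lm</I> are placeholders, and <B>-text</B>, <B>-order</B>, and <B>-lm</B> are the
standard ngram-count input/output options documented in the earlier part of this page
(not repeated here).
<PRE>
# Estimate a trigram backoff LM from raw text, using modified Kneser-Ney
# discounting for trigrams, Witten-Bell for unigrams and bigrams, and
# dropping trigrams seen fewer than 2 times.
ngram-count -order 3 -text corpus.txt \
        -gt3min 2 \
        -wbdiscount1 -wbdiscount2 -kndiscount3 \
        -lm corpus.3bo.lm
</PRE>
Omitting the order suffix (e.g., plain <B>-kndiscount</B>) would instead make that method
the default for all N-gram orders, as described above.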
<DL>
<DT><B>-interpolate<I>n</I></B>
<DD>where <I>n</I> is 1, 2, 3, 4, 5, 6, 7, 8, or 9.
Causes the discounted N-gram probability estimates at the specified order <I>n</I>
to be interpolated with lower-order estimates.
(The result of the interpolation is encoded as a standard backoff model
and can be evaluated as such -- the interpolation happens at estimation time.)
This sometimes yields better models with some smoothing methods
(see Chen & Goodman, 1998).
Only Witten-Bell, absolute discounting, and modified Kneser-Ney smoothing
currently support interpolation.
<DT><B>-meta-tag</B><I> string</I>
<DD>Interpret words starting with <I>string</I> as count-of-count (meta-count) tags.
For example, an N-gram
<BR>
a b <I>string</I>3 4
<BR>
means that there were 4 trigrams starting with "a b" that occurred 3 times each.
Meta-tags are only allowed in the last position of an N-gram.
<BR>
Note: when using <B>-tolower</B> the meta-tag <I>string</I> must not contain any uppercase characters.
<DT><B>-read-with-mincounts</B>
<DD>Save memory by eliminating N-grams with counts that fall below the thresholds set by
<B>-gt</B><I>N</I><B>min</B> options during the <B>-read</B> operation
(this assumes the input counts contain no duplicate N-grams).
Also, if <B>-meta-tag</B> is defined, these low-count N-grams will be converted to
count-of-count N-grams, so that smoothing methods that need this information still work correctly.
</DD>
</DL>
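<P>
The following two-step sketch shows one way to combine <B>-read-with-mincounts</B> and
<B>-meta-tag</B> with the Kneser-Ney parameter files described above, following the
recommendation under BUGS below that discounting parameters be estimated from full counts.
It is illustrative only: the file names <I>full.counts</I>, <I>kn1.params</I>, <I>kn2.params</I>,
<I>kn3.params</I>, <I>big.3bo.lm</I>, and the tag string <I>__meta__</I> are arbitrary placeholders,
and in practice the helper scripts in
<A HREF="training-scripts.html">training-scripts(1)</A>
automate this kind of workflow.
<PRE>
# Step 1: estimate Kneser-Ney discounting parameters from the full
# (unpruned) counts and save them to parameter files.  No -lm option is
# given, so the -knN files are written rather than read.
ngram-count -order 3 -read full.counts \
        -kndiscount -kn1 kn1.params -kn2 kn2.params -kn3 kn3.params

# Step 2: build the LM, discarding low-count N-grams at read time while
# preserving count-of-count information in meta-tags.  Because -lm is
# given, the discounting parameters are read back from the files saved
# in step 1 rather than re-estimated from the pruned counts.
ngram-count -order 3 -read full.counts \
        -read-with-mincounts -gt2min 2 -gt3min 2 \
        -meta-tag __meta__ \
        -kndiscount -interpolate \
        -kn1 kn1.params -kn2 kn2.params -kn3 kn3.params \
        -lm big.3bo.lm
</PRE>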
<H2> SEE ALSO </H2>
<A HREF="ngram-merge.html">ngram-merge(1)</A>,
<A HREF="ngram.html">ngram(1)</A>,
<A HREF="ngram-class.html">ngram-class(1)</A>,
<A HREF="training-scripts.html">training-scripts(1)</A>,
<A HREF="lm-scripts.html">lm-scripts(1)</A>,
<A HREF="ngram-format.html">ngram-format(5)</A>.
<BR>
S. F. Chen and J. Goodman, ``An Empirical Study of Smoothing Techniques for Language Modeling,''
TR-10-98, Computer Science Group, Harvard Univ., 1998.
<BR>
S. M. Katz, ``Estimation of Probabilities from Sparse Data for the Language Model Component of a
Speech Recognizer,'' <I>IEEE Trans. ASSP</I> 35(3), 400-401, 1987.
<BR>
R. Kneser and H. Ney, ``Improved backing-off for M-gram language modeling,''
<I>Proc. ICASSP</I>, 181-184, 1995.
<BR>
H. Ney and U. Essen, ``On Smoothing Techniques for Bigram-based Natural Language Modelling,''
<I>Proc. ICASSP</I>, 825-828, 1991.
<BR>
E. S. Ristad, ``A Natural Law of Succession,'' CS-TR-495-95, Comp. Sci. Dept., Princeton Univ., 1995.
<BR>
I. H. Witten and T. C. Bell, ``The Zero-Frequency Problem: Estimating the Probabilities of Novel
Events in Adaptive Text Compression,'' <I>IEEE Trans. Information Theory</I> 37(4), 1085-1094, 1991.
<H2> BUGS </H2>
Several of the LM types supported by
<A HREF="ngram.html">ngram(1)</A>
don't have explicit support in <B>ngram-count</B>.
Instead, they are built by separately manipulating N-gram counts,
followed by standard N-gram model estimation.
<BR>
LM support for tagged words is incomplete.
<BR>
Only absolute and Witten-Bell discounting currently support fractional counts.
<BR>
The combination of <B>-read-with-mincounts</B> and <B>-meta-tag</B> preserves enough
count-of-count information for <I>applying</I> discounting parameters to the input counts,
but it does not necessarily allow the parameters to be correctly <I>estimated</I>.
Therefore, discounting parameters should always be estimated from full counts
(e.g., using the helper <A HREF="training-scripts.html">training-scripts(1)</A>),
and then read from files.
<H2> AUTHOR </H2>
Andreas Stolcke &lt;stolcke@speech.sri.com&gt;.
<BR>
Copyright 1995-2006 SRI International
</BODY>
</HTML>