<DT><B>-mix-lambda3</B> <I>weight</I>
<DT><B>-mix-lambda4</B> <I>weight</I>
<DT><B>-mix-lambda5</B> <I>weight</I>
<DT><B>-mix-lambda6</B> <I>weight</I>
<DT><B>-mix-lambda7</B> <I>weight</I>
<DT><B>-mix-lambda8</B> <I>weight</I>
<DT><B>-mix-lambda9</B> <I>weight</I>
<DD>These are the weights for the additional mixture components, corresponding to
<B>-mix-lm2</B> through <B>-mix-lm9</B>.
The weight for the <B>-mix-lm</B> model is 1 minus the sum of
<B>-lambda</B> and <B>-mix-lambda2</B> through <B>-mix-lambda9</B>.
(See the example following this list.)
<DT><B>-loglinear-mix</B>
<DD>Implement a log-linear (rather than linear) mixture LM, using the parameters above.
<DT><B>-bayes</B> <I>length</I>
<DD>Interpolate the second and the main model using posterior probabilities
for local N-gram contexts of length <I>length</I>.
The <B>-lambda</B> value is used as a prior mixture weight in this case.
<DT><B>-bayes-scale</B> <I>scale</I>
<DD>Set the exponential scale factor on the context likelihood in conjunction
with the <B>-bayes</B> function.
Default value is 1.0.
<DT><B>-cache</B> <I>length</I>
<DD>Interpolate the main LM (or the one resulting from operations above) with
a unigram cache language model based on a history of <I>length</I> words.
<DT><B>-cache-lambda</B> <I>weight</I>
<DD>Set interpolation weight for the cache LM.
Default value is 0.05.
<DT><B>-dynamic</B>
<DD>Interpolate the main LM (or the one resulting from operations above) with
a dynamically changing LM.
LM changes are indicated by the tag ``&lt;LMstate&gt;'' starting a line in the
input to <B>-ppl</B>, <B>-counts</B>, or <B>-rescore</B>,
followed by a filename containing the new LM.
<DT><B>-dynamic-lambda</B> <I>weight</I>
<DD>Set interpolation weight for the dynamic LM.
Default value is 0.05.
<DT><B>-adapt-marginals</B> <I>LM</I>
<DD>Use an LM obtained by adapting the unigram marginals to the values specified
in the <I>LM</I> in <A HREF="ngram-format.html">ngram-format(5)</A>,
using the method described in Kneser et al. (1997).
The LM to be adapted is that constructed according to the other options.
<DT><B>-base-marginals</B> <I>LM</I>
<DD>Specify the baseline unigram marginals in a separate file <I>LM</I>,
which must be in <A HREF="ngram-format.html">ngram-format(5)</A> as well.
If not specified, the baseline marginals are taken from the model to be
adapted, but this might not be desirable, e.g., when Kneser-Ney smoothing
was used.
<DT><B>-adapt-marginals-beta</B> <I>B</I>
<DD>The exponential weight given to the ratio between adapted and baseline
marginals.
The default is 0.5.
<DT><B>-adapt-marginals-ratios</B>
<DD>Compute and output only the log ratio between the adapted and the baseline
LM probabilities.
These can be useful as a separate knowledge source in N-best rescoring.
</DD>
</DL>
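<P>
As an illustration of how these mixture weights combine, the following is a
hypothetical sketch of a static three-way interpolation (all LM and test-set
file names are placeholders), evaluated with the <B>-ppl</B> option described
further below:
<PRE>
# Hypothetical sketch; main.lm, second.lm, third.lm, and test.txt
# are placeholder file names.
ngram -order 3 -lm main.lm -lambda 0.5 \
      -mix-lm second.lm \
      -mix-lm2 third.lm -mix-lambda2 0.2 \
      -ppl test.txt
</PRE>
Here <B>main.lm</B> receives weight 0.5 (from <B>-lambda</B>),
<B>third.lm</B> receives 0.2 (from <B>-mix-lambda2</B>), and
<B>second.lm</B>, given by <B>-mix-lm</B>, receives the remaining
1 - 0.5 - 0.2 = 0.3.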
<P>
The following options specify the operations performed on/with the LM
constructed as per the options above.
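<P>
For instance, a common pattern (a hypothetical sketch; <B>big.lm</B> and
<B>pruned.lm</B> are placeholder file names) is to read a model, prune it,
and write the result back out, using the <B>-prune</B> and <B>-write-lm</B>
options described below:
<PRE>
# Hypothetical sketch: prune a trigram model and write it back out
# in ARPA format (see -prune and -write-lm below).
ngram -order 3 -lm big.lm -prune 1e-8 -write-lm pruned.lm
</PRE>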
<DL>
<DT><B>-renorm</B>
<DD>Renormalize the main model by recomputing backoff weights for the given
probabilities.
<DT><B>-prune</B> <I>threshold</I>
<DD>Prune N-gram probabilities if their removal causes (training set)
perplexity of the model to increase by less than <I>threshold</I> relative.
<DT><B>-prune-lowprobs</B>
<DD>Prune N-gram probabilities that are lower than the corresponding
backed-off estimates.
This generates N-gram models that can be correctly
converted into probabilistic finite-state networks.
<DT><B>-minprune</B> <I>n</I>
<DD>Only prune N-grams of length at least <I>n</I>.
The default (and minimum allowed value) is 2, i.e., only unigrams are excluded
from pruning.
This option applies to both <B>-prune</B> and <B>-prune-lowprobs</B>.
<DT><B>-rescore-ngram</B> <I>file</I>
<DD>Read an N-gram LM from <I>file</I>
and recompute its N-gram probabilities using the LM specified by the
other options; then renormalize and evaluate the resulting new N-gram LM.
<DT><B>-write-lm</B> <I>file</I>
<DD>Write a model back to <I>file</I>.
The output will be in the same format as read by <B>-lm</B>,
except if operations such as <B>-mix-lm</B> or <B>-expand-classes</B>
were applied, in which case the output will contain the generated
single N-gram backoff model in ARPA
<A HREF="ngram-format.html">ngram-format(5)</A>.
<DT><B>-write-bin-lm</B> <I>file</I>
<DD>Write a model to <I>file</I> using a binary data format.
This is only supported by certain model types, specifically, N-gram backoff models.
Binary model files should be recognized automatically by the
<B>-read</B> function.
<DT><B>-write-vocab</B> <I>file</I>
<DD>Write the LM's vocabulary to <I>file</I>.
<DT><B>-gen</B> <I>number</I>
<DD>Generate <I>number</I> random sentences from the LM.
<DT><B>-seed</B> <I>value</I>
<DD>Initialize the random number generator used for sentence generation
using seed <I>value</I>.
The default is to use a seed that should be close to unique for each
invocation of the program.
<DT><B>-ppl</B> <I>textfile</I>
<DD>Compute sentence scores (log probabilities) and perplexities from
the sentences in <I>textfile</I>,
which should contain one sentence per line.
The <B>-debug</B> option controls the level of detail printed, even though
output is to stdout (not stderr).
(See the examples following this list.)
<DT><B>-debug 0</B>
<DD>Only summary statistics for the entire corpus are printed,
as well as partial statistics for each input portion delimited by
escaped lines (see <B>-escape</B>).
These statistics include the number of sentences, words, out-of-vocabulary
words and zero-probability tokens in the input,
as well as its total log probability and perplexity.
Perplexity is given with two different normalizations: counting all
input tokens (``ppl'') and excluding end-of-sentence tags (``ppl1'').
<DT><B>-debug 1</B>
<DD>Statistics for individual sentences are printed.
<DT><B>-debug 2</B>
<DD>Probabilities for each word, plus LM-dependent details about backoff
used etc., are printed.
<DT><B>-debug 3</B>
<DD>Probabilities for all words are summed in each context, and
the sum is printed.
If this differs significantly from 1, a warning message
to stderr will be issued.
<DT><B>-nbest</B> <I>file</I>
<DD>Read an N-best list in
<A HREF="nbest-format.html">nbest-format(5)</A>
and rerank the hypotheses using the specified LM.
The reordered N-best list is written to stdout.
If the N-best list is given in ``NBestList1.0'' format and contains
composite acoustic/language model scores, then <B>-decipher-lm</B>
and the recognizer language model and word transition weights (see below)
need to be specified so the original acoustic scores can be recovered.
<DT><B>-nbest-files</B> <I>filelist</I>
<DD>Process multiple N-best lists whose filenames are listed in <I>filelist</I>.
<DT><B>-write-nbest-dir</B> <I>dir</I>
<DD>Deposit rescored N-best lists into directory <I>dir</I>,
using filenames derived from the input ones.
<DT><B>-decipher-nbest</B>
<DD>Output rescored N-best lists in Decipher 1.0 format, rather than SRILM format.
<DT><B>-no-reorder</B>
<DD>Output rescored N-best lists without sorting the hypotheses by their
new combined scores.
<DT><B>-split-multiwords</B>
<DD>Split multiwords into their components when reading N-best lists;
the rescored N-best lists thus no longer contain multiwords.
(Note this is different from the <B>-multiwords</B>
option, which leaves the input word stream unchanged and splits
multiwords only for the purpose of LM probability computation.)
<DT><B>-max-nbest</B> <I>n</I>
<DD>Limits the number of hypotheses read from an N-best list.
Only the first <I>n</I> hypotheses are processed.
<DT><B>-rescore</B> <I>file</I>
<DD>Similar to <B>-nbest</B>,
but the input is processed as a stream of N-best hypotheses (without header).
The output consists of the rescored hypotheses in
SRILM format (the third of the formats described in
<A HREF="nbest-format.html">nbest-format(5)</A>).
<DT><B>-decipher-lm</B> <I>model-file</I>
<DD>Designates the N-gram backoff model (typically a bigram) that was used by the
Decipher(TM) recognizer in computing composite scores for the hypotheses fed to
<B>-rescore</B> or <B>-nbest</B>.
Used to compute acoustic scores from the composite scores.
<DT><B>-decipher-order</B> <I>N</I>
<DD>Specifies the order of the Decipher N-gram model used (default is 2).
<DT><B>-decipher-nobackoff</B>
<DD>Indicates that the Decipher N-gram model does not contain backoff nodes,
i.e., all recognizer LM scores are correct up to rounding.
<DT><B>-decipher-lmw</B> <I>weight</I>
<DD>Specifies the language model weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
<DT><B>-decipher-wtw</B> <I>weight</I>
<DD>Specifies the word transition weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
<DT><B>-escape</B> <I>string</I>
<DD>Set an ``escape string'' for the
<B>-ppl</B>, <B>-counts</B>, and <B>-rescore</B> computations.
Input lines starting with <I>string</I>
are not processed as sentences and are instead passed unchanged to stdout.
This allows associated information to be passed to scoring scripts etc.
<DT><B>-counts</B> <I>countsfile</I>
<DD>Perform a computation similar to <B>-ppl</B>,
but based only on the N-gram counts found in <I>countsfile</I>.
Probabilities are computed for the last word of each N-gram, using the
other words as contexts, and scaling by the associated N-gram count.
Summary statistics are output at the end, as well as before each
escaped input line.
<DT><B>-count-order</B> <I>n</I>
<DD>Use only counts of order <I>n</I> in the <B>-counts</B> computation.
The default value is 0, meaning use all counts.
<DT><B>-counts-entropy</B>
<DD>Weight the log probabilities for <B>-counts</B>
processing by the joint probabilities of the N-grams.
This effectively computes the sum over p(w,h) log p(w|h),
i.e., the entropy of the model.
In debugging mode, both the conditional log probabilities and the
corresponding joint probabilities are output.
<DT><B>-skipoovs</B>
<DD>Instruct the LM to skip over contexts that contain out-of-vocabulary
words, instead of using a backoff strategy in these cases.
<DT><B>-noise</B> <I>noise-tag</I>
<DD>Designate <I>noise-tag</I>
as a vocabulary item that is to be ignored by the LM.
(This is typically used to identify a noise marker.)
Note that the LM specified by <B>-decipher-lm</B>
does NOT ignore this <I>noise-tag</I>,
since the DECIPHER recognizer treats noise as a regular word.
<DT><B>-noise-vocab</B> <I>file</I>
<DD>Read several noise tags from <I>file</I>,
instead of, or in addition to, the single noise tag specified by <B>-noise</B>.
<DT><B>-reverse</B>
<DD>Reverse the words in a sentence for LM scoring purposes.
(This assumes the LM used is a ``right-to-left'' model.)
Note that the LM specified by <B>-decipher-lm</B>
is always applied to the original, left-to-right word sequence.
</DD>
</DL>
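<P>
The following hypothetical sketches illustrate common combinations of the
options above (all file and directory names are placeholders): perplexity
evaluation, N-best rescoring with recovery of the Decipher acoustic scores,
and random sentence generation.
<PRE>
# Hypothetical sketches; all file and directory names are placeholders.

# Per-sentence perplexity on a test set (one sentence per line):
ngram -lm trained.lm -ppl test.txt -debug 1

# Rescore N-best lists with a larger LM, recovering acoustic scores
# from the recognizer's composite scores:
ngram -lm big.lm -decipher-lm bigram.lm \
      -decipher-lmw 8 -decipher-wtw 0 \
      -nbest-files lists.txt -write-nbest-dir rescored

# Generate 10 random sentences from the model:
ngram -lm trained.lm -gen 10 -seed 1
</PRE>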
<H2> SEE ALSO </H2>
<A HREF="ngram-count.html">ngram-count(1)</A>,
<A HREF="ngram-class.html">ngram-class(1)</A>,
<A HREF="lm-scripts.html">lm-scripts(1)</A>,
<A HREF="ppl-scripts.html">ppl-scripts(1)</A>,
<A HREF="pfsg-scripts.html">pfsg-scripts(1)</A>,
<A HREF="nbest-scripts.html">nbest-scripts(1)</A>,
<A HREF="ngram-format.html">ngram-format(5)</A>,
<A HREF="nbest-format.html">nbest-format(5)</A>,
<A HREF="classes-format.html">classes-format(5)</A>.
<BR>
J. A. Bilmes and K. Kirchhoff, ``Factored Language Models and Generalized
Parallel Backoff,'' <I>Proc. HLT-NAACL</I>, pp. 4-6, Edmonton, Alberta, 2003.
<BR>
S. F. Chen and J. Goodman, ``An Empirical Study of Smoothing Techniques for
Language Modeling,'' TR-10-98, Computer Science Group, Harvard Univ., 1998.
<BR>
K. Kirchhoff et al., ``Novel Speech Recognition Models for Arabic,''
Johns Hopkins University Summer Research Workshop 2002, Final Report.
<BR>
R. Kneser, J. Peters and D. Klakow,
``Language Model Adaptation Using Dynamic Marginals,''
<I>Proc. Eurospeech</I>, pp. 1971-1974, Rhodes, 1997.
<BR>
A. Stolcke and E. Shriberg, ``Statistical language modeling for speech
disfluencies,'' <I>Proc. IEEE ICASSP</I>, pp. 405-409, Atlanta, GA, 1996.
<BR>
A. Stolcke, ``Entropy-based Pruning of Backoff Language Models,''
<I>Proc. DARPA Broadcast News Transcription and Understanding Workshop</I>,
pp. 270-274, Lansdowne, VA, 1998.
<BR>
A. Stolcke et al., ``Automatic Detection of Sentence Boundaries and
Disfluencies based on Recognized Words,'' <I>Proc. ICSLP</I>, pp. 2247-2250,
Sydney, 1998.
<BR>
M. Weintraub et al., ``Fast Training and Portability,''
in Research Note No. 1, Center for Language and Speech Processing,
Johns Hopkins University, Baltimore, Feb. 1996.
<H2> BUGS </H2>
Some LM types (such as Bayes-interpolated and factored LMs) currently do
not support the <B>-write-lm</B> function.
<P>
For the <B>-limit-vocab</B>
option to work correctly with hidden event and class N-gram LMs, the
event/class vocabularies have to be specified by options
(<B>-hidden-vocab</B> and <B>-classes</B>, respectively).
Embedding event/class definitions in the LM file only will not work correctly.
<P>
Sentence generation is slow and takes time proportional to the vocabulary
size.
<P>
The file given by <B>-classes</B>
is read multiple times if <B>-limit-vocab</B>
is in effect or if a mixture of LMs is specified.
This will lead to incorrect behavior if the argument of <B>-classes</B>
is stdin (``-'').
<P>
Also, <B>-limit-vocab</B>
will not work correctly with LM operations that require the entire
vocabulary to be enumerated, such as <B>-adapt-marginals</B>
or perplexity computation with <B>-debug 3</B>.
<P>
Support for factored LMs is experimental and many LM operations supported
by standard N-grams (such as <B>-limit-vocab</B>)
are not implemented yet.
<H2> AUTHORS </H2>
Andreas Stolcke &lt;stolcke@speech.sri.com&gt;<BR>
Jing Zheng &lt;zj@speech.sri.com&gt;<BR>
Copyright 1995-2006 SRI International
</BODY></HTML>