<! $Id: nbest-scripts.1,v 1.35 2006/07/29 18:42:28 stolcke Exp $><HTML><HEADER><TITLE>nbest-scripts</TITLE><BODY><H1>nbest-scripts</H1><H2> NAME </H2>nbest-scripts, combine-rover-controls, compare-sclite, compute-sclite, fix-ctm, merge-nbest, nbest-error, nbest-posteriors, nbest-rover, nbest-vocab, nbest2-to-nbest1, rescore-acoustic, rescore-decipher, rescore-reweight, sentid-to-sclite - rescore and evaluate N-best lists<H2> SYNOPSIS </H2><B> rescore-decipher </B>[<B>-bytelog</B>][<B>-nodecipherlm</B>][<B>-multiwords</B>][<B>-pretty</B> <I>mapfile</I>]<I></I>[<B>-ngram-tool</B> <I>program</I>]<I></I>[<B>-filter</B> <I>command</I>]<I></I>[<B>-norescore</B>][<B>-lm-only</B>][<B>-count-oovs</B>][<B>-limit-vocab</B>][<B>-vocab-aliases</B> <I>mapfile</I>]<I></I>[<B>-fast</B>]<I> nbest-file-list </I><I> score-dir </I><B> -lm </B>...<I> lm-options </I>...<BR><B> rescore-acoustic </B><I>old-nbest-dir</I>|<I>old-file-list</I><I> old-ac-weight </I><I> new-score-dir1 </I><I> new-ac-weight1 </I>...<I> new-nbest-dir </I>[<I>max-nbest</I>]<BR><B> rescore-reweight </B>[<B>-multiwords</B>] <I>score-dir</I>|<I>file-list</I> [<I>lmw</I> [<I>wtw</I> [<I>score-dir1 score-weight1</I> ...] [<I>max-nbest</I>]]]<BR><B> rescore-minimize-wer </B><I> score-dir </I>[<I>lmw</I> [<I>wtw</I> [<I>max-nbest</I>]]]<BR><B> nbest2-to-nbest1 </B>[<I>nbest-file</I>]<BR><B> nbest-rover </B>[<I> sentid-list </I>|<B> - </B>]<I> control-file </I>[<I> posterior-file </I>[<I> nbest-lattice-options </I>] ]<BR><B> combine-rover-controls </B>[<B>lambda=</B><I>weights</I><B></B>]<I> rover-control </I>[ ... 
]<BR><B> nbest-posteriors </B>[<B>weight=</B><I>W</I><B></B> <B>lmw=</B><I>lmw</I><B></B> <B>wtw=</B><I>wtw</I><B></B> <B>postscale=</B><I>S</I><B></B> <B>max_nbest=</B><I>M</I><B></B>]<I> nbest-file </I><BR><B> merge-nbest </B>[<B>max_nbest=</B><I>M</I><B></B><B> multiwords=1 </B><B> nopauses=1 </B>]<I> nbest-file </I>...<BR><B> nbest-vocab </B>[<I>nbest-list</I>...]<BR><B> nbest-error </B><I>score-dir</I>|<I>file-list</I><I> refs </I>[<I>nbest-lattice-option</I>...]<BR><B> sentid-to-sclite </B><I> hyps </I><BR><B> sentid-to-ctm </B><I> hyps </I><BR><B> fix-ctm </B><I> ctmfile </I><BR><B> compute-sclite </B><B> -r </B><I> refs </I><B> -h </B><I> hyps </I>[<B> -h </B><I> hyps </I>...][<B> -S </B><I> subset </I>...][<B>-multiwords</B>][<B>-noperiods</B>][<B>-R</B>][<B>-g</B> <I>glmfile</I>]<I></I>[<B>-H</B>][<B>-v</B>][<I>sclite-options</I>...]<BR><B> compare-sclite </B><B> -r </B><I> refs </I><B> -h1 </B><I> hyps1 </I><B> -h2 </B><I> hyps2 </I>[<B> -S </B><I> subset </I>][<B>-multiwords</B>][<I>sclite-options</I>...]<H2> DESCRIPTION </H2>These scripts perform common tasks on N-best hypotheses in <A HREF="nbest-format.html">nbest-format(5)</A>, especially those needed for rescoring and for extracting and evaluating 1-best hypotheses.<P><B> rescore-decipher </B>applies a language model implemented by <A HREF="ngram.html">ngram(1)</A> to the N-best lists listed in <I>nbest-file-list</I>.<I></I> The N-best files may be in compressed format. The rescored N-best lists are stored in directory <I>score-dir</I>.<I></I> All following arguments are passed to <A HREF="ngram.html">ngram(1)</A> and are used to control the language model. The following options are handled by <B> rescore-decipher </B>itself:<DL><DT><B> -bytelog </B><DD>causes scores to be output on the bytelog scale (see <A HREF="nbest-format.html">nbest-format(5)</A>).<DT><B> -nodecipherlm </B><DD>indicates that the recognizer language model is not being provided (with <B>-decipher-lm</B>).<B></B> (This is only possible if the N-best 
lists are not in ``NBestList1.0'' format.)<DT><B> -multiwords </B><DD>specifies that N-best lists contain words joined by underscores, which are to be split into their components prior to rescoring.<DT><B>-pretty</B><I> mapfile</I><B></B><DD>specifies a word mapping file that allows individual words to be globally replaced by strings of zero or more other words, e.g., to remove vocabulary mismatches between the input N-best lists and the rescoring LM. The <I> mapfile </I>contains one mapping per line, the first field specifying the word to be replaced and subsequent fields forming the replacement string.<DT><B>-ngram-tool</B><I> program</I><B></B><DD>specifies a non-standard<I> program </I>to perform the actual LM evaluation (by default, <A HREF="ngram.html">ngram(1)</A> is used). Such a program must understand <B>ngram</B>'s<B></B> command-line options related to N-best rescoring.<DT><B>-filter</B><I> command</I><B></B><DD>specifies a<I> command </I>that is used to filter the N-best hypotheses prior to evaluating the language model. This may be used for more general textual rewriting so that non-standard LMs can be applied. The output N-best lists will contain the filtered hypotheses.<DT><B> -norescore </B><DD>causes N-best lists to be simply reformatted from one of the Decipher formats into the SRILM N-best format, separating acoustic and LM scores, without replacing the existing LM scores. In this case only the <A HREF="ngram.html">ngram(1)</A> options <B>-decipher-lmw</B><B></B> and <B>-decipher-wtw</B><B></B> are relevant, and others are ignored. <B> -norescore </B>and <B> -filter </B>may be used together to perform textual rewriting of N-best lists.<DT><B> -lm-only </B><DD>dumps out LM scores only, instead of complete N-best lists.<DT><B>-count-oovs</B><B></B><DD>writes the count of out-of-vocabulary and zero-probability words to the output score files (instead of rescored N-best lists).<DT><B> -limit-vocab </B><DD>saves memory by arranging for <A HREF="ngram.html">ngram(1)</A> to 
load only those N-gram parameters that are relevant to the vocabulary of the N-best lists to be rescored. After determining the N-best vocabulary the <B> -limit-vocab </B>option is passed to <A HREF="ngram.html">ngram(1)</A>.<DT><B>-vocab-aliases</B><I> map</I><B></B><DD>declares that certain words are to be treated as alternative spellings of the same word for LM evaluation; see the same option for <A HREF="ngram.html">ngram(1)</A>. When used in conjunction with <B>-limit-vocab</B>,<B></B> the <I> map </I>is first stripped of unused words and then passed on to <A HREF="ngram.html">ngram(1)</A>.<DT><B> -fast </B><DD>performs rescoring using only functions built into <A HREF="ngram.html">ngram(1)</A>. This avoids some computational and I/O overhead and therefore runs faster, but the options <B>-filter</B>,<B></B> <B>-pretty</B>,<B></B> and <B> -lm-only </B>are not supported, and <B> -nodecipherlm </B>is obligatory.</DD></DL><P><B> rescore-acoustic </B>replaces the acoustic scores in a set of N-best lists by a weighted combination of new scores. The old N-best lists are given by either a directory<I> old-nbest-dir </I>or a file list <I>old-file-list</I>;<I></I><I> old-ac-weight </I>is the weight given to the old scores. Directories containing the new scores are listed alternating with the corresponding weights; each score directory must contain one file per waveform segment, with the same file basename as the corresponding original N-best list. The new scores should appear in a single column per file, one per line. The N-best lists containing the new combined acoustic scores are written to <I>new-nbest-dir</I>.<I></I> The optional<I> max-nbest </I>argument can be used to limit the length of the N-best lists output. Also, when a new score file is encountered containing fewer than<I> max-nbest </I>lines, the missing scores are set to the lowest score encountered so far.<P><B> rescore-reweight </B>combines the scores in N-best lists with a set of weights and outputs the 1-best hypotheses. The N-best 
files are found in directory<I> score-dir </I>or listed in <I>file-list</I>.<I></I> Optional arguments set the language model weight<I> lmw </I>(default 8), the word transition weight<I> wtw </I>