              If all three of amw, lmw, and wtw are set to zero, the
              posteriors are computed directly from the aggregate
              scores stored in the N-best input.

       -vocab file
              Read the N-best list vocabulary from file.  This option
              is mostly redundant since words found in the N-best
              input are implicitly added to the vocabulary.

       -vocab-aliases file
              Reads vocabulary alias definitions from file, consisting
              of lines of the form
                   alias word
              This causes all tokens alias to be mapped to word.

       -tolower
              Map vocabulary to lowercase, eliminating case
              distinctions.

       -multiwords
              Split multiwords (words joined by '_') into their
              components when reading N-best lists.

       -noise noise-tag
              Designate noise-tag as a vocabulary item that is to be
              ignored in aligning hypotheses with each other (the same
              as the -pau- word).  This is typically used to identify
              a noise marker.

       -noise-vocab file
              Read several noise tags from file, instead of, or in
              addition to, the single noise tag specified by -noise.

       -keep-noise
              Do not remove pause or noise tokens from hypotheses.
              The default is to preserve noise tags but still
              eliminate pauses.

       -nbest-error
              Compute the N-best error (minimum word error) of the
              N-best list read with -nbest.  Pause and noise tokens
              (as specified with -noise) in the N-best list are
              ignored.

       -dump-posteriors
              Output posterior probabilities of all N-best hypotheses
              instead of choosing the best hypothesis.  In N-best
              mode, only the posterior probability for each hypothesis
              is output.  In lattice mode, the hyp posterior is
              followed by word posterior probabilities for each
              (non-pause, non-noise) token in the hypothesis.  The
              -max-rescore option limits the number of hypotheses per
              N-best list processed.
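As a rough illustration of the quantity that -nbest-error reports, the following Python sketch takes the minimum word-level edit distance between a reference and any hypothesis in an N-best list, after dropping pause and noise tokens. It assumes unit costs for substitutions, insertions, and deletions; the function names are hypothetical and not part of SRILM, and the tool's own alignment may differ in detail.

```python
def word_errors(ref, hyp):
    """Word-level Levenshtein distance with unit costs (an assumption)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # reference word deleted
                cur[j - 1] + 1,          # hypothesis word inserted
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def nbest_error(ref, hyps, ignore=("-pau-",)):
    """Minimum word error over all hypotheses, ignoring pause/noise tokens."""
    skip = set(ignore)
    cleaned = [[w for w in hyp if w not in skip] for hyp in hyps]
    return min(word_errors(ref, hyp) for hyp in cleaned)


if __name__ == "__main__":
    ref = "the cat sat".split()
    hyps = [
        "the -pau- cat sat down".split(),
        "a cat [noise] sat".split(),
    ]
    # "[noise]" stands in for a tag that would be given with -noise.
    print(nbest_error(ref, hyps, ignore=("-pau-", "[noise]")))
```

The list of hypotheses must be non-empty, since the minimum of an empty list is undefined.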
       -dump-errors
              Output word correctness indicators for all N-best
              hypotheses instead of choosing the best hypothesis.  For
              each hypothesis, a line is output containing first the
              total number of errors and the list of indicators of
              whether the corresponding word is correct, substituted
              or inserted relative to the reference string.  The
              location of deleted words is also indicated by a
              corresponding marker.  The -max-rescore option limits
              the number of hypotheses per N-best list processed.

       -reference w1 w2 ...
              Specifies a reference word string for the -dump-errors,
              -nbest-error, and -lattice-error options.  Additionally,
              in -use-mesh mode, the reference words are recorded in
              the word mesh and can be output with -write, indicating
              which word in each alignment position is the correct
              one.

       -refs references
              Read a table of reference transcripts from file
              references, for when multiple N-best lists are processed
              (see -nbest-files).  Each line in references must
              contain the sentence ID (the last component in the
              N-best filename path, minus any suffixes) followed by
              zero or more reference words.

       The following options only affect lattice mode.

       -read file
              Reads an initial lattice from file, to be merged with
              additional paths constructed from the N-best hypotheses.

       -lattice-files file
              Reads the names of one or more lattices from file and
              aligns those lattices with the main lattice being built.
              Each line of file must contain a lattice filename,
              optionally followed by a weight.

       -write file
              Writes the resulting word posterior lattice or mesh to
              file, in wlat-format(5).
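The reference table read by -refs, described above, can be sketched in Python as follows. This is a minimal illustration, not SRILM code: the function names are hypothetical, and "minus any suffixes" is interpreted here as dropping everything from the first dot in the filename onward, which is an assumption.

```python
import os


def sentence_id(nbest_path):
    """Last path component with suffixes removed (assumed: cut at first dot)."""
    base = os.path.basename(nbest_path)
    return base.split(".", 1)[0]


def read_refs(lines):
    """Map each sentence ID to its (possibly empty) list of reference words."""
    refs = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        refs[fields[0]] = fields[1:]
    return refs


if __name__ == "__main__":
    table = read_refs(["utt1 hello world", "utt2"])
    # Hypothetical N-best filename; looks up the matching reference words.
    print(table[sentence_id("nbest/utt1.score.gz")])
```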
       -write-dir directory
              Write the resulting N-best lattices to directory, in
              files named after the input N-best lists, for when
              multiple N-best lists are processed (see -nbest-files).

       -prime-lattice
              Start building the lattice with the best hypothesis
              obtained from N-best error minimization.  This produces
              slightly better alignments and sometimes lower error
              rates.  The default is to start with the top-scoring
              hypothesis.

       -prime-with-1best
              Similar to -prime-lattice, but uses the top-ranked
              sentence hypothesis for priming.  (Experience shows that
              -no-reorder -prime-lattice gives best results.)

       -prime-with-refs
              Similar to -prime-lattice, but uses the reference words
              for priming.

       -no-merge
              Build a lattice from the N-best hypotheses without
              merging edges (string/lattice alignment).  This creates
              a lattice with one disjoint path per hypothesis, and is
              useful mainly for debugging purposes.

       -lattice-error
              Compute the lattice error (minimum word error) of the
              lattice read with -read or built with -nbest.

       -dictionary file
              Use word pronunciations listed in file to construct word
              alignments when building word meshes.  This will use an
              alignment cost function that reflects the number of
              inserted/deleted/substituted phones, rather than words.
              The dictionary file should contain one pronunciation per
              line, each naming a word in the first field, followed by
              a string of phone symbols.

       -hidden-vocab file
              Read a subvocabulary from file and constrain word meshes
              to only align those words that are either all in or
              outside the subvocabulary.  This may be used to keep
              ``hidden event'' tags from aligning with regular words.
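The constraint imposed by -hidden-vocab amounts to a membership test: two words may share an alignment position only if both are in the subvocabulary or both are outside it. The Python sketch below states that rule directly. The function name and the example tags are hypothetical, and how the tool enforces the rule internally is not described here.

```python
def may_align(word_a, word_b, hidden_vocab):
    """True if both words are in the subvocabulary or both are outside it."""
    return (word_a in hidden_vocab) == (word_b in hidden_vocab)


if __name__ == "__main__":
    hidden = {"<boundary>", "<disfluency>"}  # hypothetical hidden-event tags
    print(may_align("<boundary>", "<disfluency>", hidden))  # both hidden
    print(may_align("cat", "hat", hidden))                  # both regular
    print(may_align("<boundary>", "cat", hidden))           # mixed
```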
       -record-hyps
              Record the ranks of the hyps contributing to each word
              hypothesis in the resulting word lattice; the
              information is included in -write output.

SEE ALSO
       ngram(1), nbest-optimize(1), nbest-scripts(1), nbest-format(5),
       wlat-format(5).
       A. Stolcke, Y. Konig, and M. Weintraub, ``Explicit Word Error
       Minimization in N-best List Rescoring,'' Proc. Eurospeech,
       163-166, 1997.
       The ``word meshes'' used here are equivalent to the ``confusion
       networks'' described in: L. Mangu, E. Brill, and A. Stolcke,
       ``Finding Consensus Among Words: Lattice-based Word Error
       Minimization,'' Proc. Eurospeech, vol. 1, 495-498, 1999.

BUGS
       Several functions are not uniformly implemented for all
       rescoring modes (e.g., -lattice-files, -dictionary,
       -record-hyps, and -nbest-backtrace are currently effective only
       in mesh-lattice mode).
       It is a common mistake (not a bug) to use the default LM weight
       with N-best lists directly from Decipher.  Decipher N-best
       lists have the recognizer's LM weight already built in, so they
       should be processed with
            nbest-lattice -rescore-lmw 1 -posterior-scale LMW
       where LMW is the LM weight during recognition.  This is not an
       issue if the N-best lists have been rescored with
       rescore-decipher.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 1996-2004 SRI International

SRILM Tools          $Date: 2006/07/05 08:24:08 $          nbest-lattice(1)