lattice-tool(1)                                          lattice-tool(1)

NAME
       lattice-tool - manipulate word lattices

SYNOPSIS
       lattice-tool [-help] option ...

DESCRIPTION
       lattice-tool performs operations on word lattices in
       pfsg-format(5) or in HTK Standard Lattice format (SLF).
       Operations include size reduction, pruning, null-node removal,
       weight assignment from language models, lattice word error
       computation, and decoding of the best hypotheses.

       Each input lattice is processed in turn, and a series of optional
       operations is performed in a fixed sequence (regardless of the
       order in which corresponding options are specified). The sequence
       of operations is as follows:

       1.  Read input lattice.
       2.  Score pronunciations (if dictionary was supplied).
       3.  Split multiword word nodes.
       4.  Posterior- and density-based pruning (before reduction).
       5.  Write word posterior lattice.
       6.  Perform word-posterior based decoding.
       7.  Write word mesh (confusion network).
       8.  Compute word and transition posteriors (forward-backward
           algorithm), and N-gram counts if specified.
       9.  Compute lattice density.
       10. Check lattice connectivity.
       11. Compute node entropy.
       12. Compute lattice word error.
       13. Output reference word posteriors.
       14. Remove null nodes.
       15. Lattice reduction.
       16. Posterior- and density-based pruning (after reduction).
       17. Remove pause nodes.
       18. Lattice reduction (post-pause removal).
       19. Language model replacement or expansion.
       20. Pause recovery or insertion.
       21. Lattice reduction (post-LM expansion).
       22. Multiword splitting (post-LM expansion).
       23. Merging of same-word nodes.
       24. Lattice algebra operations (or, concatenation).
       25. Viterbi-decode best hypothesis and/or generate N-best lists.
       26. Lattice-LM perplexity computation.
       27. Writing output lattice.
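       The fixed ordering described above can be pictured with a small
       Python sketch. It is not lattice-tool's implementation: the step
       labels and the schedule() helper are hypothetical names chosen
       for illustration, and only a few of the 27 steps are listed. The
       point is that the requested operations are sorted into the
       canonical sequence, whatever order they were requested in.

```python
# Abbreviated canonical order, taken from the numbered list above.
# The labels are illustrative only; they are not lattice-tool option names.
STEP_ORDER = [
    "read-lattice",      # step 1
    "posterior-prune",   # step 4
    "write-posteriors",  # step 5
    "write-mesh",        # step 7
    "reduce",            # step 15
    "lm-expansion",      # step 19
    "viterbi-decode",    # step 25
    "write-lattice",     # step 27
]


def schedule(requested):
    """Return the requested steps in canonical order, ignoring request order."""
    wanted = set(requested)
    unknown = wanted - set(STEP_ORDER)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    return [step for step in STEP_ORDER if step in wanted]


if __name__ == "__main__":
    # Both calls print the same sequence.
    print(schedule(["write-lattice", "reduce", "read-lattice"]))
    print(schedule(["reduce", "read-lattice", "write-lattice"]))
```

       In this sketch the position of a step in STEP_ORDER, not its
       position in the request, decides when it runs.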
       The following options control which of these steps actually
       apply.

OPTIONS
       Each filename argument can be an ASCII file, or a compressed file
       (name ending in .Z or .gz), or ``-'' to indicate stdin/stdout.

       -help
              Print option summary.

       -version
              Print version information.

       -debug level
              Set the debugging output level (0 means no debugging
              output). Debugging messages are sent to stderr.

       -in-lattice file
              Read input lattice from file.

       -in-lattice2 file
              Read additional input lattice (for binary lattice
              operations) from file.

       -in-lattice-list file
              Read list of input lattices from file. Lattice operations
              are applied to each filename listed in file.

       -out-lattice file
              Write result lattice to file.

       -out-lattice-dir dir
              Write result lattices from processing of -in-lattice-list
              to directory dir.

       -read-mesh
              Assume input lattices are in word mesh (confusion network)
              format, as described in wlat-format(5).

       -write-internal
              Write output lattices with internal node numbering instead
              of compact, consecutive numbering.

       -overwrite
              Overwrite existing output lattice files.

       -vocab file
              Initialize the vocabulary to words listed in file. This is
              useful in conjunction with

       -limit-vocab
              Discard LM parameters on reading that do not pertain to
              the words specified in the vocabulary. The default is that
              words used in the LM are automatically added to the
              vocabulary. This option can be used to reduce the memory
              requirements for large LMs; to this end, -vocab typically
              specifies the set of words used in the lattices to be
              processed (which has to be generated beforehand, see
              pfsg-scripts(1)).
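       As a rough picture of what -limit-vocab achieves, the Python
       sketch below filters a toy table of N-gram log probabilities down
       to entries whose words all occur in a given vocabulary. The
       function name limit_vocab, the dictionary layout, and the rule
       "every word of the N-gram must be in the vocabulary" are
       assumptions made for illustration; the actual LM reader also
       deals with backoff weights and special tokens, which are not
       modelled here.

```python
def limit_vocab(ngram_logprobs, vocab):
    """Keep only N-gram entries whose words all belong to vocab.

    ngram_logprobs maps word tuples to log probabilities (a toy stand-in
    for LM parameters); vocab is a set of words.
    """
    return {
        ngram: logprob
        for ngram, logprob in ngram_logprobs.items()
        if all(word in vocab for word in ngram)
    }


if __name__ == "__main__":
    toy_lm = {
        ("the",): -1.0,
        ("cat",): -2.0,
        ("dog",): -2.1,
        ("the", "cat"): -0.5,
        ("the", "dog"): -0.7,
    }
    lattice_words = {"the", "cat"}
    kept = limit_vocab(toy_lm, lattice_words)
    print(sorted(kept))  # entries mentioning "dog" are dropped
```

       The memory saving comes from never storing the discarded entries,
       which is why the vocabulary of the lattices has to be known
       before the LM is read.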
       -vocab-aliases file
              Reads vocabulary alias definitions from file, consisting
              of lines of the form
                   alias word
              This causes all tokens alias to be mapped to word.

       -unk
              Map lattice words not contained in the known vocabulary to
              the unknown word tag. This is useful if the rescoring LM
              contains a probability for the unknown word (i.e., is an
              open-vocabulary LM). The known vocabulary is given by what
              is specified by the -vocab option, as well as all words in
              the LM used for rescoring.

       -map-unk word
              Map out-of-vocabulary words to word, rather than the
              default <unk> tag.

       -tolower
              Map all vocabulary to lowercase.

       -nonevents file
              Read a list of words from file that are used only as
              context elements, and are not predicted by the LM, similar
              to ``<s>''. If -keep-pause is also specified then pauses
              are not treated as nonevents by default.

       -max-time T
              Limit processing time per lattice to T seconds.

       Options controlling lattice operations:

       -write-posteriors file
              Compute the posteriors of lattice nodes and transitions
              (using the forward-backward algorithm) and write out a
              word posterior lattice in wlat-format(5). This and other
              options based on posterior probabilities make most sense
              if the input lattice contains combined acoustic-language
              model weights.

       -write-posteriors-dir dir
              Similar to the above, but posterior lattices are written
              to separate files in directory dir, named after the
              utterance IDs.

       -write-mesh file
              Construct a word confusion network ("sausage") from the
              lattice and write it to file. If reference words are
              available for the utterance (specified by -ref-file or
              -ref-list) their alignment will be recorded in the
              sausage.

       -write-mesh-dir dir
              Similar, but write sausages to files in dir named after
              the utterance IDs.
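       The word mapping options described earlier in this section
       (-vocab-aliases, -unk, -map-unk and -tolower) can be summarized
       by the following Python sketch. The function map_word and its
       parameters are hypothetical, and the order in which the mappings
       are applied (lowercasing, then aliases, then the out-of-vocabulary
       check) is an assumption, not something this page specifies.

```python
def map_word(word, vocab, aliases=None, tolower=False, unk_tag="<unk>"):
    """Map one lattice token to a vocabulary word.

    vocab   -- set of known words (cf. -vocab plus the LM's words)
    aliases -- dict of alias -> word (cf. -vocab-aliases)
    tolower -- lowercase the token first (cf. -tolower)
    unk_tag -- replacement for unknown words (cf. -unk and -map-unk)
    The order of these three steps is assumed for illustration.
    """
    aliases = aliases or {}
    if tolower:
        word = word.lower()
    word = aliases.get(word, word)
    return word if word in vocab else unk_tag


if __name__ == "__main__":
    vocab = {"color", "hello"}
    print(map_word("colour", vocab, aliases={"colour": "color"}))
    print(map_word("Hello", vocab, tolower=True))
    print(map_word("zebra", vocab))
    print(map_word("zebra", vocab, unk_tag="@oov"))
```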
       -init-mesh file
              Initialize the word confusion network by reading an
              existing sausage from file. This effectively aligns the
              lattice being processed to the existing sausage.

       -acoustic-mesh
              Preserve word-level acoustic information (times, scores,
              and pronunciations) in sausages, encoded as described in
              wlat-format(5).

       -posterior-prune P
              Prune lattice nodes with posteriors less than P times the
              highest posterior path.

       -density-prune D
              Prune lattices such that the lattice density (non-null
              words per second) does not exceed D.

       -nodes-prune N
              Prune lattices such that the total number of non-null,
              non-pause nodes does not exceed N.

       -fast-prune
              Choose a faster pruning algorithm that does not recompute
              posteriors after each iteration.

       -write-ngrams file
              Compute posterior expected N-gram counts in lattices and
              output them to file. The maximal N-gram length is given by
              the -order option (see below). The counts from all
              lattices processed are accumulated and output at the end.

       -write-ngram-index file
              Output an index file of all N-gram occurrences in the
              lattices processed, including their start times,
              durations, and posterior probabilities. The maximal N-gram
              length is given by the -order option (see below).

       -min-count C
              Prune N-grams with count less than C from output with
              -write-ngrams and -write-ngram-index. In the former case,
              the threshold applies to the aggregate occurrence counts;
              in the latter case, the threshold applies to the posterior
              probability of an individual occurrence.

       -max-ngram-pause T
              Index only N-grams that contain internal pauses (between
              words) not exceeding T seconds (assuming time stamps are
              recorded in the input lattice).
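       Several of the options above rely on node posteriors computed
       with the forward-backward algorithm. The Python sketch below
       shows that computation on a tiny acyclic lattice, followed by a
       pruning step in the spirit of -posterior-prune. It is a minimal
       illustration, not lattice-tool's code: the function names, the
       edge representation, and the reading of the pruning threshold as
       "P times the largest node posterior" are all assumptions. The
       scale argument plays the role of the -posterior-scale divisor.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def node_posteriors(num_nodes, edges, scale=8.0):
    """Forward-backward node posteriors on a DAG.

    edges are (src, dst, log_weight) triples with src < dst, so node
    numbering is a topological order; node 0 is the start node and
    node num_nodes - 1 is the final node. Weights are divided by scale.
    """
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for src, dst, weight in edges:
        incoming[dst].append((src, weight / scale))
        outgoing[src].append((dst, weight / scale))

    neg_inf = float("-inf")
    alpha = [neg_inf] * num_nodes  # log mass of paths start -> node
    beta = [neg_inf] * num_nodes   # log mass of paths node -> final
    alpha[0] = 0.0
    for v in range(1, num_nodes):
        terms = [alpha[u] + w for u, w in incoming[v] if alpha[u] > neg_inf]
        if terms:
            alpha[v] = logsumexp(terms)
    beta[-1] = 0.0
    for u in range(num_nodes - 2, -1, -1):
        terms = [w + beta[v] for v, w in outgoing[u] if beta[v] > neg_inf]
        if terms:
            beta[u] = logsumexp(terms)

    total = alpha[-1]
    return [
        math.exp(a + b - total) if a > neg_inf and b > neg_inf else 0.0
        for a, b in zip(alpha, beta)
    ]


def posterior_prune(posteriors, threshold):
    """Return indices of nodes kept (assumed rule: p >= threshold * max p)."""
    best = max(posteriors)
    return [i for i, p in enumerate(posteriors) if p >= threshold * best]


if __name__ == "__main__":
    # Start node 0, two competing word nodes 1 and 2, final node 3.
    edges = [(0, 1, -8.0), (0, 2, -24.0), (1, 3, 0.0), (2, 3, 0.0)]
    post = node_posteriors(4, edges)        # scaled by 8
    print([round(p, 3) for p in post])
    print(posterior_prune(post, 0.2))       # node 2 falls below the threshold
    print([round(p, 3) for p in node_posteriors(4, edges, scale=1.0)])
```

       With the scale of 8 the two competing nodes receive posteriors of
       roughly 0.88 and 0.12; with a scale of 1 nearly all of the mass
       goes to node 1, which is the "overly peaked" behaviour that
       scaling is meant to avoid.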
       -posterior-scale S
              Scale the transition weights by dividing by S for the
              purpose of posterior probability computation. If the input
              weights represent combined acoustic-language model scores
              then this should be approximately the language model
              weight of the recognizer in order to avoid overly peaked
              posteriors (the default value is 8).

       -write-vocab file
              Output the list of all words found in the lattice(s) to
              file.

       -reduce
              Reduce lattice size by a single forward node merging pass.

       -reduce-iterate I
              Reduce lattice size by up to I forward-backward node
              merging passes.

       -overlap-ratio R
              Perform approximate lattice reduction by merging nodes
              that share more than a fraction R of their incoming or
              outgoing nodes. The default is 0, i.e., only exact lattice
              reduction is performed.

       -overlap-base B
              If B is 0 (the default), then the overlap ratio R is taken
              relative to the smaller set of transitions being compared.
              If the value is 1, the ratio is