anti-ngram(1)                                            anti-ngram(1)

NAME
       anti-ngram - count posterior-weighted N-grams in N-best lists

SYNOPSIS
       anti-ngram [-help] option ...

DESCRIPTION
       anti-ngram counts the N-grams in a set of N-best hypotheses
       lists. The N-gram counts are weighted by the posterior
       probabilities of the hypotheses they occur in. Thus, anti-ngram
       can be used to construct language models of word sequences that
       are acoustically confusable with correct hypotheses. The counts
       output should be processed with ngram-count -float-counts to
       estimate a language model.

OPTIONS
       Each filename argument can be an ASCII file, or a compressed
       file (name ending in .Z or .gz), or ``-'' to indicate
       stdin/stdout.

       -help  Print option summary.

       -version
              Print version information.

       -refs file
              Read the reference transcripts from file. Each line
              should contain an utterance ID followed by the
              transcript words.

       -nbest-files file
              List of N-best files. The base components of filenames
              must correspond to the utterance IDs found in the
              reference file.

       -max-nbest n
              Limits the number of hypotheses read from each N-best
              list to the first n.

       -order n
              Set the maximal order (length) of N-grams to count. The
              default order is 3.

       -lm file
              Reads an ARPA language model from file and rescores the
              N-best lists with it prior to counting N-grams.

       -classes file
              Interpret the LM as a class-based N-gram and read class
              definitions in classes-format(5) from file.

       -tolower
              Map vocabulary to lowercase, eliminating case
              distinctions.

       -multiwords
              Split multiwords (words joined by '_') into their
              components when reading N-best lists.

       -rescore-lmw lmw
              Sets the language model weight used in combining the
              language model log probabilities with acoustic log
              probabilities (only relevant if separate scores are
              given in the N-best input).

       -rescore-wtw wtw
              Sets the word transition weight used to weight the
              number of words relative to the acoustic log
              probabilities (only relevant if separate scores are
              given in the N-best input).

       -posterior-scale scale
              Divide the total weighted log score by scale when
              computing normalized posterior probabilities. This
              controls the peakedness of the posterior distribution.
              The default value is whatever was chosen for
              -rescore-lmw, so that language model scores are scaled
              to have weight 1, and acoustic scores have weight 1/lmw.

       -all-ngrams
              Causes even N-grams that occur in the reference string
              to be counted. By default N-best N-grams that also occur
              in the corresponding reference are excluded.

       -min-count C
              Prune all N-grams from the output that have counts less
              than C.

       -read-counts countsfile
              Read N-gram counts from a file. Each line contains an
              N-gram of words, followed by an integer or fractional
              count, all separated by whitespace. Repeated counts for
              the same N-gram are added. N-grams from N-best lists are
              added to those read with this option.

       -write-counts countsfile
              Writes total N-gram counts to countsfile. The default is
              to write to stdout.

       -sort  Output counts in lexicographic order, as required for
              ngram-merge(1).

       -debug level
              Set debugging output level. Level 0 means no debugging.
              Debugging messages are written to stderr.

SEE ALSO
       ngram(1), ngram-merge(1), ngram-count(1), nbest-scripts(1),
       classes-format(5).
       A. Stolcke et al., "The SRI March 2000 Hub-5 Conversational
       Speech Transcription System", Proc. NIST Speech Transcription
       Workshop, College Park, MD, 2000.

BUGS
       There is no -vocab option to limit the vocabulary.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 2000-2004 SRI International

SRILM Tools          $Date: 2004/12/03 17:59:01 $        anti-ngram(1)
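
The weighting described under DESCRIPTION, -rescore-lmw, -rescore-wtw, -posterior-scale and -all-ngrams can be illustrated with a small Python sketch. It is not the SRILM implementation. The function names, the in-memory hypothesis tuples, the use of natural-log scores, the lack of sentence-boundary tokens, and counting every occurrence of an N-gram within a hypothesis are all assumptions made for illustration; the weight values in the usage example are arbitrary and are not SRILM defaults.

```python
import math
from collections import defaultdict


def posteriors(hyps, lmw, wtw, scale=None):
    """Normalized posteriors for hypotheses given as
    (acoustic_logprob, lm_logprob, words) tuples (hypothetical layout)."""
    if scale is None:
        scale = lmw  # stated default: the value chosen for -rescore-lmw
    totals = [(ac + lmw * lm + wtw * len(words)) / scale
              for ac, lm, words in hyps]
    top = max(totals)  # subtract the maximum for numerical stability
    exps = [math.exp(t - top) for t in totals]
    norm = sum(exps)
    return [e / norm for e in exps]


def ngrams(words, order):
    """Yield every N-gram of length 1..order as a tuple of words."""
    for n in range(1, order + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def anti_ngram_counts(hyps, reference, lmw, wtw,
                      order=3, scale=None, all_ngrams=False):
    """Posterior-weighted N-gram counts for one utterance.

    N-grams that also occur in the reference are skipped unless
    all_ngrams is true, mirroring the -all-ngrams description."""
    ref = set(ngrams(reference, order))
    counts = defaultdict(float)
    weights = posteriors(hyps, lmw, wtw, scale)
    for p, (_, _, words) in zip(weights, hyps):
        for gram in ngrams(words, order):
            if all_ngrams or gram not in ref:
                counts[gram] += p
    return dict(counts)


if __name__ == "__main__":
    # Two equally scored hypotheses; the first matches the reference.
    hyps = [(-100.0, -10.0, ["a", "b", "c"]),
            (-100.0, -10.0, ["a", "x", "c"])]
    counts = anti_ngram_counts(hyps, ["a", "b", "c"],
                               lmw=8.0, wtw=0.0, order=2)
    # One N-gram per line followed by its fractional count.
    for gram in sorted(counts):
        print(" ".join(gram), counts[gram])
```

In this toy case only the N-grams that involve the confusable word "x" survive, each with the posterior mass of the hypothesis that contains it. Fractional counts of this kind are what ngram-count -float-counts is meant to consume.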