anti-ngram.1

anti-ngram(1)                                       anti-ngram(1)

NAME
       anti-ngram - count posterior-weighted N-grams in N-best lists

SYNOPSIS
       anti-ngram [-help] option ...

DESCRIPTION
       anti-ngram counts the N-grams in a set of N-best hypotheses
       lists.  The N-gram counts are weighted by the posterior
       probabilities of the hypotheses they occur in.  Thus,
       anti-ngram can be used to construct language models of word
       sequences that are acoustically confusable with correct
       hypotheses.  The counts output should be processed with
       ngram-count -float-counts to estimate a language model.

OPTIONS
       Each filename argument can be an ASCII file, or a compressed
       file (name ending in .Z or .gz), or ``-'' to indicate
       stdin/stdout.

       -help  Print option summary.

       -version
              Print version information.

       -refs file
              Read the reference transcripts from file.  Each line
              should contain an utterance ID followed by the
              transcript words.

       -nbest-files file
              List of N-best files.  The base components of filenames
              must correspond to the utterance IDs found in the
              reference file.

       -max-nbest n
              Limits the number of hypotheses read from each N-best
              list to the first n.

       -order n
              Set the maximal order (length) of N-grams to count.
              The default order is 3.

       -lm file
              Reads an ARPA language model from file and rescores the
              N-best lists with it prior to counting N-grams.

       -classes file
              Interpret the LM as a class-based N-gram and read class
              definitions in classes-format(5) from file.

       -tolower
              Map vocabulary to lowercase, eliminating case
              distinctions.

       -multiwords
              Split multiwords (words joined by '_') into their
              components when reading N-best lists.

       -rescore-lmw lmw
              Sets the language model weight used in combining the
              language model log probabilities with acoustic log
              probabilities (only relevant if separate scores are
              given in the N-best input).

       -rescore-wtw wtw
              Sets the word transition weight used to weight the
              number of words relative to the acoustic log
              probabilities (only relevant if separate scores are
              given in the N-best input).

       -posterior-scale scale
              Divide the total weighted log score by scale when
              computing normalized posterior probabilities.  This
              controls the peakedness of the posterior distribution.
              The default value is whatever was chosen for
              -rescore-lmw, so that language model scores are scaled
              to have weight 1, and acoustic scores have weight
              1/lmw.

       -all-ngrams
              Causes even N-grams that occur in the reference string
              to be counted.  By default N-best N-grams that also
              occur in the corresponding reference are excluded.

       -min-count C
              Prune all N-grams from the output that have counts less
              than C.

       -read-counts countsfile
              Read N-gram counts from a file.  Each line contains an
              N-gram of words, followed by an integer or fractional
              count, all separated by whitespace.  Repeated counts
              for the same N-gram are added.  N-grams from N-best
              lists are added to those read with this option.

       -write-counts countsfile
              Writes total N-gram counts to countsfile.  The default
              is to write to stdout.

       -sort  Output counts in lexicographic order, as required for
              ngram-merge(1).

       -debug level
              Set debugging output level.  Level 0 means no
              debugging.  Debugging messages are written to stderr.

SEE ALSO
       ngram(1), ngram-merge(1), ngram-count(1), nbest-scripts(1),
       classes-format(5),
       A. Stolcke et al., "The SRI March 2000 Hub-5 Conversational
       Speech Transcription System", Proc. NIST Speech Transcription
       Workshop, College Park, MD, 2000.

BUGS
       There is no -vocab option to limit the vocabulary.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 2000-2004 SRI International

SRILM Tools        $Date: 2004/12/03 17:59:01 $     anti-ngram(1)
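
The counting scheme described above can be illustrated with a minimal Python sketch. It is not the anti-ngram implementation; it only mirrors the behavior stated on this page for a single utterance: combine the scores, divide by the posterior scale, normalize to posteriors, and accumulate each hypothesis's posterior on its N-grams, skipping N-grams that also occur in the reference unless all_ngrams is set. The function names, the tuple layout of a hypothesis, the use of natural logarithms, the default weights of 1.0 and 0.0, the absence of sentence-boundary tags, and the counting of every order from 1 up to the maximum are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def posteriors(hyps, lmw=1.0, wtw=0.0, scale=None):
    """Normalized posterior of each hypothesis in one N-best list.

    hyps is a list of (acoustic_logprob, lm_logprob, words) tuples.
    This layout and the natural-log base are assumptions of the sketch.
    """
    if scale is None:
        scale = lmw  # the page states the scale defaults to the LM weight
    totals = [(ac + lmw * lm + wtw * len(words)) / scale
              for ac, lm, words in hyps]
    peak = max(totals)  # subtract the maximum to avoid underflow in exp()
    weights = [math.exp(t - peak) for t in totals]
    norm = sum(weights)
    return [w / norm for w in weights]


def ngrams(words, order):
    """Yield every N-gram of length 1..order as a tuple (assumed behavior)."""
    for n in range(1, order + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def anti_ngram_counts(hyps, reference, order=3, all_ngrams=False,
                      min_count=0.0, lmw=1.0, wtw=0.0, scale=None):
    """Posterior-weighted N-gram counts for one utterance."""
    ref_ngrams = set(ngrams(reference, order))
    counts = defaultdict(float)
    posts = posteriors(hyps, lmw=lmw, wtw=wtw, scale=scale)
    for post, (_, _, words) in zip(posts, hyps):
        for gram in ngrams(words, order):
            # By default, N-grams also found in the reference are excluded.
            if all_ngrams or gram not in ref_ngrams:
                counts[gram] += post
    # Drop N-grams whose accumulated count is below min_count.
    return {g: c for g, c in counts.items() if c >= min_count}


def format_counts(counts):
    """One line per N-gram: the words, then the count, whitespace-separated."""
    return [" ".join(g) + "\t" + format(c, "g") for g, c in sorted(counts.items())]
```

Calling anti_ngram_counts() on the hypotheses of one utterance together with its reference words returns a dictionary from word tuples to fractional counts, and format_counts() renders it in the line format described under -read-counts. The ordering used by sorted() here is Python's tuple ordering and is not guaranteed to match what -sort produces.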
