segment-nbest(1)                                        segment-nbest(1)

NAME
       segment-nbest - rescore and segment N-best lists using hidden
       segment N-gram model

SYNOPSIS
       segment-nbest [-help] option ... nbest-file-list ...

DESCRIPTION
       segment-nbest processes a series of consecutive N-best lists from
       a speech recognizer and applies a hidden segment N-gram language
       model to them.  The language model is a standard backoff N-gram
       model in ARPA ngram-format(5) modeling sentence segmentation using
       the boundary tags <s> and </s>.  The program reads in all N-best
       lists and outputs the hypotheses that have the highest aggregate
       (combined acoustic and language model) score.  Hypothesized
       sentence boundaries are marked by <s> tags.

OPTIONS
       Each filename argument can be an ASCII file, or a compressed file
       (name ending in .Z or .gz), or "-" to indicate stdin/stdout.

       -help  Print option summary.

       -version
              Print version information.

       -order n
              Set the maximal N-gram order to be used, by default 3.
              NOTE: The order of the model is not set automatically when
              a model file is read, so the same file can be used at
              various orders.

       -debug level
              Set the debugging output level (0 means no debugging
              output).  Debugging messages are sent to stderr.

       -lm file
              Read the N-gram model from file.

       -tolower
              Map all vocabulary to lowercase.  Useful if case
              conventions for N-best lists and language model differ.

       -mix-lm file
              Read a second, standard N-gram model for interpolation
              purposes.

       -lambda weight
              Set the weight of the main model when interpolating with
              -mix-lm.  Default value is 0.5.

       -bayes length
              Interpolate the second and the main model using posterior
              probabilities for local N-gram-contexts of length length.
              The -lambda value is used as a prior mixture weight in this
              case.

       -bayes-scale scale
              Set the exponential scale factor on the context likelihood
              in conjunction with the -bayes function.  Default value is
              1.0.

       -nbest-files list
              Specifies a list of N-best files.  The file list should
              contain a list of filenames, one per line, each
              corresponding to an N-best file in one of the formats
              described in nbest-format(5).  The N-best files should
              correspond to consecutive speech waveforms in the order
              listed.

       -fb-rescore
              Perform Forward-backward rescoring.  This generates new
              N-best lists as output whose LM scores reflect the
              posterior probability of each hypothesis.  The default is
              to perform Viterbi rescoring and output only the best
              combined hypothesis.

       -write-nbest-dir dir
              Write rescored N-best lists to directory dir instead of to
              stdout.  The filenames from the input are preserved.

       -max-nbest n
              Limits the number of hypotheses read from each N-best list
              to the first n.

       -max-rescore m
              Only choose among the top m hypotheses of each list (after
              reordering hypotheses, see below).  This is an effective
              way to limit the quadratic computation of the Viterbi or
              forward/backward dynamic programming.

       -no-reorder
              Do not reorder the hypotheses before limiting the
              computation to the top m.  By default the hypotheses will
              first be sorted according to the acoustic and language
              model scores recorded in the N-best lists.

       -rescore-lmw weight
              Specifies the language model weight to be used in combining
              acoustic and language model scores to select the best
              hypotheses.

       -rescore-wtw weight
              Specifies the word transition weight to be used in
              selecting the best hypotheses.

       -noise noise-tag
              Designate noise-tag as a vocabulary item that is to be
              ignored by the LM.  (This is typically used to identify a
              noise marker.)

       -noise-vocab file
              Read several noise tags from file, instead of, or in
              addition to, the single noise tag specified by -noise.

       -decipher-lm model-file
              Designates the N-gram backoff model (typically a bigram)
              that was used by the Decipher(TM) recognizer in computing
              composite scores.  Used to compute acoustic scores from the
              composite scores if the N-best lists are in "NBestList1.0"
              format.

       -decipher-lmw weight
              Specifies the language model weight used by the recognizer.
              Used to compute acoustic scores from the composite scores.

       -decipher-wtw weight
              Specifies the word transition weight used by the
              recognizer.  Used to compute acoustic scores from the
              composite scores.

       -stag string
              Use string to mark segment boundaries in the output.
              Default is the start-of-sentence symbol defined in the
              language model (<s>).

       -bias b
              Make a segment boundary a priori more likely by a factor of
              b.  If b is 0, the dynamic program algorithm is restricted
              to never consider hidden sentence boundaries; this is
              useful when segment-nbest is used merely for its ability to
              apply the LM across N-best boundaries.

       -start-tag string
              Insert a tag string at the front of every N-best hypothesis
              read in.

       -end-tag string
              Insert a tag string at the end of every N-best hypothesis
              read in.  This and the previous option are useful if the LM
              marks acoustic waveform boundaries with a special tag.

       segment-nbest will also process any command line arguments
       following the options as lists of N-best lists, as with the
       -nbest-files option.  Each nbest-file-list will be processed in
       turn, with individual output delimited by a line of the form

            <nbestfile nbest-file-list>

SEE ALSO
       ngram-count(1), segment(1), ngram-format(5), nbest-format(5).

       A. Stolcke, "Modeling Linguistic Segment and Turn Boundaries for
       N-best Rescoring of Spontaneous Speech," Proc. Eurospeech,
       2779-2782, 1997.

BUGS
       N-gram models of arbitrary order can be used, but the context at
       the beginning of a hypothesis never extends beyond the words from
       the preceding N-best list.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 1997-2004 SRI International

SRILM Tools           $Date: 2004/12/03 17:59:01 $      segment-nbest(1)
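
EXAMPLE (ILLUSTRATIVE)
       The page does not spell out how -rescore-lmw and -rescore-wtw enter
       the aggregate score, or how -max-rescore and -no-reorder interact
       in code.  The Python sketch below is one plausible reading, not the
       tool's implementation.  It assumes the conventional log-linear
       combination (acoustic score plus weighted LM score plus a weighted
       word count).  The names Hyp, combined_score and top_m are
       hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple


class Hyp(NamedTuple):
    """One N-best hypothesis (hypothetical container for this sketch)."""
    acoustic: float   # acoustic log score
    lm: float         # language model log score
    words: List[str]  # word sequence


def combined_score(h: Hyp, lmw: float, wtw: float) -> float:
    # Assumed log-linear combination; the man page only says the two
    # weights are used "in combining" the scores.
    return h.acoustic + lmw * h.lm + wtw * len(h.words)


def top_m(hyps: List[Hyp], m: int, lmw: float, wtw: float,
          reorder: bool = True) -> List[Hyp]:
    # reorder=True mimics the default behaviour described for
    # -max-rescore: sort by combined score, then keep the top m.
    # reorder=False mimics -no-reorder: keep the first m as read.
    if reorder:
        hyps = sorted(hyps, key=lambda h: combined_score(h, lmw, wtw),
                      reverse=True)
    return hyps[:m]


if __name__ == "__main__":
    nbest = [
        Hyp(-100.0, -20.0, ["a", "b"]),
        Hyp(-90.0, -30.0, ["a", "b", "c"]),
        Hyp(-95.0, -10.0, ["a"]),
    ]
    for h in top_m(nbest, m=2, lmw=8.0, wtw=0.0):
        print(combined_score(h, 8.0, 0.0), " ".join(h.words))
```

       Pruning each list to m hypotheses before the dynamic programming
       step bounds the number of hypothesis pairs considered between two
       adjacent lists by m * m, which is why the page calls -max-rescore
       an effective limit on the quadratic computation.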