
📄 nbest-optimize.1

nbest-optimize(1)                               nbest-optimize(1)

NAME
       nbest-optimize - optimize score combination for N-best
       word error minimization

SYNOPSIS
       nbest-optimize [-help] option ... [ scoredir ... ]

DESCRIPTION
       nbest-optimize reads a set of N-best lists, additional
       score files, and corresponding reference transcripts and
       optimizes the score combination weights so as to minimize
       the word error of a classifier that performs word-level
       posterior probability maximization.  The optimized weights
       are meant to be used with nbest-lattice(1) and the
       -use-mesh option, or the nbest-rover script (see
       nbest-scripts(1)).  nbest-optimize determines both the best
       relative weighting of knowledge source scores and the
       optimal -posterior-scale parameter that controls the
       peakedness of the posterior distribution.

       The optimization is performed by gradient descent on a
       smoothed (sigmoidal) approximation of the true 0/1 word
       error function (Katagiri et al. 1990).  Therefore, the
       result can only be expected to be a local minimum of the
       error surface.  (A more global search can be attempted by
       specifying different starting points.)  Another
       approximation is that the error function is computed
       assuming a fixed multiple alignment of all N-best
       hypotheses and the reference string, which tends to
       slightly overestimate the true pairwise error between any
       single hypothesis and the reference.

       An alternative search strategy uses a simplex-based
       "Amoeba" search on the (non-smoothed) word error function
       (Press et al. 1988).  The search is restarted multiple
       times to avoid local minima.

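The smoothing idea described above can be illustrated with a small Python sketch. It is a generic picture of replacing a 0/1 error with a sigmoid so that a gradient exists, not the loss that nbest-optimize actually implements. The function names, the steepness parameter, and the reduction to "correct word versus best competitor" are all assumptions made for the illustration.

```python
import math

# Illustrative only: these names and the two-score setup are hypothetical,
# not taken from nbest-optimize.

def zero_one_error(correct_score, best_competitor_score):
    """True 0/1 error: 1 if the best competing word outscores the correct one."""
    return 1.0 if best_competitor_score > correct_score else 0.0

def smoothed_error(correct_score, best_competitor_score, steepness=1.0):
    """Sigmoid of the score margin: a differentiable stand-in for the 0/1 error."""
    margin = best_competitor_score - correct_score
    return 1.0 / (1.0 + math.exp(-steepness * margin))

def smoothed_error_gradient(correct_score, best_competitor_score, steepness=1.0):
    """Derivative of smoothed_error with respect to correct_score."""
    s = smoothed_error(correct_score, best_competitor_score, steepness)
    return -steepness * s * (1.0 - s)
```

The 0/1 error is flat almost everywhere, so it gives gradient descent nothing to follow. The sigmoid version has a nonzero slope near the decision boundary, which is also why the result can only be a local minimum of the smoothed surface.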
       Alternatively, nbest-optimize can also optimize weights
       for a standard, 1-best hypothesis rescoring that selects
       entire (sentence) hypotheses (-1best option).  In this
       mode sentence-level error counts may be read from external
       files, or computed on the fly from the reference strings.
       The weights obtained are meant to be used for N-best list
       rescoring with rescore-reweight (see nbest-scripts(1)).

OPTIONS
       Each filename argument can be an ASCII file, or a
       compressed file (name ending in .Z or .gz), or "-" to
       indicate stdin/stdout.

       -help  Print option summary.

       -version
              Print version information.

       -debug level
              Controls the amount of output (the higher the
              level, the more).  At level 1, error statistics at
              each iteration are printed.  At level 2, word
              alignments are printed.  At level 3, the full score
              matrix is printed.  At level 4, detailed
              information about word hypothesis ranking is printed
              for each training iteration and sample.

       -nbest-files file-list
              Specifies the set of N-best files as a list of
              filenames.  Three sets of standard scores are
              extracted from the N-best files: the acoustic model
              score, the language model score, and the number of
              words (for insertion penalty computation).  See
              nbest-format(5) for details.

       -refs references
              Specifies the reference transcripts.  Each line in
              references must contain the sentence ID (the last
              component in the N-best filename path, minus any
              suffixes) followed by zero or more reference words.

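The relationship between N-best filenames and reference lines described under -refs can be sketched in Python as follows. The helper names are hypothetical, and reading "minus any suffixes" as "everything from the first dot onward is dropped" is an assumption, not something the text above states.

```python
import os

# Hypothetical helpers for illustration; not part of the toolkit.

def sentence_id_from_path(nbest_path):
    """Last path component with suffixes removed.

    Assumption: a suffix is everything from the first '.' onward.
    """
    base = os.path.basename(nbest_path)
    return base.split(".", 1)[0]

def parse_reference_line(line):
    """Split a reference line into (sentence ID, list of reference words)."""
    fields = line.split()
    if not fields:
        raise ValueError("reference line has no sentence ID")
    return fields[0], fields[1:]
```

A reference line with only an ID and no words is valid under this reading, since the format allows zero reference words.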
       -insertion-weight W
              Weight insertion errors by a factor W.  This may be
              useful to optimize for keyword spotting tasks where
              insertions have a cost different from deletion and
              substitution errors.

       -word-weights file
              Read a table of words and weights from file.  Each
              word error is weighted according to the
              word-specific weight.  The default weight is 1, and
              is used if a word has no specified weight.  Also,
              when this option is used, substitution errors are
              counted as the sum of a deletion and an insertion
              error, as opposed to counting as 1 error as in
              traditional word error computation.

       -1best Select optimization for standard sentence-level
              hypothesis selection.

       -1best-first
              Optimize first using -1best mode, then switch to
              full optimization.  This is an effective way to
              quickly bring the score weights near an optimal
              point, and then fine-tune them jointly with the
              posterior scale parameter.

       -errors dir
              In 1-best mode, optimize for error counts that are
              stored in separate files in directory dir.  Each
              N-best list must have a matching error counts file
              of the same basename in dir.  Each file contains 7
              columns of numbers in the format

                   wcr wer nsub ndel nins nerr nw

              Only the last two columns (number of errors and
              words, respectively) are used.

              If this option is omitted, errors will be computed
              from the N-best hypotheses and the reference
              transcripts.

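The error counts format described under -errors can be read with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and it assumes one whitespace-separated row per hypothesis, which the text above does not spell out.

```python
# Hypothetical reader for illustration; not part of the toolkit.

def parse_error_counts(text):
    """Return a list of (nerr, nw) pairs, one per non-empty row.

    Columns are: wcr wer nsub ndel nins nerr nw.
    Only the last two (errors and words) are kept.
    Assumption: each row corresponds to one hypothesis.
    """
    counts = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 7:
            raise ValueError("expected 7 columns, got %d" % len(fields))
        counts.append((float(fields[5]), float(fields[6])))
    return counts
```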
       -max-nbest n
              Limits the number of hypotheses read from each
              N-best list to the first n.

       -rescore-lmw lmw
              Sets the language model weight used in combining
              the language model log probabilities with acoustic
              log probabilities.  This is used to compute initial
              aggregate hypothesis scores.

       -rescore-wtw wtw
              Sets the word transition weight used to weight the
              number of words relative to the acoustic log
              probabilities.  This is used to compute initial
              aggregate hypothesis scores.

       -posterior-scale scale
              Initial value for scaling log posteriors.  The
              total weighted log score is divided by scale when
              computing normalized posterior probabilities.  This
              controls the peakedness of the posterior
              distribution.  The default value is whatever was
              chosen for -rescore-lmw, so that language model
              scores are scaled to have weight 1, and acoustic
              scores have weight 1/lmw.

       -combine-linear
              Compute aggregate scores by linear combination,
              rather than log-linear combination.  (This is
              appropriate if the input scores represent
              log-posterior probabilities.)

       -non-negative
              Constrain search to non-negative weight values.

       -vocab file
              Read the N-best list vocabulary from file.  This
              option is mostly redundant since words found in the
              N-best input are implicitly added to the
              vocabulary.

       -tolower
              Map vocabulary to lowercase, eliminating case
              distinctions.

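The effect of -posterior-scale described above can be made concrete with a short Python sketch. It assumes the aggregate score is acoustic + lmw * lm + wtw * num_words and that scores are natural logarithms. Both are assumptions for illustration, and the function name is hypothetical.

```python
import math

# Hypothetical illustration of the log-linear combination and scaling.

def hypothesis_posteriors(acoustic, lm, num_words, lmw, wtw, scale=None):
    """Normalized posteriors over the hypotheses of one N-best list.

    Assumed aggregate score: acoustic + lmw * lm + wtw * num_words.
    The score is divided by `scale` before normalization; when `scale`
    is None it defaults to lmw, mirroring the documented default.
    """
    if scale is None:
        scale = lmw
    totals = [(a + lmw * l + wtw * n) / scale
              for a, l, n in zip(acoustic, lm, num_words)]
    peak = max(totals)  # subtract the maximum for numerical stability
    exps = [math.exp(t - peak) for t in totals]
    norm = sum(exps)
    return [e / norm for e in exps]
```

A smaller scale sharpens the distribution toward the top-scoring hypothesis, and a larger one flattens it. This is what "peakedness" refers to.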
       -multiwords
              Split multiwords (words joined by '_') into their
              components when reading N-best lists.
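As a rough picture of what -multiwords does to a hypothesis, the following Python sketch splits underscore-joined tokens. The function name is hypothetical, and keeping a token that consists only of underscores unchanged is an assumption of this sketch.

```python
# Hypothetical illustration; not the toolkit's own implementation.

def split_multiwords(words):
    """Expand tokens such as 'new_york' into their component words."""
    result = []
    for word in words:
        parts = [p for p in word.split("_") if p]
        # Assumption: a token with no non-empty parts is kept as-is.
        result.extend(parts if parts else [word])
    return result
```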
