📄 nbest-format.5

nbest-format(5)                                   nbest-format(5)

NAME
       nbest-format - File formats for N-best hypotheses lists

DESCRIPTION
       SRILM currently understands three different formats for lists
       of N-best hypotheses for rescoring or 1-best hypothesis
       extraction.  The first two formats originated in the SRI
       Decipher(TM) recognition system, the third format is
       particular to SRILM.

       The first format consists of the header

            NBestList1.0

       followed by one or more lines of the form

            (score) w1 w2 w3 ...

       where score is a composite acoustic/language model score from
       the recognizer, on the bytelog scale.  (A bytelog is a
       logarithm to base 1.0001, divided by 1024 and rounded to an
       integer.)  This format is output by the SRI Decipher(TM)
       recognizer, by the ngram -nbest, and by nbest-lattice
       -write-nbest -decipher-nbest.

       The second Decipher(TM) format is an extension of the first
       format that encodes word-level scores and time alignments.  It
       is marked by a header of the form

            NBestList2.0

       The hypotheses are in the format

            (score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...

       where words are followed by start and end times, language
       model and acoustic scores (bytelog-scaled), respectively.
       This format may also contain scores and time marks for
       sub-word units (phones and HMM states), in the same format as
       above, but with the w's denoting phone and state names.
       Sub-word units will have time marks that are contained in the
       duration of the preceding word units, and may thus be easily
       identified.

       The third format understood by SRILM lists hypotheses in the
       format

            ascore lscore nwords w1 w2 w3 ...

       where the first three columns contain the acoustic model log
       probability, the language model log probability, and the
       number of words in the hypothesis string, respectively.  All
       scores are logarithms base 10.  (This format must not be
       preceded by an ``NBestList'' header.)  This format is output
       by the ngram -rescore and by nbest-lattice -write-nbest
       without the -decipher-nbest option.

SEE ALSO
       ngram(1), nbest-lattice(1), segment-nbest(1),
       nbest-scripts(1), pfsg-scripts(1).

BUGS
       All these formats are somewhat ad hoc and could use a more
       rational design.  The ``NBestList1.0'' format is particularly
       cumbersome because it conflates acoustic and language model
       scores.
       A generalization to an arbitrary number of separate scores
       would be nice.

AUTHOR
       Manual page written by Andreas Stolcke
       <stolcke@speech.sri.com>.
       Copyright 1999-2001 SRI International

SRILM File Formats $Date: 2001/08/11 20:03:03 $   nbest-format(5)
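
EXAMPLE
       The three layouts described under DESCRIPTION can be told
       apart by the first line of the file.  The following Python
       sketch is an illustration only and is not part of SRILM: the
       names parse_nbest and bytelog_to_log10 and the dictionary keys
       are hypothetical.  It assumes one hypothesis per line, scores
       in decimal notation, and it makes no attempt to separate
       sub-word units from words in the second format.  The bytelog
       conversion factor is derived directly from the definition
       given above (logarithm to base 1.0001, divided by 1024).

```python
import math
import re

# One bytelog unit expressed in log10 units, from the man page definition:
# bytelog = log_1.0001(p) / 1024, hence log10(p) = bytelog * 1024 * log10(1.0001).
BYTELOG_TO_LOG10 = 1024 * math.log10(1.0001)

# "(score) rest of line", as used by both Decipher formats.
_SCORED = re.compile(r"^\((-?\d+(?:\.\d+)?)\)\s*(.*)$")
# "w ( st: .. et: .. g: .. a: .. )", the per-unit annotation of NBestList2.0.
_UNIT = re.compile(
    r"(\S+)\s+\(\s*st:\s*(\S+)\s+et:\s*(\S+)\s+g:\s*(\S+)\s+a:\s*(\S+)\s*\)"
)


def bytelog_to_log10(bytelog):
    """Convert a bytelog score to a base-10 log probability (approximate)."""
    return bytelog * BYTELOG_TO_LOG10


def _split_score(line):
    """Split '(score) rest' into (float score, rest string)."""
    m = _SCORED.match(line)
    if m is None:
        raise ValueError("not a '(score) words' line: %r" % line)
    return float(m.group(1)), m.group(2)


def parse_nbest(lines):
    """Parse an N-best list, choosing the format from the header line."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    if not rows:
        return []
    hyps = []
    if rows[0] == "NBestList1.0":
        # Format 1: composite bytelog score followed by the words.
        for ln in rows[1:]:
            score, rest = _split_score(ln)
            hyps.append({"score_bytelog": score, "words": rest.split()})
    elif rows[0] == "NBestList2.0":
        # Format 2: each unit carries times, LM score (g) and acoustic score (a).
        for ln in rows[1:]:
            score, rest = _split_score(ln)
            units = [
                {"word": w, "start": float(st), "end": float(et),
                 "lm_bytelog": float(g), "ac_bytelog": float(a)}
                for w, st, et, g, a in _UNIT.findall(rest)
            ]
            hyps.append({"score_bytelog": score, "units": units})
    else:
        # Format 3: no header; acoustic log10, LM log10, word count, words.
        for ln in rows:
            ascore, lscore, nwords, *words = ln.split()
            hyps.append({"ascore": float(ascore), "lscore": float(lscore),
                         "nwords": int(nwords), "words": words})
    return hyps
```

       Calling parse_nbest on the lines of a file returns one
       dictionary per hypothesis.  Scores from the two Decipher
       formats stay on the bytelog scale until passed through
       bytelog_to_log10, which makes them comparable with the base-10
       scores of the third format.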
