nbest-format.5

nbest-format(5)                                   nbest-format(5)

NAME
       nbest-format - File formats for N-best hypotheses lists

DESCRIPTION
       SRILM currently understands three different formats for
       lists of N-best hypotheses for rescoring or 1-best
       hypothesis extraction.  The first two formats originated
       in the SRI Decipher(TM) recognition system; the third
       format is particular to SRILM.

       The first format consists of the header
            NBestList1.0
       followed by one or more lines of the form
            (score) w1 w2 w3 ...
       where score is a composite acoustic/language model score
       from the recognizer, on the bytelog scale.  (A bytelog is
       a logarithm to base 1.0001, divided by 1024 and rounded to
       an integer.)  This format is output by the SRI
       Decipher(TM) recognizer, by ngram -nbest, and by
       nbest-lattice -write-nbest -decipher-nbest.

       The second Decipher(TM) format is an extension of the
       first format that encodes word-level scores and time
       alignments.  It is marked by a header of the form
            NBestList2.0
       The hypotheses are in the format
            (score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...
       where each word is followed by its start time (st), end
       time (et), language model score (g), and acoustic score
       (a); both scores are bytelog-scaled.  This format may also
       contain scores and time marks for sub-word units (phones
       and HMM states), in the same format as above, but with the
       w's denoting phone and state names.  Sub-word units have
       time marks that fall within the duration of the preceding
       word unit, and may thus be easily identified.

       The third format understood by SRILM lists hypotheses in
       the format
            ascore lscore nwords w1 w2 w3 ...
       where the first three columns contain the acoustic model
       log probability, the language model log probability, and
       the number of words in the hypothesis string,
       respectively.  All scores are logarithms base 10.  (This
       format must not be preceded by an ``NBestList'' header.)
       This format is output by ngram -rescore and by
       nbest-lattice -write-nbest without the -decipher-nbest
       option.

SEE ALSO
       ngram(1), nbest-lattice(1), segment-nbest(1),
       nbest-scripts(1), pfsg-scripts(1).

BUGS
       All these formats are somewhat ad hoc and could use a more
       rational design.  The ``NBestList1.0'' format is
       particularly cumbersome because it conflates acoustic and
       language model scores.
       A generalization to an arbitrary number of separate scores
       would be nice.

AUTHOR
       Manual page written by Andreas Stolcke
       <stolcke@speech.sri.com>.
       Copyright 1999-2001 SRI International

SRILM File Formats $Date: 2001/08/11 20:03:03 $   nbest-format(5)
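
The three formats described in this page can be told apart by the presence and version of the ``NBestList'' header. The following Python sketch is not part of SRILM; it is a minimal illustration that assumes whitespace-separated fields exactly as shown above. The names parse_nbest, bytelog_to_log10, and the dictionary keys are hypothetical. The bytelog conversion follows directly from the definition given here (a base-1.0001 logarithm divided by 1024), and the sketch makes no attempt to separate sub-word units from words in the second format.

```python
import math
import re

# One bytelog unit equals 1024 * log10(1.0001) in base-10 log units,
# per the definition in this page.
LOG10_PER_BYTELOG = 1024.0 * math.log10(1.0001)

# "(score) rest-of-line" as used by both Decipher formats.
SCORE_RE = re.compile(r"\(\s*([^\s()]+)\s*\)\s*(.*)")
# "w ( st: .. et: .. g: .. a: .. )" as used by NBestList2.0.
UNIT_RE = re.compile(
    r"(\S+)\s+\(\s*st:\s*(\S+)\s+et:\s*(\S+)\s+g:\s*(\S+)\s+a:\s*(\S+)\s*\)"
)


def bytelog_to_log10(bytelog):
    """Convert a bytelog score to a base-10 log probability."""
    return bytelog * LOG10_PER_BYTELOG


def parse_nbest(lines):
    """Parse an N-best list given as an iterable of text lines.

    Returns a list of dicts, one per hypothesis. Decipher scores are
    converted from bytelogs to base-10 logs; the third format is
    already in base-10 logs and is returned unchanged.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    if not rows:
        return []
    header = rows[0]
    hyps = []
    if header in ("NBestList1.0", "NBestList2.0"):
        for row in rows[1:]:
            m = SCORE_RE.match(row)
            if m is None:
                raise ValueError("malformed hypothesis line: %r" % row)
            hyp = {"score": bytelog_to_log10(float(m.group(1)))}
            rest = m.group(2)
            if header == "NBestList1.0":
                hyp["words"] = rest.split()
            else:
                # Words and any sub-word units are returned together.
                hyp["units"] = [
                    {
                        "name": u.group(1),
                        "st": float(u.group(2)),
                        "et": float(u.group(3)),
                        "g": bytelog_to_log10(float(u.group(4))),
                        "a": bytelog_to_log10(float(u.group(5))),
                    }
                    for u in UNIT_RE.finditer(rest)
                ]
            hyps.append(hyp)
    else:
        # Third format: no header, scores already log base 10.
        for row in rows:
            fields = row.split()
            hyps.append({
                "ascore": float(fields[0]),
                "lscore": float(fields[1]),
                "nwords": int(fields[2]),
                "words": fields[3:],
            })
    return hyps
```

For example, parse_nbest(["-1234.5 -20.25 3 hello big world"]) yields a single hypothesis with three words, while a list starting with NBestList1.0 yields composite scores that cannot be split back into acoustic and language model parts, which is the limitation noted under BUGS.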
