tidigits.lm

From "Speech recognition program for the WinCE platform" · LM code · 64 lines

#############################################################################
## Copyright (c) 1996, Carnegie Mellon University, Cambridge University,
## Ronald Rosenfeld and Philip Clarkson
#############################################################################
=============================================================================
===============  This file was produced by the CMU-Cambridge  ===============
===============     Statistical Language Modeling Toolkit     ===============
=============================================================================
This is a 2-gram language model, based on a vocabulary of 13 words,
  which begins "<s>", "</s>", "oh"...
This is an OPEN-vocabulary model (type 1)
  (OOVs were mapped to UNK, which is treated as any other vocabulary word)
Absolute discounting was applied.
1-gram discounting constant : 0.04
2-gram discounting constant : 0.04
This file is in the ARPA-standard format introduced by Doug Paul.

p(wd3|wd1,wd2)= if(trigram exists)           p_3(wd1,wd2,wd3)
                else if(bigram w1,w2 exists) bo_wt_2(w1,w2)*p(wd3|wd2)
                else                         p(wd3|w2)

p(wd2|wd1)= if(bigram exists) p_2(wd1,wd2)
            else              bo_wt_1(wd1)*p_1(wd2)

All probs and back-off weights (bo_wt) are given in log10 form.

Data formats:

Beginning of data mark: \data\
ngram 1=nr            # number of 1-grams
ngram 2=nr            # number of 2-grams

\1-grams:
p_1     wd_1 bo_wt_1
\2-grams:
p_2     wd_1 wd_2

end of data mark: \end\

\data\
ngram 1=14
ngram 2=1

\1-grams:
-1.6805 <UNK>	0.0000
-99.0000 <s>	0.0000
-1.3795 </s>	0.0000
-1.0695 oh	0.0000
-1.0695 zero	0.0000
-1.0695 one	0.0000
-1.0695 two	0.0000
-1.0695 three	0.0000
-1.0695 four	0.0000
-1.0695 five	0.0000
-1.0695 six	0.0000
-1.0695 seven	0.0000
-1.0695 eight	0.0000
-1.0695 nine	0.0000

\2-grams:
-99.0177 </s> <s>

\end\
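
The back-off rule in the file header, p(wd2|wd1), can be illustrated with a short Python sketch. This is not part of tidigits.lm or of the CMU-Cambridge toolkit: the function names `parse_arpa` and `bigram_logprob` are hypothetical, and the embedded sample is only an excerpt of the model (the `ngram` counts are adjusted to match the excerpt). It assumes a 2-gram ARPA file in which all values are log10, so the product bo_wt_1(wd1)*p_1(wd2) becomes a sum.

```python
from typing import Dict, Tuple

# Excerpt of the model above, for illustration only (not the full word list).
ARPA_SAMPLE = r"""
\data\
ngram 1=6
ngram 2=1

\1-grams:
-1.6805 <UNK> 0.0000
-99.0000 <s> 0.0000
-1.3795 </s> 0.0000
-1.0695 oh 0.0000
-1.0695 zero 0.0000
-1.0695 one 0.0000

\2-grams:
-99.0177 </s> <s>

\end\
"""

Unigrams = Dict[str, Tuple[float, float]]   # word -> (log10 prob, log10 back-off weight)
Bigrams = Dict[Tuple[str, str], float]      # (wd1, wd2) -> log10 prob


def parse_arpa(text: str) -> Tuple[Unigrams, Bigrams]:
    """Read the \\1-grams: and \\2-grams: sections of a 2-gram ARPA file."""
    unigrams: Unigrams = {}
    bigrams: Bigrams = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section markers: \data\, \1-grams:, \2-grams:, \end\
            section = line
            continue
        fields = line.split()
        if section == "\\1-grams:":
            # A missing back-off weight is treated as log10(1) = 0.
            backoff = float(fields[2]) if len(fields) > 2 else 0.0
            unigrams[fields[1]] = (float(fields[0]), backoff)
        elif section == "\\2-grams:":
            bigrams[(fields[1], fields[2])] = float(fields[0])
    return unigrams, bigrams


def bigram_logprob(unigrams: Unigrams, bigrams: Bigrams, wd1: str, wd2: str) -> float:
    """log10 p(wd2|wd1) following the back-off rule stated in the header."""
    # Open-vocabulary model: out-of-vocabulary words are mapped to <UNK>.
    wd1 = wd1 if wd1 in unigrams else "<UNK>"
    wd2 = wd2 if wd2 in unigrams else "<UNK>"
    if (wd1, wd2) in bigrams:
        return bigrams[(wd1, wd2)]
    # Back off: bo_wt_1(wd1) * p_1(wd2), which is a sum in the log10 domain.
    return unigrams[wd1][1] + unigrams[wd2][0]


unigrams, bigrams = parse_arpa(ARPA_SAMPLE)
print(bigram_logprob(unigrams, bigrams, "oh", "zero"))   # backed off: -1.0695
print(bigram_logprob(unigrams, bigrams, "</s>", "<s>"))  # explicit bigram: -99.0177
```

Because the only explicit bigram is `</s> <s>` and every back-off weight is 0.0000, almost every digit pair in this model falls through to the unigram probability of the second word.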
