tidigits.lm
#############################################################################
## Copyright (c) 1996, Carnegie Mellon University, Cambridge University,
## Ronald Rosenfeld and Philip Clarkson
#############################################################################
=============================================================================
===============  This file was produced by the CMU-Cambridge  ===============
===============     Statistical Language Modeling Toolkit     ===============
=============================================================================
This is a 2-gram language model, based on a vocabulary of 13 words,
  which begins "<s>", "</s>", "oh"...
This is an OPEN-vocabulary model (type 1)
  (OOVs were mapped to UNK, which is treated as any other vocabulary word)
Absolute discounting was applied.
1-gram discounting constant : 0.04
2-gram discounting constant : 0.04
This file is in the ARPA-standard format introduced by Doug Paul.

p(wd3|wd1,wd2)= if(trigram exists)           p_3(wd1,wd2,wd3)
                else if(bigram w1,w2 exists) bo_wt_2(w1,w2)*p(wd3|wd2)
                else                         p(wd3|w2)

p(wd2|wd1)= if(bigram exists) p_2(wd1,wd2)
            else              bo_wt_1(wd1)*p_1(wd2)

All probs and back-off weights (bo_wt) are given in log10 form.

Data formats:

Beginning of data mark: \data\
ngram 1=nr            # number of 1-grams
ngram 2=nr            # number of 2-grams

\1-grams:
p_1     wd_1 bo_wt_1
\2-grams:
p_2     wd_1 wd_2

end of data mark: \end\

\data\
ngram 1=14
ngram 2=1

\1-grams:
-1.6805 <UNK> 0.0000
-99.0000 <s> 0.0000
-1.3795 </s> 0.0000
-1.0695 oh 0.0000
-1.0695 zero 0.0000
-1.0695 one 0.0000
-1.0695 two 0.0000
-1.0695 three 0.0000
-1.0695 four 0.0000
-1.0695 five 0.0000
-1.0695 six 0.0000
-1.0695 seven 0.0000
-1.0695 eight 0.0000
-1.0695 nine 0.0000

\2-grams:
-99.0177 </s> <s>

\end\
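
The back-off rule stated in the header, p(wd2|wd1), can be illustrated with a short Python sketch. This is not part of the CMU-Cambridge toolkit: the function names `parse_arpa` and `bigram_logprob` are hypothetical, and the sample text copies only a few entries from the data section above. Because all values are log10, the product bo_wt_1(wd1)*p_1(wd2) becomes a sum.

```python
# Minimal sketch (assumed helper names, not a toolkit API): read the data
# section of a 2-gram ARPA file and apply the bigram back-off rule.

SAMPLE = r"""
\data\
ngram 1=4
ngram 2=1

\1-grams:
-99.0000 <s> 0.0000
-1.3795 </s> 0.0000
-1.0695 oh 0.0000
-1.0695 zero 0.0000

\2-grams:
-99.0177 </s> <s>

\end\
"""


def parse_arpa(text):
    """Return (unigrams, bigrams) parsed from ARPA-format text.

    unigrams maps word -> (log10 prob, log10 back-off weight);
    bigrams maps (wd1, wd2) -> log10 prob.
    """
    unigrams, bigrams = {}, {}
    started = False   # becomes True at the real "\data\" marker line
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not started:
            # Skip the descriptive header, which also mentions "\1-grams:".
            started = line == "\\data\\"
            continue
        if not line:
            continue
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith(":"):
            section = line
            continue
        fields = line.split()
        if section == "\\1-grams:":
            # The back-off weight may be omitted; treat it as 0 in log10.
            bo_wt = float(fields[2]) if len(fields) > 2 else 0.0
            unigrams[fields[1]] = (float(fields[0]), bo_wt)
        elif section == "\\2-grams:":
            bigrams[(fields[1], fields[2])] = float(fields[0])
    return unigrams, bigrams


def bigram_logprob(unigrams, bigrams, wd1, wd2):
    """log10 p(wd2|wd1): explicit bigram if present, else back off."""
    if (wd1, wd2) in bigrams:
        return bigrams[(wd1, wd2)]
    # log10(bo_wt_1(wd1) * p_1(wd2)) = bo_wt_1(wd1) + p_1(wd2) in log space
    return unigrams[wd1][1] + unigrams[wd2][0]


unigrams, bigrams = parse_arpa(SAMPLE)
backed_off = bigram_logprob(unigrams, bigrams, "oh", "zero")
explicit = bigram_logprob(unigrams, bigrams, "</s>", "<s>")
```

With the sample entries, the pair ("oh", "zero") has no explicit bigram, so the result is the back-off weight of "oh" plus the unigram log probability of "zero"; the pair ("</s>", "<s>") is read directly from the 2-gram section.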