multi-ngram(1)                                                  multi-ngram(1)

NAME
       multi-ngram - build multiword N-gram models

SYNOPSIS
       multi-ngram [-help] option ...

DESCRIPTION
       multi-ngram builds N-gram language models that contain multiwords,
       i.e., compound words that are a concatenation of words from some
       prior given model.  It will optionally generate multiword N-grams
       and insert them into an existing, reference N-gram model, so as to
       cover multiwords occurring in a specified vocabulary.  It will then
       assign probabilities to the multiword N-grams so that word strings
       containing multiwords have the same probabilities as the strings of
       component words in the reference model.

       Note that the inverse operation (expanding a multiword N-gram to
       contain only regular words) is subsumed by the ngram -expand-classes
       function.

OPTIONS
       Each filename argument can be an ASCII file, or a compressed file
       (name ending in .Z or .gz), or ``-'' to indicate stdin/stdout.

       -help  Print option summary.

       -version
              Print version information.

       -order n
              Set the maximal N-gram order to be used from the reference
              model.  NOTE: The order of the model is not set automatically
              when a model file is read, so the same file can be used at
              various orders.  To use models of order higher than 3 it is
              always necessary to specify this option.

       -multi-order n
              The maximal N-gram order in the multiword-based model.

       -debug level
              Set the debugging output level (0 means no debugging output).

       -vocab file
              Words to be added to the model.  In particular, this should
              include all the multiwords to be added.

       -multi-char C
              Character used to delimit component words in multiwords (an
              underscore character by default).

       -lm file
              Reference N-gram model.

       -multi-lm file
              Model containing multiwords; the N-grams in this model will
              be assigned new probabilities based on the reference model.
              If this option is not given then the multiword model will be
              generated by adding multiword N-grams to the reference model.

       -prune-unseen-ngrams
              This option prevents the insertion of multiword N-grams whose
              component N-grams are not contained in the reference model.
              For example, for a multiword bigram "a_b c_d" to be inserted,
              a trigram reference model must contain the trigrams "a b c"
              and "b c d".  If the reference model were a bigram LM, it
              would have to contain "a b", "b c", and "c d".  This option
              is important to control the size of the multiword LM for
              large vocabularies.

       -write-lm file
              Output location of the generated multiword model.

SEE ALSO
       ngram(1), ngram-format(5).

BUGS
       This program is a hack for cases where the original training data
       is not available and a multiword model has to be generated from an
       existing model.

       The resulting model is no longer properly normalized, since the same
       word string can potentially be represented with or without
       multiwords.

       The generation of multiword N-grams uses a heuristic algorithm that
       works well for bigrams and trigrams, but is not exhaustive.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 2000-2004 SRI International

SRILM Tools              $Date: 2004/12/03 17:59:01 $           multi-ngram(1)
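
The component N-gram requirement described under -prune-unseen-ngrams can be illustrated with a small Python sketch.  It is not SRILM code and does not reproduce the heuristic algorithm mentioned under BUGS.  It assumes that the required component N-grams are the sliding windows of reference-model order over the expanded word string, which matches the "a_b c_d" example above.  The function names (expand_multiwords, component_ngrams, is_covered) are hypothetical and chosen only for this illustration.

```python
from typing import List, Set, Tuple


def expand_multiwords(ngram: List[str], multi_char: str = "_") -> List[str]:
    """Split every multiword in an N-gram into its component words."""
    words: List[str] = []
    for token in ngram:
        words.extend(token.split(multi_char))
    return words


def component_ngrams(
    ngram: List[str], order: int, multi_char: str = "_"
) -> List[Tuple[str, ...]]:
    """List the reference-model N-grams covering the expanded word string.

    Assumption: these are the sliding windows of length `order` over the
    component words, as in the man page example.
    """
    words = expand_multiwords(ngram, multi_char)
    if len(words) <= order:
        # The whole expanded string fits in a single reference N-gram.
        return [tuple(words)]
    return [tuple(words[i:i + order]) for i in range(len(words) - order + 1)]


def is_covered(
    ngram: List[str],
    order: int,
    reference: Set[Tuple[str, ...]],
    multi_char: str = "_",
) -> bool:
    """True if every component N-gram is present in the reference set."""
    return all(g in reference for g in component_ngrams(ngram, order, multi_char))


if __name__ == "__main__":
    bigram = ["a_b", "c_d"]
    print(component_ngrams(bigram, 3))  # windows for a trigram reference model
    print(component_ngrams(bigram, 2))  # windows for a bigram reference model
```

With a trigram reference model the multiword bigram "a_b c_d" needs "a b c" and "b c d"; with a bigram reference model it needs "a b", "b c", and "c d".  Under this reading, a multiword N-gram for which is_covered returns False is the kind of entry that -prune-unseen-ngrams keeps out of the model.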
