pfsg-scripts(1)                                        pfsg-scripts(1)

NAME
       pfsg-scripts, add-classes-to-pfsg, add-pauses-to-pfsg,
       classes-to-fsm, fsm-to-pfsg, htklat-vocab, make-nbest-pfsg,
       make-ngram-pfsg, pfsg-from-ngram, pfsg-to-dot, pfsg-to-fsm,
       pfsg-vocab, wlat-stats, wlat-to-dot, wlat-to-pfsg - create and
       manipulate finite-state networks

SYNOPSIS
       make-ngram-pfsg [maxorder=N] [check_bows=0|1] [no_empty_bo=1]
              [version=1] [top_level_name=name] [null=string]
              [lm-file] >pfsg-file
       add-pauses-to-pfsg [vocab=file] [pauselast=1] [wordwrap=0]
              [pause=pauseword] [version=1] [top_level_name=name]
              [null=string] [pfsg-file]
       add-classes-to-pfsg classes=classes [null=string] [pfsg-file]
       pfsg-from-ngram [lm-file] >pfsg-file
       make-nbest-pfsg [notree=0|1 scale=S amw=A lmw=L wtw=W]
              [nbest-file]
       pfsg-vocab [pfsg-file...]
       htklat-vocab [quotes=1] [htk-lattice-file...]
       pfsg-to-dot [show_probs=0|1 show_logs=0|1 show_nums=0|1]
              [pfsg-file]
       pfsg-to-fsm [symbolfile=symbols symbolic=0|1 scale=S
              final_output=E] [pfsg-file]
       fsm-to-pfsg [pfsg_name=name transducer=0|1 scale=S] [fsm-file]
       classes-to-fsm vocab=vocab [isymbolfile=isymbols
              osymbolfile=osymbols symbolic=0|1] [classes]
       wlat-to-pfsg [wlat-file]
       wlat-to-dot [show_probs=0|1 show_nums=0|1] [wlat-file]
       wlat-stats [wlat-file]

DESCRIPTION
       These scripts create and manipulate various forms of
       finite-state networks. Note that they take options with the
       gawk(1) syntax option=value instead of the more common
       -option value. Also, since these tools are implemented as
       scripts they don't automatically input or output compressed
       model files correctly, unlike the main SRILM tools. However,
       since most scripts work with data from standard input or to
       standard output (by leaving out the file argument, or
       specifying it as ``-'') it is easy to combine them with
       gunzip(1) or gzip(1) on the command line.

       make-ngram-pfsg encodes a backoff N-gram model in
       ngram-format(5) as a finite-state network in pfsg-format(5).
       maxorder=N limits the N-gram length used in PFSG construction
       to N; the default is to use all N-grams occurring in the input
       model. check_bows=1 enables a check for conditional
       probabilities that are smaller than the corresponding backoff
       probabilities. Such transitions should first be removed from
       the model with ngram -prune-lowprobs.
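       The pruning step and the conversion can be chained in a single
       pipeline. The following is a minimal sketch and not part of the
       SRILM distribution: the function name prune_and_convert, the
       file names, and the default order of 3 are assumptions made for
       illustration. Only ngram and make-ngram-pfsg options mentioned
       on this page are used. As one of the main SRILM tools, ngram is
       expected to read a compressed model by itself, while the output
       of the script is compressed explicitly with gzip(1).

```shell
# Hypothetical helper (not shipped with SRILM).
# Usage: prune_and_convert LM-FILE PFSG-GZ-FILE [ORDER]
prune_and_convert() {
    lm=$1                # backoff LM in ngram-format(5)
    out=$2               # gzip-compressed PFSG to create
    order=${3:-3}        # assumed default N-gram order for this sketch

    # Remove N-grams whose probability is below the backoff estimate,
    # write the pruned model to stdout, convert it (reading stdin because
    # no file argument is given), and compress the resulting PFSG.
    ngram -order "$order" -lm "$lm" -prune-lowprobs -write-lm - |
        make-ngram-pfsg check_bows=1 maxorder="$order" |
        gzip >"$out"
}
```

       A call such as prune_and_convert LM.bo.gz LM.pfsg.gz 3 would
       then produce a compressed PFSG that can be fed to later stages
       with gunzip -c LM.pfsg.gz.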

       no_empty_bo=1 prevents empty paths through the PFSG resulting
       from transitions through the unigram backoff node.

       add-pauses-to-pfsg replaces the word nodes in an input PFSG
       with sub-PFSGs that allow an optional pause before each word.
       It also inserts an optional pause following the last word in
       the sentence. A typical usage is

              make-ngram-pfsg ngram | \
              add-pauses-to-pfsg >final-pfsg

       The result is a PFSG suitable for use in a speech recognizer.
       The option pauselast=1 switches the order of words and pause
       nodes in the sub-PFSGs; wordwrap=0 disables the insertion of
       sub-PFSGs altogether. The options pause=pauseword and
       top_level_name=name allow changing the default names of the
       pause word and the top-level grammar, respectively. version=1
       inserts a version line at the top of the output as required by
       the Nuance recognition system (see NUANCE COMPATIBILITY below).
       add-pauses-to-pfsg uses a heuristic to distinguish word nodes
       in the input PFSG from other nodes (NULL or sub-PFSGs). The
       option vocab=file lets one specify a vocabulary of word names
       to override this heuristic.

       add-classes-to-pfsg extends an input PFSG with expansions for
       word classes, defined in classes. pfsg-file should contain a
       PFSG generated from the N-gram portion of a class N-gram model.
       A typical usage is thus

              make-ngram-pfsg class-ngram | \
              add-classes-to-pfsg classes=classes | \
              add-pauses-to-pfsg >final-pfsg

       pfsg-from-ngram is a wrapper script that combines removal of
       low-probability N-grams, conversion to PFSG, and adding of
       optional pauses to create a PFSG for recognition.

       make-nbest-pfsg converts an N-best list in nbest-format(5) into
       a PFSG which, when used in recognition, allows exactly the
       hypotheses contained in the N-best list.
       notree=1 creates separate PFSG nodes for all word instances;
       the default is to construct a prefix-tree structured PFSG.
       scale=S multiplies the total hypothesis scores by S; the
       default is 0, meaning that all hypotheses have identical
       probability in the PFSG. Three options, amw=A, lmw=L, and
       wtw=W, control the score weighting in N-best lists that contain
       separate acoustic and language model scores, setting the
       acoustic model weight to A, the language model weight to L, and
       the word transition weight to W.

       pfsg-vocab extracts the vocabulary used in one or more PFSGs.
       htklat-vocab does the same for lattices in HTK standard lattice
       format. The quotes=1 option enables processing of HTK quotes.

       pfsg-to-dot renders a PFSG in dot(1) format for subsequent
       layout, printing, etc. show_probs=1 includes transition
       probabilities in the output. show_logs=1 includes log (base 10)
       transition probabilities in the output. show_nums=1 includes
       node numbers in the output.

       pfsg-to-fsm converts a finite-state network in pfsg-format(5)
       into an equivalent network in AT&T fsm(5) format. This involves
       moving output actions from nodes to transitions. If
       symbolfile=symbols is specified, the mapping from FSM output
       symbols is written to symbols for later use with the -i or -o
       options of fsm(1) tools. symbolic=1 preserves the word strings
       in the resulting FSA. scale=S scales the transition weights by
       a factor S; the default is -1 (to conform to the default FSM
       semiring). final_output=E forces the final FSA node to have
       output label E; this also forces creation of a unique final FSA
       node, which is otherwise unnecessary if the final node has a
       null output.

       fsm-to-pfsg conversely transforms fsm(5) format into
       pfsg-format(5).
       This involves moving output actions from transitions to nodes,
       and generally requires an increase in the number of nodes. (The
       conversion is done such that pfsg-to-fsm and fsm-to-pfsg are
       exact inverses of each other.) pfsg_name=name sets the name
       field of the output PFSG. transducer=1 indicates that the input
       is a transducer and that input:output pairs should be preserved
       in the PFSG. scale=S scales the transition weights by a factor
       S; the default is -1 (to conform to the default FSM semiring).

       classes-to-fsm converts a classes-format(5) file into a
       transducer in fsm(5) format, such that composing the transducer
       with an FSA encoding a class language model results in an FSA
       for the word language model. The word vocabulary needs to be
       given in file vocab. isymbolfile=isymbols and
       osymbolfile=osymbols allow saving the input and output symbol
       tables of the transducer for later use. symbolic=1 preserves
       the word strings in the resulting FSA.

       The following commands show the creation of an FSA encoding the
       class N-gram grammar ``test.bo'' with vocabulary ``test.vocab''
       and class expansions ``test.classes'':

              classes-to-fsm vocab=test.vocab symbolic=1 \
                      isymbolfile=CLASSES.inputs \
                      osymbolfile=CLASSES.outputs \
                      test.classes >CLASSES.fsm

              make-ngram-pfsg test.bo | \
              pfsg-to-fsm symbolic=1 >test.fsm
              fsmcompile -i CLASSES.inputs test.fsm >test.fsmc

              fsmcompile -t -i CLASSES.inputs -o CLASSES.outputs \
                      CLASSES.fsm >CLASSES.fsmc
              fsmcompose test.fsmc CLASSES.fsmc >result.fsmc

       wlat-to-pfsg converts a word posterior lattice or mesh
       ("sausage") in wlat-format(5) into pfsg-format(5).

       wlat-to-dot renders a wlat-format(5) word lattice in dot(1)
       format for subsequent layout, printing, etc. show_probs=1
       includes node posterior probabilities in the output.
       show_nums=1 includes node indices in the output.

       wlat-stats computes statistics of word posterior lattices,
       including the number of word hypotheses, the entropy (log base
       10) of the sentence hypothesis set represented, and the
       posterior expected number of words. For word meshes that have
       been aligned with references, the 1-best and oracle lattice
       error rates are also computed.

NUANCE COMPATIBILITY
       The Nuance recognizer (as of version 6.2) understands a variant
       of the PFSG format; hence the scripts above should be useful in
       building recognition systems for that recognizer. A suitable
       PFSG can be generated from an N-gram backoff model in ARPA
       ngram-format(5) using the following command:

              ngram -debug 1 -order N -lm LM.bo -prune-lowprobs -write-lm - | \
              make-ngram-pfsg | \
              add-pauses-to-pfsg version=1 pauselast=1 pause=_pau_ top_level_name=.TOP_LEVEL >LM.pfsg

       assuming the pause word in the dictionary is ``_pau_''. Certain
       restrictions on the naming of words (e.g., no hyphens are
       allowed) have to be respected.

       The resulting PFSG can then be referenced in a Nuance grammar
       file, e.g.,

              .TOP [NGRAM_PFSG]
              NGRAM_PFSG:lm LM.pfsg

       In newer Nuance versions the name for a non-emitting node was
       changed to NULNOD, and inter-word optional pauses are
       automatically added to the grammar. This means that the PFSG
       should be created using

              ngram -debug 1 -order N -lm LM.bo -prune-lowprobs -write-lm - | \
              make-ngram-pfsg version=1 top_level_name=.TOP_LEVEL null=NULNOD >LM.pfsg

       The null=NULNOD option should also be passed to
       add-classes-to-pfsg.

       Starting with version 8, Nuance supports N-gram LMs. However,
       you can still use SRILM to create LMs, as described above.
       The syntax for inclusion of a PFSG has changed to

              NGRAM_PFSG:slm LM.pfsg

       Caveat: Compatibility with Nuance is purely due to historical
       circumstance and not supported.

SEE ALSO
       lattice-tool(1), ngram(1), ngram-format(5), pfsg-format(5),
       wlat-format(5), nbest-format(5), classes-format(5), fsm(5),
       dot(1).

BUGS
       make-ngram-pfsg should be reimplemented in C++ for speed and
       some size optimizations that require more global operations on
       the PFSG.

AUTHOR
       Andreas Stolcke <stolcke@speech.sri.com>.
       Copyright 1995-2005 SRI International

SRILM Tools          $Date: 2006/10/05 19:43:07 $          pfsg-scripts(1)
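       The option=value convention noted under DESCRIPTION is the
       ordinary awk command-line variable assignment. The sketch below
       illustrates the mechanism with a hypothetical stand-in script;
       the function tag_lines and the variable label are invented for
       this example and are not part of SRILM. It uses plain awk, on
       the assumption that the behavior shown is the same as in
       gawk(1).

```shell
# Hypothetical stand-in for an awk-based tool: prefixes every input line
# with the value of the awk variable "label". The default is set in BEGIN;
# an operand of the form label=value on the command line overrides it
# before the following input file (here "-", standard input) is read.
tag_lines() {
    awk 'BEGIN { label = "word" } { print label ": " $0 }' "$@"
}
```

       For instance, printf 'a\nb\n' | tag_lines label=pause - prints
       ``pause: a'' and ``pause: b'', while omitting the assignment
       falls back to the default label.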