%/* ----------------------------------------------------------- */


%/*                                                             */


%/*                          ___                                */


%/*                       |_| | |_/   SPEECH                    */


%/*                       | | | | \   RECOGNITION               */


%/*                       =========   SOFTWARE                  */ 


%/*                                                             */


%/*                                                             */


%/* ----------------------------------------------------------- */


%/*         Copyright: Microsoft Corporation                    */


%/*          1995-2000 Redmond, Washington USA                  */


%/*                    http://www.microsoft.com                */


%/*                                                             */


%/*   Use of this software is governed by a License Agreement   */


%/*    ** See the file License for the Conditions of Use  **    */


%/*    **     This banner notice must not be removed      **    */


%/*                                                             */


%/* ----------------------------------------------------------- */


%


% HTKBook - Steve Young  24/11/97


%





\newpage


\mysect{HBuild}{HBuild}





\mysubsect{Function}{HBuild-Function}





\index{hbuild@\htool{HBuild}|(}


This program converts input files that represent language
models in a number of different formats into a standard
\HTK\ lattice. The main purpose of \htool{HBuild} is to allow the
expansion of \HTK\ multi-level lattices and the conversion of
bigram language models (such as those generated by \htool{HLStats})
into lattice format.





The specific input file types supported by \htool{HBuild} are:


\begin{enumerate}


\item \HTK\ multi-level lattice files.


\item Back-off bigram files in ARPA/MIT-LL format.


\item Matrix bigram files produced by \htool{HLStats}.


\item Word lists (to generate a word-loop grammar).


\item Word-pair grammars in ARPA Resource Management format.


\end{enumerate}





The formats of both types of bigram supported by \htool{HBuild}
are described in Chapter~\ref{c:netdict}. The format for multi-level
\HTK\ lattice files is described in Chapter~\ref{c:htkslf}.





\mysubsect{Use}{HBuild-Use}





\htool{HBuild} is invoked by the command line
\begin{verbatim}
   HBuild [options] wordList outLatFile
\end{verbatim}


The {\tt wordList} should contain a list of all the words used
in the input language model. The options specify the type of input
language model as well as the source filename. If none of the flags
specifying the input language model type is given, a simple word-loop
is generated using the {\tt wordList} given. After processing the
input language model, the resulting lattice
is saved to the file {\tt outLatFile}.
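
As an illustration of the simplest case, the following sketch builds a
word-loop lattice directly from a word list. The file names {\tt wlist}
and {\tt wdnet} are placeholders chosen for this example and are not
names required by \htool{HBuild}.
\begin{verbatim}
   HBuild wlist wdnet
\end{verbatim}
Since no language model option is given, the lattice written to
{\tt wdnet} should be a plain word-loop over the words listed in
{\tt wlist}.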





The operation of \htool{HBuild} is controlled by the following
command line options:


\begin{optlist}


  \ttitem{-b} Output the lattice in binary format. This speeds up
              subsequent loading (the default is ASCII text lattices).





  \ttitem{-m fn} The matrix format bigram in {\tt fn} forms the input
              language model.

  \ttitem{-n fn} The ARPA/MIT-LL format back-off bigram in {\tt fn}
              forms the input language model.





  \ttitem{-s st en} Set the bigram entry and exit words to {\tt st}
        and {\tt en} (default {\tt !ENTER} and {\tt !EXIT}).
        Note that no words will follow the exit word or precede
        the entry word. Both the entry and exit words must be included
        in the {\tt wordList}. This option is only effective in conjunction
        with the \texttt{-n} option.





  \ttitem{-t st en} This option is only effective for word-loop and
        word-pair grammars.
        An output lattice is produced with an initial word-symbol
        {\tt st} (before the loop) and a final word-symbol {\tt en}
        (after the loop). This allows initial and final silences
        to be specified. By default, the initial and final nodes
        are labelled with {\tt !NULL}. Note that {\tt st} and {\tt en}
        should not be included in the {\tt wordList} unless they occur
        elsewhere in the network.





  \ttitem{-u s} The unknown word is {\tt s} (default {\tt !NULL}). This
         option only has an effect when bigram input language models
         are specified. It can be used in conjunction with the {\tt -z}
         flag to delete the symbol for unknown words from the output
         lattice.





  \ttitem{-w fn} The word-pair grammar in {\tt fn}
              forms the input language model. The file must be in
              the format used for the ARPA Resource Management grammar.

  \ttitem{-x fn} The extended \HTK\ lattice in {\tt fn}
              forms the input language model. This option is
              used to expand a multi-level lattice into a single-level
              lattice that can be processed by other \HTK\ tools.

  \ttitem{-z} Delete (zap) any references to the unknown word (see the {\tt -u}
              option) in the output lattice.





\end{optlist}


\stdopts{HBuild}
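
The following invocations sketch how the options described above might be
combined. All file names ({\tt bg.arpa}, {\tt bg.mat}, {\tt multi.lat},
{\tt wpair.gram}, {\tt wlist} and {\tt out.lat}) and the word symbols
{\tt sil} and {\tt <unk>} are hypothetical placeholders chosen for
illustration only. The lines starting with {\tt \#} are shell comments, and
{\tt <unk>} is quoted on the assumption that a Unix shell is used, where
unquoted angle brackets would be treated as redirections.
\begin{verbatim}
   # back-off bigram in ARPA/MIT-LL format, binary output lattice
   HBuild -b -n bg.arpa wlist out.lat

   # same bigram, naming <unk> as the unknown word and
   # deleting references to it from the output lattice
   HBuild -n bg.arpa -u '<unk>' -z wlist out.lat

   # matrix format bigram
   HBuild -m bg.mat wlist out.lat

   # expand a multi-level lattice into a single-level lattice
   HBuild -x multi.lat wlist out.lat

   # word-pair grammar with sil as initial and final word-symbol
   HBuild -w wpair.gram -t sil sil wlist out.lat
\end{verbatim}
Each example selects one input language model type. The second and
fifth add an option that, as noted above, only takes effect with that
type of input.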





\mysubsect{Tracing}{HBuild-Tracing}





\htool{HBuild} supports the following trace options, where each
trace flag is given using an octal base:
\begin{optlist}
   \ttitem{0001} basic progress reporting.
\end{optlist}
Trace flags are set using the \texttt{-T} option or the \texttt{TRACE}
configuration variable.
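
For example, basic progress reporting could be enabled from the command
line as sketched below. The file names {\tt wlist} and {\tt out.lat} are
placeholders chosen for this example.
\begin{verbatim}
   HBuild -T 1 wlist out.lat
\end{verbatim}
Here the value {\tt 1} corresponds to the octal flag {\tt 0001} listed above.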


\index{hbuild@\htool{HBuild}|)}








%%% Local Variables: 


%%% mode: latex


%%% TeX-master: "../htkbook"


%%% End: 

