⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 hmmirest.tex

📁 隐马尔可夫模型源代码
💻 TEX
字号:
%/* ----------------------------------------------------------- */


%/*                                                             */


%/*                          ___                                */


%/*                       |_| | |_/   SPEECH                    */


%/*                       | | | | \   RECOGNITION               */


%/*                       =========   SOFTWARE                  */ 


%/*                                                             */


%/*                                                             */


%/* ----------------------------------------------------------- */


%/* developed at:                                               */


%/*                                                             */


%/*      Speech Vision and Robotics group                       */


%/*      Cambridge University Engineering Department            */


%/*      http://svr-www.eng.cam.ac.uk/                          */


%/*                                                             */


%/* ----------------------------------------------------------- */


%/*         Copyright:                                          */


%/*                                                             */


%/*              2002  Cambridge University                     */


%/*                    Engineering Department                   */


%/*                                                             */


%/*   Use of this software is governed by a License Agreement   */


%/*    ** See the file License for the Conditions of Use  **    */


%/*    **     This banner notice must not be removed      **    */


%/*                                                             */


%





% HTKBook - this chapter written by Dan Povey 01/11/02


%





\newpage


\mysect{HMMIRest}{HMMIRest}





\mysubsect{Function}{HMMIRest-Function}





\index{hmmirest@\htool{HMMIRest}|(}





This program is used to perform a single re-estimation of a HMM set using


a discriminative training criterion.  Maximum Mutual Information (MMI),


Minimum Phone Error (MPE) and  Minimum Word Error (MWE) are supported.


Gaussian parameters are re-estimated using the Baum-Welch update.  





Discriminative training needs to be initialised with trained models; these


will generally be trained using MLE.  Also essential are lattices.


Two sets of lattices are needed: a lattice for the correct transcription of


each training file, and a lattice derived from the recognition of each


training file.  These are called the ``numerator'' and ``denominator''


lattices respectively, a reference to their respective positions in the


MMI objective function, which can be expressed as follows:





\newcommand{\num}{{\mathrm{num}}}


\newcommand{\den}{{\mathrm{den}}}


\newcommand{\MMI}{{\mathrm{MMI}}}


\newcommand{\calO}{ {\cal O} }


\newcommand{\calM}{ {\cal M} }





\begin{equation}


    {\cal F}_\MMI(\lambda) = \sum_{r=1}^R \log \frac {  p^\kappa(\calO_r | \calM^\num_r) } { p^\kappa(\calO | \calM^\den_r) },


  \label{eq:mmimodels}


\end{equation}


where $\lambda$ is the HMM parameters,  $\kappa$ is a scale on the likelihoods, $\calO_r$ is the


speech data for the $r$'th training file and $\calM^\num_r$ and $\calM^\den_r$ are the numerator


and denominator models.  





The numerator and denominator lattices need to contain phone alignment information


(the start and end times of each phone) and language model likelihoods.


Phone alignment information can be obtained by using \htool{HDecode} with


the letter \texttt{d} included in the lattice format options (specified to \htool{HDecode} by the


\texttt{-q} command line option).  It is important that the language model


applied to the lattice not be too informative (e.g., use a unigram LM) and that the


pruning beam used to create it be large enough (probably greater than 100).


Typically an initial set of lattices would be created by recognition using


a bigram language model and a pruning beam in the region of 125-200


(e.g. using \htool{HDecode}, and this would then have a unigram language


model applied to it and a pruning beam applied in the region 100-150, using


the tool \htool{HLRescore}.   The phone alignment information


can then be added using \htool{HDecode}.





Having created these lattices using an initial set of models, the


tool \htool{HMMIRest} can be used to re-estimate the HMMs for 


typically 4-8 iterations, using the same set of lattices.   





\htool{HMMIRest} supports multiple mixture Gaussians, tied-mixture


HMMs, multiple data streams, parameter tying within and between models


(but not if means and variances are tied independently), and diagonal


covariance matrices only.








%\htool{HMMIRest} supports variance flooring in the same way as \htool{HERest}.





%It also supports \textit{Single pass retraining}, useful when the parameterisation of


%the front-end (e.g. from MFCC to PLP coefficients) is to be modified.








Like \htool{HERest}, \htool{HMMIRest} includes features to allow parallel


operation where a network of processors is available. When the training set


is large, it can be split into separate chunks that are processed in


parallel on multiple machines/processors, speeding up the training process.


\htool{HMMIRest} can operate in two modes:


\begin{enumerate}


\item Single-pass operation: the data files on the command line are the


  speech files to be used for re-estimation of the HMM set; the re-estimated


  HMM is written by the program to the directory specified by the {\texttt{-M}}


   option.


\item Two-pass operation with data split into \texttt{P} blocks:


   \begin{enumerate}  


     \item First pass: multiple jobs are started with the options \texttt{-p 1}, 


       \texttt{-p 2} $\ldots$ \texttt{-p P} are given; the different subsets of training


     files should be given on each command line or in a file specified by the \texttt{-S}


     option.  Accumulator files \texttt{<dir>/HDR<p>.acc.1},  \texttt{<dir>/HDR<p>.acc.2} 


    and (for MPE) \texttt{<dir>/HDR<p>.acc.3}  are written, where \texttt{<dir>} is


    the directory specified by the $\texttt{-M}$  and \texttt{<p>} is the number specified


     by the \texttt{-p} option.  


     \item Second pass: \htool{HMMIRest} is called with \texttt{-p 0}, and the


      command-line arguments are the accumulator files written by the first pass.


     The re-estimated HMM set is written to the directory specifified by the $\texttt{-M}$ option. 


   \end{enumerate}


\end{enumerate}











\mysubsect{Use}{HMMIRest-Use}


 \label{sec:hmmirestuse}





\htool{HMMIRest} is invoked via the command line


\begin{verbatim}


   HMMIRest [options] hmmList trainFile ...


\end{verbatim}


or alternatively if the training is in parallel mode, the second pass is invoked as


\begin{verbatim}


   HMMIRest [options] hmmList accFile ...


\end{verbatim}





As always, the list of arguments can be stored in a script file (\texttt{-S} option) if required.  





\subsubsection{Typical use}





To give an example of typical use, for single-pass operation the usage might be:


\begin{verbatim}


HMMIRest -A -V -D -H <startdir>/MMF -M <enddir> -C <config> -q <num-lat-dir> 


        -r <den-lat-dir> -S <traindata-list-file> <hmmlist>


\end{verbatim}


For two-pass operation, the first pass would be (for the \texttt{n}'th training file),


\begin{verbatim}


HMMIRest -p <p> -A -V -D -H <startdir>/MMF -M <enddir> -C <config> -q <num-lat-dir>


        -r <den-lat-dir> -S <p'th-traindata-list-file> <hmmlist>


\end{verbatim}


and the second pass would be


\begin{verbatim}


HMMIRest -p 0 -A -V -D -H <startdir>/MMF -M <enddir> -C <config> -q <num-lat-dir> 


   -r <den-lat-dir> <hmmlist> <enddir>/HDR1.acc.1 <enddir>/HDR1.acc.2 


   <enddir>/HDR2.acc.1  <enddir>/HDR2.acc.2 ... <enddir>/HDR<P>.acc.2 


\end{verbatim}


This is for MMI training; for MPE there will be three files \texttt{HDRn.acc.1,HDRn.acc.2,HDRn.acc.3}


per block of training data.  \texttt{ISMOOTHTAU} can also be tuned.





\subsubsection{Typical config variables}





The most important config variables are given as follows for a number of example setups.


MMI training is the default behaviour.


\begin{enumerate}  


   \item I-smoothed MMI training: set \texttt{ISMOOTHTAU=100}


   \item MPE training (``approximate-accuracy'', which is default kind of MPE): set \texttt{MPE=TRUE}, \texttt{ISMOOTHTAU=50}


   \item ``Exact'' MPE training: set \texttt{MPE=TRUE}, \texttt{EXACTCORRECTNESS=TRUE}, \texttt{INSCORRECTNESS=-0.9}


        (this  may have to be tuned in the range -0.8 to -1; it affects the testing 


        insertion/deletion ratio)


   \item ``Approximate-error'' MPE training (recommended): set \texttt{MPE=TRUE}, \texttt{CALCASERROR=TRUE}, \texttt{INSCORRECTNESS=-0.9}


        (this  may have to be tuned in the range -0.8 to -1; it affects the testing 


        insertion/deletion ratio)


   \item MWE training: set \texttt{MWE=TRUE}, \texttt{ISMOOTHTAU=25}


\end{enumerate}





In all cases set \texttt{LATPROBSCALE=<f>}, where \texttt{<f>} will


normally be in the range 1/10 to 1/20, typically the inverse of the


language model scale.











\subsubsection{Storage of lattices}





If there are great many training files, the directories specified for the numerator and denominator lattice


can contain subdirectories containing the actual lattices.  The name of the subdirectory required can


be extracted from the filenames by adding config variables for instance as follows:


\begin{verbatim}


LATMASKNUM = */%%%?????.???


LATMASKDEN = */%%%?????.???


\end{verbatim}


which would convert a fileneme \texttt{foo/bar12345.lat} to \texttt{<dir>/bar/bar12345.lat},


where \texttt{<dir>} would be the directory specified by the \texttt{-q} or \texttt{-r} option.


The \texttt{\%}'s are converted to the directory name and other characters are discarded.


If lattices are gzipped in order to save space, they can be read in by adding the 


config


\begin{verbatim}


HNETFILTER='gunzip -c < $.gz'


\end{verbatim}








\mysubsect{Command-line options}{HMMIRest-options}











The full list of the options accepted by \htool{HMMIRest} is as follows.


Options not ever needed for normal operation are marked ``obscure''.





\begin{optlist}





  \ttitem{-d dir} 


      (obscure)


      Normally \htool{HMMIRest} looks for HMM definitions


       (not already loaded via MMF files) 


      in the current directory.  This option tells \htool{HMMIRest} to look in


      the directory {\tt dir} to find them.





  \ttitem{-g~~~~} (obscure) Maximum Likelihood (ML) mode-- Maximum Likelihood lattice-based


   estimation using only the numerator (correct-sentence) lattices.





  \ttitem{-l~~~~} (obscure) Maximum number of sentences to use (useful only for troubleshooting)





  \ttitem{-o ext} (obscure) This causes the file name extensions of the


      original models (if any) to be replaced by {\tt ext}.





  \ttitem{-p N~}  Set parallel mode.  1, 2 $\ldots$ N for a forward-backward


     pass on data split into N blocks.  0 for re-estimating using accumulator


    files that this will write.  Default (-1) is single-pass operation.





  \ttitem{-q dir}   Directory to look for numerator lattice files.  These files


      will have a filename the same as the speech file, but with extension


      \texttt{lat} (or as given by \texttt{-Q} option).





  \ttitem{-qp s}   Path pattern for the extended path to look for numerator lattice files. 


      The matched string will be spliced to the end of the directory string specified by


      option `-q' for a deeper path.





  \ttitem{-r dir}   Directory to look for denominator lattice files.





  \ttitem{-rp s}   Path pattern for the extended path to look for denominator lattice files.


     The matched string will be spliced to the end of the directory string specified by


      option `-r' for a deeper path.





  \ttitem{-s file}  File to write HMM statistics.





  \ttitem{-twodatafiles} (obscure) Expect two of each data file for single pass retraining.  


    This works as for HERest; command line contains alternating alignment and update


    files.





  \ttitem{-u flags} By default, \htool{HMMIRest} updates all of the HMM parameters,


      that is, means, variances, mixture weights and 


      transition probabilies. This 


      option causes just the parameters indicated by the {\tt flags}


      argument to be updated, this argument is a string containing one


      or more of the letters {\tt m} (mean), {\tt v} (variance) ,


      {\tt t} (transition) and {\tt w} (mixture weight).  The 


      presence of a letter enables


      the updating of the corresponding parameter set.





  \ttitem{-umle flags} (obscure) A format as in \texttt{-u}; directs that the specified parameters


     be updated using the ML criterion (for MMI operation only, not MPE/MWE).





  \ttitem{-w floor} (obscure) Set the mixture weight floor as a multiple


    of \texttt{MINMIX=1.0e-5}.  Default is 2.





  \ttitem{-x ext}  By default, \htool{HMMIRest} expects a HMM definition for 


      the label X to be stored in a file called {\tt X}.  This


      option causes \htool{HMMIRest} to look for the HMM definition in the


      file {\tt X.ext}.


\stdoptB


\stdoptF


\stdoptH


\ttitem{-Hprior MMF}  Load HMM macro file {\tt MMF}, to be used as prior model in adaptation.  Use with config variables {\tt PRIORTAU} etc.  


\stdoptI


\stdoptM





 \ttitem{-Q ext} Set the lattice file extension to {\tt ext}


\end{optlist}





\stdopts{HMMIRest}





\mysubsect{Tracing}{HMMIRest-Tracing}








The command-line trace option can only be set to 0 (trace off)


or 1 (trace on), which is the default.  Tracing behaviour can


be altered by the {\tt TRACE} configuration variables in the modules \htool{HArc}


and \htool{HFBLat}.





\index{hmmirest@\htool{HMMIRest}|)}








%%% Local Variables: 


%%% mode: latex


%%% TeX-master: "../htkbook"


%%% End: 


⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -