%/* ----------------------------------------------------------- */
%/*                                                             */
%/*                          ___                                */
%/*                       |_| | |_/   SPEECH                    */
%/*                       | | | | \   RECOGNITION               */
%/*                       =========   SOFTWARE                  */
%/*                                                             */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/*         Copyright: Microsoft Corporation                    */
%/*          1995-2000 Redmond, Washington USA                  */
%/*                    http://www.microsoft.com                */
%/*                                                             */
%/*   Use of this software is governed by a License Agreement   */
%/*    ** See the file License for the Conditions of Use  **    */
%/*    **     This banner notice must not be removed      **    */
%/*                                                             */
%/* ----------------------------------------------------------- */
%
% HTKBook - Steve Young and Dave Ollason   24/11/97
%





\newpage


\mysect{HERest}{HERest}





\mysubsect{Function}{HERest-Function}





\index{herest@\htool{HERest}|(} This program is used to perform a
single re-estimation of the parameters of a set of HMMs, or linear
transforms, using an {\em embedded training} version of the Baum-Welch
algorithm.  Training data consists of one or more utterances, each of
which has a transcription in the form of a standard label file
(segment boundaries are ignored).  For each training utterance, a
composite model is effectively synthesised by concatenating the
phone models given by the transcription.  Each phone model has the
same set of accumulators allocated to it as are used in \htool{HRest}, but in
\htool{HERest} they are updated simultaneously by performing a
standard Baum-Welch pass over each training utterance using the
composite model.


  


\htool{HERest} is intended to operate on HMMs with initial parameter values
estimated by \htool{HInit}/\htool{HRest}.
\htool{HERest} supports multiple mixture Gaussians, discrete and tied-mixture
HMMs, multiple data streams, parameter tying within and between models, and
full or diagonal covariance matrices. \htool{HERest} also supports tee-models
(see section~\ref{s:teemods}), for handling optional silence and non-speech
sounds. These may be placed between the units (typically words or phones)
listed in the transcriptions but they cannot be used at the start or end of a
transcription. Furthermore, chains of tee-models are not permitted.





\htool{HERest} includes features to allow parallel operation where a network
of processors is available. When the training set is large, it can be split
into separate chunks that are processed in parallel on multiple machines or
processors, thereby speeding up the training process.





Like all re-estimation tools, \htool{HERest} allows a floor to be set on
each individual variance by defining a variance floor macro for each data
stream (see chapter~\ref{c:Training}).  The configuration variable {\tt
VARFLOORPERCENTILE} allows the same thing to be done in a different way
which appears to improve recognition results.  By setting this to e.g.\ 20,
the variances from each dimension are floored to the 20th percentile of the
distribution of variances for that dimension.
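
As an illustrative sketch of percentile flooring (the notation is
introduced here for the example and is not taken from the \htool{HERest}
source), let $\sigma^2_{m,d}$ be the variance of Gaussian component $m$
in dimension $d$, and let $v_d(p)$ be the $p$th percentile of the values
$\sigma^2_{m,d}$ taken over all components $m$.  Assuming the floor is
applied independently in each dimension, the floored variance would be
\[
   \hat{\sigma}^2_{m,d} = \max\left( \sigma^2_{m,d},\; v_d(p) \right)
\]
so that with $p=20$ roughly the smallest fifth of the variances in each
dimension are raised to $v_d(20)$ and the remainder are left unchanged.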





%as suggested in:


%\bibitem[Lee, Giachin, Rabiner, Pieraccini \& Rosenberg, 1992]{lee92csl}


%Lee C-H., Giachin E., Rabiner L.R., Pieraccini R. \& Rosenberg A.E.


%(1992). ``Improved Acoustic Modeling for Large Vocabulary Continuous


%Speech Recognition,'' {\it Computer Speech and Language} {\bf 6}, pp.


%103-127.








\htool{HERest} supports two specific methods for initialisation of
model parameters, \textit{single pass retraining} and \textit{2-model
  re-estimation}.





\textit{Single pass retraining} is useful when the parameterisation of
the front-end (e.g.\ from MFCC to PLP coefficients) is to be modified.
Given a set of well-trained models, a set of new models using a
different parameterisation of the training data can be generated in a
single pass.  This is done by computing the forward and backward
probabilities using the original well-trained models and the original
training data, but then switching to a new set of training data to
compute the new parameter estimates.
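
The idea can be sketched with the standard Baum-Welch update for a state
with a single Gaussian output distribution.  The symbols below are
assumptions made for this illustration: $\gamma_j(t)$ is the occupation
probability of state $j$ at time $t$, computed from the original models
and the original observations, and $\tilde{\mathbf{o}}_t$ is the
observation at the same frame $t$ under the new parameterisation (which
presumes that the two versions of the data are aligned frame by frame).
The new mean and covariance would then be
\[
   \hat{\mathbf{\mu}}_j =
      \frac{\sum_t \gamma_j(t)\, \tilde{\mathbf{o}}_t}{\sum_t \gamma_j(t)},
   \qquad
   \hat{\mathbf{\Sigma}}_j =
      \frac{\sum_t \gamma_j(t)\,
            (\tilde{\mathbf{o}}_t - \hat{\mathbf{\mu}}_j)
            (\tilde{\mathbf{o}}_t - \hat{\mathbf{\mu}}_j)^{T}}
           {\sum_t \gamma_j(t)}
\]
Only the statistics being accumulated change; the alignment information
carried by $\gamma_j(t)$ comes entirely from the original system.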





In \textit{2-model re-estimation} one model set can be used to obtain
the forward-backward probabilities, which are then used to update the
parameters of another model set. In contrast to \textit{single pass
  retraining}, the two model sets are not required to be tied in the
same fashion.  This is particularly useful for training single
mixture models prior to decision-tree based state clustering. The use
of 2-model re-estimation in \htool{HERest} is triggered by setting the
config variables {\tt ALIGNMODELMMF} or {\tt ALIGNMODELDIR} and {\tt
  ALIGNMODELEXT} together with {\tt ALIGNHMMLIST} (see section~\ref{s:twomodel}).
As the model list can differ for the alignment model set, a separate set of
input transforms may be specified using {\tt ALIGNXFORMDIR} and
{\tt ALIGNXFORMEXT}.
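
As a hedged illustration, a configuration file enabling 2-model
re-estimation might contain entries such as the following.  The variable
names are those given above; the file paths are hypothetical placeholders
chosen for this example.
\begin{verbatim}
    # model set used only to compute the forward-backward probabilities
    ALIGNMODELMMF = align/MMF
    # model list belonging to the alignment model set
    ALIGNHMMLIST  = align/hmmlist
\end{verbatim}
The model set to be updated is still supplied on the command line in the
usual way; the entries above only name the second, alignment model set.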





When updating model parameters, \htool{HERest} operates in two distinct stages.





\begin{enumerate}
\item
    In the first stage, one of the following two options applies:
  \begin{enumerate}
  \item
    Each input data file contains training data which is
    processed and the accumulators for state occupation,
    state transition, means and variances are updated.
  \item
    Each data file contains a dump of the accumulators
    produced by previous runs of the program.  These
    are read in and added together to form a single set
    of accumulators.
  \end{enumerate}
\item
    In the second stage, one of the following two options applies:
  \begin{enumerate}
  \item
    The accumulators are used to calculate new
    estimates for the HMM parameters.
  \item
    The accumulators are dumped into a file.
  \end{enumerate}
\end{enumerate}





Thus, on a single processor the default combination 1(a) and 2(a) would
be used.  However, if $N$ processors are available then the
training data would be split into $N$ equal groups and \htool{HERest} would
be set to process one data set on each processor using the combination
1(a) and 2(b).
When all processors had finished, the
program would then be run again using the combination 1(b) and 2(a)
to load in the partial accumulators created by the $N$ processors
and do the final parameter re-estimation.  The choice of which combination
of operations \htool{HERest} will perform is governed by the {\tt -p} option
switch as described below.
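
Splitting the work in this way is possible because the accumulators are
plain sums over training frames.  As a sketch, using notation assumed
for this example only, let processor $n$ accumulate
$\mathrm{occ}^{(n)}_j = \sum_t \gamma_j(t)$ and
$\mathbf{a}^{(n)}_j = \sum_t \gamma_j(t)\, \mathbf{o}_t$ over its own
group of utterances.  The final mean estimate for state $j$ is then
\[
   \hat{\mathbf{\mu}}_j =
      \frac{\sum_{n=1}^{N} \mathbf{a}^{(n)}_j}
           {\sum_{n=1}^{N} \mathrm{occ}^{(n)}_j}
\]
which equals the estimate a single processor would obtain from the whole
training set.  A run with $N=2$ might look like the following, assuming
that a positive {\tt -p} value $n$ dumps the accumulators to a file named
{\tt HER}$n${\tt .acc} in the output directory and that {\tt -p 0} reads
accumulator files in place of training data.  All file and directory
names are hypothetical.
\begin{verbatim}
    HERest -C config -S train1.scp -I labels.mlf -H hmm0/MMF -M hmm1 -p 1 hmmlist
    HERest -C config -S train2.scp -I labels.mlf -H hmm0/MMF -M hmm1 -p 2 hmmlist
    HERest -C config -I labels.mlf -H hmm0/MMF -M hmm1 -p 0 hmmlist \
           hmm1/HER1.acc hmm1/HER2.acc
\end{verbatim}
The first two commands correspond to the combination 1(a) and 2(b), and
the third to the combination 1(b) and 2(a).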





As a further performance optimisation, \htool{HERest} will also prune the
$\alpha$ and $\beta$ matrices.  By this means, a factor of 3 to 5
speed improvement and a similar reduction in memory requirements can be
achieved with negligible effects on training performance (see the
{\tt -t} option below).
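
As a rough sketch of the kind of beam pruning involved (an assumption
made for illustration; the precise rule is the one given with the
{\tt -t} option), suppose a log-domain threshold $f$ is applied during
the backward pass.  The value $\beta_j(t)$ would then be retained only if
\[
   \log \beta_j(t) \;\geq\; \max_i \log \beta_i(t) \;-\; f
\]
and on the subsequent forward pass $\alpha_j(t)$ would be computed only
for those pairs $(j,t)$ whose $\beta_j(t)$ was retained.  A smaller $f$
gives a tighter beam and faster training, at the risk of pruning away the
correct alignment for some utterances.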





\htool{HERest} is able to make use of, and estimate, linear
transformations for model adaptation. There are three types of linear
transform that are used in \htool{HERest}.


\begin{itemize}
\item {\it Input transform}: the input transform is used to determine
the forward-backward probabilities, and hence the component posteriors, for
estimating model and transform parameters.
\item {\it Output transform}: the output transform is generated when the
{\tt -u} option is set to {\tt a}. The transform will be stored in the
current directory, or in the directory (and, optionally, with the
transform extension) specified by the {\tt -K} option.
\item {\it Parent transform}: the parent transform determines the
model, or features, on which the model set or transform is to be
generated. For transform estimation this allows {\em cascades} of transforms
to be used to adapt the model parameters. For model estimation this
supports {\em speaker adaptive training}. Note that the current implementation
only supports adaptive training with CMLLR. Any parent transform can be
used when generating transforms.
\end{itemize}


When input or parent transforms are specified, the transforms may
physically be stored in multiple directories. The transform to be used
is determined by the following search order:


\begin{enumerate}
\item Any loaded macro that matches the transform name (and its extension).
\item If it is a parent transform, the directory specified with the
{\tt -E} option.
\item The list of directories specified with the {\tt -J} option.
The directories are searched in the order that they are specified
on the command line.
\end{enumerate}


As the search order above looks for loaded macros first, it is
recommended that unique extensions are specified for each set of
transforms generated. Transforms may be stored in
a single TMF, and such TMFs may be loaded using the {\tt -H} option.
When macros are specified for the regression class trees and
the base classes, the following search order is used:


\begin{enumerate}


\item Any loaded macro that matches the macro name.

