%/* ----------------------------------------------------------- */
%/*                                                             */
%/*                          ___                                */
%/*                       |_| | |_/   SPEECH                    */
%/*                       | | | | \   RECOGNITION               */
%/*                       =========   SOFTWARE                  */
%/*                                                             */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/*         Copyright: Microsoft Corporation                    */
%/*          1995-2000 Redmond, Washington USA                  */
%/*                    http://www.microsoft.com                 */
%/*                                                             */
%/*   Use of this software is governed by a License Agreement   */
%/*    ** See the file License for the Conditions of Use  **    */
%/*    **     This banner notice must not be removed      **    */
%/*                                                             */
%/* ----------------------------------------------------------- */
%
% HTKBook - Steve Young and Dave Ollason 24/11/97
%
\newpage
\mysect{HERest}{HERest}

\mysubsect{Function}{HERest-Function}

\index{herest@\htool{HERest}|(}
This program is used to perform a single re-estimation of the
parameters of a set of HMMs, or linear transforms, using an
{\em embedded training} version of the Baum-Welch algorithm.
Training data consists of one or more utterances, each of which has a
transcription in the form of a standard label file (segment boundaries
are ignored). For each training utterance, a composite model is
effectively synthesised by concatenating the phoneme models given by
the transcription. Each phone model has the same set of accumulators
allocated to it as are used in \htool{HRest}, but in \htool{HERest}
they are updated simultaneously by performing a standard Baum-Welch
pass over each training utterance using the composite model.

\htool{HERest} is intended to operate on HMMs with initial parameter
values estimated by \htool{HInit}/\htool{HRest}.
\htool{HERest} supports multiple mixture Gaussians, discrete and
tied-mixture HMMs, multiple data streams, parameter tying within and
between models, and full or diagonal covariance matrices.
\htool{HERest} also supports tee-models
(see section~\ref{s:teemods}), for handling optional silence and
non-speech sounds.
These may be placed between the units (typically words or phones)
listed in the transcriptions, but they cannot be used at the start or
end of a transcription. Furthermore, chains of tee-models are not
permitted.

\htool{HERest} includes features to allow parallel operation where a
network of processors is available. When the training set is large, it
can be split into separate chunks that are processed in parallel on
multiple machines/processors, consequently speeding up the training
process.

Like all re-estimation tools, \htool{HERest} allows a floor to be set
on each individual variance by defining a variance floor macro for each
data stream (see chapter~\ref{c:Training}). The configuration variable
{\tt VARFLOORPERCENTILE} allows the same thing to be done in a
different way which appears to improve recognition results. By setting
this to e.g. 20, the variances from each dimension are floored to the
20th percentile of the distribution of variances for that dimension.
%as suggested in:
%\bibitem[Lee, Giachin, Rabiner, Pieraccini \& Rosenberg, 1992]{lee92csl}
%Lee C-H., Giachin E., Rabiner L.R., Pieraccini R. \& Rosenberg A.E.
%(1992). ``Improved Acoustic Modeling for Large Vocabulary Continuous
%Speech Recognition,'' {\it Computer Speech and Language} {\bf 6}, pp.
%103-127.

\htool{HERest} supports two specific methods for initialisation of
model parameters, \textit{single pass retraining} and
\textit{2-model re-estimation}.

\textit{Single pass retraining} is useful when the parameterisation of
the front-end (e.g. from MFCC to PLP coefficients) is to be modified.
Given a set of well-trained models, a set of new models using a
different parameterisation of the training data can be generated in a
single pass.
This is done by computing the forward and backward
probabilities using the original well-trained models and the original
training data, but then switching to a new set of training data to
compute the new parameter estimates.

In \textit{2-model re-estimation} one model set can be used to obtain
the forward-backward probabilities, which are then used to update the
parameters of another model set. In contrast to
\textit{single pass retraining}, the two model sets are not required to
be tied in the same fashion. This is particularly useful for training
single mixture models prior to decision-tree based state clustering.
The use of 2-model re-estimation in \htool{HERest} is triggered by
setting the config variables {\tt ALIGNMODELMMF} or
{\tt ALIGNMODELDIR} and {\tt ALIGNMODELEXT}, together with
{\tt ALIGNHMMLIST} (see section~\ref{s:twomodel}).
As the model list can differ for the alignment model set, a separate
set of input transforms may be specified using {\tt ALIGNXFORMDIR} and
{\tt ALIGNXFORMEXT}.

When updating model parameters, \htool{HERest} operates in two
distinct stages.
\begin{enumerate}
\item In the first stage, one of the following two options applies
   \begin{enumerate}
   \item Each input data file contains training data which is
         processed and the accumulators for state occupation,
         state transition, means and variances are updated.
   \item Each data file contains a dump of the accumulators produced
         by previous runs of the program. These are read in and added
         together to form a single set of accumulators.
   \end{enumerate}
\item In the second stage, one of the following options applies
   \begin{enumerate}
   \item The accumulators are used to calculate new estimates for the
         HMM parameters.
   \item The accumulators are dumped into a file.
   \end{enumerate}
\end{enumerate}
Thus, on a single processor the default combination 1(a) and 2(a) would
be used.
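As a sketch of option 1(b), in notation that is assumed here for
illustration and is not part of \htool{HERest} itself, let
$\mathcal{A}^{(n)}$ denote the set of accumulators dumped by the
$n$-th of $N$ previous runs. Reading the dumps in and adding them
together then amounts to
\[
  \mathcal{A} \;=\; \sum_{n=1}^{N} \mathcal{A}^{(n)} ,
\]
where the sum is taken separately for each accumulator (state
occupation, state transition, means and variances). Since each
accumulator is itself a sum of per-utterance statistics, $\mathcal{A}$
should match what a single run over all of the training data would
have accumulated, provided that every run started from the same model
set.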
However, if N processors are available then the training data
would be split into N equal groups and \htool{HERest} would
be set to process one data set on each processor using the combination
1(a) and 2(b). When all processors had finished, the program would
then be run again using the combination 1(b) and 2(a)
to load in the partial accumulators created by the N processors
and do the final parameter re-estimation. The choice of which
combination of operations \htool{HERest} will perform is governed by
the {\tt -p} option switch as described below.

As a further performance optimisation, \htool{HERest} will also prune
the $\alpha$ and $\beta$ matrices. By this means, a factor of 3 to 5
speed improvement and a similar reduction in memory requirements can be
achieved with negligible effects on training performance (see the
{\tt -t} option below).

\htool{HERest} is able to make use of, and estimate, linear
transformations for model adaptation. There are three types of linear
transform that \htool{HERest} makes use of.
\begin{itemize}
\item {\it Input transform}: the input transform is used to determine
the forward-backward probabilities, hence the component posteriors,
for estimating models and transforms.
\item {\it Output transform}: the output transform is generated when
the {\tt -u} option is set to {\tt a}. The transform will be stored in
the current directory, or in the directory (and, optionally, with the
transform extension) specified by the {\tt -K} option.
\item {\it Parent transform}: the parent transform determines the
model, or features, on which the model set or transform is to be
generated. For transform estimation this allows {\em cascades} of
transforms to be used to adapt the model parameters. For model
estimation this supports {\em speaker adaptive training}. Note that
the current implementation only supports adaptive training with CMLLR.
Any parent transform can be used when generating transforms.
\end{itemize}
When input or parent transforms are specified, the transforms may
physically be stored in multiple directories. The transform to be used
is determined by the following search order:
\begin{enumerate}
\item Any loaded macro whose name matches the transform name (and its
extension).
\item If it is a parent transform, the directory specified with the
{\tt -E} option.
\item The list of directories specified with the {\tt -J} option.
The directories are searched in the order in which they are specified
on the command line.
\end{enumerate}
As the search order above looks for loaded macros first, it is
recommended that unique extensions are specified for each set of