% discmods.tex
vector. If the configuration variable \texttt{SAVEASVQ} is set true, then the output routines in \htool{HParm} will discard the original vectors and just save the VQ indices in a \texttt{DISCRETE} file. Alternatively, \HTK\ will regard any speech vector with \texttt{\_V} set as being compatible with discrete HMMs. Thus, it is not necessary to explicitly create a database of discrete training files if a set of continuous speech vector parameter files already exists. Fig.~\ref{f:vqtohmm} illustrates this process.}
\index{saveasvq@\texttt{SAVEASVQ}}
\index{targetkind@\texttt{TARGETKIND}}

Once the training data has been configured for discrete HMMs, the rest of the training process is similar to that previously described. The normal sequence is to build a set of monophone models and then clone them to make triphones. As in continuous density systems, state tying can be used to improve the robustness of the parameter estimates. However, in the case of discrete HMMs, alternative methods based on interpolation are possible. These are discussed in section~\ref{s:psmooth}.

\mysect{Tied Mixture Systems}{tiedmix}
\index{tied-mixtures}

Discrete systems have the advantage of low run-time computation. However, vector quantisation reduces accuracy and this can lead to poor performance. As an intermediate between discrete and continuous, a fully tied-mixture system can be used. Tied-mixtures are conceptually just another example of the general parameter tying mechanism built into \HTK. However, to use them effectively in speech recognition systems, a number of storage and computational optimisations must be made. Hence, they are given special treatment in \HTK.

When specific mixtures are tied, as in
\begin{verbatim}
   TI "mix" {*.state[2].mix[1]}
\end{verbatim}
then a Gaussian mixture component is shared across all of the owners of the tie. In this example, all models will share the same Gaussian for the first mixture component of state 2.
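The effect of such a tie can be pictured with a toy sketch in which every owner of the tie simply holds a reference to one shared component object. The \texttt{MixComp} class and the mean values below are purely illustrative and are not \HTK's internal representation:

```python
class MixComp:
    """Toy stand-in for a Gaussian mixture component (mean only)."""
    def __init__(self, mean):
        self.mean = mean

# Before tying: state 2 of each model owns its own first component.
modelA_state2 = [MixComp(0.0), MixComp(1.0)]
modelB_state2 = [MixComp(5.0), MixComp(6.0)]

# TI "mix" {*.state[2].mix[1]}: every owner now references the
# same underlying component object.
shared = modelA_state2[0]
modelB_state2[0] = shared

# Re-estimating the shared component updates all owners at once.
shared.mean = 2.5
```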
However, if the mixture component index is missing, then all of the mixture components participating in the tie are {\it joined} rather than tied. More specifically, the commands
\begin{verbatim}
   JO 128 2.0
   TI "mix" {*.state[2-4].mix}
\end{verbatim}
have the following effect. All of the mixture components in states 2 to 4 of all models are collected into a pool. If the number of components in the pool exceeds 128, as set by the preceding join command \texttt{JO}\index{jo@\texttt{JO} command}, then components with the smallest weights are removed until the pool size is exactly 128. Similarly, if the size of the initial pool is less than 128, then mixture components are split using the same algorithm as for the Mix-Up \texttt{MU} command.\index{mixture tying} All states then share all of the mixture components in this pool. The new mixture weights are chosen to be proportional to the log probability of the corresponding new mixture component mean with respect to the original distribution for that state. The log is used here to give a wider spread of mixture weights. All mixture weights are floored to the value of the second argument of the \texttt{JO} command times \texttt{MINMIX}\index{minmix@\texttt{MINMIX}}.

The net effect of the above two commands is to create a set of {\it tied-mixture} HMMs\footnote{Also called {\it semi-continuous} HMMs in the literature.} where the same set of mixture components is shared across all states of all models. However, the type of the HMM set so created will still be \texttt{SHARED} and the internal representation will be the same as for any other set of parameter tyings.
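The pool-resizing step performed by \texttt{JO} can be sketched as follows. This is an idealised reconstruction, not \HTK's implementation: in particular, the splitting step is simplified to halving a component's weight and duplicating it, whereas the actual \texttt{MU} algorithm also perturbs the component means.

```python
def join_pool(components, target_size):
    """Sketch of the JO pool-resizing step.  `components` is a list of
    (weight, params) pairs gathered from all states taking part in the
    tie.  Components with the smallest weights are discarded until the
    pool holds exactly target_size entries; if the pool starts out too
    small, the heaviest components are split (idealised here as halving
    the weight and duplicating the parameters)."""
    pool = sorted(components, key=lambda c: c[0], reverse=True)
    while len(pool) > target_size:
        pool.pop()                   # drop the smallest-weight component
    while len(pool) < target_size:
        w, p = pool.pop(0)           # take the heaviest component...
        pool.append((w / 2, p))      # ...and split it into two halves
        pool.append((w / 2, p))
        pool.sort(key=lambda c: c[0], reverse=True)
    return pool
```

For example, a pool of five components joined with a target size of 128 would be left untouched, whereas a target of 3 would discard the two lightest members.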
To obtain the optimised representation of the tied-mixture weights described in section~\ref{s:tmix}, the following \htool{HHEd} \texttt{HK}\index{hk@\texttt{HK} command} command must be issued
\begin{verbatim}
   HK TIEDHS
\end{verbatim}
This will convert the internal representation to the special tied-mixture form in which all of the tied mixtures are stored in a global table and referenced implicitly instead of being referenced explicitly using pointers.

Tied-mixture HMMs work best if the information relating to different sources such as delta coefficients and energy is separated into distinct data streams. This can be done by setting up multiple data stream HMMs from the outset. However, it is simpler to use the \texttt{SS}\index{ss@\texttt{SS} command} command in \htool{HHEd} to split the data streams of the currently loaded HMM set. Thus, for example, the command
\begin{verbatim}
   SS 4
\end{verbatim}
would convert the currently loaded HMMs to use four separate data streams rather than one. When used in the construction of tied-mixture HMMs, this is analogous to the use of multiple codebooks in discrete density HMMs.

The procedure for building a set of tied-mixture HMMs may be summarised as follows\index{tied-mixtures!build procedure}
\begin{enumerate}
\item Choose a {\it codebook} size for each data stream and then decide how many Gaussian components will be needed from an initial set of monophones to approximately fill this codebook. For example, suppose that there are 48 three state monophones. If codebook sizes of 128 are chosen for streams 1 and 2, and a codebook size of 64 is chosen for stream 3, then single Gaussian monophones would provide enough mixtures in total ($48 \times 3 = 144$ components per stream) to fill the codebooks.
\item Train the initial set of monophones.
\item Use \htool{HHEd} to first split the HMMs into the required number of data streams, tie each individual stream and then convert the tied-mixture HMM set to have the kind \texttt{TIEDHS}.
A typical script to do this for four streams would be
\begin{verbatim}
   SS 4
   JO 256 2.0
   TI st1 {*.state[2-4].stream[1].mix}
   JO 128 2.0
   TI st2 {*.state[2-4].stream[2].mix}
   JO 128 2.0
   TI st3 {*.state[2-4].stream[3].mix}
   JO 64 2.0
   TI st4 {*.state[2-4].stream[4].mix}
   HK TIEDHS
\end{verbatim}
\item Re-estimate the models using \htool{HERest} in the normal way.
\end{enumerate}
Once the set of retrained tied-mixture models has been produced, context-dependent models can be constructed using similar methods to those outlined previously.

When evaluating probabilities in tied-mixture systems, it is often sufficient to sum just the most likely mixture components since, for any particular input vector, its probability with respect to many of the Gaussian components will be very low.\index{pruning!in tied mixtures} \HTK\ tools recognise \texttt{TIEDHS} HMM sets as being special in the sense that additional optimisations are possible. When full tied-mixtures are used, an additional layer of pruning is applied. At each time frame, the log probability of the current observation is computed for each mixture component. Then only those components which lie within a threshold of the most likely component are retained. This pruning is controlled by the \texttt{-c} option in \htool{HRest}, \htool{HERest} and \htool{HVite}.

\mysect{Parameter Smoothing}{psmooth}

When large sets of context-dependent triphones are built using discrete models or tied-mixture models, under-training\index{under-training} can be a severe problem since each state has a large number of mixture weight parameters to estimate. The \HTK\ tool \htool{HSmooth} allows these discrete probabilities or mixture component weights to be smoothed with the monophone weights using a technique called deleted interpolation\index{deleted interpolation}.

\htool{HSmooth} is used in combination with \htool{HERest} working in parallel mode. The training data is split into blocks and each block is used separately to re-estimate the HMMs.
However, since \htool{HERest} is in parallel mode, it outputs a dump file of accumulators instead of updating the models. \htool{HSmooth} is then used in place of the second pass of \htool{HERest}. It reads in the accumulator information from each of the blocks, performs deleted interpolation smoothing on the accumulator values and then outputs the re-estimated HMMs in the normal way.

\htool{HSmooth}\index{hsmooth@\htool{HSmooth}} implements a conventional deleted interpolation scheme. However, optimisation of the smoothing weights uses a fast binary chop\index{binary chop} scheme rather than the more usual Baum-Welch approach. The algorithm for finding the optimal interpolation weights for a given state and stream is as follows, where the description is given in terms of tied-mixture weights but the same applies to discrete probabilities. Assume that \htool{HERest} has been set up to output $N$ separate blocks of accumulators. Let $w_i^{(n)}$ be the $i$'th mixture weight based on the accumulator blocks $1$ to $N$ but excluding block $n$, and let $\bar{w}_i^{(n)}$ be the corresponding context-independent weight. Let $x_i^{(n)}$ be the $i$'th mixture weight count for the deleted block $n$.
The derivative of the log likelihood of the deleted block, given the probability distribution with weights $c_i = \lambda w_i + (1 - \lambda) \bar{w}_i$, is given by
\begin{equation}
 D(\lambda) = \sum_{n=1}^N \sum_{i=1}^M x_i^{(n)}
   \left[ \frac{w_i^{(n)} - \bar{w}_i^{(n)}}{
   \lambda w_i^{(n)} + (1 - \lambda ) \bar{w}_i^{(n)}} \right]
\end{equation}
Since the log likelihood is a concave function of $\lambda$, this derivative is monotonically decreasing and so allows the optimal value of $\lambda$ to be found by a simple binary chop algorithm, viz.
\begin{verbatim}
   function FindLambdaOpt:
      if (D(0) <= 0) return 0;
      if (D(1) >= 0) return 1;
      l=0; r=1;
      for (k=1; k<=maxStep; k++){
         m = (l+r)/2;
         if (D(m) == 0) return m;
         if (D(m) > 0) l=m; else r=m;
      }
      return m;
\end{verbatim}
\htool{HSmooth} is invoked in a similar way to \htool{HERest}. For example, suppose that the directory \texttt{hmm2} contains a set of accumulator files output by the first pass of \htool{HERest} running in parallel mode, using as source the HMM definitions listed in \texttt{hlist} and stored in \texttt{hmm1/HMMDefs}. Then the command
\begin{verbatim}
   HSmooth -c 4 -w 2.0 -H hmm1/HMMDefs -M hmm2 hlist hmm2/*.acc
\end{verbatim}
would generate a new smoothed HMM set in \texttt{hmm2}. Here the \texttt{-w} option is used to set the minimum mixture component weight in any state to twice the value of \texttt{MINMIX}\index{minmix@\texttt{MINMIX}}. The \texttt{-c} option sets the maximum number of iterations of the binary chop procedure to be 4.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "htkbook"
%%% End:
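The binary chop pseudocode above can be transcribed directly into a short runnable sketch; here $D$ is supplied as a callable and \texttt{max\_step} plays the role of \texttt{maxStep} (the function name and the default step count are illustrative assumptions):

```python
def find_lambda_opt(D, max_step=16):
    """Binary chop for the optimal interpolation weight lambda in [0,1].

    D(lam) is the derivative of the deleted-block log likelihood and is
    assumed monotonically decreasing, so the sign of D(m) tells us on
    which side of m the zero crossing (the optimum) lies."""
    if D(0.0) <= 0.0:
        return 0.0          # likelihood already decreasing at lambda = 0
    if D(1.0) >= 0.0:
        return 1.0          # likelihood still increasing at lambda = 1
    l, r = 0.0, 1.0
    m = 0.5
    for _ in range(max_step):
        m = (l + r) / 2.0
        d = D(m)
        if d == 0.0:
            return m
        if d > 0.0:
            l = m           # optimum lies above m
        else:
            r = m           # optimum lies below m
    return m
```

For instance, with the decreasing derivative $D(\lambda) = 0.5 - \lambda$ the routine converges on $\lambda = 0.5$.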