MLLR mean and variance transformation calculations can be found in
section~\ref{s:mllrformulae}.

\htool{HEAdapt} is typically invoked by a command line of the form
\begin{verbatim}
    HEAdapt -S adaptlist -I labs -H dir1/hmacs -M dir2 hmmlist
\end{verbatim}
where \texttt{hmmlist} contains the list of HMMs.
%On startup, \htool{HEAdapt} will
%load the HMM master macro file (MMF) \texttt{hmacs} (there may be
%several of these). It then searches for a definition for each
%HMM listed in the \texttt{hmmlist}, if any HMM name is not found,
%it attempts to open a file of the same name in the current directory
%(or a directory designated by the \texttt{-d} option).
%Usually in large subword systems, however, all of the HMM definitions
%will be stored in MMFs. Similarly, all of the required transcriptions
%will be stored in one or more Master Label Files
%\index{master label files} (MLFs), and in the
%example, they are stored in the single MLF called \texttt{labs}.
Once all MMFs and MLFs have been loaded, \htool{HEAdapt} processes each file in the
\texttt{adaptlist}, and accumulates the required statistics as described
above. On completion, an updated MMF is output to the directory
\texttt{dir2}.
If the following form of the command is used
\begin{verbatim}
    HEAdapt -S adaptlist -I labs -H dir1/hmacs -K dir2/tmf hmmlist
\end{verbatim}
then on completion a transform model file (TMF) \texttt{tmf} is output
to the directory \texttt{dir2}. This process is illustrated by
Fig~\href{f:headaptrdp}.
Section~\ref{s:tmfs} describes the TMF format in more
detail. The output \texttt{tmf} contains transforms that transform the
MMF \texttt{hmacs}.
Once this is saved, \htool{HVite} can be used to perform recognition
for the adapted speaker, either using a transformed MMF or by using the
speaker-independent MMF together with a speaker-specific TMF.

\htool{HEAdapt} employs the same pruning mechanism as \htool{HERest}
during the forward-backward computation.
As such the pruning on the
backward path is under the user's control, and the beam is set
using the \texttt{-t} option.

\htool{HEAdapt} can also be run several times in block or static
fashion. For instance, a first pass
might entail a global adaptation (forced using the \texttt{-g} option),
producing the TMF \texttt{global.tmf} by invoking
\begin{verbatim}
    HEAdapt -g -S adaptlist -I labs -H mmf -K tmfs/global.tmf \
        hmmlist
\end{verbatim}
The second pass could load in the global transformation (and transform
the model set) using the \texttt{-J} option, obtaining a better
frame/state alignment than the speaker independent model set, and
output a set of regression class transformations,
\begin{verbatim}
    HEAdapt -S adaptlist -I labs -H mmf -K tmfs/rc.tmf \
        -J tmfs/global.tmf hmmlist
\end{verbatim}
Note again that the number of transformations is selected
automatically and is
dependent on the node occupation threshold setting and the amount of
adaptation data available. When producing a TMF, \htool{HEAdapt}
always generates a TMF to transform the input MMF.
In the last example the input MMF is transformed by the global
transform file \texttt{global.tmf} in order to obtain the frame/state
alignment only. The final TMF that is output, \texttt{rc.tmf},
contains the set of transforms to transform the input
MMF \texttt{mmf}, based on this frame/state alignment.

As an alternative, the second pass could entail MLLR together with
MAP adaptation, outputting a new model set. Note that with MAP
adaptation a transform cannot be saved, so a full HMM set must be
output.
\begin{verbatim}
    HEAdapt -S adaptlist -I labs -H mmf -M dir2 -k -j 12.0 \
        -J tmfs/global.tmf hmmlist
\end{verbatim}
Note that MAP alone could be used by removing the \texttt{-k}
option.
The argument to the \texttt{-j} option represents the MAP
adaptation scaling factor.

%\pagebreak
\mysect{MLLR Formulae}{mllrformulae}

For reference purposes, this section lists the various formulae
employed within the \HTK\ adaptation tool\index{adaptation!MLLR
formulae}. It is assumed throughout
that single stream data and diagonal covariances are used.
All the formulae are standard and can be found in the literature.

The following notation is used in this section
\begin{tabbing}
++ \= ++++++++ \= \kill
\> $\mathcal{M}$ \> the model set\\
\> $T$ \> number of observations \\
\> $m$ \> a mixture component \\
\> $\bm{O}$ \> a sequence of observations \\
\> $\bm{o}(t)$ \> the observation at time $t$, $1 \leq t \leq T $\\
\> $\bm{\mu}_{m_r}$ \> mean vector for the mixture component $m_r$\\
\> $\bm{\xi}_{m_r}$ \> extended mean vector for the mixture component $m_r$\\
\> $\bm{\Sigma}_{m_r}$ \> covariance matrix for the mixture component $m_r$ \\
\> $L_{m_r}(t)$ \> the occupancy probability for the mixture component $m_r$\\
\> \> at time $t$
\end{tabbing}
\newcommand{\like}{L_{m_r}(t)}

\mysubsect{Estimation of the Mean Transformation Matrix}{mtransest}

To enable robust transformations to be trained, the transform matrices
are tied across a number of Gaussians.
The set of Gaussians which
share a transform is referred to as a regression class.
For a particular transform case $\bm{W}_m$, the $R$ Gaussian
components $\left\{m_1, m_2, \dots, m_R\right\}$ will be tied
together, as determined by the regression class tree (see
section~\ref{s:reg_classes}).
By formulating the standard auxiliary function, then maximising it
with respect to the transformed mean, and considering only these tied
Gaussian components, the following is obtained,
\hequation{
\sum_{t=1}^{T} \sum_{r=1}^{R} \like\bm{\Sigma}_{m_r}^{-1}\bm{o}(t)\bm{\xi}_{m_r}^T =
\sum_{t=1}^{T} \sum_{r=1}^{R} \like\bm{\Sigma}_{m_r}^{-1}\bm{W}_m \bm{\xi}_{m_r}\bm{\xi}_{m_r}^T
}{meantrans1}
and $\like$, the occupation likelihood, is defined as,
\[ \like = p(q_{m_r}(t)\;|\;\mathcal{M}, \bm{O}_T)
\]
where $q_{m_r}(t)$ indicates the Gaussian component $m_r$ at time $t$,
and $\bm{O}_T = \left\{\bm{o}(1),\dots,\bm{o}(T)\right\}$ is the
adaptation data. The occupation likelihood is obtained from the
forward-backward process described in section~\ref{s:bwformulae}.

To solve for $\bm{W}_m$, two new terms are defined.
\begin{enumerate}
\item
The left hand side of equation~\ref{e:meantrans1} is independent of
the transformation matrix and is referred to as $\bm{Z}$, where
\[ \bm{Z} = \sum_{t=1}^{T} \sum_{r=1}^{R} \like\bm{\Sigma}_{m_r}^{-1}\bm{o}(t)\bm{\xi}_{m_r}^T
\]
\item
A new variable $\bm{G}^{(i)}$ is defined with elements
\[ g_{jq}^{(i)} = \sum_{r=1}^{R} v_{ii}^{(r)} d_{jq}^{(r)}
\]
where
\[ \bm{V}^{(r)} = \sum_{t=1}^{T} \like \bm{\Sigma}_{m_r}^{-1}
\]
and
\[ \bm{D}^{(r)} = \bm{\xi}_{m_r}\bm{\xi}_{m_r}^T \]
\end{enumerate}
From these two new terms, $\bm{W}_m$ can be
calculated row by row from
\[ \bm{w}_i^T = \left[\bm{G}^{(i)}\right]^{-1} \bm{z}_i^T
\]
where $\bm{w}_i$ is the $i^{th}$ row vector of $\bm{W}_m$ and $\bm{z}_i$
is the $i^{th}$ row vector of $\bm{Z}$.

The regression class tree is used to generate the classes dynamically,
so it is not known a priori which regression classes will be used to
estimate the transform.
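The accumulation of $\bm{Z}$ and $\bm{G}^{(i)}$ and the row-by-row solution for $\bm{W}_m$ can be illustrated with a minimal NumPy sketch. This is not HTK code (HTK is written in C); the function name, argument layout, and the convention that the extended mean vector is $\bm{\xi} = [1\;\bm{\mu}^T]^T$ with the offset term first are all assumptions made for the example, and diagonal covariances are assumed throughout, as in the text.

```python
import numpy as np

def estimate_mean_transform(obs, gammas, means, inv_vars):
    """Sketch of MLLR mean transform estimation for one regression class.

    obs      : (T, n) observation vectors o(t)
    gammas   : (T, R) occupation probabilities L_{m_r}(t)
    means    : (R, n) component mean vectors mu_{m_r}
    inv_vars : (R, n) diagonal inverse variances (Sigma_{m_r}^{-1})
    Returns the (n, n+1) transform W such that mu_hat = W @ xi.
    """
    T, n = obs.shape
    R = means.shape[0]
    # Extended mean vectors xi = [1, mu^T]^T (offset-first convention assumed)
    xi = np.hstack([np.ones((R, 1)), means])              # (R, n+1)

    Z = np.zeros((n, n + 1))          # Z = sum_t sum_r L(t) Sigma^-1 o(t) xi^T
    G = np.zeros((n, n + 1, n + 1))   # one G^{(i)} per output dimension i
    for r in range(R):
        occ = gammas[:, r]                                # (T,)
        # V^{(r)} = (sum_t L(t)) Sigma^{-1}; diagonal, so keep as a vector
        v_diag = occ.sum() * inv_vars[r]                  # (n,)
        D = np.outer(xi[r], xi[r])                        # D^{(r)} = xi xi^T
        # Sigma^{-1} is diagonal, so row i of Z is scaled by inv_vars[r][i]
        weighted_obs = (occ[:, None] * obs).sum(axis=0)   # sum_t L(t) o(t)
        Z += inv_vars[r][:, None] * np.outer(weighted_obs, xi[r])
        for i in range(n):
            G[i] += v_diag[i] * D     # g_jq^{(i)} = sum_r v_ii^{(r)} d_jq^{(r)}

    # Solve each row independently: w_i^T = G^{(i)^-1} z_i^T
    return np.vstack([np.linalg.solve(G[i], Z[i]) for i in range(n)])
```

When the adaptation data exactly matches the model (each frame lying at a component mean, with hard occupation), the recovered transform is the identity mapping, which provides a simple sanity check of the accumulation.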
This does not present a problem, since
$\bm{G}^{(i)}$ and $\bm{Z}$ for the chosen regression class may be
obtained from its child classes (as defined by the tree). If the
parent node $R$ has children $\left\{R_1,\dots,R_C\right\}$ then
\[ \bm{Z} = \sum_{c=1}^{C} \bm{Z}^{(R_c)}
\]
and
\[ \bm{G}^{(i)} = \sum_{c=1}^{C} \bm{G}^{(iR_c)}
\]
From this it is clear that $\bm{G}^{(i)}$ and $\bm{Z}$ need only be
calculated for the most specific regression classes possible --
i.e.\ the base classes.

\mysubsect{Estimation of the Variance Transformation Matrix}{vtransest}

Estimation of the variance transformation matrices is only available
for diagonal covariance Gaussian systems. The Gaussian covariance is
transformed using,
\[ \hat{\bm{\Sigma}}_{m} = \bm{B}_m^T\bm{H}_m\bm{B}_m
\]
where $\bm{H}_m$ is the linear transformation to be estimated and
$\bm{B}_m$ is the inverse of the Cholesky factor of $\bm{\Sigma}_{m}^{-1}$,
so
\[ \bm{\Sigma}_{m}^{-1} = \bm{C}_m\bm{C}_m^T
\]
and
\[ \bm{B}_m = \bm{C}_m^{-1}
\]
After rewriting the auxiliary function, the transform matrix $\bm{H}_m$
is estimated from,
\[ \bm{H}_m = \frac{ \sum_{r=1}^{R}
\bm{C}_{m_r}^T \left[ \sum_{t=1}^{T} \like (\bm{o}(t) - \bm{\mu}_{m_r})
(\bm{o}(t) - \bm{\mu}_{m_r})^T \right] \bm{C}_{m_r} }
{ \sum_{t=1}^{T} \sum_{r=1}^{R} \like }
\]
Here, $\bm{H}_m$ is forced to be a diagonal transformation by setting
the off-diagonal terms to zero, which ensures that
$\hat{\bm{\Sigma}}_{m}$ is also diagonal.

%%% Local Variables:
%%% mode: plain-tex
%%% TeX-master: "htkbook"
%%% End:
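The variance transform estimation above can likewise be sketched in NumPy. This is an illustrative reconstruction, not HTK's implementation; the function name and interface are invented for the example, and diagonal covariances are assumed, so the Cholesky factor $\bm{C}_m$ of $\bm{\Sigma}_m^{-1}$ reduces to $\mathrm{diag}(1/\sigma)$ and its inverse $\bm{B}_m$ to $\mathrm{diag}(\sigma)$.

```python
import numpy as np

def estimate_variance_transform(obs, gammas, means, covars):
    """Sketch of the shared diagonal variance transform H for one
    regression class of R diagonal-covariance Gaussians.

    obs    : (T, n) observations
    gammas : (T, R) occupation probabilities L_{m_r}(t)
    means  : (R, n) component means
    covars : (R, n) diagonal covariances Sigma_{m_r}
    Returns (H, transformed diagonal covariances).
    """
    T, n = obs.shape
    R = means.shape[0]
    num = np.zeros((n, n))
    for r in range(R):
        # Sigma^{-1} = C C^T; for a diagonal covariance, C = diag(1/sigma)
        C = np.diag(1.0 / np.sqrt(covars[r]))
        diff = obs - means[r]                            # (T, n)
        # sum_t L(t) (o(t) - mu)(o(t) - mu)^T
        S = (gammas[:, r][:, None] * diff).T @ diff      # (n, n)
        num += C.T @ S @ C
    H = num / gammas.sum()           # divide by sum_t sum_r L(t)
    H = np.diag(np.diag(H))          # force H diagonal (zero off-diagonals)

    # Transformed covariances: Sigma_hat = B^T H B, with B = C^{-1} = diag(sigma)
    new_covars = []
    for r in range(R):
        B = np.diag(np.sqrt(covars[r]))
        new_covars.append(np.diag(B.T @ H @ B))
    return H, np.vstack(new_covars)
```

As a sanity check, if the weighted scatter of the adaptation data around a component mean equals that component's covariance, the estimated $\bm{H}_m$ is the identity and the covariances are unchanged.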