%/* ----------------------------------------------------------- */
%/*                                                             */
%/*                          ___                                */
%/*                       |_| | |_/   SPEECH                    */
%/*                       | | | | \   RECOGNITION               */
%/*                       =========   SOFTWARE                  */
%/*                                                             */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/* developed at:                                               */
%/*                                                             */
%/*      Speech Vision and Robotics group                       */
%/*      Cambridge University Engineering Department            */
%/*      http://svr-www.eng.cam.ac.uk/                          */
%/*                                                             */
%/*      Entropic Cambridge Research Laboratory                 */
%/*      (now part of Microsoft)                                */
%/*                                                             */
%/* ----------------------------------------------------------- */
%/*         Copyright: Microsoft Corporation                    */
%/*          1995-2000 Redmond, Washington USA                  */
%/*                    http://www.microsoft.com                 */
%/*                                                             */
%/*               2002 Cambridge University                     */
%/*                    Engineering Department                   */
%/*                                                             */
%/*   Use of this software is governed by a License Agreement   */
%/*    ** See the file License for the Conditions of Use  **    */
%/*    **     This banner notice must not be removed      **    */
%/*                                                             */
%/* ----------------------------------------------------------- */
%
% HTKBook - Steve Young 15/11/95
%

\mychap{HMM Definition Files}{HMMDefs}

\sidepic{Tool.model}{80}{}
The principal function of \HTK\ is to manipulate sets of hidden
Markov models (HMMs). The definition of a HMM must specify the
model topology, the transition parameters and the output
distribution parameters. The HMM observation vectors can be
divided into multiple independent data streams and each stream can
have its own weight. In addition, a HMM can have ancillary
information such as duration parameters.
\HTK\ supports both continuous mixture densities and discrete
distributions. \HTK\ also provides a generalised tying mechanism which
allows parameters to be shared within and between models.
\index{HMM!definitions}

In order to encompass this rich variety of HMM types within
a single framework, \HTK\ uses a formal language to define HMMs.
The interpretation of this language is handled
by the library module \htool{HModel} which is responsible for converting
between the external and internal representations of HMMs.  In addition,
it provides all the basic probability function calculations.
A second module \htool{HUtil} provides various additional facilities for
manipulating HMMs once they have been loaded into memory.

The purpose of this chapter is to describe
the HMM definition language in some detail.  The chapter begins by
describing how to write individual HMM definitions.  \HTK\ macros are
then explained and the mechanisms for defining a complete
model set are presented.  The various flavours of
HMM are then described and the use of binary files discussed.
Finally, a formal description of the \HTK\ HMM definition
language is given.

As will be seen, the definition of a large HMM system
can involve considerable complexity.  However, in
practice, HMM systems are built incrementally.
The usual starting point is a single HMM definition which is then
repeatedly cloned and refined using the various \HTK\ tools
(in particular, \htool{HERest} and \htool{HHEd}).
Hence, in practice, the \HTK\ user rarely has to generate
complex HMM definition files directly.

\mysect{The HMM Parameters}{HMMparm}

A HMM consists of a number of states.  Each state $j$ has an associated
observation probability distribution $b_{j}(\bm{o}_t)$ which
determines the probability of generating observation $\bm{o}_t$ at
time $t$ and each pair of states $i$ and $j$ has an associated
transition probability $a_{ij}$.  In \HTK, the entry state $1$ and
the exit state $N$ of an $N$ state HMM are non-emitting.

\sidefig{hmm1}{70}{Simple Left-Right HMM}{-4}{Fig.~\href{f:hmm1} shows a simple left-right HMM with five states in
total.  Three of these are emitting states and have output probability
distributions associated with them.  The transition matrix for
this model will have 5 rows and 5 columns.  Each row will sum to one
except for the final row which is always all zero since no
transitions are allowed out of the final state.
\HTK\ is principally concerned with continuous\index{HMM!parameters}
density models in which each observation probability distribution
is represented by a mixture Gaussian density.  In this case,
for state $j$
the probability $b_{j}(\bm{o}_t)$ of generating observation $\bm{o}_t$ is given by}
\hequation{
  b_{j}(\bm{o}_t) = \prod_{s=1}^S \left[
     \sum_{m=1}^{M_{js}} c_{jsm} {\cal N}(\bm{o}_{st};
                    \bm{\mu}_{jsm}, \bm{\Sigma}_{jsm})
  \right]^{\gamma_s}
}{cdpdf}
where $M_{js}$ is the number of mixture components\index{mixture component}
in state $j$ for stream $s$,
$c_{jsm}$ is the weight of the $m$'th component and
${\cal N}(\cdot; \bm{\mu}, \bm{\Sigma})$ is a multivariate Gaussian
with mean vector\index{mean vector} $\bm{\mu}$ and covariance
matrix\index{covariance matrix} $\bm{\Sigma}$, that
is\index{output probability!continuous case}
\hequation{
  {\cal N}(\bm{o}; \bm{\mu}, \bm{\Sigma}) =
      \frac{1}{\sqrt{(2 \pi)^n | \bm{\Sigma} |}}
      e^{- \frac{1}{2}(\bm{o}-\bm{\mu})' \bm{\Sigma}^{-1}(\bm{o}-\bm{\mu})}
}{gnorm}
where $n$ is the dimensionality of $\bm{o}$.  The exponent $\gamma_s$ is
a stream weight\index{stream weight} and its default value is one.
Other values can be used to emphasise particular streams; however,
none of the standard \HTK\ tools manipulate it.

\HTK\ also supports discrete probability
distributions\index{discrete probability} in which
case\index{output probability!discrete case}
\hequation{
  b_{j}(\bm{o}_t) = \prod_{s=1}^S \left\{
     P_{js}[v_s(\bm{o}_{st})]
  \right\}^{\gamma_s}
}{ddpdf}
where $v_s(\bm{o}_{st})$ is the output of the vector quantiser for stream $s$
given input vector $\bm{o}_{st}$ and $P_{js}[v]$ is the probability of state
$j$ generating symbol $v$ in stream $s$.

In addition to the above, any model or state can have an
associated vector of duration parameters\index{duration parameters}
$\{d_k\}$\footnote{No current \HTK\ tool can estimate or use these.}.
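
As a worked special case of the continuous density given earlier,
offered as an illustrative sketch rather than a formula used by any
particular tool, take a single stream ($S=1$, $\gamma_1=1$), a single
mixture component with unit weight, and a diagonal covariance matrix.
Writing $\mu_{jk}$ and $\sigma^2_{jk}$ for the $k$'th mean and variance
of state $j$, and $o_{tk}$ for the $k$'th component of $\bm{o}_t$
(notation introduced here only for this example), the log output
probability becomes
\[
  \log b_{j}(\bm{o}_t) = -\frac{1}{2} \left[ n \log(2\pi)
      + \sum_{k=1}^{n} \log \sigma^2_{jk}
      + \sum_{k=1}^{n} \frac{(o_{tk}-\mu_{jk})^2}{\sigma^2_{jk}} \right]
\]
For instance, with $n=4$, unit variances and $\bm{o}_t$ equal to the
mean vector, both sums vanish and
$\log b_{j}(\bm{o}_t) = -2\log(2\pi) \approx -3.68$.
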
Also, it is necessary to specify the kind of the observation
vectors, and the width of the observation vector in each stream.
Thus, the total information needed to define a single HMM is as follows
\begin{itemize}
   \item type of observation vector
   \item number and width of each data stream
   \item optional model duration parameter vector
   \item number of states
   \item for each emitting state and each stream
      \begin{itemize}
         \item mixture component weights or discrete probabilities
         \item if continuous density, then means and covariances
         \item optional stream weight vector
         \item optional duration parameter vector
      \end{itemize}
   \item transition matrix
\end{itemize}
The following sections explain how these are defined.

\mysect{Basic HMM Definitions}{OneHMM}

Some \HTK\ tools require a single HMM to be defined.  For example, the
isolated-unit re-estimation tool \htool{HRest} would be invoked as
\begin{verbatim}
    HRest hmmdef s1 s2 s3 ....
\end{verbatim}
\noindent
This would cause the model defined in the file \texttt{hmmdef}
to be input and its parameters re-estimated using the speech data
files \texttt{s1}, \texttt{s2}, etc.
\index{HMM definition!basic form}

\sideprog{hmm1def}{60}{Definition for Simple L-R HMM}{\hmmc{h}{hmm1} \\
\hmkw{BeginHMM} \\
\> \hmkw{VecSize} 4 \hmkw{MFCC}  \\
\> \hmkw{NumStates} 5  \\
\> \hmkw{State} 2 \\
\>\>    \hmkw{Mean} 4 \\
\>\>\>       0.2 0.1 0.1 0.9  \\
\>\>    \hmkw{Variance} 4 \\
\>\>\>       1.0 1.0 1.0 1.0  \\
\> \hmkw{State} 3 \\
\>\>    \hmkw{Mean} 4 \\
\>\>\>       0.4 0.9 0.2 0.1  \\
\>\>    \hmkw{Variance} 4 \\
\>\>\>       1.0 2.0 2.0 0.5  \\
\> \hmkw{State} 4 \\
\>\>    \hmkw{Mean} 4 \\
\>\>\>       1.2 3.1 0.5 0.9  \\
\>\>    \hmkw{Variance} 4 \\
\>\>\>       5.0 5.0 5.0 5.0  \\
\> \hmkw{TransP} 5 \\
\>\>   0.0 0.5 0.5 0.0 0.0 \\
\>\>   0.0 0.4 0.4 0.2 0.0 \\
\>\>   0.0 0.0 0.6 0.4 0.0 \\
\>\>   0.0 0.0 0.0 0.7 0.3 \\
\>\>   0.0 0.0 0.0 0.0 0.0  \\
\hmkw{EndHMM}}{}
HMM definition files consist of a sequence of symbols representing
the elements of a simple language.  These symbols are mainly
keywords written within angle brackets and integer and
floating point numbers. The full \HTK\ definition language is presented
more formally later in section~\ref{s:hmmdef}.  For now, the
main features of the language will be described by some
examples.
\index{HMM definition!symbols in}

Fig~\href{f:hmm1def} shows a HMM definition corresponding to the simple
left-right HMM illustrated in Fig~\href{f:hmm1}.  It is a continuous density
HMM with 5 states in total, 3 of which are emitting.  The first symbol in the
file, \hmmt{h}, indicates that the following string is the name of a macro of
type \textsf{h} which means that it is a HMM definition (macros are explained
in detail later).  Thus, this definition describes a HMM called ``hmm1''.
Note that HMM names should be composed of alphanumeric characters only and must
not consist solely of numbers. The HMM definition itself is bracketed by the
symbols \hmkw{BeginHMM}\index{beginhmm@$<$BeginHMM$>$} and
\hmkw{EndHMM}\index{endhmm@$<$EndHMM$>$}.\index{HMM name}

The first line of the definition proper
specifies\index{HMM definition!global features}
the \textit{global} features of the HMM.  In any system
consisting of many HMMs, these
features will be the same for all of them.
In this case, the global definitions indicate that
the observation vectors have 4 components
(\hmkw{VecSize}\index{vecsize@$<$VecSize$>$} 4) and that they denote
MFCC coefficients\index{MFCC coefficients} (\hmkw{MFCC}).
The next line specifies the number of states in the HMM.
There then follows a definition for each emitting state $j$, each of which
has a single mean vector $\bm{\mu}_j$ introduced by the keyword
\hmkw{Mean}\index{mean@$<$Mean$>$} and a diagonal variance vector $\bm{\Sigma}_j$
introduced by the keyword \hmkw{Variance}.\index{variance@$<$Variance$>$}
The definition ends with the transition matrix $\{a_{ij}\}$
introduced by the keyword \hmkw{TransP}\index{transp@$<$TransP$>$}.
\index{HMM definition!mean vector}\index{HMM definition!covariance matrix}\index{HMM definition!transition matrix}

Notice that
the dimension of each vector or matrix is specified
explicitly before listing the component values.  These
dimensions must be consistent with the corresponding
observation width (in the case of output distribution parameters) or
number of states (in the case of transition matrices).
Although in this example they could be inferred,
\HTK\ requires that they are included explicitly since, as will
be described shortly, the vectors and matrices can be detached from
the HMM definition and stored elsewhere as macros.
\index{vector dimensions}\index{matrix dimensions}

The definition for \textsf{hmm1} makes use of many defaults.
In particular, there is no definition for the number of
input data streams or for the number of mixture
components per output distribution.  Hence, in both
cases, a default of 1 is assumed.

Fig~\href{f:hmm2def} shows a HMM definition in which
the emitting states are 2-component mixture Gaussians.
The number of mixture components in each state $j$ is indicated by the keyword
\hmkw{NumMixes}\index{nummixes@$<$NumMixes$>$} and each mixture
component is prefixed by the keyword \hmkw{Mixture}\index{mixture@$<$Mixture$>$}
followed by the component index $m$ and component weight $c_{jm}$.  Note
that there is no requirement for the number of mixture components
to be the same in each distribution.
\index{HMM definition!mixture components}

State definitions and the mixture components within them may be
listed in any order.
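
As a small worked illustration, assuming the single-stream case with
the default stream weight of one, a state $j$ defined with two mixture
components has the output density
\[
  b_{j}(\bm{o}_t) = c_{j1} {\cal N}(\bm{o}_t; \bm{\mu}_{j1}, \bm{\Sigma}_{j1})
                  + c_{j2} {\cal N}(\bm{o}_t; \bm{\mu}_{j2}, \bm{\Sigma}_{j2})
\]
Each term is identified by its component index $m$, so the value of
the sum does not depend on the order in which the components appear
in the file.
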
When a HMM definition is loaded, a check is made
that all the required components have been defined.  In addition,
checks are made that the mixture component weights and each row
of the transition matrix sum to one.
If very rapid loading is required, this consistency checking can be inhibited
by setting the Boolean configuration variable
\texttt{CHKHMMDEFS}\index{chkhmmdefs@\texttt{CHKHMMDEFS}} to
false.

As an alternative to diagonal variance vectors, a Gaussian distribution
can have a full rank covariance\index{full rank covariance} matrix.
An example of this is the definition for \textsf{hmm3}
shown in Fig~\href{f:hmm3def}.  Since covariance matrices are symmetric,
they are stored in upper triangular form\index{upper triangular form},
i.e.\ each row of the matrix
starts at the diagonal element\footnote{Covariance matrices
are actually stored internally in lower triangular
form}.  Also, covariance matrices are stored
in their inverse form, i.e.\ HMM definitions contain $\bm{\Sigma}^{-1}$
rather than $\bm{\Sigma}$.  To reflect this, the keyword chosen to
introduce a full covariance matrix is
\hmkw{InvCovar}\index{invcovar@$<$InvCovar$>$}.

\sideprog{hmm2def}{60}{Simple Mixture Gaussian HMM}{\hmmc{h}{hmm2} \\
\hmkw{BeginHMM} \\
\>\hmkw{VecSize} 4 \hmkw{MFCC} \\