The \hmmt{p} macro is used by the HMM editor \htool{HHEd}
for building tied mixture systems (see section~\ref{s:tmix}).
The \hmmt{l} or \hmmt{p} macros are special in the sense
that they are created implicitly in order to represent specific kinds of
parameter sharing and they never occur explicitly in HMM definitions.

\mysect{HMM Sets}{hmmsets}

The previous sections have described how a single HMM definition
can be specified. However, many \HTK\ tools require complete model
sets to be specified rather than just a single model.\index{HMM sets}
When this is the case, the individual HMMs which belong to the set
are listed in a file rather than being enumerated explicitly on
the command line. Thus, for example, a typical invocation of
the tool \htool{HERest} might be as follows
\begin{verbatim}
    HERest ... -H mf1 -H mf2 ... hlist
\end{verbatim}
where each \texttt{-H} option names a macro file and \texttt{hlist}
contains a list of HMM names, one per line. For example, it might contain
\begin{verbatim}
    ha
    hb
    hc
\end{verbatim}
In a case such as this, the macro files would normally\index{HMM lists}
contain definitions for the models \texttt{ha}, \texttt{hb} and
\texttt{hc}, along with any lower level macro definitions that they
might require.

As an illustration,
Fig~\href{f:mac6def} and Fig~\href{f:hmm6def} give examples of
what the macro files \texttt{mf1} and \texttt{mf2} might contain.
The first file contains definitions for three states and a
transition matrix. The second file
contains definitions for the three HMMs. In this example,
each HMM shares the three
states and the common transition matrix.
A HMM set such as this is called a \textit{tied-state} system.

The order in which macro files are listed on
the command line and the order of definition within
each file must ensure that all macros
are defined before they are referenced.
Thus, macro files are typically organised such that all
low level structures come first followed by states and
transition matrices, with the actual HMM definitions coming last.

When the HMM list contains the name of a HMM for which no corresponding
macro has been defined, then an attempt is made to open a file with the
same name. This file is expected to contain a single definition
corresponding to the required HMM. Thus, the general mechanism for
loading a set of HMMs is as shown in Fig~\href{f:hsetdef}. In this
example, the HMM list \texttt{hlist} contains the names of five HMMs of which
only three have been predefined via the macro files. Hence, the
remaining definitions are found in individual HMM definition files
\texttt{hd} and \texttt{he}.

When a large number of HMMs must be loaded from individual files, it is
common to store them in a specific directory. Most \HTK\ tools allow
this directory to be specified explicitly using a command line option.
For example, in the command
\begin{verbatim}
    HERest -d hdir ... hlist ....
\end{verbatim}
the definitions for the HMMs listed in \texttt{hlist} will be
searched for in the subdirectory \texttt{hdir}.

After loading each HMM set,\index{tied-state} \htool{HModel} marks it as belonging
to one of the following categories (called the
\textit{HSKind}\index{hskind@HSKind}):
\begin{itemize}
\item \texttt{PLAINHS}
\item \texttt{SHAREDHS}
\item \texttt{TIEDHS}
\item \texttt{DISCRETEHS}
\end{itemize}
Any HMM set containing discrete output distributions is assigned\index{HMM sets!types}
to the \texttt{DISCRETEHS}\index{discretehs@\texttt{DISCRETEHS}} category
(see section~\ref{s:dischmm}). If all mixture components are tied, then it
is assigned to the \texttt{TIEDHS} category (see section~\ref{s:tmix}).
If it contains any shared states (\hmmt{s} macros) or Gaussians
(\hmmt{m} macros) then it is
\texttt{SHAREDHS}\index{sharedhs@\texttt{SHAREDHS}}.
Otherwise, it is \texttt{PLAINHS}. The category assigned
to a HMM set determines which of several possible optimisations
the various \HTK\ tools can apply to it. As a check, the required kind of
a HMM set can also be set via the configuration variable \texttt{HMMSETKIND}.
For debugging purposes, this can also be used to re-categorise a
\texttt{SHAREDHS} system as
\texttt{PLAINHS}\index{plainhs@\texttt{PLAINHS}}.

As shown in Figure~\href{f:hierarch}, complete HMM
definitions can be tied as well as their individual parameters. However,
tying at the HMM level is defined in a different way. HMM lists have so far\index{HMM tying}
been described as simply a list of model names. In fact, every HMM has two
names: a {\it logical} name and a {\it physical} name. The logical name
reflects the r\^{o}le of the model and the physical name is used to
identify the definition on disk. By default, the logical and physical names
are identical. HMM tying is implemented by letting several logically
distinct HMMs share the same physical definition.
This is done by giving
an explicit physical name immediately after the logical name in a HMM list\index{HMM lists}.

\putprog{mac6def}{100}{File mf1: shared state and
transition matrix macros}{
\hmmt{o} \>\> \hmkw{VecSize} 4 \hmkw{MFCC} \\
\hmmc{s}{stateA} \\
\> \hmkw{Mean} 4 \\
\>\> 0.2 0.1 0.1 0.9 \\
\> \hmkw{Variance} 4 \\
\>\> 1.0 1.0 1.0 1.0 \\
\hmmc{s}{stateB} \\
\> \hmkw{Mean} 4 \\
\>\> 0.4 0.9 0.2 0.1 \\
\> \hmkw{Variance} 4 \\
\>\> 1.0 2.0 2.0 0.5 \\
\hmmc{s}{stateC} \\
\> \hmkw{Mean} 4 \\
\>\> 1.2 3.1 0.5 0.9 \\
\> \hmkw{Variance} 4 \\
\>\> 5.0 5.0 5.0 5.0 \\
\hmmc{t}{tran} \\
\> \hmkw{TransP} 5 \\
\>\> 0.0 0.5 0.5 0.0 0.0 \\
\>\> 0.0 0.4 0.4 0.2 0.0 \\
\>\> 0.0 0.0 0.6 0.4 0.0 \\
\>\> 0.0 0.0 0.0 0.7 0.3 \\
\>\> 0.0 0.0 0.0 0.0 0.0
}

\putprog{hmm6def}{100}{Simple Tied-State System}{
\hmmc{h}{ha} \\
\hmkw{BeginHMM} \\
\> \hmkw{NumStates} 5 \\
\> \hmkw{State} 2 \\
\>\> \hmmc{s}{stateA} \\
\> \hmkw{State} 3 \\
\>\> \hmmc{s}{stateB} \\
\> \hmkw{State} 4 \\
\>\> \hmmc{s}{stateB} \\
\> \hmmc{t}{tran} \\
\hmkw{EndHMM} \\ \\
\hmmc{h}{hb} \\
\hmkw{BeginHMM} \\
\> \hmkw{NumStates} 5 \\
\> \hmkw{State} 2 \\
\>\> \hmmc{s}{stateB} \\
\> \hmkw{State} 3 \\
\>\> \hmmc{s}{stateA} \\
\> \hmkw{State} 4 \\
\>\> \hmmc{s}{stateC} \\
\> \hmmc{t}{tran} \\
\hmkw{EndHMM} \\ \\
\hmmc{h}{hc} \\
\hmkw{BeginHMM} \\
\> \hmkw{NumStates} 5 \\
\> \hmkw{State} 2 \\
\>\> \hmmc{s}{stateC} \\
\> \hmkw{State} 3 \\
\>\> \hmmc{s}{stateC} \\
\> \hmkw{State} 4 \\
\>\> \hmmc{s}{stateB} \\
\> \hmmc{t}{tran} \\
\hmkw{EndHMM}
}

\centrefig{hsetdef}{120}{Defining a Model Set}

For example, in the HMM list shown in Fig~\href{f:hlisteg},
the logical HMMs {\tt two}, {\tt too} and {\tt to} are tied
and share the same physical HMM definition {\tt tuw}. The HMMs {\tt one}
and {\tt won} are also tied but in this case {\tt won} shares {\tt one}'s
definition. There is, however, no subtle distinction here. The two different
cases are given just to emphasise that the names used for the logical and physical
HMMs can be the same or different, as is convenient.
Finally, in this example,
the models {\tt three} and {\tt four} are untied.

\sideprog{hlisteg}{70}{HMM List with Tying}{
two \>\> tuw \\
too \>\> tuw \\
to \>\> tuw \\
one \\
won \>\> one \\
three \\
four
}{}

This mechanism is implemented internally by creating a \hmmt{l} macro
definition for every HMM in the HMM list. If an explicit physical HMM
is also given in the list, then the logical HMM is linked to that macro;
otherwise a \hmmt{h} macro
is created with the same name as the \hmmt{l} macro. Notice that this is
one case where the ``define before use'' rule is relaxed. If an undefined
\hmmt{h} is encountered then a dummy place-holder is created for it and,
as explained above,
\htool{HModel} subsequently tries to find a HMM definition file of the same name.

Finally it should be noted that in earlier versions of \HTK, there were
no HMM macros. However,
HMM definitions could be listed in a single \index{master macro file}
\textit{master macro file} or MMF\index{MMF}. Each HMM definition began
with its name written as a quoted string and ended with a period
written on its own (just like master label files), and the first
line of an MMF contained the string \texttt{\#!MMF!\#}. \inthisversion
the use of MMFs has been subsumed within the general macro
definition facility using the \hmmt{h} type.
However, for compatibility, the older MMF style of file can still be
read by all \HTK\ tools.

\mysect{Tied-Mixture Systems}{tmix}

A Tied-Mixture System\index{tied-mixture system} is one in which all
Gaussian components are stored in
a pool and all state output distributions share this pool. Fig~\href{f:tmixeg}
illustrates this for the case of a single data stream.

\sidefig{tmixeg}{60}{Tied Mixture System}{-2}{}

Each state output distribution is defined by $M$
mixture component weights and since all states share the same components,
all of the state-specific discrimination is encapsulated within these
weights.
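As a minimal sketch, assuming a single data stream with unit stream weight
and writing $c_{jm}$ for the weight of pool component $m$ in state $j$
(the same notation as the multi-stream form given later in this section),
the output distribution reduces to
\[
   b_{j}(\bm{o}_t) = \sum_{m=1}^{M} c_{jm}
        {\cal N}(\bm{o}_t; \bm{\mu}_m, \bm{\Sigma}_m)
\]
so that the means $\bm{\mu}_m$ and covariances $\bm{\Sigma}_m$ carry no
state index and only the weights $c_{jm}$ vary from state to state.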
The set of Gaussian components selected for the pool
should be representative of the acoustic space covered by the feature
vectors. To keep $M$ manageable, multiple data streams are typically
used with tied-mixture systems. For example, static parameters may
be in one stream and delta parameters in another (see section~\ref{s:streams}).
Each stream then has a separate pool of Gaussians which are often referred
to as \textit{codebooks}.

More formally, for $S$ independent data streams, the output distribution for state $j$
is defined as \index{tied-mixtures!output distribution}
\hequation{
   b_{j}(\bm{o}_t) = \prod_{s=1}^S \left[ \sum_{m=1}^{M_s} c_{jsm}
        {\cal N}(\bm{o}_{st}; \bm{\mu}_{sm}, \bm{\Sigma}_{sm}) \right]^{\gamma_s}
}{tmixpdf}
where the notation is identical to that used in equation~\ref{e:cdpdf}.
Note however that this equation differs from equation~\ref{e:cdpdf}
in that the Gaussian component parameters and the number of mixture components
per stream are state independent.

Tied-mixture systems lack the modelling accuracy of fully continuous
density systems. However, they can often be implemented more efficiently
since the total number of Gaussians which must be evaluated at
each input frame is independent of the number of active HMM states and
is typically much smaller.

A tied-mixture HMM system in \HTK\ is defined by representing the
pool of shared Gaussians as \hmmt{m} macros with names ``xxx1'',
``xxx2'', \ldots, ``xxxM'' where ``xxx'' is an arbitrary name.
Each HMM state definition is then specified by giving the name
``xxx'' followed by a list of the mixture weights. Multiple
streams are identified using the
\hmkw{Stream}\index{stream@$<$Stream$>$} keyword as described
previously.

As an example, Fig~\href{f:tmixpool} shows a set of macro
definitions which specify a 5 Gaussian component tied-mixture pool.
Fig~\href{f:tmixhmm} then shows a typical
tied-mixture\index{HMM definition!tied-mixture} HMM definition
which uses this pool.
As can be seen, the mixture component weights
are represented as an array of real numbers as in the continuous density case.

The number of components in each tied-mixture codebook is typically
of the order of 2 or 3 hundred. Hence, the list of mixture weights in
each state is often long with many values being repeated, particularly
floor values. To allow more efficient coding, successive identical
values can be represented as a single value plus a repeat count in the
form of an asterisk followed by an integer multiplier. For example,
Fig~\href{f:tmixhmm2} shows the same HMM definition as above but using
repeat counts. When \HTK\ writes out a tied-mixture definition, it
uses repeat counts wherever possible.

\putprog{tmixpool}{60}{Tied-Mixture Codebook}{
\hmmt{o} \hmkw{VecSize} 2 \hmkw{MFCC} \\
\hmmc{m}{mix1} \\
\> \hmkw{Mean} \>\>\> 2 0.0 0.1 \\
\> \hmkw{Variance} \>\>\> 2 1.0 1.0 \\
\hmmc{m}{mix2} \\
\> \hmkw{Mean} \>\>\> 2 0.2 0.3 \\
\> \hmkw{Variance} \>\>\> 2 2.0 1.0 \\
\hmmc{m}{mix3} \\
\> \hmkw{Mean} \>\>\> 2 0.0 0.1 \\
\> \hmkw{Variance} \>\>\> 2 1.0 2.0 \\
\hmmc{m}{mix4} \\
\> \hmkw{Mean} \>\>\> 2 0.4 0.1 \\
\> \hmkw{Variance} \>\>\> 2 1.0 1.5 \\
\hmmc{m}{mix5} \\
\> \hmkw{Mean} \>\>\> 2 0.9 0.7 \\
\> \hmkw{Variance} \>\>\> 2 1.5 1.0
}

\mysect{Discrete Probability HMMs}{dischmm}

Discrete probability\index{discrete probability} HMMs model