{\sf
\begin{tabbing}
++++ \= ++++++++ \= ++ \= +++++++++++++++++ \= +++ \=  \kill
\> optmacro = \> $\sim$o globalOpts
\end{tabbing}
}
\noindent
or they can be included in one or more HMM definitions.  Global
options may be repeated but no definition can change a previous
definition.  All global options must be defined before any other
macro definition is processed.  In practice this means that any
HMM system which uses parameter tying must have a \hmmt{o}
option macro at the head of the first macro file processed.

The full set of global options is given below.  Every HMM set must
define the vector size (via \hmkw{VecSize}\index{vecsize@$<$VecSize$>$}),
the stream widths (via \hmkw{StreamInfo}\index{streaminfo@$<$StreamInfo$>$})
and the observation parameter kind.  However, if only the stream
widths are given, then the vector size will be inferred.  If
only the vector size is given, then a single stream of identical
width will be assumed.  All other options default to null.
{\sf
\begin{tabbing}
++++ \= ++++++++ \= ++ \= +++++++++++++++++ \= +++ \=  \kill
\> globalOpts = \> option \{ option \} \\
\>  option = \> $<$HmmSetId$>$ string $|$ \\
\>\>  $<$StreamInfo$>$ short \{ short \} $|$  \\
\>\>   $<$VecSize$>$    short $|$  \\
\>\>   $<$InputXform$>$ inputXform $|$  \\
\>\>   covkind $|$ \\
\>\>   durkind $|$ \\
\>\>   parmkind
\end{tabbing}
}
\noindent
The {\sf $<$HmmSetId$>$} option allows the user to give the MMF an
identifier. This is used as a sanity check to make sure that a TMF can
be safely applied to this MMF.
The arguments to the
{\sf $<$StreamInfo$>$} option are the number of streams (default 1) and then
for each stream, the width of that stream.  The {\sf $<$VecSize$>$}
option gives the total number of elements in each input vector.
If both \hmkw{VecSize} and \hmkw{StreamInfo} are included then the
sum of all the stream widths must equal the input vector size.

The {\sf covkind} defines the kind of the covariance matrix
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>  covkind =\> $<$DiagC$>$ $|$ $<$InvDiagC$>$ $|$ $<$FullC$>$ $|$ \\
\>\>            $<$LLTC$>$ $|$ $<$XformC$>$
\end{tabbing}
}
\noindent
where {\sf $<$InvDiagC$>$} is used internally.  {\sf $<$LLTC$>$}
and {\sf $<$XformC$>$} are not used in \HTK\ Version 2.0.
Setting the covariance kind as a global option forces all components to
have this kind.  In particular, it prevents mixing full and diagonal covariances
within an HMM set.

The {\sf durkind} denotes the type of duration
model used according to the following rules
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>  durkind =\> $<$nullD$>$ $|$ $<$poissonD$>$ $|$ $<$gammaD$>$ $|$ $<$genD$>$
\end{tabbing}
}
\noindent
For anything other than {\sf $<$nullD$>$}, a duration vector\index{duration vector} must
be supplied for the model or each state as described below. Note that no
current HTK tool can estimate or use such duration vectors.
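
As an illustration of the global options above, the following is a
minimal sketch of an option macro.  The particular values (a
39-element vector split into three streams of width 13, the
{\tt MFCC\_E\_D\_A} parameter kind and the identifier
{\tt exampleSet}) are assumptions chosen for the example, not
values required by \HTK.
\begin{verbatim}
~o <HmmSetId> exampleSet
   <StreamInfo> 3 13 13 13
   <VecSize> 39
   <DiagC> <nullD> <MFCC_E_D_A>
\end{verbatim}
\noindent
Here the stream widths sum to $13+13+13=39$, which matches the
\hmkw{VecSize} value as required when both options are present.
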
The parameter kind is any legal parameter kind including qualified forms
(see section~\ref{s:genio})
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>  parmkind =\> $<$basekind\{\_D$|$\_A$|$\_T$|$\_E$|$\_N$|$\_Z$|$\_O$|$\_V$|$\_C$|$\_K\}$>$ \\
\>  basekind =\> $<$discrete$>$$|$$<$lpc$>$$|$$<$lpcepstra$>$$|$$<$mfcc$>$ $|$ $<$fbank$>$ $|$ \\
\> \>          $<$melspec$>$$|$ $<$lprefc$>$$|$$<$lpdelcep$>$ $|$ $<$user$>$
\end{tabbing}
}
\noindent
where the syntax rule for {\sf parmkind} is non-standard in that no spaces
are allowed between the base kind and any subsequent qualifiers.
As noted in chapter~\ref{c:speechio}, {\sf $<$lpdelcep$>$} is provided
only for compatibility
with earlier versions of \HTK\ and its further use should be avoided.

Each state of each HMM must have its own section defining the parameters
associated with that state
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\> state =\>  $<$State$>$ short stateinfo
\end{tabbing}
}
\noindent
where the short following {\sf $<$State$>$} is the state number.  State
information can be defined in any order.  The syntax is as follows
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   stateinfo = \> $\sim$s macro $|$ \\
\>\>              [ mixes ] [ weights ] stream \{ stream \} [ duration ] \\
\>   macro     = \> string
\end{tabbing}
}
\noindent
A {\sf stateinfo} definition consists of an optional specification
of the number of mixture components, an optional set of stream weights,
followed by a block of information for each stream, optionally terminated with
a duration vector.  Alternatively, {\sf $\sim$s macro} can be
written where {\sf macro} is the name of a previously defined macro.

The optional {\sf mixes} in a {\sf stateinfo} definition specifies
the number of mixture components (or discrete codebook size) for each
stream of that state
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   mixes = \>  $<$NumMixes$>$ short \{short\}
\end{tabbing}
}
\noindent
where there should be one {\sf short} for each stream.
If this
specification is omitted, it is assumed that all streams
have just one mixture component.

The optional {\sf weights} in a {\sf stateinfo} definition defines
a set of exponent weights for each independent data stream.  The
syntax is
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   weights = \> $\sim$w macro $|$ $<$SWeights$>$ short vector \\
\>   vector  = \> float \{ float \}
\end{tabbing}
}
\noindent
where the {\sf short} gives the number $S$ of weights (which should match the
value given in the \hmkw{StreamInfo} option) and the {\sf vector}
contains the $S$ stream weights $\gamma_s$ (see section~\ref{s:HMMparm}).

The definition of each {\sf stream} depends on the kind of HMM set.
In the normal case, it
consists of a sequence of mixture
component
definitions optionally preceded by the stream number.  If the stream
number is omitted then it is assumed to be 1.  For tied-mixture
and discrete HMM sets, special forms are used.
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   stream = \> [ $<$Stream$>$ short ] \\
\>            \> (mixture \{ mixture \} $|$ tmixpdf $|$ discpdf)
\end{tabbing}
}
The definition of each mixture component consists of a Gaussian
pdf optionally preceded by the mixture number and its weight
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   mixture = \> [ $<$Mixture$>$ short float ] mixpdf
\end{tabbing}
}
\noindent
If the \hmkw{Mixture}\index{mixture@$<$Mixture$>$} part is missing
then mixture 1 is assumed and the
weight defaults to 1.0.

The {\sf tmixpdf} option is used only for fully tied mixture sets.
Since the {\sf mixpdf} parts are all macros in
a tied mixture system and since they are identical for every stream
and state, it is only necessary to know the mixture weights.
The
{\sf tmixpdf} syntax allows these to be specified in the following
compact form
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   tmixpdf = \> $<$TMix$>$ macro weightList \\
\>   weightList = \> repShort \{ repShort \} \\
\>   repShort = \> short [ $\ast$ char ]
\end{tabbing}
}
\noindent
where each {\sf short} is a mixture component weight scaled so that
a weight of 1.0 is represented by the integer 32767.
The optional asterisk followed by a {\sf char} is used to indicate
a repeat count.  For example, {\tt 0*5} is equivalent to 5 zeroes.
The Gaussians which make up the pool of tied-mixtures
are defined using \hmmt{m} macros called
{\sf macro1}, {\sf macro2}, {\sf macro3}, etc.

Discrete probability HMMs are defined in a similar way
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   discpdf = \> $<$DProb$>$ weightList
\end{tabbing}
}
\noindent
The only difference is that the weights in the \textsf{weightList}
are scaled log probabilities as defined in section~\ref{s:dischmm}.

The definition of a Gaussian pdf requires the mean vector to be
given and one of the possible forms of covariance
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   mixpdf = \> $\sim$m macro $|$ [ rclass ] mean cov [ $<$GConst$>$ float ] \\
\>   rclass = \> $<$RClass$>$ short \\
\>   mean = \> $\sim$u macro $|$ $<$Mean$>$ short vector \\
\>   cov =  \> var $|$ inv $|$ xform \\
\>   var = \> $\sim$v macro $|$ $<$Variance$>$ short vector \\
\>   inv = \> $\sim$i macro $|$ \\
\>        \> ($<$InvCovar$>$ $|$ $<$LLTCovar$>$) short tmatrix \\
\>   xform = \> $\sim$x macro $|$ $<$Xform$>$ short short matrix \\
\>   matrix = \> float \{float\} \\
\>   tmatrix = \> matrix \\
\end{tabbing}
}
\noindent
In {\sf mean} and {\sf var}, the {\sf short} preceding the {\sf vector}
defines the length of the vector; in {\sf inv} the {\sf short}
preceding the {\sf
tmatrix} gives the size of this square upper triangular matrix; and in {\sf xform} the
two {\sf short}s preceding the {\sf matrix} give the number of rows and
columns.
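
To make the grammar above concrete, the following sketch shows one
state with a single stream and two diagonal-covariance mixture
components.  The state number, the vector length and all numeric
values are invented for illustration only.
\begin{verbatim}
<State> 2
  <NumMixes> 2
  <Mixture> 1 0.6
    <Mean> 4
      0.2 0.1 0.1 0.9
    <Variance> 4
      1.0 1.0 1.0 1.0
  <Mixture> 2 0.4
    <Mean> 4
      0.4 0.3 0.5 0.7
    <Variance> 4
      2.0 1.5 1.0 1.0
\end{verbatim}
\noindent
Because the optional {\sf $<$Stream$>$} number is omitted, stream 1
is assumed.  The optional {\sf $<$RClass$>$} and {\sf $<$GConst$>$}
fields are also left out of this sketch.
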
The optional {\sf $<$GConst$>$}\footnote{Specifically, in
equation~\ref{e:gnorm} the GCONST value seen in HMM sets is
the natural logarithm of the product of $\bm{(2 \pi)^n}$ and the
determinant of the covariance matrix.}
\index{GCONST value} gives that part of the log
probability of a Gaussian that can be precomputed.  If it is omitted, then
it will be computed during load-in; including it simply saves some time.
\HTK\ tools which output HMM definitions always include this field.

The optional {\sf $<$RClass$>$} stores the regression base class index that
this mixture component belongs to, as specified by the regression
class tree (which is also stored in the model set). \HTK\ tools which
output HMM definitions always include this field, and if there is no
regression class tree then the regression identifier is set to zero.

In addition to defining the output distributions, a state can have a
duration probability distribution defined for it. However, no current HTK
tool can estimate or use these.
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   duration = \> $\sim$d macro $|$ $<$Duration$>$ short vector
\end{tabbing}
}
\noindent
Alternatively, as shown by the top level syntax for a {\sf hmmdef},
duration parameters can be specified for a whole model.

A binary regression class tree (for the purposes of HMM adaptation as in
chapter~\ref{c:Adapt}) may also exist for an HMM set. This is defined
by
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   regTree = \> $\sim$r macro tree \\
\>   tree    = \> $<$RegTree$>$ short nodes \\
\>   nodes   = \> ($<$Node$>$ short short short $|$ $<$TNode$>$ short int) [ nodes ]
\end{tabbing}
}
\noindent
In {\sf tree} the {\sf short} preceding the {\sf nodes} refers to
the number of terminal nodes or leaves that the regression tree
contains. Each node in {\sf nodes} can either be a non-terminal {\sf
$<$Node$>$} or a terminal (leaf) {\sf $<$TNode$>$}. For a {\sf
$<$Node$>$} the three following {\sf short}s refer to the node's index
number and the index numbers of its children.
For a {\sf
$<$TNode$>$}, the {\sf short} refers to the leaf's index (which
corresponds to a regression base class index as stored at the
component level in {\sf RClass}, see above), while the {\sf int}
refers to the number of mixture components in this leaf cluster.

The transition matrix is defined by
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>   transP = \> $\sim$t macro $|$ $<$TransP$>$ short matrix
\end{tabbing}
}
\noindent
where the {\sf short} in this case should be equal to the number of
states in the model.

Finally the input transform is defined by
{\sf
\begin{tabbing}
++++ \= ++++++++ \=  \kill
\>  inputXform  = \> $\sim$j macro $|$ inhead inmatrix\\
\>  inhead      = \> $<$MMFIdMask$>$ string parmkind [$<$PreQual$>$]\\
\>  inmatrix    = \> $<$LinXform$>$ $<$VecSize$>$ short $<$BlockInfo$>$ short short \{short\} block \{block\}\\
\>  block       = \> $<$Block$>$ short xform
\end{tabbing}
}
\noindent
where the {\sf short} following \hmkw{VecSize} is the number of dimensions
after applying
the linear transform and must match the vector size
of the HMM definition. The first {\sf short} after \hmkw{BlockInfo}
is the number of blocks; this is followed by the number of output
dimensions from each of the blocks.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "htkbook"
%%% End: