node number that the Gaussian component resides in). In order to grow
the regression class tree it is necessary to load in a \texttt{statsfile}
using the \texttt{LS} command. It is also possible to specify an
\texttt{itemlist} containing the ``non-speech'' sound components
such as the silence mixture components. If this is included then the
first split made will result in one leaf containing the specified
non-speech sound components, while the other leaf will contain the
rest of the model set components. Tree construction then continues as usual.





\subsubsection*{\tt RN hmmIdName}





Rename or add the HMM set identifier in the global options macro to
{\tt hmmIdName}.





\subsubsection*{\tt RM hmmFile}





Load the HMM from \texttt{hmmFile} and subtract the mean of state 2,
mixture 1 of that model from every loaded model.  Every component
of the mean is subtracted, including deltas and accelerations.





\subsubsection*{\tt RO f [statsfile]}





This command is used to remove outlier states during clustering
with subsequent \texttt{NC} or \texttt{TC} commands.
If \texttt{statsfile} is present it first reads in the \htool{HERest} statistics
file (see \texttt{LS}); otherwise it expects a separate \texttt{LS} command
to have already been used to read in the statistics.
Any subsequent \texttt{NC}, \texttt{TC} or \texttt{TB} commands are
extended to ensure that the occupancy of each cluster produced exceeds the
threshold \texttt{f}.
For \texttt{TB} this is used to choose which questions are allowed to
be used to split each node, whereas for \texttt{NC} and \texttt{TC}
a final merging pass is used: for as long as the smallest cluster count
falls below the threshold \texttt{f}, that cluster is merged with
its nearest neighbour.





\subsubsection*{\tt RT i j itemList(t)}





Remove the transition from state \texttt{i} to \texttt{j} in all transition
matrices given in the \texttt{itemList}.  After removal, the remaining
non-zero transition probabilities for state \texttt{i}
are rescaled so that $\sum_k a_{ik} = 1$.
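
As a worked illustration, assume that the rescaling divides each remaining
entry of row \texttt{i} by the sum of the remaining entries, which is the
simplest scaling that satisfies the constraint above (the primed notation
and the numerical values are introduced here for illustration only):
\[
   a'_{ik} = \frac{a_{ik}}{\sum_{k' \neq j} a_{ik'}} \quad (k \neq j),
   \qquad a'_{ij} = 0 .
\]
For example, removing the transition with probability $0.1$ from the row
$(0,\ 0.6,\ 0.3,\ 0.1)$ leaves a remaining mass of $0.9$, so under this
assumption the rescaled row is approximately $(0,\ 0.667,\ 0.333,\ 0)$.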





\subsubsection*{\tt SH}





Show the current HMM set.  This command can be inserted into
edit scripts for debugging.  It prints a summary of each
loaded HMM identifying any tied parameters.





\subsubsection*{\tt SK skind}





Change the sample kind of all loaded HMMs to \texttt{skind}.  This
command is typically used in conjunction with the \texttt{SW} command.
For example, to add delta coefficients to a set of models, the \texttt{SW}
command would be used to double the stream widths and then this
command would be used to add the \texttt{\_D} qualifier.
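
A minimal edit script for this case might look as follows, assuming
single-stream models of sample kind \texttt{MFCC\_E} with a stream width
of 13 (both values are illustrative assumptions, not taken from a
particular model set):
\begin{verbatim}
   SW 1 26
   SK MFCC_E_D
\end{verbatim}
Here \texttt{SW} widens stream 1 from 13 to 26 and \texttt{SK} then
relabels the models with the \texttt{\_D} qualifier.  Since the added
dimensions only receive default values, the models would presumably need
further training before use.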





\subsubsection*{\tt SS N}





Split into N independent data streams.
This command causes the currently loaded set of HMMs to be converted
from 1 data stream to N independent data streams.  The width of
each stream is determined from the single stream vector size and
the sample kind as described in section~\ref{s:streams}.
Execution of this command will cause
any tyings associated with the split stream to
be undone.





\subsubsection*{\tt ST filename}





Save the currently defined questions and trees to file \texttt{filename}.
This allows subsequent construction of models for new contexts
using the \texttt{LT} and \texttt{AU} commands.
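
As a sketch of this workflow, the clustering script could end by saving
the trees, and a later script could reload them and synthesise models for
a new list of contexts.  The file names \texttt{trees} and
\texttt{newlist} are hypothetical:
\begin{verbatim}
   ST trees
\end{verbatim}
and, in the later edit script,
\begin{verbatim}
   LT trees
   AU newlist
\end{verbatim}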





\subsubsection*{\tt SU N w1 w2 w3 .. wN}





Split into N independent data streams with stream widths as specified.
This command is similar to the \texttt{SS} command except that the
width of each stream is defined explicitly by the user rather
than using the built-in stream splitting rules.
Execution of this command will cause
any tyings associated with the split stream to
be undone.





\subsubsection*{\tt SW s n}





Change the width of stream \texttt{s} of all currently loaded HMMs to
\texttt{n}.  Changing the width of a stream involves changing the dimensions
of all mean and variance vectors or covariance matrices.  If \texttt{n}
is greater than the current width of stream \texttt{s}, then mean vectors
are extended with zeroes and variance vectors are extended with 1's.
Covariance matrices are extended with zeroes everywhere except for the
diagonal elements which are set to 1.  This command preserves any
tyings which may be in force.
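
Written out for a stream widened from $d$ to $n > d$ dimensions, the
extension described above takes the following form.  The symbols $\mu$,
$v$, $\Sigma$ and $d$ are notation introduced here for illustration:
\[
   \mu' = (\mu_1,\ldots,\mu_d,\underbrace{0,\ldots,0}_{n-d}), \qquad
   v' = (v_1,\ldots,v_d,\underbrace{1,\ldots,1}_{n-d}), \qquad
   \Sigma' = \left( \begin{array}{cc} \Sigma & 0 \\ 0 & I_{n-d} \end{array} \right)
\]
so each added dimension starts as a zero-mean, unit-variance component
that is uncorrelated with the existing ones.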





\subsubsection*{\tt TB f macro itemList(s or h)}





Decision tree cluster all states in the given \texttt{itemList} and
tie them as \texttt{macroi} where \texttt{i} is 1,2,3,\ldots.
This command performs a top down clustering of the states or
models appearing in \texttt{itemList}.  This clustering starts by
placing all items in a single root node and then choosing a
question from the current set to split the node in such a way
as to maximise the likelihood of a single diagonal covariance
Gaussian at each of the child nodes generating the training data.
This splitting continues until the increase in likelihood falls
below threshold \texttt{f} or no questions are available which
pass the outlier threshold test.
This type of clustering is only implemented for single mixture,
diagonal covariance untied models.
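
As a sketch of the quantity being maximised, assume the usual formulation
in which a pool $S$ of states shares one diagonal covariance Gaussian
whose covariance $\Sigma(S)$ is estimated from the loaded occupation
statistics.  The symbols $n$ (feature dimension), $\gamma_s$ (total
occupancy of state $s$) and $q$ (a candidate question) are introduced here
for illustration and are not part of the command syntax:
\[
   L(S) \approx -\frac{1}{2}
      \left( \log\left[ (2\pi)^n \, |\Sigma(S)| \right] + n \right)
      \sum_{s \in S} \gamma_s ,
\]
\[
   \Delta L_q = L(S_{\mathrm{yes}}(q)) + L(S_{\mathrm{no}}(q)) - L(S) .
\]
Under these assumptions each node is split by the question with the
largest $\Delta L_q$, and splitting stops once the best $\Delta L_q$
falls below \texttt{f}.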





\subsubsection*{\tt TC f macro itemList(s)}





Cluster all states in the given
\texttt{itemList} and tie them as \texttt{macroi} where
\texttt{i} is 1,2,3,\ldots. This command is identical to the
\texttt{NC} command described above except that the number of clusters
is varied such that the maximum within cluster distance is less than
the value given by \texttt{f}.





\subsubsection*{\tt TI macro itemList}





Tie the items in \texttt{itemList} and assign them to the specified
\texttt{macro} name.  This command applies to any item type but
all of the items in \texttt{itemList} must be of the same type.
The detailed method of tying depends on the item type as follows:
\begin{description}
   \item[state(s)] the state with the largest total value of \texttt{gConst}
     in stream 1 (indicating broad variances) and the minimum number of
     defunct mixture weights (see \texttt{MU} command) is selected from the
     item list and all states are tied to this typical state.
   \item[transitions(t)] all transition matrices in the item list are
     tied to the last in the list.
   \item[mixture(m)] all mixture components in the item list are tied
     to the last in the list.
   \item[mean(u)] the average vector of all the mean vectors
     in the item list is calculated and all the means are tied to this
     average vector.
   \item[variance(v)] a vector is constructed for which each element
     is the maximum of the corresponding elements from the set of
     variance vectors to be tied.  All of the variances are then tied
     to this maximum vector.
   \item[covariance(i)] all covariance matrices in the item list are tied
     to the last in the list.
   \item[xform(x)] all transform matrices in the item list are tied
     to the last in the list.
   \item[duration(d)] all duration vectors in the item list are tied
     to the last in the list.
   \item[stream weights(w)] all stream weight vectors in the item
     list are tied to the last in the list.
   \item[pdf(p)] as noted earlier, pdf's are tied to create tied
     mixture sets rather than to create a shared pdf.  The procedure
     for tying pdf's is as follows:
     \begin{enumerate}
        \item All mixtures from all pdf's in the item list are collected
          together in order of mixture weight.
        \item If the number of mixtures exceeds the join size $J$ [see the
          Join (\texttt{JO}) command above], then all but the first $J$ mixtures
          are discarded.
        \item If the number of mixtures is less than $J$, then the
          mixture with the largest weight is repeatedly split until
          there are exactly $J$ mixture components.  The split procedure
          used is the same as for the MixUp (\texttt{MU}) command
          described above.
        \item All pdf's in the item list are made to share all $J$
          mixture components.  The weight for each mixture is set
          proportional to the log likelihood of the mean vector of
          that mixture with respect to the original pdf.
        \item Finally, all mixture weights below the floor set by the
          Join command are raised to the floor value and all of the
          mixture weights are renormalised.
     \end{enumerate}
\end{description}
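
For the mean and variance cases the tied parameters can be written out
explicitly.  The following is a sketch which assumes that ``average''
means the unweighted arithmetic mean; the symbols $N$ (number of items in
the list), $\mu^{(n)}$ and $v^{(n)}$ (the mean and variance vectors of
item $n$) are notation introduced here:
\[
   \bar{\mu}_k = \frac{1}{N} \sum_{n=1}^{N} \mu^{(n)}_k , \qquad
   \hat{v}_k = \max_{1 \leq n \leq N} v^{(n)}_k .
\]
As a hypothetical two-item example, means $(0,\ 2)$ and $(2,\ 4)$ would be
tied to $(1,\ 3)$, while variances $(1,\ 4)$ and $(2,\ 3)$ would be tied
to $(2,\ 4)$.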





\subsubsection*{\tt TR n}





Change the level of detail for tracing.  The level \texttt{n} consists
of a number of separate flags which can be added together.
Values 0001, 0002, 0004, 0008 have the same meaning as the command
line trace level but apply only to a single block of commands
(a block consisting of a set of commands of the same name).
A value of 0010 can be used to show current memory usage.





\subsubsection*{\tt UT itemList}





Untie all items in \texttt{itemList}.  For each item in the item list,
if the usage counter for that item is greater than 1 then it
is cloned, the original shared item is replaced by the cloned copy
and the usage count of the shared item is reduced by 1.
If the usage count is already 1, the associated macro is simply
deleted and the usage count set to 0 to indicate an unshared item.
Note that it is not possible to untie a pdf since these are not
actually shared [see the Tie (\texttt{TI}) command above].





\subsubsection*{\tt XF filename}





Set the input transform of the model set to \texttt{filename}.





\subsection{Use}





\htool{HHEd} is invoked by typing the command line
\begin{verbatim}
   HHEd [options] edCmdFile hmmList
\end{verbatim}
where \texttt{edCmdFile} is a text file containing a sequence of edit commands
as described above and \texttt{hmmList} defines the set of HMMs to be edited
(see \htool{HModel} for the format of HMM list).
If the models are to be kept in separate files rather than being stored in an
MMF, the configuration variable \texttt{KEEPDISTINCT} should be set to true.
The available options for \htool{HHEd} are





\begin{optlist}

  \ttitem{-d dir} This option tells \htool{HHEd} to look in
      the directory \texttt{dir} to find the model definitions.

  \ttitem{-o ext} This causes the file name extensions of the
      original models (if any) to be replaced by \texttt{ext}.

  \ttitem{-w mmf} Save all the macros and model definitions in a
      single master macro file \texttt{mmf}.

  \ttitem{-x s} Set the extension for the edited output files to be \texttt{s}
      (default is to use the original names unchanged).

  \ttitem{-z} Setting this option causes all aliases in the loaded
      HMM set to be deleted (zapped) immediately before
      loading the definitions.  The result is that all logical names
      are ignored and the actual HMM list
      consists of just the physically distinct HMMs.
\stdoptB
\stdoptH
\stdoptM
\stdoptQ

\end{optlist}


\stdopts{HHEd}
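
As an illustration, a typical invocation might look as follows.  The file
and directory names are hypothetical, and the sketch assumes that the
standard \texttt{-H} option loads a master macro file and that the standard
\texttt{-M} option names the output directory:
\begin{verbatim}
   HHEd -H hmm0/hmmdefs -M hmm1 edit.hed hmmlist
\end{verbatim}
Under these assumptions the models named in \texttt{hmmlist} are read from
\texttt{hmm0/hmmdefs}, the commands in \texttt{edit.hed} are applied in
order, and the edited models are written to the directory \texttt{hmm1}.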





\subsection{Tracing}





\htool{HHEd} supports the following trace options where each
trace flag is given using an octal base
\begin{optlist}
   \ttitem{00001} basic progress reporting.
   \ttitem{00002} intermediate progress reporting.
   \ttitem{00004} detailed progress reporting.
   \ttitem{00010} show item lists used for each command.
   \ttitem{00020} show memory usage.
   \ttitem{00100} show changes to macro definitions.
   \ttitem{00200} show changes to stream widths.
   \ttitem{00400} show clusters.
   \ttitem{00800} show questions.
   \ttitem{01000} show tree filtering.
   \ttitem{02000} show tree splitting.
   \ttitem{04000} show tree merging.
   \ttitem{10000} show good question scores.
   \ttitem{20000} show all question scores.
   \ttitem{40000} show all merge scores.
\end{optlist}
Trace flags are set using the \texttt{-T} option or the \texttt{TRACE}
configuration variable.
\index{hhed@\htool{HHEd}|)}








%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../htkbook"
%%% End: