node number that the Gaussian component resides in). In order to grow
the regression class tree it is necessary to load in a \texttt{statsfile}
using the \texttt{LS} command. It is also possible to specify an
\texttt{itemlist} containing the ``non-speech'' sound components
such as the silence mixture components. If this is included then the
first split made will result in one leaf containing the specified
non-speech sound components, while the other leaf will contain the
rest of the model set components. Tree construction then continues as usual.
\subsubsection*{\tt RN hmmIdName}
Rename or add the hmm set identifier in the global options macro to
{\tt hmmIdName}.
\subsubsection*{\tt RM hmmFile}
Load the HMM from \texttt{hmmFile} and subtract the mean of state 2,
mixture 1 of that model from every loaded model. Every component
of the mean is subtracted, including deltas and accelerations.
\subsubsection*{\tt RO f [statsfile]}
This command is used to remove outlier states during clustering
with subsequent \texttt{NC} or \texttt{TC} commands.
If \texttt{statsfile} is present, it first reads in the \htool{HERest} statistics
file (see \texttt{LS}); otherwise it expects a separate \texttt{LS} command
to have already been used to read in the statistics.
Any subsequent \texttt{NC}, \texttt{TC} or \texttt{TB} commands are
extended to ensure that the occupancy of each cluster produced exceeds the
threshold \texttt{f}.
For \texttt{TB} this is used to choose which questions are allowed to
be used to split each node, whereas for \texttt{NC} and \texttt{TC}
a final merging pass is used: for as long as the smallest cluster count
falls below the threshold \texttt{f}, that cluster is merged with
its nearest neighbour.
\subsubsection*{\tt RT i j itemList(t)}
Remove the transition from state \texttt{i} to \texttt{j} in all transition
matrices given in the \texttt{itemList}. After removal, the remaining
non-zero transition probabilities for state \texttt{i}
are rescaled so that $\sum_k a_{ik} = 1 $.
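
As a worked illustration, assume the rescaling is a simple proportional
renormalisation of the remaining entries (the description above only requires
that the row sums to one, so this particular form is an assumption).
Writing $a'_{ik}$ for the updated probabilities,
\[
a'_{ij} = 0, \qquad a'_{ik} = \frac{a_{ik}}{1 - a_{ij}} \quad (k \neq j).
\]
For example, a row $(0,\ 0.6,\ 0.3,\ 0.1)$ from which the last transition
is removed would become $(0,\ 2/3,\ 1/3,\ 0)$.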
\subsubsection*{\tt SH}
Show the current HMM set. This command can be inserted into
edit scripts for debugging. It prints a summary of each
loaded HMM identifying any tied parameters.
\subsubsection*{\tt SK skind}
Change the sample kind of all loaded HMMs to \texttt{skind}. This
command is typically used in conjunction with the \texttt{SW} command.
For example, to add delta coefficients to a set of models, the \texttt{SW}
command would be used to double the stream widths and then this
command would be used to add the \texttt{\_D} qualifier.
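
As a sketch, the following edit script fragment adds delta coefficients to a
single-stream model set. It assumes, purely for illustration, that the models
use 13-dimensional \texttt{MFCC\_E} vectors; the width and sample kind
would need to be adjusted to match the actual models.
\begin{verbatim}
SW 1 26
SK MFCC_E_D
\end{verbatim}
Here \texttt{SW} widens stream 1 from 13 to 26 components and \texttt{SK}
then records the new sample kind.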
\subsubsection*{\tt SS N}
Split into N independent data streams.
This command causes the currently loaded set of HMMs to be converted
from 1 data stream to N independent data streams. The widths of
each stream are determined from the single stream vector size and
the sample kind as described in section~\ref{s:streams}.
Execution of this command will cause
any tyings associated with the split stream to
be undone.
\subsubsection*{\tt ST filename}
Save the currently defined questions and trees to file \texttt{filename}.
This allows subsequent construction of models for new contexts
using the \texttt{LT} and \texttt{AU} commands.
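
For instance, one edit script might save the trees after clustering, and a
later script might reload them to synthesise models for an extended model
list. The file names \texttt{trees} and \texttt{newlist} are hypothetical
placeholders. The first script would end with
\begin{verbatim}
ST "trees"
\end{verbatim}
and a later invocation of \htool{HHEd} could then use
\begin{verbatim}
LT "trees"
AU "newlist"
\end{verbatim}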
\subsubsection*{\tt SU N w1 w2 w3 .. wN}
Split into N independent data streams with stream widths as specified.
This command is similar to the \texttt{SS} command except that the
width of each stream is defined explicitly by the user rather
than using the built-in stream splitting rules.
Execution of this command will cause
any tyings associated with the split stream to
be undone.
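
As an illustration, assuming a single-stream model set with 39-dimensional
vectors (a hypothetical configuration), the following command would split it
into three streams of 13 components each.
\begin{verbatim}
SU 3 13 13 13
\end{verbatim}
In this sketch the widths are chosen so that they sum to the original
vector size, $13+13+13=39$.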
\subsubsection*{\tt SW s n}
Change the width of stream \texttt{s} of all currently loaded HMMs to
\texttt{n}. Changing the width of a stream involves changing the dimensions
of all mean and variance vectors or covariance matrices. If \texttt{n}
is greater than the current width of stream \texttt{s}, then mean vectors
are extended with zeroes and variance vectors are extended with 1's.
Covariance matrices are extended with zeroes everywhere except for the
diagonal elements which are set to 1. This command preserves any
tyings which may be in force.
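
As a small worked illustration of the extension rule, suppose stream
\texttt{s} currently has width 2 and \texttt{n} is 4 (both values are chosen
only for this example). A mean vector and a variance vector would then be
extended as
\[
(\mu_1, \mu_2) \;\rightarrow\; (\mu_1, \mu_2, 0, 0), \qquad
(\sigma^2_1, \sigma^2_2) \;\rightarrow\; (\sigma^2_1, \sigma^2_2, 1, 1),
\]
and a full covariance matrix as
\[
\left(\begin{array}{cc} c_{11} & c_{12} \\ c_{21} & c_{22} \end{array}\right)
\;\rightarrow\;
\left(\begin{array}{cccc}
c_{11} & c_{12} & 0 & 0 \\
c_{21} & c_{22} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{array}\right).
\]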
\subsubsection*{\tt TB f macro itemList(s or h)}
Decision tree cluster all states in the given \texttt{itemList} and
tie them as \texttt{macroi} where \texttt{i} is 1,2,3,\ldots.
This command performs a top down clustering of the states or
models appearing in \texttt{itemList}. This clustering starts by
placing all items in a single root node and then choosing a
question from the current set to split the node in such a way
as to maximise the likelihood of a single diagonal covariance
Gaussian at each of the child nodes generating the training data.
This splitting continues until the increase in likelihood falls
below threshold \texttt{f} or no questions are available which
pass the outlier threshold test.
This type of clustering is only implemented for single mixture,
diagonal covariance untied models.
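
A typical use, sketched below, sets an outlier threshold with \texttt{RO},
defines questions with the \texttt{QS} command and then clusters one state
position of one phone. The statistics file name, the questions, the
thresholds and the phone \texttt{aa} are illustrative choices rather than
recommended values.
\begin{verbatim}
RO 100.0 stats
QS "L_Nasal" {m-*,n-*,ng-*}
QS "R_Nasal" {*+m,*+n,*+ng}
TB 350.0 "ST_aa_2_" {(aa,*-aa,*-aa+*,aa+*).state[2]}
\end{verbatim}
With this script the tied states would be named \texttt{ST\_aa\_2\_1},
\texttt{ST\_aa\_2\_2}, and so on.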
\subsubsection*{\tt TC f macro itemList(s)}
Cluster all states in the given
\texttt{itemList} and tie them as \texttt{macroi} where
\texttt{i} is 1,2,3,\ldots. This command is identical to the
\texttt{NC} command described above except that the number of clusters
is varied such that the maximum within cluster distance is less than
the value given by \texttt{f}.
\subsubsection*{\tt TI macro itemList}
Tie the items in \texttt{itemList} and assign them to the specified
\texttt{macro} name. This command applies to any item type but
all of the items in \texttt{itemList} must be of the same type.
The detailed method of tying depends on the item type as follows:
\begin{description}
\item[state(s)] the state with the largest total value of \texttt{gConst}
in stream 1 (indicating broad variances) and the minimum number of
defunct mixture weights (see \texttt{MU} command) is selected from the
item list and all states are tied to this typical state.
\item[transitions(t)] all transition matrices in the item list are
tied to the last in the list.
\item[mixture(m)] all mixture components in the item list are tied
to the last in the list.
\item[mean(u)] the average vector of all the mean vectors
in the item list is calculated and all the means are tied to this
average vector.
\item[variance(v)] a vector is constructed for which each element
is the maximum of the corresponding elements from the set of
variance vectors to be tied. All of the variances are then tied
to this maximum vector.
\item[covariance(i)] all covariance matrices in the item list are tied
to the last in the list.
\item[xform(x)] all transform matrices in the item list are tied
to the last in the list.
\item[duration(d)] all duration vectors in the item list are tied
to the last in the list.
\item[stream weights(w)] all stream weight vectors in the item
list are tied to the last in the list.
\item[pdf(p)] as noted earlier, pdf's are tied to create tied
mixture sets rather than to create a shared pdf. The procedure
for tying pdf's is as follows:
\begin{enumerate}
\item All mixtures from all pdf's in the item list are collected
together in order of mixture weight.
\item If the number of mixtures exceeds the join size $J$ [see the
Join (\texttt{JO}) command above], then all but the first $J$ mixtures
are discarded.
\item If the number of mixtures is less than $J$, then the
mixture with the largest weight is repeatedly split until
there are exactly $J$ mixture components. The split procedure
used is the same as for the MixUp (\texttt{MU}) command
described above.
\item All pdf's in the item list are made to share all $J$
mixture components. The weight for each mixture is set
proportional to the log likelihood of the mean vector of
that mixture with respect to the original pdf.
\item Finally, all mixture weights below the floor set by the
Join command are raised to the floor value and all of the
mixture weights are renormalised.
\end{enumerate}
\end{description}
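
The following fragment sketches two common uses. The model names
\texttt{ah}, \texttt{sil} and \texttt{sp} and the macro names are assumed
for illustration only.
\begin{verbatim}
TI T_ah {(*-ah+*,ah+*,*-ah).transP}
TI silst {sil.state[3],sp.state[2]}
\end{verbatim}
The first line ties the transition matrices of all context-dependent
variants of \texttt{ah} that match the item list; the second ties state 3 of
\texttt{sil} and state 2 of \texttt{sp} to a single state macro.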
\subsubsection*{\tt TR n}
Change the level of detail for tracing. The value \texttt{n} consists of a number
of separate flags which can be added together.
Values 0001, 0002, 0004, 0008 have the same meaning as the command
line trace level but apply only to a single block of commands
(a block consisting of a set of commands of the same name).
A value of 0010 can be used to show current memory usage.
\subsubsection*{\tt UT itemList}
Untie all items in \texttt{itemList}. For each item in the item list,
if the usage counter for that item is greater than 1 then it
is cloned, the original shared item is replaced by the cloned copy
and the usage count of the shared item is reduced by 1.
If the usage count is already 1, the associated macro is simply
deleted and the usage count set to 0 to indicate an unshared item.
Note that it is not possible to untie a pdf since these are not
actually shared [see the Tie (\texttt{TI}) command above].
\subsubsection*{\tt XF filename}
Set the input transform of the model set to be \texttt{filename}.
\subsection{Use}
\htool{HHEd} is invoked by typing the command line
\begin{verbatim}
HHEd [options] edCmdFile hmmList
\end{verbatim}
where \texttt{edCmdFile} is a text file containing a sequence of edit commands
as described above and \texttt{hmmList} defines the set of HMMs to be edited
(see \htool{HModel} for the format of the HMM list).
If the models are to be kept in separate files rather than being stored in an
MMF, the configuration variable \texttt{KEEPDISTINCT} should be set to true.
The available options for \htool{HHEd} are
\begin{optlist}
\ttitem{-d dir} This option tells \htool{HHEd} to look in
the directory \texttt{dir} to find the model definitions.
\ttitem{-o ext} This causes the file name extensions of the
original models (if any) to be replaced by \texttt{ext}.
\ttitem{-w mmf} Save all the macros and model definitions in a
single master macro file \texttt{mmf}.
\ttitem{-x s} Set the extension for the edited output files to be \texttt{s}
(default is to use the original names unchanged).
\ttitem{-z} Setting this option causes all aliases in the loaded
HMM set to be deleted (zapped) immediately before
loading the definitions. The result is that all logical names
are ignored and the actual HMM list
consists of just the physically distinct HMMs.
\stdoptB
\stdoptH
\stdoptM
\stdoptQ
\end{optlist}
\stdopts{HHEd}
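
For example, assuming a model set stored in the MMFs \texttt{hmm4/macros}
and \texttt{hmm4/hmmdefs}, an edit script \texttt{sil.hed} and a model list
\texttt{monophones1} (all hypothetical names), the edited models could be
written to the directory \texttt{hmm5} with
\begin{verbatim}
HHEd -H hmm4/macros -H hmm4/hmmdefs -M hmm5 sil.hed monophones1
\end{verbatim}
where \texttt{-H} and \texttt{-M} are among the standard options listed above.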
\subsection{Tracing}
\htool{HHEd} supports the following trace options where each
trace flag is given using an octal base
\begin{optlist}
\ttitem{00001} basic progress reporting.
\ttitem{00002} intermediate progress reporting.
\ttitem{00004} detailed progress reporting.
\ttitem{00010} show item lists used for each command.
\ttitem{00020} show memory usage.
\ttitem{00100} show changes to macro definitions.
\ttitem{00200} show changes to stream widths.
\ttitem{00400} show clusters.
\ttitem{00800} show questions.
\ttitem{01000} show tree filtering.
\ttitem{02000} show tree splitting.
\ttitem{04000} show tree merging.
\ttitem{10000} show good question scores.
\ttitem{20000} show all question scores.
\ttitem{40000} show all merge scores.
\end{optlist}
Trace flags are set using the \texttt{-T} option or the \texttt{TRACE}
configuration variable.
\index{hhed@\htool{HHEd}|)}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../htkbook"
%%% End: