%/* ----------------------------------------------------------- */


%/*                                                             */


%/*                          ___                                */


%/*                       |_| | |_/   SPEECH                    */


%/*                       | | | | \   RECOGNITION               */


%/*                       =========   SOFTWARE                  */ 


%/*                                                             */


%/*                                                             */


%/* ----------------------------------------------------------- */


%/*         Copyright:                                          */


%/*              2002  Cambridge University                     */


%/*                    Engineering Department                   */


%/*                                                             */


%/*   Use of this software is governed by a License Agreement   */


%/*    ** See the file License for the Conditions of Use  **    */


%/*    **     This banner notice must not be removed      **    */


%/*                                                             */


%/* ----------------------------------------------------------- */


%


% HTKBook - Gunnar Evermann 10.12.2002


%





\newpage


\mysect{HLRescore}{HLRescore}





\mysubsect{Function}{HLRescore-Function}





\index{hlrescore@\htool{HLRescore}|(}


\htool{HLRescore} is a general lattice post-processing tool. It reads
lattices (for example, those produced by \htool{HVite}) and applies one
or more of the following operations to them:





\begin{itemize}
\item finding the 1-best path through a lattice
\item pruning a lattice using forward-backward scores
\item expanding lattices with a new language model
\item converting lattices to equivalent word networks
\item calculating various lattice statistics
\item converting word MLF files to lattices with a language model
\end{itemize}





A typical use of \htool{HLRescore} is to apply a higher-order n-gram to
word lattices that were generated by \htool{HVite} using a bigram. This
involves the following steps:





\begin{itemize}
\item lattice generation with \htool{HVite} using a bigram
\item lattice pruning with \htool{HLRescore} (\texttt{-t})
\item expansion of the lattices using a trigram (\texttt{-n})
\item finding the 1-best transcription in the expanded lattice
  (\texttt{-f})
\end{itemize}
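The last three steps can be sketched as a single invocation. This is
an illustration only: the file names (\texttt{wordlist},
\texttt{trigram.lm}, \texttt{lat/utt1.lat}, \texttt{rescored.mlf}) and
the beamwidth of 200.0 are hypothetical, the lattice is assumed to have
been generated already by \htool{HVite} with a bigram, and the
operations are assumed to be applied in the order prune, expand, find
1-best.
\begin{verbatim}
   HLRescore -t 200.0 -n trigram.lm -f -i rescored.mlf wordlist lat/utt1.lat
\end{verbatim}
Here \texttt{-i} collects the 1-best transcriptions in a master label
file so that they can be scored in one pass afterwards.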





Another common use of \htool{HLRescore} is tuning the language model
scaling factor and the word insertion penalty for use in recognition.
Instead of re-running a decoder many times with different parameter
settings, the decoder is run once to generate lattices.
\htool{HLRescore} can then find the best transcription for a given
parameter setting very quickly. The resulting transcriptions can be
scored (using \htool{HResults}) and the parameter setting that yields
the lowest word error rate selected.
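A parameter sweep of this kind might look like the following shell
loop. It is a minimal sketch, not a recommended recipe: the grid of
scale factors and insertion penalties, the word list
\texttt{wordlist}, the lattice directory \texttt{lat} and the output
file naming scheme are all hypothetical. Only the options \texttt{-f},
\texttt{-s}, \texttt{-p} and \texttt{-i} described in this section are
used.
\begin{verbatim}
   # one rescoring pass per (grammar scale, insertion penalty) pair
   for s in 10.0 12.0 14.0; do
     for p in -4.0 0.0 4.0; do
       HLRescore -f -s $s -p $p -i rec_s${s}_p${p}.mlf wordlist lat/*.lat
     done
   done
\end{verbatim}
Each resulting master label file can then be scored against the
reference transcriptions, and the pair of values with the lowest word
error rate kept.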





Lattices produced by standard HTK decoders, for example
\htool{HVite} and \htool{HDecode}, may still contain duplicate word
paths corresponding to different phonetic contexts caused by
pronunciation variants or optional between-word short pause silence
models. These duplicate lattice nodes and arcs must be merged to
ensure that the finite state grammars created from the lattices by HTK
decoders are deterministic, and therefore usable for recognition.
\htool{HLRescore} also supports this merging operation.
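As a rough sketch, merging in the forward direction and writing the
resulting lattices to a separate directory could be requested as shown
below. The word list \texttt{wordlist}, the input lattice
\texttt{lat/utt1.lat} and the output directory \texttt{merged} are
hypothetical names, and it is an assumption that \texttt{-m} combined
with \texttt{-w} is sufficient to produce the merged lattice.
\begin{verbatim}
   HLRescore -m f -w -l merged wordlist lat/utt1.lat
\end{verbatim}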








\mysubsect{Use}{HLRescore-Use}





\htool{HLRescore} is invoked via the command line


\begin{verbatim}
   HLRescore [options] vocabFile LatFiles......
\end{verbatim}





\htool{HLRescore} reads each of the lattice files and performs the
requested operation(s) on them. At least one of the following
operations must be selected: find 1-best (\texttt{-f}), write lattices
(\texttt{-w}), or calculate statistics (\texttt{-c}).








The detailed operation of \htool{HLRescore} is controlled by the following
command line options:


\begin{optlist}





  \ttitem{-i mlf} Output transcriptions to master file \texttt{mlf}.





  \ttitem{-l s} Directory in which to store label/lattice files.





  \ttitem{-m s} Direction in which duplicate nodes and arcs of
  lattices are merged. The default value is \texttt{b}, which merges
  in the backward direction starting from the sentence end node of the
  lattice. If direction \texttt{f} is given, forward merging is
  performed instead.





  \ttitem{-n lm} Load ARPA-format n-gram language model from file
  \texttt{lm} and expand lattice with this LM. All acoustic scores are
  unchanged but the LM scores are replaced and lattice nodes (i.e.\
  contexts) are expanded as required by the structure of the LM.





  \ttitem{-wn lm} Load Resource Management format word pair language


  model from file \texttt{lm} and apply this LM to a lattice converted


  from a word MLF file. 





  \ttitem{-o s} Choose how the output labels should be formatted.
        \texttt{s} is a string with certain letters (from \texttt{NSCTWM})
        indicating binary flags that control formatting options.
        \texttt{N} Normalize acoustic scores by dividing by the duration
        (in frames) of the segment.
        \texttt{S} Remove scores from output label.  By default
        scores will be set to the total likelihood of the segment.
        \texttt{C} Set the transcription labels to start and end on
        frame centres. By default start times are set to the start
        time of the frame and end times are set to the end time of
        the frame.
        \texttt{T} Do not include times in output label files.
        \texttt{W} Do not include words in output label files
        when performing state or model alignment.
        \texttt{M} Do not include model names in output label
        files.





  \ttitem{-t f [a]} Perform lattice pruning after reading lattices with
  beamwidth \texttt{f}. If the second argument is given, lower the beam
  to limit the number of arcs per second to \texttt{a}.





  \ttitem{-u f} Perform lattice pruning before writing output


  lattices. Otherwise like \texttt{-t}.





  \ttitem{-p f} Set the word insertion log probability to \texttt{f} 


  (default 0.0).


  


  \ttitem{-a f} Set the acoustic model scale factor to \texttt{f}
  (default value 1.0).





  \ttitem{-r f} Set the dictionary pronunciation probability scale
        factor to \texttt{f} (default value 1.0).





  \ttitem{-s f} Set the grammar scale factor to \texttt{f}.
        This factor post-multiplies the language model likelihoods
        from the word lattices (default value 1.0).





  \ttitem{-d} Take pronunciation probabilities from the dictionary


  instead of from the lattice.





  \ttitem{-c} Calculate and output lattice statistics.





  \ttitem{-f} Find 1-best transcription (path) in lattice.





  \ttitem{-w} Write output lattice after processing.





  \ttitem{-q s} Choose how the output lattice should be formatted.
         \texttt{s} is a string with certain letters (from \texttt{ABtvaldmn})
         indicating binary flags that control formatting options.
         \texttt{A} attach word labels to arcs rather than nodes.
         \texttt{B} output lattices in binary for speed.
         \texttt{t} output node times.
         \texttt{v} output pronunciation information.
         \texttt{a} output acoustic likelihoods.
         \texttt{l} output language model likelihoods.
         \texttt{d} output word alignments (if available).
         \texttt{m} output within word alignment durations.
         \texttt{n} output within word alignment likelihoods.





  \ttitem{-y ext}  This sets the extension for output label files to


        \texttt{ext} (default \texttt{rec}).





\stdoptF


\stdoptG


\stdoptH


\stdoptI


\stdoptJ


\stdoptK


\stdoptP





\end{optlist}


\stdopts{HLRescore}





\mysubsect{Tracing}{HLRescore-Tracing}





\htool{HLRescore} supports the following trace options, where each
trace flag is given using an octal base:
\begin{optlist}
   \ttitem{0001} enable basic progress reporting.
   \ttitem{0002} output generated transcriptions.
   \ttitem{0004} show details of lattice I/O.
   \ttitem{0010} show memory usage after each lattice.
\end{optlist}
Trace flags are set using the \texttt{-T} option or the \texttt{TRACE}
configuration variable.
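Because each flag occupies its own bit, several trace options can be
enabled at once by adding their octal values. For example, basic
progress reporting together with output of the generated
transcriptions gives $0001 + 0002 = 0003$. The command below is a
sketch with hypothetical file names, and it assumes that \texttt{-T}
accepts the combined value written in the same octal notation as the
flags listed above.
\begin{verbatim}
   HLRescore -T 0003 -f -i rec.mlf wordlist lat/utt1.lat
\end{verbatim}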


\index{hlrescore@\htool{HLRescore}|)}








%%% Local Variables: 


%%% mode: latex


%%% TeX-master: "../htkbook"


%%% End: 

