all of the remaining parameter definitions can be seen to be masked by
higher level over-rides.

The configuration file itself consists of a sequence of parameter
definitions of the form
\index{configuration files!format}
\begin{verbatim}
    [MODULE:] PARAMETER = VALUE
\end{verbatim}
One parameter definition is written per line and square brackets
indicate that the module name is
optional. Parameter definitions are not case sensitive
but by convention they are written in upper case. A \verb+#+ character
indicates that the rest of the line is a comment.

As an example, the following is a simple configuration file
\begin{verbatim}
    # Example config file
    TARGETKIND   = MFCC
    NUMCHANS     = 20
    WINDOWSIZE   = 250000.0    # ie 25 msecs
    PREEMCOEF    = 0.97
    ENORMALISE   = T
    HSHELL: TRACE = 02         # octal
    HPARM: TRACE  = 0101
\end{verbatim}
The first five lines contain no module name and hence they apply
globally, that is, any library module or tool which is interested
in the configuration parameter \texttt{NUMCHANS} will read the given
parameter value. In practice, this is not a problem with library modules
since nearly all configuration parameters have unique
names. The final two lines show the same parameter name being given
different values within different modules. This is an example of
a parameter which every module responds to and hence does not have a unique
name.

This example also shows each of the four possible types of value that can
appear in a configuration file: string\index{string values},
integer\index{integer values}, float\index{float values} and
Boolean\index{Boolean values}.
The configuration parameter
\texttt{TARGETKIND}\index{targetkind@\texttt{TARGETKIND}} requires a string
value specifying the name of a speech parameter kind. Strings not
starting with a letter should be enclosed in double quotes.
\texttt{NUMCHANS}\index{numchans@\texttt{NUMCHANS}} requires an integer
value specifying the number of
filter-bank channels to use in the analysis.
\texttt{WINDOWSIZE}\index{windowsize@\texttt{WINDOWSIZE}} actually
requires a floating-point value
specifying the window size in units of 100ns. However, an integer
can always be given wherever a float is required.
\texttt{PREEMCOEF} also requires a floating-point value specifying the
pre-emphasis coefficient to be used. Finally,
\texttt{ENORMALISE}\index{enormalise@\texttt{ENORMALISE}} is
a Boolean parameter which determines whether or not energy
normalisation is to be performed; its value must be \texttt{T} or
\texttt{TRUE}, or \texttt{F} or \texttt{FALSE}. Notice also that, as in
command line options,
integer values can use the C conventions for writing in non-decimal bases.
Thus, the trace value of 0101 is equal to decimal 65. This is particularly
useful in this
case because trace values are typically interpreted as bit-strings by
\HTK\ modules and tools.
\index{configuration files!types}

If the name of a configuration variable is mis-typed, there will be no
warning and the variable will simply be ignored. To help guard
against this, the standard option \texttt{-D} can be used. This
displays all of the configuration variables before and after the tool
runs. In the latter case, all configuration variables which are still
unread are marked by a hash character. The initial display allows the
configuration values to be checked before potentially wasting a large
amount of CPU time through incorrectly set parameters. The final
display shows which configuration variables were actually used during
the execution of the tool.
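As a minimal sketch of how such a check might be requested, the
following command line passes a configuration file with \texttt{-C}
and asks for the configuration display with \texttt{-D}. The file
names \texttt{config} and \texttt{speech.dat} are placeholders chosen
for illustration only, and \htool{HList} merely stands in for whichever
tool is being run.
\begin{verbatim}
    HList -C config -D speech.dat
\end{verbatim}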
The form of the output is shown by the
following example
\begin{verbatim}
    HTK Configuration Parameters[3]
      Module/Tool     Parameter                  Value
    #                 SAVEBINARY                  TRUE
      HPARM           TARGETRATE         256000.000000
                      TARGETKIND                MFCC_0
\end{verbatim}
Here three configuration parameters have been set but the hash
(\verb+#+) indicates that \texttt{SAVEBINARY} has not been used.
\index{configuration variables!display}
\index{configuration files|)}

\mysect{Standard Options}{stdopts}

\index{standard options}
As noted in section~\ref{s:cmdline}, options
consisting of a capital letter are common across all tools. Many are
specific to particular file types and they will be introduced as they
arise. However, there are six options that are standard across all
tools. Three of these have been mentioned already. The
option \texttt{-C}\index{standard options!aaac@\texttt{-C}} is used to
specify a configuration file name and the option
\texttt{-S}\index{standard options!aaas@\texttt{-S}} is used to
specify a script file name, whilst the option
\texttt{-D}\index{standard options!aaad@\texttt{-D}} is used to
display configuration settings.

The two remaining standard options provided directly by \htool{HShell}
are \texttt{-A}\index{standard options!aaaa@\texttt{-A}}
and \texttt{-V}. The option \texttt{-A} causes the current
command line arguments to be printed. When running experiments via
scripts, it is a good idea to use this option to record in a log file the
precise settings used for each tool.
The option \texttt{-V}\index{standard options!aaav@\texttt{-V}}
causes version information for the tool and each module used by that
tool to be listed. These should always be quoted when making bug reports.

Finally, all tools implement the trace option
\texttt{-T}\index{standard options!aaat@\texttt{-T}}.
Trace values are typically bit strings and the meaning of each bit
is described in the reference section for each tool. Setting a trace
\index{tracing}
option via the command line overrides any setting for that same trace
option in a configuration file.
This is a general rule: command line
options always override defaults set in configuration files.

All of the standard options are listed in the final summary section of
this chapter. As a general rule, you should consider passing at least
\texttt{-A -D -V -T 1} to all tools, which will guarantee that sufficient
information is available in the tool output.

\mysect{Error Reporting}{erep}

The \htool{HShell} module provides a standard mechanism for reporting
errors\index{errors} and warnings\index{warnings}.
A typical error message is as follows
\begin{verbatim}
    HList: ERROR [+1110]
    IsWave: cannot open file speech.dat
\end{verbatim}
This indicates that the tool \htool{HList} is reporting an error
number +1110. All errors have positive error
numbers\index{error numbers!structure of} and always
result in the tool terminating. Warnings have negative error numbers
and the tool does not terminate. The first two digits of an error
number indicate the module or tool in which the error is located
(\htool{HList} in this case)
and the last two digits define the class of error.
The second line of the error message names the actual routine
in which the error occurred (here \texttt{IsWave}) and
the actual error message. All errors and warnings are listed
in the reference section at the end of this book indexed by
error/warning number. This listing contains more details on each
error or warning along with suggested causes.

Error messages are sent to the standard error stream but warnings
are sent to the standard output stream. The reason for the latter
is that most \HTK\ tools are run with progress tracing enabled.
Sending warnings
to the standard output stream ensures that they are properly
interleaved with the trace of progress so that it is easy to determine
the point at which the warning was issued.
Sending warnings to
standard error would lose this information.

The default behaviour of a \HTK\ tool on terminating due to an
error is to exit normally returning the error number as exit status.
If, however, the configuration variable
\texttt{ABORTONERR}\index{abortonerr@\texttt{ABORTONERR}} is set to
true then the tool will core dump. This is a debugging facility which
should not concern most users.
\index{termination}

\mysect{Strings and Names}{htkstrings}

Many \HTK\ definition files include names of various types of
objects: for example labels, model names, words, etc.
In order to achieve some uniformity, \HTK\ applies standard
rules for reading strings which are names.
\index{strings!rules for}
These rules are not, however, necessary when using the language
modelling tools -- see below.

A name string consists of a single white space delimited word or
a quoted string. Either the single quote \verb+'+
or the double quote \verb+"+ can be used to quote strings but the
start and end quotes must be matched. The backslash \verb+\+ character
can also
be used to introduce otherwise reserved characters. The character
following a backslash is inserted into the string without special
processing unless that character is a digit in the range 0 to 7.
In that case, the three
characters following the backslash are read and interpreted as an octal
character code.
When the three characters are not octal digits the result
is not well defined.
\index{strings!metacharacters in}

In summary the special processing is
\begin{center}
\begin{tabular}{|c|l|} \hline
Notation & Meaning \\ \hline
\verb+\\+ & \verb+\+ \\ \hline
\verb+\_ + & represents a space that will not terminate a string \\ \hline
\verb+\'+ & \verb+'+ (and will not end a quoted string) \\ \hline
\verb+\"+ & \verb+"+ (and will not end a quoted string) \\ \hline
\verb+\nnn+ & the character with octal code \verb+\nnn+ \\ \hline
\end{tabular}
\end{center}
\noindent
\index{non-printing chars}
Note that the above allows the same effect to be achieved in a
number of different ways. For example,
\begin{verbatim}
    "\"QUOTE"
    \"QUOTE
    '"QUOTE'
    \042QUOTE
\end{verbatim}
all produce the string \verb+"QUOTE+.

The only exceptions to the above general rules are:
\begin{itemize}
\item Where models are specified in \htool{HHEd} scripts,
commas (\verb+,+), dots (\verb+.+),
and closing brackets (\verb+)+) are all used as extra delimiters
to allow \htool{HHEd} scripts
created for earlier versions of \HTK\ to be used unchanged.
Hence for example, \texttt{(a,b,c,d)} would be split into 4 distinct
name strings \texttt{a}, \texttt{b}, \texttt{c} and \texttt{d}.
\item When the configuration variable
\texttt{RAWMITFORMAT} is set true, each word in a language model
definition file consists of a white space delimited string with
no special processing being performed.
\item Source dictionaries read by \htool{HDMan} are read using
the standard \HTK\ string conventions; however, the command \texttt{IR}
can be used in a \htool{HDMan} source edit script to switch to using
this raw format.
\item To ensure that the general definition of a name string works
properly in \HTK\ master label files, all
MLFs must have the reserved \texttt{.} and \verb+///+ terminators
alone on a line with no surrounding white space.
If this causes problems reading old MLF files, the configuration
variable \texttt{V1COMPAT} should be set true in the module
\htool{HLabel}. In this case,
\HTK\ will attempt to simulate the behaviour of the older version 1.5.
\item
To force numbers to be interpreted as strings rather than times or scores in a
label file, they must be quoted. If the configuration variable
\texttt{QUOTECHAR} is set to \verb+'+ or \verb+"+ then output labels will be
quoted with the specified quote character. If \texttt{QUOTECHAR} is set to
\verb+\+, then output labels will be escaped. The default is to
select the simplest quoting mechanism.
\index{strings!output of}
\end{itemize}

Note that under some versions of \texttt{Unix} \HTK\ can support the 8-bit
character sets used for the representation of various orthographies. In
such cases the shell environment variable \texttt{\$LANG} usually governs
which ISO character set is in use.

\subsubsection{Language modelling tools}