.\" $Id: segment-nbest.1,v 1.8 2004/12/03 17:59:01 stolcke Exp $
.TH segment-nbest 1 "$Date: 2004/12/03 17:59:01 $" "SRILM Tools"
.SH NAME
segment-nbest \- rescore and segment N-best lists using hidden segment N-gram model
.SH SYNOPSIS
.B segment-nbest
[\c
.BR \-help ]
option\&...
nbest-file-list\&...
.SH DESCRIPTION
.B segment-nbest
processes a series of consecutive N-best lists from a speech
recognizer
and applies a hidden segment N-gram language model to them.
The language model is a standard backoff N-gram model in ARPA
.BR ngram-format (5)
modeling sentence segmentation using the boundary tags <s> and </s>.
The program reads in all N-best lists and outputs the hypotheses that
have the highest aggregate (combined acoustic and language model) score.
Hypothesized sentence boundaries are marked by <s> tags.
.SH OPTIONS
.PP
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
.TP
.B \-help
Print option summary.
.TP
.B \-version
Print version information.
.TP
.BI \-order " n"
Set the maximal N-gram order to be used, by default 3.
NOTE: The order of the model is not set automatically when a model
file is read, so the same file can be used at various orders.
.TP
.BI \-debug " level"
Set the debugging output level (0 means no debugging output).
Debugging messages are sent to stderr.
.TP
.BI \-lm " file"
Read the N-gram model from
.IR file .
.TP
.B \-tolower
Map all vocabulary to lowercase.
Useful if case conventions for N-best lists and language model differ.
.TP
.BI \-mix-lm " file"
Read a second, standard N-gram model for interpolation purposes.
.TP
.BI \-lambda " weight"
Set the weight of the main model when interpolating with
.BR \-mix-lm .
Default value is 0.5.
.TP
.BI \-bayes " length"
Interpolate the second and the main model using posterior probabilities
for local N-gram contexts of length
.IR length .
The
.B \-lambda
value is used as a prior mixture weight in this case.
.TP
.BI \-bayes-scale " scale"
Set the exponential scale factor on the context likelihood in conjunction
with the
.B \-bayes
function.
Default value is 1.0.
.TP
.BI \-nbest-files " list"
Specifies a list of N-best files.
The file
.I list
should contain a list of filenames, one per line,
each corresponding to an N-best file in one of the formats
described in
.BR nbest-format (5).
The N-best files should correspond to consecutive speech waveforms
in the order listed.
.TP
.B \-fb-rescore
Perform forward-backward rescoring.
This generates new N-best lists
as output whose LM scores reflect the posterior probability of each
hypothesis.
The default is to perform Viterbi rescoring and output only the
best combined hypothesis.
.TP
.BI \-write-nbest-dir " dir"
Write rescored N-best lists to directory
.I dir
instead of to stdout.
The filenames from the input are preserved.
.TP
.BI \-max-nbest " n"
Limits the number of hypotheses read from each N-best list to the first
.IR n .
.TP
.BI \-max-rescore " m"
Only choose among the top
.I m
hypotheses of each list (after reordering hypotheses, see below).
This is an effective way to limit the quadratic computation of the
Viterbi or forward-backward dynamic programming.
.TP
.B \-no-reorder
Do not reorder the hypotheses before limiting the computation to
the top
.IR m .
By default the hypotheses will first be sorted according to the
acoustic and language model scores recorded in the N-best lists.
.TP
.BI \-rescore-lmw " weight"
Specifies the language model weight to be used in combining
acoustic and language model scores to select the best hypotheses.
.TP
.BI \-rescore-wtw " weight"
Specifies the word transition weight to be used in selecting the
best hypotheses.
.TP
.BI \-noise " noise-tag"
Designate
.I noise-tag
as a vocabulary item that is to be ignored by the LM.
(This is typically used to identify a noise marker.)
.TP
.BI \-noise-vocab " file"
Read several noise tags from
.IR file ,
instead of, or in addition to, the single noise tag specified by
.BR \-noise .
.TP
.BI \-decipher-lm " model-file"
Designates the N-gram backoff model (typically a bigram) that was used by the
Decipher(TM) recognizer in computing composite scores.
Used to compute acoustic scores from the composite scores if the
N-best lists are in "NBestList1.0" format.
.TP
.BI \-decipher-lmw " weight"
Specifies the language model weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
.TP
.BI \-decipher-wtw " weight"
Specifies the word transition weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
.TP
.BI \-stag " string"
Use
.I string
to mark segment boundaries in the output.
Default is the start-of-sentence symbol defined in the language model (<s>).
.TP
.BI \-bias " b"
Make a segment boundary a priori more likely by a factor of
.IR b .
If
.I b
is 0, the dynamic programming algorithm is restricted to never consider
hidden sentence boundaries; this is useful when
.B segment-nbest
is used merely for its ability to apply the LM across N-best boundaries.
.TP
.BI \-start-tag " string"
Insert a tag
.I string
at the front of every N-best hypothesis read in.
.TP
.BI \-end-tag " string"
Insert a tag
.I string
at the end of every N-best hypothesis read in.
This and the previous option are useful if the LM marks acoustic
waveform boundaries with a special tag.
.PP
.B segment-nbest
will also process any command line arguments following the options
as lists of N-best lists, as with the
.B \-nbest-files
option.
Each
.I nbest-file-list
will be processed in turn,
with individual output delimited by a line of the form
.br
	<nbestfile \fInbest-file-list\fP>
.br
.SH "SEE ALSO"
ngram-count(1), segment(1), ngram-format(5), nbest-format(5).
.br
A. Stolcke, ``Modeling Linguistic Segment and Turn Boundaries for N-best
Rescoring of Spontaneous Speech,'' \fIProc. Eurospeech\fP, 2779\-2782, 1997.
.SH BUGS
N-gram models of arbitrary order can be used, but the context at the
beginning of a hypothesis never extends beyond the words from the preceding
N-best list.
.SH AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
.br
Copyright 1997\-2004 SRI International
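.SH EXAMPLE
The following Python sketch illustrates how the weights given by
.B \-rescore-lmw
and
.B \-rescore-wtw
could enter the ranking of hypotheses, and how
.B \-max-rescore
then limits the candidates kept from each list.
It assumes the usual log-linear combination of the scores recorded in an
N-best list (acoustic score, plus weighted LM score, plus weighted word count);
the Hypothesis class, the function names, and the sample scores and weights
are hypothetical and not part of SRILM.
The hidden segment dynamic programming itself is not shown.
.nf
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One N-best entry (hypothetical container, not an SRILM type)."""
    words: List[str]
    acoustic_score: float  # log-domain acoustic score
    lm_score: float        # log-domain language model score


def combined_score(hyp: Hypothesis, lmw: float, wtw: float) -> float:
    """Assumed log-linear combination: acoustic + lmw * LM + wtw * word count."""
    return hyp.acoustic_score + lmw * hyp.lm_score + wtw * len(hyp.words)


def top_hypotheses(nbest: List[Hypothesis], m: int, lmw: float, wtw: float,
                   reorder: bool = True) -> List[Hypothesis]:
    """Keep the first m hypotheses, sorting by combined score unless disabled."""
    if reorder:
        nbest = sorted(nbest, key=lambda h: combined_score(h, lmw, wtw),
                       reverse=True)
    return nbest[:m]


# Made-up scores for illustration only.
nbest = [
    Hypothesis(["hello", "world"], -100.0, -10.0),
    Hypothesis(["hello", "word"], -98.0, -14.0),
    Hypothesis(["yellow", "world"], -105.0, -9.0),
]

best = top_hypotheses(nbest, m=2, lmw=8.0, wtw=0.0)
unsorted_best = top_hypotheses(nbest, m=2, lmw=8.0, wtw=0.0, reorder=False)
```
.fi
With these sample values the third hypothesis ranks first once the LM weight
is applied, although its acoustic score is the lowest of the three.
Passing reorder=False mimics the effect of
.B \-no-reorder
by keeping the input order.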