larger, but distortion of the resulting signal also becomes
remarkable. (default: 2.0)
-ssfloor value
Flooring coefficient of spectral subtraction. Spectral parameters
that fall below zero after subtraction are replaced by the source
signal multiplied by this coefficient. (default: 0.5)
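The flooring rule can be sketched as follows. This is a minimal
illustration, not the Julius implementation: the function and argument
names are hypothetical, and it assumes that the subtraction coefficient
from the preceding entry (default 2.0) scales the noise spectrum before
it is subtracted.

```python
def spectral_subtract(source, noise, alpha=2.0, floor=0.5):
    """Subtract a scaled noise spectrum from a source spectrum, bin by bin.

    alpha and floor are hypothetical names; their defaults are the
    values quoted in the option descriptions.
    """
    result = []
    for s, n in zip(source, noise):
        diff = s - alpha * n
        if diff < 0.0:
            # Went under zero: substitute the floored source value.
            diff = floor * s
        result.append(diff)
    return result


if __name__ == "__main__":
    # The middle bin goes negative (4.0 - 2.0 * 3.0) and is floored to 0.5 * 4.0.
    print(spectral_subtract([10.0, 4.0, 1.0], [2.0, 3.0, 0.25]))
```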
GMM-based Input Verification and Rejection
-gmm filename
GMM definition file in HTK format. If specified, GMM-based input
verification is performed concurrently with the 1st pass, and the
input can be rejected according to the result, as specified by
"-gmmreject". Note that the GMMs should be defined as one-state
HMMs, and their training parameters should be the same as those of
the acoustic model they are used with.
-gmmnum N
Number of Gaussian components to be computed per frame in GMM
calculation. Only the N-best Gaussians are computed, for speed.
The default is 10. A smaller value speeds up GMM calculation, but
too small a value (1 or 2) may degrade identification performance.
-gmmreject string
Comma-separated list of GMM names to be rejected as invalid
input. During recognition, the log likelihoods of the GMMs,
accumulated over the entire input, are computed concurrently with
the 1st pass. If the name of the GMM with the maximum score is in
this list, the 2nd pass is not executed and the input is rejected.
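The rejection decision described above can be sketched as follows. The
GMM names ("speech", "noise", "laugh") and the per-frame scores are made
up for illustration, and the function is hypothetical rather than part
of Julius.

```python
def gmm_decision(frame_scores, reject_string):
    """Accumulate per-frame GMM log likelihoods and decide on rejection.

    frame_scores: list of dicts mapping GMM name -> log likelihood of one frame.
    reject_string: comma-separated GMM names, as given to "-gmmreject".
    Returns (best_gmm_name, rejected).
    """
    totals = {}
    for frame in frame_scores:
        for name, loglik in frame.items():
            totals[name] = totals.get(name, 0.0) + loglik
    best = max(totals, key=totals.get)
    rejected = best in reject_string.split(",")
    return best, rejected


if __name__ == "__main__":
    frames = [
        {"speech": -10.0, "noise": -12.0, "laugh": -15.0},
        {"speech": -11.0, "noise": -8.0, "laugh": -14.0},
    ]
    # "noise" has the highest accumulated score, so the input is rejected.
    print(gmm_decision(frames, "noise,laugh"))
```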
Language Model (word N-gram)
-nlr 2gram_filename
2-gram language model file in standard ARPA format.
-nrl rev_3gram_filename
Reverse 3-gram language model file. This is required for the
second search pass. If this is not defined then only the first
pass will take place.
-d bingram_filename
Use binary format language model instead of ARPA formats. The
2-gram and 3-gram model can be combined and converted to this
binary format using mkbingram. Julius can read this format much
faster than ARPA format.
-lmp lm_weight lm_penalty
-lmp2 lm_weight2 lm_penalty2
Language model score weights and word insertion penalties for
the first and second passes respectively.
The hypothesis language scores are scaled as shown below:
lm_score1 = lm_weight * 2-gram_score + lm_penalty
lm_score2 = lm_weight2 * 3-gram_score + lm_penalty2
The defaults depend on the acoustic model (each pair is weight, penalty):
First-Pass  | Second-Pass
--------------------------
 5.0  -1.0  |  6.0   0.0  (monophone)
 8.0  -2.0  |  8.0  -2.0  (triphone,PTM)
 9.0   8.0  | 11.0  -2.0  (triphone,PTM, setup=v2.1)
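As a worked example of the scaling formulas, the sketch below applies
the monophone defaults to two made-up N-gram log scores. The function
name and the input scores are hypothetical; only the formula and the
default values come from the text above.

```python
def lm_score(ngram_score, lm_weight, lm_penalty):
    """Scale an N-gram score by the LM weight and add the insertion penalty."""
    return lm_weight * ngram_score + lm_penalty


# Monophone defaults from the table: (weight, penalty) for each pass.
FIRST_PASS = (5.0, -1.0)
SECOND_PASS = (6.0, 0.0)

if __name__ == "__main__":
    # Made-up 2-gram and 3-gram scores for one hypothesis word.
    print(lm_score(-2.0, *FIRST_PASS))   # 5.0 * -2.0 + -1.0 = -11.0
    print(lm_score(-1.5, *SECOND_PASS))  # 6.0 * -1.5 + 0.0 = -9.0
```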
-transp float
Additional insertion penalty for transparent words. (default:
0.0)
Word Dictionary
-v dictionary_file
Word dictionary file (required).
-silhead {WORD|WORD[OUTSYM]|#num}
-siltail {WORD|WORD[OUTSYM]|#num}
Sentence start and end silence words as defined in the dictionary.
(default: "<s>" / "</s>")
Julius treats these words as the fixed start word and end word of
recognition. They can be specified in several formats, as shown
below.
Format                     Example
Word_name                  <s>
Word_name[output_symbol]   <s>[silB]
#Word_ID                   #14
(Word_ID is the word position in the dictionary
file starting from 0)
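To make the three formats concrete, the following sketch classifies a
"-silhead" / "-siltail" argument. It is an illustration only: the
function name and the returned dictionary keys are hypothetical, and
Julius's own parser may treat edge cases differently.

```python
import re


def parse_sil_spec(spec):
    """Classify a silence-word specification into one of the three formats."""
    if spec.startswith("#"):
        # "#num": position of the word in the dictionary file, starting from 0.
        return {"word_id": int(spec[1:])}
    match = re.fullmatch(r"(.+?)\[(.*)\]", spec)
    if match:
        # "WORD[OUTSYM]": word name plus its output symbol.
        return {"word": match.group(1), "output": match.group(2)}
    # "WORD": word name only.
    return {"word": spec}


if __name__ == "__main__":
    for spec in ["<s>", "<s>[silB]", "#14"]:
        print(spec, parse_sil_spec(spec))
```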
-forcedict
Ignore dictionary errors and run anyway. Words with errors are
dropped from the dictionary at startup.
Acoustic Model (HMM)
-h hmmfilename
HMM definition file to use. Format (ascii/binary) will be auto-
matically detected. (required)
-hlist HMMlistfilename
HMMList file to use. Required when using triphone-based HMMs.
This file provides a mapping between the logical triphone names
generated from the phone sequences in the dictionary and the HMM
definition names.
-iwcd1 {best N|max|avg}
When using a triphone model, select the method used to handle
inter-word triphone context on the first and last phone of a word
in the first pass.
best N: use average likelihood of N-best scores from the same
        context triphones (default, N=3)
max:    use maximum likelihood of the same context triphones
avg:    use average likelihood of the same context triphones
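The three methods differ only in how the scores of the candidate
triphones sharing the same context are combined. A minimal sketch,
assuming the candidates' scores are already available as a list of
numbers (the function name and sample scores are hypothetical):

```python
def interword_score(scores, method="best", n=3):
    """Combine the scores of same-context triphones into one value."""
    if method == "max":
        return max(scores)
    if method == "avg":
        return sum(scores) / len(scores)
    if method == "best":
        # Average over the N highest scores only.
        top = sorted(scores, reverse=True)[:n]
        return sum(top) / len(top)
    raise ValueError("unknown method: " + method)


if __name__ == "__main__":
    candidates = [-4.0, -1.0, -4.0, -7.0]
    print(interword_score(candidates, "best", 3))  # (-1 - 4 - 4) / 3 = -3.0
    print(interword_score(candidates, "max"))      # -1.0
    print(interword_score(candidates, "avg"))      # -16 / 4 = -4.0
```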
-force_ccd / -no_ccd
Normally Julius determines whether the specified acoustic model
is context-dependent from the model names, i.e., whether the
model names contain the characters '+' and '-'. Use these options
to specify this explicitly and avoid mis-detection. They override
the automatic detection result.
-notypecheck
Disable checking of input parameter type. (default: enabled)
Acoustic Computation
Gaussian Pruning is automatically enabled when using a tied-mixture
based acoustic model. It is disabled by default for non tied-mixture
models, but you can activate pruning for those models by explicitly
specifying "-gprune". Gaussian Selection needs a monophone model
converted by mkgshmm.
-gprune {safe|heuristic|beam|none}
Set the Gaussian pruning technique to use.
(default: 'safe' (setup=standard), 'beam' (setup=fast) for tied
mixture model, 'none' for non tied-mixture model)
-tmix K
With Gaussian Pruning, specify the number of Gaussians to compute
per mixture codebook. A small value speeds up computation, but the
likelihood error grows larger. (default: 2)
-gshmm hmmdefs
Specify monophone hmmdefs to use for Gaussian Mixture Selection
(GMS). The monophone model for GMS is generated from an ordinary
monophone HMM model using mkgshmm. This option is disabled by
default. (no GMS applied)
-gsnum N
When using GMS, specify the number of monophone states to select
from all monophone states. (default: 24)
Inter-word Short Pause Handling
-iwspword
Add a word entry to the dictionary that corresponds to inter-word
short pauses that may occur in input speech. This may improve
recognition accuracy with language models that have no inter-word
pause modeling. The word entry can be specified by "-iwspentry".
-iwspentry
Specify the word entry that will be added by "-iwspword".
(default: "<UNK> [sp] sp sp")
-iwsp (Multi-path version only) Enable inter-word context-free short
pause handling. This option appends a skippable short pause
model to every word end. The added model is skipped in
inter-word context handling. The HMM model to be appended can
be specified by the "-spmodel" option.
-spmodel
Specify short-pause model name that will be used in "-iwsp".
(default: "sp")
Short-pause Segmentation
Short pause segmentation can be used for successive decoding of a
long utterance. It is enabled when Julius is compiled with
'--enable-sp-segment'.
-spdur Set the short-pause duration threshold in number of frames. If
a short-pause word has the maximum likelihood for more successive
frames than this value, the first pass is interrupted and the
second pass starts. (default: 10)
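The threshold test can be sketched as a run-length counter over the
frames of the first pass. This is an illustration under assumptions: the
per-frame flag "short-pause word is the best hypothesis" is taken as
given, and the function name is hypothetical.

```python
def find_segment_frame(pause_is_best, spdur=10):
    """Return the first frame index at which the run of short-pause frames
    becomes longer than spdur, or None if that never happens.

    pause_is_best: one boolean per frame, True when a short-pause word
    has the maximum likelihood in that frame.
    """
    run = 0
    for t, is_pause in enumerate(pause_is_best):
        run = run + 1 if is_pause else 0
        if run > spdur:
            return t
    return None


if __name__ == "__main__":
    # 5 speech frames followed by 12 pause frames: the run exceeds 10
    # at the 11th pause frame, which is frame index 15.
    print(find_segment_frame([False] * 5 + [True] * 12))
```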
Search Parameters (First Pass)
-b beamwidth
Beam width (number of HMM nodes) on the first pass. This value
defines the search width on the 1st pass and has a large effect on
the total processing time. A smaller width speeds up decoding,
but too small a value causes a substantial increase in recognition
errors due to search failure. A larger value makes the search more
stable and reduces search failures, but processing time and memory
usage grow in proportion to the width.
Default value is acoustic model dependent:
400 (monophone)
800 (triphone,PTM)
1000 (triphone,PTM, setup=v2.1)
-sepnum N
Number of high frequency words to be separated from the lexicon
tree. (default: 150)
-1pass Only perform the first pass search. This mode is automatically
set when no 3-gram language model has been specified (-nrl).
-realtime
-norealtime
Explicitly specify whether real-time (pipeline) processing is
done in the first pass. For file input, the default is OFF
(-norealtime); for microphone, adinnet and NetAudio input, the
default is ON (-realtime). This option affects the way CMN is
performed: when OFF, CMN is calculated for each input using the
cepstral mean of the whole input. When ON, MAP-CMN is performed.
With MAP-CMN, the cepstral mean of the last 5 seconds is used as
the initial cepstral mean at the beginning of each input. Also
refer to "-progout".
-cmnsave filename
Save the last CMN parameters computed during recognition to the
specified file. The parameters are saved each time an input is
recognized, so the output file always holds the latest CMN
parameters. If the output file already exists, it is overwritten.
-cmnload filename
Load initial CMN parameters previously saved in a file by
"-cmnsave". Loading an initial CMN enables Julius to better
recognize the first utterance on microphone / network input. Also
see "-cmnnoupdate".
-cmnmapweight
Specify the weight of the initial cepstral mean at the beginning of
each utterance for microphone / network input. Specify a larger
value to retain the initial cepstral mean for a longer period, and
a smaller value to rely more on the current input. (default: 100.0)
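The effect of the weight can be illustrated with a weighted running
mean. The exact update formula is not stated in this text, so the
formula below is an assumption (a common MAP-style estimate), and the
function name is hypothetical.

```python
def map_cmn_mean(initial_mean, frames, weight=100.0):
    """Estimate the cepstral mean from an initial mean and the frames seen so far.

    Assumed formula, per dimension d:
        mean[d] = (weight * initial_mean[d] + sum of frames[d]) / (weight + n)
    With no frames the initial mean is returned unchanged; as more frames
    arrive, the estimate moves toward the mean of the current input.
    """
    n = len(frames)
    dims = len(initial_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dims)]
    return [(weight * initial_mean[d] + sums[d]) / (weight + n)
            for d in range(dims)]


if __name__ == "__main__":
    # After 100 frames the weight of 100.0 gives the initial mean (1.0)
    # and the input mean (3.0) equal influence.
    print(map_cmn_mean([1.0], [[3.0]] * 100, weight=100.0))
```

Under this assumption, a larger weight means more frames are needed
before the current input dominates, which matches the behaviour
described for the option.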
-cmnnoupdate
For microphone / network input, this option makes the engine keep
the initial cepstral mean given by "-cmnload" permanently, instead
of updating the cepstral mean at each input.
Search Parameters (Second Pass)
-b2 hyponum
Beam width (number of hypotheses) in the second pass. If the count
of word expansions at a certain hypothesis length reaches this
limit during search, shorter hypotheses are not expanded further.
This prevents the search from falling into a breadth-first-like
state, stalling at the same position, and reduces search failures.
(default: 30)