usages.doc
----------------------------------------------------------------------------
NAME
mlptrain - Train a multi-layer perceptron network by
conjugate gradient descent with a line-search algorithm
SYNOPSIS
mlptrain [options] configfile datafile weightfile
DESCRIPTION
This is the training program for standard feed-forward MLP
networks. It uses conjugate gradient descent with a line-search
algorithm. This algorithm is usually able to locate a minimum of the
error function much faster (by an order of magnitude) than the standard
BP algorithm. Part of the code is adapted from the OGI speech tools.
OPTIONS
-i [ 100] : weightfile dump cycle
-S [1997] : Random Seed
-l [1000] : Maximum Epoch
-r : Retrain from the old weight file
-n : No input bias node
-v : verbose
EXAMPLES
mlptrain -v 3layer.cfg spkr_tr.dat spkr_tr.net
Config file "3layer.cfg" is an ASCII file and looks like this
3
16 30 10
The first line gives the total number of node layers, followed by
the number of nodes in each layer. The example trains an MLP
network (16 input, 30 hidden and 10 output nodes) and saves the weights in
"spkr_tr.net".
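The optimizer itself is not listed in this document; as a rough illustration only (hypothetical code, not the program's source), the sketch below minimizes a small quadratic stand-in for the MLP error function using Polak-Ribiere conjugate directions with an exact line search:

```python
import numpy as np

# Hypothetical quadratic stand-in for the MLP error surface.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def f(w):
    """Error function: 0.5 w'Aw - b'w, minimized at w = A^-1 b."""
    return 0.5 * w @ A @ w - b @ w

def grad(w):
    return A @ w - b

def line_search(w, d):
    """Exact line search along d (closed form for a quadratic)."""
    return -(grad(w) @ d) / (d @ A @ d)

w = np.zeros(2)
g = grad(w)
d = -g                              # start with steepest descent
for _ in range(50):
    w = w + line_search(w, d) * d   # step to the minimum along d
    g_new = grad(w)
    if np.linalg.norm(g_new) < 1e-10:
        break
    beta = g_new @ (g_new - g) / (g @ g)   # Polak-Ribiere coefficient
    d = -g_new + beta * d                  # new conjugate direction
    g = g_new
```

On a quadratic this converges in at most two steps; on a real MLP error surface the line search must be done numerically, but the update structure is the same.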
----------------------------------------------------------------------------
NAME
mlptest - MLP classifier test program
SYNOPSIS
mlptest [options] datafile weightfile
DESCRIPTION
This is the companion program of "mlptrain". It tests the
performance of an MLP network and writes out a confusion matrix.
Optionally, it gives the network outputs and sentence-level results (for
speaker recognition).
OPTIONS
-o [file] : Network output file
-a [file] : Accumulating results (sentence)
-n : No bias node
-v : Verbose
EXAMPLE
mlptest -a spkr_te.sen spkr_te.dat spkr_tr.net
The file "spkr_tr.net" contains the weights trained by "mlptrain".
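The confusion matrix written by "mlptest" is the standard construction: one row per true class, one column per predicted class. As an illustration only (hypothetical code, not from the package):

```python
import numpy as np

def confusion_matrix(true_labels, predicted, n_classes):
    """Rows: true class; columns: predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted):
        cm[t, p] += 1
    return cm

# toy example: 5 test vectors, 3 classes, one error for class 0
cm = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], 3)
```

The diagonal holds the correctly classified counts; off-diagonal entries show which classes get confused with which.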
----------------------------------------------------------------------------
NAME
lbglvq - Codebook training program.
SYNOPSIS
lbglvq [options] datafile codebook
DESCRIPTION
This is the general training program for generating a codebook
from given training vectors. One can choose the LBG algorithm, Kohonen's
LVQ, or a training algorithm proposed by [Jialong He]. The codebook trained
by this last algorithm usually gives better performance.
OPTIONS
-a [0.05] alpha
-e [0.10] epsilon
-w [0.40] window size
-c [ 8] No. vectors per class, 2, 4, 8, 16, 32 ...
-l [ 20] LVQ training epoch
-g generating codebook by LBG (no LVQ)
-q retrain the codebook by LVQ (no LBG)
-N New LVQ algorithm by <Jialong He>
-n [1] No. of vectors in group
-v verbose
where -a, -e, -w, -l have the same meaning as in the LVQ package. If
the -g option is specified, no global training algorithm is invoked; -q
means to retrain the codebook without initializing it by the LBG
algorithm.
EXAMPLE
lbglvq -v -g spkr_tr.dat spkr_tr.cod
lbglvq -v -N spkr_tr.dat spkr_tr.gvq
The first example generates a codebook "spkr_tr.cod" by the LBG
algorithm, and the second example trains the codebook by the algorithm
proposed by [Jialong He].
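The LBG part grows a codebook by repeatedly splitting each codeword into a perturbed pair and refining with nearest-neighbour reassignment. A minimal sketch, assuming Euclidean distortion and synthetic two-cluster data (hypothetical code, not the program's source):

```python
import numpy as np

rng = np.random.default_rng(1997)
data = rng.normal(size=(200, 2))
data[100:] += 4.0                      # two well-separated clusters

def lbg(data, size, n_iter=10, eps=0.01):
    """Grow a codebook to `size` codewords by binary splitting + k-means refinement."""
    codebook = data.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        # split every codeword into a slightly perturbed pair
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(n_iter):
            # assign each vector to its nearest codeword
            d = np.linalg.norm(data[:, None] - codebook[None], axis=2)
            nearest = d.argmin(axis=1)
            # move each codeword to the centroid of its cell
            for k in range(len(codebook)):
                members = data[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

cb = lbg(data, size=2)
```

Splitting is why the -c option asks for a power-of-two codebook size.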
----------------------------------------------------------------------------
NAME
nearest - Codebook test program
SYNOPSIS
nearest [options] datafile codebook
DESCRIPTION
This is the companion test program for codebooks trained by the
"lbglvq" program. It writes out a confusion matrix and optionally gives
sentence-level performance for speaker recognition.
OPTIONS
-o [file] distance file name
-a [file] sentence level results
-M Majority voting, [Default accu. distance]
-v verbose
EXAMPLE
nearest -a spkr_te.sen spkr_te.dat spkr_tr.cod
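The two sentence-level decision rules (-M versus the default) can be sketched as follows: accumulate each frame's nearest-codeword distance per class, or let each frame vote for its nearest class. A toy illustration with hypothetical per-class codebooks (not the program's code):

```python
import numpy as np

# Hypothetical per-class codebooks: class label -> array of codewords.
codebooks = {
    0: np.array([[0.0, 0.0], [0.5, 0.5]]),
    1: np.array([[4.0, 4.0], [4.5, 4.5]]),
}

def classify_sentence(vectors, codebooks, majority=False):
    """Accumulated-distance (default) or majority-voting (-M) decision."""
    # per-frame distance to the nearest codeword of each class
    dists = {c: np.linalg.norm(vectors[:, None] - cb[None], axis=2).min(axis=1)
             for c, cb in codebooks.items()}
    if majority:
        # each frame votes for its nearest class
        per_frame = np.array([[dists[c][i] for c in sorted(dists)]
                              for i in range(len(vectors))])
        votes = per_frame.argmin(axis=1)
        return np.bincount(votes).argmax()
    # otherwise the class with the smallest total distance wins
    return min(dists, key=lambda c: dists[c].sum())

vecs = np.array([[0.1, 0.2], [4.2, 3.9], [0.0, 0.4]])
```

With these frames, both rules agree on class 0; they can differ when a few outlier frames carry very large distances.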
----------------------------------------------------------------------------
NAME
gmmtrain - Gaussian mixture model training program
SYNOPSIS
gmmtrain [options] datafile modelfile
DESCRIPTION
This program generates Gaussian mixture models (GMM) from the
given training data and saves the mean and variance vectors in
"modelfile". The GMM can be regarded as an extension of the VQ model.
The default training method is based on the Expectation-Maximization (EM)
algorithm. This algorithm maximizes the likelihood of the training data
for each individual class, which is similar to training a codebook
by the LBG algorithm. You can specify the -L option to use a global
training algorithm I proposed. The idea of this algorithm is the same
as that of training a codebook by the LVQ algorithm.
OPTIONS
-a [0.07] alpha
-w [1.00] window size
-n [ 8] No. Gaussian functions, 2, 4, 8, 16, 32 ...
-l [ 10] training epoch
-L discriminative training, default only EM algorithm
-v verbose
where -a, -w, -l have the same meaning as in the LVQ algorithm
and only take effect with the -L option. Since a splitting procedure is
used to generate the Gaussian components, the model order should be a
power of two.
EXAMPLE
gmmtrain -v -n 2 -L spkr_tr.dat spkr_tr.gmm
This generates a GMM with 2 Gaussian components for each class.
The models are trained by the discriminative training algorithm
I proposed.
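The EM baseline alternates computing responsibilities (E-step) and re-estimating weights, means and variances (M-step). A sketch under simplifying assumptions (1-D features, two components; hypothetical code, not the program's source):

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical 1-D training data drawn from two components
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 0.5, 300)])

def em_gmm(x, n_comp=2, n_iter=50):
    """Plain EM for a 1-D GMM."""
    mu = np.quantile(x, np.linspace(0.25, 0.75, n_comp))   # spread initial means
    var = np.full(n_comp, x.var())
    w = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        dens = (w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances from responsibilities
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

w, mu, var = em_gmm(x)
```

Each iteration is guaranteed not to decrease the training-data likelihood, which is the per-class criterion the default mode optimizes.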
----------------------------------------------------------------------------
NAME
gmmtest - test Gaussian mixture model
SYNOPSIS
gmmtest [options] datafile meanfile variancefile
DESCRIPTION
This is the companion program of "gmmtrain". It tests the
performance of a GMM and writes out a confusion matrix. Optionally,
it gives the likelihood for each test vector and the sentence-level
performance (for speaker recognition).
OPTIONS
-o [file] likelihood of each model
-a [file] sentence level performance
-v verbose
EXAMPLE
gmmtest -a spkr_te.sen spkr_te.dat spkr_tr.gmm
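Sentence-level scoring (-a) amounts to accumulating per-frame log-likelihoods under each model and picking the best total. A toy sketch with single-Gaussian models standing in for full GMMs (hypothetical code):

```python
import numpy as np

# Hypothetical single-Gaussian "models" per class: label -> (mean, variance).
models = {0: (np.array([0.0]), np.array([1.0])),
          1: (np.array([4.0]), np.array([1.0]))}

def log_likelihood(frames, mean, var):
    """Per-frame Gaussian log-density, summed over feature dimensions."""
    return (-0.5 * np.log(2 * np.pi * var)
            - 0.5 * (frames - mean) ** 2 / var).sum(axis=1)

def score_sentence(frames, models):
    """Accumulate per-frame log-likelihoods; the highest total wins."""
    totals = {c: log_likelihood(frames, m, v).sum()
              for c, (m, v) in models.items()}
    return max(totals, key=totals.get)

frames = np.array([[0.3], [-0.2], [0.5]])
```

Summing log-likelihoods over frames is numerically safer than multiplying raw likelihoods, which underflow quickly for long sentences.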
----------------------------------------------------------------------------
NAME
search - Select effective features using the SFS or SBS method.
SYNOPSIS
search [options] codefile traindata testdata
DESCRIPTION
This program implements the classical sequential forward search
(SFS) and sequential backward search (SBS) schemes. Unlike other
dimension-reduction methods, it directly uses the vector classification
rate as the criterion, which is consistent with the goal of the task.
OPTIONS
-o search log file name [stdout]
-d [10] result feature dimension
-S [1997] random seed
-p Preselect features [None]
-B SBS search [Default SFS]
EXAMPLE
search -o results.log -d 5 spkr_tr.cod spkr_tr.dat spkr_te.dat
This example selects 5 features using the SFS method; the feature
ordering result is written to "results.log".
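Sequential forward search greedily adds, at each round, the single feature that most improves the classification rate. A sketch using a nearest-centroid classifier on synthetic data (hypothetical; for brevity it scores on the training data, whereas the real program uses a separate test set):

```python
import numpy as np

rng = np.random.default_rng(1997)
n = 100
labels = rng.integers(0, 2, n)
# hypothetical data: feature 0 is informative, features 1-3 are pure noise
data = rng.normal(size=(n, 4))
data[:, 0] += 3.0 * labels

def accuracy(x, y):
    """Nearest-class-centroid classification rate."""
    cents = np.array([x[y == c].mean(axis=0) for c in (0, 1)])
    pred = np.linalg.norm(x[:, None] - cents[None], axis=2).argmin(axis=1)
    return (pred == y).mean()

def sfs(data, labels, target_dim):
    """Greedily grow the feature set by classification rate."""
    chosen, remaining = [], list(range(data.shape[1]))
    while len(chosen) < target_dim:
        scores = [(accuracy(data[:, chosen + [f]], labels), f)
                  for f in remaining]
        best = max(scores)[1]          # feature with the best rate
        chosen.append(best)
        remaining.remove(best)
    return chosen

selected = sfs(data, labels, target_dim=2)
```

SBS works the same way in reverse: start from all features and repeatedly drop the one whose removal hurts the rate least.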
----------------------------------------------------------------------------
NAME
cepstrum - Extracting features from a speech signal
SYNOPSIS
cepstrum [options] speechfile
DESCRIPTION
This program can extract various features from speech signals. If
more than one feature option is specified, the feature vectors are
concatenated. The features are: (1) LPC coefficients; (2) LPCC;
(3) PARCOR; (4) MFCC; (5) residual cepstrum (proposed by Jialong He);
(6) pitch period. If the -V option is specified, only voiced segments are
used.
OPTIONS
-o feature vector file [stdout]
-b [512] starting sample
-w [256] window size
-s [128] window moving step
-L LPC [off]
-p [10] LPC order
-C LPCC [off]
-n [10] LPCC order
-M MFCC [off]
-r [10] MFCC order
-R RCEP [off]
-q [2] RCEP order
-P PARCOR [off]
-f pitch period [off]
-V Voiced segments only [all]
-t [0] Tolerance, 0 adaptive
-S *not* swap byte order
-l Label for this class
EXAMPLE
cepstrum -r 16 -M -V -l 1 foo.wav > foo.dat
This calculates 16 MFCC coefficients from each voiced frame.
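All of the listed features start from the same front end: the signal is cut into overlapping, windowed analysis frames (the -w and -s options above). A minimal framing sketch (hypothetical code; a Hamming window is assumed, the document does not state which window the program uses):

```python
import numpy as np

def frames(signal, win=256, step=128):
    """Slice a signal into overlapping Hamming-windowed analysis frames."""
    window = np.hamming(win)
    starts = range(0, len(signal) - win + 1, step)
    return np.array([signal[s:s + win] * window for s in starts])

# hypothetical 1-second signal sampled at 8 kHz
sig = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
f = frames(sig)
```

Each row of the result is one analysis frame; LPC, MFCC and the other features are then computed per frame.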
----------------------------------------------------------------------------
NAME
randline - Randomizing sequence in the input file
SYNOPSIS
randline [options] inputfile
DESCRIPTION
This utility randomizes the sequence (by row, by default) of the
input data. You can treat several lines as one block and randomize the
sequence of blocks. Some procedures, such as the on-line BP algorithm,
need randomly selected input vectors. Note that none of my programs
need this; they have a built-in method.
OPTIONS
-o [stdout] output file
-b [1] number of lines as a block
-l [], label in the first line, used for LVQ format
EXAMPLE
randline spkr_tr.dat > spkr_tr.ran
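The block option (-b) keeps groups of consecutive lines together and shuffles only the group order. A sketch of that behaviour (hypothetical code, not the utility's source):

```python
import random

def randomize_blocks(lines, block=1, seed=1997):
    """Group lines into fixed-size blocks and shuffle the block order."""
    blocks = [lines[i:i + block] for i in range(0, len(lines), block)]
    random.Random(seed).shuffle(blocks)
    return [line for b in blocks for line in b]

shuffled = randomize_blocks(["a", "b", "c", "d"], block=2)
```

With block=2, "a"/"b" and "c"/"d" always stay adjacent; only the order of the two pairs changes.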
----------------------------------------------------------------------------
NAME
bin2asc - Dump binary data in ASCII format.
SYNOPSIS
bin2asc [options]
DESCRIPTION
This utility dumps binary data in a human-readable format. Some
data (such as speech samples or weight files) may be stored in binary
format. You can read them with this utility.
OPTIONS
-i input file [stdin]
-o output file [stdout]
-b [0] Starting byte
-e [3000000] Number of items, default all data
-n [10] ASCII items per line
-t [3] binary file type
0: byte, 1: short, 2: long, 3: float, 4: double
-h display this help message
By default, it reads from "stdin" and writes to "stdout".
EXAMPLE
bin2asc -t 1 -e 100 -n 1 -i speech.dat
This example translates 100 speech samples (short integer) to ASCII
format and writes one sample per line.
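The decoding step can be reproduced with Python's struct module; the sketch below (hypothetical, not the utility's source) mirrors the -t and -n behaviour for short integers:

```python
import struct

def bin2asc(raw, fmt="h", per_line=1, count=None):
    """Decode raw bytes as fixed-size items ('h' = 16-bit short) and format as text."""
    size = struct.calcsize(fmt)
    n = len(raw) // size if count is None else count
    items = struct.unpack(f"{n}{fmt}", raw[:n * size])
    lines = [" ".join(str(v) for v in items[i:i + per_line])
             for i in range(0, len(items), per_line)]
    return "\n".join(lines)

# four 16-bit samples packed to bytes, then dumped two per line
raw = struct.pack("4h", 10, -20, 30, -40)
text = bin2asc(raw, fmt="h", per_line=2)
```

The other -t types map to struct format characters the same way ('b' byte, 'l' long, 'f' float, 'd' double).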