usages.doc — Jialong He's Speaker Recognition Toolkit

----------------------------------------------------------------------------
NAME
	mlptrain - Train a multi-layer perceptron network by
conjugate gradient descent with a line-searching algorithm

SYNOPSIS
	mlptrain [options] configfile datafile weightfile

DESCRIPTION
	This is the training program for standard feed-forward MLP
networks. It uses conjugate gradient descent with a line-searching
algorithm. This algorithm is usually able to locate a minimum of the
error function much faster (by an order of magnitude) than the standard
BP algorithm. Part of the code is adapted from the OGI speech tool.
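As a rough illustration of the training idea, here is a minimal Python/NumPy sketch of conjugate gradient descent with a backtracking line search, applied to a toy quadratic standing in for the MLP error function. It is not the program's actual implementation; the function names, tolerances, and line-search constants are invented for this example.

```python
import numpy as np

def cg_minimize(f, grad, w0, epochs=100, tol=1e-8):
    """Polak-Ribiere conjugate gradient with Armijo backtracking line search."""
    w = np.asarray(w0, dtype=float)
    g = grad(w)
    d = -g                                   # start with steepest descent
    for _ in range(epochs):
        step, fw = 1.0, f(w)
        # backtracking line search along d (Armijo sufficient-decrease test)
        while f(w + step * d) > fw + 1e-4 * step * g.dot(d):
            step *= 0.5
            if step < 1e-12:
                break
        w = w + step * d
        g_new = grad(w)
        if np.linalg.norm(g_new) < tol:
            break
        beta = max(0.0, g_new.dot(g_new - g) / g.dot(g))   # Polak-Ribiere
        d = -g_new + beta * d
        if g_new.dot(d) > 0:                 # not a descent direction: restart
            d = -g_new
        g = g_new
    return w

# toy "error function" standing in for the MLP error surface
f = lambda w: (w[0] - 3) ** 2 + 10 * (w[1] + 1) ** 2
grad = lambda w: np.array([2 * (w[0] - 3), 20 * (w[1] + 1)])
w = cg_minimize(f, grad, np.zeros(2))        # converges near (3, -1)
```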

OPTIONS
	-i  [ 100]    : Weight file dump cycle
	-S  [1997]    : Random seed
	-l  [1000]    : Maximum epoch
	-r            : Retrain from the old weight file
	-n            : No input bias node
	-v            : Verbose

EXAMPLES
        mlptrain -v 3layer.cfg spkr_tr.dat spkr_tr.net

The config file "3layer.cfg" is an ASCII file that looks like this:
3
16 30 10

The first line gives the total number of node layers, followed by
the number of nodes in each layer. The example will train an MLP
network (16 input, 30 hidden and 10 output nodes) and save the weights
in "spkr_tr.net".
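For clarity, parsing this two-line config format can be sketched as follows (a hypothetical helper written for this document, not part of the toolkit):

```python
def parse_mlp_config(text):
    """Parse the two-line config: line 1 = number of layers,
    line 2 = node counts per layer."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_layers = int(lines[0])
    sizes = [int(tok) for tok in lines[1].split()]
    assert len(sizes) == n_layers, "layer count mismatch"
    return sizes

sizes = parse_mlp_config("3\n16 30 10\n")    # -> [16, 30, 10]
```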


----------------------------------------------------------------------------
NAME
	mlptest - MLP classifier test program

SYNOPSIS
	mlptest [options] datafile weightfile

DESCRIPTION
	This is the companion program of "mlptrain". It tests the
performance of an MLP network and writes out a confusion matrix.
Optionally, it gives the network outputs and sentence-level results (for
speaker recognition).
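The confusion matrix that this kind of test writes can be illustrated with a small sketch (the layout here, rows = true class and columns = predicted class, is an assumption of the example; the program's exact output layout may differ):

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm

cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], 2)
# -> [[1, 1],
#     [0, 2]]   one class-0 vector was misclassified as class 1
```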

OPTIONS
	-o [file] 	: Network output file
	-a [file]	: Accumulating results (sentence)
	-n		: No bias node
	-v		: Verbose

EXAMPLE
        mlptest -a spkr_te.sen spkr_te.dat spkr_tr.net
The file "spkr_tr.net" contains the weights trained by "mlptrain".

----------------------------------------------------------------------------
NAME
	lbglvq - Codebook training program.

SYNOPSIS
	lbglvq [options] datafile codebook

DESCRIPTION
	This is the general training program to generate a codebook from
given training vectors. One can choose the LBG algorithm, Kohonen's
LVQ, or a training algorithm proposed by [Jialong He]. The codebook trained
by the latter algorithm usually gives better performance.

OPTIONS
	-a [0.05] alpha
	-e [0.10] epsilon
	-w [0.40] window size
	-c [   8] No. vectors per class, 2, 4, 8, 16, 32 ...
	-l [  20] LVQ training epoch
	-g generating codebook by LBG (no LVQ)
	-q retrain the codebook by LVQ (no LBG)
	-N New LVQ algorithm by <Jialong He>
	-n [1] No. of vectors in group
	-v verbose

where -a, -e, -w, -l have the same meaning as in the LVQ package. If
the -g option is specified, no LVQ training is invoked; -q
retrains the codebook without initializing it by the LBG
algorithm.

EXAMPLE
        lbglvq -v -g spkr_tr.dat spkr_tr.cod
        lbglvq -v -N spkr_tr.dat spkr_tr.gvq

The first example generates a codebook "spkr_tr.cod" with the LBG
algorithm, and the second example trains the codebook with the algorithm
proposed by [Jialong He].
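A minimal sketch of LBG codebook generation by splitting (Python/NumPy, illustrating the general method rather than this program's code; the perturbation factor and iteration counts are assumptions of the example):

```python
import numpy as np

def lbg(data, n_codes, n_iter=20, eps=0.01):
    """LBG: grow the codebook by splitting, then refine with k-means.
    n_codes is assumed to be a power of two."""
    codebook = data.mean(axis=0, keepdims=True)
    while len(codebook) < n_codes:
        # split each codeword into a perturbed pair
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(n_iter):
            # assign each training vector to its nearest codeword
            d = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            # move each codeword to the centroid of its cell
            for k in range(len(codebook)):
                cell = data[nearest == k]
                if len(cell):
                    codebook[k] = cell.mean(axis=0)
    return codebook

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, .1, (50, 2)), rng.normal(5, .1, (50, 2))])
cb = lbg(data, 2)            # one codeword near each cluster
```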


----------------------------------------------------------------------------
NAME
	nearest - Codebook test program

SYNOPSIS
	nearest [options] datafile codebook

DESCRIPTION
	This is the companion test program for codebooks trained by the
"lbglvq" program. It writes out a confusion matrix and, optionally, gives
sentence-level performance for speaker recognition.
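Sentence-level scoring by accumulated distance can be sketched as follows (an illustration assuming the default accumulated-distance rule; the -M majority-voting variant would instead count per-frame nearest-class wins):

```python
import numpy as np

def classify_sentence(frames, codebooks):
    """Accumulate each frame's nearest-codeword distance per class;
    the class with the smallest total distance wins."""
    totals = []
    for cb in codebooks:
        d = ((frames[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        totals.append(d.min(axis=1).sum())   # best codeword per frame
    return int(np.argmin(totals))

cb_a = np.array([[0.0, 0.0], [1.0, 1.0]])    # codebook for speaker A
cb_b = np.array([[5.0, 5.0], [6.0, 6.0]])    # codebook for speaker B
frames = np.array([[0.1, 0.0], [0.9, 1.1], [0.2, 0.1]])
# these frames lie near speaker A's codewords
```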

OPTIONS
	-o [file]  distance file name
	-a [file]  sentence level results
	-M Majority voting [Default: accumulated distance]
	-v verbose

EXAMPLE
        nearest -a spkr_te.sen spkr_te.dat spkr_tr.cod

----------------------------------------------------------------------------
NAME
	gmmtrain - Gaussian mixture model training program

SYNOPSIS
        gmmtrain [options] datafile modelfile

DESCRIPTION
	This program generates Gaussian mixture models (GMM) from
given training data and saves the mean and variance vectors in
"modelfile". The GMM can be regarded as an extension of the VQ model.
The default training method is based on the Expectation-Maximization (EM)
algorithm. This algorithm maximizes the likelihood of the training data
for individual classes, which is similar to training a codebook
with the LBG algorithm. You can specify the -L option to use a global
training algorithm I proposed. The idea of this algorithm is the same
as that of training a codebook with the LVQ algorithm.
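A minimal 1-D EM sketch for fitting one class's GMM (illustrative Python/NumPy, not this program's code; the quantile-based initialization and iteration count are assumptions of the example):

```python
import numpy as np

def gmm_em(x, n_comp, n_iter=50):
    """Fit a 1-D Gaussian mixture by Expectation-Maximization."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, n_comp))   # spread-out init
    var = np.full(n_comp, x.var())
    pi = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        p = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4, 1, 500), rng.normal(4, 1, 500)])
pi, mu, var = gmm_em(x, 2)   # recovers the two modes near -4 and 4
```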
 
OPTIONS
	-a  [0.07] alpha
        -w  [1.00] window size
        -n  [   8] No. Gaussian functions, 2, 4, 8, 16, 32 ...
        -l  [  10] training epoch
	-L discriminative training, default only EM algorithm
	-v verbose

where -a, -w, -l have the same meaning as in the LVQ algorithm
and only have an effect with the -L option. Since I use a splitting
procedure to generate Gaussian components, the model's order should be a
power of two.

EXAMPLE
        gmmtrain -v -n 2 -L spkr_tr.dat spkr_tr.gmm

This generates a GMM with 2 Gaussian components for each class.
The models will be trained by the discriminative training algorithm
I proposed.


----------------------------------------------------------------------------
NAME
	gmmtest - test Gaussian mixture model

SYNOPSIS
	gmmtest [options] datafile meanfile variancefile

DESCRIPTION
	This is the companion program of "gmmtrain". It tests the
performance of a GMM and writes out a confusion matrix. Optionally,
it gives the likelihood for each test vector and sentence-level
performance (for speaker recognition).

OPTIONS
	-o [file] likelihood of each model
	-a [file] sentence level performance
	-v verbose

EXAMPLE
        gmmtest -a spkr_te.sen spkr_te.dat spkr_tr.gmm



----------------------------------------------------------------------------
NAME
        search - Select effective features using SFS or SBS method.

SYNOPSIS
	search [options] codefile traindata testdata

DESCRIPTION
	This program implements the classical sequential forward search
(SFS) and sequential backward search (SBS) schemes. Unlike other
dimensionality reduction methods, it directly uses the classification rate
for vectors as the criterion, which is consistent with the goal of the task.
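The SFS loop can be sketched as follows (illustrative Python/NumPy; the nearest-class-mean scorer is a stand-in for whatever classifier the program actually evaluates, and class labels are assumed to be 0..k-1):

```python
import numpy as np

def sfs(train_x, train_y, test_x, test_y, n_select, score):
    """Sequential forward search: greedily add the feature that
    most improves the classification rate."""
    chosen, remaining = [], list(range(train_x.shape[1]))
    while len(chosen) < n_select:
        best_f, best_s = None, -1.0
        for f in remaining:
            s = score(train_x[:, chosen + [f]], train_y,
                      test_x[:, chosen + [f]], test_y)
            if s > best_s:
                best_f, best_s = f, s
        chosen.append(best_f)
        remaining.remove(best_f)
    return chosen

def nearest_mean_rate(tr_x, tr_y, te_x, te_y):
    """Classification rate of a nearest-class-mean classifier."""
    means = np.array([tr_x[tr_y == c].mean(axis=0) for c in np.unique(tr_y)])
    pred = ((te_x[:, None, :] - means) ** 2).sum(-1).argmin(axis=1)
    return (pred == te_y).mean()

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
x = rng.normal(size=(200, 4))
x[:, 2] += 3 * y             # only feature 2 is informative
order = sfs(x[:100], y[:100], x[100:], y[100:], 2, nearest_mean_rate)
# feature 2 is picked first
```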

OPTIONS
	-o search log file name [stdout]
	-d [10] result feature dimension
	-S [1997] random seed
	-p Preselect features [None]
	-B SBS search [Default SFS]

EXAMPLE
        search -o results.log -d 5 spkr_tr.cod spkr_tr.dat spkr_te.dat

This example selects 5 features using the SFS method; the ordering
result is written to "results.log".



----------------------------------------------------------------------------
NAME
	cepstrum - Extracting features from a speech signal

SYNOPSIS
	cepstrum [options] speechfile

DESCRIPTION
	This program can extract many features from speech signals. If
more than one feature option is specified, the feature vectors are
concatenated. The features are: (1) LPC coefficients; (2) LPCC;
(3) PARCOR; (4) MFCC; (5) Residual cepstrum (proposed by Jialong He);
(6) pitch period. If the -V option is specified, only voiced segments are used.
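The MFCC branch can be illustrated with a compact sketch (Python/NumPy; the sample rate, filterbank size, and other defaults here are assumptions of this example, not necessarily what cepstrum uses):

```python
import numpy as np

def mfcc(signal, sr=8000, win=256, step=128, n_mel=20, n_ceps=10):
    """MFCC sketch: frame -> Hamming window -> power spectrum ->
    mel filterbank -> log -> DCT."""
    # triangular filterbank, equally spaced on the mel scale
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    imel = lambda m: 700 * (10 ** (m / 2595) - 1)
    edges = imel(np.linspace(0, mel(sr / 2), n_mel + 2))
    bins = np.floor((win + 1) * edges / sr).astype(int)
    fbank = np.zeros((n_mel, win // 2 + 1))
    for i in range(n_mel):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    frames = []
    for start in range(0, len(signal) - win + 1, step):
        frame = signal[start:start + win] * np.hamming(win)
        power = np.abs(np.fft.rfft(frame)) ** 2
        logmel = np.log(fbank @ power + 1e-10)
        # DCT-II keeps the first n_ceps cepstral coefficients
        k = np.arange(n_ceps)[:, None]
        ceps = (logmel * np.cos(np.pi * k * (np.arange(n_mel) + 0.5) / n_mel)).sum(1)
        frames.append(ceps)
    return np.array(frames)

t = np.arange(8000) / 8000.0
feats = mfcc(np.sin(2 * np.pi * 440 * t))    # one second of a 440 Hz tone
```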

OPTIONS

	-o feature vector file [stdout]
	-b [512] starting sample
	-w [256] window size
	-s [128] window moving step
	-L LPC     [off]
	-p [10] LPC order
	-C LPCC    [off]
	-n [10] LPCC order
	-M MFCC    [off]
	-r [10] MFCC order
	-R RCEP    [off]
	-q [2] RCEP order
	-P PARCOR  [off]
	-f pitch period [off]
	-V Voiced segments only [all]
	-t [0] Tolerance, 0 adaptive
	-S *not* swap byte order
	-l Label for this class

EXAMPLE
	cepstrum -r 16 -M -V -l 1 foo.wav > foo.dat

This calculates 16 MFCC coefficients from each voiced frame. 


----------------------------------------------------------------------------
NAME
	randline - Randomize the line order of the input file

SYNOPSIS
	randline [options] inputfile

DESCRIPTION
	This utility randomizes the order (by default, of rows) of input
data. You can also treat several lines as a block and randomize the
order of blocks. Some procedures, such as the on-line BP algorithm,
need randomly selected input vectors. Note that none of my programs
need this; they have a built-in method.
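The block-shuffling behavior can be sketched as follows (a hypothetical helper, not the program's code):

```python
import random

def randomize_blocks(lines, block=1, seed=None):
    """Group consecutive lines into fixed-size blocks and shuffle
    the order of the blocks, keeping each block's lines together."""
    blocks = [lines[i:i + block] for i in range(0, len(lines), block)]
    random.Random(seed).shuffle(blocks)
    return [ln for b in blocks for ln in b]

lines = ["a", "b", "c", "d", "e", "f"]
shuffled = randomize_blocks(lines, block=2, seed=7)
# pairs (a,b), (c,d), (e,f) stay adjacent, only their order changes
```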

OPTIONS
	-o [stdout] output file
	-b [1] number of lines as a block
	-l [], label in the first line, used for LVQ format

EXAMPLE
        randline spkr_tr.dat > spkr_tr.ran



----------------------------------------------------------------------------
NAME
	bin2asc - Dump binary data in ASCII format.

SYNOPSIS
	bin2asc [options]

DESCRIPTION
	This utility dumps binary data in a human-readable format. Some
data (such as speech samples or weight files) may be stored in binary
format. You can read them by using this utility.
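The decoding step can be sketched with Python's struct module (the little-endian byte order here is an assumption of this example; the real tool's -S option controls byte swapping):

```python
import struct

def bin2asc(raw, fmt="f", per_line=10, start_byte=0, count=None):
    """Decode raw bytes as a flat array of one numeric type and
    render them as ASCII, a fixed number of items per line."""
    size = struct.calcsize(fmt)
    usable = (len(raw) - start_byte) // size
    n = usable if count is None else min(count, usable)
    items = struct.unpack_from("<%d%s" % (n, fmt), raw, start_byte)
    lines = [" ".join("%g" % v for v in items[i:i + per_line])
             for i in range(0, n, per_line)]
    return "\n".join(lines)

raw = struct.pack("<5h", 1, -2, 300, -400, 5)   # five little-endian shorts
print(bin2asc(raw, fmt="h", per_line=2))
```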

OPTIONS
	-i input file [stdin]
	-o output file [stdout]
	-b [0] Starting byte
	-e [3000000] Number of items, default all data
	-n [10] ASCII items per line
	-t [3] binary file type
	    0: byte, 1: short, 2: long, 3: float, 4: double
	-h display this help message

By default it reads from "stdin" and writes to "stdout".

EXAMPLE
	bin2asc -t 1 -e 100 -n 1 -i speech.dat

This example translates 100 speech samples (short integer) to ASCII 
format and writes one sample per line.
