All utilities use the same data format as the "LVQ-PAK" package, so
you can compare the results with those of Kohonen's LVQ algorithm.
Data format example:
4                      <--- vector dimension
0.2 0.4 0.5 0.1 1      <--- vector data followed by a class ID number;
1.5 0.4 0.3 0.1 2           the class ID is an integer starting from 1
0.8 1.1 1.5 0.4 1
....
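As a rough illustration of the format above, the following Python sketch
reads such data into feature vectors and class IDs. It is not part of this
package: the function name parse_lvq_data is hypothetical, and the sketch
assumes whitespace-separated fields with blank lines ignored.

```python
def parse_lvq_data(text):
    """Parse text in the format shown above.

    Returns (dim, vectors, labels). Assumes whitespace-separated fields,
    one vector per line, with an integer class ID (starting from 1) last.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")
    dim = int(lines[0][0])  # first line holds the vector dimension
    vectors, labels = [], []
    for fields in lines[1:]:
        if len(fields) != dim + 1:
            raise ValueError("expected %d values plus a class ID" % dim)
        vectors.append([float(x) for x in fields[:dim]])
        labels.append(int(fields[dim]))
    return dim, vectors, labels


sample = """4
0.2 0.4 0.5 0.1 1
1.5 0.4 0.3 0.1 2
0.8 1.1 1.5 0.4 1
"""
dim, vectors, labels = parse_lvq_data(sample)
print(dim, len(vectors), labels)  # prints: 4 3 [1, 2, 1]
```

Rejecting lines of the wrong length is a choice made for this sketch; the
utilities in this package may handle malformed input differently.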
The "example" directory contains a demo data set:
---------------------------------
spkr_tr.dat ; training data
spkr_te.dat ; test data
---------------------------------
These data were derived from 10 speakers selected from the TIMIT database.
From each voiced frame (32 ms), I calculated 16 MFCC coefficients.
Some notes about the algorithms:
(1) "mlptrain" uses the conjugate gradient algorithm with line search,
which is an order of magnitude faster than the standard BP algorithm.
(2) In "gmmtrain", in addition to the commonly used EM algorithm, I also
added a discriminative training algorithm that I proposed.
(3) "lbglvq" implements three algorithms: the LBG algorithm,
Kohonen's LVQ algorithm, and a new training algorithm that I proposed.
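For readers unfamiliar with Kohonen's LVQ mentioned in note (3), the basic
LVQ1 update can be sketched in Python as below. This is only the textbook
rule, not the code in "lbglvq" and not the new training algorithm; the
function name lvq1_step and the learning rate alpha are assumptions made
for illustration.

```python
def lvq1_step(codebook, code_labels, x, label, alpha=0.05):
    """Apply one textbook LVQ1 update in place and return the winner index.

    The codeword nearest to x (squared Euclidean distance) is moved toward x
    if its class matches `label`, and away from x otherwise.
    """
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    winner = min(range(len(codebook)), key=lambda i: sqdist(codebook[i], x))
    sign = 1.0 if code_labels[winner] == label else -1.0
    codebook[winner] = [c + sign * alpha * (xi - c)
                        for c, xi in zip(codebook[winner], x)]
    return winner


# Two codewords with class IDs 1 and 2; present one sample of class 1.
codebook = [[0.0, 0.0], [1.0, 1.0]]
code_labels = [1, 2]
w = lvq1_step(codebook, code_labels, [0.2, 0.0], 1, alpha=0.5)
print(w, codebook[0])
```

Here the first codeword wins and, since its class matches, moves halfway
toward the sample; the second codeword is left unchanged.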
For details on how to use this package, refer to "usages.doc" and the demo
program in the "example" directory.
===========================================================================
Purpose of uploading this package
---------------------------------
I am nearly finished with my PhD work and have started to look for a job.
I hope that this package demonstrates my programming capabilities. Besides,
I have also done some excellent research work. The following is my
publication list:
[1] J. He, L. Liu, and G. Palm, "A text-independent speaker
identification system based on neural networks,"
Proc. of International Conference on Spoken Language Processing
(ICSLP'94), pp. 1851-1854, Sept. 1994, Yokohama, Japan.
[2] J. He, L. Liu, and G. Palm, "Perception of stop consonants in
VCV utterances reconstructed from partial Fourier transform
information," Proc. of Australian International Conference on
Speech Science and Technology (SST'94),
pp. 436-441, Nov. 1994, Perth, Australia.
[3] J. He, L. Liu, and G. Palm, "On the use of features from prediction
residual signals in speaker identification,"
Proc. of EUROSPEECH'95, Vol. 1, pp. 313-316, Sept. 1995,
Madrid, Spain.
[4] J. He, L. Liu, and G. Palm, "Speaker identification using hybrid
LVQ-SLP networks," Proc. IEEE ICNN'95, Vol.4, pp. 2052-2055,
Perth, Australia.
[5] J. He, L. Liu, and G. Palm, "On the use of residual cepstrum in
speech recognition," Proc. IEEE ICASSP'96, Vol. 1, pp. 5-8, Atlanta,
USA, 1996.
[6] L. Liu, J. He, and A. Smit, "The importance of phase in the
perception of intervocalic stop consonants,"
J. Acoust. Soc. Am., pp. 2340, 1992.
[7] L. Liu, J. He, and G. Palm, "Perception of stop consonants in
speech signals reconstructed from phase or amplitude,"
J. Acoust. Soc. Am., Vol. 94, pp. 1883, 1993.
[8] L. Liu, J. He, and G. Palm, "The importance of phase in the
perception of intervocalic stop consonants," Proc. of Australian
International Conference on Speech Science and Technology (SST'94),
pp. 442-447, Nov. 1994, Perth, Australia.
[9] L. Liu, J. He, and G. Palm, "Influence of short-time phase on the
perception of stop consonants," Proc. of EUROSPEECH'95, pp. 2269-2272,
Sept. 1995, Madrid, Spain.
[10] L. Liu, J. He, and G. Palm, "Signal modeling for speaker
identification," Proc. IEEE ICASSP'96, Vol. 2, pp. 665-668,
Atlanta, USA.
If you have a job available and need to know more about me,
I am happy to provide further information.
--------------------------------
Jialong He
Abt. Neuroinformatik
University of ULM
89069 ULM, GERMANY
email: jialong@neuro.informatik.uni-ulm.de