<HR>
<H2><A name=fourier>Fourier, DCT and Hartley Transforms</A></H2>
<UL>
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rfft.txt">rfft</A>, and
<A href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/irfft.txt">irfft</A>
perform forward and inverse Fourier transforms on real data. Only half of the
conjugate symmetric transform is generated by the forward routine RFFT. For
even length data, the inverse routine, IRFFT, is asymptotically twice as fast
as the built-in inverse FFT routine IFFT. The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rsfft.txt">rsfft
</A>performs the forward transform on real symmetric data.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rdct.txt">rdct</A>, and
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/irdct.txt">irdct</A>
perform forward and inverse discrete cosine transforms on real data. The
routines are asymptotically twice as fast as the complex-data routines in the
image-processing and signal-processing toolboxes.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rhartley.txt">rhartley
</A>performs a forward or inverse Hartley transform. </LI></UL>
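<P>The statement that only half of the conjugate symmetric transform is needed
can be checked with a minimal sketch. It uses only the built-in FFT and IFFT
functions rather than the VOICEBOX routines, and the signal length and variable
names are illustrative assumptions.</P>
<PRE>
% Sketch only: uses the built-in FFT and IFFT, not the VOICEBOX routines.
n = 8;                          % even length, chosen for illustration
x = randn(n,1);                 % real test signal
X = fft(x);                     % full n-point transform
half = X(1:n/2+1);              % the n/2+1 values that carry all the information
% Rebuild the discarded bins: for zero-based bin k, bin n-k is conj(bin k)
full = [half; conj(half(n/2:-1:2))];
y = real(ifft(full));           % recover the original signal
disp(max(abs(full - X)));       % close to zero
disp(max(abs(y - x)));          % close to zero (rounding error only)
</PRE>
<P>Storing only the first n/2+1 values is what makes it possible to roughly
halve the work for real data.</P>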
<HR>
<H2><A name=random>Random Number Generation</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/randvec.txt">randvec
</A>generates random vectors with a given mean and covariance.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/usasi.txt">usasi
</A>generates noise with a USASI spectrum.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/randfilt.txt">randfilt</A>
generates filtered Gaussian noise without any startup transients. </LI></UL>
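<P>One standard way to draw random vectors with a given mean and covariance is
to colour white Gaussian noise with a Cholesky factor. The sketch below shows
that construction in plain MATLAB; it is not taken from the RANDVEC source, and
the numerical values are illustrative assumptions.</P>
<PRE>
% Sketch only: a standard Cholesky construction, not the RANDVEC source.
m = [1 -2];                     % desired mean (row vector), illustrative values
c = [2 0.6; 0.6 1];             % desired covariance, must be positive definite
n = 10000;                      % number of vectors to draw
r = chol(c);                    % upper triangular factor with r'*r = c
x = randn(n,2)*r + repmat(m,n,1);   % each row is one random vector
disp(mean(x));                  % approaches m as n grows
disp(cov(x));                   % approaches c as n grows
</PRE>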
<HR>
<H2><A name=distance>Vector Distance</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/disteusq.txt">disteusq
</A>calculates the squared Euclidean distance between all pairs of rows of two
matrices.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distitar.txt">distitar</A>,
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distisar.txt">distisar</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distchar.txt">distchar</A>
calculate the Itakura, Itakura-Saito and COSH spectral distances between sets
of AR coefficients.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distitpf.txt">distitpf</A>,
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distispf.txt">distispf</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/distchpf.txt">distchpf</A>
calculate the Itakura, Itakura-Saito and COSH spectral distances between power
spectra. </LI></UL>
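<P>The distance measures listed above can be illustrated with a short sketch in
plain MATLAB. It uses the textbook definitions and is not taken from the
VOICEBOX sources, which may differ in weighting and options. All data values
are illustrative assumptions.</P>
<PRE>
% Sketch only: textbook definitions in plain MATLAB, not the VOICEBOX sources.
x = [0 0; 1 1; 2 0];            % three 2-dimensional row vectors
y = [0 1; 3 4];                 % two 2-dimensional row vectors
nx = size(x,1);
ny = size(y,1);
% |a-b|^2 = |a|^2 + |b|^2 - 2*a*b' evaluated for every pair of rows at once
d = repmat(sum(x.^2,2),1,ny) + repmat(sum(y.^2,2)',nx,1) - 2*x*y';
disp(d);                        % gives [1 25; 1 13; 5 17]

p = [4 2 1 0.5];                % first power spectrum
q = [2 2 2 2];                  % second power spectrum on the same frequency grid
r = p./q;                       % ratio in each frequency bin
dis = mean(r - log(r) - 1);     % Itakura-Saito distance
dit = log(mean(r)) - mean(log(r));   % Itakura distance (independent of gain)
dch = mean((r + 1./r)/2 - 1);   % COSH distance, the symmetrised Itakura-Saito distance
disp([dit dis dch]);
</PRE>
<P>All three spectral distances are zero when the two spectra are identical,
and only the Itakura distance is unchanged when one spectrum is multiplied by a
constant gain.</P>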
<HR>
<H2><A name=analysis>Speech Analysis</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/enframe.txt">enframe</A>
can be used to split a signal up into frames. It can optionally apply a window
to each frame.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/activlev.txt">activlev
</A>calculates the active level of a speech segment according to ITU-T
recommendation P.56.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/spgrambw.txt">spgrambw
</A>draws a monochrome spectrogram with a dB scale.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/schmitt.txt">schmitt
</A>passes a signal through a Schmitt trigger. </LI></UL>
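<P>Splitting a signal into overlapping, windowed frames can be sketched in
plain MATLAB as follows. This is not the ENFRAME source; the frame length, hop
size and choice of a Hamming window are illustrative assumptions.</P>
<PRE>
% Sketch only: plain MATLAB, not the ENFRAME source.
s = (1:20)';                    % toy signal
len = 8;                        % frame length in samples
inc = 4;                        % hop between frame starts (50% overlap)
w = 0.54 - 0.46*cos(2*pi*(0:len-1)/(len-1));   % Hamming window as a row vector
nf = fix((length(s)-len)/inc) + 1;             % number of complete frames
idx = repmat((0:nf-1)'*inc,1,len) + repmat(1:len,nf,1);   % sample index of each frame entry
f = s(idx) .* repmat(w,nf,1);   % one windowed frame per row
disp(size(f));                  % 4 frames of 8 samples
</PRE>
<P>Samples beyond the last complete frame are simply dropped in this
sketch.</P>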
<HR>
<H2><A name=enhance>Speech Enhancement</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/specsubm.txt">specsubm</A>
implements spectral subtraction using an algorithm of Martin. </LI></UL>
<HR>
<H2><A name=lpc>LPC Analysis of Speech</A></H2>
<UL>
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpcauto.txt">lpcauto</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpccovar.txt">lpccovar</A>
perform linear predictive coding (LPC) analysis. The routines relating to LPC
are described in more detail on <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/lpc.html">another page</A>.
A large number of <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/lpc.html">conversion
routines</A> are included for changing the form of the LPC coefficients (e.g.
AR coefficients, reflection coefficients etc.): these are of the form lpcxx2yy
where xx and yy denote the coefficient sets. The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpcrr2am.txt">lpcrr2am
</A>calculates LPC filters for all orders up to a given maximum.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpcbwexp.txt">lpcbwexp
</A>performs bandwidth expansion on an LPC filter.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/ccwarpf.txt">ccwarpf
</A>performs frequency warping in the complex cepstrum domain.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpcifilt.txt">lpcifilt
</A>performs inverse filtering to estimate the glottal waveform from the
speech signal and the LPC coefficients.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lpcrand.txt">lpcrand</A>
can be used to generate random, stable filters for testing purposes. </LI></UL>
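<P>Bandwidth expansion of an LPC filter is usually described as scaling the
k-th AR coefficient by the k-th power of a factor slightly below one, which
moves every pole towards the origin. The sketch below shows that textbook form
in plain MATLAB. It is not taken from the LPCBWEXP source, whose
parameterisation may differ, and the factor g is a hypothetical value.</P>
<PRE>
% Sketch only: the usual textbook form of bandwidth expansion, not the LPCBWEXP source.
ar = [1 -1.6 0.9];              % AR polynomial with poles close to the unit circle
g = 0.95;                       % hypothetical expansion factor between 0 and 1
arx = ar .* g.^(0:length(ar)-1);   % scale coefficient k by g^k
disp(abs(roots(ar)));           % original pole radii
disp(abs(roots(arx)));          % each radius is multiplied by g
</PRE>
<P>Pulling the poles inwards broadens the formant peaks and makes the filter
less sensitive to coefficient quantisation.</P>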
<HR>
<H2><A name=synthesis>Speech Synthesis</A></H2>
<UL>
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/glotros.txt">glotros</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/glotlf.txt">glotlf</A>
implement two common models for the waveform of airflow through the vocal
folds. </LI></UL>
<HR>
<H2><A name=coding>Speech Coding</A></H2>
<UL>
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lin2pcma.txt">lin2pcma</A>,
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/lin2pcmu.txt">lin2pcmu</A>,
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/pcma2lin.txt">pcma2lin</A>,
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/pcmu2lin.txt">pcmu2lin</A>
convert audio waveforms to and from the 8-bit A-law and Mu-law PCM formats
that are used in telecommunications: Mu-law is used in the USA and Japan while
A-law is used in the rest of the world. The two formats are very similar and,
for speech waveforms, give about the same perceived quality as 12-bit linear
encoding. Alternate bits in the A-law format are usually inverted before
transmission: the conversion routines can optionally include this. The
conversions are defined by ITU-T Recommendation G.711.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/kmeans.txt">kmeans
</A>and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/kmeanlbg.txt">kmeanlbg
</A>perform vector quantisation using the k-means algorithm.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/potsband.txt">potsband
</A>calculates a bandpass filter corresponding to the standard telephone
passband. </LI></UL>
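<P>The idea behind Mu-law coding can be illustrated with the continuous
companding curve. Note that G.711 specifies a piecewise-linear 8-bit
approximation to this curve, which the sketch below does not reproduce, and
that it does not use the VOICEBOX conversion routines.</P>
<PRE>
% Sketch only: the continuous mu-law curve, not the G.711 bit-exact coding.
mu = 255;
x = [-1 -0.1 -0.01 0 0.01 0.1 1];          % linear samples scaled to the range -1 to 1
y = sign(x).*log(1+mu*abs(x))/log(1+mu);   % compress
z = sign(y).*((1+mu).^abs(y)-1)/mu;        % expand
disp(y);                                   % small inputs receive a large share of the output range
disp(max(abs(z-x)));                       % close to zero
</PRE>
<P>Because quantisation steps are uniform in y, they are finest for small
amplitudes, which is why 8 bits can approach the perceived quality of a longer
linear word for speech.</P>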
<HR>
<H2><A name=recog>Speech Recognition</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/melcepst.txt">melcepst
</A>implements a mel-cepstrum front end for a recogniser. The associated
bandpass filter matrix is generated by <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/melbankm.txt">melbankm
</A>.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/cep2pow.txt">cep2pow
</A>and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/pow2cep.txt">pow2cep
</A>convert state means and variances between the mel-cepstrum and power
domains.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/gaussmix.txt">gaussmix</A>
fits a Gaussian mixture distribution to a collection of observation vectors.
</LI></UL>
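<P>A mel-cepstrum front end spaces its bandpass filters uniformly on the mel
scale. The sketch below converts between Hz and mel using one widely quoted
formula; that MELBANKM uses exactly this formula and scaling is an assumption
here, and the frequency values are illustrative.</P>
<PRE>
% Sketch only: a widely used mel-scale formula, not the MELBANKM source.
f = [0 500 1000 2000 4000];     % frequencies in Hz
m = 2595*log10(1+f/700);        % Hz to mel
fb = 700*(10.^(m/2595)-1);      % mel back to Hz
disp(m);                        % 1000 Hz maps to approximately 1000 mel
disp(max(abs(fb-f)));           % close to zero
</PRE>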
<HR>
<H2><A name=utility>Utility Functions</A></H2>
<UL>
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/zerotrim.txt">zerotrim
</A>removes from a matrix any trailing rows and columns that are all zero.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/logsum.txt">logsum</A>
calculates log(sum(exp(x))) without overflow problems.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/dualdiag.txt">dualdiag</A>
simultaneously diagonalises two matrices: this is useful in computing LDA or
IMELDA transforms.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/permutes.txt">permutes
</A>and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/choosenk.txt">choosenk
</A>generate respectively all possible permutations of the numbers 1:n and all
possible ways of choosing k elements out of the numbers 1:n without
duplications. They are equivalent to the standard MATLAB routines PERMS and
NCHOOSEK but much faster.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/sprintsi.txt">sprintsi
</A>prints a value with the correct standard SI multiplier (e.g. 2100 prints
as 2.1 k). </LI></UL></BODY></HTML>