function [y,mm]=ewgrpdel(x,w,m)
%EWGRPDEL calculates the energy weighted group delay waveform [Y,MM]=(X,W,M)
% For each sample, x(n), this routine calculates the energy-weighted average
% group delay over frequency using a window centred on x(n). This is equal to
% the delay from the window centre to the centre of gravity of energy in the window.
%
% Inputs:  x  is the input signal
%          w  is the window (or just the length of a Hamming window)
%          m  is the sample of w to use as the centre [default=(length(w)+1)/2]
%
% Outputs: y  is the energy-weighted group delay waveform
%          mm is the actual value of m used. Output point y(i) is based on
%             x(i+m-lw:i+m-1), where lw is the window length.
%
% If lw is odd and m has its default value, then an impulse at x(i) will
% result in a negative-going zero crossing at y(i). More generally, if the
% window is symmetrical it will result in a negative-going zero crossing at
% y(i+m-(lw+1)/2).
%
% Example: x=zeros(1000,1); x(100:100:900)=1; ewgrpdel(x,99);
% Copyright (C) Mike Brookes 2003
% Version: $Id: ewgrpdel.m,v 1.3 2005/02/21 15:22:12 dmb Exp $
%
% VOICEBOX is a MATLAB toolbox for speech processing.
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% ftp://prep.ai.mit.edu/pub/gnu/COPYING-2.0 or by writing to
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
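if nargin < 2
    w=hamming(length(x));           % default: Hamming window as long as the signal
elseif length(w)==1
    w=hamming(w);                   % a scalar w is the length of a Hamming window
end
w=w.^2;                             % squared window weights the signal energy
lw=length(w);
if nargin < 3
    m=(1+lw)/2;                     % default centre is the middle of the window
end
m=max(round(m),1);                  % centre must be a positive integer sample index
mm=m;
wn=w(:).*(m-(1:lw))';               % window weighted by the delay relative to the centre
x2=[x(:); zeros(m-1,1)].^2;         % signal energy, zero-padded by m-1 samples
yn=filter(wn,1,x2);                 % numerator: delay-weighted energy in the window
yd=filter(w,1,x2);                  % denominator: total energy in the window
yd(yd<eps)=1;                       % avoid division by zero where there is no energy
y=yn(m:end)./yd(m:end);             % drop the first m-1 outputs so y(i) is centred on x(i)
if nargout==0
    plot(y);
    hold on;
    plot(x,'r');
    hold off;
end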
************************************************************************************************************************
VOICEBOX: Speech Processing Toolbox for MATLAB
Introduction
VOICEBOX is a speech processing toolbox consisting of MATLAB routines that are maintained and mostly written by Mike Brookes, Department of Electrical & Electronic Engineering, Imperial College, Exhibition Road, London SW7 2BT, UK. Several of the routines require MATLAB V5.
The routines are available as a compressed tar file or as a zip archive and are made available under the terms of the GNU General Public License.
The routine VOICEBOX.M contains various installation-dependent parameters which may need to be altered before using the toolbox. In particular it contains a number of default directory paths indicating where temporary files should be created, where speech data normally resides, etc. See the comments in voicebox.m for a fuller description.
To read compressed SPHERE format files, you will need the SHORTEN program written by Tony Robinson and SoftSound Limited (www.softsound.com). The path to the shorten executable must be set in voicebox.m.
Please send any comments, suggestions, bug reports, etc. to mike.brookes@ic.ac.uk.
--------------------------------------------------------------------------------
Contents
--------------------------------------------------------------------------------
Audio File Input/Output
Read and write WAV and other speech file formats
Frequency Scales
Convert between Hz, Mel, Erb and MIDI frequency scales
Fourier/DCT/Hartley Transforms
Various related transforms
Random Number and Probability Distributions
Generate random vectors and noise signals
Vector Distances
Calculate distances between vector lists
Speech Analysis
Active level estimation, Spectrograms
LPC Analysis of Speech
Linear Predictive Coding routines
Speech Synthesis
Glottal waveform models
Speech Enhancement
Spectral noise subtraction
Speech Coding
PCM coding, Vector quantisation
Speech Recognition
Front-end processing for recognition
Signal Processing
Miscellaneous signal processing functions
Printing and Display Functions
Utilities for printing and graphics
Voicebox Parameters and System Interface
Get or set VOICEBOX and WINDOWS system parameters
Utility Functions
Miscellaneous utility functions
--------------------------------------------------------------------------------
Audio File Input/Output
Routines are available to read and, in some cases, write a variety of file formats:
Read     Write     Suffix  Description
readwav  writewav  .wav    These routines allow an arbitrary number of channels and can deal with linear PCM (any precision up to 32 bits), A-law PCM and Mu-law PCM. Large files can be read and written in small chunks.
readhtk  writehtk  .htk    Read and write waveform and parameter files used by Microsoft's Hidden Markov Model Toolkit (HTK).
readsfs  -         .sfs    Speech Filing System files from Mark Huckvale at UCL.
readsph  -         .sph    NIST Sphere format files (including TIMIT). Needs SHORTEN for compressed files.
readaif  -         .aif    AIFF format (Audio Interchange File Format) used by Mac users.
readcnx  -         .cnx    Connex database files (from BT).
--------------------------------------------------------------------------------
Frequency Scale Conversion
From f    To f      Scale  Description
frq2mel   mel2frq   mel    The mel scale is based on the human perception of sinewave pitch.
frq2erb   erb2frq   erb    The erb scale is based on the equivalent rectangular bandwidths of the human ear.
frq2midi  midi2frq  midi   The MIDI standard specifies a numbering of semitones with middle C being 60. The routines can use either the normal equal-tempered scale or the Pythagorean scale of just intonation. They can also output note names in a character format.
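As an illustration of what these conversions compute, the following sketch uses plain anonymous functions rather than the VOICEBOX routines. It assumes the usual MIDI convention that A4 = 440 Hz is note 69 (which puts middle C at note 60) and the widely quoted mel formula 2595*log10(1+f/700); the constants inside frq2mel may differ, and the helper names here are hypothetical.

```matlab
% Equal-tempered MIDI note numbers, assuming A4 = 440 Hz is note 69
frq_to_midi = @(f) 69 + 12*log2(f/440);
midi_to_frq = @(n) 440 * 2.^((n-69)/12);

% Mel scale using a commonly quoted formula (assumption, not taken from frq2mel)
frq_to_mel = @(f) 2595*log10(1 + f/700);
mel_to_frq = @(m) 700*(10.^(m/2595) - 1);

f = [261.6256 440 1000];            % middle C, A4 and 1 kHz
n = frq_to_midi(f);                 % fractional note numbers are allowed
m = frq_to_mel(f);
fprintf('%8.2f Hz -> MIDI %6.2f, %8.2f mel\n', [f; n; m]);
```

Under these assumptions middle C (about 261.63 Hz) maps to note 60 and 1 kHz maps to roughly 1000 mel.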
--------------------------------------------------------------------------------
Fourier, DCT and Hartley Transforms
The routines rfft and irfft perform forward and inverse Fourier transforms on real data. Only half of the conjugate symmetric transform is generated by the forward routine RFFT. For even length data, the inverse routine, IRFFT, is asymptotically twice as fast as the built-in routine IFFT. The routine rsfft performs the forward transform on real symmetric data.
The routines rdct and irdct perform forward and inverse discrete cosine transforms on real data. The routines are asymptotically twice as fast as the complex-data routines in the image-processing and signal-processing toolboxes.
The routine rhartley performs a forward or inverse Hartley transform.
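The reason half of the transform is enough for real data can be checked with the built-in fft alone. The sketch below does not call rfft or irfft; it only shows, for an even-length real signal, that the first N/2+1 values determine the rest through conjugate symmetry.

```matlab
x = [1 2 3 4 5 6 7 8]';               % real, even-length test signal
N = length(x);
X = fft(x);                           % full transform: N complex values
h = X(1:N/2+1);                       % non-redundant half: DC up to Nyquist
% Rebuild the remaining values from X(k) = conj(X(N+2-k)) (1-based indices)
Xfull = [h; conj(h(end-1:-1:2))];
xr = real(ifft(Xfull));               % recovers the original signal
err = max(abs(xr - x));
fprintf('kept %d of %d values, max reconstruction error = %g\n', length(h), N, err);
```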
--------------------------------------------------------------------------------
Random Numbers and Probability Distributions
The routine randvec generates random vectors with a given mean and covariance.
The routine randiscr generates discrete random values with a specified probability vector.
The routine usasi generates noise with a USASI spectrum.
The routine randfilt generates filtered Gaussian noise without any startup transients.
The routine lognmpdf calculates the pdf of a lognormal distribution.
The routine histndim calculates an n-dimensional histogram (and plots a 2-D one).
The routine gausprod calculates the product of two Gaussian distributions.
--------------------------------------------------------------------------------
Vector Distance
The routine disteusq calculates the squared Euclidean distance between all pairs of rows of two matrices.
The routines distitar, distisar and distchar calculate the Itakura, Itakura-Saito and COSH spectral distances between sets of AR coefficients.
The routines distitpf, distispf and distchpf calculate the Itakura, Itakura-Saito and COSH spectral distances between power spectra.
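For the squared Euclidean case, the all-pairs computation can be sketched in a few lines of core MATLAB. This is only an illustration of the quantity involved, not the disteusq implementation, and it ignores any options that routine may offer.

```matlab
x = [0 0; 1 1];                       % 2 vectors, one per row
y = [1 0; 0 2; 3 4];                  % 3 vectors, one per row
% |x-y|^2 = |x|^2 + |y|^2 - 2*x*y', evaluated for every pair of rows at once
d = bsxfun(@plus, sum(x.^2,2), sum(y.^2,2)') - 2*(x*y');
d = max(d, 0);                        % guard against tiny negative rounding errors
disp(d);                              % d(i,j) is the squared distance from x(i,:) to y(j,:)
```

With these inputs d is the 2-by-3 matrix [1 4 25; 1 2 13].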
--------------------------------------------------------------------------------
Speech Analysis
The routine enframe can be used to split a signal up into frames. It can optionally apply a window to each frame.
The routine ewgrpdel calculates the energy-weighted group delay waveform.
The routine activlev calculates the active level of a speech segment according to ITU-T recommendation P.56.
The routine spgrambw draws a monochrome spectrogram with a dB scale.
The routine schmitt passes a signal through a Schmitt trigger.
The routine txalign finds the best alignment (in a least squares sense) between two sets of time markers (e.g. glottal closure instants).
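To make the energy-weighted group delay concrete, here is a self-contained sketch of the same computation that ewgrpdel performs, applied to a single impulse. It builds the Hamming window from its formula so that no toolbox is needed; the signal length and impulse position are arbitrary choices for the example.

```matlab
lw = 99;                                   % odd window length
n  = (0:lw-1)';
w2 = (0.54 - 0.46*cos(2*pi*n/(lw-1))).^2;  % squared symmetric Hamming window
m  = (lw+1)/2;                             % default centre of the window
x  = zeros(300,1);  p = 150;  x(p) = 1;    % single impulse at sample p

x2 = [x; zeros(m-1,1)].^2;                 % signal energy, zero-padded by m-1 samples
yn = filter(w2.*(m-(1:lw))', 1, x2);       % energy weighted by delay from the centre
yd = filter(w2, 1, x2);                    % total energy seen by the window
yd(yd < eps) = 1;                          % avoid 0/0 where the window sees no energy
y  = yn(m:end) ./ yd(m:end);               % align y(i) with x(i)

disp(y(p-2:p+2)');                         % values 2, 1, 0, -1, -2
```

While the impulse is inside the window, y(i) equals p-i: the output is the distance from the window centre to the impulse, and it passes through zero going negative at i = p.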
--------------------------------------------------------------------------------
LPC Analysis of Speech
The routines lpcauto and lpccovar perform linear predictive coding (LPC) analysis. The routines relating to LPC are described in more detail on another page. A large number of conversion routines are included for changing the form of the LPC coefficients (e.g. AR coefficients, reflection coefficients etc.): these are of the form lpcxx2yy where xx and yy denote the coefficient sets. The routine lpcrr2am calculates LPC filters for all orders up to a given maximum.
The routine lpcbwexp performs bandwidth expansion on an LPC filter.
The routine ccwarpf performs frequency warping in the complex cepstrum domain.
The routine lpcifilt performs inverse filtering to estimate the glottal waveform from the speech signal and the LPC coefficients.
The routine lpcrand can be used to generate random, stable filters for testing purposes.
--------------------------------------------------------------------------------
Speech Synthesis
The routines glotros and glotlf implement two common models for the waveform of airflow through the vocal folds.
--------------------------------------------------------------------------------
Speech Enhancement
The routine specsubm implements spectral subtraction using an algorithm of Martin.
--------------------------------------------------------------------------------
Speech Coding
The routines lin2pcma, lin2pcmu, pcma2lin, and pcmu2lin convert audio waveforms to and from the 8-bit A-law and Mu-law PCM formats that are used in telecommunications: Mu-law is used in the USA and Japan while A-law is used in the rest of the world. The two formats are very similar and, for speech waveforms, give about the same perceived quality as 12-bit linear encoding. Alternate bits in the A-law format are usually inverted before transmission: the conversion routines can optionally include this. The conversions are defined by ITU-T Recommendation G.711.
The routines kmeans and kmeanlbg perform vector quantisation using the kmeans algorithm.
The routine potsband calculates a bandpass filter corresponding to the standard telephone passband.
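The idea behind Mu-law coding can be sketched with the continuous companding curve for mu = 255. This is not the G.711 bit format and not what lin2pcmu computes: G.711 uses a piecewise-linear approximation of this curve and a specific bit layout. The helper names and the simple 7-bit-plus-sign quantiser below are assumptions made for the illustration.

```matlab
mu = 255;
compress = @(x) sign(x) .* log(1 + mu*abs(x)) / log(1 + mu);   % x in [-1, 1]
expand   = @(y) sign(y) .* ((1 + mu).^abs(y) - 1) / mu;        % inverse of compress

x  = [-1 -0.1 -0.001 0 0.001 0.1 1];
q  = round(compress(x) * 127) / 127;   % quantise the companded value (sign + 7 bits)
xr = expand(q);                        % decoded amplitudes
disp([x; xr]);                         % small amplitudes keep a small absolute error
```

Because the curve is roughly logarithmic, the quantisation step is fine for quiet samples and coarse for loud ones, which is why 8 companded bits can sound similar to a longer linear word for speech.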
--------------------------------------------------------------------------------
Speech Recognition
The routine melcepst implements a mel-cepstrum front end for a recogniser. The associated bandpass filter matrix is generated by melbankm.
The routines cep2pow and pow2cep convert state means and variances between the mel-cepstrum and power domains.
The routine gaussmix fits a Gaussian mixture distribution to a collection of observation vectors.
The routine ldatrace performs Linear Discriminant Analysis with optional constraints on the transform matrix.
--------------------------------------------------------------------------------
Signal Processing
findpeaks finds the peaks in a signal
maxfilt performs a running maximum filter
meansqtf calculates the output power of a rational filter with a white noise input
windows generates window functions
zerocros finds the zero crossings of a signal with interpolation
ditherq adds dither and quantizes a signal
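Finding zero crossings with interpolation can be sketched as follows. This is a minimal illustration using linear interpolation between neighbouring samples, not the zerocros implementation; it treats a sample that is exactly zero as positive.

```matlab
x = [2 -2 -1 1 3];
s = x >= 0;                               % sign of each sample (zero counts as positive)
k = find(s(1:end-1) ~= s(2:end));         % a crossing lies between x(k) and x(k+1)
t = k + x(k) ./ (x(k) - x(k+1));          % linearly interpolated crossing positions
disp(t);                                  % 1.5 and 3.5, in units of samples
```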
--------------------------------------------------------------------------------
Printing and Display Functions
figbolden makes the lines on a figure bold and enlarges font sizes for printing clearly
sprintsi prints a value with the correct standard SI multiplier (e.g. 2100 prints as 2.1 k)
bitsprec rounds values to a precision of n bits
frac2bin converts numbers to fixed-point binary strings
--------------------------------------------------------------------------------
Voicebox Parameters and System Interface
voicebox contains a number of installation-dependent global parameters and is likely to need editing for each particular setup.
unixwhich searches the WINDOWS system path for an executable (like the UNIX which command)
winenvar obtains WINDOWS environment variables
--------------------------------------------------------------------------------
Utility Functions
zerotrim removes from a matrix any trailing rows and columns that are all zero.
logsum calculates log(sum(exp(x))) without overflow problems.
dualdiag simultaneously diagonalises two matrices: this is useful in computing LDA or IMELDA transforms.
permutes generates all possible permutations of the numbers 1:n.
choosenk generates all possible ways of choosing k elements out of the numbers 1:n without duplications.
choosrnk generates all possible ways of choosing k elements out of the numbers 1:n with duplications allowed.
rotation generates rotation matrices.
skew3d manipulates 3x3 skew symmetric matrices.
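The overflow problem that logsum avoids can be shown with a short sketch of the standard remedy, which is to subtract the maximum before exponentiating. This illustrates the technique for a single vector; it is not the logsum source, and the variable names are chosen for the example.

```matlab
x = [1000 1000 999];
naive  = log(sum(exp(x)));                % exp(1000) overflows, so this is Inf
m      = max(x);
stable = m + log(sum(exp(x - m)));        % largest term becomes exp(0) = 1
fprintf('naive = %g, stable = %.6f\n', naive, stable);
```

The two expressions are mathematically identical, but only the second stays finite when the elements of x are large.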
************************************************************************************************************************