📄 fft2melmx.m

📁 the convert the audio from the cep unit fft to mel spectrum
💻 M
字号:
function [wts,binfrqs] = fft2melmx(nfft, sr, nfilts, width, minfrq, maxfrq, htkmel, constamp)% wts = fft2melmx(nfft, sr, nfilts, width, minfrq, maxfrq, htkmel, constamp)%      Generate a matrix of weights to combine FFT bins into Mel%      bins.  nfft defines the source FFT size at sampling rate sr.%      Optional nfilts specifies the number of output bands required %      (else one per bark), and width is the constant width of each %      band relative to standard Mel (default 1).%      While wts has nfft columns, the second half are all zero. %      Hence, Mel spectrum is fft2melmx(nfft,sr)*abs(fft(xincols,nfft));%      minfrq is the frequency (in Hz) of the lowest band edge;%      default is 0, but 133.33 is a common standard (to skip LF).%      maxfrq is frequency in Hz of upper edge; default sr/2.%      You can exactly duplicate the mel matrix in Slaney's mfcc.m%      as fft2melmx(512, 8000, 40, 1, 133.33, 6855.5, 0);%      htkmel=1 means use HTK's version of the mel curve, not Slaney's.%      constamp=1 means make integration windows peak at 1, not sum to 1.% 2004-09-05  dpwe@ee.columbia.edu  based on fft2barkmxif nargin < 2;     sr = 8000;      endif nargin < 3;     nfilts = 40;    endif nargin < 4;     width = 1.0;    endif nargin < 5;     minfrq = 0;     end  % default bottom edge at 0if nargin < 6;     maxfrq = sr/2;  end  % default top edge at nyquistif nargin < 7;     htkmel = 0;     endif nargin < 8;     constamp = 0;   endwts = zeros(nfilts, nfft);% Center freqs of each FFT binfftfrqs = [0:nfft-1]/nfft*sr;% 'Center freqs' of mel bands - uniformly spaced between limitsminmel = hz2mel(minfrq, htkmel);maxmel = hz2mel(maxfrq, htkmel);binfrqs = mel2hz(minmel+[0:(nfilts+1)]/(nfilts+1)*(maxmel-minmel), htkmel);binbin = round(binfrqs/sr*(nfft-1));for i = 1:nfilts%  fs = mel2hz(i + [-1 0 1], htkmel);  fs = binfrqs(i+[0 1 2]);  % scale by width  fs = fs(2)+width*(fs - fs(2));  % lower and upper slopes for all bins  loslope = (fftfrqs - fs(1))/(fs(2) - fs(1));  hislope = (fs(3) - fftfrqs)/(fs(3) - fs(2));  % .. then intersect them with each other and zero%  wts(i,:) = 2/(fs(3)-fs(1))*max(0,min(loslope, hislope));  wts(i,:) = max(0,min(loslope, hislope));  % actual algo and weighting in feacalc (more or less)%  wts(i,:) = 0;%  ww = binbin(i+2)-binbin(i);%  usl = binbin(i+1)-binbin(i);%  wts(i,1+binbin(i)+[1:usl]) = 2/ww * [1:usl]/usl;%  dsl = binbin(i+2)-binbin(i+1);%  wts(i,1+binbin(i+1)+[1:(dsl-1)]) = 2/ww * [(dsl-1):-1:1]/dsl;% need to disable weighting below if you use this oneendif (constamp == 0)  % Slaney-style mel is scaled to be approx constant E per channel  wts = diag(2./(binfrqs(2+[1:nfilts])-binfrqs(1:nfilts)))*wts;end% Make sure 2nd half of FFT is zerowts(:,(nfft/2+1):nfft) = 0;% seems like a good idea to avoid aliasing%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%function f = mel2hz(z, htk)%   f = mel2hz(z, htk)%   Convert 'mel scale' frequencies into Hz%   Optional htk = 1 means use the HTK formula%   else use the formula from Slaney's mfcc.m% 2005-04-19 dpwe@ee.columbia.eduif nargin < 2  htk = 0;endif htk == 1  f = 700*(10.^(z/2595)-1);else    f_0 = 0; % 133.33333;  f_sp = 200/3; % 66.66667;  brkfrq = 1000;  brkpt  = (brkfrq - f_0)/f_sp;  % starting mel value for log region  logstep = exp(log(6.4)/27); % the magic 1.0711703 which is the ratio needed to get from 1000 Hz to 6400 Hz in 27 steps, and is *almost* the ratio between 1000 Hz and the preceding linear filter center at 933.33333 Hz (actually 1000/933.33333 = 1.07142857142857 and  exp(log(6.4)/27) = 1.07117028749447)  linpts = (z < brkpt);  f = 0*z;  % fill in parts separately  f(linpts) = f_0 + f_sp*z(linpts);  f(~linpts) = brkfrq*exp(log(logstep)*(z(~linpts)-brkpt));end%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%function z = hz2mel(f,htk)%  z = hz2mel(f,htk)%  Convert frequencies f (in Hz) to mel 'scale'.%  Optional htk = 1 uses the mel axis defined in the HTKBook%  otherwise use Slaney's formula% 2005-04-19 dpwe@ee.columbia.eduif nargin < 2  htk = 0;endif htk == 1  z = 2595 * log10(1+f/700);else  % Mel fn to match Slaney's Auditory Toolbox mfcc.m  f_0 = 0; % 133.33333;  f_sp = 200/3; % 66.66667;  brkfrq = 1000;  brkpt  = (brkfrq - f_0)/f_sp;  % starting mel value for log region  logstep = exp(log(6.4)/27); % the magic 1.0711703 which is the ratio needed to get from 1000 Hz to 6400 Hz in 27 steps, and is *almost* the ratio between 1000 Hz and the preceding linear filter center at 933.33333 Hz (actually 1000/933.33333 = 1.07142857142857 and  exp(log(6.4)/27) = 1.07117028749447)  linpts = (f < brkfrq);  z = 0*f;  % fill in parts separately  z(linpts) = (f(linpts) - f_0)/f_sp;  z(~linpts) = brkpt+(log(f(~linpts)/brkfrq))./log(logstep);end
💿 文件大小 3 K
👤 上传用户 anlan001
📂 所属分类加密解密
📄 代码行数 137 行
💻 语言类型 M
🏷️ 相关标签

#the #spectrum #convert #audio
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -