📄 index.html
字号:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"><html xmlns:mwsh="http://www.mathworks.com/namespace/mcode/v1/syntaxhighlight.dtd"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <!--This HTML is auto-generated from an M-file.To make changes, update the M-file and republish this document. --> <title>Sound Processing with Short Time Fourier Transform</title> <meta name="generator" content="MATLAB 7.4"> <meta name="date" content="2008-10-18"> <meta name="m-file" content="index"> <LINK REL="stylesheet" HREF="style.css" TYPE="text/css"> </head> <body> <div class="content"> <h1>Sound Processing with Short Time Fourier Transform</h1> <introduction> <p>This numerical tour explore local Fourier analysis of sounds, and its application to source denoising.</p> </introduction> <h2>Contents</h2> <div> <ul> <li><a href="#1">Installing toolboxes and setting up the path.</a></li> <li><a href="#8">Local Fourier analysis of sound.</a></li> <li><a href="#18">Short time Fourier transform.</a></li> <li><a href="#23">Audio Denoising</a></li> <li><a href="#30">Audio Block Thresholding</a></li> </ul> </div> <h2>Installing toolboxes and setting up the path.<a name="1"></a></h2> <p>You need to download the <a href="../toolbox_general.zip">general purpose toolbox</a> and the <a href="../toolbox_signal.zip">signal toolbox</a>. </p> <p>You need to unzip these toolboxes in your working directory, so that you have <tt>toolbox_general/</tt> and <tt>toolbox_signal/</tt> in your directory. </p> <p><b>For Scilab user:</b> you must replace the Matlab comment '%' by its Scilab counterpart '//'. </p> <p><b>Recommandation:</b> You should create a text file named for instance <tt>numericaltour.sce</tt> (in Scilabe) or <tt>numericaltour.m</tt> to write all the Scilab/Matlab command you want to execute. Then, simply run <tt>exec('numericaltour.sce');</tt> (in Scilab) or <tt>numericaltour;</tt> (in Matlab) to run the commands. </p> <p>Execute this line only if you are using Matlab.</p><pre class="codeinput">getd = @(p)path(path,p); <span class="comment">% scilab users must *not* execute this</span></pre><p>Then you can add these toolboxes to the path.</p><pre class="codeinput"><span class="comment">% Add some directories to the path</span>getd(<span class="string">'toolbox_signal/'</span>);getd(<span class="string">'toolbox_general/'</span>);</pre><h2>Local Fourier analysis of sound.<a name="8"></a></h2> <p>A sound is a 1D signal that is locally highly oscillating and stationary. A local Fourier analysis is thus usefull to study the property of the sound such as its local amplitude and frequency. </p> <p>First we load a sound, with a slight sub-sampling</p><pre class="codeinput">n = 1024*16;options.n = n;[x,fs] = load_sound(<span class="string">'bird'</span>, n);</pre><p>You can actually play a sound.</p><pre class="codeinput">sound(x,fs);</pre><p>We can display the sound.</p><pre class="codeinput">clf;plot(1:n,x);axis(<span class="string">'tight'</span>);set_graphic_sizes([], 20);title(<span class="string">'Signal'</span>);</pre><img vspace="5" hspace="5" src="index_01.png"> <p><i>Exercice 1:</i> (the solution is <a href="../private/audio_processing/exo1.m">exo1.m</a>) Compute the local Fourier transform around a point <tt>t0</tt> of <tt>x</tt>, which is the FFT (use the function <tt>fft</tt>) of the windowed signal <tt>x.*h</tt> where <tt>h</tt> is smooth windowing function located around <tt>t0</tt>. For instance you can use for <tt>h</tt> a Gaussian bump centered at <tt>t0</tt>. To center the FFT for display, use <tt>fftshift</tt>. </p><pre class="codeinput">exo1;</pre><img vspace="5" hspace="5" src="index_02.png"> <p>A good windowing function should balance both time localization and frequency localization.</p><pre class="codeinput">t = linspace(-10,10,2048);eta = 1e-5;vmin = -2;</pre><p>The block window has a sharp transition and thus a poor frequency localization.</p><pre class="codeinput">h = abs(t)<1;hf = fftshift(abs(fft(h)));hf = log10(eta+hf); hf = hf/max(hf);clf;subplot(2,1,1);title(<span class="string">'Block window'</span>);plot(t, h); axis([-2 2, -.1, 1.1]);subplot(2,1,2);plot(t, hf); axis([-2 2, vmin, 1.1]);title(<span class="string">'Fourier transform'</span>);</pre><img vspace="5" hspace="5" src="index_03.png"> <p>A Hamming window is smoother.</p><pre class="codeinput">h = cos(t*pi/2) .* (abs(t)<1);hf = fftshift(abs(fft(h)));hf = log10(eta+hf); hf = hf/max(hf);clf;subplot(2,1,1);title(<span class="string">'Hamming window'</span>);plot(t, h); axis([-2 2, -.1, 1.1]);subplot(2,1,2);plot(t, hf); axis([-2 2, vmin, 1.1]);title(<span class="string">'Fourier transform'</span>);</pre><img vspace="5" hspace="5" src="index_04.png"> <p>A Haning window has continuous derivatives.</p><pre class="codeinput">h = (cos(t*pi)+1)/2 .* (abs(t)<1);hf = fftshift(abs(fft(h)));hf = log10(eta+hf); hf = hf/max(hf);clf;subplot(2,1,1);title(<span class="string">'Haning window'</span>);plot(t, h); axis([-2 2, -.1, 1.1]);subplot(2,1,2);plot(t, hf); axis([-2 2, vmin, 1.1]);title(<span class="string">'Fourier transform'</span>);</pre><img vspace="5" hspace="5" src="index_05.png"> <p>A normalized Haning window has a sharper transition. It has the advantage of generating a tight frame STFT, and is used in the following. </p><pre class="codeinput">h = sqrt(2)/2 * (1+cos(t*pi)) ./ sqrt( 1+cos(t*pi).^2 ) .* (abs(t)<1);hf = fftshift(abs(fft(h)));hf = log10(eta+hf); hf = hf/max(hf);clf;subplot(2,1,1);title(<span class="string">'Normalized Haning window'</span>);plot(t, h); axis([-2 2, -.1, 1.1]);subplot(2,1,2);plot(t, hf); axis([-2 2, vmin, 1.1]);title(<span class="string">'Fourier transform'</span>);</pre><img vspace="5" hspace="5" src="index_06.png"> <h2>Short time Fourier transform.<a name="18"></a></h2> <p>Gathering a local Fourier transform at equispaced point create a local Fourier transform, also called <b>spectrogram</b>. By carefully chosing the window, this transform corresponds to the decomposition of the signal in a redundant tight frame. The redundancy corresponds to the overlap of the windows, and the tight frame corresponds to the fact that the pseudo-inverse is simply the transposed of the transform (it means that the same window can be used for synthesis with a simple summation of the reconstructed signal over each window). </p> <p>We can compute a spectrogram of the sound to see its local Fourier content. The number of windowed used is <tt>(n-noverlap)/(w-noverlap)</tt></p><pre class="codeinput"><span class="comment">% size of the window</span>w = 64*2;<span class="comment">% overlap of the window</span>q = w/2;S = perform_stft(x,w,q, options);</pre><p>To see more clearly the evolution of the harmonics, we can display the spectrogram in log coordinates. The top of the spectrogram corresponds to low frequencies. </p><pre class="codeinput"><span class="comment">% display the spectrogram</span>clf; imageplot(abs(S)); axis(<span class="string">'on'</span>);<span class="comment">% display log spectrogram</span>plot_spectrogram(S,x);</pre><img vspace="5" hspace="5" src="index_07.png"> <p>The STFT transform is decomposing the signal in a redundant tight frame. This can be checked by measuring the energy conservation.</p><pre class="codeinput"><span class="comment">% energy of the signal</span>e = norm(x,<span class="string">'fro'</span>).^2;<span class="comment">% energy of the coefficients</span>eS = norm(abs(S),<span class="string">'fro'</span>).^2;disp(strcat([<span class="string">'Energy conservation (should be 1)='</span> num2str(e/eS)]));</pre><pre class="codeoutput">Energy conservation (should be 1)=1</pre><p>One can also check that the inverse transform (which is just the transposed operator - it implements exactly the pseudo inverse) is working fine. </p><pre class="codeinput"><span class="comment">% one must give the signal size for the reconstruction</span>x1 = perform_stft(S,w,q, options);disp(strcat([<span class="string">'Reconstruction error (should be 0)='</span> num2str( norm(x-x1, <span class="string">'fro'</span>)./norm(x,<span class="string">'fro'</span>) ) ]));</pre><pre class="codeoutput">Reconstruction error (should be 0)=2.2405e-16</pre><h2>Audio Denoising<a name="23"></a></h2> <p>One can perform denosing by a non-linear thresholding over the transfomede Fourier domain.</p> <p>First we create a noisy signal</p><pre class="codeinput">sigma = .2;xn = x + randn(size(x))*sigma;</pre><p>Play the noisy sound.</p><pre class="codeinput">sound(xn,fs);</pre><p>Display the Sounds.</p><pre class="codeinput">clf;subplot(2,1,1);plot(x); axis([1 n -1.2 1.2]);set_graphic_sizes([], 20);title(<span class="string">'Original signal'</span>);subplot(2,1,2);plot(xn); axis([1 n -1.2 1.2]);set_graphic_sizes([], 20);title(<span class="string">'Noisy signal'</span>);</pre><img vspace="5" hspace="5" src="index_08.png"> <p>One can threshold the spectrogram.</p><pre class="codeinput"><span class="comment">% perform thresholding</span>Sn = perform_stft(xn,w,q, options);SnT = perform_thresholding(Sn, 2*sigma, <span class="string">'hard'</span>);<span class="comment">% display the results</span>subplot(2,1,1);plot_spectrogram(Sn);subplot(2,1,2);plot_spectrogram(SnT);</pre><img vspace="5" hspace="5" src="index_09.png"> <p><i>Exercice 2:</i> (the solution is <a href="../private/audio_processing/exo2.m">exo2.m</a>) A denoising is performed by hard or soft thresholding the STFT of the noisy signal. Compute the denosing SNR with both soft and hard thresholding, and compute the threshold that minimize the SNR. Remember that a soft thresholding should be approximately twice smaller than a hard thresholding. Check the result by listening. What can you conclude about the quality of the denoised signal ? </p><pre class="codeinput">exo2;</pre><img vspace="5" hspace="5" src="index_10.png"> <p><i>Exercice 3:</i> (the solution is <a href="../private/audio_processing/exo3.m">exo3.m</a>) Display and hear the results. What do you notice ? </p><pre class="codeinput">exo3;</pre><img vspace="5" hspace="5" src="index_11.png"> <h2>Audio Block Thresholding<a name="30"></a></h2> <p>It is possible to remove musical noise by thresholding blocks of STFT coefficients.</p> <p>Denoising is performed by block soft thresholding.</p><pre class="codeinput"><span class="comment">% perform thresholding</span>Sn = perform_stft(xn,w,q, options);SnT = perform_thresholding(Sn, sigma, <span class="string">'block'</span>);<span class="comment">% display the results</span>subplot(2,1,1);plot_spectrogram(Sn);subplot(2,1,2);plot_spectrogram(SnT);</pre><img vspace="5" hspace="5" src="index_12.png"> <p><i>Exercice 4:</i> (the solution is <a href="../private/audio_processing/exo4.m">exo4.m</a>) Trie for various block sizes and report the best results. </p><pre class="codeinput">exo4;</pre><img vspace="5" hspace="5" src="index_13.png"> <p class="footer"><br> Copyright ® 2008 Gabriel Peyre<br></p> </div> <!--##### SOURCE BEGIN #####%% Sound Processing with Short Time Fourier Transform% This numerical tour explore local Fourier analysis of sounds, and its% application to source denoising.%% Installing toolboxes and setting up the path.%%% You need to download the % <../toolbox_general.zip general purpose toolbox>% and the <../toolbox_signal.zip signal toolbox>.%%% You need to unzip these toolboxes in your working directory, so% that you have |toolbox_general/| and |toolbox_signal/| in your directory.%%% *For Scilab user:* you must replace the Matlab comment '%' by its Scilab% counterpart '//'.%%% *Recommandation:* You should create a text file named for instance% |numericaltour.sce| (in Scilabe) or |numericaltour.m| to write all the% Scilab/Matlab command you want to execute. Then, simply run% |exec('numericaltour.sce');| (in Scilab) or |numericaltour;| (in Matlab)% to run the commands. %%% Execute this line only if you are using Matlab.getd = @(p)path(path,p); % scilab users must *not* execute this%%% Then you can add these toolboxes to the path.% Add some directories to the pathgetd('toolbox_signal/');getd('toolbox_general/');%% Local Fourier analysis of sound.% A sound is a 1D signal that is locally highly oscillating and stationary.% A local Fourier analysis is thus usefull to study the property of the% sound such as its local amplitude and frequency. %%% First we load a sound, with a slight sub-samplingn = 1024*16;
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -