📄 index.html
字号:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"><html xmlns:mwsh="http://www.mathworks.com/namespace/mcode/v1/syntaxhighlight.dtd"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <!--This HTML is auto-generated from an M-file.To make changes, update the M-file and republish this document. --> <title>Sparse Representation in a Gabor Dictionary</title> <meta name="generator" content="MATLAB 7.4"> <meta name="date" content="2008-10-19"> <meta name="m-file" content="index"> <LINK REL="stylesheet" HREF="style.css" TYPE="text/css"> </head> <body> <div class="content"> <h1>Sparse Representation in a Gabor Dictionary</h1> <introduction> <p>This numerical tour explores the use of L1 optimization to find sparse representation in a redundant Gabor dictionary. It shows application to denoising and stereo separation. </p> </introduction> <h2>Contents</h2> <div> <ul> <li><a href="#1">Installing toolboxes and setting up the path.</a></li> <li><a href="#8">Gabor Tight Frame Transform</a></li> <li><a href="#15">Gabor Tight Frame Denoising</a></li> <li><a href="#20">Basis Pursuit in the Gabor Frame</a></li> <li><a href="#32">Sparsity to Improve Audio Separation</a></li> </ul> </div> <h2>Installing toolboxes and setting up the path.<a name="1"></a></h2> <p>You need to download the <a href="../toolbox_general.zip">general purpose toolbox</a> and the <a href="../toolbox_signal.zip">signal toolbox</a>. </p> <p>You need to unzip these toolboxes in your working directory, so that you have <tt>toolbox_general/</tt> and <tt>toolbox_signal/</tt> in your directory. </p> <p><b>For Scilab user:</b> you must replace the Matlab comment '%' by its Scilab counterpart '//'. </p> <p><b>Recommandation:</b> You should create a text file named for instance <tt>numericaltour.sce</tt> (in Scilabe) or <tt>numericaltour.m</tt> to write all the Scilab/Matlab command you want to execute. Then, simply run <tt>exec('numericaltour.sce');</tt> (in Scilab) or <tt>numericaltour;</tt> (in Matlab) to run the commands. </p> <p>Execute this line only if you are using Matlab.</p><pre class="codeinput">getd = @(p)path(path,p); <span class="comment">% scilab users must *not* execute this</span></pre><p>Then you can add these toolboxes to the path.</p><pre class="codeinput"><span class="comment">% Add some directories to the path</span>getd(<span class="string">'toolbox_signal/'</span>);getd(<span class="string">'toolbox_general/'</span>);</pre><h2>Gabor Tight Frame Transform<a name="8"></a></h2> <p>The Gabor transform is a collection of short time Fourier transforms (STFT) computed with several windows. The redundancy <tt>K*L</tt> of the transform depends on the number <tt>L</tt> of windows used and of the overlapping factor <tt>K</tt> of each STFT. </p> <p>We decide to use a collection of windows with dyadic sizes.</p><pre class="codeinput"><span class="comment">% sizes of the window</span>wlist = 32*[4 8 16 32];L = length(wlist);<span class="comment">% overlap of the window, so that K=2.</span>K = 2;qlist = wlist/K;<span class="comment">% overall redundancy</span>disp( strcat([<span class="string">'Approximate redundancy of the dictionary='</span> num2str(K*L) <span class="string">'.'</span>]) );</pre><pre class="codeoutput">Approximate redundancy of the dictionary=8.</pre><p>We load a sound.</p><pre class="codeinput">n = 1024*32;options.n = n;<span class="comment">% [x0,fs] = load_sound('bird', n);</span>[x0,fs] = load_sound(<span class="string">'glockenspiel'</span>, n);</pre><p>Compute its short time Fourier transform with a collection of windows.</p><pre class="codeinput"><span class="comment">% transform</span>options.multichannel = 0;S = perform_stft(x0,wlist,qlist, options);</pre><p><i>Exercice 1:</i> (the solution is <a href="../private/sparsity_gabor/exo1.m">exo1.m</a>) Compute the true redundancy of the transform. Check that the transform is a tight frame (energy conservation). </p><pre class="codeinput">exo1;</pre><pre class="codeoutput">True redundancy of the dictionary=8.0586.</pre><p>Display the coefficients.</p><pre class="codeinput">plot_spectrogram(S, x0);</pre><img vspace="5" hspace="5" src="index_01.png"> <p>Reconstructs the signal using the inverse Gabor transform.</p><pre class="codeinput"><span class="comment">% inversion</span>x1 = perform_stft(S,wlist,qlist, options);<span class="comment">% error</span>e = norm(x0-x1)/norm(x0);disp(strcat([<span class="string">'Reconstruction error (should be 0) = '</span> num2str(e, 3)]));</pre><pre class="codeoutput">Reconstruction error (should be 0) = 1.6e-16</pre><h2>Gabor Tight Frame Denoising<a name="15"></a></h2> <p>We can perform denoising by thresholding the Gabor representation.</p> <p>We add noise to the signal.</p><pre class="codeinput">sigma = .05;x = x0 + sigma*randn(size(x0));</pre><p>Denoising with soft thresholding. Setting correctly the threshold is quite difficult because of the redundancy of the representation.</p><pre class="codeinput"><span class="comment">% transform</span>S = perform_stft(x,wlist,qlist, options);<span class="comment">% threshold</span>T = sigma;ST = perform_thresholding(S, T, <span class="string">'soft'</span>);<span class="comment">% reconstruct</span>xT = perform_stft(ST,wlist,qlist, options);</pre><p>Display the result.</p><pre class="codeinput">err = snr(x0,xT);clfplot_spectrogram(ST, xT);subplot(length(ST)+1,1,1);title(strcat([<span class="string">'Denoised, SNR='</span> num2str(err,3), <span class="string">'dB'</span>]));</pre><img vspace="5" hspace="5" src="index_02.png"> <p><i>Exercice 2:</i> (the solution is <a href="../private/sparsity_gabor/exo2.m">exo2.m</a>) Find the best threshold, that gives the smallest error. </p><pre class="codeinput">exo2;</pre><img vspace="5" hspace="5" src="index_03.png"> <h2>Basis Pursuit in the Gabor Frame<a name="20"></a></h2> <p>Since the representation is highly redundant, it is possible to improve the quality of the representation using a basis pursuit denoising that optimize the L1 norm of the coefficients. </p> <p>The basis pursuit finds a set of coefficients <tt>S1</tt> by minimizing </p> <p><tt>min_{S1} 1/2*norm(x-x1)^2 + lambda*norm(S1,1) (*)</tt></p> <p>Where <tt>x1</tt> is the signal reconstructed from the Gabor coefficients <tt>S1</tt>. </p> <p>The parameter <tt>lambda</tt> should be optimized to match the noise level. Increasing <tt>lambda</tt> increases the sparsity of the solution, but then the approximation <tt>x1</tt> deviates from the noisy observations <tt>x1</tt>. </p> <p>Basis pursuit denoising <tt>(*)</tt> is solved by iterative thresholding, which iterates between a step of gradient descent, and a step of thresholding. </p> <p>Initialization of <tt>x1</tt> and <tt>S1</tt>. </p><pre class="codeinput">lambda = .1;x1 = x;S1 = perform_stft(x1,wlist,qlist, options);</pre><p>Step 1: gradient descent of <tt>norm(x-x1)^2</tt>. </p><pre class="codeinput"><span class="comment">% residual</span>r = x - x1;Sr = perform_stft(r, wlist, qlist, options);S1 = cell_add(S1, Sr);</pre><p>Step 2: thresholding and update of <tt>x1</tt>. </p><pre class="codeinput"><span class="comment">% threshold</span>S1 = perform_thresholding(S1, lambda, <span class="string">'soft'</span>);<span class="comment">% update the denoised signal</span>x1 = perform_stft(S1,wlist,qlist, options);</pre><p>The difficulty is to set the value of <tt>lambda</tt>. If the basis were orthogonal, it should be set to approximately 3/2*sigma (soft thresholding). Because of the redundancy of the representation in Gabor frame, it should be set to a slightly larger value. </p> <p><i>Exercice 3:</i> (the solution is <a href="../private/sparsity_gabor/exo3.m">exo3.m</a>) Perform the iterative thresholding by progressively decaying the value of <tt>lambda</tt> during the iterations, starting from <tt>lambda=1.5*sigma</tt> until <tt>lambda=.5*sigma</tt>. Retain the solution <tt>xbp</tt> together with the coefficients <tt>Sbp</tt> that provides the smallest error. </p><pre class="codeinput">exo3;</pre><img vspace="5" hspace="5" src="index_04.png"> <p>Display the solution computed by basis pursuit.</p><pre class="codeinput">e = snr(x0,xbp);clfplot_spectrogram(Sbp, xbp);subplot(length(Sbp)+1,1,1);title(strcat([<span class="string">'Denoised, SNR='</span> num2str(e,3), <span class="string">'dB'</span>]));</pre><img vspace="5" hspace="5" src="index_05.png"> <h2>Sparsity to Improve Audio Separation<a name="32"></a></h2> <p>The increase of sparsity produced by L1 minimization is helpful to improve audio stereo separation.</p> <p>Load 3 sounds.</p><pre class="codeinput">n = 1024*32;options.n = n;s = 3; <span class="comment">% number of sound</span>p = 2; <span class="comment">% number of micros</span>options.subsampling = 1;x = zeros(n,3);[x(:,1),fs] = load_sound(<span class="string">'bird'</span>, n, options);[x(:,2),fs] = load_sound(<span class="string">'male'</span>, n, options);[x(:,3),fs] = load_sound(<span class="string">'glockenspiel'</span>, n, options);<span class="comment">% normalize the energy of the signals</span>x = x./repmat(std(x,1), [n 1]);</pre><p>We mix the sound using a <tt>2x3</tt> transformation matrix. Here the direction are well-spaced, but you can try with more complicated mixing matrices. </p><pre class="codeinput"><span class="comment">% compute the mixing matrix</span>theta = linspace(0,pi(),s+1); theta(s+1) = [];theta(1) = .2;M = [cos(theta); sin(theta)];<span class="comment">% compute the mixed sources</span>y = x*M';</pre><p>We transform the stero pair using the multi-channel STFT (each channel is transformed independantly.</p><pre class="codeinput">options.multichannel = 1;S = perform_stft(y, wlist, qlist, options);<span class="comment">% check for reconstruction</span>y1 = perform_stft(S, wlist, qlist, options);disp(strcat([<span class="string">'Reconstruction error (should be 0)='</span> num2str(norm(y-y1,<span class="string">'fro'</span>)/norm(y, <span class="string">'fro'</span>)) <span class="string">'.'</span> ]));</pre><pre class="codeoutput">Reconstruction error (should be 0)=1.5713e-16.</pre><p>Now we perform a multi-channel basis pursuit to find a sparse approximation of the coefficients.</p><pre class="codeinput"><span class="comment">% regularization parameter</span>lambda = .2;<span class="comment">% initialization</span>y1 = y;S1 = S;niter = 100;err = [];<span class="comment">% iterations</span><span class="keyword">for</span> i=1:niter <span class="comment">% progressbar(i,niter);</span> <span class="comment">% gradient</span> r = y - y1; Sr = perform_stft(r, wlist, qlist, options); S1 = cell_add(S1, Sr); <span class="comment">% multi-channel thresholding</span> S1 = perform_thresholding(S1, lambda, <span class="string">'soft-multichannel'</span>); <span class="comment">% update the value of lambda to match noise</span> y1 = perform_stft(S1,wlist,qlist, options);<span class="keyword">end</span></pre><p>Create the point cloud of both the tight frame and the sparse BP coefficients.</p><pre class="codeinput">P1 = []; P = [];<span class="keyword">for</span> i=1:length(S) Si = reshape( S1{i}, [size(S1{i},1)*size(S1{i},2) 2] ); P1 = cat(1, P1, Si); Si = reshape( S{i}, [size(S{i},1)*size(S{i},2) 2] ); P = cat(1, P, Si);<span class="keyword">end</span>P = [real(P);imag(P)];P1 = [real(P1);imag(P1)];</pre><p>Display the two point clouds.</p><pre class="codeinput">p = size(P,1);m = 10000;sel = randperm(p); sel = sel(1:m);clf;subplot(1,2,1);plot( P(sel,1),P(sel,2), <span class="string">'.'</span> );title(<span class="string">'Tight frame coefficients'</span>);axis([-10 10 -10 10]);subplot(1,2,2);plot( P1(sel,1),P1(sel,2), <span class="string">'.'</span> );title(<span class="string">'Basis Pursuit coefficients'</span>);axis([-10 10 -10 10]);</pre><img vspace="5" hspace="5" src="index_06.png"> <p>Compute the angles of the points with largest energy.</p><pre class="codeinput">d = sqrt(sum(P.^2,2));d1 = sqrt(sum(P1.^2,2));I = find( d>.2 );I1 = find( d1>.2 );<span class="comment">% compute angles</span>Theta = mod(atan2(P(I,2),P(I,1)), pi());Theta1 = mod(atan2(P1(I1,2),P1(I1,1)), pi());</pre><p>Compute and display the histogram of angles. We reaint only a small sub-set of most active coefficients.</p><pre class="codeinput"><span class="comment">% compute histograms</span>nbins = 150;[h,t] = hist(Theta, nbins);h = h/sum(h);[h1,t1] = hist(Theta1, nbins);h1 = h1/sum(h1);<span class="comment">% display histograms</span>clf;subplot(2,1,1);bar(t,h); axis(<span class="string">'tight'</span>);set_graphic_sizes([], 20);title(<span class="string">'Tight frame coefficients'</span>);subplot(2,1,2);bar(t1,h1); axis(<span class="string">'tight'</span>);set_graphic_sizes([], 20);title(<span class="string">'Sparse coefficients'</span>);</pre><img vspace="5" hspace="5" src="index_07.png"> <p><i>Exercice 4:</i> (the solution is <a href="../private/sparsity_gabor/exo4.m">exo4.m</a>) Compare the source separation obtained by masking with a tight frame Gabor transform and with the coefficients computed by a basis pursuit sparsification process. </p><pre class="codeinput">exo4;</pre><p class="footer"><br> Copyright ® 2008 Gabriel Peyre<br></p> </div> <!--##### SOURCE BEGIN #####%% Sparse Representation in a Gabor Dictionary% This numerical tour explores the use of L1 optimization to find sparse representation in a redundant Gabor dictionary.% It shows application to denoising and stereo separation.%% Installing toolboxes and setting up the path.%%% You need to download the% <../toolbox_general.zip general purpose toolbox>% and the <../toolbox_signal.zip signal toolbox>.%%% You need to unzip these toolboxes in your working directory, so% that you have |toolbox_general/| and |toolbox_signal/| in your directory.%%% *For Scilab user:* you must replace the Matlab comment '%' by its Scilab% counterpart '//'.%%% *Recommandation:* You should create a text file named for instance% |numericaltour.sce| (in Scilabe) or |numericaltour.m| to write all the% Scilab/Matlab command you want to execute. Then, simply run% |exec('numericaltour.sce');| (in Scilab) or |numericaltour;| (in Matlab)% to run the commands.%%% Execute this line only if you are using Matlab.getd = @(p)path(path,p); % scilab users must *not* execute this%%% Then you can add these toolboxes to the path.% Add some directories to the pathgetd('toolbox_signal/');getd('toolbox_general/');%% Gabor Tight Frame Transform% The Gabor transform is a collection of short time Fourier transforms% (STFT) computed with several windows. The redundancy |K*L| of the transform% depends on the number |L| of windows used and of the overlapping factor |K|% of each STFT.%%% We decide to use a collection of windows with dyadic sizes.% sizes of the windowwlist = 32*[4 8 16 32]; L = length(wlist);% overlap of the window, so that K=2.K = 2;qlist = wlist/K;% overall redundancy disp( strcat(['Approximate redundancy of the dictionary=' num2str(K*L) '.']) );%%% We load a sound.
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -