📄 chap32.htm

📁 程序设计的艺术,此书是英文版的,经久不衰,非常的经典
💻 HTM
📖 第 1 页 / 共 5 页
字号:
For <I>j, k</I> = 0, 1, . . . , <I>n</I> - 1, the (<I>j, k</I>) entry of <img src="790_a.gif">.<P>
<I><B>Proof     </I></B>We show that <img src="790_b.gif">, the <I>n</I> X <I>n</I> identity matrix. Consider the (<I>j, j'</I>) entry of <img src="790_c.gif">:<P>
<img src="790_d.gif"><P>
This summation equals 1 if <I>j</I>' = <I>j</I>, and it is 0 otherwise by the summation lemma (Lemma 32.6). Note that we rely on -(<I>n</I> - 1) &lt; <I>j</I>' - <I>j</I> &lt; <I>n</I> - l, so that <I>j</I>' - <I>j</I> is not divisible by <I>n</I>, in order for the summation lemma to apply.      <P>
Given the inverse matrix <img src="790_e.gif">, we have that <img src="790_f.gif"> is given by<P>
<img src="790_g.gif"><P>
<h4><a name="0995_1b1f">(32.11)<a name="0995_1b1f"></sub></sup></h4><P>
for <I>j</I> = 0,1, . . . , <I>n</I> - 1. By comparing equations (32.8) and (32.11), we see that by modifying the FFT algorithm to switch the roles of <I>a</I> and <I>y</I>, replace <I>w<SUB>n</I></SUB> by <img src="790_h.gif">, and divide each element of the result by <I>n</I>, we compute the inverse DFT (see Exercise 32.2-4). Thus, <img src="790_i.gif"> can be computed in <img src="../images/bound.gif">(<I>n </I>lg <I>n</I>) time as well.<P>
Thus, by using the FFT and the inverse FFT, we can transform a polynomial of degree-bound <I>n</I> back and forth between its coefficient representation and a point-value representation in time <img src="../images/bound.gif">(<I>n </I>lg <I>n</I>). In the context of polynomial multiplication, we have shown the following.<P>
<a name="0995_1b20">Theorem 32.8<a name="0995_1b20"><P>
<a name="0995_1b1d">For any two vectors <I>a</I> and <I>b</I> of length <I>n</I>, where <I>n</I> is a power of 2,<P>
<img src="790_j.gif"><P>
where the vectors <I>a</I> and <I>b</I> are padded with 0's to length 2<I>n</I> and <img src="../images/dot10.gif"> denotes the componentwise product of two 2<I>n</I>-element vectors.      <P>
<P>







<h2><a name="0996_1b1f">Exercises<a name="0996_1b1f"></h2><P>
<a name="0996_1b20">32.2-1<a name="0996_1b20"><P>
Prove Corollary 32.4.<P>
<a name="0996_1b21">32.2-2<a name="0996_1b21"><P>
Compute the DFT of the vector (0, 1, 2, 3).<P>
<a name="0996_1b22">32.2-3<a name="0996_1b22"><P>
Do Exercise 32.1-1 by using the <img src="../images/bound.gif">(<I>n </I>lg <I>n</I>)-time scheme.<P>
<a name="0996_1b23">32.2-4<a name="0996_1b23"><P>
Write pseudocode to compute <img src="791_a.gif"> in <img src="../images/bound.gif">(<I>n </I>lg <I>n</I>) time.<P>
<a name="0996_1b24">32.2-5<a name="0996_1b24"><P>
Describe the generalization of the FFT procedure to the case in which <I>n</I> is a power of 3. Give a recurrence for the running time, and solve the recurrence.<P>
<a name="0996_1b25">32.2-6<a name="0996_1b25"><P>
Suppose that instead of performing an <I>n</I>-element FFT over the field of complex numbers (where <I>n</I> is even), we use the ring Z<I><SUB>m</I></SUB> of integers modulo <I>m</I>, where <I>m</I> = 2<I><SUP>tn</I>/2</SUP> + 1 and <I>t</I> is an arbitrary positive integer. Use <I>w</I> = 2<I><SUP>t</I></SUP> instead of <I>w<SUB>n</I></SUB> as a principal <I>n</I>th root of unity, modulo <I>m</I>. Prove that the DFT and the inverse DFT are well defined in this system.<P>
<a name="0996_1b26">32.2-7<a name="0996_1b26"><P>
Given a list of values <I>z</I><SUB>0</SUB>, <I>z</I><SUB>1</SUB>, . . . , <I>z<SUB>n-</I>1</SUB> (possibly with repetitions), show how to find the coefficients of the polynomial <I>P</I>(<I>x</I>) of degree-bound <I>n</I> that has zeros only at <I>z</I><SUB>0</SUB>, <I>z</I><SUB>1</SUB>, . . . , <I>z<SUB>n </I>- 1</SUB> (possibly with repetitions). Your procedure should run in time <I>O</I>(<I>n</I> lg<SUP>2</SUP> <I>n</I>). (<I>Hint:</I> The polynomial <I>P</I>(<I>x</I>) has a zero at <I>z<SUB>j</I></SUB> if and only if <I>P</I>(<I>x</I>) is a multiple of (<I>x</I> - <I>z<SUB>j</I></SUB>).)<P>
<a name="0996_1b27">32.2-8<a name="0996_1b27"><P>
<a name="0996_1b1e">The <I><B>chirp transform</I></B> of a vector <I>a</I> = (<I>a</I><SUB>0</SUB>, <I>a</I><SUB>1</SUB>, . . . , <I>a<SUB>n - </I>1</SUB>) is the vector <I>y</I> = (<I>y</I><SUB>0</SUB>, <I>y</I><SUB>1</SUB>, . . . , <I>y<SUB>n - </I>1</SUB>), where <img src="791_b.gif"> and <I>z</I> is any complex number. The DFT is therefore a special case of the chirp transform, obtained by taking <I>z</I> = <I>w<SUB>n</I></SUB>. Prove that the chirp transform can be evaluated in time <I>O</I>(<I>n</I> lg <I>n</I>) for any complex number <I>z</I>. (<I>Hint:</I> Use the equation<P>
<img src="791_c.gif"><P>
to view the chirp transform as a convolution.)<P>
<P>


<P>







<h1><a name="0997_0001">32.3 Efficient FFT implementations<a name="0997_0001"></h1><P>
Since the practical applications of the DFT, such as signal processing, demand the utmost speed, this section examines two efficient FFT implementations. First, we shall examine an iterative version of the FFT algorithm that runs in <img src="../images/bound.gif">(<I>n </I>lg <I>n</I>) time but has a lower constant hidden in the <img src="../images/bound.gif">-notation than the recursive implementation in Section 32.2. Then, we shall use the insights that led us to the iterative implementation to design an efficient parallel FFT circuit.<P>
<img src="792_a.gif"><P>
<h4><a name="0997_0002">Figure 32.3 A butterfly operation. The two input values enter from the left, <img src="792_b.gif"> is multiplied by <img src="792_c.gif">, and the sum and difference are output on the right. The figure can be interpreted as a combinational circuit.<a name="0997_0002"></sub></sup></h4><P>
<img src="792_d.gif"><P>
<h4><a name="0997_0003">Figure 32.4 The tree of input vectors to the recursive calls of the <FONT FACE="Courier New" SIZE=2>RECURSIVE<FONT FACE="Times New Roman" SIZE=2>-FFT procedure. The initial invocation is for n = 8.<a name="0997_0003"></FONT></FONT></sub></sup></h4><P>





<h2>An iterative FFT implementation</h2><P>
<a name="0998_1b1f">We first note that the <B>for</B> loop of lines 10-13 of <FONT FACE="Courier New" SIZE=2>RECURSIVE</FONT>-FFT involves computing the value <img src="792_e.gif"> twice. In compiler terminology, this value is known as a <I><B>common subexpression</I></B>. We can change the loop to compute it only once, storing it in a temporary variable <I>t</I>.<P>
<img src="792_f.gif"><P>
<a name="0998_1b20">The operation in this loop, multiplying <I>w</I> (which is equal to <img src="792_g.gif">, storing the product into <I>t</I>, and adding and subtracting <I>t</I> from <img src="792_h.gif">, is known as a <I><B>butterfly</I></B> <I><B>operation</I></B> and is shown schematically in Figure 32.3.<P>
We now show how to make the FFT algorithm iterative rather than recursive in structure. In Figure 32.4, we have arranged the input vectors to the recursive calls in an invocation of <FONT FACE="Courier New" SIZE=2>RECURSIVE</FONT>-FFT in a tree structure, where the initial call is for <I>n</I> = 8. The tree has one node for each call of the procedure, labeled by the corresponding input vector. Each <FONT FACE="Courier New" SIZE=2>RECURSIVE</FONT>-FFT invocation makes two recursive calls, unless it has received a 1-element vector. We make the first call the left child and the second call the right child.<P>
Looking at the tree, we observe that if we could arrange the elements of the initial vector <I>a</I> into the order in which they appear in the leaves, we could mimic the execution of the R<FONT FACE="Courier New" SIZE=2>ECURSIVE-</FONT>FFT procedure as follows. First, we take the elements in pairs, compute the DFT of each pair using one butterfly operation, and replace the pair with its DFT. The vector then holds <I>n</I>/2 2-element DFT's. Next, we take these <I>n</I>/2 DFT's in pairs and compute the DFT of the four vector elements they come from by executing two butterfly operations, replacing two 2-element DFT's with one 4-element DFT. The vector then holds <I>n</I>/4 4-element DFT's. We<I> </I>continue in this manner until the vector holds two (<I>n</I>/2)-element DFT's, which we can combine using <I>n</I>/2 butterfly operations into the final <I>n-</I>element DFT.<P>
To turn this observation into code, we use an array <I>A</I>[0 <I>. . n</I> - 1] that initially holds the elements of the input vector <I>a</I> in the order in which they appear in the leaves of the tree of Figure 32.4. (We shall show later how to determine this order.) Because the combining has to be done on each level of the tree, we introduce a variable <I>s</I> to count the levels, ranging from 1 (at the bottom, when we are combining pairs to form 2-element DFT's) to lg <I>n</I> (at the top, when we are combining two (<I>n</I>/2)-element DFT's to produce the final result). The algorithm therefore has the following structure:<P>
<pre>1<B>  for</B> <I>s</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> lg <I>n</I></sub></sup></pre><P>
<pre>2<B>       do for</B> <I>k</I> <img src="../images/arrlt12.gif"> 0 <B>to</B> <I>n</I> - 1 <B>by</B> 2<I><SUP>s</I></sub></sup></pre><P>
<pre>3<B>              do</B> combine the two 2<I><SUP>s</I>-1</SUP> -element DFT's in</sub></sup></pre><P>
<pre><I>A</I>[<I>k . . k</I> + 2<I><SUP>s</I>-1</SUP> - 1] and <I>A</I>[<I>k + 2<SUP>s</I>-1</SUP> <I>. . k + </I>2<I><SUP>s</SUP> </I>- 1]</sub></sup></pre><P>
<pre>into one 2<I><SUP>s</I></SUP>-element DFT in <I>A</I>[<I>k . . k + </I>2<I><SUP>s</I></SUP> - 1]</sub></sup></pre><P>
We can express the body of the loop (line 3) as more precise pseudocode. We copy the <B>for</B> loop from the R<FONT FACE="Courier New" SIZE=2>ECURSIVE-</FONT>FFT procedure, identifying <I>y</I><SUP>[0]</SUP> with <I>A</I>[<I>k . . k + 2<SUP>s-</I>1 <FONT FACE="Times New Roman" SIZE=1>_ </FONT></SUP>1] and <I>y</I><SUP>[1]</SUP> with <I>A</I>[<I>k + 2<SUP>s</I>-1</SUP>. . <I>k</I> + 2<I><SUP>s</I></SUP> <SUP><FONT FACE="Times New Roman" SIZE=1>_</FONT></SUP> 1]. The value of <I>w</I> used in each butterfly operation depends on the value of <I>s;</I> we use <I>w<SUB>m</I></SUB>, where <I>m = 2<SUP>s</I></SUP>. (We introduce the variable <I>m</I> solely for the sake of readability.) We introduce another temporary variable <I>u</I> that allows us to perform the butterfly operation in place. When we replace line 3 of the overall structure by the loop body, we get the following pseudocode, which forms the basis of our final iterative FFT algorithm as well as the parallel implementation we shall present later.<P>
<pre><a name="0998_1b21">FFT-BASE(<I>a</I>)</sub></sup></pre><P>
<pre>1<I>  n</I> <img src="../images/arrlt12.gif"> <I>length</I>[<I>a</I>]  <img src="794_a.gif"> <I>n</I> is a power of 2.</sub></sup></pre><P>
<pre>2<B>  for</B> <I>s</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> lg <I>n</I></sub></sup></pre><P>
<pre>3     <B>  do </B><I>m</I> <img src="../images/arrlt12.gif"> 2<I><SUP>s</I></sub></sup></pre><P>
<pre>4<B>          </B><I>w<SUB>m</I></SUB> <img src="../images/arrlt12.gif"> <I>e</I><SUP>2</SUP><img src="../images/piuc.gif"><I>i</I>/<I>m</I></sub></sup></pre><P>
<pre>5          <B>for</B> <I>k</I> <img src="../images/arrlt12.gif"> 0 <B>to</B> <I>n</I> - 1 <B>by</B> <I>m</I></sub></sup></pre><P>
<pre>6<B>              do</B> <I>w</I> <img src="../images/arrlt12.gif"> 1</sub></sup></pre><P>
<pre>7<B>                 for</B> <I>j</I> <img src="../images/arrlt12.gif"> 0 <B>to</B> <I>m</I>/2 - 1</sub></sup></pre><P>
<pre>8<B>                     do</B> <I>t</I> <img src="../images/arrlt12.gif"> <I>w</I> <I>A</I>[<I>k + j + m</I>/2]</sub></sup></pre><P>
<pre>9                        <I>u</I> <img src="../images/arrlt12.gif"> <I>A</I>[ <I>k + j</I>]</sub></sup></pre><P>
<pre>10<I>                        A</I>[<I>k + j</I>]<I> </I><img src="../images/arrlt12.gif"> u + t</sub></sup></pre><P>
<pre>11<I>                        A</I>[<I>k + j + m /2</I>]<I> </I><img src="../images/arrlt12.gif"> u - t</sub></sup></pre><P>
<pre>12                        <I>w</I> <img src="../images/arrlt12.gif"> <I>ww<SUB>m</I></sub></sup></pre><P>
We now present the final version of our iterative FFT code, which inverts the two inner loops to eliminate some index computation and uses the auxiliary procedure B<FONT FACE="Courier New" SIZE=2>IT-</FONT>R<FONT FACE="Courier New" SIZE=2>EVERSE-</FONT><FONT FACE="Courier New" SIZE=2>COPY</FONT>(<I>a, A</I>) to copy vector <I>a</I> into array <I>A</I> in the initial order in which we need the values.<P>
<pre><a name="0998_1b22">ITERATIVE-FFT(<I>a</I>)</sub></sup></pre><P>
<pre>1  BIT-REVERSE-COPY (<I>a, A</I>)</sub></sup></pre><P>
<pre>2<I>  n</I> <img src="../images/arrlt12.gif"> <I>length</I>[<I>a</I>]      <img src="794_b.gif"><I> n</I> is a power of 2.</sub></sup></pre><P>
<pre>3<B>  for </B><I>s</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> lg <I>n</I></sub></sup></pre><P>
<pre>4<B>        do </B><I>m</I> <img src="../images/arrlt12.gif"> 2<SUP>s</sub></sup></pre><P>
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -