fftalgorithms.tex

来自「开放gsl矩阵运算」· TEX 代码 · 共 2,043 行 · 第 1/5 页
TEX
2,043 行
\sum_{b_0=0}^{1} \sum_{b_1=0}^{1} \dots\sum_{b_{n-3}=0}^{1} W_N^{a[b_{n-3} \dots b_1 b_0]}\sum_{b_{n-2}=0}^{1} W_4^{[a_1 a_0] b_{n-2}}\sum_{b_{n-1}=0}^{1}  g(b) W_2^{a_0 b_{n-1}}.\end{equation}%If we repeat the process for all the remaining bits we obtain asimplified DFT formula which is the basis of the radix-2decimation-in-time algorithm,%\begin{eqnarray}h([a_{n-1} \dots a_1 a_0]) &=& \sum_{b_0=0}^{1} W_{N}^{[a_{n-1} \dots a_1 a_0]b_0} %\sum_{b_1=0}^{1} %W_{N/2}^{[a_{n-1} \dots a_1 a_0]b_1} \dots\sum_{b_{n-2}=0}^{1} W_4^{ [a_1 a_0] b_{n-2}}\sum_{b_{n-1}=0}^{1} W_2^{a_0 b_{n-1}}g(b)\end{eqnarray}%To convert the formula to an algorithm we expand out the sumrecursively, evaluating each of the intermediate summations, which wedenote by $g_1$, $g_2$, \dots, $g_n$,%\begin{eqnarray}g_1(a_0,  b_{n-2}, b_{n-3}, \dots, b_1, b_0) &=& \sum_{b_{n-1}=0}^{1} W_2^{a_0 b_{n-1}}g([b_{n-1}  b_{n-2}  b_{n-3}  \dots  b_1  b_0]) \\g_2(a_0, {}_{\phantom{-2}} a_{1}, b_{n-3}, \dots, b_1, b_0) &=& \sum_{b_{n-2}=0}^{1} W_4^{[a_1 a_0] b_{n-2}}g_1(a_0, b_{n-2}, b_{n-3}, \dots, b_1, b_0) \\g_3(a_0, {}_{\phantom{-2}} a_{1}, {}_{\phantom{-3}} a_{2}, \dots, b_1, b_0) &=& \sum_{b_{n-3}=0}^{1} W_8^{[a_2 a_1 a_0] b_{n-2}}g_2(a_0, a_1, b_{n-3}, \dots, b_1, b_0) \\\dots &=& \dots \\g_n(a_0, a_{1}, a_{2}, \dots, a_{n-2}, a_{n-1}) &=&\sum_{b_{0}=0}^{1} W_N^{[a_{n-1} \dots a_1 a_0]b_0}g_{n-1}(a_0, a_1, a_2, \dots, a_{n-2}, b_0) \end{eqnarray}%After the final sum, we can obtain the transform $h$ from $g_n$,%\begin{equation}h([a_{n-1} \dots a_1 a_0]) = g_n(a_0, a_1, \dots, a_{n-1}) \end{equation}% Note that we left the storage arrangements of the intermediate sumsunspecified by using the bits as function arguments and not as anindex. The storage of intermediate sums is different for thedecimation-in-time and decimation-in-frequency algorithms.Before deciding on the best storage scheme we'll show that the resultsof each stage, $g_1$, $g_2$, \dots, can be carried out {\emin-place}. For example, in the case of $g_1$, the inputs are,%\begin{equation}g([\underline{b_{n-1}} b_{n-2} b_{n-3} \dots b_1 b_0])\end{equation}%for $b_{n-1}=(0,1)$, and the corresponding outputs are,%\begin{equation}g_1(\underline{a_0},b_{n-2}, b_{n-3}, \dots, b_1, b_0)\end{equation}%for $a_0=(0,1)$.  It's clear that if we hold $b_{n-2}, b_{n-3}, \dots,b_1, b_0$ fixed and compute the sum over $b_{n-1}$ in memory for bothvalues of $a_0 = 0,1$ then we can store the result for $a_0=0$ in thelocation which originally had $b_0=0$ and the result for $a_0=1$ inthe location which originally had $b_0=1$. The two inputs and twooutputs are known as {\em dual node pairs}. At each stage of thecalculation the sums for each dual node pair are independent of theothers. It is this property which allows an in-place calculation.So for an in-place pass our storage has to be arranged so that the twooutputs $g_1(a_0,\dots)$ overwrite the two input terms$g([b_{n-1},\dots])$. Note that the order of $a$ is reversed from thenatural order of $b$. i.e. the least significant bit of $a$replaces the most significant bit of $b$. This is inconvenientbecause $a$ occurs in its natural order in all the exponentials,$W^{ab}$. We could keep track of both $a$ and its bit-reverse,$a^{\mathit bit-reversed}$ at all times but there is a neat trickwhich avoids this: if we bit-reverse the order of the input data $g$before we start the calculation we can also bit-reverse the order of$a$ when storing intermediate results. Since the storage involving $a$was originally in bit-reversed order the switch in the input $g$ nowallows us to use normal ordered storage for $a$, the same orderingthat occurs in the exponential factors.This is complicated to explain, so here is an example of the 4 passesneeded for an $N=16$ decimation-in-time FFT, with the initial datastored in bit-reversed order,%\begin{eqnarray}g_1([b_0 b_1 b_2 a_0]) &=& \sum_{b_3=0}^{1} W_2^{a_0 b_3} g([b_0 b_1 b_2 b_3])\\g_2([b_0 b_1 a_1 a_0]) &=& \sum_{b_2=0}^{1} W_4^{[a_1 a_0] b_2} g_1([b_0 b_1 b_2 a_0])\\g_3([b_0 a_2 a_1 a_0]) &=& \sum_{b_1=0}^{1} W_8^{[a_2 a_1 a_0] b_1} g_2([b_0 b_1 a_1 a_0])\\h(a) = g_4([a_3 a_2 a_1 a_0]) &=& \sum_{b_0=0}^{1} W_{16}^{[a_3 a_2 a_1 a_0] b_0} g_3([b_0 a_2 a_1 a_0])\end{eqnarray}%We compensate for the bit reversal of the input data by accessing $g$with the bit-reversed form of $b$ in the first stage. This ensuresthat we are still carrying out the same calculation, using the samedata, and not accessing different values. Only single bits of $b$ everoccur in the exponential so we never need the bit-reversed form of$b$.Let's examine the third pass in detail,%\begin{equation}g_3([b_0 a_2 a_1 a_0]) =\sum_{b_1=0}^{1} W_8^{[a_2 a_1 a_0] b_1} g_2([b_0 b_1 a_1 a_0])\end{equation}%First note that only one bit, $b_1$, varies in each summation.  Theother bits of $b$ ($b_0$) and of $a$ ($a_1 a_0$) are essentially``spectators'' -- we must loop over all combinations of these bits andcarry out the same basic calculation for each, remembering to updatethe exponentials involving $W_8$ appropriately.  If we are storing theresults in-place (with $g_3$ overwriting $g_2$ we will need to computethe sums involving $b_1=0,1$ and $a_2=0,1$ simultaneously.%\begin{equation}\left(\begin{array}{c}g_3([b_0 0 a_1 a_0]) \vphantom{W_8^{[]}} \\g_3([b_0 1 a_1 a_0]) \vphantom{W_8^{[]}} \end{array}\right)=\left(\begin{array}{c}g_2([b_0 0 a_1 a_0]) + W_8^{[0 a_1 a_0]} g_2([b_2 1 a_1 a_0]) \\g_2([b_0 0 a_1 a_0]) + W_8^{[1 a_1 a_0]} g_2([b_2 1 a_1 a_0])\end{array}\right)\end{equation}%We can write this in a more symmetric form by simplifying the exponential,%\begin{equation}W_8^{[a_2 a_1 a_0]} = W_8^{4 a_2 + [a_1 a_0]} = (-1)^{a_2} W_8^{[a_1 a_0]}\end{equation}%\begin{equation}\left(\begin{array}{c}g_3([b_0 0 a_1 a_0]) \vphantom{W_8^{[]}} \\g_3([b_0 1 a_1 a_0]) \vphantom{W_8^{[]}} \end{array}\right)=\left(\begin{array}{c}g_2([b_0 0 a_1 a_0]) + W_8^{[a_1 a_0]} g_2([b_2 1 a_1 a_0]) \\g_2([b_0 0 a_1 a_0]) - W_8^{[a_1 a_0]} g_2([b_2 1 a_1 a_0])\end{array}\right)\end{equation}%The exponentials $W_8^{[a_1 a_0]}$ are referred to as {\em twiddlefactors}. The form of this calculation, a symmetrical sum anddifference involving a twiddle factor is called {\em a butterfly}.It is often shown diagrammatically, and in the case $b_0=a_0=a_1=0$would be drawn like this,%\begin{center}\setlength{\unitlength}{1cm}\begin{picture}(9,4)%%\put(0,0){\line(1,0){8}}%\put(0,0){\line(0,1){4}}%\put(8,4){\line(0,-1){4}}%\put(8,4){\line(-1,0){8}}%\put(0,1){$g_2(4)$} \put(4.5,1){$g_3(4)=g_2(0) - W^a_8 g_2(4)$}\put(0,3){$g_2(0)$} \put(4.5,3){$g_3(0)=g_2(0) + W^a_8 g_2(4)$}\put(1,1){\vector(1,0){0.5}}\put(1.5,1){\line(1,0){0.5}}\put(1.5,0.5){$W^a_8$}\put(1,3){\vector(1,0){0.5}}\put(1.5,3){\line(1,0){0.5}}\put(2,1){\circle*{0.1}}\put(2,3){\circle*{0.1}}\put(2,1){\vector(1,1){2}} \put(2,1){\vector(1,0){1}} \put(3,1){\line(1,0){1}}\put(3,0.5){$-1$}\put(2,3){\vector(1,-1){2}} \put(2,3){\vector(1,0){1}} \put(3,3){\line(1,0){1}}\put(4,1){\circle*{0.1}}\put(4,3){\circle*{0.1}}\end{picture}\end{center}%The inputs are shown on the left and the outputs on the right. Theoutputs are computed by multiplying the incoming lines by theiraccompanying factors (shown next to the lines) and summing the resultsat each node.In general, denoting the bit for dual-node pairs by $\Delta$ and theremaining bits of $a$ and $b$ by ${\hat a}$ and ${\hat b}$, thebutterfly is,%\begin{equation}\left(\begin{array}{c}g({\hat b} + {\hat a}) \\g({\hat b} + \Delta + {\hat a}) \\\end{array}\right)\leftarrow\left(\begin{array}{c}g({\hat b} + {\hat a}) + W_{2\Delta}^{\hat a} g({\hat b} + \Delta + {\hat a})\\g({\hat b} + {\hat a}) - W_{2\Delta}^{\hat a} g({\hat b} + \Delta + {\hat a})\end{array}\right)\end{equation}%where ${\hat a}$ runs from $0 \dots \Delta-1$ and ${\hat b}$ runsthrough $0 \times 2\Delta$, $1\times 2\Delta$, $\dots$, $(N/\Delta -1)2\Delta$. The value of $\Delta$ is 1 on the first pass, 2 on thesecond pass and $2^{n-1}$ on the $n$-th pass.  Each pass requires$N/2$ in-place computations, each involving two input locations andtwo output locations.In the example above $\Delta = [100] = 4$, ${\hat a} = [a_1 a_0]$ and${\hat b} = [b_0 0 0 0]$.This leads to the canonical radix-2 decimation-in-time FFT algorithmfor $2^n$ data points stored in the array $g(0) \dots g(2^n-1)$.%\begin{algorithm}\STATE bit-reverse ordering of $g$\STATE {$\Delta \Leftarrow 1$}\FOR {$\mbox{pass} = 1 \dots n$}  \STATE {$W \Leftarrow \exp(-2 \pi i / 2\Delta)$}  \FOR {$(a = 0 ; a < \Delta ; a{++})$}    \FOR{$(b = 0 ; b < N ; b {+=} 2*\Delta)$}        \STATE{$t_0 \Leftarrow g(b+a) + W^a g(b+\Delta+a)$}        \STATE{$t_1 \Leftarrow g(b+a) - W^a g(b+\Delta+a)$}        \STATE{$g(b+a) \Leftarrow t_0$}        \STATE{$g(b+\Delta+a) \Leftarrow t_1$}    \ENDFOR  \ENDFOR  \STATE{$\Delta \Leftarrow 2\Delta$}\ENDFOR\end{algorithm}%%This algorithm appears in Numerical Recipes as the routine {\tt%four1}, where the implementation is attributed to N.M. Brenner.%\subsection{Details of the Implementation}It is straightforward to implement a simple radix-2 decimation-in-timeroutine from the algorithm above. Some optimizations can be made bypulling the special case of $a=0$ out of the loop over $a$, to avoidunnecessary multiplications when $W^a=1$,%\begin{algorithm}    \FOR{$(b = 0 ; b < N ; b {+=} 2*\Delta)$}        \STATE{$t_0 \Leftarrow g(b) + g(b+\Delta)$}        \STATE{$t_1 \Leftarrow g(b) - g(b+\Delta)$}        \STATE{$g(b) \Leftarrow t_0$}        \STATE{$g(b+\Delta) \Leftarrow t_1$}    \ENDFOR\end{algorithm}%There are several algorithms for doing fast bit-reversal. We use theGold-Rader algorithm, which is simple and does not require any workingspace,%\begin{algorithm}\FOR{$i = 0 \dots n - 2$}        \STATE {$ k = n / 2 $}        \IF {$i < j$}                \STATE {swap $g(i)$ and $g(j)$}        \ENDIF        \WHILE {$k \leq j$}                \STATE{$j \Leftarrow j - k$}                 \STATE{$k \Leftarrow k / 2$}         \ENDWHILE            \STATE{$j \Leftarrow j + k$}\ENDFOR\end{algorithm}%The Gold-Rader algorithm is typically twice as fast as a naivebit-reversal algorithm (where the bit reversal is carried out byleft-shifts and right-shifts on the index).  The library also has aroutine for the Rodriguez bit reversal algorithm, which also does notrequire any working space~\cite{rodriguez89}. There are faster bitreversal algorithms, but they all use additional scratchspace~\cite{rosel89}.Within the loop for $a$ we can compute $W^a$  using a trigonometricrecursion relation,%\begin{eqnarray}W^{a+1} &=& W W^a \\        &=& (\cos(2\pi/2\Delta) + i \sin(2\pi/2\Delta)) W^a\end{eqnarray}%This requires only $2 \log_2 N$ trigonometric calls, to compute theinitial values of $\exp(2\pi i /2\Delta)$ for each pass.\subsection{Radix-2 Decimation-in-Frequency (DIF)}%To derive the decimation-in-frequency algorithm we start by separatingout the lowest order bit of the index $a$. Here is an example for thedecimation-in-frequency $N=16$ DFT.%\begin{eqnarray}W_{16}^{[a_3 a_2 a_1 a_0][b_3 b_2 b_1 b_0]} &=&W_{16}^{[a_3 a_2 a_1 a_0][b_2 b_1 b_0]} W_{16}^{[a_3 a_2 a_1 a_0] [b_30 0 0]} \\&=&W_8^{[a_3 a_2 a_1][b_2 b_1 b_0]} W_{16}^{a_0 [b_2 b_1 b_0]} W_2^{a_0b_3} \\&=&W_8^{[a_3 a_2 a_1][b_2 b_1 b_0]} W_{16}^{a_0 [b_2 b_1 b_0]} (-1)^{a_0 b_3}\end{eqnarray}%By repeating the same type of the expansion on the term,%\begin{equation}W_8^{[a_3 a_2 a_1][b_2 b_1 b_0]}\end{equation}%we can reduce the transform to an alternative simple form,%\begin{equation}h(a) = \sum_{b_0=0}^1 (-1)^{a_3 b_0} W_4^{a_2 b_0}\sum_{b_1=0}^1 (-1)^{a_2 b_1} W_8^{a_1 [b_1 b_0]}\sum_{b_2=0}^1 (-1)^{a_1 b_2} W_{16}^{a_0 [b_2 b_1 b_0]}\sum_{b_3=0}^1 (-1)^{a_0 b_3} g(b)\end{equation}%To implement this we can again write the sum recursively. In this casewe do not have the problem of the order of $a$ being bit reversed --the calculation can be done in-place using the natural ordering of$a$ and $b$,%\begin{eqnarray}g_1([a_0 b_2 b_1 b_0]) &=&W_{16}^{a_0 [b_2 b_1 b_0]} \sum_{b_3=0}^1 (-1)^{a_0 b_3} g([b_3 b_2 b_1 b_0]) \\g_2([a_0 a_1 b_1 b_0]) &=&W_{8}^{a_1 [b_1 b_0]} \sum_{b_2=0}^1 (-1)^{a_1 b_2} g_1([a_0 b_2 b_1 b_0]) \\g_3([a_0 a_1 a_2 b_0]) &=&W_{4}^{a_2 b_0} \sum_{b_1=0}^1 (-1)^{a_2 b_1} g_2([a_0 a_1 b_1 b_0]) \\h(a)= g_4([a_0 a_1 a_2 a_3]) &=&\sum_{b_0=0}^1 (-1)^{a_3 b_0} g_3([a_0 a_1 a_2 b_0])\end{eqnarray}%The final pass leaves the data for $h(a)$ in bit-reversed order, butthis is easily fixed by a final bit-reversal of the ordering.The basic in-place calculation or butterfly for each pass is slightlydifferent from the decimation-in-time version,%\begin{equation}\left(\begin{array}{c}
fftalgorithms.tex - 源码说明

本页面展示了「开放gsl矩阵运算」中的 fftalgorithms.tex 源码文件，采用 TEX 编程语言编写，共 2,043 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与gsl相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?