📄 fftw_1.html
字号:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"><HTML><HEAD><!-- This HTML file has been created by texi2html 1.52 from fftw.texi on 24 March 2003 --><TITLE>FFTW - Introduction</TITLE></HEAD><BODY TEXT="#000000" BGCOLOR="#FFFFFF">Go to the first, previous, <A HREF="fftw_2.html">next</A>, <A HREF="fftw_10.html">last</A> section, <A HREF="fftw_toc.html">table of contents</A>.<P><HR><P><H1><A NAME="SEC1">Introduction</A></H1><P>This manual documents version 2.1.5 of FFTW, the <EM>FastestFourier Transform in the West</EM>. FFTW is a comprehensive collection offast C routines for computing the discrete Fourier transform (DFT) inone or more dimensions, of both real and complex data, and of arbitraryinput size. FFTW also includes parallel transforms for both shared- anddistributed-memory systems. We assume herein that the reader is alreadyfamiliar with the properties and uses of the DFT that are relevant toher application. Otherwise, see e.g. <CITE>The Fast Fourier Transform</CITE>by E. O. Brigham (Prentice-Hall, Englewood Cliffs, NJ, 1974).<A HREF="http://www.fftw.org">Our web page</A> also has links toFFT-related information online.<A NAME="IDX1"></A><P>FFTW is usually faster (and sometimes much faster) than all otherfreely-available Fourier transform programs found on the Net. Fortransforms whose size is a power of two, it compares favorably with theFFT codes in Sun's Performance Library and IBM's ESSL library, which aretargeted at specific machines. Moreover, FFTW's performance is<EM>portable</EM>. Indeed, FFTW is unique in that it automatically adaptsitself to your machine, your cache, the size of your memory, the numberof registers, and all the other factors that normally make it impossibleto optimize a program for more than one machine. An extensivecomparison of FFTW's performance with that of other Fourier transformcodes has been made. The results are available on the Web at<A HREF="http://theory.lcs.mit.edu/~benchfft">the benchFFT home page</A>.<A NAME="IDX2"></A><A NAME="IDX3"></A><P>In order to use FFTW effectively, you need to understand one basicconcept of FFTW's internal structure. FFTW does not used a fixedalgorithm for computing the transform, but it can adapt the DFTalgorithm to details of the underlying hardware in order to achieve bestperformance. Hence, the computation of the transform is split into twophases. First, FFTW's <EM>planner</EM> is called, which "learns" the<A NAME="IDX4"></A>fastest way to compute the transform on your machine. The planner<A NAME="IDX5"></A>produces a data structure called a <EM>plan</EM> that contains thisinformation. Subsequently, the plan is passed to FFTW's <EM>executor</EM>,<A NAME="IDX6"></A>along with an array of input data. The executor computes the actualtransform, as dictated by the plan. The plan can be reused as manytimes as needed. In typical high-performance applications, manytransforms of the same size are computed, and consequently arelatively-expensive initialization of this sort is acceptable. On theother hand, if you need a single transform of a given size, the one-timecost of the planner becomes significant. For this case, FFTW providesfast planners based on heuristics or on previously computed plans.<P>The pattern of planning/execution applies to all four operation modes ofFFTW, that is, I) one-dimensional complex transforms (FFTW), II)multi-dimensional complex transforms (FFTWND), III) one-dimensionaltransforms of real data (RFFTW), IV) multi-dimensional transforms ofreal data (RFFTWND). Each mode comes with its own planner and executor.<P>Besides the automatic performance adaptation performed by the planner,it is also possible for advanced users to customize FFTW for theirspecial needs. As distributed, FFTW works most efficiently for arrayswhose size can be factored into small primes (2, 3,5, and 7), and uses a slower general-purpose routine forother factors. FFTW, however, comes with a code generator that canproduce fast C programs for any particular array size you may careabout.<A NAME="IDX7"></A>For example, if you need transforms of size513 = 19*3<sup>3</sup>,you can customize FFTW to support the factor 19 efficiently.<P>FFTW can exploit multiple processors if you have them. FFTW comes witha shared-memory implementation on top of POSIX (and similar) threads, aswell as a distributed-memory implementation based on MPI.<A NAME="IDX8"></A><A NAME="IDX9"></A><A NAME="IDX10"></A>We also provide an experimental parallel implementation written in Cilk,<EM>the superior programming tool of choice for discriminatinghackers</EM> (Olin Shivers). (See <A HREF="http://supertech.lcs.mit.edu/cilk">the Cilk home page</A>.)<A NAME="IDX11"></A><P>For more information regarding FFTW, see the paper, "The FastestFourier Transform in the West," by M. Frigo and S. G. Johnson, which isthe technical report MIT-LCS-TR-728 (Sep. '97). See also, "FFTW: AnAdaptive Software Architecture for the FFT," by M. Frigo andS. G. Johnson, which appeared in the 23rd International Conference onAcoustics, Speech, and Signal Processing (<CITE>Proc. ICASSP 1998</CITE><B>3</B>, p. 1381). The code generator is described in the paper "A FastFourier Transform Compiler", <A NAME="IDX12"></A>by M. Frigo, to appear in the <CITE>Proceedings of the 1999 ACM SIGPLANConference on Programming Language Design and Implementation (PLDI),Atlanta, Georgia, May 1999</CITE>. These papers, along with the latestversion of FFTW, the FAQ, benchmarks, and other links, are available at<A HREF="http://www.fftw.org">the FFTW home page</A>. The currentversion of FFTW incorporates many good ideas from the past thirty yearsof FFT literature. In one way or another, FFTW uses the Cooley-Tukeyalgorithm, the Prime Factor algorithm, Rader's algorithm for primesizes, and the split-radix algorithm (with a variation due to DanBernstein). Our code generator also produces new algorithms that we donot yet completely understand.<A NAME="IDX13"></A>The reader is referred to the cited papers for the appropriatereferences.<P>The rest of this manual is organized as follows. We first discuss thesequential (one-processor) implementation. We start by describing thebasic features of FFTW in Section <A HREF="fftw_2.html#SEC2">Tutorial</A>. This discussion includes thestorage scheme of multi-dimensional arrays (Section <A HREF="fftw_2.html#SEC7">Multi-dimensional Array Format</A>) and FFTW's mechanisms for storing plans on disk (Section <A HREF="fftw_2.html#SEC13">Words of Wisdom</A>). Next, Section <A HREF="fftw_3.html#SEC16">FFTW Reference</A> provides comprehensivedocumentation of all FFTW's features. Parallel transforms are discussedin their own chapter Section <A HREF="fftw_4.html#SEC47">Parallel FFTW</A>. Fortran programmers can alsouse FFTW, as described in Section <A HREF="fftw_5.html#SEC62">Calling FFTW from Fortran</A>.Section <A HREF="fftw_6.html#SEC66">Installation and Customization</A> explains how to install FFTW inyour computer system and how to adapt FFTW to your needs. License andcopyright information is given in Section <A HREF="fftw_8.html#SEC74">License and Copyright</A>. Finally,we thank all the people who helped us in Section <A HREF="fftw_7.html#SEC73">Acknowledgments</A>.<P><HR><P>Go to the first, previous, <A HREF="fftw_2.html">next</A>, <A HREF="fftw_10.html">last</A> section, <A HREF="fftw_toc.html">table of contents</A>.</BODY></HTML>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -