📄 exp7.html
字号:
</span><![endif]>Profile the DSP run-time cycles for 128 points and 1024 points
FFT. Record the memory usage of the fixed-point function <span
style='font-family:"Courier New"'>bit_rev() </span>and <span style='font-family:
"Courier New"'>function fft()</span>.</p>
<p style='margin-right:1.0in;margin-left:1.0in'> </p>
<p><a href="#top"><span style='font-size:7.5pt'>Back to Top</span></a>
<input type=button value="go back" onclick="history.go(-1)">
</p>
<div class=MsoNormal align=center style='text-align:center'>
<hr size=3 width="100%" align=center>
</div>
<p style='margin-right:1.0in;margin-left:1.0in'><span style='font-size:13.5pt;
color:blue'>Experiment 7B - Radix-2 Complex FFT in C55x Assembly Language</span></p>
<p style='margin-right:1.0in;margin-left:1.0in'>Assembly routines provide the
most efficiency in terms of run time and code density. For digital signal
processing applications such as filtering and FFT, the assembly language is
usually used. This experiment uses the C55x assembly routines to perform the
same FFT function as the previous experiment. The assembly routine <a
href="exp7/fft.asm"><span style='font-family:"Courier New"'>fft.asm</span></a>
in Table E7-6 closely follows the code flow in fixed-point C function for
easier understand. As mentioned in the book, there are many ways readers can
use to improve the run time efficiency of the FFT routine.</p>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'>Table
E7-6 Assembly routine <span style='font-family:"Courier New"'>fft.asm</span></p>
<form>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'><TEXTAREA ROWS="7" COLS="70" NAME="fft.asm">; ; fft.asm - Radix-2 complex FFT (for N=2^EXP points) ; Using table lookup method ; ; Prototype: void fft(complex *, int, complex *, int); ; ; Entry: arg0: AR0-FFT input sample buffer pointer ; arg1: T0-number of FFT stage, EXP ; arg2: AR1-twiddle factor array pointer ; arg3: T1-scale flag ; if scale needed, (T1==1) SCALE=2; ; if don't need scale (T1==0) SCALE=1; ; ; Note: 1. When used as the FFT routine it takes in time domain ; samples in Q15 and generates Q14 frequency domain samples ; 2. When used as the IFFT routine it takes in frequency domain ; samples in Q14 and generates Q15 time domain samples ; ; Revised: Dec. 3, 2001 ; Add an instruction to copy XAR0 to XAR5 for addressing extended page .global _fft ; ;	Declare local variables in SP address mode for C-callable ; ARGS .set 0 ; Number of variables passed via stack FFT_var .struct ; Define local variable structure d_temp .int (2) ; Temporary variables (Re, Im) d_L .int d_N .int d_T2 .int ; Used to save content of T2 d_ST1 .int ; Used to save content of ST1 d_ST3 .int ; Used to save content of ST3 return_addr .int ; Space for routine return address Size .endstruct fft .set 0 fft .tag FFT_var .sect "fft_code" 	 _fft: aadd #(ARGS-Size+1),SP ; Adjust stack for local vars mov mmap(ST1_55),AR2 ; Save ST1,ST3 mov mmap(ST3_55),AR3 mov AR2,fft.d_ST1 mov AR3,fft.d_ST3 btst @#0,T1,TC1 ; Check SCALE flag set		 mov #0x6340,mmap(ST1_55) ; Set CPL,XF,SATD,SXAM,FRCT (SCALE=1) mov #0x1f22,mmap(ST3_55) ; Set: HINT,SATA,SMUL xcc do_scale,TC1 mov #0x6300,mmap(ST1_55) ; Set CPL,XF,SATD,SXAM (SCALE=2) do_scale mov T2,fft.d_T2 ; Save T2 || mov #1,AC0 mov AC0,fft.d_L ; Initialize L=1 || sfts AC0,T0 ; T0=EXP mov AC0,fft.d_N ; N=1<<EXP mov XAR1,XCDP ; CDP = pointer to U[] mov XSP,XAR4	 add #fft.d_temp,AR4 ; AR4 = pointer to temp mov XAR0,XAR1 ; AR1 points to sample buffer mov T0,T1 mov XAR0,XAR5 ; Copy externd bits to XAR5 			 outer_loop ; for (L=1; L<=EXP; L++) mov fft.d_L,T0 ; note: Since the buffer is || mov #2,AC0 ; arranged in re,im pairs sfts AC0,T0 ; the index to the buffer neg T0	 ; is doubled || mov fft.d_N,AC1 ; But the repeat coutners sftl AC1,T0 ; are not doubled mov AC0,T0 ; LE=2<<L 	 || sfts AC0,#-1 mov AC0,AR0 ; LE1=LE>>1 || sfts AC0,#-1 sub #1,AC0 ; Init mid_loop counter mov mmap(AC0L),BRC0 ; BRC0=LE1-1 sub #1,AC1 ; Init inner loop counter mov mmap(AC1L),BRC1 ; BRC1=(N>>L)-1 add AR1,AR0 mov #0,T2 ; j=0 || rptblocal mid_loop-1 ; for (j=0; j<LE1;j++) mov T2,AR5 ; AR5=id=i+LE1 mov T2,AR3	 add AR0,AR5 ; AR5 = pointer to X[id].re add #1,AR5,AR2 ; AR2 = pointer to X[id].im add AR1,AR3 ; AR3 = pointer to X[i].re || rptblocal inner_loop-1 ; for(i=j; i<N; i+=LE) mpy *AR5+,*CDP+,AC0 ; AC0=(X[id].re*U.re :: mpy *AR2-,*CDP+,AC1 ; -X[id].im*U.im)/SCALE	 masr *AR5-,*CDP-,AC0 ; AC1=(X[id].im*U.re :: macr *AR2+,*CDP-,AC1 ; +X[id].re*U.im)/SCALE mov pair(hi(AC0)),dbl(*AR4); AC0H=temp.re AC1H=temp.im || mov dbl(*AR3),AC2 xcc scale,TC1 || mov AC2>>#1,dual(*AR3) ; Scale X[i] by 1/SCALE mov dbl(*AR3),AC2 scale add T0,AR2 || sub dual(*AR4),AC2,AC1 ; X[id].re=X[i].re/SCALE-temp.re mov AC1,dbl(*(AR5+T0)) ; X[id].im=X[i].im/SCALE-temp.im || add dual(*AR4),AC2 ; X[i].re=X[i].re/SCALE+temp.re mov AC2,dbl(*(AR3+T0)) ; X[i].im=X[i].im/SCALE+temp.im inner_loop ; End of inner loop amar *CDP+ amar *CDP+ ; Update k for pointer to U[k] || add #2,T2 ; Update j mid_loop ; End of mid-loop sub #1,T1 add #1,fft.d_L ; Update L bcc outer_loop,T1>0 ; End of outer-loop mov fft.d_ST1,AR2 ; Restore ST1,ST3,T2 mov fft.d_ST3,AR3 mov AR2,mmap(ST1_55) mov AR3,mmap(ST3_55) mov fft.d_T2,T2 aadd #(Size-ARGS-1),SP ; Reset SP ret .end </TEXTAREA></p>
</form>
<p style='margin-right:1.0in;margin-left:1.0in'> </p>
<p style='margin-right:1.0in;margin-left:1.0in'>Since the DSP provides a
special addressing mode, the bit-reversal addressing, for efficient FFT
implementation, we use the C55x bit-reversal addressing mode for the <a
href="exp7/bit_rev.asm"><span style='font-family:"Courier New"'>bit_rev.asm</span></a>
routine, see Table E7-7. Since C does not support the bit reversal addressing
mode, the assembly routine does not follow the C function in this case.</p>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'>Table
E7-7 Bit reversal using the C55x bit reversal addressing mode</p>
<form>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'><TEXTAREA ROWS="7" COLS="70" NAME="bit_rev.asm">; ; bit_rev.asm - Bit-reversal for FFT (N=2^EXP points) ; ; Prototype: void bit_rev(complex *, int); ; ; Entry: arg0: AR0-FFT input sample buffer pointer ; arg1: T0-number of FFT stage, EXP ; .global _bit_rev 	 .sect "fft_code" 	 _bit_rev psh mmap(ST2_55) ; Save ST2 bclr ARMS ; Reset ARMS bit mov	 #1,AC0 sfts AC0,T0	 ; T0=EXP, AC0=N=2^EXP mov AC0,T0 ; T0=N mov T0,T1 add T0,T1 mov mmap(T1),BK03 ; Circular buffer size=2N mov mmap(AR0),BSA01 ; Init circular buffer base sub #2,AC0 mov mmap(AC0L),BRC0 ; Init repeat counter to N-1 mov #0,AR0 ; Set buffer start address mov #0,AR1 ; as offset = 0 bset AR0LC ; Enable AR0 and AR1 as bset AR1LC ; circular pointers || rptblocal loop_end-1 ; Start bit reversal loop mov	 dbl(*AR0),AC0 ; Get a pair of sample || amov AR1,T1 mov dbl(*AR1),AC1 ; Get another pair || asub AR0,T1 xccpart swap1,T1>=#0 || mov AC1,dbl(*AR0+) ; Swap samples if j>=i swap1 xccpart loop_end,T1>=#0 || mov AC0,dbl(*(AR1+T0B)) loop_end ; End bit reversal loop pop mmap(ST2_55) ; Restore ST2 ret .end </TEXTAREA></p>
</form>
<p style='margin-right:.5in;margin-left:.5in'> </p>
<p style='margin-right:1.0in;margin-left:1.0in'>Another function that the FFT
uses is the twiddle factor generator. In our experiment, <a
href="exp7/w_table.c"><span style='font-family:"Courier New"'>w_table.c</span></a>
is used to create the twiddle factor table. The table lookup method trades the
program space with the execution speed. The twiddle factor can also be
generated using the recursive computation as the floating-point C example. The
function that generate the twiddle factor is listed in Table E7-8.</p>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'>Table
E7-8 Generating twiddle factor</p>
<form>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'><TEXTAREA ROWS="7" COLS="70" NAME="w_table.c">/* w_table.c - Generate twiddle factor lookup table */ #include "icomplex.h" #include <math.h> #define pi 3.1415926535897 void w_table(complex *U, unsigned int EXP) { complex W; /* Twiddle factor W^k */ unsigned int i,j,l; unsigned int SL; /* Number of points in sub DFT at stage L and offset to next DFT in stage */ unsigned int SL1; /* Number of butterflies in one DFT at stage L. Also is offset to lower point in butterfly at stage L */ for (i=0, l=1; l<=EXP; l++) { SL=1<<l; 	/* LE=2^L=points of sub DFT */ SL1=SL>>1; 		/* Number of twiddle factors in sub-DFT */ for (j=0; j<SL1; j++) { W.re = (int)((0x7fff*cos(j*pi/SL1))+0.5); W.im = -(int)((0x7fff*sin(j*pi/SL1))+0.5); U[i++] = W; } } } </TEXTAREA></p>
</form>
<p style='margin-right:.5in;margin-left:.5in'> </p>
<p style='margin-right:1.0in;margin-left:1.0in'>Finally, the C function <a
href="exp7/exp7b.c"><span style='font-family:"Courier New"'>exp7b.c</span></a>
in Table E7-9 is used to control the experiment.</p>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'>Table
E7-9 List of <span style='font-family:"Courier New"'>epx7b.c</span></p>
<form>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'><TEXTAREA ROWS="7" COLS="70" NAME="exp7b.c">/* exp7b.c - Example to test FFT (fixed-point) Use Twiddle lookup-table method */ #include "icomplex.h" #include "input7_i.dat" /* Experiment data */ extern void fft(complex *, unsigned int, complex *, unsigned int); extern void bit_rev(complex *, unsigned int); extern void w_table(complex *, unsigned int); #define N 128 /* Number of FFT points */ #define EXP 7 /* log2(N) */ #pragma DATA_SECTION(U, "fft_coef"); #pragma DATA_SECTION(X, "fft_in"); #pragma DATA_SECTION(spectrum, "fft_out"); #pragma DATA_SECTION(ltemp, "fft_temp"); #pragma DATA_SECTION(temp, "fft_temp"); #pragma CODE_SECTION(main, "fft_code"); complex X[N]; /* Declare input array */ complex U[N]; /* Twiddle e^(-j2pi/N) table */ complex temp; lcomplex ltemp; int spectrum[N]; int re1[N],im1[N]; void main() { unsigned int i,j; /* Create Twiddle lookup table for FFT */ w_table(U,EXP); /* Start FFT test */ j=0; for (;;) { for (i=0; i<N; i++) { /* Generate input samples */ X[i].re = input7_i[j++]; X[i].im = 0; /* Copy to reference buffer */ re1[i] = X[i].re; im1[i] = X[i].im; if (j==1664) j=0; 	} bit_rev(X,EXP); /* Arrange sample in bit reversal order */ fft(X,EXP,U,1); /* Perform FFT */ for (i=0; i<N; i++) /* Verify FFT result */ { ltemp.re = (long)X[i].re*X[i].re; ltemp.im = (long)X[i].im*X[i].im; spectrum[i] = (int)((ltemp.re+ltemp.im)>>13); } } } </TEXTAREA></p>
</form>
<p style='margin-right:.5in;margin-left:.5in'> </p>
<p style='margin-right:1.0in;margin-left:1.0in'>For Experiment 7B, complete the
following steps:</p>
<p class=MsoNormal style='margin-right:1.0in;mso-margin-top-alt:auto;
mso-margin-bottom-alt:auto;margin-left:1.5in;text-indent:-.25in;mso-list:l4 level1 lfo6;
tab-stops:list .5in'><![if !supportLists]>1.<span style='font:7.0pt "Times New Roman"'>
</span><![endif]>Create the project <span style='font-family:"Courier New"'>exp7b</span>,
and include <span style='font-family:"Courier New"'>exp7.cmd</span>, <span
style='font-family:"Courier New"'>exp7b.c</span>, <span style='font-family:
"Courier New"'>w_table.c</span>, <span style='font-family:"Courier New"'>fft.asm</span>,
and <span style='font-family:"Courier New"'>bit_rev.asm</span> from the
experimental software package.</p>
<p class=MsoNormal style='margin-right:1.0in;mso-margin-top-alt:auto;
mso-margin-bottom-alt:auto;margin-left:1.5in;text-indent:-.25in;mso-list:l4 level1 lfo6;
tab-stops:list .5in'><![if !supportLists]>2.<span style='font:7.0pt "Times New Roman"'>
</span><![endif]>Build and verify the FFT function, and compare your results
with Experiment 7A. Make sure that the scale flag for the FFT routine is set to
1.</p>
<p class=MsoNormal style='margin-right:1.0in;mso-margin-top-alt:auto;
mso-margin-bottom-alt:auto;margin-left:1.5in;text-indent:-.25in;mso-list:l4 level1 lfo6;
tab-stops:list .5in'><![if !supportLists]>3.<span style='font:7.0pt "Times New Roman"'>
</span><![endif]>Profile the FFT run-time and its memory usage. Compare these
results with those obtained in Experiment 7A.</p>
<p> </p>
<p><a href="#top"><span style='font-size:7.5pt'>Back to Top</span></a>
<input type=button value="go back" onclick="history.go(-1)">
</p>
<div class=MsoNormal align=center style='text-align:center'>
<hr size=3 width="100%" align=center>
</div>
<p style='margin-right:1.0in;margin-left:1.0in'><span style='font-size:13.5pt;
color:blue'>Experiment 7C - Radix-2 Complex FFT and IFFT</span></p>
<p style='margin-right:1.0in;margin-left:1.0in'>Inverse FFT (IFFT) is very
similar to the FFT computation. The DSP implementation of IFFT can use the FFT
routine from the previous experiment. In this experiment, we will show you how
to modify the FFT routine and data so the FFT routine can be used for the IFFT.</p>
<p style='margin-right:1.0in;margin-left:1.0in'>First, in order to use FFT
routine for the IFFT function, the sign of the imaginary portion of the sample
must be reversed. This is done by the experiment control function <a
href="exp7/exp7c.c"><span style='font-family:"Courier New"'>exp7c.c</span></a>,
see Table E7-10. To compare the experiment result, two temporary buffers <span
style='font-family:"Courier New"'>re1[]</span> and <span style='font-family:
"Courier New"'>re2[]</span> are used to store the input and out output samples,
respectively.</p>
<p align=center style='margin-right:.5in;margin-left:.5in;text-align:center'>Table
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -