📄 dsp_fft32x32_sa.sa
*      The fft() code shown here performs the bulk of the computation       *
*      in place. However, because digit-reversal cannot be performed        *
*      in-place, the final result is written to a separate array, y[].      *
*                                                                           *
*      There is one slight break in the flow of packed processing that      *
*      needs to be comprehended. The real part of the complex number is     *
*      in the lower half, and the imaginary part is in the upper half.      *
*      The flow breaks in case of "xl0" and "xl1" because in this case      *
*      the real part needs to be combined with the imaginary part because   *
*      of the multiplication by "j". This requires a packed quantity like   *
*      "xl21xl20" to be rotated as "xl20xl21" so that it can be combined    *
*      using add2's and sub2's. Hence the natural version of C code         *
*      shown below is transformed using packed data processing as shown:    *
*                                                                           *
*          xl0  = x[2 * i0    ] - x[2 * i2    ];                            *
*          xl1  = x[2 * i0 + 1] - x[2 * i2 + 1];                            *
*          xl20 = x[2 * i1    ] - x[2 * i3    ];                            *
*          xl21 = x[2 * i1 + 1] - x[2 * i3 + 1];                            *
*                                                                           *
*          xt1 = xl0 + xl21;                                                *
*          yt2 = xl1 + xl20;                                                *
*          xt2 = xl0 - xl21;                                                *
*          yt1 = xl1 - xl20;                                                *
*                                                                           *
*          xl1_xl0   = _sub2(x21_x20, x21_x20)                              *
*          xl21_xl20 = _sub2(x32_x22, x23_x22)                              *
*          xl20_xl21 = _rotl(xl21_xl20, 16)                                 *
*                                                                           *
*          yt2_xt1 = _add2(xl1_xl0, xl20_xl21)                              *
*          yt1_xt2 = _sub2(xl1_xl0, xl20_xl21)                              *
*                                                                           *
*      Also notice that xt1, yt1 end up on separate words; these need to    *
*      be packed together to take advantage of the packed twiddle factors   *
*      that have been loaded. In order for this to be achieved they         *
*      are re-aligned as follows:                                           *
*                                                                           *
*          yt1_xt1 = _packhl2(yt1_xt2, yt2_xt1)                             *
*          yt2_xt2 = _packhl2(yt2_xt1, yt1_xt2)                             *
*                                                                           *
*      The packed word "yt1_xt1" allows the loaded "sc" twiddle factor      *
*      to be used for the complex multiplies. The real part of the          *
*      complex multiply is implemented using _dotp2. The imaginary          *
*      part of the complex multiply is implemented using _dotpn2            *
*      after the twiddle factors are swizzled within the half word.         *
*                                                                           *
*          (X + jY) (C + jS) = (XC + YS) + j (YC - XS).                     *
*                                                                           *
*      The actual twiddle factors for the FFT are cosine, -sine. The        *
*      twiddle factors stored in the table are cosine and sine, hence       *
*      the sign of the "sine" term is comprehended during                   *
*      multiplication as shown above.                                       *
*                                                                           *
*                                                                           *
*  ASSUMPTIONS                                                              *
*                                                                           *
*      The size of the FFT, n, must be a power of 4 and greater than        *
*      or equal to 16 and less than 32768.                                  *
*                                                                           *
*      The arrays 'x[]', 'y[]', and 'w[]' all must be aligned on a          *
*      double-word boundary for the "optimized" implementations.            *
*                                                                           *
*      The input and output data are complex, with the real/imaginary       *
*      components stored in adjacent locations in the array. The real       *
*      components are stored at even array indices, and the imaginary       *
*      components are stored at odd array indices.                          *
*                                                                           *
*  C CODE                                                                   *
*                                                                           *
*                                                                           *
* ------------------------------------------------------------------------- *
*            Copyright (c) 2007 Texas Instruments, Incorporated.            *
*                           All Rights Reserved.                            *
* ========================================================================= *

        .sect ".text:psa"
        .global _DSP_fft32x32

* ======================================================================== *
*S Place file level definitions here.                                     S*
* ======================================================================== *

_DSP_fft32x32 .cproc A_ptr_w, B_n, A_ptr_x, B_ptr_y
        .no_mdep

; ====================== SYMBOLIC REGISTER ASSIGNMENTS =======================
        .rega   A_fft_jmp
        .rega   A_y
        .regb   B_y
        .regb   B_i
        .rega   A_w
        .regb   B_w
        .regb   B_x_1:B_x_0
        .rega   A_x_3:A_x_2
        .regb   B_xl1_1i:B_xl1_0i
        .rega   A_xl1_3i:A_xl1_2i
        .regb   B_xl2_1i:B_xl2_0i
        .rega   A_xl2_3i:A_xl2_2i
        .regb   B_xh2_1i:B_xh2_0i
        .rega   A_xh2_3i:A_xh2_2i
        .regb   B_2h2
        .rega   A_2h2
        .regb   B_xh0_0:B_xl0_0
        .regb   B_xh1_0:B_xl1_0
        .rega   A_xh0_1:A_xl0_1
        .rega   A_xh1_1:A_xl1_1
        .regb   B_xh20_0:B_xl20_0
        .regb   B_xh21_0:B_xl21_0
        .rega   A_xh20_1:A_xl20_1
        .rega   A_xh21_1:A_xl21_1
        .regb   B_xt1_0:B_xt2_0
        .regb   B_yt2_0:B_yt1_0
        .rega   A_xt1_1:A_xt2_1
        .rega   A_yt2_1:A_yt1_1
        .regb   B_x_1o:B_x_0o
        .rega   A_x_3o:A_x_2o
        .regb   B_xh2_1o:B_xh2_0o
        .rega   A_xh2_3o:A_xh2_2o
        .regb   B_xl1_1o:B_xl1_0o
        .rega   A_xl1_3o:A_xl1_2o
        .regb   B_xl2_1o:B_xl2_0o
        .rega   A_xl2_3o:A_xl2_2o
        .rega   A_x1:A_x0
        .regb   B_x3:B_x2
        .rega   A_x5:A_x4
        .regb   B_x7:B_x6
        .rega   A_xh0_0:A_xl0_0
        .rega   A_xh1_0:A_xl1_0
        .regb   B_xh0_1:B_xl0_1
        .regb   B_xh1_1:B_xl1_1
        .regb   B_y1:B_y0
        .rega   A_y3:A_y2
        .regb   B_y5:B_y4
        .rega   A_y7:A_y6
        .regb   B_w0, B_x
        .rega   A_x
        .regb   B_xp1:B_xp0
        .rega   A_l1
        .rega   A_xl1p1:A_xl1p0
        .rega   A_h2
        .regb   B_xh2p1:B_xh2p0
        .rega   A_l2
        .rega   A_xl2p1:A_xl2p0
        .rega   A_xh0, A_xh1_0c
        .rega   A_xh1
        .regb   B_xl0
        .regb   B_xl1
        .rega   A_xh20
        .rega   A_xh21
        .regb   B_xl20
        .regb   B_xl21, B_xl1_1c
        .rega   A_y_h1_1:A_y_h1_0
        .rega   A_w0
        .rega   A_j
        .regb   B_j
        .regb   B_co10:B_si10
        .rega   A_co20:A_si20
        .regb   B_co30:B_si30
        .rega   A_co11:A_si11
        .regb   B_co21:B_si21
        .rega   A_co31:A_si31
        .rega   A_xt0
        .rega   A_yt0
        .regb   B_xt1
        .regb   B_yt2
        .regb   B_xt2
        .regb   B_yt1
        .regb   B_p0r
        .rega   A_p1r
        .regb   B_p01r
        .regb   B_p0c
        .rega   A_p1c
        .regb   B_y_h2_1:B_y_h2_0
        .regb   B_p01c
        .rega   A_p2r
        .rega   A_p3r
        .rega   A_p23r
        .rega   A_p2c
        .rega   A_p3c
        .rega   A_y_l1_1:A_y_l1_0
        .rega   A_p23c
        .regb   B_p4r
        .regb   B_p5r
        .regb   B_p45r
        .regb   B_p4c
        .regb   B_p5c
        .regb   B_y_l2_1:B_y_l2_0
        .regb   B_p45c
        .rega   A_x_1
        .rega   B_x__
        .regb   B_fft_jmp
        .rega   A_fft_jmp_1
        .rega   A_ifj
        .regb   B_ifj
        .regb   B_h2
        .regb   B_l1
        .rega   A_i
        .regb   B_xt0_0, B_yt0_0
        .rega   A_xt0_1, A_yt0_1
        .regb   B_p0, B_p1, B_p2, B_p3
        .rega   A_p4, A_p5, A_p6, A_p7
        .rega   A_p8, A_pb, A_pc, A_pe
        .regb   B_p9, B_pa, B_pd, B_pf
        .regb   B_p10, B_p11, B_p12, B_p13
        .rega   A_p14, A_p15, A_p16, A_p17
        .rega   A_tw_offset
        .regb   B_stride, B_while
        .rega   A_p_x0
        .regb   B_p_x0
        .regb   B_p_y0, B_p_y1, B_p_y2, B_p_y3
        .regb   B_h0, B_h1, B_h3, B_h4
        .rega   A_r2, A_radix, A_temp
        .regb   B_j0, B_radix2
; ======================================================================

;-------------------------------------------------------------;
; Assume radix is 4, by default. Check the norm of the # of   ;
; points to be transformed, and change radix to 2 if reqd.    ;
;-------------------------------------------------------------;
        MVK     .1      4,          A_radix
        NORM    .2      B_n,        B_radix2
        AND     .2      B_radix2,   1,          B_radix2
 [B_radix2] MVK .1      2,          A_radix

;-------------------------------------------------------------;
; "stride" is a variable that denotes the separation between  ;
; the legs of the butterfly. "tw_offset" is the offset within ;
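
The packed data flow described in the header comment, and the radix check performed by the first four instructions, can be sketched in portable C. This is a minimal illustration, not the routine's actual implementation. The emu_* helpers, packed_matches_scalar() and select_radix() are hypothetical names introduced here; they model how the 16-bit packed operations and NORM are assumed to behave (independent halfword add/subtract, a 16-bit rotate, high/low halfword packing, signed halfword dot products, and a count of redundant sign bits) and are not the TI intrinsics or instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Halfword accessors: real part in the low half, imaginary in the high half. */
static int16_t hi16(uint32_t v) { return (int16_t)(uint16_t)(v >> 16); }
static int16_t lo16(uint32_t v) { return (int16_t)(uint16_t)(v & 0xFFFFu); }
static uint32_t pack2(int16_t hi, int16_t lo) {
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Hypothetical stand-ins for the packed operations named in the comment. */
static uint32_t emu_add2(uint32_t a, uint32_t b) {
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;         /* no carry across halves */
    uint32_t lo = ((a & 0xFFFFu) + (b & 0xFFFFu)) & 0xFFFFu;
    return (hi << 16) | lo;
}
static uint32_t emu_sub2(uint32_t a, uint32_t b) {
    uint32_t hi = ((a >> 16) - (b >> 16)) & 0xFFFFu;         /* no borrow across halves */
    uint32_t lo = ((a & 0xFFFFu) - (b & 0xFFFFu)) & 0xFFFFu;
    return (hi << 16) | lo;
}
static uint32_t emu_rotl16(uint32_t a) { return (a << 16) | (a >> 16); }
static uint32_t emu_packhl2(uint32_t a, uint32_t b) {
    return (a & 0xFFFF0000u) | (b & 0x0000FFFFu);             /* high of a, low of b */
}
static int32_t emu_dotp2(uint32_t a, uint32_t b) {
    return (int32_t)hi16(a) * hi16(b) + (int32_t)lo16(a) * lo16(b);
}
static int32_t emu_dotpn2(uint32_t a, uint32_t b) {
    return (int32_t)hi16(a) * hi16(b) - (int32_t)lo16(a) * lo16(b);
}

/* Returns 1 when the packed flow reproduces the scalar sums and products. */
static int packed_matches_scalar(int16_t xl0, int16_t xl1,
                                 int16_t xl20, int16_t xl21,
                                 int16_t co, int16_t si) {
    /* Scalar reference, as in the "natural" C fragment of the comment. */
    int16_t xt1 = (int16_t)(xl0 + xl21), yt2 = (int16_t)(xl1 + xl20);
    int16_t xt2 = (int16_t)(xl0 - xl21), yt1 = (int16_t)(xl1 - xl20);

    /* Packed version: rotate one operand so real meets imaginary. */
    uint32_t xl1_xl0   = pack2(xl1, xl0);
    uint32_t xl21_xl20 = pack2(xl21, xl20);
    uint32_t xl20_xl21 = emu_rotl16(xl21_xl20);
    uint32_t yt2_xt1   = emu_add2(xl1_xl0, xl20_xl21);
    uint32_t yt1_xt2   = emu_sub2(xl1_xl0, xl20_xl21);
    uint32_t yt1_xt1   = emu_packhl2(yt1_xt2, yt2_xt1);
    uint32_t yt2_xt2   = emu_packhl2(yt2_xt1, yt1_xt2);

    /* (XC + YS) from the dot product; (YC - XS) from the negated dot
       product with the twiddle halves swapped. */
    int32_t re = emu_dotp2(yt1_xt1, pack2(si, co));
    int32_t im = emu_dotpn2(yt1_xt1, pack2(co, si));

    return hi16(yt1_xt1) == yt1 && lo16(yt1_xt1) == xt1 &&
           hi16(yt2_xt2) == yt2 && lo16(yt2_xt2) == xt2 &&
           re == (int32_t)xt1 * co + (int32_t)yt1 * si &&
           im == (int32_t)yt1 * co - (int32_t)xt1 * si;
}

/* Count of redundant sign bits, modelling the NORM instruction. */
static int emu_norm(int32_t v) {
    uint32_t u = (v < 0) ? ~(uint32_t)v : (uint32_t)v;
    int count = 0;
    for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == 0; --bit) ++count;
    return count;
}

/* Mirrors MVK 4 / NORM / AND 1 / conditional MVK 2: an odd norm selects 2. */
static int select_radix(int32_t n) { return (emu_norm(n) & 1) ? 2 : 4; }
```

Under these assumptions, packed_matches_scalar(100, -30, 7, -250, 3, -2) should report agreement, select_radix(16) and select_radix(64) keep the default radix of 4, and select_radix(32) switches to 2, because the norm of a power of two is odd exactly when its exponent is odd.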