📄 fft32x32.asm

*                                                                           *
*                        xl0  = x[2 * i0    ] - x[2 * i2    ];              *
*                        xl1  = x[2 * i0 + 1] - x[2 * i2 + 1];              *
*                        xl20 = x[2 * i1    ] - x[2 * i3    ];              *
*                        xl21 = x[2 * i1 + 1] - x[2 * i3 + 1];              *
*                                                                           *
*                        xt1  = xl0 + xl21;                                 *
*                        yt2  = xl1 + xl20;                                 *
*                        xt2  = xl0 - xl21;                                 *
*                        yt1  = xl1 - xl20;                                 *
*                                                                           *
*                        (xk1_xk0 packs x[2*ik + 1] : x[2*ik])              *
*                        xl1_xl0   = _sub2(x01_x00, x21_x20)                *
*                        xl21_xl20 = _sub2(x11_x10, x31_x30)                *
*                        xl20_xl21 = _rotl(xl21_xl20, 16)                   *
*                                                                           *
*                        yt2_xt1   = _add2(xl1_xl0, xl20_xl21)              *
*                        yt1_xt2   = _sub2(xl1_xl0, xl20_xl21)              *
*                                                                           *
*       Also notice that xt1 and yt1 end up in separate words; these need   *
*       to be packed together to take advantage of the packed twiddle       *
*       factors that have been loaded. To achieve this, they are            *
*       re-aligned as follows:                                              *
*                                                                           *
*       yt1_xt1 = _packhl2(yt1_xt2, yt2_xt1)                                *
*       yt2_xt2 = _packhl2(yt2_xt1, yt1_xt2)                                *
*                                                                           *
*       The packed word "yt1_xt1" allows the loaded "sc" twiddle factor     *
*       to be used for the complex multiplies. The real part of the         *
*       complex multiply is implemented using _dotp2. The imaginary         *
*       part of the complex multiply is implemented using _dotpn2           *
*       after the twiddle factor's half words are swapped within the word.  *
*                                                                           *
*       (X + jY) ( C - j S) = (XC + YS) + j (YC - XS).                      *
*                                                                           *
*       The actual twiddle factors for the FFT are cosine, -sine. The       *
*       twiddle factors stored in the table are cosine and sine; hence      *
*       the sign of the "sine" term is accounted for during the             *
*       multiplication, as shown above.                                     *
*                                                                           *
*                                                                           *
*   ASSUMPTIONS                                                             *
*                                                                           *
*       The size of the FFT, n, must be a power of 4 and greater than       *
*       or equal to 16 and less than 32768.                                 *
*                                                                           *
*       The arrays 'x[]', 'y[]', and 'w[]' all must be aligned on a         *
*       double-word boundary for the "optimized" implementations.           *
*                                                                           *
*       The input and output data are complex, with the real/imaginary      *
*       components stored in adjacent locations in the array.  The real     *
*       components are stored at even array indices, and the imaginary      *
*       components are stored at odd array indices.                         *
*                                                                           *
*   C CODE                                                                  *
*                                                                           *
*                                                                           *
*   NOTES                                                                   *
*                                                                           *
*                                                                           *
*   CYCLES                                                                  *
*                                                                           *
*       cycles = [12*N/8+12]*ceil[log4(N)-1]+6*N/4+79                       *
*       For N = 512, cycles = 3967                                          *
*                                                                           *
*   CODESIZE                                                                *
*                                                                           *
*       1056 bytes                                                          *
*                                                                           *
* ------------------------------------------------------------------------- *
*             Copyright (c) 2005 Texas Instruments, Incorporated.           *
*                            All Rights Reserved.                           *
* ========================================================================= *
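
The packed butterfly, the re-packing step, and the complex multiply described in the header can be checked on a host machine with a small C sketch. The helpers below are portable stand-ins for `_add2`, `_sub2`, `_rotl(x, 16)`, `_packhl2`, `_dotp2`, and `_dotpn2`, written for illustration only; they are not TI's implementations. The names `x_i0` through `x_i3`, `butterfly_lower`, `Bfly`, and the `si_co` layout (sine in the upper half, cosine in the lower half) are assumptions, chosen so that `_dotp2` yields XC + YS as in the header.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins for the C64x intrinsics named in the header.
 * Assumption: a 32-bit word holds two signed 16-bit halves, hi:lo. */
static int16_t s16(uint32_t u)                  /* sign-extend low 16 bits */
{
    return (int16_t)((int32_t)((u & 0xFFFFu) ^ 0x8000u) - 0x8000);
}
static int16_t hi16(uint32_t w) { return s16(w >> 16); }
static int16_t lo16(uint32_t w) { return s16(w); }

static uint32_t pack2(int hi, int lo)           /* build hi:lo, wrapping to 16 bits */
{
    return (((uint32_t)hi & 0xFFFFu) << 16) | ((uint32_t)lo & 0xFFFFu);
}
static uint32_t add2(uint32_t a, uint32_t b)    /* _add2: per-half add, no saturation */
{
    return pack2(hi16(a) + hi16(b), lo16(a) + lo16(b));
}
static uint32_t sub2(uint32_t a, uint32_t b)    /* _sub2: per-half subtract */
{
    return pack2(hi16(a) - hi16(b), lo16(a) - lo16(b));
}
static uint32_t swap2(uint32_t a)               /* _rotl(a, 16): swap the halves */
{
    return (a << 16) | (a >> 16);
}
static uint32_t packhl2(uint32_t a, uint32_t b) /* _packhl2: hi(a) : lo(b) */
{
    return (a & 0xFFFF0000u) | (b & 0x0000FFFFu);
}
static int32_t dotp2(uint32_t a, uint32_t b)    /* _dotp2: hi*hi + lo*lo */
{
    return (int32_t)hi16(a) * hi16(b) + (int32_t)lo16(a) * lo16(b);
}
static int32_t dotpn2(uint32_t a, uint32_t b)   /* _dotpn2: hi*hi - lo*lo */
{
    return (int32_t)hi16(a) * hi16(b) - (int32_t)lo16(a) * lo16(b);
}

typedef struct {
    uint32_t yt1_xt1, yt2_xt2;  /* re-packed butterfly outputs */
    int32_t  re, im;            /* (xt1 + j*yt1) * (co - j*si) */
} Bfly;

/* x_i0..x_i3: packed imag:real inputs at indices i0..i3 (hypothetical names).
 * si_co: packed twiddle, sine in the high half, cosine in the low half. */
static Bfly butterfly_lower(uint32_t x_i0, uint32_t x_i1,
                            uint32_t x_i2, uint32_t x_i3, uint32_t si_co)
{
    Bfly r;
    uint32_t xl1_xl0   = sub2(x_i0, x_i2);
    uint32_t xl21_xl20 = sub2(x_i1, x_i3);
    uint32_t xl20_xl21 = swap2(xl21_xl20);
    uint32_t yt2_xt1   = add2(xl1_xl0, xl20_xl21);
    uint32_t yt1_xt2   = sub2(xl1_xl0, xl20_xl21);

    r.yt1_xt1 = packhl2(yt1_xt2, yt2_xt1);
    r.yt2_xt2 = packhl2(yt2_xt1, yt1_xt2);

    r.re = dotp2(r.yt1_xt1, si_co);          /* yt1*si + xt1*co */
    r.im = dotpn2(r.yt1_xt1, swap2(si_co));  /* yt1*co - xt1*si */
    return r;
}

/* Worked example: compares the packed path against the scalar equations. */
static void butterfly_lower_demo(void)
{
    /* x[i0] = 10 + 20j, x[i1] = 3 + 4j, x[i2] = 1 + 2j, x[i3] = 7 + 5j */
    Bfly r = butterfly_lower(pack2(20, 10), pack2(4, 3),
                             pack2(2, 1), pack2(5, 7), pack2(2, 3));
    assert(lo16(r.yt1_xt1) == (10 - 1) + (4 - 5));   /* xt1 = xl0 + xl21 */
    assert(hi16(r.yt1_xt1) == (20 - 2) - (3 - 7));   /* yt1 = xl1 - xl20 */
    assert(r.re == 8 * 3 + 22 * 2);                  /* XC + YS */
    assert(r.im == 22 * 3 - 8 * 2);                  /* YC - XS */
}
```

Calling `butterfly_lower_demo()` exercises one butterfly with cosine 3 and sine 2. The sketch keeps 16-bit halves, as the header's intrinsics imply, and does not model scaling or overflow handling.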


* ======================================================================== *
* ======================================================================== *
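
The cycle-count formula in the CYCLES section can be evaluated with integer arithmetic alone. The sketch below is one way to do that; the function name `fft32x32_cycles` is hypothetical. It accepts any power of two in the stated range, on the assumption that this is intended, because the header's own example uses 512, which is a power of 2 but not of 4.

```c
#include <assert.h>

/* cycles = [12*N/8 + 12] * ceil[log4(N) - 1] + 6*N/4 + 79 */
static unsigned long fft32x32_cycles(unsigned n)
{
    unsigned log2n = 0, stages;

    /* Range from the ASSUMPTIONS section; power of two is an assumption. */
    assert(n >= 16 && n < 32768 && (n & (n - 1)) == 0);

    while ((n >> (log2n + 1)) != 0)   /* log2n = floor(log2(n)) */
        log2n++;

    /* For a power of two, ceil(log4(n) - 1) = ceil(log2n / 2) - 1. */
    stages = (log2n + 1) / 2 - 1;

    return (12UL * n / 8 + 12) * stages + 6UL * n / 4 + 79;
}
```

For `n = 512` the function returns 3967, which matches the figure quoted in the header.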

        .text
        .global _fft32x32
_fft32x32:
*================== SYMBOLIC REGISTER ASSIGNMENTS: SETUP ====================*
        .asg            B15,        B_SP                        ; Stack pointer, B datapath
        .asg            A0,         A_SP                        ; Stack pointer, A datapath
        .asg            B3,         B_ret                       ; Return address


        ;registers used in kernel
        .asg            A7,         A_w0
        .asg            A8,         A_h2
        .asg            B7,         B_h2
        .asg            A9,         A_fft_jmp
        .asg            B14,        B_fft_jmp
        .asg            A6,         A_j
        .asg            A10,        A_w
        .asg            B28,        B_w
        .asg            B25,        B_co10
        .asg            B24,        B_si10
        .asg            A25,        A_co20
        .asg            A24,        A_si20
        .asg            B29,        B_co30
        .asg            B28,        B_si30
        .asg            A17,        A_co11
        .asg            A16,        A_si11
        .asg            B31,        B_co21
        .asg            B30,        B_si21
        .asg            A29,        A_co31
        .asg            A28,        A_si31
        .asg            A5,         A_x
        .asg            B12,        B_x
        .asg            B21,        B_x_1
        .asg            B20,        B_x_0
        .asg            A17,        A_x_3
        .asg            A16,        A_x_2
        .asg            B27,        B_xh2_1i
        .asg            B26,        B_xh2_0i
        .asg            A21,        A_xh2_3i
        .asg            A20,        A_xh2_2i
        .asg            B9,         B_xl1_1i
        .asg            B8,         B_xl1_0i
        .asg            A23,        A_xl1_3i
        .asg            A22,        A_xl1_2i
        .asg            B11,        B_xl2_1i
        .asg            B10,        B_xl2_0i
        .asg            A11,        A_xl2_3i
        .asg            A10,        A_xl2_2i
        .asg            B26,        B_2h2
        .asg            A2,         A_ifj
        .asg            B19,        B_xh0_0
        .asg            B18,        B_xl0_0
        .asg            B21,        B_xh1_0
        .asg            B20,        B_xl1_0
        .asg            A15,        A_xh0_1
        .asg            A14,        A_xl0_1
        .asg            A27,        A_xh1_1
        .asg            A26,        A_xl1_1
        .asg            B27,        B_xh20_0
        .asg            B26,        B_xl20_0
        .asg            B9,         B_xh21_0
        .asg            B8,         B_xl21_0
        .asg            A13,        A_xh20_1
        .asg            A12,        A_xl20_1
        .asg            A21,        A_xh21_1
        .asg            A20,        A_xl21_1
        .asg            B24,        B_x_0o
        .asg            B25,        B_x_1o
        .asg            A26,        A_x_2o
        .asg            A27,        A_x_3o
        .asg            B17,        B_xt0_0
        .asg            B19,        B_yt0_0
        .asg            A3,         A_yt0_1
        .asg            A13,        A_xt0_1
        .asg            B17,        B_xt1_0
        .asg            B16,        B_xt2_0
        .asg            B23,        B_yt2_0
        .asg            B22,        B_yt1_0
        .asg            A31,        A_xt1_1
        .asg            A30,        A_xt2_1
        .asg            A19,        A_yt2_1
        .asg            A18,        A_yt1_1
        .asg            B22,        B_p0
        .asg            B10,        B_p1
        .asg            B12,        B_xh2_0o
        .asg            B18,        B_p2
        .asg            B22,        B_p3
        .asg            B13,        B_xh2_1o
        .asg            A20,        A_p4
        .asg            A31,        A_p5
        .asg            A18,        A_xh2_2o
        .asg            A28,        A_p6
        .asg            A12,        A_p7
        .asg            A19,        A_xh2_3o
        .asg            A28,        A_p8
        .asg            B22,        B_p9
        .asg            B18,        B_xl1_0o
        .asg            B3,         B_pa
        .asg            A12,        A_pb
        .asg            B19,        B_xl1_1o
        .asg            A20,        A_pc
        .asg            B20,        B_pd
        .asg            A14,        A_xl1_2o
        .asg            A20,        A_pe
        .asg            B4,         B_pf
        .asg            A15,        A_xl1_3o
        .asg            B11,        B_p10
        .asg            B13,        B_p11
        .asg            B26,        B_xl2_0o
        .asg            B25,        B_p12
        .asg            B5,         B_p13
        .asg            B27,        B_xl2_1o
        .asg            A22,        A_p14
        .asg            A31,        A_p15
        .asg            A22,        A_xl2_2o
        .asg            A25,        A_p16
        .asg            A31,        A_p17
        .asg            A23,        A_xl2_3o
        .asg            A4,         A_y
        .asg            B8,         B_y
        .asg            A11,        A_2h2
        .asg            B6,         B_j
        .asg            B2,         B_ifj
        ; end registers used in kernel


        ; registers used in outer loop
        .asg            B6,         B_ptr_y
        .asg            B0,         B_stride
        .asg            B1,         B_wh
        .asg            A12,        A_ptr_x
        .asg            A13,        A_ptr_w
        .asg            A1,         A_tw_offset
        .asg            B15,        B_radix

        .asg            A6,         A_ptr_x
        .asg            A4,         A_ptr_w
        .asg            B8,         B_i
        .asg            B4,         B_n
        .asg            B1,         B_radix2
        .asg            A8,         A_radix
        .asg            A0,         A_SP

        .asg            B9,         B_fft_jmp_old
        .asg            B16,        B_fft_jmp_temp
        .asg            B21,        B_h2_old
        .asg            B31,        B_h2_old_2
        .asg            A16,        A_y_old
        .asg            A17,        A_h2_old
        .asg            A30,        A_h2_old_2
        .asg            B22,        B_fft_jmp_old_2
        .asg            B17,        B_y_old
        .asg            A24,        A_y_old_2

; ====================== END SYMBOLIC REGISTER ASSIGNMENTS =======================

        ; Stack frame.  16 words:  A10..A15, B10..B14, B3, A_ptr_x, A_ptr_w, B_ptr_y, B_n



        STW     .D2T2   B14,        *B_SP--[16]                 ; Reserve 16-word stack frame, save B14
||      SHRU    .S2     B_n,        3,          B_i

        MV      .S1X    B_SP,       A_SP                        ; Twin Stack Pointer
||      MVC     .S2     B_i,        ILC
||      NORM    .L2     B_n,        B_radix2
||      ZERO    .L1     A_tw_offset
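
The two values computed in this prologue can be reproduced on a host with a short C sketch: `SHRU` yields `n >> 3`, which is moved into `ILC`, and `NORM` yields the number of redundant sign bits of `n`. The helper names `norm32`, `Prologue`, and `prologue_values` are hypothetical, and how `B_radix2` is used afterwards is not shown in this part of the listing, so the sketch stops at the raw values.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for NORM: count of redundant sign bits in a 32-bit value. */
static unsigned norm32(int32_t v)
{
    uint32_t u = (v < 0) ? ~(uint32_t)v : (uint32_t)v;
    unsigned count = 0;

    /* Count the zero bits below the sign bit, starting at bit 30. */
    for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == 0; bit--)
        count++;
    return count;
}

typedef struct {
    uint32_t i;        /* value moved into ILC: n >> 3 (SHRU) */
    unsigned radix2;   /* NORM result for n */
} Prologue;

static Prologue prologue_values(uint32_t n)
{
    Prologue p;

    assert(n >= 16 && n < 32768);   /* range from the ASSUMPTIONS section */
    p.i = n >> 3;
    p.radix2 = norm32((int32_t)n);
    return p;
}
```

For `n = 512` this gives a loop count of 64 and a NORM result of 21, since 512 occupies bit 9 and bits 30 down to 10 are redundant sign bits.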
