// Optimized multirate digital filter source code for the Blackfin 533 DSP.
/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs  ( Micro Signal Architecture 1.0 specification).

By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software. 
********************************************************************************
Module Name     : fir_interp_spl.asm
Label name      : __fir_interp_spl
Version         : 1.3
Change History  :
                Version   Date            Author        Comments
                1.3       11/18/2002      Swarnalatha   Tested with VDSP++ 3.0
                                                        compiler 6.2.2 on 
                                                        ADSP-21535 Rev.0.2
                1.2       11/13/2002      Swarnalatha   Tested with VDSP++ 3.0
                                                        on ADSP-21535 Rev. 0.2
                1.1       03/27/2002      Nishanth      Modified to match 
                                                        silicon cycle count
                1.0       06/02/2001      Nishanth      Original

Description     : This function implements an FIR-based interpolation filter.
                  It produces the filtered, interpolated output for a given
                  input. The characteristics of the filter depend on the
                  coefficient values, the number of taps (L) and the
                  interpolation factor (M) supplied by the calling program.
                  The coefficients stored in array `h` are applied to the
                  elements of vector `x[]`. A 40-bit accumulator is used for
                  filtering. The most significant 16 bits of the result are
                  stored in the output vector `y[]`, computed according to an
                  interpolation index `M`. Coefficients are to be stored in
                  normal order, not as polyphases.

                  The implementation of an interpolator is demonstrated in the
                  program. The implementation provided below stops using the
                  delay line once it no longer requires samples older than
                  x(0). This has been done since the delay line will not be
                  correct for polyphases greater than 3.

                  This implementation is divided into two stages, and there is
                  an outer loop which is executed (number of polyphases)/2
                  times.

                  a) In the first stage, it finds the output samples which 
                  require delay line, i.e. for the first L/M-1 output samples
                  y(0) = h(0) * x(0) + h(M) * x(-1) + ... + h(L-M) *
                                                                x(-(L/M-1))

                  y(1) = h(1) * x(0) + h(M+1) * x(-1) + ... + h(L-M+1) *
                                                                x(-(L/M-1))
                    ...
                  y(M-1) = h(M-1) * x(0) + h(2*M-1) * x(-1) + ... + h(L-1) *
                                                                x(-(L/M-1))

                  y(M) = h(0) * x(1) + h(M) * x(0) + ... + h(L-M) *
                                                                x(-(L/M-2))
                    ...

                  This stage has been separated out because it uses the delay
                  line. There are two inner loops. One sums the terms whose
                  inputs are in the delay line, and the other sums the terms
                  whose inputs are in the input buffer.

                  b) In the second stage, all the remaining output samples are 
                  calculated. i.e. y((L/M)-1) to y(Nout - 1) are computed in 
                  stage 2.

                  c) After filtering the input, the delay line is updated by the
                  last L/M-1 input samples.

                  d) Two output samples are computed simultaneously using the
                  2 MACs. To find the rest of the output samples corresponding
                  to each input, there is an outer loop which is executed M/2
                  times.
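
                  The stages above all evaluate the same sum,
                  y(n*M + p) = h(p) * x(n) + h(p+M) * x(n-1) + ... +
                  h(p+L-M) * x(n-(L/M-1)). A minimal C reference sketch of
                  that sum, useful for checking results, might look like the
                  following. It is not the optimized routine: the name
                  fir_interp_ref, the helper functions, the assumed delay-line
                  order (oldest sample first) and the round-and-saturate
                  extraction are assumptions made for illustration, and
                  bit-exact agreement with the hardware rounding mode is not
                  claimed.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t fract16;  // 1.15 fixed point, as in the prototype

// Hypothetical helper: x(i) for i >= 0 comes from x[], older samples from d[].
// Assumed layout: d[0] is the oldest sample, d[lbym-2] is x(-1).
static fract16 sample_at(const fract16 x[], const fract16 d[], int lbym, int i)
{
    return (i >= 0) ? x[i] : d[(lbym - 1) + i];
}

// Hypothetical helper: round and saturate an accumulator to its high 16 bits.
static fract16 extract_high16(int64_t acc)
{
    int64_t r = acc + 0x8000;  // round to nearest (assumed rounding mode)
    // Floor-divide by 65536 without right-shifting a negative value.
    r = (r >= 0) ? (r / 65536) : -((-r + 65535) / 65536);
    if (r > 32767) r = 32767;
    if (r < -32768) r = -32768;
    return (fract16)r;
}

// Straightforward model: for each input n, compute all M polyphase outputs.
void fir_interp_ref(const fract16 x[], fract16 y[], int ni,
                    const fract16 h[], int l, int m, fract16 d[])
{
    assert(m > 0 && l % m == 0);  // Assumption 1: L is a multiple of M
    int lbym = l / m;             // taps per polyphase (L/M)

    for (int n = 0; n < ni; n++) {
        for (int p = 0; p < m; p++) {
            int64_t acc = 0;      // stands in for the 40-bit accumulator
            for (int k = 0; k < lbym; k++) {
                int32_t prod = (int32_t)h[p + k * m]
                             * sample_at(x, d, lbym, n - k);
                acc += (int64_t)prod * 2;  // 1.15 times 1.15 gives 1.31
            }
            y[n * m + p] = extract_high16(acc);
        }
    }

    // Keep the last L/M - 1 input samples for the next call.
    for (int j = 0; j < lbym - 1; j++)
        d[j] = sample_at(x, d, lbym, ni - (lbym - 1) + j);
}
```

                  With L = 4 and M = 2 the delay line holds one sample, two
                  input samples yield four outputs, and d[0] ends up holding
                  the last input sample.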

Assumptions     : 1. This routine assumes that the number of filter 
                     coefficients(L) is a multiple of interpolation factor(M)
                     since filtering of each sample requires L/M terms at the 
                     maximum and is done assuming there are L/M terms.
                  2. L is assumed to be greater than M. Otherwise L/M - 1 = 0,
                     so stage 1 is unnecessary, yet it will still be executed
                     once.
                  3. Interpolation factor (M) is assumed to be even since two
                     polyphases are done at the same time using the 2 MACs.

Prototype       : void fir_interp_spl(const fract16 x[], fract16 y[], short Ni, 
                             fract16 h[], int L, int M, int LBYM, fract16 d[]);

                    x[]  -  input array 
                    y[]  -  output array
                    Ni   -  Number of input samples
                    h[]  -  Filter coefficient array
                    L    -  No. of coefficients 
                    M    -  Interpolation Factor
                    LBYM -  Number of coefficients in one polyphase (L/M)
                    d[]  -  Delay line buffer

Registers used  : A0, A1, R0-R3, R7, I0-I3, B0-B3, M1-M3, L0-L3, P0-P2, P4, P5,
                  LC0, LC1.

Performance     : 
                Code Size   : 274  Bytes
                Cycle count : 2673 Cycles  (For Ni=256, L=16, M=2)
*******************************************************************************/
.section  L1_code;
.global __fir_interp_spl;
.align 8;
    
__fir_interp_spl:
    
    [--SP]=(R7:7,P5:4);     // Push R7 and P5-P4
    
    P5 = [SP+24];           // Address of filter coefficients
    R7 = [SP+28];           // Number of Coefficients (L)
    R3 = [SP+32];           // Interpolation Factor (M)
    P1 = [SP+36];           // Number of coefficients in one polyphase (L/M)
    
    B0 = R0;                // Input array circular buffer
    I0 = R0;                // Start of input array
    R0 = R3 + R3(S) || P2=[SP+40];
                            // R0 = 2*M and Address of delay line buffer into P2
    M1 = R0;                // M1 = 2*M
    R0 = -R0;
    M3 = R0;                // M3 = -2*M
    
    B1 = P2;                // Start of delay line (circular buffer)
    I1 = P2;
    P2 = P1 << 1;           // P2 = 2*L/M (length of delay line)
    
    B2 = P5;                // Coefficient array circular buffer
    I2 = P5;               
    
    P5 = R3;                // P5 = M
    
    B3 = R1;                // Output buffer is circular buffer
    I3 = R1;
    
    R0 = R2 + R2;           // R0 = 2*Ni
    P5 = P5 >> 1;           // P5 = M/2 (OUTER LOOP COUNTER)
    
    L0 = R0;                // Length of circular buffer = 2*Ni
    
    R1 = R7 + R7;           // R1 = 2*L
    L2 = R1;                // Length of circular buffer = 2*L
    
    R0.L = R0.L * R3.L(IS);
    L3 = R0;                // Length of output circular buffer = 2*No
                            // (No = Ni*M)
    L1 = P2;                // Length of delay circular buffer = 2*L/M
    P1 += -1;               // P1 = L/M - 1 (innermost loop, stage 1 and
                            // delay-line update counters)
    P2 = R2;                // P2 = Ni
    
    I3 += M3 || R7 = [I2++M3];
                            // Output pointer is modified, Coefficient pointer 
                            // is modified.
    I3 -= 4 || R0 = [I2--];
                            // Output pointer is modified, Coefficient pointer 
                            // is modified.
    R2 = 2(Z);              // R2 initialized to 2
    P2 -= P1;               // P2 = Ni - (L/M - 1) (LOOP3 COUNTER)

// Start of outer loop

FIR_INT_SPL_LOOP1:
    P4 = 0;                 // Loop counter for Loop 2b (using input buffer);
                            // incremented to 1 before its first use
    P0 = P1;                // Loop counter for Loop 2a(using delay line) = 
                            // L/M - 1
    R0.L = W[I0++] || I2 += 4 ;                   
                            // Modify input pointer, Modify coefficient pointer
    R3.L = R2.L + R2.H(S) ||  R1 = [I2++M3] || [I3++] = R7;
                            // Modifier for input buffer and delay line is 
                            // initialized to 2
                            // Last coefficients of the first two polyphases are
                            // fetched to R1
                            // Store result produced by last input of the two 
                            // polyphases
    LSETUP(FIR_INT_SPL_LOOP2_ST,FIR_INT_SPL_LOOP2_END) LC0 = P1; 
                            // Execute loop L/M - 1 times

// Start of Stage 1

FIR_INT_SPL_LOOP2_ST:
        A1=A0=0 || R0.L = W[I1++] || R0.H = W[I0--];
                            // x(-(L/M-1)) is fetched to R0.L from delay line, 
                            // modify input pointer
    
        M2 = R3;            // Adjust the modifier for delay line and input
                            // buffer
        P4 += 1;            // Increment input buffer counter

        LSETUP(FIR_INT_SPL_LOOP2A,FIR_INT_SPL_LOOP2A) LC1 = P0;
                            // Loop for delay line fetching
FIR_INT_SPL_LOOP2A: 
            A0+=R0.L*R1.L, A1+= R0.L*R1.H || R0.L = W[I1++] || R1 = [I2++M3];
                            // Find sum of terms having samples from delay line
                            // Fetch delay line samples to R0.L and coefficients
                            // to R1
        R3 = R3 + R2(S) || R0.L = W[I0++] || [I3++M1] = R7;
                            // Modify the register copied to modifier
                            // Fetch x(0) from input buffer, store previous 
                            // result
    
        LSETUP(FIR_INT_SPL_LOOP2B,FIR_INT_SPL_LOOP2B) LC1 = P4;
                            // Loop for input buffer fetching
FIR_INT_SPL_LOOP2B: 
            R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || R0.L = W[I0++] 
            || R1 = [I2++M3];
                            // Find sum of terms having samples from input 
                            // buffer
                            // Fetch input samples to R0.L and coefficients to 
                            // R1
        P0 += -1;           // Decrement delay line counter
FIR_INT_SPL_LOOP2_END:
        I0 -= M2 || R0 = [I1++M2];
                            // Modify input pointer, Modify delay line pointer

// End of stage 1

    R0.L = W[I0--] || R0.H = W[I1++];
                            // Modify input pointer, Modify delay line pointer
    P1 += -2;

    LSETUP(FIR_INT_SPL_LOOP3_ST,FIR_INT_SPL_LOOP3_END) LC0 = P2; 
                            // Ni - L/M + 1

// Start of stage 2

FIR_INT_SPL_LOOP3_ST:
        A1=A0=0 || R0.L = W[I0++] || [I3++M1] = R7;
                            // Fetch input sample to R0.L and store previous 
                            // result
        R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H)  || R0.L = W[I0++] || 
        R1 = [I2++M3];
        R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H)  || R0.L = W[I0++] || 
        R1 = [I2++M3];
        
        LSETUP(FIR_INT_SPL_LOOP3A,FIR_INT_SPL_LOOP3A) LC1= P1; 
                            // L/M - 3 times (P1 was reduced by 2 above)
       
FIR_INT_SPL_LOOP3A:
            R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H)  || R0.L = W[I0++] || 
            R1 = [I2++M3];
                            // Find interpolated output and Fetch input samples 
                            // and coefficients
FIR_INT_SPL_LOOP3_END:   
        R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H)  || I0 -= M2 
        || R1 = [I2++M3];
                            // Find interpolated output, Modify input pointer, 
                            // Fetch coefficient
    P1 += 2;

// End of stage 2

    P5 += -1;               // Modify outer loop counter for polyphases/2
    CC = P5 == 0;
    I0 += M2 || R0 = [I2++M1];
                            // Modify input pointer
    IF !CC JUMP FIR_INT_SPL_LOOP1(BP); 
                            // M/2 times

// End of outer loop

    I1-=4 || R0.L = W[I0--];
                            // Modify delay line pointer, Modify input pointer
    R0.L=W[I0--] || [I3++] = R7;
                            // Fetch last sample and store last result

    LSETUP(FIR_INTERP_SPL_DELUPDATE,FIR_INTERP_SPL_DELUPDATE) LC0 = P1;
                            // L/M-1 
 
FIR_INTERP_SPL_DELUPDATE:
        W[I1--]=R0.L || R0.L=W[I0--]; 
                            // Update the delay line
    (R7:7,P5:4) = [SP++];   // Pop R7 and P5-P4
    RTS;
    NOP;                    //to avoid one stall if LINK or UNLINK happens to be
                            //the next instruction after RTS in the memory.
                            
__fir_interp_spl.end: