/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs ( Micro Signal Architecture 1.0 specification).
By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software.
********************************************************************************
Module Name : fir_interp_spl.asm
Label name : __fir_interp_spl
Version : 1.3
Change History :
Version  Date        Author       Comments
1.3      11/18/2002  Swarnalatha  Tested with VDSP++ 3.0 compiler 6.2.2 on
                                  ADSP-21535 Rev. 0.2
1.2      11/13/2002  Swarnalatha  Tested with VDSP++ 3.0 on ADSP-21535 Rev. 0.2
1.1      03/27/2002  Nishanth     Modified to match silicon cycle count
1.0      06/02/2001  Nishanth     Original
Description : This function implements an FIR-based interpolation filter. It
produces the filtered, interpolated output for a given block of
input data. The characteristics of the filter depend on
the coefficient values, the number of taps (L) and the interpolation
factor (M) supplied by the calling program.
The coefficients stored in array `h` are applied to the
elements of vector `x[]`. A 40-bit accumulator is used for
filtering. The most significant 16 bits of the result are stored in
the output vector `y[]`, computed according to an interpolation
index `M`. Coefficients are to be stored in normal order, not
as polyphases.
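A rough C model of the arithmetic described above may help when checking
results on a host machine. It is only a sketch and not part of the original
module: the helper names `mac_q15` and `acc_to_fract16` are hypothetical, the
16-bit extraction truncates instead of modelling any hardware rounding mode,
the hardware special case for -1.0 * -1.0 is not modelled, and `>>` on a
negative value is assumed to be an arithmetic shift.

```c
#include <stdint.h>

typedef int16_t fract16;    // 1.15 signed fraction (assumed to match the prototype's fract16)

// Multiply two 1.15 fractions and add the 1.31 product to a 40-bit
// accumulator, modelled here in an int64_t and saturated to 40 bits.
static int64_t mac_q15(int64_t acc, fract16 x, fract16 h)
{
    const int64_t max40 = ((int64_t)1 << 39) - 1;
    const int64_t min40 = -((int64_t)1 << 39);
    int64_t product = ((int64_t)x * (int64_t)h) * 2;   // 1.15 x 1.15 -> 1.31

    acc += product;
    if (acc > max40) acc = max40;
    if (acc < min40) acc = min40;
    return acc;
}

// Keep the upper 16 bits of the 32-bit fractional result, saturating when
// the accumulator has grown into its guard bits. Truncation is an assumption.
static fract16 acc_to_fract16(int64_t acc)
{
    int64_t hi = acc >> 16;     // assumes arithmetic shift for negative values

    if (hi > 32767) hi = 32767;
    if (hi < -32768) hi = -32768;
    return (fract16)hi;
}
```

For example, accumulating 0.5 * 0.5 (0x4000 * 0x4000) once and extracting the
result gives 0x2000, i.e. 0.25.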
The implementation of an interpolator is demonstrated in the
program. The implementation provided below stops using the
delay line once it no longer requires samples older than x(0).
This has been done since the delay line will not be correct for
polyphases greater than 3.
This implementation is divided into two stages, enclosed in an
outer loop that runs (number of polyphases)/2 times.
a) In the first stage, it finds the output samples which
require the delay line, i.e. the outputs produced by the first
L/M-1 input samples:
y(0) = h(0) * x(0) + h(M) * x(-1) + ... + h(L-M) * x(-(L/M-1))
y(1) = h(1) * x(0) + h(M+1) * x(-1) + ... + h(L-M+1) *
x(-(L/M-1))
...
y(M-1) = h(M-1) * x(0) + h(2*M-1) * x(-1) + ... + h(L-1) *
x(-(L/M-1))
y(M) = h(0) * x(1) + h(M) * x(0) + ... + h(L-M) *
x(-(L/M-2))
....
This stage has been separated out due to the use of delay
line. There are two inner loops. One finds sum of terms
containing inputs present in delay line and the other, ones in
input buffer.
b) In the second stage, all the remaining output samples are
calculated, i.e. the outputs produced by input samples
x(L/M-1) to x(Ni-1).
c) After filtering the input, the delay line is updated by the
last L/M-1 input samples.
d) Two output samples (two adjacent polyphases) are computed
simultaneously using the 2 MACs. The remaining output samples for
each input are produced by the outer loop, which runs M/2 times.
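The structure in a) to d) can be sketched as a straightforward C reference
model. This is a host-side illustration only, not the code of this module:
the names `fir_interp_ref` and `sample_at` are hypothetical, the delay line
layout (oldest sample first) is an assumption that may differ from the
assembly, the 16-bit extraction truncates and saturates without modelling
hardware rounding, and `>>` on a negative value is assumed to be an
arithmetic shift.

```c
#include <stdint.h>

typedef int16_t fract16;    // 1.15 signed fraction (assumed to match the prototype's fract16)

// Return x(i). Non-negative indices come from the input buffer, negative
// ones from the delay line. Assumed layout: d[0] = x(-(lbym-1)), ...,
// d[lbym-2] = x(-1).
static fract16 sample_at(const fract16 x[], const fract16 d[], int lbym, int i)
{
    return (i >= 0) ? x[i] : d[lbym - 1 + i];
}

// y(n*M + p) = sum over k of h(k*M + p) * x(n - k), for k = 0 .. L/M - 1,
// followed by the delay line update with the last L/M - 1 input samples.
void fir_interp_ref(const fract16 x[], fract16 y[], int ni,
                    const fract16 h[], int l, int m, fract16 d[])
{
    int lbym = l / m;                       // taps per polyphase (L/M)

    for (int n = 0; n < ni; n++) {          // one pass per input sample
        for (int p = 0; p < m; p++) {       // one pass per polyphase
            int64_t acc = 0;                // stands in for the 40-bit accumulator
            for (int k = 0; k < lbym; k++) {
                int64_t xs = sample_at(x, d, lbym, n - k);
                acc += (xs * (int64_t)h[k * m + p]) * 2;   // 1.15 x 1.15 -> 1.31
            }
            int64_t hi = acc >> 16;         // keep the upper 16 bits
            if (hi > 32767) hi = 32767;
            if (hi < -32768) hi = -32768;
            y[n * m + p] = (fract16)hi;
        }
    }

    // Save the last L/M - 1 input samples for the next call, oldest first.
    for (int j = 0; j < lbym - 1; j++) {
        d[j] = sample_at(x, d, lbym, ni - (lbym - 1) + j);
    }
}
```

The model computes every output with the same inner loop; the split into
stage 1 and stage 2 in the assembly only separates the outputs that read the
delay line from those that read the input buffer alone.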
Assumptions : 1. This routine assumes that the number of filter
coefficients(L) is a multiple of interpolation factor(M)
since filtering of each sample requires L/M terms at the
maximum and is done assuming there are L/M terms.
2. L is assumed to be greater than M. Otherwise L/M - 1 = 0
and hence stage 1 need not be done but will be done once.
3. Interpolation factor (M) is assumed to be even since two
polyphases are done at the same time using the 2 MACs.
Prototype : void fir_interp_spl(const fract16 x[], fract16 y[], short Ni,
fract16 h[], int L, int M, int LBYM, fract16 d[]);
x[] - input array
y[] - output array
Ni - Number of input samples
h[] - Filter coefficient array
L - No. of coefficients
M - Interpolation Factor
LBYM - Number of coefficients in one polyphase (L/M)
d[] - Delay line buffer
Registers used : A0, A1, R0-R3, R7, I0-I3, B0-B3, M1-M3, L0-L3, P0-P2, P4, P5,
LC0, LC1.
Performance :
Code Size : 274 Bytes
Cycle count : 2673 Cycles (For Ni=256, L=16, M=2)
*******************************************************************************/
.section L1_code;
.global __fir_interp_spl;
.align 8;
__fir_interp_spl:
[--SP]=(R7:7,P5:4); // Push R7 and P5-P4
P5 = [SP+24]; // Address of filter coefficients
R7 = [SP+28]; // Number of Coefficients (L)
R3 = [SP+32]; // Interpolation Factor (M)
P1 = [SP+36]; // Number of coefficients in one polyphase (L/M)
B0 = R0; // Input array circular buffer
I0 = R0; // Start of input array
R0 = R3 + R3(S) || P2=[SP+40];
// R0 = 2*M and Address of delay line buffer into P2
M1 = R0; // M1 = 2*M
R0 = -R0;
M3 = R0; // M3 = -2*M
B1 = P2; // Start of delay line (circular buffer)
I1 = P2;
P2 = P1 << 1; // P2 = 2*L/M (length of delay line)
B2 = P5; // Coefficient array circular buffer
I2 = P5;
P5 = R3; // P5 = M
B3 = R1; // Output buffer is circular buffer
I3 = R1;
R0 = R2 + R2; // R0 = 2*Ni
P5 = P5 >> 1; // P5 = M/2 (OUTER LOOP COUNTER)
L0 = R0; // Length of circular buffer = 2*Ni
R1 = R7 + R7; // R1 = 2*L
L2 = R1; // Length of circular buffer = 2*L
R0.L = R0.L * R3.L(IS);
L3 = R0; // Length of circular buffer = 2*Ni*M bytes (No = Ni*M samples)
L1 = P2; // Length of delay circular buffer = 2*L/M
P1 += -1; // P1 = L/M -1 (Innermost loop , Stage1 , delayline
// updation counters)
P2 = R2; // P2 = Ni
I3 += M3 || R7 = [I2++M3];
// Output pointer is modified, Coefficient pointer
// is modified.
I3 -= 4 || R0 = [I2--];
// Output pointer is modified, Coefficient pointer
// is modified.
R2 = 2(Z); // R2 initialized to 2
P2 -= P1; // P2 = Ni - (L/M - 1) (LOOP3 COUNTER)
// Start of outer loop
FIR_INT_SPL_LOOP1:
P4 = 0; // Loop counter for Loop 2b(using input buffer), starts at 0 and
// is incremented to 1 before its first use
P0 = P1; // Loop counter for Loop 2a(using delay line) =
// L/M - 1
R0.L = W[I0++] || I2 += 4 ;
// Modify input pointer, Modify coefficient pointer
R3.L = R2.L + R2.H(S) || R1 = [I2++M3] || [I3++] = R7;
// Modifier for input buffer and delay line is
// initialized to 2
// Last coefficients of the first two polyphases are
// fetched to R1
// Store result produced by last input of the two
// polyphases
LSETUP(FIR_INT_SPL_LOOP2_ST,FIR_INT_SPL_LOOP2_END) LC0 = P1;
// Execute loop L/M - 1 times
// Start of Stage 1
FIR_INT_SPL_LOOP2_ST:
A1=A0=0 || R0.L = W[I1++] || R0.H = W[I0--];
// x(-(L/M-1)) is fetched to R0.L from delay line,
// modify input pointer
M2 = R3; // Adjust the modifier for delay line and input
// buffer
P4 += 1; // Increment input buffer counter
LSETUP(FIR_INT_SPL_LOOP2A,FIR_INT_SPL_LOOP2A) LC1 = P0;
// Loop for delay line fetching
FIR_INT_SPL_LOOP2A:
A0+=R0.L*R1.L, A1+= R0.L*R1.H || R0.L = W[I1++] || R1 = [I2++M3];
// Find sum of terms having samples from delay line
// Fetch delay line samples to R0.L and coefficients
// to R1
R3 = R3 + R2(S) || R0.L = W[I0++] || [I3++M1] = R7;
// Modify the register copied to modifier
// Fetch x(0) from input buffer, store previous
// result
LSETUP(FIR_INT_SPL_LOOP2B,FIR_INT_SPL_LOOP2B) LC1 = P4;
// Loop for input buffer fetching
FIR_INT_SPL_LOOP2B:
R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || R0.L = W[I0++]
|| R1 = [I2++M3];
// Find sum of terms having samples from input
// buffer
// Fetch input samples to R0.L and coefficients to
// R1
P0 += -1; // Decrement delay line counter
FIR_INT_SPL_LOOP2_END:
I0 -= M2 || R0 = [I1++M2];
// Modify input pointer, Modify delay line pointer
// End of stage 1
R0.L = W[I0--] || R0.H = W[I1++];
// Modify input pointer, Modify delay line pointer
P1 += -2;
LSETUP(FIR_INT_SPL_LOOP3_ST,FIR_INT_SPL_LOOP3_END) LC0 = P2;
// Ni - L/M + 1
// Start of stage 2
FIR_INT_SPL_LOOP3_ST:
A1=A0=0 || R0.L = W[I0++] || [I3++M1] = R7;
// Fetch input sample to R0.L and store previous
// result
R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || R0.L = W[I0++] ||
R1 = [I2++M3];
R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || R0.L = W[I0++] ||
R1 = [I2++M3];
LSETUP(FIR_INT_SPL_LOOP3A,FIR_INT_SPL_LOOP3A) LC1= P1;
// L/M - 3 times (P1 was reduced by 2 before stage 2)
FIR_INT_SPL_LOOP3A:
R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || R0.L = W[I0++] ||
R1 = [I2++M3];
// Find interpolated output and Fetch input samples
// and coefficients
FIR_INT_SPL_LOOP3_END:
R7.L=(A0+=R0.L*R1.L), R7.H=(A1+= R0.L*R1.H) || I0 -= M2
|| R1 = [I2++M3];
// Find interpolated output, Modify input pointer,
// Fetch coefficient
P1 += 2;
// End of stage 2
P5 += -1; // Modify outer loop counter for polyphases/2
CC = P5 == 0;
I0 += M2 || R0 = [I2++M1];
// Modify input pointer
IF !CC JUMP FIR_INT_SPL_LOOP1(BP);
// M/2 times
// End of outer loop
I1-=4 || R0.L = W[I0--];
// Modify delay line pointer, Modify input pointer
R0.L=W[I0--] || [I3++] = R7;
// Fetch last sample and store last result
LSETUP(FIR_INTERP_SPL_DELUPDATE,FIR_INTERP_SPL_DELUPDATE) LC0 = P1;
// L/M-1
FIR_INTERP_SPL_DELUPDATE:
W[I1--]=R0.L || R0.L=W[I0--];
// Updation of delay line
(R7:7,P5:4) = [SP++]; // Pop R7 and P5-P4
RTS;
NOP; //to avoid one stall if LINK or UNLINK happens to be
//the next instruction after RTS in the memory.
__fir_interp_spl.end: