/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs ( Micro Signal Architecture 1.0 specification).
By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software.
********************************************************************************
Module name : isadct.asm
Label name : __isadct
Version : 1.3
Change History :
Version   Date         Author        Comments
1.3       11/18/2002   Swarnalatha   Tested with VDSP++ 3.0
                                     compiler 6.2.2 on
                                     ADSP-21535 Rev.0.2
1.2       11/13/2002   Swarnalatha   Tested with VDSP++ 3.0
                                     on ADSP-21535 Rev.0.2
1.1       03/10/2002   Manoj         Modified to match
                                     silicon cycle count
1.0       08/30/2001   Manoj         Original
Description : This program performs the inverse SADCT on an 8x8 block as
prescribed in the MPEG-4 standard. It takes the transformed input data array
X[], in short format, to be inverse transformed
X00,X01 ...X07;
X10,X11 ....X17;
........
X70,X71.....X77;
and the corresponding shape information in character
format (alpha map). Consider a 3x3 shape array (chosen
for ease of demonstration). The following sequence of
operations is performed.
a) Perform inverse SADCT of the appropriate length on the rows
of the input X[]. To do this the shape array is transformed as
shown (after column alignment and row alignment [refer
sadct.asm])
[255 255   0 ;   col+row   [255 255 255 ;
   0   0 255 ;   =======>   255   0   0 ;
   0 255   0 ]    align       0   0   0 ]
Perform ISADCT(3) on row 1 of X and ISADCT(1) on row 2 of X to
get XC. Row 3 is skipped as it has no non-zero shape elements.
Here ISADCT(N) is the NxN matrix with elements
K * cos(j*(i+0.5)*(pi/N)),
where i,j are in [0, N) and K = sqrt(1/N) for j = 0,
                              = sqrt(2/N) otherwise.
N is the number of shape elements in the row on which ISADCT
is being performed. In this program, the ISADCT coefficients
are stored in an array and direct matrix multiplication
is used to implement the ISADCT. Note that
for N > 6 a dedicated flowgraph implementation of ISADCT would be
more efficient (weighing the conditional branch against the ISADCT
complexity). However, this has not been incorporated in this
program.
For ISADCT(8), Chen's IDCT will suffice. If required, the
flowgraph of ISADCT(7) can be integrated. However, since the
cycle count is highly dependent on the shape array, the user
has to decide whether the flowgraph approach is worthwhile
in the application, weighing code size against speed
improvement. In the implementation provided, two ISADCT
outputs are computed simultaneously, at the cost of slightly
more memory for the coefficients.
b) Undo the row alignment as shown
[255 255 255 ;     row     [255 255 255 ;
 255   0   0 ;   =======>     0 255   0 ;   =====>
   0   0   0 ]   unalign      0   0   0 ]
XC=[XC00 XC01 XC12 ;
    0    XC31 0    ;
    0    0    0    ]
c) Perform inverse SADCT along the columns of XC using ISADCT
kernels of appropriate lengths, i.e. perform ISADCT(1) on
column 1 of XC, ISADCT(2) on column 2 of XC and ISADCT(1) on
column 3.
xc=[x00 x10 x20 ;
    0   x11 0   ;
    0   0   0   ]
d) Undo the column alignment as shown
[255 255 255 ;   column   [255 255   0 ;
   0 255   0 ;   =======>    0   0 255 ;   =====>
   0   0   0 ]   unalign     0 255   0 ]
xrec=[x00 x10 0   ;
      0   0   x20 ;
      0   x11 0   ]
Instead of rearranging the shape array several times, as is
conventionally done, this program stores the per-column counts
of the shape array in temporary storage on the stack and uses
them to find the ISADCT length both row-wise and column-wise.
Prototype : void isadct(short in[], unsigned char shape[], short out[],
short coeff_tans[]);
in -> Address of the 8x8 sadct array
shape -> Address of the 8x8 shape array
out -> Address of the 8x8 output data array
coeff_tans -> Address of the coefficients
Registers used : A0, A1, R0-R7, I0-I3, B0-B3, M0, M2, L0-L3, P0-P5, LC0, LC1.
Performance :
Code Size : 498 Bytes
Cycle count : 2284 Cycles for a lower triangular output matrix
(including the diagonal elements)
*******************************************************************************/
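//Reference sketch (not part of the original module): a floating-point C
//model of the 1-D kernel described in the header, assuming the usual
//orthonormal DCT scaling (K = sqrt(1/N) for j = 0, sqrt(2/N) otherwise).
//The names isadct_1d, X, x and N are hypothetical. The fixed-point code
//below reads precomputed coefficients instead of calling cos().
//    #include <math.h>
//    void isadct_1d(const double *X, double *x, int N)
//    {
//        const double pi = 3.14159265358979323846;
//        int i, j;
//        for (i = 0; i < N; i++) {
//            double acc = 0.0;
//            for (j = 0; j < N; j++) {
//                double K = (j == 0) ? sqrt(1.0 / N) : sqrt(2.0 / N);
//                acc += K * cos(j * (i + 0.5) * (pi / N)) * X[j];
//            }
//            x[i] = acc;
//        }
//    }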
.section L1_code;
.global __isadct;
.align 8;
.extern _Coeff_offset;
__isadct:
//Initializations
B0 = R0; //Address of the sadct array
P0 = R1; //Address of the shape array
R1 = [SP+12]; //Address of the coefficient array
[--SP] = (R7:4,P5:3);
P5 = R2; //Address of the output array
B3 = R2; //Address of the output array
I2 = R1; //Address of the coefficient array
B2 = R1; //Address of the coefficient array
I3.L = _Coeff_offset; //Address of the offset array
I3.H = _Coeff_offset; //Address of the offset array
L2 = 0;
L3 = 0;
R4 = 1;
P3 = 8;
SP += -16; //Temporary storage
R5 = SP;
SP += -16; //To store the column length information in shape
I1 = SP;
B1 = SP;
L1 = 16;
R1 = 0;
/*Clear length array in stack*/
[I1++] = R1;[I1++] = R1;
[I1++] = R1;
R7 = R7-R7 (S) || [I1++] = R1;
//Row loop counter = 0
//Determining the number of nonzero elements in the Columns of the shape array
P4 = 64;
R6 = R6-R6 (S) || R1.L = W[I1++] || R0 = B[P0++] (Z);
//Fetch the length. Read shape
R3 = R0 >> 7;
LSETUP($1LP_ST,$1LP_END) LC0 = P4;
$1LP_ST:
R3.L = R1.L+R3.L (S) || R1.L = W[I1--] || R0 = B[P0++] (Z);
//Fetch the length. Read shape
$1LP_END:
R3 = R0>>7 || W[I1] = R3.L || I1 += 4;
//Update the length
P0 += -1;
P0 += -64; //Restore the address of the shape array
B1 = R5;
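//Illustrative note (not from the original source): the counting loop above
//appears to be equivalent to the C-style sketch below, where count[] stands
//for the eight 16-bit words at SP and k walks the row-major 8x8 shape
//array. The names count and k are hypothetical.
//    for (k = 0; k < 64; k++)
//        count[k & 7] += shape[k] >> 7;
//The shift adds 1 for a shape value of 255 and 0 for a value of 0, so each
//count[c] ends up holding the number of set shape elements in column c.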
/****************************Inverse Row SADCT*********************************/
I_ROW_ST:
P1 = SP; //P1 points to the count
R3 = R4<<3 || R1 = W[P1++](Z);
//Count for Row_ISADCT. Read the column count
LSETUP($2LP_ST,$2LP_END) LC0 = P3>>1;
$2LP_ST:
CC = R1 <= R7;
R2 = CC;
R3 = R3-R6 (S) || R1 = W[P1++](Z);
//Count the number of non-zero values along each row
CC = R1 <= R7;
R6 = CC;
$2LP_END:
R3 = R3-R2 (S) || R1 = W[P1++](Z);
//Count the number of non-zero values along each row
CC = R3 == 0; //Check if the row length is zero. If zero, row
//SADCT is over
IF CC JUMP I_ROW_OVER;
/*Process the row from the input array I0 by adjusting I0, B0 and L0*/
R1 = R3 << 1 || NOP; //2*length
M0 = R1;
I0 = B0; //Set I0 to B0
L0 = R1; //Set I0 as a circular buffer of desired length
P2 = R3; //Loop counter
R2 = R2-R2 (S) || I3 += M0;
//Point to the right offset
R1 = R3+R4 (S) || R2.L = W[I3] || I3 -= M0;
//Length+1, Fetch the offset. Restore I3
M2 = R2;
P4 = R1;
I1 = B1;
P1 = SP; //Restore the address of the count
/*Compute ISADCT*/
I2 += M2;
A1 = A0 = 0 || R3.L = W[I0++] || R1 = [I2++];
LSETUP($3LP_ST,$3LP_END) LC1 = P4>>1;
//Set Loop for (L+1)>>1
$3LP_ST:
LSETUP($4LP_ST,$4LP_ST) LC0 = P2;
//Set Loop for L
$4LP_ST:
R2.H = (A1 += R3.L*R1.H),R2.L = (A0 += R3.L*R1.L) || R3.L = W[I0++]
|| R1 = [I2++];
//Fetch a data and 2 coeff.
$3LP_END:
A1 = A0 = 0 || [I1++] = R2;
/*Store the data in the right position in output array*/
I1 = B1;
R5 = R4+|+R4,R6 = R4-|-R4 || R1 = W[P1++](Z);
//Set R5 = 2 and clear R6
R0 = I1;
R3 = I1;
R3 = R3+R5 (S)|| R2.L = W[I1] ;
//Read the stored data, read the count
CC = R1 <= R7;
LSETUP($5LP_ST,$5LP_END) LC0 = P3;
$5LP_ST:
IF CC R3 = R0;
I1 = R3;
IF CC R2 = R6;
R0 = PACK(R3.H,R3.L) || R1 = W[P1++](Z);
CC = R1 <= R7;
$5LP_END:
R3 = R3+R5 (S)|| R2.L = W[I1] || W[P5++] = R2;
//Read the stored data, read the count
/*Update pointers*/
R0 = B0;
R0 += 16; //As data packing ensures that if a row count is
//zero, ISADCT_ROW is over
L0 = 0; //Clear the circular buffering of I0
B0 = R0;
R7 = R7+R4 (S);
I2 = B2; //Restore pointer to the coeff. buffer
CC = R7 <= 7 (IU);
IF CC JUMP I_ROW_ST (BP);
I_ROW_OVER:
/****************************Inverse Column SADCT*****************************/
P1 = SP; //Column count
SP += -16; //Temporary storage for output
R7 = 0; //Column loop counter
P3 = 16;
I_COL_ST:
R1 = R7 << 1 || R3 = W[P1++](Z);
//Column ISADCT count
CC = R3 <= 0;
IF CC JUMP I_COL_END;
R0 = B3;
R0 = R0+R1 (S);
P5 = R0; //Address of the current column
P2 = R3; //Column ISADCT count
I1 = B1;
R1 = R3 << 1; //2*length
M0 = R1;
/*Copy column into temporary buffer*/
LSETUP($6LP_ST,$6LP_ST) LC0 = P2;
R1 = W[P5++P3](Z) || NOP;
//First data from a column
$6LP_ST:
R1 = W[P5++P3](Z) || W[I1++] = R1.L;
P5 = R0; //Restore the output buffer pointer
L1 = M0;
I1 = B1;
R2 = R2-R2 (S) || I3 += M0;
//Point to the right offset
R1 = R3+R4 (S) || R2.L = W[I3] || I3 -= M0;
//Length+1, Fetch the offset. Restore I3
M2 = R2;
P4 = R1;
I0 = SP;
R6 = SP;
R5 = PACK(R6.H,R6.L) || I2 += M2;
/*Compute ISADCT*/
A1 = A0 = 0 || R0.L = W[I1++] || R1 = [I2++];
LSETUP($8LP_ST,$8LP_END) LC1 = P4>>1;
//Set Loop for (L+1)>>1
$8LP_ST:
LSETUP($9LP_ST,$9LP_ST) LC0 = P2;
//Set Loop for L
$9LP_ST:
R2.H = (A1 += R0.L*R1.H),R2.L = (A0 += R0.L*R1.L) || R0.L = W[I1++]
|| R1 = [I2++];
//Fetch a data and 2 coeff.
$8LP_END:
A1 = A0 = 0 || [I0++] = R2;
/*Store the data in the right position in output array*/
R2 = R2-R2 (S) || R0 = B[P0] (Z);
//Read the shape array
R3 = 2;
I0 = SP;
LSETUP($10LP_ST,$10LP_END) LC1 = P3>>1;
//Set Loop for 8
$10LP_ST:
P0 += 8;
R6 = R6+R3 (S) || R1.L = W[I0];
CC = R0 == 0;
IF CC R6 = R5;
I0 = R6;
IF !CC R2 = R1;
R5 = PACK(R6.H,R6.L) || W[P5++P3] = R2.L;
$10LP_END:
R2 = R2-R2 (S) || R0 = B[P0] (Z);
I2 = B2;
P0 += -64; //Restore P0 to the top of the current column
L1 = 0;
I_COL_END:P0 += 1;
R7 += 1;
CC = R7 <= 7 (IU);
IF CC JUMP I_COL_ST (BP);
SP += 48;
(R7:4,P5:3) = [SP++];
RTS;
NOP; //to avoid one stall if LINK or UNLINK happens to be
//the next instruction after RTS in the memory.
__isadct.end: