/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs ( Micro Signal Architecture 1.0 specification).
By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software.
********************************************************************************
Module Name : three_step_search.asm
Label Name : __three_step_search
Version : 1.2
Change History :
Version     Date        Author          Comments
1.2         11/18/2002  Swarnalatha     Tested with VDSP++ 3.0
                                        compiler 6.2.2 on
                                        ADSP-21535 Rev.0.2
1.1         11/13/2002  Swarnalatha     Tested with VDSP++ 3.0
                                        on ADSP-21535 Rev.0.2
1.0         07/02/2001  Vijay           Original
Description : This routine does the motion vector computation using the
Three step search algorithm for a given macroblock. The
integer pel motion vector is first computed and a half pixel
correction is given. The range covered by different step sizes
is shown below :
=====================================
| Initial | |
| step size (SS) | Range (SR) |
=====================================
| 4 | -7.5 to 7.5 |
| 5 | -8.5 to 8.5 |
| 6 | -10.5 to 10.5 |
| 7 | -11.5 to 11.5 |
| 8 | -14.5 to 14.5 |
| 9 | -15.5 to 15.5 |
=====================================
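The integer pel stage described above can be sketched in C as follows.
This is a minimal illustration under stated assumptions, not the
assembly implementation: the names sad16x16 and tss_integer are
hypothetical, the cost measure is assumed to be a 16x16 sum of absolute
differences, ties keep the earlier candidate, and the half pixel
correction is left out.

```c
#include <stdlib.h>

// Sum of absolute differences between a 16x16 target block (stride 16)
// and a 16x16 candidate in the reference window (stride winwidth).
static int sad16x16(const unsigned char *tgt, const unsigned char *ref,
                    int winwidth)
{
    int sum = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            sum += abs((int)tgt[y * 16 + x] - (int)ref[y * winwidth + x]);
    return sum;
}

// Integer pel three step search (sketch). zero_pos points at the zero
// motion position inside the reference window. The three step sizes
// are SS, SS>>1 and SS>>2. Each step tests the 8 neighbours of the
// current centre and moves the centre to the best one found.
static void tss_integer(const unsigned char *tgt,
                        const unsigned char *zero_pos, int winwidth,
                        int step_size, int *mv_x, int *mv_y)
{
    int cx = 0, cy = 0;
    int best = sad16x16(tgt, zero_pos, winwidth);
    int steps[3] = { step_size, step_size >> 1, step_size >> 2 };

    for (int s = 0; s < 3; s++) {
        int bx = cx, by = cy;
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0)
                    continue; // centre cost is already known
                int x = cx + dx * steps[s];
                int y = cy + dy * steps[s];
                int cost = sad16x16(tgt, zero_pos + y * winwidth + x,
                                    winwidth);
                if (cost < best) {
                    best = cost;
                    bx = x;
                    by = y;
                }
            }
        }
        cx = bx;
        cy = by;
    }
    *mv_x = cx;
    *mv_y = cy;
}
```

With SS = 4 the largest displacement this sketch can reach is
4 + 2 + 1 = 7 pels in each direction, which matches the first row of
the table once the half pixel correction of 0.5 is added.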
Input and output are passed to this routine through a single structure.
The motion vectors are written back to the same structure.
The input/output structure is declared as follows :
typedef struct
{
    unsigned char *ptr_target;    // Target block address (16x16)
    unsigned char *ptr_reference; // Reference window address
    int winwidth;                 // Width of the reference window
    int step_size;                // Initial step size
    tss_struct *ptr_tss;          // Pointer to the initialized tss_struct
    short mv_x;                   // Horizontal MV (written by the routine)
    short mv_y;                   // Vertical MV (written by the routine)
}tss_par;
where tss_struct is defined as
typedef struct
{
    short vmv[9];
    short hmv[9];
    short modifier[25];
}tss_struct;
The tss_struct is initialized by calling the function
__init_tss() once before invoking the TSS routine.
This structure is used by the TSS routine in the motion vector
computation (Refer init_tss.asm for initialization details).
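The assembly reaches the structure fields through fixed byte offsets
(the PTR_MACROBLOCK ... V_MV defines, and a step of 36 bytes to reach
modifier[0]). The sketch below checks those offsets on a host machine.
It is an illustration only: the *_mirror types and OFF_* names are
hypothetical, uint32_t stands in for the 32-bit Blackfin pointers so
that the layout does not depend on the host pointer size, and short is
assumed to be 16 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

// Host-side mirror of tss_struct (short assumed to be 2 bytes).
typedef struct {
    short vmv[9];
    short hmv[9];
    short modifier[25];
} tss_struct_mirror;

// Host-side mirror of tss_par. The three pointers are modelled as
// uint32_t because pointers are 4 bytes wide on the target.
typedef struct {
    uint32_t ptr_target;
    uint32_t ptr_reference;
    int32_t  winwidth;
    int32_t  step_size;
    uint32_t ptr_tss;
    int16_t  mv_x;
    int16_t  mv_y;
} tss_par_mirror;

// Byte offsets as the assembly expects to find them.
enum {
    OFF_PTR_MACROBLOCK = offsetof(tss_par_mirror, ptr_target),
    OFF_PTR_REFERENCE  = offsetof(tss_par_mirror, ptr_reference),
    OFF_WINWIDTH       = offsetof(tss_par_mirror, winwidth),
    OFF_STEP_SIZE      = offsetof(tss_par_mirror, step_size),
    OFF_PTR_TSS        = offsetof(tss_par_mirror, ptr_tss),
    OFF_H_MV           = offsetof(tss_par_mirror, mv_x),
    OFF_V_MV           = offsetof(tss_par_mirror, mv_y),
    OFF_MODIFIER       = offsetof(tss_struct_mirror, modifier)
};
```

If a C compiler for the target inserted padding into either structure,
these offsets would no longer match the defines, so a check of this
kind is a cheap guard when the structures are edited.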
The target block is assumed to be stored in a 16x16 buffer and
the pointer to this buffer is initialized in the tss_par
structure.
The reference block stores all the required data for covering
the range of the step size and for doing the half pixel
interpolation, that is, if the initial step size is SS, then
the range covered is SR = (SS + (SS>>1) + (SS>>2)). In
addition to this we need one more row/column for half pixel
interpolation.
Thus, in the reference window we have to store a stretch of
{2*(SR + 1) + 16} pixels around the target block. The width of
the reference window becomes WINWIDTH = {2*(SR+1) + 16}. The
following picture depicts the data storage in the reference
window.
             Data layout of the reference window
             <---------- WINWIDTH ---------->
             --------------------------------   ---
             | ---------------------------- |    |
             | |            |             | |    |
1 pel gap -> | |            |<- SR        | |    |
for half     | |            |             | |    W
pel interp.  | |       ------------       | |    I
             | |       |  TARGET  |       | |    N
             | |       |  (ZERO   |       | |    W
             | |<- SR->|  MOTION  |       | |    I
             | |       | POSITION)|       | |    D
             | |       |          |       | |    T
             | |       ------------       | |    H
             | |       <--- 16 --->       | |    |
             | |                          | |    |
             | ---------------------------- |    |
             --------------------------------   ---
Assumption : The width of the reference window is assumed to be a multiple
of 4.
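The window geometry described above can be worked through in C as
follows. This is a sketch with hypothetical helper names. Rounding the
width up to a multiple of 4 is only one assumed way for a caller to
satisfy the assumption above. The zero motion offset
(SR + 1)*(WINWIDTH + 1) follows the address arithmetic in the code
below.

```c
// Search range in integer pels for an initial step size ss:
// SR = SS + (SS>>1) + (SS>>2).
static int tss_search_range(int ss)
{
    return ss + (ss >> 1) + (ss >> 2);
}

// Smallest window width that covers the search range plus one extra
// row/column on each side for half pixel interpolation.
static int tss_min_winwidth(int ss)
{
    return 2 * (tss_search_range(ss) + 1) + 16;
}

// Width rounded up to the next multiple of 4 (assumed padding policy).
static int tss_padded_winwidth(int ss)
{
    return (tss_min_winwidth(ss) + 3) & ~3;
}

// Byte offset of the zero motion position from the start of the
// reference window: SR + 1 rows down and SR + 1 columns across.
static int tss_zero_offset(int ss, int winwidth)
{
    int sr1 = tss_search_range(ss) + 1;
    return sr1 * winwidth + sr1;
}
```

For SS = 4 this gives SR = 7, a 32 pixel wide window and a zero motion
offset of 8*32 + 8 = 264 bytes. For SS = 5 the minimum width is 34,
which is not a multiple of 4, so a caller would have to use a wider
window such as 36.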
Prototype : void three_step_search(tss_par *tss_in_out);
Registers used : A0, A1, R0-R7, I0-I2, M0, M1, L0-L2, P0-P5, LC0, LC1.
Performance :
Code size :
three_step_search : 688 bytes
hpel : 768 bytes
init_tss : 192 bytes
Total cycle count : 4239 cycles
Cycle count split up :
Integer pel estimation : 2544 cycles
Half pel estimation : 1695 cycles
The cycle count given above is for the first iteration of the test case 1 in
test_three_step_search.c
*******************************************************************************/
#define PTR_MACROBLOCK 0
#define PTR_REFERENCE 4
#define WINWIDTH 8
#define STEP_SIZE 12
#define PTR_TSS 16
#define H_MV 20
#define V_MV 22
.section L1_code;
.align 8;
.global _three_step_search;
.extern __hpel;
_three_step_search:
[--SP] = (R7:4, P5:3);
P5 = R0; // Address of tss parameter structure
[--SP] = RETS;
SP += -36;
L0 = 0;
L1 = 0;
L2 = 0;
I2 = SP; // I2 points to mv_ind array
R7 = R7 - R7 (S) || R0 = [P5 + STEP_SIZE];
// Initial step size
R1 = R0 >>> 1 || R3 = [P5 + WINWIDTH];
// WINWIDTH
R2 = R0 >>> 2 || P4 = [P5 + PTR_TSS];
// Address of tss structure
R0 = R0 + R1(S);
R0 = R0 + R2 (S) || R1 = [P5 + PTR_MACROBLOCK] || [I2++] = R7;
// Fetch the address of the target block
R0 += 1; // SR + 1 (search range plus one pel for half pel interpolation)
I0 = R1;
R1 = R0.L*R3.L (IS) || R2 = [P5 + PTR_REFERENCE];
// (SR + 1)*WINWIDTH, Address of reference window
R1 = R0 + R1 (NS) || [I2++] = R7;
// (SR + 1)*(WINWIDTH + 1), [SP + 4] = mv_ind[1]
R1 = R2 + R1 (NS) || [I2--] = R7;
// ref_ptr = reference + (SR + 1)*(WINWIDTH + 1),
// [SP + 8] = mv_ind[2]
I1 = R1; // Address of reference data
R4 = R4 - R4 (S) || [SP + 20] = R1;
// [SP + 20] = ref_ptr
P4 += 36;
A1 = A0 = 0 || [SP + 24] = P4;
// [SP + 24] = Address of modifier[0]
P4 += 2;
M0 = -260 (X);
R3 = [P5 + WINWIDTH];
R3 += -16;
M1 = R3; // Modifier for the reference window
P2 = 16 (Z);
[SP + 28] = P2; // To retrieve the loop count if it is modified
/******************** ZERO MOTION VECTOR POSITION *****************************/
DISALGNEXCPT || R0 = [I0++] || R2 = [I1++];
// Fetch the first data from the two blocks