/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs  ( Micro Signal Architecture 1.0 specification).

By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software. 
********************************************************************************
Module Name     : three_step_search.asm
Label Name      : _three_step_search
Version         :   1.2
Change History  :

                Version     Date          Author        Comments
                1.2         11/18/2002    Swarnalatha   Tested with VDSP++ 3.0
                                                        compiler 6.2.2 on 
                                                        ADSP-21535 Rev.0.2
                1.1         11/13/2002    Swarnalatha   Tested with VDSP++ 3.0
                                                        on ADSP-21535 Rev.0.2
                1.0         07/02/2001    Vijay         Original 

Description     : This routine computes the motion vector for a given 
                  macroblock using the three step search algorithm. The 
                  integer pel motion vector is computed first and is then 
                  refined by a half pixel correction. The range covered by 
                  different initial step sizes is shown below :

                  =====================================
                  |   Initial       |                 |
                  | step size (SS)  |     Range (SR)  |
                  =====================================
                  |     4           |  -7.5 to  7.5   |
                  |     5           |  -8.5 to  8.5   |
                  |     6           | -10.5 to 10.5   |
                  |     7           | -11.5 to 11.5   |
                  |     8           | -14.5 to 14.5   |
                  |     9           | -15.5 to 15.5   |
                  =====================================
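
                  A minimal C sketch of the integer pel stage, in its textbook
                  form, may help when reading the assembly. It is not the
                  routine below: the assembly uses the precomputed vmv, hmv 
                  and modifier tables and adds a half pel stage, neither of 
                  which is shown here. The names sad16, tss_integer and 
                  demo_recovers_shift are hypothetical, and the sketch assumes
                  an initial step size of at least 4, as in the table above.

```c
#include <assert.h>
#include <stdlib.h>

enum { BLK = 16 };  // the target macroblock is 16x16

// Sum of absolute differences between the 16x16 target and the 16x16 block
// of the reference window whose top-left pixel is ref.
unsigned sad16(const unsigned char *tgt, const unsigned char *ref,
               int winwidth)
{
    unsigned sum = 0;
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            sum += (unsigned)abs(tgt[y * BLK + x] - ref[y * winwidth + x]);
    return sum;
}

// Integer pel three step search. zero points at the zero motion block inside
// the reference window. The three steps use SS, SS>>1 and SS>>2.
void tss_integer(const unsigned char *tgt, const unsigned char *zero,
                 int winwidth, int step_size, int *mv_x, int *mv_y)
{
    assert(step_size >= 4);  // keeps SS>>2 non-zero so the loops terminate
    int cx = 0, cy = 0;      // current search centre
    unsigned best = sad16(tgt, zero, winwidth);

    for (int shift = 0; shift < 3; shift++) {
        int step = step_size >> shift;
        int bx = cx, by = cy;
        // Test the 8 neighbours of the centre at the current step size.
        for (int dy = -step; dy <= step; dy += step) {
            for (int dx = -step; dx <= step; dx += step) {
                if (dx == 0 && dy == 0)
                    continue;  // the centre has already been scored
                const unsigned char *cand =
                    zero + (cy + dy) * winwidth + (cx + dx);
                unsigned s = sad16(tgt, cand, winwidth);
                if (s < best) {
                    best = s;
                    bx = cx + dx;
                    by = cy + dy;
                }
            }
        }
        cx = bx;
        cy = by;
    }
    *mv_x = cx;
    *mv_y = cy;
}

// Usage check on a synthetic 32x32 window with step size 4, where each
// pixel is 3*x + 5*y. Returns 1 when the search recovers the shift (dx, dy).
int demo_recovers_shift(int dx, int dy)
{
    enum { W = 32, ZERO = 8 };  // ZERO = SR + 1 for step size 4
    unsigned char ref[W * W], tgt[BLK * BLK];

    for (int y = 0; y < W; y++)
        for (int x = 0; x < W; x++)
            ref[y * W + x] = (unsigned char)(3 * x + 5 * y);
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            tgt[y * BLK + x] = ref[(ZERO + dy + y) * W + (ZERO + dx + x)];

    int mx = 0, my = 0;
    tss_integer(tgt, ref + ZERO * W + ZERO, W, 4, &mx, &my);
    return mx == dx && my == dy;
}
```

                  The search tests only 9 positions per step, so on real image
                  data it can settle on a local minimum of the SAD surface. 
                  The synthetic gradient in the usage check is chosen so that 
                  the shifts (0,0), (4,4) and (7,7) are recovered.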

                  Input and output for this routine are passed through a 
                  structure. The motion vectors are written back to the same 
                  structure. The input/output structure is declared as follows :

                  typedef struct
                  {
                    unsigned char *ptr_target;
                                        // Target block address (16x16) 
                    unsigned char *ptr_reference;
                                        // Reference window address 
                    int winwidth;       // Width of the reference window
                    int step_size;      // Initial step size
                    tss_struct *ptr_tss;// Pointer to the initialized tss_struct
                    short mv_x;         // Horizontal MV (written by the routine)
                    short mv_y;         // Vertical MV (written by the routine)
                  }tss_par;

                  where tss_struct is defined as
                  typedef struct
                  {
                    short vmv[9];
                    short hmv[9];
                    short modifier[25];
                  }tss_struct;
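
                  The assembly reaches these fields through fixed byte offsets
                  (the PTR_MACROBLOCK ... V_MV defines, and the constant 36 
                  added to reach modifier[0]). The following sketch checks 
                  those offsets on a host compiler. It assumes a 2-byte short,
                  and it models the 4-byte pointers and ints of the target 
                  with fixed-width integers; tss_par_model32 and 
                  tss_layout_matches are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    short vmv[9];
    short hmv[9];
    short modifier[25];
} tss_struct;

// Host-side model of tss_par: addresses are held as 32-bit integers so the
// layout does not depend on the pointer size of the host machine.
typedef struct {
    uint32_t ptr_target;     // PTR_MACROBLOCK  0
    uint32_t ptr_reference;  // PTR_REFERENCE   4
    int32_t  winwidth;       // WINWIDTH        8
    int32_t  step_size;      // STEP_SIZE      12
    uint32_t ptr_tss;        // PTR_TSS        16
    int16_t  mv_x;           // H_MV           20
    int16_t  mv_y;           // V_MV           22
} tss_par_model32;

// Returns 1 when the modelled layout agrees with the offsets in the assembly.
int tss_layout_matches(void)
{
    return offsetof(tss_par_model32, ptr_target) == 0
        && offsetof(tss_par_model32, ptr_reference) == 4
        && offsetof(tss_par_model32, winwidth) == 8
        && offsetof(tss_par_model32, step_size) == 12
        && offsetof(tss_par_model32, ptr_tss) == 16
        && offsetof(tss_par_model32, mv_x) == 20
        && offsetof(tss_par_model32, mv_y) == 22
        && offsetof(tss_struct, hmv) == 18
        && offsetof(tss_struct, modifier) == 36;
}
```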

                  The tss_struct is initialized by calling the function
                  __init_tss() once before invoking the TSS routine.
                  The TSS routine uses this structure in the motion vector
                  computation (refer to init_tss.asm for initialization 
                  details).

                  The target block is assumed to be stored in a 16x16 buffer and
                  the pointer to this buffer is initialized in the tss_par 
                  structure.

                  The reference block stores all the required data for covering
                  the range of the step size and for doing the half pixel 
                  interpolation, that is, if the initial step size is SS, then 
                  the range covered is SR = (SS + (SS>>1) + (SS>>2)). In 
                  addition to this we need one more row/column for half pixel 
                  interpolation.
                  Thus, in the reference window we have to store a stretch of 
                  {2*(SR + 1) + 16} pixels around the target block. The width of
                  the reference window becomes WINWIDTH = {2*(SR+1) + 16}. The 
                  following picture depicts the data storage in the reference 
                  window.

                  Data layout of the reference window

                            <-------  WINWIDTH  ---------->  
                            -------------------------------  ---
                            | --------------------------- |   |
                            | |           |             | |   |
                 1 pel gap->| |<-        SR             | |   |
                  for half  | |           |             | |   W
                  pel inter.| |       -----------       | |   I
                            | |      |  TARGET  |       | |   N
                            | |      |  (ZERO   |       | |   W
                            | |< SR >|  MOTION  |       | |   I
                            | |      | POSITION)|       | |   D
                            | |      |          |       | |   T
                            | |       -----------       | |   H
                            | |      <--- 16 --->       | |   |
                            | |                         | |   |
                            | --------------------------- |   |
                            ------------------------------  ---
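
                  The window arithmetic above can be sketched in C as follows.
                  The function names are hypothetical. tss_zero_offset mirrors
                  the ref_ptr computation at the start of the routine, which 
                  places the zero motion block (SR + 1) rows down and (SR + 1)
                  columns in from the start of the window.

```c
#include <assert.h>

// Integer search range for an initial step size SS:
// SR = SS + (SS>>1) + (SS>>2). The half pel stage adds another 0.5.
int tss_search_range(int ss)
{
    assert(ss >= 0);
    return ss + (ss >> 1) + (ss >> 2);
}

// Window width from the description: 2*(SR + 1) + 16.
int tss_min_winwidth(int ss)
{
    return 2 * (tss_search_range(ss) + 1) + 16;
}

// Byte offset of the zero motion block from the start of the reference
// window: (SR + 1) full rows plus (SR + 1) columns.
int tss_zero_offset(int ss, int winwidth)
{
    int border = tss_search_range(ss) + 1;
    return border * (winwidth + 1);
}
```

                  For SS = 4 this gives SR = 7, a width of 32 and a zero 
                  motion offset of 264 bytes. For some step sizes (SS = 8 
                  gives 46) the formula does not produce a multiple of 4, 
                  which the routine assumes; handling that case is not shown.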

Assumption      : The width of the reference window is assumed to be a multiple 
                  of 4.

Prototype       : void three_step_search(tss_par *tss_in_out);

Registers used  : A0, A1, R0-R7, I0-I2, M0, M1, L0-L2, P0-P5, LC0, LC1.

Performance     :
        Code size :
            three_step_search :    688 bytes
            hpel              :    768 bytes
            init_tss          :    192 bytes

        Total cycle count       : 4239 cycles
        Cycle count split up :
        Integer pel estimation  :   2544 cycles
        Half pel estimation     :   1695 cycles

    The cycle counts given above are for the first iteration of test case 1 in
    test_three_step_search.c 
*******************************************************************************/
#define PTR_MACROBLOCK 0
#define PTR_REFERENCE  4
#define WINWIDTH       8
#define STEP_SIZE     12
#define PTR_TSS       16
#define H_MV          20
#define V_MV          22

.section L1_code;
.align 8;
.global _three_step_search;

.extern __hpel;
    
_three_step_search:
    
    [--SP] = (R7:4, P5:3);
    P5 = R0;                // Address of tss parameter structure
    [--SP] = RETS;
    SP += -36;
    L0 = 0;
    L1 = 0;
    L2 = 0;

    I2 = SP;                // I2 points to mv_ind array
    R7 = R7 - R7 (S) || R0 = [P5 + STEP_SIZE];
                            // Initial step size 
    R1 = R0 >>> 1 || R3 = [P5 + WINWIDTH];
                            // WINWIDTH 
    R2 = R0 >>> 2 || P4 = [P5 + PTR_TSS];
                            // Address of tss structure 
    R0 = R0 + R1(S);
    R0 = R0 + R2 (S) || R1 = [P5 + PTR_MACROBLOCK] || [I2++] = R7;
                            // Fetch the address of the target block 
    R0 += 1;                // SR + 1 (search range plus the 1 pel gap)
    I0 = R1;
    R1 = R0.L*R3.L (IS) || R2 = [P5 + PTR_REFERENCE];
                            // (SR + 1)*WINWIDTH, Address of reference window 
    R1 = R0 + R1 (NS) || [I2++] = R7;
                            // (SR + 1)*(WINWIDTH + 1), [SP + 4] = mv_ind[1] 
    R1 = R2 + R1 (NS) || [I2--] = R7;
                            // ref_ptr = reference + (SR + 1)*(WINWIDTH + 1),
                            // [SP + 8] = mv_ind[2] 
    I1 = R1;                // Address of reference data
    R4 = R4 - R4 (S) || [SP + 20] = R1;
                            // [SP + 20] = ref_ptr 
    P4 += 36;
    A1 = A0 = 0 || [SP + 24] = P4;
                            // [SP + 24] = Address of modifier[0] 
    P4 += 2;
    M0 = -260 (X);
    R3 = [P5 + WINWIDTH];
    R3 += -16;
    M1 = R3;                // Modifier for the reference window
    P2 = 16 (Z);
    [SP + 28] = P2;         // To retrieve the loop count if it is modified
/******************** ZERO MOTION VECTOR POSITION *****************************/
    DISALGNEXCPT || R0 = [I0++] || R2 = [I1++];
                            // Fetch the first data from the two blocks 
