/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs  ( Micro Signal Architecture 1.0 specification).

By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software. 
********************************************************************************
Module name     : sobel.asm
Label name      : __sobel
Version         :   1.3
Change History  :

                Version     Date          Author        Comments
                1.3         11/18/2002    Swarnalatha   Tested with VDSP++ 3.0
                                                        compiler 6.2.2 on 
                                                        ADSP-21535 Rev.0.2
                1.2         11/13/2002    Swarnalatha   Tested with VDSP++ 3.0
                                                        on ADSP-21535 Rev. 0.2
                1.1         01/28/2002    Raghavendra   Modified to match 
                                                        silicon cycle count
                1.0         07/16/2001    Raghavendra   Original 

Description     : This function performs edge detection using the Sobel method.
                  The input is processed with a horizontal and a vertical mask.
                  Both results are squared and added.
                  This sum is compared with the square of the threshold value;
                  if it is greater, the output is written as one, otherwise
                  zero. The resulting image is a binary image.
                  Since the first and last rows do not contain valid
                  information, output values in these rows are zero. Similarly,
                  the first and last columns of each row are zero, i.e. all
                  boundary elements are zero. The user must pass a suitable
                  threshold value.
                                
                  Horizontal mask = | -1  -2  -1 |
                                    |  0   0   0 |
                                    |  1   2   1 |

                  Vertical mask =   | -1   0   1 |   
                                    | -2   0   2 |
                                    | -1   0   1 |

                  To obtain a one-pixel-wide edge, the following condition is
                  checked:
                  output(r,c) =    (b(r,c)>cutoff) & ( ( (bx(r,c) >= by(r,c))
                                & (b(r,c-1) <= b(r,c)) & (b(r,c) > b(r,c+1)) ) 
                                | ( (by(r,c) >= bx(r,c)) &  (b(r-1,c) <= b(r,c))
                                & (b(r,c) > b(r+1,c))));

                     r->  row
                     c-> column
                     bx->  resultant matrix when horizontal mask is applied
                     by->  resultant matrix when vertical mask is applied
                     b-> bx(r,c)*bx(r,c) +by(r,c)*by(r,c);

                  In the output image the first row, last row, first column and 
                  last column are zero.
                 
Prototype       : void _sobel(unsigned char* in, int row, int col, 
                              unsigned char *out,int threshold );

                   in   ->  Pointer to the input image.
                   row  ->  Number of rows in the input image.
                   col  ->  Number of columns in the input image.
                   out  ->  Pointer to the output buffer.
             threshold  ->  Threshold value to compare against.

Registers used  : A0, A1, R0-R7, I1, I3, B0-B3, M0-M3, L1, L3, P0-P5, LC0, LC1.

Performance     : 

  If the image is smaller than 64x64, 2 stalls [ Dcache Bank Collision ] will 
occur, because both temporary results are on the stack. In the condition check, 
branch prediction is assumed because most values will be below the threshold.

        Image chosen for cycle count : 8x8 image whose central 6x6 pixels have 
value 255 and whose remaining pixels are zero. Threshold value used is 966.

        Code Size   : 464 bytes
        Cycle Count : 1804 cycles  for 8x8 image 

     First row and last row = 2*Column

     Loop for applying Horizontal and vertical mask
          Inner loop : 23 * (column-2)
          outer loop :  5 * (row-2)
     Loop for conditional check:
          Inner loop : 25 * (column-2)
          outer loop :  4 * (row-2)
*******************************************************************************/
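/*******************************************************************************
Reference sketch (not part of the original Analog Devices module).

The header above describes the masks, the squared-magnitude threshold test and
the one-pixel edge condition. A plain C model of that description may help when
checking the assembly output. The function name sobel_ref, the helper names and
the int return value are assumptions made for illustration; absolute mask
responses are used because the assembly applies ABS before comparing bx and by.
The edge condition follows the formula as written in the header.

```c
#include <stdlib.h>

// 3x3 mask responses at pixel (r, c); the caller keeps (r, c) off the border.
static long mask_h(const unsigned char *in, int col, int r, int c)
{
    const unsigned char *up = in + (r - 1) * col + c;   // row above
    const unsigned char *dn = in + (r + 1) * col + c;   // row below
    return -up[-1] - 2 * up[0] - up[1] + dn[-1] + 2 * dn[0] + dn[1];
}

static long mask_v(const unsigned char *in, int col, int r, int c)
{
    const unsigned char *up = in + (r - 1) * col + c;
    const unsigned char *md = in + r * col + c;
    const unsigned char *dn = in + (r + 1) * col + c;
    return -up[-1] + up[1] - 2 * md[-1] + 2 * md[1] - dn[-1] + dn[1];
}

// Hypothetical reference model; returns 0 on success, -1 if allocation fails.
int sobel_ref(const unsigned char *in, int row, int col,
              unsigned char *out, int threshold)
{
    size_t n = (size_t)row * (size_t)col;
    long *bx = calloc(n, sizeof *bx);   // |horizontal mask result|
    long *by = calloc(n, sizeof *by);   // |vertical mask result|
    long *b  = calloc(n, sizeof *b);    // bx^2 + by^2, zero on the border
    long cutoff = (long)threshold * threshold;
    int r, c;

    if (!bx || !by || !b) {
        free(bx); free(by); free(b);
        return -1;
    }
    for (r = 1; r < row - 1; r++) {
        for (c = 1; c < col - 1; c++) {
            size_t i = (size_t)r * col + c;
            bx[i] = labs(mask_h(in, col, r, c));
            by[i] = labs(mask_v(in, col, r, c));
            b[i]  = bx[i] * bx[i] + by[i] * by[i];
        }
    }
    for (size_t i = 0; i < n; i++)
        out[i] = 0;                     // boundary elements stay zero
    for (r = 1; r < row - 1; r++) {
        for (c = 1; c < col - 1; c++) {
            size_t i = (size_t)r * col + c;
            int along_row = bx[i] >= by[i] && b[i - 1] <= b[i] && b[i] > b[i + 1];
            int along_col = by[i] >= bx[i] && b[i - col] <= b[i] && b[i] > b[i + col];
            out[i] = (unsigned char)(b[i] > cutoff && (along_row || along_col));
        }
    }
    free(bx); free(by); free(b);
    return 0;
}
```

With the 8x8 test image from the Performance section (central 6x6 pixels at
255, threshold 966), this model marks pixel (1,1) as an edge and leaves pixel
(1,2) and the uniform interior at zero.
*******************************************************************************/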
.section      L1_code;
.global       __sobel;
.align              8;
    
__sobel:

    [--SP]=(R7:4,P5:3);
                            // Push R7:4 and P5:3 
    L1 = 0;
    L3 = 0;
    P5=R0;                  // Address of input image
    P0=R1;                  // Number of rows
    P1=R2;                  // Number of columns
    R3=R2<<1 ||P4=[SP+40];  // fetch output address
    M0=R2;                  // m0=number of columns
    P0+=-2;                 // ROW-2
    M3=P0;
    R3+=2;                  // 2*col +2
    M2=P4;
    R4=R1.L*R2.L(is)||R6=[SP+44];
                            // fetch threshold value 
    P2=R3;                  // P2=2*col +2
    R7=1;                   // Initialize R7 to 1
    R0=-1;                  // Initialize R0 to -1
    R1=-2;                  // Initialize R1 to -2
    R2=2;                   // Initialize  R2 to 2  to store coeff. on stack
    SP+=-16;                // decrement stack to store coeff.
    I1=SP;
    B1=SP;                  // set I1 and B1 to sp
    B2=R4;                  // ROW*COL
    R4=R4<<2||W[I1++]=R0.L; // 4*row*col
    B3=R4;                  
    P3=R4;
    W[I1++]=R1.L;           // horizontal and vertical masks are stored on stack
    W[I1++]=R7.L;
    W[I1++]=R1.L;
    W[I1++]=R2.L;
    W[I1++]=R7.L;           // store all coeff. on stack
    R6=R6.L*R6.L(IS)||W[I1++]=R0.L;
                            // get square of threshold value 
    R5=R7-R7(NS)|| W[I1++]=R2.L;
                            // clear R5 
    I1=B1;
    L1=16;                  // set L1 to 16 to have circular buffer
    R0=B[P5++] (Z)||R1.L=W[I1++];
                            // fetch first input and coeff. 
    SP-=P3;
    P4=SP;
    I3=P4;
    P3=B2;                  // ROW*COL
    SP-=P3;
    B0=SP;                  // TO STORE BX<=BY VALUE
    MNOP;

    LSETUP(FIRST_ROW,FIRST_ROW)LC0=P1;
                            // loop counter == COL (clears first row) 
FIRST_ROW:
        [P4++]=R5;   
    P1+=-2;                 // COL-2

    LSETUP(ROW_ST,ROW_END)LC0=P0;
                            // loop counter ==ROW-2 
    SP-=P3;                 // to store by<=bx
    P0=SP;                   
    P3=B0;
ROW_ST:
        R4=R1-R1(ns)||[P4++]=R5;

        LSETUP(COL_ST,COL_END)LC1=P1;
                            // loop counter=COL-2 
    
/****************************************************************************
    Coefficients are stored as -1, -2, 1, -2, 2, 1, -1, 2 on the stack.
    MAC1 is used for calculating  the output with horizontal mask and
    MAC0 is used for calculating output with vertical mask.
****************************************************************************/
COL_ST:
            A1=R0.L*R1.L, A0=R0.L *R1.L(IS) ||R0=B[P5++](Z)||R1.H=W[I1++];
                            // A1=-x00, A0=-x00, fetch x01 and -2 
            A1+=R0.L*R1.H(IS) ||R0=B[P5](Z)||R1.H=W[I1++];
                            // A1+=-2* x01,fetch x02 and 1 
            P5=P5+P1;               // change pointer to starting of next row
            A1+=R0.L*R1.L ,A0+=R0.L *R1.H(IS)||R0=B[P5++](Z)||R1.L=W[I1++];
                            //A1+=-x02,A0+=x02,fetch x10 and -2 
            A0+=R0.L*R1.L(IS)||R0=B[P5++](Z)|| R1.H=W[I1++];
                            // A0+=-2*x10, fetch x11 and 2 
            R5=R1-R1(NS)||R0=B[P5](Z)||R1.L=W[I1++];
                            // clear R5, fetch x12 and 1 
            P5=P5+P1;               // Modify pointer to next row
            A0+=R0.L *R1.H(IS)||R0=B[P5++](Z)||R1.H=W[I1++];
                            // A0+=2* x12, fetch x20 and -1 
            A1+=R0.L*R1.L,A0+=R0.L*R1.H(IS)|| R0=B[P5++](Z)||R1.H=W[I1++];
                            // A1+=x20, A0+=-x20, fetch x21 and 2 
            A1+=R0.L *R1.H(IS)||R0=B[P5++](Z);
                            // A1+=2* x21,fetch x22 
            R3=(A1+=R0.L*R1.L),R2=(A0+=R0.L*R1.L)(IS);
                            // R3=result of horizontal mask R2=result of 
                            // vertical mask 
            R3=ABS R3;
            R2=ABS R2;
            CC=R3<=R2;      // check if BX<=BY
            IF CC R5=R7;
            CC=R2<=R3;      // check if BY<=BX
            IF CC R4=R7;
            P5-=P2;         // modify pointer back to process next set of data
            R3=R3.L*R3.L(IS)||B[P3++]=R5;
                            // store bx<=by 
            R2=R2.L*R2.L(IS)|| B[P0++]=R4;
                            // square each output and store by<=bx 
            R3=R3+R2(ns)||R0=B[P5++](z)||R1.L=W[I1++];
                            // R3 contain final result to compare 
COL_END:    R4=R1-R1(ns)||[P4++]=R3;
                            // store the result 
        P5+=1;              // move pointer to starting of next row
        R5=R1-R1(ns)||R0=B[P5++](Z);
ROW_END:[P4++]=R5;          // store last element as zero
    
    P0=M0;                  // p0==Number of columns

    LSETUP(END_ROW,END_ROW)LC0=P0;
                            // Clear last row 
END_ROW:
        [P4++]=R5;
//  testing for threshold and  comparison to get single pixel edge 
    P3=B0;          
    P4=M2;                  // OUTPUT ARRAY
    R0=M0;                  // NO OF COLUMN
    R1=R0<<2;
    M0=R1;                  // 4*COL
    R1+=-4;                 // 4*(COL-1)
    M1=R1;                  // 4*(COL-1)
    R1=R0<<3;
    R1+=-4;
    R1=R5-R1(NS)||I3+=4;
    M2=R1;                  // -4*(2*COL-1)
    
    LSETUP(FIRST_ROW_OUT,FIRST_ROW_OUT)LC0=P0;
                            // Clear first row of output 
FIRST_ROW_OUT:
        B[P4++]=R5;
    
    P0=M3;
    M3=8;

    LSETUP(OUT_ROW_ST,OUT_ROW_END)LC0=P0;
                            // loop counter ==ROW-2 
    P0=SP;
    
OUT_ROW_ST:
        MNOP||B[P4++]=R5;
                            // clear first element in the row 
        LSETUP(OUT_COL_ST,OUT_COL_END)LC1=P1;
                            // loop counter=COL-2 
OUT_COL_ST: R0=[I3++M1]||R3=B[P3++](Z);
                            // get b(r-1,c) and flag value bx<=by 
            R1=[I3++]||R4=B[P0++](z);
                            // get b(r,c-1) and flag value by<=bx 
            R5=R1-R1(NS)||R2=[I3]|| I3-=M1;
                            // clear r5,fetch b(r,c) and modify i3 to fetch 
                            //next data 
            CC=R2<=R6;              // check if b(r,c) <= threshold*threshold
            IF CC JUMP OUT_COL_END(bp);
            I3+=M0;         // modify to fetch b(r,c+1)
            R5=[I3++M1];    // fetch b(r,c+1) and modify pointer to process 
                            //next set of data
            CC=R1<=R2;      // check if b(r,c-1) <=b(r,c)
            R1=CC;
            CC=R0<=R2;      // check if b(r-1,c) <= b(r,c)
            R0=CC;
            R1=R1&R3;       // AND both results with bx<=by
            CC=R5<R2;       // check if b(r,c+1) < b(r,c)
            R3=CC;              
            R0=R0&R4;
            R4=[I3++M2];    // fetch b(r+1,c)
            CC=R4<R2;       // check if b(r+1,c) < b(r,c)
            R4=CC;
            R5=0;           // clear R5
            R1=R3&R1;      
            R4=R4&R0;
            R3=R1|R4;       // OR both results in horizontal and vertical
            CC=R3==0;                          
            IF !CC R5=R7;
OUT_COL_END:B[P4++]=R5;
                            // Store the result 
        R5=R1-R1(ns)||I3+=M3;
                            // modify I3 to process next row
OUT_ROW_END:
        B[P4++]=R5;
                            // clear last element in the row 
    P1+=2;

    LSETUP(LAST_ROW_OUT,LAST_ROW_OUT)LC0=P1;
                            // Clear last row 
LAST_ROW_OUT:
        B[P4++]=R5;
    P3=B3;
    P1=B2;
    SP+=16;                            
    P3=P3+(P1<<1);          // offset value to bring stack pointer to normal 
                            //position
    SP=SP+P3;               // modify stack to normal position
    (R7:4,P5:3)=[SP++];
    RTS;
    NOP;                    // To avoid one stall if LINK or UNLINK happens to 
                            // be the  next instruction in the memory.
    
