/*******************************************************************************
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved.
Developed by Joint Development Software Application Team, IPDC, Bangalore, India
for Blackfin DSPs ( Micro Signal Architecture 1.0 specification).
By using this module you agree to the terms of the Analog Devices License
Agreement for DSP Software.
********************************************************************************
Module Name : arith_decoder_mpeg4.asm
Label Name : __arith_decoder_mpeg4
Version : 1.3
Change History :
Version   Date         Author        Comments
1.3       11/18/2002   Swarnalatha   Tested with VDSP++ 3.0
                                     compiler 6.2.2 on
                                     ADSP-21535 Rev.0.2
1.2       11/13/2002   Swarnalatha   Tested with VDSP++ 3.0
                                     on ADSP-21535 Rev.0.2
1.1       03/20/2002   Raghavendra   Modified to match
                                     silicon cycle count
1.0       09/11/2001   Raghavendra   Original
Description : Arithmetic decoding consists of four main steps:
1. Removal of stuffed bits.
2. Initialization, which is performed prior to the decoding of
the first symbol.
3. Decoding of the symbols themselves. The decoding of each symbol
may be followed by a re-normalization step.
4. Termination, which is performed after the decoding of the
last symbol.
The least probable symbol (LPS) is defined as the symbol with the
lower probability. If both probabilities are equal to one half
(i.e. 0x8000), then the '0' symbol is considered the least probable.
During initialization the lower bound L is set to zero and the range
register R is set to 0x7fffffff. To avoid start code emulation, the
encoder performs bit stuffing under the following conditions.
1's are stuffed into the bitstream whenever there are too many
successive 0's. If the first 3 (MAX_HEADING) bits are 0's, a 1 is
transmitted after the MAX_HEADING-th 0. If 10 (MAX_MIDDLE) or
more 0's are sent successively, a 1 is inserted after the
MAX_MIDDLE-th 0. If the number of trailing 0's is larger than
2 (MAX_TRAILING), a 1 is appended. These stuffed bits are
removed while decoding.
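The heading and middle stuffing rules above can be sketched in C as follows.
This is a minimal illustration, not the routine used by this module: the
function name remove_stuffed_bits and its parameters are hypothetical, the
input is assumed to hold one bit per byte, and the trailing (MAX_TRAILING)
rule is not handled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HEADING 3   // zeros allowed at the start before a stuffed 1
#define MAX_MIDDLE  10  // zeros allowed mid-stream before a stuffed 1

// Hypothetical helper: copies bits from in[] to out[] and drops the stuffed 1
// that follows each full run of zeros. Returns the number of bits written.
size_t remove_stuffed_bits(const unsigned char *in, size_t n_in,
                           unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t n_out = 0;
    int zeros_left = MAX_HEADING;       // zeros still allowed in this run

    for (size_t i = 0; i < n_in; i++) {
        out[n_out++] = in[i];
        if (in[i] == 0) {
            if (--zeros_left == 0) {
                i++;                    // skip the stuffed 1 that follows
                zeros_left = MAX_MIDDLE;
            }
        } else {
            zeros_left = MAX_MIDDLE;    // any 1 restarts the zero count
        }
    }
    return n_out;
}
```

For the input bits 0,0,0,1,1,0 the fourth bit is treated as stuffed, so the
sketch writes 0,0,0,1,0.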
The range associated with the least probable symbol (LPS) is
computed as R*pLPS,
where R -> 16 most significant bits of the range register value
pLPS -> probability of the LPS symbol.
If the R value is less than QUARTER, 1/4 (i.e. 0x40000000), then
re-normalization is performed. In this procedure both the lower
bound L and the range R are doubled until R is no longer less than QUARTER.
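The two computations just described can be sketched in C as follows. The names
lps_range and renormalise_sketch are hypothetical. The loop models only the
doubling of L and R stated above; it does not reproduce decode_renormlise,
which is not shown in this section and presumably also updates V from the
input bitstream.

```c
#include <assert.h>
#include <stdint.h>

#define QUARTER 0x40000000u

// rLPS = (16 most significant bits of R) * pLPS, where pLPS is a 16-bit
// fixed point probability (0x8000 represents one half).
uint32_t lps_range(uint32_t R, uint32_t pLPS)
{
    assert(pLPS <= 0x8000u);
    return (R >> 16) * pLPS;
}

// Doubles L and R until R is no longer below QUARTER.
// Returns the number of doublings performed.
int renormalise_sketch(uint32_t *L, uint32_t *R)
{
    int shifts = 0;
    while (*R != 0 && *R < QUARTER) {   // R == 0 is guarded to avoid spinning
        *L <<= 1;
        *R <<= 1;
        shifts++;
    }
    return shifts;
}
```

With R = 0x7fffffff and pLPS = 0x8000, lps_range gives 0x7fff * 0x8000 =
0x3fff8000.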
The following structure is used :
struct arcodec {
UInt L; -> 32bit fixed point register. Contains
the lower bound of the interval
UInt R; -> 32bit fixed point register. Contains
the range of the interval
UInt V; -> Contains the arithmetic code value.
It is always greater than or equal to L
and less than L+R.
UInt arpipe;
Int bits_to_follow; -> follow bit count
Int first_bit; -> flag to check first bit
Int nzeros; -> counter to count consecutive zeros
Int nonzero;
Int nzerosf;
Int extrabits;
Int mh; -> to hold MAX_HEAD
Int mm; -> to hold MAX_MIDDLE
Int mt; -> to hold MAX_TRAIL
unsigned char *in; -> address of input compressed data array
}; typedef struct arcodec ArCoder;
UInt -> unsigned integer
Int -> integer
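The byte offsets used by the assembly below (for example [P0+8] for V and
[P0+52] for the input address) follow from this layout when every field is
4 bytes wide. The sketch below checks that at compile time. It assumes 32-bit
UInt and Int, and it replaces the pointer field with a 32-bit integer named
in_addr (a hypothetical stand-in) so the check also holds on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t UInt;   // assumption: 32-bit unsigned integer
typedef int32_t  Int;    // assumption: 32-bit signed integer

struct arcodec {
    UInt L;               // offset  0
    UInt R;               // offset  4
    UInt V;               // offset  8
    UInt arpipe;          // offset 12
    Int  bits_to_follow;  // offset 16
    Int  first_bit;       // offset 20
    Int  nzeros;          // offset 24
    Int  nonzero;         // offset 28
    Int  nzerosf;         // offset 32
    Int  extrabits;       // offset 36
    Int  mh;              // offset 40
    Int  mm;              // offset 44
    Int  mt;              // offset 48
    uint32_t in_addr;     // offset 52, stand-in for the 32-bit pointer "in"
};
typedef struct arcodec ArCoder;

static_assert(offsetof(ArCoder, V) == 8, "V is read at [P0+8]");
static_assert(offsetof(ArCoder, nzerosf) == 32, "nzerosf is read at [P0+32]");
static_assert(offsetof(ArCoder, extrabits) == 36, "extrabits is at [P0+36]");
static_assert(offsetof(ArCoder, in_addr) == 52, "input address is at [P0+52]");
```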
Assumption : Both input and output arrays are unsigned character arrays.
Each bit is stored in its own array element.
Prototype : void StartArDecoder(ArDecoder *decoder,unsigned char *in);
int arith_decoder_mpeg4(int co,ArDecoder *decoder);
void decode_renormlise(ArDecoder *decoder);
void AddNextInputBit(ArDecoder *decoder);
void StopArDecoder( ArDecoder *decoder);
Calling sequence: The decoder is initialised by calling _StartArDecoder.
The _arith_decoder_mpeg4 function is called once for each context
and returns one bit as output. Finally the _StopArDecoder
function is called.
Following C code explains the calling sequence.
main()
{
int i,j,k,C;
ArCoder coder;
.
.
.
_StartArDecoder(&coder,&bit_input[0]);
// bit_input is the input array
for(i=0;i<NO_OF_PAIR;i++)
{
j=arrayIn0_C[i];
C=intra_prob[j];
// probability of '0' fetched from table
k=_arith_decoder_mpeg4(C,&coder);
// returns one output bit for each call
output[i]=k;
}
_StopArDecoder(&coder);
.
.
}
Performance :
Code Size :
StartArDecoder : 154 bytes
arith_decoder_mpeg4 : 94 bytes
decode_renormlise : 124 bytes
AddNextInputBit : 126 bytes
StopArDecoder : 98 bytes
Cycle count :
                       Best case   Worst case
StartArDecoder       :   659         659      cycles
arith_decoder_mpeg4  :    66         539      cycles
StopArDecoder        :   103         103      cycles
As the cycle count depends on the data, the cycle counts above are for
Test case 1 in the C file arith_decoder_mpeg4.c.
The cycle count for 512 output bits that take 2 bits as input is
36374 cycles (71.04 cycles per output symbol),
whereas the cycle count for 512 output bits that take 472 bits as input is
57040 cycles (111.41 cycles per output symbol).
Reference : Appendix G and Appendix F of the MPEG-4 Video Verification
Model (V16.0).
*******************************************************************************/
/******************************************************************************
Prototype : void StartArDecoder(ArDecoder *decoder,unsigned char *in);
In this procedure the lower bound register (decoder->L) is set to zero and
the range register (decoder->R) is set to 0x7fffffff. The first 31 bits are
read into the decoder->V register.
Registers used : R0-R7, P0-P2, P5, LC0.
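The initialization described above can be sketched in C as follows. This is a
simplified model, not a translation of the assembly: InitState and
start_decoder_sketch are hypothetical names, the input is assumed to hold one
bit per byte, and the stuffed-bit bookkeeping (nzerosf, extrabits) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical subset of the decoder state, for illustration only.
typedef struct {
    uint32_t L;   // lower bound of the interval
    uint32_t R;   // range of the interval
    uint32_t V;   // arithmetic code value
} InitState;

// Sets L = 0 and R = 0x7fffffff, then shifts the first 31 input bits into V,
// most significant bit first.
void start_decoder_sketch(InitState *d, const unsigned char *in)
{
    assert(d != NULL && in != NULL);
    d->L = 0;
    d->R = 0x7fffffffu;
    d->V = 0;
    for (int i = 0; i < 31; i++) {
        d->V = (d->V << 1) | (uint32_t)(in[i] & 1u);
    }
}
```

If all 31 input bits are 1, V ends up as 0x7fffffff.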
*******************************************************************************/
#define MAXHEADING_ER 3
#define MAXMIDDLE_ER 10
#define MAXTRAILING_ER 2
.section L1_code;
.global __StartArDecoder;
.align 8;
__StartArDecoder:
P0 = R0; // Address of structure decoder
[--SP] = (R7:4,P5:5); // push R7:4,P5 register
P1 = 31; // set loop counter to read 31 bits
R0 = MAXHEADING_ER;
R5 = MAXMIDDLE_ER;
R2 = 1;
R1 = R1-R2(NS)||[P0+44] = R5;
// R1 = in - 1; set decoder->mm to MAXMIDDLE_ER
[P0+52] = R1; // store decoder->in (input array address minus one)
R3 = R1-R1(NS)||[P0+32] = R0;
// clear R3; set decoder->nzerosf to MAXHEADING_ER
[P0+40] = R0; // set decoder->mh to MAXHEADING_ER
R0 = MAXTRAILING_ER;
R6 = R1-R1(NS)||[P0+48] = R0;
// set decoder->mt to MAXTRAILING_ER
P5 = [P0+52]; // get address of input array
R7 = R1-R1(NS)||[P0+36] = R3;
// clear R7 (accumulates V) and clear decoder->extrabits
LSETUP(LOAD_31BITS_ST,LOAD_31BITS_END)LC0 = P1;
P1 = 1;
LOAD_31BITS_ST:
P2 = P1+P5; // address to fetch a bit
R7 = R7<<1||R0 = B[P2](Z);
// left shift V register by 1 and fetch next bit
R7 = R7+R0(NS)||R4 = [P0+32];
// add that bit to V register and fetch
// decoder->nzerosf
CC = R0 == 0; // check if bit == 0
IF CC R6 = R2;
R4 = R4-R6(NS)||R1 = [P0+36];
// if true decrement decoder->nzerosf by one; fetch decoder->extrabits
CC = R4 == 0; // check if decoder->nzerosf is zero
R6 = CC;
IF CC R3 = R2; // if true increment decoder->extrabits by one
R1 = R1+R3;
BITTGL(r6,0);
R6 = R6&R0; // check whether to set decoder->nzerosf to
// MAXMIDDLE_ER (i.e. 10)
CC = R6 == 0;
IF CC R5 = R4;
P2 = R1;
P1 += 1; // increment the pointer to fetch next bit
R6 = R1-R1(NS)||[P0+36] = R1;
// clear R6; store decoder->extrabits
R3 = R2-R2(NS)||[P0+32] = R5;
// clear R3; store decoder->nzerosf
R5 = MAXMIDDLE_ER;
LOAD_31BITS_END:
P1 = P1+P2; // offset to fetch next bit
[P0+8] = R7; // store first 31 bits in decoder->V register
[P0] = R6; // Clear decoder->L register
[P0+16] = R6; // clear decoder->bits-to-follow register
[P0+12] = R7; // set decoder->arpipe to decoder->V register
[P0+28] = R6; // clear decoder->nonzero register
R0 = [P0+40]; // fetch decoder->mh
BITSET(R3,31); // R3 = 0x80000000
R3 += -1; // R3 = 0x7fffffff
[P0+24] = R0; // set decoder->nzeros to decoder->mh
[P0+4] = R3; // set decoder->R register to 0x7fffffff.
(R7:4,P5:5) = [SP++]; // pop R7:4,P5
RTS;
NOP;
__StartArDecoder.end:
/******************************************************************************
Prototype : int arith_decoder_mpeg4(int co,ArDecoder *decoder);
co -> probability of '0' symbol
In this procedure, the probability of symbol '1' is calculated from the
probability of symbol '0'. If the probability of symbol '1' is greater than or
equal to the probability of symbol '0', then '0' is the least probable symbol
(LPS); otherwise '1' is the LPS. The range of the LPS (rLPS) is calculated by
multiplying the upper 16 bits of the range register (R) by the probability of
the LPS (CLPS). The interval (L,L+R) is split into two intervals, (L,L+R-rLPS)
and (L+R-rLPS,L+R). If decoder->V is in the latter interval, the decoded symbol
is the LPS. Otherwise the decoded symbol is the opposite of the LPS. The
interval (L,L+R) is then reduced to the sub-interval in which decoder->V lies.
After the new interval has been computed, the new range R might be smaller than
0x40000000 (QUARTER). If so, renormalization is carried out.
Registers used : R0-R3, R5-R7, P0-P2.
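The symbol decision described above can be sketched in C as follows. It is an
illustration under stated assumptions, not a translation of the assembly:
SymState and decode_symbol_sketch are hypothetical names, probabilities are
assumed to be 16-bit fixed point values that sum to 0x10000, and the
renormalization step is omitted.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical subset of the decoder state, for illustration only.
typedef struct {
    uint32_t L;   // lower bound of the interval
    uint32_t R;   // range of the interval
    uint32_t V;   // arithmetic code value
} SymState;

// Decodes one symbol given c0, the probability of '0'.
int decode_symbol_sketch(uint32_t c0, SymState *d)
{
    assert(c0 <= 0x10000u);
    uint32_t c1   = 0x10000u - c0;          // probability of '1'
    int      lps  = (c0 > c1);              // '0' is the LPS on a tie
    uint32_t cLPS = lps ? c1 : c0;          // probability of the LPS
    uint32_t rLPS = (d->R >> 16) * cLPS;    // range assigned to the LPS
    int bit;

    if (d->V - d->L >= d->R - rLPS) {
        bit = lps;                          // V lies in the upper sub-interval
        d->L += d->R - rLPS;
        d->R  = rLPS;
    } else {
        bit = 1 - lps;                      // V lies in the lower sub-interval
        d->R -= rLPS;
    }
    return bit;
}
```

With c0 = 0x8000, L = 0, R = 0x7fffffff and V = 0, rLPS is 0x3fff8000, V falls
in the lower sub-interval, and the decoded bit is 1.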
*******************************************************************************/
.section program;
.global __arith_decoder_mpeg4;
.align 8;