Unpacking The Block Coding Information
--------------------------------------
After unpacking the frame header, the decoder unpacks the block coding
information. The only information determined in this phase is whether a
particular superblock and its fragments are coded in the current frame
or unchanged from the previous frame. The actual coding method is
determined in the next phase.

If the frame is a golden frame then every superblock, macroblock, and
fragment is marked as coded.

If the frame is an interframe, then the block coding information must be
decoded. This is the phase where a decoder will build a list of coded
fragments for which coding mode, motion vector, and DCT coefficient data
must be decoded.

First, a list of partially-coded superblocks is unpacked from the
stream. This list is coded as a series of variable-length run length
codes (VLRLC). First, the code is initialized by reading the next bit in
the stream. Then, while there are still superblocks remaining in the
list, fetch a VLC from the stream according to this table:

  Codeword              Run Length
  0                       1
  10x                     2-3
  110x                    4-5
  1110xx                  6-9
  11110xxx                10-17
  111110xxxx              18-33
  111111xxxxxxxxxxxx      34-4129

For example, a VLC of 1101 represents a run length of 5. If the VLRLC
was initialized to 1, then the next 5 superblocks would be set to 1,
indicating that they are partially coded in the current frame. Then the
bit value is toggled to 0, another VLC is fetched from the stream and
the process continues until each superblock has been marked either
partially coded (1) or not (0).

If any of the superblocks were marked as not partially coded in the
previous step, then a list of fully-coded superblocks is unpacked next
using the same VLRLC as the list of partially-coded superblocks.
Initialize the VLRLC with the next bit in the stream. For each
superblock that was not marked as partially coded, mark it with either a
0 or 1 according to the current VLRLC.
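
The run-length scheme above can be sketched in a few lines of Python.
This is only an illustrative sketch, not a reference decoder: the names
SUPERBLOCK_RUN_TABLE, read_run, and unpack_flags are hypothetical, the
bitstream is modeled as an iterator over '0'/'1' characters rather than
packed bytes, and truncating a run that overshoots the end of the list
is an assumption that the text does not address.

```python
from typing import Iterator, List, Sequence, Tuple

# (prefix, number of extra bits, base run length), copied from the
# superblock run table in the text.
SUPERBLOCK_RUN_TABLE: Sequence[Tuple[str, int, int]] = [
    ("0", 0, 1),
    ("10", 1, 2),
    ("110", 1, 4),
    ("1110", 2, 6),
    ("11110", 3, 10),
    ("111110", 4, 18),
    ("111111", 12, 34),
]


def read_run(bits: Iterator[str], table: Sequence[Tuple[str, int, int]]) -> int:
    """Read one VLC from an iterator of '0'/'1' characters; return its run length."""
    prefix = ""
    while True:
        prefix += next(bits)
        for code, extra, base in table:
            if prefix == code:
                value = 0
                for _ in range(extra):
                    value = (value << 1) | int(next(bits))
                return base + value


def unpack_flags(bits: Iterator[str], count: int) -> List[int]:
    """Decode `count` flags: one initial bit, then runs that toggle the bit."""
    flags: List[int] = []
    if count == 0:
        return flags
    bit = int(next(bits))
    while len(flags) < count:
        run = read_run(bits, SUPERBLOCK_RUN_TABLE)
        # Assumption: a run longer than the remaining list is truncated.
        flags.extend([bit] * min(run, count - len(flags)))
        bit ^= 1
    return flags


# Initial bit 1 followed by the VLC 1101 (run length 5).
print(unpack_flags(iter("11101"), 5))  # -> [1, 1, 1, 1, 1]
```

Under the same assumptions, the routine could be reused for the list of
fully-coded superblocks by passing the number of superblocks that were
not marked as partially coded as the count.
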
By the end of this step, each superblock will be marked as either not
coded, partially coded, or fully coded.

Let's work through an example with an image frame that is 256x64 pixels.
This means that the Y plane contains 4x2 superblocks and each of the C
planes contains 2 superblocks. The superblocks are numbered as follows:

  Y:  0  1  2  3     U:  8  9
      4  5  6  7     V: 10 11

This is the state of the bitstream:

  1100011001101

Which is interpreted as:

  initial   2 1's   1 0   4 1's   5 0's
     1       100     0    1100    1101

Superblocks 0-1 and 3-6 are marked as partially coded. Since there were
superblocks that were not marked, proceed to unpack the list of
fully-coded superblocks. This is the state of the bitstream:

  1101101

Which is interpreted as:

  initial   3 1's   3 0's
     1       101     101

Superblocks 2, 7, and 8 are marked as fully coded while superblocks 9,
10, and 11 are marked as not coded.

If any of the superblocks were marked as partially coded, the next data
in the bitstream will define which fragments inside each partially-coded
superblock are coded. This is the first place where the Hilbert pattern
comes into play.

For each partially-coded superblock, iterate through each fragment
according to the Hilbert pattern. Use the VLRLC method, only with a
different table, to determine which fragments are coded. The VLRLC table
for fragment coding runs is:

  Codeword              Run Length
  0x                      1-2
  10x                     3-4
  110x                    5-6
  1110xx                  7-10
  11110xx                 11-14
  11111xxxx               15-30

Continuing with the contrived example, superblocks 0 and 1 are both
partially coded. This is the state of the bitstream:

  0011001111010001111010...(not complete)

Which is interpreted as:

  initial   2 0's   3 1's   13 0's    1 1   13 0's
     0       01      100    1111010   00    1111010   ...

This indicates that fragments 2-4 in superblock 0 are coded, while
fragments 0, 1, and 5-15 are not. Note that the first run of 13 0's
carries over into the next superblock, indicating that fragments 0 and 1
of superblock 1 are not coded. Fragment 2 of superblock 1 is coded, while
the rest of the superblock's fragments are not coded.
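
The fragment run decoding just described can be sketched as follows.
This is a hedged illustration rather than a definitive implementation:
FRAGMENT_RUN_TABLE, read_fragment_run, and unpack_fragment_flags are
hypothetical names, the bitstream is modeled as an iterator over '0'/'1'
characters, and every partially-coded superblock is assumed to hold a
full 16 fragments (superblocks on a frame edge may not).

```python
from typing import Iterator, List, Sequence, Tuple

# (prefix, number of extra bits, base run length), copied from the
# fragment coding run table in the text.
FRAGMENT_RUN_TABLE: Sequence[Tuple[str, int, int]] = [
    ("0", 1, 1),
    ("10", 1, 3),
    ("110", 1, 5),
    ("1110", 2, 7),
    ("11110", 2, 11),
    ("11111", 4, 15),
]


def read_fragment_run(bits: Iterator[str]) -> int:
    """Read one fragment-run VLC and return the run length it encodes."""
    prefix = ""
    while True:
        prefix += next(bits)
        for code, extra, base in FRAGMENT_RUN_TABLE:
            if prefix == code:
                value = 0
                for _ in range(extra):
                    value = (value << 1) | int(next(bits))
                return base + value


def unpack_fragment_flags(bits: Iterator[str], partial_superblocks: int) -> List[List[int]]:
    """Return 16 coded flags (in Hilbert order) per partially-coded superblock.

    Runs are allowed to carry over from one superblock into the next.
    """
    total = 16 * partial_superblocks  # assumption: no truncated edge superblocks
    flat: List[int] = []
    if total == 0:
        return []
    bit = int(next(bits))
    while len(flat) < total:
        run = read_fragment_run(bits)
        flat.extend([bit] * min(run, total - len(flat)))
        bit ^= 1
    return [flat[i:i + 16] for i in range(0, total, 16)]


# The contrived bitstream from the text, covering superblocks 0 and 1.
flags = unpack_fragment_flags(iter("0011001111010001111010"), 2)
coded = [[i for i, f in enumerate(sb) if f] for sb in flags]
print(coded)  # -> [[2, 3, 4], [2]]
```

The printed Hilbert indices follow from the run lengths in the
interpretation row above (2, 3, 13, 1, 13).
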
The contrived example ends there (a real bitstream should have enough
data to describe all of the partially-coded superblocks). Superblock 2
is fully coded which means all 16 fragments are coded. Thus, superblocks
0-2 have the following coded fragments:

   0 |  x  x  x  x    x  x  x  x    0  1 14 15
  32 |  3  2  x  x    x  2  x  x    3  2 13 12
  64 |  4  x  x  x    x  x  x  x    4  7  8 11
  96 |  x  x  x  x    x  x  x  x    5  6  9 10

This is a good place to generate the list of coded fragment numbers for
this frame. In this case, the list will begin as:

  33 32 64 37 8 9 41 40 72 104 105 73 ...

and so on through the remaining 8 fragments of superblock 2 and onto the
fragments for the remaining superblocks that are either fully or
partially coded.


Unpacking The Macroblock Coding Mode Information
------------------------------------------------
After unpacking the block coding information, the decoder unpacks the
macroblock coding mode information. This process is simple when
decoding a golden frame: since the only possible decoding mode is INTRA,
no macroblock coding mode information is transmitted.
However, in an interframe, each coded macroblock is encoded with one of
8 methods:

0, INTER_NO_MV: current fragment =
   (fragment from previous frame @ same coordinates) +
   (DCT-encoded residual)
1, INTRA: current fragment = DCT-encoded block, just like in a golden
   frame
2, INTER_PLUS_MV: current fragment =
   (fragment from previous frame @ (same coords + motion vector)) +
   (DCT-encoded residual)
3, INTER_LAST_MV: same as INTER_PLUS_MV but using the last motion vector
   decoded from the bitstream
4, INTER_PRIOR_LAST: same as INTER_PLUS_MV but using the second-to-last
   motion vector decoded from the bitstream
5, USING_GOLDEN: same as INTER_NO_MV but referencing the golden frame
   instead of previous interframe
6, GOLDEN_MV: same as INTER_PLUS_MV but referencing the golden frame
   instead of previous interframe
7, INTER_FOURMV: same as INTER_PLUS_MV except that each of the 4 Y
   fragments gets its own motion vector, and the U and V fragments share
   the same motion vector which is the average of the 4 Y fragment
   vectors

The MB coding mode information is encoded using one of 8 alphabets. The
first 3 bits of the MB coding mode stream indicate which of the 8
alphabets, 0..7, to use to decode the MB coding information in this
frame.

The reason for the different alphabets is to minimize the number of bits
needed to encode this section of information. Each alphabet arranges the
coding modes in a different order, indexing the 8 modes into 8 index
slots.
Index 0 is encoded with 1 bit (0), index 1 is encoded with 2 bits
(10), index 2 is encoded with 3 bits (110), and so on up to indices 6 and
7 which are encoded with 7 bits each (1111110 and 1111111, respectively):

  index    encoding
  -----    --------
    0      0
    1      10
    2      110
    3      1110
    4      11110
    5      111110
    6      1111110
    7      1111111

For example, the coding modes are arranged in alphabet 1 as follows:

  index    coding mode
  -----    -----------
    0      MODE_INTER_LAST_MV
    1      MODE_INTER_PRIOR_LAST
    2      MODE_INTER_PLUS_MV
    3      MODE_INTER_NO_MV
    4      MODE_INTRA
    5      MODE_USING_GOLDEN
    6      MODE_GOLDEN_MV
    7      MODE_INTER_FOURMV

This alphabet arrangement is designed for frames in which motion vectors
based on the previous interframe dominate.

When unpacking MB coding mode information for a frame, the decoder first
reads 3 bits from the stream to determine the alphabet. In this example,
the 3 bits would be 001 to indicate alphabet 1. Consider this contrived
bitstream following the alphabet number:

  1010000011000011111110...

The bits are read as follows:

  bits:   10  10  0  0  0  0  110  0  0  0  1111111  0
  index:   1   1  0  0  0  0    2  0  0  0        7  0

This arrangement of indices translates to this series of coding modes:

  index    coding mode
  -----    -----------
    1      MODE_INTER_PRIOR_LAST
    1      MODE_INTER_PRIOR_LAST
    0      MODE_INTER_LAST_MV
    0      MODE_INTER_LAST_MV
    0      MODE_INTER_LAST_MV
    0      MODE_INTER_LAST_MV
    2      MODE_INTER_PLUS_MV
    0      MODE_INTER_LAST_MV
    0      MODE_INTER_LAST_MV
    0      MODE_INTER_LAST_MV
    7      MODE_INTER_FOURMV
    0      MODE_INTER_LAST_MV

There are 6 pre-defined alphabets. Consult Appendix B for the complete
alphabets. What happens if none of the 6 pre-defined alphabets fit? The
VP3 encoder can choose to use alphabet 0 which indicates a custom
alphabet. The 3-bit coding mode numbers for each index, 0..7, are stored
after the alphabet number in the bitstream.
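
The alphabet 1 walkthrough above can be reproduced with a short sketch.
It is only an illustration under stated assumptions: ALPHABET_1 is
copied from the table in the text, while read_mode_index and
unpack_modes are hypothetical names, the bitstream is modeled as an
iterator over '0'/'1' characters, and the 3-bit alphabet number is
assumed to have been consumed already. The custom alphabet (0) and the
fixed-length alphabet (7) are not handled here.

```python
from typing import Iterator, List, Sequence

# Alphabet 1 as listed in the text: list position = index slot.
ALPHABET_1: Sequence[str] = [
    "MODE_INTER_LAST_MV",
    "MODE_INTER_PRIOR_LAST",
    "MODE_INTER_PLUS_MV",
    "MODE_INTER_NO_MV",
    "MODE_INTRA",
    "MODE_USING_GOLDEN",
    "MODE_GOLDEN_MV",
    "MODE_INTER_FOURMV",
]


def read_mode_index(bits: Iterator[str]) -> int:
    """Count leading 1 bits, stopping at the first 0 or after seven 1s."""
    index = 0
    while index < 7 and next(bits) == "1":
        index += 1
    return index


def unpack_modes(bits: Iterator[str], macroblock_count: int,
                 alphabet: Sequence[str]) -> List[str]:
    """Decode one coding mode per coded macroblock using `alphabet`."""
    return [alphabet[read_mode_index(bits)] for _ in range(macroblock_count)]


# The contrived bitstream that follows the alphabet number in the text.
stream = iter("1010000011000011111110")
indices = [read_mode_index(stream) for _ in range(12)]
print(indices)  # -> [1, 1, 0, 0, 0, 0, 2, 0, 0, 0, 7, 0]
```

Because indices 6 and 7 share the same length, the loop stops after
seven bits without looking for a terminating 0.
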
For example, the sequence:

  000 111 110 101 100 011 010 001 000

would indicate coding alphabet 0 (custom alphabet), index 0 corresponds to
coding mode 7 (INTER_FOURMV), index 1 corresponds to coding mode 6
(GOLDEN_MV), and so on down to index 7 which would correspond to coding
mode 0 (INTER_NO_MV).

There is one more possible alphabet: Alphabet 7. This alphabet is
reserved for when there is such a mixture of coding modes used in a frame
that using any variable-length code would result in more bits than a
fixed-length representation. When alphabet 7 is specified, the decoder
reads 3 bits at a time from the bitstream, and uses those directly as the
macroblock coding modes.

To recap, this is the general algorithm for decoding macroblock coding
mode information:

  if (golden frame)
    all macroblocks are intracoded, there is no MB coding mode
    information
  else
    read 3 bits from bitstream to determine alphabet
    if alphabet = 0
      this is a custom alphabet, populate index table with 8 3-bit
      coding modes read from bitstream
    foreach coded macroblock, unpack a coding mode:
      if alphabet = 7
        read 3 bits from the bitstream as the coding mode for the
        macroblock
      else
        read a VLC from the bitstream
        use the decoded VLC value to index into the coding mode
        alphabet selected for this frame and assign the indexed coding
        mode to this macroblock


Unpacking The Macroblock Motion Vectors
---------------------------------------
After unpacking the macroblock coding mode information, the decoder
unpacks the macroblock motion vectors. This phase essentially assigns a
motion vector to each of the 6 constituent fragments of any coded
macroblock that requires motion vectors.

If the frame is a golden frame then there is no motion compensation and
no motion vectors are encoded in the bitstream.

If the frame is an interframe, the next bit is read from the bitstream
to determine the vector entropy coding method used. If the coding method
is zero then all of the vectors will be unpacked using a VLC method.
If the coding method is 1 then all of the vectors will be unpacked using
a fixed length method.

The VLC unpacking method reads 3 bits from the bitstream. These 3 bits
comprise a number ranging from 0..7 which indicates the next action:

0, MV component = 0
1, MV component = 1
2, MV component = -1
3, MV component = 2, read next bit for sign
4, MV component = 3, read next bit for sign
5, MV component = 4 + (read next 2 bits), read next bit for sign
   range: (4..7, -4..-7)
6, MV component = 8 + (read next 3 bits), read next bit for sign
   range: (8..15, -8..-15)
7, MV component = 16 + (read next 4 bits), read next bit for sign
   range: (16..31, -16..-31)

The fixed length vector unpacking method simply reads the next 5 bits
from the bitstream, reads the next bit for sign, and calls the whole
thing a motion vector component. This gives a range of (-31..31), which
is the same range as the VLC method.

For example, consider the following contrived motion vector bitstream:

  000001011011111000...

The stream is read as:

  0 (000 010) (110 111 1 100 0)

The first bit indicates the entropy method which, in this example, is
variable length as opposed to fixed length. The next 3 bits are 0 which
indicate an X MV component of 0. The next 3 bits are 2 which indicate a Y
MV component of -1. The first motion vector encoded in this stream is