
📄 architecture

This file contains a couple of (currently unstructured and incomplete) encoder
implementation notes.


Basic Assumptions

- data structures

Primary data structures are:
 - picture data arrays, containing either source or reconstructed pels
 - mbinfo array, containing all macroblock level side information
 - blocks array, containing DCT coefficients

mbinfo and blocks together are completely equivalent to the VLC encoded
picture_data() part of the MPEG stream, although in uncompressed and
therefore rather voluminous internal format.

a) picture data arrays

Frames are represented internally as in the following declaration:

unsigned char *frame[3];

i.e., an array of three pointers to the picture data proper. frame[0]
points to the luminance (Y) data, frame[1] to the first chrominance (Cb, U)
component, frame[2] to the second chrominance component (Cr, V).

Width and height of the luminance data array are 16*mb_width and 16*mb_height,
even if horizontal_size and vertical_size are not divisible by 16. The
actual pels are top left aligned and padded to the right and bottom (by
pixel replication) while being read into the encoder.

Width and height of the two chrominance arrays depend on the chroma_format.
For 4:2:0 data, they are half size in both directions; for 4:2:2 data, only
the width is half that of the luminance array; for 4:4:4 data, chrominance
and luminance are of identical size.

The dimensions are stored in the following global variables:

             | horizontal   | vertical
-------------+--------------+-------------
luminance    | width        | height
chrominance  | chrom_width  | chrom_height

Picture data (array of pels) is always stored as frames, even when a
field picture sequence is to be generated. The two fields are stored
in interleaved form, that is, as alternating lines from both fields.
In most cases, however, it's easier to view the two fields as being stored
side by side as an array of double width and half height.

The following picture demonstrates this:

 <----- width --------->

 T1 T1 T1 T1 T1 T1 T1 T1 <+
 B1 B1 B1 B1 B1 B1 B1 B1  |
 T2 T2 T2 T2 T2 T2 T2 T2  |
 B2 B2 B2 B2 B2 B2 B2 B2  | height
 T3 T3 T3 T3 T3 T3 T3 T3  |
 B3 B3 B3 B3 B3 B3 B3 B3 <+


 <------------- 2*width (width2) --------------->

 T1 T1 T1 T1 T1 T1 T1 T1  B1 B1 B1 B1 B1 B1 B1 B1 <+
 T2 T2 T2 T2 T2 T2 T2 T2  B2 B2 B2 B2 B2 B2 B2 B2  | height/2 (height2)
 T3 T3 T3 T3 T3 T3 T3 T3  B3 B3 B3 B3 B3 B3 B3 B3 <+

 Tn: top field pels
 Bn: bottom field pels

These are of course only two different two-dimensional interpretations of
the same one-dimensional memory layout. Either the top or the bottom field
can be the earlier field in time.

The following table shows the relation between width, height, width2 and
height2 for frame and field pictures:

         | frame  | field
 --------+--------+---------
 width2  | width  | 2*width
 height2 | height | height/2

Using the convenience variables width2 and height2, many loops are
independent of frame / field picture coding.

b) mbinfo array

This array contains the complete encoded picture (field or frame),
except the DCT coefficients. This includes macroblock type, motion vectors,
motion vector type, dct type, quantization parameter etc. The number
of entries (size of the array) is identical to the number of macroblocks
in the picture.

c) blocks

This array contains all DCT coefficients of the picture. It is declared as

EXTERN short (*blocks)[64];

i.e. a pointer to an array whose elements are blocks of 64 short integers
each. The number of blocks is block_count times the number of macroblocks
in the picture. block_count depends on the chroma format (6, 8 or 12 blocks
per macroblock).

The actual content depends on the encoding stage. Either prediction error
(or intra block data), DCT transformed data, quantized DCT coefficients or
inverse transformed data is stored in blocks.
This is done to keep memory requirements reasonable.


Encoding procedure

The high-level structure of the bitstream is determined in putseq().
This includes writing the appropriate headers and extensions and the
decision which picture coding type to choose for each picture.

Encoding of a picture is divided into the following steps:

- motion estimation (motion_estimation())
- calculate prediction (predict())
- DCT type estimation (dct_type_estimation())
- subtract prediction from picture and perform DCT (transform())
- quantize DCT coefficients and generate VLC data (putpict())
- inverse quantize DCT coefficients (iquant())
- perform IDCT and add prediction (itransform())

Each of these steps is performed for the complete picture before
proceeding to the next one. The intention is to keep these
steps as independent from each other as possible. They communicate
only via the above mentioned basic data structures. This should simplify
experimenting with different coding models.

Quantization and VLC generation could not be separated from each other,
as quantization parameters and output buffer content usually form a
closed loop on macroblock level.

- Motion Estimation

The procedures for motion estimation are in the file motion.c. The main
function motion_estimation() loops through all macroblocks of the current
picture calling frame_ME (for frame pictures) or field_ME (for field
pictures) to calculate motion vectors for each macroblock.

Motion estimation is currently done separately for each picture.
More efficient schemes
like telescopic motion estimation, which span several pictures (e.g.
two I/P pictures and all intervening B pictures), are not yet implemented.

Motion estimation splits into three steps:

- calculation of optimum motion vectors for each of the possible motion
  compensation types (frame, field, 16x8, dual prime)
- selection of the best motion compensation type by calculating and comparing
  a prediction error based cost function
- selection of either motion compensation or any other possible encoding
  type (intra coding, No MC coding)

Motion vectors are estimated by integer pel full search in a search window
of user defined size. This full search is based on the original source
reference pictures. Subsequently the cost functions for the 9 motion vectors
with a horizontal and vertical offset of -0.5, 0, and 0.5 relative to the
best integer pel vector are evaluated (using the reconstructed reference
picture) and the vector with the smallest cost function is used as the
estimate.

To speed up full search, the cost function calculation is aborted if the
intermediate values exceed the cost function value of an earlier motion
vector in the same full search. This is most efficient if vectors of
potentially low cost are evaluated first. Therefore full search is organized
in an outward spiral.
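
The early abort and the outward ordering of the integer pel full search can
be sketched as follows. This is only an illustration under stated
assumptions, not the code in motion.c: the names sad16_abort() and
spiral_search() are hypothetical, the cost function is assumed to be a plain
sum of absolute differences over a 16x16 block, and the "spiral" is
approximated by visiting square rings of growing radius around the zero
vector.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Sum of absolute differences over a 16x16 block (hypothetical helper).
 * The row loop stops as soon as the running sum reaches `limit`, the best
 * cost found so far, because such a candidate can no longer win. */
static int sad16_abort(const unsigned char *ref, const unsigned char *cur,
                       int stride, int limit)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(ref[x] - cur[x]);
        if (sum >= limit)
            return sum;            /* early abort */
        ref += stride;
        cur += stride;
    }
    return sum;
}

/* Integer pel full search for the block at (x0, y0) of `cur`, looking in
 * `ref` within +/- win pels. Candidates are visited ring by ring, starting
 * at the zero vector, so cheap vectors tend to be found early and the
 * abort limit tightens quickly. Returns the best cost. */
static int spiral_search(const unsigned char *ref, const unsigned char *cur,
                         int stride, int width, int height,
                         int x0, int y0, int win, int *best_dx, int *best_dy)
{
    const unsigned char *blk = cur + y0 * stride + x0;
    int best = INT_MAX;

    assert(win >= 0 && x0 >= 0 && y0 >= 0);
    assert(x0 + 16 <= width && y0 + 16 <= height);
    *best_dx = *best_dy = 0;

    for (int r = 0; r <= win; r++) {
        for (int dy = -r; dy <= r; dy++) {
            /* Top and bottom edge of the ring: every column.
             * Rows in between: only the two outermost columns. */
            int step = (dy == -r || dy == r) ? 1 : 2 * r;
            for (int dx = -r; dx <= r; dx += step) {
                int x = x0 + dx, y = y0 + dy;
                if (x < 0 || y < 0 || x + 16 > width || y + 16 > height)
                    continue;      /* candidate block leaves the picture */
                int d = sad16_abort(ref + y * stride + x, blk, stride, best);
                if (d < best) {
                    best = d;
                    *best_dx = dx;
                    *best_dy = dy;
                }
            }
        }
    }
    return best;
}
```

A caller would pass the luminance planes of the reference and current
picture, the line stride, and the macroblock position; the half pel
refinement described above is deliberately left out of this sketch.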

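The frame and field views described under "a) picture data arrays" are two
ways of indexing the same memory, which a short sketch may make more
concrete. The enum, struct pict_view, make_view() and pel_at() below are
hypothetical names introduced for illustration; the assumption is that a
field picture is addressed with a line stride of 2*width and that the bottom
field starts one frame line (width pels) into the plane.

```c
#include <assert.h>
#include <stddef.h>

/* Picture structure; names and values are illustrative only. */
enum pict_struct { FRAME_PICTURE, TOP_FIELD, BOTTOM_FIELD };

/* Geometry of one picture as seen by a coding loop. */
struct pict_view {
    int width2;     /* line stride in pels: width (frame) or 2*width (field) */
    int height2;    /* number of lines: height (frame) or height/2 (field)   */
    size_t offset;  /* first pel: 0, or width for the bottom field           */
};

static struct pict_view make_view(int width, int height, enum pict_struct ps)
{
    struct pict_view v;
    assert(width > 0 && height > 0 && height % 2 == 0);
    v.width2  = (ps == FRAME_PICTURE) ? width  : 2 * width;
    v.height2 = (ps == FRAME_PICTURE) ? height : height / 2;
    v.offset  = (ps == BOTTOM_FIELD) ? (size_t)width : 0;
    return v;
}

/* Pointer to pel (x, y) of the frame or field described by v. */
static unsigned char *pel_at(unsigned char *plane, const struct pict_view *v,
                             int x, int y)
{
    assert(y >= 0 && y < v->height2 && x >= 0);
    return plane + v->offset + (size_t)y * v->width2 + x;
}
```

With such a view, a loop of the form "for y in 0..height2, step width2 pels
per line" covers either the whole frame or a single field without any
special casing, which is the point of the convenience variables.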