📄 172-200.html

📁 The primary purpose of this book is to explain various data-compression techniques using the C progr
💻 HTML
📖 第 1 页 / 共 3 页
字号:
上一页 1 23
* to context "C" and the symbol 'D'.  It then returns a pointer to* "CD", which I use to create the "BCD" table.  The recursion is* guaranteed to end if it ever gets to order -1, because the null table* is guaranteed to have a link for every symbol to the order 0 table.* This is the most complicated part of the modeling program, but it is* necessary for performance reasons.*/CONTEXT *shift_to_next_context( table, c, order )CONTEXT *table;int c;int order;{     int i;     CONTEXT *new_lesser;/** First, try to find the new context by backing up to the lesser* context and searching its link table.  If I find the link, we take* a quick and easy exit, returning the link.  Note that there is a* special kludge for context order 0.  We know for a fact that* the lesser context pointer at order 0 points to the null table,* order -1, and we know that the -1 table only has a single link* pointer, which points back to the order 0 table.*/     table = table-&gt;lesser_context;     if ( order == 0 )       return( table-&gt;links[ 0 ].next );     for ( i = 0 ; i &lt;= table-&gt;max_index ; i++ )       if ( table-&gt;stats[ i ].symbol == (unsigned char) c )         if ( table-&gt;links[ i ].next != NULL)           return( table-&gt;links[ i ].next );         else           break;/** If I get here, it means the new context did not exist.  I have to* create the new context, add a link to it here, and add the backwards* link to *his* previous context.  Creating the table and adding it to* this table is pretty easy, but adding the back pointer isn't.  Since* creating the new back pointer isn't easy, I duck my responsibility* and recurse to myself in order to pick it up.*/  new_lesser = shift_to_next_context( table, c, order-1 );/** Now that I have the back pointer for this table, I can make a call* to a utility to allocate the new table.*/  table = allocate_next_order_table( table, c, new_lesser );  return( table );}/** Rescaling the table needs to be done for one of three reasons.* First, if the maximum count for the table has exceeded 16383, it* means that arithmetic coding using 16 and 32 bit registers might* no longer work.  Secondly, if an individual symbol count has* reached 255, it will no longer fit in a byte.  Third, if the* current model isn't compressing well, the compressor program may* want to rescale all tables in order to give more weight to newer* statistics.  All this routine does is divide each count by 2.* If any counts drop to 0, the counters can be removed from the* stats table, but only if this is a leaf context.  Otherwise, we* might cut a link to a higher order table.*/void rescale_table( table )CONTEXT *table;{     int i;     if ( table-&gt;max_index == -1 )       return;     for ( i = 0 ; i &lt;= table-&gt;max_index ; i ++ )       table-&gt;stats[ i ].counts /= 2;     if ( table-&gt;stats[ table]&gt;max_index ].counts == 0 &amp;&amp;         table-&gt;links == NULL ) {         while ( table-&gt;stats[ table-&gt;max_index ].counts == 0 &amp;&amp;              table-&gt;max_index &gt;= 0 )           table-&gt;max_index&#150;&#150;;         if ( table-&gt;max_index == -1 ) {           free( (char *) table-&gt;stats );           table-&gt;stats = NULL;         } else {           table-&gt;stats = (STATS *)             realloc( (char *) table-&gt;stats,                     sizeof( STATS ) * ( table-&gt;max_index + 1 ) );           if ( table-&gt;stats == NULL )             fatal_error( "Error #11: reallocating stats space!" );     }   }}/** This routine has the job of creating a cumulative totals table for* a given context.  The cumulative low and high for symbol c are going to* be stored in totals[c+2] and totals[c+1].  Locations 0 and 1 are* reserved for the special ESCAPE symbol.  The ESCAPE symbol* count is calculated dynamically, and changes based on what the* current context looks like.  Note also that this routine ignores* any counts for symbols that have already shown up in the scoreboard,* and it adds all new symbols found here to the scoreboard.  This* allows us to exclude counts of symbols that have already appeared in*  higher order contexts, improving compression quite a bit.*/void totalize_table( table )CONTEXT *table;{     int i;     unsigned char max;     for ( ; ; ) {       max = 0;       i = table-&gt;max_index + 2;       totals[ i ] = 0;       for ( ; i &gt; 1 ; i- ) {         totals[ i-1 ] = totals[ i ];         if ( table-&gt;stats[ i-2 ].counts )           if ( ( current_order == -2 ) ||             scoreboard[ table-&gt;stats[ i-2 ].symbol ] == 0 )             totals[ i-1 ] += table-&gt;stats[ i-2].counts;         if ( table-&gt;stats[ i-2 ].counts &gt; max )           max = table-&gt;stats[ i-2 ].counts;       }/** Here is where the escape calculation needs to take place.*/     if ( max == 0 )       totals[ 0 ] = 1;     else {       totals[ 0 ] = (short int) ( 256 - table-&gt;max_index );       totals[ 0 ] *= table-&gt;max_index;       totals[ 0 ] /= 256;       totals[ 0 ] /= max;       totals[ 0 ]++;       totals[ 0 ] += totals[ 1 ];     }     if ( totals[ 0 ] &lt; MAXIMUM_SCALE )       break;     rescale_table( table );     }     for ( i = 0 ; i &lt; table-&gt;max_index ; i++ )       if (table-&gt;stats[i].counts != 0)         scoreboard[ table-&gt;stats[ i ].symbol ] = 1;}/** This routine is called when the entire model is to be flushed.* This is done in an attempt to improve the compression ratio by* giving greater weight to upcoming statistics.  This routine* starts at the given table, and recursively calls itself to* rescale every table in its list of links.  The table itself* is then rescaled.*/void recursive_flush( table )CONTEXT *table;{     int i;     if ( table-&gt;links != NULL )       for ( i = 0 ; i &lt;= table-&gt;max_index ; i++ )         if ( table-&gt;links[ i ].next != NULL )           recursive_flush( table-&gt;links[ i ].next );     rescale_table( table );}/** This routine is called to flush the whole table, which it does* by calling the recursive flush routine starting at the order 0* table.*/void flush_model(){  putc( 'F', stdout );  recursive_flush( contexts[ 0 ] );}/** Everything from here down define the arithmetic coder section* of the program.*/*/* These four variables define the current state of the arithmetic* coder/decoder.  They are assumed to be 16 bits long.  Note that* by declaring them as short ints, they will actually be 16 bits* on most 80X86 and 680X0 machines, as well as VAXen.*/static unsigned short int code;/* The present input code value      */static unsigned short int low; /* Start of the current code range   */static unsigned short int high;/* End of the current code range     */long underflow_bits;           /* Number of underflow bits pending  *//** This routine must be called to initialize the encoding process.* The high register is initialized to all 1s, and it is assumed that* it has an infinite string of 1s to be shifted into the lower bit* positions when needed.*/void initialize_arithmetic_encoder(){     low = 0;     high = 0xffff;     underflow_bits = 0;}/** At the end of the encoding process, there are still significant* bits left in the high and low registers.  We output two bits,* plus as many underflow bits as are necessary.*/void flush_arithmetic_encoder( stream )BIT_FILE *stream;{      OutputBit( stream, low &amp; 0x4000 );      underflow_bits++;      while ( underflow_bits-- &gt; 0 )        OutputBit( stream, ~low &amp; 0X4000 );      OutputBits( stream, 0L, 16 );}/** This routine is called to encode a symbol.  The symbol is passed* in the SYMBOL structure as a low count, a high count, and a range,* instead of the more conventional probability ranges.  The encoding* process takes two steps.  First, the values of high and low are* updated to take into account the range restriction created by the* new symbol.  Then, as many bits as possible are shifted out to* the output stream.  Finally, high and low are stable again and* the routine returns.*/void encode_symbol( stream, s )BIT_FILE *stream;SYMBOL *s;{  long range;/**  These three lines rescale high and low for the new symbol.*/     range = (long) ( high-low ) + 1;     high = low + (unsigned short int)                  (( range * s-&gt;high_count ) / s-&gt;scale -1 );     low = low + (unsigned short int)                  (( range * s-&gt;low_count ) / s-&gt;scale );/** This loop turns out new bits until high and low are far enough* apart to have stabilized.*/  for ( ; ; ) {/** If this test passes, it means that the MSDigits match, and can* be sent to the output stream.*/     if ( ( high &amp; 0x8000 ) == ( low &amp; 0x8000 ) ) {       OutputBit( stream, high &amp; 0x8000 );       while ( underflow_bits &gt; 0 ) {         OutputBit( stream, ~high &amp; 0x8000 );         underflow_bits--;       }     }/** If this test passes, the numbers are in danger of underflow, because* the MSDigits don't match, and the 2nd digits are just one apart.*/     else if ( ( low &amp; 0x4000 ) &amp;&amp; !( high &amp; 0x4000 )) {       underflow_bits += 1;       low &amp;= 0x3fff;       high |= 0x4000;     } else       return ;     low &lt;&lt;= 1;     high &lt;&lt;= 1;     high  |= 1;   }}/** When decoding, this routine is called to figure out which symbol* is presently waiting to be decoded.  This routine expects to get* the current model scale in the s-&gt;scale parameter, and it returns* a count that corresponds to the present floating point code;** code = count / s-&gt;scale*/short int get_current_count( s )SYMBOL *s;{     long range;     short int count;     range = (long) ( high - low ) + 1;     count = (short int)          ((((long) ( code - low ) + 1 ) * s-&gt;scale-1 ) / range );     return( count );}/** This routine is called to initialize the state of the arithmetic* decoder.  This involves initializing the high and low registers* to their conventional starting values, plus reading the first* 16 bits from the input stream into the code value.*/void initialize_arithmetic_decoder( stream )BIT_FILE *stream;{     int i;     code = 0;     for ( i = 0 ; i &lt; 16 ; i++ ) {        code &lt;&lt;= 1;        code += InputBit( stream );     }     low = 0;     high = 0xffff;}/** Just figuring out what the present symbol is doesn't remove* it from the input bit stream.  After the character has been* decoded, this routine has to be called to remove it from the* input stream.*/void remove_symbol_from_stream( stream, s )BIT_FILE *stream;SYMBOL *s;{  long range;/** First, the range is expanded to account for the symbol removal.*/     range = (long)( high - low ) + 1;     high = low + (unsigned short int)                  (( range * s-&gt;high_count ) / s-&gt;scale -1 );     low = low + (unsigned short int)                  (( range * s-&gt;low_count ) / s-&gt;scale );/** Next, any possible bits are shipped out.*/     for ( ; ; ) {/** If the MSDigits match, the bits will be shifted out.*/     if ( ( high &amp; 0x8000 ) == ( low &amp; 0x8000 ) ) {     }/** Else, if underflow is threatening, shift out the 2nd MSDigit.*/     else if ((low &amp; 0x4000) == 0x4000  &amp;&amp; (high &amp; 0x4000) == 0 ) {       code ^= 0x4000;       low &amp;= 0x3fff;       high |= 0x4000;     } else/** Otherwise, nothing can be shifted out, so I return.*/        return;     low &lt;&lt;= 1;     high &lt;&lt;= 1;     high |= 1;     code &lt;&lt;= 1;     code += InputBit( stream );   }}/************************** End of ARITH-N.C ***************************/</PRE><!--  END CODE //--><P><BR></P><CENTER><TABLE BORDER><TR><TD><A HREF="169-171.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="../ch07/201-203.html">Next</A></TD></TR></TABLE></CENTER></TD></TR></TABLE></BODY></HTML>
上一页 1 23
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -