406-456.html

来自「The primary purpose of this book is to e」· HTML 代码 · 共 1,981 行 · 第 1/5 页
HTML
1,981 行
* going to the end of the larger_child descendant chain.*/int FindNextNode( node )int node;{ int next; next = tree[ node ].smaller_child; while ( tree [ next ].larger_child != UNUSED )  next = tree[ next ].larger_child; return( next );}/** This routine performs the classic binary tree deletion algorithm.* If the node to be deleted has a null link in either direction, we* just pull the non-null link up one to replace the existing link.* If both links exist, we instead delete the next link in order, which* is guaranteed to have a null link, then replace the node to be deleted* with the next link.*/void DeleteString( p )int p;{ int replacement; if ( tree[ p ].parent == UNUSED )  return; if ( tree[ p ].larger_child == UNUSED )  ContractNode( p, tree [ p ].smaller_child ); else if (tree[ p ].smaller_child == UNUSED )  ContractNode( p , tree[ p ].larger_child ); else {  replacement = FindNextNode( p );  DeleteString( replacement );  ReplaceNode( p , replacement ); }}/** This is where most of the work done by the encoder takes place. This* routine is responsible for adding the new node to the binary tree.* It also has to find the best match among all the existing nodes in* the tree, and return that to the calling routine. To make mattes* even more complicated, if the new_node has a duplicate in the tree,* the old_node is deleted, for reasons of efficiency.*/int AddString( new_node, match_position )int new_node;int *match_position;{ int i; int test_node; int delta; int match_length; int *child; if ( new_node == END_OF_STREAM )  return( 0 ) ; test_node = tree[ TREE_ROOT ].larger_child; match_length = 0; for ( ; ; ) {  for ( i = 0 ; i &lt; LOOK_AHEAD_SIZE ; i++ ) {  delta = window[ MOD_WINDOW( new_node + 1 ) ] -      window[ MOD_WINDOW( test_node + i ) ];  if ( delta != 0 )   break; } if ( i &gt;= match_length ) {  match_length = i;  *match_position = test_node;  if ( match_length &gt;= LOOK_AHEAD_SIZE ) {   ReplaceNode( test_node, new_node );   return( match_length );  } } if ( delta &gt;= 0 )  child = &#38;tree[ test_node ].larger_child; else  child = &#38;tree[ test_node ].smaller_child; if ( *child == UNUSED ) {  *child = new_node;  tree[ new_node].parent = test_node;  tree[ new_node].larger_child = UNUSED;  tree[ new_node].smaller_child = UNUSED;  return( match_length );  }  test_node = *child; }}/** This section of code and data makes up the blocked I/O portion of the* program. Every token output consists of a single flag bit, followed* by either a single character or a index/length pair. The flag bits* are stored in the first byte of a buffer array, and the characters* and index/length pairs are stored sequentially in the remaining* positions in the array. After every eight output operations, the* first character of the array is full of flag bits, so the remaining* bytes stored in the array can be output. This can be done with a* single fwrite() operation, making for greater efficiency.** All that is needed to implement this is a few routines, plus three* data objects, which follow below. The buffer has the flag bits* packed into its first character, with the remainder consisting of* the characters and index/length pairs, appearing in the order they* were output. The FlagBitMask is used to indicate where the next* flag bit will go when packed into DataBuffer[ 0 ]. Finally, the* BufferOffset is used to indicate where the next token will be stored* in the buffer.*/char DataBuffer[ 17 ];int FlagBitMask;int BufferOffset;/** To initialize the output buffer, we set the FlagBitMask to the first* bit position, can clear DataBuffer[0], which will hold all the* Flag bits. Finally, the BufferOffset is set to 1, which is where the* first character or index/length pair will go.*/void InitOutputBuffer(){ DataBuffer[ 0 ] = 0; FlagBitMask = 1; BufferOffset = 1;}/** This routine is called during one of two different situations. First,* it can potentially be called right after a character or a length/index* pair is added to the DataBuffer[]. If the position of the bit in the* FlagBitMask indicates that it is full, the output routine calls this* routine to flush data into the output file, and reset the output* variables to their initial state. The other time this routine is* called is when the compression routine is ready to exit. If there is* any data in the buffer at that time, it needs to the flushed.** Note that this routine checks carefully to be sure that it doesn't* ever write out more data than was in the original uncompressed file.* It returns a 0 if this happens, which filters back to the compression* program, so that it can abort if this happens.**/int FlushOutputBuffer(){ if ( BufferOffset == 1 )  return( 1 ); Header.compressed_size += BufferOffset; if ( ( Header.compressed_size ) &gt;= Header.original_size )  return( 0 ); if ( fwrite( DataBuffer, 1, BufferOffset, OutputCarFile )   !=BufferOffset )   FatalError( "Error writing compressed data to CAR file" ); InitOutputBuffer(); return( 1 );}/** This routine adds a single character to the output buffer. In this* case, the flag bit is set, indicating that the next character is an* uncompressed byte. After setting the flag and storing the byte,* the flag bit is shifted over, and checked. If it turns out that all* eight bits in the flag bit character are used up, then we have to* flush the buffer and reinitialize the data. Note that if the* FlushOutputBuffer() routine detects that the output has grown larger* than the input, it returns a 0 back to the calling routine.*/int OutputChar( data )int data;{ DataBuffer[ BufferOffset++ ] = (char) data; DataBuffer[ 0 ] |= FlagBitMask; FlagBitMask &lt;&lt;= 1; if ( FlagBitMask == 0x100 )  return( FlushOutputBuffer() ); else  return( 1 );}/** This routine is called to output a 12-bit position pointer and a 4-bit* length. The 4-bit length is shifted to the top four bits of the first* of two DataBuffer[] characters. The lower four bits contain the upper* four bits of the 12-bit position index. The next of the two DataBuffer* characters gets the lower eight bits of the position index. After* all that work to store those 16 bits, the FlagBitMask is shifted over,* and checked to see if we have used up all our bits. If we have,* the output buffer is flushed, and the output data elements are reset.* If the FlushOutputBuffer routine detects that the output file has* grown too large, it passes and error return back via this routine,* so that it can abort.*/int OutputPair( position, length )int position;int length;{ DataBuffer[ BufferOffset ] = (char) ( length &lt;&lt; 4 ); DataBuffer[ BufferOffest++ ] |= ( position &gt;&gt; 8 ); DataBuffer[ BufferOffset++ ] = (char) ( position &#38; Oxff ); FlagBitMask &lt;&lt;= 1; if ( FlagBitMask == 0x100 )  return( FlushOutputBuffer() ); else  return( 1 );}/** The input process uses the same data structures as the blocked output* routines, but it is somewhat simpler, in that it doesn't actually have* to read in a whole block of data at once. Instead, it just reads in* a single character full of flag bits into DataBuffer[0], and passes* individual bits back to the Expansion program when asked for them.* The expansion program is left to its own devices for reading in the* characters, indices, and match lengths. They can be read in* sequentially using normal file I/0.*/void InitInputBuffer(){ FlagBitMask = 1; DataBuffer[ 0 ] = (char) getc( InputCarFile );}/** When the Expansion program wants a flag bit, it calls this routine.* This routine has to keep track of whether or not it has run out of* flag bits. If it has, it has to go back and reinitialize so as to* have a fresh set.*/int InputBit(){ if ( FlagBitMask == 0x00 )  InitInputBuffer(); FlagBitMask &lt;&lt;= 1; return( DataBuffer[ 0 ] &#38; ( FlagBitMask &gt;&gt; 1 ) );}/** This is the compression routine. It has to first load up the look* ahead buffer, then go into the main compression loop. The main loop* decides whether to output a single character or an index/length* token that defines a phrase. Once the character or phrase has been* sent out, another loop has to run. The second loop reads in new* characters, deletes the strings that are overwritten by the new* character, then adds the strings that are created by the new* character. While running it has the additional responsibility of* creating the checksum of the input data, and checking for when the* output data grows too large. The program returns a success or failure* indicator. It also has to update the original_crc and compressed_size* elements in Header data structure.**/int LZSSCompress( input_text_file )FILE *input_text_file;{ int i; int c; int look_ahead_bytes; int current_position; int replace_count; int match_length; int match_position; Header.compressed_size = 0; Header.original_crc = CRC_MASK; InitOutputBuffer(); current_position = 1; for ( i = 0 ; i &lt; LOOK_AHEAD_SIZE ; i++ ) {   if ( ( c = getc( input_text_file ) ) == EOF )    break;   window[ current_position + i ] = (unsigned char) c;   Header.original_crc = UpdateCharacterCRC32( Header.original_crc, c ); } look_ahead_bytes = i; InitTree( current_position ); match_length = 0; match_position = 0; while ( look_ahead_bytes &gt; 0 ) {  if ( match_length &gt; look_ahead_bytes )   match_length = look_ahead_bytes;  if ( match_length &lt;= BREAK_EVEN ) {   replace_count = 1;   if ( ! OutputChar( window[ current_position ] ) )    return( 0 );  } else {   if ( !OutputPair( match_position, match_length -                     ( BREAK_EVEN + 1 ) ) )    return( 0 ):   replace_count = match_length;  }  for ( i = 0 ; i &lt; replace_count ; i++ ) {   DeleteString( MOD_WINDOW( current_position + LOOK_AHEAD_SIZE ) );   if ( ( c = getc( input_text_file ) ) == EOF ) {    look ahead bytes--;   } else {    Header.original_crc =     UpdateCharacterCRC32( Header.original_crc, c );    window[ MOD_WINDOW( current_position + LOOK_AHEAD_SIZE ) ] =       (unsigned char) c;   }   current_position = MOD_WINDOW( current_position + 1 );   if ( current_position == 0 )    putc( '.', stderr );   if ( look_ahead_bytes )    match_length = AddString( current_position, &#38;match_position );  } }; Header.original_crc ^= CRC_MASK; return( FlushOutputBuffer() );}/** This is the expansion routine for the LZSS algorithm. All it has to do* is read in flag bits, decide whether to read in a character or a* index/length pair, and take the appropriate action. It is responsible* for keeping track of the crc of the output data, and must return it* to the calling routine, for verification.*/unsigned long LZSSExpand( output )FILE *output;{ int i; int current_position; int c; int match_length; int match_position; unsigned long crc; unsigned long output_count; output_count = 0; crc = CRC_MASK; InitInputBuffer(); current_position = 1; while ( output_count &lt; Header.original_size ) {  if ( InputBit() ) {   c = getc( InputCarFile );   putc( c, output );   output_count++;   crc = UpdateCharacterCRC32( crc, c );   window[ current_position ] = (unsigned char) c;   current_position = MOD_WINDOW( current_position + 1 );   if ( current_position == 0 &#38;&#38; output != stdout )    putc( '.', stderr );  } else {   match_length = getc( InputCarFile );   match_position = getc( InputCarFile );   match_position |= (match_length &#38; Oxf ) &lt;&lt; 8;   match_length &gt;&gt;= 4; match_length += BREAK_EVEN;   output_count += match_length + 1;   for ( i = 0 ; i &lt;= match_length ; i++ ) {    c = window[ MOD_WINDOW( match_position + i ) ];    putc( c, output );    crc = UpdateCharacterCRC32( crc, c );    window[ current_position ] = (unsigned char) c;    current_position = MOD_WINDOW( current_position + 1 );    if ( current_position == 0 &#38;&#38; output != stdout )     putc( '.', stderr );   }  } } return( crc ^ CRC_MASK );}/*************************** End of CARMAN.C ***************************/</PRE><!--  END CODE //--><P><BR></P><CENTER><TABLE BORDER><TR><TD><A HREF="403-406.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="../ch13/457-459.html">Next</A></TD></TR></TABLE></CENTER></TD></TR></TABLE></BODY></HTML>
406-456.html - 源码说明

本页面展示了「The primary purpose of this book is to explain various data-compression techniques using the C progr」中的 406-456.html 源码文件，采用 HTML 编程语言编写，共 1,981 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与data-compression相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?