172-200.html

来自「The primary purpose of this book is to e」· HTML 代码 · 共 1,215 行 · 第 1/3 页
HTML
1,215 行
<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>The Data Compression Book-:Statistical Modeling</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) {        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1558514341//--><!-- TITLE=The Data Compression Book-//--><!-- AUTHOR=Mark Nelson//--><!-- PUBLISHER=IDG Books Worldwide, Inc.//--><!-- IMPRINT=M & T Books//--><!-- CHAPTER=6//--><!-- PAGES=172-200//--><!-- UNASSIGNED1//--><!-- UNASSIGNED2//--><CENTER><TABLE BORDER><TR><TD><A HREF="169-171.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="../ch07/201-203.html">Next</A></TD></TR></TABLE></CENTER><P><BR></P><H3><A NAME="Heading18"></A><FONT COLOR="#000077">ARITH-N Listing</FONT></H3><!--  CODE //--><PRE>/************************* Start of ARITH-N.C **************************/* * This is the order-n arithmetic coding module used in the final* part of chapter 6.** Compile with BITIO.C. ERRHAND.C, and either MAIN-C.C OR MAIN-E.C. This* program should be compiled in large model.  An even better alternative* is a DOS extender.**/#include &lt;stdio.h&gt;#include &lt;stdlib.h&gt;#include &lt;string.h&gt;#include "errhand.h"#include "bitio.h"/** The SYMBOL structure is what is used to defined a symbol in* arithmetic coding terms.  A symbol is defined as a range between* 0 and 1.  Since we are using integer math, instead of using 0 and 1* as our end points, we have an integer scale.  The low_count and* high_count define where the symbol falls in the range.*/typedef struct {     unsigned short int low_count;     unsigned short int high_count;     unsigned short int scale;} SYMBOL;#define MAXIMUM_SCALE 16383  /* Maximum allowed frequency count */#define ESCAPE   256         /* The escape symbol               */#define DONE (-1)            /* The output stream empty  symbol */#define FLUSH  (-2)          /* The symbol to flush the model   *//**  Function prototypes.*/#ifdef __STDC__void initialize_options( int argc, char **argv );int check_compression( FILE *input, BIT_FILE *output );void initialize_model( void );void update_model( int symbol );int convert_int_to_symbol( int symbol, SYMBOL *s );void get_symbol_scale( SYMBOL *s );int convert_symbol_to_int( int count, SYMBOL *s );void add_character_to_model( int c );void flush_model( void );void initialize_arithmetic_decoder( BIT_FILE *stream ) ;void remove_symbol_from_stream( BIT_FILE *stream, SYMBOL *s );void initialize_arithmetic_encoder( void );void encode_symbol( BIT_FILE *stream, SYMBOL *s );void flush_arithmetic_encoder( BIT_FILE *stream );short int get_current_count( SYMBOL *s );#elsevoid initialize_options();int check_compression();void initialize_model();void update_model();int convert_int_to_symbol();void get_symbol_scale();int convert_symbol_to_int();void add_character_to_model();void flush_model();void initialize_arithmetic_decoder();void remove_symbol_from_stream();void initialize_arithmetic_encoder();void encode_symbol();void flush_arithmetic_encoder();short int get_current_count();#endifchar *CompressionName = "Adaptive order n model with arithmetic coding";char *Usage           = "in-file out-file [ -o order ]\n\n";int max_order         = 3;/** The main procedure is similar to the main found in ARITH1E.C.  It has* to initialize the coder and the model.  It then sits in a loop reading* input symbols and encoding them.  One difference is that every 256* symbols a compression check is performed.  If the compression ratio* falls below 10%, a flush character is encoded.  This flushes the encod* ing model, and will cause the decoder to flush its model when the* file is being expanded.  The second difference is that each symbol is* repeatedly encoded until a successful encoding occurs.  When trying to* encode a character in a particular order, the model may have to* transmit an ESCAPE character.  If this is the case, the character has* to be retransmitted using a lower order.  This process repeats until a* successful match is found of the symbol in a particular context.* Usually this means going down no further than the order -1 model.* However, the FLUSH and DONE symbols drop back to the order -2 model.**/void CompressFile( input, output, argc, argv )FILE *input;BIT_FILE *output;int argc;char *argv[];{     SYMBOL s;     int c;     int escaped;     int flush = 0;     long int text_count = 0;     initialize_options( argc, argv );     initialize_model();     initialize-arithmetic_encoder();     for ( ; ; ) {       if ( ( ++text_count &amp; 0x0ff ) == 0 )         flush = check_compression( input, output );       if ( !flush )         c = getc( input );       else         c = FLUSH;       if ( c == EOF )         c = DONE;       do {         escaped = convert_int_to_symbol( c, &amp;s);         encode_symbol( output, &amp;s );       } while ( escaped );       if ( c == DONE )         break;       if ( c == FLUSH ) {         flush_model();         flush = 0;       }       update_model( c );       add_character_to_model( c );     }     flush_arithmetic_encoder( output );}/** The main loop for expansion is very similar to the expansion* routine used in the simpler compression program, ARITH1E.C.  The* routine first has to initialize the the arithmetic coder and the* model.  The decompression loop differs in a couple of respect.* First of all, it handles the special ESCAPE character, by* removing them from the input bit stream but just throwing them* away otherwise.  Secondly, it handles the special FLUSH character.* Once the main decoding loop is done, the cleanup code is called,* and the program exits.**/void ExpandFile( input, output, argc, argv )BIT_FILE *input;FILE *output;int argc;char *argv[];{     SYMBOL s;     int c;     int count;     initialize_options( argc, argv );     initialize_model();     initialize_arithmetic_decoder( input );     for ( ; ; ) {       do {         get_symbol_scale( &amp;s );         count = get_current_count( &amp;s );         c = convert_symbol_to_int( count, &amp;s );         remove_symbol_from_stream( input, &amp;s );       } while ( c == ESCAPE );       if ( c == DONE )         break;       if ( c != FLUSH )         putc( (char) c, output );       else         flush_model();       update_model( c );       add_character_to_model( c );     }}/** This routine checks for command line options.  At present, the only* option being passed on the command line is the order.*/void initialize_options( argc, argv )int argc;char *argv[];{     while ( argc&#150;&#150; &gt; 0 ) {       if ( strcmp( *argv, "-o" ) == 0 ) {         argc&#150;&#150;;         max_order = atoi( *++argv );       } else         printf( "Uknown argument on command line: %s\n", *argv );       argc&#150;&#150;;       argv++;     }}/** This routine is called once every 256 input symbols.  Its job is to* check to see if the compression ratio falls below 10%.  If the* output size is 90% of the input size, it means not much compression* is taking place, so we probably ought to flush the statistics in the* model to allow for more current statistics to have greater impact.* This heuristic approach does seem to have some effect.*/int check_compression( input, output )FILE *input;BIT_FILE *output;{     static long local_input_marker = 0L;     static long local_output_marker = 0L;     long total_input_bytes;     long total_output_bytes;     int local_ratio;     total_input_bytes = ftell( input ) - local_input_marker;     total_output_bytes = ftell( output-&gt;file );     total_output_bytes -= local_output_marker;     if ( total_output_bytes == 0 )       total_output_bytes = 1;     local_ratio = (int)( ( total_output_bytes * 100 ) /                                   total_input_bytes );     local_input_marker = ftell( input );     local_output_marker = ftell( output-&gt;file );     return( local_ratio &gt; 90 );}/*** The next few routines contain all of the code and data used to* perform modeling for this program.  This modeling unit keeps track* of all contexts from 0 up to max_order, which defaults to 3.  In* addition, there is a special context -1 which is a fixed model used* to encode previously unseen characters, and a context -2 which is* used to encode EOF and FLUSH messages.** Each context is stored in a special CONTEXT structure, which is* documented below.  Context tables are not created until the context* is seen, and they are never destroyed.**//** A context table contains a list of the counts for all symbols* that have been seen in the defined context.  For example, a* context of "Zor" might have only had 2 different characters* appear.  't' might have appeared 10 times, and '1' might have* appeared once.  These two counts are stored in the context* table.  The counts are stored in the STATS structure.  All of* the counts for a given context are stored in and array of STATS.* As new characters are added to a particular contexts, the STATS* array will grow.  Sometimes the STATS array will shrink* after flushing the model.*/typedef struct {     unsigned char symbol;     unsigned char counts;} STATS;/** Each context has to have links to higher order contexts.  These* links are used to navigate through the context tables.  For example,* to find the context table for "ABC", I start at the order 0 table,* then find the pointer to the "A" context table by looking through* the LINKS array.  At that table, we find the "B" link and go to* that table.  The process continues until the destination table is* found.  The table pointed to by the LINKS array corresponds to the* symbol found at the same offset in the STATS table.  The reason that* LINKS is in a separate structure instead of being combined with* STATS is to save space.  All of the leaf context nodes don't need* next pointers, since they are in the highest order context.  In the* leaf nodes, the LINKS array is a NULL pointer.*/typedef struct {     struct context *next;} LINKS;/** The CONTEXT structure holds all of the known information about* a particular context.  The links and stats pointers are discussed* immediately above here.  The max_index element gives the maximum* index that can be applied to the stats or link array.  When the* table is first created, and stats is set to NULL, max_index is set* to -1.  As soon as single element is added to stats, max_index is* incremented to 0.** The lesser context pointer is a navigational aid.  It points to* the context that is one less than the current order.  For example,* if the current context is "ABC", the lesser_context pointer will* point to "BC".  The reason for maintaining this pointer is that* this particular bit of table searching is done frequently, but* the pointer only needs to be built once, when the context is* created.*/typedef struct context {     int max_index;     LINKS *links;     STATS *stats;     struct context *lesser_context;} CONTEXT;/** *contexts[] is an array of current contexts.  If I want to find* the order 0 context for the current state of the model.  I just* look at contexts[0].  This array of context pointers is set up* every time the model is updated.*/CONTEXT **contexts;/** current_order contains the current order of the model.  It starts* at max_order, and is decremented every time an ESCAPE is sent.  It* will only go down to -1 for normal symbols, but can go to -2 for* EOF and FLUSH.*/int current_order;/** This table contains the cumulative totals for the current context.* Because this program is using exclusion, totals has to be calculated* every time a context is used.  The scoreboard array keeps track of* symbols that have appeared in higher order models, so that they* can be excluded from lower order context total calculations.*/short int totals[ 258 ];char scoreboard[ 256 ];/** Local procedure declarations for modeling routines.
172-200.html - 源码说明

本页面展示了「The primary purpose of this book is to explain various data-compression techniques using the C progr」中的 172-200.html 源码文件，采用 HTML 编程语言编写，共 1,215 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与data-compression相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?