⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 128-130.html

📁 The primary purpose of this book is to explain various data-compression techniques using the C progr
💻 HTML
字号:
<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>The Data Compression Book-:Huffman One Better: Arithmetic Coding</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) {        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1558514341//--><!-- TITLE=The Data Compression Book-//--><!-- AUTHOR=Mark Nelson//--><!-- PUBLISHER=IDG Books Worldwide, Inc.//--><!-- IMPRINT=M & T Books//--><!-- CHAPTER=5//--><!-- PAGES=128-130//--><!-- UNASSIGNED1//--><!-- UNASSIGNED2//--><CENTER><TABLE BORDER><TR><TD><A HREF="126-128.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="130-133.html">Next</A></TD></TR></TABLE></CENTER><P><BR></P><P>The code for count_bytes scans the entire input file, then returns the input pointer to where it was when called. We assume that the number of counts of a given symbol in the file cannot exceed the span of an unsigned long. If this is not true, other measures will need to be taken to avoid overflow of the counts[] array elements.</P><P>After the array has been counted, the counts have to be scaled down. This is done in the scale_counts() routine. The first step here is to scale down the unsigned long counts[] array to fit in an array of unsigned characters. This stores the counts in less space in the output file. The code for this is listed here.</P><!--  CODE //--><PRE>max_count = 0;for ( i = 0 ; i &lt; 256 ; i&#43;&#43; )  if (counts[ i ] &gt; max_count )    max_count = counts[ i ];scale = max_count / 256;scale = scale &#43; 1;for ( i = 0 ; i &lt; 256 ; i&#43;&#43; ) &#123;    scaled_counts[ i ] = (unsigned char ) ( counts[ i ] /scale );    if ( scaled_counts[ i ] == 0 &#38;&#38; counts[ i ] != 0 )      scaled_counts[ i ] = 1;&#125;</PRE><!--  END CODE //--><P>After this is complete, one more scaling may need to be done. As part of the limitations on performing arithmetic coding using fixed-length registers, we have to restrict the total of our counts to less than 16,384, or fourteen bits. The second part of scale_counts does this with brute force, checking the total, then dividing by either two or four to get the correct results. An additional count has to be added because of the single count used by the end-of-stream character.</P><!--  CODE //--><PRE>total = 1;for ( i = 0 ; i &lt; 256 ; i&#43;&#43; )  total &#43;= scaled_counts[ i ];if ( total &gt; ( 32767 - 256 ) )  scale = 4;else if ( total &gt; 16383 )  scale = 2;else  return;for ( i = 0 ; i &lt; 256 ; i &#43;&#43; )   scaled_counts[ i ] /= scale;</PRE><!--  END CODE //--><P>There is certainly room in the scale_counts() routine for more sophistication. Every time we lose some of the accuracy in our counts by scaling, we also lose a certain amount of compression. An ideal scaling routine would scale down the counts to fit in the correct range while doing as little scaling as possible. The routines listed here don&#146;t work very hard at doing that.</P><P>Once the counts have been scaled down to fit in the unsigned character array, the output_counts() routine is called to save them to the output file. This program employs the same method to store the counts as used in Chapter 3 for the Huffman coding example. Instead of storing the entire array of counts, we only store runs on nonzero values. For details on this, refer to Chapter 3.</P><P>The last step in building the model is to set up the cumulative totals array in totals[]. This is the array used when actually performing the arithmetic coding. The code shown below builds that array. Remember that after the totals[] array has been built, the range for symbol x is found in totals[ x ] and totals[ x &#43; 1 ]. The range used for the entire alphabet is found in totals[ END_OF_STREAM &#43;1 ].</P><!--  CODE SNIP //--><PRE>totals[ 0 ] = 0;for ( i = 0 ; i &lt; END_OF_STREAM ; i&#43;&#43; )  totals[ i &#43; 1 ] = totals[ i ] &#43; scaled_counts[ i ];totals[ END_OF_STREAM &#43; 1 ] = totals[ END_OF_STREAM ] &#43; 1;</PRE><!--  END CODE SNIP //--><H4 ALIGN="LEFT"><A NAME="Heading12"></A><FONT COLOR="#000077">Reading the Model</FONT></H4><P>For expansion, the code needs to build the same model array in totals[] that was used in the compression routine. Since the original file is not available to scan for counts, the program reads in the scaled_counts[] array stored in the compressed file. The code that accomplishes this is identical to the Huffman expansion code in Chapter 3. Refer to Chapter 3 for details on how this code works.</P><P>After the scaled_counts[] array has been read in, the same routine used by the compression code can be invoked to build the totals[] array. Calling build_totals() in both the compression and expansion routines helps ensure that we are working with the same array.</P><H4 ALIGN="LEFT"><A NAME="Heading13"></A><FONT COLOR="#000077">Initializing the Encoder</FONT></H4><P>Before compression can begin, we have to initialize the variables that constitute the arithmetic encoder. Three 16-bit variables define the arithmetic encoder: low, high, and underflow_bits. When the encoder first starts to run, the range of the output floating-point number is anywhere between 0 and 1. The low variable is initialized to 0 and the high to 0xFFFF. These two variables have an implied decimal point just ahead of the most significant bit and an infinite trail of ones or zeros. The ones will be shifted into the high variable and the zeros into the low.</P><!--  CODE SNIP //--><PRE>low = 0;high = 0xffff;underflow_bits = 0;</PRE><!--  END CODE SNIP //--><P><BR></P><CENTER><TABLE BORDER><TR><TD><A HREF="126-128.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="130-133.html">Next</A></TD></TR></TABLE></CENTER></TD></TR></TABLE></BODY></HTML>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -