⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 113-115.html

📁 The primary purpose of this book is to explain various data-compression techniques using the C progr
💻 HTML
字号:
<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>The Data Compression Book-:Huffman One Better: Arithmetic Coding</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) {        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1558514341//--><!-- TITLE=The Data Compression Book-//--><!-- AUTHOR=Mark Nelson//--><!-- PUBLISHER=IDG Books Worldwide, Inc.//--><!-- IMPRINT=M & T Books//--><!-- CHAPTER=5//--><!-- PAGES=113-115//--><!-- UNASSIGNED1//--><!-- UNASSIGNED2//--><CENTER><TABLE BORDER><TR><TD><A HREF="../ch04/101-112.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="115-117.html">Next</A></TD></TR></TABLE></CENTER><P><BR></P><H2><A NAME="Heading1"></A><FONT COLOR="#000077">Chapter 5<BR>Huffman One Better: Arithmetic Coding</FONT></H2><P>The last two chapters show that Huffman coding uses knowledge about information content to efficiently encode symbols. If the probability of a symbol&#146;s appearance in a message is known, Huffman techniques can encode that symbol using a minimal number of bits. Huffman coding has been proven the best fixed-length coding method available.</P><H3><A NAME="Heading2"></A><FONT COLOR="#000077">Difficulties</FONT></H3><P>Huffman codes have to be an integral number of bits long, and this can sometimes be a problem. If the probability of a character is 1/3, for example, the optimum number of bits to code that character is around 1.6 bits. Huffman coding has to assign either one or two bits to the code, and either choice leads to a longer compressed message than is theoretically possible.</P><P>This non optimal coding becomes a noticeable problem when the probability of a character is very high. If a statistical method could assign a 90 percent probability to a given character, the optimal code size would be 0.15 bits. The Huffman coding system would probably assign a 1-bit code to the symbol, which is six times longer than necessary.</P><P>This would be a problem when compressing two-color images, like those coming from a fax machine. Since there are only two colors, an ordinary coding method would assign the 1 bit to one color and the 0 bit to the other. Since both codes have only a single bit, Huffman coding is not going to be able to compress this data at all. No matter how high the probability of one of the bits, we are still going to have to encode it using one bit.</P><P>The conventional solution to this problem is to group the bits into packets and apply Huffman coding. But this weakness prevents Huffman coding from being a universal compressor.</P><H3><A NAME="Heading3"></A><FONT COLOR="#000077">Arithmetic Coding: A Step Forward</FONT></H3><P>Only in the last fifteen years has a respectable candidate to replace Huffman coding been successfully demonstrated: arithmetic coding. Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number. More bits are needed in the output number for longer, complex messages. This concept has been known for some time, but only recently were practical methods found to implement arithmetic coding on computers with fixed-sized registers.</P><P>The output from an arithmetic coding process is a single number less than 1 and greater than or equal to 0. This single number can be uniquely decoded to create the exact stream of symbols that went into its construction. To construct the output number, the symbols are assigned a set probabilities. The message &#147;BILL GATES,&#148; for example, would have a probability distribution like this:</P><TABLE WIDTH="100%"><TR><TH COLSPAN="2"><HR><TR><TH WIDTH="30%" ALIGN="LEFT">Character<TH WIDTH="70%" ALIGN="LEFT">Probability<TR><TH COLSPAN="2"><HR><TR><TD>SPACE<TD>1/10<TR><TD>A<TD>1/10<TR><TD>B<TD>1/10<TR><TD>E<TD>1/10<TR><TD>G<TD>1/10<TR><TD>I<TD>1/10<TR><TD>L<TD>2/10<TR><TD>S<TD>1/10<TR><TD>T<TD>1/10<TR><TD COLSPAN="2"><HR></TABLE><P>Once character probabilities are known, individual symbols need to be assigned a range along a &#147;probability line,&#148; nominally 0 to 1. It doesn&#146;t matter which characters are assigned which segment of the range, as long as it is done in the same manner by both the encoder and the decoder. The nine-character symbol set used here would look like the following:</P><TABLE WIDTH="100%"><TR><TH COLSPAN="3"><HR><TR><TH WIDTH="25%" ALIGN="LEFT">Character<TH WIDTH="25%" ALIGN="LEFT">Probability<TH WIDTH="50%" ALIGN="LEFT">Range<TR><TH COLSPAN="3"><HR><TR><TD>SPACE<TD>1/10<TD>0.00 [gte] r &gt; 0.10<TR><TD>A<TD>1/10<TD>0.10 [gte] r &gt; 0.20<TR><TD>B<TD>1/10<TD>0.20 [gte] r &gt; 0.30<TR><TD>E<TD>1/10<TD>0.30 [gte] r &gt; 0.40<TR><TD>G<TD>1/10<TD>0.40 [gte] r &gt; 0.50<TR><TD>I<TD>1/10<TD>0.50 [gte] r &gt; 0.60<TR><TD>L<TD>2/10<TD>0.60 [gte] r &gt; 0.80<TR><TD>S<TD>1/10<TD>0.80 [gte] r &gt; 0.90<TR><TD>T<TD>1/10<TD>0.90 [gte] r &gt; 1.00<TR><TD COLSPAN="3"><HR></TABLE><P><BR></P><CENTER><TABLE BORDER><TR><TD><A HREF="../ch04/101-112.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="115-117.html">Next</A></TD></TR></TABLE></CENTER></TD></TR></TABLE></BODY></HTML>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -