⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 034-036.html

📁 The primary purpose of this book is to explain various data-compression techniques using the C progr
💻 HTML
字号:
<!DOCTYPE HTML PUBLIC "html.dtd"><HTML><HEAD><TITLE>The Data Compression Book-:The Dawn Age: Minimum Redundancy Coding</TITLE><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><SCRIPT><!--function displayWindow(url, width, height) {        var Win = window.open(url,"displayWindow",'width=' + width +',height=' + height + ',resizable=1,scrollbars=yes');}//--></SCRIPT></HEAD><BODY  BGCOLOR="#FFFFFF" VLINK="#DD0000" TEXT="#000000" LINK="#DD0000" ALINK="#FF0000"><TD WIDTH="540" VALIGN="TOP"><!--  <CENTER><TABLE><TR><TD><FORM METHOD="GET" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-foldocsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="Glossary Search"></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD><TD><IMG SRC="http://www.itknowledge.com/images/dotclear.gif" WIDTH="15"   HEIGHT="1"></TD><TD><FORM METHOD="POST" ACTION="http://search.itknowledge.com/excite/cgi-bin/AT-subscriptionsearch.cgi"><INPUT NAME="search" SIZE="20" VALUE=""><BR><CENTER><INPUT NAME="searchButton" TYPE="submit" VALUE="  Book Search  "></CENTER><INPUT NAME="source" TYPE="hidden" VALUE="local" CHECKED> <INPUT NAME="backlink" TYPE="hidden" VALUE="http://search.itknowledge.com:80/excite/AT-subscriptionquery.html"><INPUT NAME="bltext" TYPE="hidden" VALUE="Back to Search"><INPUT NAME="sp" TYPE="hidden" VALUE="sp"></FORM></TD></TR></TABLE></CENTER> --><!-- ISBN=1558514341//--><!-- TITLE=The Data Compression Book-//--><!-- AUTHOR=Mark Nelson//--><!-- PUBLISHER=IDG Books Worldwide, Inc.//--><!-- IMPRINT=M & T Books//--><!-- CHAPTER=3//--><!-- PAGES=034-036//--><!-- UNASSIGNED1//--><!-- UNASSIGNED2//--><CENTER><TABLE BORDER><TR><TD><A HREF="031-034.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="036-043.html">Next</A></TD></TR></TABLE></CENTER><P><BR></P><P>Note, however, that the Huffman codes differ in length from Shannon-Fano codes. The code length for A is only a single bit, instead of two, and the B and C symbols have 3-bit codes instead of two bits. The following table shows what effect this has on the total number of bits produced by the message.</P><TABLE WIDTH="100%"><TR><TH COLSPAN="6"><HR><TR><TH WIDTH="13%" VALIGN="TOP" ALIGN="LEFT">Symbol<TH WIDTH="10%" VALIGN="TOP" ALIGN="LEFT">Count<TH WIDTH="22%" ALIGN="LEFT">Shannon-Fano Size<TH WIDTH="22%" ALIGN="LEFT">Shannon-Fano Bits<TH WIDTH="20%" VALIGN="TOP" ALIGN="LEFT">Huffman Size<TH WIDTH="13%" VALIGN="TOP" ALIGN="LEFT">Huffman Bits<TR><TH COLSPAN="6"><HR><TR><TD>A<TD>15<TD>2<TD>30<TD>1<TD>15<TR><TD>B<TD>7<TD>2<TD>14<TD>3<TD>21<TR><TD>C<TD>6<TD>2<TD>12<TD>3<TD>18<TR><TD>D<TD>6<TD>3<TD>18<TD>3<TD>18<TR><TD>E<TD>5<TD>3<TD>15<TD>3<TD>15<TR><TD COLSPAN="6"><HR></TABLE><P>This adjustment in code size adds 13 bits to the number needed to encode the B and C symbols, but it saves 15 bits when coding the A symbol, for a net savings of 2 bits. Thus, for a message with an information content of 85.25 bits, Shannon-Fano coding requires 89 bits, but Huffman coding requires only 87.</P><P>In general, Shannon-Fano and Huffman coding are close in performance. But Huffman coding will always at least equal the efficiency of Shannon-Fano coding, so it has become the predominant coding method of its type. Since both algorithms take a similar amount of processing power, it seems sensible to take the one that gives slightly better performance. And Huffman was able to prove that this coding method cannot be improved on with any other integral bit-width coding stream.</P><P>Since D. A. Huffman first published his 1952 paper, &#147;A Method for the Construction of Minimum Redundancy Codes,&#148; his coding algorithm has been the subject of an overwhelming amount of additional research. Information theory journals to this day carry numerous papers on the implementation of various esoteric flavors of Huffman codes, searching for ever better ways to use this coding method. Huffman coding is used in commercial compression programs, FAX machines, and even the JPEG algorithm. The next logical step in this book is to outline the C code needed to implement the Huffman coding scheme.</P><H3><A NAME="Heading4"></A><FONT COLOR="#000077">Huffman in C</FONT></H3><P>A Huffman coding tree is built as a binary tree, from the leaf nodes up. Huffman may or may not have had digital computers in mind when he developed his code, but programmers use the tree data structure all the time.</P><P>Two programs used here illustrate Huffman coding. The compressor, HUFF-C, implements a simple order-0 model and a single Huffman tree to encode it. HUFF-E expands files compressed using HUFF-C. Both programs use a few pieces of utility code that will be seen throughout this book. Before we go on the actual Huffman code, here is a quick overview of what some of the utility modules do.</P><H4 ALIGN="LEFT"><A NAME="Heading5"></A><FONT COLOR="#000077">BITIO.C</FONT></H4><P>Data-compression programs perform lots of input/output (I/O) that does reads or writes of unconventional numbers of bits. Huffman coding, for example, reads and writes bits one at a time. LZW programs read and write codes that can range in size from 9 to 16 bits. The standard C I/O library defined in STDIO.H only accommodates I/O on even byte boundaries. Routines like putc() and getc() read and write single bytes, while fread() and fwrite() read and write whole blocks of bytes at a time. The library offers no help for programmers needing a routine to write a single bit at a time.</P><P>To support this conventional I/O in a conventional way, bit-oriented I/O routines are confined to a single source module, BITIO.C. Access to these routines is provided via a header file called BITIO.H, which contains a structure definition and several function prototypes.</P><P>Two routines open files for bit I/O, one for input and one for output. As defined in BITIO.H, they are</P><!--  CODE SNIP //--><PRE>BIT_FILE *OpenInputBitFile( char *name );BIT_FILE *OpenOutputBitFile ( char *name );</PRE><!--  END CODE SNIP //--><P>These two routines return a pointer to a new structure, BIT_FILE. BIT_FILE is also defined in BITIO.H as shown:</P><!--  CODE SNIP //--><PRE>typedef struct bit_file &#123;     FILE *file;     unsigned char mask;     int rack;     int pacifier_counter;&#125; BIT_FILE:</PRE><!--  END CODE SNIP //--><P>OpenInputBitFile() or OpenOutputBitFile() perform a conventional fopen() call and store the returned FILE structure pointer in the BIT_FILE structure. The other two structure elements are initialized to their startup values, and a pointer to the resulting BIT_FILE structure is returned.</P><P>In BITIO.H, <I>rack</I> contains the current byte of data either read in from the file or waiting to be written out to the file. <I>mask</I> contains a single bit mask used either to set or clear the current output bit or to mask in the current input bit.</P><P><BR></P><CENTER><TABLE BORDER><TR><TD><A HREF="031-034.html">Previous</A></TD><TD><A HREF="../ewtoc.html">Table of Contents</A></TD><TD><A HREF="036-043.html">Next</A></TD></TR></TABLE></CENTER></TD></TR></TABLE></BODY></HTML>

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -