📄 chap17.htm
字号:
<pre>= <I>f</I>[<I>z</I>]<I>d<SUB>T</I></SUB>'<SUB>(<I>z</I>)</SUB> + (<I>f</I>[<I>x</I>] + <I>f</I>[<I>y</I>]),</sub></sup></pre><P>
from which we conclude that<P>
<pre><I>B</I>(<I>T</I>) = <I>B</I>(<I>T</I>') + <I>f</I>[<I>x</I>] + <I>f</I>[<I>y</I>].</sub></sup></pre><P>
If <I>T</I><I>'</I> represents a nonoptimal prefix code for the alphabet <I>C</I><I>'</I>, then there exists a tree <I>T</I>\" whose leaves are characters in <I>C</I><I>'</I> such that <I>B</I>(<I>T</I>\"<I>) < </I><I>B</I>(<I>T</I><I>'</I>). Since <I>z</I> is treated as a character in <I>C</I>'<I>, it appears as a leaf in </I>T<I>\"</I>. If we add <I>x</I> and <I>y</I> as children of <I>z</I> in <I>T</I><I>\"</I>, then we obtain a prefix code for <I>C </I>with cost <I>B</I>(<I>T</I><I>\"</I>) + <I>f</I>[<I>x</I>] + <I>f</I>[<I>y</I>] < <I>B</I>(<I>T</I>), contradicting the optimality of <I>T</I>. Thus, <I>T</I><I>'</I> must be optimal for the alphabet <I>C</I>'<I>. </I><P>
<a name="0832_15a0">Theorem 17.4<a name="0832_15a0"><P>
Procedure <FONT FACE="Courier New" SIZE=2>HUFFMAN </FONT>produces an optimal prefix code.<P>
<I><B>Proof </I></B>Immediate from Lemmas 17.2 and 17.3. <P>
<P>
<h2><a name="0833_0001">Exercises<a name="0833_0001"></h2><P>
<a name="0833_0002">17.3-1<a name="0833_0002"><P>
Prove that a binary tree that is not full cannot correspond to an optimal prefix code.<P>
<a name="0833_0003">17.3-2<a name="0833_0003"><P>
What is an optimal Huffman code for the following set of frequencies, based on the first 8 Fibonacci numbers?<P>
<pre>a:1 b:1 c:2 d:3 e:5 f:8 g:13 h:21</sub></sup></pre><P>
Can you generalize your answer to find the optimal code when the frequencies are the first <I>n</I> Fibonacci numbers?<P>
<a name="0833_0004">17.3-3<a name="0833_0004"><P>
Prove the total cost of a tree for a code can also be computed as the sum, over all internal nodes, of the combined frequencies of the two children of the node.<P>
<a name="0833_0005">17.3-4<a name="0833_0005"><P>
Prove that for an optimal code, if the characters are ordered so that their frequencies are nonincreasing, then their codeword lengths are nondecreasing.<P>
<a name="0833_0006">17.3-5<a name="0833_0006"><P>
Let <I>C</I> = {0, 1, . . . , <I>n</I> - 1} be a set of characters. Show that any optimal prefix code on <I>C</I> can be represented by a sequence of<P>
<pre>2<I>n</I> - 1 + <I>n</I> <img src="../images/hfbrul14.gif">lg <I>n</I><img src="../images/hfbrur14.gif"></sub></sup></pre><P>
bits. (<I>Hint:</I> Use 2<I>n</I>- 1 bits to specify the structure of the tree, as discovered by a walk of the tree.)<P>
<a name="0833_0007">17.3-6<a name="0833_0007"><P>
Generalize Huffman's algorithm to ternary codewords (i.e., codewords using the symbols 0, 1, and 2), and prove that it yields optimal ternary codes.<P>
<a name="0833_0008">17.3-7<a name="0833_0008"><P>
Suppose a data file contains a sequence of 8-bit characters such that all 256 characters are about as common: the maximum character frequency is less than twice the minimum character frequency. Prove that Huffman coding in this case is no more efficient than using an ordinary 8-bit fixed-length code.<P>
<a name="0833_0009">17.3-8<a name="0833_0009"><P>
Show that no compression scheme can expect to compress a file of randomly chosen 8-bit characters by even a single bit. (<I>Hint:</I> Compare the number of files with the number of possible encoded files.)<P>
<P>
<P>
<h1><a name="0834_159d">* 17.4 Theoretical foundations for greedy methods<a name="0834_159d"></h1><P>
<a name="0834_159c">There is a beautiful theory about greedy algorithms, which we sketch in this section. This theory is useful in determining when the greedy method yields optimal solutions. It involves combinatorial structures known as "matroids." Although this theory does not cover all cases for which a greedy method applies (for example, it does not cover the activity-selection problem of Section 17.1 or the Huffman coding problem of Section 17.3), it does cover many cases of practical interest. Furthermore, this theory is being rapidly developed and extended to cover many more applications; see the notes at the end of this chapter for references.<P>
<h2><a name="0835_15ab">17.4.1 Matroids<a name="0835_15ab"></h2><P>
<a name="0835_159d">A <I><B>matroid</I></B> is an ordered pair <img src="345_a.gif"> satisfying the following conditions.<P>
1. <I>S</I> is a finite nonempty set.<P>
<a name="0835_159e"><a name="0835_159f"><a name="0835_15a0"><a name="0835_15a1">2. <img src="345_b.gif"> is a nonempty family of subsets of <I>S</I>, called the <I><B>independent</I></B> subsets of <I>S</I>, such that if <img src="345_c.gif"> and <I>A </I><img src="../images/rgtubar.gif"><I> B</I>, then <img src="345_d.gif">. We say that <img src="345_e.gif"><I> </I>is <I><B>hereditary</I></B> if it satisfies this property. Note that the empty set <img src="345_f.gif"> is necessarily a member of <img src="345_g.gif">.<P>
<a name="0835_15a2">3. If <img src="345_h.gif"><I>,</I> and<I> |A| < |B|</I>, then there is some element <I>x</I> <img src="../images/memof12.gif"> <I>B - A </I>such that <img src="345_i.gif">. We say that <I>M </I>satisfies the <I><B>exchange property</I></B>.<P>
<a name="0835_15a3">The word "matroid" is due to Hassler Whitney. He was studying <I><B>matric matroids</I></B>, in which the elements of <I>S</I> are the rows of a given matrix and a set of rows is independent if they are linearly independent in the usual sense. It is easy to show that this structure defines a matroid (see Exercise 17.4-2).<P>
<a name="0835_15a4">As another illustration of matroids, consider the <I><B>graphic matroid</I></B> <img src="345_j.gif"> defined in terms of a given undirected graph <I>G = </I>(<I> V, E</I>) as follows.<P>
<img src="../images/dot12.gif"> The set <I>S<SUB>G</I></SUB> is defined to be <I>E</I>, the set of edges of <I>G</I>.<P>
<img src="../images/dot12.gif"> If <I>A</I> is a subset of <I>E</I>, then <img src="345_k.gif"> if and only if <I>A</I> is acyclic. That is, a set of edges is independent if and only if it forms a forest.<P>
<a name="0835_15a5"><a name="0835_15a6">The graphic matroid <I>M<SUB>G</I></SUB> is closely related to the minimum-spanning-tree problem, which is covered in detail in Chapter 24.<P>
<a name="0835_15ac">Theorem 17.5<a name="0835_15ac"><P>
If<I> G</I> is an undirected graph, then <img src="345_l.gif"> is a matroid.<P>
<I><B>Proof </I></B>Clearly, <I>S<SUB>G</I></SUB> = <I>E</I> is a finite set. Furthermore, <img src="345_m.gif"> is hereditary, since a subset of a forest is a forest. Putting it another way, removing edges from an acyclic set of edges cannot create cycles.<P>
Thus, it remains to show that <I>M<SUB>G</I></SUB> satisfies the exchange property. Suppose that <I>A</I> and <I>B</I> are forests of <I>G</I> and that |<I>B| > |</I>A|. That is, <I>A</I> and <I>B </I>are acyclic sets of edges, and <I>B</I> contains more edges than <I>A</I> does.<P>
It follows from Theorem 5.2 that a forest having <I>k</I> edges contains exactly |<I>V|-</I>k<I> trees. (To prove this another way, begin with |</I>V| trees and no edges. Then, each edge that is added to the forest reduces the number of trees by one.) Thus, forest <I>A</I> contains |<I>V| - |</I>A| trees, and forest <I>B</I> contains |<I>V| - |</I>B| trees.<P>
Since forest <I>B</I> has fewer trees than forest <I>A</I> does, forest <I>B</I> must contain some tree <I>T</I> whose vertices are in two different trees in forest <I>A</I>. Moreover, since <I>T</I> is connected, it must contain an edge (<I>u, v</I>) such that vertices <I>u</I> and<I> v</I> are in different trees in forest <I>A</I>. Since the edge (<I>u, v</I>) connects vertices in two different trees in forest <I>A</I>, the edge (<I>u, v</I>) can be added to forest <I>A</I> without creating a cycle. Therefore, <I>M<SUB>G</I></SUB> satisfies the exchange property, completing the proof that <I>M<SUB>G</I></SUB> is a matroid. <P>
<a name="0835_15a7">Given a matroid <img src="346_a.gif">, we call an element <I>x</I> <img src="../images/notmem.gif"> <I>A</I> an <I><B>extension</I></B> of <img src="346_b.gif"> if <I>x</I> can be added to <I>A</I> while preserving independence; that is, <I>x</I> is an extension of <I>A </I>if <img src="346_c.gif">. As an example, consider a graphic matroid <I>M<SUB>G</I></SUB>. If <I>A</I> is an independent set of edges, then edge <I>e</I> is an extension of <I>A</I> if and only if <I>e</I> is not in <I>A </I>and the addition of<I> x</I> to <I>A </I>does not create a cycle.<P>
<a name="0835_15a8">If <I>A</I> is an independent subset in a matroid <I>M</I>, we say that <I>A</I> is <I><B>maximal </I></B>if it has no extensions. That is, <I>A</I> is maximal if it is not contained in any larger independent subset of <I>M</I>. The following property is often useful.<P>
<a name="0835_15ad">Theorem 17.6<a name="0835_15ad"><P>
All maximal independent subsets in a matroid have the same size.<P>
<I><B>Proof </I></B>Suppose to the contrary that <I>A</I> is a maximal independent subset of <I>M</I> and there exists another larger maximal independent subset <I>B</I> of <I>M</I>. Then, the exchange property implies that <I>A</I> is extendable to a larger independent set <I>A</I> <img src="../images/wideu.gif"> {<I>x</I>} for some <I>x</I> <img src="../images/memof12.gif"> <I>B</I> - <I>A</I>, contradicting the assumption that <I>A</I> is maximal. <P>
As an illustration of this theorem, consider a graphic matroid <I>M<SUB>G</I></SUB> for a connected, undirected graph <I>G</I>. Every maximal independent subset of <I>M<SUB>G</I></SUB> must be a free tree with exactly |<I>V| - 1 edges that connects all the vertices of </I>G<I>. Such a tree is called a </I><B>spanning tree<I></B> of </I>G<I>.</I><P>
<a name="0835_15a9"><a name="0835_15aa">We say that a matroid <img src="346_d.gif"> is <I><B>weighted</I> </B>if there is an associated weight function <I>w</I> that assigns a strictly positive weight <I>w</I>(<I>x</I>) to each element <I>x</I> <img src="../images/memof12.gif"> <I>S</I>. The weight function <I>w</I> extends to subsets of <I>S</I> by summation: <P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -