📄 chap07.htm
字号:
<HTML><HEAD>
<TITLE>Intro to Algorithms: CHAPTER 7: HEAPSORT</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<a href="chap08.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="partii.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>
<h1><a name="0764_130d">CHAPTER 7: HEAPSORT<a name="0764_130d"></h1><P>
<a name="0764_1308"><a name="0764_1309">In this chapter, we introduce another sorting algorithm. Like merge sort, but unlike insertion sort, heapsort's running time is <I>O</I>(<I>n </I>lg <I>n</I>). Like insertion sort, but unlike merge sort, heapsort sorts in place: only a constant number of array elements are stored outside the input array at any time. Thus, heapsort combines the better attributes of the two sorting algorithms we have already discussed.<P>
Heapsort also introduces another algorithm design technique: the use of a data structure, in this case one we call a "heap," to manage information during the execution of the algorithm. Not only is the heap data structure useful for heapsort, it also makes an efficient priority queue. The heap data structure will reappear in algorithms in later chapters.<P>
<a name="0764_130a"><a name="0764_130b"><a name="0764_130c">We note that the term "heap" was originally coined in the context of heapsort, but it has since come to refer to "garbage-collected storage," such as the programming language Lisp provides. Our heap data structure is <I>not</I> garbage-collected storage, and whenever we refer to heaps in this book, we shall mean the structure defined in this chapter.<P>
<h1><a name="0766_1318">7.1 Heaps<a name="0766_1318"></h1><P>
<a name="0766_130d"><a name="0766_130e"><a name="0766_130f">The <B>(</B><I><B>binary</I>) </B><I><B>heap</I></B> data structure is an array object that can be viewed as a complete binary tree (see Section 5.5.3), as shown in Figure 7.1. Each node of the tree corresponds to an element of the array that stores the value in the node. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point. An array <I>A</I> that represents a heap is an object with two attributes: <I>length</I>[<I>A</I>], which is the number of elements in the array, and <I>heap-size</I>[<I>A</I>], the number of elements in the heap stored within array <I>A</I>. That is, although <I>A</I>[1 . . <I>length</I>[<I>A</I>]] may contain valid numbers, no element past <I>A</I>[<I>heap-size</I>[<I>A</I>]], where <I>heap-size</I>[<I>A</I>] <img src="../images/lteq12.gif"> [<I>length</I>[<I>A</I>], is an element of the heap. The root of the tree is <I>A</I>[1], and given the index <I>i</I> of a node, the indices of its parent <FONT FACE="Courier New" SIZE=2>PARENT</FONT>(<I>i</I>), left child <FONT FACE="Courier New" SIZE=2>LEFT</FONT>(<I>i</I>), and right child <FONT FACE="Courier New" SIZE=2>RIGHT</FONT>(<I>i</I>) can be computed simply:<P>
<img src="141_a.gif"><P>
<h4><a name="0766_1319">Figure 7.1 A heap viewed as (a) a binary tree and (b) an array. The number within the circle at each node in the tree is the value stored at that node. The number next to a node is the corresponding index in the array.<a name="0766_1319"></sub></sup></h4><P>
<pre>PARENT(<I>i</I>)</sub></sup></pre><P>
<pre><B>return</B> <img src="../images/hfbrdl12.gif"><I>i</I>/2<img src="../images/hfbrdr12.gif"></sub></sup></pre><P>
<pre>LEFT(<I>i</I>)</sub></sup></pre><P>
<pre><B>return</B> 2<I>i</I></sub></sup></pre><P>
<pre>RIGHT(<I>i</I>)</sub></sup></pre><P>
<pre><B>return</B> 2<I>i</I> + 1</sub></sup></pre><P>
<a name="0766_1310"><a name="0766_1311"><a name="0766_1312">On most computers, the <FONT FACE="Courier New" SIZE=2><FONT FACE="Courier New" SIZE=1>LEFT</FONT></FONT> procedure can compute 2<I>i</I> in one instruction by simply shifting the binary representation of <I>i</I> left one bit position. Similarly, the <FONT FACE="Courier New" SIZE=2>RIGHT</FONT> procedure can quickly compute 2<I>i</I> + 1 by shifting the binary representation of <I>i</I> left one bit position and shifting in a 1 as the low-order bit. The <FONT FACE="Courier New" SIZE=2>PARENT</FONT> procedure can compute <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdl12.gif"><I>i</I></FONT>/2 <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdr12.gif"> </FONT>by shifting <I>i</I> right one bit position. In a good implementation of heapsort, these three procedures are often implemented as "macros" or "in-line" procedures.<P>
<a name="0766_1313">Heaps also satisfy the <I><B>heap property</I></B>: for every node <I>i</I> other than the root,<P>
<pre><I>A</I>[PARENT(<I>i</I>)] <img src="../images/gteq.gif"> <I>A</I>[<I>i</I>],</sub></sup></pre><P>
<h4><a name="0766_131a">(7.1)<a name="0766_131a"></sub></sup></h4><P>
that is, the value of a node is at most the value of its parent. Thus, the largest element in a heap is stored at the root, and the subtrees rooted at a node contain smaller values than does the node itself.<P>
<a name="0766_1314"><a name="0766_1315"><a name="0766_1316"><a name="0766_1317">We define the <I><B>height</I></B> of a node in a tree to be the number of edges on the longest simple downward path from the node to a leaf, and we define the height of the tree to be the height of its root. Since a heap of <I>n</I> elements is based on a complete binary tree, its height is <img src="../images/bound.gif">(lg <I>n</I>) (see Exercise 7.1-2). We shall see that the basic operations on heaps run in time at most proportional to the height of the tree and thus take <I>O</I>(lg <I>n</I>) time. The remainder of this chapter presents five basic procedures and shows how they are used in a sorting algorithm and a priority-queue data structure.<P>
<img src="../images/dot12.gif"> The <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> procedure, which runs in <I>O</I>(lg <I>n</I>) time, is the key to maintaining the heap property (7.1).<P>
<img src="../images/dot12.gif"> The <FONT FACE="Courier New" SIZE=2>BUILD</FONT>-<FONT FACE="Courier New" SIZE=2>HEAP</FONT> procedure, which runs in linear time, produces a heap from an unordered input array.<P>
<img src="../images/dot12.gif"> The <FONT FACE="Courier New" SIZE=2>HEAPSORT</FONT> procedure, which runs in <I>O</I>(<I>n </I>lg <I>n</I>) time, sorts an array in place.<P>
<img src="../images/dot12.gif"> The E<FONT FACE="Courier New" SIZE=2>XTRACT-</FONT><FONT FACE="Courier New" SIZE=2>MAX</FONT> and <FONT FACE="Courier New" SIZE=2>INSERT</FONT> procedures, which run in <I>O</I>(1g <I>n</I>) time, allow the heap data structure to be used as a priority queue.<P>
<h2><a name="0767_1319">Exercises<a name="0767_1319"></h2><P>
<a name="0767_131a">7.1-1<a name="0767_131a"><P>
<a name="0767_1318">What are the minimum and maximum numbers of elements in a heap of height <I>h</I>?<P>
<a name="0767_131b">7.1-2<a name="0767_131b"><P>
Show that an <I>n-</I>element heap has height <img src="../images/hfbrdl12.gif">lg <I>n</I><img src="../images/hfbrdr12.gif">.<P>
<a name="0767_131c">7.1-3<a name="0767_131c"><P>
Show that the largest element in a subtree of a heap is at the root of the subtree.<P>
<a name="0767_131d">7.1-4<a name="0767_131d"><P>
Where in a heap might the smallest element reside?<P>
<a name="0767_131e">7.1-5<a name="0767_131e"><P>
Is an array that is in reverse sorted order a heap?<P>
<a name="0767_131f">7.1-6<a name="0767_131f"><P>
Is the sequence <img src="../images/lftwdchv.gif">23, 17, 14, 6, 13, 10, 1, 5, 7, 12<img src="../images/wdrtchv.gif"> a heap?<P>
<P>
<P>
<h1><a name="0768_131b">7.2 Maintaining the heap property<a name="0768_131b"></h1><P>
<a name="0768_1319"><a name="0768_131a"><FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> is an important subroutine for manipulating heaps. Its inputs are an array <I>A</I> and an index <I>i</I> into the array. When <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> is called, it is assumed that the binary trees rooted at <FONT FACE="Courier New" SIZE=2>LEFT</FONT>(<I>i</I>) and <FONT FACE="Courier New" SIZE=2>RIGHT</FONT>(<I>i</I>) are heaps, but that <I>A</I>[<I>i</I>] may be smaller than its children, thus violating the heap property (7.1). The function of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> is to let the value at <I>A</I>[<I>i</I>] "float down" in the heap so that the subtree rooted at index <I>i</I> becomes a heap.<P>
<img src="143_a.gif"><P>
<h4><a name="0768_131c">Figure 7.2 The action of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT>(A, 2), where heap-size[A] = 10. (a) The initial configuration of the heap, with A[2] at node i = 2 violating the heap property since it is not larger than both children. The heap property is restored for node 2 in (b) by exchanging A[2] with A[4], which destroys the heap property for node 4. The recursive call <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT>(A, 4) now sets i = 4. After swapping A[4] with A[9], as shown in (c), node 4 is fixed up, and the recursive call <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT>(A, 9) yields no further change to the data structure.<a name="0768_131c"></sub></sup></h4><P>
<pre>HEAPIFY(<I>A</I>, <I>i</I>)</sub></sup></pre><P>
<pre>1 <I>l</I> <img src="../images/arrlt12.gif"> LEFT(<I>i</I>)</sub></sup></pre><P>
<pre>2 <I>r</I> <img src="../images/arrlt12.gif"> RIGHT(<I>i</I>)</sub></sup></pre><P>
<pre>3 <B>if</B> <I>l</I> <img src="../images/lteq12.gif"> <I>heap-size</I>[<I>A</I>] and <I>A</I>[<I>l</I>] > <I>A</I>[<I>i</I>]</sub></sup></pre><P>
<pre>4 <B>then</B> largest <img src="../images/arrlt12.gif"> l</sub></sup></pre><P>
<pre>5 <B>else</B> <I>largest</I> <img src="../images/arrlt12.gif"> <I>i</I></sub></sup></pre><P>
<pre>6 <B>if</B> <I>r</I> <img src="../images/lteq12.gif"> <I>heap-size</I>[<I>A</I>] and <I>A</I>[<I>r</I>] > <I>A</I>[<I>largest</I>]</sub></sup></pre><P>
<pre>7 <B>then</B> <I>largest</I> <img src="../images/arrlt12.gif"> <I>r</I></sub></sup></pre><P>
<pre>8 <B>if</B> <I>largest</I> <img src="../images/noteq.gif"> <I>i</I></sub></sup></pre><P>
<pre>9 <B>then</B> exchange <I>A</I>[<I>i</I>] <img src="../images/dblarr12.gif"> <I>A</I>[<I>largest</I>]</sub></sup></pre><P>
<pre>10 HEAPIFY(<I>A,largest</I>)</sub></sup></pre><P>
Figure 7.2 illustrates the action of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT>. At each step, the largest of the elements <I>A</I>[<I>i</I>], <I>A</I>[<FONT FACE="Courier New" SIZE=2>LEFT</FONT>(<I>i</I>)], and <I>A</I>[<FONT FACE="Courier New" SIZE=2>RIGHT</FONT>(<I>i</I>)] is determined, and its index is stored in <I>largest</I>. If <I>A</I>[<I>i</I>] is largest, then the subtree rooted at node <I>i</I> is a heap and the procedure terminates. Otherwise, one of the two children has the largest element, and <I>A</I>[<I>i</I>] is swapped with <I>A</I>[<I>largest</I>], which causes node <I>i</I> and its children to satisfy the heap property. The node <I>largest</I>, however, now has the original value <I>A</I>[<I>i</I>], and thus the subtree rooted at <I>largest</I> may violate the heap property. Consequently, <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> must be called recursively on that subtree.<P>
The running time of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> on a subtree of size<I> n</I> rooted at given node <I>i</I> is the <img src="../images/bound.gif"> (1) time to fix up the relationships among the elements <I>A</I>[<I>i</I>], <I>A</I>[<FONT FACE="Courier New" SIZE=2>LEFT<FONT FACE="Courier New" SIZE=1> </FONT>(<I>i</I>)], and <I>A</I>[<FONT FACE="Courier New" SIZE=2>RIGHT<FONT FACE="Courier New" SIZE=1> </FONT>(<I>i</I>)], plus the time to run <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> on a subtree rooted at one of the children of node <I>i</I>. The children's subtrees each have size at most 2<I>n</I>/3--the worst case occurs when the last row of the tree is exactly half full--and the running time of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> can therefore be described by the recurrence</FONT></FONT><P>
<pre><I>T</I>(<I>n</I>) <img src="../images/lteq12.gif"> <I>T</I>(2<I>n</I>/3) + <img src="../images/bound.gif">(1).</sub></sup></pre><P>
The solution to this recurrence, by case 2 of the master theorem (Theorem 4.1), is<I> T</I>(<I>n</I>)<I> = O(</I>lg <I>n</I>). Alternatively, we can characterize the running time of <FONT FACE="Courier New" SIZE=2>HEAPIFY</FONT> on a node of height <I>h</I> as <I>O</I>(<I>h</I>).<P>
<h2><a name="0769_131c">Exercises<a name="0769_131c"></h2><P>
<a name="0769_131d">7.2-1<a name="0769_131d"><P>
<a name="0769_131b">Using Figure 7.2 as a model, illustrate the operation of <FONT FACE="Courier New" SIZE=2>HEAPIFY </FONT>(<I>A,</I> 3) on the array <I>A </I>= <img src="../images/lftwdchv.gif">27, 17, 3, 16, 13, 10, 1, 5, 7, 12, 4, 8, 9, 0<img src="../images/wdrtchv.gif">.<P>
<a name="0769_131e">7.2-2<a name="0769_131e"><P>
What is the effect of calling <FONT FACE="Courier New" SIZE=2>HEAPIFY<I>(A, i</I></FONT>) when the element A[<I>i</I>] is larger than its children?<P>
<a name="0769_131f">7.2-3<a name="0769_131f"><P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -