📄 chap15.htm
字号:
<HTML><HEAD>
<TITLE>Intro to Algorithms: CHAPTER 15: AUGMENTING DATA STRUCTURES</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<a href="partiv.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="chap14.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>
<h1><a name="07f4_14fb">CHAPTER 15: AUGMENTING DATA STRUCTURES<a name="07f4_14fb"></h1><P>
<a name="07f4_14f4"><a name="07f4_14f5">There are some engineering situations that require no more than a "textbook" data structure--such as a doubly linked list, a hash table, or a binary search tree--but many others require a dash of creativity. Only in rare situations will you need to create an entirely new type of data structure, though. More often, it will suffice to augment a textbook data structure by storing additional information in it. You can then program new operations for the data structure to support the desired application. Augmenting a data structure is not always straightforward, however, since the added information must be updated and maintained by the ordinary operations on the data structure.<P>
<a name="07f4_14f6"><a name="07f4_14f7"><a name="07f4_14f8"><a name="07f4_14f9"><a name="07f4_14fa">This chapter discusses two data structures that are constructed by augmenting red-black trees. Section 15.1 describes a data structure that supports general order-statistic operations on a dynamic set. We can then quickly find the <I>i</I>th smallest number in a set or the rank of a given element in the total ordering of the set. Section 15.2 abstracts the process of augmenting a data structure and provides a theorem that can simplify the augmentation of red-lack trees. Section 15.3 uses this theorem to help design a data structure for maintaining a dynamic set of intervals, such as time intervals. Given a query interval, we can then quickly find an interval in the set that overlaps it.<P>
<h1><a name="07f6_14fd">15.1 Dynamic order statistics<a name="07f6_14fd"></h1><P>
Chapter 10 introduced the notion of an order statistic. Specifically, the <I>i</I>th order statistic of a set of<I> n </I>elements, where <I>i</I> <img src="../images/memof12.gif">{ 1, 2, . . ., <I>n</I>}, is simply the element in the set with the <I>i</I>th smallest key. We saw that any order statistic could be retrieved in <I>O(n) </I>time from an unordered set. In this section, we shall see how red-black trees can be modified so that any order statistic can be determined in <I>O</I>(lg<I> n</I>) time. We shall also see how the<B> </B><I><B>rank</I></B> of an element--its position in the linear order of the set--can likewise be determined in <I>O</I>(lg <I>n</I>) time.<P>
<img src="282_a.gif"><P>
<h4><a name="07f6_14fe">Figure 15.1 An order-statistic tree, which is an augmented red-black tree. Shaded nodes are red, and darkened nodes are black. In addition to its usual fields, each node x has a field size[x], which is the number of nodes in the subtree rooted at x.<a name="07f6_14fe"></sub></sup></h4><P>
<a name="07f6_14fb">A data structure that can support fast order-statistic operations is shown in Figure 15.1. An<I><B> order-statistic tree</I></B> <I>T</I> is simply a red-black tree with additional information stored in each node. Besides the usual red-black tree fields <I>key</I>[<I>x</I>],<B> </B><I>color</I>[<I>x</I>]<I>, p</I>[<I>x</I>]<I>, left</I>[<I>x</I>]<I>, </I>and<I> right</I>[<I>x</I>] in a node <I>x</I>, we have another field <I>size</I>[<I>x</I>]. This field contains the number of (internal) nodes in the subtree rooted at<I> x </I>(including <I>x</I> itself), that is, the size of the subtree. If we define <I>size</I>[<FONT FACE="Courier New" SIZE=2>NIL</FONT>] to be 0, then we have the identity<P>
<pre>size[x] = size[left[x]] + size[right[x]] + 1 .</sub></sup></pre><P>
<a name="07f6_14fc">(To handle the boundary condition for <FONT FACE="Courier New" SIZE=2>NIL</FONT> properly, an actual implementation might test explicitly for <FONT FACE="Courier New" SIZE=2>NIL</FONT> each time the <I>size</I> field is accessed or, more simply, as in Section 14.4, use a sentinel <I>nil</I>[<I>T</I>] to represent <FONT FACE="Courier New" SIZE=2>NIL</FONT>, where <I>size</I>[<I>nil</I>[<I>T</I>]]<I> = </I>0.)<P>
<h2>Retrieving an element with a given rank</h2><P>
<a name="07f7_14fd"><a name="07f7_14fe">Before we show how to maintain this size information during insertion and deletion, let us examine the implementation of two order-statistic queries that use this additional information. We begin with an operation that retrieves an element with a given rank. The procedure OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT>(<I>x, i</I>) returns a pointer to the node containing the <I>i</I>th smallest key in the subtree rooted at <I>x</I>. To find the <I>i</I>th smallest key in an order-statistic tree <I>T</I>, we call OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT>(<I>root</I>[<I>T</I>]<I>, i</I>).<P>
<pre>OS-SELECT(<I>x,i</I>)</sub></sup></pre><P>
<pre>1 <I>r </I><img src="../images/arrlt12.gif"> size<I>[</I>left<I>[</I>x<I>]] + 1</I></sub></sup></pre><P>
<pre>2 <B>if</B> <I>i = r</I></sub></sup></pre><P>
<pre>3 <B>then return</B> <I>x</I></sub></sup></pre><P>
<pre>4 <B>elseif</B> <I>i </I>< <I>r</I></sub></sup></pre><P>
<pre>5 <B>then return</B> OS-SELECT(<I>left</I>[<I>x</I>],<I>i</I>)</sub></sup></pre><P>
<pre>6 <B>else return</B> OS-SELECT(<I>right</I>[<I>x</I>],<I>i</I> - <I>r</I>)</sub></sup></pre><P>
<a name="07f7_14ff">The idea behind OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> is similar to that of the selection algorithms in Chapter 10. The value of <I>size</I>[<I>left</I>[<I>x</I>]] is the number of nodes that come before <I>x</I> in an inorder tree walk of the subtree rooted at <I>x.</I> Thus, <I>size</I>[<I>left</I>[<I>x</I>]] + 1 is the rank of<I> x</I> within the subtree rooted at <I>x</I>.<P>
In line 1 of OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT>, we compute<I> r,</I> the rank of node <I>x</I> within the subtree rooted at <I>x.</I> If <I>i </I>= <I>r</I>, then node <I>x</I> is the<I> i</I>th smallest element, so we return <I>x</I> in line 3. If <I>i</I> < <I>r</I>, then the <I>i</I>th smallest element is in <I>x'</I>s left subtree, so we recurse on <I>left</I>[<I>x</I>] in line 5. If <I>i > r,</I> then the <I>i</I>th smallest element is in <I>x'</I>s right subtree. Since there are <I>r</I> elements in the subtree rooted at <I>x </I>that come before <I>x</I>'s<I> </I>right subtree in an inorder tree walk, the<I> i</I>th smallest element in the subtree rooted at <I>x</I> is the <I>(i - r)</I>th smallest element in the subtree rooted at <I>right</I>[<I>x</I>]. This element is determined recursively in line 6.<P>
To see how OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> operates, consider a search for the 17th smallest element in the order-statistic tree of Figure 15.1. We begin with <I>x</I> as the root, whose key is 26, and with <I>i</I> = 17. Since the size of 26's left subtree is 12, its rank is 13. Thus, we know that the node with rank 17 is the 17 - 13 = 4th smallest element in 26's right subtree. After the recursive call, <I>x</I> is the node with key 41, and <I>i</I> = 4. Since the size of 41's left subtree is 5, its rank within its subtree is 6. Thus, we know that the node with rank 4 is in the 4th smallest element in 41's left subtree. After the recursive call, <I>x</I> is the node with key 30, and its rank within its subtree is 2. Thus, we recurse once again to find the 4 - 2 = 2nd smallest element in the subtree rooted at the node with key 38. We now find that its left subtree has size 1, which means it is the second smallest element. Thus, a pointer to the node with key 38 is returned by the procedure.<P>
Because each recursive call goes down one level in the order-statistic tree, the total time for OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> is at worst proportional to the height of the tree. Since the tree is a red-black tree, its height is <I>O</I>(lg<I> n</I>), where <I>n</I> is the number of nodes. Thus, the running time of OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> is <I>O</I>(lg<I> n</I>) for a dynamic set of <I>n</I> elements.<P>
<P>
<h2>Determining the rank of an element</h2><P>
<a name="07f8_1500">Given a pointer to a node <I>x</I> in an order-statistic tree <I>T</I>, the procedure OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> returns the position of <I>x</I> in the linear order determined by an inorder tree walk of <I>T.</I><P>
<pre>OS-RANK(<I>T,x</I>)</sub></sup></pre><P>
<pre>1 <I>r</I> <img src="../images/arrlt12.gif"> <I>size</I>[<I>left</I>[<I>x</I>]] + 1</sub></sup></pre><P>
<pre>2 <I>y</I> <img src="../images/arrlt12.gif"> <I>x</I></sub></sup></pre><P>
<pre>3 <B>while</B> <I>y </I><img src="../images/noteq.gif"> <I>root</I>[<I>T</I>]</sub></sup></pre><P>
<pre>4 <B>do if </B><I>y</I> = <I>right</I>[<I>p</I>[<I>y</I>]]</sub></sup></pre><P>
<pre>5 <B>then</B> <I>r</I> <img src="../images/arrlt12.gif"> <I>r</I> + <I>size</I>[<I>left</I>[<I>p</I>[<I>y</I>]]] + 1</sub></sup></pre><P>
<pre>6 <I>y</I> <img src="../images/arrlt12.gif"> <I>p</I>[<I>y</I>]</sub></sup></pre><P>
<pre>7 <B>return </B><I>r</I></sub></sup></pre><P>
<a name="07f8_1501"><a name="07f8_1502"><a name="07f8_1503">The procedure works as follows. The rank of <I>x </I>can be viewed as the number of nodes preceding <I>x</I> in an inorder tree walk, plus 1 for <I>x</I> itself. The following invariant is maintained: at the top of the <B>while</B> loop of lines 3-6, <I>r</I> is the rank of <I>key</I>[<I>x</I>] in the subtree rooted at node <I>y</I>. We maintain this invariant as follows. In line 1, we set <I>r</I> to be the rank of <I>key</I>[<I>x</I>]<I> </I>within the subtree rooted at<I> x</I>. Setting <I>y </I><img src="../images/arrlt12.gif"> <I>x </I>in line 2 makes the invariant true the first time the test in line 3 executes. In each iteration of the <B>while</B> loop, we consider the subtree rooted at <I>p</I>[<I>y</I>]. We have already counted the number of nodes in the subtree rooted at node <I>y</I> that precede <I>x</I> in an inorder walk, so we must add the nodes in the subtree rooted at <I>y</I>'s sibling that precede <I>x</I> in an inorder walk, plus 1 for<I> p</I>[<I>y</I>] if it, too, precedes <I>x</I>. If <I>y</I> is a left child, then neither <I>p</I>[<I>y</I>] nor any node in<I> p</I>[<I>y</I>]'s right subtree precedes<I> x,</I> so we leave <I>r</I> alone. Otherwise, <I>y</I> is a right child and all the nodes in<I> p</I>[<I>y</I>]'s left subtree precede <I>x,</I> as does <I>p</I>[<I>y</I>] itself. Thus, in line 5, we add size[<I>left</I>[<I>y</I>]] + 1 to the current value of <I>r</I> Setting<I> y </I><img src="../images/arrlt12.gif"> <I>p</I>[<I>y</I>] makes the invariant true for the next iteration. When<I> y = root</I>[<I>T</I>], the procedure returns the value of <I>r</I>, which is now the rank of<I> key</I>[<I>x</I>]. <P>
As an example, when we run OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> on the order-statistic tree of Figure 15.1 to find the rank of the node with key 38, we get the following sequence of values of <I>key</I>[<I>y</I>] and <I>r</I> at the top of the <B>while</B> loop:<P>
<pre>iteration <B><I>key</I></B>[<B><I>y</I></B>] <B><I>r</B></I></sub></sup></pre><P>
<pre>--------------------</sub></sup></pre><P>
<pre> 1 38 2</sub></sup></pre><P>
<pre> 2 30 4</sub></sup></pre><P>
<pre> 3 41 4</sub></sup></pre><P>
<pre> 4 26 17</sub></sup></pre><P>
The rank 17 is returned.<P>
Since each iteration of the <B>while</B> loop takes <I>O</I>(1) time, and<I> y</I> goes up one level in the tree with each iteration, the running time of OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> is at worst proportional to the height of the tree:<I> O</I>(lg<I> n</I>) on an <I>n</I>-node order-statistic tree.<P>
<P>
<h2>Maintaining subtree sizes</h2><P>
<a name="07f9_1504"><a name="07f9_1505">Given the <I>size</I> field in each node, OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> and OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> can quickly compute order-statistic information. But unless these fields can be efficiently maintained by the basic modifying operations on red-black trees, our work will have been for naught. We shall now show that subtree sizes can be maintained for both insertion and deletion without affecting the asymptotic running times of either operation. <P>
We noted in Section 14.3 that insertion into a red-black tree consists of two phases. The first phase goes down the tree from the root, inserting the new node as a child of an existing node. The second phase goes up the tree, changing colors and ultimately performing rotations to maintain the red-black properties.<P>
<img src="285_a.gif"><P>
<h4><a name="07f9_1507">Figure 15.2 Updating subtree sizes during rotations. The two size fields that need to be updated are the ones incident on the link around which the rotation is performed. The updates are local, requiring only the size information stored in x, y, and the roots of the subtrees shown as triangles.<a name="07f9_1507"></sub></sup></h4><P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -