📄 chap15.htm
字号:
To maintain the subtree sizes in the first phase, we simply increment <I>size</I>[<I>x</I>] for each node <I>x</I> on the path traversed from the root down toward the leaves. The new node added gets a <I>size</I> of 1. Since there are <I>O</I>(lg <I>n</I>) nodes on the traversed path, the additional cost of maintaining the <I>size </I>fields is <I>O</I>(lg <I>n</I>).<P>
In the second phase, the only structural changes to the underlying red-black tree are caused by rotations, of which there are at most two. Moreover, a rotation is a local operation: it invalidates only the two <I>size</I> fields in the nodes incident on the link around which the rotation is performed. Referring to the code for <FONT FACE="Courier New" SIZE=2>LEFT</FONT>-<FONT FACE="Courier New" SIZE=2>ROTATE</FONT>(<I>T,x</I>) in Section 14.2, we add the following lines:<P>
<pre>13 <I>size</I>[<I>y</I>] <img src="../images/arrlt12.gif"> <I>size</I>[<I>x</I>]</sub></sup></pre><P>
<pre>14 <I>size</I>[<I>x</I>] <img src="../images/arrlt12.gif"> <I>size</I>[<I>left</I>[<I>x</I>]] + <I>size</I>[<I>right</I>[<I>x</I>]] + 1</sub></sup></pre><P>
Figure 15.2 illustrates how the fields are updated. The change to <FONT FACE="Courier New" SIZE=2>RIGHT</FONT>-<FONT FACE="Courier New" SIZE=2>ROTATE</FONT> is symmetric.<P>
Since at most two rotations are performed during insertion into a red-black tree, only <I>O</I>(1) additional time is spent updating <I>size</I> fields in the second phase. Thus, the total time for insertion into an <I>n</I>-node order-statistic tree is <I>O</I>(lg <I>n</I>)--asymptotically the same as for an ordinary red-black tree.<P>
<a name="07f9_1506">Deletion from a red-black tree also consists of two phases: the first operates on the underlying search tree, and the second causes at most three rotations and otherwise performs no structural changes. (See Section 14.4.) The first phase splices out one node <I>y</I>. To update the subtree sizes, we simply traverse a path from node <I>y</I> up to the root, decrementing the <I>size</I> field of each node on the path. Since this path has length <I>O</I>(lg <I>n</I>) in an <I>n</I>-node red-black tree, the additional time spent maintaining <I>size</I> fields in the first phase is <I>O</I>(lg <I>n</I>). The <I>O</I>(1) rotations in the second phase of deletion can be handled in the same manner as for insertion. Thus, both insertion and deletion, including the maintenance of the <I>size</I> fields, take <I>O</I>(lg <I>n</I>) time for an <I>n</I>-node order-statistic tree.<P>
<P>
<h2><a name="07fa_150c">Exercises<a name="07fa_150c"></h2><P>
<a name="07fa_150d">15.1-1<a name="07fa_150d"><P>
<a name="07fa_1507">Show how OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT>(<I>T,</I> 10) operates on the red-black tree <I>T</I> of Figure 15.2.<P>
<a name="07fa_150e">15.1-2<a name="07fa_150e"><P>
Show how OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT>(<I>T,x</I>) operates on the red-black tree <I>T</I> of Figure 15.2 and the node <I>x</I> with <I>key</I>[<I>x</I>] = 35.<P>
<a name="07fa_150f">15.1-3<a name="07fa_150f"><P>
Write a nonrecursive version of OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT>.<P>
<a name="07fa_1510">15.1-4<a name="07fa_1510"><P>
<a name="07fa_1508"><a name="07fa_1509">Write a recursive procedure OS-<FONT FACE="Courier New" SIZE=2>KEY</FONT>-<FONT FACE="Courier New" SIZE=2>RANK</FONT>(<I>T,k</I>) that takes as input an order-statistic tree <I>T</I> and a key <I>k</I> and returns the rank of <I>k</I> in the dynamic set represented by <I>T</I>. Assume that the keys of <I>T</I> are distinct.<P>
<a name="07fa_1511">15.1-5<a name="07fa_1511"><P>
<a name="07fa_150a">Given an element <I>x</I> in an <I>n</I>-node order-statistic tree and a natural number<I> i,</I> how can the <I>i</I>th successor of <I>x</I> in the linear order of the tree be determined in <I>O</I>(lg <I>n</I>) time?<P>
<a name="07fa_1512">15.1-6<a name="07fa_1512"><P>
Observe that whenever the <I>size</I> field is referenced in either OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> or OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT>, it is only used to compute the rank of <I>x</I> in the subtree rooted at <I>x</I>. Accordingly, suppose we store in each node its rank in the subtree of which it is the root. Show how this information can be maintained during insertion and deletion. (Remember that these two operations can cause rotations.)<P>
<a name="07fa_1513">15.1-7<a name="07fa_1513"><P>
Show how to use an order-statistic tree to to count the number of inversions (see Problem 1-3) in an array of size <I>n</I> in time <I>O</I>(<I>n</I> lg <I>n</I>).<P>
<a name="07fa_1514">15.1-8<a name="07fa_1514"><P>
<a name="07fa_150b">Consider <I>n</I> chords on a circle, each defined by its endpoints. Describe an <I>O</I>(<I>n </I>lg <I>n</I>)-time algorithm for determining the number of pairs of chords that intersect inside the circle. (For example, if the <I>n</I> chords are all diameters that meet at the center, then the correct answer is <img src="286_a.gif">.) Assume that no two chords share an endpoint.<P>
<P>
<P>
<h1><a name="07fb_0001">15.2 How to augment a data structure<a name="07fb_0001"></h1><P>
The process of augmenting a basic data structure to support additional functionality occurs quite frequently in algorithm design. It will be used again in the next section to design a data structure that supports operations on intervals. In this section, we shall examine the steps involved in such augmentation. We shall also prove a theorem that allows us to augment red-black trees easily in many cases.<P>
Augmenting a data structure can be broken into four steps:<P>
1. choosing an underlying data structure,<P>
2. determining additional information to be maintained in the underlying data structure,<P>
3. verifying that the additional information can be maintained for the basic modifying operations on the underlying data structure, and<P>
4. developing new operations.<P>
As with any prescriptive design method, you should not blindly follow the steps in the order given. Most design work contains an element of trial and error, and progress on all steps usually proceeds in parallel. There is no point, for example, in determining additional information and developing new operations (steps 2 and 4) if we will not be able to maintain the additional information efficiently. Nevertheless, this four-step method provides a good focus for your efforts in augmenting a data structure, and it is also a good way to organize the documentation of an augmented data structure.<P>
We followed these steps in Section 15.1 to design our order-statistic trees. For step 1, we chose red-black trees as the underlying data structure. A clue to the suitability of red-black trees comes from their efficient support of other dynamic-set operations on a total order, such as <FONT FACE="Courier New" SIZE=2>MINIMUM</FONT>, <FONT FACE="Courier New" SIZE=2>MAXIMUM</FONT>, <FONT FACE="Courier New" SIZE=2>SUCCESSOR</FONT>, and <FONT FACE="Courier New" SIZE=2>PREDECESSOR</FONT>.<P>
For step 2, we provided the <I>size</I> fields, which in each node<I> x</I> stores the size of the subtree rooted at <I>x. </I>Generally, the additional information makes operations more efficient. For example, we could have implemented OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> and OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> using just the keys stored in the tree, but they would not have run in <I>O</I>(lg <I>n</I>) time. Sometimes, the additional information is pointer information rather than data, as in Exercise 15.2-1.<P>
For step 3, we ensured that insertion and deletion could maintain the <I>size </I>fields while still running in <I>O</I>(lg <I>n</I>) time. Ideally, a small number of changes to the data structure should suffice to maintain the additional information. For example, if we simply stored in each node its rank in the tree, the OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> and OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT> procedures would run quickly, but inserting a new minimum element would cause a change to this information in every node of the tree. When we store subtree sizes instead, inserting a new element causes information to change in only <I>O</I>(lg <I>n</I>) nodes.<P>
For step 4, we developed the operations OS-<FONT FACE="Courier New" SIZE=2>SELECT</FONT> and OS-<FONT FACE="Courier New" SIZE=2>RANK</FONT>. After all, the need for new operations is why we bother to augment a data structure in the first place. Occasionally, rather than developing new operations, we use the additional information to expedite existing ones, as in Exercise 15.2-1.<P>
<h2>Augmenting red-black trees</h2><P>
<a name="07fc_150c">When red-black trees underlie an augmented data structure, we can prove that certain kinds of additional information can always be efficiently maintained by insertion and deletion, thereby making step 3 very easy. The proof of the following theorem is similar to the argument from Section 15.1 that the <I>size</I> field can be maintained for order-statistic trees.<P>
<a name="07fc_150d">Theorem 15.1<a name="07fc_150d"><P>
Let â be a field that augments a red-black tree <I>T</I> of <I>n</I> nodes, and suppose that the contents of â for a node<I> x </I>can be computed using only the information in nodes <I>x</I>, <I>left</I>[<I>x</I>], and <I>right</I>[<I>x</I>], including â[<I>left</I>[<I>x</I>]] and â[<I>right</I>[<I>x</I>]]. Then, we can maintain the values of <I>f </I>in all nodes of <I>T</I> during insertion and deletion without asymptotically affecting the <I>O</I>(lg <I>n</I>) performance of these operations.<P>
<I><B>Proof </I></B>The main idea of the proof is that a change to an â field in a node <I>x</I> propagates only to ancestors of <I>x</I> in the tree. That is, changing â[<I>x</I>] may require â[<I>p</I>[<I>x</I>]] to be updated, but nothing else; updating â[<I>p</I>[<I>x</I>]] may require â[<I>p</I>[<I>p</I>[<I>x</I>]]] to be updated, but nothing else; and so on up the tree. When â[<I>root</I>[<I>T</I>]] is updated, no other node depends on the new value, so the process terminates. Since the height of a red-black tree is <I>O</I>(lg <I>n</I>), changing an <I>f</I> field in a node costs <I>O</I>(lg <I>n</I>) time in updating nodes dependent on the change.<P>
Insertion of a node <I>x</I> into <I>T</I> consists of two phases. (See Section 14.3.) During the first phase, <I>x</I> is inserted as a child of an existing node <I>p</I>[<I>x</I>]. The value for <I>f</I>[<I>x</I>] can be computed in <I>O</I>(1) time since, by supposition, it depends only on information in the other fields of <I>x</I> itself and the information in <I>x</I>'s children, but <I>x</I>'s children are both <FONT FACE="Courier New" SIZE=2>NIL</FONT>. Once â[<I>x</I>] is computed, the change propagates up the tree. Thus, the total time for the first phase of insertion is <I>O</I>(lg <I>n</I>). During the second phase, the only structural changes to the tree come from rotations. Since only two nodes change in a rotation, the total time for updating theâ fields is <I>O</I>(lg<I> n</I>) per rotation. Since the number of rotations during insertion is at most two, the total time for insertion is <I>O</I>(lg <I>n</I>).<P>
Like insertion, deletion has two phases. (See Section 14.4.) In the first phase, changes to the tree occur if the deleted node is replaced by its successor, and then again when either the deleted node or its successor is spliced out. Propagating the updates to <I>f</I> caused by these changes costs at most <I>O</I>(lg <I>n</I>) since the changes modify the tree locally. Fixing up the red-black tree during the second phase requires at most three rotations, and each rotation requires at most <I>O</I>(lg <I>n</I>) time to propagate the updates to <I>f</I>. Thus, like insertion, the total time for deletion is <I>O</I>(lg <I>n</I>). <P>
In many cases, such as maintenance of the <I>size</I> fields in order-statistic trees, the cost of updating after a rotation is <I>O</I>(1), rather than the <I>O</I>(lg <I>n</I>) derived in the proof of Theorem 15.1. Exercise 15.2-4 gives an example.<P>
<P>
<h2><a name="07fd_1515">Exercises<a name="07fd_1515"></h2><P>
<a name="07fd_1516">15.2-1<a name="07fd_1516"><P>
<a name="07fd_150d"><a name="07fd_150e"><a name="07fd_150f"><a name="07fd_1510"><a name="07fd_1511">Show how the dynamic-set queries <FONT FACE="Courier New" SIZE=2>MINIMUM</FONT>, <FONT FACE="Courier New" SIZE=2>MAXIMUM</FONT>, <FONT FACE="Courier New" SIZE=2>SUCCESSOR</FONT>, and <FONT FACE="Courier New" SIZE=2>PREDECESSOR</FONT> can each be supported in <I>O</I>(1) worst-case time on an augmented order-statistic tree. The asymptotic performance of other operations on order-statistic trees should not be affected.<P>
<a name="07fd_1517">15.2-2<a name="07fd_1517"><P>
Can the black-heights of nodes in a red-black tree be maintained as fields in the nodes of the tree without affecting the asymptotic performance of any of the red-black tree operations? Show how, or argue why not.<P>
<a name="07fd_1518">15.2-3<a name="07fd_1518"><P>
<a name="07fd_1512">Can the depths of nodes in a red-black tree be efficiently maintained as fields in the nodes of the tree? Show how, or argue why not.<P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -