📄 chap22.htm

📁 程序设计的艺术,此书是英文版的,经久不衰,非常的经典
💻 HTM
📖 第 1 页 / 共 5 页
字号:
上一页 1 2 3 45
Suppose we convert a sequence <I>S</I>' of <I>m</I>' <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations into a sequence <I>S</I> of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>LINK</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations by turning each <FONT FACE="Courier New" SIZE=2>UNION</FONT> into two <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations followed by a <FONT FACE="Courier New" SIZE=2>LINK</FONT>. Then, if sequence <I>S</I> runs in <I>O</I>(<I>m</I> 1g<SUP>*</SUP> <I>n</I>) time, sequence <I>S</I>' runs in <I>O</I>(<I>m</I>' 1g<SUP>*</SUP> <I>n</I>) time.<P>
<I><B>Proof     </I></B>Since each <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation in sequence <I>S</I>' is converted into three operations in <I>S</I>, we have <I>m</I>'<I> <img src="../images/lteq12.gif"> </I>m<I> <img src="../images/lteq12.gif"> 3</I>m<I>'</I>. Since <I>m</I> = <I>O</I>(<I>m</I>'), an <I>O</I>(<I>m</I> 1g<SUP>*</SUP> <I>n</I>) time bound for the converted sequence <I>S</I> implies an <I>O</I>(<I>m</I>' 1g<SUP>*</SUP> <I>n</I>) time bound for the original sequence <I>S</I>'.      <P>
In the remainder of this section, we shall assume that the initial sequence of <I>m</I>' <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations has been converted to a sequence of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>LINK</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations. We now prove an <I>O</I>(<I>m</I> 1g<SUP>*</SUP> <I>n</I>) time bound for the converted sequence and appeal to Lemma 22.6 to prove the <I>O</I>(<I>m</I><I>'</I> 1g<SUP>*</SUP> <I>n</I>) running time of the original sequence of <I>m</I>' operations.<P>
<a name="0899_1713">Theorem 22.7<a name="0899_1713"><P>
A sequence of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>LINK</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, <I>n</I> of which are <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, can be performed on a disjoint-set forest with union by rank and path compression in worst-case time <I>O</I>(<I>m</I> 1g<SUP>*</SUP> <I>n</I>).<P>
<I><B>Proof     </I></B>We assess <I><B>charges</I></B> corresponding to the actual cost of each set operation and compute the total number of charges assessed once the entire sequence of set operations has been performed. This total then gives us the actual cost of all the set operations.<P>
The charges assessed to the <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <FONT FACE="Courier New" SIZE=2>LINK</FONT> operations are simple: one charge per operation. Since these operations each take <I>O</I>(1) actual time, the charges assessed equal the actual costs of the operations.<P>
Before discussing charges assessed to the <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, we partition node ranks into <I><B>blocks</I></B> by putting rank <I>r</I> into block 1g<SUP>*</SUP> <I>r</I> for <I>r</I> = 0, 1, . . . , <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdl12.gif"></FONT>1g <I>n</I><FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdr12.gif"></FONT>. (Recall that <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdl12.gif"></FONT>1g <I>n</I><FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdr12.gif"></FONT> is the maximum rank.) The highest-numbered block is therefore block 1g<SUP>*</SUP>(1g <I>n</I>) = 1g<SUP>*</SUP> <I>n</I> - 1. For notational convenience, we define for integers <I>j</I> <img src="../images/gteq.gif"> -1,<P>
<img src="455_a.gif"><P>
Then, for <I>j</I> = 0, 1, . . . , lg<SUP>*</SUP> <I>n</I> - 1, the <I>j</I>th block consists of the set of ranks<P>
<pre>{<I>B</I>(<I>j</I> - 1) + 1, <I>B</I>(<I>j</I> - 1) + 2, . . . , <I>B</I>(<I>j</I>)} .</sub></sup></pre><P>
We use two types of charges for a <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operation: <I><B>block charges </I></B>and <I><B>path charges</I></B>. Suppose that the <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> starts at node <I>x</I><SUB>0</SUB> and that the find path consists of nodes <I>x</I><SUB>0</SUB>, <I>x</I><SUB>1</SUB>, . . . , <I>x<SUB>l</I></SUB>, where for <I>i</I> = 1, 2, . . . , <I>l</I>, node <I>x<SUB>i </I></SUB>is <I>p</I>[<I>x<SUB>i</I>-1</SUB>] and <I>x<SUB>l</I></SUB> (a root) is <I>p</I>[<I>x<SUB>l</I></SUB>]. For <I>j</I> = 0, 1, . . . , lg<SUP>*</SUP> <I>n</I> - 1, we assess one block charge to the <I>last</I> node with rank in block <I>j</I> on the path. (Note that Lemma 22.2 implies that on any find path, the nodes with ranks in a given block are consecutive.) We also assess one block charge to the child of the root, that is, to <I>x<SUB>l</I></SUB>-<I>1. </I>Because ranks strictly increase along any find path, an equivalent formulation assesses one block charge to each node <I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I></FONT></SUB> such that <I>p</I>[<I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I></FONT></SUB>] = <I>x<SUB><FONT FACE="Courier New" SIZE=2>l</I></FONT></SUB> (<I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I></FONT></SUB> is the root or its child) or 1g<SUP>*</SUP> <I>rank</I>[<I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I></FONT></SUB>] &lt; 1g<SUP>*</SUP> <I>rank</I>[<I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I>+1</FONT></SUB>](the block of <I>x<SUB><FONT FACE="Courier New" SIZE=2>i</I></FONT></SUB>'s rank differs from that of its parent). At each node on the find path for which we do not assess a block charge, we assess one path charge.<P>
Once a node other than the root or its child is assessed block charges, it will never again be assessed path charges. To see why, observe that each time path compression occurs, the rank of a node <I>x<SUB>i</I></SUB> for which <I>p</I>[<I>x<SUB>i</I></SUB>] <img src="../images/noteq.gif"> <I>x<SUB>l</I> </SUB>remains the same, but the new parent of <I>x<SUB>i</I></SUB> has a rank strictly greater than that of <I>x<SUB>i</I></SUB>'s old parent. The difference between the ranks of <I>x<SUB>i</I></SUB> and its parent is a monotonically increasing function of time. Thus, the difference between 1g<SUP>*</SUP> <I>rank</I>[<I>p</I>[<I>x<SUB>i</I></SUB>]] and 1g<SUP>*</SUP> <I>rank</I>[<I>x<SUB>i</I></SUB>] is also a monotonically increasing function of time. Once <I>x<SUB>i</I></SUB> and its parent have ranks in different blocks, they will always have ranks in different blocks, and so <I>x<SUB>i</I></SUB> will never again be assessed a path charge.<P>
Since we have charged once for each node visited in each <FONT FACE="Courier New" SIZE=2>FIND</FONT>-S<FONT FACE="Courier New" SIZE=2>ET </FONT>operation, the total number of charges assessed is the total number of nodes visited in all the <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations; this total represents the actual cost of all the <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations. We wish to show that this total is<I> O</I>(<I>m</I> 1g<SUP>*</SUP> <I>n</I>).<P>
The number of block charges is easy to bound. There is at most one block charge assessed for each block number on the given find path, plus one block charge for the child of the root. Since block numbers range from 0 to 1g<SUP>*</SUP> <I>n</I> - 1, there are at most 1g<SUP>*</SUP> <I>n</I> + 1 block charges assessed for each <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operation. Thus, there are at most <I>m</I>(1g<SUP>*</SUP> <I>n</I> + 1) block charges assessed over all <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations.<P>
Bounding the path charges is a little trickier. We start by observing that if a node <I>x<SUB>i</I></SUB> is assessed a path charge, then <I>p</I>[<I>x<SUB>i</I></SUB>] <img src="../images/noteq.gif"> <I>x<SUB>l</I></SUB> before path compression, so that <I>x<SUB>i</I></SUB> will be assigned a new parent during path compression. Moreover, as we have observed, <I>x<SUB>i</I></SUB>'s new parent has a higher rank than its old parent. Suppose that node <I>x<SUB>i</I></SUB>'s rank is in block <I>j</I>. How many times can <I>x<SUB>i</I></SUB> be assigned a new parent, and thus assessed a path charge, before <I>x<SUB>i</I></SUB>is assigned a parent whose rank is in a different block (after which <I>x<SUB>i</I></SUB> will never again be assessed a path charge)? This number of times is maximized if <I>x<SUB>i</I></SUB> has the lowest rank in its block, namely <I>B</I>(<I>j</I> - 1) + 1, and its parents' ranks successively take on the values <I>B</I>(<I>j</I> - 1) + 2, <I>B</I>(<I>j</I> - 1) + 3, . . . , <I>B</I>(<I>j</I>). Since there are <I>B</I>(<I>j</I>) - <I>B</I>(<I>j</I> - 1) - 1 such ranks, we conclude that a vertex can be assessed at most <I>B</I>(<I>j</I>) - <I>B</I>(<I>j </I>- 1) - 1 path charges while its rank is in block <I>j</I>.<P>
Our next step in bounding the path charges is to bound the number of nodes that have ranks in block <I>j</I> for integers <I>j</I> <img src="../images/gteq.gif"> 0. (Recall that by Lemma 22.2, the rank of a node is fixed once it becomes a child of another node.) Let the number of nodes whose ranks are in block <I>j</I> be denoted by <I>N</I>(<I>j</I>). Then, by Lemma 22.4,<P>
<img src="457_a.gif"><P>
For <I>j</I> = 0, this sum evaluates to<P>
<pre><I>N</I>(0)  =  <I>n</I>/2<SUP>0</SUP> + <I>n</I>/2<SUP>1</sub></sup></pre><P>
<pre>=  3<I>n</I>/2</sub></sup></pre><P>
<pre>=  3<I>n</I>/2<I>B</I>(0).</sub></sup></pre><P>
For <I>j</I> <img src="../images/gteq.gif"> 1, we have<P>
<img src="457_b.gif"><P>
Thus, <I>N</I>(<I>j</I>) <img src="../images/lteq12.gif"> 3<I>n</I>/2<I>B</I>(<I>j</I>) for all integers <I>j</I> <img src="../images/gteq.gif"> 0.<P>
We finish bounding the path charges by summing over all blocks the product of the maximum number of nodes with ranks in the block and the maximum number of path charges per node of that block. Denoting by P(<I>n</I>) the overall number of path charges, we have<P>
<img src="457_c.gif"><P>
Thus, the total number of charges incurred by <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations is <I>O</I>(<I>m</I>(1g* <I>n</I> + 1) + <I>n</I> 1g* <I>n</I>), which is <I>O</I>(<I>m</I> 1g* <I>n</I>) since <I>m</I> <img src="../images/gteq.gif"> <I>n</I>. Since there are <I>O</I>(<I>n</I>) <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <FONT FACE="Courier New" SIZE=2>LINK</FONT> operations, with one charge each, the total time is <I>O</I>(<I>m</I> 1g* <I>n</I>).      <P>
<a name="0899_1714">Corollary 22.8<a name="0899_1714"><P>
A sequence of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, <I>n</I> of which are <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, can be performed on a disjoint-set forest with union by rank and path compression in worst-case time <I>O</I>(<I>m</I> 1g* <I>n</I>).<P>
<I><B>Proof     </I></B>Immediate from Theorem 22.7 and Lemma 22.6.      <P>
<P>







<h2><a name="089a_0001">Exercises<a name="089a_0001"></h2><P>
<a name="089a_0002">22.4-1<a name="089a_0002"><P>
Prove Lemma 22.2.<P>
<a name="089a_0003">22.4-2<a name="089a_0003"><P>
For each node <I>x</I>, how many bits are necessary to store size(<I>x</I>)? How about <I>rank</I>[<I>x</I>]?<P>
<a name="089a_0004">22.4-3<a name="089a_0004"><P>
Using Lemma 22.2 and Corollary 22.5, give a simple proof that operations on a disjoint-set forest with union by rank but without path compression run in <I>O</I>(<I>m </I>1g<I> n</I>) time.<P>
<a name="089a_0005">22.4-4<a name="089a_0005"><P>
Suppose we modify the rule about assessing charges so that we assess one block charge to the last node on the find path whose rank is in block <I>j</I> for <I>j</I> = 0, 1, . . . , 1g* <I>n</I> - 1. Otherwise, we assess one path charge to the node. Thus, if a node is a child of the root and is not the last node of a block, it is assessed a path charge, not a block charge. Show that <img src="../images/omega12.gif">(<I>m</I>) path charges could be assessed a given node while its rank is in a given block <I>j</I>.<P>
<P>


<P>







<h1><a name="089b_1721">Problems<a name="089b_1721"></h1><P>
<a name="089b_1722">22-1     Off-line minimum<a name="089b_1722"><P>
<a name="089b_1712"><a name="089b_1713"><a name="089b_1714">The <I><B>off-line minimum problem</I></B> asks us to maintain a dynamic set <I>T</I> of elements from the domain {1, 2, . . . ,<I>n</I>} under the operations <FONT FACE="Courier New" SIZE=2>INSERT</FONT> and <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-<FONT FACE="Courier New" SIZE=2>MIN</FONT>. We are given a sequence <I>S</I> of <I>n</I> <FONT FACE="Courier New" SIZE=2>INSERT</FONT> and <I>m</I> <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-<FONT FACE="Courier New" SIZE=2>MIN</FONT> calls, where each key in {1, 2, . . . , <I>n</I>} is inserted exactly once. We wish to determine which key is returned by each <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-M<FONT FACE="Courier New" SIZE=2>IN </FONT>call. Specifically, we wish to fill in an array <I>extracted</I>[1 . . <I>m</I>], where for <I>i</I> = 1, 2, . . . , <I>m</I>, <I>extracted</I>[<I>i</I>] is the key returned by the <I>i</I>th <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-<FONT FACE="Courier New" SIZE=2>MIN</FONT> call. The problem is "off-line" in the sense that we are allowed to process the entire sequence <I>S</I> before determining any of the returned keys.<P>
<I><B>a.</I></B>     In the following instance of the off-line minimum problem, each <FONT FACE="Courier New" SIZE=2>INSERT</FONT> is represented by a number and each <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-<FONT FACE="Courier New" SIZE=2>MIN</FONT> is represented by the letter E:<P>
<pre>4, 8, E, 3, E, 9, 2, 6, E, E, E, 1, 7, E, 5.</sub></sup></pre><P>
Fill in the correct values in the <I>extracted</I> array.<P>
To develop an algorithm for this problem, we break the sequence<I> S</I> into homogeneous subsequences. That is, we represent <I>S</I> by<P>
<pre>I<SUB>1</SUB>,E,I<SUB>2</SUB>,E,I<SUB>3</SUB>,...,I<SUB>m</SUB>,E,I<SUB>m+1 </SUB>,</sub></sup></pre><P>
where each E represents a single <FONT FACE="Courier New" SIZE=2>EXTRACT</FONT>-<FONT FACE="Courier New" SIZE=2>MIN</FONT> call and each I<I>j</I> represents a (possibly empty) sequence of <FONT FACE="Courier New" SIZE=2>INSERT</FONT> calls. For each subsequence I<I>j</I>, we initially place the keys inserted by these operations into a set <I>Kj</I>, which is empty if I<I>j</I> is empty. We then do the following.<P>
<pre><a name="089b_1715">OFF-LINE-MINIMUM(<I>m,n</I>)</sub></sup></pre><P>
<pre>1  <B>for</B> <I>i</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> <I>n</I></sub></sup></pre><P>
<pre>2       <B>do</B> determine <I>j</I> such that <I>i </I><img src="../images/memof12.gif"> <I>Kj</I></sub></sup></pre><P>
<pre>3          <B>if</B> <I>j</I> <img src="../images/noteq.gif"> <I>m </I>+ 1</sub></sup></pre><P>
<pre>4             <B>then</B> extracted[<I>j</I>] <img src="../images/arrlt12.gif"> <I>i</I></sub></sup></pre><P>
<pre>5                 let <I>l</I> be the smallest value greater than <I>j</I></sub></sup></pre><P>
<pre>for which set <I>K<SUB>l</I></SUB> exists </sub></sup></pre><P>
<pre>6                 <I>K<
上一页 1 2 3 45
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -