📄 chap22.htm
字号:
<h1><a name="088d_16ed">22.2 Linked-list representation of disjoint sets<a name="088d_16ed"></h1><P>
<a name="088d_16e8"><a name="088d_16e9"><a name="088d_16ea">A simple way to implement a disjoint-set data structure is to represent each set by a linked list. The first object in each linked list serves as its set's representative. Each object in the linked list contains a set member, a pointer to the object containing the next set member, and a pointer back to the representative. Figure 22.2(a) shows two sets. Within each linked list, the objects may appear in any order (subject to our assumption that the first object in each list is the representative).<P>
<a name="088d_16eb"><a name="088d_16ec">With this linked-list representation, both <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> are easy, requiring <I>O</I>(1) time. To carry out <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>(<I>x</I>), we create a new linked list whose only object is <I>x</I>. For <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>(<I>x</I>), we just return the pointer from <I>x</I> back to the representative.<P>
<h2>A simple implementation of union</h2><P>
The simplest implementation of the <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation using the linked-list set representation takes significantly more time than <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> or <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>. As Figure 22.2(b) shows, we perform <FONT FACE="Courier New" SIZE=2>UNION</FONT>(<I>x, y</I>) by appending <I>x</I>'s list onto the end of <I>y</I>'s list. The representative of the new set is the element that was originally the representative of the set containing <I>y</I>. Unfortunately, we must update the pointer to the representative for each object originally on <I>x</I>'s list, which takes time linear in the length of <I>x</I>'s list.<P>
In fact, it is not difficult to come up with a sequence of <I>m</I> operations that requires <img src="../images/bound.gif">(<I>m</I><SUP>2</SUP>) time. We let <I>n </I>= <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrul14.gif"><I>m/</I></FONT>2<FONT FACE="Courier New" SIZE=2><img src="../images/hfbrur14.gif"></FONT> + 1 and <I>q </I>=<I> m</I> -<I> n</I> = <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdl12.gif"><I>m/</I></FONT>2<FONT FACE="Courier New" SIZE=2><img src="../images/hfbrdr12.gif"></FONT> - 1 and suppose that we have objects <I>x</I><SUB>1</SUB><I>, x</I><SUB>2</SUB>,<I> . . . , x<SUB>n</I></SUB>. We then execute the sequence of <I>m</I> =<I> n</I> +<I> q</I> operations shown in Figure 22.3. We spend <img src="../images/bound.gif">(<I>n</I>) time performing the <I>n</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations. Because the <I>i</I>th <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation updates <I>i</I> objects, the total number of objects updated by all the <FONT FACE="Courier New" SIZE=2>UNION</FONT> operations is<P>
<img src="443_a.gif"><P>
<img src="444_a.gif"><P>
<h4><a name="088e_16ef">Figure 22.2 (a) Linked-list representations of two sets. One contains objects b, c, e, and h, with c as the representative, and the other contains objects d, f, and g, with f as the representative. Each object on the list contains a set member, a pointer to the next object on the list, and a pointer back to the first object on the list, which is the representative. (b) The result of <FONT FACE="Courier New" SIZE=2>UNION</FONT>(e, g). The representative of the resulting set is f.<a name="088e_16ef"></sub></sup></h4><P>
<img src="444_b.gif"><P>
<h4><a name="088e_16f0">Figure 22.3 A sequence of m operations that takes O(m<SUP>2</SUP>) time using the linked-list set representation and the simple implementation of <FONT FACE="Courier New" SIZE=2>UNION</FONT>. For this example, n = <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrul14.gif"><FONT FACE="Times New Roman" SIZE=2>m/2<FONT FACE="Courier New" SIZE=2><img src="../images/hfbrur14.gif"><FONT FACE="Times New Roman" SIZE=2>+1 </FONT></FONT></FONT></FONT>and q = m - n.<a name="088e_16f0"></sub></sup></h4><P>
<a name="088e_16ed"><a name="088e_16ee">The total time spent is therefore <img src="../images/bound.gif">(<I>n+q</I><SUP>2</SUP>), which is<img src="../images/bound.gif">(<I>m</I><SUP>2</SUP>)<I> </I>since <I>n</I> = <img src="../images/bound.gif">(<I>m</I>)<I> </I>and <I>q</I> = <img src="../images/bound.gif">(<I>m</I>). Thus, on the average, each operation requires <img src="../images/bound.gif">(<I>m</I>) time. That is, the amortized time of an operation is <img src="../images/bound.gif">(<I>m</I>).<P>
<P>
<h2>A weighted-union heuristic</h2><P>
<a name="088f_16ef">The above implementation of the <FONT FACE="Courier New" SIZE=2>UNION</FONT> procedure requires an average of <img src="../images/bound.gif">(<I>m</I>) time per call because we may be appending a longer list onto a shorter list; we must update the pointer to the representative for each member of the longer list. Suppose instead that each representative also includes the length of the list (which is easily maintained) and that we always append the smaller list onto the longer, with ties broken arbitrarily. With this simple <I><B>weighted-union heuristic</I></B>, a single <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation can still take <img src="../images/omega12.gif">(<I>m</I>) time if both sets have <img src="../images/omega12.gif">(<I>m</I>) members. As the following theorem shows, however, a sequence of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, <I>n</I> of which are <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, takes <I>O</I>(<I>m</I> + <I>n</I> 1g <I>n</I>) time.<P>
<a name="088f_16f0">Theorem 22.1<a name="088f_16f0"><P>
Using the linked-list representation of disjoint sets and the weighted-union heuristic, a sequence of <I>m</I> <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, <I>n</I> of which are <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, takes <I>O</I>(<I>m</I> + <I>n</I> 1g <I>n</I>) time.<P>
<I><B>Proof </I></B>We start by computing, for each object in a set of size <I>n</I>, an upper bound on the number of times the object's pointer back to the representative has been updated. Consider a fixed object <I>x</I>. We know that each time <I>x</I>'s representative pointer was updated, <I>x</I> must have started in the smaller set. The first time <I>x</I>'s representative pointer was updated, therefore, the resulting set must have had at least 2 members. Similarly, the next time <I>x</I>'s representative pointer was updated, the resulting set must have had at least 4 members. Continuing on, we observe that for any <I>k</I> <img src="../images/lteq12.gif"> <I>n</I>, after <I>x</I>'s representative pointer has been updated <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrul14.gif"></FONT>1g <I>k<FONT FACE="Courier New" SIZE=2></I><img src="../images/hfbrur14.gif"></FONT><I> times, the resulting set must have at least </I>k<I> members. Since the largest set has at most </I>n<I> members, each object's representative pointer has been updated at most <FONT FACE="Courier New" SIZE=2><img src="../images/hfbrul14.gif"></FONT>1g </I>n<FONT FACE="Courier New" SIZE=2><I><img src="../images/hfbrur14.gif"></I></FONT> times over all the <FONT FACE="Courier New" SIZE=2>UNION</FONT> operations. The total time used in updating the <I>n</I> objects is thus <I>O</I>(<I>n</I> 1g <I>n</I>). <P>
The time for the entire sequence of <I>m</I> operations follows easily. Each <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operation takes <I>O</I>(1) time, and there are <I>O</I>(<I>m</I>) of them. The total time for the entire sequence is thus <I>O</I>(<I>m</I> + <I>n</I> 1g <I>n</I>). <P>
<P>
<h2><a name="0890_16f5">Exercises<a name="0890_16f5"></h2><P>
<a name="0890_16f6">22.2-1<a name="0890_16f6"><P>
<a name="0890_16f0"><a name="0890_16f1"><a name="0890_16f2"><a name="0890_16f3">Write pseudocode for <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, and <FONT FACE="Courier New" SIZE=2>UNION</FONT> using the linked-list representation and the weighted-union heuristic. Assume that each object <I>x</I> has attributes <I>rep</I>[<I>x</I>] pointing to the representative of the set containing <I>x</I>, <I>last</I>[<I>x</I>] pointing to the last object in the linked list containing <I>x</I>, and <I>size</I>[<I>x</I>] giving the size of the set containing <I>x</I>. Your pseudocode can assume that <I>last</I>[<I>x</I>] and <I>size</I>[<I>x</I>] are correct only if <I>x</I> is a representative.<P>
<a name="0890_16f7">22.2-2<a name="0890_16f7"><P>
Show the data structure that results and the answers returned by the <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations in the following program. Use the linked-list representation with the weighted-union heuristic.<P>
<pre>1 <B>for</B> <I>i</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> 16</sub></sup></pre><P>
<pre>2<B> do </B>MAKE-SET(<I>x<SUB>i</I></SUB>)</sub></sup></pre><P>
<pre>3 <B>for</B> <I>i</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> 15 <B>by</B> 2</sub></sup></pre><P>
<pre>4<B> do</B> UNION(<I>x<SUB>i</I></SUB>,<I>x<SUB>i</I>+1</SUB>)</sub></sup></pre><P>
<pre>5 <B>for</B> i <img src="../images/arrlt12.gif"> 1 <B>to</B> 13 <B>by</B> 4</sub></sup></pre><P>
<pre>6 <B>do</B> UNION(<I>x<SUB>i</I></SUB>,<I>x<SUB>i</I>+2</SUB>)</sub></sup></pre><P>
<pre>7 UNION(<I>x</I><SUB>1</SUB>,<I>x</I><SUB>5</SUB>)</sub></sup></pre><P>
<pre>8 UNION(<I>x</I><SUB>11</SUB><I>,x</I><SUB>13</SUB>)</sub></sup></pre><P>
<pre>9 UNION (<I>x</I><SUB>1</SUB>,<I>x</I><SUB>10</SUB>)</sub></sup></pre><P>
<pre>10 FIND-SET(<I>x</I><SUB>2</SUB>)</sub></sup></pre><P>
<pre>11 FIND-SET(<I>x</I><SUB>9</SUB>)</sub></sup></pre><P>
<a name="0890_16f8">22.2-3<a name="0890_16f8"><P>
<a name="0890_16f4">Argue on the basis of Theorem 22.1 that we can obtain amortized time bounds of <I>O</I>(1) for <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> and <I>O</I>(1g <I>n</I>) for UNION using the linked-list representation and the weighted-union heuristic.<P>
<a name="0890_16f9">22.2-4<a name="0890_16f9"><P>
Give a tight asymptotic bound on the running time of the sequence of operations in Figure 22.3 assuming the linked-list representation and the weighted-union heuristic.<P>
<P>
<P>
<h1><a name="0891_16fa">22.3 Disjoint-set forests<a name="0891_16fa"></h1><P>
<a name="0891_16f5"><a name="0891_16f6"><a name="0891_16f7">In a faster implementation of disjoint sets, we represent sets by rooted trees, with each node containing one member and each tree representing one set. In a <I><B>disjoint-set forest</I></B>, illustrated in Figure 22.4(a), each member points only to its parent. The root of each tree contains the representative and is its own parent. As we shall see, although the straightforward algorithms that use this representation are no faster than ones that use the linked-list representation, by introducing two heuristics--"union by rank" and "path compression"--we can achieve the asymptotically fastest disjoint-set data structure known.<P>
<img src="447_a.gif"><P>
<h4><a name="0891_16fb">Figure 22.4 A disjoint-set forest. (a) Two trees representing the two sets of Figure 22.2. The tree on the left represents the set {b, c, e, h}, with c as the representative, and the tree on the right represents the set {d, f, g}, with f as the representative. (b) The result of <FONT FACE="Courier New" SIZE=2>UNION<FONT FACE="Times New Roman" SIZE=2>(e, g).<a name="0891_16fb"></FONT></FONT></sub></sup></h4><P>
<a name="0891_16f8"><a name="0891_16f9">We perform the three disjoint-set operations as follows. A <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operation simply creates a tree with just one node. We perform a <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operation by chasing parent pointers until we find the root of the tree. The nodes visited on this path toward the root constitute the <I><B>find path</I></B>. A <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation, shown in Figure 22.4(b), causes the root of one tree to point to the root of the other.<P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -