📄 chap22.htm

📁 程序设计的艺术,此书是英文版的,经久不衰,非常的经典
💻 HTM
📖 第 1 页 / 共 5 页
字号:
12 3 4 5 下一页
<HTML><HEAD>

<TITLE>Intro to Algorithms: CHAPTER 22: DATA STRUCTURES FOR DISJOINT SETS</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">

<a href="partvi.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="chap21.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>


<h1><a name="0888_16df">CHAPTER 22: DATA STRUCTURES FOR DISJOINT SETS<a name="0888_16df"></h1><P>
<a name="0888_16de">Some applications involve grouping <I>n</I> distinct elements into a collection of disjoint sets. Two important operations are then finding which set a given element belongs to and uniting two sets. This chapter explores methods for maintaining a data structure that supports these operations.<P>
Section 22.1 describes the operations supported by a disjoint-set data structure and presents a simple application. In Section 22.2, we look at a simple linked-list implementation for disjoint sets. A more efficient representation using rooted trees is given in Section 22.3. The running time using the tree representation is linear for all practical purposes but is theoretically superlinear. Section 22.4 defines and discusses Ackermann's function and its very slowly growing inverse, which appears in the running time of operations on the tree-based implementation, and then uses amortized analysis to prove a slightly weaker upper bound on the running time.<P>





<h1><a name="088a_16e5">22.1 Disjoint-set operations<a name="088a_16e5"></h1><P>
<a name="088a_16df"><a name="088a_16e0"><a name="088a_16e1">A <I><B>disjoint-set data structure</I></B><I> </I>maintains a collection<I> S </I>= {<I>S</I><SUB>1</SUB><I>, S</I><SUB>2</SUB><I>, . . . ,S<SUB>k</I></SUB>} of disjoint dynamic sets. Each set is identified by a <I><B>representative,</I></B><I> </I>which is some member of the set. In some applications, it doesn't matter which member is used as the representative; we only care that if we ask for the representative of a dynamic set twice without modifying the set between the requests, we get the same answer both times. In other applications, there may be a prespecified rule for choosing the representative, such as choosing the smallest member in the set (assuming, of course, that the elements can be ordered).<P>
<a name="088a_16e2">As in the other dynamic-set implementations we have studied, each element of a set is represented by an object. Letting <I>x</I> denote an object, we wish to support the following operations.<P>
<a name="088a_16e3"><FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>(<I>x</I>)     creates a new set whose only member (and thus representative) is pointed to by <I>x</I>. Since the sets are disjoint, we require that <I>x </I>not already be in a set.<P>
<FONT FACE="Courier New" SIZE=2>UNION</FONT>(<I>x, y</I>)     unites the dynamic sets that contain <I>x</I> and <I>y</I>, say <I>S<SUB>x</I></SUB> and <I>S<SUB>y</I></SUB>, into a new set that is the union of these two sets. The two sets are assumed to be disjoint prior to the operation. The representative of the resulting set is some member of <I>S<SUB>x</SUB> </I><img src="../images/wideu.gif"> S<SUB>y<I></SUB>, although many implementations of <FONT FACE="Courier New" SIZE=2>UNION</FONT> choose the representative of either </I>S<SUB>x<I></SUB> or </I>S<SUB>y<I></SUB>, as the new representative. Since we require the sets in the collection to be disjoint, we &quot;destroy&quot; sets </I>S<SUB>x<I></SUB> and </I>S<SUB>y<I></SUB>, removing them from the collection </I>S<I>.</I><P>
<a name="088a_16e4"><FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>(<I>x</I>)     returns a pointer to the representative of the (unique) set containing <I>x</I>.<P>
Throughout this chapter, we shall analyze the running times of disjoint-set data structures in terms of two parameters: <I>n</I>, the number of <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations, and <I>m</I>, the total number of <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT>, <FONT FACE="Courier New" SIZE=2>UNION</FONT>, and <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations. Since the sets are disjoint, each <FONT FACE="Courier New" SIZE=2>UNION</FONT> operation reduces the number of sets by one. After <I>n</I> - 1 <FONT FACE="Courier New" SIZE=2>UNION</FONT> operations, therefore, only one set remains. The number of <FONT FACE="Courier New" SIZE=2>UNION</FONT> operations is thus at most <I>n</I> - 1. Note also that since the <FONT FACE="Courier New" SIZE=2>MAKE</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> operations are included in the total number of operations <I>m</I>, we have <I>m </I><img src="../images/gteq.gif"><I> n</I>.<P>





<h2>An application of disjoint-set data structures</h2><P>
<a name="088b_16e5">One of the many applications of disjoint-set data structures arises in determining the connected components of an undirected graph (see Section 5.4). Figure 22.1(a), for example, shows a graph with four connected components.<P>
<a name="088b_16e6">The procedure <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> that follows uses the disjoint-set operations to compute the connected components of a graph. Once <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> has been run as a preprocessing step, the procedure <FONT FACE="Courier New" SIZE=2>SAME</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENT</FONT> answers queries about whether two vertices are in the same connected component.<SUP>1 </SUP>(The set of vertices of a graph <I>G </I>is denoted by <I>V</I>[<I>G</I>], and the set of edges is denoted by <I>E</I>[<I>G</I>].)<P>
<SUP>1</SUP>When the edges of the graph are &quot;static&quot;--not changing over time--the connected components can be computed faster by using depth-first search (Exercise 23.3-9). Sometimes, however, the edges are added &quot;dynamically&quot; and we need to maintain the connected components as each edge is added. In this case, the implementation given here can be more efficient than running a new depth-first search for each new edge.<P>
<pre>CONNECTED-COMPONENTS(<I>G</I>)</sub></sup></pre><P>
<pre>1  <B>for</B> each vertex <I>v</I> <img src="../images/memof12.gif"> <I>V</I>[<I>G</I>]</sub></sup></pre><P>
<pre>2       <B>do</B> MAKE-SET(<I>v</I>)</sub></sup></pre><P>
<pre>3  <B>for</B> each edge (<I>u</I>,<I>v</I>) <img src="../images/memof12.gif"> <I>E</I>[<I>G</I>]</sub></sup></pre><P>
<pre>4       <B>do</B> <B>if</B> FIND-SET(<I>u</I>)<I> </I><img src="../images/noteq.gif"> <I>FIND-SET(</I>v<I>)</I></sub></sup></pre><P>
<pre>5             <B>then</B> UNION(<I>u,v</I>)</sub></sup></pre><P>
<img src="442_a.gif"><P>
<pre>Edge processed                    Collection of disjoint sets</sub></sup></pre><P>
<pre>------------------------------------------------------------------------------</sub></sup></pre><P>
<pre> initial sets<I>   </I>{<I>a</I>}        {<I>b</I>}    {<I>c</I>}  {<I>d</I>}  {<I>e</I>}      {<I>f</I>}  {<I>g</I>}  {<I>h</I>}    {<I>i</I>}  {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>b,d</I>)       {<I>a</I>}        {<I>b,d</I>}  {<I>c</I>}       {<I>e</I>}      {<I>f</I>}  {<I>g</I>}  {<I>h</I>}    {<I>i</I>}  {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>e,g</I>)       {<I>a</I>}        {<I>b,d</I>}  {<I>c</I>}       {<I>e,g</I>}    {<I>f</I>}       {<I>h</I>}    {<I>i</I>}  {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>a,c</I>)       {<I>a,c</I>}      {<I>b,d</I>}            {<I>e,g</I>}    {<I>f</I>}       {<I>h</I>}    {<I>i</I>}  {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>h,i</I>)       {<I>a,c</I>}      {<I>b,d</I>}            {<I>e,g</I>}    {<I>f</I>}       {<I>h,i</I>}       {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>a,b</I>)       {<I>a,b,c,d</I>}                   {<I>e,g</I>}    {<I>f</I>}       {<I>h,i</I>}       {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>e,f</I>)       {<I>a,b,c,d</I>}                   {<I>e,f,g</I>}            {<I>h,i</I>}       {<I>j</I>}</sub></sup></pre><P>
<pre>    (<I>b,c</I>)       {<I>a,b,c,d</I>}                   {<I>e,f,g</I>}            {<I>h,i</I>}       {<I>j</I>}</sub></sup></pre><P>
<pre>                                     (b)</sub></sup></pre><P>
<h4><a name="088b_16e8">Figure 22.1 (a) A graph with four connected components: {a, b, c, d}, {e, f, g}, {h, i}, and {j}. (b) The collection of disjoint sets after each edge is processed.<a name="088b_16e8"></sub></sup></h4><P>
<pre><a name="088b_16e7">SAME-COMPONENT(<I>u</I>,<I>v</I>)</sub></sup></pre><P>
<pre>1  <B>if</B> FIND-SET(<I>u</I>) = FIND SET(<I>v</I>)</sub></sup></pre><P>
<pre>2     <B>then return</B> TRUE</sub></sup></pre><P>
<pre>3     <B>else return</B> FALSE</sub></sup></pre><P>
The procedure <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> initially places each vertex <I>v </I>in its own set. Then, for each edge (<I>u, v</I>), it unites the sets containing <I>u </I>and <I>v</I>. By Exercise 22.1-2, after all the edges are processed, two vertices are in the same connected component if and only if the corresponding objects are in the same set. Thus, <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> computes sets in such a way that the procedure <FONT FACE="Courier New" SIZE=2>SAME</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENT</FONT> can determine whether two vertices are in the same connected component. Figure 22.1 (b) illustrates how the disjoint sets are computed by <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT>.<P>
<P>







<h2><a name="088c_0001">Exercises<a name="088c_0001"></h2><P>
<a name="088c_0002">22.1-1<a name="088c_0002"><P>
Suppose that <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> is run on the undirected graph <I>G </I>= (<I>V, E</I>), where <I>V</I> = {<I>a, b, c, d, e, f, g, h, i, j, k</I>} and the edges of <I>E </I>are processed in the following order: (<I>d, i</I>), (<I>f, k</I>), (<I>g, i</I>), (<I>b, g</I>), (<I>a, h</I>), (<I>i, j</I>), (<I>d, k</I>), (<I>b, j</I>), (<I>d, f</I>), <I>(g, j</I>), (<I>a, e</I>), (<I>i, d</I>). List the vertices in each connected component after each iteration of lines 3-5.<P>
<a name="088c_0003">22.1-2<a name="088c_0003"><P>
Show that after all edges are processed by <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT>, two vertices are in the same connected component if and only if they are in the same set.<P>
<a name="088c_0004">22.1-3<a name="088c_0004"><P>
During the execution of <FONT FACE="Courier New" SIZE=2>CONNECTED</FONT>-<FONT FACE="Courier New" SIZE=2>COMPONENTS</FONT> on an undirected graph <I>G</I> = (<I>V, E</I>) with <I>k</I> connected components, how many times is <FONT FACE="Courier New" SIZE=2>FIND</FONT>-<FONT FACE="Courier New" SIZE=2>SET</FONT> called? How many times is <FONT FACE="Courier New" SIZE=2>UNION</FONT> called? Express your answers in terms of |<I>V</I>|, |<I>E</I>|, and <I>k</I>.<P>
<P>


<P>
12 3 4 5 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -