📄 chap17.htm
字号:
<HTML><HEAD>
<TITLE>Intro to Algorithms: CHAPTER 17: GREEDY ALGORITHMS</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<a href="chap18.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="chap16.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>
<h1><a name="0825_1576">CHAPTER 17: GREEDY ALGORITHMS<a name="0825_1576"></h1><P>
<a name="0825_1574"><a name="0825_1575">Algorithms for optimization problems typically go through a sequence of steps, with a set of choices at each step. For many optimization problems, using dynamic programming to determine the best choices is overkill; simpler, more efficient algorithms will do. A <I><B>greedy algorithm</I></B> always makes the choice that looks best at the moment. That is, it makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution. This chapter explores optimization problems that are solvable by greedy algorithms.<P>
Greedy algorithms do not always yield optimal solutions, but for many problems they do. We shall first examine in Section 17.1 a simple but nontrivial problem, the activity-selection problem, for which a greedy algorithm efficiently computes a solution. Next, Section 17.2 reviews some of the basic elements of the greedy approach. Section 17.3 presents an important application of greedy techniques: the design of data-compression (Huffman) codes. In Section 17.4, we investigate some of the theory underlying combinatorial structures called "matroids" for which a greedy algorithm always produces an optimal solution. Finally, Section 17.5 illustrates the application of matroids using the problem of scheduling unit-time tasks with deadlines and penalties.<P>
The greedy method is quite powerful and works well for a wide range of problems. Later chapters will present many algorithms that can be viewed as applications of the greedy method, including minimum-spanning-tree algorithms (Chapter 24), Dijkstra<FONT FACE="CG Times (W1)" SIZE=2>'</FONT>s algorithm for shortest paths form a single source (Chapter 25), and Chvátal's greedy set-covering heuristic (Chapter 37). Minimum spanning trees form a classic example of the greedy method. Although this chapter and Chapter 24 can be read independently of each other, you may find it useful to read them together.<P>
<h1><a name="0827_1578">17.1 An activity-selection problem<a name="0827_1578"></h1><P>
<a name="0827_1576">Our first example is the problem of scheduling a resource among several competing activities. We shall find that a greedy algorithm provides an elegant and simple method for selecting a maximum-size set of mutually compatible activities.<P>
Suppose we have a set <I>S</I> = { 1, 2, . . . , <I>n</I>} of <I>n</I> proposed <I><B>activities</I></B> that wish to use a resource, such as a lecture hall, which can be used by only one activity at a time. Each activity <I>i</I> has a<I><B> start time</I></B> <I>s<SUB>i </I></SUB>and a <I><B>finish time</I></B> â<I><SUB>i</I></SUB>, where <I>s<SUB>i</I></SUB> <img src="../images/lteq12.gif"> â<I><SUB>i</I></SUB>. If selected, activity <I>i</I> takes place during the half-open time interval [<I>s<SUB>i</I></SUB>,â<I><SUB>i</I></SUB>). Activities <I>i</I> and <I>j</I> are <I><B>compatible</I></B> if the intervals [<I>s<SUB>i</I></SUB>, â<I><SUB>i</I></SUB>) and [<I>s<SUB>j</I></SUB>,â<I><SUB>j</I></SUB>) do not overlap (i.e., <I>i</I> and <I>j</I> are compatible if <I>s<SUB>i</I></SUB> <img src="../images/gteq.gif"> â<I><SUB>j</I></SUB> or <I>s<SUB>j</I></SUB> <img src="../images/gteq.gif"> â<I><SUB>i</I></SUB>). The <I><B>activity-selection problem</I></B> is to select a maximum-size set of mutually compatible activities.<P>
A greedy algorithm for the activity-selection problem is given in the following pseudocode. We assume that the input activities are in order by increasing finishing time:<P>
<pre>â<SUB>1</SUB> <img src="../images/lteq12.gif"> â<SUB>2</SUB> <img src="../images/lteq12.gif"> . . . <img src="../images/lteq12.gif"> â<I><SUB>n</I> .</sub></sup></pre><P>
<h4><a name="0827_1579">(17.1)<a name="0827_1579"></sub></sup></h4><P>
If not, we can sort them into this order in time <I>O</I>(<I>n </I>1g <I>n</I>), breaking ties arbitrarily. The pseudocode assumes that inputs <I>s</I> and â are represented as arrays.<P>
<pre><a name="0827_1577">GREEDY-ACTIVITY-SELECTOR(<I>s, f</I>)</sub></sup></pre><P>
<pre>1 <I>n</I> <img src="../images/arrlt12.gif"> <I>length</I>[<I>s</I>]</sub></sup></pre><P>
<pre>2 <I>A</I> <img src="../images/arrlt12.gif"> {1}</sub></sup></pre><P>
<pre>3 <I>j</I> <img src="../images/arrlt12.gif"> 1</sub></sup></pre><P>
<pre>4 <B>for</B> <I>i</I> <img src="../images/arrlt12.gif"> 2 <B>to</B> <I>n</I></sub></sup></pre><P>
<pre>5 <B>do</B> <B>if</B> <I>s<SUB>i</I></SUB> <img src="../images/gteq.gif"> â<I><SUB>j</I></sub></sup></pre><P>
<pre>6 <B>then</B> <I>A</I> <img src="../images/arrlt12.gif"> <I>A</I> <img src="../images/wideu.gif"> {<I>i</I>}</sub></sup></pre><P>
<pre>7 <I>j</I> <img src="../images/arrlt12.gif"> <I>i</I></sub></sup></pre><P>
<pre>8 <B>return</B> <I>A</I></sub></sup></pre><P>
The operation of the algorithm is shown in Figure 17.1. The set <I>A</I> collects the selected activities. The variable <I>j</I> specifies the most recent addition to <I>A</I>. Since the activities are considered in order of nondecreasing finishing time, <I>f<SUB>j</I></SUB> is always the maximum finishing time of any activity in <I>A</I>. That is,<P>
<pre>â<I><SUB>j</I></SUB> = max{<I>f<SUB>k</I></SUB> : <I>K</I> <img src="../images/memof12.gif"> <I>A</I>}.</sub></sup></pre><P>
<h4><a name="0827_157a">(17.2)<a name="0827_157a"></sub></sup></h4><P>
Lines 2-3 select activity 1, initialize <I>A</I> to contain just this activity, and initialize <I>j</I> to this activity. Lines 4-7 consider each activity <I>i</I> in turn and add <I>i</I> to <I>A</I> if it is compatible with all previously selected activities. To see if activity <I>i</I> is compatible with every activity currently in <I>A</I>, it suffices by equation (17.2) to check (line 5) that its start time <I>s<SUB>i</I></SUB> is not earlier than the finish time <I>f<SUB>j</I></SUB> of the activity most recently added to <I>A</I>. If activity <I>i</I> is compatible, then lines 6-7 add it to <I>A</I> and update <I>j</I>. The <FONT FACE="Courier New" SIZE=2>GREEDY</FONT>-<FONT FACE="Courier New" SIZE=2>ACTIVITY</FONT>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT> procedure is quite efficient. It can schedule a set <I>S</I> of <I>n</I> activities in <img src="../images/bound.gif">(<I>n</I>) time, assuming that the activities were already sorted initially by their finish times.<P>
<img src="331_a.gif"><P>
<h4><a name="0827_157b">Figure 17.1 The operation of <FONT FACE="Courier New" SIZE=2>GREEDY<FONT FACE="Times New Roman" SIZE=2>-<FONT FACE="Courier New" SIZE=2>ACTIVITY<FONT FACE="Times New Roman" SIZE=2>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT></FONT></FONT></FONT></FONT> on 11 activities given at the left. Each row of the figure corresponds to an iteration of the for loop in lines 4-7. The activities that have been selected to be in set A are shaded, and activity i, shown in white, is being considered. If the starting time s<SUB>i</SUB> of activity i occurs before the finishing time f<SUB>j</SUB> of the most recently selected activity j (the arrow between them points left), it is rejected. Otherwise (the arrow points directly up or to the right), it is accepted and put into set A.<a name="0827_157b"></sub></sup></h4><P>
The activity picked next by <FONT FACE="Courier New" SIZE=2>GREEDY</FONT>-<FONT FACE="Courier New" SIZE=2>ACTIVITY</FONT>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT> is always the one with the earliest finish time that can be legally seheduled. The activity picked is thus a "greedy" choice in the sense that, intuitively, it leaves as much opportunity as possible for the remaining activities to be scheduled. That is, the greedy choice is the one that maximizes the amount of unscheduled time remaining.<P>
<h2>Proving the greedy algorithm correct</h2><P>
Greedy algorithms do not always produce optimal solutions. However, <FONT FACE="Courier New" SIZE=2>GREEDY</FONT>-<FONT FACE="Courier New" SIZE=2>ACTIVITY</FONT>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT> always finds an optimal solution to an instance of the activity-selection problem.<P>
<a name="0828_0001">Theorem 17.1<a name="0828_0001"><P>
Algorithm <FONT FACE="Courier New" SIZE=2>GREEDY</FONT>-<FONT FACE="Courier New" SIZE=2>ACTIVITY</FONT>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT> produces solutions of maximum size for the activity-selection problem.<P>
<I><B>Proof </I></B>Let <I>S</I> = {1, 2, . . . , <I>n</I>} be the set of activities to schedule. Since we are assuming that the activities are in order by finish time, activity 1 has the earliest finish time. We wish to show that there is an optimal solution that begins with a greedy choice, that is, with activity 1.<P>
Suppose that <I>A</I> <img src="../images/rgtubar.gif"> <I>S</I> is an optimal solution to the given instance of the activity-selection problem, and let us order the activities in <I>A</I> by finish time. Suppose further that the first activity in <I>A</I> is activity <I>k</I>. If <I>k</I> = 1, then schedule <I>A</I> begins with a greedy choice. If <I>k</I> <img src="../images/noteq.gif"> 1, we want to show that there is another optimal solution <I>B</I> to <I>S</I> that begins with the greedy choice, activity 1. Let <I>B</I> = <I>A</I> - {<I>k</I>} <img src="../images/wideu.gif"> {1}. Because <I>f<SUB>i</I></SUB> <img src="../images/lteq12.gif"> <I>f<SUB>k</I></SUB>, the activities in <I>B</I> are disjoint, and since <I>B</I> has the same number of activities as <I>A</I>, it is also optimal. Thus, <I>B</I> is an optimal solution for <I>S</I> that contains the greedy choice of activity 1. Therefore, we have shown that there always exists an optimal schedule that begins with a greedy choice.<P>
Moreover, once the greedy choice of activity 1 is made, the problem reduces to finding an optimal solution for the activity-selection problem over those activities in <I>S</I> that are compatible with activity 1. That is, if <I>A</I> is an optimal solution to the original problem <I>S</I>, then <I>A</I>' = <I>A</I> - {1} is an optimal solution to the activity-selection problem <I>S</I>'<I> = {</I>i<I> <img src="../images/memof12.gif"></I> <I>S</I>: <I><FONT FACE="Courier New" SIZE=2>S<SUB>i</I></FONT></SUB> <img src="../images/gteq.gif"> <I><FONT FACE="Courier New" SIZE=2>f<SUB>1</I></FONT></SUB>}. Why? If we could find a solution <I>B</I>'<I> to </I>S<I>'</I> with more activities than <I>A</I>'<I>, adding activity 1 to </I>B<I>'</I> would yield a solution <I>B</I> to <I>S</I> with more activities than <I>A</I>, thereby contradicting the optimality of <I>A</I>. Therefore, after each greedy choice is made, we are left with an optimization problem of the same form as the original problem. By induction on the number of choices made, making the greedy choice at every step produces an optimal solution. <P>
<P>
<h2><a name="0829_157b">Exercises<a name="0829_157b"></h2><P>
<a name="0829_157c">17.1-1<a name="0829_157c"><P>
<a name="0829_1578">Give a dynamic-programming algorithm for the activity-selection problem, based on computing <I>m<SUB>i</I></SUB> iteratively for <I>i</I> = 1, 2, . . . , <I>n</I>, where <I>m<SUB>i</I></SUB> is the size of the largest set of mutually compatible activities among activities {1, 2, . . . , <I>i</I>}. Assume that the inputs have been sorted as in equation (17.1). Compare the running time of your solution to the running time of <FONT FACE="Courier New" SIZE=2>GREEDY</FONT>-<FONT FACE="Courier New" SIZE=2>ACTIVITY</FONT>-<FONT FACE="Courier New" SIZE=2>SELECTOR</FONT>.<P>
<a name="0829_157d">17.1-2<a name="0829_157d"><P>
<a name="0829_1579">Suppose that we have a set of activities to schedule among a large number of lecture halls. We wish to schedule all the activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall.<P>
<a name="0829_157a">(This is also known as the <I><B>interval-graph coloring problem.</I></B> We can create an interval graph whose vertices are the given activities and whose edges connect incompatible activities. The smallest number of colors required to color every vertex so that no two adjacent vertices are given the same color corresponds to finding the fewest lecture halls needed to schedule all of the given activities.)<P>
<a name="0829_157e">17.1-3<a name="0829_157e"><P>
Not just any greedy approach to the activity-selection problem produces a maximum-size set of mutually compatible activities. Give an example to show that the approach of selecting the activity of least duration from those that are compatible with previously selected activities does not work. Do the same for the approach of always selecting the activity that overlaps the fewest other remaining activities.<P>
<P>
<P>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -