<HTML><HEAD>
<TITLE>Intro to Algorithms: CHAPTER 37: APPROXIMATION ALGORITHMS</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">
<a href="biblio.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="chap36.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>
<h1><a name="0a17_1d26">CHAPTER 37: APPROXIMATION ALGORITHMS<a name="0a17_1d26"></h1><P>
<a name="0a17_1d1c"><a name="0a17_1d1d">Many problems of practical significance are NP-complete but are too important to abandon merely because obtaining an optimal solution is intractable. If a problem is NP-complete, we are unlikely to find a polynomial-time algorithm for solving it exactly, but this does not imply that all hope is lost. There are two approaches to getting around NP-completeness. First, if the actual inputs are small, an algorithm with exponential running time may be perfectly satisfactory. Second, it may still be possible to find <I>near-optimal</I> solutions in polynomial time (either in the worst case or on the average). In practice, near-optimality is often good enough. An algorithm that returns near-optimal solutions is called an <I><B>approximation algorithm</I></B>. This chapter presents polynomial-time approximation algorithms for several NP-complete problems.<P>
<h2>Performance bounds for approximation algorithms</h2><P>
Assume that we are working on an optimization problem in which each potential solution has a positive cost, and that we wish to find a near-optimal solution. Depending on the problem, an optimal solution may be defined as one with maximum possible cost or one with minimum possible cost; the problem may be a maximization or a minimization problem.<P>
<a name="0a17_1d1e"><a name="0a17_1d1f">We say that an approximation algorithm for the problem has a <I><B>ratio bound </I></B>of <img src="../images/rho12.gif">(<I>n</I>) if for any input of size <I>n</I>, the cost <I>C</I> of the solution produced by the approximation algorithm is within a factor of <img src="../images/rho12.gif">(<I>n</I>) of the cost <I>C</I>* of an optimal solution:<P>
<img src="964_a.gif"><P>
<h4><a name="0a17_1d27">(37.1)<a name="0a17_1d27"></h4><P>
This definition applies for both minimization and maximization problems. For a maximization problem, 0 < <I>C</I> <img src="../images/lteq12.gif"> <I>C</I>*, and the ratio <I>C</I>*/<I>C</I> gives the factor by which the cost of an optimal solution is larger than the cost of the approximate solution. Similarly, for a minimization problem, 0 < <I>C</I>* <img src="../images/lteq12.gif"> <I>C</I>, and the ratio <I>C</I>/<I>C</I>* gives the factor by which the cost of the approximate solution is larger than the cost of an optimal solution. Since all solutions are assumed to have positive cost, these ratios are always well defined. The ratio bound of an approximation algorithm is never less than 1, since <I>C/C</I>*<I> < </I>1<I> </I>implies<I> C</I>*<I>/C > </I>1<I>.</I> An optimal algorithm has ratio bound 1, and an approximation algorithm with a large ratio bound may return a solution that is very much worse than optimal.<P>
<a name="0a17_1d20"><a name="0a17_1d21"><a name="0a17_1d22">Sometimes, it is more convenient to work with a measure of relative error. For any input, the <I><B>relative error</I></B> of the approximation algorithm is defined to be<P>
<img src="965_a.gif"><P>
where, as before, <I>C</I>* is the cost of an optimal solution and <I>C</I> is the cost of the solution produced by the approximation algorithm. The relative error is always nonnegative. An approximation algorithm has a <I><B>relative</I></B> <I><B>error</I></B> <I><B>bound</I></B> of <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT>(<I>n</I>) if<P>
<img src="965_b.gif"><P>
<h4><a name="0a17_1d28">(37.2)<a name="0a17_1d28"></h4><P>
It follows from the definitions that the relative error bound can be bounded as a function of the ratio bound:<P>
<pre><img src="../images/memof12.gif">(<I>n</I>) <img src="../images/lteq12.gif"> <img src="../images/rho12.gif">(<I>n</I>) - 1.</pre><P>
<h4><a name="0a17_1d29">(37.3)<a name="0a17_1d29"></h4><P>
(For a minimization problem, this is an equality, whereas for a maximization problem, we have <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"> </FONT>(<I>n</I>) = (<img src="../images/rho12.gif">(<I>n</I>) - 1) / <img src="../images/rho12.gif">(<I>n</I>), which satisfies inequality (37.3) since <img src="../images/rho12.gif">(<I>n</I>) <img src="../images/gteq.gif"> 1.)<P>
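These relationships can be checked with a small numeric sketch. The costs below are hypothetical, chosen only to illustrate the definitions; the functions mirror inequality (37.1) and the definition of relative error.

```python
def ratio(c, c_star):
    """Factor by which the approximate cost deviates from optimal:
    max(C/C*, C*/C), as in inequality (37.1)."""
    return max(c / c_star, c_star / c)

def relative_error(c, c_star):
    """|C - C*| / C*, the relative error defined before (37.2)."""
    return abs(c - c_star) / c_star

# Minimization: C* = 10, C = 15.  Here rho = C/C* = 1.5 and the
# relative error equals rho - 1, the equality case of (37.3).
assert ratio(15, 10) == 1.5
assert relative_error(15, 10) == 1.5 - 1

# Maximization: C* = 15, C = 10.  Here rho = C*/C = 1.5 and the
# relative error is (rho - 1)/rho, which satisfies (37.3).
assert relative_error(10, 15) == (1.5 - 1) / 1.5
```

The maximization case shows why (37.3) is a strict inequality there: the error (ρ - 1)/ρ is always below ρ - 1 when ρ > 1.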
For many problems, approximation algorithms have been developed that have a fixed ratio bound, independent of <I>n</I>. For such problems, we simply use the notation <img src="../images/rho12.gif"> or <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT>, indicating no dependence on <I>n</I>.<P>
For other problems, computer scientists have been unable to devise any polynomial-time approximation algorithm having a fixed ratio bound. For such problems, the best that can be done is to let the ratio bound grow as a function of the input size <I>n</I>. An example of such a problem is the set-cover problem presented in Section 37.3.<P>
Some NP-complete problems allow approximation algorithms that can achieve increasingly smaller ratio bounds (or, equivalently, increasingly smaller relative error bounds) by using more and more computation time. That is, there is a trade-off between computation time and the quality of the approximation. An example is the subset-sum problem studied in Section 37.4. This situation is important enough to deserve a name of its own.<P>
<a name="0a17_1d23"><a name="0a17_1d24">An <I><B>approximation scheme</I></B> for an optimization problem is an approximation algorithm that takes as input not only an instance of the problem, but also a value <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> > 0 such that for any fixed <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT>, the scheme is an approximation algorithm with relative error bound <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT>. We say that an approximation scheme is a <I><B>polynomial-time approximation scheme</I></B> if for any fixed <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> > 0, the scheme runs in time polynomial in the size <I>n</I> of its input instance.<P>
The running time of a polynomial-time approximation scheme should not increase too rapidly as <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> decreases. Ideally, if <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> decreases by a constant factor, the running time to achieve the desired approximation should not increase by more than a constant factor. In other words, we would like the running time to be polynomial in 1/<FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> as well as in <I>n</I>.<P>
<a name="0a17_1d25">We say that an approximation scheme is a <I><B>fully polynomial-time approximation scheme</I></B> if its running time is polynomial both in 1/<FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> and in the size <I>n</I> of the input instance, where <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> is the relative error bound for the scheme. For example, the scheme might have a running time of (1/<FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT>)<SUP>2</SUP><I>n</I><SUP>3</SUP>. With such a scheme, any constant-factor decrease in <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> can be achieved with a corresponding constant-factor increase in the running time.<P>
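The arithmetic behind this claim can be spelled out for the hypothetical running time (1/<i>&#949;</i>)<sup>2</sup><i>n</i><sup>3</sup> given above (the &#949; values here are chosen as exact binary fractions so the comparison is exact):

```python
def running_time(eps, n):
    """The text's example FPTAS running time, (1/eps)**2 * n**3."""
    return (1 / eps) ** 2 * n ** 3

n = 100
# Halving eps (a constant-factor decrease in the error bound)
# multiplies the running time by exactly 2**2 = 4 -- a constant
# factor that does not depend on n.
assert running_time(0.125, n) == 4 * running_time(0.25, n)
```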
<h2>Chapter outline</h2><P>
The first three sections of this chapter present some examples of polynomial-time approximation algorithms for NP-complete problems, and the last section presents a fully polynomial-time approximation scheme. Section 37.1 begins with a study of the vertex-cover problem, an NP-complete minimization problem that has an approximation algorithm with a ratio bound of 2. Section 37.2 presents an approximation algorithm with ratio bound 2 for the case of the traveling-salesman problem in which the cost function satisfies the triangle inequality. It also shows that without the triangle inequality, an approximation algorithm with a constant ratio bound cannot exist unless P = NP. In Section 37.3, we show how a greedy method can be used as an effective approximation algorithm for the set-covering problem, obtaining a covering whose cost is at worst a logarithmic factor larger than the optimal cost. Finally, Section 37.4 presents a fully polynomial-time approximation scheme for the subset-sum problem.<P>
<h1><a name="0a19_1d2e">37.1 The vertex-cover problem<a name="0a19_1d2e"></h1><P>
<a name="0a19_1d26"><a name="0a19_1d27"><a name="0a19_1d28"><a name="0a19_1d29">The vertex-cover problem was defined and proved NP-complete in Section 36.5.2. A <I><B>vertex cover</I></B> of an undirected graph <I>G</I> = (<I>V</I>,<I>E</I>) is a subset <I>V</I>' <img src="../images/rgtubar.gif"> <I>V</I> such that if (<I>u</I>,<I>v</I>) is an edge of <I>G</I>, then either <I>u</I> <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> <I>V</I>' or <I>v</I> <FONT FACE="Courier New" SIZE=2><img src="../images/memof12.gif"></FONT> <I>V</I>' (or both). The size of a vertex cover is the number of vertices in it.<P>
<a name="0a19_1d2a"><a name="0a19_1d2b"><a name="0a19_1d2c">The <I><B>vertex-cover problem</I></B> is to find a vertex cover of minimum size in a given undirected graph. We call such a vertex cover an <I><B>optimal vertex</I></B> <I><B>cover</I></B>. This problem is NP-hard, since the related decision problem is NP-complete, by Theorem 36.12.<P>
Even though it may be difficult to find an optimal vertex cover in a graph <I>G</I>, it is not too hard to find a vertex cover that is near-optimal. The following approximation algorithm takes as input an undirected graph <I>G</I> and returns a vertex cover whose size is guaranteed to be no more than twice the size of an optimal vertex cover.<P>
<img src="967_a.gif"><P>
<h4><a name="0a19_1d2f">Figure 37.1 The operation of <FONT FACE="Courier New" SIZE=2>APPROX-VERTEX-COVER</FONT>. (a) The input graph G, which has 7 vertices and 8 edges. (b) The edge (b, c), shown heavy, is the first edge chosen by <FONT FACE="Courier New" SIZE=2>APPROX-VERTEX-COVER</FONT>. Vertices b and c, shown lightly shaded, are added to the set C containing the vertex cover being created. Edges (a, b), (c, e), and (c, d), shown dashed, are removed since they are now covered by some vertex in C. (c) Edge (e, f) is chosen; vertices e and f are added to C. (d) Edge (d, g) is chosen; vertices d and g are added to C. (e) The set C, which is the vertex cover produced by <FONT FACE="Courier New" SIZE=2>APPROX-VERTEX-COVER</FONT>, contains the six vertices b, c, d, e, f, g. (f) The optimal vertex cover for this problem contains only three vertices: b, d, and e.<a name="0a19_1d2f"></h4><P>
<img src="967_b.gif"><P>
<a name="0a19_1d2d">Figure 37.1 illustrates the operation of <FONT FACE="Courier New" SIZE=2>APPROX</FONT>-<FONT FACE="Courier New" SIZE=2>VERTEX</FONT>-<FONT FACE="Courier New" SIZE=2>COVER</FONT>. The variable <I>C</I> contains the vertex cover being constructed. Line 1 initializes <I>C</I> to the empty set. Line 2 sets <I>E</I>' to be a copy of the edge set <I>E</I>[<I>G</I>] of the graph. The loop on lines 3-6 repeatedly picks an edge (<I>u</I>, <I>v</I>) from <I>E</I>', adds its endpoints <I>u</I> and <I>v</I> to <I>C</I>, and deletes all edges in <I>E</I>'<I> </I>that are covered by either <I>u</I> or <I>v</I>. The running time of this algorithm is <I>O</I>(<I>E</I>), using an appropriate data structure for representing <I>E</I>'.<P>
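The loop just described can be sketched in Python. The comments refer to the pseudocode line numbers from the text; the edge set used in the example is a plausible reconstruction of the Figure 37.1 graph (the figure itself is an image here, so the exact edges are an assumption), and since the pick in line 4 is arbitrary, the cover produced may differ from the figure's run.

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: repeatedly pick an arbitrary
    remaining edge, add both endpoints to the cover, and delete every
    edge they cover."""
    cover = set()                        # line 1: C, the cover being built
    remaining = list(edges)              # line 2: E', a copy of E[G]
    while remaining:                     # line 3
        u, v = remaining[0]              # line 4: pick an edge (u, v)
        cover.update((u, v))             # line 5: add u and v to C
        remaining = [(x, y) for (x, y) in remaining
                     if x not in (u, v) and y not in (u, v)]  # line 6
    return cover

# Assumed edge set for the 7-vertex, 8-edge graph of Figure 37.1.
edges = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('c', 'e'),
         ('d', 'e'), ('d', 'f'), ('d', 'g'), ('e', 'f')]
cover = approx_vertex_cover(edges)

# Every edge has an endpoint in the cover, and the cover has at most
# twice the size of the optimal cover {b, d, e}.
assert all(u in cover or v in cover for (u, v) in edges)
assert len(cover) <= 6
```

Rebuilding the remaining-edge list each iteration keeps the sketch simple; the <I>O</I>(<I>E</I>) bound mentioned in the text requires a more careful incidence-list representation of <I>E</I>'.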
<a name="0a19_1d30">Theorem 37.1<a name="0a19_1d30"><P>
<FONT FACE="Courier New" SIZE=2>APPROX-VERTEX-COVER</FONT> has a ratio bound of 2.<P>
<I><B>Proof </I></B>The set <I>C </I>of vertices that is returned by <FONT FACE="Courier New" SIZE=2>APPROX</FONT>-<FONT FACE="Courier New" SIZE=2>VERTEX</FONT>-<FONT FACE="Courier New" SIZE=2>COVER</FONT> is a vertex cover, since the algorithm loops until every edge in <I>E</I>[<I>G</I>] has been covered by some vertex in<I> C</I>.<P>
To see that <FONT FACE="Courier New" SIZE=2>APPROX</FONT>-<FONT FACE="Courier New" SIZE=2>VERTEX</FONT>-<FONT FACE="Courier New" SIZE=2>COVER</FONT> returns a vertex cover that is at most twice the size of an optimal cover, let <I>A</I> denote the set of edges that were picked in line 4 of <FONT FACE="Courier New" SIZE=2>APPROX</FONT>-<FONT FACE="Courier New" SIZE=2>VERTEX</FONT>-<FONT FACE="Courier New" SIZE=2>COVER</FONT>. No two edges in <I>A</I> share an endpoint, since once an edge is picked in line 4, all other edges that are incident on its endpoints are deleted from <I>E</I>' in line 6. Therefore, each execution of line 5 adds two new vertices to <I>C</I>, and |<I>C</I>| = 2 |<I>A</I>|. In order to cover the edges in <I>A</I>, however, any vertex cover--in particular, an optimal cover <I>C</I>*--must include at least one endpoint of each edge in <I>A</I>. Since no two edges in <I>A</I> share an endpoint, no vertex in the cover is incident on more than one edge in <I>A</I>. Therefore, |<I>A</I>| <img src="../images/lteq12.gif"> |<I>C</I>*|, and |<I>C</I>| <img src="../images/lteq12.gif"> 2 |<I>C</I>*|, proving the theorem. <P>
<h2><a name="0a1a_1d33">Exercises<a name="0a1a_1d33"></h2><P>
<a name="0a1a_1d34">37.1-1<a name="0a1a_1d34"><P>
<a name="0a1a_1d2e">Give an example of a graph for which <FONT FACE="Courier New" SIZE=2>APPROX</FONT>-<FONT FACE="Courier New" SIZE=2>VERTEX</FONT>-<FONT FACE="Courier New" SIZE=2>COVER</FONT> always yields a suboptimal solution.<P>
<a name="0a1a_1d35">37.1-2<a name="0a1a_1d35"><P>
Professor Nixon proposes the following heuristic to solve the vertex-cover problem. Repeatedly select a vertex of highest degree, and remove all of its incident edges. Give an example to show that the professor's heuristic does not have a ratio bound of 2.<P>
<a name="0a1a_1d36">37.1-3<a name="0a1a_1d36"><P>
<a name="0a1a_1d2f"><a name="0a1a_1d30">Give an efficient greedy algorithm that finds an optimal vertex cover for a tree in linear time.<P>
<a name="0a1a_1d37">37.1-4<a name="0a1a_1d37"><P>
<a name="0a1a_1d31"><a name="0a1a_1d32">From the proof of Theorem 36.12, we know that the vertex-cover problem and the NP-complete clique problem are complementary in the sense that an optimal vertex cover is the complement of a maximum-size clique in the complement graph. Does this relationship imply that there is an approximation algorithm with constant ratio bound for the clique problem? Justify your answer.<P>
<P>
<P>
<h1><a name="0a1b_1d37">37.2 The traveling-salesman problem<a name="0a1b_1d37"></h1><P>
<a name="0a1b_1d33"><a name="0a1b_1d34">In the traveling-salesman problem introduced in Section 36.5.5, we are given a complete undirected graph <I>G = </I>(<I>V, E</I>) that has a nonnegative integer cost <I>c</I>(<I>u, v</I>) associated with each edge (<I>u, v</I>) <img src="../images/memof12.gif"> <I>E</I>, and we must find a hamiltonian cycle (a tour) of <I>G</I> with minimum cost. As an extension of our notation, let <I>c</I>(<I>A</I>) denote the total cost of the edges in the subset <I>A</I> <img src="../images/rgtubar.gif"> <I>E</I>:<P>
<img src="969_a.gif"><P>
<a name="0a1b_1d35"><a name="0a1b_1d36">In many practical situations, it is always cheapest to go directly from a place <I>u</I> to a place <I>w</I>; going by way of any intermediate stop <I>v</I> can't be less expensive. Putting it another way, cutting out an intermediate stop never increases the cost. We formalize this notion by saying that the cost function <I>c</I> satisfies the <I><B>triangle inequality</I></B> if for all vertices <I>u, v, w </I><img src="../images/memof12.gif"><I> V,</I><P>
<pre><I>c</I>(<I>u, w</I>) <img src="../images/lteq12.gif"> <I>c</I>(<I>u, v</I>) + <I>c</I>(<I>v, w</I>).</pre><P>
The triangle inequality is a natural one, and in many applications it is automatically satisfied. For example, if the vertices of the graph are points in the plane and the cost of traveling between two vertices is the ordinary euclidean distance between them, then the triangle inequality is satisfied. <P>
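The euclidean example can be verified mechanically: a brute-force check over all ordered vertex triples confirms that euclidean distance satisfies the triangle inequality. The point set below is an arbitrary illustration, not from the text.

```python
from itertools import permutations
from math import dist  # euclidean distance (Python 3.8+)

def satisfies_triangle_inequality(vertices, c):
    """Check c(u, w) <= c(u, v) + c(v, w) for all vertex triples."""
    return all(c(u, w) <= c(u, v) + c(v, w)
               for u, v, w in permutations(vertices, 3))

# Vertices as points in the plane, cost = ordinary euclidean distance.
pts = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (1.0, 1.0)]
assert satisfies_triangle_inequality(pts, dist)
```

An arbitrary cost function, by contrast, can easily violate the inequality, which is exactly the case Section 37.2.2 addresses.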
As Exercise 37.2-1 shows, restricting the cost function to satisfy the triangle inequality does not alter the NP-completeness of the traveling-salesman problem. Thus, it is unlikely that we can find a polynomial-time algorithm for solving this problem exactly. We therefore look instead for good approximation algorithms.<P>
In Section 37.2.1, we examine an approximation algorithm for the traveling-salesman problem with triangle inequality that has a ratio bound of 2. In Section 37.2.2, we show that without triangle inequality, an approximation algorithm with constant ratio bound does not exist unless P = NP.<P>