📄 chap16.htm

📁 介绍算法的书籍
💻 HTM
📖 第 1 页 / 共 5 页
字号:
<P>







<h2>A recursive solution</h2><P>
The second step of the dynamic-programming paradigm is to define the value of an optimal solution recursively in terms of the optimal solutions to subproblems. For the matrix-chain multiplication problem, we pick as our subproblems the problems of determining the minimum cost of a parenthesization of <I>A<SUB>i</SUB>A<SUB>i+</I>1</SUB><I>. . . A<SUB>j</I></SUB> for 1 <img src="../images/lteq12.gif"> <I>i</I> <img src="../images/lteq12.gif"> <I>j</I> <img src="../images/lteq12.gif"> <I>n</I>. Let <I>m</I>[<I>i, j</I>] be the minimum number of scalar multiplications needed to compute the matrix <I>A<SUB>i..j</SUB>;</I> the cost of a cheapest way to compute <I>A</I><SUB>1<I>..n</I></SUB> would thus be <I>m</I>[1<I>, n</I>]<I>.</I><P>
We can define <I>m</I>[<I>i, j</I>] recursively as follows. If <I>i = j,</I> the chain consists of just one matrix <I>Ai..i = A<SUB>i</SUB>,</I> so no scalar multiplications are necessary to compute the product. Thus, <I>m</I>[<I>i,i</I>]<I> = </I>0 for <I>i = </I>1, 2<I>, . . . , n.</I> To compute <I>m</I>[<I>i, j</I>]<I> when i &lt; j,</I> we take advantage of the structure of an optimal solution from step 1. Let us assume that the optimal parenthesization splits the product <I>A<SUB>i</SUB>A<SUB>i</I>+l</SUB><FONT FACE="Times New Roman" SIZE=2> . . . <I>A<SUB>j</I></FONT></SUB> between <I>A<SUB>k</I></SUB> and <I>A<SUB>k</I>+1</SUB>, where <I>i</I> <img src="../images/lteq12.gif"> <I>k</I> &lt; <I>j</I>. Then, <I>m</I>[<I>i, j</I>]<I> </I>is equal to the minimum cost for computing the subproducts <I>A<SUB>i..k </I></SUB>and <I>A<SUB>k+</I>1<I>..j,</I></SUB> plus the cost of multiplying these two matrices together. Since computing the matrix product <I>A<SUB>i..k</SUB>A<SUB>k+</I>1<I>..j</I></SUB> takes <I>p<SUB>i-</I>1</SUB><I>p<SUB>k</SUB>p<SUB>j</I></SUB> scalar multiplications, we obtain<P>
<pre>m[i, j] = m[i, k] + m[k + 1, j] + p<SUB>i-1</SUB>p<SUB>k</SUB>p<SUB>j </SUB>.</sub></sup></pre><P>
This recursive equation assumes that we know the value of <I>k</I>, which we don't. There are only <I>j - i</I> possible values for <I>k</I>, however, namely <I>k = i, i + 1, . . . , j -</I> 1. Since the optimal parenthesization must use one of these values for <I>k</I>, we need only check them all to find the best. Thus, our recursive definition for the minimum cost of parenthesizing the product <I>A<SUB>i</SUB>A<SUB>i+</I>1</SUB> . . . <I>A<SUB>j</I></SUB> becomes<P>
<img src="305_a.gif"><P>
<h4><a name="080e_0001">(16.2)<a name="080e_0001"></sub></sup></h4><P>
The <I>m</I>[<I>i, j</I>] values give the costs of optimal solutions to subproblems. To help us keep track of how to construct an optimal solution, let us define <I>s</I>[<I>i, j</I>] to be a value of <I>k</I> at which we can split the product <I>A<SUB>i</SUB>A<SUB>i</I>+1</SUB><I> . . . A<SUB>j</I></SUB> to obtain an optimal parenthesization. That is,<I> s</I>[<I>i, j</I>] equals a value <I>k</I> such that<I> m</I>[<I>i, j</I>]<I> = m</I>[<I>i, k</I>]<I> + m</I>[<I>k + 1, j</I>] + <I>p<SUB>i - </I>1</SUB><I>p<SUB>k</SUB>p<SUB>j </I></SUB>.<P>
<P>







<h2>Computing the optimal costs</h2><P>
At this point, it is a simple matter to write a recursive algorithm based on recurrence (16.2) to compute the minimum cost <I>m</I>[<I>1, n</I>] for multiplying <I>A</I><SUB>1</SUB><I>A</I><SUB>2 . . . </SUB><I>A<SUB>n</I></SUB>. As we shall see in Section 16.2, however, this algorithm takes exponential time--no better than the brute-force method of checking each way of parenthesizing the product.<P>
The important observation that we can make at this point is that we have relatively few subproblems: one problem for each choice of <I>i</I> and <I>j</I> satisfying 1 <img src="../images/lteq12.gif"><I> i </I><img src="../images/lteq12.gif"><I> j </I><img src="../images/lteq12.gif"><I> n, </I>or<I> </I><img src="306_a.gif"><I> + n = </I><img src="../images/bound.gif">(<I>n<SUP>2</I></SUP>) total. A recursive algorithm may encounter each subproblem many times in different branches of its recursion tree. This property of overlapping subproblems is the second hallmark of the applicability of dynamic programming.<P>
Instead of computing the solution to recurrence (16.2) recursively, we perform the third step of the dynamic-programming paradigm and compute the optimal cost by using a bottom-up approach. The following pseudocode assumes that matrix <I>A<SUB>i</I></SUB> has dimensions <I>p<SUB>i - 1</SUB> </I>X<I> p<SUB>i</SUB> </I>for<I> i</I> = 1, 2, . . . ,<I> n.</I> The input is a sequence <img src="../images/lftwdchv.gif"><I>p</I><SUB>0</SUB>, <I>p</I><SUB>1</SUB>, . . . , <I>p<SUB>n</I></SUB><img src="../images/wdrtchv.gif">, here <I>length</I>[<I>p</I>]<I> = n</I> + 1. The procedure uses an auxiliary table <I>m</I>[<I>1 . . n,1 . . n</I>] for storing the <I>m</I>[<I>i, j</I>] costs and an auxiliary table <I>s</I>[1 <I>. . n, </I>1 <I>. . n</I>] that records which index of <I>k</I> achieved the optimal cost in computing <I>m</I>[<I>i, j</I>]<I>.</I><P>
<pre><a name="080f_153a">MATRIX-CHAIN-ORDER(<I>p</I>)</sub></sup></pre><P>
<pre>1 <I>n</I> <img src="../images/arrlt12.gif"> <I>length</I>[<I>p</I>] - 1</sub></sup></pre><P>
<pre>2 <B>for</B> <I>i</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> <I>n</I></sub></sup></pre><P>
<pre>3      <B>do</B> <I>m</I>[<I>i, i</I>]<I> </I><img src="../images/arrlt12.gif"><I> </I>0</sub></sup></pre><P>
<pre>4 <B>for</B> <I>l </I><img src="../images/arrlt12.gif"><I> </I>2 <B>to</B><I> n</I></sub></sup></pre><P>
<pre>5      <B>do for</B> <I>i</I> <img src="../images/arrlt12.gif"> 1 <B>to</B> <I>n - l + </I>1</sub></sup></pre><P>
<pre>6             <B>do</B> <I>j </I><img src="../images/arrlt12.gif"><I> i + l-</I>1</sub></sup></pre><P>
<pre>7                <I>m</I>[<I>i, j</I>] <img src="../images/arrlt12.gif"> <img src="../images/infin.gif"></sub></sup></pre><P>
<pre>8                <B>for</B> <I>k</I> <img src="../images/arrlt12.gif"> <I>i</I> <B>to</B> <I>j</I> - 1</sub></sup></pre><P>
<pre>9                    <B>do</B> <I>q </I><img src="../images/arrlt12.gif"> <I>m</I>[<I>i, k</I>]<I> + m</I>[<I>k + </I>1,<I> j</I>] +<I>p<SUB>i</I> - 1</SUB><I>p<SUB>k</SUB>p<SUB>j</I></sub></sup></pre><P>
<pre>10                       <B>if</B> <I>q</I> &lt; <I>m</I>[<I>i, j</I>]</sub></sup></pre><P>
<pre>11                          <B>then</B><I> m</I>[<I>i, j</I>] <img src="../images/arrlt12.gif"> <I>q</I></sub></sup></pre><P>
<pre>12                               <I>s</I>[<I>i, j</I>]<I> </I><img src="../images/arrlt12.gif"><I> k</I></sub></sup></pre><P>
<pre>13 <B>return</B><I> m </I>and<I> s</I></sub></sup></pre><P>
The algorithm fills in the table <I>m</I> in a manner that corresponds to solving the parenthesization problem on matrix chains of increasing length. Equation (16.2)shows that the cost <I>m</I>[<I>i, j</I>] of computing a matrix-chain product of <I>j - i</I> + 1 matrices depends only on the costs of computing matrix-chain products of fewer than <I>j - i</I> + 1 matrices. That is, for <I>k = i, i + </I>1<I>, . . . ,j - </I>1, the matrix <I>A<SUB>i..k</I></SUB> is a product of <I>k - i</I> + 1 &lt;<I> j</I> -<I> i</I> + 1 matrices and the matrix <I>A<SUB>k+</I>1 <I>. . j</I></SUB> a product of <I>j</I> -<I> k </I>&lt; <I>j </I>-<I> i </I>+ 1 matrices.<P>
The algorithm first computes <I>m</I>[<I>i, i</I>] <img src="../images/arrlt12.gif"> 0 for <I>i</I> = 1, 2, . . . , <I>n</I> (the minimum costs for chains of length 1) in lines 2-3. It then uses recurrence (16.2) to compute <I>m</I>[<I>i, i+ </I>1] for <I>i </I>= 1, 2, . . . <I>, n - </I>1 (the minimum costs for chains of length<I> l </I>= 2) during the first execution of the loop in lines 4-12. The second time through the loop, it computes <I>m</I>[<I>i, i</I> + <I>2</I>] for <I>i</I> = 1, 2, . . . ,<I> n</I> - 2 (the minimum costs for chains of length <I>l</I> = 3), and so forth. At each step, the <I>m</I>[<I>i, j</I>] cost computed in lines 9-12 depends only on table entries <I>m</I>[<I>i, k</I>]<I> </I>and <I>m</I>[<I>k + </I>1<I> ,j</I>] already computed.<P>
Figure 16.1 illustrates this procedure on a chain of <I>n </I>= 6 matrices. Since we have defined <I>m</I>[<I>i, j</I>] only for <I>i</I> <img src="../images/lteq12.gif"><I> j</I>, only the portion of the table<I> m</I> strictly above the main diagonal is used. The figure shows the table rotated to make the main diagonal run horizontally. The matrix chain is listed along the bottom. Using this layout, the minimum cost <I>m</I>[<I>i, j</I>] for multiplying a subchain <I>A<SUB>i</SUB>A<SUB>i+</I>1</SUB><I>. . . A<SUB>j</I></SUB> of matrices can be found at the intersection of lines running northeast from <I>A<SUB>i</I></SUB> and northwest from <I>A<SUB>j</I></SUB>. Each horizontal row in the table contains the entries for matrix chains of the same length. <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>ORDER</FONT> computes the rows from bottom to top and from left to right within each row. An entry <I>m</I>[<I>i, j</I>] is computed using the products <I>p<SUB>i - </I>1</SUB><I>p<SUB>k</SUB>p<SUB>j</SUB> for k = i, i </I>+ 1, . . . <I>, j - </I>1 and all entries southwest and southeast from <I>m</I>[<I>i, j</I>].<P>
<img src="307_a.gif"><P>
<h4><a name="080f_153b">Figure 16.1 The m and s tables computed by <FONT FACE="Courier New" SIZE=2>MATRIX<FONT FACE="Times New Roman" SIZE=2>-<FONT FACE="Courier New" SIZE=2>CHAIN<FONT FACE="Times New Roman" SIZE=2>-<FONT FACE="Courier New" SIZE=2>ORDER<FONT FACE="Times New Roman" SIZE=2> for n = 6 and the following matrix dimensions:<a name="080f_153b"></FONT></FONT></FONT></FONT></FONT></FONT></sub></sup></h4><P>
<pre>matrix  dimension</sub></sup></pre><P>
<pre>-----------------</sub></sup></pre><P>
<pre><I>  A</I><SUB>1     </SUB>30 X 35 </sub></sup></pre><P>
<pre><I>  A</I><SUB>2     </SUB>35 X 15 </sub></sup></pre><P>
<pre><I>  A</I><SUB>3     </SUB>15 X 5 </sub></sup></pre><P>
<pre><I>  A</I><SUB>4     </SUB>5 X 10 </sub></sup></pre><P>
<pre><I>  A</I><SUB>5     </SUB>10 X 20 </sub></sup></pre><P>
<pre><I>  A</I><SUB>6     </SUB>20 X 25 </sub></sup></pre><P>
The tables are rotated so that the main diagonal runs horizontally. Only the main diagonal and upper triangle are used in the<I> m</I> table, and only the upper triangle is used in the <I>s</I> table. The minimum number of scalar multiplications to multiply the 6 matrices is <I>m</I>[1, 6] = 15,125. Of the lightly shaded entries, the pairs that have the same shading are taken together in line 9 when computing<P>
<img src="307_b.gif"><P>
A simple inspection of the nested loop structure of <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>ORDER</FONT> yields a running time of <I>O</I>(<I>n</I><SUP>3</SUP>) for the algorithm. The loops are nested three deep, and each loop index (<I>l, i, </I>and<I> k</I>) takes on at most <I>n</I> values. Exercise 16.1-3 asks you to show that the running time of this algorithm is in fact also <img src="../images/omega12.gif">(<I>n</I><SUP>3</SUP>). The algorithm requires <img src="../images/bound.gif">(<I>n</I><SUP>2</SUP>) space to store the <I>m</I> and <I>s</I> tables. Thus, <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>ORDER</FONT> is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one.<P>
<P>







<h2>Constructing an optimal solution</h2><P>
<a name="0810_153b">Although <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>ORDER</FONT> determines the optimal number of scalar multiplications needed to compute a matrix-chain product, it does not directly show how to multiply the matrices. Step 4 of the dynamic-programming paradigm is to construct an optimal solution from computed information.<P>
In our case, we use the table <I>s</I>[1 . <I>. n, </I>1 . <I>. n</I>] to determine the best way to multiply the matrices. Each entry <I>s</I>[<I>i, j</I>] records the value of <I>k</I> such that the optimal parenthesization of <I>A<SUB>i</SUB>A<SUB>i </I>+ 1</SUB><I> </I><img src="../images/dot10.gif"> <I><img src="../images/dot10.gif"> </I><img src="../images/dot10.gif"> <I>A<SUB>j</I></SUB> splits the product between <I>A<SUB>k</I></SUB> and <I>A<SUB>k </I>+ 1</SUB>. Thus, we know that the final matrix multiplication in computing <I>A</I><SUB>1..<I>n</I></SUB> optimally is <I>A</I><SUB>1..<I>s</I>[1<I>,</I>n]</SUB><I>A<SUB>s</I>[1,<I>n</I>]+1.<I>.n</SUB>.</I> The earlier matrix multiplications can be computed recursively, since <I>s</I>[l,<I>s</I>[1,<I>n</I>]] determines the last matrix multiplication in computing <I>A</I><SUB>1..<I>s</I>[1,<I>n</I>]</SUB>, and <I>s</I>[<I>s</I>[1,<I>n</I>] + 1, <I>n</I>] determines the last matrix multiplication in computing <I>A<SUB>s</I>[1,<I>n</I>]+..<I>n</SUB>.</I> The following recursive procedure computes the matrix-chain product <I>A<SUB>i..j</I></SUB> given the matrices <I>A = </I><img src="../images/lftwdchv.gif"><I>A<SUB>1</SUB>, A<SUB>2</SUB>, . . . , A<SUB>n</I></SUB><img src="../images/wdrtchv.gif">, the <I>s</I> table computed by <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>ORDER</FONT>, and the indices <I>i</I> and <I>j</I>. The initial call is <FONT FACE="Courier New" SIZE=2>MATRIX</FONT>-<FONT FACE="Courier New" SIZE=2>CHAIN</FONT>-<FONT FACE="Courier New" SIZE=2>MULTIPLY</FONT>(<I>A, s,</I> 1,<I> n</I>).<P>
<pre>MATRIX-CHAIN-MULTIPLY(<I>A, s, i, ,j</I>)</sub></sup></pre><P>
<pre>1 <B>if</B><I> j &gt;i</I></sub></sup></pre><P>
<pre>2     <B>then</B> <I>X </I><img src="../images/arrlt12.gif"> MATRIX-CHAIN-MULTIPLY(<I>A, s, i, s</I>[<I>i, j</I>])</sub></sup></pre><P>
<pre>3          <I>Y </I><img src="../images/arrlt12.gif"> MATRIX-CHAIN-MULTIPLY(<I>A, s, s</I>[<I>i, j</I>] + 1,<I> j</I>)</sub></sup></pre><P>
<pre>4          <B>return</B> MATRIX-MULTIPLY(<I>X, Y</I>)</sub></sup></pre><P>
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -