<HTML><HEAD>

<TITLE>Intro to Algorithms: CHAPTER 31: MATRIX OPERATIONS</TITLE></HEAD><BODY BGCOLOR="#FFFFFF">


<a href="chap32.htm"><img align=right src="../../images/next.gif" alt="Next Chapter" border=0></A>
<a href="toc.htm"><img align=right src="../../images/toc.gif" alt="Return to Table of Contents" border=0></A>
<a href="chap30.htm"><img align=right src="../../images/prev.gif" alt="Previous Chapter" border=0></A>


<h1><a name="0967_1a50">CHAPTER 31: MATRIX OPERATIONS<a name="0967_1a50"></h1><P>
<a name="0967_1a4f">Operations on matrices are at the heart of scientific computing. Efficient algorithms for working with matrices are therefore of considerable practical interest. This chapter provides a brief introduction to matrix theory and matrix operations, emphasizing the problems of multiplying matrices and solving sets of simultaneous linear equations.<P>
After Section 31.1 introduces basic matrix concepts and notations, Section 31.2 presents Strassen's surprising algorithm for multiplying two <I>n</I> <img src="../images/mult.gif"> <I>n</I> matrices in <img src="../images/bound.gif">(<I>n</I><SUP>lg 7</SUP>) = <I>O</I>(<I>n</I><SUP>2.81</SUP>) time. Section 31.3 defines quasirings, rings, and fields, clarifying the assumptions required to make Strassen's algorithm work. It also contains an asymptotically fast algorithm for multiplying boolean matrices. Section 31.4 shows how to solve a set of linear equations using LUP decompositions. Then, Section 31.5 explores the close relationship between the problem of multiplying matrices and the problem of inverting a matrix. Finally, Section 31.6 discusses the important class of symmetric positive-definite matrices and shows how they can be used to find a least-squares solution to an overdetermined set of linear equations.<P>





<h1><a name="0969_0001">31.1 Properties of matrices<a name="0969_0001"></h1><P>
In this section, we review some basic concepts of matrix theory and some fundamental properties of matrices, focusing on those that will be needed in later sections.<P>





<h2>Matrices and vectors</h2><P>
A <I><B>matrix</B></I> is a rectangular array of numbers. For example,<P>
<img src="730_a.gif"><P>
<h4><a name="096a_1a62">(31.1)<a name="096a_1a62"></sub></sup></h4><P>
is a 2 <img src="../images/mult.gif"> 3 matrix <I>A</I> = (<I>a<SUB>ij</I></SUB>), where for <I>i</I> = 1, 2 and <I>j</I> = 1, 2, 3, the element of the matrix in row <I>i</I> and column <I>j</I> is <I>a<SUB>ij</I></SUB>. We use uppercase letters to denote matrices and corresponding subscripted lowercase letters to denote their elements. The set of all <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrices with real-valued entries is denoted <B>R</B><I><SUP>m<img src="../images/mult.gif">n</I></SUP>. In general, the set of <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrices with entries drawn from a set <I>S</I> is denoted <I>S<SUP>m<img src="../images/mult.gif">n</I></SUP>.<P>
<a name="096a_1a50"><a name="096a_1a51">The <I><B>transpose</I></B> of a matrix <I>A</I> is the matrix <I>A</I><SUP>T</SUP> obtained by exchanging the rows and columns of <I>A</I>. For the matrix <I>A</I> of equation (31.1),<P>
<img src="731_a.gif"><P>
<a name="096a_1a52"><a name="096a_1a53"><a name="096a_1a54">A <I><B>vector</I></B> is a one-dimensional array of numbers. For example,<P>
<img src="731_b.gif"><P>
<h4><a name="096a_1a63">(31.2)<a name="096a_1a63"></sub></sup></h4><P>
is a vector of size 3. We use lowercase letters to denote vectors, and we denote the <I>i</I>th element of a size-<I>n</I> vector <I>x</I> by <I>x<SUB>i</SUB>, </I>for<I> i </I>=<I> </I>1, 2, . . . , <I>n</I>. We take the standard form of a vector to be as a <I><B>column vector</I></B> equivalent to an <I>n</I> <img src="../images/mult.gif"> 1 matrix; the corresponding <I><B>row vector</I></B> is obtained by taking the transpose:<P>
<pre><I>x</I><SUP>T</SUP> = ( 2 3 5 ) .</sub></sup></pre><P>
<a name="096a_1a55">The <I><B>unit vector</I></B> <I>e<SUB>i</I></SUB> is the vector whose <I>i</I>th element is 1 and all of whose other elements are 0. Usually, the size of a unit vector is clear from the context.<P>
<a name="096a_1a56">A <I><B>zero matrix</I></B> is a matrix whose every entry is 0. Such a matrix is often denoted 0, since the ambiguity between the number 0 and a matrix of 0's is usually easily resolved from context. If a matrix of 0's is intended, then the size of the matrix also needs to be derived from the context.<P>
<a name="096a_1a57"><I><B>Square</I></B> <I>n</I> <img src="../images/mult.gif"> <I>n</I> matrices arise frequently. Several special cases of square matrices are of particular interest:<P>
<a name="096a_1a58">1.     A <I><B>diagonal matrix</I></B> has <I>a<SUB>ij</I></SUB> = 0 whenever <I>i</I> <img src="../images/noteq.gif"> <I>j</I>. Because all of the off-diagonal elements are zero, the matrix can be specified by listing the elements along the diagonal:<P>
<img src="731_c.gif"><P>
<a name="096a_1a59">2     The <I>n</I> <img src="../images/mult.gif"> <I>n</I> <I><B>identity matrix</I></B> <I><B>I</I></B><I><SUB>n</I></SUB> is a diagonal matrix with 1's along the diagonal:<P>
<pre><I><B>I</I></B><I><SUB>n</I></SUB> = diag(1,1,...,1)</sub></sup></pre><P>
<img src="732_a.gif"><P>
When <I>I</I> appears without a subscript, its size can be derived from context. The <I>i</I>th column of an identity matrix is the unit vector <I>e<SUB>i</SUB></I>.<P>
<a name="096a_1a5a">3.     A <I><B>tridiagonal matrix </I></B><I>T</I> is one for which <I>t<SUB>ij</I></SUB> = 0 if |<I>i</I> - <I>j</I>| &gt; 1. Nonzero entries appear only on the main diagonal, immediately above the main diagonal (<I>t<SUB>i</I>,<I>i</I>+1</SUB> for <I>i</I> = 1, 2, . . . , <I>n </I>- 1), or immediately below the main diagonal (<I>t<SUB>i</I>+1,<I>i</I></SUB> for <I>i</I> = 1, 2, . . . , <I>n</I> - 1):<P>
<img src="732_b.gif"><P>
<a name="096a_1a5b"><a name="096a_1a5c">4.     An <I><B>upper-triangular matrix </I></B><I>U</I> is one for which <I>u<SUB>ij</I></SUB> = 0 if <I>i</I> &gt; <I>j</I>. All entries below the diagonal are zero:<P>
<img src="732_c.gif"><P>
<a name="096a_1a5d">An upper-triangular matrix is <I><B>unit upper-triangular</I></B> if it has all 1's along the diagonal.<P>
<a name="096a_1a5e">5.     A <I><B>lower-triangular matrix</I></B><I> L</I> is one for which <I>l<SUB>ij</I></SUB> = 0 if <I>i</I> &lt; <I>j</I>. All entries above the diagonal are zero:<P>
<img src="732_d.gif"><P>
<a name="096a_1a5f">A lower-triangular matrix is <I><B>unit lower-triangular</I></B> if it has all 1's along the diagonal.<P>
<a name="096a_1a60">6.     A <I><B>permutation matrix</I></B> <I>P</I> has exactly one 1 in each row or column, and 0's elsewhere. An example of a permutation matrix is<P>
<img src="732_e.gif"><P>
Such a matrix is called a permutation matrix because multiplying a vector <I>x</I> by a permutation matrix has the effect of permuting (rearranging) the elements of <I>x</I>.<P>
<a name="096a_1a61">7.     A <I><B>symmetric matrix</I></B> <I>A</I> satisfies the condition <I>A </I>= <I>A</I><SUP>T</SUP>. For example,<P>
<img src="733_a.gif"><P>
is a symmetric matrix.<P>
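As referenced above, the following Python sketch (an illustration, not part of the text) gives predicates for the special classes just listed, with a square matrix stored as a list of rows and Python's 0-based indexing standing in for the book's 1-based indexing.<P>
<pre>
def is_diagonal(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i != j)

def is_identity(A):
    n = len(A)
    return all(A[i][j] == (1 if i == j else 0)
               for i in range(n) for j in range(n))

def is_tridiagonal(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n)
               if abs(i - j) &gt; 1)

def is_upper_triangular(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i &gt; j)

def is_lower_triangular(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i &lt; j)

def is_permutation(A):
    n = len(A)
    # Exactly one 1 in every row and in every column, 0's elsewhere.
    rows_ok = all(sorted(row) == [0] * (n - 1) + [1] for row in A)
    cols_ok = all(sorted(A[i][j] for i in range(n)) == [0] * (n - 1) + [1]
                  for j in range(n))
    return rows_ok and cols_ok

def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
</pre><P>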
<P>







<h2>Operations on matrices</h2><P>
<a name="096b_1a62">The elements of a matrix or vector are numbers from a number system, such as the real numbers, the complex numbers, or integers modulo a prime. The number system defines how to add and multiply numbers. We can extend these definitions to encompass addition and multiplication of matrices.<P>
<a name="096b_1a63"><a name="096b_1a64">We define <I><B>matrix addition</I></B> as follows. If <I>A</I> = (<I>a<SUB>ij</I></SUB>) and <I>B</I> = (<I>b<SUB>ij</I></SUB>) are <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrices, then their matrix sum <I>C</I> = (<I>c<SUB>ij</I></SUB>) = <I>A</I> + <I>B</I> is the <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrix defined by<P>
<pre><I>c<SUB>ij</SUB> </I>=<I> a<SUB>ij</SUB> </I>+<I> b<SUB>ij</I></sub></sup></pre><P>
for <I>i</I> = 1, 2, . . . , <I>m</I> and <I>j</I> = 1, 2, . . . , <I>n</I>. That is, matrix addition is performed componentwise. A zero matrix is the identity for matrix addition:<P>
<pre><I>A</I> + 0 = <I>A</I></pre><P>
<pre>      = 0 + <I>A</I> .</pre><P>
<a name="096b_1a65"><a name="096b_1a66"><a name="096b_1a67">If <img src="../images/lambdauc.gif"> is a number and <I>A</I> = (<I>a<SUB>ij</I></SUB>) is a matrix, then <img src="../images/lambdauc.gif"><I>A =</I> (<img src="../images/lambdauc.gif"><I>a<SUB>ij</I></SUB>) is the <I><B>scalar multiple</I></B> of <I>A</I> obtained by multiplying each of its elements by <img src="../images/lambdauc.gif">. As a special case, we define the <I><B>negative</I></B> of a matrix <I>A</I> = (<I>a<SUB>ij</I></SUB>) to be -1 <img src="../images/dot10.gif"> <I>A</I> = - <I>A</I>, so that the <I>ij</I>th entry of - <I>A</I> is -<I>a<SUB>ij</I></SUB>. Thus,<P>
<pre><I>A </I>+ (-<I>A</I>) = 0</sub></sup></pre><P>
<pre>= (-<I>A</I>) + <I>A .</I></sub></sup></pre><P>
Given this definition, we can define <I><B>matrix subtraction</I></B> as the addition of the negative of a matrix: <I>A</I> - <I>B</I> = <I>A</I> + (-<I>B</I>).<P>
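The definitions above translate directly into code. The following Python sketch (an illustration, not from the text) implements componentwise addition, scalar multiplication, and subtraction for matrices stored as lists of rows.<P>
<pre>
def matrix_add(A, B):
    # c_ij = a_ij + b_ij; A and B must have the same dimensions.
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

def scalar_multiply(lam, A):
    # Multiply every element of A by the number lam.
    return [[lam * a for a in row] for row in A]

def matrix_sub(A, B):
    # A - B is defined as A + (-B), where -B = (-1) * B.
    return matrix_add(A, scalar_multiply(-1, B))
</pre><P>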
<a name="096b_1a68">We define <I><B>matrix multiplication</I></B> as follows. We start with two matrices <I>A</I> and <I>B</I> that are <I><B>compatible</I></B> in the sense that the number of columns of <I>A</I> equals the number of rows of <I>B</I>. (In general, an expression containing a matrix product <I>AB</I> is always assumed to imply that matrices <I>A</I> and <I>B</I> are<SUB> </SUB><FONT FACE="Times New Roman" SIZE=2>compatible.) If <I>A</I> = (<I>a<SUB>ij</I></FONT></SUB>) is an <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrix and <I>B</I> = (<I>b<SUB>jk</I></SUB>) is an <I>n</I> <img src="../images/mult.gif"> <I>p</I> matrix, then their matrix product <I>C</I> = <I>AB</I> is the <I>m</I> <img src="../images/mult.gif"> <I>p</I> matrix <I>C</I> = (<I>c<SUB>ik</I></SUB>),<SUB> </SUB>where<P>
<img src="733_b.gif"><P>
<h4><a name="096b_1a70">(31.3)<a name="096b_1a70"></sub></sup></h4><P>
for <I>i</I> = 1, 2, . . . , <I>m</I> and <I>k</I> = 1, 2, . . . , <I>p</I>. The procedure M<FONT FACE="Courier New" SIZE=2>ATRIX-</FONT><FONT FACE="Courier New" SIZE=2>MULTIPLY</FONT> in Section 26.1 implements matrix multiplication in the straightforward manner based on equation (31.3), assuming that the matrices are square: <I>m</I> = <I>n</I> = <I>p</I>. To multiply <I>n</I> <img src="../images/mult.gif"> <I>n</I> matrices, M<FONT FACE="Courier New" SIZE=2>ATRIX-</FONT><FONT FACE="Courier New" SIZE=2>MULTIPLY</FONT> performs <I>n</I><SUP>3</SUP> multiplications and <I>n</I><SUP>2</SUP>(<I>n</I> - 1) additions, and its running time is <img src="../images/bound.gif">(<I>n</I><SUP>3</SUP>).<P>
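As a concrete reference point, here is a direct Python transcription of equation (31.3) for an <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrix <I>A</I> and an <I>n</I> <img src="../images/mult.gif"> <I>p</I> matrix <I>B</I>; it is a sketch in the spirit of M<FONT FACE="Courier New" SIZE=2>ATRIX-MULTIPLY</FONT>, not the book's pseudocode itself.<P>
<pre>
def matrix_multiply(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "A and B must be compatible"
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for k in range(p):
            # c_ik = sum over j of a_ij * b_jk   (equation (31.3))
            for j in range(n):
                C[i][k] += A[i][j] * B[j][k]
    return C
</pre><P>
For square matrices (<I>m</I> = <I>n</I> = <I>p</I>), the triple loop performs <I>n</I><SUP>3</SUP> scalar multiplications, matching the <img src="../images/bound.gif">(<I>n</I><SUP>3</SUP>) running time quoted above.<P>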
Matrices have many (but not all) of the algebraic properties typical of numbers. Identity matrices are identities for matrix multiplication:<P>
<pre><I>I<SUB>m</SUB>A</I> = <I>AI<SUB>n</SUB></I> = <I>A</I></pre><P>
for any <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrix <I>A</I>. Multiplying by a zero matrix gives a zero matrix:<P>
<pre><I>A</I> 0 = 0 .</pre><P>
Matrix multiplication is associative:<P>
<pre><I>A</I>(<I>BC</I>) = (<I>AB</I>)<I>C</I></pre><P>
<h4><a name="096b_1a71">(31.4)</a></h4><P>
for compatible matrices <I>A</I>, <I>B</I>, and <I>C</I>. Matrix multiplication distributes over addition:<P>
<pre><I>A</I>(<I>B</I> + <I>C</I>) = <I>AB</I> + <I>AC</I> ,</pre><P>
<pre>(<I>B</I> + <I>C</I>)<I>D</I> = <I>BD</I> + <I>CD</I> .</pre><P>
<h4><a name="096b_1a72">(31.5)</a></h4><P>
Multiplication of <I>n</I> <img src="../images/mult.gif"> <I>n</I> matrices is not commutative, however, unless <I>n</I> = 1. For example, if <img src="734_a.gif"> and <img src="734_b.gif">, then<P>
<img src="734_c.gif"><P>
and<P>
<img src="734_d.gif"><P>
<a name="096b_1a69"><a name="096b_1a6a"><a name="096b_1a6b"><a name="096b_1a6c"><a name="096b_1a6d"><a name="096b_1a6e"><a name="096b_1a6f">Matrix-vector products or vector-vector products are defined as if the vector were the equivalent <I>n</I> <img src="../images/mult.gif"> 1 matrix (or a 1 <img src="../images/mult.gif"> <I>n</I> matrix, in the case of a row vector). Thus, if <I>A</I> is an <I>m</I> <img src="../images/mult.gif"> <I>n</I> matrix and <I>x</I> is a vector of size <I>n</I>, then <I>Ax</I> is a vector of size <I>m</I>. If <I>x</I> and <I>y</I> are vectors of size <I>n</I>, then<P>
<img src="734_e.gif"><P>
is a number (actually a 1 <img src="../images/mult.gif"> 1 matrix) called the <I><B>inner product</B></I> of <I>x</I> and <I>y</I>. The matrix <I>xy</I><SUP>T</SUP> is an <I>n</I> <img src="../images/mult.gif"> <I>n</I> matrix <I>Z</I> called the <I><B>outer product</B></I> of <I>x</I> and <I>y</I>, with <I>z<SUB>ij</SUB></I> = <I>x<SUB>i</SUB>y<SUB>j</SUB></I>. The <I><B>(euclidean) norm</B></I> ||<I>x</I>|| of a vector <I>x</I> of size <I>n</I> is defined by<P>
<img src="734_f.gif"><P>
Thus, the norm of <I>x</I> is its length in <I>n</I>-dimensional euclidean space.<P>
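These three definitions are captured by the following Python sketch (an illustration, not from the text), with vectors stored as plain lists.<P>
<pre>
import math

def inner_product(x, y):
    # x^T y = sum of x_i * y_i; x and y must have the same size.
    return sum(xi * yi for xi, yi in zip(x, y))

def outer_product(x, y):
    # The matrix x y^T, with entry z_ij = x_i * y_j.
    return [[xi * yj for yj in y] for xi in x]

def norm(x):
    # ||x|| = sqrt(x^T x), the length of x in euclidean space.
    return math.sqrt(inner_product(x, x))
</pre><P>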
<P>
