Jim Lambers
MAT 610
Summer Session 2009-10
Lecture 7 Notes

These notes correspond to Sections 4.2 and 4.3 in the text.

Positive Definite Systems

A real $n \times n$ matrix $A$ is symmetric positive definite if $A = A^T$ and, for any nonzero vector $\mathbf{x}$, $\mathbf{x}^T A \mathbf{x} > 0$. A symmetric positive definite matrix is the generalization to $n \times n$ matrices of a positive number.

If $A$ is symmetric positive definite, then it has the following properties:

• $A$ is nonsingular; in fact, $\det(A) > 0$.
• All of the diagonal elements of $A$ are positive.
• The largest element of the matrix lies on the diagonal.
• All of the eigenvalues of $A$ are positive.

In general it is not easy to determine whether a given $n \times n$ symmetric matrix $A$ is also positive definite. One approach is to check the leading principal submatrices
\[
A_k = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix}, \quad k = 1, 2, \ldots, n,
\]
whose determinants are the leading principal minors of $A$. It can be shown that $A$ is positive definite if and only if $\det(A_k) > 0$ for $k = 1, 2, \ldots, n$.

One desirable property of symmetric positive definite matrices is that Gaussian elimination can be performed on them without pivoting, and all pivot elements are positive. Furthermore, Gaussian elimination applied to such matrices is robust with respect to the accumulation of roundoff error. However, Gaussian elimination is not the most practical approach to solving systems of linear equations involving symmetric positive definite matrices, because it is not the most efficient approach in terms of the number of floating-point operations that are required. Instead, it is preferable to compute the Cholesky factorization of $A$,
\[
A = GG^T,
\]
where $G$ is a lower triangular matrix with positive diagonal entries. Because $A$ is factored into two matrices that are the transpose of one another, the process of computing the Cholesky factorization requires about half as many operations as the $LU$ decomposition.

The algorithm for computing the Cholesky factorization can be derived by matching entries of $GG^T$ with those of $A$. This yields the following relation between the entries of $G$ and $A$:
\[
a_{ij} = \sum_{k=1}^{j} g_{ik} g_{jk}, \quad i, j = 1, 2, \ldots, n, \quad i \geq j.
\]
From this relation, we obtain the following algorithm.

for $j = 1, 2, \ldots, n$ do
    $g_{jj} = \sqrt{a_{jj}}$
    for $i = j + 1, j + 2, \ldots, n$ do
        $g_{ij} = a_{ij}/g_{jj}$
        for $k = j + 1, \ldots, i$ do
            $a_{ik} = a_{ik} - g_{ij} g_{kj}$
        end
    end
end

The innermost loop subtracts off all terms but the last (corresponding to $k = j$) in the above summation that expresses $a_{ij}$ in terms of entries of $G$. Equivalently, for each $j$, this loop subtracts the matrix $\mathbf{g}_j \mathbf{g}_j^T$ from $A$, where $\mathbf{g}_j$ is the $j$th column of $G$. Note that based on the outer product view of matrix multiplication, the equation $A = GG^T$ is equivalent to
\[
A = \sum_{j=1}^{n} \mathbf{g}_j \mathbf{g}_j^T.
\]
Therefore, for each $j$, the contributions of all columns $\mathbf{g}_\ell$ of $G$, where $\ell < j$, have already been subtracted from $A$, thus allowing column $j$ of $G$ to easily be computed by the steps in the outer loops, which account for the last term in the summation for $a_{ij}$, in which $k = j$.

Example Let
\[
A = \begin{bmatrix} 9 & -3 & 3 & 9 \\ -3 & 17 & -1 & -7 \\ 3 & -1 & 17 & 15 \\ 9 & -7 & 15 & 44 \end{bmatrix}.
\]
$A$ is a symmetric positive definite matrix.
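To make the algorithm concrete, here is a minimal Python sketch of the outer-product Cholesky procedure above, using NumPy. The function name cholesky_outer is an illustrative choice, not from the text; the sketch assumes the input is symmetric, and it references only the lower triangle of $A$.

```python
import numpy as np

def cholesky_outer(A):
    """Outer-product Cholesky: compute lower triangular G with A = G G^T.

    A sketch of the column-at-a-time algorithm above. Only the lower
    triangle of A is referenced; A itself is not modified.
    """
    A = np.array(A, dtype=float)  # work on a copy, as the algorithm overwrites A
    n = A.shape[0]
    G = np.zeros((n, n))
    for j in range(n):
        if A[j, j] <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j, j] = np.sqrt(A[j, j])
        # compute the rest of column j of G
        for i in range(j + 1, n):
            G[i, j] = A[i, j] / G[j, j]
        # subtract the outer product g_j g_j^T from the trailing submatrix
        # (lower triangle only)
        for i in range(j + 1, n):
            for k in range(j + 1, i + 1):
                A[i, k] -= G[i, j] * G[k, j]
    return G

A = np.array([[ 9., -3.,  3.,  9.],
              [-3., 17., -1., -7.],
              [ 3., -1., 17., 15.],
              [ 9., -7., 15., 44.]])
G = cholesky_outer(A)
print(G)                         # lower triangular factor
print(np.allclose(G @ G.T, A))   # True
```

Running this on the example matrix reproduces the factor computed by hand below.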
To compute its Cholesky decomposition $A = GG^T$, we equate entries of $A$ to those of $GG^T$, which yields the matrix equation
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} =
\begin{bmatrix} g_{11} & 0 & 0 & 0 \\ g_{21} & g_{22} & 0 & 0 \\ g_{31} & g_{32} & g_{33} & 0 \\ g_{41} & g_{42} & g_{43} & g_{44} \end{bmatrix}
\begin{bmatrix} g_{11} & g_{21} & g_{31} & g_{41} \\ 0 & g_{22} & g_{32} & g_{42} \\ 0 & 0 & g_{33} & g_{43} \\ 0 & 0 & 0 & g_{44} \end{bmatrix},
\]
and the equivalent scalar equations
\[
\begin{aligned}
a_{11} &= g_{11}^2, \\
a_{21} &= g_{21} g_{11}, \\
a_{31} &= g_{31} g_{11}, \\
a_{41} &= g_{41} g_{11}, \\
a_{22} &= g_{21}^2 + g_{22}^2, \\
a_{32} &= g_{31} g_{21} + g_{32} g_{22}, \\
a_{42} &= g_{41} g_{21} + g_{42} g_{22}, \\
a_{33} &= g_{31}^2 + g_{32}^2 + g_{33}^2, \\
a_{43} &= g_{41} g_{31} + g_{42} g_{32} + g_{43} g_{33}, \\
a_{44} &= g_{41}^2 + g_{42}^2 + g_{43}^2 + g_{44}^2.
\end{aligned}
\]
We compute the nonzero entries of $G$ one column at a time. For the first column, we have
\[
\begin{aligned}
g_{11} &= \sqrt{a_{11}} = \sqrt{9} = 3, \\
g_{21} &= a_{21}/g_{11} = -3/3 = -1, \\
g_{31} &= a_{31}/g_{11} = 3/3 = 1, \\
g_{41} &= a_{41}/g_{11} = 9/3 = 3.
\end{aligned}
\]
Before proceeding to the next column, we first subtract from the remaining entries of $A$ all contributions from the entries of the first column of $G$. That is, we update $A$ as follows:
\[
\begin{aligned}
a_{22} &= a_{22} - g_{21}^2 = 17 - (-1)^2 = 16, \\
a_{32} &= a_{32} - g_{31} g_{21} = -1 - (1)(-1) = 0, \\
a_{42} &= a_{42} - g_{41} g_{21} = -7 - (3)(-1) = -4, \\
a_{33} &= a_{33} - g_{31}^2 = 17 - 1^2 = 16, \\
a_{43} &= a_{43} - g_{41} g_{31} = 15 - (3)(1) = 12, \\
a_{44} &= a_{44} - g_{41}^2 = 44 - 3^2 = 35.
\end{aligned}
\]
Now, we can compute the nonzero entries of the second column of $G$ just as for the first column:
\[
\begin{aligned}
g_{22} &= \sqrt{a_{22}} = \sqrt{16} = 4, \\
g_{32} &= a_{32}/g_{22} = 0/4 = 0, \\
g_{42} &= a_{42}/g_{22} = -4/4 = -1.
\end{aligned}
\]
We then remove the contributions from $G$'s second column to the remaining entries of $A$:
\[
\begin{aligned}
a_{33} &= a_{33} - g_{32}^2 = 16 - 0^2 = 16, \\
a_{43} &= a_{43} - g_{42} g_{32} = 12 - (-1)(0) = 12, \\
a_{44} &= a_{44} - g_{42}^2 = 35 - (-1)^2 = 34.
\end{aligned}
\]
The nonzero portion of the third column of $G$ is then computed as follows:
\[
g_{33} = \sqrt{a_{33}} = \sqrt{16} = 4, \quad g_{43} = a_{43}/g_{33} = 12/4 = 3.
\]
Finally, we compute $g_{44}$:
\[
a_{44} = a_{44} - g_{43}^2 = 34 - 3^2 = 25, \quad g_{44} = \sqrt{a_{44}} = \sqrt{25} = 5.
\]
Thus the complete Cholesky factorization of $A$ is
\[
\begin{bmatrix} 9 & -3 & 3 & 9 \\ -3 & 17 & -1 & -7 \\ 3 & -1 & 17 & 15 \\ 9 & -7 & 15 & 44 \end{bmatrix} =
\begin{bmatrix} 3 & 0 & 0 & 0 \\ -1 & 4 & 0 & 0 \\ 1 & 0 & 4 & 0 \\ 3 & -1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 3 & -1 & 1 & 3 \\ 0 & 4 & 0 & -1 \\ 0 & 0 & 4 & 3 \\ 0 & 0 & 0 & 5 \end{bmatrix}. \quad \Box
\]

If $A$ is not symmetric positive definite, then the algorithm will break down, because it will attempt to compute $g_{jj}$, for some $j$, by taking the square root of a negative number, or to divide by a zero $g_{jj}$.

Example The matrix
\[
A = \begin{bmatrix} 4 & 3 \\ 3 & 2 \end{bmatrix}
\]
is symmetric but not positive definite, because $\det(A) = 4(2) - 3(3) = -1 < 0$. If we attempt to compute the Cholesky factorization $A = GG^T$, we have
\[
\begin{aligned}
g_{11} &= \sqrt{a_{11}} = \sqrt{4} = 2, \\
g_{21} &= a_{21}/g_{11} = 3/2, \\
a_{22} &= a_{22} - g_{21}^2 = 2 - 9/4 = -1/4, \\
g_{22} &= \sqrt{a_{22}} = \sqrt{-1/4},
\end{aligned}
\]
and the algorithm breaks down. $\Box$

In fact, because computing the determinants of the leading principal submatrices is expensive, attempting the Cholesky factorization is also an efficient method of checking whether a symmetric matrix is positive definite.

Once the Cholesky factor $G$ of $A$ is computed, a system $A\mathbf{x} = \mathbf{b}$ can be solved by first solving $G\mathbf{y} = \mathbf{b}$ by forward substitution, and then solving $G^T\mathbf{x} = \mathbf{y}$ by back substitution. This is similar to the process of solving $A\mathbf{x} = \mathbf{b}$ using the $LDL^T$ factorization, except that there is no diagonal system to solve. In fact, the $LDL^T$ factorization is also known as the "square-root-free Cholesky factorization", since it computes factors that are similar in structure to the Cholesky factors, but without computing any square roots. Specifically, if $A = GG^T$ is the Cholesky factorization of $A$, then $G = LD^{1/2}$. As with the $LU$ factorization, the Cholesky factorization is unique, because the diagonal is required to be positive.
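As a sketch of how the factorization is used, the following Python code solves $A\mathbf{x} = \mathbf{b}$ by forward substitution with $G$ followed by back substitution with $G^T$, as described above. The function name solve_cholesky and the right-hand side b are illustrative choices, not from the text; the factor $G$ is the one computed in the worked example.

```python
import numpy as np

def solve_cholesky(G, b):
    """Solve A x = b given the Cholesky factor G (A = G G^T):
    forward substitution for G y = b, then back substitution
    for G^T x = y."""
    n = G.shape[0]
    y = np.zeros(n)
    for i in range(n):                 # forward substitution with G
        y[i] = (b[i] - G[i, :i] @ y[:i]) / G[i, i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):     # back substitution with G^T
        x[i] = (y[i] - G[i + 1:, i] @ x[i + 1:]) / G[i, i]
    return x

# the factor computed in the worked example above
G = np.array([[ 3.,  0., 0., 0.],
              [-1.,  4., 0., 0.],
              [ 1.,  0., 4., 0.],
              [ 3., -1., 3., 5.]])
A = G @ G.T
b = np.array([1., 2., 3., 4.])   # hypothetical right-hand side
x = solve_cholesky(G, b)
print(np.allclose(A @ x, b))     # True
```

Note that row $i$ of $G^T$ beyond the diagonal is column $i$ of $G$ below the diagonal, which is why the back substitution step reads G[i + 1:, i].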
Banded Systems

An $n \times n$ matrix $A$ is said to have upper bandwidth $p$ if $a_{ij} = 0$ whenever $j - i > p$. Similarly, $A$ has lower bandwidth $q$ if $a_{ij} = 0$ whenever $i - j > q$. A matrix that has upper bandwidth $p$ and lower bandwidth $q$ is said to have bandwidth $w = p + q + 1$.

Any $n \times n$ matrix $A$ has a bandwidth $w \leq 2n - 1$. If $w < 2n - 1$, then $A$ is said to be banded. However, cases in which the bandwidth is $O(1)$, such as when $A$ is a tridiagonal matrix for which $p = q = 1$, are of particular interest, because for such matrices Gaussian elimination, forward substitution and back substitution are much more efficient. This is because

• If $A$ has lower bandwidth $q$, and $A = LU$ is the $LU$ decomposition of $A$ (without pivoting), then $L$ has lower bandwidth $q$, because at most $q$ elements per column need to be eliminated.
• If $A$ has upper bandwidth $p$, and $A = LU$ is the $LU$ decomposition of $A$ (without pivoting), then $U$ has upper bandwidth $p$, because at most $p$ elements per row need to be updated.

It follows that if $A$ has $O(1)$ bandwidth, then Gaussian elimination, forward substitution and back substitution all require only $O(n)$ operations each, provided no pivoting is required.

Example The matrix
\[
A = \begin{bmatrix} -2 & 1 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 1 & -2 \end{bmatrix},
\]
which arises from discretization of the second derivative operator, is banded with lower bandwidth 1, upper bandwidth 1, and total bandwidth 3. Its $LU$ factorization is
\[
\begin{bmatrix} -2 & 1 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 1 & -2 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 & 0 & 0 \\ 0 & -\frac{2}{3} & 1 & 0 & 0 \\ 0 & 0 & -\frac{3}{4} & 1 & 0 \\ 0 & 0 & 0 & -\frac{4}{5} & 1 \end{bmatrix}
\begin{bmatrix} -2 & 1 & 0 & 0 & 0 \\ 0 & -\frac{3}{2} & 1 & 0 & 0 \\ 0 & 0 & -\frac{4}{3} & 1 & 0 \\ 0 & 0 & 0 & -\frac{5}{4} & 1 \\ 0 & 0 & 0 & 0 & -\frac{6}{5} \end{bmatrix}.
\]
We see that $L$ has lower bandwidth 1, and $U$ has upper bandwidth 1. $\Box$

When a matrix $A$ is banded with bandwidth $w$, it is wasteful to store it in a traditional 2-dimensional array. Instead, it is much more efficient to store the elements of $A$ in $w$ vectors of length at most $n$. Then, the algorithms for Gaussian elimination, forward substitution and back substitution can be modified appropriately to work with these vectors. For example, to perform Gaussian elimination on a tridiagonal matrix, we can proceed as in the following algorithm. We assume that the main diagonal of $A$ is stored in the vector $\mathbf{a}$, the subdiagonal (entries $a_{j+1,j}$) is stored in the vector $\mathbf{l}$, and the superdiagonal (entries $a_{j,j+1}$) is stored in the vector $\mathbf{u}$.

for $i = 1, 2, \ldots, n - 1$ do
    $l_i = l_i / a_i$
    $a_{i+1} = a_{i+1} - l_i u_i$
end

Then, we can use these updated vectors to solve the system $A\mathbf{x} = \mathbf{b}$ using forward and back substitution as follows:

$y_1 = b_1$
for $i = 2, 3, \ldots, n$ do
    $y_i = b_i - l_{i-1} y_{i-1}$
end
$x_n = y_n / a_n$
for $i = n - 1, n - 2, \ldots, 1$ do
    $x_i = (y_i - u_i x_{i+1}) / a_i$
end

After Gaussian elimination, the components of the vector $\mathbf{l}$ are the subdiagonal entries of $L$ in the $LU$ decomposition of $A$, and the components of the vector $\mathbf{u}$ are the superdiagonal entries of $U$. These steps are illustrated in the code sketch at the end of these notes.

Pivoting can cause difficulties for banded systems because it can cause fill-in: the introduction of nonzero entries outside of the band. For this reason, when pivoting is necessary, pivoting schemes that offer more flexibility than partial pivoting are typically used. The resulting trade-off is that the entries of $L$ are permitted to be somewhat larger, but the sparsity (that is, the occurrence of zero entries) of $A$ is preserved to a greater extent.
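To make the three-vector storage scheme concrete, here is a minimal Python sketch of the tridiagonal elimination and substitution algorithms above. The function name tridiagonal_solve and the test right-hand side are illustrative choices, not from the text; as in the text, no pivoting is performed, so the elimination can break down on a zero pivot.

```python
import numpy as np

def tridiagonal_solve(l, a, u, b):
    """Solve a tridiagonal system A x = b stored in three vectors:
    a is the main diagonal (length n), l the subdiagonal
    (l[i] = A[i+1, i], length n-1), and u the superdiagonal
    (u[i] = A[i, i+1], length n-1)."""
    n = len(a)
    l = np.array(l, dtype=float)   # copies, since elimination overwrites l and a
    a = np.array(a, dtype=float)
    # Gaussian elimination: afterwards, l holds the subdiagonal of L
    # and a holds the diagonal of U (u is unchanged)
    for i in range(n - 1):
        l[i] = l[i] / a[i]
        a[i + 1] = a[i + 1] - l[i] * u[i]
    # forward substitution (L y = b, unit diagonal)
    y = np.zeros(n)
    y[0] = b[0]
    for i in range(1, n):
        y[i] = b[i] - l[i - 1] * y[i - 1]
    # back substitution (U x = y)
    x = np.zeros(n)
    x[n - 1] = y[n - 1] / a[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - u[i] * x[i + 1]) / a[i]
    return x

# the second-difference matrix from the example above, n = 5
n = 5
a = -2.0 * np.ones(n)
l = np.ones(n - 1)
u = np.ones(n - 1)
b = np.ones(n)                    # hypothetical right-hand side
x = tridiagonal_solve(l, a, u, b)
print(x)
```

The whole solve touches each of the three vectors a constant number of times, which is the $O(n)$ operation count claimed above for matrices of $O(1)$ bandwidth.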