2.5. Accuracy Estimation

Reading: Trefethen and Bau (1997), Lecture 22

Accuracy:
- With roundoff, how accurate is a solution obtained by Gaussian elimination?
- How sensitive is a solution to perturbations introduced by round-off error (stability)?
- How can we appraise the accuracy of a computed solution?
- Can the accuracy of a computed solution be improved?

Let $\hat x$ be a computed solution of
$$Ax = b \qquad (1)$$
- Compute the residual
$$r = b - A\hat x \qquad (2)$$
- Is $r$ a good measure of the accuracy of $\hat x$?

Example 1. Solve (1) with
$$A = \begin{bmatrix} 1.2969 & 0.8648 \\ 0.2161 & 0.1441 \end{bmatrix}, \qquad b = \begin{bmatrix} 0.8642 \\ 0.1440 \end{bmatrix}$$
- An approximate solution is $\hat x = [0.9911, \; -0.4870]^T$
- The residual is $r = [-10^{-8}, \; 10^{-8}]^T$
- Exact solution: $x = [2, \; -2]^T$
- $\det(A) = 10^{-8}$
- The residual is tiny even though $\hat x$ is far from $x$: a small residual does not guarantee an accurate solution.

Sensitivity

Example 2. Consider
$$A = \begin{bmatrix} 5 & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10 \end{bmatrix}$$
- $A$ is symmetric and positive definite with $\det(A) = 1$
- Exact solutions:

      b^T                               x^T
      [23,    32,    33,    31   ]      [ 1,     1,    1,     1   ]
      [22.9,  32.1,  32.9,  31.1 ]      [-7.2,   6,    2.9,  -0.1 ]
      [22.99, 32.01, 32.99, 31.01]      [ 0.18,  1.5,  1.19,  0.89]

Perturbations

Definition 1: A matrix is ill-conditioned if small changes in $A$ or $b$ produce large changes in the solution.
- Suppose that $b$ is perturbed to $b + \delta b$
- The solution $x + \delta x$ of the perturbed system satisfies
$$A(x + \delta x) = b + \delta b$$
- Estimate the magnitude of $\delta x$:
$$\delta x = A^{-1}\,\delta b$$
- Taking a norm,
$$\|\delta x\| = \|A^{-1}\,\delta b\| \le \|A^{-1}\|\,\|\delta b\|$$
- In addition,
$$\|b\| = \|Ax\| \le \|A\|\,\|x\|$$
- Multiplying the two inequalities,
$$\|\delta x\|\,\|b\| \le \|A\|\,\|A^{-1}\|\,\|\delta b\|\,\|x\|$$
or
$$\frac{\|\delta x\|}{\|x\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|} \qquad (3a)$$
where
$$\kappa(A) = \|A\|\,\|A^{-1}\| \qquad (3b)$$
is called the condition number.

Note: The Condition Number
i. The condition number relates the relative uncertainty of the data ($\|\delta b\|/\|b\|$) to the relative uncertainty of the solution ($\|\delta x\|/\|x\|$)
ii. If $\kappa(A)$ is large, small changes in $b$ may result in large changes in $x$
iii. If $\kappa(A)$ is large, $A$ is ill-conditioned
iv. The bound (3a) is sharp: there are $b$'s for which equality holds
v. $\det(A)$ has nothing to do with conditioning. The matrix
$$\begin{bmatrix} 10^{-30} & 0 \\ 0 & 10^{-30} \end{bmatrix}$$
has determinant $10^{-60}$ but is well-conditioned ($\kappa = 1$)

Repeat the analysis for a perturbation in $A$:
$$(A + \delta A)(x + \delta x) = b$$
or
$$Ax + A\,\delta x + \delta A\,(x + \delta x) = b$$
or
$$\delta x = -A^{-1}\,\delta A\,(x + \delta x)$$
or
$$\|\delta x\| \le \|A^{-1}\|\,\|\delta A\|\,\|x + \delta x\|$$
or
$$\frac{\|\delta x\|}{\|x + \delta x\|} \le \kappa(A)\,\frac{\|\delta A\|}{\|A\|} \qquad (3c)$$

Condition Numbers

Example 3. The matrix $A$ of Example 1 and its inverse are
$$A = \begin{bmatrix} 1.2969 & 0.8648 \\ 0.2161 & 0.1441 \end{bmatrix}, \qquad A^{-1} = 10^8 \begin{bmatrix} 0.1441 & -0.8648 \\ -0.2161 & 1.2969 \end{bmatrix}$$
- Calculate $\|A\|_\infty = 2.1617$ and $\|A^{-1}\|_\infty = 1.5130 \times 10^8$, so $\kappa_\infty(A) = 3.2707 \times 10^8$

Example 4. The matrix $A$ of Example 2 and its inverse are
$$A = \begin{bmatrix} 5 & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10 \end{bmatrix}, \qquad A^{-1} = \begin{bmatrix} 68 & -41 & -17 & 10 \\ -41 & 25 & 10 & -6 \\ -17 & 10 & 5 & -3 \\ 10 & -6 & -3 & 2 \end{bmatrix}$$
and $\|A\|_\infty = 33$, $\|A^{-1}\|_\infty = 136$, so $\kappa_\infty(A) = 4488$ (both condition numbers are verified numerically in the sketches below).
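These norms are easy to check numerically. A minimal sketch, assuming NumPy (the notes themselves prescribe no particular software):

```python
import numpy as np

# Example 3: the 2x2 matrix of Example 1
A1 = np.array([[1.2969, 0.8648],
               [0.2161, 0.1441]])
print(np.linalg.cond(A1, np.inf))                # ~3.2707e+08

# Example 4: the symmetric positive definite matrix of Example 2
A2 = np.array([[5.0, 7, 6, 5],
               [7, 10, 8, 7],
               [6, 8, 10, 9],
               [5, 7, 9, 10]])
print(np.linalg.norm(A2, np.inf))                # 33  (max row sum)
print(np.linalg.norm(np.linalg.inv(A2), np.inf)) # 136
print(np.linalg.cond(A2, np.inf))                # 4488
```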
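The same $\kappa_\infty(A) = 4488$ accounts for the sensitivity observed in Example 2: relative changes of order $10^{-3}$ in $b$ produce relative changes of order $1$ in $x$. A short sketch, again assuming NumPy, reproduces the table of Example 2:

```python
import numpy as np

A = np.array([[5.0, 7, 6, 5],
              [7, 10, 8, 7],
              [6, 8, 10, 9],
              [5, 7, 9, 10]])
for b in ([23, 32, 33, 31],
          [22.9, 32.1, 32.9, 31.1],
          [22.99, 32.01, 32.99, 31.01]):
    x = np.linalg.solve(A, np.array(b, dtype=float))
    print(np.round(x, 2))
# [ 1.    1.    1.    1.  ]
# [-7.2   6.    2.9  -0.1 ]
# [ 0.18  1.5   1.19  0.89]
```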
Stability of Gaussian Elimination

How do rounding errors propagate in Gaussian elimination?
- Let's discuss this without explicit treatment of pivoting
- Assume that $A$ has its rows arranged so that pivoting is unnecessary

Theorem 1: Let $A$ be an $n \times n$ real matrix of floating-point numbers on a computer with unit round-off error $u$. Assume that no zero pivots are encountered. Let $\hat L$ and $\hat U$ be the computed triangular factors of $A$, and let $\hat y$ and $\hat x$ be the computed solutions of
$$\hat L y = b, \qquad \hat U x = \hat y \qquad (4a)$$
Then
$$(A + \delta A)\,\hat x = b \qquad (4b)$$
where
$$|\delta A| \le 3nu\,|\hat L|\,|\hat U| + O(n^2 u^2) \qquad (4c)$$
- Remark 1: Recall (Section 2.1) that $|A|$ is the matrix with elements $|a_{ij}|$, $i, j = 1:n$
- Remark 2: The computed solution $\hat x$ is the exact solution of a perturbed system with matrix $A + \delta A$
- Remark 3: Additionally, $\hat L$ and $\hat U$ are the exact factors of a perturbed matrix $A + E$

Proof: Follow the arguments used in the round-off error analysis of the dot product in Section 2.1 (cf. Demmel, Section 2.4).

Growth Factor

Remark: For the infinity, one, and Frobenius norms, $\|\,|z|\,\| = \|z\|$.
- Taking a norm of (4c),
$$\|\delta A\| \le 3nu\,\|\hat L\|\,\|\hat U\| + O(n^2 u^2) \qquad (5)$$
We would like to estimate $\|\hat L\|$ and $\|\hat U\|$
- With partial pivoting, the elements of $L$ are bounded by unity; thus $\|L\|_\infty \le n$
- Bounding $\|U\|$ is much more difficult

Define the growth factor as
$$\rho = \frac{\max_{i,j} |u_{ij}|}{\max_{i,j} |a_{ij}|} \qquad (6)$$
Then $|u_{ij}| \le \rho\,\|A\|_\infty$ and $\|U\|_\infty \le n\rho\,\|A\|_\infty$.
- Using the bounds for $\|L\|_\infty$ and $\|U\|_\infty$ in (5), we find
$$\|\delta A\|_\infty \le 3\rho n^3 u\,C\,\|A\|_\infty \qquad (7)$$
The constant $C$ accounts for our "casual" treatment of the $O(n^2 u^2)$ term.

Growth Factors

Note:
i. The bound (7) for $\delta A$ is acceptable unless $\rho$ is large
ii. J.H. Wilkinson (1961)¹ shows that if $|a_{ij}| \le 1$, $i, j = 1:n$, then
   - $\rho \le 2^{n-1}$ with partial pivoting
   - $\rho \le 1.8\,n^{0.25 \ln n}$ with complete pivoting
   - The bound with partial pivoting is sharp! Consider the $5 \times 5$ matrix
$$A = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ -1 & 1 & 0 & 0 & 1 \\ -1 & -1 & 1 & 0 & 1 \\ -1 & -1 & -1 & 1 & 1 \\ -1 & -1 & -1 & -1 & 1 \end{bmatrix}$$
which has
$$L = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ -1 & -1 & 1 & 0 & 0 \\ -1 & -1 & -1 & 1 & 0 \\ -1 & -1 & -1 & -1 & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 8 \\ 0 & 0 & 0 & 0 & 16 \end{bmatrix}$$
Thus $\rho = 16$. For an $n \times n$ matrix of this form, $\rho = 2^{n-1}$ (checked numerically in the sketch below).
iii. In most cases $\rho \le 8$ with partial pivoting
iv. Our bounds are conservative. In most cases
$$\frac{\|\delta A\|_\infty}{\|A\|_\infty} \approx nu$$

¹ J.H. Wilkinson, "Error analysis of direct methods of matrix inversion," J. ACM, 8 (1961), 281-330.
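The worst-case example above is easy to verify. A sketch assuming NumPy and SciPy (`scipy.linalg.lu` performs LU factorization with partial pivoting; for this matrix no rows are actually swapped):

```python
import numpy as np
from scipy.linalg import lu

def growth_example(n):
    """1 on the diagonal and in the last column, -1 below the diagonal."""
    A = np.tril(-np.ones((n, n)), -1) + np.eye(n)
    A[:, -1] = 1.0
    return A

n = 5
A = growth_example(n)
P, L, U = lu(A)                      # A = P @ L @ U, partial pivoting
rho = np.abs(U).max() / np.abs(A).max()
print(U[:, -1])                      # [1. 2. 4. 8. 16.]: doubles at every step
print(rho)                           # 16.0 = 2**(n-1)
```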
Iterative Improvement

Iterative improvement can enhance the accuracy of a computed solution
- It works best when double-precision arithmetic is available, although double precision costs much more than single precision

Let $\hat x$ be an approximate solution of (1). Calculate the residual (2):
$$r = b - A\hat x = A(x - \hat x)$$
Solve
$$A\,\delta x = r \qquad (8a)$$
for $\delta x$ and calculate an "improved" solution
$$x = \hat x + \delta x \qquad (8b)$$

$\hat x$ cannot be improved unless:
- The residual is calculated in double precision
- The original (not the factored) $A$ is used to calculate $r$

The procedure is
1. Calculate $r = b - A\hat x$ using double precision
2. Solve $\hat L y = Pr$ by forward substitution
3. Solve $\hat U\,\delta x = y$ by backward substitution
4. $x = \hat x + \delta x$

Notes: Iterative Improvement
i. Round the double-precision residual to single precision
ii. The cost of iterative improvement:
   - One matrix-by-vector multiplication for the residual and one forward and one backward substitution
   - Each costs $n^2$ multiplications if double-precision hardware is available
   - The total cost of $2n^2$ multiplications is small relative to the factorization
iii. Continue the iteration for further improvement
iv. Have to save the original $A$ and the factors $\hat L$ and $\hat U$
v. Have to do mixed-precision arithmetic

Accuracy of the improved solution:
- From (8),
$$\delta x := x - \hat x = A^{-1}(b - A\hat x) = A^{-1} r$$
- Take a norm:
$$\|\delta x\|_\infty \le \|A^{-1}\|_\infty\,\|r\|_\infty$$
- $\hat x$ satisfies (4b), so
$$r = b - A\hat x = (A + \delta A)\hat x - A\hat x = \delta A\,\hat x$$
- Take a norm:
$$\|r\|_\infty \le \|\delta A\|_\infty\,\|\hat x\|_\infty$$

Still Improving

- Combining results,
$$\|\delta x\|_\infty \le \|A^{-1}\|_\infty\,\|\delta A\|_\infty\,\|\hat x\|_\infty$$
- Using (3b),
$$\|\delta x\|_\infty \le \kappa(A)\,\frac{\|\delta A\|_\infty}{\|A\|_\infty}\,\|\hat x\|_\infty \qquad (9a)$$
- We could estimate $\|\delta A\|_\infty$ using (7); Golub and Van Loan (1996) suggest $\|\delta A\|_\infty \approx nu\,\|A\|_\infty$, and Trefethen and Bau (1997) suggest $\rho = O(n^{1/2})$. Then
$$\|\delta x\|_\infty \approx nu\,\kappa(A)\,\|\hat x\|_\infty \qquad (9b)$$
- If $u \propto \beta^{1-t}$ and $\kappa(A) \propto \beta^{s-1}$, then
$$\frac{\|\delta x\|_\infty}{\|\hat x\|_\infty} \approx n c\,\beta^{s-t} \qquad (9c)$$
- $c$ represents constants arising from $\rho$ and $C$
- The computed solution $\hat x$ will have about $t - s$ correct $\beta$-digits
- If $s$ is small, $A$ is well conditioned and the error is small
- If $s$ is large, $A$ is ill-conditioned and the error is large
- If $s > t$, the error grows relative to $\hat x$

It Don't Get Much Better

Let $\hat x^{(0)} = \hat x$, $\delta x^{(0)} = \delta x$, and $\hat x^{(1)} = \hat x^{(0)} + \delta x^{(0)}$ be the solution computed by iterative improvement. Assume $x \approx \hat x^{(0)} + \delta x^{(0)}$ and calculate an estimate of $\kappa(A)$ from (9b) as
$$\kappa(A) \approx \frac{1}{nu}\,\frac{\|\delta x^{(0)}\|_\infty}{\|\hat x^{(0)}\|_\infty}$$
An empirical result that holds widely is
$$\frac{\kappa(A)}{n} \le \frac{1}{nu}\,\frac{\|\delta x^{(0)}\|_\infty}{\|\hat x^{(0)}\|_\infty} \le n\,\kappa(A) \qquad (10)$$

Theorem 2: Suppose
$$r^{(k)} = b - A\hat x^{(k)}, \qquad \hat x^{(k+1)} = \hat x^{(k)} + \delta x^{(k)} \qquad (11)$$
are calculated in double precision and $\delta x^{(k)}$ is the solution of $A\,\delta x^{(k)} = r^{(k)}$ computed in single precision; thus
$$(A + \delta A^{(k)})\,\delta x^{(k)} = r^{(k)}$$
If $\|A^{-1}\,\delta A^{(k)}\| \le 1/2$, then $\|\hat x^{(k)} - x\| \to 0$ as $k \to \infty$.

Corollary 2: If $\|\delta A^{(k)}\| \le nu\,\|A\|$ and $nu\,\kappa(A) \le 1/2$, then $\|\hat x^{(k)} - x\| \to 0$ as $k \to \infty$.
- Proof: cf. G.E. Forsythe and C.B. Moler (1967), Computer Solution of Linear Algebraic Systems, Prentice-Hall, Englewood Cliffs.

Remark: Iterative improvement converges rather generally, and each iteration gives about $t - s$ additional correct digits.

A Hilbert Matrix

Example 5. An $n \times n$ Hilbert matrix has entries
$$h_{ij} = \frac{1}{i + j - 1}$$
- With $n = 4$:
$$H = \begin{bmatrix} 1 & 1/2 & 1/3 & 1/4 \\ 1/2 & 1/3 & 1/4 & 1/5 \\ 1/3 & 1/4 & 1/5 & 1/6 \\ 1/4 & 1/5 & 1/6 & 1/7 \end{bmatrix}$$
- It is symmetric and positive definite
- It arises in the polynomial approximation of functions

Condition numbers of Hilbert matrices:

      n    κ₂(H)
      2    1.9282e+01
      4    1.5514e+04
      8    1.5258e+10
      16   6.5341e+17

Solve $Hx = b$ with $n = 12$
- Select $b$ as the first column of the identity matrix
- Then $x$ is the first column of $H^{-1}$

Solve by Gaussian elimination with one step of iterative improvement (a reconstruction of the experiment follows):
$$\frac{\|\hat x^{(0)} - x\|_2}{\|x\|_2} = 0.0784, \qquad \frac{\|\hat x^{(1)} - x\|_2}{\|x\|_2} = 0.0086$$
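A sketch of the experiment, assuming NumPy/SciPy, with `float32` standing in for single precision and `float64` for double. The exact error values depend on the arithmetic and pivoting of the machine used for the lecture, so they will differ somewhat from the 0.0784 and 0.0086 quoted above, but one refinement step should recover roughly a digit:

```python
import numpy as np
from scipy.linalg import hilbert, invhilbert, lu_factor, lu_solve

n = 12
H = hilbert(n)                         # h_ij = 1/(i+j-1), double precision
x_exact = invhilbert(n)[:, 0]          # x = first column of H^{-1}
b = np.eye(n)[:, 0]

# Factor and solve in "single precision" (float32)
lu, piv = lu_factor(H.astype(np.float32))
x0 = lu_solve((lu, piv), b.astype(np.float32))

# One step of iterative improvement: residual in double precision with the
# original H (steps 1-4 above), correction from the saved factors
r = b - H @ x0.astype(np.float64)
dx = lu_solve((lu, piv), r.astype(np.float32))
x1 = x0.astype(np.float64) + dx

rel_err = lambda xh: np.linalg.norm(xh - x_exact) / np.linalg.norm(x_exact)
print(rel_err(x0.astype(np.float64)), rel_err(x1))

# Estimate kappa(H) from the first correction, as in (10)
u = np.finfo(np.float32).eps / 2
print(np.linalg.norm(dx, np.inf) / np.linalg.norm(x0, np.inf) / (n * u))
```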
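The table of condition numbers above can be checked the same way; a one-loop sketch assuming NumPy/SciPy (`np.linalg.cond` uses the 2-norm by default). Note that for $n = 16$, $\kappa_2(H)$ exceeds $1/u$ in double precision, so the computed value is itself contaminated by rounding:

```python
import numpy as np
from scipy.linalg import hilbert

for n in (2, 4, 8, 16):
    print(n, np.linalg.cond(hilbert(n)))   # 2-norm condition number
```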