Math 405: Numerical Methods for Differential Equations, 2015 W1
Topic 10a: Numerical Linear Algebra: Gaussian Elimination

Setup: given a square $n \times n$ matrix $A$ and a vector $b$ with $n$ components, find $x$ such that $Ax = b$. Equivalently, find $x = (x_1, x_2, \ldots, x_n)^T$ for which
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}
\tag{1}
\]

Lower-triangular matrices: the matrix $A$ is lower triangular if $a_{ij} = 0$ for all $1 \le i < j \le n$. The system (1) is easy to solve if $A$ is lower triangular, working from the first equation downward:
\[
\begin{aligned}
a_{11}x_1 &= b_1 &&\Longrightarrow& x_1 &= \frac{b_1}{a_{11}}\\
a_{21}x_1 + a_{22}x_2 &= b_2 &&\Longrightarrow& x_2 &= \frac{b_2 - a_{21}x_1}{a_{22}}\\
&\;\;\vdots\\
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{ii}x_i &= b_i &&\Longrightarrow& x_i &= \frac{b_i - \sum_{j=1}^{i-1} a_{ij}x_j}{a_{ii}}\\
&\;\;\vdots
\end{aligned}
\]
This works if, and only if, $a_{ii} \ne 0$ for each $i$. The procedure is known as forward substitution.

Computational work estimate: one floating-point operation (flop) is one scalar multiplication, division, addition, or subtraction, as in $y = a*x$, where $a$, $x$, and $y$ are computer representations of real scalars. (This is an abstraction: e.g., some hardware can do $y = a*x + b$ in one FMA flop ("Fused Multiply and Add") but then needs several FMA flops for a single division. For a trip down this sort of rabbit hole, look up the "Fast inverse square root" as used in the source code of the video game "Quake III Arena".) Hence the work in forward substitution is 1 flop to compute $x_1$, plus 3 flops to compute $x_2$, plus $\ldots$, plus $2i-1$ flops to compute $x_i$, plus $\ldots$, plus $2n-1$ flops to compute $x_n$, or in total
\[
\sum_{i=1}^{n} (2i - 1) = 2\left(\sum_{i=1}^{n} i\right) - n = 2\left(\tfrac{1}{2}n(n+1)\right) - n = n^2 + \text{lower order terms}
\]
flops. We sometimes write this as $n^2 + O(n)$ flops or, more crudely, $O(n^2)$ flops.

Upper-triangular matrices: the matrix $A$ is upper triangular if $a_{ij} = 0$ for all $1 \le j < i \le n$. Once again, the system (1) is easy to solve if $A$ is upper triangular, this time working from the last equation upward:
\[
\begin{aligned}
a_{nn}x_n &= b_n &&\Longrightarrow& x_n &= \frac{b_n}{a_{nn}}\\
a_{n-1,n-1}x_{n-1} + a_{n-1,n}x_n &= b_{n-1} &&\Longrightarrow& x_{n-1} &= \frac{b_{n-1} - a_{n-1,n}x_n}{a_{n-1,n-1}}\\
&\;\;\vdots\\
a_{ii}x_i + \cdots + a_{i,n-1}x_{n-1} + a_{in}x_n &= b_i &&\Longrightarrow& x_i &= \frac{b_i - \sum_{j=i+1}^{n} a_{ij}x_j}{a_{ii}}\\
&\;\;\vdots
\end{aligned}
\]
Again, this works if, and only if, $a_{ii} \ne 0$ for each $i$. The procedure is known as backward or back substitution. It also takes approximately $n^2$ flops.

For computation, we need a reliable, systematic technique for reducing $Ax = b$ to $Ux = c$ with the same solution $x$ but with $U$ upper triangular: Gaussian elimination.

Example
\[
\begin{pmatrix} 3 & -1\\ 1 & 2 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2 \end{pmatrix}
=
\begin{pmatrix} 12\\ 11 \end{pmatrix}.
\]
Multiply the first equation by $1/3$ and subtract it from the second:
\[
\Longrightarrow\quad
\begin{pmatrix} 3 & -1\\ 0 & \tfrac{7}{3} \end{pmatrix}
\begin{pmatrix} x_1\\ x_2 \end{pmatrix}
=
\begin{pmatrix} 12\\ 7 \end{pmatrix}.
\]

Gaussian Elimination (GE): this is most easily described in terms of overwriting the matrix $A = \{a_{ij}\}$ and vector $b$. At each stage, it is a systematic way of introducing zeros into the lower-triangular part of $A$ by subtracting multiples of previous equations (i.e., rows); such (elementary row) operations do not change the solution.

    for columns j = 1, 2, ..., n-1
        for rows i = j+1, j+2, ..., n
            row i ← row i − (a_ij / a_jj) * row j
            b_i ← b_i − (a_ij / a_jj) * b_j
        end
    end

Example
\[
\begin{pmatrix} 3 & -1 & 2\\ 1 & 2 & 3\\ 2 & -2 & -1 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}
=
\begin{pmatrix} 12\\ 11\\ 2 \end{pmatrix}:
\quad\text{represent as}\quad
\left(\begin{array}{ccc|c} 3 & -1 & 2 & 12\\ 1 & 2 & 3 & 11\\ 2 & -2 & -1 & 2 \end{array}\right)
\]
\[
\begin{array}{l}
\text{row 2} \leftarrow \text{row 2} - \tfrac{1}{3}\,\text{row 1}\\
\text{row 3} \leftarrow \text{row 3} - \tfrac{2}{3}\,\text{row 1}
\end{array}
\;\Longrightarrow\;
\left(\begin{array}{ccc|c} 3 & -1 & 2 & 12\\ 0 & \tfrac{7}{3} & \tfrac{7}{3} & 7\\ 0 & -\tfrac{4}{3} & -\tfrac{7}{3} & -6 \end{array}\right)
\]
\[
\text{row 3} \leftarrow \text{row 3} + \tfrac{4}{7}\,\text{row 2}
\;\Longrightarrow\;
\left(\begin{array}{ccc|c} 3 & -1 & 2 & 12\\ 0 & \tfrac{7}{3} & \tfrac{7}{3} & 7\\ 0 & 0 & -1 & -2 \end{array}\right)
\]
Back substitution:
\[
x_3 = \frac{-2}{-1} = 2, \qquad
x_2 = \frac{7 - \tfrac{7}{3}(2)}{\tfrac{7}{3}} = 1, \qquad
x_1 = \frac{12 - (-1)(1) - 2(2)}{3} = 3.
\]
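To make the elimination and back-substitution steps concrete, here is a minimal Python/NumPy sketch (the names gauss_eliminate and back_substitute are illustrative, not from the course notes). It assumes every pivot encountered is nonzero, so no row swaps are performed, and it reproduces the 3-by-3 worked example above.

    # Minimal sketch: Gaussian elimination without pivoting, then back substitution.
    import numpy as np

    def gauss_eliminate(A, b):
        """Reduce Ax = b to Ux = c (U upper triangular) by elementary row operations.
        Assumes every pivot a_jj encountered is nonzero (no row swaps)."""
        U = A.astype(float).copy()
        c = b.astype(float).copy()
        n = len(c)
        for j in range(n - 1):              # columns
            for i in range(j + 1, n):       # rows below the pivot
                m = U[i, j] / U[j, j]       # multiplier a_ij / a_jj (one flop)
                U[i, j:] -= m * U[j, j:]    # row i <- row i - m * row j
                c[i] -= m * c[j]
        return U, c

    def back_substitute(U, c):
        """Solve Ux = c for upper-triangular U with nonzero diagonal entries."""
        n = len(c)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    A = np.array([[3., -1., 2.], [1., 2., 3.], [2., -2., -1.]])
    b = np.array([12., 11., 2.])
    U, c = gauss_eliminate(A, b)
    print(back_substitute(U, c))   # expect [3. 1. 2.], matching the worked example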
Cost of Gaussian Elimination: note that row $i \leftarrow$ row $i - \frac{a_{ij}}{a_{jj}} *$ row $j$ is

    for columns k = j+1, j+2, ..., n
        a_ik ← a_ik − (a_ij / a_jj) * a_jk
    end

This is approximately $2(n-j)$ flops, as the multiplier $a_{ij}/a_{jj}$ is calculated with just one flop; $a_{jj}$ is called the pivot. Overall, therefore, the cost of GE is approximately
\[
\sum_{j=1}^{n-1} 2(n-j)^2 = 2\sum_{l=1}^{n-1} l^2 = 2\,\frac{n(n-1)(2n-1)}{6} = \tfrac{2}{3}n^3 + O(n^2)
\]
flops. The calculations involving $b$ are
\[
\sum_{j=1}^{n-1} 2(n-j) = 2\sum_{l=1}^{n-1} l = 2\,\frac{n(n-1)}{2} = n^2 + O(n)
\]
flops, just as for the triangular substitution.
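As a sanity check on these estimates, one can tally the scalar operations performed by the loops directly. The following small sketch uses a hypothetical helper ge_flop_count that simply mirrors the loop structure above and counts flops rather than doing any arithmetic on a matrix.

    # Rough sketch: count flops in the GE loops (one flop per scalar *, /, +, or -).
    def ge_flop_count(n):
        a_flops = 0   # work on the matrix entries a_ik
        b_flops = 0   # work on the right-hand side entries b_i
        for j in range(n - 1):            # columns (0-based here)
            for i in range(j + 1, n):     # rows below the pivot
                a_flops += 1                    # multiplier a_ij / a_jj
                a_flops += 2 * (n - 1 - j)      # a_ik <- a_ik - m * a_jk for k > j
                b_flops += 2                    # b_i  <- b_i  - m * b_j
        return a_flops, b_flops

    for n in (10, 100, 1000):
        a_flops, b_flops = ge_flop_count(n)
        print(n, a_flops, round(2 * n**3 / 3), b_flops, n**2)
    # a_flops approaches (2/3) n^3 and b_flops approaches n^2 as n grows.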