AMCS 251: Numerical Linear Algebra (Numerical Linear Algebra for Computational and Data Sciences) Lecture 6: Accuracy and Stability; Triangular Systems; Backward Stability of Back Substitution Matteo Parsani King Abdullah University of Science and Technology Matteo Parsani Numerical Linear Algebra 1 / 21 Outline 1 Accuracy and Stability (NLA§14-15) 2 Triangular Systems (MC§3.1) 3 Backward Stability of Back Substitution (NLA§17) Matteo Parsani Numerical Linear Algebra 2 / 21 Accuracy Roughly speaking, accuracy means that “error” is small in an asymptotic sense, say O(machine ) Notation ϕ(t) = O(ψ(t)) means ∃C s.t. |ϕ(t)| ≤ C |ψ(t)| as t approaches 0 (or ∞) I Example: sin2 t = O(t 2 ) as t → 0 If ϕ depends on s and t, then ϕ(s, t) = O(ψ(t)) means ∃C s.t. |ϕ(s, t)| ≤ C |ψ(t)| for any s as t approaches 0 (or ∞) I Example: sin2 t sin2 s = O(t 2 ) as t → 0 When we say O(machine ), we are thinking of a series of idealized machines for which machine → 0 Matteo Parsani Numerical Linear Algebra 3 / 21 More on Accuracy An algorithm f˜ is accurate if relative error is in the order of machine precision, i.e., kf˜(x) − f (x)k/kf (x)k = O(machine ), i.e., ≤ C1 machine as machine → 0, where constant C1 may depend on the condition number and the algorithm itself In most cases, we expect kf˜(x) − f (x)k/kf (x)k = O(κmachine ), i.e., ≤ C κmachine as machine → 0, where constant C should be independent of κ and value of x (although it may depends on the dimension of x) How do we determine whether an algorithm is accurate or not? I I I It turns out to be an extremely subtle question A forward error analysis (operation by operation) is often too difficult and impractical, and cannot capture dependence on condition number An effective solution is backward error analysis Matteo Parsani Numerical Linear Algebra 4 / 21 Stability We say an algorithm is stable if it gives “nearly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is stable if (for all x) kf˜(x) − f (x̃)k/kf (x̃)k = O(machine ) for some x̃ with kx̃ − xk/kxk = O(machine ) Matteo Parsani Numerical Linear Algebra 5 / 21 Stability We say an algorithm is stable if it gives “nearly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is stable if (for all x) kf˜(x) − f (x̃)k/kf (x̃)k = O(machine ) for some x̃ with kx̃ − xk/kxk = O(machine ) We say an algorithm is backward stable if it gives “exactly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is backward stable if (for all x) f˜(x) = f (x̃) for some x̃ with kx̃ − xk/kxk = O(machine ) Matteo Parsani Numerical Linear Algebra 5 / 21 Stability We say an algorithm is stable if it gives “nearly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is stable if (for all x) kf˜(x) − f (x̃)k/kf (x̃)k = O(machine ) for some x̃ with kx̃ − xk/kxk = O(machine ) We say an algorithm is backward stable if it gives “exactly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is backward stable if (for all x) f˜(x) = f (x̃) for some x̃ with kx̃ − xk/kxk = O(machine ) Is stability or backward stability stronger? Matteo Parsani Numerical Linear Algebra 5 / 21 Stability We say an algorithm is stable if it gives “nearly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is stable if (for all x) kf˜(x) − f (x̃)k/kf (x̃)k = O(machine ) for some x̃ with kx̃ − xk/kxk = O(machine ) We say an algorithm is backward stable if it gives “exactly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is backward stable if (for all x) f˜(x) = f (x̃) for some x̃ with kx̃ − xk/kxk = O(machine ) Is stability or backward stability stronger? I Backward stability is stronger. Does (backward ) stability depend on condition number of f (x)? Matteo Parsani Numerical Linear Algebra 5 / 21 Stability We say an algorithm is stable if it gives “nearly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is stable if (for all x) kf˜(x) − f (x̃)k/kf (x̃)k = O(machine ) for some x̃ with kx̃ − xk/kxk = O(machine ) We say an algorithm is backward stable if it gives “exactly the right answer to nearly the right question” More formally, an algorithm f˜ for problem f is backward stable if (for all x) f˜(x) = f (x̃) for some x̃ with kx̃ − xk/kxk = O(machine ) Is stability or backward stability stronger? I Backward stability is stronger. Does (backward ) stability depend on condition number of f (x)? I No. Matteo Parsani Numerical Linear Algebra 5 / 21 Stability of Floating Point Arithmetic Backward stability of floating point operations is implied by these two floating point axioms: 1 2 ∀x ∈ R, ∃, || ≤ machine s.t. fl(x) = x(1 + ) For floating-point numbers x, y , ∃, || ≤ machine s.t. x ~ y = (x ∗ y )(1 + ) Example: Subtraction f (x1 , x2 ) = x1 − x2 with floating-point operation f˜(x1 , x2 ) = fl(x1 ) fl(x2 ) I I I Axiom 1 implies fl(x1 ) = x1 (1 + 1 ), fl(x2 ) = x2 (1 + 2 ), for some |1 |, |2 | ≤ machine Axiom 2 implies fl(x1 ) fl(x2 ) = (fl(x1 ) − fl(x2 ))(1 + 3 ) for some |3 | ≤ machine Therefore, fl(x1 ) fl(x2 ) = (x1 (1 + 1 ) − x2 (1 + 2 ))(1 + 3 ) = x1 (1 + 1 )(1 + 3 ) − x2 (1 + 2 )(1 + 3 ) = x1 (1 + 4 ) − x2 (1 + 5 ) where |4 |, |5 | ≤ 2machine + O(2 ) machine Matteo Parsani Numerical Linear Algebra 6 / 21 Stability of Floating Point Arithmetic Cont’d Example: Inner product f (x, y ) = x T y using floating-point operations ⊗ and ⊕ is backward stable Example: Outer product f (x, y ) = xy T using ⊗ and ⊕ is not backward stable Example: f (x) = x + 1 computed as f˜(x) = fl(x) ⊕ 1 is not backward stable Example: f (x, y ) = x + y computed as f˜(x, y ) = fl(x) ⊕ fl(y ) is backward stable Matteo Parsani Numerical Linear Algebra 7 / 21 Accuracy of Backward Stable Algorithm Theorem If a backward stable algorithm f˜ is used to solve a problem f with condition number κ using floating-point numbers satisfying the two axioms, then kf˜(x) − f (x)k/kf (x)k = O(κ(x)machine ) Matteo Parsani Numerical Linear Algebra 8 / 21 Accuracy of Backward Stable Algorithm Theorem If a backward stable algorithm f˜ is used to solve a problem f with condition number κ using floating-point numbers satisfying the two axioms, then kf˜(x) − f (x)k/kf (x)k = O(κ(x)machine ) Proof: Backward stability means f˜(x) = f (x̃) for x̃ such that kx̃ − xk/kxk = O(machine ) Definition of condition number gives kf (x̃) − f (x)k/kf (x)k ≤ (κ(x) + o(1))kx̃ − xk/kxk where o(1) → 0 as machine → 0. Combining the two gives desired result. Matteo Parsani Numerical Linear Algebra 8 / 21 Outline 1 Accuracy and Stability (NLA§14-15) 2 Triangular Systems (MC§3.1) 3 Backward Stability of Back Substitution (NLA§17) Matteo Parsani Numerical Linear Algebra 9 / 21 Triangular Systems A matrix G = (gij ) is lower triangular if g11 0 0 g21 g22 0 G = g31 g32 g33 .. .. .. . . . gn1 gn2 gn3 gij = 0 whenever i < j. ··· 0 ··· 0 .. .. . . . .. . 0 · · · gnn Similarly, an upper triangular matrix is one for which gij = 0 whenever i > j. A triangular matrix is one that is either upper or lower triangular. A triangular matrix G ∈ R n×n is nonsingular if and only if gii 6= 0 for i = 1, ..., n. Matteo Parsani Numerical Linear Algebra 10 / 21 Lower-Triangular Systems Consider system Gy = b, where G is nonsingular, lower-triangular matrix. We can solve the system by y1 = b1 /g11 y2 = (b2 − g21 y1 ) /g22 .. . yi = (bi − gi1 y1 − gi2 y2 − · · · − gi,i−1 yi−1 ) /gii , i−1 X = bi − gij yj gii . j=1 Matteo Parsani Numerical Linear Algebra 11 / 21 Forward Substitution Pseudo-code forward substitution (y overwrites b) for i = 1 : n for j = 1 : i − 1 bi ← bi − gij bj bi ← bi /gii This algorithm is row-oriented, as it access G by rows. It may raise an exception if gii = 0 The is Pn number Pi−1 of operations Pn 2 = 2 (i − 1) = n(n − 1) ≈ n2 i=1 j=1 i=1 Matteo Parsani Numerical Linear Algebra 12 / 21 Column-Oriented Forward Substitution We can reorder loops and obtain a column-oriented algorithm Pseudo-code forward substitution (y overwrites b) for j = 1 : n bj ← bj /gjj for i = j + 1 : n bi ← bi − gij bj The number of operations is again ≈ n2 In practice, column-oriented algorithm is faster if G is stored in a column-oriented fashion Like matrix-matrix multiplication, performance can be improved using block-matrix operators Matteo Parsani Numerical Linear Algebra 13 / 21 Exploiting Leading Zeros If b1 = b2 = · · · = bk−1 = 0, how to change algorithm to reduce operations? In row-oriented algorithm: let i = k, . . . , n and j = k, . . . , i − 1 In column-oriented algorithm: let j = k, . . . , n This saves flops, especially if k is large I Example: compute inverse of a lower-triangular matrix Matteo Parsani Numerical Linear Algebra 14 / 21 Upper-Triangular Systems Consider the system Uy = b, where U ∈ Rn×n is nonsingular and upper triangular We solve the system from bottom to top yn = bn /gnn yn−1 = (bn−1 − gn−1,n yn ) /gn−1,n−1 .. . yi = (bi − gin yn − gi,n−1 yn−1 − · · · − gi,i+1 yi+1 ) /gii , n X = bi − gij yj gii . j=i+1 This is back substitution, and it has the same cost as forward substitution Matteo Parsani Numerical Linear Algebra 15 / 21 Outline 1 Accuracy and Stability (NLA§14-15) 2 Triangular Systems (MC§3.1) 3 Backward Stability of Back Substitution (NLA§17) Matteo Parsani Numerical Linear Algebra 16 / 21 Backward Stability of Back Substitution Solve Rx = b using back substitution r11 · · · r1n .. .. . . rnn x1 .. = . xn b1 .. . bn for j = n downto 1 xj = (bj − n X xk rjk )/rjj k=j+1 Back substitute is backward stable (R + δR)x̃ = b, kδRk/kRk = O(machine ) Furthermore, each component of δR satisfies |δrij | ≤ nmachine + O(2machine ) |rij | We will show in full detail for n = 1, 2, 3 as well as general n Matteo Parsani Numerical Linear Algebra 17 / 21 Proof of Backward Stability (n = 1) For n = 1, the algorithm is simply one floating point division. Using floating-point axiom, we get x̃1 = b1 r11 = b1 b1 (1 + 1 ) = r11 r11 (1 + 01 ) where |1 | ≤ machine and |01 | ≤ machine + O(2 ) machine Therefore, we solved a perturbed problem exactly: (r11 + δr11 )x̃1 = b1 with Matteo Parsani |δr11 | ≤ machine + O(2machine ) |r11 | Numerical Linear Algebra 18 / 21 Proof of Backward Stability (n = 2) For n = 2, we first solve for x̃2 as before. Then, we compute x̃1 : (b1 − x̃2 r12 (1 + 2 ))(1 + 3 ) (1 + 4 ) r11 b1 − x̃2 r12 (1 + 2 ) b1 − x̃2 r12 (1 + 2 ) = = 0 0 r11 (1 + 3 )(1 + 4 ) r11 (1 + 25 ) (x̃2 ⊗ r12 )) x̃1 = (b1 r11 = where |2 |, |3 |, |4 | ≤ machine and |03 |, |04 |, |5 | ≤ machine + O(2 ) machine Again, this is an exact solution to (R + δR)x̃ = b with " |δr | |δr | # 11 12 2|5 | |2 | 2 1 |r11 | |r12 | = ≤ machine +O(2machine ) |δr22 | |1 | 1 |r22 | Matteo Parsani Numerical Linear Algebra 19 / 21 Proof for B-S of B-S (n = 3) For n = 3, we solve for x̃3 and x̃2 as before. Then, we compute x̃1 : x̃1 = [b1 (x̃2 ⊗ r12 ) (x̃3 ⊗ r13 )] r11 [(b1 − x̃2 r12 (1 + 4 ))(1 + 6 ) − x̃3 r13 (1 + 5 )](1 + 7 ) = r11 (1 + 08 ) b1 − x̃2 r12 (1 + 4 ) − x̃3 r13 (1 + 5 )(1 + 06 ) = r11 (1 + 06 )(1 + 07 )(1 + 08 ) That is, (R + ∆R)x̃ = b) with |δr12 | |δr13 | |δr11 | 3 1 2 |r12 | |r13 | |r11 | |δr |δr23 | 22 | 2 1 machine + O(2machine ) |r22 | |r23 | ≤ |δr33 | 1 |r33 | Matteo Parsani Numerical Linear Algebra 20 / 21 Proof for B-S of B-S (general n) Similar analysis for general n gives pattern |δR| ≤ W + O(2machine ) |R| where W is (for n = 5) 0 1 1 1 1 1 4 0 1 2 3 0 1 1 1 1 3 0 1 2 0 1 1 + 1 2 0 1 + 0 1 1 1 0 0 1 0 | {z } | {z } | {z } ⊗ or 5 1 2 3 4 1 2 3 1 2 Matteo Parsani Numerical Linear Algebra 4 3 2 1 1 21 / 21