Numerical Methods I: Conditioning, stability and sources of error
Georg Stadler, Courant Institute, NYU
stadler@cims.nyu.edu
September 10, 2015

Sources/reading

I assume some familiarity with basic linear algebra (vectors, matrices, norms, matrix norms); see Section 1 of Quarteroni/Sacco/Saleri. Main references: Section 2 in Quarteroni/Sacco/Saleri and Section 2 in Deuflhard/Hohmann.

Well-posedness and ill-posedness

Consider finding x from the implicit equation

  F(x, d) = 0,

where d is some kind of "data" and F is a functional relation. The variables (x, d) can be scalars, vectors or functions.

1. Direct problem: given F and d, find x.
2. Inverse problem: given F and (parts of) x, find d.

In this class we focus on the direct problem, which is called
- well-posed if it admits a unique solution x that depends continuously on d (for some set of data d),
- ill-posed if that is not the case.

Continuous dependence on d

Assume F(x, d) = 0 and consider the perturbed problem F(x + δx, d + δd) = 0, assuming d + δd ∈ D. Continuous dependence on the data d means that there exist η, K > 0 (which may depend on d) such that ‖δd‖ < η implies, in an appropriate norm,

  ‖δx‖ ≤ K‖δd‖.

Condition numbers

The absolute condition number is defined as

  κ_abs(d) = sup { ‖δx‖ / ‖δd‖ : δd ≠ 0 }.

The relative condition number, for d ≠ 0 and x ≠ 0, is

  κ(d) = κ_rel(d) = sup { (‖δx‖/‖x‖) / (‖δd‖/‖d‖) : δd ≠ 0 }.

This definition is for arbitrary perturbations δd, not only infinitesimally small ones; other definitions take the limit δd → 0.

Condition numbers and derivatives (infinitesimal perturbations)

Assume G(·) is such that F(G(d), d) = 0 (G exists under certain assumptions by the implicit function theorem). Then

  κ_abs(d) = ‖G′(d)‖,

where G′ denotes the derivative with respect to d, and

  κ(d) = ‖G′(d)‖ ‖d‖ / ‖x‖.

Examples:
1. Solution of Ax = b for perturbations of b.
2. Condition number of addition/subtraction (numerical cancellation!).

Sources of error in computational models

- Model errors due to the modeling of a real-world process (control by improving the model)
- Data errors (control by better measurement devices)
- Truncation/discretization errors from approximating the model with finite steps/operations
- Rounding errors due to the finite representation of numbers

The latter two make up the computational error, which we mainly study in this class.

Machine representation of numbers: positional system

Representation of a number x with base β ∈ ℕ in the positional system, with 0 ≤ x_k < β and x_n ≠ 0:

  x_β = (−1)^s [x_n x_{n−1} … x_0 . x_{−1} … x_{−m}],

which means

  x_β = (−1)^s Σ_{k=−m}^{n} x_k β^k.

Depending on β, the same number can have a finite or infinite representation in the positional system. In computing, the usual bases are binary (β = 2), decimal (β = 10) and hexadecimal (β = 16).

Machine representation of numbers: floating point system

With N memory locations, a fixed point system stores

  (−1)^s [a_{N−2} a_{N−3} … a_k . a_{k−1} … a_0].

A floating point system stores

  (−1)^s [0.a_1 a_2 … a_t] β^e.

Here a_1 a_2 … a_t is called the mantissa and e the exponent. Usually β = 2, and there are single (32 bit) and double precision (64 bit) numbers. There are gaps of changing size between floating point numbers; the size of the gap at x = 1 in double precision (also called machine epsilon) is about 2.2 × 10⁻¹⁶.

- Conventions ensure that the floating point representation is unique.
- The IEEE arithmetic standard for floating point numbers limits the accumulation of round-off errors and defines how to handle ∞ and NaN, the latter arising for instance from the operation 0/0.

Examples of what can happen in finite precision:

  (a + b) − b ≠ a,
  (a + b) + c ≠ (c + b) + a,
  (a + b)c ≠ ac + bc.

MATLAB demo:
- largest/smallest floating point numbers (realmin/realmax)
- overflow/underflow
- NaN, ±Inf
- normalized floating point numbers: a_1 ≠ 0
- de-normalized floating point numbers to extend the minimum: a_1 = 0
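Where the slides call for a MATLAB demo, a rough NumPy equivalent (an illustrative sketch, not the original demo; MATLAB's realmin/realmax correspond to attributes of np.finfo) might look like:

```python
import numpy as np

# Double precision limits (MATLAB's realmin/realmax, eps).
fi = np.finfo(np.float64)
print(fi.tiny, fi.max)          # smallest normalized / largest double
print(fi.eps)                   # gap at x = 1, about 2.22e-16

print(np.float64(2.0) ** 1024)  # overflow -> inf (emits a warning)
print(fi.tiny / 2**52)          # smallest de-normalized (subnormal) number
print(np.float64(0.0) / 0.0)    # 0/0 -> NaN (emits a warning)

# Finite precision: the algebraic identities above can fail.
a, b = 1.0, 1e17
print((a + b) - b)              # 0.0, not 1.0: b absorbs a entirely
print(0.1 + 0.2 == 0.3)         # False: rounding in binary representation
```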
Stability, consistency, convergence

A numerical method to approximate the solution of F(x, d) = 0 can be written, depending on n ∈ ℕ, as

  F_n(x_n, d_n) = 0   (or, for multistep methods: F_n(x_{n−k}, …, x_n, d_n) = 0).

We think of x_n → x as n → ∞. For that, it is necessary that d_n → d and F_n → F. When is a numerical method convergent, i.e., when does x_n become "close to" the exact solution x?

Stability is well-posedness of F_n(x_n, d_n) = 0, i.e., unique solvability and continuous dependence on perturbations in d_n.

Consistency: F_n(x, d) → 0 (or: F_n(x, …, x, d_n) → 0) as n → ∞.
Strong consistency: F_n(x, d) = 0 (or: F_n(x, …, x, d) = 0).

Examples: finite element approximations of PDEs; numerical quadrature; Newton's method.

Convergence of a numerical method (loosely defined): x_n → x as n → ∞. A more rigorous definition (see Quarteroni/Sacco/Saleri):

  ∀ε > 0 ∃n_0, δ > 0 : ∀n > n_0, ∀δd_n with ‖δd_n‖ ≤ δ : ‖x(d) − x_n(d + δd_n)‖ ≤ ε.

General result (can be shown for particular problems): for a consistent numerical method, stability is equivalent to convergence. Sometimes also stated as: consistency + stability ⇔ convergence.

Forward/backward error analysis

[Diagram: input and algorithm on the left, result on the right; errors in the input and in the algorithm propagate to an error in the result.]

Forward analysis studies the effects of input and algorithmic errors on the result; it yields bounds of the form ‖δx_n‖ ≤ ….

Backward analysis finds, for a computed solution x̂ (which contains errors), the perturbation δd̂ that would produce that solution under an exact algorithm. It does not take into account how x̂ has been computed.

Examples: polynomial root finding; solution of linear systems (see the sketch below).
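As a small illustration of backward analysis for linear systems (a sketch with arbitrary random test data; it uses the fact that x̂ exactly solves the system whose right-hand side is b plus the residual):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)

x_hat = np.linalg.solve(A, b)        # computed, hence inexact, solution

# Backward view: with r = A @ x_hat - b, x_hat solves A x = b + r *exactly*,
# so ||r||/||b|| is the relative data perturbation that explains x_hat.
r = A @ x_hat - b
backward_err = np.linalg.norm(r) / np.linalg.norm(b)
print(backward_err)                  # ~1e-16: the solve is backward stable here

# Forward view: the error in x is amplified by the condition number,
# ||dx||/||x|| <= kappa(A) * ||r||/||b||  (perturbations in b only).
print(np.linalg.cond(A) * backward_err)
```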
A priori versus a posteriori analysis

A priori analysis is performed before a specific solution is computed, i.e., the estimates do not depend on a specific numerically computed solution.

A posteriori analysis bounds the error for a specific numerical solution x̂ (computed with a specific numerical method) and uses, e.g., residuals F_n(x_n, d_n) for the analysis.

Examples: numerical approximation of ODEs and PDEs; solution of linear systems.

Notation and other useful concepts

Relative errors:

  ‖x − x_n‖ / ‖x‖   or   ‖x − x_n‖ / ‖x_n‖.

Absolute error:

  ‖x − x_n‖.

- Used for theoretical arguments.
- In numerical practice the exact solution is not available, so these errors must be approximated.

Landau symbols

Let f_n, g_n be sequences in ℝ. Then, for n → ∞:

  f_n = O(g_n) ⇔ ∃C > 0, n_0 > 0 : ∀n > n_0, |f_n| ≤ C|g_n|,
  f_n = o(g_n) ⇔ ∀ε > 0 ∃n_0 > 0 : ∀n > n_0, |f_n| ≤ ε|g_n|.

Let f(·), g(·) be functions mapping to ℝ. Then, for x → x_0:

  f(x) = O(g(x)) ⇔ ∃C > 0 and a neighborhood U(x_0) : ∀x ∈ U(x_0), |f(x)| ≤ C|g(x)|,
  f(x) = o(g(x)) ⇔ ∀ε > 0 ∃U(x_0) : ∀x ∈ U(x_0), |f(x)| ≤ ε|g(x)|.
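In practice, the exponent p in a statement f(h) = O(h^p) can be estimated from slopes on a log-log scale. A minimal sketch, using the truncation error of the forward difference quotient as an assumed example (not from the slides), which behaves like O(h):

```python
import numpy as np

# Truncation error of the forward difference for f = exp at x = 1;
# theory predicts it decays like O(h) as h -> 0.
def fd_error(h, x=1.0):
    return abs((np.exp(x + h) - np.exp(x)) / h - np.exp(x))

hs = 2.0 ** -np.arange(5, 15)
errs = np.array([fd_error(h) for h in hs])

# Estimated order p from successive slopes; the values approach 1.
print(np.log(errs[1:] / errs[:-1]) / np.log(hs[1:] / hs[:-1]))
```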