Numerical Methods I: Conditioning, stability and sources of error Georg Stadler

Courant Institute, NYU
September 10, 2015
I assume some familiarity with basic linear algebra (vectors,
matrices, norms, matrix norms). See Section 1 of
Main reference:
Section 2 in Quarteroni/Sacco/Saleri and
Section 2 in Deuflhard/Hohmann
Well-posedness and ill-posedness
Consider finding x from the implicit equation
F (x, d) = 0,
where d is some kind of “data”, and F is a functional relation.
The variables (x, d) can be scalars, vectors or functions.
1. Direct problem: Given F and d, find x.
2. Inverse problem: Given F and (parts of) x, find d.
In this class, we focus on the direct problem, which is called
well-posed if it admits a unique solution x that depends
continuously on d (for some set of data d),
ill-posed if that’s not the case.
Continuous dependence on d
Assume F(x, d) = 0 and consider the perturbed problem
F(x + δx, d + δd) = 0, assuming d + δd ∈ D (the set of admissible data).
Continuous dependence on the data d means that there exist
η, K > 0 (that may depend on d) such that ‖δd‖ < η implies (in
appropriate norms) that
‖δx‖ ≤ K‖δd‖.
Condition numbers
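The absolute condition number is defined as
$$\kappa_{\mathrm{abs}}(d) = \sup\left\{ \frac{\|\delta x\|}{\|\delta d\|} \,:\, \|\delta d\| \neq 0 \right\}.$$
The relative condition number, for d ≠ 0, x ≠ 0, is
$$\kappa(d) = \kappa_{\mathrm{rel}}(d) = \sup\left\{ \frac{\|\delta x\| / \|x\|}{\|\delta d\| / \|d\|} \,:\, \|\delta d\| \neq 0 \right\}.$$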
This definition allows arbitrary perturbations δd, not only
infinitesimally small ones. Other definitions consider the limit δd → 0.
Condition numbers and derivatives
Infinitesimal perturbations
Assume G(·) is such that F(G(d), d) = 0 (such a G exists under certain
assumptions by the implicit function theorem). Then
$$\kappa_{\mathrm{abs}}(d) = \|G'(d)\|,$$
where G′ denotes the derivative with respect to d, and
$$\kappa(d) = \|G'(d)\| \, \frac{\|d\|}{\|G(d)\|}.$$
1. Solution of Ax = b for perturbations of b
2. Condition number of addition/subtraction (numerical
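The two examples can be worked out from the derivative formulas as follows. This is a sketch under stated assumptions: A is invertible, the matrix norm is induced by the vector norm (so that ‖Ax‖ ≤ ‖A‖‖x‖), and for subtraction the data (a, b) is measured in the 1-norm.

```latex
% Example 1: A x = b, data d = b, solution x = G(b) = A^{-1} b.
\[
  G'(b) = A^{-1}, \qquad
  \kappa_{\mathrm{abs}}(b) = \|A^{-1}\|, \qquad
  \kappa(b) = \|A^{-1}\| \, \frac{\|b\|}{\|A^{-1} b\|}
            = \frac{\|A^{-1}\| \, \|A x\|}{\|x\|}
            \le \|A\| \, \|A^{-1}\| .
\]
% Example 2: subtraction, data d = (a, b), G(a, b) = a - b, \|d\|_1 = |a| + |b|.
\[
  G'(a, b) = \begin{pmatrix} 1 & -1 \end{pmatrix}, \qquad
  \kappa_{\mathrm{abs}}(a, b) = 1, \qquad
  \kappa(a, b) = \frac{|a| + |b|}{|a - b|} .
\]
```

The relative condition number of subtraction becomes arbitrarily large when a and b are close, although the absolute condition number stays 1.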
Sources of error in computational models
Model errors due to modeling of real-world process (control by
improving the model)
Data errors (control by better measurement devices)
Truncation/discretization errors from approximating the
model with finite steps/operations
Rounding errors due to finite representation of numbers
The latter two together make up the computational error, which we
mainly study in this class.
Machine representation of numbers
Positional system
Representation of a number x in a positional system with base
β ∈ ℕ, with digits 0 ≤ x_j < β:
$$x_\beta = (-1)^s \, [x_n x_{n-1} \ldots x_0 \,.\, x_{-1} \ldots x_{-m}], \qquad x_n \neq 0,$$
which means
$$x_\beta = (-1)^s \sum_{k=-m}^{n} x_k \beta^k .$$
Depending on β, the same number can have a finite or infinite
representation in the positional system. In computing, usual bases
are binary (β = 2), decimal (β = 10) and hexadecimal (β = 16).
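To illustrate how the base decides between a finite and an infinite representation, here is a minimal Python sketch that expands the fractional part of a rational number digit by digit. The helper name `fractional_digits` and the choice of 1/10 as test value are assumptions made for this illustration.

```python
from fractions import Fraction

def fractional_digits(x, base, max_digits):
    """Digits x_{-1}, x_{-2}, ... of the fractional part of x in the given base.

    Returns (digits, terminated); terminated is False if the expansion was
    cut off after max_digits digits.
    """
    frac = x - (x.numerator // x.denominator)   # keep only the fractional part
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= base
        digit = frac.numerator // frac.denominator   # next digit, 0 <= digit < base
        digits.append(digit)
        frac -= digit
    return digits, frac == 0

one_tenth = Fraction(1, 10)
print(fractional_digits(one_tenth, 10, 8))   # finite in base 10
print(fractional_digits(one_tenth, 2, 8))    # cut off: periodic in base 2
```

Exact rational arithmetic is used so that the digits are not themselves affected by rounding.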
Machine representation of numbers
Floating point system
Fixed point system with N memory locations:
$$(-1)^s \, [a_{N-2} a_{N-3} \ldots a_k \,.\, a_{k-1} \ldots a_0]$$
Floating point:
$$(-1)^s \, [0.a_1 a_2 \ldots a_t] \, \beta^e$$
Here, a_1 a_2 … a_t ∈ ℕ is called the mantissa and e the exponent.
Usually, β = 2, and there are single precision (32-bit) and double
precision (64-bit) numbers.
There are gaps of varying size between floating point numbers.
The size of the gap at x = 1 in double precision (also called machine
epsilon) is about 2.2 × 10⁻¹⁶.
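The gap at x = 1 can be measured directly. The following Python sketch (Python floats are IEEE doubles on common platforms; the function name `machine_epsilon` is an assumption for this illustration) halves a candidate gap until adding half of it to 1.0 no longer changes the result.

```python
import sys

def machine_epsilon():
    """Smallest power of two eps such that 1.0 + eps is still a different double than 1.0."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:   # stops once eps / 2 is rounded away
        eps /= 2.0
    return eps

print(machine_epsilon())        # measured gap at x = 1
print(sys.float_info.epsilon)   # value reported by the Python runtime
```

In double precision the loop stops at 2⁻⁵², which is about 2.2 × 10⁻¹⁶.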
Machine representation of numbers
Floating point system
Conventions ensure that floating point representation is
The IEEE arithmetic standard for floating point numbers
avoids accumulation of round-off errors and defines how to
handle ∞ and NaN, the latter arising for instance from the
operation 0/0.
Examples of what can happen in finite precision:
a + b − b ≠ a
a + b + c ≠ c + b + a
(a + b)c ≠ ac + bc
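Each of these effects can be reproduced in IEEE double precision. The values below are assumptions chosen for this sketch so that every rounding step can be checked by hand: doubles near 1 are spaced 2⁻⁵² apart, doubles near 3 are spaced 2⁻⁵¹ apart, and doubles near 10¹⁶ are spaced 2 apart.

```python
tiny = 2.0 ** -53   # half the gap between 1.0 and the next larger double

# 1) a + b - b != a: tiny is absorbed when added to 1.0 (tie rounds to 1.0).
lost = (tiny + 1.0) - 1.0

# 2) Summation order matters.
left_to_right = (1e16 + -1e16) + 1.0   # exact cancellation first, then add 1
right_to_left = (1.0 + -1e16) + 1e16   # the 1 is rounded away before cancelling

# 3) Distributivity fails because rounding happens at different points.
factored = (1.0 + tiny) * 3.0          # 1.0 + tiny rounds to 1.0
expanded = 1.0 * 3.0 + tiny * 3.0      # 3 + 3 * 2**-53 rounds up to 3 + 2**-51

print(lost, "vs", tiny)
print(left_to_right, "vs", right_to_left)
print(factored, "vs", expanded)
```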
Machine representation of numbers
Floating point system
MATLAB demo:
largest/smallest floating point numbers (realmax/realmin)
NaN, ±Inf
Normalized floating point numbers: a_1 ≠ 0
De-normalized floating point numbers, which extend the range
towards zero: a_1 = 0
Stability, Consistency, Convergence
A numerical method to approximate the solution of
F(x, d) = 0
can be written, depending on n ∈ ℕ, as
F_n(x_n, d_n) = 0.
We think of x_n → x as n → ∞. For that, it is necessary that
d_n → d and F_n → F.
When is a numerical method convergent, i.e., when does x_n become
“close to” the exact solution x?
Stability, Consistency, Convergence
Consider the numerical method:
F_n(x_n, d_n) = 0 (or: F_n(x_{n−k}, …, x_n, d_n) = 0)
Stability: well-posedness of F_n(x_n, d_n) = 0, i.e., unique
solvability and continuous dependence on perturbations in d_n.
Consistency (x is the exact solution for the data d):
F_n(x, d) → 0
(or: F_n(x, …, x, d_n) → 0) as n → ∞.
Strong consistency:
F_n(x, d) = 0
(or: F_n(x, …, x, d) = 0) for all n.
Examples: finite element approximations of PDEs, numerical
quadrature, Newton’s method
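As a small illustration for the quadrature example, the following Python sketch treats the composite trapezoid rule with n subintervals as the method producing x_n and the exact integral as x. The integrand exp on [0, 1] and the function name `trapezoid` are assumptions chosen for this sketch.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

exact = math.e - 1.0   # integral of exp over [0, 1]
errors = {n: abs(trapezoid(math.exp, 0.0, 1.0, n) - exact) for n in (10, 20, 40)}
for n, err in errors.items():
    print(n, err)      # the error shrinks as n grows
```

Doubling n reduces the error by roughly a factor of 4, which is the behavior expected from a second-order method.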
Stability, Consistency, Convergence
Convergence of a numerical method (loosely defined):
x_n → x as n → ∞.
More rigorous definition (see Quarteroni/Sacco/Saleri):
∀ε > 0 ∃n_0, δ > 0 :
∀n > n_0, ∀δd_n : ‖δd_n‖ ≤ δ ⟹ ‖x(d) − x_n(d + δd_n)‖ ≤ ε.
General result (can be shown in particular problems):
For a consistent numerical method, stability is equivalent to
convergence; sometimes also:
Consistency + Stability ⇔ Convergence
Forward/backward error analysis
Forward analysis: studies the effects of input and algorithmic
errors on the result; results in bounds of the form ‖δx_n‖ ≤ …
Backward analysis: finds, for a computed solution x̂ (that contains
errors), the perturbation δd̂ of the data that would result in that
solution given an exact algorithm. Does not take into account how x̂
has been computed.
Examples: polynomial root finding, solution of linear systems.
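A minimal Python sketch of both viewpoints for a root-finding problem, under assumptions chosen for this illustration: the problem is F(x, d) = x² − d = 0 with d = 2 and x > 0, and x̂ = 1.414 plays the role of a computed solution.

```python
import math

x_hat = 1.414                  # assumed computed approximation of sqrt(2)
x_exact = math.sqrt(2.0)

# Forward error: distance between computed and exact solution.
forward_error = abs(x_hat - x_exact)

# Backward error: size of the data perturbation delta_d for which x_hat
# solves x^2 - (2 + delta_d) = 0 (up to rounding in x_hat**2).
backward_error = abs(x_hat ** 2 - 2.0)

# First-order link between the two: forward ~ kappa_abs * backward, with
# kappa_abs = |G'(d)| = 1 / (2 sqrt(d)) for G(d) = sqrt(d) and d = 2.
estimate = backward_error / (2.0 * x_exact)
print(forward_error, backward_error, estimate)
```

The backward error is computed without any knowledge of how x̂ was obtained; the condition number then translates it into an estimate of the forward error.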
A priori versus a posteriori analysis
A priori analysis is performed before a specific solution is
computed, i.e., estimates do not depend on a specific numerically
computed solution.
A posteriori analysis bounds the error for a specific numerical
solution x̂ (computed with a specific numerical method), and uses,
e.g., residuals (F_n(x_n, d_n)).
Examples: numerical approximation of ODEs and PDEs, solution
of linear systems.
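For a linear system Ax = b, a residual-based a posteriori bound follows from x − x̂ = A⁻¹(b − Ax̂), which gives ‖x − x̂‖ ≤ ‖A⁻¹‖ ‖r‖ with r = b − Ax̂. The Python sketch below checks this for a 2×2 system in the maximum norm; the matrix, right-hand side and "computed" solution x̂ are assumptions made up for this illustration.

```python
def max_norm(v):
    """Maximum norm of a vector given as a list."""
    return max(abs(c) for c in v)

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x_exact = [0.8, 1.4]        # exact solution, used here only for comparison
x_hat = [0.8001, 1.3999]    # assumed computed solution

# Residual r = b - A x_hat: computable without knowing the exact solution.
residual = [b[i] - sum(A[i][j] * x_hat[j] for j in range(2)) for i in range(2)]

# Explicit inverse of the 2x2 matrix and its induced norm (max absolute row sum).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]
norm_A_inv = max(sum(abs(entry) for entry in row) for row in A_inv)

bound = norm_A_inv * max_norm(residual)   # a posteriori error bound
true_error = max_norm([x_exact[i] - x_hat[i] for i in range(2)])
print(true_error, bound)
```

In practice ‖A⁻¹‖ is usually not available exactly and has to be estimated; the explicit inverse is used here only because the system is 2×2.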
Notation and other useful concepts
Relative errors:
$$\frac{\|x - x_n\|}{\|x\|} \quad \text{or} \quad \frac{\|x - x_n\|}{\|x_n\|}$$
Absolute error:
$$\|x - x_n\|$$
Used for theoretical arguments.
In numerical practice: the exact solution is not available, so these
errors must be approximated.
Notation and other useful concepts
Landau symbols
Let f_n, g_n be sequences in ℝ. Then, for n → ∞:
f_n = O(g_n) ⇔ ∃C > 0, n_0 > 0 : ∀n ≥ n_0 : |f_n| ≤ C|g_n|,
f_n = o(g_n) ⇔ ∀ε > 0 ∃n_0 > 0 : ∀n ≥ n_0 : |f_n| ≤ ε|g_n|.
Let f(·), g(·) be functions that map to ℝ. Then, for x → x_0:
f(x) = O(g(x)) ⇔ ∃C > 0, U(x_0) : ∀x ∈ U(x_0) : |f(x)| ≤ C|g(x)|,
f(x) = o(g(x)) ⇔ ∀ε > 0 ∃U(x_0) : ∀x ∈ U(x_0) : |f(x)| ≤ ε|g(x)|,
where U(x_0) denotes a neighborhood of x_0.
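A few standard instances of the Landau symbols, given as a sketch; the particular sequences and functions are chosen only for illustration.

```latex
% Sequences, for n -> infinity:
\[
  \frac{3n + 1}{n^2} = O\!\left(\frac{1}{n}\right)
  \quad (\text{take } C = 4,\ n_0 = 1),
  \qquad
  \frac{1}{n^2} = o\!\left(\frac{1}{n}\right)
  \quad (\text{take } n_0 \ge 1/\varepsilon).
\]
% Functions, for x -> 0:
\[
  \sin x - x = O(x^3)
  \quad \left(\text{since } |\sin x - x| \le \tfrac{1}{6} |x|^3\right),
  \qquad
  x^2 = o(x).
\]
```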