Pierson Guthrey
pguthrey@iastate.edu

Asymptotic Behavior
f(t) is asymptotic to t^2 means the following: f(t) ∼ t^2 as t → 0 means t^{−2} f(t) → 1 as t → 0.
Equivalently, we write f(t) = t^2 + o(t^2). Also, it could be the case that
f(t) = O(t^2), which means t^{−2} f(t) is bounded as t → 0.

1 Floating Point Arithmetic
A Floating Point Number System is a finite subset of the reals defined by F(b, K, m, M ) where b is the
base of the system, K is the number of digits, m is the smallest exponent representable and M is the
largest exponent representable.
If y ∈ F(b, K, m, M), then
y = ±(0.d_1 d_2 d_3 ... d_K)_b × b^E,   m ≤ E ≤ M,   d_1 ≠ 0 unless y = 0

1.1 Round-off Error
The error in representing z ∈ R by its nearest element in F(b, K, m, M). If z ∈ R, then
fl(z) = ±(0.d_1 d_2 ... d_K)_b × b^E                      if d_{K+1} < b/2
fl(z) = ±( (0.d_1 d_2 ... d_K)_b + b^{−K} ) × b^E          if d_{K+1} ≥ b/2
Example IEEE Double precision (Used in MATLAB) is b = 2 and K = 52. In base 10, this is approximately
K ≈ 16, m ≈ −308, and M ≈ 308.
1.2 Relative Error in Rounding
Let y ∈ R, y ≠ 0, and fl(y) ∈ F(b, K, m, M). Assume d_{K+1} ≤ b/2, so that
|fl(y)| = (0.d_1 d_2 ... d_K)_b × b^E
The relative error is
|y − fl(y)| / |y|
Since |y| = (0.d_1 d_2 ... d_K d_{K+1} ...)_b × b^E ≥ (0.1)_b × b^E = b^{E−1}
and |y − fl(y)| = (0.d_{K+1} d_{K+2} ...)_b × b^{E−K} ≤ (1/2) b^{E−K}, we have
|y − fl(y)| / |y| ≤ (1/2) b^{1−K} = ε_machine
This machine epsilon bounds the relative error made in rounding; it is not the smallest representable number.
Example
In IEEE DP, b = 2 and K = 52, so ε_machine = 2^{−52} ≈ 2.2204 × 10^{−16}.

1.3 Error in Computation
1.3.1 Finite Difference Operators
Recall
f''(x) − (1/h^2)( f(x−h) − 2f(x) + f(x+h) ) = −(1/12) h^2 f^{(4)}(ξ)
Assume a round-off error e(x) is made in the evaluation of the function values, f(x) = f̃(x) + e(x). Then
f''(x) − (1/h^2)( f̃(x−h) − 2f̃(x) + f̃(x+h) ) = E_h = −(1/12) h^2 f^{(4)}(ξ) + (1/h^2)( e(x−h) − 2e(x) + e(x+h) )
|E_h| ≤ (1/12) h^2 |f^{(4)}(ξ)| + (1/h^2)( |e(x−h)| + 2|e(x)| + |e(x+h)| )
Assume |f^{(4)}(ξ)| ≤ M for ξ ∈ [x−h, x+h] and assume |e(x)| ≤ ε for x ∈ [x−h, x+h]. Then
|E_h| ≤ (1/12) h^2 M + (4/h^2) ε
The first term shrinks but the second term blows up as h → 0. One hopes to find the minimum at
h_optimal = ( 48 ε / M )^{1/4}
We could take ε to be ε_machine.
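A minimal sketch of this trade-off, assuming f = sin (so M = 1 is a valid bound on |f''''|) and taking ε to be machine epsilon; the test point x = 1.0 and the sweep of h values are illustrative choices, not from the notes.

import numpy as np

f = np.sin
x = 1.0
exact = -np.sin(x)                 # f''(x) for f = sin

eps = np.finfo(float).eps          # round-off level taken as machine epsilon
M = 1.0                            # |f''''| <= 1 for sin
h_opt = (48 * eps / M) ** 0.25     # minimizer of (1/12) h^2 M + 4 eps / h^2

for h in [1e-1, 1e-2, 1e-3, h_opt, 1e-5, 1e-7]:
    approx = (f(x - h) - 2 * f(x) + f(x + h)) / h**2
    print(f"h = {h:9.2e}   error = {abs(approx - exact):.2e}")

The error decreases like h^2 at first and then grows again once round-off dominates, with the minimum near h_optimal.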
2 Polynomial Approximations
2.1 Taylor Expansion Theorem
The Taylor series expansion for a function f centered at α evaluated at z is
f(z) = Σ_{k=0}^{∞} a_k (z − α)^k   where   a_k = f^{(k)}(α) / k!
A Taylor polynomial is any finite truncation of this series:
f(z) ≈ Σ_{k=0}^{N} a_k (z − α)^k
The Taylor series is the limit of the Taylor Polynomials, given that the limit exists.
Analytic Functions  A function that is equal to its Taylor series in an open interval (or open disc in the complex plane) is known as an analytic function.
Maclaurin Series  If the Taylor series or polynomial is centered at the origin (α = 0), then it is also a Maclaurin series.
2.1.1 Important Taylor Series
The Maclaurin series for (1 − x)^{−1} is
(1 − x)^{−1} = 1 + x + x^2 + x^3 + ... = Σ_{k=0}^{∞} x^k,   valid for |x| < 1
3 Numerical Linear Algebra

4 Solving Ax = b
4.1 Tridiagonal Solver
For a tridiagonal system of equations A~u = ~f with A tridiagonal, take b_1 = c_n = 0 and factor A = LU:

A = \begin{pmatrix} a_1 & c_1 & & \\ b_2 & a_2 & c_2 & \\ & \ddots & \ddots & \ddots \\ & & b_n & a_n \end{pmatrix}
  = \begin{pmatrix} 1 & & & \\ \beta_2 & 1 & & \\ & \ddots & \ddots & \\ & & \beta_n & 1 \end{pmatrix}
    \begin{pmatrix} \alpha_1 & c_1 & & \\ & \alpha_2 & c_2 & \\ & & \ddots & \ddots \\ & & & \alpha_n \end{pmatrix}
So to solve LU ~u = ~f
1. Solve L~v = ~f using Forward Substitution
2. Solve U ~u = ~v using Backward Substitution
4.1.1 Pseudocode
INPUT: ~a, ~b, ~c, ~f, all of length n
LU decomposition:
α_1 = a_1
for k = 2 to n
  β_k = b_k / α_{k−1}
  α_k = a_k − β_k c_{k−1}
end
Forward substitution:
v_1 = f_1
for k = 2 to n
  v_k = f_k − β_k v_{k−1}
end
Backward substitution:
u_n = v_n / α_n
for k = 2 to n
  j = (n + 1) − k
  u_j = (v_j − c_j u_{j+1}) / α_j
end
Operation count: O(n)
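A minimal Python sketch of this solver (the classic Thomas algorithm), checked against a dense solve; the test system is an arbitrary choice, not from the notes.

import numpy as np

def tridiag_solve(a, b, c, f):
    """Solve A u = f for tridiagonal A with diagonal a, subdiagonal b,
    superdiagonal c (b[0] and c[-1] unused), following the pseudocode above."""
    n = len(a)
    alpha = np.empty(n); beta = np.empty(n); v = np.empty(n); u = np.empty(n)
    alpha[0] = a[0]                       # LU decomposition
    for k in range(1, n):
        beta[k] = b[k] / alpha[k - 1]
        alpha[k] = a[k] - beta[k] * c[k - 1]
    v[0] = f[0]                           # forward substitution L v = f
    for k in range(1, n):
        v[k] = f[k] - beta[k] * v[k - 1]
    u[n - 1] = v[n - 1] / alpha[n - 1]    # backward substitution U u = v
    for j in range(n - 2, -1, -1):
        u[j] = (v[j] - c[j] * u[j + 1]) / alpha[j]
    return u

n = 5
a = np.full(n, 2.0); b = np.full(n, -1.0); c = np.full(n, -1.0)
A = np.diag(a) + np.diag(b[1:], -1) + np.diag(c[:-1], 1)
f = np.arange(1.0, n + 1)
print(np.allclose(tridiag_solve(a, b, c, f), np.linalg.solve(A, f)))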
4.2 Spectral Decomposition Method
If A ∈ Cm×m is Hermitian, we can do the following
1. Compute the spectral decomposition (not trivial when m is large)
A = U DU ∗
2. Reform the equation:
A~x = U D U^* ~x = ~b
So we see
~x = α_1 ~u_1 + α_2 ~u_2 + ... + α_m ~u_m ⟹ U^* ~x = ~α
and
~b = β_1 ~u_1 + β_2 ~u_2 + ... + β_m ~u_m ⟹ β_k = ~u_k^* ~b
so
U^* ~b = ~β and D ~α = ~β ⟹ ~α = D^{−1} ~β
but since D_{ii}^{−1} = 1/λ_i,
α_k = ~u_k^* ~b / λ_k ⟹ ~x = Σ_{k=1}^{m} ( ~u_k^* ~b / λ_k ) ~u_k
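A minimal sketch of this spectral-decomposition solve for a Hermitian matrix, using numpy's eigh; the test matrix and right-hand side are invented for illustration and assumed nonsingular.

import numpy as np

rng = np.random.default_rng(0)
m = 6
B = rng.standard_normal((m, m))
A = B + B.T                      # a real symmetric (Hermitian) test matrix
b = rng.standard_normal(m)

lam, U = np.linalg.eigh(A)       # columns of U are the orthonormal eigenvectors q_k
alpha = (U.T @ b) / lam          # alpha_k = q_k^* b / lambda_k
x = U @ alpha                    # x = sum_k alpha_k q_k
print(np.allclose(A @ x, b))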
5 Finite Differences
Finite Differences seeks to approximate an ODE or PDE over a mesh or grid. The steps involved are:
1. Discretize the PDE using a difference scheme.
2. Solve the discretized PDE by iterating and/or time stepping.
5.1 Meshes
5.1.1 Uniform Meshes
Given a closed domain Ω = R̄ × [0, t_F], we divide it into a (J + 1) × (N + 1) grid of parallel lines. Assume R̄ = [0, 1]. Given the mesh sizes ∆x = 1/J, ∆t = 1/N, a mesh point is
(x_j, t_n) = (j∆x, n∆t),   j = 0, ..., J,   n = 0, ..., N
with x_0 = 0 and x_J = 1.
An alternative convention uses a (J + 2) × (N + 2) grid with mesh sizes ∆x = 1/(J+1), ∆t = 1/(N+1), so that x_0 = 0 and x_{J+1} = 1 are the boundary points, and
(x_j, t_n) = (j∆x, n∆t),   j = 0, ..., J + 1,   n = 0, ..., N + 1
We seek approximations to the solution at these mesh points, denoted by
U_j^n ≈ u(x_j, t_n)
where initial values are exact from the initial value function u_0(x) = u(x, 0),
U_j^0 = u_0(x_j),   j = 1, ..., J − 1
and boundary values are exact from the boundary value functions f(t) = u(0, t) and g(t) = u(1, t),
U_0^n = f(t_n),   U_J^n = g(t_n),   n = 1, 2, ...

5.2 Difference Coefficients
D_+, D_−

5.3 Explicit Scheme
A scheme is explicit if the solution at the next iteration (time level t_{n+1}) can be written as a single equation involving only previous time steps. That is, if it can be written in the form
U_j^{n+1} = Σ_i Σ_{k ≤ n} ( a_{i,k} U_i^k + b_{i,k} f_i^k )
Example  For the Heat Equation u_t = u_xx, using a forward difference in time and a centered difference in space, we get
U_j^{n+1} = U_j^n + µ( U_{j+1}^n − 2U_j^n + U_{j−1}^n ),   µ := ∆t / (∆x)^2
Pseudocode:
At n = 0, U_j^0 = u_0(x_j)
for n = 1 : N
  U_0^n = 0, U_J^n = 0
  for j = 1 : (J − 1)
    U_j^{n+1} = U_j^n + µ( U_{j+1}^n − 2U_j^n + U_{j−1}^n )
  end
end
The stability of the problem depends on µ.
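A minimal sketch of this explicit scheme for u_t = u_xx on [0, 1] with zero boundary values; the initial condition u_0(x) = sin(πx) and the mesh parameters are illustrative choices, not from the notes.

import numpy as np

J, N, tF = 20, 800, 0.1
dx, dt = 1.0 / J, tF / N
mu = dt / dx**2                     # should satisfy mu <= 1/2 for stability
x = np.linspace(0.0, 1.0, J + 1)
U = np.sin(np.pi * x)               # U_j^0 = u0(x_j)

for n in range(N):
    Unew = U.copy()
    Unew[1:J] = U[1:J] + mu * (U[2:J+1] - 2 * U[1:J] + U[0:J-1])
    Unew[0] = Unew[J] = 0.0
    U = Unew

exact = np.exp(-np.pi**2 * tF) * np.sin(np.pi * x)   # known exact solution for this data
print("mu =", mu, " max error =", np.max(np.abs(U - exact)))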
5.4 Truncation Error
Example  For our model problem (the Heat Equation), the truncation error is
T(x, t) := D_{+t} u(x, t) / ∆t − D_x^2 u(x, t) / (∆x)^2
So we see that since u_t − u_xx = 0,
T(x, t) = (u_t − u_xx) + (1/2) u_tt ∆t − (1/12) u_xxxx (∆x)^2 + ... = (1/2) u_tt ∆t − (1/12) u_xxxx (∆x)^2 + ...
If we truncate this infinite Taylor series using η ∈ (t, t + ∆t) and ξ ∈ (x − ∆x, x + ∆x), and assume the boundary and initial data are consistent at the corners and are both sufficiently smooth, we can then estimate |u_tt(x, η)| ≤ M_tt and |u_xxxx(ξ, t)| ≤ M_xxxx, so it follows that
|T(x, t)| ≤ (1/2) ∆t ( M_tt + (1/(6µ)) M_xxxx )
We can assume these bounds hold uniformly over the domain. We see that
|T(x, t)| → 0 as ∆t, ∆x → 0 ∀ (x, t) ∈ Ω
and this result is independent of any relation between the two mesh sizes. Thus this scheme is unconditionally consistent with the differential equation.
Since |T(x, t)| behaves asymptotically like O(∆t) as ∆t → 0, this scheme is said to have first order accuracy.
Since u_t = u_xx implies u_tt = u_xxxx, for µ = 1/6,
T(x, t) = (1/2) ∆t ( u_tt − (1/(6µ)) u_xxxx ) + O( (∆t)^2 ) = O( (∆t)^2 )
and so the scheme is second order accurate for µ = 1/6.
We can define the notation T_j^n = T(x_j, t_n).

5.5 Consistency
Does the difference scheme approximate the PDE as ∆x, ∆t → 0?
A scheme is consistent if
T (x, t) → 0 as ∆x, ∆t → 0
• A scheme is unconditionally consistent if the scheme is consistent for any relationship between ∆x
and ∆t
• A scheme is conditionally consistent if the scheme is consistent only for certain relationships between ∆x and ∆t
5.6 Accuracy
Take µ finite, so (∆t)^{a/b} = µ (∆x)^{c/d}; then (∆t)^α = (∆t)^{ad} = µ^{bd} (∆x)^{cb}, and we have
|T(x, t)| = O(∆t^α)
|T (x, t)| = O(∆tα )
• If α = 1, the scheme is first order accurate.
• If α = 2, the scheme is second order accurate.
• etc...
• The scheme is α-order accurate.
5.7 Convergence
A scheme is convergent if as ∆t, ∆x → 0 for any fixed point (x∗ , t∗ ) in the domain,
xj → x∗ , tn → t∗ =⇒ Ujn → u(x∗ , t∗ )
It suffices to show this for mesh points for sufficiently refined meshes, as convergence at all other points will
follow from continuity of u(x, t). We suppose that we can find a bound for the error T̄ :
|T_j^n| ≤ T̄ < ∞
We denote the error
e_j^n := U_j^n − u(x_j, t_n)
Taking the difference between the scheme and u(x_j, t_{n+1}), in terms of the truncation error and the exact solution at previous time steps, yields the error e_j^{n+1}. If the RHS of our difference scheme is represented by D, then
e_j^{n+1} = D U_j^n − ( D u(x_j, t_n) + T(x_j, t_n) ∆t ) = D e_j^n − T_j^n ∆t
Choose µ such that the coefficients of the RHS are positive, so that you may estimate E^n := max_j |e_j^n|, j = 0, ..., J, and so
E^{n+1} ≤ E^n + T̄ ∆t   with   E^0 = 0
and thus
E^n ≤ n T̄ ∆t,   n = 0, 1, 2, ...
and considering the domain
E^n ≤ T̄ t_F,   n = 0, 1, 2, ..., N
and since T̄ → 0 as ∆t, ∆x → 0, E^n → 0.
Example  If we replace U_j^n with u(x_j, t_n) in the definition of T_j^n we obtain
e_j^{n+1} = e_j^n + µ δ_x^2 e_j^n − T_j^n ∆t
which is
e_j^{n+1} = (1 − 2µ) e_j^n + µ e_{j+1}^n + µ e_{j−1}^n − T_j^n ∆t
For µ ≤ 1/2, define E^n := max_j |e_j^n|, j = 0, ..., J, and so
|e_j^{n+1}| ≤ E^n + T̄ ∆t ⟹ E^{n+1} ≤ E^n + T̄ ∆t
Since E^0 = 0 (the initial values are exact), we have
E^n ≤ n T̄ ∆t ≤ (1/2) ∆t ( M_tt + (1/(6µ)) M_xxxx ) t_F → 0 as ∆t → 0
5.7.1 Refinement Path
A refinement path is a sequence of pairs of mesh sizes, each of which tends to zero:
refinement path := { ((∆x)_i, (∆t)_i), i = 0, 1, 2, ...; (∆x)_i, (∆t)_i → 0 }
We can specify particular paths by requiring certain relationships between the mesh sizes.
Examples
(∆t)_i ∼ (∆x)_i   or   (∆t)_i ∼ (∆x)_i^2
Theorem  For the heat equation, let µ_i = (∆t)_i / (∆x)_i^2. If µ_i ≤ 1/2 ∀ i, if for all sufficiently large values of i the positive numbers n_i, j_i are such that
n_i (∆t)_i → t > 0,   j_i (∆x)_i → x ∈ [0, 1]
and if |u_xxxx| ≤ M_xxxx uniformly on Ω, then the approximations U_{j_i}^{n_i} generated by the explicit scheme for i = 0, 1, ... converge to the solution u(x, t) of the differential equation uniformly in the region.
This means that arbitrarily good accuracy can be attained by use of a sufficiently fine mesh.
5.8 Error: Fourier Analysis
Let
U_j^n = (λ)^n e^{ik(j∆x)}
where λ(k) is known as the amplification factor of the Fourier mode U_j^n. Place this into the difference equation of your scheme and solve for λ. We then have another numerical approximation
U_j^n = Σ_{m=−∞}^{∞} A_m e^{−imπ(j∆x)} (λ(k))^n
which can be compared to the Fourier expansion approximating the exact solution.
Example  For U_j^{n+1} = U_j^n + µ( U_{j+1}^n − 2U_j^n + U_{j−1}^n ), we see
λ(k) = 1 + µ( e^{ik∆x} − 2 + e^{−ik∆x} ) = 1 − 4µ sin^2( (1/2) k∆x )
So now
e^{−k^2 ∆t} − λ(k) = ( 1 − k^2 ∆t + (1/2) k^4 (∆t)^2 − ... ) − ( 1 − k^2 ∆t + (1/12) k^4 ∆t (∆x)^2 − ... ) = ( (∆t)^2/2 − ∆t(∆x)^2/12 ) k^4 − ...
Thus we have first order accuracy in general but second order accuracy if (∆x)^2 = 6∆t.

5.9 Stability
A scheme is stable if there exists a constant K such that
|(λ)^n| ≤ K,   n∆t ≤ t_F, ∀ k
That is, if the difference in the solutions of the DE and the numerical DE is bounded uniformly in the domain for any amount of time less than t_F. Thus
|λ(k)| ≤ 1 + K' ∆t
This is necessary and sufficient.

5.10 Implicit Scheme
If the scheme cannot be written in a form that has Ujn+1 explicitly computed given values Ujn , j = 0, 1, ..., J,
it is implicit. Implicit schemes involve more work but often have higher accuracy and/or stability, and thus
much larger time steps allow us to reach the solution much more quickly.
Example
( U_j^{n+1} − U_j^n ) / ∆t = ( U_{j+1}^{n+1} − 2U_j^{n+1} + U_{j−1}^{n+1} ) / (∆x)^2
which can be written as
∆_{−t} U_j^{n+1} = µ δ_x^2 U_j^{n+1},   µ = ∆t / (∆x)^2
This involves solving a system of linear equations. However, Fourier analysis for the stability shows
λ = 1 / ( 1 + 4µ sin^2( (1/2) k∆x ) )
Since |λ| < 1 for any positive µ, this scheme is unconditionally stable.
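A minimal sketch of this implicit scheme, (I + µ tridiag(−1, 2, −1)) U^{n+1} = U^n, with zero boundary values; a dense solve stands in for the O(n) tridiagonal solver above, and the initial condition sin(πx) is an illustrative choice.

import numpy as np

J, N, tF = 20, 50, 0.1
dx, dt = 1.0 / J, tF / N
mu = dt / dx**2                                  # unconditionally stable, so mu can exceed 1/2
x = np.linspace(0.0, 1.0, J + 1)
U = np.sin(np.pi * x)                            # assumed initial condition

# system matrix for the interior unknowns U_1 .. U_{J-1}
A = np.diag((1 + 2 * mu) * np.ones(J - 1)) \
    + np.diag(-mu * np.ones(J - 2), 1) + np.diag(-mu * np.ones(J - 2), -1)

for n in range(N):
    U[1:J] = np.linalg.solve(A, U[1:J])          # one linear solve per time step
    U[0] = U[J] = 0.0

exact = np.exp(-np.pi**2 * tF) * np.sin(np.pi * x)
print("mu =", mu, " max error =", np.max(np.abs(U - exact)))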
5.11 Other Conditions
If an equation obeys extra conditions such as a maximum principle, a uniqueness condition, or a physical constraint, the numerical scheme must also obey such conditions, else it may not converge.

6 Methods
6.1 Weighted Average θ Method
Given two schemes, you can weight one with θ and the other with (1 − θ) and add them together. Then stability, convergence, and accuracy may depend on θ, and θ can be chosen to balance these properties.
Example  If the explicit and implicit first order accurate schemes for the heat equation are averaged, we have
U_j^{n+1} − U_j^n = µ( θ δ_x^2 U_j^{n+1} + (1 − θ) δ_x^2 U_j^n )
θ = 0 yields the explicit scheme and θ = 1 yields the implicit scheme.
7 General Boundary Conditions
Boundary conditions like
u_x = α(t) u + g(t)   at x = 0
can be handled like
( U_1^n − U_0^n ) / ∆x = α^n U_0^n + g^n ⟹ U_0^n = β^n U_1^n − β^n g^n ∆x,   β^n = 1 / ( 1 + α^n ∆x )
Dirichlet conditions are trivial
u(0, t) = 0 =⇒ U0n = 0
D_x^2 y_i = D_+ D_− y_i = (1/h^2)( y_{i+1} − 2y_i + y_{i−1} )
y_{i±1} = y_i ± h y_i' + (1/2) h^2 y_i'' ± (1/6) h^3 y_i''' + (1/24) h^4 y_i'''' + ...
D_x^2 y_i = (1/h^2)[ ( y_i + h y_i' + (1/2) h^2 y_i'' + (1/6) h^3 y_i''' + (1/24) h^4 y_i'''' ) − 2y_i + ( y_i − h y_i' + (1/2) h^2 y_i'' − (1/6) h^3 y_i''' + (1/24) h^4 y_i'''' ) + ... ]
which simplifies to
D_x^2 y_i = y_i'' + O(h^2)
Let u_i ≈ y_i = y(x_i). The boundary value problem
−u_xx + q(x) u = f(x)
with u(0) = α and u(1) = β becomes
−(1/h^2)( u_{i+1} − 2u_i + u_{i−1} ) + q_i u_i = f_i
or
−u_2 + (2 + h^2 q_1) u_1 = h^2 f_1 + α                  (i = 1)
−u_{i+1} + (2 + h^2 q_i) u_i − u_{i−1} = h^2 f_i          (2 ≤ i ≤ n − 1)
(2 + h^2 q_n) u_n − u_{n−1} = h^2 f_n + β                 (i = n)
where u_0 = α and u_{n+1} = β. We must solve A_n ~u_n = ~f_n with

A_n = \begin{pmatrix} 2 + h^2 q_1 & -1 & & \\ -1 & 2 + h^2 q_2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 2 + h^2 q_n \end{pmatrix},
~u_n = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix},
~f_n = \begin{pmatrix} h^2 f_1 + \alpha \\ h^2 f_2 \\ \vdots \\ h^2 f_n + \beta \end{pmatrix}

Note: A_n is tridiagonal and symmetric. We can solve this by using A_n = LU:
L ~v_n = ~f_n,   U ~u_n = ~v_n
8 New Notes
9 Finite Difference Coefficients
9.1 Centered Differences
For the difference scheme for the αth derivative of f,
f^{(α)}(x_i) ≈ D_h^α f = (1/(d h^α)) ( a_{i−4} f(x_{i−4}) + ... + a_i f(x_i) + ... + a_{i+4} f(x_{i+4}) )
where d is the denominator that makes the coefficients a_i integers. If we want the scheme to have accuracy β,
f^{(α)}(x_i) = D_h^α f + O(h^β)
Then the difference coefficients are given by
α   β   d      a_{i−4}  a_{i−3}  a_{i−2}  a_{i−1}  a_i      a_{i+1}  a_{i+2}  a_{i+3}  a_{i+4}
1   2   2                                 −1       0        1
1   4   12                       1        −8       0        8        −1
1   6   60              −1       9        −45      0        45       −9       1
1   8   840    3        −32      168      −672     0        672      −168     32       −3
2   2   1                                 1        −2       1
2   4   12                       −1       16       −30      16       −1
2   6   180             2        −27      270      −490     270      −27      2
2   8   5040   −9       128      −1008    8064     −14350   8064     −1008    128      −9
3   2   2                        −1       2        0        −2       1
3   4   8               1        −8       13       0        −13      8        −1
3   6   240    −7       72       −338     488      0        −488     338      −72      7
4   2   1                        1        −4       6        −4       1
4   4   6               −1       12       −39      56       −39      12       −1
4   6   240    7        −96      676      −1952    2730     −1952    676      −96      7

9.2 Forward/Backward Differences
For the difference scheme for the αth derivative of f,
f^{(α)}(x_i) ≈ D_±^α f = (1/(d h^α)) ( a_i f(x_i) + ... + a_{i±4} f(x_{i±4}) + ... + a_{i±8} f(x_{i±8}) )
where d is the denominator that makes the coefficients a_i integers. If we want the scheme to have accuracy β,
f^{(α)}(x_i) = D_±^α f + O(h^β)
Then the difference coefficients are given by
α   β   d      a_i      a_{i±1}  a_{i±2}  a_{i±3}  a_{i±4}  a_{i±5}  a_{i±6}  a_{i±7}  a_{i±8}
1   1   1      ∓1       ±1
1   2   2      ∓3       ±4       ∓1
1   3   6      ∓11      ±18      ∓9       ±2
1   4   12     ∓25      ±48      ∓36      ±16      ∓3
1   5   60     ∓137     ±300     ∓300     ±200     ∓75      ±12
1   6   60     ∓147     ±360     ∓450     ±400     ∓225     ±72      ∓10
2   1   1      1        −2       1
2   2   1      2        −5       4        −1
2   3   12     35       −104     114      −56      11
2   4   12     45       −154     214      −156     61       −10
2   5   180    812      −3132    5265     −5080    2970     −972     137
2   6   180    938      −4014    7911     −9490    7380     −3618    1019     −126
3   1   1      ∓1       ±3       ∓3       ±1
3   2   2      ∓5       ±18      ∓24      ±14      ∓3
3   3   4      ∓17      ±71      ∓118     ±98      ∓41      ±7
3   4   8      ∓49      ±232     ∓461     ±496     ∓307     ±104     ∓15
3   5   120    ∓967     ±5104    ∓11787   ±15560   ∓12725   ±6432    ∓1849    ±232
3   6   240    ∓2403    ±13960   ∓36706   ±57384   ∓58280   ±39128   ∓16830   ±4216    ∓469
4   1   1      1        −4       6        −4       1
4   2   1      3        −14      26       −24      11       −2
4   3   6      35       −186     411      −484     321      −114     17
4   4   6      56       −333     852      −1219    1056     −555     164      −21
4   5   240    3207     −21056   61156    −102912  109930   −76352   33636    −8576    967

(The upper sign applies to the forward difference D_+ at the nodes x_{i+k}; the lower sign applies to the backward difference D_− at the nodes x_{i−k}.)
Part I
New Notes

10 Two Dimensional Problems
Given a problem on Ω = [0, X] × [0, Y],
u_t = b( u_xx + u_yy )
with initial conditions on Ω for t = 0 and boundary conditions on ∂Ω.
So for a uniform grid mesh,
U_{i,j}^n ≈ u(x_i, y_j, t_n) = u(i∆x, j∆y, n∆t),   i = 0, 1, ..., I,   j = 0, 1, ..., J,   n = 0, 1, ..., N
Explicit Scheme
( U_{i,j}^{n+1} − U_{i,j}^n ) / ∆t = b( δ_x^2 U_{i,j}^n / ∆x^2 + δ_y^2 U_{i,j}^n / ∆y^2 )
Consistency
T_{ij}^n = [ (1/2) ∆t u_tt − (b/12)( ∆x^2 u_xxxx + ∆y^2 u_yyyy ) ]_{ij}^n + ...
T_{ij}^n ≈ O( ∆t + ∆x^2 + ∆y^2 )
Stability
U_{ij}^n ≈ (λ)^n e^{i( k_x i∆x + k_y j∆y )}
????
Convergence  The error estimate
e_{ij}^n = U_{ij}^n − u_{ij}^n
U_{ij}^{n+1} = U_{ij}^n + µ_x δ_x^2 U_{ij}^n + µ_y δ_y^2 U_{ij}^n
e_{ij}^{n+1} = e_{ij}^n + µ_x δ_x^2 e_{ij}^n + µ_y δ_y^2 e_{ij}^n − ∆t T_{ij}^n
µ_x = b ∆t / ∆x^2,   µ_y = b ∆t / ∆y^2
10.0.1 θ Scheme
Accuracy is O(∆t + ∆x^2 + ∆y^2), but if θ = 1/2 we have the Crank–Nicolson scheme with accuracy O(∆t^2 + ∆x^2 + ∆y^2).
This requires solving a system
A ~U^{n+1} = ~b^n
where U_{ij}^n is reshaped into a vector.
Modified Gram-Schmidt  INPUT: A ∈ C^{m×n} where m ≥ n and rank(A) = n. OUTPUT: Q̂ ∈ C^{m×n} orthogonal, R̂ ∈ C^{n×n} upper triangular, where A = Q̂R̂.
for i = 1 to n
  r_ii = ||~a_i||_2
  ~q_i = ~a_i / r_ii
  for j = i + 1 to n
    r_ij = ~q_i^* ~a_j
    ~a_j = ~a_j − r_ij ~q_i
  end
end
Operation count: flops ≈ Σ_{i=1}^{n} Σ_{j=i+1}^{n} ( m + (m−1) + m + m ) = Σ_{i=1}^{n} Σ_{j=i+1}^{n} (4m − 1) = (4m − 1) Σ_{i=1}^{n} (n − i) = (4m − 1)( n^2 − (1/2) n^2 ) ≈ 2mn^2   (same as classical GS)
• Modified GS performs better than Classical GS when there is roundoff error involved.
• Both run into issues with ill conditioned matrices.
The modified GS creates a sequence of matrices {R_n}:
A R_1 R_2 ... R_N = [ ~q_1 ... ~q_n ] = Q̂
R̂ = ( R_1 R_2 ... R_N )^{−1} = R_N^{−1} ... R_2^{−1} R_1^{−1}
Triangular Orthogonalization: post-multiplying A by a sequence of upper triangular matrices to produce an orthogonal matrix.
Householder Method (QR factorization): based on orthogonal triangularization,
Q_N ... Q_2 Q_1 A = Q^{−1} A = R   (full QR factorization)
Application of Least Squares: Approximating Data.  Find a function f(x), typically a polynomial with unknown coefficients, that approximates some data set (x_i, y_i), i = 1, 2, ..., N, in the sense that ||f(~x) − ~y||_2 is minimized (we assume there is error in the data).
Assume f(x) = a_0 + a_1 x + ... + a_{n−1} x^{n−1} and m ≥ n. Let (~x^k)_i = x_i^k, ~a_i = a_i, and ~b_i = y_i, so
A ~a = ~b,   A ∈ R^{m×n},   A := [ ~1, ~x, ~x^2, ..., ~x^{n−1} ]
Problem: find ~a such that ||~r||_2 = ||~y − A~a||_2 is minimized.

10.1 Solution using Normal Equations
A^*A is invertible as long as there are at least n distinct x_i's; that is, rank(A) = min(n, number of distinct x_i's). So we can simply use
~a = (A^*A)^{−1} A^* ~y
10.1.1 Implementation: Solving Normal Equations
Solve the normal equations via a Cholesky factorization A^*A = R^*R, since A^*A is Hermitian and, if rank(A^*A) = n, positive definite.
1. Form A^*A and A^*~b (∼ mn^2 flops)
2. Form the Cholesky factorization A^*A = R^*R (∼ (1/3)n^3 flops)
3. Solve R^*~y = A^*~b via forward substitution (∼ n^2 flops)
4. Solve R~a = ~y via backward substitution (∼ n^2 flops)
Total operation count: ∼ mn^2 + (1/3)n^3 flops
10.1.2 Implementation: QR Decomposition
1. Form the reduced QR decomposition A = Q̂R̂ via Householder triangularization (∼ 2mn^2 − (2/3)n^3 flops)
2. Form ~y = Q̂^*~b (∼ 2mn flops)
3. Solve R̂~a = ~y via backward substitution (∼ n^2 flops)
Total operation count: ∼ 2mn^2 − (2/3)n^3 flops
10.1.3 Implementation: Pseudoinverse via reduced SVD
1. Form the reduced SVD A = Û Σ̂V ∗ (∼ 2mn2 + 11n3 flops)
2. Compute ~y = Û ∗ ~b (∼ 2mn flops)
3. Compute ~z = Σ̂−1~y (∼ n flops)
4. Compute ~a = V ~z (∼ 2n2 flops)
Total operation count: ∼ 2mn2 + 11n3 flops
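A minimal sketch of the data-fitting problem solved both by the normal equations and by a QR factorization; the data, noise level, and polynomial degree are invented for illustration.

import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 4
x = np.linspace(0.0, 1.0, m)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.01 * rng.standard_normal(m)   # noisy data

A = np.vander(x, n, increasing=True)          # columns 1, x, x^2, x^3
a_normal = np.linalg.solve(A.T @ A, A.T @ y)  # a = (A^* A)^{-1} A^* y
Q, R = np.linalg.qr(A)                        # reduced QR
a_qr = np.linalg.solve(R, Q.T @ y)
print(np.allclose(a_normal, a_qr, atol=1e-8))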
11 Normal Equations
• A^*A is Hermitian.
• If rank(A^*A) = n it is positive definite, since ~x^*A^*A~x = ||A~x||_2^2 and A~x = 0 is only possible when rank(A^*A) < n.

12 Stability
12.1 Condition Numbers
Let ~f : X → Y be a continuous function. We say ~f is well conditioned if small changes in ~x result in small changes of ~f(~x). That is, let δ~f = ~f(~x + δ~x) − ~f(~x).
• The absolute condition number is
κ̂ := lim_{δ→0} sup_{||δ~x|| ≤ δ} ||δ~f|| / ||δ~x||
or, if ~f is differentiable, let J(~x) be the Jacobian matrix of ~f, and then κ̂ = ||J(~x)||.
• The relative condition number is
κ := lim_{δ→0} sup_{||δ~x|| ≤ δ} ( ||δ~f|| / ||~f|| ) / ( ||δ~x|| / ||~x|| ) = ||J(~x)|| ||~x|| / ||~f||
• The condition number of a matrix is
κ(A) = ||A|| ||A^{−1}||, and if A is singular, κ(A) = ∞.
Condition number facts
• If A = A^* then one can show that the eigenvalues of A + δA satisfy |δλ| ≤ ||δA||.
NOTES: Jacobian Matrix
Theorem: Let x → f(x) be a problem with condition number κ(x). Let x → f̃(x) be a backward stable algorithm. Then ||f̃(x) − f(x)|| / ||f(x)|| = κ(x) O(ε_machine); therefore a backward stable algorithm applied to a well conditioned problem is accurate.
Proof. f̃ backward stable implies f̃(x) = f(x̃) for some x̃ such that ||x − x̃|| / ||x|| = O(ε_machine). Also
κ(x) = sup ( ||f(x̃) − f(x)|| / ||f(x)|| ) ( ||x|| / ||x − x̃|| )
So
||f(x̃) − f(x)|| / ||f(x)|| ≤ κ(x) ||x − x̃|| / ||x|| ≤ κ(x) O(ε_machine)
Conditioning of QR: A = Q̂R̂, A + δA = (Q̂ + δQ)R̂, so δA = δQ R̂ and ||δQ||_2 ≤ ||δA||_2 ||R̂^{−1}||_2, hence
( ||δQ||_2 / ||Q̂||_2 ) / ( ||δA||_2 / ||A||_2 ) ≤ ||R̂^{−1}||_2 ||A||_2 = κ(A)
Householder in IEEE is backward stable. If Q̃, R̃ are the results of applying Householder to A, then Q̃ is orthogonal, R̃ is upper triangular, and Q̃R̃ = A + δA where ||δA|| / ||A|| = O(ε_machine). That is, Q̃R̃ is the exact QR factorization of A + δA.
Back substitution is backward stable. The solution in IEEE satisfies (R + δR)x̃ = ~b for some upper triangular δR such that ||δR|| / ||R|| = O(ε_machine).
Example: Least Squares. Given A ∈ C^{m×n}, m ≥ n, rank(A) = n, and A~x = ~b,
||~b − Ax̂||_2 = min_{~x} ||~b − A~x||_2
x̂ = f(A, ~b); κ(A, ~b) is complicated, but roughly speaking κ(A, ~b) ∼ κ(A).
NOTES Vandermonde Matrix (in polynomial stuff)?
4th order compact finite difference.  We can improve the accuracy of
y_i'' = D_+ D_− y_i − (h^2/12) y_i^{(4)} + O(h^4)
by using y^{(4)}(x) = (q(x) y)'' − f''(x):
y_i'' = D_+ D_− y_i − (h^2/12) D_+ D_− ( (qy)_i − f_i ) + O(h^4)
or
y_i'' = D_+ D_− ( ( 1 − (h^2/12) q_i ) y_i ) + (h^2/12) D_+ D_− f_i + O(h^4)
12.2 Gaussian Elimination
Reducing A to I via GE: for A ∈ C^{m×m},
flops ∼ (4/3) m^3
since there are m columns, each column has O(m) entries, and it takes O(m) multiplications and additions to zero out any given entry.
Direct methods for solving A~x = ~b.

12.3 LU Factorization
This is Gaussian elimination where only adding multiples of one row to another is allowed.
12.3.1 Goal
Find B such that BA = U where U is upper triangular.
12.3.2 Process
B_1 eliminates all entries below a_{11}, so B_1 A has only zeros in the first column except a_{11}. Example: assume for simplicity a_{1,1} = 1,

B_1 = \begin{pmatrix} 1 & & \\ -a_{2,1} & 1 & \\ -a_{3,1} & 0 & 1 \end{pmatrix}, \qquad B_2 = \begin{pmatrix} 1 & & \\ 0 & 1 & \\ 0 & -a_{3,2} & 1 \end{pmatrix}

And so B = B_n ... B_2 B_1, so B^{−1} = B_1^{−1} ... B_n^{−1}, which turns out to be something like

\begin{pmatrix} 1 & & \\ a_{2,1} & 1 & \\ a_{3,1} & a_{3,2} & 1 \end{pmatrix}

Then B_2 eliminates the subdiagonal entries of the second column of B_1 A.
General LU. Let ~x_k denote the kth column of A at step k, with B_k chosen so that the entries of ~x_k below row k are eliminated. Choosing l_{jk} = x_{jk} / x_{kk}, the unit lower triangular factor is
L_{i,j} = l_{i,j}  if i > j,   1  if i = j,   0  if i < j
Algorithm in notes.
12.3.3 Problem
Existence of the LU factorization fails for some nonsingular matrices. Even for those that exist, stability is not guaranteed.
12.3.4 Efficiency
LU factorization operation count:
flops ∼ Σ_{k=1}^{m−1} Σ_{j=k+1}^{m} Σ_{s=k}^{m} 2 ∼ Σ_{k=1}^{m−1} Σ_{j=k+1}^{m} 2(m − k) ∼ Σ_{k=1}^{m−1} 2(m − k)^2 ∼ 2 Σ_{k=1}^{m−1} m^2 − 4m Σ_{k=1}^{m−1} k + 2 Σ_{k=1}^{m−1} k^2 ∼ 2m^3 − 2m^3 + (2/3) m^3 ∼ (2/3) m^3
This is 2x more efficient than QR or computing A^{−1}. Forward substitution L~x = ~b costs
flops ∼ m^2
which is negligible for large m compared to the cost of the LU factorization. Backward substitution U~x = ~y costs
flops ∼ m^2
which is likewise negligible.
12.4 PLU Factorization: PA = LU
Every nonsingular A ∈ C^{n×n} has a PLU factorization.

12.5 Cholesky Factorization
LU decomposition for symmetric positive definite matrices.
Banded matrices are zero above a certain superdiagonal and/or zero below a certain subdiagonal.
• p is the number of superdiagonals
• q is the number of subdiagonals
• 0 ≤ p, q ≤ m − 1
• The bandwidth is w = p + q + 1, so 1 ≤ w ≤ 2m − 1
Claim: If A has an LU decomposition then zeros are preserved. That is, if A = LU, then U has p superdiagonals and L has q subdiagonals, and zero beyond.
For fixed p, q and m → ∞, the operation count of LU is O(ms^2) ≪ O(m^3), where s = max(p, q).
If the matrix is sparse, A=sparse(m,m).
Sometimes you could have bands of zeros within the bandwidth. This algorithm will fill in those zeros.
Claim: For the PA = LU decomposition, if A has bandwidth w, then L has bandwidth q + 1 but U will have bandwidth p + q + 1.
Symmetric Positive Definite Matrices: A ∈ C^{m×m}, A = A^*, and ~x^*A~x > 0 for all ~x ≠ 0.
Example
−y'' = f with the second order central finite difference approximation:
A = ( diag(−1, −1) + 2I + diag(−1, 1) ) / h^2
We see that
~x^*A~x = ~x^* ( 2x_1 − x_2, −x_1 + 2x_2 − x_3, ..., −x_{m−1} + 2x_m )^T = 2x_1 x̄_1 − x_2 x̄_1 − x_1 x̄_2 + 2x_2 x̄_2 − x̄_2 x_3 + ...
Note |x_i − x_{i−1}|^2 = x̄_i x_i − x̄_{i−1} x_i − x̄_i x_{i−1} + x̄_{i−1} x_{i−1}, thus
~x^*A~x = |x_1|^2 + |x_m|^2 + Σ_{i=2}^{m} |x_i − x_{i−1}|^2
So ~x^*A~x ≥ 0, but if ~x^*A~x = 0, then
|x_1| = 0 ⟹ x_1 = 0 ⟹ x_2 = 0 ⟹ ... ⟹ x_m = 0
Example
−∆φ = f, so −(φ_xx + φ_yy) = f on D ⊂ R^2. Using lexicographic ordering on a 3 × 3 interior grid, the discrete system is
A ( u_1, u_2, ..., u_9 )^T = ( h^2 f_{11}, h^2 f_{21}, ..., h^2 f_{33} )^T + BC
where A is block tridiagonal, with tridiag(−1, 4, −1) blocks on the diagonal and −I blocks off the diagonal. A is symmetric and positive definite. A is positive definite if and only if the eigenvalues of A are all positive. The N^2 eigenvalues of A are
λ_{p,q} = (2/h^2) ( (1 − cos(pπh)) + (1 − cos(qπh)) )
So λ_{p,q} > 0 for p, q = 1, 2, ..., N.
Cholesky Factorization (modified LU for SPD (symmetric positive definite) matrices).  Let A ∈ C^{m×m} be SPD and write

A = \begin{pmatrix} a_{11} & ~w^* \\ ~w & B \end{pmatrix}

where B ∈ C^{(m−1)×(m−1)} is SPD and ~w ∈ C^{m−1}. Note a_{11} = ~e_1^* A ~e_1 > 0. We perform the first step of Gaussian elimination:

A = \begin{pmatrix} 1 & ~0^* \\ ~w/a_{11} & I \end{pmatrix} \begin{pmatrix} a_{11} & ~w^* \\ ~0 & B - ~w~w^*/a_{11} \end{pmatrix}
  = \begin{pmatrix} 1 & ~0^* \\ ~w/a_{11} & I \end{pmatrix} \begin{pmatrix} a_{11} & ~0^* \\ ~0 & B - ~w~w^*/a_{11} \end{pmatrix} \begin{pmatrix} 1 & ~w^*/a_{11} \\ ~0 & I \end{pmatrix}
  = \begin{pmatrix} \sqrt{a_{11}} & ~0^* \\ ~w/\sqrt{a_{11}} & I \end{pmatrix} \begin{pmatrix} 1 & ~0^* \\ ~0 & B - ~w~w^*/a_{11} \end{pmatrix} \begin{pmatrix} \sqrt{a_{11}} & ~w^*/\sqrt{a_{11}} \\ ~0 & I \end{pmatrix}
  = R_1^* A_2 R_1
Properties
• Every SPD A ∈ C^{m×m} has a unique Cholesky factorization.
• The bandwidth is preserved: if A has bandwidth w = 2p + 1, then R has bandwidth p + 1.
• No need for pivoting.
• The algorithm is backward stable for Ax = b: R̃^*R̃ = A + δA where ||δA|| / ||A|| = O(ε_machine) for some δA ∈ C^{m×m}. Cholesky gives us x̃ that satisfies (A + δA)x̃ = b.
Usage:
To solve Ax = b, obtain A = R∗ R. Then solve R∗~y = b via Forward subs, and R~x = ~y via Backward subs.
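A minimal sketch of that usage with numpy's Cholesky routine (which returns the lower triangular factor L, so R = L^*); the SPD test matrix is invented for illustration.

import numpy as np

rng = np.random.default_rng(3)
m = 6
B = rng.standard_normal((m, m))
A = B @ B.T + m * np.eye(m)            # an SPD test matrix
b = rng.standard_normal(m)

L = np.linalg.cholesky(A)              # A = L L^*  (R = L^*)
y = np.linalg.solve(L, b)              # forward substitution:  R^* y = b
x = np.linalg.solve(L.T, y)            # backward substitution: R x = y
print(np.allclose(A @ x, b))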
Claim: A_2 = (R_1^*)^{−1} A R_1^{−1} is SPD, thus B − ~w~w^*/a_{11} is SPD. Proof:
• Symmetric:
A_2^* = (R_1^*)^{−1} A R_1^{−1} = A_2
• Positive definite:
~x^* A_2 ~x = ~x^* (R_1^*)^{−1} A R_1^{−1} ~x = (R_1^{−1}~x)^* A (R_1^{−1}~x) = ~y^* A ~y > 0
• B − ~w~w^*/a_{11} is SPD, being a principal submatrix of A_2. We repeat the Cholesky step m times:
A = R_1^* A_2 R_1 = R_1^* R_2^* A_3 R_2 R_1 = ... = (R_1^* R_2^* ... R_m^*) I (R_m ... R_1) = R^* I R = R^* R
Operation Count:
flops ∼ Σ_{k=1}^{m} Σ_{j=k+1}^{m} Σ_{s=j}^{m} 2 = Σ_{k=1}^{m} Σ_{j=k+1}^{m} 2(m − j + 1) = Σ_{k=1}^{m} Σ_{n=1}^{m−k} 2n ≈ Σ_{k=1}^{m} (m − k)^2 = m^3 − 2m (1/2) m^2 + (1/3) m^3 = (1/3) m^3
Thus this is twice as efficient as LU.
13 Numerical Eigenvalue Problems
Let A ∈ C^{m×m}. The eigenvalue problem is
A~x = λ~x,   ~x ≠ 0
MISSING NOTES THURS 10
• Phase 1: Reduce A to upper Hessenberg form via unitary similarity transformations. For A ∈ C^{m×m}, in m − 2 steps we get the matrix H:
Q_{m−2} Q_{m−3} ... Q_2 Q_1 A Q_1 Q_2 ... Q_{m−3} Q_{m−2} = Q^* A Q = H
• Phase 2: Iterate to reveal the eigenvalues of H.
Operation count:
1st inner loop: Σ_{k=1}^{m−2} 4(m − k)^2 ∼ (4/3) m^3
2nd inner loop: Σ_{k=1}^{m−2} 4m(m − k) ∼ 4m · m^2/2 = 2m^3
Total = (10/3) m^3
The algorithm is backward stable with conditioning cond(A).
If A is Hermitian, then H is Hermitian:
A − A^* = Q H Q^* − Q H^* Q^* = Q( H − H^* )Q^* = 0
Therefore H is tridiagonal. In this case, a simplified algorithm is possible, reducing the operation count to (4/3) m^3.
Phase 2  For the remainder of the discussion, assume A ∈ R^{m×m} and A^T = A, thus A = QDQ^T. The {λ_i} are real, and the eigenvectors {~q_i} are orthonormal.
Rayleigh Quotient  Given ~x ∈ R^m, find a value α̂ ∈ R such that
||A~x − α̂~x||_2 = min_α ||A~x − α~x||_2
Now
||A~x − α~x||_2^2 = (A~x − α~x)^T (A~x − α~x) = ~x^T A^2 ~x − 2α ~x^T A ~x + α^2 ~x^T ~x
Taking the derivative with respect to α and setting it to zero, we get
−2 ~x^T A ~x + 2α̂ ~x^T ~x = 0 ⟹ α̂ = ~x^T A ~x / ~x^T ~x = R_A(~x)   (Rayleigh Quotient)
Error Estimate (Taylor series):
R_A(~x) = R_A(~q_j) + ∇R_A(~q_j) · (~x − ~q_j) + O( ||~x − ~q_j||^2 )
∇R_A(~x) = ∇( ~x^T A ~x / ~x^T ~x ) = ( ~x^T ~x ∇(~x^T A ~x) − (~x^T A ~x) ∇(~x^T ~x) ) / (~x^T ~x)^2
Since ∇(~x^T ~x) = ∇(x_1^2 + ... + x_m^2) = 2~x^T and ∇(~x^T A ~x) = 2(A~x)^T,
∇R_A(~x) = (2 / ~x^T ~x) ( (A~x)^T − R_A(~x) ~x^T )
Thus for an orthonormal eigenvector ~q_i,
∇R_A(~q_i) = 2( λ_i ~q_i^T − λ_i ~q_i^T ) = 0
So we conclude
R_A(~x) = λ_j + O( ||~x − ~q_j||^2 )
Method 1: Power Method  Assume |λ_1| > |λ_2| ≥ ... ≥ |λ_m|. Start with a vector ~w^{(0)} ∈ R^m:
~w^{(0)} = α_1 ~q_1 + ... + α_m ~q_m
A~w^{(0)} = α_1 A~q_1 + ... + α_m A~q_m = α_1 λ_1 ~q_1 + ... + α_m λ_m ~q_m
Assume this is ≈ α_1 λ_1 ~q_1 (that is, λ_1 ≫ λ_i for i > 1). The improved guess is ~w^{(1)} = A~w^{(0)}. Iterating, ~w^{(n)} = A~w^{(n−1)}:
~w^{(n)} = α_1 A^n ~q_1 + ... + α_m A^n ~q_m = α_1 λ_1^n ~q_1 + ... + α_m λ_m^n ~q_m ≈ α_1 λ_1^n ~q_1
Theorem: Suppose |λ_1| > |λ_2| ≥ ... ≥ |λ_m| and ~q_1^T ~v^{(0)} ≠ 0. The power iteration after k steps produces an approximate eigenvector/eigenvalue pair that satisfies
||~v^{(k)} − (±~q_1)|| = O( |λ_2/λ_1|^k ),   |λ^{(k)} − λ_1| = O( |λ_2/λ_1|^{2k} )
Proof. ~v^{(0)} = α_1 ~q_1 + ... + α_m ~q_m, so
~v^{(k)} = β_k A^k ~v^{(0)} = β_k A^k ( α_1 ~q_1 + ... + α_m ~q_m ) = β_k α_1 λ_1^k ( ~q_1 + ... + (α_m/α_1)(λ_m/λ_1)^k ~q_m ) → β_k α_1 λ_1^k ~q_1
Thus ||~v^{(k)} − (±~q_1)|| = O( |λ_2/λ_1|^k ). Also
λ^{(k)} = R_A(~v^{(k)}) = R_A(~q_1) + O( ||~v^{(k)} − (±~q_1)||^2 ) = λ_1 + O( |λ_2/λ_1|^{2k} )
Limitations of Power Iteration
• Only gives the largest eigenvalue λ_1.
• ~v^{(k)}, λ^{(k)} converge only linearly:
lim_{k→∞} ||~v^{(k+1)} − (±~q_1)|| / ||~v^{(k)} − (±~q_1)|| = c,   0 < c < 1
which is very slow if λ_2/λ_1 is close to 1.
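A minimal sketch of the power method with a Rayleigh-quotient eigenvalue estimate; the symmetric test matrix and iteration count are arbitrary choices.

import numpy as np

def power_iteration(A, num_steps=200):
    """Power method: repeatedly apply A and normalize; estimate lambda via R_A(v)."""
    v = np.random.default_rng(4).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_steps):
        w = A @ v
        v = w / np.linalg.norm(w)
    lam = v @ A @ v / (v @ v)          # Rayleigh quotient R_A(v)
    return lam, v

A = np.diag([5.0, 2.0, 1.0]) + 0.1 * np.ones((3, 3))
A = (A + A.T) / 2                      # symmetric test matrix
lam, v = power_iteration(A)
print(lam, np.max(np.abs(np.linalg.eigvalsh(A))))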
13.1 Shifted Inverse Iteration
Applying the power iteration method to A^{−1} yields an approximation to 1/λ_m and ~q_m.
Applying the power iteration method to (A − µI)^{−1}, which has eigenvalues (λ_j − µ)^{−1} and eigenvectors ~q_j, yields an approximation to (λ_k − µ)^{−1} and ~q_k, where |λ_k − µ| = min_j |λ_j − µ|.
Suppose λ_J is the eigenvalue closest to µ and λ_K is the next closest:
|λ_J − µ| < |λ_K − µ| ≤ |λ_i − µ| ∀ i ≠ J
Assume ~q_J^T ~v^{(0)} ≠ 0. The shifted inverse iteration produces approximations that satisfy
||~v^{(k)} − (±~q_J)|| = O( |(λ_J − µ)/(λ_K − µ)|^k ),   |λ^{(k)} − λ_J| = O( |(λ_J − µ)/(λ_K − µ)|^{2k} )
Proof. Same as the power iteration with λ_1 → 1/(λ_J − µ), λ_2 → 1/(λ_K − µ). We still have linear convergence, but the constant c depends on µ, so we may have faster convergence.
Note. The accuracy of the linear solve (A − µI)~w = ~v depends on κ(A − µI), and for good guesses of µ this could be huge. However, if this is solved by a backward stable algorithm, then ||~w − w̃|| = κ(A − µI) O(ε_machine). Therefore if µ is close to λ_J then the error ||~w − w̃|| could be large. However, even if this is the case, ~v^{(k)} = w̃/||w̃|| is still a good approximation to ~q_J, since we only need the direction.
Proof. (A − µI)~w = ~v and (A − µI)w̃ = ṽ, so
(A − µI)(~w − w̃) = ~v − ṽ = α_1 ~q_1 + ... + α_m ~q_m
~w − w̃ = (A − µI)^{−1}(~v − ṽ) = α_1 (λ_1 − µ)^{−1} ~q_1 + ... + α_m (λ_m − µ)^{−1} ~q_m
Therefore ~w − w̃ ≈ α_J (λ_J − µ)^{−1} ~q_J. The bulk of the error in ~w is in the direction of ~q_J, so ~v^{(k)} = w̃/||w̃||_2 is a good approximation to ~q_J.
Theorem  If ~v^{(0)} ∈ R^m is sufficiently close to the eigenvector ~q_J ∈ R^m, then Rayleigh quotient iteration produces approximations that satisfy
||~v^{(k+1)} − (±~q_J)|| = O( ||~v^{(k)} − (±~q_J)||^3 ),   |λ^{(k+1)} − λ_J| = O( |λ^{(k)} − λ_J|^3 )
Proof. (A − λ^{(k)} I)~w = ~v^{(k)}, so
~w = (A − λ^{(k)} I)^{−1} ~v^{(k)} = (A − λ^{(k)} I)^{−1} ( ~q_J + ~v^{(k)} − ~q_J )
(A − λ^{(k)} I)^{−1} ~q_J = ( 1/(λ_J − λ^{(k)}) ) ~q_J
~v^{(k)} − ~q_J ≈ ~q_K (up to a constant), so
(A − λ^{(k)} I)^{−1} ( ~v^{(k)} − ~q_J ) ≈ ( 1/(λ_K − λ^{(k)}) ) ( ~v^{(k)} − ~q_J )
~w ≈ ( 1/(λ_J − λ^{(k)}) ) ~q_J + ( 1/(λ_K − λ^{(k)}) ) ( ~v^{(k)} − ~q_J )
~v^{(k+1)} = ~w / ||~w||_2 ≈ ~q_J + ( (λ_J − λ^{(k)})/(λ_K − λ^{(k)}) ) ( ~v^{(k)} − ~q_J )
(~v^{(k+1)} − ~q_J) ≈ ( (λ_J − λ^{(k)})/(λ_K − λ^{(k)}) ) ( ~v^{(k)} − ~q_J ) = O( |λ_J − λ^{(k)}| ||~v^{(k)} − ~q_J|| )
but λ_J − λ^{(k)} = O( ||~v^{(k)} − ~q_J||^2 ) from the Rayleigh quotient Taylor series, which implies
||~v^{(k+1)} − ~q_J|| = O( ||~v^{(k)} − ~q_J||^3 )
Thus |λ^{(k+1)} − λ_J| = O( ||~v^{(k+1)} − ~q_J||^2 ) = O( ||~v^{(k)} − ~q_J||^6 ) = O( |λ^{(k)} − λ_J|^3 ).
Operation count (cost of one iteration):
            Full matrix                            Tridiagonal
Power       O(m^2)  (matrix-vector multiply)       O(m)
Inverse     O(m^2)  (precomputed Cholesky)         O(m)
Rayleigh    O(m^3)  (Cholesky each time)           O(m)
Therefore we can use the Phase 1 reduction (which is O(m^3)) to improve efficiency.
Note: the algorithms so far allow us to compute only a single eigenvalue at a time.
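A minimal sketch of shifted inverse iteration with the Rayleigh-quotient shift λ^(k) (one dense linear solve per step); the test matrix and initial shift are invented for illustration.

import numpy as np

def rayleigh_quotient_iteration(A, mu0, num_steps=8):
    """Shifted inverse iteration whose shift is updated to the Rayleigh quotient."""
    m = A.shape[0]
    v = np.random.default_rng(5).standard_normal(m)
    v /= np.linalg.norm(v)
    lam = mu0
    for _ in range(num_steps):
        w = np.linalg.solve(A - lam * np.eye(m), v)   # (A - lambda^(k) I) w = v^(k)
        v = w / np.linalg.norm(w)
        lam = v @ A @ v                               # updated Rayleigh quotient
    return lam, v

A = np.diag([4.0, 2.5, 1.0]) + 0.05 * np.ones((3, 3))
A = (A + A.T) / 2
print(rayleigh_quotient_iteration(A, mu0=2.0)[0])     # converges to the eigenvalue nearest 2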
13.1.1 QR Algorithm
A^{(k−1)} = Q^{(k)} R^{(k)},   A^{(k)} = R^{(k)} Q^{(k)} = (Q^{(k)})^T A^{(k−1)} Q^{(k)}
This is a similarity transformation, so it has the same eigenvalues. As k → ∞,
A^{(k)} → D  if A is normal,   T (upper triangular)  otherwise
Theorem: Assume that A ∈ R^{m×m} is symmetric with spectral decomposition A = QΛQ^T, where |λ_1| > |λ_2| > ... > |λ_m| > 0, and det(∆_k(Q)) ≠ 0 for k = 1, 2, ..., m. Then A^{(k)} → D as k → ∞.
Proof.
A^{(k)} = (Q̄^{(k)})^T A Q̄^{(k)},   Q̄^{(k)} = Q^{(1)} ... Q^{(k)}
Therefore we need to prove that Q̄^{(k)} → Q as k → ∞, because then A^{(k)} → Q^T A Q = D. Now
A = A^{(0)} = Q^{(1)} R^{(1)},   A^2 = Q^{(1)} R^{(1)} Q^{(1)} R^{(1)} = Q^{(1)} A^{(1)} R^{(1)}
since R^{(1)} Q^{(1)} = A^{(1)} = Q^{(2)} R^{(2)}. So
A^2 = Q^{(1)} Q^{(2)} R^{(2)} R^{(1)},   A^3 = Q^{(1)} Q^{(2)} Q^{(3)} R^{(3)} R^{(2)} R^{(1)}
and by the above definitions A^k = Q̄^{(k)} R̄^{(k)}, but
A^k = [ A^k ~e_1, ..., A^k ~e_m ]
This simulates power iteration. Let ~e_1 = α_{11} ~q_1 + ... + α_{1m} ~q_m where {~q_i} are the columns of Q. Taking the dot product with ~q_1: ~q_1^T ~e_1 = α_{11} = ~e_1^T ~q_1 = det(∆_1(Q)) ≠ 0, therefore α_{11} ≠ 0. So
(1/α_{11}) ~e_1 = ~q_1 + (α_{12}/α_{11}) ~q_2 + ... + (α_{1m}/α_{11}) ~q_m
Multiplying by A^k,
(1/α_{11}) A^k ~e_1 = λ_1^k ~q_1 + ... + (α_{1m}/α_{11}) λ_m^k ~q_m
This becomes
( r̄_{11}^{(k)} / (α_{11} λ_1^k) ) ~q_1^{(k)} = ~q_1 + ... + (λ_m^k / λ_1^k)(α_{1m}/α_{11}) ~q_m
So ~q_1^{(k)} → ~q_1 as k → ∞. Now,
~e_2 = α_{21} ~q_1 + α_{22} ~q_2 + ... = (α_{21}/α_{11}) ~e_1 + ( (α_{11}α_{22} − α_{12}α_{21}) / α_{11} ) ~q_2 + α̃_{23} ~q_3 + ... + α̃_{2m} ~q_m
where
(α_{11}α_{22} − α_{12}α_{21}) / α_{11} = (1/α_{11}) det \begin{pmatrix} α_{11} & α_{12} \\ α_{21} & α_{22} \end{pmatrix} = (1/α_{11}) det \begin{pmatrix} ~q_1^T ~e_1 & ~q_2^T ~e_1 \\ ~q_1^T ~e_2 & ~q_2^T ~e_2 \end{pmatrix} = (1/α_{11}) det(∆_2(Q)) ≠ 0
Multiplying by A^k, etc. as before gives us that Q^{(k)} → Q as k → ∞. One can proceed inductively.
Note
• A(0) is tridiagonal from Phase 1.
• A(k) is also tridiagonal for any finite k.
• As k → ∞, off diagonal elements converge to 0.
3. Deflation: if |A_{j,j+1}| < ε, then set A^{(k)} = \begin{pmatrix} A_1 & \\ & A_2 \end{pmatrix}, with A_1 ∈ R^{j×j} and A_2 ∈ R^{(m−j)×(m−j)}. Then continue iterating on A_1 and A_2 independently.
4. Accelerate by shifting:
Q^{(k)} R^{(k)} = A^{(k−1)} − µ^{(k)} I
where µ^{(k)} is usually chosen to be A_{mm}^{(k−1)}. To undo the shift, use A^{(k)} = R^{(k)} Q^{(k)} + µ^{(k)} I.
13.2 Iterative Schemes for Ax = b
Given an iterative scheme based on a splitting A = M − N,
M ~x^{(k+1)} = N ~x^{(k)} + b
Thus if we have the amplification matrix G = M^{−1} N, the error satisfies
~e^{(k+1)} = G ~e^{(k)}
By induction, ~e^{(k)} = G^k ~e^{(0)}. Performing the spectral decomposition G = R Γ R^{−1},
~e^{(k)} = R Γ^k R^{−1} ~e^{(0)}
If Γ = diag(γ_1, ..., γ_m) with |γ_1| ≥ ... ≥ |γ_m|, then we have convergence if |γ_1| < 1, since then G^k → 0 as k → ∞.
13.2.1 Jacobi Iteration
M = D, N = D − A ⟹ G = I − D^{−1} A
Say we are considering the discrete Laplacian operator problem −∆u = f. For this situation, G = I − (h^2/2) A, so A and G have the same eigenvectors, and the eigenvalues are related by
γ_p = 1 − (h^2/2) λ_p = 1 − (1 − cos(pπh)) = cos(pπh)
In particular γ_1 = cos(πh) = 1 − (1/2)π^2 h^2 + O(h^4). Therefore ρ(G) < 1 ∀ h > 0, but ρ(G) → 1 as h → 0; since ρ(G) approaches 1 quadratically as h → 0, we expect slow convergence for the scheme.
How many iterations will we need?
||~e^{(k)}|| ≈ ε ||~e^{(0)}|| ⟹ ρ^k = ε ⟹ k ≈ log(ε) / log(ρ)
How do we choose ε? Our error for A~u = ~f is O(h^2), so take ε ≈ ch^2. Then
k = ( log(c) + 2 log(h) ) / log(ρ) ≈ ( log(c) − 2 log(m + 1) ) / ( −(1/2) π^2 (m + 1)^{−2} ) ∼ (4/π^2) m^2 log(m) = O( m^2 log(m) )
The total work is O( m^3 log(m) ).
A similar analysis can be performed for m = N^d where N is the number of grid points in each coordinate direction. The total work turns out to be O( N^{d+2} log(N) ).
13.2.2 Gauss-Seidel
M = L + D, N = M − A = −U ⟹ G = −(L + D)^{−1} U
ρ(G) = 1 − π^2 h^2 + O(h^4) as h → 0. Therefore this takes half as many iterations as Jacobi. Still total work O( m^3 log(m) ).
13.2.3 Successive Over-Relaxation (SOR)
Idea: ~u^{(k+1)} is closer to ~u than ~u^{(k)} is, but not by much. That is, GS moves ~u^{(k)} in the right direction but not far enough. Take
A = D + L + U,   M = (1/ω)( D − ωL ),   N = (1/ω)( (1 − ω)D + ωU )
Theorem (Ostrowski): If A ∈ R^{n×n} is SPD and D − ωL is nonsingular, then SOR converges for all 0 < ω < 2.
For our standard u'' = f BVP,
(Stage 1)  u_i^{GS} = (1/2)( u_{i−1}^{(k+1)} + u_{i+1}^{(k)} + h^2 f_i )
(Stage 2)  u_i^{(k+1)} = u_i^{(k)} + ω( u_i^{GS} − u_i^{(k)} )
Think of u_i^{GS} − u_i^{(k)} as the direction and ω as the step size, starting from u_i^{(k)}.
For our model problem,
ω_optimal = 2 / ( 1 + sin(πh) ) ≈ 2 − 2πh ⟹ ρ_optimal = ω_optimal − 1 ≈ 1 − 2πh
So ρ_optimal → 1 as h → 0, but much more slowly than GS or Jacobi. K_optimal = O( N log(N) ) in any dimension. The total work is O( N^{d+1} log(N) ).
13.3 Descent Methods
Consider φ : R^m → R defined by
φ(~u) = (1/2) ~u^T A ~u − b^T ~u
where A ∈ R^{m×m} is SPD. Solve A~u = b.
Claim: φ has a unique global minimizer ~u^* that satisfies A~u^* = b.
Proof: φ is quadratic in u_1, ..., u_m. Thus lim_{||~u||→∞} φ(~u) = +∞ because ~u^T A ~u > 0. Now
φ(~u) = (1/2) Σ_{i=1}^{m} u_i ( Σ_{j=1}^{m} a_{ij} u_j ) − Σ_{i=1}^{m} b_i u_i
∂φ/∂u_k = (1/2) ( Σ_{j=1}^{m} a_{kj} u_j + Σ_{i=1}^{m} u_i a_{ik} ) − b_k = 0
But since a_{ik} = a_{ki},
∂φ/∂u_k = ( Σ_{j=1}^{m} a_{kj} u_j ) − b_k = (A~u − ~b)_k = 0
Therefore ~u^* such that A~u^* = b is the only critical point. By positive definiteness, it is the unique global minimizer.
So our strategy will be, starting with a guess ~u^{(0)}, to form the sequence {~u^{(k)}}_{k=0} by choosing a search direction ~d^{(k)} and a step size α^{(k)} such that
~u^{(k+1)} = ~u^{(k)} − α^{(k)} ~d^{(k)}
where α^{(k)} is chosen to minimize φ( ~u^{(k)} − α^{(k)} ~d^{(k)} ) with respect to α^{(k)}.
13.3.1 Steepest Descent
The direction of steepest descent of the surface is −∇φ(~u^{(k)}). This turns out to be the residual:
−∇φ(~u^{(k)}) = ~b − A~u^{(k)} = ~r^{(k)}
To find the step size,
α^{(k)} = argmin_α φ( ~u^{(k)} + α ~r^{(k)} )
This is done via
φ(~u + α~r) = (1/2) ~u^T A ~u − ~u^T b + α( ~r^T A ~u − ~r^T b ) + (1/2) α^2 ~r^T A ~r
Taking the derivative with respect to α and setting it to zero,
α ~r^T A ~r = ~r^T ~b − ~r^T A ~u = ~r^T( ~b − A~u ) = ~r^T ~r
So α = ~r^T ~r / ~r^T A ~r, and thus
α^{(k)} = ( (~r^{(k)})^T ~r^{(k)} ) / ( (~r^{(k)})^T A ~r^{(k)} ),   ~d^{(k)} = −~r^{(k)}
Note
~r^{(k)} = ~b − A~u^{(k)} = ~b − A( ~u^{(k−1)} + α^{(k−1)} ~r^{(k−1)} ) = ~r^{(k−1)} − α^{(k−1)} A ~r^{(k−1)}
Note: since A is SPD, the level curves of φ are hyper-ellipses centered at ~u^*.
Convergence: for m = 2, matrices with higher eccentricity (larger ratio λ_2/λ_1) will take more iterations.
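A minimal sketch of the steepest-descent iteration derived above, including the residual recurrence; the SPD test matrix and iteration count are invented for illustration.

import numpy as np

def steepest_descent(A, b, num_steps=500):
    """Steepest descent: step along the residual with the optimal alpha."""
    u = np.zeros_like(b)
    r = b - A @ u
    for _ in range(num_steps):
        alpha = (r @ r) / (r @ (A @ r))
        u = u + alpha * r
        r = r - alpha * (A @ r)        # r^(k) = r^(k-1) - alpha^(k-1) A r^(k-1)
    return u

rng = np.random.default_rng(6)
B = rng.standard_normal((10, 10))
A = B @ B.T + 10 * np.eye(10)          # SPD test matrix
b = rng.standard_normal(10)
u = steepest_descent(A, b)
print(np.linalg.norm(b - A @ u))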
13.3.2 Conjugate Gradient
Guaranteed to converge to the solution in m steps.
Consider m = 2. Starting with an initial guess and an arbitrary search direction ~d^{(0)}, we want ~d^{(1)} = ~u^* − ~u^{(1)} given ~d^{(0)} = ~u^{(1)} − ~u^{(0)}. Claim: ~d^{(1)}, ~d^{(0)} are A-orthogonal, (~d^{(0)})^T A ~d^{(1)} = 0.
Proof. ~d^{(0)} is tangent to a level curve of φ at ~u^{(1)}. ∇φ(~u^{(1)}) = A~u^{(1)} − ~b = −~r^{(1)}, and ~r^{(1)} is orthogonal to ~d^{(0)}: (~d^{(0)})^T ~r^{(1)} = 0. Also
~r^{(1)} = ~b − A~u^{(1)} = A~u^* − A~u^{(1)} = A~d^{(1)}
Thus (~d^{(0)})^T ~r^{(1)} = (~d^{(0)})^T A ~d^{(1)} = 0.
Idea: At each step, pick a search direction so that ~d^{(k)} and ~d^{(k−1)} are A-orthogonal:
~d^{(k)} = −~r^{(k)} + β^{(k)} ~d^{(k−1)}
Choose β^{(k)} so that ~d^{(k)} and ~d^{(k−1)} are A-orthogonal:
β^{(k)} = ( (~r^{(k)})^T A ~d^{(k−1)} ) / ( (~d^{(k−1)})^T A ~d^{(k−1)} ) = ... = ( (~r^{(k)})^T ~r^{(k)} ) / ( (~r^{(k−1)})^T ~r^{(k−1)} )
Optimal step size:
α^{(k)} = argmin_α φ( ~u^{(k)} + α~r^{(k)} − αβ~d^{(k−1)} ) = ( (~d^{(k)})^T ~r^{(k)} ) / ( (~d^{(k)})^T A ~d^{(k)} )
Note: A ∈ R^{m×m} is SPD. CG converges in m steps to the exact solution. It converges to ||~r|| < ε in O(√κ(A)) iterations. Poisson in 2D: κ(A) = O(h^{−2}) = O(N^2), so the iteration count will be O(N).
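A minimal sketch of the conjugate gradient method described above, written with the common sign convention d = r + βd (equivalent to the notes' d = −r + βd up to the sign of d); the SPD test matrix is invented for illustration.

import numpy as np

def conjugate_gradient(A, b, tol=1e-10):
    """Conjugate gradient for SPD A: A-orthogonal search directions."""
    u = np.zeros_like(b)
    r = b - A @ u
    d = r.copy()
    for _ in range(len(b)):                      # at most m steps in exact arithmetic
        Ad = A @ d
        alpha = (r @ r) / (d @ Ad)               # optimal step size
        u = u + alpha * d
        r_new = r - alpha * Ad
        if np.linalg.norm(r_new) < tol:
            break
        beta = (r_new @ r_new) / (r @ r)         # beta^(k) = r^(k)T r^(k) / r^(k-1)T r^(k-1)
        d = r_new + beta * d
        r = r_new
    return u

rng = np.random.default_rng(7)
B = rng.standard_normal((20, 20))
A = B @ B.T + 20 * np.eye(20)
b = rng.standard_normal(20)
print(np.linalg.norm(b - A @ conjugate_gradient(A, b)))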