15: GMRES and preconditioned GMRES

Math 639
(updated: January 2, 2012)
The following theorem follows from the proposition of the previous class
and the fact that the GMRES iterate minimizes the residual over the Krylov space.
Theorem 1. Assume that A is an n × n real matrix which satisfies

(15.1)        α‖x‖^2 ≤ ⟨Ax, x⟩   for all x ∈ R^n,
              ⟨Ax, y⟩ ≤ β‖x‖ ‖y‖   for all x, y ∈ R^n.

Suppose that e_i = x − x_i where x_i is the i'th iterate in the GMRES algorithm
(with starting iterate x_0) and x is the solution of Ax = b. Then

              ‖Ae_i‖ ≤ ρ^{i/2} ‖Ae_0‖   for   ρ = 1 − α^2/β^2.
Proof. We consider the inner product

              ⟨⟨x, y⟩⟩ = ⟨Ax, Ay⟩   for all x, y ∈ R^n.

We shall denote its corresponding norm by ‖x‖_* = ⟨⟨x, x⟩⟩^{1/2}. It
follows from (15.1) that

(15.2)        ⟨⟨Ax, x⟩⟩ = ⟨A^2 x, Ax⟩ ≥ α‖Ax‖^2 = α‖x‖_*^2

and

(15.3)        ⟨⟨Ax, y⟩⟩ = ⟨A^2 x, Ay⟩ ≤ β‖Ax‖ ‖Ay‖ = β‖x‖_* ‖y‖_*.
We consider the Richardson method

(15.4)        x̃_{i+1} = x̃_i + τ(b − Ax̃_i)

using the same initial vector as in the GMRES iteration, i.e., x̃_0 = x_0, and
with a suitable choice of τ (e.g., τ = α/β^2). Let ẽ_i = x − x̃_i.
Then (15.2), (15.3), and the proposition of last class imply that

              ‖ẽ_{i+1}‖_* ≤ ρ^{1/2} ‖ẽ_i‖_*.

Repeatedly applying the above inequality gives

              ‖ẽ_i‖_* ≤ ρ^{i/2} ‖e_0‖_*.

Now,

              ẽ_i = (I − τA)^i e_0
so

              Aẽ_i = (I − τA)^i r_0 = r_0 − Aζ

for some ζ ∈ K_i(A). By the minimization property of GMRES,

              ‖Ae_i‖ = ‖r(θ)‖ = min_{ζ ∈ K_i(A)} ‖r(ζ)‖ = min_{ζ ∈ K_i(A)} ‖r_0 − Aζ‖
                     ≤ ‖Aẽ_i‖ = ‖ẽ_i‖_* ≤ ρ^{i/2} ‖e_0‖_* = ρ^{i/2} ‖Ae_0‖.
This completes the proof of the theorem.
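The bound of Theorem 1 can be checked numerically. The following is a minimal Python/NumPy sketch (not part of the original notes): it builds a small non-symmetric matrix whose symmetric part is positive definite, takes α as the smallest eigenvalue of the symmetric part and β as the operator norm of A, so that (15.1) holds, and compares the exact GMRES residual norms ‖Ae_i‖ = ‖r_i‖ with ρ^{i/2}‖Ae_0‖. The test matrix, sizes, and helper names are illustrative.

# A minimal numerical check of Theorem 1 (not part of the original notes).
# We build a small non-symmetric matrix whose symmetric part is positive
# definite, take alpha = smallest eigenvalue of the symmetric part and
# beta = ||A||_2, so that (15.1) holds, and compare the exact GMRES residual
# norms ||A e_i|| = ||r_i|| with the bound rho^{i/2} ||A e_0||.
import numpy as np

rng = np.random.default_rng(0)
n = 50
S = rng.standard_normal((n, n))
A = 2.0 * np.eye(n) + 0.5 * (S - S.T)   # SPD part plus a skew-symmetric part

alpha = np.linalg.eigvalsh(0.5 * (A + A.T)).min()   # <Ax,x> >= alpha ||x||^2
beta = np.linalg.norm(A, 2)                         # <Ax,y> <= beta ||x|| ||y||
rho = 1.0 - (alpha / beta) ** 2

b = rng.standard_normal(n)
r0 = b                                  # starting iterate x_0 = 0, so r_0 = b

def gmres_residual(i):
    """||r_i|| = min over zeta in K_i(A, r_0) of ||r_0 - A zeta||."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(i)])
    Q, _ = np.linalg.qr(K)              # orthonormal basis of K_i(A, r_0)
    c, *_ = np.linalg.lstsq(A @ Q, r0, rcond=None)
    return np.linalg.norm(r0 - A @ Q @ c)

for i in range(1, 11):
    bound = rho ** (i / 2) * np.linalg.norm(r0)
    print(f"i={i:2d}   ||r_i|| = {gmres_residual(i):.3e}   bound = {bound:.3e}")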
An alternative to applying GMRES is to apply CG to the normal equations.
Some argue against this approach since it generally squares the condition
number. The hope is that GMRES might produce CG-like acceleration without
squaring the condition number. This is actually true when A is SPD; however,
in that case one should obviously apply CG directly since it involves much
less computation. Even in the case when A satisfies (15.1), at least
theoretically, GMRES may not lead to any computational advantage.
Indeed, consider applying CG to the normal equations,

(15.5)        A^* A x = A^* b.
Here A^* denotes the adjoint of A with respect to ⟨·, ·⟩. By the definition of
A^*, A^* A is self-adjoint with respect to ⟨·, ·⟩. To estimate the convergence
associated with conjugate gradient applied to (15.5), we need to estimate
the condition number of A^* A. We first observe that

              ‖Ax‖^2 = ⟨Ax, Ax⟩ ≤ β‖x‖ ‖Ax‖,

from which it follows that

              ⟨A^* Ax, x⟩ = ‖Ax‖^2 ≤ β^2 ‖x‖^2.

Thus, the largest eigenvalue λ_n of A^* A satisfies

              λ_n = sup_{x ∈ R^n, x ≠ 0} ⟨A^* Ax, x⟩ / ⟨x, x⟩
                  = sup_{x ∈ R^n, x ≠ 0} ⟨Ax, Ax⟩ / ‖x‖^2 ≤ β^2.
For the smallest eigenvalue λ_1, we start with the following fact:

(15.6)        ‖z‖ = sup_{y ∈ R^n, y ≠ 0} ⟨z, y⟩ / ‖y‖.
This is an easy consequence of the Schwarz inequality. Taking z = Ax gives

              ‖Ax‖ = sup_{y ∈ R^n, y ≠ 0} ⟨Ax, y⟩ / ‖y‖ ≥ ⟨Ax, x⟩ / ‖x‖ ≥ α‖x‖.

Thus,

              λ_1 = inf_{x ∈ R^n, x ≠ 0} ⟨Ax, Ax⟩ / ‖x‖^2 ≥ α^2.
It follows that the condition number of A^* A is bounded by β^2/α^2. Thus, the
error e_i resulting from CG applied to the normal equations satisfies

              ‖Ae_i‖ ≤ 2 ((1 − α/β)/(1 + α/β))^i ‖Ae_0‖.

We see that CG provides an acceleration over GMRES (at least in theory),
converging in O(β/α) iterations as opposed to the O(β^2/α^2) suggested by
the above theorem.
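To make the comparison concrete, here is a minimal sketch (not part of the notes) of CG applied to the normal equations (15.5). Only products with A and A^T are needed in each iteration; the printed values are ‖b − Ax_i‖ = ‖Ae_i‖, which the estimate above bounds. The test matrix and all names are illustrative.

# A minimal sketch (not part of the notes) of CG applied to the normal
# equations (15.5), A^T A x = A^T b.  Only products with A and A^T are needed
# per iteration; the printed values are ||b - A x_i|| = ||A e_i||.
import numpy as np

def cg_normal_equations(A, b, x0, iters):
    """Standard CG on A^T A x = A^T b; returns the iterate and ||A e_i|| history."""
    x = x0.copy()
    r = A.T @ (b - A @ x)              # residual of the normal equations
    p = r.copy()
    history = [np.linalg.norm(b - A @ x)]
    for _ in range(iters):
        Ap = A.T @ (A @ p)
        step = (r @ r) / (p @ Ap)
        x = x + step * p
        r_new = r - step * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        history.append(np.linalg.norm(b - A @ x))
    return x, history

rng = np.random.default_rng(1)
n = 50
S = rng.standard_normal((n, n))
A = 2.0 * np.eye(n) + 0.5 * (S - S.T)   # same kind of test matrix as above
b = rng.standard_normal(n)
x, hist = cg_normal_equations(A, b, np.zeros(n), 20)
print(["%.2e" % h for h in hist])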
We finish this class with a proof of the last two lemmas of the previous
class.
Lemma 1. Assume that the Arnoldi algorithm does not stop before the l'th
step. Then the vectors v_1, v_2, . . . , v_l form a ⟨·, ·⟩-orthonormal basis for the
Krylov space K_l(A).

Proof. By the definition of v_1 and (5), v_i has unit length for i = 1, . . . , l + 1.
Assuming that v_1, . . . , v_j are orthonormal, we have, for k = 1, 2, . . . , j,

              ⟨v_{j+1}, v_k⟩ = h_{j+1,j}^{-1} ⟨Av_j − Σ_{i=1}^{j} h_{i,j} v_i, v_k⟩
                            = h_{j+1,j}^{-1} (⟨Av_j, v_k⟩ − h_{k,j}) = 0.

This is the key computation in the induction showing that v_1, . . . , v_{l+1} are
orthogonal. Hence v_1, . . . , v_{l+1} are orthonormal and span an (l+1)-dimensional
space. A simple induction shows that v_i ∈ K_i. As K_i has dimension at most
i, v_1, . . . , v_i is an orthonormal basis for K_i, for i = 1, 2, . . . , l + 1.
Lemma 2. Assume that the Arnoldi algorithm does not stop before the l'th
step. For i = 1, . . . , l,

              Av_i = Σ_{k=1}^{i+1} (H̃_l)_{k,i} v_k.
Proof. From (2) of the Arnoldi algorithm,

              w_j = h_{j+1,j} v_{j+1} = Av_j − Σ_{i=1}^{j} h_{i,j} v_i.

The lemma follows by rearrangement and the definition of H̃_l.
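Both lemmas are easy to verify numerically. Below is a minimal sketch (not part of the notes) of the Arnoldi process with modified Gram-Schmidt; it checks that the computed vectors are orthonormal (Lemma 1) and that A V_l = V_{l+1} H̃_l, which is the matrix form of Lemma 2. The names and the breakdown tolerance are illustrative.

# A minimal sketch (not part of the notes) of the Arnoldi process with
# modified Gram-Schmidt, checking Lemma 1 (the v_k are orthonormal) and the
# matrix form of Lemma 2, A V_l = V_{l+1} H_tilde_l.
import numpy as np

def arnoldi(A, r0, l, tol=1e-12):
    """Return V (n x (l+1)) with orthonormal columns and H_tilde ((l+1) x l)."""
    n = A.shape[0]
    V = np.zeros((n, l + 1))
    H = np.zeros((l + 1, l))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(l):
        w = A @ V[:, j]
        for i in range(j + 1):             # orthogonalize against v_1, ..., v_{j+1}
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < tol:              # the algorithm stops (breakdown)
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(2)
n, l = 30, 8
A = rng.standard_normal((n, n))
r0 = rng.standard_normal(n)
V, H = arnoldi(A, r0, l)

print("orthonormality error:", np.linalg.norm(V.T @ V - np.eye(V.shape[1])))
print("Lemma 2 residual:    ", np.linalg.norm(A @ V[:, :H.shape[1]] - V @ H))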
Example 1. We consider a finite element or finite difference method for
the boundary value problem:
−∆u + v(x) · ∇u + q(x)u = f for x ∈ Ω,
u = 0 for x ∈ ∂Ω.
Here v is a vector field (a velocity field). The first term above involves
higher order derivatives (order 2) while the remaining terms involve lower
order derivatives (order 1 and 0, respectively). As we have already seen,
discretization of the first term, e.g., with finite differences, leads to a
symmetric and positive definite matrix A, while the discretization of the full
equation gives a non-symmetric (and possibly no longer definite) matrix Ã.
Often, it is possible to develop an efficient preconditioner B for Ã satisfying

(15.7)        α‖x‖_A^2 ≤ (BÃx, x)_A   for all x ∈ R^n,
              (BÃx, y)_A ≤ β‖x‖_A ‖y‖_A   for all x, y ∈ R^n,

with constants α, β independent of h and hence of n. The techniques for
deriving such preconditioners and why they lead to the above inequalities are
beyond the scope of this class. Note that Theorem 1 can be applied to the
preconditioned equations and guarantees a uniform (independent of h) rate
of iterative convergence. This result holds for the preconditioned GMRES
method defined using ⟨·, ·⟩ = (·, ·)_A. This convergence rate is not guaranteed
for the preconditioned GMRES algorithm based on the standard (dot)
inner product.
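As an illustration (not part of the notes), one can check inequalities of the form (15.7) numerically for a one-dimensional analogue of the above problem, -u'' + v u' + q u = f on (0,1) with u(0) = u(1) = 0, discretized by central finite differences, taking the exact inverse B = A^{-1} of the SPD part as the preconditioner. With this choice (BÃx, y)_A = (Ãx, y), so the constants in (15.7) are governed by T = A^{-1/2} Ã A^{-1/2}; the sketch below prints α and β for several mesh sizes, and they remain bounded as h decreases. The constant coefficients v and q, and all names, are illustrative.

# A minimal sketch (not part of the notes) checking (15.7) for a 1D analogue
# of the example: -u'' + v u' + q u = f on (0,1), u(0) = u(1) = 0, central
# finite differences.  A is the SPD discrete Laplacian, A_tilde the full
# non-symmetric operator, and the preconditioner is taken as B = A^{-1}, so
# (B A_tilde x, y)_A = (A_tilde x, y) and the constants in (15.7) are governed
# by T = A^{-1/2} A_tilde A^{-1/2}.
import numpy as np

def constants(n, v=20.0, q=1.0):
    h = 1.0 / (n + 1)
    D2 = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # -u''
    D1 = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)                # u'
    A = D2                                   # SPD part
    A_tilde = D2 + v * D1 + q * np.eye(n)    # full non-symmetric operator

    lam, Q = np.linalg.eigh(A)               # A = Q diag(lam) Q^T, lam > 0
    A_inv_sqrt = Q @ np.diag(lam ** -0.5) @ Q.T
    T = A_inv_sqrt @ A_tilde @ A_inv_sqrt
    alpha = np.linalg.eigvalsh(0.5 * (T + T.T)).min()   # lower constant in (15.7)
    beta = np.linalg.norm(T, 2)                         # upper constant in (15.7)
    return alpha, beta

for n in (25, 50, 100, 200):
    a, b = constants(n)
    print(f"n = {n:4d}   alpha = {a:.3f}   beta = {b:.3f}")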