Class Notes 6: THE ACCELERATION OF SOR

Math 639d
Due Date: Oct. 9
(updated: October 2, 2014)
We shall see that a judicious choice of the parameter $\omega$ can lead to a significantly faster converging algorithm. Let us first set the stage. Suppose that we have a linear iterative method with reduction matrix $G$ satisfying $\rho(G) < 1$. Then we know there is a norm $\|\cdot\|_*$ such that the induced matrix norm satisfies
\[
\|G\|_* = \gamma,
\]
with $\gamma < 1$. Actually, $\gamma$ can be taken arbitrarily close to $\rho(G)$. We then have
\[
\|e_k\|_* \le \|G\|_*^k \, \|e_0\|_* = \gamma^k \|e_0\|_*.
\]
Now, to reduce the $\|\cdot\|_*$-error by a factor of $\epsilon$, we need
\[
\gamma^k \le \epsilon, \quad\text{i.e.,}\quad k \ge \frac{\ln(\epsilon^{-1})}{\ln(\gamma^{-1})}.
\]
Setting $\delta = 1 - \gamma$, we find that
\[
\ln(\gamma^{-1}) = -\ln(\gamma) = -\ln(1-\delta) \approx \delta,
\]
where we used Taylor's series for the approximation. Thus, the number of iterations for a fixed reduction should grow proportionally with $\delta^{-1} = (1-\gamma)^{-1} \approx (1-\rho(G))^{-1}$, i.e., $k \cdot (1-\gamma)$ should behave like a constant.
(This was the argument that I was looking for in the extra credit problem
of Homework 2.)
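As a quick numerical illustration of this constant behavior, here is a short Python sketch (the tolerance $\epsilon = 10^{-6}$ is an arbitrary choice, not from the notes):

    import numpy as np

    # Sketch: required iteration counts k = ln(1/eps)/ln(1/gamma) for a fixed
    # error reduction eps, showing that k*(1 - gamma) is roughly constant.
    eps = 1e-6
    for gamma in (0.9, 0.99, 0.999, 0.9999):
        k = np.log(1/eps) / np.log(1/gamma)
        print(gamma, round(k), k*(1 - gamma))

The last column settles near $\ln(\epsilon^{-1}) \approx 13.8$, as the Taylor argument predicts.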
Recall that for A3 we computed the spectral radius of $G_J$ (the Jacobi method) and found that $\rho(G_J) = \cos(\pi/(n+1)) \approx 1 - \pi^2/(2(n+1)^2)$. We again used Taylor's series for the approximation. Thus, one should expect that $k$ grows like a constant times $n^2$ for Jacobi (compare this with your homework results). Our goal is to show that $\rho(G_{SOR}) = 1 - O(n^{-1})$ for an appropriate choice of $\omega$ (this will be illustrated in Programming 4).
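A short sketch can confirm the Jacobi spectral radius numerically; it assumes $A$ is the standard tridiagonal $(-1, 2, -1)$ model matrix from the homework, which is not restated here:

    import numpy as np

    # Sketch: numerically confirm rho(G_J) = cos(pi/(n+1)) and the Taylor
    # estimate 1 - pi^2/(2(n+1)^2). Assumes A is the tridiagonal (-1, 2, -1)
    # model matrix from the homework.
    n = 50
    A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    D = np.diag(np.diag(A))
    GJ = np.eye(n) - np.linalg.solve(D, A)        # Jacobi reduction matrix
    rho = max(abs(np.linalg.eigvals(GJ)))
    print(rho, np.cos(np.pi/(n+1)), 1 - np.pi**2/(2*(n+1)**2))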
To analyze the SOR method we start by setting
\[
\beta = \rho(L + U).
\]
Note that the Jacobi method in the case when $A = I + L + U$ is
\[
x_{i+1} = -(L+U)x_i + b
\]
and the corresponding reduction matrix is $G_J = -(L+U)$, so $\beta$ is nothing more than the spectral radius of $G_J$.
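In the same spirit, a sketch of the splitting (again assuming the tridiagonal model matrix, rescaled so its diagonal is the identity):

    import numpy as np

    # Sketch: diagonally scale A so that A = I + L + U, then check that
    # beta = rho(L + U) equals rho(G_J). Same tridiagonal model assumption.
    n = 50
    A = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / 2.0  # unit diagonal
    L = np.tril(A, -1)                  # strictly lower triangular part
    U = np.triu(A, 1)                   # strictly upper triangular part
    beta = max(abs(np.linalg.eigvals(L + U)))   # = rho(-(L+U)) = rho(G_J)
    print(beta, np.cos(np.pi/(n+1)))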
We introduce the following hypotheses:
(A.1) $0 < \omega < 2$.
(A.2) $G_J$ only has real eigenvalues and $\rho(G_J) = \beta$, with $0 < \beta < 1$.
(A.3) The matrix $A = I + L + U$ satisfies Property A.
Remark 1. The case when $\beta = 0$ is not interesting since this means that $\rho(G_J) = 0$ and we should use the Jacobi method.
Remark 2. The assumption of real eigenvalues is automatically satisfied when $A$ is symmetric.
Under the above hypotheses, we have the following theorem.
Theorem 1 (D. Young). Assume that (A.1)-(A.3) hold and let $G^\omega_{SOR}$ be the reduction matrix corresponding to SOR with iteration parameter $\omega$. Then,
\[
\rho(G^\omega_{SOR}) =
\begin{cases}
\omega - 1 & \text{if } \omega \in [\omega_{opt}, 2),\\
1 - \omega + \frac{1}{2}\omega^2\beta^2 + \omega\beta\sqrt{1 - \omega + \frac{\omega^2\beta^2}{4}} & \text{if } \omega \in (0, \omega_{opt}].
\end{cases}
\]
Here
\[
\omega_{opt} = \frac{2}{1 + \sqrt{1 - \beta^2}}.
\]
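A direct evaluation of the two branches is easy to sketch in Python (the helper name rho_sor is ours, and the choice of $\beta$ from the model problem is an assumption):

    import numpy as np

    # Sketch: evaluate Young's formula for rho(G_SOR^omega) given beta.
    def rho_sor(omega, beta):
        w_opt = 2.0 / (1.0 + np.sqrt(1.0 - beta**2))
        if omega >= w_opt:
            return omega - 1.0                       # first branch
        return (1.0 - omega + 0.5*omega**2*beta**2   # second branch
                + omega*beta*np.sqrt(1.0 - omega + 0.25*omega**2*beta**2))

    beta = np.cos(np.pi/51)                          # model-problem beta, n = 50
    w_opt = 2.0 / (1.0 + np.sqrt(1.0 - beta**2))
    print(w_opt, rho_sor(1.0, beta) - beta**2, rho_sor(w_opt, beta) - (w_opt - 1))
    # The last two differences should be ~0: omega = 1 gives beta^2
    # (Gauss-Seidel) and omega = w_opt gives w_opt - 1.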
Remark 3. We shall see in the proof of the theorem that both of the expressions for $\rho(G^\omega_{SOR})$ give the same result when $\omega = \omega_{opt}$. In addition, the second is always greater than or equal to $\omega_{opt} - 1$. As $\omega - 1$ is an increasing function of $\omega$, it follows that the choice of $\omega = \omega_{opt}$ gives the smallest spectral radius over all $\omega \in (0, 2)$, namely,
\[
\rho(G^{\omega_{opt}}_{SOR}) = \omega_{opt} - 1 = \frac{1 - \sqrt{1-\beta^2}}{1 + \sqrt{1-\beta^2}} \approx 1 - 2\sqrt{1-\beta^2}.
\]
Before proving the theorem, we further investigate the above estimate. We start by assuming that the Jacobi method requires a large number of iterations, i.e., $\beta$ is positive and close to one, say $\beta = 1 - \gamma$ with $\gamma$ small. In this case,
\[
\sqrt{1-\beta^2} = \sqrt{2\gamma - \gamma^2} \approx 2^{1/2}\gamma^{1/2}
\]
so
\[
\rho(G^{\omega_{opt}}_{SOR}) \approx 1 - 2^{3/2}\gamma^{1/2}.
\]
Thus, the number of iterations for SOR with the optimal parameter choice should grow like $\gamma^{-1/2}$ instead of the $\gamma^{-1}$ growth for Jacobi and Gauss-Seidel iteration. This slower growth in the number of iterations will be illustrated in Homework 4. Even though we will only use a ballpark estimate for $\omega_{opt}$, we will see considerable acceleration.
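A quick tabulation of the predicted counts $k \approx \ln(\epsilon^{-1})/\ln(\rho^{-1})$ makes the contrast concrete (a sketch using the model-problem value of $\beta$; the tolerance is arbitrary):

    import numpy as np

    # Sketch: predicted iteration counts for a 1e-6 error reduction, using
    # k ~ ln(1/eps)/ln(1/rho); Jacobi grows like n^2, optimal SOR like n.
    eps = 1e-6
    for n in (25, 50, 100, 200):
        beta = np.cos(np.pi/(n+1))                   # rho(G_J), model problem
        w_opt = 2.0/(1.0 + np.sqrt(1.0 - beta**2))
        k_jacobi = np.log(1/eps)/np.log(1/beta)
        k_sor = np.log(1/eps)/np.log(1/(w_opt - 1))
        print(n, round(k_jacobi), round(k_sor))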
Proof. Let $\lambda$ be a nonzero eigenvalue of $G^\omega_{SOR}$ with eigenvector $e$. Then
\[
(6.1)\qquad \lambda(\omega^{-1}I + L)e = ((\omega^{-1} - 1)I - U)e
\]
or
\[
(\lambda\omega L + \omega U)e = (1 - \lambda - \omega)e.
\]
Dividing by $\pm\omega\sqrt{\lambda}$ gives
\[
\Big(\frac{1}{z}L + zU\Big)e = \frac{1 - \lambda - \omega}{\pm\omega\sqrt{\lambda}}\, e
\]
where $z = \pm 1/\sqrt{\lambda}$. Note that $\lambda$ and hence $z$ may be a complex number.
It follows by Property A that
\[
(6.2)\qquad \mu = \frac{1 - \lambda - \omega}{\pm\omega\sqrt{\lambda}}
\]
is an eigenvalue of $-G_J = (L + U)$. By Property A, the eigenvalues of $J_{-1} = -L - U$ are the same as those of $J_1 = L + U$, so that if $\mu$ is an eigenvalue of $G_J$, so is $-\mu$. It follows that equation (6.2) has the same solutions (we think of $\lambda$ as being the unknown here) as
\[
(6.3)\qquad (\lambda + \omega - 1)^2 = \omega^2\lambda\mu^2.
\]
Note that if $\lambda \ne 0$ satisfies (6.3) for some eigenvalue $\mu$ of $G_J$, then going through the equations in reverse order implies that there is an eigenvector $e$ satisfying (6.1) with this value of $\lambda$. Thus, if $\lambda$ and $\mu$ solve (6.2) or (6.3), then $\lambda$ is an eigenvalue of $G_{SOR}$ if and only if $\mu$ is an eigenvalue of $G_J$. Examining (6.2) and (6.3), we may assume, without loss of generality, that $\mu \ge 0$.
As seen in the proof of Proposition 2 of Class 4, $\det(G^\omega_{SOR}) = (1-\omega)^n$ and hence is nonzero except for $\omega = 1$. When $\omega = 1$, any eigenvector $e$ of $G_J$ with eigenvalue $\mu \ne 0$ leads to an eigenvalue $\lambda = \mu^2$ for $G^1_{SOR}$. It follows in this case that $\rho(G^1_{SOR}) = \beta^2$, in agreement with the theorem (since $\omega_{opt} > 1$ and the second expression of the theorem equals $\beta^2$ when $\omega = 1$). This also explains why Jacobi took twice as many iterations as Gauss-Seidel in Problem 1 of Homework 3. For all other $\omega$, $G^\omega_{SOR}$ has only nonzero eigenvalues and every one comes from (6.3).
The equation (6.3) can be rewritten
\[
(6.4)\qquad \lambda^2 + 2\Big((\omega - 1) - \frac{\omega^2\mu^2}{2}\Big)\lambda + (\omega - 1)^2 = 0.
\]
Its roots are
\[
(6.5)\qquad
\begin{aligned}
\lambda^{\pm}(\mu,\omega) &= (1-\omega) + \frac{\omega^2\mu^2}{2} \pm \sqrt{\Big(1 - \omega + \frac{\omega^2\mu^2}{2}\Big)^2 - (\omega - 1)^2}\\
&= (1-\omega) + \frac{\omega^2\mu^2}{2} \pm \omega\mu\sqrt{1 - \omega + \frac{\omega^2\mu^2}{4}}.
\end{aligned}
\]
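As a sanity check, one can compare the eigenvalues of $G^\omega_{SOR}$, assembled directly from (6.1), with the roots (6.5); this sketch again assumes the tridiagonal model problem, for which $\sigma(G_J) = \{\cos(k\pi/(n+1))\}$:

    import numpy as np

    # Sketch: compare the eigenvalues of G_SOR^omega, built from (6.1), with
    # the roots lambda_pm(mu, omega) of (6.5) over the eigenvalues mu of G_J.
    n, omega = 20, 1.1
    A = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / 2.0  # A = I + L + U
    L, U = np.tril(A, -1), np.triu(A, 1)
    G = np.linalg.solve(np.eye(n)/omega + L, (1/omega - 1)*np.eye(n) - U)
    mus = np.cos(np.arange(1, n + 1)*np.pi/(n + 1))             # sigma(G_J)
    roots = []
    for mu in mus[mus >= 0]:                     # mu >= 0 suffices, as above
        a = (1 - omega) + 0.5*omega**2*mu**2
        s = np.sqrt(complex((1 - omega + 0.5*omega**2*mu**2)**2 - (omega - 1)**2))
        roots += [a + s, a - s]
    print(max(abs(np.linalg.eigvals(G))), max(abs(np.array(roots))))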
First, the only way that we can get a complex root is if
\[
(6.6)\qquad \omega - 1 > \frac{\omega^2\mu^2}{4}.
\]
In this case, the roots are complex conjugates of each other and by (6.4), their product equals $(\omega - 1)^2$, i.e., they each have absolute value $\omega - 1$.
Now, for
\[
(6.7)\qquad \omega - 1 \le \frac{\omega^2\mu^2}{4},
\]
the coefficient of $\lambda$ in (6.4) is negative and hence we have two positive roots. The larger one, $\lambda^+(\mu,\omega)$, is clearly an increasing function of $\mu$ (for fixed $\omega$).
[Figure 1: $f_1$ corresponding to $\omega = .7$ and $\omega = \omega_{opt}(.8) = 1.25$, and $f_2$ with $\mu = .8$. (The axis labeled $x$ should be labeled $\lambda$.)]
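Only the caption of Figure 1 survives in these notes; the following matplotlib sketch is a guess at regenerating the plot from the formulas for $f_1^\omega$ and $f_2^\mu$ discussed below:

    import numpy as np
    import matplotlib.pyplot as plt

    # Sketch: reconstruct Figure 1: the lines f1(lam) = (lam + omega - 1)/omega
    # for omega = .7 and 1.25, against the double-valued f2(lam) = +/- mu*sqrt(lam).
    lam = np.linspace(0.0, 1.4, 200)
    for omega in (0.7, 1.25):
        plt.plot(lam, (lam + omega - 1)/omega, label=f"f1, omega = {omega}")
    mu = 0.8
    plt.plot(lam, mu*np.sqrt(lam), "k", label="f2, mu = .8")
    plt.plot(lam, -mu*np.sqrt(lam), "k")
    plt.xlabel("lambda")        # the original x-axis should be labeled lambda
    plt.legend()
    plt.show()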
When (6.7) holds, $\lambda^+(\mu,\omega)$ is a decreasing function of $\omega$ (for fixed $\mu$). To see this we use a geometric argument. Multiplying (6.2) by $\pm\sqrt{\lambda}$, we see that the solutions $\lambda$ of (6.2) occur at the intersection of the straight line $f_1^\omega(\lambda) = (\lambda + \omega - 1)/\omega$ and the double-valued function $f_2^\mu(\lambda) = \pm\mu\sqrt{\lambda}$. This situation is illustrated in Figure 1, where $f_2^\mu(\lambda)$ for $\mu = .8$ and $f_1^\omega(\lambda)$ for $\omega = 1.25$ and $\omega = .7$ are plotted (you should be able to identify each of these in the picture). Clearly, $f_1^\omega(\lambda)$ is a straight line passing through $(1,1)$ with slope $1/\omega$. As $\omega$ increases, the line rotates upwards (at least for $\lambda \in [0,1]$). The rightmost value of $\lambda$ at the point of intersection of the line and the upper branch of $f_2^\mu$ is $\lambda^+(\mu,\omega)$ and moves to the left as $\omega$ increases, i.e., $\lambda^+(\mu,\omega)$ is a decreasing function of $\omega$ for fixed $\mu$.
It is not hard to check that the value of $\omega$ where $f_1^\omega$ is tangent to $f_2^\mu$ coincides with the value of $\omega$ resulting in the double root (this is also clear from the figure). Thus, $\omega$ is the solution of
\[
(6.8)\qquad \omega - 1 = \frac{\omega^2\mu^2}{4}.
\]
We denote the root in $(0, 2)$ by $\omega_{opt}(\mu)$ and calculate
\[
\omega_{opt}(\mu) = \frac{2}{1 + \sqrt{1 - \mu^2}}.
\]
It easily follows from (6.8) that
\[
\lambda^{\pm}(\mu, \omega_{opt}(\mu)) = \omega_{opt}(\mu) - 1.
\]
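One can check (6.8) and the resulting double root numerically for the value $\mu = .8$ used in Figure 1 (a sketch):

    import numpy as np

    # Sketch: at omega = omega_opt(mu) the discriminant in (6.5) vanishes and
    # the double root equals omega_opt(mu) - 1; here mu = .8 as in Figure 1.
    mu = 0.8
    w = 2.0/(1.0 + np.sqrt(1.0 - mu**2))         # omega_opt(.8) = 1.25
    disc = (1 - w + 0.5*w**2*mu**2)**2 - (w - 1)**2
    lam = (1 - w) + 0.5*w**2*mu**2               # the double root
    print(w, disc, lam, w - 1)                   # disc = 0 and lam = w - 1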
From the above discussion, we see that:
(a) If $\omega \ge \omega_{opt}(\beta)$, then $|\lambda^{\pm}(\mu,\omega)| = \omega - 1$, as this is the complex or double root case for every $\mu$.
(b) If $\omega \le \omega_{opt}(\beta)$, then (6.4) with $\mu = \beta$ has positive real roots, the largest given by $\lambda^+(\beta,\omega)$. This is the second expression appearing in the theorem. The remaining $\mu \in \sigma(G_J)$ with $\mu \ge 0$ satisfy $\mu < \beta$. We can only have complex roots if $\omega > 1$, and in this case, as $\lambda^+(\beta,\omega)$ is the larger positive root of (6.4) with $\mu = \beta$, $\lambda^+(\beta,\omega) \ge \omega - 1$. For all values of $\omega$, if $\mu$ leads to real roots then they are smaller than $\lambda^+(\beta,\omega)$, as $\lambda^+(\mu,\omega)$ is an increasing function of $\mu$ for $\omega$ fixed. Thus, all eigenvalues have absolute value at most $\lambda^+(\beta,\omega)$ when $\omega \in (0, \omega_{opt}(\beta)]$.
This completes the proof of the theorem.