6: THE ACCELERATION OF SOR

Math 639 (updated: January 2, 2012)

We shall see that a judicious choice of the parameter ω can lead to a significantly faster converging algorithm. Let us first set the stage. Suppose that we have a linear iterative method with reduction matrix G satisfying ρ(G) < 1. Then we know there is a norm ‖·‖_* such that the induced matrix norm satisfies ‖G‖_* = γ with γ < 1. In fact, γ can be taken arbitrarily close to ρ(G). We then have

    ‖e_k‖_* ≤ ‖G‖_*^k ‖e_0‖_* = γ^k ‖e_0‖_*.

Now, to reduce the ‖·‖_*-error by a factor of ε, we need γ^k ≤ ε, i.e.,

    k ≥ ln(ε^{-1}) / ln(γ^{-1}).

Setting δ = 1 − γ, we find that

    ln(γ^{-1}) = −ln(γ) = −ln(1 − δ) ≈ δ,

where we used Taylor's series for the approximation. Thus, the number of iterations for a fixed reduction should grow proportionally to δ^{-1} = (1 − γ)^{-1} ≈ (1 − ρ(G))^{-1}, i.e., k·(1 − γ) should behave like a constant.

Recall that for A_3, we computed the spectral radius of G_jac (the Jacobi method) and found that

    ρ(G_jac) = cos(π/(n + 1)) ≈ 1 − π²/(2(n + 1)²).

We again used Taylor's series for the approximation. Thus, one should expect that k grows like a constant times n² for Jacobi (compare this with your homework results). Our goal is to show that ρ(G_SOR) = 1 − O(n^{-1}) for an appropriate choice of ω (this is illustrated in Programming 4).

To analyze the SOR method we start by setting β = ρ(L + U). Note that in the case A = I + L + U, the Jacobi method is

    x_{i+1} = −(L + U)x_i + b

and the corresponding reduction matrix is G_J = −(L + U), so β is nothing more than the spectral radius of G_J. We introduce the following hypotheses:

(A.1) 0 < ω < 2.
(A.2) G_J has only real eigenvalues and ρ(G_J) = β with 0 < β < 1.
(A.3) The matrix A = I + L + U satisfies Property A.

Remark 1. The case β = 0 is not interesting since it means that ρ(G_J) = 0 and we should simply use the Jacobi method.

Under the above hypotheses, we have the following theorem.

Theorem 1 (D. Young). Assume that (A.1)–(A.3) hold and let G_SOR be the reduction matrix corresponding to SOR. Then

    ρ(G_SOR) = ω − 1                                           if ω ∈ [ω_opt, 2),
    ρ(G_SOR) = 1 − ω + ω²β²/2 + ωβ√(1 − ω + ω²β²/4)            if ω ∈ (0, ω_opt],

where

    ω_opt = 2 / (1 + √(1 − β²)).

Before proving the theorem, we should investigate the estimate which it gives. We start by assuming that the Jacobi method requires a large number of iterations, i.e., β is positive and close to one. First, ω − 1 is an increasing function, so its smallest value on the interval [ω_opt, 2) is achieved at ω = ω_opt, where

    ρ(G_SOR) = (1 − √(1 − β²)) / (1 + √(1 − β²)) ≈ 1 − 2√(1 − β²).

It is also a consequence of the proof below that the expression for ρ(G_SOR) in the second case is always greater than or equal to ω_opt − 1, and hence ω = ω_opt gives the lowest possible spectral radius. Now if Jacobi requires many iterations, β = 1 − γ with γ small. In this case,

    √(1 − β²) = √(2γ − γ²) ≈ 2^{1/2} γ^{1/2},

i.e., ρ(G_SOR) ≈ 1 − 2^{3/2} γ^{1/2}. Thus, the number of iterations for SOR with a good parameter choice should grow like γ^{-1/2} instead of the γ^{-1} growth for Jacobi. This slower growth in the number of iterations was illustrated in Programming 4. Even though we only used a ballpark estimate for ω_opt, we still achieved considerable acceleration.
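To make these growth rates concrete, here is a minimal Python sketch, assuming the model problem with ρ(G_jac) = cos(π/(n + 1)) and an illustrative tolerance ε = 10^{-6}; it compares the heuristic iteration counts k ≈ ln(ε^{-1})/ln(ρ^{-1}) for Jacobi and for SOR with ω = ω_opt.

    import math

    # Heuristic number of iterations needed to reduce the error by eps,
    # i.e. the smallest k with rho^k <= eps.
    def iters(rho, eps=1e-6):
        return math.ceil(math.log(1.0 / eps) / math.log(1.0 / rho))

    for n in (10, 20, 40, 80):
        beta = math.cos(math.pi / (n + 1))                # rho(G_jac), model problem
        w_opt = 2.0 / (1.0 + math.sqrt(1.0 - beta**2))    # optimal SOR parameter
        rho_sor = w_opt - 1.0                             # rho(G_SOR) at omega = omega_opt
        print(f"n={n:3d}  Jacobi: ~{iters(beta):6d} iters   "
              f"SOR(omega={w_opt:.3f}): ~{iters(rho_sor):4d} iters")

Doubling n roughly quadruples the Jacobi count but only doubles the SOR count, matching the γ^{-1} versus γ^{-1/2} growth derived above.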
Proof. Let λ be a nonzero eigenvalue of G_SOR with eigenvector e. Then

(6.1)    λ(ω^{-1}I + L)e = ((ω^{-1} − 1)I − U)e,

or

    (λωL + ωU)e = (1 − λ − ω)e.

Dividing by ±ω√λ gives

    (z^{-1}L + zU)e = [(1 − λ − ω)/(±ω√λ)] e,

where z = ±1/√λ. Note that in general λ is a complex number and so z must be as well. It follows by Property A that

(6.2)    μ = (1 − λ − ω)/(±ω√λ)

is an eigenvalue of −G_J = L + U. By Property A, the eigenvalues of J_{-1} = −L − U are the same as those of J_1 = L + U, so that if μ is an eigenvalue of G_J, so is −μ. It follows that equation (6.2) has the same solutions (we think of λ as being the unknown here) as

(6.3)    (λ + ω − 1)² = ω²λμ².

Note that if λ ≠ 0 satisfies (6.3) for some eigenvalue μ of G_J, then going through the equations in reverse order shows that there is an eigenvector e satisfying (6.1) with this value of λ. Thus, for λ ≠ 0 and μ solving (6.2) or (6.3), λ is an eigenvalue of G_SOR if and only if μ is an eigenvalue of G_J.

Now, when ω = 1, λ = 0 is an eigenvalue of G_SOR(1) = G_GS. However, any eigenvector e with eigenvalue μ leads to an eigenvalue λ = μ² of G_SOR(1). It follows in this case that ρ(G_SOR(1)) = β², in agreement with the theorem. Examining (6.1), we see that for all other ω, G_SOR has only nonzero eigenvalues, so we may proceed under the assumption λ ≠ 0.

Equation (6.3) can be rewritten as

    λ² + (2(ω − 1) − ω²μ²)λ + (ω − 1)² = 0.

The roots are

(6.4)    λ = (1 − ω) + ω²μ²/2 ± √[(1 − ω + ω²μ²/2)² − (ω − 1)²]
           = (1 − ω) + ω²μ²/2 ± ωμ√(1 − ω + ω²μ²/4).

First, the only way that we can get a complex root is if

(6.5)    ω − 1 > ω²μ²/4.

In this case, the roots are complex conjugates of each other, and a straightforward computation using the second equality of (6.4) shows that they have absolute value ω − 1.

We next consider the case when both roots are real. We fix μ < 1 and ω ∈ (0, 2). Then the solutions λ of (6.2) occur at the intersections of the two curves

    f_1(x) = (x + ω − 1)/ω    and    f_2(x) = ±μ√x.

Note that f_1 is a straight line with slope 1/ω which passes through the point (1, 1). Two such lines with different values of ω (ω = 1.25 and ω = 0.7) are illustrated in Figure 1, as well as the graph of f_2 with μ = 0.8. The line corresponding to ω = 0.7 intersects f_2 in two places, while the line corresponding to ω = 1.25 is tangent to f_2. Since f_1(1) = 1 for every value of ω and changing ω simply changes the slope, one can increase the slope until f_1 is tangent to f_2. We define ω_opt(μ) to be the value of ω for which this happens, and we let λ_opt(μ) be the value of x where the curves meet (in Figure 1, λ_opt(μ) = 0.25).

[Figure 1. f_1 corresponding to ω = 0.7 and ω = ω_opt(0.8) = 1.25, and f_2 with μ = 0.8.]

The curves are tangent when f_1′(x) = f_2′(x), i.e.,

(6.6)    1/ω_opt(μ) = μ/(2√(λ_opt(μ))),    or    λ_opt(μ) = μ²ω_opt(μ)²/4.

Substituting this into f_1(x) = f_2(x) gives

    μ²ω_opt(μ)²/4 + ω_opt(μ) − 1 = μ²ω_opt(μ)²/2,

or

(6.7)    μ²ω_opt(μ)²/4 = ω_opt(μ) − 1.

Comparing this with (6.5) shows that ω_opt(μ) is precisely the place where the roots switch from real to complex.

Note that if ω < ω_opt(μ), then one of the intersection points leads to a value of λ greater than λ_opt. In contrast, ω > ω_opt(μ) leads to complex λ of absolute value ω − 1 (corresponding to this μ). We get an optimal choice over all μ ∈ σ(G_J) by setting

(6.8)    ω_opt = ω_opt(β).

Indeed, for this choice, any eigenvalue μ of G_J with |μ| < β gives rise to complex λ, since ω_opt(β) > ω_opt(μ), and hence |λ| = ω_opt − 1. It follows from (6.6) and (6.7) that for μ = ±β, λ = λ_opt = ω_opt − 1. Thus, every eigenvalue of G_SOR with this choice of ω_opt has absolute value equal to the spectral radius ρ(G_SOR) = ω_opt − 1. Finally, if we use a larger ω, then the spectral radius increases to ω − 1.
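The switch between real and complex roots is easy to check numerically. The following sketch solves the quadratic form of (6.3) for μ = 0.8 (the value used in Figure 1) at the illustrative choices ω = 0.7, ω_opt(0.8) = 1.25, and ω = 1.5.

    import cmath, math

    # Roots of lambda^2 + (2(omega-1) - omega^2 mu^2) lambda + (omega-1)^2 = 0,
    # the quadratic form of (6.3).
    def roots(omega, mu):
        b = 2.0 * (omega - 1.0) - omega**2 * mu**2
        disc = cmath.sqrt(b * b - 4.0 * (omega - 1.0) ** 2)
        return (-b + disc) / 2.0, (-b - disc) / 2.0

    mu = 0.8
    w_opt = 2.0 / (1.0 + math.sqrt(1.0 - mu**2))   # omega_opt(0.8) = 1.25
    for omega in (0.7, w_opt, 1.5):
        lam1, lam2 = roots(omega, mu)
        print(f"omega={omega:.2f}  |lambda| = {abs(lam1):.4f}, {abs(lam2):.4f}")

One finds two real roots (the larger ≈ 0.8013 > λ_opt) for ω = 0.7, the double root 0.25 at tangency, and complex conjugate roots of modulus ω − 1 = 0.5 for ω = 1.5.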
Thus, we have shown that (6.8) gives the optimal choice of ω and leads to a reduction matrix with all eigenvalues of absolute value equal to ρ(G_SOR) = ω_opt − 1. The value of ω_opt appearing in the theorem results from computing the smaller solution of (6.7) (with μ = β) (check it!). The expression for ρ(G_SOR) in the theorem when ω ∈ [ω_opt, 2) is immediate from the above discussion. Finally, the expression for ω ∈ (0, ω_opt] is the larger of the two roots in (6.4) with μ = β. This completes the proof of the theorem.
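Theorem 1 itself can be verified numerically. The sketch below assumes a particular model matrix for illustration, the scaled 1-D Laplacian A = I + L + U (ones on the diagonal, −1/2 on the sub- and superdiagonals), for which β = cos(π/(n + 1)); it forms G_SOR directly from (6.1) and compares its computed spectral radius with the formula of the theorem at a few sample values of ω.

    import numpy as np

    n = 20
    A = np.eye(n) - 0.5 * (np.eye(n, k=-1) + np.eye(n, k=1))   # scaled 1-D Laplacian
    L, U = np.tril(A, -1), np.triu(A, 1)                       # A = I + L + U
    I = np.eye(n)

    beta = np.cos(np.pi / (n + 1))                             # rho(G_J) for this model
    w_opt = 2.0 / (1.0 + np.sqrt(1.0 - beta**2))

    # Spectral radius of G_SOR = (I/omega + L)^{-1} ((1/omega - 1) I - U), cf. (6.1).
    def rho_computed(w):
        G = np.linalg.solve(I / w + L, (1.0 / w - 1.0) * I - U)
        return max(abs(np.linalg.eigvals(G)))

    # The two branches of Theorem 1.
    def rho_theorem(w):
        if w >= w_opt:
            return w - 1.0
        return 1 - w + (w * beta)**2 / 2 + w * beta * np.sqrt(1 - w + (w * beta)**2 / 4)

    for w in (0.5, 1.0, w_opt, 1.9):
        print(f"omega={w:.4f}  computed={rho_computed(w):.6f}  theorem={rho_theorem(w):.6f}")

At ω = 1 the second branch reduces to β², the Gauss–Seidel rate found in the proof, and the minimum over ω occurs at ω_opt, where both branches equal ω_opt − 1.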