Geometric Multigrid Methods for the Helmholtz equations Ira Livshits1 1 Ball State University RICAM, Linz, 14, 16 November 2011 Ira Livshits (BSU) - November 2011, Linz 1 / 83 Multigrid Methods Aim: To understand geometric multigrid for the Helmholtz equations (difficulties and some solutions) Give a deeper understanding that led to the various choices made. Contents I I I I Basics of multigrid Basics of Local Fourier analysis Nonlinear multigrid (FAS) The Helmholtz equation Ira Livshits (BSU) - November 2011, Linz 2 / 83 Helmholtz equation Lu = −∆u(x) − k 2 u(x), x ∈ Ω ⊂ Rd with large wave numbers k: 2π/k dim(Ω) Ira Livshits (BSU) - November 2011, Linz 3 / 83 A Comment on Terminology The indefinite Helmholtz equation is Lu = (−∆ − k 2 )u = f , in contrast to (∆ − η)u = f , η > 0 often also called Helmholtz equation, or Helmholtz equation with the good sign, or positive definite Helmholtz equation. The subject today is the indefinite Helmholtz equation. Ira Livshits (BSU) - November 2011, Linz 4 / 83 The 2D discrete Poisson equation in the unit square We consider a stationary diffusion equation with diffusion in two coordinate directions: − ∂ 2 u(x, y ) ∂ 2 u(x, y ) − = f (x, y ) ∂x 2 ∂y 2 for (x, y ) ∈ Ω = (0, 1)2 with boundary conditions: u(x, y ) = g (x, y ) for (x, y ) ∈ ∂Ω The functions f ∈ C (Ω) and g ∈ C (∂Ω) are known. The “Laplace Operator” is commonly written as: ∆ := Ira Livshits (BSU) ∂2 ∂2 + ∂x 2 ∂y 2 - November 2011, Linz 5 / 83 With mesh size h = 1/n a grid G is defined as Ω = {(xi , yi )|(xi , yj ) = (ih, jh) ⊂ IR2 ; i, j, = 0, 1, . . . , n} In analogy to the 1D case we approximate the partial derivatives in y -direction. For the second derivative in y -direction we obtain: ui,j−1 − 2ui,j + ui,j+1 ∂2u ≈ 2 ∂y i,j hy2 The accuracy in x- and y -direction is O(hx2 ), O(hy2 ) or, with hx = hy = h, O(h2 ). Ira Livshits (BSU) - November 2011, Linz 6 / 83 Gauss-Seidel and Jacobi for the Poisson equation The discrete Laplace operator reads: 1 ∆h uh = [ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − 4ui,j ] or: h2 1 ∆h uh (x, y ) = [uh (x − h, y ) + uh (x + h, y ) + uh (x, y − h) + h2 uh (x, y + h) − 4uh (x, y )] Jacobi iteration reads uhm+1 (xi , yj ) = 1h 2 h fh (xi , yj ) + uhm (xi − h, yj ) + uhm (xi + h, yj ) 4 i +uhm (xi , yj − h) + uhm (xi , yj + h) , (with (xi , yj ) ∈ Ωh ) Gauss-Seidel iteration reads 1h 2 uhm+1 (xi , yj ) = h fh (xi , yj ) + uhm+1 (xi − h, yj ) + uhm (xi + h, yj ) 4 i +uhm+1 (xi , yj − h) + uhm (xi , yj + h) , Ira Livshits (BSU) - November 2011, Linz 7 / 83 Linear systems of equations in matrix form For the solution of a linear system of equations Au = b, Iterative method: u m+1 = Qu m + s, (m = 0, 1, 2, . . .), so that the system u = Qu + s is equivalent to the original problem. Matrix Q is usually not computed explicitly ! Ira Livshits (BSU) - November 2011, Linz 8 / 83 From the matrix equation Au = b we construct a splitting A = B + (A − B), so that B is “easily invertible”. With the help of a general nonsingular N × N matrix B we obtain such iteration procedures from the equation Bu + (A − B)u = b. If we put Bu m+1 + (A − B)u m = b , or, rewritten for u m+1 ; u m+1 = u m − B −1 (Au m − b) = (I − B −1 A)u m + B −1 b . We will see, that it is important, to have the eigenvalues of Q = I − B −1 A possibly small. This is the more likely, the better B resembles A. Ira Livshits (BSU) - November 2011, Linz 9 / 83 Linear systems of equations in matrix form Definition: An iterative process u m+1 = Qu m + s is called consistent, if matrix I − Q is not singular and if (I − Q)−1 s = A−1 b. If the iteration is consistent, the equation (I − Q)u = s has exactly one solution u m→∞ ≡ u. Definition: An iteration is called convergent, if the sequence u 0 , u 1 , u 2 , . . . converges to a limit, independent of u 0 . Ira Livshits (BSU) - November 2011, Linz 10 / 83 The spectral radius Definition: The spectral radius ρ(Q) of a (N, N) matrix Q is ρ(Q) := max λ : λ eigenvalue of Q Lemma The spectral radius is the infimum over all (vector norm related) matrix norms: ρ(Q) = inf k Qk Theorem: The iterative method u m+1 = Qu m + s m = 0, 1, 2, . . . yields a convergent sequence {u m } for a general u 0 , s (“the method converges”), if ρ(Q) < 1 . In the case ρ(Q) < 1, matrix I − Q is regular, there is for each s exactly one fix point of the iterative scheme. Ira Livshits (BSU) - November 2011, Linz 11 / 83 Error reduction General splitting of A (another way of writing): Bu m+1 + (A − B)u m = b = Au ∗ = Bu ∗ + (A − B)u ∗ or: B(u m+1 − u ∗ ) = (B − A)(u m − u ∗ ), So, (u m+1 − u ∗ ) = Q(u m − u ∗ ). By induction, we can obtain: (u m − u ∗ ) = Q m (u 0 − u ∗ ). Assume ρ(Q) < 1. Ira Livshits (BSU) - November 2011, Linz 12 / 83 Efficient solution methods for discrete equations – Hierarchy One level methods, like Jacobi, Gauss-Seidel iterative methods are far from optimal: O(N α ), α > 1. O(N) complexity requires hierarchical approaches. Multigrid is the classical hierarchical approach I I I I Approximation on different scales Compute only those “parts” of a solution on fine scales that really require the fine resolution. Compute/correct the remaining parts on coarser scales Gives O(N) complexity Advantages increase with increasing problem size The idea behind multigrid is very general Ira Livshits (BSU) - November 2011, Linz 13 / 83 Relaxation of ∆u = 0 1 0.8 0.6 0.4 0.2 0.7 0.7 0.6 0.6 0.5 0.5 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0 1 0.1 0 1 0.8 1 0.6 0.8 0.6 0.4 0.4 0.2 0.2 0 0 1 0.8 1 0.6 0.8 0.6 0.4 0.4 0.2 0.2 0 0 0 0.8 1 0.6 0.8 0.6 0.4 0.4 0.2 0.2 0 0 (a) Random initial error, (b) Error after 10 GS iterations, (c) Error after 10 more GS iterations. Ira Livshits (BSU) - November 2011, Linz 14 / 83 Convergence factor vs smoothing factor Convergence factor (GS, Jacobi, wJacobi, SOR) is ρ = 1 − O(h2 ) Smoothing factor is a convergence factor for oscillatory components, and it seems to be good. Remaining error components are smooth. Ira Livshits (BSU) - November 2011, Linz 15 / 83 Motivation to use coarse grids 0.7 0.7 0.6 0.6 0.5 0.5 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0 1 0 1 0.8 0.8 1 0.6 0.6 0.4 0.4 0.2 0.2 0 0.8 0.6 0.4 0.4 0.2 1 0.6 0.8 0.2 0 0 0 The remaining after relaxation, aka smoothing, error has a good coarse grid representation. Ira Livshits (BSU) - November 2011, Linz 16 / 83 Detour to local Fourier analysis with a hint of trouble brewing for the Helmholtz The basic idea of any multigrid algorithm is to annihilate the high frequency error components efficiently on a fine grid by relaxation (smoothing) while the low frequencies are approximated from coarser grids. Smoothing analysis is used to analyze quantitatively this reduction of the high frequency error components on the fine grid. Assuming the ideal situation in which I I I the relaxation does not affect the low frequencies all low frequencies are approximated more or less exactly on the coarse grid there is no interaction between high and low frequency error components it already leads to an “estimate” or prediction of the 2–level convergence factor. Ira Livshits (BSU) - November 2011, Linz 17 / 83 Assumptions For a discretized PDE: Lh Uh = Fh With the basic assumptions of any local Fourier analysis: – – – – linearize operator Lh and/or freeze coefficients to local values neglect boundaries and boundary conditions consider the frozen, linearized operator on an infinite grid Gh apply the linear constant coefficient local Fourier analysis Ira Livshits (BSU) - November 2011, Linz 18 / 83 Components, the 1D case Infinite uniform mesh, meshsize h Gh = {k · h : k ∈ Z} 1D Fourier components: e iωx = cos ωx + i sin ωx On Gh only Fourier components e iθx/h with θ ∈ (−π, π] are “visible”. Definition: A component e iθx/h is “visible” on Gh if there is no frequency θ0 with |θ0 | < |θ| such that e iθ0 x/h = e iθx/h for all x ∈ Gh . A visible component e iθx/h does not coincide with any lower frequency on grid Gh . θ “frequency”, Ira Livshits (BSU) | 2πh θ | “period” - November 2011, Linz 19 / 83 Components, the 2D case Infinite uniform mesh, meshsize h = (hx , hy ) in x– and y– direction: Gh = {(khx , lhy ) : k, l ∈ Z} θ = (θ1 , θ2 ) “frequency” On Gh , only Fourier components e iθx/h := e iθ1 x/hx · e iθ2 y /hy with |θ| := max (|θ1 |, |θ2 |) ≤ π are “visible”. For higher frequencies |θ| > π, the Fourier components coincide with a lower frequency ∈ (−π, π]. For higher dimensions: analogously Ira Livshits (BSU) - November 2011, Linz 20 / 83 Difference operators Let Gh be an infinite uniform mesh, meshsize h ; uh : grid function on Gh ; I = {−ν, ..., 0, ..., ν}: set of indices. Difference operators Lh : 2D x = (x, y ) ∈ Gh Lh uh (x) = XX aµ1 µ2 uh (x + µ1 hx , y + µ2 hy ) µ1 ∈I µ2 ∈I Ira Livshits (BSU) - November 2011, Linz 21 / 83 The stencil notation, examples finite differences for ∆u, I = {−1, 0, 1} a−11 a01 a11 a00 a10 uh (x) Lh uh (x) = a−10 a−1−1 a0−1 a1−1 2D = 1 1 h2 | 1 − 4 1 {z 1 uh (x) } ∆h or for the biharmonic operator ∆2 , 1 ∆h ∆h uh (x) = 4 h 5 − point stencil I = {−2, −1, 0, 1, 2} 2 −8 2 1 1 −8 20 −8 1 2 −8 2 1 uh (x) 13-point stencil Here: x = (x, y ) ∈ Gh , hx = hy = h. Ira Livshits (BSU) - November 2011, Linz 22 / 83 Motivation to consider Biharmonic equation is Kaczmarz relaxation Kaczmarz relaxation (equivalent to GS applied to AA∗ ) Implementation: Consider i th equation: Ai x = bi and a current approximation x̃, here Ai is the i th row of A. A new approximation is given by x̃ ← x̃ + ri Ai , bi − Ai x̃ . A∗i Ai Kaczmarz relaxation always converges (indeed, AAt is always SPD) but it is often slow (much slower than the GS on original matrix) and thus should be used only as the last reserve (µ = 0.8 for Laplace vs GS’s µ = 0.25). where the normalized residual is computed as ri = Ira Livshits (BSU) - November 2011, Linz 23 / 83 Difference operators applied to Fourier components On an infinite grid Gh , the Fourier components e iθx/h are eigenfunctions of any scalar difference operator Lh with constant coefficients. 1D: Lh e iθx/h X = aµ e iθµ ·e iθx/h µ∈I | {z } e Lh (θ) 2D: Lh e iθx/h = XX aµ1 µ2 e iθ1 µ1 e iθ2 µ2 ·e iθx/h µ1 ∈I µ2 ∈I | {z } e Lh (θ) e Lh (θ) is the eigenvalue, also called the Fourier symbol of Lh . Ira Livshits (BSU) - November 2011, Linz 24 / 83 Examples 1D: Lh = 1 h2 [1 −2 =⇒ Lh e iθx/h = 1] 1 −iθ 2 e − 2 + e iθ · e iθx/h = 2 (cos θ − 1) ·e iθx/h 2 h |h {z } e Lh (θ) 2D: θ = (θ1 , θ2 ) 1 1 Lh = ∆h = 2 h − 1 4 1 1 2 =⇒ e Lh (θ) = 2 (cos θ1 + cos θ2 − 2) h or for the biharmonic operator Lh = ∆h ∆h (13-point stencil): 4 e Lh (θ) = 4 (cos θ1 + cos θ2 − 2)2 h Ira Livshits (BSU) - November 2011, Linz 25 / 83 Smoothing analysis; a general class of relaxations: Consider a linear problem Lh Uh (x) = Fh on grid Gh Then: Many (not all !) relaxation schemes for this problem are based on an operator splitting Lh = Ah + Bh where Ah and Bh are again difference operators. Then: A relaxation sweep starting from the initial approximation uh (“old values”) produces a new approximation u h (“new values”) by solving Ah uh (x) + Bh u h (x) = Fh (x) at each gridpoint x ∈ Gh . Ira Livshits (BSU) - November 2011, Linz 26 / 83 Smoothing analysis; a general class of relaxations: What does that mean for the errors before and after the relaxation sweep ? Let eh and e h Then: = = Uh − uh Uh − u h be the error before after the sweep. Ah eh (x) + Bh e h (x) = 0 at each gridpoint x ∈ Gh . How do these relaxations act on Fourier components ? Let eh = Ae iθx/h : error before relaxation; then e h (x) = Ae iθx/h error after relaxation. A, A ∈ R are the error amplitudes before and after relaxation. Ira Livshits (BSU) - November 2011, Linz 27 / 83 Smoothing analysis; a general class of relaxations: From Ah eh (x) + Bh e h (x) = 0 follows e h (θ)A + B eh (θ)A = 0 A and therefore A = − e h (θ) A eh (θ) B ! A e h (θ), B eh (θ) ∈ C are the Fourier symbols of Ah , Bh . where A e µ(θ) := Aeh (θ) Bh (θ) is the amplification factor of the component θ. Ira Livshits (BSU) - November 2011, Linz 28 / 83 1D Example:Gauss-Seidel relaxation Corresponding splitting of Lh : 1 [1 − 2 2 h {z | Lh 1 1] = 2 [0 0 h } | {z Ah 1 1] + 2 [1 − 2 h } | {z Bh 0] } Then: at gridpoint x ∈ Gh for the errors Ae iθ(x−h)/h − 2Ae iθx/h + Ae iθ(x+h)/h = 0 e −iθ − 2 A | {z } + eh (θ) B e iθ A = 0. | {z } e h (θ) A This yields: A= Ira Livshits (BSU) - −e iθ A e −iθ −2 November 2011, Linz 29 / 83 Questions What is smoothing ? How can the smoothing property of a given relaxation scheme be measured quantitatively ? How does the smoothing property (or its quantitative measure) influence the performance of multigrid cycling ? Can the performance of a multigrid cycle be predicted in terms of the smoothing measures ? Note: In a multigrid cycle, relaxations are used to “smooth” the highly oscillating error components which cannot be approximated on coarser grids. Therefore, smoothing roughly means “convergence of high frequency error components”. Ira Livshits (BSU) - November 2011, Linz 30 / 83 What are high and low frequencies ? Let Gh be a fine grid and GH a coarse grid. Then: A Fourier component e iθx/h on grid Gh is called a high frequency (with respect to GH ) if its restriction (injection) to the coarse grid GH is not “visible” there. Otherwise it is called a low frequency. Take one–dimensional standard coarsening: Gh = {kh : k ∈ Z}, GH := G2h = {2kh : k ∈ Z} Let |θ| ≤ π; then by injection from Gh to G2h : inject e iθx/h =⇒ e i2θx/2h This component e i2θx/2h is “visible” on G2h only if |2θ| ≤ π, i.e. |θ| ≤ Ira Livshits (BSU) - π 2 November 2011, Linz 31 / 83 =⇒ The high frequencies on Gh (with respect to G2h ) are those with π ≥ |θ| ≥ π2 . Fourier components in 2D: e iθx/h = e iθ1 x/h e iθ2 y /h , θ = (θ1 , θ2 ) with |θ| := max(|θ1 |, |θ2 |) ≤ π Standard coarsening GH = Gh G2h {(kh, lh) : k, l ∈ Z} {(2kh, 2lh) : k, l ∈ Z} = = π =⇒ High frequencies: 2 ≤ |θ| ≤ π Low frequencies: |θ| < π2 Note: The definition of high and low frequencies depends on the coarsening. Ira Livshits (BSU) - November 2011, Linz 32 / 83 What is smoothing ? “Smoothing” stands for convergence of high frequency error components which cannot be approximated from the coarse grids in a multigrid cycle. For a relaxation scheme with amplification factors µ(θ) of Fourier components e iθx/h the smoothing rate µ is defined by: µ := max{|µ(θ)| : θ = high frequencies} Note: • The above definition of µ depends on the coarsening. • The amplification or reduction factors µ(θ) are also called “smoothing factors”. Ira Livshits (BSU) - November 2011, Linz 33 / 83 Example: Gauss–Seidel relaxation for Poisson’s equation: ∆h Uh = 1 1 h2 1 −4 1 1 Uh = Fh Relaxation in error terms: 0 1 1 1 e h (x) + 0 1 −4 0 0 h2 h2 1 0 | {z } | {z Bh Symbols: Ah 1 eh (x) = 0 } θ = (θ1 , θ2 ) Bh e iθx/h = 1 −iθ1 e + e −iθ2 − 4 ·e iθx/h 2 h | {z } eh (θ) B Ah e iθx/h = 1 iθ1 e + e iθ2 ·e iθx/h 2 |h {z } e h (θ) A Ira Livshits (BSU) - November 2011, Linz 34 / 83 =⇒ Amplification factors: µ(θ) = − e h (θ) A e iθ1 + e iθ2 = − −iθ1 eh (θ) e + e −iθ2 − 4 B =⇒ Smoothing rate: µ = max{|µ(θ)| : π ≤ |θ| ≤ π} ≈ 0.5 2 =⇒ The high frequency error components are reduced by a factor of (at least) µ per relaxation sweep. 0.46 0.44 0.42 0.4 0.38 0.36 0.34 0.32 0.3 0.28 0.26 0 40 10 20 20 30 40 0 Note: For low frequencies, the reduction per relaxation sweep is much worse: |µ(θ)| −→ 1 Ira Livshits (BSU) - if θ −→ 0 November 2011, Linz 35 / 83 Now to Helmholtz : the stencils and the relaxations 1 [1 − 2 + (kh)2 1] h2 1D finite differences for Lu = u 00 + k 2 u, 2D finite differences for Lu = ∆u + k 2 u, 1 −4 + (kh)2 1 5 − point stencil = h12 1 1 Lh Lh = Here: x = (x, y ) ∈ Gh , hx = hy = h. Ira Livshits (BSU) - November 2011, Linz 36 / 83 Fourier symbols 1D: Lh = 1 h2 [1 − 2 + k 2 h2 =⇒ Lh e iθx/h = 1] 1 (2 cos θ − 2 + k 2 h2 ) ·e iθx/h 2 h {z } | e Lh (θ) 2D: θ = (θ1 , θ2 ) =⇒ Lh e iθx/h = 1 (2 cos θ1 + 2 cos(θ2 ) − 4 + k 2 h2 ) ·e iθx/h 2 h | {z } e Lh (θ) or for the biharmonic operator (think Kaczmarz!) 1 e Lh (θ) = 4 (2 cos θ1 + 2 cos θ2 − 4 + k 2 h2 )2 h Ira Livshits (BSU) - November 2011, Linz 37 / 83 1D Example: Gauss-Seidel relaxation of 1 [1 − 2 + k 2 h2 2 h | {z 1] Uh (x) = Fh (x) } Lh Relaxation: uh =⇒ u h u h (x − h) − (2 − k 2 h2 )u h (x) + uh (x + h) = Fh (x) h2 Corresponding splitting of Lh : 1 [1 − 2 + k 2 h2 2 h | {z Lh 1 1] = 2 [0 0 } |h {z Ah 1 1] + 2 [1 − 2 + k 2 h2 } |h {z Bh 0] } Then: at grid point x ∈ Gh for the errors Ae iθ(x−h)/h − (2 − k 2 h2 )Ae iθx/h + Ae iθ(x+h)/h = 0 Ira Livshits (BSU) - November 2011, Linz 38 / 83 e −iθ − 2 + k 2 h2 A | {z } + eh (θ) B e iθ A = 0. | {z } e h (θ) A This yields: A= −e iθ A e −iθ −(2−k 2 h2 ) What is the difference compared to Laplace ? I I I I What happens when θ = 0 ? What happens when θ is between π/2 and π? What happens when θ ≈ kh ? Happens when kh grows, i.e., k is fixed and h is coarsened ? Ira Livshits (BSU) - November 2011, Linz 39 / 83 Answers If kh 1 i.e., exp(±ikx) are smooth: I I I other smooth components θ ≈ 0 remain almost unchanged by relaxation; For |θ| = ±kh there is a slight divergence; Oscillatory components π/2 < |θ| ≤ π converge almost as fast as in Laplace (convergence does depend on kh); If kh = O(1) divergence worsens! Results for kh = 0.1 and kh = 0.5 1.6 1.4 1.2 1 0.8 0.6 0.4 0.2 0 Ira Livshits 0.5 (BSU) 1 1.5 2 2.5 3 3.5 - November 2011, Linz 40 / 83 More examples: 2D Helmholtz equation: Gauss-Seidel Relaxation in error terms: 0 1 1 2 2 e h (x) + 1 0 h 0 1 −4 + k 0 h2 h2 1 0 | {z } {z | Bh Symbols: Ah 1 eh (x) = 0 } θ = (θ1 , θ2 ) Bh e iθx/h = 1 −iθ1 e + e −iθ2 − 4 + k 2 h2 ·e iθx/h 2 |h {z } eh (θ) B Ah e iθx/h = 1 iθ1 e + e iθ2 ·e iθx/h 2 h | {z } e h (θ) A Ira Livshits (BSU) - November 2011, Linz 41 / 83 =⇒ Amplification factors: µ(θ) = − e h (θ) e iθ1 + e iθ2 A = − −iθ1 eh (θ) e + e −iθ2 − 4 + k 2 h2 B =⇒ Smoothing rate (what is smooth here?): µ ≈ µL for kh 1; µ deteriorates (grows) as kh grows; I I 1.4 1.2 1 2 0.8 1.5 0.6 1 0.4 0.5 0.2 0 0 0 0 150 100 50 50 50 100 100 150 Ira Livshits 0 20 (BSU) 40 60 80 100 120 140 150 - 0 November 2011, Linz 42 / 83 Questions What is smooth ? Is θ = 0 smooth ? What is the smoothing factor ? What do we want from relaxation ? Ira Livshits (BSU) - November 2011, Linz 43 / 83 Kaczmarz relaxation c = −4 + k 2 h2 1 ∆h ∆h uh (x) = 4 h 1 h4 1 0 2c 2 | 0 0 4 + c2 2c 1 {z Bh Ira Livshits (BSU) 1 1 2c 4 + c2 2c 1 2 2c 2 0 0 2 1 0 e h (x) + 4 0 h } | - 2 2c 2 1 2 2c 0 0 0 0 0 {z Ah 1 uh (x) 2 2c 0 1 eh (x) = 0 } November 2011, Linz 44 / 83 Symbols and amplification Symbol Bh e iθx/h = 1 −2iθ 1 + e −2∗iθ2 ) + 2(e −iθ1 −iθ2 + e −iθ1 +iθ2 ) + 2c(e −iθ1 + e −iθ2 ) + (4 + c 2 ) ·e iθx/h (e 4 h | {z } e (θ) B h Symbol Ah e iθx/h = 1 2iθ 1 + e 2iθ2 ) + 2(e iθ1 −iθ2 + e iθ1 +iθ2 ) + 2c(e iθ1 + e iθ2 ) ·e iθx/h (e 4 h | {z } e (θ) A h Ira Livshits (BSU) - November 2011, Linz 45 / 83 Good news: µ(θ) ≤ 1 for any θ Q: But for which is works the best ? 0.75 1 0.7 0.9 0.65 0.8 0.6 0.7 0.55 0.6 0.5 0.5 0.45 0.4 0.3 0.4 0 0.2 50 0.1 0 100 150 0 50 100 150 100 50 150 50 100 150 0 Results for kh = 1 (high frequencies and overall convergence). Ira Livshits (BSU) - November 2011, Linz 46 / 83 More on convergence Consider ordered eigen pairs of A: (λj , vj ): Avj = λj vj such that |λ1 | ≤ |λ2 | · · · ≤ |λN |. P m+1 P m For errors representation e m+1 = αn vn and e m = αn vn , Due to linearity (of everything), X X αnm+1 vn = αnm Qvn and thus overall convergence is defined by m+1 α max n m n αn The smoothing factor is a convergence factor for αn/2 , . . . , αn , i.e., for the high energy eigenfunctions vn/2 , . . . vn Remaining non reduced low energy eigenfunctions v1 , . . . , vn/2−1 are smooth. Ira Livshits (BSU) - November 2011, Linz 47 / 83 Reason for using coarse grids Coarse grids can be used to compute an improved initial guess for the fine-grid relaxation. This is advantageous because: Relaxation on the coarse-grid is much cheaper (1/2 as many points in 1D, 1/4in 2D, 1/8 in 3D); Relaxation on the coarse grid has a marginally better convergence rate, for example 1 − O(4h2 ) instead of 1 − O(h2 ); Smooth eigenfunctions on the finest grid become more oscillatory on the coarse grid. Ira Livshits (BSU) - November 2011, Linz 48 / 83 Multigrid components Relaxation (smoother): efficiently dumps high-energy (eigenfunctions with large eigenvalues) error components; leaves unchanged low eigenmodes; A coarse-grid system operator: accurately resolve the low eigenmodes; Restriction operator (averaging): transfers fine-grid information (residual) to coarse grid; Prolongation operator (interpolation): transfers coarse-grid correction back to the fine grid; has to be accurate for low eigenmodes. Ira Livshits (BSU) - November 2011, Linz 49 / 83 Nice example: ∆u = f Relaxation: Gauss-Seidel. A coarse-grid operator: discretization of the differential Laplace operator on coarse scale (2h, 4h, . . . ) Restriction operator: full weighting of the residual (annihilates remaining oscillatory components), transferring only smooth ones to coarser grids; Interpolation: polynomial (linear) which is accurate for smooth components. Works great because the lowest eigenmodes are physically smooth. Ira Livshits (BSU) - November 2011, Linz 50 / 83 One picture is worth a thousand words Ira Livshits (BSU) - November 2011, Linz 51 / 83 Back to 2d Helmholtz : discretization error Discrete wave number vs exact wave number: k 2 h2 , 24 ≤ γ ≤ 48. |k h | ≈ k 1 + γ The total (accumulated) phase error is E (kd, kh) = kd k 2 h2 . 2πγ For the discrete solutions to be accurate approximation to the differential solutions need E (kd, kh) 1. For fixed k, d this means small h and large problem size, N. Ira Livshits (BSU) - November 2011, Linz 52 / 83 Bottlenecks for multigrid Dominant phase discretization error; Quality of standard coarse-grid and prolongation operators deteriorates on coarse grids; Poor multigrid relaxation - standard ”fast” relaxation scheme diverge; Ira Livshits (BSU) - November 2011, Linz 53 / 83 Lowest eigenmodes 0.04 0.05 0.02 0 0 −0.02 −0.04 1 −0.05 1 0.8 0.8 1 0.6 0.6 0.4 0.4 0.2 0.2 0 0.8 0.6 0.4 0.4 0.2 1 0.6 0.8 0.2 0 0 0 Lowest eigenmodes for k = 6π and k = 18π. Ira Livshits (BSU) - November 2011, Linz 54 / 83 More bottlenecks for multigrid Oscillatory low eigenmodes; There are many of them; They are different from each other; On sufficiently coarse grids, there is no single prolongation or coarse grid operator of a reasonable sparsity that works for all such eigenmodes. Ira Livshits (BSU) - November 2011, Linz 55 / 83 One-dimensional Helmholtz equation: slow to converge components, aka lowest eigenmodes 1 0.8 2 2 2 1.5 1.5 1.5 0.6 1 1 1 0.4 0.5 0.5 0.5 0.2 0 0 0 0 −0.5 −0.2 −0.5 −0.5 −1 −0.4 −1 −1.5 −0.6 −1.5 −0.8 −1 0 1 2 3 4 5 6 −2 7 0 −1 −1.5 −2 1 2 3 4 5 6 −2.5 7 0 1 2 3 4 5 6 −2 7 0 1 2 3 4 5 6 7 (a) a random initial error; (b)-(d) examples of remaining errors, obtained by applying three multigrid V-cycles to the homogenous Helmholtz equation (periodic boundary conditions) Ira Livshits (BSU) - November 2011, Linz 56 / 83 Assumption on the error The error unreduced by standard multigrid X X X e(x) = exp(iwx) = αw exp(iwx) + βw exp(−iwx) |w |≈k w ≈k w ≈k where the value of w is determined by the relaxation history. One can drive it to be (1 − γ1 )k ≤ w ≤ (1 + γ2 )k, where typical are values γj ≈ 0.3 Ira Livshits (BSU) - November 2011, Linz 57 / 83 Dual representation If we agreed on the previous page, we can also agree that where e1 (x) = e(x) = e1 (x)exp(ikx) + e2 (x)exp(−ikx), P w ≈k αw exp(i(w − k)x) and e2 (x) = w ≈k βw exp(−i(w − k)x). P Obviously, envelope functions e1 and e2 are smooth (compared to exp(±ikx)). Indeed the highest frequencies are |w − k| ≈ γj k Remark : Representation is not unique - one has to work to guarantee the statement above. The idea here is simple Lets try approximate smooth e1 (x) and e2 (x) instead of oscillatory e(x) (which we cannot do well anyway) and hope for the best. Ira Livshits (BSU) - November 2011, Linz 58 / 83 Defect correction If we assume the error in the form e(x) = e1 (x)exp(ikx) + e2 (x)exp(−ikx), then naturally r (x) = Le(x) = L(e1 (x)exp(ikx) + e2 (x)exp(−ikx)) = L(e1 (x)exp(ikx)) + L(e2 (x)exp(−ikx)) = exp(ikx)L1 e1 (x) + exp(−ikx)L2 e2 (x) Ira Livshits (BSU) - November 2011, Linz 59 / 83 Differential operators and residual representation Two conclusions from the previous page: Note that we use the differential Helmholtz operators and its kernel components, resulting in differential equations for the envelope functions e1 and e2 , L1 and L2 : L1 u = uxx + 2ikux and L2 u = uxx − 2ikux . Trick! Operator L has two kernel components (exp(±ikx)) and so do Lj (1 and exp(∓2ikx)) But we consider grids where only one (a constant) is visible! Also, given the form of ej and Lj , we can claim that r (x) = r1 (x)exp(ikx) + r2 (x)exp(ikx) where rj are smooth (as smooth as ej in fact). Ira Livshits (BSU) - November 2011, Linz 60 / 83 From Waves to Rays To reduce the irreducible wave error e(x): Le = r we approximate two ray errors L1 e1 = r1 and L2 e2 = r2 and then reconstruct back to the wave e(x) = e1 (x)exp(ikx) + e2 (x)exp(−ikx) Where is the gain: To compute the ray functions one should start on the grid with kH ≈ π (very coarse) To accelerate, one may use standard multigrid on each Lj ej = rj if wishes to go coarser. Ira Livshits (BSU) - November 2011, Linz 61 / 83 Residual separation and why on kH ≈ π Consider the wave (Helmholtz) residual with r1 and r2 are smooth r (x) = r1 (x)exp(ikx) + r2 (x)exp(−ikx). Clearly, such representation is not unique and some others allow r (x) = r̃1 (x)exp(ikx) + r̃2 (x)exp(−ikx) with oscillatory r̃j . We want to approximate smooth ray residuals (on coarse grids). Ira Livshits (BSU) - November 2011, Linz 62 / 83 Separation (implementation considered on grids) Compute fine-grid wave residual r h = f h − Lh u h (h is fine enough for small phase error kd(kh)2 1) Transfer it to the grid with kHs ≈ π/2 using full weighting (annihilates physically oscillatory error components), resulting in r Hs = r1Hs exp(ikx) + r2Hs exp(−ikx) To approximate r1 : Multiply by exp(−ikx) exp(−ikx)r Hs = r1Hs + r2Hs exp(−2ikx) Note that first term is smooth and the second one is extremely oscillatory 2kH s ≈ π. Applying one more full weighting results in the desired ray residual s s s IH2H (exp(−ikx)r Hs ) = IH2H (r1Hs ) + IH2H (r2Hs exp(−2ikx)) ≈ r12Hs s s s Ira Livshits (BSU) - November 2011, Linz 63 / 83 Ray operator The discrete operators for the ray functions are derived from the differential operators: L1 u = u 00 + 2iku 0 = f considered on the grid with kH ≈ π. The choice of discretization is defined by the principality of terms: 1 The second-order term : 2 H 2k The first-order term: H If kH ≥ π the first-order term is principal and should be accurately discretized; Thus using staggered grids. LH j Ira Livshits = (BSU) 1 H2 2ik 1 − − 2 H H - 2ik 1 − 2 H H 1 H2 November 2011, Linz 64 / 83 Wave-ray cycle Wave cycle: I I I I I I I I Reduces all but problematic characteristic components; Produces residual for the ray cycles (on one of the fine grids with small phase error); Finest grid: (kd)(k 2 h2 ) 1 Coarsest grid: kH > π (Used to effectively wipe out constants and alikes, stencil [1 (k 2 H 2 − 2) 1] Correction Scheme; GS one pre- and one post relaxation on fine grid and on the coarsest grid; Kaczmarz four pre- and one post relaxation sweeps everywhere else; Linear interpolation and full weighting; Two ray cycles to reduce the characteristic components. Ira Livshits (BSU) - November 2011, Linz 65 / 83 Ray cycles Staggered grid, short central differences for ux ; Marching scheme in the propagation direction; CS or FAS multigrid scheme depending on interpretation; Natural introduction of boundary conditions in terms of rays. Ira Livshits (BSU) - November 2011, Linz 66 / 83 Before we proceed, detour to FAS Full Approximation Scheme Allows coarse grid approximation of the solution u H rather than correction eH ; Considers full equation LH u H = f H residual equation LH e H = r h ; Developed for solving non linear equations. Ira Livshits (BSU) - November 2011, Linz 67 / 83 FAS details Consider ũ h a current fine-grid solution and r h = f h − Lh ũ h is its corresponding residual; CS: I I I Coarse grid variable correction: v H Coarse grid system: LH v H = R H = IhH r h h Coarse grid correction: ũNEW = ũ h + IHh v H FAS: I I I I Coarse grid variable solution: ũ H = ĨhH ũ h + v H Coarse grid system: LH u H = f H = LH (ĨhH u h ) + R H h H Coarse grid correction: ũNEW = u h + IHh (ũNEW − ĨhH ũ h ) h H If LH resolves all solution components then ũNEW = IHh (ũNEW ) Ira Livshits (BSU) - November 2011, Linz 68 / 83 τ -correction Consider the c.g. equation as LH u H = f H + τhH where f H = IhH f h , τhH = LH (ĨhH u h ) − IhH (Lh u h ) Term τhH is a fine-to-coarse defect correction. Nonlinear equations: The correction equation is Lh (ũ h + v h ) − Lh ũ h = r h . On coarse grid its reads: LH (ĨhH ũ h + v H ) − LH (ĨhH ũ h ) = IhH r h which is exactly the top equation. Ira Livshits (BSU) - November 2011, Linz 69 / 83 Rays, directions and BC Ira Livshits (BSU) - November 2011, Linz 70 / 83 Back to Boundary conditions Assumptions: Away from the source (near the boundary) the solution has a pure ray character so can be described using ray operators; The typical boundary conditions are the radiation BC, sometimes combined with the Dirichlet boundary conditions. The rays can enter the computational domain from the infinity and leave the computational domain freely unless there is an obstacle, media change, etc. The entering part is to be defined by the Direchlet boundary conditions at the entrance (for each ray); the exiting part is controlled by the Sommerfield boundary conditions at the exit (the latter can be replaced by an interior equation appropriately discretized). If no rays (waves) enter from infinity, the Dirichlet boundary conditions are homogenous. Ira Livshits (BSU) - November 2011, Linz 71 / 83 Full ray formulation (for two-sided Sommerfield BC) Exiting Sommerfield BC: At x = 0: u 0 + 2iku = 0 and at x = d: u 0 − 2iku = 0 Positively propagating ray (corresponding to exp(ikx)) L1 a(x) = a100 (x) + 2ika10 (x) = f1 (x), a1 (0) = a1,0 , a10 (d) = 0 Negatively propagating ray (corresponding to exp(−ikx)) L2 a(x) = a200 (x) + 2ika20 (x) = f2 (x), a2 (d) = a2,0 , a10 (0) = 0 Two systems are solved independently; If there is a Dirichlet BC on one side, the two has two be combined (incident wave defines the amplitude of the reflected wave); Ira Livshits (BSU) - November 2011, Linz 72 / 83 Wave-ray interaction Because of BC the ray cycles are implemented in FAS (solution representation); At the interior, regular FAS correction: H h H − ã2H )); ũNEW = ũ h + exp(ikx)(IHh (ã1,NEW − ã1H )) + exp(−ikx)(IHh (ã2,NEW At the boundary, direct replacement h H H ũNEW = exp(ikx)(IHh ã1,NEW ) + exp(−ikx)(IHh ã2,NEW ); Ira Livshits (BSU) - November 2011, Linz 73 / 83 Conclusions on 1d The solver is a combination of the wave and the ray approaches; The complexity is similar to one for Laplace, it is O(N). Worked for jumping and slightly varying wave numbers; Ira Livshits (BSU) - November 2011, Linz 74 / 83 Wave-Ray algorithm Geometric solver for the 2D Helmholtz equation with constant wave numbers Brandt, Livshits: Wave-ray multigrid method solving standing wave equations, 1997 Livshits, Brandt: Accuracy Properties of the Wave-Ray Multigrid Algorithm for Helmholtz Equations, 2006 Special feature: Separate treatment of oscillatory kernel e ikx , |k| = k on coarse grids. Ira Livshits (BSU) - November 2011, Linz 75 / 83 Directions of propagation e ikx Ira Livshits (BSU) - November 2011, Linz 76 / 83 Discretization of the propagation directions: few chosen direction k` Ira Livshits (BSU) - November 2011, Linz 77 / 83 Wave-ray approach Components not changed (at best) by the standard multigrid: exp(iwx), |w| ≈ k, w, x ∈ R 2 Ray representation for such components u(x) and residual r (x) X X u` (x)exp(ik` x) and r (x) = r` (x)exp(ik` x) u(x) = ` ` Instead of oscillatory u(x) compute smooth u` (x); Differential operators for u` (x) are obtained directly from L. Ira Livshits (BSU) - November 2011, Linz 78 / 83 Phase error: Wave vs Rays Waves (as discussed): E (kd, kh) ≈ kd k 2 h2 , 24 ≤ γ ≤ 48 2πγ β , β ≈ 0.12 2πL2 This is assuming very aggressive coarsening with Ln = 2n+2 : Rays: E (kd, L) ≈ kd I I I Propagation direction ξ: hξn = (Ln )2 /(16k); Orthogonal direction η: hηn = CLn /4k For L1 = 8, khξ ≈ 4 and khη ≈ 2 Ira Livshits (BSU) - November 2011, Linz 79 / 83 Main features Smaller wave computational domain; Larger ray domains; grids aligned with propagation directions; Agressive ray coarsening; Ray equations aξξ + aηη + 2ikaξ , ξ = k` Ray residual separation using consecutive full weighting; FAS for both waves and rays - boundary values for waves come directly from rays; Ira Livshits (BSU) - November 2011, Linz 80 / 83 More about rays Ira Livshits (BSU) - November 2011, Linz 81 / 83 Algorithm Repeat until happy I I CS standard wave V cycle FAS ray cycle: computation of wave residual kh 1; separation into ray residuals starting kh ≈ 1; forming FAS ray rhs, line relaxation khξ ≈ 4, khη ≈ 2; reconstructing to the wave form kh ≈ 1; interpolating (with relaxation) to the finest wave grid kh 1. Ira Livshits (BSU) - November 2011, Linz 82 / 83 Requirements and advantages Requirements Helmholtz PDE Structured grids Analytical knowledge of Lu ≈ 0 Advantages Local processes - wave representation on fine grids Geometrical optics processes - ray representation on coarse grids and larger domains Natural intro of radiation boundary conditions Ira Livshits (BSU) - November 2011, Linz 83 / 83