Multilevel Solvers for high-frequency Helmholtz problems

Geometric Multigrid Methods for the Helmholtz
equations
Ira Livshits
Ball State University
RICAM, Linz, 14, 16 November 2011
Ira Livshits
(BSU)
-
November 2011, Linz
1 / 83
Multigrid Methods
Aim: To understand geometric multigrid for the Helmholtz equation
(difficulties and some solutions), and to give a deeper understanding
of what led to the various choices made.
Contents
- Basics of multigrid
- Basics of local Fourier analysis
- Nonlinear multigrid (FAS)
- The Helmholtz equation
Helmholtz equation
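$Lu = -\Delta u(x) - k^2 u(x), \quad x \in \Omega \subset \mathbb{R}^d$
with large wave numbers $k$: $2\pi/k \ll \operatorname{diam}(\Omega)$ (the wavelength is much smaller than the domain size)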
A Comment on Terminology
The indefinite Helmholtz equation is
$Lu = (-\Delta - k^2)u = f$,
in contrast to
$(-\Delta + \eta)u = f$, $\eta > 0$,
which is often also called the Helmholtz equation, the Helmholtz equation with the good sign,
or the positive definite Helmholtz equation.
The subject today is the indefinite Helmholtz equation.
The 2D discrete Poisson equation in the unit square
We consider a stationary diffusion equation with diffusion in two coordinate
directions:
$$-\frac{\partial^2 u(x,y)}{\partial x^2} - \frac{\partial^2 u(x,y)}{\partial y^2} = f(x,y) \quad \text{for } (x,y) \in \Omega = (0,1)^2$$
with boundary conditions:
$$u(x,y) = g(x,y) \quad \text{for } (x,y) \in \partial\Omega$$
The functions $f \in C(\Omega)$ and $g \in C(\partial\Omega)$ are known.
The “Laplace operator” is commonly written as:
$$\Delta := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
With mesh size $h = 1/n$, the grid $\Omega_h$ is defined as
$\Omega_h = \{(x_i, y_j) : (x_i, y_j) = (ih, jh) \in \mathbb{R}^2;\ i, j = 0, 1, \dots, n\}$
In analogy to the 1D case we approximate the partial derivatives in the
$y$-direction. For the second derivative in the $y$-direction we obtain:
$$\left.\frac{\partial^2 u}{\partial y^2}\right|_{i,j} \approx \frac{u_{i,j-1} - 2u_{i,j} + u_{i,j+1}}{h_y^2}$$
The accuracy in the $x$- and $y$-directions is $O(h_x^2)$, $O(h_y^2)$ or,
with $h_x = h_y = h$, $O(h^2)$.
Gauss-Seidel and Jacobi for the Poisson equation
The discrete Laplace operator reads:
$$\Delta_h u_h = \frac{1}{h^2}\left[u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1} - 4u_{i,j}\right]$$
or:
$$\Delta_h u_h(x,y) = \frac{1}{h^2}\left[u_h(x-h,y) + u_h(x+h,y) + u_h(x,y-h) + u_h(x,y+h) - 4u_h(x,y)\right]$$
Jacobi iteration reads
$$u_h^{m+1}(x_i,y_j) = \frac{1}{4}\left[h^2 f_h(x_i,y_j) + u_h^m(x_i-h,y_j) + u_h^m(x_i+h,y_j) + u_h^m(x_i,y_j-h) + u_h^m(x_i,y_j+h)\right]$$
(with $(x_i,y_j) \in \Omega_h$)
Gauss-Seidel iteration reads
$$u_h^{m+1}(x_i,y_j) = \frac{1}{4}\left[h^2 f_h(x_i,y_j) + u_h^{m+1}(x_i-h,y_j) + u_h^m(x_i+h,y_j) + u_h^{m+1}(x_i,y_j-h) + u_h^m(x_i,y_j+h)\right]$$
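The two iterations above differ only in whether a sweep reads old or freshly updated neighbour values. The following is a minimal sketch, assuming the model problem $-\Delta u = f$ with $f = 0$ and zero Dirichlet data (so the exact solution is zero and the iterate itself is the error); the function names and the choice $n = 16$ are illustrative and not taken from the slides.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep: every interior value is computed from old neighbours."""
    n = len(u) - 1
    new = [row[:] for row in u]          # boundary values are copied unchanged
    for i in range(1, n):
        for j in range(1, n):
            new[i][j] = 0.25 * (h * h * f[i][j]
                                + u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1])
    return new


def gauss_seidel_sweep(u, f, h):
    """One lexicographic Gauss-Seidel sweep: updates are used as soon as available."""
    n = len(u) - 1
    for i in range(1, n):
        for j in range(1, n):
            u[i][j] = 0.25 * (h * h * f[i][j]
                              + u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1])
    return u


def max_norm(u):
    return max(abs(v) for row in u for v in row)


def initial_guess(n):
    """Zero Dirichlet data on the boundary, 1 at every interior point."""
    return [[1.0 if 0 < i < n and 0 < j < n else 0.0 for j in range(n + 1)]
            for i in range(n + 1)]


n = 16
h = 1.0 / n
f = [[0.0] * (n + 1) for _ in range(n + 1)]   # f = 0, g = 0: exact solution is 0

u_jac = initial_guess(n)
u_gs = initial_guess(n)
for _ in range(50):
    u_jac = jacobi_sweep(u_jac, f, h)
    u_gs = gauss_seidel_sweep(u_gs, f, h)

jac_err = max_norm(u_jac)   # the iterate is the error, since the solution is 0
gs_err = max_norm(u_gs)
print(jac_err, gs_err)
```

Both errors shrink slowly, with Gauss-Seidel ahead of Jacobi for the same number of sweeps; neither comes close to solving the problem in 50 sweeps, which is the point the following slides build on.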
Linear systems of equations in matrix form
To solve a linear system of equations $Au = b$, we use an
iterative method:
$u^{m+1} = Qu^m + s, \quad (m = 0, 1, 2, \dots)$,
chosen so that the system $u = Qu + s$ is equivalent to the original problem.
The matrix $Q$ is usually not computed explicitly!
From the matrix equation $Au = b$ we construct a splitting $A = B + (A - B)$,
so that $B$ is “easily invertible”.
With a general nonsingular $N \times N$ matrix $B$ we obtain such
iteration procedures from the equation
$Bu + (A - B)u = b$.
We put
$Bu^{m+1} + (A - B)u^m = b$,
or, rewritten for $u^{m+1}$:
$u^{m+1} = u^m - B^{-1}(Au^m - b) = (I - B^{-1}A)u^m + B^{-1}b$.
We will see that it is important for the eigenvalues of $Q = I - B^{-1}A$
to be as small as possible in magnitude. This is more likely the better $B$ resembles $A$.
Linear systems of equations in matrix form
Definition: An iterative process $u^{m+1} = Qu^m + s$ is called consistent if the
matrix $I - Q$ is not singular and if $(I - Q)^{-1}s = A^{-1}b$.
If the iteration is consistent, the equation $(I - Q)u = s$ has exactly one
solution $u^{m\to\infty} \equiv u$.
Definition: An iteration is called convergent if the sequence $u^0, u^1, u^2, \dots$
converges to a limit, independent of $u^0$.
The spectral radius
Definition: The spectral radius $\rho(Q)$ of an $(N, N)$ matrix $Q$ is
$\rho(Q) := \max\{|\lambda| : \lambda \text{ eigenvalue of } Q\}$
Lemma: The spectral radius is the infimum over all (vector norm related)
matrix norms:
$\rho(Q) = \inf \|Q\|$
Theorem: The iterative method $u^{m+1} = Qu^m + s$, $m = 0, 1, 2, \dots$, yields a
convergent sequence $\{u^m\}$ for a general $u^0$, $s$ (“the method converges”), if
$\rho(Q) < 1$.
In the case $\rho(Q) < 1$, the matrix $I - Q$ is regular, and for each $s$ there is exactly one
fixed point of the iterative scheme.
Error reduction
General splitting of $A$ (another way of writing), with $u^*$ the exact solution:
$Bu^{m+1} + (A - B)u^m = b = Au^* = Bu^* + (A - B)u^*$
or:
$B(u^{m+1} - u^*) = (B - A)(u^m - u^*)$.
So,
$(u^{m+1} - u^*) = Q(u^m - u^*)$.
By induction, we obtain:
$(u^m - u^*) = Q^m(u^0 - u^*)$.
If $\rho(Q) < 1$, then $Q^m \to 0$, so the error tends to zero for any $u^0$.
Efficient solution methods for discrete equations: Hierarchy
One-level methods, such as the Jacobi and Gauss-Seidel iterations, are far from
optimal: their cost is $O(N^\alpha)$, $\alpha > 1$.
$O(N)$ complexity requires hierarchical approaches.
Multigrid is the classical hierarchical approach:
- Approximation on different scales
- Compute only those “parts” of a solution on fine scales that really require the fine resolution.
- Compute/correct the remaining parts on coarser scales
- Gives $O(N)$ complexity
Advantages increase with increasing problem size
The idea behind multigrid is very general
Relaxation of ∆u = 0
[Figure: three plots of the error.] (a) Random initial error, (b) error after 10 GS iterations, (c) error after 10 more
GS iterations.
Convergence factor vs smoothing factor
The convergence factor (GS, Jacobi, weighted Jacobi, SOR) is $\rho = 1 - O(h^2)$.
The smoothing factor is the convergence factor for oscillatory components only, and it
is much better.
The remaining error components are smooth.
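The gap between the two factors can be made concrete in one dimension. The sketch below assumes damped Jacobi with $\omega = 2/3$ applied to $-u'' = f$ on $(0,1)$, whose amplification of the mode with frequency $\theta$ is $1 - \omega(1 - \cos\theta)$; this smoother and the grid sizes are illustrative choices, not values from the slides.

```python
import math


def damped_jacobi_factor(theta, omega):
    """Amplification of the mode with frequency theta under damped Jacobi for -u'' = f."""
    return 1.0 - omega * (1.0 - math.cos(theta))


def convergence_factor(n, omega):
    """Largest |amplification| over the modes theta = j*pi/n, j = 1..n-1 (h = 1/n)."""
    return max(abs(damped_jacobi_factor(j * math.pi / n, omega)) for j in range(1, n))


def smoothing_factor(omega, samples=1001):
    """Largest |amplification| over the high frequencies pi/2 <= theta <= pi."""
    thetas = (math.pi / 2 + (math.pi / 2) * i / (samples - 1) for i in range(samples))
    return max(abs(damped_jacobi_factor(t, omega)) for t in thetas)


omega = 2.0 / 3.0
for n in (16, 32, 64, 128):
    rho = convergence_factor(n, omega)
    # (1 - rho) * n^2 stays roughly constant, i.e. rho = 1 - O(h^2)
    print(n, rho, (1.0 - rho) * n * n)
print(smoothing_factor(omega))
```

The convergence factor creeps toward 1 as the grid is refined, while the smoothing factor stays at 1/3 regardless of $h$.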
Motivation to use coarse grids
[Figure: plots of the error after relaxation.]
The error remaining after relaxation (aka smoothing) has a good coarse-grid
representation.
Detour to local Fourier analysis
with a hint of trouble brewing for the Helmholtz
The basic idea of any multigrid algorithm is to annihilate the high frequency
error components efficiently on a fine grid by relaxation (smoothing) while
the low frequencies are approximated from coarser grids.
Smoothing analysis is used to analyze quantitatively this reduction of the
high frequency error components on the fine grid.
Assuming the ideal situation in which
- the relaxation does not affect the low frequencies,
- all low frequencies are approximated more or less exactly on the coarse grid,
- there is no interaction between high and low frequency error components,
smoothing analysis already leads to an “estimate” or prediction of the 2-level convergence
factor.
Assumptions
For a discretized PDE:
$L_h U_h = F_h$
With the basic assumptions of any local Fourier analysis:
- linearize operator $L_h$ and/or freeze coefficients to local values
- neglect boundaries and boundary conditions
- consider the frozen, linearized operator on an infinite grid $G_h$
- apply the linear constant coefficient local Fourier analysis
Components, the 1D case
Infinite uniform mesh, mesh size $h$:
$G_h = \{k \cdot h : k \in \mathbb{Z}\}$
1D Fourier components: $e^{i\omega x} = \cos \omega x + i \sin \omega x$
On $G_h$ only Fourier components $e^{i\theta x/h}$ with $\theta \in (-\pi, \pi]$ are “visible”.
Definition: A component $e^{i\theta x/h}$ is “visible” on $G_h$ if there is no frequency
$\theta_0$ with $|\theta_0| < |\theta|$ such that
$e^{i\theta_0 x/h} = e^{i\theta x/h}$ for all $x \in G_h$.
A visible component $e^{i\theta x/h}$ does not coincide with any lower frequency on
grid $G_h$.
$\theta$: “frequency”, $|2\pi h/\theta|$: “period”
Components, the 2D case
Infinite uniform mesh, mesh size $h = (h_x, h_y)$ in the $x$- and $y$-directions:
$G_h = \{(kh_x, lh_y) : k, l \in \mathbb{Z}\}$
$\theta = (\theta_1, \theta_2)$: “frequency”
On $G_h$, only Fourier components
$e^{i\theta x/h} := e^{i\theta_1 x/h_x} \cdot e^{i\theta_2 y/h_y}$
with
$|\theta| := \max(|\theta_1|, |\theta_2|) \le \pi$
are “visible”.
For higher frequencies $|\theta| > \pi$, the Fourier components coincide with a lower
frequency in $(-\pi, \pi]$.
For higher dimensions: analogously
Difference operators
Let $G_h$ be an infinite uniform mesh, mesh size $h$;
$u_h$: grid function on $G_h$;
$I = \{-\nu, \dots, 0, \dots, \nu\}$: set of indices.
Difference operators $L_h$ (2D, $\mathbf{x} = (x, y) \in G_h$):
$$L_h u_h(\mathbf{x}) = \sum_{\mu_1 \in I}\sum_{\mu_2 \in I} a_{\mu_1\mu_2}\, u_h(x + \mu_1 h_x,\, y + \mu_2 h_y)$$
The stencil notation, examples
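Finite differences for $\Delta u$, $I = \{-1, 0, 1\}$ (2D, 5-point stencil):
$$L_h u_h(\mathbf{x}) = \begin{bmatrix} a_{-1,1} & a_{0,1} & a_{1,1} \\ a_{-1,0} & a_{0,0} & a_{1,0} \\ a_{-1,-1} & a_{0,-1} & a_{1,-1} \end{bmatrix} u_h(\mathbf{x}) = \underbrace{\frac{1}{h^2}\begin{bmatrix} & 1 & \\ 1 & -4 & 1 \\ & 1 & \end{bmatrix}}_{\Delta_h} u_h(\mathbf{x})$$
or for the biharmonic operator $\Delta^2$, $I = \{-2, -1, 0, 1, 2\}$ (13-point stencil):
$$\Delta_h \Delta_h u_h(\mathbf{x}) = \frac{1}{h^4}\begin{bmatrix} & & 1 & & \\ & 2 & -8 & 2 & \\ 1 & -8 & 20 & -8 & 1 \\ & 2 & -8 & 2 & \\ & & 1 & & \end{bmatrix} u_h(\mathbf{x})$$
Here: $\mathbf{x} = (x, y) \in G_h$, $h_x = h_y = h$.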
Motivation for considering the biharmonic equation: Kaczmarz
relaxation
Kaczmarz relaxation (equivalent to GS applied to $AA^*$)
Implementation: Consider the $i$th equation $A_i x = b_i$ and a current approximation $\tilde x$;
here $A_i$ is the $i$th row of $A$.
A new approximation is given by
$\tilde x \leftarrow \tilde x + r_i A_i^*$,
where the normalized residual is computed as
$$r_i = \frac{b_i - A_i \tilde x}{A_i A_i^*}.$$
Kaczmarz relaxation always converges (indeed, $AA^*$ is SPD for any nonsingular $A$), but it is often
slow (much slower than GS on the original matrix) and thus should be used only
as a last resort ($\mu = 0.8$ for Laplace vs. GS's $\mu = 0.25$).
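The row-by-row update described above can be sketched in a few lines. This is a minimal illustration for a real matrix stored as a list of rows (so $A_i^*$ is just the row itself); the 2-by-2 indefinite test matrix and the function name are hypothetical.

```python
def kaczmarz_sweep(A, b, x):
    """One Kaczmarz sweep over all rows of a real matrix A (list of rows), in place."""
    for row, b_i in zip(A, b):
        residual = b_i - sum(a * x_j for a, x_j in zip(row, x))
        r_i = residual / sum(a * a for a in row)      # normalised residual
        for j, a in enumerate(row):
            x[j] += r_i * a                           # project onto the i-th hyperplane
    return x


A = [[2.0, 1.0], [1.0, -3.0]]   # symmetric but indefinite (determinant is negative)
x_exact = [1.0, 2.0]
b = [sum(a * xj for a, xj in zip(row, x_exact)) for row in A]

x = [0.0, 0.0]
for _ in range(30):
    kaczmarz_sweep(A, b, x)
error = max(abs(xi - xe) for xi, xe in zip(x, x_exact))
print(x, error)
```

Each row update is an orthogonal projection onto the hyperplane of one equation, which is why the iteration does not care about the sign of the eigenvalues of $A$.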
Difference operators applied to Fourier components
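On an infinite grid $G_h$, the Fourier components $e^{i\theta x/h}$ are eigenfunctions of
any scalar difference operator $L_h$ with constant coefficients.
1D:
$$L_h e^{i\theta x/h} = \underbrace{\Bigl(\sum_{\mu \in I} a_\mu e^{i\theta\mu}\Bigr)}_{\tilde L_h(\theta)} \cdot e^{i\theta x/h}$$
2D:
$$L_h e^{i\theta x/h} = \underbrace{\Bigl(\sum_{\mu_1 \in I}\sum_{\mu_2 \in I} a_{\mu_1\mu_2} e^{i\theta_1\mu_1} e^{i\theta_2\mu_2}\Bigr)}_{\tilde L_h(\theta)} \cdot e^{i\theta x/h}$$
$\tilde L_h(\theta)$ is the eigenvalue, also called the Fourier symbol of $L_h$.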
Examples
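1D:
$$L_h = \frac{1}{h^2}[1 \quad -2 \quad 1] \;\Longrightarrow\; L_h e^{i\theta x/h} = \frac{1}{h^2}\bigl(e^{-i\theta} - 2 + e^{i\theta}\bigr) \cdot e^{i\theta x/h} = \underbrace{\frac{2}{h^2}(\cos\theta - 1)}_{\tilde L_h(\theta)} \cdot e^{i\theta x/h}$$
2D: $\theta = (\theta_1, \theta_2)$
$$L_h = \Delta_h = \frac{1}{h^2}\begin{bmatrix} & 1 & \\ 1 & -4 & 1 \\ & 1 & \end{bmatrix} \;\Longrightarrow\; \tilde L_h(\theta) = \frac{2}{h^2}(\cos\theta_1 + \cos\theta_2 - 2)$$
or for the biharmonic operator $L_h = \Delta_h\Delta_h$ (13-point stencil):
$$\tilde L_h(\theta) = \frac{4}{h^4}(\cos\theta_1 + \cos\theta_2 - 2)^2$$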
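The closed-form symbols above can be checked by summing $a_\mu e^{i\theta\mu}$ directly over a stencil. The sketch below is an illustration only; the dictionary layout (offset tuple mapped to coefficient) and the helper name `symbol` are assumptions made for this example.

```python
import cmath
import math


def symbol(stencil, theta, scale):
    """Fourier symbol of a constant-coefficient stencil.

    stencil maps integer offsets (tuples, one entry per dimension) to coefficients,
    theta is a tuple of frequencies, scale is the common factor such as 1/h**2.
    """
    total = 0.0
    for offsets, coeff in stencil.items():
        phase = sum(t * mu for t, mu in zip(theta, offsets))
        total += coeff * cmath.exp(1j * phase)
    return scale * total


laplace_1d = {(-1,): 1.0, (0,): -2.0, (1,): 1.0}
laplace_2d = {(0, 0): -4.0, (1, 0): 1.0, (-1, 0): 1.0, (0, 1): 1.0, (0, -1): 1.0}

# 13-point biharmonic stencil: 20 at the centre, -8 on the nearest neighbours,
# 2 on the diagonals, 1 at distance two along the axes.
biharmonic = {(0, 0): 20.0}
for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
    biharmonic[d] = -8.0
    biharmonic[(2 * d[0], 2 * d[1])] = 1.0
for d in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
    biharmonic[d] = 2.0

h = 0.1
t1, t2 = 0.7, 2.1
s1 = symbol(laplace_1d, (t1,), 1 / h**2)
s2 = symbol(laplace_2d, (t1, t2), 1 / h**2)
s4 = symbol(biharmonic, (t1, t2), 1 / h**4)
print(s1, s2, s4)
```

The biharmonic symbol comes out as the square of the Laplace symbol, as expected for $\Delta_h\Delta_h$.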
Smoothing analysis; a general class of relaxations:
Consider a linear problem
$L_h U_h(x) = F_h$ on grid $G_h$
Then: Many (not all!) relaxation schemes for this problem are based on an
operator splitting
$L_h = A_h + B_h$
where $A_h$ and $B_h$ are again difference operators.
Then: A relaxation sweep starting from the initial approximation $u_h$ (“old
values”) produces a new approximation $\bar u_h$ (“new values”) by solving
$A_h u_h(x) + B_h \bar u_h(x) = F_h(x)$
at each gridpoint $x \in G_h$.
Smoothing analysis; a general class of relaxations:
What does that mean for the errors before and after the relaxation sweep?
Let $e_h = U_h - u_h$ and $\bar e_h = U_h - \bar u_h$ be the errors before and after the sweep.
Then:
$A_h e_h(x) + B_h \bar e_h(x) = 0$
at each gridpoint $x \in G_h$.
How do these relaxations act on Fourier components?
Let $e_h = A e^{i\theta x/h}$ be the error before relaxation; then $\bar e_h(x) = \bar A e^{i\theta x/h}$ is the error after
relaxation.
$A, \bar A \in \mathbb{R}$ are the error amplitudes before and after relaxation.
Smoothing analysis; a general class of relaxations:
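From
$A_h e_h(x) + B_h \bar e_h(x) = 0$
follows
$\tilde A_h(\theta) A + \tilde B_h(\theta) \bar A = 0$
and therefore
$$\bar A = \left(-\frac{\tilde A_h(\theta)}{\tilde B_h(\theta)}\right) A$$
where $\tilde A_h(\theta), \tilde B_h(\theta) \in \mathbb{C}$ are the Fourier symbols of $A_h$, $B_h$.
$$\mu(\theta) := -\frac{\tilde A_h(\theta)}{\tilde B_h(\theta)}$$
is the amplification factor of the component $\theta$.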
1D Example:Gauss-Seidel relaxation
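Corresponding splitting of $L_h$:
$$\underbrace{\frac{1}{h^2}[1 \quad -2 \quad 1]}_{L_h} = \underbrace{\frac{1}{h^2}[0 \quad 0 \quad 1]}_{A_h} + \underbrace{\frac{1}{h^2}[1 \quad -2 \quad 0]}_{B_h}$$
Then: at gridpoint $x \in G_h$ for the errors
$$\bar A e^{i\theta(x-h)/h} - 2\bar A e^{i\theta x/h} + A e^{i\theta(x+h)/h} = 0$$
$$\underbrace{\bigl(e^{-i\theta} - 2\bigr)}_{\tilde B_h(\theta)} \bar A + \underbrace{e^{i\theta}}_{\tilde A_h(\theta)} A = 0.$$
This yields:
$$\bar A = \frac{-e^{i\theta}}{e^{-i\theta} - 2}\, A$$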
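The amplification factor just derived is easy to evaluate numerically. The sketch below samples $|\mu(\theta)|$ for the 1D Gauss-Seidel splitting over the high frequencies $\pi/2 \le \theta \le \pi$ (the range is symmetric in $\pm\theta$, since $|\mu(-\theta)| = |\mu(\theta)|$); the sampling resolution and function names are illustrative.

```python
import cmath
import math


def gs_amplification_1d(theta):
    """mu(theta) = -A~(theta)/B~(theta) for lexicographic GS on [1 -2 1]/h^2."""
    return -cmath.exp(1j * theta) / (cmath.exp(-1j * theta) - 2.0)


def smoothing_factor(samples=1001):
    """Largest |mu| over the sampled high frequencies pi/2 <= theta <= pi."""
    thetas = [math.pi / 2 + (math.pi / 2) * i / (samples - 1) for i in range(samples)]
    return max(abs(gs_amplification_1d(t)) for t in thetas)


mu = smoothing_factor()
print(mu)                                   # attained at theta = pi/2
print(abs(gs_amplification_1d(math.pi)))    # most oscillatory mode
print(abs(gs_amplification_1d(0.01)))       # a very smooth mode
```

Since $|\mu(\theta)| = 1/\sqrt{5 - 4\cos\theta}$, the maximum over the high frequencies is $1/\sqrt{5}$ at $\theta = \pi/2$, while smooth modes are left almost untouched.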
Questions
What is smoothing ?
How can the smoothing property of a given relaxation scheme be measured
quantitatively ?
How does the smoothing property (or its quantitative measure) influence the
performance of multigrid cycling ?
Can the performance of a multigrid cycle be predicted in terms of the
smoothing measures ?
Note: In a multigrid cycle, relaxations are used to “smooth” the highly
oscillating error components which cannot be approximated on coarser grids.
Therefore, smoothing roughly means “convergence of high frequency error
components”.
What are high and low frequencies ?
Let Gh be a fine grid and GH a coarse grid.
Then: A Fourier component $e^{i\theta x/h}$ on grid $G_h$ is called a high frequency
(with respect to $G_H$) if its restriction (injection) to the coarse grid $G_H$ is not
“visible” there. Otherwise it is called a low frequency.
Take one-dimensional standard coarsening:
$G_h = \{kh : k \in \mathbb{Z}\}$, $G_H := G_{2h} = \{2kh : k \in \mathbb{Z}\}$
Let $|\theta| \le \pi$; then by injection from $G_h$ to $G_{2h}$:
$$e^{i\theta x/h} \;\overset{\text{inject}}{\Longrightarrow}\; e^{i2\theta x/2h}$$
This component $e^{i2\theta x/2h}$ is “visible” on $G_{2h}$ only if
$|2\theta| \le \pi$, i.e. $|\theta| \le \frac{\pi}{2}$
$\Longrightarrow$ The high frequencies on $G_h$ (with respect to $G_{2h}$) are those with
$\pi \ge |\theta| \ge \frac{\pi}{2}$.
Fourier components in 2D: $e^{i\theta x/h} = e^{i\theta_1 x/h} e^{i\theta_2 y/h}$,
$\theta = (\theta_1, \theta_2)$ with
$|\theta| := \max(|\theta_1|, |\theta_2|) \le \pi$
Standard coarsening:
$G_h = \{(kh, lh) : k, l \in \mathbb{Z}\}$
$G_H = G_{2h} = \{(2kh, 2lh) : k, l \in \mathbb{Z}\}$
$\Longrightarrow$ High frequencies: $\frac{\pi}{2} \le |\theta| \le \pi$
Low frequencies: $|\theta| < \frac{\pi}{2}$
Note: The definition of high and low frequencies depends on the coarsening.
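Why a high frequency is not visible on the coarse grid can be seen directly: on the points of $G_{2h}$ it takes the same values as a lower frequency. The sketch below is a small 1D illustration; the particular frequency $0.8\pi$ and the point ranges are arbitrary choices for the example.

```python
import cmath
import math


def mode(theta, j):
    """Value of e^{i theta x/h} at the fine-grid point x = j*h."""
    return cmath.exp(1j * theta * j)


theta_high = 0.8 * math.pi            # a high frequency: pi/2 <= |theta| <= pi
theta_low = theta_high - math.pi      # its alias, with |theta_low| < pi/2

coarse_points = range(-10, 11, 2)     # even j: the points of G_2h
fine_only_points = range(-9, 10, 2)   # odd j: points of G_h that are not in G_2h

gap_coarse = max(abs(mode(theta_high, j) - mode(theta_low, j)) for j in coarse_points)
gap_fine = max(abs(mode(theta_high, j) - mode(theta_low, j)) for j in fine_only_points)
print(gap_coarse, gap_fine)
```

The two modes agree on every coarse point and differ by a sign flip (a gap of 2) on the fine-only points, so the coarse grid cannot tell them apart.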
What is smoothing ?
“Smoothing” stands for convergence of high frequency error components
which cannot be approximated from the coarse grids in a multigrid cycle.
For a relaxation scheme with amplification factors $\mu(\theta)$ of Fourier
components $e^{i\theta x/h}$, the smoothing rate $\mu$ is defined by:
$\mu := \max\{|\mu(\theta)| : \theta \text{ a high frequency}\}$
Note:
- The above definition of $\mu$ depends on the coarsening.
- The amplification or reduction factors $\mu(\theta)$ are also called “smoothing factors”.
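Example: Gauss-Seidel relaxation for Poisson's equation:
$$\Delta_h U_h = \frac{1}{h^2}\begin{bmatrix} & 1 & \\ 1 & -4 & 1 \\ & 1 & \end{bmatrix} U_h = F_h$$
Relaxation in error terms:
$$\underbrace{\frac{1}{h^2}\begin{bmatrix} & 0 & \\ 1 & -4 & 0 \\ & 1 & \end{bmatrix}}_{B_h} \bar e_h(x) + \underbrace{\frac{1}{h^2}\begin{bmatrix} & 1 & \\ 0 & 0 & 1 \\ & 0 & \end{bmatrix}}_{A_h} e_h(x) = 0$$
Symbols: $\theta = (\theta_1, \theta_2)$
$$B_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(e^{-i\theta_1} + e^{-i\theta_2} - 4\bigr)}_{\tilde B_h(\theta)} \cdot e^{i\theta x/h}$$
$$A_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(e^{i\theta_1} + e^{i\theta_2}\bigr)}_{\tilde A_h(\theta)} \cdot e^{i\theta x/h}$$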
$\Longrightarrow$ Amplification factors:
$$\mu(\theta) = -\frac{\tilde A_h(\theta)}{\tilde B_h(\theta)} = -\frac{e^{i\theta_1} + e^{i\theta_2}}{e^{-i\theta_1} + e^{-i\theta_2} - 4}$$
$\Longrightarrow$ Smoothing rate:
$$\mu = \max\{|\mu(\theta)| : \tfrac{\pi}{2} \le |\theta| \le \pi\} \approx 0.5$$
$\Longrightarrow$ The high frequency error components are reduced by a factor of
(at least) $\mu$ per relaxation sweep.
[Figure: plot of the amplification factors.]
Note: For low frequencies, the reduction per relaxation sweep is much worse:
$|\mu(\theta)| \to 1$ if $\theta \to 0$
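The smoothing rate of about 0.5 can be reproduced by sampling the amplification factor over the high-frequency region. The sketch below is a brute-force illustration; the sampling grid of 128 points per direction is an arbitrary choice, and the sampled maximum slightly underestimates the true supremum.

```python
import cmath
import math


def gs_amplification_2d(t1, t2):
    """mu(theta) for lexicographic GS on the 5-point Laplacian."""
    numerator = cmath.exp(1j * t1) + cmath.exp(1j * t2)
    denominator = cmath.exp(-1j * t1) + cmath.exp(-1j * t2) - 4.0
    return -numerator / denominator


def smoothing_factor(n=64):
    """Max of |mu| over sampled high frequencies, pi/2 <= max(|t1|, |t2|) <= pi."""
    grid = [math.pi * i / n for i in range(-n + 1, n + 1)]   # samples of (-pi, pi]
    best = 0.0
    for t1 in grid:
        for t2 in grid:
            if max(abs(t1), abs(t2)) >= math.pi / 2:
                best = max(best, abs(gs_amplification_2d(t1, t2)))
    return best


mu = smoothing_factor()
print(mu)
print(abs(gs_amplification_2d(0.01, 0.01)))   # smooth mode: almost no reduction
```

Both signs of each frequency are sampled because $|\mu|$ is not symmetric under $\theta_2 \to -\theta_2$ for a lexicographic sweep.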
Now to Helmholtz : the stencils and the relaxations
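1D finite differences for $Lu = u'' + k^2 u$:
$$L_h = \frac{1}{h^2}\bigl[1 \quad -2 + (kh)^2 \quad 1\bigr]$$
2D finite differences for $Lu = \Delta u + k^2 u$ (5-point stencil):
$$L_h = \frac{1}{h^2}\begin{bmatrix} & 1 & \\ 1 & -4 + (kh)^2 & 1 \\ & 1 & \end{bmatrix}$$
Here: $\mathbf{x} = (x, y) \in G_h$, $h_x = h_y = h$.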
Fourier symbols
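1D:
$$L_h = \frac{1}{h^2}\bigl[1 \quad -2 + k^2h^2 \quad 1\bigr] \;\Longrightarrow\; L_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(2\cos\theta - 2 + k^2h^2\bigr)}_{\tilde L_h(\theta)} \cdot e^{i\theta x/h}$$
2D: $\theta = (\theta_1, \theta_2)$
$$L_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(2\cos\theta_1 + 2\cos\theta_2 - 4 + k^2h^2\bigr)}_{\tilde L_h(\theta)} \cdot e^{i\theta x/h}$$
or for the biharmonic operator (think Kaczmarz!)
$$\tilde L_h(\theta) = \frac{1}{h^4}\bigl(2\cos\theta_1 + 2\cos\theta_2 - 4 + k^2h^2\bigr)^2$$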
h
1D Example:
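Gauss-Seidel relaxation of
$$\underbrace{\frac{1}{h^2}\bigl[1 \quad -2 + k^2h^2 \quad 1\bigr]}_{L_h} U_h(x) = F_h(x)$$
Relaxation: $u_h \Longrightarrow \bar u_h$
$$\frac{\bar u_h(x-h) - (2 - k^2h^2)\,\bar u_h(x) + u_h(x+h)}{h^2} = F_h(x)$$
Corresponding splitting of $L_h$:
$$\underbrace{\frac{1}{h^2}\bigl[1 \quad -2 + k^2h^2 \quad 1\bigr]}_{L_h} = \underbrace{\frac{1}{h^2}[0 \quad 0 \quad 1]}_{A_h} + \underbrace{\frac{1}{h^2}\bigl[1 \quad -2 + k^2h^2 \quad 0\bigr]}_{B_h}$$
Then: at grid point $x \in G_h$ for the errors
$$\bar A e^{i\theta(x-h)/h} - (2 - k^2h^2)\bar A e^{i\theta x/h} + A e^{i\theta(x+h)/h} = 0$$
$$\underbrace{\bigl(e^{-i\theta} - 2 + k^2h^2\bigr)}_{\tilde B_h(\theta)} \bar A + \underbrace{e^{i\theta}}_{\tilde A_h(\theta)} A = 0.$$
This yields:
$$\bar A = \frac{-e^{i\theta}}{e^{-i\theta} - (2 - k^2h^2)}\, A$$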
What is the difference compared to Laplace?
- What happens when $\theta = 0$?
- What happens when $\theta$ is between $\pi/2$ and $\pi$?
- What happens when $\theta \approx kh$?
- What happens when $kh$ grows, i.e., $k$ is fixed and $h$ is coarsened?
Answers
If $kh \ll 1$, i.e., $\exp(\pm ikx)$ are smooth:
- Other smooth components $\theta \approx 0$ remain almost unchanged by relaxation;
- For $\theta \approx \pm kh$ there is a slight divergence;
- Oscillatory components $\pi/2 < |\theta| \le \pi$ converge almost as fast as in Laplace
  (convergence does depend on $kh$);
If $kh = O(1)$, divergence worsens!
[Figure: amplification factors as functions of $\theta$.] Results for $kh = 0.1$ and $kh = 0.5$
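The answers above can be checked by evaluating the 1D amplification factor for the two quoted values of $kh$. The sketch below is an illustration that assumes $kh < 1$ (at $kh = 1$ the denominator vanishes for $\theta = 0$); the function names are hypothetical.

```python
import cmath
import math


def helmholtz_gs_amplification(theta, kh):
    """mu(theta) for lexicographic GS on the stencil [1, -2 + (kh)^2, 1] / h^2."""
    return -cmath.exp(1j * theta) / (cmath.exp(-1j * theta) - (2.0 - kh * kh))


def smoothing_factor(kh, samples=1001):
    """Largest |mu| over the sampled high frequencies pi/2 <= theta <= pi."""
    thetas = [math.pi / 2 + (math.pi / 2) * i / (samples - 1) for i in range(samples)]
    return max(abs(helmholtz_gs_amplification(t, kh)) for t in thetas)


for kh in (0.1, 0.5):
    at_zero = abs(helmholtz_gs_amplification(0.0, kh))   # equals 1 / (1 - (kh)^2)
    at_kh = abs(helmholtz_gs_amplification(kh, kh))       # slightly above 1
    print(kh, at_zero, at_kh, smoothing_factor(kh))
```

For $kh = 0.1$ the smooth modes are amplified by about 1 percent per sweep and the high-frequency factor stays close to the Laplace value $1/\sqrt{5}$; for $kh = 0.5$ the amplification at $\theta = 0$ already reaches $4/3$.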
More examples: 2D Helmholtz equation:
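Gauss-Seidel relaxation in error terms:
$$\underbrace{\frac{1}{h^2}\begin{bmatrix} & 0 & \\ 1 & -4 + k^2h^2 & 0 \\ & 1 & \end{bmatrix}}_{B_h} \bar e_h(x) + \underbrace{\frac{1}{h^2}\begin{bmatrix} & 1 & \\ 0 & 0 & 1 \\ & 0 & \end{bmatrix}}_{A_h} e_h(x) = 0$$
Symbols: $\theta = (\theta_1, \theta_2)$
$$B_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(e^{-i\theta_1} + e^{-i\theta_2} - 4 + k^2h^2\bigr)}_{\tilde B_h(\theta)} \cdot e^{i\theta x/h}$$
$$A_h e^{i\theta x/h} = \underbrace{\frac{1}{h^2}\bigl(e^{i\theta_1} + e^{i\theta_2}\bigr)}_{\tilde A_h(\theta)} \cdot e^{i\theta x/h}$$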
$\Longrightarrow$ Amplification factors:
$$\mu(\theta) = -\frac{\tilde A_h(\theta)}{\tilde B_h(\theta)} = -\frac{e^{i\theta_1} + e^{i\theta_2}}{e^{-i\theta_1} + e^{-i\theta_2} - 4 + k^2h^2}$$
$\Longrightarrow$ Smoothing rate (what is smooth here?):
- $\mu \approx \mu_L$ for $kh \ll 1$;
- $\mu$ deteriorates (grows) as $kh$ grows;
[Figure: two plots of the amplification factors.]
Questions
What is smooth ? Is θ = 0 smooth ?
What is the smoothing factor ?
What do we want from relaxation ?
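Kaczmarz relaxation, $c = -4 + k^2h^2$
$$L_h L_h u_h(x) = \frac{1}{h^4}\begin{bmatrix} & & 1 & & \\ & 2 & 2c & 2 & \\ 1 & 2c & 4 + c^2 & 2c & 1 \\ & 2 & 2c & 2 & \\ & & 1 & & \end{bmatrix} u_h(x)$$
Relaxation in error terms:
$$\underbrace{\frac{1}{h^4}\begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 1 & 2c & 4 + c^2 & 0 & 0 \\ 0 & 2 & 2c & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}}_{B_h} \bar e_h(x) + \underbrace{\frac{1}{h^4}\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 2c & 2 & 0 \\ 0 & 0 & 0 & 2c & 1 \\ 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}}_{A_h} e_h(x) = 0$$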
Symbols and amplification
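Symbol:
$$B_h e^{i\theta x/h} = \underbrace{\frac{1}{h^4}\Bigl[\bigl(e^{-2i\theta_1} + e^{-2i\theta_2}\bigr) + 2\bigl(e^{-i\theta_1 - i\theta_2} + e^{-i\theta_1 + i\theta_2}\bigr) + 2c\bigl(e^{-i\theta_1} + e^{-i\theta_2}\bigr) + (4 + c^2)\Bigr]}_{\tilde B_h(\theta)} \cdot e^{i\theta x/h}$$
Symbol:
$$A_h e^{i\theta x/h} = \underbrace{\frac{1}{h^4}\Bigl[\bigl(e^{2i\theta_1} + e^{2i\theta_2}\bigr) + 2\bigl(e^{i\theta_1 - i\theta_2} + e^{i\theta_1 + i\theta_2}\bigr) + 2c\bigl(e^{i\theta_1} + e^{i\theta_2}\bigr)\Bigr]}_{\tilde A_h(\theta)} \cdot e^{i\theta x/h}$$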
Good news: $|\mu(\theta)| \le 1$ for any $\theta$
Q: But for which $\theta$ does it work best?
[Figure: two plots of the amplification factors.]
Results for $kh = 1$ (high frequencies and overall convergence).
More on convergence
Consider ordered eigenpairs of $A$: $(\lambda_j, v_j)$ with $Av_j = \lambda_j v_j$ such that
$|\lambda_1| \le |\lambda_2| \le \dots \le |\lambda_N|$.
For the error representations $e^{m+1} = \sum_n \alpha_n^{m+1} v_n$ and $e^m = \sum_n \alpha_n^m v_n$,
due to linearity (of everything),
$$\sum_n \alpha_n^{m+1} v_n = \sum_n \alpha_n^m Q v_n$$
and thus overall convergence is defined by
$$\max_n \left|\frac{\alpha_n^{m+1}}{\alpha_n^m}\right|$$
The smoothing factor is a convergence factor for $\alpha_{N/2}, \dots, \alpha_N$, i.e., for the
high energy eigenfunctions $v_{N/2}, \dots, v_N$.
The remaining non-reduced low energy eigenfunctions $v_1, \dots, v_{N/2-1}$ are smooth.
Reason for using coarse grids
Coarse grids can be used to compute an improved initial guess for the fine-grid
relaxation. This is advantageous because:
Relaxation on the coarse grid is much cheaper (1/2 as many points in 1D,
1/4 in 2D, 1/8 in 3D);
Relaxation on the coarse grid has a marginally better convergence rate, for
example $1 - O(4h^2)$ instead of $1 - O(h^2)$;
Smooth eigenfunctions on the finest grid become more oscillatory on the
coarse grid.
Multigrid components
Relaxation (smoother): efficiently damps high-energy error components (eigenfunctions with
large eigenvalues); leaves low eigenmodes unchanged;
A coarse-grid system operator: accurately resolves the low eigenmodes;
Restriction operator (averaging): transfers fine-grid information (residual) to the
coarse grid;
Prolongation operator (interpolation): transfers the coarse-grid correction back
to the fine grid; has to be accurate for low eigenmodes.
Nice example: ∆u = f
Relaxation: Gauss-Seidel.
A coarse-grid operator: discretization of the differential Laplace operator on
coarse scale (2h, 4h, . . . )
Restriction operator: full weighting of the residual (annihilates remaining
oscillatory components), transferring only smooth ones to coarser grids;
Interpolation: polynomial (linear) which is accurate for smooth components.
Works great because the lowest eigenmodes are physically smooth.
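The four components listed for the Laplace example fit together in a two-grid cycle. The following is a minimal 1D sketch for $-u'' = f$ on $(0,1)$ with zero Dirichlet data, assuming two pre- and one post-smoothing Gauss-Seidel sweeps and an exact coarse-grid solve in place of recursion; these choices, the grid size and all function names are illustrative rather than taken from the slides.

```python
import math


def gauss_seidel(u, f, h, sweeps):
    """Lexicographic Gauss-Seidel for -u'' = f; u[0] and u[-1] hold Dirichlet data."""
    for _ in range(sweeps):
        for j in range(1, len(u) - 1):
            u[j] = 0.5 * (h * h * f[j] + u[j - 1] + u[j + 1])


def residual(u, f, h):
    r = [0.0] * len(u)
    for j in range(1, len(u) - 1):
        r[j] = f[j] - (2.0 * u[j] - u[j - 1] - u[j + 1]) / (h * h)
    return r


def full_weighting(r):
    """Restrict with weights [1/4, 1/2, 1/4]; coarse point J sits at fine point 2J."""
    nc = (len(r) - 1) // 2
    return [0.0] + [0.25 * r[2 * J - 1] + 0.5 * r[2 * J] + 0.25 * r[2 * J + 1]
                    for J in range(1, nc)] + [0.0]


def solve_coarse(rc, H):
    """Direct solve of the coarse residual problem (tridiagonal elimination)."""
    m = len(rc) - 2                              # number of interior unknowns
    diag = [2.0] * m
    rhs = [H * H * rc[J] for J in range(1, m + 1)]
    for i in range(1, m):                        # forward elimination
        w = -1.0 / diag[i - 1]
        diag[i] += w
        rhs[i] -= w * rhs[i - 1]
    v = [0.0] * m
    v[m - 1] = rhs[m - 1] / diag[m - 1]
    for i in range(m - 2, -1, -1):               # back substitution
        v[i] = (rhs[i] + v[i + 1]) / diag[i]
    return [0.0] + v + [0.0]


def interpolate(vc):
    """Linear interpolation from the coarse grid to the fine grid."""
    vf = [0.0] * (2 * (len(vc) - 1) + 1)
    for J in range(len(vc) - 1):
        vf[2 * J] = vc[J]
        vf[2 * J + 1] = 0.5 * (vc[J] + vc[J + 1])
    vf[-1] = vc[-1]
    return vf


def two_grid_cycle(u, f, h):
    gauss_seidel(u, f, h, sweeps=2)                          # pre-smoothing
    vc = solve_coarse(full_weighting(residual(u, f, h)), 2.0 * h)
    for j, v in enumerate(interpolate(vc)):                  # coarse-grid correction
        u[j] += v
    gauss_seidel(u, f, h, sweeps=1)                          # post-smoothing


n = 64
h = 1.0 / n
f = [0.0] * (n + 1)                                          # exact solution is u = 0
u = [math.sin(math.pi * j * h) + math.sin(40 * math.pi * j * h) for j in range(n + 1)]
u[0] = u[-1] = 0.0
errors = [max(abs(v) for v in u)]
for _ in range(5):
    two_grid_cycle(u, f, h)
    errors.append(max(abs(v) for v in u))
print(errors)
```

The initial guess mixes one smooth and one oscillatory mode; relaxation handles the oscillatory part and the coarse-grid correction handles the smooth part, so the error drops by a roughly constant factor per cycle, independent of $h$.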
One picture is worth a thousand words
Back to 2d Helmholtz : discretization error
Discrete wave number vs exact wave number:
$$|k^h| \approx k\left(1 + \frac{k^2h^2}{\gamma}\right), \quad 24 \le \gamma \le 48.$$
The total (accumulated) phase error is
$$E(kd, kh) = kd\,\frac{k^2h^2}{2\pi\gamma}.$$
For the discrete solutions to be accurate approximations to the differential
solutions, we need
$E(kd, kh) \ll 1$.
For fixed $k$, $d$ this means small $h$ and large problem size $N$.
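The phase-error formula translates directly into a mesh-size requirement. The sketch below solves $E(kd, kh) = \text{tolerance}$ for $h$; the default $\gamma = 24$ (the lower end of the quoted range), the tolerance 0.1 and the domain size $d = 1$ are assumptions made for illustration.

```python
import math


def phase_error(k, d, h, gamma=24.0):
    """Accumulated phase error E(kd, kh) = kd * (kh)^2 / (2*pi*gamma)."""
    return k * d * (k * h) ** 2 / (2.0 * math.pi * gamma)


def max_mesh_size(k, d, tolerance, gamma=24.0):
    """Largest h with phase_error(k, d, h) <= tolerance (formula solved for h)."""
    return math.sqrt(2.0 * math.pi * gamma * tolerance / (k ** 3 * d))


d = 1.0
for k in (20.0, 40.0, 80.0):
    h = max_mesh_size(k, d, tolerance=0.1)
    points_per_wavelength = 2.0 * math.pi / (k * h)
    print(k, h, points_per_wavelength)
```

Under this formula $h$ must shrink like $k^{-3/2}$, so the number of points per wavelength has to grow with $k$ rather than stay fixed.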
Bottlenecks for multigrid
Dominant phase discretization error;
Quality of standard coarse-grid and prolongation operators deteriorates on
coarse grids;
Poor multigrid relaxation: standard “fast” relaxation schemes diverge.
Lowest eigenmodes
[Figure: two plots over the unit square.] Lowest eigenmodes for $k = 6\pi$ and $k = 18\pi$.
More bottlenecks for multigrid
Oscillatory low eigenmodes;
There are many of them;
They are different from each other;
On sufficiently coarse grids, there is no single prolongation or coarse grid
operator of a reasonable sparsity that works for all such eigenmodes.
One-dimensional Helmholtz equation: slow to converge
components, aka lowest eigenmodes
[Figure: four line plots.] (a) a random initial error; (b)-(d) examples of remaining errors, obtained by
applying three multigrid V-cycles to the homogeneous Helmholtz equation
(periodic boundary conditions)
Assumption on the error
The error unreduced by standard multigrid:
$$e(x) = \sum_{|w| \approx k} \exp(iwx) = \sum_{w \approx k} \alpha_w \exp(iwx) + \sum_{w \approx k} \beta_w \exp(-iwx)$$
where the value of $w$ is determined by the relaxation history. One can drive it to
be
$(1 - \gamma_1)k \le w \le (1 + \gamma_2)k$,
where typical values are $\gamma_j \approx 0.3$
Dual representation
If we agreed on the previous page, we can also agree that
$e(x) = e_1(x)\exp(ikx) + e_2(x)\exp(-ikx)$,
where $e_1(x) = \sum_{w \approx k} \alpha_w \exp(i(w - k)x)$ and $e_2(x) = \sum_{w \approx k} \beta_w \exp(-i(w - k)x)$.
Obviously, the envelope functions $e_1$ and $e_2$ are smooth (compared to $\exp(\pm ikx)$).
Indeed, the highest frequencies are $|w - k| \approx \gamma_j k$.
Remark: The representation is not unique; one has to work to guarantee the
statement above.
The idea here is simple: let's try to approximate the smooth $e_1(x)$ and $e_2(x)$ instead of
the oscillatory $e(x)$ (which we cannot do well anyway) and hope for the best.
Defect correction
If we assume the error in the form
$e(x) = e_1(x)\exp(ikx) + e_2(x)\exp(-ikx)$,
then naturally
$$r(x) = Le(x) = L\bigl(e_1(x)\exp(ikx)\bigr) + L\bigl(e_2(x)\exp(-ikx)\bigr) = \exp(ikx)L_1 e_1(x) + \exp(-ikx)L_2 e_2(x)$$
Differential operators and residual representation
Two conclusions from the previous page:
Note that we use the differential Helmholtz operator and its kernel components,
resulting in differential equations for the envelope functions $e_1$ and $e_2$, with operators $L_1$ and $L_2$:
$L_1 u = u_{xx} + 2iku_x$
and $L_2 u = u_{xx} - 2iku_x$.
Trick! Operator $L$ has two kernel components ($\exp(\pm ikx)$) and so do $L_j$ ($1$ and
$\exp(\mp 2ikx)$). But we consider grids where only one (a constant) is visible!
Also, given the form of $e_j$ and $L_j$, we can claim that
$r(x) = r_1(x)\exp(ikx) + r_2(x)\exp(-ikx)$
where $r_j$ are smooth (as smooth as $e_j$ in fact).
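The ray operators follow from the product rule. The derivation below is a sketch that assumes the one-dimensional operator in the form $Lu = u'' + k^2u$ (the sign convention of the finite-difference slides) and smooth envelopes $e_1$, $e_2$:

```latex
\begin{aligned}
L\bigl(e_1(x)\,e^{ikx}\bigr)
  &= \frac{d^2}{dx^2}\bigl(e_1 e^{ikx}\bigr) + k^2 e_1 e^{ikx} \\
  &= \bigl(e_1'' + 2ik\,e_1' - k^2 e_1\bigr)e^{ikx} + k^2 e_1 e^{ikx} \\
  &= e^{ikx}\bigl(e_1'' + 2ik\,e_1'\bigr) = e^{ikx} L_1 e_1, \\
L\bigl(e_2(x)\,e^{-ikx}\bigr)
  &= e^{-ikx}\bigl(e_2'' - 2ik\,e_2'\bigr) = e^{-ikx} L_2 e_2 .
\end{aligned}
```

Setting $L_1 a = 0$ gives $a' = C e^{-2ikx}$, so the kernel of $L_1$ is spanned by $1$ and $e^{-2ikx}$ (and that of $L_2$ by $1$ and $e^{2ikx}$), which matches the kernel components named on the slide.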
From Waves to Rays
To reduce the irreducible wave error $e(x)$: $Le = r$,
we approximate two ray errors
$L_1 e_1 = r_1$
and $L_2 e_2 = r_2$
and then reconstruct the wave $e(x) = e_1(x)\exp(ikx) + e_2(x)\exp(-ikx)$.
Where is the gain?
- To compute the ray functions one should start on the grid with $kH \approx \pi$ (very coarse).
- To accelerate, one may use standard multigrid on each $L_j e_j = r_j$ if one wishes to go coarser.
Residual separation and why on kH ≈ π
Consider the wave (Helmholtz) residual with smooth $r_1$ and $r_2$:
$r(x) = r_1(x)\exp(ikx) + r_2(x)\exp(-ikx)$.
Clearly, such a representation is not unique, and other choices give
$r(x) = \tilde r_1(x)\exp(ikx) + \tilde r_2(x)\exp(-ikx)$
with oscillatory $\tilde r_j$.
We want to approximate smooth ray residuals (on coarse grids).
Separation (implementation considered on grids)
Compute the fine-grid wave residual $r^h = f^h - L^h u^h$ ($h$ is fine enough for a small
phase error, $kd(kh)^2 \ll 1$).
Transfer it to the grid with $kH_s \approx \pi/2$ using full weighting (which annihilates
physically oscillatory error components), resulting in
$r^{H_s} = r_1^{H_s}\exp(ikx) + r_2^{H_s}\exp(-ikx)$
To approximate $r_1$: multiply by $\exp(-ikx)$:
$\exp(-ikx)\,r^{H_s} = r_1^{H_s} + r_2^{H_s}\exp(-2ikx)$
Note that the first term is smooth and the second one is extremely oscillatory,
$2kH_s \approx \pi$. Applying one more full weighting results in the desired ray residual:
$$I_{H_s}^{2H_s}\bigl(\exp(-ikx)\,r^{H_s}\bigr) = I_{H_s}^{2H_s}\bigl(r_1^{H_s}\bigr) + I_{H_s}^{2H_s}\bigl(r_2^{H_s}\exp(-2ikx)\bigr) \approx r_1^{2H_s}$$
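The last separation step can be illustrated numerically. The sketch below works on a grid with $kH = \pi/2$, so the unwanted term $r_2\exp(-2ikx)$ alternates in sign from point to point and the $[1/4, 1/2, 1/4]$ full-weighting stencil nearly cancels it; the two smooth functions `r1` and `r2` are hypothetical test data, not quantities from the slides.

```python
import cmath
import math


def full_weighting(values):
    """[1/4, 1/2, 1/4] restriction; coarse point J sits at fine point 2J (interior only)."""
    return [0.25 * values[2 * J - 1] + 0.5 * values[2 * J] + 0.25 * values[2 * J + 1]
            for J in range(1, (len(values) - 1) // 2)]


H = 1.0
k = math.pi / (2.0 * H)                  # kH = pi/2, so 2kH = pi
n = 64
x = [j * H for j in range(n + 1)]


def r1(t):                               # hypothetical smooth ray residual
    return 2.0 + 0.5 * math.cos(2.0 * math.pi * t / n)


def r2(t):                               # hypothetical smooth ray residual
    return 1.0 - 0.3 * math.sin(2.0 * math.pi * t / n)


wave = [r1(t) * cmath.exp(1j * k * t) + r2(t) * cmath.exp(-1j * k * t) for t in x]
shifted = [cmath.exp(-1j * k * t) * w for t, w in zip(x, wave)]   # r1 + r2*exp(-2ikx)

separated = full_weighting(shifted)
exact = [r1(x[2 * J]) for J in range(1, n // 2)]
fw_error = max(abs(s - e) for s, e in zip(separated, exact))
injection_error = max(abs(shifted[2 * J] - r1(x[2 * J])) for J in range(1, n // 2))
print(fw_error, injection_error)
```

Plain injection would keep the full $r_2$ contribution, whereas full weighting leaves only a small remainder proportional to the second differences of $r_1$ and $r_2$.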
Ray operator
The discrete operators for the ray functions are derived from the differential
operators:
$L_1 u = u'' + 2iku' = f$
considered on the grid with $kH \approx \pi$.
The choice of discretization is defined by the principality of terms:
The second-order term: $\frac{1}{H^2}$
The first-order term: $\frac{2k}{H}$
If $kH \ge \pi$ the first-order term is principal and should be accurately
discretized;
Thus using staggered grids.
$$L_j^H = \left[\frac{1}{H^2} \qquad -\frac{2ik}{H} - \frac{1}{H^2} \qquad \frac{2ik}{H} - \frac{1}{H^2} \qquad \frac{1}{H^2}\right]$$
Wave-ray cycle
Wave cycle:
- Reduces all but problematic characteristic components;
- Produces residual for the ray cycles (on one of the fine grids with small phase error);
- Finest grid: $(kd)(k^2h^2) \ll 1$
- Coarsest grid: $kH > \pi$ (used to effectively wipe out constants and the like, stencil $[1 \quad (k^2H^2 - 2) \quad 1]$)
- Correction Scheme;
- GS: one pre- and one post-relaxation sweep on the fine grid and on the coarsest grid;
- Kaczmarz: four pre- and one post-relaxation sweeps everywhere else;
- Linear interpolation and full weighting;
Two ray cycles to reduce the characteristic components.
Ray cycles
Staggered grid, short central differences for ux ;
Marching scheme in the propagation direction;
CS or FAS multigrid scheme depending on interpretation;
Natural introduction of boundary conditions in terms of rays.
Before we proceed, detour to FAS
Full Approximation Scheme
Allows coarse grid approximation of the solution $u^H$ rather than the correction $e^H$;
Considers the full equation $L^H u^H = f^H$ rather than the residual equation $L^H e^H = r^H$;
Developed for solving nonlinear equations.
FAS details
Consider $\tilde u^h$, a current fine-grid solution, and $r^h = f^h - L^h \tilde u^h$, its corresponding
residual;
CS:
- Coarse grid variable: correction $v^H$
- Coarse grid system: $L^H v^H = R^H = I_h^H r^h$
- Coarse grid correction: $\tilde u^h_{NEW} = \tilde u^h + I_H^h v^H$
FAS:
- Coarse grid variable: solution $\tilde u^H = \tilde I_h^H \tilde u^h + v^H$
- Coarse grid system: $L^H u^H = f^H = L^H(\tilde I_h^H \tilde u^h) + R^H$
- Coarse grid correction: $\tilde u^h_{NEW} = \tilde u^h + I_H^h(\tilde u^H_{NEW} - \tilde I_h^H \tilde u^h)$
- If $L^H$ resolves all solution components, then $\tilde u^h_{NEW} = I_H^h(\tilde u^H_{NEW})$
τ -correction
Consider the coarse-grid equation as
$L^H u^H = f^H + \tau_h^H$
where
$f^H = I_h^H f^h$,
$\tau_h^H = L^H(\tilde I_h^H u^h) - I_h^H(L^h u^h)$
The term $\tau_h^H$ is a fine-to-coarse defect correction.
Nonlinear equations:
The correction equation is $L^h(\tilde u^h + v^h) - L^h \tilde u^h = r^h$.
On the coarse grid it reads: $L^H(\tilde I_h^H \tilde u^h + v^H) - L^H(\tilde I_h^H \tilde u^h) = I_h^H r^h$, which is exactly
the top equation.
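The $\tau$ form is the FAS coarse-grid system rearranged. The short derivation below is a sketch that assumes the coarse-grid residual is $R^H = I_h^H r^h$ with $r^h = f^h - L^h \tilde u^h$, and that $I_h^H$ is linear:

```latex
\begin{aligned}
L^H u^H &= L^H(\tilde I_h^H \tilde u^h) + I_h^H\bigl(f^h - L^h \tilde u^h\bigr) \\
        &= I_h^H f^h + \Bigl[L^H(\tilde I_h^H \tilde u^h) - I_h^H(L^h \tilde u^h)\Bigr] \\
        &= f^H + \tau_h^H .
\end{aligned}
```

When $L^H$ is linear, subtracting $L^H(\tilde I_h^H \tilde u^h)$ from both sides of the first line gives $L^H v^H = R^H$ with $v^H = u^H - \tilde I_h^H \tilde u^h$, which is the CS coarse-grid system; the two schemes differ only for nonlinear operators.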
Rays, directions and BC
Back to Boundary conditions
Assumptions:
Away from the source (near the boundary) the solution has a pure ray
character so can be described using ray operators;
The typical boundary conditions are the radiation BC, sometimes combined
with the Dirichlet boundary conditions.
The rays can enter the computational domain from infinity and leave the
computational domain freely unless there is an obstacle, media change, etc.
The entering part is defined by the Dirichlet boundary conditions at
the entrance (for each ray); the exiting part is controlled by the Sommerfeld
boundary conditions at the exit (the latter can be replaced by an interior
equation, appropriately discretized).
If no rays (waves) enter from infinity, the Dirichlet boundary conditions are
homogeneous.
Full ray formulation (for two-sided Sommerfeld BC)
Exiting Sommerfeld BC: at $x = 0$: $u' + 2iku = 0$, and at $x = d$: $u' - 2iku = 0$
Positively propagating ray (corresponding to $\exp(ikx)$):
$L_1 a_1(x) = a_1''(x) + 2ik\,a_1'(x) = f_1(x)$,
$a_1(0) = a_{1,0}$,
$a_1'(d) = 0$
Negatively propagating ray (corresponding to $\exp(-ikx)$):
$L_2 a_2(x) = a_2''(x) - 2ik\,a_2'(x) = f_2(x)$,
$a_2(d) = a_{2,0}$,
$a_2'(0) = 0$
The two systems are solved independently;
If there is a Dirichlet BC on one side, the two have to be combined (the incident
wave defines the amplitude of the reflected wave);
Wave-ray interaction
Because of BC the ray cycles are implemented in FAS (solution
representation);
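In the interior, regular FAS correction:
$$\tilde u^h_{NEW} = \tilde u^h + \exp(ikx)\bigl(I_H^h(\tilde a^H_{1,NEW} - \tilde a^H_1)\bigr) + \exp(-ikx)\bigl(I_H^h(\tilde a^H_{2,NEW} - \tilde a^H_2)\bigr);$$
At the boundary, direct replacement:
$$\tilde u^h_{NEW} = \exp(ikx)\bigl(I_H^h \tilde a^H_{1,NEW}\bigr) + \exp(-ikx)\bigl(I_H^h \tilde a^H_{2,NEW}\bigr);$$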
Conclusions on 1d
The solver is a combination of the wave and the ray approaches;
The complexity is similar to that for Laplace: $O(N)$.
Worked for jumping and slightly varying wave numbers;
Wave-Ray algorithm
Geometric solver for the 2D Helmholtz equation with constant wave numbers
Brandt, Livshits: Wave-ray multigrid method solving standing wave equations, 1997
Livshits, Brandt: Accuracy Properties of the Wave-Ray Multigrid Algorithm for
Helmholtz Equations, 2006
Special feature: Separate treatment of the oscillatory kernel $e^{i\mathbf{k}\cdot\mathbf{x}}$, $|\mathbf{k}| = k$, on coarse
grids.
Directions of propagation $e^{i\mathbf{k}\cdot\mathbf{x}}$
Discretization of the propagation directions: a few chosen
directions $\mathbf{k}_\ell$
Wave-ray approach
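Components not changed (at best) by the standard multigrid:
$\exp(i\mathbf{w}\cdot\mathbf{x})$, $|\mathbf{w}| \approx k$, $\mathbf{w}, \mathbf{x} \in \mathbb{R}^2$
Ray representation for such components $u(\mathbf{x})$ and residual $r(\mathbf{x})$:
$$u(\mathbf{x}) = \sum_\ell u_\ell(\mathbf{x})\exp(i\mathbf{k}_\ell\cdot\mathbf{x}) \quad\text{and}\quad r(\mathbf{x}) = \sum_\ell r_\ell(\mathbf{x})\exp(i\mathbf{k}_\ell\cdot\mathbf{x})$$
Instead of the oscillatory $u(\mathbf{x})$, compute the smooth $u_\ell(\mathbf{x})$;
Differential operators for $u_\ell(\mathbf{x})$ are obtained directly from $L$.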
Phase error: Wave vs Rays
Waves (as discussed): $E(kd, kh) \approx kd\,\dfrac{k^2h^2}{2\pi\gamma}$, $24 \le \gamma \le 48$
Rays: $E(kd, L) \approx kd\,\dfrac{\beta}{2\pi L^2}$, $\beta \approx 0.12$
This is assuming very aggressive coarsening with $L_n = 2^{n+2}$:
- Propagation direction $\xi$: $h_\xi^n = (L_n)^2/(16k)$;
- Orthogonal direction $\eta$: $h_\eta^n = CL_n/(4k)$
- For $L_1 = 8$, $kh_\xi \approx 4$ and $kh_\eta \approx 2$
Main features
Smaller wave computational domain;
Larger ray domains; grids aligned with propagation directions;
Aggressive ray coarsening;
Ray equations $a_{\xi\xi} + a_{\eta\eta} + 2ik\,a_\xi$, $\xi = \mathbf{k}_\ell$
Ray residual separation using consecutive full weighting;
FAS for both waves and rays - boundary values for waves come directly from
rays;
More about rays
Algorithm
Repeat until happy:
- CS standard wave V cycle
- FAS ray cycle:
  - computation of wave residual, $kh \ll 1$;
  - separation into ray residuals, starting at $kh \approx 1$;
  - forming FAS ray rhs, line relaxation, $kh_\xi \approx 4$, $kh_\eta \approx 2$;
  - reconstructing to the wave form, $kh \approx 1$;
  - interpolating (with relaxation) to the finest wave grid, $kh \ll 1$.
Requirements and advantages
Requirements
Helmholtz PDE
Structured grids
Analytical knowledge of Lu ≈ 0
Advantages
Local processes - wave representation on fine grids
Geometrical optics processes - ray representation on coarse grids and larger
domains
Natural intro of radiation boundary conditions