Parallel solution of the Helmholtz equation with large wave numbers
Dan Gordon, Computer Science, University of Haifa
Rachel Gordon, Aerospace Eng., Technion
July 1, 2010
Parallel solution of the Helmholtz equation
1
Outline of Talk
• The Kaczmarz algorithm (KACZ)
• KACZ → CARP (Component-Averaged Row Projections)
• Applications of CARP
• CARP-CG: CG acceleration of CARP
• Sample results with the Helmholtz equation
KACZ: The Kaczmarz algorithm
• Iterative method, due to Kaczmarz (1937). Rediscovered for CT as ART
• Geometric algorithm: consider the hyperplane defined by each equation
• Start from an arbitrary initial point
• Successively project current point onto the next hyperplane, in cyclic order
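In symbols, a single projection step can be written as follows. This is a sketch in notation that is not on the slide: $a_i$ is assumed to denote the i-th row of the system matrix (as a column vector), $b_i$ the i-th right-hand side entry, and $x$ the current point.

```latex
x \;\leftarrow\; x + \frac{b_i - \langle a_i, x \rangle}{\|a_i\|_2^2}\, a_i ,
\qquad i = 1, 2, \ldots, m, 1, 2, \ldots
```

After the step, the new point satisfies equation i exactly, since $\langle a_i, x \rangle$ becomes $b_i$.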
KACZ: Geometric Description
[Figure: successive projections, starting from the initial point, onto the hyperplanes labeled eq. 1, eq. 2 and eq. 3]
KACZ with Relaxation Parameter
• KACZ can be used with a relaxation parameter w
• w=1: project exactly on the hyperplane
• w<1: project in front of hyperplane
• w>1: project beyond the hyperplane
• Cyclic relaxation: eq. i is assigned a relaxation parameter w_i
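The relaxed projections above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `kaczmarz_sweep` and the dense NumPy storage are assumptions made here. Passing a scalar `w` gives a fixed relaxation parameter, and passing one value per equation gives cyclic relaxation.

```python
import numpy as np

def kaczmarz_sweep(A, b, x, w=1.0):
    """One cyclic sweep of relaxed Kaczmarz projections over all equations.

    w may be a scalar (fixed relaxation) or a sequence with one value per
    equation (cyclic relaxation). Returns a new vector; x is not modified.
    """
    x = np.array(x, dtype=float)
    m = A.shape[0]
    w = np.broadcast_to(np.asarray(w, dtype=float), (m,))
    for i in range(m):
        a_i = A[i]
        residual = b[i] - a_i @ x
        # w[i] = 1 lands on the hyperplane, < 1 stops short, > 1 overshoots.
        x += w[i] * residual / (a_i @ a_i) * a_i
    return x

# Small consistent 2x2 example (hypothetical data): solution is (0.8, 1.4).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.zeros(2)
for _ in range(100):
    x = kaczmarz_sweep(A, b, x, w=1.0)
print(x)
```

Repeating the sweep drives the iterate toward the intersection of the hyperplanes when the system is consistent and 0<w<2.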
Convergence Properties of KACZ
• KACZ with relaxation (0<w<2) converges for consistent systems:
– Herman, Lent & Lutz, 1978
– Trummer, 1981
• For inconsistent systems, KACZ converges cyclically:
– Tanabe, 1971
– Eggermont, Herman & Lent, 1981 (for cyclic relaxation parameters).
Algebraic formulation of KACZ
• Given the system $Ax = b$
• Consider the "normal equations" system $AA^T y = b$, $x = A^T y$
• Well-known fact: KACZ is SOR applied to the normal equations
• The relaxation parameter of KACZ is the usual relaxation parameter of SOR
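The equivalence can be checked numerically on a small random system. The sketch below is illustrative only: the function names are hypothetical, and the dense product `A @ A.T` is formed purely for the comparison (KACZ itself never needs it). The key identity is that row i of $AA^T$ times y equals $\langle a_i, x \rangle$ when $x = A^T y$.

```python
import numpy as np

def kaczmarz_sweep(A, b, x, w):
    """One forward sweep of relaxed Kaczmarz projections on A x = b."""
    x = x.copy()
    for i in range(A.shape[0]):
        x += w * (b[i] - A[i] @ x) / (A[i] @ A[i]) * A[i]
    return x

def sor_sweep_normal_equations(A, b, y, w):
    """One forward SOR sweep on (A A^T) y = b, using updated entries of y."""
    B = A @ A.T
    y = y.copy()
    for i in range(B.shape[0]):
        y[i] += w * (b[i] - B[i] @ y) / B[i, i]
    return y

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
b = rng.standard_normal(4)
y0 = rng.standard_normal(4)
w = 1.3

x_kacz = kaczmarz_sweep(A, b, A.T @ y0, w)                # sweep in x-space
x_sor = A.T @ sor_sweep_normal_equations(A, b, y0, w)     # sweep in y-space
print(np.allclose(x_kacz, x_sor))
```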
Block Mode & Parallelization
• Block KACZ: projection onto affine subspace defined by a block of eqns
• Block-sequential KACZ:
– partition eqns into blocks
– each block consists of independent eqns
– iterate over blocks
– in each block, perform projections in parallel
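A possible reading of the block-sequential scheme is sketched below. It assumes that "independent" means equations sharing no variables, which makes their rows mutually orthogonal; the slide does not spell this out. The greedy grouping and both function names are hypothetical choices made for the example.

```python
import numpy as np

def independent_blocks(A):
    """Greedily group row indices so that rows in a block share no nonzero column."""
    blocks = []  # list of (row_indices, columns_used_so_far)
    for i, row in enumerate(A):
        cols = set(np.flatnonzero(row))
        for rows, used in blocks:
            if used.isdisjoint(cols):
                rows.append(i)
                used |= cols
                break
        else:
            blocks.append(([i], set(cols)))
    return [rows for rows, _ in blocks]

def block_sequential_sweep(A, b, x, blocks, w=1.0):
    """Iterate over blocks; inside a block all projections use the same x."""
    x = np.array(x, dtype=float)
    for rows in blocks:
        Ab = A[rows]
        # Rows in a block are mutually orthogonal, so their projections do not
        # interact and can be computed together (the parallel step).
        coeff = w * (b[rows] - Ab @ x) / np.sum(Ab * Ab, axis=1)
        x += Ab.T @ coeff
    return x

A = np.array([[2.0, -1.0, 0.0, 0.0],
              [-1.0, 2.0, -1.0, 0.0],
              [0.0, -1.0, 2.0, -1.0],
              [0.0, 0.0, -1.0, 2.0]])
b = np.ones(4)
blocks = independent_blocks(A)
x = block_sequential_sweep(A, b, np.zeros(4), blocks)
print(blocks, x)
```

Because the rows of a block are orthogonal, the simultaneous update gives the same result as projecting onto those equations one after another.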
CARP: Component-Averaged Row Projections
• A block-parallel version of KACZ
• The equations are divided into blocks (not necessarily disjoint)
• Initial estimate: vector x=(x1,…,xn)
• Suppose x1 is a variable (component of x) that appears in 3 blocks
• x1 is “cloned” as y1, z1, t1 in the different blocks.
• Perform one (or more) KACZ iteration(s) on each block (independently, in parallel)
CARP – Explanation (cont)
• The internal iterations in each block produce 3 new values for the clones of x1: y1’, z1’, t1’
• The next iterative value of x1 is x1’ = (y1’ + z1’ + t1’)/3
• The next iterate is x’ = (x1’, ..., xn’)
• Repeat iterations as needed for convergence
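The cloning, block sweeps and averaging can be sketched as a serial Python loop. This is an illustration under stated assumptions, not the authors' implementation: the name `carp_iteration`, the dense storage, and the loop over blocks standing in for separate processors are all hypothetical. Each component is averaged over exactly those blocks in which it appears.

```python
import numpy as np

def carp_iteration(A, b, x, blocks, w=1.0, inner_sweeps=1):
    """One CARP step: KACZ sweeps on each block's clone of x, then averaging."""
    n = A.shape[1]
    total = np.zeros(n)
    count = np.zeros(n)
    for rows in blocks:                    # each block could run on its own processor
        Ab, bb = A[rows], b[rows]
        clone = x.copy()                   # the block's private clone of x
        for _ in range(inner_sweeps):
            for a_i, b_i in zip(Ab, bb):
                clone += w * (b_i - a_i @ clone) / (a_i @ a_i) * a_i
        used = np.any(Ab != 0, axis=0)     # variables that appear in this block
        total[used] += clone[used]
        count[used] += 1
    x_new = x.copy()
    shared = count > 0
    x_new[shared] = total[shared] / count[shared]   # component averaging
    return x_new

# Hypothetical test system: 6 equations split into 2 blocks of 3.
n = 6
A = 4 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
x_true = np.arange(1.0, n + 1)
b = A @ x_true
x = np.zeros(n)
for _ in range(1000):
    x = carp_iteration(A, b, x, [[0, 1, 2], [3, 4, 5]])
print(np.max(np.abs(x - x_true)))
```

In this example the two blocks share variables 3 and 4 (1-based), so only those components are actually averaged; all others are taken from the single block that uses them.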
CARP as Domain Decomposition
[Figure: two grid domains, A and B, with labeled points x0, x1 and y1; x1 is an external grid point of A, and y1 is the clone of x1]
Note: domains may overlap
Overview of CARP
[Diagram: KACZ iterations in domain A and in domain B, followed by averaging and cloning; the whole cycle is KACZ in superspace (with cyclic relaxation)]
Convergence of CARP
• Averaging Lemma: the component-averaging and cloning operations of CARP are equivalent to KACZ row-projections in a certain superspace (with w=1)
• CARP is equivalent to KACZ in the superspace, with cyclic relaxation parameters – known to converge
CARP Application: Solution of stiff linear systems from PDEs
• Elliptic PDEs w/large convection term result in stiff linear systems (large off-diagonal elements)
• CARP is very robust on these systems, as compared to leading solver/preconditioner combinations
• Downside: Not always efficient
CARP Application: Electron Tomography (joint work with J.-J. Fernández)
• 3D reconstructions: Each processor is assigned a block of consecutive slices. Data is in overlapping blobs.
• The blocks are processed in parallel.
• The values of shared variables are transmitted between the processors which share them, averaged, and redistributed.
CARP-CG: CG acceleration of CARP
• CARP is KACZ in some superspace (with cyclic relaxation parameters)
• Björck & Elfving (BIT 79): developed CGMN, which is a (sequential) CG-acceleration of KACZ (double sweep, fixed relax. parameter)
• We extended this result to allow cyclic relaxation parameters
• Result: CARP-CG
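The idea behind CGMN (CG applied to the double-sweep KACZ operator) can be sketched as follows. This is a rough, real-valued, single-process illustration with a fixed relaxation parameter, not the authors' CARP-CG code. It relies on the fact that one forward-plus-backward sweep maps x to Qx + Rb with Q symmetric, so CG can be applied to (I - Q)x = Rb. The function names are hypothetical, and complex systems such as Helmholtz would need conjugated inner products, which are not shown.

```python
import numpy as np

def double_sweep(A, b, x, w):
    """Forward then backward relaxed KACZ sweep; returns Q x + R b."""
    x = x.copy()
    m = A.shape[0]
    for i in list(range(m)) + list(range(m - 1, -1, -1)):
        a_i = A[i]
        x += w * (b[i] - a_i @ x) / (a_i @ a_i) * a_i
    return x

def cgmn(A, b, w=1.0, tol=1e-10, max_iter=500):
    """CG on (I - Q) x = R b, with Q and R defined implicitly by double_sweep."""
    zero_rhs = np.zeros_like(b, dtype=float)
    x = np.zeros(A.shape[1])
    r = double_sweep(A, b, x, w) - x               # residual R b - (I - Q) x
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) < tol:
            break
        q = p - double_sweep(A, zero_rhs, p, w)    # (I - Q) p
        alpha = rr / (p @ q)
        x += alpha * p
        r -= alpha * q
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# Hypothetical nonsymmetric test system with a known solution.
n = 8
A = 4 * np.eye(n) - 2 * np.eye(n, k=1) - np.eye(n, k=-1)
x_true = np.linspace(1.0, 2.0, n)
b = A @ x_true
x = cgmn(A, b, w=1.5)
print(np.max(np.abs(x - x_true)))
```

Note that A itself never has to be symmetric or positive definite; only row access to A is needed, which is what makes the approach attractive for the systems discussed in this talk.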
CARP-CG: Properties
• Same robustness as CARP
• Very significant improvement in performance on stiff linear systems derived from elliptic PDEs
• Very competitive runtime compared to leading solver/preconditioner combinations on systems derived from convection-dominated PDEs
• Improved performance in electron tomography (ET)
CARP-CG: Properties
• On one processor, CARP-CG is identical to CGMN
• Particularly useful on systems with LARGE off-diagonal elements
– example: convection-dominated PDEs
• Discontinuous coefficients are handled without requiring domain decomposition (DD)
Robustness of CARP-CG
• KACZ inherently normalizes the eqns
• After normalization, the diagonal elements of $AA^T$ are larger than the off-diagonal ones (in each row)
• This is not diagonal dominance, but it makes the normal eqns manageable
• Normalization was also found to be useful for discontinuous coefficients
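Both statements about normalization can be checked on a small example. The sketch below uses made-up data with badly scaled rows; the helper names `normalize_rows` and `project` are hypothetical. It shows that a KACZ projection gives the same point whether or not the equation is first divided by the norm of its row, and that after normalization $AA^T$ has a unit diagonal with off-diagonal entries of magnitude below 1 (by the Cauchy-Schwarz inequality, unless two rows are parallel).

```python
import numpy as np

def normalize_rows(A, b):
    """Divide each equation by the Euclidean norm of its row."""
    norms = np.linalg.norm(A, axis=1)
    return A / norms[:, None], b / norms

def project(a_i, b_i, x):
    """Exact (w = 1) KACZ projection of x onto the hyperplane <a_i, x> = b_i."""
    return x + (b_i - a_i @ x) / (a_i @ a_i) * a_i

rng = np.random.default_rng(1)
scales = np.array([1e-3, 1.0, 1e3, 10.0, 0.1])      # deliberately bad row scaling
A = rng.standard_normal((5, 5)) * scales[:, None]
b = rng.standard_normal(5)
An, bn = normalize_rows(A, b)

x0 = rng.standard_normal(5)
same_point = np.allclose(project(A[2], b[2], x0), project(An[2], bn[2], x0))

G = An @ An.T                                        # normal-equations matrix
off_diag = np.abs(G - np.diag(np.diag(G)))
print(same_point, np.diag(G), off_diag.max())
```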
The Helmholtz Equation
• Eqn: $-\Delta u - k^2 u = f$
• Wave length: $\lambda = 2\pi/k$
• No. of grid pts per $\lambda$: $N_g = 2\pi/(kh)$
• Shifted Laplacian approach:
– Bayliss, Goldstein & Turkel, 1983
– Erlangga, Vuik & Oosterlee, 2004/06
$-\Delta u - (1 - i\beta)k^2 u = f$
uses multigrid to solve the PC (PC = preconditioner)
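The relation between wave number, grid points per wavelength and mesh size can be turned into a small helper. This is a sketch: h is taken to be the grid spacing (not defined explicitly on the slide), and the function name `grid_spacing` is hypothetical.

```python
import math

def grid_spacing(k, ng):
    """Mesh size h that gives ng grid points per wavelength 2*pi/k."""
    return 2.0 * math.pi / (k * ng)

h = grid_spacing(600, 8)
print(h, 1.0 / h)
```

For k = 600 and 8 grid points per wavelength, 1/h is a little under 764, which is in line with the 763 × 763 grid reported for Problem 2 in the timing results of this talk. How the authors rounded the grid size is not stated on the slides.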
The Helmholtz Equation
• Bollhöfer, Grote & Schenk, 2009: introduced algebraic multilevel PC for the Helmholtz eqn in heterogeneous media. Uses symmetric max weight matchings and an inverse-based pivoting method.
• Apologies to many other contributors to this problem!
Experiments
• CARP-CG was used with a fixed relaxation parameter of 1.7 in all cases
• Domain: unit square [0,1] × [0,1]
• 2nd order central difference scheme
Problem 1 (with analytic sol'n)
Based on Erlangga et al '04, §6.1
• Eqn: $(\Delta + k^2)u = (k^2 - 5\pi^2)\sin(\pi x)\sin(2\pi y)$
• bndry condition: u=0 on all sides
• Analytic solution: $u = \sin(\pi x)\sin(2\pi y)$
• Grid points per $\lambda$: $N_g$ = 6, 8, 10, 12
• No. of processors: 1 – 32
• k = 100, 300
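A second-order central-difference discretization of Problem 1 can be sketched and checked against the analytic solution. The values below (k = 5 on a 30 × 30 interior grid) are hypothetical and far smaller than the k = 100, 300 used in the talk; the dense matrix and direct solve are for illustration only and stand in for CARP-CG. The function name `helmholtz_matrix` is an assumption of this example.

```python
import numpy as np

def helmholtz_matrix(n, k):
    """5-point central-difference matrix for (Laplacian + k^2) on the unit
    square, with n x n interior points and u = 0 on the boundary."""
    h = 1.0 / (n + 1)
    T = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / h**2     # 1D second difference
    I = np.eye(n)
    return np.kron(I, T) + np.kron(T, I) + k**2 * np.eye(n * n), h

n, k = 30, 5.0
A, h = helmholtz_matrix(n, k)
pts = h * np.arange(1, n + 1)                      # interior grid coordinates
X, Y = np.meshgrid(pts, pts, indexing="ij")
u_exact = np.sin(np.pi * X) * np.sin(2 * np.pi * Y)
f = (k**2 - 5 * np.pi**2) * u_exact                # right-hand side of Problem 1
u = np.linalg.solve(A, f.ravel()).reshape(n, n)
max_error = np.max(np.abs(u - u_exact))
print(max_error)
```

The error shrinks as the grid is refined, as expected for a second-order scheme. At large k the matrix is indefinite and far larger, which is the regime the talk targets.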
Problem 2 (homogeneous)
Based on Erlangga et al '04, §6.2
• Eqn: $\Delta u + k^2 u = 0$
• Domain: unit square [0,1] × [0,1]
• Dirichlet bndry cond. on one side, with a discontinuity at midpoint
• 1st-order absorbing bndry cond. on other sides
• Grid points per $\lambda$: $N_g$ = 6, 8, 10
• No. of processors: 1 – 32
• k = 75, 150, 300, 450, 600
Problem 3 (heterogeneous)
3-layer heterogeneous problem
• Based on Erlangga et al '04, §6.3
• Everything is identical to Problem 2
• EXCEPT: k differs between the three layers
[Figure: three-layer domain with k=600, k=450 and k=300]
Comparative Timing Results
Time/iter of Bi-CGSTAB and GMRES relative to CARP-CG
it-ratio = (time/iter of algorithm) / (time/iter of CARP-CG)

Method               time/iter  time/iter  it-ratio  it-ratio
                     1 proc     16 proc    1 proc    16 proc
CARP-CG              0.0978     0.0134     1.00      1.00
Bi-CGSTAB            0.1477     0.0344     1.51      2.56
GMRES (restart=10)   0.1490     0.0212     1.52      1.58

Results from CARP-CG paper (PARCO, 2010)
Timing and Speedup Results
Problem 2, k=600, Ng=8, grid: 763 × 763
582,169 (complex) equations, rel-res < $10^{-7}$

No. proc  No. Iter  Time (s)  Speed-up  Efficiency
1         7025      1256      1.00      100%
2         7039      822       1.53      76.4%
4         7066      462       2.72      68.0%
8         7115      255       4.92      61.5%
15        7206      159       7.92      52.8%
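The speed-up and efficiency columns can be recomputed from the listed times. The sketch assumes the usual definitions, which the slide does not state: speed-up = (time on 1 processor) / (time on p processors), and efficiency = speed-up / p. The function name is hypothetical.

```python
def speedup_and_efficiency(t1, tp, p):
    """Return (speed-up, efficiency) for a run on p processors."""
    s = t1 / tp
    return s, s / p

# Times in seconds, copied from the table above.
times = {1: 1256, 2: 822, 4: 462, 8: 255, 15: 159}
for p, t in times.items():
    s, e = speedup_and_efficiency(times[1], t, p)
    print(f"{p:2d} procs: speed-up {s:.2f}, efficiency {100 * e:.1f}%")
```

Since the listed times are rounded to whole seconds, the recomputed values can differ from the table in the last digit.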
Summary
• CARP-CG is highly scalable on the Helmholtz eqn w/high wave numbers
• Applicable to discontinuous coefficients
• Very simple to implement
• General-purpose – useful also for other problems with large off-diagonal elements and discontinuous coefficients
Other Potential Applications
• Fourth-order schemes for the Helmholtz equation (already have good initial results)
• Maxwell equations
• Saddle-point problems
• Circuit problems
• Linear solvers in some eigenvalue methods
• ...
Relevant Publications
http://cs.haifa.ac.il/~gordon/pub.html
CARP: SIAM J Sci Comp 2005
CGMN: ACM Trans Math Software 2008
Microscopy: J Parallel & Distr Comp 2008
Large convection + discontin coef: CMES 2009
CARP-CG: Parallel Comp 2010
Scaling for discont coef: J Comp & Appl Math 2010
CARP-CG SOFTWARE AVAILABLE ON REQUEST
THANK YOU!