Distributed Computation of Wave Propagation Models Using PVM
Richard E. Ewing
Texas A&M University
Robert C. Sharpley
University of South Carolina
Derek Mitchum and Patrick O'Leary
University of Wyoming
James S. Sochacki
James Madison University
The Parallel Virtual Machine lets researchers create a powerful, inexpensive parallel system on which they can solve large, sophisticated problems such as simulating the propagation of seismic waves.
Although MIMD, SIMD, shared-memory, and other emerging supercomputers can attack most of today's large-scale computing problems, these machines are inaccessible to the average researcher. But any researcher with accounts on multiple Unix workstations can corral unused CPU cycles to solve large problems by using a distributed software utility called the Parallel Virtual Machine, developed by Emory University, the University of Tennessee, and Oak Ridge National Laboratory (see the sidebar).1,2
PVM lets researchers connect workstations, mini-supercomputers, or
specialty machines to form a relatively inexpensive, powerful, parallel
computer. Such hardware is frequently abundant at research locations,
so PVM incurs little or no hardware costs. Also, PVM is flexible: It uses
existing communication networks (Ethernet or fiber) and remote procedural libraries; it lets programmers use either C or Fortran; and it can
emulate several commercial architectures including hypercubes, meshes,
and rings. We believe that PVM can compete effectively with traditional
supercomputers, and we have demonstrated its computational power and
cost-effectiveness by simulating the propagation of seismic waves using
an isolated Ethernet ring comprising an IBM RS/6000 550 as the host
and six RS/6000 320H's as the nodes.
Model equations and numerical method
Geophysicists determine the earth's substructure by producing vibrations (through controlled explosions or vibroseis trucks) at or near the earth's surface. They are particularly interested in the density, sound speed, and Lamé parameters (describing the earth's elastic properties) of the materials composing the section of the earth surrounding the explosion site. Typical measurements include pressure distribution at the earth's surface caused by the explosion (the pressure seismogram) and the vertical displacement of the earth's surface (the displacement seismogram). Geophysicists use an acoustic wave equation to simulate a pressure seismogram, and an elastic wave equation to simulate a displacement seismogram. From these they determine the substructure's characteristics. (These wave model equations can also solve problems in medical imaging, sonar, and nondestructive testing of materials.)
Determining a wave source's effects on a specified substructure is called the forward problem, while determining the substructure and its parameters is the inverse problem. We address the forward problem here. We are dealing with a 2D problem, so $x_1$ or $x$ represents distance along the earth's surface, and $x_2$ or $z$ represents depth into the earth. The forward acoustic problem consists of solving the following equation, given $\rho_0$, $c$, $F$, and $S$:

$$u_{tt} = c^2 \rho_0 \, \nabla \cdot \Bigl( \tfrac{1}{\rho_0} \nabla u \Bigr) + R(x, t)$$

If $p = -u_t$ and $\vec{v} = (1/\rho_0)\nabla u + \vec{G}$, where $R(x,t) = c^2 \rho_0 \, \nabla \cdot \vec{G}$ and $\vec{G}_t = (1/\rho_0)F$, then $p$ and $\vec{v}$ solve Euler's equations for pressure and velocity. To that equation we add the earth's surface condition $p(x_1, 0, t) = -u_t(x_1, 0, t) = S(x_1, t)$, where $S$ is a surface excitation ($S = 0$ if there is no surface source).
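As a quick check of that equivalence (a sketch using only the definitions above, not a step spelled out in the original article), differentiating the two substitutions recovers the linear acoustic system:

```latex
% Given  p = -u_t,  \vec{v} = (1/\rho_0)\nabla u + \vec{G},  \vec{G}_t = (1/\rho_0)F:
\vec{v}_t = \tfrac{1}{\rho_0}\nabla u_t + \vec{G}_t
          = -\tfrac{1}{\rho_0}\nabla p + \tfrac{1}{\rho_0}F
% and, using the wave equation together with R = c^2\rho_0\,\nabla\cdot\vec{G}:
p_t = -u_{tt}
    = -c^2\rho_0\,\nabla\cdot\bigl(\tfrac{1}{\rho_0}\nabla u\bigr) - R
    = -c^2\rho_0\,\nabla\cdot\vec{v}
```

These are Euler's equations for velocity (momentum balance with body force $F$) and pressure (mass balance), as claimed.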
The forward elastic problem consists of solving the following equations, given $\lambda$, $\mu$, $\rho$, $F_1$, $F_2$, $S_1$, and $S_2$:

$$\rho u_{tt} = \frac{\partial}{\partial x}\Bigl[(\lambda + 2\mu)u_x + \lambda w_z\Bigr] + \frac{\partial}{\partial z}\Bigl[\mu(u_z + w_x)\Bigr] + F_1$$

$$\rho w_{tt} = \frac{\partial}{\partial x}\Bigl[\mu(u_z + w_x)\Bigr] + \frac{\partial}{\partial z}\Bigl[\lambda u_x + (\lambda + 2\mu)w_z\Bigr] + F_2$$

where $\rho = \rho(x,z)$ is the equilibrium density; $\lambda = \lambda(x,z)$ and $\mu = \mu(x,z)$ are the Lamé parameters; $\alpha = \sqrt{(\lambda + 2\mu)/\rho}$ and $\beta = \sqrt{\mu/\rho}$ are the P and S wave velocities, respectively; $u$ is the horizontal particle displacement; $w$ is the vertical particle displacement; and $F_1$, $F_2$ are the interior sources. Free-surface boundary conditions describe the earth's surface:

$$\mu(u_z + w_x)\big|_{z=0} = S_1(x,t), \qquad \Bigl[\lambda u_x + (\lambda + 2\mu)w_z\Bigr]\big|_{z=0} = S_2(x,t)$$

where $S_1$ and $S_2$ are the surface excitation sources.

PVM components

PVM has two primary components: controlling daemons and a procedural library. The controlling daemon PVMD institutes distributed control by requiring each processing unit in the distributed calculation to execute its own copy of PVMD. Each processing unit thereby absorbs any master/slave overhead. As the controlling daemons exchange information, a resident look-up table of enrolled subprocesses enables interprocessor communication. PVMD also facilitates point-to-point data transfer, message broadcasting, mutual exclusion, process control, shared-memory emulation, and barriers.

The set of simple subroutine calls in the procedural library lets programmers interact with PVMD in a relatively transparent manner. Therefore, parallelizing an application requires few subroutine calls and provides flexibility.
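To make the daemon/library interplay concrete, here is a minimal sketch in C of how a task enrolls with its local PVMD and discovers whether it is the host or a spawned node. It uses the PVM 3 C interface (pvm_mytid, pvm_parent, pvm_exit); the article's own code may well target the earlier PVM 2 interface, whose calls differ.

```c
/* Minimal enrollment sketch using the PVM 3 C library; the article's
   code may use the earlier PVM 2 interface instead. */
#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    int mytid = pvm_mytid();     /* enroll this process with the local PVMD */
    int ptid  = pvm_parent();    /* tid of the task that spawned us, if any */

    if (ptid == PvmNoParent)
        printf("host task, tid = t%x\n", mytid);
    else
        printf("node task, tid = t%x, spawned by t%x\n", mytid, ptid);

    pvm_exit();                  /* detach from the virtual machine */
    return 0;
}
```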
Our model of the earth has idealized curves of discontinuity for the density, sound speed, and Lamé parameters that describe the interfaces between layered media. Geophysicists use the inverse problem to locate these interface curves and determine the layers' parameters. For the forward problem, these interfaces and parameters are specified.
We could use many numerical methods to solve the forward problem, including finite-difference, finite-element, Fourier, and pseudospectral. Each has strengths and weaknesses. We chose the finite-difference method,3,4 which gives discrete difference equations for each point in the region of interest and integrates the equations at each spatial grid point. We use centered differences to keep second-order accuracy over time. Integration forces continuity of pressure and normal velocity at the interfaces in the acoustic wave equation, and of the particle displacements, normal stresses, and tangential stresses at the interfaces in the elastic wave equation. This method is naturally parallel, because the integration scheme is uniform at each node (grid point) and may be handled independently.
Wave propagation models

Figure A1 shows an acoustic model of a salt dome lying between 500 and 1,000 meters under three layers of different homogeneous materials that are separated at 200 meters and 400 meters. The model dimensions are 1,000 x 1,000 meters; sound speed values vary from 1,000 to 5,000 meters per second, and density varies from 1,000 to 5,500 kilograms per cubic meter. The source is a surface explosion set off at 400 meters, with frequencies from 3-7 Hz. Figure A2 shows pressure distribution (wave propagation) at 0.2, 0.3, 0.4, and 0.5 seconds. The remaining parameters for the acoustic model are dt = 0.001 second, dx = dz = 10 meters.

[Figure A. Acoustic wave simulation.]

Figure B1 shows the elastic model, which is the same as the acoustic model except that a fluid-saturated dome lies from 600 to 800 meters and there is an outcropping at the surface. The dome demonstrates S wave generation (not present in acoustic simulations), while the slanted interface at the surface shows the importance of accurate surface boundary conditions for the elastic model. The source, located at x = 600 meters and z = 200 meters, is a compressional spherical source with amplitude in time given by the derivative of a Gaussian. Figure B2 shows particle motion (wave propagation) energy at 0.2, 0.3, 0.4, and 0.5 seconds. The remaining parameters are the same as for the acoustic model.

[Figure B. Elastic wave simulation.]

Typically, the region surrounding the explosion site
has no physical boundaries, so the numerical simulation should minimize spurious (artificial) reflections off the numerical boundary. We use numerical boundary conditions called absorbing boundary conditions to reduce or eliminate spurious reflections. Since only the processors handling the model's outer edges calculate boundary conditions, this presents a load-balancing problem. We address this problem using a damping method,5 remembering that absorbing boundary conditions are at best approximately absorbing. This method requires a modified wave equation (at the boundary only) that artificially maintains load balancing. This equation requires more calculations at the interior points, but most absorbing boundary conditions are computationally intensive.
The difference approximation to the acoustic equation without damping is

$$u_{j,k}^{n+1} = 2u_{j,k}^{n} - u_{j,k}^{n-1} + c_{j,k}^2 \rho_{j,k} \Delta t^2 \left[ \frac{b_{j+\frac{1}{2},k}\bigl(u_{j+1,k}^{n} - u_{j,k}^{n}\bigr) - b_{j-\frac{1}{2},k}\bigl(u_{j,k}^{n} - u_{j-1,k}^{n}\bigr)}{\Delta x^2} + \frac{b_{j,k+\frac{1}{2}}\bigl(u_{j,k+1}^{n} - u_{j,k}^{n}\bigr) - b_{j,k-\frac{1}{2}}\bigl(u_{j,k}^{n} - u_{j,k-1}^{n}\bigr)}{\Delta z^2} \right]$$

where $u_{j,k}^{n}$ approximates $u(x_j, z_k, n\Delta t)$, the half-index coefficients are averages of the neighboring grid values, and $b_{j,k} = 1/\rho_0(x_j, z_k)$.
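In C, one time step of this scheme looks roughly like the following sketch (the array dimensions, names, and grid layout are our illustrative choices):

```c
/* One undamped time step of the five-point acoustic stencil.
   u0 = u^{n-1}, u1 = u^n, u2 = u^{n+1}; b[j][k] = 1/rho0(x_j, z_k);
   interior points only (boundaries handled separately). */
#define NX 101
#define NZ 101

void acoustic_step(double u2[NX][NZ], const double u1[NX][NZ],
                   const double u0[NX][NZ], const double b[NX][NZ],
                   const double c[NX][NZ], const double rho[NX][NZ],
                   double dt, double dx, double dz)
{
    for (int j = 1; j < NX - 1; j++) {
        for (int k = 1; k < NZ - 1; k++) {
            /* half-index coefficients as averages of grid values */
            double bxp = 0.5 * (b[j+1][k] + b[j][k]);
            double bxm = 0.5 * (b[j-1][k] + b[j][k]);
            double bzp = 0.5 * (b[j][k+1] + b[j][k]);
            double bzm = 0.5 * (b[j][k-1] + b[j][k]);

            double lap =
                (bxp * (u1[j+1][k] - u1[j][k]) -
                 bxm * (u1[j][k] - u1[j-1][k])) / (dx * dx) +
                (bzp * (u1[j][k+1] - u1[j][k]) -
                 bzm * (u1[j][k] - u1[j][k-1])) / (dz * dz);

            u2[j][k] = 2.0 * u1[j][k] - u0[j][k]
                     + c[j][k] * c[j][k] * rho[j][k] * dt * dt * lap;
        }
    }
}
```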
The finite-difference equation that includes the absorbing boundary conditions is

$$u_{j,k}^{n+1} = \frac{v_{j,k}^{n+1} + \tfrac{1}{2}A_{j,k}\Delta t \, u_{j,k}^{n-1}}{1 + \tfrac{1}{2}A_{j,k}\Delta t}$$

where $v_{j,k}^{n+1}$ is the expression computed in the difference approximation without damping and the $A_{j,k}$ are the damping weights. This forms a five-point star for $u$.

[Figure 1. Communication time.]
The difference equations for the elastic equation are similar,4 but they include mixed differences for the cross-derivative terms,8 and the difference stencil is a nine-point star.
The elastic equation's free-surface boundary conditions are difficult to solve using finite differences. We use an implicit method,6 which creates a system that has four bands and must be solved at each time iteration. Directly inverting this matrix is essentially a sequential algorithm, so we solve this system by iterative methods in order to keep the code parallel.
PVM implementation
The parallel version of our acoustic wave propagation simulator uses the host/node approach. The host program performs I/O and dictates the domain decomposition to the node program. The node program gathers and distributes information needed for and produced by the finite-difference calculations, and communicates iterative interprocessor boundary solutions to neighboring nodes with respect to the problem's domain decomposition.
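A host-side sketch of this pattern in the PVM 3 C interface (the task name "wavenode", the message tag, and the decomposition payload are hypothetical; the article does not list its actual code):

```c
/* Host-side sketch: spawn the node tasks and send each its piece of
   the domain decomposition. Names and tags are illustrative. */
#include <stdio.h>
#include "pvm3.h"

#define NNODES 6
#define TAG_DECOMP 1

int main(void)
{
    int tids[NNODES];
    pvm_mytid();  /* enroll the host with its local PVMD */

    int started = pvm_spawn("wavenode", NULL, PvmTaskDefault,
                            "", NNODES, tids);
    if (started < NNODES) {
        fprintf(stderr, "only %d of %d nodes started\n", started, NNODES);
        pvm_exit();
        return 1;
    }

    /* tell node i which strip (or patch) of the grid it owns */
    for (int i = 0; i < NNODES; i++) {
        int msg[3] = { i, NNODES, 0 };  /* node number, count, extra info */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(msg, 3, 1);
        pvm_send(tids[i], TAG_DECOMP);
    }

    /* ... perform I/O, collect seismogram output, then leave the VM ... */
    pvm_exit();
    return 0;
}
```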
This is a 2D decomposition, so we can divide the domain into strips to exploit available vector processors or
into patches to reduce communication packet size. Our
timings show that this flexibility can help achieve optimal speedup.
The node program calculates communication pathways by assigning node values from 0 to n-1 to the processors. This maintains nearest-neighbor communication, although PVM obtains no computational advantage from it. The communication of the iteration interprocessor boundary solutions synchronizes the node programs, while output requirements synchronize the host and node programs.
The parallel version of the elastic wave propagation simulator is similar, but we add the implicit method to incorporate stress-free surface conditions. Since these calculations occur only at the surface nodes, load balancing and node program synchronization become issues. We do not have to reorganize the data structure among the nodes because the implicit method uses the same finite-difference stencil. However, we also use the conjugate gradient squared algorithm7 as a solver, which yields five barriers to parallelization that involve both inner products and a matrix multiply. The inner products require a global sum across surface nodes, but the associated communication packet is small. For the matrix multiply, the necessary matrix components are locally available, but vector components that correspond to the off-diagonal bands are not resident and must be gathered using nearest-neighbor communication between surface nodes.
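A sketch of that global sum built from point-to-point PVM calls, with surface node 0 acting as root (later PVM releases offer group operations for this, but the explicit pattern shows the two communication phases; tids and the tag are illustrative):

```c
/* Global sum of one double across the ns surface nodes: gather the
   partial sums at node 0, then broadcast the total back out. */
#include "pvm3.h"

#define TAG_SUM 20

double global_sum(double local, int me, int ns, const int tids[])
{
    double total = local, part;
    if (me == 0) {
        for (int i = 1; i < ns; i++) {   /* gather partial sums */
            pvm_recv(-1, TAG_SUM);       /* -1: accept any sender */
            pvm_upkdouble(&part, 1, 1);
            total += part;
        }
        for (int i = 1; i < ns; i++) {   /* broadcast the result */
            pvm_initsend(PvmDataDefault);
            pvm_pkdouble(&total, 1, 1);
            pvm_send(tids[i], TAG_SUM);
        }
    } else {
        pvm_initsend(PvmDataDefault);
        pvm_pkdouble(&total, 1, 1);
        pvm_send(tids[0], TAG_SUM);
        pvm_recv(tids[0], TAG_SUM);      /* wait for the global total */
        pvm_upkdouble(&total, 1, 1);
    }
    return total;
}
```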
Computational results
We analyzed the forward problem for two models that are important to geophysicists (see the second sidebar). We ran an acoustic model and an elastic model (with and without the free-surface solve) 10 times each on configurations of one to six processors, for a total of 60 runs per model. Figure 1 shows the overall communication time for each simulation. The acoustic simulation requires less communication time because we are solving only for the pressure, as opposed to solving for vertical and horizontal displacements. However, the elastic simulation seems to take slightly less communication time with the free-surface solve than without it. This anomaly indicates that the synchronization step in the surface solve alleviated a communication bottleneck caused by network saturation. Since the amount of information communicated is constant, the flattening of the curves indicates how well the PVM software parallelizes codes.
[Figure 2. Ideal computational time.]

[Figure 3. Actual computational time.]

[Figure 4. Timestep time.]
Figure 2 shows the ideal computational time for each simulation, given PVM overhead and the chosen algorithms. Highly parallel algorithms such as those we use reduce inhibition of parallelization. The matrix solve, used for the elastic simulation with the free-surface solve, drastically inhibits parallelization because many processors are unused.

Figure 3 shows the actual time for each simulation, indicating overall speedups of 5.14 for the acoustic simulation, 5.42 for the elastic simulation, and 4.75 for the elastic simulation with the free-surface solve. Compared to the ideal times in Figure 2, the acoustic simulations balance computation and communication poorly, while the elastic simulations are more evenly balanced. This indicates a need to analyze this balance thoroughly when running distributed code.

Figure 4 factors out the startup time to indicate how the host-node paradigm performs. PVM initialization requires only a few seconds and thus does not inhibit speedup.
Future directions
The major performance difference between the acoustic and elastic models was communication overhead caused by the elastic model's free-surface constraints.
The matrix computations arising from the free-surface conditions are similar to those for elliptic differential equations. One way to parallelize this computation is to diagonally precondition the system and then parallelize the matrix-multiply part of a preconditioned conjugate-gradient-type iterative procedure. This also requires parallelizing the scalar product and a global sum. Techniques for these parallel computations are available.
As the application's size increases and the discretization sizes decrease, the condition number of the matrix described above will increase significantly, and the diagonal preconditioner will be less effective. Therefore, we are developing better parallelization methods based on domain decomposition.10 We have written a general additive Schwarz, overlapping-domain code9 that physically splits the domain; the size of the overlap between regions is given as an input parameter and controls communication. We can then locally apply multigrid methods to give good local preconditioning.
PVM does have disadvantages. It cannot exploit nearest-neighbor communication. Since it depends on existing networks, communications must follow network packet protocols, so several machines may process a message before it reaches its destination. Also, the network could become a significant bottleneck. For many applications, speedups will be less significant as processors are added and network communication becomes saturated. Finally, we performed our simulation on homogeneous hardware in an isolated network; performance will probably degrade in a heterogeneous environment or a network with heavy or bursty traffic. A heterogeneous system will also cause additional load-balancing problems, and PVM may not be suitable for some algorithms in a heterogeneous environment due to incompatible processors and inaccuracies in their math libraries. However, these drawbacks should diminish as network technologies improve, as the Open Systems Foundation addresses system compatibility, and as PVM undergoes continued development.
ACKNOWLEDGMENTS
We thank Patrick K. Malone, Christian Turner, and Phillip Crotwell
for their help and the University of South Carolina and Westinghouse
Savannah River Laboratory for use of the IBM RS/6000 computing ring.
This work was supported in part by the National Science Foundation
under grants EHR-910-8774, EHR-910-8772, and INT-89-14472.
REFERENCES
1. A. Beguelin et al., "A User's Guide to PVM Parallel Virtual Machine," Tech. Report TM-11826, Oak Ridge Nat'l Laboratory, Oak Ridge, Tenn., 1991.
2. G.A. Geist and V.S. Sunderam, "Network-Based Concurrent Computing on the PVM System," to appear in Concurrency: Practice and Experience.
3. K.R. Kelly et al., "Modeling: The Forward Method," in Concepts and Techniques in Oil and Gas Exploration, K.C. Jain and R.J.P. de Figueiredo, eds., Soc. Exploration Geophysicists, Tulsa, Okla., 1982.
4. J.S. Sochacki et al., "Interface Conditions for Acoustic and Elastic Wave Propagation," Geophysics, Vol. 56, No. 2, 1991, pp. 161-181.
5. J.S. Sochacki et al., "Absorbing Boundary Conditions and Surface Waves," Geophysics, Vol. 52, No. 1, 1987, pp. 60-71.
6. J.E. Vidale and R.W. Clayton, "A Stable Free-Surface Boundary Condition for Two-Dimensional Elastic Wave Propagation," Geophysics, Vol. 51, No. 12, 1986, pp. 2247-2249.
7. P. Sonneveld, "CGS: A Fast Lanczos-Type Solver for Nonsymmetric Linear Systems," SIAM J. Scientific and Statistical Computing, Vol. 10, No. 1, 1989, pp. 36-52.
8. J.S. Sochacki et al., "Seismic Modeling and Inversion on the nCube," Fifth Distributed Memory Computing Conference, Vol. 1, IEEE Computer Society Press, Los Alamitos, Calif., 1990, pp. 530-535.
9. R.E. Ewing et al., "Parallelization of Multiphase Models for Contaminant Transport in Porous Media," in Parallel Processing for Scientific Computing, Vol. 1, R. Sincovec et al., eds., 1993, pp. 83-91.
10. J.H. Bramble et al., "Convergence Estimates for Product Iterative Methods with Applications to Domain Decomposition," Math. Comp., Vol. 57, 1991, pp. 23-45.
Richard E. Ewing is professor of mathematics and engineering, director of the Institute for Scientific Computation, dean of science, and Texas Engineering Experiment Station Distinguished Research Chair at Texas A&M University. He has conducted research in numerical analysis, mathematical modeling, fluid flow in porous media, and large-scale scientific computation. He has more than 180 scientific publications in journals, books, and proceedings. He received his PhD in mathematics from the University of Texas at Austin. Readers can contact Ewing at the Institute for Scientific Computation, Texas A&M University, College Station, TX 77843.
Robert C. Sharpley is a professor of mathematics at the University of South Carolina. He is an editor of Constructive Approximation and the author of two research monographs and thirty research articles in approximation theory, functional analysis, numerical analysis, computational science, Fourier analysis, and partial differential equations. He received his PhD in mathematics from the University of Texas at Austin in 1972. Readers can contact Sharpley at the Department of Mathematics, University of South Carolina, Columbia, SC 29208.
Derek Mitchum is a graphics specialist in the Department of Mathematics at the University of South Carolina. He was previously a systems programmer for the University of Wyoming. Readers can contact him at the Department of Mathematics, University of South Carolina, Columbia, SC 29208.
James Sochacki is an applied mathematician in the Department of Mathematics at James Madison University. His research interests include linear and nonlinear wave propagation, especially the numerical approximation of these equations. He is also developing an interdisciplinary undergraduate mathematical modeling center. Readers can contact Sochacki at the Department of Mathematics, James Madison University, Harrisonburg, VA 22807.
Patrick O'Leary is a research scientist in mathematics at the University of Wyoming. His current research interests include parallelism, scientific visualization, and mathematical modeling. Readers can contact O'Leary at the Institute for Scientific Computation, University of Wyoming, Laramie, WY 82071.