Dv: A toolkit for building remote interactive visualization services

Dv: A toolkit for building remote
interactive visualization services
David O’Hallaron
School of Computer Science and
Department of Electrical and Computer Engineering
Carnegie Mellon University
September, 1999
joint work with Martin Aeschlimann, Peter Dinda,
Julio Lopez, and Bruce Lowekamp
www.cs.cmu.edu/~dv
1
Internet service models
[Diagram: a client sends a request to a server, and the server returns a response]
• Traditional lightweight service model
  – Small to moderate amount of computation to satisfy requests
    » e.g., serving web pages, stock quotes, online trading, current search engines
• Proposed heavyweight service model
  – Massive amount of computation to satisfy requests
    » e.g., remote viz, data mining, future search engines
  – Approach: provide heavyweight services on a computational grid of hosts.
2
Heavyweight grid service model
[Diagram: local compute hosts (allocated once per session by the service user) connect across the best-effort Internet to remote compute hosts (allocated once per service by the service provider)]
3
Initial heavyweight service:
Remote visualization of earthquake
ground motion
Jacobo Bielak and Omar Ghattas (CMU CE)
David O’Hallaron (CMU CS and ECE)
Jonathan Shewchuk (UC-Berkeley CS)
Steven Day (SDSU Geology and SCEC)
www.cs.cmu.edu/~quake
4
Teora, Italy
1980
5
San Fernando Valley
[Map: the modeled region spans the San Fernando Valley from lat. 34.38, long. -118.16 to lat. 34.08, long. -118.75; the epicenter is marked at lat. 34.32, long. -118.48]
6
San Fernando Valley (top view)
[Figure: top view showing regions of hard rock and soft soil, with the epicenter marked]
7
San Fernando Valley (side view)
[Figure: side view showing soft soil and hard rock]
8
Initial node distribution
9
Partitioned unstructured finite
element mesh of San Fernando
[Figure: close-up of the mesh with nodes and an element labeled]
10
Communication graph
Vertices: processors
Edges: communications
11
Quake solver code
NODEVECTOR3 disp[3], M, C, M23;
MATRIX3 K;

/* matrix and vector assembly */
FORELEM(i) {
  ...
}

/* time integration loop */
for (iter = 1; iter <= timesteps; iter++) {
  MV3PRODUCT(K, disp[dispt], disp[disptplus]);
  disp[disptplus] *= -IP.dt * IP.dt;
  disp[disptplus] += 2.0 * M * disp[dispt] -
                     (M - IP.dt / 2.0 * C) * disp[disptminus] - ...;
  disp[disptplus] = disp[disptplus] / (M + IP.dt / 2.0 * C);

  /* rotate the displacement buffers */
  i = disptminus;
  disptminus = dispt;
  dispt = disptplus;
  disptplus = i;
}
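For readers who want the scheme spelled out: the loop body reads as the standard explicit central-difference integration of M u'' + C u' + K u = f with vector (lumped) M and C. This is a reading of the code rather than something stated on the slide; the elided "..." would carry the forcing contribution. In LaTeX form:

  \left(M + \tfrac{\Delta t}{2}C\right) u_{t+1}
    = -\,\Delta t^{2} K\, u_{t} + 2 M\, u_{t}
      - \left(M - \tfrac{\Delta t}{2}C\right) u_{t-1}
      + \Delta t^{2} f_{t}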
12
Archimedes
www.cs.cmu.edu/~quake
[Diagram: the Archimedes tool chain. The problem geometry (.poly) is meshed by Triangle/Pyramid (.node, .ele), partitioned by Slice (.part), and packed by Parcel (.pack). The finite element algorithm (.arch, e.g. MVPRODUCT(A,x,w); DOTPRODUCT(x,w,xw); r = r/xw;) is translated by Author into C (.c), compiled with the runtime library by a C compiler into a.out, and run on the parallel system with the packed mesh data.]
13
1994 Northridge quake simulation
• 40 seconds of an aftershock from the Jan 17, 1994 Northridge quake in the San Fernando Valley of Southern California.
• Model:
  – 50 x 50 x 10 km region of the San Fernando Valley.
  – unstructured mesh with 13,422,563 nodes, 76,778,630 linear tetrahedral elements, 1 Hz frequency resolution, 20 meter spatial resolution.
• Simulation:
  – 0.0024s timestep.
  – 16,666 timesteps (45M x 45M SMVP each timestep).
  – ~15-20 GBytes of DRAM.
  – 6.5 hours on 256 PEs of Cray T3D (150 MHz 21064 Alphas, 64 MB/PE).
  – Comp: 16,679s (71%)  Comm: 575s (2%)  I/O: 5,995s (25%).
  – 80 trillion (10^12) flops (sustained 3.5 GFLOPS).
  – 800 GB/575s (burst rate of 1.4 GB/s).
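(A quick arithmetic check on the figures above, not from the slide: 80 x 10^12 flops over 6.5 hours = 23,400 s gives roughly 3.4 x 10^9 flop/s, consistent with the quoted sustained 3.5 GFLOPS, and 800 GB / 575 s is about 1.4 GB/s.)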
14
15
Visualization of 1994
Northridge aftershock:
Shock wave propagation path
Generated by Greg Foss, Pittsburgh Supercomputing Center
16
Visualization of 1994
Northridge aftershock:
Behavior of waves within basin
Generated by Greg Foss, Pittsburgh Supercomputing Center
17
Animations of 1994 Northridge
aftershock and 1995 Kobe
mainshock
• Data produced by the Quake group at Carnegie Mellon University.
• Images rendered by Greg Foss, Pittsburgh Supercomputing Center.
18
Motivation for Dv
• Quake datasets are too large to store and manipulate locally.
  – 40 GB - 6 TB depending on degree of downsampling
  – Common problem now because of advances in hardware, software, and simulation methodology.
• Current visualization approach
  – Make request to supercomputing center (PSC) graphics department.
  – Receive MPEGs, JPEGs, and/or videotapes in a couple of weeks/months.
• Desired visualization approach
  – Provide a remote visualization service that will allow us to visualize Quake datasets interactively and in collaboration with colleagues around the world.
  – Useful for qualitative debugging, demos, and solid engineering results.
19
Challenges to providing
heavyweight services on a
computational grid
• Local resources are limited
  – We must find an easy way to grid-enable existing packages.
• Grid resources are heterogeneous
  – Programs should be performance-portable (at load time) in the presence of heterogeneous resources.
• Grid resources are dynamic
  – Programs should be performance-portable (at run time) in the face of dynamic resources.
• Bottom line: applications that provide heavyweight grid services must be resource-aware.
20
Example Quake viz flowgraph
[Flowgraph: the FEM solver engine and a materials database feed a remote database. The visualization pipeline then performs ROI reading, resolution interpolation or decimation, isosurface extraction (contours), scene synthesis, and rendering, ending at the local display and input. The intermediate stages are vtk (visualization toolkit) routines, and the amount of data decreases as it flows through the pipeline.]
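As a rough illustration only (not code from Dv or the Quake project; the file name and isovalue are made up), the vtk stages of such a flowgraph might be wired together like this in the classic vtk C++ API of that era:

  #include "vtkUnstructuredGridReader.h"
  #include "vtkContourFilter.h"
  #include "vtkPolyDataMapper.h"
  #include "vtkActor.h"
  #include "vtkRenderer.h"
  #include "vtkRenderWindow.h"

  int main()
  {
    /* ROI reading: load a region of interest (hypothetical file name) */
    vtkUnstructuredGridReader *reader = vtkUnstructuredGridReader::New();
    reader->SetFileName("quake_roi.vtk");

    /* isosurface extraction: contour the field at a hypothetical isovalue */
    vtkContourFilter *iso = vtkContourFilter::New();
    iso->SetInput(reader->GetOutput());
    iso->SetValue(0, 0.5);

    /* scene synthesis and rendering (interaction omitted) */
    vtkPolyDataMapper *mapper = vtkPolyDataMapper::New();
    mapper->SetInput(iso->GetOutput());
    vtkActor *actor = vtkActor::New();
    actor->SetMapper(mapper);
    vtkRenderer *ren = vtkRenderer::New();
    ren->AddActor(actor);
    vtkRenderWindow *win = vtkRenderWindow::New();
    win->AddRenderer(ren);
    win->Render();
    return 0;
  }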
21
Approaches for providing
remote viz services
• Do everything on the remote server
  – Pros: very simple to grid-enable existing packages.
  – Cons: high latency, eliminates possibility of proxying and caching at local site, can overuse remote site, not appropriate for smaller datasets.
[Diagram: a very high-end remote server connected to the local machine over a moderate bandwidth link (~1-10 Mb/s)]
22
Approaches for providing
remote viz services
• Do everything but the rendering on the remote server
  – Pros: fairly simple to grid-enable existing packages, removes some load from the remote site.
  – Cons: requires every local site to have good rendering power.
[Diagram: a high-end remote server connected over a moderate bandwidth link (~1-10 Mb/s) to a local machine with good rendering power]
23
Approaches for providing
remote viz services
• Use a local proxy for the rendering
  – Pros: offloads work from the remote site, allows local sites to contribute additional resources.
  – Cons: local sites may not have sufficiently powerful proxy resources, application is more complex, requires high bandwidth between local and remote sites.
[Diagram: a high-end remote server connected over a high bandwidth link (~100 Mb/s) to a powerful local proxy server, which serves a low-end local PC or PDA over a moderate bandwidth link (~1-10 Mb/s)]
24
Approaches for providing
remote viz services
• Do everything at the local site
  – Pros: low latency, easy to grid-enable existing packages.
  – Cons: requires high-bandwidth link between sites, requires powerful compute and graphics resources at the local site.
[Diagram: a low-end remote server connected to a powerful local server over a very high bandwidth link (~1 Gb/s)]
25
Providing remote viz services
• Claim: static application partitions are not appropriate for heavyweight Internet services on computational grids.
• Thus, the goal with Dv is to provide a framework for automatically scheduling and partitioning heavyweight services.
• The Dv approach is based on the notion of an active frame.
26
Active frames
[Diagram: an active frame server running on a host. An input active frame, consisting of frame data and a frame program, arrives at the server; the active frame interpreter executes the frame program against application libraries (e.g., vtk) and emits an output active frame, again consisting of frame data and a frame program.]
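To make the picture concrete, here is a minimal sketch of the active-frame idea in C++. Every name and type below is hypothetical; this is not the Dv API, only an illustration of a frame that carries its own program and a server that merely interprets it:

  #include <string>
  #include <vector>

  /* Hypothetical types for illustration only; not the actual Dv interfaces. */
  struct FrameData {
    std::vector<unsigned char> bytes;   /* serialized application data */
  };

  struct ActiveFrame {
    virtual ~ActiveFrame() {}
    /* The frame program: transform the data using local application    */
    /* libraries (e.g., vtk) and return the output frame, or 0 if done. */
    virtual ActiveFrame *run(FrameData &data) = 0;
    /* Address of the next frame server that should receive the output. */
    virtual std::string nextHop() const = 0;
  };

  /* Sketch of a frame server's interpreter: execute the frame program, */
  /* then forward the resulting frame and data to the next hop.         */
  void interpret(ActiveFrame *in, FrameData &data)
  {
    ActiveFrame *out = in->run(data);
    if (out != 0) {
      /* ... serialize and send `out` and `data` to out->nextHop() ... */
    }
  }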
27
Overview of a
Dv visualization service
[Diagram: user inputs and the display connect to a local Dv client. The client sends request frames toward the remote datasets; response frames then flow through a chain of Dv servers (the remote Dv active frame servers), across the network, and through the local Dv active frame servers back to the client.]
28
Grid-enabling vtk with Dv
[Diagram: the local Dv client (on the local machine) sends a request frame, carrying code for the request server, scheduler, flowgraph, and data reader, to the remote machine. There the request server, reader, and scheduler produce response frames, each carrying application data plus scheduler, flowgraph, and control code, which flow to other Dv servers and eventually back to the local Dv server and client.]
29
Scheduling Dv programs
• Scheduling at request frame creation time
  – all response frames use the same schedule
  – can be performance-portable at load time
  – cannot be performance-portable at run time
• Scheduling at response frame creation time
  – performance-portable at load time and partially at run time
• Scheduling at response frame delivery time
  – can be performance-portable at both load and run time
  – per-frame scheduling overhead is a potential disadvantage (see the sketch below)
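A rough sketch of how these three binding times might look to a scheduler implementer; the names are hypothetical and this is not the actual Dv scheduling interface:

  #include <string>

  /* Illustration only; not the Dv scheduling interface. */
  enum SchedulingTime {
    AT_REQUEST_CREATION,    /* one schedule, reused by every response frame */
    AT_RESPONSE_CREATION,   /* reschedule as each response frame is built   */
    AT_FRAME_DELIVERY       /* reschedule at every hop: most adaptive, but  */
                            /* pays scheduling overhead on every frame      */
  };

  struct Scheduler {
    virtual ~Scheduler() {}
    /* Map a flowgraph stage to a host, using whatever resource */
    /* information is current when the scheduler is invoked.    */
    virtual std::string placeStage(int stage) = 0;
  };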
30
Scheduling
Single client resource aggregation example
[Diagram: resource aggregation for a single client. The visualization flowgraph is partitioned across a source Dv server, several intermediate Dv servers, and the client Dv server, spanning the remote and local sites; successive frames (Frame 1, 2, 3) are distributed across these servers over high and moderate bandwidth links.]
31
Scheduling
Single client adaptation example
[Diagram: adaptation for a single client. The same visualization flowgraph is split differently depending on the local host: a host with high CPU or battery availability takes on more of the pipeline than one with low CPU or battery availability; in both cases a moderate bandwidth link connects the remote site to the local site.]
32
Current Dv Issues
• Collaboration with viz groups
  – need to exploit existing and new viz techniques
  – e.g., progressive viz
• Flowgraphs with multiple inputs/outputs
• Caching
  – Static data such as meshes needs to be cached on intermediate servers.
• Scheduling interface
  – Must support a wide range of scheduling strategies, from completely static (once per session) to completely dynamic (each time a frame is sent by each frame server).
• Network and host resource monitoring
  – network queries and topology discovery (Lowekamp, O'Hallaron, Gross, HPDC99)
  – host load prediction (Dinda and O'Hallaron, HPDC99)
33
Issues (cont)
• Managing the Dv grid
  – Resource discovery
    » where are the Dv servers?
    » which of them are running?
  – Resource allocation
    » which Dv servers are available to use?
  – Collective operations
    » broadcast?
    » global synchronization of servers
• Client model
  – one generic client that runs in a browser
  – config files that personalize the client interface for each new service (dataset)
34
Related work
• PVM and MPI (MSU, ORNL, Argonne)
• Active messages (Berkeley, Illinois)
• Active networks (CMU, MIT, GA Tech, ...)
• Globus (Argonne and USC)
• Legion (UVA)
• Harness and Cumulvs (ORNL)
• AppLeS (UCSD)
• NWS (UTenn and UCSD)
• Remos (CMU)
• svPablo and Autopilot (UIUC)
35
Conclusions
• Heavyweight services on computational grids are emerging.
• Static partitioning is not appropriate for heavyweight grid services.
• Active frames provide a uniform framework for grid-enabling and partitioning heavyweight services such as remote visualization.
• Dv is a toolkit based on active frames that we have used to grid-enable vtk.
• Dv provides a flexible framework for experimenting with grid scheduling techniques.
36
Download