Dv: A toolkit for building remote interactive visualization services

David O'Hallaron
School of Computer Science and Department of Electrical and Computer Engineering
Carnegie Mellon University
July 1999
with Martin Aeschlimann, Julio Lopez, Peter Dinda, and Bruce Lowekamp
www.cs.cmu.edu/~droh

The Quake project team
Jacobo Bielak and Omar Ghattas (CMU CE)
David O'Hallaron (CMU CS and ECE)
Jonathan Shewchuk (UC Berkeley)
Steven Day (SDSU Geology)
www.cs.cmu.edu/~quake

Teora, Italy, 1980

San Fernando Valley (map): epicenter at lat. 34.32, long. -118.48; region spanning lat. 34.08, long. -118.75 to lat. 34.38, long. -118.16

San Fernando Valley (top view): soft soil and hard rock, with the epicenter marked

San Fernando Valley (side view): soft soil over hard rock

Initial node distribution

Partitioned unstructured finite element mesh of San Fernando (nodes and elements)

Communication graph: vertices are processors; edges are communications

Archimedes (www.cs.cmu.edu/~quake)
• Meshing path: problem geometry (.poly) → Triangle/Pyramid → mesh (.node, .ele) → Slice → partition (.part) → Parcel → .pack → parallel system
• Code path: finite element algorithm (.arch), e.g.
  MVPRODUCT(A,x,w);
  DOTPRODUCT(x,w,xw);
  r = r/xw;
  → Author → .c → C compiler + runtime library → a.out

Visualization of 1994 Northridge aftershock: shock wave propagation path (generated by Greg Foss, Pittsburgh Supercomputing Center)

Visualization of 1994 Northridge aftershock: behavior of waves within the basin (generated by Greg Foss, Pittsburgh Supercomputing Center)

Animations of 1994 Northridge aftershock and 1995 Kobe mainshock
• Data produced by the Quake group at Carnegie Mellon University
• Images rendered by Greg Foss, Pittsburgh Supercomputing Center

Motivation for Dv
• Quake datasets are too large to store and manipulate locally.
  – 40 GB to 6 TB, depending on the degree of downsampling.
  – This is now a common problem, because of advances in hardware, software, and simulation methodology.
• Current visualization approach
  – Make a request to the supercomputing center (PSC) graphics department.
  – Receive MPEGs, JPEGs, and/or videotapes a couple of weeks or months later.
• Desired visualization approach
  – Provide a remote visualization service that lets us visualize Quake datasets interactively and in collaboration with colleagues around the world.
  – Useful for qualitative debugging, demos, and solid engineering results.

Internet service models (a client sends a request; a server computes and returns a response)
• Traditional lightweight service model
  – Small to moderate amount of computation to satisfy requests
    » e.g., serving web pages, stock quotes, online trading, current search engines
• Proposed heavyweight service model
  – Massive amount of computation to satisfy requests
    » e.g., scientific visualization, data mining, future search engines
  – Approach: provide heavyweight services on a computational grid of hosts.

Heavyweight grid service model
• Remote compute hosts, allocated once per service by the service provider.
• Local compute hosts, allocated once per session by the service user.
• The two sides are connected by the best-effort Internet.

Challenges to providing heavyweight Internet services on a computational grid
• Grid resources are limited.
  – We must find an easy way to grid-enable existing packages.
• Grid resources are heterogeneous.
  – Programs should be performance-portable (at load time) in the presence of heterogeneous resources.
• Grid resources are dynamic.
  – Programs should be performance-portable (at run time) in the face of dynamic resources.
• Applications that provide heavyweight grid services must be resource-aware.

Motivating application: earthquake ground motion visualization
A pipeline of stages, each operating on a decreasing amount of data (a code sketch of the pipeline follows below):
• FEM solver and materials database (remote database)
• ROI reading at a requested resolution
• Interpolation
• Isosurface extraction and contours
• Scene synthesis (vtk routines)
• Rendering, local display, and input
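To make the flowgraph idea concrete, here is a minimal sketch of the pipeline as a chain of stages. It is illustrative only: the Stage interface and the stage bodies are assumptions, not the actual Dv API, and the stubs merely shrink the data the way the real stages would.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stage interface: each stage consumes the previous
// stage's output and typically produces less data than it consumed.
interface Stage {
    double[] process(double[] data);
}

public class VizPipeline {
    public static void main(String[] args) {
        List<Stage> flowgraph = Arrays.asList(
            data -> Arrays.copyOfRange(data, 0, data.length / 2), // ROI read
            data -> data,                                         // interpolation (stub)
            data -> Arrays.copyOfRange(data, 0, data.length / 4), // isosurface extraction
            data -> data                                          // scene synthesis (stub)
        );
        double[] frame = new double[1 << 20];  // stand-in for one timestep
        for (Stage s : flowgraph) {
            frame = s.process(frame);
        }
        System.out.println("scene data size: " + frame.length);
    }
}
```

The interesting question, taken up below, is not the chain itself but where each of its stages should run.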
Approaches for providing remote viz services: do everything on the remote server
• Pros: very simple to grid-enable existing packages.
• Cons: high latency; eliminates the possibility of proxying and caching at the local site; can overuse the remote site; not appropriate for smaller datasets.
• Configuration: very high-end remote server, moderate-bandwidth link, local machine.

Approaches for providing remote viz services: do everything but the rendering on the remote server
• Pros: fairly simple to grid-enable existing packages; removes some load from the remote site.
• Cons: requires every local site to have good rendering power.
• Configuration: high-end remote server, moderate-bandwidth link, local machine with good rendering power.

Approaches for providing remote viz services: use a local proxy for the rendering
• Pros: offloads work from the remote site; allows local sites to contribute additional resources.
• Cons: local sites may not have sufficiently powerful proxy resources; the application is more complex; requires high bandwidth between the local and remote sites.
• Configuration: high-end remote server, high-bandwidth link to a powerful local proxy server, moderate-bandwidth link to a low-end local PC or PDA.

Approaches for providing remote viz services: do everything at the local site
• Pros: low latency; easy to grid-enable existing packages.
• Cons: requires an ultrahigh-bandwidth link between the sites; requires powerful compute and graphics resources at the local site.
• Configuration: low-end remote server, ultrahigh-bandwidth link, powerful local server.

Providing remote viz services
• Claim: static application partitions are not appropriate for heavyweight Internet services on computational grids.
• The goal of Dv is therefore a flexible framework for scheduling and partitioning heavyweight services.
• The approach is based on the notion of an active frame.

Active frames
• An active frame consists of frame data and a frame program.
• An active frame server accepts an input active frame, runs its frame program in the server's active frame interpreter, which has access to application libraries (e.g., vtk) on the host, and sends the resulting output active frame on to the next host.
• A code sketch of the idea follows below.
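A minimal sketch of the active-frame idea, under stated assumptions: the class and method names (FrameProgram, ActiveFrame, ActiveFrameServer, run, nextHost) are hypothetical stand-ins, not the actual Dv interfaces.

```java
import java.io.Serializable;

// A frame program travels with the frame data; at each hop it decides
// what to compute and where the frame should go next.
interface FrameProgram extends Serializable {
    Object run(Object frameData);        // compute at the current host
    String nextHost(Object frameData);   // choose the next hop
}

// An active frame bundles application data with the program that
// processes and routes it.
class ActiveFrame implements Serializable {
    final Object data;
    final FrameProgram program;
    ActiveFrame(Object data, FrameProgram program) {
        this.data = data;
        this.program = program;
    }
}

// An active frame server interprets incoming frames: it runs the
// frame program (which may call into host libraries such as vtk)
// and forwards the resulting frame.
public class ActiveFrameServer {
    ActiveFrame interpret(ActiveFrame in) {
        Object out = in.program.run(in.data);
        System.out.println("forward to " + in.program.nextHost(out));
        return new ActiveFrame(out, in.program);
    }
}
```

Because the program rides along with the data, the decision about where each pipeline stage runs can be deferred until a frame is actually in flight, which is what makes the per-frame scheduling discussed below possible.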
Overview of a Dv visualization service
• A local Dv client handles user inputs and the display.
• The client sends request frames to the remote Dv active frame servers, which hold the remote datasets.
• Response frames flow through a chain of remote and local Dv active frame servers back to the client.

Grid-enabling vtk with Dv
• The local Dv client sends a request frame [request server, scheduler, flowgraph, data reader] to the remote machine.
• On the remote machine, the request server runs the reader and scheduler and emits response frames [application data, scheduler, flowgraph, control] to other Dv servers.
• The response frames arrive at the local Dv server, which delivers the result to the local machine (Dv client).

Scheduling Dv programs
• Scheduling at request frame creation time
  – All response frames use the same schedule.
  – Can be performance-portable at load time, but not at run time.
• Scheduling at response frame creation time
  – Performance-portable at load time, and partially at run time.
• Scheduling at response frame delivery time
  – Can be performance-portable at both load and run time.
  – Per-frame scheduling overhead is a potential disadvantage.

Current Dv issues
• Flowgraphs with multiple inputs/outputs
  – We currently support only chains.
• Caching
  – Static data such as meshes needs to be cached on intermediate servers.
• Scheduling interface
  – Must support a wide range of scheduling strategies, from completely static (once per session) to completely dynamic (each time a frame is sent by each frame server); see the scheduler sketch at the end.
• Network and host resource monitoring
  – Network queries and topology discovery (Lowekamp, O'Hallaron, and Gross, HPDC '99)
  – Host load prediction (Dinda and O'Hallaron, HPDC '99)

Issues (cont.)
• Managing the Dv grid
  – Resource discovery
    » Where are the Dv servers? Which of them are running?
  – Resource allocation
    » Which Dv servers are available to use?
  – Collective operations
    » Broadcast? Global synchronization of servers?
• Client model
  – One generic client that runs in a browser.
  – Config files that personalize the client interface for each new service (dataset).

Related work
• Active messages (Berkeley)
• Active networks (MIT)
• Globus (Argonne and USC)
• Legion (UVA)
• Harness (ORNL)
• Cumulvs (ORNL)
• PVM and MPI (MSU, ORNL, Argonne)

Conclusions
• Heavyweight Internet services on computational grids are emerging.
• Static partitioning is not appropriate for heavyweight grid services.
• Active frames provide a uniform framework for grid-enabling and partitioning heavyweight services such as remote visualization.
• Dv is a toolkit based on active frames that we have used to grid-enable vtk.
• Dv provides a flexible framework for experimenting with grid scheduling techniques.
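Finally, the scheduler sketch referenced from the "Current Dv issues" slide: a minimal illustration of an interface that could span the strategies listed under "Scheduling Dv programs", from once-per-session static schedules to per-frame dynamic ones. All names here are hypothetical; this is not the actual Dv scheduling API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A scheduler maps flowgraph stages to hosts. A static scheduler is
// called once per session; a dynamic one is called on every frame hop.
interface DvScheduler {
    Map<String, String> schedule(List<String> stages, List<String> hosts);
}

// Static strategy: compute the stage-to-host mapping once and reuse
// it, so every response frame follows the same schedule.
public class StaticScheduler implements DvScheduler {
    private Map<String, String> cached;

    public Map<String, String> schedule(List<String> stages,
                                        List<String> hosts) {
        if (cached == null) {
            cached = new HashMap<>();
            for (int i = 0; i < stages.size(); i++) {
                // Round-robin placement as a stand-in for a real
                // resource-aware placement decision.
                cached.put(stages.get(i), hosts.get(i % hosts.size()));
            }
        }
        return cached;
    }
}
```

A dynamic scheduler would implement the same interface but recompute the mapping on each call, consulting resource monitors such as the topology-discovery and host-load-prediction work cited above; the per-frame overhead noted on the scheduling slide is the cost of that flexibility.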