Visualization Middleware for e-Science
UK e-Science Core Programme Open Call
gViz Project

gViz - Visualization for e-Science
• Visualization is key component in e-Science
  – Increasingly so, as size of datasets and scale of simulations continues to increase
• gViz project aims to:
  – Grid-enable an existing visualization system (IRIS Explorer)
  – Broaden out to:
    • Study XML applications for visualization
    • Look at computational steering in depth
    • Grid-enable another visualization system, pV3
• gViz partners:
  – Academic: Leeds, Oxford, Oxford Brookes, CLRC/RAL
  – Industrial: NAG, IBM UK and Streamline Computing
  – International: Caltech, MIT
[Diagram: the four project strands: IRIS Explorer, XML Appl’n, Comp Steering, pV3]

Starting Point: Dataflow Visualization Systems
• Visualization represented as pipeline:
  – Read in data
  – Construct a visualization in terms of geometry
  – Render geometry as image
[Diagram: data → visualize → render]
• Realised as modular visualization environment
  – IRIS Explorer is one example
  – Visual programming paradigm
  – Extensible: add your own modules
  – Others include IBM Open Visualization Data Explorer
  – Toolkits such as vtk have similar underlying model

Grid-enabled IRIS Explorer
• As we move to a Grid world…
  – Dataflow paradigm remains entirely relevant
  – … but we want to be able to distribute modules across the Grid
  – … so can we simply extend NAG’s IRIS Explorer to work in this way?
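
As a rough illustration of the dataflow model, the read, visualize, render pipeline can be sketched as a chain of modules. This is a minimal Python sketch, not IRIS Explorer code: the module names, the tiny 1-D dataset and the `run_pipeline` helper are all hypothetical.

```python
from typing import Callable, List, Sequence

# A "module" is any callable that consumes the previous module's output.
Module = Callable[[object], object]

def read_data(_: object) -> List[float]:
    """Reader module: stands in for reading a dataset from file."""
    return [0.5, 1.2, 1.9, 2.4, 1.1]

def visualize(samples: List[float], threshold: float = 1.8) -> List[int]:
    """Visualize module: keep indices of samples at or above a threshold
    (a crude stand-in for building geometry such as an isosurface)."""
    return [i for i, v in enumerate(samples) if v >= threshold]

def render(geometry: List[int], width: int = 5) -> str:
    """Render module: draw the selected indices as a one-line text image."""
    return "".join("#" if i in geometry else "." for i in range(width))

def run_pipeline(modules: Sequence[Module]) -> object:
    """Push data through the modules in order, like links in a map."""
    data: object = None
    for module in modules:
        data = module(data)
    return data

image = run_pipeline([read_data, visualize, render])
print(image)
```

Distributing the pipeline across the Grid then amounts to choosing where each module runs, which this sketch deliberately leaves out.
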
Harnessing Remote Compute Resources
[Screenshots: Explorer on single host; Explorer on multiple hosts; Select remote host]
• Automatic authentication using:
  – Globus certificate
  – SSH key pair

Secure Distributed IRIS Explorer
• Users can manually launch modules from remote Librarians
• IRIS Explorer maps can link specific modules to specific remote hosts
• ‘Design’ the dataflow for the Grid

Simple Computational Steering
• Extensibility of IRIS Explorer allows user code (such as a simulation) to be included as a module in the dataflow
• Demonstrator created for an environmental crisis scenario
  – Dangerous chemical escapes!
  – Model dispersion using system of PDEs and solve numerically over mesh
  – Visualize mesh elements where concentration exceeds threshold
  – What happens when the wind changes?
  – ‘faster-than-real-time’

Computational Steering within IRIS Explorer
• Finite volume simulation code as module in IRIS Explorer
• Exploit secure distributed IRIS Explorer to run simulation remotely
• Steer and visualize on the desktop, simulate on the Grid resource
• Other IRIS Explorer features can be brought into action, such as collaborative working, between numerical modeller and meteorologist for example

Collaborative Visualization - Bring in the Meteorologist
[Screenshots: Numerical modeller studying error details; Initiate collaborative session; Meteorologist with wind information (linking in remotely)]

Moving ahead…
• This gives us Grid-enabled Collaborative Visualization and Computational Steering
• Why is this not the whole answer?
  – IRIS Explorer is just one system
  – It is designed primarily for visualization, not computational steering
• Broaden out from IRIS Explorer
  – Study XML application for visualization, independent of any specific system
    • XML for data too
  – Study computational steering
    • Decouple the simulation from the visualization
  – Look in depth at grid-enabling another system, pV3
• First we revisit the reference model…

Extending the Model for Grid Environments
• Visualization pipeline described independently of software or hardware resources
• For Grid environments we extend to progressively bind in resources…
[Diagram: data → visualize → render]
• Conceptual: intent of the visualization
  – Show me isosurface of constant temperature
• Logical: bind in the software system
  – Use IRIS Explorer (or vtk, or whatever)
• Physical: bind in the resources to be used
  – Run the isosurface extraction on particular Grid resource

Developing an XML Language for Visualization: skML
• Dataflow consists fundamentally of:
  – a map
  – containing links
  – between ports
  – on modules
  – which have parameters
• This leads us to a simple XML application for visualization: called skML
• Here a data reader is linked to an isosurfacer

    <?xml version="1.0"?>
    <skml>
      <map>
        <link>
          <module name="ReadLat" out-port="Output">
            <param name="Filename">testVol.lat</param>
          </module>
          <module id="iso" name="IsosurfaceLat" in-port="Input">
            <param name="Threshold" min="0" max="27">1.8</param>
          </module>
        </link>
        …

Binding in Resources
• Resources can be bound in by adding RDF annotations to describe software and hardware requirements
• Here we run ‘iso’ remotely and say that we are requiring IRIS Explorer

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:v="http://www.gviz.org/skML/">
      <rdf:Description about="iso">
        <v:Type>IRISExplorer</v:Type>
        <v:PhysicalLocation rdf:resource="http://www.gviz.org/Mars101" />
      </rdf:Description>
    </rdf:RDF>

SVG Map Editor
• skML gives us an XML application for visualization
• In addition to language representation, a diagrammatic representation has been created in SVG, so we can do dataflow programming in a web browser
• A new IRIS Explorer module can read skML and generate corresponding map
• skML can also be turned into an IBM Open Visualization Data Explorer network

    <?xml version="1.0"?>
    <skml>
      <map>
        <link>
          <module name="ReadLat" out-port="Output">
            <param name="Filename">testVol.lat</param>
          </module>
          <module id="iso" name="IsosurfaceLat" in-port="Input">
            <param name="Threshold" min="0" max="27">1.8</param>
          </module>
        </link>
        …

Taming dataset structure for visualizing over the Grid
• At present:
  – Multiplicity of data sources (say, m)
  – Multiple visualization systems (say, n)
• Suppose a VO is being assembled where that diversity is present and may vary?
• Would like to move from A×m×n to B×m + C×n + D (where we would like B, C, D not to be too big)
• Investigating an approach where:
  – Intermediate XML-based language for data structure is used
  – Structure is processed by sequence of straightforward filters
  – Bulk data (e.g. resequencing) is processed as late as possible
  – Existing tools used as far as possible (e.g. XSLT transformers)
  – Metadata (including provenance) is not lost!
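
The move from A×m×n to B×m + C×n + D can be illustrated with a small calculation. The sketch below is only an illustration: the unit costs A, B, C, D and the counts of data sources and visualization systems are made-up values, chosen to show how the two totals grow.

```python
def direct_cost(m: int, n: int, a: float = 1.0) -> float:
    """Cost A*m*n: one converter per (data source, visualization system) pair."""
    return a * m * n

def intermediate_cost(m: int, n: int, b: float = 1.0, c: float = 1.0,
                      d: float = 2.0) -> float:
    """Cost B*m + C*n + D: one importer per source, one exporter per system,
    plus a fixed cost D for the shared intermediate language."""
    return b * m + c * n + d

# Assumed (m, n) combinations, purely for illustration.
for m, n in [(2, 2), (5, 4), (10, 8)]:
    print(m, n, direct_cost(m, n), intermediate_cost(m, n))
```

With these assumed values the intermediate language costs more for a very small VO but wins as m and n grow, which is why B, C and D need to stay small.
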
• Interested to gain access to other data sources to test approach

Computational Steering
• Experience in building steering applications with IRIS Explorer taught us:
  – Better to decouple simulation from visualization system…
  – … simulation runs autonomously, scientist uses visualization system as front-end control panel to monitor and steer
  – Useful to be able to distribute the visualization: move some code close to the simulation
  – Scientist should be able to connect/disconnect from long-running simulations
• Led us to build the gViz steering library
  – Code to link simulation and visualization
• Some design criteria:
  – Lack of intrusion
  – Minimize performance loss
  – Breadth of scope
  – Exploit service-oriented concepts
  – Exploit existing visualization systems
  – Manage different rates of producer-consumer
  – Support collaboration

Communications using Web Services and Proxies
• Use GIIS to locate grid resources.
• Use web service tool, gVizDS, as directory service of running simulations.
• Use proxy tool, gVizProxy, if no direct connection possible between simulation and visualization system
[Diagram: Grid information service, gVizDS, gVizProxy, the simulation code and the desktop visualization pipeline (visualize, render, control), linked through gViz-lib; arrow labels cover discovering and locating Grid resources, launching code on the selected resource, registering and retrieving parameters, locating any running simulations, passing location information, and getData]

Pollution Simulation Using the gViz Library and IRIS Explorer

gViz with other visualization systems
• Any visualization system can potentially act as the front-end control panel
  – SCIRun from University of Utah
  – vtk toolkit

Where Next?
[Diagram: the four project strands and follow-on work; labels listed in slide order]
• IRIS Explorer
• Grid-enable IRIS Explorer
• Open Overlays project (Fundamental CS for e-science), Lancaster, Ox Brookes
• XML Appl’n
• Ontologies for Visualization -> E-Viz project with Bangor, Swansea and Manchester
• Comp Steering
• Exploitation through future releases from NAG
• pV3
• Apply within Integrative Biology project and compare/contrast with RealityGrid
• Mike Giles talk

Acknowledgements
The gViz project team has involved many people:
• Leeds University: Ken Brodlie, Jason Wood, Chris Goodyer, Martin Thompson, Mark Walkley, Haoxiang Wang, Ying Li
• Oxford Brookes University: David Duce, Musbah Sagar
• Oxford University: Mike Giles, David Gavaghan
• CLRC/RAL: Julian Gallop
• NAG: Steve Hague, Jeremy Walton
• Streamline Computing: Mike Rudgyard
• IBM UK: Brian Collins, Alan Knox, John Illingworth
• CACR, Caltech: Jim Pool, Santiago de Lombeyda, John McCorquadale
• MIT: Bob Haimes
Development environment at Leeds: White Rose Grid, e-Science Centre of Excellence

Use of Web Services for Remote Distributed-Memory Visualization
Mike Giles, Bob Haimes
giles@comlab.ox.ac.uk
Oxford University Computing Laboratory

Goal of Oxford Work
Starting from existing pV3 software from MIT,
• Post-processing or co-processing of data on distributed-memory parallel systems
• Support for collaborative visualization and steering
• MPI used for communication with parallel system
Replace PVM message-passing to remote workstation by secure web services, much better suited for collaborative visualization
• Cleaner security model: restricted access to data, no user account required on parallel system
• Easy handling of firewalls and NAT

Existing pV3 Architecture
[Diagram]

Web Service Version
[Diagram]

Achievements and Conclusions
• Web service implementation used C/C++ gSOAP toolkit
  – Very easy to use with extensive feature set: OpenSSL, keepalive, gzip compression
• Main technical challenge was mimicking symmetric message-passing using asymmetric web service RPC
• Has been tested successfully on a PC cluster with NAT firewall, running the concentrator on the front-end node
• Performance is comparable to PVM message passing
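
One way to picture the challenge of mimicking symmetric message passing over asymmetric RPC is a mailbox held on the server side, which each endpoint both posts to and polls. The sketch below is an assumption about the general technique, not the pV3 or gSOAP implementation: the `Concentrator` class and its `send`/`receive` methods are hypothetical names, and plain method calls stand in for web service requests.

```python
from collections import deque
from typing import Deque, Dict, Optional

class Concentrator:
    """Hypothetical server-side relay. Endpoints can only call in (RPC style),
    so messages destined for an endpoint wait in its mailbox until it polls."""

    def __init__(self) -> None:
        self._mailboxes: Dict[str, Deque[bytes]] = {}

    def send(self, dest: str, payload: bytes) -> None:
        """Call 1: queue a message for the endpoint named `dest`."""
        self._mailboxes.setdefault(dest, deque()).append(payload)

    def receive(self, me: str) -> Optional[bytes]:
        """Call 2: poll for the oldest message addressed to `me`, if any."""
        box = self._mailboxes.get(me)
        return box.popleft() if box else None

relay = Concentrator()
relay.send("viewer", b"surface-update-1")
relay.send("viewer", b"surface-update-2")
relay.send("simulation", b"set-threshold 1.8")

first = relay.receive("viewer")       # oldest message first
second = relay.receive("viewer")
empty = relay.receive("viewer")       # mailbox drained, nothing to deliver
steer = relay.receive("simulation")
print(first, second, empty, steer)
```

Because both directions are expressed as calls initiated by the endpoint, nothing has to connect inward through a firewall or NAT, at the price of polling.
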