iSERVO: International Solid Earth Research Virtual Observatory
Grid/Web Services and Portals Supporting Earthquake Science
December 15 2004, AGU Fall Meeting, San Francisco
Geoffrey Fox, Marlon Pierce (Community Grids Lab, Pervasive Technologies Laboratories, Indiana University)
John Rundle (UC Davis)
Andrea Donnellan, Robert Granat, Greg Lyzenga, Jay Parker (JPL)
Don McLeod (USC), Lisa Grant (UC Irvine)

Grid of Grids: Research Grid and Education Grid
[Figure: diagram. Labels: Repositories, Federated Databases, Sensors, Streaming Data, Field Trip Data, Database Grid, Sensor Grid, Compute Grid, GIS Grid, Data Filter Services, Research Simulations, Discovery Services, Customization Services, From Research to Education, Analysis and Visualization Portal, Research, SERVOGrid, Education, Education Grid, Computer Farm]

iSERVO in a nutshell
Designed to link data-sets (repositories and real time), computations and earthquake scientists in ACES (Asia Pacific) Cooperation
  • Australia, China, Japan, USA
Exemplified by SERVOGrid in USA led by JPL
Supports simulation and datamining as services
Adopts conservative WS-I+ Web Service Interoperability standards
Builds full “Grid” in a library fashion as a Grid of Grids
  • GIS (Geographic Information System) Grid built as a set of OGC compatible Web Services “talking” GML
  • iSERVO federates separate Grids in each country/organization/function
  • A Grid is “just” a collection of Services aka distributed programs
Multi-scale simulations supported by Grid workflow
Portals based on NSF Middleware Initiative NMI Open Grid Computing Environment OGCE

Characteristics of Computing for Solid Earth Science
Widely distributed datasets in various formats
  • GPS, Fault data, Seismic data sets, InSAR satellite data
  • Many available in state-of-the-art tar files that can be FTP’d
  • Provenance problems: faults have controversial parameters like slip rates which have to be estimated.
Distributed models and expertise
  • Lots of codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM)
  • Simplest challenges are just making these codes usable for other researchers
  • And hooking these codes to data sources
  • Some codes also have export or IP restrictions
  • Other codes are highly specialized to their deployment environments
Decomposable problems requiring interoperability for linking full models
  • The fidelity of your fault modeling can vary considerably
  • Link codes (through data) to support multiple scales

(i)SERVO Web (Grid) Services
Programs: All applications wrapped as Services using proxy strategy
Job Submission: support remote batch and shell invocations
  • Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo) and visualization packages (RIVA, GMT).
File management:
  • Uploading, downloading, backend crossloading (i.e. moving files between remote machines)
  • Remote copies, renames, etc.
Job monitoring
Workflow: Apache Ant-based remote service orchestration
  • For coupling related sequences of remote actions, such as RIVA movie generation.
Data services: support remote databases and query construction
  • XML data model being adopted for common formats with translation services to “legacy” formats.
  • Migrating to Geography Markup Language (GML) descriptions.
Metadata Services: for archiving user session information.

SERVOGrid Applications
Codes range from simple “rough estimate” codes to parallel, high performance applications.
  • Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
  • Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.
  • GeoFEST: Three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions.
    Allows for realistic fault geometry and characteristics, material properties, and body forces.
  • Virtual California: Program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space
  • RDAHMM: Time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
Preprocessors, mesh generators: AKIRA suite
Visualization tools: RIVA, GMT, IDL

SERVOGrid Codes, Relationships
[Figure: diagram relating Elastic Dislocation Inversion, Viscoelastic FEM, Viscoelastic Layered BEM, Elastic Dislocation, Pattern Recognizers, and Fault Model BEM]
This linkage is called Workflow in Grid/Web Service parlance

Role of Workflow
[Figure: Service-1, Service-2, Service-3]
Programming the Grid: Workflow describes linkage between services
As distributed, linkage must be by messages
Linkage is two-way and has both control and data
Apply to multi-scale (complexity) linkage, multiprogram linkage, link visualization to simulation, GIS to simulations and viz filters to each other
Microsoft-IBM specification BPEL is the currently preferred Web Service XML specification of workflow
SERVOGrid uses ANT (well known XML build tool) to perform workflow and this works well in our relatively simple cases

Applications and Observational Data
Several SERVO codes work directly with observational data.
Scenarios include
  • GeoFEST, VirtualCalifornia, Simplex, and Disloc all depend upon fault models.
  • RDAHMM and Pattern Informatics codes use seismic catalogs.
  • RDAHMM primarily used with GPS data
Problem: We need to provide a way to integrate these codes with the online data repositories.
  • QuakeTables Fault Database
  • Existing GPS and Earthquake Catalogs
Solution: use databases to store catalog data; use XML (GML) as exchange data format; use OGC and WS-I+ Compatible Web Services for data exchanges, invoking queries, and filtering data.
  • Use Web Feature Service, Web Map Service from OGC
  • Use UDDI (Discovery), WS-DAI (Database), WS-Context (Dynamic metadata) from WS-I+

SERVOGrid and Semantic Grid
SERVOGrid has many types of metadata
We are designing RDFS descriptions for the following components:
  • Simulation codes, mesh generators, etc.
  • Visualization tools
  • Data types
  • Computing resources
  • …
These are easily expressed as RDFS (actually DAML) “nuggets” of information.
  • Create instances of these
  • Use properties to link instances.

Some Sample Relationships
[Figure: graph of instances (Danube Computer, GMT Viz Appl, Disloc Application, USC Fault DB Data Storage, Fault DataType, Stress Map DataFormat) linked by properties (installedOn, visualizedBy, createsOutput, usesInput, storedIn)]

Expanding to iSERVO Strategy
• Agree on what (type of) resources and capabilities need to be put on the iSERVO Grid
  – Computers, instruments, databases, visualization, maps, job submittal …
• Agree on interfaces to resources from OGSA-DAI (databases) to particular data structures (GML/OpenGIS)
  – specify in XML
• Implement Resources and Capabilities as Services
  – User Interface should be a portlet that can be integrated by the portal into web interface
• Make certain overarching Grid capabilities such as workflow, federation and metadata are sufficient
• SERVO Grid is a prototype of this strategy using several US sites rather than several countries
  – Can be naturally extended to iSERVO, education, emergency response by extending resources
• WS-I+ Web Service Architecture ensures continued interoperability and extensibility

Grid Syntax Controversies
• There are several proposals for the Web Service extensions needed for Grids
  – OGSI (GT3), WSRF (GT4), WS-GAF (Newcastle)
  – We adopt a wait and see philosophy
• We use WS-I+ Pure Web Services approach that adopts a minimum set of ~7 Web Service specifications, choosing from 60 or so proposed in the last few years
  – Those adopted by Industry wide WS-I Web Service Interoperability group
  – Those declared by IBM and Microsoft
  – Any extra absolutely essential
  – This approach adopted by next phase of UK e-Science Program

Performance and Streaming
[Figure: WS-1, WS-2]
• Web Services are meant to exchange messages using SOAP which is very interoperable but very slow
  – Drastically reduces effective bandwidth
• Most real programs exchange data via reading and writing binary files
  – Increases latency
• All Control Messages should use classic SOAP
• All data messages use optimal binary
  – Respect “SOAP Infoset” (Header and Body of Message)
• Use streaming not file-based infrastructure to give better latency and same technology for files and streaming sensors
  – Similar to using UNIX Pipes not directly files
  – http://www.naradabrokering.org

SERVOGrid Web Portal
Package every Web Service with its own user interface as a document fragment
Portlets are underlying technology
OGCE Open Grid Computing Environment is developing lots of useful portlets
  • Computing
  • GIS
  • Access Grid etc.
[Figure: layers: Aggregate Portals; Portlet User Interface Components; Application Web Services and Workflow; Core Web Services]

Portal Architecture
[Figure: diagram. Labels: Clients (Pure HTML, Java Applet ..), Aggregation and Rendering, Portlet Class: WebForm, Portlet Class, Portal, Portal Internal Services, Portlets, SERVOGrid (IU), Remote or Proxy Portlets, Web/Grid service, Computing, Data Stores, Instruments, GridPort etc., (Java) COG Kit, Local Portlets, Libraries, Hierarchical arrangement, Services, Resources]

Each Service has its own portlet
Individual portlet for the Proxy Manager
Use tabs or choose different portlets to navigate through interfaces to different services
2 Other Portlets
OGCE Consortium
SERVOGrid Portal Screen Shots
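
Illustrative sketch: control vs. data messages
The split proposed under Performance and Streaming (classic SOAP for control messages, compact binary for data, streamed pipe-style instead of written to files) might look roughly like the Python sketch below. This is only an illustration under stated assumptions: the function names (control_message, pack_samples, unpack_samples, stream_chunks), the SubmitJob action, the jobId element, and the length-prefixed little-endian wire format are all hypothetical and are not taken from SERVOGrid or NaradaBrokering.

```python
import struct
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def control_message(action, job_id):
    """Build a small SOAP envelope for a control message.

    The action name and the jobId element are hypothetical.
    """
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, action)
    ET.SubElement(request, "jobId").text = job_id
    return ET.tostring(envelope, encoding="unicode")


def pack_samples(samples):
    """Encode floats as a 4-byte count followed by little-endian doubles.

    This wire format is an assumption made for the sketch.
    """
    return struct.pack(f"<I{len(samples)}d", len(samples), *samples)


def unpack_samples(payload):
    """Decode a payload produced by pack_samples."""
    (count,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from(f"<{count}d", payload, 4))


def stream_chunks(values, chunk_size):
    """Yield binary chunks one at a time, pipe-style, rather than one file."""
    for start in range(0, len(values), chunk_size):
        yield pack_samples(values[start:start + chunk_size])


if __name__ == "__main__":
    # Control goes as (verbose, interoperable) XML; data goes as binary chunks.
    print(control_message("SubmitJob", "job-1"))
    for chunk in stream_chunks([0.5, 1.25, -3.0, 8.0, 2.0], chunk_size=2):
        print(len(chunk), unpack_samples(chunk))
```

A consumer can begin decoding the first chunk before the producer has generated the last one, which is the latency argument the slide makes for streaming over file-based exchange.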