GEON: A Cyberinfrastructure Facility to Advance Research and Education
Dogan Seber
San Diego Supercomputer Center
University of California, San Diego
www.geongrid.org

Acknowledgements
Chaitan Baru, Sandeep Chandra, Efrat Frank, Kai Lin, Ashraf Memon, Choonhan Youn

Outline:
• Cyberinfrastructure and the Geosciences
• GEON Cyberinfrastructure
• GEON Services and Access Mechanisms
• Impact on Science and Education
  • Synthetic seismogram calculation
  • LiDAR data processing
  • 3D and 4D visualizations
• Future activities

Enabling Scientific Discoveries: Pathway to Discovery
Diagram labels: Data, Access, Knowledge, Process, Analyze, Interpret, Discovery
How does cyberinfrastructure help?

GEON’s Vision
• Enable new discoveries in the geosciences through an easy-to-use integration environment built on state-of-the-art information technology resources.
The end product is a system that:
• Enables new interdisciplinary research and encourages resource sharing
• Stores and seamlessly integrates distributed data sets, software, and tools
• Allows users to search for data and information not only by name but also by concept
• Provides high-end computational resources
• Provides a customizable analysis and research environment (workflow)
• Is a rich resource for teachers and learners

GEON Project
• NSF Large ITR project: a collaborative effort
• GEON is creating an IT infrastructure to “enable” interdisciplinary geoscience research for the entire community
• Project started on October 1, 2002 (after three years of preparation) and will continue until September 30, 2007

GEON Cyberinfrastructure Principles
• An equal partnership
  • IT works in close conjunction with science
  • CI: Support the “day to day” conduct of science (e-science), in addition to “hero” computations
• The “two-tier” approach
  • Use best practices, including use of commercial tools and open standards, where applicable…
  • …while developing advanced technology, and doing CS research
• Create shared “science infrastructure”
  • Integrated online databases, with advanced search and query engines
  • Online models, robust tools and applications
• Leverage from and work with other intersecting projects
  • Much commonality in the technologies, regardless of science disciplines

GEON Cyberinfrastructure (architecture diagram)
Components shown in the diagram:
• GEON Portal; MyGEON; GEONsearch; GEONworkbench (workflow, visualization, HPC); Registration
• Web/Grid Services Interfaces
• Data Registration Services; Data Integration Services; Mapping Services (ArcIMS, WMS, WFS)
• Ontology Enabled Integration (Spatial, Temporal, Conceptual)
• Metadata Services; GEON Catalog; Indexing Services
• Computational & Modeling Services (Modeling, Analysis Tools)
• Logging Services; Usage Stats Collection & Analysis; Other Core Services
• Data Services: Postgres, SRB, OpenDAP, mySQL, DB2, Others
• Physical Grid: GridFTP, OGSA-DAI, CSF, RedHat Linux, ROCKS, OGSI, Internet, I2

GEON Nodes (map figure)
• Labels: Geological Survey of Canada, Chronos, USGS, ESRI, CUAHSI
• Legend: PoP node; 5-node cluster, 4 Tb; Partner Projects; Partner services

Node Deployment Architecture
• Hardware Deployment
  • Each site runs a PoP
  • Optional cluster and data resources
• Users access resources through PoP
  • PoP provides point of entry
  • PoP provides access to global services in GEON
• Developers add services & data hosted on GEON resources
  • Web services/Grid services

GEON Software Stack (GEONGrid Software Stack Version 1.0)
• Base OS
  • Rocks: highly programmatic software configuration management
• Development
  • Globus 4.0.2
  • Web Services (Jakarta-tomcat-5.0.28, axis-1.1, ant-1.6, jdk1.4.2)
  • GridSphere 2.0.2 Portal Framework
• Database
  • Postgres 8.0.3
  • PostGIS 1.0.2 (Geos, Proj)
• GIS/Mapping
  • Grass 6.0.2, GMT
• Security
  • Tripwire, chkrootkit
• System Monitoring
  • INCA Testing and Monitoring framework (Teragrid)
    • With GRASP benchmarks
  • Network Weather Service (NWS)
  • Ganglia
• Job Submission and Monitoring
  • Condor, PBS
Stack diagram labels: GridSphere Portal; GRASS (GDAL, NetCDF, Tiff); GMT; PBS; Condor; NWS; INCA/GRASP; OGSA-DAI; NMI; Globus; OGSA; Axis; Tomcat; Postgres; PostGIS; Geos; Proj; Ant; Samba; JDK; Tripwire; Rocks 4.2.1 based on RedHat Enterprise Linux

User Entry Points: GEON Web Site and the Portal
• Portal Framework: GridSphere (www.gridsphere.org); JSR168 compliant

Security Infrastructure
• Portal users need access to various Grid-enabled resources for job submission, data management, instrument control, etc.
• Standard security mechanism is GSI (Grid Security Infrastructure). Typically involves:
  • Creation of credentials for a new user
  • Storage of a proxy in MyProxy by user
  • Retrieval of proxy upon user login to portal
  • Configuration of resources to accept credentials

GAMA: Grid Account Management Architecture
• Install command-line security infrastructure on a dedicated, locked-down machine (GAMA server)
• Wrap tools as Web Services on GAMA server
• Construct GridSphere portlets and services for submitting and managing account requests from users on a portal server
• Configure GridSphere to automatically retrieve a proxy from the GAMA server when a user logs on to the portal

Primary GEON Services*
• Resource Registration: GEONcontribute
• Resource Discovery: GEONsearch
• Data Integration: GEON Integration Cart
• Personalized Access: myGEON
• Applications: GEONtools
* All services are accessible via the portal; some are available as “remote services” to other client applications.
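
The login flow listed under Security Infrastructure and GAMA (create credentials, store a proxy, retrieve it at portal login, have resources accept it) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: it does not use the real GSI, MyProxy, or GAMA interfaces, and every class and method name below (`ToyCredentialServer`, `ToyResource`, `Proxy`, and so on) is hypothetical and invented for illustration.

```python
"""Toy model of the portal login flow. Not real GSI, MyProxy, or GAMA code."""
import secrets
from dataclasses import dataclass, field


@dataclass
class Proxy:
    """Stand-in for a short-lived proxy credential (hypothetical structure)."""
    user: str
    token: str
    expires_at: float

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


@dataclass
class ToyCredentialServer:
    """Plays the role of the locked-down account server (hypothetical API)."""
    lifetime_s: float = 3600.0
    _passwords: dict = field(default_factory=dict, init=False)
    _proxies: dict = field(default_factory=dict, init=False)

    def create_account(self, user: str, password: str, now: float) -> None:
        # Steps 1-2: create credentials and store a proxy for the new user.
        # A real server would never keep plain-text passwords; this is a toy.
        self._passwords[user] = password
        self._proxies[user] = Proxy(user, secrets.token_hex(8), now + self.lifetime_s)

    def retrieve_proxy(self, user: str, password: str) -> Proxy:
        # Step 3: the portal fetches the stored proxy when the user logs in.
        if self._passwords.get(user) != password:
            raise PermissionError("bad username or password")
        return self._proxies[user]

    def knows(self, proxy: Proxy) -> bool:
        stored = self._proxies.get(proxy.user)
        return stored is not None and stored.token == proxy.token


@dataclass
class ToyResource:
    """A Grid resource configured to trust proxies from one server (step 4)."""
    trusted: ToyCredentialServer

    def submit_job(self, proxy: Proxy, job: str, now: float) -> str:
        if not self.trusted.knows(proxy) or not proxy.is_valid(now):
            raise PermissionError("proxy rejected")
        return f"accepted {job!r} for {proxy.user}"


if __name__ == "__main__":
    server = ToyCredentialServer(lifetime_s=10.0)
    server.create_account("alice", "pw", now=0.0)
    proxy = server.retrieve_proxy("alice", "pw")  # happens at portal login
    print(ToyResource(server).submit_job(proxy, "synseis", now=5.0))
```

The caller passes `now` explicitly so that proxy expiry can be exercised deterministically; a real deployment would rely on certificate lifetimes rather than a token comparison.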
System Architecture: User View (diagram)
Diagram labels:
• Other distributed apps (Kepler, DLESE, …): Client Access (via Web Services)
• User Access (via Portal)
• ResourceRegistration: myOntology.owl, myDataset.foo, metadata; GEON Catalog
• GEONsearch: search condition(s): spatial, temporal, concept
• GEONworkbench: user actions: add, delete, manipulate; GEON Workspace (user); SRB
• Log
• GEONmiddleware
• External services: Gazetteer, DLESE, …; Geologic Age, Chronos, …

GEON Resource Registration System
• Hosted registration: The contributor provides a full copy of the resource to the GEON network, and the network maintains and archives it
• Non-hosted registration: Users provide access to a resource (e.g., an RDBMS), but GEON does not keep a copy and accesses the resource remotely. Full functionality is available in the integration area.
• Public, private, and group registrations
• Register ontologies (domain knowledge) and ontology articulations
• Optionally register datasets to ontologies

Resource Discovery in GEON: A Search Engine for Users
• Metadata-based search
• Spatial-coverage-based search
• Temporal-coverage-based search
• Concept-based search

GEON Basic Search

GEON Advanced Search

Resource Usage Statistics

GEON Semantic Mediator (diagram labels)
• Data sources: Oracle, DB2, SQL Server, MySQL, PostgreSQL, PostGIS
• Query Execution; Query Optimization; Query Planning; Internal Database; SQL Parser
• Spatial SQL against federated schemas; Mediator JDBC Driver
• SOQL GUI; Portal or Application
• SOQL Processor: Semantic Query Rewriter, SOQL Parser, Ontology Reasoner
• ODAL Processor; ODAL; OWL

GEON tools: TeraGrid Science Gateway
• Synthetic Seismogram Calculation Tool
• LiDAR Data Analysis Tool

SYNSEIS: A grid application in GEON
• SYNSEIS is a SYNthetic SEISmogram calculation tool built as part of the GEON system
• Uses E3D in the background
• Enables 2D and 3D seismic waveform simulations using a service-oriented architecture
• Uses both local and national computational platforms such as TeraGrid
• Integrated with GEON resources, allowing use of archival and storage resources

SYNSEIS Components (diagram labels)
• IRIS: Earthquakes, Stations, Waveform
• SYNSEIS; e3d
• HPC Centers: NCSA, SDSC
• Digital Libraries/GEON Data Grid: Earth model

SYNSEIS Architecture (diagram labels)
• GEONGrid Portal: MyGEON, GEONTools, Synseis Portlet
• SYNSEIS Computations; Macromedia Flash GUI; Map Server
• Web Services: Earth Model Service; Job Submission/Monitoring and File Service; Data Archives Service
• Grid Services; Grid FTP; Data Repository; JDBC; Job Database; IIOP/CORBA
• HPC Resources; IRIS DMC

SYNSEIS Portlet

GEON SYNSEIS INTEGRATION PLATFORM (diagram labels)
• EarthScope data; Observation; Analyses and Integration
• Earthquake parameters; Seismic attenuation; Subsurface; Etc.
• Model; Simulation; Scientific Discoveries
• GEON portal and HPC Environment

LiDAR Data Analysis

Example: LiDAR Workflow
Courtesy: Chris Crosby, ASU; D. Harding, NASA
• Survey
• Point Cloud: x, y, z, …
• Interpolate / Grid
• Analyze / “Do Science”

A Three-Tier Architecture
• GOAL: Efficient LiDAR interpolation and analysis using GEON infrastructure and tools
  • GEON Portal
  • Kepler Scientific Workflow System
  • GEON Grid
• Use scientific workflows to glue/combine different tools and the infrastructure
Tier labels in the figure: Portal, Grid

LiDAR Processing: Three Tier Architecture (diagram labels)
• Input: x,y,z and attribute; raw data; Client/NFS Mounted Disk; DB2; DB2 Spatial query
• GEON Portal: Map Parameters; Parameter xml; Create Workflow Description; submit; Download data
• Mapping: Render Map; Grass Functions; ArcSDE; ArcInfo; ArcIMS
• KEPLER WORKFLOW: process output; Map onto the grid (Pegasus); Compute Cluster
• Grass surfacing algorithms: Spline, IDW, block mean
• Outputs: Binary grid, ASCII grid, Text file, Tiff/Jpeg/Gif, …

Example Outputs

Visualization - Integration
Putting it all together. The first step to true integration is to remove the barriers to accessing and importing the data, and then to visualizing it. Below are some examples of multidimensional data displayed in three dimensions with the GEON IDV: Yellowstone (Smith and others) and the geodynamics of the mantle (McNamara).

4D Visualization
Images from the GEON IDV: 170 Ma, 90 Ma, 10 Ma
• Geodynamics: McNamara
• Paleomagnetics: Schettino, Scotese
• Paleogeography: Blakely

i-GEON: International GEON
• GEON has started international collaborations
• GEON nodes are now operational in Japan, China, and India
• Two international GEON workshops were held
  • India, October 2005
  • China, July 2006

GEON Cyberinfrastructure Summer Institute Series
The Cyberinfrastructure Summer Institute for Geoscientists (CSIG) series is designed to introduce geoscientists to commonly used as well as emergent information technology (IT) tools.
Topics covered include Data Modeling, Web Services, Geographic Information Systems, key concepts in Grid Computing, Parallel Programming, and Scientific Workflows.
• CSIG 2006, August 2006, San Diego, CA
• CSIG 2005, July 2005, San Diego, CA
• CSIG 2004, August 2004, San Diego, CA
Webcasts of the Summer Institutes are available at:
• http://www.geongrid.org/CSIG04/
• http://www.geongrid.org/CSIG05/
• http://www.geongrid.org/CSIG06/
The 2007 Summer Institute is being planned for August 2007.