ESCI meeting

GEON IT Update
PI Meeting, Blacksburg, VA
March 21-23, 2004
GEON PI Meeting, March 21-23, 2004, Blacksburg, VA
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
Outline
• Update on state of IT activities
• GEON Software Architecture and project high-level goals
• Update on activities at SDSC (since last meeting)
• The GEON Portal
• Knowledge representation
• Development of knowledge structures
• Schemas (metadata is implicit in these), controlled
vocabularies, ontology structures…
Components of the GEONgrid Architecture
• GEONgrid Physical Implementation
• Core Grid Services
• Registry, authentication, access control, monitoring, replication,
distributed filesystem, collection management (SRB), job submission,
e.g. launch job to TeraGrid
• “Higher-Order” Services
• Registration: data and metadata, schema, ontology, services
• Data Integration: spatial data integration, data systems integration,
schema integration
• 2D Visualization, including GIS
• Workflow
• 3D Viz, Augmented Reality
• Portal
• Portlet-based design. User space, GeonSearch, GeoWorkbench.
GEON Software Architecture activities
• Architecture “Retreat” at SDSC, Jan 27-28
• Architecture document in preparation
• Established GEON software development areas with
Coordinator and Chief Programmers for each
• Each group meets once a week.
• Chief programmers meet once a week, on Monday
• Would like to develop a schedule of visits by GEON PIs
to SDSC
• To attend Monday meetings
• GEONgrid software
• Plans for 6 months, 1 year, 2 years
• Release 1 by Dec. 2004
GEON Software Development Areas
Development Area         Coordinator          Chief Programmer
1. Core Grid Services    Karan Bhatia         Sandeep Chandra
2. Portal                Dogan Seber          Choonhan Youn
3. Data Registration     Bertram Ludaescher   Kai Lin
4. Mapping               Ilya Zaslavsky       Ashraf Memon
5. Mediation             Pavel Velikhov       Pavel Velikhov
6. Workflow              Bertram Ludaescher   Efrat Jaeger
Also, Jane Park, Doug Greer, LJ Ding, and others …
GEONgrid Physical Implementation
• PoP Nodes only
• VaTech, Bryn Mawr, Penn State, Rice, Utah EGI, Utah,
DLESE, UNAVCO
• PoP nodes + Data Nodes
• Idaho, Arizona State, SDSC
• PoP nodes + Compute Nodes
• Missouri, UTEP, SDSC
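The node layout above can be written down as a small lookup table. The following is a minimal sketch, not part of any GEON software: the `Role` enum, the `SITES` table, and the `sites_with` helper are names made up for illustration, and the roles are simply transcribed from the lists on this slide.

```python
from enum import Flag, auto


class Role(Flag):
    """Roles a GEONgrid site can play; a site may combine several."""
    POP = auto()
    DATA = auto()
    COMPUTE = auto()


# Site roles as listed on the slide; every listed site has a PoP node.
# This table is an illustrative transcription, not an official inventory.
SITES = {
    "VaTech": Role.POP,
    "Bryn Mawr": Role.POP,
    "Penn State": Role.POP,
    "Rice": Role.POP,
    "Utah EGI": Role.POP,
    "Utah": Role.POP,
    "DLESE": Role.POP,
    "UNAVCO": Role.POP,
    "Idaho": Role.POP | Role.DATA,
    "Arizona State": Role.POP | Role.DATA,
    "Missouri": Role.POP | Role.COMPUTE,
    "UTEP": Role.POP | Role.COMPUTE,
    "SDSC": Role.POP | Role.DATA | Role.COMPUTE,
}


def sites_with(role: Role) -> list[str]:
    """Return the sorted names of sites that provide every role in `role`."""
    return sorted(name for name, roles in SITES.items() if role in roles)
```

For example, `sites_with(Role.DATA)` lists the sites that could hold hosted databases, and `sites_with(Role.DATA | Role.COMPUTE)` lists sites that offer both.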
GEON Nodes
• Compute nodes
• Want to create at least a few nodes as a TeraGrid
“sandbox”
• GEONgrid is currently based on Red Hat Linux, OGSI and
Globus Toolkit Version 3 (GT3)
• TeraGrid is currently based on SuSE Linux, GT2.4
• Sandbox allows GEON PIs to develop and debug software in
GEONgrid before sending jobs to TeraGrid
• GEON has a TeraGrid allocation (30,000 hours)
• Need to keep in mind GEONgrid heterogeneity
• Windows and other platforms
GEON Services
• “Hosted” vs “non-hosted” services
• Hosted: service is implemented within the physical GEONgrid
environment (i.e. on one of the systems).
• The implementation can benefit from core capabilities provided in
GEONgrid, e.g. replication, load-balancing
• Need at least a PoP node to host a service
• Hosted databases will be stored at Data Nodes, but may
be replicated at one or more PoP nodes
• Data nodes
• Require Internet2 connectivity
• Will be backed up to SDSC (figuring out details)
• Will be replicated among themselves (need to figure out details)
Core Grid Services
• Registry:
• a place to register and find not only basic Web services but all services (e.g.
PGAP, Gravity Database, Seismic Simulation Tool, …)
• Authentication:
• using GEON Certificate Authority and Grid certificates
• Initially (i.e. in 2004), use certificates only at the portal. Very few services, if any,
may actually validate against Grid certificates
• Access control:
• investigating various systems for policy-based access to services
• Data replication:
• initial target is IBM GMR software for replicating files as well as databases
• Support for various data systems:
• e.g., SDSC Storage Resource Broker (SRB) and OpenDAP
• Perhaps implement servers at Data Nodes
• Job submission, e.g. launch job to TeraGrid.
• Leverage NMI funding. New proposal under NMI.
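To make the registry idea concrete, here is a minimal in-memory sketch of "register and find". It is an illustration only and does not reflect the actual GEON registry design: the `ServiceEntry` and `Registry` classes, the keyword tags, and the `example.org` endpoints are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceEntry:
    name: str
    endpoint: str  # hypothetical URL, for illustration only
    keywords: set[str] = field(default_factory=set)


class Registry:
    """In-memory stand-in for a service registry (assumed design)."""

    def __init__(self) -> None:
        self._entries: dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        # Re-registering a name replaces the earlier entry.
        self._entries[entry.name] = entry

    def find(self, keyword: str) -> list[str]:
        """Return names of services tagged with `keyword` (case-insensitive)."""
        key = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if key in {k.lower() for k in e.keywords}
        )


registry = Registry()
registry.register(ServiceEntry(
    "Gravity Database", "https://example.org/gravity", {"gravity", "database"}))
registry.register(ServiceEntry(
    "Seismic Simulation Tool", "https://example.org/seismic", {"seismic", "simulation"}))
```

A call such as `registry.find("Gravity")` then returns the names of matching services. A real registry would also need persistence and access control, which this sketch leaves out.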
Higher-Order Grid Services
• Registration
• Data and metadata, schema, ontology, services
• Important in order to support search functionality
• Data Integration
• Defining “views” across multiple sources
• Multiple database schemas, e.g. in Chronos (Paleostrat, Neptune,
Paleobiology), PAST?, Geochemistry (Navdat, PetDB, …)
• Multiple maps and map layers
• GIS and 2D Viz
• Integrating map layers. “Simple” mapping service.
• SVG-based data access and visualization tools
• Workflow
• Iconic representation of databases and tools
• Ability to link together tools and data to specify computations
• Based on Kepler system
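The workflow idea of linking tools and data to specify a computation can be sketched as a small dependency graph that runs each step once its inputs are ready. This is a generic illustration using Python's standard `graphlib` module (Python 3.9+); it is not Kepler's API, and the step names and functions below are hypothetical placeholders.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Each step maps a name to (function, names of upstream steps).
# The step names and functions are hypothetical placeholders.
Step = tuple[Callable[..., object], list[str]]


def run_workflow(steps: dict[str, Step]) -> dict[str, object]:
    """Run every step after its inputs, passing upstream results as arguments."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    results: dict[str, object] = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        results[name] = func(*(results[d] for d in deps))
    return results


workflow = {
    "load_samples": (lambda: [2.0, 4.0, 6.0], []),
    "load_scale": (lambda: 0.5, []),
    "rescale": (lambda xs, k: [x * k for x in xs], ["load_samples", "load_scale"]),
    "summarize": (lambda xs: sum(xs) / len(xs), ["rescale"]),
}
results = run_workflow(workflow)
```

Here the two "load" steps stand in for databases, while "rescale" and "summarize" stand in for analysis tools; the dependency lists play the role of the links drawn between icons.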
GEON Portal
• Exact “look and feel” and core functionality are still a work in
progress
• Portal components for:
• GeonSearch, GeoWorkbench, Rocky Mountain Testbed, Mid-Atlantic
Testbed, GEONSystems, GEON Docs, EOT
• Portlet-based design is meant to make it easier to create
customized portals (“building blocks” approach)
• Rocky Mountain and Mid-Atlantic are examples of customized
portlets
• Portal software is distributed to each PoP node. Can be
customized at each PoP node.
GeonSearch
• Ad hoc search versus querying of preestablished “views”
• Ad hoc Search
• Search/discover information on data, services,
experiments, “other” (e.g., people, organizations)
• Display results via map interfaces, semantic graphs
• View-based querying
• E.g., use ad hoc search to find a set of databases, map
layers of interest; define a specific way of combining data
across these various sources
• Need good use cases
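The two modes above can be contrasted in a short sketch: an ad hoc search discovers sources by keyword, and a view then fixes one specific way of combining the sources that were found. Everything here is an assumption made for illustration, including the `CATALOG` contents, the record fields and values, and the `ad_hoc_search` and `define_view` helpers; none of it describes the actual GeonSearch design.

```python
# Hypothetical catalog: source name -> descriptive keywords and a few records.
# All names and values are invented for illustration.
CATALOG = {
    "gravity_db": {"keywords": {"gravity", "geophysics"},
                   "records": [{"site": "A", "gravity_mgal": -120.5}]},
    "geochem_db": {"keywords": {"geochemistry", "samples"},
                   "records": [{"site": "A", "sio2_pct": 61.2},
                               {"site": "B", "sio2_pct": 48.9}]},
    "roads_layer": {"keywords": {"map", "roads"}, "records": []},
}


def ad_hoc_search(terms: set[str]) -> list[str]:
    """Return sources whose keywords overlap the search terms."""
    return sorted(n for n, src in CATALOG.items() if src["keywords"] & terms)


def define_view(sources: list[str], key: str):
    """Return a query function that merges records from `sources` on `key`."""
    def query() -> dict[str, dict]:
        merged: dict[str, dict] = {}
        for name in sources:
            for rec in CATALOG[name]["records"]:
                merged.setdefault(rec[key], {}).update(rec)
        return merged
    return query


found = ad_hoc_search({"gravity", "geochemistry"})
site_view = define_view(found, key="site")
```

Calling `site_view()` returns one merged record per site, so the view can be queried repeatedly without repeating the discovery step.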
GeoWorkbench
• Workbench “capabilities”
• Data and service registration
• Create spatial, temporal, concept-based indexes as part of
registration process
• Ability to define views, e.g. using GeonSearch to find data,
services, etc.
• Run analysis routines, e.g. via workflow specifications,
using Kepler
• Visualize output, save output, feed output to other services
• Need good use cases
• Current portal is very much a work in progress
• Figuring out functional components
• Which function goes under what part of the portal, etc.
Knowledge Representation