GEON: The User Perspective

Choonhan Youn
Dogan Seber, Chaitan Baru, Ashraf Memon
San Diego Supercomputer Center,
University of California at San Diego
GEON (GEOscience Network)
• A cyberinfrastructure project for geosciences
funded by NSF ITR.
• Creating an IT infrastructure to “enable”
interdisciplinary geoscience research: the entire
community will benefit, not just one group of
researchers
• Vision: Enable new discoveries in the geosciences
by building an easy-to-use and “comprehensive”
data, software, tools, and information network by
utilizing state-of-the-art information technology
resources.
Current GEON member institutions
Members
• Arizona State University
• Bryn Mawr College
• Penn State University
• Rice University
• San Diego State University
• San Diego Supercomputer Center / University of California, San Diego
• University of Arizona
• University of Idaho
• University of Missouri, Columbia
• University of Texas at El Paso
• University of Utah
• Virginia Tech
• UNAVCO, Inc.
• Digital Library for Earth System Education (DLESE)
Partners
• California Institute for Telecommunications and Information Technology (Cal-IT2)
• Chronos
• CUAHSI
• ESRI
• Geological Survey of Canada
• Georeference Online
• IBM
• Kansas Geological Survey
• Lawrence Livermore National Laboratory
• U.S. Geological Survey (USGS)
Other Affiliates
• Southern California Earthquake Center (SCEC), EarthScope, IRIS, NASA
GEOSCIENCE CHALLENGES
• Exponential Increase in Data Volume
– How to manage vast amounts of data so that they can be used by all scientists in an
easy-to-use environment
• Data Storage, Access and Preservation
– How to build a framework to exchange data and help preserve collected data sets
• Data Integration (semantic and syntactic)
– How to merge multiple geology maps to make a seamless (“integrated”) map
• Computational Challenges
– How to build a system that helps scientists run advanced software without having access
to significant resources (computers and technical), focusing on the science problem
• Advanced Visualization (3D/4D)
– How to build a visualization system that helps scientists analyze large and complex
data sets dynamically
• Archiving and publication of results with reusable components (reusability)
– How to preserve scientific results and help others to repeat the analysis as efficiently as
possible?
GEON Cyberinfrastructure (CI) Principles
• CI: Support the “day to day” conduct of science (e-science), in addition to “hero” computations
• An equal partnership
– IT works in close conjunction with science
• Create shared “science infrastructure”
– Integrated online databases, with advanced search and query
engines
– Online models, robust tools and applications
• Leverage from other intersecting projects
– Much commonality in the technologies, regardless of
science disciplines, e.g. BIRN, SEEK, and many others
Main e-Research facilities I
• A Resource Registration System for Data Providers
– Register ontologies (domain knowledge) and ontology articulations
– Register datasets with metadata including data access information
– Optionally register datasets to ontologies, which is crucial for data integration
and smart search (ontology-enabled semantic integration)
– Supported resource types: Shapefile, ASCII, Excel, GMT raster, GeoTIFF, relational
database, PDF, tool, WMS service, Web service, etc.
• A Search Engine for Data Users
– Metadata-based search
– Spatial coverage-based search
– Temporal coverage-based search
– Concept-based search
– Ontology-based data discovery
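The registration and search facilities listed above can be illustrated with a minimal in-memory sketch. It is not GEON's implementation: the `Resource` and `Catalog` classes, their field names, and the two sample records (including their bounding boxes and dates) are all hypothetical, made up only to show how metadata, spatial, temporal, and concept conditions might combine in one query.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Resource:
    """One registered resource. All field names are hypothetical."""
    name: str
    resource_type: str            # e.g. "Shapefile", "WMS service"
    bbox: tuple                   # (west, south, east, north) in degrees
    period: tuple                 # (start, end) as datetime.date
    keywords: set = field(default_factory=set)
    concepts: set = field(default_factory=set)  # optional ontology concepts


def boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def periods_overlap(a, b):
    """True if two (start, end) date ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


class Catalog:
    """Toy stand-in for a metadata catalog (assumption, not the GEON Catalog)."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def search(self, keyword=None, bbox=None, period=None, concept=None):
        """Return names of resources matching every condition that is given."""
        hits = []
        for r in self._resources:
            keywords = {k.lower() for k in r.keywords}
            if keyword is not None and keyword.lower() not in keywords:
                continue
            if bbox is not None and not boxes_overlap(r.bbox, bbox):
                continue
            if period is not None and not periods_overlap(r.period, period):
                continue
            if concept is not None and concept not in r.concepts:
                continue
            hits.append(r.name)
        return hits


# Illustrative records only; the values are invented for this sketch.
catalog = Catalog()
catalog.register(Resource(
    name="va_igneous_rocks",
    resource_type="Relational Database",
    bbox=(-83.7, 36.5, -75.2, 39.5),
    period=(date(1990, 1, 1), date(2004, 12, 31)),
    keywords={"igneous", "geochemistry"},
    concepts={"PlutonicRock"},
))
catalog.register(Resource(
    name="socal_lidar_tile",
    resource_type="ASCII",
    bbox=(-118.0, 33.5, -116.0, 35.0),
    period=(date(2005, 1, 1), date(2005, 12, 31)),
    keywords={"lidar", "topography"},
))

print(catalog.search(concept="PlutonicRock"))
print(catalog.search(bbox=(-120.0, 32.0, -115.0, 36.0), keyword="LiDAR"))
```

In this sketch a dataset registered without ontology concepts can still be found by metadata, space, or time, but never by concept, which is one way to read the slide's remark that registering datasets to ontologies is what enables smart search.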
Main e-Research facilities II
• The user workspace, called myGEON area.
– Users are able to search and collect their data sets from the
GEON search engine and integrate them.
– For example, users can review and analyze "SYNSEIS"
outputs generated by their submitted jobs.
• Computational HPC
– SYNSEIS (Synthetic Seismogram toolkit)
• Workflow
– LiDAR: an end-to-end solution for the distribution, interpolation
and analysis of LiDAR / ALSM point data.
– Atype workflow: generates a map of all plutonic bodies in
Virginia from the VA Igneous rocks database, based on
user-supplied inputs.
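The interpolation step of a LiDAR workflow turns scattered (x, y, z) returns into a regular elevation grid. The slides do not say which interpolation method GEON uses, so the sketch below swaps in inverse distance weighting (IDW) as a simple stand-in; the function name `idw_grid` and all of its parameters are assumptions made for illustration.

```python
import math


def idw_grid(points, x_min, y_min, cell, n_cols, n_rows, power=2.0):
    """Interpolate scattered (x, y, z) points onto a regular grid with IDW.

    Returns a list of rows; grid[row][col] is the estimate at that cell's
    centre. This brute-force version visits every point for every cell, so
    it is only suitable for small examples.
    """
    if not points:
        raise ValueError("at least one point is required")
    grid = []
    for row in range(n_rows):
        y = y_min + (row + 0.5) * cell
        out_row = []
        for col in range(n_cols):
            x = x_min + (col + 0.5) * cell
            weighted_sum = 0.0
            weight_total = 0.0
            exact = None
            for px, py, pz in points:
                d = math.hypot(px - x, py - y)
                if d == 0.0:
                    exact = pz  # a point sits on the cell centre: use it as is
                    break
                w = 1.0 / d ** power
                weighted_sum += w * pz
                weight_total += w
            out_row.append(exact if exact is not None else weighted_sum / weight_total)
        grid.append(out_row)
    return grid


# Two made-up points, one grid cell centred halfway between them.
sample = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw_grid(sample, x_min=0.5, y_min=-0.5, cell=1.0, n_cols=1, n_rows=1))
```

Real LiDAR / ALSM point clouds are far too large for this approach; a production workflow would partition the data spatially and limit each cell to nearby points.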
Constraints for main e-Research facilities
• Dynamic workflow issues arising from GEON's
web-based system
• Large computational clusters are needed to run
GEON simulations
– GEON has three small cluster nodes at partner sites
GEON Portal Usability
• Ease of use
– GEON Search, SYNSEIS, and many others
• Make complex tasks easy to specify
– LiDAR
• Highly interactive
– SYNSEIS
• Integrated access to tools and resources
– myGEON, Mapping Integration
Computational HPC
for SYNSEIS
Lessons Learnt
• Its main strengths
– Standard-compliant ways
– Using open source libraries and tools for most of
implementations
• Its main weaknesses
– Difficulty building highly interactive, user-friendly
interfaces within the portlet framework
• Would you consider alternatives to a portal
solution?
– Currently, no
Future Plan
• Will add and develop new functionality based on
requests from GEON PIs and the geoscience
community.
• Will keep improving the portal usability.
– For example, in case of SYNSEIS, add more user
capabilities in the user interface for complex earthquake
simulations.
• Will expand its use within the geoscience community
internationally
– Focus on GEON PIs first
GEON: The Developer Perspective
Methods of GEON’s Design
• Several workshops were held with participation
from scientists in different disciplines, such as
geochemistry and geophysics.
• Principal Investigators (PIs) also visit SDSC for
focused discussions on their requirements.
• Prototypes are built from the gathered requirements,
and then the spiral model of software development is
followed to enhance the prototypes.
Service-Oriented Approach
Priority of Functional and Non-functional
Requirements
• Start with functional requirements from the
principal investigators or a local geoscience PI
• Prototypes are built and functional requirements
are tested
• Then focus on non-functional requirements such as
usability
Technical Strategy
• The “two-tier” approach
– Use best practices, including use of commercial tools and
open standards, where applicable…
• start with development using the technology available now
– …while developing advanced technology, and doing CS
research
• push for open source and best practices as much as possible
GEONSearch, Registration, myGEON Portlet
[Figure: architecture diagram. Entry points: User Access (via Portal) and Client Access (via web services). Labelled components: ResourceRegistration (inputs myOntology.owl and myDataset.foo, each with metadata), GEONsearch (search conditions: spatial, temporal, concept), myGEON (user actions: add, delete, manipulate), GEON Catalog, GEON Workspace (user), SRB, Log, and GEONmiddleware. Also shown: other distributed apps (Kepler, DLESE, …) and external services (Gazetteer, DLESE, …; Geologic Age, Chronos, …).]
SYNSEIS toolkit
[Figure: architecture diagram. Entry point: User Access (via Web Browser) to the GEONGrid Portal over HTTP. Labelled components: SYNSEIS Portlet, myGEON Portlet, Flash Application, Web Services, SAC Service, Data Model Service, Job Submission/Monitoring and File Service, Grid Services, Data Repository, Job Database, HPC Resources (TeraGrid clusters), Cornell Map Server, Data Archives Service, and IRIS DMC. Labelled protocols: SOAP, GridFTP, JDBC, and CORBA (IIOP).]
Development Issues
• Constraints
– Interoperability issues due to use of existing tools
• Existing tools developed in Fortran, some machine-dependent
algorithms and code, and GRASS-based GIS processing
• Incompatible implementations of the same standard (OGC’s WMS)
– Usability requirements
• Portlet UIs are designed by the software developers, so they are not very
user friendly
– Part of the tension in the project is that
• while this is an R&D project for the IT folks, the science folks want some
of it to look like production software
– Lack of user input in some cases,
• because some users are still getting up to speed with the IT concepts
and have not really used the system yet.
Evaluation
• The success of GEON services is usually
determined by user satisfaction!
• A usability workshop was held recently with domain
scientists, and their feedback was collected.
– We are working on the changes suggested in the workshop report
• Another workshop will be held after the
suggested changes are implemented.
Lessons Learnt
• The most successful aspects
– Integrating with other grids, such as TeraGrid
– Data registration and search capabilities for the geoscience
community
– Community involvement
• The least successful aspects
– The community is still evaluating this system.
Future Plans
• Will provide secure role-based authorization
control (using SAML), fully integrated into the
GEON portal.
• Will add a WSRP service.
• Conventions for managing state may be defined
through standards such as WSRF, so that
applications can discover, bind to, and
communicate with stateful resources in standard
and interoperable ways.
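Role-based authorization, the first item in the plan, can be sketched as a lookup from roles to permitted actions. The code below is an assumption-laden illustration: it does not use SAML, and the role names, action names, and the `is_allowed` function are hypothetical. In a SAML-based setup the user's roles would typically arrive as attributes in an assertion from an identity provider rather than as a plain list.

```python
# Hypothetical roles and actions; none of these names come from GEON itself.
ROLE_PERMISSIONS = {
    "guest": {"search"},
    "member": {"search", "register_dataset", "use_workspace"},
    "admin": {"search", "register_dataset", "use_workspace", "manage_users"},
}


def is_allowed(roles, action):
    """Return True if any of the user's roles grants the requested action.

    Unknown roles grant nothing, so the check fails closed.
    """
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed(["guest"], "register_dataset"))
print(is_allowed(["guest", "member"], "register_dataset"))
```

Keeping the role-to-action table separate from the check means a portlet only has to ask whether an action is allowed, without knowing how roles were established.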
Mapping Integration Portlet
[Figure: data-flow diagram. Labelled components: Client Portlet (Gridsphere), GEON Search Portlet, GEON Resource Registration Portlet, User Workspace, Map Integration Portlet (Mediator), GEON Metadata Catalogue, Ontology Service, Ontology Engine, Knowledge Representation, Query Service, Query Tracking DB, Mapping Services, ArcIMS Webservices (GET_EXTRACT, GET_MAP), and SRB. Lookup steps: Geon Dataset Ids; Dataset Ids to Dataset Names; (2) Dataset Ids to Ontology Ids; (3) Ontology Ids to Ontology Names; (4) Ontology Ids to Ontology Concepts. Other labelled steps: Redefine Query, Execute Query, Query Result Indexing, Store Query Results, Generate Map, Download Datasets.]
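The ID lookups named in the Mapping Integration Portlet diagram (dataset IDs to dataset names and ontology IDs, then ontology IDs to ontology names and concepts) can be sketched as a chain of dictionary lookups. This is only an illustration of the resolution order: the function `describe_datasets`, the dictionary layout, and every ID and name in the sample data are hypothetical, and the real services sit behind web-service calls rather than in-memory tables.

```python
def describe_datasets(dataset_ids, dataset_names, dataset_to_ontology,
                      ontology_names, ontology_concepts):
    """Resolve each dataset ID to its name, ontologies, and ontology concepts."""
    result = {}
    for ds_id in dataset_ids:
        ontologies = []
        # Dataset ID -> ontology IDs (a dataset may be registered to none).
        for onto_id in dataset_to_ontology.get(ds_id, []):
            ontologies.append({
                "id": onto_id,
                # Ontology ID -> ontology name
                "name": ontology_names.get(onto_id),
                # Ontology ID -> ontology concepts
                "concepts": sorted(ontology_concepts.get(onto_id, [])),
            })
        # Dataset ID -> dataset name
        result[ds_id] = {"name": dataset_names.get(ds_id), "ontologies": ontologies}
    return result


# Made-up lookup tables for illustration.
names = {"ds1": "Geologic map A", "ds2": "Geologic map B"}
ds_to_onto = {"ds1": ["o1"], "ds2": []}
onto_names = {"o1": "Rock classification"}
onto_concepts = {"o1": {"Plutonic", "Volcanic"}}

info = describe_datasets(["ds1", "ds2"], names, ds_to_onto, onto_names, onto_concepts)
print(info["ds1"]["ontologies"][0]["concepts"])
```

A mediator could use the resolved concepts to let the user redefine a query in ontology terms before it is executed against each dataset.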
DATA PROCESSING (LiDAR Portlet)
[Figure: architecture diagram. Labelled components: Client (WWW), GEON Portal, GEON Search Portlet, LiDAR Process Portlet, Other Portlet, GEON Search Service, GEON Catalog, Spatial Query Service, IBM DB2, DB2 Spatial Function, NFS Mounted Disk, LiDAR Processing Service, Data Processing Algorithms, Software Tools (GRASS, ARCINFO, GMT), Compute Cluster, TeraGrid DataStar, and Download. Data labels: x,y,z and attribute; raw data; process output; maps/data.]