EarthScope Portal and IRIS Web Service Development Robert Casey

NERIES Data Portal for Seismology: Brainstorm Meeting
Edinburgh, UK
November 6-7, 2008
IRIS Data Management Center
Seattle, WA
• Provide a central search and
data access capability for
distributed EarthScope data
and data products
– IRIS (USArray) – seismic, strainmeter, MT
– UNAVCO (PBO) – GPS, tiltmeter
– ICDP (SAFOD) – drilling logs, core data
• Distributed implementation
– A central Web service invokes Web services
at each location, which search the local
– SOAP-based interaction
– Common Query Interface schema
– Station Location, Identification, and Product
– Product Listing Returns
– Product Packaging and Delivery
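The distributed search described above can be sketched as a central function that forwards one common query to each site and merges the returned product listings. This is a minimal illustration, not the portal's code: the `Query` fields, the site functions, and the result dictionaries are all hypothetical, and plain Python calls stand in for the SOAP requests.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Query:
    """Hypothetical common query: bounding box plus time window."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    start: str  # ISO 8601 date
    end: str    # ISO 8601 date


# A site service takes the common query and returns a product listing.
SiteService = Callable[[Query], List[dict]]


def central_search(query: Query, sites: Dict[str, SiteService]) -> List[dict]:
    """Invoke every site service with the same query and merge the listings."""
    merged = []
    for name, service in sites.items():
        try:
            products = service(query)
        except Exception as err:
            # One unreachable site should not sink the whole search.
            merged.append({"site": name, "error": str(err)})
            continue
        for product in products:
            merged.append({"site": name, **product})
    return merged


# Stand-in site services returning made-up catalog entries.
def iris_service(q: Query) -> List[dict]:
    return [{"station": "STA1", "type": "seismic"}]


def unavco_service(q: Query) -> List[dict]:
    return [{"station": "GPS1", "type": "GPS"}]


def offline_service(q: Query) -> List[dict]:
    raise ConnectionError("site unreachable")


query = Query(30.0, 50.0, -125.0, -100.0, "2008-01-01", "2008-02-01")
results = central_search(
    query,
    {"IRIS": iris_service, "UNAVCO": unavco_service, "ICDP": offline_service},
)
```

Tagging each entry with its site of origin keeps the merged listing traceable, and recording a per-site error lets the portal show partial results when one location is down.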
• Distributed development
– Software design and development by a
distributed team of developers at SDSC,
– Leverage portal work already done for GEON
– Independent code developed to a jointly
agreed-upon schema
– Regular teleconferences, occasional travel
for technical discussions
– Project planning with regular milestones
coordinated at SDSC.
Development Timeframe
• Development period – 15 months
• First Demo available at AGU Dec
• Alpha Testing – April 2008
• Beta Testing – July 2008
• Release Candidate – Sept 2008
• Final Deployment – Oct 2008
Deployment Sites
• Portal Work in Progress sited at
San Diego Supercomputer Center
• Alpha and Beta releases to select
group of testers
• Feedback and issue tracking in
• Final release candidate code
ported to permanent siting at
UNAVCO in Boulder, CO.
The EarthScope Portal – opening page
Zoom and Bounding Box
Station Selection
Cluster Stations
Cluster Selection
Cluster Zoom
Search Toolbar
Station Selection on Map
Selected Stations List
Find Data
Search Results
Select Desired Products
Fill the Data Cart
Package Cart Data
Package is Ready
Package Details
Download Package
Query History
Current Issues
• Large number of stations.
Slow to render so many when
opening page.
– Use WMS layers?
– Other fast-rendering strategy?
– Level of Detail variations – only
selectable stations at zoom?
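One possible level-of-detail strategy is to group stations into grid cells whose size shrinks as the user zooms in, so the map draws one marker per cell instead of one per station. The sketch below assumes a cell size of `360 / 2**zoom` degrees and uses made-up station tuples; neither comes from the portal itself.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Station = Tuple[str, float, float]  # (code, latitude, longitude)


def cluster_stations(stations: List[Station], zoom: int) -> List[dict]:
    """Group stations into square lat/lon cells sized for the zoom level."""
    cell = 360.0 / (2 ** zoom)  # assumed cell size in degrees
    cells: Dict[Tuple[int, int], List[Station]] = defaultdict(list)
    for sta in stations:
        _, lat, lon = sta
        key = (int((lat + 90.0) // cell), int((lon + 180.0) // cell))
        cells[key].append(sta)

    clusters = []
    for members in cells.values():
        clusters.append({
            # Place the cluster marker at the mean position of its members.
            "lat": sum(s[1] for s in members) / len(members),
            "lon": sum(s[2] for s in members) / len(members),
            "count": len(members),
            "codes": [s[0] for s in members],
        })
    return clusters


# Made-up stations for illustration.
stations = [("A01", 40.1, -120.2), ("A02", 40.3, -120.4), ("B01", 35.0, -100.0)]
coarse = cluster_stations(stations, zoom=2)   # 90-degree cells
fine = cluster_stations(stations, zoom=10)    # roughly 0.35-degree cells
```

At the coarse zoom all three stations collapse into a single marker; at the fine zoom each station gets its own marker and can be made individually selectable.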
Current Issues
• Temporal Constraints. How
do we allow user to have wide
discovery but narrow access
to data?
– User cannot browse through
years of data
– User has to guess at when
stations have data available
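One way to spare users from guessing is to publish a compact availability summary per station. The sketch below merges a station's data segments into continuous extents that a portal could display before any data request is made. The segment dates and the `max_gap_days` tolerance are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

Interval = Tuple[date, date]  # (first day with data, last day with data)


def merge_availability(segments: List[Interval], max_gap_days: int = 1) -> List[Interval]:
    """Merge segments separated by no more than max_gap_days into one extent."""
    merged: List[Interval] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= timedelta(days=max_gap_days):
            # Close enough to the previous extent: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up segments for one station.
segments = [
    (date(2007, 1, 1), date(2007, 3, 31)),
    (date(2007, 4, 1), date(2007, 6, 30)),
    (date(2008, 2, 1), date(2008, 2, 29)),
]
extents = merge_availability(segments)
```

Here the two 2007 segments merge into one extent and the 2008 segment stays separate, so the user sees two availability windows rather than years of undifferentiated timeline.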
Current Issues
• Large Return sets. How do
we present data to the user
which spans large geographic,
time, sensor type, and sample
rate dimensions?
– Continuous sampling of data
– Multiple channels per site
– Different measurements at site
Current Issues
• Query flexibility vs complexity
– Current interface is standard
time/location/product, but…
o Drilling data has a depth
dependency for sensors
o Some products may cover an
area, not a point source
o Each product type can have
multiple data types
Current Issues
• Query result browsing and
– Too many results, requires
pagination and cutoff limits
– Allowing sorting by field,
especially when results are
truncated by the server
– Tree categorization does not
allow breadth-wise filters across
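The interaction between sorting, pagination, and a server-side cutoff can be sketched as follows. The function sorts before applying the cutoff, so the truncated set is at least consistent with the requested order, and it reports the truncation so the client can warn the user. The parameter names `page_size` and `max_results` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def paginate(rows: List[Dict[str, Any]], sort_key: str, page: int,
             page_size: int = 2, max_results: int = 5) -> Dict[str, Any]:
    """Sort, apply the server cutoff, then return one page with metadata."""
    ordered = sorted(rows, key=lambda r: r[sort_key])
    truncated = len(ordered) > max_results
    kept = ordered[:max_results]  # cutoff applied after sorting
    start = (page - 1) * page_size
    return {
        "rows": kept[start:start + page_size],
        "page": page,
        "pages": (len(kept) + page_size - 1) // page_size,
        "total_matches": len(ordered),
        "truncated": truncated,
    }


# Made-up result rows.
rows = [{"station": f"S{i:02d}", "samples": 100 - i} for i in range(8)]
first = paginate(rows, "station", page=1)
last = paginate(rows, "station", page=3)
```

Returning `total_matches` alongside `truncated` tells the user how much was cut off, which is the information needed to decide whether to narrow the query.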
• Examined use of Portlet
Remoting (WSRP) –
technology considered not
mature at the time (2007)
• SDSC already had experience
with Gridsphere, provided
portlet container API
• Deemed too risky to have
each component site develop
custom portlets – time/cost
Moving on…
IRIS DMC web services
Current Plan
• Preserve current CORBA
access technologies (DHI)
• Make use of existing tools and
legacy software
• Create an underlying data
layer for locating and fetching
data and metadata, common
to CORBA and web services
Current Plan
• Create a web services access
layer in tandem with improved
• Both web services and DHI
will access the same data
layer interfaces
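The idea of a single data layer shared by both access technologies can be illustrated with one abstract interface and two thin front ends that delegate to it. Everything here is a hypothetical sketch: the class names, the method names, and the channel identifiers are invented, and no real CORBA or SOAP machinery is involved.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DataLayer(ABC):
    """Hypothetical interface for locating and fetching data and metadata."""

    @abstractmethod
    def find_channels(self, network: str, station: str) -> List[str]:
        """Return channel identifiers known for a station."""

    @abstractmethod
    def fetch_samples(self, channel_id: str) -> List[int]:
        """Return the samples stored for one channel."""


class InMemoryDataLayer(DataLayer):
    """Toy implementation backed by a dictionary of made-up samples."""

    def __init__(self, store: Dict[str, List[int]]):
        self._store = store

    def find_channels(self, network, station):
        prefix = f"{network}.{station}."
        return sorted(c for c in self._store if c.startswith(prefix))

    def fetch_samples(self, channel_id):
        return list(self._store[channel_id])


class WebServiceFrontEnd:
    """Stand-in for a web service handler: dictionaries in, dictionaries out."""

    def __init__(self, layer: DataLayer):
        self._layer = layer

    def handle(self, request: dict) -> dict:
        channels = self._layer.find_channels(request["network"], request["station"])
        return {"channels": channels}


class CorbaStyleFrontEnd:
    """Stand-in for a CORBA-style object exposing typed methods."""

    def __init__(self, layer: DataLayer):
        self._layer = layer

    def get_channels(self, network: str, station: str) -> List[str]:
        return self._layer.find_channels(network, station)


layer = InMemoryDataLayer({"XX.TEST.BHZ": [1, 2, 3], "XX.TEST.BHN": [4, 5, 6]})
web = WebServiceFrontEnd(layer)
corba = CorbaStyleFrontEnd(layer)
```

Because both front ends hold the same `DataLayer`, a fix or optimization in the lookup logic reaches both access paths at once.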
Web Service Layers
• Potentially three layers of
service module composition
1. A high level abstract workflow
2. A more detailed programmatic
SOAP interface
3. A detailed protocol buffer
interface (does the heavy lifting)
Abstract Workflow Layer
• Most abstract of the service
• Will need to interact with lower
layers for ‘refinement’ of
• May go with a commercial
SOAP interface
• This layer will be suitable for
programmatic access by
SOAP 2.0 clients
• Current technologies: Axis 2
(SOAP stack) and JiBX (data binding)
• Investigating SOAP header
messaging for intermediary
processing nodes
Protocol Buffer Layer
• Covers all service modules,
many not publicly accessible
• Presents fine-grained
decoupled functions for data
fetching and step-wise
• Protocol buffer carries data
and messages between
functions (Google)
Stateful Tracking
• Investigating use of a
database for asynchronous
persistent awareness of
workflow state
• Will track intermediate steps
and products to allow
provenance tracking and
efficient repeat processing at
the refinement stage
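Database-backed state tracking of this kind might look like the sketch below, which uses SQLite from the Python standard library. Each completed step is stored with its parameters and product, so a repeated request reuses the stored product and the recorded steps double as a provenance trail. The table layout, function names, and the fake fetch step are all hypothetical.

```python
import json
import sqlite3


def open_tracker(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical workflow-state table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS steps (
               workflow_id TEXT NOT NULL,
               step        TEXT NOT NULL,
               params      TEXT NOT NULL,
               product     TEXT NOT NULL,
               PRIMARY KEY (workflow_id, step, params))"""
    )
    return conn


def run_step(conn, workflow_id, step, params, compute):
    """Return (product, reused); compute only if no stored product exists."""
    key = json.dumps(params, sort_keys=True)
    row = conn.execute(
        "SELECT product FROM steps WHERE workflow_id=? AND step=? AND params=?",
        (workflow_id, step, key),
    ).fetchone()
    if row is not None:
        return json.loads(row[0]), True
    product = compute(params)
    conn.execute(
        "INSERT INTO steps VALUES (?, ?, ?, ?)",
        (workflow_id, step, key, json.dumps(product)),
    )
    conn.commit()
    return product, False


def provenance(conn, workflow_id):
    """List the recorded steps of a workflow in insertion order."""
    rows = conn.execute(
        "SELECT step, params FROM steps WHERE workflow_id=? ORDER BY rowid",
        (workflow_id,),
    ).fetchall()
    return [(step, json.loads(params)) for step, params in rows]


calls = []


def fake_fetch(params):
    # Stand-in for an expensive data fetch; returns made-up samples.
    calls.append(params)
    return {"samples": [1, 2, 3, 4]}


conn = open_tracker()
first, reused_first = run_step(conn, "wf-1", "fetch", {"station": "TEST"}, fake_fetch)
second, reused_second = run_step(conn, "wf-1", "fetch", {"station": "TEST"}, fake_fetch)
```

The second call returns the stored product without invoking `fake_fetch` again, which is the repeat-processing saving the slide refers to.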
First steps
• EarthScope Portal
• SPADE product catalog
• Phase Pick Query
• Ground Motion and
Decimation services
• Waveform fetching and
creation of plot image
Sample work in progress