What Are Web Services?

advertisement
Integrating Geographical
Information Systems and
Grid Services for
Earthquake Forecasting
Marlon Pierce
Community Grids Lab
Indiana University
May 4, 2005
1
A Big Picture for Crisis Grid
Science Grid
…
Tsunami
EPR/CIP
…
Earthquake
Response WS
Earthquake
Forecast WS
Collaboration Grid
Portals
Sensor/Database Grid
GIS Grid
Registry
EPR/CIP Grid
Data Access/Storage
Visualization Grid
Compute Grid
Metadata
Core Grid Services
Security
Notification
Workflow
Messaging
Physical Network
Figure 1: Science, Critical Infrastructure Protection (CIP) and Emergency
Preparedness and Response (EPR) Grids built as a Grid of Web Service
(WS) Grids
2
The Problem: Integrating Data,
Applications, and Client Devices

The key issue we try to solve is building the distributed computing
infrastructure that can connect
•
•
•
•
•

Legacy data archives
Executable codes
Real time data sources
Collaboration services (http://www.globalmmcs.org)
Client tools for collaboration
 Audio/Video systems, whiteboard annotators, etc
Various application-specific grids can be built out of the common
infrastructure
• Science Grids (described here)
• Emergency planning, crisis response

We choose certain fixed points for our foundations
• Web Service standards: SOAP and WSDL
• Other standards where available: GIS standards
• Universal messaging substrate for SOAP and other messages:
http://www.naradabrokering
3
Pattern Informatics (PI)

PI is a technique developed at University of California, Davis for
analyzing earthquake seismic records to forecast regions with
high future seismic activity.
• They have correctly forecasted the locations of 15 of last 16 earthquakes
with magnitude > 5.0 in California.

See Tiampo, K. F., Rundle, J. B., McGinnis, S. A., & Klein, W.
Pattern dynamics and forecast methods in seismically active
regions. Pure Ap. Geophys. 159, 2429-2467 (2002).
• http://citebase.eprints.org/cgibin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%3Aco
nd-mat%2F0102032

PI is being applied other regions of the world, and John has
gotten a lot of press.
• Google “John Rundle UC Davis Pattern Informatics”
4
Pattern Informatics in a Grid
Environment

PI in a Grid environment:
• Hotspot forecasts are made using publicly available seismic records.
 Southern California Earthquake Data Center
 Advanced National Seismic System (ANSS) catalogs
• Code location is unimportant, can be a service through remote execution
• Results need to be stored, shared, modified
• Grid/Web Services can provide these capabilities

Problems:
• How do we provide programming interfaces (not just user interfaces) to
the above catalogs?
• How do we connect remote data sources directly to the PI code.
• How do we automate this for the entire planet?

Solutions:
• Use GIS services to provide the input data, plot the output data
• Use HPSearch tool to tie together and manage the distributed data sources
and code.
5
CGL Work on GIS Services

Some example OGC services include
• Web Feature Service (WFS): for retrieving GML encode features, like
faults, roads, county boundaries, GPS station locations,….
• Web Map Service (WMS): for creating maps out of Web Features

Problems with current GIS services
• Not (yet) Web Service compliant
 Efforts underway to provide this within OGC.
 But current specs are “pre” web service, no SOAP or WSDL
 Use instead HTTP GET/POST conventions.
• Often define general Web Service services as specialized standards
 Information services
 Notification services in sensor grids
• Can’t use other Web Service standards for reliability, security, etc.

CGL is developing Web Service versions of OGC standard
services
6
Tying It All Together: HPSearch

HPSearch is an engine for orchestrating distributed
Web Service interactions
• It uses an event system and supports both file transfers and
data streams.
• Legacy name

HPSearch flows can be scripted with JavaScript
• HPSearch engine binds the flow to a particular set of remote
services and executes the script.

HPSearch engines are Web Services, can be distributed
interoperate for load balancing.
• Boss/Worker model

ProxyWebService: a wrapper class that adds
notification and streaming support to a Web Service.
7
Data can be stored and
retrieved from the 3rd part
repository (Context Service)
WS Context
WFS
(Tambora)
(Gridfarm001)
NaradaBroker network:
Used by HPSearch
engines as well as for
data transfer
WMS
Data Filter
HPSearch
(Danube)
(TRex)
Virtual
Data
flow
WMS submits script
execution request
(URI of script,
parameters)
HPSearch hosts an
AXIS service for
remote deployment of
scripts
PI Code Runner
(Danube)
 Accumulate Data
 Run PI Code
 Create Graph
 Convert RAW -> GML
HPSearch
(Danube)
GML
(Danube)
Actual Data flow
HPSearch controls the Web services
Final Output pulled by the WMS
HPSearch Engines
communicate using NB
Messaging
infrastructure
9
10
11
Some Challenges

Performance: Are GIS services suitable for non-trivial data
transfers?
•
•
•
•

Entire California seismic record since 1932 is 12 MB.
Global records obviously larger
This is not really suitable for HTTP downloads.
We are currently pursuing streaming data transfers for higher
performance
Adoption: We must get the tools and services to the point where
science application developers want to use them early in the
development process rather than later.
• Web Service client libraries to remote GIS data
• Develop codes to work with data streams rather than files.

Security: A global version of this has interesting security
requirements


Authentication, authorization, federation for different countries
Time/event dependent security for crisis response
12
More Information




Contact: mpierce@cs.indiana.edu
GIS Work at CGL: www.crisisgrid.org
HPSearch at CGL: www.hpsearch.org
SERVOGrid Web Sites
• Our fine parent project
• http://servo.jpl.nasa.gov/
• http://quakesim.jpl.nasa.gov/
13
Status and Software

Web Feature Service 1.x software available now
• www.crisisgrid.org

Our SERVO WFS includes
•
•
•
•

Fault data
GPS records
Seismic records now for most areas of the globe
Note these are Web Services, so you can build your own clients to connect
to our running services.
Web Map Service
• Client portlets (shown) and services to be public soon.
• Software downloads available soon.

HPSearch
• Currently available, www.hpsearch.org

WS-Context and Information Services Work
• http://grids.ucs.indiana.edu/~maktas/fthpis/
14
Acknowledgements


Prof. John Rundle and his group at UC-Davis,
particularly James Holliday.
Prof. Geoffrey Fox, director of the Community Grids
Lab, and CGL grad students
•
•
•
•


Ahmet Sayar: Web Map Server development
Galip Aydin: Web Feature Service development
Mehmet Aktas: Information and Context Services
Harshawardhan Gadgil: HPSearch Workflow development
Satellite images from NASA OnEarth WMS
This work is supported by the NASA Advanced
Information Systems Technology Program.
15
Back up slides
16
A Screen Shot From the WMS Client
17
When you select (i) and click on a feature in
the map
18
SERVO Apps and Their Data


Pattern Informatics: using seismic records, forecast “hotspot” areas of
Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic
half-space.
• Relies upon geometric fault models.

GeoFEST: Three-dimensional viscoelastic finite element model for calculating
nodal displacements and tractions. Allows for realistic fault geometry and
characteristics, material properties, and body forces.
• Relies upon fault models with geometric and material properties.

Virtual California: Program to simulate interactions between vertical strikeslip faults using an elastic layer over a viscoelastic half-space.
• Relies upon fault and fault friction models.

RDAHMM: Time series analysis program based on Hidden Markov
Modeling. Produces feature vectors and probabilities for transitioning from
one class to another.
• Used to analyze GPS and seismic catalogs.
19
Where Is the Data?

QuakeTables Fault Database
• SERVO’s fault repository for California.
• Compatible with GeoFEST, Disloc, and VirtualCalifornia
• http://infogroup.usc.edu:8080/public.html

GPS Data sources and formats (RDAHMM and others).
• JPL: ftp://sideshow.jpl.nasa.gov/pub/mbh
• SOPAC: ftp://garner.ucsd.edu/pub/timeseries
• USGS: http://pasadena.wr.usgs.gov/scign/Analysis/plotdata/

Seismic Event Data (RDAHMM and others)
• SCSN: http://www.scec.org/ftp/catalogs/SCSN
• SCEDC: http://www.scecd.scec.org/ftp/catalogs/SCEC_DC
• Dinger-Shearer: http://www.scecdc.org/ftp/catalogs/dinger-shearer/dingershearer.catalog
• Haukkson: http://www.scecdc.scec.org/ftp/catalogs/hauksson/Socal
20
What Are Web Services?
Web Services framework is a way for
doing distributed computing with
XML.
•
•
•



XML provides cross-language support
Suitable for both human and
application clients.
We built SERVOGrid’s execution grid
services this way.
We are building data grid components
in the same fashion.
•
Allows us to build general purpose
client environments like Web portals
and command line tools.
Browser
Appl
Web
Server
WSDL
SOAP
WSDL
SOAP
Web
Server
WSDL

WSDL: Defines interfaces to functions
of remote components. This is the
programming interface.
SOAP: Defines the message format that
you exchange between components.
This carries the messages over the
network.
Web Services are not web pages, CGI,
servlets, applets.
WSDL

JDBC
DB
21
Browser Interface
HTTP(S)
User Interface Server
WSDL WSDL WSDL WSDL
SOAP
WSDL
DB Service 1
SOAP
WSDL WSDL
Job Sub/Mon
And File
Services
WSDL
Viz Service
JDBC
DB
Host 1
Operating and
Queuing
Systems
Host 2
IDL
GMT
Host 3
22
GIS for Data Grids

Geographic Information Systems
• ESRI: commercial company with many popular GIS
products.
• Open Geospatial Consortium (formerly OpenGIS
Consortium).
• We will focus on OGC since they define open and
interoperable standards.

What are the characteristics of a GIS system?
• Need data models to represent information
• Need services for remotely accessing data.
• Need metadata for determining what is stored in the services.
23
GML: A Data Model For GIS



GML 3.x is a interconnected suite of over 20 connected
XML schemas.
GML is an abstract model for geography.
With GML, you can encode
• Features: abstract representations of map entities.
• Geometry: encode abstractly how to represent a feature
pictorially.
• Coordinate reference systems
• Topology
• Time, units of measure
• Observation data.
24
Using GML to Encode GPS

Collected GPS data for each site is
made available through online
catalogs.
• Using various text formats.




We have developed GML
descriptions of GPS using
Feature.xsd schema.
Full GML GPS schema:
http://www.crisisgrid.org/files/gps.x
sd.
Supports JPL, SOPAC, and USGS
formats.
Other efforts: Feature encodings of
Seismic and Fault catalogs.
• http://www.crisisgrid.org/html/serv
o_-_gps.html
25
Anatomy of Web Feature Service


GML provides the data model, but we must still provide a remote
programming interface to the data.
Web Feature Service provides three major services as described in OGC
specification:
• GetCapabilities: The clients (WMS servers or users) starts with requesting a
document from WFS which describes it’s abilities. When a getCapabilities request
arrives, the server dynamically creates a capabilities document and returns this.
• This is OGC’s formalization of metadata, so important to GEON, ESG, etc.
• DescribeFeatureType: After the client receives the capabilities document he/she
can request a more detailed description for any of the features listed in the WFS
capabilities document.
• The WFS returns an XML schema that describes the requested feature.
• Metadata about a specific entry.
• GetFeature: The client can ask the WFS to return a particular portion of any
feature data.
• GetFeature requests contain some property names of the feature and a Filter
element to describe the query.
• The WFS extracts the query and bounding box from the filter and queries the
feature databases.
• The results obtained from the DB query are converted the feature’s GML
format and returned to the client as a FeatureCollection.
26
Sample Feature- CA Fault Lines
<gml:featureMember>
<fault>
<name>Northridge2</name>
<segment>Northridge2</segment>
<author>Wald D. J.</author>
<gml:lineStringProperty>
<gml:LineString
srsName="null">
<gml:coordinates>
-118.72,34.243 -118.591,34.176
</gml:coordinates>
</gml:LineString>
</gml:lineStringProperty>
</fault>
</gml:featureMember>




After receiving getFeature
request, WFS decodes this
request, creates a DB query
from it and queries the
database.
WFS then retrieves the
features from the database
and converts them into GML
documents.
Each feature instance is
wrapped as a
gml:featureMember element.
WFS returns a
wfs:FeatureCollection
document which includes all
featureMembers returned in
the query result.
27
`
WMS
le
ec
tio
n
Fe
a
ol
tur
eC
eC
oll
Ge
tF
ea
e
r
tu
r
tu
a
Fe
a
Fe
et
G
tur
e
Client
io
ct
n
•A Web Feature Service can
serve multiple feature types data.
•WFS returns the results of
GetFeature requests as GML
documents (Feature Collections).
•Clients may include other
services as well as humans.
•Services include Web Map
Servers.
s
ad
i l ro ]
a
b
R [a-
Railroads
WFS Server
Hi
River [a-d]
Bridge [1-5]
ry
SQL Query
ue
LQ
SQ
SQ
L
gw
ay
[1
2-
Q
ue
18
ry
]
Interstate
Highways
Rivers
Bridges
90
Web Map Services

OGC Web Map Servers create digital maps from abstract data
sets retrieved from Web Feature Services.
• Layer features are overlayed.



WMS servers can be daisy chained.
Our research efforts here are to build Web Service version of the
WMS as well as portlet clients.
We are developing backward compatibility with existing WMS.
• Purdue CAAGIS for Indiana maps.
• NASA JPL OnEarth WMS.

Future plans are to replace current GMT portal tools with
WMS, combine maps with GeoFEST and Virtual California
application output.
29
Indiana map data
proxied through
legacy WMS.
California
boundaries and
streams overlaid
with QuakeTables
faults.
31
32
SERVOGrid Portal Screen
Shots
33
Conclusion and Other Topics

We have reviewed Community Grids Lab efforts to
build a GIS-compatible set of data grid services.
• This provides a unified architecture for grid execution and
data grid services.

Two other important topics:
• Managing messaging between components.
 NaradaBrokering project led by Dr. Shrideep Pallickara.
 www.naradabrokering.org
• Managing client environments
 Component-based portals: www.collab-ogce.org.
 Command line and scripting interfaces for web services:
www.hpsearch.org.
34
More Information

My email:
• mpierce@cs.indiana.edu

Overview (submitted to ACES special issue of PAGEOPH).
• http://www.servogrid.org/slide/GEM/SERVO/ISERVO_ACES_PAGEOPH.doc

QuakeSim Project Page:
• http://quakesim.jpl.nasa.gov/

QuakeSim Portal:
• http://www.complexity.ucs.indiana.edu:8282.

QuakeTables Fault Database:
• http://infogroup.usc.edu:8080/public.html

Community Grids Lab GIS Grid Development:
•
•
•
•
http://www.crisisgrid.org/.
See particularly http://www.crisisgrid.org/html/servo.html for GML information.
See http://www.crisisgrid.org/html/wfs.html for more on Web Feature Service.
WMS and Information Services information coming soon.
35
Client Environments for Web
Services


Web Services can support multiple types of clients.
Tools such as Apache Axis are service hosting environments.
• Manage deployed Web Services

Questions now:
• How do I manage several client applications for cooperating services?
• How do I manage multiple clients that collaborate through the same
services?
 G. Erlebacher, et al.

Client environments for Web Services
• Component-based Web portals: www.collab-ogce.org
• Shell environments:
 Program and prototype complicated interactions between data and
execution services.
 Manage data streams.
36
Data flow in applications


Real-time processing required.
Typically data transfer involves temporary storing of data. This
data may be transferred using files (E.g. Grid FTP).
• Every component of the chain processes data from input file, writes
processed data to output file.
• Time and Space critical in real-time applications hence file-based transfer
is undesirable for real-time applications.

Tools to automate data transfer and invoke applications (E.g.
Grid Ant, Karajan)
37
HPSearch

Binds URI to a scripting language
• We use Mozilla Rhino (A Javascript implementation, Refer:
http://www.mozilla.org/rhino), but the principles may be applied to any
other scripting language

Every Resource may be identified by a URI and HPSearch allows us to manipulate the
resource using the URI.
• For e.g. Read from a web address and write it to a local file
x = “http://trex.ucs.indiana.edu/data.txt”;
y = “file:///u/hgadgil/data.txt”;
Resource r = new Resource(“Copier”);
r.port[0].subscribeFrom(x);
/* read from */
r.port[0].publishTo(y);
/* write to */
f = new Flow();
f.addStartActivities(r);
f.start(“1”);

Adding support for WS-Addressing construct, under investigation
38
Architecture Components



SHELL
• Front end to scripting.
TASK_SCHEDULER (FLOW_ENGINE)
• Distributes tasks among co-operating engines for load-balancing purposes.
WSPROXY • An AXIS web service wraps an actual service. The behavior of the service
can be controlled by making simple WS calls to this proxy.
 Can be controlled by any Workflow Engine
 WSProxy handles streaming data communication on behalf of the
service.
• Service only sees I/P and O/P streams. These could be files or a remote
data stream or even a file transferred via HTTP / FTP or results from a
database query
• Can be deployed in standard Web Service containers (such as Tomcat)
39
WSProxy Interfaces

Runnable
• More control over execution (start, suspend, resume, stop…)
• Basic idea (read block of data, process it, write it out)
• Ideal for designing quick filtering applications that process
data in streams.

Wrapped
• Wrap an existing service (Executables [*.exe], Matlab scripts,
shell / Perl scripts etc…)
• Less control, can only start, stop
• Ideal for wrapping existing programs / services to expose as a
pluggable component / web service
40
HPSearch
Architecture Overview
HPSearch Kernel
Files
Sockets
Topics
HPSearch Kernel
Request Handler
Request Handler
Java script Shell
DataBase
URIHandler
Task Scheduler
Flow Handler
Web
Service
DBHandler
Web Service EP
WSDLHandler
WSProxy
WSProxyHandler
Other Objects
Service
Broker Network
WSProxy
HPSearch
Kernel
Service
WSProxy
...
Service
41
Applications
Streaming Data Filtering
Sensor
Source
GPS Data
HPSearch
Kernel - TSE
Kernel - TSE
Data Filter
Filters the input data to get
only the estimate and error
values
Matlab Plotting
Script
Graph
Kernel - TSE
(Distributed)
Services
RDAHMM
Analyze the data
42
Acknowledgements

I have presented work from the following collaborators
•
•
•
•
•

Prof. Geoffrey Fox, Community Grids Lab director
Harshawardhan Gadgil: HPSearch
Galip Aydin: Web Feature Service, GML, Sensor Grid
Ahmet Sayar: Web Map Service and Clients
Mehmet S. Aktas: GIS Information and Discovery Services
This work is supported by NASA Advanced
Information Systems Technology (AIST)
43
SERVOGrid Basics


SERVOGrid is our project to build a distributed computing
infrastructure to support earthquake simulation codes.
Distributed Computing Infrastructure
• Services for remotely executing applications, managing distributed files in
logical units, monitoring applications, archiving user projects, and so on.
• Variously called “Grids” (DOE), “Cyberinfrastructure” (NSF), “eScience” (UK e-Science program), ….

The above is sometimes referred to as an “Execution Grid”.
• We designed this around a web service architecture.
• See links at the end for more information.

But what about the data?
• Need ways to access remote data sources through programming interfaces.
• Practice often is to download (via FTP) entire catalogs and hand edit to
get data you want in the format you want.
• We need to automate this.
44
Japan-1
45
Japan-2
46
FeatureInfo Popup
(non-geometric attributes of the selected features)
47
Turkey-2
48
Sample-1
49
Sample-1 (PI Output Plotting)
50
Sample-2 (PI Output Plotting)
51
Sample-3
52
Download