Integrating Geographical Information Systems and Grid Applications

advertisement
Integrating Geographical
Information Systems and Grid
Applications
Marlon Pierce
Contributions: Ahmet Sayar, Galip Aydin, Mehmet Aktas,
Harshawardhan Gadgil
Community Grids Lab
Indiana University
Acknowledgements

The real work was done by (in alphabetical
order).





Project web site:


Mehmet Aktas
Galip Aydin
Harshawardhan Gadgil
Ahmet Sayar
http;//www.crisisgrid.org
This work was supported by NASA AIST as part
of “SERVOGrid: Complexity Computational
Environment”
Geographical Information Systems and
Grid Applications

Pattern Informatics



Regularized Dynamic Annealing Hidden Markov Method (RDAHMM)





Time series analysis code by Dr. Robert Granat (JPL).
Can be applied to GPS and seismic archives.
Can be applied to real-time data.
Interdependent Energy Infrastructure Simulation System (IEISS)
GeoFEST



Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and
collaborators.
Uses seismic archives.
Finite element method code developed by Dr. Jay Parker (JPL) and Prof. Greg
Lyzenga (JPL/Harvey Mudd College)
Uses fault models as input.
Virtual California



Prof. Rundle’s UC-Davis group
Used for forecasting
Uses fault and fault friction input
GIS Data Grid Work at CGL

We decided that the Data Grid components of SERVO is best implemented
using standard GIS services.



Use Open Geospatial Consortium standards
Provide downloadable GIS software to the community as a side effect of SERVO
research.
We implemented two cornerstone standards as Web Services (WS-I+
approach)

Web Feature Service (WFS): data service for storing abstract map features



Web Map Service (WMS): generate interactive maps from WFS’s and other
WMS’s.


Can be used to set up problems by extracting features (faults, seismic events,
etc) from user GUIs to drive problems such as the PI code and (in near future)
GeoFEST, VC.
We also built a GIS compatible UDDI and WS-Context


Supports queries
Faults, GPS, seismic records
Browse capabilities files.
We are currently working on these steps



Improving WFS performance
Integrating WMS with video streaming technologies.
Implementing Sensor Web Enablement for streaming, real-time data.
GIS and Sensor Grids






OGC has defined a suite of data structures and services
to support Geographical Information Systems and
Sensors
GML Geography Markup language defines
specification of geo-referenced data
SensorML and O&M (Observation and Measurements)
define meta-data and data structure for sensors
Services like Web Map Service, Web Feature Service,
Sensor Collection Service define services interfaces to
access GIS and sensor information
Grid workflow links services that are designed to
support streaming input and output messages
We are building Grid (Web) service implementations of
these specifications for NASA’s SERVOGrid
A Screen Shot From the WMS Client
WMS uses WFS that uses data sources
<gml:featureMember>
<fault>
<name> Northridge2 </name>
<segment> Northridge2
</segment>
<author> Wald D. J.</author>
<gml:lineStringProperty>
<gml:LineString
srsName="null">
<gml:coordinates>
-118.72,34.243 118.591,34.176
</gml:coordinates>
</gml:LineString>
</gml:lineStringProperty>
</fault>
</gml:featureMember>
`
WMS
le
ec
tio
n
Fe
a
ol
tur
eC
eC
oll
Ge
tF
ea
e
r
tu
r
tu
a
Fe
a
Fe
et
G
tur
e
Client
io
ct
n
s
ad
i l ro ]
a
R [a-b
Railroads
WFS Server
Hi
River [a-d]
Bridge [1-5]
ry
SQL Query
ue
LQ
SQ
SQ
L
gw
ay
[1
2-
Q
ue
18
ry
]
Interstate
Highways
Rivers
Bridges
90
Automating Pattern
Informatics
Pattern Informatics (PI)



PI is a technique developed at University of California, Davis for
analyzing earthquake seismic records to forecast regions with
high future seismic activity.
 They have correctly forecasted the locations of 15 of last 16
earthquakes with magnitude > 5.0 in California.
See Tiampo, K. F., Rundle, J. B., McGinnis, S. A., & Klein, W.
Pattern dynamics and forecast methods in seismically active
regions. Pure Ap. Geophys. 159, 2429-2467 (2002).
 http://citebase.eprints.org/cgibin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%
3Acond-mat%2F0102032
PI is being applied other regions of the world, and John has
gotten a lot of press.
 Google “John Rundle UC Davis Pattern Informatics”
Pattern Informatics in a Grid Environment

PI in a Grid environment:

Hotspot forecasts are made using publicly available seismic records.






Code location is unimportant, can be a service through remote execution
Results need to be stored, shared, modified
Grid/Web Services can provide these capabilities
Problems:




Southern California Earthquake Data Center
Advanced National Seismic System (ANSS) catalogs
How do we provide programming interfaces (not just user interfaces) to the above
catalogs?
How do we connect remote data sources directly to the PI code.
How do we automate this for the entire planet?
Solutions:

Use GIS services to provide the input data, plot the output data



Web Feature Service for data archives
Web Map Service for generating maps
Use HPSearch tool to tie together and manage the distributed data sources and
code.
Example of Data Mining and GIS Grid
Databases with
NASA, USGS features
SERVOGrid Faults
WFS1
UDDI
Data Mining Grid
WFS3
WFS2
NASA WMS
WMS handling
Client requests
SOAP
WMS
WMS Client
Client
HTTP
Web Map
Client
WSDL
Aggregating
WMS
Stubs
Stubs
HTTP
SOAP
WSDL
WSDL
WFS
+
Seismic Rec.
WFS
+
State Bounds
“REST”
…
WMS
+
OnEarth
WMS uses WFS that uses data sources
<gml:featureMember>
<fault>
<name> Northridge2 </name>
<segment> Northridge2
</segment>
<author> Wald D. J.</author>
<gml:lineStringProperty>
<gml:LineString
srsName="null">
<gml:coordinates>
-118.72,34.243 118.591,34.176
</gml:coordinates>
</gml:LineString>
</gml:lineStringProperty>
</fault>
</gml:featureMember>
`
WMS
le
ec
tio
n
Fe
a
ol
tur
eC
eC
oll
Ge
tF
ea
e
r
tu
r
tu
a
Fe
a
Fe
et
G
tur
e
Client
io
ct
n
s
ad
i l ro ]
a
R [a-b
Railroads
WFS Server
Hi
River [a-d]
Bridge [1-5]
ry
SQL Query
ue
LQ
SQ
SQ
L
gw
ay
[1
2-
Q
ue
18
ry
]
Interstate
Highways
Rivers
Bridges
90
GIS Behind the Scenes


The web features are served up by a Web Feature Service.
Web Map Service aggregates maps


We re-implement Open Geospatial Consortium standards using Web
Service Standards.




http://grids.ucs.indiana.edu/ptliupages/publications/acm-gis-sayar.pdf.
http://grids.ucs.indiana.edu/ptliupages/publications/Geoinformatics05_asayar.pd
f.
More WFS Info:


SOAP messages, WSDL service definitions.
Will allow us to separate messages from HTTP transport layer in future.
More WMS Info:


NASA OnEarth + our own renderings.
http://grids.ucs.indiana.edu/ptliupages/publications/gwpap243.pdf
More general info, software, demos: http://www.crisisgrid.org
Tying It All Together: HPSearch





HPSearch is an engine for orchestrating distributed Web Service
interactions
 It uses an event system and supports both file transfers and data
streams.
 Legacy name
HPSearch flows can be scripted with JavaScript
 HPSearch engine binds the flow to a particular set of remote
services and executes the script.
HPSearch engines are Web Services, can be distributed
interoperate for load balancing.
 Boss/Worker model
ProxyWebService: a wrapper class that adds notification and
streaming support to a Web Service.
More info: http://www.hpsearch.org
Data Mining Grid from Grid of Grids
Databases with
NASA,USGS features
SERVOGrid Faults
UDDI
WFS4
SOAP
Filter
Pipeline
PI Data Mining
HPSearch
“Workflow”
Filter
WS-Context
Narada
Brokering
System Services
WFS3
GIS Grid
Traditional
Execution
Grid
Data can be stored and
retrieved from the 3rd part
repository (Context Service)
WS Context
WFS
(Tambora)
(Gridfarm001)
NaradaBroker network:
Used by HPSearch
engines as well as for
data transfer
WMS
Data Filter
HPSearch
(Danube)
(TRex)
Virtual
Data
flow
WMS submits script
execution request
(URI of script,
parameters)
HPSearch hosts an
AXIS service for
remote deployment of
scripts
PI Code Runner
(Danube)
 Accumulate Data
 Run PI Code
 Create Graph
 Convert RAW -> GML
HPSearch
(Danube)
GML
(Danube)
Actual Data flow
HPSearch controls the Web services
Final Output pulled by the WMS
HPSearch Engines
communicate using NB
Messaging
infrastructure
IEISS GUI FOR OVERLAYING
OUTAGE AREA ON A MAP
IEISS Summary



IEISS simulates power outages resulting from
damage to electrical and natural gas grids.
GIS Grid integration is similar to earlier PI
application.
Primary differences:



Better support for dynamic GIS service discovery.
Better integration of distributed state monitoring
(WS-Context).
Google map clients as well as modified PI clients.
4-5
- WFS
publishes
the
as
GML
document
into a topic
WFS
1-2-3
6
- User
and
- WMS
invokes
WMS
Client
publish
IEISS
->results
through
WMS
their
Server
WSDL
WMSFeatureCollection
->
Client
URL
UDDI
to
interface
->
the
WFS
UDDI
for the
Registry
obtained
(“/NISAC/WFS”)
in a pub/sub
based
messaging
WFS session
-> WMS in
Server
geospatial features,
and WMS
Client
starts system.
a workflow
the
(creates
map overlay) and IEISS receive this GML document. WMS Server ->
Context aService.
WMS Client (displays it)
7 - On receiving invocation message, IEISS updates the shared state data
to be “IEISS_IS_IN_PROGRES”. IEISS runs and produces an ESRI
Shape file and then invokes shp2gml tool to convert produced Shape
file to GML format. After the conversion IEISS updates shared session
state to be “IEISS_COMPLETED”. As the state changes, the Context
Service notifies all interested workflow entities such as WMS Client.
9-10
- WFS-L
publishes
the IEISSWMS
output
as amakes
GML FeatureCollection
8 – On
receiving
the notification,
Client
a request to the
document
topicoutput
‘NISAC/WFS-L’. WMS Server is subscribed to
WFS-L for to
theNB
IEISS
this topic and receives the GML file then converts it to map overlay,
and the Client displays the new model on the map.
Electric Power and Natural Gas data
Zoom-in
Zoom-out
FeatureInfo mode
Measure distance mode
Clear Distance
Drag and Drop mode
Refresh to initial map
Overlaid Outage Area - I

Basic Steps:



Select Energy Power
AND Natural Gas Data
and Update Layer List
rendered on the map
Click on “Overlay
Outage” button
See the outage area on
the map
Overlaid Outage Area - II

Basic Steps:




Select Energy Power Data
and Update Layer List
rendered on the map
Click on “Overlay Outage”
button
Use zoom-in mapping tool
below to get same outage
area in more detail
See the outage area on the
map
Overlaid Outage Area - III

Basic Steps:




Select Energy Power and
Natural Gas Data and
Update Layer List rendered
on the map
Select St. Petersburg from
the “Area of Interest”
dropdown list.
Click on “Overlay Outage”
button.
See the outage area on the
map
Getting Info about specific EP Data by clicking
on the map

Basic Steps:




Select Energy Power Data and
Update Layer List rendered on
the map
Select (i) from the mapping
tools below.
Click on any feature data on
the map.
See the information for
selected feature in pop-up
window
Google Hybrid Map and
Feature Information call to WMS
Natural Gas Layer
Electric Power Layer
Google Map Client
Archived
Real Time
Databases with
SERVOGrid Faults
Sensor
Grid
WFS1
WFS2
Google Central
HTTP
Google Map
Client
Helper
Services
UDDI
SOAP
DoD and Homeland Security can in a crisis combine custom
geo-referenced data with that available from hundreds of
thousands of computers from Microsoft, Yahoo and Google
Just build simple services using Interoperability standards!
Real Time GPS
and Google Maps
Subscribe to live GPS
station. Position data
from SOPAC is
combined with Google
map clients.
Select and zoom to
GPS station location,
click icons for more
information.
Google maps
can be
integrated with
Web Feature
Service
Archives to
filter and
browse seismic
records.
Integrating
Archived Web
Feature Services
and Google Maps
Google Maps
as Service
accessed from
our WMS
Client
Support for Real Time
Applications
RDAHMM: GPS Time Series Segmentation
Slide Courtesy of Robert Granat, JPL
GPS displacement (3D)
length two years.
Divided automatically
by HMM into 7 classes.
Features:
• Dip due to aquifer
drainage (days 120250)
• Hector Mine
earthquake (day 626)
• Noisy period at
end of time series


Complex data with subtle signals is difficult for humans to
analyze, leading to gaps in analysis
HMM segmentation provides an automatic way to focus attention
on the most interesting parts of the time series
Towards Real-Time RDAHMM


A real-time version of RDHAMM could potentially be
used to detect state change events in live data from
a GPS station.
SCIGN maintains 125+ GPS stations, so trivially
parallel RDAHHM clones can monitor state changes
in the entire network.


HPSearch can help
But first we must get the data to RDAHMM.
NaradaBrokering: Message Transport for
Distributed Services

NB is a distributed messaging
software system.


NB system virtualizes transport
links between components.


http://www.naradabrokering.or
g
Supports TCP/IP, parallel
TCP/IP, UDP, SSL.
See e.g.
http://grids.ucs.indiana.edu/ptli
upages/publications/AllHands2
005NB-Paper.pdf for transAtlantic parallel tcp/ip timings.
Traditional NaradaBrokering Features
Multiple protocol
transport support
In publish-subscribe
Paradigm with different
Protocols on each link
Transport protocols supported include TCP, Parallel TCP
streams, UDP, Multicast, SSL, HTTP and HTTPS.
Communications through authenticating proxies/firewalls &
NATs. Network QoS based Routing
Allows Highest performance transport
Subscription Formats
Subscription can be Strings, Integers, XPath queries, Regular
Expressions, SQL and tag=value pairs.
Reliable delivery
Robust and exactly-once delivery in presence of failures
Ordered delivery
Producer Order and Total Order over a message type. Time
Ordered delivery using Grid-wide NTP based absolute time
Recovery and Replay
Recovery from failures and disconnects.
Replay of events/messages at any time. Buffering services.
Security
Message-level WS-Security compatible security
Message Payload options
Compression and Decompression of payloads
Fragmentation and Coalescing of payloads
Messaging Related
Compliance
Java Message Service (JMS) 1.0.2b compliant
Support for routing P2P JXTA interactions.
Grid Feature Support
NaradaBrokering enhanced Grid-FTP. Bridge to Globus GT3.
Web Services supported
Implementations of WS-ReliableMessaging, WS-Reliability
and WS-Eventing.
Transit Delay (Milliseconds)
Mean transit delay for message samples in
NaradaBrokering: Different communication hops
9
8
7
6
5
4
3
2
1
0
hop-2
hop-3
hop-5
hop-7
100
1000
Message Payload Size (Bytes)
Pentium-3, 1GHz,
256 MB RAM
100 Mbps LAN
JRE 1.3 Linux
Standard Deviation for message samples in NaradaBrokering
Different communication hops - Internal Machines
0.8
hop-2
hop-3
hop-5
hop-7
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
1000
1500
2000
2500
3000
3500
Message Payload Size
(Bytes)
4000
4500
5000
Typical use of Grid Messaging
Filter or
Datamining
Sensor Grid
Post before
Processing
Database
Archives
Post after
Processing
Narada
Brokering
Web Feature Service
Notify
Subscribe
HPSearch
Manages
WS-Context
Stores dynamic data
WFS
(GIS data)
GIS Grid
Geographical
Information System
Raw to GML via NaradaBrokering

The Scripps Orbit and Permanent Array Center
(SOPAC) GPS station network data published in RYO
format is converted to ASCII and GML
Typical use of Grid Messaging in NASA
Sensor Grid
Grid Eventing
Datamining Grid
GIS Grid
GIS and Collaboration Tools
e-Annotation
Player
Archieved
stream list
Archived stream
player
Real time
stream list
Annotation/WB
player
Real time stream
player
e-Annotation
Whiteboard
GIS and Collaboration



The previous slide illustrates an initial interface for capturing,
annotating, and storing/replaying video streams.
Still images can be captured and annotated on shared white
board.
Annotations are stored along with rest of system.
Collaborative and Synchronous
Annotation & Discussion
Collaborative Communication
e
tiv n
ora icatio
b
lla n
Co mmu
Co
Streaming
Servers
Master/Coach
broker
broker
archive
NaradaBrokering
Student
broker
broker
broker
e-Annotation
Portal Server
broker
Student
Ar
Collaborative Communication
Collaborative e-Annotation Player
Student
ch
iv
e/R
etr
iev
e
G
L
O
B
A
L
M
M
C
S
m
trea
TV
Live
stre
am
Capture Device
Archived Real Time (Live) Stream
From TV and Capture Devices
Collaborative e-Annotation Whiteboard
Archived Streams
Stream Annotation Snapshots
Instant Messenger
Real Time (Live) Stream Player
s
TV
Storage Servers
Challenges for Geographical Information
System Grids

Must address performance issues.




Related workshop at GGF 15.
HTTP is not an adequate transport mechanism for moving
data around.
XML representations, compression, etc.
Well established techniques from real-time
collaboration can be applied to sensors


Stream archiving and playback, session management,
software multicasting.
Applies to both data streams (GPS) and maps (streaming
video).
Download