Making Distributed Datasets More Available

Unidata THREDDS: Making
Distributed Datasets More Available
(and Usable) in NSDL
THematic Real-time Environmental Distributed Data Services
Ben Domenico
October 2003
Sponsored by the National Science Foundation
http://www.nsf.gov
1
Topics
• Traditional Unidata Approach
– Mainly meteorological data
– Subscription system pushes data to user sites
– UPC provides data analysis tools for use on
data at user sites
• THREDDS Enhancements
– Broader menu of Earth system data
– Local client access from remote servers
– Less arcane, more accessible tools
– Integration of data and analysis tools into
educational modules and digital libraries
2
Unidata Community Today
• More than 160 institutions
– Includes over 100 academic departments
plus government agencies and private
sector research groups
– Does not count separate installations, e.g.
Spanish weather service IDD, US Weather
Service radar data system
• Interdisciplinary from the outset: 1996
survey showed over 2/3 of institutions
had some uses outside meteorology
(oceanography, hydrology, climatology, civil
engineering, environmental science…)
3
Impact Survey
• Over 21,000 college students per year use
Unidata tools and data in classrooms and
labs
• Nearly 4,000 women/minority students
• More than 1,800 faculty and research staff
• Over 55,000 K-12 students involved through
Unidata-connected university programs
• Informal education: in excess of 1 million hits
at Unidata-based university web sites per day
• 97% of community report being satisfied or
very satisfied
4
Principal Activities
of the Unidata Program Center
• Facilitating Data Access to a broad
spectrum of observations & forecasts (in near
real time)
• Providing Tools to visualize, analyze,
organize, receive, & share data at university
sites
• Supporting Faculty who use Unidata
systems at colleges & universities (most in
the U.S.)
• Building and Advocating for a Community
where data, tools, & best practices in
education/research are shared
5
Traditional Unidata Data Types
• Individual observations from weather
stations around the globe
• Satellite imagery
• Radar data from 150 NEXRAD radars
• Output from forecast model runs at the
National Centers for Environmental
Prediction
• Lightning strike data
• Measurements from sensors on
commercial aircraft
6
1 km Radar Image
7
IDD: The Community in Action
• The Internet-based system by which
universities acquire huge quantities of
weather data in near-real time (i.e. ASAP)
typifies Unidata’s community orientation.
• The system has no data center -- all tasks are
performed on the participants’ own (small)
computers.
• Currently the most used “advanced
application” on the Abilene network (2-3% in
terms of packets and bytes transferred)
8
Internet Data Distribution (IDD)
with Multiple Sources (Injecting 17 Gigabytes per Day)
[Diagram: several data sources inject products into a network of LDM relay nodes connected through the Internet.]
Using LDM software for instant data relaying, ~160
institutions cooperate to acquire a wide range of real-time, global, atmospheric & oceanic observations, model
outputs, remotely sensed images..., in a coordinated
community effort.
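The relaying idea on this slide can be sketched in a few lines. This is an illustrative model only, not the LDM software or its protocol: the `RelayNode` class, its `push` method, the site names, and the topology are all hypothetical. It shows how each site stores a product once and forwards it downstream, so the system needs no central data center.

```python
from dataclasses import dataclass, field


@dataclass
class RelayNode:
    """Hypothetical stand-in for one participating site (not the real LDM)."""
    name: str
    downstream: list["RelayNode"] = field(default_factory=list)
    received: list[str] = field(default_factory=list)
    seen: set[str] = field(default_factory=set)

    def push(self, product_id: str) -> None:
        # Ignore duplicates so a product arriving by two routes is stored once.
        if product_id in self.seen:
            return
        self.seen.add(product_id)
        self.received.append(product_id)
        # Relay immediately to every downstream site.
        for node in self.downstream:
            node.push(product_id)


# Two sources inject into a small relay tree (the topology is made up).
leaf_a, leaf_b = RelayNode("site-a"), RelayNode("site-b")
relay = RelayNode("relay", downstream=[leaf_a, leaf_b])
source_1 = RelayNode("source-1", downstream=[relay])
source_2 = RelayNode("source-2", downstream=[relay])

source_1.push("radar-001")
source_2.push("satellite-001")
source_2.push("radar-001")  # same product via a second route; the relay drops it
```

After these calls, both leaf sites hold each product exactly once, even though "radar-001" entered the network from two sources.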
9
Typical Data Handling
at a Unidata Site
[Diagram]
• IDD delivers forecast model output, satellite imagery, weather station observations, radar data, and lightning, aircraft, GPSmet, etc.
• Decoders convert the local data into application-specific formats
• Unidata users run local analysis and display tools, which read the data through application-specific protocols
10
Thematic Data Servers
(combining IDD “push” with several
forms of “pull” and DL discovery)
[Diagram]
• Local user applications: e.g., LAS, McIDAS, IDV, VGEE, IDL, MatLab...
• Discovery: Digital Library for Earth-System Education (DLESE), via DL interchange protocol
• Client/server data access protocols, e.g. OpenDAP, ADDE, WCS, FTP
• Thematic data servers fed by IDD: hydrology data, geophysical data, satellite images...
11
THREDDS
THematic Real-time Environmental Distributed Data Services
Connecting people, documents and data
People
Documents
Data
12
THREDDS Overview
• National Science Digital Library (NSDL)
“collections” project
• Integrating real-time environmental data
into
– Online educational materials
– Digital libraries (DLESE, NSDL)
• Two-year grant from NSF Division of
Undergraduate Education (DUE)
• Second generation under negotiation
• Led by Unidata Program Center (UPC)
13
THREDDS Data Providers
• University of Alabama Huntsville (Sara Graves, Rahul Ramachandran, Steve Tanner, Ken Keiser)
• ARM (Atmospheric Radiation Measurement, Chris Klaus)
• CDC, the Climate Diagnostic Center (Roland Schweitzer)
• COLA, Center for Oceans Land Atmosphere (Joe Wielgosz)
• University of Florence (Stefano Nativi)
• GMU, George Mason University (Menas Kafatos and Ruixin Yang)
• IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory (Benno Blumenthal)
• ESG, the Earth System GRID (Luca Cinquini, NCAR/SCD)
• IRIS DMC, Incorporated Research Institutes for Seismology Data Management Center (Rob Casey)
• NCAR, the National Center for Atmospheric Research (Don Middleton)
• NCDC, the National Climatic Data Center (Ben Watkins)
• NGDC, National Geophysical Data Center (Ted Habermann)
• NOMADS, NOAA Operational Model Archive and Distribution System (Glenn Rutledge, NCDC)
• University of Oklahoma (Kelvin Droegemeier)
• PMEL, the Pacific Marine Environment Laboratory (Steve Hankin)
• FNMOC, Fleet Numerical Meteorological and Oceanographic Center (Phil Sharfstein)
• SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison (Steve Ackerman, Tom Whittaker)
• Unidata Community ADDE servers (Tom Yoksas, Unidata Program Center)
• CIESIN (Consortium for International Earth Science Information Network, Bob Downs)
• CUAHSI (Consortium of Universities for Advancement of Hydrologic Science, David Maidment)
• ESIG/NCAR (NCAR Environmental Societal Impacts Group, Bob Harriss)
• Earthscope (UCAR UNAVCO, Chuck Meertens)
• GEON (GEOphysical Network, Chaitan Baru, UCSD San Diego Supercomputer Center)
• ESRI GIS Community
14
THREDDS
Analysis/Display Tool Builders
• Data Discovery Toolkit and Foundry based on EDMI (Earth Data
Multimedia Instrument, New Media Studio, Bruce Caron).
• GDS, GrADS/DODS Server (COLA, Center for Oceans Land Atmosphere,
Joe Wielgosz)
• IDV, Integrated Data Viewer (Unidata Program Center, Don Murray)
• INGRID (IRI/LDEO, International Research Institute/Lamont Doherty Earth
Observatory, Benno Blumenthal)
• LAS, Live Access Server (PMEL, the Pacific Marine Environment
Laboratory, Steve Hankin)
• VGEE, Virtual Geophysical Exploration Environment (NCAR, DLESE, U. of
Illinois, Unidata, many collaborators)
• WXWISE Applets (SSEC, the Space Science and Engineering Center, U.
of Wisconsin-Madison, Tom Whittaker)
• ESRI GIS Clients (ESRI, Inc., Jack Dangermond, President)
• OGC Clients (Open GIS Consortium, David Schell, President)
• MyWorld (Northwestern educational GIS Client, Danny Edelson)
15
THREDDS Interoperability Partners
• ADDE, Abstract Data Distribution Environment (University of Wisconsin – Madison, Tom Yoksas)
• DIMES, DIstributed MEtadata System (George Mason University, Ruixin Yang)
• DODS/OPeNDAP/Aggregation Server, Distributed Oceanographic Data System/Open source Project for a Network Data Access Protocol (University of Rhode Island, Unidata, Ethan Davis)
• DLESE, Digital Library for Earth System Education (Rajul Pandya)
• ESML, Earth System Markup Language (University of Alabama-Huntsville, Rahul Ramachandran)
• ESRI, Environmental Systems Research Institute (various)
• GCMD, Global Change Master Directory (Gene Major)
• OGC and ISO Standards (University of Florence, Stefano Nativi)
• ADL (Gazetteer Services, The University of California, Santa Barbara, Linda Hill and Michael Goodchild)
• DLESE Evaluation Services (The University of Colorado CIRES, Susan Buhr)
• DLESE Data Services (Tamara Ledley)
• DLESE Program Center, Digital Library for Earth System Education (Mary Marlino)
• ESRI (Jack Dangermond, President)
• OPeNDAP (The University of Rhode Island, Open source Project for a Network Data Access Protocol -- formerly DODS, Peter Cornillon)
• LAITS (Laboratory for Advanced Information Technology and Standards, Liping Di, George Mason University)
• NSDL Evaluation Services (University of Colorado, Tamara Sumner)
• OGC (Open GIS Consortium, David Schell, President)
• SWEET (Semantic Web for Earth and Environmental Terminology, Rob Raskin)
16
Unidata’s Contributions
• A large, (inter)national, active, cooperative academic
user community
• Coordination of many disparate contributors
(universities, government agencies, digital libraries,
commercial vendors, standards bodies…)
• Reliable, automated, real-time data systems
• Platform-independent 5D visualization with HTML
document integration
• Basic inventory catalog generator and server
software
• Client-side catalog access modules
17
Funding Sources
• Unidata 2003/2008 (NSF Atmospheric
Science Division)
• THREDDS NSDL Collections Grant (NSF
Division of Undergraduate Education)
• DODS/OPeNDAP (University of Rhode
Island subcontract on Naval Ocean
Partnership Program Grant and NASA
Earth Science Enterprise)
• NWS/COMET Case Studies (NOAA NWS)
18
The Web
• Well-developed
connections
– Document references
– Embedded multimedia
– Embedded interactive
applets
• Powerful tools
– Google
– Dreamweaver
– Web-site management
tools
– Web services
People
Documents
Data
20
Data Access Technologies
People
Documents
Data
• Web-based data interactions
with passive gif images -- most
analysis work done on remote
server
• Traditional Unidata IDD with
analysis on local clients
• Combinations with Web
browse and FTP delivery for
local analysis
• Client/server, e.g.,
DODS/OPeNDAP
• All lack sophisticated, text-based Web search/discovery
tools and coherent integration
22
THREDDS is the Bottom Line
People
Documents
Data
• Associate words of the science
with available datasets
• Create “compound” documents
pointing to datasets
• Connect analysis tools to
documents and datasets
• Wide range of compound documents
– Lists of datasets available on server with brief
description of dataset classes
– Online publications pointing to datasets illustrating
concepts
• Massive arsenal of Web and Digital Library
search/discovery tools can be applied to
compound documents
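One way to picture a compound document is as ordinary searchable text that carries pointers to datasets. The sketch below is illustrative only: the `DatasetPointer` and `CompoundDocument` classes, the `find_datasets` helper, and the example.edu URLs are assumptions, not part of THREDDS. It shows how a plain text search over documents can lead a user to the datasets they reference.

```python
from dataclasses import dataclass


@dataclass
class DatasetPointer:
    title: str
    url: str       # placeholder URL, not a real server
    protocol: str  # e.g. "OPeNDAP", "ADDE", "FTP"


@dataclass
class CompoundDocument:
    title: str
    text: str
    datasets: list[DatasetPointer]


def find_datasets(docs, term):
    """Return dataset pointers from documents whose title or text mentions term."""
    term = term.lower()
    hits = []
    for doc in docs:
        if term in doc.title.lower() or term in doc.text.lower():
            hits.extend(doc.datasets)
    return hits


docs = [
    CompoundDocument(
        title="Hurricane structure module",
        text="Students explore radar and satellite views of a landfalling storm.",
        datasets=[
            DatasetPointer("Radar case study", "http://data.example.edu/radar", "OPeNDAP"),
            DatasetPointer("Satellite case study", "http://data.example.edu/sat", "ADDE"),
        ],
    ),
    CompoundDocument(
        title="Ocean temperature overview",
        text="Sea surface temperature fields from a climate model.",
        datasets=[DatasetPointer("SST fields", "http://data.example.edu/sst", "OPeNDAP")],
    ),
]

hurricane_hits = find_datasets(docs, "hurricane")
```

Here the search term matches only the words of the first document, yet the result is the list of datasets that document points to.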
25
People
Discovery and
Publication Tools
Discovery and
Publication Services
Documents
Analysis and
Visualization Tools
THREDDS
Middleware
Data Services
Data
28
Basic Compound Document
THREDDS Server Inventory Catalog
• Inventory list of
datasets on server
• Generated
automatically with
minimal human
input
• Viewed from within
analysis and display
application
• Can be harvested
for inclusion in
GCMD, DLESE,
NSDL for use by
module builders
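As a rough illustration of generating an inventory with minimal human input, the sketch below builds a small XML listing from a set of file names. The element and attribute names (`catalog`, `dataset`, `name`, `url`), the function name, and the server URL are hypothetical and are not the actual THREDDS catalog format.

```python
import xml.etree.ElementTree as ET


def build_inventory_catalog(server_name, base_url, file_names):
    """Build an XML inventory listing; element names are illustrative only."""
    root = ET.Element("catalog", {"name": server_name})
    for file_name in sorted(file_names):
        # One entry per dataset, with a URL a client or harvester could follow.
        ET.SubElement(
            root,
            "dataset",
            {"name": file_name, "url": f"{base_url.rstrip('/')}/{file_name}"},
        )
    return ET.tostring(root, encoding="unicode")


catalog_xml = build_inventory_catalog(
    "example-server",
    "http://data.example.edu/files/",
    ["radar_20031001.nc", "model_20031001.nc"],
)
```

Because the listing is plain XML, the same file could be read by an analysis application or harvested by a digital library.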
30
Enhanced Metadata Catalog
31
Compound Publication: Educational
Module within Interactive Analysis Tool
• Discovery at
DLESE
• module at DPC
• VGEE tool at
Unidata
• datasets at NCAR
• Lends itself well to
Web discovery
tools, DL
integration
• Can be:
– education module
– online scientific
publication
32
Browser-based Thin Client Access
• LDEO/IRI web site
publishes catalog of
datasets available on
server at UCAR
• Catalog resides and
is updated at UCAR
• Browsing of datasets
on UCAR server
from LDEO server
• Also enables
analysis and display
of datasets on UCAR
server using tools on
LDEO server
33
Future Directions
• Standards-based web services approach
to providing both data and metadata
• Integrate GIS clients and servers into
THREDDS for access to societal impacts,
infrastructure, hydrology data, etc.
• Work with OGC and ISO to incorporate
emerging standard access protocols into
THREDDS
• Actively participate in future DLESE Data
Access Working Group and Data Services
workshops to create more compound
document educational modules.
44
THREDDS, GIS, DL Interoperability
[Diagram]
• THREDDS client applications: OGC or OPeNDAP, ADDE, FTP… protocols
• GIS client applications: OGC or proprietary GIS protocols
• OpenGIS protocols: WMS, WFS, WCS
• THREDDS servers: satellite, radar, forecast model output, … datasets
• GIS servers: demographic, infrastructure, societal impacts, … datasets
• Metadata crosswalk from both THREDDS and GIS servers
• Open Archives Initiative (OAI) metadata harvesting
• Digital library discovery systems
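A metadata crosswalk of the kind named in this diagram can be thought of as a field-by-field mapping from a server's own record format to a shared vocabulary that harvesters understand. The sketch below is an assumption-laden illustration: the source field names (`name`, `summary`, `keywords`, `url`, `provider`) and the sample record are hypothetical, while the target names are elements of the Dublin Core set.

```python
# Hypothetical server-side field names mapped to Dublin Core element names.
CROSSWALK = {
    "name": "title",
    "summary": "description",
    "keywords": "subject",
    "url": "identifier",
    "provider": "publisher",
}


def crosswalk(record):
    """Translate a record to Dublin Core element names, dropping unmapped fields."""
    return {
        CROSSWALK[key]: value
        for key, value in record.items()
        if key in CROSSWALK
    }


server_record = {
    "name": "Radar mosaic",
    "summary": "Composite radar reflectivity",
    "keywords": "radar; precipitation",
    "url": "http://data.example.edu/radar",  # placeholder URL
    "internal_id": 42,                       # no Dublin Core equivalent here
}

harvestable = crosswalk(server_record)
```

Fields with no counterpart in the target vocabulary are simply omitted in this sketch; a real crosswalk would need a policy for such fields.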
45
Summary
• Universities have used Unidata tools to
acquire, analyze, and display real-time
atmospheric data for nearly 20 years
• THREDDS, along with related client/server
access and display technologies, makes an
even broader menu of Earth system data
available to a more diverse community of users
• THREDDS technologies enable the creation of
compound educational modules and
scientific publications with embedded pointers
to datasets and tools.
46
More Information
• http://my.unidata.ucar.edu/
• http://www.unidata.ucar.edu/projects/THREDDS/
• ben@unidata.ucar.edu
47