Unidata THREDDS: Making Distributed Datasets More Available (and Usable) in NSDL
THematic Real-time Environmental Distributed Data Services
Ben Domenico
October 2003
Sponsored by the National Science Foundation
http://www.nsf.gov

Topics
• Traditional Unidata Approach
  – Mainly meteorological data
  – Subscription system pushes data to user sites
  – UPC provides data analysis tools for use on data at user sites
• THREDDS Enhancements
  – Broader menu of Earth system data
  – Local client access from remote servers
  – Less arcane, more accessible tools
  – Integration of data and analysis tools into educational modules and digital libraries

Unidata Community Today
• More than 160 institutions
  – Includes over 100 academic departments plus government agencies and private sector research groups
  – Does not count separate installations, e.g. Spanish weather service IDD, US Weather Service radar data system
• Interdisciplinary from the outset: 1996 survey showed over 2/3 of institutions had some uses outside meteorology (oceanography, hydrology, climatology, civil engineering, environmental science…)

Impact Survey
• Over 21,000 college students per year use Unidata tools and data in classrooms and labs
• Nearly 4,000 women/minority students
• More than 1,800 faculty and research staff
• Over 55,000 K-12 students involved through Unidata-connected university programs
• Informal education: in excess of 1 million hits at Unidata-based university web sites per day
• 97% of community report being satisfied or very satisfied

Principal Activities of the Unidata Program Center
• Facilitating Data Access to a broad spectrum of observations & forecasts (in near real time)
• Providing Tools to visualize, analyze, organize, receive, & share data at university sites
• Supporting Faculty who use Unidata systems at colleges & universities (most in the U.S.)
• Building and Advocating for a Community where data, tools, & best practices in education/research are shared

Traditional Unidata Data Types
• Individual observations from weather stations around the globe
• Satellite imagery
• Radar data from 150 NEXRAD radars
• Output from forecast model runs at the National Centers for Environmental Prediction
• Lightning strike data
• Measurements from sensors on commercial aircraft

[Figure: 1 km radar image]

IDD: The Community in Action
• The Internet-based system by which universities acquire huge quantities of weather data in near-real time (i.e. ASAP) typifies Unidata’s community orientation.
• The system has no data center -- all tasks are performed on the participants’ own (small) computers.
• Currently the most used “advanced application” on the Abilene network (2-3% in terms of packets and bytes transferred)

Internet Data Distribution (IDD) with Multiple Sources (Injecting 17 Gigabytes per Day)
[Diagram labels: Source (x3); LDM (x9); Internet]
Using LDM software for instant data relaying, ~160 institutions cooperate to acquire a wide range of realtime, global, atmospheric & oceanic observations, model outputs, remotely sensed images..., in a coordinated community effort.

Typical Data Handling at a Unidata Site
[Diagram labels: IDD; Forecast Model Output; Satellite imagery; Weather station observations; Radar data; Lightning, aircraft, GPSmet, etc.; Decoders; Local data decoded into application specific formats; Application specific protocols; Unidata user running local analysis and display tools]

Thematic Data Servers (combining IDD “push” with several forms of “pull” and DL discovery)
[Diagram labels: Local user applications, e.g., LAS, McIDAS, IDV, VGEE, IDL, MatLab...; Discovery; Digital Library for Earth-System Education (DLESE); DL interchange protocol; Client/server data access protocols, e.g. OPeNDAP, ADDE, WCS, FTP; Hydrology Data; Geophysical Data; Satellite Images; IDD]

THREDDS
THematic Real-time Environmental Distributed Data Services
Connecting people, documents and data
[Diagram labels: People; Documents; Data]

THREDDS Overview
• National Science Digital Library (NSDL) “collections” project
• Integrating real-time environmental data into
  – Online educational materials
  – Digital libraries (DLESE, NSDL)
• Two-year grant from NSF Division of Undergraduate Education (DUE)
• Second generation under negotiation
• Led by Unidata Program Center (UPC)

THREDDS Data Providers
• University of Alabama Huntsville (Sara Graves, Rahul Ramachandran, Steve Tanner, Ken Keiser)
• ARM (Atmospheric Radiation Measurement, Chris Klaus)
• CDC, the Climate Diagnostics Center (Roland Schweitzer)
• COLA, Center for Ocean-Land-Atmosphere Studies (Joe Wielgosz)
• University of Florence (Stefano Nativi)
• GMU, George Mason University (Menas Kafatos and Ruixin Yang)
• IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory (Benno Blumenthal)
• ESG, the Earth System GRID (Luca Cinquini, NCAR/SCD)
• IRIS DMC, Incorporated Research Institutions for Seismology Data Management Center (Rob Casey)
• NCAR, the National Center for Atmospheric Research (Don Middleton)
• NCDC, the National Climatic Data Center (Ben Watkins)
• NGDC, National Geophysical Data Center (Ted Habermann)
• NOMADS, NOAA Operational Model Archive and Distribution System (Glenn Rutledge, NCDC)
• University of Oklahoma (Kelvin Droegemeier)
• PMEL, the Pacific Marine Environmental Laboratory (Steve Hankin)
• FNMOC, Fleet Numerical Meteorological and Oceanographic Center (Phil Sharfstein)
• SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison (Steve Ackerman, Tom Whittaker)
• Unidata Community ADDE servers (Tom Yoksas, Unidata Program Center)
• CIESIN (Center for International Earth Science Information Network, Bob Downs)
• CUAHSI (Consortium of Universities for the Advancement of Hydrologic Science, David Maidment)
• ESIG/NCAR (NCAR Environmental and Societal Impacts Group, Bob Harriss)
• Earthscope (UCAR UNAVCO, Chuck Meertens)
• GEON (GEOsciences Network, Chaitan Baru, UCSD San Diego Supercomputer Center)
• ESRI GIS Community

THREDDS Analysis/Display Tool Builders
• Data Discovery Toolkit and Foundry based on EDMI (Earth Data Multimedia Instrument, New Media Studio, Bruce Caron)
• GDS, GrADS/DODS Server (COLA, Center for Ocean-Land-Atmosphere Studies, Joe Wielgosz)
• IDV, Integrated Data Viewer (Unidata Program Center, Don Murray)
• INGRID (IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory, Benno Blumenthal)
• LAS, Live Access Server (PMEL, the Pacific Marine Environmental Laboratory, Steve Hankin)
• VGEE, Virtual Geophysical Exploration Environment (NCAR, DLESE, U. of Illinois, Unidata, many collaborators)
• WXWISE Applets (SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison, Tom Whittaker)
• ESRI GIS Clients (ESRI, Inc., Jack Dangermond, President)
• OGC Clients (Open GIS Consortium, David Schell, President)
• MyWorld (Northwestern educational GIS Client, Danny Edelson)

THREDDS Interoperability Partners
• ADDE, Abstract Data Distribution Environment (University of Wisconsin-Madison, Tom Yoksas)
• DIMES, DIstributed MEtadata System (George Mason University, Ruixin Yang)
• DODS/OPeNDAP/Aggregation Server, Distributed Oceanographic Data System/Open source Project for a Network Data Access Protocol (University of Rhode Island, Unidata, Ethan Davis)
• DLESE, Digital Library for Earth System Education (Rajul Pandya)
• ESML, Earth System Markup Language (University of Alabama-Huntsville, Rahul Ramachandran)
• ESRI, Environmental Systems Research Institute (various)
• GCMD, Global Change Master Directory (Gene Major)
• OGC and ISO Standards (University of Florence, Stefano Nativi)
• ADL Gazetteer Services (The University of California, Santa Barbara, Linda Hill and Michael Goodchild)
• DLESE Evaluation Services (The University of Colorado CIRES, Susan Buhr)
• DLESE Data Services (Tamara Ledley)
• DLESE Program Center, Digital Library for Earth System Education (Mary Marlino)
• ESRI (Jack Dangermond, President)
• OPeNDAP, Open source Project for a Network Data Access Protocol -- formerly DODS (The University of Rhode Island, Peter Cornillon)
• LAITS (Laboratory for Advanced Information Technology and Standards, Liping Di, George Mason University)
• NSDL Evaluation Services (University of Colorado, Tamara Sumner)
• OGC (Open GIS Consortium, David Schell, President)
• SWEET (Semantic Web for Earth and Environmental Terminology, Rob Raskin)

Unidata’s Contributions
• A large, (inter)national, active, cooperative academic user community
• Coordination of many disparate contributors (universities, government agencies, digital libraries, commercial vendors, standards bodies…)
• Reliable, automated, real-time data systems
• Platform-independent 5D visualization with HTML document integration
• Basic inventory catalog generator and server software
• Client-side catalog access modules

Funding Sources
• Unidata 2003/2008 (NSF Atmospheric Science Division)
• THREDDS NSDL Collections Grant (NSF Division of Undergraduate Education)
• DODS/OPeNDAP (University of Rhode Island subcontract on National Oceanographic Partnership Program Grant and NASA Earth Science Enterprise)
• NWS/COMET Case Studies (NOAA NWS)

The Web
[Diagram labels: People; Documents; Data]
• Well-developed connections
  – Document references
  – Embedded multimedia
  – Embedded interactive applets
• Powerful tools
  – Google
  – Dreamweaver
  – Web-site management tools
  – Web services

Data Access Technologies
[Diagram labels: People; Documents; Data]
• Web-based data interactions with passive gif images -- most analysis work done on remote server
• Traditional Unidata IDD with analysis on local clients
• Combinations with Web browse and FTP delivery for local analysis
• Client/server, e.g., DODS/OPeNDAP
• All lack sophisticated, text-based Web search/discovery tools and coherent integration

THREDDS is the Bottom Line
[Diagram labels: People; Documents; Data]
• Associate words of the science with available datasets
• Create “compound” documents pointing to datasets
• Connect analysis tools to documents and datasets
• Wide range of compound documents
  – Lists of datasets available on server with brief description of dataset classes
  – Online publications pointing to datasets illustrating concepts
• Massive arsenal of Web and Digital Library search/discovery tools can be applied to compound documents

[Diagram labels: People; Discovery and Publication Tools; Discovery and Publication Services; Documents; Analysis and Visualization Tools; THREDDS Middleware; Data Services; Data]

Basic Compound Document: THREDDS Server Inventory Catalog
• Inventory list of datasets on server
• Generated automatically with minimal human input
• Viewed from within analysis and display application
• Can be harvested for inclusion in GCMD, DLESE, NSDL for use by module builders

[Figure: Enhanced Metadata Catalog]

Compound Publication: Educational Module within Interactive Analysis Tool
• Discovery at DLESE
• Module at DPC
• VGEE tool at Unidata
• Datasets at NCAR
• Lends itself well to Web discovery tools, DL integration
• Can be:
  – education module
  – online scientific publication

Browser-based Thin Client Access
• LDEO/IRI web site publishes catalog of datasets available on server at UCAR
• Catalog resides and is updated at UCAR
• Browsing of datasets on UCAR server from LDEO server
• Also enables analysis and display of datasets on UCAR server using tools on LDEO server

Future Directions
• Standards-based web services approach to providing both data and metadata
• Integrate GIS clients and servers into THREDDS for access to societal impacts, infrastructure, hydrology data, etc.
• Work with OGC and ISO to incorporate emerging standard access protocols into THREDDS
• Actively participate in future DLESE Data Access Working Group and Data Services workshops to create more compound document educational modules

THREDDS, GIS, DL Interoperability
[Diagram labels: THREDDS Client Applications; GIS Client Applications; OGC or proprietary GIS protocols; OGC or OPeNDAP, ADDE, FTP… protocols; OpenGIS Protocols: WMS, WFS, WCS; GIS Servers (demographic, infrastructure, societal impacts, … datasets); THREDDS Servers (satellite, radar, forecast model output, … datasets); Metadata crosswalk (x2); Open Archives Initiative (OAI) Metadata Harvesting; Digital Library Discovery Systems]

Summary
• Universities have used Unidata tools to acquire, analyze, and display real-time atmospheric data for nearly 20 years
• THREDDS, along with related client/server access and display technologies, makes an even broader menu of Earth system data available to a more diverse community of users
• THREDDS technologies enable the creation of compound educational modules and scientific publications with embedded pointers to datasets and tools

More Information
• http://my.unidata.ucar.edu/
• http://www.unidata.ucar.edu/projects/THREDDS/
• ben@unidata.ucar.edu