CUAHSI Hydrologic Information System: an introduction

Ilya Zaslavsky
Director, Spatial Information Systems Lab
San Diego Supercomputer Center
University of California San Diego

Presentation at DID Data Management, Kuala Lumpur, Malaysia, July 24, 2009

San Diego Supercomputer Center
• Founded in 1985 as one of the five original supercomputer centers funded by the National Science Foundation
• 400 employees
• Advanced research in high-performance computing and networking
• R&D and cyberinfrastructure projects in neuroscience, geology, astronomy, environmental sciences, molecular biology, hydrology
SDSC building on UCSD campus

Consortium of Universities for the Advancement of Hydrologic Science, Inc.
120+ US Universities
An organization representing more than one hundred United States universities. It receives support from the National Science Foundation to develop infrastructure and services for the advancement of hydrologic science and education in the U.S.
http://www.cuahsi.org/

What is CUAHSI HIS?
CUAHSI HIS: NSF support through 2012 (GEO)
Partners:
• Academic: 11 NSF hydrologic observatories, CEOP projects, LTER…
• Government: USGS, EPA, NCDC, NWS, state and local
• Commercial: Microsoft, ESRI, Kisters
• International: Australia, UK
Standardization: OGC, WMO (Hydrology Domain WG, CHy); adopted by USGS, NCDC
An online distributed system to support the sharing of hydrologic data from multiple repositories and databases via standard water data service protocols; software for data publication, discovery, access and integration.
Observation Stations Map for the US
• Ameriflux Towers (NASA & DOE)
• NOAA Automated Surface Observing System
• USGS National Water Information System
• NOAA Climate Reference Network
Build a common window on water data using web services

Water Data
• Water quantity and quality
• Soil water
• Meteorology
• Remote sensing
• Rainfall & Snow
• Modeling
Sources of Observations Data

Point Water Observations Time Series
• A point location in space
• A series of values in time

Getting Water Data (the old way)
• Different Query Pages
• Different Query Responses

Web Pages and Web Services
• http://www.safl.umn.edu/ uses Hypertext Markup Language (HTML)
• http://his.safl.umn.edu/SAFLMC/cuahsi_1_0.asmx uses WaterML (a Markup Language for water data)

HTML as a Web Language
HyperText Markup Language
  <title>Texas Water Development Board</title>
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  <html>
  <head>
  <meta name = "Robots" content = "index,follow">
  <meta name = "Priority" content = "home,twdb,homepage">
  <meta name = "Author" content = "Texas Water Development Board, Agency Number 580">
  <meta name = "Title" content = "Texas Water Development Board">
  <meta name = "Description" content = "Texas Water Development Board Home Page">
  <meta name = "Keywords" content = "water,drought,rain,conservation,groundwater,surfacewater,lake,reservoir,hydrology,geology,desalination,TWDB,loans,grants,wastewater,sewage,Clean Water,Drinking Water,State Revolving Fund,planning,State Water Plan,GIS,Geographic Information Systems,Mapping,data">
Text and Pictures in Web Browser

WaterML as a Web Language
Streamflow data in WaterML language
Discharge of the San Marcos River at Luling, June 28 - July 18, 2002

Point Observations Information Model
Diagram levels, with the example shown on the slide:
  Data Source: Utah State Univ
  Network: Little Bear River
  Sites: Little Bear River at Mendon Rd
  Variables: Dissolved Oxygen
  Values: 9.78 mg/L, 1 October 2007, 5PM {Value, Time, Metadata}
WaterOneFlow Service: GetSites, GetSiteInfo, GetVariableInfo, GetValues
• A data source operates an
observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• A value is an observation of a variable at a particular time
• Metadata provide additional information about the value

WaterML and WaterOneFlow
Diagram labels: Client (LOAD); WaterOneFlow Web Service (TRANSFORM); Data Repositories (EXTRACT): DEC, UVM, USGS
Queries: GetSites, GetSiteInfo, GetVariableInfo, GetValues, using Site Codes, Variable Codes, Date Ranges
Response: WaterML
• WaterML is an XML language for communicating water data
• WaterOneFlow is a set of web services based on WaterML

Standard Water Data Services
• Set of query functions
• Returns data in WaterML

Next Step: OGC-WMO Hydrology Domain Working Group: WaterML 2.0
https://lists.opengeospatial.org/mailman/listinfo/hydro.dwg
http://external.opengis.org/twiki_public/bin/view/HydrologyDWG/WebHome
Contact: Ilya Zaslavsky, co-chair

NWIS Daily Values (discharge), NWIS Ground Water, NWIS Unit Values (real time), NWIS Instantaneous Irregular Data, EPA STORET, NCDC ASOS, DAYMET, MODIS, NAM12K, USGS SNOTEL, ODM (multiple sites)

Hydrologic Information System Service Oriented Architecture
Diagram labels: Deployment to test beds; Customizable web interface (DASH); Global search (Hydroseek); Other popular online clients; HTML - XML; ETL services; Ontology tagging (Hydrotagger); Controlled vocabularies; WSDL and ODM registration; Water Data Web Services, WaterML; Ontology; Test bed HIS Servers; Desktop clients; WSDL - SOAP; HIS Central Registry & Harvester; Metadata catalogs; Data publishing; ArcGIS; Matlab; IDL, R; Excel; ODM DataLoader; ODMTools; Server config tools; HIS Lite Servers; Central HIS servers; External data providers; Programming (C#, VB..)
MapWindow; Modeling (OpenMI); HIS Desktop; Streaming Data Loading

Central HIS Data Services Catalog

Semantic Tagging of Harvested Variables

Hydroseek
http://www.hydroseek.net
Supports search by location and type of data across multiple observation networks including NWIS, STORET, and academic data

Against the NIH (Not Invented Here) Syndrome
2006:
► CUAHSI HIS web services are discussed on the BASINS mailing list as a new way to access hydrologic data. The list is mostly used by hydrologists and developers outside academia
► NCDC develops ASOS web services following WaterML
2007:
► MOU with USGS; USGS is developing a WaterML-compliant GetValues service
► GLEON uses an early version of ODM to develop their own schema (VEGA)
► Phoenix LTER is developing ODM (in MySQL) and WaterML services (in Java)
► A Google Earth-based client for CUAHSI web services is developed at CSIRO, Australia
► Deployment to 11 hydrologic observatory test beds, plus CBEO (a CEOP project)
2008-2009:
► KISTERS develops WaterML-compliant web services over their database
► Workshops at state agencies
► MapWindow open source GIS develops WaterOneFlow parsers
► Florida, Texas and Idaho use ODM and WaterOneFlow web services to provide access to state data repositories; New Jersey is considering the same
► Another CEOP project, at UC-Davis, is implementing ODM (in Postgres) and web services (in Java)
► Stroud Water Research Center; WRON; CZO; … many that we don’t know…
► Now SBRP: data from UCSD, UA, more?
► Integration with streaming data middleware (Open Source Data Turbine)

The International Workshop on Hydrologic Data Management and Modeling in South East Asia
July 20-24, University of Malaya
• Learning how the system works
• Publishing hydrologic data
• Setting up a server for SEA
Already published: sample data from JPS (Malaysia) and from Indonesia
Data published as web service: http://svctag-2z3322s/jps/cuahsi_1_0.asmx

Screenshots:
• Results of GetValues for JPS:3116434, streamflow data (in HydroExcel)
• Charts of the same data (in HydroExcel)
• Area of interest (in HydroSeek)
• Finding JPS stations (in HydroSeek)
• More information about JPS stations, and data download (in HydroSeek)
• JPS data downloaded from HydroSeek
• Zooming in on Indonesia
• Looking for COD measurements (in HydroSeek)
• Zooming in to stations

Summary
• CUAHSI HIS = Cyberinfrastructure for managing and publishing observational data
  – Supports many types of point observational data
  – Overcomes syntactic and semantic heterogeneity using a standard data model and controlled vocabularies
  – Supports a national network of observatory test beds
  – Maintains a national registry of services (1.75 million stations, the largest in the world)
• WaterML is a standard language for consistently communicating water observations data from academic and government sources using web services; already adopted by several federal agencies. Joint WMO and OGC activity to enhance it.
• The system is already deployed at multiple locations
• It is free and open source
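
Illustration: the point observations information model (data source, network, site, variable, value) can be sketched in a few lines of Python. This is a minimal in-memory sketch, not the HIS or ODM implementation. The class names, the site code "MENDON" and the variable code "DO" are hypothetical. The method names only echo the WaterOneFlow operations GetSites, GetSiteInfo and GetValues and do not reproduce the real service signatures or WaterML output. The sample observation is the one from the Little Bear River slide.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Variable:
    """A property describing the flow or quality of water."""
    code: str   # hypothetical short code, not an official vocabulary term
    name: str
    units: str


@dataclass(frozen=True)
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: datetime
    metadata: tuple = ()   # optional (key, text) pairs describing the value


@dataclass
class Site:
    """A point location where one or more variables are measured."""
    code: str
    name: str
    variables: dict = field(default_factory=dict)  # variable code -> Variable
    series: dict = field(default_factory=dict)     # variable code -> [Value]

    def record(self, variable: Variable, value: Value) -> None:
        """Store one observation and remember which variable it belongs to."""
        self.variables[variable.code] = variable
        self.series.setdefault(variable.code, []).append(value)


@dataclass
class Network:
    """A set of observation sites operated by one data source."""
    name: str
    data_source: str
    sites: dict = field(default_factory=dict)      # site code -> Site

    def get_sites(self) -> list:
        """Site codes in this network (loosely mirrors GetSites)."""
        return sorted(self.sites)

    def get_site_info(self, site_code: str) -> list:
        """Variable codes measured at a site (loosely mirrors GetSiteInfo)."""
        return sorted(self.sites[site_code].variables)

    def get_values(self, site_code, variable_code, start, end) -> list:
        """Observations within [start, end] (loosely mirrors GetValues)."""
        observed = self.sites[site_code].series.get(variable_code, [])
        return [v for v in observed if start <= v.time <= end]


# Sample data taken from the information model slide.
do = Variable("DO", "Dissolved Oxygen", "mg/L")
mendon = Site("MENDON", "Little Bear River at Mendon Rd")
mendon.record(do, Value(9.78, datetime(2007, 10, 1, 17, 0)))

network = Network("Little Bear River", "Utah State Univ", {mendon.code: mendon})
october = network.get_values(
    "MENDON", "DO", datetime(2007, 10, 1), datetime(2007, 10, 31)
)
for v in october:
    print(f"{v.value} {do.units} at {v.time:%d %B %Y, %H:%M}")
```

Keying every lookup by site code and variable code matches how the slides describe queries: site codes, variable codes and date ranges go in, and {Value, Time, Metadata} records come back.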