Session 4: Using OPeNDAP-enabled Applications to Access Australian Data Services and Repositories
eResearch Australasia 2011, ½ Day Morning Workshop, Thursday 10th November 2011

GENERAL INFORMATION
• This is a half-day workshop (9am to 12:30pm)
• 9:00am Introductions and Participants' Goals
• 9:15am Session 1: Discovering OPeNDAP data access services
• 10:00am Session 2: Applicable use cases of OPeNDAP data services
  − 10:30am Tea Break for 15 minutes
• 11:00am Session 3: OPeNDAP service protocols and features
• 11:45am Session 4: Accessing complementary features and services
• 12:30pm End of Workshop

Session 4
• Accessing OPeNDAP servers with geospatial services, aggregation services, virtual datasets and data libraries.
• A short tutorial on data access to these services using an OPeNDAP- and OGC-enabled client application or web servlet.
• 45 minutes in length

OPeNDAP crawler
• Both Hyrax and TDS support THREDDS catalogs
• THREDDS catalogs are hierarchical:
  − Each catalog describes a set of local resources and provides links to child nodes
  − Analogous to a file system directory tree
• Our crawling code
  − Uses simple ‘rules’ to form aggregations
  − Optimizes reads to actual data
  − Can write NcML and/or EML
  − Can also exhaustively read DDX responses
  − Is ‘pipelined’ and stores each stage's results in Postgres

OPeNDAP crawler availability
Our crawler is made up of six shell programs:
• ddx_crawler.sh: Crawl the THREDDS catalogs (NB: the name is a misnomer)
• ddx_retriever.sh: Get DDX responses
• print_cached_urls.sh: Extract TCs and DDXs from the cache
• url_classifier.sh: Use TCs and URL characters to form ‘logical groups’
• eml_writer.sh: Write EML for groups
• ncml_writer.sh: Write NcML for groups

NcML
NcML can provide two basic features:
• Augmenting/modifying data sets with new
  − Attributes
  − Values
• Combining two or more data sets (i.e., files) in an aggregation
• Three kinds of aggregation are supported:
  − Tile files
  − Join files along an existing axis
  − Join files along a new axis
• The creation of these logical data sets can replace the old idea of an ‘inventory’ of files in some important cases
• While very powerful, these aggregations are not applicable to every data set made up of multiple files

NcML demo
Access an aggregation of data files on a service:
• Access the aggregation at http://satdat1.gso.uri.edu/thredds/dodsC/NWAtlanticRaw_6km
• Look at the DDS and DAS
• This is an aggregation of 18,000 discrete files
• The times run from 473452108 to 987018422
• Let's get data from a given latitude over time (Hovmöller plot)
• Here’s the constraint: ?dsp_band_1[17108:1:17208][600][0:1023]
• Really cool, but gads that’s hard to write.
• Using the grid function is much easier, but this particular server does not support it. Here’s the function version: ?grid(dsp_band_1,"lat=30","954794345<time<987018422")
The NcML documentation: http://www.unidata.ucar.edu/software/netcdf/ncml/

Providing certificates to clients
• There are a variety of ways to get PKI certificates
• The general process is as follows:
  − Make a private key
  − Use that to request a certificate from some ‘Authority’
  − When the authority sends you the requested cert, install it.
• How do you do this with OPeNDAP clients?
• Right now, the answer is ‘it depends’
• We hope it will become more uniform in the future
• For clients made using the netCDF C library, version 4:
  − Add some lines to the .dodsrc file
• For clients made with the netCDF Java library:
  − Add the cert to a Java Keystore

netCDF C case
# DODS client configuration file. See the DODS
# users guide for information.
USE_CACHE=0
…
# This is the 'verify peer' function of curl. For this to work, the
# server's certificate must be signed by a real CA, not the mock CA
# we use for demos, hacks, etc.
CURL.SSL.VALIDATE = 0
[https://localhost:8443/opendap/] CURL.SSL.CERTIFICATE = /Users/jimg/certtest/client.crt
[https://localhost:8443/opendap/] CURL.SSL.KEY = /Users/jimg/certtest/client.key
[https://localhost:8443/opendap/] CURL.SSL.CAPATH = /Users/jimg/certtest/

netCDF Java case
When you invoke Java, you need to set four flags, including:
-Dkeystore=<path to the keystore>
-Dkeystorepassword=<password for the keystore>

Data Discovery and Access
Data discovery services
• NASA’s Global Change Master Directory
  − http://gcmd.nasa.gov
• IMOS eMII portal
  − http://imosmest.aodn.org.au/geonetwork/srv/en/main.home
  − Help --> http://emii1.its.utas.edu.au/drupal/?q=node/25
• TERN AusCover portal
  − http://data.auscover.org.au/
• My Ocean portal
  − http://www.myocean.eu/web/24-catalogue.php
• TPAC Digital Library
  − http://dl.tpac.org.au
Data access services
• Unidata’s THREDDS Data Service
  − http://www.unidata.ucar.edu/projects/THREDDS/
• OPeNDAP’s Hyrax Data Service
  − http://opendap.org/download/hyrax.html
• NOAA’s ERDDAP Data Service
  − http://coastwatch.pfeg.noaa.gov/erddap

Web services and GIS
Data analysis/visualization services
• NOAA PMEL’s Live Access Server
  − http://ferret.pmel.noaa.gov/Ferret/LAS/home
• COLA’s GrADS Data Service
  − http://www.iges.org/grads/gds/
Geospatial Information Services (GIS)
• Unidata THREDDS WCS (Web Coverage Service)
  − http://www.unidata.ucar.edu/projects/THREDDS/
• Reading e-Science/TPAC’s ncWMS (Web Map Service)
  − http://www.resc.rdg.ac.uk/trac/ncWMS/
• Reading e-Science Centre’s Godiva2
  − http://www.resc.rdg.ac.uk/twiki/bin/view/Resc/GodivaTwo

Some of the Technology in the TDS
1. THREDDS Dataset Inventory Catalogs provide virtual directories of available data and associated metadata.
2. The netCDF-Java/CDM library reads netCDF, OPeNDAP, and HDF5 datasets, as well as other binary formats such as GRIB and NEXRAD, presenting essentially an (extended) netCDF view of the data.
3.
TDS can use the NetCDF Markup Language (NcML) to modify datasets and to create virtual aggregations of them.
4. An integrated server provides OPeNDAP access, a data access method that supports subsetting.
5. An integrated server provides bulk file access through the HTTP protocol.
6. An integrated server provides data access through the Open Geospatial Consortium (OGC) Web Coverage Service (WCS) protocol, for any "gridded" dataset whose coordinate system information is complete.
7. An integrated server provides data access through the Open Geospatial Consortium (OGC) Web Map Service (WMS) protocol, for any "gridded" dataset whose coordinate system information is complete.
8. The integrated ncISO server provides automated metadata analysis and ISO metadata generation.

Some of the Technology in Hyrax
1. THREDDS Dataset Inventory Catalogs provide virtual directories of available data and associated metadata.
2. Supports many formats and data stores: netCDF3, netCDF4, HDF4, HDF5, FreeForm, SQL databases
3. Uses a plug-in based architecture and includes tools to write custom handlers
4. NetCDF Markup Language (NcML) to modify datasets and to create virtual aggregations of them.
5. OPeNDAP access, a data access method that supports subsetting.
6. Bulk file access through the HTTP protocol.
7. ncISO server provides automated metadata analysis and ISO metadata generation.
8. RDF output
9. Code that has passed a formal security audit
10. A true multi-system architecture that can fit in a variety of enterprise settings
11. An administrator’s interface

A quick look at Digital Library services
The TPAC Digital Library; see e.g. the Oceans and Climate Digital Portal, https://dl.tpac.org.au/tpacportal/
• Searchable, aggregatable OPeNDAP community research portal
• Includes text descriptions of data sets with rich and structured metadata
The IMOS or AusCover Portals should be explored.
• http://imosmest.aodn.org.au/geonetwork/srv/en/main.home
• http://data.auscover.org.au/

Open Data through the digital library
Tim Pugh
03 9669 4345
t.pugh@bom.gov.au

Exercise
Open a web browser and visit the following sites for a couple of minutes.
• NASA’s Global Change Master Directory
  − http://gcmd.nasa.gov
  − Find a reference to sea surface temperature data in the Southern Ocean
• Visit the IMOS, AusCover, or TPAC portal
  − Are the sites similar or different?
• NOAA PMEL’s Live Access Server
  − http://mynasadata.larc.nasa.gov/data.html
  − Select “+ Live Access Server (Intermediate Edition)” and view some data
• Reading e-Science Centre’s Godiva2
  − http://www.resc.rdg.ac.uk/twiki/bin/view/Resc/GodivaTwo
  − Access a TDS server with WMS available and launch the Godiva2 Java applet

LIST OF OPENDAP SERVERS
• OPeNDAP servers located at
  − Bureau of Meteorology (http://opendap.bom.gov.au:8080/thredds)
  − CSIRO (http://opendap.csiro.au/thredds)
  − ANU/NCI (http://opendap.nci.org/thredds)
  − OPeNDAP, Inc. (http://test.opendap.org:8080/hyrax)
  − TPAC (http://opendap-tpac.arcs.org.au/thredds)

Thank you
Authors: Tim F. Pugh (1), James Gallagher (2), Dave Fulker (3)
(1) Australian Bureau of Meteorology, Melbourne, Australia, t.pugh@bom.gov.au
(2) OPeNDAP, Butte, Montana, USA, jgallagher@opendap.org
(3) OPeNDAP, Boulder, Colorado, USA, dfulker@opendap.org
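
Supplementary sketch: building the NcML demo constraint
The index constraint from the NcML demo slide, ?dsp_band_1[17108:1:17208][600][0:1023], is tedious to write by hand. As a rough illustration only, the Python sketch below assembles a DAP2 hyperslab constraint of that shape from index ranges and appends it to a dataset URL. The helper names hyperslab and constrained_url are hypothetical and are not part of any OPeNDAP library. The sketch assumes the standard DAP2 ".dods" suffix for the binary data response and leaves the square brackets unescaped, which some HTTP clients may require to be percent-encoded. It only builds the URL string and makes no network request.

```python
# Hypothetical helpers (not an OPeNDAP API) for writing DAP2 index constraints.

def hyperslab(start, stop=None, stride=None):
    """Format one dimension's index selection.

    hyperslab(600)            -> "[600]"            (single index)
    hyperslab(0, 1023)        -> "[0:1023]"         (range, implied stride of 1)
    hyperslab(17108, 17208, 1) -> "[17108:1:17208]" (range with explicit stride)
    """
    if start < 0:
        raise ValueError("indices must be non-negative")
    if stop is None:
        return f"[{start}]"
    if stop < start:
        raise ValueError("stop must not be less than start")
    if stride is None:
        return f"[{start}:{stop}]"
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return f"[{start}:{stride}:{stop}]"


def constrained_url(base_url, variable, slabs, suffix=".dods"):
    """Join dataset URL, response suffix, variable name and hyperslabs."""
    return f"{base_url}{suffix}?{variable}{''.join(slabs)}"


# Dataset URL and variable name are taken from the NcML demo slide.
BASE = "http://satdat1.gso.uri.edu/thredds/dodsC/NWAtlanticRaw_6km"

url = constrained_url(
    BASE,
    "dsp_band_1",
    [
        hyperslab(17108, 17208, 1),  # time indices
        hyperslab(600),              # one fixed latitude index
        hyperslab(0, 1023),          # all longitude indices
    ],
)
print(url)
```

Passing suffix=".dds" instead would request the constrained structure description rather than the data, which is a cheap way to check a constraint before fetching values.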