CESD
SAGES: Scottish Alliance for Geoscience, Environment & Society
The challenges of geo-simulation data
Centre for Earth System Dynamics
m.mineter@ed.ac.uk

This talk: perspectives from CESD’s climate modelling
• How climate modelling is done
  – Why model the climate?
  – NetCDF
  – CF (Climate and Forecast)
  – Archives and metadata
• Current challenges
• Imminent challenges

What is “the climate”?
• Statistical concepts such as:
  – Typical seasonal rainfall distribution
  – Global mean annual outgoing shortwave radiation
  – Monthly mean surface temperature
• …arising from physical processes
  – Fluid dynamics on rotating sphere
  – Interactions of radiation
  – ….

Why use a computer model of the climate?
1. Explore the climate:
  – Test hypotheses about how the climate works
  – Interpret observations
  – Express scientific community understanding
  – Generate possible past and future climates
2. Use climate model output data
  – To drive other models
  – To inform mitigation/adaptation
  – Where observations are sparse at best… e.g. the future

Modelling the Climate System
Main Message: Lots of things going on!
Karl and Trenberth 2003

A climate model
Toolbox – not a black box!
[Diagram labels: Initial state; Ancillary data (can be time series); δ that/δ other = something else; δ this/δ that = process; New something; Modelled New “diagnostic” processes; Files of means: 6hr, daily…decadal]

Data volumes and typical analyses
• Typically we make 1-5 GB per model year
  – 40 model years/day (coarse coupled model (HadCM3) using 40 cores)
• Our biggest project: 14 TB
• Researcher selects/slices data
• Does
  – Global/regional analyses
  – global means
  – Comparisons with related runs and observation, …
  – NCL, IDL, NCO, … tools built on data standards

NetCDF
• “NetCDF is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data”.
• File contains dimensions, variables, and attributes.
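The three building blocks named above (dimensions, variables, attributes) can be sketched in plain Python. This is only an illustration of the structure, not the NetCDF library's API: the class names `Variable` and `Dataset`, the grid sizes, and the example variable are all hypothetical, chosen to echo the CF example used elsewhere in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A named array defined over an ordered tuple of dimension names."""
    name: str
    dims: tuple
    dtype: str = "float"
    attrs: dict = field(default_factory=dict)  # per-variable metadata


@dataclass
class Dataset:
    """Toy model of a NetCDF-style file: dimensions, variables, attributes."""
    dims: dict = field(default_factory=dict)       # dimension name -> length
    variables: dict = field(default_factory=dict)  # variable name -> Variable
    attrs: dict = field(default_factory=dict)      # global attributes

    def add_variable(self, var):
        # A variable may only use dimensions the dataset already defines.
        missing = [d for d in var.dims if d not in self.dims]
        if missing:
            raise ValueError(f"undefined dimensions: {missing}")
        self.variables[var.name] = var

    def shape(self, name):
        # The array shape follows from the lengths of the variable's dimensions.
        return tuple(self.dims[d] for d in self.variables[name].dims)

    def header(self):
        # Render a CDL-like text summary of the structure (no data values).
        lines = ["dimensions:"]
        for dim, size in self.dims.items():
            lines.append(f"    {dim} = {size} ;")
        lines.append("variables:")
        for var in self.variables.values():
            lines.append(f"    {var.dtype} {var.name}({', '.join(var.dims)}) ;")
            for key, value in var.attrs.items():
                lines.append(f'        {var.name}:{key} = "{value}" ;')
        return "\n".join(lines)


# Hypothetical example: 12 monthly fields on an illustrative 73 x 96 grid.
ds = Dataset(dims={"time": 12, "lat": 73, "lon": 96})
ds.add_variable(
    Variable(
        "temp",
        ("time", "lat", "lon"),
        attrs={"standard_name": "air_temperature", "units": "K"},
    )
)
print(ds.header())
```

The point of the sketch is that the metadata travels with the data: a tool that knows nothing about the model run can still read the dimension names, shapes, and units from the file itself.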
Ed Hartnett’s talk at: http://www.unidata.ucar.edu/software/netcdf/papers/nasa_data_workshop_2010.pdf

Climate Forecast conventions
• http://cf-pcmdi.llnl.gov/documents/cf-conventio
• define metadata that provide a definitive description of what the data in each variable represents
  – E.g. a variable called temp
    • Long name (ad hoc): near-surface daily mean
    • Standard name: air_temperature
    • Units: K

CF: time – two examples
    double time(time) ;
        time:long_name = "time" ;
        time:units = "days since 1990-1-1 0:0:0" ;
Days; Hours; Min; Sec
All data are for same date:
        time:units = "days since 1-7-15 0:0:0" ;
        time:calendar = "none" ;
    data:
        time = 0., 1., 2., ...;

How are data made accessible?
• Publish data in data centres:
  – Provide “experiment” metadata
  – Upload NetCDF data
  – Metadata are harvested from files into catalogue
• Web services
  – E.g. ncWMS

Some challenges

Current trends
[Diagram labels: Data; Diversity; Volume; Computation; Ensembles; Global + Regional; Legacy analyses (IDL, …); Cooperation across groups; Publish more than papers; Build research ecosystem; Collaboration]

Future lifecycle of research data
[Diagram labels: Public; Web services; Research community; Archives: BADC; Project; ECDF; Researcher]
• Provenance: re-use/modify analyses
• Easy transitions personal-project-world
• Tools to capture metadata: instrument current codes + workflow

Current challenges
• Wrap/instrument tools to give Metadata + Provenance in post-model analyses, impact modelling… learn from
  – SYSMO (Univ. of Manchester)
  – e-Science Central (Univ. of Newcastle)
  – Steve!
• Workflow with wrapped legacy tools?

Imminent challenges: impact / adaptation
[Diagram labels: Climate; biodiversity; ecologies; crops; flood; urban; land; ……; Socio-economics; NetCDF; Regular/nested grids; Triangulated irregular networks; Data synthesis; Climate downscaling; point->area; modelling; probabilistic data; data; Census – sociopolitical area]
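Returning to the "CF: time" slide: a CF time coordinate stores each value as a count of units elapsed since a reference date given in the `units` attribute. Below is a minimal sketch of decoding the first example with the Python standard library. It assumes the standard Gregorian calendar and only the "<unit> since <date> <time>" form shown on the slide. The function name `decode_cf_time` is hypothetical, and the second example (`calendar = "none"`) is deliberately out of scope because its values are not meant to map to real dates. Real analyses would normally rely on an established CF-aware library rather than this.

```python
import re
from datetime import datetime, timedelta

# Seconds per unit; only the units listed on the slide are supported.
_SECONDS_PER_UNIT = {"days": 86400, "hours": 3600, "minutes": 60, "seconds": 1}

_UNITS_PATTERN = re.compile(
    r"(\w+) since (\d+)-(\d+)-(\d+)(?: (\d+):(\d+):(\d+))?"
)


def decode_cf_time(values, units):
    """Convert numeric CF time values to datetimes (Gregorian calendar assumed)."""
    match = _UNITS_PATTERN.fullmatch(units.strip())
    if match is None:
        raise ValueError(f"unsupported units string: {units!r}")
    unit = match.group(1)
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError(f"unsupported time unit: {unit!r}")
    year, month, day = (int(match.group(i)) for i in (2, 3, 4))
    # The clock part is optional; default to midnight when it is absent.
    hour, minute, second = (
        int(g) if g is not None else 0 for g in match.group(5, 6, 7)
    )
    origin = datetime(year, month, day, hour, minute, second)
    return [
        origin + timedelta(seconds=v * _SECONDS_PER_UNIT[unit]) for v in values
    ]


# Values and units taken from the first example on the slide, plus one extra
# value (365.) added here for illustration.
dates = decode_cf_time([0.0, 1.0, 2.0, 365.0], "days since 1990-1-1 0:0:0")
for d in dates:
    print(d.isoformat())
```

Storing times as plain numbers plus a units string keeps the file compact and calendar-agnostic; the cost is that every tool must interpret the `units` and `calendar` attributes consistently, which is exactly what the CF conventions pin down.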