A flexible processing system for the calculation of air-sea gas fluxes: a dynamic tool for the community Jamie Shutler, Jean-Francois Piollé, Peter Land, David Woolf, Lonneke Goddijn-Murphy, Frederic Paul, Fanny Girard-Ardhuin,Craig Donlon, Bertrand Chapron OceanFlux GHG is funded by: and affiliated to: Introduction • Keen for Earth observation data to be exploited for SOLAS studies • Issues: – Simplifying access to these data (e.g. standard file types and tools) – Ability to handle large data sets • Flexible system needed for project uncertainty and scientific analyses • Simple to provide community access. – Greater transparency and traceability for publications • Users can create own climatologies, generate net fluxes and re-grid data • Data can be ingested into standard software packages and tools (e.g. Matlab, IDL, Excel). Cloud computing Cloud computing: inter-connected computing resources that can be easily scaled up (grown) or down (shrunk) while maintaining its capability or function (rather like a ‘cloud’ in the atmosphere). Features: • Redundancy – • Scale-able – • • • Processing is close to data (no bottleneck) System is tailored – • Uses standard hardware and software Simple backup and restoration No specific skills required by user Speed – • Servers can be removed or upgraded without users noticing No reliance on physical hardware e.g. use of virtual servers Maximises use of resources – Dynamic re-allocation of resources as and when needed Cloud computing Cloud computing: inter-connected computing resources that can be easily scaled up (grown) or down (shrunk) while maintaining its capability or function (rather like a ‘cloud’ in the atmosphere). Features: • Redundancy – • Scale-able – • • • Processing is close to data (no bottleneck) System is tailored – • Uses standard hardware and software Simple backup and restoration No specific skills required by user Speed – • Servers can be removed or upgraded without users noticing No reliance on physical hardware e.g. use of virtual servers Maximises use of resources – Dynamic re-allocation of resources as and when needed Nephelae 600 processing nodes. 1.5 Peta bytes of storage Available input data • All input data are 1o x 1o global monthly composites. Parameter Dataset (current number of datasets) Uncertainty Years SST NOAA AVHRR, ESA CCI, GHRSST datasets (4) yes 1992-2010 U10 ESA GlobWave archive (2+) yes 1992-2010 Hs ESA GlobWave archive (1+) yes 1992-2010 pCO2/fCO2 LDEO Takahashi 2002, LDEO Takahashi 2009, OceanFlux/SOCATv1.5, OceanFlux/SOCATv2 (4) majority 2000 2010 Rain GPCP, TRMM, SSMI (4) yes 1992-2012 Chl-a ESA GlobColour, ESA Ocean Colour CCI (2) yes 1997-2011 Basic climatology processing system Daily Dailydata data [1] [1] Compositing Compositing and/or and/orcorrection correction (mean, (mean,median, median, krigging, krigging,profile…) profile…) Rain Rain(intensity, (intensity,event) event) Wind Wind(speed, (speed,direction) direction) SOCAT SOCAT SST SST(foundation, (foundation,skin) skin) Ice (%age, thickness) Ice (%age, thickness) Wave Wave(altimeter (altimeterand andWWIII WWIIImodel) model) Salinity Salinity Monthly Monthlydata data SST SSTfronts fronts Biology Biology(ocean (oceancolour colourchl-a) chl-a) Takahashi Takahashiclimatology climatology Fixed Fixeddata data Longhurst Longhurstprovinces provinces Bathymetry Bathymetry Monthly data [2] [2] Climatology (monthly NetCDF) Process Processindicator indicatorlayers layers - -Variance Variance(all (alldata) data) - -Standard Standarddeviation deviation(all (alldata) data) - -Kurtosis Kurtosis(all (alldata) data) - -no. no.ofofdata datapoints points(all (alldata) data) - -Kriging stats (SOCAT data Kriging stats (SOCAT dataonly) only) - -Regions Regionsofoflow lowwind windspeed speed - -Dominance Dominanceofofwave wavebreaking breaking - -Diurnal Diurnalwarming warming(foundation (foundation––skin) skin) Flux Fluxcalculation calculation - -Default Defaultcalculation calculation(using (using Takahashi climatology Takahashi climatologyfor forpCO2 pCO2 data dataie. ie.not notusing usingSOCAT SOCATdata) data) - -Input Inputdata datauser userconfigurable configurable - -Formulation Formulationuser userconfigurable configurable - -Ability to re-process Ability to re-process - -Output Outputformat formatfixed fixed (contents/data-layers (contents/data-layersset setby by configuration file) configuration file) Process Processindicator indicatorlayers layers - -Biology type (low, medium, Biology type (low, medium,high) high) - -Presence Presenceofofstrong strongSST SSTgradients gradients Data Datalayers layers - -Monthly Monthlydata data(from (fromdaily dailydata, data,e.g. e.g. mean, mean,median, median,krigging kriggingresult) result) - -%age %ageice ice - -pCO pCO2W2W - -pCO pCO2A2A - -KKWBWB, ,KKWO , KW,trad , KWR, KWW WO, K W,trad, K WR, K - -flux flux - -ΔpCO ΔpCO2 2 - -asym, asym,CCI, I,CCMM Process Processindicator indicatorlayers layers - -Province Province - -Coastal Coastalororopen-ocean open-ocean Basic climatology processing system Daily Dailydata data [1] [1] Compositing Compositing and/or and/orcorrection correction (mean, (mean,median, median, krigging, krigging,profile…) profile…) Rain Rain(intensity, (intensity,event) event) Wind Wind(speed, (speed,direction) direction) SOCAT SOCAT SST SST(foundation, (foundation,skin) skin) Ice (%age, thickness) Ice (%age, thickness) Wave Wave(altimeter (altimeterand andWWIII WWIIImodel) model) Salinity Salinity Monthly Monthlydata data SST SSTfronts fronts Biology Biology(ocean (oceancolour colourchl-a) chl-a) Takahashi Takahashiclimatology climatology Fixed Fixeddata data Longhurst Longhurstprovinces provinces Bathymetry Bathymetry Monthly data [2] [2] Climatology (monthly NetCDF) Process Processindicator indicatorlayers layers - -Variance Variance(all (alldata) data) - -Standard Standarddeviation deviation(all (alldata) data) - -Kurtosis Kurtosis(all (alldata) data) - -no. no.ofofdata datapoints points(all (alldata) data) - -Kriging stats (SOCAT data Kriging stats (SOCAT dataonly) only) - -Regions Regionsofoflow lowwind windspeed speed - -Dominance Dominanceofofwave wavebreaking breaking - -Diurnal Diurnalwarming warming(foundation (foundation––skin) skin) Flux Fluxcalculation calculation - -Default Defaultcalculation calculation(using (using Takahashi climatology Takahashi climatologyfor forpCO2 pCO2 data dataie. ie.not notusing usingSOCAT SOCATdata) data) - -Input Inputdata datauser userconfigurable configurable - -Formulation Formulationuser userconfigurable configurable - -Ability to re-process Ability to re-process - -Output Outputformat formatfixed fixed (contents/data-layers (contents/data-layersset setby by configuration file) configuration file) Process Processindicator indicatorlayers layers - -Biology type (low, medium, Biology type (low, medium,high) high) - -Presence Presenceofofstrong strongSST SSTgradients gradients Data Datalayers layers - -Monthly Monthlydata data(from (fromdaily dailydata, data,e.g. e.g. mean, mean,median, median,krigging kriggingresult) result) - -%age %ageice ice - -pCO pCO2W2W - -pCO pCO2A2A - -KKWBWB, ,KKWO , KW,trad , KWR, KWW WO, K W,trad, K WR, K - -flux flux - -ΔpCO ΔpCO2 2 - -asym, asym,CCI, I,CCMM Process Processindicator indicatorlayers layers - -Province Province - -Coastal Coastalororopen-ocean open-ocean Basic climatology processing system Daily Dailydata data [1] [1] Compositing Compositing and/or and/orcorrection correction (mean, (mean,median, median, krigging, krigging,profile…) profile…) Rain Rain(intensity, (intensity,event) event) Wind Wind(speed, (speed,direction) direction) SOCAT SOCAT SST SST(foundation, (foundation,skin) skin) Ice (%age, thickness) Ice (%age, thickness) Wave Wave(altimeter (altimeterand andWWIII WWIIImodel) model) Salinity Salinity Monthly Monthlydata data SST SSTfronts fronts Biology Biology(ocean (oceancolour colourchl-a) chl-a) Takahashi Takahashiclimatology climatology Fixed Fixeddata data Longhurst Longhurstprovinces provinces Bathymetry Bathymetry Monthly data [2] [2] Climatology (monthly NetCDF) Process Processindicator indicatorlayers layers - -Variance Variance(all (alldata) data) - -Standard Standarddeviation deviation(all (alldata) data) - -Kurtosis Kurtosis(all (alldata) data) - -no. no.ofofdata datapoints points(all (alldata) data) - -Kriging stats (SOCAT data Kriging stats (SOCAT dataonly) only) - -Regions Regionsofoflow lowwind windspeed speed - -Dominance Dominanceofofwave wavebreaking breaking - -Diurnal Diurnalwarming warming(foundation (foundation––skin) skin) Flux Fluxcalculation calculation - -Default Defaultcalculation calculation(using (using Takahashi climatology Takahashi climatologyfor forpCO2 pCO2 data dataie. ie.not notusing usingSOCAT SOCATdata) data) - -Input Inputdata datauser userconfigurable configurable - -Formulation Formulationuser userconfigurable configurable - -Ability to re-process Ability to re-process - -Output Outputformat formatfixed fixed (contents/data-layers (contents/data-layersset setby by configuration file) configuration file) Process Processindicator indicatorlayers layers - -Biology type (low, medium, Biology type (low, medium,high) high) - -Presence Presenceofofstrong strongSST SSTgradients gradients Data Datalayers layers - -Monthly Monthlydata data(from (fromdaily dailydata, data,e.g. e.g. mean, mean,median, median,krigging kriggingresult) result) - -%age %ageice ice - -pCO pCO2W2W - -pCO pCO2A2A - -KKWBWB, ,KKWO , KW,trad , KWR, KWW WO, K W,trad, K WR, K - -flux flux - -ΔpCO ΔpCO2 2 - -asym, asym,CCI, I,CCMM Process Processindicator indicatorlayers layers - -Province Province - -Coastal Coastalororopen-ocean open-ocean Features of climatology processing system • • • • All datasets are online and pre-processed into monthly composites. Monthly composites include mean, median, σ2 and 2nd - 4th order moments. 1 Year global climatology takes 40 mins to generate. Ability to disable process indicator layers for speed increase – 1 year takes ~20 mins to generate – 10 year global run = 3.5 hours. • Flux calculation is user configurable. – e.g user configurable wind based k relationships, optional handling of vertical thermal and haline gradients. • Ability to inject noise or biases into input datasets – e.g. inject random noise based on known uncertainties of individual input data. • Additional tools: – Net flux tool – Takahashi style grid re-sampling tool • Python and Perl using standard libraries. ie no licenses are required and portable (>4000 lines of code !) Example output Global regular grid 1o x 1o NetCDF 3.0, CF 1.6 Example daily mean flux Example 2010 g C m-2 day-1 Example global time series Example output Concentration at the interface (CI) Example output Concentration at the interface (CI) Concentration at the MBL (CM) Example output Concentration at the interface (CI) Concentration at the MBL (CM) Gas transfer velocity (k) Example output – process indicator layers Persistent SST fronts (GHRSST) Example output – process indicator layers Persistent SST fronts (GHRSST) Surface chl-a (ESA CCI) Example output – process indicator layers Persistent SST fronts (GHRSST) Surface chl-a (ESA CCI) Regions of low wind Verification and testing • Software extensively tested over a period of months. • Quality layers highlight any regions which fail quality criteria as set out in the OceanFlux GHG Technical Specification (TS). • Software verified using Takahashi climatology data (SST, XCO 2, U10, pCO2w, air pressure, ice) as input (at 1o × 1o): – kSW06 < 2 % difference – pCO2a < 0.25 % difference – pH2O < 1.4 % difference – Ko < 2 % difference – Annual global net flux 1.34 Gt C yr-1 (ref: Takahashi 1.4 ± 0.7 Gt C yr-1) • Development version of software and output formats have been evaluated by an external independent third party company. • NetCDF3 output follow CF 1.6 guidelines Web based control Web based control Web based control Web based control Web based control Web based control And finally…. a brochure PDF on project website Conclusions • Highly flexible community tool to calculate global air-sea CO 2 fluxes. • Exploits Cloud based computing • Large amount of climate quality data and processing capability available. – 8 Terra bytes available and/or has been pre-processed – 500 Terra bytes of EO data available – 600 processing cores • Flexible solution – e.g. very simple to add additional datasets or run long time series. • Output fluxes have been verified. • System exploited for a number of OceanFlux GHG investigations and analyses. Future: Expand to handle other gases ? Shutler, J. D., Piolle, J. F., Land, P. E., Woolf, D. K., Goddijn-Murphy, L., Paul, F., et al. (in-prep) Air-sea gas flux data processing system using Cloud computing: a flexible and dynamic tool for the community, to be submitted to Journal of Atmospheric and Oceanic Technology.