Software Development and Management Processes in the NOAA Environmental Software Infrastructure and Interoperability (NESII) Group
Ryan O’Kuinghttons, Cecelia DeLuca and the NESII team
11th World Congress on Computational Mechanics (WCCM XI)
Barcelona, Spain, July 21, 2014

Motivation
• How has the NESII group contributed to the data, metadata, and modeling capabilities of the Earth system community?
  ◦ Development of tools, standards, and conventions
• What has been successful in development of well-managed and reliable code (both within NESII and external user codes)?
  ◦ Community ownership - stakeholders take part in regular meetings to set priorities
  ◦ Distributed development – utilize expertise of wide range of collaborators
  ◦ Open project processes, records, metrics, and code – accessibility breeds involvement, collaboration, and productivity
  ◦ Good software development practices - version control, issue tracking, regression testing, standards, documentation, documentation, and more documentation

The Basics
• NESII focuses on development of software infrastructure for Earth system modeling
• Based at the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory in Boulder, Colorado, USA
• Diverse customer base - modeling groups from universities, the U.S. Navy, the National Center for Atmospheric Research (NCAR) and other major U.S.
research centers, the National Weather Service (NWS), the Department of Defense (DoD), the National Aeronautics and Space Administration (NASA), and the National Science Foundation (NSF)
• Started with the Earth System Modeling Framework (ESMF) project – has grown to include several others

The Vision
• Develop interoperable modeling components that can connect in multiple ways
  ◦ Improve predictions and support research
• Build advanced utilities that many models can use
  ◦ Enable research, promote efficiency
• Enable models to be self-describing
  ◦ Increase understanding and defensibility of outputs
• Create workflows that automate the modeling process from beginning to end
  ◦ Improve productivity
• Build workspaces that encourage collaborative, distributed development of models and data analysis
  ◦ Leverage distributed expertise

Key Values
• Driven by the need to connect many different communities and codes
• Commitment to standards (data, metadata, component interfaces, services) to maximize interoperability
• Community-driven development and community ownership
  ◦ Formal governance processes in which customers set priorities
  ◦ Frequent public design reviews and demonstrations
  ◦ Expertise shared among partners
• Openness of project metrics, code and information
• Commitment to a globally distributed and diverse development and customer base

Associated Projects
• Modeling Infrastructure
  ◦ Earth System Modeling Framework (ESMF)
  ◦ ESMF Grid Remapping
  ◦ National Unified Operational Prediction Capability (NUOPC)
  ◦ Earth System Prediction Suite (ESPS)
  ◦ Cupid Integrated Development Environment
• Metadata and Data Infrastructure
  ◦ Open Climate Geographic Information Systems (OCGIS)
  ◦ Hydro-Climate Modeling
  ◦ Earth System Documentation (ES-DOC)
  ◦ Earth System CoG Collaboration Environment

ESMF
Started: 2002
Collaborators: Co-developed and used by NASA GEOS-5 climate model, NOAA (NCEP weather models), U.S.
Navy (global and regional models), Community Earth System Model (CESM)
• ESMF increases code reuse and interoperability in climate, weather, coastal, and other Earth system models
• ESMF is based on the idea of components, sections of code that are wrapped in standard calling interfaces
• Provides high performance libraries and tools for:
  ◦ Time management
  ◦ Data communications
  ◦ Metadata and I/O
  ◦ Running models as web services
  ◦ Parallel grid remapping
Metrics:
  ~5500 downloads
  ~3000 individuals on info mailing list
  ~40 platform/compilers regression tested
  ~6400 regression tests
  ~830,000 SLOC
http://www.earthsystemmodeling.org/

ESMF as an Information Layer
Applications of information layer
• Parallel generation and application of interpolation weights
• Fast parallel I/O
• Redistribution and other parallel communications
• Automated documentation of models and simulations
• Ability to run components in workflows and as web services
[Diagram: Structured model information stored in ESMF wrappers. ESMF data structures: standard metadata (Attributes: CF conventions, ISO standards, METAFOR Common Information Model) and standard data structures (Component, Field, Grid, Clock). User data is referenced or copied into ESMF structures. Native model data structures: modules, grids, fields, timekeeping.]

ESMF Grid Remapping/Interpolation
• Fortran, C and Python interfaces
  ◦ Weight generation is also available through a command line application
• Works with any combination of logically rectangular and unstructured grids
  ◦ In 2D or 3D, with the following exceptions:
    • 3D unstructured made of hexahedrons only
    • Higher-order patch method not available in 3D
• Regridding options include:
  ◦ Methods: bilinear, higher-order patch, conservative, nearest-neighbor
  ◦ Masking: source, destination
  ◦ Pole handling: N-point average, full circle average, no pole
  ◦ Ignore unmapped destination points
  ◦ Line type: great circle or linear
Figure captions: FIM Unstructured Grid (courtesy of ESRL GSD); Regional Grid; 2D Unstructured Mesh from
www.ngdc.noaa.gov

NUOPC Overview
Started: 2010
Collaborators: Tri-agency (NOAA, U.S. Navy, U.S. Air Force) consortium of operational weather prediction centers, with participation from NOAA Geophysical Fluid Dynamics Laboratory (GFDL) and NASA modelers
• ESMF allows for many levels of components, types of components, and types of connections
• In order to achieve greater interoperability, usage and content conventions and component templates are needed
• This collaboration is building a “NUOPC Layer” that constrains how ESMF is used, and introduces metadata and other content standards
• Goals are to increase collaboration between the agencies and accelerate the transition of new technology into the operational centers
https://www.earthsystemcog.org/projects/nuopc/

NUOPC Architecture
An interoperability layer on top of ESMF that adds:
• Generic components that provide a standard implementation of interoperable components
• Formalized initialize sequences and runtime field brokering
• Mechanisms to report component incompatibilities detected during run-time
• A compliance checker option that serves as a development and debugging tool
• A collection of example applications

The Earth System Prediction Suite
• The Earth System Prediction Suite (ESPS) is a collection of major U.S. weather and climate modeling codes that use ESMF with the NUOPC conventions.
• The ESPS makes clear which codes are available as ESMF components and modeling systems.
• Currently, components in the ESPS can be of the following types: coupled system, atmosphere, ocean, wave, sea ice
https://www.earthsystemcog.org/projects/esps/

ESPS Code Status
Legend: Compliant; In progress (with completion date, 2014 or 2015); Candidate
• Coupled Modeling Systems: NEMS, CFSv3, COAMPS / COAMPS-TC, NavGEM-HYCOM-CICE, GEOS-5, ModelE, CESM
• Atmospheres: GFS/GSM, NMMB, CAM, FIM, GEOS-5 FV, ModelE Atm, COAMPS Atm, NavGEM, NEPTUNE, WRF
• Oceans: MOM5, HYCOM, NCOM, MPAS-O, POP
• Ice: CICE
• Wave: WW3, SWAN
Spanning major climate, weather, and ocean codes, ESPS is the most direct response to calls for common modeling infrastructure yet assembled

Cupid Integrated Development Environment
Started: 2012
Collaborators: NOAA CIRES, GA Tech, and NASA GISS/GSFC
• Cupid is a tool designed to make ESMF training and development simpler, faster, and more appealing
• Goal: Make ESMF development simpler and more appealing while accelerating the rate at which ESMF is adopted into new components
• Plugin for Eclipse-based “Integrated Development Environment” or IDE
• Customized for ESMF applications with NUOPC conventions
• Cupid is a working prototype expected to be ready for first public release later this year
https://earthsystemcog.org/projects/cupid/

Cupid Development and Training Environment
Screenshot callouts: Run locally or on a cloud; Select sample code or model; Source code editor; NUOPC outline; Project explorer; Console for viewing output
• Pick a training problem (or coupled model)
• Generate a framework-aware outline of the source code
• Navigate around the source code using the outline
• Use an editor to modify the source code
• Automatically generate code needed for NUOPC compliance
• Compile and run locally or on a cloud (currently Amazon Web Services)

ESMF Web Services and Climate Impacts Modeling
• The ESMF distribution allows any networked ESMF component to be available as a web service.
• Coupling using web services offers a new perspective on climate impacts modeling.
• Instead of what impacts are “put in” the climate model …
• How do we create a distributed network of models that links climate to local and regional processes while preserving native infrastructure and specialized information delivery systems?
• Partners include NESII, University of South Carolina, University of Michigan.

Climate-Hydro Coupling
• Access to the atmosphere model across the network is through ESMF Web Services
• Atmosphere model output data is streamed via ESMF Web Services
• Hydrologic model: Soil and Water Assessment Tool (SWAT) runs “in the cloud” on Amazon Web Services Virtual Server, with a native (OpenMI) interface
• Atmosphere Model: Community Atmosphere Model (CAM) or Weather Research and Forecast Model (WRF) runs on a supercomputer, with an ESMF interface
• Wrappers for models provide OpenMI RESTful service interfaces
• Driver (Web Application) uses OpenMI/HTTP interface to timestep through models
Diagram labels: Client Web Browser; Personal Computer; Apache HTTP Server; ESMF OpenMI Service; OpenMI Svc; SWAT; ESMF Web Services; ESMF CAM Component; High Performance Computer; Amazon Web Services EC2 Virtual Windows Svr
Technical proof-of-concept is complete, not scientifically validated
Currently refining a GUI capability which allows for specification of fields exchanged
https://www.earthsystemcog.org/projects/hydroclimatemodeling/

OpenClimateGIS
• OpenClimateGIS (OCGIS) is a Python-based, open source software library enabling dynamic access to and manipulation of climate data
• Its goal is to overcome barriers to the usability of climate projections in adaptation planning and resource management
  ◦ Translate out of climate data formats to GIS-friendly formats
  ◦ Select geographical regions of interest
  ◦ Select times/levels of interest
  ◦ Compute application-relevant indices
  ◦ Convert to end-user and analysis-ready formats
  ◦ Provide comprehensive metadata
• Developed and used by the NCPP
project, the IS-ENES climate4impact project, ClimatePipes, and others
http://www.earthsystemcog.org/projects/openclimategis/

Precipitation Threshold for Tampa Bay Watershed Basins
• Figure at right generated using Quantum GIS with shapefile output from OpenClimateGIS.
• It shows daily precipitation exceedances using a threshold value of 9.62 mm/day for July 1990.
• Source data uses the Bias Corrected/Constructed Analogs (BCCA) method to downscale data from the Canadian Centre for Climate Modelling and Analysis (CCCma) Coupled Global Climate Model (CGCM).

Earth System Documentation (ES-DOC)
Started: 2012
Collaborators: British Atmospheric Data Centre, Institut Pierre Simon Laplace, NOAA/NESII, Program for Climate Model Diagnosis and Intercomparison
• International distributed effort to develop metadata services for climate modeling and related projects
• Evolution of E.U. Metafor project and U.S. Earth System Curator project
• Developing tools for model documentation based on the Common Information Model (CIM):
  ◦ Search – Connected to Earth System Grid Federation node with access to international archive of climate and related datasets
  ◦ Create – Questionnaires and scripting libraries (pyesdoc) create model metadata
  ◦ Compare – Web-based Comparator to examine model documentation
  ◦ Visualize – Web-based Viewer to display model documentation
https://earthsystemcog.org/projects/es-doc-models/
http://es-doc.org/

ES-DOC Questionnaire
• Motivated by the need to collect model information for intercomparison projects
• Collaboration with E.U. IS-ENES partners and National Climate Predictions and Projections (NCPP) platform – a NOAA project focused on making local and regional climate information more usable in applications
• Developed a flexible utility based on the Common Information Model (CIM) developed by the E.U.
Metafor project
• Generates customizable CIM-based questionnaires on the fly
Screenshot of a sample questionnaire
https://www.earthsystemcog.org/projects/es-doc-models/questionnaire_description

Earth System CoG
Started: 2009
Collaborators: National Center for Atmospheric Research (NCAR), Earth System Grid Federation (ESGF), University of Michigan, University of Colorado (CU) Community Surface Dynamics Modeling System
• CoG is a collaboration environment that hosts and links software development projects, model intercomparison projects (MIPs), events, and workshops.
• It includes a configurable search of data on any ESGF data node.
• It provides projects with a wiki and customizable navigation to wiki content.
• It contains an ontology for the description and management of projects and provides a consolidated look at this content across a project’s network.
• It contains a file server for documents and images.
• It provides services for Earth system model metadata collection and display (through ES-DOC tools).
Some of the 70+ projects currently hosted on CoG include:
• NOAA’s High Impact Weather Prediction Project (HIWPP)
• Atmospheric Dynamical Core Model Intercomparison Project (DCMIP)
• Reanalysis Data for CMIP5 (Ana4MIPs)
• Observational Data for CMIP5 (Obs4MIPs)
• National Unified Operational Prediction Capability (NUOPC)
• National Climate Predictions and Projections Platform (NCPP)
• Earth System Documentation (ES-DOC)
• Earth System Prediction Capability (ESPC)
https://www.earthsystemcog.org/

Wiki and Collaboration Tools
https://www.earthsystemcog.org/projects/dcmip-2012/
The CoG layout is color-coded:
• The right-hand side (dark yellow) is where services (data, news, project connectivity) are located.
• The upper navigation bar (dark teal) contains links to project-level metadata.
• On the left (light teal) is navigation for custom content.
• The central portion of the site is a wiki that allows projects to create their own content.

ESGF-CoG Integration
• The Earth System Grid Federation (ESGF) is software for the management and analysis of Earth science data on a global scale.
• CoG is a collaboration environment and hub to connect projects in the Earth sciences.
• CoG will provide a superior interface to ESGF users and data managers in terms of:
  • Overall usability
  • Content management
  • Model intercomparison project support
  • Multi-project support
  • Online collaboration tools
Reference: 3rd Annual Earth System Grid Federation and Ultrascale Visualization Climate Data Analysis Tools Face-to-Face Meeting Report, December (http://aims-group.github.io/pdf/ESGF_UV-CDAT_Meeting_Report_March2014.pdf)

Revisit our motivations...
• How has the NESII group contributed to the data, metadata, and modeling capabilities of the Earth system community?
  ◦ Development of tools, standards, and conventions
• What has been successful in development of well-managed and reliable code (both within NESII and external user codes)?
  ◦ Community ownership - stakeholders take part in regular meetings to set priorities
  ◦ Distributed development – utilize expertise of wide range of collaborators
  ◦ Open project processes, records, metrics, and code – accessibility breeds involvement, collaboration, and productivity
  ◦ Good software development practices - version control, issue tracking, regression testing, standards, documentation, documentation, and more documentation

Questions?
• For more information, links and references, see our group page: http://esrl.noaa.gov/nesii/
• CoG pages: https://earthsystemcog.org/
• ESMF homepage: http://www.earthsystemmodeling.org/
• Send questions to: esmf_support@list.woc.noaa.gov
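
Supplementary sketch: generation and application of interpolation weights

The ESMF slides describe "generation and application of interpolation weights" as two separate steps. The Python sketch below illustrates that split for the simplest possible case, 1D linear interpolation, using sparse (row, col, weight) triplets. It is a toy illustration only: the function names `linear_weights` and `apply_weights` are hypothetical, it does not call ESMF, and it does not reproduce ESMF's parallel or multi-dimensional algorithms.

```python
from bisect import bisect_right


def linear_weights(src_x, dst_x):
    """Generate sparse (row, col, weight) triplets for 1D linear interpolation.

    src_x must be strictly increasing and contain at least two points.
    Destination points outside the source range receive no weights,
    i.e. they are left unmapped.
    """
    triplets = []
    for row, x in enumerate(dst_x):
        if x < src_x[0] or x > src_x[-1]:
            continue  # unmapped destination point
        # Left neighbor index, clamped so that j + 1 stays in range.
        j = min(bisect_right(src_x, x) - 1, len(src_x) - 2)
        t = (x - src_x[j]) / (src_x[j + 1] - src_x[j])
        triplets.append((row, j, 1.0 - t))
        triplets.append((row, j + 1, t))
    return triplets


def apply_weights(triplets, src_field, n_dst, fill=None):
    """Apply precomputed weights: dst[row] += weight * src[col]."""
    dst = [0.0] * n_dst
    mapped = [False] * n_dst
    for row, col, weight in triplets:
        dst[row] += weight * src_field[col]
        mapped[row] = True
    # Unmapped destination points are reported with the fill value.
    return [value if ok else fill for value, ok in zip(dst, mapped)]


src_x = [0.0, 1.0, 2.0]
dst_x = [0.5, 1.0, 2.0, 3.0]          # the last point lies outside the source grid
weights = linear_weights(src_x, dst_x)  # computed once
print(apply_weights(weights, [10.0, 20.0, 40.0], len(dst_x)))
# prints [15.0, 20.0, 40.0, None]
```

Separating the two steps means the weights depend only on the grids, so in this sketch they can be computed once and reused for every field or time step that shares the same source and destination points.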