Community infrastructure for building and coupling high
performance climate, weather, and coastal models
Cecelia DeLuca
NOAA / CIRES
University of Colorado, Boulder
cecelia.deluca@noaa.gov
ESMF website: www.earthsystemmodeling.org
Building Community Codes for Effective Scientific Research
on HPC Platforms
September 6-7, 2012
Outline
• Vision and Context
• Technical Overview
• User Community
• Values, Processes and Governance
• Future
Vision
• Earth system models that can be built, assembled and reconfigured
easily, using shared toolkits and standard interfaces.
• A growing pool of Earth system modeling components that, through
their broad distribution and ability to interoperate, promote the rapid
transfer of knowledge.
• Earth system modelers who are able to work more productively,
focusing on science rather than technical details.
• An Earth system modeling community with cost-effective, shared
infrastructure development and many new opportunities for scientific
collaboration.
• Accelerated scientific discovery and improved predictive capability
through the social and technical influence of ESMF.
Computational Context
• Teams of specialists, often at different sites, contribute scientific or computational
components to an overall modeling system
• Components may be at multiple levels: individual physical processes (e.g.
atmospheric chemistry), physical realms (e.g. atmosphere, ocean), and
members of same or multi-model ensembles (e.g. “MIP” experiments)
• Components contributed from multiple teams must be coupled together, often
requiring transformations of data in the process (e.g. grid remapping and
interpolation, merging, redistribution)
• Transformations most frequently involve 2D data, but 3D is becoming more common
• There is an increasing need for cross-disciplinary and inter-framework coupling
for climate impacts
• Running on tens of thousands of processors is fairly routine; utilizing
hundreds of thousands of processors or GPUs is less common
• Modelers will tolerate virtually no framework overhead and seek fault
tolerance and bit reproducibility
• Provenance collection is increasingly important for climate simulations
Architecture
• The Earth System Modeling
Framework (ESMF) provides a
component architecture or
superstructure for assembling
geophysical components into
applications.
• ESMF provides an
infrastructure that modelers
use to
– Generate and apply
interpolation weights
– Handle metadata, time
management, I/O and
communications, and other
common functions
The ESMF distribution does not
include scientific models
[Diagram: the ESMF "sandwich" architecture]
ESMF Superstructure (components layer): Gridded Components, Coupler Components
Model Layer: User Code
ESMF Infrastructure (fields and grids layer): Low Level Utilities
External Libraries: MPI, NetCDF, …
Components
• ESMF is based on the idea of components – sections of code that are
wrapped in standard interfaces
• Components can be arranged hierarchically, helping to organize the
structure of complex models (a sketch of such an assembly follows below)
• Different modeling groups may create different kinds or levels of
components
[Figure: some of the ESMF components in the GEOS-5 atmospheric GCM]
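As a rough illustration of this hierarchical arrangement, the sketch below (not taken from the slides) shows a parent component creating two child Gridded Components and a Coupler Component. The component names and the hypothetical atmModule/ocnModule/cplModule modules are illustrative assumptions, not GEOS-5 code.

! A minimal sketch, assuming hypothetical child modules that each provide
! a SetServices routine; the parent assembles them into a coupled system.
subroutine assemble(rc)
  use ESMF
  use atmModule, only: atmSetServices => SetServices   ! hypothetical
  use ocnModule, only: ocnSetServices => SetServices   ! hypothetical
  use cplModule, only: cplSetServices => SetServices   ! hypothetical
  implicit none
  integer, intent(out) :: rc
  type(ESMF_GridComp) :: atm, ocn
  type(ESMF_CplComp)  :: cpl
  type(ESMF_State)    :: atmExport

  ! Create the child components
  atm = ESMF_GridCompCreate(name="atmosphere", rc=rc)
  ocn = ESMF_GridCompCreate(name="ocean", rc=rc)
  cpl = ESMF_CplCompCreate(name="atm-ocn coupler", rc=rc)

  ! Attach each child's user code through its SetServices routine
  call ESMF_GridCompSetServices(atm, userRoutine=atmSetServices, rc=rc)
  call ESMF_GridCompSetServices(ocn, userRoutine=ocnSetServices, rc=rc)
  call ESMF_CplCompSetServices(cpl, userRoutine=cplSetServices, rc=rc)

  ! Data exchanged between children travels through States
  atmExport = ESMF_StateCreate(name="atm export", rc=rc)
end subroutine assemble

In a real system the parent would then drive the children through the standard Initialize, Run, and Finalize methods described on the next slide.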
ESMF as an Information Layer
Applications of the information layer:
• Parallel generation and application of interpolation weights
• Run-time compliance checking of metadata and time behavior
• Fast parallel I/O
• Redistribution and other parallel communications
• Automated documentation of models and simulations
• Ability to run components in workflows and as web services
[Diagram: structured model information stored in ESMF wrappers]
• Standard data structures (ESMF data structures): Component, Field, Grid, Clock
• Standard metadata (Attributes): CF conventions, ISO standards, METAFOR Common Information Model (a small metadata sketch follows)
• Native model data structures (modules, grids, fields, timekeeping): user data is referenced or copied into ESMF structures
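The sketch below (not from the slides) suggests how standard metadata can be attached to an ESMF data structure so the information layer can harvest it; the attribute names and values are illustrative CF-style assumptions.

! A minimal sketch, assuming CF-style attribute names: attach standard
! metadata to a Field so it can feed automated documentation of the run.
subroutine annotate_temperature(field, rc)
  use ESMF
  implicit none
  type(ESMF_Field), intent(inout) :: field
  integer,          intent(out)   :: rc

  ! Attributes live on the ESMF wrapper, not inside the native model arrays
  call ESMF_AttributeSet(field, name="standard_name", &
                         value="air_temperature", rc=rc)
  if (rc /= ESMF_SUCCESS) return
  call ESMF_AttributeSet(field, name="units", value="K", rc=rc)
end subroutine annotate_temperature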
Standard Interfaces
• All ESMF components have the same three standard methods:
– Initialize
– Run
– Finalize
• Each standard method has the same simple interface:
call ESMF_GridCompRun (myComp, importState, exportState, clock, …)
Where:
myComp points to the component
importState is a structure containing input fields
exportState is a structure containing output fields
clock contains timestepping information
• Interfaces are wrappers and can often be set up in a non-intrusive way
Steps to adopting ESMF
• Divide the application into components (without ESMF)
• Copy or reference component input and output data into ESMF data structures
• Register components with ESMF (a registration sketch follows below)
• Set up ESMF couplers for data exchange
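The following sketch (not from the slides) shows one way a user routine can be registered as a component's standard Run method; the module and routine names (myModel, my_run) are placeholders, and Initialize and Finalize would be registered the same way.

! A minimal sketch, assuming a hypothetical model module; the existing
! time-step code is wrapped by my_run, which ESMF calls through the
! standard Run interface.
module myModel
  use ESMF
  implicit none
contains
  subroutine SetServices(gcomp, rc)
    type(ESMF_GridComp)  :: gcomp
    integer, intent(out) :: rc
    ! Register my_run as this Gridded Component's standard Run method
    call ESMF_GridCompSetEntryPoint(gcomp, ESMF_METHOD_RUN, &
                                    userRoutine=my_run, rc=rc)
  end subroutine SetServices

  subroutine my_run(gcomp, importState, exportState, clock, rc)
    type(ESMF_GridComp)  :: gcomp
    type(ESMF_State)     :: importState, exportState
    type(ESMF_Clock)     :: clock
    integer, intent(out) :: rc
    ! Existing model time-step code goes here; fields are read from
    ! importState and written to exportState
    rc = ESMF_SUCCESS
  end subroutine my_run
end module myModel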
Data Structures
• ESMF represents data as Fields that can be built on four discretization types (a Field creation sketch follows this list):
– Logically rectangular grids, which may be connected at the edges
– Unstructured meshes
– Observational data streams
– Exchange grids – grids that are the union of the grids of
components being coupled, and reference the data on the
original grids
• ESMF can transform data among these representations,
using an underlying finite element unstructured mesh
engine
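A minimal sketch (not from the slides) of building a Field on a logically rectangular grid follows; the grid size and field name are illustrative assumptions, and a real model would more typically wrap its existing arrays rather than let ESMF allocate the data.

! A minimal sketch, assuming a 1-degree global grid that is periodic in
! longitude; ESMF allocates the Field data on that grid.
subroutine make_sst_field(sst, rc)
  use ESMF
  implicit none
  type(ESMF_Field), intent(out) :: sst
  integer,          intent(out) :: rc
  type(ESMF_Grid) :: grid

  ! Logically rectangular grid with one periodic dimension
  grid = ESMF_GridCreate1PeriDim(maxIndex=(/360, 180/), rc=rc)
  if (rc /= ESMF_SUCCESS) return

  ! Field with ESMF-allocated 8-byte real data on that grid
  sst = ESMF_FieldCreate(grid, typekind=ESMF_TYPEKIND_R8, &
                         name="sea_surface_temperature", rc=rc)
end subroutine make_sst_field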
Component Overhead
[Figure: timing comparison showing the overhead of the ESMF-wrapped native CCSM4 component]
• For this example, ESMF wrapping required NO code changes to scientific modules
• No significant performance overhead (< 3% is typical)
• Few code changes for codes that are modular
• Platform: IBM Power 575 (bluefire) at NCAR
• Model: Community Climate System Model (CCSM)
• Versions: CCSM_4_0_0_beta42 and ESMF_5_0_0_beta_snapshot_01
• Resolution: 1.25 degree x 0.9 degree global grid with 17 vertical levels for both the atmospheric and land models (288x192x17); the ocean model resolution is 320x384x60
ESMF Regridding
ESMF offers extremely fast parallel regridding with many options (regional, global; bilinear, higher order, and first order conservative methods; logically rectangular grids and unstructured meshes; pole options; 2D or 3D; invocation as an offline application or during a model run).
Summary of features:
http://www.earthsystemmodeling.org/esmf_releases/non_public/ESMF_5_3_0/esmf_5_3_0_regridding_status.htm
[Figures: ESMF supported grids – HOMME cubed sphere grid with pentagons (courtesy Mark Taylor of Sandia), FIM unstructured grid, regional grid]
IMPACT: “use of the parallel ESMF offline regridding capability has reduced the time it takes to create CLM surface datasets from hours to minutes” - Mariana Vertenstein, NCAR
Regridding Modes
• ESMF Offline RegridWeightGen Application:
– Separate application that is built as part of ESMF and can be used independently
– Generates a netCDF weight file from two netCDF grid files
– Supports SCRIP, GRIDSPEC, UGRID, and custom ESMF unstructured formats
mpirun -np 32 ESMF_RegridWeightGen -s src_grid.nc -d dst_grid.nc -m bilinear -w weights.nc
• Regridding during model execution (a fuller sketch follows below):
– ESMF library subroutine calls which do interpolation during the model run
– Can return weights or feed them directly into the ESMF parallel sparse matrix multiply
– Can be used without ESMF components
call ESMF_FieldRegridStore(srcField=src, dstField=dst,
regridMethod=ESMF_REGRID_METHOD_BILINEAR, routehandle=rh)
call ESMF_FieldRegrid(srcField=src, dstField=dst, routehandle=rh)
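Expanding on the two calls above, the sketch below (not from the slides) shows the full store/apply/release cycle. It assumes src and dst are Fields already created on the source and destination grids, and the exact spelling of the bilinear method flag varies across ESMF versions.

! A minimal sketch of the regrid-during-execution pattern: compute the
! interpolation weights once, apply them each step, release when done.
subroutine regrid_once_apply_often(src, dst, rc)
  use ESMF
  implicit none
  type(ESMF_Field), intent(inout) :: src, dst
  integer,          intent(out)   :: rc
  type(ESMF_RouteHandle) :: rh

  ! One-time setup: generate and store interpolation weights in parallel
  call ESMF_FieldRegridStore(srcField=src, dstField=dst, &
         regridMethod=ESMF_REGRID_METHOD_BILINEAR, routehandle=rh, rc=rc)
  if (rc /= ESMF_SUCCESS) return

  ! In a real model this call sits inside the time loop: apply the stored
  ! weights as a parallel sparse matrix multiply
  call ESMF_FieldRegrid(srcField=src, dstField=dst, routehandle=rh, rc=rc)
  if (rc /= ESMF_SUCCESS) return

  ! Cleanup after the run
  call ESMF_FieldRegridRelease(routehandle=rh, rc=rc)
end subroutine regrid_once_apply_often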
ESMP – Python Interface
● Flexible way to use ESMF parallel regridding
● Separate download:
http://esmfcontrib.cvs.sourceforge.net/viewvc/esmfcontrib/python/ESMP/
● Requirements:
– Python
– Numpy
– Ctypes
● Limited platform support: Linux/Darwin, GCC (g++/gfortran), OpenMPI
● Data type: ESMP_Field
● Grid types:
– Single-tile 2D logically rectangular type: ESMP_Grid
– Unstructured type: ESMP_Mesh
● Support for all ESMF interpolation options
ESMF Web Services for
Climate Impacts
GOAL: Develop a two-way coupled, distributed, service-based modeling
system composed of an atmospheric climate model and a hydrological
model, utilizing standard component interfaces from each domain.
IDEA: Bring the climate model to local applications
• Two-way technical coupling completed during 2012 using CAM (with
active land) and SWAT (Soil and Water Assessment Tool)
• Utilized switch in CAM’s ESMF component wrapper that enables web
service interface
• Coupled system is run using the OpenMI configuration editor, a web
service driver developed in the hydrological community
• Next steps: using WRF within the CESM framework and an updated
hydrological model
Summary of Features
• Components with multiple coupling and execution modes for flexibility, including a web service execution mode
• Fast parallel remapping with many features
• Core methods are scalable to tens of thousands of processors
• Supports hybrid (threaded/distributed) programming for optimal performance on many computer architectures; works with codes that use OpenMP and OpenACC
• Time management utility with many calendars, forward/reverse time operations, alarms, and other features (see the clock sketch after this list)
• Metadata utility that enables comprehensive, standard metadata to be written out in standard formats
• Runs on 30+ platform/compiler combinations, with an exhaustive nightly regression test suite (4500+ tests) and documentation
• Couples Fortran or C-based model components
• Open source license
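To make the time management utility concrete, here is a minimal sketch (not from the slides) of a clock stepping from a start to a stop time; the dates, the 30-minute step, and the assumption that a default calendar was set at ESMF_Initialize are all illustrative.

! A minimal sketch of ESMF time management; assumes ESMF_Initialize has
! already been called with a default calendar (e.g. Gregorian).
subroutine run_demo_clock(rc)
  use ESMF
  implicit none
  integer, intent(out)    :: rc
  type(ESMF_Clock)        :: clock
  type(ESMF_Time)         :: startTime, stopTime
  type(ESMF_TimeInterval) :: timeStep

  call ESMF_TimeSet(startTime, yy=2012, mm=9, dd=6, rc=rc)
  call ESMF_TimeSet(stopTime,  yy=2012, mm=9, dd=7, rc=rc)
  call ESMF_TimeIntervalSet(timeStep, m=30, rc=rc)   ! 30-minute step

  clock = ESMF_ClockCreate(timeStep=timeStep, startTime=startTime, &
                           stopTime=stopTime, name="demo clock", rc=rc)

  ! Advance to the stop time; a model's Run method would be called each step
  do while (.not. ESMF_ClockIsStopTime(clock, rc=rc))
    call ESMF_ClockAdvance(clock, rc=rc)
  end do

  call ESMF_ClockDestroy(clock, rc=rc)
end subroutine run_demo_clock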
Major Users
ESMF Components:
• NOAA National Weather Service operational weather models
(GFS, Global Ensemble, NEMS)
• NASA atmospheric general circulation model GEOS-5
• Navy and related atmospheric, ocean and coastal research
and operational models – COAMPS, NOGAPS, HYCOM,
WaveWatch, others
• Hydrological modelers at Delft Hydraulics, space weather
modelers at NCAR and NOAA
ESMF Regridding and Python Libraries
• NCAR/DOE Community Earth System Model (CESM)
• Analysis and visualization packages: NCAR Command
Language, Ultrascale Visualization - Climate Data Analysis
Tools (UV-CDAT), PyFerret users
• Community Surface Dynamics Modeling System
Usage Metrics
• Updated ESMF component listing at:
http://www.earthsystemmodeling.org/components/
• Includes 85 components with ESMF interfaces and 12 coupled cross-agency modeling systems
in space weather, climate, weather, hydrology, and coastal prediction, for operational and research use
• About 4500 registered downloads
Values and Principles
• Community driven development and community ownership
• Openness of project processes, management, code and information
• Correctness
• Commitment to a globally distributed and diverse development and customer base
• Simplicity
• Efficiency
• User engagement
• Environmental stewardship
Web link for detail: http://www.esmf.ucar.edu/about_us/values.shtml
Agile Development vs
Community Values
In general, Agile processes are conducive to community development.
However, there are areas of potential conflict:
Agile emphasis on co-location: “most agile methodologies or
approaches assume that the team is located in a single team
room” (from Miller, Dist. Agile Development at Microsoft, 2008)
vs
ESMF emphasis on distributed co-development
Agile emphasis on single product owner
vs
ESMF emphasis on multi-party ownership
Making Distributed
Co-Development Work
Hinges on asynchronous, all-to-all communication patterns: everybody
must have information
• Archived email list where all development correspondence gets cc’d
• Minutes for all telecons
• Web browsable repositories (main and contributions), mail summary
on check-ins
• Daily, publicly archived test results
• Monthly archived metrics
• Public archived trackers (bugs, feature requests, support requests,
etc.)
Discouraged: IMing, one-to-one correspondence or calls – the medium
matters
Change Review Board
• CRB established as a vehicle for shared ownership through user task
prioritization and release content decisions
• Consists of technical leads from key user communities
• Not led by the development team!
• Sets the schedule and expectations for future functionality
enhancements in ESMF internal and public distributions
– Based on broad user community and stakeholder input
– Constrained by available developer resources
– Updated quarterly to reflect current realities
• CRB reviews releases after delivery for adherence to release plan
Governance Highlights
Management of ESMF required governance that recognized
social and cultural factors as well as technical factors
Main practical objectives of governance:
• Enabling stakeholders to fight and criticize in a civilized,
contained, constructive way
• Enabling people to make priority decisions based on
resource realities
Observations:
• Sometimes just getting everyone equally dissatisfied and
ready to move on is a victory
• Thorough, informed criticism is the most useful input a
project can get
• Governance changes and evolves over the life span of a
project
Governance Functions
• Prioritize development tasks in a manner acceptable to major stakeholders and the broader community, and define development schedules based on realistic assessments of resource constraints (CRB)
• Deliver a product that meets the needs of critical applications, including adequate and correct functionality, satisfactory performance and memory use, ... (Core)
• Support users via prompt responses to questions, training classes, minimal code changes for adoption, thorough documentation, ... (Core)
• Encourage community participation in design and implementation decisions frequently throughout the development cycle (JST)
• Leverage contributions of software from the community when possible (JST)
• Create frank and constructive mechanisms for feedback (Adv. Board)
• Enable stakeholders to modify the organizational structure as required (Exec. Board)
• Coordinate and communicate at many levels in order to create a knowledgeable and supportive network that includes developers, technical management, institutional management, and program management (IAWG and other bodies)
Governance
[Diagram: ESMF governance structure]
Executive Management
• Executive Board (meets annually): Strategic Direction, Organizational Changes, Board Appointments
• Interagency Working Group and Advisory Board: Stakeholder Liaison, Programmatic Assessment & Feedback, Reporting, External Projects Coordination, General Guidance & Evaluation
Working Project
• Change Review Board (meets quarterly): Development Priorities, Release Review & Approval
• Joint Specification Team (meets weekly): Functionality Change Requests, Requirements Definition, Design and Code Reviews, External Code Contributions; informed by Resource Constraints and the Implementation Schedule
• Core Development Team (meets daily): Project Management, Software Development, Testing & Maintenance, Distribution & User Support, Collaborative Design, Beta Testing
Evolution
Phase 1: 2002-2005
NASA’s Earth Science Technology Office ran a solicitation to develop an Earth System
Modeling Framework (ESMF).
A multi-agency collaboration (NASA/NSF/DOE/NOAA) won the award. The core
development team was located at NCAR.
A prototype ESMF software package (version 2r) demonstrated feasibility.
Phase 2: 2005-2010
New sponsors included Department of Defense and NOAA.
A multi-agency governance plan including the CRB was created:
http://www.earthsystemmodeling.org/management/paper_1004_projectplan.pdf
Many new applications and requirements were brought into the project, motivating a
complete redesign of framework data structures (version 3r).
Phase 3: 2010-2015 (and beyond)
The core development team moved to NOAA/CIRES for closer alignment with federal
models.
Basic framework development completed with version 5r (ports, bugs, feature requests,
user support etc. still require resources).
The focus is on increasing adoption and creating a community of interoperable codes.
Technical Evolution:
National Unified Operational
Prediction Capability (NUOPC)
• ESMF allows for many levels of components, types of components,
and types of connections
• In order to achieve greater interoperability, usage and content
conventions and component templates are needed
• A tri-agency collaboration (NOAA, Navy, Air Force) is building a
“NUOPC Layer” that constrains how ESMF is used, and introduces
metadata and other content standards, along with inheritable
templates for different usage scenarios
• A production version of the NUOPC Layer is scheduled for delivery
at the end of 2012
CoG Collaboration Environment
[Screenshot: a CoG project page – central navigation with templated content, auto-generated navigation of project freeform content on the left, services on the right, and freeform content in the central section]
The CoG environment exposes and collates the information needed for distributed, multi-project development, including project repositories, trackers, and governance processes. It does this in an environment that’s linked to data search, metadata, and visualization services and set up to enable component comparisons.
DCMIP on CoG
Atmospheric Dynamical Core Model Intercomparison Project
http://www.earthsystemcog.org/projects/dcmip-2012/
Planned MIPs:
• 2013: Downscaling
• 2014: Atm-surface hydrology
• 2014: Frameworks
Future
Governance and process largely follow established patterns
Active development:
● Python API redesign and addition of features
● Preparation for the NUOPC Layer major release
● Web services for climate impacts
● Regridding options – higher order conservative, filling in gaps
● Advanced fault tolerance
● Performance optimizations and GPUs
● CoG collaboration environment
ESMF and related development is supported by NASA, NOAA, NSF, and the Department of Defense.