Model Coupling Toolkit: Recent
Developments and Future Plans
Robert Jacob
Argonne National Laboratory
Second Workshop on Coupling Technologies for Earth System
Modeling, NCAR, February 20-22, 2013.
About Argonne


Founded in 1943, designated
a national laboratory in 1946
Managed by The University of Chicago
for the U.S. Department of Energy
– More than 2,900 employees
and 5,000+ facility users
– About $475M/year budget
– 1,500-acre, wooded site in
DuPage County, Illinois


Broad science portfolio
Numerous sponsors
MCT Philosophy: Model coupling vs. “the coupler”
▪ MCT is not a coupler.
▪ Instead, MCT provides datatypes/methods you add to models (and to a
separate coupler or driver if you want one) to make the models
“couple-able”.
▪ Models should also follow a few programming standards (recommended but not
required for using MCT):
– Separate “initialization” and “run” methods.
– Do not use MPI_COMM_WORLD everywhere.
– Avoid global data types
Model Coupling Toolkit: History

Pre-History:
– 1996 parallel coupler for Fast Ocean Atmosphere Model (Jacob, UW-Madison)
– 1998 Physical-space Statistical Analysis System (Larson, NASA)

U.S. Department of Energy ACPI Avant Garde project (2000-2001)
– First work on parallelizing CCSM coupler (cpl5->cpl6)
– MCT 1.0

DOE Scientific Discovery through Advanced Computing (SciDAC) (2001-2011)
– MCT 2.0 – MCT 2.7.3
– CCSM3, CCSM4, CESM1.

DOE Climate Science for a Sustainable Energy Future (CSSEF) (2011 – 2015)
– MCT 2.8
– Next generation MCT
All DOE support is from the climate modeling program in the Office of Biological and
Environmental Research (BER) in the Office of Science.
MCT Architecture
[Figure: layered architecture diagram showing high-level MCT classes, low-level MCT classes, and the Message-Passing Environment Utilities (MPEU)]
mpeu (message passing environment utilities)
Developed by the NASA DAO and extended by MCT developers, mpeu
provides the following services to Fortran 90 MPI applications:
▪ Support for basic derived types (List, String) on which low-level classes
in MCT are built
▪ F90 module-style access to MPI
▪ Support for multiprocessor stdout and stderr
▪ Error handling / shutdown
▪ Support for namelist replacement “resource files”
▪ Run-time flow tracing
▪ Timing/load balance measurements
MCT Low-level Classes

Coupled model registry (describe how many models are coupled
(no limit))
– MctWorld

Multi-field data storage (hold data being transferred (any
amount))
– AttrVect

Domain decomposition (any grid, any decomposition)
– GlobalSegMap

Intercomponent parallel data transfer scheduler (between two
GSMaps)
– Router

Intercomponent parallel data transfer (for a Router and an AttrVect)
– Transfer

Intracomponent parallel data redistribution (for an AttrVect and a
GSMap of the same grid)
– Rearranger
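To make the GlobalSegMap idea concrete, here is a minimal Python sketch. It is not MCT code (MCT itself is Fortran 90); it only assumes that a decomposition can be described as runs of consecutive global indices, each stored as (start, length, owning process). The names `build_segments` and `owner_of` are hypothetical.

```python
from typing import Dict, List, Tuple

# (global start index, run length, owning process) -- an assumed layout
Segment = Tuple[int, int, int]

def build_segments(owned: Dict[int, List[int]]) -> List[Segment]:
    """Compress each process's global indices into (start, length, pe) runs."""
    segments: List[Segment] = []
    for pe, indices in sorted(owned.items()):
        run_start = None
        prev = None
        for g in sorted(indices):
            if run_start is None:
                run_start = g                      # first index opens a run
            elif g != prev + 1:
                # gap found: close the current run and open a new one
                segments.append((run_start, prev - run_start + 1, pe))
                run_start = g
            prev = g
        if run_start is not None:
            segments.append((run_start, prev - run_start + 1, pe))
    return segments

def owner_of(segments: List[Segment], g: int) -> int:
    """Return the process that owns global index g."""
    for start, length, pe in segments:
        if start <= g < start + length:
            return pe
    raise KeyError(g)

# Eight globally numbered points; each process holds two separate runs.
segments = build_segments({0: [1, 2, 5, 6], 1: [3, 4, 7, 8]})
```

Because the description works purely in index space, the same structure can cover any grid and any decomposition, as long as every point has a unique global number.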
MCT High-Level Classes and Modules
▪ Interpolation (sparse) matrix object
– SparseMatrix
▪ Sparse matrix – Attribute Vector multiply (for interpolation)
– MatAttrVectMult
▪ Physical grid description
– GeneralGrid
▪ Time averaging and accumulation support
– Accumulator
▪ Masked/unmasked spatial integrals and averages
– SpatialIntegral
▪ Combining sources from two or more models
– Merge
▪ Communication methods for MCT datatypes
– AccumulatorComms
– AttrVectComms
– GeneralGridComms
– GlobalSegMapComms
– SparseMatrixComms
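As an illustration of the sparse matrix / attribute vector multiply used for interpolation, the following serial Python sketch applies one set of weights to every field at once. It is a conceptual stand-in, not the MatAttrVectMult implementation: the function name `sparse_attr_mult`, the (row, column, weight) tuple layout, and the 0-based indices are assumptions made for the example.

```python
from typing import Dict, List, Tuple

def sparse_attr_mult(elements: List[Tuple[int, int, float]],
                     src: Dict[str, List[float]],
                     n_dst: int) -> Dict[str, List[float]]:
    """For every field: dst[row] += weight * src[col] over all matrix elements."""
    dst = {name: [0.0] * n_dst for name in src}
    for row, col, weight in elements:
        for name, values in src.items():
            dst[name][row] += weight * values[col]
    return dst

# Three source (atmosphere) points interpolated onto two destination points.
weights = [(0, 0, 0.5), (0, 1, 0.5), (1, 1, 0.25), (1, 2, 0.75)]
atm = {"t": [10.0, 20.0, 30.0], "q": [1.0, 2.0, 3.0]}
ocn = sparse_attr_mult(weights, atm, n_dst=2)
```

Storing only the nonzero weights keeps the cost proportional to the number of matrix elements rather than to the product of the two grid sizes.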
Typical MCT Use:
[Diagram: three components running concurrently: ATM (M nodes), CPL (N nodes), OCN (P nodes)]
Initialization:
– ATM: Call MCT World; define GlobalSegMap, AttrVect, Router; read atmosphere data
– CPL: Call MCT World; define GlobalSegMaps, AttrVects, Routers, Accumulators; read matrix elements
– OCN: Call MCT World; define GlobalSegMap, AttrVect, Router; read ocean data
Run:
– ATM: DO WORK; MCT_Send(AtrVect, Router); MCT_Recv(AtrVect, Router)
– CPL: MCT_Recv(AAtrVect, ARouter); MCT_Recv(OAtrVectin, ORouter);
MCT_AvMatVectMult(AAtrVect, SparseMatrix, OAtrVectout); Compute Fluxes;
MCT_Send(AAtrVect, ARouter); MCT_Send(OAtrVect, ORouter)
– OCN: DO WORK; MCT_Send(AtrVect, Router); MCT_Recv(AtrVect, Router)
More on MCT Use
▪ The user must:
– Know how data is laid out on processors.
– Describe the decomposition to MCT with a GSMap.
• Points are uniquely numbered globally.
– Copy local data into an MCT Attribute Vector.
• Copy in either memory order or global index order.
– Calculate interpolation weights (with SCRIP or the ESMF Regridder).
– Read in interpolation weights (to the root node).
▪ MCT can:
– Derive communication tables between decompositions (using indices).
– Do all parallel data communication necessary for interpolation,
gather, scatter.
• Minimizes sizes of data transferred.
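The step "derive communication tables between decompositions (using indices)" can be sketched in a few lines of Python. This is only a serial illustration of the idea behind a Router, assuming both sides number the same points with the same global indices; `build_router` and the dictionary layout are hypothetical, not MCT's API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_router(src_owned: Dict[int, List[int]],
                 dst_owned: Dict[int, List[int]]) -> Dict[Tuple[int, int], List[int]]:
    """Map (sending process, receiving process) -> global indices to transfer."""
    # Invert the destination decomposition: global index -> owning process.
    dst_owner = {g: pe for pe, indices in dst_owned.items() for g in indices}
    table = defaultdict(list)
    for src_pe, indices in src_owned.items():
        for g in sorted(indices):
            if g in dst_owner:                     # skip points the receiver lacks
                table[(src_pe, dst_owner[g])].append(g)
    return dict(table)

# Same 8 points, decomposed over 2 sending and 3 receiving processes.
atm = {0: [1, 2, 3, 4], 1: [5, 6, 7, 8]}
cpl = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8]}
router = build_router(atm, cpl)
```

Each point appears in exactly one sender/receiver pair, so a process only ships the points its partner actually owns, which is what keeps the transferred data small.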
MCT Users
▪ “IPCC-class” production coupled model: The NSF/DOE
Community Earth System Model
– MCT is the default coupling method in CCSM4 and CESM1.
– MCT datatypes are always used in the top-level coupler driver.
– MCT methods/datatypes are the default for driver-component communication.
– All AR5 simulations by CCSM4/CESM1 are using MCT.
▪ Other academic coupled systems:
– COAMPS/ROMS - Hurricanes
– ROMS/SWAN - coastal oceanography
– WRF/ROMS - Hurricanes
▪ OASIS3-MCT
– See next talk
CESM using cpl7/MCT can scale to 100K cores.
MCT Recent history

01/06/2010: MCT 2.7.0 released in CCSM4
– Limited use of OpenMP


02/28/2010: MCT 2.7.1 released in CESM1
11/30/2010: MCT 2.7.2 released in CESM1.0.3
(CW2010 in Toulouse, France. December, 2010)

01/25/2011: MCT 2.7.3 add debugging option to configure
(2011: Some divergence between Argonne and NCAR MCT repositories)

02/07/2012: MCT 2.7.4 update autoconf build to latest version (Jim Edwards)
MCT More Recent history

MCT 2.8.0 - Released April 30, 2012 (first standalone release since 2.6)!
– Merged differences in Argonne and NCAR MCT repos.
– New datatype in AttributeVector to speed up copies (thanks to Bill Sacks, NCAR)
– ANL and NCAR repos in sync!

07/12/12 - MCT 2.8.1
Convert Argonne repository to git
– Repository now world readable. Copy on github.com
– Full 10+ year history of MCT development converted.
– GitHub provides an SVN interface, allowing CESM to pull directly. Eliminates the duplicate repo
at NCAR.
git clone http://git.mcs.anl.gov/MCT.git
Continuing to improve cpl7/MCT performance at
scale.

Slow initialization time for high-resolution, high-processor count cases.
– 1/8th degree runs on Intrepid were taking over 1.5 hours to initialize.
– Traced to initialization of MCT’s Rearranger (equal to 2 Routers) which moves data
between overlapping decompositions. Thanks Tony!
MCT More Recent history

09/12/12 - MCT 2.8.2
– Includes fix for slow Router init.
– Released in CESM 1.1
– Not released separately.

12/19/12 – MCT 2.8.3 Current Version
– Public release
– All of above changes plus some minor compiler fixes
CSSEF research demands on coupling
▪ Dynamically Adaptive Atmospheric Dynamics
– Grid points are created and destroyed on a coupler processor
– Changing cell sizes for just one grid within the coupler will require online
calculation of new interpolation weights
• Which requires more information about both grids than is currently in the coupler.
▪ Development of MPAS-Ocean
– Need to retain information about unstructured grids for interpolation
weight calculation.
▪ Resiliency and Scaling
– Dynamic load balancing and resilient computing mean points could
move from processor to processor.
– Millions of threads and small per-core memory mean we need more
parallelism and must optimize for low memory.
Solution:
Re-Implement MCT data model with MOAB
▪ MOAB = Mesh Oriented dAtaBase
– A database for mesh (structured and unstructured) and field
data associated with mesh
– Tuned for memory efficiency first, speed a close second
– Serial and parallel look very similar; parallel data constructs
embedded in MOAB interface
– http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB
– Developed under DOE SciDAC program
– Includes parallel I/O and visualization capabilities.
– Included in nuclear engineering exascale co-design center.
MOAB is already used in other projects, notably DOE-funded cryosphere modeling and nuclear reactor
simulation.
Like MCT, it is “battle tested”.
[Figures: Ice sheet bed; Klystron mesh]
MOAB Data Model
• 4 fundamental “types”:
– Entity: fine-grained entities in grid (vertex, tri, hex)
• Supported types: vertex, edge, tri, quad, polygon, tet,
prism, pyramid, hex, septahedron, polyhedron
• Mostly unstructured, though can represent structured
(leveraging work done with ParVis).
• Flexible in representing intermediate-dimension entities
(internal edges/faces)
– Entity Set: arbitrary set of entities & other sets
• Parent/child relations, for embedded graphs between sets
– Interface: object on which interface functions are called and
through which other data are obtained
– Tag: named datum annotated to Entities, Entity Sets, Interface
• Instances accessed using opaque (type-less)
“handles”
• MOAB is a C++ library. The Fortran interface is iMesh.
MOAB Data Model illustrated
Review: MCT Classes
MOAB provides different class structures that define mesh, fields, and
domain decomposition.
[Figure labels: Mesh; Fields; Index-space domain decomposition]
MCTWorld (Legacy)
type MCTWorld
  integer :: MCT_comm
  integer :: ncomps
  integer :: mygrank
  integer,dimension(:),pointer :: nprocspid
  integer,dimension(:,:),pointer :: idGprocid
end type MCTWorld
▪ Lightweight component model registry that stores the coupled-system-wide MPI global communicator (and local PE rank on it), number of
component models, number of PEs in each component, and global PE
rank translation table
▪ Registry methods:
– Create/destroy - init()/clean()
– Query: # components, components’ root PE ranks, rank translation
MCTWorld (MOAB)
type MCTWorld
  integer :: MCT_comm
  integer :: ncomps
  integer :: mygrank
  iMesh_Instance :: mesh
  integer,dimension(:),pointer :: nprocspid
  integer,dimension(:,:),pointer :: idGprocid
end type MCTWorld
▪ Sole extension to the datatype is the MOAB mesh instance
▪ MCTWorld was conceived as a "lightweight registry" that served
as a directory service for intercomponent communications.
Adding a MOAB instance makes it considerably
heavier, but converts it into a full-blown registry for coupling
purposes.
AttrVect (Legacy MCT)
type AttrVect
  type(List) :: iList
  type(List) :: rList
  integer,dimension(:,:),pointer :: iAttr
  real(FP) ,dimension(:,:),pointer :: rAttr
end type AttrVect
▪ Stores pointwise collections of REAL (INTEGER) fields,
or attributes, indexable by string tags in rList (iList)
▪ Key methods:
– Create/destroy: init(), clean()
– Query: length - lsize(), # INTEGER/REAL attributes - nIAttr()/nRAttr(), names of attributes
– Manipulate: copy(), zero(), append attributes, import/export
individual attributes, sorting, cross-indexing of attributes
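The legacy AttrVect layout (a colon-delimited list of names plus a two-dimensional array with one row per attribute) can be mimicked in a short Python sketch. The class `AttrVectSketch` is hypothetical and handles REAL attributes only; its method names loosely echo the method list above but are not MCT's actual Fortran signatures.

```python
from typing import List

class AttrVectSketch:
    """Toy multi-field store: one row of values per named REAL attribute."""

    def __init__(self, rList: str, lsize: int) -> None:
        # Attribute names arrive as one colon-delimited string, e.g. 'field1:field2'.
        self.rList: List[str] = rList.split(":") if rList else []
        self.rAttr: List[List[float]] = [[0.0] * lsize for _ in self.rList]
        self._lsize = lsize

    def lsize(self) -> int:
        return self._lsize                      # number of local points

    def nRAttr(self) -> int:
        return len(self.rList)                  # number of REAL attributes

    def indexRA(self, name: str) -> int:
        return self.rList.index(name)           # row index for a string tag

    def importRAttr(self, name: str, values: List[float]) -> None:
        if len(values) != self._lsize:
            raise ValueError("length must match lsize")
        self.rAttr[self.indexRA(name)] = list(values)

    def exportRAttr(self, name: str) -> List[float]:
        return list(self.rAttr[self.indexRA(name)])

    def zero(self) -> None:
        for row in self.rAttr:
            row[:] = [0.0] * self._lsize

av = AttrVectSketch(rList="field1:field2", lsize=3)
av.importRAttr("field2", [1.0, 2.0, 3.0])
```

Keeping all fields in one object is what lets a single transfer or interpolation call handle every field at once.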
AttrVect (MOAB)
type AttrVect
  type(List) :: iList
  type(List) :: rList
  iBase_TagHandle,dimension(:),pointer :: itagh
  iBase_TagHandle,dimension(:),pointer :: rtagh
  iBase_EntityHandle,dimension(:),pointer :: enths
end type AttrVect
▪ Built using Fortran interface to MOAB (iMesh)
– INTEGER/REAL attribute lists retained
– Natural equivalence between “attribute” and “tag”
– Attributes now stored contiguously and referenced by a handle
iBase_TagHandle (implemented as an integer)
– Mesh entities referenced by iBase_EntityHandles
iMesh-AttrVect test program
! Initialize MCT (default 3-D, but empty, iMesh instance created):
call MCTWorld_init(1, MPI_COMM_WORLD, comm1, 1)
! Initialize MCT AttrVect:
call AttrVect_init(av1, rList='field1:field2', &
                   lsize=avsize)
! Query embedded iMesh instance to determine dimensionality:
call iMesh_getGeometricDimension(%VAL(ThisMCTWorld%mesh), &
                                 geom_dim, ier)
! iMesh query function on the new Av tag handle:
call iMesh_getTagName(%VAL(ThisMCTWorld%mesh), &
                      %VAL(av1%rtagh(1)), &
                      tagname, ier, %VAL(10))
Other AttrVect methods from previous slide also available as-is
MCT on Mira - Argonne’s BlueGene/Q
▪ 48 racks
▪ 1024 nodes per rack
▪ 1.6 GHz 16-way core
processor and 16 GB RAM
per node
▪ 348 I/O nodes
▪ 240 GB/s, 35 PB storage
▪ 768K cores
▪ 768 TB RAM
▪ 10 PF peak
CESM with cpl7/MCT on BG/Q Status

Latest development version of CESM compiles and runs on BG/Q
Nodes (1 degree case)   Total ranks (pure MPI)   Simulation rate (years/day)
32                      512                      4.03
64                      1024                     6.54
128                     2048                     8.96

Compare: 7.4 years/day on 512 BG/P nodes (2048 cores; mixed).

Compiler bug encountered and patched!

CESM is mixed-mode but currently any threading slows down the model.
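As a quick reading aid for the BG/Q timings above, this small Python sketch computes speedup and parallel efficiency relative to the 32-node run. The function name `parallel_efficiency` is hypothetical, and the only inputs are the three (nodes, years/day) pairs reported on this slide.

```python
from typing import Dict, List, Tuple

def parallel_efficiency(runs: List[Tuple[int, float]]) -> Dict[int, float]:
    """Efficiency = (rate / base rate) / (nodes / base nodes), first run as base."""
    base_nodes, base_rate = runs[0]
    result = {}
    for nodes, rate in runs:
        speedup = rate / base_rate
        result[nodes] = speedup / (nodes / base_nodes)
    return result

# (nodes, simulated years per day) for the 1 degree case on BG/Q
runs = [(32, 4.03), (64, 6.54), (128, 8.96)]
efficiency = parallel_efficiency(runs)
for nodes, eff in efficiency.items():
    print(f"{nodes:4d} nodes: efficiency {eff:.2f}")
```

By this measure, doubling from 32 to 64 nodes retains roughly 81% efficiency, and going to 128 nodes retains roughly 56%.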
Additional MCT development
▪ New features to aid in debugging Router times
– GSMap and MCTWorld print().
• Print contents to ASCII file for later reading
– Router init internal timers
• Invoked with optional string argument to Router init.
– RouterTest.F90 - test program which reads in output GSMap and
MCTWorld info and builds a Router.
• Will build on same number of procs and same decomposition as original
model.
– On branch but not yet released
▪ Next: Conversion from F90 to F95 (and later F2003)
MCT: To be continued…
www.mcs.anl.gov/mct