Benchmarking - Center for Computation & Technology

High Performance Computing: Concepts, Methods & Means
Scientific Components and Frameworks
Prof. Daniel S. Katz
Department of Electrical and Computer Engineering
Louisiana State University
April 24th, 2007
Opening Remarks
• Context: high performance computing
• May have multiple types of physics
• May have multiple spatial scales
• May have multiple time scales
• May use multiple solvers
• May need multiple I/O libraries
• May need multiple visualization interfaces
• Need to build real, working, complex applications
• How?
CENTER FOR COMPUTATION & TECHNOLOGY AT LOUISIANA STATE UNIVERSITY
Topics
• Meat Grinder Introduction (slides from Gary
Kumfert, part of CCA tutorial)
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF)
• Summary
A Pictorial Introduction
to Components
in Scientific Computing
Once upon a time...
[Figure: Input, Program, Output]
As Scientific Computing grew...
Tried to ease the bottleneck
SPMD was born.
[Figure: four numbered parts, 1 to 4]
SPMD worked.
But it isn’t easy!!!
[Figure: four numbered parts, 1 to 4]
Meanwhile, corporate computing
was growing in a different way
[Figure: Input, Program, Output; labels: email client, spreadsheet, browser, editor, multimedia, graphics, Unicode, database]
This created a whole new set of
problems
• Complexity
• Interoperability across multiple languages
• Interoperability across multiple platforms
• Incremental evolution of large legacy systems (esp. w/ multiple 3rd party software)
[Figure labels: email client, spreadsheet, browser, editor, multimedia, graphics, Unicode, database]
Component Technology
addresses these problems
So what’s a component ???
• Implementation: No Direct Access
• Interface Access: Generated by Tools
• Matching Connector: Assigned by Framework, Hidden from User
1. Interoperability across
multiple languages
• Language & Platform independent interfaces
• Automatically generated bindings to working code
[Figure labels: C, C++, F77, Java]
2. Interoperability Across Multiple
Platforms
Imagine a company migrates to a new system, OS, etc.
What if the source to this one part is lost???
Transparent Distributed
Computing
These wires are very, very smart!
[Figure label: internet]
3. Incremental Evolution With
Multiple 3rd party software
[Figure labels: v 1.0, v 2.0, v 3.0]
Now suppose you find this bug...
[Figure labels: v 1.0, v 2.0, v 3.0]
Good news: an upgrade available
Bad news: there’s a dependency
[Figure labels: v 1.0, v 2.0, v 3.0; 2.0, 2.1]
Great News:
Solvable with Components
[Figure labels: v 1.0, v 3.0; 2.0, 2.1]
Why Components for Scientific
Computing
• Complexity
• Interoperability across multiple languages
• Interoperability across multiple platforms
• Incremental evolution of large legacy systems (esp. w/ multiple 3rd party software)
[Figure labels: SAMRAI, JEEP, Sapphire, Scientific Viz, Ardra, Overture, nonlinear solvers, ALPS, hypre, linear solvers, DataFoundry]
The Model for Scientific
Component Programming
[Figure labels: Science, Industry, ?, CCA]
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF)
• Summary
CCA Motivation: Modern Scientific
Software Engineering Challenges
• Productivity
– Time to first solution (prototyping)
– Time to solution (“production”)
– Software infrastructure requirements (“other stuff needed”)
• Complexity
– Increasingly sophisticated models
– Model coupling – multi-scale, multi-physics, etc.
– “Interdisciplinarity”
• Performance
– Increasingly complex algorithms
– Increasingly complex computers
– Increasingly demanding applications
Motivation: For Library Developers
• People want to use your software, but need wrappers
in languages you don’t support
– Many component models provide language interoperability
• Discussions about standardizing interfaces are often
sidetracked into implementation issues
– Components separate interfaces from implementation
• You want users to stick to your published interface
and prevent them from stumbling (prying) into the
implementation details
– Most component models actively enforce the separation
Motivation: For Application
Developers and Users
• You have difficulty managing multiple third-party libraries in
your code
• You (want to) use more than two languages in your
application
• Your code is long-lived and different pieces evolve at
different rates
• You want to be able to swap competing implementations of
the same idea and test without modifying any of your code
• You want to compose your application with some other(s)
that weren’t originally designed to be combined
Some Observations
About Software…
• “The complexity of software is an essential property, not an
accidental one.” [Brooks]
– We can’t get rid of complexity
• “Our failure to master the complexity of software results in
projects that are late, over budget, and deficient in their stated
requirements.” [Booch]
– We must find ways to manage it
• “A complex system that works is invariably found to have
evolved from a simple system that worked… A complex system
designed from scratch never works and cannot be patched up to
make it work.” [Gall]
– Build up from simpler pieces
• “The best software is code you don’t have to write” [Jobs]
– Reuse code wherever possible
Component-Based Software
Engineering
• CBSE methodology is emerging, especially from business
and internet areas
• Software productivity
– Provides a “plug and play” application development environment
– Many components available “off the shelf”
– Abstract interfaces facilitate reuse and interoperability of software
• Software complexity
– Components encapsulate much complexity into “black boxes”
– Plug and play approach simplifies applications
– Model coupling is natural in component-based approach
• Software performance (indirect)
– Plug and play approach and rich “off the shelf” component library
simplify changes to accommodate different platforms
A Simple Example:
Numerical Integration
Components
Interoperable components (provide same interfaces)
[Figure: components and the ports drawn on each]
• Driver: GoPort, IntegratorPort
• MidpointIntegrator: IntegratorPort, FunctionPort
• MonteCarloIntegrator: IntegratorPort, FunctionPort, RandomGeneratorPort
• NonlinearFunction: FunctionPort
• LinearFunction: FunctionPort
• PiFunction: FunctionPort
• RandomGenerator: RandomGeneratorPort
Many Applications are
Possible…
Dashed lines indicate alternate connections
Create different applications in "plug-and-play" fashion
[Figure: Driver, MidpointIntegrator, MonteCarloIntegrator, NonlinearFunction, LinearFunction, PiFunction, and RandomGenerator with their GoPort, IntegratorPort, FunctionPort, and RandomGeneratorPort connections]
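The plug-and-play idea on this slide can be sketched in a few lines of Python. This is a minimal illustration and not CCA code: the class names mirror the component and port names in the figure, but the method names (`evaluate`, `integrate`, `go`), the function bodies, and the wiring through constructor arguments are assumptions made for this sketch. A seeded `random.Random` stands in for the RandomGenerator component.

```python
import math
import random
from abc import ABC, abstractmethod


class FunctionPort(ABC):
    """Port: anything that can evaluate f(x)."""
    @abstractmethod
    def evaluate(self, x: float) -> float: ...


class IntegratorPort(ABC):
    """Port: anything that can integrate over [lo, hi]."""
    @abstractmethod
    def integrate(self, lo: float, hi: float, count: int) -> float: ...


class LinearFunction(FunctionPort):
    def evaluate(self, x):
        return 2.0 * x + 1.0          # hypothetical choice of linear function


class PiFunction(FunctionPort):
    def evaluate(self, x):
        return 4.0 / (1.0 + x * x)    # integrates to pi over [0, 1]


class MidpointIntegrator(IntegratorPort):
    def __init__(self, function: FunctionPort):
        self.function = function      # the FunctionPort this component uses

    def integrate(self, lo, hi, count):
        h = (hi - lo) / count
        return h * sum(self.function.evaluate(lo + (i + 0.5) * h)
                       for i in range(count))


class MonteCarloIntegrator(IntegratorPort):
    def __init__(self, function: FunctionPort, rng: random.Random):
        self.function = function
        self.rng = rng                # stands in for RandomGeneratorPort

    def integrate(self, lo, hi, count):
        total = sum(self.function.evaluate(self.rng.uniform(lo, hi))
                    for _ in range(count))
        return (hi - lo) * total / count


class Driver:
    def __init__(self, integrator: IntegratorPort):
        self.integrator = integrator  # the IntegratorPort this component uses

    def go(self):
        return self.integrator.integrate(0.0, 1.0, 10000)


# Three "applications" built from the same parts, wired differently.
midpoint_pi = Driver(MidpointIntegrator(PiFunction())).go()
monte_carlo_pi = Driver(MonteCarloIntegrator(PiFunction(), random.Random(0))).go()
midpoint_linear = Driver(MidpointIntegrator(LinearFunction())).go()
```

The Driver never changes; only the objects handed to it do, which is the sense in which the dashed lines are alternate connections.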
The “Sociology” of
Components
• Components need to be shared to be truly useful
– Sharing can be at several levels
• Source, binaries, remote service
– Various models possible for intellectual property/licensing
• Components with different IP constraints can be mixed in a
single application
• Peer component models facilitate collaboration of
groups on software development
– Group decides overall architecture and interfaces
– Individuals/sub-groups create individual components
Who Writes Components?
• “Everyone” involved in creating an application can/should
create components
– Domain scientists as well as computer scientists and applied
mathematicians
– Most will also use components written by other groups
• Allows developers to focus on their interest/specialty
– Get other capabilities via reuse of other’s components
• Sharing components within scientific domain allows
everyone to be more productive
– Reuse instead of reinvention
• As a unit of publication, a well-written and tested
component is like a high-quality library
– Often a more appropriate unit of publication/recognition than an
entire application code
CCA Concepts: Components
[Figure: MidpointIntegrator (IntegratorPort, FunctionPort) and NonlinearFunction (FunctionPort)]
• Components are a unit of software composition
– Composition is based on interfaces (ports)
• Components provide/use one or more ports
– A component with no ports isn’t very interesting
– Components interact via ports; implementation is opaque to the
outside world
• Components include some code which interacts with the
CCA framework
• The granularity of components is dictated by the
application architecture and by performance considerations
• Components are peers
– Application architecture determines relationships
What is a Component Architecture?
• A set of standards that allows:
– Multiple groups to write units of software (components)…
– And have confidence that their components will work with
other components written in the same architecture
• These standards define…
– The rights and responsibilities of a component
– How components express their interfaces
– The environment in which components are composed to form an
application and executed (framework)
– The rights and responsibilities of the framework
CCA Concepts: Frameworks
• The framework provides the means to “hold” components
and compose them into applications
– The framework is often the application’s “main” or “program”
• Frameworks allow exchange of ports among components
without exposing implementation details
• Frameworks provide a small set of standard services to
components
– BuilderService allows programs to compose CCA apps
• Frameworks may make themselves appear as components
in order to connect to components in other frameworks
• Currently: specific frameworks support specific computing
models (parallel, distributed, etc.).
Future: full flexibility through integration or interoperation
CCA Concepts: Ports
[Figure: MidpointIntegrator (IntegratorPort, FunctionPort) and NonlinearFunction (FunctionPort)]
• Components interact through well-defined interfaces,
or ports
– In OO languages, a port is a class or interface
– In Fortran, a port is a bunch of subroutines or a module
• Components may provide ports – implement the
class or subroutines of the port ( “Provides” Port )
• Components may use ports – call methods or
subroutines in the port ( “Uses” Port )
• Links denote a procedural (caller/callee) relationship,
not dataflow!
– e.g., FunctionPort could contain: evaluate(in Arg, out Result)
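A rough sketch of the provides/uses relationship follows. It assumes a toy framework class: `MiniFramework` and its methods (`add_provides_port`, `connect`, `get_port`) are hypothetical names for this illustration and are not the CCA API, and the function body is invented. Python returns the result rather than using an `out` argument as in `evaluate(in Arg, out Result)`.

```python
class MiniFramework:
    """Toy stand-in for a component framework; not the CCA API."""

    def __init__(self):
        self.provided = {}     # (component, port name) -> object implementing the port
        self.connections = {}  # (user, uses port name) -> (provider, provides port name)

    def add_provides_port(self, component, port_name, implementation):
        self.provided[(component, port_name)] = implementation

    def connect(self, user, uses_name, provider, provides_name):
        if (provider, provides_name) not in self.provided:
            raise KeyError(f"{provider} does not provide {provides_name}")
        self.connections[(user, uses_name)] = (provider, provides_name)

    def get_port(self, user, uses_name):
        # A component can only reach what has been wired to its uses port.
        if (user, uses_name) not in self.connections:
            raise LookupError(f"{user}.{uses_name} is not connected")
        return self.provided[self.connections[(user, uses_name)]]


class NonlinearFunction:
    """Provides FunctionPort."""
    def evaluate(self, x):
        return x ** 3          # hypothetical function body


framework = MiniFramework()
framework.add_provides_port("NonlinearFunction", "FunctionPort", NonlinearFunction())

# Before the link exists, the user side cannot reach the provider.
try:
    framework.get_port("MidpointIntegrator", "FunctionPort")
    reachable_before_connect = True
except LookupError:
    reachable_before_connect = False

framework.connect("MidpointIntegrator", "FunctionPort",
                  "NonlinearFunction", "FunctionPort")
port = framework.get_port("MidpointIntegrator", "FunctionPort")
result = port.evaluate(2.0)    # a plain call: caller/callee, not dataflow
```

Once connected, the user holds a direct reference to the provider's port object, so a call through the port costs no more than an ordinary method call in this sketch.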
Interfaces, Interoperability,
and Reuse
• Interfaces define how components interact…
– Therefore interfaces are key to interoperability and reuse
of components
• In many cases, “any old interface” will do, but…
– General plug and play interoperability requires multiple
implementations providing the same interface
• Reuse of components occurs when they provide
interfaces (functionality) needed in multiple
applications
Designing for Reuse,
Implications
• Designing for interoperability and reuse requires “standard”
interfaces
– Typically domain-specific
– “Standard” need not imply a formal process, may mean “widely used”
• Generally means collaborating with others
• Higher initial development cost (amortized over multiple
uses)
• Reuse implies longer-lived code
– thoroughly tested
– highly optimized
– improved support for multiple platforms
Relationships: Components,
Objects, and Libraries
• Components are typically discussed as objects or
collections of objects
– Interfaces generally designed in OO terms, but…
– Component internals need not be OO
– OO languages are not required
• Component environments can enforce the use of published
interfaces (prevent access to internals)
– Libraries cannot
• It is possible to load several instances (versions) of a
component in a single application
– Impossible with libraries
• Components must include some code to interface with the
framework/component environment
– Libraries and objects do not
Domain-Specific Frameworks vs
Generic Component Architectures
Domain-Specific
• Often known as
“frameworks”
• Provide a significant
software infrastructure to
support applications in a
given domain
– Often attempts to generalize
an existing large application
• Often hard to adapt to use
outside the original domain
– Tend to assume a particular
structure/workflow for
application
• Relatively common
Generic
• Provide the infrastructure to
hook components together
– Domain-specific
infrastructure can be built as
components
• Usable in many domains
– Few assumptions about
application
– More opportunities for reuse
• Better supports model
coupling across traditional
domain boundaries
• Relatively rare at present
– Commodity component
models often not so useful
in HPC scientific context
Special Needs of Scientific HPC
• Support for legacy software
– How much change required for component environment?
• Performance is important
– What overheads are imposed by the component
environment?
• Both parallel and distributed computing are important
– What approaches does the component model support?
– What constraints are imposed?
– What are the performance costs?
• Support for languages, data types, and platforms
– Fortran?
– Complex numbers? Arrays? (as first-class objects)
– Is it available on my parallel computer?
What is the CCA? (User View)
• A component model specifically
designed for high-performance scientific
computing
• Supports both parallel and distributed
applications
• Designed to be implementable without
sacrificing performance
• Minimalist approach makes it easier to
componentize existing software
What is the CCA? (2)
• Components are peers
• Not just a dataflow model
• A tool to enhance the productivity of
scientific programmers
– Make the hard things easier, make some
intractable things tractable
– Support & promote reuse & interoperability
– Not a magic bullet
Importance of Provides/Uses
Pattern for Ports
• Fences between components
– Components must declare both
what they provide and what
they use
– Components cannot interact
until ports are connected
– No mechanism to call anything
not part of a port
• Ports preserve high
performance direct connection
semantics…
• …While also allowing distributed
computing
[Figure: Direct Connection: Component 1 and Component 2 joined at a Provides/Uses Port. Network Connection: Component 1 (Provides Port) and Component 2 (Uses Port) joined across a network]
CCA Concepts: Framework
Stays “Out of the Way” of
Component Parallelism
• Single component multiple data
(SCMD) model is the component
analog of the widely used SPMD
model
• Each process is loaded with the
same set of components wired
the same way
• Different components in the same
process “talk to each other” via
ports and the framework
• Instances of the same component in
different processes talk to each other
through their favorite
communications layer (e.g.,
MPI, PVM, GA)
• MCMD/MPMD also supported
• Other component models
ignore parallelism entirely
[Figure: processes P0 to P3; components shown in blue, green, and red; framework shown in gray]
CCA Concepts: MxN Parallel
Data Redistribution
• Share Data Among Coupled Parallel Models
– Disparate Parallel Topologies (M processes vs. N)
– e.g. Ocean & Atmosphere, Solver & Optimizer…
– e.g. Visualization (Mx1, increasingly, MxN)
Research area -- tools under development
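To make the MxN problem concrete, here is a minimal sketch that works out which index ranges must move when a 1-D array held by M processes is handed to N processes. It assumes a simple contiguous block distribution on both sides; the function names `block_ranges` and `mxn_schedule` are hypothetical and do not come from any CCA tool.

```python
def block_ranges(length, parts):
    """Split range(length) into `parts` contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(length, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def mxn_schedule(length, m, n):
    """List (source rank, destination rank, start, stop) for every overlapping range."""
    schedule = []
    for src, (s0, s1) in enumerate(block_ranges(length, m)):
        for dst, (d0, d1) in enumerate(block_ranges(length, n)):
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:                      # the two blocks share indices lo..hi-1
                schedule.append((src, dst, lo, hi))
    return schedule


# 12 elements held by M = 3 processes, wanted by N = 2 processes.
schedule = mxn_schedule(12, 3, 2)
```

Each tuple is one message a real redistribution layer would have to send; source rank 1 ends up sending to both destination ranks because its block straddles the destination boundary.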
CCA Concepts:
Language Interoperability
• Existing language interoperability approaches are “point-to-point” solutions
• Babel provides a unified approach in which all languages are considered peers
• Babel used primarily at interfaces
• Few other component models support all languages and data types important for scientific computing
[Figure: f77, C, f90, C++, Python, and Java shown twice, once without and once with Babel]
What the CCA isn’t…
• CCA doesn’t specify who owns “main”
– CCA components are peers
– Up to application to define component relationships
• “Driver component” is a common design pattern
• CCA doesn’t specify a parallel programming
environment
– Choose your favorite
– Mix multiple tools in a single application
• CCA doesn’t specify I/O
– But it gives you the infrastructure to create I/O components
• CCA doesn’t specify interfaces
– But it gives you the infrastructure to define and enforce them
– CCA Forum supports & promotes “standard” interface efforts
• CCA doesn’t require (but does support) separation of
algorithms/physics from data
What the CCA is…
• CCA is a specification for a component environment
– Fundamentally, a design pattern
– Multiple “reference” implementations exist
– Being used by applications
• CCA increases productivity
– Supports and promotes software interoperability and reuse
– Provides “plug-and-play” paradigm for scientific software
• CCA offers the flexibility to architect your application as
you think best
– Doesn’t dictate component relationships, programming models, etc.
– Minimal performance overhead
– Minimal cost for incorporation of existing software
• CCA provides an environment in which domain-specific
application frameworks can be built
– While retaining opportunities for software reuse at multiple levels
Review of CCA Terms &
Concepts
• Ports
– Interfaces between components
– Uses/provides model
• Framework
– Allows assembly of components into applications
• Direct Connection
– Maintain performance of local inter-component calls
• Parallelism
– Framework stays out of the way of parallel components
• MxN Parallel Data Redistribution
– Model coupling, visualization, etc.
• Language Interoperability
– Babel, Scientific Interface Definition Language (SIDL)
CCA Summary
• Components are a software engineering tool to help
address software productivity and complexity
• Important concepts: components, interfaces,
frameworks, composability, reuse
• Scientific component environments come in “domain
specific” and “generic” flavors
• Scientific HPC imposes special demands on
component environments
– which commodity tools may have trouble with
• The Common Component Architecture is specially
designed for the needs of HPC
• CCA is a research project - intended to be quite
general - not heavily used yet in production
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus (slides from Tom Goodale)
• Earth System Modeling Framework (ESMF)
• Summary
What Is Cactus?
• Cactus is a framework for developing portable, modular
applications, in particular, although not exclusively, high-performance simulation codes.
• Cactus is designed to allow experts in different fields to
develop modules based upon their expertise and to leverage
off modules developed by experts in other fields to perform
their work, with minimal knowledge of the internals or
operation of the other modules.
• This enables it to be used in large, geographically dispersed,
collaborations.
• Cactus and the Cactus Computational Toolkit are Open
Source and freely available.
Cactus Goals
• Portable
• Modular
– People can write modules that interact through standard interfaces
with other modules without knowing internals of the other modules
– Modules with same functionality are interchangeable
• Support legacy codes
• Make use of existing technologies and tools where
appropriate
• Future proof
– Not tied to any particular paradigm
– Parallelism is independent but compatible with MPI or PVM
– I/O system is independent but compatible with HDF or others
• Easy to use
• Maintainable
Cactus History
• First developed in 1997 by Paul Walker, Joan Masso, and
others, as a continuation of a long line of numerical relativity
codes, such as the NCSA G-code and Paul's Framework
• In first years, Cactus became progressively more modular,
allowing modules for different formulations of Einstein's
equations and different physical systems
• Although in principle Cactus was modular, its history and
evolution had left many dependencies between modules and
between the core and the modules
• Cactus 4.0 (current) is a complete redesign of the core -- moved
everything possible out into modules, and put structures in
place to enable modules to be far more independent
Current Cactus Users
• Numerical Relativity
– Used by many groups including: AEI (Germany), UNAM (Mexico),
Tuebingen (Germany), Southampton (UK), Sissa (Italy), Valencia
(Spain), U. of Thessaloniki (Greece), MPA (Germany), RIKEN
(Japan), TAT (Denmark), Penn State, U. of Texas at Austin, U. of
Texas at Brownsville, LSU (USA), Wash, U. of Pittsburgh, U. of
Arizona, Washburn, UIB (Spain), U. of Maryland, Monash (Australia)
• Quantum Gravity
• Coastal and Climate Modeling
• CFD
– KISTI
– DLR looking at flow in turbines
• Lattice Boltzmann
Over 150 Science Papers
Over 30 Student Theses
Cactus Structure
• Cactus source code consists of a core part, the Flesh, and a set
of modules, the Thorns
• Flesh
– Independent of all thorns
– After initialization, acts as a utility and service library that the thorns
call to get information or ask for some action to happen
• Thorns
– Separate libraries that encapsulate some functionality
– In order to keep a distinction between functionality and
implementation of the functionality, each thorn declares that it
provides a certain “implementation”
– Different thorns can provide the same “implementation”, and thorn
dependencies are expressed in terms of “implementations” rather than
explicit references to thorns, thus allowing the different thorns
providing the same “implementation” to be interchangeable
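The interchangeability described above can be sketched as a lookup from “implementation” names to the thorns that provide them. This is an illustration only: the dictionary layout, the `resolve` function, the `WaveEvolver` thorn name, and the rule that two active thorns may not provide the same implementation are assumptions of the sketch, not a description of how the Flesh works internally.

```python
def resolve(thorn_list, thorns):
    """Map each implementation name to the active thorn that provides it."""
    providers = {}
    for name in thorn_list:
        impl = thorns[name]["implements"]
        if impl in providers:
            raise ValueError(f"{name} and {providers[impl]} both provide {impl}")
        providers[impl] = name
    # Dependencies name implementations, never specific thorns.
    for name in thorn_list:
        for needed in thorns[name]["requires"]:
            if needed not in providers:
                raise ValueError(f"{name} requires missing implementation {needed}")
    return providers


thorns = {
    "PUGH":        {"implements": "driver", "requires": []},
    "Carpet":      {"implements": "driver", "requires": []},
    "WaveEvolver": {"implements": "wave",   "requires": ["driver"]},  # hypothetical thorn
}

with_pugh = resolve(["PUGH", "WaveEvolver"], thorns)
with_carpet = resolve(["Carpet", "WaveEvolver"], thorns)
```

Because `WaveEvolver` asks only for a `driver` implementation, swapping one driver thorn for the other requires no change to it.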
Structure
[Figure: Core “Flesh” and Plug-In “Thorns” (modules), with labels: remote steering, extensible APIs, ANSI C, parameters, driver, scheduling, equations of state, input/output, error handling, interpolation, SOR solver, Fortran/C/C++, black holes, make system, grid variables, wave evolvers, multigrid, boundary conditions, coordinates]
Cactus Flesh
• Make System
– Organizes builds as configurations which hold everything
needed to build with a particular set of options on a
particular architecture
• API
– Functions which must be there for thorns to operate
• Scheduling
– Sophisticated scheduler which calls thorn-provided
functions as and when needed
• CCL
– Configuration language which tells the flesh all it needs to
know about the thorns
Thorn Specification
• The Flesh finds out about thorns by configuration files in
each thorn
• These files are converted at compile time into a set of
routines the Flesh can call to find out about thorns
• There are three such files
– Scheduling directives
• The flesh incorporates a scheduler which is used to call defined routines
from different thorns in a particular order
– Interface definitions
• All variables which are passed between scheduled routines need to be
declared
• Any thorn-provided functions which other thorns call should be declared
– Parameter definitions
• The flesh and thorns are controlled by a parameter file; parameters must
be declared along with their allowed values
Scheduling
• Thorns specify which functions are to be called at which
time, and in which order
• Rule-based scheduling system
• Routines are either before or after other routines (or
don't care)
• Routines can be grouped, and whole group scheduled
• Functions or groups can be scheduled while some
condition is true
• Flesh sorts all rules and flags an error for inconsistent
schedule requests
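Rule-based ordering of this kind is essentially a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to order routines from before/after rules and to flag contradictory requests. It is not the Cactus scheduler: the routine names, the `build_schedule` function, and the rule format are hypothetical, and groups and "while" conditions are left out.

```python
from graphlib import TopologicalSorter, CycleError


def build_schedule(routines, before_rules):
    """Order routines so that for each (a, b) in before_rules, a runs before b."""
    sorter = TopologicalSorter({name: set() for name in routines})
    for first, second in before_rules:
        sorter.add(second, first)        # `second` must wait for `first`
    try:
        return list(sorter.static_order())
    except CycleError as err:
        # err.args[1] holds the routines that form the contradictory cycle
        raise ValueError(f"inconsistent schedule requests: {err.args[1]}") from err


routines = ["output", "boundary", "evolve", "initial_data", "analysis"]
rules = [("initial_data", "evolve"), ("evolve", "boundary"), ("boundary", "output")]
order = build_schedule(routines, rules)   # "analysis" has no rules: don't care

try:
    build_schedule(["a", "b"], [("a", "b"), ("b", "a")])
    inconsistent_detected = False
except ValueError:
    inconsistent_detected = True
```

Routines with no rules, such as `analysis` here, may land anywhere in the order, which matches the "don't care" case.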
The Driver Layer
• In principle, drivers are the only thorns which know anything about
parallelism
• Other thorns access parallelism via an API provided by the flesh
• Underlying parallel layer could be anything from a TCP-socket to Java
RMI -- should be transparent to application thorns
• Could even be a combination of things
• Can even run with no parallel layer at all
• Can pick actual driver to use at runtime - no need to recompile code to
test differences between parallel layers
• Can take one executable and use whatever the best layer for any
particular environment happens to be
Current Drivers
• There are several drivers available at the moment, developed both by the
Cactus team and by the community.
• PUGH
– a parallel uni-grid driver, which comes as part of the computational
toolkit
• PAGH
– a parallel AMR driver which uses the GrACE library for grid hierarchy
management
• Carpet
– a parallel fixed mesh refinement driver
• SimpleDriver
– a simple demonstration driver which illustrates driver development
Cactus Computational Toolkit
• Core thorns which provide many basic utilities,
such as:
– Boundary conditions
– I/O methods
– Reduction and Interpolation operations
– Coordinate Symmetries
– Parallel drivers
– Elliptic solvers
– Web-based interaction and monitoring interface
– ...
Current Capabilities: Methods
• Almost all codes in Cactus are explicit finite difference codes
on structured meshes
• In principle, finite volume or finite element on structured
meshes is possible
• There is now a generic method-of-lines thorn which makes
developing thorns using such methods very quick and easy
• Interface for elliptic solvers and support for generic elliptic
solver packages such as PETSc as well as a numerical-relativity-specific multigrid solver written by Bernd
Bruegmann
– However, interface is not as generic as it could be, and it may not be
too useful as it stands for solving general implicit problems
Current Capabilities: Interaction
• HTTPD thorn provides an interface that allows a web browser to
connect to a running simulation
• Allows a user to examine the state of a running simulation and
change parameters, such as frequency of I/O or variables to
be output, or any other parameter that the thorn author declared
may be changed during the simulation
• These capabilities may be extended by any other thorn
– E.g. the HTTPDExtra thorn allows the user to download any file output
by the I/O thorns in the Computational Toolkit, and even to view two-dimensional slices as jpegs
– Also, there is a helper script for web browsers that allows an appropriate
visualization tool to be launched when a user requests a file
Cactus Summary
• Used to build apps in NumRel, and
starting to be used in other fields
• Flesh/Thorns distinction
– Flesh is like CCA Framework + some
general components
– Thorns are like CCA components
• Production code for certain domains,
well-used and well-tested
Topics
• Meat Grinder Introduction
• Common Component Architecture (CCA)
• Cactus
• Earth System Modeling Framework (ESMF) (slides from ESMF tutorial)
• Summary
ESMF Motivation and Context
In climate research and NWP...
increased emphasis on detailed representation of
individual physical processes; requires many teams of
specialists to contribute components to an overall
modeling system
In computing technology...
increase in hardware and software complexity in high-performance computing, as we shift toward the use of
scalable computing architectures
In software …
development of first-generation frameworks, such as
FMS, GEMS, CCA and WRF, that encourage software
reuse and interoperability
What is ESMF?
• ESMF provides tools for turning model
codes into components with standard
interfaces and standard drivers.
• ESMF provides data structures and
common utilities that components use
for routine services such as data
communications, regridding, time
management and message logging.
[Figure: ESMF architecture]
• ESMF Superstructure: AppDriver; Component Classes: GridComp, CplComp, State
• User Code
• ESMF Infrastructure: Data Classes: Bundle, Field, Grid, Array; Utility Classes: Clock, LogErr, DELayout, Machine
ESMF GOALS
1. Increase scientific productivity by making model components much
easier to build, combine, and exchange, and by enabling modelers to
take full advantage of high-end computers.
2. Promote new scientific opportunities and services through community
building and increased interoperability of codes (impacts in
collaboration, code validation and tuning, teaching, migration from
research to operations)
Application Example:
GEOS-5 AGCM
• Each box is an
ESMF component
• Every component
has a standard
interface so that it
is swappable
• Data in and out of
components are packaged
as state types with user-defined fields
• New components can be added to the system
• Each ESMF application is also a Gridded Component
• Entire ESMF applications can be nested within larger applications
• This strategy can be used to systematically compose very large, multicomponent codes.
• Coupling tools include regridding and redistribution methods
Design Strategies
• Modularity
◦ Gridded Components don’t have access to the internals of other Gridded
Components, and don’t store any coupling information
◦ Gridded Components pass their States to other components through their
argument list.
◦ Components can be used standalone or coupled with others into a larger
application.
• Flexibility
◦ Users write their own drivers as well as their own Gridded Components and
Coupler Components -- Users decide on their own control flow
• Communication
◦ All communication is handled within components. If an atmosphere is coupled
to an ocean, the Coupler Component is defined on both atmosphere and ocean
processors.
◦ The same programming interface is used for shared memory, distributed
memory, and combinations thereof. This buffers the user from variations
and changes in the underlying platforms.
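The modularity and flexibility points can be illustrated with a toy driver. This sketch is plain Python and not the ESMF API: the class names, the `run` signatures, the use of dictionaries as States, and the numbers in the two "models" are all invented for illustration. What it tries to show is that components exchange data only through States passed as arguments, that the coupler moves data between States, and that the user-written driver decides the control flow.

```python
class Atmosphere:
    """Toy gridded component: reads sea surface temperature, writes a wind value."""
    def run(self, import_state, export_state):
        sst = import_state.get("sst", 290.0)
        export_state["wind"] = 0.1 * (sst - 280.0)     # made-up physics


class Ocean:
    """Toy gridded component: reads wind, writes sea surface temperature."""
    def run(self, import_state, export_state):
        wind = import_state.get("wind", 0.0)
        export_state["sst"] = 290.0 + wind             # made-up physics


class AtmOceanCoupler:
    """Toy coupler component: the only place where data crosses between models."""
    def run(self, source_state, destination_state, names):
        for name in names:
            # A real coupler would regrid or redistribute here.
            destination_state[name] = source_state[name]


# User-written driver: the control flow is the user's choice.
atm, ocn, cpl = Atmosphere(), Ocean(), AtmOceanCoupler()
atm_import, atm_export, ocn_import, ocn_export = {}, {}, {}, {}

for step in range(3):
    atm.run(atm_import, atm_export)
    cpl.run(atm_export, ocn_import, ["wind"])
    ocn.run(ocn_import, ocn_export)
    cpl.run(ocn_export, atm_import, ["sst"])
```

Neither model holds a reference to the other, so either could be run standalone or replaced without touching its partner.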
Elements of Parallelism:
Serial vs. Parallel
• Computing platforms can have multiple processors, some
or all of which may share the same memory pools
• Can be multiple Persistent Execution Threads (PETs)
• Can be multiple PETs per processor
• Software like MPI and OpenMP commonly used for
parallelization
• Programs can run in a serial fashion, with one PET, or in
parallel, using multiple PETs
• Often, a PET can be thought of as a processor
• Sets of PETs are represented by Virtual Machines (VMs)
Elements of Parallelism:
Sequential vs. Concurrent
In sequential mode components run one
after the other on the same set of PETs.
In concurrent mode components run at the
same time on different sets of PETs
Figure: two panels, each plotting PETs against Time. In both, AppDriver ("Main") calls Run on GridComp "Hurricane Model", which loops over Run calls to GridComp "Atmosphere", GridComp "Ocean", and CplComp "Atm-Ocean Coupler". One panel shows sequential execution, the other concurrent execution.
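The difference between the two modes can be sketched with a toy scheduler. This is plain Python, not the ESMF API: PETs are modeled as integers, the function names `schedule_sequential` and `schedule_concurrent` are made up for illustration, and the coupler is left out to keep the example small.

```python
# Toy model only: PETs are plain integers, and "running" a component just
# records which PETs it would occupy in each time slot. None of these names
# come from ESMF.

def schedule_sequential(components, pets):
    """Each component gets its own time slot on the full set of PETs."""
    return [{name: list(pets)} for name in components]


def schedule_concurrent(components, pets):
    """All components share one time slot, each on a disjoint subset of PETs."""
    base, extra = divmod(len(pets), len(components))
    slot, start = {}, 0
    for i, name in enumerate(components):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first components
        slot[name] = list(pets[start:start + size])
        start += size
    return [slot]


pets = list(range(1, 10))            # 9 PETs
components = ["Atmosphere", "Ocean"]

sequential = schedule_sequential(components, pets)
concurrent = schedule_concurrent(components, pets)

for label, plan in (("sequential", sequential), ("concurrent", concurrent)):
    for t, slot in enumerate(plan):
        print(label, "time slot", t, slot)
```

In the sequential plan there are two time slots and each component sees all nine PETs; in the concurrent plan there is one time slot and the PETs are split between the components.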
Elements of Parallelism: DEs
• Data decomposition represented as set of Decomposition Elements
(DEs)
• Sets of DEs are represented by the DELayout class
• DELayouts define how data is mapped to PETs
• In many applications
there is one DE per PET
Figure: a 4 x 9 temperature field T (values T1 to T36) decomposed over a 1 x 9 DELayout (DEs 1 to 9), with each DE mapped to one PET of a VM with 9 PETs.
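The decomposition in the figure can be sketched in a few lines. This is an illustration in plain Python rather than ESMF code; it assumes, as the figure suggests, that each DE holds one column of the 4 x 9 field, and the helper name `decompose_by_column` is hypothetical.

```python
ROWS, COLS = 4, 9

# field[r][c] holds the label T(r*COLS + c + 1), so row 0 is T1..T9.
field = [[f"T{r * COLS + c + 1}" for c in range(COLS)] for r in range(ROWS)]


def decompose_by_column(field, n_des):
    """Split the columns of `field` into n_des contiguous blocks, one per DE."""
    cols = len(field[0])
    if cols % n_des != 0:
        raise ValueError("this sketch only handles evenly divisible layouts")
    width = cols // n_des
    return {
        de: [row[(de - 1) * width: de * width] for row in field]
        for de in range(1, n_des + 1)
    }


de_data = decompose_by_column(field, n_des=9)   # 1 x 9 layout
de_to_pet = {de: de for de in de_data}          # one DE per PET

for de, block in de_data.items():
    values = [v for row in block for v in row]
    print(f"DE {de} on PET {de_to_pet[de]}: {values}")
```

With nine DEs each block is a single column, so DE 1 holds T1, T10, T19, and T28. Passing `n_des=3` instead would give each DE a 4 x 3 block.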
Modes of Parallelism:
Single vs. Multiple Executable
• In Single Program Multiple Datastream (SPMD)
mode the same program runs across all PETs in
the application - components may run sequentially
or concurrently.
• In Multiple Program Multiple Datastream (MPMD)
mode the application consists of separate
programs launched as separate executables -
components may run concurrently or sequentially,
but in this mode almost always run concurrently
Classes and Objects in ESMF
• The ESMF Application Programming Interface
(API) is based on the object-oriented programming
notion of a class
– A software construct that’s used for grouping a set of
related variables together with the subroutines and
functions that operate on them
– They help to organize the code, and often make it
easier to maintain and understand.
• A particular instance of a class is an object
– For example, Field is an ESMF class
– An actual Field called temperature is an object
ESMF Class Structure
• Superstructure
– GridComp: land, ocean, atm, … model
– State: data imported or exported
– CplComp: transfers between GridComps
• Infrastructure: Data
– Bundle: collection of fields
– Field: physical field, e.g. pressure
– Grid: LogRect, Unstruct, etc.
– PhysGrid: math description
– DistGrid: grid decomposition
– Array: hybrid F90/C++ arrays
• Infrastructure: Communications
– Regrid: computes interp weights
– DELayout: communications
– Route: stores comm paths
• Infrastructure: Utilities
– Virtual Machine, TimeMgr, LogErr, IO, ConfigAttr, Base, etc.
• Language layers shown in the figure: F90 and C++
ESMF Superstructure
Classes
• Gridded Component
– Models, data assimilation systems - “real code”
• Coupler Component
– Data transformations and transfers between Gridded
Components
• State – Packages of data sent between
Components
– Can be Bundles, Fields, Arrays, States, or name-placeholders
• Application Driver – Generic driver
ESMF Components
• Component has two parts
– One supplied by ESMF - an ESMF derived type that is either a
Gridded Component or a Coupler Component
– One supplied by the user
• Gridded Component typically represents a physical domain
in which data is associated with one or more grids - for
example, a sea ice model
• Coupler Component arranges and executes data
transformations and transfers between one or more Gridded
Components.
• Gridded Components and Coupler Components have
standard methods, which include initialize, run, and finalize
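The initialize / run / finalize pattern can be sketched as follows. This is a plain-Python illustration of the idea, not the ESMF Fortran API: the class names, the use of dictionaries as import and export States, and the toy "physics" are all assumptions made for the example.

```python
class GriddedComponent:
    """Stand-in for a user model wrapped as a component (not the ESMF type)."""

    def __init__(self, name):
        self.name = name
        self.finalized = False

    def initialize(self, import_state, export_state):
        export_state[self.name] = 0.0

    def run(self, import_state, export_state):
        # Toy "physics": advance by 1 plus whatever was imported.
        export_state[self.name] += 1.0 + sum(import_state.values())

    def finalize(self, import_state, export_state):
        self.finalized = True


class CouplerComponent:
    """Transfers one component's exported data into another's import State."""

    def initialize(self, src_export, dst_import):
        dst_import.clear()

    def run(self, src_export, dst_import):
        dst_import.update(src_export)

    def finalize(self, src_export, dst_import):
        pass


def drive(n_steps):
    """A user-written driver: the user decides the control flow."""
    atm, ocn, cpl = GriddedComponent("atm"), GriddedComponent("ocn"), CouplerComponent()
    atm_imp, atm_exp, ocn_imp, ocn_exp = {}, {}, {}, {}
    atm.initialize(atm_imp, atm_exp)
    ocn.initialize(ocn_imp, ocn_exp)
    cpl.initialize(atm_exp, ocn_imp)
    for _ in range(n_steps):
        atm.run(atm_imp, atm_exp)   # components only see their own States
        cpl.run(atm_exp, ocn_imp)   # the coupler moves data between them
        ocn.run(ocn_imp, ocn_exp)
    for comp, imp, exp in ((atm, atm_imp, atm_exp), (ocn, ocn_imp, ocn_exp)):
        comp.finalize(imp, exp)
    return atm_exp, ocn_exp


atm_exp, ocn_exp = drive(2)
print(atm_exp, ocn_exp)
```

Because every component exposes the same three methods and exchanges data only through States passed as arguments, the atmosphere or ocean stand-in could be swapped for another implementation without touching the driver.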
ESMF Infrastructure Data
Classes
• Model data is contained in a hierarchy of multi-use classes
• The user can reference a Fortran array to an Array or Field,
or retrieve a Fortran array out of an Array or Field.
• Array – holds a Fortran array (with other info, such as
halo size)
• Field – holds an Array, an associated Grid, and
metadata
• Bundle – collection of Fields on the same Grid bundled
together for convenience, data locality, latency
reduction during communications
Supporting these data classes is the Grid class, which
represents a numerical grid
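The containment hierarchy described above can be sketched with Python dataclasses. These are illustrative stand-ins, not the ESMF types: the attribute names (`halo_width`, `metadata`, and so on) are assumptions, and a nested list plays the role of the Fortran array.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Grid:
    name: str
    shape: Tuple[int, int]


@dataclass
class Array:
    data: List[List[float]]   # stands in for the user's Fortran array
    halo_width: int = 0


@dataclass
class Field:
    name: str
    array: Array
    grid: Grid
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Bundle:
    grid: Grid
    fields: Dict[str, Field] = field(default_factory=dict)

    def add(self, f: Field) -> None:
        if f.grid != self.grid:
            raise ValueError("all Fields in a Bundle must share the same Grid")
        self.fields[f.name] = f


grid = Grid("ocean_grid", (2, 3))
raw = [[280.0] * 3 for _ in range(2)]            # the user's own array

# Field is a class; `temperature` is an object (an instance of Field).
temperature = Field("temperature", Array(raw), grid, {"units": "K"})
pressure = Field("pressure", Array([[1000.0] * 3 for _ in range(2)], halo_width=1),
                 grid, {"units": "hPa"})

bundle = Bundle(grid)
bundle.add(temperature)
bundle.add(pressure)

# The Array references the user's data rather than copying it.
print(temperature.array.data is raw, sorted(bundle.fields))
```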
Application Driver
• Small, generic program that contains the
“main” for an ESMF application.
ESMF Communications
• Halo
– Updates edge data for consistency
between partitions
• Redistribution
– No interpolation, only changes how the
data is decomposed
• Regrid
– Based on SCRIP package from Los
Alamos
– Methods include bilinear, conservative
• Bundle, Field, Array-level interfaces
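Halo updates and redistribution can be illustrated on a one-dimensional decomposition. This is a serial plain-Python sketch of the two ideas, not ESMF code: the function names are hypothetical, each partition carries one halo cell at either end, and `None` marks a halo cell with no neighbour.

```python
def halo_update(parts):
    """Fill each partition's halo cells from its neighbours' edge interior cells.

    Each partition is a list laid out as [left halo, interior..., right halo].
    Interior values are never modified.
    """
    for i, p in enumerate(parts):
        if i > 0:
            p[0] = parts[i - 1][-2]     # left halo <- left neighbour's last interior cell
        if i < len(parts) - 1:
            p[-1] = parts[i + 1][1]     # right halo <- right neighbour's first interior cell
    return parts


def redistribute(parts, n_new):
    """Re-split the same values over n_new partitions; values are not interpolated."""
    flat = [v for p in parts for v in p]
    base, extra = divmod(len(flat), n_new)
    out, start = [], 0
    for i in range(n_new):
        size = base + (1 if i < extra else 0)
        out.append(flat[start:start + size])
        start += size
    return out


with_halos = halo_update([[None, 1, 2, 3, None], [None, 4, 5, 6, None]])
resplit = redistribute([[1, 2, 3], [4, 5, 6]], 3)

print(with_halos)
print(resplit)
```

Regridding differs from both: it computes new values on a different grid (for example by bilinear interpolation), whereas the two operations here only copy existing values.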
ESMF Utilities
• Time Manager
• Configuration Attributes (replaces namelists)
• Message logging
• Communication libraries
• Regridding library (parallelized, on-line SCRIP)
• I/O (barely implemented)
• Performance profiling (not implemented yet, may simply use Tau)
ESMF Summary
• Developed for and by climate
community
• Sandwich model
– ESMF provides superstructure and
infrastructure, user provides filling
• Used for some applications, and
increasingly, apps are written using it
• Mostly Fortran-based (user community
requirement), and CCA compatible
Summary – Material for the Test
• CCA Motivations: slides 25-27
• Component based Software Engineering: slide 29
• CCA Concepts: slides 34-50
• What is Cactus: slides 54, 55, 57
• Cactus Architecture: slides 58-65
• Cactus, current capabilities: slides 66, 67
• What is ESMF: slides 70, 71
• Design concepts in ESMF: slides 73-77
• ESMF Architectural Components: slides 78-85
URLs
• Common Component Architecture (CCA)
– http://www.cca-forum.org/
• Cactus
– http://www.cactuscode.org/
• Earth System Modeling Framework (ESMF)
– http://www.esmf.ucar.edu/