SFA-1
Computational science is emerging as its own discipline
Simulation is becoming a peer to theory and experiment in the process of scientific discovery
Integration is the key
— domain science expert
— applied mathematician
— computer scientist
Turbulence
Fusion
Biology
Lasers
Materials
Environment
SFA-2
A “multidiscipline” on the verge of full bloom
— Envisioned by Von Neumann and others in the 1940’s
— Undergirded by theory (numerical analysis) for the past fifty years
— Empowered by spectacular advances in computer architecture over the last twenty years
— Enabled by powerful programming paradigms in the last decade
Adopted in industrial and government applications
— Boeing 777’s computational design a renowned milestone
— DOE NNSA’s “ASCI” (motivated by CTBT)
— DOE SC’s “SciDAC” (motivated by Kyoto, etc.)
SFA-3
Scientific Simulation
Experiments difficult to instrument
Experiments prohibited or impossible
Experiments dangerous
Experiments expensive
Environment: global climate, wildland firespread
Engineering: electromagnetics, aerodynamics
Physics: cosmology, radiation transport
Energy: combustion, fusion
Ex #1, Ex #2, Ex #3, Ex #4: personal examples
SFA-4
Has theoretical aspects (modeling)
Has experimental aspects (simulation)
Unifies theory and experiment by providing a common immersive environment for interacting with multiple data sets from different sources
Provides “universal” tools, both hardware and software
Telescopes are for astronomers, microarray analyzers are for biologists, spectrometers are for chemists, and accelerators are for physicists, but computers are for everyone!
Costs going down, capabilities going up every year
SFA-5
Computational Science & Engineering = Applied Math + Domain Science + Computer Science
Applied Math and CS
Science and Engineering Applications: Biology, Physics, Chemistry, Engineering, Environmental
Computational scientists bring applied mathematics and computer science capabilities to bear on challenging problems in science and engineering
Math: sparse linear solvers, nonlinear equations, differential eqns, multilevel methods, AMR techniques, optimization, eigenproblems
CS: data management, data mining, visualization, programming models, languages, OS, compilers, debuggers, architectural issues
Computational Science & Engineering is a team effort!
SFA-6
Traditional supercomputing applications involve the solution of a PDE on a computational grid
— computational fluid dynamics
— oil reservoir and groundwater management
— stockpile stewardship
— ICF and MFE applications
Bigger machines and smarter algorithms have allowed more realistic simulations
— Moore’s Law and massively parallel computers have provided unprecedented computing power
— scalable algorithms enable large-scale simulations
SFA-7
Theory, Experiment and Computation
Growth in the expectations for and applications of CSE methodology has been fueled by rapid and sustained advances over the past 30 years in computing power and in algorithm speed and reliability, and by the emergence of software tools for developing and integrating complex software systems and for visualizing results.
In many areas of science and engineering, the boundary has been crossed where simulation, or simulation in combination with experiment, is more effective (in some combination of time, cost, and accuracy) than experiment alone for real needs.
In addition, simulation is now a key technology in industry.
SFA-8
Growth of Capabilities of Hardware and Algorithms
Updated version of chart appearing in “Grand Challenges: High Performance Computing and Communications”, OSTP Committee on Physical, Mathematical, and Engineering Sciences, 1992.
SFA-9
Advances in algorithmic efficiency rival advances in hardware architecture
Consider Poisson’s equation, $\nabla^2 u = f$, on a cube of size $N = n^3$ (figure: a 64 x 64 x 64 cube)

Year  Method       Reference                Storage  Flops
1947  GE (banded)  Von Neumann & Goldstine  $n^5$    $n^7$
1950  Optimal SOR  Young                    $n^3$    $n^4 \log n$
1971  CG           Reid                     $n^3$    $n^{3.5} \log n$
1984  Full MG      Brandt                   $n^3$    $n^3$

If n=64, this implies an overall reduction in flops of ~16 million
SFA-10
This advance took place over a span of about 36 years, or 24 doubling times for Moore’s Law
$2^{24} \approx 16$ million, the same as the factor from algorithms alone!
[Figure: relative speedup versus year]
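The arithmetic behind this comparison can be checked directly. The sketch below treats the flop counts as pure powers of n, which is an assumption: constants and log factors are ignored. The 36-year span and 24 doublings are taken from the slide; the variable names are illustrative.

```python
# Order-of-magnitude flop counts for Poisson on an n x n x n grid (constants ignored).
n = 64
flops_banded_ge = n ** 7   # 1947: banded Gaussian elimination
flops_full_mg = n ** 3     # 1984: full multigrid

# Gain from algorithms alone: n^7 / n^3 = n^4.
algorithmic_gain = flops_banded_ge // flops_full_mg

# Gain from hardware: 24 doublings over about 36 years.
years = 36
doublings = 24
months_per_doubling = 12 * years / doublings
moore_gain = 2 ** doublings

print(algorithmic_gain, moore_gain, months_per_doubling)
```

With n = 64 = 2^6, the algorithmic factor n^4 equals 2^24 exactly, which is why the two gains coincide; 36 years over 24 doublings corresponds to an 18-month doubling time.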
SFA-11
Since O(N) is already optimal, there is nowhere further “upward” to go in efficiency, but one must extend optimality “outward”, to more general problems
Hence, for instance, algebraic multigrid (AMG), obtaining O(N) in anisotropic, inhomogeneous problems
AMG Framework
[Figure: $\mathbb{R}^n$ divided into error damped by pointwise relaxation and algebraically smooth error]
Choose coarse grids, transfer operators, etc. to eliminate the algebraically smooth error, based on numerical weights, heuristics
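The split between error that relaxation damps and error that it leaves behind can be seen in a small experiment. This is a minimal sketch, not AMG itself. It assumes the 1D model Laplacian tridiag(-1, 2, -1) with zero boundary values, weighted Jacobi with weight 2/3 as the pointwise relaxation, and geometric sine modes standing in for "smooth" and "oscillatory" error. All function names are illustrative.

```python
import math

def weighted_jacobi_sweep(error, omega=2.0 / 3.0):
    """One weighted-Jacobi sweep on A e = 0, A = tridiag(-1, 2, -1), zero boundaries."""
    n = len(error)
    swept = []
    for j in range(n):
        left = error[j - 1] if j > 0 else 0.0
        right = error[j + 1] if j < n - 1 else 0.0
        swept.append((1.0 - omega) * error[j] + omega * 0.5 * (left + right))
    return swept

def sine_mode(k, n):
    """k-th discrete sine mode on n interior points (an eigenvector of A)."""
    return [math.sin(k * math.pi * (j + 1) / (n + 1)) for j in range(n)]

def max_norm(v):
    return max(abs(x) for x in v)

def damping_after(k, n=63, sweeps=10):
    """Ratio of error size after the sweeps to the initial error size."""
    error = sine_mode(k, n)
    start = max_norm(error)
    for _ in range(sweeps):
        error = weighted_jacobi_sweep(error)
    return max_norm(error) / start

smooth_ratio = damping_after(k=1)    # lowest-frequency mode
rough_ratio = damping_after(k=63)    # highest-frequency mode
print(smooth_ratio, rough_ratio)
```

Ten sweeps nearly eliminate the oscillatory mode but barely touch the smooth one. The remaining smooth component is what a coarse grid and transfer operators must remove; AMG applies the same idea when smoothness has to be detected from the matrix entries rather than from geometry.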
SFA-12
Modeling and Decision Making Framework
Conceptual Model
Observations
* Model Types:
(1) Statistical
(2) Empirical
(3) Mechanistic
- PDE’s, ODE’s, DAE’s
- AC’s
Mathematical Formulation (* see Model Types)
Simulation Model
Parameter Estimation (model parameters)
Simulation
- may be stochastic
Objective
- model error
- cost
Stochastic Nature
Uncertainty in:
- model
- parameters
- auxiliary conditions
Leads to uncertainty in predictions
Model Use
- prediction
- design
- policy
[Figure: CDF plot; labels 1, P, y]
Parameter Estimation for:
(1) Fitting model
(2) Minimizing objective function
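The parameter-estimation step in this framework can be sketched for a single-parameter mechanistic model. Everything specific here is an assumption made for illustration: the model dy/dt = -k y with y(0) = 1, the synthetic "observations", the sum-of-squares objective (model error only, no cost term), and the golden-section search used to minimize it.

```python
import math

# Hypothetical observations: exact decay with k = 0.5 plus small fixed perturbations.
T_OBS = [0.0, 1.0, 2.0, 3.0, 4.0]
PERTURBATION = [0.01, -0.01, 0.01, -0.01, 0.01]
Y_OBS = [math.exp(-0.5 * t) + p for t, p in zip(T_OBS, PERTURBATION)]

def simulate(k, t):
    """Simulation model: closed-form solution of dy/dt = -k*y, y(0) = 1."""
    return math.exp(-k * t)

def objective(k):
    """Model error: sum of squared differences between simulation and observations."""
    return sum((simulate(k, t) - y) ** 2 for t, y in zip(T_OBS, Y_OBS))

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)

k_hat = golden_section_minimize(objective, 0.0, 2.0)
print(k_hat, objective(k_hat))
```

The estimate lands close to the value used to generate the data. Repeating the fit over many perturbed data sets would give a distribution for k and, through the simulation, for the predictions, which is the uncertainty the diagram refers to.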
SFA-13
Physical, Chemical and Biological Processes
Conservation Equation
Formulated Model
Closure Relations
Domains and Auxiliary Conditions
Experimental Theoretical Computational
SFA-14
• Behavior near walls and boundaries is critical
• Large molecules moving through small spaces
• Interaction with the macroscale world is still important
SFA-15
The Multiscale World
•Quasicontinuum method (Tadmor, Ortiz, Phillips, 1996): Links atomistic and continuum models through the finite element method. A separate atomistic structural relaxation calculation is required for each cell of the FEM mesh instead of using empirical constitutive information. Predicts observed mechanical properties of materials on the basis of their constituent defects
•Hybrid finite element/molecular dynamics/quantum mechanics method (Abraham, Broughton, Bernstein, Kaxiras, 1999): Massively parallel, but designed for systems that involve a central defective region surrounded by a region that is only slightly perturbed from equilibrium
Nakano et al.
SFA-16
More Multiscale
•Hybrid finite element/molecular dynamics/quantum mechanics algorithm (Nakano, Kalia and Vashishta, 1999)
•Adaptive mesh and algorithm refinement (Garcia, Bell, Crutchfield, Alder, 1999): Embeds a particle method (DSMC) within a continuum method at the finest level of an adaptive mesh refinement hierarchy – application to compressible fluid flow
•Coarse stability and bifurcation analysis using time-steppers (Kevrekidis, Qian, Theodoropoulos, 2000)
The “patch” method
Nakano et al.
This is only a small sample: There is a new journal devoted entirely to multiscale issues!
SFA-17
Engineering Meets Biology
Computational Challenges:
•Multiscale simulation
•Understanding and controlling highly nonlinear network behavior
(140 pages to draw a diagram for network behavior of E. coli)
•Uncertainty in network structure
•Large amounts of uncertain and heterogeneous data
•Identification of feedback behavior
•Simulation, analysis and control of hybrid systems
•Experimental design
SFA-18
Multiscale Simulation of Biochemical Networks
In the heat-shock response in E. coli, an estimated 20-30 sigma-32 molecules per cell play a key role in sensing the folding state of the cell and in regulating the production of heat shock proteins. The system cannot be simulated at the fully stochastic level, due to
•Multiple time scales (stiffness)
•The presence of exceedingly large numbers of molecules that must be accounted for in SSA
Khammash et al.
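To make the cost argument concrete, here is a minimal sketch of the stochastic simulation algorithm (SSA, Gillespie's direct method) on a stand-in system. It is not the heat-shock model: it assumes a single species that is produced at a constant rate and degraded at a rate proportional to its count. The rate constants are hypothetical, chosen so the species fluctuates around 25 copies.

```python
import random

def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Direct-method SSA for: (nothing) -> X at rate k_prod, X -> (nothing) at rate k_deg * X."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * x        # propensity of degradation
        a_total = a_prod + a_deg
        t += rng.expovariate(a_total)      # waiting time to the next reaction
        if t >= t_end:
            break
        if rng.random() * a_total < a_prod:
            x += 1               # production fired
        else:
            x -= 1               # degradation fired (impossible when x == 0)
        times.append(t)
        counts.append(x)
    return times, counts

def time_average(times, counts, t_end):
    """Average copy number, weighting each state by how long it persisted."""
    total = 0.0
    for i, x in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += x * (t_next - times[i])
    return total / t_end

T_END = 200.0
times, counts = gillespie_birth_death(k_prod=25.0, k_deg=1.0, x0=25, t_end=T_END)
mean_copies = time_average(times, counts, T_END)
n_events = len(times) - 1
print(mean_copies, n_events)
```

Each reaction event is simulated one at a time, so the work grows with the total number of events. Fast reactions (stiffness) or abundant species multiply that count, which is the obstacle the slide describes for the full network.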
SFA-19
Beyond Simulation: Computational Analysis
•Sensitivity analysis
•Forward and adjoint methods – ODE/DAE/PDE; hybrid systems
Multiscale, stochastic,… still to come
•Uncertainty analysis
•Polynomial chaos, deterministic systems with uncertain coefficients
•Many other ideas – special issue in progress, SIAM SISC
•Design optimization/optimal control
•Design of experiments – to what extent can you learn something from incomplete information? Where is the most predictive power?
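As an illustration of the forward method for sensitivity analysis of an ODE, the sketch below assumes the scalar model dy/dt = -k y with y(0) = 1. Differentiating with respect to k gives a second ODE for s = dy/dk, namely ds/dt = -k s - y with s(0) = 0, and the two are integrated together. The classical RK4 integrator and all names are illustrative choices.

```python
import math

def rhs(state, k):
    """Right-hand side of the coupled system for (y, s), where s = dy/dk."""
    y, s = state
    return (-k * y, -k * s - y)

def shifted(state, slope, scale):
    """Return state + scale * slope, componentwise."""
    return tuple(u + scale * v for u, v in zip(state, slope))

def rk4_step(state, k, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state, k)
    k2 = rhs(shifted(state, k1, h / 2.0), k)
    k3 = rhs(shifted(state, k2, h / 2.0), k)
    k4 = rhs(shifted(state, k3, h), k)
    return tuple(
        u + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for u, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def solve_with_sensitivity(k, t_end, n_steps=200):
    """Integrate y and its sensitivity s from t = 0 to t_end."""
    state = (1.0, 0.0)   # y(0) = 1, s(0) = 0
    h = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, k, h)
    return state

y_end, s_end = solve_with_sensitivity(k=0.5, t_end=2.0)
exact_s = -2.0 * math.exp(-0.5 * 2.0)   # analytic dy/dk = -t * exp(-k t)
print(y_end, s_end, exact_s)
```

The forward method adds one extra equation per parameter. Adjoint methods are preferred when there are many parameters and few outputs of interest.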
SFA-20
More Computational Analysis
•Determination of nonlinear structure – multiscale, stochastic, hybrid
•Bifurcation
•Mixing
•Long-time behavior
•Invariant manifolds
•Chaos
•Control mechanisms – identifying feedback mechanisms
•Reduced/simplified models – deterministic, multiscale, stochastic, hybrid systems, identify the underlying structure and mechanism
•Data analysis – revealing the interconnectedness, dealing with complications due to data uncertainties
SFA-21
Computer Science will Play a Much Larger Role
Pragmatic reasons: Significant help from software tools
•Source-code generation
•Automatic differentiation – enables greater accuracy and reliability in generation of the Jacobian matrix (and saves work in writing derivative routines and especially in debugging!)
•Fix the dumb things we have done in codes, like ‘if’ statements in functions that are supposed to be continuous
•Thread-safety - identify and fix the problems so that the code is ready for parallel/grid computing
•User interfaces: by current standards in the rest of the computer world, user interfaces for scientific computing look like this:
Some exceptions and coming developments:
•Matlab
•Semi-automatic generation of GUI (MAUI,JMPL), for big production codes and dusty decks
•Component technologies (PETSc)
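The automatic differentiation bullet above can be illustrated with forward-mode dual numbers. This is a minimal sketch, not how production AD tools are built: it assumes a hand-written Dual class supporting only +, -, * and sin, and a hypothetical two-equation residual function. The Jacobian comes out exact to rounding, with no hand-coded derivative routine.

```python
import math

def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))

class Dual:
    """A value together with its derivative along one input direction."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der
    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)
    def __rsub__(self, other):
        return _lift(other) - self
    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val, self.val * other.der + self.der * other.val)
    __rmul__ = __mul__

def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def jacobian(f, point):
    """Jacobian of f at point: one forward pass per input, seeding that input's derivative to 1."""
    columns = []
    for j in range(len(point)):
        seeded = [Dual(v, 1.0 if i == j else 0.0) for i, v in enumerate(point)]
        columns.append([out.der for out in f(*seeded)])
    n_out = len(columns[0])
    return [[columns[j][i] for j in range(len(point))] for i in range(n_out)]

def residual(x, y):
    """Hypothetical residual: F(x, y) = (x*y + sin(x), x*x - y)."""
    return (x * y + sin(x), x * x - y)

J = jacobian(residual, [1.0, 2.0])
print(J)
```

The analytic Jacobian of this residual is [[y + cos x, x], [2x, -1]], so the result can be checked by hand at (1, 2).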
SFA-22
Computer Science will Play a Much Larger Role
The deeper reason:
At the smaller scales, we are dealing with and manipulating large amounts of discrete, stochastic, Bayesian, Boolean information.
These are the foundations of Computer Science. Bioinformatics is just the tip of the iceberg.
SFA-23
Consider the process of scientific simulation
— software development
— problem definition and simulation setup
— data analysis and understanding
There has been no equivalent of Moore’s Law for how we develop our software
Increasingly complex simulations often require months to set up and months to analyze the results
SFA-24
Multi-level methods for multi-scale problems
Rapid problem setup tools (mesh generation and discretization methods for complex geometries)
Flexible software frameworks and interoperable s/w components for rapid application development
Computer architectures & performance optimization
Information exploitation (data management, image analysis, info/data visualization, data mining)
Systems engineering to integrate simulation, sensors, and info analysis into a decision support capability
Discrete simulation (scenario planning)
Validation and Verification (coupling to experiments)
SFA-25
We should focus on how CSE can benefit the nation
— enhancing national & homeland security
— promoting economic vitality and energy security
— improving human health
We need to emphasize the multi-disciplinary nature of CS&E and its track record in delivering!
— distinguish ourselves from constituent disciplines
— need to do a better job of getting the word out!
SFA-26
DOE has been a long-time leader in CS&E
— ASCI re-invigorated supercomputing
— Office of Science is championing the cause with its successful SciDAC initiative
NSF has long invested in IT and CS, and is beginning to think more about CS&E
DHS has pressing needs for help in simulation and information fusion
NIH should be a bigger player than it is, but there are serious cultural obstacles
SFA-27
Computational Science Research and Education: Funding Considerations
Fellowship programs
Need for critical mass
Focus
Baseline support of sufficient duration is optimal
SFA-28
Need to teach the importance of working on teams
— Rarely have a single PI
— We need to recognize team efforts
Need more opportunities for students to solve “real” problems in a research environment
We need opportunities for everybody to learn new fields
Integration between agencies as well as integration across disciplines?
SFA-29
Biotechnology
— Biophysical simulations
— Data management
— Stochastic dynamical systems
Nanoscience
— Multiple scales (time and length)
— Scalable algorithms for molecular systems
— Optimization and predictability
SFA-30
[Figure: ASCI platform roadmap. Axis: Time (CY), ‘97 to ‘06. Phases: Plan, Develop, Use. Capability steps: Red, 1+ Tflop / 0.5 TB; Blue, 3+ Tflop / 1.5 TB; White, 10+ Tflop / 4 TB; then 30+ Tflop / 10 TB, 50+ Tflop / 25 TB, and 100+ Tflop / 30 TB. Sites: Sandia, Los Alamos, Livermore.]
NNSA has roadmap to go to 100 Tflop/s by 2006 www.llnl.gov/asci/platforms
SFA-31
Bringing the CS&E and Statistics Communities Together
Example : Inverse problems and validation for complex computer models
Barriers to closer association
Mechanisms for closer association
SFA-32
Barriers to Bringing the CS&E and Statistics Communities Together
To many disciplinary scientists
— we are each ‘providers of tools they can use’
— we are indistinguishable quantitative experts
Program and project funding rarely encourage inclusion of both CS&E and statistical scientists.
Our traditional application areas generally differ
— CS&E tradition: physical sciences and engineering
— Statistics tradition: strongest – as the statistics discipline – in social sciences, medical sciences,…
(This could be an organizational strength for the CS&E initiative, but is a barrier at the personal level.)
SFA-33
Mechanisms for Bringing the CS&E and Statistics Communities Together
Most important is simply to bring them together on interdisciplinary teams.
Institute programs (e.g., at SAMSI), for extended cooperation
— joint workshops
— joint working groups
Emphasize need for joint funding on interdisciplinary projects.
At Universities?
SFA-34
Research Challenges
Statistical computational research challenges:
— MCMC development and implementation
— data confidentiality and large contingency tables
— dealing with large data sets
– in real time
– off-line
— bioinformatics, gene regulation, protein folding, …
— data mining
— utilizing multiscale data
— data fusion, data assimilation
— graphical models/causal networks
— open source software environments
— visualization
— many many more.
SFA-35
Research Challenges, Continued
Challenges in the synthesis of statistics and development of computer modeling:
— Statistical analysis in non-linear situations can require thousands of model evaluations (e.g., using MCMC), so the ‘real’ computational problem is the product of two very intensive computational problems; this is needed for
– designing effective evaluation experiments;
– estimating unknown model parameters (inverse problem), with uncertainty evaluation;
– assessing model bias and predictive capability of the model;
– detecting inadequate model components.
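The "product of two intensive problems" can be seen even in a toy inverse problem. The sketch below is an assumption-laden illustration: the computer model is the closed-form decay exp(-k t) standing in for an expensive simulation, the data are synthetic, the noise level is assumed known, the prior on k is flat on k > 0, and the sampler is a plain random-walk Metropolis algorithm. It counts how many times the model must be evaluated.

```python
import math
import random

# Hypothetical data generated from the model itself with k = 0.5.
T_OBS = [0.0, 1.0, 2.0, 3.0, 4.0]
K_TRUE = 0.5
Y_OBS = [math.exp(-K_TRUE * t) for t in T_OBS]
SIGMA = 0.05   # assumed observation noise level

def metropolis(n_steps, k_start=0.8, step=0.05, seed=1):
    """Random-walk Metropolis for k; returns the chain and the number of model evaluations."""
    rng = random.Random(seed)
    model_evals = 0

    def log_posterior(k):
        nonlocal model_evals
        if k <= 0.0:
            return float("-inf")   # outside the flat prior's support; model not run
        model_evals += 1           # one full pass of the "computer model"
        sse = sum((math.exp(-k * t) - y) ** 2 for t, y in zip(T_OBS, Y_OBS))
        return -0.5 * sse / SIGMA ** 2

    k, lp = k_start, log_posterior(k_start)
    samples = []
    for _ in range(n_steps):
        k_new = k + rng.gauss(0.0, step)
        lp_new = log_posterior(k_new)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            k, lp = k_new, lp_new   # accept the proposal
        samples.append(k)
    return samples, model_evals

samples, model_evals = metropolis(5000)
kept = samples[1000:]               # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean, model_evals)
```

Here each model evaluation is trivial. If each one were a large simulation, the several thousand evaluations needed for one posterior summary would dominate the cost, which is the point made above.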
SFA-36
Research Challenges, Continued
— Simultaneous use of statistical and applied mathematical modeling is needed for
– effective utilization of many types of data, such as
– data that occurs at multiple scales;
– data/models that are individual-specific.
– replacing unresolvable determinism by stochastic or statistically modeled components (parameterization)
This general area of validation of computer models should be a Grand Challenge.
SFA-37
Five Investment Models for CS&E to Prosper
Laboratory institutes (hosted at a lab)
ICASE, ISCR (more details to come)
National institutes (hosted at a university)
IMA, IPAM
Interdisciplinary centers
ASCI Alliances, SciDAC ISICs, SCCM, TICAM, CAAM, …
CS&E fellowship programs
CSGF, HPCF
Multi-agency funding (cyclical to be sure, but sometimes collaborative)
DOD, DOE, NASA, NIH, NSF, …
SFA-38
Be “eyes and ears” for CSE by staying abreast of advances in computer and computational science
Be “hands and feet” for CSE by carrying those advances into the laboratory
Three principal means for packaging scientific ideas for transfer
— papers
— software
— people
People are the most effective!
SFA-39
Universities
Generic CSE Center (GCC)
Lab programs
Students
Faculty
Lab Employees
Faculty visit the GCC, bringing students
Most faculty return to university, with lab priorities
Some students become lab employees
Some students become faculty, with lab priorities
A few faculty become lab employees
SFA-40
Bay Area NA Day
Common Component Architecture
Copper Mountain Multigrid Conference
DOE Computational Science Graduate Fellows
Hybrid Particle-Mesh AMR Methods
Mining Scientific Datasets
Large-scale Nonlinear Problems
Overset Grids & Solution Technology
Programming ASCI White
Sensitivity and Uncertainty Quantification
SFA-41
CS&E majors without a CS undergrad need to learn to compute!
Prerequisite or co-requisite to becoming useful interns at a lab
Suggest a “bootcamp” year-long course introducing:
— C/C++ and object-oriented program design
— Data structures for scientific computing
— Message passing (e.g., MPI) and multithreaded (e.g.,
OpenMP) programming
— Scripting (e.g., Python)
— Linux clustering
— Scientific and performance visualization tools
— Profiling and debugging tools
NYU’s sequence G22.1133/G22.1144 is an example for CS
SFA-42
Difficult to get support for maintaining critical software infrastructure and “benchmarking” activities
Difficult to get support for hardware that is designed with computational science and engineering in mind
Difficult for pre-tenured faculty to find reward structures conducive to interdisciplinary efforts
Unclear how stable is the market for CS&E graduates at the entrance to a 5-year pipeline
Political necessity of creating new programs with each change of administrations saps time and energy of managers and community
SFA-43
DOE’s SciDAC model being recognized and propagated
NSF’s DMS budgets on a multi-year roll
SIAM SIAG-CSE attracting members from outside of traditional SIAM departments
CS&E programs beginning to exhibit “centripetal” potential in traditionally fragmented research universities, e.g., SCCM’s “Advice” program
Computing at the large scale is weaning domain scientists from “Numerical Recipes” and MATLAB and creating thirst for core enabling technologies (NA, CS, Viz, …)
Cost effectiveness of computing, especially cluster computing, is putting a premium on graduate students who have CS&E skills
SFA-44
Jul 2002 report to DOE
Proposes $5M/year theory and modeling initiative to accompany the existing $50M/year experimental initiative in nanoscience
Report lays out research in numerical algorithms and optimization methods on the critical path to progress in nanotechnology
SFA-45
Dec 2002 report to DOE
Currently DOE supports 52 codes in Fusion Energy Sciences
US contribution to ITER will “major” in simulation
Initiative proposes to use advanced computer science techniques and numerical algorithms to improve the US code base in magnetic fusion energy and allow codes to interoperate
SFA-46