
V&V Issues
Timothy G. Trucano
Optimization and Uncertainty Estimation Department, Org 9211
Sandia National Laboratories
Albuquerque, NM 87185
Workshop on Error Estimation and Uncertainty Quantification
November 13-14, 2003
Johns Hopkins University
Phone: 844-8812, FAX: 844-0918
Email: tgtruca@sandia.gov
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,
for the United States Department of Energy under contract DE-AC04-94AL85000.
Outline of talk.
• The problem.
• What is validation?
• What is verification?
• Coupling is required.
• Walking through V&V.
• A few research issues.
Useful quotes to keep in mind.
• Hamming – “The purpose of computing is insight…” (?)
• ASCI – the purpose of computing is to provide "high-performance, full-system, high-fidelity-physics predictive codes to support weapon assessments, renewal process analyses, accident analyses, and certification." (DOE/DP99-000010592)
• Philip Holmes – “…a huge simulation of the ‘exact’
equations…may be no more enlightening than the
experiments that led to those equations…Solving the
equations leads to a deeper understanding of the model
itself. Solving is not the same as simulating.” (SIAM News,
June, 2002)
“Validation” is a process of comparing calculations
with experimental data and drawing inferences
about the scientific fidelity of the code for a
particular application. For example:
[Figure: reflected-to-incident pressure ratio (pr/pinc) versus incident angle (30-50 degrees), comparing experiment with error bars, an analytic solution, and an ALEGRA calculation. The comparison with experiment is the physics; the numerical solution of the equations is the math.]
• Validation is a "physics problem."
• Verification is a "math problem."
Some of the questions that occur to
us as a result of this comparison:
• What do the error bars mean?
• What is the numerical accuracy of the code?
• Is the comparison good, bad, or indifferent? In what
context?
• Why did we choose this means to compare the data and the
calculation? Is there something better?
• Why did we choose this problem to begin with?
• What does the work rest on (such as previous knowledge)?
• Where is the work going (e.g. what next)?
What is validation?
• Validation of computational science software is the process
of answering the following question:
Are the equations correct?
• It is convenient to recognize that validation is also the
process of answering the following question:
Are the software requirements correct?
• It goes without saying that validation is “hard;” but it is
sometimes forgotten that the latter definition of validation
takes precedence (it applies to ANY software).
– Focus on the former; but remember the latter.
What is verification?
• Verification of computational science software is the
process of answering the following question:
Are the equations solved correctly?
• It is convenient to recognize that verification is also the
process of answering the following question:
Are the software requirements correctly implemented?
• A strict definition of verification is:
Prove that calculations converge to the correct solution
of the equations.
• This latter definition is HARD (impossible)!! Use it as a
mission statement.
• Provably correct error estimation is essentially an
equivalent problem.
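The "converge to the correct solution" sense of verification can at least be exercised on problems with known answers. Below is a minimal sketch, not from the talk: the model problem, grids, and second-order scheme are assumptions chosen for illustration of an observed-order-of-convergence check.

```python
import numpy as np

def solve_poisson(n):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using second-order
    central differences; the exact solution is u(x) = sin(pi x)."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)            # interior nodes
    f = np.pi**2 * np.sin(np.pi * x)          # forcing consistent with exact u
    # Tridiagonal operator for -u'' with Dirichlet boundary conditions
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    u = np.linalg.solve(A, f)
    exact = np.sin(np.pi * x)
    return np.sqrt(h) * np.linalg.norm(u - exact)   # discrete L2 error

# Observed order of convergence from two successive grid refinements.
e_coarse = solve_poisson(64)
e_fine = solve_poisson(128)
order = np.log(e_coarse / e_fine) / np.log(2.0)
print(f"observed order of convergence ~ {order:.2f} (theoretical: 2)")
```

Agreement of the observed order with the theoretical order is the kind of evidence a verification test suite accumulates; it does not by itself prove correctness.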
V&V are processes that accumulate
information – “evidence”.
• What evidence is required: this requires V&V plans.
• How evidence is accumulated: this requires V&V tasks.
• How evidence is "accredited": this requires V&V assessment.
• How evidence is applied: this intersects computing.
Why do we choose specific
validation tasks?
We have defined and implemented a planning framework for code application validation at Sandia that reflects the hierarchical nature of validation.
• We have defined formal and documented planning guidance. (Trucano, et al., Planning Guidance Ver. 1 and 2, SAND99-3098, SAND2000-3101)
• A core concept is the PIRT (Phenomena Identification and Ranking Table), a "Quality Function Deployment" tool.
[Figure: 2-D shock wave experiment. This is a single-material, simple-EOS, strong-shock, multidimensional hydrodynamics validation problem that develops validation evidence for ALEGRA-HEDP capabilities. It will be used in a "validation grind" for ALEGRA. (Chen-Trucano, 2002)]
• We have defined a formal and documented
assessment methodology for the planning
component.
What do the experimental error bars
mean?
pr / pinc
Instrument Fidelity
6
All serious discussion of validation
metrics begins with uncertainty in
the experimental data.
5
• A difficult problem is to characterize the
uncertainty embedded in |Calc – Expt|.
4
• For the short term, validation metrics are
driven by the assumption that this
uncertainty can be characterized
probabilistically.
Analytic
Experiment + Error Bar
• An important component in doing this
right is to execute dedicated experimental
validation.
3
This study relied upon existing
experimental data that did not
• A rigorous methodology for experimental
2
characterize
uncertainty properly.
validation addresses experimental data
Our interpretation of the error bars
requirements. (Trucano, et al “General Concepts
is that they reflect only instrument
ALEGRA Calculationfor Experimental Validation of ASCI Code Applications”
fidelity.
(From this, we might make a
1
SAND2002-0341)
strong assumption
about
30
35 “uniform
40
45
50
distributions.”)
What are the “Validation Metrics”
|Calc – Expt|?
Our key R&D project is the Validation Metrics Project (origin ~1998).
• The focus of the project is to answer the following questions (Trucano, et al., "Description of the Sandia Validation Metrics Project," SAND2002-0121):
1. What metrics, and why?
2. What are relevant Pass/Fail criteria?
3. What are the implications for calculation prediction confidence?
• Critical impact on current V&V milestones.
• Uncertainty Quantification is an enabling technology.
• Current themes are thermal analysis, solid mechanics, and structural dynamics.
[Figure: the "viewgraph norm": pr/pinc versus incident angle, comparing experiment with error bars, an analytic solution, and an ALEGRA calculation. The main metric was reproduction of qualitative trends in shock reflection; the secondary goal was quantitative pointwise comparison of specific Mach number/angle pairs.]
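Where pass/fail criteria are wanted, even a crude quantitative metric improves on the viewgraph norm. The sketch below is a toy illustration only: the data values, the uniform-distribution reading of the error bars, and the tolerance are all invented, not taken from the ALEGRA study.

```python
import numpy as np

# Hypothetical paired data: experimental values with reported half-width
# error bars and corresponding calculated values (all numbers invented).
expt = np.array([2.1, 2.9, 3.8, 4.6])
half_width = np.array([0.2, 0.2, 0.3, 0.3])     # reported error bars
calc = np.array([2.0, 3.1, 3.6, 4.9])

# Treat each error bar as the half-width of a uniform distribution, so the
# standard uncertainty is half_width / sqrt(3) (a stated assumption).
sigma = half_width / np.sqrt(3.0)

# Normalized discrepancy per point, plus a simple aggregate metric.
z = np.abs(calc - expt) / sigma
metric = np.sqrt(np.mean(z**2))

tolerance = 2.0                                  # pass/fail threshold (a choice)
print(f"pointwise |Calc - Expt|/sigma = {np.round(z, 2)}")
print(f"aggregate metric = {metric:.2f}, pass = {metric <= tolerance}")
```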
It’s obvious that there are better
metrics than the viewgraph norm.
• Probabilistic sophistication in development and application
of these metrics is a great challenge.
$$
\frac{P(M_i \mid \text{observation})}{P(M_j \mid \text{observation})}
= \frac{P(\text{observation} \mid M_i)}{P(\text{observation} \mid M_j)}
\times \frac{P(M_i)}{P(M_j)}
$$
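One way to read this relation numerically, assuming Gaussian observation error and two competing models; all numbers below are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

# Two hypothetical model predictions of the same observable, a single
# measured value, and its standard uncertainty (all values invented).
prediction = {"M1": 4.2, "M2": 4.8}
prior = {"M1": 0.5, "M2": 0.5}
observation, sigma = 4.35, 0.2

# Likelihood of the observation under each model, assuming Gaussian error.
likelihood = {m: norm.pdf(observation, loc=p, scale=sigma)
              for m, p in prediction.items()}

# Posterior odds = likelihood ratio (Bayes factor) times prior odds.
bayes_factor = likelihood["M1"] / likelihood["M2"]
posterior_odds = bayes_factor * (prior["M1"] / prior["M2"])
print(f"Bayes factor M1:M2 = {bayes_factor:.2f}, "
      f"posterior odds = {posterior_odds:.2f}")
```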
What do probabilistic metrics
mean?
"Error" = |Calc – Expt| is probabilistic:
• Expt = random field; Calc = random field.
• These fields depend on many variables: geometry, initial and boundary condition specifications, and, in the case of the calculation, numerical parameters. (Hills, SAND99-1256, SAND2001-0312, SAND2001-1783)
 Hopefully only a "few" of these variables are important.
[Figure: empirical histogram of regression standardized residuals ("Statistical Error Description"). A study by Hills investigating statistical methodologies for very simple validation data (Hugoniot data for aluminum) has been influential for the entire validation metrics project. This is the simplest starting point for validation of shock wave calculations.]
• Predictive confidence results from
understanding of the error field; depends
on quantity and quality of data.
• Additional complexities arise from the
hierarchical nature of validation and the
intended applications. This is an important
subject of current research (Hills & Leslie: multivariate statistical approaches; Mahadevan: Bayesian net reliability).
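A sketch of the kind of standardized-residual analysis behind such a histogram; the synthetic data here merely stand in for the aluminum Hugoniot comparisons of the Hills study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic paired experiment/calculation values standing in for Hugoniot data.
expt = rng.normal(loc=10.0, scale=2.0, size=200)
calc = expt + rng.normal(loc=0.1, scale=0.4, size=200)   # model bias + noise

# Standardize the residuals and build an empirical histogram, as in a
# regression standardized-residual plot.
resid = calc - expt
standardized = (resid - resid.mean()) / resid.std(ddof=1)
counts, edges = np.histogram(standardized, bins=np.arange(-3.0, 3.5, 0.5))

for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:+.1f}, {hi:+.1f}): {'#' * c}")
```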
What are the calculation error bars?
It is critical to "verify" calculations used in validation studies. (Verification guidance currently missing but in progress.)
• This requires convergence studies and error estimation techniques.
• Because it is unlikely that this will be fully definitive, our confidence in the numerical accuracy of validation calculations also rests upon:
 Code verification processes and results. This includes attention to software engineering (SE).
 Careful design and application of verification test suites where convergence to the right answer can be demonstrated.
• DOE has demanded formal attention to SE (and Sandia has responded).
[Figure: these calculations are not converged. When we performed the original work we could not converge the calculations because of hardware limitations. ALEGRA-HEDP has a growing set of verification problems that increase our confidence in the numerical accuracy.]
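One common way to put an "error bar" on a calculation, when a convergence study is feasible, is Richardson-style extrapolation over systematically refined grids. This is a minimal sketch: the quantity-of-interest values and refinement ratio are invented.

```python
import numpy as np

# Hypothetical quantity of interest computed on three systematically
# refined grids (values invented); r is the constant refinement ratio.
q = {"coarse": 1.480, "medium": 1.455, "fine": 1.447}
r = 2.0

# Observed order of convergence from the three solutions.
p = np.log((q["coarse"] - q["medium"]) / (q["medium"] - q["fine"])) / np.log(r)

# Richardson-extrapolated value and an estimated discretization error
# for the fine-grid solution.
q_exact_est = q["fine"] + (q["fine"] - q["medium"]) / (r**p - 1.0)
error_est = abs(q_exact_est - q["fine"])
print(f"observed order p = {p:.2f}")
print(f"estimated fine-grid discretization error = {error_est:.4f}")
```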
Is there also uncertainty in the
calculation beyond numerical accuracy?
Calculations have uncertainties that are composed both of numerical accuracy questions and of uncertainties arising from problem specifications.
• Numerical accuracy uncertainties fundamentally reside in lack of convergence for a fixed problem specification; one extreme point is known under-resolution of the grid.
• There is uncertainty in translating experimental specs into calculation specs.
• There is uncertainty in specifying a variety of numerical parameters; hence calibration of uncertain models becomes an important question.
[Figure: Is uncertainty in the Grüneisen parameter important? (Part of an ensemble of 120,000 calculations, shown against experiment.) To study the probabilistic content of the error field when we compare calculated and experimental Hugoniot data, we studied the influence of uncertainty in certain computational parameters. (L. Lehoucq, using the DDACE UQ tool. This type of study can now be accomplished using DAKOTA.)]
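The flavor of such an ensemble study can be sketched as follows; the linear Us-up "Hugoniot model," the nominal coefficients, and the sampled parameter range are stand-ins chosen for illustration, not the actual DDACE/DAKOTA study.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_hugoniot(particle_velocity, c0, s):
    """Toy linear Us-up Hugoniot standing in for the real calculation:
    shock velocity Us = c0 + s * up."""
    return c0 + s * particle_velocity

# Treat the slope s (tied to the EOS/Gruneisen model) as uncertain; the
# sampled range here is an assumption for illustration only.
up = np.linspace(0.5, 3.0, 6)                   # particle velocities, km/s
s_samples = rng.uniform(1.30, 1.45, size=1000)

# Ensemble of calculated shock velocities at each particle velocity.
ensemble = np.array([toy_hugoniot(up, c0=5.35, s=s) for s in s_samples])

mean = ensemble.mean(axis=0)
spread = ensemble.std(axis=0)
for u, m, sd in zip(up, mean, spread):
    print(f"up = {u:.2f} km/s : Us = {m:.3f} +/- {sd:.3f} km/s")
```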
Uncertainty dominates real V&V.
[Diagram: uncertainty enters throughout the chain from Input, through Codes and Output, to Application and Decisions (Quantitative Margins and Uncertainty). Contributing sources include specification, algorithm lack of rigor, calibrations, under-resolution, structural (model) uncertainty, code reliability, validation data, human reliability, and infrastructure reliability.]
What is the intended application and are
we accumulating predictive confidence?
There is a very important link between V&V and the intended application of modeling and simulation.
• Rigorous assessment of the predictive confidence resulting from V&V is important.
 This is demanded by the experimental validation methodology and the validation metrics project.
 There are technical problems, such as how to quantify the benefit gained by doing additional validation experiments, and how to quantify the risk associated with not having validation experiments.
• We have also devoted significant attention to the issue of stockpile computing: how to make proper use of the investment in V&V. (Document in progress.)
• "Quantitative Margins and Uncertainty"(!)
[Figure: predictive modeling for Z-pinch physics; ALEGRA-HEDP validation calculation for Imperial College 4x4 arrays. Success or failure of predictive M&S may have an important influence on the future of the Pulsed Power program.]
WIPP and NUREG-1150 Precedents
High-consequence regulatory issues in the national interest, addressed primarily through modeling and simulation.
[Figure: WIPP data.]
Lessons learned: (1) Seek BE (best estimate) + uncertainty. (2) It takes more than one shot to get it right.
Example research question: Is Probabilistic
Software Reliability (PSR) useful for
computational science software?
[Figure: hypothetical reliability model for ASCI codes, plotting failure rate and number of users over the lifecycle: development and test, first use/validation, and application decisions, for successive capabilities I, II, etc. Note the important implication here that the software is NEVER FROZEN! This affects reliability methods.]
A general-purpose computational physics code such as ALEGRA-HEDP has a complex software lifecycle and reliability history. A fundamental complexity is the constant evolution of the software capability.
PSR methodologies may deepen our ability to express operational confidence in our codes as software products.
• A vigorous area of research is the expansion and limits of statistical testing techniques.
 "Based on the software developer and user surveys, the national annual costs of an inadequate infrastructure for software testing is estimated to range from $22.2 to $59.5 billion." ("The Economic Impacts of Inadequate Infrastructure for Software Testing," NIST report, 2002.)
• Can PSR be extended to include "failures" defined by unacceptable algorithm performance? By inadequate resolution?
• Can interesting code acceptance criteria be devised based on statistical software reliability ideas?
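As one small example of a statistical-testing-based acceptance idea (my illustration, not a method claimed in the talk): if N randomly selected test problems all run without a "failure," however failure is defined, a classical upper confidence bound on the per-test failure probability follows. The test count and confidence level below are assumed values.

```python
def failure_rate_upper_bound(n_tests, confidence=0.95):
    """Classical upper confidence bound on the per-test failure probability
    when n_tests independent, randomly selected tests all pass:
    solve (1 - p)**n_tests >= 1 - confidence for p."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_tests)

# Hypothetical: 300 randomly sampled test problems, all judged acceptable.
bound = failure_rate_upper_bound(n_tests=300, confidence=0.95)
print(f"95% upper bound on failure probability per run ~ {bound:.4f}")
```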
Example research question: Validation
Metric Research.
Uncertainty quantification remains a critical enabling technology for validation:
• Forward uncertainty propagation is computationally demanding (the ideal would be stochastic PDEs).
• UQ needs new ideas in experimental design for simulations and for coupled simulation-experiment validation tasks.
• UQ needs tools, and the expertise to use them properly.
 DAKOTA is the platform of choice for current and future evolution of UQ tool capability at Sandia.
• The "backward" UQ problem, model improvement, is an even harder formal challenge.
 This is related to Optimization Under Uncertainty (OUU).
[Figure: real data from the Z-machine (load implosion, load stagnation). The most relevant data on the Z-machine tends to be complicated, integral, and spatio-temporally correlated; its uncertainty is currently not well characterized.]
Example research question: OUU
(Optimization Under Uncertainty).
Using computational models in reliability-based or robust design is an important goal.
• V&V are the source of the confidence that we have in the modeling component of these activities.
• Model improvement derived from V&V is related to OUU.
 For example, calibration under uncertainty.
• It is important to couple research on OUU with research threads in Validation Metrics.
• We are just beginning this work.
• VERY COMPLEX computation underlies this work.
[Figure: fusion capsule design for a Z-pinch driver, an interesting and extreme problem in OUU; we are currently using ALEGRA-HEDP and DAKOTA to study features of this problem. The schematic (axes z and r) shows high-gain capsules, high-Z dense plasma (design, uncertain, unstable), foam + ? (design), the capsule (design), robust (reliable?) pulse shaping and pulse compensation, wire initiation (3D ALEGRA MHD), conversion (1,2,3D ALEGRA rad-MHD), drive and implosion (1,2D ALEGRA rad-hydro), and Lagrangian/SMALE/MMALE/Eulerian mesh treatments.]
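A stripped-down illustration of the OUU idea: choose a design variable to minimize the expected value of an objective that also depends on an uncertain parameter, with the expectation estimated by sampling inside the optimizer. The objective, distribution, and bounds are invented; a real study would drive ALEGRA-HEDP through DAKOTA rather than a closed-form function.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)

def performance(design, uncertain):
    """Toy objective standing in for an expensive simulation (smaller is better)."""
    return (design - 1.0) ** 2 + 0.5 * uncertain * design ** 2

# Fixed Monte Carlo sample of the uncertain parameter (common random numbers,
# so the estimated objective is deterministic across optimizer iterations).
xi = rng.normal(loc=0.2, scale=0.1, size=2000)

def expected_performance(design):
    """Sample-average estimate of the mean objective over the uncertainty."""
    return performance(design, xi).mean()

result = minimize_scalar(expected_performance, bounds=(0.0, 2.0), method="bounded")
print(f"robust design choice ~ {result.x:.3f}, expected objective ~ {result.fun:.4f}")
```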
Giunta has been working on the use of multifidelity surrogates, which will surely be crucial
for use of OUU in such complex problems.
Multifidelity Surrogate Models
• The low-fidelity surrogate model retains many of the important features of the high-fidelity "truth" model, but is simplified in some way:
– decreased physical resolution
– decreased FE mesh resolution
– simplified physics
• Independent of the number of design parameters.
• Low-fidelity model still may have nonsmooth response trends.
• Works well when low-fidelity trends match high-fidelity trends.
[Figure: finite element models of the same component; low fidelity at 30,000 DOF, high fidelity at 800,000 DOF.]
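A minimal version of one multifidelity-surrogate idea, an additive correction fitted to the discrepancy between a few high-fidelity samples and the low-fidelity model; this is an illustrative sketch with made-up model functions, not Giunta's specific method.

```python
import numpy as np

def low_fidelity(x):
    """Cheap model: captures the trend but is biased."""
    return np.sin(x)

def high_fidelity(x):
    """'Truth' model: expensive in practice, cheap here for illustration."""
    return np.sin(x) + 0.3 * x

# A few expensive high-fidelity samples and a polynomial fit to the discrepancy.
x_hf = np.linspace(0.0, 3.0, 4)
discrepancy = high_fidelity(x_hf) - low_fidelity(x_hf)
correction = np.polynomial.Polynomial.fit(x_hf, discrepancy, deg=1)

def multifidelity_surrogate(x):
    """Low-fidelity prediction plus the fitted additive correction."""
    return low_fidelity(x) + correction(x)

x_test = np.linspace(0.0, 3.0, 7)
err = np.max(np.abs(multifidelity_surrogate(x_test) - high_fidelity(x_test)))
print(f"max surrogate error on test points = {err:.2e}")
```

This works well precisely when, as the bullets note, the low-fidelity trends match the high-fidelity trends so the discrepancy is cheap to model.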
Combining uncertainty and multi-fidelity
runs us head-on into probabilistic error
models.
• The problem is a simple shock problem involving shock transmission and reflection from a contact discontinuity.
• Key features are various wave space-time trajectories.
• This is also a common verification test problem and has an analytic solution.
[Figure: space-time plots of log(r) for Material #1 and Material #2; error[log(r)]; mean error in contact position x; empirical histogram of error in shock arrival time at the wall.]
Probabilistic Error Models (PEM) are useful
for computational science software and
necessary for risk-informed decisions.
• Suppose that we can neither “verify codes” nor “verify
calculations.”
– “When quantifying uncertainty, one cannot make errors
small and then neglect them, as is the goal of classical
numerical analysis; rather we must of necessity study
and model these errors.”
– “…most simulations of key problems will continue to be
under resolved, and consequently useful models of
solution errors must be applicable in such
circumstances.”
– “…an uncertain input parameter will lead not only to an
uncertain solution but to an uncertain solution error as
well.”
• These quotes reflect a new view of "numerical error" expressed in B. DeVolder, J. Glimm, et al. (2001), "Uncertainty Quantification for Multiscale Simulations," Los Alamos National Laboratory, LA-UR-01-4022.
Conclusion:
“We make no warranties, express or implied, that the
programs contained in this volume are FREE OF
ERROR, or are consistent with any particular
merchantability, or that they will meet your
requirements for any particular application. THEY
SHOULD NOT BE RELIED UPON FOR SOLVING A
PROBLEM WHOSE SOLUTION COULD RESULT IN
INJURY TO A PERSON OR LOSS OF PROPERTY…”
[Emphasis Mine] (from Numerical Recipes in
Fortran, Press, Teukolsky, Vetterling, and Flannery)
Will we be able to seriously claim that ASCI codes
are any better than this?!
How absurd would the following be?
We make no warranties,
express or implied, that the
bridge you are about to drive on
is free of error…
How much more absurd would the
following be?
We make no warranties,
express or implied, that the
book you are about to read
is free of error…