Yousry Azmy · Enrico Sartori

Nuclear Computational Science
A Century in Review

Springer
Prof. Yousry Azmy
North Carolina State University
Department of Nuclear Engineering
1110 Burlington Engineering Labs
Raleigh, NC 27695
USA
yyazmy@ncsu.edu

Enrico Sartori
Organisation for Economic Co-operation
and Development (OECD)
12 bd. des Iles
92130 Issy-les-Moulineaux
France
esartori@noos.fr
ISBN 978-90-481-3410-6
e-ISBN 978-90-481-3411-3
DOI 10.1007/978-90-481-3411-3
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009944067
Mathematics Subject Classification (2010): 82D75, 65C05
© Springer Science+Business Media B.V. 2010
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by
any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written
permission from the Publisher, with the exception of any material supplied specifically for the purpose
of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Cover design: deblik
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Ely Gelbard
November 6, 1924–April 18, 2002
Preface
Scheduled on the heels of the atomic century, the American Nuclear Society’s
international topical meeting on Mathematics and Computation seemed like an opportune moment in time to capture accomplishments in this area during the first
half-century of nuclear engineering. Held in a semi-secluded part of the city of
Gatlinburg, Tennessee, April 6–10, 2003, this gathering of prominent experts in the
field and young professionals embarking on exciting careers in what promises to
develop into a nuclear renaissance turned out to be the perfect venue for such a review. The conference was co-sponsored by three divisions of the American Nuclear
Society, namely the Mathematics and Computation Division, the Reactor Physics
Division, and the Radiation Protection and Shielding Division. The Technical Program of the conference revolved around the theme of its title, Nuclear Mathematical
& Computational Sciences: A Century in Review, A Century Anew. The Anew component comprised contributed papers organized in 25 regular and special sessions
on a broad variety of topics, plus a poster session and a panel session. The Review
component of the conference comprised the lecture series that grew into this book.
As Technical Program Chair (YYA) and Assistant General Chair (ES) of the conference, we decided to break with the traditional format of plenary sessions standard
in technical meetings and organize a lecture series that takes stock of the state of the
art in nuclear computational science at the turn of a new century. Thus the concept
of the lecture series that led to the chapters of this book was born.
One of the first experts we solicited to present a lecture in the series was the late
Dr. Ely Gelbard of Argonne National Laboratory at the time. In his gentle, but firm
and persuasive manner, he declined, preferring instead to participate as co-organizer
of the lecture series. We jumped on the opportunity recognizing his long-standing,
distinguished, and generous contributions to many subareas in nuclear computational science, and his many years of service in the field positioned him well to know
the major areas to cover in the lectures and to nominate world-renowned lecturers. In
short order the three of us came up with a slate of topics and a corresponding list of
lecturers. The response of the nominated lecturers was supportive and enthusiastic,
and by mid-Fall 2001 what later became known as the Gelbard Lecture Series was fully conceived, and a tentative idea of ultimately documenting the lecture
contents in book chapters was initiated. Our charge to the invited lecturers was to
provide an overview of the assigned topic aiming primarily at breadth of coverage,
with a sharp focus on its mathematical and computational aspects. Specifically we
requested that each author provide a historical perspective of the conception of their
topic as a major area of research in nuclear computational science, and to identify
landmarks for the evolution of the topic through the end of the twentieth century.
We further requested that the lecturers delineate the current state of the art in their
assigned topic and to project into the future by exposing perceived challenges and
opportunities for advancing the frontier of knowledge.
Our renowned lecturers did not disappoint and the lecture series was a smashing
success, thanks to their dedicated effort and professionalism. The lectures, scheduled to open each half-day of the conference, were well attended, with conference
participants packing the lecture hall on a consistent basis. Perhaps the only sour note
that tainted the lecture series was the passing on April 18, 2002, of Dr. Ely Gelbard
whose contributions to the success of the lecture series, and ultimately to the publication of this book, cannot be overstated. This great loss to the field of nuclear
computational science overshadowed the conference leading to various observances
of this sad event. The conference banquet included a memorial celebrating Dr. Gelbard’s life and his significant contributions to nuclear computational science, and
the lecture series was named after him in recognition of his involvement that propelled the series to success. Later, the contributing authors to this book agreed to
dedicate it to the memory of Dr. Ely Gelbard.
Unfortunately death struck again with the passing of Dr. Richard Hwang on
December 20, 2007, shortly after he completed the final revisions to his chapter
appearing in this book. We are grateful for Richard’s contribution to the success of
the lecture series, for the chapter he composed in this book, and for his dedication
to his research over the past 5 decades.
While the original list of topics envisioned in our early planning of the lecture
series has not changed, the reader will notice a few differences between the lectures
lineup and the chapters herein. First, Dr. Dan Cacuci, who owing to unforeseen circumstances was unable to deliver his lecture on Sensitivity and Uncertainty Analysis at the conference, has graciously composed the corresponding chapter for this book. Second, Dr. Kord Smith, who presented the lecture on Reactor Physics at the conference, was unable to compose the corresponding book chapter due to increased job-related responsibilities. We are grateful to Dr. Robert Roy for agreeing to undertake this burden and for the excellent job he did in composing his chapter on
Reactor Core Methods. Lastly, in composing Chapter 7, Elliott Whitesides recruited
Mike Westfall and Calvin Hopper to help with the composition.
This book would not have been possible without the support and active involvement of many people over the span of 6 years. Most of all we wish to thank the
authors who willingly and cheerfully accepted this additional burden to their normally hectic schedules. We are confident that the benefit to the field of nuclear
computational science and the gratitude of its practitioners, especially the young
scientists who will carry the torch into the future, will reward the authors' perseverance and patience during this long and arduous journey. We are grateful to
Argonne National Laboratory’s Dr. Roger Blomquist for composing the memorials
to Ely Gelbard and Richard Hwang, and for reviewing the final version of Richard’s
Chapter 5. The support and encouragement of Bernadette Kirk, Director of Oak
Ridge National Laboratory’s Radiation Safety Information Computational Center
(RSICC) and General Chair of the Gatlinburg conference, was invaluable to the
completion of this project. The technical help by Alice Rice of RSICC with bringing together the pieces of this book into a single volume is greatly appreciated. In
addition, we wish to acknowledge the tacit approval and support of our respective
institutions, The Pennsylvania State University and North Carolina State University
(YYA), and the Nuclear Energy Agency of the Organisation for Economic Cooperation and Development (ES).
June 2009
Yousry Y. Azmy
Enrico Sartori
Obituary Composed by Dr. Roger Blomquist
for Dr. Ely Meyer Gelbard
Ely Gelbard was born in New York City on November 6, 1924. He was the son
of immigrants. His undergraduate work was at the City College of New York, and
after World War II he earned his Ph.D. in physics from the University of Chicago.
During the war, he served in the US Army Air Corps as a radar technician. He was
a Senior Scientist at Argonne National Laboratory and a Fellow of the American
Nuclear Society.
Ely started his postgraduate career when the use of digital computers to solve the
neutron balance equations for fission reactor core design and analysis was just starting to receive wide application. At Bettis (1954–1972), he participated in the efforts
that put the numerical methods for the solution of the finite difference form of the
neutron transport equation on a firm mathematical basis, and he devised several approximation schemes that were suitable for numerical methods and also developed
efficient algorithms for their solution. While at Bettis, he earned international stature
in the field, authoring important papers in many variants of the solution procedures
(spherical harmonics, SN, synthetic methods, and Monte Carlo), including the book,
Monte Carlo Principles and Neutron Transport Problems, with J. Spanier. He was
the first physicist at Bettis to attain the rank of Consulting Scientist, and earned the
Atomic Energy Commission’s prestigious E. O. Lawrence Award.
Since 1972, when Dr. Gelbard joined Argonne National Laboratory, fast reactors
have been the focus of ANL’s reactor program, with its emphasis on more accurate
computation of the neutron spectrum. His work in this area produced fundamental
advances in the analysis of neutron streaming, collision probabilities, improvements
in Monte Carlo methods, and neutron diffusion and transport within the nodal approximation. He also brought improved iterative solution strategies to bear on the
equations of single-phase computational thermal-hydraulics analysis of passively
safe metal-cooled reactor systems. He was consulted by many at ANL, at other
labs, and at universities on a wide variety of technical issues, and invariably provided important insights.
Ely’s sustained record of high productivity of the highest-quality technical work
attracted a series of bright and vigorous visiting scholars and students whose participation magnified his work. He excelled at distilling complex technical issues to
their essence, then performing the relevant mathematical analysis and, finally, computationally confirming the analysis. He was always careful, honest, and thoroughly
scrupulous in his work. He earned the ANS Special Award for Computer Methods
for the Solution of Problems in Reactor Technology, the ANS Mathematics and
Computations Division Distinguished Service Award, the ANS Reactor Physics
Division Eugene Wigner Award, and the University of Chicago Distinguished Performance Award.
In spite of his great stature and many accomplishments, Ely was a mild and
modest gentleman who always gave full credit to others’ work, and was very
approachable and an excellent listener. His technical questions at meetings were insightful, probing, and gentle. He also pursued the understanding of others’ points
of view in personal and political matters with both intellect and sensitivity. His
restaurant adventures at meetings and other venues have provided a rich array
of gastronomic experiences and many fond memories to his many friends in our
profession.
The Gelbard Review Lecture Series
Conducted during the American Nuclear Society’s Conference
Nuclear Mathematical and Computational Sciences:
A Century in Review, A Century Anew
Gatlinburg, Tennessee, April 6–10, 2003
Back row: Richard Hwang, Elmer Lewis, Kord Smith, Enrico Sartori
Front row: Yousry Azmy, Elliott Whitesides, Jerry Spanier, Jack Dorning,
Ed Larsen
Contents

Preface ................................................................. vii

Obituary Composed by Dr. Roger Blomquist
for Dr. Ely Meyer Gelbard ............................................... xi

1 Advances in Discrete-Ordinates Methodology ........................... 1
  Edward W. Larsen and Jim E. Morel

2 Second-Order Neutron Transport Methods ............................... 85
  E.E. Lewis

3 Monte Carlo Methods .................................................. 117
  Jerome Spanier

4 Reactor Core Methods ................................................. 167
  Robert Roy

5 Resonance Theory in Reactor Applications ............................. 217
  R.N. Hwang

6 Sensitivity and Uncertainty Analysis of Models and Data .............. 291
  Dan Gabriel Cacuci

7 Criticality Safety Methods ........................................... 355
  G.E. Whitesides, R.M. Westfall, and C.M. Hopper

8 Nuclear Reactor Kinetics: 1934–1999 and Beyond ....................... 375
  Jack Dorning

Index .................................................................. 459
Chapter 1
Advances in Discrete-Ordinates Methodology
Edward W. Larsen and Jim E. Morel
1.1 Introduction
In 1968, Bengt Carlson and Kaye Lathrop published a comprehensive review on the
state of the art in discrete-ordinates (SN) calculations [10]. At that time, SN methodology existed primarily for reactor physics simulations. By today's standards, those
capabilities were limited, due to the less-developed theoretical state of SN methods
and the slower and smaller computers that were then available. In this chapter, we
review some of the major advances in SN methodology that have occurred since
1968. These advances, combined with the faster speeds and larger memories of today’s computers, enable today’s SN codes to simulate problems of much greater
complexity, realism, and physical variety. Since 1968, several books and reviews on
general numerical methods for SN simulations have been published [32, 46, 71], but
none of these covers the advanced work done during the past 20 years.
The specific purpose of this chapter is to describe how the field of SN calculations
has matured through the lens of three important physical problems that can be simulated today but could not be realistically simulated in 1968. By discussing these
problems and the methods developed to overcome their calculational difficulties,
we hope to (i) show how dramatically the field of SN simulations of the transport
equation has advanced and (ii) provide an introduction to the new algorithmic techniques that have enabled these advances.
An outline of the remainder of this review follows. In Section 1.2, we briefly
introduce the transport equation and discuss its basic temporal (implicit), energy
(multigroup), directional (SN), and spatial (finite-difference) discretizations, together with iterative solution procedures – as of 1968. The purpose of this section
is to establish notation and set the stage for the later sections, which describe more
recent developments.
E.W. Larsen
Department of Nuclear Engineering and Radiological Sciences, University of Michigan,
Ann Arbor, MI 48109-2104, USA
e-mail: edlarsen@umich.edu

J.E. Morel
Department of Nuclear Engineering, Texas A&M University, College Station, Texas, USA
e-mail: morel@tamu.edu
Section 1.3 discusses three important physical problems that could not be
simulated in 1968 but can be realistically simulated today: thermal radiation transport, charged-particle transport, and oil-well logging tool design.
In Section 1.4, we discuss advanced spatial discretizations (characteristic methods, discontinuous finite-element methods [DFEMs], and nodal methods) and the
asymptotic thick diffusion limit (a technique to predict the validity of SN spatial
discretizations for diffusive systems with optically thick spatial cells). Section 1.5
describes advances in discretizations of the angular derivatives associated with
curvilinear geometries and treatments of anisotropic scattering. Section 1.6 covers
advances in angular and energy discretizations for charged particles; Section 1.7 describes advances in time discretizations.
In Section 1.8, we discuss major advances in iteration acceleration: diffusion-synthetic acceleration (DSA), linear multifrequency-grey acceleration for thermal
radiation transport, fission source acceleration for time-dependent calculations, and
upscatter acceleration. Section 1.9 outlines the recent application of preconditioned
Krylov methods.
Section 1.10 concludes with a brief discussion of challenges for the future: robust
finite-element methods on nonorthogonal grids, positive and monotone methods,
efficient parallel sweep algorithms for unstructured grids, further development of
Krylov methods for solving the SN equations, methods for charged-particle calculations with pencil-beam sources, Galerkin quadrature with positive generalized
weights, and ray-effect mitigation.
1.2 Basic Concepts
The physical process discussed in this chapter is the interaction of radiation with
matter (radiation transport, or particle transport). The archetypical equation that describes these interactions is the linear Boltzmann equation (LBE) [2, 3, 7, 13]:
\[
\frac{1}{v}\frac{\partial\psi(\mathbf{r},\mathbf{\Omega},E,t)}{\partial t}
+\mathbf{\Omega}\cdot\nabla\psi(\mathbf{r},\mathbf{\Omega},E,t)
+\Sigma_t(\mathbf{r},E)\,\psi(\mathbf{r},\mathbf{\Omega},E,t)
=\int_0^{\infty}\!\!\int_{4\pi}\Sigma_s(\mathbf{r},\mathbf{\Omega}'\cdot\mathbf{\Omega},E'\to E)\,\psi(\mathbf{r},\mathbf{\Omega}',E',t)\,d\Omega'\,dE'
+\frac{\chi(\mathbf{r},E)}{4\pi}\int_0^{\infty}\!\!\int_{4\pi}\nu\Sigma_f(\mathbf{r},E')\,\psi(\mathbf{r},\mathbf{\Omega}',E',t)\,d\Omega'\,dE'
+\frac{Q(\mathbf{r},E,t)}{4\pi}. \qquad (1.1)
\]
In full generality, this equation has seven independent variables: three spatial variables (r), two direction-of-flight (or angular) variables (Ω), energy (E), and time (t).
Particle transport problems are difficult and costly to simulate because, in part, of the
high dimensionality of phase space. In this section, we discuss the basic numerical
methods used to solve Eq. (1.1) in the principal large computer codes of the 1960s
[4, 5, 10, 14]. We assume that the reader understands the physical meaning and basic
mathematical properties of each of the terms in Eq. (1.1), and we have used notation
that is broadly standard. The discussion in this section is terse; we refer the reader
to standard texts [13, 71] for details.
The LBE given in Eq. (1.1) describes neutron transport with scattering and fission
interactions. Variations of this equation primarily involve the types of interactions
that are included. For instance, a gamma-ray transport equation would not have a
fission term. Systems of coupled transport equations, each similar to Eq. (1.1), are
required to describe the coupled transport of multiple types of particles, e.g., coupled neutron gamma-ray transport in which neutrons interact with nuclei to create
gamma-rays and gamma-rays interact with nuclei to create neutrons. The principal
computational difficulties associated with Eq. (1.1) are common to essentially all
variations of this equation that are associated with different physical applications.
In this chapter, we describe numerical methods in terms of Eq. (1.1), or simpler versions of that equation whenever possible. We consider variations of Eq. (1.1) that
correspond to different physical applications only when necessary.
To begin, we mention a few technical details. First, the differential scattering
cross section is commonly written as a Legendre polynomial expansion:
\[
\Sigma_s(\mathbf{r},\mathbf{\Omega}'\cdot\mathbf{\Omega},E'\to E)
=\sum_{m=0}^{\infty}\frac{2m+1}{4\pi}\,P_m(\mathbf{\Omega}'\cdot\mathbf{\Omega})\,\Sigma_{s,m}(\mathbf{r},E'\to E). \qquad (1.2)
\]
The Legendre moments Σ_{s,m} are typically calculated and stored for each material region. Also, initial and boundary conditions must be specified for Eq. (1.1). If V denotes the physical system and t = 0 is the initial time, then Eq. (1.1) holds for all r ∈ V, Ω ∈ 4π, 0 < E < ∞, and t > 0. At t = 0, ψ must be fully specified in V:
\[
\psi(\mathbf{r},\mathbf{\Omega},E,0)=\psi^{i}(\mathbf{r},\mathbf{\Omega},E),\quad \mathbf{r}\in V,\ \mathbf{\Omega}\in 4\pi,\ 0<E<\infty. \qquad (1.3)
\]
Also, ψ must be specified on the boundary ∂V for directions of flight pointing into V:
\[
\psi(\mathbf{r},\mathbf{\Omega},E,t)=\psi^{b}(\mathbf{r},\mathbf{\Omega},E,t),\quad \mathbf{r}\in\partial V,\ \mathbf{\Omega}\cdot\mathbf{n}<0,\ 0<E<\infty,\ 0<t. \qquad (1.4)
\]
Here, n is the unit outer normal vector at the boundary point r ∈ ∂V.
Many important algorithmic concepts can be explained most easily for problems with planar-geometry symmetry, in which the geometry and solution depend on only one spatial variable x and one angular variable μ = Ω · i. (The unit vector i points in the positive x-direction.) For a planar-geometry system 0 ≤ x ≤ X, Eq. (1.1) simplifies to
\[
\frac{1}{v}\frac{\partial\psi(x,\mu,E,t)}{\partial t}
+\mu\frac{\partial\psi(x,\mu,E,t)}{\partial x}
+\Sigma_t(x,E)\,\psi(x,\mu,E,t)
=\int_0^{\infty}\!\!\int_{-1}^{1}\Sigma_s(x,\mu',\mu,E'\to E)\,\psi(x,\mu',E',t)\,d\mu'\,dE'
+\frac{\chi(x,E)}{2}\int_0^{\infty}\!\!\int_{-1}^{1}\nu\Sigma_f(x,E')\,\psi(x,\mu',E',t)\,d\mu'\,dE'
+\frac{Q(x,E,t)}{2}, \qquad (1.5)
\]
where
\[
\Sigma_s(x,\mu',\mu,E'\to E)=\sum_{m=0}^{\infty}\frac{2m+1}{2}\,P_m(\mu)\,P_m(\mu')\,\Sigma_{s,m}(x,E'\to E). \qquad (1.6)
\]
The initial condition for Eq. (1.5) is
\[
\psi(x,\mu,E,0)=\psi^{i}(x,\mu,E),\quad 0<x<X,\ -1\le\mu\le 1,\ 0<E<\infty, \qquad (1.7)
\]
and the boundary conditions are
\[
\psi(0,\mu,E,t)=\psi^{l}(\mu,E,t),\quad 0<\mu\le 1,\ 0<E<\infty,\ 0<t, \qquad (1.8a)
\]
\[
\psi(X,\mu,E,t)=\psi^{r}(\mu,E,t),\quad -1\le\mu<0,\ 0<E<\infty,\ 0<t. \qquad (1.8b)
\]
Because ψ = vN, where v is the particle speed and N is the particle density, ψ physically must be non-negative. If the cross sections, inhomogeneous source, initial conditions, and boundary conditions in Eqs. (1.5) through (1.8) are all non-negative (as they must be physically), then it can be shown that the solution ψ of these equations is non-negative. However, the positivity of ψ does not necessarily hold when approximations (discretizations) of the LBE are imposed. A desirable feature of a discretization for the LBE is that the resulting approximate solution should be positive – or nearly so.
We now sketch the basic discretization and solution methods for Eqs. (1.5)
through (1.8), which existed in computer codes in the late 1960s. We begin with
the discretization of time.
The most widely used time-discretization technique for transport problems, even
today, is implicit time differencing. For a time interval t_{k−1/2} < t < t_{k+1/2}, and with the definition ψ^{k+1/2}(x,μ,E) = ψ(x,μ,E,t_{k+1/2}), Eq. (1.5) with implicit time differencing is given as follows:
\[
\mu\frac{\partial\psi^{k+1/2}(x,\mu,E)}{\partial x}
+\left[\Sigma_t(x,E)+\frac{1}{v\,\Delta t_k}\right]\psi^{k+1/2}(x,\mu,E)
=\int_0^{\infty}\!\!\int_{-1}^{1}\Sigma_s(x,\mu',\mu,E'\to E)\,\psi^{k+1/2}(x,\mu',E')\,d\mu'\,dE'
+\frac{\chi(x,E)}{2}\int_0^{\infty}\!\!\int_{-1}^{1}\nu\Sigma_f(x,E')\,\psi^{k+1/2}(x,\mu',E')\,d\mu'\,dE'
+\frac{Q^{k+1/2}(x,E)}{2}+\frac{\psi^{k-1/2}(x,\mu,E)}{v\,\Delta t_k}. \qquad (1.9)
\]
This equation can be obtained by integrating Eq. (1.5) over t_{k−1/2} < t < t_{k+1/2}, dividing by Δt_k = t_{k+1/2} − t_{k−1/2}, and approximating
\[
\frac{1}{\Delta t_k}\int_{t_{k-1/2}}^{t_{k+1/2}}\psi(x,\mu,E,t)\,dt \approx \psi^{k+1/2}(x,\mu,E). \qquad (1.10)
\]
The definition of Q^{k+1/2} in Eq. (1.9) follows Eq. (1.10). The boundary conditions
\[
\psi^{k+1/2}(0,\mu,E)=\psi^{l,k+1/2}(\mu,E),\quad 0<\mu\le 1,\ 0<E<\infty, \qquad (1.11a)
\]
\[
\psi^{k+1/2}(X,\mu,E)=\psi^{r,k+1/2}(\mu,E),\quad -1\le\mu<0,\ 0<E<\infty, \qquad (1.11b)
\]
for ψ^{k+1/2} are obtained similarly.
Implicit time discretization yields a steady-state LBE to be solved within each
time step. The angular flux at the end of the previous time step appears as a source in
the right side of the (steady-state) LBE. Since the solution of the LBE with a positive
source is guaranteed to be positive, it follows that implicit time differencing – in the
absence of other truncation errors – is guaranteed to yield a positive solution. (This
is not the case for other time discretizations.)
However, all discretization methods contain truncation errors that degrade the
solution. For implicit time differencing, these errors are first-order:
\[
\psi(x,\mu,E,t_{k+1/2})=\psi^{k+1/2}(x,\mu,E)+O(\Delta t). \qquad (1.12)
\]
Because Eq. (1.9) is equivalent to a steady-state equation, the same solution techniques can be used for both time-dependent and steady-state calculations. This is
generally true even for time discretizations that are more advanced than the fully implicit discretization. Thus, we shall, henceforth, only discuss steady-state problems.
Next, we consider the multigroup discretization of the energy variable. This approximation begins with the specification of an energy grid E_min = E_G < E_{G−1} < ⋯ < E_g < E_{g−1} < ⋯ < E_1 < E_0 = E_max, which defines the boundaries of energy groups (or, more simply, groups), the gth group being the interval E_g < E < E_{g−1}. On each group, the cross sections are represented as constants: Σ_{t,g}(x), Σ_{s,g′→g}(x, μ′, μ), νΣ_{f,g}(x), and χ_g(x). (In effect, the continuous-energy cross sections are approximated as histograms in E.) Doing this, integrating the (steady-state) Eq. (1.5) over the gth group, and defining the gth group flux as
\[
\psi_g(x,\mu)=\int_{E_g}^{E_{g-1}}\psi(x,\mu,E)\,dE, \qquad (1.13)
\]
we obtain the coupled system of multigroup transport equations:
\[
\mu\frac{\partial\psi_g(x,\mu)}{\partial x}+\Sigma_{t,g}(x)\,\psi_g(x,\mu)
=\sum_{g'=1}^{G}\int_{-1}^{1}\Sigma_{s,g'\to g}(x,\mu',\mu)\,\psi_{g'}(x,\mu')\,d\mu'
+\frac{\chi_g(x)}{2}\sum_{g'=1}^{G}\int_{-1}^{1}\nu\Sigma_{f,g'}(x)\,\psi_{g'}(x,\mu')\,d\mu'
+\frac{Q_g(x)}{2},
\]
\[
0<x<X,\quad -1\le\mu\le 1,\quad 1\le g\le G. \qquad (1.14)
\]
The multigroup boundary conditions are likewise obtained by integrating Eq. (1.8)
over the gth group:
\[
\psi_g(0,\mu)=\psi_g^{l}(\mu),\quad 0<\mu\le 1,\ 1\le g\le G, \qquad (1.15a)
\]
\[
\psi_g(X,\mu)=\psi_g^{r}(\mu),\quad -1\le\mu<0,\ 1\le g\le G. \qquad (1.15b)
\]
This simplified derivation of the multigroup equations does not specify the multigroup cross sections, e.g., Σ_{s,g′→g}. This topic is routinely covered in nuclear
engineering textbooks and will not be discussed here. However, the above derivation
shows that if the physical cross sections are histograms in E, then the multigroup
transport equations are exact.
The multigroup technique is one of the most successful approximations in computational transport theory. In practice, much effort goes into the definition of the
energy groups and multigroup cross sections: the more carefully these are chosen,
the fewer the number of groups required, and the less costly is the calculation. The
optimal number of groups and choice of the group structure depend on the specific
application. For some light water reactor calculations, only two energy groups are
required to achieve sufficient accuracy. However, other problems – such as neutron
and gamma-ray transport in shields, or charged-particle transport – can require well
over 100 energy groups.
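As a concrete (purely illustrative) example of producing such group constants, the sketch below collapses an assumed continuous-energy cross section to multigroup values by flux-weighted averaging over each group; the 1/E weighting spectrum, cross-section shape, and group boundaries are all placeholder assumptions:

```python
import numpy as np
from scipy.integrate import quad

def sigma_t(E):   # placeholder continuous-energy total cross section (1/cm)
    return 1.0 + 5.0 / np.sqrt(E)

def weight(E):    # placeholder weighting spectrum, here ~1/E
    return 1.0 / E

# Group boundaries E_G < ... < E_0 (eV), listed here from high to low.
edges = [2.0e7, 1.0e5, 1.0, 1.0e-5]

sigma_g = []
for E_hi, E_lo in zip(edges[:-1], edges[1:]):
    num, _ = quad(lambda E: sigma_t(E) * weight(E), E_lo, E_hi)
    den, _ = quad(weight, E_lo, E_hi)
    sigma_g.append(num / den)   # flux-weighted group constant Sigma_{t,g}

print(sigma_g)  # one value per group, highest-energy group first
```

In practice the weighting spectrum itself is problem-dependent, which is precisely why careful group-structure selection reduces the number of groups required.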
Equation (1.14) can also be written as
\[
\mu\frac{\partial\psi_g(x,\mu)}{\partial x}+\Sigma_{t,g}(x)\,\psi_g(x,\mu)
=\int_{-1}^{1}\Sigma_{s,g}(x,\mu',\mu)\,\psi_g(x,\mu')\,d\mu'
+S_g(x,\mu)+\frac{Q_g(x)}{2}, \qquad (1.16)
\]
where Σ_{s,g} = Σ_{s,g→g} is the within-group scattering cross section for group g, and
\[
S_g(x,\mu)=\sum_{g'\ne g}\int_{-1}^{1}\Sigma_{s,g'\to g}(x,\mu',\mu)\,\psi_{g'}(x,\mu')\,d\mu'
+\frac{\chi_g(x)}{2}\sum_{g'=1}^{G}\int_{-1}^{1}\nu\Sigma_{f,g'}(x)\,\psi_{g'}(x,\mu')\,d\mu' \qquad (1.17)
\]
is the scattering plus fission source to group g. Thus, the multigroup equations can
be viewed as a system of one-group equations that are coupled through the scattering
and fission sources.
Like the implicit time discretization of the transport equation, the multigroup
approximation is inherently positive: for any number of groups G, if the multigroup
cross sections, source Q_g, and boundary conditions ψ_g^l and ψ_g^r are non-negative, then the solution ψ_g of the multigroup Eqs. (1.14) and (1.15) is non-negative.
Next, we discuss the discrete-ordinates (SN) discretization of the angular variable. This begins with the specification of N + 1 discrete angles μ_{n+1/2}, satisfying −1 = μ_{1/2} < μ_{3/2} < ⋯ < μ_{N+1/2} = 1, which define the boundaries of angular bins, the nth angular bin being the interval μ_{n−1/2} < μ < μ_{n+1/2}. The width of this bin is w_n = μ_{n+1/2} − μ_{n−1/2}. At a point within each nth angular bin, a discrete ordinate μ_n is specified. The set {(μ_n, w_n) | 1 ≤ n ≤ N} is an angular quadrature set of order N. In practice, one-dimensional (1-D) quadrature sets for Eq. (1.16) have an even number N of discrete ordinates and angular bins, which are symmetric about μ = 0: μ_N = −μ_1, w_N = w_1, etc. (The direction μ = 0 is always chosen as the edge of an angular bin, never as a discrete ordinate.) [15]
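For illustration (an aside; the bin-edge remark above is not enforced here), a standard even-order Gauss–Legendre set, of the kind noted later in this chapter as near-optimal for most 1-D problems, can be generated directly:

```python
import numpy as np

def gauss_legendre_set(N):
    """Return an order-N (N even) angular quadrature set {(mu_n, w_n)}
    on -1 <= mu <= 1, symmetric about mu = 0."""
    mu, w = np.polynomial.legendre.leggauss(N)
    return mu, w

mu, w = gauss_legendre_set(8)
print(mu)              # ordinates, symmetric about 0
print(w.sum())         # weights sum to 2 = integral of d(mu) over [-1, 1]
print((mu * w).sum())  # odd moment vanishes by symmetry (to roundoff)
```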
Many numerical methods can be described in terms of a simplified one-group version of Eq. (1.16). This simplified equation is obtained from Eq. (1.16) by absorbing S_g into Q_g, assuming Q_g to be a known source, and dropping the group subscript g. The SN approximation to this equation is
\[
\mu_n\frac{d\psi_n(x)}{dx}+\Sigma_t(x)\,\psi_n(x)
=\sum_{n'=1}^{N}\Sigma_s(x,\mu_n,\mu_{n'})\,\psi_{n'}(x)\,w_{n'}+\frac{Q(x)}{2},
\quad 0<x<X,\ 1\le n\le N, \qquad (1.18)
\]
with boundary conditions:
\[
\psi_n(0)=\psi_n^{l},\quad 0<\mu_n\le 1, \qquad (1.19a)
\]
\[
\psi_n(X)=\psi_n^{r},\quad -1\le\mu_n<0. \qquad (1.19b)
\]
This approximation can be obtained by integrating the one-group version of Eq. (1.16) over the nth angular bin, dividing by the width of the bin, and making simple approximations. The above SN equations are a coupled system of first-order differential equations in x. The unknowns in this system are the angular fluxes ψ_n(x) for the nth angular bin. In the absence of scattering (Σ_s = 0), the SN equations become uncoupled; particles cannot "jump" from one angular bin to another. (However, in curvilinear-geometry transport equations, angular "jumping" does occur, even in the absence of scattering.)
In two-dimensional (2-D) and three-dimensional (3-D) problems, two angular variables are needed to define the direction-of-flight variable Ω on the unit sphere. In this case, a multidimensional angular quadrature set consists of a set of discrete directions (discrete ordinates) Ω_n, one for each angular bin, and a corresponding set of angular weights w_n that define the area (on the unit sphere) of each bin.
Like implicit time differencing and the multigroup approximation, the SN approximation – applied to a Cartesian-geometry equation such as Eq. (1.16) – is inherently positive: if the cross sections and sources in Eqs. (1.18) through (1.19) are non-negative, then the SN angular fluxes are non-negative.
The positivity of the SN approximation for 1-D planar and other Cartesian geometries does not automatically apply in curvilinear geometries. For example, let us consider 1-D spherical geometry problems, in which the cross sections depend only on the radial variable r = |r| = (x² + y² + z²)^{1/2} and the sources depend spatially on r and angularly on
\[
\mu=\frac{\mathbf{\Omega}\cdot\mathbf{r}}{r}. \qquad (1.20)
\]
In such problems, the angular flux depends spatially only on r and angularly only on μ, and the appropriate (one-group) 1-D spherical geometry transport equation is
\[
\mu\frac{\partial\psi(r,\mu)}{\partial r}
+\frac{1}{r}\frac{\partial}{\partial\mu}\left[(1-\mu^2)\,\psi(r,\mu)\right]
+\Sigma_t(r)\,\psi(r,\mu)
=\int_{-1}^{1}\Sigma_s(r,\mu,\mu')\,\psi(r,\mu')\,d\mu'+\frac{Q(r)}{2}. \qquad (1.21)
\]
Now, an angular derivative term occurs on the left side of the LBE. This term is present because, by Eq. (1.20), the spherical geometry variable μ is space-dependent; as a particle streams through the system, its angular variable μ changes continuously. The new complication is that a successful μ-discretization must accurately treat the integral scattering term on the right side of the equation and the angular derivative term on the left side.
We now describe the original SN approximation to Eq. (1.21). First, we multiply by r² and rearrange to obtain the conservative form of the LBE:
\[
\mu\frac{\partial}{\partial r}\left[r^2\psi(r,\mu)\right]
+r\frac{\partial}{\partial\mu}\left[(1-\mu^2)\,\psi(r,\mu)\right]
+\Sigma_t(r)\,r^2\psi(r,\mu)
=\int_{-1}^{1}\Sigma_s(r,\mu,\mu')\,r^2\psi(r,\mu')\,d\mu'+\frac{r^2 Q(r)}{2}. \qquad (1.22)
\]
(Note that by integrating this equation over −1 ≤ μ ≤ 1, the angular derivative term automatically vanishes. This fact implies that the angular redistribution of particles as the particles freely stream is conservative – it neither adds nor subtracts particles from the system.) We integrate Eq. (1.22) over the nth angular bin μ_{n−1/2} < μ < μ_{n+1/2} and divide by the bin width w_n. After making simple approximations, we obtain
\[
\mu_n\frac{\partial}{\partial r}\left[r^2\psi_n(r)\right]
+\frac{r}{w_n}\left[\beta_{n+1/2}\,\psi_{n+1/2}(r)-\beta_{n-1/2}\,\psi_{n-1/2}(r)\right]
+\Sigma_t(r)\,r^2\psi_n(r)
=\sum_{n'=1}^{N}\Sigma_s(r,\mu_n,\mu_{n'})\,r^2\psi_{n'}(r)\,w_{n'}+\frac{r^2 Q(r)}{2}, \qquad (1.23a)
\]
where
\[
\beta_{n+1/2}=-2\sum_{n'=1}^{n}\mu_{n'}w_{n'} \qquad (1.23b)
\]
is defined so that Eq. (1.23a) admits a constant solution for the infinite-medium configuration in the presence of a constant source. In Eq. (1.23a), extra unknowns arise – the angular fluxes ψ_{n±1/2} at the edges of the angular bins. Originally, these unknowns were treated by solving Eq. (1.23) in combination with the Diamond-Difference (DD) approximation in angle
\[
\psi_n(r)=\frac{1}{2}\left[\psi_{n-1/2}(r)+\psi_{n+1/2}(r)\right], \qquad (1.24)
\]
and the starting-direction calculation for ψ_{1/2}, which is obtained directly from Eq. (1.21) by setting μ = −1 and assuming that ∂ψ/∂μ at μ = −1 is bounded:
\[
-\frac{\partial\psi_{1/2}(r)}{\partial r}+\Sigma_t(r)\,\psi_{1/2}(r)
=\sum_{n'=1}^{N}\Sigma_s(r,-1,\mu_{n'})\,\psi_{n'}(r)\,w_{n'}+\frac{Q(r)}{2}. \qquad (1.25)
\]
Equations (1.23) through (1.25) hold in (e.g.) a finite sphere 0 < r < R, with boundary conditions
\[
\psi_n(R)=\psi_n^{b},\quad \mu_n<0, \qquad (1.26a)
\]
\[
\psi_{1/2}(R)=\psi_{1/2}^{b}, \qquad (1.26b)
\]
which specify the fluxes entering the sphere through its outer boundary r = R.
It is apparent that for curvilinear geometries, in which angular derivatives occur, the SN approximation is more complicated than in Cartesian geometries. As a
result, there are issues of accuracy in curvilinear geometries that do not occur in
Cartesian geometries. We will discuss some of these issues later. Although the 1-D
spherical geometry SN equations do not automatically yield positive angular fluxes,
it is rare that these equations yield negative solutions given positive sources and
cross sections, provided that the variation of the solution is resolved by the computational mesh.
Because the cost of a calculation increases with the quadrature order N , there is
an incentive to use angular quadrature sets with the fewest number of angles that
yield the desired accuracy. It has long been known that, for most 1-D problems,
the even-order Gauss–Legendre quadrature sets are optimal (or nearly so). However, for multidimensional problems, the choice of angular quadrature sets is more
ambiguous.
Next, we discuss spatial discretizations for the one-group SN Eqs. (1.18) and (1.19). As with the previously discussed variables, we must specify points {0 = x_{1/2} < x_{3/2} < ⋯ < x_{j+1/2} < ⋯ < x_{J+1/2} = X} that define the edges of spatial cells, the jth spatial cell being the interval x_{j−1/2} < x < x_{j+1/2}. The cell width is Δx_j = x_{j+1/2} − x_{j−1/2}. Within each (jth) spatial cell, the cross sections are constant: Σ_t(x) = Σ_{t,j}, etc. The cell-edge angular fluxes are defined as ψ_{n,j±1/2} = ψ_n(x_{j±1/2}), and the cell-average angular fluxes are defined as
\[
\psi_{n,j}=\frac{1}{\Delta x_j}\int_{x_{j-1/2}}^{x_{j+1/2}}\psi_n(x)\,dx. \qquad (1.27)
\]
Integrating Eq. (1.18) over the jth spatial cell and dividing by Δx_j, we obtain exactly the balance equation:
\[
\frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2}-\psi_{n,j-1/2}\right)+\Sigma_{t,j}\,\psi_{n,j}
=\sum_{n'=1}^{N}\Sigma_{s,j}(\mu_n,\mu_{n'})\,\psi_{n',j}\,w_{n'}+\frac{Q_j}{2}, \qquad (1.28)
\]
which relates the cell-edge and cell-average angular fluxes, and holds for each spatial cell 1 ≤ j ≤ J and each direction 1 ≤ n ≤ N. These equations and the boundary conditions
\[
\psi_{n,1/2}=\psi_n^{l},\quad 0<\mu_n<1, \qquad (1.29a)
\]
\[
\psi_{n,J+1/2}=\psi_n^{r},\quad -1<\mu_n<0, \qquad (1.29b)
\]
yield a system of N(J + 1) equations for the N(2J + 1) cell-average and cell-edge unknown fluxes. To obtain a discrete system with the same number of equations as unknowns, NJ extra equations must be formulated.
One way to formulate the extra equations is to specify finite-difference relationships between the cell-edge and cell-average fluxes within each cell. If these relationships only directly couple fluxes traveling in the same direction, and if the infinite-medium spatially constant solutions are to be preserved, then one is led to algebraic relationships of the form
\[
\psi_{n,j}=\frac{1+\alpha_{n,j}}{2}\,\psi_{n,j+1/2}+\frac{1-\alpha_{n,j}}{2}\,\psi_{n,j-1/2},
\quad 1\le n\le N,\ 1\le j\le J, \qquad (1.30)
\]
where the constants α_{n,j} are constrained by the goals of accuracy, stability, and symmetry about μ = 0. The simplest choice, α_{n,j} = 0, yields the diamond-difference scheme.
The discrete system (1.28) through (1.30) is the basis for the earliest planar-geometry SN codes. For the diamond-difference scheme, the numerical solution is strictly positive if the widths of the spatial cells are sufficiently small. Unfortunately, as the cell widths increase, diamond-difference solutions can become negative, and unphysical oscillatory behavior can occur. In multidimensional problems, positive diamond-difference solutions are never guaranteed; oscillatory behavior is always possible, for any spatial grid.
These issues led to other choices of the constants α_{n,j}, and nonlinear (negative flux fixup) modifications to Eq. (1.30). In practice, the diamond-difference scheme (sometimes with negative flux fixup) was found to generally work well for relatively "diffusive" and weakly absorbing reactor cores, while Weighted-Diamond (WD) techniques (α_{n,j} ≠ 0) generally worked well in more strongly absorbing reactor shields.
All of the independent variables in the LBE – time, energy, angle, and space –
have now been discretized. Typically, a very large system of linear algebraic
equations has been generated. How does one solve this system? In principle, one
can write the entire discrete system as a single matrix equation Aψ = q and invert
the matrix A. However, A is usually so large that this is impractical; thus, iterative techniques that avoid the explicit construction and direct inversion of A must
be used.
The most fundamental iterative approach, called Source Iteration (SI), is based
on a simple concept. Let us suppose that the entire right side of Eq. (1.28) is known.
Then for μ_n > 0, it is easy to show that by starting at the leftmost (j = 1) spatial cell and recursively marching from left to right (in the direction of particle flow) across the system to the rightmost cell, one can explicitly solve Eqs. (1.28) through (1.30) for all cell-edge and cell-average unknowns that correspond to μ_n > 0. Likewise, by marching across the system from right to left, one can solve Eqs. (1.28) through (1.30) for all cell-edge and cell-average unknowns that correspond to μ_n < 0.
In the absence of scattering, this marching (or transport sweep) procedure yields
an explicit noniterative solution of the transport problem. If the physical problem
includes scattering, then the SI scheme can be implemented. SI begins with an estimate (typically, zero) for the scattering source on the right side of Eq. (1.28). Using
this estimated source, Eqs. (1.28) through (1.30) are solved by a transport sweep.
The resulting estimates of the cell-average angular fluxes are introduced into the
scattering source, and the next iteration is ready to commence. These “source iterations” are repeated until the scalar fluxes converge.
Mathematically, the SI scheme for Eqs. (1.28) through (1.30) is defined by
\[
\frac{\mu_n}{\Delta x_j}\left(\psi^{(\ell-1/2)}_{n,j+1/2}-\psi^{(\ell-1/2)}_{n,j-1/2}\right)
+\Sigma_{t,j}\,\psi^{(\ell-1/2)}_{n,j}
=\sum_{n'=1}^{N}\Sigma_{s,j}(\mu_n,\mu_{n'})\,\psi^{(\ell-1)}_{n',j}\,w_{n'}+\frac{Q_j}{2}, \qquad (1.31a)
\]
\[
\psi^{(\ell-1/2)}_{n,j}=\frac{1+\alpha_{n,j}}{2}\,\psi^{(\ell-1/2)}_{n,j+1/2}
+\frac{1-\alpha_{n,j}}{2}\,\psi^{(\ell-1/2)}_{n,j-1/2}, \qquad (1.31b)
\]
\[
\psi^{(\ell-1/2)}_{n,1/2}=\psi_n^{l},\quad 0<\mu_n<1, \qquad (1.31c)
\]
\[
\psi^{(\ell-1/2)}_{n,J+1/2}=\psi_n^{r},\quad -1<\mu_n<0, \qquad (1.31d)
\]
with the update equation
\[
\psi^{(\ell)}_{n,j}=\psi^{(\ell-1/2)}_{n,j}. \qquad (1.32)
\]
Here, ℓ is the iteration index. The ℓth iteration begins with cell-average angular flux estimates ψ^{(ℓ−1)}_{n,j}, which are introduced into the scattering source. Equations (1.31) are then solved by a transport sweep, and Eq. (1.32) defines the new cell-average angular flux estimates to be those that were just calculated in the sweep.
If the initial scattering source is taken to be zero (ψ^{(0)}_{n,j} = 0), the SI angular flux estimate after ℓ sweeps is, physically, the angular flux due to particles that have experienced at most ℓ − 1 scattering events. If the physical system has significant absorption or is optically thin (leaky), particles will generally have short histories, and the SI scheme will converge rapidly. However, if the physical system is optically thick with weak absorption (i.e., diffusive), then most particles will experience long histories, and the SI scheme will converge slowly.
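To make the sweep-plus-iteration structure concrete, here is a minimal sketch (illustrative only; all problem data are placeholder assumptions) of source iteration with diamond differencing (α_{n,j} = 0) for a one-group, isotropically scattering slab with vacuum boundaries:

```python
import numpy as np

# Placeholder one-group slab problem: isotropic scattering, vacuum boundaries.
J, N = 100, 8                      # spatial cells, quadrature order
X = 10.0
dx = X / J
sigma_t, sigma_s = 1.0, 0.9        # uniform cross sections (1/cm)
Q = np.ones(J)                     # isotropic inhomogeneous source
mu, w = np.polynomial.legendre.leggauss(N)

phi = np.zeros(J)                  # scalar flux (zero initial scattering source)
for iteration in range(1000):
    S = 0.5 * (sigma_s * phi + Q)  # isotropic source per unit mu; RHS of (1.31a)
    phi_new = np.zeros(J)
    for n in range(N):
        psi_in = 0.0               # vacuum boundaries, Eqs. (1.31c,d)
        # Transport sweep in the direction of particle flow; DD closure (1.31b)
        # gives psi_avg = (S + 2c*psi_in)/(sigma_t + 2c) with c = |mu|/dx.
        cells = range(J) if mu[n] > 0 else range(J - 1, -1, -1)
        for j in cells:
            c = abs(mu[n]) / dx
            psi_avg = (S[j] + 2.0 * c * psi_in) / (sigma_t + 2.0 * c)
            psi_in = 2.0 * psi_avg - psi_in   # outgoing cell-edge flux
            phi_new[j] += w[n] * psi_avg      # quadrature sum for scalar flux
    if np.max(np.abs(phi_new - phi)) < 1e-8 * np.max(np.abs(phi_new)):
        phi = phi_new
        break
    phi = phi_new
```

With these placeholder data the scattering ratio is c = 0.9, and the number of sweeps needed grows rapidly as c approaches unity, consistent with the spectral-radius discussion that follows.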
For iteratively solving multigroup problems, the single-group SI scheme is the main building block. If no upscattering is present, the equations for the highest (g = 1) group are iterated to sufficient precision; then the equations for the next (g = 2) group are iterated to sufficient precision; this process continues to the lowest-energy (g = G) group, and the problem is then considered fully solved.
If upscattering is present, it is necessary to iterate on the upscattering sources: in the first outer iteration, the upscattering sources are estimated (typically set to zero) and the resulting downscattering problem is solved as described in the previous paragraph. The second outer iteration begins by updating the upscattering sources and continues with a second downward "sweep" through the groups. In this process, it is typical not to fully converge the scattering source within each group; instead, a user-specified number of transport sweeps is performed for each group. If the upscattering probability is small, this upscattering or outer iteration strategy will converge rapidly; if the upscattering probability is high, it will generally converge slowly.
A widely used parameter for assessing the efficiency of an iterative scheme is the spectral radius ρ of the iteration operator, defined as the asymptotic (for large ℓ) ratio of the error in the ℓth iteration to the error in the (ℓ − 1)st iteration:
\[
\rho=\lim_{\ell\to\infty}\frac{\left\|\phi^{(\ell)}-\phi\right\|}{\left\|\phi^{(\ell-1)}-\phi\right\|}. \qquad (1.33a)
\]
An equivalent definition, which allows ρ to be estimated without knowing the limit of the sequence of iterates, φ, is given as follows:
\[
\rho=\lim_{\ell\to\infty}\frac{\left\|\phi^{(\ell)}-\phi^{(\ell-1)}\right\|}{\left\|\phi^{(\ell-1)}-\phi^{(\ell-2)}\right\|}. \qquad (1.33b)
\]
A rapidly converging scheme will have ρ ≪ 1; a slowly converging scheme will have ρ < 1 and ρ ≈ 1; and a nonconverging scheme will have ρ ≥ 1. A Fourier analysis, to be discussed later, shows that the SI scheme applied to a one-group transport equation in an infinite homogeneous medium has the spectral radius
\[
\rho=\frac{\Sigma_s}{\Sigma_t}=c=\text{scattering ratio}. \qquad (1.34)
\]
Therefore, the SI scheme applied to one-group problems always converges; but if c
is arbitrarily close to unity (the probability of absorption is arbitrarily small) and the
system becomes increasingly thick, SI will converge arbitrarily slowly. (As c → 1,
increasingly many particles will undergo increasingly many scattering events during
their histories.) Similarly, it can be shown that the SI scheme applied to multigroup
problems, as described above, always converges; but if the probability of upscattering is high, then the spectral radius will again be close to unity.
The SI scheme is the most fundamental iteration scheme for solving discretized
particle transport problems. All practical iterative schemes today use SI as a principal ingredient in the iteration procedure. The SI method is guaranteed to converge,
but for many important problems it converges slowly. Improvements to SI involve
modifications to Eq. (1.32) that are intended to improve the efficiency of the SI
scheme (reducing the spectral radius) for diffusive problems, or for problems with
significant upscatter.
The first widely used improvement to SI was the rebalance scheme. In fine-mesh rebalance, Eq. (1.32) is replaced by
\[
\psi^{(\ell)}_{n,j}=\psi^{(\ell-1/2)}_{n,j}\,F^{(\ell)}_{j}, \qquad (1.35)
\]
where the rebalance factors F^{(ℓ)}_j are determined by requiring that
\[
\psi^{(\ell)}_{n,j+1/2}=
\begin{cases}
\psi^{(\ell-1/2)}_{n,j+1/2}\,F^{(\ell)}_{j}, & \mu_n>0,\\[4pt]
\psi^{(\ell-1/2)}_{n,j+1/2}\,F^{(\ell)}_{j+1}, & \mu_n<0,
\end{cases} \qquad (1.36)
\]
and ψ^{(ℓ)}_{n,j} satisfy Eq. (1.28), integrated over μ_n. [Equations (1.35) and (1.36) show that the rebalance factor for the jth cell applies to all the cell-average fluxes within this cell and all the cell-edge fluxes that correspond to exiting directions from the cell.] This procedure yields a tridiagonal system of equations for F^{(ℓ)}_j; the updated fluxes satisfy the correct angle-integrated balance equation on each spatial cell.
Coarse-mesh rebalance operates by the same principle, except that now the J fine-mesh cells are clustered into a smaller number of coarse-mesh cells. (For example, two adjacent fine-mesh cells could be clustered into one coarse-mesh cell, yielding J/2 coarse-mesh cells.) Coarse-mesh rebalance factors are defined on the coarse-mesh spatial cells; any cell-average angular flux within a given coarse-mesh cell is updated multiplicatively by the same coarse-mesh rebalance factor.
The same concept can be applied to multigroup problems. Here, fine-mesh rebalance has one rebalance factor for each spatial cell and energy group. Coarse-mesh rebalance factors correspond to multiple adjacent fine-mesh spatial cells and/or multiple energy groups.
In practice, rebalance is often inefficient, even when it is implemented optimally. For most problems, fine-mesh rebalance is unstable (divergence occurs),
while coarse-mesh rebalance with a sufficiently coarse grid is stable and provides
acceleration. However, if the coarse grid is too coarse, then coarse-mesh rebalance
becomes ineffective [56]. Thus, the application of rebalance can be time-consuming,
in terms of both the cost of setting up the problem (determining the optimal coarse
mesh) and the resulting inefficiency (poor iterative performance after the coarse
mesh is chosen). In spite of these deficiencies, rebalance was used widely for many
years because it was still more efficient than Source Iteration.
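As a minimal indication of the rebalance idea (a sketch of the coarsest possible variant, whole-system rebalance, with placeholder tallies; the fine- and coarse-mesh versions described above couple neighboring factors and require the tridiagonal solve):

```python
import numpy as np

# Placeholder balance tallies from the latest transport sweep (one group).
dx = 0.1
sigma_a = 0.1                       # absorption cross section (1/cm)
phi = np.linspace(1.0, 0.2, 50)     # cell-average scalar fluxes
Q_total = 5.0                       # total particle source rate
leakage = 0.35                      # net outflow through the boundaries

# Whole-system rebalance: scale the latest iterate so that total sources
# equal total absorption plus leakage (global particle balance).
absorption = np.sum(sigma_a * phi * dx)
F = Q_total / (absorption + leakage)
phi_rebalanced = F * phi
```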
To summarize, the discretization methods described above are finite-difference in nature, and the resulting truncation errors are guaranteed to be small only when the discretization grid is "fine." The implicit time discretization is accurate when a time step is small compared to a mean free time (Δt ≲ (vΣ_t)^{−1}, where v = neutron speed). We have already discussed the accuracy of the multigroup approximation in energy; depending on the problem, as few as two groups, or as many as over 100, can be required. Angular discretizations can also be difficult; if the solution of a specified problem has a weak angle-dependence, then a low-order quadrature set may be sufficient. However, if strong angular effects are present, due to strongly absorbing regions or voids with streaming, then high-order angular quadrature sets are required. Finally, the finite-difference spatial differencing schemes discussed above require, for reliable accuracy, that spatial cells be optically thin (in 1-D, Σ_t Δx ≲ 1).
Thus, the state of the art in production particle transport codes in the late 1960s
consisted of the following: implicit differencing in time, the multigroup approximation in energy, the SN approximation in angle, and finite-difference discretizations
in space requiring optically thin and rectangular (orthogonal) spatial grids. The iteration methods that existed to solve these discrete problems were often inefficient.
The production codes in which these methods were implemented were 1-D or 2-D;
no major 3-D codes existed.
1.3 Three Challenging Physical Problems
Next, we discuss three important physical problems that could not be simulated
realistically by the numerical techniques described in Section 1.2, but that can be
meaningfully simulated using the larger and faster computers and the more sophisticated SN algorithms that are available today. These problems are:
• Thermal radiation transport. This problem requires accuracy in strongly absorbing media, accuracy in the optically thick diffusion limit with both resolved and unresolved boundary layers, and robust iteration acceleration.
• Charged-particle transport. This problem requires accurate treatment of highly forward-peaked scattering, accurate treatment of scattering with extremely small energy losses, and robust iteration acceleration.
• Oil-well logging tool design. This problem requires efficient treatment of extremely complex 3-D geometries with vastly different scale lengths in a single problem.
1.3.1 Thermal Radiation Transport in the Stellar Regime
Thermal radiation transport in the stellar regime is important in astrophysical applications. Hence, some of the most significant early contributions to transport theory
came from astrophysicists, e.g., Eddington [1] and Chandrasekhar [2]. For stellar
modeling applications, the thermal radiation transport equation is usually coupled
to hydrodynamic equations [17, 39], but in this exposition we ignore hydrodynamic
coupling. We caution the reader not to confuse infrared transport with thermal
radiation transport in the stellar regime. Although both constitute thermal (photon)
radiation transport, the stellar regime corresponds to high stellar temperatures and
photon energies characteristic of X-rays, while the infrared regime corresponds to
low terrestrial temperatures and infrared photon energies. From a numerical viewpoint, infrared calculations are much more benign.
In thermal radiation transport, high-energy photons transport through and nonlinearly interact with a host medium (a plasma). The photons can scatter, or they can
be absorbed in the medium. If they are absorbed, their energy goes into the medium,
increasing its temperature. The medium also emits photons (via a temperaturedependent Planck function), causing its temperature to decrease.
The equations of thermal radiation transport consist of a transport equation for the angular intensity of photons I(r,Ω,E,t),
\[
\frac{1}{c}\frac{\partial I}{\partial t}+\mathbf{\Omega}\cdot\nabla I+\Sigma_t I
=\frac{1}{4\pi}\Sigma_s\phi+\Sigma_a B, \qquad (1.37)
\]
and an energy-balance equation for the material temperature T(r,t),
\[
C_v\frac{\partial T}{\partial t}=\int_0^{\infty}\Sigma_a\left[\phi-4\pi B\right]dE. \qquad (1.38)
\]
Here, c is the speed of light, Σ_s(r,E,T) is the macroscopic Thomson scattering cross section, Σ_a(r,E,T) is the macroscopic absorption cross section, Σ_t(r,E,T) = Σ_a + Σ_s is the macroscopic total cross section, φ(r,E,t) is the angular intensity integrated over all directions, C_v(r,T) is the material heat capacity, and B(E,T) is the Planck function:
\[
B(E,T)=\frac{2E^3}{h^3c^2}\left[\exp\!\left(\frac{E}{kT}\right)-1\right]^{-1}, \qquad (1.39)
\]
where h is Planck’s constant and k is Boltzmann’s constant. The primary unknowns
are the angular intensity I.r; ; E; t/, which is a photon energy flux rather than a
number flux, and the material temperature T .r; t/.
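As a small numerical companion to Eq. (1.39) (illustrative only; the temperature is an arbitrary stellar-regime value), the sketch below evaluates B(E,T) and checks that its integral over photon energy reproduces the blackbody intensity σT⁴/π:

```python
import numpy as np
from scipy.integrate import quad

h = 6.62607015e-34   # Planck constant (J s)
c = 2.99792458e8     # speed of light (m/s)
k = 1.380649e-23     # Boltzmann constant (J/K)

def planck_B(E, T):
    """Eq. (1.39): Planck function in photon energy E (J) at temperature T (K)."""
    return (2.0 * E**3) / (h**3 * c**2) / np.expm1(E / (k * T))

T = 1.0e6  # a stellar-regime temperature (K); photon energies are X-ray-like
total, _ = quad(planck_B, 1e-6 * k * T, 100.0 * k * T, args=(T,))

sigma_sb = 2.0 * np.pi**5 * k**4 / (15.0 * h**3 * c**2)  # Stefan-Boltzmann
print(total, sigma_sb * T**4 / np.pi)  # the two values agree closely
```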
Because Eqs. (1.37) and (1.38) are nonlinear in T, they are generally solved in each time step via Newton's method. To illustrate this technique, we implicitly difference the equations in time:
\[
\frac{1}{c\,\Delta t_k}\left(I^{k+1/2}-I^{k-1/2}\right)
+\mathbf{\Omega}\cdot\nabla I^{k+1/2}+\Sigma_t^{k+1/2}I^{k+1/2}
=\frac{1}{4\pi}\Sigma_s^{k+1/2}\phi^{k+1/2}+\Sigma_a^{k+1/2}B^{k+1/2}, \qquad (1.40)
\]
\[
\frac{C_v^{k+1/2}}{\Delta t_k}\left(T^{k+1/2}-T^{k-1/2}\right)
=\int_0^{\infty}\Sigma_a^{k+1/2}\left[\phi^{k+1/2}-4\pi B^{k+1/2}\right]dE. \qquad (1.41)
\]
We let T* denote the latest Newton iterate for the temperature. Then the linearized equations for the next Newton iteration are obtained by evaluating the material properties at T* and linearly expanding the temperature-dependent Planck function about T*:
\[
B^{k+1/2}=B^{*}+\frac{\partial B^{*}}{\partial T}\left(T^{k+1/2}-T^{*}\right), \qquad (1.42)
\]
where a superscript "*" denotes a quantity evaluated at T*. With the above expansion, the material temperature can be eliminated from the transport equation. Suppressing the temporal superscript "k + 1/2", we express the linearized temporally differenced transport equation as follows:
\[
\mathbf{\Omega}\cdot\nabla I+\tilde{\Sigma}I
=\frac{1}{4\pi}\left[\Sigma_s^{*}\phi+\tilde{\nu}\,\tilde{\chi}(E)\int_0^{\infty}\Sigma_a^{*}(E')\,\phi(E')\,dE'\right]+\tilde{Q}, \qquad (1.43)
\]
where
\[
\tilde{\Sigma}=\Sigma_t^{*}+\tilde{\sigma}, \qquad (1.44a)
\]
\[
\tilde{\sigma}=\frac{1}{c\,\Delta t_k}, \qquad (1.44b)
\]
\[
\tilde{\nu}=\frac{\displaystyle 4\pi\int_0^{\infty}\Sigma_a^{*}(E)\,\frac{\partial B^{*}}{\partial T}(E)\,dE}
{\displaystyle \frac{C_v^{*}}{\Delta t_k}+4\pi\int_0^{\infty}\Sigma_a^{*}(E)\,\frac{\partial B^{*}}{\partial T}(E)\,dE}, \qquad (1.44c)
\]
\[
\tilde{\chi}(E)=\frac{\displaystyle \Sigma_a^{*}(E)\,\frac{\partial B^{*}}{\partial T}(E)}
{\displaystyle \int_0^{\infty}\Sigma_a^{*}(E')\,\frac{\partial B^{*}}{\partial T}(E')\,dE'}, \qquad (1.44d)
\]
\[
\tilde{Q}=\Sigma_a^{*}B^{*}+\tilde{\sigma}I^{k-1/2}
+\frac{\tilde{\nu}\,\tilde{\chi}(E)}{4\pi}\left[\frac{C_v^{*}}{\Delta t_k}\left(T^{k-1/2}-T^{*}\right)
-\int_0^{\infty}\Sigma_a^{*}(E')\,4\pi B^{*}(E')\,dE'\right], \qquad (1.44e)
\]
and the material temperature is given by
\[
T^{k+1/2}=T^{*}+\frac{\displaystyle \int_0^{\infty}\Sigma_a^{*}(E)\left[\phi(E)-4\pi B^{*}(E)\right]dE
+\frac{C_v^{*}}{\Delta t_k}\left(T^{k-1/2}-T^{*}\right)}
{\displaystyle \frac{C_v^{*}}{\Delta t_k}+\int_0^{\infty}\Sigma_a^{*}(E)\,\frac{\partial B^{*}}{\partial T}(E)\,4\pi\,dE}. \qquad (1.45)
\]
†a .E/ Œ .E/ 4 B .E/ dE C
Equation (1.45) is used to calculate the temperatures after the linearized transport
Eq. (1.43) has been solved. Equation (1.43) has the form of the steady-state neutron transport equation, with †a acting as the fission cross section, acting as the
number of neutrons per fission, acting as the fission spectrum, and acting as the
inhomogeneous source. Also, we note that the scattering process is monochromatic,
i.e., photons do not change energy when they scatter. The spectral radius for the inner iteration process (the iteration on a within-group scattering source) is the usual scattering ratio, c = Σ_s*/Σ_t*, and the spectral radius for the outer iteration process (the iteration on the absorption or effective "fission" rate) is given by
\[
\rho_0=\tilde{\nu}\int_0^{\infty}\frac{\Sigma_a^{*}(E)}{\Sigma_a^{*}(E)+\tilde{\sigma}}\,\tilde{\chi}(E)\,dE. \qquad (1.46)
\]
The spectral radii observed in a practical problem are the maximum values of c and ρ_0, respectively, evaluated over the problem domain. Any region in which ρ_0 is close to unity will be diffusive. The diffusion limit for thermal radiation transport is characterized by
\[
I=B(T), \qquad (1.47)
\]
\[
\left(C_v+4aT^3\right)\frac{\partial T}{\partial t}
-\nabla\cdot\frac{4acT^3}{3\Sigma_r}\nabla T=0, \qquad (1.48)
\]
where a = 8π⁵k⁴/(15h³c³), and Σ_r is the Rosseland-averaged total cross section:
\[
\Sigma_r=\left[\int_0^{\infty}\frac{\partial B(E)}{\partial T}\,dE\right]
\left[\int_0^{\infty}\frac{1}{\Sigma_t(E)}\frac{\partial B(E)}{\partial T}\,dE\right]^{-1}. \qquad (1.49)
\]
The linearized thermal radiation transport equation is analogous to a particular form
of the neutron transport equation, but it is generally much more difficult to accurately and efficiently simulate than the neutron transport equation for the following
reasons:
• Within a specific material region, the cross sections Σ_a and Σ_s can vary with energy by six or more orders of magnitude.
• Σ_a and Σ_s can change in space by six or more orders of magnitude across a material interface. Problems often contain both optically thin regions and optically thick regions.
• Optically thick regions can be strongly absorbing or highly diffusive. In diffusive regions, the spatial scale length of the solution can have a thickness of many mean free paths (e.g., thousands or more).
• The spectral radius of the outer iteration process can be very close to unity, e.g., ρ_0 ≈ 0.9999.
For these reasons, the numerical requirements for thermal radiation transport are severe. In particular, spatial differencing schemes must be highly damped in strongly absorbing media, accurate in optically thin regions, have the thick diffusion limit (produce the correct diffusion solution in diffusive regions where the spatial grid is optically thick), and behave well with unresolved spatial boundary layers. A special diffusion-synthetic acceleration-like scheme, known as the linear multifrequency-grey method [42, 50, 77], is used to accelerate the outer iterations. The numerical implementation of this scheme must be very robust because there are often huge spatial discontinuities in the cross sections. In multidimensional calculations, the linear multifrequency-grey method must be recast as a preconditioner and used in conjunction with a Krylov solution technique.
New algorithmic technologies that have made accurate and efficient thermal radiation transport calculations possible include lumped discontinuous finite-element temporal and spatial discretizations, diffusion-synthetic acceleration of
the inner iterations, and linear multifrequency-grey acceleration of the outer iterations. Fourier analysis has played a major role in determining the effectiveness of
acceleration schemes, and asymptotic analysis has played a major role in determining the behavior of finite-element spatial differencing schemes in the thick diffusion
limit. While 1-D methods are fairly mature [77], multidimensional methods are still
a topic of research. Currently, preconditioned Krylov methods are having a major
impact on the efficiency of multidimensional calculations. In subsequent sections,
we discuss all of these new techniques.
The numerical technology of 1968 was completely inadequate for thermal radiation transport calculations in the stellar regime. The spatial differencing schemes of
that time were not accurate in optically thick, diffusive problems, and the iterative
acceleration techniques available then were not sufficiently efficient and robust.
1.3.2 Charged-Particle Transport
Charged-particle transport problems are described by Eq. (1.1) with †f D 0
(charged particles do not yield fission). However, charged particles can generate
secondary particles of various sorts, which, for simplicity and without loss of generality, we neglect in our equations. Charged-particle problems are challenging
because the mean free path, the average scattering angle, and the associated average energy loss are generally very small. To deal with this, the nearly singular
components of the differential scattering cross section are often split off from the
nonsingular (“smooth”) components and approximated by Fokker–Planck operators, yielding the approximate Boltzmann–Fokker–Planck (BFP) equation [37]:
$$\boldsymbol{\Omega}\cdot\nabla\psi + \Sigma_{t,\mathrm{sm}}\,\psi = \int_0^\infty \int_{4\pi} \Sigma_{s,\mathrm{sm}}(\boldsymbol{\Omega}'\cdot\boldsymbol{\Omega},\, E' \to E)\,\psi(\boldsymbol{\Omega}', E')\, d\Omega'\, dE'$$
$$\qquad + \frac{\Sigma_{r,\mathrm{tr}}}{2}\left[\frac{\partial}{\partial\mu}(1-\mu^2)\frac{\partial\psi}{\partial\mu} + \frac{1}{1-\mu^2}\frac{\partial^2\psi}{\partial\omega^2}\right] + \frac{\partial(\beta_r\,\psi)}{\partial E} + Q. \tag{1.50}$$
Here, the direction vector $\boldsymbol{\Omega}$ is defined in terms of the cosine of the polar angle, $\mu$, and the azimuthal angle, $\omega$. Also, $\Sigma_{t,\mathrm{sm}}$ is the "smooth" total cross section, $\Sigma_{s,\mathrm{sm}}$ is the "smooth" differential scattering cross section, $\Sigma_{r,\mathrm{tr}}$ is the restricted transport-corrected scattering cross section (also called the restricted momentum transfer), and $\beta_r$ is the restricted stopping power.
The differential (in $\mu$, $\omega$, and $E$) operators on the right side of Eq. (1.50) are Fokker–Planck operators in the angular and energy variables. As described above, the complete scattering cross section in Eq. (1.50) has been decomposed into
“singular” and “smooth” components. The singular (differential) terms represent the
part of the scattering cross section associated with highly forward-peaked scattering
and small energy losses, while the smooth (integral) term represents the part of the
scattering that is not singular. The cross sections and Fokker–Planck coefficients in
Eq. (1.50) are said to be “restricted” because each is restricted to one component of
the scattering.
The angular Fokker–Planck operator on the right side of Eq. (1.50) is called the continuous-scattering operator. It represents an asymptotic limit of the Boltzmann scattering operator in which the scattering cross section $\Sigma_s$ becomes unbounded and the average cosine of the scattering angle $\bar{\mu}_0$ limits to unity in such a manner that the momentum transfer, $\alpha = \Sigma_s (1 - \bar{\mu}_0)$, remains constant. Thus, particles scatter more often per unit pathlength but undergo a smaller average change in direction, in such a way that the mean change in direction cosine per unit pathlength remains constant. This limit results in a continuous-scattering process. In fact, it rigorously represents diffusion on the unit sphere. The continuous-scattering operator in Eq. (1.50) is the Laplacian operator on the unit sphere multiplied by the effective diffusion coefficient $\Sigma_{r,\mathrm{tr}}/2$.
The energy Fokker–Planck operator on the right side of Eq. (1.50) is called the continuous-slowing-down operator. It represents an asymptotic limit of the Boltzmann scattering operator in which the scattering cross section $\Sigma_s$ becomes unbounded and the average energy loss per scatter $\Delta E$ goes to zero in such a manner that the stopping power, $\beta = \Sigma_s\,\Delta E$, remains constant. Thus, particles scatter more often per unit pathlength but undergo a smaller average energy loss per scatter, such that the average energy loss per unit pathlength remains constant. This limit yields a process in which particles continuously lose energy per unit pathlength at a rate given by the stopping power.
It is not always necessary to split the singular components off the scattering operator and approximate them by Fokker–Planck operators. In 1-D slab and spherical geometries, one can obtain extremely accurate discrete approximations for the full scattering source (singular + smooth), even when the Legendre expansion for the scattering cross section is highly truncated; one need simply use a $P_{N-1}$ cross-section expansion, together with a Gauss SN quadrature set [54]. However, in 2-D and 3-D calculations, special techniques must be used to achieve the accuracy associated with 1-D Gauss quadratures. Special discretizations of the continuous-scattering operator can be very useful for obtaining a positive scattering source representation [40, 55, 120]. It is difficult to simultaneously maintain accuracy and scattering source positivity with highly truncated Legendre cross-section expansions.
Charged-particle calculations are numerically difficult for the following reasons:
- If the LBE is not approximated by the BFP equation, Legendre expansions for charged-particle scattering cross sections can require thousands of terms to converge.
- The scattering ratio can be very close to unity, resulting in very slow convergence of the source iteration process.
- Diffusion-synthetic acceleration improves convergence, but because of the highly forward-peaked scattering, it does not necessarily bound the spectral radius away from unity.
- Some higher-order Legendre moments of the angular flux must also be accelerated to bound the spectral radius away from unity.
- Multigroup-like methods for treating the continuous-slowing-down term generally result in a great deal of numerical diffusion in energy. This loss of accuracy can make it difficult to calculate energy spectra.
- Higher-order (in energy) methods are more accurate than multigroup-like methods, but the energy spectra can still be difficult to converge. (This is a greater difficulty for [heavy] ions than for [light] electrons.)
- Strong boundary layers can exist at material interfaces if secondary particle generation occurs. For instance, near interfaces between low-Z and high-Z materials, X-ray photons can create low-energy electrons with strong boundary layers.
- Different particle species in a coupled calculation may have vastly different scale lengths for their solutions. For instance, photons can generate electrons that have much smaller mean free paths than the photons.
Although it was adequate for simple demonstration calculations, the numerical
transport technology of 1968 was not adequate for realistic charged-particle calculations. Some of the new technologies that have strongly impacted charged-particle
transport are discontinuous finite-element methods (DFEMs) in both space and energy. DFEM discretizations are essential for robustness. For instance, if one is
interested in bulk energy deposition rather than the anomalous deposition that occurs
very near high-Z/low-Z interfaces, DFEM space-energy methods can yield accurate
bulk energy deposition, even with completely unresolved boundary layers. Furthermore, DFEM energy discretizations are much more accurate than multigroup-like
methods when energy spectra are of interest, or when energy and charge deposition are of interest in deep penetration problems [47]. Diffusion-synthetic
acceleration can significantly reduce running times, and methods that accelerate
higher-order moments can have an even greater impact [36, 62]. The Galerkin
quadrature method, which is compatible with standard Legendre cross-section expansions, has made it possible to accurately treat highly anisotropic scattering in
2-D and 3-D calculations [54].
One-dimensional methods for coupled electron–photon transport are quite mature [58], but 2-D and 3-D methods remain a research topic for charged-particle
transport calculations of all types. A difficult problem of great interest is that of
pencil-beam sources in 2-D cylindrical and 3-D geometries. Acceleration of the
high-order scattering source moments can be efficiently done in 1-D [62], but this
task has proven to be very difficult in 2-D and 3-D [93]. The application of the
SN method to multidimensional charged-particle transport calculations is just beginning to occur [101]. This technology will probably be slow to gain widespread
use in production settings because of the high level of expertise required to apply
the SN method to charged-particle transport problems, and because of the historical preference for Monte Carlo rather than deterministic methods within the charged-particle transport community. Nonetheless, the SN method has the potential to be
much more efficient than Monte Carlo for a wide variety of applications, including
space shielding, accelerator shielding, radiation effects on electronics, and radiation
cancer therapy.
1.3.3 Oil-Well Logging Tool Design
Oil-well logging refers to the measurement of physical characteristics of rock formations surrounding a borehole using various types of instruments [48]. Data acquired by an instrument are recorded as a function of its depth in the borehole, resulting in a
scroll-like plot called a log. Several types of logging tools based on electromagnetic,
acoustical, mechanical, and nuclear radiation physics exist. Each tool type measures
specific properties of rock formations, which provide information on the oil content
of the formations. For instance, a nuclear neutron–neutron tool (a neutron source
with a neutron detector) measures the porosity of the rock, while a gamma–gamma
tool (a gamma-ray source with a gamma-ray detector) measures the density of the
rock. Nuclear tools are generally operated with the tool pressed against the side of
the borehole, rather than being centered in the borehole; this makes design calculations inherently three-dimensional.
Nuclear tools function by emitting radiation from the source in the tool into the
rock formation. The source particles, or secondary particles generated by the source
particles, are then detected within the tool. (Each nuclear tool has two detectors:
one “close” to the source, and one “far” from the source.) Finally, the intensity and
energy spectra of the detected radiation are interpreted to yield information about
the rock formation surrounding the borehole [48].
Monte Carlo methods have traditionally been used to perform logging tool design calculations because of the inherent 3-D nature of these calculations. Extensive
variance reduction techniques must generally be employed because the probability
of a source particle reaching a detector can be very small. For instance, the probability that a source particle will reach the far detector in a gamma–gamma tool is roughly $10^{-8}$. This extremely small value is the consequence of a highly collimated source and a highly collimated detector. Furthermore, statistical errors in design calculations must generally be reduced to a standard relative deviation of one percent
or less. The need for extensive variance reduction and very low statistical errors can
lead to extremely high CPU times for nuclear tool design calculations. Thus, there
is much room for improvement in methods for performing these calculations.
There are two roles for SN methods to play in tool design calculations. The
first is to use deterministic methods to directly perform design calculations;
the second is to use deterministic methods to provide adjoint calculations for
automatic variance reduction in Monte Carlo calculations [118]. The properties of
nuclear logging tool design calculations that present the most difficulties for SN
methods are 3-D geometric complexity and the need for accuracy to within 1%.
The inherent 3-D nature of logging tool design calculations puts them far beyond
the reach of the 1-D and 2-D computational transport technology of 1968. Much of
the work on nodal spatial discretization schemes in the 1980s was motivated by the
high accuracy required for logging tool design calculations. However, most of these
nodal methods were developed for rectangular meshes, and the geometric complexity of logging tools can cause the use of such meshes to be highly inefficient. In the
late 1980s it became clear that rectangular meshes were inadequate for logging tool design calculations, regardless of the accuracy of the discretization schemes developed for such meshes. Since all of the first production 3-D SN codes were limited to rectangular meshes, and since it was not clear that sufficiently accurate unstructured-mesh SN methods could be developed, most of the research on SN methods for
logging tool design ended without success in the early 1990s. This situation changed
in the late 1990s with the advent of 3-D unstructured-mesh SN methods.
It has recently been demonstrated that 3-D unstructured-mesh SN methods can
in fact be successfully applied to oil-well logging tool design calculations. In particular, a special version of the ATTILA code [78] was used to perform calculations
for both a neutron–neutron tool and a gamma–gamma tool [89]. No special techniques were required for the neutron–neutron tool, but the highly collimated source and detectors associated with the gamma–gamma tool required the use of a first-collided source technique (to get the source gamma-rays into the rock formation) and a last-collided source technique (to get the scattered gamma-rays into the highly
collimated near and far detectors) in conjunction with the SN method. Monte Carlo
calculations for both tools were performed with the MCNP code [72] to provide
benchmark results. Each tool required two MCNP calculations: one calculation for
the response of the near detector, and one calculation for the response of the far
detector. The ATTILA code was also used to generate two adjoint solutions for the
gamma–gamma tool (one for the response of each detector). These adjoint solutions
were used to generate weight windows for the MCNP calculations. All calculations
were performed on an SGI Octane workstation with a 250 MHz R10000 processor.
The ATTILA results were excellent for the neutron–neutron tool (errors less than
1%) and very good for the gamma–gamma tool (errors less than 2%). Completely
converged solutions could not be obtained for the gamma–gamma tool calculations
because of memory limitations.
The CPU time for the SN neutron–neutron tool calculation was about 7 h versus
about 10 h for the Monte Carlo calculation (which used the best variance reduction
techniques available). A general conclusion from these results is that SN methods
can now accurately and efficiently perform neutron–neutron tool calculations, but
Monte Carlo methods will compete with SN methods for such calculations.
This conclusion is not necessarily true for gamma–gamma tool calculations.
The CPU time for the SN gamma–gamma tool calculation was about 12 h, while
the CPU time for the Monte Carlo calculation would have been about 100 h if the
statistical errors had been reduced to the requisite 1%. Furthermore, the adjoint SN
solution used to generate weight windows for the Monte Carlo calculation of the
far detector response resulted in a factor of 1,000 speedup. It is doubtful that the
Monte Carlo calculations could have been performed on a single workstation without the SN adjoint solutions. A general conclusion is that 3-D unstructured-mesh SN
methods have the potential to dramatically improve capabilities for gamma–gamma
tool calculations [89]. However, to achieve 1% accuracy, the direct use of SN
methods will require a large number of energy groups, and hence a large amount of
storage. In the near future, this will likely necessitate a parallel 3-D unstructured-mesh capability, because huge memories are only available on massively parallel
computers. However, the indirect use of SN methods (to provide weight windows
for Monte Carlo) is feasible with workstations, because adjoint solutions do not
need to be highly accurate to be effective for variance reduction. Also, highly accurate adjoint solutions may not be cost-effective because the automatic variance
reduction techniques available in production Monte Carlo codes do not give a zero-variance solution when the adjoint solution is exact [118]. Monte Carlo methods
will not be able to fully utilize SN adjoint solutions until zero-variance (or nearly
zero-variance) biasing algorithms are developed.
1.4 Advances in Spatial Discretizations
As discussed in Section 1.2, the state of the art in spatial discretizations of the transport equation prior to the 1970s consisted of finite-difference approximations involving the particle balance equation (Eq. (1.28)) and associated weighted-diamond auxiliary equations (Eq. (1.30)). These methods were accurate for optically thin cells ($\Sigma_t \Delta x \ll 1$), but they generally became inaccurate for problems with spatial cells
that were not optically thin. This deficiency led to the consideration of more sophisticated spatial discretization schemes with multiple unknowns per cell. In the
most successful of these new methods, the solutions’ increased accuracy and robustness more than compensated for the greater computer arithmetic and storage
requirements. In this section, we discuss some of these new methods. To keep our
discussion simple, we consider the SN Eq. (1.18) with isotropic scattering and an isotropic inhomogeneous source,
$$\mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t\,\psi_n(x) = \frac{1}{2}\left[\Sigma_s\,\phi(x) + Q(x)\right], \tag{1.51a}$$
$$\phi(x) = \sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'}, \tag{1.51b}$$
together with a spatial grid $\{x_{j+1/2} \mid 0 \le j \le J\}$ and related notation, as described
in Section 1.2.
1.4.1 Characteristic Methods
Characteristic methods are some of the most successful of the new classes of
spatial discretization methods that have been developed since the early 1970s
[12, 65, 88, 102, 106]. The methods discussed below are often referred to as short
characteristics to distinguish them from long characteristic methods typically employed in assembly calculations in core physics applications; see Chapter 4. Long
characteristics are based on an integral form of the transport equation, and their solution algorithm usually utilizes a ray-tracing scheme of particle paths throughout
the problem geometry. Short characteristic methods have the following features:
- On each spatial cell, $\phi(x)$ and $Q(x)$ are approximated by a low-order polynomial (usually constant or linear).
- The leakage-plus-collision operator $\mu_n\, d/dx + \Sigma_t$ is mathematically inverted. (In multidimensional problems, this inversion requires integration along a characteristic line of the transport equation; this property gives its name to the method.)
- The resulting analytic expression for $\psi_n(x)$ is manipulated to yield a polynomial representation, which, when combined with similar results for other directions of flight, yields a new polynomial expression for $\phi(x)$.
- The resulting discrete equations are considered to be solved when, for each cell, the original polynomial expression for $\phi(x)$ in the right side of Eq. (1.51a) equals the polynomial expression described in the previous bullet.
For example, if $\phi(x)$ is represented in the $j$th cell as a spatial constant – expressed in terms of the cell-average angular fluxes – then Eq. (1.51a) becomes (assuming for simplicity $Q = 0$)
$$\mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t\,\psi_n(x) = \frac{\Sigma_s}{2}\,\phi_j, \qquad x_{j-1/2} < x < x_{j+1/2}. \tag{1.52}$$
For $\mu_n > 0$ (flow from left to right), Eq. (1.52) can be analytically solved for $\psi_n(x)$:
$$\psi_n(x) = e^{-\Sigma_t (x - x_{j-1/2})/\mu_n}\,\psi_{n,j-1/2} + \left(1 - e^{-\Sigma_t (x - x_{j-1/2})/\mu_n}\right)\frac{\Sigma_s}{2\Sigma_t}\,\phi_j. \tag{1.53}$$
Using this expression, the flux exiting the $j$th cell is
$$\psi_{n,j+1/2} = e^{-\Sigma_t \Delta x_j/\mu_n}\,\psi_{n,j-1/2} + \left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\frac{\Sigma_s}{2\Sigma_t}\,\phi_j, \tag{1.54}$$
and the cell-average flux is
$$\psi_{n,j} = \frac{\mu_n}{\Sigma_t \Delta x_j}\left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\psi_{n,j-1/2} + \left[1 - \frac{\mu_n}{\Sigma_t \Delta x_j}\left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\right]\frac{\Sigma_s}{2\Sigma_t}\,\phi_j. \tag{1.55}$$
The exiting flux (Eq. (1.54)) is used as the incident flux for the next, i.e., $(j+1)$th, cell, and the cell-average flux (Eq. (1.55)) is folded into an array that, on completion of the transport sweep in all discrete directions, yields a new estimate for $\phi_j$.
The solution is considered to be converged when the new value of $\phi_j$ differs from the old value by less than a prespecified convergence criterion for all $j$.
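As a concrete illustration, the following minimal sketch advances Eqs. (1.54) and (1.55) cell by cell for each direction, folds the cell-average fluxes into a new scalar-flux estimate, and repeats the sweep until $\phi_j$ converges. The slab, cross sections, source, and vacuum boundaries are illustrative assumptions; for $\mu_n < 0$ the same formulas apply with $|\mu_n|$ and the sweep running right to left.

```python
import numpy as np

# Minimal sketch: Step-Characteristic sweep, Eqs. (1.54)-(1.55), wrapped in
# source iteration on phi_j.  All problem data are illustrative placeholders.

J, N = 100, 8                                 # spatial cells, discrete directions
dx, sig_t, sig_s, Q = 0.1, 1.0, 0.9, 1.0      # cell width, cross sections, source
mu, w = np.polynomial.legendre.leggauss(N)    # Gauss S_N quadrature on (-1, 1)

phi = np.zeros(J)
for iteration in range(1000):
    src = 0.5 * (sig_s * phi + Q)             # cell-wise constant source, Eq. (1.52)
    phi_new = np.zeros(J)
    for n in range(N):
        am = abs(mu[n])
        e = np.exp(-sig_t * dx / am)          # cell attenuation factor
        f = am / (sig_t * dx) * (1.0 - e)     # coefficient appearing in Eq. (1.55)
        cells = range(J) if mu[n] > 0 else range(J - 1, -1, -1)
        psi_in = 0.0                          # vacuum incident flux
        for j in cells:
            psi_avg = f * psi_in + (1.0 - f) * src[j] / sig_t   # Eq. (1.55)
            psi_in = e * psi_in + (1.0 - e) * src[j] / sig_t    # Eq. (1.54): exiting flux
            phi_new[j] += psi_avg * w[n]
    if np.max(np.abs(phi_new - phi)) < 1e-8:  # convergence test on phi_j
        phi = phi_new
        break
    phi = phi_new
```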
This algorithm, with $\phi(x)$ represented as a constant in each cell, is the Step-Characteristic (SC) method [12, 25]. For 1-D problems, the SC method can be formulated as a weighted-diamond scheme, i.e., of the form described by Eqs. (1.28) and (1.30). The SC method readily generalizes to multidimensional problems on irregular grids, but in these circumstances it cannot be formulated as a weighted-diamond scheme.
The SC method is currently used in several multidimensional production neutron
transport codes. In applications of these codes, the spatial grid is optically thin, and
accurate solutions are obtained. However, the SC method is not accurate for problems demanding optically thick meshes, e.g., of the type described in Section 1.3.
Thus, more complicated characteristic methods have been proposed and implemented, in which the scattering source is represented within each spatial cell as a linear, or even quadratic, function of the spatial variables [25]. As can be expected, with each increase in the polynomial order of the representation of $\phi(x)$:
- The accuracy of the resulting solution increases.
- The computational effort required to process the extra algebraic complexity increases.
- The computer memory needed to store the extra problem unknowns increases.
1.4.2 Linear Discontinuous Method
Among the most flexible and successful of the noncharacteristic methods are the
discontinuous finite-element (DFE) methods [18, 70, 98, 105, 109, 116]. The linear discontinuous (LD) method is perhaps the archetypical method in this class. This method is based on representations of $\psi_n(x)$ and $\phi(x)$ that are linear within each cell, but discontinuous at cell edges. In the LD method, Eq. (1.52) is approximated by
$$\mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t\,\psi_n(x) = \frac{\Sigma_s}{2}\left[\phi_j^{(0)} + \frac{2(x - x_j)}{\Delta x_j}\,\phi_j^{(1)}\right], \qquad x_{j-1/2} < x < x_{j+1/2}, \tag{1.56}$$
where $x_j$ is the center of the $j$th cell; now the representation of $\phi(x)$ requires two unknowns per cell, $\phi_j^{(0)}$ and $\phi_j^{(1)}$. If the operator on the left side of Eq. (1.56) were inverted exactly, as described previously, we would obtain a (linear) characteristic method. However, discontinuous finite-element methods are based on an approximate, rather than an exact, inversion of this operator.
The LD method employs the following linear-discontinuous representation of $\psi_n(x)$:
$$\psi_n(x) = \begin{cases} \psi_{nj} + \dfrac{2}{\Delta x_j}(x - x_j)\left(\psi_{n,j+1/2} - \psi_{nj}\right), & \mu_n > 0,\ x_{j-1/2} < x \le x_{j+1/2}, \\[1ex] \psi_{nj} + \dfrac{2}{\Delta x_j}(x - x_j)\left(\psi_{nj} - \psi_{n,j-1/2}\right), & \mu_n < 0,\ x_{j-1/2} \le x < x_{j+1/2}. \end{cases} \tag{1.57}$$
Equation (1.57) is consistent with the notation that $\psi_{nj}$ is the cell-average flux and $\psi_{n,j+1/2} = \psi(x_{j+1/2}, \mu_n)$ is the cell-edge angular flux. We note that at the cell edges, $\psi_n(x)$ is continuous from the left for $\mu_n > 0$ and from the right for $\mu_n < 0$, but is discontinuous otherwise. The unknowns $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$ in Eq. (1.57) and $\phi_j^{(0)}$ and $\phi_j^{(1)}$ in Eq. (1.56) are related by
$$\phi_j^{(0)} = \sum_{n=1}^{N} \psi_{nj}\, w_n, \tag{1.58a}$$
$$\phi_j^{(1)} = \sum_{\mu_n > 0} \left(\psi_{n,j+1/2} - \psi_{nj}\right) w_n + \sum_{\mu_n < 0} \left(\psi_{nj} - \psi_{n,j-1/2}\right) w_n. \tag{1.58b}$$
To obtain equations for $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$, we operate on Eq. (1.56) by
$$\frac{\ell + 1}{\Delta x_j^{\ell + 1}} \int_{x_{j-1/2}}^{x_{j+1/2}} (x - x_j)^{\ell}\, (\cdot)\, dx, \qquad \ell = 0, 1,$$
to obtain, for $\ell = 0$, the conventional balance equation:
$$\frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} - \psi_{n,j-1/2}\right) + \Sigma_t\,\psi_{nj} = \frac{\Sigma_s}{2}\,\phi_j^{(0)}, \tag{1.59}$$
and for $\ell = 1$:
$$\frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} + \psi_{n,j-1/2} - 2\psi_{nj}\right) + \frac{2\Sigma_t}{\Delta x_j^2} \int_{x_{j-1/2}}^{x_{j+1/2}} (x - x_j)\,\psi_n(x)\, dx = \frac{\Sigma_s}{6}\,\phi_j^{(1)}. \tag{1.60}$$
(Here, we have used the previously adopted notation for the cell-edge and cell-average angular fluxes.) To close these equations, the integral term in Eq. (1.60) must be expressed in terms of $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$. To do this, we introduce Eq. (1.57) into the integral term and for $\mu_n > 0$ obtain
$$\frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} + \psi_{n,j-1/2} - 2\psi_{nj}\right) + \frac{\Sigma_t}{3}\left(\psi_{n,j+1/2} - \psi_{nj}\right) = \frac{\Sigma_s}{6}\,\phi_j^{(1)}. \tag{1.61}$$
Equations (1.59) and (1.61) are two linear algebraic equations for $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$. [For $\mu_n < 0$, the term $(\psi_{n,j+1/2} - \psi_{nj})$ in Eq. (1.61) is replaced by $(\psi_{nj} - \psi_{n,j-1/2})$.]
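The per-cell solve implied by Eqs. (1.59) and (1.61) is a $2 \times 2$ linear system for the cell-average and exiting fluxes. The sketch below is a minimal illustration for $\mu_n > 0$ with $Q = 0$; all input values are placeholders.

```python
import numpy as np

# Minimal sketch: the per-cell 2x2 solve of Eqs. (1.59) and (1.61) for mu > 0.
# Inputs (incident flux, source moments, cell data) are illustrative placeholders.

def ld_cell(mu, sig_t, sig_s, dx, psi_in, phi0, phi1):
    """Solve Eqs. (1.59) and (1.61) for (psi_avg, psi_out), given the incident
    flux psi_in = psi_{n,j-1/2} and the source moments phi0, phi1."""
    a = mu / dx
    A = np.array([[sig_t, a],                              # Eq. (1.59)
                  [-2.0 * a - sig_t / 3.0, a + sig_t / 3.0]])  # Eq. (1.61)
    b = np.array([0.5 * sig_s * phi0 + a * psi_in,
                  sig_s * phi1 / 6.0 - a * psi_in])
    psi_avg, psi_out = np.linalg.solve(A, b)
    return psi_avg, psi_out

print(ld_cell(mu=0.5, sig_t=1.0, sig_s=0.5, dx=0.2, psi_in=1.0, phi0=2.0, phi1=0.0))
```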
The resulting LD method can be generalized to multidimensional problems
on structured (rectangular or orthogonal) and unstructured (nonorthogonal) spatial
grids [97]. Variants of the LD method have also been developed, such as the lumped
LD method, which is more robust for optically thick spatial cells but less accurate
for optically thin cells, and various corner balance methods [76, 83, 90, 96]. In general, LD-like methods are much more accurate and robust for the difficult physical
problems described in Section 1.3 than finite-difference methods. LD methods require greater computer arithmetic and storage than finite-difference methods, but their increased accuracy and robustness usually more than compensate for these disadvantages.
1.4.3 Nodal Methods
Another class of spatial differencing techniques that has been developed and widely used in the nuclear reactor community is that of nodal methods [28–30, 43–45].
In essence, a nodal method approximates a multidimensional transport equation by a
coupled system of 1-D transport equations. Discretization techniques that are highly
accurate for 1-D can then be utilized.
To illustrate, let us consider an $(x, y)$-geometry version of Eq. (1.51a) on a spatial cell (assuming for simplicity $Q = 0$):
$$\mu_n \frac{\partial \psi_n}{\partial x}(x,y) + \eta_n \frac{\partial \psi_n}{\partial y}(x,y) + \Sigma_t\,\psi_n(x,y) = \frac{\Sigma_s}{4}\,\phi(x,y),$$
$$x_{i-1/2} < x < x_{i+1/2}, \qquad y_{j-1/2} < y < y_{j+1/2}. \tag{1.62}$$
Transversely integrating this equation by the operator
$$\langle \cdot \rangle_{y,j} = \frac{1}{\Delta y_j} \int_{y_{j-1/2}}^{y_{j+1/2}} (\cdot)\, dy, \tag{1.63a}$$
and defining
$$\psi_{n,y,j}(x) = \langle \psi_n(x,y) \rangle_{y,j}, \tag{1.63b}$$
we obtain (after moving the transverse derivative to the right side of the equation)
$$\mu_n \frac{d\psi_{n,y,j}}{dx}(x) + \Sigma_t\,\psi_{n,y,j}(x) = \frac{\Sigma_s}{4}\,\phi_{y,j}(x) - \frac{\eta_n}{\Delta y_j}\left[\psi_n\!\left(x, y_{j+1/2}\right) - \psi_n\!\left(x, y_{j-1/2}\right)\right]. \tag{1.64}$$
Similarly, transverse-integrating Eq. (1.62) by the operator
$$\langle \cdot \rangle_{x,i} = \frac{1}{\Delta x_i} \int_{x_{i-1/2}}^{x_{i+1/2}} (\cdot)\, dx, \tag{1.65a}$$
and defining
$$\psi_{n,x,i}(y) = \langle \psi_n(x,y) \rangle_{x,i}, \tag{1.65b}$$
we obtain (after moving the transverse derivative to the right side of the equation)
$$\eta_n \frac{d\psi_{n,x,i}}{dy}(y) + \Sigma_t\,\psi_{n,x,i}(y) = \frac{\Sigma_s}{4}\,\phi_{x,i}(y) - \frac{\mu_n}{\Delta x_i}\left[\psi_n\!\left(x_{i+1/2}, y\right) - \psi_n\!\left(x_{i-1/2}, y\right)\right]. \tag{1.66}$$
In the context of the SN equations, Eqs. (1.64) and (1.66) are exact. These are two 1-D first-order transport equations containing the transversely integrated angular fluxes defined in Eqs. (1.63b) and (1.65b). Equations (1.64) and (1.66) can be closed by taking
$$\psi_n(x, y_{j\pm 1/2}) = \psi_{n,x,i}(y_{j\pm 1/2}), \qquad x_{i-1/2} < x < x_{i+1/2}, \tag{1.67a}$$
$$\psi_n(x_{i\pm 1/2}, y) = \psi_{n,y,j}(x_{i\pm 1/2}), \qquad y_{j-1/2} < y < y_{j+1/2}, \tag{1.67b}$$
which yield
$$\mu_n \frac{d\psi_{n,y,j}}{dx}(x) + \Sigma_t\,\psi_{n,y,j}(x) = \frac{\Sigma_s}{4}\,\phi_{y,j}(x) - \frac{\eta_n}{\Delta y_j}\left[\psi_{n,x,i}(y_{j+1/2}) - \psi_{n,x,i}(y_{j-1/2})\right], \tag{1.68}$$
and
$$\eta_n \frac{d\psi_{n,x,i}}{dy}(y) + \Sigma_t\,\psi_{n,x,i}(y) = \frac{\Sigma_s}{4}\,\phi_{x,i}(y) - \frac{\mu_n}{\Delta x_i}\left[\psi_{n,y,j}(x_{i+1/2}) - \psi_{n,y,j}(x_{i-1/2})\right]. \tag{1.69}$$
More accurate approximations can be obtained by calculating higher-order transverse spatial moments of Eq. (1.62) and using these to obtain more accurate closure
relations than defined in Eqs. (1.67a) and (1.67b). Nodal methods always consist
of coupled 1-D transport equations for each multidimensional spatial cell. These
equations can be fully discretized by applying a standard 1-D discretization to each
transverse-integrated equation.
For example, one can obtain a pseudo-characteristic method by replacing the
analytic functions on the right side of each transverse-integrated equation by
moment-preserving polynomial fits to those functions, and then inverting the 1-D
transport operator on the left side of each transverse-integrated equation. The result
is a transport solution for a polynomial source that depends on the solution itself.
For instance, assuming a constant spatial dependence for the scattering source on
the right side of Eq. (1.68), and inverting the 1-D transport operator on the left side
of Eq. (1.68), we obtain the following solution for $\mu_n > 0$ and $\eta_n > 0$:
$$\psi_{n,y,j}(x) = \psi_{n,y,j}(x_{i-1/2})\,\exp\!\left(-\frac{\Sigma_t (x - x_{i-1/2})}{\mu_n}\right) + \frac{q_{n,y,j,i}}{\Sigma_t}\left[1 - \exp\!\left(-\frac{\Sigma_t (x - x_{i-1/2})}{\mu_n}\right)\right], \tag{1.70}$$
where
$$q_{n,y,j,i} = \frac{\Sigma_s}{4\,\Delta x_i} \int_{x_{i-1/2}}^{x_{i+1/2}} \phi_{y,j}(x)\, dx - \frac{\eta_n}{\Delta y_j}\left[\psi_{n,x,i}(y_{j+1/2}) - \psi_{n,x,i}(y_{j-1/2})\right]. \tag{1.71}$$
Following the same procedure for Eq. (1.69), we obtain
$$\psi_{n,x,i}(y) = \psi_{n,x,i}(y_{j-1/2})\,\exp\!\left(-\frac{\Sigma_t (y - y_{j-1/2})}{\eta_n}\right) + \frac{q_{n,x,i,j}}{\Sigma_t}\left[1 - \exp\!\left(-\frac{\Sigma_t (y - y_{j-1/2})}{\eta_n}\right)\right], \tag{1.72}$$
where
$$q_{n,x,i,j} = \frac{\Sigma_s}{4\,\Delta y_j} \int_{y_{j-1/2}}^{y_{j+1/2}} \phi_{x,i}(y)\, dy - \frac{\mu_n}{\Delta x_i}\left[\psi_{n,y,j}(x_{i+1/2}) - \psi_{n,y,j}(x_{i-1/2})\right]. \tag{1.73}$$
Equation (1.70) is now evaluated at $x = x_{i+1/2}$, Eq. (1.72) is evaluated at $y = y_{j+1/2}$, and the resulting two equations [together with Eqs. (1.71) and (1.73)] are solved for the outgoing edge fluxes $\psi_{n,x,i}(y_{j+1/2})$ and $\psi_{n,y,j}(x_{i+1/2})$.
This procedure constitutes the 2-D Constant-Constant Nodal (CCN) method.
This method is so-named because the transverse derivative term and the scattering source are approximated as constants in each transverse-integrated equation. The more accurate (but more expensive) 2-D Linear-Linear Nodal (LLN)
method has four transverse-integrated equations with linear transverse derivatives
and linear scattering sources in each equation. Extending these methods to 3-D is
straightforward.
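A minimal sketch of the CCN closure just described follows: Eq. (1.70) is evaluated at $x_{i+1/2}$ and Eq. (1.72) at $y_{j+1/2}$, and the resulting $2 \times 2$ system is solved for the two outgoing edge fluxes. The cell-average scattering source `s_bar` (the same value appears in Eqs. (1.71) and (1.73), since both integrals reduce to the cell average of $\phi$) is assumed lagged from the previous iteration; the numerical inputs are placeholders.

```python
import numpy as np

# Minimal sketch of the CCN closure for one cell and one direction (mu > 0,
# eta > 0).  psi_in_x is the transverse-averaged flux entering at x_{i-1/2},
# psi_in_y the one entering at y_{j-1/2}; s_bar = (Sigma_s/4)*(cell-average
# scalar flux), lagged from the previous iteration.

def ccn_cell(mu, eta, sig_t, dx, dy, psi_in_x, psi_in_y, s_bar):
    """Solve Eq. (1.70) at x_{i+1/2} and Eq. (1.72) at y_{j+1/2}, with the
    sources of Eqs. (1.71) and (1.73), for the two outgoing edge fluxes."""
    ex = np.exp(-sig_t * dx / mu)          # attenuation across the cell in x
    ey = np.exp(-sig_t * dy / eta)         # attenuation across the cell in y
    cx, cy = (1.0 - ex) / sig_t, (1.0 - ey) / sig_t
    # out_x = ex*in_x + cx*(s_bar - (eta/dy)*(out_y - in_y))   [Eqs. (1.70), (1.71)]
    # out_y = ey*in_y + cy*(s_bar - (mu/dx)*(out_x - in_x))    [Eqs. (1.72), (1.73)]
    A = np.array([[1.0, cx * eta / dy],
                  [cy * mu / dx, 1.0]])
    b = np.array([ex * psi_in_x + cx * (s_bar + eta / dy * psi_in_y),
                  ey * psi_in_y + cy * (s_bar + mu / dx * psi_in_x)])
    return np.linalg.solve(A, b)           # (psi_out_x, psi_out_y)

print(ccn_cell(0.6, 0.8, 1.0, 0.5, 0.5, 1.0, 1.0, 0.25))
```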
Nodal transport (and diffusion) methods have played an extremely important role
in the nuclear engineering community during the past 20 years. As discussed in Section 1.3.3, nodal methods (on rectangular cells) were applied to oil-well logging
problems in the 1980s but were abandoned for that application because of their
lack of suitability to nonorthogonal grids. The applicability of nodal methods to
nonorthogonal grids remains a research topic.
1.4.4 Solution Accuracy in the Thick Diffusion Limit
The last topic in this section is not a discretization scheme, but rather a theoretical
technique, which, in the past 20 years, has become essential in predicting the accuracy of discretization schemes for diffusive problems with optically thick spatial cells ($\Sigma_t \Delta x \gg 1$).
Early (finite difference) spatial differencing schemes for the transport equation
were experimentally and theoretically understood to be accurate only when spatial
cells were optically thin ($\Sigma_t \Delta x \ll 1$), and the accuracy of these schemes was generally measured by the order of their truncation error [25]. For example, an $n$th-order scheme would satisfy
$$\left\|\psi_{\mathrm{exact}} - \psi_{\Delta x}\right\| = O(\Delta^n), \qquad \Delta \ll 1,$$
where $\|\cdot\|$ is a suitable error norm and $\Delta = \Sigma_t \Delta x$. In slab geometry, the DD
and SC schemes are second-order, while the LD method is third-order. Analyses
to mathematically prove the order of convergence were always carried out in slab
geometry; in multidimensional geometries the SN solutions have singular characteristics, across which the solution is not smooth [20], so the truncation error analyses
that can be carried out in slab geometry are not applicable. In fact, computer experiments have shown [33] that because of the singular characteristics, the order of
convergence of the DD scheme in x,y-geometry depends on the definition of the
error norm.
Worse yet, the difficult thermal radiation and charged-particle transport problems
described in Section 1.3 are so optically thick that it is impossible, because of limits
in computer memory, to assign spatial grids for them that are optically thin. However, such calculations are associated with the diffusion and Fokker–Planck limits,
respectively, and the spatial scale lengths for the solution associated with these limits are much larger than a mean free path. It is not unreasonable to expect a transport
spatial discretization scheme to yield accurate results with optically thick cells if
the scale length of the solution is well-resolved by the mesh. Indeed, one would
intuitively expect to get accurate results with such mesh resolution. The difficulty is
that a truncation error analysis does not provide useful information for these types
of problems. Such an analysis tells us only that accurate results will be obtained by
using optically thin cells. To determine if accurate results can be obtained with a
mesh that is optically thick but resolves the spatial scale length of the solution, it is
necessary to perform a discrete asymptotic analysis [49, 113].
Although the system is optically thick in both the diffusion and Fokker–Planck
limits, the requirements associated with each limit for spatial differencing schemes
are quite different. Accurate and robust spatial discretization schemes are generally
required for charged-particle transport, but the highly anisotropic scattering treatment is of primary importance in the Fokker–Planck limit rather than the spatial
differencing scheme [113]. In contrast, the spatial discretization is of primary importance in the diffusion limit. We focus on the diffusion limit here.
A diffusive problem is optically thick with weak absorption; it is a problem for
which the transport solution is well-approximated by the diffusion solution. A spatial discretization of the SN equations is of practical use for diffusive problems if
it possesses the optically thick diffusion limit [49, 52, 66, 86, 88, 105]. Such a discretization scheme will yield accurate results for diffusive problems if the spatial
mesh cells are thin with respect to a diffusion length (the spatial scale length for the
diffusion solution), even if these cells are thick with respect to a mean free path.
To describe the diffusion limit, let us consider the monoenergetic planar-geometry SN equations
$$\mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t\,\psi_n(x) = \frac{\Sigma_s}{2} \sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'} + \frac{Q(x)}{2}, \qquad 1 \le n \le N, \tag{1.74}$$
and their diffusion approximation
$$-\frac{d}{dx}\frac{1}{3\Sigma_t}\frac{d\phi}{dx}(x) + \Sigma_a\,\phi(x) = Q(x). \tag{1.75}$$
Here, we have used the standard notation
$$\Sigma_a = \Sigma_t - \Sigma_s = \text{absorption cross section}, \tag{1.76a}$$
and
$$\phi(x) = \sum_{n=1}^{N} \psi_n(x)\, w_n = \text{scalar flux}. \tag{1.76b}$$
To motivate the subsequent analysis, we multiply the diffusion Eq. (1.75) by a positive constant $\varepsilon$:
$$-\frac{d}{dx}\frac{\varepsilon}{3\Sigma_t}\frac{d\phi}{dx}(x) + \varepsilon\Sigma_a\,\phi(x) = \varepsilon Q(x). \tag{1.77}$$
Clearly, the solution of the diffusion equation is unchanged. This shows that if we define the following scaled cross sections and source,
$$\Sigma_t \to \frac{\Sigma_t}{\varepsilon}, \tag{1.78a}$$
$$\Sigma_a \to \varepsilon\Sigma_a, \tag{1.78b}$$
$$Q(x) \to \varepsilon Q(x), \tag{1.78c}$$
which implies
$$\Sigma_s = \Sigma_t - \Sigma_a \to \frac{\Sigma_t}{\varepsilon} - \varepsilon\Sigma_a, \tag{1.78d}$$
then the diffusion equation is invariant under this scaling, for any choice of $\varepsilon$. However, the SN equations are not invariant; they become
$$\mu_n \frac{d\psi_n}{dx}(x) + \frac{\Sigma_t}{\varepsilon}\,\psi_n(x) = \frac{1}{2}\left(\frac{\Sigma_t}{\varepsilon} - \varepsilon\Sigma_a\right) \sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'} + \frac{\varepsilon\, Q(x)}{2}, \qquad 1 \le n \le N. \tag{1.79}$$
Now, one can show that for $\varepsilon \ll 1$, the solution of Eqs. (1.79) satisfies
$$\psi_n(x) = \frac{\phi(x)}{2} + O(\varepsilon), \tag{1.80}$$
where $\phi(x)$ satisfies the diffusion Eq. (1.75). To derive this result, we solve Eqs. (1.79) by assuming a solution that, for $\varepsilon \ll 1$, depends on $\varepsilon$ by a simple asymptotic expansion:
$$\psi_n(x) = \sum_{i=0}^{\infty} \varepsilon^i\, \psi_n^{(i)}(x). \tag{1.81}$$
Introducing Eq. (1.81) into Eq. (1.79) and equating the coefficients of different powers of $\varepsilon$, we obtain for $i \ge 0$ the following system of equations:
$$\Sigma_t\left(\psi_n^{(i)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(i)}(x)\, w_{n'}\right) = -\mu_n \frac{d}{dx}\,\psi_n^{(i-1)}(x) - \frac{\Sigma_a}{2}\sum_{n'=1}^{N} \psi_{n'}^{(i-2)}(x)\, w_{n'} + \delta_{i,2}\,\frac{Q(x)}{2}, \tag{1.82}$$
where $\psi_n^{(-1)}(x) = \psi_n^{(-2)}(x) = 0$. We solve this system recursively, by solving the first $(i = 0)$ equation, then the second $(i = 1)$ equation, etc.
The first $(i = 0)$ equation is
$$\Sigma_t\left(\psi_n^{(0)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(0)}(x)\, w_{n'}\right) = 0. \tag{1.83}$$
Assuming that the quadrature set satisfies
$$\sum_{n=1}^{N} w_n = \int_{-1}^{1} d\mu = 2, \tag{1.84}$$
Eq. (1.83) has the general isotropic solution
$$\psi_n^{(0)}(x) = \frac{\phi^{(0)}(x)}{2}, \tag{1.85}$$
where $\phi^{(0)}(x)$ is – for now – undetermined.
The second $(i = 1)$ equation is, using Eq. (1.85),
$$\Sigma_t\left(\psi_n^{(1)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(1)}(x)\, w_{n'}\right) = -\frac{\mu_n}{2}\,\frac{d\phi^{(0)}}{dx}(x). \tag{1.86}$$
Assuming that the quadrature set satisfies
$$\sum_{n=1}^{N} \mu_n w_n = \int_{-1}^{1} \mu\, d\mu = 0, \tag{1.87}$$
the general solution of Eq. (1.86) is
$$\psi_n^{(1)}(x) = \frac{\phi^{(1)}(x)}{2} - \frac{\mu_n}{2\Sigma_t}\,\frac{d\phi^{(0)}}{dx}(x), \tag{1.88}$$
where $\phi^{(1)}(x)$ is undetermined.
The third $(i = 2)$ equation is, using Eqs. (1.85) and (1.88),
$$\Sigma_t\left(\psi_n^{(2)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(2)}(x)\, w_{n'}\right) = -\mu_n \frac{d}{dx}\left[\frac{\phi^{(1)}(x)}{2} - \frac{\mu_n}{2\Sigma_t}\,\frac{d\phi^{(0)}}{dx}(x)\right] - \frac{\Sigma_a}{2}\,\phi^{(0)}(x) + \frac{Q(x)}{2}. \tag{1.89}$$
Unlike Eqs. (1.83) and (1.86), this third equation does not automatically possess a solution. To see this, we multiply Eq. (1.89) by $w_n$ and sum over $1 \le n \le N$; assuming that the quadrature set satisfies
$$\sum_{n=1}^{N} \mu_n^2 w_n = \int_{-1}^{1} \mu^2\, d\mu = \frac{2}{3}, \tag{1.90}$$
we obtain the solvability condition
$$0 = \frac{d}{dx}\frac{1}{3\Sigma_t}\frac{d\phi^{(0)}}{dx}(x) - \Sigma_a\,\phi^{(0)}(x) + Q(x). \tag{1.91}$$
If this equation is satisfied, then it can easily be shown that solutions to Eq. (1.89) exist. Equations (1.91), (1.85), and (1.81) confirm the result (1.80).
The asymptotic analysis outlined above provides a direct mathematical link between the SN equations (1.74) and the diffusion equation (1.75). This analysis shows that if the cross sections and source in the SN equations are scaled by Eqs. (1.78) with $\varepsilon \ll 1$, the diffusion equation (1.75) is obtained. The condition $\varepsilon \ll 1$ is
consistent with the physical understanding of neutron diffusion: the mean free path [$\lambda = 1/\Sigma_t = O(\varepsilon)$] is small, the absorption rate [$\Sigma_a = O(\varepsilon)$] is small, and the source [$Q = O(\varepsilon)$] is small – all of these "smallnesses" being balanced so that the resulting angular flux is $O(1)$ and satisfies the diffusion equation.
In the thick diffusion limit, two limits occur that have significance for spatial discretizations:
1. $\Sigma_t \to O(1/\varepsilon)$,
2. $\psi_n(x) \to \phi(x)/2$.
Thus, the total cross section becomes unbounded, yet the SN solution limits to an $O(1)$ diffusion solution. This result applies to the spatially continuous SN equations (no spatial discretization). We now ask: What happens if this same asymptotic
analysis is applied to the spatially discretized SN equations? More precisely, let us
consider a spatially discrete SN problem posed on a fixed spatial grid. We scale the cross sections and source in this problem exactly as in Eqs. (1.78). For $\varepsilon \ll 1$, we seek a solution of this discrete system in the form of Eqs. (1.81), i.e., we expand all unknowns (cell-average fluxes, cell-edge fluxes, etc.) as power series in $\varepsilon$, and we solve the resulting hierarchy of equations as described above for the continuous SN equations. What happens to the spatially discrete SN solution in this limit?
There are two possible answers to this question. First, because as $\varepsilon \to 0$ the SN solution smoothly limits to the diffusion solution, it is plausible to hope that the spatially discrete SN solution will smoothly limit to the solution of a spatially discrete diffusion problem. (Then, if the chosen spatial grid is adequate to resolve the solution of this discrete diffusion problem, the resulting discrete solution will be accurate.) However, because $\Sigma_t = O(\varepsilon^{-1})$, the optical thickness of spatial cells $\Sigma_t \Delta x \to \infty$ as $\varepsilon \to 0$. This and the fact that SN solutions generally become inaccurate as $\Sigma_t \Delta x$ increases suggest that spatially discrete SN solutions may not limit to an accurate result as $\varepsilon \to 0$. Which of these two possibilities is correct?
The answer to this question depends on the chosen spatial discretization scheme. Some schemes are accurate in the thick diffusion limit; others are not. For example, the Step-Characteristic (SC) scheme fails as $\varepsilon \to 0$ (the SC solution $\to 0$). The Diamond-Difference (DD) scheme fails unless all the diffusive regions of the problem have isotropic incident boundary fluxes (in the presence of nonisotropic boundary fluxes, DD solutions become corrupted by unphysical spatial oscillations).
LD-like schemes perform successfully in the thick diffusion limit in 1-D geometries. LD methods also perform well in multi-D geometries with triangular (2-D) or tetrahedral (3-D) spatial grids, but they fail on quadrilateral (2-D) or hexahedral (3-D) grids. (However, bilinear-discontinuous methods work well for quadrilateral grids and trilinear-discontinuous methods work well for hexahedral grids.)
The thick diffusion limit analysis, which has been applied to these discretization
schemes and many others, accurately predicts the performance of approximation
schemes in realistic calculations. This analysis has enabled the successful development of spatial discretization methods for problems with optically thick, diffusive
systems – in particular, for the thermal radiation transport and charged-particle
transport problems discussed in Section 1.3.
A reader may ask: if a transport problem is diffusive, then why not solve a simpler diffusion problem instead? The answer is that in many applications, only a part of the physical system is diffusive, and it may not be obvious where this diffusive part is. Also, some energy groups may be diffusive, while others are not. Finally,
for time-dependent problems, some regions of space-energy phase space may be
diffusive for certain times but not for others. For these reasons, it is generally infeasible to calculate accurate transport solutions by using the diffusion approximation
in subregions of phase space where it is accurate.
This leads to an important issue that can be discussed only briefly here: the behavior of SN spatial discretization schemes in the presence of unresolved boundary layers. (These are thin volumes, typically only a few mean free paths in width, containing the material boundaries that separate diffusive and nondiffusive subregions of a problem. Across boundary layers, the flux usually has a rapid spatial variation; if the spatial grid is not sufficiently fine to resolve this fast variation, the boundary layer is said to be unresolved.) Many problems exist in which, due to computer
memory limitations, it is not practical to prescribe a spatial grid that adequately
resolves all boundary layers. Thus, one is led to the question of whether a given discretization scheme is accurate across an unresolved boundary layer. In particular, if
an optically thick, diffusive region is adjacent to a nondiffusive region, can anything
be said about the ability of a given discretization scheme to predict the changes in
the flux across an unresolved boundary layer between two such regions?
The asymptotic thick diffusion limit analysis does make it possible to study unresolved boundary layers; the conclusions so far are that no known differencing
scheme is completely adequate to model unresolved boundary layers accurately. For
example, LD methods are generally inaccurate in the first cell (containing the boundary layer) within the thick diffusive region, and they incorrectly predict that the flux
exiting the diffusive region is isotropic. Generally, to be certain that a discrete solution is accurate, all spatial boundary layers must be adequately resolved by the
spatial grid.
For charged-particle transport problems, which are optically thick and have
highly forward-peaked scattering, a more complicated asymptotic limit exists in which the total cross section $\Sigma_t = O(\varepsilon^{-1})$ and the mean scattering cosine $\bar{\mu}_0 = 1 - O(\varepsilon)$. As $\varepsilon \to 0$, the solution of the continuous transport equation limits to the solution of a Fokker–Planck equation (see the discussion in Section 1.3.2). Space-angle discretization schemes have also been successfully analyzed in this asymptotic
limit [113]. Ensuring that the discretized SN equations limit to a valid discretization
of the Fokker–Planck equation is primarily related to the treatment of anisotropic
scattering rather than the spatial differencing scheme. Nonetheless, the presence of
very large and very small eigenvalues in the spectrum of the angular Fokker–Planck
operator necessitates the use of accurate and robust spatial differencing schemes in
Fokker–Planck calculations.
In the thick diffusion and Fokker–Planck problems discussed above, it is generally impossible, given computer memory limitations, to use optically thin spatial
grids for the entire problem. To successfully simulate these problems, discretization
schemes must produce accurate solutions for optically thick spatial grids away from
boundary layers; and a theory is needed to justify the use of these schemes for these
problems. The discontinuous finite-element schemes (such as LD and its variants)
and the asymptotic (thick diffusion and Fokker–Planck) theories were developed to
deal with just these practical difficulties.
1.5 Advances in Angular Discretizations
Next we discuss (i) advances in SN discretizations for the angular derivative terms
that appear in curvilinear coordinate systems and (ii) improvements to the standard
SN treatment for highly anisotropic scattering. Perhaps surprisingly, very little has
been accomplished during the past 40 years to successfully reduce the classic ray
effects in SN simulations [21, 24].
1.5.1 Angular Derivatives
The transport equation in curvilinear geometries contains one or more angular
derivatives, in addition to spatial derivative terms. The traditional technique for
treating the angular derivative term in the 1-D spherical geometry equation, which
is representative of the traditional treatment used in essentially all curvilinear geometries, is described in Section 1.2. This technique is characterized by:
- The use of special $\beta$-coefficients to represent the quantity $(1 - \mu^2)$ at each angular cell edge (see Eq. (1.23a))
- The use of the diamond-in-angle relationship to express each cell-average angular flux in terms of the adjacent cell-edge angular fluxes (see Eq. (1.24))
- The use of a starting-direction flux equation to obtain initial values for the angular flux at $\mu = -1$ (see Eq. (1.25))
This treatment has a deficiency, known as the discrete-ordinates flux dip, which
consists of an erroneous suppression in the flux at the center of a sphere. Although
the existence of the flux dip was recognized in the early 1960s, it was not eliminated
until the early 1980s.
Three features of the original method contributed to the existence of the flux dip:
- The starting-direction flux equation is a slab-geometry equation, but this was originally put in the following curvilinear-like form before being spatially discretized [22]:
$$-\frac{d}{dr}\left(r^2 \psi_{1/2}\right) + 2r\,\psi_{1/2} + \Sigma_t(r)\, r^2\, \psi_{1/2} = r^2 \sum_{n'=1}^{N} \Sigma_s(r, -1, \mu_{n'})\,\psi_{n'}(r)\, w_{n'} + \frac{r^2 Q}{2}. \tag{1.92}$$
This was thought to make the discretization for the starting-direction flux equation consistent with that of the other directions. However, it actually contributed to truncation errors that enhanced the flux dip.
- A boundary condition corresponding to specular reflection was used at the center of the sphere, even though the angular flux at the center of a sphere is rigorously isotropic and equal to the starting-direction flux. This incorrectly allowed the angular flux at $r = 0$ to be anisotropic.
- The diamond-in-angle equation is inconsistent with the location of the quadrature cosines within each angular bin. As a result, the diamond-in-angle scheme does not preserve solutions that are linear in $\mu$.
In the late 1970s, it was proposed that the slab-geometry form of the starting-direction flux equation be discretized rather than the curvilinear-like form, and that all of the angular fluxes at the center of the sphere be set equal to the starting-direction flux value [26]. These two steps significantly reduced the severity of the flux dip. In the early 1980s, an angular weighted-diamond equation was proposed that related the angular edge and average fluxes in a manner consistent with the location of the cosine in each angular bin [38]:
$$\psi_n(r) = \frac{\mu_n - \mu_{n-1/2}}{w_n}\,\psi_{n+1/2}(r) + \frac{\mu_{n+1/2} - \mu_n}{w_n}\,\psi_{n-1/2}(r). \tag{1.93}$$
When all three of these measures were combined, the resulting angular discretization scheme eliminated the flux dip [38]. This scheme has been generalized to 2-D
cylindrical geometry [38].
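A small sketch of Eq. (1.93) follows. The angular-bin edges are built from the quadrature weights ($\mu_{1/2} = -1$, $\mu_{n+1/2} = \mu_{n-1/2} + w_n$), a standard convention assumed here for illustration. The two interpolation coefficients sum to unity, so any flux that is linear in $\mu$ is reproduced exactly – the property the diamond-in-angle scheme lacked.

```python
import numpy as np

# Minimal sketch of the angular weighted-diamond relation, Eq. (1.93).
# Bin edges are built from the quadrature weights (an assumed convention):
# mu_{1/2} = -1, mu_{n+1/2} = mu_{n-1/2} + w_n, so mu_{N+1/2} = +1.

mu, w = np.polynomial.legendre.leggauss(8)                # example quadrature set
mu_half = np.concatenate(([-1.0], -1.0 + np.cumsum(w)))  # angular-bin edges

def angular_wd(psi_edge_lo, psi_edge_hi, n):
    """Eq. (1.93): cell-average flux of angular bin n from its edge fluxes."""
    a = (mu[n] - mu_half[n]) / w[n]        # weight on psi_{n+1/2}
    b = (mu_half[n + 1] - mu[n]) / w[n]    # weight on psi_{n-1/2}; a + b = 1
    return a * psi_edge_hi + b * psi_edge_lo

# A flux linear in mu is preserved exactly:
psi_lin = lambda m: 2.0 + 3.0 * m
print(angular_wd(psi_lin(mu_half[0]), psi_lin(mu_half[1]), 0), psi_lin(mu[0]))
```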
Very few practical improvements beyond the elimination of the flux dip have
been made in SN angular derivative treatments. Discontinuous finite-element discretizations might have been expected to have had an impact, but this has not
happened, partly because it is difficult to develop a discontinuous angular finite-element method that is compatible with the standard SN method in multidimensional geometries.
It is interesting that discontinuous angular derivative treatments do not require
a starting-direction flux. This would appear to be an advantage, but one of the
few linear-discontinuous SN angular derivative treatments ever developed for the 1-D spherical geometry equation was found to be less accurate than the weighted-diamond scheme (Eq. (1.93)) for a series of test problems [60]. A reason for this is that the starting-direction flux is computed (by Eq. (1.25)) with greater accuracy than the other directions; hence, significant accuracy is actually lost if the starting-direction flux plays no role in the angular derivative treatment. However, superior accuracy relative to the weighted-diamond scheme was obtained by using a quadratic-continuous approximation in the first angular cell and a linear-discontinuous approximation in the remaining angular cells [60]. All of these factors make it challenging to develop advanced SN angular derivative treatments [100].
1.5.2 Anisotropic Scattering
The standard SN treatment for the scattering source, which is based on a Legendre polynomial expansion for the scattering cross section in conjunction with
quadrature-generated spherical-harmonic moments of the angular flux, is still the
workhorse for modern discrete-ordinates calculations, even though it is not always
satisfactory. There are several reasons why it remains in widespread use:
- Fundamentally different approaches usually require significant processing of raw cross-section data.
- Such techniques often have memory requirements that are significantly larger than those of the standard treatment.
- The standard technique is often much more accurate than one would expect, even when highly truncated cross-section expansions are used in a calculation.
Next, we describe the standard method, together with an improvement that has had a notable impact on charged-particle calculations [54]. For simplicity, we consider the monoenergetic 1-D slab-geometry scattering source denoted by $S$:
$$S(x, \mu) = \int_{-1}^{+1} \sum_{m=0}^{\infty} \frac{2m+1}{2}\, P_m(\mu)\, P_m(\mu')\, \Sigma_{s,m}\, \psi(x, \mu')\, d\mu'. \tag{1.94}$$
We assume that the angular flux is a Legendre series of degree $L$,
$$\psi(x, \mu) = \sum_{m=0}^{L} \frac{2m+1}{2}\, P_m(\mu)\, \phi_m(x), \tag{1.95}$$
where
$$\phi_m(x) = \int_{-1}^{+1} P_m(\mu)\, \psi(x, \mu)\, d\mu. \tag{1.96}$$
Substituting Eq. (1.95) into Eq. (1.94) and using the orthogonality of the Legendre polynomials, we find that the scattering source is
$$S(x, \mu) = \sum_{m=0}^{L} \frac{2m+1}{2}\, P_m(\mu)\, \Sigma_{s,m}\, \phi_m(x). \tag{1.97}$$
(1.97)
Thus, the scattering source generated by an angular flux that is a Legendre series
of degree L is itself a Legendre series of degree no higher than L. Furthermore,
the only cross-section information appearing in the scattering source is the first
$L + 1$ moments of the scattering cross section. This same result is obtained if
a cross-section expansion of degree L is used, rather than an exact expansion of
infinite degree. In this case, the convergence of the cross-section expansion is irrelevant. This powerful result is not widely appreciated. An analogous result holds
for multidimensional calculations when the angular flux takes the form of a finite
spherical-harmonic expansion. These results follow from the fact that the spherical-harmonic functions (which include the Legendre polynomials) are eigenfunctions of the Boltzmann scattering operator.
We now discuss how this property impacts SN calculations. For simplicity, we consider the 1-D slab-geometry scattering source. Assuming that an $N$-point angular quadrature set is used in conjunction with a cross-section expansion of degree $N - 1$, the SN scattering source takes the following form:
$$S_n(x) = \sum_{m=0}^{N-1} \frac{2m+1}{2}\, P_m(\mu_n)\, \Sigma_{s,m}\, \phi_m(x), \tag{1.98}$$
where
$$\phi_m(x) = \sum_{n=1}^{N} P_m(\mu_n)\, \psi_n(x)\, w_n. \tag{1.99}$$
We assume a Gauss–Legendre quadrature set. With $N$ quadrature points, one can uniquely interpolate those points with a polynomial of degree $N - 1$. Furthermore, since an $N$-point Gauss–Legendre set exactly integrates polynomials of degree $\le 2N - 1$ [19], the Legendre moments in Eq. (1.99) are exactly the moments of the
interpolatory polynomial. Considering our previous results regarding the scattering
source for a polynomial angular flux representation, we see that the discrete scattering source values given in Eq. (1.98) are exactly those of the scattering source
generated with the polynomial interpolation for the angular flux. Thus, if the true
angular flux is well-represented by the polynomial interpolation of the discrete angular flux values, the true scattering source will similarly be well-represented by the
polynomial interpolation of the discrete scattering source values.
We again stress that this is true regardless of the convergence of the cross-section
expansion. This property does not guarantee positive discrete scattering source
values, given positive discrete angular flux values, because the polynomial interpolation of the discrete angular fluxes can be negative at some points, even though
the discrete values themselves are positive. However, since polynomial interpolation
at the Gauss points is known to be stable, any negativities in the angular flux interpolation will be small relative to the maximum discrete angular flux value. Therefore,
any negativities in the discrete scattering source values will also be small relative
to the maximum discrete scattering source value. Hence, accurate SN solutions for
angle-integrated quantities can be obtained in a wide variety of problems in 1-D
slab geometry with highly anisotropic scattering using Gauss–Legendre quadrature,
even if the scattering cross-section expansion is highly truncated.
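The following sketch verifies this property numerically: the discrete fluxes are sampled from a polynomial of degree $N - 1$ with prescribed Legendre moments, and the quadrature moments of Eq. (1.99) recover those moments to machine precision. The flux moments used are arbitrary test values.

```python
import numpy as np
from numpy.polynomial import legendre as L

# Minimal sketch: with an N-point Gauss-Legendre set, the moments of Eq. (1.99)
# are exactly the Legendre moments of the degree-(N-1) polynomial interpolating
# the discrete angular fluxes (exact for P_m * psi of degree <= 2N - 2).

N = 8
mu, w = L.leggauss(N)
phi_exact = np.arange(1.0, N + 1.0)              # chosen test flux moments
coef = (2 * np.arange(N) + 1) / 2 * phi_exact    # psi(mu) = sum_m (2m+1)/2 P_m phi_m
psi = L.legval(mu, coef)                         # discrete angular fluxes

# Eq. (1.99): phi_m = sum_n P_m(mu_n) psi_n w_n
P = np.array([L.legval(mu, np.eye(N)[m]) for m in range(N)])  # P[m,n] = P_m(mu_n)
phi_quad = P @ (psi * w)

print(np.max(np.abs(phi_quad - phi_exact)))      # ~1e-14: all N moments exact
```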
If a Gauss–Legendre quadrature set is not used, some of the scattering source
moments of the interpolatory polynomial will be properly computed, but others will
not, depending on the accuracy of the quadrature set. It can be seen from Eq. (1.98)
that the mth moment of the scattering source is just the product of the mth moment
of the scattering cross section and the mth moment of the angular flux. Thus, any
flux moment that is erroneous yields a corresponding scattering source moment that is erroneous. This deficiency can be treated by generating a separate set of quadrature weights for each moment. In particular, for each $0 \le m \le N - 1$, one can generate $N$ weights, $\{w_{m,n}\}_{n=1}^{N}$, that are defined by the $N$ linear equations
$$\sum_{n=1}^{N} P_m(\mu_n)\, P_j(\mu_n)\, w_{m,n} = \frac{2}{2m+1}\,\delta_{mj}, \qquad 0 \le j \le N - 1. \tag{1.100}$$
Then Eq. (1.99) is replaced by
$$\phi_m(x) = \sum_{n=1}^{N} P_m(\mu_n)\, \psi_n(x)\, w_{m,n}. \tag{1.101}$$
This method gives the desirable properties of Gauss quadrature to non-Gauss quadrature for the purpose of calculating the scattering source. (However, there is no guarantee that the weights generated in this way will be positive.)
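A minimal sketch of generating the moment-dependent weights by solving the linear systems of Eq. (1.100) is given below; the equally spaced ordinates are an arbitrary non-Gauss example (for which the systems happen to be nonsingular). As noted later in the text, the same result can be obtained by inverting the moment-to-discrete matrix.

```python
import numpy as np
from numpy.polynomial import legendre as L

# Minimal sketch: moment-dependent weights of Eq. (1.100) for a non-Gauss set
# of directions (equally spaced interior cosines, an illustrative choice).

N = 6
mu = np.linspace(-1, 1, N + 2)[1:-1]                          # non-Gauss ordinates
P = np.array([L.legval(mu, np.eye(N)[m]) for m in range(N)])  # P[m,n] = P_m(mu_n)

W = np.empty((N, N))                              # W[m,n] = w_{m,n}
for m in range(N):
    A = P * P[m]                                  # A[j,n] = P_m(mu_n) P_j(mu_n)
    rhs = np.zeros(N); rhs[m] = 2.0 / (2 * m + 1) # (2/(2m+1)) delta_mj
    W[m] = np.linalg.solve(A, rhs)                # Eq. (1.100)

# Check: with these weights, D = P*W is the inverse of the standard M matrix.
D = P * W                                         # D[m,n] = P_m(mu_n) w_{m,n}
M = (2 * np.arange(N) + 1) / 2 * P.T              # M[n,m] = (2m+1)/2 P_m(mu_n)
print(np.max(np.abs(D @ M - np.eye(N))))          # ~1e-13
```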
This is one variant of a more general technique known as Galerkin quadrature [54]. To present the more general method, we reexpress the standard SN technique for calculating the scattering source in terms of matrix algebra. In particular, we write Eqs. (1.98) and (1.99) as follows:
$$\vec{S} = \mathbf{M}\,\boldsymbol{\Sigma}\,\mathbf{D}\,\vec{\psi}, \tag{1.102}$$
where $\vec{\psi}$ is the vector of discrete angular flux values,
$$\vec{\psi} \equiv \left(\psi_1, \psi_2, \ldots, \psi_N\right)^T, \tag{1.103}$$
$\mathbf{D}$ is the $N \times N$ matrix
$$D_{m,n} \equiv P_m(\mu_n)\, w_n, \tag{1.104}$$
$\boldsymbol{\Sigma}$ is the $N \times N$ diagonal matrix
$$\boldsymbol{\Sigma} \equiv \mathrm{diag}\left(\Sigma_0, \Sigma_1, \Sigma_2, \ldots\right), \tag{1.105}$$
and $\mathbf{M}$ is the $N \times N$ matrix
$$M_{n,m} \equiv \frac{2m+1}{2}\, P_m(\mu_n). \tag{1.106}$$
The discrete-to-moment matrix D maps a vector of discrete angular flux values
to a corresponding vector of Legendre flux moments. We note from Eq. (1.104)
that the first row of this matrix consists of the standard quadrature weights, because $P_0(\mu) = 1$. The matrix $\boldsymbol{\Sigma}$ is the scattering matrix in the Legendre basis, or equivalently, the scattering matrix for the $P_{N-1}$ approximation. It maps a vector
of Legendre flux moments to a corresponding vector of Legendre scattering source
moments. The moment-to-discrete matrix $\mathbf{M}$ maps a vector of Legendre scattering source moments to a corresponding vector of discrete scattering source values. Using the orthogonality property of the Legendre polynomials, one can show that with Gauss quadrature, $\mathbf{M} = \mathbf{D}^{-1}$:
$$(\mathbf{D}\mathbf{M})_{i,j} = \sum_{k=1}^{N} D_{i,k}\, M_{k,j} = \sum_{k=1}^{N} P_i(\mu_k)\, \frac{2j+1}{2}\, P_j(\mu_k)\, w_k = \delta_{i,j}. \tag{1.107}$$
Thus, using Eq. (1.107), we can reexpress Eq. (1.102) as follows:
$$\vec{S} = \mathbf{D}^{-1}\,\boldsymbol{\Sigma}\,\mathbf{D}\,\vec{\psi}. \tag{1.108}$$
Equation (1.108) shows that the SN scattering matrix represents a similarity transformation of the Legendre scattering matrix, $\boldsymbol{\Sigma}$. This means that the standard SN scattering source with Gauss quadrature (and a Legendre cross-section expansion of degree $N - 1$) is equivalent to the scattering source of the $P_{N-1}$ approximation. This is to be expected, considering the well-known equivalence between the SN and PN approximations in 1-D slab geometry [71]. If Gauss quadrature is not used, then $\mathbf{M} \ne \mathbf{D}^{-1}$, which is an undesirable result. The matrix $\mathbf{D}$ maps a vector of $N$ discrete function values to $N$ Legendre moments, and the matrix $\mathbf{M}$ maps a vector of $N$ Legendre moments to $N$ discrete function values. One can uniquely define a polynomial of degree $N - 1$ either in terms of $N$ Legendre moments or in terms of $N$ discrete function values at $N$ distinct points. Therefore, $\mathbf{D}$ and $\mathbf{M}$ should be inverses of one another. The moment-dependent weights defined in Eq. (1.100) ensure that this will be the case. We note that it is not necessary to actually generate the moment-dependent weights; one can directly obtain the correct matrix $\mathbf{D}$ simply by calculating the inverse of $\mathbf{M}$.
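These matrix relations are easy to verify numerically. The sketch below builds $\mathbf{D}$ and $\mathbf{M}$ from Eqs. (1.104) and (1.106) for a Gauss set, checks Eq. (1.107), and applies the scattering operator of Eq. (1.102); the cross-section moments are test values, not data from the text.

```python
import numpy as np
from numpy.polynomial import legendre as L

# Minimal sketch of Eqs. (1.102)-(1.108).  For Gauss quadrature the standard
# matrices already satisfy D = M^{-1}; for any other set of directions one
# keeps M and instead defines D = M^{-1} (the Galerkin prescription).

N = 8
mu, w = L.leggauss(N)
P = np.array([L.legval(mu, np.eye(N)[m]) for m in range(N)])  # P[m,n] = P_m(mu_n)

D = P * w                                  # Eq. (1.104): D_mn = P_m(mu_n) w_n
M = (2 * np.arange(N) + 1) / 2 * P.T       # Eq. (1.106): M_nm = (2m+1)/2 P_m(mu_n)
print(np.max(np.abs(D @ M - np.eye(N))))   # Eq. (1.107): ~1e-14 for a Gauss set

Sig = np.diag(0.9 ** np.arange(N))         # Eq. (1.105): test cross-section moments
psi = np.random.rand(N)
S = M @ Sig @ D @ psi                      # Eq. (1.102); equals D^{-1} Sig D psi here

# For a non-Gauss direction set, replace D above by np.linalg.inv(M).
```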
This Galerkin quadrature method is useful for 1-D calculations when quadratures with special directions are desired. For example, Lobatto and double-Radau quadrature sets, which have quadrature points at $\mu = \pm 1$, are particularly useful for simulating a normally incident plane-wave of radiation [54].
In 2-D and 3-D, the Galerkin quadrature method is based on spherical-harmonic interpolation of the discrete angular fluxes rather than polynomial interpolation. Choosing the correct spherical harmonics for interpolation is more complicated in multidimensions, because the number of spherical-harmonic functions of order $N-1$ does not equal the number of discrete directions in a multidimensional $S_N$ quadrature set. Nonetheless, suitable interpolation functions have been defined for triangular quadrature sets [54]. The Galerkin quadrature method in 2-D and 3-D can be much more accurate than the standard quadrature method with highly anisotropic scattering, because there is no analog of Gauss quadrature in 2-D and 3-D; i.e., there is no 2-D or 3-D quadrature set that will exactly calculate all of the spherical-harmonic moments of the interpolated angular flux. In fact, fewer than half of the moments are exactly calculated with typical sets, e.g., even-moment symmetric sets [14].
The Galerkin quadrature method can also accommodate nonpolynomial or nonspherical-harmonic interpolation functions [54]. To demonstrate this in 1-D, we consider a general interpolatory basis set for a given set of $N$ discrete directions:

$$\psi(x,\mu) = \sum_{n=1}^{N} \psi_n(x)\,B_n(\mu), \tag{1.109}$$

where

$$B_i(\mu_j) = \delta_{ij}. \tag{1.110}$$

Multiplying Eq. (1.109) by $P_m(\mu)$ and integrating over all directions, we obtain

$$\phi_m(x) = \sum_{n=1}^{N} \psi_n(x)\int_{-1}^{+1} P_m(\mu)\,B_n(\mu)\,d\mu. \tag{1.111}$$

It follows from Eq. (1.111) and the definition of the discrete-to-moment matrix that the components of $\mathbf{D}$ are

$$D_{m,n} = \int_{-1}^{+1} P_m(\mu)\,B_n(\mu)\,d\mu. \tag{1.112}$$

Equation (1.112) is valid for all types of interpolation functions, including polynomials. We note that the first row of the discrete-to-moment matrix consists of standard quadrature weights that are exact for integrating the interpolated angular flux:

$$\phi(x) = \int_{-1}^{+1} \psi(x,\mu)\,d\mu = \sum_{n=1}^{N} \psi_n\,w_n, \tag{1.113}$$

where

$$w_n = \int_{-1}^{+1} B_n(\mu)\,d\mu. \tag{1.114}$$

These are called the companion quadrature weights. Nonpolynomial interpolation requires much more computational effort to generate the discrete-to-moment matrix, because the interpolatory basis functions must be explicitly formed and their products with the Legendre polynomials must be integrated. Also, one must invert the discrete-to-moment matrix to obtain the moment-to-discrete matrix, because the standard $S_N$ expression for the moment-to-discrete matrix, Eq. (1.106), is correct only for polynomial interpolation.

We refer to the scattering source obtained by operating on the interpolated angular flux with the exact scattering kernel as the exact interpolation-generated scattering source. When the interpolation functions are nonpolynomial, the exact interpolation-generated scattering source is generally not expressible in terms of the interpolation functions. Thus, if the discrete scattering source values obtained from the Galerkin quadrature method are interpolated, one generally does not obtain
the exact interpolation-generated scattering source. Rather, one obtains a scattering
source that has the same Legendre moments of degree 0 through $N-1$ as the exact
interpolation-generated scattering source [54].
As an example of a useful nonpolynomial interpolation scheme, we consider a linear-discontinuous angular trial space in 1-D spherical geometry. Such a trial space is fully compatible with a linear-discontinuous treatment for the angular derivative term [60]. An “$S_N$” trial space of this type is defined to consist of $N/2$ equal-width piecewise-linear segments in $\mu$, where $N$ is even and $N > 2$. There are two discrete angular flux unknowns per segment, located at the local Gauss $S_2$ quadrature points, i.e., the points corresponding to $\pm 1/\sqrt{3}$ obtained by linearly mapping $[-1,+1]$ onto each segment. A Galerkin quadrature set is generated for this trial space by exactly evaluating the Legendre angular flux moments of degree 0 through $N-1$ associated with the linear-discontinuous interpolation of the $N$ discrete flux values [60]. The companion quadrature set corresponding to the Galerkin set, i.e., the standard quadrature set having the same quadrature points as the Galerkin set with quadrature weights that exactly integrate the interpolated angular flux representation, corresponds to a local Gauss $S_2$ set on each linear segment. Since each local Gauss set exactly integrates cubic polynomials, it follows that the companion set will exactly evaluate the zeroth, first, and second Legendre moments of the interpolated angular flux. However, all higher flux moments will be evaluated inexactly, regardless of the quadrature order $N$. This is in contrast to the Galerkin quadrature set of order $N$, which always exactly evaluates the Legendre angular flux moments of degree 0 through $N-1$. Furthermore, because the companion quadrature set never exactly integrates polynomials of degree greater than 3, one cannot use a Legendre cross-section expansion of degree greater than 3 with the companion quadrature set (otherwise particle conservation will be lost). Thus, the accuracy of the scattering source with highly anisotropic scattering can be greatly improved for linear-discontinuous angular trial spaces in 1-D by using Galerkin quadrature. This enables one to use a linear-discontinuous approximation for the angular derivative term in 1-D spherical geometry in conjunction with an accurate treatment for highly anisotropic scattering.
Perhaps the most important property of the Galerkin quadrature method, independent of the type of functions used to interpolate the discrete angular flux values, is that straight-ahead delta-function scattering is treated exactly. This has a very strong impact on charged-particle calculations, because it enables the total scattering cross section to be dramatically reduced (with an attendant decrease in the scattering ratio) while leaving the $S_N$ solution invariant. To demonstrate how straight-ahead scattering is treated exactly, let us consider the following differential scattering cross section:

$$\Sigma_s(\mu_0) = \alpha\,\delta(\mu_0 - 1), \tag{1.115}$$

where $\alpha$ is an arbitrary constant. The Boltzmann scattering operator associated with this cross section is $\alpha$ times the identity operator:

$$S\psi = \int_{-1}^{+1} \alpha\,\delta(\mu_0 - 1)\,\psi(\mu')\,d\mu' = \alpha\,\psi(\mu). \tag{1.116}$$
Furthermore, the Legendre moments of this cross section are all equal to $\alpha$:

$$\Sigma_m = \alpha\int_{-1}^{+1}\delta(\mu_0 - 1)\,P_m(\mu_0)\,d\mu_0 = \alpha\,P_m(1) = \alpha, \qquad 0 \le m < \infty. \tag{1.117}$$

Thus, the diagonal matrix of cross-section moments used to construct the vector of discrete scattering source values is $\alpha$ times the identity matrix:

$$\boldsymbol{\Sigma} = \alpha\,\mathbf{I}. \tag{1.118}$$

Substituting from Eq. (1.118) into Eq. (1.108), and recognizing that $\mathbf{M} = \mathbf{D}^{-1}$, we obtain

$$\vec{S} = \mathbf{M}\,\alpha\mathbf{I}\,\mathbf{D}\,\vec{\psi} = \alpha\,\mathbf{M}\mathbf{D}\,\vec{\psi} = \alpha\,\vec{\psi}, \tag{1.119}$$

which agrees with Eq. (1.116). We have explicitly considered only the 1-D case, but this result also applies in multidimensions.

For charged particles, the scattering ratio for each group is generally very close to unity, and the mean free path is very small; nonetheless, the transport process is not diffusive. This is because the “transport-corrected” scattering ratio, $(\Sigma_0 - \Sigma_1)/\Sigma_t$, is not close to unity. Within-group straight-ahead scattering is equivalent to no scattering at all, since the particle scatters into the same group and direction it had before the scattering. Thus, one can add or subtract a straight-ahead differential scattering cross section from any physically correct within-group cross section without changing the analytic transport solution. Since all Galerkin quadratures treat straight-ahead scattering exactly, one can subtract the truncated expansion of a within-group straight-ahead scattering cross section from the physically correct cross-section expansion without changing the $S_N$ solution.
For instance, let us consider the total Boltzmann scattering operator (outscatter minus inscatter) associated with a 1-D $S_N$ Galerkin quadrature:

$$\Sigma_0\,\vec{\psi} - \vec{S} = \big[\mathbf{M}\,\Sigma_0\mathbf{I}\,\mathbf{D} - \mathbf{M}\,\boldsymbol{\Sigma}\,\mathbf{D}\big]\vec{\psi} = \mathbf{M}\big[\Sigma_0\mathbf{I} - \boldsymbol{\Sigma}\big]\mathbf{D}\,\vec{\psi} = \mathbf{M}\,\mathrm{diag}\big(\Sigma_0-\Sigma_0,\ \Sigma_0-\Sigma_1,\ \ldots,\ \Sigma_0-\Sigma_{N-1}\big)\,\mathbf{D}\,\vec{\psi}. \tag{1.120}$$

Subtracting the delta-function cross section given in Eq. (1.115) from the physically correct cross section, we obtain the following modified outscatter term,

$$\Sigma_0^{*} = \Sigma_0 - \alpha, \tag{1.121}$$

and the following modified inscatter matrix,

$$\big(\mathbf{M}\,\boldsymbol{\Sigma}\,\mathbf{D}\big)^{*} = \mathbf{M}\,\boldsymbol{\Sigma}^{*}\,\mathbf{D}, \tag{1.122}$$

where

$$\boldsymbol{\Sigma}^{*} = \mathrm{diag}\big(\Sigma_0-\alpha,\ \Sigma_1-\alpha,\ \ldots,\ \Sigma_{N-1}-\alpha\big). \tag{1.123}$$
We note that while the outscatter and inscatter terms are modified by subtraction of the straight-ahead scattering cross section, the total Boltzmann scattering operator of Eq. (1.120) does not change, i.e.,

$$\Sigma_0^{*}\,\vec{\psi} - \mathbf{M}\,\boldsymbol{\Sigma}^{*}\,\mathbf{D}\,\vec{\psi} = \Sigma_0\,\vec{\psi} - \mathbf{M}\,\boldsymbol{\Sigma}\,\mathbf{D}\,\vec{\psi}. \tag{1.124}$$

Thus, the $S_N$ solution does not change. However, the convergence properties of the source iteration process can be dramatically changed. A discussion of the optimal choice of $\alpha$ is beyond the scope of this review, but the traditional choice (which is nearly optimal) is to set $\alpha = \Sigma_{N-1}$. This extended transport correction can greatly reduce both the total scattering cross section and the scattering ratio in relativistic charged-particle transport calculations [9, 27]. We note that the significance of this cross-section modification depends on the convergence of the cross-section expansion. If the expansion is essentially converged, $\Sigma_{N-1}$ will be very small relative to $\Sigma_0$, resulting in a negligible reduction in $\Sigma_0$; but if the cross-section expansion is highly truncated, $\Sigma_{N-1}$ will be comparable to $\Sigma_0$, resulting in a significant reduction in $\Sigma_0$. Acceptable computational efficiency often cannot be achieved without the use of the extended transport correction in charged-particle calculations.

Thus, with Galerkin quadrature, the extended transport correction (correctly) leaves the $S_N$ solution invariant. This is a powerful motivation for using the Galerkin method in charged-particle calculations.
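The invariance expressed by Eq. (1.124) is easy to check numerically. The sketch below builds Gauss-quadrature $\mathbf{M}$ and $\mathbf{D}$ matrices (a special case of Galerkin quadrature), applies the extended transport correction with $\alpha$ equal to the last retained moment, and verifies that the total scattering operator is unchanged; the moment values are hypothetical, chosen to mimic forward-peaked scattering.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

N = 8
mu, w = leggauss(N)
M = np.array([[(2 * m + 1) / 2 * Legendre.basis(m)(x) for m in range(N)] for x in mu])
D = np.array([[Legendre.basis(m)(x) * wx for x, wx in zip(mu, w)] for m in range(N)])

sig = 1.0 / (1.0 + np.arange(N))        # hypothetical moments Sigma_0 .. Sigma_{N-1}
alpha = sig[-1]                          # extended transport correction

B  = sig[0] * np.eye(N) - M @ np.diag(sig) @ D          # outscatter minus inscatter
Bs = (sig[0] - alpha) * np.eye(N) - M @ np.diag(sig - alpha) @ D
assert np.allclose(B, Bs)                # Eq. (1.124): the S_N operator is invariant
```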
1.6 Advances in Fokker–Planck Discretizations
Next, we discuss advances in Fokker–Planck angle and energy discretizations for
charged-particle transport. We first consider the continuous-scattering operator, and
then the continuous-slowing-down operator.
1.6.1 The Continuous-Scattering Operator
The continuous-scattering operator $\mathcal{C}$ in 1-D slab geometry is

$$\mathcal{C}\psi(x,\mu,E) = \frac{\Sigma_{r,\mathrm{tr}}}{2}\,\frac{\partial}{\partial\mu}\left[\big(1-\mu^2\big)\,\frac{\partial}{\partial\mu}\psi(x,\mu,E)\right]. \tag{1.125}$$

Taking the zeroth and first angular moments of Eq. (1.125), we obtain

$$\int_{-1}^{+1}\mathcal{C}\psi(x,\mu)\,d\mu = 0 \tag{1.126}$$
and

$$\int_{-1}^{+1}\mu\,\mathcal{C}\psi(x,\mu)\,d\mu = -\Sigma_{r,\mathrm{tr}}\,J(x), \tag{1.127}$$

respectively, where $J(x)$ is the current

$$J(x) = \int_{-1}^{+1}\mu\,\psi(x,\mu)\,d\mu. \tag{1.128}$$

It is highly desirable for a numerical approximation to the continuous-scattering operator to preserve both the zeroth and first angular moments of that operator. Also, since the continuous-scattering operator is a diffusion operator on the unit sphere (see Section 1.3.2), it is highly desirable that the discretization of this operator yield a coefficient matrix that is symmetric and monotone. These two properties ensure that the matrix (like the analytic operator) will have only positive real eigenvalues and will yield positive solutions given positive sources.
A straightforward discretization of Eq. (1.125) is

$$\big(\mathcal{C}\psi\big)_n = \frac{\Sigma_{r,\mathrm{tr}}}{2}\,\frac{1}{w_n}\left[\big(1-\mu_{n+1/2}^2\big)\,\frac{\psi_{n+1}-\psi_n}{\mu_{n+1}-\mu_n} - \big(1-\mu_{n-1/2}^2\big)\,\frac{\psi_n-\psi_{n-1}}{\mu_n-\mu_{n-1}}\right], \tag{1.129}$$

where

$$\mu_{n+1/2} = \mu_{n-1/2} + w_n, \qquad 1 \le n \le N, \qquad \mu_{1/2} = -1. \tag{1.130}$$

This discretization results in a symmetric and monotone coefficient matrix and preserves Eq. (1.126) under numerical integration, but it preserves Eq. (1.127) only if each quadrature point lies at the center of its associated angular interval. (As previously noted, this never occurs with standard quadrature sets.) A discretization that does preserve Eq. (1.127) with standard quadrature sets is [40]

$$\big(\mathcal{C}\psi\big)_n = \frac{\Sigma_{r,\mathrm{tr}}}{2}\,\frac{1}{w_n}\left[\beta_{n+1/2}\,\frac{\psi_{n+1}-\psi_n}{\mu_{n+1}-\mu_n} - \beta_{n-1/2}\,\frac{\psi_n-\psi_{n-1}}{\mu_n-\mu_{n-1}}\right], \tag{1.131}$$

where the $\beta$-coefficients are defined by Eq. (1.23b) and all else remains as previously defined. Thus the $\beta$-coefficients used to admit the constant solution in the discretization of the angular derivative term in the spherical-geometry transport equation are also used in the discretization of the continuous-scattering operator to preserve Eq. (1.127).
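A minimal sketch of Eq. (1.131) follows. It assumes the $\beta$-coefficients of Eq. (1.23b) obey the standard recursion $\beta_{n+1/2} = \beta_{n-1/2} - 2\mu_n w_n$ with $\beta_{1/2} = 0$ (which makes the discrete operator exact for $\psi = \mu$); under that assumption, both moment conditions, Eqs. (1.126) and (1.127), hold to roundoff for an arbitrary test flux.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

N = 8
mu, w = leggauss(N)
beta = np.zeros(N + 1)                    # beta[n] stores beta_{n-1/2}
for n in range(N):
    beta[n + 1] = beta[n] - 2.0 * mu[n] * w[n]   # assumed form of Eq. (1.23b)

def fp_apply(psi, sig_rtr=1.0):
    """Apply (C psi)_n of Eq. (1.131); the (1 - mu^2)-like flux vanishes at mu = +/-1."""
    out = np.zeros(N)
    for n in range(N):
        up = beta[n + 1] * (psi[n + 1] - psi[n]) / (mu[n + 1] - mu[n]) if n < N - 1 else 0.0
        dn = beta[n] * (psi[n] - psi[n - 1]) / (mu[n] - mu[n - 1]) if n > 0 else 0.0
        out[n] = 0.5 * sig_rtr * (up - dn) / w[n]
    return out

psi = np.exp(mu)                          # arbitrary smooth test flux
Cpsi = fp_apply(psi)
J = np.sum(w * mu * psi)
assert abs(np.sum(w * Cpsi)) < 1e-12             # Eq. (1.126): zeroth moment = 0
assert abs(np.sum(w * mu * Cpsi) + J) < 1e-12    # Eq. (1.127): first moment = -J
```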
This moment-preserving approach has been extended to multidimensions for product quadratures [120]. For instance, in three dimensions, the continuous-scattering operator is

$$\mathcal{C}\psi = \frac{\Sigma_{r,\mathrm{tr}}}{2}\left[\frac{\partial}{\partial\mu}\big(1-\mu^2\big)\frac{\partial\psi}{\partial\mu} + \frac{1}{1-\mu^2}\,\frac{\partial^2\psi}{\partial\omega^2}\right]. \tag{1.132}$$
Taking the zeroth and first angular moments of Eq. (1.132), we obtain

$$\int_0^{2\pi}\!\!\int_{-1}^{+1}\mathcal{C}\psi\;d\mu\,d\omega = 0 \tag{1.133}$$

and

$$\int_0^{2\pi}\!\!\int_{-1}^{+1}\vec{\Omega}\,\mathcal{C}\psi\;d\mu\,d\omega = -\Sigma_{r,\mathrm{tr}}\,\vec{J}, \tag{1.134}$$

respectively, where $\vec{J}$ is the current

$$\vec{J} = \int_0^{2\pi}\!\!\int_{-1}^{+1}\vec{\Omega}\,\psi\;d\mu\,d\omega. \tag{1.135}$$
Standard triangular $S_N$ quadrature sets do not represent a rectangular angular mesh on the unit sphere, but product sets do. Hence, it is reasonably straightforward to derive a discretization for the continuous-scattering operator assuming a product quadrature set. An $S_N$ product quadrature set has $2N^2$ directions and is formed by the tensor product of an $N$-point quadrature defined over the polar cosine and a $2N$-point quadrature defined over the azimuthal angle. Each direction can be uniquely referenced in terms of a polar index $n$ and an azimuthal index $j$. A moment-preserving discretization for the 3-D continuous-scattering operator is

$$\big(\mathcal{C}\psi\big)_{n,j} = \frac{\Sigma_{r,\mathrm{tr}}}{2}\left[\frac{1}{w_n^p}\left(\beta_{n+1/2}\,\frac{\psi_{n+1,j}-\psi_{n,j}}{\mu_{n+1}-\mu_n} - \beta_{n-1/2}\,\frac{\psi_{n,j}-\psi_{n-1,j}}{\mu_n-\mu_{n-1}}\right) + \frac{\gamma_n}{w_j^a}\left(\frac{\psi_{n,j+1}-\psi_{n,j}}{\omega_{j+1}-\omega_j} - \frac{\psi_{n,j}-\psi_{n,j-1}}{\omega_j-\omega_{j-1}}\right)\right]. \tag{1.136}$$
Here, $w_n^p$ and $w_j^a$ are the weights associated with the polar and azimuthal quadratures, which sum to $2$ and $2\pi$, respectively; the $\beta$-coefficients are identical to those defined for the 1-D case; and

$$\gamma_n = \frac{\pi^2\,K_n}{2N\left(1-\cos\dfrac{\pi}{N}\right)}, \qquad 1 \le n \le N, \tag{1.137}$$

$$K_n = 2\big(1-\mu_n^2\big) + c_n\sqrt{1-\mu_n^2}, \tag{1.138}$$

$$c_n = \frac{\beta_{n+1/2}\,d_{n+1/2} - \beta_{n-1/2}\,d_{n-1/2}}{w_n^p}, \tag{1.139}$$

$$d_{n+1/2} = \frac{\sqrt{1-\mu_{n+1}^2} - \sqrt{1-\mu_n^2}}{\mu_{n+1}-\mu_n}. \tag{1.140}$$
The above discretization is defined only for product quadrature sets constructed with azimuthal quadrature sets of the Chebychev type:

$$\omega_j = \frac{(2j-1)\,\pi}{2N}, \qquad 1 \le j \le 2N, \tag{1.141}$$

$$w_j^a = \frac{\pi}{N}, \qquad 1 \le j \le 2N. \tag{1.142}$$

This discretization has a 5-point stencil, and is symmetric positive-definite (SPD) and monotone. It preserves Eqs. (1.133) and (1.134). The restriction to Chebychev azimuthal quadrature arises from the fact that three first-moment equations must be met, but the $\beta$-coefficients and the $\gamma$-coefficients can only be defined to meet two of them. The $\beta$-coefficients are defined to preserve the moment equation associated with the polar cosine, i.e., $\cos\theta$, and for a general quadrature set, the $\gamma$-coefficients can preserve one of the two moment equations associated with the direction cosines that depend on the azimuthal angle, i.e., $\sin\theta\cos\omega$ or $\sin\theta\sin\omega$. However, when a Chebychev azimuthal quadrature set is used, the $\gamma$-coefficients can be defined to preserve both azimuthal cosines.
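The zeroth-moment property, Eq. (1.133), is structural: the polar differences telescope because $\beta_{1/2} = \beta_{N+1/2} = 0$, and the azimuthal differences telescope around the periodic angle for any choice of $\gamma_n$. The sketch below assembles the stencil of Eq. (1.136) on a Chebychev product set, Eqs. (1.141)-(1.142), and confirms this; the $\gamma_n$ values are deliberately set to placeholder values of 1, and the $\beta$ recursion is the same assumed form used above.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

N = 4
mu, wp = leggauss(N)                                 # polar cosines/weights (sum to 2)
om = (2 * np.arange(1, 2 * N + 1) - 1) * np.pi / (2 * N)   # Eq. (1.141)
wa = np.full(2 * N, np.pi / N)                              # Eq. (1.142)
beta = np.concatenate(([0.0], np.cumsum(-2 * mu * wp)))     # assumed Eq. (1.23b)
gam = np.ones(N)                                            # placeholder gamma_n

psi = np.exp(np.outer(mu, np.cos(om)))               # arbitrary test flux psi[n, j]
C = np.zeros_like(psi)
for n in range(N):
    for j in range(2 * N):
        pol = (beta[n + 1] * (psi[n + 1, j] - psi[n, j]) / (mu[n + 1] - mu[n])
               if n < N - 1 else 0.0)
        pol -= (beta[n] * (psi[n, j] - psi[n - 1, j]) / (mu[n] - mu[n - 1])
                if n > 0 else 0.0)
        jp, jm = (j + 1) % (2 * N), (j - 1) % (2 * N)       # periodic in omega
        azi = gam[n] * ((psi[n, jp] - psi[n, j]) - (psi[n, j] - psi[n, jm])) \
              / (np.pi / N)                                  # uniform omega spacing
        C[n, j] = 0.5 * (pol / wp[n] + azi / wa[j])
assert abs(np.einsum('n,j,nj->', wp, wa, C)) < 1e-12         # Eq. (1.133)
```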
An alternative to a finite-difference representation of the continuous-scattering operator is a Legendre moment representation. Because the total Boltzmann scattering operator (outscatter minus inscatter) and the continuous-scattering operator have the same eigenfunctions, one can define effective cross-section moments to represent the continuous-scattering operator [31]. In particular, the $m$th eigenvalue of the continuous-scattering operator is

$$\lambda_{c,m} = \Sigma_{r,\mathrm{tr}}\,\frac{m(m+1)}{2}, \tag{1.143}$$

while the $m$th eigenvalue of the total Boltzmann scattering operator is

$$\lambda_{b,m} = \Sigma_0 - \Sigma_m. \tag{1.144}$$

Without loss of generality, we assume that a 1-D $S_N$ calculation is performed with Galerkin quadrature. In this case, an effective cross-section expansion of degree $N-1$ is required. The first step in defining the effective cross-section moments is to equate the eigenvalues defined by Eqs. (1.143) and (1.144):

$$\Sigma_0^e - \Sigma_m^e = \Sigma_{r,\mathrm{tr}}\,\frac{m(m+1)}{2}, \qquad 0 \le m \le N-1. \tag{1.145}$$

This step does not uniquely define the effective cross-section moments, but rather leaves $\Sigma_0^e$ a free parameter. The only consideration in choosing $\Sigma_0^e$ is to minimize the effective scattering ratio. In analogy with the choice of $\alpha$ in the extended transport correction, $\Sigma_0^e$ is defined so that the last moment in the expansion is zero:

$$\Sigma_0^e = \Sigma_{r,\mathrm{tr}}\,\frac{(N-1)N}{2}. \tag{1.146}$$
Substituting from Eq. (1.146) into Eq. (1.145), we obtain an expression for the remaining effective moments:

$$\Sigma_m^e = \Sigma_{r,\mathrm{tr}}\,\frac{(N-1)N - m(m+1)}{2}, \qquad 1 \le m \le N-1. \tag{1.147}$$

When used in conjunction with Galerkin quadrature, the number of eigenvalues preserved is always equal to the number of discrete directions. This is to be contrasted with finite-difference approximations, which preserve only the zeroth and first moments. Thus, the moment representation can be considered more accurate than the finite-difference representations, but the coefficient matrix for the moment representation is not monotone. Thus the moment representation is less robust than the finite-difference representation.
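The effective moments of Eqs. (1.146)-(1.147) are trivial to generate; the following short sketch computes them and verifies both the eigenvalue-matching condition of Eq. (1.145) and the vanishing of the last moment. The value of $\Sigma_{r,\mathrm{tr}}$ is an arbitrary stand-in.

```python
import numpy as np

def effective_moments(N, sig_rtr):
    """Eqs. (1.146)-(1.147): effective cross-section moments for degree N-1."""
    m = np.arange(N)
    return sig_rtr * ((N - 1) * N - m * (m + 1)) / 2.0

sig_rtr = 0.5                                   # hypothetical restricted momentum transfer
sig_e = effective_moments(8, sig_rtr)
m = np.arange(8)
assert np.allclose(sig_e[0] - sig_e, sig_rtr * m * (m + 1) / 2.0)   # Eq. (1.145)
assert sig_e[-1] == 0.0                                             # last moment zero
```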
1.6.2 The Continuous-Slowing-Down Operator
We next consider the continuous-slowing-down operator:

$$\mathcal{C}_{\mathrm{SD}}\,\psi(x,\mu,E) = \frac{\partial\,\beta\,\psi(x,\mu,E)}{\partial E}, \tag{1.148}$$

where $\beta$ is the restricted stopping power. Discretizations of this operator have advanced from a multigroup or step-like treatment [31] through a diamond treatment [41] to a linear-discontinuous finite-element treatment [47]. Because standard $S_N$ codes were not originally intended to solve charged-particle transport problems, early efforts in treating the continuous-slowing-down operator were spent defining effective cross sections to implement various discretization schemes via the standard $S_N$ scattering source representation [31, 40, 47]. Modern codes that were designed to solve the charged-particle transport equation use a linear-discontinuous finite-element discretization in space, and they treat the energy derivative similarly to the spatial derivatives. Thus, this operator is inverted via a space-energy sweep [101]. Self-adjoint codes must still treat the term as a scattering source and invert it via source iteration [112]. Because the spectral radius associated with iterations on the continuous-slowing-down term can be very close to unity, these iterations must be accelerated. A synthetic acceleration technique, based on the diamond-difference approximation as the low-order operator, has been used for this purpose [47].

Although the pure Fokker–Planck equation can be solved, most charged-particle calculations are carried out with the Boltzmann–Fokker–Planck equation [37]. In this case, the Boltzmann scattering operator is treated with the standard multigroup approximation. The resulting hybrid multigroup/linear-discontinuous operator is formally treated as a linear-discontinuous operator. One simply takes the scattering kernel to be piecewise-constant in energy; equivalently, one takes all energy slopes associated with the scattering kernel to be zero.
For instance, let us assume that the angular flux within group $g$ has the following linear-discontinuous dependence:

$$\psi(E) = \psi_{a,g} + \psi_{e,g}\,\frac{2\,(E - E_g)}{\Delta E_g}, \qquad E_{g+1/2} \le E < E_{g-1/2}, \tag{1.149}$$

where $\psi_{a,g}$ is the group average flux,

$$\psi_{a,g} = \frac{1}{\Delta E_g}\int_{E_{g+1/2}}^{E_{g-1/2}}\psi(E)\,dE, \tag{1.150}$$

$\psi_{e,g}$ is the group energy slope,

$$\psi_{e,g} = \frac{6}{\Delta E_g^2}\int_{E_{g+1/2}}^{E_{g-1/2}}\big(E - E_g\big)\,\psi(E)\,dE, \tag{1.151}$$

and $\Delta E_g = E_{g-1/2} - E_{g+1/2}$ is the width of group $g$. We note from Eq. (1.149) that the angular flux at the interface energy between two groups is defined by the solution in the higher-energy group. Since the continuous-slowing-down operator causes particles to lose energy, this choice is consistent with the direction of particle flow in energy. Under the assumptions of a Legendre expansion of degree $L$ and zero energy slopes for the multigroup scattering kernel, and a constant dependence of the restricted stopping power within each group, the discretized 1-D transport equation takes the following form for group $g$:
$$\mu\frac{\partial\psi_{a,g}}{\partial x} + \Sigma_{t,g}\,\psi_{a,g} - \frac{1}{\Delta E_g}\Big[\beta_{r,g-1}\big(\psi_{a,g-1} - \psi_{e,g-1}\big) - \beta_{r,g}\big(\psi_{a,g} - \psi_{e,g}\big)\Big] = \sum_{g'=1}^{G}\sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^m_{g'\to g}\,\frac{\Delta E_{g'}}{\Delta E_g}\,\phi_{a,m,g'}\,P_m(\mu) + Q_{a,g}, \tag{1.152a}$$

$$\mu\frac{\partial\psi_{e,g}}{\partial x} + \Sigma_{t,g}\,\psi_{e,g} - \frac{3}{\Delta E_g}\Big[\beta_{r,g-1}\big(\psi_{a,g-1} - \psi_{e,g-1}\big) - 2\beta_{r,g}\,\psi_{a,g} + \beta_{r,g}\big(\psi_{a,g} - \psi_{e,g}\big)\Big] = Q_{e,g}. \tag{1.152b}$$

Here, $\phi_{a,m,g}$ denotes the group average Legendre flux moment of degree $m$ for group $g$; $Q_{a,g}$ and $Q_{e,g}$ respectively denote the group average source and the group source energy slope for group $g$; and $\Sigma^m_{g'\to g}$ is the standard multigroup Legendre coefficient of degree $m$ for a transfer from group $g'$ to group $g$:

$$\Sigma^m_{g'\to g} = \frac{1}{\Delta E_{g'}}\int_{E_{g+1/2}}^{E_{g-1/2}}\int_{E_{g'+1/2}}^{E_{g'-1/2}}\int_{-1}^{+1}\Sigma_s\big(E'\to E,\,\mu_0\big)\,P_m(\mu_0)\;d\mu_0\,dE'\,dE. \tag{1.153}$$
Equations (1.152a) and (1.152b) are solved via source iteration, but the continuous-slowing-down operator is inverted during the sweep. In particular, the source iteration process takes the form

$$\mu\frac{\partial\psi^{(\ell+1)}_{a,g}}{\partial x} + \Sigma_{t,g}\,\psi^{(\ell+1)}_{a,g} - \frac{1}{\Delta E_g}\Big[\beta_{r,g-1}\big(\psi^{(\ell+1)}_{a,g-1} - \psi^{(\ell+1)}_{e,g-1}\big) - \beta_{r,g}\big(\psi^{(\ell+1)}_{a,g} - \psi^{(\ell+1)}_{e,g}\big)\Big] = Q_{a,g} + \sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^m_{g\to g}\,\phi^{(\ell)}_{a,m,g}\,P_m(\mu) + \sum_{g'=g+1}^{G}\sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^m_{g'\to g}\,\frac{\Delta E_{g'}}{\Delta E_g}\,\phi^{(\ell+1)}_{a,m,g'}\,P_m(\mu), \tag{1.154a}$$

$$\mu\frac{\partial\psi^{(\ell+1)}_{e,g}}{\partial x} + \Sigma_{t,g}\,\psi^{(\ell+1)}_{e,g} - \frac{3}{\Delta E_g}\Big[\beta_{r,g-1}\big(\psi^{(\ell+1)}_{a,g-1} - \psi^{(\ell+1)}_{e,g-1}\big) - 2\beta_{r,g}\,\psi^{(\ell+1)}_{a,g} + \beta_{r,g}\big(\psi^{(\ell+1)}_{a,g} - \psi^{(\ell+1)}_{e,g}\big)\Big] = Q_{e,g}, \tag{1.154b}$$
where $\ell$ is the iteration index. If the full linear-discontinuous treatment of the scattering source were used, one would have nonzero scattering source energy slopes. In this case, one would iterate on both the scattering source averages and the energy slopes. If the scattering ratio for a given group is sufficiently large to require convergence acceleration, the scattering source energy slopes could require acceleration in addition to the scattering source averages. Accelerating the source energy slopes significantly complicates the acceleration process. (We will address this point again in connection with the convergence acceleration of the temporal scattering source slopes associated with a linear-discontinuous discretization of the time derivative.)
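The projection in Eqs. (1.149)-(1.151) is simple to instantiate. The sketch below computes the group average, the energy slope, and the upwinded value at the group's lower edge, $\psi_{a,g} - \psi_{e,g}$, which is the quantity the CSD term hands to the next (lower-energy) group; the spectrum $\psi(E) = e^{-E}$ and group bounds are purely illustrative.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def ld_moments(psi, e_lo, e_hi, nq=16):
    """Group average and energy slope of psi on [e_lo, e_hi], Eqs. (1.150)-(1.151)."""
    x, w = leggauss(nq)
    dE, Eg = e_hi - e_lo, 0.5 * (e_hi + e_lo)
    E = Eg + 0.5 * dE * x                         # map Gauss points into the group
    avg = 0.5 * np.sum(w * psi(E))                                  # Eq. (1.150)
    slope = (6.0 / dE**2) * 0.5 * dE * np.sum(w * (E - Eg) * psi(E))  # Eq. (1.151)
    return avg, slope

psi = lambda E: np.exp(-E)                        # hypothetical smooth spectrum
a, e = ld_moments(psi, e_lo=1.0, e_hi=2.0)
edge = a - e        # upwind interface value at E_{g+1/2}, used by the group below
print(a, e, edge)
```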
When combining the linear-discontinuous energy approximation with discontinuous finite-element approximations in space, one generally assumes a single energy
slope per spatial cell rather than a separate energy slope for each spatial unknown
within a cell. Although the latter assumption is more accurate, it can be excessively
expensive. For example, if a trilinear-discontinuous spatial approximation is used
for a 3-D rectangular cell, one gets eight spatial unknowns per cell per angle per
group. If a separate energy slope is used for each spatial unknown, the number of unknowns per cell per angle per group increases to 16; but if only a single energy slope
is used for the entire spatial cell, the number of unknowns only increases to nine.
The linear-discontinuous discretization of the continuous-slowing-down operator
represents a major improvement relative to step and diamond-difference discretizations [47]. In particular, the linear-discontinuous method is much less numerically
diffusive than the step method and much less oscillatory than the diamond method.
Nodal methods can also be applied to the continuous-slowing-down operator. One
would expect such methods to be comparable to discontinuous finite-element methods. However, because they have rarely been used in practice, we will not explicitly
discuss nodal methods here.
1.7 Advances in Time Discretizations
Next, we discuss advanced discretization techniques for the time derivative. As is the case for most of the derivative terms in the Boltzmann equation, the time derivative has been treated with the discontinuous finite-element method [91, 98] and the nodal method [59].
The linear-discontinuous method assumes an angular flux dependence of the following form over the $k$th time step:

$$\psi(t) = \psi_a^k + \psi_t^k\,\frac{2\,(t - t^k)}{\Delta t^k}, \qquad t^{k-1/2} < t \le t^{k+1/2}, \tag{1.155}$$

where $\psi_a^k$ is the average flux,

$$\psi_a^k = \frac{1}{\Delta t^k}\int_{t^{k-1/2}}^{t^{k+1/2}}\psi(t)\,dt, \tag{1.156}$$

$\psi_t^k$ is the temporal slope,

$$\psi_t^k = \frac{6}{(\Delta t^k)^2}\int_{t^{k-1/2}}^{t^{k+1/2}}\big(t - t^k\big)\,\psi(t)\,dt, \tag{1.157}$$

$t^k = \big(t^{k-1/2} + t^{k+1/2}\big)/2$ is the midpoint of the time step, and $\Delta t^k = t^{k+1/2} - t^{k-1/2}$ is the width of the time step. We note from Eq. (1.155) that the angular flux at the interface between two time steps is defined by the solution from the previous time step. For simplicity, let us consider the 1-D time-dependent slab-geometry monoenergetic transport equation with isotropic scattering and an isotropic inhomogeneous source:

$$\frac{1}{v}\frac{\partial\psi}{\partial t} + \mu\frac{\partial\psi}{\partial x} + \Sigma_t\,\psi = \frac{\Sigma_s}{2}\,\phi + \frac{Q}{2}. \tag{1.158}$$
We obtain the following equations after applying the linear-discontinuous finite-element approximation in time to Eq. (1.158):

$$\frac{1}{v\,\Delta t^k}\Big[\psi_a^k + \psi_t^k - \psi_a^{k-1} - \psi_t^{k-1}\Big] + \mu\frac{\partial\psi_a^k}{\partial x} + \Sigma_t\,\psi_a^k = \frac{\Sigma_s}{2}\,\phi_a^k + Q_a^k, \tag{1.159a}$$

$$\frac{3}{v\,\Delta t^k}\Big[\psi_a^k + \psi_t^k - 2\,\psi_a^k + \psi_a^{k-1} + \psi_t^{k-1}\Big] + \mu\frac{\partial\psi_t^k}{\partial x} + \Sigma_t\,\psi_t^k = \frac{\Sigma_s}{2}\,\phi_t^k + Q_t^k, \tag{1.159b}$$
where $Q_a^k$ and $Q_t^k$ respectively denote the source temporal average and the source temporal slope for time step $k$. Equations (1.159a) and (1.159b) can be simultaneously
solved via source iteration:
$$\frac{1}{v\,\Delta t^k}\Big[\psi_a^{k,(\ell+1)} + \psi_t^{k,(\ell+1)} - \psi_a^{k-1} - \psi_t^{k-1}\Big] + \mu\frac{\partial\psi_a^{k,(\ell+1)}}{\partial x} + \Sigma_t\,\psi_a^{k,(\ell+1)} = \frac{\Sigma_s}{2}\,\phi_a^{k,(\ell)} + Q_a^k, \tag{1.160a}$$

$$\frac{3}{v\,\Delta t^k}\Big[\psi_a^{k,(\ell+1)} + \psi_t^{k,(\ell+1)} - 2\,\psi_a^{k,(\ell+1)} + \psi_a^{k-1} + \psi_t^{k-1}\Big] + \mu\frac{\partial\psi_t^{k,(\ell+1)}}{\partial x} + \Sigma_t\,\psi_t^{k,(\ell+1)} = \frac{\Sigma_s}{2}\,\phi_t^{k,(\ell)} + Q_t^k, \tag{1.160b}$$
where $\ell$ is the iteration index. We note that one must iterate on both the temporal averages and the temporal slopes of the scattering source. If the scattering ratio is close to unity, the source iterations for both the averages and the slopes must be accelerated. Deriving fully consistent diffusion acceleration equations from Eqs. (1.160a) and (1.160b) yields a complicated and difficult-to-solve system of coupled diffusion equations. If one uses step or diamond differencing in time, the diffusion-synthetic acceleration algorithm requires the solution of only one diffusion equation and is essentially identical to the algorithm for steady-state calculations. An approximate method has been developed in which the fully coupled system of diffusion acceleration equations associated with the linear-discontinuous temporal discretization scheme is replaced by two independent diffusion equations [98]. This approximate method appears to work quite well, resulting in a cost increase for performing DSA of about a factor of 2 relative to that associated with traditional temporal differencing schemes.
While the linear-discontinuous finite-element approximation in time is more
accurate than the step scheme and more robust than the diamond scheme, it is
also more expensive. As with the continuous-slowing-down operator, when one
combines the linear-discontinuous temporal approximation with discontinuous
finite-element approximations in space, one generally assumes a single temporal
slope per spatial cell rather than a separate temporal slope for each spatial unknown
within a cell. Although the latter assumption is more accurate, it can be excessively
expensive. As we noted before, if a trilinear-discontinuous spatial approximation
is used for a 3-D rectangular cell, one gets eight spatial unknowns per cell per
angle per group. If a separate temporal slope is used for each spatial unknown, the
number of unknowns per cell per angle per group increases to 16; but if only a
single temporal slope for the entire spatial cell is used, the number of unknowns
only increases to nine.
In analogy with the derivation of Eqs. (1.70) and (1.72), we apply the constant-constant nodal method to Eq. (1.158) to obtain the following equations for $\mu_n > 0$ (assuming for simplicity that $Q = 0$):

$$\psi_{n,x,i}(t) = \psi_{n,x,i}\big(t^{k-1/2}\big)\exp\Big[-\Sigma_t v\,\big(t - t^{k-1/2}\big)\Big] + \frac{q_{n,x,i,k}}{\Sigma_t}\Big\{1 - \exp\Big[-\Sigma_t v\,\big(t - t^{k-1/2}\big)\Big]\Big\}, \tag{1.161}$$
where

$$q_{n,x,i,k} = \frac{\Sigma_s}{4\,\Delta t^k}\int_{t^{k-1/2}}^{t^{k+1/2}}\phi_{x,i}(t)\,dt - \frac{\mu_n}{\Delta x_i}\Big[\psi_{n,t,k}\big(x_{i+1/2}\big) - \psi_{n,t,k}\big(x_{i-1/2}\big)\Big], \tag{1.162}$$

and

$$\psi_{n,t,k}(x) = \psi_{n,t,k}\big(x_{i-1/2}\big)\exp\left[-\frac{\Sigma_t\,(x - x_{i-1/2})}{\mu_n}\right] + \frac{q_{n,t,k,i}}{\Sigma_t}\left\{1 - \exp\left[-\frac{\Sigma_t\,(x - x_{i-1/2})}{\mu_n}\right]\right\}, \tag{1.163}$$

where

$$q_{n,t,k,i} = \frac{\Sigma_s}{4\,\Delta x_i}\int_{x_{i-1/2}}^{x_{i+1/2}}\phi_{t,k}(x)\,dx - \frac{1}{v\,\Delta t^k}\Big[\psi_{n,x,i}\big(t^{k+1/2}\big) - \psi_{n,x,i}\big(t^{k-1/2}\big)\Big]. \tag{1.164}$$
The considerations for applying nodal methods in time are analogous to those for
applying discontinuous finite-element methods. For instance, with a linear nodal
method, one must be concerned with accelerating the temporal slopes of the scattering source. With a linear nodal method in both time and space [59], there would
be only one temporal slope per space cell, but if one were to apply a nodal method
in time in conjunction with another type of spatial discretization, multiple temporal
slopes per space cell could arise.
The practical need for advanced temporal discretization schemes is not as strong as the corresponding need for the space and energy variables. This is due to the relative ease with which adaptive techniques can be applied to time integration, making it feasible to avoid the regimes in which simple discretization schemes perform poorly. In any event, the transport community has little experience with advanced temporal discretization schemes, and little research has been performed in this area.
1.8 Advances in Iteration Acceleration
Next, we discuss major advances in iteration acceleration. A plethora of $S_N$ iterative acceleration techniques have been developed over the years (we refer the reader to the comprehensive review by Adams and Larsen [110]), but there is little doubt that the practical application of diffusion-synthetic acceleration (DSA) to source iteration has been the most significant advance in iteration acceleration techniques in the history of discrete-ordinates methods. Early implementations of DSA performed successfully only for problems with optically thin spatial grids [16]; the subtleties concerning how these methods should be discretized became understood only later. To motivate the DSA and DSA-like methods, we first discuss the Fourier analysis technique, which has become an invaluable theoretical tool for predicting the convergence rate of iterative solutions of continuous and discrete problems.
1.8.1 Fourier Analysis
The Source Iteration (SI) method is described in Section 1.2; see Eqs. (1.31) and (1.32). For a model infinite, homogeneous-medium transport problem with no discretization,

$$\mu\frac{\partial\psi(x,\mu)}{\partial x} + \Sigma_t\,\psi(x,\mu) = \frac{1}{2}\big[\Sigma_s\,\phi(x) + Q(x)\big], \tag{1.165a}$$

$$\phi(x) = \int_{-1}^{1}\psi\big(x,\mu'\big)\,d\mu', \tag{1.165b}$$

the SI process begins with an initial guess $\phi^{(0)}(x)$ of the scalar flux, and then for $\ell \ge 1$, the $\ell$th source iteration is defined by

$$\mu\frac{\partial\psi^{(\ell-1/2)}(x,\mu)}{\partial x} + \Sigma_t\,\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\Big[\Sigma_s\,\phi^{(\ell-1)}(x) + Q(x)\Big], \tag{1.166a}$$

$$\phi^{(\ell)}(x) = \int_{-1}^{1}\psi^{(\ell-1/2)}\big(x,\mu'\big)\,d\mu'. \tag{1.166b}$$

We now write an exact transport equation for the scalar and angular flux errors:

$$\delta\phi^{(\ell-1)}(x) = \phi(x) - \phi^{(\ell-1)}(x), \tag{1.167a}$$

$$\delta\psi^{(\ell-1/2)}(x,\mu) = \psi(x,\mu) - \psi^{(\ell-1/2)}(x,\mu). \tag{1.167b}$$
To do this, we subtract Eqs. (1.166) from Eqs. (1.165), obtaining

$$\mu\frac{\partial\,\delta\psi^{(\ell-1/2)}(x,\mu)}{\partial x} + \Sigma_t\,\delta\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\,\Sigma_s\,\delta\phi^{(\ell-1)}(x), \tag{1.168a}$$

$$\delta\phi^{(\ell)}(x) = \int_{-1}^{1}\delta\psi^{(\ell-1/2)}\big(x,\mu'\big)\,d\mu', \tag{1.168b}$$

which define $\delta\phi^{(\ell)}(x)$ in terms of $\delta\phi^{(\ell-1)}(x)$. Clearly, the rate of convergence of Eqs. (1.166) is equal to the rate at which $\delta\phi^{(\ell)}(x) \to 0$.

To calculate this rate, we introduce the Fourier transforms

$$\delta\phi^{(\ell-1)}(x) = \int_{-\infty}^{\infty} a^{(\ell-1)}(\lambda)\,e^{i\Sigma_t\lambda x}\,d\lambda, \tag{1.169a}$$

$$\delta\psi^{(\ell-1/2)}(x,\mu) = \int_{-\infty}^{\infty} b^{(\ell-1/2)}(\lambda,\mu)\,e^{i\Sigma_t\lambda x}\,d\lambda, \tag{1.169b}$$
into Eqs. (1.168) to obtain

$$\big(i\lambda\mu + 1\big)\,b^{(\ell-1/2)}(\lambda,\mu) = \frac{c}{2}\,a^{(\ell-1)}(\lambda), \tag{1.170a}$$

$$a^{(\ell)}(\lambda) = \int_{-1}^{1} b^{(\ell-1/2)}\big(\lambda,\mu'\big)\,d\mu', \tag{1.170b}$$

where $c = \Sigma_s/\Sigma_t$ is the scattering ratio. Equation (1.170a) gives

$$b^{(\ell-1/2)}(\lambda,\mu) = \frac{c}{2}\,\frac{a^{(\ell-1)}(\lambda)}{1 + i\lambda\mu}, \tag{1.171}$$

and then Eq. (1.170b) gives

$$a^{(\ell)}(\lambda) = \frac{c}{2}\int_{-1}^{1}\frac{d\mu'}{1 + i\lambda\mu'}\;a^{(\ell-1)}(\lambda) = \omega(\lambda)\,a^{(\ell-1)}(\lambda) = \cdots = \big[\omega(\lambda)\big]^{\ell}\,a^{(0)}(\lambda), \tag{1.172a}$$

where

$$\omega(\lambda) = \frac{c}{2}\int_{-1}^{1}\frac{d\mu'}{1 + i\lambda\mu'} = \frac{c}{\lambda}\tan^{-1}\lambda \tag{1.172b}$$

is the iteration eigenvalue. Equations (1.172) and (1.169a) yield

$$\delta\phi^{(\ell)}(x) = \int_{-\infty}^{\infty}\omega^{\ell}(\lambda)\,a^{(0)}(\lambda)\,e^{i\Sigma_t\lambda x}\,d\lambda. \tag{1.173}$$
Thus, the rate at which the Fourier mode corresponding to wave number $\lambda$ decays to zero is determined by $\omega(\lambda)$. If $|\omega(\lambda)| \ll 1$, the corresponding mode converges rapidly. If $|\omega(\lambda)| < 1$ and $|\omega(\lambda)| \approx 1$, the mode converges slowly. If $|\omega(\lambda)| \ge 1$, the mode does not converge. The overall rate of convergence is determined by the most slowly converging error mode, i.e., by the largest value of $|\omega(\lambda)|$ over the Fourier variable $\lambda$. For large $\ell$, Eq. (1.173) implies

$$\delta\phi^{(\ell)}(x) \approx \rho^{\ell}\,A, \tag{1.174}$$

where $A$ is a constant, and

$$\rho = \sup_{-\infty<\lambda<\infty}\big|\omega(\lambda)\big| = \lim_{\ell\to\infty}\frac{\big\|\delta\phi^{(\ell)}(x)\big\|}{\big\|\delta\phi^{(\ell-1)}(x)\big\|} \tag{1.175}$$

is the spectral radius (see Eqs. (1.33)).
From Eqs. (1.175) and (1.172b), the spectral radius of the continuous SI scheme is

$$\rho = \sup_{-\infty<\lambda<\infty}\left|\frac{c}{\lambda}\tan^{-1}\lambda\right| = c, \tag{1.176}$$

which is attained for $\lambda \approx 0$. Thus, the $\lambda \approx 0$ Fourier error modes are the most slowly converging modes in the SI scheme. From Eq. (1.171), we have for $\lambda \approx 0$:

$$b^{(\ell+1/2)}(\lambda,\mu) = \frac{c}{2}\,\frac{1}{1 + i\lambda\mu}\,\omega^{\ell}(\lambda)\,a^{(0)}(\lambda) \approx \frac{c}{2}\,\big(1 - i\lambda\mu\big)\,\omega^{\ell}(\lambda)\,a^{(0)}(\lambda),$$

and thus the most slowly converging modes depend nearly linearly on $\mu$.

The Fourier analysis can be extended to multidimensional problems, to fully discrete (in space, angle, and energy) $S_N$ problems for infinite homogeneous (or spatially periodic) media on Cartesian grids, and to iteration strategies beyond Source Iteration [110]. This analysis makes it possible to predict – with relative ease and great accuracy – the rate of convergence of transport iteration schemes before implementing them in test codes. It also provides a theoretical foundation for iteration schemes, making it possible to understand how a scheme works (if it works), or why it fails (if it fails). In the latter case, the Fourier analysis has often provided clues that have enabled researchers to modify schemes to perform more effectively. Overall, the Fourier analysis has become an invaluable tool in the development of advanced iteration strategies for particle transport problems.

For the SI scheme, the Fourier analysis predicts a spectral radius $\rho = c$, even for fully discrete $S_N$ codes. The accuracy of this prediction has been observed in many calculations [110]. The Fourier analysis also predicts, even for discrete problems, that the most slowly converging modes correspond to $\lambda \approx 0$, and that for such modes, the $\mu$-dependence is nearly linear. This result provides the motivation for the DSA method, discussed next.
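The content of Eqs. (1.172b) and (1.176) can be checked directly, as in the brief sketch below: the eigenvalue $\omega(\lambda) = (c/\lambda)\tan^{-1}\lambda$ attains its supremum $c$ as $\lambda \to 0$, and the discrete angular analog of the integral (evaluated here with an $S_8$ Gauss set) behaves the same way. The value of $c$ is illustrative.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

c = 0.99
lam = np.linspace(1e-6, 100.0, 200000)
omega = c * np.arctan(lam) / lam          # Eq. (1.172b)
assert omega.max() <= c and abs(omega[0] - c) < 1e-6   # sup = c, reached at lambda ~ 0

# Discrete angular analog of Eq. (1.172b): small-lambda modes barely converge,
# large-lambda modes converge quickly.
mu, w = leggauss(8)
for lam0 in (0.01, 1.0, 10.0):
    omega_N = c / 2.0 * np.sum(w / (1.0 + 1j * lam0 * mu))
    print(lam0, abs(omega_N))
```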
1.8.2 Diffusion-Synthetic Acceleration
The DSA method is based on the use of diffusion as an approximation to transport, for the purpose of calculating the iterative error after each source iteration. The algorithm can be expressed as follows. We again suppose that the problem to be solved is given by Eqs. (1.165). The $\ell$th DSA iteration begins with an SI sweep (Eqs. (1.166)):

$$\mu\frac{\partial\psi^{(\ell-1/2)}(x,\mu)}{\partial x} + \Sigma_t\,\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\Big[\Sigma_s\,\phi^{(\ell-1)}(x) + Q(x)\Big], \tag{1.177a}$$

$$\phi^{(\ell-1/2)}(x) = \int_{-1}^{1}\psi^{(\ell-1/2)}\big(x,\mu'\big)\,d\mu'. \tag{1.177b}$$
Unlike the SI scheme, the DSA method does not define $\phi^{(\ell)}(x) = \phi^{(\ell-1/2)}(x)$. Instead, Eqs. (1.177) are subtracted from Eqs. (1.165) to obtain the following exact equation for the angular flux error (Eqs. (1.167)):

$$\mu\frac{\partial\,\delta\psi^{(\ell-1/2)}(x,\mu)}{\partial x} + \Sigma_t\,\delta\psi^{(\ell-1/2)}(x,\mu) - \frac{\Sigma_s}{2}\int_{-1}^{1}\delta\psi^{(\ell-1/2)}\big(x,\mu'\big)\,d\mu' = \frac{\Sigma_s}{2}\Big[\phi^{(\ell-1/2)}(x) - \phi^{(\ell-1)}(x)\Big]. \tag{1.178}$$

This equation for $\delta\psi^{(\ell-1/2)}(x,\mu)$ is exact, but it is just as difficult to solve as the original transport equation for $\psi(x,\mu)$ (Eq. (1.165)). However, a good estimate of the iterative error can be obtained by approximating Eq. (1.178) by its diffusion approximation:

$$-\frac{d}{dx}\frac{1}{3\Sigma_t}\frac{d}{dx}\,\delta\phi^{(\ell-1/2)}(x) + \Sigma_a\,\delta\phi^{(\ell-1/2)}(x) = \Sigma_s\Big[\phi^{(\ell-1/2)}(x) - \phi^{(\ell-1)}(x)\Big], \tag{1.179}$$

which is much easier to solve than Eq. (1.178). This approximation is further motivated by the fact that the most slowly converging SI modes are nearly linear in angle; these components are treated accurately by the diffusion approximation. After solving Eq. (1.179) for $\delta\phi^{(\ell-1/2)}(x)$, the update equation is

$$\phi^{(\ell)}(x) = \phi^{(\ell-1/2)}(x) + \delta\phi^{(\ell-1/2)}(x). \tag{1.180}$$

The DSA scheme is then defined by Eqs. (1.177) (a transport sweep), Eq. (1.179) (the low-order diffusion equation for the approximate scalar flux correction), and Eq. (1.180) (the update equation). A Fourier analysis can be employed to calculate the spectral radius of the DSA algorithm for the same model infinite homogeneous-medium problem used to analyze source iteration alone. The DSA spectral radius for this case is approximately $0.23c$, which is bounded below unity for all $0 \le c \le 1$. Furthermore, with one exception that is discussed later, this excellent spectral radius can be achieved in practical calculations. Thus, for the most part, DSA is extremely effective.
Details of the Fourier analysis show why DSA performs well for the model problem. The transport sweep strongly attenuates scalar flux errors that vary rapidly in space, i.e., high-frequency errors. However, the sweep attenuates low-frequency errors, which vary slowly in space, by only a factor of $c$; when $c$ is close to unity, the low-frequency errors are essentially unattenuated. In contrast, the diffusion step almost completely attenuates low-frequency errors (because such errors have an angular shape that is primarily linear in $\mu$) but does essentially nothing to high-frequency errors. Thus when DSA is applied, the high-frequency errors are strongly attenuated by the sweep, and the low-frequency errors are strongly attenuated by the diffusion step. The minimum level of attenuation occurs for intermediate-frequency errors, but this level of attenuation is very high relative to that of the unaccelerated iteration algorithm.
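The following self-contained sketch exhibits this behavior numerically. It solves a 1-D slab problem with diamond differencing in space and Gauss $S_8$ in angle, and estimates the spectral radius of SI alone versus SI with a DSA-like correction per Eqs. (1.177)-(1.180). The cell-centered diffusion solve used here is a simple, deliberately inconsistent discretization (not Alcouffe's consistent one), so on this optically thin grid the observed rate is good but should not be read as exactly the model value $\approx 0.23c$.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

I, N, h = 200, 8, 0.1                        # cells, directions, cell width (mfp)
sig_t, c = 1.0, 0.98
sig_s, sig_a, Q = c * sig_t, (1 - c) * sig_t, 1.0
mu, w = leggauss(N)

def sweep(phi):
    """One diamond-difference transport sweep with vacuum boundaries."""
    src = 0.5 * (sig_s * phi + Q)
    out = np.zeros(I)
    for n in range(N):
        psi_edge, cells = 0.0, (range(I) if mu[n] > 0 else range(I - 1, -1, -1))
        num = 2.0 * abs(mu[n]) / h
        for i in cells:
            psi_c = (src[i] + num * psi_edge) / (sig_t + num)  # diamond cell average
            psi_edge = 2 * psi_c - psi_edge                    # diamond closure
            out[i] += w[n] * psi_c
    return out

def dsa_correction(dphi_src):
    """Solve Eq. (1.179) with a basic cell-centered diffusion discretization."""
    D = 1.0 / (3 * sig_t)
    A = np.zeros((I, I))
    for i in range(I):
        A[i, i] = sig_a + 2 * D / h**2
        if i > 0: A[i, i - 1] = -D / h**2
        if i < I - 1: A[i, i + 1] = -D / h**2
    return np.linalg.solve(A, sig_s * dphi_src)

for use_dsa in (False, True):
    phi, err_old, rho = np.zeros(I), None, 0.0
    for it in range(30):
        phi_half = sweep(phi)                                    # Eqs. (1.177)
        phi_new = phi_half + (dsa_correction(phi_half - phi) if use_dsa else 0.0)
        err = np.linalg.norm(phi_new - phi)
        if err_old: rho = err / err_old
        phi, err_old = phi_new, err
    print("DSA" if use_dsa else "SI ", "spectral radius estimate:", round(rho, 3))
```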
The concept of DSA existed long before 1968 [6], but the synthetic method for
discrete problems was originally seen to be unstable for problems in which the cell
thickness exceeded roughly a mean free path [11]. Alcouffe [23] made the DSA
method practical for the diamond-differenced SN equations by showing that if the
spatial discretization of the diffusion equation was chosen to be consistent with the
spatial discretization of the SN equations, the instability was eliminated, and one obtained an algorithm that was unconditionally effective. Soon afterward, Alcouffe’s
ideas were extended and generalized to nondiamond schemes [34, 35, 99, 108, 110].
A nonlinear “quasidiffusion” method was also developed [8], which is rapidly convergent but is not strictly an acceleration method, because it produces converged
solutions that differ from the discrete SN solution due to an extra truncation error.
In general, an acceleration scheme is said to be “unconditionally effective” when
the accelerated spectral radius is bounded less than unity. Also, an acceleration
scheme is said to be “unconditionally efficient” when the computational execution time associated with the accelerated scheme is always much smaller than that
of the unaccelerated scheme. While the use of consistent diffusion discretizations
makes the DSA method effective, it does not necessarily make it efficient. This is
because the discrete diffusion equations that are consistent with advanced SN spatial discretization methods can have a nonstandard form that makes them much
more expensive to solve than standard discrete diffusion equations. This results
in an effective-but-inefficient DSA algorithm [116]. The most modern SN codes
generally use advanced discontinuous finite-element discretization schemes, but
they do not use fully consistent diffusion discretizations. Rather, they either use
“partially-consistent” diffusion discretizations [64] or they solve the fully consistent
discretizations in an approximate manner that involves the solution of a standard
discretization of the diffusion equation [61, 67]. Although such schemes generally
result in a degradation of the accelerated spectral radius, they are usually much more
efficient than fully consistent schemes because of the high cost of solving the fully
consistent diffusion equations [116].
The DSA method is not limited to problems with isotropic scattering. When applied to problems with anisotropic scattering, the original DSA algorithm was nearly
identical to that for isotropic scattering; all higher-order Legendre angular flux moments (above the scalar flux) were simply left unaccelerated. However, the DSA
method becomes ineffective for problems with highly anisotropic forward-peaked
scattering because the higher-order Legendre moments become slow to converge
and require acceleration. The standard DSA method can be modified to accelerate the current (the $n = 1$ Legendre moment) in addition to the scalar flux (the $n = 0$ Legendre moment) [36]. In 1-D calculations, this improves performance with highly
anisotropic scattering, but the modified method nonetheless becomes increasingly
ineffective as the anisotropy of the scattering increases [36]. In multidimensional
calculations, acceleration of the currents becomes unstable as the anisotropy of the
scattering increases [68]. For 1-D calculations, an angular multigrid method has
been developed, which is unconditionally effective and efficient with anisotropic
scattering [62]. However, this approach has had only limited success in multidimensional calculations [93].
1.8.3 DSA-Like Methods for Outer Iteration Acceleration
The DSA method was originally applicable only to within-group source iterations, but variations on DSA have been developed to accelerate energy-dependent sources. In particular, one can accelerate the neutron upscatter source [69], the fission source in steady-state subcritical or time-dependent supercritical neutronics calculations [73], and the fission-like implicit emission source in radiative transfer calculations [42, 50]. To illustrate, we consider time-dependent radiative transfer calculations. The DSA-like scheme that is used to accelerate the outer iterations is called the linear multifrequency-grey method [42, 50]. In accordance with Eq. (1.43), an accelerated outer iteration can be described as follows. The first step is a standard outer iteration on the emission source:

$$\vec{\Omega}\cdot\nabla I^{(\ell-1/2)}(\vec r,\vec\Omega,E) + \Sigma_t(E)\,I^{(\ell-1/2)}(\vec r,\vec\Omega,E) = \frac{\Sigma_s(E)}{4\pi}\,\phi^{(\ell-1/2)}(\vec r,E) + \frac{1}{4\pi}\Big[\chi(E)\,f^{(\ell-1)}(\vec r) + Q(\vec r,E)\Big], \tag{1.181}$$

where $I$ is the intensity, $\phi$ is the angle-integrated intensity, $\chi(E)$ is the emission spectrum, and $f$ is the total absorption rate:

$$f(\vec r) = \int_0^{\infty}\Sigma_a\big(E'\big)\,\phi\big(\vec r,E'\big)\,dE'. \tag{1.182}$$
We note from Eq. (1.181) that the monochromatic scattering sources (corresponding to within-group scattering sources in practice) carry an index of $\ell - 1/2$, and thus are assumed to be converged within each outer iteration. The convergence of these sources within each outer iteration is required for application of the linear multifrequency-grey method. Of course, one can use the DSA method to accelerate the convergence of these sources if desired. The iterative error in the angular intensity at step $\ell - 1/2$ is defined as $\delta I^{(\ell-1/2)} = I - I^{(\ell-1/2)}$, where $I$ is the converged transport solution. This error satisfies the following transport equation:

$$\vec{\Omega}\cdot\nabla\,\delta I^{(\ell-1/2)}(\vec r,\vec\Omega,E) + \Sigma_t(E)\,\delta I^{(\ell-1/2)}(\vec r,\vec\Omega,E) - \frac{\Sigma_s(E)}{4\pi}\,\delta\phi^{(\ell-1/2)}(\vec r,E) = \frac{1}{4\pi}\,\chi(E)\Big[f^{(\ell-1/2)}(\vec r) - f^{(\ell-1)}(\vec r)\Big], \tag{1.183}$$

which is just as difficult to solve as Eq. (1.181).
The DSA algorithm would approximate Eq. (1.183) by its (energy-dependent) diffusion approximation. Instead, in the multifrequency-grey method, Eq. (1.183) is approximated by a simpler monoenergetic or “grey” diffusion equation. The unknown in this approximate equation is the iterative error in the angle-energy-integrated intensity at step $\ell - 1/2$:

$$\delta\Phi^{(\ell-1/2)}(\vec r) = \int_0^{\infty}\Big[\phi\big(\vec r,E'\big) - \phi^{(\ell-1/2)}\big(\vec r,E'\big)\Big]\,dE'. \tag{1.184}$$
The approximate diffusion equation is

$$-\nabla\cdot\langle D\rangle\,\nabla\,\delta\Phi^{(\ell-1/2)}(\vec r) + \Big[\tau + (1-\gamma)\,\langle\Sigma_a\rangle\Big]\,\delta\Phi^{(\ell-1/2)}(\vec r) = f^{(\ell-1/2)}(\vec r) - f^{(\ell-1)}(\vec r), \tag{1.185}$$

where $\tau$ denotes the effective absorption arising from the implicit time differencing and $\gamma = \int_0^\infty\chi(E')\,dE'$ (cf. Eq. (1.43)),

$$\langle D\rangle = \frac{1}{3}\int_0^{\infty}\frac{\varsigma(E')}{\Sigma_t(E')}\,dE', \tag{1.186}$$

$$\langle\Sigma_a\rangle = \int_0^{\infty}\varsigma\big(E'\big)\,\Sigma_a\big(E'\big)\,dE', \tag{1.187}$$

and

$$\varsigma(E) = \frac{\chi(E)}{\Sigma_a(E) + \tau}\bigg/\int_0^{\infty}\frac{\chi(E')}{\Sigma_a(E') + \tau}\,dE'. \tag{1.188}$$

The final step in the algorithm is to add the estimate of the absorption-rate error to the absorption-rate iterate:

$$f^{(\ell)}(\vec r) = f^{(\ell-1/2)}(\vec r) + \langle\Sigma_a\rangle\,\delta\Phi^{(\ell-1/2)}(\vec r). \tag{1.189}$$
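The grey constants of Eqs. (1.186)-(1.188) are simple spectral averages, as the sketch below illustrates. Every input here is hypothetical: a Planck-like emission shape for $\chi(E)$, a schematic $E^{-3}$ opacity, and an assumed value for the time-differencing absorption $\tau$; the point is only the mechanics of the collapse.

```python
import numpy as np

E = np.linspace(0.01, 20.0, 4000)                   # energy grid (arbitrary units)
dE = E[1] - E[0]
integ = lambda f: np.sum(f) * dE                    # uniform-grid quadrature

chi = E**3 / np.expm1(E)
chi /= integ(chi)                                   # normalized emission shape
sig_a = 5.0 / E**3                                  # hypothetical opacity ~ E^-3
sig_t = sig_a + 0.2                                 # plus a scattering component
tau = 0.5                                           # assumed time-absorption term

varsig = chi / (sig_a + tau)
varsig /= integ(varsig)                             # Eq. (1.188)
D_grey = integ(varsig / (3.0 * sig_t))              # Eq. (1.186)
siga_grey = integ(varsig * sig_a)                   # Eq. (1.187)
print(D_grey, siga_grey)
```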
A Fourier analysis of the unaccelerated emission source iteration shows that when the spectral radius is near unity, iterative errors that vary slowly in space are poorly attenuated by the transport sweep, while errors that vary rapidly in space are strongly attenuated by the sweep. (Such behavior can always be expected from a sweep, regardless of the particular type of source being iterated upon.) The Fourier analysis also shows that the low-frequency errors take on an angular shape that is linear in the components of the photon direction vector, with an energy spectrum that is fixed. The linear angular dependence suggests an energy-dependent diffusion approximation to Eq. (1.183), and the specific, i.e., common, energy shape suggests that an energy-dependent diffusion approximation can be reduced, without loss of accuracy, to a grey diffusion approximation. This is exactly how Eq. (1.185) is derived. In particular, it is first assumed that

$$\delta I(\vec r,\vec\Omega,E) = \frac{1}{4\pi}\Big[\delta\phi(\vec r,E) + 3\,\vec\Omega\cdot\delta\vec F(\vec r,E)\Big]. \tag{1.190}$$

A multigroup diffusion approximation is then obtained by substituting Eq. (1.190) into Eq. (1.183), taking angular moments with respect to the weight functions 1 and $\vec\Omega$, and then eliminating $\delta\vec F$ from the resulting $P_1$ equations, to obtain

$$-\nabla\cdot D(E)\,\nabla\,\delta\phi^{(\ell-1/2)}(\vec r,E) + \big[\tau + \Sigma_a(E)\big]\,\delta\phi^{(\ell-1/2)}(\vec r,E) = \chi(E)\Big[f^{(\ell-1/2)}(\vec r) - f^{(\ell-1)}(\vec r)\Big], \tag{1.191}$$
where

$$D(E) = \frac{1}{3\,\Sigma_t(E)}. \tag{1.192}$$

Equation (1.185) is obtained from Eq. (1.191) by first assuming that

$$\delta\phi^{(\ell-1/2)}(\vec r,E) = \varsigma(E)\,\delta\Phi^{(\ell-1/2)}(\vec r), \tag{1.193}$$

where $\varsigma(E)$ is the normalized shape function defined by Eq. (1.188). The next step is to substitute Eq. (1.193) into Eq. (1.191) and integrate that equation over all energies. The resulting equation is identical to Eq. (1.185), except for a drift term that arises in systems with space-dependent cross sections. In practice, the drift term is dropped, because doing so simplifies the grey equation with no significant degradation in the effectiveness of the acceleration algorithm.

In analogy with DSA, the diffusion step in the linear multifrequency-grey method almost completely attenuates low-frequency errors but does not affect high-frequency errors. With one possible exception discussed in Section 1.8.4, the spectral radius is found to be excellent for practical radiative transfer calculations in the stellar regime.
The fission source acceleration method [73] is nearly identical to the linear multifrequency-grey method because, as previously noted, the numerical emission source in radiative transfer has the mathematical form of a fission source. The neutron thermal upscatter source acceleration method [69] is very similar to the fission source acceleration method and the linear multifrequency-grey method, except that the upscatter operator is of full rank in energy, as opposed to the fission and emission operators, which are of rank one in energy. Simultaneous upscatter and fission source acceleration has also been investigated [84]. A full discussion of the requirements imposed on acceleration techniques by a general “scattering” operator of rank greater than one is beyond the scope of this chapter. Suffice it to say that effective acceleration techniques for “scattering” operators of full rank in both energy and angle may not be achievable using a one-group diffusion approximation to estimate low-frequency errors. Rather, “multigrid” or multilevel approximations in angle, in energy, or in both angle and energy may be required.
1.8.4 A Deficiency in Multidimensional DSA
and DSA-Like Methods
For many years, Alcouffe’s consistent DSA approach appeared to yield an unconditionally effective DSA algorithm. However, Azmy [87] has recently shown
that the DSA method becomes increasingly ineffective in heterogeneous multidimensional calculations as the jumps in cross sections across material discontinuities increase in magnitude. It is likely that all DSA-like methods will exhibit
this same deficiency. This is not significant for neutronics calculations, but it is
very significant for certain classes of radiative transfer calculations. Fortunately,
this deficiency has been overcome by recasting DSA as a preconditioner and using
it in conjunction with preconditioned Krylov methods to solve the SN equations.
This topic is discussed next.
1.9 Krylov Methods
In the past few years, Krylov methods have had an enormous impact on the manner in which the SN equations are solved. Although the first applications of Krylov
methods to the SN equations were made by applied mathematicians in the late 1980s
and early 1990s [53, 63], the numerical transport community was initially slow to
embrace them. However, during the last several years, Krylov methods have entered
the mainstream of SN solution techniques [74, 75, 79, 80, 95, 103, 115, 124, 126]. Indeed, almost every SN solution technique developed today can be expected to be
based on a preconditioned Krylov method. There are three basic reasons for the
mainstream acceptance of Krylov methods within the numerical transport community. First, the recent recognition that DSA and DSA-like methods are not
unconditionally effective in multidimensional problems [94] has motivated a search
both for modifications to the DSA method that would eliminate this deficiency and
for fundamentally new unconditionally effective solution techniques. Second, it was
demonstrated early on that an inconsistently discretized DSA method, when recast
as a preconditioner for a Krylov method, can produce an unconditionally effective
and efficient SN solution technique [63]. Third, it has been demonstrated that Krylov
methods, even with relatively simple preconditioners, can be much more effective
than simple source iteration [82]. In this section, we give a basic description of
Krylov methods, and then we consider some of the specific issues that arise when
applying preconditioned Krylov methods to the SN equations. This is not intended
as a review of all forms of Krylov methods, but rather as a basic introduction to
Krylov concepts with a focus on those types of Krylov methods that appear to be
most relevant for particle transport calculations.
1.9.1 The Central Theme of Krylov Methods
Let us assume that we want to iteratively solve the following matrix equation:

$$\mathbf{A}\vec x = \vec b, \tag{1.194}$$

where $\mathbf{A}$ is a nonsingular $N \times N$ matrix, $\vec x$ is a solution vector of length $N$, and $\vec b$ is a source vector of length $N$. We assume an initial solution guess for Eq. (1.194) of $\vec x_0 = 0$. However, if $\vec x_0$ is not zero, one can always solve the following equivalent system with an initial zero guess:

$$\mathbf{A}\vec x\,' = \vec b\,', \tag{1.195a}$$
where

$$\vec x\,' = \vec x - \vec x_0 \tag{1.195b}$$

and

$$\vec b\,' = \vec b - \mathbf{A}\vec x_0. \tag{1.195c}$$
Fundamental to the central theme of a Krylov method is a Krylov space of dimension $m$, which is defined with respect to the matrix $\mathbf{A}$ and the vector $\vec b$. We denote this space by $K_m(\mathbf{A},\vec b)$. A basis for this space consists of $m$ vectors formed by applying successive powers of $\mathbf{A}$ to $\vec b$:

$$K_m(\mathbf{A},\vec b) = \mathrm{span}\Big\{\vec b,\ \mathbf{A}\vec b,\ \mathbf{A}^2\vec b,\ \ldots,\ \mathbf{A}^{m-1}\vec b\Big\}. \tag{1.196}$$

We refer to these basis vectors as the Krylov vectors. The basic idea of Krylov methods is to approximate the solution of Eq. (1.194) by a linear combination of Krylov vectors. Before we provide more details on how this is done, we first show that the solution of Eq. (1.194) lies within a Krylov space. We define the minimum polynomial of $\mathbf{A}$, denoted by $P_d(\mathbf{A})$, as the polynomial in $\mathbf{A}$ of minimum degree $d$ such that

$$P_d(\mathbf{A}) = \mathbf{I} + a_1\mathbf{A} + \cdots + a_d\mathbf{A}^d = 0. \tag{1.197}$$

If $\mathbf{A}$ is nonsingular, then zero is not a root of its minimal polynomial; thus, the minimal polynomial can be scaled so that $a_0 = 1$. Equation (1.197) implies

$$\mathbf{I} = -\big(a_1\mathbf{I} + a_2\mathbf{A} + \cdots + a_d\mathbf{A}^{d-1}\big)\,\mathbf{A}. \tag{1.198}$$

Multiplying Eq. (1.198) from the right by $\mathbf{A}^{-1}$ gives

$$\mathbf{A}^{-1} = -\sum_{j=0}^{d-1} a_{j+1}\,\mathbf{A}^{j}. \tag{1.199}$$

Thus, the solution of Eq. (1.194) can be written as

$$\vec x = \mathbf{A}^{-1}\vec b = -\sum_{j=0}^{d-1} a_{j+1}\,\mathbf{A}^{j}\,\vec b, \tag{1.200}$$

which, by comparison with Eq. (1.196), is an element of $K_d(\mathbf{A},\vec b)$. Thus, one is motivated to seek an approximation to $\vec x$ from the Krylov space $K_n(\mathbf{A},\vec b)$, where $n \le d$.
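A tiny numerical sketch makes the point of Eq. (1.200) tangible: the least-squares combination of the raw Krylov vectors $\vec b, \mathbf{A}\vec b, \ldots$ approximates $\vec x$ increasingly well as the subspace grows. The SPD test matrix is hypothetical, and orthogonalization is deliberately omitted here for brevity; as discussed below, practical methods must orthogonalize this basis.

```python
import numpy as np

rng = np.random.default_rng(1)
Nd = 50
B = rng.standard_normal((Nd, Nd))
A = B @ B.T + Nd * np.eye(Nd)              # well-conditioned SPD test matrix
b = rng.standard_normal(Nd)

V = b[:, None].copy()                      # Krylov basis, one column per power of A
for m in (1, 5, 10, 20):
    while V.shape[1] < m:
        V = np.hstack([V, (A @ V[:, -1])[:, None]])
    y, *_ = np.linalg.lstsq(A @ V, b, rcond=None)   # minimize ||A V y - b||
    x_m = V @ y
    print(m, np.linalg.norm(b - A @ x_m))           # residual falls with m
```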
Although Krylov methods are generally quite similar in spirit, they can be quite different in detail. Recognizing this fact, we nonetheless describe a “typical” Krylov method as follows. The $m$th solution iterate, $\vec x^{(m)}$, is an element of $K_m(\mathbf{A},\vec b)$. The first step in the algorithm is to build an orthogonal basis for $K_m(\mathbf{A},\vec b)$.
Orthogonalization of the Krylov vectors is necessary because these vectors become less linearly independent with increasing $m$. This follows from the fact that $\mathbf{A}^j\vec b$ approaches the fundamental eigenvector (the eigenvector associated with the eigenvalue of largest magnitude) as $j$ increases. Different methods achieve orthogonality in different ways, and in some cases, the vectors are orthogonal with respect to some particular inner product. For illustrative purposes, we simply assume that the vectors are orthogonal with respect to the standard dot product. For reasons that will be explained later, the orthogonal basis vectors are also normalized. The $m$th solution iterate is expressed as a linear combination of the orthonormal basis vectors:

$$\vec x^{(m)} = \mathbf{V}_m\,\vec y_m, \tag{1.201}$$

where $\mathbf{V}_m$ is an $N \times m$ matrix whose $j$th column consists of the $j$th orthonormal basis vector, and $\vec y_m$ is the vector of expansion coefficients, of length $m$. The expansion coefficients are uniquely determined via a residual orthogonality condition. More specifically, an $m$-dimensional weighting space of vectors is first chosen,

$$W_m = \mathrm{span}\big\{\vec w_1,\ \vec w_2,\ \ldots,\ \vec w_m\big\}, \tag{1.202}$$

and then the expansion coefficients are chosen so that the residual vector associated with $\vec x^{(m)}$,

$$\vec r_m = \vec b - \mathbf{A}\vec x^{(m)} = \vec b - \mathbf{A}\mathbf{V}_m\,\vec y_m, \tag{1.203}$$

is orthogonal to the weighting space, i.e.,

$$\mathbf{W}_m^T\,\vec r_m = 0, \tag{1.204}$$

where $\mathbf{W}_m$ is an $N \times m$ matrix whose $j$th column consists of the $j$th weighting vector. Substituting from Eq. (1.203) into Eq. (1.204), we obtain the following $m \times m$ matrix equation for the coefficient vector:

$$\mathbf{W}_m^T\mathbf{A}\mathbf{V}_m\,\vec y_m = \mathbf{W}_m^T\,\vec b. \tag{1.205}$$

Different Krylov methods arise from different choices of the weighting space. For instance, if $\mathbf{A}$ is positive-definite, $W_m = K_m(\mathbf{A},\vec b)$ is a good choice. This results in a classic Ritz–Galerkin approximation, because the weighting space and the approximate solution space are identical:

$$\mathbf{V}_m^T\mathbf{A}\mathbf{V}_m\,\vec y_m = \mathbf{V}_m^T\,\vec b. \tag{1.206}$$
The $A$-norm of the error, defined by

$$\big\|\vec x - \vec x^{(m)}\big\|_A = \big(\vec x - \vec x^{(m)}\big)^T\mathbf{A}\,\big(\vec x - \vec x^{(m)}\big), \tag{1.207}$$

is minimized under this approximation. The quantity defined by Eq. (1.207) is always minimized, but it is not a true norm unless $\mathbf{A}$ is positive-definite. The conjugate-gradient (CG) method [81] uses the Ritz–Galerkin approximation.
If $\mathbf{A}$ is not positive-definite, a good choice for the weighting space is $\mathbf{A}K_m(\mathbf{A},\vec b)$, i.e., the space spanned by $\mathbf{A}$ times the Krylov vectors. In this case, Eq. (1.205) becomes

$$\big(\mathbf{A}\mathbf{V}_m\big)^T\big(\mathbf{A}\mathbf{V}_m\big)\,\vec y_m = \big(\mathbf{A}\mathbf{V}_m\big)^T\,\vec b. \tag{1.208}$$

Equation (1.208) is easily recognized as the least-squares approximation to the overdetermined linear system

$$\mathbf{A}\big(\mathbf{V}_m\,\vec y_m\big) = \vec b. \tag{1.209}$$

Thus, the residual associated with this approximation is minimized with respect to the $L_2$ norm, defined as

$$\big\|\vec r_m\big\|_{L_2} = \sqrt{\vec r_m^{\,T}\,\vec r_m}. \tag{1.210}$$

The generalized minimum-residual (GMRES) and minimum-residual (MINRES) methods [81] use the least-squares approximation.
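The two weighting choices are contrasted in the short sketch below: with an orthonormal Krylov basis built by Gram-Schmidt, the choice $\mathbf{W}_m = \mathbf{V}_m$ yields the Ritz–Galerkin solve of Eq. (1.206), while $\mathbf{W}_m = \mathbf{A}\mathbf{V}_m$ yields the least-squares solve of Eq. (1.208). The SPD test matrix is again hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
Nd, m = 40, 8
B = rng.standard_normal((Nd, Nd))
A = B @ B.T + Nd * np.eye(Nd)              # SPD test matrix
b = rng.standard_normal(Nd)

V = np.empty((Nd, m))
v = b / np.linalg.norm(b)
for j in range(m):                         # Gram-Schmidt on b, Ab, A^2 b, ...
    V[:, j] = v
    v = A @ v
    v -= V[:, :j + 1] @ (V[:, :j + 1].T @ v)
    v /= np.linalg.norm(v)

y_rg = np.linalg.solve(V.T @ A @ V, V.T @ b)                 # Eq. (1.206)
y_ls = np.linalg.solve((A @ V).T @ (A @ V), (A @ V).T @ b)   # Eq. (1.208)
for tag, y in (("Ritz-Galerkin", y_rg), ("least-squares", y_ls)):
    print(tag, np.linalg.norm(b - A @ (V @ y)))
```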
Weighting spaces other than those discussed above can be used, but when this is
done, neither the error nor the residual are minimized in any rigorous sense. However, there are advantages to using other weighting spaces, relating to the amount of
computational work and the amount of data storage required to solve for xE .m/ .
In general, Eq. (1.205) is not actually solved to obtain yEm . For instance, in the
GMRES method, an equivalent matrix equation based on an Arnoldi decomposition
of A is used. The details of the process used to obtain xE .m/ are not important for
our purposes, but the amount of computational work and the amount of data storage
required to obtain xE .m/ are important. The storage costs associated with the Krylov
vectors can be prohibitive if one is solving very large linear systems for which many
iterations are required to achieve convergence. This is not a difficulty if A is symmetric positive-definite and the CG method is used, because xE .m/ can be calculated
using a three-term recursion formula [81]. As a result, one need only save the previous solution iterate and one other vector of similar length to compute a new solution
iterate. In this case, the required data storage does not increase with the number
of iterations as it otherwise does. This is a great advantage associated with the CG
method. If A is not symmetric positive-definite, one can avoid the storage issue by
choosing the weighting space such that a recursion formula can be used to compute
xE .m/ . However, as previously noted, this approach generally fails to minimize the
error or the residual at each iteration step, which can result in erratic convergence
properties. The biconjugate-gradient (BCG) and quasi-minimum-residual (QMR)
methods [81] use this type of approach. If a recursion formula cannot be used, a
restart strategy must generally be used. Under such a strategy, one chooses a maximum number of Krylov iterations to perform within each “stage” before restarting
the Krylov process with the last iterate from the previous stage serving as the initial
solution guess for the next stage. The difficulty with this approach is that convergence of the restarted process is not guaranteed for general matrices. However,
the restarted GMRES method is guaranteed to converge if A is positive-definite,
i.e., if $\vec{x}^T A \vec{x} > 0$ for all $\vec{x} \ne 0$, or equivalently, if the eigenvalues of $A + A^T$ are all positive. As we later explain, this property is relevant for SN calculations.
In summary, all Krylov methods (i) use a Krylov space of dimension m to approximate the mth solution iterate, and (ii) force the residual associated with that
iterate to be orthogonal to an m-dimensional weighting space of vectors. The main
differences between the various types of Krylov methods arise from the nature of the
weighting space of vectors and the computational cost (both in CPU time and memory) of obtaining a solution iterate. Some Krylov methods are designed for general
linear systems, while others are specifically designed to be optimal for some particular type of linear system. For instance, the conjugate-gradient method is designed
for symmetric positive-definite systems while the minimum-residual method is designed for symmetric indefinite systems.
1.9.2 Convergence and Preconditioning of Krylov Methods
The speed of convergence of Krylov methods depends strongly on the characteristics of the coefficient matrix associated with the linear system being solved, i.e., the characteristics of the coefficient matrix $A$ in Eq. (1.194). This matrix is normal if $A A^T = A^T A$. The convergence properties associated with normal coefficient matrices are fairly well understood, but characterizing the convergence properties associated with non-normal coefficient matrices remains an open problem. A property associated with the convergence of the conjugate-gradient (CG) method is the condition number of the coefficient matrix, defined as the ratio of its largest singular value to its smallest singular value:

$\kappa = \dfrac{\sigma_{\max}}{\sigma_{\min}}.$   (1.211)

If $\kappa$ is not large, the convergence of the CG method will be rapid. However, if $\kappa$ is large, no conclusion can be made regarding the rate of convergence of CG iterations. The condition number is not particularly relevant to the convergence of
the generalized minimum-residual (GMRES) method. More relevant factors are the
distribution of the eigenvalues of A in the complex plane, and (assuming that A is
diagonalizable) the condition number of the matrix of eigenvectors of A. An eigenvalue spectrum clustered away from zero and entirely contained in either half-plane
and an eigenvector matrix of A with a small condition number are desirable. If any
single property of the coefficient matrix is desirable with a Krylov method, it is
that its eigenvalues should be clustered away from zero. A set of eigenvalues is
considered to be clustered if the distance between any two eigenvalues is much
smaller than the distance of any eigenvalue from the origin. We stress that all of
the eigenvalues need not be in one cluster or a collection of clusters. Any clustering
of eigenvalues can be advantageous relative to a completely unclustered eigenvalue
distribution.
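These spectral quantities are straightforward to inspect numerically for small systems; the following sketch (illustrative data only) computes the condition number of Eq. (1.211), tests normality, and measures how far the spectrum lies from the origin:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100)) + 10.0 * np.eye(100)

is_normal = np.allclose(A @ A.T, A.T @ A)     # normality test: A A^T = A^T A
sv = np.linalg.svd(A, compute_uv=False)
kappa = sv.max() / sv.min()                   # condition number, Eq. (1.211)

lam = np.linalg.eigvals(A)                    # spectrum in the complex plane
closest_to_origin = np.abs(lam).min()         # clustering is judged against this distance
print(is_normal, kappa, closest_to_origin)
```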
The maximum number of iterations required for a Krylov method to converge is $N$, the dimension of $A$. However, it is desirable to use the technique of preconditioning to reduce the number of required iterations, both from the viewpoint of efficiency and because roundoff errors can be problematic when $N$ is large. A preconditioner for $A$ is a matrix that approximates $A^{-1}$. Given a preconditioning matrix $C$, the preconditioning process can take two forms. The first, called left-preconditioning, corresponds to solving

$C A \vec{x} = C \vec{y}.$   (1.212)

In the absence of round-off error, the above equation clearly has the same solution as Eq. (1.194). The second is called right-preconditioning and corresponds to solving

$A C \vec{z} = \vec{y},$   (1.213)

where

$\vec{x} = C \vec{z}.$   (1.214)

We note from Eq. (1.214) that once Eq. (1.213) has been solved, $\vec{x}$ is obtained from $\vec{z}$ via the action of $C$ on $\vec{z}$. If $C = A^{-1}$, both Eqs. (1.212) and (1.213) can be solved in a single Krylov iteration. This is why a preconditioner should approximate $A^{-1}$. However, from a practical point of view, a preconditioner should approximate $A^{-1}$ only in a limited sense, because it is not generally possible to find a nearly exact approximation to $A^{-1}$ at a low computational cost. Moreover, if a nearly exact approximation to $A^{-1}$ were known, a nearly exact solution to the problem, Eq. (1.194), would be immediately realizable without the need for iterations.
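Both preconditioning forms can be expressed with a matrix-free Krylov routine; in this sketch `apply_C` is a hypothetical stand-in for any approximation to $A^{-1}$ (a simple Jacobi choice is used purely for illustration):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(2)
n = 200
A = rng.standard_normal((n, n)) + 10.0 * np.eye(n)
y = rng.standard_normal(n)

dinv = 1.0 / np.diag(A)
def apply_C(v):
    # C approximates A^{-1}; a Jacobi (diagonal inverse) choice for illustration
    return dinv * v

# Left preconditioning, Eq. (1.212): solve (C A) x = C y
CA = LinearOperator((n, n), matvec=lambda v: apply_C(A @ v), dtype=float)
x_left, _ = gmres(CA, apply_C(y))

# Right preconditioning, Eqs. (1.213)-(1.214): solve (A C) z = y, then x = C z
AC = LinearOperator((n, n), matvec=lambda v: A @ apply_C(v), dtype=float)
z, _ = gmres(AC, y)
x_right = apply_C(z)
```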
Experience in solving the transport equation with Krylov methods indicates that
the best preconditioners move the smallest eigenvalues significantly away from zero,
while leaving the largest eigenvalues relatively unaffected [124]. Assuming a preconditioned transport system that is SPD, the effectiveness of such preconditioners
is easily explained when they are used in conjunction with the CG method. In particular, the eigenvalues and singular values of an SPD matrix are identical, so the
condition number of such a matrix is just the ratio of the largest eigenvalue to the
smallest eigenvalue. Thus, moving the eigenvalues closest to zero away from zero,
while leaving the largest eigenvalues essentially unaffected, will decrease the condition number of the coefficient matrix and thereby result in faster convergence
of the CG method. However, even though such preconditioners are observed to
remain effective when used in conjunction with nonsymmetric transport systems
and more general Krylov methods, the reasons for their effectiveness are not yet
well understood. To move the smallest eigenvalues away from zero, the preconditioner must accurately approximate the inverse of the coefficient matrix when
operating on the eigenvectors associated with the smallest eigenvalues. To leave
the largest eigenvalues relatively unaffected, the preconditioner must act approximately as the identity matrix when operating on the eigenvectors associated with the
largest eigenvalues. Such preconditioners are analogous to the coarse-grid operators
used in multigrid methods and the low-order operators used in synthetic acceleration schemes. Thus, it is somewhat clear how such preconditioners should be
constructed.
As previously stated, the convergence of Krylov methods can be significantly improved by clustering some of the eigenvalues. Unfortunately, the properties required
for a preconditioner to cluster eigenvalues are generally much less clear than those
required to move the smallest eigenvalues away from zero. Nonetheless, as we shall
later see, moderately effective preconditioners for transport calculations have been
found that cluster some of the eigenvalues while leaving the smallest eigenvalues
essentially unaltered [124].
Finally, moderately effective preconditioners for transport calculations have been
found that move the smallest eigenvalues away from zero, but also significantly increase and spread out (uncluster) the largest eigenvalues [121, 122]. One can give
a plausible explanation for the effectiveness of such preconditioners when they are
used in conjunction with the CG method, but such an explanation is not currently
possible with more general Krylov methods. For instance, if we assume a preconditioned transport system that is SPD, moving the smallest eigenvalues away from
zero decreases the condition number of the coefficient matrix, while increasing the
large eigenvalues increases the condition number of the coefficient matrix. We can
conjecture that these preconditioners are moderately effective when used in conjunction with the CG method because the decrease in the condition number associated
with moving the smallest eigenvalues away from zero weakly dominates the increase in the condition number associated with increasing the largest eigenvalues.
This results in a net decrease in the condition number of the coefficient matrix and
an attendant increase in the convergence rate of the CG method. Preconditioners of
this type are analogous to unstable acceleration schemes that strongly attenuate the
low-frequency Fourier error modes but weakly amplify the high-frequency Fourier
error modes. We emphasize that not every acceleration scheme that attenuates low-frequency Fourier error modes and amplifies high-frequency Fourier error modes
can be expected to be effective when recast as a preconditioned Krylov method. If
the amplification factors for the high-frequency Fourier error modes are too large,
the corresponding preconditioner may increase the largest eigenvalues too much
relative to the movement of the smallest eigenvalues away from zero, presumably
resulting in a net decrease in convergence rate.
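The condition-number heuristic can be demonstrated numerically. In the sketch below (an artificial construction of ours, not a transport preconditioner), an SPD matrix is given three eigenvalues very near zero; a deflation-style preconditioner that acts as $A^{-1}$ on just those eigenvectors, and as the identity elsewhere, sharply reduces the CG iteration count:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

n = 300
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.linspace(1.0, 2.0, n)
lam[:3] = [1e-4, 2e-4, 3e-4]          # three eigenvalues very near zero
A = (Q * lam) @ Q.T                    # SPD matrix with a known spectrum
b = rng.standard_normal(n)

# Preconditioner: A^{-1} on the three "slow" eigenvectors, identity elsewhere
V = Q[:, :3]
M = LinearOperator((n, n), dtype=float,
                   matvec=lambda v: v + V @ ((1.0 / lam[:3] - 1.0) * (V.T @ v)))

def cg_iters(mat, rhs, M=None):
    """Run CG and count iterations via the callback hook."""
    count = 0
    def cb(xk):
        nonlocal count
        count += 1
    cg(mat, rhs, M=M, callback=cb, maxiter=5000)
    return count

print("CG iterations, unpreconditioned:", cg_iters(A, b))
print("CG iterations, preconditioned:  ", cg_iters(A, b, M=M))
```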
1.9.3 Applying Krylov Methods to the SN Equations
The optimal application of Krylov methods to the transport equation is not necessarily obvious. To illustrate, let us consider a monoenergetic approximation with isotropic scattering in 1-D slab geometry. The transport equation can be expressed as

$\left( L - \frac{1}{2} \Sigma_s P \right) \psi = Q,$   (1.215)

where

$L \psi = \mu \frac{\partial \psi}{\partial x} + \Sigma_t \psi,$   (1.216)

and

$P \psi = \int_{-1}^{+1} \psi \, d\mu.$   (1.217)

A Krylov method could be used to solve a fully discretized SN version of Eq. (1.215), but it would not be optimal. A better approach is to left-precondition Eq. (1.215) with $L^{-1}$:

$\left( I - \frac{1}{2} L^{-1} \Sigma_s P \right) \psi = L^{-1} Q.$   (1.218)
This approach is better for three reasons. First, the analytic operator on the left side of Eq. (1.218) represents the integral transport operator, which is bounded, whereas the differential transport operator is unbounded. This implies that the discretized SN version of the integral operator will have eigenvalues restricted to a bounded (hence much smaller) region of the complex plane than the discretized SN version of the differential operator. Furthermore, the integral transport operator is a compact perturbation of the identity operator, hence many (but not all) of the eigenvalues of the discretized SN version of the integral operator will be clustered about unity. Thus, Krylov methods will generally converge faster when used to solve Eq. (1.218) than when used to solve Eq. (1.215). The second reason for solving Eq. (1.218) is that the action of $L^{-1}$ is easily and economically obtained via a sweep, as sketched below. Thus, the improved convergence rate will not be outweighed by the cost of preconditioning with $L^{-1}$. (We note that $L^{-1}$ is an example of a preconditioner that clusters eigenvalues but does not significantly move the smallest eigenvalues away from zero. The improvement in convergence rate is significant, but preconditioning with $L^{-1}$ will not result in rapid convergence for all problems.) The third reason for solving Eq. (1.218) is that it has been observed (but not rigorously proven [124]) that the matrices associated with SN discretizations of Eq. (1.218) are positive-definite. As previously noted, the restarted GMRES algorithm is guaranteed to converge when applied to linear systems with this property. This is a significant property of Eq. (1.218).
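A minimal sketch of such a sweep in 1-D slab geometry follows; the upwind discretization, vacuum boundaries, and all names are our assumptions for illustration, not the discretization of any particular SN code:

```python
import numpy as np

def sweep(q, mu, sigma_t, dx):
    """Apply L^{-1} to a source q for a single ordinate mu (one sweep, no iteration)."""
    n = q.size
    psi = np.zeros(n)
    inflow = 0.0                         # vacuum boundary condition
    cells = range(n) if mu > 0 else range(n - 1, -1, -1)
    for i in cells:
        # upwind balance: |mu| (psi_i - psi_upwind)/dx + sigma_t psi_i = q_i
        psi[i] = (q[i] + abs(mu) * inflow / dx) / (sigma_t + abs(mu) / dx)
        inflow = psi[i]
    return psi

# one ordinate of a quadrature set, swept across a 100-cell slab
psi = sweep(np.ones(100), mu=0.5773, sigma_t=1.0, dx=0.1)
```

The marching direction follows the sign of $\mu$, which is why the cost of applying $L^{-1}$ is only one pass over the mesh per ordinate.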
A further improvement can be obtained by multiplying Eq. (1.218) by $P$. This results in an integral equation for the scalar flux $\phi = P\psi$ (also known as Peierls' equation):

$\left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \phi = P L^{-1} Q.$   (1.219)
The main advantage of this approach is that the dimension of the matrix associated
with a discrete version of Eq. (1.219) is much less than that associated with a discrete version of Eq. (1.218), greatly reducing the amount of computation required
per Krylov iteration. The derivation yielding Eq. (1.219) can be generalized for any
order of anisotropic scattering, resulting in an integral equation for the Legendre (or
spherical-harmonic) flux moments. Solving an integral moments equation will be
advantageous relative to Eq. (1.218) whenever the number of moments is less than
the number of discrete angular flux directions. Another potential advantage of the
integral moments approach is that analytic versions of such equations can be made
self-adjoint and positive-definite (SAPD) via both a preconditioner and the use of a
nonstandard inner product [107]. If a discretization for such an equation preserves
this SAPD property, then the corresponding discrete system can be expressed in an
SPD form and the conjugate-gradient method can be used to solve it. The SAPD
property has been observed to be preserved for Eq. (1.219) on orthogonal meshes
with a wide variety of SN spatial discretization schemes [82], but it is not preserved
with the linear-discontinuous spatial discretization scheme on unstructured tetrahedral meshes [124]. The conditions under which the SAPD property is preserved are
not currently well understood, but for reasons given later, this is not as important an
issue as it might appear to be.
Finally, perhaps the optimal approach would be to multiply Eq. (1.219) by the operator $I + D^{-1} \Sigma_s$:

$\left( I + D^{-1} \Sigma_s \right) \left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \phi = \left( I + D^{-1} \Sigma_s \right) P L^{-1} Q,$   (1.220)

where $D$ is the diffusion operator:

$D \phi = -\frac{\partial}{\partial x} \frac{1}{3 \Sigma_t} \frac{\partial \phi}{\partial x} + \Sigma_a \phi.$   (1.221)
This is the analog of diffusion-synthetic acceleration, and it (presumably) results
in rapid convergence under almost all practical conditions [124]. The deficiency of
DSA in multidimensions with large discontinuities in the cross sections is strongly
diminished when it is recast as a preconditioned Krylov method [124]. This is the
great advantage of Krylov methods: a single eigenvalue of the iteration matrix very near unity can ruin the performance of an acceleration scheme, whereas a single anomalously large eigenvalue associated with the system being solved generally has little effect on a Krylov method.
As previously noted, modern SN codes generally use discontinuous spatial discretization schemes. If a discontinuous diffusion discretization is used, the left side
of Eq. (1.220) will not be symmetric [124]. A great advantage of preconditioned
Krylov methods is that the diffusion discretization does not have to be fully consistent with the SN spatial discretization to be effective. However, if a discontinuous
SN spatial discretization is used, some capability for calculating a discontinuous
solution must be included in the diffusion approximation to ensure unconditional
effectiveness, and this will almost certainly cause the overall diffusion operator to
be either nonsymmetric or symmetric-indefinite. Thus, even if a discontinuous discretization of Eq. (1.219) were to preserve the SAPD property of Peierls’ operator
and result in an SPD system, applying a DSA-like preconditioner would result in
an overall non-SPD system, precluding the use of the conjugate-gradient method to
solve that system. This is why the failure of discontinuous methods to preserve the
SAPD property of Peierls’ operator on unstructured meshes is not as important an
issue as it might be. On the other hand, even when a discontinuous diffusion approximation is used, the operator on the left side of Eq. (1.220) remains positive-definite,
ensuring the convergence of the restarted GMRES method. This is an important
property of Eq. (1.220).
Next, we review the operations required to solve a fully discretized SN version of Eq. (1.220) and show that they are quite similar to the operations that take place in a standard SN code with diffusion-synthetic acceleration. We first note that to solve a general linear system such as Eq. (1.194), most Krylov routines do not require the formation of the matrix $A$. Rather, it is required that the calling routine provide the Krylov routine with a vector that represents the action of $A$. Specifically, the Krylov routine provides the calling program with a vector, which we denote by $\vec{v}$, and the calling routine returns the vector $\vec{v}\,'$, where

$\vec{v}\,' = A \vec{v}.$   (1.222)
The Krylov space can easily be built up this way, with one matrix-vector multiply
per iteration. The simplest Krylov methods (e.g., the conjugate-gradient method)
require only one matrix-vector multiply per iteration, but others may require more
(e.g., the biconjugate-gradient method requires two).
We now consider the solution of Eq. (1.220) in detail. Since the unknown in
Eq. (1.220) is the monoenergetic or one-group scalar flux, a Krylov method used
to solve this equation works with “scalar flux” vectors, i.e., vectors with a length
equal to the number of spatial cells times the number of spatial unknowns per cell.
The latter number clearly depends on the spatial discretization used for Eq. (1.220).
In the process of generating the scalar flux vectors required by the Krylov solver,
intermediate “angular flux vectors” arise. These vectors have a length equal to the
length of a scalar flux vector times the number of discrete directions in the quadrature set. In the descriptions that follow, angular flux vectors carry an additional
subscript “a” to distinguish them from scalar flux vectors. The first vectors that must
be provided to a Krylov solver are the initial solution guess, and the vector corresponding to the right side of Eq. (1.220). The latter “source” vector is generated as
follows:
1. Perform a transport sweep with $\vec{Q}_a$ as the source to obtain

   $\vec{v}_{1,a} = L^{-1} \vec{Q}_a.$

2. Integrate $\vec{v}_{1,a}$ over all directions to obtain

   $\vec{v}_2 = P L^{-1} \vec{Q}_a.$

3. Multiply $\vec{v}_2$ by $\Sigma_s$ (a diagonal matrix in this context) to obtain a diffusion source vector:

   $\vec{v}_3 = \Sigma_s P L^{-1} \vec{Q}_a.$

4. Solve the diffusion equation with $\vec{v}_3$ as the source to obtain

   $\vec{v}_4 = D^{-1} \Sigma_s P L^{-1} \vec{Q}_a.$

5. Add $\vec{v}_2$ and $\vec{v}_4$ to obtain the desired source vector:

   $\vec{v}_5 = (I + D^{-1} \Sigma_s) P L^{-1} \vec{Q}_a.$
Next, every iteration requires the calling routine to provide the Krylov solver with the action of the operator on the left side of Eq. (1.220). The action of this operator on the vector $\vec{v}$ is calculated as follows:

1. Multiply $\vec{v}$ by $\Sigma_s / 2$ and map it to an isotropic angle-dependent vector to obtain a transport source vector:

   $\vec{v}_{1,a} = \frac{1}{2} \Sigma_s \vec{v}.$

2. Perform a sweep with $\vec{v}_{1,a}$ as the source to obtain

   $\vec{v}_{2,a} = \frac{1}{2} L^{-1} \Sigma_s \vec{v}.$

3. Integrate $\vec{v}_{2,a}$ over all angles to obtain

   $\vec{v}_3 = \frac{1}{2} P L^{-1} \Sigma_s \vec{v}.$

4. Subtract $\vec{v}_3$ from $\vec{v}$ to obtain

   $\vec{v}_4 = \left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \vec{v}.$

5. Multiply $\vec{v}_4$ by $\Sigma_s$ to obtain a diffusion source vector:

   $\vec{v}_5 = \Sigma_s \left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \vec{v}.$

6. Perform a diffusion calculation with $\vec{v}_5$ as the source to obtain

   $\vec{v}_6 = D^{-1} \Sigma_s \left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \vec{v}.$

7. Add $\vec{v}_4$ to $\vec{v}_6$ to obtain the desired action on $\vec{v}$:

   $\vec{v}_7 = \left( I + D^{-1} \Sigma_s \right) \left( I - \frac{1}{2} P L^{-1} \Sigma_s \right) \vec{v}.$
Thus we see that only sweeps, diffusion solves, and vector additions and subtractions are required to solve the transport equation via a preconditioned Krylov
method. Because these operations occur in standard SN codes with DSA, it is not difficult to modify existing SN codes to use preconditioned Krylov solution techniques.
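The two step lists above translate almost line-for-line into a matrix-free Krylov solve. The sketch below wires together a crude, self-contained 1-D realization (toy cross sections, a simple upwind sweep for $L^{-1}$, and a rough tridiagonal stand-in for the diffusion operator $D$), all of it illustrative rather than production SN code:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# toy 1-D slab data (all values illustrative)
n, dx = 100, 0.1
sigma_t, sigma_s = 1.0, 0.9
sigma_a = sigma_t - sigma_s
mus, wts = np.polynomial.legendre.leggauss(8)     # S_8-like ordinates on [-1, 1]

def sweep(q, mu):
    """L^{-1} q for one ordinate (upwind marching, vacuum boundaries)."""
    psi, inflow = np.zeros(n), 0.0
    for i in (range(n) if mu > 0 else range(n - 1, -1, -1)):
        psi[i] = (q[i] + abs(mu) * inflow / dx) / (sigma_t + abs(mu) / dx)
        inflow = psi[i]
    return psi

def PLinv(q):
    """P L^{-1} q for an isotropic (scalar) source q: sweep all ordinates, integrate."""
    return sum(w * sweep(q, mu) for mu, w in zip(mus, wts))

# crude tridiagonal discretization of D = -d/dx (1/(3 sigma_t)) d/dx + sigma_a
c = 1.0 / (3.0 * sigma_t * dx**2)
Dmat = (np.diag(np.full(n, 2.0 * c + sigma_a))
        + np.diag(np.full(n - 1, -c), 1) + np.diag(np.full(n - 1, -c), -1))
def diffusion_solve(src):
    return np.linalg.solve(Dmat, src)

def matvec(v):
    """Steps 1-7 above: (I + D^{-1} Sigma_s)(I - (1/2) P L^{-1} Sigma_s) v."""
    v4 = v - PLinv(0.5 * sigma_s * v)             # steps 1-4
    return v4 + diffusion_solve(sigma_s * v4)     # steps 5-7

Q = np.ones(n)                                    # isotropic fixed source
v2 = PLinv(Q)                                     # steps 1-2 of the source recipe
rhs = v2 + diffusion_solve(sigma_s * v2)          # steps 3-5

op = LinearOperator((n, n), matvec=matvec, dtype=float)
phi, info = gmres(op, rhs)                        # scalar flux of Eq. (1.220)
```

Note that, just as the text states, the solver itself only ever calls `matvec`; sweeps, diffusion solves, and vector additions and subtractions are the only transport-specific operations involved.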
1.10 Future Challenges
In this review, we have discussed some of the major improvements to numerical
techniques, developed during the last three decades, for the SN equations. Perhaps the most far-reaching improvements have resulted from discontinuous finite-element methods, which have impacted the discretization of essentially all of
the variables in the transport equation. The present challenge for discontinuous
finite-element methods is to achieve both accuracy and robustness on general
nonorthogonal meshes in the optically thin limit, the optically thick absorptive limit,
and the optically thick diffusive limit. These properties can be achieved on orthogonal meshes with standard lumping techniques, but such techniques are problematic
on nonorthogonal meshes. Achieving robust strictly positive solutions with at least
second-order accuracy in regions where fixup is not required is a classic problem
that arises in many fields. Typical fixup schemes for transport are highly nonlinear
and nondifferentiable. For instance, a popular fixup scheme used with spatial diamond differencing is to set any negative outflow fluxes in a cell to zero and then
recalculate the cell-averaged flux [71]. When a problem contains both highly absorptive and highly scattering regions, the resulting equations can be difficult to
solve even when nonlinear solution techniques are used because of the nondifferentiable nature of the “fixup” scheme. A new concept is now emerging by which
solutions are not “fixed up” during the iteration process, but rather are “repaired”
after convergence [123]. New ideas, such as this, will be required to solve the old
and difficult problem of ensuring positive and monotone second-order solutions.
Discrete-ordinates methods have traditionally been viewed as being limited to
simple geometries, due to the use of orthogonal meshes. The recent development
of SN methods for general unstructured meshes constitutes a major advance in the
applicability of SN methods to geometrically complex problems. Highly efficient
and scalable massively parallel SN solution techniques have been developed for orthogonal meshes [51], but massively parallel solution techniques for unstructured
meshes are not yet as efficient or scalable [104, 114]. Also, fundamental problems
occur in the SN method on meshes with curved cell faces. These problems have
been addressed to a limited extent [109], but much more computational experience
is needed before the efficacy of current approaches is fully demonstrated.
Tremendous progress in DSA and DSA-like acceleration schemes has been
made during the last two decades. The field of acceleration techniques was dealt
a significant setback when it was discovered that these schemes are not always
unconditionally effective, even with fully consistent diffusion discretizations [87].
However, the advances recently made in the application of Krylov methods to
transport calculations have enabled synthetic acceleration schemes to be recast as
unconditionally effective preconditioners, thereby regaining a powerful potential
for these methods [124]. Developing optimal preconditioned Krylov methods for
transport calculations remains an outstanding research topic.
A significant challenge for charged-particle transport is to use the SN method
to efficiently and accurately treat problems with pencil-beam or near pencil-beam
sources. Some progress has recently been made in this regard [125], but more
efficient methods requiring lower orders of approximation are needed. Since
charged-particle scattering is extremely anisotropic, Galerkin quadratures represent an important alternative to standard quadratures for charged-particle transport
calculations. However, much remains to be understood regarding the optimal
properties of Galerkin quadrature sets. For example, it is desirable for standard
quadrature sets to have positive quadrature weights. Galerkin quadrature sets
have a (potentially different) set of weights for evaluating each flux moment.
The Galerkin weights associated with the evaluation of the zeroth flux moment
represent the weights associated with the standard companion quadrature set. These
quadrature weights should be positive because (among other things) positivity indicates a stable spherical-harmonic interpolation. However, the connection between
positive weights for evaluating the higher-order flux moments and the stability of
the spherical-harmonic interpolation is not clear. To our knowledge, a multidimensional Galerkin quadrature set beyond S2 with positive weights for all moments has
never been constructed.
Finally, very little progress has been made with respect to perhaps the most significant of all SN deficiencies – the ray effect [119]. Although PN methods do not
suffer from classic ray effects, they rarely provide the correct solutions to problems
for which SN solutions exhibit ray effects. This is because the exact solutions to such
problems usually have an extremely anisotropic angular dependence, and the global
nature of spherical-harmonic functions makes them inefficient for representing such
dependencies. Unfortunately, the introduction of angular coupling via continuous
and discontinuous finite-element approximations does not appear promising for ray-effect mitigation [57, 119]. Because the solutions associated with ray effects are so
complex, the accurate elimination of ray effects will probably require angular adaptation. Although adaptive methods are quite mature in many fields of computation,
they are just beginning to be investigated for transport calculations [85,92,111,117];
this possibly represents an important research area for the future.
Acknowledgments We would like to acknowledge the invaluable assistance of Michele Benzi of
Emory University, James Warsa of Los Alamos National Laboratory, and Carrie Beck. Michele
graciously reviewed our section on Krylov methods. James provided us with basic information on
Krylov methods and their application to the SN equations. Carrie accomplished the laborious task
of typing this manuscript in its present format. We would also like to acknowledge that one of the
authors (JEM) was an employee of Los Alamos National Laboratory when much of this manuscript
was written. Los Alamos National Laboratory is operated by Los Alamos National Security, LLC,
for the US Department of Energy under Contract DE-AC52-06NA25396.
Authors’ Note
The following reference list consists mostly of papers that, in the authors’ opinions,
have significantly contributed to the topics discussed in this review. We have included several other important publications, but we do not intend for the list to be a
full compilation of significant publications on SN methods.
References
1900–1964
1. Eddington AS (1922) The internal constitution of the stars. Cambridge University Press,
Cambridge
2. Chandrasekhar S (1960) Radiative transfer. Dover, New York. First published by Oxford
University Press, London (1950)
3. Davison B (1957) Neutron transport theory. Oxford University Press, London
4. Carlson BG (1961) Numerical solution of neutron transport problems. In: Proc. of symposia
in applied mathematics, Vol. 11, Nuclear reactor theory, American Mathematical Society,
Providence, RI, 219
5. Carlson BG (1963) The numerical theory of neutron transport. In: Methods in Computational
Physics, Vol. 1, Academic, New York
6. Kopp HJ (1963) Synthetic method solution of the transport equation. Nucl Sci Eng 17:65
7. Vladimirov VS (1961) Mathematical problems in the one-velocity theory of particle transport, Transactions of V.A. Steklov Mathematical Institute, USSR Academy of Sciences, 61.
English translation published by Atomic Energy of Canada Limited, Tech. Report AECL-1661 (1963)
8. Gol’din VYa (1964) A quasi-diffusion method for solving the kinetic equation. Zh Vych Mat
I Mat Fiz 4:1078 (in Russian) English translation (1967) published in: USSR Comp Math
Math Phys 4:136
1965–1969
9. Lathrop KD (1965) Anisotropic scattering approximations in the mono-energetic Boltzmann
equation. Nucl Sci Eng 21:498
10. Carlson BG, Lathrop KD (1968) Transport theory – the method of discrete ordinates. In:
Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics. Gordon &
Breach, New York
11. Gelbard EM, Hageman LA (1969) The synthetic method as applied to the SN equations. Nucl
Sci Eng 37:288
12. Lathrop KD (1969) Spatial differencing of the transport equation: positivity vs accuracy.
J Comp Phys 4:475
1970–1974
13. Bell GI, Glasstone S (1970) Nuclear Reactor Theory. Van Nostrand Reinhold, New York
14. Carlson BG (1970) Transport theory: discrete ordinates quadrature over the unit sphere. Los
Alamos Scientific Laboratory report LA-4554
15. Carlson BG (1971) Tables of symmetric equal weight quadrature EQn over the unit sphere.
Los Alamos Scientific Laboratory Report LA-4734
16. Reed WH (1971) The effectiveness of acceleration techniques for iterative methods in transport theory. Nucl Sci Eng 45:245
17. Pomraning GC (1973) The equations of radiation hydrodynamics. Pergamon Press, Oxford
18. Reed WH, Hill TR (1973) Triangular mesh methods for the neutron transport equation. Los
Alamos Scientific Laboratory Report LA-UR-73–479
19. Dahlquist G, Björck Å (1974) Numerical methods. Prentice Hall, Englewood Cliffs, NJ
20. Kellogg RB (July 1974) First derivatives of solutions of the plane neutron transport equation,
Technical Note BN-783. Institute for Fluid Dynamics and Applied Mathematics, University
of Maryland, College Park, MD
1975–1979
21. Briggs LL, Miller WF Jr, Lewis EE (1975) Ray effect mitigation in discrete ordinate-like
angular finite-element approximations in neutron transport. Nucl Sci Eng 57:205
22. Hill TR (1975) ONETRAN: a discrete-ordinates finite-element code for the solution of the
one-dimensional multigroup transport equation. Los Alamos Scientific Laboratory Report
LA-5990-MS
23. Alcouffe RE (1977) Diffusion synthetic acceleration methods for the diamond-differenced
discrete-ordinates equations. Nucl Sci Eng 64:344
24. Miller WF Jr, Reed WH (1977) Ray effect mitigation methods for two-dimensional neutron
transport theory. Nucl Sci Eng 62:391
25. Alcouffe RE, Larsen EW, Miller WF Jr, Wienke BR (1979) Computational efficiency of numerical methods for the multigroup, discrete-ordinates neutron transport equations: the slab
geometry case. Nucl Sci Eng 71:111
26. Dudziak DJ, O’Dell RD, Alcouffe RE (July 1979) Transport and reactor theory. Los Alamos
National Laboratory Report, LA-7911-PR
27. Morel JE (1979) On the validity of the extended transport cross-section correction for low-energy electron transport. Nucl Sci Eng 71:64
1980–1984
28. Lawrence RD, Dorning JJ (1980) A discrete nodal integral transport theory method for multidimensional reactor physics and shielding calculations. In: Proc. ANS topical conference.
Advances in reactor physics and shielding, 14–19 Sept 1980. Sun Valley, Idaho, 840
29. Walters WF, O’Dell RD (1980) Nodal methods for discrete-ordinates transport problems in
(x-y) geometry. In: Proc. ANS topical conference. Advances in mathematical methods for the
solution of nuclear engineering problems, 27–29 April 1980. 1, 115. Munich
30. Gupta NK (1981) Nodal methods for three-dimensional simulators. Prog Nucl Energy 7:127
31. Morel JE (1981) Fokker–Planck calculations using standard discrete ordinates codes. Nucl
Sci Eng 79:340
32. Sanchez R, Mc Cormick NJ (1982) A review of neutron transport approximations. Nucl Sci
Eng 80:481 (review)
33. Larsen EW (1982) Spatial convergence properties of the diamond difference method in X,Y geometry. Nucl Sci Eng 80:710
34. Larsen EW (1982) Unconditionally stable diffusion synthetic acceleration methods for the slab
geometry discrete-ordinates equations. Part I: theory. Nucl Sci Eng 82:47
35. McCoy DR, Larsen EW (1982) Unconditionally stable diffusion synthetic acceleration methods for the slab geometry discrete-ordinates equations. Part II: numerical results. Nucl Sci
Eng 82:64
36. Morel JE (1982) A synthetic acceleration method for discrete ordinates calculations with
highly anisotropic scattering. Nucl Sci Eng 82:34
37. Caro M, Ligou J (1983) Treatment of scattering anisotropy of neutrons through the
Boltzmann–Fokker–Planck equation. Nucl Sci Eng 83:242
38. Morel JE, Montry GR (1984) Analysis and elimination of the discrete ordinates flux dip.
Transp Theory Stat Phys 13:615
39. Mihalas D, Mihalas BW (1984) Foundations of radiation hydrodynamics. Oxford University
Press, New York
1985–1989
40. Morel JE (1985) An improved Fokker–Planck angular differencing scheme. Nucl Sci Eng
89:131–136
41. Morel JE (1985) Diamond-difference multigroup-Legendre coefficients for the continuous slowing down operator. Nucl Sci Eng 91:324
42. Morel JE, Larsen EW, Matzen MK (1985) A synthetic acceleration scheme for radiative diffusion calculations. J Quant Spectrosc Radiat Transf 34:243
43. Badruzzaman A (1985) An efficient algorithm for nodal transport solutions in multidimensional geometry. Nucl Sci Eng 89:281
44. Khalil H (1985) A nodal diffusion technique for synthetic acceleration of nodal SN calculations. Nucl Sci Eng 90:263
45. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and
transport equations. Prog Nucl Energy 17:271 (review)
46. Marchuk GI, Lebedev VI (1986) Numerical methods in the theory of neutron transport.
Harwood Academic Publishers, London (originally published in Russian in 1981)
47. Lazo MS, Morel JE (1986) A linear discontinuous Galerkin approximation for the continuous
slowing down operator. Nucl Sci Eng 92:98
48. Ullo JJ (1986) Use of multidimensional transport methodology on nuclear logging problems.
Nucl Sci Eng 92:228
49. Larsen EW, Morel JE, Miller WF Jr (1987) Asymptotic solutions of numerical transport
problems in optically thick, diffusive regimes. J Comp Phys 69:283
50. Larsen EW (1988) A grey transport acceleration method for thermal radiative transfer
problems. J Comp Phys 78:459
51. Baker RS, Koch KR (1988) An SN algorithm for the massively parallel CM-200 computer.
Nucl Sci Eng 128:312
52. Larsen EW, Morel JE (1989) Asymptotic solutions of numerical transport problems in optically thick, diffusive regimes II. J Comp Phys 83:212. See also Corrigendum. J Comp Phys
91:246 (1990)
53. Faber V, Manteuffel TA (1989) A look at transport theory from the point of view of linear
algebra. In: Nelson P et al. (eds) Lecture notes in pure and applied mathematics, Vol. 115.
Marcel Dekker, New York, p 31
54. Morel JE (1989) A hybrid collocation-Galerkin-SN method for solving the Boltzmann transport equation. Nucl Sci Eng 101:72
55. Landesman M, Morel JE (1989) Angular Fokker–Planck decomposition and representation
techniques. Nucl Sci Eng 103:1
1990–1994
56. Cefus GR, Larsen EW (1990) Stability analysis of coarse mesh rebalance. Nucl Sci Eng
105:31
57. Coppa GGM, Lapenta G, Ravetto P (1990) Angular finite-element techniques in neutron
transport. Ann Nucl Energy 17:363
58. Lorence LJ Jr, Morel JE, Valdez GD (1990) Results guide to CEPXS/ONELD: a one-dimensional coupled electron-photon discrete ordinates code package. Sandia National Laboratories Report SAND89-211
59. Badruzzaman A (1991) Finite moments approaches to the time-dependent Boltzmann equation. Prog Nucl Energy 25:127
60. Walters WF, Morel JE (1991) Investigation of linear-discontinuous angular differencing for
the 1-D spherical geometry SN equations. In: Proc. ANS M&C topical meeting. Advances
in mathematics, computation, and reactor physics, 28 April–2 May 1991, Sec. 11.1,
Pittsburgh
61. Wareing T, Larsen EW, Adams ML (1991) Diffusion accelerated discontinuous finite element
schemes for the SN equations in slab and X,Y geometries. In: Proc. ANS topical meeting.
Advances in mathematics, computations, and reactor physics, 29 April–2 May 1991, Vol. 3,
Sec. 11.1. Pittsburgh, p 2–1
62. Morel JE, Manteuffel TA (1991) An angular multigrid acceleration technique for the SN equations with highly forward-peaked scattering. Nucl Sci Eng 107:330
63. Ashby SF, Brown PN, Dorr MR, Hindmarsh AC (1991) Preconditioned iterative methods for
discretized transport equations. In: Proc. International topical meeting. Advances in mathematics, computations, and reactor physics, 28 April–2 May 1991, Vol. 2. American Nuclear
Society, Pittsburgh, PA, p. 6.1 2–1
64. Adams ML, Martin WR (1992) Diffusion synthetic acceleration of discontinuous finite element transport iterations. Nucl Sci Eng 111:145
65. Azmy YY (1992) Arbitrarily high order characteristic methods for solving the neutron transport equation. Ann Nucl Energy 19:593
66. Larsen EW (1992) The asymptotic diffusion limit of discretized transport problems. Nucl Sci
Eng 112:336 (review)
67. Wareing T (November 1992) Asymptotic diffusion accelerated discontinuous finite element
methods for transport problems. Los Alamos National Laboratory Report LA-12425-T
68. Adams ML, Wareing TA (1993) Diffusion-synthetic acceleration given anisotropic scattering,
general quadratures, and multi-dimensions. Trans Am Nucl Soc 68:203
69. Adams BT, Morel JE (1993) A two-grid acceleration scheme for the multigroup SN equations
with neutron upscattering. Nucl Sci Eng 115:253
70. Morel JE, Dendy JE Jr, Wareing TA (1993) Diffusion accelerated solution of the 2-D SN
equations with bilinear-discontinuous differencing. Nucl Sci Eng 115:304
71. Lewis EE, Miller WF Jr (1984) Computational methods of neutron transport. Wiley-Interscience, New York [reprinted: American Nuclear Society, La Grange Park, IL (1993)]
72. Briesmeister JF (ed) (1993) MCNP – A General Monte Carlo N-Particle Transport Code,
Version 4A. Los Alamos National Laboratory Report, LA-12625
73. Morel JE, McGhee JM (1994) A fission-source acceleration technique for time-dependent
even-parity SN calculations. Nucl Sci Eng 116:73–85
1995–1999
74. Brown PN (1995) A linear algebraic development of diffusion synthetic acceleration for three-dimensional transport equations. SIAM J Numer Anal 32:179
75. Kelley CT (1995) Multilevel source iteration accelerators for the linear transport equation in
slab geometry. Transp Theory Stat Phys 24:697
76. Castrianni CL, Adams ML (1995) A nonlinear corner-balance spatial discretization for transport on arbitrary grids. In: Proc. ANS M&C international conference on mathematics and
computations, reactor physics, and environmental analyses, 30 April–4 May 1995, Vol. 2.
Portland, p 916
77. Morel JE, Wareing TA, Smith K (1996) A linear-discontinuous spatial differencing scheme
for SN radiative transfer calculations. J Comp Phys 128:445
78. Wareing TA, McGhee JM, Morel JE (1996) ATTILA: a three-dimensional unstructured tetrahedral mesh discrete ordinates transport code. Trans Am Nucl Soc 75:146
79. Patton BW, Holloway JP (1996) Application of Krylov subspace iterative methods to the slab
geometry transport equation. In: Proc. 1996 ANS topical meeting. Advances and applications
in radiation protection and shielding, 21–25 April 1996, Vol. 1. N. Falmouth, MA, p 384
80. Kelley CT, Xue ZQ (1996) GMRES and integral operators. SIAM J Sci Comp 17:217
81. Saad Y (1996) Iterative methods for sparse linear systems. PWS Publishing Company,
Boston, MA
82. Ramone GL, Adams ML, Nowak PF (1997) A transport synthetic acceleration method for
transport iterations. Nucl Sci Eng 125:257
83. Adams ML (1997) A subcell balance method for radiative transfer on arbitrary spatial grids.
Transp Theory Stat Phys 26(4 & 5):385
84. Adams BT, Morel JE (1997) An acceleration scheme for the multigroup SN equations with
fission and thermal upscatter. In: Proc. ANS M&C topical meeting. Joint international conference on mathematical methods and supercomputing for nuclear applications, 5–10 Oct 1997,
Vol. 1. Saratoga Springs, New York, p. 343
85. Aussourd C (1997) An adapted DSN scheme for solving the two-dimensional neutron transport equation on a structured AMR grid. In: Proc international conference on mathematical
methods and supercomputing for nuclear applications, October 5–9, 1997, Vol. 1. American
Nuclear Society, Saratoga Springs, New York, p. 41
86. Adams ML, Nowak PF (1998) Asymptotic analysis of a computational method for time- and
frequency-dependent radiative transfer. J Comp Phys 146:366
87. Azmy YY (2002) Unconditionally stable and robust adjacent-cell diffusive preconditioning
of weighted-difference particle transport methods is impossible. J Comp Phys 182:213
88. Adams ML, Wareing TA, Walters WF (1998) Characteristic methods in thick diffusive problems. Nucl Sci Eng 130:18
89. Wareing TA, Morel JE (1998) ACTI Task 1 Summary. Los Alamos National Laboratory Report LA-UR-98-4749
90. Castrianni CL, Adams ML (1998) A nonlinear corner-balance spatial discretization for transport on arbitrary grids. Nucl Sci Eng 128:278
91. Warsa JS, Prinja AK (1998) Bilinear-discontinuous numerical solution of the time dependent
transport equation in slab geometry. Ann Nucl Energy 26:195
92. Jesse JP, Fiveland WA, Howell LH, Colella P, Pember RB (1998) An adaptive mesh refinement algorithm for the radiative transport equation. J Comp Phys 139:380
93. Pautz SD, Morel JE, Adams ML (1999) An angular multigrid acceleration method for SN
equations with highly forward-peaked scattering. In: Proc international conference on mathematics and computation, reactor physics and environmental analyses in nuclear applications,
27–30 Sept 1999, Vol. 1. Madrid, Spain, pp 647–656
94. Azmy YY (1999) Iterative convergence acceleration of neutral particle transport methods via
adjacent-cell preconditioners. J Comp Phys 152:359
95. Guthrie B, Holloway JP, Patton BW (1999) GMRES as a multi-step transport sweep accelerator. Transp Theory Stat Phys 28:83
96. Thompson KG, Adams ML (1999) A spatial discretization for solving the transport equation
on unstructured grids of polyhedra. In: Proc. ANS M&C international topical conference.
Mathematics and computation, reactor physics and environmental analysis, 27–30 Sept 1999,
Vol. 2. Madrid, p 1196
97. Wareing TA, McGhee JM, Morel JE, Pautz SD (1999) Discontinuous finite element SN
methods on 3-D unstructured grids. In: Proc. ANS M&C international topical conference.
Mathematics and computation, reactor physics and environmental analysis, 27–30 Sept 1999,
Vol. 2. Madrid, p 1185
98. Wareing TA, Morel JE, McGhee JM (1999) A diffusion synthetic acceleration method for the
SN equations with discontinuous finite element space and time differencing. In: Proc. ANS
M&C international topical conference. Mathematics and computation, reactor physics and
environmental analysis, 27–30 Sept 1999. Vol. 1. Madrid, p 45
2000–2005
99. Azmy YY (2000) Acceleration of multidimensional discrete ordinates methods via adjacent-cell preconditioners. Nucl Sci Eng 136:202
100. Lathrop KD (2000) A comparison of angular difference schemes for one-dimensional spherical geometry SN equations. Nucl Sci Eng 134:239
101. Wareing TA, Morel JE, McGhee JM (2000) Coupled electron–photon transport methods on
3-D unstructured grids. Trans Am Nucl Soc 83:240–242
102. Mathews KA, Miller RL, Brennan CR (2000) Split-cell linear characteristic transport method
for unstructured tetrahedral meshes. Nucl Sci Eng 136:178
103. Zika MR, Adams ML (2000) Transport synthetic acceleration with opposing reflecting boundary conditions. Nucl Sci Eng 134:159
104. Plimpton S, Hendrickson B, Burns S, McLendon W III (2000) Parallel algorithms for radiation transport on unstructured grids. In: Proc. SuperComputing 2000, 4–10 Nov 2000. Dallas,
TX
105. Adams ML (2001) Discontinuous finite element transport solutions in thick diffusive problems. Nucl Sci Eng 137:298
106. Brennan CR, Miller RL, Mathews KA (2001) Split-cell exponential characteristic transport
method for unstructured tetrahedral meshes. Nucl Sci Eng 138:26
107. Sanchez R, Santandrea S (2001) Symmetrization of the transport operator and Lanczos’ iterations. In: Proc. international meeting on mathematical methods for nuclear applications, 9–13
Sept 2001. American Nuclear Society, Salt Lake City, Utah
108. Sanchez R, Chetaine A (2001) A synthetic acceleration for a two-dimensional characteristic
method in unstructured meshes. Nucl Sci Eng 136:122
109. Wareing TA, McGhee JM, Morel JE, Pautz SD (2001) Discontinuous finite element SN methods on three-dimensional unstructured grids. Nucl Sci Eng 138:256
110. Adams ML, Larsen EW (2002) Fast iterative methods for discrete-ordinates particle transport
calculations. Prog Nucl Energy 40:3 (review)
111. Baker RS (2002) A block adaptive mesh refinement algorithm for the neutral particle transport
equation. Nucl Sci Eng 141:1
112. Liscum-Powell JL, Prinja AK, Morel JE, Lorence LJ Jr (2002) Finite element solution of
the self-adjoint angular flux equation for coupled electron–photon transport. Nucl Sci Eng
142:270
113. Pautz SD, Adams ML (2002) An asymptotic study of discretized transport equations in the
Fokker–Planck limit. Nucl Sci Eng 140:51
114. Pautz SD (2002) An algorithm for parallel SN sweeps on unstructured meshes. Nucl Sci Eng
140:111
115. Patton BW, Holloway JP (2002) Application of preconditioned GMRES to the numerical
solution of the neutron transport equation. Ann Nucl Energy 29:109
116. Warsa JS, Wareing TA, Morel JE (2002) Fully consistent diffusion synthetic acceleration of
linear discontinuous SN transport discretizations on unstructured tetrahedral meshes. Nucl Sci
Eng 141:236
117. Aussourd C (2003) Styx: a multidimensional AMR SN scheme. Nucl Sci Eng 143:281
118. Haghighat A, Wagner JC (2003) Monte Carlo variance reduction with deterministic importance functions. Prog Nucl Energy 42:25 (review)
119. Morel JE, Wareing TA, Lowrie RB, Parsons DK (2003) Analysis of ray-effect mitigation
techniques. Nucl Sci Eng 144:1
120. Morel JE, Prinja AK, McGhee JM, Wareing TA, Franke BC (2003) A discretization scheme
for the 3-D continuous-scattering operator. Trans Am Nucl Soc 89:360
121. Hanshaw HL, Larsen EW (2003) The explicit slope SN discretization method. In: Proc. topical
meeting on mathematics and computations in nuclear applications, 6–10 April 2003. American Nuclear Society, Gatlinburg, Tennessee
122. Hanshaw HL, Nowak PF, Larsen EW (2003) Stretched and filtered transport-synthetic acceleration of SN problems: part 1, homogeneous media. Trans Am Nucl Soc 89:354
123. Shashkov M, Wendroff B (2004) The repair paradigm and application to conservation laws. J
Comp Phys 198:265
124. Warsa JS, Wareing TA, Morel JE (2004) Krylov iterative methods and the degraded effectiveness of diffusion synthetic acceleration for multidimensional SN calculations in problems
with material discontinuities. Nucl Sci Eng 147:218
125. Sanchez R, McCormick NJ (2004) Discrete ordinates solutions for highly forward peaked
scattering. Nucl Sci Eng 147:249
126. Warsa JS, Wareing TA, Morel JE, McGhee JM, Lehoucq RB (2004) Krylov subspace iterations for deterministic k-eigenvalue calculations. Nucl Sci Eng 147:26
Professor Edward Larsen was born in New
York City in 1944. He attended Rensselaer
Polytechnic Institute, graduating with a
Ph.D. in Mathematics in 1971. His first
position after graduate school was an assistant professorship in the Department of
Mathematics at New York University. At
the Courant Institute of Mathematical Sciences at NYU, he started a research program
on neutron transport theory and made significant contributions to the applications of
asymptotic expansions to this subject. In
1977, after 5 years at NYU and 1 year
at the University of Delaware, he accepted
a staff position in the Transport Theory
Group at Los Alamos National Laboratory, where he investigated the numerical
methods that underlie the discretization and
solution strategies of the group’s large-scale
deterministic neutron transport codes. During his 9 years at Los Alamos, Larsen’s
research focus underwent a fundamental shift from analytical problems of mainly
theoretical interest to computational problems of practical interest. At Los Alamos
he made significant contributions to the development of methods for accelerating
the iterative convergence of transport calculations, and to theoretically predicting
the accuracy of numerical solutions in regions of large optical thickness. Several of
these techniques are now commonplace in the computational transport community.
In 1986, Larsen accepted a professorship in the Department of Nuclear Engineering at the University of Michigan. At Michigan, he has expanded his research in
the development of advanced methods for numerically simulating radiation interactions with matter. His recent work includes photon and electron transport problems
in medical physics, and merging deterministic and Monte Carlo methodologies.
In 1988 he was made a Fellow of the American Nuclear Society; in 1994 he was
awarded the US Department of Energy’s E.O. Lawrence Award for his innovative
contributions to nuclear technology; in 1996 he won the Arthur Holly Compton
Award of the American Nuclear Society in recognition of his outstanding contributions to education in the nuclear field; and in 1999 he won the Special Recognition
Award from the Mathematics and Computations Division of the American Nuclear
Society. Larsen is the author of over 150 scholarly journal articles on the mathematical analysis and computer simulation of radiation transport, and he has graduated
28 Ph.D. students who have worked in this technical area.
Professor Jim E. Morel received a B.S.
in mathematics from Louisiana State
University in 1972, an M.S. in nuclear engineering from Louisiana State University
in 1974, and a Ph.D. in nuclear engineering
from the University of New Mexico in 1979.
He began his career in 1974 as a nuclear
research officer at the Air Force Weapons
Laboratory. In 1976 he became a staff scientist at Sandia National Laboratories. In 1984
he became a staff member at Los Alamos
National Laboratory, eventually serving as a
group leader and scientific advisor. In 2005
he accepted a professorship in the Department of Nuclear Engineering at Texas A&M
University. He has published over 140 refereed articles in journals and conference
proceedings in the areas of neutron transport, coupled electron–photon transport,
thermal radiation transport, and radiation-hydrodynamics.
Chapter 2
Second-Order Neutron Transport Methods
E.E. Lewis
2.1 Introduction
Among the approaches to obtaining numerical solutions for neutral particle transport
problems, those classified as second-order or even-parity methods have found increased use in recent decades. First-order and second-order methods differ in a
number of respects. Following discretization of the energy variable, invariably
through some form of the multigroup approximation, the time-independent forms
of both are differential in the spatial variable and integral in angle. They differ in
that the more conventional first-order equation includes only first derivatives in the
spatial variables, but requires solution over the entire angular domain. Conversely,
the second-order form includes second derivatives but requires solution over one
half of the angular domain. The two forms in turn lead to contrasting approaches
to reducing the differential–integral equations to sets of linear equations and in the
formulation of iterative methods suitable for the numerical solution of large engineering design problems. In what follows, we explore the state of methods used
to solve the second-order transport equation, comparing them, where possible, to
first-order methods.
Historically, diffusion theory was the earliest and remains the most widely employed second-order computational method in reactor core design. Fine mesh spatial
discretization has been performed through a variety of finite difference [1] and finite element techniques [2–4]. Along with the continuing advance of computing
power, two events greatly influenced the development of methods that are emphasized in this work. The first was the translation in 1962 of Vladimirov’s variational
formulation of the second-order form of the transport equation [5]. There followed
increased interest in variational formulations and the systematic exploration of alternate variational principles and their interconnectedness [6]. The second was the
rise of finite element methods, first in computational mechanics but then spreading to other fields [7–9]. The power of finite element methods largely supplanted
E.E. Lewis
Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
e-mail: e-lewis@northwestern.edu
earlier finite difference techniques not only in computational mechanics, but also in
a number of related fields as well. Since finite element methods in computational
mechanics were closely associated with variational principles, the method’s application to transport problems was greatly facilitated by the variational framework
that had been developed for the Boltzmann equation.
The first applications of finite element methods to neutronics problems appeared
in the early 1970s, and not surprisingly to the diffusion approximation [2–4]. But
soon thereafter finite element methods appeared coupled with spherical harmonics
[10–14], discrete ordinates [14–20], and also with finite elements in angle [22, 23].
Through the 1970s and into the 1980s, however, second-order transport methods
were largely limited to stand-alone codes applied to model problems. At that time,
computer memory restrictions continued to favor very effective marching algorithms with which first-order discrete ordinates methods were able to solve large
engineering problems with very limited memory [24, 25]. It has only been with
the rapid expansion of computer memories since that time that second-order methods have become a viable alternative for engineering calculation, and that production codes based on these methods have emerged for large-scale engineering computations [11–14, 17–21, 26, 27].
To present the current state of even-parity computational methods, we begin in
Section 2.2 by deriving the second-order form of the linear Boltzmann equation
from its first-order counterpart. We then examine the weak and variational forms
of these equations that are used almost without exception as the points of departure
for numerical approximation. For brevity and clarity, isotropic scattering is assumed.
Section 2.3 reduces these forms to the diffusion approximation and then discretizes
the spatial variables. The simplicity of diffusion theory provides a convenient vehicle for introducing finite element methods, and for establishing notation that then
carries over for use in conjunction with more accurate angular approximations.
Following the inclusion of anisotropic scattering into the even-parity formulation,
Section 2.4 treats angular approximations: spherical harmonics, discrete ordinates as
well as their simplified forms. The section then combines these angular approximations with finite elements in space to obtain fully discretized forms of the even-parity
equation. Section 2.5 examines two variants of even-parity approaches: a hybrid or
nodal method and an integral even-parity method. Section 2.6 then offers concluding
remarks.
2.2 The Transport Equation
The numerical algorithms of the following sections are based primarily on the variational form of the even-parity transport equation. We begin, however, with the
standard first-order form of the equation and proceed to arrive at the variational
principle. Along the way, we also present both the mixed and the second-order
even- and odd-parity equations. Before proceeding to the variational formulation,
we examine briefly the weak form of the even-parity equation and of the mixed
equations in their primal and dual weak forms. The even-parity weak form may be
used interchangeably with the variational formulation. Mixed methods are less well
developed but are included because of the future potential that they may hold.
In what follows, we assume the time-independent form of the neutron transport
equation in which a multigroup discretization of the energy variable has already
been performed. Thus, we need to deal only with the energy-independent or within-group form of the equation. For brevity, we also assume isotropic scattering in this
section, reserving the introduction of anisotropic scattering to Section 2.4.
2.2.1 First- and Second-Order Forms
The first-order form of the transport equation is [28]

$\hat{\Omega} \cdot \vec{\nabla} \psi(\vec{r}, \hat{\Omega}) + \sigma(\vec{r}) \psi(\vec{r}, \hat{\Omega}) = \sigma_s(\vec{r}) \int d\Omega' \, \psi(\vec{r}, \hat{\Omega}') + s(\vec{r}), \quad \vec{r} \in V,$   (2.1)

where $\vec{r}$ is the spatial location and $\hat{\Omega}$ is the direction of neutron travel; $\sigma$ and $\sigma_s$ are the macroscopic total and scattering cross sections, respectively; $s$ is the group source; and we normalize the angular integrals such that $\int d\Omega = 1$. Integrating over angle yields the neutron conservation equation

$\vec{\nabla} \cdot \vec{J}(\vec{r}) + \sigma_r(\vec{r}) \varphi(\vec{r}) = s(\vec{r}),$   (2.2)

where

$\varphi(\vec{r}) = \int d\Omega \, \psi(\vec{r}, \hat{\Omega})$   (2.3)

and

$\vec{J}(\vec{r}) = \int d\Omega \, \hat{\Omega} \, \psi(\vec{r}, \hat{\Omega})$   (2.4)

are the scalar flux and current, respectively, and

$\sigma_r = \sigma - \sigma_s.$   (2.5)

Defining the even- and odd-parity flux components as

$\psi^{\pm}(\vec{r}, \hat{\Omega}) = \frac{1}{2} \left[ \psi(\vec{r}, \hat{\Omega}) \pm \psi(\vec{r}, -\hat{\Omega}) \right],$   (2.6)

we may evaluate Eq. (2.1) at $\hat{\Omega}$ and $-\hat{\Omega}$ and add and subtract the results to obtain a set of coupled first-order equations

$\hat{\Omega} \cdot \vec{\nabla} \psi^- + \sigma \psi^+ = \sigma_s \varphi + s$   (2.7)
and

$\hat{\Omega} \cdot \vec{\nabla} \psi^+ + \sigma \psi^- = 0.$   (2.8)
A pair of second-order equations may then be obtained. Eliminating $\psi^-$ between Eqs. (2.7) and (2.8) yields the even-parity equation

$-\hat{\Omega} \cdot \vec{\nabla} \, \frac{1}{\sigma} \, \hat{\Omega} \cdot \vec{\nabla} \psi^+ + \sigma \psi^+ = \sigma_s \varphi + s,$   (2.9)

while eliminating $\psi^+$ yields the odd-parity equation

$-\hat{\Omega} \cdot \vec{\nabla} \, \frac{1}{\sigma} \, \hat{\Omega} \cdot \vec{\nabla} \psi^- + \sigma \psi^- = \hat{\Omega} \cdot \vec{\nabla} \, (\sigma - \sigma_s)^{-1} \left[ (\sigma_s / \sigma) \, \vec{\nabla} \cdot \vec{J} - s \right].$   (2.10)

(The eliminations are straightforward: Eq. (2.8) gives $\psi^- = -\sigma^{-1} \hat{\Omega} \cdot \vec{\nabla} \psi^+$, which substituted into Eq. (2.7) yields Eq. (2.9); Eq. (2.10) follows analogously, with $\varphi$ expressed through the conservation equation (2.2) as $\varphi = (\sigma - \sigma_s)^{-1}(s - \vec{\nabla} \cdot \vec{J})$.)
These two equations contain only $\psi^+$ and $\psi^-$, respectively, since the scalar flux and current may be expressed as

$\varphi = \int d\Omega \, \psi^+$   (2.11)

and

$\vec{J} = \int d\Omega \, \hat{\Omega} \, \psi^-.$   (2.12)
Finally, we may also write a second-order equation for the angular flux by first
*
O r s s/ and substituting into Eq. (2.1)
solving Eq. (2.1) for D 1 .
to obtain
*
*
O r 1 O r
C
*
O r 1 /.s C s/:
D .1 (2.13)
2.2.2 Weak Forms
The starting point for most space–angle approximations is the weak form of the forgoing equations. To obtain the weak forms of the coupled equations (2.7) and (2.8)
we multiply the two by even- and odd-parity test functions, Q C and Q , respectively, and integrate over angle and space:
Z
Z
dV
Z
Z
dV
*
O r
d Q C C
C
*
O r
d Q C
C
s ' s D 0;
D 0:
(2.14)
(2.15)
Applying the divergence theorem to the first term of the two equations yield
Z
Z
dV
h
i
*
O r Q C / C Q C C Q .s C s/
d .
Z
Z
O nO Q C D 0
C d d
(2.16)
2
Second-Order Neutron Transport Methods
and
Z
Z
dV
h
*
O r Q /
d .
C
89
C Q
i
Z
C
Z
d
O nO Q d
C
D 0:
(2.17)
where d indicates integration over the surface of the spatial domain, V , and nO is
the outward normal from d.
Two classes of methods arise from these equations [8, 19]. Equation (2.16) is the
basis for primal methods, and Eqs. (2.16) and (2.15) taken together constitute the
mixed-primal formulation. Alternately, Eq. (2.8) may be used to eliminate from
the volume integral to obtain the second-order even-parity primal formulation:
Z
Z
h
*
*
O r Q C /.
O r C/ C Q C
d 1 .
Z
Z
O nO Q C D 0
C d d
dV
C
Q .s C s/
i
(2.18)
This even-parity result may also be obtained directly by multiplying Eq. (2.9) by
Q C , integrating over space and angle, and then applying the divergence theorem.
Equation (2.17) is the basis for dual methods. Taken together, Eqs. (2.17) and
(2.14) constitute the mixed dual formulation. Alternately, Eq. (2.7) may be used to
eliminate C from the volume integral in Eq. (2.17) to form the second-order oddparity dual formulation:
Z
Z
*
*
O r Q /.
O r
d 1 .
/C Q Z
i Z
* *
* *
Q h
1
O nO Q . s / .r J / .s = / r J s C d d
dV
C
D0
(2.19)
Multiplying Eq. (2.10) by Q , integrating over space and angle and then applying
the divergence theorem also leads to this result.
The surface terms in the above equations determine the form of the boundary
conditions [9, 28]. Setting them equal to zero yields the homogeneous form of the
*
natural boundary conditions. Thus D 0, and C D 0 for r 2 are the natural
boundary conditions, respectively, in the primal and dual formulations. Inhomogeneous natural conditions may be obtained by setting D , in the primal and
C
D C , in the dual formulation; hereafter, we employ the subscript to indicate
a known function on the boundary. Conversely, if the boundary conditions specify
C
in the primal or in the dual formulations, they are essential; the surface
integral is deleted, and they are imposed directly on the solution.
Classically, it is the incoming angular flux that is known on the boundary. Hence
C
O C
. r ; /
*
O . r ; /
*
*
O D 0; *
O < 0;
r 2 ; nO . r ; /
(2.20)
90
E.E. Lewis
*
O D 0, *
O < 0 is the
and in particular the vacuum boundary . r ; /
r 2 , nO homogeneous form of this condition. These conditions may be treated as modified
natural conditions by first forming weighted residuals
Z
2
Z
d
O nO Q ˙ .
d
C
C
/
D0
(2.21)
n
O O <0
for primal and dual formulations, respectively. By making use of angular parity
arguments, it can be written
Z
Z
Z
Z
Z
O nO Q ˙ D d dj
O nj
d d
O Q ˙ ˙ C 2 d
Z
O nO Q ˙ :
d
(2.22)
n
O O <0
By employing this expression to eliminate the surface integral on the left of
Eq. (2.22) from the primal or dual formulations in Eq. (2.16) or (2.17), modified
natural boundary conditions may be obtained for a known incoming flux distribu*
O 0 / D .*
O
r ; /,
tion. Reflected boundary conditions, which have the form . r ; 0
O
O
where and are the angles of incidence and reflection are essential since they
apply directly to , with no contribution from its derivatives.
Until quite recently, the vast majority of second-order transport work has applied
to the even-parity form of the equation given by Eq. (2.9), whose weak form is given
by Eq. (2.18). Thus, we focus our attention on it. With the incoming flux boundary
conditions, Eq. (2.18) becomes
Z
Z
i
h
*
*
O r Q C /.
O r C / C Q C C Q .s C s/
dV
d 1 .
Z
Z
Z
Z
O nj
O nO Q C D 0;
C d dj
O Q C C C 2 d
d
n
O O <0
(2.23)
where for vacuum boundary conditions we set D 0. For most computations,
the problem domain is bounded by some combinations of reflected and vacuum
boundaries. We indicate this by dividing the boundary as D r C . Then, instead
of Eq. (2.23), we have
Z
Z
h
i
*
*
O r Q C /.
O r C / C Q C C Q .s C s/
dV
d 1 .
Z
Z
O nj
C d dj
O Q C C D 0;
(2.24)
since no surface integral appears for r , where the essential reflected conditions are
applied.
2
Second-Order Neutron Transport Methods
91
2.2.3 Variational Formulation
Both coupled and second-order forms of the primal and dual formulations may
also be formulated as variational principles, and surface terms are added to impose modified natural boundary conditions. Here, we restrict attention to the primal
second-order form, since it was the original variational principal of Vladimirov [5]
and is that upon which virtually all second-order methods are based.
The variational principle corresponding to Eq. (2.24) derives from the requirement that the following functional be stationary with respect to variations in C :
FŒ
C
Z
D
Z
dV
2
*
O r C C
d 1 Z
Z
O nj
C d dj
O C2 :
C2
s
2
2 s
(2.25)
To illustrate, suppose we make the substitution C ! C C ı Q C where ı is a very
small number, and Q C is an arbitrary deviation about the true solution C . We then
have
Z
Z
h
i
*
C
C
O r C /2 C C2 s 2 2 s
FŒ
Cı
D dV
d 1 .
(Z
Z
Z
Z
h
*
*
C2
O nj
O r Q C /.
O r C/
O
C 2ı
C d dj
dV
d 1 .
C Q C .
1
C
Z
s s/ C
Z
d
)
O nj
dj
O QC
C
(Z
Cı
2
Z
dV
d
)
Z
Z
2
*
C
C2
2
C2
O
O
Q
Q
Q
Q
: (2.26)
r
C d dj nj
C
s
O
For the functional to be stationary, the linear term in ı must vanish. This term, however, is identical to Eq. (2.24) derived from the weak form. Applying the divergence
theorem to Eq. (2.24), or to the linear term in ı in Eq. (2.26), yields:
Z
Z
*
*
O r
O r 1 dV
d Q C Z
Z
*
O nO Q C 1 O r
C d d
C C s s
Z
Z
C
O nj
C d dj
O QC
C
C
D 0: (2.27)
For this expression to hold, both the volume and surface integrals must vanish.
Since Q C is an arbitrary function, the expression in parentheses must vanish.
Thus, the even-parity transport equation given by Eq. (2.9) is satisfied. This is the
92
E.E. Lewis
Euler–Lagrange equation. On the vacuum boundary the two surface integral terms
may be rewritten as
Z
Z
d
r
*
O r
O nO Q C 1 d
Z
C2
Z
C
*
O r
O nO Q C . 1 d
d
C
C
/:
(2.28)
O n<0
O
*
O r C and
D C
Since from Eqs. (2.8) and (2.6) we have D 1 *
O r C , the first term in Eq. (2.28) yields the natural boundary condition
1 D 0. However, on reflected boundary condition we take Q C D 0 on r and
impose the essential boundary condition on the solution. From the second term, the
O < 0 on ) is
modified natural boundary condition on the vacuum ( D 0, for nO satisfied.
2.3 The Discretized Diffusion Equation
The diffusion approximation may be viewed as arising naturally as the lowestorder angular approximation to the even-parity equation. Indeed a straightforward
exercise is to eliminate the angular variables from the weak or variational forms discussed in Section 2.2 by using the diffusion approximation. Here, a simple statement
of the diffusion equation in variational form serves as a vehicle for introducing the
finite element method. Finite element notation established for use in conjunction
with the diffusion equation then allows for a more straightforward discretization
of the spatial variable in conjunction with the higher order angular approximations
treated in Section 2.4.
2.3.1 The Diffusion Formulation
The diffusion approximation may be obtained by making the approximations
R
*
C
O J and then operating on Eqs. (2.7) and (2.8) with d
, 3
R
O respectively, to obtain
and d,
*
and
*
r J C r D s
(2.29)
*
*
1= r
C J D 0:
3
(2.30)
2
Second-Order Neutron Transport Methods
93
*
The second-order form of the diffusion equation is obtained by eliminating J
between Eqs. (2.29) and (2.30):
*
*
r.3/1 r C . s / D s:
(2.31)
Alternately, we may obtain this equation directly from the even-parity transport
equation, or from its variational form. Let
C
:
(2.32)
R
Then operating on Eq. (2.9) with d yields Eq. (2.31). The approximation
C
likewise reduces the even-parity functional F ΠC , given by Eq. (2.25),
to its diffusion form
Z
Z
* 2
1
2
1
1
r
C r 2 s C =2 d 2 :
(2.33)
F ΠD dV =3
The diffusion equation can be shown to be the Euler–Lagrange equation by letting
! C ı Q , where Q is an arbitrary deviation from the solution and requiring
the linear term in ı to vanish:
Z
Z
i
h
*
*
(2.34)
dV 1=3 1 .r Q /.r / C r Q Q s C 1=2 d Q D 0:
This is identical to the diffusion approximation’s weak form obtainable by inserting Eq. (2.32) into Eq. (2.27). If we apply the divergence theorem to Eq. (2.34), we
obtain
Z
Z
*
*
*
1
1
Q
dV
r =3 r C r s C d Q 1=3 1 nO r
Z
C
*
d Q 1=3 1 nO r C 1=2
r
D 0:
(2.35)
For arbitrary Q , the quantity in parentheses in the first term must vanish, yielding
Eq. (2.31) as the Euler–Lagrange equation. The surface integral over r yields the
*
reflected or zero current condition nO r D 0 as the natural boundary condition,
and in diffusion theory this corresponds to the reflected boundary condition. (Recall, however, that for the transport equation more generally reflective boundary
conditions are essential.) Requiring the quantity in parentheses in the last term to
vanish yields the modified natural boundary condition; it approximates the vacuum
condition extrapolating the flux to zero at a distance of 2=3 1 outside of .
94
E.E. Lewis
2.3.2 Finite Element Discretization
We discretize the spatial variable employing the finite element method. For clarity,
we illustrate using simple triangular elements in two-dimensional Cartesian geometry [7]. We first divide the problem domain into elemental volumes Ve :
V D
X
Ve :
(2.36)
e
In two dimensions, these may appear, for example, as the triangles in Fig. 2.1.
*
Within each element, we approximate . r / as the scalar product of a vector of
*
polynomial trial functions, ne . r /, and unknown coefficient vector, Ÿe :
*
*
. r / nTe . r /Ÿe ;
*
r 2 Ve :
(2.37)
Representative linear and quadratic trial functions for triangular elements, for
*
which the length of ne . r / is 3 for the linear element and 6 for the quadratic element,
Linear elements
Quadratic elements
Fig. 2.1 Triangular finite element grids
1
1
1
0
0
0
0
ne(r) linear
0
0
0
0
0
Fig. 2.2 Triangular finite element trial functions
ne(r) quadratic
0
2
Second-Order Neutron Transport Methods
95
respectively, are shown in Fig. 2.2. (These and other classes of hfinite elements
are
i
*
discussed in detail elsewhere [7]). They have the properties that ne . r i / D ıij so
j
*
that the components of Ÿe are equal to at r i , i D 1; : : : , 3 or 6, the mesh points
shown in Fig 2.2. Moreover, if the same sets of trial functions are used in adjoining
elements, the flux approximation will be continuous across the interfaces.
Provided continuity is maintained, the functional given by Eq. (2.33) may be
written as the sum of elemental contributions.
X
X
ŸTe Ae Ÿe 2
ŸTe qe ;
(2.38)
F ŒŸe D
e
e
where the elemental matrices involve integrals of the known trial functions,
Ae D 1=3e1
Z
Z
Z
* * dV rne rnTe C re dV ne nTe C 1=2 dne nTe ;
e
e
(2.39)
ve
with the last term present only where one or more of the element surfaces lie along
a vacuum boundary; the source term is given by
Z
qe D
dV ne s:
(2.40)
e
We may formally assemble the global functional from its elemental contributions as
follows. The continuity of the trial functions across element interfaces allows the elemental unknowns to be mapped on to a globally numbered vector of N unknowns:
Ÿe D „© Ÿ:
(2.41)
where „e is 3 N or 6 N Boolean transformation matrices. Inserting this equation
into Eq. (2.38), we obtain the algebraic functional
F ŒŸ D ŸT AŸ 2ŸT q;
where
AD
X
„Te Ae „e ;
(2.42)
(2.43)
e
and
qD
X
„Te qe :
(2.44)
e
We obtain a set of linear equations for Ÿ by requiring the reduced functional
Q Hence
F ŒŸ to be stationary with respect to variations: Ÿ ! Ÿ C ı Ÿ.
i
h
Q
F Ÿ C ı ŸQ T D .ŸT AŸ 2ŸT q/ C 2ı ŸQ T .AŸ q/ C ı 2 ŸQ T AŸ;
(2.45)
96
E.E. Lewis
and since the linear terms in ı must vanish we have
AŸ D q:
(2.46)
Here, A is a symmetric matrix. Provided a consistent numbering scheme has been
applied to the mesh points the matrix will also be banded. Solution may be obtained
using any number of direct methods for smaller problems, or iterative techniques
such as preconditioned conjugate gradients for large problems [29–31].
2.4 The Discretized Transport Equation
In this section, we first write Eq. (2.1) with anisotropic scattering included, and then
derive the corresponding form of the even-parity equation as well as stating the variational principle that provides the starting point for numerical approximation. We
then derive spherical harmonic and discrete ordinates angular approximations (referred to frequently as Pn and Sn methods, respectively) along with their simplified
spherical harmonic and discrete ordinates (SPn and SSn) counterparts. For each approximation, the result is a coupled set of second-order partial differential equations.
The sets differ only in the structure of their coefficient matrices. Thus, we can apply
the finite element methods discussed in Section 2.3 to discretize the spatial variables
for all of the approximations simultaneously.
2.4.1 Anisotropic Scattering
With anisotropic scattering included, the form of the second-order equation becomes
more complex [15, 17, 32]. To obtain it, we begin with the first-order form of the
equation:
*
O r
O C *
r;
r
*
Z
*
O
O O0
r ; D d0 s r ; *
O 0 Cs *
O ;
r;
r;
*
(2.47)
0
O
O
where s r ; is the macroscopic differential scattering cross section. In the
multigroup approximation the group source is given by
*
X Z
*
O
O O0
d0 sgg0 r ; s r; D
*
g0
O 0 C Sfg *
r;
r
*
(2.48)
g 0 ¤g
where g designates the group under consideration, sgg0 represents scattering from
group g 0 to g, and Sfg includes the isotropic fission and external sources.
2
Second-Order Neutron Transport Methods
97
O and O and add and subtract the results to
We may evaluate Eq. (2.47) at obtain a pair of coupled first-order equations
*
O r
where
and
O C *
r;
r
*
˙
Z
O D d0 s˙ *
O O0
r;
r;
*
O ;
Cs ˙ r ; *
˙
O0
r;
*
(2.49)
h i
*
O O 0 / D 1= s *
O O 0 ˙ s *
O O0
s˙ . r ; r ;
r ; 2
(2.50)
h i
*
O D 1= s *
O ˙s *
O :
s˙ r ; r;
r ; 2
(2.51)
Here, s˙ and s ˙ indicate the even- and odd-parity components of the anisotropic
scattering cross section and source, respectively. We next express the anisotropic
scattering cross sections as expansions in an orthonormal set of spherical harmonics
[28, 33]
O D Clm P m ./
Ylm ./
l
cos.m!/
;
sin.m!/
l D 0; 1; 2; : : : ;
jmj D 0; 1; 2; : : : ; l (2.52)
where Plm ./ are the associated Legendre polynomials, and the Clm are chosen
R
O l 0 m0 ./
O D ıll0 ımm0 ; we follow the convention that m 0
such that dYlm ./Y
*
O O 0 / as
signifies the cosine series and m < 0 the sine series. We first write s˙ . r ; expansions in Legendre polynomials
O O 0/ D
s˙ . r ; *
X
l
*
O O0 :
l r Pl (2.53)
even
odd
The Legendre addition theorem [33]
O O 0/ D
Pl .
l
X
O lm .
O 0/
Ylm ./Y
(2.54)
mDl
then allows Eq. (2.53) to be written in the convenient vector form
*
O O 0 / D yT˙ ./†
O ˙ .*
O 0 /;
r /y˙ .
s˙ . r ; (2.55)
where we have formed vectors of the even- and odd-parity spherical harmonics
O D ŒY00 ; Y22 ; Y21 ; Y20 ; Y21 ; Y22 ; Y44 ; ;
yTC ./
O D ŒY11 ; Y10 ; Y11 ; Y33 ; Y32 ; Y31 ; Y30 ; Y31 ;
yT ./
(2.56)
(2.57)
98
E.E. Lewis
R
R
O T ./
O D I and dy˙./y
O T ./
O D 0. The diagonal matrices
with dy˙ ./y
˙
are defined by
† C D diagŒ0 ; 2 ; 2 ; 2 ; 2 ; 2 ; 4 ; ;
(2.58)
† D diagŒ1 ; 1 ; 1 ; 3 ; 3 ; 3 ; 3 ; :
(2.59)
The coupled pair of even- and odd-parity equations given by Eq. (2.49) then
becomes
*
O C . r / C . r ; /
O
. r ; /
Z
O C .*
O 0/
r / d0 yC .
D yTC ./†
O r
*
*
*
C
O 0 / C s C . r ; /;
O
.r ; *
*
(2.60)
and
*
*
*
O C .*
O
. r ; /
r / . r ; /
Z
O .*
O 0 / .*
O 0 / C s .*
O
r / d0 y .
r ;
r ; /:
D yT ./†
O r
C
(2.61)
To obtain the even-parity equation, we employ Eq. (2.61) to obtain the odd-parity
moments in terms of the even-parity moments:
Z
Z
*
O ./
O D 1 s 1 r dy
O ./
O C ./;
O
ŒI 1 † dy ./
(2.62)
where we define the even- and odd-parity group-source moments as
Z
˙ *
O ˙ .*
O
r ; /:
s . r / D dy˙ ./s
(2.63)
Using Eq. (2.62) to eliminate the integral of on the right side of Eq. (2.61), we
may solve for in terms of C and the group-source terms:
*
O r
O D 1 ./
O C 1 s ./
O
./
Z
*
O Q
O 0 /
O0r
d0 y .
2 yT
./† C
where
C
Q s ; (2.64)
O 0 / C 2 yT ./
O †
.
Q ˙ D † ˙ ŒI 1 † ˙
†
1
:
(2.65)
Finally, substituting Eq. (2.64) into Eq. (2.60) we obtain the second-order evenparity equation;
*
*
O r
O r 1 *
C
O r 1 yT
C
Z
O 0 / C .
O 0 / C sC
d0 yC .
D yT
†
C
C
Z
*
Q d0 y .
Q /s ; (2.66)
O 0 /
O 0 r C .
O 0 / .I C 1 †
1 †
C
C
2
Second-Order Neutron Transport Methods
99
where we have also expanded the group source in harmonic moments
sCD
X
Z
† Cgg0
O
dyC./
C O
g 0 ./
C sfg ;
(2.67)
g 0 ¤g
sD
X
1
†gg0 g1
0 I g 0 † g 0 g 0
*
1
s
g0 r Z
O ./
O
dy
C O
g 0 ./
: (2.68)
g 0 ¤g
Comparing Eq. (2.9) to Eq. (2.66), it is apparent that the second-order forms of the
equation appear to be considerably more complex than the first-order form when
anisotropic scattering is included.
The weak form of Eq. (2.66) may be obtained by integrating over angle and
volume and then applying the divergence theorem:
(Z
Z
h
*
*
O r Q C /.
O r
d 1 .
dV
Z
dyC Q C
Z
T
C
/ C QC
Z
†C
*
O r QC
dy dyC
T
Q
†
C
C
i
Z
dyC Q
Z
*
O r
dy C
)
Z
T
*
C
2
1
Q /s
O r Q
dy .I C †
Z
Z
C
d
2
O nO Q C
d C
T
sC
D 0;
(2.69)
where is given by Eq. (2.64).
Alternately, a variational principle may be written, which yields Eq. (2.66) as the
Euler–Lagrange equation. Variational formulation of the within-group problem is
FŒ
C
Z
D
(Z
dV
h
*
O r
d 1 .
Z
dyC
Z
C 2
T
C
*
i
Z
†C
O r
dy C2
/ C
dyC
T
Q
†
C
Z
dyC
Z
*
O r
dy C
)
Z
T
*
C
2
1
Q /s :
O r
dy 2 .I C †
C
The natural boundary condition remains
Eq. (2.64).
2
D 0, but where
C
C
T
2sC
(2.70)
is defined by
100
E.E. Lewis
2.4.2 Angular Approximations
We next use Eq. (2.70) as the point of departure for deriving spherical harmonics
and discrete ordinates approximations.
2.4.2.1 Spherical Harmonics Expansions
The spherical harmonics approximations may be expressed as the scalar product of
O is given by Eq. (2.56)
a row and a column vector, where yC ./
C
*
O yTC ./§
O C .*
. r ; /
r /:
(2.71)
O we then have the space-dependent coeffiFrom the orthonormality of the yC ./,
cients to be
Z
*
O C .*
O
§ C . r / D dyC./
r ; /:
(2.72)
Likewise, we may write
C
Z
*
Uk 0 rk 0 § . r / D
Z
where
Uk D
*
O O r
dy ./
C
*
O
. r ; /;
O kO y ./y
O TC ./:
O
d
(2.73)
(2.74)
and hereafter repeated k or k 0 in the same term indicates summation over the spatial
directions.
We next substitute Eq. (2.71) into the functional given by Eq. (2.70) to obtain:
F Τ C D
Z
n
Q /.Uk 0 rk 0 § C /
dV .Uk rk § C /T 1 .I C 1 †
o
Q /s ;
C§ CT .I 1 † C /§ C 2§ CT sC 2.rk § C /T 2 .I C 1 †
(2.75)
O to obtain the identity
where we have used the orthonormal properties of yC ./
Z
O kO O kO 0 yC ./y
O TC ./
O
(2.76)
UTk Uk 0 D d
and thus simplify the streaming term in the reduced functional. Requiring the func*
tional to be stationary with respect to variations in § C . r / yields the spherical
harmonics approximation as the Euler–Lagrange equation:
Q Uk 0 rk 0 § C C .I † C /§ C
rk UTk 1 ŒI C 1 †
Q /s :
D sC rk UT 1 .I C 1 †
k
(2.77)
2
Second-Order Neutron Transport Methods
101
2.4.2.2 Discrete Ordinates Approximations
The discrete ordinates equation may be obtained from the Functional, Eq. (2.70), by
O n and replacing integrals over
evaluating all the quantities at discrete directions angle with an appropriate quadrature approximation
Z
X
*
*
O O n /;
d f . r ; /
wn f . r ; (2.78)
n
O n are the discrete ordinates directions. The evenwhere wn are the weights and *
O n /. Thus,
parity flux is then evaluated in the discrete ordinates directions by C . r ; to reduce the functional we create a vector of the even-parity flux approximated in
the discrete ordinates directions:
h
i
T *
*
C * O
O2 ; C *
O 3 ; ; C *
O n ; :
§C . r / r ; 1 ; C r ; r;
r;
(2.79)
To apply the quadrature rule of Eq. (2.78) to the functional, we also define two
diagonal matrices
h
i
O O O ;
O O 1 k;
O 2 k;
O 3 k;
O n k;
k D diag (2.80)
and
W D diagŒw1 ; w2 ; w3 ; ; wn ; :
(2.81)
O column vectors evaluated in the
We also employ matrices made up of the y˙ ./
ordinates directions:
h
i
O 1 /; y˙ .
O 2 /; y˙ .
O 3 /; ; y˙ .
O n /; :
(2.82)
Y ˙ D y ˙ .
Utilizing the angular quadrature approximation and Eqs. (2.79) through (2.82), the
Functional reduces to the form
Z
n
C
Q Y W .k 0 rk 0 § C /
F Œ§ D dV .k rk § C /T 1 W I C 1 YT †
C§ CT W I 1 YTC † C YC W § C 2§ CT WYTC sC
o
Q /s ;
(2.83)
2.k rk § C /T WYT 2 .I C 1 †
where the group sources are obtained by applying the quadrature formula to
C
Eqs. (2.67) and (2.68). If we require F Œ§ C to be stationary by taking F Œ§ C Cı §Q
and requiring the linear term in ı to vanish, we obtain an Euler–Lagrange equation
h
i
Q Y W k 0 rk 0 § C C I YT † C YC W § C
rk k 1 I C 1 YT †
C
Q /s ;
D YTC sC k rk 1 YT .I C 1 †
which is just the even-parity form of the discrete ordinates equations.
(2.84)
102
E.E. Lewis
Both the spherical harmonics and discrete ordinates approximations can be
written in forms that are identical except for the coefficient matrices. The variational
principle has the form
F §C D
Z
dV
˚
Hkk0 rk 0 § C C § CT K§ C
2§ CT GC sC 2.rk § C /T Gk s ;
rk § C
T
(2.85)
and the corresponding even-parity equations are
rk Hkk0 rk 0 § C C K§ C D GC sC rk Gk s :
(2.86)
The coefficient matrices are given in Table 2.1, along with those for the simplified
approximations treated in the following section.
Thus far, the derivations hold for three dimensions. For one or two dimensions
they may be appropriately reduced by eliminating terms in the spherical harmonics
or discrete ordinates expansions. In spherical harmonics all terms with m < 0 for
x–y geometry and all terms with m ¤ 0 for plane geometry are eliminated from
Eqs. (2.71) through (2.77). The duplicate directions are likewise eliminated from
discrete ordinates approximations (polar axis in x direction, and azimuthal angle
measured from y direction k D 1, 2, 3 for x; y, and z axes). The plane-geometry
spherical harmonics equation, in which the forgoing matrices are expressed in terms
of Legendre polynomials, is particularly relevant here as the basis for the simplified
spherical harmonics approximation.
Table 2.1 Uniform notation for Eqs. (2.84) and (2.85)
Pn
Hkk0
K
Q Uk 0
UTk 1 ŒI C 1 †
GC
I † C
I
Gk
UTk 1
Q /
.I C 1 †
Sn
k 1 ŒW
C
SPn
1
i
Q Y W
WYT †
W WYTC † C YC W
YTC
k 1 WYT
I † C
I
UT0 1
k 0
Q UT ıkk0
U0 1 ŒI C 1 †
0
Q /
.I C 1 †
Q /
.I C 1 †
SSn k 1 ŒW
i
Q Y W k 0 ıkk0
C 1 WYT †
W WYTC † C YC W
YTC
k 1 WYT
Q /
.I C 1 †
2
Second-Order Neutron Transport Methods
103
2.4.2.3 Simplified Angular Approximations
Cross-derivative terms rk . /rk 0 with k ¤ k 0 appear in both the spherical harmonics and discrete ordinates approximations. Eliminating them greatly simplifies
the spatial discretization. The widely employed simplified spherical harmonics, or
SPn, approximation provides such simplification and reduces the number of coupled
differential equations that must be solved, often without substantial loss of accuracy
[34–41]. We obtain the SPn equations by taking the one-dimensional form of the Pn
equations, given by Eq. (2.77)
i
@ T 1 h
Q U1 @ § C C .I † C /§ C
I C 1 †
U1 @x
@x
@ T 1 C
1 Q
I C † s
U1 Ds @x
and replacing the derivatives with gradient and divergence operators
h
i
Q U1 rk § C C .I † C /§ C
rk UT1 1 I C 1 †
X
Q s :
D sC rk UT1 1 I C 1 †
(2.87)
(2.88)
k
The resulting equation may also be written in variational form and expressed as coupled partial differential equations; the coefficient matrices are included in Table 2.1.
More importantly, Eq. (2.88) can be written as coupled sets of diffusion equations,
allowing highly developed diffusion computational methods to be used to obtain
solutions with great computational efficiency.
The simplified discrete ordinates approximation is much more recent [42], and
to date not widely employed. It simply eliminates the cross-derivative terms from
Eq. (2.84) by letting
h
h
i
i
Q Y W k 0 ! k 1 I C 1 YT †
Q
k 1 I C 1 YT †
Y W k 0 ıkk0
(2.89)
to yield
h
i
Q Y W k rk § C C I YTC † C YC W § C
rk k 1 I C 1 YT †
Q /s :
D YTC sC k rk 1 YT .I C 1 †
(2.90)
The SSn equations may also be written as coupled sets of partial differential equations as indicated in Table 2.1.
Before proceeding, it is instructive to briefly compare Pn and Sn solutions for
which the spatial discretization error is insignificant. Figure 2.3 shows results for
the widely used Azmy benchmark [43], which consists of a fixed source located
in a highly absorbing medium. The flux plots along the surface far from the source
accentuate errors in angular approximations. The P11 solution is converged in angle,
and may be used as a reference.
Relative Flux Magnitude
104
E.E. Lewis
6.00
5.75
5.50
5.25
5.00
4.75
4.50
4.25
4.00
3.75
3.50
3.25
3.00
2.75
2.50
2.25
2.00
1.75
1.50
1.25
1.00
0.75
0.50
0.25
0.00
VARIANT P1
VARIANT P3
VARIANT P5
VARIANT P11
TWODANT S8
TWODANT S16
0
1
2
3
4
5
6
7
8
9
10
Y Position (cm)
Fig. 2.3 Flux plots from Sn and Pn calculations
Note that the Sn solutions oscillate about the reference demonstrating the wellknown ray effects. Low order Pn solutions appear more physically plausible, since
they are smooth curves and their errors tend to be either systematically low (as
in this problem) or high. For this problem, SPn solutions, which are not shown,
closely track the corresponding Pn solutions. Since SPn has fewer angular degrees of freedom, computational algorithms for SPn solutions run faster and use
less memory than those for the same level Pn approximation. Unlike discrete ordinates or spherical harmonics methods, however, SPn calculations do not converge
to the true solution as n is increased. In many cases, however, the residual errors are acceptably small. The circumstances under which SPn solutions may be
expected to reasonably approximate the transport equation are well examined elsewhere [34, 35, 39, 40].
2.4.3 Spatial Discretization
We spatially discretize the foregoing angular approximations using the same finite
element techniques employed in Section 2.3. Thus, we begin by subdividing the
problem domain into a finite number of volume elements.
V D
X
e
Ve :
(2.91)
2
Second-Order Neutron Transport Methods
105
Then, provided we consider only spatial trial functions that are continuous across element interfaces, we may write the functional given by Eq. (2.85) as a superposition
of elemental contributions:
n
XZ
T
F §C D
dV rk § C Hekk0 rk 0 § C C § CT Ke § C
e
e
2§
CT
o
T
GeC sC 2 rk § C Gek s :
(2.92)
Within each element, we represent the spatial distribution trial functions, designated
*
by the vector ne . r /, and a vector of unknown magnitudes Ÿe . The even-parity flux
moments become
*
*
*
(2.93)
§ C . r / D I ˝ nTe . r /Ÿe ; r 2 Ve :
Then, using the properties of the tensor product, we may write
§ CT Ke § C D .I ˝ nTe Ÿe /T Ke I ˝ nTe Ÿe D ŸTe I ˝ ne Ke ˝ nTe Ÿe D ŸTe Ke ˝ ne nTe Ÿe :
(2.94)
Similarly, it follows that
T
rk § C Hekk0 rk 0 § C D ŸTe Hekk0 ˝ .rk ne / rk 0 nTe Ÿe :
(2.95)
The functional given by Eq. (2.92) reduces to a superposition of subelement contributions:
X
X
ŸTe Ae Ÿe 2
ŸTe qe ;
(2.96)
F ŒŸe D
e
e
where the coefficient matrices are given in terms of the known trial functions of
space and angle
Z
Ae D
Hekk0
˝
dV .rk ne / r
k0
nTe
Z
CK ˝
e
e
dV ne nTe :
(2.97)
e
Here, the superscripts on Hekk0 and Ke indicate that the cross sections are evaluated
for element Ve . The source term is
Z
Z
(2.98)
qe D GeC dV sC ˝ ne C Gek dV s ˝ rk ne :
e
e
With piecewise polynomial trial functions for the finite elements, the components
of Ÿe are just the approximate values of § C at the element vertices [7]. Since these
trial functions must be continuous across element interfaces, the process for assembling the elemental contributions and determining a set of linear equations is
completely analogous to that of Eqs. (2.38) through (2.46): The components of Ÿe
corresponding to the same physical location on either side of a subelement must
have the same value. This continuity condition is enforced by creating a Boolean
106
E.E. Lewis
matrix „e for each subelement that maps the Ÿe onto Ÿ, a node wide vector of coefficients: Ÿe D „e Ÿ. This transformation allows us to write the discretized functional
in the form of Eq. (2.38). The set of algebraic equations is again obtained by requiring the discretized functional to be stationary. The result, once again, is a set of
equations with the form AŸ D q, but where the elemental contributions to A and q
are now given by Eqs. (2.97) and (2.98). Regardless of the angular approximation
employed, the A matrix is sparse and symmetric. For smaller problems, direct methods may be used for solving the matrix problem. For large problems, an array of
sparse matrix iterative techniques may be employed [29–31].
2.5 Hybrid and Integral Methods
Thus far, we have examined a number of angular approximations to the even-parity
transport equations coupled with the use of finite elements in space. There are, of
course, other possibilities for the discretization of the second-order equations. Here,
we briefly examine two of these, the variational nodal method [11, 13, 32, 41] and
an integral method [44], and then discuss recent work in combining nodal, finite
element, and integral methods.
2.5.1 A Variational Nodal Method
The variational forms of the even-parity equation that we have utilized have incoming flux, vacuum, or reflected boundary conditions. Suppose, however, that we
create a functional that is easily shown to be equivalent to the weighted residual,
Eq. (2.69), which has an inhomogeneous natural boundary condition on the oddparity flux. We simply append the following surface term to the functional given by
Eq. (2.70):
Z
Z
C
C
O C :
DF Œ
C 2 d dnO (2.99)
F Π;
The subscript is appended to denote that the problem domain is a volume V
bounded by . By requiring this functional to be stationary with respect to variations in C within V we obtain the Euler–Lagrange equation (2.66). Variations
on the boundary, , yield the relationship, Eq. (2.64), giving in terms of C .
Thus, we may consider solving the transport problem within V , assuming that the
odd-parity flux is known at the boundary. In the finite element literature the
following would be classified as a hybrid element approach [7, 8].
Suppose we consider V to be the volume of one “node” of the problem domain,
and the volume of the entire domain to be the sum of the nodes’ volumes:
2
Second-Order Neutron Transport Methods
107
V D
X
V ;
(2.100)
Then, we may write the functional for the entire domain as
FŒ
C
;
D
X
F Œ
C
;
;
(2.101)
where now appears as a Lagrange multiplier at the nodal interfaces. Requiring
F ΠC ; to be stationary with respect to variations in C within V , just yields
Eq. (2.66) within each V . Requiring it to be stationary with respect to ! C
ı Q yields the condition
Z
Z
d
O
dnO C
0C
Q D 0;
(2.102)
where nO and nO 0 D nO are the outward normal vectors from nodes V and V 0 on
either side of . Thus, C must be continuous across the interfaces.
In the neutronics literature, a nodal method is generally defined as one in which
neutron conservation is enforced for each node or subregion V [45]. The variational
method described
here
R
R may be shown to meet this condition as follows: suppose we
let N D V 1 dV d C be the scalar flux averaged over node V . We may then
R
R
write C D N C 0C where dV d 0C D 0 and substitute this expression into
Eq. (2.99) to obtain
F Œ
C
;
D . s /V N 2 2V N sN C 2 N
Z
C terms independent of N ;
*
d nO J
(2.103)
where sN is the group source averaged over V and angle. If we require the functional
to be stationary with respect to variations N ! N C ı QN we obtain the nodal balance
condition
Z
*
N
. s /V C d nO J D V sN :
(2.104)
This enforcement of neutron conservation over each V allows larger nodes to be
used than typical in the case of fine mesh methods that do not enforce such balance.
To proceed, we apply spherical harmonics, discrete ordinates, or one of the simplified angular approximations given in Section 2.4 to F ΠC given by Eq. (2.99).
We must consistently discretize the surface term. We first divide the surface into
a number of flat surfaces, . In the spherical harmonics approximation, we let
*
*
*
O yT ./§
O . r ; /
. r /; r 2 :
(2.105)
O to indicate that the odd-parity spherical
Here, the subscript is attached to y ./
harmonics are rotated such that the polar axis is perpendicular to the surface. In
108
E.E. Lewis
O this is necessary to obtain the correct
addition, the Yll are deleted from y ./;
number of linearly independent coupling conditions [14]. Employing Eqs. (2.71)
and (2.105), we get
Z
Z
d
O
dnO C
D
XZ
d§ CT C § ;
(2.106)
R
O C yT , while for the SPn approximation C D UT .
with C D dnO y
1
For discrete ordinates, we approximate the odd-parity flux on by using
Eqs. (2.79) through (2.81) with
*
§ T . r / h
*
O 1 /;
.r ; *
O 2 /;
.r ; *
O 3 /; ;
.r ; i
*
O n /; .r ; (2.107)
and
h
i
O 1 nO ; O 2 nO ; O 3 nO ; ; O n nO ; :
D diag (2.108)
The result is again Eq. (2.106), but with C D W ; this equality remains unchanged in the SSn approximation.
Next, we rewrite the reduced form of the functional given by Eq. (2.99) with the
angularly discretized variables discretized in a form similar to Eq. (2.92):
F § C; § D
Z
Hkk0 rk 0 § C C § CT K§ C 2§ CT GC sC
XZ
2.rk § C /T Gk s C 2
d § CT C § (2.109)
:
dV
˚
rk § C
T
Within the node, we approximate the spatial dependence of the even-parity flux by a
*
vector f. r / containing the orthonormal components of a complete polynomial, and
on the interfaces the spatial dependence of the odd-parity flux is given by a second
*
set of orthonormal polynomials, h . r /, defined only on the interface. Thus,
*
*
§ C . r / D I ˝ f T . r /Ÿ
*
r 2V
(2.110)
and
*
*
*
T
§
. r / D I ˝ h . r / r 2 ;
(2.111)
where Ÿ and are the unknown coefficients. The reduced functional then becomes
F ŒŸ; D ŸT AŸ 2ŸT q C 2ŸT
X
M ;
(2.112)
2
Second-Order Neutron Transport Methods
109
where
Z
Z
A D Hkk0 ˝ dV .rk f/ rk 0 f T C K ˝ dV f f T ;
Z
Z
q D GC dV sC ˝ f C Gk dV s ˝ rk f
(2.113)
(2.114)
Z
and
M D C ˝
dfhT :
(2.115)
Requiring Eq. (2.112) to be stationary with respect to variations in Ÿ yields
AŸ D q X
M ;
(2.116)
or equivalently
Ÿ D A1 q A1 M;
(2.117)
T T
T
where M D ŒM1 ; M2 ; : : : and D 1 ; 2 ; : : : . To obtain the interface continuity conditions we first note that
F ŒŸ; D
X
F ŒŸ; (2.118)
Requiring that the functional be stationary with respect to the variations !
C ı Q , we obtain the condition
X
ŸT M Q D 0
(2.119)
where the sum is now over all interfaces in the problem domain. This condition can
be met for arbitrary Q only if the even-parity moments
® D MT Ÿ
(2.120)
are continuous across the interfaces. Thus for each node, we may combine this
expression with Eq. (2.117) to express the even-parity interface moments in terms
of the node source and the odd-parity interface moments
® D MT A1 q MT A1 M¦:
(2.121)
Letting ®T D ®T1 ; ®T2 ; : : : ; and therefore ® D MT Ÿ, we may write for each node
® D MT A1 q MT A1 M¦:
(2.122)
110
E.E. Lewis
A convenient way to solve these equations is to convert them to response matrix
form. We make a linear transformation of variables
j˙ D 1=4® ˙ 1=2¦;
(2.123)
which in the diffusion approximation correspond to the partial currents. We then
may rewrite Eq. (2.122) in response matrix form
jC D Rj C Bq;
(2.124)
h
i1 h
i
1= MT A1 M I
where the matrices are defined by R D 1=2MT A1 M C I
2
i1
h
T 1
T 1
1
1
=2M A .
and B D =2M A M C I
Red-black iteration or other methods, then, may be applied to the solution of
the resulting equations. Partitioning algorithms, over-relaxation methods, and other
techniques have also been developed to accelerate the iterative solutions of the systems of nodal response matrix equations [46, 47].
2.5.2 An Even-Parity Integral Method
The functional given by Eq. (2.25) may also be used to derive an even-parity integral
transport method [28, 44]. The treatment is restricted to isotropic scattering, and
we also eliminate the surface term from Eq. (2.25) by assuming essential boundary
conditions. To obtain the integral method, we reverse the discretization order used in
the preceding sections and first discretize the spatial variables, using finite elements,
while leaving the unknowns functions of angle. Dividing the problem domain into
elements, Eq. (2.25) reduces to
F
C
D
XZ
e
e
Z
dV
h
d e1 k rk
C 2
C e
C2
se
2
i
2 s :
(2.125)
*
Then, employing the same finite element trial functions, ne . r /, used in Sections 2.4
and 2.5, we get
*
C * O
O
. r ; / D nTe . r /®e ./;
(2.126)
and correspondingly
*
*
. r / D nTe . r /¥e :
(2.127)
O and ¥e approximate the even-parity angular flux and the
Here, the vectors ®e ./
scalar flux, respectively, at the spatial mesh points. Combining these approximations
with Eq. (2.125) then yields
2
Second-Order Neutron Transport Methods
F Ψe D
X Z
111
O k k 0 e1
d®Te ./
Z
e
e
Z
C e
e
dV .rk ne / rk 0 nTe
O ¥Te se
dV ne nTe ®e ./
Z
e
dV ne nTe ¥e 2¥Te se
(2.128)
R
where se D e dV ne s. As in Eqs. (2.38) through (2.46), we assemble the elemental
contributions into global vectors of angularly dependent coefficients through the
O D „e ®./
O and ¥e D „e ¥. The reduced
use of the Boolean transformations ®e ./
functional is then
Z
O I ./®.
O
O ¥Te BI ¥ 2®T sI ;
F Œ® D d®T ./A
/
(2.129)
where
O D k k 0
AI ./
C
BI D
X
X
e
X
e
„Te e1
Z
„Te e
e
e
Z
„Te se
e
Z
e
dV .rk ne / rk 0 nTe „e
dV ne nTe „e ;
dV ne nTe „e ;
and
sI D
X
(2.130)
(2.131)
„Te se :
(2.132)
e
O then yields
Requiring Eq. (2.129) to be stationary with regard to variations in ®./
the Euler–Lagrange equation
O
O D BI ¥ C sI :
/
AI ./®.
(2.133)
We may obtain scalar flux equations by first inverting A and then integrating over
angle. We obtain
Z
I
O
dA1
I ./BI
Z
¥D
O
dA1
I ./sI :
(2.134)
This equation has a number of similarities to those obtained using collision probability theory [28]. The dimension of the problem is only that of the number of spatial
nodes, but the coefficient matrix on the left is both dense and nonsymmetric. MoreO represents an intractable task, numerical
over, since analytical inversion of AI ./
quadrature must be used, inviting comparison to the ray tracing methods used in integral transport schemes. Treating curved boundaries and other irregular features is
straightforward using isoparametric finite elements. Moreover, the truncation errors
112
E.E. Lewis
are order h2 for the lowest-order finite elements (triangles with piecewise linear
trial functions), compared to h for collision probability methods, allowing coarser
element meshes to be used.
2.5.3 Combined Methods
The variational nodal method as described in Section 2.5.1 is applicable to homogenous nodes. However, by using finite element trial functions within the nodes,
heterogeneous nodes may also be treated. Recently, this has been accomplished in
what is called the subelement nodal method and applied to reactor problems in
which each node corresponds to one pin-cell [48]. Likewise, the integral method
described in Section 2.5.2 may be placed within the nodal framework, and this too
has resulted in the ability to treat heterogeneous nodes, often times with substantial savings in memory over what is required using the corresponding differential
method [49].
2.6 Discussion
The foregoing sections attempt to provide an overview of the second-order methods that are finding increased use in neutron transport calculations. The focus has
been placed upon the discretization of the transport equation rather than on the algorithms that are employed to solve the resulting sets of linear equations. The vast
literature on iterative methods used to solve large sets of sparse matrix equations,
and in particular the intense efforts being made to effectively utilize parallel computing to speed such computations, lies beyond the scope of this overview. In addition,
the foregoing exposition has been limited to the linear neutron transport equation.
No attempt has been made to include recent work in which second-order transport methods are combined with computational fluid mechanics or other techniques
to include nonlinear thermal-hydraulic feedback mechanisms into neutronics calculations. Likewise, the scope of this work has not included techniques needed
specifically in treating the unique characteristics of thermal radiation or of electron
transport problems.
Work in progress seems destined to expand significantly the value of secondorder methods. In particular, interest in the primal and dual coupled forms of the
transport equations is beginning to break down the sharp distinction between firstand second-order methods. Perhaps the greatest challenge facing the expanded use
of second-order methods, however, is in the treatment of void regions. Since the
total cross section appears in the dominator of the even-parity equations, they cannot be applied directly to vacuum regions. Rather, they must be coupled with ray
tracing or other techniques in situations, where neutron streaming in voids must be
incorporated into second-order computations.
2
Second-Order Neutron Transport Methods
113
References
1. HASSITT (1968) Diffusion theory in two and three dimensions. In: Greenspan H, Kelber CN,
Okrent D (eds) Computing methods in reactor physics, Chap. 2. Gordon & Breach, New York
2. Semenza LA, Lewis EE, Rossow EC (1972) Application of the finite element method to the
multigroup neutron diffusion equation. Nucl Sci Eng 47:302
3. Hansen KF, Kang CM (1975) Finite element methods in reactor physics analysis. Adv Nucl
Sci Tech 8:173
4. Kavenoky A, Lautard JJ (1977) A finite element depletion diffusion calculations method with
space-dependent cross-sections. Nucl Sci Eng 64:563
5. Vladimirov VS (1961) Mathematical problems in the one-velocity theory of particle transport,
Atomic Energy of Canada Ltd., Ontario (1963) (trans: V. A. Steklov Mathematical Institute) 61.
6. Kaplan S, Davis JA (1967) Canonical and involutory transformations of the variational problems of transport theory. Nucl Sci Eng 28:166–176
7. Zienkiewicz OC (1989) The finite element method, 4th edn. McGraw-Hill, London
8. Brezzi F, Fortin M (1991) Mixed and hybrid finite element methods. Springer-Verlag,
New York
9. Strang G, Fix GJ (1973) An analysis of the finite element method. Prentice-Hall, Englewood
Cliffs, NJ
10. Blomquist RN, Lewis EE (1980) A rigorous treatment of transverse buckling effects in twodimensional neutron transport computations. Nucl Sci Eng 73:125
11. Dilber I, Lewis EE (1985) Variational nodal methods for neutron transport. Nucl Sci Eng
91:132
12. deOliveira CRC (1986) An arbitrary geometry finite element method for multigroup neutron
transport with anisotropic scattering. Prog Nucl Energy 18:227
13. Carrico CB, Lewis EE, Palmiotti G (1992) Three-dimensional variational nodal transport methods for Cartesian, triangular and hexagonal criticality calculations. Nucl Sci Eng 111:223
14. Lewis EE, Carrico CB, Palmiotti G (1996) Variational nodal formulation for the spherical
harmonics equations. Nucl Sci Eng 122:194
15. Lillie RA, Robinson JC (1976) A linear triangle finite element formulation for multigroup
neutron transport analysis with anisotropic scattering, ORNL/TM-5281. Oak ridge National
Laboratory
16. Jung J, Kobayashi NO, Nishihara N (1973) Second-order discrete ordinate Pl equations in
multi-dimensional geometry. J Nucl Energy 27:577
17. Morel JE, McGhee JM (1995) A diffusion-synthetic acceleration technique for the even-parity
Sn equations with anisotropic scattering. Nucl Sci Eng 120:147–164
18. Morel JE, McGhee JM (1999) A self-adjoint angular flux equation. Nucl Sci Eng 132:312–325
19. Lautard JJ, Schneider D, Baudron AM (1999) Mixed dual methods for neutronic reactor core
calculations in the CRONOS system. In: Proc. Int. Conf. Mathematics and Computation,
Reactor Physics and Environmental Analysis of Nuclear Systems, 27–30 Sept 1999, Madrid
20. Fedon-Magnaud C (1999) Pin-by-pin transport calculations with CRONOS reactor code. In:
Proc. Int. Conf. Mathematics and Computation, Reactor Physics and Environmental Analysis
of Nuclear Systems, 27–30 Sept 1999, Madrid
21. Akherraz B, Fedon-Magnaud C, Lautard JJ, Sanchez R (1995) Anisotropic scattering treatment
for the neutron transport equation with primal finite elements. Nucl Sci Eng 120:187–198
22. Miller WF Jr, Lewis EE, Rossow EC (1973) The application of phase-space finite elements to
the two-dimensional neutron transport equation in X-Y geometry. Nucl Sci Eng 52:12
23. Briggs LL, Miller WF Jr, Lewis EE (1975) Ray-effect mitigation in discrete ordinate-like angular finite element approximations in neutron transport. Nucl Sci Eng 57:205–217
24. Carlson BG, Lathrop KD (1968) Transport theory: the method of discrete ordinates. In:
Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics, Chap. 3.
Gordon & Breach, New York
25. Gelbard EM (1968) Spherical harmonics methods: PL and double PL approximations. In:
Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics, Chap. 4.
Gordon & Breach, New York
114
E.E. Lewis
26. Fletcher JK (1994) The solution of the multigroup neutron transport equation using spherical
harmonics. Nucl Sci Eng 116:73
27. Fedon-Magnaud C, Lautard JJ, Akherraz B, Wu GJ (1995) Coarse mesh methods for the transport calculations in the CRONOS reactor code. In: Proc. int. conf. mathematics, computations,
reactor physics and environmental analysis, 30 Apr–4 May 1995, Portland, Oregon
28. Lewis EE, Miller WF Jr (1984) Computational methods of neutron transport. Wiley, New York
29. Greenbalm A (1977) Iterative methods for solving linear systems. SIAM, Philadelphia, PA
30. Saad Y (1996) Iterative methods for sparse linear systems. PWS Publishing Co, Boston, MA
31. Bemmel JD (1997) Applied Numerical Linear Algebra. SIAM, Philadelphia, PA
32. Palmiotti G, Carrico CB, Lewis EE (1966) Variational nodal transport methods with anisotropic
scattering. Nucl Sci Eng 122:194
33. Morse PM, Feshbach H (1953) Methods of theoretical physics. McGraw-Hill, New York
34. Gelbard EM (1960) Application of spherical harmonics method to reactor problems, WARDBT-20. Bettis Atomic Power Laboratory
35. Gelbard EM (1961) Simplified spherical harmonics equations and their use in shielding problems, WAPD-T 1182 (Rev. 1). Bettis Atomic Power Laboratory
36. Smith KM (1986) Multidimensional nodal transport using the simplified PL method. In: Proc.
topl. mtg. reactor physics and safety, 17–19 Sept 1986, Saratoga Springs, New York, p 223
37. Smith KS (1991) Multi-dimensional nodal transport using the simplified PL method. In: Proc.
ANS topl. mtg. advances in mathematics, computations, and reactor physics, 29 Apr–2 May
1991, Pittsburgh, PA
38. Pomraning CG (1993) Asymptotic and variational derivations of the simplified P n equations.
Ann Nucl Energy 20:623
39. Larsen EW, McGhee JM, Morel JE (1993) Asymptotic derivation of the simplified P n equations. In: Proc. topl. mtg. mathematical methods and supercomputers in nuclear applications,
M&C + SNA’93, 19–23 Apr 1993, Karlsruhe, Germany
40. Larsen EW, Morel JE, McGhee JM (1995) Asymptotic derivation of the multigroup P1 and
simplified P n equations. In: Proc. int. conf. mathematics and computations, reactor physics
and environmental analysis, 30 Apr– 4 May 1995, Portland, Oregon
41. Lewis EE, Palmiotti G (1997) Simplified spherical harmonics in the variational nodal method.
Nucl Sci Eng 126:48
42. Noh T, Miller WF Jr, Morel JE (1996) The even-parity and simplified even-parity transport
equations in two-dimensional x-y geometry. Nucl Sci Eng 123:38–56
43. Azmy YY (1988) The weighted diamond-difference form of the nodal transport methods. Nucl
Sci Eng 98:29
44. Lewis EE, Miller WF Jr, Henry TP (1975) A two-dimensional finite element method for integral neutron transport calculations. Nucl Sci Eng 58:202
45. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and
transport equations. Prog Nucl Energy 17:271
46. Palmiotti G, Lewis EE, Carrico CB (1995) VARIANT: VARIational Anisotropic Nodal Transport for multidimensional Cartesian and hexagonal geometry calculation, ANL-95/40. Argonne
National Laboratory
47. Yang WS, Palmiotti G, Lewis EE (2001) Numerical optimization of computing algorithms for
the variational nodal method. Nucl Sci Eng 139:74–185
48. Smith MA, Tsoulfanidis N, Lewis EE, Palmiotti G, Taiwo TA (2003) A finite subelement
generalization of the variational nodal method. Nucl Sci Eng 144:36
49. Smith MA, Palmiotti G, Lewis EE, Tsoulfanidis N (2004) An integral form of the variational
nodal method. Nucl Sci Eng 146:141
2
Second-Order Neutron Transport Methods
115
Professor Elmer E. Lewis received his
B.S. in engineering physics (1960) and an
M.S. (1962) and Ph.D. (1964) in nuclear
engineering at the University of Illinois,
Urbana. He served as a captain in the
US Army Ordnance Corps and as a Ford
Foundation Fellow and assistant professor of nuclear engineering at MIT before joining as Northwestern’s faculty in
1968. In addition to serving as chair of
Northwestern’s Department of Mechanical Engineering (1987–1997), he has held
appointments as visiting professor at the
University of Stuttgart and Guest Scientist
at the Nuclear Research Center at Karlsruhe, Germany. He has been a frequent
consultant to Argonne and Los Alamos
National Laboratories and to a number of
industrial firms. A Fellow of the American Nuclear Society, and winner of its Mathematics and Computation Distinguished Service and Arthur Holly Compton Awards,
Professor Lewis serves on the Editorial Boards of the journals Nuclear Science
and Engineering and Transport Theory and Statistical Physics. He has held a number of offices in the American Nuclear Society, including chair of its Mathematics
and Computation Division. His research resulted in significant advances in a wide
variety of topics including neutronics computational methods, radiation transport,
the physics and safety of nuclear systems, reliability and quality modeling, and
Monte Carlo simulation. Professor Lewis has taught a wide range of courses in
mechanical and nuclear engineering, ranging from freshman seminars to graduatelevel offerings, and currently serves as his department’s undergraduate curriculum
coordinator. In addition to undergraduate advising, he has been a primary supervisor
to more than 20 Ph.D. and approximately 30 M.S. students, three of them winning
the American Nuclear Society’s Mark Mills Award for their doctoral work. He is the
author or co-author of nearly 200 journal articles and conference proceeding papers.
He has written four engineering textbooks as well as a historical appreciation of engineering intended for a more general audience.
Chapter 3
Monte Carlo Methods
Jerome Spanier
3.1 Introduction
Monte Carlo methods comprise a large and still growing collection of methods
of repetitive simulation designed to obtain approximate solutions of various problems by playing games of chance. Often these methods are motivated by randomness
inherent in the problem being studied (as, e.g., when simulating the random walks
of “particles” undergoing diffusive transport), but this is not an essential feature
of Monte Carlo methods. As long ago as the eighteenth century, the distinguished
French naturalist Compte de Buffon [1] described an experiment that is by now well
known: a thin needle of length l is dropped repeatedly on a plane surface that has
been ruled with parallel lines at a fixed distance d apart. Then, as Laplace suggested
many years later [2], an empirical estimate of the probability P of an intersection
obtained by dropping a needle at random a large number, N , of times and observing the number, n, of intersections provides a practical means for estimating . The
relationship is
2l
d
2l
PO d
P D
or
where PO D n=N and we assume that l < d .
We introduce another example that can be used to illustrate several important
features of Monte Carlo simulation. This “model” transport problem, one of the
simplest random walk problems one might imagine, can be solved completely without resorting to sampling at all and yet exhibits characteristics of problems that are
typical of more complex particle transport. The study of such model problems is crucial in obtaining a deeper understanding of the basic principles that underlie Monte
Carlo methods.
J. Spanier ()
Beckman Laser Institute, University of California, Irvine, California, USA
e-mail: jspanier@uci.edu
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review,
c Springer Science+Business Media B.V. 2010
DOI 10.1007/978-90-481-3411-3 3, 117
118
J. Spanier
We imagine particles (random walkers) that are assumed to impinge on the left
face of a vertical slab of unit thickness. Each particle moves only to the right in
steps selected at random from a uniform distribution on [0,1] until it escapes from
the slab. Let X be the number of steps required to escape. The problem is to estimate
EŒX , the average number of steps to escape, where EŒX is the expectation of the
random variable X .
For this problem, one expresses the expectation as an infinite series, the nth term
of which is the product of n and the probability pn that the particle escapes after
exactly n steps. This infinite series representation of EŒX is precisely analogous to
the Neumann series representation of the solution of the transport equation that describes this problem, as well as so many problems that are solved using Monte Carlo
methods. Making use of the fact that for this simple problem, pn D 1 = n.n 2/Š
for n D 2; 3; : : : the infinite series can be summed exactly, which yields
EŒX D e Š 2:71828
and the variance of X is
2 ŒX D 3e e 2 Š 0:76579:
This very simple test problem provides a very useful vehicle for analyzing more
complex Monte Carlo random walk problems for which exact values of the moments
of key random variables will be unknown in general.
Of course, much more accurate and efficient (deterministic) methods may be used
to estimate both and e. However, even these very simple simulation problems
serve to illustrate a number of key ingredients of the Monte Carlo method: (1) The
need to generate random samples drawn from a variety of probability distributions;
(2) the need to express the outcomes of a Monte Carlo experiment as estimates of
a theoretical expected value of some random variable; (3) the need to perform an
error analysis based on statistical fluctuations of a random variable from sample to
sample; and (4) where possible, the desirability of reducing the sample to sample
fluctuations by thinking deeply about the inherent cause of these fluctuations and
taking appropriate measures to reduce them.
For example, in the needle-tossing experiment, why not toss a cruciform-shaped
needle, or even one with many needles of the same length equally distributed around
the circumference of a circle of radius l? This would seem to provide a more efficient
experiment since each toss produces many possible intersections, yet each “spoke”
retains the same distributional characteristics as each single needle toss in the original experiment. But clearly these more sophisticated “needles” produce correlated
sampling results. How does this affect the statistical analysis of the outcomes? Some
of these questions are addressed in the interesting references [3–5].
For the one-dimensional random walk problem, one might imagine that a more
systematic sampling (instead of random sampling) of the unit interval to obtain step
sizes for each step to the right would lead to reduced statistical fluctuations in the
Monte Carlo estimate. Suppose, then, that one were to subdivide the unit interval
3
Monte Carlo Methods
119
into a large, but fixed, number, S , of equal subdivisions and choose as individual
step sizes the midpoints, say, of these subintervals ordered deterministically. For
example, one could move through these in order, going from the least to the greatest,
to generate steps for the random walkers. Pretty clearly, this choice is not a very
good one since, if S is very large compared to the number of samples generated,
there would be a bias in the direction of short steps. Perhaps one should average
a small step with a large one or run through the midpoints randomly. Again, what
about the correlation introduced? And what is the impact of any such scheme on the
sample to sample variability?
In addition to containing many of the same critical features of most Monte Carlo
problems, these simple model problems illustrate some of the advantages of the
Monte Carlo method: (a) its appeal to intuition; (b) its simplicity of implementation;
and (c) its accessibility to nonexperts. However, as more and more sophistication is
considered in an attempt to speed up the computation or to reduce the sampling variability per unit computing cost, the model for the experiment can depart more and
more from intuitive plausibility, and the need for mathematical rigor is accentuated.
A firm theoretical foundation becomes not only desirable, but also essential.
There are many problems for which analytic or good deterministic methods are simply unavailable and that therefore benefit from being formulated stochastically. Particle transport problems provide a fertile field of examples of such problems, and this field will
be our main emphasis here. Since the earliest applications of Monte Carlo methods
to neutron transport problems, however, the number and range of applications have
grown well beyond the bounds of a single book chapter. The field of operations research is rife with such examples, which we do not discuss here (see, e.g., [7]). We
have deliberately ignored the discussion of developments in computer architecture
(e.g., use of parallel or vectorized computation) and many of the rapidly growing
list of important application areas, such as financial modeling, design of radiation
therapy plans, Markov chain Monte Carlo, and others. Because of the explosion in the number of applications of Monte Carlo methods and the avalanche of
publications dealing with both the theory and applications, there is no possibility
of dealing with the subject comprehensively here.
Finally, we presume that the reader is familiar with at least the rudiments of
Monte Carlo. Several books and articles can provide an introduction to the subject
matter (e.g., [8–13]). For those already familiar with the rudiments, a review of [14] would provide an appropriate introduction, since our ultimate goal here is to update that article.
3.2 Organizing Principles
Initially, a historical context was suggested by the organizers for each of the Gelbard
lectures. Each lecture was intended to survey one of the topics traditionally important to the Mathematics and Computation Division of the American Nuclear Society
in such a way as to update the 1968 publication of [14]. While this context seemed
to serve well for the lectures at the Gatlinburg conference – at least the chronology
featured prominently in the oral accounts – a division according to subject content
seemed more appropriate for the written version. Our hope is that this switch in
perspective might make the chapter more useful as a reference.
Accordingly, we have abandoned the idea of dividing the content into two historical periods, one prior to 1968 and the other afterward. Instead, the material here will
be organized into what we perceive to be the key elements of Monte Carlo methods
development: generating sequences, error analysis, error reduction, and theoretical
foundations. Here is a brief overview of what the reader might expect to find.
3.2.1 Generating Sequences
Section 3.4, dealing with generating sequences, shows that the relatively simple algorithms used to create a sequence of unpredictable, “random” numbers for early
use in simulation have evolved into several quite different streams of research. These
involve both modern-day successors to the earliest pseudorandom number generators and algorithms that are completely deterministic, focusing only on uniformity
in a manner that forsakes the idea of stochastic independence altogether.
3.2.2 Error Analysis
For the analysis of errors in Monte Carlo output, we will discuss in Section 3.5
the evolution, in the case of pseudorandomly generated samples, from reporting
the sample mean and standard deviation to the now fairly common use of higher
moments and additional statistical tests (see [15]). When completely deterministic sequences are used in place of pseudorandom ones, and no probabilistic model
is invoked, however, we will see that a markedly different error analysis must be
applied.
3.2.3 Error Reduction
Perhaps the greatest concentration of effort to improve Monte Carlo methods has
been devoted to the topic of error reduction in the last 50 years or so. Whether the
simulation makes use of pseudorandom number sequences or their deterministic
cousins – quasi-random sequences – highly sophisticated techniques have been developed that are capable of producing not only increased rates of convergence, but
also much lower error levels for a fixed sample size. Some of these developments
are described in Section 3.6.
3.2.4 Foundations/Theoretical Developments
In discussing this last of our “big four” topics in Section 3.7, we shall attempt to
list the major advances in understanding the foundations of the subject. Our (admittedly biased) perspective is that these have had a great deal to do with the practical
advances made since the earliest uses of simulation methods to solve complex
problems.
Following our discussion of the development of each of these four major themes,
we try to formulate, at the close of each section, a succinct summary of the present state of the art for that theme. We end this chapter with a short list, presented
in Section 3.8, of major challenges that might serve to stimulate further thinking.
3.3 Historical Perspectives
Before we pursue our discussion of each theme, we provide an overview of the
early history of the subject. Much has already been written about the development
of a nuclear weapons program whose history began with the famous experiments
conducted during World War II at the University of Chicago. These culminated on
December 2, 1942 with the first controlled nuclear chain reaction on a squash court
situated beneath Chicago’s football stadium. Working later in Los Alamos, New
York, Oak Ridge, and other locations, Enrico Fermi and others (for instance, Stanislaw Ulam, John von Neumann, Robert Richtmyer, and Nicholas Metropolis) played
an important part in solving the problems connected with the development of the
atomic bomb. This work depended crucially on rudimentary numerical simulations
of multiplying neutron populations.
It is widely held that the first paper on the Monte Carlo method was [16]. The
marriage of relatively unsophisticated simulation methods with the development of
automatic digital computers provided the key ingredients for success in this rapidly
expanding wartime undertaking.
Shortly after the end of World War II another major effort was initiated, spearheaded by Admiral Hyman Rickover, to design and implement nuclear propulsion
systems for the Navy. This highly successful program began in the 1940s, with the
first test reactor started up in the United States in 1953. The first nuclear-powered
submarine – the USS Nautilus – put to sea in 1955. This development marked
the transition of submarines from slow underwater vessels to warships capable of
sustaining 20–25 knots while submerged for weeks at a time. The success of the
Nautilus effort led to the development of additional submarines, each powered by a
single pressurized water reactor, and an aircraft carrier, the USS Enterprise, powered
by eight reactor units, in 1960. A cruiser, the USS Long Beach, was placed into service in 1961, and by 1962 the United States boasted a fleet of 26 operational nuclear submarines with 30 more under construction. Today, the US Navy operates
more than 80 nuclear-powered ships including 11 aircraft carriers and a number of
cruisers.
The postwar activities to produce nuclear propulsion units for a nuclear navy
were concentrated primarily at the Westinghouse-managed Bettis Atomic Power
Laboratory in suburban Pittsburgh, PA and the General Electric-managed Knolls
Atomic Power Laboratory in Schenectady, New York during the 1950s and 1960s.
In that same time frame there was a very rapid expansion in the capability of digital computers at these and at the nation’s National Laboratories at Los Alamos,
Oak Ridge, Brookhaven, Argonne, and Livermore. The development of the giant
(using nearly 18,000 vacuum tubes) ENIAC machine by John W. Mauchly and
J. Presper Eckert at the University of Pennsylvania during the period 1943–1946
and the EDVAC computer, for which conceptual design was completed in 1946 but
which was not fully operational until 1952, led the way. It is perhaps no coincidence
that von Neumann and Metropolis were heavily involved in both the development
of modern computing machines and the Monte Carlo method as a practical numerical method. As a result of this dual evolution of science and technology, the
same period was marked by a dramatic increase in the levels of sophistication of
computations in support of the nuclear energy program, including the design and development of reactors for peacetime uses. The commercial development of nuclear
power-generating plants provided added incentives for acceleration of this effort. As
a result, Monte Carlo methods began to find greater use as a design tool and as a
partial replacement for the much more expensive (and risky!) criticality experiments
that were needed to validate the various nuclear designs.
An unfortunate consequence of the fact that much of the work done was classified
during this early period is that publication occurred mainly in classified government
reports rather than in the open literature. This undoubtedly prevented many important ideas from becoming known to a much wider audience sooner. Indeed, many
of these reports, such as the seminal work of Herman Kahn [17, 18], were never
republished.
3.4 Generating Sequences
It seems appropriate to begin our review of this topic with an often cited quotation
due to R.R. Coveyou [19] – “Random number generation is too important to be left
to chance.” Indeed, this statement, which was certainly valid in 1969 when a good
deal of effort was being devoted to trying to understand how to generate high-quality
pseudorandom numbers, is even more accurate today. This is so because there are
very sophisticated new methods for generating pseudorandom sequences as well as
methods for generating completely deterministic uniform sequences that serve in
place of pseudorandom ones. We will deal with such deterministic sequences¹ in Section 3.4.2.
¹ Of course, even pseudorandom sequences are deterministic, a fact that has stirred some debate about whether a probabilistic analysis made any sense at all for pseudorandomly implemented Monte Carlo. This, in turn, was one of the motives for developing a mathematically rigorous analysis divorced from probability theory and based only on a notion of uniformity arising out of number-theoretic considerations.
3.4.1 Pseudorandom Sequences
Pseudorandom numbers are commonly understood to be computer substitutes for
“truly random” numbers. Their function is to simulate realizations of independent,
identically distributed (iid) random variables on the unit interval. Other distributions
that might be needed in a stochastic simulation are then obtained by transformation methods, of which there are many (see [20] and the software package
C-Rand [21, 22]).
The most common early source of pseudorandom numbers was the multiplicative congruential generator

$$\xi_{i+1} \equiv a\,\xi_i \pmod{m}$$

originally suggested by Lehmer [23], or one incorporating an additive component

$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod{m}.$$

The pseudorandom numbers themselves are then defined through division by the modulus:

$$r_i = \xi_i/m \in [0,1].$$
When employing numbers generated by such linear congruential generators, appropriately chosen "seeds" $\xi_0$, moduli $m$, and multipliers $a$ provide the means to
generate reasonably high quality pseudorandom sequences, with sufficiently long
periods for many problems (see [24]). Following Lehmer’s original suggestion to
use such sequences for simulation, substantial effort and analysis was invested in
assessing their imperfections, such as the persistence of serial correlation [25, 26].
Knuth [24] devised a number of statistical tests to bolster confidence in using pseudorandom sequences for generating samples from the many distributions required
during the course of a simulation experiment.
For much of the early history of Monte Carlo computations, the emphasis was
on obtaining results, not on looking carefully at the theoretical underpinnings of
the computations. When using linear congruential generators as a source of nearly
independent and nearly uniform realizations of independent and identically distributed uniform distributions on the unit interval, it seemed sufficient to assure
sufficiently long periods of pseudorandom sequences with serial correlation properties that seemed “safe.” The ideal desiderata for pseudorandom sequences were
maximal periods and minimal “structure.” Here, the word structure is being used
nearly synonymously with predictability. And indeed, recursive multiplication of large integers produces, as remainders, numbers that work surprisingly well in this
regard for many Monte Carlo problem applications.
In 1967, however, George Marsaglia published his seminal paper [27] with the
surprising conclusion that all multiplicative congruential generators possess a defect that makes them unsuitable for use in many applications. Furthermore, this
defect cannot be eliminated by adjusting the starting values, multipliers, or moduli
of the congruence. The fatal flaw revealed by Marsaglia’s paper is that if successive
$n$-tuples $(u_1, u_2, \ldots, u_n), (u_2, u_3, \ldots, u_{n+1}), \ldots$ produced by the generator are
treated as points in a unit cube of dimension n, then all of these points will lie
on a relatively small number of parallel hyperplanes. If one plots in two or three
dimensions a set of successive pairs or triples of numbers obtained from a linear
congruential generator, this hyperplane structure is readily apparent. In applications
requiring as much uniformity as possible in two (or more) dimensional space, not
just on each axis separately, this would be a crucial factor. For example, it is easy
to imagine specific piecewise linear functions of a single variable whose integrals –
estimated crudely by selecting a pair of points randomly in the unit square and
counting the fraction that fall below the graph of the function – would not be estimated very accurately even after many samples were generated. After nearly 20
years of essentially unquestioning use of such generators, the stage had been set
for an explosion of effort and analysis to create improved pseudorandom sequences
as well as methods that do not rely on randomness at all. After all, as John von
Neumann stated “Anyone who considers arithmetical methods of producing random
numbers is, of course, in a state of sin” (quoted in [24]).
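The hyperplane defect is easy to exhibit numerically. The following sketch (ours; the tiny modulus and multiplier are chosen only so the structure is obvious, not for practical use) counts the parallel lines on which all successive pairs of a small linear congruential generator fall:

```python
# Successive pairs (u_i, u_{i+1}) from a linear congruential generator
# satisfy u_{i+1} = a*u_i + c/m - k for an integer k, so they lie on at
# most about 'a' parallel lines in the unit square.
m, a, c, seed = 65536, 69, 1, 0    # full-period toy parameters

x, pairs = seed, []
for _ in range(m):
    x_next = (a * x + c) % m
    pairs.append((x / m, x_next / m))
    x = x_next

# Recover the integer k for each pair and count the distinct lines.
ks = {round(a * u + c / m - v) for (u, v) in pairs}
print(f"{len(pairs)} pairs lie on only {len(ks)} parallel lines")
```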
There are many excellent sources of additional material about pseudorandom
numbers. Beginning with Knuth’s book [24], which may be the best reference for
a rigorously mathematical discussion, another more recent reference, which belongs in any basic library on the subject, is Harald Niederreiter's book [10], based
on lectures he gave at a CBMS-NSF Regional Conference held at the University
of Alaska, Fairbanks in 1990. In addition to these standard reference works, there
are now several Internet web sites that provide not only lists of references but also
links to various other sites, algorithms, and a variety of other relevant information.
A good choice among these is the one maintained at the University of Salzburg at
http://random.mat.sbg.ac.at.
Two important developments characterize the past 30 years of progress with respect to generating sequences. The first of these is the exhaustive, rigorous analysis
of uniform pseudorandom number algorithms, including linear and nonlinear congruential generators and the emergence of a host of other ideas for generating
sequences that behave nearly randomly. The second development that occurred more
or less in parallel with the first was much more radical. This involved the abandonment of randomness as a requirement for decision making in favor of reliance on
uniformity in an appropriately defined high-dimensional space. In this latter area of
research, number-theoretic methods are used for generating and analyzing optimally
regular sequences. The use of such “quasi-random,” as opposed to pseudorandom
numbers, in turn, can produce more rapid convergence of sample means to theoretical means for many practical problems (see Section 3.5.2).
Following the discovery by Marsaglia of the parallel hyperplanes phenomenon
characteristic of linear congruential generators it became clear that many simulations could be adversely affected by this behavior. As a result, much effort was
devoted to careful theoretical analyses of the behavior and fundamental properties
of uniform pseudorandom number generators and algorithms. Excellent material
describing these developments can be found in [28–34] and in [10] and [35, 36].
A detailed exposition of this topic alone is beyond the scope of this chapter.
The reference [28] lists 147 relevant papers, most of them published during the
10-year interval, 1985–1995. However, we try to summarize the salient facts here
to provide a flavor of the material to be found in the literature. Taking into account the important, indeed pivotal, role played by pseudorandom numbers in
any Monte Carlo simulation, the significance of these investigations can hardly be
overemphasized.
Computer algorithms for generating uniform pseudorandom numbers all yield
periodic sequences. Naturally, it is desirable that these sequences have long periods,
good equidistribution properties, minimal serial correlation, and as little intrinsic
structure as possible. Obviously, it is also important that the algorithm be amenable
to a fast computer implementation.
The purpose of a pseudorandom number generator is to simulate independent,
identically distributed (iid) random variables that are uniform in the unit interval.
Stringing s such numbers together in sequence then simulates iid uniform random
variables in $[0,1]^s$ for every integer $s$. Clearly, this cannot be the case for every $s$
since there are, in any event, only finitely many numbers available from any pseudorandom generator. It remains, then, to find sensible criteria for assessing the quality
of a given pseudorandom generator and to construct specific generators that satisfy
these criteria. Presumably, this will provide a measure of the evenness of the distribution of successive s-tuples in all dimensions s up to some sufficiently large integer
or some carefully selected subset of dimensions, if these can be identified.
A common quantification of quality is the discrepancy between the empirical distribution (i.e., one based on sample, rather than theoretical, averages) of a point set (such as the set of all $s$-tuples of pseudorandom numbers) and the uniform distribution over $[0,1]^s$. This same notion of discrepancy will play a fundamental role in analyzing so-called quasi-random sequences – sequences chosen solely on the basis of their uniformity properties rather than any probabilistically inspired ones.
Definition 1. For any $N$ points $u_0, u_1, \ldots, u_{N-1}$ in the $s$-dimensional unit cube $I^s = [0,1]^s$, $s \ge 1$, their discrepancy is defined by

$$D_N(u_0, u_1, \ldots, u_{N-1}) = \sup_{J} \left| E_N(J) - V(J) \right|$$

where the supremum is extended over all subintervals $J$ of $I^s$ with one vertex at the origin, $E_N(J)$ is $N^{-1}$ times the number of points among $u_0, u_1, \ldots, u_{N-1}$ that lie in $J$, and $V(J)$ is the $s$-dimensional volume of $J$.
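For $s = 1$ the supremum in Definition 1 can be evaluated in closed form; the sketch below (ours) uses the standard formula $D_N = \frac{1}{2N} + \max_i |x_{(i)} - \frac{2i-1}{2N}|$ for sorted points to compare random samples with a perfectly stratified point set:

```python
# One-dimensional (star) discrepancy via the known closed form for the
# supremum over anchored intervals [0, t), given sorted points.
import random

def star_discrepancy_1d(points):
    xs = sorted(points)
    n = len(xs)
    return 1.0 / (2 * n) + max(abs(x - (2 * i + 1) / (2 * n))
                               for i, x in enumerate(xs))

n = 1024
random_pts = [random.random() for _ in range(n)]
regular_pts = [(i + 0.5) / n for i in range(n)]   # perfectly stratified midpoints
print(f"random points: D_N = {star_discrepancy_1d(random_pts):.5f}")
print(f"midpoint grid: D_N = {star_discrepancy_1d(regular_pts):.5f}")
```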
Evidently, the first study of the concept of discrepancy is in a paper of Bergstrom
[37]. The term “discrepancy” was most likely coined by van der Corput and the first
intensive study of it appeared in a paper of van der Corput and Pisot [38].
The discrepancy provides the statistical test quantity for the $s$-dimensional Kolmogoroff test, which is a goodness-of-fit test for the empirical distribution of initial segments of a pseudorandom sequence. For fixed $N$ and theoretically random $(u_0, u_1, \ldots, u_{N-1}) \in [0,1]^s$, the distribution of $D_N(u_0, u_1, \ldots, u_{N-1})$ is known, which gives rise to various formulas for

$$\mathrm{Prob}\{D_N(u_0, u_1, \ldots, u_{N-1}) \le t\}, \qquad 0 \le t \le 1.$$
For $s = 1$ one obtains a test of one-dimensional uniformity (equidistribution in
[0,1]), while for s > 1 one obtains a test of higher dimensional uniformity
(which encompasses the independence, or lack thereof, of successive pseudorandom numbers).
As we saw earlier, the classical method for generating pseudorandom numbers is
the linear congruential method

$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod{m},$$

where $m$ is always chosen to be a large integer and $\xi_0$, $a$, and $b$ are suitably chosen
integers. Algorithms based on this simple idea have been widely studied [10, 24,
34, 40–43], especially in the past 40 years. In spite of well-founded criticism of
linear congruential generators stemming from the hyperplane structure they exhibit,
they remain the most commonly used in practice, especially in standard computer
software libraries. There are several reasons for this state of affairs:
- Linear congruential algorithms are fast. Alternative generators are essentially always considerably slower.
- Many large Monte Carlo programs that undergo periodic revision rely on reproducing output from test problems run with previous versions of the program. Introducing change in the basic pseudorandom sequence employed would destroy the reproducibility of results obtained with the earlier versions of the program.
- Those responsible for maintaining Monte Carlo programs may be unaware of the potential danger in using pseudorandom number generators that were devised many years earlier, generators that may well have outlived their usefulness.
Unfortunately, many of the “default” generators currently available in popular computer software are old and could be dangerous to use in modern applications that
require much larger sample sizes than might have been needed earlier. An increasing demand for parallel or vector pseudorandom streams of numbers has aggravated
the problem as well since these often make use of subsequences of sequences that
may not have sufficiently long periods to justify this subdivision. For example, consider the simple generator
$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod{m}$$

with

$$r_i = \xi_i/m$$

and the popular choices $m = 2^{31} - 1$, $a = 16{,}807$. The period length is $2^{31} - 2$,
which is judged to be too small for serious applications [44, 45]. Furthermore, this
generator has a lattice structure with rather large distances between the small number of hyperplanes that contain the higher dimensional s-tuples, which could easily
prove to be a problem in simulation results [46, 47].
One method for controlling the lattice structure is to combine linear recursive
generators [48]. For example, beginning with two such linear recurrences
$$x_{1,n} = (a_{1,1}\,x_{1,n-1} + \cdots + a_{1,k}\,x_{1,n-k}) \pmod{m_1}$$
$$x_{2,n} = (a_{2,1}\,x_{2,n-1} + \cdots + a_{2,k}\,x_{2,n-k}) \pmod{m_2}$$

where $k$, the $m_j$'s, and the $a_{i,j}$'s are fixed integers, one can define the pseudorandom numbers by

$$r_n = (x_{1,n}/m_1 - x_{2,n}/m_2) \pmod 1.$$

This is an example of a combined linear multiple recursive generator. At step $n$ this generator produces the $2k$-dimensional vector $(x_{1,n}, \ldots, x_{1,n-k+1}, x_{2,n}, \ldots, x_{2,n-k+1})$, whose first $k$ components lie in $\{0, 1, \ldots, m_1 - 1\}$ and whose last $k$ components lie in $\{0, 1, \ldots, m_2 - 1\}$. It is shown in [24, 44] that this generator gives good coverage of the unit hypercube $[0,1]^s$ for all dimensions $s \le k$. For dimensions higher than that there is the usual lattice structure, but parameters can be chosen that make the distance between the parallel hyperplanes of this lattice quite small. L'Ecuyer [47] recommends parameter choices that produce a generator with two main cycles of length $2^{192}$ each and whose lattice structure in dimensions up to 48 has been found to be excellent. This would seem to make it a good choice for many applications.
The lattice structure that Marsaglia [27] discovered plagues every linear congruential generator. Inversive congruential generators (introduced by Eichenauer and Lehn [48] in 1986) were designed to overcome this difficulty. By analogy with linear congruential methods, inversive congruential sequences are defined by the nonlinear congruence

$$\xi_{i+1} \equiv [a\,\bar\xi_i + b] \pmod{p}$$

with

$$\xi_i\,\bar\xi_i \equiv 1 \pmod{p}$$

and

$$r_i = \xi_i/p \in [0,1].$$

In these equations, $p$ is a prime modulus, $a$ is a multiplier, $b$ is an additive term, and $\xi_0$ is a starting value. From these definitions it follows that the $\xi_i$ take values in the set $\{0, 1, \ldots, p-1\}$. We denote this generator by ICG$(p, a, b, \xi_0)$. A key feature of the ICG with prime modulus is the absence of any lattice structure, in sharp contrast to linear congruential generators. Figure 3.1 is a plot of pairs of consecutive pseudorandom numbers $(r_n, r_{n+1})$ generated by ICG$(2^{31}-1,\ 1288490188,\ 1,\ 0)$, concentrated in a region of the unit square near the point (0.5, 0.5).
Fig. 3.1 Points generated by an inversive congruential generator

The extra inversion step dramatically reduces the effect of the parallel hyperplanes phenomenon. For example, an inversive congruential algorithm modulo $p = 2^{31} - 1$ passes a rather stringent $s$-dimensional lattice test for all dimensions $s \le 2^{30}$, whereas, for the linear congruential algorithm with this same modulus, it is difficult to guarantee a nearly optimal lattice structure for $s \ge 10$ [10]. Inversive congruential algorithms also display better behavior for a rather severe test of serial correlation. They also enjoy robustness with respect to the choice of parameters. These algorithms are promising candidates for parallelization because, unlike linear congruential generators, they do not have long-range correlation problems. The only downside to their use is that they are substantially costlier to produce (by approximately a factor of 8) than their linear congruential counterparts because of the relative costliness of computing multiplicative inverses in modular arithmetic.
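A direct transcription of the inversive recursion is straightforward; the sketch below (ours) computes the modular inverse via Fermat's little theorem and uses the ICG$(2^{31}-1, 1288490188, 1, 0)$ parameters of Fig. 3.1:

```python
# Inversive congruential generator: xi_{i+1} = (a * inverse(xi_i) + b) mod p,
# with the convention that the "inverse" of 0 is 0. Since p is prime,
# pow(x, p - 2, p) yields the multiplicative inverse of x modulo p.
def icg(p, a, b, x0):
    x = x0
    while True:
        yield x / p                                  # r_i = xi_i / p
        x_bar = pow(x, p - 2, p) if x != 0 else 0    # multiplicative inverse mod p
        x = (a * x_bar + b) % p

gen = icg(2**31 - 1, 1288490188, 1, 0)
print([round(next(gen), 6) for _ in range(5)])
```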
In addition to these linear and inversive congruential generators for uniform
pseudorandom numbers, several other classes have been studied – some quite extensively – in an attempt to circumvent the problems perceived with the classical
ones. Of course, no single generator can prove to be ideal for all applications. Any
single generator can satisfy only a finite number of randomness tests. This, along
with the finiteness of the set of computer numbers, means that a test can always be
devised that cannot be passed by a specific generator. It has been said that pseudorandom number generators are like antibiotics in that respect: no one is appropriate
for all tasks. What is needed is an arsenal of possible choices with distinct properties. If two very different generators yield the same outcome in a simulation, then
additional confidence is gained in the result.
In addition to the generator families mentioned above, there are lagged-Fibonacci, generalized feedback shift register, matrix, Tausworthe, and other classes
of generators, and combinations of these. There are also various nonlinear generators. Literally thousands of publications dealing with these topics have appeared in
print in the last 30 years. In [49], L’Ecuyer suggests the following generator families
as having the essential requirements of good theoretical support, extensive testing,
and ease of use: the Mersenne twister [50], the combined multiple recursive generators of L’Ecuyer [47], the combined linear congruential generators of L’Ecuyer
and Andres [51], and the combined Tausworthe generators of L’Ecuyer [52]. More
information can be found on the web pages:
http://www.iro.umontreal.ca/~lecuyer
http://random.mat.sbg.ac.at
http://cg.scs.carleton.ca/~luc/rng.html
http://www.robertnz.net/
3.4.2 Quasirandom Sequences
In the past 35 years or so there has been a surge in the number of publications
(see, e.g., [10, 53–58]) that recommend the use of quasi-random sequences (i.e.,
sequences more regular than pseudorandom ones) in place of pseudorandom sequences for difficult problems. There is no universally accepted, rigorous definition
of the term “quasi-random sequence,” but it has come to mean sequences with low
discrepancy (see definition of discrepancy: Definition 1 of Section 3.4.1). Indeed,
the terms quasi-random and low-discrepancy sequences are frequently used synonymously.
These quasi-Monte Carlo methods, as they are sometimes called, have become
the methods of choice recently for many problems involving financial modeling
[59–62], radiosity, and global illumination problems [54, 55, 63, 64] and other applications. Quasi-random methods offer the potential for improved asymptotic (i.e.,
for sufficiently large sample size) convergence rates when compared with pseudorandom methods and have performed even better than can easily be explained by
existing theory in many applications. Additionally, deterministic rather than statistical error analysis can be applied to their use, even though sharp error bounds are
not easily obtained. Consequently, a sizeable research effort is presently devoted to
obtaining a deeper understanding of the potential of quasi-Monte Carlo methods.
In spite of this potential, most traditional nuclear applications continue to depend upon Monte Carlo programs, such as MCNP [15], that rely on pseudorandom
sequences. An important reason for this is that a completely different (and more
complex) error analysis must be applied for simulations based on quasi-random sequences. Also, the generation of quasi-random sequences is costlier, in general, than
that for pseudorandom sequences. In return for the extra computation, quasi-random
sequences offer the prospect of accelerated convergence rates when compared with
pseudorandom Monte Carlo. There is sufficient complexity involved in deciding
which method might be best to use for a given problem, however, that no hard and
fast rule can be applied. In any case, the development of simulation methods based
on the use of quasi-random sequences represents, in our view, one of the most important lines of Monte Carlo research since 1968.
As stated above, the term low-discrepancy sequences is normally used in place
of quasi-random sequences to characterize the highly regular sequences used in
quasi-Monte Carlo implementations. The definition of discrepancy (Section 3.4.1)
provides a quantitative measure of the regularity of a sequence of s-dimensional
points. For the case $s = 1$ we are concerned with either a finite set $u_0, u_1, \ldots, u_{N-1}$ or with the first $N$ members of an infinite sequence $u_0, u_1, \ldots$ of points drawn from the unit interval [0,1]. In this case the discrepancy reduces to

$$D_N(u_0, u_1, \ldots, u_{N-1}) = \sup_{J} \left| E_N(J) - V(J) \right|,$$

where the supremum is extended over all subintervals $J$ of [0,1] with one vertex at the origin, $E_N(J)$ is $N^{-1}$ times the number of points among $u_0, u_1, \ldots, u_{N-1}$ that lie in $J$, and $V(J)$ is simply the length of the interval $J$.
A useful example of a low-discrepancy infinite sequence in [0,1) is the Van der Corput sequence [65], which is defined by

$$\phi_2(n) = \sum_{j=0}^{N} a_j(n)\, 2^{-j-1}, \qquad (3.1)$$

where

$$n = \sum_{j=0}^{N} a_j(n)\, 2^{j}. \qquad (3.2)$$

These formulas produce $\phi_2(1) = 1/2$, $\phi_2(2) = 1/4$, $\phi_2(3) = 3/4$, $\phi_2(4) = 1/8$, $\phi_2(5) = 5/8, \ldots$ and the numbers $\{\phi_2(n)\}_{n=1}^{\infty}$ systematically run through the multiples of $2^{-k}$ without duplicating any that arose earlier. Such numbers are much more uniformly distributed in the unit interval than are pseudorandom numbers.

In similar fashion, one can define the radical inverse function for any number base $b$ by

$$\phi_b(n) = \sum_{j=0}^{N} a_j(n)\, b^{-j-1}; \qquad (3.3)$$

it enjoys properties very similar to the $b = 2$ case when $b$ is a prime larger than 2. That is, $\{\phi_b(n)\}_{n=1}^{\infty}$ systematically runs through the multiples of $b^{-k}$ without duplication for any prime $b$.
The Halton sequence [66] is an infinite, $s$-dimensional, low-discrepancy sequence defined by $\{\phi_{b_1}(n), \phi_{b_2}(n), \ldots, \phi_{b_s}(n)\}$, where $b_1, b_2, \ldots, b_s$ are relatively prime in pairs (e.g., the first $s$ primes). It is useful for generating very uniform $s$-dimensional vectors, as when random walks in an $s$-dimensional phase space are required. In Fig. 3.2, a visual comparison is made of 2,000 pseudorandom pairs (left) with 2,000 Halton pairs (right).
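The radical inverse function of Eq. 3.3 and the Halton sequence built on it take only a few lines; the following sketch (ours) reproduces the van der Corput values quoted above:

```python
# Radical inverse phi_b(n): reflect the base-b digits of n about the radix
# point. The Halton point for index n uses one base per coordinate.
def radical_inverse(n, b):
    phi, scale = 0.0, 1.0 / b
    while n > 0:
        n, digit = divmod(n, b)     # digits a_j(n) of n in base b
        phi += digit * scale
        scale /= b
    return phi

def halton(n, bases=(2, 3)):
    return tuple(radical_inverse(n, b) for b in bases)

# phi_2 reproduces the quoted van der Corput values 1/2, 1/4, 3/4, 1/8, 5/8:
print([radical_inverse(n, 2) for n in range(1, 6)])
print([halton(n) for n in range(1, 4)])
```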
Another family of low-discrepancy sequences, called lattices, has been especially
useful for integrating periodic functions. Their ancestor is the number-theoretic
method of good lattice points, developed by Korobov [67] and Hlawka [68] for
the approximate evaluation of integrals over $I^s = [0,1]^s$ under the assumption
that the integrand is 1-periodic in each variable. Lattice methods, or lattice rules,
generalize and extend this early work making use of algebraic, rather than numbertheoretic, principles and techniques. Excellent references for lattice rules are the
books [10, 69, 70].
Many other low-discrepancy sequences have been used in various quasi-Monte
Carlo simulations. The reader is referred to [10] for a more thorough treatment of
this general topic.
Fig. 3.2 Visual comparison of pseudorandom (left) and quasi-random (right) sequences
3.4.3 Hybrid Sequences
Hybrid sequences are meant to combine the best features of both pseudorandom
(convergence rate is independent of the problem dimension) and quasi-random
(asymptotic rate of convergence is greater than $N^{-1/2}$ but weakly dependent on
dimension). Ideas for generating hybrid sequences rely, in general, on combining both random and quasi-random elements in a single sequence. For example,
randomly scrambling the elements of a low-discrepancy sequence or restricting
the use of the low-discrepancy component to a lower dimensional portion of the
problem and filling out the remaining dimensions (“padding”) with pseudorandom
sequence elements can be effective strategies. Thus, Spanier [58] introduces both
a “scrambled” and a “mixed” sequence based on these ideas, Owen [71] describes
a method for scrambling certain low-discrepancy sequences called nets [10], Faure
[72] describes a method for scrambling the Halton sequence to achieve lowered discrepancy, Wang and Hickernell [73] randomize Halton sequences, and Moskowitz
[74], Coulibaly and Lecot [75], and Morokoff and Caflisch [56] present various
methods for renumbering the components of a low-discrepancy sequence – in effect,
introducing randomness somewhat differently into the sequence. Okten [76, 77] has
introduced a generalization of Spanier’s mixed sequence and Moskowitz’s renumbering method and also indicated how error estimation can be performed when using
such sequences. Because of its generality, we describe Okten’s ideas briefly here.
In [77], Okten provides the following:

Definition 2. Let $\Delta = \{i_1, \ldots, i_d\}$ $(i_1 < \cdots < i_d)$ be a subset of the index set $\{1, \ldots, s\}$. For a given $d$-dimensional sequence $\{q_n\}_{n=1}^{\infty}$, a mixed $(s, d)$ sequence is an $s$-dimensional sequence $\{m_n\}_{n=1}^{\infty}$ $(s \ge d)$ such that $m_n^{i_k} = q_n^k$, $k = 1, \ldots, d$, and all other components of $m_n$ (i.e., $m_n^i$ for $i \in \{1, \ldots, s\} \setminus \Delta$) come from independent realizations of a random variable uniformly distributed on $[0,1]^{s-d}$.

This definition is useful inasmuch as it specializes to a number of interesting sequences introduced by other authors earlier. For example, Spanier's mixed sequence corresponds to the choices $\Delta = \{1, \ldots, d\}$ with $s = \infty$; i.e., it is a mixed $(\infty, d)$ sequence that can be used for either high-dimensional integration or random walk problems. Also, the continuation method introduced in [78] (see also [71]) amounts to using a mixed $(s, d)$ sequence with $\Delta = \{s-d+1, s-d+2, \ldots, s\}$. Furthermore, the Spanier mixed sequence obviously specializes to an ordinary pseudorandom sequence when $d = 0$ and $\Delta$ is empty, while if $\Delta = \{1, \ldots, d\}$ and $s = d$, the resulting mixed $(d, d)$ sequence is clearly completely deterministic with no random components at all.
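A minimal sketch (ours) of Definition 2, with the Halton sequence as the quasi-random ingredient and pseudorandom padding for the components outside $\Delta$:

```python
# Mixed (s, d) sequence: Halton components in the 1-based coordinates listed
# in `delta`, independent pseudorandom numbers everywhere else.
import random

def _phi(n, b):                       # radical inverse (Eq. 3.3)
    phi, scale = 0.0, 1.0 / b
    while n:
        n, d = divmod(n, b)
        phi += d * scale
        scale /= b
    return phi

def mixed_sequence(s, delta, bases):
    """Yield mixed (s, d) points per Definition 2 (our reading of it)."""
    n = 1
    while True:
        point = [random.random() for _ in range(s)]   # pseudorandom padding
        for k, i in enumerate(delta):
            point[i - 1] = _phi(n, bases[k])          # quasi-random components
        yield tuple(point)
        n += 1

gen = mixed_sequence(s=5, delta=(1, 2), bases=(2, 3))
print(next(gen))
```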
Okten [79] also introduces a new family of hybrid sequences obtained by random
sampling from a universe consisting of low-discrepancy sequences of the appropriate dimension for the problem. This idea permits conventional statistical analyses to be performed on the resulting estimates, and it therefore attempts to
overcome one of the major drawbacks of using low-discrepancy sequences: the unavailability of an effective and convenient error analysis.
Hybrid sequences are designed to produce good results for general problems with
dimensions s that are too large for pure low-discrepancy sequences to be effective.
The dimension that defines this threshold depends upon the details of the problem.
For example, a number of problems arising in financial modeling involve integrations over 360 dimensions and have been successfully accomplished with purely
low-discrepancy sequences, whereas it is not difficult to compose integrands of only
20 variables for which the use of pseudorandom sequences provides better results
than when a 20-dimensional low-discrepancy sequence is used. This disparity exists because, in the 360-dimensional case, the integrand function does not depend
strongly on all of its 360 variables: it is much more affected by fluctuations in only
a handful of the variables while behaving very smoothly with respect to the others. In other words, the partial derivatives of the integrand are all quite large in the
case of the 20-dimensional integrand function while in the 360-dimensional problem cited, only a few of the partial derivatives are large and the remaining ones are
much smaller.
Hybrid sequences should, therefore, be useful for s-dimensional integration of
arbitrary functions or for random walk problems whose Neumann series converge
rather slowly. But one should not overlook the possibility to which we alluded in
the previous paragraph that for certain integrands or random walk problems, special features of that problem might suggest the use of special sequences designed
to take advantage of additional information about the problem. For example, if the
s-dimensional integrand is, in fact, independent of several of the s variables, the
properties of the hybrid sequence with respect to these variables becomes much
less important. More generally, if an s-dimensional integrand exhibits diminished
dependence on some subset of the variables, it makes sense to design a hybrid
sequence that takes advantage of that information. Similar special considerations
would apply in the case of certain random walk problems also.
Based on this sort of reasoning, quite recently some authors have focused attention on restricting the class of integrands treated by each method in an attempt to
explain why some sequences perform surprisingly well for certain problems. Interest in pursuing this point of view might have been piqued by provocative results
reported when purely quasi-random sequences were used to estimate some very
high-dimensional integrals arising in financial applications [80, 81]. This has led to
a rash of publications in which the sensitivity of an integrand function with respect
to its independent variables and/or parameters is studied [82–87].
3.4.4 State of the Art
The idea of replacing pseudorandom sequences in Monte Carlo programs by sequences that are more regular, although correlated, has had a profound effect on the
field in the past 40 years. Although use of pseudorandom sequences still dominates
the more traditional applications involving neutron and charged particle transport,
quasi-random sequences are used increasingly in the newer applications areas. Thus,
low-discrepancy or hybrid sequences are used almost routinely for financial modeling, global illumination, and other problems for which Monte Carlo methods have
recently been shown to be useful. This trend has also dramatically influenced the
amount and kind of research needed to analyze errors and to develop error reduction strategies, which is described in Section 3.5.
3.5 Error Analysis
3.5.1 The Pseudorandom Case
Throughout the first 20 or more years (1942–1962) of the exciting period during
which digital computers and Monte Carlo methods developed rapidly, only minimal
information was available from most of the computer programs employing simulation. At that time, the primary output consisted of one or more sample means
$$m_N = \frac{1}{N} \sum_{i=1}^{N} \xi_i$$

of an estimating random variable $\xi$, together with estimates of the sample standard deviation, $s$:

$$s = \left\{ \frac{N}{N-1} \left[ \frac{1}{N} \sum_{i=1}^{N} \xi_i^2 - \left( \frac{1}{N} \sum_{i=1}^{N} \xi_i \right)^2 \right] \right\}^{1/2}.$$
The estimated variance, $s^2/N$, of the sample mean, $m_N$, then establishes the very slow $O(N^{-1/2})$ convergence of the Monte Carlo error (based on the standard deviation $\sqrt{s^2/N}$) to zero for pseudorandom Monte Carlo implementations. The central limit theorem states that the sample mean $m_N$ is approximately normally distributed for large sample sizes $N$, that this normal distribution has as its true mean the expected value of the random variable $\xi$, and that its variance is $\sigma^2/N$, where $\sigma^2$ is the population variance. It is the assumption of asymptotic normality that permits the derivation of various confidence intervals that, in turn, provide the foundation for assigning precision to each Monte Carlo result.
Quite often it is the estimate of the relative error
$$R = \left[ s/N^{1/2} \right] / m_N \qquad (3.4)$$
that is supplied with each estimated mean. It is common to use the size of R as a
mechanism for interpreting the quality of the Monte Carlo estimates. Small values
of $R$ indicate high precision, whereas values near 1 suggest sample means that
are suspect. The use of higher-order statistical quantities and other statistical tests
has been recommended in conjunction with some Monte Carlo programs. For example, the program MCNP [15] estimates, in addition to tally means and variances,
the variance of the variance (VOV), and the users’ manual contains guidelines for
interpreting these measures of precision, along with a number of other test quantities
provided routinely with MCNP output.
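The quantities just described are simple to compute; the sketch below (ours, not drawn from MCNP) reports $m_N$, $s$, and the relative error $R$ of Eq. 3.4 for the escape-time estimator used earlier in this chapter:

```python
# Sample mean, sample standard deviation, relative error (Eq. 3.4), and a
# 95% confidence interval for the simple escape-time random variable X.
import math, random

def escape_steps():
    pos, steps = 0.0, 0
    while pos <= 1.0:
        pos += random.random()
        steps += 1
    return steps

N = 10_000
xs = [escape_steps() for _ in range(N)]
m_N = sum(xs) / N
s = math.sqrt((N / (N - 1)) * (sum(x * x for x in xs) / N - m_N ** 2))
R = (s / math.sqrt(N)) / m_N           # relative error, Eq. 3.4
print(f"m_N = {m_N:.4f}, s = {s:.4f}, R = {R:.5f}")
print(f"95% confidence interval: {m_N:.4f} +/- {1.96 * s / math.sqrt(N):.4f}")
```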
Partly driven by the frustration caused by the slowness of the $O(N^{-1/2})$ convergence rate of pseudorandom Monte Carlo, the development of Monte Carlo methods that abandoned the probability theory model and the central limit theorem for error analysis was undertaken. Of course, this necessitated the construction of a deterministic model and error analysis. Radically different error analyses must be applied when simulations are based on quasi-random or hybrid generating sequences.
3.5.2 The Quasi-random Case
When quasi-random sequences are used as generators, deterministic error bounds
are available. The key ingredients of the quasi-Monte Carlo theory can be illustrated
using simple one-dimensional integration.
Definition 3. For a given sequence $Q = \{x_1, x_2, \ldots\} \subset [0,1)$ and any $t$, $0 \le t \le 1$, first define a counting function

$$A([0,t);\, N;\, Q) = \text{number of } x_i,\ 1 \le i \le N,\ \text{with } x_i \in [0,t). \qquad (3.5)$$

Using this notion, the discrepancy of the sequence $Q$ becomes

$$D_N(Q) = \sup_{0 \le t \le 1} \left| L_N([0,t)) \right| \qquad (3.6)$$

where

$$L_N([0,t)) = \frac{A([0,t);\, N;\, Q)}{N} - t. \qquad (3.7)$$

The discrepancy $D_N(Q)$ plays a key role in bounding the error that results when estimating the theoretical mean of a random variable by its sample average.

For finite-dimensional integrals, quasi-Monte Carlo methods replace probability with asymptotic frequency:

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i) = \int_{I^s} f(x)\,dx \qquad (3.8)$$

for a reasonable class of $f$. Equation 3.8 means, then, that the sequence $x_1, x_2, \ldots$ produces convergent sums for the estimation of integrals of functions in the given class. Such sequences are said to be uniformly distributed in $I^s$, a condition
that clearly has nothing to do with randomness. For integral equations, the analogous condition is

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \xi(\omega_i) = \int_{\Omega} \xi\, d\mu \qquad (3.9)$$

and $\xi$ (which here ordinarily represents an estimator of a weighted integral $\int g(x)\Psi(x)\,dx$ of the solution $\Psi(x)$ of the integral equation) must satisfy mild smoothness restrictions (and Eq. 3.9 then defines the $\mu$-uniformity of $\omega_1, \omega_2, \ldots, \omega_N$ in $\Omega$). The idea of $\mu$-uniformity was introduced by Chelson [88], who showed that replacing pseudorandom sequences by appropriately chosen uniformly distributed sequences produces $\mu$-uniformity in $\Omega$. This is the critical result needed to ensure that quasi-random sequences can be used to provide asymptotically valid (as $N \to \infty$) estimating sums for solutions of integral equations. Chelson's construction, modified slightly in [89], is simply to sample the usual one-dimensional conditional probability density functions derived from the source and kernel of the transport equation by using low-discrepancy sequences, rather than pseudorandom ones. However, if one were simply to use a one-dimensional low-discrepancy sequence, such as the van der Corput sequence, for all of the decisions needed to generate the random walks $\omega_1, \omega_2, \ldots$, a little thought shows that the random walks would not necessarily satisfy the Markov property. In other words, in switching from pseudorandom sequences that are approximately uniformly and
independently distributed in the unit interval to low-discrepancy sequences that are
very uniformly distributed but obviously serially “correlated,” the required condition Eq. 3.9 may be lost. The way around this predicament is to use sequences
that are uniform (in this new, deterministic way) in a unit cube $I^s$ of sufficiently
high dimension s to suffice for generating all collisions of every random walk.
The fact that there is no a priori upper bound for such a dimension s in the case of
integral equations means that the sequence used must be uniform over the infinite-dimensional unit cube.
example, and this is the sequence that Chelson employed in [88].
These ideas can be illustrated using the simple, one-dimensional random walk
problem introduced in Section 3.1. We first generated ten random walks using a
conventional pseudorandom number generator to make the required decisions and
computed the sample mean, $m_{10}^{(1)}$. The results of that simulation are listed in the table below:

Particle  1  2  3  4  5  6  7  8  9  10
X         5  2  2  2  5  5  3  3  6  3

$$m_{10}^{(1)} = 3.6$$
$$|m_{10}^{(1)} - E[X]| = |3.6 - e| \approx 0.9$$
$$\frac{\sigma}{\sqrt{10}} = \frac{0.87509}{3.162} \approx 0.28.$$

Here, $\sigma = \sqrt{\sigma^2[X]}$ is the population standard deviation. These results show that we had rather bad luck with our sample of ten particles. If we want a 95% probability of a relative error no worse than 1/100, using simple probabilistic arguments we would need to take $n \ge 3.84\,(100)^2\,(0.77/2.72)^2 \approx 3{,}100$ random walk samples.
Next, we used the Van der Corput sequence instead of pseudorandom numbers
to select step sizes for the random walks. That produces, for $n = 10$ particles:

Particle  1  2  3  4  5  6  7  8  9  10
X         3  3  3  2  3  3  2  3  3  2

$$m_{10}^{(2)} = 2.7$$
$$|m_{10}^{(2)} - E[X]| = |2.7 - e| \approx 0.0183.$$
It appears as though we have improved our estimate of e. However, continuing
the process (i.e., incorporating the results of additional particle histories generated
in this way) does not improve the estimate further. In fact, this particular quasi-random sequence produces estimates that converge (rapidly!) to $2\tfrac{2}{3}$ rather than to $e$. Obviously, great care is needed in implementing quasi-Monte Carlo methods,
even for such simple problems. The difficulty here is that the correlation which
is intrinsic in the Van der Corput sequence has defeated the Markov property in
the execution of particle random walks and has, therefore, not provided a faithful
simulation of the underlying physical process. If we are careful to restore a kind
of statistical independence in using quasi-random numbers, which is accomplished
here by using components of a higher dimensional quasi-random vector sequence
(in fact, the Halton sequence can again be used) to generate the individual steps
of each random walk, we can indeed improve upon the use of pseudorandom sequences.
When this was executed for model problem 1, we found:
Particle  1  2  3  4  5  6  7  8  9  10
X         3  3  3  3  2  7  2  3  3  3

$$m_{10}^{(3)} = 3.2$$
$$|m_{10}^{(3)} - E[X]| = |3.2 - e| \approx 0.4818.$$
This provides a better estimate than the one obtained using pseudorandom numbers,
and the advantage in employing the quasi-random sequence in place of the pseudorandom one can be shown to increase as the number of samples grows.
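The three experiments just described can be reproduced in a few lines; the sketch below (ours) contrasts pseudorandom steps, a naively shared one-dimensional van der Corput stream, and one Halton dimension per step of a walk:

```python
# Estimating E[X] = e three ways: (a) pseudorandom steps; (b) the van der
# Corput sequence used naively as one shared stream, which defeats the
# Markov property and converges to the wrong limit; (c) one Halton base
# per step of each walk, restoring independence between decisions.
import random

def phi(n, b):                          # radical inverse, base b (Eq. 3.3)
    v, scale = 0.0, 1.0 / b
    while n:
        n, d = divmod(n, b)
        v += d * scale
        scale /= b
    return v

PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
          31, 37, 41, 43, 47, 53, 59, 61, 67, 71)

def walk(steps):
    """Number of steps until the accumulated position exceeds 1."""
    pos, k = 0.0, 0
    while pos <= 1.0:
        pos += next(steps)
        k += 1
    return k

N = 10_000
vdc_stream = (phi(n, 2) for n in range(1, 10**7))   # one shared 1-D stream
estimates = {
    "pseudorandom": sum(walk(iter(random.random, None)) for _ in range(N)) / N,
    "van der Corput (naive)": sum(walk(vdc_stream) for _ in range(N)) / N,
    "Halton (one base per step)":
        sum(walk(iter(phi(n, b) for b in PRIMES)) for n in range(1, N + 1)) / N,
}
for name, m in estimates.items():
    print(f"{name:28s} {m:.4f}   (e = 2.71828, 2 2/3 = 2.66667)")
```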
The basis for analyzing error when using quasi-Monte Carlo methods is the
Koksma–Hlawka inequality [90, 91]. It provides a method for bounding the difference between an integral and an average of integrand values. In one variable, this
result takes the following form:
Theorem 1. Let $Q = \{x_1, x_2, \ldots\} \subset [0,1)$ and let $f$ be a function of bounded variation on [0,1]. Then

$$|\delta_N(f)| = \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_0^1 f(t)\,dt \right| \le D_N(Q)\,V(f) \qquad (3.10)$$

where $D_N(Q)$ is the discrepancy of $Q$ and $V(f)$ is the total variation of $f$.
The Koksma–Hlawka inequality has been extended to functions of many variables and provides a rigorous (deterministic) upper bound for the difference between
a Monte Carlo-type sum and an integral of a function $f$ of bounded variation. It can be applied to any set of points $x_1, \ldots, x_N$ that lie in the domain of the function $f$.
This idea of bounding such differences by a product of two factors, one of which
describes the “smoothness” of the integrand, and the other describes the uniformity
of the set of points used in the evaluation, has been generalized by making use of the
theory of reproducing kernel Hilbert spaces. In this more general setting, the space
of integrands is regarded as a reproducing kernel Hilbert space and the resulting
inequality makes use of the norm in this space to characterize the smoothness of the
integrand, while the factor that replaces the discrepancy DN measures the uniformity of the point set in a way that generalizes the classical definition of discrepancy.
The interested reader should examine [93] and references cited therein.
A Koksma–Hlawka-type inequality has been established for the transport
equation [88, 89, 93], which is shown in these references to be an infinite-dimensional extension of the finite-dimensional quadrature problem. That is,

$$|\delta_N(\xi)| \le \hat{C}\, D_N \qquad (3.11)$$

where $\delta_N(\xi) \equiv \frac{1}{N} \sum_{i=1}^{N} \xi(\omega_i) - \int_{\Omega} \xi\, d\mu$. The constant $\hat{C}$ in Eq. 3.11 measures the variation of $\xi$; reductions in $\hat{C}$ can be accomplished by importance sampling and similar conventional variance reduction mechanisms, as described in [88, 93]. But improvements in the rate of convergence as $N \to \infty$ based on the Koksma–Hlawka inequality can be obtained only as a result of the rate of decrease of the factor $D_N$.
The key question then becomes: How rapidly can $D_N$ converge to 0 for arbitrary
sequences or point sets?
The answer is widely believed to be $D_N = O[(\log N)^S/N]$, where $S$ is the "effective" dimension of the problem (though slightly more rapid convergence cannot yet be ruled out theoretically). Integral equations are really infinite-dimensional problems in the sense that, in general, no a priori upper bound exists for the number of steps in a random walk. However, a finite effective dimension of such a transport problem might be given by the product of the average number of steps and $\dim(\Gamma)$; here $\Gamma$ is the physical phase space for the underlying transport model. In this case, the effective dimension is essentially the average number of decisions needed to simulate a random walk in the transport process being modeled.

Sequences in $s$ dimensions whose discrepancies are $O[(\log N)^S/N]$ are the ones typically used in quasi-Monte Carlo implementations, and these are often referred to as low-discrepancy sequences. A very general family of low-discrepancy sequences are the $(t,s)$-sequences [10, 94, 95]. These sequences generalize to arbitrary base previous constructions for base 2 [96] and any prime base [97, 98], and are generally believed to possess the lowest discrepancies of any known sequences. Accordingly, these sequences are in wide use for quasi-Monte Carlo implementations.
Let S denote either the actual dimension of a multidimensional integral to be
estimated or the effective dimension of a discrete or continuous random walk problem to be simulated, as defined, for example, above. For fixed $S$, quasi-Monte Carlo methods will be superior to pseudorandom methods as $N \to \infty$ because of the inequality (3.11) and because $[(\log N)^S/N]$ becomes much smaller than $N^{-1/2}$ in this limit. However, for practical values of $N$ and moderate values of $S$, such a comparison may well favor the pseudorandom convergence rate. In fact, when $S$ is only 3, $N$ must be larger than $10^7$ for quasi-random sampling to be expected to
improve upon pseudorandom sampling based on the error upper bounds expressed
in (3.10) or (3.11). There is also some evidence in support of the conjecture that
N must be exponential in S before the advantages of quasi-Monte Carlo methods
over conventional Monte Carlo using pseudorandom sequences can be realized [56].
This means that, for all practical purposes, S cannot be too large. It is when S is too
large that hybrid sequences have often been used successfully.
Another problem with the use of the Koksma–Hlawka inequality to bound quasi-Monte Carlo errors is that the individual terms $V(f)$, $D_N$ appearing on the right side of (3.10) (or the terms $\hat{C}$, $D_N$ on the right side of (3.11)) are difficult to estimate.
This creates another argument in favor of hybrid sequences, especially those designed with components of randomness that enable a statistical analysis of the error,
bypassing the Koksma–Hlawka inequality.
Even with these caveats against the routine use of purely quasi-random sequences
for estimating high-dimensional integrals or solutions of transport problems, they
have been amazingly successful in certain situations where the Koksma–Hlawka bounds suggest they should not be. Examples of this sort, arising, for example, in
stochastic financial modeling, have inspired a good deal of research aimed at obtaining a deeper understanding of the pros and cons of the use of quasi-random
sequences.
Figure 3.3 compares the theoretical asymptotic convergence rates associated with pseudorandom Monte Carlo and quasi-random Monte Carlo implemented in two dimensions $(S = 2)$. The line with slope $-1$ is shown since it provides a theoretical optimum (at least asymptotically) for quasi-Monte Carlo implementations whose error analysis is based on the Koksma–Hlawka inequality. While this graph is instructive for values of the sample size $N$ that are sufficiently large, for practical values of $N$ there is no assurance that quasi-random errors will be smaller than pseudorandom ones, as discussed above. Notice that the quasi-random $(S = 2)$ and pseudorandom lines cross for modest values of $N$.

Fig. 3.3 Comparison of pseudorandom and quasi-random $(S = 1, 2)$ convergence rates (error vs. $N$ = number of random samples)
We have alluded earlier to the fact that good lattice point methods, or their
extensions to lattice rules, can provide extremely good results (i.e., estimates of
finite-dimensional integrals) when the integrand is a periodic function of each of
its variables. In fact, the unusual effectiveness of the simple trapezoidal rule when
applied to periodic functions, which has been well known for some time, is an
illustration of this phenomenon. The explanation lies in the fact that the analysis
of the error made in applying lattice rules to periodic functions is quite different
from both of the error analyses discussed so far: statistically based for pseudorandom Monte Carlo or for some hybrid methods, and based on the Koksma–Hlawka
inequality for quasi-Monte Carlo methods. In fact, the usual error analysis for lattice
methods applied to periodic integrands makes use of number-theoretically based estimates of certain exponential sums. The interested reader might consult Chapter 5
of [10]. Any elaboration here would take us much too far afield.
3.5.3 The Hybrid Case
We have just seen that a major drawback to the use of purely quasi-random methods – especially for random walk problems – is the lack of an effective and low-cost
error analysis when it is used. By contrast, use of the sample standard deviation or
variance in conventional Monte Carlo applications as a measure of uncertainty in
the output is simple and inexpensive. Quite recently, Halton [99] has advocated analyzing low-discrepancy sequences as though they were independent and identically
distributed uniform sequences, but this device entails generating low-discrepancy
sequences in dimensions much higher than actually needed for the simulation itself,
and there are some disadvantages in doing this. A similar error analysis has been
suggested by Okten [100] when employing hybrid sequences. Okten’s idea is to
define a universe consisting of a number of different low-discrepancy sequences
of the appropriate dimension for the problem under study (e.g., of dimension $s$ when estimating $s$-dimensional integrals, or infinite-dimensional when solving integral equations). One then draws such a sequence at random from this universe
and uses it to perform the simulation. Repetition of this process a finite number of
times then permits conventional statistical analysis to be applied rigorously to the
resulting estimates. Okten has presented evidence [77, 79, 100] that this can be an
effective strategy.
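A minimal sketch of the replication idea follows; rather than Okten's universe of sequences, it uses the simpler device of randomly shifting a single Halton point set modulo 1 (the Cranley–Patterson rotation), which likewise yields independent, identically distributed replicates to which ordinary confidence intervals apply. The integrand, sample sizes, and number of replicates are arbitrary choices of ours:

import math, random

def radical_inverse(n, base):
    q, denom = 0.0, 1.0
    while n:
        n, r = divmod(n, base)
        denom *= base
        q += r / denom
    return q

def shifted_estimate(N, shift, f):
    # one replicate: two-dimensional Halton points shifted modulo 1
    total = 0.0
    for i in range(1, N + 1):
        x = (radical_inverse(i, 2) + shift[0]) % 1.0
        y = (radical_inverse(i, 3) + shift[1]) % 1.0
        total += f(x, y)
    return total / N

random.seed(1)
f = lambda x, y: math.exp(x * y)                  # arbitrary smooth integrand
reps = [shifted_estimate(4096, (random.random(), random.random()), f)
        for _ in range(20)]
mean = sum(reps) / len(reps)
var = sum((r - mean) ** 2 for r in reps) / (len(reps) - 1)
print(mean, "+/-", 1.96 * math.sqrt(var / len(reps)))  # conventional 95% interval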
3.5.4 Current State of the Art
Confidence interval theory is routinely applied to conventional pseudorandom
Monte Carlo implementations. For quasi-Monte Carlo implementations, there are no
known efficient ways of estimating the error bounds that result from application of
the Koksma–Hlawka inequality. Certain hybrid sequences – for example, randomized low-discrepancy sequences – enable a conventional statistical error analysis
to be used, and this seems to be one of the major reasons for employing hybrid
sequences for difficult Monte Carlo simulations. Use of such an error analysis circumvents the thorny problem of estimating variations and discrepancies in order
to obtain rigorous deterministic error bounds for quasi-random methods that have
no random component at all. Many open questions remain concerning how best
to utilize low-discrepancy and/or hybrid-generating sequences, and the question of
effective error estimation for these methods is far from settled.
3.6 Error Reduction
3.6.1 Introduction
The period following World War II was marked by a shift in emphasis from weapons
development to peacetime applications of simulation methods. The design of nuclear reactors for power generation, both for use in naval propulsion systems and
for commercial application, focused attention anew on Monte Carlo methods. This
interest, in turn, accelerated the development of such methods, especially inasmuch
as the limitations of deterministic solutions of the transport equation, or various approximations to that equation, became better understood in the 1950s and 1960s.
For example, while formulation in terms of the transport equation was frequently
replaced by the simpler diffusion equation (which was then often solved by finite
difference methods), the inaccuracies inherent in this approach could not be assessed
without having “benchmark” transport calculations available.
The early development of Monte Carlo methods concentrated on the solution of
fairly specific families of reactor physics problems and the development of techniques designed to improve the efficiency of their solution. For example, the need to
deal with shielding problems – for which most of the useful information is carried
in analog histories that occur only very rarely – gave rise to the study of importance
sampling [7, 8, 11, 101–106]. In the review [13], considerable attention was paid to
the use of importance sampling to achieve variance reduction, especially in problems involving deep penetration.
Provided that sufficiently many random samples could be processed in early
pseudorandom Monte Carlo codes, statistical fluctuations could be expected to be
low enough to meet critical design criteria (at least most of the time!), albeit with
sizeable computing expenditures. Eventually, however, the demands for increased
accuracy and speed – especially in the naval reactors program and in the rapidly
expanding peacetime applications – inspired the quest for additional clever error
reduction strategies.
In spite of the gains made in understanding how to use Monte Carlo methods to
solve an increasing number of important physical problems, the method seems to
have been used only as a last resort during the early period following World War II.
For reactor calculations generally, diffusion theory was dominant. Computations
based on finite-difference approximations to the diffusion equation were routinely
used for the design of nuclear reactors. There were, no doubt, several valid reasons
for this. Monte Carlo calculations required large amounts of computer time and produced only statistical estimates of a few quantities at a time. While diffusion theory
produced only an approximation to the solution of the transport equation of uncertain quality (and a rigorous examination of the approximation error was impractical), it provided at least qualitative knowledge of the particle behavior everywhere
in the phase space – a distinct advantage. Another factor might have been the growing realization that Monte Carlo simulations making use of importance sampling
could produce anomalous results. That is, certain rare occurrences of high-weight
particles could produce an “effective bias” that was not well understood at the time.
For pseudorandom Monte Carlo implementations, error reduction is usually equated with variance reduction because the $N$-sample error declines at the rate $\sigma N^{-1/2}$, where $\sigma$ is the standard deviation of the estimating random variable. The analogous issue for quasi-Monte Carlo methods is to seek reductions in the constant $\hat{C}$ appearing in the inequality (3.11). In both cases, it is the fluctuations in the sample-to-sample values of the estimator that determine how large the Monte Carlo error is for any sample size. It is natural to inquire whether variance reduction methods developed for pseudorandom Monte Carlo can be directly applied to quasi-Monte Carlo implementations, as one might perhaps expect. There has not been a
systematic effort to convert variance reduction schemes for pseudorandom Monte
Carlo to error reduction schemes for quasi-random Monte Carlo, but we will report
on what seems to be known in this regard with respect to each strategy discussed in
this section.
As the number and variety of error reduction strategies have grown, it has become
increasingly difficult to organize them all into a short list of “types.” In the first part
of Section 3.6.2, however, we will mainly rely on the classification schemes used
by earlier authors [8, 11, 107] for describing variance reduction strategies such as
control variates, importance sampling, stratified sampling, and the use of expected
values. We will complete this section by discussing a few other error reduction
strategies that seemed sufficiently important to treat here.
Since we cannot hope for an exhaustive treatment of the subject of error reduction, we will content ourselves with tracing what we think are the most important
developments since the publication of [13], especially as they bear on
transport applications.
3.6.2 Control Variates
The classical method of control variates has been well known in the statistical community for many years. In that context, the method has found application in the
design of certain types of sample surveys. For instance, suppose one is interested
in estimating the expected value $y$ of a random variable $\xi$ and one can observe the
outcomes of a control variable $\eta$ for which the expected value $x$ is known. Then $y$ can be estimated as $\bar{\xi} + x - \bar{\eta}$, where $\bar{\xi}$ and $\bar{\eta}$ are the sample means of $\xi$ and $\eta$. If $\xi$ and $\eta$ are highly positively correlated, this method will be much more effective than if the simple estimate $\bar{\xi}$ were used for $y$.
When formulated as a technique for estimating integrals by Monte Carlo, the idea is to represent the integrand $f$ as the sum of a function $\varphi$ that mimics the behavior of $f$ but whose integral is known (or is easier to estimate than that of $f$) and a remainder function $f - \varphi$. Then Monte Carlo is used only to estimate the integral of $f - \varphi$, and a substantial reduction in variance or variation can result.
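As a concrete, hypothetical illustration of the integration form of the method, the sketch below uses $\varphi(x) = 1 + x$, whose integral over $[0, 1]$ is $3/2$, as a control for $f(x) = e^x$; Monte Carlo is applied only to the small remainder $f - \varphi$:

import math, random

random.seed(2)
f = lambda x: math.exp(x)       # integrand; exact integral on [0,1] is e - 1
phi = lambda x: 1.0 + x         # control that mimics f; known integral 3/2
PHI = 1.5

N = 100_000
xs = [random.random() for _ in range(N)]
plain = sum(f(x) for x in xs) / N
control = PHI + sum(f(x) - phi(x) for x in xs) / N   # Monte Carlo only on f - phi
print(plain, control, math.e - 1)

Because $f - \varphi$ varies far less than $f$ itself, the control-variate estimate fluctuates much less from run to run.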
A similar idea finds use in transport applications, based on the linearity of the transport operator:
$$L(F_\alpha + F_\beta) = Q_\alpha + Q_\beta, \qquad (3.12)$$
where $L(F_\alpha) = Q_\alpha$, $L(F_\beta) = Q_\beta$, and $L$ is the transport operator. One way to make use of linearity, if the problem being studied is described by the equation $L(F_\alpha) = Q_\alpha$, is to identify a source $Q_\beta$ such that the sum problem (3.12) has a simple, analytically known solution. For example, in one-energy transport problems with an isotropic source $Q_\alpha$ one can simply define the source $Q_\beta$ to be isotropic and of sufficient strength in each region so that the ratio $R$ of source strength to absorption cross section is constant across the entire geometry. This implies that the flux is also isotropic and constant; in fact, the scalar flux is also equal to $R$. The method then finds use in converting the “$\alpha$-problem” to the “$\beta$-problem”; solving the $\beta$-problem by Monte Carlo might provide significant advantages over simulating the $\alpha$-problem directly.
In general terms, this technique was described in [11] as the superposition principle. It was used by these authors to estimate both thermal flux averages and
resonance escape probabilities. In both of these applications, the method was responsible for large gains in efficiency and accuracy. For the resonance escape
application, this was especially true when the escape probability $p_{\mathrm{res}}$ is close to 1, which is the situation that presents the greatest computational challenge when conventional simulation of the “$\alpha$-problem” is employed. The method makes use of an
analytic calculation of the narrow resonance approximation as a control [108].
Although it is usually identified by a different name, the method of antithetic
variates could equally well be described as a control variate method with negative rather than positive correlation. Instead of seeking a random variable that is
strongly positively correlated with the original one and which has a known expectation, one seeks a random variable with the same expectation as the original one
that is strongly negatively correlated to it. Then forming the average of the two random variables produces a new random variable with the same mean but reduced
variance. The idea was first presented in [5] and can be a powerful method for
lowering the resulting error. Furthermore, this very simple idea can be extended
in several ways, as was already made apparent in [5]. For example, if one is able
to construct n mutually antithetic random variables with identical means, their average, or some appropriately chosen weighted average, has the potential to provide
unbiased estimates of the same mean with greatly reduced variance. The antithetic
variates method has recently enjoyed a resurgence of interest in problems arising in
the finance community, problems that entail estimating integrals of functions that
are well approximated by linear functions, for which the method is exceptionally
well suited.
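A minimal sketch, again with an arbitrary near-linear integrand, pairs each sample $u$ with its antithetic partner $1 - u$ and averages the two evaluations:

import math, random

random.seed(3)
f = lambda x: math.exp(x)       # nearly linear on [0,1]; exact integral is e - 1
N = 50_000
us = [random.random() for _ in range(N)]
crude = sum(f(u) for u in us) / N
anti = sum(0.5 * (f(u) + f(1.0 - u)) for u in us) / N  # antithetic pairs
print(crude, anti, math.e - 1)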
In [109] (see also [110]), Halton applied the superposition principle iteratively
to solve matrix problems by means of a technique that he called “sequential Monte
Carlo.” The goal was to improve the Monte Carlo estimates steadily so that geometric convergence of the error to zero could be accomplished. In our view, this
idea – which has recently been extended to treat transport problems [111–114] –
marks one of the more important developments in Monte Carlo over the last 40
years because the idea of using Monte Carlo sampling to produce global solutions
of problems – e.g., the transport flux or collision density everywhere – is so striking.
We will devote a bit more attention to it here for that reason.
The main idea underlying this method is easy to describe. How can one build a
feedback loop, or “learning” mechanism, into the Monte Carlo algorithm by means
of which a larger and larger part of the problem can serve as a control variate, leaving
a smaller and smaller portion to be estimated stochastically? Since matrix problems
can be solved by Monte Carlo by generating random walks on a discrete index set
consisting of $N$ objects, where the matrix is of order $N \times N$, it is possible to encompass both matrix and (continuous) transport problems within the same formulation
by studying random walks on a general (either discrete or continuous) state space,
and we will adopt this point of view shortly.
Of course, in implementing such a sequential (or adaptive) algorithm, if the feedback mechanism is imposed after each random walk, the additional overhead might
easily overwhelm the improvement caused by the increased information content.
Accordingly, the approach taken has been to process random walks in stages, each
consisting of many samples, and to revise the sampling method at the end of each
adaptive stage.
The goal then becomes to achieve
$$E_k < E_{k-1} < \lambda^k E_0, \qquad 0 < \lambda < 1, \qquad k = \text{stage number}, \qquad (3.13)$$
where $E_k$ is the error after the $k$th stage. For example, for continuous transport problems,
$$E_n = \|\psi(P) - \tilde{\psi}_n(P)\|,$$
where $\tilde{\psi}_n(P)$ is an approximation to the transport solution obtained in the $n$th stage.
In this context, iterative zero-variance Monte Carlo algorithms for global solutions of transport equations make use of expansions
$$\psi(P) = \sum_{i=1}^{\infty} a_i B_i(P)$$
of the solution in a complete set of basis functions $B_i$, and produce essentially exact truncated solutions $\tilde{\psi}(P) = \sum_{i=1}^{I} a_i B_i(P)$, where $\psi(P)$ satisfies the transport equation
$$\psi(P) = \int K(P, P')\,\psi(P')\,dP' + S(P).$$
This is accomplished by estimating the expansion coefficients
$$a_i = \int B_i(P)\,\psi(P)\,dP$$
(if the $B_i$ are orthonormal) in adaptive stages of ever-increasing accuracy.
We consider now the abstract transport equation
$$\psi = \mathcal{K}\psi + \mathcal{S}, \qquad (3.14)$$
in which $\mathcal{S}$ represents a source term and $\mathcal{K}$ is either a matrix or an integral operator. The solution $\psi$ is most commonly interpreted as a (discrete or continuous) collision density, so that $\mathcal{S}$ describes the density of initial collisions and $\mathcal{K}$ incorporates information that describes the probability of transfer from a given state in phase space to the next one. Then, an iterative or multistage procedure can be written quite generally as
$$\psi = \mathcal{K}^{(k)}\psi + \mathcal{S}^{(k)}, \qquad \mathcal{K}^{(0)} = \mathcal{K}, \quad \mathcal{S}^{(0)} = \mathcal{S}. \qquad (3.15)$$
Thus, using Eq. 3.15 we reserve the right to modify both the source term $\mathcal{S}^{(k)}$ and the operator kernel term $\mathcal{K}^{(k)}$ after each stage of our adaptive algorithm, but our first stage only makes use of the source and kernel of the original transport equation. The first method described by Halton alters only the source term $\mathcal{S}^{(k)}$ iteratively, so that $\mathcal{K}^{(k)} = \mathcal{K}$ for all $k$. We will make use of the more general iterative procedure (3.15)
later when we describe an adaptive form of importance sampling.
In the context of Eq. 3.15, the control variate, or correlated sampling, algorithm may be characterized by setting
$$\mathcal{S}^{(k+1)} = \mathcal{S}^{(k)} + \mathcal{K}\hat{y}^{(k)} - \hat{y}^{(k)}, \qquad \mathcal{S}^{(0)} = \mathcal{S}, \qquad (3.16)$$
where $\hat{y}^{(k)}$ is an approximate solution to (3.15). Then as $\mathcal{S}^{(k)} \to 0$, $\hat{y}^{(k)} \to 0$, $x^{(k)} = \hat{y}^{(k)} + \hat{y}^{(k-1)} + \cdots + \hat{y}^{(0)} \to \psi$, and convergence should be geometric with sufficient care in establishing the number of samples in each stage. Thus, we subtract an approximate solution from both sides of the transport equation and solve for the difference between the approximate and exact solutions in each stage. The geometric convergence of an algorithm based on this idea was established rigorously in [113].
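For concreteness, here is a minimal sketch of the scheme (3.15)–(3.16) for a small matrix problem $x = Hx + b$; the matrix, source, and sample sizes are arbitrary choices of ours, and, because $H$ is explicitly available, the reduced source is updated deterministically rather than by the transport machinery an actual SCS code requires:

import numpy as np

rng = np.random.default_rng(0)
H = np.array([[0.10, 0.30],
              [0.25, 0.15]])     # contractive kernel (spectral radius < 1)
b = np.array([1.0, 0.5])

def mc_neumann(src, walks=5000, q=0.5):
    # crude, unbiased collision estimator of x = H x + src, component by component
    n = len(src)
    out = np.zeros(n)
    for i in range(n):
        acc = 0.0
        for _ in range(walks):
            s, w, score = i, 1.0, 0.0
            while True:
                score += w * src[s]
                if rng.random() < q:           # Russian-roulette termination
                    break
                s_new = rng.integers(n)        # uniform transition
                w *= H[s, s_new] * n / (1.0 - q)
                s = s_new
            acc += score
        out[i] = acc / walks
    return out

x_tot = np.zeros(2)
src = b.copy()
exact = np.linalg.solve(np.eye(2) - H, b)
for stage in range(5):                         # sequential correlated sampling stages
    x_tot += mc_neumann(src)
    src = b + H @ x_tot - x_tot                # reduced (residual) source, cf. Eq. 3.16
    print(stage, np.abs(x_tot - exact).max())

Each stage solves only for the remaining difference, so the printed error typically falls by a roughly constant factor per stage – the geometric convergence referred to above.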
For details about the implementation of this sequential correlated sampling (SCS)
algorithm that produces geometric convergence, the interested reader should consult
[112,114]. Here, we summarize the apparent advantages and disadvantages of using
SCS to solve either discrete or continuous transport problems.
We notice that Eq. 3.16 indicates that only the analog kernel is needed to implement SCS. This makes sampling the transition probabilities easiest, and is a distinct
advantage. However, the reduced source $\mathcal{S}^{(k)}$ is nonvanishing everywhere, though it
is normally small and of mixed sign. Most of the cost of using SCS stems from
the complexity of dealing with this reduced source. In addition, the solutions of
some transport problems vary by many orders of magnitude over the phase space,
and for such problems, analog construction of random walk histories may not guarantee that sufficiently many “important” contributors to the solution are represented in the sample space obtained. For example, problems generally
unsuitable for analog treatment (e.g., shielding or streaming-type problems) may
require many random walks per sequential stage to achieve strict error reduction.
For these reasons, a rather different strategy was developed for solving transport
equations adaptively, one based on importance sampling. This will be described in
Section 3.6.3.
As for the use of a control variate strategy in conjunction with quasi-Monte Carlo
simulations, quite recently, Hickernell et al. [115] have explored its potential. The
context for their study was the estimation of definite integrals and they found, somewhat surprisingly, that functions that serve well as control variates in the stochastic
setting may not work well in the deterministic one. The implications of this result
for transport applications have apparently not been studied.
3.6.3 Importance Sampling
For the estimation of a weighted integral of the transport flux or collision density
$$I = \int_\Gamma g(P)\,\psi(P)\,dP, \qquad (3.17)$$
where $g$ is a known (weighting) function, $\Gamma$ is the physical phase space, and $\psi$ satisfies the transport equation
$$\psi(P) = \int_\Gamma K(P, Q)\,\psi(Q)\,dQ + S(P), \qquad (3.18)$$
an importance function $\psi^*$ could be defined as the solution of the adjoint transport equation
$$\psi^*(P) = \int_\Gamma K^*(P, Q)\,\psi^*(Q)\,dQ + g(P), \qquad (3.19)$$
where $K^*(P, Q) \equiv K(Q, P)$. Under appropriate restrictions, then, it is quite easy to demonstrate that
$$I = \int_\Gamma S(P)\,\psi^*(P)\,dP, \qquad (3.20)$$
so that the integral $I$ can be obtained either as a weighted integral (3.17) of the solution $\psi$ of the transport equation or as a weighted integral (3.20) of the solution $\psi^*$ of a “backward” transport equation. Notice that the roles of the source $S$ and the detector $g$ are interchanged in Eqs. 3.18 and 3.19. The reason for describing $\psi^*$ as an importance function will be made apparent in Section 3.7.
The duality expressed by the equality of the integrals (3.17) and (3.20) means that $I$ can be estimated by a Monte Carlo simulation of Eq. 3.18, with initial collisions described by the source function $S$, as well as by a Monte Carlo simulation of Eq. 3.19, with the adjoint source described by the “detector” function $g$. Furthermore, while the function $\psi^*$ serves as an importance function for the estimation of (3.17), the collision density $\psi$ serves as an importance function for the estimation of (3.20). The references cited in Section 3.6.1 describe various aspects of this importance sampling theory and discuss how to use an importance function to alter the analog sampling functions so that a zero variance estimate of the weighted integral (3.17) or (3.20) can be obtained, given perfect information about the importance function.
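The zero variance idea is easiest to see for a one-dimensional integral. In the sketch below (ours; the integrand is an arbitrary choice), sampling from the density $p(x) = 3x^2$ would make every weighted score $f(x)/p(x)$ identical – zero variance – while the cheap approximation $p(x) = 2x$ already reduces the variance substantially:

import math, random

random.seed(4)
f = lambda x: x ** 2            # integrand on [0,1]; exact integral is 1/3
N = 50_000
plain = sum(f(random.random()) for _ in range(N)) / N
imp = 0.0
for _ in range(N):
    x = math.sqrt(1.0 - random.random())   # inverse-CDF sample from p(x) = 2x (x > 0)
    imp += f(x) / (2.0 * x)                # weighted score f(x)/p(x)
imp /= N
print(plain, imp)               # both near 1/3; the weighted scores fluctuate less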
It appears to have been Maynard [116] who suggested that use of the reciprocity
principle for the transport operator might be advantageous when applied to certain
kinds of Monte Carlo calculations. In that paper he illustrated the use of reciprocity
to estimate absorption in a small region and to solve a flux-peaking problem (literally obtaining the transport solution at a point by Monte Carlo).
It was appreciated from the outset that accurate estimation of an importance
function is as difficult as estimation of the global solution of the original transport
equation, so that various practical approximations were brought into play. At their
core, many of these variance reduction methods depend on the identification of an
approximate importance function.
For example, the assumption that underlies use of the exponential transformation is that the importance function is well described by parametrized exponential
functions (see, e.g., [117]). The parametrization is provided by the optical distance
between the source of radiation and the detection region under investigation. Use of
a simple version of the Russian roulette and splitting techniques [11, 12, 118, 119]
amounts to the application of a region-wise constant importance function. At the
boundary between two subregions of differing “importance,” particles are split into
two or more independent particle “fragments,” each with a reduced weight, when
moving from a region of lower to higher importance. When moving from a region
of higher to lower importance, however, a game of Russian roulette is played that is
designed to terminate the random walk with a high probability while increasing the
weight of any particles that survive this Russian roulette game in such a way that
the unbiased nature of the game is maintained. In the case of shielding problems, use
of the exponential transformation is designed to maintain a roughly constant population of particle fragments whose weights should roughly decline exponentially as
the particle moves from source to detector. The ideal simulation would, of course,
result in every particle reaching the detector with a weight which is precisely the
quantity being estimated, for example, the probability of transmission associated
with the shield.
Significant effort has continued to be invested in devising mechanisms for
reducing variance in difficult Monte Carlo simulations using, for example, splitting
and Russian roulette, which essentially rely on input of an approximate importance
function that takes particularly simple forms. When this approximate importance
function is not sufficiently accurate, history weights can fluctuate wildly, adding
to the variance and making reliance on this method very risky. A weight windows
strategy [120, 121] that controls weight fluctuations by setting upper and lower
limits and using Russian roulette and splitting to maintain those limits assures that
variances cannot become too large. However, the parameters needed to guarantee
high precision using weight windows also rely on knowledge of how the importance function behaves over the phase space. Partly as a natural consequence of
this vast effort at creating “practical” variance reduction schemes making use of
splitting and Russian roulette, there has been interest [122–127] in the development
of an iterative form of importance sampling with goals similar to those achieved for
sequential correlated sampling, as described in the previous section. An iterative
importance sampling strategy can be based on the construction of ever more accurate approximations to an importance function throughout the entire phase space.
Next, we sketch this development.
Returning to Eq. 3.15, use is made of an approximate importance function (as
obtained from approximate solutions of an appropriate adjoint transport equation)
to modify both the source and the kernel of the original transport equation at each
adaptive stage. In other words, Eq. 3.15 is used in its full generality for adaptive
importance sampling (AIS) and now both .k/ and .k/ will change with k. Provided
that an improving approximation to the exact importance function can be assured,
convergence will also be geometric for this method. Proof of this fact for matrix
problems can be found in [109], and details about implementing AIS may be found
in [127, 128]. In these references and in [109] one can also find descriptions of
how use is made of importance sampling theory, based on the duality between the
original transport equation and its adjoint, to modify both the source and the kernel
in each adaptive stage.
As for the benefits and disadvantages of AIS, the most serious objection to its
use arises from the fact that sampling the nonanalog, importance-modified kernel
can be very costly. On the plus side, however, use of an importance function in
the sampling usually results in the need for fewer random walks in each adaptive
block of histories than is the case for sequential correlated sampling. Also, convergence rates obtained from a fixed number of histories in each stage tend to be
more favorable for especially difficult (i.e., slowly convergent) problems when adaptive importance sampling is used than when sequential correlated sampling is used.
Overall, however, each of the two methods is very problem-dependent and one cannot state categorically that it is always better to use one algorithm or the other. It is
an interesting and challenging question to determine a priori criteria for deciding
which method will be superior for any given transport problem.
A third iterative method, a variant of adaptive importance sampling, was quite
recently developed [129] in order to try to combine the best features of the previous
two methods. By relaxing the requirement that the importance sampling estimator
be unbiased, extra freedom in the choice of density functions used to generate the
random walks is obtained. This idea had been suggested much earlier for estimating
integrals [130] and was extended to transport equations in [131]. Although it makes
use of estimators that are biased, they can be shown to be asymptotically (i.e., as the
sample size N ! 1) unbiased and to achieve rates of convergence comparable to
those obtainable with (unbiased) importance sampling. It remains to be seen whether
or not effective ways can be found to choose the sampling density functions so as to
realize the full potential of this new adaptive method.
The investigation of importance sampling for quasi-Monte Carlo applications
was taken up in the dissertation of Paul Chelson [88] where it was shown that the
method could be used to achieve reduction in the variation of an integrand function
or of a transport estimator. In a later dissertation [93], Earl Maize explored the question of quasi-Monte Carlo applications of the weighted analog sampling estimator.
Some of these results are summarized in [57]. Although these results pave the way
for a possible iterative application of these strategies for quasi-Monte Carlo implementation along the lines described above in the pseudorandom case, such studies
have not yet been carried out to the author’s knowledge.
3.6.4 Stratified Sampling
Stratified sampling is another conventional statistical method that has been adapted
for estimating integrals or solving transport problems with reduced sampling error.
When using Monte Carlo to estimate an integral, stratified sampling amounts to decomposing the range of integration into a number of subsets and applying standard
Monte Carlo methods to each subset separately. Optimization of this method then
involves deciding how to define the subsets and how many sample values to be used
in each.
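A minimal sketch with $K$ equal strata of $[0, 1]$ and an equal allocation of samples (both arbitrary choices of ours) reads:

import random

random.seed(6)
f = lambda x: x ** 3            # test integrand; exact integral on [0,1] is 1/4
N, K = 10_000, 100              # K equal strata, N/K samples in each
crude = sum(f(random.random()) for _ in range(N)) / N
strat = 0.0
for k in range(K):
    a = k / K                   # stratum [a, a + 1/K)
    strat += sum(f(a + random.random() / K) for _ in range(N // K))
strat /= N
print(crude, strat)             # the stratified estimate fluctuates far less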
It is less obvious how to make use of this idea for solving transport problems
by Monte Carlo. One application of the principle involved would be to subdivide
the phase space into mutually disjoint subsets and distribute the initial collision
sites of random walks among them according to some prescription that lowers the
sampling errors. In the book [11], one version of this idea is called systematic source
sampling, and it was shown there that this method always reduces the variance.
An extension of this idea might be to subdivide the sample space of random walks into disjoint subsets and attempt to distribute the random walks among
these in an optimal way from the point of view of error reduction. This would require imposing a metric, or measure, on the sample space, rather than on the phase
space. Recalling that the implementation of Monte Carlo methods for (discrete or
continuous) transport problems involves generating the various collision points of
each random walk one at a time by random sampling, the use of low-discrepancy
sequences rather than pseudorandom ones might be seen as a kind of intensive stratified sampling employed over the unit hypercube rather than over the phase space
or the sample space. Thus, the development of quasi-Monte Carlo for the solution
of transport problems can be likened to using stratified sampling of the individual
collision sites visited by the random walks. As we have seen earlier, this can lower
the sampling error by increasing the rate of convergence of sample averages to theoretical expected values, depending upon the phase space dimension and the number
of samples processed.
An idea that appears to be related to these is introduced in [132]. In that paper, the
author replaces the continuous transport problem by a discrete one by representing
the discrete states as obtained by averaging over a user-specified decomposition
of the continuous phase space. This leads to a replacement of the integral transport
equation by a matrix problem that is inherently much easier to solve than the original
one. A similar point of view is taken in [88,133,134], where it was also demonstrated
that the use of low-discrepancy, rather than pseudorandom sequences would result
in further advantages in terms of rate of convergence.
It is primarily because of the potentially useful connections between stratified
sampling as a conventional statistical strategy for reducing error and the use of
low-discrepancy sequences that we felt it important enough to warrant separate mention here.
3.6.5 Use of Expected Values
Many important devices for lowering simulation error succeed because some component of statistical fluctuation in the computation has been replaced by a nonstochastic, exact or approximate, analytic representation. Perhaps the most obvious
example of this involves the use of expected values wherein an estimating random
variable is replaced by one that has been obtained by computing averages over certain events where this is possible. For example, the use of absorption weighting,
or survival biasing, for transport simulations consists of disallowing nonproductive
absorption events in favor of reductions of the particle weight through multiplication by the survival probability at each interaction where absorption would normally
have been possible. This technique is easily shown to reduce variance at the expense,
however, of increased cost per random walk.
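The sketch below contrasts the two treatments of absorption in a hypothetical one-dimensional “rod” model (the cross sections and rod length are arbitrary); the analog version kills histories at capture, while the survival-biased version multiplies the weight by the scattering ratio instead:

import math, random

random.seed(7)
SIG_T, SIG_S, L = 1.0, 0.8, 5.0     # total/scattering cross sections, rod length

def transmission(analog, walks=50_000):
    tally = 0.0
    for _ in range(walks):
        x, mu, w = 0.0, 1.0, 1.0
        while True:
            x += mu * (-math.log(1.0 - random.random()) / SIG_T)  # next collision
            if x >= L:
                tally += w                       # escaped through the far end
                break
            if x <= 0.0:
                break                            # escaped backward
            if analog:
                if random.random() > SIG_S / SIG_T:
                    break                        # analog absorption
            else:
                w *= SIG_S / SIG_T               # survival biasing
                if w < 1e-3:
                    break                        # crude weight cutoff
            mu = 1.0 if random.random() < 0.5 else -1.0  # isotropic rod scattering
    return tally / walks

print(transmission(True), transmission(False))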
A more sophisticated use of expected values for transport applications is normally called next-event estimation or expected value scoring. We sketch its use in
estimating the tally at some detector in a problem. Instead of waiting until a random
walk has reached the detector to record a tally, one can calculate exactly the expected direct (i.e., assuming no intervening collisions) contribution to the detector
along the current direction of each random walk. This produces an exponential that
involves the optical distance between the current position and detector (see, e.g.,
Section 3.6 of [11]). The general idea, of course, is to decrease the variability in
the tally produced by each random walk simulated by eliminating event-level variability through this theoretical averaging device. By summing all of the event-level
contributions (many of which may be zero) over all the events of each history, one is
able to extract more information from each random walk without altering the walk
probabilities. One might be tempted to conjecture that it is always advantageous
to perform such averaging whenever theoretical averages can be calculated without
adding too much to the computational cost. However, reduction in the event-level
variability does not necessarily lower the history-level variability, as simple model
problem analyses can reveal. For example, if the size of the detector shrinks to a
very small volume or even a point, difficulties can be anticipated (see [12]). Therefore, since event-level averaging does add to the computational cost, it is wise to
apply next-event estimation with a certain amount of caution. Obtaining the flux or
collision density at a point can, however, actually be accomplished by means of a
simulation of the transport equation dual to the original one. This will be discussed
further in the next section.
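In the same hypothetical rod model used above (ours, not a code from the references), next-event estimation replaces the exit tally by the expected escape probability, scored at birth and after every forward-directed scattering; the actual escape of a history is then deliberately not tallied, since its expectation has already been counted:

import math, random

random.seed(8)
SIG_T, C, L = 1.0, 0.8, 5.0     # total cross section, scattering ratio, rod length

def transmission_next_event(walks=50_000):
    tally = 0.0
    for _ in range(walks):
        x, mu = 0.0, 1.0
        while True:
            if mu > 0.0:
                tally += math.exp(-SIG_T * (L - x))   # expected direct escape
            s = -math.log(1.0 - random.random()) / SIG_T  # sampled flight length
            if mu > 0.0 and x + s >= L:
                break                   # actual escape: already scored in expectation
            if mu < 0.0 and x - s <= 0.0:
                break                   # lost out the near end
            x += mu * s
            if random.random() > C:
                break                   # absorbed
            mu = 1.0 if random.random() < 0.5 else -1.0
    return tally / walks

print(transmission_next_event())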
A nontrivial application of the use of expected values has been worked out recently [135] for the estimation of pulse height tallies in problems arising in oil well
logging. In that application, the technique was responsible for improvements that
were impressive in simulations of practical importance. When used in conjunction
with a weight windows strategy, the gains in efficiency were even more pronounced.
An idea related to the use of expected values led to the development of Monte
Carlo codes to estimate spatial density functions in infinite media. This approach
makes use of the fact that, in a homogeneous medium, the spatial density of particles undergoing random walk can be determined analytically once the sequence
of collision parameters is known. Thus, one can use Monte Carlo to sample the
sequence of angles and energies and, using these as parameters, it is possible to calculate the probability that a random walking particle suffers any collision between
parallel planes at distances z and z C z from the point of origin. This led to the
creation of Monte Carlo codes [136] to calculate estimates of entire spatial density
functions and other quantities related to these functions that avoided the need to
perform any spatial sampling at all. The method is mentioned here because, while
it is not, strictly speaking, an example of the use of expected values in the sense described earlier, it does take advantage of the idea that it is most efficient to use Monte
Carlo sampling only for those elements of the transport model that cannot be treated
analytically. The interested reader can find details in [136].
3.6.6 Other Error Reduction Strategies
Space limitations permit only the mention of two other error reduction methods – perturbation Monte Carlo methods and condensed history methods – that seem too
important to omit, even in such a highly compressed survey. Each of these methods
has commanded a lot of attention over a long period of time and each has contributed greatly to the class of transport problems that can be solved effectively by
Monte Carlo methods. The correlated sampling method on which perturbation analysis relies is described in [12, 137–139].
Perturbation techniques have been valuable in estimating small differences accurately by Monte Carlo methods. There are several ways to achieve this, but the
purest is to use the same random number or low-discrepancy sequence to generate a
set of random walks that can serve to solve both a background, unperturbed, transport problem and a perturbation of that problem. It achieves this by associating two
different sets of weights with the histories, weights that reflect the differences between the two problems in a statistically faithful way. Then the strong correlation
between the pair of problems makes estimates of the small perturbing effects much
more reliable than if the two problems were solved using independently generated
random walks.
More formally, the key is to track a pair of tallying random variables $\xi, \hat{\xi}$, where $\xi$ is an unbiased estimator in the unperturbed problem and $\hat{\xi}$ estimates the same quantity for the perturbed problem. For example, the unknown to be estimated might be the transmissivity of a shielding array; $\xi$ could be a straightforward analog estimate of transmissivity for some design specification of the shield geometry, while $\hat{\xi}$ might represent the estimate of transmissivity with one (or more) small changes introduced in the shielding configuration. Then if $\mu$ denotes the analog measure characteristic of the design shielding array, and if $\hat{\mu}$ denotes the analog measure characteristic of the perturbed configuration, the identity
$$\int \hat{\xi}\,d\mu = \int \xi\,d\hat{\mu} \qquad (3.21)$$
holds provided
$$\hat{\xi} = \xi\,\frac{d\hat{\mu}}{d\mu}. \qquad (3.22)$$
Defining $\hat{\xi}$ in this way serves to correlate estimates of $\int \xi\,d\mu$ and $\int \hat{\xi}\,d\mu$. In (3.21), the left-hand side is evaluated with random walks generated in the design geometry, while the right-hand side is the expected transmissivity for the perturbed geometry. A single set of random walks generated according to $\mu$ is thus used to represent the solution of two (or more) problems: one with no perturbation, and the other with a perturbation of known size and composition. In other words, when both $\xi$ and $\hat{\xi}$ are averaged with respect to a single set of random walks generated according to the “background” measure $\mu$, they must be highly positively correlated for small perturbations.
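For a purely absorbing rod, where the transmission is just $e^{-\Sigma L}$, the following sketch (ours, with arbitrary numbers) shows the mechanics of Eqs. 3.21 and 3.22: each flight is sampled once from the background measure, and the perturbed tally carries the likelihood ratio $d\hat{\mu}/d\mu$ evaluated on that flight:

import math, random

random.seed(9)
SIG, SIG_HAT, L = 1.0, 1.05, 3.0   # background and perturbed cross sections, length

def correlated_difference(walks=200_000):
    d = 0.0
    for _ in range(walks):
        s = -math.log(1.0 - random.random()) / SIG  # flight from background measure
        xi = 1.0 if s >= L else 0.0                 # analog transmission score
        ratio = (SIG_HAT / SIG) * math.exp(-(SIG_HAT - SIG) * s)  # d(mu-hat)/d(mu)
        d += xi * ratio - xi                        # perturbed minus unperturbed
    return d / walks

print(correlated_difference(), math.exp(-SIG_HAT * L) - math.exp(-SIG * L))

Because the two scores share every flight, their difference has far smaller variance than the difference of two independently computed transmissions of nearly equal size.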
Through an extension of this idea, estimates can be obtained of the rates of
change of some output quantity with respect to changes in one or more parameters of the problem. For example, if the geometry of the perturbation is assumed
to be known but its composition is unknown, this idea can be used to estimate the
effect on various output quantities of changes in the composition of the perturbing
region. This, then, provides an extremely powerful tool for performing sensitivity
analysis by means of Monte Carlo. Finally, the availability of accurate estimates
of these derivatives can be combined with the use of an optimization algorithm to
provide solutions to inverse problems. Such a solution would be provided, for example, by finding the best (in the sense of least squares) fit with a set of measured
or experimentally observed quantities. A recent example of the use of this method
to solve an inverse problem arising in biomedical optics can be found in [140, 141].
Condensed history models – which have been widely used for modeling the
transport of ionizing radiation for many years (see [142–147]) – have been shown to
speed up Monte Carlo simulations significantly while retaining reasonable accuracy
for many problems. They achieve this acceleration by replacing detailed, collision-by-collision sampling with multiple collision models or equivalent compressions of
information. One such method makes use of similarity relationships for the transport
equation [148]. The critical idea is to rewrite the transport equation in terms of altered parameters in such a way that the solution to the equation remains unchanged,
and then to introduce an approximation that is designed to preserve certain moments
of the solution, yet speed up the computation significantly.
Specifically, the one-speed integro-differential transport equation is
$$\left[\frac{1}{v}\frac{\partial}{\partial t} + \boldsymbol{\Omega}\cdot\nabla + \Sigma_t(r)\right]\psi(r, \boldsymbol{\Omega}, t) = \int_{4\pi} \psi(r, \boldsymbol{\Omega}', t)\,\Sigma_s(r, \boldsymbol{\Omega}' \to \boldsymbol{\Omega})\,d\Omega' + q(r, \boldsymbol{\Omega}, t), \qquad (3.23)$$
where the solution $\psi$ describes the radiation flux, $q$ is the physical source, $\Sigma_t(r) = \Sigma_s + \Sigma_a$ is the total cross section, decomposed as the sum of scattering and absorption cross sections, $(r, \boldsymbol{\Omega})$ describe the position and unit vector along the direction of motion, respectively, of a typical particle, and $t$ denotes the time.
This equation can be rewritten as
$$\left[\frac{1}{v}\frac{\partial}{\partial t} + \boldsymbol{\Omega}\cdot\nabla\right]\psi(r, \boldsymbol{\Omega}, t) - q(r, \boldsymbol{\Omega}, t) + I(r, \boldsymbol{\Omega}, t) = 0, \qquad (3.24)$$
where
$$I(r, \boldsymbol{\Omega}, t) = \Sigma_a(r)\,\psi(r, \boldsymbol{\Omega}, t) + \Sigma_s(r)\left[\psi(r, \boldsymbol{\Omega}, t) - \int_{4\pi} \psi(r, \boldsymbol{\Omega}', t)\,\frac{f(r, \mu)}{2\pi}\,d\Omega'\right] \qquad (3.25)$$
and $f(r, \mu)$ is the probability density function for $\mu = \boldsymbol{\Omega}\cdot\boldsymbol{\Omega}'$, the cosine of the scattering angle.
Clearly, then, the solution $\psi$ of Eq. 3.24 is unchanged by the replacement of $\Sigma_a(r)$, $\Sigma_s(r)$, and $f(r, \mu)$ by $\Sigma_a^*(r)$, $\Sigma_s^*(r)$, and $f^*(r, \mu)$, respectively, that use a different set of physical parameters, provided that the quantity $I(r, \boldsymbol{\Omega}, t)$ is unchanged. This substitution yields the condition
$$\left[\Sigma_a^* - \Sigma_a + \Sigma_s^* - \Sigma_s\right]\psi = \int_{4\pi} \psi(r, \boldsymbol{\Omega}', t)\,\frac{\Sigma_s^* f^*(r, \mu) - \Sigma_s f(r, \mu)}{2\pi}\,d\Omega', \qquad (3.26)$$
which, upon expanding $\psi$ into spherical harmonics and the angular probability density functions $f$ and $f^*$ into Legendre polynomials and simplifying, gives the following family of equivalence relations:
$$S = \frac{\Sigma_s^*}{\Sigma_s} = \frac{1 - f_n}{1 - f_n^*}, \qquad n = 1, 2, \ldots, N. \qquad (3.27)$$
Here, the $n$th terms correspond to the $n$th coefficients in the Legendre expansions of the angular probability density functions, $f$ and $f^*$, and $N$ is the similarity relation order. By choosing larger $N$, one increases the simulation accuracy, admitting more directional anisotropy and thereby approximating the exact transport solution more closely. The assumption that only $N - 1$ angular moments, i.e., only $N$ terms in the series expansion, are significant – so that one may set $f_N^* = 0$ – produces the efficiency $S = 1 - f_N$.
From Eq. 3.27 it follows that the similarity efficiency $S$ lies between 0 and 1. It is also clear from this equation that for optimal efficiency, one wants to choose $\Sigma_s^*$ as small as possible for the given $N$. For many problems involving optically dense media, $\Sigma_s \approx \Sigma_t$ (absorption is negligible compared to scattering), $\Sigma_t$ is very large, and reductions in $\Sigma_s^*$ accelerate the simulation by increasing the mean free path, approximately in inverse proportion to the ratio $S$.
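As a small numerical illustration (ours), take $N = 1$ with first Legendre coefficient $f_1 = g$, the mean scattering cosine, and set $f_1^* = 0$; Eq. 3.27 then reproduces the familiar transport-corrected cross section $\Sigma_s^* = (1 - g)\Sigma_s$:

SIG_S, g = 10.0, 0.9    # hypothetical strongly forward-peaked medium
S = 1.0 - g             # similarity efficiency for N = 1 with f1* = 0
SIG_S_STAR = S * SIG_S  # transport-corrected scattering cross section
print(S, SIG_S_STAR, (1.0 / SIG_S_STAR) / (1.0 / SIG_S))  # mean free path grows by 1/S

Here the mean free path grows tenfold, which is the source of the advertised acceleration.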
A somewhat different approach assumes that each particle travels a fixed distance $s$ between collisions, where $s$ is chosen to be greater than the inherent mean free path and remains fixed throughout the simulation. This artificial enlargement of the
mean free path is then accompanied by treating each scattering event by means of
a multiple-scattering model, rather than by sampling from the original single scattering phase function repeatedly. The theory that underlies the multiple-scattering
model is due to Goudsmit and Saunderson [142, 149]. This method has been carefully analyzed by various authors [145–147, 150] and has been shown to produce
very good results, both with respect to speed and accuracy, over a wide range of
problems arising in connection with radiation therapy planning.
3.6.7 State of the Art
A great deal of progress has been made in advancing the theory and practice of error reduction methods since the publication of [13]. Even so, the time has not yet
arrived when Monte Carlo methods can be used routinely to solve general transport
problems, even though codes such as MCNP [15] are in very wide use globally.
The optimum choice of a sampling/weighting strategy depends in a complicated
way on the details of the problem and, without fairly sophisticated controls, error reduction techniques, such as importance sampling, can produce poor results.
The recent development of adaptive Monte Carlo algorithms suggests the possibility of a general-purpose code that automatically extracts the information it requires for each problem; such a code might then finally fulfill this promise. It remains to be seen whether an extremely efficient general-purpose code can be implemented, perhaps
making use of advanced adaptive methods that converge geometrically to prescribed
quantities with very small statistical errors. In the meantime, special codes for special applications will still be needed in order to meet very high precision goals for
important problems/projects.
3.7 Foundations/Theoretical Developments
The fact that several of the most important publications of the early period were
classified may well have prevented more rapid dissemination of the critical theoretical ideas. Then too, the emphasis throughout this period was on results, whether in
support of the Manhattan project or to speed up development of the nuclear navy.
In both cases, the “proofs of principle” were established by successful implementations – not mathematically rigorous arguments. Indeed, because Monte Carlo methods are so intuitively plausible, their early practitioners may well have tolerated a reduced dependence on rigor. Coupled with the fact that those directly
involved with the early advances were preoccupied with races against the clock in
one way or another, publication in scholarly journals was often postponed or foregone in the interest of national security. In time, however, the stimuli for improved
understanding of the mathematical foundations underlying Monte Carlo methods
were at least twofold: the appearance of perplexing or anomalous numerical results
(e.g., erratic behavior of sample means in some problems; the possibility of theoretically infinite variance in importance sampling strategies that otherwise seem quite reasonable; etc.) and the pressure for constant improvement in new results obtained
via Monte Carlo.
In turn, not unexpectedly, these newer methods often depended on sophisticated
transformations of the original problem into problems fully equivalent to it but easy
to solve by Monte Carlo. With this came a demand for increased mathematical
sophistication and understanding.
Recognition that transport problems are natural infinite-dimensional extensions
of finite-dimensional quadrature problems has provided the motivation for developing a rigorous measure-theoretic foundation for Monte Carlo solutions of transport
problems. This allows a demonstration of the equivalence of formulations based
on the transport equation (analytic model), Monte Carlo sampling (probability
model), and quasi-Monte Carlo sampling (deterministic, number-theoretic model)
[11, 58, 151]. It has also been valuable to recognize that Monte Carlo solutions of
matrix problems can further our understanding of Monte Carlo solutions of continuous transport problems. The mathematical models of the two classes of problems are
very similar. In both the discrete and the continuous cases, the solution can be represented as an infinite series of terms, each of which accounts for random walks that
make exactly $k$ collisions, $k = 1, 2, \ldots$. The degree of difficulty of each problem
can be measured, to some extent, by the rate of convergence of this infinite series.
For example, the effective dimension of either type of problem can be defined as
the product of the average number of collisions made by the random walks and the
phase space dimension, as we remarked earlier. The higher this effective dimension, the more computational effort must, in general, be invested to solve the problem.
An important contribution to our theoretical understanding of the analysis of
error in Monte Carlo calculations was advanced in the 1970s in a series of publications by Harvey Amster and his collaborators [152–155]. The key idea involved
is to relate the expected value of an estimating random variable to an adjoint transport equation. For example, consider the integral transport equation for the collision
density
$$\psi(P) = S(P) + \int K(P, Q)\,\psi(Q)\,dQ, \qquad (3.28)$$
where
$$K(P, Q) = \int C(P, P')\,T(P', Q)\,dP', \qquad (3.29)$$
and the kernel $T$ describes free-flight transport and the kernel $C$ describes the collision mechanics. The function $S$ is the density of initial collision states and is related to the physical source density $q$ via
$$S(P) = \int T(P, P')\,q(P')\,dP'. \qquad (3.30)$$
It is instructive to write down the equation adjoint to (3.28):
$$\psi^*(P) = S^*(P) + \int K^*(P, Q)\,\psi^*(Q)\,dQ, \qquad (3.31)$$
so that, as we showed in Section 3.6, we have the equality
$$\int \psi(P)\,S^*(P)\,dP = \int \psi^*(P)\,S(P)\,dP. \qquad (3.32)$$
Substitution of (3.30) into (3.32) gives
$$\int \psi(P)\,S^*(P)\,dP = \iint \psi^*(P)\,T(P, P')\,q(P')\,dP'\,dP.$$
Now if we denote
$$M_1(P) = \int \psi^*(P')\,T(P', P)\,dP',$$
it follows quite readily that $M_1(P)$ must be the expected contribution to the estimation of the “reaction rate” $I^* = \int \psi(P)\,S^*(P)\,dP$ due to a particle originating at $P$ from the physical source $q$. It is also straightforward to show that the integral equation satisfied by the function $M_1$ is
$$M_1(P) = \int T(P', P)\,S^*(P')\,dP' + \int L(P, P')\,M_1(P')\,dP', \qquad (3.33)$$
where
$$L(P, Q) = \int T(P, P')\,C(P', Q)\,dP'.$$
Making use of uniqueness of solutions of the transport equation and its adjoint, it
is not difficult to show that Eq. 3.33 is adjoint to the integro-differential form of
the original transport process, making the function M1 an importance function for
this problem. Higher moment equations, that is, integral equations for the expected
square, cube, etc., of the score can be derived in a similar fashion.
This technique was initially developed as a mechanism for comparing variances
of competing Monte Carlo strategies, thereby providing useful information about
whether or not a proposed Monte Carlo method might be superior to an alternative formulation. However, the connection between these moment equations and the
original transport process has led to a variety of other uses over the years.
3.7.1 State of the Art
We now know how to demonstrate the equivalence of an analytic model of a
given problem derived from the transport equation with either a probabilistic model
to be used for pseudorandom simulation or a deterministic model to be used for
quasi-random simulation. Likewise, the derivation of coupled integral equations for
expected values of higher moments of estimators is well understood. However, applying these models and derivations to achieve optimal performance is still as much
art as it is science. Nevertheless, such theoretical advances have proven to be of
great value in advancing the development of effective Monte Carlo techniques.
3.8 Challenges
In this chapter, we have focused on the Monte Carlo solution of integral equations, and especially transport equations, in keeping with the spirit of
the Gelbard lecture series. We have, nevertheless, tried to indicate that estimation
of definite integrals and other conventional and unconventional Monte Carlo applications areas have also spurred developments highly relevant to the solution of
transport problems.
We close this chapter with some thoughts about open questions whose
resolutions would, in our opinion, further advance the theory and practice of Monte
Carlo methods for traditional nuclear applications.
We have seen that high-dimensional integrals and transport problems can both
benefit from the use of a judiciously chosen mixture of pseudorandom and quasirandom generating sequences. We have also suggested that these same techniques
ought to play a useful role in the Monte Carlo solution of transport problems, but
this has not been attempted in any systematic way as yet.
A relatively new branch of the theoretical analysis of Monte Carlo algorithms
deals with their computational complexity (see, e.g., [87] and references cited
therein). It would certainly be instructive to develop practical criteria for optimality
with respect to complexity.
The use of perturbation and differential Monte Carlo tools, coupled with optimization algorithms, to solve inverse problems is a rather new and important development, in our view. Much more effort would seem warranted for such problems.
We have sketched the development of another rather new circle of ideas: the use
of sequential or adaptive algorithms to achieve very rapid convergence of Monte
Carlo estimates. Three different strategies have evolved to date: sequential correlated sampling, adaptive importance sampling, and generalized weighted analog
sampling. It will be important to learn how these, or possibly others, can be used
to solve various families of practical transport problems easily and efficiently.
Strengthening our theoretical understanding of the application of low-discrepancy
sequences to difficult transport problems would seem to be overdue, inasmuch as
most traditional Monte Carlo programs continue to rely on pseudorandom sequence
generators. At a minimum it would be well to derive more precise criteria for
estimating the overall efficiency of a program implemented with low-discrepancy
sequences in terms of the parameters that characterize the problem. This will, no
doubt, prove to be a difficult exercise.
Acknowledgments The author gratefully acknowledges the support of the Laser Microbeam and
Medical Program NIH P-41-RR-01192, and grants UCOP 41730 and NSF/DMS 0712853 during
the preparation of this chapter.
References
1. Buffon GC (1777) Essai d'arithmétique morale. Supplément à l'Histoire Naturelle 4
2. Laplace MP-S (1886) Théorie Analytique des Probabilités, Livre 2, contained in Oeuvres Complètes de Laplace, de l'Académie des Sciences, vol 7, part 2. Paris, pp 365–366
3. Mantel N (1953) An extension of the Buffon needle problem. Ann Math Stat 24:674–677
4. Kahan BC (1961) A practical demonstration of a needle experiment designed to give a number
of concurrent estimates of π. J R Stat Soc Series A 124:227–239
5. Hammersley JM, Morton KW (1956) A new Monte Carlo technique: antithetic variates. Proc
Camb Phil Soc 52:449–475
6. Fishman G (1996) Monte Carlo: Concepts, Algorithms, and Applications, Springer Series in
Operations Research
7. Cashwell ED, Everett CJ (1959) A practical manual on the Monte Carlo method for random walk problems. Pergamon,
New York
8. Hammersley JM, Handscomb DC (1964) Monte Carlo methods. Methuen & Co., Ltd.,
London
9. Kalos MH, Whitlock PA (1986) Monte Carlo methods, Volume I: basics. Wiley-Interscience,
New York
10. Niederreiter H (1992) Random number generation and quasi-Monte Carlo methods, #63 in
CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, PA
11. Spanier J, Gelbard EM (1969) Monte Carlo principles and neutron transport problems.
Addison-Wesley, Reading, MA
12. Lux I, Koblinger L (1991) Monte Carlo particle transport methods: neutron and photon calculations. CRC Press, Boca Raton, FL
13. Kalos MH, Nakache FR, Celnik J (1968) Monte Carlo methods in reactor computations. In:
Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics. Gordon &
Breach, New York, pp 365–438
14. Greenspan H, Kelber CN, Okrent D (eds) (1968) Computing methods in reactor physics.
Gordon & Breach, New York, pp 365–438
15. X-5 Monte Carlo Team (2003) MCNP – a general N-particle transport code, Version 5. LA-UR-03-1987, Los Alamos National Laboratory
16. Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44:335–341
17. Kahn H (1954) Applications of Monte Carlo, RAND Corp. Report AECU – 3259 (April 1954;
revised April 1956)
18. Kahn H (1956) Use of different Monte Carlo sampling techniques. In: Meyer HA (ed)
Symposium on Monte Carlo methods. Wiley, New York, pp 146–190
19. Coveyou RR (1969) Random number generation is too important to be left to chance. Appl
Math 3:70–111
20. Devroye L (1986) Non-uniform random variate generation. Springer, New York
21. Stadlober E, Kremer R (1992) Sampling from discrete and continuous distributions with
C-Rand. In: Pflug G, Dieter U (eds) Simulation and optimization. Lecture notes in economics
and math. systems, vol 374. Springer, Berlin, pp 154–162
22. Stadlober E, Niederl F (1994) C-Rand: a package for generating nonuniform random variates.
In Compstat’94, Software Descriptions, pp 63–64
23. Lehmer DH (1964) Mathematical methods in large-scale computing units. Proc 2nd Symp on
Large-Scale Calculating Machinery (1949), Ann Comp Lab Harvard Univ 26:141–146
24. Knuth DE (1998) The art of computer programming, Seminumerical Algorithms, vol 2, 3rd
edn. Addison-Wesley, Reading, MA
25. Coveyou RR (1960) Serial correlation in the generation of pseudo-random numbers. J ACM
7:72–74
26. MacLaren MD, Marsaglia G (1965) Uniform random number generators. J ACM 12:83–89
27. Marsaglia G (1968) Random numbers fall mainly in the planes. Proc Natl Acad Sci USA
61:25–28
28. Anderson SL (1990) Random number generators on vector supercomputers and other advanced architectures. SIAM Rev 32:221–251
29. Dagpunar J (1988) Principles of random variate generation. Oxford University Press, Oxford
30. Deak I (1989) Random number generators and simulation. Akademiai Kiado, Budapest
31. Dieter U (1986) Non-uniform random variate generation. Springer, New York
32. James F (1990) A review of pseudorandom number generators. Comp Phys Commun
60:329–344
33. L’Ecuyer P (1990) Random numbers for simulation. Commun ACM, 33:85–97
34. L’Ecuyer P (1994) Uniform random number generation. Ann Oper Res 53:77–120
35. Niederreiter H (1995) New developments in uniform random number and vector generation.
In: Niedderreiter H, Shiue P J-S (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics #106. Springer, New York, pp 87–120
36. Niederreiter H (1993) Finite fields, pseudorandom numbers, and quasi-random points. In:
Mullen GL, Shiue PJ-S (eds) Finite fields, coding theory, and advances in communications
and computing. Marcel Dekker, New York, pp 375–394
37. Bergstrom V (1936) Einige Bemerkungen zur Theorie der Diophantischen Approximationen.
Fysiogr Salsk Lund Forh 6(13):1–19
38. Van der Corput JG, Pisot C (1939) Sur la Discrépance Modulo un. Indag Math 1:143–153,
184–195, 260–269
39. Bratley P, Fox BL, Schrage LE (1987) A guide to simulation, 2nd edn. Springer, New York
40. Fishman G, Moore III LS (1986) An exhaustive analysis of multiplicative congruential random number generators with modulus $2^{31} - 1$. SIAM J Sci Stat Comp 7:24–45
41. Fishman G (1989) Multiplicative congruential random number generators with modulus $2^\beta$: an exhaustive analysis for $\beta = 32$ and a partial analysis for $\beta = 48$. Math Comp 54:331–344
42. Ripley BD (1983) The lattice structure of pseudo-random number generators. Proc R Soc
Lond Ser A 389:197–204
43. Law AM, Kelton WD (2002) Simulation modeling and analysis, 3rd edn. McGraw-Hill, New
York
44. L’Ecuyer P (1998) Random number generation. Chapter 4. In: Banks J (ed) Handbook of
simulation. Wiley, New York, pp 93–137
45. Hellekalek P, Larcher G (eds) (1998) Random and quasi-random point sets, vol 138 of Lecture
Notes in Statistics. Springer, New York
46. L’Ecuyer P, Simard R (2000) On the performance of birthday spacings tests for certain families of random number generators. Math Comp Simul 55:131–137
47. L’Ecuyer P (1999) Good parameters and implementations for combined multiple recursive
random number generators. Oper Res 47:159–164
48. Eichenauer J, Lehn J (1986) A nonlinear congruential pseudorandom number generator. Stat
Papers 27:315–326
49. L’Ecuyer P (2002) Random numbers. In: Smelser NJ, Paul B Baltes (eds) The international
encyclopedia of the social and behavioral sciences. Pergamon, Oxford, pp 12735–12738
50. Matsumoto M, Nishimura T (1998) Mersenne twister: a 623-dimensionally equidistributed
uniform pseudo-random number generator. ACM Trans Model Comp Simul 8:3–30
51. L’Ecuyer P, Andres TH (1997) A random number generator based on the combination of four
LCGs. Math Comp Simul 44:99–107
52. L’Ecuyer P (1999) Tables of maximally equidistributed combined LFSR generators. Math
Comp 68:261–269
53. Zaremba SK (1968) The mathematical basis of Monte Carlo and quasi-Monte Carlo methods.
SIAM Rev 10:304–314
54. Keller A (1995) A Quasi-Monte Carlo Algorithm for the global illumination problem in the
radiosity setting. In: Niederreiter H, Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics 106. Springer, New York, pp 239–251
55. Keller A (1998) The quasi-random walk. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and Quasi-Monte Carlo methods 1996, Lecture Notes in Statistics
127. Springer, New York, pp 277–291
56. Morokoff WJ, Caflisch RE (1993) A Quasi-Monte Carlo approach to particle simulation of
the heat equation. SIAM J Num Anal 30:1558–1573
57. Spanier J, Maize EH (1994) Quasi-random methods for estimating integrals using relatively
small samples. SIAM Rev 36:18–44
58. Spanier J (1995) Quasi-Monte Carlo methods for particle transport problems. In: Niederreiter H, Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing,
Lecture Notes in Statistics 106. Springer, New York, pp 121–148
59. Boyle P (1977) Options: a Monte Carlo approach. J Fin Econ 4(4):323–338
60. Morokoff WJ, Caflisch RE (1997) Quasi-Monte Carlo simulation of random walks in finance.
In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte
Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 340–352
61. Joy C, Boyle P, Tan KS (1996) Quasi-Monte Carlo methods in numerical finance. Manage
Sci 42:926–938
62. Tezuka S (1998) Financial applications of Monte Carlo and Quasi-Monte Carlo methods.
In: Hellekalek P, Larcher G (eds) Random and quasi-random point sets, Lecture Notes in
Statistics 138. Springer, New York, 303–332
63. Cohen M, Wallace J (1993) Radiosity and realistic image synthesis. Academic Press Professional, Cambridge
64. Lafortune E (1996) Mathematical models and Monte Carlo algorithms for physically based
rendering. Ph.D. dissertation, Katholieke Universitiet, Leuven, Belgium
65. Van der Corput JG (1935) Verteilungsfunktionen I, II, Nederl. Akad Wetensch Proc Ser B,
38:813–821, 1058–1066
66. Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating
multi-dimensional integrals. Num Math 2:84–90
3
Monte Carlo Methods
161
67. Korobov NM (1959) The approximate computation of multiple integrals. Dokl Akad Nauk
SSSR 124:1207–1210 (in Russian)
68. Hlawka E (1962) Zur Angenäherten Berechnung Mehrfacher Integrale. Monatsch Math
66:140–151
69. Hua K, Wang Y (1981) Applications of number theory to numerical analysis. Springer, Berlin
70. Sloan IH, Joe S (1994) Lattice methods for multiple integrals. Oxford University Press, Oxford
71. Owen A (1995) Randomly permuted (t; m; s)-nets and (t; m; s)-sequences. In: Niederreiter H,
Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture
Notes in Statistics 106. Springer, New York, pp 299–317
72. Faure H (1992) Good permutations for extreme discrepancy. J Num Theor 41:47–56
73. Wang J, Hickernell FJ (2000) Randomized Halton sequences. Math Comp Model 32:887–899
74. Moskowitz B (1995) Quasi-random diffusion Monte Carlo. In: Niederreiter H, Shiue PJ-S
(eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in
Statistics, 106. Springer, Berlin, pp 278–298
75. Coulibaly I, Lecot C (1998) Monte Carlo and quasi-Monte Carlo algorithms for a linear
integro-differential equation. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds)
Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer,
New York, 176–188
76. Okten G (1999) High dimensional integration: a construction of mixed sequences using sensitivity of the integrand. Technical Report, Ball State University, Muncie, IN
77. Okten G (2000) Applications of a hybrid Monte Carlo sequence to option pricing. In:
Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods, 1998. Springer,
New York, pp 391–406
78. Moskowitz BS (1993) Application of quasi-random sequences to Monte Carlo methods. Ph.D.
dissertation, UCLA
79. Okten G (1999) Random sampling from low discrepancy sequences: applications to option
pricing. Technical Report, Ball State University, Muncie, IN
80. Paskov SH (1997) New methodologies for valuing derivatives. In: Pliska S, Dempster M (eds)
Mathematics of securities. Isaac Newton Institute, Cambridge University Press, Cambridge
81. Paskov SH, Traub JF (1995) Faster valuation of financial derivatives. J Portfolio Manage
22:113–120
82. Cukier RI, Levine HB, Shuler KE (1978) Nonlinear sensitivity analysis of multiparameter
model systems. J Comp Phys 26:1–42
83. Owen A (1992) Orthogonal arrays for computer experiments, integration and visualization.
Statistica Sinica 2:439–452
84. Radovic I, Sobol’ IM, Tichy RF (1996) Quasi-Monte Carlo methods for numerical integration: comparison of different low discrepancy sequences. Monte Carlo Meth Appl 2:1–14
85. Sobol IM (1993) Sensitivity estimates for nonlinear mathematical models. MMCE 1:407–414
86. Sloan IH, Wozniakowski H (1998) When are quasi-Monte Carlo algorithms efficient for high
dimensional integrals? J Complexity 14:1–33
87. Wozniakowski H (2000) Efficiency of quasi-Monte Carlo algorithms for high dimensional
integrals. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods
1998. Springer, New York, pp 114–136
88. Chelson P (1976) Quasi-random techniques for Monte Carlo methods. Ph.D. dissertation, The
Claremont Graduate School, Claremont
89. Spanier J, Li L (1998) Quasi-Monte Carlo methods for integral equations. In: Niederreiter
H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods
1996, Lecture Notes in Statistics 127. Springer, New York, pp 398–414
90. Koksma JF (1942–1943) Een Allgemeene Stelling uit de Theorie der Gelijkmatige Verdeeling
Modulo 1. Mathematica B. (Zutphen) 11:7–11
91. Hlawka E (1961) Funktionen von Beschränkter Variation in der Theorie der Gleichverteilung.
Ann Mat Pura Appl 54:325–333
162
J. Spanier
92. Hickernell FJ (2006) Koksma-Hlawka inequality. In: Kotz S, Johnson NL, Read CB,
Balakrishnan N, Vidakovic B (eds) Encyclopedia of statistical sciences, vol 6, 2nd edn. Wiley,
Hoboken, NJ, pp 3862–3867
93. Maize EH (1981) Contributions to the theory of error reduction in quasi-Monte Carlo methods. Ph.D. dissertation, The Claremont Graduate School, Claremont
94. Niederreiter H, Xing C (1998) The algebraic-geometry approach to low-discrepancy
sequences. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and
quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York,
pp 139–160
95. Niederreiter H (2000) Construction of (t,m,s)-nets. In: Niederreiter H, Spanier J (eds) Monte
Carlo and quasi-Monte Carlo methods, 1998. Springer, New York, pp 70–85
96. Sobol IM (1967) The distribution of points in a cube and the approximate evaluation of integrals. Zh Vychisl Mat i Mat Fiz 7:784–802 (in Russian)
97. Faure H (1981) Discrépances de Suites Associées à un Système de Numération (en Dimension
un). Bull Soc Math France 109:143–182
98. Faure H (1982) Discrépances de Suites Associées à un Système de Numération (en
Dimension S). Acta Arith 41:337–351
99. Halton J (1998) Independence of quasi-random sequences and sets. Working Paper CB#3175,
University of North Carolina, Chapel Hill, NC
100. Okten G (1998) Error estimation for Quasi-Monte Carlo methods. In: Niederreiter H,
Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods
1996, Lecture Notes in Statistics 127. Springer, New York
101. Goertzel G, Kalos MH (1958) Monte Carlo methods in transport problems. In: Hughes DJ,
Sanders JE, Horowitz J (eds) Progress in nuclear energy, vol II, series I, Physics and Mathematics. Pergamon, New York, pp 315–369
102. Leimdorfer M (1964) On the transformation of the transport equation for solving deep
penetration problems by the Monte Carlo methods. Trans Chalmers University of Tech, #286.
Goteborg, Sweden
103. Leimdorfer M (1964) On the use of Monte Carlo methods for calculating the deep penetration
of neutrons in shields. Trans Chalmers University of Tech, #287, Goteborg, Sweden
104. Coveyou RR, Cain VR, Yost KJ (1967) Adjoint and importance in Monte Carlo application.
Oak Ridge National Laboratory Report ORNL-4093
105. Kalos MH (1963) Importance sampling in Monte Carlo shielding calculations – neutron penetration through thick hydrogen shields. Nucl Sci Eng 16:227
106. Goertzel G (1949) Quota sampling and importance functions in stochastic solution of particle
problems. Oak Ridge National Laboratory Report ORNL-434
107. Halton JH (1970) A retrospective and prospective survey of the Monte Carlo method. SIAM
Rev 12:1–63
108. Gelbard EM, Spanier J (1964) Use of the superposition principle in Monte Carlo resonance
escape calculations. Trans Am Nucl Soc 7:259–260
109. Halton J (1962) Sequential Monte Carlo. Proc Camb Phil Soc 58:57–73
110. Halton J (1994) Sequential Monte Carlo techniques for the solution of linear systems. J Sci
Comp 9:213–257
111. Kong R (1999) Transport problems and Monte Carlo methods. Ph.D. dissertation, Claremont
Graduate University, Claremont
112. Kong R, Spanier J (2000) Sequential correlated sampling methods for some transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998.
Springer, Berlin, pp 238–251
113. Kong R, Spanier J (2000) Error analysis of sequential Monte Carlo methods for transport
problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods
1998. Springer, Berlin, pp 252–272
114. Spanier J (2000) Geometrically convergent learning algorithms for global solutions of transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo
methods 1998. Springer, Berlin, pp 98–113
3
Monte Carlo Methods
163
115. Hickernell FJ, Lemieux C, Owen AB (2005) Control variates for quasi-Monte Carlo, 2002.
Stat Sci 20:1–31
116. Maynard CW (1961) An application of the reciprocity theorem to the acceleration of Monte
Carlo calculations. Nucl Sci Eng 10:97–101
117. Spanier J (1970) An analytic approach to variance reduction. SIAM J Appl Math 18:172–190
118. Burn KW, Nava E (May 1997) Optimization of variance reduction parameters in Monte
Carlo radiation transport calculations to a number of responses of interest. Proceedings of
the international conference on nuclear data for science and technology. Italian Physical Society, Trieste, Italy
119. Burn KW, Gualdrini G, Nava E (2002) Variance reduction with multiple responses. In:
Kling A, Barao F, Nakagawa M, Tavora L, Vaz P (eds) Advanced Monte Carlo for radiation
physics, particle transport simulation and applications. Proceedings of the MC2000 conference. Lisbon, Portugal, 23–26 October 2000, pp 687–695
120. Hendricks J (1982) A code – generated Monte Carlo importance function. Trans Am Nucl
Soc 41:307
121. Cooper MA, Larsen EW (2001) Automated weight windows for global Monte Carlo particle
transport calculations. Nucl Sci Eng 137:1–13
122. Booth TE (1986) A Monte Carlo learning/biasing experiment with intelligent random
numbers. Nucl Sci Eng 92:465–481
123. Booth TE (1988) The intelligent random number technique in MCNP. Nucl Sci Eng
100:248–254
124. Booth T (1985) Exponential convergence for Monte Carlo particle transport. Trans Am Nucl
Soc 50:267–268
125. Booth T (1997) Exponential convergence on a continuous Monte Carlo transport problem.
Nucl Sci Eng 127:338–345
126. Kollman C (1993) Rare event simulation in radiation transport. Ph.D. dissertation, University
of California, Berkeley, CA
127. Lai Y, Spanier J (2000) Adaptive importance sampling algorithms for transport problems. In:
Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer,
New York, pp 276–283
128. Hayakawa C, Spanier J (2000) Comparison of Monte Carlo algorithms for obtaining geometric convergence for model transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo
and quasi-Monte Carlo methods 1998. Springer, New York, pp 214–226
129. Spanier J, Kong R (2004) A new adaptive method for geometric convergence. In: Niederreiter
H (ed) Proceedings MCQMC 2002. 25–28 November 2002, Singapore, pp 439–449
130. Powell MJD, Swann J (1966) Weighted uniform sampling – a Monte Carlo technique for
reducing variance. J Inst Math Appl 2:228–236
131. Spanier J (1979) A new family of estimators for transport problems. J Inst Math Appl 23:1–31
132. Booth TE (1990) A quasi-deterministic approximation of the Monte Carlo importance function. Nucl Sci Eng 104:374–384
133. Li L (1995) Quasi-Monte Carlo methods for transport equations. Ph.D. dissertation, The
Claremont Graduate School, Claremont
134. Li L, Spanier J (1997) Approximation of transport equations by matrix equations and sequential sampling. Monte Carlo Meth Appl 3:171–198
135. Mosher S, Maucec M, Spanier J, Badruzzaman A, Chedester C, Evans M, Gadeken L
Expected-value techniques for Monte Carlo modeling of well logging problems. In review
136. Amster HJ, Kuehn H, Spanier J (February 1960) Euripus-3 and Daedalus – Monte Carlo
Density Codes for the IBM-704. Westinghouse Atomic Power Laboratory report WAPD-TM205
137. Rief H (1984) Generalized Monte Carlo perturbation theory for correlated sampling and a second order Taylor series approach. Ann Nucl Energy 11:455–476
138. Rief H, Gelbard EM, Schaefer RW, Smith KS (1986) Review of Monte Carlo techniques for
analyzing reactor perturbations. Nucl Sci Eng 92:289–297
139. Rief H (1996) Stochastic perturbation analysis applied to neutral particle transport. Adv Nucl
Sci Tech 23:69–140
164
J. Spanier
140. Hayakawa C, Spanier J, Bevilacqua F, Dunn AK, You JS, Tromberg BJ, Venugopalan V
(2001) Perturbation Monte Carlo methods to solve inverse photon migration problems in
heterogeneous tissues. Optics Lett 26(17):1335–1337
141. Hayakawa CK, Spanier J Perturbation Monte Carlo methods for the solution of inverse
problems. In: Niederreiter H (ed) Proceedings MCQMC 2002. 25–28 November 2002,
Singapore, Springer, pp 227–241 (to appear)
142. Goudsmit S, Saunderson JL (1940) Multiple scattering of electrons. Phys Rev 57:24–29
143. Lewis HW (1950) Multiple scattering in an infinite medium. Phys Rev 78:526–529
144. Berger MJ (1963) Monte Carlo calculations of the penetration and diffusion of fast charged
particles. In: Alder B, Fernbach S, Rotenberg M (eds) Methods in computational physics, vol
I. Academic, New York, pp 135–215
145. Larsen EW (1992) A theoretical derivation of the condensed history algorithm. Ann Nucl
Energy 19(10–12):701–714
146. Fernandez-Varea JM, Mayol R, Baro J, Salvat F (1993) On the theory and simulation of
multiple elastic scattering of electrons. Nucl Inst Meth Phys Res B73:447–473
147. Kawrakow I, Bielajew AF (1998) On the condensed history technique for electron transport.
Nucl Inst Meth Phys Res B142:253–280
148. Wyman DR, Patterson MS, Wilson BC (1989) Similarity relation for anisotropic scattering in
Monte Carlo simulations of deeply penetrating neutral particles. J Comp Phys 81:137–150
149. Bielajew AF, Salvat F (2000) Improved electron transport mechanics in the PENELOPE
Monte Carlo model. Nucl Inst Meth Phys Res B173:332–343
150. Tolar DR, Larsen EW (2001) A transport condensed history algorithms for electron Monte
Carlo simulations. Nucl Sci Eng 139:47–65
151. SpanierJ, Li L (1998) General sequential sampling techniques for Monte Carlo simulations:
Part I – matrix problems. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds)
Monte Carlo and quasi-Monte Carlo methods 1996. Springer Lecture Notes in Statistics #127,
Springer, New York, 382–397
152. Amster HJ, Djomehri MJ (1976) Prediction of statistical error in Monte Carlo transport
calculations. Nucl Sci Eng 60:131–142
153. Booth TE, Amster HJ (1978) Prediction of Monte Carlo errors by a theory generalized to treat
track-length estimators. Nucl Sci Eng 65:273–281
154. Booth TE, Cashwell ED (1979) Analysis of error in Monte Carlo transport calculations. Nucl
Sci Eng 71:128–142
155. Amster HJ (1971) Determining collision variances from adjoints. Nucl Sci Eng 43:114–116
3
Monte Carlo Methods
165
Professor Jerome Spanier received a BA in
mathematics and physics at the University of
Minnesota in 1951 and an M.S. and Ph.D. in
mathematics from the University of Chicago
in 1952 and 1955, respectively. He spent
the next 16 years at industrial research laboratories, first in suburban Pittsburgh at the
Bettis Atomic Power Laboratory operated by
Westinghouse, then at the North American
Aviation (later Rockwell International) central research laboratory – the Science Center
– in Thousand Oaks, California. In 1971,
he moved to academia as Full Professor
of Mathematics at the Claremont Graduate School (CGS) in Claremont, California.
There, in 1973–1974, he founded the Mathematics Clinic, a practicum course in which students and faculty solve real-world problems; the course has played a central role in the applied mathematics curriculum at the Claremont Colleges since its inception. In 1998, Spanier became Professor Emeritus in order to devote full time to research, and he established a small research institute, the Claremont Research
Institute of Applied Mathematical Sciences (CRIAMS), in Claremont. He devoted
8 of the intervening years to full-time administration at CGS: first, as Dean of Faculty in 1982 and subsequently as Vice President for Academic Affairs and Dean of
the Graduate School until his return to the faculty in 1990. Spanier is the author of
several books and numerous articles and technical reports, and he has spent much
of his career contributing to both the theory and applications of Monte Carlo methods. He has applied these, and other numerical and analytical methods, to a variety
of problems arising in chemistry, physics, and engineering. He now focuses much
of his attention on medical applications at the University of California at Irvine,
although he remains a Senior Fellow and Director of CRIAMS in Claremont.
Chapter 4
Reactor Core Methods
Robert Roy
Nuclear Engineering Institute, École Polytechnique de Montréal,
P.O. Box 6079, Station Centre-Ville, Montréal (Québec) H3C 3A7, Canada
e-mail: robert.roy@polymtl.ca
4.1 Introduction
This chapter addresses the simulation flow chart that is currently used for reactor-physics simulations. The methodologies presented are most appropriate to the context of power reactors, and the chapter focuses particularly on the three-dimensional (3D) aspects of core calculations. The software design currently used to achieve accurate numerical simulations of reactor cores is also examined from a practical nuclear engineering point of view. The focus here is on processes and on the need for reactor physicists and nuclear engineers to use modern-day software with confidence and reliability.
In Section 4.2, some early combinations of mathematics with a computational environment are recalled, with no intention of giving a global historical perspective. Section 4.3 is devoted to the concepts of lattice cell calculations, by which most core databases are still constructed today. Section 4.4 presents a selection of modern tools and methodologies that are now used for industrial applications. In Section 4.5, some applications of reactor core methods are given. Concluding remarks are drawn in Section 4.6.
4.2 Analytic Methods and Early Calculation Schemes
A nuclear reactor must maintain a sustained neutron chain reaction. The ratio of the number of neutrons in any generation to the number of neutrons in the previous generation is called the multiplication factor. The first core calculation schemes tried to factor out the different important states that neutrons reach during their lives. An idealized reactor core is a homogeneous infinite medium where neutrons of different energies interact with the medium. Criticality means equilibrium between neutron production and removal, i.e., a unitary infinite multiplication factor. To take into account the neutron losses due to leakage out
of a finite core, corrections were added. Early core calculations were based on the
four- and six-factor formulas following the neutron life cycle:
$$k_{\mathrm{eff}}=(k_\infty)\,\Lambda_{\mathrm{ther}}\,\Lambda_{\mathrm{fast}}=(\varepsilon\,p\,f\,\eta_{\mathrm{ther}})\,\Lambda_{\mathrm{ther}}\,\Lambda_{\mathrm{fast}}\qquad(4.1)$$
Of these six factors, two, $\eta_{\mathrm{ther}}$ and $\varepsilon$, essentially concern the fuel pins. The middle two, $p$ and $f$, strongly depend on the heterogeneity of the fuel assembly (coolant/moderator, structural materials, etc.). These first four factors are the following:

$\varepsilon$  the fast fission factor, i.e., the total number of fission neutrons per thermal-fission neutron.
$p$  the probability that a neutron escapes resonance capture.
$f$  the thermal utilization factor, i.e., the probability that a thermal neutron is captured in the fuel.
$\eta_{\mathrm{ther}}$  the thermal reproduction factor, i.e., the average number of fission neutrons produced per absorption in the fuel.

The last two factors depend on the overall core geometry:

$\Lambda_{\mathrm{ther}}$  the thermal nonleakage probability.
$\Lambda_{\mathrm{fast}}$  the fast nonleakage probability.
Another way to express the criticality is to use a formula such as

$$k_{\mathrm{eff}}=\frac{k_\infty}{1+M^2 B^2_{\mathrm{eff}}}\qquad(4.2)$$

where the new variables are the following: $M^2$, the so-called migration area, and $B^2_{\mathrm{eff}}$, the critical buckling value.
Early calculation schemes dealt mostly with such formulas to describe the neutron multiplication inside physical systems. Homogeneous bare and lumped geometric models were analytically deduced from the transport or diffusion equation using two energy groups in order to get accurate slide-rule results. To carry more significant figures, the use of desk-type calculating machines was considered a great advance. However, such an automated process required the validation of every significant result, as an error in the digital input could mean a bad tabulated value. It was very soon realized that power reactors could benefit from having a reflector zone surrounding the lumped fuel arrangement. To approximate how the reactor core could be reduced from an equivalent bare configuration, calculations of the reflector saving involved spatial coupling of different materials, and matrices were needed to interconnect the fuel and reflector regions.
In his review of the development of nuclear reactor theory in the Montreal laboratory, Prof. M.M.R. Williams provides some letters that show the kind of personnel problems such a laboratory can have [1]. Here is a memorandum to George Placzek in which Bengt Carlson is under stress because many projects are ready for computation . . . without a computer!
Memorandum to – Dr. Placzek
Concerning: the Computing Situation
At present, there is a great amount of accumulated work in the computing
section. In the following, a list is given of the projects ready for computation
and the estimated time necessary for their completion with the present staff
and equipment of the computing section (see Table 4.1). We have thus about
12 weeks of accumulated work. Improvement of this situation is impossible
with the present staff, rather we expect it to become worse. We have also been
advised of the desirability of an acceleration of output.
Since the current rate of work thus is considered insufficient and since it is
besides a result of a somewhat forced tempo, more manpower and calculating
machines are necessary. The most effective remedy, in my opinion, would be
to hire two additional men, one with a college education including mathematics courses, and one with an excellent high school record to be trained here;
and at the same time acquire two Marchant calculators, model ACRM.
If we have to operate with an increased staff before the new calculators
arrive, we might try to arrange a double shift, but such a step with the present
type of work, which requires frequent decisions on my part and hence my
presence, is at best an emergency solution. No computing project with which
I have ever had contact left the staff without supervision during half of the
day. That would be too much of a menace to accuracy and efficiency over an
extended period.
Bengt Carlson
BC:VL
Montreal, October 13, 1943
Can you imagine doing that kind of project using calculators like the one in Fig. 4.1? In early calculations, tabulations were always used for the Bickley–Naylor functions, the exponential integral functions, and all the others. No computer was needed to obtain crude solutions. However, the analytic work required to reduce the problems to mathematics tractable by hand was tremendous.
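Those tabulated functions are trivial to regenerate today. Below is a minimal sketch in Python with SciPy: `expn` is SciPy's exponential integral $E_n$, while the quadrature-based `bickley_ki` helper is our own illustrative routine, not a library function.

```python
import numpy as np
from scipy.special import expn
from scipy.integrate import quad

def bickley_ki(n, x):
    """Bickley-Naylor function Ki_n(x) = int_0^{pi/2} exp(-x/sin t) sin^(n-1) t dt,
    evaluated by direct quadrature instead of a printed table."""
    f = lambda t: np.exp(-x / np.sin(t)) * np.sin(t) ** (n - 1)
    val, _ = quad(f, 1e-12, np.pi / 2)
    return val

# a small modern "tabulation" of exponential-integral and Bickley functions
for x in (0.1, 0.5, 1.0, 2.0):
    print(f"x={x:4.1f}  E1={expn(1, x):.6f}  E3={expn(3, x):.6f}  Ki3={bickley_ki(3, x):.6f}")
```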
For example, let us think of Dr. Boris Davison and Jeanne LeCaine in their Montreal laboratory as they were trying to approximate the linear extrapolation length. Their working process was to embed several approximations together into analytical formulations, with the aim of obtaining observable results. Variational principles were then very important because the manual
Table 4.1 Projects ready for computation in the Montreal laboratory

Project                                                               Estimated time
Density distribution in systems with multiplication factor near 1
  and nonmultiplying reflector (Dr. Volkoff)                          2 weeks
Integral equation for absorption in an aluminum slab (Dr. Wallace)    2 weeks
Solution of transcendental equations and calculations of residues
  in connection with slowing-down problems (Dr. Marshak)              2 weeks
Slowing-down length in water and related problems (Dr. Marshak)       3 weeks
Albedo problems (Dr. Adler)                                           1 week
Improved numerical solution of Milne equation (Dr. Mark)              1 week
Miscellaneous                                                         1 week
Total                                                                 12 weeks
Fig. 4.1 Picture of a Marchant 8D calculator (The Marchant model ACRM has two more digits!)
numerical results were easier to obtain. Today's formulation of the linear extrapolation length as

$$l_{\mathrm{ext}}=\sup_{\psi\in L^{2}}\left\{\frac{3\left[\displaystyle\int_{0}^{\infty}dx\,\psi(x)\,E_{3}(x)\right]^{2}}{\displaystyle\int_{0}^{\infty}dx\,\psi(x)\left[2\,\psi(x)-\displaystyle\int_{0}^{\infty}dx'\,E_{1}\bigl(\lvert x'-x\rvert\bigr)\,\psi(x')\right]}\right\}+\frac{1}{3}\qquad(4.3)$$
is quite easy to process using a mathematical scripting language, yielding accurate values of $l_{\mathrm{ext}}\simeq 0.7104$. The quality of the analytic methods used in the 1940s and 1950s can certainly be appreciated in Dr. Davison's book [2].
In Report LA-756 (also available in [1]), Bengt Carlson gives a list of the functions needed to perform Serber calculations: exponential, circular and hyperbolic, exponential integral, and exponential integral for complex arguments. In Report LA-2595, Clarence E. Lee states that [3]:
One of the first numerical experiments performed by Carlson after a computer was available
was an attempt to perform Serber calculations by numerical integration and a semianalytical approach. From the analysis of those experiments and recognition of the intrinsic
directional derivative treatment, the original Sn difference method was born.
Nowadays, the range of computing resources spans many scales: from hyperthreading and multicore processors up to clusters and grid computing. But we still rely on the coherency of physical approximations in order to tackle reactor core problems . . . since the number of unknowns is too large for full-core "ready for computation" models. Even though we now have sophisticated tools for reactor simulations, there is still "too much of a menace to accuracy and efficiency over an extended period" if we use these tools without being aware of their limits.
Note: The material of this section is mainly extracted from the masterly work of Prof. Williams in analyzing a collection of papers on the development of the so-called Montreal theory, involving many of the greatest nuclear reactor physicists. The interested reader will find in this work [1] valuable historical landmarks for much of the early reactor theory.
4.3 Lattice Cell and Assembly Codes
A reactor core is made of several materials. The evaluation of the neutron flux
distribution inside all the regions of a reactor core still demands considerable computer resources. A direct transport solution for the entire core is not affordable for day-to-day use. One must proceed through various interconnected stages of calculation in order to obtain reliable data at the core level. In an attempt to clarify these stages, let us describe the equation governing the core behavior. The neutron balance in a reactor can be expressed by the time-dependent transport equation [4]:
$$\frac{1}{v}\frac{\partial\Phi(\mathbf{r},\boldsymbol{\Omega},E,t)}{\partial t}+L_{\mathrm{static}}\,\Phi(\mathbf{r},\boldsymbol{\Omega},E,t)=\hat{Q}(\mathbf{r},\boldsymbol{\Omega},E,t)\qquad(4.4)$$

where $v$, $\Phi$, and $\hat{Q}$ are, respectively, the neutron speed, the neutron angular flux, and the source (including the scattering and the fission terms), and $L_{\mathrm{static}}$ is a general static transport operator:

$$L_{\mathrm{static}}\,\Phi=\boldsymbol{\Omega}\cdot\nabla\Phi(\mathbf{r},\boldsymbol{\Omega},E,t)+\Sigma_t(\mathbf{r},E,t)\,\Phi(\mathbf{r},\boldsymbol{\Omega},E,t)\qquad(4.5)$$
Macroscopic cross sections, e.g., the macroscopic total cross section $\Sigma_t$ in Eq. 4.5, represent probabilities per unit path length of neutron interaction in the physical medium. In reactors at power, many types of nuclei can compose the medium, and the microscopic values of the cross sections are compounded with the nuclide concentrations $N_i$ (number of nuclei per unit volume) as

$$\Sigma_t(\mathbf{r},E,t)=\sum_i N_i(\mathbf{r},t)\,\sigma_{t,i}(E)\qquad(4.6)$$
to give the local probability per unit path length of a neutron collision. Neutron collisions expressed by the total microscopic cross section $\sigma_{t,i}$ include different interactions: elastic and inelastic scattering, radiative capture, fission, etc. Some of these interactions exhibit resonances, and a resonance self-shielding calculation is required in order to take this resonant behavior into account. Other interactions will affect the concentrations $N_i(\mathbf{r},t)$: new nuclei will be formed (e.g., fission products), and others will disappear (such as burnable absorbers). The depletion (or burnup) equations can be linearized into

$$\frac{dN_i}{dt}=\sum_j\left\{\sigma_{i\leftarrow j}\,\Phi+\lambda_{i\leftarrow j}\right\}N_j\qquad(4.7)$$

where $\sigma_{i\leftarrow j}$ and $\lambda_{i\leftarrow j}$ represent the production rate after fission or capture and the radioactive decay constant, respectively.
At any given space point, the concentrations of all the nuclides will affect the
neutron flux in a complicated manner. From the point of view of neutron kinetics,
the production and elimination of some fission products can have a considerable influence on reactor operation (fuel poisoning products, delayed-neutron precursors,
etc.). In lattice cell calculations, it is generally assumed that the flux (or the power)
remains constant over the span of a time step to allow for the solution of the depletion equation by standard numerical integration (Runge-Kutta, Kaps-Rentrop, etc.).
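To illustrate, here is a minimal sketch (Python with NumPy/SciPy) of one burnup step under that constant-flux assumption. The two-nuclide chain and all numerical values are invented for illustration, and a matrix exponential is used instead of the Runge–Kutta integrators named above, which is equally valid once the flux is frozen over the step.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical two-nuclide chain with invented one-group data:
# nuclide 0 is destroyed by neutron capture/fission; part of its captures
# feed nuclide 1, which is also destroyed by capture and decays slowly.
phi = 3.0e14                                 # constant flux over the step, n/(cm^2 s)
sig = np.array([[-50.0e-24,  0.0     ],      # sigma_{i<-j}, cm^2 (illustrative)
                [ 10.0e-24, -5.0e-24 ]])
lam = np.array([[0.0,  0.0   ],              # lambda_{i<-j}, 1/s (illustrative)
                [0.0, -1.0e-9]])

A = sig * phi + lam                          # linearized burnup matrix of Eq. 4.7
N0 = np.array([1.0e22, 0.0])                 # initial concentrations, nuclei/cm^3
dt = 30.0 * 24 * 3600                        # a 30-day burnup step, in seconds

N1 = expm(A * dt) @ N0                       # exact step solution under frozen flux
print(N1)
```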
In Fig. 4.2, a UML (Unified Modeling Language) class diagram describes the various objects that will be needed all along the core follow-up:

BurnupHistory has a collection of depleted states that are ordered in time; each of these states is a snapshot of Eq. 4.7.
DepletionState is an association class that relates the microscopic to the macroscopic cross sections by Eq. 4.6.
MicroLib is a class that is composed of one chain of depletion, as many isotopes as are needed to represent the problem, and particular self-shielding data.
MacroLib is a class used to represent the macroscopic cross sections; the neutron travels through media that carry these properties. For deterministic core solvers, the data are generally ordered into energy groups.

Fig. 4.2 Class diagram for depletion calculations [diagram: a BurnupHistory has an ordered collection of 1..* DepletionState; each DepletionState links one MicroLib to one MacroLib (operation: updateXS); a MicroLib is composed of one ChainOfDepletion, 1..* Isotope, and SelfShieldingOpt data; group-ordered data are held in 1..* GroupData]
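To make the diagram concrete, here is one possible rendering of these classes as Python dataclasses. This is a sketch, not the actual implementation behind Fig. 4.2: the class names follow the diagram, while the attributes and the update_xs body (which applies Eq. 4.6) are our own guesses.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class GroupData:                  # multigroup data for one isotope
    sigma_t: List[float]

@dataclass
class Isotope:
    name: str
    groups: GroupData

@dataclass
class MicroLib:                   # one depletion chain + isotopes + options
    chain_of_depletion: List[str]
    isotopes: List[Isotope]
    self_shielding_opt: Dict = field(default_factory=dict)

@dataclass
class MacroLib:                   # group-ordered macroscopic cross sections
    sigma_t: List[float]

@dataclass
class DepletionState:             # association class relating micro to macro
    number_densities: Dict[str, float]     # nuclide name -> N_i
    micro: MicroLib

    def update_xs(self) -> MacroLib:       # Eq. 4.6: Sigma_t = sum_i N_i sigma_t,i
        ng = len(self.micro.isotopes[0].groups.sigma_t)
        sig = [0.0] * ng
        for iso in self.micro.isotopes:
            n_i = self.number_densities.get(iso.name, 0.0)
            for g in range(ng):
                sig[g] += n_i * iso.groups.sigma_t[g]
        return MacroLib(sigma_t=sig)

@dataclass
class BurnupHistory:              # time-ordered collection of depletion states
    states: List[DepletionState] = field(default_factory=list)
```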
The linear approach presented here is not fully physically sound for nuclear engineering applications: there are feedback loops in this calculation flow chart. The feedback effects for a power reactor are certainly important. For example, the power distribution and the temperatures of the different media interact, resulting in Doppler broadening of the microscopic cross sections. In fact, a whole set of nonlinear effects may be introduced in order to keep a consistent physical model. However, these effects will not be considered in this chapter.
4.3.1 Lattice Physics Calculations
A lattice code is primarily used to compute the neutron flux distribution and the infinite multiplication factor. In most actual power reactors, the solid fuel pins and the liquid (or gaseous) coolant are arranged so that the nuclear reaction and its thermal-hydraulic effects can be predicted and controlled.
The relative geometric arrangement of fissile, absorbent, and coolant materials
usually follows regular patterns inside the core. These patterns are referred to as
lattices. Such lattices are composed of cells. At the cell level, the local flux may vary
strongly and resonance effects are important. Broader interaction between cells, for
example, in fuel assemblies, can be taken into account if the interface currents are
known.
In the 1970s, the calculation flow chart for reactor core simulations was generally the following:

1. Perform pin-cell calculations using a fine-group (a few thousand groups) cross-section library, condense into fewer (a few hundred) groups, and homogenize over the pin cell's geometry.
2. Perform assembly calculations using the homogenized pin-cell cross-section data from step 1, condense into few-group reactor data, and homogenize over the assembly's geometry.
3. Perform core calculations using the homogenized assemblies' few-group data from step 2.
Using today's computer resources, the first two steps are generally performed together, and the neutron coupling between the pin cells can be evaluated using the transport equation.

Fig. 4.3 Waterfall model for reactor core analysis [diagram: : Experiments → : NuclearDataFile → : LibraryXS → : LatticeCalculation → : ReactorCalculation]

The unit cell used for these basic transport calculations depends on the kind of reactor: for light-water reactors, the cell can be a particular
PWR fuel assembly with various enriched pins including gaps, a BWR assembly
along with cruciform rods, or a cluster of CANDU-like fuel elements surrounded
by heavy-water moderator. Although these unit cell calculations do not provide the
flux distribution inside the whole reactor, they are still important for design and
follow-up of core behavior. The reactor physicist progresses from one phase to the
next according to Fig. 4.3, where problems with the physical approximations in one
phase demand interventions back in the previous phase.
Reactor methods encompass the last two phases, : LatticeCalculation and : ReactorCalculation, with knowledge of : LibraryXS, to perform core analysis. Nuclear power plant engineers rely on these phases for operation and safety
issues. The : LibraryXS phase requires building an isotopic cross-section library
from evaluation data files. In North America and Europe, the Evaluated Nuclear
Data File (ENDF) format is generally used as the source of cross sections to process
nuclear data relative to neutron and photon reactions. Using selected evaluations
such as, for example, ENDF/B-V, -VI, and -VII, the NJOY code can produce a consistent set of pointwise or multigroup microscopic cross sections, covering both the
resolved and unresolved resonance energy domains [5]. Multigroup cross-section
libraries are generated using specific modules offered by lattice codes to recover
the most interesting features for core analysis. In Section 4.3.1.1, we briefly discuss
how to generate these cross-section libraries.
Fig. 4.4 Typical processing sequence in NJOY [diagram: the ENDF/B : NuclearDataFile is processed, isotope by isotope, through RECONR, BROADR, and UNRESR into a PENDF : PointWiseXS file; THERMR and GROUPR then produce a GENDF : GroupAvgXS file, from which the DRAGR module builds the DRAGLIB : LibraryXS]
4.3.1.1 Producing Cross-section Libraries
Figure 4.4 shows a typical sequence used for producing cross sections per isotope. The dashed lines denote objects that serve as input or output for the various modules. Here, only short descriptions of the NJOY modules are provided:

RECONR is used for cross-section reconstruction (it reads the ENDF file and reconstructs the resonances to prepare pointwise ENDF data).
BROADR accounts for Doppler broadening of the resonances.
UNRESR is used for unresolved-resonance data processing (for the main resonant isotopes, the PURR module can also be used).
THERMR treats the thermal scattering law.
GROUPR is used for group-averaged data processing.

Code-specific modules are then used for formatting the multigroup cross sections. The example shown involves the post-processing module DRAGR.
Once all necessary isotopes have been processed and cross-section libraries are
available, reactor core simulations can be done. In Fig. 4.5, reactor analysis states
are shown in an activity diagram. It can be seen that the : LibraryXS phase is essential for carrying out proper analysis. Note that the lattice and core calculations
appear as parallel activities where the nuclear engineer takes into account simulation results at both levels: the fine-flux lattice and the global flux distribution in the
design of a reactor core.
In Fig. 4.5, the standard flow of objects needed for or generated during reactor
core analysis is also shown; this includes
The : MeshGeom object representing cell and core geometry
The : UpdateCoreDB object containing nuclear properties after lattice calcu-
lation
The : PowerDist object representing the power distribution in the core.
4.3.1.2 Self-Shielding and Multigroup Approximation
In reactor physics, it is quite common to arrange the energy groups in reverse order, so that the fastest energy group appears first and the most thermal one appears at the end, and to use lethargies instead of energies to describe neutron slowing down. The lethargy, $u$, of a neutron of energy $E$ is defined by $u=\ln(E^0/E)$, where $E^0$ is some maximum energy, commonly taken as 10 MeV [4]. It is generally assumed that the energy range is divided into $G$ energy groups and that group $g$ goes from $E^g$ to $E^{g-1}$, corresponding to the lethargy interval $U^g=[u^{g-1},u^g]$. Suppose that there exists a typical energy-dependent spectral weighting function $\varphi(u)$ in group $g$ whose integral over $U^g$ is 1. The microscopic group-averaged cross sections for reaction $x$ will generally be group-condensed using

$$\sigma^g_x=\int_{u^{g-1}}^{u^g}du\;\varphi(u)\,\sigma_x(u)\qquad(4.8)$$
Fig. 4.5 Activity diagram for reactor core analysis [diagram: get cross sections and define geometry (: MeshGeom), validate the data (: LibraryXS), then perform the lattice calculation (producing the : UpdateCoreDB) and the core calculation (producing the : PowerDist) as parallel activities, and compare with observed core data (: CoreBehaviourData) against the : AcceptanceCrit]
This equation makes physical sense when the flux shape in energy is quite regular. An important feature to take into account in core analysis is the fact that the cross sections of typical heavy nuclides exhibit resonances. The flux shape drops substantially in the resonance energy range, and it is no longer easy to assume an energy spectrum $\varphi(u)$. In that case, Eq. 4.8 is no longer valid.
Self-shielding calculations must be done to recover effective microscopic cross
sections for a resonant reaction $x$:

$$\sigma^g_{x,\mathrm{eff}}(\sigma_0)=\frac{\displaystyle\int_{u^{g-1}}^{u^g}du\;\varphi(u)\,\frac{\sigma_x(u)}{\sigma_t(u)+\sigma_0}}{\displaystyle\int_{u^{g-1}}^{u^g}du\;\varphi(u)\,\frac{1}{\sigma_t(u)+\sigma_0}}\qquad(4.9)$$
where $\sigma_0$ is the total cross section outside the resonances, usually called the background cross section, and $\varphi(u)$ is the fine energy spectrum. In the subgroup method, the energy-spectrum function in group $g$ is approximated, and probability tables on a fine energy-group structure are used to tabulate the resonant cross sections:

$$\sigma^g_{x,l}=\frac{\displaystyle\sum_{s\in g}\varphi_s\,\sigma_{x,s}\,\frac{1}{\sigma_{t,s}+\sigma_{0,l}}}{\displaystyle\sum_{s\in g}\varphi_s\,\frac{1}{\sigma_{t,s}+\sigma_{0,l}}}\qquad(4.10)$$

for various values of the background $\sigma_{0,l}$. Self-shielding effects are treated by specific modules in most lattice cell codes.
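The self-shielding effect of Eq. 4.9 is easy to reproduce numerically. The following minimal sketch (Python/NumPy) uses a single synthetic Lorentzian-shaped resonance and a flat weighting spectrum; all numbers are illustrative.

```python
import numpy as np

# One synthetic resonance on a lethargy grid; the weighting spectrum is flat.
u = np.linspace(0.0, 1.0, 20001)
res = 1.0e4 / (1.0 + ((u - 0.5) / 0.005) ** 2)   # Lorentzian-shaped resonance
sigma_t = 10.0 + res                             # total cross section, barns
sigma_x = 0.5 * res                              # resonant reaction x
phi = np.ones_like(u)                            # flat spectrum phi(u)

def sigma_eff(sigma_0):
    """Effective cross section of Eq. 4.9 for a given background sigma_0."""
    w = phi / (sigma_t + sigma_0)
    return np.trapz(w * sigma_x, u) / np.trapz(w, u)

for s0 in (1.0e1, 1.0e3, 1.0e5, 1.0e10):
    print(f"sigma_0 = {s0:9.1e}   sigma_x,eff = {sigma_eff(s0):10.3f}")
# At infinite dilution (large sigma_0) the unshielded average is recovered;
# a small background depresses the flux inside the resonance (self-shielding).
```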
Once an appropriate library of self-shielded microscopic cross sections is available, lattice calculations are done in the multigroup framework. The time-dependent angular flux has to be solved from the equations

$$\frac{1}{v^g}\frac{\partial\Phi(\mathbf{r},\boldsymbol{\Omega},t)}{\partial t}+L^g_{\mathrm{static}}\,\Phi(\mathbf{r},\boldsymbol{\Omega},t)=\hat{Q}^g(\mathbf{r},\boldsymbol{\Omega},t)\qquad(4.11)$$

with group-dependent solutions $\Phi=\Phi^g(\mathbf{r},\boldsymbol{\Omega},t)$ at various time steps representative of the neutron life cycle. The numerical integration of the term $\partial\Phi/\partial t$ can be done using various schemes (explicit, implicit, $\theta$-method, etc.) and, assuming, for example, the power (i.e., burnup) history of the lattice, lattice calculations will be done to obtain steady-state solutions of the transport problems over the full energy range. In each energy group, the problem is to compute the value of $\Phi^g$ satisfying

$$L^g_{\mathrm{static}}\,\Phi=\boldsymbol{\Omega}\cdot\nabla\Phi^g(\mathbf{r},\boldsymbol{\Omega})+\Sigma^g_t(\mathbf{r})\,\Phi^g(\mathbf{r},\boldsymbol{\Omega})=Q^g(\mathbf{r},\boldsymbol{\Omega})\qquad(4.12)$$

with appropriate group-dependent sources $Q^g$ and boundary conditions pertinent to the lattice. Finally, in some reactor dynamics studies, an exponential time-decay period is often sought so that the separation $\Phi(\mathbf{r},\boldsymbol{\Omega},t)\simeq e^{\alpha t}\,\Phi(\mathbf{r},\boldsymbol{\Omega},\alpha)$ remains valid for all groups in Eq. 4.11, also resulting in a transformation of the sources as in Eq. 4.12.
4.3.1.3 Generic Multigroup Solver
The multigroup treatment has condensed the transport operator to G energy groups.
This operator now acts on the angular flux with two separate parts: the usual streaming part and the collision part, which depends on a group-dependent cross section $\Sigma^g_t(\mathbf{r})$:

$$L^g_{\mathrm{static}}\,\Phi\equiv\left[L_{\mathrm{static}}\mid\Sigma^g_t(\mathbf{r})\right]\Phi=\boldsymbol{\Omega}\cdot\nabla\Phi(\mathbf{r},\boldsymbol{\Omega})+\Sigma^g_t(\mathbf{r})\,\Phi(\mathbf{r},\boldsymbol{\Omega})\qquad(4.13)$$
If the group-dependent source Qg .r; / is known, the methods presented in the
above sections can be used to solve the transport problem independently in each
group, as in Eq. 4.12, to obtain the flux map. However, the source term includes
scattering from other groups as well as fixed sources or fission effects, which couple
the set of G transport equations together:
$$Q^g(\mathbf{r},\boldsymbol{\Omega})=S^g\Phi+F^g(\mathbf{r},\boldsymbol{\Omega})=\sum_{g'}\int_{4\pi}d^2\Omega'\;\Sigma^{g\leftarrow g'}_s(\mathbf{r},\boldsymbol{\Omega}\cdot\boldsymbol{\Omega}')\,\Phi^{g'}(\mathbf{r},\boldsymbol{\Omega}')+F^g(\mathbf{r},\boldsymbol{\Omega})\qquad(4.14)$$
Scattering Effects
There are three separate components comprising the scattering effect: the up-scattering ($g<g'$) effect, where the neutron scatters from lower to higher energy; the self-scattering ($g=g'$), where the neutron stays in the same energy group; and the down-scattering ($g>g'$), where the neutron is slowing down. When the source $F^g(\mathbf{r},\boldsymbol{\Omega})$ is fixed, the standard iterative method of solution assumes an initial flux guess $\Phi^{g(0)}(\mathbf{r},\boldsymbol{\Omega})$ and the convergence of fixed-point iterations of the type

$$L^g_{\mathrm{static}}\,\Phi^{g(m+1)}=S^g\,\Phi^{g(m)}+F^g(\mathbf{r},\boldsymbol{\Omega})=Q^{g(m)}(\mathbf{r},\boldsymbol{\Omega})\qquad(4.15)$$

where the source $Q^{g(m)}(\mathbf{r},\boldsymbol{\Omega})$ is updated at each iteration using the flux $\Phi^{g(m)}(\mathbf{r},\boldsymbol{\Omega})$. Because there is generally no up-scattering of neutrons in the fast groups, the groups are processed from the fastest to the most thermal (i.e., $g=1,\ldots,G$). In most solvers, the down-scattering sources are updated using the new flux just obtained from the previous groups. In some solvers, it is also possible to use an implicit form for the self-scattering source, i.e., the source is evaluated with the on-the-fly flux value in the same group; in other solvers, this can lead to an iterative process of its own. Finally, the up-scattering sources are updated for the next iteration. This method of solution, which is basically of Gauss–Seidel type in the energy groups, can be sped up using various acceleration schemes.
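As an illustration, here is a minimal sketch (Python/NumPy) of this Gauss–Seidel sweep in energy for the limiting case of an infinite homogeneous medium, where applying the inverse of $L^g_{\mathrm{static}}$ reduces to dividing by $\Sigma^g_t$. The implicit treatment of the self-scattering term discussed above is used; the names and data layout are our own.

```python
import numpy as np

def multigroup_source_iteration(sig_t, sig_s, F, tol=1e-10, max_it=500):
    """Fixed-source Gauss-Seidel iteration in energy (Eq. 4.15) for an infinite
    homogeneous medium. sig_s[g, gp] is the transfer cross section gp -> g;
    the self-scattering term is treated implicitly."""
    G = len(sig_t)
    phi = np.zeros(G)
    for _ in range(max_it):
        phi_old = phi.copy()
        for g in range(G):            # sweep from the fastest group downward
            down = sig_s[g, :g] @ phi[:g]              # already updated this pass
            up = sig_s[g, g + 1:] @ phi_old[g + 1:]    # previous-pass estimates
            phi[g] = (F[g] + down + up) / (sig_t[g] - sig_s[g, g])
        if np.max(np.abs(phi - phi_old)) <= tol * np.max(np.abs(phi)):
            break
    return phi
```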
Fission Effects
The fission source (generally assumed to be isotropic) can also depend on the flux. In conventional lattice calculations, where the fission spectrum does not depend on the incident neutron energy, the fission source takes the form

$$P^g\Phi=\frac{\chi^g(\mathbf{r})}{K}\sum_{g'}\nu\Sigma^{g'}_f(\mathbf{r})\int_{4\pi}d^2\Omega'\;\Phi^{g'}(\mathbf{r},\boldsymbol{\Omega}')\qquad(4.16)$$
where $K$ is an adjustable parameter to achieve criticality. Lattice codes are often interested in computing only the largest possible eigenvalue $K$, with the eigenvector associated with the neutron flux. To find this critical value, the inverse power method is often used:

$$\left\{L^g_{\mathrm{static}}-S^g\right\}\Phi^{(n+1)}=\frac{P^g\,\Phi^{(n)}}{k^{(n)}}\qquad(4.17)$$
where $\Phi^{(n)}$ and $k^{(n)}$ are the approximations obtained for the multigroup flux map and the eigenvalue at iteration $n$, respectively. Because Eq. 4.16 implies a sum over all groups, an outer iteration (that is, a complete loop in which all the energy groups are solved) is generally necessary to compute the new fission source. These remarks have led most developers to use a two-level solver strategy of the Gauss–Seidel type.
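The outer iteration of Eq. 4.17 can be sketched in the same infinite-medium setting (Python/NumPy). The two-group data below are invented and chosen so that the converged eigenvalue is exactly 1.

```python
import numpy as np

def k_inf_power_iteration(sig_t, sig_s, nu_sig_f, chi, tol=1e-12, max_it=1000):
    """Inverse power method of Eq. 4.17 for an infinite homogeneous medium:
    (diag(sig_t) - sig_s) phi^(n+1) = chi * (nu_sig_f . phi^(n)) / k^(n)."""
    A = np.diag(sig_t) - sig_s          # removal minus scattering, G x G
    phi = np.ones(len(sig_t))
    k = 1.0
    for _ in range(max_it):
        fission = nu_sig_f @ phi        # total fission production
        phi_new = np.linalg.solve(A, chi * fission / k)
        k_new = k * (nu_sig_f @ phi_new) / fission
        if abs(k_new - k) < tol:
            return k_new, phi_new / np.linalg.norm(phi_new)
        k, phi = k_new, phi_new
    return k, phi / np.linalg.norm(phi)

# two-group illustrative data (made-up numbers)
sig_t = np.array([0.20, 0.80])
sig_s = np.array([[0.18, 0.00],         # sig_s[g, gp] : group gp -> group g
                  [0.01, 0.60]])
nu_sig_f = np.array([0.005, 0.30])
chi = np.array([1.0, 0.0])
print(k_inf_power_iteration(sig_t, sig_s, nu_sig_f, chi)[0])   # -> 1.0
```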
In the following sections, we first review the steady-state transport solutions that are generally used in order to recover few-group nuclear properties for core calculations. The basic calculational unit is a spatial convex domain $D$ split into $I$ homogeneous disjoint regions, each having volume $V_i$, surrounded by an external boundary $\partial D$ that can be split into several surfaces:

$$D=\bigoplus_i V_i,\qquad \partial D=\bigoplus_\alpha S_\alpha\qquad(4.18)$$
For the sake of simplicity, we omit the energy dependence (or the group index) of the
variables in the following sections to concentrate on the angular and space variables
and to see how we can obtain solutions to a single group transport problem, as in
Eq. 4.12.
4.3.1.4 Discrete Ordinates
In the discrete ordinates method, the neutron transport equation is solved for a number of discrete directions $\boldsymbol{\Omega}_n$. The unit sphere is split into areas with weights $w_n=\triangle^2\Omega_n/4\pi$, and the discrete solid angles represent the directions where the transport solutions will be obtained. The scalar flux is approximated as follows:

$$\Phi(\mathbf{r})=\int_{4\pi}d^2\Omega\;\Phi(\mathbf{r},\boldsymbol{\Omega})\simeq\sum_n w_n\,\Phi_n(\mathbf{r})\qquad(4.19)$$

where $\Phi_n$ is the solution for the discrete ordinate $\boldsymbol{\Omega}_n$. For this fixed direction, the steady-state transport equation is integrated over the volume:

$$\int_D d^3r\;\nabla\cdot\bigl(\boldsymbol{\Omega}_n\,\Phi(\mathbf{r},\boldsymbol{\Omega}_n)\bigr)=\int_D d^3r\;\bigl[Q(\mathbf{r},\boldsymbol{\Omega}_n)-\Sigma_t(\mathbf{r})\,\Phi(\mathbf{r},\boldsymbol{\Omega}_n)\bigr]\qquad(4.20)$$
The divergence theorem is then applied to the left-hand term, leading to an equation representing the exact balance for the neutrons that stream through the volume:

$$\int_{\partial D}d^2r_b\;\boldsymbol{\Omega}_n\cdot\mathbf{N}_+\,\Phi(\mathbf{r}_b,\boldsymbol{\Omega}_n)=\int_D d^3r\;\bigl[Q(\mathbf{r},\boldsymbol{\Omega}_n)-\Sigma_t(\mathbf{r})\,\Phi(\mathbf{r},\boldsymbol{\Omega}_n)\bigr]\qquad(4.21)$$

where $\mathbf{N}_+$ is the unit outward normal and $\mathbf{r}_b\in\partial D$. The left-hand side is the flow across the boundary, and the right-hand side is the difference between the neutron sources, including secondary production, and the losses from collisions inside the domain.
To illustrate the basics of discrete ordinates solvers, let us choose a 3D domain split into small homogeneous cells $[x_i^-,x_i^+]\times[y_j^-,y_j^+]\times[z_k^-,z_k^+]$ with centers at $\mathbf{r}_{ijk}=(x_i,y_j,z_k)$, having volumes $V_{ijk}=\triangle x_i\,\triangle y_j\,\triangle z_k$ and a constant cross section $\Sigma_{ijk}=\Sigma_t(x_i,y_j,z_k)$. Select a direction of flight $\boldsymbol{\Omega}_n=(\mu_n,\eta_n,\xi_n)$ in the principal octant ($\mu_n>0$, $\eta_n>0$, $\xi_n>0$), and assume that the cell-centered flux is constant inside the volume and that the surface (or cell-edge) fluxes are also constant over each of the cell's six faces. The balance Eq. 4.21 can then be approximated by

$$\frac{\mu_n}{\triangle x_i}\,\triangle_x\Phi_n+\frac{\eta_n}{\triangle y_j}\,\triangle_y\Phi_n+\frac{\xi_n}{\triangle z_k}\,\triangle_z\Phi_n=Q_{ijk}-\Sigma_{ijk}\,\Phi_{n,ijk}\qquad(4.22)$$

where constant differences are taken for each coordinate:

$$\begin{aligned}\triangle_x\Phi_n&=\Phi_n(x_i^+,\cdot,\cdot)-\Phi_n(x_i^-,\cdot,\cdot)\\ \triangle_y\Phi_n&=\Phi_n(\cdot,y_j^+,\cdot)-\Phi_n(\cdot,y_j^-,\cdot)\\ \triangle_z\Phi_n&=\Phi_n(\cdot,\cdot,z_k^+)-\Phi_n(\cdot,\cdot,z_k^-)\end{aligned}\qquad(4.23)$$
A 3D Cartesian solver will proceed as follows: assuming that the incoming surface fluxes (associated with the $-$ sign) are already known for the neutrons entering the region, we march through $V_{ijk}$ following the neutron's direction of motion, $\boldsymbol{\Omega}_n$, to determine the local angular flux value $\Phi_{n,ijk}$ inside the volume and the outgoing surface fluxes (associated with the $+$ sign) for the neutrons going out of the volume. We still need some additional relationships to close the system, because there are four unknowns for each region $V_{ijk}$: $\Phi_{n,ijk}$ and the three outgoing fluxes. In the classical diamond difference scheme, we crudely assume that

$$\Phi_{n,ijk}=\frac{\Phi_n(x_i^+,\cdot,\cdot)+\Phi_n(x_i^-,\cdot,\cdot)}{2}=\frac{\Phi_n(\cdot,y_j^+,\cdot)+\Phi_n(\cdot,y_j^-,\cdot)}{2}=\frac{\Phi_n(\cdot,\cdot,z_k^+)+\Phi_n(\cdot,\cdot,z_k^-)}{2}\qquad(4.24)$$
but one may also use other difference schemes (step or weighted-diamond relationships; see Chapter 1). If the domain is composed of several Cartesian cells, each angle sweep is done starting from the domain's external boundary in the direction of neutron motion. In the case of the principal octant and using natural 3D cell numbering, the spatial calculation proceeds in each cell for increasing values of $i$, $j$, and $k$. The flux values propagate as a wave front in the 3D domain, much like a step-by-step Euler solver for ordinary differential equations. In curvilinear coordinate systems (spherical, cylindrical, etc.), the method can still be applied using conservative difference schemes. Newer schemes have been developed using a finite element treatment of the spatial variable.
Typical discrete ordinates quadrature sets (also called Sn quadrature) are chosen
to preserve the maximum number of angular moments. The order of the Legendre
expansion of the scattering kernel, as defined in Section 4.3.1, implies the use of a
minimal number of angles to compute the angle-dependent sources.
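Before moving on, here is a minimal sketch (Python/NumPy) of one such sweep for a single direction of the principal octant on a uniform Cartesian mesh; it combines the balance of Eq. 4.22 with the diamond closure of Eq. 4.24. The function and argument names are ours, and the boundary-flux layout is one arbitrary choice among many.

```python
import numpy as np

def dd_sweep_octant(Q, sig_t, dx, dy, dz, mu, eta, xi, bc_x, bc_y, bc_z):
    """One diamond-difference sweep for a discrete ordinate (mu, eta, xi > 0).

    Q, sig_t : (nx, ny, nz) cell sources and total cross sections
    bc_x     : (ny, nz) fluxes entering through the x- face of the domain
    bc_y     : (nx, nz) fluxes entering through the y- face
    bc_z     : (nx, ny) fluxes entering through the z- face
    Returns the cell-centered angular fluxes (nx, ny, nz).
    """
    nx, ny, nz = Q.shape
    psi = np.zeros_like(Q)
    cx, cy, cz = 2.0 * mu / dx, 2.0 * eta / dy, 2.0 * xi / dz
    fx = bc_x.copy()                        # flux entering cell column (j, k)
    for i in range(nx):
        fy = bc_y[i].copy()                 # flux entering row k at this i
        for j in range(ny):
            fz = bc_z[i, j]                 # flux entering cell (i, j, 0)
            for k in range(nz):
                # balance (4.22) + diamond closure (4.24) solved for the cell flux
                p = (Q[i, j, k] + cx * fx[j, k] + cy * fy[k] + cz * fz) \
                    / (sig_t[i, j, k] + cx + cy + cz)
                psi[i, j, k] = p
                # outgoing faces (2*p - incoming) feed the downstream cells
                fx[j, k] = 2.0 * p - fx[j, k]
                fy[k] = 2.0 * p - fy[k]
                fz = 2.0 * p - fz
    return psi
```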
4.3.1.5 Method of Characteristics
In the method of characteristics (MOC), the differential form of the Boltzmann equation is used [6]. This differential equation is solved by integrating along the characteristics of the differential operator, which correspond to the tracking lines. First, the streaming operator $\boldsymbol{\Omega}\cdot\nabla$ in Eq. 4.12 is expressed as a derivative along the neutron's direction of motion, $\boldsymbol{\Omega}$. This leads to the following equation:

$$\frac{d}{ds}\,\psi(\mathbf{r}+s\boldsymbol{\Omega},\boldsymbol{\Omega})+\Sigma_t(\mathbf{r}+s\boldsymbol{\Omega})\,\psi(\mathbf{r}+s\boldsymbol{\Omega},\boldsymbol{\Omega})=Q(\mathbf{r}+s\boldsymbol{\Omega},\boldsymbol{\Omega})\qquad(4.25)$$

Assume a starting point $\mathbf{r}=\mathbf{r}_0$ (e.g., located on the external boundary $\partial D$); $s$ is then the distance measured from this point along the characteristic line, with prolongation in the neutron streaming direction. For one line segment of length $L$ and constant properties, we may integrate this last equation along the line and obtain

$$\psi_+=\psi_-\,e^{-\Sigma_t L}+\frac{Q}{\Sigma_t}\left(1-e^{-\Sigma_t L}\right)\qquad(4.26)$$

where $\psi_-=\Phi(\mathbf{r}_0,\boldsymbol{\Omega})$ is the inward value of the angular flux at the beginning of the line segment, $\psi_+=\Phi(\mathbf{r}_0+L\boldsymbol{\Omega},\boldsymbol{\Omega})$ is the outward value at the line's ending point, and $Q$ is approximated by a constant value. A single characteristic line is an ordered collection of such line segments $L_k$ crossing different region numbers $N_k$. The outward flux of one segment also serves as the inward flux of the next segment. If the inward flux for the first segment of a characteristic line is known, a recursive segment-by-segment calculation can be done to compute the outward fluxes and the segment-averaged fluxes $\bar{\psi}_k$ according to the properties of the region traversed by each segment $k$.
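This recursion is compact enough to sketch directly (Python/NumPy). The segment-averaged flux follows from integrating the analytic solution over the segment, $\bar{\psi}_k=Q/\Sigma_t+(\psi_--\psi_+)/(\Sigma_t L_k)$; the function and argument names are our own.

```python
import numpy as np

def trace_characteristic(psi_in, lengths, regions, sig_t, q):
    """Segment-by-segment recursion along one characteristic line (Eq. 4.26).

    lengths : array of segment lengths L_k
    regions : array of region indices N_k crossed by the segments
    sig_t, q: per-region total cross sections and flat sources
    Returns the outgoing flux and the segment-averaged fluxes psi_bar_k.
    """
    psi_bar = np.zeros(len(lengths))
    psi = psi_in
    for k, (L, n) in enumerate(zip(lengths, regions)):
        st, src = sig_t[n], q[n] / sig_t[n]
        psi_out = src + (psi - src) * np.exp(-st * L)   # transmitted flux
        psi_bar[k] = src + (psi - psi_out) / (st * L)   # segment average
        psi = psi_out                                   # feeds the next segment
    return psi, psi_bar
```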
Characteristics multigroup solvers have been used since the 1970s to obtain accurate solutions for two-dimensional (2D) lattice cell problems. Nowadays, with greater computing resources, extensions of this method to 3D solvers are already in the development and benchmarking stages. The main advantage of MOC compared to the many other deterministic transport methods is that the geometric description is rather flexible. It is possible to consider the use of constructive solid geometry packages to define the : MeshGeom object of Fig. 4.5. Assuming a 3D lattice split into homogeneous regions and changing the phase-space variables, the average flux $\Phi_j$ can be expressed by

$$V_j\,\Phi_j=\int_{V_j}d^3r\int_{4\pi}d^2\Omega\;\Phi(\mathbf{r},\boldsymbol{\Omega})=\int d^4T\int_{-\infty}^{+\infty}dt\;\mathbb{1}_{V_j}(T,t)\,\Phi(\mathbf{r}_T+t\boldsymbol{\Omega},\boldsymbol{\Omega})=\int d^4T\sum_k\delta_{jN_k}\,L_k\,\bar{\psi}_k\qquad(4.27)$$
where the characteristic line $T$ is determined by its orientation (solid angle $\boldsymbol{\Omega}$) along with a reference starting point $\mathbf{r}_T$ for the line. To cover the domain, Monte Carlo codes typically use pseudorandom number generators, as elaborated in Chapter 3. In deterministic MOC codes, a quadrature set of solid angles is selected ($S_n$ quadrature sets can be used) and the starting point is chosen by scanning the plane perpendicular to the selected direction. Hence, in the second form of Eq. 4.27, $d^4T$ is composed of a solid angle element times the corresponding perpendicular plane element. In the above, the variable $t$ refers to the local coordinate on the tracking line, and the function $\mathbb{1}_{V_j}(\cdot,\cdot)$ is defined as 1 if the tracking line segment passes through region $j$ and 0 otherwise. The last form of Eq. 4.27 is of particular importance in MOC solvers; it states that the average flux over a volume $V_j$ is made up of all the local segment contributions. Note that the $k$ summation runs over all the segments of a characteristic line; however, only the contributions of the segments crossing region $j$ are added together, by virtue of the Kronecker delta $\delta$.
A generic lattice MOC iterative solver may proceed as follows: after estimation of the neutron sources, the outward angular flux at the external boundary of the domain is summed on every surface in order to keep the outward current components, which serve as inward currents for the next inner iteration. It is also possible to use cyclic characteristics on lattice domains; in this case, the boundary conditions are directly embedded into the periodic tracking procedure, allowing factorization of the infinite periodic source contribution to the local segment flux. Once all the segment-averaged flux values are obtained, Eq. 4.27 is used to compute the new flux distribution, which serves for updating the neutron sources of the next iteration.
Note: Unfortunately, the estimated (numerically computed) volumes do not generally agree with the true volumes of the regions. In most deterministic MOC codes, the segment lengths are therefore renormalized to preserve the true volumes; this is done by multiplying each segment length $L_k$ by an angle-dependent factor $V_j/V'_j(\boldsymbol{\Omega}_n)$, where $V'_j(\boldsymbol{\Omega}_n)$ is the numerically estimated volume.
4.3.1.6 Collision Probability Method
The integral form of the transport equation is obtained after integration of the streaming operator $\boldsymbol{\Omega}\cdot\nabla$ along the neutron traveling direction. Let us first express Eq. 4.12 assuming an actual position $\mathbf{r}$ for the neutron:

$$-\frac{d}{ds}\,\psi(\mathbf{r}-s\boldsymbol{\Omega},\boldsymbol{\Omega})+\Sigma_t(\mathbf{r}-s\boldsymbol{\Omega})\,\psi(\mathbf{r}-s\boldsymbol{\Omega},\boldsymbol{\Omega})=Q(\mathbf{r}-s\boldsymbol{\Omega},\boldsymbol{\Omega})\qquad(4.28)$$

Equations 4.25 and 4.28 are similar; however, the latter equation looks back along the neutron trajectory. After multiplying by the integrating factor $e^{-\tau_s}=\exp\bigl\{-\int_0^s dt\,\Sigma_t(\mathbf{r}-t\boldsymbol{\Omega})\bigr\}$, we integrate backward along the neutron path up to a boundary point $\mathbf{r}_b$ and obtain

$$\Phi(\mathbf{r},\boldsymbol{\Omega})=\Phi(\mathbf{r}_b,\boldsymbol{\Omega})\,e^{-\tau_b}+\int_0^{s_b}ds\;Q(\mathbf{r}-s\boldsymbol{\Omega},\boldsymbol{\Omega})\,e^{-\tau_s}\qquad(4.29)$$
Assuming flat isotropic sources, $Q(\mathbf{r},\boldsymbol{\Omega})=(Q_i/4\pi)\,\mathbb{1}_{V_i}(\mathbf{r})$, and that the integration range covers an infinite lattice, the average scalar flux inside region $j$ can be computed as

$$V_j\,\Phi_j=\int_{V_j}d^3r\int_{4\pi}d^2\Omega\;\Phi(\mathbf{r},\boldsymbol{\Omega})=\int_{V_j}d^3r\int_{4\pi}d^2\Omega\int_0^\infty ds\;Q(\mathbf{r}-s\boldsymbol{\Omega},\boldsymbol{\Omega})\,e^{-\tau_s}=\sum_i p_{ij}\,V_i\,Q_i\qquad(4.30)$$

where the matrix coupling coefficients in the third form of Eq. 4.30 are defined by

$$p_{ij}=\frac{1}{4\pi\,V_i}\int_{V_j}d^3r\int_{V_i}d^3r'\;\frac{\exp(-\tau_s)}{s^2}\qquad(4.31)$$

with $s=|\mathbf{r}-\mathbf{r}'|$.
Equation 4.30 is commonly referred to as the Peierls equation. The lattice CP matrices obey the reciprocity relation $V_i\,p_{ij}=V_j\,p_{ji}$, and the conservation of neutrons forces the sum of the normalized probabilities $P_{ij}=\Sigma_{t,j}\,p_{ij}$ over all regions $j$ to be one.

Fig. 4.6 Change of variables for 3D tracking [diagram: the double volume element $d^3r\,d^3r'$ is decomposed into a solid angle element $d^2\Omega$, a planar element $d^2s$ on the plane $\Pi_\Omega$ perpendicular to $\boldsymbol{\Omega}$, and length elements $dt$, $dt'$ along the tracking line]
CP solvers can use the same tracking files as MOC solvers, since a change of variables allows the calculation to be performed using the formula

$$p_{ij}=\frac{1}{4\pi\,V_i}\int_{4\pi}d^2\Omega\int_{\Pi_{\Omega}}d^2s\int dt\int dt'\;\mathbb{1}_{V_i}\,\mathbb{1}_{V_j}\,\exp(-\tau_{ij})\qquad(4.32)$$

where $\Pi_{\Omega}$ is the plane perpendicular to the solid angle $\boldsymbol{\Omega}$. Figure 4.6 shows how this change of variables is done in order to translate the volume integration into characteristic-like tracking. Assuming that $d^2s$ is a planar element of the plane $\Pi_{\Omega}$, the transformation of the coordinates of Eq. 4.31 into the local coordinates of the tracking lines in Eq. 4.32 is

$$d^3r\;d^3r'=d^2\Omega\;d^2s\;dt\;dt'$$
The last two integrations are over line segments of the characteristics and can be
done analytically. Under the condition that the segment lengths are renormalized to
preserve the true volumes, it can be shown that results obtained by CP and MOC are
equivalent if both use the same tracking data.
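These reciprocity and conservation properties are easy to check by brute force. The following minimal sketch (Python/NumPy) estimates first-collision probabilities by direct sampling in a hypothetical two-region 1D slab with vacuum boundaries, then verifies $V_i\,p_{ij}=V_j\,p_{ji}$ with $p_{ij}=P_{ij}/\Sigma_{t,j}$; all data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
widths = np.array([1.0, 2.0])      # region widths, cm (illustrative)
sig_t = np.array([0.5, 1.0])       # total cross sections, 1/cm (illustrative)
edges = np.concatenate(([0.0], np.cumsum(widths)))

def first_collision_region(n_samples, i):
    """Sample the first-collision region for neutrons born uniformly and
    isotropically in region i: a brute-force estimate of P_ij."""
    counts = np.zeros(len(widths))
    for _ in range(n_samples):
        x = rng.uniform(edges[i], edges[i + 1])   # birth site
        mu = rng.uniform(-1.0, 1.0)               # isotropic direction cosine
        tau = rng.exponential(1.0)                # optical path to collision
        while True:                               # march region by region
            j = np.searchsorted(edges, x, side="right") - 1
            if j < 0 or j >= len(widths):
                break                             # leaked through a boundary
            xb = edges[j + 1] if mu > 0 else edges[j]
            dtau = sig_t[j] * abs(xb - x) / abs(mu)
            if tau < dtau:
                counts[j] += 1                    # collided in region j
                break
            tau -= dtau
            x = xb + np.copysign(1e-12, mu)       # nudge across the interface
    return counts / n_samples

P0 = first_collision_region(200_000, 0)
P1 = first_collision_region(200_000, 1)
# reciprocity check: V_0 * p_01  vs  V_1 * p_10 (should agree to MC noise)
print(widths[0] * P0[1] / sig_t[1], widths[1] * P1[0] / sig_t[0])
```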
Note: In the 1980s, CP methods were very popular because they could be used to collapse the tracking information into energy-dependent region-to-region matrices. Nowadays, MOC is preferred in most modern lattice codes because its complexity is linear, anisotropic effects are easier to integrate and, last but not least, present-day computer resources (memory and hard disk space) have grown sufficiently to accommodate its computational load. Although the CP method is now deprecated, it is interesting to see how multigroup solvers can be expressed assuming the existence of these matrices.
4.3.1.7 Bn Solutions and Diffusion Coefficients
In this section, we study more closely the total leakage phenomenon and how it can
be accounted for using specially devised diffusion coefficients. Consistent Bn solutions are used to obtain homogenized nuclear parameters. The flux is first factorized
into a global form and a lattice-periodic fine-structure flux
ˆ.r; / D exp.iB r/'.r; ; B/
(4.33)
Critical buckling search consists of finding the vector Beff having a minimal modulus in a given orientation, such that the lattice is critical. A generic fundamental
lattice mode consists of a fine-structure solution over all directions on the critical
ellipsoid. Provided that the buckling vector complies with Corngold’s inequality
jIm.B /j < †
(4.34)
a critical buckling search is possible [7]. This inequality implies that the separation
of the global form is not always possible in a subcritical cell, in particular, when
void slits occur in some regions. When this method is applied to homogeneous media problems, the buckling value b is searched for the one-dimensional (1D) planar
transport equation. The factorization in Eq. 4.33 is combined with a Legendre polynomial expansion and anisotropic collision probabilities can be computed by
Z
Al 0 l
"
0
.1/l Cl
1
C
D
d Pl 0 ./Pl ./
1 C ib
1 C ib
0
0
1
1
.1/l Cl
Pl 0
Ql
for l 0 l
D
ib
ib
ib
1
#
(4.35)
where Pl and Ql are the usual Legendre functions of the first and second type,
respectively. If the transfer cross section is expanded up to Legendre order L, a
system of equations is obtained for the angular flux moments in the homogeneous
medium:
L
X
.2l C 1/Al 0 l †s;l l
†t l 0 D
lD0
4
Reactor Core Methods
187
For this systemˇ to have a nontrivial solution,
ˇ the following characteristic equation
must be solved ˇ†t ıl 0 l .2l C 1/Al 0 l †s;l ˇ D 0. This procedure can be shown to be
equivalent to the so-called dispersion equation used to compute the discrete eigenvalues of the transport equation [4].
Probably the best-known model is the so-called B1 homogeneous model, which
attempts to solve these equations in a cell-equivalent infinite medium when considering linear anisotropy. Under the assumptions that absorption is weak and that the
lattice has internal symmetries, it is possible to define axial-dependent diffusion coefficients to account for axial streaming effects. The most general form that can be
obtained to get rid of the angle and collapse the transport problem into a diffusion
form involves a tensor
J g .r/ D Dg .r/ r g .r/
where J g .r/ is the neutron current and Dg .r/ is the diffusion matrix. Because the
critical buckling vector length does not vary very much with direction (i.e., the critical ellipsoid being almost a sphere), the directional-dependent diffusion coefficients
are seldom needed in everyday core calculations where nondirectional diffusion coefficients are used. The only significant exception is a lattice cell with large voiding
where neutrons can travel a very long way [7].
4.3.1.8 Lattice Solvers
In lattice calculations, the boundary conditions are generally taken so that neutrons
do not leak from the system comprised of a single, or a few assemblies. Neutrons
arriving at the external boundaries of the lattice domain may change their geometric
state by a simple transformation in the phase space .rb ; / like:
Specular (mirror-like) reflection: ˆ.rb ; / D ˆ.rb ; 2. Nb /Nb /
Periodic or translation: ˆ.rb ; / D ˆ.rb0 ; /
where rb and rb0 are on @D and Nb is the outward normal on @D. In these cases, the
external boundary does not have to be discretized into surfaces. In many problems,
however, the external boundary is split into surfaces as in Eq. 4.18, and there is a
homogenization process that takes place on each of these surfaces. Neutrons arriving
at an external surface @D may forget (partly or totally) their directions of motion.
Neutron currents leaving surface S˛ are defined by
Z
Z
d 2 rb
J˛;C D
d 2 Nb;C ˆ.rb ; /
(4.36)
Nb;C >0
S˛
and will be inserted back into the system with incoming currents
Z
J˛; D
Z
d 2 j Nb;C jˆ.rb ; /
2
d rb
S˛
Nb;C <0
(4.37)
188
R. Roy
either at the same surface (that is the case of isotropic reflection) J˛; D J˛;C or at
another surface of the system Jˇ; D J˛;C . If white boundary conditions are applied,
the outgoing neutron currents are returned back into the
P lattice as isotropic incoming
Tˇ ˛ J˛;C . The orthogonality
currents using an orthogonal transformation Jˇ; D
˛
of the transformation ensures that no neutron leaks from the lattice.
Let us return to Eqs. 4.13–4.17 where we introduced the generic multigroup
solver. Infinite lattice calculations can be done using the generic solver in order
to get eigenvalues K D k1 for different configurations. Different search capabilities
are usually programmed in lattice codes. For example, a neutron leakage model may
be introduced to represent how the finite physical reactor core interacts with the embedded lattice. Assuming a homogeneous macroscopic flux distribution, solution of
r r‰ C B B‰ D 0 for a buckling vector B, a critical buckling search can be
done to adjust the buckling length jBjeff to yield K D keff D 1. To find the critical
buckling, there are several possibilities, but the two main branches assume basically
that a new absorption term is included [8]:
On the left-hand side of Eq. 4.12 using a transformed static transport operator of
the form (or some approximation of this):
Lgstatic ˆ ŒLstatic j†gt .r/CD g .r/B 2 ˆ
(4.38)
On the right-hand side of Eq. 4.12 by subtracting the absorption term from the
diagonal of the scattering matrix:
S g ˆ ŒS j†g
t
g0
0
0
.r/Cı gg D g .r/B 2
ˆ
(4.39)
where D g .r/ is the diffusion coefficient for group g that will be discussed later.
Finally, in cases where a critical period ˛eff has to be found, a similar absorption
correction term appears as a transformation of the transport operator
Lgstatic ˆ ŒLstatic j†gt .r/C˛= vg ˆ
(4.40)
and iterations are done in order to obtain the critical period.
Discrete ordinates, characteristics, and collision probabilities methods are similar in many aspects. The : MeshGeom object will be numerically tracked after
choosing a set of angles. In trajectory-based transport calculations, numerical values for volumes and surfaces strongly depend on the number of segments used.
Consider Fig. 4.7, where a triangular stretched region is represented in bold. On the
left-hand side, the tracking angle is nearly parallel to the main stretching axis. On
the right-hand side, the same region is tracked with another angle (nearly perpendicular to the stretching axis). On the left, the segment length crossing the region on
the middle track must be normalized to represent the real volume seen by the neutron (illustrated as a gray rectangle); the two other tracks do not even see the region.
The normalization to preserve volume is huge and significantly disturbs the characteristics line. However, on the right, the normalization of segment length has very
4
Reactor Core Methods
189
Fig. 4.7 Stretched volumes in trajectory-based methods
1960
1966
THERMOS WIMS
1973
1977
APOLLO CPM
1983
CASMO
1991
1995
DRAGON ECCO
1960
Today
2007
Fig. 4.8 Milestones for introduction of new lattice codes
little effect. Lattice-based results are better when a coherent scheme is taken to mesh
the basic geometry into elements with sound tracking data in order to represent the
neutron’s physical behavior.
4.3.1.9 Putting It All Together into Lattice Codes
Since the 1960s, different lattice codes have been developed. Most of these codes
have been maintained for several years, and their features have evolved with the
available computing resources. Figure 4.8 gives a few milestone indications of when
some codes have appeared in time. Short descriptions are now provided:
THERMOS, originally developed at Brookhaven National Laboratory, was
mostly a code devoted to neutron thermalization [9].
WIMS, developed at Winfrith laboratory, England, was the first full-feature lattice
code [10]. It is currently available under version WIMS-9.
APOLLO, developed at CEA Saclay, France, included interesting features
like resonance treatment and homogeneous B1 treatment embedded in the
flux solver [11]. The present version is APOLLO-2.
Both CPM and CASMO, developed at EPRI and Studsvik Scandpower, integrate
most features necessary for LWR cores [12]. The present version CASMO-4 is
widely used in the USA.
DRAGON, developed at École Polytechnique de Montréal, is an open-source
lattice code with many features pertinent to (but not limited to) CANDU reactors [13]. The present major number release is DRAGON-3.
ECCO, developed mainly at CEA Cadarache, is a European cell code with many
features pertinent to (but not limited to) fast reactors [14]. It is integrated into the
ERANOS-2 code system.
190
R. Roy
TRITON, developed at Oak Ridge National Laboratory, is a high-fidelity lat-
tice code based on the 2D extended step characteristic Sn transport module
NEWT, the continuous-energy resonance processing module CENTRM, and the
ORIGEN-S depletion/decay package [15]. The current version is part of the
SCALE 5.1 package.
4.3.2 Homogenization Process
This section presents some aspects of the homogenization problems in reactor
physics. For obtaining a homogenized few-group zone with neutron properties totally equivalent to a heterogeneous medium, Koebke [16] postulates that two kinds
of conservation should be fulfilled:
1. Volume-related conservation: the integral flux and the integral reaction rates
must be conserved in the homogenized zone.
2. Surface-related conservation: the integral net currents and the integral fluxes
must be conserved at each interface of the homogenized zone.
When these two conditions are fulfilled, the heterogeneous cell medium is perfectly
well represented by the homogenized equivalent zone. To enforce these conservation relations, lattice solvers are expected to preserve volumes and surfaces of
components in the first place.
Embedding each homogenized zone inside a realistic core application composed of several levels of heterogeneity would imply using a multigrid approach
where each different local heterogeneous cell configuration would react to its interface boundary conditions. Using this approach, each cell node is represented by a
response matrix where its output flux values depend on input at interfaces, while
preserving significant integral reaction rates.
Unfortunately, once these “exact” nodes are coupled together, the number of unknowns is as large as the one generated by the transport equation discretized over
the reactor core. In this section, we will review some classical approximations that
are used to reduce the number of spatial unknowns in core calculations.
4.3.2.1 Reaction Rates and Homogenized Cross Sections
A generic formula for cell homogenization is based upon the following formula for
computing reaction rates in a volume
˝
˛
R D † ; ˆ D
Z
Z
3
d 2 † .r; /ˆ.r; /
d r
V
(4.41)
4
It is sometimes possible to use the lattice adjoint calculation in order to take into
account important local effects, such as detector response. Knowing the sources Q,
4
Reactor Core Methods
191
˝
˛
the reaction rates are simply given by Rd D ˆ ; Q with the appropriate importance function ˆ . A homogenization process that attempts to conserve these
weighted reaction rates must ensure that
E
˝
˛ D
c ; b̂
R D † ; ˆ D †
(4.42)
where the hat symbol b
: is used for the homogeneous equivalent zone. Under angular isotropy assumptions, the regular form of this equation asserts that a uniform
homogenized cross section over a cell could be defined by
c
†
Z
d r b̂.r/ D
Z
3
V
d 3 r † .r/ˆ.r/
(4.43)
V
When the integral flux value is preserved in the homogeneous equivalent, the homogenized cross section is simply the flux-volume average of the heterogeneous
reaction rate. If the integrated flux in the homogeneous equivalent is not known,
correction factors may be applied to flux-volume average values to preserve reaction rates and integral values:
Z
d 3 r † .r/ˆ.r/
c D † D V Z
†
(4.44)
d 3 r ˆ.r/
V
Similar reasoning could be applied to separate streaming effects in the volume into
their contribution to currents at different interfaces while preserving
Z
Z
2
2
d rb
@V
d Nb;C
3/ D
ˆ.r;
4
Z
Z
d 2 Nb;C ˆ.r; /
2
d rb
4
@V
which can be translated into a current formulation
X
X
.J˛;C J˛; / D
.J˛;C J˛; /
5
˛
(4.45)
(4.46)
˛
A homogenizer (reactor physicist on task) must be aware of what is needed: here,
only the net neutron flow is preserved; in infinite lattice calculations (with no neutron leakage model), this means zero equals zero. However, this crude balance may
not preserve the surface currents at different interfaces. The lattice could be rotated
and there would be no visible effect for the homogenized equivalent. In the case of
completely symmetric unit cells, a single outer boundary surface (a white boundary
on @D), which behaves exactly the same way for neutrons of any direction entering
the volume and conservation like Eq. 4.41 may be sufficient. Using a leakage model,
this behavior can sometimes be represented using uniform diffusion coefficients.
192
R. Roy
Unfortunately, the true unit cell is embedded in the reactor core and currents on
each interface are probably different, even when the basis lattice geometry is totally
symmetric.
4.3.2.2 Generalized Equivalence Theory and Discontinuity Factors
Another important issue in multigroup equivalence theory is the neutron behavior at cell interfaces (inter-assembly effects). In order to deal with interfaces in
coarse problems, the classical approach generally assumes neutron current continuity. In the process of converting a large-scale transport problem into a diffusion-like
problem, axial diffusion coefficients can give some degrees of freedom to conserve
transverse leakage effects. However, if we add the last constraint of preserving also
the surface fluxes, additional degrees of freedom not usually encountered in purely
diffusive problems are needed.
This supplementary constraint yields the well-known homogenization paradox:
in the process of defining an equivalent homogeneous medium, some properties get
lost. The generalized equivalence theory proposes a framework for obtaining equivalent reactor core solutions. At each interface S˛ D VI \VJ between zones I and J ,
the surface fluxes on the two sides of the interface are allowed to be discontinuous
in the homogenized domain. Discontinuity factors are introduced on each side of
the interface [17]:
C
f˛;J
I
D
ˆ˛;side I
2
ˆ˛;side I
; f˛;I
J
D
ˆ˛;side J
2
ˆ˛;side J
(4.47)
and the core-level solution assumes that neutrons crossing interfaces respect the
discontinuity.
Note: Work on homogenization theory and neutron leakage is certainly not exhaustively covered in this presentation. Many variations regarding the issue of
defining homogenized parameters have evolved from the work of Behrens in the late
1950s to modern advanced nodal methods. In recent years, there have been significant achievements in heterogeneous whole-core analyses without homogenization
or group condensation (e.g., see [18]).
4.4 Reactor Core Solvers
Power-plant reactor core calculations are usually done with few energy groups.
Many approximations are done to condense and homogenize the lattice results, but
the angular dependence is also important. Basically, approximations made over the
angular variable serve as the basic glue joining together the core regions. In this
section, some methods used to obtain the flux distribution over the core will be
presented.
4
Reactor Core Methods
193
4.4.1 Pn Approximations and Diffusion
In isotropic media, the angular transfer operator depends on the angular deflection
0 D 0 resulting after a neutron scattering collision. The scattering cross section
is generally expanded up to Legendre order L in anisotropy [8]
†s .r; 0 / D
L
X
2l C 1
†s;l .r/Pl . 0 /
4
(4.48)
lD0
where Pl are the usual Legendre polynomials. In order to determine the scattering source in Eq. 4.14, the flux must also be expanded into spherical harmonics
components
L
Cl
X
2l C 1 X
(4.49)
ˆ.r; / D
lm .r/Rlm ./
4
lD0
mDl
where Rlm are orthogonal real spherical harmonics. By the use of the addition
theorem of spherical harmonics, it is possible to rewrite the neutron sources in a
truncated form
Q.r; / D
L
Cl
X
˚
2l C 1 X
Rlm ./ †s;l .r/
4
lD0
lm .r/
C flm .r/
(4.50)
mDl
where flm are the expansion coefficients of the source terms of Eq. 4.14.
4.4.1.1 Spherical Harmonics and the Even-Parity Transport Equation
To take into account the angular variable, the even and odd components of the
angular flux are often separated. Assuming an isotropic source, the steady-state onegroup transport problem described in Eq. 4.12 can be transformed into:
(
rˆ .r; / C †.r/ˆC .r; / D
C
1
4
Q.r/
rˆ .r; / C †.r/ˆ .r; / D 0
(4.51)
where ˆC .r; / D Œˆ.r; /Cˆ.r; / =2 and ˆ .r; / D Œˆ.r; /ˆ.r; / =2
are respectively the even and odd components of the angular flux. Extracting the
odd component from the second equation, the even-parity form of the transport
equation is
r
1
1
rˆC .r; / C †.r/ˆC .r; / D
Q.r/
†.r/
4
(4.52)
194
R. Roy
Assume that the spherical harmonics approximation of Eq. 4.49 is written using
scalar products ˆC .r; / D e./ ®C .r/ and ˆ .r; / D o./ ® .r/. The
even-parity spatial equations now reduce to a set of second-order differential
equations:
1
EET r®C .r/ C †.r/®C .r/ D q.r/
(4.53)
r †.r/
where EET D
R
d 2 e./eT ./ represents the even-component angular cou-
4
pling. To enforce continuity conditions at interfaces between neighboring regions,
the odd-parity components are also defined
® .r/ D where OT D
R
1
r OT ®C .r/
†.r/
(4.54)
d 2 o./eT ./. Following these angular approximations,
4
a complete spherical harmonics solver can be obtained from the spatial discretization of Eq. 4.53 with orthogonal functions correctly matching the homogeneous
regions inside the domain, with Lagrange multipliers coupling the odd-parity interface components of Eq. 4.54. For anisotropic scattering or sources, the formulation
is also possible, but a little more tedious [19]. For the sake of simplicity, simpler numerical schemes based on the diffusion approximation will be presented in
this chapter. The reader is referred to Chapter 2 of this book for a more detailed
presentation on spherical harmonics solvers.
4.4.1.2 The Diffusion Approximation
In the context of reactor codes, the P1 expansion is well known and directly related
to diffusion theory. In that case, the angular flux takes a simple form
ˆ.r; / D
1
f .r/ C 3 J.r/g
4
(4.55)
Using the generic multigroup formulation, the integration of Eqs. 4.13 and 4.14 in
angle gives
r:J g .r/ C †gt .r/
g
.r/ D
G
X
g 0 D1
†gs;0
g0
.r/
g0
.r/ C f0g .r/
(4.56)
4
Reactor Core Methods
195
Now, assuming isotropic fission sources, another equation to close the multigroup
system can be obtained after multiplying by the angle and integrating once more
1
r
3
g
G
X
.r/ C †gt .r/J g .r/ D
g0
†gs;1
0
.r/J g .r/
(4.57)
g 0 D1
A multigroup P1 solver employs these equations. Most of the core simulations are
probably still done using diffusion codes. Equations 4.56 and 4.57 can be combined
together into an elliptic form using the diffusion coefficient D g
r .D g .r/r
g
.r// C †gt .r/
g
G
X
.r/ D
g0
†gs;0
.r/
g0
.r/ C f0g .r/ (4.58)
g 0 D1
Several recipes are possible for defining D g :
g
If †s;1
g0
.r/ D 0, use recipe
1
3†gt .r/
D g .r/ D
g
If †s;1
g0
0
.r/ D ı gg †gs;1 g .r/, use recipe
D g .r/ D
g
If macro-reversibility †s;1
g0
1
3.†gt .r/ †gs;1 g .r//
0
0
.r/J g .r/ D †gs;1
g
.r/J g .r/, use recipe
1
D g .r/ D
3
†gt .r/
G
P
g 0 D1
!
0
†gs;1 g .r/
and so on
Transport-corrected cross sections were often used in older lattice codes (limited
to isotropic sources and scattering) to account for linear anisotropy. In the second
definition of D g , the static-corrected transport operator of Eq. 4.12 can be taken as
Lgstatic ˆ ŒLstatic j†g .r/†g
t
s;1
g
.r/
ˆ
consistent with the scattering operator defined as
S g ˆ ŒS j†g
s;0
g0
0
g
.r/Cı gg †s;1
g
.r/
ˆ
196
R. Roy
4.4.1.3 SPn and Improved Diffusion
In 3D geometries, the Pn (or Bn ) equations are quite complicated to implement due
to the large number of unknowns (in fact, there are .n C 1/2 coupled equations
in 3D). At the beginning of the 1960s, Gelbard introduced the simplified spherical
harmonics equations [20]. These simplified equations were historically derived after
some algebraic reductions that are formally exact in planar geometry. In 1D planar
geometry, the P3 expansion for the angular flux is the following:
1
ˆ.x; / D
0 .x/ C 3 1 .x/
2
2
2
5 3
3 1
.x/
C
7
.x/
(4.59)
C5
2
3
2
2
The corresponding P3 equations coupling the flux moments together are:
8
d
ˆ
ˆ
D S.x/
ˆ
1 .x/ C †r;0 0 .x/
ˆ
dx
ˆ
ˆ
ˆ
d
d
ˆ
ˆ
< 2
2 .x/ C
0 .x/ C 3 †r;1 1 .x/ D 0
dx
dx
ˆ
d
d
ˆ
ˆ
3
ˆ
3 .x/ C 2
1 .x/ C 5 †r;2 2 .x/ D 0
ˆ
ˆ
dx
dx
ˆ
ˆ
d
:̂
3
D 0
2 .x/ C 7 †r;3 3 .x/
dx
(4.60)
where †r;n D †t †s;n . Elimination of the odd-order moments is easy and the two
remaining equations can be written as:
8
ˆ
ˆ
<
d
1 d
f 0 .x/ C 2 2 .x/g C †r;0 0 .x/ D S.x/
dx 3†r;1 dx
9
d
2
ˆ d
f†r;0 0 .x/ S.x/g
:̂
2 .x/ C †r;2 0 .x/ D
dx 35†r;3 dx
5
(4.61)
Using an ad hoc procedure to obtain the multidimensional simplified P3 equations,
the derivatives of the odd moments are replaced by divergence operators and the
derivatives of the even moment by gradient operators. Equation 4.60 is then transformed into:
8
1
ˆ
ˆ
r. 0 C 2 2 / C †r;0 0 D S
< r 3†r;1
(4.62)
9
2
ˆ
r 2 C †r;2 2 D f†r;0 0 S g
:̂ r 35†r;3
5
These two multidimensional equations look like a two-group diffusion problem.
For a reactor core problem using G energy groups, there will be 2G such equations to solve; this is far more tractable than the full P3 formulation where there
would be 16G coupled equations. Moreover, the actual form of Eq. 4.62 being
4
Reactor Core Methods
197
very similar to the usual diffusion equations, all the numerical methods suitable
for diffusion problems can generally be extended easily to tackle these equations.
In general, .n C 1/=2 second-order diffusion-like equations are needed for developing an SPn method. Many current reactor codes use SPn calculation schemes, where
the diffusion solutions are improved by taking into account higher angular order relations. It is also possible to obtain SBn simplified equations after replacing r by
r C iB and eliminating the complex terms.
In the rest of this section, only the diffusion equation will be considered. However, the reader should be aware that this restriction is only in the interest of
simplifying the presentation.
4.4.2 Diffusion-Like Methods
Although not always accurate, diffusion solvers are invaluable in obtaining inexpensive reactor core data. They are also used as accelerators for transport core solvers.
For these reasons, we will now explain a few basic core methods using the framework of a generic steady-state multigroup diffusion equation
r D g r
g
C †gr
g
D
X
†gs;0
g0
g0
C g
X
†gf
0
g0
(4.63)
g0
g 0 ¤g
where †gr D †gt †gs;0 g is the removal cross section. A generic multigroup diffusion solver will follow a very similar pattern as the generic transport solver of
Section 4.2.1.3. However, the solution is much simpler to obtain since upon discretization the self-adjoint operator on the left-hand side will involve symmetric
positive definite matrices. The multigroup diffusion system stays nonsymmetric due
to the scattering transfers. However, the self-scattering term can now be directly
integrated in the inner iteration.
The external boundary conditions of the reactor core are normally taken into
account:
For symmetry planes, by
r
g
N D0
For zero incoming current, by
g
D 2 D g .r
g
N/
The flux distribution is the eigenvector solution corresponding to 0 D 1 = keff ; the
eigenvector is normalized to the total reactor power as:
Ptot
G Z
X
˝
˛
D †f ; core D
gD1
where is the energy release per fission.
core
d 3 r †gf .r/
g
.r/
(4.64)
198
R. Roy
4.4.2.1 Transverse Integrated Nodal Methods
Transverse integration processes the core solution in each dimension one at the time.
Assuming a 3D Cartesian core split into volume elements jV j D x y z, the
derivative with respect to one of the coordinates is kept on the left-hand side of
Eq. 4.63 and the other terms are sent to the other side:
1 g
1 g
d gd g
D
ˆ .x/ C †gr ˆg .x/ D Qg .x/ Lz .x/ L .x/
dx
dx
y
z y
(4.65)
Here, the flux is integrated in the two other directions:
Z yC Z zC
1 1
ˆg .x/ D
dy
dz g .x; y; z/
y z y
z
and so are the fission and scattering sources. Finally, the transverse leakage terms
are given by:
8
Z zC
yC
ˆ
1
g
ˆ
g @ g
ˆ
L
.x/
D
dz
D
.x;
y;
z/
< y
z z
@y
y
(4.66)
Z yC
zC
ˆ
1
ˆLg .x/ D
g @ g
dy D
.x; y; z/
:̂ z
y y
@z
z
In nodal expansion methods (NEM), polynomial flux expansions are combined with
a quadratic transverse leakage fit to get a fast core solver. Using local coordinates
where the origin is the center of the node, the flux is represented without taking into
account cross-directional dependency:
g
.x; y; z/ D
g
C
N
X
ang pn .x/ C
nD1
where
g
N
X
bng qn .y/ C
nD1
N
X
cng rn .z/
(4.67)
nD1
is the node-averaged flux and where the polynomials satisfy
Z xC
Z yC
Z zC
pn .x/ dx D
qn .y/ dy D
rn .z/ dz D 0
x
y
z
Without any loss of generality, only the x-direction will be treated from now on.
A classical example involves 4th degree polynomials
8
xx
ˆ
ˆ
p1 .x/ D
ˆ
ˆ
ˆ
x
ˆ
ˆ
1
ˆ
ˆ
<p2 .x/ D 3 2 4 1
2
ˆ
p3 .x/ D
ˆ
ˆ
ˆ
4 ˆ
ˆ
ˆ
1
ˆ
2
:̂p4 .x/ D
20
(4.68)
2
1
4
4
Reactor Core Methods
199
conveniently chosen such that: p3 .x˙/ D p4 .x˙/ D 0. The coefficients of the
expansions are related to flux and current values on each face of the node boundary.
For example, in the x-direction, it can be shown that
8
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
<
a1g D ˆg .xC / ˆg .x / ˆgxC ˆgx
a2g D ˆgxC C ˆgx 2 g Dg
1 g 1 g
g
g
g
g d
g
ˆ .x / D a 3a2 C a3 a4
Jx D D
ˆ
ˆ
dx
x 1
2
5
ˆ
ˆ
g
ˆ
D
1 g
1 g
ˆ g
g
g
g d
g
:̂JxC D D
ˆ .xC / D a1 C 3a2 C a3 C a4
dx
x
2
5
(4.69)
In order to obtain a well-posed system of equations involving the node-averaged
flux and face-averaged partial currents across nodal interfaces, a weighted residual
procedure is applied on the node. Integration over the node of Eq. 4.63 yields the
nodal balance equation
i
1 g
1 h g
1 g
g
g
g
g
g
C
JxC Jx
JyC Jy
J Jz
C
D Q †gr
(4.70)
x
y
z zC
The incoming interface currents are the outgoing interface currents from adjacent
nodes. It is possible to take into account the discontinuity factors of Eq. 4.47 to
link together interface fluxes. To close the system, Eq. 4.65 is also multiplied by
some weight functions wn .x/ and integrated over x 2 Œx ; xC . Using a constant
weight function, w0 .x/ D 1, leads to Eq. 4.70. In moment weighting, the next N 2
functions of the polynomial basis are used. In this case, the weights w1 .x/ D p1 .x/
and w2 .x/ D p2 .x/ will provide additional equations needed for the calculation of
expansion coefficients. Flux values (node-averaged and face-averaged) can be eliminated in favor of the source and leakage terms yielding a global interface current
system of equations.
The solution of steady-state nodal diffusion equations follows the standard multigroup solver procedure: that is inner group-by-group iterations nested in outer
iterations where neutron sources are updated. Inner iterations consist of mesh
sweeps through the domain up to the convergence of the interface currents for a
given group. Further information on the classical NEM solvers can be found in the
review paper of R.D. Lawrence [21].
4.4.2.2 Analytic Nodal Methods
In analytic nodal methods (ANM), an analytic solution to the one-dimensional
Eq. 4.65 is sought. The derivation is based on the fact that hyperbolic sine and cosine
functions are generic solutions of Eq. 4.65 with no source. Typical nodal solutions
are a blend of polynomials and exponentials:
ˆg .x/ D Ag cosh
ı ı g
g
ldg C B sinh ldg C fpart .x/
(4.71)
200
R. Roy
p
where ldg D D g = †gr is the well-known diffusion length [4]. Linear and quadratic
approximations are normally employed for the sources and the transverse leakg
.x/ is a second-degree polynomial a0g C
age terms, respectively. The function fpart
g
g
a1 p1 .x/ C a2 p2 .x/, computed as a particular solution of Eq. 4.65.
The approximation of transverse leakage terms is of great help to reduce the complexity of ANM solvers. However, for nonrectangular (such as hexagonal) nodes,
singular terms occur in the transverse leakages and these singularities have to be
smoothed. In the analytic function expansion nodal method (AFENM), recently developed at KAIST, the transverse integration with an assumed shape is no longer
necessary [22]. More nodal unknowns are needed to obtain a solvable system;
AFENM thus includes the corner-point flux values as supplementary unknowns
and corner-balance equations are needed to close the system of equations. Reference [22] offers a good review of the main principles and limitations behind the
various modern nodal methods.
4.4.2.3 Core Harmonics and Modal Synthesis
Consider the following eigenvalue matrix system obtained after discretization of
the transport or diffusion equation in energy, angle and space of a steady-state core
problem [23]
M® D F®
(4.72)
When an adjoint solution ® is required, as for perturbation theory or kinetics, direct
and adjoint calculations can be carried out simultaneously, as the eigenvalue is the
same. Both eigenvectors are often combined by the Rayleigh ratio, which serves as
an estimate for the eigenvalue for both systems
h® ; M ®i
® ; ® D
h® ; F ®i
(4.73)
This ratio is stationary with respect to both ® and ® . A technique similar to the
well-known biconjugate gradient method developed for nonsymmetric matrices can
be used to simultaneously solve direct and adjoint problems. The flux distribution in
the core and its corresponding adjoint are the solutions (fundamental reactor mode)
corresponding to the greatest real value K D keff D 1=0 . The core harmonics (direct
and adjoint) are solutions that correspond to other eigenvalues jKj D 1=jn j < keff ;
note that the other eigenvalues are not necessarily real or simple (that is of algebraic
multiplicity of 1).
Modal synthesis is used for real-time core simulators. The trial functions used as
“modes” are precalculated over the entire reactor core domain. These modes serve
as a function basis for core perturbations or transient analysis. For example, let us
consider a multigroup modal scheme. Using the first N natural harmonics specific
to the reactor being studied, the time-dependent flux is synthesized by:
ˆ .t; r/ D
g
N
X
nD0
Agn .t/
g
n .r/
4
Reactor Core Methods
201
4.4.3 Variational Formulation and Finite Elements
To allow the treatment of spatial variables in full-core calculations, nowadays
methods rely on many high-order accurate numerical schemes. In this section, we
will choose (once more) the template diffusive-like problem [24]:
(
r J.r/ C †.r/ .r/ D S.r/
(4.74)
J.r/ C D.r/r .r/ D 0
in a reactor core domain r 2 D.
4.4.3.1 Classical Spatial Finite Elements
The classical way to express this problem as a bilinear form suitable for finite elements is to substitute the second equation into the first one, multiply by a test
function , and integrate over space. For the moment let us forget about J and focus on . After integration by parts and assuming vanishing conditions .rb / D 0
at the core’s boundary (or at an extrapolated distance point) rb 2 @D, this gives the
following
Z
Z
d 3 r fDr r
C†
gD
core
„ƒ‚…
a. ; /
d 3 rS
(4.75)
core
„ƒ‚…
L. /
This bilinear form a.; / defined on the left-hand side has nice properties: symmetric and coercive, and there is a unique solution in the Sobolev space 2 H01 .core/
where and all its derivatives are square integrable over the core. The classical
Ritz–Galerkin method to approximate solutions of Eq. 4.75 consists of building finite subspaces Vh and looking for a minimal solution of the functional
=h . / D
1
a.
2
h;
h / L.
h / D inf
'2Vh
1
a.'; '/ L.'/
2
(4.76)
The core domain is generally seen as a partition of zones (or elements) in which
local simple functions are defined and matched together at interfaces. The finite
elements are built on these finite-dimensional subspaces, where the approximated
flux is searched as a linear combination
X
' hD
xn 'n
n
that will minimize the functional and the best solution is the (unique) solution of the
linear system
202
R. Roy
8
ˆ
ˆ
Z Ax D b
ˆ
ˆ
˚
ˆ
ˆ
d 3 r Dr'i r'j C †'i 'j
< aij D a.'i ; 'j / D
ˆ
ˆ
ˆ
ˆ
ˆ
:̂
core
(4.77)
Z
bj D L.'j / D
3
d r'j S
core
Implemented in the 1980s for reactor core calculations, this method has been in use
for several years. This is now known as the primal formulation of our diffusive
template problem; the first equation of Eq. 4.74 providing the flux distribution has
to be found and the second equation acts as a constraint. In the dual formulation,
the current has to be found using the second equation while the first equation acts
as a constraint on the divergence of the current.
4.4.3.2 Mixed and Hybrid Finite Elements
Nowadays, weaker variational formulations are often used to better represent the
lack of regularity of the flux distribution in the core. Let us consider again the basic
template problem in Eq. 4.74 and transform both equations into a variational system.
The first equation, simply multiplied by the test function and integrated over the
core, leads to
Z
Z
d 3 r fr J C † g D
d 3r S
core
core
while the second equation can be transformed into
Z
˚
d 3 r D 1 J I r I D 0
core
where I is a test vector function. The solution to this mixed problem does not only
provide the flux distribution, but the combination of . ; J / unknowns satisfying the
variational system. This leads to a larger linear system, as the unknowns will be of
the two kinds. However, this form is weaker because it involves a new functional
space for currents [25]
o
n
3
H.divI core/ D I jI 2 L2core ^ r I 2 L2core
with a well-defined trace on the boundary of the core. In order to close this variational system uncoupling the flux and current components, boundary conditions are
also needed for the current component. To simplify this presentation, let us choose
an essential boundary condition as I.rb / Nb;C D 0. The flux and the source term
in the first equation need only to be square integrable on the core domain. The symmetric form for this variational system is
4
Reactor Core Methods
8
ˆ
ˆ
ˆ
<
ˆ
ˆ
:̂
203
Z
Z
˚
d 3 r D 1 J I r I D 0
Z
core
d 3 r f r J †
core
gD
d 3r S
(4.78)
core
It is possible to combine Eq. 4.78 into a mixed formulation
Z
inf J 2H0 .divIcore/ sup
2L2 .core/
˚
d 3 r D 1 J J †
2
2 r J C 2S
core
where the system solution is a saddle point. The integration of various boundary
conditions is easier (reflection, zero incoming current, etc.) than in the primal formulation as the flux–current components are uncoupled. As the flux and source
terms are only square integrable, the function basis is much easier to choose. With a
core domain meshed into volumes Vi , the most obvious choice is to take the characteristic function of Vi with a value of 1 over the volume and 0 otherwise as the flux
function basis. For a 3D Cartesian core mesh, an obvious (nontrivial) current basis
would consist of linear components over each normal direction Jx .r/; Jy .r/; Jz .r/,
while the other directions can be constant on each interface. This formulation can
yield efficient calculation schemes reducing the number of flux unknowns and enabling certain discontinuities at domain interfaces.
4.4.4 Putting It All Together into Reactor Codes
Since the 1960s, reactor codes have been developed. Figure 4.9 gives a few milestone indications of when some codes have appeared in time. Short descriptions are
now provided:
FLARE, originally developed as an inexpensive 3D BWR simulator to determine
core reactivity and power distribution, was a prototype for nodal codes [26].
CITATION, developed at Oak Ridge National Laboratory, is a generic code that
solves various kinds of 3D multigroup diffusion problems (XYZ, RZ™, etc.) [27].
1964
FLARE
1976
1982
Q/CUBBOX DIF3D
1960
1993
VARIANT
Today
2007
CITATION
1971
QUANDRY
1978
Fig. 4.9 Milestones for introduction of reactor codes
204
R. Roy
QUABOX/CUBBOX uses a coarse-mesh flux expansion with multidimensional
polynomials for the solution of the nodal balance equations [28].
QUANDRY, developed at MIT, implemented a full two-energy groups analytic
nodal method with quadratic approximation of the transverse leakage [29].
DIF3D, originally developed by R.D. Lawrence at Argonne National Laboratory,
uses nodal diffusion and transport methods for the analysis of fast reactors [30].
VARIANT, also developed at Argonne National Laboratory, includes a significantly
expanded set of solution techniques using variational nodal methods [31].
4.5 Core Applications
The reactor-physics calculation for core tracking or other transient analysis is a multistep process, generally involving a core database. The hierarchical local parameter
database, generated with a lattice cell code, is used to compute the nuclear properties
associated with each cell according to its local parameters. Every fuel pin (or fuel
bundle) can have its own properties, which depend not only on the instantaneous
environment condition surrounding the pin, but also on the historical effects that
the fuel has experienced. On the other hand, structural materials, reactivity mechanisms, and reflector properties also must be generated consistently within the same
database, but these usually vary much less than the fuel.
The hierarchical reactor core database is generally built as follows [32]. Depending on the reactor type and core simulations, a set of local parameters (lEp ) of interest
is first selected (temperatures, densities, etc.). Some nominal conditions for the reactor core are then identified .lE0 ; hE0 /. Detailed cell calculations are performed for
E E
these nominal conditions and homogenized reference cross sections †ref
˛ .l0 ; h0 / are
saved. Then, multidimensional interpolation is used to obtain feedback coefficients
for potential operational situations. A sensitivity analysis for cross-section variations must be undertaken for every local parameter: perturbed cell calculations are
done and mixed effects must be tracked. The burnup and other history effects (hEb )
add another level to the hierarchy of data. Each perturbed state .lEp ; hEb / can affect
the multigroup macroscopic cell cross sections either directly by introducing new
microscopic data or via a perturbed cell flux. The perturbed cross sections can be
written as
X
E E
˛i C ˛i .N i C N i /
†pert
˛ .lp ; hb / D
i
D †ref
˛ C
X
i
˛i N i C
X
i
˛i N i C
X
˛i N i
i
Added to the reference cross section, there are three types of macroscopic crosssection corrections: the first term takes into account temperature effects and
any spectrum perturbations; the second term is devoted to nuclide concentration
variations, e.g., fuel burnup, and the third (second-order) term is for the mixed
effects between the nuclide concentrations and the microscopic data variation.
4
Reactor Core Methods
205
In power reactors, a traditional macroscopic depletion model is not accurate
enough for core analysis. On the other hand, data tabulation of every microscopic
cross-section effect is not affordable. Core databases use a blend composition for
representing cross sections:
MACRO E E
E E
†core
.lp ; hb / C
˛ .lp ; hb / D †˛
X
˛micro .lEp ; hEb /N micro
micro
where some microscopic data will be extracted from the cell calculations. The
concentration value N micro can be adjusted during the core simulations. These adjustments are sometimes done by automatic regulating or control systems included
in the core; otherwise, these concentrations can be driven by factors external to the
depletion or history-based cell models. The isotopes extracted depend on the reactor
type. For example, in LWR reactors, some actinides and burnable absorbers (such
as Gadolinium and Boron) are always extracted for obvious reasons. In Fig. 4.10, a
class diagram collecting a whole set of cell calculations is shown with some typical
local parameters. In that sketch, some standard parameters are taken into account
for generating the database, namely:
FuelCond includes the fuel conditions as fuelTemp, the fuel temperature; fuelDens, the fuel density; and fissProd, some fission production data.
IntCoreDB
- fitHomXS
- fitIntProp
- unitCalc : UnitCalc
CoreSimulation
UnitCalc
-latticeGeom[1]
-deplState[1..*] : DepletionState
-concHet[0..*]
FuelCond
-fuelTemp
-fuelDens
-fissProd
CoolantCond
-coolDens
-modTemp
-voidFrac
Fig. 4.10 Building of a reactor core database
ReactDev
-spacers
-detectors
-poisonRods
DepletionState
206
R. Roy
CoolantCond includes the coolant conditions as coolDens, the coolant density;
modTemp, the moderator temperature; and voidFrac, a void fraction.
ReactDev includes data for the reactor-driven mechanisms as spacers, detectors,
and poisonRods.
The preceding parameters will influence the behavior of different cell patterns
present in the reactor core. Each unit cell calculation sequence is organized in a
class UnitCalc identified by a specific lattice cell geometry latticeGeom. The various concentrations concHet of nuclides will vary depending upon the parameter
values and the depletion state of the cell. The reactor database contains the nuclear
properties of all these unit cell calculations and acts as an interface for restoring
these properties in core simulations.
4.5.1 Pin Power Reconstruction in LWR Reactors
LWR reactor design has evolved over the years. MOX fuel and extended cycles
have pushed the need for more sophisticated reactor analysis tools. The calculation of pin powers when tracking an LWR core is still a very challenging problem
for nowadays computer resources. A modern LWR core can contain more than
200 assemblies of 17 17 pins. In addition to various transient safety analyses,
LWR core studies usually include simulations for setting operation margins and for
core monitoring. In the late 1980s, pin power reconstruction capability was added
to the SIMULATE-3 nodal reactor analysis code [33]. In this example, Studsvik’s
CASMO-4/SIMULATE-4 calculation scheme will be presented [34].
In SIMULATE-4, the global core model is based on the multigroup diffusion
equation solved by the analytic nodal method. The diffusion equations are integrated over the transverse directions with the transverse leakage approximated by a
quadratic fit. This gives a full set of 1D multigroup equations where cross sections
inside each node are considered uniform. The analytic nodal method is not sensitive to mesh-spacing limitations that could occur in a finite difference approach and
can provide reliable 3D flux solutions. As material discontinuities can appear in the
axial direction, a separate multigroup 1D diffusion equation is solved for each fuel
assembly. The influence of neighboring assemblies is taken into account by radial
leakage. In the radial direction, the core is split into slices and full 2D calculations are performed with the SP3 approximation using N N submeshes for each
fuel assembly. Assuming zero net current at fuel assembly boundary, CASMO-4
is used to generate the cross sections †CASMO and discontinuity factors homogenized for each group and each submesh. For each fuel assembly, an equivalent SP3
calculation is done using these homogenized parameters and zero net current. The
assembly cross sections †SA and the side average discontinuity factors obtained for
this last calculation are computed and used to correct the node average parameters.
The global model stitches the 2D submesh slices together with the corrected cross
sections.
4
Reactor Core Methods
207
The pin power is reconstructed in the following way. For each group and each
submesh, the nodal flux is assumed to be given by
2D
.x; y/ D P .x/ C Q.y/ C ax e Kx x C bx e CKx x C ay e Ky y C by e CKy y
where P .x/ and Q.y/ are polynomials of degree 2 and the coefficients K are obtained in the global solver. This nodal equation is consistent with the development
presented in Section 4.3.2.2 on analytic nodal methods. To account for the heterogeneity of the submesh, the form factors are superimposed on the homogeneous
2D
recovered from the solver. This gives the following expression
powers pi;g
Pi D
G
CASMO
X
pi;g
gD1
SA
pi;g
2D
pi;g
4.5.2 Estimates of Zonal Powers in CANDU Reactors
CANDU reactors are refueled at power [13]. If a nuclear engineer has to select
channels to refuel today to sustain the total power, she has to take into account the
perturbation to the flux shape in the core. The fourteen (14) liquid zone controllers
of CANDU reactors (see Fig. 4.11) serve as zone control units that can be emptied
Fig. 4.11 Control power zones in a CANDU core. The fill patterns indicate the zones controlled
by the liquid zone controllers that are the black needles. There are seven zones in the front part of
the reactor and seven others in the rear part. The fuel channels are horizontal.
208
R. Roy
or filled with light water upon request of the regulating system. Let us say that our
nuclear engineer estimates that 16 new bundles of fresh fuel is the target; where
should she place these bundles? The characteristics of interest here are the zonal
power fractions, which are defined for the unperturbed state by:
˝
p`o D ˝
˛
†f o ;
o Z
`
†f o ;
o core
˛
where o is the unperturbed flux and Z` is the volume of control zone
`.` D 1; : : : ; 14/. We wish to obtain the variation in zonal power fractions caused
by a single refueling perturbation. For example, refueling in channel j will produce
perturbations to the power fractions p`.j / . The perturbation will depend on the
perturbation to the system matrices M and F on the left- and right-hand sides of
Eq. 4.68 and the variation in energy †f involved during refueling. Since there
are 380 fuel channels and the refueling scheme could be a combination in buckets
of four or eight bundles in a channel, selecting the channels using a full-core reactor
calculation model can lead to millions of core combinations, each one comprising a
small perturbation
Generalized perturbation theory can be used for evaluating the power fraction
increments to second-order accuracy with regards to variations in the original core
model. In that particular case, the generalized adjoints ` are obtained by solving
for each control zone ` an adjoint source problem:
Z` p` †f o
˛
D ˝
†f o ; o core
˚
.M o F
/`
where Z` is 1 inside zone Z` and 0 otherwise. This problem is much simpler
to solve than the perturbed equation since it does not involve another eigenvalue search, but rather a singular source problem. The unique solution to this
problem
˛is a generalized adjoint; by the Fredholm alternative, it must comply with
˝ ` ; F o D 0. Using this solution, estimates of the power perturbation are provided
by bilinear products
.j /
p`
'
D˚
/
Z` p` †.j
;
f
Po tot
E
o
core
D
` ; M.j / o F.j /
E
o
core
where Po tot is the total unperturbed reactor power. The above procedure has been
found very reliable compared to reference eigenvalue calculations of the perturbed
core configurations. To obtain the perturbed reference calculations, the engineer
may think that it is easier to restart the calculation from the original state fo ; o g.
However, a very accurate convergence criteria must be used because variations in the
eigenvalue are small compared to the full-core properties. More results concerning
this methodology can be found in [35].
4
Reactor Core Methods
209
4.5.3 Teaching Modern Reactor Core Methods
In a recent graduate course given at École Polytechnique, the reactor core was presented using a project approach. The challenge is to motivate students in collecting
input data, defining core simulations, and extracting and analyzing pertinent outputs. In order to teach modern reactor core analysis, some software tools used with
reactor data are necessary. At École Polytechnique, we use our own reactor-physics
chain of codes: DRAGON, a lattice cell and supercell code, and DONJON, a reactor
code [36,37]. Both are open-source codes that can be used with WLUP libraries [38].
The choice of a “real” reactor core for simulation is difficult: the core must be simple
enough so that the student is not overwhelmed by modeling details.
Among many research reactors that have been used for neutron physics studies (production of radioisotopes and testing of fuel samples), the Argonne National
Laboratory Chicago Pile-Five (CP-5) (see Fig. 4.12) was chosen. It is a small heavywater graphite research reactor built in 1952 and abandoned in 1979. The CP-5
decontamination and decommissioning was initiated in 1991 and completed in
2000. The whole life cycle of the core is thus completed, so that the student can
be confident that this core is entirely “feasible.”
The CP-5 reactor used highly enriched Uranium elements in a tank of heavy
water; the core has two reflectors: heavy water and graphite. The basic core layout
is simple:
Fig. 4.12 View of the CP5 Argonne reactor building (with kind permission of D&D – Nuclear
Engineering Division of Argonne) (http://www.dd.anl.gov/projects/cp5.html)
210
R. Roy
Graphite reflector
D 2O
moderator
D2O reflector
Fig. 4.13 CP5 reactor core with its two reflectors. The number of fuel elements in the core varies
with time from 12 (white pins only) up to 16 positions (grayer pins). The central black-pin location
is generally not loaded with fuel
There are 17 loading positions for inserting fuel elements.
Two feet of heavy water acts as a first reflector on the sides and bottom of the
core (2.5 ft on top).
Two feet of graphite acts as a second reflector on the sides and bottom.
The core layout is depicted in Fig. 4.13.
Students are asked to represent the core using different models using a fixed
microscopic library. For the project, the functional requirements are the following:
Cell models in 2D transport are done in order to recover nuclear properties for
the various material zones and important isotopes.
Core models in 3D transport are done using Cartesian grids based on the fuel
cells and preserving volumes for each different material region.
The core analysis provided must report on:
– Convergence analysis of core models.
– Study of critical load and core evolution.
– Calculation of the temperature feedback coefficient.
Among the quality requirements, consistency and physical sense of the analysis are
most important; explanation of key ideas, report presentation, and conciseness are
also expected.
Students learn reactor analysis through the use of codes. They must describe
and input materials and geometries; they must choose mesh splitting and options in
the numerical solvers. Homogenized and condensed nuclear properties are obtained
after the 2D cell calculation from collapsing the fuel and gap diluted with a quantity
of heavy water to obtain a volume equivalent to the square pitch of 6 in. Then, these
properties are sent to the 3D core model for studying core behavior.
4
Reactor Core Methods
211
Fig. 4.14 Thermal flux at the CP5 reactor core center
As an example of output, Fig. 4.14 gives the thermal flux at the center of the
12-element core during start-up using 1=4-symmetry. Most students have found consistent results for the temperature feedback coefficient (about 0:04K=K=ı C) and
the spatial core convergence analysis is generally well done. Major defects found
in the analyses are related to the fact that some students do not preserve reaction
rates when going from the cell to the core model. For example, they may input more
(or not enough) fuel than necessary in the core model by using improper dilution
factors.
4.6 Concluding Remarks
The future developments in reactor analysis methods are difficult to foresee. One
may speculate that the most probable development paths will depend on the availability of CPU resources. The hardware resources are still growing exponentially,
but the clock speed of a single-processor node has now reached a limit where the
power consumption begins to be a severe handicap. Design of high-end desktop
computer nodes are now based on dual or quad core with three levels of cache.
Moreover, interconnection network technology (copper or fiber optics) has also considerably improved, providing smaller latency and increased effective bandwidth.
A wide range of small-scale clusters is now available at affordable cost, and these
clusters can significantly reduce the processing times needed for reactor analysis.
212
R. Roy
However, the investment involved for producing large-scale parallel systems is
still important, and most of these systems do not fit all scientific needs. Highperformance computing will be more and more based on the efficient combination
of various parallelisms: not only instruction-level parallelism (branch prediction,
loop unrolling, etc.), but also thread-level parallelism and message passing. A wellbalanced assignment strategy of computing tasks and data to processors will have a
large positive influence on performance.
Let us go back to the context of reactor physics. More and more demanding
reactor core analyses will require optimization and multiphysics strategies. Multiscale computations will be necessary to achieve efficient calculations with the large
number of unknowns needed to represent modern reactor cores. Using the increased
power provided by multiple interconnected nodes, future reactor analysis methods
shall efficiently use parallel computing tools in an integrated and consistent multiphysics environment. A number of core simulations can potentially benefit from
the hybrid shared-distributed parallelism, where a distributed-memory machine has
embedded symmetric multiprocessor nodes. Inside each node, a shared-memory
programming model is applied. Between nodes or at the cluster level, message
passing is used. Functional and domain decomposition techniques provide ways
to distribute the reactor core domain across cluster nodes. Dynamic graph partitioning tools will be used to map core unknowns onto the nodes and adaptive
multigrid schemes will allow nuclear engineers to obtain accurate solutions to reactor problems. In the coming years, the reactor analyst will also need good graphics
rendering and data mining tools to extract useful information from large amounts
of data.
More than ever, the teaching of reactor core analysis is far more complex than the
simple understanding of some basic physical models. Before understanding core behavior for design or operation purposes, the apprentice physicist or nuclear engineer
will have to deal with many computational simulations involving a great amount of
data. The use of modern software tools and their related complex input and output files are subjected to a loss of insight into physical common sense. From the
apprentice’s point of view, these complex simulations may appear as black box operations with (or without) magic results. The teaching of reactor physics must help
the apprentice to fill the gap between physical knowledge and numerical results.
Validation and verification (V&V) steps are more than ever necessary for complex
core simulations. The steep learning curve to acquire V&V aptitudes is certainly
the most important challenge in reactor physics’ education, and these aptitudes are
certainly required if we want to succeed in sustaining new reactor designs over the
coming years.
Final note: The Unified Modelling Language (UML) was invented to help understand and communicate complex systems with fewer ambiguities. The UML
diagrams (Figs. 4.2–4.5 and 4.9) shown in this chapter are based on the book of
Jon Holt [39]. Nuclear engineering is a type of systems engineering where physical and nontangible artifacts (such as software and data processing) collaborate in
core modeling and, in such integrated complex systems, good software engineering
techniques are needed to ensure safety and reliability.
4
Reactor Core Methods
213
References
1. Williams MMR (2003) NEA-1706/01 CD package. Canadian and early British Energy Reports
on Nuclear Reactor Theory (1940–1946). Nuclear Energy Agency Data Bank, OECD, Paris.
Available at http://www.nea.fr/abs/html/nea-1706.html
2. Davison B (with coll. Sykes JB) (1957) Neutron transport theory. Oxford University Press,
Oxford/England
3. Lee CE (1962) The discrete Sn approximation to transport theory. Report LA-2595, Los
Alamos Scientific Laboratory, New Mexico, USA
4. Bell GI, Glasstone S (1970) Nuclear reactor theory. Van Nostrand Reinhold, New York
5. MacFarlane RE, Boicourt RM (1975) NJOY: a neutron and photon processing system. Trans
Am Nucl Soc 22:720
6. Askew JR (1972) A characteristics formulation of the neutron transport equation in complicated geometries. Report AEEW-M 1108. United Kingdom Atomic Energy Establishment,
Winfrith, England
7. Roy R (1996) Application of the Bn theory to unit cell calculations. Nucl Sci Eng 123:358–368
8. Bussac J, Reuss P (1978) Traité de neutronique. Hermann, Paris, France
9. Honeck HC (1961) THERMOS, a thermalization transport theory code for reactor lattice calculations. Report BNL-5826, Brookhaven National Laboratory (code available at NEA data
bank: http://www.nea.fr/abs/html/nea-0043.html)
10. Askew JR, Fayers FJ, Kemshell PB (1966) A general description of the lattice code WIMS. J
Br Nucl Energy Soc 5:564–585
11. Hoffman A, Jeanpierre F, Kavenoky A, Livolant M, Lorrain H (1973) APOLLO: Code Multigroupe de résolution de l’équation du transport pour les neutrons thermiques et rapides. Report
CEA-N-1610. Commissariat à l’énergie Atomique, Paris, France
12. Ahlin A, Edenius M (1977) CASMO – a fast transport theory assembly depletion code for
LWR analysis. Trans Am Nucl Soc 26:604–605
13. Roy R, Marleau G, Tajmouati J, Rozon D (1994) Modelling of CANDU reactivity control
devices with the lattice code DRAGON. Ann Nucl Energy 21:115–132 (code available at NEA
data bank: http://www.oecdnea.org/abs/html/ccc-0647.html)
14. Rahlfs S, Rimpault G, Ribon P, Finck P (1994) Recent developments of the sub-group method
for use in the European cell code ECCO. Algorithms and Codes for Neutronics Calculations.
Obninsk, Russia, 25–27 October
15. DeHart MD, Gauld IC, Williams ML (2007) High-fidelity lattice physics capabilities of the
SCALE code system using TRITON. Proc. Math. & Comp. and Supercomputing in Nuclear
Applications (M&C + SNA2007), 15–19 April
16. Koebke K (1981) Advances in homogenization and dehomogenization. Int. Top. Mtg advances in mathematical methods for the solution of nuclear engineering problems. München,
Germany, 27–29 April
17. Smith KS (1980) Spatial homogenization methods for light water reactor analysis. Ph.D. thesis,
Department of Nuclear Engineering, Massachusetts Institute of Technology, Boston, MA, USA
18. Smith MA, Lewis EE, Na BC (2003) Benchmark on deterministic 2-D/3-D MOX fuel assembly transport calculations without spatial homogenization (C5G7 MOX Benchmark). Report
NEA/NSC/DOC (2003) 16, OCDE/NEA, Paris, France
19. Lewis EE, Palmiotti G, Taiwo T (1999) Space-angle approximations in the variational nodal
method. Proc. Math. & Comp., Reactor physics and environmental analysis in nuclear applications. Madrid, Spain, 27–30 September
20. Gelbard EM (1961) Simplified spherical harmonics equations and their use in shielding problems. Report WAPD-T-1182. Bettis Atomic Power Laboratory, West Mifflin, PA, USA
21. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and
transport equations. Prog Nucl Energy 17(3):271–301
22. Cho NZ (2005) Fundamentals and recent developments of reactor physics methods. Nucl Eng
Tech 37(1):25–78
23. Rozon D (1992) Introduction à la cinétique des réacteurs nucléaires. Presses de l’École
Polytechnique de Montréal, Québec (translated to English as Introduction to Nuclear Reactor Kinetics (1998)).
24. Hennart JP (1999) From primal to mixed-hybrid finite elements: a survey. Proc. Math. &
Comp., Reactor physics and environmental analysis in nuclear applications. Madrid, Spain,
27–30 September
25. Brezzi F, Fortin M (1991) Mixed and hybrid finite element methods. Springer, New York
26. Delp DL, Fisher DL, Harriman JM, Stedwell MJ (1964) FLARE – a three-dimensional boiling water reactor simulator. Report GEAP-4598, General Electric Company (code available at
NEA data bank: http://www.nea.fr/abs/html/nesc0167.html)
27. Fowler TB, Vondy DR, Cunningham GW (1971) Nuclear reactor core analysis code
CITATION. Report ORNL-TM-2496, Oak Ridge National Laboratory, USA (code available
at NEA data bank: http://www.nea.fr/abs/html/nesc0387.html)
28. Langenbuch S, Velkov K, Pevec D, Grgic D (1996) Capability of the QUABOX/CUBBOXATHLET coupled code system. Int. Conf. Nucl. Option in Countries with small and medium
electricity grids. Opatija, Croatia, 7–9 October
29. Greenman G, Smith KS, Henry AF (1979) Recent advances in an analytic nodal method for
static and transient reactor analysis. Comp. Methods in Nucl. Eng. Williamsburg, VA, USA,
23–25 April
30. Derstine KL (1982) DIF3D: a code to solve one-, two-, and three-dimensional finite difference
diffusion theory problems. Report Argonne-82–64, Argonne National Laboratory, USA
31. Palmiotti G, Carrico CB, Lewis EE (1993) Variational nodal methods with anisotropic scattering. Nucl Sci Eng 115:223–243
32. Sissaoui MT, Marleau G, Rozon D (1999) CANDU reactor simulations using the feedback
model with Actinide burnup history. Nucl Tech 125:197–212
33. Rempe KR, Smith KS, Henry AF (1989) SIMULATE-3 pin power reconstruction: methodology and benchmarking. Nucl Sci Eng 103:334–342
34. Bahadir T, Lindahl S-T, Palmtag SP (2005) SIMULATE-4 multigroup nodal code with microscopic depletion, Math. and Comp., Supercomputing, reactor physics and nuclear and
biological applications. Avignon, France, 12–15 September
35. Rozon D, Varin E, Roy R, Brissette D (1997) Generalized perturbation theory estimates of
zone level response to refuelling perturbations in a CANDU600 reactor. Advances in nuclear
fuel management. Myrtle Beach, USA, 23–26 March
36. Varin E, Hébert A, Roy R, Koclas J (2005) A user guide for DONJON 3.01, Report IGE-208
Rev. 1. Institut de génie nucléaire, École Polytechnique de Montréal, Québec
37. Marleau G, Hébert A, Roy R (2006) A user guide for DRAGON 3.05C, Report IGE-174 Rev.
6C. Institut de génie nucléaire, École Polytechnique de Montréal, Québec
38. Trkov A, Leszczynski F, Lopez Aldama D (2006) WIMS-D library update, final report of a
co-ordinated research project, IAEA, Vienna (WLUP libraries available at NEA data bank:
http://www.nea.fr/abs/html/iaea1408.html)
39. Holt J (2004) UML for systems engineering, 2nd edn. IEE Professional Applications of Computing Series 4
Professor Robert Roy graduated from Université de Montréal with an M.Sc. degree in
Mathematics in 1981. He then studied in France, earning a D.E.A. in Numerical Analysis. In 1984, he had the chance to work
briefly at E.D.F. Clamart where he was introduced to nuclear engineering by Gérard
LeCoq and his team. He returned to École
Polytechnique de Montréal (ÉPM) and received his Ph.D. degree in Nuclear Engineering in 1987. His thesis research involved
the development of 3D transport models for
reactivity devices in CANDU reactors. He
presented a paper at the Topical Meeting on
Advances in Reactor Physics, Mathematics
and Computation, Paris (1987) and won the best student paper award. He was invited to join the S.E.R.M.A. team at C.E.A. Saclay where he spent a postdoctoral
year working on the first release of Apollo-2 with Richard Sanchez and Zarko
Stankovski. Returning to ÉPM, he spent 10 years as a research scientist at the
Nuclear Engineering Institute. While his research work remained focused on transport theory, he made important contributions to the development and validation
of the lattice code DRAGON along with Alain Hébert and Guy Marleau of the
Institute. He is also one of the authors of the open-source reactor-physics codes
DRAGON/DONJON used by the Canadian nuclear industry. In 2001, he decided
to join the new Department of Computer Engineering at ÉPM, where he has been
the first coordinator for a new undergraduate program in software engineering.
Professor Roy is well known for his contributions to the development of parallel algorithms, in particular, for neutron transport applications. He has supervised seven
Ph.D. and 14 M.Sc. students, more than half of whom are enrolled in Nuclear Engineering. Presently, he is a professor in the Department of Computer and Software
Engineering at ÉPM, where he leads the DRAP research laboratory that exploits
various computer cluster facilities available for students. He is also a member of
the control board of the Réseau québécois de calcul haute performance, a network
providing the most powerful computer resources in Québec.
Chapter 5
Resonance Theory in Reactor Applications
R.N. Hwang (deceased)
5.1 Introduction
The most essential objective in reactor physics is to provide an accurate account
of the intricate balance between the neutrons produced by the fission process and
those lost due to the absorption process as well as those leaking out of the reactor. The presence of resonance structures in neutron cross sections obviously plays
an important role in such processes. Therefore, the treatment of neutron resonance
phenomena has constituted one of the most fundamental subjects in reactor physics since its inception. It is the area where the concepts of nuclear reactions and the
treatment of the neutronic balance in reactor lattices over a wide span of energy
become intertwined. The basic issue here is how to apply the microscopic neutron cross sections in the macroscopic reactor systems. Because of its importance
to reactor physics, much of the existing nuclear data and a significant portion of
all cross-section processing codes downstream are devoted to the treatment of resonance phenomena prior to any meaningful neutronic calculations via either the
deterministic or Monte Carlo approaches.
The primary purpose of this chapter is to provide an overview of resonance
theory in reactor applications as it evolved through the last 50 years. The subject
under consideration can be divided into four major topics. First, various practical
cross-section representations currently in use and their theoretical bases will be described. Second, various analytical and numerical methods for Doppler-broadening
of cross sections will be discussed. Third, various methods for treating the resolved
resonances in both homogeneous media and heterogeneous reactor lattices will be
presented. Fourth, statistical methods for treating the unresolved resonances in the
high energy region where individual resonance parameters are not available will
also be given. For the sake of clarity, the historical perspective on this subject and
some fundamental issues will be discussed briefly prior to other specific topics.
5.1.1 Historical Perspective
It all started with the practical interest in resonance absorption due to the low-lying s-wave resonances of $^{238}$U and their implications for the feasibility of a natural uranium-fueled system achieving a self-sustaining chain reaction [1, 2]. Of particular interest were two early observations, later identified as natural consequences of the so-called self-shielding effects. The first was the reduction in the absorption rate per atom, for neutrons slowed down in a water and uranium mixture, as the concentration was increased. The second was that a substantial reduction in resonance absorption was possible if natural uranium was made into a lump surrounded by moderator. In fact, the latter discovery is commonly recognized as one of the key factors that signaled the beginning of the nuclear era. Practical needs have since provided the motivation for better understanding of the resonance phenomena exhibited by neutron cross sections and their effects on various reactor physics parameters of practical interest.
To provide a better understanding of the sequence of discussions to follow, a brief outline is presented of how resonance theory has evolved in conjunction with three main periods of reactor development, namely, the period of early thermal reactor development, the period of the emergence of the fast reactor program, and the current period of uncertainty.
In the period of early thermal reactor development, efforts were focused on a handful of low-lying s-wave resonances of a few actinides. The extent of their absorption depends on the number of high-energy neutrons that can be slowed down to thermal energy via the elastic scattering process. The fraction of neutrons that actually reaches thermal energy was referred to as the resonance escape probability. In the West, the first practical method for estimating the resonance escape probability in a reactor cell was pioneered by Wigner et al. in the early 1950s [2, 3]. A great deal of work in this area also began to emerge in the 1950s; most notable was the work by Chernick et al. [4, 5], which provided a concise description of resonance absorption in a two-region cell as neutrons were slowed down. One early book on resonance absorption that summarized a great deal of the work carried out in the USA was that of Dresner, published in 1960 [6]. At approximately the same time, the issue was also examined in Russia, where the most representative works were those of Gourevich and Pomeranchouk [7]. A collection of papers on this subject that followed can be found in a book edited by Marchuk [8]. Two books on this subject that covered much of the developments up to the early 1970s were subsequently published by Lukyanov [9, 10]. The conceptual basis and computational methodologies began with relatively simple models, not only in the cross-section representation but also in dealing with the localized flux as neutrons slowed down past the resonances in question. Two obvious reasons for this were the general lack of good-quality resonance data and of efficient computational tools. Consequently, efforts were focused on low-lying resonances of a few actinides with known parameters evaluated on the basis of the Breit–Wigner approximation [11]. From the practical point of view, the relative ease of such an approximation, by which the microscopic cross sections can be concisely expressed as a function of energy, made possible the
development of various plausible means of treating the macroscopic effects due to
the presence of resonance structures in reactor physics calculations. These include
various practical approximations to be discussed in sections to follow, notably, the
analytical expression for Doppler-broadening, the resonance integral concept, viable approximations for the localized flux in the presence of a resonance, and the
rational approximation for treating the collision probability.
In the early 1960s, when the liquid-metal fast breeder reactor began to emerge as an attractive alternative for future-generation reactors, there was a general shift of interest in resonance theory to the relatively high energy region, and many intermediate-weight nuclides began to play an important role in addition to the major and minor actinides. Because of safety considerations [12], a great deal of emphasis began to focus on the accurate estimation of the Doppler coefficient and the sodium void coefficient, both of which relied heavily on our extensive knowledge of resonance data for a large number of nuclides and on our ability to compute those reactor parameters arising directly from the presence of resonance structures. Such practical needs motivated a great deal of development in resonance data evaluation as well as in methods development. From the consideration of nuclear data and their appropriate representations, several developments are noteworthy. One important milestone was the formation of the Cross Section Evaluation Working Group (CSEWG) in 1966, which was intended to serve two important functions: (1) to provide the reference nuclear data base known as the ENDF/B files; and (2) to motivate continuous improvement of nuclear data, of which resonance data constitute a major portion. With improvements in high-resolution measurements, the time had come to venture beyond the traditional Breit–Wigner approximation [11]. More accurate cross-section representations gradually began to emerge [13]. At the same time, the availability of data made possible a significant extension of the resolved resonance ranges, especially for various major actinides and structural isotopes of interest. Thus, from the standpoint of the reactor physics calculations downstream, one must not only deal with the improved cross-section formalisms but also cope with an increasingly large number of resonances with potentially different characteristics. This was particularly true when the exceedingly wide s-wave resonances spanning several hundred keV were considered. Because of the emphasis on the relatively high energy region, the statistical treatment of unresolved resonances began to play an important role. Many methods in this area were developed, as will be elaborated in the discussions presented in this chapter. Furthermore, various challenging issues were associated with the need to analyze experiments performed in fast critical assemblies, which were constructed as experimental facilities to verify our ability to estimate important fast reactor parameters. Unlike traditional reactors, thermal or fast, the lattices of these assemblies were highly heterogeneous, and special considerations were required. Consequently, there is a large body of work in this area that will be covered in the discussions to follow.
The period from the early 1990s, when the fast reactor program was cancelled in the USA, to the present can be viewed as a period of uncertainty for the future of nuclear energy. Apparently, the reactor program has reached a crossroads; perhaps less impact is felt elsewhere in the world. There is no question that nuclear research and development in the USA have suffered. It is unfortunate that the lack of funding also reduces the traditional interest within the nuclear community in the pursuit of rigor and better understanding of the basic concepts essential to the future of nuclear energy. The dilemma is that a great many difficult issues encountered previously could probably be resolved today, given our better knowledge of nuclear data, improved methodologies, and the availability of efficient and less costly computational facilities. From the perspective of resonance theory, there are still some interesting works worth noting; they will be described in later sections where appropriate.
5.1.2 Self-shielding Effects in Perspective
Aside from the Doppler-broadening effect that one inevitably encounters in all practical problems involving cross sections, one macroscopic effect specifically associated with reactor applications is the so-called self-shielding effect. It is a phenomenon directly attributable to the effect of the localized fluctuations in neutron cross sections, resulting from resonance structures, on the averaged reaction rates of a reactor cell in energy and space at a given temperature.
With no loss of generality, one convenient way is to view the self-shielding concept within the context of the commonly used multigroup approach in reactor calculations. One essential principle of this approach is the separation of the fine-structure effect attributed to resonance structures from the global neutronic calculations of the entire reactor. This can best be accomplished via the use of the effective group cross sections for each nuclide and reaction type at a given cell in the reactor lattice, defined as
$$\tilde{\sigma}_x = \frac{\left\langle \sigma_x(E,\vec{r},T)\,\phi(E,\vec{r},T)\right\rangle_{E,\vec{r}}}{\left\langle \phi(E,\vec{r},T)\right\rangle_{E,\vec{r}}} \tag{5.1}$$
where the width of the group is usually taken to be much greater than the extent of the resonance for actinides. One quantity widely used as a measure of the self-shielding effect is the self-shielding factor, defined as the ratio of the effective cross section to its corresponding value at infinite dilution. Conceptually, its physical meaning can be better understood if it is cast into a somewhat unconventional form [14],
$$f_x = \frac{\tilde{\sigma}_x}{\langle \sigma_x(E)\rangle_E} = 1 + \frac{\mathrm{COV}\left[\sigma_x(E,\vec{r},T),\,\phi(E,\vec{r},T)\right]}{\langle \sigma_x(E)\rangle_E\,\langle \phi(E,\vec{r},T)\rangle_{E,\vec{r}}} \tag{5.2}$$
whereby the degree of the self-shielding effect is directly expressed in terms of the covariance of the localized reaction cross section and the neutron flux. Thus, the degree of self-shielding can be construed as a measure of the correlation between the reaction cross section and the neutron flux in energy and in space at a given temperature. Physically, the two are anti-correlated and temperature-dependent, which gives rise to the temperature coefficient of a reactor system. All averages here can be cast in the form of either the usual Riemann integral or the Lebesgue integral, as one chooses. For resonances in the unresolved energy range, these averages can be viewed as expectation values based on the statistical properties of the resonance parameters. Such a description provides a plausible basis, yet with no loss of generality, for much of the discussions to follow.
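The covariance form of Eq. 5.2 is straightforward to verify numerically. The sketch below, with invented parameters, builds a single Lorentzian capture resonance over one coarse group, takes a narrow-resonance-style flux $\phi \propto 1/(\sigma_x + \sigma_p)$, and evaluates the self-shielding factor both directly and through the covariance identity; the two agree, and the anti-correlation drives $f_x$ below unity.

```python
import numpy as np

# Illustrative single-resonance parameters (not from any evaluation).
E0, Gamma, sig_peak, sig_p = 6.67, 0.05, 2.0e4, 10.0   # eV, eV, barns, barns
E = np.linspace(5.0, 8.5, 100001)                      # one coarse group

sig_x = sig_peak / (1.0 + ((E - E0) / (Gamma / 2.0))**2)  # Lorentzian capture
phi = 1.0 / (sig_x + sig_p)                               # narrow-resonance flux

avg = lambda f: np.trapz(f, E) / (E[-1] - E[0])           # group average

f_direct = (avg(sig_x * phi) / avg(phi)) / avg(sig_x)     # effective / dilute
cov = avg(sig_x * phi) - avg(sig_x) * avg(phi)            # COV[sigma_x, phi]
f_cov = 1.0 + cov / (avg(sig_x) * avg(phi))               # Eq. (5.2)

print(f"f_x direct = {f_direct:.6f}, via covariance = {f_cov:.6f}")  # equal, < 1
```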
5.2 Representation of Microscopic Cross Sections
The theory of nuclear reactions provides the basis for describing the behavior of resonance structures observed in cross sections. The most comprehensive discussions on this subject were given by Lane and Thomas [15], Lynn [16], and Fröhner [17]. For our purpose here, it is useful to retrace some of the fundamentals pertinent to practical applications. One obvious place to start is a brief description of the R-matrix theory, from which representations with various degrees of sophistication were derived.
5.2.1 Brief Description of R-Matrix Theory
In R-matrix theory, the reaction cross section for any incident channel $c$ and exit channel $c'$, and the total cross section, are generally expressed in terms of the collision matrix $U_{cc'}$, a symmetric and unitary matrix,

$$\sigma_{cc'} = \frac{\pi}{k^2}\, g_c \left|\delta_{cc'} - U_{cc'}\right|^2, \qquad \sigma_t = \frac{2\pi}{k^2}\sum_c g_c \left(1 - \Re e\{U_{cc}\}\right) \tag{5.3}$$

where $k$ is the wave number (or momentum), the reciprocal of the neutron wavelength divided by $2\pi$, and $g_c$ is the statistical spin factor, a function of the compound-nucleus spin $J$ and the target-nucleus spin $I$; they are given by $k = 2.196771\times 10^{-3}\,[A/(A+1)]\sqrt{E}$ and $g_c = (2J+1)/[2(2I+1)]$, respectively.
There are two versions of R-matrix theory. The most commonly used basis for
practical applications is the Wigner–Eisenbud version [18]. The other is the Kapur–
Peierls [19] version which is also of some practical importance. For our purposes
here, brief discussions of both will be presented.
5.2.1.1 Wigner–Eisenbud Version
In this version, the explicit energy-dependent behavior of the resonance structure specified by $U_{cc'}$ is given either in terms of the channel matrix characterized by a real matrix $R$, or equivalently in terms of the level matrix $A$. Using Fröhner's notation [17], one obtains
$$U_{cc'} = e^{-i(\varphi_c+\varphi_{c'})}\left\{\delta_{cc'} + 2i\,P_c^{1/2}\left[(I-RL^0)^{-1}R\right]_{cc'}P_{c'}^{1/2}\right\} = e^{-i(\varphi_c+\varphi_{c'})}\left(\delta_{cc'} + i\sum_{\lambda,\mu}\Gamma_{\lambda c}^{1/2}\,A_{\lambda\mu}\,\Gamma_{\mu c'}^{1/2}\right) \tag{5.4}$$

where the two essential matrices here are

$$R_{cc'} = \sum_\lambda \frac{\gamma_{\lambda c}\,\gamma_{\lambda c'}}{E_\lambda - E}, \qquad \left(A^{-1}\right)_{\lambda\mu} = (E_\lambda - E)\,\delta_{\lambda\mu} - \sum_c \gamma_{\lambda c}\,L_c^0\,\gamma_{\mu c} \tag{5.5}$$

and $L^0$ is a diagonal matrix defined as

$$L^0_{cc'} = L^0_c\,\delta_{cc'} = (L_c - B_c)\,\delta_{cc'} \tag{5.6}$$
The quantity $L_c$ above is the logarithmic derivative of the outgoing wave function $O_c(r_c)$ at the channel radius $r_c$, with the boundary parameter $B_c$ taken to be a real constant. $L_c$ and the hard-sphere phase-shift factor $\varphi_c$, another $O_c(r_c)$-dependent quantity, are explicitly given as

$$L_c = r_c\,\frac{O_c'(r_c)}{O_c(r_c)} = S_c + i\,P_c, \qquad \varphi_c = \arg O_c(r_c) \tag{5.7}$$
Two generic resonance parameters for a given level $\lambda$ that appear in the formulations are the reduced width amplitude $\gamma_{\lambda c}$ for a given channel $c$ (or, alternatively, the partial width $\Gamma_{\lambda c} = 2P_c\,\gamma_{\lambda c}^2$) and the eigenvalue $E_\lambda$. These parameters are real, and their statistical properties are well known. Such attributes are not only important in resonance data evaluations but also essential in reactor physics applications when it comes to the treatment of the unresolved resonances to be addressed later. Aside from the obvious energy dependence that appears in the matrix $R$ or in the level matrix $A$, there are three energy-dependent factors that must be specified, namely, $S_c$, $P_c$, and $\varphi_c$. The quantity $S_c$, the real component of $L_c$, is referred to as the level-shift factor, whereas $P_c$ is referred to as the penetration factor. These quantities, along with the hard-sphere phase-shift factor $\varphi_c$, are weakly energy-dependent when the elastic scattering and inelastic channels are considered, whereas they are usually taken to be constant for photon and fission channels, usually by setting $B_c = S_c$ and $P_c$ constant. Thus, the explicit energy dependence of the collision matrix requires the specification of these energy-dependent factors.
The outgoing wave function for a neutron channel with a given angular momentum state $l$ at the channel radius is expressible in terms of the spherical Hankel function of the first kind, a linear combination of spherical Bessel functions,

$$O_c(r_c) = i\,k r_c\,h_l^{(1)}(k r_c) = (i\,k r_c)\left[j_l(k r_c) + i\,y_l(k r_c)\right] \tag{5.8}$$

where $j_l(x)$ and $y_l(x)$ are the usual spherical Bessel functions of the first and second kind, respectively. The incoming wave function, $I_c(r_c)$, is equal to $O_c^*(r_c)$ for the neutron-induced cross sections.
Table 5.1 Momentum-dependent factors for various $l$-states defined at the channel radius $r_c$ ($\rho = k r_c$):

$$l=0:\quad P_0 = \rho, \qquad S_0 = 0, \qquad \varphi_0 = \rho$$

$$l=1:\quad P_1 = \frac{\rho^3}{1+\rho^2}, \qquad S_1 = -\frac{1}{1+\rho^2}, \qquad \varphi_1 = \rho - \tan^{-1}\rho$$

$$l=2:\quad P_2 = \frac{\rho^5}{9+3\rho^2+\rho^4}, \qquad S_2 = -\frac{18+3\rho^2}{9+3\rho^2+\rho^4}, \qquad \varphi_2 = \rho - \tan^{-1}\!\left(\frac{3\rho}{3-\rho^2}\right)$$

$$l=3:\quad P_3 = \frac{\rho^7}{225+45\rho^2+6\rho^4+\rho^6}, \qquad S_3 = -\frac{675+90\rho^2+6\rho^4}{225+45\rho^2+6\rho^4+\rho^6}, \qquad \varphi_3 = \rho - \tan^{-1}\!\left[\frac{\rho\,(15-\rho^2)}{15-6\rho^2}\right]$$
One useful identity from which the spherical Hankel function and its derivatives can be defined explicitly is

$$h_l^{(1)}(x) = i^{-l-1}\,\frac{e^{ix}}{x}\sum_{j=0}^{l}\frac{(l+j)!}{j!\,(l-j)!}\,(-2ix)^{-j} \tag{5.9}$$
By utilizing Eqs. 5.7–5.9, the weakly energy-dependent factors $S_c$, $P_c$, and $\varphi_c$ can be defined explicitly. Table 5.1 shows these factors as functions of $\rho = k r_c$ for the four $l$-states of practical interest. One limiting property of the level-shift factor as $k \to 0$ is that $S_c \to -l$, which can be shown directly via the use of Eqs. 5.7–5.9. Thus, for the relatively low energy range, one can set $B_c = -l$ so that $L^0_c$ becomes dependent on the penetration factor alone.
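For readers who wish to tabulate these factors, the sketch below is a direct transcription of the Table 5.1 expressions, with $\rho = k r_c$ supplied by the caller (the arctangent branches are transcribed as printed, so care is needed for large $\rho$); the $k \to 0$ limit $S_l \to -l$ quoted above is checked at the end.

```python
import numpy as np

def hard_sphere_factors(l, rho):
    """P_l, S_l, phi_l of Table 5.1 at rho = k*r_c."""
    r2 = rho * rho
    if l == 0:
        return rho, 0.0, rho
    if l == 1:
        d = 1.0 + r2
        return rho**3 / d, -1.0 / d, rho - np.arctan(rho)
    if l == 2:
        d = 9.0 + 3.0 * r2 + r2**2
        return (rho**5 / d, -(18.0 + 3.0 * r2) / d,
                rho - np.arctan(3.0 * rho / (3.0 - r2)))
    if l == 3:
        d = 225.0 + 45.0 * r2 + 6.0 * r2**2 + r2**3
        return (rho**7 / d, -(675.0 + 90.0 * r2 + 6.0 * r2**2) / d,
                rho - np.arctan(rho * (15.0 - r2) / (15.0 - 6.0 * r2)))
    raise ValueError("Table 5.1 covers l = 0..3 only")

# The k -> 0 limit S_l -> -l quoted in the text:
print([round(hard_sphere_factors(l, 1e-4)[1]) for l in range(4)])  # [0, -1, -2, -3]
```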
5.2.1.2 Kapur–Peierls Version
The Kapur–Peierls version [19] amounts to a different way of choosing the boundary parameter defined in Eq. 5.6. A simpler expression can be obtained if one sets $B_c = S_c + i\,P_c$. Because of its energy-dependent nature, this choice is sometimes referred to as the energy-dependent boundary condition. The consequence is that $L^0_c = 0$ and $I - RL^0 = I$. Thus, it eliminates the need for inverting the channel matrix, so that the collision matrix can be cast into the much simpler form given below,
$$U_{cc'} = e^{-i(\varphi_c+\varphi_{c'})}\left(\delta_{cc'} + \sum_\nu \frac{i\,g_{\nu c}^{1/2}\,g_{\nu c'}^{1/2}}{\varepsilon_\nu - E}\right) \tag{5.10}$$
where the two generic parameters $g_{\nu c}$ and $\varepsilon_\nu$ become complex and energy-dependent. Obtaining such a set of parameters requires computations at each incident neutron energy implied by the use of the energy-dependent boundary condition, a trade-off that is not necessarily attractive in practical applications. Furthermore, there does not appear to be any plausible means to derive their statistical properties analytically. Thus, the Kapur–Peierls version has not been taken seriously as a tool for data evaluations, with the exception perhaps of some special cases. However, it does have two major merits quite appealing to reactor physics applications. First, the
general form of Eq. 5.10 is readily amenable to analytical Doppler-broadening, a
major task in the processing of point-wise cross sections. Second, it resembles the
single-level Breit–Wigner approximation on which much of the reactor physics concept was based. For these reasons, various attempts were made, at least for some
limited cases, to convert the known set of evaluated R-matrix parameters based on
the Wigner–Eisenbud version to pole parameters of the Kapur–Peierls type.
5.2.2 Practical Representations Currently in Use
For practical applications, it is convenient to separate the resonance component from the nonresonance component (i.e., the potential scattering cross section) in the cross-section representation [13]. This can be accomplished symbolically by letting

$$2\,\zeta_{nc} = i\sum_{\lambda,\mu}\Gamma_{\lambda n}^{1/2}\,A_{\lambda\mu}\,\Gamma_{\mu c}^{1/2} \tag{5.11}$$
in Eq. 5.4. For our purposes here, let $\sigma_{tR}$ denote the resonance component of $\sigma_t$, so that the total and partial cross sections of reaction process $x$ can be written (with ƛ = 1/k) as

$$\sigma_t = \sigma_p + 4\pi{ƛ}^2\sum_{l,J} g_J\,\Re e\left\{e^{-i2\varphi_l}\,\zeta_{nn}\right\}, \qquad \sigma_p = 4\pi{ƛ}^2\sum_{l,J} g_J \sin^2\varphi_l \tag{5.12}$$

$$\sigma_x = 4\pi{ƛ}^2\sum_{l,J} g_J \sum_{c} \left|\zeta_{nc}\right|^2, \qquad c \notin n,\;\; x \in \{f, \gamma\} \tag{5.13}$$

$$\sigma_a = 4\pi{ƛ}^2\sum_{l,J} g_J\left(\Re e\{\zeta_{nn}\} - |\zeta_{nn}|^2\right), \qquad \sigma_{sR} = \sigma_{tR} - \sigma_a \tag{5.14}$$
It should be noted that the resonance scattering component $\sigma_{sR}$ is usually taken to be the difference of the compound-nucleus cross section and the absorption cross section. Similarly, for some approximations to be described, the capture cross section is taken to be $\sigma_\gamma = \sigma_t - \sigma_s - \sigma_f$ to ensure the unitary nature of the collision matrix.
5.2.2.1 Single Level Breit–Wigner Approximation (SLBW)
In the limit when resonances are well isolated, the inverse level matrix can be represented by a matrix of rank 1 attributed to a single resonance, i.e.,

$$A^{-1} \simeq E_0 - E - \sum_c L_c^0\,\gamma_c^2 = E_0 + \Delta - E - i\,\frac{\Gamma_t}{2} \tag{5.15}$$

where the total width $\Gamma_t$ is equal to the sum of all partial widths and the level shift is $\Delta = -(S_c - B_c)\,\gamma_c^2$ associated with the elastic scattering channel $c$.
Thus, strictly speaking, the corresponding cross section should reflect the discrete nature of isolated levels. In practice, there are very few situations where
resonances are completely isolated. The question arose as to how one could maintain the continuity in flux calculations if the tails of neighboring resonances ran into
one another. In lieu of other options beyond the SLBW in earlier days, one solution
to circumvent this difficulty was to consider the cross sections as a superposition of
individual resonances accompanied by a set of precomputed point-wise “smooth”
cross-section files. The latter was intended to provide the necessary corrections to
the inadequacies of the approximation. In this context, the single-level approximation actually in use can be specified as
$$\sigma_x = \sum_{l,J}\sum_k \frac{\sigma_{0xk}}{1+x_k^2} = 4\pi{ƛ}^2\sum_{l,J} g_J \sum_k \frac{\Gamma_{nk}\,\Gamma_{xk}}{2\,\Gamma_{tk}}\,\Re e\left\{\frac{-i}{z_k - E}\right\}, \qquad x \in \{\gamma, f\} \tag{5.16}$$

$$\sigma_{tR} = \sum_{l,J}\sum_k \sigma_{0tk}\left[\frac{\cos 2\varphi_l}{1+x_k^2} + \frac{x_k\,\sin 2\varphi_l}{1+x_k^2}\right] = 4\pi{ƛ}^2\sum_{l,J} g_J \sum_k \frac{\Gamma_{nk}}{2}\,\Re e\left\{\frac{-i\,e^{-i2\varphi_l}}{z_k - E}\right\} \tag{5.17}$$

where $\sigma_{0xk} = 4\pi{ƛ}^2 g_J\,\Gamma_{nk}\Gamma_{xk}/\Gamma_{tk}^2$, $x_k = (E - E_k - \Delta_k)/(\Gamma_{tk}/2)$, and $z_k = E_k + \Delta_k - i\,\Gamma_{tk}/2$.
The equivalent representations of the traditional expressions in complex domain
are introduced here to signify the meromorphic nature of cross sections pertinent
to discussions to follow. The quantities in parentheses in Eqs. 5.16 and 5.17 that
characterize the energy dependence of a given resonance are also referred to as
the Lorentzians. They are readily amenable to analytical Doppler-broadening, an
attribute essential to much of the resonance integral concept in reactor physics.
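As a concrete illustration, the following sketch evaluates Eq. 5.16 for a single capture resonance in both its Lorentzian and complex pole forms and confirms that they coincide; the resonance parameters and the frozen $4\pi{ƛ}^2$ factor are invented for illustration only.

```python
import numpy as np

# Illustrative SLBW parameters for one resonance (not evaluated data);
# the factor 4*pi*lambdabar^2 is frozen at an arbitrary constant.
Ek, Gn, Gx, Gt = 6.67, 1.5e-3, 2.3e-2, 2.45e-2    # eV; level shift neglected
gJ, four_pi_lam2 = 0.5, 1.0e4                     # spin factor; 4*pi*lambdabar^2

E = np.linspace(6.0, 7.3, 2001)
sig0 = four_pi_lam2 * gJ * Gn * Gx / Gt**2        # peak cross section
xk = (E - Ek) / (Gt / 2.0)

sig_lorentz = sig0 / (1.0 + xk**2)                # real (Lorentzian) form
zk = Ek - 1j * Gt / 2.0                           # pole z_k = E_k - i*Gamma_t/2
sig_pole = four_pi_lam2 * gJ * (Gn * Gx / (2.0 * Gt)) * np.real(-1j / (zk - E))

assert np.allclose(sig_lorentz, sig_pole)         # the two forms of Eq. (5.16) agree
```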
5.2.2.2 Multilevel Breit–Wigner Approximation (MLBW)
In the late 1960s, improvement of the single-level approximation began to receive
a great deal of attention as the fast reactor development was in full swing. To account, at least in part, for the overlap effects due to neighboring levels, the multilevel
parameters based on the MLBW approximation became available. This method is
based on the assumption that the inverse level matrix A1 can be taken as a diagonal
matrix consisting of various SLBW elements, i.e.,
$$\left(A^{-1}\right)_{\lambda\mu} = \left(E_\lambda + \Delta_\lambda - E - i\,\frac{\Gamma_{t\lambda}}{2}\right)\delta_{\lambda\mu} \tag{5.18}$$
Thus, the resulting collision matrix can be expressed in the form of a pole expansion
similar to that of Kapur–Peierls with easily derivable parameters without matrix
inversion [20]. With no loss of generality, the corresponding cross sections can be
cast into the same form as those of the SLBW in the complex domain,
$$\sigma_x = \frac{1}{E}\sum_{l,J}\sum_\lambda \Re e\left\{\frac{-i\,r^{(x)}_{l,J,\lambda}}{z_\lambda - E}\right\}, \qquad x \in \{\gamma, f\} \tag{5.19}$$
where the pole $z_\lambda$ is clearly identifiable with that defined in Eq. 5.16 for a SLBW resonance, and $r^{(x)}_{l,J,\lambda}$ is related to the corresponding residue $c^{(x)}_{l,J,\lambda}$ defined as

$$c^{(x)}_{l,J,\lambda} = \frac{1}{4}\sum_{\mu}\frac{(2i)\,\sqrt{E_\lambda E_\mu}\;\Gamma^{(0)1/2}_{\lambda n}\,\Gamma^{1/2}_{\lambda x}\,\Gamma^{(0)1/2}_{\mu n}\,\Gamma^{1/2}_{\mu x}}{z^*_\mu - z_\lambda}, \qquad r^{(x)}_{l,J,\lambda} = 4\pi{ƛ}^2\,g_J\,E\;c^{(x)}_{l,J,\lambda} \tag{5.20}$$

where $\Gamma^{(0)}_{\lambda n} = \Gamma_{\lambda n}/\sqrt{E_\lambda}$ is referred to as the "reduced" neutron width and is independent of energy for s-wave neutrons. Note that the residue here, unlike in the case of SLBW, is complex and involves the overlap effect of all levels.
The total resonance cross section can be cast into a similar form,

$$\sigma_{tR} = \frac{1}{E}\sum_{l,J}\sum_\lambda \Re e\left\{\frac{-i\,e^{-i2\varphi_l}\,r^{(tR)}_{J,l,\lambda}}{z_\lambda - E}\right\} \tag{5.21}$$

where the residue, proportional to the reduced neutron width $\Gamma^{(0)}_{\lambda n}$, is real and readily identifiable with the parameters defined for the SLBW approximation given by Eq. 5.17.
The same procedure is applicable to the absorption cross section defined by Eq. 5.14. It can be shown readily that

$$\sigma_a = \frac{1}{E}\sum_{l,J}\sum_\lambda \Re e\left\{\frac{(-i)\,r^{(a)}_{l,J,\lambda}}{z_\lambda - E}\right\}, \qquad r^{(a)}_{l,J,\lambda} = 4\pi{ƛ}^2\,g_J\,E\left[\Gamma_{\lambda n}/2 - c^{(n)}_{l,J,\lambda}\right] \tag{5.22}$$

where the residue $r^{(n)}_{l,J,\lambda} = 4\pi{ƛ}^2\,g_J\,E\;c^{(n)}_{l,J,\lambda}$ is obtainable from Eq. 5.20 by replacing the index $x$ with $n$.
The MLBW representations given in the complex domain here are directly identifiable with those originally proposed by Adler–Adler [20] when expressed explicitly
in terms of real parameters. It should be noted that the approach described above
represents the generic form including multilevel effects in all cross sections. In
the version suggested by the ENDF/B user’s manual [13], such effects are only
accounted for in the resonance scattering cross section while other partial cross
sections are taken to be the same as those of the single-level approximation described earlier.
5.2.2.3 Adler–Adler Approximation (AA)
The cross sections based on MLBW approximation account for the multilevel
effects attributed to the diagonal elements alone while the multichannel effects arising from the off-diagonal elements are not accounted for. Such an approximation
may be satisfactory for nuclides with relatively isolated resonances. It can become unsatisfactory when fissionable nuclides with closely spaced resonances are
considered.
One improved alternative was pioneered by Adler and Adler [20] in order to deal with the low-lying s-wave resonances of fissionable isotopes. For such resonances, the total width is practically dominated by the sum of the fission and capture widths. Hence, all weakly energy-dependent factors attributable to the elastic scattering process described earlier will have little impact on the level matrix. The penetration factor $P_c(E)$ that appears in $A^{-1}$ can be approximated by $P_c(E_k)$ when applied to the neutron width of each resonance considered. Consequently, one can deduce Kapur–Peierls-like, yet energy-independent, parameters via diagonalization of the inverse level matrix, and the resulting cross sections can be expressed in the same general form in terms of the single-level-like pole expansion described for the MLBW approximation.
With no loss of generality, one convenient way to derive the Adler–Adler parameters directly from the inverse level matrix is to consider a complex eigenvalue problem, a procedure also used in [21] for computing S-matrix parameters, as given below:

$$\sum_\mu B_{\lambda\mu}\,Z^{(\nu)}_\mu = \varepsilon_\nu\,Z^{(\nu)}_\lambda, \qquad B_{\lambda\mu} = E_\lambda\,\delta_{\lambda\mu} - \sum_c \gamma_{\lambda c}\,L^0_c\,\gamma_{\mu c} \tag{5.23}$$

where $\varepsilon_\nu$ is the $\nu$-th complex eigenvalue and $Z^{(\nu)}$ is the normalized eigenvector corresponding to the $\nu$-th eigenvalue, respectively. The normalization is such that $\sum_\lambda \big[Z^{(\nu)}_\lambda\big]^2 = 1$. Once the eigenvalues and eigenvectors are known, the collision matrix can be expressed in the Kapur–Peierls form, where the residue amplitude $g^{1/2}_{\nu c}$ is given by

$$g^{1/2}_{\nu c} = \sqrt{2P_c}\,\sum_\lambda Z^{(\nu)}_\lambda\,\gamma_{\lambda c} \tag{5.24}$$
c D
It is important to note that g1=2
c , the residue amplitude, is physically equivalent to
1=2
c
except it is defined in the complex
p domain. For the elastic scattering channel, gkn , like nk , is proportional to E, for s-wave resonances. Hence, in the
p
same context, one can define g .0/
E as the “reduced” neutron residue
n D gn =
amplitude.
Thus, the generic forms defined in Eqs. 5.19, 5.21 and 5.22 are equally applicable here, except that $z_\lambda$ and $\Gamma^{1/2}_{\lambda x}$ must be replaced by the pole $\varepsilon_\nu$ and the residue amplitude $g^{1/2}_{\nu x}$, respectively. For the sake of completeness, the corresponding residue parameters are explicitly given below:

$$r^{(x)}_{l,J,\nu} = 4\pi{ƛ}^2\,g_J\,E\;\frac{1}{4}\sum_{\nu'}\frac{(2i)\,\sqrt{E}\;g^{(0)1/2}_{\nu n}\,g^{1/2}_{\nu x}\,g^{(0)1/2}_{\nu' n}\,g^{1/2}_{\nu' x}}{\varepsilon^*_{\nu'} - \varepsilon_\nu} \tag{5.25}$$

$$r^{(tR)}_{J,l,\nu} = 4\pi{ƛ}^2\,g_J\,E\,e^{-i2\varphi_l}\,g_{\nu n}/2, \qquad r^{(a)}_{l,J,\nu} = r^{(tR)}_{l,J,\nu} - r^{(n)}_{l,J,\nu} \tag{5.26}$$
where $r^{(n)}_{l,J,\nu}$ can be obtained from Eq. 5.20 if $\Gamma_n$ and $\Gamma_x$ are replaced by $g_{\nu n}$ and $g_{\nu x}$, respectively. The complex parameters here are readily identifiable with those originally derived by Adler and Adler [20] when cast into real arithmetic expressions.
It is noteworthy that the traditional approach of converting the R-matrix parameters into the Kapur–Peierls-type parameters presented here is by no means the only
choice. An attractive alternative pioneered by deSaussure and Perez [22] is believed
to be more efficient for practical applications when used in conjunction with converting the Reich–Moore parameters [23] into pole parameters and is conceptually
more amenable to further generalization.
5.2.2.4 Reich–Moore Approximation
This is an approximation that best preserves the rigor of the R-matrix theory and is most preferred for resonance data evaluations. The only significant assumption is given below [23]:

$$\sum_{c\,\in\,\gamma}\gamma_{\lambda c}\,L^0_c\,\gamma_{\mu c} = \delta_{\lambda\mu}\sum_{c\,\in\,\gamma}L^0_c\,\gamma^2_{\lambda c} \tag{5.27}$$

i.e., the photon channels that appear in the above matrix are taken to be diagonal. The physical justification for this assumption is based on the fact that a large number of photon channels, with $\gamma_{\lambda c}$ assuming either positive or negative signs, are expected, so that the off-diagonal elements are unlikely to be important due to cancellations. This assumption, in effect, provides a significant reduction in the number of channels that one has to deal with.
The inverse level matrix after such channel reduction becomes

$$\left(A^{-1}\right)_{\lambda\mu} = \left(E_\lambda - E - \frac{i}{2}\,\Gamma_{\lambda\gamma}\right)\delta_{\lambda\mu} - \sum_{c\,\notin\,\gamma}\gamma_{\lambda c}\,L^0_c\,\gamma_{\mu c} \tag{5.28}$$

and the corresponding "reduced" R-matrix is expressible as

$$R'_{cc'} = \sum_\lambda \frac{\gamma_{\lambda c}\,\gamma_{\lambda c'}}{E_\lambda - E - i\,\Gamma_{\lambda\gamma}/2}, \qquad c,\,c' \notin \gamma \tag{5.29}$$
The collision matrix in the channel-matrix form, after elimination of the photon channels, can be written as

$$U_{cc'} = e^{-i(\varphi_c+\varphi_{c'})}\left\{\left[2\,(I-K)^{-1}\right]_{cc'} - \delta_{cc'}\right\}, \qquad K_{cc'} = i\,P_c^{1/2}\,R'_{cc'}\,P_{c'}^{1/2} \tag{5.30}$$

Thus, the Reich–Moore approximation can be readily cast into the same generic forms defined by Eqs. 5.12–5.14 in terms of the resonance component

$$\zeta_{nc} = \delta_{nc} - \left[(I-K)^{-1}\right]_{nc}, \qquad 2\,\zeta_{nc} = i\sum_{\lambda,\mu}\Gamma^{1/2}_{\lambda n}\,A_{\lambda\mu}\,\Gamma^{1/2}_{\mu c} \tag{5.31}$$

for the channel-matrix or level-matrix representations, respectively.
The fact that only a small number of elastic scattering and fission channels are present makes the approach much more manageable than the formal R-matrix representation. There are, however, two trade-offs here. First, the "reduced" R-matrix is no longer real, which was a requirement for maintaining the unitarity of the collision matrix. To circumvent this issue, the capture cross section must be defined as the difference between the total cross section and the sum of the non-capture cross sections. The common practice is to compute $\sigma_t$, $\sigma_f$ and $\sigma_a$ via Eqs. 5.12–5.14, respectively, from which $\sigma_s$ and $\sigma_\gamma$ can be deduced. Second, unlike the other approximations, the generic form of the pole representation can no longer be maintained. This has two significant implications for reactor physics applications, where much of the basic concepts and codes were based on the single-level approximation: it is particularly so in conjunction with the analytical Doppler-broadening of cross sections and with the utilization of the resonance integral concept, to be described later.
5.2.3 Other Alternative: Generalized Pole Representation
In the mid-1980s, when the fast reactor program reached its peak, a large body of Reich–Moore resonance data for practically all major nuclides of interest began to become available. The desire to utilize these vastly improved resonance data in many reactor applications motivated the development of the generalized pole representation, whereby both the attractive features of the pole representations for reactor applications and the rigor of the Reich–Moore formalism [23] could be preserved.
One important development prior to the conception of the generalized pole representation was the method pioneered by deSaussure and Perez [22] for converting the R-matrix parameters based on the Reich–Moore approximation to the Kapur–Peierls-type parameters (or Adler–Adler parameters) described earlier. Instead of the traditional approach based on diagonalization of the level matrix described previously, their rationale was based on the explicit mathematical behavior of $\zeta_{nc}$, the resonance component of the collision matrix, as a function of energy. One simple way to illustrate the basis of their approach is to examine the energy dependence of $A^{-1}$ given by Eq. 5.5. In the absence of any energy dependence of $L^0_c$ in this equation under the Adler–Adler assumption [20] for the s-wave resonances, it is quite obvious that the $N \times N$ matrix $A$ is mathematically equivalent to a rational function of energy of order $N$. This can be proved readily by examining the cofactor and determinant of $A^{-1}$: the former must be expressible as a polynomial in $E$ of order $N-1$, while the latter is a polynomial in $E$ of order $N$, the total number of s-wave resonances in question. The same conclusion can also be reached if one examines the channel matrix $(I-K)^{-1}$ defined by Eq. 5.30. Mathematically, it is equivalent to state that $A$ and/or $(I-K)^{-1}$ must be single-valued and meromorphic in the energy plane. Thus, the Adler–Adler parameters can be obtained directly from a given set of Reich–Moore parameters via computation of the poles, and the corresponding residues, of the channel matrix. The former are equivalent to the roots of $\det$
$(I-K) = 0$, which can be computed efficiently via the usual Newton–Raphson scheme. Once the poles are known, the residues can be determined readily. The POLLA code [22] was developed for this purpose.
The rationale of deSaussure and Perez [22] can be readily extended to the rigorous pole expansion of the collision matrix in the k-plane [24]. Conceptually, such a representation can be construed as the natural consequence of the physical condition that the collision matrix must be single-valued and meromorphic in the k-plane (or momentum domain). From the theory of complex variables, such a function is expressible as a rational function with simple poles. With no loss of generality, the same rationale can be extended to the general situation for all possible $l$-states if one substitutes the respective explicit energy dependence of the level-shift factor and penetration factor, given in Table 5.1, into $A^{-1}$. Upon examining the ratio of the cofactor and determinant, one can show that $\zeta_{nc}$ defined by Eq. 5.11 is representable by a rational function of order $2(N+l)$. Thus, the generalized pole representations of cross sections, based on such a rational function and its absolute square, can be expressed as [24]:
$$\sigma_t = \sigma_p + \frac{1}{E}\sum_{l,J}\sum_{j=1}^{N+l}\sum_{\nu=1}^{2}\Re e\left\{r^{(t)}_{l,J,j,\nu}\,\frac{(-i)\,e^{-i2\varphi_l}}{p^{(j)}_\nu - \sqrt{E}}\right\} \tag{5.32}$$

$$\sigma_x = \frac{1}{E}\sum_{l,J}\sum_{j=1}^{N+l}\sum_{\nu=1}^{2}\Re e\left\{r^{(x)}_{l,J,j,\nu}\,\frac{(-i)}{p^{(j)}_\nu - \sqrt{E}}\right\}, \qquad x \in \{f, a\} \tag{5.33}$$
where the poles $p^{(j)}_\nu$ and residues $r^{(x)}_{l,J,j,\nu}$ are genuinely energy-independent. The inner sum reflects the behavior of the poles, which appear in pairs with $p^{(1)}_j$ nearly equal to $-p^{(2)*}_j$. The AA can be considered as a special case for which $p^{(1)}_j = -p^{(2)*}_j$. The pole expansion here also preserves the same general form as that of the Breit–Wigner approximation, readily amenable to various existing codes.

Studies have shown that these pole parameters can also be computed efficiently via the Newton–Raphson scheme provided extended precision is used. The algorithms have been coded in the WHOPPER code [24] for routine applications. It is also noteworthy that, given these parameters, one can deduce the Humblet–Rosenfeld [25]-type parameters directly, as shown in [26]. This feature further enhances its compatibility with existing codes.
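To make the structure of Eq. 5.33 concrete, the sketch below sums two pole terms in the $\sqrt{E}$-domain for a fabricated pair with $p^{(1)} \approx -p^{(2)*}$; the pole and residue values are purely illustrative and do not correspond to any evaluated nuclide.

```python
import numpy as np

# Fabricated sqrt(E)-domain pole/residue data (illustrative only), with the
# pair chosen so that p1 is approximately -conj(p2), as described above.
poles = np.array([2.583 - 0.002j, -2.583 - 0.002j])   # p, in sqrt(eV)
resid = np.array([40.0 + 6.0j, -40.0 + 6.0j])          # r, arbitrary units

E = np.linspace(5.0, 8.5, 2001)
u = np.sqrt(E)                                         # work in the sqrt(E) domain
sig = np.zeros_like(E)
for p, r in zip(poles, resid):
    sig += np.real(-1j * r / (p - u)) / E              # one term of Eq. (5.33)

print(f"peak {sig.max():.1f} (arb. units) near E = {E[np.argmax(sig)]:.2f} eV")
```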
5.3 Doppler-Broadening of Cross Sections
The representations of microscopic cross sections described in the previous section
have been based on the assumption that the target nucleus is at rest in the laboratory system in the neutron–nucleus interaction. Realistically, the target nuclei
must be subject to thermal motion at the elevated temperatures of practical interest.
The procedure of taking into account such an effect in computing cross sections is
generally referred to as the Doppler-broadening of cross sections. In fact, it is the
first macroscopic effect one encounters prior to others in the reactor physics calculations to follow.
5.3.1 Practical Doppler-Broadening Kernels in Use
For practical applications, the classical theory of velocity distributions has generally been used, while solid-state effects of a quantum mechanical nature are usually considered separately. Two models have been widely used to provide the Doppler-broadening kernel required for computation of the temperature-dependent cross sections. They will be addressed in order as follows.
5.3.1.1 Ideal Gas Model
One commonly used model is based on the assumption that the velocities of the
target nuclei follow the classical Maxwell–Boltzmann distribution. One attractive
simplification of such a distribution derived by Solbrig [27] is of particular interest
for practical applications. The resulting Doppler-broadened cross section can be
expressed in terms of the following integral,
$$\sqrt{E}\;\sigma_x\!\left(\sqrt{E},\Delta_m\right) = \int_0^\infty \left[u\,\sigma_x(u,0)\right] S\!\left(\sqrt{E},u\right)du \tag{5.34}$$

where $\sigma_x(\sqrt{E},0)$ will henceforth be used to denote the microscopic cross section at $0^\circ$K and $S(u,u')$, referred to as Solbrig's kernel, is given by

$$S(u,u') = \frac{u'}{2\sqrt{\pi t}\;u}\left\{\exp\!\left[-\frac{(u-u')^2}{4t}\right] - \exp\!\left[-\frac{(u+u')^2}{4t}\right]\right\} \tag{5.35}$$

with $\Delta_m = 2\sqrt{t} = (KT/A)^{1/2}$ denoting the Doppler width in the $u = \sqrt{E}$ domain. As pointed out by Solbrig, Eq. 5.34 preserves the $1/\sqrt{E}$ dependence of the absorption cross section in the limit of low energy.
The use of Solbrig’s kernel in conjunction with Eq. 5.34 provides the rigorous
Doppler-broadened cross sections so long as the ideal gas model is assumed to be
valid. It is particularly amenable to applications in conjunction with the generalized
pole representation described in the previous section. In the early period of reactor
physics, the broadening kernel actually used was a Gaussian in the E-domain, which
can be considered as a limiting case of Solbrig’s kernel defined by Eq. 5.35 when
applied in the relatively high energy region, i.e.,
$$S(u,u')\,du' \cong G(E,E')\,dE' = \frac{1}{\sqrt{\pi}\,\Delta}\,\exp\!\left[-\frac{(E-E')^2}{\Delta^2}\right]dE' \tag{5.36}$$

where $\Delta = \left[(4KTE)/A\right]^{1/2}$ is specifically referred to as the Doppler width when applied in conjunction with the Gauss kernel. Hence, the corresponding Doppler-broadened cross section becomes

$$\sigma_x(E,\Delta) \cong \int_0^\infty \sigma_x(E',0)\,G(E,E')\,dE' \tag{5.37}$$
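Equation 5.37 can be exercised directly on a point-wise 0 K cross section. The sketch below broadens a toy Lorentzian resonance with the Gauss kernel by simple quadrature; the resonance parameters and temperature are illustrative, and a production code would use a union energy grid and a more careful quadrature.

```python
import numpy as np

def broaden_gauss(E, sig_0K, T, A):
    """Doppler-broaden point-wise 0 K cross sections with the Gauss kernel
    of Eq. (5.36), i.e., a direct quadrature of Eq. (5.37)."""
    kB = 8.617333262e-5                 # Boltzmann constant, eV/K
    out = np.empty_like(sig_0K)
    for i, Ei in enumerate(E):
        Delta = np.sqrt(4.0 * kB * T * Ei / A)          # Doppler width at Ei
        G = np.exp(-((Ei - E) / Delta)**2) / (np.sqrt(np.pi) * Delta)
        out[i] = np.trapz(G * sig_0K, E)
    return out

# Toy 0 K Lorentzian resonance (illustrative parameters only).
E = np.linspace(5.0, 8.5, 8001)
sig = 2.0e4 / (1.0 + ((E - 6.67) / 0.012)**2)
sig_hot = broaden_gauss(E, sig, T=900.0, A=238.0)
print(f"peak: {sig.max():.0f} b at 0 K -> {sig_hot.max():.0f} b at 900 K")
```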
5.3.1.2 Accommodation of Crystalline Binding Effects via Effective
Temperature Model
The fact that the thermal motion of the target nuclei may not necessarily be represented adequately by the traditional ideal gas model has been a long-standing
issue in the nuclear community since the early days of reactor development. Strictly
speaking, the target nucleus and compound nucleus subsequently formed are bound
in the energy states of the atom. Hence, this is an area where nuclear physics and
solid state physics become intertwined. For details, readers are referred to an excellent book on the fundamental aspect of this subject by Osborn and Yip [28].
For our purposes here, a widely accepted model will be described. The model,
sometimes referred to as the “effective temperature” model, was pioneered
by Lamb [29] in the study of the binding effects for purely absorbing nuclei. It
was further extended by Egelstaff [30], Nelkin and Parks [31], and others [32]. Two
general assumptions were used in this model. First, the motion of the nuclei about
the lattice sites is harmonic in nature characterized by its frequency distribution,
which reflects the solid state properties of the lattice in question. In the harmonic
approximation, all normal modes of vibration are taken to be independent. Second,
the motion of nuclei is independent of that of the lattice. Of particular interest
here is the case of "weak" lattice binding considered by Lamb, which most closely resembles the practical situation. The weak-binding approximation is based on the assumption that the compound nucleus' lifetime for an energy level in the nuclide is small compared to the maximum relaxation period of the lattice, the reciprocal of the maximum frequency of oscillations. Let $\omega_{\max}$ be the maximum frequency. The maximum binding energy within the crystal is given by $\hbar\omega_{\max}$, where $\hbar$ is Planck's constant divided by $2\pi$. The "weak binding" approximation is assumed to be valid if $(\Gamma_t + \Delta)/2 \gg \hbar\omega_{\max}$. Here, $\Gamma_t$ and $\Delta$ are the total width of a Breit–Wigner resonance and the Doppler width in the energy domain, respectively, as defined previously. Physically, it is quite obvious that this condition will be met as the neutron energy increases.
One quantity of particular physical interest is the average energy characteristic of the thermal motion. For the ideal gas model, the average energy $\bar{E}$ is well known from the thermodynamic nature of the gas, i.e., $\bar{E} = KT$, the product of the Boltzmann constant and the temperature. The corresponding average energy using the effective temperature model is given in terms of the average per mode of oscillation, i.e.,

$$\bar{E} = K\,T_{\mathrm{eff}} = \int_0^{\omega_{\max}}\frac{\hbar\omega}{2}\,\coth\!\left(\frac{\hbar\omega}{2KT}\right) g(\omega)\,d\omega \tag{5.38}$$
where $g(\omega)$ is the normalized frequency distribution, also known as the phonon distribution, describing the wave number per unit frequency interval, and $T_{\mathrm{eff}}$ is the effective temperature deduced from the average energy. The equation above provides the general relationship between the effective temperature and the thermodynamic temperature. Thus, all broadening kernels described earlier remain intact so long as the effective temperature is known. It is important to realize, however, that our knowledge of the phonon distribution for various lattices of interest is, at best, sketchy. Earlier work in this area was focused mostly on the UO$_2$ lattice [32]. The most notable results were given by Dolling et al. [33], based on the shell model fitted to the experimental dispersion curves obtained by inelastic scattering. For practical applications, one usually resorts to the much simplified model known as the Debye model, commonly used to calculate the specific heat. It is based on the assumption that $g(\omega)$ varies parabolically in $\omega$ between zero and the cut-off frequency $\omega_{\max}$. Thus, the effective temperature defined by Eq. 5.38 is characterized by $\omega_{\max}$ alone. Substituting $g(\omega) \propto \omega^2$ into Eq. 5.38, one immediately obtains an equation of manageable form,
$$T_{\mathrm{eff}} = \frac{3\,\theta_D}{2}\int_0^1 x^3\,\coth\!\left(\frac{x\,\theta_D}{2T}\right) dx \tag{5.39}$$

where $\theta_D$ is commonly referred to as the Debye temperature, defined as $\theta_D = \hbar\omega_{\max}/K$. Again, viable information on the Debye temperature is also generally lacking. It is quite obvious that more studies in this area are needed.
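Equation 5.39 reduces the effective temperature to a one-dimensional quadrature. The sketch below evaluates it with scipy; the Debye temperature used is a placeholder, not a recommended value for any particular lattice.

```python
import numpy as np
from scipy.integrate import quad

def t_eff(T, theta_D):
    """Effective temperature from Eq. (5.39), Debye model."""
    f = lambda x: 0.0 if x == 0.0 else x**3 / np.tanh(x * theta_D / (2.0 * T))
    val, _ = quad(f, 0.0, 1.0)
    return 1.5 * theta_D * val

# theta_D below is a placeholder, not an evaluated value for any lattice.
for T in (296.0, 600.0, 1200.0):
    print(f"T = {T:6.1f} K -> T_eff = {t_eff(T, 300.0):7.1f} K")
```

At high temperature $T_{\mathrm{eff}} \to T$, while at low temperature the zero-point motion keeps $T_{\mathrm{eff}}$ above $T$, which the printed values illustrate.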
5.3.2 Analytical Broadening via Doppler-Broadened
Line-Shape Functions
As long as the cross sections can be expressed in the form of a pole expansion as
described in Section 5.2, the Doppler-broadening of each Lorentzian-like term leads
to the analytical forms commonly referred to as the Doppler-broadened line-shape
functions, in which the energy and temperature dependence are explicitly defined.
There are two types of approaches by which these Doppler-broadened line-shape
functions can be derived. One is based on the approximate Gaussian kernel while
the other is based on Solbrig’s kernel.
5.3.2.1 Traditional Doppler-Broadened Line-Shape Functions
The traditional Doppler-broadened line-shape functions based on the approximate
Gauss kernel become quite obvious when the Lorentzian term-based cross sections
are expressed in the complex domain. Upon broadening via Eq. 5.37, each
Lorentzian term described in Section 5.2.2 becomes
$$\sqrt{\pi}\,\left(\xi_k/2\right)W(z) = \psi(x,\xi_k) + i\,\chi(x,\xi_k), \qquad W(z) = \frac{i}{\pi}\int_{-\infty}^{\infty}\frac{e^{-t^2}}{z-t}\,dt \tag{5.40}$$

where $z = (E - z_k)/\Delta$. For the SLBW and MLBW approximations, $z = \left[(E-E_k) + i\,\Gamma_{tk}/2\right]/\Delta$ with $\xi_k = \Gamma_{tk}/\Delta$. For the AA, $z_k$ is identifiable with $\varepsilon_k$ described previously. Here, $W(z)$, referred to as the complex probability integral, is a well-known function, while the $\psi$- and $\chi$-functions are its symmetric and asymmetric components, the traditional Doppler-broadened line-shape functions. Thus, the corresponding Doppler-broadened cross sections for the case of the SLBW approximation become
$$\sigma_x(E,T) = \sum_{l,J}\sum_k \sigma_{0xk}\,\psi(x,\xi_k) \tag{5.41}$$

$$\sigma_{tR}(E,T) = \sum_{l,J}\sum_k \sigma_{0tk}\left[\cos 2\varphi_l\;\psi(x,\xi_k) + \sin 2\varphi_l\;\chi(x,\xi_k)\right] \tag{5.42}$$
respectively. Other comparable approximations can also be expressed in the same
general forms accordingly.
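Because $W(z)$ is available in standard libraries as the Faddeeva function (scipy.special.wofz computes the $W(z)$ of Eq. 5.40 for $\Im m\,z > 0$), the line shapes are essentially a one-liner. The sketch below checks the zero-temperature limits $\psi \to 1/(1+x^2)$ and $\chi \to x/(1+x^2)$ for large $\xi$; all numbers are illustrative.

```python
import numpy as np
from scipy.special import wofz   # Faddeeva function: W(z) of Eq. (5.40)

def psi_chi(x, xi):
    """Doppler-broadened line shapes of Eq. (5.40):
    psi + i*chi = sqrt(pi)*(xi/2)*W(z), with z = (xi/2)*(x + i)."""
    w = np.sqrt(np.pi) * (xi / 2.0) * wofz((xi / 2.0) * (x + 1j))
    return w.real, w.imag

x = np.linspace(-20.0, 20.0, 401)
psi_cold, chi_cold = psi_chi(x, xi=500.0)       # xi -> inf is the 0 K limit
assert np.allclose(psi_cold, 1.0 / (1.0 + x**2), atol=5e-3)
assert np.allclose(chi_cold, x / (1.0 + x**2), atol=5e-3)

# At finite temperature (smaller xi) the resonance peak is depressed:
print(psi_chi(np.array([0.0]), xi=1.0)[0])      # well below 1.0
```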
5.3.2.2 Generalization of Doppler-Broadened Line-Shape Functions
As discussed in Section 5.2.3, the rigor of the energy dependence of cross sections as defined by any formalism can always be preserved if cast into the form of a generalized pole expansion in the k-plane (or $\sqrt{E}$-domain). Therefore, the exact Doppler-broadening of each Lorentzian-like term can be carried out analytically as before if the rigorous Solbrig kernel is used. The substitution of Eqs. 5.32 and 5.33 into Eq. 5.34 leads to [24]:
$$\sigma_x\!\left(\sqrt{E},\Delta_m\right) = \frac{1}{E}\sum_{l,J}\sum_{j=1}^{N+l}\sum_{\nu=1}^{2}\Re e\left\{r^{(x)}_{l,J,j,\nu}\;B_x\!\left(\sqrt{E},\Delta_m;p^{(j)}_\nu\right)\right\} \tag{5.43}$$

$$\sigma_{tR}\!\left(\sqrt{E},\Delta_m\right) = \frac{1}{E}\sum_{l,J}\sum_{j=1}^{N+l}\sum_{\nu=1}^{2}\Re e\left\{r^{(tR)}_{l,J,j,\nu}\,e^{-2i\varphi_l}\;B_x\!\left(\sqrt{E},\Delta_m;p^{(j)}_\nu\right)\right\} \tag{5.44}$$
where $B_x(\sqrt{E},\Delta_m;p^{(j)}_\nu)$, the Doppler-broadened line-shape function of each pole, is given as
$$B_x\!\left(\sqrt{E},\Delta_m;p^{(j)}_\nu\right) = \frac{\sqrt{\pi}}{\Delta_m}\,W\!\left(\frac{\sqrt{E}-p^{(j)}_\nu}{\Delta_m}\right) - C_x\!\left(\sqrt{E},\Delta_m;p^{(j)}_\nu\right), \qquad x \in \{a, f\} \tag{5.45}$$

$$C_x\!\left(\sqrt{E},\Delta_m;p^{(j)}_\nu\right) = \frac{-2\,i\,p^{(j)}_\nu}{\sqrt{\pi}\,\Delta_m^2}\,\exp\!\left(-\frac{E}{\Delta_m^2}\right)\int_0^\infty \frac{\exp\!\left(-t^2 - 2\sqrt{E}\,t/\Delta_m\right)}{\left(p^{(j)}_\nu/\Delta_m\right)^2 - t^2}\,dt \tag{5.46}$$
The quantity $C_x(\sqrt{E},\Delta_m;p^{(j)}_\nu)$ can be viewed as the low-energy correction term, which vanishes exponentially as a function of $E/\Delta_m^2$. Typically, it becomes unimportant when $E \gtrsim 1$ eV for most nuclides of practical interest. Thus, the Doppler-broadened line-shape function exhibits the same mathematical properties in the $\sqrt{E}$-domain as the traditional one in the $E$-domain, except in the low-energy limit.

It should be noted that, strictly speaking, Eq. 5.44 implicitly assumes that the weakly energy-dependent $\varphi_l$ is unimportant in the Doppler-broadening process. For the sake of rigor, the exact broadening to account for such an effect has been developed and incorporated into the W-function approach for practical applications [34].
5.3.2.3 Evaluations of the Doppler-Broadened Functions
For the Doppler-broadening based on the Gauss kernel widely in use, the evaluation of the W -function alone will suffice. For exact Doppler-broadening based on
Solbrig’s kernel, an additional integral defined by Eq. 5.45 may also be required
when the extremely low energy region is considered. Therefore, the evaluation of
the W -function and that of the correction term will be described separately.
Method for Evaluating the W -Function
Two identities of the W-function are of particular practical interest, namely,

$$W(-z^*) = W^*(z), \qquad W(-z) = 2\,e^{-z^2} - W(z) \tag{5.47}$$

which are also referred to as the symmetry relations. Thus, computation of $W(z)$ in the first quadrant (i.e., $0 < \arg z < \pi/2$) will suffice, because the values for the other quadrants can be deduced directly from these relations.
For illustration purposes, two contrasting methods will be described briefly.
The first method is based on the efficient algorithm developed by O’Shea and
Thacher [35]. The second method is known as Turing’s method using the Cauchy
integral approach [36].
First Method
The fundamental basis of the O’Shea–Thacher algorithm [35] is to use a Taylor
expansion based logic. The well-known Taylor’s expansion of the W -function, upon
separation of the even and odd terms, yields the following identity,
$$W(z) = \sum_{n=0}^{\infty}\frac{(iz)^n}{\Gamma\!\left(\frac{n}{2}+1\right)} = \sum_{k=0}^{\infty}\frac{(-1)^k\,z^{2k}}{\Gamma(k+1)} + i\sum_{k=0}^{\infty}\frac{(-1)^k\,z^{2k+1}}{\Gamma(k+3/2)} = e^{-z^2} + \frac{2i}{\sqrt{\pi}}\,D(z) \tag{5.48}$$

where $D(z) = e^{-z^2}\int_0^z e^{t^2}\,dt$ is the Dawson integral.
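The last equality of Eq. 5.48 can be checked numerically on the real axis, where scipy provides both the Faddeeva function and the Dawson integral:

```python
import numpy as np
from scipy.special import wofz, dawsn

# Check W(x) = exp(-x^2) + (2i/sqrt(pi)) D(x) of Eq. (5.48) for real x.
x = np.linspace(-5.0, 5.0, 101)
lhs = wofz(x)
rhs = np.exp(-x**2) + (2j / np.sqrt(np.pi)) * dawsn(x)
print(np.max(np.abs(lhs - rhs)))   # ~1e-16: the identity holds to roundoff
```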
Thus, it is natural to divide $|z|$ into three basic regions. In region 1 ($|z| \ll 1$), it is quite obvious that the use of the low-order terms given by Eq. 5.48 will suffice. O'Shea and Thacher [35] have demonstrated that the continued-fraction method provides an exceedingly accurate and efficient tool for the other regions. In region 2 ($|z| \le 4$), the coefficients required for the continued-fraction approach can be deduced from the portion of the expansion for the Dawson integral given by Eq. 5.48 via the quotient-difference algorithm developed by Henrici [37]. In region 3 ($|z| \ge 4$), there are two ways of computing the W-function. The original version of the O'Shea–Thacher algorithm [35] was also based on the continued-fraction approach in this region, except that its coefficients, obtained via the quotient-difference algorithm, were based on the asymptotic series of the W-function, i.e.,

$$W(z) \sim \frac{i}{\pi}\sum_{k=0}^{\infty}\frac{\Gamma(k+1/2)}{z^{2k+1}} \tag{5.49}$$
Alternatively, another version adopted later was based on the more straightforward
Gauss–Hermite quadrature approach.
Second Method
The W-function can also be computed via the Cauchy integral approach originally developed by Turing [36] for computing the zeta-function. Turing's method was later extended by Goodwin [38] to compute the integral of a meromorphic function with a Gaussian weight, i.e., $\int_{-\infty}^{\infty} e^{-t^2} f(t)\,dt$, via consideration of a contour integral of the form

$$I = \oint_C \frac{e^{-z^2}\,f(z)}{1 - e^{-2\pi i z/h}}\,dz \tag{5.50}$$
Aside from the poles of f .t/, all poles of the sinusoidal function in the denominator
are equally spaced with a spacing of h on the real axis. The contour of integration is
5
Resonance Theory in Reactor Applications
237
taken to be a rectangular box enclosing these poles. The integral of interest can be
related to the contour integral by extending the contour to infinity while the latter can
be determined via the Cauchy integral theorem if the poles of f .z/ are also known.
The same method was first used by Bhat and Lee-Whiting [39] to compute the W function and later by Fröhner [40] to broaden the zero-temperature Reich–Moore
cross sections assuming that the poles were known.
For the case in point, f .t/ D 1=.z t/. Bhat and Lee-Whiting [39] showed via
this method that,
2 2
2
1
2P .h/e z
ih X e n h
C
R.h/
W .z/ D
nD1 z nh
1 e 2iz= h
(5.51)
where P .h/ D 1, 1=2, or 0 for =mfzg<.= h/, =mfzg D .= h/ or =mfzg > .= h/,
respectively. Mathematically, the first and the second terms in Eq. 5.51 are attributed
to the poles of the sinusoidal functions in the denominator of Eq. 5.50 and the single
pole of f .t/, respectively, while the remainder R.h/ is of the order of exp. 2 =h2 /.
Therefore, by choosing the value of h in the range 0:5 < h < 1, optimum efficiency
and accuracy can be achieved.
Evaluation of the Correction Terms
For the sake of rigor, one may include the low-energy correction terms defined by
Eqs. 5.45 and 5.46 as well if the generalized pole representation is considered. It was
found that the pertinent integrals involved can be treated efficiently via the use of
the semi-infinite Gauss–Hermite quadrature developed by Steen et al. [41], under
the condition where these terms may play a role, i.e., small E=m . A code for
generating the weights and abscissas of such quadrature in extended precision was
developed in order to alleviate the round-off problem resulting from the recursive
relations required for the higher-order orthogonal polynomials of interest. For our
purpose here, a quadrature of 24 points was found to be sufficient. For details, the
reader is referred to [34].
5.3.3 Direct Numerical Doppler-Broadening of Point-Wise
Cross Sections
It is important to point out that the Doppler-broadening of cross sections via the
traditional line-shape functions is by no means unique. In fact, various numerical
techniques can also be utilized for computing the Doppler-broadened cross sections
at any elevated temperature directly from a predetermined set of point-wise cross
sections at zero temperature. Unlike the analytical broadening where one deals only
with a known function with known characteristics, one must deal with the numerical
algorithms for treating an extremely large body of mesh points over a large energy
range of interest.
238
R.N. Hwang
There are two types of numerical methods for reactor applications in existence,
namely, the kernel-broadening method and the method based on the solution of the
traditional heat equation via the finite-difference method. Conceptually, the validity
of the low-order Taylor expansion within any mesh interval for the cross sections
to be broadened must be implicitly or explicitly assumed in these methods. Brief
discussions of this subject will be presented in the following.
5.3.3.1 Kernel-Broadening Approach
With no loss of generality, the Doppler-broadened cross section defined by Eq. 5.34
will be considered. The kernel-broadening
p
schemes in question are based on the
p
E; 0 over Solbrig’s kernel defined by Eq. 5.36
piece-wise integration of Ex
in each predetermined segment. Upon changing variables, Eq. 5.34 can be cast into
the following discretized form,
0
Zbi1
N
X
1
B
t 2
f .y; m / D p
@ e .t C y/f Œm .t C y/; 0 dt
y i D1
ai1
Zbi 2
1
2
C
e t .t y/f Œm .t y/; 0 dtA W xi y xi C1 (5.52)
ai 2
p
p
p
p
where f
E; m D Ex
E; m , y D
E=m and fxi g is a set of
p
precomputed mesh points with xi D E=m . Here, the limits of integration are
defined as ai1 D xi y, bi1 D xi C1 y, ai 2 D xi C y and bi 2 D xi C1 C y for these
two integrals, respectively.
There are two general ways of carrying out the integration numerically in each
mesh interval in question. One method is to assume that the cross section to be
broadened is representable by a low-order Taylor expansion explicitly, typically no
more than two terms. The schemes based on this assumption will henceforth be referred to as the explicit kernel-broadening scheme. Alternatively, the integral of this
type is obviously readily amenable to the low-order Gauss–Hermite quadrature for
arbitrary limits of integration if the pertinent weights and abscissas can be computed
efficiently. The latter implicitly assumes that the integrand excluding the Gaussian
weight must be representable by a polynomial of reasonable order.
Explicit Kernel-Broadening Method
The integral in Eq. 5.52 can be carried out analytically if Taylor’s expansion is
used, i.e.,
Zbij
aij
N
X
2
e t t ˙ y f Œm t ˙ y ; 0 dt D
cn.ij / Mn.ij / aij ; bij ; j 2 1 or 2 (5.53)
nD0
5
Resonance Theory in Reactor Applications
239
Rbij
2
.ij/
where the n-th order moment of the Gaussian is defined as Mn .aij ; bij /D t n e t dt
aij
and these moments can be readily computed recursively in terms of error functions
and exponential functions.
There are two comparable methods which differ only in the domain of expansion
one assumes.
Linearity in Energy
The most commonly used method [42, 43] is based on the assumption that the cross
section at zero temperature varies linearly in energy between two mesh points, xi <
x < xi C1 , i.e.,
xi2C1 xi2
y .x; 0/ D y.i / C si x 2 xi2 ; si D y.i C1/ y.i /
(5.54)
p
Physically, this expansion can not rigorously reproduce the 1= E limit of absorption cross section as energy approaches zero. It should also be noted
pthat, although
the linear equation in energy becomes a quadratic equation in the E -domain, it
is not equivalent to a second-order approximation from the perspective of Taylor’s
expansion in lieu of the presence of the linear term. The substitution of Eq. 5.54 into
5.53 leads to the explicit specification of Eq. 5.52 in which the linear combination
up to the fourth-order moment is required.
Linearity in
p
E
In contrast, the linear approximation for xi < x < xi C1 in the
Solbrig’s kernel was defined, can be represented by [44]
xy .x; 0/ D xi y.i / C si .x xi /;
si D
p
E -domain, where
xi C1 y.i C1/ xi y.i /
xi C1 xi
(5.55)
Upon substituting Eq. 5.55 into 5.53, the resulting integrals in Eq. 5.52 become a
linear combination of moments up to the second order. Thus, it is numerically more
efficient than the former approach.
p It is also important to note that this approximation clearly preserves the 1= E-dependence of the absorption cross section
as energy approaches zero, the necessary criterion from which Solbrig defined
the Doppler-broadening given by Eq. 5.34. Extensive benchmark calculations have
shown that both approximations based on identical sets of precomputed mesh points
and internally consistent coding give practically the same results except in the low
energy limit.
Implicit Kernel-Broadening Method
The basic integral for the Kernel broadening scheme defined by Eq. 5.52 immediately suggests that one simple, yet accurate, means to evaluate this type of integral
240
R.N. Hwang
is via the generalized Gauss–Hermite quadrature for integrals with finite limits of
integration. One obvious advantage of the Gauss quadrature is that the N -th order
quadrature will provide the exact value of the integral if the integrand excluding
the weight function is representable by a polynomial of order .2N 1/. Such an approach requires the specification of the corresponding weights and abscissas which,
in turn, must be determined from the associated orthogonal polynomials and their
roots. For a two-point quadrature of this type, the latter can be determined analytically with equal ease as the other two methods described above using the same
Gaussian moments up to third order, as demonstrated in [44]. Much better accuracy
than the previous two methods can be achieved via this method using the same mesh
structures as also reported in [44].
5.3.3.2 Heat Equation Approach Based on the Finite Difference Method
It is well-known that the Doppler-broadening process is equivalent to the solution
of the linear heat flow equation in a semi-infinite or infinite medium which can be
described by a second-order partial differential equation of the form,
@f .u; t/
@2 f .u; t/
D
2
@u
@t
(5.56)
if the initial condition f .u; 0/ is specified. It can be shown [44], via the use of
Fourier transform, that the solution of Eq. 5.56 isdirectly identifiable
with Eq. 5.34
p
p
2
E; m .
if one sets u D E, t D m and f .u; t/ D Ex
For reactor applications, there are two finite-difference methods in use for this
purpose. One is based on the implicit finite-difference method developed by Dunford
and Bramblett [45] and the other is based on the explicit finite-difference method developed by Leal and Hwang [46]. For details, the reader is referred to the respective
references cited.
5.4 Resonance Absorption in Homogeneous Media
Much of the earlier developments in the theory of resonance absorption in nuclear
reactor systems began with the treatment of such phenomena in an infinite homogeneous medium. In lieu of the spatial dependence, the steady state Boltzmann
transport equation that defines the balance of the neutron population at a given energy E reduces to the following form, generally referred to as the slowing-down
equation [47],
.E/†t .E/ D
XZ
i
dE0 .E 0 /†si .E 0 /fi .E 0 ! E/ C Q.E/
(5.57)
5
Resonance Theory in Reactor Applications
241
where †si .E/ and fi .E 0 ! E/ are the macroscopic elastic scattering cross section
and the average scattering kernel for a given nuclide i , respectively. Here, Q.E/ denotes the neutron source of non-elastic scattering nature. For the laboratory system,
where the target nucleus is usually assumed to be at rest, the average scattering
kernel fi .E 0 ! E/ is the average of fi .0 ; E 0 ! E/ over all solid angles,
fi .E 0 ! E/ D
1
4
Z
d 0 fi .0 ; E 0 ! E/
(5.58)
E0 E and the energy/angle correlation is determined by the kinematics
where 0 D of elastic collision.
5.4.1 Average Scattering Kernel for Practical Applications
For practical applications, the energy-angle correlation can be simplified considerably if the kinematics of elastic collision is cast into the center-of-mass system
where the center of mass remains stationary upon collision. In other words, the
magnitudes of the initial and final neutron velocities are invariant. By utilizing such
a property and the requirement of energy conservation, it can be shown that the
energy-angle correlation can be specified by the following relation [47],
1
E0 E
1
D
0
E
2
Ai 1
Ai C 1
2 !
.1 c / D
1
.1 ˛i /.1 c /
2
(5.59)
where ˛i is referred to as the maximum neutron energy loss per collision with a
nuclide with atomic weight Ai and c is the directional cosine in the center-of-mass
system. Equation 5.59 implies that the corresponding scattering kernel is identifiable
with a ı-function of the form,
2Ai E 0
.1
/
(5.60)
f .c ; E 0 ! E/ D ı E E 0 C
c
.1 C Ai /2
If one further assumes that the scattering is isotropic in the center-of-mass system,
i.e., d c D 2 sin c dc , a condition clearly true for the relatively low energy region dominated by the s-wave resonances, one obtains,
1 dE
; ˛i E 0 E E 0
1 ˛i E 0
D 0; elsewhere
fi .E 0 ! E/dE D
(5.61)
Physically, Eq. 5.61 signifies the fact that upon elastically colliding with nuclide Ai
a neutron acquires an energy E with equal probability in the range ˛i E 0 E E 0 .
For practical calculations, it is more convenient to examine slowing-down issues on
242
R.N. Hwang
the basis of a logarithmic scale instead of the energy scale particularly attractive in
conjunction with the widely used multigroup approach. This can be accomplished
by introducing a new variable u D ln.E0 =E/, generally referred to as “lethargy,”
where E0 is an arbitrary reference energy. It follows that the scattering kernel in the
u-domain becomes
0
fi .E 0 ! E/dE D Ki .u u0 /du D
D 0;
e .uu /
du;
1 ˛i
0 u u0 "i
elsewhere
(5.62)
where "i D ln.1=˛i / is the maximum increment in lethargy per collision. Given
these scattering kernels, the slowing-down equations in the respective domains are
defined.
5.4.2 Characteristics of the Slowing-Down Equation
In the lethargy domain, the slowing-down equation given by Eq. 5.57 becomes,
F .u/ D
u
X Z
Ki .u u0 /hi .u0 /F .u0 /du0 C Q.u/
(5.63)
i u"
i
where F .u/ D †t .u/ .u/ D E†t .E/ .E/, known as the collision density, physically signifies the total collision rate per unit volume by those neutrons within the
element du, and hi .u/ D †si .u/=†t .u/ is the ratio of the macroscopic scattering
cross section of a given nuclide i and the macroscopic total cross section of the
system.
Unlike .u/, F .u/ is a much smoother function of u and, therefore, Eq. 5.63 is
preferable when resonances are present. From the perspective of resonance treatment, the energy region of interest is usually far away from the neutron source of
non-elastic nature. Hence, it suffices to focus on the Green’s function solution of
Eq. 5.63 equivalent to the case of mono-energetic source whereby
F0 .u/ D
u
X Z
0
0
0
Zu
0
Ki .u u /hi .u /F0 .u /du C ı.u/; F .u/ D
i u"
i
Q.u0 /F0 .u u0 /du0
0
(5.64)
The solution to Eq. 5.64, in turn, can be pictured as the linear combination of the
form, F0 .u/ D ı.u/ C Fs .u/, where Fs .u/ signifies the collision density away from
the mono-energetic source. Thus, it follows that
Fs .u/ D
u
X Z
i u"
i
Ki .u u0 /hi .u0 /Fs .u0 /du0 C
X
i
hi .0/Ki .u/
(5.65)
5
Resonance Theory in Reactor Applications
243
where hi .0/ will henceforth be
Ptaken as a constant independent of u. It should be
noted that the source function i hi .0/K.u/ above exhibits discontinuities at u D 0
and u D "i .
Although Eq. 5.65 is seldom used in practical calculations, it is, nevertheless, of
great interest conceptually because it provides the means to illustrate the fluctuations of collision density as a function of u, commonly referred to as the Placzek
oscillations [48] as one shall see.
5.4.2.1 Slowing-Down Density
One quantity of great practical importance is generally referred to as the slowingdown density, which represents a measure of the total number of neutrons slowed
down past a given energy E (or above the corresponding lethargy u), defined as
q.u/ D
u
X Z
hi .u0 /F .u0 /du0
i u"
i
D
u
X Z
0 C"
uZ
i
Ki .u00 u0 /du00
u
hi .u0 /F .u0 /ki .u u0 /du0
(5.66)
i u"
i
where ki .u u0 / D Œexp.u0 u/ ˛i =.1 ˛i / is the explicit form of the inner integral on the first lines of Eq. 5.66. For the special case in the absence of resonances,
the slowing-down density simply becomes
q.u/ D Fs ;
D
X
i
Zu
hi .0/
.u u0 /ki .u u0 /du0 D
X
Œhi .0/
i
(5.67)
i
u"i
where the constant is referred to as the average increment of lethargy per collision
in the system consisting of many nuclides.
Differentiating Eq. 5.66 with respect to u and utilizing Eq. 5.63, one obtains the
following first-order differential equation,
dq.u/
D Q.u/ †a .u/ .u/
du
(5.68)
where †a .u/ D †t .u/ †s .u/ is the macroscopic absorption cross section of the
system. If one considers a mono-energetic source Q.u/ D qasym ı.u/, one obtains
2
q.u/ D qasym 41 1
qasym
Zu
0
3
†a .u/ .u/du5
(5.69)
244
R.N. Hwang
Physically, the term inside the square brackets is commonly referred to as the
resonance escape probability, while the second term therein denotes the resonance
absorption probability. These equations have provided the theoretical basis for many
useful methods in reactor applications. Their impact will be further addressed in
later sections.
5.4.2.2 Concept of Placzek Oscillations
One fundamental phenomenon associated with the neutron slowing-down process
is known as the Placzek oscillations. It was first examined by Placzek [48] for the
case of a medium with a single nuclide in the absence of resonance absorption.
Two specific features of the slowing-down process are of great theoretical interest
here, namely, a general description of the Placzek oscillations in the presence of
many nuclides and their impact on the slowing-down equation when resonances are
present. In the following discussions, the emphasis will be focused on the conceptual
aspects of these topics.
One conceptually plausible way of providing a better understanding of this subject is to cast Eq. 5.65 into an alternative form as first proposed by Corngold [49]
via the use of the Laplace transform. Define the Laplace transform of a function and
its inverse symbolically as
fQ.p/ D
Z1
f .u/e
pu
1
L
du;
1
ffQ.p/g D
2 i
0
cCi
Z 1
e up fQ.p/dp D f .u/ (5.70)
ci 1
Taking the Laplace transform of Eq. 5.65, one obtains
I
P
FQs .p/ D
I
P
hi .0/KQ i .p/
i D1
I
P
1
i D1
hi .0/KQ i .p/
i D1
KQ i .p/Lfgi .u/Fs .u/g
1
I
P
i D1
hi .0/KQ i .p/
Q
D PQ .p/ R.p/
(5.71)
where gi .u/ D hi .0/ hi .u/ and I is the total number of nuclides in the system. In
Q
the absence of resonance absorption, gi .u/ must vanish so that R.p/
D 0. Thus,
upon inversion, Fs .u/ D P .u/. Here, P .u/ will henceforth be referred to as the
Placzek function in the presence of many nuclides unless otherwise stated.
In the presence of resonance structures, the inversion of Eq. 5.71 leads to an
alternative form of the slowing-down equation derived by Corngold [49],
Fs .u/ D P .u/
I Z
X
i D1 0
u
du0 Gi .u u0 /gi .u0 /Fs .u0 /
(5.72)
5
Resonance Theory in Reactor Applications
245
where the kernel Gi .u u0 / and the Placzek function can be defined as
8
9
ˆ
>
ˆ
>
ˆ
>
I
<
=
X
Q i .p/
K
1
; P .u/ D
hi .0/Gi .u/
Gi .u/ D L
I
ˆ
>
P
ˆ
>
i
D1
>
Q
hi .0/Ki .p/ ;
:̂ 1 (5.73)
i D1
Hence, Eq. 5.72 provides the direct connection between the solution to the slowingdown equation without resonances to that with resonances.
The Laplace transform of the scattering kernel can be determined analytically.
The substitution of KO i .p/ into Eq. 5.71 yields,
I
P
LfP .u/g D
i D1
hi .0/
1˛i
.p C 1/ I
P
i D1
1 e "i .1Cp/
hi .0/
1˛i
(5.74)
1
e "i .1Cp/
By examining the above equation, the asymptotic properties of P .u/ and its shortrange nature of fluctuations can be readily established analytically. From the Laplace
transform identity, it is quite obvious that the asymptotic limit of the Placzek function must be
1
lim P .u/ D lim Œp LfP .u/g D
u!0
p!0
1
I
P
i D1
D
˛i "i
hi .0/ 1˛
1
N
(5.75)
i
where is the average lethargy increment per collision also defined in Eq. 5.67.
Similarly, the short-range nature of fluctuations in P .u/ and/or Gi .u/ can also be
determined via the use of the asymptotic expansion of Eq. 5.74 for large p.
For illustration purposes, consider a single nuclide in the limit of large p, so that
Eq. 5.74 can be represented by the following series,
˛ n n" p
1
e
1 1 ˛e "p X .1/n 1˛
lim LfP .u/g D
(5.76)
n
˛
˛
p!1
1 ˛ p 1˛ nD0
p 1˛
where the subscript i is dropped for convenience. Note that the presence of
exp.n"p/ indicates that its inverse must consist of a unit step function for each
interval n". The inversion of Eq. 5.76 can be carried out term by term and, upon
rearrangement, one obtains
˛
1
e 1˛ u X .1/n ˛ n ˛ .un"/
C
e 1˛
P .u/ D
1˛
nŠ
1
˛
nD1
Œn C .u n"/ .u n"/n1 H.u n"/
where H.u n"/ is the Heaviside function (or unit step function).
(5.77)
246
R.N. Hwang
It is interesting to note that Eq. 5.77 provides the same results identifiable with
those derived originally by Placzek if the sum in Eq. 5.77 is written out explicitly.
A similar approach can be extended readily to P .u/ and/or Gi .u/ for a mixture with
many nuclides. It can be shown that the results so obtained are identifiable with those
given in [50]. Some results of Gi .u/ and P .u/ for a typical fast reactor composition
are given both quantitatively as well as graphically in [50]. The fluctuations are
noticeably less striking in the first few intervals when other scatterers with much
smaller atomic weights are present.
From Eq. 5.72, it is quite obvious that the fluctuations in the kernel Gi .uu0 / will
impact the collision density as well when resonances are present. For the relatively
high energy region where the extent of a resonance is usually less compared to the
corresponding ", only the first scattering interval will be of practical interest from the
perspective of resonance integral considerations. As a general rule, under the latter
condition, F .u/ either exhibits a bump above its asymptotic value, or a drop below it,
corresponding to whether gi .u/ assumes predominately positive or negative values.
As discussed in [50], the former represents the scenario where the neutron width n
is much greater than other partial widths whereas the latter represents the reverse
scenario. The impact can be substantial when the neutron width is very large. For
more details, the reader is referred to [50].
5.4.3 Resonance Integrals and Their Applications
The resonance integral concept was introduced in the early stages of reactor physics
development as an effective means to account for resonance absorption attributed to
a handful of low-lying s-wave resonances of few actinides using the Breit–Wigner
approximation. The general concept is readily extendable to other types of crosssection representations when cast into the form of the generalized pole expansion
described in Section 5.2.
5.4.3.1 Traditional Resonance Integral Concept
In addition to the use of the Breit–Wigner approximation, the three main assumptions used in earlier days that eventually led to four widely used approximations are:
(1) flux recovery between resonances; (2) .0/ 1 above each resonance; (3) the
collision density is taken to be constant for all nuclides in the mixture except for
the resonant isotope in question. Hence, the slowing-down equation at energies far
below the source energy becomes
Zu
†t .u/ .u/ D †m C
u"
0
du0
e .uu /
†sr .u0 / .u0 /
1˛
(5.78)
5
Resonance Theory in Reactor Applications
247
where †m is the energy-independent macroscopic scattering cross section of
nonresonance nuclides and the subscripts for various parameters of the resonant
isotope will, henceforth, be dropped for convenience. The simplified equation above
also implies that the absorption of the medium is due entirely to the resonance under
consideration.
These assumptions lead immediately to various means to determine the flux and
absorption rate for each Breit–Wigner resonance from which the slowing-down density can be determined in the context of Eq. 5.69. The resonance integral, defined as
the absorption rate per atom, i.e.,
Z1
RI D
x .u/ .u/du
(5.79)
0
has been used as one of the major tools in studies of resonance absorption.
5.4.3.2 Various Resonance Integral Approximations
During the period preceding the mid-1960s, there were four widely used approximations for treating the resonance integrals of the well-isolated Breit–Wigner
resonances. Brief discussions of these methods are presented below.
Narrow Resonance Approximation (NR)
If the extent of the resonance is small compared to the maximum energy loss per
collision, the resonance contribution to the integral term in Eq. 5.78 becomes negligible so that Eq. 5.78 is reduced to
.u/ D
†p
†p
D
†t .u/
†p C †Rt .u/
(5.80)
where †Rt .u/ is the total macroscopic resonance component and †p D †m C †pr
is the total macroscopic potential scattering cross section. The corresponding resonance integral becomes
p ar
J.r ; ˇr ; ar /
Er cos l
Z1
.r ; x/dx
p ar 1
D
Er cos l 2
ˇr C .r ; x/ C ar .r ; x/
.RI/NR D
1
(5.81)
248
R.N. Hwang
where p D †p =Nr , ar D r C fr , ˇr D †p =.†0r cos l / and ar D tan 2 l .
Here, †0r is the total macroscopic peak resonance cross section of level r. It is
worth noting that the hard-sphere phase shift angle, l , was usually assumed to
be small in the low energy range so that cos l 1 and sin 2 l 2 l . In fact, the
asymmetric Doppler-broadened line-shape function was often ignored for the sake
of expediency in many earlier works [6]. Since J.r ; ˇr /, in absence of ar , is readily
amenable to the utilization of a precomputed table in a two-dimensional array, it was
most widely used in earlier days. It will be shown that the integral of the form given
in Eq. 5.81 can be computed efficiently without resorting to tabulation.
Infinite Mass Approximation (NRIM or WR)
In the limit of infinite mass or the extent of the resonance much wider than the
maximum lethargy increment per collision, i.e., " ! 0 in Eq. 5.78, the flux and its
corresponding resonance integral are also reduced to the simple forms [6],
.u/ D
†m
;
†m C †a .u/
.RI /NRIM D
m t
J.r ; ˇr /
Er
(5.82)
respectively, where ˇ r D †m =.†0r a =t / and m is the diluent cross section per
absorber atom.
Intermediate Resonance Approximation (IR)
It is apparent that some means to bridge the gap between the two approximations
given above is needed. One widely used approximation that serves this purpose is
the IR approximation originally proposed by Goldstein and Cohen [51]. From the
foregoing discussions, it is reasonable to conjecture that the neutron flux across a
resonance generally resembles the following approximate form,
.u/ D
†m C †pr
†m C †a .u/ C †sr
(5.83)
where is a parameter characteristic of the resonance to be determined. This expression obviously leads to the NR and WR approximations as approaches 1 and 0
respectively. Furthermore, the corresponding resonance integral based on such a flux
shape must retain the same general forms defined by Eqs. 5.81 and 5.82 provided
that is insensitive to energy. By substituting the Doppler-broadened Breit–Wigner
cross sections into Eq. 5.83, one obtains the same general form of the resonance
integral as that for the NR-approximation with different arguments, i.e.,
.RI/IR D
./ p./ ar
J r ; ˇr./
Er
(5.84)
5
Resonance Theory in Reactor Applications
249
./
where p./D.†m C †pr /=Nr , ar
D ar t =.ar Cnr / and ˇr./ D ˇr t =.ar
Cnr /, respectively. Thus, it amounts to redefining the parameters for the expression based on the NR-approximation defined by Eq. 5.81. Here, the parameter must reflect the higher-order effects of the slowing-down equation. Goldstein and
Cohen [51] argued that Eq. 5.83 can be viewed as the first-order iterant of the
integral equation of the Fredholm type defined by Eq. 5.78, i.e., .1/ D .u/. Substituting .1/ .u/ into Eq. 5.78, one obtains the second-order solution to be denoted by
.2/
.u/. If the iterative process converges rapidly, .1/ .u/ and .2/ .u/ must not be
significantly different. Therefore, one plausible criterion for determining was by
setting
Z1
Z1
.1/
†ar .u/
.u/ du D †ar .u/ .2/ .u/ du
(5.85)
0
0
From this transcendental equation, one may deduce the value of provided that
the integration on the right-hand side can be carried out analytically into a manageable form. One obvious problem that hinders such a procedure is its complexity when the Doppler-broadened function along with its asymmetric component
are considered. If one neglects the inherent temperature-dependence and uses the
Lorentzian shape for the resonance, a procedure commonly used, can be determined explicitly.
Direct Numerical Approach (Nordheim’s Method)
In the mid-1960s, the availability of modern computers made possible the alternative
of a relatively rigorous treatment of the isolated resonance integral without resorting
to significant approximations and complications illustrated by the IR approximation.
One such method widely used in thermal reactor applications, especially in the
United States, is that pioneered by Nordheim [52]. Nordheim’s method was intended
for the treatment of the resonance integral in two-region repeated cells imbedded in
an infinite reactor lattice via the collision probability method. For the purposes of
this section, it suffices to focus only on its basic algorithm of treating the slowingdown equation and resonance integral for the case of the infinite homogeneous
medium. The subject of collision probabilities will be addressed separately in the
next section.
This method amounts to a breakup of the resonance integral into the following
form,
Z1
Zu2
(5.86)
.RI/Nord D x .u/ .u/du D x .u/ .u/du C I
0
u1
where the lethargies u1 and u2 correspond to the predetermined energy boundaries
E1 and E2 , respectively, around the resonance peak with Ej D E0 ˙ m. p =2/; j 2
1; 2. Physically, the interval is taken to be an “m” multiple of the “practical” width
p D t =2ˇ 1=2 approximately equivalent to the “half width” of the integrand based
250
R.N. Hwang
on the NR or WR approximations. If m p is taken to be much greater than the
Doppler width, where the Doppler-broadened line-shape functions approach their
Lorentzian limits, I , consisting of two integrals corresponding to those attributed
to the tails on both sides of the resonance outside of u1 < u < u2 , becomes
analytically integrable.
To evaluate the integral defined in Eq. 5.86 requires the solution of the slowingdown equation given by Eq. 5.78 at each mesh point uj between u1 and u2 as well
as the subsequent resonance integral itself. Thus, it amounts to the evaluation of
a double integral which can be accomplished via the use of Simpson’s integration
scheme with equally spaced mesh points with the mesh spacing taken to be much
smaller than the maximum increment of lethargy per collision for the resonance
absorber in question. Nordheim [52] showed that the discretized integrand for the
integral defined in Eq. 5.86 can be readily obtained recursively. Unlike other approximations, this method preserves the rigor of the flux behavior within the crucial
region around the peak of the Breit–Wigner resonance. It was eventually superseded
by other more rigorous methods during the emergence of the fast reactor program
when more advanced computational tools became available.
5.4.4 Various Developments Motivated by the Emergence
of the Fast Reactor Program
The emergence of fast reactor development and modern computing facilities had
a significant impact on our philosophies for treating the resonance phenomena in
practical applications. As discussed in Section 5.1.1, the former casts the role of resonance cross sections into a somewhat different light while the latter makes possible
the development of many rigorous methods for routine applications unimaginable
in earlier days.
There were several challenges directly associated with fast reactor development. For resonance treatments in the homogeneous media, there are three major
challenges. First, one must account for overlap effects resulting from neighboring
resonances either from the same nuclide or from different nuclides even if the Breit–
Wigner approximation is assumed. Second, one must also account for the spectral
effects resulting from resonance scattering of intermediate weight nuclides. Third,
one must deal with compatibility issues of the vastly improved nuclear data and their
representations in conjunction with the existing reactor physics concepts and codes
based on the Breit–Wigner approximation. As discussed in Section 5.2.3, the latter
can be resolved via the use of the generalized pole representation. Hence, it suffices
to focus on various developments in the first area for our purposes here.
5.4.4.1 Generalization and Computation of the J -Integral
Our shift of interest to the higher energy region clearly justifies the extensive use
of the NR-approximation for treating many sharp resonances of actinides. One
5
Resonance Theory in Reactor Applications
251
way to handle these issues is to generalize the integral within the context of the
NR-approximation. If the total macroscopic resonance cross section defined in
Eq. 5.80 is taken to be a linear combination of all Breit–Wigner resonances in the
system, the NR-approximation at the k-th resonance can be cast into the following
form via partial fractions [53],
0
B
B
1
.u/ D †p B
B † C † .u/
tk
@ p
P
1
C
C
!C
C
P
A
†p C †t k .u/ †p C †t k .u/ C
†k 0 k .u/
k 0 ¤k
†k 0 k .u/
(5.87)
k 0 ¤k
where k 0 denotes the neighboring resonances. Thus, the corresponding generalized
J -integral, referred to as the J -integral, must exhibit the following form,
Jk D Jk .k ; ˇk ; ak ; bk / X
Okk0
(5.88)
k 0 ¤k
where
1
Jk .k ; ˇk ; ak ; bk / D
2
Z1
1
.k ; xk / C bk .k ; xk /
dxk
ˇk C .k ; xk / C ak .k ; xk /
(5.89)
and Okk0 is referred to as the overlap term attributed to the k-th resonance with
the integrand deducible from Eq. 5.87 in terms of the Doppler-broadened line-shape
functions. The principal component J.k ; ˇk ; ak ; bk / represents the contribution
from the k-th resonance alone. The parameter bk D 0 is taken when used in conjunction with the capture and fission resonance integrals so that it is consistent with
Eq. 5.81. For total and/or scattering resonance integrals, one sets bk D ak . For details, the reader is referred to discussions given in [53].
It is important to point out that such an integral is composition-dependent and
the required calculations must be carried out at run time. Hence, an efficient method
for such a purpose is obviously needed.
One proven method that has been extensively used in fast reactor applications
is that based on the special form of the Gauss–Jacobi quadrature, sometimes also
referred to as the Gauss–Chebyshev quadrature. While the detailed discussions are
given in [53], some general features of this method will be discussed briefly for our
purposes here.
The rationale for this approach is based on the fact that the Doppler-broadened
line-shape functions can become relatively well-behaved in a new domain upon
252
R.N. Hwang
changing the variable of integration. One simple way to accomplish this is via either
one of the following two types of rational transformations [53],
or u2 D C 2 xk2 = 1 C C 2 xk2
(5.90)
u2 D 1= 1 C C 2 xk2
where C is a constant to be chosen. With no loss of generality, the integral of the following form can be cast into the form readily amenable to Gauss–Jacobi quadrature
via the first type of transformation, i.e.,
Z1
f . .xk ; k /; .xk ; k //dxk
1
Z1
D
1
D
du
p
1 u2
1 f . .xk ; k / ; .xk ; k / ; k /
C
u2
N
1 X f . .xk .un /; k /; .xk .un /; k /; k /
C RN
C N nD1
u2n
(5.91)
where un D cosŒ.2n 1/=2N and N is the total number of points considered.
For such a quadrature to be effective, the part of the integrand inside the bracket
in the equation above must be a smooth function of the new variable u so that it
can be accurately approximated by a low-order Chebyshev polynomial expansion.
For computation of the integral J.k ; ˇk ; ak ; bk /, it was shown in [53] that only a
few mesh points are required to ensure accurate results if the constant C is carefully
chosen in various ranges of k and ˇk . The same algorithm is equally applicable to
the evaluation of the overlap correction term usually consisting of few contributing
neighboring resonances if a larger number of mesh points is used. This algorithm
has been incorporated into the MC2 -2 code [54] for routine applications.
5.4.4.2 Connections Between the Resonance Integral and Traditional
Multigroup Cross Section Processing
With no loss of generality, the multigroup cross section for a given reaction process
x in a predetermined lethargy group ug D ug ug1 can be expressed in terms of
resonance integrals within this group boundary as follows [53],
P
xk Jxk
Fk =E0k
.g/
p k2g
.g/
Q x D
(5.92)
ug
fg
where fg , sometimes referred to as the flux correction factor, is defined as
1 X tk Jtk Fk
fg D hFk i ug
E0k
(5.93)
k2g
and Fk , the collision density at the k-th resonance, is generally taken to be constant.
5
Resonance Theory in Reactor Applications
253
There are two ways that the multigroup cross sections can be deployed in reactor
applications. One simple approach known as the Bondarenko scheme where the selfshielding factors for a chosen group structure are precomputed by using Eq. 5.92,
.g/
without the overlap term, as a function of p at various temperatures with results tabulated and stored in one-dimensional arrays. In lieu of the overlap term,
.g/
the quantity p infers the approximate composition dependence of the multigroup
cross sections so that this scheme provides an efficient means for survey-type calculations. Another approach is to process these multigroup cross sections taking into
account the actual composition dependence before the results are passed on to neutronic calculations [54]. As mentioned earlier, accurate accounts of the composition
involving many nuclides with resonance structures are extremely important in fast
reactor applications and the multigroup cross sections must undergo further refinements prior to their deployment. For our purposes here, it suffices to present the
rationale for two widely used methods that provide a better account of the global
spectral effects.
Ultra-Fine Group/Fundamental Mode Approach
The ultra-fine group (ufg) approach was pioneered by Hummel [12] in order to
account for the fine spectrum effects attributed to resonances of nuclides with intermediate atomic weights not accounted for in the earlier resonance theory. It was
intended specifically for the treatment of resonance structures of the structural and
metal coolant materials present in a fast reactor composition. The dominant resonances of these materials are characterized by relatively wide neutron widths not
sensitive to Doppler-broadening.
One conceptually simple means to deal with this issue is to divide the entire energy span of interest into 2,000 ufgs with equally spaced lethargy widths
u D 1=120, approximately equivalent to half of the maximum lethargy increment
per collision of an actinide. One way to account for fine structure effects on the broad
group cross sections is to use the ufg flux computed based on the predetermined ufg
cross sections as the tool to collapse the latter into the desired group structure. This
can be accomplished in two steps. First, compute the ufg cross sections according
to Eq. 5.92. Second, once the ufg cross section set for a given composition is generated, the ufg spectrum can be computed via the usual fundamental mode spectrum
calculations utilizing the consistent or inconsistent PN or BN approximation for
diffusion equations widely used in reactor applications. This algorithm was coded
in the MC2 -2 code [54].
Continuous Slowing-Down Approach
The continuous slowing-down approach is an alternative in which the ufg weighting spectrum is determined by solving the slowing-down density equation given
by Eq. 5.66. It is based on the rationale that the effects attributed to the relatively
254
R.N. Hwang
smooth-varying cross sections and those attributed to the sharp resonances can be
treated separately, a method particularly amenable in conjunction with the resonance
integral concept. If the slowing-down density defined by Eq. 5.66 can be determined
in the absence of sharp resonances, the corresponding local slowing-down density
with sharp resonances and thus the local flux can also be specified via the attenuation
of the former using the resonance escape probability in the same context defined by
Eq. 5.69. A comprehensive review of this subject was presented by Stacey [55]. For
our purposes here, it suffices to focus only on the conceptual basis of this approach.
One best known approximation for solving Eq. 5.66 for the case of relatively
slow-varying cross sections was pioneered by Goertzel and Greuling [56] using
the synthetic kernel approach. Their rationale can also be viewed as a natural
consequence of applying a low-order Taylor’s expansion to the quantity †s .u/ .u/,
the scattering component of the collision density [55]. The substitution of the expansion into Eq. 5.66 yields
!
1
X X
.1/n .i / d n
mn
Πsi .u/ .u/
q.u/ D
nŠ
dun
nD0
i
Zu
/
m.i
n D
.u u0 /n ki .u u0 /du0
(5.94)
u"i
/
where ki .u u0 / is defined in Eq. 5.66 and m.i
n is the n-th order moment for the i-th
nuclide. By retaining the first two terms in the expansion and the relation given by
Eq. 5.68, the resulting first-order differential equation of q.u/ can be represented in
.i /
terms of moderating parameters that depend on the low-order mn and local cross
sections. In absence of sharp resonances, such an equation can be solved readily.
Because of its importance in fast reactor applications, a great deal of improvement of the original version by Goertzel and Greuling [56] has since been added,
most notably by Stacey [55]. The improved version of Stacey [55] was based on
a somewhat different low-order Taylor’s expansion from that given by Eq. 5.94.
Instead of †s .u/ .u/, a similar expansion was made on †t .u/ .u/, the total collision density, which conceptually exhibits smoother behavior than the former. By so
doing, the same type of first-order differential equation as the former was derived except that all moderating parameters must be redefined. For fast reactor applications,
higher-order Legendre moments were also included in computing the moderating
parameters. In particular, this improved method has been incorporated into the
MC2 -2 code [54] as an option for computing the ufg fundamental mode spectrum
in the resolved energy range. For details, the reader is referred to [54] and [55].
5.4.4.3 Rigorous Treatment of Resonance Absorption via Numerical Means
There exist various degrees of inherent limitations in all resonance integral methods
described so far. In general, there are two limitations in common. First, the rigor in
the treatment of the slowing-down equation is lacking especially when resonances
5
Resonance Theory in Reactor Applications
255
of many nuclides are present. Second, the range of integration is generally taken
from minus infinity to infinity without due consideration of the finite group structures in conjunction with the multigroup applications downstream. This gives rise
to the so-called boundary effects. These limitations along with the need for accurate treatment of heterogeneous effects of reactor lattices to be addressed in the next
section provided a strong motivation for the development of more rigorous methods
particularly useful for benchmarking purposes.
For our purpose here, one rigorous numerical treatment of the resonance absorption in homogeneous media will be presented. The method pioneered by Kier [57]
and later improved by Olson [58], sometimes referred to as the “hyper-fine group”
method, is believed to be tailor-made for such a purpose.
For illustration purposes, first let us consider the simplest case involving only
one nuclide. The rationale is to divide a given scattering interval into many equally
spaced hyper-fine groups (hfg’s) with spacing u much smaller than the extent of
the resonances under consideration. This allows us to describe the flux defined by
Eq. 5.63 for any given group, say k , discretely if the elastic scattering process for
the hfgs within a span of "i is known. Thus, it suffices to isolate the neutronic balance for one scattering interval in the presence of a single nuclide for illustration
purposes.
Let the total number of hfgs within the span of " below the hfg in question be
L D "=u;
˛ D exp.Lu/
(5.95)
The neutronic balance in the hfg sense can be achieved via the use of the
“effective” scattering kernel according to Kier [57]. Physically, if the scattering
kernel K.u u0 / is viewed as the probability density function (p.d.f.), Kier’s “effective” scattering kernel [57] can be construed as the corresponding cumulative
density function (c.d.f.) as a function of the successive hfgs within " above the hfg
in question. If Pl denotes the c.d.f. for the l-th hfg above the initial group with lower
boundary u0 , one can express it as
Pl u D KQ l u;
D KQ LST u C KQ s u;
1 l L1
lDL
(5.96)
where the c.d.f. for l D L represents the probability of neutrons initiated at u0 reaching the hfg in question and must take into account in-group scattering of that hfg in
order to preserve neutron balance. These effective scattering kernels can be specified
in the form of double integrals,
1
KQ l u D
1˛
u0ZCu
.l1/
Z
u0
u
du
u0
ul0 u
0
e .uu / du0 D KQ 1 u e .li/u
(5.97)
256
R.N. Hwang
KQ LST u D
1
1˛
u0ZCu
u0 .L1/u
Z
u0
KQ s u D
1
1˛
0
e .uu / du0 D KQ L u ˛ KQ s u (5.98)
du
uLu
uLZCu
Zu
du
uL
1
.u 1 C e u /
1˛
0
e .uu / du0 D
(5.99)
u0
Here, KQ s u is the probability the scattering is in-group. It can be readily shown that
the sum of Pl over all l is equal to unity as expected.
If one denotes the flux per unit lethargy for a given group k by k and the corresponding collision density per unit lethargy by Sk , the corresponding slowing-down
equation can be discretized as follows.
Sk D
L
X
!
Q
Kl .†s /kl ˛Ps .†s /kL C .†s /k KQ s D Sk.in/ C Sk.s/ (5.100)
lD1
where Sk.in/ and Sk.s/ physically signify the in-scattering source from other hfgs and
the self-scattering source of group k, respectively. One potential issue is the evaluation of the in-scattering source when an exceedingly large number of hfgs are
considered. Kier [57] showed that the computational efficiency could be significantly enhanced via the use of the following recurrence relation derivable from the
properties of KQ l given by Eq. 5.97, i.e.,
.in/
Sk.in/ D Sk1
.KQ 1 KQ s e u /˛.†s /k1L
CKQ 1 .†s /k1 ˛ KQ s .†s /kL
(5.101)
Hence, for a scattering interval, one only needs to deal with four terms coming from
the lethargy groups below k (or energy groups above k). For practical applications
in conjunction with the multigroup approach, the calculation usually begins in the
energy region slightly above the resolved energy region where the cross sections are
taken to be constant.
.CR/k D
Sk.in/
1 .†sk =†tk /KQ s
;
k
D
.CR/k
†tk
(5.102)
Once Sk.in/ is known, the corresponding collision rate, .CR/k , and flux, k , for the
given k-th hfg can also be defined. The procedure described above can be repeated
until the hfg fluxes and the corresponding reaction rates are determined. The effective group cross section for a lethargy group, u1 u2 , consisting of N hyper-fine
groups is simply,
!, N
!
N
X
X
(5.103)
Q x D
xk k
k
nD1
nD1
5
Resonance Theory in Reactor Applications
257
The same procedure can be readily generalized for the practical case involving
many nuclides. This approach is, in principle, rigorous so long as the slowing-down
equation is valid. These basic algorithms have been incorporated into the RABBLEcode [57] and MC2 -2 code [54].
5.5 Resonance Absorption in Heterogeneous Media
The treatment of resonance absorption in heterogeneous media obviously presents
a greater challenge than that in homogeneous media. Unlike the latter, the energy
and space dependence become intertwined. It is clearly unrealistic if the detailed
distribution of neutron flux throughout the entire reactor must be specified simultaneously at any arbitrary energy and temperature. Hence, the general starting point
for all existing deterministic approaches is to focus on the examination of the resonance absorption within a localized unit cell. A realistic reactor can be generally
viewed as an ensemble of lattices consisting of unit cells. These unit cells, in principle, can have either identical or different composition depending on the design
under consideration.
5.5.1 Traditional Collision Probability Methods
for a Two-Region Cell
One convenient way of treating resonance absorption in a reactor lattice is via the
use of the collision probability method. The rationale of using such a method can
be best illustrated via the two-region cell embedded in an infinite reactor lattice
considered by Chernick [4, 5]. The cell consists of a fuel region and a large moderator region designated by 0 and 1, respectively and no inter-cell interactions are
assumed. If F0 .u/ and F1 .u/ denote the collision density in these regions, respectively, the neutron conservation of this cell can be specified by the coupled integral
equations given below,
V0 F0 .u/ D .1 P0 /S0 C P1 S1
(5.104)
V1 F1 .u/ D .1 P1 /S1 C P0 S0
(5.105)
where the respective volumes and escape probabilities are denoted by V0 , V1 , P0 ,
and P1 , respectively. Alternatively, .1 Pn / is generally referred to as the collision
probability for the n-th region. The quantity Sn represents the scattering source in
the specific region and is given by
Sn D
I.n/
X
Zu
i D1u"
i
.n/
Ki .u u0 /
†si
†.n/
t .u/
Vn Fn .u/d u0 ;
n 2 0 or 1
(5.106)
where I.n/ is the total number of nuclides in the nth region under consideration.
258
R.N. Hwang
Thus, the coupled equations amount to the combination of two slowing-down
equations previously defined for homogeneous media connected together via the
respective escape (or collision) probabilities. To specify the collision density as a
function of u requires the explicit description of Pn as a function of lethargy (or
energy) and the means to solve the coupled integral equations. In earlier days, the
direct numerical solution to these equations, in conjunction with resonance integral calculations, was obviously prohibitive and the task required a great deal of
simplifications.
One general identity which is crucial to lattice physics studies is the widely used
reciprocity relation. For a two-region cell, it is simply,
.0/
.1/
P0 †t .u/ V0 D P1 †t .u/ V1
(5.107)
so that P1 can be computed once P0 is known. Another simplification is made possible by the fact that region 1 is usually assigned to the moderator with constant cross
sections. Hence, it is reasonable to assume the NR-approximation in that region.
By taking F1 .u/ D †1 .u/ and utilizing the reciprocity relation, the coupled integral
equations can be reduced to a familiar form similar to that for homogeneous media
described in the previous section,
F0 .u/ D .1 P0 /
fuel
X
i
1
1 ˛i
Zu
u"
0
e .uu /
0
†.0/
si .u /
0
†.0/
t .u /
F0 .u0 /d u0 CP0 †t .u/ (5.108)
.0/
The simplified equation above is generally considered as the starting point for various approximations of the two-region cell configurations to follow. The key issue
here is how to define the escape probability P0 as a function of †.0/
t .u/ so that the
traditional approximations for homogeneous media can be applied consistently.
5.5.1.1 General Features of Collision Probability of Practical Interest
For the sake of completeness, the transport theory origin of the escape probability
method [59] will be briefly addressed. With no loss of generality, let us start with the
generic representation of the escape probability for an arbitrary region with volume
V and surface area S from which all subsequent simplifications are based,
0
1
Z
Z
h
i
1 @ 1
E nE 0 / 1 exp.†t jErs rEs 0 j/d E AD 1T
dErs 0
.
Pesc D
N
S
†t l
†t lN
S
(5.109)
E
where lN D 4V =S is commonly referred to as the average of the chord length, 0
is the unit vector along the direction of motion of the neutron, and nE is the unit
vector normal to the surface S . The quantity inside the parentheses is denoted by
5
Resonance Theory in Reactor Applications
259
1 T , where T is known as the transmission probability signifying the fraction of
neutrons transmitted through the region in question without suffering any collision.
Alternatively, T also represents the neutron current of unit strength at the surface
that passes through the region without collision. The corresponding collision probability is simply,
(5.110)
Pc D 1 Pesc
Let l D jErs rEs 0 j be the chord length along the neutron path, a straight line connecting the intersections of the two vectors rEs and rEs 0 with the surface. For practical
configurations of convex geometries, the outer integral over the surface in Eq. 5.109
becomes independent of the inner integral so that,
Pesc D
S
4V †t .u/
Z
E nE / Œ1 exp.†t .u/l.d; //
d .
(5.111)
where l.d; /, is a function of “diameter” (or thickness where appropriate) and the
direction of flight .
For the sake of clarity, the general attributes of geometry-related quantities in
the integrand of Eq. 5.111 are given in Table 5.2 for the three most commonly used
configurations in reactor physics applications. These are the fundamental geometric
relationships in terms of ', the angle between the projection of l with respect to the
x-axis, and , the azimuthal angle, used in defining their respective escape and/or
collision probabilities.
Let us begin with a simple scenario of an isolated unit cell in the sense that
all neutrons escaping from the fuel lump will suffer their next collision in the surrounding moderator. For the isolated fuel configurations of practical interest, the
escape probabilities are analytically derivable as illustrated in [59]. Figure 5.1 ilN a quantity commonly referred to
lustrates their behavior as a function of †t .u/l,
as the “optical” thickness, for slab, cylinder, and sphere unit cells. In spite of their
differences in shape, the escape probabilities very much follow the same pattern.
N
In particular, they share
ı the same limiting values, i.e., for †t l ! 0, Pesc D 1 and
†t lN ! 1, Pesc D 1 †t lN known as the “white” and “black” limits, respectively.
These limiting properties lead immediately to two well-known approximations, referred to as the mean-chord approximation and Wigner’s rational approximation
[2, 3]. The former is defined as
h
i
Pesc D 1 exp.†t .u/l/
.†t .u/l/
Table 5.2 Geometry-dependent quantities
computing collision probabilities
E n
E
Configurations
.;
E/
d
Infinite slab, d D t
cos '
sin ' d' d
Infinite rod, diam. d sin ' cos ' sin ' d' d
Sphere, diameter ds cos '
sin ' d' d
required
(5.112)
for
E
l.d; /
t =sin '
d cos =sin '
ds cos '
260
R.N. Hwang
THREE CONFIGURATIONS & WIGNER APPROX.
ESCAPE PROBABILTY
1
cylinder
slab
sphere
rational
0.8
0.6
0.4
0.2
0
2
0
4
6
8
OPTICAL THICKNESS
10
12
Fig. 5.1 Analytically based results and Wigner’s rational approximation
while the latter is defined as
Pesc D 1
1 C †t .u/ l
(5.113)
It is quite apparent that these functions exhibit the same limiting properties as those
of the rigorous representations. The discrepancies of the rational approximation, for
instance, are usually no more than a few percent for the isolated lumps considered as
illustrated also in Fig. 5.1. These approximations made possible much of the earlier
work on lattice physics.
5.5.1.2 Various Earlier Methods Based on Approximate
Escape Probabilities for Isolated Fuel Lumps
The similarity between the simplified version of the slowing-down equation defined
by Eq. 5.108 and that of the homogeneous media previously given makes possible
the application of the comparable approximations for reactor physics calculations.
For our purpose here, the most widely used NR and WR approximations will be
presented for illustration purposes.
NR-Approximation
.0/
Let F0 .u/ D †t .u/ 0 .u/ be the collision density for the fuel region. Under the
NR-approximation as before, Eq. 5.108 for a single resonance absorber mixed with
diluent becomes
†.0/
t .u/
0 .u/
.0/
D .1 P0 /†.0/
p C P0 †t .u/
(5.114)
5
Resonance Theory in Reactor Applications
261
Thus, the corresponding resonance integral can be expressed in terms of two
integrals
Z1
RI D
†.0/
p
0
a .u/
†.0/
t .u/
Z1
du C
0
P0 a.0/ .u/†.0/
Rt .u/
†.0/
t .u/
du D Iv C Is
(5.115)
.0/
where †.0/
p and †Rt .u/ are the total macroscopic potential scattering cross section
and total macroscopic resonance cross section, respectively. The quantities Iv and Is
are referred to as the “volume” integral and “surface” integral respectively according
to Wigner et al. [3]. The former is identical to that for homogeneous media described
in the previous section while the latter involves the escape probability characteristic
of the heterogeneous nature of the cell. The general form of the surface integral
becomes
+
*Z1
h
i
a.0/ .u/†.0/
.u/
S0
.0/
Rt
1 exp.†t .u/l/ du
Is D
(5.116)
4V0
†.0/
.u/
t
0
where the angular bracket denotes the integration over the geometric configuration
in the context of Eq. 5.111. From the perspective of reactor applications, this approach is not widely used.
In contrast, the simplest possible means of computing the resonance integral
for the fuel lump is via the direct use of Wigner’s rational approximation. The
substitution of Eq. 5.113 into 5.108, leads to the well-known “equivalent relation”
whereby the flux exhibits the same functional form as that for homogeneous media,
0
†.0/
p C †e
.u/ D ;
.0/
†p C †e C †.0/
t
†e D
1
l
(5.117)
where †e is referred to as the “equivalent” cross section. It follows that
.RI /k D
.eq/
where p
p.eq/ a
J.ˇk0 ; k ; ak /
cos 2 l E0
(5.118)
i.
h
.0/
D †p C †e N0 is commonly referred to as the “equivalent” po-
tential scattering cross section and ˇk0 D p.eq/ =.0k cos 2 l /. Hence, Eq. 5.118 is
identical to the corresponding resonance integral for homogeneous media defined
by Eq. 5.81 if p.eq/ is replaced by p . It was the early version of this equivalence
relation which was later generalized to the closely packed lattice as one shall see.
This approximation was used extensively not only in reactor physics calculations but was also used as a tool to analyze resonance integral measurements. If one
neglects the temperature dependence and the asymmetric component of the resonance line shapes, Eq. 5.118 for a low energy resonance immediately reduces to the
following form,
a 0k 0 ı
1=2
ˇ .1 C ˇ 0 /
(5.119)
.RI/k D
2 E0
262
R.N. Hwang
In the limit of a strong resonance, the above expression can be approximated by
1=2
S0 1=2
S0
.RI/k const †.0/
C
/
A
C
B
p
4V0
M
(5.120)
which was used as the means to estimate the approximate behavior of the resonance
integral as a function of the surface to mass ratio in analyzing experimental results.
Similar results were also obtained by Gourevich and Pomeranchouk [7] based on
Eq. 5.116.
NRIM (or WR) Approximation
In contrast to the NR-approximation, Eq. 5.108 under the NRIM-approximation
in the same context described for homogeneous media must assume the following form:
i
h
.0/
.0/
†t 0 .u/ D .1 P0 / †.0/
.u/
†
0 .u/
s
m
h
i
.0/
C †.0/
(5.121)
CP0 †.0/
t .u/ †m
m
Again, like the case of the NR-approximation, one practical method to circumvent
the task of computing the escape probability rigorously is the use of the rational
approximation. Upon substituting Eq. 5.121 into 5.108, one obtains the pertinent
equivalence relation as before, i.e.,
0 .u/
N
†.0/
m C 1=l
D
.0/
.0/
†m C 1=lN C †a .u/
(5.122)
Thus, the corresponding resonance integral for a Breit–Wigner resonance becomes
.eq/
.RI/k D
p t
J.ˇk0 ; k /;
E0
.0/
p D
†m C †e
N0
(5.123)
.
where ˇk0 D p.eq/ .0k ak =tk / and the equivalent cross section is the same as
that given in Eq. 5.117.
It is quite clear that the same rationale is equally applicable to the IR approximation described in the previous section.
5.5.2 Traditional Collision Probability Treatment
in a Closely Packed Lattice
The discussions so far have been focused on the idealistic case in which the fuel
lumps are considered as isolated from one another, separated by large moderator
5
Resonance Theory in Reactor Applications
263
regions. The basic principle was later extended to the more realistic situations
involving closely packed reactor lattices.
5.5.2.1 General Features of the Escape Probability
In the presence of many closely packed unit cells, neutrons escaping from the fuel
lump in a given cell may not necessarily suffer their next collision in the surrounding moderator. This issue was first addressed by Dancoff and Ginsburg [60]. The
“shadowing” effect attributed to neighboring fuel lumps on the escape probability
of the fuel lump in question has henceforth been referred to as the Dancoff effect.
The most comprehensive discussions on this subject are believed to be those
given by Rothenstein [61], Nordheim [52], and Lukyanov [10]. The physical attributes of this phenomenon can be best illustrated by following a straight-line
trajectory of the neutron path through the fuel lumps and the moderator regions. Let
subscript 0 denote the fuel lump in question and 2; 4; : : :; 2N denote all neighboring fuel lumps, while 1; 3; 5; : : :; 2N C 1 denote the moderator regions sandwiched
between the fuel lumps. If ln is the chord length of the path in the n-th region, and
the local transmission probability between two surface points is denoted by n , then
for the isolated fuel lumps previously described, one has
.iso/
P0
D
1
†.0/
t l
.1 h0 i/
(5.124)
To account for the Dancoff effect, one must take into account the probability that
neutrons survive without suffering any collision along the trajectory through the
successive regions. Let P0 be the improved escape probability for the lump 0 in
such a lattice. It is not difficult to see that P0 is expressible symbolically in the
general form as follows:
P0 D
1
.0/
†t l
h.1 0 / Œ.1 1 / C 1 2 .1 3 / C i
(5.125)
Similarly, the escape probability for the surrounding moderator, P1 , can be obtained
by interchanging the indices in the equation above.
However, the escape probability in its general form is obviously too complicated
for practical applications when resonance structures in cross sections are involved.
One commonly used simplified assumption is that all unit cells are taken to be identical, a configuration referred to as an infinite lattice with repeated cells. Under such
an assumption, it follows that
h2n i D h0 i D T0 ;
h2nC1 i D h1 i D T1 ;
n D 1; 2; 3; : : : ; N
(5.126)
The repeated cell assumption leads to four approximations most frequently used
in applications. In particular, the first two of these provide the theoretical basis for
extension of Wigner’s rational approximation to accommodate the Dancoff effect in
closely packed lattices.
264
R.N. Hwang
“Black” Limit Approximation
If the total cross section of the absorber is assumed to be large, all higher-order
terms in the square bracket of Eq. 5.125 diminish. In addition, if the average of the
product is replaced by the product of the averages, one obtains,
P0 1
†.0/
t l
.1 h0 i/.1 h1 i/ D P0.iso/ .1 C /
(5.127)
where C is referred to as the Dancoff correction factor depending on the cross
section and geometric configuration of the moderator alone. Thus, the rationalfunction-based approaches derived for isolated lumps are equally applicable to
closely packed lattices if the Dancoff factor is known.
Nordheim’s Approximation
The fact that the absorbers may not always be considered as “black” provides the
need for further improvement. Since the linear terms in Eq. 5.125 are generally most
dominant, all higher-order products in the ’s can be adequately approximated by
the respective products of the averages as pointed out by Nordheim [52]. Thus,
upon substitution of the averages and noting the geometric nature of the series for
the repeated cells, Eq. 5.125 can be expressed in the closed form
P0 D
.1 h0 i/.1 h1 i/
.1 C /P0.iso/
D
.0/
.0/
1 h0 i h1 i
†t l
1 .1 l†t P0 /C
1
(5.128)
Infinite Slabs
The case of closely packed repeated cells consisting of infinite slabs is probably
one of the few cases where the escape probability can be represented analytically
without simplifying assumptions. It is quite obvious from Eqs. 5.111, 5.125 and
Table 5.1 that the escape and transmission probabilities for this type of configuration are generally expressible as a single integral. If the absorber and moderator
with the thicknesses t0 and t1 , respectively, are either in a periodic (or reflective)
arrangement, Eq. 5.125 can be reduced to a single integral of manageable form as
pointed out by Corngold [62] and Rothenstein [61], namely,
P0
D
1
2†.0/
t t0
12
1
X
E3 ..n C 1/
0
nD0
2E3 ..n C 1/.
C n 1 / C E3 .n
0
C .n C 1/ 1 /
0C
1 //
(5.129)
5
Resonance Theory in Reactor Applications
265
.1/
where 0 D †.0/
t t0 and 1 D †t t1 are the optical thickness of the fuel and of the
moderator, respectively. E3 .x/ is the exponential integral of order 3.
It is quite obvious that the escape probability of an isolated fuel plate defined by
.0/
Eq. 5.124 corresponds to n D 0 with transmission probability h0 i D 2E3 .†t t0 /.
On the other hand, under the black limit assumption, the Dancoff factor in Eq. 5.127
is simply C D 2E3 .†.1/
t t1 /.
Repeated Cells with Cylindrical Configuration
The infinite lattice in this case can be pictured as an ensemble of unit cells each
consisting of a fuel rod and a cylindricized moderator region. In contrast to the
infinite slabs, one general complication here is that the resulting integral generally
involves a double integral according to the basic geometric properties of the cylinder
given in Table 5.1. As discussed by Takahashi [63] and also by Rothenstein [61], it
is possible to derive the same type of expression as that for infinite slabs except that
it involves a series consisting of double integrals. In particular, the inner integrals
are identifiable with the Bickley–Nayler function [64] of order 3.
From the perspective of practical applications, especially fast reactor applications
involving an extremely large number of resonances over a large energy span, the
method derived on this basis is far more difficult to use for routine calculations than
that for the infinite slabs.
Other practical alternatives will be discussed later.
5.5.2.2 Fine-Tuning of the Rational Approximation for Routine Applications
As discussed earlier, the dramatic simplification of the lattice physics calculations
for the case of an isolated cell was made possible via Wigner’s rational approximation, leading to the equivalent relation whereby the lattice physics calculations can
be treated in the same context as those for homogeneous media. The extension to
closely packed lattices is obviously possible via either the black limit approximation
or Nordheim’s approximation discussed in the previous subsection. By substituting
Eq. 5.113 into 5.128, one obtains,
P0
ı
1 lN .1 C /
D .0/
D .0/
ı
†t .u/ C ŒS0 =.4V0 / .1 C /
†t .u/ C 1 lN .1 C /
ŒS0 =.4V0 / .1 C /
(5.130)
It is interesting to note that, physically, the Dancoff effect is equivalent to the reduction of the surface to volume ratio. As long as the moderator cross sections are taken
to be constant, or C is independent of energy, the equivalence relation is clearly applicable as well.
From the NR-approximation consideration, Eq. 5.130 leads to an equivalent cross
section of the following form,
266
R.N. Hwang
†e D
1
lN
a.1 C /
1 C .a 1/C
(5.131)
where a, referred to as the Bell–Levine factor [65, 66], provides more fine-tuning
of the rational approximation. Typically, a is set to 1.33 and 1.08 for cylindrical and slab geometries respectively. The same argument can be extended to the
NRIM-approximation defined by Eq. 5.123. The second issue is how to compute the
Dancoff factor simply at run time. One obvious choice is to apply the same rational
approximation proposed by Bell [65],
.1/
B D 1 C D
l 1 †t
1 C l 1 †.1/
t
(5.132)
N
where the moderator cross section, †.1/
t , is taken to be constant and l1 here is the
average chord length for the moderator region. Hummel et al. [67] have found that
Eq. 5.132 can be significantly enhanced by the following modification:
D 1 C D B C B4 .1 B /
(5.133)
The approximation above was found to be adequate and has been incorporated into
the MC2 -2 code [54] for routine applications.
5.5.3 Connections to Resonance Integral and Multigroup
Cross-section Calculations
For closely packed repeated cells, there are two types of resonance integral-based
methods widely used in reactor applications, which are summarized below.
5.5.3.1 Rational Approximation and Approximate Flux Based Approaches
As discussed in the previous section, the direct equivalence between the treatment
of the resonance integral in heterogeneous media and that in homogeneous media
can be established to various degrees of sophistication when applied in conjunction
with traditional approximations to the slowing-down equation. Like the previously
described case of isolated fuel lumps, one only needs to redefine the “equivalent”
cross section †e according to Eq. 5.131. For the NR-approximation, it is readily amenable to the J -integral approach discussed in Section 5.4 from which the
multigroup cross sections can be obtained via Eq. 5.92. The viability of this type of
approach has been demonstrated by the MC2 -2 code. The same logic applies if one
chooses to use the NRIM or IR approximation.
5
Resonance Theory in Reactor Applications
267
The equivalence relation also makes possible the use of the Bondarenko scheme
widely used for routine reactor calculations. For the case in point, the precomputed self-shielding factors at a given temperature as a function of p.eq/ defined
by Eq. 5.131 are generated for various preselected groups and stored prior to their
deployment at run time.
5.5.3.2 Nordheim’s Method
Like the case of homogeneous media, Nordheim’s method using a numerical
method for solving the slowing-down equation and the subsequent calculation of
the resonance integral described in Section 5.4.3.2 is also applicable here. The
only difference is that the slowing-down equation is given by Eq. 5.108 instead
of Eq. 5.78. Like Eq. 5.78, 5.108 can also be further simplified by assuming the
NR-approximation for the diluents in the fuel lump in the same context as the former [52]. For fuel lumps with a single resonance absorber, Eq. 5.108 becomes
1
Zu
.r/ 0
†
.u
/
1
0
s
e .uu / .r/
F0 .u0 /du0 A
F0 .u/ D .1 P0 / @
1 ˛r
†t .u0 /
u"r
.r/
C P0 †t .u/ C †m
0
(5.134)
where the indices r and m denote the resonance absorber and the diluent in the fuel
lump respectively. The above equation would become identical to Eq. 5.78 if one
sets the escape probability to zero. Thus, by using the escape probability defined by
Eq. 5.128, the same numerical algorithm described in Section 5.4.3.2 for computing
the resonance integral can be deployed.
It is important to note that the original numerical scheme was developed primarily for computing a well-isolated Breit–Wigner resonance for thermal reactor
applications. The question will arise when the mutual self-shielding effects resulting
from the presence of other resonances in the vicinity of the one under consideration,
a scenario quite common in fast reactor physics calculations. Therefore, more rigorous methods were subsequently developed in order to cope with the overlapping
of resonances in the slowing-down process.
5.5.4 Rigorous Treatment of Resonance Absorption in a Unit
Cell with Multiple Regions and Many Resonance Isotopes
Motivated by needs in conjunction with fast reactor development, rigorous methods
for treating resonance absorption were developed for various types of calculations
where accuracy is required. The case in point can be viewed as a natural extension
of the rigorous treatment of the slowing-down equation in homogeneous media if
268
R.N. Hwang
the same hfg approach is used as the means to discretize the lethargy domain. For
a unit cell with multiple regions and many resonance isotopes, generally it would
require the numerical solution of a system of N slowing equations analogous to the
coupled equations for the two-region cell defined by Eqs. 5.104 and 5.105, i.e.,
Vi Fi .u/ D
N
X
Pij .u/Vj Sj .u/
(5.135)
j D1
where Pij .u/ is the collision probability for neutrons originating in region j that
will make their next collision in region i and Sj .u/ is the scattering source from
region j as defined in Eq. 5.106.
In principle, the numerical solution of these slowing-down equations can be
carried out in the same context of the hyper-fine group approach described in
Section 5.4.4.3. The analogy to Eq. 5.102 can be readily established via the use of
matrix notation. For a given hfg k, the collision rates and scattering sources in various regions of the cell can be represented as vectors. In the same context as Eq. 5.102
for the homogeneous case, the collision rate can be expressed as
.in/
E
E D .P1 R/1 S
E
E .in/
E
(5.136)
CR D P S C R CR ; CR
where R is a diagonal matrix with elements Ri i D .†sk =†t k /i .KQ s /i for a given
region i and the remaining quantities appearing in Eq. 5.136 are consistent with
Eq. 5.102 except that they are written in vector form. Thus, the collision rate for a
given region can be specified once the collision matrix is known and the corresponding neutron flux is simply,
CRi .uk /
(5.137)
i .uk / D
Πt .uk / i
Given the hfg flux in every subregion of the unit cell, the desired broad group cellaveraged cross section for reaction process x required for practical applications
becomes,
Q xG
D
X
N X
i D1 k2G
x .uk / i .uk /
, X
N X
.u
/
;
i k
x2i
(5.138)
i D1 k2G
The practical issue here is how to compute the collision probabilities for cells involving many regions and resonance isotopes at a large number of mesh points (or
hfgs). For our purpose here, two proven methods will be outlined.
5.5.4.1 Kier’s Method for Cylindrical Unit Cells
Consider a typical unit cell in the form of an infinite cylinder consisting of a fuel rod
with cladding surrounded by a cylindricized moderator region [57]. Let us subdivide
5
Resonance Theory in Reactor Applications
269
the cell into N intervals all of which are in the form of annuli except for the interval
at the center. To compute the detailed neutron flux at a given hfg and spatial interval
requires an explicit or implicit description of the collision probability. This can be
accomplished in steps as follows.
Intra-Cell Neutronic Balance
One way to simplify the complexity of computing the collision probabilities for the
multiregion cell is to cast the intra-cell neutronic balance into a somewhat different
perspective. In Kier’s approach [57], the interface current can be described in the
form of a matrix equation,
E
TJE D PS
(5.139)
where the elements of T and JE denote the transmission probabilities and currents
at the surfaces of each region, respectively, while P denotes the escape probabilE is the neutron source
ities from a given region to its immediate neighbors and S
in the region considered. Within the context of the hyper-fine group approach described in Section 5.4.4.3, this matrix equation must be solved at each hyper-fine
group level with the source in each region computed the same way as before starting at the hfgs beyond the resonance region. Physically, one only needs to deal with
the transmission and escape probabilities of the region under consideration to the
immediately adjacent neighboring regions while the effects from the remaining regions are implicitly accounted for via boundary conditions. If one further assumes
isotropic return of the currents at the boundaries, the corresponding T and P for
annuli can be specified readily.
To determine the interface currents based on the above equation requires the solution of a tri-diagonal matrix equation of order 2N 2N reflecting the tri-diagonal
nature of the matrix equation given by Eq. 5.139 where two currents per region for
such a region are required. Thus, Eq. 5.139 can be expressed in terms of 2N linear
equations as specified in [54, 57].
As proposed by Kier [57], this matrix equation can be solved efficiently via the
use of the usual Gauss elimination procedure followed by backward substitution
once the pertinent transmission probabilities and escape probabilities are known.
Computation of the Probabilities of a Given Annulus
Consider the i-th annulus bounded by circles with radii ri 1 and ri respectively.
For an annulus, three first-flight transmission probabilities and two escape probabilities are required to specify the neutronic balance defined by Eq. 5.139. The former
can be denoted symbolically by TiOI ; TiIO and TiOO signifying the transmission
of neutrons from inner-to-outer, outer-to-inner and outer-to-outer surfaces of region i , respectively. The latter can be denotedby PiC and Pi signifying the fraction
270
R.N. Hwang
of neutrons escaping the outer and inner surfaces of region i , respectively. These
probabilities can be defined in manageable forms if the cosine current assumption
is used, i.e.,
Z Z
2
x
4
4 ˛i
TiOO D
cos Ki3 .bi cos /d D
p
Ki3 .bi x/dx (5.140)
sin1 ai
0
1 x2
Z
q
p
z
4 1
OI
2 2
2
; TiIO Dai TiOI (5.141)
Ti D
dxKi3
1 ai x ai 1 x
0
1 ai
where ai D .ri 1 =ri /; ˛i D .1 ai2 /1=2 ; bi D 2†t i ri and z D bi .1 ai /=2. Here,
Ki3 .x/ is the well-known Bickley–Nayler function [64] of order 3. The corresponding outward and inward escape probabilities for the i-th annulus are respectively,
PiC D
1 1 1 TiOO TiIO ; Pi D
1TiOI
†t i l
†t i l
(5.142)
where lN D 2.ri2 ri21 /=ri 1 . Hence, all probabilities are specified once TiOO and
TiOI are known. For practical calculations, these transmission probabilities can be
either precomputed and stored in two-dimensional tables or alternatively, computed
at run time via a low-order quadrature described in [68].
Determination of Collision Probabilities
From neutron-conservation considerations, the collision rate for a given region can
be specified symbolically as
.CR/i D 1 TiOO Ji C 1 PiC Si ; i D 1
OO
D 1 TiOI JiC
TiIO Ji C 1 PiC Pi Si ;
1 C 1 Ti
i D 2; N (5.143)
where i D 1 denotes the inner-most region with the configuration of a circle. In
the original development of Kier [57], the corresponding flux for region i and
hfg k is simply, i .uk / D .CR/i =†t i .uk /. To put the results in the same context
of Eq. 5.136, the traditional collision probabilities can be readily deduced from
Eq. 5.143. Physically, it is not difficult to rationalize that the collision rate for region
i is identifiable with Pij if one sets Si D 1 and Sj D 0 in the equation above, i.e.,
Pij D .CR/j ;
Si D 1; Sj D 0 if j ¤ i
(5.144)
Thus, the hfg flux for all subregions can be determined via Eq. 5.137, from which
the cell-averaged broad group cross section defined by Eq. 5.138 can be specified.
This approach has been incorporated into the RABBLE-code [57] as well as the
MC2 -2 code [54].
5
Resonance Theory in Reactor Applications
271
5.5.4.2 Olson’s Method for Unit Cell with Many Plates
During the period of fast reactor development, a great deal of emphasis was placed
on the analysis of measurements from various fast critical assemblies made of drawers with a large number of plates. A much more rigorous alternative for such an
approach was developed by Olson [58] without resorting to the assumptions of the
cosine or cosine-square for the interface current made by Kier [57], conditions plausible for cylindrical cells. The rationale was that the neutron transport in the infinite
lattice consisting of repeated cells with infinite slabs can be specified directly by the
integral transport equation without resorting to the use of Eq. 5.139 as was evident
from the earlier work of Corngold [62] and Rothenstein [61].
In the context of the hfg approach described previously, the scattering source
for a given hfg k and i-th region, Si.k/ encompasses the sum of scatter-in and
self-scattering terms described in Section 5.4.4.3. In the approach proposed
by Olson [58], the source is allowed to vary linearly within each plate and is
then used in conjunction with the integral transport approach to compute the intraand inter-cell currents. The resulting current at mean free path ; D †t i x, beyond
the plate with optical thickness i can be expressed as
E ; i/ D
J.
Si.k/
ŒE3 . / E3 . C
2 i
i/
C f. ; i/
(5.145)
where the first term is identifiable with the traditional result using constant source
and the second term amounts to a correction term taking into account the linear variation of the source. f . ; i / involves a linear combination of exponential integrals
of both orders 3 and 4 [58]. If a plate j is at mean free path away from the plate i ,
the collision rate in j due to the source in i is
E
E C
E ; i / J.
CR.i
! j / D J.
j; i/
(5.146)
For the infinite repeated cells with periodic plate arrangement, the contribution of
all plates i to plate j become
E 1 .i ! j / D
CR
1 h
X
E C
E C mh; i / J.
J.
j
C mh; i /
i
(5.147)
mD0
where h is the optical thickness of the unit cell. A similar expression can also be
derived for cells with reflective boundary condition. The infinite sum involving exponential integrals of order n 3 can be evaluated readily via the use of the Gauss
quadrature developed for this purpose [58].
Thus, the collision probability Pij for a unit cell is expressible as
h
i
E 1 .j ! i /
E 1 .i ! j / C ŒCR
Pij D CR
Si
(5.148)
272
R.N. Hwang
By substituting Eq. 5.148 into 5.136 followed by inverting the resulting matrix,
the flux in a given plate for the hfg k can be determined. The substitution of the
flux so obtained into Eq. 5.138 yields the desired broad group cell-averaged cross
section for applications. The approach described here was first coded in the RABIDcode [58] and later incorporated into the current MC2 -2 code [54] as a valuable
option.
5.6 Treatment of Unresolved Resonances
The treatment of resonance self-shielding effect in the unresolved energy range
constitutes one of the most important topics in resonance theory especially in fast
reactor applications. The range under consideration is usually defined as the energy
span between the upper boundary of the resolved energy region and an upper bound
beyond which the self-shielding effect becomes unimportant. For major actinides,
it spans the range from the low keV region up to about 100 keV. This is the energy
region where the Doppler-width becomes much larger than the resolution width of
the instrument so that resonances may not be parameterized deterministically. The
methods for treating resonance absorption in this range can be viewed as a natural
extension of the statistical theories of average cross sections such as those described
by Moldauer [69] and Ericson [70]. Since the theoretical foundations may often be
obscured in routine applications, it is useful to summarize briefly some conceptual
aspects of the problem prior to the discussions of the basis for the computational
methods. For our purpose here, the discussions will be focused on current methods
based on the single-level Breit–Wigner approximation although the same concept is
extendable to the more rigorous S -matrix approach if its parameters are expressed
in terms of the R-matrix parameters.
5.6.1 Statistical Theory Basis
The statistical basis for evaluating the average cross section hx iE is assumed to be
extendable to the estimation of the expectation values of the reaction rate hx iE;Er
and the flux h iE;Er as well. Without loss of generality, each microscopic cross
section is represented by the R-matrix formalism defined previously in terms of
parameters E and c . From the statistical theory of spectra [71], the distributions of these parameters are well-known. These distributions, in effect, define the
joint density function (p.d.f.) required for evaluating the averages attributed to an
ensemble of resonances
˝ 2in˛ the vicinity of an energy, say E . Given information
of hjEi Ei C1 ji and ci and through the explicit knowledge of the behavior of
x and as function of energy, the expectation values of interest are, in principle,
completely specified.
5
Resonance Theory in Reactor Applications
273
5.6.1.1 Some Statistical Theory Fundamentals
A brief outline of some pertinent statistical theory fundamentals is believed to be a
good starting point for the case in point.
Basic Rule
The probability distribution of an event A characterized by statistically independent
variables .x1 ; x2 ; x3 ; : : : ; xi / 2 A is defined as
Z Z
P .A/ D
:::
Z Y
pi .xi /dx1 dx2 : : : dxi
(5.149)
i
where pi .xi / is referred to as the probability density function (p.d.f.) for xi and the
products of all p.d.f. is known as the joint density function for these variables. For
the case under consideration here, these quantities are identifiable with the partial
width distributions and the level correlation functions. The direct use of the joint
density function provides the most widely used basis for computing the averages of
interest to be described.
Addition Theorem
The probability distribution function for the union of two events is given by,
P .A [ B/ D P .A/ C P .B/ P .AB/
(5.150)
Multiplication Theorem
The probability distribution of the product of two correlated events is given by,
P .AB/ D P .A/P .BjA/ D P .B/P .AjB/
(5.151)
where P .BjA/ is known as the conditional probability of B for a given occurrence
of the event A. The above relation must be symmetric when A and B are interchanged.
This theorem is of a great deal of practical interest since it provides another basis
for computing the averages of interest known as the “probability table” method to
be described shortly.
274
R.N. Hwang
5.6.1.2 Statistical Distributions of Practical Interest
There are three types of distributions for the resonance parameters of practical
interest. They will be summarized as follows.
Partial Width Distributions
It is well-known that the reduced width amplitude c for a given level and
channel c follows the normal distribution with zero means and variance of unity
according to Porter and Thomas [72]. For a given reaction process, there may exist
several channels and, consequently, the evaluation of the average quantities can require the task of evaluating many multiple integrals with many variables. For the
Breit–Wigner approximation, the partial width of a given reaction type is actually
used. For a given reaction process x, it is defined as
X
2
2Pc c
(5.152)
x D
c2x
where Pc is the penetration factor described in Section 5.2.1.1. If one assumes that
2
> is the same for all c 2 x, the probability density function for
the average < c
the partial width becomes,
p .y/dy D
2 1 y
e 2 y dy
2 .=2/ 2
(5.153)
where p .y/ is the well-known 2 -distribution with the degrees of freedom equal
to the total number of exit channels for process x and y D x =hx i, the local to
average
˝ 2 ˛ ratio of the partial width. It should be noted that Eq. 5.153 is no longer valid
is different for c 2 x, a scenario where the multichannel effect is important.
if c
Level Width Distribution
The distribution of E is characterized by the Wigner distribution [73] or the longrange correlation described by Dyson [74]. The former, also known as Wigner’s
surmise developed in 1957, that the probability distribution of the level spacing
jE E j for a given l and J state should follow a simple analytical form,
w.x/dx D x exp x 2 dx
(5.154)
2
4
ı˝
˛
where x D jE E j jE E j is the local to average ratio of the level spacing.
The fact that w.0/ D 0 signifies the repulsion of eigenvalues of a real and symmetric Hamiltonian matrix. More studies of this subject of such a matrix ensemble
were further pursued by Wigner and others [71, 73, 74]. Of particular interest for
5
Resonance Theory in Reactor Applications
275
practical applications is the so-called Gauss orthogonal ensemble (GOE) in which
the distributions of the matrix elements are taken to be statistically independent
Gaussians and invariant under orthogonal transformation. The Wigner distribution
given above is identifiable with the behavior of the difference of eigenvalues of a
2 2 matrix. In contrast, the latter proposed by Dyson [74] is free from the assumption used by the former. Here, the ensemble under consideration is not the
Hamiltonian directly but an ensemble of eigenvalues fE g of a unitary matrix S
of the form exp.i ‚ / uniformly distributed around a unit circle. For an orthogonal
ensemble, a general expression for the n-level correlation function as a function of
‚ with D 1; 2; 3; : : : ; n was developed by Dyson [74]. From the practical point
of view, the general form of the distribution is difficult to use. In the limit of two
levels, however, it is expressible in a simple analytical form which can be utilized
for routine applications as one shall see.
Level Correlation Functions
In practical calculations, another distribution of interest, .jEk Ej j/, is the probability of finding any level within an interval at a distance jEk Ej j away from
a given level Ek when the direct integration approach is used for averaging. It is
of particular interest when the overlapping
ˇ˛ is dominated by the immediate
ı˝ˇ of levels
neighbors. If one sets x D jEk Ej j ˇEk Ej ˇ as before, this distribution can
be defined by the following convolution integral equation,
Zx
.x/ D w.x/ C
w.t/.x t/dt
(5.155)
0
where w.x/ is the Wigner distribution of the level spacing belonging to a resonance sequence of a given l and J previously defined. The analytical solution to
this equation in a closed form is difficult to derive but one can handle it numerically.
It is interesting to note that the analytical solution in a closed form can be derived
if w.x/ is replaced by the 2 -distribution of even order via the use of the Laplace
transform [75], i.e.,
1
.x/ D
2 i
cCi
Z 1
ci 1
.v=2/v=2 e xp dp
.v=2/ .v=2/x
e
D
2 i
.p C v=2/v=2 .v=2/v=2
cCi
Z 1
ci 1
e .v=2/xz dz
z.v=2/ 1
(5.156)
whereby, for all even 2, it is expressible in terms of a damped oscillatory
function in closed form readily shown via the use of the Cauchy integral theorem
with the poles of the integrand computed via De Moivre’s theorem. In particular, by
setting D 8, the corresponding 2 -distribution exhibits resemblance to the Wigner
distribution and Eq. 5.156 yields the following simple analytical form,
.x/ D 1 e 8x 2 sin 4xe 4x
(5.157)
276
R.N. Hwang
which was used in earlier work based on the “nearest level” approximation for
estimating the overlap effects on the self-shielded cross sections attributed to immediate neighbors.
Physically, Dyson’s two-level correlation function can be considered as the rigorous alternative to .x/. The former was later introduced into reactor applications
by replacing .x/ with r2 .x/ derived by Dyson [74], i.e.,
ds.x/
r2 .x/ D 1 Œs.x/ dx
Z1
2
s.y/dy; s.x/ D
sin x
x
(5.158)
x
There is another type of level correlation function of practical interest. If the ensembles levels fEk g and fEj g belong to either different spin states or different nuclides,
the distribution of jEk Ej j must follow the Poisson distribution. Thus, if one
substitutes the 2 -distribution of D 2 into Eq. 5.156 in place of the Wigner distribution, the corresponding correlation function becomes .x/ D 1. Physically, this
signifies the fact that the probability of finding a level at a distance jEk Ej j is
equally probable so long as these levels are statistically independent.
5.6.1.3 Conceptual Aspects of Computing Average Cross Sections
Conceptually, the computation of these averages can be considered as the natural
extension of the previously described treatment of resonance cross sections in the
resolved energy range. With no loss of generality, consider an ensemble of a generic
quantity qk.l;J / .E/ attributed to a Breit–Wigner resonance k for a given l and J state.
E
D
, can be pictured as the population
The statistically averaged value, qk.l;J / .E/
E par:
average in the following context,
D
E
qk.l;J / .E/
E par:
N Z E2
X
1
D
qk.l;J / .E/dE
E2 E1
E1
kD1
Z
1
1 1 .l;J /
lim
D
q
.E/dE
hDi N !1 N 1 k
Z 1
1
q .l;J / .E/dE
D
hDi 1 k
par:
(5.159)
where the average resonance parameters for the distributions are specified at the
predetermined energy E provided by the evaluated data file. Here, hDi denotes
the average level spacing. The equivalence of the discrete average and the statistical
average clearly requires the validity of the ergodicity and stationarity of the samples
inside the angular bracket in the vicinity of E .
5
Resonance Theory in Reactor Applications
277
This concept has been widely used as a basis for computing the unshielded,
as well as the shielded, group cross sections. For the former, qk.l;J / .E/ is set to
.l;J /
.E/ as defined in Eqs. 5.41 and 5.42. For the latter, it requires the average rexk
.l;J /
action rate as well as the average flux, which can be obtained by setting qk .E/
.l;J /
equal to xk
.E/ k .E/ and k .E/ respectively.
The relation defined in Eq. 5.159 can be cast into a different context from the
standpoint of the widely used Monte Carlo technique. Given the probability density
functions for level spacing and partial widths, one can construct the corresponding discrete resonance sequence whereby the averages can be treated in the same
manner as the resolved resonances as shown in the first part of Eq. 5.159. This can
be accomplished by the following procedure. First, one specifies the c.d.f. from the
known p.d.f., say p.x/,
Zx
P .x/ D
p.y/dy; x D P 1 .x/
(5.160)
0
where P 1 .x/ symbolically
R 1 represents the value x obtainable from a given value
in P .x/. By definition, 0 p.x/dx D 1 so that the range of the c.d.f. must be between 0 and 1. P .x/ can be either the c.d.f. for level spacing or that for partial
widths where appropriate. Second, a resonance sequence can be generated if one
assumes that P .x/ is equally probable within the range 0 P .x/ 1. This can be
accomplished by choosing a random number i obtained by a random number generator, a computer program for generating statistically uncorrelated numbers with
0 i 1, and setting Pi D i , from which the corresponding set of parameters xi can be obtained from the inverse relation given by Eq. 5.160. Thus, by
repeating such a process for the c.d.f.’s of all resonance parameters, a resonance sequence consisting of discrete resonances can be generated. A sequence of discrete
resonances so generated is commonly referred to as a “resonance ladder.” Given the
resonance ladder, the desired average can be treated in the same manner as if they
were resolved resonances. It is important to realize, however, that this procedure
is always accompanied by significant statistical uncertainties reflecting the highly
fluctuating nature of the resonance phenomena. Thus, to reduce such uncertainties
generally requires the inclusion of many statistically uncorrelated ladders. This undoubtedly hinders the routine usage of such a method especially when high accuracy
is required.
It should be noted that the desired averages can also be specified from a somewhat different perspective via the use of the multiplication theorem defined by
Eq. 5.159. This concept leads to another alternative approach commonly referred
to as the “probability table” method, which is particularly attractive in conjunction
with reactor calculations using the Monte Carlo method. A discussion of this method
will be presented later.
278
R.N. Hwang
5.6.2 Average Unshielded Cross Sections
and Fluctuation Integrals
The simplest possible averages based on the statistical concept given in the previous
subsection are the average unshielded cross sections also referred to as the “infinitely dilute” cross sections. Such averages are generally temperature-independent
due to the invariant nature of the Doppler-broadened line-shape functions when integrated over energy. One average of particular importance is the average compound
nucleus cross section. Substituting Eq. 5.42 into 5.159, one obtains,
htR i D 2 2 2
X
gJ cos 2 l sl;J
(5.161)
l;J
where sl;J D hn i=hDi is known as the “strength function,” a quantity directly relatable to the transmission measurements.
For partial cross sections, the unshielded averages are directly expressible in
terms of fluctuation integrals. The substitution of Eq. 5.41 into 5.159 leads to,
X g n x J
; x 2 ; or f
hx i D 2 hDiJ
t
l;J
2
X g
n
J
hsR i D 2 2 2
2 sin2 l hn i
hDiJ
t
2 2
(5.162)
(5.163)
l;J
for reaction x cross section and resonance scattering cross section, respectively. It
should be noted that the total width in practical applications is taken to be the sum
of all partial widths plus a “competitive width” c , i.e., t D n C C f C c .
The latter presumably provides the approximate means to account for the inelastic
scattering channels in the energy range above the first excited state of the nuclide in
question. The quantity hn x =t i is referred to as the “fluctuation integral” which
can be computed accurately via direct integration over all distributions of the partial widths. Let fx D hn x =t i be the multiple integral over the 2 -distributions
of partial widths involved. Taking advantage of the exponential nature of these distributions, one can show readily that fx can be reduced to a single integral of the
generic form given below,
Z1
fx D aC
0
e t dt
.1 C c1 t/. 1 =2/Cj .1 C c2 t/. 2 =2/Ck .1 C c3 t/.
3 =2/Cm
(5.164)
for x 2 ; s, or f , where ci D 2hx i = = i . with the degrees of freedom ˝i in
˛
the context given by Eq. 5.153 and i 2 1; 2; 3 corresponding to hx i D hn i ; f ,
and hc i, respectively, and C D hn i hx i = . The constant a is equal to unity for
all x ¤ s while a D 3 when the fluctuation integral for the scattering cross section
5
Resonance Theory in Reactor Applications
279
that appears in the first term of Eq. 5.163 is considered. The integer constants j , k
and m are dependent on the type of reaction considered. For x 2 , j D 1, k D 0,
and m D 0. For x 2 f , j D 1, k D 1 and m D 0. For x 2 s, j D 2, k D 0 and m D 0.
For x 2 c, j D 1, k D 0 and m D 1. It should be noted that fc is not used in practical
calculations.
One way to evaluate the single integral given by Eq. 5.164 is via a Gauss quadrature. In particular, the integral can be computed accurately and efficiently if the
change of variable of integration setting y D .1 C t/=.1 t / is made prior to integration and is followed by the use of the Gauss–Legendre quadrature in the
y-domain [77].
The accurately computed average unshielded cross sections via the fluctuation
integrals provides the necessary criteria to verify those computed by other methods, either via deterministic or Monte Carlo approaches. The disagreement in the
unshielded cross sections clearly will be passed on to the calculations of the selfshielded cross sections as well.
5.6.3 Traditional Approaches for Computing Self-shielded
Average Cross Sections
Discussions in this section focused on the traditional methods for computing the
self-shielded cross sections in the unresolved resonance range that directly utilize the known joint probability distribution of resonance parameters described in
Section 5.6.1.1. There are two different types of approaches based on the same general principle currently in use. The most commonly used approach is based on the
direct integration, a procedure similar to that of computing the unshielded averages. Another less commonly used approach, referred to as the “resonance ladder”
method, is based on statistically generated discrete resonance sequences via the use
of the Monte Carlo technique described in Section 5.6.1.3. A brief discussion of
these methods is now presented.
5.6.3.1 Methods Based on Direct Integrations
For the direct integration approach, the self-shielded cross section for a given reaction process x associated with a given l and J state is defined on a statistical basis
in the vicinity of a given energy E and can be viewed as the natural extension
of the J -integral approach described in Section 5.4.4.2 if the NR-approximation
is assumed [53]. The only difference is that the sum over individual resonances is
replaced by a statistical average in the context of Eq. 5.159, i.e.,
D
Q x.l;J / D
†b
x.l;J / .E /
†b C†Rt .E /
f
E
; f D1
†Rt .E/
†b C †Rt .E/
(5.165)
280
R.N. Hwang
where †b denotes the macroscopic energy-independent background cross section
including all potential scattering cross sections and smooth cross sections given by
the data file as well as the “equivalent” cross section for the cell where appropriate.
Like the case of the resolved resonances, the presence of a neighboring resonance
can affect the resonance integral of a given resonance through its impact on the local
flux. From statistical considerations, such overlap effects can come from two types
of resonances present. They may either belong to resonances of the same spin sequence or those of a different spin sequence of the same nuclide and/or sequences
of different nuclides. The former requires statistical averaging over the level correlation function defined by Eq. 5.157 or 5.158, while the latter requires integration
over the uniform distribution, or .x/ D 1, as described in Section 5.6.1.2 in addition to averaging over the partial widths. Two scenarios can be described separately
as follows.
Scenario 1: One Spin Sequence Only
In absence of other statistically independent resonance sequences, Eq. 5.165 can be
expressed as [53],
Q x.l;J /
h
iE
x J.k ; ˇk ; ak / Ok.x/
0k
D h
iE
D
1
1 hDi t J.k ; ˇk ; ak ; ak / Ok.t0 /k
b
hDi
D
(5.166)
where b D †b =N0 is the background cross section per atom density of the nuclide under consideration. As demonstrated in [53,77], the statistical average can be
computed efficiently via the use of Gauss quadratures. In particular, the integration
involving the level correlation function can be significantly simplified if the overlap
effects can be attributed to the nearest level, an approximation usually referred to as
the “nearest neighbor” approximation.
Scenario 2: Presence of Many Statistically Independent Resonance Sequences
As described in [53], the integrand in Eq. 5.165, in principle, can also be further
partial-fractioned into the similar form given by Eq. 5.166 in the presence of other
statistically independent resonance sequences. Let the subscript i denote such a sequence and Jk denote the corresponding J -integral. The statistical averaging of
such an integral requires not only integration over the correlation function of neighbors of the k-th sequence but also a uniform distribution of those belonging to the
i-th sequence as well. The latter integration to first order can be approximated readily and the resulting expression for the reaction rate and the flux become,
5
Resonance Theory in Reactor Applications
281
2
E
E
X 1
b D
b D
.x/
.x/ 4
xk Jk
x Jk
1
hDk i
hDk i
hDi i
3
E
D
.t / 5
ti Ji
(5.167)
i ¤k
f 1
X
all m
E
1 D
tm Jm.t /
hDm i
(5.168)
where Jk.x/ denotes the quantity inside the square bracket in the numerator of
.t /
Eq. 5.166 while Jk denotes the quantity inside the square bracket in the denominator of Eq. 5.166.
Note that the second-order terms in Eqs. 5.167 and 5.168 are usually small and
the inter-sequence overlap terms can be approximated by,
1
X
all i
E
E
Y
1 D
1 D
1
ti Ji.t / ti Ji.t /
hDi i
hDi i
(5.169)
all i
Thus, the substitution of Eq. 5.169 into 5.167 and 5.168 leads to an extremely simple
expression for the average self-shielded cross section particularly useful for reactor
applications,
X
Q x.l;J /
(5.170)
Q x D
l;J
where Q x.l;J / is dependent on the in-sequence overlap term but independent of the
inter-sequence overlap terms as given by Eq. 5.166.
Physically, the direct integration approach is directly compatible with the fluctuation integral approach for computing the unshielded average cross sections. In
the limit of infinite dilution, the average shielded cross section will approach the
unshielded cross section based on the fluctuation integral.
5.6.3.2 Methods Using the Statistically Generated Resonance Ladders
As described in Section 5.6.1.3, the unresolved resonances can also be pictured
as an ensemble of discrete resonances generated based on the known distribution
functions of resonance parameters via the usual Monte Carlo techniques. One advantage of this approach is that subsequent calculations of the shielded average cross
sections are no longer constrained by the use of the NR-approximation required
by the previous methods based on direct integration. However, the resulting cross
sections computed on this basis are subject to large statistical uncertainties difficult
to circumvent. Therefore, this approach is not widely used in practical applications
as others.
282
R.N. Hwang
5.6.4 Probability Table Methods
One method particularly amenable in conjunction with the Monte Carlo approach
for reactor applications is the “probability table” method pioneered by Levitt [78].
The method bypasses the need of directly using the ladder approach at run time in
earlier Monte Carlo codes, a procedure that would not only require large storage
space and excessive computing time but also would lead to less manageable statistical uncertainties in its results. Conceptually, the general idea is to utilize the
probability and conditional probability distributions themselves deducible numerically from the known distributions of resonance parameters so that the averages
of interest can be cast into the form of a Lebesgue integral instead of that of the
Riemann integral described earlier.
5.6.4.1 Conceptual Basis
Consider a simple case where the neutron flux .t / is a function of t alone,
i.e., the NR-approximation. One alternative joint probability distribution required
to specify the average of the type hx .t /i is, according to the multiplication
theorem,
.max/
Z
t
hx .t /i D
.max/
Z
t
p.t /E.x jt / .t /dt ; h .t /i D
0
.t /p.t /dt
0
(5.171)
where p.t / is the p.d.f. for the total cross section and E.x jt / is referred to as the
conditional means of the partial cross section, the average x over the conditional
probability p.x jt /, defined as
.max/
Z
x
E.x jt / D
x p.x jt /dx
(5.172)
0
which can be precomputed once the conditional probability is known.
Thus, if these probability distributions are deducible from the known distributions of resonance parameters, the averages of practical interest can be expressed
in terms of single integrals of Lebesgue form given by Eq. 5.171. For practical applications, only the prior knowledge of p.t / and the conditional means of various
partial cross sections is required at run time. The main attraction of this approach is
that p.t / along with its c.d.f. and various conditional means can be predetermined
and tabulated in one-dimensional arrays for a prescribed E and different temperatures before their deployment. The same principle is extendable to the treatment
of the resolved resonances in conjunction with the subgroup methods developed
5
Resonance Theory in Reactor Applications
283
independently by Nikolaev [79], Cullen [80] and Ribon [75]. Various numerical
methods for computing these tabulated quantities will be further addressed.
There is one word of caution if this approach is applied beyond the
NR-approximation. Strictly speaking, this would require the conditional distribution of the form p.x js ; t /, which may negate, at least in part, the advantage
of simplicity described above.
5.6.4.2 Methods for Computing the Tabulated Quantities
There are two widely used methods for computing the discrete values of p.t / and
E.x jt / as a function of t . One is the numerical scheme based on the statistically generated resonance “ladders” originally developed by Levitt [78]. The other
is the moment-based approach based on matching the moments and partial moments of cross sections developed by Ribon and Maillard [75]. A brief outline of
these methods is presented below.
Method Based on Resonance Ladders
The rationale is relatively straightforward. The discretized p.t / and the corresponding conditional means are determined through the following steps.
1. For a given reference energy E where the averages are to be computed, a set
.max/
is preselected as the
of total cross section bands fti g with 0 < ti < t
mesh points for the tables to be constructed. t D ti1 and t D ti , graphically equivalent to two parallel lines in a plot of t vs. E, constitutes the i-th
“band.” The index i will henceforth be used to denote the t bands that will
serve as entries to the tables.
2. Select an energy range E D E2 E1 around E with E >>> hDi, the average spacing of the resonance ladder in question.
3. Generate a “ladder” of resonances with the peak resonance energies falling inside E using the usual Monte Carlo scheme described in Section 5.6.1.3.
4. Construct a set of fine energy mesh fEk g within the preselected interval E
from which the corresponding point-wise total cross sections ftk g and partial
cross sections fxk g are computed.
5. Determine a subset of energy pairs fE2j ; E2j 1 g, signifying the consecutive
pairs of intersections created between the upper and lower lines of the i-th
“band” with the plot of t vs. E, which can be obtained via the criterion
ti < tk < ti1 . The index j will henceforth be used to denote the consecutive
energy pairs sandwiched by a given band i .
6. Store the localized average partial cross sections N xj.L/ .E2j ; E2j 1 / and total
.L/
cross section N tj .E2j ; E2j 1 / of this ladder L for all energy pairs within the
given band.
284
R.N. Hwang
7. The localized average of the probability density function (p.d.f.) of the total
cross section at the i-th band amounts to the fraction of N tj.L/ .E2j ; E2j 1 / that
falls in the range between ti and ti1 , i.e.,
J
P
pi.L/
D
j D1
jE2j E2j 1 j
E
D 0;
; ti1 < .L/
tj .E2j ; E2j 1 / < ti
elsewhere
(5.173)
if and only if the band width is sufficiently small.
.L/
8. The c.d.f. for the given ladder is simply Pi
D
i
P
kD1
.L/
pk .
9. The corresponding conditional means of the partial cross section at the i-th band
becomes
E.x j ti / D
.L/
1 X
.L/
.L/
.E2j E2j 1 / xj .E2j ; E2j 1 /; ti1 < tj .E2j ; E2j 1 /<ti
E j
D 0; elsewhere
(5.174)
Steps 7 through 9 are repeated for all predetermined bands.
10. Repeat steps 4 through 8 as many times as one desires. The final quantities to
be tabulated are just the population averages of all ladders considered.
Moment Method
The quantities of interest can also be deduced from the known moments and partial
moments of cross sections via the method developed by Ribon and Maillard [75].
Their rationale is that the usual Riemann integral, in principle, can be truncated into
a form of the discretized Lebesgue integral. For the case in point, the n-th order
moment and partial moments can be expressed respectively as
Mn D
1
E
Z
tn .E/dE D
E
Mn.x/ D
1
E
N
X
pi tin I
i D1
Z
x .E/tn .E/dE D
E
N
X
pi xi tin
(5.175)
i D1
where xi D E.xi jti /. The question is how to compute these discretized quantities.
The same rationale is equally applicable for the resolved as well as the unresolved treatments in the context of Eq. 5.159. Ribon and Maillard [75] conjectured
5
Resonance Theory in Reactor Applications
285
that these discretized quantities ti and pi in the first equation above can be pictured
as the Gauss quadrature abscissae and weights in conjunction with an integral of the
following form:
Z
F .z/ D
t
X pi
p.t /
C R2N
dt D
1 zt
1 ti z
N
(5.176)
i D1
As the usual criterion, the abscissae and weights of the Gauss quadrature of order
N requires the knowledge of up to (2N 1)-th order moments. This can be accomplished by using the Pade approximation for F .z/ whereby
X wi
a0 C a1 z C a2 z2 C C aN 1 zN 1
D
C R2N
1 C b1 z C b2 z2 C C bN zN
1 zz
N
F .z/ D
i D1
(5.177)
i
The coefficients in the numerator and denominator can form a system of 2N linear
equations in which each equation is a linear combination of moments if the function
is matched with all moments from n D 0 to n D N 1. The solution of fai g and fbi g
provides the unambiguous basis to compute the weight wi , the residue identifiable
with pi and abscissas 1=zi , the pole identifiable with ti .
The same rationale can be extended to the matching of the partial moments given
by Eq. 5.175. It should be noted, however, the determination of xi , the conditional
means, requires a total of only N partial moments in contrast to 2N moments required for computing pi . Thus, there is an issue of uniqueness when the conditional
means are determined this way.
5.7 Future Challenges
Given our vastly improved computational capability and tools for evaluating nuclear
data, much of the topics discussed can be greatly enhanced. Some of the future
challenges in various areas, in the author’s opinion, are listed as follows:
1. Nuclear Data and Cross-Section Representation
Continuous efforts to improve the resonance parameters for a wide range of
nuclides are needed.
More utilization of the self-indication ratio measurement at various tempera-
tures as a means for data verification is needed particularly in the unresolved
energy range. It is the only viable, yet simple, means that provide a basis to
verify the computed self-shielding effect directly vs measurements.
One cross-section representation not widely recognized in the West is the
so-called combined method pioneered by Lukyanov and Sirakov [81]. The
method certainly deserves more attention by the nuclear data communities
around the globe.
286
R.N. Hwang
2. Point-wise Doppler-Broadening of Cross Sections
One long-standing issue in Doppler-broadening of cross sections is the poten-
tial impact of crystalline binding effects not accounted for in the traditional
approach based on the ideal gas model. In addition, such effects can affect the
elastic scattering kernel currently in use according to recent studies [82, 83].
Although other recent investigations were also made in this area [84], the fundamental issues remain unresolved. In particular, more in-depth theoretical,
as well as experimental, studies on the solid state aspects pertinent to the
chemical binding of various lattices of interest needs to be pursued.
To enhance the accuracy and efficiency of calculations at run time, one area
of practical interest is optimization of point-wise cross section libraries at a
given temperature currently used in many production codes.
3. Multigroup Cross-Section Processing
One obvious challenge is the improvement of the current collision probabil-
ity treatment at the resonance level in order to accommodate complex reactor
cells. For many realistic reactor lattices, a two-dimensional collision probability approach is apparently needed.
In the same context, improvement of the current methods for computing the
global weighting spectrum for cross-section collapsing in various multigroup
constant codes is also in order.
4. Improvement of Unresolved Resonance Treatment
A new analytical approach for computing probability tables via the integral
transform techniques is currently under consideration as outlined in [85]. Further development is underway.
One method that has not received much attention in the West is the characteristic function approach for treating averages involving the S-matrix pioneered
by Lukyanov et al. [86, 87]. Some recent results have demonstrated its viability to compute the self-shielded cross sections of practical interest. It is yet
another alternative method that is worth pursuing further.
The method based on the concept of maximizing information entropy as applied to the treatment of S-matrix averages pioneered by Fröhner [88] has
provided a new alternative quite different from all the traditional methods
described here. The challenge is whether one can utilize such a concept in
calculations of practical interest in reactor applications.
5. Book on Resonance Theory in Reactor Applications
In this author’s opinion, what has been missing is a comprehensive book that
provides an up-to-date reference basis for a large body of work in this area.
There were two representative books of this kind in earlier days. One was
authored by L. Dresner [6] in the early 1960s, while the other, by Lukyanov
[10], was published in the mid-1970s. It is quite evident from the foregoing
discussions that a new book in this area is needed.
References
1. Fermi E (1962) Collected papers of E. Fermi, vol 1. University of Chicago Press, Chicago, IL
2. Wigner EP (1955) Nuclear reactor issue. J Appl Phys 26:257
3. Wigner EP, Creutz JH, Snyder TJ (1955) J Appl Phys 26:260
4. Chernick J (1956) The theory of uranium-water lattices. In: Proceedings of the international conference on peaceful uses of atomic energy, New York, pp 603
5. Chernick J, Vernon R (1958) Nucl Sci Eng 4:649
6. Dresner L (1960) Resonance absorption in nuclear reactors. Pergamon, New York
7. Gourevitch II, Pomeranchouk IY (1955) Theory of resonance absorption in heterogeneous systems. First Geneva Conference on Peaceful Use of Atomic Energy, vol 5, pp 466
8. Marchuk GI (1961) Methods of reactor calculations. Atomizdat, Moscow (in Russian); English trans: Consultants Bureau, New York (1964)
9. Lukyanov AA (1978) Structures of neutron cross sections. Atomizdat, Moscow (in Russian)
10. Lukyanov AA (1974) Slowing-down and absorption of resonance neutrons. Atomizdat, Moscow (in Russian)
11. Breit G, Wigner EP (1936) Phys Rev 49:519
12. Hummel HH, Okrent D (1970) Reactivity coefficients in large fast power reactors. Monograph series on nuclear science and technology, American Nuclear Society
13. Rose PF, Dunford CL (eds) (1990) ENDF-102 Data formats and procedures for evaluated nuclear data file ENDF-6, BNL-NCS-44945. Brookhaven National Laboratory
14. Hwang RN (1992) R-Matrix parameters in reactor applications. In: Dunford CL (ed) Proceedings of the symposium on nuclear data evaluation methodology, Brookhaven National Laboratory, New York, pp 316–326
15. Lane AM, Thomas RG (1958) Rev Mod Phys 30:257
16. Lynn JE (1968) The theory of neutron resonance reactions. Clarendon, Oxford
17. Fröhner FH (1978) Applied resonance theory, KFK-2669. Kernforschungszentrum Karlsruhe
18. Wigner EP, Eisenbud L (1947) Phys Rev 72:29
19. Kapur PL, Peierls R (1938) Proc R Soc Lond A 166:277
20. Adler DB, Adler FT (1963) Neutron cross sections in fissile elements. In: Proceedings of the conference on breeding, economics and safety in fast reactors, ANL-6792, 695, Argonne National Laboratory
21. Moldauer PA, Hwang RN, Garbow BS (1969) MATDIAG, a program for computing multilevel S-Matrix resonance parameters, ANL-7569, Argonne National Laboratory
22. deSaussure G, Perez RB (1969) POLLA, a Fortran program to convert R-Matrix-type multilevel resonance parameters for fissile nuclei into equivalent Kapur–Peierls-type parameters. ORNL-2599, Oak Ridge National Laboratory
23. Reich CW, Moore MS (1958) Phys Rev 111:929
24. Hwang RN (1987) Nucl Sci Eng 96:192
25. Humblet J, Rosenfeld L (1961) Nucl Phys 26:529
26. Hwang RN (1992) Nucl Sci Eng 111:113
27. Solbrig AW Jr (1961) Nucl Sci Eng 10:167
28. Osborn RK, Yip S (1964) Foundations of neutron transport theory. Am Nucl Soc Monograph
29. Lamb WE (1939) Phys Rev 55:190
30. Egelstaff PA (1962) Nucl Sci Eng 12:250
31. Nelkin M, Parks DE (1960) Phys Rev 119:1060
32. Adkins CR (1966) Effects of chemical binding on the Doppler broadened cross section of uranium in a UO2 lattice. Ph.D. thesis, Carnegie Institute of Technology
33. Dolling G, Cowley RA, Woods ADB (1965) The crystal dynamics of uranium dioxide. Can J Phys 43(8):1397
34. Hwang RN (1998) An analytical method for computing Doppler-broadening of cross sections, ANL-NT-69, Argonne National Laboratory
35. O'Shea DM, Thacher HC (1963) Computations of resonance line-shape functions. Trans Am Nucl Soc 6:36
36. Turing AM (1943) Proc Lond Math Soc 48:130
37. Henrici P (1963) Some applications of the quotient-difference algorithm. Proc Symp Appl
Math 15:159
38. Goodwin ET (1949) Proc Cambridge Phil Soc 45:241
39. Bhat MR, Lee-Whiting GE (1967) A computer program to evaluate the complex probability
integral. Nucl Instr Meth 47:277–280
40. Fröhner FH (1980) New techniques for multi-level cross section calculation and fittings. KFK3081, Kernforschungszentrum Karlsruhe
41. Steen NM, Byrne GD, Gelbard EM (1969) Math Comp 23:107
42. Cullen DE (1977) Program SIGMA1 (Version 77-1): Doppler broaden evaluated cross sections
in the evaluated nuclear data file/Version B (ENDF/B) Format, UCRL-50400, vol. 17, Part B
43. MacFarlane RE, Muir DW (1994) NJOY nuclear data processing system, Version 91. LA-12740-M, Los Alamos National Laboratory
44. Hwang RN (1998) Critical examinations of commonly used numerical methods for Dopplerbroadening of cross section, ANL-NT-72, Argonne National Laboratory
45. Dunford CL, Bramblett ET (1966) Doppler broadening of resonance data for Monte Carlo
calculations, MAI-CE-MEMO-21, Atomic International
46. Leal LC, Hwang RN (1987) A finite difference method for treating the Doppler-broadening of
cross sections. Tran Am Nucl Soc 55:340
47. Weinberg AM, Wigner EP (1958) The physical theory of neutron chain reactors. University of
Chicago Press, Chicago, IL
48. Placzek G (1946) Phys Rev 69:423
49. Corngold N (1957) Slowing down of neutrons in infinite homogeneous media. Proc Phys Soc Lond A 70:793
50. Hwang RN (1965) Effects of the fluctuations in collision density on fast reactor Doppler effect
calculations. In: Proceedings of the conference on safety, fuels, and core designs in large fast
reactors, 449, ANL-7120
51. Goldstein R, Cohen ER (1962) Nucl Sci Eng 13:132
52. Nordheim LW (1961) A program of research and calculations of resonance absorption. GA2527, General Atomic
53. Hwang RN (1973) Nucl Sci Eng 52:157
54. Henryson H, Toppel BJ, Stenberg CG (1976) MC2-2: a code to calculate fast-neutron spectra
and multigroup cross sections. ANL-8144, Argonne National Laboratory
55. Stacey WM, Jr (1972) Advances in continuous slowing-down theory. In: Proceedings of the
National Topical Meeting on New Developments in Reactor Physics and Shielding, Conf720901 Book 1, 143
56. Goertzel G, Greuling E (1960) Nucl Sci Eng 7:69
57. Kier PH, Robba AA (1967) RABBLE, a program for computation of resonance absorption in
multiregion reactor cells, ANL-7326
58. Olson AP (1970) An integral transport-theory code for neutron slowing-down in slab cells,
ANL-7645
59. Case KM, de Hoffmann F, Placzek G (1963) Introduction to the theory of neutron diffusion,
vol 1. Los Alamos Scientific Laboratory
60. Dancoff SM, Ginsburg M (1944) Surface resonance absorption in a closely packed lattice,
USAEC-Report cp-2157
61. Rothenstein W (1959) Collision probabilities and resonance integrals for lattices. BNL-563
(T-151), Brookhaven National Laboratory
62. Corngold NC (1957) Resonance escape probability in slab lattices. J Nucl Energy 4:293
63. Takahashi H (1960) First flight collision probability in lattice systems. J Nucl Energy A12:1
64. Bickley WC, Nayler J (1935) Phil Mag 20(Series 7):343
65. Bell GL (1959) Nucl Sci Eng 5:138
66. Levine MM (1963) Nucl Sci Eng 16:271
67. Hummel HH, Hwang RN, Phillips K (1965) Recent investigations of fast reactor reactivity
coefficients. In: Proceedings of the conference on safety, fuels, and core design in large fast
power reactors, ANL-7120, pp 413–420
68. Hwang RN, Toppel BJ (1979) Mathematical behavior probabilities for annular regions. In:
Proceedings of the topical meeting on computational methods in nuclear engineering, vol 2.
Williamsburg, VA, pp 7–85
69. Moldauer PA (1964) Phys Rev 135(3B):642–659; 136B:947
70. Ericson T (1963) Ann Phys 23:390–414
71. Porter CE (ed) (1965) Statistical theory of spectra: fluctuations. Academic, New York and
London
72. Porter CE, Thomas RG (1956) Phys Rev 104:483
73. Wigner EP (1957) Ann Math 53:36 (1951); 62:548 (1955); 65:203 (1957)
74. Dyson FJ (1962) J Math Phys 3(1):140
75. Ribon P, Maillard JM (1986) Les tables de probabilité applications au traitement des sections
efficaces pour la neutronique. Report CEA-N, NEACRP-L-294
76. Hwang RN (1965) Nucl Sci Eng 21:523
77. Hwang RN, Henryson H, II (1975) Critical examination of low-order quadratures for statistical
integrations. Trans Am Nucl Soc 22:712
78. Levitt LB (1971) The probability table method for treating unresolved resonances in Monte
Carlo criticality calculations. Trans Am Nucl Soc 14:648
79. Nikolaev MN, et al. (1971) At Energy 29:11 (1970); 30:426 (1971)
80. Cullen DE (1974) Nucl Sci Eng 55:378
81. Lukyanov AA, Sirakov IA (1998) Combined method for resonance cross section parameterization. Bulg J Phys 25(3–4):112
82. Ouisloumen M, Sanchez R (2000) Nucl Sci Eng 107:189–200
83. Bouland O, Kolesov V, Rowlands JL (1994) The effects of approximations in the energy distributions of scattered neutrons on thermal reactor Doppler effects. International conference on
nuclear data for science and technology, Gatlinburg, Tenn, pp 1006–1008
84. Gressier V, Naberejnev D, Mounier C (2000) Ann Nucl Energ 27:1115–1129
85. Hwang RN (1998) Recent developments pertinent to processing of ENDF/B 6 resonance cross
section data. In: Proceedings of the international conference on nuclear science and technology,
Long Island, New York, pp 1241–1250
86. Koyumdjieva N, Sovova N, Janeva N, Lukyanov AA (1989) Bulg J Phys 16(1):13
87. Alami MN, Janeva NB, Lukyanov AA (1991) Bulg J Phys 9:16
88. Fröhner FH (1987) Information theory applied to unresolved resonances. In: Proceedings of the
international conference on nuclear data for basic and applied science, Santa Fe, N. M. (1985);
also Recent progress in compound nuclear theory and calculation of resonance-averaged cross
sections, IAEA Meeting on Nuclear Model for Fast Neutron Nuclear Data Evaluation, Beijing,
IAEA TEC-DOC Series
Dr. Richard N. Hwang was born in Canton, China on January 24, 1935 and came
to the USA as a student in 1956. He joined
the staff at the Argonne National Laboratory
in 1962 after he received his B.S. in Mechanical Engineering and Ph.D. in Nuclear
Physics from Iowa State University. He retired from Argonne National Laboratory as
a senior physicist after a 45-year career. His
work there was focused on nuclear resonance
theory and its nuclear reactor physics applications. His work included: generalization
of the traditional resonance integral concept; development of accurate analytical and
numerical methods for computing Doppler-broadening of cross sections; development
of efficient methods for treating the generalized resonance integral; contribution
to a better understanding of Placzek oscillations on resonance integrals; studies
of the effect of crystalline binding on Doppler-broadening; original contributions
to the theory of statistical treatment of self-shielding effects in the unresolved
resonance range; development of efficient numerical algorithms for treating self-shielded cross sections in the resolved as well as the unresolved regions; accurate
method for computing transmission probabilities in annular cell geometry; development of statistical distributions for S-matrix parameters; development of a method
for converting statistically generated R-matrix parameters to Kapur–Peierls type
parameters; introduction of rigorous pole representation of cross sections; development of a method to extract the Humblet–Rosenfeld type parameters from rigorous
pole parameters. He was responsible for the development of the MATDIAG code,
resolved and unresolved resonance modules for the MC2-2 code, the WHOPPER
code, the SUPERW code, and the IMPLIC code. Dr. Hwang was a member and fellow of the American Nuclear Society. He was the author of over 66 journal articles, conference papers, and reports. In 2004 he received the Eugene P. Wigner Reactor Physicist award for his work in nuclear resonance theory. The Wigner award,
given by the American Nuclear Society for “outstanding achievements in the field
of nuclear reactor physics,” is the highest honor a reactor physicist can receive. He
passed away on December 20, 2007.
Argonne National Laboratory
February, 2008
Composed by Dr. Roger Blomquist
Chapter 6
Sensitivity and Uncertainty Analysis
of Models and Data
Dan Gabriel Cacuci
6.1 Introduction
This chapter highlights the characteristic features of statistical and deterministic
methods currently used for sensitivity and uncertainty analysis of measurements and
computational models. The symbiotic linchpin between the objectives of uncertainty
analysis and those of sensitivity analysis is provided by the “propagation of errors”
equations, which combine parameter uncertainties with the sensitivities of responses
(i.e., results of measurements and/or computations) to these parameters. It is noted
that all statistical uncertainty and sensitivity analysis methods first commence with
the “uncertainty analysis” stage, and only subsequently proceed to the “sensitivity
analysis” stage. This procedural path is the reverse of the procedural (and conceptual) path underlying the deterministic methods of sensitivity and uncertainty
analysis, where the sensitivities are determined prior to using them for uncertainty
analysis. In particular, it is emphasized that the Adjoint Sensitivity Analysis Procedure (ASAP) is the most efficient method for computing exactly the local sensitivities
for large-scale nonlinear problems comprising many parameters. This efficiency is
underscored with illustrative examples. The computational resources required by
the most popular statistical and deterministic methods are discussed comparatively.
A brief discussion of unsolved fundamental problems, open for future research, concludes this chapter.
6.2 Sensitivities and Uncertainties in Measurements
and Computational Models: Basic Concepts
In practice, scientists and engineers often face questions such as: How well does
the model under consideration represent the underlying physical phenomena?
What confidence can one have that the numerical results produced by the model
D.G. Cacuci ()
Institute for Nuclear Technology and Reactor Safety, Karlsruhe Institute of Technology, Germany
e-mail: Dan.Cacuci@KIT.Edu
are correct? How far can the calculated results be extrapolated? How can the
predictability and/or extrapolation limits be extended and/or improved? Answers to
such questions are provided by sensitivity and uncertainty analyses. As computer-assisted modeling and analyses of physical processes have continued to grow and
diversify, sensitivity and uncertainty analyses have become indispensable investigative scientific tools in their own right.
Since computers operate on mathematical models of physical reality, computed
results must be compared to experimental measurements whenever possible. Such
comparisons, though, invariably reveal discrepancies between computed and measured results. The sources of such discrepancies are the inevitable errors and
uncertainties in the experimental measurements and in the mathematical models.
In practice, the exact forms of mathematical models and/or exact values of data are
not known, so their mathematical form must be estimated. The use of observations
to estimate the underlying features of models forms the objective of statistics. This
branch of mathematical science embodies both inductive and deductive reasoning,
encompassing procedures for estimating parameters from incomplete knowledge
and for refining prior knowledge by consistently incorporating additional information. Thus, assessing and, subsequently, reducing uncertainties in models and data
require the combined use of statistics together with the axiomatic, frequency, and
Bayesian interpretations of probability.
As is well known, a mathematical model comprises independent variables, dependent variables, and relationships (e.g., equations, look-up tables, etc.) between
these quantities. Mathematical models also include parameters whose actual values
are not known precisely, but may vary within some ranges that reflect our incomplete knowledge or uncertainty regarding them. Furthermore, the numerical methods
needed to solve the various equations themselves introduce numerical errors. The
effects of such errors and/or parameter variations must be quantified in order to assess the respective model’s range of validity. Moreover, the effects of uncertainties
in the model’s parameters on the uncertainty in the calculated results must also be
quantified. Generally speaking, the objective of sensitivity analysis is to quantify the
effects of parameter variations on calculated results. On the other hand, the objective of uncertainty analysis is to assess the effects of parameter uncertainties on the
uncertainties in calculated results. Sensitivity and uncertainty analyses can be considered as formal methods for evaluating data and models because they are associated with the computation of specific quantitative measures that allow, in particular,
assessment of variability in output variables and importance of input variables.
Models of complex physical systems usually involve two distinct sources of
uncertainties: (i) stochastic uncertainty, which arises because the system under
investigation can behave in many different ways, and (ii) subjective or epistemic
uncertainty, which arises from the inability to specify an exact value for a parameter that is assumed to have a constant value in the investigation. Epistemic (or
subjective) uncertainties characterize a degree of belief regarding the location of the
appropriate value of each parameter. In turn, these subjective uncertainties lead to
subjective uncertainties for the response, thus reflecting a corresponding degree of
belief regarding the location of the appropriate response values as the outcome of
analyzing the model under consideration. A typical example of a complex system
that involves both stochastic and epistemic uncertainties is a nuclear reactor power
plant: in a typical risk analysis of a nuclear power plant, stochastic uncertainty arises
due to the hypothetical accident scenarios that are considered in the respective risk
analysis, while epistemic uncertainties arise because of uncertain parameters that
underlie the estimation of the probabilities and consequences of the respective hypothetical accident scenarios.
This section commences with a brief description of the main sources and features of errors and uncertainties associated with measurements. The fundamental
concepts used for assessing the magnitude and effects of uncertainties for complex measurements and computational simulations of physical phenomena are also
presented. The practical consequences of these fundamental concepts are embodied in the “propagation of errors (moments)” equations. As will be shown in this
section, the “propagation of errors” equations provide a systematic way for assessing the uncertainties in the results of measurements and computations arising not
only from uncertainties in the parameters that enter the computational model, but
also from numerical approximations. The “propagation of errors” equations systematically and consistently combine the parameter errors with the sensitivities of
responses (i.e., results of measurements and/or computations) to the respective parameters, thus providing the mathematical connection between the objectives of
uncertainty analysis and those of sensitivity analysis. The efficient computation of
sensitivities and, subsequently, uncertainties in results produced by various models
(algebraic, differential, integral, etc.) will then form the objectives of subsequent
sections in this chapter.
6.2.1 Measurement Uncertainties
A measurable quantity is a property of phenomena, bodies, or substances that can
be defined qualitatively and expressed quantitatively. Measurable quantities are also
called physical quantities. The term quantity is used both in a general sense, when
referring to the general properties of objects (e.g., length, mass, temperature, electric
resistance, etc.), and in a particular sense, when referring to the properties of a specific object (e.g., the length of a given rod, the electric resistance of a given segment
of wire, etc.). Measurement is the process of finding the value of a physical quantity experimentally with the help of special devices called measuring instruments.
The result of a measurement is a numerical value, together with a corresponding
unit, for a physical quantity. The true value of a measurable quantity is the value
of the measured physical quantity, which, if it were known, would ideally reflect,
both qualitatively and quantitatively, the corresponding property of the object. The
theory of measurement relies on the following postulates:
(a) The true value of the measurable quantity exists.
(b) The true value of the measurable quantity is constant (relative to the conditions of
the measurement).
(c) The true value cannot be found.
Since measuring instruments are imperfect, and since every measurement is an
experimental procedure, the results of measurements cannot be absolutely accurate.
This unavoidable imperfection of measurements is generally expressed as measurement inaccuracy, and is quantitatively characterized by measurement errors. Thus,
the result of any measurement always contains an error, which is reflected by the
deviation of the result of measurement from the true value of the measurable quantity. Knowledge of measurement errors would allow statements about measurement
accuracy, which is the most important quality of a measurement: the smaller the
underlying measurement errors, the more accurate is the respective measurement.
However, since the true value of a measurable quantity is always unknown, the errors
of measurements must be estimated theoretically, by computations, using a variety
of methods, each with its own degree of accuracy.
The basic sources of measurement errors are errors arising from the method of
measurement, errors due to the measuring instrument, and personal errors committed by the person performing the experiment. These errors are assumed to be
additive. Methodological errors are caused by unavoidable discrepancies between
the actual quantity to be measured and its model used in the measurement. Most
commonly, such discrepancies arise from inadequate theoretical knowledge of the
phenomena on which the measurement is based, and also from inaccurate and/or
incomplete relations employed to find an estimate of the measurable quantity. Instrumental measurement errors are caused by imperfections of measuring instruments.
Finally, the individual characteristics of the person performing the measurement
may give rise to personal errors.
If the results of separate measurements of the same quantity differ from one another, and the respective differences cannot be predicted individually, then the error
owing to this scatter of the results is called random error. Random errors can be
identified by repeatedly measuring the same quantity under the same conditions. On
the other hand, a measurement error is called systematic if it remains constant or
changes in a regular fashion when the measurements of that quantity are repeated.
Systematic errors can be discovered experimentally either by using a more accurate
measuring instrument or by comparing a given result with a measurement of the
same quantity, but performed by a different method. In addition, systematic errors
are estimated by theoretical analysis of the measurement conditions, based on the
known properties of the measuring instruments and the quantity being measured.
Although the estimated systematic error can be reduced by introducing corrections,
it is impossible to eliminate all systematic errors completely from experiments. Ultimately, a residual error will always remain, which will then constitute the systematic
component of the measurement error.
The smallest of the measurement errors are customarily referred to as elementary errors (of a measurement), and are defined as those components of the overall
measurement error that are associated with a single source of inaccuracy for the respective measurement. The total measurement error is calculated, in turn, by using
the estimates of the component elementary errors. Even though it is sometimes possible to partially correct certain elementary errors (e.g., systematic ones), no amount
or combination of corrections can produce an absolutely accurate measurement.
In particular, the corrections themselves cannot be absolutely accurate, and, even
after they are implemented, there remain residuals of the corresponding errors,
which cannot be eliminated and which later assume the role of elementary errors.
Since a measurement error can only be calculated indirectly, based on models and
experimental data, it is important to identify and classify the underlying elementary errors. This identification and classification is subsequently used to develop
mathematical models for the respective elementary errors. Finally, the resulting
(overall) measurement error is obtained by synthesizing the mathematical models
of the underlying elementary errors.
In the course of developing mathematical models for elementary errors, it has
become customary to distinguish four types of elementary errors: absolutely constant errors, conditionally constant errors, purely random errors, and quasi-random
errors. Thus, absolutely constant errors are defined as elementary errors that remain
the same (i.e., are constant) in repeated measurements performed under the same
conditions, for all measuring instruments of the same type. Absolutely constant errors have definite limits but these limits are unknown. For example, an absolutely
constant error arises from inaccuracies in the formula used to determine the quantity
being measured, once the limits of the respective inaccuracies have been established. Typical situations of this kind arise in indirect measurements of quantities
determined by linearized or truncated simplifications of nonlinear formulas (e.g.,
analog/digital instruments where the effects of electromotive forces are linearized).
Based on their properties, absolutely constant elementary errors are purely systematic errors, since each such error has a constant value in every measurement, but
this constant is nevertheless unknown. Only the limits of these errors are known.
Therefore, absolutely constant errors are modeled mathematically by a determinate
(as opposed to random) quantity, whose magnitude lies within an interval of known
limits.
Conditionally constant errors are, by definition, elementary errors that have
definite limits (just like the absolutely constant errors) but (as opposed to the absolutely constant errors) such errors can vary within their limits due to both the
non-repeatability and the non-reproducibility of the results. A typical example of
such an error is the measurement error due to the intrinsic error of the measuring
instrument, which can vary randomly between fixed limits. Usually, the conditionally constant error is mathematically modeled by a random quantity with a uniform
probability distribution within prescribed limits. This mathematical model is chosen because the uniform distribution has the highest uncertainty (in the sense of
information theory) among distributions with fixed limits. Note, in this regard,
that the round-off error also has known limits, and this error has traditionally
been regarded in mathematics as a random quantity with a uniform probability
distribution.
Purely random errors appear in measurements due to noise or other random errors
produced by the measuring device. The form of the distribution function for random errors can, in principle, be found using data from each multiple measurement.
In practice, however, the number of measurements performed in each experiment is insufficient for determining the actual form of the distribution function.
Therefore, a purely random error is usually modeled mathematically by using a
normal distribution characterized by a standard deviation that is computed from the
experimental data.
Quasi-random errors occur when measuring a quantity defined as the average
of nonrandom quantities that differ from one another such that their aggregate behavior can be regarded as a collection of random quantities. In contrast to the
case of purely random errors, though, the parameters of the probability distribution for quasi-random errors cannot be unequivocally determined from experimental
data. Therefore, a quasi-random error is modeled by a probability distribution with
parameters (e.g., standard deviation) determined by expert opinion.
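As a hedged illustration of how these elementary-error models are combined in practice, the following sketch (with purely hypothetical error limits) models a conditionally constant error as a uniform variate within the instrument's intrinsic-error limits, models a purely random error as a normal variate, and estimates the overall measurement uncertainty by Monte Carlo sampling, consistent with the additivity assumption stated earlier:

import numpy as np

# Illustrative sketch (all numbers hypothetical): combine two elementary
# error models -- conditionally constant (uniform within the instrument's
# intrinsic-error limits) and purely random (normal, with an experimentally
# estimated standard deviation) -- into an overall measurement uncertainty.
rng = np.random.default_rng(0)
n = 100_000
instrument_limit = 0.05            # +/- limit of the intrinsic error
sigma_random = 0.02                # std. dev. of the purely random error

err_conditional = rng.uniform(-instrument_limit, instrument_limit, n)
err_random = rng.normal(0.0, sigma_random, n)
total_error = err_conditional + err_random     # errors assumed additive

print("estimated std. dev.:", total_error.std())
# Analytical check for independent components: sqrt(limit**2/3 + sigma**2),
# since a uniform variate on (-a, a) has variance a**2/3.
print("analytical        :", np.sqrt(instrument_limit**2 / 3 + sigma_random**2))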
It is very difficult to identify systematic errors; for example, variable systematic errors can be identified by using statistical methods, correlation, and regression
analysis. Systematic errors can also be identified by measuring the same quantity
using two different instruments (methods) or by measuring periodically a known
(instead of an unknown) quantity. If a systematic error has been identified, then it
can usually be estimated and eliminated. However, making rational estimates of the
magnitude of the residual systematic errors and, in particular, assigning consistent
levels of confidence to these residual errors is an extremely difficult task. In practice,
therefore, residual systematic errors are assumed to follow a continuous uniform distribution, within ranges that are conservatively estimated based on experience and
expert judgment.
When the random character of the observational results is caused by measurement errors, the respective observations are assumed to have a normal distribution.
This assumption rests on two premises: (i) since measurement errors consist of many
components, the central limit theorem implies a normal distribution for such errors;
and (ii) measurements are performed under controlled conditions, so that the distribution function of their error is actually bounded. Hence, approximating a bounded
distribution by a normal distribution (for which the random quantity can take any
real value) is a conservative procedure since such an approximation leads to larger
confidence intervals than would be obtained if the true bounded distribution were
known. Nevertheless, the hypothesis that the distribution of the observations is normal must be verified, since the measured results do not always correspond to a
normal distribution. For example, when the measured quantity is an average value,
the distribution of the observations can have any form.
An indirect measurement is a measurement in which the value of the unknown
quantity is calculated by using matched measurements of other quantities, called
measured arguments or, briefly, arguments, which are related through a known relation to the measured quantity. In an indirect measurement, the true but unknown
value of the measured quantity or response, denoted by $R$, is related to the true but unknown values of the arguments, denoted as $(\alpha_1, \ldots, \alpha_k)$, by a known relationship (i.e., function) $f$. This relationship is called the measurement equation, and can be generally represented in the form
$$ R = f(\alpha_1, \ldots, \alpha_k). \qquad (6.1) $$
In practice, only the nominal parameter values $(\alpha_1^0, \ldots, \alpha_k^0)$ are known, together with their uncertainties or errors $(\delta\alpha_1, \ldots, \delta\alpha_k)$. The nominal parameter values are given by their respective expectations, while the associated errors and/or uncertainties are given by their respective standard deviations. Processing the experimental data obtained in an indirect measurement is performed with the same objectives as for direct measurements, namely to calculate the expected value $E(R)$ of the measured response $R$, and to calculate the error and/or uncertainty, including confidence intervals, associated with $E(R)$. The higher-order moments of the distribution of $R$ are also of interest, if they can be calculated. It is very important to note here that the measurement equation, Eq. 6.1, can be interpreted to represent not only results of indirect measurements, but also results of computations. In this interpretation, $(\alpha_1, \ldots, \alpha_k)$ are considered to be the parameters underlying the respective computation, $R$ is considered to represent the result or response of the computation, while $f$ represents not only the explicit relationships between parameters and response, but also, implicitly, the relationships among the parameters and the independent and dependent variables comprising the respective mathematical model.
6.2.2 Propagation of Errors (Moments)
The true, but unknown, parameter values $(\alpha_1, \ldots, \alpha_k)$ can be expressed in vector form as
$$ \boldsymbol{\alpha} = \boldsymbol{\alpha}^0 + \delta\boldsymbol{\alpha} = \left(\alpha_1^0 + \delta\alpha_1, \ldots, \alpha_k^0 + \delta\alpha_k\right). $$
As noted in Eq. 6.1, the response is related to the parameters via the measurement equation or computational model. Traditionally, Eq. 6.1 is written in the simpler functional form
$$ R = R(\alpha_1, \ldots, \alpha_k) = R\left(\alpha_1^0 + \delta\alpha_1, \ldots, \alpha_k^0 + \delta\alpha_k\right). \qquad (6.2) $$
In the functional relation above, $R$ is used in the dual role of both a random function and the numerical realization of this function, which is consistent with the notation used for random variables and functions.
Expanding $R(\alpha_1^0 + \delta\alpha_1, \ldots, \alpha_k^0 + \delta\alpha_k)$ in a Taylor series around the nominal values $\boldsymbol{\alpha}^0 = (\alpha_1^0, \ldots, \alpha_k^0)$ and retaining only the terms up to the $n$th order in the variations $\delta\alpha_i \equiv \alpha_i - \alpha_i^0$ around $\alpha_i^0$ gives:
$$
\begin{aligned}
R(\alpha_1, \ldots, \alpha_k) &\cong R\left(\alpha_1^0 + \delta\alpha_1, \ldots, \alpha_k^0 + \delta\alpha_k\right) \\
&= R\left(\boldsymbol{\alpha}^0\right) + \sum_{i_1=1}^{k} \left(\frac{\partial R}{\partial \alpha_{i_1}}\right)_{\boldsymbol{\alpha}^0} \delta\alpha_{i_1}
+ \frac{1}{2} \sum_{i_1, i_2=1}^{k} \left(\frac{\partial^2 R}{\partial \alpha_{i_1} \partial \alpha_{i_2}}\right)_{\boldsymbol{\alpha}^0} \delta\alpha_{i_1} \delta\alpha_{i_2} \\
&\quad + \frac{1}{3!} \sum_{i_1, i_2, i_3=1}^{k} \left(\frac{\partial^3 R}{\partial \alpha_{i_1} \partial \alpha_{i_2} \partial \alpha_{i_3}}\right)_{\boldsymbol{\alpha}^0} \delta\alpha_{i_1} \delta\alpha_{i_2} \delta\alpha_{i_3} + \cdots \\
&\quad + \frac{1}{n!} \sum_{i_1, \ldots, i_n=1}^{k} \left(\frac{\partial^n R}{\partial \alpha_{i_1} \cdots \partial \alpha_{i_n}}\right)_{\boldsymbol{\alpha}^0} \delta\alpha_{i_1} \cdots \delta\alpha_{i_n}. \qquad (6.3)
\end{aligned}
$$
Using the above Taylor-series expansion, the various moments of the random variable $R(\alpha_1, \ldots, \alpha_k)$, namely its mean, variance, skewness, and kurtosis, can be computed by considering that the system parameters $(\alpha_1, \ldots, \alpha_k)$ are random variables distributed according to a joint probability density function $p(\alpha_1, \ldots, \alpha_k)$, with mean values, variances, and covariances defined as:
$$ E(\alpha_i) \equiv \alpha_i^0, \qquad (6.4) $$
$$ \mathrm{var}(\alpha_i, \alpha_i) \equiv \sigma_i^2 \equiv \int_{S_\alpha} \left(\alpha_i - \alpha_i^0\right)^2 p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k, \qquad (6.5) $$
$$ \mathrm{cov}(\alpha_i, \alpha_j) \equiv \int_{S_\alpha} \left(\alpha_i - \alpha_i^0\right)\left(\alpha_j - \alpha_j^0\right) p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k. \qquad (6.6) $$
Even when the joint probability density function $p(\alpha_1, \ldots, \alpha_k)$ is unknown, as is often the case in practice, the means, variances, and covariances of the system parameters $(\alpha_1, \ldots, \alpha_k)$ can be "propagated" to compute the various moments of the response $R(\alpha_1, \ldots, \alpha_k)$. This procedure is called the method of propagation of errors or propagation of moments, and the resulting equations for the various moments of $R(\alpha_1, \ldots, \alpha_k)$ are called the moment propagation equations.
For large complex systems, with many parameters, it is impractical to consider the nonlinear terms in Eq. 6.3. In such cases, the response $R(\alpha_1, \ldots, \alpha_k)$ becomes a linear function of the parameters $(\alpha_1, \ldots, \alpha_k)$ of the form
$$ R(\alpha_1, \ldots, \alpha_k) = R\left(\boldsymbol{\alpha}^0\right) + \sum_{i=1}^{k} \left(\frac{\partial R}{\partial \alpha_i}\right)_{\boldsymbol{\alpha}^0} \delta\alpha_i = R^0 + \sum_{i=1}^{k} S_i\, \delta\alpha_i, \qquad (6.7) $$
where $R^0 \equiv R(\boldsymbol{\alpha}^0)$, and where the quantity $S_i \equiv (\partial R / \partial \alpha_i)_{\boldsymbol{\alpha}^0}$ denotes the sensitivity of the response $R(\alpha_1, \ldots, \alpha_k)$ to the parameter $\alpha_i$. The mean value of $R(\alpha_1, \ldots, \alpha_k)$ is obtained from Eq. 6.7 as
$$
\begin{aligned}
E(R) &= \int_{S_\alpha} \left(\sum_{i=1}^{k} S_i\, \delta\alpha_i\right) p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k + R^0 \\
&= \sum_{i=1}^{k} S_i \int_{S_\alpha} \left(\alpha_i - \alpha_i^0\right) p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k + R^0 \\
&= R^0. \qquad (6.8)
\end{aligned}
$$
The various moments of $R(\alpha_1, \ldots, \alpha_k)$ can be calculated by using Eqs. 6.7 and 6.8; thus, the $l$th central moment $\mu_l(R)$ of $R(\alpha_1, \ldots, \alpha_k)$ is obtained as the following $k$-fold integral over the domain $S_\alpha$ of the parameters $\boldsymbol{\alpha}$:
$$ \mu_l \equiv \mu_l(R) \equiv E\left\{\left[R - E(R)\right]^l\right\} = \int_{S_\alpha} \left(\sum_{i=1}^{k} S_i\, \delta\alpha_i\right)^l p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k. \qquad (6.9) $$
The variance of $R(\alpha_1, \ldots, \alpha_k)$ is calculated by setting $l = 2$ in Eq. 6.9 and by using the result obtained in Eq. 6.8; the detailed calculations are as follows:
$$
\begin{aligned}
\mu_2(R) \equiv \mathrm{var}(R) &\equiv E\left[\left(R - R^0\right)^2\right] = \int_{S_\alpha} \left(\sum_{i=1}^{k} S_i\, \delta\alpha_i\right)^2 p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k \\
&= \sum_{i=1}^{k} S_i^2 \int_{S_\alpha} (\delta\alpha_i)^2\, p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k \\
&\quad + 2 \sum_{i \neq j = 1}^{k} S_i S_j \int_{S_\alpha} (\delta\alpha_i)(\delta\alpha_j)\, p(\alpha_1, \ldots, \alpha_k)\, d\alpha_1\, d\alpha_2 \ldots d\alpha_k \\
&= \sum_{i=1}^{k} S_i^2\, \mathrm{var}(\alpha_i) + 2 \sum_{i \neq j = 1}^{k} S_i S_j\, \mathrm{cov}(\alpha_i, \alpha_j).
\end{aligned}
$$
The result obtained in the above equation can be written in matrix form as
$$ \mathrm{var}(R) = \mathbf{S} \mathbf{V}_\alpha \mathbf{S}^T, \qquad (6.10) $$
where the superscript "$T$" denotes transposition, and $\mathbf{V}_\alpha$ denotes the covariance matrix for the parameters $(\alpha_1, \ldots, \alpha_k)$, with elements defined as
$$ (\mathbf{V}_\alpha)_{ij} = \begin{cases} \mathrm{cov}(\alpha_i, \alpha_j) = \rho_{ij} \sigma_i \sigma_j, & i \neq j, \quad \rho_{ij} \equiv \text{correlation coefficient}, \\ \mathrm{var}(\alpha_i) = \sigma_i^2, & i = j, \end{cases} $$
and the vector $\mathbf{S} = (S_1, \ldots, S_k)$, with components $S_i = (\partial R / \partial \alpha_i)_{\boldsymbol{\alpha}^0}$, denotes the sensitivity vector. Equation 6.10 is colloquially known as the sandwich rule. If the system parameters are uncorrelated, Eq. 6.10 takes on the simpler form
$$ \mathrm{var}(R) = \sum_{i=1}^{k} S_i^2\, \mathrm{var}(\alpha_i) = \sum_{i=1}^{k} S_i^2 \sigma_i^2. \qquad (6.11) $$
The above concepts can be readily extended from a single response to $n$ responses that are functions of the parameters $(\alpha_1, \ldots, \alpha_k)$. In vector notation, the $n$ responses are represented as the column vector
$$ \mathbf{R} = (R_1, \ldots, R_n). \qquad (6.12) $$
In this case, the vector-form equivalent of Eq. 6.7 is the following linear, first-order Taylor expansion of $\mathbf{R}(\boldsymbol{\alpha})$:
$$ \mathbf{R}\left(\boldsymbol{\alpha}^0 + \delta\boldsymbol{\alpha}\right) = \mathbf{R}\left(\boldsymbol{\alpha}^0\right) + \delta\mathbf{R} \cong \mathbf{R}\left(\boldsymbol{\alpha}^0\right) + \mathbf{S}\, \delta\boldsymbol{\alpha}, \qquad (6.13) $$
where $\mathbf{S}$ is a rectangular matrix of order $n \times k$ with elements representing the sensitivity of the $j$th response to the $i$th system parameter, namely
$$ (\mathbf{S})_{ji} = \partial R_j / \partial \alpha_i. \qquad (6.14) $$
The expectation $E(\mathbf{R})$ of $\mathbf{R}$ is obtained by following the same procedure as that leading to Eq. 6.8, to obtain
$$ E(\mathbf{R}) = \mathbf{R}^0. \qquad (6.15) $$
The covariance matrix $\mathbf{V}_R$ for $\mathbf{R}$ is obtained by following the same procedure as that leading to Eq. 6.10; this yields
$$ \mathbf{V}_R = E\left[\mathbf{S}\, \delta\boldsymbol{\alpha}\left(\mathbf{S}\, \delta\boldsymbol{\alpha}\right)^T\right] = \mathbf{S}\, E\left(\delta\boldsymbol{\alpha}\, \delta\boldsymbol{\alpha}^T\right) \mathbf{S}^T = \mathbf{S} \mathbf{V}_\alpha \mathbf{S}^T, \qquad (6.16) $$
where the superscript "$T$" denotes transposition. Note that Eq. 6.16 has the same "sandwich" form as Eq. 6.10 for a single response.
The equations for the propagation of higher-order moments become increasingly complex and are seldom used in practice. For example, for a single response $R(\alpha_1, \ldots, \alpha_k)$ and uncorrelated parameters $(\alpha_1, \ldots, \alpha_k)$, the respective propagation of moments equations can be obtained from Eq. 6.3, after a considerable amount of algebra, as follows:
$$
\begin{aligned}
E(R) = R\left(\alpha_1^0, \ldots, \alpha_k^0\right) &+ \frac{1}{2} \sum_{i=1}^{k} \left(\frac{\partial^2 R}{\partial \alpha_i^2}\right)_{\boldsymbol{\alpha}^0} \mu_2(\alpha_i) + \frac{1}{6} \sum_{i=1}^{k} \left(\frac{\partial^3 R}{\partial \alpha_i^3}\right)_{\boldsymbol{\alpha}^0} \mu_3(\alpha_i) + \frac{1}{24} \sum_{i=1}^{k} \left(\frac{\partial^4 R}{\partial \alpha_i^4}\right)_{\boldsymbol{\alpha}^0} \mu_4(\alpha_i) \\
&+ \frac{1}{4} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \left(\frac{\partial^4 R}{\partial \alpha_i^2 \partial \alpha_j^2}\right)_{\boldsymbol{\alpha}^0} \mu_2(\alpha_i)\, \mu_2(\alpha_j)\,;
\end{aligned} \qquad (6.17)
$$
$$
\begin{aligned}
\mu_2(R) = \sum_{i=1}^{k} \left\{\left(\frac{\partial R}{\partial \alpha_i}\right)^2\right\}_{\boldsymbol{\alpha}^0} \mu_2(\alpha_i) &+ \sum_{i=1}^{k} \left\{\frac{\partial R}{\partial \alpha_i} \frac{\partial^2 R}{\partial \alpha_i^2}\right\}_{\boldsymbol{\alpha}^0} \mu_3(\alpha_i) + \frac{1}{3} \sum_{i=1}^{k} \left\{\frac{\partial R}{\partial \alpha_i} \frac{\partial^3 R}{\partial \alpha_i^3}\right\}_{\boldsymbol{\alpha}^0} \mu_4(\alpha_i) \\
&+ \frac{1}{4} \sum_{i=1}^{k} \left\{\left(\frac{\partial^2 R}{\partial \alpha_i^2}\right)^2\right\}_{\boldsymbol{\alpha}^0} \left[\mu_4(\alpha_i) - \left(\mu_2(\alpha_i)\right)^2\right];
\end{aligned} \qquad (6.18)
$$
$$
\mu_3(R) = \sum_{i=1}^{k} \left\{\left(\frac{\partial R}{\partial \alpha_i}\right)^3\right\}_{\boldsymbol{\alpha}^0} \mu_3(\alpha_i) + \frac{3}{2} \sum_{i=1}^{k} \left\{\left(\frac{\partial R}{\partial \alpha_i}\right)^2 \frac{\partial^2 R}{\partial \alpha_i^2}\right\}_{\boldsymbol{\alpha}^0} \left[\mu_4(\alpha_i) - \left(\mu_2(\alpha_i)\right)^2\right]; \qquad (6.19)
$$
$$
\mu_4(R) = \sum_{i=1}^{k} \left\{\left(\frac{\partial R}{\partial \alpha_i}\right)^4\right\}_{\boldsymbol{\alpha}^0} \left[\mu_4(\alpha_i) - 3\left(\mu_2(\alpha_i)\right)^2\right] + 3\left[\mu_2(R)\right]^2. \qquad (6.20)
$$
In Eqs. 6.17–6.20, the quantities $\mu_l(R)$, $(l = 1, \ldots, 4)$, denote the central moments of the response $R(\alpha_1, \ldots, \alpha_k)$, while the quantities $\mu_l(\alpha_i)$, $(i = 1, \ldots, k;\ l = 1, \ldots, 4)$, denote the respective central moments of the parameters $(\alpha_1, \ldots, \alpha_k)$. Note that $E(R) \neq R^0$ when the response $R(\alpha_1, \ldots, \alpha_k)$ is a nonlinear function of the parameters $(\alpha_1, \ldots, \alpha_k)$. As has been already mentioned, Eqs. 6.17–6.20 are
valid for uncorrelated parameters only.
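To make Eqs. 6.17–6.20 concrete, the following sketch evaluates Eqs. 6.17 and 6.18 for the single quadratic response $R = \alpha^2$ with one normally distributed parameter, for which $\mu_2 = \sigma^2$, $\mu_3 = 0$, and $\mu_4 = 3\sigma^4$, and compares the results against a direct Monte Carlo estimate; this case is chosen so that the truncated expansions happen to be exact (the numbers are hypothetical):

import numpy as np

# Sketch verifying Eqs. 6.17-6.18 for the single response R = a**2 with one
# normally distributed parameter (mu2 = s**2, mu3 = 0, mu4 = 3 s**4).
a0, s = 3.0, 0.5
mu2, mu3, mu4 = s**2, 0.0, 3 * s**4
dR, d2R, d3R = 2 * a0, 2.0, 0.0            # derivatives of a**2 at a0

E_R = a0**2 + 0.5 * d2R * mu2 + d3R * mu3 / 6                    # Eq. 6.17
var_R = (dR**2 * mu2 + dR * d2R * mu3 + dR * d3R * mu4 / 3
         + 0.25 * d2R**2 * (mu4 - mu2**2))                       # Eq. 6.18

samples = np.random.default_rng(1).normal(a0, s, 1_000_000) ** 2
print(E_R, samples.mean())    # ~9.25 both; note E(R) != R0 = 9.0
print(var_R, samples.var())   # ~9.125 both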
It is important to emphasize that the “propagation of moments” equations are
used not only for processing experimental data obtained from indirect measurements, but also for performing uncertainty analysis of computational models. In the
latter case, the “propagation of errors” equations provide a systematic way of obtaining the uncertainties in computed results, arising not only from uncertainties in
the parameters that enter the respective computational model, but also from the numerical approximations themselves. The efficient computation of sensitivities and,
subsequently, of uncertainties in results produced by various models (algebraic, differential, integral, etc.) forms the objectives of subsequent sections in this chapter.
As a simple illustrative example of using Eq. 6.10, consider the computation of the standard deviation $\sigma_{R_1}$ of a response $R_1 \equiv \alpha_1 \alpha_2$, where $\alpha_1$ and $\alpha_2$ are two correlated parameters, with variances $\sigma_1^2$, $\sigma_2^2$, and covariance $V_{12}$. The sensitivities of $R_1$ to $\alpha_1$ and $\alpha_2$ can be readily calculated as
$$ S_1 = \left.\frac{\partial R_1}{\partial \alpha_1}\right|_{\boldsymbol{\alpha}^0} = \alpha_2^0, \qquad S_2 = \left.\frac{\partial R_1}{\partial \alpha_2}\right|_{\boldsymbol{\alpha}^0} = \alpha_1^0. \qquad (6.21) $$
Substituting the above sensitivities in Eq. 6.10 yields the following result for the relative standard deviation of $R_1$:
$$ \frac{\sigma_{R_1}}{R_1} = \left[\frac{\sigma_1^2}{\left(\alpha_1^0\right)^2} + \frac{\sigma_2^2}{\left(\alpha_2^0\right)^2} + 2\,\frac{V_{12}}{\alpha_1^0 \alpha_2^0}\right]^{1/2}. \qquad (6.22) $$
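The following sketch (with hypothetical nominal values and covariance data) evaluates Eq. 6.22 and compares it with a direct Monte Carlo estimate obtained by sampling the two correlated parameters from a bivariate normal distribution; the small residual difference reflects the nonlinear terms neglected in Eq. 6.7:

import numpy as np

# Sketch checking Eq. 6.22 for R1 = a1 * a2 with correlated parameters.
a0 = np.array([2.0, 5.0])
V = np.array([[0.04, 0.03],     # variances s1**2, s2**2 and covariance V12
              [0.03, 0.25]])

rel_var = V[0, 0] / a0[0]**2 + V[1, 1] / a0[1]**2 + 2 * V[0, 1] / (a0[0] * a0[1])
print("Eq. 6.22   :", np.sqrt(rel_var))

samples = np.random.default_rng(2).multivariate_normal(a0, V, 1_000_000)
R1 = samples[:, 0] * samples[:, 1]
print("Monte Carlo:", R1.std() / R1.mean())   # close, up to nonlinear terms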
6.3 Statistical Methods for Sensitivity and Uncertainty Analysis
There are many methods, based either on deterministic or statistical concepts, for
performing sensitivity and uncertainty analysis. However, despite this variety of
methods, or perhaps because of it, a precise, unified terminology, across all methods, does not seem to exist yet, even though many of the same words are used by
the practitioners of the various methods. For example, even the word “sensitivity” as
used by analysts employing statistical methods may not necessarily mean or refer to
the same quantity as would be described by the same word, “sensitivity,” when used
by analysts employing deterministic methods. Care must therefore be exercised,
since identical words may not necessarily describe identical quantities, particularly
when comparing deterministic to statistical methods. Furthermore, conflicting and
contradictory claims are often made about the relative strengths and weaknesses of
the various methods.
Sensitivity and uncertainty analysis procedures can be either local or global in
scope. The objective of local analysis is to analyze the behavior of the system response locally around a chosen point (for static systems) or chosen trajectory (for
dynamical systems) in the combined phase space of parameters and state variables.
On the other hand, the objective of global analysis is to determine all of the system’s
critical points (bifurcations, turning points, response maxima, minima, and/or saddle
points) in the combined phase space formed by the parameters and dependent (state)
variables, and subsequently analyze these critical points by local sensitivity and uncertainty analysis. The methods for sensitivity and uncertainty analysis are based
on either statistical or deterministic procedures. In principle, both types of procedures can be used for either local or for global sensitivity and uncertainty analysis,
although, in practice, deterministic methods are used mostly for local analysis while
statistical methods are used for both local and global analysis.
This section, 6.3, highlights the salient features of the most popular statistical
procedures currently used for local and global sensitivity and uncertainty analysis. These statistical procedures can be classified as follows: sampling-based
methods (random sampling, stratified importance sampling, and Latin Hypercube
sampling), first- and second-order reliability methods (FORM and SORM, respectively), variance-based methods (correlation ratio-based methods, the Fourier amplitude sensitivity test (FAST), and Sobol’s method), and screening design methods
(classical one-at-a-time (OAT) experiments, global one-at-a-time design methods,
systematic fractional replicate designs (SFRD), and sequential bifurcation (SB) designs). It is important to note that all statistical uncertainty and sensitivity analysis
methods first commence with the “uncertainty analysis” stage, and only subsequently proceed to the “sensitivity analysis” stage. This procedural path is the
reverse of the procedural (and conceptual) path underlying the deterministic methods of sensitivity and uncertainty analysis, where the sensitivities are determined
prior to using them for uncertainty analysis.
6.3.1 Sampling-Based Methods
If the uncertainty associated with the parameters $\boldsymbol{\alpha}$ were known unambiguously, then the uncertainty in the response $R(\boldsymbol{\alpha})$ could also be assessed unambiguously. In practice, however, the uncertainty in $\boldsymbol{\alpha}$ can rarely be specified unambiguously; most often, many possible values of $\boldsymbol{\alpha}$, of varying levels of plausibility, could be considered. Such uncertainties can be characterized by assigning a distribution of plausible values
$$ D_1, D_2, \ldots, D_I, \qquad (6.23) $$
to each component $\alpha_i(x)$ of $\boldsymbol{\alpha}$. Correlations and other restrictions can also be considered to affect the parameters $\alpha_i(x)$. Uncertainties characterized by distributions of the form (6.23) are called epistemic or subjective uncertainties, and characterize a degree of belief regarding the location of the appropriate value of each $\alpha_i(x)$. In turn, these subjective uncertainties for the parameters $\alpha_i(x)$ lead to subjective uncertainties for the response $R(\boldsymbol{\alpha})$, which reflect a corresponding degree of belief regarding the location of the appropriate response values as the outcome of analyzing the model under consideration.
Sampling-based methods for sensitivity and uncertainty analysis are based on a sample
$$ \boldsymbol{\alpha}^k = \left[\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{kI}\right], \quad (k = 1, 2, \ldots, n_S), \qquad (6.24) $$
of size $n_S$, taken from the possible values of $\boldsymbol{\alpha}$ as characterized by the distributions in Eq. 6.23. The response evaluations corresponding to the sample $\boldsymbol{\alpha}^k$ defined in Eq. 6.24 can be represented in vector form as
$$ \mathbf{R}(\boldsymbol{\alpha}^k) = \left[R_1(\boldsymbol{\alpha}^k), R_2(\boldsymbol{\alpha}^k), \ldots, R_J(\boldsymbol{\alpha}^k)\right], \quad (k = 1, 2, \ldots, n_S), \qquad (6.25) $$
where the subscript $J$ denotes the number of components of the response $\mathbf{R}$. The pairs
$$ \left[\boldsymbol{\alpha}^k, \mathbf{R}(\boldsymbol{\alpha}^k)\right], \quad (k = 1, 2, \ldots, n_S), \qquad (6.26) $$
represent a mapping of the uncertain "inputs" $\boldsymbol{\alpha}^k$ to the corresponding uncertain "outputs" $\mathbf{R}(\boldsymbol{\alpha}^k)$, which result from the "sampling-based uncertainty analysis." Subsequent examination and post-processing (e.g., scatter plots, regression analysis, partial correlation analysis) of the mapping represented by Eq. 6.26 constitute procedures for "sampling-based sensitivity analysis," in that such procedures provide means of investigating the effects of the elements of $\boldsymbol{\alpha}$ on the elements of $\mathbf{R}(\boldsymbol{\alpha})$. Specifically, a "sampling-based uncertainty and sensitivity analysis" involves five
steps, as follows:
Step 1: Define the subjective distributions $D_i$ described by Eq. 6.23 for characterizing the uncertain input parameters. Note that this first step is actually
the most important step of the entire sampling-based procedure! Because
of its fundamental importance, the characterization of subjective uncertainty has been widely studied (see, e.g. [2, 6, 36]). In practice, this step
invariably involves formal expert review processes. Two of the largest
examples of analyses that used formal expert review processes to assign
subjective uncertainties to input parameters are the US Nuclear Regulatory
Commission’s reassessment of the risks from commercial nuclear reactor
power stations (1990), and the assessment of seismic risk in the Eastern
USA (1990–1991) [62]. Although formal statistical procedures can occasionally be used for constructing subjective distributions, experience has
shown that it is more useful to specify selected quantile (minimum, median, maximum, etc.) values, rather than attempting to specify a particular
type of distribution (e.g., normal, beta, etc.) and its associated parameters.
This is because experts are more likely to be able to justify the selection of
specific quantile values than the selection of a particular form of distribution
with specific parameters. When distributions from several expert opinions
are combined, it is practically very difficult to assign weights to the respective opinions; these difficulties are discussed, for example, by Clement and
Winkler [20].
Once a subjective distribution $D_i$ has been assigned to each element $\alpha_i(x)$ of $\boldsymbol{\alpha}$, the collection of distributions of the form given by Eq. 6.23 defines a probability space $(S, E, p)$, which is a formal structure where: (i) $S$ denotes the sample space (containing everything that could occur in the particular universe under consideration; the elements of $S$ are elementary events); (ii) $E$ denotes an appropriately restricted subspace of $S$, for which probabilities are defined; and (iii) $p$ denotes a probability
measure.
Step 2: Use the distributions described by Eq. 6.23 to generate the sample $\boldsymbol{\alpha}^k$ described by Eq. 6.24. The most widely used sampling procedures are: random sampling, importance sampling, and Latin Hypercube sampling. In random sampling, each sample point is selected independently of all other sample points. However, there is no guarantee that points will be sampled from all subregions of $S$; furthermore, if sampled values fall closely together, the sampling of $S$ is quite inefficient. To alleviate these shortcomings, importance sampling is used to ensure that specified regions in the sample space are fully covered. The idea of fully covering the range of each parameter is further extended in the Latin Hypercube sampling procedure (see, e.g. [46]). In this procedure, the range of each parameter $\alpha_i$ is divided into $n_{LH}$ intervals of equal probability, and one value is randomly selected from each interval. The $n_{LH}$ values thus obtained for the first parameter, $\alpha_1$, are then randomly paired, without replacement, with the $n_{LH}$ values obtained for $\alpha_2$. In turn, these pairs are combined randomly, without replacement, with the $n_{LH}$ values for $\alpha_3$ to form $n_{LH}$ triples. This process is continued until one obtains a set of $n_{LH}$ $I$-tuples, of the form $\boldsymbol{\alpha}^k = [\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{kI}]$, $(k = 1, 2, \ldots, n_{LH})$, which is called a Latin Hypercube sample (a minimal code sketch of this procedure is given at the end of this subsection). Latin Hypercube sampling is suitable for uncorrelated parameters only; if the parameters are correlated, then the respective correlation structure must be incorporated into the sample, for otherwise the ensuing uncertainty/sensitivity analysis would yield false
results. To incorporate parameter correlations into the sample, Iman and
Conover [37] proposed a restricted pairing technique for generating Latin
Hypercubes based on rank correlations (i.e., correlations between ranktransformed parameters) rather than sample correlations (i.e., correlations
between the original, untransformed, parameters).
Since random sampling is easy to implement and provides unbiased estimates for the means, variances, and distribution functions, it is the preferred
technique in practice, if large samples are available. However, a sufficiently
“large sample” for producing meaningful results by random sampling cannot be generated for complex models (with many parameters) and/or for
estimating extremely high quantiles (e.g., the 0.99999 quantile), since the
computation of the required sample becomes prohibitively expensive and
impractical. In such cases, the sampling method of choice becomes
the stratified sampling method, in which the sampling space is partitioned
into regions called "strata." The main difficulty in implementing stratified
sampling lies in defining the strata and in calculating the probabilities
for the respective strata, unless considerable a priori knowledge is already
available for this purpose. For example, the fault and event trees used in
risk assessment studies of nuclear power plants and other complex engineering facilities can be used as algorithms for defining stratified sampling
procedures. Latin Hypercube sampling is used when very high quantiles
need not be estimated, but the calculations needed for generating the “large
sample” required for random sampling still remain impractical. This is often the case in practice when assessing the effects of subjective uncertainty
in medium-sized problems (e.g., ca. 30 parameters) while a 0.9–0.95 quantile is adequate for indicating the location of a likely outcome. For such
problems, random sampling is still not feasible computationally, but the unbiased means and distribution functions provided by the full stratification
(i.e., each parameter is treated equally) of the Latin Hypercube sampling
make it the preferred alternative over importance sampling, since the unequal strata probabilities used in importance sampling produce results that
are difficult to interpret (particularly for subsequent sensitivity analysis). In
this sense, Latin Hypercube sampling provides a compromise alternative to importance
sampling when a priori knowledge of the relationships between the sampled parameters and predicted responses is not available.
Step 3: Use each of the elements of the sample $\boldsymbol{\alpha}^k$ in order to perform model recalculations, which then generate the responses $\mathbf{R}(\boldsymbol{\alpha}^k)$ described by Eq. 6.25. These model recalculations can become the most expensive computational part of the entire uncertainty and sensitivity analysis and, if the model is complex, the limited number of feasible model recalculations may severely limit the sample size and the other aspects of the overall
analysis.
Step 4: Perform "uncertainty analysis" of the response $R(\boldsymbol{\alpha})$, by generating displays of the uncertainty in $R(\boldsymbol{\alpha})$ using the results for $R(\boldsymbol{\alpha}^k)$ obtained above, in Step 3. It is customary to display the estimated expected value and the estimated variance of the response (as estimated from the sample).
However, these quantities may not be the most useful indicators about the
response, since information about the physical system under consideration
is always lost in the computations of means and variances. In particular,
the mean and variance are less useful for summarizing information about
the distribution of subjective uncertainties; by comparison, quantiles associated with the respective distribution provide a more meaningful locator for
the quantity under consideration. Distribution functions (e.g., cumulative
and/or complementary distribution functions, density functions) provide the
complete information that can be extracted from the sample under consideration.
Step 5: Perform "sensitivity analysis" of the response $R(\boldsymbol{\alpha})$ to the parameters $\boldsymbol{\alpha}$, by exploring the mappings represented by Eq. 6.26, to assess the effects of the components of $\boldsymbol{\alpha}$ on the components of $R(\boldsymbol{\alpha})$. In the context of sampling-based methods, statistical sensitivity analysis (as opposed to deterministic sensitivity analysis) involves the exploration of the mapping represented by Eq. 6.26 to assess the effects of some, but not all, of the individual components of $\boldsymbol{\alpha}$ on the response $R(\boldsymbol{\alpha})$. This exploration includes
examination of scatter plots, regression and stepwise regression analysis,
correlation and partial correlation analysis, rank transformation, identification of non-monotonic patterns, and identification of nonrandom patterns.
It is important to note that correlated variables introduce unstable regression coefficients, in that the values of these coefficients become sensitive to
the specific variables introduced into the regression model. In such situations, the regression coefficients of a regression model that includes all of
the parameters are likely to give misleading indications of parameter importance. If several input parameters are suspected (or known) to be highly
correlated, it is usually recommended to transform the respective parameters so as to remove the correlations or, if this is not possible, to analyze
the full model by using a sequence of regression models with all but one of
the parameters removed, in turn. Note, however, that the regression model
should attempt to match the trend displayed by the collective sample rather
than match the predictions associated with individual sample parameters;
otherwise over-fitting of data could arise if parameters are arbitrarily forced
into the regression model.
Regression models based on linear representations of the impact of parameters on the response will perform poorly when the relationships between the parameters and the response are nonlinear. In such cases, the rank transformation may be used to improve the construction of the respective regression model. The conceptual framework underlying the rank transformation involves simply replacing the parameters by their respective ranks, and then performing the customary regression analysis on the ranks rather than on the corresponding parameters (see, e.g. [37, 55, 57]). Thus, if the number of observations is $M$, then the smallest value of each parameter is assigned rank 1, the next smallest value is assigned rank 2, and so on, until the largest value, which is assigned rank $M$; if several parameters have the same value, they are assigned an averaged rank. The regression analysis is then performed by using the ranks as input/output parameters, as replacements for the actual parameter/response values. This replacement has the effect of replacing the linearized parameter/response relationships by rank-transformed monotonic input/output relationships in an otherwise conventional regression analysis. In practice, a regression analysis using the rank-transformed (instead of raw) data may yield better results, but only as long as the relationships between parameters and responses are monotonic, albeit nonlinear. Otherwise, the rank transformation does not significantly improve the quality of the results produced by regression analysis.
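As a concrete illustration, the following Python sketch contrasts a correlation computed on raw values with one computed on ranks; the monotonic, strongly nonlinear test response and the sample size are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Sketch of the rank transformation: replace raw parameter/response
# values by their ranks before the (here, simple linear) correlation
# analysis. The monotonic nonlinear test model is illustrative.

rng = np.random.default_rng(1)
alpha = rng.random(200)                 # sampled parameter values
R = np.exp(3.0 * alpha)                 # monotonic but strongly nonlinear

raw_r = np.corrcoef(alpha, R)[0, 1]     # linear (Pearson) correlation
rank_r, _ = stats.spearmanr(alpha, R)   # correlation on ranks (Spearman)
print(f"raw: {raw_r:.3f}  rank-transformed: {rank_r:.3f}")
# the rank correlation is ~1.0 because the relation is monotonic; for a
# non-monotonic relation the rank transformation would not help
```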
6.3.2 Reliability Algorithms: FORM and SORM
In many practical problems, the primary interest of the analyst may be focused on a
particular mode of failure of the system under consideration, while the detailed spectrum of probabilistic outcomes may be of secondary concern. For such problems, the
so-called reliability algorithms provide much faster and more economical answers
(in comparison to the sampling-based methods discussed in Section 6.3.1) regarding
the particular mode of failure of the system under consideration. The typical problems that can be analyzed by using reliability algorithms must be characterized by a
mathematical model (whose solution can be obtained analytically or numerically),
by input parameters that can be treated as being affected by subjective (epistemic)
uncertainties, and by a threshold level that specifies mathematically the concept
of "failure." The reliability algorithms most often used are the first-order reliability methods (FORM) and second-order reliability methods (SORM),
respectively. Both of these methods use optimization algorithms to seek “the most
likely failure point” in the space of uncertain parameters, using the mathematical
model and the response functional that defines failure. Once this most likely failure point (referred to as the “design point”) has been determined, the probability of
failure is approximately evaluated by fitting a first- or second-order surface at that
point in parameter space. Reliability algorithms have been applied to a variety of
problems, including structural safety (see, e.g. [45]), offshore oil field design and
operation (see, e.g. [8]) and multiphase flow and transport in subsurface hydrology
(see, e.g. [66]).
The FORM and SORM algorithms are susceptible to non-convergence or convergence to an erroneous design point, particularly when the failure probability
approaches the extreme values of 0.0 or 1.0. Therefore, the numerical optimization algorithm and convergence tolerances underlying FORM and SORM should be
tailored, whenever possible, to the specific problem under investigation.
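To fix ideas, the following Python sketch implements the FORM recipe just described for a hypothetical limit-state function (the function, its threshold, and the standard-normal parameter space are illustrative assumptions): an optimization algorithm seeks the design point, and the failure probability is then approximated from the first-order reliability index.

```python
import numpy as np
from scipy import optimize, stats

# Minimal FORM sketch: locate the "design point" (most likely failure
# point) in standard-normal parameter space and approximate the failure
# probability as Phi(-beta). The limit-state function g (failure when
# g(u) <= 0) is an illustrative assumption.

def g(u):
    return 5.0 - (u[0] + 2.0 * u[1])

# design point: the point on the failure surface g(u) = 0 closest to
# the origin; its distance beta is the reliability index
res = optimize.minimize(lambda u: u @ u, x0=np.array([1.0, 1.0]),
                        constraints={"type": "eq", "fun": g})
beta = np.sqrt(res.fun)
p_fail = stats.norm.cdf(-beta)        # first-order failure probability
print(f"beta = {beta:.3f}, P(failure) ~ {p_fail:.2e}")
# for this linear g the exact answer is beta = 5/sqrt(5) ~ 2.236, so the
# first-order estimate is exact; SORM would add curvature corrections
```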
6.3.3 Variance-Based Methods
These methods use variance, among other indicators, as a measure of the importance
of a parameter in contributing to the overall uncertainty in the response. The concept
of variance as a measure of the importance of a parameter also underlies the conceptual foundation of three further methods for statistical uncertainty and sensitivity
analysis, namely the Fourier Amplitude Sensitivity Test (FAST), Sobol’s method, and
the correlation-ratio method (including variants thereof). It is important to note that,
in contrast to the sampling-based methods discussed in Section 6.3.1, the correlation
ratio, the FAST, and Sobol’s methods do not make the a priori assumption that the
input model parameters are linearly related to the model’s response.
The FAST procedure was originally proposed by Cukier et al. [23], and was subsequently extended by Cukier's group and by other authors. This procedure uses the following Fourier transformation of the parameters $\alpha_i$:

$$\alpha_i = F_i \sin(\omega_i z), \qquad i = 1, \ldots, I, \qquad (6.27)$$

where $\{\omega_i\}$ is a set of integer frequencies, while $z \in (-\pi, \pi)$ is a scalar variable. The expectation $E(R)$ and the variance of the response $R$ can be approximated, respectively, as

$$E(R) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z)\, dz, \quad \text{and} \quad \mathrm{Var}(R) \simeq 2 \sum_{j=1}^{\infty} \left( A_j^2 + B_j^2 \right), \qquad (6.28)$$

where

$$f(z) \equiv f\left[ F_1 \sin(\omega_1 z), F_2 \sin(\omega_2 z), \ldots, F_I \sin(\omega_I z) \right], \qquad (6.29)$$

while

$$A_j \equiv \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z) \cos(jz)\, dz, \qquad (6.30)$$

$$B_j \equiv \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z) \sin(jz)\, dz. \qquad (6.31)$$
The transformation given by Eq. 6.27 should provide, for each parameter $\alpha_i$, a uniformly distributed sample within the unit $I$-dimensional cube. As $z \in (-\pi, \pi)$ varies for a given transformation, all parameters change simultaneously; however, their respective ranges of uncertainty are systematically and exhaustively explored (i.e., the search curve is space-filling) if and only if the set of frequencies $\{\omega_i\}$ is incommensurate (i.e., if none of the frequencies $\omega_i$ can be obtained as a linear combination, with integer coefficients, of the remaining frequencies).
The first-order sensitivity indices are computed by evaluating the coefficients $A_j$ and $B_j$ for the fundamental frequencies $\{\omega_i\}$ and their higher harmonics $p\omega_i$ $(p = 1, 2, \ldots)$. If the frequencies $\{\omega_i\}$ are integers, the contribution to the total variance $\mathrm{Var}(R)$ coming from the partial variance $D_i$ corresponding to parameter $\alpha_i$ is approximately obtained as

$$D_i \simeq 2 \sum_{p=1}^{M} \left( A_{p\omega_i}^2 + B_{p\omega_i}^2 \right), \qquad (6.32)$$

where $M$ is the maximum harmonic taken into consideration (usually $M \le 6$). The ratio of the partial variance $D_i$ to the total variance $\mathrm{Var}(R)$ provides the so-called first-order sensitivity index. The minimum sample size required to compute $D_i$ is $(2M\omega_{\max} + 1)$, where $\omega_{\max}$ is the maximum frequency in the set $\{\omega_i\}$ (see, e.g. [56]). Furthermore, the frequencies that do not belong to the set $\{p_1\omega_1, p_2\omega_2, \ldots, p_I\omega_I\}$, for $(p_i = 1, 2, \ldots, \infty)$ and any $(i = 1, 2, \ldots, I)$, contain information about the residual variance $[\mathrm{Var}(R) - D_i]$ that is not accounted for by the first-order indices. Saltelli et al. have proposed a method that extracts information regarding this residual variance in $(I \times N_S)$ computations, where $N_S$ is the respective sample size.
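The following Python sketch implements the FAST estimates of Eqs. 6.27–6.32; the test model, the amplitudes $F_i$, and the integer frequencies $\omega_i$ are illustrative assumptions (production implementations add careful frequency-selection schemes and resampling).

```python
import numpy as np

# Sketch of the FAST first-order indices (Eqs. 6.27-6.32); the model,
# amplitudes F_i, and integer frequencies omega_i are illustrative.

def fast_first_order(model, F, omega, M=4):
    omega = np.asarray(omega)
    n = 2 * M * omega.max() + 1               # minimum sample size
    z = np.linspace(-np.pi, np.pi, n, endpoint=False)
    alpha = F[:, None] * np.sin(omega[:, None] * z)  # search curve, Eq. 6.27
    f = np.array([model(a) for a in alpha.T])
    j = np.arange(1, M * omega.max() + 1)
    A = (f * np.cos(np.outer(j, z))).mean(axis=1)    # Eq. 6.30
    B = (f * np.sin(np.outer(j, z))).mean(axis=1)    # Eq. 6.31
    total_var = 2.0 * np.sum(A**2 + B**2)            # Eq. 6.28
    # partial variance D_i from the harmonics p*omega_i of each parameter
    D = np.array([2.0 * sum(A[p * w - 1]**2 + B[p * w - 1]**2
                            for p in range(1, M + 1)) for w in omega])
    return D / total_var                             # first-order indices

F = np.array([1.0, 1.0])
print(fast_first_order(lambda a: a[0] + 2.0 * a[1], F, omega=[11, 21]))
# approximately [0.2, 0.8]: the second parameter dominates the variance
```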
The method due to Sobol' relies on a particular case of a theorem originally proven by Kolmogorov, in which a multivariate function $f(x_1, x_2, \ldots, x_n)$ is decomposed into summands of increasing dimensionality, of the form

$$f(x_1, x_2, \ldots, x_n) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j) + \cdots + f_{12\ldots n}(x_1, x_2, \ldots, x_n). \qquad (6.33)$$
The above decomposition of $f(x_1, x_2, \ldots, x_n)$ is reminiscent of the ANOVA decomposition. When the quantities $x_i$ are uncorrelated, the above decomposition is unique and has the following properties:

(i) The integral of any summand over any of its own variables is zero, i.e.,

$$\int_0^1 f_{i_1 i_2 \ldots i_n}(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\, dx_{i_m} = 0, \qquad \text{if } 1 \le m \le n; \qquad (6.34)$$

(ii) The summands are orthogonal, i.e.,

$$\int_{[0,1]^n} f_{i_1 i_2 \ldots i_n}\, f_{j_1 j_2 \ldots j_m}\, dx = 0, \qquad \text{if } (i_1, i_2, \ldots, i_n) \ne (j_1, j_2, \ldots, j_m); \qquad (6.35)$$
(iii) $f_0$ is a constant, namely

$$f_0 = \int_{[0,1]^n} f(x)\, dx. \qquad (6.36)$$
By squaring Eq. 6.33 and integrating the resulting expression over the unit cube $[0,1]^n$, the following relation is obtained for the total variance $D$ of $f(x)$:

$$D = \int_{[0,1]^n} f^2(x)\, dx - f_0^2 = \sum_{i=1}^{n} D_i + \sum_{1 \le i < j \le n} D_{ij} + \cdots + D_{12\ldots n}, \qquad (6.37)$$

where the partial variances of $f(x)$ are defined as

$$D_{i_1 i_2 \ldots i_m} = \int_0^1 \cdots \int_0^1 f_{i_1 i_2 \ldots i_m}^2(x_{i_1}, x_{i_2}, \ldots, x_{i_m})\, dx_{i_1} \ldots dx_{i_m}, \qquad (6.38)$$

for $1 \le i_1 < \ldots < i_m \le n$, $m = 1, \ldots, n$.
The sensitivity indices are defined as

$$S_{i_1 i_2 \ldots i_m} \equiv D_{i_1 i_2 \ldots i_m} / D, \qquad \text{for } 1 \le i_1 < \ldots < i_m \le n, \; m = 1, \ldots, n. \qquad (6.39)$$

The first-order sensitivity index, $S_i$, for the parameter $x_i$ indicates the fractional contribution of $x_i$ to the variance $D$ of $f(x)$; the second-order sensitivity index, $S_{ij}$ $(i \ne j)$, measures the part of the variation in $f(x)$ due to $x_i$ and $x_j$ that cannot be explained by the sum of the individual effects of $x_i$ and $x_j$; and so on. Note also that Eqs. 6.38 and 6.39 imply that

$$\sum_{i=1}^{n} S_i + \sum_{1 \le i < j \le n} S_{ij} + \cdots + S_{12\ldots n} = 1. \qquad (6.40)$$
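Since the partial variances in Eq. 6.38 are rarely computable analytically, they are in practice estimated by sampling. The following Python sketch (the additive test function and sample size are illustrative assumptions) estimates the first-order indices $S_i$ of Eq. 6.39 with the common Monte Carlo "pick-freeze" estimator.

```python
import numpy as np

# Monte Carlo sketch of Sobol' first-order indices S_i = D_i / D
# (Eqs. 6.38-6.39) via the "pick-freeze" estimator; the test function
# and sample size are illustrative assumptions.

rng = np.random.default_rng(42)

def sobol_first_order(model, n_params, n_samples=200_000):
    A = rng.random((n_samples, n_params))   # two independent samples
    B = rng.random((n_samples, n_params))   # on the unit cube [0,1]^n
    fA = model(A)
    f0 = fA.mean()                          # Eq. 6.36: f_0 = E[f]
    D = fA.var()                            # total variance, Eq. 6.37
    S = np.empty(n_params)
    for i in range(n_params):
        ABi = B.copy()
        ABi[:, i] = A[:, i]                 # "freeze" x_i, resample the rest
        # E[f(A) f(AB_i)] - f0^2 estimates the partial variance D_i
        S[i] = (np.mean(fA * model(ABi)) - f0**2) / D
    return S

# additive test function f(x) = x_1 + 2 x_2: exact indices are 0.2 and 0.8
print(sobol_first_order(lambda x: x[:, 0] + 2.0 * x[:, 1], n_params=2))
```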
6.3.4 Design of Experiments and Screening Design Methods
Design of Experiments (DOE) was first introduced by Fisher [31], and can be defined as the process of selecting those combinations of parameter values, called design points, which will provide the most information on the input–output relationship embodied by a model in the presence of parameter variations. However, the basic question underlying DOE is often a circular one: if the response function were known, then it would be easy to select the optimal design points, but the response is actually the object of the investigation to begin with! Often used in practice is the so-called Factorial Design (FD), which aims at measuring the additive and interactive effects of the input parameters on the response. An FD simulates all possible combinations of the $l_i$ values, called levels, assigned to each (uncertain) system parameter $\alpha_i$. Thus, even though an FD can account for interactions among parameters, the computational cost required by an FD is $l_1 l_2 \cdots l_I$ model evaluations, where $I$ denotes the total number of parameters in the model; such a computational effort is prohibitively high for large-scale systems. A useful alternative is the Fractional Factorial Design (FFD) introduced by Box et al. [7], which assumes a priori that higher-order interactions between parameters are unimportant.
Screening design methods refer to preliminary numerical experiments designed
to identify the parameters that have the largest influence on a particular model response. The objective of screening is to arrive at a short list of important factors.
In turn, this objective can only be achieved if the underlying numerical experiments are judiciously designed. An assumption often used as a working hypothesis
in screening design is the assumption that the number of parameters that are truly
important to the model response is small in comparison with the total number of
parameters underlying the model. This assumption is based on the idea that the influence of parameters in models follows Pareto’s law of income distribution within
nations, characterized by a few, very important parameters and a majority of noninfluential ones. Since screening designs are organized to deal with models containing
very many parameters, they should be computationally economical. There is an inevitable trade-off, however, between computational cost and the information extracted from a screening design. Thus, computationally economical methods often provide only qualitative, rather than quantitative, information, in that they provide a parameter importance ranking rather than a quantification of how much more important a given parameter is than another.
The simplest class of screening designs comprises the so-called one-at-a-time (OAT) experiments, in which the impact of changing the value of each parameter is evaluated in turn [24]. The standard OAT experiment uses standard or nominal values for each of the $I$ parameters underlying the model. The combination of nominal values for the $I$ parameters is called the control experiment (or scenario). Two extreme values are then selected to represent the range of each of the $I$ parameters; the nominal values are customarily selected midway between the two extremes. The magnitudes of the residuals, defined as the differences between the perturbed and nominal response (output) values, are then compared to assess which factors most significantly affect the response.
Since classical OAT experiments cannot provide information about interactions between parameters, the model's behavior can only be assessed in a small interval around the "control" scenario. In other words, classical OAT experiments yield information only about the local behavior of the system's response. Therefore, the results of a classical OAT experiment are meaningful only if the model's input–output relation can be adequately represented by a first-order polynomial in the model's parameters. If the model is affected by nonlinearities (as is often the case in practice), then parameter changes around the "control" scenario would yield drastically different "sensitivities," depending on the chosen "control" scenario.
To address this severe limitation of the classical OAT designs, Morris [47] has proposed a global OAT design method. In this method, the entire space in which the parameters may vary is covered, independently of the specific initial "control" scenario used to initiate the experiment. A global OAT design assumes that the model is characterized by a large number of parameters and/or is computationally expensive (regarding computational time and computational resources) to run. The range of variation of each component of the vector $\alpha$ of parameters is standardized to the unit interval, and each component is then considered to take on $p$ values in the set $\{0, (p-1)^{-1}, 2(p-1)^{-1}, \ldots, 1\}$, so that the region of experimentation becomes an $I$-dimensional grid with $p$ levels. An elementary effect of the $i$th parameter at a point $\alpha$ is then defined as $[R(\alpha_1, \ldots, \alpha_{i-1}, \alpha_i + \Delta, \ldots, \alpha_I) - R(\alpha)]/\Delta$, where $\Delta$ is a predetermined multiple of $1/(p-1)$, chosen such that $\alpha_i + \Delta$ is still within the region of experimentation. A finite distribution $F_i$ of elementary effects for the $i$th parameter is obtained by sampling $\alpha$ from within the region of experimentation; the number of elements of each $F_i$ is $p^{I-1}[p - \Delta(p-1)]$. The distribution $F_i$ is then characterized by its mean and standard deviation. A high mean indicates a parameter with an important overall influence on the response; a high standard deviation indicates either a parameter interacting with other parameters or a parameter whose effect is nonlinear.
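A minimal Python sketch of the global OAT (elementary-effects) procedure follows; the grid level count $p$, the step $\Delta$, the number of sampled base points, and the test model are illustrative assumptions.

```python
import numpy as np

# Sketch of Morris's global OAT screening via elementary effects; the
# grid level count p, step Delta, and test model are illustrative.

rng = np.random.default_rng(0)

def morris_screening(model, n_params, p=4, n_base_points=50):
    delta = p / (2.0 * (p - 1))               # a common choice of Delta
    levels = np.arange(p) / (p - 1)           # grid {0, 1/(p-1), ..., 1}
    feasible = levels[levels + delta <= 1.0]  # keep alpha_i + Delta in range
    effects = np.empty((n_params, n_base_points))
    for k in range(n_base_points):
        base = rng.choice(feasible, size=n_params)
        for i in range(n_params):
            pert = base.copy()
            pert[i] += delta                  # elementary effect of alpha_i
            effects[i, k] = (model(pert) - model(base)) / delta
    # high mean: strong overall influence; high standard deviation:
    # nonlinear effect and/or interactions with other parameters
    return np.abs(effects).mean(axis=1), effects.std(axis=1)

mu, sigma = morris_screening(lambda a: a[0] + a[1]**2 + 0.1 * a[2], 3)
print(mu, sigma)   # parameter 3 has a small mean effect; parameter 2
                   # has a nonzero spread because its effect is nonlinear
```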
The alternative systematic fractional replicate design (SFRD), proposed by Cotter [21], does not require any prior assumptions about parameter interactions. For a model with $I$ parameters, an SFRD involves the following steps: (i) one model computation with all parameters at their low levels; (ii) $I$ model computations with each parameter, in turn, at its upper level, while the remaining $(I-1)$ parameters remain at their low levels; (iii) $I$ model computations with each parameter, in turn, at its low level, while the remaining $(I-1)$ parameters remain at their upper levels; and (iv) one model computation with all parameters at their upper levels. Thus, an SFRD requires $2(I+1)$ computations. Denoting by $(R_0, R_1, \ldots, R_I, R_{I+1}, \ldots, R_{2I}, R_{2I+1})$ the values of the responses computed in steps (i)–(iv) of an SFRD, the measures $M(j) \equiv |C_e(j)| + |C_o(j)|$, where the quantities $C_e(j)$ and $C_o(j)$ are defined as $C_e(j) \equiv [(R_{2I+1} - R_{I+j}) - (R_j - R_0)]/4$ and $C_o(j) \equiv [(R_{2I+1} - R_{I+j}) + (R_j - R_0)]/4$, respectively, are used to estimate the order of importance of the $I$ parameters $\alpha_i$. It is apparent from these definitions that the measures $M(j)$ may fail when a parameter induces cancellation effects in the response; such a parameter would remain undetected by an SFRD. Worse yet, it is not possible to protect oneself a priori against such occurrences. Furthermore, an SFRD is not sufficiently precise, since the above definitions imply that, for one replicate, the variances are $\mathrm{var}[C_o(j)] = \mathrm{var}[C_e(j)] = \sigma^2/4$, whereas a fractional replicate with $n$ computations would allow the estimation of parameter effects (on the response) with variances $\sigma^2/n$.
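The bookkeeping of an SFRD is easy to mechanize; the following Python sketch (the two-level test model and the low/high levels are illustrative assumptions) performs the $2(I+1)$ computations of steps (i)–(iv) and evaluates the measures $M(j)$.

```python
import numpy as np

# Sketch of Cotter's SFRD: 2(I+1) model computations and the importance
# measures M(j) = |C_e(j)| + |C_o(j)|; model and levels are illustrative.

def sfrd_measures(model, low, high):
    low, high = np.asarray(low, float), np.asarray(high, float)
    I = low.size
    R0 = model(low)                                        # (i) all low
    R_hi = [model(np.where(np.arange(I) == j, high, low))  # (ii) j high,
            for j in range(I)]                             #      rest low
    R_lo = [model(np.where(np.arange(I) == j, low, high))  # (iii) j low,
            for j in range(I)]                             #       rest high
    R_top = model(high)                                    # (iv) all high
    M = np.empty(I)
    for j in range(I):
        Ce = ((R_top - R_lo[j]) - (R_hi[j] - R0)) / 4.0    # even-order effects
        Co = ((R_top - R_lo[j]) + (R_hi[j] - R0)) / 4.0    # odd-order effects
        M[j] = abs(Ce) + abs(Co)
    return M

# illustrative model with main effects and a two-parameter interaction
print(sfrd_measures(lambda a: a[0] + 3.0 * a[1] + a[0] * a[1],
                    low=[0.0, 0.0], high=[1.0, 1.0]))
```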
In addition to screening designs that consider each parameter individually, the (originally) individual parameters can be clustered into groups that are subsequently treated by group screening designs. Perhaps the most efficient modern group screening design techniques are the iterated fractional factorial design (IFFD)
proposed by Andres and Hajas [1], and the sequential bifurcation (SB) technique proposed by Bettonvil [3]. In principle, the IFFD requires fewer model computations, $n$, than there are parameters, $I$. To identify an influential parameter, an IFFD investigates the groups through a fractional factorial design; the procedure is then repeated with different random groupings. Influential parameters are then sought at the intersection of influential groups. The IFFD samples three levels per parameter, designated low, middle, and high, while ensuring that the sampling is balanced: different combinations of values for two or three parameters appear with equal frequency. Hence, the IFFD can be considered as a composite design consisting of multiple iterations of a basic FFD.
The SB design combines two design techniques: (i) sequential design, in which the parameter combinations are selected based on the results of preceding computations, and (ii) bifurcation, in which each group that seems to include one or more important parameters is split into two subgroups of the same size. However, the SB design must assume a priori that the analyst knows the signs of the effects of the individual parameters, in order to ensure that the effects of parameters assigned to the same group do not cancel out. Furthermore, the sequential nature of SB implies a more cumbersome data handling and analysis process than for other screening design methods. To assess the effects of interactions between parameters, the number of SB computations becomes double the number of computations required to estimate solely the "main effects"; quadratic effects cannot currently be analyzed with the SB design technique.
The screening designs surveyed in the foregoing are the most representative and
the most widely used methods aimed at identifying at the outset, in the initial phase
of sensitivity and uncertainty analysis, the (hopefully not too many!) important parameters in a model. Each type of design has its own advantages and disadvantages,
which can be summarized as follows: the advantages of OAT designs are: (i) no assumption of a monotonic input–output relation; (ii) no assumption that the model
contains only “a few” important parameters; and (iii) the computational cost increases linearly with the number of parameters. The major disadvantage of OAT
designs is the neglect of parameter interactions. Although such an assumption drastically simplifies the analysis of the model, it can rarely be accepted in practice.
This simplifying assumption is absent in the global OAT design of Morris, which
aims at determining the parameters that have (i) negligible effects, (ii) linear and
additive effects, and (iii) nonlinear or interaction effects. Although the global OAT
is easy to implement, it requires a high computational effort for large-scale models,
and provides only a qualitative (but not quantitative) indication of the interactions
of a parameter with the rest of the model; the global OAT cannot provide specific
information about the identity of individual parameter interactions.
The SFRD does not require a priori assumptions about parameter interactions
and/or about which few parameters are important. Although the SFRD is relatively
efficient computationally, it lacks precision and cannot detect parameters whose effects cancel each other out. The IFFD estimates the main and quadratic effects,
and two-parameter interactions between the most influential parameters. Although
the IFFD requires fewer computations than the total number of model parameters,
it gives good results only if the model’s response is actually influenced by only
a few truly important parameters. The SB design is simple and relatively cost-effective (computationally), but assumes that (i) the signs of the main effects are known a priori, and (ii) the model under consideration is adequately described by two-parameter interactions.
6.4 Deterministic Methods for Sensitivity
and Uncertainty Analysis
The importance of parameters in large-scale complex models is not a priori obvious, and may often be counterintuitive. To analyze such complex models, information about the slopes of the model's response at a given set of nominal parameter values in parameter space is of paramount importance. The exact slopes are provided by the local partial functional derivatives $\partial R / \partial \alpha_i$ of the response $R$ with respect to the model parameters $\alpha_i$; these local partial functional derivatives are called the local sensitivities of the model's response to parameter variations.
The simplest way of estimating local sensitivities is by recalculating the model's response using parameter values that deviate by small amounts, $\delta\alpha_i$, of the order of 1%, from their nominal values $\alpha_i^0$. The sensitivities are then estimated by using a finite-difference approximation to $\partial R / \partial \alpha_i$ of the form

$$\left( \frac{\partial R}{\partial \alpha_i} \right)_{\alpha^0} \cong \frac{R(\alpha_1^0, \ldots, \alpha_i^0 + \delta\alpha_i, \ldots, \alpha_I^0) - R(\alpha^0)}{\delta\alpha_i}, \qquad (i = 1, \ldots, I).$$

This procedure, occasionally called the "brute-force method," requires $(I+1)$ model computations; if central differences are used, the number of model computations could increase up to a total of $2I$. Although this method is conceptually simple to use and requires no additional model development, it is slow, computationally expensive, and involves a trial-and-error process when selecting the parameter perturbations $\delta\alpha_i$. Note that erroneous sensitivities will be obtained if: (i) $\delta\alpha_i$ is chosen too small, in which case computational round-off errors will overwhelm the correct values, or (ii) the parameter dependence is nonlinear and $\delta\alpha_i$ is chosen too large, in which case the assumption of local linearity is violated.
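The brute-force method is straightforward to mechanize, as in the following Python sketch (the test model and nominal values are illustrative assumptions); note how the choice of step size embodies the trial-and-error trade-off just described.

```python
import numpy as np

# Sketch of the "brute-force" finite-difference estimate of the local
# sensitivities dR/dalpha_i; the model and nominal values are illustrative.

def local_sensitivities(response, alpha0, rel_step=0.01):
    """Forward-difference sensitivities at the nominal point alpha0.

    Requires I + 1 model evaluations for I parameters; rel_step trades
    round-off error (too small) against nonlinearity error (too large).
    """
    alpha0 = np.asarray(alpha0, dtype=float)
    R0 = response(alpha0)
    sens = np.empty(alpha0.size)
    for i in range(alpha0.size):
        d = rel_step * alpha0[i] if alpha0[i] != 0.0 else rel_step
        pert = alpha0.copy()
        pert[i] += d
        sens[i] = (response(pert) - R0) / d   # forward difference
    return sens

# Example: R(alpha) = alpha_1^2 * alpha_2, so dR/dalpha = (2 a1 a2, a1^2)
print(local_sensitivities(lambda a: a[0]**2 * a[1], [2.0, 3.0]))
# the exact sensitivities are (12, 4); the estimates carry O(d) error
```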
Historically, limited considerations of sensitivity analysis appeared as early as a century ago, in conjunction with studies of the influence of the coefficients of a differential equation on its solution. For a long time, however, those considerations remained merely of mathematical interest. The first systematic methodology for performing sensitivity analysis was formulated by Bode [4] for linear electrical circuits. Subsequently, sensitivity analysis provided a fundamental motivation for the use of feedback, leading to the development of modern control theory, including optimization, synthesis, and adaptation. The introduction of state-space methods in control theory, which commenced in the late 1950s, and the rapid development of digital computers provided the proper conditions for establishing sensitivity theory as a branch of control theory and computer science. The number of publications dealing with sensitivity analysis applications in this field grew enormously (see, e.g., the books by Kokotovic [42]; Tomovic and Vucobratovic [61]; Cruz [22]; Frank [32]; Fiacco [30]; Deif [25]; Eslami [29]; Rosenwasser and Yusupov [53]). In parallel, and mostly independently, ideas of sensitivity analysis have also permeated other fields of scientific and engineering activity; notable developments in this respect have occurred in the nuclear, atmospheric, geophysical, socioeconomic, and biological sciences.
When the parameter variations are small, the traditional way to assess their effects on calculated responses is by using perturbation theory, either directly or
indirectly, via variational principles. The basic aim of perturbation theory is to
predict the effects of small parameter variations without actually calculating the
perturbed configuration but rather by using solely unperturbed quantities (see, e.g.
[5,28,35,40–42,50,51,54]). As a branch of applied mathematics, perturbation theory
relies on concepts and methods of functional analysis. Since the functional analysis
of linear operators is quite mature and well established, the regular analytic perturbation theory for linear operators is also well established. Even for linear operators,
though, the results obtained by using analytic perturbation theory for the continuous spectrum are less satisfactory than the results delivered by analytic perturbation
theory for the essential spectrum. This is because the continuous spectrum is less
stable than the essential spectrum, as can be noted by recalling that the fundamental condition underlying analytic perturbation theory is the continuity in the norm
of the perturbed resolvent operator (see, e.g. [40, 67]). If this analytical property
of the resolvent is lost, then isolated eigenvalues need no longer remain stable to
perturbations, and the corresponding series developments may diverge, have only a
finite number of significant terms, or may cease to exist (e.g., an unstable isolated
eigenvalue may be absorbed into the continuous spectrum as soon as the perturbation is switched on). The analysis of such divergent series falls within the scope of
asymptotic perturbation theory, which comprises: (i) regular or uniform asymptotic
perturbation expansions, where the respective expansion can be constructed in the
entire domain, and (ii) singular asymptotic perturbation expansions, characterized
by the presence of singular manifolds across which the solution behavior changes
qualitatively. A nonexhaustive list of typical examples of singular perturbations includes: the presence or occurrence of singularities, passing through resonances, loss of the highest-order derivative, change of type of a partial differential equation, and the leading operator becoming nilpotent. A variety of methods have been developed for
analyzing such singular perturbations; among the most prominent are the method
of matched asymptotic expansions, the method of strained coordinates, the method
of multiple scales, the WKB (Wentzel–Kramers–Brillouin, or the phase-integral)
method, the KBM (Krylov–Bogoliubov–Mitropolsky) method, Whitham’s method,
and variations thereof. Many fundamental issues in asymptotic perturbation theory
are still unresolved, and a comprehensive theory encompassing all types of operators (in particular, differential operators) does not exist yet. Actually, the problems
tackled by singular perturbation theory are so diverse that this part of applied mathematics appears to the nonspecialist as a collection of almost disparate methods that
often require some clever a priori guessing at the structure of the very answer one
is looking for. The lack of a unified method for singularly perturbed problems is
particularly evident for nonlinear systems, and this state of affairs is not surprising
in view of the fact that nonlinear functional analysis is much less well developed
than its linear counterpart. As can be surmised from the above arguments, although
perturbation theory can be a valuable tool in certain instances for performing sensitivity analysis, it should be noted already at this stage that the aims of perturbation
theory and sensitivity analysis do not coincide, and the two scientific disciplines are
evolving separately from each other.
For models that involve a large number of parameters and comparatively few
responses, sensitivity analysis can be performed very efficiently by using deterministic methods based on adjoint functions. The use of adjoint functions for analyzing
the effects of small perturbations in a linear system was introduced by Wigner [65].
Specifically, he showed that the effects of perturbing the material properties in a
critical nuclear reactor can be assessed most efficiently by using the adjoint neutron
flux, defined as the solution of the adjoint neutron transport equation. Since the
neutron transport operator is linear, its adjoint operator is straightforward to obtain. In the same report, Wigner was also the first to show that the adjoint neutron
flux can be interpreted as the importance of a neutron in contributing to the detector response. Wigner’s original work on the linear neutron diffusion and transport
equations laid the foundation for the development of a comprehensive and efficient
deterministic methodology, using adjoint fluxes, for performing systematic sensitivity and uncertainty analyses of eigenvalues and reaction rates to uncertainties in the
cross sections in nuclear reactor physics problems (see, e.g. [33, 34, 58–60, 63, 64]).
Since the neutron transport and neutron diffusion equations underlying problems in
nuclear reactor physics are linear in the dependent variable (i.e., the neutron flux),
the respective adjoint operators and adjoint fluxes are easy to obtain, a fact that undoubtedly facilitated the development of the respective sensitivity and uncertainty
analysis methods. The responses considered in all of these works were functionals of
the neutron forward and/or adjoint fluxes, and the sensitivities were defined as the
derivatives of the respective responses with respect to scalar parameters, such as
atomic number densities and energy-group-averaged cross sections.
Local sensitivities can be computed exactly only by using deterministic methods
that involve some form of differentiation of the system under investigation. The
(comparatively few) deterministic methods for calculating sensitivities exactly are
as follows: the Green’s Function method, the direct method (including its decoupled
direct method [DDM] variant), the Forward Sensitivity Analysis Procedure (FSAP),
and the Adjoint Sensitivity Analysis Procedure (ASAP). The Green's function method (GFM) commences by differentiating the model under consideration with respect to its initial conditions in order to obtain a Green's function, which is subsequently convolved with the matrix of parameter derivatives and finally integrated in time to obtain the respective time-dependent sensitivities. There are several variants of
the GFM; the integrated Magnus version (GFM/AIM) proposed by Kramer et al.
[43] appears to be the most efficient (computationally) GFM. In practice, though,
the GFM is seldom used, since it is computationally very expensive and difficult to
implement.
The so-called direct method has been applied predominantly to systems of
ordinary differential and/or algebraic equations describing chemical kinetics (including combustion kinetics) and molecular dynamics. This method is practically
identical to the sensitivity analysis methods used in control theory, involving differentiation of the equations describing the model with respect to a parameter. The
resulting set of equations is solved for the derivative of all the model variables with
respect to that parameter. The actual form of the differentiated equations depends
on the parameter under consideration. Consequently, for each parameter, a different
set of equations must be solved to obtain the corresponding sensitivity. The most
advanced and computationally economical version of the direct method is the decoupled direct method (DDM), originally introduced by Dunker [26, 27], in which the
Jacobian matrix needed to solve the original system at a given time-step is also used
to solve the sensitivity equations at the respective time-step, before proceeding to
solve both the original and sensitivity systems at the next time-step. It is important to
note that the computational effort increases linearly with the number of parameters.
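For a model given by ordinary differential equations $dy/dt = g(y, \alpha)$, differentiating with respect to a parameter yields the sensitivity equation $ds/dt = (\partial g/\partial y)\, s + \partial g/\partial \alpha$, which shares the Jacobian $\partial g/\partial y$ with the original system; this reuse is the essence of the decoupled direct method. The following Python sketch (a simple decay model, solved here with a generic integrator rather than a true step-wise decoupled solver) illustrates the idea.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of the direct (forward) sensitivity method for an ODE model
# dy/dt = g(y, alpha): the sensitivity s = dy/dalpha obeys
# ds/dt = (dg/dy) s + dg/dalpha, reusing the model Jacobian dg/dy.
# The decay model and parameter value are illustrative assumptions.

def augmented_rhs(t, z, alpha):
    y, s = z                       # state and its sensitivity dy/dalpha
    g = -alpha * y                 # model: simple decay, dy/dt = -alpha*y
    J = -alpha                     # Jacobian dg/dy, shared with the model
    dg_dalpha = -y                 # parameter derivative of the right side
    return [g, J * s + dg_dalpha]  # sensitivity equation uses the same J

alpha, y0 = 0.5, 1.0
sol = solve_ivp(augmented_rhs, (0.0, 2.0), [y0, 0.0], args=(alpha,),
                rtol=1e-8)
t_end = sol.t[-1]
print(sol.y[1, -1])                          # computed dy/dalpha at t = 2
print(-t_end * y0 * np.exp(-alpha * t_end))  # exact: -t*y0*exp(-alpha*t)
```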
The direct method, on the one hand, and the variational/perturbation methods using adjoint functions, on the other, were unified and generalized by Cacuci et al. [17], who presented a comprehensive methodology, based on Fréchet derivatives, for performing systematic and efficient sensitivity analyses of large-scale continuous and/or discrete, linear and/or nonlinear systems. Shortly thereafter, this methodology was further generalized by Cacuci [9, 10], who used methods of nonlinear functional analysis to introduce a rigorous definition of the concept of sensitivity as the first Gâteaux-differential (in general, a nonlinear operator) of the system response along an arbitrary direction in the hyperplane tangent to the base-case solution in the phase-space of parameters and dependent variables. These works presented a rigorous theory not only for the sensitivity analysis of functional-type responses, but also for responses that are general nonlinear operators, and for responses defined at critical points. Furthermore, these works also set sensitivity analysis apart from perturbation theory, by defining the scope of the former to be the exact and efficient calculation of all sensitivities, regardless of their use. As detailed in the recent book by Cacuci [12], the two most general and effective procedures for computing local sensitivities are the Forward Sensitivity Analysis Procedure (FSAP) and the Adjoint Sensitivity Analysis Procedure (ASAP). The FSAP constitutes a generalization of the decoupled direct method (DDM), since the concept of the Gâteaux-differential (which underlies the FSAP) generalizes the concept of the total differential in the calculus sense (which underlies the DDM). Notably, the Gâteaux-differential exists for operators and generalized functions (e.g., distributions) that are not continuous in the ordinary calculus sense, and that therefore do not admit the "nice" derivatives required for using the DDM. As expected, the FSAP reduces to the DDM whenever the continuity assumptions required by the DDM are satisfied. Finally, even though the FSAP represents a generalization of the DDM, the FSAP requires the same computational and programming effort to develop and implement as the DDM.
Most problems of practical interest comprise a large number of parameters and
comparatively few responses. In such situations, it is by far more advantageous
to employ the ASAP. Note, though, that the ASAP is not easy to implement for
complicated nonlinear models, particularly in the presence of structural discontinuities (i.e., when the structure of the model itself changes). Furthermore, Cacuci
[9, 10] has underscored the fact that the adjoint functions needed for the sensitivity
analysis of nonlinear systems depend on the unperturbed forward (i.e., nonlinear)
solution, a fact that is in contradistinction to the case of linear systems. These
works have also shown that the adjoint functions corresponding to the Gâteaux-differentiated nonlinear systems can be interpreted as importance functions, in that they measure the importance of a region and/or event in phase space in contributing to the system's response under consideration; this interpretation is similar to that originally assigned by Wigner [65] to the adjoint neutron flux for linear neutron diffusion and/or transport problems in reactor physics and shielding.
Once they become available, the sensitivities can be used for various purposes,
such as for ranking the respective parameters in order of their relative importance
to the response, for assessing changes in the response due to parameter variations,
for performing uncertainty analysis using either the Bayesian approach or the response surface approach, or for data adjustment and/or assimilation. As highlighted
by Cacuci [12], it is necessary to define rigorously the concept of sensitivity and to
separate the calculation of sensitivities from their use, in order to compare clearly,
for each particular problem, the relative advantages and disadvantages of using one
or the other of the competing deterministic methods, statistical methods, or Monte
Carlo methods of sensitivity and uncertainty analysis.
The exact local sensitivities obtained by using deterministic methods can be used
for the following purposes: (i) understand the system by highlighting important
data; (ii) eliminate unimportant data; (iii) determine effects of parameter variations
on system behavior; (iv) design and optimize the system (e.g., maximize availability/minimize maintenance); (v) reduce over-design; (vi) prioritize the improvements
effected in the system; (vii) prioritize introduction of data uncertainties; and (viii)
perform local uncertainty analysis by using the “propagation of errors” method.
Important applications of deterministically computed local sensitivities include reactor core physics and shielding (see, e.g. [33, 34, 44, 52] and references therein), reactor thermal-hydraulics and neutron dynamics [16, 19], two-phase flows with phase transition [14, 38], geophysical fluid dynamics (see, e.g. [13, 48, 49, 68]), and reliability and risk analysis [15, 39].
6.4.1 The Forward Sensitivity Analysis Procedure (FSAP)
for Nonlinear Systems
Consider that the physical system is represented mathematically by means of $K_u$ coupled nonlinear operator equations of the form

$$N[u(x), \alpha(x)] = Q[\alpha(x)], \qquad x \in \Omega_x, \qquad (6.41)$$
where

1. $x = (x_1, \ldots, x_{J_x})$ denotes the $J_x$-dimensional phase-space position vector for the primary system; $x \in \Omega_x \subset \mathbb{R}^{J_x}$, where $\Omega_x$ is a subset of the $J_x$-dimensional real vector space $\mathbb{R}^{J_x}$.
2. $u(x) = [u_1(x), \ldots, u_{K_u}(x)]$ denotes a $K_u$-dimensional (column) vector whose components are the primary system's dependent (i.e., state) variables; $u(x) \in E_u$, where $E_u$ is a normed linear space over the scalar field $F$ of real numbers.
3. $\alpha(x) = [\alpha_1(x), \ldots, \alpha_I(x)]$ denotes an $I$-dimensional (column) vector whose components are the primary system's parameters; $\alpha \in E_\alpha$, where $E_\alpha$ is also a normed linear space.
4. $Q[\alpha(x)] = [Q_1(\alpha), \ldots, Q_{K_u}(\alpha)]$ denotes a $K_u$-dimensional (column) vector whose elements represent inhomogeneous source terms that depend either linearly or nonlinearly on $\alpha$; $Q \in E_Q$, where $E_Q$ is also a normed linear space; the components of $Q$ may be operators, rather than just functions, acting on $\alpha(x)$ and $x$.
5. $N(u, \alpha) \equiv [N_1(u, \alpha), \ldots, N_{K_u}(u, \alpha)]$ denotes a $K_u$-component column vector whose components are nonlinear operators (including differential, difference, integral, distribution, and/or infinite-matrix operators) acting on $u$ and $\alpha$.
In view of the definitions given above, $N$ represents the mapping $N: D \subset E \to E_Q$, where $D = D_u \times D_\alpha$, $D_u \subset E_u$, $D_\alpha \subset E_\alpha$, and $E = E_u \times E_\alpha$. Note that an arbitrary element $e \in E$ is of the form $e = (u, \alpha)$. If differential operators appear in Eq. 6.41, then a corresponding set of boundary and/or initial conditions (which are essential to define $D$) must also be given. The respective boundary conditions are represented in operator form as

$$\left[ B(u, \alpha) - A(\alpha) \right]_{\partial\Omega_x} = 0, \qquad x \in \partial\Omega_x, \qquad (6.42)$$

where $A$ and $B$ are nonlinear operators, and $\partial\Omega_x$ denotes the boundary of $\Omega_x$.
The vector-valued function $u(x)$ is considered to be the unique nontrivial solution of the physical problem described by Eqs. 6.41 and 6.42. The system response (i.e., performance parameter) $R(u, \alpha)$ associated with the problem modeled by Eqs. 6.41 and 6.42 is a phase-space-dependent mapping that acts nonlinearly on the system's state vector $u$ and parameters $\alpha$, and is represented in operator form as

$$R(e): D_R \subset E \to E_R, \qquad (6.43)$$

where $E_R$ is a normed vector space.
In practice, the exact values of the parameters $\alpha$ are not known; usually, only the nominal (mean) parameter values, $\alpha^0$, and their covariances, $\mathrm{cov}(\alpha_i, \alpha_j)$, are available (in exceptional cases, higher moments may also be available). The nominal parameter values $\alpha^0(x)$ are used in Eqs. 6.41 and 6.42 to obtain the nominal solution $u^0(x)$ by solving the equations

$$N(u^0, \alpha^0) = Q(\alpha^0), \qquad x \in \Omega_x, \qquad (6.44)$$

$$B(u^0, \alpha^0) = A(\alpha^0), \qquad x \in \partial\Omega_x. \qquad (6.45)$$
Thus, Eqs. 6.44 and 6.45 represent the "base-case" (nominal) state of the primary (non-augmented) system, and $e^0 = (u^0, \alpha^0)$ represents the nominal solution of the non-augmented system. Once the nominal solution $e^0 = (u^0, \alpha^0)$ has been obtained, the nominal value $R(e^0)$ of the response $R(e)$ is obtained by evaluating Eq. 6.43 at $e^0 = (u^0, \alpha^0)$.

As was shown in general by Cacuci [9], the sensitivity of the response $R$ to variations $h$ in the system parameters is given by the Gâteaux- (G-) differential $\delta R(e^0; h)$ of the response $R(e)$ at $e^0 = (u^0, \alpha^0)$ with increment $h$, defined as

$$\delta R(e^0; h) \equiv \left. \frac{d}{dt} R(e^0 + th) \right|_{t=0} = \lim_{t \to 0} \frac{R(e^0 + th) - R(e^0)}{t}, \qquad (6.46)$$
for $t \in F$, and all (i.e., arbitrary) vectors $h \in E$. For the non-augmented system considered here, it follows that $h = (h_u, h_\alpha)$, since $E = E_u \times E_\alpha$. The G-differential $\delta R(e^0; h)$ is related to the total variation $[R(e^0 + th) - R(e^0)]$ of $R$ at $e^0$ through the relation

$$R(e^0 + th) - R(e^0) = \delta R(e^0; h) + \Delta(th), \qquad \text{with } \lim_{t \to 0} [\Delta(th)/t] = 0. \qquad (6.47)$$
The objective of local sensitivity analysis is to evaluate $\delta R(e^0; h)$. Recall that the system's state vector $u$ and parameters $\alpha$ are related to each other through Eqs. 6.41 and 6.42, which implies that $h_u$ and $h_\alpha$ are also related to each other. Therefore, the sensitivity $\delta R(e^0; h)$ of $R(e)$ at $e^0$ can only be evaluated after determining the vector of variations $h_u$ in terms of the vector of parameter variations $h_\alpha$. The first-order relationship between $h_u$ and $h_\alpha$ is determined by taking the G-differentials of Eqs. 6.41 and 6.42, to obtain the forward sensitivity system (FSS)

$$N'_u(u^0, \alpha^0)\, h_u = \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha, \qquad x \in \Omega_x, \qquad (6.48)$$

$$B'_u(u^0, \alpha^0)\, h_u = \delta A(\alpha^0; h_\alpha) - B'_\alpha(u^0, \alpha^0)\, h_\alpha, \qquad x \in \partial\Omega_x. \qquad (6.49)$$

For a given vector of parameter variations $h_\alpha$ around $\alpha^0$, the forward sensitivity system represented by Eqs. 6.48 and 6.49 is solved to obtain $h_u$. Once $h_u$ is available, it is, in turn, used in Eq. 6.46 to calculate the sensitivity $\delta R(e^0; h)$ of $R(e)$ at $e^0$, for the given vector of parameter variations $h_\alpha$.
Equations 6.48 and 6.49 represent the "forward sensitivity equations (FSE)," occasionally also called the "forward sensitivity model (FSM)," the "forward variational model (FVM)," or the "tangent linear model (TLM)." The direct computation of the response sensitivity $\delta R(e^0; h)$ by using the ($h_\alpha$-dependent) solution $h_u$ of Eqs. 6.48 and 6.49 constitutes the Forward Sensitivity Analysis Procedure (FSAP). From the standpoint of computational cost and effort, the FSAP is advantageous to employ only if, in the problem under consideration, the number of different responses of interest exceeds the number of system parameters and/or parameter variations to be considered. This is rarely the case in practice, however, since most problems of practical interest are characterized by many parameters (i.e., $\alpha$ has many components) and comparatively few responses. In such situations, it is not economical to employ the FSAP to answer all sensitivity questions of interest, since it becomes prohibitively expensive to solve the $h_\alpha$-dependent FSE repeatedly in order to determine $h_u$ for all possible vectors $h_\alpha$.
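For a concrete (if drastically simplified) illustration of Eqs. 6.44–6.49, the following Python sketch applies the FSAP to a scalar nonlinear model; the model $N$, the source $Q$, the response $R = u^2$, and all numerical values are illustrative assumptions.

```python
import numpy as np
from scipy import optimize

# Sketch of the FSAP (Eqs. 6.46, 6.48) for a scalar nonlinear model
# N(u, alpha) = u**3 + alpha*u - Q(alpha) = 0 with response R = u**2;
# the model, source, and nominal values are illustrative assumptions.

alpha0, h_alpha = 2.0, 0.1          # nominal parameter and its variation
Q = lambda a: 10.0 + a              # parameter-dependent source term

# base-case solution u0 of N(u0, alpha0) = 0   (analog of Eqs. 6.44-6.45)
u0 = optimize.brentq(lambda u: u**3 + alpha0 * u - Q(alpha0), 0.0, 10.0)

# forward sensitivity system (Eq. 6.48): N'_u h_u = dQ - N'_alpha h_alpha
N_u = 3.0 * u0**2 + alpha0          # partial derivative of N w.r.t. u
N_alpha = u0                        # partial derivative of N w.r.t. alpha
dQ = 1.0 * h_alpha                  # G-differential of the source
h_u = (dQ - N_alpha * h_alpha) / N_u

# response sensitivity (Eq. 6.46): dR = R'_u h_u + R'_alpha h_alpha
dR = 2.0 * u0 * h_u                 # R = u**2 has no direct alpha term
print(u0, h_u, dR)
```

For many parameters, the forward solve for $h_u$ would have to be repeated once per parameter variation $h_\alpha$, which is exactly the cost issue noted above.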
6.4.2 The Adjoint Sensitivity Analysis Procedure (ASAP)
for Nonlinear Systems
When the response $R(e)$ is an operator of the form $R: D_R \to E_R$, the sensitivity $\delta R(e^0; h)$ is also an operator, defined on the same domain and with the same range as $R(e)$. To implement the ASAP for such responses, the spaces $E_u$, $E_Q$, and $E_R$ are henceforth considered to be Hilbert spaces, denoted as $H_u(\Omega_x)$, $H_Q(\Omega_x)$, and $H_R(\Omega_R)$, respectively. The elements of $H_u(\Omega_x)$ and $H_Q(\Omega_x)$ are, as before, vector-valued functions defined on the open set $\Omega_x \subset \mathbb{R}^{J_x}$, with smooth boundary $\partial\Omega_x$. The elements of $H_R(\Omega_R)$ are vector- or scalar-valued functions defined on the open set $\Omega_R \subset \mathbb{R}^m$, $1 \le m \le J_x$, with a smooth boundary denoted as $\partial\Omega_R$. Of course, if $J_x = 1$, then $\partial\Omega_x$ merely consists of two endpoints; similarly, if $m = 1$, then $\partial\Omega_R$ consists of two endpoints only. The inner products on $H_u(\Omega_x)$, $H_Q(\Omega_x)$, and $H_R(\Omega_R)$ are denoted by $\langle \cdot, \cdot \rangle_u$, $\langle \cdot, \cdot \rangle_Q$, and $\langle \cdot, \cdot \rangle_R$, respectively. Furthermore, the ASAP also requires that $\delta R(e^0; h)$ be linear in $h$, which implies that $R(e)$ must satisfy a weak Lipschitz condition at $e^0$, and that

$$R(e^0 + th_1 + th_2) - R(e^0 + th_1) - R(e^0 + th_2) + R(e^0) = o(t); \qquad h_1, h_2 \in H_u \times H_\alpha; \; t \in F. \qquad (6.50)$$
If $R(e)$ satisfies the two conditions above, then the response sensitivity $\delta R(e^0; h)$ is indeed linear in $h$, and can therefore be denoted as $DR(e^0; h)$. Consequently, $R(e)$ admits a total G-derivative at $e^0 = (u^0, \alpha^0)$, such that the relationship

$$DR(e^0; h) = R'_u(e^0)\, h_u + R'_\alpha(e^0)\, h_\alpha \qquad (6.51)$$

holds, where $R'_u(e^0)$ and $R'_\alpha(e^0)$ are the partial G-derivatives at $e^0$ of $R(e)$ with respect to $u$ and $\alpha$. Note also that $R'_u(e^0)$ is a linear operator on $h_u$, from $H_u$ into $H_R$, i.e., $R'_u(e^0) \in L(H_u(\Omega_x), H_R(\Omega_R))$. It is convenient to refer to the quantities $R'_u(e^0)\, h_u$ and $R'_\alpha(e^0)\, h_\alpha$ appearing in Eq. 6.51 as the "indirect-effect term" and the "direct-effect term," respectively.
The direct-effect term can be evaluated efficiently at this stage. To proceed with the evaluation of the indirect-effect term, consider that the orthonormal set $\{p_s\}_{s \in S}$, where $s$ runs through an index set $S$, is an orthonormal basis of $H_R(\Omega_R)$. Then, since $R'_u(e^0)\, h_u \in H_R(\Omega_R)$, it follows that $R'_u(e^0)\, h_u$ can be represented as the Fourier series

$$R'_u(e^0)\, h_u = \sum_{s \in S} \langle R'_u(e^0)\, h_u, p_s \rangle_R\, p_s. \qquad (6.52)$$
In the above sum, at most countably many elements are different from zero, and the series extended over the nonzero elements converges unconditionally. The functionals $\langle R'_u(e^0)\, h_u, p_s \rangle_R$ are the Fourier coefficients of $R'_u(e^0)\, h_u$ with respect to the basis $\{p_s\}$. These functionals are linear in $h_u$, since $R(e)$ was required to satisfy the conditions stated in Eq. 6.50. Since $R'_u(e^0) \in L(H_u(\Omega_x), H_R(\Omega_R))$, and since Hilbert spaces are self-dual, the following relationship holds:

$$\langle R'_u(e^0)\, h_u, p_s \rangle_R = \langle \Lambda(e^0)\, p_s, h_u \rangle_u, \qquad s \in S. \qquad (6.53)$$

In Eq. 6.53, the operator $\Lambda(e^0) \in L(H_R(\Omega_R), H_u(\Omega_x))$ is the adjoint of $R'_u(e^0)$; recall that $\Lambda(e^0)$ is unique if $R'_u(e^0)$ is densely defined.
To eliminate the unknown values of $h_u$ from the expression of each of the functionals $\langle h_u, \Lambda(e^0)\, p_s \rangle_u$, $s \in S$, the next step of the ASAP is to construct the operator $L^+(e^0)$, which is formally adjoint to $N'_u(u^0, \alpha^0)$, by means of the relationship

$$\langle \psi_s, N'_u(u^0, \alpha^0)\, h_u \rangle_Q = \langle L^+(e^0)\, \psi_s, h_u \rangle_u + \{ P(h_u; \psi_s) \}_{\partial\Omega_x}, \qquad s \in S, \qquad (6.54)$$

which holds for every vector $\psi_s \in H_Q$, $s \in S$. Recall that the operator $L^+(e^0)$ is defined as the $K_u \times K_u$ matrix

$$L^+(e^0) \equiv \left[ L^+_{ji}(e^0) \right], \qquad (i, j = 1, \ldots, K_u), \qquad (6.55)$$

obtained by transposing the formal adjoints of the operators $[N'_u(u^0, \alpha^0)]_{ij}$, while $\{P(h_u; \psi_s)\}_{\partial\Omega_x}$ is the associated bilinear form evaluated on $\partial\Omega_x$. The domain of $L^+(e^0)$ is determined by selecting appropriate adjoint boundary conditions, represented here in operator form as

$$\left\{ B^+(\psi_s; e^0) - A^+(\alpha^0) \right\}_{\partial\Omega_x} = 0, \qquad s \in S. \qquad (6.56)$$
The above boundary conditions for $L^+(e^0)$ are obtained by requiring that:

(a) They must be independent of $h_u$, $h_\alpha$, and G-derivatives with respect to $\alpha$.
(b) Substituting Eqs. 6.49 and 6.56 into the expression of the so-called bilinear concomitant $\{P(h_u; \psi_s)\}_{\partial\Omega_x}$ must cause all terms containing unknown values of $h_u$ to vanish.
This selection of the boundary conditions for $L^+(e^0)$ reduces the bilinear concomitant to a quantity that contains boundary terms involving only known values of $h_\alpha$, $\psi_s$, and, possibly, $\alpha^0$; this quantity will be denoted by $\hat{P}(h_\alpha; \psi_s; \alpha^0)$. In general, $\hat{P}$ does not automatically vanish as a result of these manipulations, although it may do so in particular instances; in principle, $\hat{P}$ could be forced to vanish by considering extensions of $N'_\alpha(u^0, \alpha^0)$, in the operator sense, but this is seldom needed in practice. Introducing Eqs. 6.49 and 6.56 into Eq. 6.54 reduces the latter to

$$\langle L^+(e^0)\, \psi_s, h_u \rangle_u = \langle \psi_s, \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha \rangle_Q - \hat{P}(h_\alpha; \psi_s; \alpha^0), \qquad s \in S. \qquad (6.57)$$
The left side of Eq. 6.57 and the right side of Eq. 6.53 are now required to represent the same functional; this is accomplished by imposing the relation

$$L^+(e^0)\, \psi_s = \Lambda(e^0)\, p_s, \qquad s \in S, \qquad (6.58)$$

which holds uniquely in view of the Riesz representation theorem. This last step completes the construction of the desired adjoint system, which consists of Eq. 6.58 and the adjoint boundary conditions given in Eq. 6.56. Furthermore, Eqs. 6.52–6.58 can now be used to obtain the following expression for the sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$:

$$DR(e^0; h) = R'_\alpha(e^0)\, h_\alpha + \sum_{s \in S} \left[ \langle \psi_s, \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha \rangle_Q - \hat{P}(h_\alpha; \psi_s; \alpha^0) \right] p_s. \qquad (6.59)$$
As Eq. 6.59 indicates, the desired elimination of all unknown values of $h_u$ from the expression of the sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$ has thus been accomplished. Note that Eq. 6.59 includes the particular case of functional-type responses, in which case the respective summation contains a single term $(s = 1)$ only. To evaluate the sensitivity $DR(e^0; h)$ by means of Eq. 6.59, one needs to compute as many adjoint functions $\psi_s$ from Eqs. 6.58 and 6.56 as there are nonzero terms in the representation of $R'_u(e^0)\, h_u$ given in Eq. 6.52. Although the linear combination of basis elements $p_s$ given in Eq. 6.52 may, in principle, contain infinitely many terms, obviously only a finite number of the corresponding adjoint functions $\psi_s$ can be calculated in practice. Therefore, special attention is required in selecting the Hilbert space $H_R(\Omega_R)$, a basis $\{p_s\}_{s \in S}$ for this space, and a notion of convergence for the representation given in Eq. 6.52, so as to best suit the problem at hand. This selection is guided by the need to represent the indirect-effect term $R'_u(e^0)\, h_u$ as accurately as possible with the smallest number of basis elements; a related consideration is the viability of deriving bounds and/or asymptotic expressions for the remainder after truncating Eq. 6.52 to its first few terms.
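For a finite-dimensional linear model and a single functional response, the operator machinery above reduces to one adjoint solve, as in the following Python sketch (the matrix model, response functional, and perturbed parameter are illustrative assumptions; the bilinear concomitant vanishes trivially here).

```python
import numpy as np

# Sketch of the ASAP idea for a linear algebraic model A(alpha) u = q
# with functional response R = c.u: a single adjoint solve A^T v = c
# yields the sensitivities to all parameters. Data are illustrative.

A = np.array([[4.0, 1.0], [1.0, 3.0]])
q = np.array([1.0, 2.0])
c = np.array([1.0, 0.0])            # response R = u_1

u = np.linalg.solve(A, q)           # one base-case (forward) solve
v = np.linalg.solve(A.T, c)         # one adjoint solve, valid for all h_alpha

# sensitivity to a parameter entering A (here, the entry A[0, 0]):
# dR = v . (dq - dA u); with dq = 0 this is simply -v_0 * u_0
dA = np.zeros((2, 2)); dA[0, 0] = 1.0
dR_adjoint = v @ (-dA @ u)

# check against a forward finite-difference recalculation
eps = 1e-6
A_pert = A.copy(); A_pert[0, 0] += eps
dR_forward = (c @ np.linalg.solve(A_pert, q) - c @ u) / eps
print(dR_adjoint, dR_forward)       # the two estimates agree
```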
6.4.3 The Adjoint Sensitivity Analysis Procedure
(ASAP) for Responses Defined at Critical Points
Often, the response functional $R(e)$, where $e = (u, \alpha)$, is located at a critical point $y(\alpha)$ of a function $F(u, x, \alpha)$ that depends on the system's state vector and parameters. In such situations, the components $y_i(\alpha)$, $(i = 1, \ldots, M)$, of the critical point $y(\alpha)$ must be treated as responses in addition to $R(e)$. Such a response can be represented in the form

$$R(e) = \int F(u, x, \alpha) \prod_{i=1}^{M} \delta[x_i - y_i(\alpha)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx_1 \ldots dx_J. \qquad (6.60)$$
The quantities appearing in the integrand of Eq. 6.60 are defined as follows:

(i) $F$ is the nonlinear function under consideration.
(ii) $\delta(x)$ is the customary "Dirac-delta" functional.
(iii) $\alpha \in \mathbb{R}^I$, i.e., the components $\alpha_i$, $(i = 1, \ldots, I)$, are restricted throughout this section to be real numbers.
(iv) $y(\alpha) = [y_1(\alpha), \ldots, y_M(\alpha)]$, $M \le J$, is a critical point of $F$.

If the G-differential of $F$ vanishes at $y(\alpha)$, then $y(\alpha)$ is a critical point defined implicitly as the solution of the system of equations

$$\{\partial F / \partial x_i\}_{y(\alpha)} = 0, \qquad (i = 1, \ldots, J). \qquad (6.61)$$
In this case, $y(\alpha)$ has $J$ components (i.e., $M = J$), and $\prod_{j=M+1}^{J} \delta(x_j - z_j) \equiv 1$ in the integrand of Eq. 6.60. Note that, in general, $y(\alpha)$ is a function of $\alpha$. Occasionally, it may happen that $\partial F / \partial x_j$ takes on nonzero constant values (i.e., values that do not depend on $x$) for some of the variables $x_j$, $(j = M+1, \ldots, J)$. Then, as a function of these variables $x_j$, $F$ attains its extreme values at the points $x_j = z_j$, $z_j \in \partial\Omega_x$. Evaluating $F$ at $z_j$, $(j = M+1, \ldots, J)$, yields a function $G$, which depends on the remaining phase-space variables $x_i$, $(i = 1, \ldots, M)$; $G$ may then have a critical point at $y(\alpha) = [y_1(\alpha), \ldots, y_M(\alpha)]$, defined implicitly as the solution of

$$\{\partial G / \partial x_i\}_{y(\alpha)} = 0, \qquad (i = 1, \ldots, M). \qquad (6.62)$$
With the above specifications, the definition of $R(e)$ given in Eq. 6.60 is sufficiently general to include the treatment of extrema (local, relative, or absolute), saddle points, and inflexion points of the function $F$ of interest. Note that, in practice, the base-case solution path, and therefore the specific nature and location of the critical point under consideration, are completely known prior to initiating the sensitivity studies.
As first shown by Cacuci [9, 10], the objective of sensitivity analysis for such systems is twofold, namely:

(a) To determine the G-differential $\delta R(e^0; h)$ of $R(e)$ at the "base-case point" $e^0 = (u^0, \alpha^0)$, which gives the sensitivity of $R(e)$ to changes $h = (h_u, h_\alpha)$ in the system's state functions and parameters.
(b) To determine the (column) vector $\delta y(\alpha^0; h_\alpha) = (\delta y_1, \ldots, \delta y_M)$ whose components $\delta y_m(\alpha^0; h_\alpha)$ are the G-differentials of $y_m(\alpha)$ at $\alpha^0$, for $(m = 1, \ldots, M)$. The vector $\delta y(\alpha^0; h_\alpha)$ gives the sensitivity of the critical point $y(\alpha)$ to changes $h_\alpha$.
Applying the definition of the G-differential to Eq. 6.60 shows that

$$\delta R(e^0; h) = \int dx \left[ F'_u(e^0)\, h_u + F'_\alpha(e^0)\, h_\alpha \right] \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) + \sum_{m=1}^{M} \left( \frac{dy_m}{d\alpha} \right)_{\alpha^0} h_\alpha \int dx\, F\, \delta'(x_m - y_m) \prod_{i=1,\, i \ne m}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j). \qquad (6.63)$$
The last term on the right side of Eq. 6.63 vanishes, since

$$\int F\, \delta'(x_m - y_m) \prod_{i=1,\, i \ne m}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = -\int (\partial F / \partial x_m) \prod_{i=1}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = 0, \qquad (m = 1, \ldots, M), \qquad (6.64)$$

in view of the well-known definition of the $\delta'$ functional, and in view of either Eq. 6.61 if $M = J$, or of Eq. 6.62 if $M < J$. Therefore, the expression of $\delta R(e^0; h)$ simplifies to

$$\delta R(e^0; h) = \int dx \left[ F'_u(e^0)\, h_u + F'_\alpha(e^0)\, h_\alpha \right] \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j). \qquad (6.65)$$
As already mentioned, the sensitivity of the location in phase space of the critical point is given by the G-differential $\delta y(\alpha^0; h_\alpha)$ of $y(\alpha)$ at $\alpha^0$. In view of either Eq. 6.61 or Eq. 6.62, each of the components $y_1(\alpha), \ldots, y_M(\alpha)$ of $y(\alpha)$ is a real-valued function of the real variables $(\alpha_1, \ldots, \alpha_I)$, and may be viewed as a functional defined on a subset of $\mathbb{R}^I$. Therefore, each G-differential $\delta y_m(\alpha^0; h_\alpha)$ of $y_m(\alpha)$ at $\alpha^0$ is given by

$$\delta y_m(\alpha^0; h_\alpha) = \left( \frac{dy_m}{d\alpha} \right)_{\alpha^0} h_\alpha = \sum_{i=1}^{I} \left( \frac{\partial y_m}{\partial \alpha_i} \right)_{\alpha^0} h_{\alpha_i}, \qquad (m = 1, \ldots, M), \qquad (6.66)$$

provided that the derivatives $\partial y_m / \partial \alpha_i$, $(i = 1, \ldots, I)$, exist at $\alpha^0$ for all $(m = 1, \ldots, M)$.
The explicit expression of $\delta y(\alpha^0; h_\alpha)$ is obtained as follows. First, it is observed that both Eqs. 6.61 and 6.62 can be represented as

$$\int (\partial F / \partial x_m) \prod_{i=1}^{M} \delta[x_i - y_i(\alpha)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = 0, \qquad (m = 1, \ldots, M). \qquad (6.67)$$
Taking the G-differential of Eq. 6.67 at $e^0$ yields the following system of equations involving the components $\delta y_m$:

$$\int dx \left\{ \partial (F'_u h_u + F'_\alpha h_\alpha) / \partial x_m \right\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) - \sum_{s=1}^{M} \delta y_s(\alpha^0; h_\alpha) \int dx \left\{ \partial F / \partial x_m \right\}_{e^0} \delta'[x_s - y_s(\alpha^0)] \prod_{i=1,\, i \ne s}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) = 0, \qquad (m = 1, \ldots, M). \qquad (6.68)$$
The above system is algebraic and linear in the components $\delta y_s(\alpha^0; h_\alpha)$; therefore, it can be represented in matrix form as

$$\Phi\,(\delta y) = \theta, \qquad (6.69)$$

by defining $\Phi \equiv [\phi_{ms}]$ to be the $M \times M$ matrix with elements

$$\phi_{ms} \equiv \int dx \left\{ \partial^2 F / \partial x_m \partial x_s \right\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \qquad (m, s = 1, \ldots, M), \qquad (6.70)$$

and by defining $\theta$ to be the $M$-component (column) vector

$$\theta \equiv -(f_1 + g_1, \ldots, f_M + g_M)^T, \qquad (6.71)$$
where

$$f_m \equiv \int dx \left\{ \partial (F'_\alpha h_\alpha) / \partial x_m \right\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \qquad (m = 1, \ldots, M), \qquad (6.72)$$

and

$$g_m \equiv \int dx \left\{ \partial (F'_u h_u) / \partial x_m \right\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \qquad (m = 1, \ldots, M). \qquad (6.73)$$
Notice that the definition of the $\delta'$ functional has been used to recast the second integral in Eq. 6.68 into the equivalent expression given in Eq. 6.70. Equation 6.69 can be solved by employing methods of linear algebra to obtain

$$\delta y(\alpha^0; h_\alpha) = \Phi^{-1}\, \theta. \qquad (6.74)$$
Both $\delta R(e^0; h)$ and $\delta y(\alpha^0; h_\alpha)$ can be evaluated after obtaining the solution $h_u$ of the "forward sensitivity equations" given in Eqs. 6.48 and 6.49. However, this formalism is ill-suited for the sensitivity analysis of problems with large data bases (i.e., when $\alpha$ has many components). For such problems, the ASAP should be applied to circumvent the need to solve the "forward sensitivity equations" repeatedly. The ASAP can be developed if and only if the following three conditions, labeled C.1 through C.3, are satisfied:

(C.1) The partial G-derivatives at $e^0$ of $R(e)$ with respect to $u$ and $\alpha$ exist.
(C.2) The partial G-derivatives at $e^0$ of the operators $N$ and $B$ with respect to $u$ and $\alpha$ exist.
(C.3) The spaces $E_u$ and $E_Q$ are real Hilbert spaces, denoted by $H_u$ and $H_Q$, respectively. For $u_1, u_2 \in H_u$, the inner product in $H_u$ will be denoted by $[u_1, u_2]$, and is given by the integral $\int_{\Omega_x} u_1 \cdot u_2\, dx$. The inner product in $H_Q$ will be denoted by $\langle \cdot, \cdot \rangle$.
An examination of Eq. 6.65 shows that $\delta R(e^0; h)$ is linear in $h$. Hence, condition (C.1) above is satisfied, and the $h_u$-dependent component of $\delta R(e^0; h)$, i.e., the "indirect-effect term," can be written in inner-product form as

$$\int F'_u(e^0)\, h_u \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = \left[ \nabla_u R(e^0), h_u \right], \qquad (6.75)$$

where

$$\nabla_u R(e^0) = \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) \left( \frac{\partial F(e^0)}{\partial u_1}, \ldots, \frac{\partial F(e^0)}{\partial u_{K_u}} \right)^T. \qquad (6.76)$$
Following the procedure set forth in Section 6.4.2, the sensitivity ıR.e0 I h/ is obtained as
ıR e0 I h D
Z
M
Y
F˛0 e0 h˛
ı xi yi ˛0
i D1
J
Y
ı.xi zj /dx
j DM C1
˝
˛
C ıQ ˛0 I h˛ N0˛ e 0 h˛; v PO h˛ ; vI e0 ;
(6.77)
where the adjoint function v is the solution of the adjoint sensitivity system
LC e0 v D r u R e0
fBC . I e 0 / AC ˛0 g@
(6.78)
x
D 0:
(6.79)
Unknown values of hu can be eliminated from the expression of ıy.˛0 I h˛ / given
in Eq. 6.74 only if they can be eliminated from appearing in Eq. 6.73. Examination
of Eq. 6.73 reveals that each quantity gm is a functional that can be expressed in the
equivalent form
Z
gm D
Fu0 e0 hu ı 0 .xm ym /
M
Y
ıŒxi yi
i D1;i ¤m
J
Y
ı.xi zj /dx (6.80)
j DM C1
by employing the definition of the ı 0 functional. In turn, the above expression can
be written as the inner product
gm D Œ” m e0 ; ; hu ;
(6.81)
6
Sensitivity and Uncertainty Analysis of Models and Data
329
where
” m e0 ı 0 xm ym ˛0
M
Y
ıŒxi yi
i D1;i ¤m
!T
@F e0
@F e0
;:::;
:
@u1
@uk
J
Y
ı.xi zj /
j DM C1
(6.82)
The desired elimination of the unknown values of hu from Eq. 6.73 can now be accomplished by letting each of the functions ”m .e0 / play, in turn, the role previously
played by ru R.e0 /, and by following the same procedure as that leading to Eq. 6.78.
The end result is
˛
˝
gm D ıQ.˛0 I h˛ / N0˛ e0 h˛ ; wm PO .h˛ ; wm I e0 /;
(6.83)
where each function wm is the solution of the adjoint system
(
LC e0 wm D ” m e0
˚ C
B w m I e 0 AC e 0 @
x
D 0; .m D 1; : : : ; M /:
(6.84)
It is important to note that LC .e0 /, BC .wm I e0 /, and AC .e0 / appearing in Eq. 6.84
are the same operators as those appearing in Eqs. 6.78 and 6.79. Only the source
term ”m .e0 / in Eq. 6.84 differs from the corresponding source term ru R.e0 / in
Eq. 6.78. Therefore, the computer code employed to solve the adjoint system given
in Eqs. 6.78 and 6.79 can be used, with relatively trivial modifications, to compute the functions wm from Eq. 6.84. Comparing the right sides of Eqs. 6.77 and
6.83 reveals that the quantity PO .h˛ ; vI e0 / is formally identical to the quantity
PO .h˛ ; wm I e0 / and that the vector ıQ.˛0 I h˛ / N0˛ .e0 /h˛ appears in both of the
inner products denoted by h; i. This indicates that the computer program employed
to evaluate the second and third terms on the right side of Eq. 6.77 can also be used
to evaluate the functionals gm , .m D 1; : : : ; M /, given by Eq. 6.83. Of course, the
values of v required to compute ıR.e 0 I h/ are to be replaced by the respective values
of wm , when computing the gm ’s.
In most practical problems, the total number of parameters I greatly exceeds
the number of phase-space variables J , and hence M , since M J . Therefore,
if the ASAP can be developed as described in this section, then a large amount of
computing costs can be saved by employing this formalism rather than the FSAP.
In this case, only M C 2 “large” computations (one for the “base-case computation,” one for computing the adjoint function v, and M computations for obtaining
the adjoint functions w1 ; : : : ; wM / are needed to obtain the sensitivities ıR.e0 I h/
and ıy.’0 I h˛ / to changes in all of the parameters. By contrast at least .I C 1/
computations (one for the “base-case,” and I to obtain the vector hu ) would be required if the “forward sensitivity formalism” were employed.
330
D.G. Cacuci
6.4.4 ASAP for Linear Systems
For a physical system that is linear in the system’s vector of dependent (state)
variables, u.x/, the operators N .u; ˛/ and B.u; ˛/ in Eq. 6.41 and, respectively,
Eq. 6.42 take on the forms N .u; ˛/ ! L.˛/ u and B.u; ˛/ ! B.˛/ u, where L.˛/
and B.˛/ act linearly on u.x/. Hence, Eqs. 6.41 and 6.42 take on the forms
L.˛/u D QŒ˛.x/ ;
ŒB.˛/u A.˛/ @
x
x 2 x ;
D 0;
(6.85)
x 2 @x :
(6.86)
Note that although the components of L D ŒL1 .˛/; : : : ; LK .˛/ act linearly on
the state vector u.x/, they depend nonlinearly (in general) on the vector of system
parameters ˛.x/.The G-differentiated forms of Eqs. 6.85 and 6.86; i.e., the forward
sensitivity system (or the “tangent linear model”) take on the forms
(6.87)
L ˛0 hu C ŒL0˛ ˛0 h˛ ıQ ˛0 I h˛ D 0; x 2 x ;
˚ 0
B ˛ hu C ŒB0˛ ˛0 h˛ ıA ˛0 I h˛ @ x D 0; x 2 @x ; (6.88)
respectively. Furthermore, Eqs. 6.58 and 6.56 take on the forms
LC ˛0 s D ƒ e 0 p s ; s 2 S;
˚ C
0
AC ˛0 @ x D 0;
B
sI ˛
(6.89)
s 2 S:
(6.90)
while the expression for the sensitivity DR.e 0 I h/ of R.e/ at e 0 , cf. Eq. 6.59, becomes:
DR.e 0 I h/ D R0˛ e 0 h˛
Xh˝
0 ˛
0
0
C
s ; ıQ ˛ I h˛ L˛ ˛ h˛ Q
s2S
O h˛ ;
P
0
sI ˛
i
ps :
(6.91)
The following important characteristics must be noted when dealing with linear
problems: (i) the adjoint operators LC .˛0 / and BC . s I ˛0 / are independent of the
nominal solution u0 of the original system; (ii) if the response R.e/ is also linear in
u.x/, then the operator ƒ.e 0 / ! ƒ.˛0 / (i.e., ƒ also becomes independent of u0 )
in Eq. 6.89, which leads to the very important consequence that the adjoint sensitivity system for linear problems can be solved independently of solving the original
(linear) system for u0 . This important property of linear systems will be highlighted
in several illustrative paradigm examples presented in the next section.
6
Sensitivity and Uncertainty Analysis of Models and Data
331
6.5 Paradigm Applications of the ASAP
6.5.1 Application of the ASAP to Compute the Variance
of the Maximum Flux of Particles in a Particle
Diffusion Problem
The application of the ASAP for responses defined at critical points, presented in the
foregoing section, will now be illustrated by considering a simple diffusion problem
of neutral particles (e.g., neutrons) within a slab of material of extrapolated thickness
a Œcm , placed in vacuum, which contains distributed particle sources of strength
QŒparticles cm3s1 , and is also driven externally by a flux of particles of strength
'in Œparticles cm2 s1 , which impinges on one side (e.g., on the left side) of the
slab. Consider, also for simplicity, that the material within the slab only scatters
the particles, but does not absorb them. The linear particle diffusion equation that
simulates mathematically this problem is
L.˛/' D
d2 '
D Q;
dx 2
x 2 .0; a/;
(6.92)
where '.x/ denotes the (everywhere positive) particle flux, D is the diffusion coefficient, and Q is the corresponding distributed source term within the slab. The
boundary conditions considered for '.x/, namely
'.0/ D 'i n ;
'.a/ D 0;
(6.93)
simulate an incoming flux 'in at the boundary x D 0, and a vanishing flux at the
extrapolated distance x D a. Thus, the vacuum at the right side of the slab plays the
role of a perfect absorber, in that particles that have left the slab can never return.
The response R considered for the diffusion problem modeled by Eqs. 6.92 and 6.93
is the reading of a particle detector placed at the position y within the slab where
the flux attains its maximum value, 'max . Such a response (i.e., the reading of the
particle reaction rate at y) would be simulated mathematically by the following
particular form of Eq. 6.60:
Za
†d '.x/ ıŒx y.˛/ dx;
R.e/ D
(6.94)
0
where †d represents the detector’s equivalent reaction cross section, in units of
Œcm1 . The location y.˛/ where the flux attains its maximum value is defined implicitly as the solution of the equation
Za
0
d'.x/
ıŒx y.˛/ dx D 0:
dx
(6.95)
332
D.G. Cacuci
The parameters for this problem are the positive constants †d , D, Q, and 'in , which
will be considered to be the components of the vector ˛ of system parameters, defined as
(6.96)
˛ .†d ; D; Q; 'in / :
The vector e.x/ appearing in the functional dependence of R in Eq. 6.94 denotes
the concatenation of '.x/ with ˛, i.e., e .'; ˛/. The parameters ˛ are considered
experimentally
determined quantities, with known nominal values ˛0 D
0 to be
0
0
0
†d ; D ; Q ; 'in and known variances Œvar.†d /; var.D/; var.Q/; var.'i n / ; but
being otherwise uncorrelated to one another. These parameter variances will give
rise to variances in the value of the response R (i.e.,variance in the maximum flux
measured by the detector) and the location y.˛/ of R within the slab. The goal of
this illustrative example is to determine the variances of R and y.˛/ by using the
“Sandwich Rule” of the “Propagation of Moments” method presented in Section 6.2.
The nominal value ' 0 .x/ of the flux
is determined by solving Eqs. 6.92 and 6.93
for the nominal parameter values ˛0 †0d ; D 0 ; Q0 ; 'in0 : For this simple example,
the expression of ' 0 .x/ can be readily obtained in closed form as
' 0 .x/ D
Q0
x 0
2
' :
.ax
x
/
C
1
2D 0
a in
(6.97)
Note that even though Eq. 6.92 is linear in '; the solution '.x/ depends nonlinearly
0
; of the
on ˛, as evidenced by Eq. 6.97. Using Eq. 6.97, the nominal value, 'max
maximum of ' 0 .x/, and y.˛0 /; respectively, are readily obtained as
0
D
'max
Q 0 a2
D 0 .'in0 /2
'in0
C
C
8D 0
2a2 Q0
2
and
y.˛ 0 / D
a 'in0 D 0
:
2
aQ0
(6.98)
(6.99)
From Eqs. 6.94 and 6.98, it follows that the nominal value, R.e0 /, of the response
R.e/ is obtained as
0
:
(6.100)
R e0 D †0d 'max
Of course, for the complex large-scale problems analyzed in practice, it is not pos0
, R.e0 /, and y.˛0 /.
sible to obtain exact, closed form expressions for ' 0 .x/, 'max
For uncorrelated parameters, the “sandwich rule” formula presented in
Section 6.2, cf. Eq. 6.11, can be readily applied to this illustrative problem, to
obtain the following expression for the variance, var.R/, of R:
ıR 2
ıR 2
var.R/ D
var.†d / C
var.D/
ı†d
ıD
ıR 2
ıR 2
var.Q/ C
var.'in /;
C
ıQ
ı'in
(6.101)
6
Sensitivity and Uncertainty Analysis of Models and Data
333
where the symbol .ıR=ı˛i / denotes the partial sensitivity (i.e., the partial
G-derivative) of R.e/ to a generic parameter ˛i . In turn, the sensitivities of the
response R.e/ are given by the G-differential ıR.e 0 I h/ of R.e/ at e 0 , for variations
(6.102)
h' I h˛ .ı†d ; ıD; ıQ; ı'in / :
The G-differential ıR.e 0 I h/ is readily obtained from Eq. 6.94 as
ıR e I h D
Za
0
ı†d '.x/ ı x y ˛0 dx
0
Za
C
h' .x/ ı x y ˛0 dx
0
Za
dy
h˛ '.x/ ı 0 x y ˛0 dx:
d˛ ˛0
(6.103)
0
The last term on the right side of Eq. 6.103 vanishes in view of Eq. 6.95, namely
Za
Za
0
'ı .x y/dx D 0
.@'=@x/ı.x y/dx D 0:
0
Therefore, Eq. 6.103 reduces to
ıR e0 I h D R0˛ e 0 h˛ C R0' e 0 h' ;
(6.104)
where the “direct-effect” term R0˛ h˛ is defined as
R0˛
e
0
Za
h˛ 0
ı†d '.x/ ı x y ˛0 dx D ı†d 'max
(6.105)
0
while the “indirect-effect” term R'0 h' is defined as
R0'
e
0
Za
h' †0d h' .x/ ıŒx y ˛0 dx:
(6.106)
0
The “direct-effect” term R0˛ h˛ can be evaluated at this stage by substituting Eq. 6.98
into 6.105, to obtain
D 0 .'in0 /2
'in0
Q 0 a2
:
C
C
R0˛ e 0 h˛ D ı†d
8D 0
2a2 Q0
2
(6.107)
334
D.G. Cacuci
The “indirect-effect” term R0' h' , though, cannot be evaluated at this stage, since
h' .x/ is not yet available. The first-order (in kh˛ k) approximation to the exact value
of h' .x/ is obtained by calculating the G-differentials of Eqs. 6.92 and 6.93 and
solving the resulting “forward sensitivity equations” (FSE)
L ˛0 h' C L0˛ ˛0 ' 0 h˛ D O kh˛ k2 ;
(6.108)
together with the boundary conditions
h' .0/ D ı'in ;
h' .a/ D 0:
(6.109)
In Eq. 6.108, the operator L.˛0 / is defined as
while the quantity
d2
L ˛0 D 0 2 ;
dx
(6.110)
d2 0
L0˛ ˛0 ' 0 h˛ ıD
C ıQ;
dx 2
(6.111)
is the partial G-differential of L' at ˛0 with respect to ˛, and contains all of the
first-order parameter variations h˛ .
The Adjoint Sensitivity Analysis Procedure (ASAP) can be readily applied since,
as indicated by Eq. 6.106, the “indirect-effect” term R0' .e 0 /h' is expressible as a
linear functional of h' . Therefore, R'0 .e 0 /h' can be represented as an inner product
in an appropriately defined Hilbert space Hu ; for our illustrative example, Hu is
chosen to be the real Hilbert space Hu L 2 ./, with .0; a/, equipped with
the inner product
Za
hf .x/; g.x/i f .x/g.x/dx;
0
for f; g 2 Hu L2 ./;
.0; a/:
(6.112)
In Hu L 2 ./, the linear functional R0' .e 0 /h' defined in Eq. 6.106 can be represented as the inner product
˛
˝
R0' .e 0 /h' D †0d ı x y ˛0 ; h' :
(6.113)
The operator LC .˛0 /, which is formally adjoint to L.˛0 / is readily obtained as
d2
LC ˛0 D 0 2 :
dx
(6.114)
6
Sensitivity and Uncertainty Analysis of Models and Data
335
Note that LC .˛0 / and L.˛0 / are formally self-adjoint. The qualifier “formal” must
still be kept at this stage, since the boundary conditions for LC .˛0 / have not been
determined yet. These boundary conditions are derived by applying Eq. 6.54 to the
operators L.˛0 /h' and LC .˛0 / , to obtain
Za
.x/ D
0d
2
h' .x/
dx D
dx 2
0
Za
D0
d2 .x/
h' .x/dx
dx 2
0
˚ C P h' ;
xDa
xD0
:
(6.115)
Note that the function .x/ is still arbitrary at this stage, except for the requirement
that
2 H Q D L 2 ./; note also that the Hilbert spaces H u and H Q have now
both become the same space, i.e., H u D H Q D L 2 ./.
Integrating the left side of Eq. 6.115 by parts twice and canceling terms yields
the following expression for the bilinear boundary form:
˚
P h' ;
xDa
xD0
D D0
dh'
d
h'
dx
dx
xDa
:
(6.116)
xD0
The boundary conditions for LC .˛0 / can now be selected by applying to Eq. 6.116
the general principles expressed in items (a) and (b) immediately following Eq. 6.56.
From Eq. 6.109, h' is known at x D 0 and x D a; however, the quantities
˚
dh' = dx
xDa
xD0
are not known. These unknown quantities can be eliminated from Eq. 6.116 by
choosing
.0/ D 0;
.a/ D 0;
(6.117)
.x/. Implementing Eqs. 6.117 and
as boundary conditions for the adjoint function
6.109 into 6.116 yields
˚ P h' ;
xDa
Note that the quantity
xD0
D D 0 ı'in
d
dx
PO h˛ ; I ˛0 :
(6.118)
xD0
PO h˛ ; I ˛0
does not vanish; furthermore, the boundary conditions in Eq. 6.117 for the adjoint
operator LC .˛0 / differ from the boundary conditions in Eq. 6.109 for L.˛0 /. Hence
even though the operators LC .˛0 / and L.˛0 / are formally self-adjoint, they are not
self-adjoint for this illustrative example.
336
D.G. Cacuci
The last step in the construction of the adjoint system is the identification of the
source term, which is done noting from Eq. 6.113 that
r' R e 0 D †0d ı x y ˛0 :
(6.119)
Thus, the complete adjoint system becomes
LC ˛0
D0
d2
D †0d ı x y ˛ 0 ;
2
dx
(6.120)
where the adjoint function .x/ is subject to the boundary conditions given in
Eq. 6.117.
Using Eqs. 6.118–6.120 and 6.114 in 6.113 gives the following expression for
the “indirect-effect” term R0' .e0 /h' :
R0'
e
0
Za
d2 ' 0 .x/
C ıQ dx
dx 2
0
d .x/
D 0 .ı'in /
;
dx
xD0
h' D .x/ ıD
(6.121)
where .x/ is the solution of the adjoint sensitivity system defined by Eqs. 6.120
and 6.117. Note that the adjoint sensitivity system is, as expected, independent of
parameter variations h˛ ; thus, the adjoint sensitivity system needs to be solved only
once to obtain the adjoint function .x/. Very important for our illustrative example
is also the fact (characteristic of linear systems) that the adjoint system is independent of the original solution ' 0 .x/, and can therefore be solved directly, without
any knowledge of the (original) flux '.x/. Of course, the adjoint sensitivity system
depends on the response, which provides the source term as shown in Eqs. 6.119
and 6.120.
Solving the adjoint sensitivity system, namely Eqs. 6.120 and 6.117, yields the
following expression for the adjoint function .x/:
0
0 x o
0
†0d n
;
H
x
y
˛
a
y
˛
x
y
˛
D0
a
where H x y ˛0 is the Heaviside-step function defined as
.x/ D
(
H.x/ D
0;
for x < 0
1;
for x 0
Noting from Eq. 6.92 that
d2 ' 0 .x/
Q0
D
;
dx 2
D0
:
(6.122)
6
Sensitivity and Uncertainty Analysis of Models and Data
337
using the above result together with Eq. 6.122 in 6.121, and carrying out the
respective integrations over x yields the following expression for the “indirecteffect” term R0' .e0 /h' :
R0'
0
e h' D †0d
"
0 0 2 !
D 'in
Q0 ıD
a2
ıQ D0
4
aQ0
#
D 0 'in0
1
Cı'in
C 2 0
:
(6.123)
2
a Q
1
2D 0
From Eq. 6.105, it follows that the partial sensitivity (i.e., the partial G-derivative)
of R with respect to †d has the expression
ıR
Q 0 a2
D 0 .'in0 /2
'in0
0
:
D 'max
D
C
C
ı†d
8D 0
2a2 Q0
2
(6.124)
Similarly, from Eq. 6.123, it follows that the partial sensitivities (i.e., the partial
G-derivatives) of R with respect to D; Q; and 'in are:
"
0 0 2 #
†0d Q0 a2
D 'in
ıR
;
D
0
2
ıD
2.D /
4
aQ0
"
0 0 2 #
†0d a2
ıR
D 'in
D
;
ıQ
2D 0 4
aQ0
D0' 0
1
ıR
C 2 in0 :
D †0d
ı'in
2
a Q
(6.125)
(6.126)
(6.127)
Since the sensitivities of the response R.e/ have thus been determined, Eqs. 6.124–
6.127 can be replaced in Eq. 6.101 to compute the variance var.R/. The “sandwich
rule,” i.e., Eq. 6.11, can also be applied to obtain the variance, var.y/, of the critical
point y.˛/, where the flux attains its maximum (and where the particle detector is
located in this illustrative problem). The corresponding expression for var.y/ is:
var.y/ D
ıy
ıD
2
var.D/ C
ıy
ıQ
2
var.Q/ C
ıy
ı'in
2
var.'in /;
(6.128)
where the symbol .ıy=ı˛i / denotes the partial sensitivity (i.e., the partial Gderivative) of y.˛/ to a generic parameter ˛i . Using the ASAP, the sensitivities
of the critical point y.˛/ can be computed by specializing Eqs. 6.82–6.84 to our
illustrative example. Thus, applying Eq. 6.82 to our example shows that
” 1 e0 ı 0 x y.’0 / :
(6.129)
338
D.G. Cacuci
Furthermore, Eq. 6.83 reduces to
Za
g1 D 0
d2 ' 0 .x/
dw1 .x/
0
w1 .x/ ıD
C ıQ dx D .ı'in /
; (6.130)
dx 2
dx
xD0
where the function w1 is the solution of the following particular form taken on by
the adjoint system shown in Eq. 6.84 for our illustrative example:
8
d2 w1 .x/
< C 0
D ı 0 x y ’0 ;
L ˛ w1 .x/ D 0
(6.131)
dx 2
:
w1 .0/ D 0; w1 .a/ D 0:
It is important to note that the same operator, namely LC .˛0 /, appears in both
Eqs. 6.120 and 6.131. Furthermore, the functions .x/ and w1 satisfy formally
identical boundary conditions, as can be noted by comparing Eq. 6.131 to 6.117, respectively. Only the source term, 1 .e0 /, in Eq. 6.131 differs from the corresponding
source term ru R.e0 / in Eq. 6.120. Therefore, the computer code employed to solve
the adjoint system given in Eqs. 6.120 to compute the adjoint function .x/ can
be used, with relatively trivial modifications, to compute the function w1 by solving Eq. 6.131. Comparing the right sides of Eqs. 6.130 and 6.121 reveals that they
are formally identical, except that the function .x/, which appears in Eq. 6.121, is
formally replaced by the function w1 in Eq. 6.130. This indicates that the computer
program employed to evaluate the “indirect-effect term” R0' .e0 /h' can also be used,
with only formal modifications, to evaluate the functional g1 , in Eq. 6.130.
To complete this illustrative example, we note that the explicit form of w1 can be
readily obtained by solving Eq. 6.131, as
xo
1 n ;
(6.132)
w1 .x/ D 0 H x y ˛0 D
a
where H Œx y.˛0 / is the previously defined Heaviside-step functional. Replacing
Eq. 6.132 in 6.130 and carrying out the respective integrations over x yields
Q0 ıD ı'in
1
ıQ a 2y ˛0 C
g1 D 0
0
2D
D
a
'in0 ıQ ıD ı'in
(6.133)
D
0 :
a Q0 D 0
'in
The sensitivity ıy ’0 I h˛ of the critical (i.e., maximum) point y.˛/ at ˛ 0 is
now obtained by applying Eq. 6.74 to our illustrative example. Thus, specializing
Eq. 6.70 to this illustrative problem yields:
Za
11
0
˚
@2 '= @x 2
e0
Q0
ı x y ’0 dx D 0 :
D
(6.134)
6
Sensitivity and Uncertainty Analysis of Models and Data
339
Replacing Eqs. 6.133 and 6.134 into the particular form taken on by Eq. 6.74 for this
illustrative example leads to
D 0 'in0
ıy ’0 I h˛ D
aQ0
ıQ ıD ı'in
:
Q0 D 0
'i0n
(6.135)
The above expression indicates that the partial sensitivities (i.e., the partial
G-derivatives) of y.˛/ with respect to D; Q; and 'in are as follows:
'0
ıy
D in0 ;
ıD
aQ
D 0 'in0
ıy
D
;
ıQ
a.Q0 /2
ıy
D0
D 0:
ı'in
aQ
(6.136)
(6.137)
(6.138)
The variance var.y/, of the critical point of y.˛/, can now be computed by replacing Eq. 6.136–6.138 in 6.128. Further details regarding this illustrative example are
given by Cacuci et al. [18].
6.5.2 ASAP for a Ricatti Equation
Consider the computation, using the ASAP, of the local sensitivities of a response
associated with a nonlinear initial-value problem modeled by the Ricatti equation
N.u/ du.t/
cu2 D 0;
dt
(6.139)
where c is an experimentally determined, time-independent quantity with nominal (mean-) value c 0 , and where the dependent variable u.t/ satisfies the initial
condition
u.0/ D uin ; for t D 0:
(6.140)
The initial condition uin is also considered to be an experimentally determined
quantity, having the nominal value u0in . For illustrative purposes, consider that the
response is a simple functional of u.t/, of the form:
Ztf
RD
f .t/ u.t/ dt;
(6.141)
0
where tf represents some final-time value. The function f .t/ is defined as
f .t/ fu d.t/;
(6.142)
340
D.G. Cacuci
where fu is a time-independent uncertain parameter with nominal value fu0 , while
d.t/ is a time-dependent function that contains no uncertain parameters. Thus, the
nominal value f 0 .t/ of f .t/ is
f 0 .t/ fu0 d.t/;
(6.143)
For this simple problem, the vector of parameters, ’, is a three-component column
vector of the form
(6.144)
’ .c; uin ; fu /;
with nominal value
’0 .c 0 ; u0in ; fu0 /:
(6.145)
Solving Eqs. 6.139 and 6.140 for the nominal parameter values given by Eq. 6.145
yields the following base-case nominal value, u0 .t/, for the dependent variable u.t/
u0 .t/ D
u0in c 0 .u0in /2 t
0
2
.c / .u0in /2 t 2 2c 0 u0in t
C1
:
(6.146)
In view of Eqs. 6.141, 6.144, and 6.146, the nominal value of the response is
Ztf
R D
0
f 0 .t/u0 .t/dt :
(6.147)
0
As usual, the known variations in the parameters ’ around the nominal value ’0
will be denoted by the vector of parameter variations h˛ , defined as
h˛ .ıc; ıuin ; ıfu /:
(6.148)
The variations h˛ will induce variations hu .t/ in u.t/ around u0 .t/, and will also
induce variations in the response R around R0 . The sensitivity of R to variations in
the vector of parameters ’ is obtained by computing, as usual, the G-differential of
R at e0 .u0 ; ’0 / in the arbitrary direction h .hu ; h˛ /; performing the respective
operations shows that the G-differential of R is linear in h. Therefore, it can be
written as:
(6.149)
DR.e 0 I h/ D R0u e 0 hu C R0˛ e 0 h˛
where
R0u e
0
Ztf
hu f 0 .t/hu .t/dt
(6.150)
ıfu d.t/u0 .t/dt:
(6.151)
0
and
R0˛ .e 0 /h˛
Ztf
0
6
Sensitivity and Uncertainty Analysis of Models and Data
341
Applying the FSAP to Eqs. 6.139 and 6.140 yields the following “forward sensitivity
equations (FSE)”:
L.˛ 0 /hu .t/ h.0/ D ıuin ;
d
2c 0 u0 .t/ hu .t/ D ıc u0 .t/
dt
for t D 0:
2
(6.152)
(6.153)
In Eq. 6.152, the operator L.˛ 0 / is defined as
d
L ˛0 2c 0 u0 .t/:
dt
(6.154)
In principle, Eqs. 6.152 and 6.153 could be solved, repeatedly, to obtain hu .t/ for
each parameter variation h˛ . The need to solve repeatedly the FSE can be circumvented by using the Adjoint Sensitivity Analysis Procedure (ASAP). The first
prerequisite for applying the ASAP is that the “indirect-effect” term R0u .e 0 /hu be
expressible as a linear functional of hu .t/. An examination of Eq. 6.150 readily
reveals that R0u .e 0 /hu is indeed a linear functional of hu .t/, which can be represented as an inner product in the real Hilbert space Hu L 2 ./, with .0; tf /,
equipped with the inner product
Ztf
hf .t/; g.t/i f .t/ g.t/ dt;
0
for f; g 2 Hu L2 ./;
.0; tf /:
(6.155)
In Hu L 2 ./, the linear functional R0u .e 0 /hu defined in Eq. 6.150 can be represented as the inner product
˛
˝
R0u .e 0 /hu D f 0 .t/; hu :
(6.156)
The next step, namely the construction of the operator LC .˛0 /, which is formally
adjoint to L.˛0 /, readily yields
LC .˛0 / d
2c 0 u0 .t/:
dt
(6.157)
Note that LC .˛0 / and L.˛0 / are not even formally self-adjoint. The boundary conditions for LC .˛0 / are derived by specializing Eq. 6.54 to the operators L.˛0 /hu
and LC .˛0 / , to obtain
Ztf
0
d
2c 0 u0 .t/ hu .t/dt D
.t/
dt
Ztf
hu .t/ d
2c 0 u0 .t/
dt
.t/ dt
0
C f P Œhu ;
t Dt
gxD0f :
(6.158)
342
D.G. Cacuci
Note that the function .x/ is still arbitrary at this stage, except for the requirement
that 2 HQ D L2 ./; note also that the Hilbert spaces Hu and HQ have now both
become the same space, i.e., Hu D HQ D L2 ./.
Integrating the left side of Eq. 6.158 by parts and canceling terms yields the following expression for the bilinear boundary form:
fP Œhu ;
t Dt
gxD0f D hu .tf / .tf / ıuin .0/:
(6.159)
The boundary conditions for LC .˛0 / can now be selected by noting that hu is not
known at t D tf ; it can therefore be eliminated from Eq. 6.159 by choosing
.tf / D 0;
(6.160)
as boundary conditions for the adjoint function .t/. Replacing Eq. 6.160 into 6.159
yields
xDa
˚ (6.161)
D ıuin .0/ PO h˛ ; I ˛0 :
P h' ;
xD0
The last step in the construction of the adjoint system is the identification of the
source term, which is done by using Eq. 6.156 to obtain
r' R.e 0 / D f 0 .t/;
(6.162)
so that the complete adjoint system becomes
LC .˛0 /
d
2c 0 u0 .t/
dt
D f 0 .t/;
(6.163)
where the adjoint function .x/ is subject to the boundary conditions given in
Eq. 6.160.
Using Eqs. 6.158–6.163
in 6.156 gives the following expression for the “indirect effect” term R0u e 0 hu :
R0u .e 0 /hu
Ztf
D ıc
.t/ u0 .t/
2
dt C ıuin .0/
(6.164)
0
where .t/ is the solution of the adjoint sensitivity system defined by Eqs. 6.163
and 6.160.
As expected, Eqs. 6.163 and 6.160, which underlie the adjoint sensitivity system,
are independent of parameter variations h˛ ; thus, the adjoint sensitivity system
needs to be solved only once to obtain the adjoint function .t/. Note, though, that
the adjoint sensitivity system depends on the nominal solution u0 .t/, since the original equation is nonlinear in u.t/. Of course, the adjoint sensitivity system depends
on the response, which provides the source term as shown in Eq. 6.163.
6
Sensitivity and Uncertainty Analysis of Models and Data
343
6.5.3 ASAP for a System of Linear Ordinary
Differential Equations
Consider the application of the ASAP for computing the sensitivity of a response associated with the following system of coupled linear ordinary differential equations:
8 du
1
ˆ
ˆ
ˆˇ dx u2 D Q1 ; x 2 .0; `/
<
du2
(6.165)
D xQ2 ;
u1 C ˇ
ˆ
ˆ
dx
:̂
u1 .0/ D u10 I u2 .0/ D u20 ;
and consider that the response of interest for this system is simply the functional
Z
R.e/ `
.r1
0
r2 /
u1 .x/
u2 .x/
dx:
(6.166)
where r1 and r2 are “experimentally determined” constants. In this example,
the column vector e .u; ˛/ has components u.x/ D Œu1 .x/; u2 .x/ and
˛ .ˇ; u10 ; u20 ; Q1 ; Q2 ; r1 ; r2 /. The components of the column vector
˛ .ˇ; u10 ; u20 ; Q1 ; Q2 ; r1 ; r2 / are considered to be uncertain parameters that
can vary by amounts h˛ .ıˇ; ıu10 ; ıu20 ; ıQ1 ; ıQ2 ; ır1 ; ır2 / around their nominal values ˛0 .ˇ 0 ; u010 ; u020 ; Q10 ; Q20 ; r10 ; r20 /. Note that, since all vectors are
considered to be column (rather than row) vectors, the symbol “T ,” denoting
“transposition,” will be omitted, to keep the notation as simple as possible.
The system defined by Eq. 6.165 can readily be solved for the nominal parameter
values to obtain
0
u1
0
u D
u02
!
.Q10 ˇ 0 Q20 C u020 / sin x=ˇ 0 C u010 cos x=ˇ 0 C xQ20
D
: (6.167)
.Q10 ˇ 0 Q20 C u020 / cos x=ˇ 0 u010 sin x=ˇ 0 Q10 C0 ˇQ20
Using Eq. 6.167 in 6.166 and performing the integration over x gives the nominal
response value as
R.e 0 / D .r1 r2 /
0
1
.Q10 ˇ 0 Q20 C u020 /ˇ 0 .1 cos `=ˇ 0 / C u010 ˇ 0 sin `=ˇ 0 C Q20 `2 =2
@ 0
A:
.Q1 ˇ 0 Q20 C u020 /ˇ 0 sin `=ˇ 0 u010 ˇ 0 .1 cos `=ˇ 0 / C .ˇ 0 Q20 Q10 /`
(6.168)
344
D.G. Cacuci
The sensitivity DR.e 0 I h/ of R.e/ at e 0 for a variation h D .hu ; h˛ /, where hu Œh1 .x/ h2 .x/ , is obtained by computing the G-differential of Eq. 6.166; this yields
DR.e 0 I h/ D R0u e 0 hu C R0˛ e 0 h˛ ;
(6.169)
where the indirect-effect term is defined as
R0u .’0 /hu Z
`
.r10
0
r20 /
Z `
0
h1 .x/
dx D
r1 h1 .x/ C r20 h2 .x/ dx; (6.170)
h2 .x/
0
whereas the direct-effect term is defined as
R0˛ .e 0 /h˛
Z
`
u0 .x/
ır2 / 10
u2 .x/
.ır1
Z
0
`
D
0
dx
.ır1 /u01 .x/ C .ır2 /u02 .x/ dx:
(6.171)
The direct-effect term given by Eq. 6.171 can already be evaluated at this stage by
inserting the nominal solution u0 .x/ D Œu01 .x/; u02 .x/ from Eq. 6.167 into 6.171,
above, and by carrying out the integration over x. The indirect-effect term, though,
must be evaluated by using either the FSAP or the ASAP.
If the FSAP is used, then the Forward Sensitivity System (FSS) is obtained by
G-differentiating Eq. 6.165, to obtain
8
ˆ
ˆ
ˆ
L.’0 /
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
<
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
:̂
h1
!
h2
h1 .0/
h2 .0/
d
ˇ 0 dx
1
1
d
ˇ 0 dx
!
h1
!
h2
0
1
du0
1
ıˇ
ıQ
1
dx A ;
D@
du0
xıQ2 ıˇ dx2
!
D
ıu10
ıu20
x 2 .0; `/;
(6.172)
!
;
The vector of variations hu Œh1 .x/ h2 .x/ in the state-function
u.x/ D Œu1 .x/; u2 .x/ around the nominal values e 0 can be computed by solving the Forward Sensitivity System (FSS) given by Eq. 6.172. In turn, the values
hu Œh1 .x/ h2 .x/ thus obtained could be used to compute the indirect-effect term
R0u .e 0 /hu from Eq. 6.170. However, the ASAP is more advantageous to use than the
FSAP since there are seven parameters but only one response, which means that the
sensitivities to all parameters could be obtained by computing the adjoint function
once (rather than solving the FSS seven times, if the FSAP were employed). Proceeding with the application of the ASAP, we construct next the operator LC .’0 /,
6
Sensitivity and Uncertainty Analysis of Models and Data
345
which is formally adjoint to L.’0 /. For this illustrative example, LC .’0 / is readily
obtained as
d
ˇ 0 dx
1
C
0
:
(6.173)
L .’ / D
d
1 ˇ 0 dx
Using Eqs. 6.172 and 6.173 in 6.54, performing the respective vector-matrix multiplications, and the subsequent integration over x yields the following form for the
bilinear concomitant:
g@
fP Œhu ;
x
ˇŒ
.`/ h1 .`/ C
1 .0/ h1 .0/ 1
2
.`/ h2 .`/
2 .0/h2 .0/ :
(6.174)
The boundary conditions for the adjoint function are determined by noting that
the values h1 .0/ and h2 .0/ are known from Eq. 6.172, but the values of h1 .`/ and
h2 .`/ are unknown. The latter can be eliminated from the expression of the bilinear
concomitant by requiring that
1 .`/
2 .`/
D
0
:
0
(6.175)
Replacing the boundary conditions for the adjoint function selected in Eq. 6.175
together with the boundary conditions for h1 .0/ and h2 .0/ from Eq. 6.172 reduces
the bilinear concomitant given in Eq. 6.174 to the expression
O ˛ ; I ˛0 / D ˇ Œ
P.h
1 .0/ıu10
2 .0/ıu20
:
(6.176)
The construction of the adjoint system can now be completed by using the expression of the indirect-effect term defined by Eq. 6.170 to obtain
LC .e 0 /
d
ˇ 0 dx
1
d
1 ˇ 0 dx
1 .x/
2 .x/
D
r10
r20
:
(6.177)
As expected, the adjoint sensitivity system, consisting of Eqs. 6.175 and 6.177, is
independent of any parameter variations h˛ .ıˇ; ıu10 ; ıu20 ; ıQ1 ;ıQ2 ; ır1 ; ır2 /,
and is also independent of variations hu Œh1 .x/ h2 .x/ . Thus, the adjoint sensitivity system needs to be solved once only to obtain the adjoint function . For this
illustrative example, Eqs. 6.177 and 6.175 can be readily solved to obtain
D
1
2
D
A sin x=ˇ 0 B cos x=ˇ 0 r2
B sin x=ˇ 0 C A cos x=ˇ 0 C r1
;
(6.178)
where the constants A and B are defined as
A r2 sin `=ˇ 0 r1 cos `=ˇ 0 ;
B r1 sin `=ˇ 0 r2 cos `=ˇ 0 :
(6.179)
346
D.G. Cacuci
Once has been obtained, it is used in conjunction with Eq. 6.91, specialized to this
illustrative example, to obtain the following expression for the sensitivity DR.e 0 I h/
in terms of the adjoint function :
Z
`
0
DR.e I h/ D
0
.ır1 /u01 .x/ C .ır2 /u02 .x/ dx
C ˇ Œ 1 .0/ıu10 C 2 .0/ıu20
Z `
du01
.x/
ıQ
ıˇ
C
1
1
dx
0
0
du
dx :
C 2 .x/ xıQ2 ıˇ 2
dx
(6.180)
As indicated by the above expression, the sensitivity DR.e 0 I h/ is calculated
most efficiently in terms of the adjoint function , by quadratures. Thus, even
this simple illustrative example highlights the powerful efficiency of the ASAP,
as can be seen from the following considerations: (i) to compute the sensitivity
DR.e 0 I h/ to the parameter variations by the FSAP, one would need to solve
(at least) seven times (since there are seven parameters) the coupled differential equations shown in Eq. 6.172, in order to compute hu Œh1 .x/ h2 .x/
for the vector of variations h˛ .ıˇ; ıu10 ; ıu20 ; ıQ1 ; ıQ2 ; ır1 ; ır2 /; (ii) alternatively, to compute the sensitivity DR.e 0 I h/ to the parameter variations
h˛ .ıˇ; ıu10 ; ıu20 ; ıQ1 ;ıQ2 ; ır1 ; ır2 / by the ASAP, one would need to solve
the adjoint sensitivity system once only, which is no more difficult than solving the
forward sensitivity system once, and then perform seven quadratures (i.e., integrations over x), as indicated in Eq. 6.180, to obtain the same sensitivities as would
be provided by the FSAP. If one needs sensitivities to more than seven parameter
variations, then, for each parameter variation, one would need to solve a system
of coupled differential equations by using the FSAP, as opposed to one quadrature
by using the ASAP. Comparing the computational effort needed to solve a system
of coupled differential equations as opposed to performing a quadrature clearly
underscores the powerful advantages of using the ASAP, whenever possible.
6.6 Computational Considerations and Open Problems
Recall from Section 6.3 that statistical uncertainty and sensitivity analysis methods aim at assessing the contributions of parameters uncertainties to the overall
uncertainty of the model response (output). The relative magnitude of this uncertainty contribution is assigned a measure of the statistical sensitivity of the response
uncertainty to the respective parameter, and this measure is also used to rank the importance of the respective parameter. The simplest conceptual attempt at “sensitivity
analysis” is to use screening design methods to identify a short list of parameters
that have the largest influence on a particular model response. The fundamental
6
Sensitivity and Uncertainty Analysis of Models and Data
347
assumption underlying all screening design methods is that the influence of parameters in models follows Pareto’s law of income distribution within nations, i.e.,
the number of parameters that are truly important to the model response is small
by comparison to the total number of parameters in the model. There is an inevitable trade-off between the computational costs and the information extracted
from a screening design. Thus, computationally economical methods provide only
qualitative, rather than quantitative information, in that they provide a parameter importance ranking rather than a quantification of how much a given parameter is more
important than another. Furthermore, the importance of parameters in large-scale,
complex models is seldom a priori obvious (and may even be counterintuitive).
Hence, screening design methods should be used cautiously, since they may be
a priori inadequate to identify the truly important parameters.
As has also been discussed in Section 6.3, “sampling-based uncertainty and sensitivity analysis” is performed in order to ascertain if model predictions fall within
some region of concern (“uncertainty in model responses due to uncertainties in
model parameters”) and to identify the dominant parameters in contributing to the
response uncertainty (“statistical sensitivity analysis”). It is particularly important to
recall from Section 6.3 that the very first step (of the five specific steps described
there) of “sampling-based uncertainty and sensitivity analysis,” in which subjective uncertainties are assigned through expert review processes, is crucial to the
results produced by the subsequent steps in the analysis. This exceptional importance of the very first step is due to the fact that the subsequent results produced by
sampling-based uncertainty and sensitivity analysis methods depend entirely on the
distributions assigned to the sampled parameters; hence, the proper assignment of
these distributions is essential to avoid producing spurious results. Furthermore, the
statistical “sensitivity analysis” of the response to the parameters is performed in
the fifth (and last) step of these procedures (by using scatter plots, regression analysis, partial correlation analysis, etc.). Therefore, it is also important to note that
correlated variables introduce unstable regression coefficients when performing the
“statistical sensitivity analysis” part, because these coefficients become sensitive to
the specific variables introduced into the regression model. In such situations, the
regression coefficients of a regression model that includes all of the parameters are
likely to give misleading indications of parameter “sensitivities.” If several input
parameters are suspected (or known) to be highly correlated, it is usually recommended to transform the respective parameters so as to remove the correlations or,
if this is not possible, to analyze the full model by using a sequence of regression
models with all but one of the parameters removed, in turn.
From the material presented in the preceding sections, it has also become apparent that all statistical uncertainty and sensitivity analysis procedures commence with
the “uncertainty analysis” stage, and only subsequently proceed to the “sensitivity
analysis” stage; this path is the exact reverse of the conceptual path underlying the methods of deterministic sensitivity and uncertainty analysis where the
sensitivities are determined prior to using them for uncertainty analysis. Without any a priori assumption regarding the relationship between the parameters
and the response, the construction of a full-space statistical uncertainty analysis
348
D.G. Cacuci
requires O.s I / computations, where s denotes the number of sample values for
each parameter and I denotes the number of parameters. If a local polynomial
regression is used, the rate of convergence of the approximate to the true parametersto-response mapping is sN D N p=.2pCI / , where N denotes the number of sample
points, p denotes the degree of smoothness of the function representing the response
in terms of the parameters, and I denotes the number of parameters. This relation indicates that the parameters-to-response mapping (function) can be approximated to
a resolution of s 1 with O.s I=p / sample points. The FAST method appears to be the
most efficient of the global statistical methods, needing I.8!i C 1/Nr computations
for each frequency, where Nr denotes the number of replicates. For example, if the
response is a function of eight parameters, and if the sample size is 64, then Sobol’s
method requires 1,088 model evaluations, while the FAST method requires 520
model evaluations; when the sample size increases to 1,024, then Sobol’s method
requires 17,408 model evaluations, while the FAST method requires 8,200 model
evaluations. It becomes clear that even for the most efficient statistical methods (e.g.,
the FAST method) the number of required model evaluations becomes rapidly impractical for realistic, large-scale models involving many parameters. Thus, since
many thousands of simulations are needed, statistical methods are at best expensive (for small systems), or, at worst, impracticable (e.g., for large time-dependent
systems). Furthermore, since the response sensitivities and parameter uncertainties are inherently amalgamated, improvements in parameter uncertainties cannot
be directly propagated to improve response uncertainties; rather, the entire set of
simulations must be repeated anew. Currently, a general-purpose “fool-proof” statistical method for correctly analyzing mathematical models of physical processes
involving highly correlated parameters does not seem to exist, so that particular care
must be taken while interpreting regression results for such models.
Summarizing the computational effort required by the various deterministic
methods for computing local sensitivities, we recall that the “brute-force method”
is conceptually simple to use and requires no additional model development, but it
is slow, relatively expensive computationally, and involves a trial-and-error process
when selecting the parameter perturbations ı˛i , in order to avoid erroneous results
for the computed sensitivities. The “brute-force method” requires .I C 1/ model
computations; if central differences are used, the number of model computations
could increase up to a total of 2I . Of the deterministic methods for obtaining the
local first-order sensitivities exactly, the Green’s function method is the most expensive computationally. The DDM requires at least as many model evaluations as there
are parameters, and the computational effort increases linearly with the number of
parameters. Since it uses Gâteaux-differentials, the FSAP represents a generalization
of the DDM; nevertheless, the FSAP requires the same computational and programming effort to develop and implement as the DDM. Hence, just like the DDM, the
FSAP is advantageous to employ only if the number of different responses of interest for the problem under consideration exceeds the number of system parameters
and/or parameter variations to be considered. Otherwise, the use of either the FSAP
or the DDM becomes impractical for large systems with many parameters, because
the computational requirements become unaffordable.
6
Sensitivity and Uncertainty Analysis of Models and Data
349
By far the most efficient local sensitivity analysis method is the ASAP, but the
ASAP requires development of an appropriate adjoint sensitivity model. If the adjoint model is developed simultaneously with the original model, then it requires
very little additional resources to develop. If, however, the adjoint sensitivity system is developed a posteriori, considerable skill may be required for its successful
implementation and use. Note that the adjoint sensitivity system is independent of
parameter variations, but depends on the response, which contributes the sourceterm for this system. Hence, the adjoint sensitivity system needs to be solved only
once per response in order to obtain the adjoint function. Furthermore, the adjoint
sensitivity system is linear in the adjoint function. In particular, for linear problems,
the adjoint sensitivity system is independent of the original state variables, which
means that it can be solved independently of the original system. In summary, the
ASAP is the most efficient method to use for sensitivity analysis of systems in which
the number of parameters exceeds the number of responses under consideration.
It is important to emphasize that the “propagation of moments” equations are
used both for processing experimental data obtained from indirect measurements
and also for performing statistical analysis of computational models. The “propagation of moments” equations provide a systematic way for obtaining the uncertainties
in computed results, arising not only from uncertainties in the parameters that enter
the computational model, but also from the numerical approximations themselves.
The major advantages of using the “propagation of moments” method are: (i) if all
sensitivities are available, then all of the objectives of sensitivity analysis (enumerated above) can be pursued efficiently and exhaustively; and (ii) since the response
sensitivities and parameter uncertainties are obtained separately from each other,
improvements in parameter uncertainties can immediately be propagated to improve
the uncertainty in the response, without the need for expensive model recalculations.
On the other hand, the major disadvantage of the “propagation of moments” method
is that the local sensitivities need to be calculated a priori; as we have already
emphasized, such calculations are very expensive computationally, particularly for
large (and/or time-dependent) systems. It hence follows that a very efficient overall
methodology for performing local sensitivity and uncertainty analysis would be to
combine the ASAP (which would provide the local response sensitivities) with the
“propagation of moments” method, to obtain the local response uncertainties.
Several techniques have been proposed (see, e.g., the reviews by Greenspan
[34], Ronen [52]; and references therein) for computing higher-order response
derivatives with respect to the system’s parameters. However, none of these techniques has proven routinely practicable for large-scale problems. This is because
the systems of equations that need to be solved for obtaining the second-order (and
higher-order) Gâteaux-differentials of the response and system’s operator equations
are very large and depend on the perturbation ı˛. Thus, even the calculation of
the second-order (local) Gâteaux-differentials of the response and system’s operator
equations is just as difficult as undertaking the complete task of computing the exact
value of the perturbed response R.˛10 C ı˛1 ; : : : ; ˛k0 C ı˛k /.
It appears that the only genuinely global deterministic method for sensitivity
analysis is the Global Adjoint Sensitivity Analysis Procedure (GASAP) proposed
350
D.G. Cacuci
by Cacuci [11]. Instead of attempting to extend the validity of local Taylor series,
the GASAP is based on a global homotopy-based concept for exploring the entire
phase space spanned by the parameters and state variables. The GASAP yields information about the important global features of the physical system, namely the
critical points of R.˛/ and the bifurcation branches and/or turning points of the system’s state variables. Although the GASAP is both exhaustive and computationally
efficient, its general utility for large-scale models is still untested at the time of this
writing.
Regarding future developments in sensitivity and uncertainty analysis, two of the
outstanding issues, whose solution would greatly advance the state of overall knowledge, would be: (i) to extend the adjoint sensitivity analysis procedure (ASAP) to
problems describing turbulent flows, and (ii) to combine the GASAP with global
statistical uncertainty analysis methods, striving to perform, efficiently and accurately, global sensitivity and uncertainty analyses for large-scale systems.
References
1. Andres TH, Hajas WC (April 1993) Using iterated fractional factorial design to screen parameters in sensitivity analysis of a probabilistic risk assessment model. Proceedings of the Joint
International Conference on Mathematical Methods and Supercomputing in Nuclear Applications, vol 2, 328. Karlsruhe, Germany, pp 19–23
2. Berger J (1985) Statistical decision theory and Bayesian analysis, 2nd edn. Springer, New York
3. Bettonvil B (1990) Detection of important factors by sequential bifurcation. Tilburg University
Press, Tilburg
4. Bode HW (1945) Network analysis and feedback amplifier design. Van Nostrand-Reinhold,
Princeton, NJ
5. Bogaevski VN, Povzner A (1991) Algebraic methods in nonlinear perturbation theory.
Springer, New York
6. Bonano EJ, Apostolakis GE (1991) Theoretical foundations and practical issues for using expert judgments in uncertainty analysis of high-level radioactive waste disposal. Radioact Waste
Manag Nucl Fuel Cycle 16:137
7. Box GEP, Hunter WG, Hunter JS (1978) Statistics for experimenters. Chap.15. Wiley,
New York
8. Bysveen S et al. (1990) Experience from application of probabilistic methods in offshore field
activities. In Proceedings of the ninth international conference on off-shore mechanics and
arctic engineering. Saga Petroleum, AS, Norway
9. Cacuci DG (1981) Sensitivity theory for nonlinear systems. I. Nonlinear functional analysis
approach. J Math Phys 22:2794
10. Cacuci DG (1981) Sensitivity theory for nonlinear systems. II. Extensions to additional classes
of responses. J Math Phys 22:2803
11. Cacuci DG (1990) Global optimization and sensitivity analysis. Nucl Sci Eng 104:78
12. Cacuci DG (2003) Sensitivity and uncertainty analysis: I. theory, vol 1. Chapman & Hall/CRC
Press, Boca Raton, FL
13. Cacuci DG, Hall MCG (1984) Efficient estimation of feedback effects with application to climate models. J Atmos Sci 41:2063
14. Cacuci DG, Ionescu-Bujor M (2000) Adjoint sensitivity analysis of the RELAP5/MOD3.2
Two-fluid thermal-hydraulic code system: I. theory. Nucl Sci Eng 136:59
15. Cacuci DG, Ionescu-Bujor M (2002) Adjoint sensitivity and uncertainty analysis for reliability/availability models, with application to the international fusion materials irradiation
6
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
Sensitivity and Uncertainty Analysis of Models and Data
351
facility. A&QT-R 2002 (THETA 13). International conference on automation, quality and testing, robotics. Cluj-Napoca, Romania, May 23–25
Cacuci DG, Wacholder E (1982) Adjoint sensitivity analysis for transient two-phase flow. Nucl
Sci Eng 82:461
Cacuci DG et al. (1980) Sensitivity theory for general systems of nonlinear equations. Nucl
Sci Eng 75:88
Cacuci DG, Ionescu-Bujor M, Navon IM (2005) Sensitivity and uncertainty analysis: II.
Applications to large-scale systems, vol 2. Chapman & Hall/CRC Press, Boca Raton, FL
Cacuci DG, Maudlin PJ, Parks CV (1983) Adjoint sensitivity analysis of extremum-type responses in reactor safety. Nucl Sci Eng 83:112
Clement RT, Winkler RL (1999) Combining probability distributions from experts in risk analysis. Risk Anal 19:187
Cotter SC (1979) A screening design for factorial experiments with interactions. Biometrika
66:317
Cruz JB (1973) System sensitivity analysis. Dowden, Hutchinson and Ross, Stroudsburg, PA
Cukier RI et al. (1973) Study of the sensitivity of coupled reaction systems to uncertainties in
rate coefficients. I: theory. J Chem Phys 59:3873
Daniel C (1973) One-at-a-time-plans. J Am Statist Assoc 68:353
Deif AS (1986) Sensitivity analysis in linear systems. Springer, New York
Dunker AM (1981) Efficient calculation of sensitivity coefficients for complex atmospheric
models. Atmos Environ 15:1155
Dunker AM (1984) The decoupled direct method for calculating sensitivity coefficients in
chemical kinetics. J Chem Phys 81:2385
Eckhaus W (1979) Asymptotic analysis of singular perturbations. North-Holland, Amsterdam
Eslami M (1994) Theory of sensitivity in dynamic systems. Springer, Heidelberg
Fiacco AV (ed) Sensitivity, stability, and parametric analysis (a publication of the Mathematical
Programming Society). North-Holland, Amsterdam
Fischer RA (1935) The design of experiments. Oliver & Boyd, Edinburgh
Frank PM (1978) Introduction to system sensitivity theory. Academic, New York
Gandini A (1987) Generalized Perturbation Theory (GPT) methods: a heuristic approach.
In: Lewins J, Becker M (eds) Advances in nuclear science and technology, vol 19. Plenum,
New York
Greenspan E (1982) New developments in sensitivity theory. In: Lewins J, Becker M (eds)
Advances in nuclear science and technology, vol 14. Plenum, New York
Holmes MH (1995) Introduction to perturbation methods. Springer, New York
Hora SC, Iman RL (1989) Expert opinion in risk analysis: the NUREG-1150 methodology.
Nucl Sci Eng 102:323
Iman RL, Conover WJ (1982) A distribution-free approach to inducing rank correlation among
input variables. Commun Stat Simul Comput B 11:311
Ionescu-Bujor M, Cacuci DG (2000) Adjoint sensitivity analysis of the RELAP5/MOD3.2
two-fluid thermal-hydraulic code system: II. Applications. Nucl Sci Eng 136:85
Ionescu-Bujor M, Cacuci DG (2003) Illustrative application of the adjoint sensitivity analysis procedure to reliability models of electromechanical devices, SIELMEN 2003. The 4th
international conference on electromechanical and energetic systems. Chisinau, Republic of
Moldavia, September 26–27
Kato T (1963) Perturbation theory for linear operators. Springer, Berlin
Kevorkian J, Cole JD (1996) Multiple scale and singular perturbation methods. Springer,
Heidelberg
Kokotovic PV et al. (1972) Singular perturbations: order reduction in control system design,
JACC
Kramer MA et al. (1981) An improved computational method for sensitivity analysis: Green
function method with AIM. Appl Math Model 5:432–441
Lillie RA et al. (1988) Sensitivity/Uncertainty analysis for free-in-air tissue kerma at Hiroshima and Nagasaki due to initial radiation. Nucl Sci Eng 100:105
352
D.G. Cacuci
45. Madsen HO et al. (1986) Methods of structural safety. Prentice Hall, Engleqood Cliffs, NJ
46. McKay MD et al. (1979) A comparison of three methods of selecting values of input variables
in the analysis of output from a computer code. Technometrics 21:239
47. Morris MD (1991) Factorial sampling plans for preliminary computational experiments. Technometrics 33:161
48. Navon IM (1998) Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography. Dyn Atmos Ocean 27(1–4):55
49. Navon IM et al. (1992) Variational data assimilation with an adiabatic version of the NMC
spectral model. Mon Wea Rev 120:1433
50. Nayfeh AH (1973) Perturbation methods. Wiley, New York
51. O’Malley RE, Jr (1991) Singular perturbation methods for ordinary differential equations.
Springer, New York
52. Ronen Y (1988) Uncertainty analysis based on sensitivity analysis. In: Ronen Y (ed) Uncertainty analysis. CRC Press, Boca Raton, FL
53. Rosenwasser E, Yusupov R (2000) Sensitivity of automatic control systems. CRC Press, Boca
Raton, FL
54. Sagan H (1969) Introduction to the calculus of variations. McGraw-Hill, New York
55. Saltelli A, Sobol IM (1995) About the use of rank transformation in sensitivity analysis of
model output. Reliab Eng Syst Safety 50:225
56. Saltelli A et al. (1999) A quantitative, model independent method for global sensitivity analysis
of model output. Technometrics 41:39
57. Sanchez MA, Blower SM (1997) Uncertainty and sensitivity analysis of the basic reproductive
rate. Tuberculosis as an example. Am J Epidemiol 145:1127
58. Selengut DS (1959) Variational analysis of multidimensional systems. Rep. HW-59129,
HEDL, Richland, WA, pp 97
59. Selengut DS (1963) On the derivation of a variational principle for linear systems. Nucl Sci
Eng 17:310
60. Stacey WM, Jr (1974) Variational methods in nuclear reactor physics. Academic, New York
61. Tomovic R, Vucobratovic M (1972) General sensitivity theory. Elsevier, New York
62. US NRC (Nuclear Regulatory Commission) (1990–1991) Severe accident risks: an assessment for five U.S. nuclear power plants, NUREG-1150. US Nuclear Regulatory Commission,
Washington, DC
63. Usachev LN (1964) Perturbation theory for the breeding ratio and for other number ratios
pertaining to various reactor processes. J Nucl Energy A/B 18:571
64. Weinberg AM, Wigner EP (1958) The physical theory of neutron chain reactors. University of
Chicago Press, Chicago, IL
65. Wigner EP (1945) Effect of small perturbations on pile period. Manhattan Project Report CPG-3048
66. Xiang Y, Mishra S (1997) Probabilistic multiphase flow modelling using the limit-state method.
Groundwater 35(5):820
67. Yosida K (1971) Functional analysis. Springer, Berlin
68. Zou X et al. (1993) An adjoint sensitivity study of blocking in a two-layer isentropic model.
Mon Wea Rev 121:2833
6
Sensitivity and Uncertainty Analysis of Models and Data
353
Professor Dan Gabriel Cacuci received his
degrees (M.Sc. in 1973, M.Phil. in 1977,
and Ph.D. in Applied Physics and Nuclear
Engineering, in 1978) from Columbia University. He worked at Oak Ridge National
Laboratory, becoming Senior Section Head
overseeing activities in several high-profile
projects, including the Strategic Defense Initiative and the Advanced Neutron Source,
while actively conducting theoretical and experimental research on a variety of topics
including sensitivity and uncertainty analysis of large-scale nonlinear systems. In 1988,
Dr. Cacuci joined the Department of Chemical and Nuclear Engineering, University of
California, Santa Barbara, as a Full Professor. He continued his research and teaching activities in the Department of Nuclear
Engineering, University of Illinois, Urbana-Champaign, where he was a Full Professor from 1990 to 1993. In 1992, he moved to the University of Karlsruhe, Germany,
as Ordinarius Chair Professor and Director, a lifetime appointment, of the Institute
for Nuclear Technology and Reactor Safety, while concurrently serving as director of the Institute for Reactor Safety at the National Research Center in Karlsruhe
(FZK). Over the years, Professor Cacuci held faculty positions in various leading
universities, including Distinguished Professor of Engineering and Applied Science
at the University of Virginia, Visiting Professor in the Department of Nuclear Engineering and Radiological Sciences, University of Michigan, Visiting Professor at
the Royal Institute of Technology, Stockholm, and Adjunct Professor of Nuclear
Engineering at the University of California, Berkeley, graduating 51 Ph.D. students
in Germany. In 2004, Dr. Cacuci resigned his position as Institute Director at FZK to
become the Scientific Director for the Nuclear Energy Sector (Division) of France’s
Commissariat a l’Energie Atomique. He was elected as Fellow of the American
Nuclear Society (ANS) in 1986, and has been Editor, since 1984, of the Society’s
leading international research journal, Nuclear Science and Engineering. Included
among the awards and recognitions received by Dr. Cacuci are four honorary doctorates; the Alexander von Humboldt Prize for Senior Scholars, Germany, 1990; the
Science Prize and title of Honorary Member of the Romanian Cultural Foundation,
Bucharest, Romania, 1995; the A. A. Harms International Award, 1998; the Glenn
Seaborg Medal of the ANS, 2000; and the Eugene P. Wigner Reactor Physics Award
of the ANS, 2001. In 1998, he won the Department of Energy’s prestigious E. O.
Lawrence Memorial Award. He is an honorary member of the Romanian Academy
and European Academy of Sciences and Arts. He has published three books and ca.
180 refereed publications in international journals and conferences.
Chapter 7
Criticality Safety Methods
G.E. Whitesides, R.M. Westfall, and C.M. Hopper
7.1 Introduction
7.1.1 Overview
The objective of this chapter is to examine the history of nuclear criticality safety
calculations. To this end, we will review the history of criticality safety concerns and
look at the various approaches in dealing with this particular type of calculation. The
criticality safety methods that are the subject of this chapter are those that are used to
determine the criticality safety of the handling, transportation, and storage of fissile
materials outside nuclear reactors. The object of criticality safety studies is defined
in American National Standards Institute, Inc./American Nuclear Society Standard
ANSI/ANS-8.1–1998 for Nuclear Criticality Safety in Operations with Fissionable
Materials Outside Reactors, specifically:
4.1.2 Process Analysis. Before a new operation with fissionable material is begun, or
before an existing operation is changed, it shall be determined that the entire process will
be subcritical under both normal and credible abnormal conditions.
4.2.5 Subcritical Limit. Where applicable data are available, subcritical limits shall be
established on bases derived from experiments, with adequate allowance for uncertainties
in the data. In the absence of directly applicable experimental measurements, the limits may
be derived from calculations made by a method shown by comparison with experimental
data to be valid in accordance with 4.3. (i.e., 4.3 Validation of a Calculational Method).
7.1.2 Historical Background
While we normally think of criticality occurrences as events that have taken place
since the 1940s, we now know that one occurred many years ago. The first known
G.E. Whitesides, R.M. Westfall, and C.M. Hopper
Oak Ridge National Laboratory
e-mail: whitesidesge@ornl.gov; westfallrm@ornl.gov; hoppercm@ornl.gov
“criticality” took place 1,800 million years ago in the Republic of Gabon [1]; hence,
the issue of criticality outside a nuclear reactor has been with us for a long time.
We do not, of course, know the consequences of this occurrence from the perspective of personnel safety. The most interesting aspect of the Oklo criticality is that the material has remained in place, without migrating, for a very long time.
7.2 Nuclear Criticality Safety: The Early Years
7.2.1 The First Criticality Concerns
In modern history, criticality safety studies originated in the early 1940s. In determining whether a nuclear weapon could be built, the US government decided on three approaches to obtaining fissile material: electromagnetic separation of U-235, conversion of U-238 into Pu-239 by irradiation, and separation by gaseous diffusion. In the first approach, a batch process, the amount of fissile material handled at any time was relatively small and could not pose a criticality safety problem. In the second, the fissile material was natural uranium in solid form, which was converted into small batch quantities of plutonium, far below critical quantities; thus, it raised no concerns. However, the last approach was a continuous-process
gaseous diffusion method that did cause concerns about a criticality accident. Because the material initially was in the form of natural uranium, no criticality
concerns would normally exist. However, because the quantity of material was large
and the process was intended to continuously enrich the material in U-235 content,
a question arose that gave birth to modern nuclear criticality safety concerns. This
slightly humorous story about one of the first criticality safety concerns was recalled by Edward Teller at a criticality safety experiments conference at Lawrence Livermore National Laboratory in 1968 (with kind permission of the Lawrence Livermore National Laboratory) [2].
Since the total uranium quantity in the gaseous diffusion plant was large and, hence, the
U-235 inventory significant, Dr. Teller was given the task of evaluating the criticality safety
concerns. At one meeting to address safety concerns, as Teller recalled, one of those present asked a question: since the uranium was in gaseous form and uranium atoms could move around freely, would it be possible for the U-238 atoms in the U-238–U-235 mixture to collect by chance at one end of a tube, and the U-235 at the other end, in sufficient quantity to achieve criticality? Teller responded that, though the possibility existed, it was exactly as likely as all the oxygen atoms in the room where they were sitting congregating under the table, with everyone present then dying from lack of oxygen. Thus, the first criticality accident concerns were allayed without the need to resort to calculations.
This was, of course, before the time of highly enriched U-235 in the gaseous diffusion plants.
7.2.2 Early Attempts at Criticality Safety Computations
Needless to say, it was not long before sufficient fissile material was produced that
real concerns about criticality safety did indeed arise. The early pioneers had only
rudimentary computational tools available, and most computations were made with
either one- or two-energy groups and with cross sections that had been determined
from the relatively few cross-section measurements that had been made at that time.
The concerns centered primarily on two physical forms: fissile materials in liquids
and solids. The latter was mainly in metal form since that was the material required
for building a nuclear weapon. In the liquid form, the thermal-energy cross sections
were most important; in the metal form, the high-energy cross sections were dominant. Hence, these one- or two-energy-group calculations, made on a mechanical
calculator, provided the tools that initiated the field of nuclear criticality safety
calculations.
The field of nuclear criticality safety calculations has always been intimately
bound to criticality experiments. As the weapons development program progressed,
experimental criticality facilities were developed, first at Los Alamos and then at
Oak Ridge. The first critical experiments for demonstrating the safety of enrichment
plant design were conducted at Los Alamos by Oak Ridgers, including Clifford
Beck, Dixon Callihan, and Raymond Murray. These early experiments were intended to establish critical mass values for various forms of fissile materials, with
little attention paid to the needs for validating criticality safety calculations. This
emphasis, as will be discussed later, changed slowly over the years as the need to
validate calculations became the major thrust of the fissile materials critical experiments programs.
The one- and two-group calculations coupled with critical experimental data
provided the information to guide the criticality safety specialist for a substantial
period in the early nuclear program. The major criticality accidents during this time
occurred in the critical experiments facilities and not in other fissile-handling operations. Most of the accidents in operations occurred in the 1950s, including accidents at both the Oak Ridge Y-12 Plant and Los Alamos in 1958.
7.3 Criticality Safety Versus Reactor Design Calculations
7.3.1 Computational Requirements for Reactor Design
During the 1950s, the emphasis in the nuclear industry shifted to developing nuclear
reactors. During this period, the first electronic computers appeared on the scene.
This combination provided the environment for the development of multigroup diffusion theory computer programs and the first capabilities to perform neutronic
calculations for multiregion systems. The need for a computational tool for reactors, at that time, was quite adequately met by these codes.
7.3.2 Special Requirements for Criticality Safety Calculations
While the development of these new multigroup diffusion computation programs
was hailed as a major development for the design of nuclear reactors, those involved
in nuclear criticality calculations found that this was only a partial solution to their
need for a computational tool. Most reactor designs under consideration at that time
relied on thermal fission for their principal reaction. Furthermore, the systems were
designed with a minimum of leakage and a minimum of materials with large absorption cross sections. These are precisely the conditions for which the diffusion
theory approximation is applicable, that is, large single units with low leakage, no
voids, and low neutron absorption.
On the other hand, the criticality safety practitioners were seeking to prevent criticality and to design safe systems. They relied on minimizing moderators, increasing
leakage, and using materials with large absorption cross sections. It was clear that
better computational tools were needed.
This need resulted in the development of a number of computational tools.
We will briefly describe each method and then provide more specific information regarding each tool and its use in criticality safety calculations. The
specific time periods for the development of each of these methods were intertwined, and the development of each in some way was dependent on the other
methods.
7.3.3 Early Computational Tools for Criticality Safety Calculations
The methods discussed here are the diffusion theory method, the solid angle method,
the surface density method, and the density analog method. These are later followed
by the Sn, or discrete ordinates, method, the Monte Carlo method, and the N(B_N)² method.
The diffusion theory method was clearly the first of the computational methods
and found extensive use in the development of nuclear reactors. Its usefulness was greatest for systems that satisfied the limiting conditions under which the Boltzmann transport equation reduces to the diffusion approximation. From a criticality safety viewpoint, this allowed computation for
well-moderated fissile material in a single unit. This was a good first computational
step, but clearly something better was needed. Transformations of diffusion theory, supported by critical data, extended the “four-factor formula” to treat neutron leakage due to geometric effects (buckling, including the neutron extrapolation distance), reflector effects (reflector savings), and neutron nonleakage probability effects (Fermi age, migration area), as solved with a one- or two-energy-group “five-factor formula” or “six-factor formula,” respectively. Even with its limitations,
modifications of diffusion theory provided the basis for the early development of
other approximate methods such as the solid angle, surface density, and density
analog methods for determining the criticality safety of arrays of fissile materials.
These methods were developed to take the output from the diffusion calculation for
a single fissile unit and use this information to determine the multiplication factor
for an array of units. While the methods were not necessarily rigorous, they were
born of necessity and nurtured by ingenuity. A complete mathematical description of
these various approximate methods can be found in an article authored by Douglas
Hunt [3].
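For reference, the five- and six-factor formulas invoked above have the standard textbook form (stated here for the reader's convenience; the formula itself is not reproduced in the original):

keff = η ε p f P_FNL P_TNL,

where η is the reproduction factor, ε the fast-fission factor, p the resonance escape probability, f the thermal utilization, and P_FNL and P_TNL the fast and thermal nonleakage probabilities. Dropping the two nonleakage factors recovers the four-factor formula for an infinite medium, while a one-group treatment with a single nonleakage probability gives the five-factor form.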
The first was the solid angle method. In this method, the leakage from a single unit was input into a calculation that computed the solid angle subtended at each unit in an array by all the other units in the array. The solid angle method was first proposed by criticality methods staff at the Oak Ridge Gaseous Diffusion Plant to determine the safe spacing of process equipment at the plant. The original idea was a very simple one, based on an optical view of the neutrons leaking from a unit: the solid angle of a unit was the angle subtended by all visible surrounding units. By using this information, an approximate keff of the array could be determined.
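The arithmetic behind this optical bookkeeping is simple enough to sketch in a few lines of Python (an illustration of the idea only, not the original plant procedure; the spherical-unit assumption, unit radius, and array pitch below are ours):

import math

def solid_angle_of_sphere(radius, distance):
    # Solid angle (steradians) subtended by a sphere of given radius,
    # viewed from a point at the given center-to-center distance.
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - (radius / distance) ** 2))

def total_solid_angle(target, others, radius):
    # Sum the solid angles subtended at one unit by all other units.
    return sum(solid_angle_of_sphere(radius, math.dist(target, p)) for p in others)

# Example: 3 x 3 planar array of 10-cm-radius units on a 60-cm pitch.
pitch, radius = 60.0, 10.0
units = [(i * pitch, j * pitch, 0.0) for i in range(3) for j in range(3)]
center = units[4]
print(total_solid_angle(center, [u for u in units if u != center], radius))

A total solid angle below an experimentally justified allowable value was then taken to indicate acceptable spacing.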
The second was the surface density method. In this method, the fissile material
in an array was assumed to have somehow flowed downward into a slab. An empirical model was developed to determine whether the array was safe, based on the
knowledge of the infinite critical slab thickness for the material involved.
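In code form the criterion amounts to a one-line comparison (a schematic of the idea only; the allowed fraction below is a placeholder, not the published empirical constant):

def surface_density_safe(total_fissile_mass_g, floor_area_cm2,
                         critical_slab_areal_density_g_cm2,
                         allowed_fraction=0.5):
    # Collapse the array onto its floor and compare areal (surface) densities.
    sigma = total_fissile_mass_g / floor_area_cm2
    return sigma <= allowed_fraction * critical_slab_areal_density_g_cm2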
The third was the density analog method, which was based on the premise that if the densities of a just-critical system are increased to x times their initial value and all the linear dimensions are reduced to 1/x times their initial value, the system will remain critical. This premise allowed the development of a theory in which single-unit data for a material could be used to evaluate array criticality.
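The scaling premise is easy to exercise numerically (our own sketch; the baseline just-critical sphere below is hypothetical). Since densities scale as x and dimensions as 1/x, the mass of a just-critical unit scales as 1/x²:

import math

def scaled_critical_state(rho, radius, x):
    # If (rho, radius) is just critical, the density analog premise says
    # (x*rho, radius/x) is too; the mass rho*V then scales as 1/x**2.
    new_rho, new_radius = x * rho, radius / x
    mass_kg = new_rho * (4.0 / 3.0) * math.pi * new_radius**3 / 1000.0
    return new_rho, new_radius, mass_kg

for x in (1.0, 1.5, 2.0):
    print(scaled_critical_state(18.7, 8.7, x))   # hypothetical baseline values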
At a later time, a method called the N(B_N)² method was developed by melding diffusion theory with an interaction potential for limiting surface densities. This method treated an array of fissile units from both the array-leakage and the multiunit-coupling perspectives.
All of these methods had the same goal: to take output from a one-dimensional
calculation for a given material and determine whether an array of units containing
the same material would be subcritical.
7.4 Role of the Sn, or Discrete Ordinates, Method
The Sn, or discrete ordinates, method arose from development work performed by Bengt Carlson and his colleagues at Los Alamos in the mid-1950s. As
the name implies, this method involves a direct solution of the integro-differential
transport equation by spatial differencing and angular quadrature approximate representations. This work is more completely described in Chapter 1.
7.4.1 Impetus for the Early Sn Method Development
Clearly, this work had its first applications in nuclear weapons design, but it quickly
became clear that this was a very useful tool for criticality safety calculations. The
method appeared to provide exactly what the criticality safety specialists needed. It
had the capability to analyze systems with the properties of a nuclear reactor as well
as those with properties often applied in safety design to prevent criticality. This
computational tool was able to perform calculations in one dimension for spheres,
infinitely long cylinders, and planes infinite in two dimensions. The combination of the Sn method with the neutron cross sections provided by Hansen and Roach [4] became the standard computational tool for criticality safety calculations.
In the late 1950s, a two-dimensional version of the Sn method was developed.
In practice, the two-dimensional Sn method provides capability for the computation
of certain simple arrays through the combined use of reflecting and true-surface
boundary conditions or for a set of individual cylinders stacked along the same axis.
However, the more general applications would require another method.
7.4.2 Later Sn Method Development
Subsequent to the early Sn method development at Los Alamos, several newer
Sn codes were developed at both Los Alamos and Oak Ridge. These codes have
found extensive use for criticality safety calculations as their capabilities have
been extended to include many additional geometrical and cross-section-handling
capabilities. Of particular importance were the additions of flux anisotropy and
anisotropic scattering, which can strongly influence neutron transport, leakage, and
coupling between systems.
7.5 Role of the Monte Carlo Method
During this same time frame, the Monte Carlo method, which had previously been
used by the weapon designers at Los Alamos, began to be examined as the basis for
calculations by criticality safety specialists. Development of this method was centered in Los Alamos, Oak Ridge, and the UK. The Monte Carlo method involves a
solution of the integral form of the transport equation with various neutron tracking schemes for treating the spatial variable and statistical sampling of the neutron
kinematics. As discussed below, the energy variable is treated in either a continuous or piecewise continuous (multigroup) fashion. For the first time, this method
allowed the evaluation of the criticality safety of any system for which the mathematical equations defining the geometry could be defined. The major drawback to
this method at that time was the difficulty in using the available computer programs –
primarily because of the complexity in input specifications and interpretation of results.
7.5.1 Early Monte Carlo Calculational Methods
The two most notable early codes employing the now widely used Monte Carlo method were the O5R code (from Oak Ridge) and the GEM code (from the UK).
These two codes had unique capabilities, which are worth examining.
Of the two notable codes, the O5R code was the first truly general Monte Carlo
neutron transport code. It was developed for application on the ORACLE computer,
an early Oak Ridge National Laboratory mainframe based on cathode-ray-tube
memory technology. It was designed to accommodate pseudo-point neutron cross
sections and could be used to deal with any geometry for which second-order equations could be written for the surfaces. While this was a noble and energetic effort,
the tremendous generality of the code made it almost impossible to use for complicated geometrical arrangements of material. Moreover, the cross-section data
were of such volume and complexity that they were nearly impossible to validate.
Hence, the generality of O5R in treating the geometry and neutron energy became
its greatest burden.
The GEM code from the UK was designed specifically for nuclear criticality calculations and had the unique capability of performing calculations of regular arrays
of fissile materials as long as the units were made up of nested spheres, cylinders,
or rectangular parallelepipeds. The cross-section capability of the code was almost
as general as that for O5R. However, in practice, the cross-section data that accompanied the code consisted of a more limited and simplified description of the cross
sections.
The unique feature of the GEM code was the tracking method. The developer of the code incorporated an interesting concept. For the purpose of computation, N1 neutrons were initially started at the reflector–core interface. The neutrons were directed initially into the core and followed until they were either absorbed or returned across this interface. The number of neutrons returned was defined as N2. At this point, a value M was calculated as N2/N1. The N2 neutrons were then tracked into the reflector until they leaked, were absorbed, or returned across the reflector–core interface. The number returned into the core was defined as N3. From this value, R was defined as N3/N2. From this calculation a quantity MR (M × R) was determined as being related to keff; when MR is equal to 1.0, it is exactly equal to the value of keff. While this code was easy to use and provided useful information, it was not easy to interpret the results.
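The bookkeeping can be summarized in a few lines (a deterministic caricature reconstructed from the description above, not the GEM algorithm itself; the per-neutron return yields are placeholders):

def gem_mr(n1, core_yield, reflector_yield):
    # core_yield: expected neutrons returned across the interface per neutron
    # sent into the core (may exceed 1 because of fission multiplication);
    # reflector_yield: likewise for the reflector leg (always below 1).
    n2 = n1 * core_yield
    n3 = n2 * reflector_yield
    m, r = n2 / n1, n3 / n2
    return m, r, m * r          # MR == 1.0 corresponds exactly to keff == 1.0

print(gem_mr(100000, 1.30, 0.75))   # MR = 0.975: a slightly subcritical system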
No written documentation exists that details the reasoning that went into the
tracking concept built into GEM. The first author of this chapter had the opportunity to listen to GEM’s author, Ed Woodcock, as he described the method to
Robert Coveyou of Oak Ridge National Laboratory. Woodcock believed that source
convergence would occur more quickly on a surface than in a volume. Hence, he
indicated that this was the basis for the tracking method used in GEM. Coveyou
countered by using a Green’s function argument to prove that convergence on a surface would take exactly the same computational time as that for the volume. At the
end of the discussion, Woodcock agreed with Coveyou. When the UK staff produced
a successor code to GEM, they abandoned the MR method.
7.6 Critical Experiments, Benchmarks, and Validation
With the advent of rigorous computational methods, it became clear that many of
the critical experiments that had been performed in the past were not sufficiently
defined for use in the validation of these new methods. This finding resulted in a
rush to provide these data. During the late 1950s through the early 1970s critical
experiments facilities at Hanford, Los Alamos, Oak Ridge, and Rocky Flats in the
USA and several facilities in Europe performed most of the experiments on which
today’s calculations rely for validation.
The need for validation became such an important part of the criticality safety
calculation field that the need for an American National Standard became clear.
To meet this need, a representative group of individuals working under the
auspices of the American Nuclear Society’s Subcommittee 8 produced a standard titled “Validation of Calculational Methods for Nuclear Criticality Safety,” ANS-8.11 (ANSI N16.9-1975), which has subsequently been incorporated into ANSI/ANS-8.1-1998.
To ensure that the experimental data are properly documented for use as benchmark data, the US Department of Energy initiated an Evaluated Criticality Safety Benchmarks project to collect and document experiments and to determine which of them qualified as benchmark data. Subsequently, this project became an international effort under the auspices of the Organisation for Economic Co-operation and Development’s Nuclear Energy Agency, which has published data for literally hundreds of experiments that qualify as benchmarks.
Unfortunately, many of the world’s critical experiments facilities no longer operate. Obtaining the required new experimental data appears very unlikely at the
moment. This scenario has the potential to limit the range of applications that can be validated, thus introducing increased uncertainty into the evaluation of the results of criticality safety calculations for several new applications.
7.7 Evaluation of the Various Methods and Their Role in Current Criticality Safety Calculations
Each of the methods discussed in this chapter can still play an important role in a
total criticality safety computational study.
7.7.1 Role of the Sn Method
The Sn method has played a very important role in nuclear criticality safety. This
method allowed computation for almost any situation encountered (except for arrays of fissile units), although some simple array systems can be treated. The Sn
method provided the capability to handle systems with voids, strongly anisotropic
scattering, and the presence of strong absorbers. As a matter of solution technique,
this method provides the neutron flux at every mesh point, in addition to the value
of keff . Furthermore, it can provide an adjoint solution that can be of significant
value in understanding the importance of various regions as they contribute to the
keff of a system. Another advantage of the Sn method was the speed with which calculations could be performed, particularly in one-dimensional systems. While these
data were often of direct value to the criticality safety specialist, more importantly,
they provided input to other methods (e.g., solid angle, surface density, and density
analog) in evaluating the criticality of arrays of fissile materials. The latter methods
were very simple calculations that easily provided the ability to perform many array
parameter studies at small computational cost and could often be performed on a
calculator.
7.7.2 Evaluation of the Role of the Sn Method
Today we have a number of computer programs that use the Sn method. Both the
Los Alamos and Oak Ridge National Laboratories have developed one-, two-, and
three-dimensional codes with the capability of computing the keff of fissile systems.
The great advantage of this method is its ability to produce the neutron flux and
fission density at every mesh point as well as to compute the keff of the system.
Furthermore, the keff of a system, as well as the change in keff resulting from small
changes in the material or geometry, can be computed much more precisely than
with the Monte Carlo method.
The Sn method is particularly useful when performing parameter studies, especially if the system can be described in one dimension. Often great insight into the
neutronic behavior of a system can be obtained for little expenditure of personnel
and computer time.
7.7.3 Role of the Monte Carlo Method
As discussed earlier, O5R and GEM were early codes that introduced many of us
to the Monte Carlo technique. Even though these two codes were very different
in technique and in actual application, each played a key role in the development
of the Monte Carlo method. The O5R code was perhaps years ahead of its time
in that it provided many capabilities that have been incorporated into later codes.
Because the O5R code had been developed initially for use in shielding calculations
for which the geometrical descriptions were relatively simple, the main problem was
its generality, particularly the great difficulty in generating the input to describe the
geometry.
The GEM code, on the other hand, provided a very easy-to-use technique for
describing geometries that were likely to be encountered in criticality safety calculations. In fact, the sole purpose in developing the code was to provide the capability
to determine the criticality of single fissile units as well as arrays of fissile units.
GEM’s main problem was the fact that the code did not yield the keff of a system
directly. Furthermore, since the technique required that a core–reflector interface be
defined, many users seemed to have difficulty with the concept when a reflector was
not present. The fact that an actual reflector was not required was a very confusing
concept for those who did not fully understand the technique. On the other hand,
the primary advantage of GEM was its geometry package, as well as its reasonably
good set of neutron cross sections.
The O5R and GEM codes provided the basis for the development of most of the
Monte Carlo criticality computational tools on which we rely today.
7.7.4 Evaluation of the Monte Carlo Method
Today we have a number of codes that use the Monte Carlo method to determine
the criticality of fissile systems. The more widely used codes were developed at Los
Alamos and Oak Ridge National Laboratories and in the UK.
The great strength of the Monte Carlo method is its ability to determine the criticality for almost any geometry using a variety of cross-section treatment methods.
Moreover, the dimensionality of a system as well as the method of cross-section
treatment has little effect on the total computational time. Thus, Monte Carlo is the
method of choice for the most difficult problems.
The main weakness of the Monte Carlo method centers on its statistical limitations. While the statistical uncertainty can be reduced to almost any given value, the cost of this reduction increases as the square of the reduction: that is, reducing the uncertainty by a factor of 10 requires 100 times more computation time.
While the neutron flux can be determined in a Monte Carlo computation, the uncertainty in the flux in any small region of the system is usually too large for most
practical uses.
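The 1/√N behavior behind this cost scaling is easy to demonstrate with a self-contained toy estimator (ours, not a transport calculation):

import random, statistics

def toy_estimate(n_histories):
    # Toy Monte Carlo estimator: the mean of n_histories uniform samples.
    return statistics.fmean(random.random() for _ in range(n_histories))

random.seed(42)
for n in (100, 10_000, 1_000_000):
    runs = [toy_estimate(n) for _ in range(20)]
    # The spread of repeated estimates shrinks as 1/sqrt(n): 100x the
    # histories buys a 10x smaller statistical uncertainty.
    print(n, statistics.stdev(runs))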
The remaining weakness of the Monte Carlo method stems from its great geometrical generality. Because it is so easy to model systems, there is a tendency to
expand the system to include many loosely coupled features, some of which are
not important to integral quantities, such as the multiplication factor. This can, and
often does, result in undersampling, which can cause the keff of the system to be underestimated. This concept has been discussed widely since it was first discovered
by Whitesides [5]. Although numerous attempts have been made to find a general
solution to the problem, the following cautions remain the best advice: be aware of
the problem and use a large number of histories per generation and a large number
of generations.
7.7.5 Summary of the Monte Carlo Criticality Safety Software
Beginning with O5R, a number of general-purpose Monte Carlo transport codes
have been developed. These are listed below, along with the parent organization and
the unique and/or significant features of each code:
UNCLE SAM code: United Nuclear (MAGI); Combinatorial Geometry
MORSE code: Oak Ridge National Laboratory; Multigroup, Biasing Techniques,
Anisotropic Scattering
VIM: Argonne National Laboratory; Benchmarking of ZPPR Assemblies, Improved Unresolved Resonance Shielding, Source Convergence Studies
COG: Lawrence Livermore National Laboratory; Advanced Geometry Modeling, CAD Setup and Display
Additionally, both Bettis Atomic Power Laboratory and Knolls Atomic Power
Laboratory have sophisticated Monte Carlo codes that are utilized in reactor design
and fuel exposure analysis.
Some Monte Carlo codes have been developed that are specific to criticality
safety analysis. The KENO program at Oak Ridge National Laboratory and the
MONK computer program at Winfrith, UK, have undergone extensive modification
and enhancement over 3 decades. Geometry capabilities have been enhanced to include more complex region boundaries, automated array specifications, and nested
arrays. The kinematics has been enhanced to be fully compatible with modern nuclear data specifications. The MCNP program at Los Alamos National Laboratory
has been enhanced for criticality applications by the inclusion of neutron thermal
scatter, an automated array feature, and improved unresolved resonance shielding.
7.7.6 The N(B_N)² Method
The N(B_N)² approximate method was briefly mentioned earlier. The method combines the density analog and surface density methods and diffusion theory geometric buckling concepts into a single picture of array criticality, resulting in a limiting surface density, Φ(m), in g/cm². The method relates the geometric buckling of the external boundary of an array to the number of orthogonally arranged air-spaced units (i.e., N = n_x n_y n_z) and a relating parameter, Φ(m), for various fissionable materials and reflector conditions. The relating parameter is determined from the mass of a single bare critical unit of the fissionable material and the mass of the units in a reflected array of N units. Numerous transformations of the basic relations among the values of N(B_N)² and Φ(m) have been developed that permit high degrees of reliability in predicting and equating interpolated and extrapolated values for critical and subcritical arrays that are reflected with various material thicknesses and that may have units of similar or dissimilar fissionable materials and unit spacings. The basic parameters of the N(B_N)² method are determined either from experiment or by
Monte Carlo calculation for particular fissionable material compositions and reflector conditions. From these data, it is possible to determine the array criticality or
subcriticality for a particular fissionable material composition for an array of nearly
any size. This method has been useful in evaluating a large number of array combinations with substantially reduced computing costs.
7.8 The Role of Cross-Section Representation
The two options for cross-section representation are the multigroup and the pointwise models. It is important to understand these options and to be able to evaluate
their use.
7.8.1 Multigroup Cross Sections
Because of the discrete representation of the neutron energy dependence through
the group-wise transfer array in the Sn codes, the multigroup cross-section model
gained great acceptance. As a result, significant effort was expended in providing
multigroup cross sections that were benchmarked against many of the available experimental data.
As discussed earlier, the O5R code had a very general cross-section representation, which could represent the cross sections to almost any detail desired. This very
general representation was one of the factors that limited the method from being
effectively used for criticality safety calculations. While the data were very detailed,
they had not been adequately evaluated and benchmarked for criticality calculations. On the other hand, the multigroup cross sections, particularly those provided
by Hansen and Roach, had been effectively evaluated and benchmarked for criticality for use in the Sn method. This provided the incentive for the development of a
Monte Carlo computer program that used these cross sections.
The effective use of multigroup cross sections in the Sn method led to their introduction into Monte Carlo methods being developed in the early 1960s. Through
the development of a computer program with the geometrical capabilities of a code
such as GEM, along with the use of the multigroup cross sections that had been used
in the Sn method, a very useful computational tool resulted.
The effectiveness of these new Monte Carlo methods using multigroup cross sections was the major driver in pushing the Monte Carlo method to the forefront of
criticality safety calculations. Furthermore, the large number of evaluated multigroup cross sections, which have been evaluated against a very large number of
critical experiments, continues to provide an incentive for their use.
A further incentive for the use of multigroup cross sections is the ability to
directly compare the results of Sn method calculations with those produced by multigroup Monte Carlo calculations. Finally, there is an additional incentive in that the
multigroup approximation greatly facilitates the calculation of the adjoint flux for
use in perturbation analyses and sensitivity/uncertainty (S/U) calculations.
7.8.2 Point-Wise Cross Sections
As time evolved, it was inevitable that the use of point-wise cross sections in the
Monte Carlo method would be implemented. After all, the capability exists and, in
actuality, the cross sections do indeed change at every possible energy point. Several
codes that effectively use point-wise cross sections have now been developed and
are in widespread use. As the cross sections are evaluated and validated against
critical experiments data, the dependence on these codes will grow.
The major concern that still remains in the use of point-wise cross sections is the
possibility of undersampling. As with the geometrical concerns with undersampling
raised earlier, the same problem exists with the use of point-wise cross sections
in the energy domain. For the energy-sampling of system-integrated quantities,
such as keff , the undersampling tends to occur in less important energy-reaction
type domains, which mitigates the potentially adverse effect on results. The advice
is the same: to be aware of the problem and to follow many neutron histories.
Unfortunately, undersampling of the cross sections might not be so obvious to the
casual user.
The use of point-wise cross sections will always be required when performing
calculations in which iron, nickel, or chromium has significant effects on neutron
transport through core and/or reflector regions. These materials have scattering resonances with deep interference minima and overlapping structure for which there is no adequate treatment via multigroup representations of the cross sections.
7.9 Elements of a Complete Nuclear Criticality Safety Computational Tool Set
In order to provide the most effective evaluation of problems involving criticality safety, it is necessary to implement a variety of computational tools and cross
sections.
7.9.1 Cross-Section Selection
The choice of cross-section model will depend on several factors: the materials involved, the computational tools to be used, and the benchmark data available.
While some versions of the Sn method that use point-wise cross sections have
been developed, these versions are not in widespread use. Hence, if the Sn method
is used, then the only available option is the multigroup cross-section representation.
If one plans to use both the Sn and the Monte Carlo methods, it may be desirable to use the same cross-section set, which, for practical purposes, requires the
multigroup model. This is particularly important if direct comparison of the results
is required.
If the Monte Carlo method is the only method to be used in an analysis, then
point-wise cross sections can be an attractive alternative. In practice, point-wise
cross sections eliminate most of the problems that can result from inadequate cross-section representation in the multigroup formulation. As
mentioned earlier, if there are materials with scattering resonances present, particularly when the neutron transmission through thick regions of these materials is
important, it is imperative that point-wise cross sections be used.
7.9.2 Using All Available Tools to Ensure Economical and Accurate Computations
Although the advances in computer speed and memory have led to attractive and
powerful computational methods, incentives still exist to look at all alternatives
in evaluating a criticality safety issue. In the analysis of a specific situation, the
use of several of the computational methods, along with simple hand methods, can
add significantly to the validity of a computational evaluation of a criticality safety
problem.
7.10 The Future
The future of nuclear criticality safety calculations lies in the continued development of new geometrical packages that minimize data input effort, the acquisition
of better cross-section data, the acquisition of additional critical experiments data,
and the development of sensitivity methods for evaluating the uncertainties of the
computed results. The last issue, which has been addressed in Chapter 6, is the
subject of intensive effort in the criticality safety computational methods community
that is involved in the development of S/U methods. Development of S/U methods is
needed to quantify the applicability of experiment benchmark computational validations to criticality safety computational evaluations and to determine appropriately
safe margins of subcriticality accounting for the availability and applicability of
critical experiment benchmarks, neutron cross-section data, and their associated uncertainties. The need for S/U methods development is increasing due to the changing
emphases of the nuclear industry (e.g., decontamination, waste processing/disposal,
spent nuclear reactor fuel burnup credit, new mixed oxide fuel compositions) that
are confronting nuclear criticality safety specialists, particularly in a scientific environment of declining activity of nuclear criticality safety experimentation and
cross-section measurements/evaluation.
7.11 Summary
To the best of our knowledge, no criticality accident has ever resulted from an
erroneous computation. However, this record should not be a cause for complacency, and no safety analysis should ever rest solely on the use of one computational method.
While we have become accustomed to using the Sn, and now more particularly the Monte Carlo, methods for much of our safety analysis, we need to be aware of the importance of the older computational tools. While methods such as the solid angle, surface density, and density analog may not produce the accurate results to which we are accustomed from our more rigorous methods, they do provide valuable insights into the characteristics of a problem and help provide bounding conditions.
The use of the Sn method to evaluate bounding conditions can be extremely valuable when one uses the Monte Carlo method as the primary tool in an analysis. For
large single units satisfying the constraints of the neutron diffusion theory, diffusion
theory codes can be very useful in accelerating the space-energy solution for the
neutron flux. On the other hand, the use of a one- or two-dimensional Monte Carlo
calculation to validate whether the proper angular and spatial mesh has been used
in an Sn calculation can also be very useful.
As we have demonstrated, the mathematics of criticality safety calculations is
precisely the same as that required for rigorous reactor design calculations. From
the neutronics perspective, the important differences are in the materials present and
the geometrical configuration. It is precisely these differences that cause a divergence in the computational tools required. The reactor designer has a fairly rigid
physical arrangement of fissile material and moderator and is interested in very precise values of keff . His concerns are with relatively small variations of the materials
present and how the value of keff varies as a function of the material changes.
With the much broader variation of the materials present – along with introduction of voids and strongly absorbing materials, compounded by the consideration
of system upset conditions – criticality safety calculations present a very different
challenge. It is this challenge that has provided the basis for the development of
computational tools specifically for use in criticality safety calculations.
References
1. Neuilly M, Bussac J, Frejacques C, Nief G, Vendryes G, Yvon J (1972) C R Acad Sci Paris
275D:1847–1849
2. Proceedings of the Livermore Array Symposium, CONF-680909, 23–25 Sept 1968
3. Hunt DC (1976) A review of criticality safety models used in evaluating arrays of fissile materials. Nucl Technol 30(2):138–165
4. Hansen GE, Roach WH (1961) Six and sixteen group cross sections for fast and intermediate critical assemblies. Rep. LAMS-2543, Los Alamos Scientific Laboratory, December 1961
5. Whitesides GE (1971) A difficulty in computing the keff of the world. Trans Am Nucl Soc 14:680
G. Elliott Whitesides graduated from North Carolina State University with a B.S. in Nuclear
Engineering in 1960. Upon graduation he accepted a position with the nuclear facilities at
Oak Ridge operated by Union Carbide for the
(then) Atomic Energy Commission. One of his
first assignments was to implement the Sn computer programs that had been developed at Los
Alamos for the solution of criticality safety problems that existed at the Oak Ridge facilities. In
1963, he was named to head the Nuclear Computer Programs Group that was charged with the
task of developing computer programs for nuclear
criticality safety, shielding, and neutronic cross-section processing. In this capacity he initiated the development of the DOT and ANISN Sn programs with an emphasis on solving neutron and gamma-ray shielding problems. He then turned his personal attention to developing the first neutronically rigorous Monte Carlo program designed especially for
the solution of nuclear criticality problems. This program, KENO, was developed
in conjunction with the Critical Experiments Facility activities at Oak Ridge. This
development was especially important because it provided the ability to compute
systems with arrays of fissile materials, a problem which had been particularly vexing to the safety of many operations involving fissile materials. The organization he
headed also produced the SCALE code system that integrated the shielding, criticality safety, and cross-section computer programs that had been developed at Oak
Ridge. This system has become a major computational tool for the nuclear industry.
He has had an active role in the American Nuclear Society, having served on the Executive Board of the Mathematics and Computation Division and as a member of the Board of Directors, vice chairman, and then chairman of the Nuclear Criticality Safety Division. In 1980, he was elected as Fellow of the American Nuclear Society. He
has been active in the Standards activities of the American Nuclear Society. He
has served on the ANS – 8 Standards Subcommittee since 1975, and as chairman
of the working groups that produced two standards entitled, “Validation of Calculational Methods for Nuclear Criticality Safety” and “Criticality Safety Criteria for
the Handling, Storage, and Transportation of LWR Fuel Outside Reactors.” In 1981,
he was promoted as site manager for Computing at Oak Ridge National Laboratory
and subsequently director of the Computing Applications Division. He has maintained a strong international interest by serving as the chairman of the Criticality
Calculations Working Group at the Organization for Economic Cooperation and
Development’s Nuclear Energy Agency (OECD-NEA) in Paris from 1980 to 1996.
He also was a principal factor in initiating and promoting the series of international
criticality conferences now known as ICNC.
Robert Michael (Mike) Westfall completed
his Ph.D. in nuclear engineering at the University of Virginia in 1974. His dissertation
research was the development of a highly
precise transport solution for the Radially Reflected Critical Cylinder. From 1963 to 1973
he worked on methods development for processing nuclear data for reactor design at the
NASA-Lewis Research Center. In 1973, he
joined the Oak Ridge National Laboratory.
An early ORNL project was the development of the ROLAIDS software, which treats
resonance overlap in multiregion geometries.
From 1976 to 1980, he initiated the development of the SCALE system. In 1980–1981,
Westfall served as a guest scientist with the Service for Reactor Studies and Applied
Mathematics at CEN-Saclay, where he worked on resonance processing methods
for the APOLLO code system. From 1981 to 2001, he led the Nuclear Engineering
Applications Section at ORNL. In the 1980s, he led the technical support efforts
for the nuclear criticality safety (NCS) qualification of TMI-2 recovery and the
initial DOE/NCS burnup credit studies. In the 1990s, he served on technical support groups for DOE responses to Defense Nuclear Facility Safety Board (DNFSB)
Recommendations 93–2 and 97–2. Westfall currently serves as a member of the
DOE Criticality Safety Support Group formed as part of the DOE response to
DNFSB Rec. 97–2, as well as the DOE Nuclear Data Advisory Group. Regarding international consensus standards, he is a vice chair of the Nuclear Technical
Advisory Group, which administers US participation in ISO Technical Committee
85, “Nuclear Technology.” For domestic standards, Westfall serves as a member of
ANS/N-16, the consensus committee for NCS standards, as well as being a member
of the ANS Standards Board. Also, since 1977, he has served as a technical expert
on the working group for ANSI/ANS-8.15, “Nuclear Criticality Control of Special
Actinide Elements.” In 1994, Westfall was named a Fellow of the American Nuclear
Society with the citation: “For innovation in the development and application of advanced problem-solving methodologies in the nuclear field with major contributions
to nuclear safety.” He served as technical program chair of the 1985 Mathematics
and Computation (M&C) Division topical meeting “Advances in Nuclear Engineering Computational Methods” in Knoxville. He chaired the M&C Division in
1987–1988. He also served as general chair of the 1993 Nuclear Criticality Safety
Division topical meeting “Physics and Methods in Criticality Safety” in Nashville.
In his current position as the ORNL Manager of the DOE Nuclear Criticality Safety
Program, with major responsibilities for the Analytical Methods and Nuclear Data
work elements, he has had technical leadership roles on a DOE-wide basis.
Calvin Mitchell Hopper is a distinguished senior development engineer at Oak Ridge National Laboratory (ORNL). He graduated from Southern Colorado State College with a B.S. in engineering physics. Between 1968
and 1970, he provided radiation protection
services to the Oak Ridge Critical Experiments Facility and the Oak Ridge Y-12
Development Division. Between 1970 and
1978, he provided nuclear criticality safety
(NCS) engineering evaluations and analysis for the Oak Ridge Y-12 Plant and K-25
Gaseous Diffusion Plant. Between 1978 and
1980, he was the manager of Nuclear Material Licensing, Nuclear Material Accountability, and Nuclear Safety (health physics
and nuclear criticality safety) at the Texas
Instruments, Inc. – HFIR Project, resulting in TI receiving its first comprehensive special nuclear material license from the US Nuclear Regulatory Commission
(NRC). In 1980, he returned to the Oak Ridge Y-12 Plant as the Technical Manager of the Health Physics Department responsible for developing a corporate wide
external dosimetry service and managing the internal and external dosimetry programs at Y-12. In 1982, he became the department head for the Y-12 Nuclear
Criticality Safety Department. In 1984, he transferred to ORNL to provide Nuclear
Criticality Safety Officer services and to develop an ORNL NCS organization, holding an interim position as ORNL Nuclear Criticality Safety Section Head. In 1995, he engaged the US NRC to support the seminal application of first-order linear perturbation theory sensitivity and uncertainty analysis at ORNL for NCS
engineering applications and critical experiment design and evaluations. Between
1997 and 2007, he was the principal investigator of the US Department of Energy Nuclear Criticality Safety Program for Applicable Ranges of Bounding Curves and Data, which led to major neutronics code enhancements at ORNL. He has been a
member of the American Nuclear Society (ANS) since 1970 serving in all offices
of the Nuclear Criticality Safety Division, is the Chair of the ANSI/ANS Consensus Committee N16 on Criticality Safety, was a chair and is a member of various
standards Working Groups, and is a member of the ANS Standards Board. Internationally, he has been the convener (chair) of Working Group 8 on Criticality
Safety within ISO TC85/SC5 since 1998. He has participated in OECD-NEA Expert Groups on MOX critical experiment needs and reference critical values and
is a contributor and peer-reviewer for the OECD-NEA International Handbook of
Evaluated Criticality Safety Benchmark Experiments. His latest publications have
focused on the use and benefits of sensitivity and uncertainty analyses as applied to
NCS problems and critical experiment design.
Chapter 8
Nuclear Reactor Kinetics: 1934–1999 and Beyond
Jack Dorning
8.1 Introduction
When one of the editors of this book, Professor Yousry Azmy, asked me to give a
lecture on the development of reactor kinetics during the twentieth century and the
future direction of research in this area in the twenty-first century, my reaction was,
“Wow! Review the developments of a century of reactor kinetics. That’s a lot to
cover.” But then, upon reflection on the fact that the discipline of reactor kinetics,
and reactor physics in general, did not even exist until the 1930s, I realized that I did
not have to review a whole century of development, but rather, a mere two thirds of
the century! Still a somewhat daunting task!
Very soon after beginning that task, I became somewhat intrigued, one might
even say obsessed, with the question of the precise origins of the equations of reactor
kinetics – specifically the precise origins, i.e., the first appearance in the literature,
of the time-dependent neutron diffusion equation and of the point reactor kinetics
equations. Thus, this lecture and chapter begin with the historical origins of the
equations of reactor kinetics – more precisely, the results of my best efforts to uncover them.
The lecture and chapter are composed of seven sections. Following this introduction, the development of nuclear reactor kinetics during the twentieth century,
starting with its historical origins in the 1930s and 1940s, is reviewed in a prologue which comprises Section 8.2. The introduction of the time-dependent neutron
diffusion equation and the point reactor kinetics equations for reactor analysis is
chronicled, and the roles played by Enrico Fermi and Eugene Wigner in these
events are discussed. Subsequent derivations of more general point reactor kinetics
equations during the 1950s, 1960s, and 1970s are summarized in Section 8.3. Then,
during a digression that evolved into Section 8.4, the theory of pulsed neutron experiments, which was developed primarily in the 1960s and 1970s and which played
an important role in the development of reactor physics, is described with emphasis
J. Dorning
Le Carlina Lodge, Biarritz, France
e-mail: dorning@virginia.edu
Permanent address: University of Virginia, Charlottesville, VA USA
on the resolution of apparent contradictions between theory and experiment and
some very interesting connections between several mathematical subtleties and initially perplexing experimental observations. Next, in Section 8.5, the evolution of
elementary and advanced numerical methods for the solution of space-time reactor kinetics problems, from the advent of large-scale mainframe computers to the
present, is discussed. Then, in Section 8.6, a short summary of some special topics
in reactor dynamics is given. Finally, the chapter ends with a closing section, or
epilogue, which presents some thoughts about the possible directions of research
and development in the future – a short prologue on reactor kinetics and reactor
dynamics in the twenty-first century.
8.2 Prologue: The Historical Origins of the Equations of Reactor Kinetics
8.2.1 The Time-Dependent Neutron Diffusion Equation
To find the first use in the literature of the neutron diffusion equation to describe the
motion of neutrons in a nonmultiplying or subcritical system, I originally thought
that it would only be necessary to look quickly at a few standard textbooks on reactor physics, at least some of which, I had assumed, would cite the original reference.
However, after checking a few of the elementary reactor physics and reactor theory
books currently in use [1,2] and even some of the vintage books on the subject [3–6]
that I used as a student, I was very surprised and disappointed to find no reference
to the original introduction of the, subsequently so widely used, neutron diffusion
equation. Further efforts expended trying to locate a citation of the seminal publication in many, many textbooks on reactor physics [7, 8], neutron transport theory
[9–11], and reactor kinetics and reactor dynamics [12–15] – many of which are not
cited here – led to the same disappointing result. Even after consulting with some of
the most “mature” members of our nuclear engineering and neutron physics communities, who provided a wellspring of information on the history of these disciplines,
no path to the elusive reference was opened.
After a fair effort searching through the post-World War II literature of declassified articles led to no success in my quest, I turned to two obvious references.
The first was Enrico Fermi – Collected Papers [16, 17], with which I was already
very familiar from my Ph.D. dissertation days because of the many important papers
on neutron thermalization published by Fermi and members of his youthful group
at the Istituto Fisico dell’Università di Roma in the 1930s. The other was The Collected Works of Eugene Paul Wigner [18]. All I needed, plus lots more of course,
was in these seven volumes – the two-volume Fermi collection plus the five-volume
Wigner set.
The first published article that I was able to find in which the neutron diffusion equation appeared was by Edoardo Amaldi and Enrico Fermi. (Actually, I was “rooting” for Fermi from the outset, since he made so many important contributions to
neutron physics, reactor physics, and reactor theory via his pioneering experiments,
his keen physical insight and his profound theoretical ideas – and to so, so many
other areas of physics.) The article, a letter (in Italian), “Sull’Assorbimento dei Neutroni Lenti – III” (“On the Absorption of Slow Neutrons – III”), appeared in Ricerca
Scientifica in 1936 [19]. It is also reprinted in Italian in Enrico Fermi – Collected
Papers Volume I, Italy 1921–1938 as article number 114 on pages 823–826. It is
noteworthy that this letter opens with a reference to two letters published in 1935
(also in Ricerca Scientifica) in which they discussed the existence of groups of slow
neutrons with different absorption and diffusion properties. The first section of this
letter [19], entitled “I. Diffusione dei singoli gruppi di neutroni lenti” (“I. Diffusion
of the individual groups of slow neutrons"), begins with "Già nella lettera precedente abbiamo accennato alle differenze tra le proprietà di diffusione dei neutroni dei gruppi A e C" ("Already in the preceding letter we mentioned the differences between the diffusion properties of the neutrons of the groups A and C"). It continues after a few lines with "Nell'ipotesi che i neutroni del gruppo C nella paraffina obbediscano alle solite leggi della diffusione, ed abbiano in più la possibilità di essere distrutti con vita media τ, la loro densità n soddisfa all'equazione differenziale" ("Under the hypothesis that the group C neutrons in paraffin obey the usual laws of diffusion, and have in addition the possibility of being destroyed with mean life τ, their density n satisfies the differential equation")

$$\frac{dn}{dt} = D\,\Delta n - \frac{n}{\tau},$$

"dove D è il coefficiente di diffusione dato da λv̄/3, essendo λ il cammino libero medio e v̄ la velocità media" ("where D is the diffusion coefficient given by λv̄/3, λ being the mean free path and v̄ the average velocity").
Finally, there is an earlier letter published in Ricerca Scientifica in 1934 by
E. Fermi, E. Amaldi, B. Pontecorvo, F. Rasetti, and E. Segrè [20] in which there
is a detailed explanation of the laboratory observations they made, given in terms of
a diffusion model of neutron migration in water or paraffin. An English translation
by Emilio Segrè of that letter appears on page 761 ff as paper number 105b in Enrico Fermi – Collected Papers Volume I, Italy 1921–1938 [16], and the last major
paragraph states the following:
A possible explanation of these facts seems to be the following: neutrons rapidly lose their
energy by repeated collisions with hydrogen nuclei. It is plausible that the neutron-proton
collision cross section increases for decreasing energy and one may expect that after some
collisions the neutrons move in a manner similar to that of the molecules of a diffusing
gas, eventually reaching the energy corresponding to thermal agitation. One would form in
this way something similar to a solution of neutrons in water or paraffin, surrounding the
neutron source. The concentration of this solution at each point depends on the intensity of
the source, on the geometrical conditions of the diffusion process and on possible neutron-capture processes due to hydrogen or other nuclei present.
A longer article was published in Italian by Fermi in Ricerca Scientifica in
1936 [21]; in it he discussed, in some detail, neutron slowing-down and neutron
diffusion in hydrogenous materials. On the penultimate page of the 40-page article,
which is reprinted in his collected works [16] as article number 119a on pages 943–983, he points out that "given the considerable number of collisions a thermal neutron undergoes before being captured, it is clear that its motion can be accurately described as diffusive motion, and that one can easily write a differential equation for the neutron density n

$$\frac{\partial n}{\partial t} = \frac{1}{3}\lambda v\,\Delta n - \frac{n}{\tau} + q,$$

where q is the number of thermal neutrons produced per unit volume and time from the slowing down of fast neutrons and the first term on the right-hand side represents the effect of diffusion." He goes on to emphasize the importance of the stationary case and explicitly includes the reduction of the above equation to the steady-state diffusion equation for thermal neutrons.
Clearly, the young leader – referred to as the “Pope” by his colleagues because of
his theoretical prowess – of the youthful group at the Istituto Fisico dell’Università
di Roma, and the members of that group – known as the “Via Panisperna Group” –
understood neutron slowing-down and neutron diffusion as a theoretical description of neutron motion and used the neutron diffusion equation to explain the
results of their pioneering experiments at least as early as 1934 [20] through 1936
[19, 21]. So, it seems, based on the results of my limited search efforts that the
time-dependent (and steady-state) neutron diffusion equation originated with Enrico Fermi and his colleagues. Of course, it was a small modification to add the
prompt and delayed neutron fission production terms later to arrive at the neutron
diffusion equation so widely used even today in reactor physics in general, and in
reactor kinetics and reactor dynamics in particular. A photo of Enrico Fermi, taken
during those “early days” in Rome and reprinted from his collected works, appears
on the following page.
8.2.2 The Point Reactor Kinetics Model
The point reactor kinetics equations, of course, can be derived easily from the coupled time-dependent diffusion equation and delayed neutron precursor equations, as
will be done in Section 8.3. However, as in the case of the time-dependent neutron
diffusion equation, I thought it would be appropriate to include in this lecture and
the book chapter a citation of the article in which the equations, in this case the
“point reactor kinetics model” or point reactor kinetics equations, first appeared.
Again, as in the case of the time-dependent neutron diffusion equation, a simple perusal of the standard textbooks on reactor physics [1–8], reactor kinetics and reactor
dynamics [12–15], and neutron transport theory [9–11] failed to reveal a reference
that was cited as the article or report in which the point kinetics equations first
appeared. Thus, I again turned to the collected works of two of the greatest physicists of the twentieth century and pioneers of neutron physics and
reactor physics, Eugene Paul Wigner and Enrico Fermi. In The Collected Works
of Eugene Paul Wigner, Part A, The Scientific Papers, Vol. V [18], annotations are
provided by Alvin M. Weinberg on the part entitled “Macroscopic Reactor Theory.”
Among those annotations, Weinberg wrote the following:
In paper 34 Wigner derives the usual kinetic equations, in which the delayed neutrons dominate the variations of power following a small reactivity change. Similar results had been
obtained by Fermi, by Wheeler and Ibser, and a little later, by J. Schwinger (who spent one
month in Chicago in 1943 working with Wigner’s group) [18].
Paper number 34 beginning on page 520 is “On Variations of the Power Output
in a Running Pile,” by E. P. Wigner, November 11, 1942. Wigner begins that paper
with the statement, “The importance of the delayed neutrons for the steady operation
of the pile has been early recognized by Allison, Fermi, and Szilard (Conferences on
the Power Plant, February 1942)."

Enrico Fermi during the "early days" in the 1930s in Rome. (Photographer unknown.)

He goes on, in the second paragraph, to introduce the point kinetics equations with the statement, "If we denote the number of neutrons by n, by s_j the number of radioactive nuclei which emit a delayed neutron of the kind j, we have the equations

$$\dot s_j = -A_j s_j + f_j\,(n + S),$$

$$\dot n = \sum_j A_j s_j + n\Bigl(k_e - \sum_j f_j\Bigr) + S\Bigl(1 - \sum_j f_j\Bigr).\text{"}$$

Although the notation differs slightly from that which is currently in use, these
equations are easily recognized as the point kinetics equations, so familiar to even
beginning students of reactor kinetics. In this paper, which was originally Chicago
Project Report CP-351 [22], Wigner uses the solutions to these equations to discuss
the initially rapid and subsequently more gradual increase in neutron density during
the transient that follows the introduction of a small increase in reactivity into an
initially critical system. A 1958 photo of Eugene Wigner, copied from his collected
works, appears on the following page. On his lap in this photo, very appropriately,
is a copy of the classic textbook, The Physical Theory of Neutron Chain Reactors,
by A. M. Weinberg and E. P. Wigner [8].
The references to the earlier work of Fermi (and others) in both Wigner’s paper and Weinberg’s annotation, although not specific, led me to a slightly earlier
Chicago Project Report CP-291 (Notes on Lecture of October 7, 1942) by Enrico
Fermi entitled “Problem of Time Dependence of the Reaction Rate: Effect of Delayed Neutrons Emission” [23]. In the first four pages of those lecture notes, Fermi
first estimates the “time to change the number of neutrons by a factor e” following an
increase of k_eff from 1.000 to 1.001 as 1 s – based on the assumption that all neutrons are
prompt neutrons – in Part A. Simple Theory Neglecting Delayed Emission of Neutrons. Then, in Part B. Simple Theory Including Delayed Emission of Neutrons, he introduces τ, the "normal time of one generation;" T, the "time of one generation of delayed neutrons;" n, "the number of neutrons present in the reacting mass;" c, "the number of existing radioactive atoms which will decay to give delayed neutrons (c stands for 'credits');" dn/dt, the "rate of change of the number of neutrons;" and p = 1%, the "fraction of neutrons which produce delayed neutrons." He then adds "c atoms decay with lifetime T to give c/T new neutrons per second."

"All neutrons are absorbed after an average lifetime τ at the rate n/τ neutrons per second."

"Each absorbed neutron forms k new neutrons. k includes emission of both credit neutrons and instantaneous neutrons. These statements may be expressed by the equations (1) and (2).
Eugene Paul Wigner with a copy of his classic textbook on reactor theory on his lap (with kind
permission of Springer Science + Business Media)
$$\frac{dn}{dt} = \frac{c}{T} + k\,(1-p)\,\frac{n}{\tau} - \frac{n}{\tau}, \tag{1}$$

where k(1 − p) neutrons produce instantaneous neutrons, and

$$\frac{dc}{dt} = kp\,\frac{n}{\tau} - \frac{c}{T}, \tag{2}$$

where kp n/τ = new credits formed per second (and) c/T = credits lost per second
from radioactive decay.” These equations, of course, are the point kinetics equations
written in almost the same notation we use today – although, interestingly, the “c”
denoted “credits” in Fermi’s description whereas it subsequently was taken to indicate the “concentration” of the precursors of the various delayed neutron groups,
as it is to this day. (Although Fermi lumped the delayed neutrons into one effective
delayed neutron group in these equations, earlier in these notes he mentions that the
“appreciable time lag,” after which the delayed neutrons are emitted, is “described
by a complicated law which depends on about 3 lifetimes.”) After developing the
exact solutions to the above point kinetics equations for a step change in k_eff, Fermi
introduces the simple jump approximation to generate even simpler solutions and
uses those to arrive at the conclusion that "The relaxation time for k = 1.001 from
this theory is 1990 s, while if the delayed neutrons are neglected, simple theory gives
the quite different value of 1 s.” Thus, recognizing the significance of the delayed
neutrons, he used the point kinetics equations (in the notes of October 7, 1942) to
show that a graphite/natural-uranium pile could be controlled – less than 2 months
prior to the historic event on December 2, 1942, during which he and his colleagues
achieved mankind's first controlled neutron chain reaction in the graphite/natural-uranium pile in a squash court under the west stands of Stagg Field at the University
of Chicago. (The pile had been moved from Columbia University in New York City
earlier that year.)
It is clear that in anticipation of that historic event, interest in the neutron kinetics
of a critical, or near critical, pile was very high in the experimental and theoretical physics groups at Chicago led, respectively, by Fermi and Wigner. No doubt,
the related point kinetics equations were floating around in the air there, and were
well-known to the few people in those groups. In fact, although the first appearances of these equations that I was able to find were in the reports by Fermi and
by Wigner discussed above, statements by Wigner in the opening paragraph of his
paper suggest that they appeared in earlier reports. He begins his report with “The
importance of the delayed neutrons for the steady operation of the pile has been
early recognized by Allison, Fermi, and Szilard (Conference on the Power Plant,
February 1942),” and his third sentence states, “The solution of the equations which
give the power output of the pile as a function of time has been given by Ibser,
Manley and Wheeler (C-65) in their discussion of the effect of a sudden neutron
burst.” Unfortunately, I was not able to obtain any information on Conferences on
the Power Plant, February 1942 or on C-65. However, it is clear that these two
giants of twentieth century physics were among the first to introduce and use the
point kinetics equations.
In fact, Fermi and Wigner's seminal contributions to the antecedents of our field – neutron physics and reactor physics – although extensive and profound, were
but tiny fractions – in both scope and importance – of their numerous brilliant
contributions in so many diverse areas of physics. To generate partial evidence of
this fact instantly, one need only make a mental list of the enormous number of
terms developed in twentieth century physics that are based on their names – especially Fermi’s name. In our field: the Fermi four factor formula, Fermi age theory,
the Fermi pseudo-potential – in slow-neutron scattering [24], the Fermi synthetic
slowing-down kernel [25, 26], etc. (There also is a Wigner synthetic slowing-down kernel [25, 26].) And in other areas of physics: the Fermi solid, the Fermi sea, the
Fermi surface, the Fermi distribution, Fermi–Dirac statistics, Fermions, and the list
goes on and on! Enrico Fermi probably has more phenomena and concepts named
after him than any other physicist. And well he should!
A picture of the still fairly young Fermi appears on the opposite page. It is from
his collected works and was taken at Los Alamos in 1946, a short 4 years after he led
the Chicago group in achieving the first controlled nuclear chain reaction and only
10 years after “the early days” in Rome during which he and the young members of
his group performed the pioneering neutron physics experiments (and developed the
related seminal theories) that led to his Nobel Prize in 1938 – “For the transmutation
of the elements by means of slow moving neutrons,” according to his obituary which
appeared in the New York Herald Tribune on November 28, 1954.
8.3 The Point Reactor Kinetics Equations
8.3.1 The Basics: From the One-Group Diffusion Equation
with Delayed Neutrons for a Bare Homogeneous Reactor
to the Point Reactor Kinetics Equations
Although Fermi (and others), no doubt, originally developed the point kinetics equations from simple physical arguments based on neutron balances for the overall
assembly, the standard “classroom” derivation of these equations at an introductory
level starts from the one-speed neutron diffusion equation for a bare homogeneous reactor, with delayed neutrons, and the coupled delayed neutron precursor
concentration equations. And that will be the course followed here.
Enrico Fermi at Los Alamos, NM after World War II in 1946. (Photographer unknown.)

The time-dependent one-group neutron diffusion equation with delayed neutrons for neutron migration in a bare homogeneous reactor for the space- and time-dependent neutron number density N(r, t) is

$$\frac{\partial}{\partial t}N(\mathbf{r},t) = vD\,\nabla^2 N - v\Sigma_a N + (1-\beta)\,k_\infty v\Sigma_a N + \sum_{i=1}^{I}\lambda_i C_i + S(\mathbf{r},t), \tag{8.1}$$

with boundary conditions N(r, t) = 0 at the extrapolated boundary r̃, and the coupled equations for the space- and time-dependent delayed-neutron precursor concentrations (Fermi's "credits" [23]) C_i(r, t), i = 1, …, I, are

$$\frac{\partial}{\partial t}C_i(\mathbf{r},t) = \beta_i k_\infty v\Sigma_a N - \lambda_i C_i, \quad i = 1,\ldots,I, \tag{8.2}$$
where I is the effective number of delayed-neutron groups, and the other notation is standard [1–15]. In many developments, the fixed source term S(r, t) is taken as zero, and the neutron number density N(r, t) and the precursor concentrations are factored at this point into a space-dependent factor – the fundamental "buckling" mode – and a time-dependent factor n(t), often interpreted as the total number of neutrons in the reactor, and c_i(t), i = 1, …, I, interpreted as the total number of the i-th group delayed-neutron precursors in the reactor. However, it is slightly more satisfactory simply to separate variables as is so frequently done in solving partial differential equations (PDEs) like the neutron diffusion equation. It follows from the
separation

$$\begin{pmatrix} N(\mathbf{r},t) \\ C_1(\mathbf{r},t) \\ \vdots \\ C_I(\mathbf{r},t) \end{pmatrix} = \begin{pmatrix} f(\mathbf{r})\,n(t) \\ g_1(\mathbf{r})\,c_1(t) \\ \vdots \\ g_I(\mathbf{r})\,c_I(t) \end{pmatrix}, \tag{8.3}$$
via the precursor equations, Eq. 8.2, that the g_i(r), i = 1, …, I, are proportional to f(r). Hence, they can be written as g_i(r) = γ_i f(r), i = 1, …, I. Further, since the proportionality constants γ_i can be incorporated into the definitions of the c_i(t), the ratios g_i(r)/f(r) become unity, and the precursor equations reduce to
$$\dot c_i(t) = \beta_i k_\infty v\Sigma_a\, n(t) - \lambda_i c_i(t), \quad i = 1,\ldots,I, \tag{8.4}$$
and Eq. 8.1, the diffusion equation, becomes

$$\frac{1}{vD}\,\frac{\dot n(t)}{n(t)} - \frac{1}{vD}\bigl[(1-\beta)k_\infty - 1\bigr]v\Sigma_a - \frac{1}{vD}\sum_{i=1}^{I}\lambda_i\,\frac{c_i(t)}{n(t)} = \frac{\nabla^2 f(\mathbf{r})}{f(\mathbf{r})} = -B^2, \tag{8.5}$$
in this special case of separation of variables (special case because there are no derivatives with respect to r in the precursor equations), and −B² is the usual separation constant. The equation for the time-dependent function n(t) follows, as usual, from setting the time-dependent terms – the terms that precede the first equal sign – equal to −B², which after some very elementary manipulations becomes
$$\dot n(t) = \frac{(\rho - \beta)}{\ell}\,n(t) + \sum_{i=1}^{I}\lambda_i c_i(t), \tag{8.6}$$
where ρ ≡ (k_e − 1)/k_e, of course, is the reactivity, and ℓ ≡ ℓ₀/k_e is the neutron generation time, both of which appear as a result of the following two definitions: (1) the neutron lifetime ℓ₀ ≡ ℓ∞/(1 + L²B²), where ℓ∞ ≡ 1/(vΣ_a) is the infinite-medium neutron lifetime and 1/(1 + L²B²) is the thermal neutron non-leakage probability written in terms of the diffusion length L = √(D/Σ_a) and the separation constant −B²; and (2) the effective multiplication constant k_e ≡ k∞/(1 + L²B²).
In terms of these parameters, Eq. 8.4 becomes

$$\dot c_i(t) = \frac{\beta_i}{\ell}\,n(t) - \lambda_i c_i(t), \quad i = 1,\ldots,I. \tag{8.7}$$
The corresponding equation obtained from setting the space-dependent terms in Eq. 8.5 – the terms between the two equal signs in that equation – equal to the separation constant −B² is the familiar Helmholtz equation

$$\nabla^2 f(\mathbf{r}) + B^2 f(\mathbf{r}) = 0, \tag{8.8}$$
with boundary condition f(r̃) = 0, which of course follows from the boundary conditions on N(r, t) stated immediately below Eq. 8.1. Of course, this linear homogeneous PDE for f(r) with homogeneous boundary conditions is an eigenvalue problem, with the eigenvalues being the eigenvalues of the Laplacian, −B² = −B_n², n = 0, 1, 2, …, and the corresponding eigenfunctions are φ_n(r), n = 0, 1, 2, …. According to long-standing tradition in reactor physics, these eigenvalues and eigenfunctions are referred to, respectively, as "bucklings" and "buckling modes," with the fundamental eigenvalue being the buckling. The explicit expressions for the B_n and φ_n(r), of course, depend upon the specific geometry of the reactor. The solution to the initial-value problem given by the linear PDEs, Eqs. 8.1 and 8.2, thus is
$$N(\mathbf{r},t) = \sum_{n=0}^{\infty} n_n(t)\,\varphi_n(\mathbf{r}), \tag{8.9}$$

and

$$C_i(\mathbf{r},t) = \sum_{n=0}^{\infty} c_{i,n}(t)\,\varphi_n(\mathbf{r}), \quad i = 1,\ldots,I. \tag{8.10}$$
Here, the n_n(t) and the c_{i,n}(t) are determined from the time-dependent equations, Eqs. 8.6 and 8.7, with the parameters defined below Eq. 8.6 in terms of B now written in terms of the B_n, n = 0, 1, 2, …, and therefore having the index n (i.e., (ρ, k_e, ℓ, ℓ₀) become (ρ_n, k_{e,n}, ℓ_n, ℓ_{0,n})), as do the unknowns n_n(t) and c_{i,n}(t). Thus, Eqs. 8.6 and 8.7 become

$$\dot n_n(t) = \frac{\rho_n - \beta}{\ell_n}\,n_n(t) + \sum_{i=1}^{I}\lambda_i\,c_{i,n}(t), \quad n = 0,1,2,\ldots, \tag{8.11}$$

and

$$\dot c_{i,n}(t) = \frac{\beta_i}{\ell_n}\,n_n(t) - \lambda_i\,c_{i,n}(t), \quad i = 1,\ldots,I;\; n = 0,1,2,\ldots. \tag{8.12}$$
This set of coupled linear ordinary differential equations (ODEs) with constant coefficients is solved for each value of the index n using standard procedures developed in an introductory undergraduate ODEs course to arrive at exponential solutions,

$$n_n(t) = \sum_{j=0}^{I} A_{n,j}\,e^{\alpha_{n,j} t}, \quad n = 0,1,2,\ldots, \tag{8.13}$$

and

$$c_{i,n}(t) = \sum_{j=0}^{I} A_{i,n,j}\,e^{\alpha_{n,j} t}, \quad i = 1,\ldots,I;\; n = 0,1,2,\ldots, \tag{8.14}$$
where the α_{n,j}, n = 0, 1, 2, …, j = 0, 1, …, I, are determined from the characteristic equation that arises in the solution of the ODEs,

$$\left|\begin{array}{ccccc}
\dfrac{\rho_n-\beta}{\ell_n}-\alpha_n & \lambda_1 & \lambda_2 & \cdots & \lambda_I \\
\dfrac{\beta_1}{\ell_n} & -(\lambda_1+\alpha_n) & 0 & \cdots & 0 \\
\dfrac{\beta_2}{\ell_n} & 0 & -(\lambda_2+\alpha_n) & & \vdots \\
\vdots & \vdots & & \ddots & 0 \\
\dfrac{\beta_I}{\ell_n} & 0 & \cdots & 0 & -(\lambda_I+\alpha_n)
\end{array}\right| = 0, \tag{8.15}$$
the A_{i,n,j}, i = 1, …, I, are written in terms of the A_{n,j} using Eqs. 8.11 and 8.12, and the A_{n,j} are determined from the initial conditions – not previously mentioned here – associated with Eqs. 8.1 and 8.2, N(r, 0) and C_i(r, 0), using the orthogonality properties of the eigenfunctions φ_n(r), n = 0, 1, 2, …, of the Laplacian.
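As a concrete illustration of the geometry dependence noted above – a standard textbook case, included here only for orientation – consider a bare spherical reactor of extrapolated radius R̃. The (spherically symmetric) buckling modes and bucklings are

$$\varphi_n(r) = \frac{\sin\bigl((n+1)\pi r/\tilde R\bigr)}{r}, \qquad B_n = \frac{(n+1)\pi}{\tilde R}, \qquad n = 0,1,2,\ldots,$$

so the fundamental buckling mode is φ₀(r) = sin(πr/R̃)/r with buckling B₀² = (π/R̃)².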
When a reactor is near critical – the case in which the point kinetics equations normally are used – the α_{n,j}, n = 1, 2, …, j = 0, 1, …, I, are negative and much smaller (more negative) than the α_{0,j}, j = 0, 1, …, I; hence, the higher spatial modes φ_n(r), n = 1, 2, …, die away and only the fundamental "buckling" mode φ₀(r) remains at long times. Thus, in the development of the point kinetics equations, only this fundamental mode – corresponding to n = 0 – is retained, and Eqs. 8.11 and 8.12 are rewritten with n set equal to zero and, in fact, with that zero suppressed, to give the standard point kinetics equations,
$$\dot n(t) = \frac{\rho - \beta}{\ell}\,n(t) + \sum_{i=1}^{I}\lambda_i\,c_i(t), \tag{8.16}$$

and

$$\dot c_i(t) = \frac{\beta_i}{\ell}\,n(t) - \lambda_i\,c_i(t), \quad i = 1,\ldots,I, \tag{8.17}$$
with the spatial distribution or "shape function" given by the fundamental buckling-mode solution to Eq. 8.8, φ₀(r). The solutions to Eqs. 8.13 and 8.14 become for n = 0 (with the zero suppressed)

$$n(t) = \sum_{j=0}^{I} A_j\,e^{\alpha_j t}, \tag{8.18}$$

and

$$c_i(t) = \sum_{j=0}^{I} A_{i,j}\,e^{\alpha_j t}, \quad i = 1,\ldots,I, \tag{8.19}$$
and the characteristic equation for these time-dependent solutions associated with the fundamental spatial mode, Eq. 8.15, becomes

$$\left|\begin{array}{ccccc}
\dfrac{\rho-\beta}{\ell}-\alpha & \lambda_1 & \lambda_2 & \cdots & \lambda_I \\
\dfrac{\beta_1}{\ell} & -(\lambda_1+\alpha) & 0 & \cdots & 0 \\
\dfrac{\beta_2}{\ell} & 0 & -(\lambda_2+\alpha) & & \vdots \\
\vdots & \vdots & & \ddots & 0 \\
\dfrac{\beta_I}{\ell} & 0 & \cdots & 0 & -(\lambda_I+\alpha)
\end{array}\right| = 0, \tag{8.20}$$
which is of course, in reactor kinetics, called the "in-hour equation," for the characteristic roots α = α_j, j = 0, 1, …, I, that explicitly give the exponential solutions in time, exp(α_j t), j = 0, 1, …, I, in Eqs. 8.18 and 8.19. If the reactor is slightly supercritical, α₀ is positive and the α_j, j = 1, …, I, are negative; and if it is subcritical, all the α_j, j = 0, 1, …, I, are negative.
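Eliminating the c_i from Eqs. 8.16 and 8.17 with n(t), c_i(t) ∝ exp(αt) reduces the determinant in Eq. 8.20 to the familiar scalar form of the in-hour equation, ρ = αℓ + Σ_i β_i α/(α + λ_i), whose I + 1 real roots are separated by the poles at α = −λ_i. A minimal numerical sketch of finding these roots – with illustrative six-group constants and an assumed generation time, not data for any particular reactor – might look like the following:

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative six-group delayed-neutron constants (assumed, for this sketch only)
beta_i = np.array([0.00021, 0.00142, 0.00127, 0.00257, 0.00075, 0.00027])
lam_i = np.array([0.0124, 0.0305, 0.111, 0.301, 1.14, 3.01])  # decay constants, 1/s
ell, rho = 1.0e-3, 0.001        # generation time [s] and reactivity (assumed values)

def inhour(alpha):
    # rho = alpha*ell + sum_i beta_i*alpha/(alpha + lambda_i), written as f(alpha) = 0
    return alpha * ell + np.sum(beta_i * alpha / (alpha + lam_i)) - rho

# One root lies in each open interval between adjacent poles alpha = -lambda_i,
# one to the left of the leftmost pole, and the dominant root to the right.
eps = 1.0e-9
poles = np.sort(-lam_i)                                   # -3.01, ..., -0.0124
brackets = ([(-100.0, poles[0] - eps)]
            + [(poles[k] + eps, poles[k + 1] - eps) for k in range(len(poles) - 1)]
            + [(poles[-1] + eps, 100.0)])
alphas = [brentq(inhour, a, b) for a, b in brackets]
print("alpha_j [1/s]:", alphas)   # I + 1 = 7 roots; only the dominant one is positive
```

Bracketing one root in each interval between adjacent poles (plus one interval on each side) is a robust way to obtain all I + 1 roots, since the in-hour function changes sign across each bracket.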
Finally, the case in which the fixed source S(r, t) in Eq. 8.1 is non-zero should be included. To do this, it is convenient – if S(r, t) is an L² function in r and t (i.e., |S(r, t)|² is integrable over the spatial domain of the reactor and over all time) – to expand S(r, t), for practical geometries, in the complete set (in L²) of eigenfunctions of Eq. 8.8 – the "buckling" modes,

$$S(\mathbf{r},t) = \sum_{n=0}^{\infty} S_n(t)\,\varphi_n(\mathbf{r}). \tag{8.21}$$
Substituting this expansion for S(r, t) into Eq. 8.1, and also substituting the eigenfunction expansions of N(r, t) and C_i(r, t), i = 1, …, I (which also belong to L² if S(r, t) does), into Eqs. 8.1 and 8.2, forming the inner product (in r) of the adjoint eigenfunctions φ†_m(r) with Eqs. 8.1 and 8.2, and using the bi-orthogonality properties of the adjoint and "forward" eigenfunctions, (φ†_m(r), w(r)φ_n(r)) = C_n δ_{nm}, where w(r) is the appropriate weight function for the reactor geometry and the inner product ( , ) corresponds to integration over the spatial domain of the reactor, leads to Eq. 8.11 with a source term,

$$\dot n_n(t) = \frac{\rho_n - \beta}{\ell_n}\,n_n(t) + \sum_{i=1}^{I}\lambda_i\,c_{i,n}(t) + S_n(t), \quad n = 0,1,2,\ldots, \tag{8.22}$$
and the coupled equations for the precursor concentrations, Eq. 8.12, which are unchanged. When the system is near critical, as discussed above in this section, only the fundamental spatial eigenfunction, or "buckling" mode, corresponding to n = 0 remains at long times, and – under the usual assumption that only this fundamental spatial mode is present in the source term S(r, t) – Eqs. 8.22 and 8.12 become the standard point reactor kinetics equations with a fixed source term:

$$\dot n(t) = \frac{\rho - \beta}{\ell}\,n(t) + \sum_{i=1}^{I}\lambda_i\,c_i(t) + S(t), \tag{8.23}$$

and

$$\dot c_i(t) = \frac{\beta_i}{\ell}\,n(t) - \lambda_i\,c_i(t), \quad i = 1,\ldots,I, \tag{8.24}$$
where Eq. 8.23 replaces Eq. 8.16, and Eq. 8.24 is identical to Eq. 8.17. The solutions to these nonhomogeneous equations (and also to the more general equations, Eqs. 8.22 and 8.12) are given in terms of the complementary solution, Eqs. 8.18 and 8.19 (Eqs. 8.13 and 8.14 in the more general case), plus the particular solution obtained using standard techniques for the solution of elementary ODEs.
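In numerical practice these equations are stiff – ℓ is many orders of magnitude smaller than the delayed time constants 1/λ_i – so an implicit integrator is the natural choice. The following is a minimal sketch of integrating Eqs. 8.16 and 8.17 for a small step reactivity insertion, using the same illustrative six-group constants and generation time assumed in the in-hour sketch above, and starting from an equilibrium critical state:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative six-group constants and generation time (assumed, not evaluated data)
beta_i = np.array([0.00021, 0.00142, 0.00127, 0.00257, 0.00075, 0.00027])
lam_i = np.array([0.0124, 0.0305, 0.111, 0.301, 1.14, 3.01])  # 1/s
beta, ell = beta_i.sum(), 1.0e-3   # total delayed fraction; generation time [s]
rho = 0.001                        # small step reactivity inserted at t = 0

def rhs(t, y):
    n, c = y[0], y[1:]
    dn = (rho - beta) / ell * n + lam_i @ c   # Eq. 8.16
    dc = beta_i / ell * n - lam_i * c         # Eq. 8.17
    return np.concatenate(([dn], dc))

# Equilibrium initial condition of the pre-step critical state (rho = 0):
# setting Eq. 8.17 to zero gives c_i(0) = beta_i * n(0) / (ell * lambda_i).
n0 = 1.0
y0 = np.concatenate(([n0], beta_i * n0 / (ell * lam_i)))

sol = solve_ivp(rhs, (0.0, 10.0), y0, method="BDF", rtol=1e-8, atol=1e-10)
print(f"n(10 s)/n(0) = {sol.y[0, -1] / n0:.3f}")
```

The implicit (BDF) integrator resolves the fast prompt jump automatically; an explicit method would be forced to take steps of the order of ℓ throughout the transient.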
Of course, this simple and very explicit and concrete development of the point kinetics equations from the one-speed neutron diffusion theory description of neutron migration in a reactor, Eqs. 8.1 and 8.2, is only the beginning of the story. Much more general developments of the point kinetics equations, starting from the energy- (or speed-), angle-, space-, and time-dependent neutron transport theory description, have been given. And some of these will be summarized in the next section.
8.3.2 More General Developments of the Point Reactor Kinetics
Equations: “Shape Functions,” “Time Functions,”
and “Neutron Importance”
More general developments of the point reactor kinetics equations were first
reported almost simultaneously by L. N. Ussachoff [27] and A. F. Henry [28], with
a more detailed derivation published a few years later by Henry [29]. (A photograph
of the late Allan F. Henry who had a long and distinguished career in reactor kinetics, reactor physics, and reactor computations appears on the opposite page.) It is
this more detailed derivation that we shall summarize here. A more complete, and
very nice, development appears in [14].
Although it is possible to begin this derivation from the one-speed neutron diffusion equation, as was done above in Section 8.3.1 for the simple introductory
first-course classroom development, it is more general, therefore more useful – and
certainly more elegant – to start from the space-, speed- (or energy-), direction-vector- (or angle-), and time-dependent neutron transport (or Boltzmann) equation – as is done in [14]:
$$\frac{\partial n(\mathbf{r},v,\hat\Omega,t)}{\partial t} = L_t n + M_t^0 n + \sum_{i=1}^{I}\lambda_i\,\frac{1}{4\pi}\chi_i(v)\,C_i + S(\mathbf{r},v,\hat\Omega,t), \tag{8.25}$$

$$\frac{\partial}{\partial t}\left[\frac{1}{4\pi}\chi_i(v)\,C_i(\mathbf{r},t)\right] = M_t^i n - \lambda_i\,\frac{1}{4\pi}\chi_i(v)\,C_i, \quad i = 1,\ldots,I. \tag{8.26}$$
The first step [14, 29] is to decompose the neutron number density n(r, v, Ω̂, t) into a product of a "Shape Function" ψ(r, v, Ω̂, t) and a "Time Function" T(t),

$$n(\mathbf{r},v,\hat\Omega,t) = \psi(\mathbf{r},v,\hat\Omega,t)\,T(t), \tag{8.27}$$
and substitute this decomposition into the transport equation, Eq. 8.25. It is worthy of comment that at this point no approximation has been made, and this is still completely general, because the shape function ψ(r, v, Ω̂, t) still depends upon time t, and therefore this decomposition is not a "separation of variables" – at least not yet.
It is useful now to introduce the concept of a “steady-state reference reactor” by
which is usually meant the reactor under consideration in its critical state – although
this does not have to be the case. The corresponding steady-state Boltzmann equation and adjoint Boltzmann equation for the reference reactor are

$$H_0 N_0(\mathbf{r},v,\hat\Omega) = 0, \tag{8.28}$$

and

$$H_0^\dagger N_0^\dagger(\mathbf{r},v,\hat\Omega) = 0, \tag{8.29}$$
The late Professor Allan F. Henry (courtesy of the Nuclear Science and Engineering Department
of The Massachusetts Institute of Technology)
where H_t = L_t + M_t, H_t† = L_t† + M_t†, H₀ = H_{t=0}, H₀† = H†_{t=0}, and the dagger (†) indicates adjoint operators and adjoint functions. The time-dependent streaming, removal, and inscatter operator L_t above is given by

$$L_t = -v\,\hat\Omega\cdot\nabla - v\,\Sigma_T(\mathbf{r},v,t) + \int_0^\infty dv'\int_{4\pi}d\hat\Omega'\; v'\,\Sigma_s(\mathbf{r},t;v'\to v,\hat\Omega'\to\hat\Omega), \tag{8.30}$$
the time-dependent fission neutron production operators M_t^i, i = 0, …, I, for the prompt neutrons (i = 0) and the delayed neutrons (in group i), i = 1, …, I, are given by

$$M_t^0 = \frac{1}{4\pi}\,(1-\beta)\,\chi_0(v)\int_0^\infty dv'\int_{4\pi}d\hat\Omega'\;\nu\,v'\,\Sigma_f(\mathbf{r},v',t), \tag{8.31}$$

$$M_t^i = \frac{1}{4\pi}\,\beta_i\,\chi_i(v)\int_0^\infty dv'\int_{4\pi}d\hat\Omega'\;\nu\,v'\,\Sigma_f(\mathbf{r},v',t), \tag{8.32}$$

and

$$M_t = \sum_{i=0}^{I} M_t^i, \tag{8.33}$$
where here, and also in Eqs. 8.25 and 8.26, χ₀(v) denotes the fission spectrum of the prompt neutrons and χ_i(v), i = 1, …, I, that of the neutrons in the i-th delayed group, and β_i, i = 1, …, I, is the fraction of delayed neutrons in the i-th group. In Eqs. 8.31 and 8.32, and in the remainder of this chapter, ν denotes the average number of neutrons emitted per fission event and Σ_f is the macroscopic fission cross section.
Substitution of the above decomposition, Eq. 8.27, into the precursor equations, Eq. 8.26, leads to an analogous decomposition

$$\frac{1}{4\pi}\chi_i(v)\,C_i(\mathbf{r},t) = \psi(\mathbf{r},v,\hat\Omega,t)\,\bar C_i(t), \quad i = 1,\ldots,I, \tag{8.34}$$
which is sufficient, but not necessary, for consistent solutions in the forms given by Eqs. 8.27 and 8.34. After substituting these decompositions into the Boltzmann equation, Eq. 8.25, and the precursor equations, Eq. 8.26, the inner products – over the space, speed, and direction-vector variables (r, v, Ω̂) – of the solution N₀†(r, v, Ω̂) to the adjoint steady-state (critical) reference reactor equation, Eq. 8.29, with both the Boltzmann and precursor equations are formed to obtain
$$\frac{d}{dt}\bigl(N_0^\dagger,\psi\bigr)\,T(t) + \bigl(N_0^\dagger,\psi\bigr)\,\frac{d}{dt}T(t) = \bigl(N_0^\dagger,L_t\psi\bigr)\,T(t) + \bigl(N_0^\dagger,M_t^0\psi\bigr)\,T(t) + \sum_{i=1}^{I}\lambda_i\bigl(N_0^\dagger,\psi\bigr)\,\bar C_i(t) + \bigl(N_0^\dagger,S\bigr), \tag{8.35}$$

and

$$\frac{d}{dt}\bigl(N_0^\dagger,\psi\bigr)\,\bar C_i(t) + \bigl(N_0^\dagger,\psi\bigr)\,\frac{d}{dt}\bar C_i(t) = \bigl(N_0^\dagger,M_t^i\psi\bigr)\,T(t) - \lambda_i\bigl(N_0^\dagger,\psi\bigr)\,\bar C_i(t), \quad i = 1,\ldots,I. \tag{8.36}$$
Here, the inner product ( , ) is the inner product in a complex L² space (or Hilbert space),

$$(f,g) = \int_V d^3r\int_0^\infty dv\int_{4\pi}d\hat\Omega\; f^*(\mathbf{r},v,\hat\Omega)\,g(\mathbf{r},v,\hat\Omega), \tag{8.37}$$

and the spatial integral is over the volume of the reactor V.
Next, the "Normalization Condition" [14, 29],

$$\frac{d}{dt}\Bigl(N_0^\dagger(\mathbf{r},v,\hat\Omega),\,\psi(\mathbf{r},v,\hat\Omega,t)\Bigr) = 0, \tag{8.38}$$

which renders the first term on the left-hand side of the Boltzmann equation, Eq. 8.35, and of each precursor equation, Eqs. 8.36, equal to zero, is introduced. This implies that (N₀†, ψ) is constant – independent of time – even though ψ(r, v, Ω̂, t) depends upon time, which, indeed, imposes a very special condition on the shape function. Of course, if the shape function is independent of time,
the normalization condition, Eq. 8.38, is satisfied immediately, but in this case, the decompositions in Eqs. 8.27 and 8.34 become a separability assumption.

With the usual interpretation of the adjoint eigenfunction N₀†(r, v, Ω̂), corresponding to the zero eigenvalue of H₀†, as the neutron importance [1, 2, 5, 7–11, 13, 14, 30], the inner product of N₀† with Eq. 8.27 shows that the time function is proportional to (N₀†, n), the total neutron importance in the critical reference reactor due to the instantaneous space-, speed-, and direction-vector-dependent neutron number density in the actual reactor described by Eqs. 8.25 and 8.26. Hence, the time function T(t) is often interpreted as the total neutron importance (for sustaining the chain reaction) in the actual reactor under consideration.
Division of Eqs. 8.35 and 8.36, with the first term in each now eliminated (via the normalization condition), first by

$$F_t \equiv \bigl(N_0^\dagger,\,M_t\,\psi\bigr), \tag{8.39}$$

where

$$M_t = \frac{1}{4\pi}\,\chi(v)\int_0^\infty dv'\int_{4\pi}d\hat\Omega'\;\nu\,v'\,\Sigma_f(\mathbf{r},v',t), \tag{8.40}$$

and

$$\chi(v) = (1-\beta)\,\chi_0(v) + \sum_{i=1}^{I}\beta_i\,\chi_i(v), \tag{8.41}$$

(see also Eqs. 8.31–8.33), and then by (N₀†, ψ)/F_t, leads immediately to the point reactor kinetics equations,
$$\frac{d}{dt}T(t) = \left(\frac{\rho(t) - \bar\beta^t}{\bar\ell^t}\right)T(t) + \sum_{i=1}^{I}\lambda_i\,\bar C_i(t) + \bar S(t), \tag{8.42}$$

and

$$\frac{d}{dt}\bar C_i(t) = \frac{\bar\beta_i^t}{\bar\ell^t}\,T(t) - \lambda_i\,\bar C_i(t), \quad i = 1,\ldots,I. \tag{8.43}$$

Here, the reactivity ρ(t) is given by

$$\rho(t) = \frac{1}{F_t}\,\bigl(N_0^\dagger,\,(H_t - H_0)\,\psi\bigr), \tag{8.44}$$
and the ρ(t) term in Eq. 8.42, in which it appears, was derived by adding and subtracting M_tψ to (L_t + M_t⁰)ψ in Eq. 8.35, identifying L_t + M_t as H_t, and subtracting (N₀†, H₀ψ) from (N₀†, H_tψ), the former of which is equal to zero since (N₀†, H₀ψ) = (H₀†N₀†, ψ) = 0.
The effective delayed neutron fractions β̄_i^t in Eq. 8.43 are defined as

$$\bar\beta_i^t = \frac{1}{F_t}\,\bigl(N_0^\dagger,\,M_t^i\,\psi\bigr), \quad i = 1,\ldots,I, \tag{8.45}$$

and the effective neutron generation time ℓ̄^t and the effective source S̄(t) in Eqs. 8.42 and 8.43 are given by

$$\bar\ell^t = \frac{1}{F_t}\,\bigl(N_0^\dagger,\,\psi\bigr), \tag{8.46}$$

and

$$\bar S(t) = \frac{\bigl(N_0^\dagger,\,S\bigr)}{\bigl(N_0^\dagger,\,\psi\bigr)}. \tag{8.47}$$
The effective total delayed neutron fraction β̄^t, which appears in Eq. 8.42, is consistent with Eq. 8.45 because the addition and subtraction of M_tψ that led to the expression for the reactivity ρ(t) in Eq. 8.44 also leads to the appearance of (N₀†, (M_t − M_t⁰)ψ)/F_t, which is precisely the sum over i = 1, …, I of the β̄_i^t as defined in Eq. 8.45.
This more general development of the point reactor kinetics equations, which has become, to some extent, the standard development used in more advanced presentations and textbooks (see [14] for example), seems quite elegant and convincing. However, it does have some mathematical shortcomings. Although, as noted above, the decomposition of the neutron number density into the product of a "Shape Function" and a "Time Function" is completely general – since the shape function depends upon time in Eq. 8.27 (see also Eq. 8.34) – this generality ceases to exist when the "Normalization Condition" is imposed on the shape function in Eq. 8.38. The result is that the shape function is restricted in some not-very-well-defined way which, to some, makes the development seem a little magical, or at least somewhat ad hoc. If the shape function were taken as time-independent, which surely would satisfy the normalization condition, the decomposition would become a simple separation of variables – as was also noted above – and the resulting analysis using the Boltzmann equation (with constant coefficients) would lead to a set of shape functions, the eigenfunctions of

$$H_0\,\psi_n = \lambda_n\,\psi_n, \quad n = 0,1,\ldots, \tag{8.48}$$
which are the so-called lambda-modes [13, 31] in which the solutions to Eqs. 8.25
and 8.26 are often expanded. Alternatively, when the analysis resulting from the
separation of variables is carried out using the Boltzmann equation and the coupled precursor equations (again with constant coefficients) the eigenfunctions are
the so-called omega-modes [13, 32, 33] (sometimes called omega-d modes) [13], in
which the solutions to Eqs. 8.25 and 8.26 are also often expanded – which is more appealing physically – although mathematically both sets of eigenfunctions form complete basis sets in L². These two eigenfunction expansions are different in the
context of the speed-dependent transport theory description adopted here, and even
in the context of a two-energy-group diffusion theory description; however, they are
equivalent to each other in the context of one-speed diffusion theory for a bare reactor, since in that context the two sets of eigenfunctions become identical and the
two expansions reduce to the “buckling mode” expansion discussed at the end of
Section 8.3.1.
Although the general development of the point reactor kinetics equations by L. N.
Ussachoff [27] and A. F. Henry [28, 29] summarized here has some mathematically
ad hoc features, which are bothersome, it does have a great deal of appeal. First, it
has the very practical advantage of accommodating time-dependent coefficients, i.e.,
time-dependent cross sections – essential for “real-world” applications; and, second,
it seems to work fairly well when applied to the analysis of mild transients, i.e., when
it is used in the simulation of near-critical real-world reactors. Of course, the fact
that the shape function, in practical applications, is an instantaneous fundamental-mode eigenfunction naturally should restrict the resulting point kinetics equations
to applications involving fairly slow reactor transients. In practice, the above point
reactor kinetics equations, Eqs. 8.42 and 8.43, are usually applied using the quasi-static method [34–36] (which evolved from the so-called adiabatic method [31]),
the names of which both rather faithfully convey the basic idea of these approaches.
The time-discretized point kinetics equations are solved for several successive time
steps using coefficients that were calculated based on the shape function computed
at the beginning of the first time step for the instantaneous critical reactor. Then,
after these several time steps – with the number of these steps depending on how
rapid the transient is – the shape function is recalculated based on the new state of
the reactor at the end of the just completed sequence of time steps, and the resulting
point kinetics coefficients are used for the next several time steps, etc. In the adiabatic method, the lambda-mode was used as the instantaneous shape function, while
in the quasi-static methods, various approximations to the omega mode – which
generally lead to improvements – are used. For a very short and somewhat dated,
but very nice, discussion of the differences among the various quasi-static methods,
see [13].
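In code form, the quasi-static logic just described is essentially an outer loop of occasional, expensive shape recalculations wrapped around an inner loop of cheap point kinetics steps. The following is only a schematic sketch of that control flow; the three helper functions are hypothetical stand-ins – reduced here to trivial stubs so the skeleton executes – and not the API of any particular kinetics code:

```python
def compute_shape(state):
    # Stand-in for an instantaneous fundamental-mode (shape function) calculation.
    return state.get("shape", 1.0)

def update_pk_parameters(shape, state):
    # Stand-in for recomputing rho, beta_eff, ell, ... from the current shape.
    return {"rho": 0.0005, "beta": 0.0065, "ell": 1.0e-3}

def advance_point_kinetics(state, p, dt):
    # Stand-in for one (implicit, in practice) point kinetics step, Eqs. 8.42-8.43;
    # this stub advances only the prompt part, to illustrate the control flow.
    state["n"] *= 1.0 + dt * (p["rho"] - p["beta"]) / p["ell"]
    return state

def quasi_static(state, t_end, dt, shape_every=100):
    t, step = 0.0, 0
    shape = compute_shape(state)              # shape at the start of the transient
    params = update_pk_parameters(shape, state)
    while t < t_end:
        state = advance_point_kinetics(state, params, dt)
        t, step = t + dt, step + 1
        if step % shape_every == 0:           # infrequent shape updates; more often
            shape = compute_shape(state)      # for faster transients
            params = update_pk_parameters(shape, state)
    return state

final = quasi_static({"n": 1.0}, t_end=1.0, dt=0.001)
```

The essential design point is the ratio of inner steps to shape updates: the slower the transient, the longer the point kinetics coefficients can be frozen between shape recalculations.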
Notwithstanding the fact that the point reactor kinetics equations that result from
the general development summarized above [14, 27–29] usually work reasonably
well in practice, other developments of point kinetics equations are worthy of mention [37–40, 42]. In many ways they lead to truer approximations to the original
equations, e.g., Eqs. 8.25 and 8.26, and – perhaps of equal importance – they add
additional insight. Thus, formulations of the point reactor kinetics equations based
on a variational principle [39] and on asymptotic expansions [40, 42] will be summarized in the next section.
8.3.3 Variational Formulations and Asymptotic Formulations
of Point Reactor Kinetics and the Appearance
of “Additional Terms”
Subsequent to the seminal works by L. N. Ussachoff [27] and A. F. Henry [28,29] on
the general development of point reactor kinetics equations, a few other general formulations have appeared, three of which will be mentioned here. Not surprisingly,
in all three the somewhat controversial “Normalization Condition” was avoided;
and, of course, additional terms – terms that are not present in the point kinetics
equations derived by Ussachoff [27] and by Henry [28, 29] – appear in the final
equations.
The first of these formulations – which was motivated by early applications to
reactor safety – was reported by E. P. Gyftopoulos [38] in The Technology of Nuclear
Reactor Safety in 1964. The end result of that development was the point reactor
equations

$$\frac{d}{dt}T(t) = \left(\frac{\rho(t) - \bar\beta^t}{\bar\ell^t}\right)T(t) + \sum_{i=1}^{I}\lambda_i\,\bar C_i(t) + \bar S(t) - W(t)\,T(t), \tag{8.49}$$

and

$$\frac{d}{dt}\bar C_i(t) = \frac{\bar\beta_i^t}{\bar\ell^t}\,T(t) - \lambda_i\,\bar C_i(t) - W(t)\,\bar C_i(t), \quad i = 1,\ldots,I, \tag{8.50}$$
with one term – the last one on the RHS – in each equation which does not appear in Eqs. 8.42 and 8.43 in Section 8.3.2. The first factor in each of these terms is

$$W(t) = \frac{d}{dt}\ln\bigl(N_0^\dagger,\psi\bigr) = \frac{1}{\bigl(N_0^\dagger,\psi\bigr)}\,\frac{d}{dt}\bigl(N_0^\dagger,\psi\bigr), \tag{8.51}$$

from which it is very obvious that these "new terms" result from the fact that the normalization condition has not been introduced in the derivation, since clearly W(t) would be equal to zero if (N₀†, ψ) were taken to be independent of time.
A few years later M. Becker [39] reported a derivation of the point kinetics equations starting from the coupled Boltzmann equation and delayed neutron precursor equations, Eqs. 8.25 and 8.26 above, and the corresponding adjoint equations for n†(r, v, Ω̂, t) and C_i†(r, t), i = 1, …, I. In this variational formulation, he used a functional for non-self-adjoint operators,

$$\bigl\langle y^\dagger,\,Lx\bigr\rangle - \bigl\langle y^\dagger,\,Q\bigr\rangle = 0, \tag{8.52}$$

(with additional terms to allow the approximation of initial and final conditions in time) where the inner product here ⟨ , ⟩ is over space (the volume of the reactor), speed, direction vector, and time (0, t_final), and L is the matrix of linear operators that
appear in the Boltzmann equation and the precursor equations, Eqs. 8.25 and 8.26, rewritten here as Lx = Q, where x is the vector of the neutron number density n(r, v, Ω̂, t) and the precursor concentrations C_i(r, t), i = 1, …, I, and Q is the vector of fixed-source terms in these equations – which, of course, are zero in the precursor equations. The adjoint vector y†, which satisfies the adjoint equation L†y† = Q†, comprises the adjoint neutron number density n†(r, v, Ω̂, t) and the adjoint precursor concentrations C_i†(r, t), i = 1, …, I, which satisfy the equations that are adjoint (with respect to all the independent variables, including time) to Eqs. 8.25 and 8.26. The adjoint source vector Q† is made up of the fixed-source term S†(r, v, Ω̂, t) in the adjoint Boltzmann equation and zeros corresponding to the zero fixed sources in the adjoint precursor equations.
The minimization of this functional, after the introduction of decompositions for the neutron number density and the precursor concentrations of the type given in Eqs. 8.27 and 8.34 – and analogous decompositions for the adjoint neutron number density and adjoint precursor concentrations – led to (forward) Euler–Lagrange equations and adjoint Euler–Lagrange equations that resulted in the following generalized point reactor kinetics equations (when the same weight function, the adjoint neutron number density, is used with all the forward equations):

$$\frac{d}{dt}T(t) = \left(\frac{\rho(t) - \bar\beta^t}{\bar\ell^t}\right)T(t) + \sum_{i=1}^{I}\lambda_i\,\bar C_i(t) + \bar S(t) - W(t)\,T(t), \tag{8.53}$$

and

$$\frac{d}{dt}\bar C_i(t) = \frac{\bar\beta_i^t}{\bar\ell^t}\,T(t) - \lambda_i\,\bar C_i(t) - W(t)\,\bar C_i(t), \quad i = 1,\ldots,I, \tag{8.54}$$

and analogous adjoint reactor kinetics equations.
Here, the reactivity ρ and the parameters β̄_i^t and ℓ̄^t and S̄(t) are given by expressions analogous to those in Eqs. 8.44–8.47 with the fundamental adjoint eigenfunction N₀† replaced by the adjoint shape function ψ†(r, v, Ω̂, t), and W(t) is given by an expression analogous to Eq. 8.51 above with N₀† again replaced by ψ†(r, v, Ω̂, t).
A decade later another derivation of generalized point reactor kinetics equations from the coupled Boltzmann equation and precursor equations, Eqs. 8.25 and 8.26 – this one based on an asymptotic expansion technique – was developed by J. Dorning and G. Spiga [40]. In this development, the small parameter ε, defined as the ratio of the prompt neutron lifetime to the average delayed neutron precursor half-life,

$$\varepsilon = \ell_\infty/\bar\tau_d = 1/(\bar\tau_d\,v\Sigma_a), \tag{8.55}$$

was exploited. (This parameter is, indeed, very small since the prompt neutron lifetime is of the order of 10⁻³ s in a thermal reactor (and 10⁻⁵ s in a fast reactor) and the average delayed neutron precursor half-life is of the order of seconds – 0.23 to 55.72 s [13] – so that ε ∼ 10⁻³ s/10 s = 10⁻⁴ in a typical thermal reactor.) After introducing the small parameter, and the appropriate
ordering of the time derivatives and the fixed source (for a near critical reactor), the neutron number density, the precursor concentrations, and 1/k_eff were expanded in powers of ε,

$$n(\mathbf{r},v,\hat\Omega,t) = n_0(\mathbf{r},v,\hat\Omega,t) + n_1(\mathbf{r},v,\hat\Omega,t)\,\varepsilon + \cdots, \tag{8.56}$$

$$C_i(\mathbf{r},t) = C_{i,0}(\mathbf{r},t) + C_{i,1}(\mathbf{r},t)\,\varepsilon + \cdots, \tag{8.57}$$

and

$$1/k_{\rm eff} = (1/k)_0 + (1/k)_1\,\varepsilon + \cdots, \tag{8.58}$$
and substituted into the original equations, Eqs. 8.25 and 8.26. In the resulting hierarchy of equations associated with successively higher powers of ε, in the first equations, the O(ε⁰) equations, the time dependence appears only through the cross sections. The time derivatives and the time-dependent source term do not appear. Hence, these zero-order equations combine to become an instantaneous steady-state Boltzmann equation for the critical state of the reactor under consideration with cross sections that are those associated with the instantaneous state of the reactor undergoing the transient. The solutions to this equation are the products of the eigenfunctions ψ_o^{(n),t}(r, v, Ω̂) of this homogeneous transport equation (with parametric dependence upon time due to the dependence of the cross sections upon time) times arbitrary functions of time T₀^{(n),t}(t), n = 1, …. The fundamental eigenfunction of this equation, ψ_o^{(0),t}(r, v, Ω̂), is the instantaneous distribution in space, speed, and direction vector of the neutron number density in the instantaneous critical reactor associated with the cross-section values in the actual reactor (the one undergoing the transient) at the instant of time t. Since the actual reactor is near critical, only the fundamental eigenfunction is relevant, and the leading order term in the expansion of the neutron number density is given by

$$n_0(\mathbf{r},v,\hat\Omega,t) = \psi_o^{(0),t}(\mathbf{r},v,\hat\Omega)\,T_0(t), \tag{8.59}$$

which provides, directly as a result of the asymptotic expansion, the "decomposition" of the neutron number density into the product of a time-dependent "shape function" – the instantaneous critical distribution in the reactor – times an arbitrary "time function" T₀(t).
Both the time derivatives and the fixed source appear in the first-order equations, the O(ε¹) equations, as does ψ_o^{(0),t}(r, v, Ω̂)T₀(t). These equations for n₁(r, v, Ω̂, t) and C_{i,1}(r, t), i = 1, …, I, are identical to the zero-order equations, except they are nonhomogeneous equations in which the nonhomogeneous terms include the time derivatives and the fixed source S(r, v, Ω̂, t) and also involve ψ_o^{(0),t}(r, v, Ω̂)T₀(t). The solvability condition (the Fredholm Alternative Theorem [41]) for this non-self-adjoint nonhomogeneous set of equations requires that the sums of these nonhomogeneous terms in these equations be orthogonal to the fundamental eigenfunction of the instantaneous steady-state adjoint equations for the instantaneous critical reactor, N_o^{(0),t,†}(r, v, Ω̂). This solvability condition leads directly to equations for the up-to-this-point arbitrary function of time T₀(t) – now written without the subscript as
T(t) – and the related C̄_i(t), i = 1, …, I, in the form of generalized point reactor kinetics equations

$$\frac{d}{dt}T(t) = \left(\frac{\rho(t) - \bar\beta^t}{\bar\ell^t}\right)T(t) + \sum_{i=1}^{I}\lambda_i\,\bar C_i(t) + \bar S(t) - W(t)\,T(t) - V(t)\,T(t), \tag{8.60}$$

and

$$\frac{d}{dt}\bar C_i(t) = \frac{\bar\beta_i^t}{\bar\ell^t}\,T(t) - \lambda_i\,\bar C_i(t) - W(t)\,\bar C_i(t) - U_i(t), \quad i = 1,\ldots,I. \tag{8.61}$$
Here, the reactivity ρ and the parameters β̄_i^t and ℓ̄^t and the source term S̄(t) are again given by expressions analogous to those in Eqs. 8.44–8.47, with the "shape function" ψ replaced by ψ_o^{(0),t}(r, v, Ω̂), the fundamental (lambda-mode) eigenfunction of the instantaneous critical reactor, and with the fundamental (lambda-mode) adjoint eigenfunction N₀†(r, v, Ω̂) replaced by the fundamental (lambda-mode) adjoint eigenfunction ψ_o^{(0),t,†}(r, v, Ω̂) of the instantaneous critical reactor. In the new "additional terms" W(t)T(t), V(t)T(t), W(t)C̄_i(t), and U_i(t) on the RHSs of these equations, W(t), V(t), and U_i(t) are given by

$$W(t) = \frac{d}{dt}\ln\bar\ell^t, \tag{8.62}$$

$$V(t) = \frac{\Bigl(\psi_o^{(0),t,\dagger},\,\frac{\partial}{\partial t}\psi_o^{(0),t}\Bigr)}{\Bigl(\psi_o^{(0),t,\dagger},\,\psi_o^{(0),t}\Bigr)}, \tag{8.63}$$

and

$$U_i(t) = \frac{\Bigl(\psi_o^{(0),t,\dagger},\,\frac{\partial}{\partial t}\frac{1}{4\pi}\chi_i\,C_{i,0}\Bigr)}{\Bigl(\psi_o^{(0),t,\dagger},\,\psi_o^{(0),t}\Bigr)}, \quad i = 1,\ldots,I. \tag{8.64}$$
All four of the “additional terms” that appear in these generalized point reactor
kinetics equations, W(t)T(t), V(t)T(t), W(t)C̄_i(t), and U_i(t), i = 1, …, I, will reduce to zero if, in the spirit of A. F. Henry's "Normalization Condition," the inner product (ψ_o^{(0),t,†}, ψ_o^{(0),t}) is forced to be independent of time, and these generalized point reactor equations will reduce to those originally derived by L. N. Ussachoff [27] and A. F. Henry [28, 29], although with slightly different definitions of the reactivity, the effective parameters, and the effective source. But within the context of this asymptotic formulation, there is absolutely no reason to do this.
This asymptotic development of the generalized point reactor equations [40] led deductively to (a) the decomposition of the leading order neutron number density into the product of a time-dependent shape function and a time function; (b) the identification of the shape function as the fundamental lambda-mode eigenfunction of the steady-state Boltzmann equation for the instantaneous critical state of the reactor; (c) the arrival at the fundamental lambda-mode eigenfunction of the corresponding adjoint Boltzmann equation as the weight function – used to form the inner product that leads to the generalized point reactor kinetics equations, and, with the shape function, to generate the expressions for the reactivity, the effective neutron generation time, and the effective delayed neutron fractions, and with the source to generate the effective source, in the generalized point reactor kinetics equations; and, finally, (d) the generalized point reactor equations for the time function T(t) and precursor concentrations C̄_i(t), i = 1, …, I. All these results follow directly and deductively from the initial asymptotic expansion; no ad hoc factorizations, weight functions, or normalization conditions were introduced. Of course, without the ad hoc normalization condition, the additional terms W(t)T(t), V(t)T(t), W(t)C̄_i(t), and U_i(t), i = 1, …, I, appear in the generalized point reactor kinetics equations. (It would be an interesting exercise to compare the results of simulations done using quasi-static methods based on Eqs. 8.60 and 8.61, with and without these additional terms, with the results of comparable simulations of the full time-dependent Boltzmann equations and coupled precursor concentration equations.)
Finally, in closing our discussion of the derivation of the point reactor kinetics equations, and generalizations thereof, it should be mentioned that a few years after the above-mentioned derivation – of the generalized point kinetics equations as an asymptotic approximation to the coupled Boltzmann equation and precursor equations – appeared, an extension of this approach was reported [42]. Although the same small parameter ε = ℓ∞/τ̄_d = 1/(τ̄_d vΣ_a) was used, two time scales were introduced – a "fast time scale" t₁ = εt, based on the prompt neutron time scale, and a "slow time scale" t₂ = t, based on the average precursor half-life time scale. There, when the neutron number density n(r, v, Ω̂, t₁, t₂), the precursor concentrations C_i(r, t₁, t₂), and 1/k_eff were expanded in powers of ε, the O(ε⁰) equations in the resulting hierarchy were not instantaneous steady-state equations; rather, they were time-dependent equations in which there appeared time derivatives with respect to the fast time variable t₁ and cross sections that depended on both t₁ and the slow time variable t₂. Hence, they were coupled differential equations in the fast time variable t₁ with instantaneous parametric dependence on the slow time variable t₂. This led again to a leading order solution n₀(r, v, Ω̂, t₁, t₂) that was the product of a shape function ψ_o^{t₂}(r, v, Ω̂, t₁) – that was a function of the fast time variable t₁ and also had additional parametric dependence on the slow time variable t₂ – and a time function T(t₂) – that was a function of only t₂. The solution to these zero-order equations leads to the instantaneous (in the slow time variable) fundamental omega-mode eigenfunction as the time-dependent shape function; and the solvability condition – the Fredholm Alternative Theorem [41] – for the O(ε¹) equations leads to the instantaneous fundamental omega-mode adjoint eigenfunction as the time-dependent weight function. Otherwise the results were essentially the same as those summarized above obtained using just one time scale, which led to lambda-mode forward and adjoint eigenfunctions.
The point reactor kinetics equations are often used to describe the time evolution of a reactor when the reactor is just one component of a much larger system being simulated – for example, an entire nuclear power plant. They are also used in the context of the adiabatic method [31] and quasi-static methods [34–36] in the simulation of reactor transients when only the reactor is represented – i.e., it is not coupled
to an entire plant. In these simulations, the shape functions – used to calculate the
reactivity and update the effective point kinetics parameters – sometimes are calculated using the multigroup transport (Boltzmann) equation; however, more often
the multigroup diffusion equations or the few group diffusion equations are used. Of
course, the point reactor kinetics equations can also be derived starting from any of
these descriptions [2] instead of the energy-dependent or speed-dependent transport
theory description, Eqs. 8.25 and 8.26, used here – and in some textbooks [14] – for
generality.
Notwithstanding the fact that the quasi-static method may frequently yield sufficiently accurate results, it often is necessary to solve the full space- and time-dependent multigroup diffusion equations – and in some applications the multigroup
transport equations – along with the coupled delayed neutron precursor equations.
Thus, the solution of these so-called space–time kinetics equations, which, in practice, must be done numerically, is an extremely important subject, and a summary of
the developments of reactor kinetics in the twentieth century surely could not omit
it. Therefore, the main methods that have evolved – principally since the 1950s –
for practical numerical solution of the space–time reactor kinetics equations will be
discussed in Section 8.5.
That discussion, however, will be preceded by a digression on the kinetics of
neutron pulses. This subject, which was an integral part of reactor kinetics at the
time,¹ was of substantial importance in the 1950s and 1960s and even afterward,
first in connection with neutron thermalization problems related to the development
of thermal reactors and later in connection with the physics of fast reactors.
8.4 A Digression on the Kinetics of a Pulse of Neutrons
in Non-multiplying Systems and Subcritical Multiplying
Systems: Pulsed Neutron Experiments and Their Analysis
8.4.1 Neutron Thermalization, Exponential Decay,
and Diffusion Cooling
The generation of accurate thermal neutron cross sections and thermal neutron
scattering data was an important priority for the development of thermal reactors
during the late 1950s, the 1960s, and into the early 1970s. The generation and validation of cross sections and slow neutron scattering kernels for thermal reactor core
materials was essential for the advancement of thermal reactor design. And there
was an experiment that could be done in many laboratories because all it required
¹ For a collection of articles on reactor kinetics, which included the kinetics of neutron pulses, see the following conference proceedings, which contains many articles that were representative of the state of the art in the late 1960s: Hetrick, D. L. (Editor), Dynamics of Nuclear Systems. 1972, Tucson, AZ: University of Arizona Press.
was a “neutron generator” – i.e., a small Cockcroft–Walton accelerator in which
deuterium ions were accelerated to bombard a tritium target producing a burst, or
"pulse," of 14 MeV neutrons via the resulting (d, t) reaction. These neutrons then slowed down on a very fast time scale (∼10⁻⁵ s), and thermalized and diffused on a slower time scale (∼10⁻³ s), in the "assembly" made up of a single material of
interest in the physics of thermal reactors. Moreover, relatively inexpensive BF-3
tube detectors, or later (slightly more expensive) He-3 tube detectors, were used to
measure the time evolution of the thermal neutron population – or the “decay of the
neutron pulse.” (Actually, the experiments were performed using many well-spaced,
narrow pulses and the detector counts were combined using the appropriate delay
time following each pulse.) So, the experiment was conceptually fairly simple –
although practical technical details such as background noise, room return, etc.
often plagued the physicists doing the experiments. In fact, the basic experiment
was conceptually sufficiently simple that a “Pulsed Neutron Decay Experiment”
was routinely included by the 1960s and early 1970s in the graduate lab course
in most nuclear science and engineering departments in the United States and
elsewhere.
While the basic pulsed neutron experiment was conceptually very simple, the
“theory” on which it was based was even simpler! Well, at least, at first glance! It
starts from the one-speed diffusion equation for the neutron flux in a homogeneous
non-multiplying medium (cannot be much simpler than that!) – which, of course,
is actually the speed-dependent neutron diffusion equation averaged over a thermal
neutron speed distribution,

$$\frac{\partial \bar{\Phi}}{\partial t}(\mathbf{r},t) = \overline{vD}\,\nabla^{2}\bar{\Phi}(\mathbf{r},t) - \overline{v\Sigma_a}\,\bar{\Phi}(\mathbf{r},t), \qquad (8.65)$$

where the overbar indicates the average over the Maxwellian thermal neutron spectrum. Separation of variables, $\bar{\Phi}(\mathbf{r},t) = \varphi(\mathbf{r})T(t)$, leads immediately to
$$\nabla^{2}\varphi_n(\mathbf{r}) + B_n^2\,\varphi_n(\mathbf{r}) = 0, \qquad (8.66)$$

$$\bar{\Phi}(\mathbf{r},t) = \sum_{n=0}^{\infty} A_n\,\varphi_n(\mathbf{r})\,e^{-\alpha_n t}, \qquad (8.67)$$
where the time eigenvalues $\alpha_n$ are given by

$$\alpha_n = \overline{v\Sigma_a} + \overline{vD}\,B_n^2, \quad n = 0, 1, 2, \ldots, \qquad (8.68)$$

and $B_n^2$ is the geometric "buckling" (spatial eigenvalue) for the $n$th spatial mode $\varphi_n(\mathbf{r})$ (spatial eigenfunction), all of which result from the application of the boundary condition (zero neutron flux at the "extrapolated" boundary) for the specific geometry of the assembly used in the experiment.
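Because everything in Eqs. 8.66–8.68 is elementary, the mode structure of such an experiment is easy to evaluate numerically. The short Python sketch below uses illustrative, made-up one-speed parameters (water-like in magnitude only, not data from any experiment discussed here) to compute the first few time eigenvalues for a cubic assembly and to estimate when the decay becomes effectively single-mode.

```python
import numpy as np

# Illustrative one-speed parameters (assumed values, water-like in magnitude only):
v_sigma_a = 4.8e3   # spectrum-averaged absorption frequency vSigma_a  [1/s]
v_D       = 3.6e4   # spectrum-averaged vD                             [cm^2/s]
a         = 20.0    # side of a cubic assembly incl. extrapolation     [cm]

# For a cube with zero flux at the extrapolated boundary, the spatial
# eigenfunctions are products of sines and the bucklings (Eq. 8.66) are
#   B_{lmn}^2 = (pi/a)^2 (l^2 + m^2 + n^2),  l, m, n = 1, 2, 3, ...
modes = [(l, m, n) for l in (1, 2) for m in (1, 2) for n in (1, 2)]
for lmn in sorted(modes, key=lambda t: sum(i * i for i in t)):
    B2 = (np.pi / a) ** 2 * sum(i * i for i in lmn)
    alpha = v_sigma_a + v_D * B2                     # Eq. 8.68
    print(f"mode {lmn}: B^2 = {B2:.4f} cm^-2, alpha = {alpha:.3e} 1/s")

# Time after the pulse at which the first harmonic has decayed to 1% of the
# fundamental, i.e., roughly when the single-mode form, Eq. 8.69, applies:
B2_0, B2_1 = 3 * (np.pi / a) ** 2, 6 * (np.pi / a) ** 2
t_asym = np.log(100.0) / (v_D * (B2_1 - B2_0))
print(f"first harmonic down to 1% of fundamental at t = {t_asym * 1e6:.0f} microseconds")
```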
The total experiment actually comprises a sequence of experiments carried out in
a sequence of assemblies (usually) of the same geometric shape but of successively
larger sizes. For a specific assembly, say the smallest one, the time-asymptotic decay constant $\alpha_0$ is measured, using the counts recorded after all the higher modes have damped away (but before long-time background noise becomes a significant fraction of the count rate). That is, the time-asymptotic flux
$$\bar{\Phi}(\mathbf{r},t) \approx A_0\,\varphi_0(\mathbf{r})\,e^{-\alpha_0 t}, \qquad (8.69)$$
is measured – usually at a node (zero) of the first spatial harmonic $\varphi_1(\mathbf{r})$ if possible, or, if the initial pulse is spatially symmetric, at a node of the second harmonic $\varphi_2(\mathbf{r})$ – to obtain the time-asymptotic, fundamental-mode decay constant (or time-eigenvalue)

$$\alpha_0 = \overline{v\Sigma_a} + \overline{vD}\,B_0^2, \qquad (8.70)$$
in terms of the fundamental-mode buckling (or spatial eigenvalue) $B_0^2$. This experiment is done for each of the successively larger assemblies (with smaller fundamental-mode bucklings $B_0^2$), and the sequence of measured decay constants $\alpha_0$ is plotted vs these bucklings $B_0^2$. (See sketch in Fig. 8.1.) Of course, the extrapolation of this curve, Eq. 8.70, to zero buckling gives, as the y-intercept, the experimental value of the thermal-neutron-spectrum-averaged (Maxwellian averaged, at the temperature of the assemblies) absorption frequency $\overline{v\Sigma_a}$, which, happily, is precisely the cross-section data needed in the one-speed (thermal-spectrum-averaged) description of a thermal reactor (or, in practice, the thermal-group-averaged cross-section data needed in a two-group description). Further, the experimental value of the neutron speed times the diffusion coefficient, $\overline{vD}$, needed in the one-speed or two-group diffusion theory description of a thermal reactor, is given by the slope of the curve (Fig. 8.1).

Fig. 8.1 Sketch of the decay constant $\alpha_0$ vs the buckling $B_0^2$: the straight line $\alpha_0 = \overline{v\Sigma_a} + \overline{vD}\,B_0^2$ has y-intercept $\overline{v\Sigma_a}$ and slope $\overline{vD}$; the measured points follow the curve $\overline{v\Sigma_a} + \overline{vD}\,B_0^2 - CB_0^4$, which falls below the line at large $B_0^2$

(Actually, the experimental points for smaller assemblies (larger bucklings $B_0^2$) lie below the straight line given by Eq. 8.70, as shown in the figure. This is easily understood, even in terms of a simple two-group diffusion theory model, as a diffusion cooling phenomenon in which the preferential leakage of the faster neutrons in the thermal spectrum in the smaller assemblies cools the neutron energy distribution, lowering the decay constant $\alpha_0$ by decreasing the spectrum-averaged value $\overline{vD}$. This phenomenon was included by adding a (negative) diffusion cooling term in the expression for $\alpha_0$ that is quadratic in $B_0^2$,

$$\alpha_0 = \overline{v\Sigma_a} + \overline{vD}\,B_0^2 - CB_0^4 + \cdots, \qquad (8.71)$$

and the measured value of the diffusion cooling coefficient $C$ was often reported along with the other experimental data $\overline{v\Sigma_a}$ and $\overline{vD}$, and the values of $\overline{\Sigma}_a$ and $\bar{D}$ deduced from those data.) So, in many ways everything associated with the pulsed neutron experiment and the theory of the pulsed neutron experiment seemed to be very simple
and quite nice and tidy. And it was – until the transport equation reared its ugly
head! Ah, the demon Boltzmann equation strikes again – as it so often does! But,
of course, since the Boltzmann equation is a more precise description of particle
migration, difficulties to which it sometimes leads usually reflect complexities that
are physically real, often subtle, and sometimes very important – and should not be
overlooked, and certainly never “swept under the carpet!”
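In practice, the parameters in Eqs. 8.70 and 8.71 were extracted by fitting the sequence of measured $(B_0^2, \alpha_0)$ pairs. A minimal sketch of such a fit in Python follows; the "data" below are synthetic, generated from assumed parameter values, not measurements from any published experiment.

```python
import numpy as np

# Synthetic "measurements": alpha0 = vSa + vD*B2 - C*B2^2 plus ~1% noise,
# generated from assumed (not experimental) parameter values.
rng = np.random.default_rng(1)
vSa_true, vD_true, C_true = 4.8e3, 3.6e4, 5.0e4      # 1/s, cm^2/s, cm^4/s
B2 = np.linspace(0.01, 0.12, 10)                     # assembly bucklings [cm^-2]
alpha0 = vSa_true + vD_true * B2 - C_true * B2**2
alpha0 *= 1.0 + 0.01 * rng.standard_normal(B2.size)  # counting statistics

# Least-squares fit of the quadratic-in-B0^2 form of Eq. 8.71:
coeffs = np.polyfit(B2, alpha0, 2)                   # returns [-C, vD, vSa]
C_fit, vD_fit, vSa_fit = -coeffs[0], coeffs[1], coeffs[2]
print(f"vSigma_a = {vSa_fit:.3e} 1/s    (y-intercept)")
print(f"vD       = {vD_fit:.3e} cm^2/s  (initial slope)")
print(f"C        = {C_fit:.3e} cm^4/s   (diffusion cooling coefficient)")
```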
8.4.2 Non-Exponential Decay and the Theory of Pulsed
Neutron Die-Away: The Continuous Spectrum
of the Boltzmann Operator
In order to introduce the complications that arise when the better description of
particle migration provided by the speed-dependent (or equivalently, the energy-dependent) Boltzmann equation is introduced, it is convenient to first discuss the eigenvalue spectrum of the one-speed neutron diffusion operator. Thus, returning to Eq. 8.65 and "looking for solutions" of exponential form in time (or, equivalently, separating variables as in Section 8.4.1 above), $\bar{\Phi}(\mathbf{r},t) = a\varphi(\mathbf{r})e^{\lambda t}$, one immediately arrives at

$$\lambda\,\varphi(\mathbf{r}) = \overline{vD}\,\nabla^{2}\varphi(\mathbf{r}) - \overline{v\Sigma_a}\,\varphi(\mathbf{r}) \equiv A_{\mathrm{Diff}}\,\varphi(\mathbf{r}), \qquad (8.72)$$
where $A_{\mathrm{Diff}} \equiv \overline{vD}\,\nabla^{2} - \overline{v\Sigma_a}$ is the one-speed diffusion operator, and the $\lambda$'s that result from the related eigenvalue equation

$$(A_{\mathrm{Diff}} - \lambda)\,\varphi = 0, \qquad (8.73)$$

are the so-called time-eigenvalues $\lambda = -\alpha_0, -\alpha_1, -\alpha_2, \ldots$ (the same $\alpha$'s that appear above in Eq. 8.68). Since it is well known that the Laplacian operator $\nabla^2$ has an infinite set of eigenvalues that are real, negative, isolated (a finite distance apart from each other) and countable, and because multiplication by $\overline{vD}$ does not change any of these properties and $-\overline{v\Sigma_a}$ just shifts every eigenvalue to a more negative value, the time-eigenvalue "spectrum" or eigenvalue "spectrum" of the one-speed diffusion operator – or the point spectrum, written as $\sigma_P(A_{\mathrm{Diff}})$ – is just $\lambda_n = -\alpha_n$, $n = 0, 1, 2, \ldots$ (see Fig. 8.2).
Fig. 8.2 Sketch of the eigenvalue spectrum of the one-speed diffusion operator: the isolated real points $-\alpha_0, -\alpha_1, -\alpha_2, -\alpha_3, \ldots$ on the negative real axis of the complex $\lambda$-plane
Now, upon the introduction of the velocity-dependent Boltzmann equation to
describe the transport of the neutrons more precisely, this mathematical result
changes quite significantly. Starting from the time- and velocity-dependent transport equation for the neutron flux $\Psi(\mathbf{r},\mathbf{v},t)$ in a homogeneous medium,

$$\frac{\partial}{\partial t}\Psi(\mathbf{r},\mathbf{v},t) = -\mathbf{v}\cdot\nabla\Psi(\mathbf{r},\mathbf{v},t) - v\Sigma_T(v)\,\Psi(\mathbf{r},\mathbf{v},t) + \int d^3v'\;v'\Sigma_s(\mathbf{v}'\!\to\mathbf{v})\,\Psi(\mathbf{r},\mathbf{v}',t), \qquad (8.74)$$
and again “looking for solutions” of exponential form in time (or, equivalently,
separating variables – more precisely in this case “trying” to separate variables)
$\Psi(\mathbf{r},\mathbf{v},t) = \psi(\mathbf{r},\mathbf{v})\,e^{\lambda t}$, leads directly to

$$\lambda\,\psi(\mathbf{r},\mathbf{v}) = -\mathbf{v}\cdot\nabla\psi(\mathbf{r},\mathbf{v}) - v\Sigma_T(v)\,\psi(\mathbf{r},\mathbf{v}) + \int d^3v'\;v'\Sigma_s(\mathbf{v}'\!\to\mathbf{v})\,\psi(\mathbf{r},\mathbf{v}') \equiv A_{\mathrm{Trans}}\,\psi(\mathbf{r},\mathbf{v}), \qquad (8.75)$$
where $A_{\mathrm{Trans}}$ is the velocity-dependent transport operator and the $\lambda$'s for which there are bounded solutions to the resulting corresponding homogeneous equation

$$(A_{\mathrm{Trans}} - \lambda)\,\psi = 0, \qquad (8.76)$$
are the time-eigenvalues of the velocity-dependent transport equation. But the situation here is not so simple as it was in the case of the one-speed diffusion operator. In addition to real, negative, isolated discrete eigenvalues – which belong to the point spectrum of the transport operator, $\sigma_P(A_{\mathrm{Trans}})$ – the transport operator has a (non-empty) continuous spectrum – written as $\sigma_C(A_{\mathrm{Trans}})$ – and this makes the problem "much more interesting!" Very loosely speaking, the continuous spectrum of an operator provides additional "eigenstuff" (usually a continuously distributed curve or area in the complex $\lambda$-plane or "spectral plane") which leads, in the eigenfunction expansion – or more precisely, the spectral expansion – of the solution to the related nonhomogeneous operator equation, to a (continuous) integral over the boundary of
the continuous spectrum in addition to the (discrete) summation over the point spectrum, i.e., over the (discrete) eigenvalues. (An example, familiar to many physicists
because it arises in elementary quantum mechanics, is the (continuous) integral over
the unbound states in addition to the (discrete) summation over bound states in the
eigenstate expansion of the wave function in the quantum mechanical scattering of a
particle by a potential well of finite depth and height.² This familiar example simply
corresponds to the integral over the continuous spectrum plus the summation over
the discrete spectrum of the Schrödinger operator.)
At this point, before discussing the continuous spectrum of the Boltzmann operator and its implications for the time decay of the neutron flux in a pulsed neutron
experiment, a very short review of the definitions of the components of the spectrum
of a linear operator might be helpful – in that it should make that discussion somewhat easier to follow, and therefore clearer and more informative. Starting from a
general nonhomogeneous linear operator equation
$$(A - \lambda)f = S, \qquad (8.77)$$

and the associated normed linear vector space $X$ (Banach space) of which $S$ is an element and in which the solution $f$ is sought, it is helpful here to restrict $X$ to be a Hilbert space since that is the space in which essentially all the early spectral analyses of the linear Boltzmann operator were done. In this space $H$ – which is also the Lebesgue space $L^2$ of (Lebesgue) square-integrable functions – the norm $\|f\|$ of a function $f$ is defined as the square root of the complex inner product of $f$ with itself,

$$\|f\| = (f,f)^{1/2}. \qquad (8.78)$$
Here, the inner product of two functions $f$ and $g$ in the space is given by

$$(f,g) = \int_{D} dx\,\bar{f}(x)\,g(x), \qquad (8.79)$$

where $x$ is the vector-valued independent variable (for a multidimensional problem) and it belongs to $D$, the domain over which $f(x)$ and $g(x)$ (and $S(x)$) are defined; and the overbar indicates the complex conjugate. Hence, the norm (or "length") of $f$ satisfies

$$\|f\|^2 = \int_{D} dx\,|f(x)|^2, \qquad (8.80)$$

which, of course, must be finite for $f(x)$ to belong to $L^2$.
² For example see: Schiff, L. I., Quantum Mechanics, 3rd Edition. 1968, New York: McGraw-Hill; or Messiah, A., Quantum Mechanics, Volume I. 1958, Amsterdam, The Netherlands: North Holland; or Dirac, P. A. M., Principles of Quantum Mechanics, 4th Edition. 1958, Oxford: Oxford University Press.
Now that a function space has been introduced, the definitions of the components of the spectrum of the linear operator $A$ – written as $\sigma(A)$ – can be clearly stated.

1. The point spectrum $\sigma_P(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ of the operator $(A-\lambda)$ does not exist.
2. The continuous spectrum $\sigma_C(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ of the operator $(A-\lambda)$ exists but is unbounded. That is, the values of $\lambda$ for which $(A-\lambda)^{-1}$ operating on an $S \in L^2$ gives a function $\hat{f}$ outside the space $L^2$ – or, more mathematically, the values of $\lambda$ for which the domain of $(A-\lambda)^{-1}$ is $L^2$ but its range is outside $L^2$.
3. The residual spectrum $\sigma_R(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ exists but is defined only on a domain that is not dense in $L^2$. For most linear operators that describe physical problems the residual spectrum is empty.
4. The (total) spectrum $\sigma(A) = \sigma_P(A) \cup \sigma_C(A) \cup \sigma_R(A)$, and the three components are mutually disjoint (their pairwise intersections equal $\varnothing$, the null set).
5. The complement of the spectrum of $A$ in the complex plane, $\rho(A) = \mathbb{C}\setminus\sigma(A)$, is called the resolvent set. For values of $\lambda \in \rho(A)$, the inverse of Eq. 8.77 exists and is bounded; hence, it maps known nonhomogeneous terms $S(x) \in L^2$ to solutions $f(x) \in L^2$. That is, the resolvent set $\rho(A)$ is the set of values of $\lambda$ for which the nonhomogeneous equation, Eq. 8.77, has (bounded) solutions $f(x) \in L^2$ for "source" terms $S(x) \in L^2$.
Before ending this short informal review of the definitions related to the generalization of the "eigenvalue" spectrum of a linear operator, two final points should be made. The first, a minor one in practice, is that elements belonging to the point spectrum, $\lambda \in \sigma_P(A)$, need not be discrete points; they can be continuously distributed in the complex $\lambda$-plane. Conversely, elements belonging to the continuous spectrum, $\lambda \in \sigma_C(A)$, need not be continuously distributed; they can be discrete points in the complex $\lambda$-plane. (Examples in which these two somewhat counter-intuitive situations arise are given in many introductory texts on functional analysis and even in [41, 43]. More complete and more rigorous discussions of the spectral theory of linear operators are available in many textbooks on functional analysis [43].)
The second of these two final points is a little more important – or at least a little
more useful – because it is helpful in developing an intuitive interpretation of the
definitions just summarized. If the above equation, Eq. 8.77, results from an initial-value problem for a time-dependent equation,

$$\frac{\partial \hat{f}}{\partial t}(x,t) = (A\hat{f})(x,t) + \hat{S}(x,t), \qquad (8.81)$$

with initial condition $\hat{f}(x,0) = \hat{f}_0(x)$ – as is the case when the Boltzmann equation, Eq. 8.74, is used to describe the pulsed neutron experiment which, of course, is the circumstance that initially motivated this brief review – and a Laplace transform, with which most applied mathematicians, physicists, and engineers are quite comfortable, is introduced to solve it, the result is

$$(A - s)f_s = -S_s, \qquad (8.82)$$
where $s$ is the new variable that was introduced by the Laplace transform with respect to time, $f_s = f(x,s)$ is the Laplace transform of $\hat{f}(x,t)$, and $S_s = S(x,s) + \hat{f}_0(x)$. Clearly, this equation is formally the same as Eq. 8.77. Now if $f_s(x)$ is obtained by inverting $(A-s)$, and then the inverse Laplace transform is applied to obtain the solution for $\hat{f}(x,t)$ and the Bromwich contour is deformed, the only contributions to the solution will be from the singularities of $f_s(x)$ in the complex $s$-plane. These singularities occur where $s$ equals a value of $\lambda$ that belongs to $\sigma(A)$. This means that contributions to $\hat{f}(x,t)$ will be in the form of a sum of residues associated with $\sigma_P(A)$ plus a contour integral around $\sigma_C(A)$, assuming $\sigma_P(A)$ comprises discrete points (the usual case of discrete eigenvalues), $\sigma_C(A)$ comprises one or more continuously distributed sets of points (also the usual case when it is not empty), and $\sigma_R(A)$ is empty (the usual case in physically motivated problems). This solution, then, is clearly equivalent to the spectral representation of the solution to Eq. 8.77, i.e., the "generalized" eigenfunction expansion of that solution in terms of a discrete summation of the discrete eigenfunction components over $\sigma_P(A)$ plus the continuous contour integration (or set of such integrations) of the continuously distributed "singular" eigenfunction components over the boundary of $\sigma_C(A)$.
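Before returning to the transport operator itself, a standard concrete illustration – a textbook example included here for orientation, not one taken from the analyses cited below – shows how a continuous spectrum arises from nothing more than multiplication by a function:

```latex
% Standard illustration (not from the cited analyses): consider on L^2(0,\infty)
% the multiplication operator representing collisions alone,
%   (A f)(v) = -\,v\Sigma_T(v)\,f(v),
% i.e., Eq. 8.75 with the streaming and scattering terms switched off, and
% assume v\Sigma_T(v) takes each of its values only on a set of measure zero.
% For \lambda in the closure of the range of -v\Sigma_T(v), say
% \lambda = -v_0\Sigma_T(v_0), the formal inverse
\[
  \left[(A-\lambda)^{-1} g\right](v) \;=\; \frac{g(v)}{-\,v\Sigma_T(v) - \lambda}
\]
% exists but is unbounded, because the denominator vanishes at v = v_0, and
% there is no L^2 eigenfunction (the formal eigenfunction is a Dirac delta
% concentrated at v_0).  Hence
\[
  \sigma_P(A) = \varnothing, \qquad
  \sigma_C(A) = \overline{\left\{\, -v\Sigma_T(v) \;:\; 0 < v < \infty \,\right\}}\,.
\]
% The scattering integral in Eq. 8.75 adds a smoothing term that can place
% isolated point eigenvalues -\alpha_n to the right of this set; that is the
% structure encountered below for the velocity-dependent operator (cf. the
% singular region sketched in Fig. 8.4).
```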
Now returning to the Boltzmann equation description of the pulsed neutron experiment, there are two seminal papers to which it is essential to refer here. The
first – which in some ways represents a mathematical perspective – is by R. Lehner
and G. Milton Wing [44], not surprisingly both mathematicians, and the second –
which more or less reflects a physicist’s point of view – is by Noel Corngold [45],
a physicist. Lehner and Wing studied the time-evolution of a pulse of neutrons using the one-speed transport equation in one-dimensional infinite-slab geometry with
vacuum boundary conditions (zero incoming angular flux on the two infinite plane
surfaces). Using the techniques of functional analysis and the spectral theory of linear operators, they proved that the spectrum of the one-speed transport operator in
this geometry comprised a point spectrum – made up of a countable number of discrete eigenvalues with negative real parts, the algebraically largest of which was real and isolated, and the associated single eigenfunction was everywhere real and positive (a physically essential property for the fundamental space-angle mode) – plus a continuous spectrum that occupies the entire half-plane $\operatorname{Re}\lambda \le -v\Sigma_T$, where $v\Sigma_T$ is the energy-spectrum-averaged value of the total collision frequency in the one-speed transport theory description they employed (see sketch of the spectral plane – the $\lambda$-plane – in Fig. 8.3). Moreover, they also proved that, as the width of the slab is decreased, all the (discrete) eigenvalues move to the left, and for a sufficiently thin slab (but one still of finite width) the last eigenvalue, $-\alpha_0$, disappears into the continuous spectrum $\operatorname{Re}\lambda \le -v\Sigma_T$.
The implications of this last result were profound. For a pulsed neutron experiment performed in a sufficiently thin slab, there would not exist a fundamental
space-angle eigenmode; moreover, the time-asymptotic die-away of the neutron
population would not be the exponential decay associated with a real, isolated,
discrete eigenvalue $-\alpha_0$, but rather it would be given by a non-exponential asymptotic time dependence that results from a contour integral along the edge of the continuous spectrum at $\operatorname{Re}\lambda = -v\Sigma_T$. But these results were for the one-speed transport equation – which is not a particularly faithful representation of the thermal neutron population in a pulsed neutron experiment, especially in small assemblies in which, as already mentioned, a thermal spectral shift to lower energies occurs, due to the greater leakage of fast neutrons, resulting in diffusion cooling.

Fig. 8.3 Sketch of the spectrum of the one-speed transport operator in the complex $\lambda$-plane: the continuous spectrum $\sigma_C(A_{\mathrm{Trans}})$ fills the half-plane $\operatorname{Re}\lambda \le -v\Sigma_T$, and the discrete eigenvalues $-\alpha_0, -\alpha_1, -\alpha_2, \ldots$ lie on the negative real axis to its right
No doubt N. Corngold, whose photograph appears below, was at least partially motivated by concern about these limitations on the one-speed transport equation when he studied the full velocity-dependent Boltzmann equation, Eq. 8.74, in the context of the pulsed neutron experiment. In order to make the
treatment of the spatial operator $(\mathbf{v}\cdot\nabla)$ more manageable, he introduced the ansatz $\Psi(\mathbf{r},\mathbf{v},t) = \psi(B,\mathbf{v},t)\exp(i\mathbf{B}\cdot\mathbf{r})$ – to represent the spatial dependence of the neutron flux based on so-called image reactor theory [8], which also is simply a fundamental-buckling-mode approximation in the context of the transport equation. The result of this simplification is a form of the Boltzmann equation in which the spatial operator, $(\mathbf{v}\cdot\nabla)$, is replaced by a simple (complex-valued) multiplicative operator added to $v\Sigma_T(v)$,

$$\frac{\partial}{\partial t}\psi(B,\mathbf{v},t) = -\left[i\,\mathbf{v}\cdot\mathbf{B} + v\Sigma_T(v)\right]\psi(B,\mathbf{v},t) + \int d^3v'\;v'\Sigma_s(\mathbf{v}'\!\to\mathbf{v})\,\psi(B,\mathbf{v}',t). \qquad (8.83)$$
(Clearly, this equation also corresponds to the spatially Fourier-transformed Boltzmann equation – for a spatially infinite medium – but it was not treated as such, and the solution for $\psi(B,\mathbf{v},t)$ was not inverse Fourier transformed back from the $B$ variable to the $\mathbf{r}$ variable.) Treating this equation as an initial-value problem that describes the experiment, Corngold Laplace transformed it (with respect to time) to obtain

$$\left[s + i\,\mathbf{v}\cdot\mathbf{B} + v\Sigma_T(v)\right]\tilde{\psi}(B,\mathbf{v},s) = \int d^3v'\;v'\Sigma_s(\mathbf{v}'\!\to\mathbf{v})\,\tilde{\psi}(B,\mathbf{v}',s) + \psi(B,\mathbf{v},0), \qquad (8.84)$$
Professor Noel Corngold of Caltech (courtesy of Professor Noel Corngold)
and proceeded with tender loving care. (No heavy-duty functional analysis and spectral theory of linear operators here! Ah! Different ships, different long splices!) Upon inverting the Laplace transformed solution $\tilde{\psi}(B,\mathbf{v},s)$ by deforming the Bromwich contour in the complex $s$-plane, he arrived at a sum of contributions due to residues at isolated poles $s_n$ – equivalent to eigenvalues $\lambda_n = -\alpha_n$ – plus a contribution due to a contour integral along the boundary of the region $s = -i\,\mathbf{v}\cdot\mathbf{B} - v\Sigma_T(v)$, which arises because it corresponds to $[s + i\,\mathbf{v}\cdot\mathbf{B} + v\Sigma_T(v)] = 0$ and this quantity appeared in the denominator of the transformed solution. These contributions are indicated schematically in the sketch of the $s$-plane in Fig. 8.4. He also showed that when the buckling $B^2$ becomes sufficiently large (the assembly becomes sufficiently small), all the poles $s = -\alpha_n$ – including $-\alpha_0$, associated with the fundamental velocity-mode – "disappeared" into the shaded region on the left of the contour integral along $s = -i\,\mathbf{v}\cdot\mathbf{B} - v\Sigma_T(v)$. Hence, these results on the velocity-dependent Boltzmann equation also showed that, for a sufficiently small assembly, the time-asymptotic behavior of the neutron flux in the pulsed neutron experiment would not be exponential. And, since the fundamental pole $-\alpha_0$ moves along the real axis, it crosses into the shaded region at $s = -\min[v\Sigma_T(v)]$; therefore, the upper bound on a discrete $\alpha_0$ was $\min[v\Sigma_T(v)]$, and exponential decay would not occur in assemblies of dimensions less than the dimension corresponding to this value. This result was sometimes cited as Corngold's $(v\Sigma)_{\min}$ theorem or Corngold's $(v\Sigma)_{\min}$ limit [7, 25, 26, 45]. Subsequent, more rigorous mathematical spectral analyses based on the full space- and velocity-dependent Boltzmann equation, largely building on the techniques introduced by Lehner and Wing, led to
essentially the same results [46, 47] as those reported by Corngold. Those analyses showed that the continuous spectrum occupied the region $\operatorname{Re}\lambda \le -\min[v\Sigma_T(v)]$ in the spectral plane – which the result reported by Corngold would approach in the large buckling (small assembly) limit. They also showed that for a sufficiently small assembly all the discrete eigenvalues moved into the continuous spectrum – hence, no exponential decay.
Alas, there was one "small" difficulty – some experimental values of the decay constant $\alpha_0$ that exceeded the $(v\Sigma)_{\min}$ limit had been reported! And notwithstanding the intimidation of all the heavy mathematics, some stalwart experimentalists, after carefully re-evaluating their data, firmly stood by the exponential decay – within experimental error – with $\alpha_0 > (v\Sigma)_{\min}$ that they observed, specifically in small heavy water assemblies and small beryllium assemblies. This apparent paradox was subsequently resolved by analysis using simple cross-section and scattering kernel models – that retained the salient physical features – in conjunction with the $\exp(i\mathbf{B}\cdot\mathbf{r})$ spatial ansatz. Because of the simple models used, the analysis – done using a simple Laplace transform in time – could be carried out rather explicitly, and it showed that for a detector response the continuous spectrum or singular region, shown shaded in Fig. 8.4, leads to a branch cut (see Fig. 8.5), and that, for a sufficiently small system (large $B^2$), where the fundamental eigenvalue $-\alpha_0$ (pole in the Laplace transformed solution) has passed through the branch point at $-(v\Sigma)_{\min}$, it bifurcates, and the two poles that are born in this bifurcation move onto the adjacent sheets of the Riemann surface (see Fig. 8.5). The resulting time dependence for the detector response, given by the integral around the branch cut, is very close to exponential for a long time (but not as $t$ goes to infinity) if the bifurcated poles are very close to the branch cut [48]. Of course, if this pseudo-exponential decay is very close to exponential and if it lasts for a long time, it would be observed in a pulsed experiment as exponential decay – since, by the time (slower) non-exponential behavior would appear, the count rate would most likely be so low as to be indistinguishable from the background count in the experiment. A separate, not unrelated analysis, based on a more specialized cross-section model for a crystalline material with a Bragg peak (e.g., beryllium), gave an earlier, somewhat different explanation of exponential decay rates above $(v\Sigma)_{\min}$ in crystalline materials [49].
Fig. 8.4 Sketch of the complex $s$-plane for the velocity-dependent transport operator in the image reactor theory approximation: discrete poles at $s = -\alpha_0, -\alpha_1, -\alpha_2, \ldots$ on the real axis, to the right of the singular region bounded by $s = -i\,\mathbf{v}\cdot\mathbf{B} - v\Sigma_T(v)$, whose rightmost point is $s = -\min[v\Sigma_T(v)]$
Fig. 8.5 Sketch of the complex $s$-plane for the energy-dependent diffusion operator: a branch cut emanating from the branch point at $-(v\Sigma)_{\min}$, with the bifurcated poles lying just off the cut on adjacent sheets of the Riemann surface
Many important theoretical and experimental problems in neutron thermalization related to thermal reactor physics – not just those related to pulsed neutron
experiments – were put to rest during those years. Cross-section measurements and
time-of-flight measurements of energy and angular distributions in thermal neutron
scattering experiments, combined with the development of theoretical models for
slow-neutron scattering, led to very much improved cross-section and scattering
kernel data for use in reactor design and analysis. (See [24] for a very nice summary of some of this research written shortly after the research was completed.)
These data were the antecedents (and in many cases the origins) of the Evaluated
Nuclear Data Files (ENDF), and many of them are still used today. These were exciting times! They were great fun! And some really good science and engineering
was done along the way! And the present author was thrilled to be part of it – albeit
only at the very end.
Like great wines, some data improves with age! (And even becomes perfect!) Before leaving the subject of pulsed neutron experiments in thermal systems – now so
distant in time, but still so close to my heart – a personal anecdote seems irresistible
(and perhaps even justified, since it was part of the original oral presentation of this
material). Figure 8.6 from [50] shows experimental values (all well below $(v\Sigma)_{\min}$) of the decay constant in pulsed neutron experiments in light water assemblies along with dashed and solid curves that represent the calculated values of $\alpha_0$ based on the Nelkin model [51] and its slight improvement, the so-called anisotropic model [52]. It is clear that the data point for the smallest experimental system – actually a ping-pong ball into which water was injected using a hypodermic needle – does not lie
on the solid curve. And in the original graph given to the draftsman (by me), it was
even oh so slightly further from the curve. (It took a few iterations to get where it
is on this figure. After that I gave up!) Thus, Fig. 8.6 shows the initial improvement
of that data point with age. But the real improvement was yet to come – when, soon after [50] appeared, the textbook by G. Bell and S. Glasstone [7] was published with an apparently redrafted version of this figure in it, on which this data point lies smack dead center on the solid curve! "Like great wines. . . !"

Fig. 8.6 Experimental points and theoretical curves (the Nelkin model [51] and the anisotropic model [52]) for the decay constant $\alpha_0$ (in units of $10^4\ \mathrm{s}^{-1}$) in spheres of water as a function of their radius $R$ (cm) (with permission from the American Nuclear Society, Copyright July 1968 by the American Nuclear Society, La Grange, IL)
8.4.3 Exponential and Non-exponential Decay in Subcritical
Fast Multiplying Assemblies
The simplest description of a pulsed neutron experiment performed in a subcritical, multiplying assembly rather than a non-multiplying system is again the time-dependent one-speed diffusion equation, Eq. 8.65, but with $\overline{v\Sigma_a}$ replaced by $\left[1-(1-\beta)k_\infty\right]\overline{v\Sigma_a}$. Then the same simple steps reviewed at the beginning of Section 8.4.1 lead to

$$\alpha_0 = \overline{v\Sigma_a}\left[1-(1-\beta)k_\infty\right] + \overline{vD}\,B^2, \qquad (8.85)$$

from which it immediately follows that

$$k_{\mathrm{eff}} = \frac{k_\infty}{1+L^2B^2} = \frac{1 - \alpha_0\,\dfrac{\ell_\infty}{1+L^2B^2}}{1-\beta} = \frac{1-\alpha_0\,\ell_0}{1-\beta}, \qquad (8.86)$$
where, as indicated below Eq. 8.6, $\ell_\infty = 1/\overline{v\Sigma_a}$, etc., and the overbar has been added here as a reminder that the $v\Sigma_a$ which appears in the one-speed equation has been averaged over the neutron energy spectrum. It is clear from Eq. 8.86 that a pulsed neutron experiment in a subcritical multiplying assembly is an integral experiment that yields the effective multiplication constant $k_{\mathrm{eff}}$ for the assembly; and from Eq. 8.85 that a sequence of such experiments on successively larger assemblies (smaller values of $B^2$) gives $k_\infty$ for the system, provided $\overline{v\Sigma_a}$ and $\beta$ are known.
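As a numerical illustration of Eq. 8.86, the reduction of a measured decay constant to $k_{\mathrm{eff}}$ is a one-line computation; the parameter values below are assumed round numbers, not data from any particular assembly.

```python
# Illustrative evaluation of Eq. 8.86 with assumed round-number parameters.
v_sigma_a = 4.0e3    # spectrum-averaged absorption frequency [1/s]   (assumed)
L2        = 8.0      # diffusion area L^2 [cm^2]                      (assumed)
B2        = 0.01     # geometric buckling [cm^-2]                     (assumed)
beta      = 0.0065   # total delayed neutron fraction                 (assumed)
alpha0    = 2.0e3    # measured fundamental decay constant [1/s]      (assumed)

ell_inf = 1.0 / v_sigma_a              # l_infinity = 1/(v*Sigma_a)  [s]
ell_0   = ell_inf / (1.0 + L2 * B2)    # finite-medium neutron lifetime [s]
k_eff   = (1.0 - alpha0 * ell_0) / (1.0 - beta)     # Eq. 8.86
k_inf   = k_eff * (1.0 + L2 * B2)
print(f"k_eff = {k_eff:.4f}, k_inf = {k_inf:.4f}")
```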
Not unexpectedly, when the velocity-dependent Boltzmann equation is used to
describe even a subcritical thermal multiplying assembly, all the subtleties summarized in Section 8.4.2 above arise. And in a subcritical fast multiplying assembly
they are more pronounced. This is because, unlike in a thermal assembly in which
the up-scattering of the thermal neutrons provides an energy-spectrum regeneration
mechanism (in addition to that which results from the fission process) that tends to
lead to the establishment of a collective energy mode, in a fast assembly, in which
only down-scattering occurs, the resulting downward shift of the energy spectrum
that occurs continuously in time does not lead to such a mode. Only when a fast
assembly is quite close to critical, and the fission regeneration is dominant over the
slowing down, is a persistent (decaying) collective energy mode established.
From the mid-1960s to the mid-1970s, when various Western European countries, the Soviet Union, Japan, and the United States were developing plans to build
fast reactors, pulsed neutron experiments were done as integral experiments to measure keff and other integral parameters, and to verify computational capabilities and
models. Because they required considerably more financial and other investment
(subcritical fast multiplying assemblies with plutonium or enriched uranium fuel)
than non-multiplying thermal assemblies (e.g., spheres of water or graphite stacks)
a limited number of these pulsed neutron experiments were done.
Experimental data from two such experiments [53], done at the SUAK facility in
the mid-1960s at the Kernforschungszentrum in Karlsruhe, Germany, are shown in
Fig. 8.7 (adapted from [54]). The die-away data for SUAK-B – the larger, closer to
critical assembly – clearly is exponential; whereas that for SUAK-A – the smaller
farther subcritical assembly – clearly is not. Both sets of data initially appear to be exponential (upper portions of the straight lines); however, while the data for SUAK-B remain on the straight line at later times, those for SUAK-A depart from the straight line, indicating slower-than-exponential decay at long times. This corresponds to pseudo-mode decay in which the die-away appears to be exponential for a finite time but then becomes slower at later times when the pseudo-mode ceases to be sustained and collapses. Because the $(v\Sigma)_{\min}$ limit and measurements of related non-exponential die-away in thermal non-multiplying assemblies were well known by the time these experiments in fast assemblies were carried out, the non-exponential die-away in SUAK-A was not a surprise – and some experimentalists were, indeed, very well prepared for it.
Similarly, some theorists also were prepared to do the analysis. Because of the
fast, rather than thermal, energy spectra in these assemblies – and especially because of the continuously downward-shifting fast spectrum in the pseudo-mode
die-away – a one-speed diffusion theory description certainly was inadequate.
Fig. 8.7 Experimental die-away curves (detector counts vs time in μs) for the two SUAK assemblies: A (25 elements, $k_{\mathrm{eff}} = 0.78$, $1/\alpha = 122$ ns) and B (36 elements, $k_{\mathrm{eff}} = 0.869$, $1/\alpha = 230$ ns) (adapted from [54])
A speed-dependent (or energy-dependent) representation was essential. And, since
these fast assemblies were larger (even in mean free paths) than the most interesting
thermal assemblies in previous studies, a speed-dependent diffusion theory description seemed adequate. Whether this description or a speed-dependent transport theory description is used, it is simple and convenient to exploit the separability of the kernel in the Fredholm integral associated with the fission process. Starting from the speed-dependent diffusion equation with isotropic scattering (slowing down) in the center-of-mass system, introducing the ansatz $\Phi(\mathbf{r},v,t) = \varphi(B^2,v,t)\exp(i\mathbf{B}\cdot\mathbf{r})$, and Laplace transforming from time to the variable $s$ leads immediately to

$$\left[s + vD(v)B^2 + v\Sigma_T(v)\right]\tilde{\varphi}(B^2,v,s) - \int_v^\infty dv'\;v'\Sigma_s(v'\!\to v)\,\tilde{\varphi}(B^2,v',s)$$
$$= (1-\beta)\,\chi(v)\int_0^\infty dv'\;v'\Sigma_f(v')\,\tilde{\varphi}(B^2,v',s) + \tilde{\varphi}(B^2,v,0) \equiv S(B^2,v,s), \qquad (8.87)$$
where $\tilde{\varphi}(B^2,v,s)$ depends only on $B^2$, not $\mathbf{B}$, and $\tilde{\varphi}(B^2,v,0)$ is the initial condition.
Now, introducing the Green's function of the associated slowing-down equation,

$$\left[s + vD(v)B^2 + v\Sigma_T(v)\right]\tilde{G}(v|v_0;B^2,s) - \int_0^\infty dv'\;v'\Sigma_s(v'\!\to v)\,\tilde{G}(v'|v_0;B^2,s) = \delta(v-v_0), \qquad (8.88)$$
which is simply the textbook slowing-down equation with $v\Sigma_T(v)$ augmented by $s + vD(v)B^2$, the solution to Eq. 8.87 then can be written in terms of this Green's function as

$$\tilde{\varphi}(B^2,v,s) = \int_0^\infty dv_0\,\tilde{G}(v|v_0;B^2,s)\,S(B^2,v_0,s)$$
$$= (1-\beta)\int_0^\infty dv_0\,\tilde{G}(v|v_0;B^2,s)\,\chi(v_0)\,\tilde{\rho}(B^2,s) + \int_0^\infty dv_0\,\tilde{G}(v|v_0;B^2,s)\,\varphi(B^2,v_0,0), \qquad (8.89)$$
where the Laplace transformed fission neutron production rate

$$\tilde{\rho}(B^2,s) \equiv \int_0^\infty dv\;v\Sigma_f(v)\,\tilde{\varphi}(B^2,v,s) \qquad (8.90)$$
has been introduced. Then multiplying Eq. 8.89 by $v\Sigma_f(v)$ and integrating with respect to $v$ from 0 to $\infty$ yields a formal solution for the transformed fission neutron production rate, or the transform of the time dependence of a fission detector,

$$\tilde{\rho}(B^2,s) = \frac{Q(B^2,s)}{1 - K(B^2,s)} = \frac{Q(B^2,s)}{D(B^2,s)}, \qquad (8.91)$$
where

$$Q(B^2,s) = \int_0^\infty dv\;v\Sigma_f(v)\int_0^\infty dv_0\,\tilde{G}(v|v_0;B^2,s)\,\varphi(B^2,v_0,0), \qquad (8.92)$$

$$K(B^2,s) = (1-\beta)\int_0^\infty dv\;v\Sigma_f(v)\int_0^\infty dv_0\,\tilde{G}(v|v_0;B^2,s)\,\chi(v_0), \qquad (8.93)$$
and $D(B^2,s) = 1 - K(B^2,s)$ – sometimes referred to as the dispersion relation, since its zeros are poles of $\tilde{\rho}(B^2,s)$ that relate the discrete time decay constants $\alpha_n$ to $B^2$ – has been introduced. It follows from this simple development that, if the slowing-down equation, Eq. 8.88, can be solved for $\tilde{G}(v|v_0;B^2,s)$, an expression for the time-dependent response of a fission detector, $\rho(B^2,t)$, in the experiment can be obtained via the inverse Laplace transform of Eq. 8.91. This was done using model cross sections and, first, a synthetic (hydrogen-like) slowing-down kernel [55], and then the exact elastic slowing-down kernel [56, 57]. The key results, which were
essentially the same for the two studies, showed that the expressions for $\tilde{\rho}(B^2,s)$ had branch points at $s_{bp} = -\left[v\Sigma_T(v) + vD(v)B^2\right]_{\min}$ and poles $s_n(B^2)$ – at the zeros of $D(B^2,s)$ – and that as $B^2$ is increased – i.e., the assembly becomes smaller and farther subcritical – the last pole $s_0(B^2)$ "disappears" into the branch point. Prior to this there is a time-asymptotic exponentially decaying solution associated with the real, negative, isolated pole $s_0(B^2) = -\alpha_0(B^2)$. When that pole "disappears," as $B^2$ is increased through a critical value $B_*^2$, it disappears only from the principal sheet of the Riemann surface for $\tilde{\rho}(B^2,s)$. Actually, the pole bifurcates, as $B^2$ is increased through $B_*^2$, resulting in a complex conjugate pair of poles $s_0^{\pm}(B^2)$, which were obtained by analytically continuing the expression for $\tilde{\rho}(B^2,s)$ in the counter-clockwise (+) and clockwise (−) directions around the branch point $s_{bp}$ (see Fig. 8.8). The inverse Laplace transform is then given by a contour integral (path shown as a dashed line in Fig. 8.8) that results from deforming the original integral along the Bromwich contour. When the integrand of this contour integral was expanded, it led to a time dependence, for the detector response, of the form

$$\rho(B^2,t) \sim \exp\left[-\left|s_R^{+}\right|t + \tfrac{1}{2}\left(s_I^{+}\right)^2 t^2\right] \qquad (8.94)$$

on an intermediate time scale. Here, $s_R^{+}$ is the (negative) real part of the complex conjugate poles on the +1 and −1 Riemann sheets and $s_I^{+}$ is the (very small) imaginary part.
Fig. 8.8 The Riemann surface with the trajectories of the bifurcated poles onto the adjacent (+1 and −1) sheets; the dashed curve represents the deformed contour of the Laplace inversion integral (adapted from [54])
If $\alpha(t)$ is defined as $\alpha(t) \equiv -\dot{\rho}(t)/\rho(t)$, then on the intermediate time scale when Eq. 8.94 applies, it will be given by $\alpha(t) \approx |s_R^{+}| - (s_I^{+})^2\,t$, and $|s_R^{+}|$ will be the y-intercept of this local curve – an inclined plateau – and $-(s_I^{+})^2$ will be its slope. (See sketches of (a) $\log\rho(B^2,t)$ vs $t$ and (b) $\alpha(t) \equiv -\dot{\rho}(B^2,t)/\rho(B^2,t)$ vs $t$ in Fig. 8.9, adapted from [54].) Experimental data [58] comprising such an inclined plateau, indicating the existence of a pseudo-mode and quasi-exponential die-away on a 10–40 μs time scale in a 13 × 14 element EURECA subcritical fast multiplying assembly, are shown in Fig. 8.10a (adapted from [54]). Analogous data [58] are shown in Fig. 8.10b (adapted from [54]) for a smaller, farther subcritical, 9 × 9 EURECA assembly, where the inclined plateau is less well-defined and much shorter-lived – 4–7 μs [54].
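Although the quantitative analysis requires the Green's function and analytic continuation described above, the qualitative mechanism – a fundamental time eigenvalue crowding into the continuum edge as the assembly shrinks – can be seen in a crude discretized analogue. The Python sketch below builds a small multigroup matrix model with down-scattering only and entirely invented cross sections (it is not the model of [55–57], nor SUAK or EURECA data) and tracks the dominant eigenvalue against the discretized continuum edge as $B^2$ grows.

```python
import numpy as np

# Deliberately crude multigroup caricature of a bare fast assembly with
# down-scattering only; all numbers are invented, for illustration only.
G = 40
v = np.logspace(0.0, 1.0, G)              # group speeds, ascending (arb. units)
sig_s = 0.30 * np.ones(G)                 # scattering (one group down per collision)
sig_a = 0.05 / np.sqrt(v)                 # absorption, roughly 1/v
sig_f = 0.4 * sig_a                       # fission
nu, beta = 2.5, 0.0065
sig_T = sig_s + sig_a + sig_f
D = 1.0 / (3.0 * sig_T)
chi = np.exp(-((np.arange(G) - (G - 1)) ** 2) / 30.0)   # fission births near top
chi /= chi.sum()

S = np.zeros((G, G))                      # down-scatter transfer matrix
for g in range(G - 1):
    S[g, g + 1] = v[g + 1] * sig_s[g + 1] # from group g+1 (faster) into group g
F = (1.0 - beta) * nu * np.outer(chi, v * sig_f)        # prompt fission production

for B2 in (0.0, 0.05, 0.10, 0.15):
    M = -np.diag(v * sig_T + v * D * B2) + S + F
    s0 = np.linalg.eigvals(M).real.max()                # fundamental time eigenvalue
    edge = -(v * sig_T + v * D * B2).min()              # discretized continuum edge
    print(f"B^2 = {B2:.2f}: s0 = {s0:9.4f}, edge = {edge:9.4f}, gap = {s0 - edge:.4f}")
```

As $B^2$ increases, the gap between the discrete eigenvalue and the edge shrinks; in the continuous-velocity problem this is precisely the pole being driven into the branch point.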
Fig. 8.9 (a) Sketch of the metastable (pseudo-mode) decay of a detector response, $\log\rho_f$ vs $t$. (b) Sketch of the time dependence of the logarithmic derivative, $\alpha(t) = -\dot{\rho}_f/\rho_f$ (adapted from [54])

Fig. 8.10 (a) Experimental measurements of the "time-dependent" decay constant, $\alpha(t) \equiv -\dot{\rho}(B^2,t)/\rho(B^2,t)$ (in $\mu\mathrm{s}^{-1}$), vs $t$ (10–50 μs) for a EURECA 13 × 14 element assembly; the solid line is sketched in. (b) Similar data for a 9 × 9 EURECA assembly on a 1–12 μs scale (adapted from [54])

As priorities in fast reactor development shifted toward other physics problems, e.g., sodium void effects in liquid metal fast breeder reactors (LMFBRs), and toward
many pressing technological problems – and in general toward nuclear engineering
problems directly related to LMFBR design in France, Japan, the Soviet Union and
the United States, support waned for experiments of this type in bare fast subcritical assemblies; hence, the understanding of these and various other fundamental
problems related to fast reactor physics did not progress nearly as far as
the understanding that resulted from the earlier, more extensive studies of problems
related to thermal reactor physics. Perhaps the budding resurgence of interest in fast
reactors in France will change this, and put us on the path to developing a deep
understanding of pulsed neutron experiments in subcritical fast reactor assemblies,
and, therefore, a deep understanding of some important aspects of the kinetics of
fast reactors.
8.5 Space–Time Reactor Kinetics
In the early days of reactor development – the mid-1940s to the mid-1950s – when
high-speed, large digital computers did not operate at very high speeds, did not have very large memory capacities, and were not very widely available – reactor physics analysis and reactor kinetics analysis were largely done using basic theoretical techniques and hand calculations. (In fact, when I was a graduate student, I met
a computer programmer who told me his first job title had been “Computer.” Yes,
he was a Computer! Early in his career he had done numerical calculations – by
hand, and using a desk-top mechanical calculator – based on the expressions the
physicists gave to him (e.g., the solution to the two-group, one-dimensional, steady-state diffusion equations in slab geometry).) Then, when the early antecedents of
modern digital computers became available, e.g., in the US Navy’s nuclear reactor program, numerical solution techniques based on finite-difference schemes were
programmed for the one- and two-group steady-state diffusion equations in slab geometry and for the one-group steady-state P-1 and P-3 equations in slab geometry.
For some delightful reminiscences of that era the reader is strongly encouraged (dare
I say, “commanded!”) to see the text of a wonderful after-dinner talk given by the
late Dr. Ely Gelbard at an early American Nuclear Society Mathematics and Computation Division National Topical Meeting [59]. (I was there to enjoy it live, and
it was one of the most interesting and entertaining talks I have ever heard!) As time
passed, one-dimensional and later two-dimensional reactor kinetics codes based on
finite-difference schemes in space and time were developed using the few-group
and multigroup diffusion equations and the coupled delayed neutron precursor concentration equations. Some of these developments will be summarized briefly in
Section 8.5.1.
Because the memory capacity and speed of digital computers were still very
limited in the late 1950s and early 1960s, alternative methods to those based on
simple direct finite-difference schemes were developed. These included methods
based on variational functionals which were fairly widely used in reactor analysis
(again, especially in the US Navy’s nuclear reactor program) in both steady-state
reactor calculations and reactor kinetics calculations. Developed primarily during
the early 1960s, they gradually evolved by the late 1960s into so-called synthesis
methods in which the trial functions used in the variational formulations were the
numerical outputs of the computer-generated solutions to related simpler problems.
These methods were extensively developed and widely used in reactor analysis and
design at both of the US Navy nuclear laboratories: Bettis Atomic Power Laboratory
(BAPL) and Knolls Atomic Power Laboratory (KAPL). The basis for these methods
will be reviewed in Section 8.5.2, and anomalies that arise in their application will
be discussed briefly there.
The early 1970s saw the development of another general class of computational
methods for both reactor criticality calculations and reactor kinetics calculations.
Also motivated by the goals of more accurate numerical solutions and more realistic
representations of reactors using limited computer resources, they, to some extent,
represented a logical compromise between the decomposition of a reactor into numerical cells (or boxes, or computational elements) characteristic of finite-difference
methods and use of analytical solutions characteristic of variational methods. The
basic idea of these so-called coarse mesh methods – some of which later bore the
appellation nodal methods – was to decompose the reactor into computational elements or cells, as is done in finite-difference methods, and then rather than using
the linearly truncated Taylor series approximation to the solution within the cell –
as also is done in finite-difference schemes – introduce some more complicated, but
better analytical approximations to the solution – as is done in variational methods. The distinction, of course, was that the solution is not approximated by global
analytical functions over the whole reactor domain but rather by local analytical
functions within each cell or computational element. This typically results in more
accurate solutions – in comparison with those obtained using traditional finite difference methods – for a given cell size; or, conversely, it permits the use of larger, thus
fewer, cells to achieve a solution of given accuracy requirements. Thus, specified solution accuracy was achieved using both less computer memory and less computer
central processor unit (CPU) time on the mainframe computers of that era. Some
of these coarse mesh [60] and nodal [61–67] methods for the multigroup diffusion
equations, originally developed for criticality calculations, were soon afterwards
extended to space–time reactor kinetics applications [62, 64, 65, 67] and some were
also extended to the multigroup transport equations [68–73]. An early review of
coarse-mesh and nodal methods for the diffusion equation appeared in 1979 [69],
and an analogous review of nodal methods for the transport equation was written in
1985 [73]. Excellent, much more comprehensive, reviews of nodal methods for the
diffusion equation and a related fuel-assembly homogenization procedure [74], and
of nodal methods for the diffusion and transport equations [75] were published in
1986. The main ideas involved in the development of these coarse-mesh and nodal
methods will be presented in Section 8.5.3 in which some results obtained using
them will also be discussed.
The success of these coarse mesh and nodal methods in achieving very high
computational accuracy using very coarse meshes, or large computational elements
or cells – in some cases as large as a light water reactor (LWR) fuel assembly
or larger – led to the need for a homogenization procedure that could be used to
homogenize the fuel pins and associated lattice cells over a fuel assembly or a significant fraction thereof. One such homogenization procedure, called “equivalence
theory,” in which an exact transport theory solution is introduced as a hypothetical
solution on a computational element (fuel assembly) boundary, was introduced by
K. Koebke [76], and subsequently used extensively and extended by G. Greenman,
K. Smith and A. F. Henry [65] and K. Smith [74]. Others, based on asymptotic expansions, were developed by several authors over a period of years. Most of these other homogenization theories began with an asymptotic expansion of the solution to the transport equation for a lattice of fuel cells in powers of the small parameter $\varepsilon = \lambda/R$, where $\lambda$ is an appropriate mean free path and $R$ is a characteristic overall reactor dimension [77–81]. Clearly, this ratio is very small (of the order of $10^{-2}$) for a light water thermal reactor. One development, however, started from the asymptotic expansion of the solution to the diffusion equation in the small parameter $\varepsilon = L/R$, where $L$ is the diffusion length; hence, this parameter is equally small (also of the order of $10^{-2}$) for an LWR. The essential ingredients that enter into these homogenization theories will be discussed in Section 8.5.4, in which a few results obtained using them in conjunction with nodal methods will be summarized.
It is clear that, for different applications, it is appropriate to use different descriptions of reactor kinetics – the point reactor kinetics model, quasi-static models,
space–time multigroup diffusion theory, and in some cases even space–time multigroup transport theory. It is also clear that, in some complicated transients, it is necessary to use a more precise model, such as the time-dependent multigroup transport equations, to adequately represent the portion of the transient in which crucial local space–time variations of the neutron flux occur. However, it typically is very costly if such a precise computational representation is used throughout a long simulation of a transient. Thus, just as adaptive spatial grids and adaptive variable time steps are very effectively used in a whole host of computational physics and computational engineering problems – spatial grids in the neighborhood of shock-wave fronts in gas dynamics, refined time steps in phase transition problems, etc. – adaptive procedures that represent the evolution of the neutron flux during different epochs of a complex reactor transient by different models are an efficient way to proceed. Hence, this section on space–time reactor kinetics will close with a brief summary in Section 8.5.5 of two articles in which the development and application of precisely such an adaptive model procedure was reported [82, 83].
8.5.1 Finite-Difference Schemes for the Time-Dependent
Multigroup Neutron Diffusion Equations
The origins of numerical methods based on finite-difference schemes for the approximation of the solutions to differential equations go far back in time, as is
evidenced by the names some of these schemes bear – for example, the forward
Euler scheme and the backward Euler scheme, both of which are named after the
Swiss mathematician Leonhard Euler, who lived from 1707 to 1783. These schemes
were used extensively in “hand calculations” long before modern computers – or
even desk-top mechanical calculators – had arrived on the scene. They are very
simple and straightforward, mathematically well-understood, easy to program, and
very widely applicable; and they are particularly suitable for the solution of diffusion
equations. It is, therefore, not surprising that they were the first schemes employed
in serious attempts at the numerical solution of reactor physics equations, particularly the neutron diffusion equation.
In the 1950s and 1960s, and even now some 50 years later, most whole-core
thermal reactor calculations begin, not from the multigroup transport equations, but
rather from the few-group or multigroup diffusion equations,

$$\frac{1}{v_g}\frac{\partial\phi_g}{\partial t}(\mathbf{r},t) = \nabla\cdot D_g(\mathbf{r},t)\nabla\phi_g(\mathbf{r},t) - \Sigma_{a,g}(\mathbf{r},t)\,\phi_g(\mathbf{r},t) + \sum_{\substack{g'=1\\ g'\neq g}}^{G}\Sigma_{s,g'\to g}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t)$$
$$\qquad + (1-\beta)\,\chi_{0,g}\sum_{g'=1}^{G}\nu\Sigma_{f,g'}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t) + \sum_{i=1}^{I}\lambda_i\,\chi_{i,g}\,C_i(\mathbf{r},t) + Q_g(\mathbf{r},t), \quad g = 1,\ldots,G, \qquad (8.95)$$

and the coupled delayed neutron precursor equations,

$$\frac{\partial C_i}{\partial t}(\mathbf{r},t) = \beta_i\sum_{g'=1}^{G}\nu\Sigma_{f,g'}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t) - \lambda_i\,C_i(\mathbf{r},t), \quad i = 1,\ldots,I, \qquad (8.96)$$
for reactor kinetics calculations, and from the steady-state version of these equations
for reactor criticality calculations (k-calculations).
Due to the dramatic difference between the time scale on which the prompt neutrons arrive – of the order of the prompt neutron lifetime – and that on which the delayed neutrons arrive – of the order of the precursor delay times – these equations are stiff differential equations. Thus, the time-eigenvalues associated with the prompt neutrons are much greater in magnitude than those associated with the delayed neutrons, and because of this an explicit finite-difference scheme in time (such as the forward Euler scheme) would lead to an algorithm that would be numerically unstable in many applications. Hence, implicit schemes in time – such as the backward (implicit) Euler scheme (first order) or the Crank–Nicolson scheme (second order) – usually are used.
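The stiffness is easy to exhibit with a two-variable caricature of Eqs. 8.95–8.96 – one prompt "flux" amplitude and one precursor concentration – advanced with the θ-weighted scheme that is introduced below in Eqs. 8.100–8.101. All parameter values here are assumed round numbers, not those of any particular reactor.

```python
import numpy as np

# Two-variable caricature of Eqs. 8.95-8.96: dy/dt = A y, with y = (n, C).
Lam, beta, lam, rho = 1.0e-4, 0.0065, 0.08, -0.001   # assumed round numbers
A = np.array([[(rho - beta) / Lam, lam],
              [beta / Lam,        -lam]])
print("time eigenvalues:", np.sort(np.linalg.eigvals(A).real))  # ~ -75/s and ~ -0.011/s

def advance(theta, dt, T=2.0):
    """theta-scheme (cf. Eqs. 8.100-8.101): theta=1 forward Euler, theta=0 backward."""
    y = np.array([1.0, beta / (lam * Lam)])           # near-equilibrium precursors
    M = np.linalg.inv(np.eye(2) - (1 - theta) * dt * A) @ (np.eye(2) + theta * dt * A)
    for _ in range(int(T / dt)):
        y = M @ y
    return y[0]

dt = 0.05   # roughly 4x larger than the prompt time constant, 1/75 s
print("forward  Euler, dt = 0.05 s:", advance(theta=1.0, dt=dt))   # blows up
print("backward Euler, dt = 0.05 s:", advance(theta=0.0, dt=dt))   # stays bounded
```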
Because most thermal reactor core layouts are very amenable to Cartesian
geometry representation, the discussion that follows will use two-dimensional x–y
spatial coordinates. (The inclusion of the z-coordinate would be completely straightforward, but cumbrous.) Further, since there is no differential operator in the spatial variables in the precursor equations, and their treatment is therefore straightforward, they will be omitted from the discussion. Finally, since the linear algebraic
equations that result from applying difference schemes to the multigroup (and even
the few-group) diffusion equations comprise very large systems, which typically are solved via source iteration of the fission source, or the fission plus inscatter sources, the basic finite-difference scheme for the space–time solution is developed for just the g-th group equation. Hence, the essential part of Eqs. 8.95 and 8.96 that is relevant to this discussion is the time-dependent g-th group diffusion removal equation with a fixed source in x–y spatial coordinates,

$$\frac{1}{v_g}\frac{\partial\phi_g}{\partial t}(x,y,t) = D_g(t)\frac{\partial^2\phi_g}{\partial x^2}(x,y,t) + D_g(t)\frac{\partial^2\phi_g}{\partial y^2}(x,y,t) - \Sigma_{r,g}(t)\,\phi_g(x,y,t) + S_g(x,y,t), \qquad (8.97)$$
where the group removal cross section and diffusion coefficient have been taken to
be uniform in space to avoid tedious and possibly confusing details, and $S_g(x,y,t)$ represents the fission, inscatter, and precursor decay sources, along with any possible fixed sources. A simple central difference scheme applied to the second-order partial derivatives in $x$ and $y$ leads to

$$\frac{1}{v}\frac{d\phi_{j,k}}{dt}(t) = \frac{D(t)}{\Delta x^2}\left[\phi_{j+1,k}(t) - 2\phi_{j,k}(t) + \phi_{j-1,k}(t)\right] + \frac{D(t)}{\Delta y^2}\left[\phi_{j,k+1}(t) - 2\phi_{j,k}(t) + \phi_{j,k-1}(t)\right]$$
$$\qquad - \Sigma_r(t)\,\phi_{j,k}(t) + S_{j,k}(t), \quad j = 0,\ldots,J;\; k = 0,\ldots,K, \qquad (8.98)$$

where $\phi_{j,k}(t) = \phi(x_j,y_k,t)$, a uniform spatial mesh of cells $\Delta x\,\Delta y$ has been used, and the group index $g$ has been dropped. Clearing the $1/v$ and writing these equations in matrix form for the column vector $\underline{\phi}(t)$ of unknowns $\phi_{j,k}(t)$, $j = 0,\ldots,J$; $k = 0,\ldots,K$, yields

$$\frac{d\underline{\phi}}{dt}(t) = \underline{\underline{L}}(t)\,\underline{\phi}(t) + \underline{S}(t), \qquad (8.99)$$
where the definitions of the square matrix $\underline{\underline{L}}(t)$ and the vector $\underline{S}(t)$ follow from Eq. 8.98 and are obvious. Finally, introducing a simple finite-difference scheme for the time derivative leads to

$$\underline{\phi}^{\ell} - \underline{\phi}^{\ell-1} = (1-\theta)\,\Delta t_\ell\left[\underline{\underline{L}}^{\ell}\,\underline{\phi}^{\ell} + \underline{S}^{\ell}\right] + \theta\,\Delta t_\ell\left[\underline{\underline{L}}^{\ell-1}\,\underline{\phi}^{\ell-1} + \underline{S}^{\ell-1}\right], \quad \ell = 1,2,3,\ldots, \qquad (8.100)$$
where $\underline{\phi}^{\ell} \equiv \underline{\phi}(t_\ell)$, $\underline{\underline{L}}^{\ell} \equiv \underline{\underline{L}}(t_\ell)$, $\underline{S}^{\ell} \equiv \underline{S}(t_\ell)$, $t_\ell$ is the discretized time, $\Delta t_\ell = t_\ell - t_{\ell-1}$ is the $\ell$th time step, and $\theta$ has been introduced so that a $\theta$-weighted average of the right-hand side of Eq. 8.99 appears in Eq. 8.100. Solving these equations for $\underline{\phi}^{\ell}$, the vector of fluxes at spatial grid points $(x_j,y_k)$ at time level $t_\ell$, in terms of $\underline{\phi}^{\ell-1}$ and the sources $\underline{S}^{\ell}$ and $\underline{S}^{\ell-1}$ yields

$$\underline{\phi}^{\ell} = \left[\underline{\underline{I}} - (1-\theta)\,\Delta t_\ell\,\underline{\underline{L}}^{\ell}\right]^{-1}\left\{\left[\underline{\underline{I}} + \theta\,\Delta t_\ell\,\underline{\underline{L}}^{\ell-1}\right]\underline{\phi}^{\ell-1} + (1-\theta)\,\Delta t_\ell\,\underline{S}^{\ell} + \theta\,\Delta t_\ell\,\underline{S}^{\ell-1}\right\}, \quad \ell = 1,2,3,\ldots \qquad (8.101)$$

Here, the vector of sources $\underline{S}^{\ell}$ depends upon the group fluxes at the current (the $\ell$th) time level. Some of these group fluxes have been computed in the current outer source iteration and therefore are available; the others have to be approximated by their values at the previous time level $t_{\ell-1}$.
When $\theta$, in this so-called theta-difference method in time, is set equal to unity, Eq. 8.101 becomes the result for the forward Euler scheme, which has a first-order global error in $\Delta t$ (for a uniform time step) and is explicit (no matrix inversion required) – but often unstable. For $\theta$ equal to 0, this equation gives the time-advancement operator for the backward Euler scheme, which also is first order, but implicit (matrix inversion required) and stable. Finally, for $\theta$ equal to one half, it gives the advancement operator for the Crank–Nicolson scheme, which has a second-order global error in $\Delta t$ (for a uniform time step) and also is implicit and stable.
Of course, there are many very important details – cell-centered versus cell-edge spatial finite-difference schemes, acceleration schemes such as coarse-mesh
rebalancing, asymptotic source extrapolation, Wielandt (eigenvalue shift) iteration,
etc. – that have been omitted here in this brief summary. However, many of these
are discussed at some length in well-known textbooks [84–87].
The $\theta$-difference scheme just summarized, albeit rather briefly, was the basis
for the one-dimensional space–time reactor kinetics code WIGLE [88], developed at BAPL in the early 1960s – and also for TWIGLE [89], its extension
later in the 1960s to two-dimensional space–time reactor kinetics. These codes
were very widely and effectively used in the 1960s and 1970s; and their “guts”
can still be found in some reactor kinetics and reactor dynamics software used
today.
Unfortunately, in order to achieve reasonable accuracy in calculations for
LWRs – in which the neutron diffusion length is of the order of a couple of
centimeters – very fine spatial meshes must be used when these standard finite-difference schemes are employed. Thus, the vectors of group fluxes, $\underline{\phi}^{\ell}$ in Eq. 8.101, become quite large even in two-dimensional calculations. For example, if only one hundred points were used in each direction (x and y) in a quarter-core calculation, that vector would have dimension 10,000, and the matrix that would have to be inverted (in Eq. 8.101) in each group, in each iteration, at each time step would be 10,000 × 10,000. And for three-dimensional quarter-core calculations – which are important in order to properly represent the motion of control rods or blades in reactor kinetics calculations – these numbers become 2 × 10⁶ and (2 × 10⁶) × (2 × 10⁶)!
Thus, in those days, when “large jobs” were run on (32K!) mainframe computers
(e.g., the IBM 360) only at night or over weekends, two-dimensional LWR kinetics
calculations were not done routinely! And three-dimensional calculations were not
done at all – at least not using codes based on finite difference schemes. Rather,
other methods that required less computer storage and time were developed both for
criticality calculations and space–time reactor kinetics calculations. These included
methods developed in the 1960s based on variational principles – in which the
space-dependence of the neutron flux typically was expanded in functions defined
over the whole reactor spatial domain – and later, in the 1970s and 1980s, methods
that were capable of achieving very high numerical accuracy while using very large
computational elements, or, equivalently, coarse meshes. The main ideas used in
the development of these methods based on variational principles, along with some
results obtained using them, will be summarized in the next section. And the key
steps in the formulation of the so-called coarse-mesh and nodal methods and some
examples of their application will be described in the section after that.
8.5.2 Variational, Modal, Synthesis, and Related Methods for the
Time-Dependent Multigroup Diffusion Equations
The development of variational methods, (variational) synthesis methods, and so-called modal methods for space–time reactor kinetics (and reactor statics) begins from a simple variational principle [90–94]; hence, it seems apropos to include here a brief review of the use of these principles to develop techniques for obtaining approximate solutions to differential equations, integral equations, and other functional equations. Even though none of the equations of interest in space–time reactor kinetics – except the one-group steady-state diffusion equation – is self-adjoint, it is convenient for expository purposes to begin from a linear self-adjoint equation
$L\varphi = S,$  (8.102)
and introduce the classic functional [90–92]
$G[\varphi] = (\varphi, L\varphi) - 2(\varphi, S).$  (8.103)
Here, all the variables are taken to be real, and the real inner product is given by the integral over the multidimensional domain $X$ over which the unknown function $\varphi(x)$ is defined. The function that minimizes this functional is the $\varphi(x)$ that is the
solution to Eq. 8.102. This becomes obvious when the variation of $G[\varphi]$ with respect to $\varphi$ is set equal to zero:
$\delta G[\varphi] = (\delta\varphi, L\varphi) + (\varphi, L\delta\varphi) - 2(\delta\varphi, S) = 0,$  (8.104)
$= (\delta\varphi, L\varphi) + (L\varphi, \delta\varphi) - 2(\delta\varphi, S) = 0,$  (8.105)
$= 2(\delta\varphi, \{L\varphi - S\}) = 0.$  (8.106)
Here, $\delta\varphi$ is the variation in $\varphi$, and since it is arbitrary, the inner product is zero only if $\varphi(x)$ is the solution to Eq. 8.102. (If this development had begun from Eq. 8.103 with $G[\varphi]$ representing some physical quantity – for example, the energy in a mechanical system – its minimization would yield Eq. 8.102 as the equation(s) of motion for the system, the so-called Euler–Lagrange equation(s) corresponding to the functional $G[\varphi]$.) The second variation of $G[\varphi]$ with respect to $\varphi$ is $2(\delta\varphi, L\delta\varphi)$, where $L\varphi - S = 0$ has been used; hence, if $L$ is a positive operator the extremum that follows from $\delta G[\varphi] = 0$ is a minimum.
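As a concrete one-group illustration (an example added here, not part of the original discussion): for the steady-state diffusion operator with vanishing-flux boundary conditions, the classic functional becomes, after an integration by parts,

$L\varphi \equiv -\nabla \cdot D\nabla\varphi + \Sigma_a \varphi = S,
\qquad
G[\varphi] = \int_X \left[ D\,(\nabla\varphi)^2 + \Sigma_a \varphi^2 - 2 S \varphi \right] dx,$

and setting $\delta G[\varphi] = 0$ for arbitrary $\delta\varphi$ that vanishes on the boundary recovers the diffusion equation as the Euler–Lagrange equation; since $D > 0$ and $\Sigma_a > 0$, $L$ is positive and the extremum is indeed a minimum.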
In practice, when the original operator in Eq. 8.102 is known – as is the case in reactor kinetics and reactor physics in general – an approximation to its solution can be generated by minimizing $G[\varphi]$, not with respect to an arbitrary variation of $\varphi$, but rather with respect to the variation of a trial function – comprising, for example, a linear combination of acceptable functions, i.e., functions that belong to some specific class. This procedure is made very explicit by introducing the trial function
$\varphi_T(x) = \sum_{n=1}^{N} C_n \psi_n(x),$  (8.107)
into the functional $G[\varphi]$ and then minimizing this functional by setting all its partial derivatives with respect to the $C_n$ equal to zero – or, equivalently, setting its first variation $\{\partial G[\varphi_T]/\partial C_n\}\,\delta C_n$, $n = 1, \ldots, N$, with respect to all the $C_n$ equal to zero – which leads to
$\sum_{m=1}^{N} (\psi_n, L\psi_m)\, C_m - (\psi_n, S) = 0, \qquad n = 1, \ldots, N,$  (8.108)
where a common factor of 2 has been cancelled, and the facts that $L$ is self-adjoint and the inner product is real have been used. This is a simple nonhomogeneous linear system of algebraic equations, which can be solved – provided the coefficient matrix does not have a zero eigenvalue – to obtain the $C_n$, $n = 1, \ldots, N$, and therefore, via Eq. 8.107, the specific $\varphi_T(x)$ that minimizes the functional $G[\varphi_T]$ over functions of the form given by that equation, thereby providing an approximation of this form to the solution of Eq. 8.102. This procedure is known as the Rayleigh–Ritz method [91], and it is equivalent to the Bubnov–Galerkin method [91] of approximation to solutions in linear function spaces. In that method, the solution to Eq. 8.102 is simply approximated by the expansion
$\varphi(x) \cong \varphi_N(x) \equiv \sum_{n=1}^{N} C_n \psi_n(x),$  (8.109)
in the set of functions $\psi_n(x)$, $n = 1, \ldots, N$, which is substituted directly into Eq. 8.102; then, the inner product of the resulting equation with each of the $\psi_n(x)$, $n = 1, \ldots, N$, is separately set equal to zero, forcing the residual $R_N(x) = L\varphi_N(x) - S(x)$ to be orthogonal to each of the expansion functions $\psi_n(x)$, $n = 1, \ldots, N$. The final equations, therefore, are identical to Eq. 8.108; hence, this method is sometimes called the Ritz–Galerkin method – even though the lives of the two mathematicians did not overlap. (Equation 8.108 is equivalent to setting the gradient of $G[\varphi_T]$ with respect to the vector $C$ of coefficients $C_n$, $n = 1, \ldots, N$, equal to zero, i.e., $\nabla_C G[\varphi_T] = 0$.) This, of course, establishes that $G[\varphi_T]$ has an extremum at the resulting value of $C$. To ensure that this extremum actually is a minimum – and not a maximum or inflection point – the Hessian $\nabla_C \nabla_C G[\varphi_T]$ also should be calculated and evaluated at the value of $C$ that corresponds to the extremum, to show that it is positive definite and, therefore, that $G[\varphi_T]$ has been minimized, not maximized. But this long and tedious step is usually omitted in real-world calculations.
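To make the Rayleigh–Ritz/Ritz–Galerkin procedure concrete, here is a minimal numerical sketch for the model problem $-u'' = S$ on $(0, 1)$ with $u(0) = u(1) = 0$; the operator, source, and sine basis are illustrative choices, not taken from the text.

    import numpy as np

    # Ritz-Galerkin solution of  L u = -u'' = S  on (0, 1), u(0) = u(1) = 0,
    # with basis functions psi_n(x) = sin(n*pi*x) that satisfy the BCs.
    N = 5                                  # number of expansion functions
    x = np.linspace(0.0, 1.0, 2001)        # quadrature grid for inner products

    def psi(n, xx):                        # expansion function psi_n(x)
        return np.sin(n * np.pi * xx)

    def L_psi(n, xx):                      # L psi_n = -psi_n'' = (n*pi)^2 psi_n
        return (n * np.pi) ** 2 * psi(n, xx)

    S = np.pi ** 2 * np.sin(np.pi * x)     # source chosen so u_exact = sin(pi*x)

    # Galerkin system, Eq. 8.108:  sum_m (psi_n, L psi_m) C_m = (psi_n, S)
    A = np.array([[np.trapz(psi(n, x) * L_psi(m, x), x)
                   for m in range(1, N + 1)] for n in range(1, N + 1)])
    b = np.array([np.trapz(psi(n, x) * S, x) for n in range(1, N + 1)])

    C = np.linalg.solve(A, b)              # coefficients C_n of Eq. 8.107
    u_T = sum(C[n - 1] * psi(n, x) for n in range(1, N + 1))
    print("max error:", np.abs(u_T - np.sin(np.pi * x)).max())

Because the exact solution happens to lie in the trial space here, the system reproduces it to quadrature accuracy; with a basis that does not contain the solution, $\varphi_T$ is instead the minimizer of $G$ over the span of the $\psi_n$.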
Now that the development for self-adjoint operators has been summarized, the extension to non-self-adjoint operators can be made clear very easily. This development – which applies to the time-dependent (and steady-state) multigroup diffusion equations and multigroup transport equations, and to the time- and energy-dependent diffusion and transport equations as well – begins from the linear non-self-adjoint operator equation

$A\varphi = S,$  (8.110)

and the related adjoint equation

$A^* \varphi^* = S^*,$  (8.111)

where, for example, $A$ might be a nonsymmetric matrix of non-self-adjoint integro-differential operators, and $A^*$, of course, is its adjoint.
As for the self-adjoint case, a functional is introduced and its variation is set equal to zero. Here, however, the classic functional $G[\varphi]$, given by Eq. 8.103, is replaced by its generalization to the non-self-adjoint case – the Roussopoulos functional [93, 94]. (Reference [94], cited here, is a somewhat obscure, but lovely, introduction to variational methods in general and variational methods for reactor calculations in particular, and everyone interested in reactor calculations and calculational methods should have a copy in her or his library. (My copy is a not-very-good photocopy of an old mimeographed copy; nevertheless, I cherish it!))
$F[\varphi^*, \varphi] = (\varphi^*, A\varphi) - (\varphi^*, S) - (S^*, \varphi),$  (8.112)
where, in general, all the variables and the operator $A$ may be complex and the inner product is the standard complex inner product over the multidimensional domain $X$. The variation of $F[\varphi^*, \varphi]$ with respect to both $\varphi$ and $\varphi^*$, set equal to zero, is

$\delta F[\varphi^*, \varphi] = (\delta\varphi^*, \{A\varphi - S\}) + (\{A^*\varphi^* - S^*\}, \delta\varphi) = 0,$  (8.113)
and since both $\delta\varphi^*$ and $\delta\varphi$ are arbitrary variations, this leads to Eqs. 8.110 and 8.111 for the so-called forward and adjoint solutions, respectively, as the generalized Euler–Lagrange equations. (Analogously to the self-adjoint case, if the original equations for $\varphi$ and $\varphi^*$, Eqs. 8.110 and 8.111, had not been known and this development had begun from the functional $F[\varphi^*, \varphi]$, setting its first variation equal to zero would have generated these equations.) Now, approximations to the solutions to the forward equation and the adjoint equation can be generated via steps closely analogous to those just summarized for the self-adjoint case. Introducing trial functions for both the forward and adjoint solutions,
$\varphi_T(x) = \sum_{n=1}^{N} C_n \psi_n(x),$  (8.114)

and

$\varphi_T^*(x) = \sum_{k=1}^{N} C_k^* \psi_k^*(x),$  (8.115)
into the Roussopoulos functional and setting its first variation with respect to all the $C_k^*$ and all the $C_n$ equal to zero,
$\sum_{k=1}^{N} \{\partial F[\varphi_T^*, \varphi_T]/\partial C_k^*\}\,\delta C_k^* + \sum_{n=1}^{N} \{\partial F[\varphi_T^*, \varphi_T]/\partial C_n\}\,\delta C_n = 0,$  (8.116)
or equivalently, since all the $\delta C_k^*$ and all the $\delta C_n$ are arbitrary, setting all the partial derivatives of $F[\varphi^*, \varphi]$ with respect to both the $C_k^*$, $k = 1, \ldots, N$, and the $C_n$, $n = 1, \ldots, N$, equal to zero.
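Setting those partial derivatives to zero yields a forward system weighted by the adjoint trial functions, $\sum_n (\psi_k^*, A\psi_n)\,C_n = (\psi_k^*, S)$, together with the corresponding adjoint system for the $C_k^*$. The finite-dimensional sketch below illustrates this Petrov–Galerkin structure; the matrices, sources, and bases are arbitrary stand-ins (kept real for simplicity), not quantities from the chapter.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 6, 6                           # full dimension; N = M for a complete basis
    A = rng.standard_normal((M, M)) + 6.0 * np.eye(M)   # nonsymmetric stand-in
    S, S_adj = rng.standard_normal(M), rng.standard_normal(M)

    P = rng.standard_normal((M, N))       # columns: forward trial functions psi_n
    P_adj = rng.standard_normal((M, N))   # columns: adjoint trial functions psi_k*

    G = P_adj.T @ A @ P                   # (psi_k*, A psi_n)
    C = np.linalg.solve(G, P_adj.T @ S)        # dF/dC_k* = 0: forward system
    C_adj = np.linalg.solve(G.T, P.T @ S_adj)  # dF/dC_n  = 0: adjoint system

    # With N = M the trial spaces are complete, so the stationary point
    # reproduces the exact forward and adjoint solutions:
    print(np.allclose(P @ C, np.linalg.solve(A, S)))
    print(np.allclose(P_adj @ C_adj, np.linalg.solve(A.T, S_adj)))

With $N < M$ the same two systems define the stationary point of the Roussopoulos functional over the chosen trial spaces.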