Reactor Core Methods. Chapter, April 2010. DOI: 10.1007/978-90-481-3411-3_4. Authors: Robert Roy; Yousry Y. Azmy (North Carolina State University); Enrico Sartori (Organisation for Economic Co-operation and Development, OECD).

Yousry Azmy • Enrico Sartori
Nuclear Computational Science
A Century in Review

Prof. Yousry Azmy, North Carolina State University, Department of Nuclear Engineering, 1110 Burlington Engineering Labs, Raleigh, NC 27695, USA, yyazmy@ncsu.edu
Enrico Sartori, Organisation for Economic Co-operation and Development (OECD), 12 bd. des Iles, 92130 Issy-les-Moulineaux, France, esartori@noos.fr

ISBN 978-90-481-3410-6
e-ISBN 978-90-481-3411-3
DOI 10.1007/978-90-481-3411-3
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009944067
Mathematics Subject Classification (2010): 82D75, 65C05

© Springer Science+Business Media B.V. 2010
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Cover design: deblik
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Ely Gelbard
November 6, 1924–April 18, 2002

Preface

Scheduled on the heels of the atomic century, the American Nuclear Society’s international topical meeting on Mathematics and Computation seemed like an opportune moment in time to capture accomplishments in this area during the first half-century of nuclear engineering. Held in a semi-secluded part of the city of Gatlinburg, Tennessee, April 6–10, 2003, this gathering of prominent experts in the field and young professionals embarking on exciting careers in what promises to develop into a nuclear renaissance turned out to be the perfect venue for such a review. The conference was co-sponsored by three divisions of the American Nuclear Society, namely the Mathematics and Computation Division, the Reactor Physics Division, and the Radiation Protection and Shielding Division.

The Technical Program of the conference revolved around the theme of its title, Nuclear Mathematical & Computational Sciences: A Century in Review, A Century Anew. The Anew component comprised contributed papers organized in 25 regular and special sessions on a broad variety of topics, plus a poster session and a panel session. The Review component of the conference comprised the lecture series that grew into this book.

As Technical Program Chair (YYA) and Assistant General Chair (ES) of the conference, we decided to break with the traditional format of plenary sessions standard in technical meetings and organize a lecture series that takes stock of the state of the art in nuclear computational science at the turn of a new century. Thus the concept of the lecture series that led to the chapters of this book was born. One of the first experts we solicited to present a lecture in the series was the late Dr. Ely Gelbard of Argonne National Laboratory at the time.
In his gentle but firm and persuasive manner, he declined, preferring instead to participate as co-organizer of the lecture series. We jumped at the opportunity, recognizing his long-standing, distinguished, and generous contributions to many subareas of nuclear computational science; his many years of service in the field positioned him well to know the major areas to cover in the lectures and to nominate world-renowned lecturers. In short order the three of us came up with a slate of topics and a corresponding list of lecturers. The response of the nominated lecturers was supportive and enthusiastic, and by mid-Fall 2001 what later became known as the Gelbard Lecture Series was fully conceived, and a tentative plan to document the lecture contents in book chapters was initiated.

Our charge to the invited lecturers was to provide an overview of the assigned topic aiming primarily at breadth of coverage, with a sharp focus on its mathematical and computational aspects. Specifically, we requested that each author provide a historical perspective on the conception of their topic as a major area of research in nuclear computational science and identify landmarks in the evolution of the topic through the end of the twentieth century. We further requested that the lecturers delineate the current state of the art in their assigned topic and project into the future by exposing perceived challenges and opportunities for advancing the frontier of knowledge. Our renowned lecturers did not disappoint, and the lecture series was a smashing success, thanks to their dedicated effort and professionalism. The lectures, scheduled to open each half-day of the conference, were well attended, with conference participants packing the lecture hall on a consistent basis. Perhaps the only sour note that tainted the lecture series was the passing on April 18, 2002, of Dr.
Ely Gelbard, whose contributions to the success of the lecture series, and ultimately to the publication of this book, cannot be overstated. This great loss to the field of nuclear computational science overshadowed the conference, leading to various observances of this sad event. The conference banquet included a memorial celebrating Dr. Gelbard’s life and his significant contributions to nuclear computational science, and the lecture series was named after him in recognition of the involvement that propelled the series to success. Later, the contributing authors to this book agreed to dedicate it to the memory of Dr. Ely Gelbard.

Unfortunately, death struck again with the passing of Dr. Richard Hwang on December 20, 2007, shortly after he completed the final revisions to his chapter appearing in this book. We are grateful for Richard’s contribution to the success of the lecture series, for the chapter he composed for this book, and for his dedication to his research over the past five decades.

While the original list of topics envisioned in our early planning of the lecture series has not changed, the reader will notice a few differences between the lecture lineup and the chapters herein. First, Dr. Dan Cacuci, who because of unforeseen circumstances was unable to deliver his lecture on Sensitivity and Uncertainty Analysis at the conference, has graciously composed the corresponding chapter for this book. Second, Dr. Kord Smith, who presented the lecture on Reactor Physics at the conference, excused himself from composing the corresponding book chapter because of increased job-related responsibilities. We are grateful to Dr. Robert Roy for agreeing to take on this task and for the excellent job he did in composing his chapter on Reactor Core Methods. Lastly, in composing Chapter 7, Elliott Whitesides recruited Mike Westfall and Calvin Hopper to help with the composition.
This book would not have been possible without the support and active involvement of many people over the span of 6 years. Most of all we wish to thank the authors who willingly and cheerfully accepted this additional burden on their normally hectic schedules. We are confident that the benefit to the field of nuclear computational science and the gratitude of its practitioners, especially the young scientists who will carry the torch into the future, will reward the authors’ perseverance and patience during this long and arduous journey. We are grateful to Argonne National Laboratory’s Dr. Roger Blomquist for composing the memorials to Ely Gelbard and Richard Hwang, and for reviewing the final version of Richard’s Chapter 5. The support and encouragement of Bernadette Kirk, Director of Oak Ridge National Laboratory’s Radiation Safety Information Computational Center (RSICC) and General Chair of the Gatlinburg conference, was invaluable to the completion of this project. The technical help of Alice Rice of RSICC in bringing together the pieces of this book into a single volume is greatly appreciated. In addition, we wish to acknowledge the tacit approval and support of our respective institutions, The Pennsylvania State University and North Carolina State University (YYA), and the Nuclear Energy Agency of the Organisation for Economic Co-operation and Development (ES).

June 2009
Yousry Y. Azmy
Enrico Sartori

Obituary Composed by Dr. Roger Blomquist for Dr. Ely Meyer Gelbard

Ely Gelbard was born in New York City on November 6, 1924. He was the son of immigrants. His undergraduate work was at the City Colleges of New York and after World War II he earned his Ph.D. in physics from the University of Chicago. During the war, he served in the US Army Air Corps as a radar technician. He was a Senior Scientist at Argonne National Laboratory and a Fellow of the American Nuclear Society.
Ely started his postgraduate career when the use of digital computers to solve the neutron balance equations for fission reactor core design and analysis was just starting to receive wide application. At Bettis (1954–1972), he participated in the efforts that put the numerical methods for the solution of the finite difference form of the neutron transport equation on a firm mathematical basis, and he devised several approximation schemes that were suitable for numerical methods and also developed efficient algorithms for their solution. While at Bettis, he earned international stature in the field, authoring important papers in many variants of the solution procedures (spherical harmonics, $S_N$, synthetic methods, and Monte Carlo), including the book, Monte Carlo Principles and Neutron Transport Problems, with J. Spanier. He was the first physicist at Bettis to attain the rank of Consulting Scientist, and earned the Atomic Energy Commission’s prestigious E. O. Lawrence Award.

Since 1972, when Dr. Gelbard joined Argonne National Laboratory, fast reactors have been the focus of ANL’s reactor program, with its emphasis on more accurate computation of the neutron spectrum. His work in this area produced fundamental advances in the analysis of neutron streaming, collision probabilities, improvements in Monte Carlo methods, and neutron diffusion and transport within the nodal approximation. He also brought improved iterative solution strategies to bear on the equations of single-phase computational thermal-hydraulics analysis of passively safe metal-cooled reactor systems. He was consulted by many at ANL, at other labs, and at universities on a wide variety of technical issues, and invariably provided important insights.

Ely’s sustained record of high productivity of the highest-quality technical work attracted a series of bright and vigorous visiting scholars and students whose participation magnified his work.
He excelled at distilling complex technical issues to their essence, then performing the relevant mathematical analysis and, finally, computationally confirming the analysis. He was always careful, honest, and thoroughly scrupulous in his work. He earned the ANS Special Award for Computer Methods for the Solution of Problems in Reactor Technology, the ANS Mathematics and Computations Division Distinguished Service Award, the ANS Reactor Physics Division Eugene Wigner Award, and the University of Chicago Distinguished Performance Award.

In spite of his great stature and many accomplishments, Ely was a mild and modest gentleman who always gave full credit to others’ work, and was very approachable and an excellent listener. His technical questions at meetings were insightful, probing, and gentle. He also pursued the understanding of others’ points of view in personal and political matters with both intellect and sensitivity. His restaurant adventures at meetings and other venues have provided a rich array of gastronomic experiences and many fond memories to his many friends in our profession.

The Gelbard Review Lecture Series
Conducted during the American Nuclear Society’s Conference Nuclear Mathematical and Computational Sciences: A Century in Review, A Century Anew, Gatlinburg, Tennessee, April 6–10, 2003
Back row: Richard Hwang, Elmer Lewis, Kord Smith, Enrico Sartori
Front row: Yousry Azmy, Elliott Whitesides, Jerry Spanier, Jack Dorning, Ed Larsen

Contents

Preface
Obituary Composed by Dr. Roger Blomquist for Dr. Ely Meyer Gelbard
1 Advances in Discrete-Ordinates Methodology (Edward W. Larsen and Jim E. Morel)
2 Second-Order Neutron Transport Methods (E.E. Lewis)
3 Monte Carlo Methods (Jerome Spanier)
4 Reactor Core Methods (Robert Roy)
5 Resonance Theory in Reactor Applications (R.N. Hwang)
6 Sensitivity and Uncertainty Analysis of Models and Data (Dan Gabriel Cacuci)
7 Criticality Safety Methods (G.E. Whitesides, R.M. Westfall, and C.M. Hopper)
8 Nuclear Reactor Kinetics: 1934–1999 and Beyond (Jack Dorning)
Index

Chapter 1
Advances in Discrete-Ordinates Methodology
Edward W. Larsen and Jim E. Morel

1.1 Introduction

In 1968, Bengt Carlson and Kaye Lathrop published a comprehensive review on the state of the art in discrete-ordinates ($S_N$) calculations [10]. At that time, $S_N$ methodology existed primarily for reactor physics simulations. By today’s standards, those capabilities were limited, due to the less-developed theoretical state of $S_N$ methods and the slower and smaller computers that were then available. In this chapter, we review some of the major advances in $S_N$ methodology that have occurred since 1968. These advances, combined with the faster speeds and larger memories of today’s computers, enable today’s $S_N$ codes to simulate problems of much greater complexity, realism, and physical variety.
Since 1968, several books and reviews on general numerical methods for $S_N$ simulations have been published [32, 46, 71], but none of these covers the advanced work done during the past 20 years. The specific purpose of this chapter is to describe how the field of $S_N$ calculations has matured through the lens of three important physical problems that can be simulated today but could not be realistically simulated in 1968. By discussing these problems and the methods developed to overcome their calculational difficulties, we hope to (i) show how dramatically the field of $S_N$ simulations of the transport equation has advanced and (ii) provide an introduction to the new algorithmic techniques that have enabled these advances.

E.W. Larsen, Department of Nuclear Engineering and Radiological Sciences, University of Michigan, Ann Arbor, Michigan 48109-2104, USA; e-mail: edlarsen@umich.edu
J.E. Morel, Department of Nuclear Engineering, Texas A&M University, College Station, Texas, USA; e-mail: morel@tamu.edu
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, DOI 10.1007/978-90-481-3411-3_1, © Springer Science+Business Media B.V. 2010

An outline of the remainder of this review follows. In Section 1.2, we briefly introduce the transport equation and discuss its basic temporal (implicit), energy (multigroup), directional ($S_N$), and spatial (finite-difference) discretizations, together with iterative solution procedures, as of 1968. The purpose of this section is to establish notation and set the stage for the later sections, which describe more recent developments. Section 1.3 discusses three important physical problems that could not be simulated in 1968 but can be realistically simulated today: thermal radiation transport, charged-particle transport, and oil-well logging tool design.
In Section 1.4, we discuss advanced spatial discretizations (characteristic methods, discontinuous finite-element methods [DFEMs], and nodal methods) and the asymptotic thick diffusion limit (a technique to predict the validity of $S_N$ spatial discretizations for diffusive systems with optically thick spatial cells). Section 1.5 describes advances in discretizations of the angular derivatives associated with curvilinear geometries and treatments of anisotropic scattering. Section 1.6 covers advances in angular and energy discretizations for charged particles; Section 1.7 describes advances in time discretizations. In Section 1.8, we discuss major advances in iteration acceleration: diffusion-synthetic acceleration (DSA), linear multifrequency-grey acceleration for thermal radiation transport, fission source acceleration for time-dependent calculations, and upscatter acceleration. Section 1.9 outlines the recent application of preconditioned Krylov methods. Section 1.10 concludes with a brief discussion of challenges for the future: robust finite-element methods on nonorthogonal grids, positive and monotone methods, efficient parallel sweep algorithms for unstructured grids, further development of Krylov methods for solving the $S_N$ equations, methods for charged-particle calculations with pencil-beam sources, Galerkin quadrature with positive generalized weights, and ray-effect mitigation.

1.2 Basic Concepts

The physical process discussed in this chapter is the interaction of radiation with matter (radiation transport, or particle transport). The archetypical equation that describes these interactions is the linear Boltzmann equation (LBE) [2, 3, 7, 13]:

$$
\begin{aligned}
\frac{1}{v}\frac{\partial \psi}{\partial t}(\mathbf{r},\boldsymbol{\Omega},E,t)
&+ \boldsymbol{\Omega}\cdot\nabla\psi(\mathbf{r},\boldsymbol{\Omega},E,t)
+ \Sigma_t(\mathbf{r},E)\,\psi(\mathbf{r},\boldsymbol{\Omega},E,t) \\
&= \int_0^\infty\!\!\int_{4\pi} \Sigma_s(\mathbf{r},\boldsymbol{\Omega}'\cdot\boldsymbol{\Omega},E'\to E)\,\psi(\mathbf{r},\boldsymbol{\Omega}',E',t)\,d\Omega'\,dE' \\
&\quad + \frac{\chi(\mathbf{r},E)}{4\pi}\int_0^\infty\!\!\int_{4\pi} \nu\Sigma_f(\mathbf{r},E')\,\psi(\mathbf{r},\boldsymbol{\Omega}',E',t)\,d\Omega'\,dE'
+ \frac{Q(\mathbf{r},E,t)}{4\pi}.
\end{aligned}
\tag{1.1}
$$

In full generality, this equation has seven independent variables: three spatial variables ($\mathbf{r}$), two direction-of-flight (or angular) variables ($\boldsymbol{\Omega}$), energy ($E$), and time ($t$). Particle transport problems are difficult and costly to simulate in part because of the high dimensionality of phase space. In this section, we discuss the basic numerical methods used to solve Eq. (1.1) in the principal large computer codes of the 1960s [4, 5, 10, 14]. We assume that the reader understands the physical meaning and basic mathematical properties of each of the terms in Eq. (1.1), and we have used notation that is broadly standard. The discussion in this section is terse; we refer the reader to standard texts [13, 71] for details.

The LBE given in Eq. (1.1) describes neutron transport with scattering and fission interactions. Variations of this equation primarily involve the types of interactions that are included. For instance, a gamma-ray transport equation would not have a fission term. Systems of coupled transport equations, each similar to Eq. (1.1), are required to describe the coupled transport of multiple types of particles, e.g., coupled neutron gamma-ray transport in which neutrons interact with nuclei to create gamma-rays and gamma-rays interact with nuclei to create neutrons. The principal computational difficulties associated with Eq. (1.1) are common to essentially all variations of this equation that are associated with different physical applications. In this chapter, we describe numerical methods in terms of Eq. (1.1), or simpler versions of that equation whenever possible. We consider variations of Eq. (1.1) that correspond to different physical applications only when necessary.

To begin, we mention a few technical details.
First, the differential scattering cross section is commonly written as a Legendre polynomial expansion:

$$
\Sigma_s(\mathbf{r},\boldsymbol{\Omega}'\cdot\boldsymbol{\Omega},E'\to E)
= \sum_{m=0}^{\infty} \frac{2m+1}{4\pi}\,P_m(\boldsymbol{\Omega}'\cdot\boldsymbol{\Omega})\,\Sigma_{s,m}(\mathbf{r},E'\to E).
\tag{1.2}
$$

The Legendre moments $\Sigma_{s,m}$ are typically calculated and stored for each material region. Also, initial and boundary conditions must be specified for Eq. (1.1). If $V$ denotes the physical system and $t = 0$ is the initial time, then Eq. (1.1) holds for all $\mathbf{r} \in V$, $\boldsymbol{\Omega} \in 4\pi$, $0 < E < \infty$, and $t > 0$. At $t = 0$, $\psi$ must be fully specified in $V$:

$$
\psi(\mathbf{r},\boldsymbol{\Omega},E,0) = \psi^i(\mathbf{r},\boldsymbol{\Omega},E), \quad \mathbf{r}\in V,\ \boldsymbol{\Omega}\in 4\pi,\ 0<E<\infty.
\tag{1.3}
$$

Also, $\psi$ must be specified on the boundary $\partial V$ for directions of flight pointing into $V$:

$$
\psi(\mathbf{r},\boldsymbol{\Omega},E,t) = \psi^b(\mathbf{r},\boldsymbol{\Omega},E,t), \quad \mathbf{r}\in\partial V,\ \boldsymbol{\Omega}\cdot\mathbf{n}<0,\ 0<E<\infty,\ 0<t.
\tag{1.4}
$$

Here, $\mathbf{n}$ is the unit outer normal vector at the boundary point $\mathbf{r} \in \partial V$.

Many important algorithmic concepts can be explained most easily for problems with planar-geometry symmetry, in which the geometry and solution depend on only one spatial variable $x$ and one angular variable $\mu = \boldsymbol{\Omega}\cdot\mathbf{i}$. (The unit vector $\mathbf{i}$ points in the positive $x$-direction.) For a planar-geometry system $0 \le x \le X$, Eq. (1.1) simplifies to

$$
\begin{aligned}
\frac{1}{v}\frac{\partial\psi}{\partial t}(x,\mu,E,t)
&+ \mu\frac{\partial\psi}{\partial x}(x,\mu,E,t)
+ \Sigma_t(x,E)\,\psi(x,\mu,E,t) \\
&= \int_0^\infty\!\!\int_{-1}^{1} \Sigma_s(x,\mu',\mu,E'\to E)\,\psi(x,\mu',E',t)\,d\mu'\,dE' \\
&\quad + \frac{\chi(x,E)}{2}\int_0^\infty\!\!\int_{-1}^{1} \nu\Sigma_f(x,E')\,\psi(x,\mu',E',t)\,d\mu'\,dE'
+ \frac{Q(x,E,t)}{2},
\end{aligned}
\tag{1.5}
$$

where

$$
\Sigma_s(x,\mu',\mu,E'\to E) = \sum_{m=0}^{\infty}\frac{2m+1}{2}\,P_m(\mu)\,P_m(\mu')\,\Sigma_{s,m}(x,E'\to E).
\tag{1.6}
$$

The initial condition for Eq. (1.5) is

$$
\psi(x,\mu,E,0) = \psi^i(x,\mu,E), \quad 0<x<X,\ -1\le\mu\le 1,\ 0<E<\infty,
\tag{1.7}
$$

and the boundary conditions are

$$
\psi(0,\mu,E,t) = \psi^l(\mu,E,t), \quad 0<\mu\le 1,\ 0<E<\infty,\ 0<t,
\tag{1.8a}
$$

$$
\psi(X,\mu,E,t) = \psi^r(\mu,E,t), \quad -1\le\mu<0,\ 0<E<\infty,\ 0<t.
\tag{1.8b}
$$

Because $\psi = vN$, where $v$ is the particle speed and $N$ is the particle density, physically $\psi$ must be non-negative. If the cross sections, inhomogeneous source, initial conditions, and boundary conditions in Eqs.
(1.5) through (1.8) are all non-negative (as they must be physically), then it can be shown that the solution $\psi$ of these equations is non-negative. However, the positivity of $\psi$ does not necessarily hold when approximations (discretizations) of the LBE are imposed. A desirable feature of a discretization for the LBE is that the resulting approximate solution should be positive, or nearly so.

We now sketch the basic discretization and solution methods for Eqs. (1.5) through (1.8), which existed in computer codes in the late 1960s. We begin with the discretization of time. The most widely used time-discretization technique for transport problems, even today, is implicit time differencing. For a time interval $t_{k-1/2} < t < t_{k+1/2}$, and with the definition $\psi^{k+1/2}(x,\mu,E) = \psi(x,\mu,E,t_{k+1/2})$, Eq. (1.5) with implicit time differencing is given as follows:

$$
\begin{aligned}
\mu\frac{\partial\psi^{k+1/2}}{\partial x}(x,\mu,E)
&+ \left[\Sigma_t(x,E) + \frac{1}{v\,\Delta t_k}\right]\psi^{k+1/2}(x,\mu,E) \\
&= \int_0^\infty\!\!\int_{-1}^{1} \Sigma_s(x,\mu',\mu,E'\to E)\,\psi^{k+1/2}(x,\mu',E')\,d\mu'\,dE' \\
&\quad + \frac{\chi(x,E)}{2}\int_0^\infty\!\!\int_{-1}^{1} \nu\Sigma_f(x,E')\,\psi^{k+1/2}(x,\mu',E')\,d\mu'\,dE' \\
&\quad + \frac{Q^{k+1/2}(x,E)}{2} + \frac{\psi^{k-1/2}(x,\mu,E)}{v\,\Delta t_k}.
\end{aligned}
\tag{1.9}
$$

This equation can be obtained by integrating Eq. (1.5) over $t_{k-1/2} < t < t_{k+1/2}$, dividing by $\Delta t_k = t_{k+1/2} - t_{k-1/2}$, and approximating

$$
\frac{1}{\Delta t_k}\int_{t_{k-1/2}}^{t_{k+1/2}} \psi(x,\mu,E,t)\,dt \approx \psi^{k+1/2}(x,\mu,E).
\tag{1.10}
$$

The definition of $Q^{k+1/2}$ in Eq. (1.9) follows Eq. (1.10). The boundary conditions

$$
\psi^{k+1/2}(0,\mu,E) = \psi^{l,k+1/2}(\mu,E), \quad 0<\mu\le 1,\ 0<E<\infty,
\tag{1.11a}
$$

$$
\psi^{k+1/2}(X,\mu,E) = \psi^{r,k+1/2}(\mu,E), \quad -1\le\mu<0,\ 0<E<\infty,
\tag{1.11b}
$$

for $\psi^{k+1/2}$ are obtained similarly.

Implicit time discretization yields a steady-state LBE to be solved within each time step. The angular flux at the end of the previous time step appears as a source on the right side of the (steady-state) LBE. Since the solution of the LBE with a positive source is guaranteed to be positive, it follows that implicit time differencing, in the absence of other truncation errors, is guaranteed to yield a positive solution.
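The positivity claim can be checked on the simplest possible case. The sketch below is an illustration only, assuming an infinite homogeneous purely absorbing medium, so that Eq. (1.5) reduces to $(1/v)\,d\psi/dt + \Sigma_t\psi = q$. The function names and numerical values are hypothetical, and the explicit (forward-Euler) update is introduced here purely for contrast; it is not part of the chapter's method.

```python
def implicit_step(psi_old, sigma_t, v, dt, q=0.0):
    """One fully implicit step of (1/v) dpsi/dt + sigma_t*psi = q.

    Infinite-medium reduction of Eq. (1.9): the old flux enters as the
    extra source psi_old/(v*dt) on the right side.
    """
    return (q + psi_old / (v * dt)) / (sigma_t + 1.0 / (v * dt))


def explicit_step(psi_old, sigma_t, v, dt, q=0.0):
    """One explicit (forward-Euler) step, shown only for contrast."""
    return psi_old + v * dt * (q - sigma_t * psi_old)


def march(step, psi0, sigma_t, v, dt, n_steps):
    """Apply `step` repeatedly and return the full flux history."""
    history = [psi0]
    for _ in range(n_steps):
        history.append(step(history[-1], sigma_t, v, dt))
    return history


# Hypothetical data: sigma_t*v*dt = 3, a deliberately coarse time step.
sigma_t, v, dt = 1.0, 1.0, 3.0
implicit_history = march(implicit_step, 1.0, sigma_t, v, dt, 5)
explicit_history = march(explicit_step, 1.0, sigma_t, v, dt, 5)

print("implicit:", implicit_history)  # remains positive for any dt > 0
print("explicit:", explicit_history)  # changes sign with this coarse dt
```

With non-negative data the implicit update is a ratio of non-negative numbers, so it cannot change sign however large the time step is; the explicit update has no such protection.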
(This is not the case for other time discretizations.) However, all discretization methods contain truncation errors that degrade the solution. For implicit time differencing, these errors are first-order:

$$
\psi(x,\mu,E,t_{k+1/2}) = \psi^{k+1/2}(x,\mu,E) + O(\Delta t).
\tag{1.12}
$$

Because Eq. (1.9) is equivalent to a steady-state equation, the same solution techniques can be used for both time-dependent and steady-state calculations. This is generally true even for time discretizations that are more advanced than the fully implicit discretization. Thus, we shall henceforth discuss only steady-state problems.

Next, we consider the multigroup discretization of the energy variable. This approximation begins with the specification of an energy grid $E_{\min} = E_G < E_{G-1} < \cdots < E_g < E_{g-1} < \cdots < E_1 < E_0 = E_{\max}$, which defines the boundaries of energy groups (or, more simply, groups), the $g$th group being the interval $E_g < E < E_{g-1}$. On each group, the cross sections are represented as constants: $\Sigma_{t,g}(x)$, $\Sigma_{s,g'\to g}(x,\mu',\mu)$, $\nu\Sigma_{f,g}(x)$, and $\chi_g(x)$. (In effect, the continuous-energy cross sections are approximated as histograms in $E$.) Doing this, integrating the (steady-state) Eq. (1.5) over the $g$th group, and defining the $g$th group flux as

$$
\psi_g(x,\mu) = \int_{E_g}^{E_{g-1}} \psi(x,\mu,E)\,dE,
\tag{1.13}
$$

we obtain the coupled system of multigroup transport equations:

$$
\begin{aligned}
\mu\frac{\partial\psi_g}{\partial x}(x,\mu) + \Sigma_{t,g}(x)\,\psi_g(x,\mu)
&= \sum_{g'=1}^{G}\int_{-1}^{1} \Sigma_{s,g'\to g}(x,\mu',\mu)\,\psi_{g'}(x,\mu')\,d\mu' \\
&\quad + \frac{\chi_g(x)}{2}\sum_{g'=1}^{G}\int_{-1}^{1} \nu\Sigma_{f,g'}(x)\,\psi_{g'}(x,\mu')\,d\mu' + \frac{Q_g(x)}{2}, \\
&\qquad 0<x<X,\ -1\le\mu\le 1,\ 1\le g\le G.
\end{aligned}
\tag{1.14}
$$

The multigroup boundary conditions are likewise obtained by integrating Eq. (1.8) over the $g$th group:

$$
\psi_g(0,\mu) = \psi_g^l(\mu), \quad 0<\mu\le 1,\ 1\le g\le G,
\tag{1.15a}
$$

$$
\psi_g(X,\mu) = \psi_g^r(\mu), \quad -1\le\mu<0,\ 1\le g\le G.
\tag{1.15b}
$$

This simplified derivation of the multigroup equations does not specify the multigroup cross sections, e.g., $\Sigma_{s,g'\to g}$. This topic is routinely covered in nuclear engineering textbooks and will not be discussed here.
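To make the group coupling in Eq. (1.14) concrete, the following sketch solves a two-group problem in the simplest setting. It is an illustration under stated assumptions, not a method from the chapter: the medium is taken to be infinite and homogeneous, and Eq. (1.14) is integrated over $\mu$, which leaves the algebraic balance $\Sigma_{t,g}\phi_g = \sum_{g'}\Sigma_{s0,g'\to g}\phi_{g'} + \chi_g\sum_{g'}\nu\Sigma_{f,g'}\phi_{g'} + Q_g$ for the scalar fluxes $\phi_g$. All cross-section values and the function name are hypothetical, chosen so that the system is subcritical.

```python
import numpy as np

# Hypothetical two-group data (arbitrary units), subcritical by construction.
sigma_t = np.array([0.30, 0.80])          # total cross sections
scatter = np.array([[0.20, 0.00],         # scatter[g, gp] = Sigma_s0, gp -> g
                    [0.05, 0.60]])
nu_sigma_f = np.array([0.01, 0.15])       # nu * Sigma_f per group
chi = np.array([1.0, 0.0])                # fission spectrum
source = np.array([1.0, 0.0])             # fixed source Q_g


def solve_infinite_medium(sigma_t, scatter, nu_sigma_f, chi, source):
    """Solve (T - S - F) phi = Q for the group scalar fluxes.

    T is the diagonal removal matrix, S the group-to-group scattering
    matrix, and F = chi (nu Sigma_f)^T the fission matrix.
    """
    removal = np.diag(sigma_t)
    fission = np.outer(chi, nu_sigma_f)
    return np.linalg.solve(removal - scatter - fission, source)


phi = solve_infinite_medium(sigma_t, scatter, nu_sigma_f, chi, source)
print("group scalar fluxes:", phi)
```

The off-diagonal entries of `scatter` and the outer product `chi (nu Sigma_f)^T` are the only places where one group sees another, which is the sense in which the multigroup equations form a coupled system.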
However, the above derivation shows that if the physical cross sections are histograms in $E$, then the multigroup transport equations are exact.

The multigroup technique is one of the most successful approximations in computational transport theory. In practice, much effort goes into the definition of the energy groups and multigroup cross sections: the more carefully these are chosen, the fewer groups are required and the less costly the calculation. The optimal number of groups and choice of the group structure depend on the specific application. For some light water reactor calculations, only two energy groups are required to achieve sufficient accuracy. However, other problems, such as neutron and gamma-ray transport in shields or charged-particle transport, can require well over 100 energy groups.

Equation (1.14) can also be written as

$$
\mu\frac{\partial\psi_g}{\partial x}(x,\mu) + \Sigma_{t,g}(x)\,\psi_g(x,\mu)
= \int_{-1}^{1}\Sigma_{s,g}(x,\mu',\mu)\,\psi_g(x,\mu')\,d\mu' + S_g(x,\mu) + \frac{Q_g(x)}{2},
\tag{1.16}
$$

where $\Sigma_{s,g} = \Sigma_{s,g\to g}$ is the within-group scattering cross section for group $g$, and

$$
S_g(x,\mu) = \sum_{g'\ne g}\int_{-1}^{1}\Sigma_{s,g'\to g}(x,\mu',\mu)\,\psi_{g'}(x,\mu')\,d\mu'
+ \frac{\chi_g(x)}{2}\sum_{g'=1}^{G}\int_{-1}^{1}\nu\Sigma_{f,g'}(x)\,\psi_{g'}(x,\mu')\,d\mu'
\tag{1.17}
$$

is the scattering plus fission source to group $g$. Thus, the multigroup equations can be viewed as a system of one-group equations that are coupled through the scattering and fission sources.

Like the implicit time discretization of the transport equation, the multigroup approximation is inherently positive: for any number of groups $G$, if the multigroup cross sections, source $Q_g$, and boundary conditions $\psi_g^l$ and $\psi_g^r$ are non-negative, the solution $\psi_g$ of the multigroup Eqs. (1.14) and (1.15) is non-negative.

Next, we discuss the discrete-ordinates ($S_N$) discretization of the angular variable. This begins with the specification of $N+1$ discrete angles $\mu_{n+1/2}$, $0 \le n \le N$, satisfying $-1 = \mu_{1/2} < \mu_{3/2} < \cdots < \mu_{N+1/2} = 1$, which define the boundaries of $N$ angular bins, the $n$th angular bin being the interval $\mu_{n-1/2} < \mu < \mu_{n+1/2}$.
The width of this bin is $w_n = \mu_{n+1/2} - \mu_{n-1/2}$. At a point within each $n$th angular bin, a discrete ordinate $\mu_n$ is specified. The set $\{(\mu_n, w_n) \mid 1 \le n \le N\}$ is an angular quadrature set of order $N$. In practice, one-dimensional (1-D) quadrature sets for Eq. (1.16) have an even number $N$ of discrete ordinates and angular bins, which are symmetric about $\mu = 0$: $\mu_N = -\mu_1$, $w_N = w_1$, etc. (The direction $\mu = 0$ is always chosen as the edge of an angular bin, never as a discrete ordinate [15].)

Many numerical methods can be described in terms of a simplified one-group version of Eq. (1.16). This simplified equation is obtained from Eq. (1.16) by absorbing $S_g$ into $Q_g$, assuming $Q_g$ to be a known source, and dropping the group subscript $g$. The $S_N$ approximation to this equation is

$$
\mu_n\frac{\partial\psi_n}{\partial x}(x) + \Sigma_t(x)\,\psi_n(x)
= \sum_{n'=1}^{N}\Sigma_s(x,\mu_n,\mu_{n'})\,\psi_{n'}(x)\,w_{n'} + \frac{Q(x)}{2},
\quad 0<x<X,\ 1\le n\le N,
\tag{1.18}
$$

with boundary conditions:

$$
\psi_n(0) = \psi_n^l, \quad 0<\mu_n\le 1,
\tag{1.19a}
$$

$$
\psi_n(X) = \psi_n^r, \quad -1\le\mu_n<0.
\tag{1.19b}
$$

This approximation can be obtained by integrating the one-group version of Eq. (1.16) over the $n$th angular bin, dividing by the width of the bin, and making simple approximations.

The above $S_N$ equations are a coupled system of first-order differential equations in $x$. The unknowns in this system are the angular fluxes $\psi_n(x)$ for the $n$th angular bin. In the absence of scattering ($\Sigma_s = 0$), the $S_N$ equations become uncoupled; particles cannot “jump” from one angular bin to another. (However, in curvilinear-geometry transport equations, angular “jumping” does occur, even in the absence of scattering.)

In two-dimensional (2-D) and three-dimensional (3-D) problems, two angular variables are needed to define the direction-of-flight variable $\boldsymbol{\Omega}$ on the unit sphere. In this case, a multidimensional angular quadrature set consists of a set of discrete directions (discrete ordinates) $\boldsymbol{\Omega}_n$, one for each angular bin, and a corresponding set of angular weights $w_n$ that define the area (on the unit sphere) of each bin.
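The properties required of a 1-D quadrature set can be verified numerically. The sketch below is an illustration, assuming the Gauss–Legendre set as one widely used choice of $\{(\mu_n, w_n)\}$; the helper names are hypothetical. It builds the set with NumPy's `leggauss`, reconstructs the bin edges $\mu_{n+1/2}$ from the weights, and confirms that each ordinate lies inside its own bin.

```python
import numpy as np


def gauss_legendre_set(order):
    """Return ordinates mu_n and weights w_n on [-1, 1], in increasing mu."""
    if order % 2:
        raise ValueError("use an even order so that mu = 0 is not an ordinate")
    mu, w = np.polynomial.legendre.leggauss(order)
    return mu, w


def bin_edges(w):
    """Bin edges mu_{n+1/2}, n = 0..N, implied by w_n = mu_{n+1/2} - mu_{n-1/2}."""
    return np.concatenate(([-1.0], -1.0 + np.cumsum(w)))


mu, w = gauss_legendre_set(8)
edges = bin_edges(w)

print("ordinates:", mu)
print("weights  :", w)
print("bin edges:", edges)

# Quadrature check on a simple angular integral: int_{-1}^{1} mu^2 dmu = 2/3.
second_moment = float(np.dot(w, mu**2))
print("sum w_n mu_n^2 =", second_moment)
```

The weighted sum in the last lines is the same operation that replaces the scattering integral of Eq. (1.16) by the sum over $n'$ in Eq. (1.18).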
Like implicit time differencing and the multigroup approximation, the $S_N$ approximation, applied to a Cartesian-geometry equation such as Eq. (1.16), is inherently positive: if the cross sections and sources in Eqs. (1.18) through (1.19) are non-negative, then the $S_N$ angular fluxes are non-negative.

The positivity of the $S_N$ approximation for 1-D planar and other Cartesian geometries does not automatically apply in curvilinear geometries. For example, let us consider 1-D spherical geometry problems, in which the cross sections depend only on the radial variable $r = |\mathbf{r}| = (x^2 + y^2 + z^2)^{1/2}$ and the sources depend spatially on $r$ and angularly on

$$
\mu = \boldsymbol{\Omega}\cdot\frac{\mathbf{r}}{r}.
\tag{1.20}
$$

In such problems, the angular flux depends spatially only on $r$ and angularly only on $\mu$, and the appropriate (one-group) 1-D spherical geometry transport equation is

$$
\mu\frac{\partial\psi}{\partial r}(r,\mu) + \frac{1-\mu^2}{r}\frac{\partial\psi}{\partial\mu}(r,\mu) + \Sigma_t(r)\,\psi(r,\mu)
= \int_{-1}^{1}\Sigma_s(r,\mu,\mu')\,\psi(r,\mu')\,d\mu' + \frac{Q(r)}{2}.
\tag{1.21}
$$

Now, an angular derivative term occurs on the left side of the LBE. This term is present because, by Eq. (1.20), the spherical geometry variable $\mu$ is space-dependent; as a particle streams through the system, its angular variable $\mu$ changes continuously. The new complication is that a successful $\mu$-discretization must accurately treat the integral scattering term on the right side of the equation and the angular derivative term on the left side.

We now describe the original $S_N$ approximation to Eq. (1.21). First, we multiply by $r^2$ and rearrange to obtain the conservative form of the LBE:

$$
\mu\frac{\partial}{\partial r}\left[r^2\psi(r,\mu)\right] + r\frac{\partial}{\partial\mu}\left[(1-\mu^2)\,\psi(r,\mu)\right] + \Sigma_t(r)\,r^2\psi(r,\mu)
= \int_{-1}^{1}\Sigma_s(r,\mu,\mu')\,r^2\psi(r,\mu')\,d\mu' + \frac{r^2 Q(r)}{2}.
\tag{1.22}
$$

(Note that by integrating this equation over $-1 \le \mu \le 1$, the angular derivative term automatically vanishes. This fact implies that the angular redistribution of particles as the particles freely stream is conservative: it neither adds nor subtracts particles from the system.) We integrate Eq.
(1.22) over the $n$th angular bin $\mu_{n-1/2} < \mu < \mu_{n+1/2}$ and divide by the bin width $w_n$. After making simple approximations, we obtain

$$\mu_n \frac{\partial}{\partial r}\left[ r^2 \psi_n(r) \right] + \frac{r}{w_n}\left[ \beta_{n+1/2}\,\psi_{n+1/2}(r) - \beta_{n-1/2}\,\psi_{n-1/2}(r) \right] + \Sigma_t(r)\, r^2 \psi_n(r) = \sum_{n'=1}^{N} \Sigma_s(r,\mu_n,\mu_{n'})\, r^2 \psi_{n'}(r)\, w_{n'} + \frac{r^2 Q(r)}{2}, \tag{1.23a}$$

where

$$\beta_{n+1/2} = -2 \sum_{n'=1}^{n} \mu_{n'} w_{n'} \tag{1.23b}$$

is defined so that Eq. (1.23a) admits a constant solution for the infinite medium configuration in the presence of a constant source. In Eq. (1.23a), extra unknowns arise: the angular fluxes $\psi_{n\pm1/2}$ at the edges of the angular bins. Originally, these unknowns were treated by solving Eq. (1.23) in combination with the Diamond-Difference (DD) approximation in angle

$$\psi_n(r) = \frac{1}{2}\left[ \psi_{n-1/2}(r) + \psi_{n+1/2}(r) \right], \tag{1.24}$$

and the starting-direction calculation for $\psi_{1/2}$, which is obtained directly from Eq. (1.21) by setting $\mu = -1$ and assuming that $\partial\psi/\partial\mu$ at $\mu = -1$ is bounded,

$$-\frac{\partial \psi_{1/2}(r)}{\partial r} + \Sigma_t(r)\,\psi_{1/2}(r) = \sum_{n'=1}^{N} \Sigma_s(r,-1,\mu_{n'})\,\psi_{n'}(r)\,w_{n'} + \frac{Q(r)}{2}. \tag{1.25}$$

Equations (1.23) through (1.25) hold in (e.g.) a finite sphere $0 < r < R$, with boundary conditions

$$\psi_n(R) = \psi_n^{b}, \qquad \mu_n < 0, \tag{1.26a}$$

$$\psi_{1/2}(R) = \psi_{1/2}^{b}, \tag{1.26b}$$

which specify the fluxes entering the sphere through its outer boundary $r = R$.

It is apparent that for curvilinear geometries, in which angular derivatives occur, the $S_N$ approximation is more complicated than in Cartesian geometries. As a result, there are issues of accuracy in curvilinear geometries that do not occur in Cartesian geometries. We will discuss some of these issues later. Although the 1-D spherical geometry $S_N$ equations do not automatically yield positive angular fluxes, it is rare that these equations yield negative solutions given positive sources and cross sections, provided that the variation of the solution is resolved by the computational mesh.
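The role of the coefficients in Eq. (1.23b) can be checked numerically. The sketch below assumes a Gauss-Legendre set (an illustrative choice) and a hypothetical helper `angular_beta`. It verifies that a spatially and angularly constant flux makes the streaming and angular-redistribution terms of Eq. (1.23a) cancel, which requires $2\mu_n w_n + \beta_{n+1/2} - \beta_{n-1/2} = 0$.

```python
import numpy as np

def angular_beta(mu, w):
    """Edge coefficients beta_{1/2}, ..., beta_{N+1/2} from Eq. (1.23b).

    beta_{1/2} = 0 and beta_{n+1/2} = -2 * sum_{n' <= n} mu_{n'} w_{n'}.
    """
    return np.concatenate(([0.0], -2.0 * np.cumsum(mu * w)))

# Assumption: Gauss-Legendre S4 set, ordered from mu = -1 toward mu = +1.
mu, w = np.polynomial.legendre.leggauss(4)
beta = angular_beta(mu, w)

# For constant psi, d(r^2 psi)/dr = 2 r psi, so the two derivative terms
# cancel only if this residual vanishes for every ordinate.
residual = 2.0 * mu * w + np.diff(beta)
print("beta     :", beta)
print("residual :", residual)
```

For a set that is symmetric about $\mu = 0$, the last coefficient $\beta_{N+1/2}$ also vanishes, mirroring the factor $1 - \mu^2$ in Eq. (1.22), which is zero at both $\mu = \pm 1$.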
Because the cost of a calculation increases with the quadrature order $N$, there is an incentive to use angular quadrature sets with the fewest angles that yield the desired accuracy. It has long been known that, for most 1-D problems, the even-order Gauss–Legendre quadrature sets are optimal (or nearly so). However, for multidimensional problems, the choice of angular quadrature sets is more ambiguous.

Next, we discuss spatial discretizations for the one-group $S_N$ Eqs. (1.18) and (1.19). As with the previously discussed variables, we must specify points $\{0 = x_{1/2} < x_{3/2} < \cdots < x_{j+1/2} < \cdots < x_{J+1/2} = X\}$ that define the edges of spatial cells, the $j$th spatial cell being the interval $x_{j-1/2} < x < x_{j+1/2}$. The cell width is $\Delta x_j = x_{j+1/2} - x_{j-1/2}$. Within each ($j$th) spatial cell, the cross sections are constant: $\Sigma_t(x) = \Sigma_{t,j}$, etc. The cell-edge angular fluxes are defined as $\psi_{n,j\pm1/2} = \psi_n(x_{j\pm1/2})$, and the cell-average angular fluxes are defined as

$$\psi_{n,j} = \frac{1}{\Delta x_j} \int_{x_{j-1/2}}^{x_{j+1/2}} \psi_n(x)\,dx. \tag{1.27}$$

Integrating Eq. (1.18) over the $j$th spatial cell and dividing by $\Delta x_j$, we obtain exactly the balance equation:

$$\frac{\mu_n}{\Delta x_j}\left( \psi_{n,j+1/2} - \psi_{n,j-1/2} \right) + \Sigma_{t,j}\,\psi_{n,j} = \sum_{n'=1}^{N} \Sigma_{s,j}(\mu_n,\mu_{n'})\,\psi_{n',j}\,w_{n'} + \frac{Q_j}{2}, \tag{1.28}$$

which relates the cell-edge and cell-average angular fluxes, and holds for each spatial cell $1 \le j \le J$ and each direction $1 \le n \le N$. These equations and the boundary conditions

$$\psi_{n,1/2} = \psi_n^{l}, \qquad 0 < \mu_n < 1, \tag{1.29a}$$

$$\psi_{n,J+1/2} = \psi_n^{r}, \qquad -1 < \mu_n < 0, \tag{1.29b}$$

yield a system of $N(J+1)$ equations for the $N(2J+1)$ cell-average and cell-edge unknown fluxes. To obtain a discrete system with the same number of equations as unknowns, $NJ$ extra equations must be formulated. One way to formulate the extra equations is to specify finite difference relationships between the cell-edge and cell-average fluxes within each cell.
If these relationships only directly couple fluxes traveling in the same direction, and if the infinite-medium spatially constant solutions are to be preserved, then one is led to algebraic relationships of the form

$$\psi_{n,j} = \frac{1 + \alpha_{n,j}}{2}\,\psi_{n,j+1/2} + \frac{1 - \alpha_{n,j}}{2}\,\psi_{n,j-1/2}, \qquad 1 \le n \le N, \quad 1 \le j \le J, \tag{1.30}$$

where the constants $\alpha_{n,j}$ are constrained by the goals of accuracy, stability, and symmetry about $\mu = 0$. The simplest choice, $\alpha_{n,j} = 0$, yields the diamond-difference scheme.

The discrete system (1.28) through (1.30) is the basis for the earliest planar-geometry $S_N$ codes. For the diamond-difference scheme, the numerical solution is strictly positive if the widths of the spatial cells are sufficiently small. Unfortunately, as the cell widths increase, diamond-difference solutions can become negative, and unphysical oscillatory behavior can occur. In multidimensional problems, positive diamond-difference solutions are never guaranteed; oscillatory behavior is always possible, for any spatial grid. These issues led to other choices of the constants $\alpha_{n,j}$, and nonlinear (negative flux fixup) modifications to Eq. (1.30). In practice, the diamond-difference scheme (sometimes with negative flux fixup) was found to generally work well for relatively "diffusive" and weakly absorbing reactor cores, while Weighted-Diamond (WD) techniques ($\alpha_{n,j} \ne 0$) generally worked well in more strongly absorbing reactor shields.

All of the independent variables in the LBE (time, energy, angle, and space) have now been discretized. Typically, a very large system of linear algebraic equations has been generated. How does one solve this system? In principle, one can write the entire discrete system as a single matrix equation $A\psi = q$ and invert the matrix $A$. However, $A$ is usually so large that this is impractical; thus, iterative techniques that avoid the explicit construction and direct inversion of $A$ must be used.
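The loss of positivity on coarse cells can be seen in a small sketch. It assumes the simplest setting (one direction $\mu > 0$, no source, no scattering, uniform cells), where combining Eq. (1.28) with Eq. (1.30) at $\alpha_{n,j} = 0$ gives $\psi_{j+1/2} = \psi_{j-1/2}\,(1 - \tau/2)/(1 + \tau/2)$ with $\tau = \Sigma_t \Delta x / \mu$. The function name `dd_sweep_absorber` and the numerical values are illustrative only.

```python
import numpy as np

def dd_sweep_absorber(sigma_t, dx, mu, n_cells, psi_in=1.0):
    """Diamond-difference cell-edge fluxes for a source-free pure absorber.

    Marches left to right (mu > 0). tau is the cell optical thickness
    measured along the direction of flight.
    """
    tau = sigma_t * dx / mu
    edges = [psi_in]
    for _ in range(n_cells):
        edges.append(edges[-1] * (1.0 - 0.5 * tau) / (1.0 + 0.5 * tau))
    return np.array(edges)

# Same slab (4 cm thick, sigma_t = 1 per cm, mu = 0.5) on two grids.
fine = dd_sweep_absorber(sigma_t=1.0, dx=0.5, mu=0.5, n_cells=8)    # tau = 1
coarse = dd_sweep_absorber(sigma_t=1.0, dx=2.0, mu=0.5, n_cells=2)  # tau = 4
print("fine grid  :", fine)
print("coarse grid:", coarse)
```

In this simplified setting the edge fluxes stay positive only while $\tau < 2$; on the coarse grid they alternate in sign, which is the oscillatory behavior described above.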
The most fundamental iterative approach, called Source Iteration (SI), is based on a simple concept. Let us suppose that the entire right side of Eq. (1.28) is known. Then for $\mu_n > 0$, it is easy to show that by starting at the leftmost ($j = 1$) spatial cell and recursively marching from left to right (in the direction of particle flow) across the system to the rightmost cell, one can explicitly solve Eqs. (1.28) through (1.30) for all cell-edge and cell-average unknowns that correspond to $\mu_n > 0$. Likewise, by marching across the system from right to left, one can solve Eqs. (1.28) through (1.30) for all cell-edge and cell-average unknowns that correspond to $\mu_n < 0$. In the absence of scattering, this marching (or transport sweep) procedure yields an explicit noniterative solution of the transport problem.

If the physical problem includes scattering, then the SI scheme can be implemented. SI begins with an estimate (typically, zero) for the scattering source on the right side of Eq. (1.28). Using this estimated source, Eqs. (1.28) through (1.30) are solved by a transport sweep. The resulting estimates of the cell-average angular fluxes are introduced into the scattering source, and the next iteration is ready to commence. These "source iterations" are repeated until the scalar fluxes converge.

Mathematically, the SI scheme for Eqs. (1.28) through (1.30) is defined by

$$\frac{\mu_n}{\Delta x_j}\left( \psi_{n,j+1/2}^{(\ell-1/2)} - \psi_{n,j-1/2}^{(\ell-1/2)} \right) + \Sigma_{t,j}\,\psi_{n,j}^{(\ell-1/2)} = \sum_{n'=1}^{N} \Sigma_{s,j}(\mu_n,\mu_{n'})\,\psi_{n',j}^{(\ell-1)}\,w_{n'} + \frac{Q_j}{2}, \tag{1.31a}$$

$$\psi_{n,j}^{(\ell-1/2)} = \frac{1 + \alpha_{n,j}}{2}\,\psi_{n,j+1/2}^{(\ell-1/2)} + \frac{1 - \alpha_{n,j}}{2}\,\psi_{n,j-1/2}^{(\ell-1/2)}, \tag{1.31b}$$

$$\psi_{n,1/2}^{(\ell-1/2)} = \psi_n^{l}, \qquad 0 < \mu_n < 1, \tag{1.31c}$$

$$\psi_{n,J+1/2}^{(\ell-1/2)} = \psi_n^{r}, \qquad -1 < \mu_n < 0, \tag{1.31d}$$

with the update equation

$$\psi_{n,j}^{(\ell)} = \psi_{n,j}^{(\ell-1/2)}. \tag{1.32}$$

Here, $\ell$ is the iteration index. The $\ell$th iteration begins with cell-average angular flux estimates $\psi_{n,j}^{(\ell-1)}$, which are introduced into the scattering source. Equations (1.31) are then solved by a transport sweep, and Eq.
(1.32) defines the new cell-average angular flux estimates to be those that were just calculated in the sweep.

If the initial scattering source is taken to be zero ($\psi_{n,j}^{(0)} = 0$), the SI angular flux estimate after $\ell$ sweeps is, physically, the angular flux due to particles that have experienced at most $\ell - 1$ scattering events. If the physical system has significant absorption or is optically thin (leaky), particles will generally have short histories, and the SI scheme will converge rapidly. However, if the physical system is optically thick with weak absorption (i.e., diffusive), then most particles will experience long histories, and the SI scheme will converge slowly.

For iteratively solving multigroup problems, the single-group SI scheme is the main building-block. If no upscattering is present, the equations for the highest-energy ($g = 1$) group are iterated to sufficient precision; then the equations for the next ($g = 2$) group are iterated to sufficient precision; this process continues to the lowest-energy ($g = G$) group, and the problem is then considered fully solved.

If upscattering is present, it is necessary to iterate on the upscattering sources: in the first outer iteration, the upscattering sources are estimated (typically set to zero) and the resulting downscattering problem is solved as described in the previous paragraph. The second outer iteration begins by updating the upscattering sources and continues with a second downward "sweep" through the groups. In this process, it is typical not to fully converge the scattering source within each group; instead, a user-specified number of transport sweeps is performed for each group. If the upscattering probability is small, this upscattering or outer iteration strategy will converge rapidly; if the upscattering probability is high, it will generally converge slowly.
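The single-group SI scheme of Eqs. (1.31) and (1.32) can be sketched in a few lines. This is an illustration under stated assumptions, not a production algorithm: isotropic scattering (so the scattering sum reduces to $\Sigma_s \phi_j / 2$), a homogeneous slab with vacuum boundaries, the diamond-difference choice $\alpha_{n,j} = 0$, and a Gauss-Legendre quadrature set. The function name, default parameters, and convergence test are hypothetical.

```python
import numpy as np

def source_iteration(sigma_t, sigma_s, q, width, n_cells=50, n_angles=8,
                     tol=1e-8, max_sweeps=10000):
    """One-group slab-geometry SI with diamond differencing.

    Returns the cell-average scalar flux and the number of sweeps used.
    """
    mu, w = np.polynomial.legendre.leggauss(n_angles)
    dx = width / n_cells
    phi = np.zeros(n_cells)                  # zero initial scattering source
    for sweep in range(1, max_sweeps + 1):
        src = 0.5 * (sigma_s * phi + q)      # right side of Eq. (1.31a)
        phi_new = np.zeros(n_cells)
        for m, wt in zip(mu, w):
            a = abs(m) / dx
            # March in the direction of particle flow.
            cells = range(n_cells) if m > 0 else range(n_cells - 1, -1, -1)
            psi_in = 0.0                     # vacuum boundary condition
            for j in cells:
                psi_out = (src[j] + psi_in * (a - 0.5 * sigma_t)) / (a + 0.5 * sigma_t)
                phi_new[j] += wt * 0.5 * (psi_in + psi_out)  # Eq. (1.31b), alpha = 0
                psi_in = psi_out
        change = np.max(np.abs(phi_new - phi))
        phi = phi_new                        # update step, Eq. (1.32)
        if change < tol * np.max(np.abs(phi)):
            return phi, sweep
    return phi, max_sweeps

# A 10-mean-free-path slab with unit source: absorbing versus scattering-dominated.
phi_abs, n_abs = source_iteration(sigma_t=1.0, sigma_s=0.1, q=1.0, width=10.0)
phi_dif, n_dif = source_iteration(sigma_t=1.0, sigma_s=0.9, q=1.0, width=10.0)
print("sweeps (c = 0.1):", n_abs)
print("sweeps (c = 0.9):", n_dif)
```

The scattering-dominated case should need noticeably more sweeps than the absorbing one, in line with the discussion of particle histories above.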
A widely used parameter for assessing the efficiency of an iterative scheme is the spectral radius $\rho$ of the iteration operator, defined as the asymptotic (for large $\ell$) ratio of the error in the $\ell$th iteration to the error in the $(\ell-1)$st iteration:

$$\rho = \lim_{\ell \to \infty} \frac{\left\| \psi^{(\ell)} - \psi \right\|}{\left\| \psi^{(\ell-1)} - \psi \right\|}. \tag{1.33a}$$

An equivalent definition, which allows $\rho$ to be estimated without knowing the limit of the sequence of iterates, $\psi$, is given as follows:

$$\rho = \lim_{\ell \to \infty} \frac{\left\| \psi^{(\ell)} - \psi^{(\ell-1)} \right\|}{\left\| \psi^{(\ell-1)} - \psi^{(\ell-2)} \right\|}. \tag{1.33b}$$

A rapidly converging scheme will have $\rho \ll 1$; a slowly converging scheme will have $\rho < 1$ and $\rho \approx 1$; and a nonconverging scheme will have $\rho \ge 1$. A Fourier analysis, to be discussed later, shows that the SI scheme applied to a one-group transport equation in an infinite homogeneous medium has the spectral radius

$$\rho = \frac{\Sigma_s}{\Sigma_t} \equiv c \quad \text{(the scattering ratio)}. \tag{1.34}$$

Therefore, the SI scheme applied to one-group problems always converges; but if $c$ is arbitrarily close to unity (the probability of absorption is arbitrarily small) and the system becomes increasingly thick, SI will converge arbitrarily slowly. (As $c \to 1$, increasingly many particles will undergo increasingly many scattering events during their histories.) Similarly, it can be shown that the SI scheme applied to multigroup problems, as described above, always converges; but if the probability of upscattering is high, then the spectral radius will again be close to unity.

The SI scheme is the most fundamental iteration scheme for solving discretized particle transport problems. All practical iterative schemes today use SI as a principal ingredient in the iteration procedure. The SI method is guaranteed to converge, but for many important problems it converges slowly. Improvements to SI involve modifications to Eq. (1.32) that are intended to improve the efficiency of the SI scheme (reducing the spectral radius) for diffusive problems, or for problems with significant upscatter. The first widely used improvement to SI was the rebalance scheme.
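The spectral-radius definitions in Eqs. (1.33b) and (1.34) can be illustrated with a deliberately simple model. The sketch assumes the spatially flat mode of an infinite homogeneous medium with isotropic scattering, for which one SI sweep reduces to $\phi^{(\ell)} = c\,\phi^{(\ell-1)} + Q/\Sigma_t$. The function names are hypothetical, and the sweep-count estimate $\ln\varepsilon / \ln\rho$ is the usual asymptotic rule of thumb rather than a result quoted from the text.

```python
import math

def infinite_medium_si(c, q_over_sigma_t=1.0, n_iter=30):
    """Scalar SI iterates for the flat mode, starting from phi^(0) = 0."""
    phi = [0.0]
    for _ in range(n_iter):
        phi.append(c * phi[-1] + q_over_sigma_t)
    return phi

def estimate_spectral_radius(iterates):
    """Eq. (1.33b) evaluated with the last three iterates."""
    older, old, new = iterates[-3:]
    return abs(new - old) / abs(old - older)

def sweeps_needed(rho, reduction):
    """Sweeps required to shrink the error by `reduction`, assuming a
    constant error ratio rho per sweep."""
    return math.ceil(math.log(reduction) / math.log(rho))

rho_est = estimate_spectral_radius(infinite_medium_si(c=0.9))
print("estimated spectral radius:", rho_est)
print("sweeps for 1e-6 at rho = 0.5 :", sweeps_needed(0.5, 1e-6))
print("sweeps for 1e-6 at rho = 0.99:", sweeps_needed(0.99, 1e-6))
```

The estimate recovers the scattering ratio used to generate the iterates, and the sweep counts show how quickly the cost grows as $\rho$ approaches unity.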
In fine-mesh rebalance, Eq. (1.32) is replaced by

$$\psi_{n,j}^{(\ell)} = \psi_{n,j}^{(\ell-1/2)}\,F_j^{(\ell)}, \tag{1.35}$$

where the rebalance factors $F_j^{(\ell)}$ are determined by requiring that $\psi_{n,j}^{(\ell)}$ and

$$\psi_{n,j+1/2}^{(\ell)} = \begin{cases} \psi_{n,j+1/2}^{(\ell-1/2)}\,F_j^{(\ell)}, & \mu_n > 0, \\[4pt] \psi_{n,j+1/2}^{(\ell-1/2)}\,F_{j+1}^{(\ell)}, & \mu_n < 0, \end{cases} \tag{1.36}$$

satisfy Eq. (1.28), integrated over $\mu_n$. [Equations (1.35) and (1.36) show that the rebalance factor for the $j$th cell applies to all the cell-average fluxes within this cell and all the cell-edge fluxes that correspond to exiting directions from the cell.] This procedure yields a tridiagonal system of equations for $F_j^{(\ell)}$; the updated fluxes satisfy the correct angle-integrated balance equation on each spatial cell.

Coarse-mesh rebalance operates by the same principle, except that now the $J$ fine-mesh cells are clustered into a smaller number of coarse-mesh cells. (For example, two adjacent fine-mesh cells could be clustered into one coarse-mesh cell, yielding $J/2$ coarse-mesh cells.) Coarse-mesh rebalance factors are defined on the coarse-mesh spatial cells; any cell-average angular flux within a given coarse-mesh cell is updated multiplicatively by the same coarse-mesh rebalance factor.

The same concept can be applied to multigroup problems. Here, fine-mesh rebalance factors have one rebalance factor for each spatial cell and energy group. Coarse-mesh rebalance factors correspond to multiple adjacent fine-mesh spatial cells and/or multiple energy groups.

In practice, rebalance is often inefficient, even when it is implemented optimally. For most problems, fine-mesh rebalance is unstable (divergence occurs), while coarse-mesh rebalance with a sufficiently coarse grid is stable and provides acceleration. However, if the coarse grid is too coarse, then coarse-mesh rebalance becomes ineffective [56].
Thus, the application of rebalance can be time-consuming, in terms of both the cost of setting up the problem (determining the optimal coarse mesh) and the resulting inefficiency (poor iterative performance after the coarse mesh is chosen). In spite of these deficiencies, rebalance was used widely for many years because it was still more efficient than Source Iteration.

To summarize, the discretization methods described above are finite-difference in nature, and the resulting truncation errors are guaranteed to be small only when the discretization grid is "fine." The implicit time discretization is accurate when a time step is small compared to a mean free time ($\Delta t \ll (v\Sigma_t)^{-1}$, where $v$ = neutron speed). We have already discussed the accuracy of the multigroup approximation in energy; depending on the problem, as few as two groups, or as many as over 100 can be required. Angular discretizations can also be difficult; if the solution of a specified problem has a weak angle-dependence, then a low-order quadrature set may be sufficient. However, if strong angular effects are present, due to strongly absorbing regions or voids with streaming, then high-order angular quadrature sets are required. Finally, the finite-difference spatial differencing schemes discussed above require, for reliable accuracy, that spatial cells be optically thin (in 1-D, $\Sigma_t \Delta x \ll 1$).

Thus, the state of the art in production particle transport codes in the late 1960s consisted of the following: implicit differencing in time, the multigroup approximation in energy, the $S_N$ approximation in angle, and finite-difference discretizations in space requiring optically thin and rectangular (orthogonal) spatial grids. The iteration methods that existed to solve these discrete problems were often inefficient. The production codes in which these methods were implemented were 1-D or 2-D; no major 3-D codes existed.
1.3 Three Challenging Physical Problems

Next, we discuss three important physical problems that could not be simulated realistically by the numerical techniques described in Section 1.2, but that can be meaningfully simulated using the larger and faster computers and the more sophisticated $S_N$ algorithms that are available today. These problems are:

• Thermal radiation transport. This problem requires accuracy in strongly absorbing media, accuracy in the optically thick diffusion limit with both resolved and unresolved boundary layers, and robust iteration acceleration.
• Charged-particle transport. This problem requires accurate treatment of highly forward-peaked scattering, accurate treatment of scattering with extremely small energy losses, and robust iteration acceleration.
• Oil-well logging tool design. This problem requires efficient treatment of extremely complex 3-D geometries with vastly different scale lengths in a single problem.

1.3.1 Thermal Radiation Transport in the Stellar Regime

Thermal radiation transport in the stellar regime is important in astrophysical applications. Hence, some of the most significant early contributions to transport theory came from astrophysicists, e.g., Eddington [1] and Chandrasekhar [2]. For stellar modeling applications, the thermal radiation transport equation is usually coupled to hydrodynamic equations [17, 39], but in this exposition we ignore hydrodynamic coupling. We caution the reader not to confuse infrared transport with thermal radiation transport in the stellar regime. Although both constitute thermal (photon) radiation transport, the stellar regime corresponds to high stellar temperatures and photon energies characteristic of X-rays, while the infrared regime corresponds to low terrestrial temperatures and infrared photon energies. From a numerical viewpoint, infrared calculations are much more benign.
In thermal radiation transport, high-energy photons transport through and nonlinearly interact with a host medium (a plasma). The photons can scatter, or they can be absorbed in the medium. If they are absorbed, their energy goes into the medium, increasing its temperature. The medium also emits photons (via a temperature-dependent Planck function), causing its temperature to decrease. The equations of thermal radiation transport consist of a transport equation for the angular intensity of photons $I(\mathbf{r}, \boldsymbol{\Omega}, E, t)$,

$$\frac{1}{c} \frac{\partial I}{\partial t} + \boldsymbol{\Omega} \cdot \nabla I + \Sigma_t I = \frac{1}{4\pi} \Sigma_s \phi + \Sigma_a B, \tag{1.37}$$

and an energy-balance equation for the material temperature $T(\mathbf{r}, t)$,

$$C_v \frac{\partial T}{\partial t} = \int_0^\infty \Sigma_a \left[ \phi - 4\pi B \right] dE. \tag{1.38}$$

Here, $c$ is the speed of light, $\Sigma_s(\mathbf{r}, E, T)$ is the macroscopic Thomson scattering cross section, $\Sigma_a(\mathbf{r}, E, T)$ is the macroscopic absorption cross section, $\Sigma_t(\mathbf{r}, E, T) = \Sigma_a + \Sigma_s$ is the macroscopic total cross section, $\phi(\mathbf{r}, E, t)$ is the angular intensity integrated over all directions, $C_v(\mathbf{r}, T)$ is the material heat capacity, and $B(E, T)$ is the Planck function:

$$B(E, T) = \frac{2E^3}{h^3 c^2} \left[ \exp\left( \frac{E}{kT} \right) - 1 \right]^{-1}, \tag{1.39}$$

where $h$ is Planck's constant and $k$ is Boltzmann's constant. The primary unknowns are the angular intensity $I(\mathbf{r}, \boldsymbol{\Omega}, E, t)$, which is a photon energy flux rather than a number flux, and the material temperature $T(\mathbf{r}, t)$.

Because Eqs. (1.37) and (1.38) are nonlinear in $T$, they are generally solved in each time step via Newton's method. To illustrate this technique, we implicitly difference the equations in time:

$$\frac{1}{c\,\Delta t^k}\left( I^{k+1/2} - I^{k-1/2} \right) + \boldsymbol{\Omega} \cdot \nabla I^{k+1/2} + \Sigma_t^{k+1/2} I^{k+1/2} = \frac{1}{4\pi} \Sigma_s^{k+1/2} \phi^{k+1/2} + \Sigma_a^{k+1/2} B^{k+1/2}, \tag{1.40}$$

$$\frac{C_v^{k+1/2}}{\Delta t^k}\left( T^{k+1/2} - T^{k-1/2} \right) = \int_0^\infty \Sigma_a^{k+1/2} \left[ \phi^{k+1/2} - 4\pi B^{k+1/2} \right] dE. \tag{1.41}$$

We let $T^*$ denote the latest Newton iterate for the temperature.
Then the linearized equations for the next Newton iteration are obtained by evaluating the material properties at $T^*$ and linearly expanding the temperature-dependent Planck function about $T^*$:

$$B^{k+1/2} = B^* + \frac{\partial B^*}{\partial T}\left( T^{k+1/2} - T^* \right), \tag{1.42}$$

where a superscript "$*$" denotes a quantity evaluated at $T^*$. With the above expansion, the material temperature can be eliminated from the transport equation. Suppressing the temporal superscript "$k + 1/2$", we express the linearized temporally differenced transport equation as follows:

$$\boldsymbol{\Omega} \cdot \nabla I + \Sigma_\tau^* I = \frac{1}{4\pi} \Sigma_s^* \phi + \frac{1}{4\pi}\,\nu \chi \int_0^\infty \Sigma_a^*(E')\,\phi(E')\,dE' + \xi, \tag{1.43}$$

where

$$\Sigma_\tau^* = \Sigma_t^* + \tau, \tag{1.44a}$$

$$\tau = \frac{1}{c\,\Delta t^k}, \tag{1.44b}$$

$$\nu = \frac{\displaystyle \int_0^\infty \Sigma_a^*(E)\,\frac{\partial B^*(E)}{\partial T}\,4\pi\,dE}{\displaystyle \frac{C_v^*}{\Delta t^k} + \int_0^\infty \Sigma_a^*(E)\,\frac{\partial B^*(E)}{\partial T}\,4\pi\,dE}, \tag{1.44c}$$

$$\chi(E) = \frac{\displaystyle \Sigma_a^*(E)\,\frac{\partial B^*(E)}{\partial T}}{\displaystyle \int_0^\infty \Sigma_a^*(E')\,\frac{\partial B^*(E')}{\partial T}\,dE'}, \tag{1.44d}$$

$$\xi = \Sigma_a^* B^* + \tau I^{k-1/2} - \frac{1}{4\pi}\,\nu \chi \left[ \int_0^\infty \Sigma_a^*(E')\,4\pi B^*(E')\,dE' - \frac{C_v^*}{\Delta t^k}\left( T^{k-1/2} - T^* \right) \right], \tag{1.44e}$$

and the material temperature is given by

$$T^{k+1/2} = T^* + \frac{\displaystyle \int_0^\infty \Sigma_a^*(E)\left[ \phi(E) - 4\pi B^*(E) \right] dE + \frac{C_v^*}{\Delta t^k}\left( T^{k-1/2} - T^* \right)}{\displaystyle \frac{C_v^*}{\Delta t^k} + \int_0^\infty \Sigma_a^*(E)\,\frac{\partial B^*(E)}{\partial T}\,4\pi\,dE}. \tag{1.45}$$

Equation (1.45) is used to calculate the temperatures after the linearized transport Eq. (1.43) has been solved. Equation (1.43) has the form of the steady-state neutron transport equation, with $\Sigma_a^*$ acting as the fission cross section, $\nu$ acting as the number of neutrons per fission, $\chi$ acting as the fission spectrum, and $\xi$ acting as the inhomogeneous source. Also, we note that the scattering process is monochromatic, i.e., photons do not change energy when they scatter. The spectral radius for the inner iteration process (the iteration on a within-group scattering source) is the usual scattering ratio, $c = \Sigma_s / \Sigma_t$, and the spectral radius for the outer iteration process (the iteration on the absorption or effective "fission" rate) is given by

$$\rho_0 = \nu \int_0^\infty \frac{\Sigma_a^*}{\Sigma_a^* + \tau}\,\chi\,dE. \tag{1.46}$$

The spectral radii observed in a practical problem are the maximum values of $c$ and $\rho_0$, respectively, evaluated over the problem domain.
Any region in which $\rho_0$ is close to unity will be diffusive. The diffusion limit for thermal radiation transport is characterized by

$$I = B(T), \tag{1.47}$$

$$\left( C_v + 4aT^3 \right) \frac{\partial T}{\partial t} - \nabla \cdot \frac{4acT^3}{3\Sigma_r} \nabla T = 0, \tag{1.48}$$

where $a = 8\pi^5 k^4 / (15 h^3 c^3)$, and $\Sigma_r$ is the Rosseland-averaged total cross section:

$$\Sigma_r = \frac{\displaystyle \int_0^\infty \frac{\partial B(E)}{\partial T}\,dE}{\displaystyle \int_0^\infty \frac{1}{\Sigma_t(E)}\,\frac{\partial B(E)}{\partial T}\,dE}. \tag{1.49}$$

The linearized thermal radiation transport equation is analogous to a particular form of the neutron transport equation, but it is generally much more difficult to accurately and efficiently simulate than the neutron transport equation for the following reasons:

• Within a specific material region, the cross sections $\Sigma_a$ and $\Sigma_s$ can vary with energy by six or more orders of magnitude.
• $\Sigma_a$ and $\Sigma_s$ can change in space by six or more orders of magnitude across a material interface.
• Problems often contain both optically thin regions and optically thick regions. Optically thick regions can be strongly absorbing or highly diffusive.
• In diffusive regions, the spatial scale length of the solution can have a thickness of many mean free paths (e.g., thousands or more).
• The spectral radius of the outer iteration process can be very close to unity, e.g., $\rho_0 \approx 0.9999$.

For these reasons, the numerical requirements for thermal radiation transport are severe. In particular, spatial differencing schemes must be highly damped in strongly absorbing media, accurate in optically thin regions, have the thick diffusion limit (produce the correct diffusion solution in diffusive regions where the spatial grid is optically thick), and behave well with unresolved spatial boundary layers. A special diffusion-synthetic acceleration-like scheme, known as the linear multifrequency-grey method [42, 50, 77], is used to accelerate the outer iterations. The numerical implementation of this scheme must be very robust because there are often huge spatial discontinuities in the cross sections.
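As a brief aside on where the term $4aT^3$ in Eq. (1.48) comes from, the sketch below numerically integrates the Planck function of Eq. (1.39) and compares $(4\pi/c)\int_0^\infty B\,dE$ with $aT^4$, whose temperature derivative is $4aT^3$. That the two agree is a standard identity rather than a statement taken from the text. SI units, the CODATA constant values, the temperature, and the integration grid are all choices made here for illustration.

```python
import numpy as np

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def planck(E, T):
    """Eq. (1.39): B(E, T) = 2 E^3 / (h^3 c^2) / (exp(E / kT) - 1)."""
    return 2.0 * E**3 / (H**3 * C**2) / np.expm1(E / (K * T))

def radiation_energy_density(T, n=200001, x_max=60.0):
    """(4 pi / c) times the energy integral of B, by the trapezoid rule.

    The grid runs over x = E / kT from nearly 0 to x_max; the integrand is
    negligible beyond that range.
    """
    E = np.linspace(1e-8, x_max, n) * K * T
    B = planck(E, T)
    integral = np.sum(0.5 * (B[1:] + B[:-1]) * np.diff(E))
    return 4.0 * np.pi * integral / C

a = 8.0 * np.pi**5 * K**4 / (15.0 * H**3 * C**3)   # radiation constant
T = 1.0e6                                          # kelvin, illustrative only
u = radiation_energy_density(T)
print("a (J m^-3 K^-4):", a)
print("ratio u / (a T^4):", u / (a * T**4))
```

The ratio should be very close to one, so in the diffusion limit $I = B(T)$ the radiation field contributes a heat capacity of $4aT^3$ alongside the material $C_v$.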
In multidimensional calculations, the linear multifrequency-grey method must be recast as a preconditioner and used in conjunction with a Krylov solution technique.

New algorithmic technologies that have made accurate and efficient thermal radiation transport calculations possible include lumped discontinuous finite-element temporal and spatial discretizations, diffusion-synthetic acceleration of the inner iterations, and linear multifrequency-grey acceleration of the outer iterations. Fourier analysis has played a major role in determining the effectiveness of acceleration schemes, and asymptotic analysis has played a major role in determining the behavior of finite-element spatial differencing schemes in the thick diffusion limit. While 1-D methods are fairly mature [77], multidimensional methods are still a topic of research. Currently, preconditioned Krylov methods are having a major impact on the efficiency of multidimensional calculations. In subsequent sections, we discuss all of these new techniques.

The numerical technology of 1968 was completely inadequate for thermal radiation transport calculations in the stellar regime. The spatial differencing schemes of that time were not accurate in optically thick, diffusive problems, and the iterative acceleration techniques available then were not sufficiently efficient and robust.

1.3.2 Charged-Particle Transport

Charged-particle transport problems are described by Eq. (1.1) with $\Sigma_f = 0$ (charged particles do not yield fission). However, charged particles can generate secondary particles of various sorts, which, for simplicity and without loss of generality, we neglect in our equations. Charged-particle problems are challenging because the mean free path, the average scattering angle, and the associated average energy loss are generally very small.
To deal with this, the nearly singular components of the differential scattering cross section are often split off from the nonsingular ("smooth") components and approximated by Fokker–Planck operators, yielding the approximate Boltzmann–Fokker–Planck (BFP) equation [37]:

$$\boldsymbol{\Omega} \cdot \nabla \psi + \Sigma_{t,sm}\,\psi = \int_0^\infty \int_{4\pi} \Sigma_{s,sm}(\mu_0, E' \to E)\,\psi(\boldsymbol{\Omega}', E')\,d\Omega'\,dE' + \frac{\Sigma_{r,tr}}{2}\left[ \frac{\partial}{\partial \mu}\left( 1 - \mu^2 \right)\frac{\partial \psi}{\partial \mu} + \frac{1}{1 - \mu^2}\,\frac{\partial^2 \psi}{\partial \omega^2} \right] + \frac{\partial (\beta_r \psi)}{\partial E} + Q. \tag{1.50}$$

Here, the direction vector $\boldsymbol{\Omega}$ is defined in terms of the cosine of the polar angle $\mu$ and the azimuthal angle $\omega$. Also, $\Sigma_{t,sm}$ is the "smooth" total cross section, $\Sigma_{s,sm}$ is the "smooth" differential scattering cross section, $\Sigma_{r,tr}$ is the restricted transport-corrected scattering cross section (also called the restricted momentum transfer), and $\beta_r$ is the restricted stopping power.

The differential (in $\mu$, $\omega$, and $E$) operators on the right side of Eq. (1.50) are Fokker–Planck operators in the angular and energy variables. As described above, the complete scattering cross section in Eq. (1.50) has been decomposed into "singular" and "smooth" components. The singular (differential) terms represent the part of the scattering cross section associated with highly forward-peaked scattering and small energy losses, while the smooth (integral) term represents the part of the scattering that is not singular. The cross sections and Fokker–Planck coefficients in Eq. (1.50) are said to be "restricted" because each is restricted to one component of the scattering.

The angular Fokker–Planck operator on the right side of Eq. (1.50) is called the continuous-scattering operator. It represents an asymptotic limit of the Boltzmann scattering operator in which the scattering cross section $\Sigma_s$ becomes unbounded and the average cosine of the scattering angle $\bar{\mu}_0$ approaches unity in such a manner that the momentum transfer, $\alpha = \Sigma_s (1 - \bar{\mu}_0)$, remains constant.
Thus, particles scatter more often per unit pathlength but undergo a smaller average change in direction, in such a way that the mean change in direction cosine per unit pathlength remains constant. This limit results in a continuous-scattering process. In fact, it rigorously represents diffusion on the unit sphere. The continuous-scattering operator in Eq. (1.50) is the Laplacian operator on the unit sphere multiplied by the effective diffusion coefficient $\Sigma_{r,tr}/2$.

The energy Fokker–Planck operator on the right side of Eq. (1.50) is called the continuous-slowing-down operator. It represents an asymptotic limit of the Boltzmann scattering operator in which the scattering cross section $\Sigma_s$ becomes unbounded and the average energy loss per scatter $\Delta E$ goes to zero in such a manner that the stopping power, $\beta = \Sigma_s \Delta E$, remains constant. Thus, particles scatter more often per unit pathlength but undergo a smaller average energy loss per scatter, such that the average energy loss per unit pathlength remains constant. This limit yields a process in which particles continuously lose energy per unit pathlength at a rate given by the stopping power.

It is not always necessary to split the singular components off the scattering operator and approximate them by Fokker–Planck operators. In 1-D slab and spherical geometries, one can obtain extremely accurate discrete approximations for the full scattering source (singular + smooth), even when the Legendre expansion for the scattering cross section is highly truncated; one need simply use a $P_{N-1}$ cross-section expansion, together with a Gauss $S_N$ quadrature set [54]. However, in 2-D and 3-D calculations, special techniques must be used to achieve the accuracy associated with 1-D Gauss quadratures. Special discretizations of the continuous-scattering operator can be very useful for obtaining a positive scattering source representation [40, 55, 120].
It is difficult to simultaneously maintain accuracy and scattering source positivity with highly truncated Legendre cross-section expansions.

Charged-particle calculations are numerically difficult for the following reasons:

• If the LBE is not approximated by the BFP equation, Legendre expansions for charged-particle scattering cross sections can require thousands of terms to converge.
• The scattering ratio can be very close to unity, resulting in very slow convergence of the source iteration process.
• Diffusion-synthetic acceleration improves convergence, but because of the highly forward-peaked scattering, it does not necessarily bound the spectral radius away from unity. Some higher-order Legendre moments of the angular flux must also be accelerated to bound the spectral radius away from unity.
• Multigroup-like methods for treating the continuous-slowing-down term generally result in a great deal of numerical diffusion in energy. This loss of accuracy can make it difficult to calculate energy spectra. Higher-order (in energy) methods are more accurate than multigroup-like methods, but the energy spectra can still be difficult to converge. (This is a greater difficulty for [heavy] ions than for [light] electrons.)
• Strong boundary layers can exist at material interfaces if secondary particle generation occurs. For instance, near interfaces between low-Z and high-Z materials, X-ray photons can create low energy electrons with strong boundary layers.
• Different particle species in a coupled calculation may have vastly different scale lengths for their solutions. For instance, photons can generate electrons that have much smaller mean free paths than the photons.

Although it was adequate for simple demonstration calculations, the numerical transport technology of 1968 was not adequate for realistic charged-particle calculations.
Some of the new technologies that have strongly impacted charged-particle transport are discontinuous finite-element methods (DFEMs) in both space and energy. DFEM discretizations are essential for robustness. For instance, if one is interested in bulk energy deposition rather than the anomalous deposition that occurs very near high-Z/low-Z interfaces, DFEM space-energy methods can yield accurate bulk energy deposition, even with completely unresolved boundary layers. Furthermore, DFEM energy discretizations are much more accurate than multigroup-like methods when energy spectra are of interest, or when energy and charge deposition are of interest in deep penetration problems [47]. Diffusion-synthetic acceleration can significantly reduce running times, and methods that accelerate higher-order moments can have an even greater impact [36, 62]. The Galerkin quadrature method, which is compatible with standard Legendre cross-section expansions, has made it possible to accurately treat highly anisotropic scattering in 2-D and 3-D calculations [54].

One-dimensional methods for coupled electron–photon transport are quite mature [58], but 2-D and 3-D methods remain a research topic for charged-particle transport calculations of all types. A difficult problem of great interest is that of pencil-beam sources in 2-D cylindrical and 3-D geometries. Acceleration of the high-order scattering source moments can be efficiently done in 1-D [62], but this task has proven to be very difficult in 2-D and 3-D [93]. The application of the $S_N$ method to multidimensional charged-particle transport calculations is just beginning to occur [101]. This technology will probably be slow to gain widespread use in production settings because of the high level of expertise required to apply the $S_N$ method to charged-particle transport problems, and because of the historical preference for Monte Carlo rather than deterministic methods within the charged-particle transport community.
Nonetheless, the SN method has the potential to be much more efficient than Monte Carlo for a wide variety of applications, including space shielding, accelerator shielding, radiation effects on electronics, and radiation cancer therapy.

1.3.3 Oil-Well Logging Tool Design

Oil-well logging describes the measuring of physical characteristics of rock formations surrounding a borehole using various types of instruments [48]. Data acquired by an instrument is recorded as a function of its depth in the borehole, resulting in a scroll-like plot called a log. Several types of logging tools based on electromagnetic, acoustical, mechanical, and nuclear radiation physics exist. Each tool type measures specific properties of rock formations, which provide information on the oil content of the formations. For instance, a nuclear neutron–neutron tool (a neutron source with a neutron detector) measures the porosity of the rock, while a gamma–gamma tool (a gamma-ray source with a gamma-ray detector) measures the density of the rock. Nuclear tools are generally operated with the tool pressed against the side of the borehole, rather than being centered in the borehole; this makes design calculations inherently three-dimensional.

Nuclear tools function by emitting radiation from the source in the tool into the rock formation. The source particles, or secondary particles generated by the source particles, are then detected within the tool. (Each nuclear tool has two detectors: one "close" to the source, and one "far" from the source.) Finally, the intensity and energy spectra of the detected radiation are interpreted to yield information about the rock formation surrounding the borehole [48].

Monte Carlo methods have traditionally been used to perform logging tool design calculations because of the inherent 3-D nature of these calculations.
Extensive variance reduction techniques must generally be employed because the probability of a source particle reaching a detector can be very small. For instance, the probability that a source particle will reach the far detector in a gamma–gamma tool is roughly $10^{-8}$. This extremely small value is the consequence of a highly collimated source and a highly collimated detector. Furthermore, statistical errors in design calculations must generally be reduced to a standard relative deviation of one percent or less. The need for extensive variance reduction and very low statistical errors can lead to extremely high CPU times for nuclear tool design calculations. Thus, there is much room for improvement in methods for performing these calculations.

There are two roles for SN methods to play in tool design calculations. The first is to use deterministic methods to directly perform design calculations; the second is to use deterministic methods to provide adjoint calculations for automatic variance reduction in Monte Carlo calculations [118].

The properties of nuclear logging tool design calculations that present the most difficulties for SN methods are 3-D geometric complexity and the need for accuracy to within 1%. The inherent 3-D nature of logging tool design calculations puts them far beyond the reach of the 1-D and 2-D computational transport technology of 1968. Much of the work on nodal spatial discretization schemes in the 1980s was motivated by the high accuracy required for logging tool design calculations. However, most of these nodal methods were developed for rectangular meshes, and the geometric complexity of logging tools can cause the use of such meshes to be highly inefficient. In the late 1980s it became clear that rectangular meshes were inadequate for logging tool design calculations, regardless of the accuracy of the discretization schemes developed for such meshes.
Since all of the first production 3-D SN codes were limited to rectangular meshes, and since it was not clear that sufficiently accurate unstructured-mesh SN methods could be developed, most of the research on SN methods for logging tool design ended without success in the early 1990s. This situation changed in the late 1990s with the advent of 3-D unstructured-mesh SN methods.

It has recently been demonstrated that 3-D unstructured-mesh SN methods can in fact be successfully applied to oil-well logging tool design calculations. In particular, a special version of the ATTILA code [78] was used to perform calculations for both a neutron–neutron tool and a gamma–gamma tool [89]. No special techniques were required for the neutron–neutron tool, but the highly collimated source and detectors associated with the gamma–gamma tool required the use of a first-collided source technique (to get the source gamma-rays into the rock formation) and a last-collided source technique (to get the scattered gamma-rays into the highly collimated near and far detectors) in conjunction with the SN method. Monte Carlo calculations for both tools were performed with the MCNP code [72] to provide benchmark results. Each tool required two MCNP calculations: one calculation for the response of the near detector, and one calculation for the response of the far detector. The ATTILA code was also used to generate two adjoint solutions for the gamma–gamma tool (one for the response of each detector). These adjoint solutions were used to generate weight windows for the MCNP calculations. All calculations were performed on an SGI Octane workstation with a 250 MHz R10000 processor.

The ATTILA results were excellent for the neutron–neutron tool (errors less than 1%) and very good for the gamma–gamma tool (errors less than 2%). Completely converged solutions could not be obtained for the gamma–gamma tool calculations because of memory limitations.
The CPU time for the SN neutron–neutron tool calculation was about 7 h versus about 10 h for the Monte Carlo calculation (which used the best variance reduction techniques available). A general conclusion from these results is that SN methods can now accurately and efficiently perform neutron–neutron tool calculations, but Monte Carlo methods will compete with SN methods for such calculations. This conclusion is not necessarily true for gamma–gamma tool calculations. The CPU time for the SN gamma–gamma tool calculation was about 12 h, while the CPU time for the Monte Carlo calculation would have been about 100 h if the statistical errors had been reduced to the requisite 1%. Furthermore, the adjoint SN solution used to generate weight windows for the Monte Carlo calculation of the far detector response resulted in a factor of 1,000 speedup. It is doubtful that the Monte Carlo calculations could have been performed on a single workstation without the SN adjoint solutions.

A general conclusion is that 3-D unstructured-mesh SN methods have the potential to dramatically improve capabilities for gamma–gamma tool calculations [89]. However, to achieve 1% accuracy, the direct use of SN methods will require a large number of energy groups, and hence a large amount of storage. In the near future, this will likely necessitate a parallel 3-D unstructured-mesh capability, because huge memories are only available on massively parallel computers. However, the indirect use of SN methods (to provide weight windows for Monte Carlo) is feasible with workstations, because adjoint solutions do not need to be highly accurate to be effective for variance reduction. Also, highly accurate adjoint solutions may not be cost-effective because the automatic variance reduction techniques available in production Monte Carlo codes do not give a zero-variance solution when the adjoint solution is exact [118].
Monte Carlo methods will not be able to fully utilize SN adjoint solutions until zero-variance (or nearly zero-variance) biasing algorithms are developed.

1.4 Advances in Spatial Discretizations

As discussed in Section 1.2, the state of the art in spatial discretizations of the transport equation prior to the 1970s consisted of finite-difference approximations involving the particle balance equation (Eq. (1.28)) and associated weighted-diamond auxiliary equations (Eq. (1.30)). These methods were accurate for optically thin cells ($\Sigma_t \Delta x \ll 1$), but they generally became inaccurate for problems with spatial cells that were not optically thin. This deficiency led to the consideration of more sophisticated spatial discretization schemes with multiple unknowns per cell. In the most successful of these new methods, the solutions' increased accuracy and robustness more than compensated for the greater computer arithmetic and storage requirements. In this section, we discuss some of these new methods.

To keep our discussion simple, we consider the SN Eq. (1.18) with isotropic scattering and an isotropic inhomogeneous source,

\[ \mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t \psi_n(x) = \frac{1}{2}\left[\Sigma_s \phi(x) + Q(x)\right], \tag{1.51a} \]

\[ \phi(x) = \sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'}, \tag{1.51b} \]

together with a spatial grid $\{x_{j+1/2} \mid 0 \le j \le J\}$ and related notation, as described in Section 1.2.

1.4.1 Characteristic Methods

Characteristic methods are some of the most successful of the new classes of spatial discretization methods that have been developed since the early 1970s [12, 65, 88, 102, 106]. The methods discussed below are often referred to as short characteristics to distinguish them from long characteristic methods typically employed in assembly calculations in core physics applications; see Chapter 4. Long characteristics are based on an integral form of the transport equation, and their solution algorithm usually utilizes a ray-tracing scheme of particle paths throughout the problem geometry.
Short characteristic methods have the following features:

• On each spatial cell, $\phi(x)$ and $Q(x)$ are approximated by a low-order polynomial (usually constant or linear).
• The leakage-plus-collision operator $\mu_n\, d/dx + \Sigma_t$ is mathematically inverted. (In multidimensional problems, this inversion requires integration along a characteristic line of the transport equation; this property gives its name to the method.)
• The resulting analytic expression for $\psi_n(x)$ is manipulated to yield a polynomial representation, which, when combined with similar results for other directions of flight, yields a new polynomial expression for $\phi(x)$.
• The resulting discrete equations are considered to be solved when, for each cell, the original polynomial expression for $\phi(x)$ in the right side of Eq. (1.51a) equals the polynomial expression described in the previous bullet.

For example, if $\phi(x)$ is represented in the $j$th cell as a spatial constant $\phi_j$, expressed in terms of the cell-average angular fluxes, then Eq. (1.51a) becomes (assuming for simplicity $Q = 0$)

\[ \mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t \psi_n(x) = \frac{\Sigma_s}{2}\phi_j, \qquad x_{j-1/2} < x < x_{j+1/2}. \tag{1.52} \]

For $\mu_n > 0$ (flow from left to right), Eq. (1.52) can be analytically solved for $\psi_n(x)$:

\[ \psi_n(x) = \psi_{n,j-1/2}\, e^{-\Sigma_t (x - x_{j-1/2})/\mu_n} + \left(1 - e^{-\Sigma_t (x - x_{j-1/2})/\mu_n}\right)\frac{\Sigma_s}{2\Sigma_t}\phi_j. \tag{1.53} \]

Using this expression, the flux exiting the $j$th cell is

\[ \psi_{n,j+1/2} = e^{-\Sigma_t \Delta x_j/\mu_n}\, \psi_{n,j-1/2} + \left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\frac{\Sigma_s}{2\Sigma_t}\phi_j, \tag{1.54} \]

and the cell-average flux is

\[ \psi_{n,j} = \frac{\mu_n}{\Sigma_t \Delta x_j}\left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\psi_{n,j-1/2} + \left[1 - \frac{\mu_n}{\Sigma_t \Delta x_j}\left(1 - e^{-\Sigma_t \Delta x_j/\mu_n}\right)\right]\frac{\Sigma_s}{2\Sigma_t}\phi_j. \tag{1.55} \]

The exiting flux (Eq. (1.54)) is used as the incident flux for the next, i.e., $(j+1)$, cell, and the cell-average flux (Eq. (1.55)) is folded into an array that, on completion of the transport sweep in all discrete directions, yields a new estimate for $\phi_j$. The solution is considered to be converged when the new value of $\phi_j$ differs from the old value by less than a prespecified convergence criterion for all $j$.
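The sweep-and-iterate procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any production code: it assumes a homogeneous slab, Q = 0, an isotropic incident flux `psi_left` on the left face, a vacuum right face, a Gauss-Legendre quadrature set, and the mirror image of Eqs. (1.54) and (1.55) for negative directions. The function name and its parameters are hypothetical.

```python
import numpy as np

def sc_source_iteration(sigma_t, sigma_s, dx, n_cells, n_angles=8,
                        psi_left=1.0, tol=1e-10, max_iter=10_000):
    """Source iteration with constant-per-cell (SC) sweeps; returns (phi, iterations)."""
    mu, w = np.polynomial.legendre.leggauss(n_angles)  # assumed quadrature; weights sum to 2
    phi = np.zeros(n_cells)
    for it in range(max_iter):
        phi_new = np.zeros(n_cells)
        for mu_n, w_n in zip(mu, w):
            tau = sigma_t * dx / abs(mu_n)       # optical thickness along the direction
            att = np.exp(-tau)                   # attenuation factor across one cell
            avg = (1.0 - att) / tau              # weight of the incident flux in Eq. (1.55)
            # Sweep in the direction of particle flow; vacuum on the right face.
            cells = range(n_cells) if mu_n > 0 else range(n_cells - 1, -1, -1)
            psi_edge = psi_left if mu_n > 0 else 0.0
            for j in cells:
                src = sigma_s / (2.0 * sigma_t) * phi[j]
                phi_new[j] += w_n * (avg * psi_edge + (1.0 - avg) * src)  # Eq. (1.55)
                psi_edge = att * psi_edge + (1.0 - att) * src             # Eq. (1.54)
        if np.max(np.abs(phi_new - phi)) < tol:
            return phi_new, it + 1
        phi = phi_new
    raise RuntimeError("source iteration did not converge")

if __name__ == "__main__":
    phi, iterations = sc_source_iteration(sigma_t=1.0, sigma_s=0.5, dx=0.5, n_cells=10)
    print(iterations, phi)
```

With no scattering the first sweep already gives the final answer, so the loop stops on the second pass; as the scattering ratio approaches unity the number of iterations grows, which is the slow convergence that acceleration methods are meant to cure.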
This algorithm, with $\phi(x)$ represented as a constant in each cell, is the Step-Characteristic (SC) method [12, 25]. For 1-D problems, the SC method can be formulated as a weighted-diamond scheme, i.e., of the form described by Eqs. (1.28) and (1.30). The SC method readily generalizes to multidimensional problems on irregular grids, but in these circumstances it cannot be formulated as a weighted-diamond scheme.

The SC method is currently used in several multidimensional production neutron transport codes. In applications of these codes, the spatial grid is optically thin, and accurate solutions are obtained. However, the SC method is not accurate for problems demanding optically thick meshes, e.g., of the type described in Section 1.3. Thus, more complicated characteristic methods have been proposed and implemented, in which the scattering source is represented within each spatial cell as a linear, or even quadratic, function of the spatial variables [25]. As can be expected, with each increase in the polynomial order of the representation of $\phi(x)$:

• The accuracy of the resulting solution increases.
• The computational effort required to process the extra algebraic complexity increases.
• The computer memory needed to store the extra problem unknowns increases.

1.4.2 Linear Discontinuous Method

Among the most flexible and successful of the noncharacteristic methods are the discontinuous finite-element (DFE) methods [18, 70, 98, 105, 109, 116]. The linear discontinuous (LD) method is perhaps the archetypical method in this class. This method is based on representations of $\psi_n(x)$ and $\phi(x)$ that are linear within each cell, but discontinuous at cell edges.

In the LD method, Eq. (1.52) is approximated by

\[ \mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t \psi_n(x) = \frac{\Sigma_s}{2}\left[\phi_j^{(0)} + \frac{2(x - x_j)}{\Delta x_j}\phi_j^{(1)}\right], \qquad x_{j-1/2} < x < x_{j+1/2}, \tag{1.56} \]

where $x_j$ is the center of the $j$th cell; now the representation of $\phi(x)$ requires two unknowns per cell, $\phi_j^{(0)}$ and $\phi_j^{(1)}$. If the operator on the left side of Eq.
(1.56) were inverted exactly, as described previously, we would obtain a (linear) characteristic method. However, discontinuous finite-element methods are based on an approximate, rather than an exact, inversion of this operator.

The LD method employs the following linear-discontinuous representation of $\psi_n(x)$:

\[ \psi_n(x) = \begin{cases} \psi_{nj} + \dfrac{2}{\Delta x_j}(x - x_j)\left(\psi_{n,j+1/2} - \psi_{nj}\right), & \mu_n > 0,\ \ x_{j-1/2} < x \le x_{j+1/2}, \\[2ex] \psi_{nj} + \dfrac{2}{\Delta x_j}(x - x_j)\left(\psi_{nj} - \psi_{n,j-1/2}\right), & \mu_n < 0,\ \ x_{j-1/2} \le x < x_{j+1/2}. \end{cases} \tag{1.57} \]

Equation (1.57) is consistent with the notation that $\psi_{nj}$ is the cell-average flux and $\psi_{n,j+1/2} = \psi(x_{j+1/2}, \mu_n)$ is the cell-edge angular flux. We note that at the cell edges, $\psi_n(x)$ is continuous from the left for $\mu_n > 0$ and from the right for $\mu_n < 0$, but is discontinuous otherwise. The unknowns $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$ in Eq. (1.57) and $\phi_j^{(0)}$ and $\phi_j^{(1)}$ in Eq. (1.56) are related by

\[ \phi_j^{(0)} = \sum_{n=1}^{N} \psi_{nj}\, w_n, \tag{1.58a} \]

\[ \phi_j^{(1)} = \sum_{\mu_n > 0}\left(\psi_{n,j+1/2} - \psi_{nj}\right) w_n + \sum_{\mu_n < 0}\left(\psi_{nj} - \psi_{n,j-1/2}\right) w_n. \tag{1.58b} \]

To obtain equations for $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$, we operate on Eq. (1.56) by

\[ \frac{\ell + 1}{\Delta x_j^{\ell+1}} \int_{x_{j-1/2}}^{x_{j+1/2}} (x - x_j)^{\ell}\, (\cdot)\, dx, \qquad \ell = 0, 1, \]

to obtain, for $\ell = 0$, the conventional balance equation:

\[ \frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} - \psi_{n,j-1/2}\right) + \Sigma_t \psi_{nj} = \frac{\Sigma_s}{2}\phi_j^{(0)}, \tag{1.59} \]

and for $\ell = 1$:

\[ \frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} + \psi_{n,j-1/2} - 2\psi_{nj}\right) + \frac{2\Sigma_t}{\Delta x_j^2}\int_{x_{j-1/2}}^{x_{j+1/2}} (x - x_j)\,\psi_n(x)\, dx = \frac{\Sigma_s}{6}\phi_j^{(1)}. \tag{1.60} \]

(Here, we have used the previously adopted notation for the cell-edge and cell-average angular fluxes.) To close these equations, the integral term in Eq. (1.60) must be expressed in terms of $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$. To do this, we introduce Eq. (1.57) into the integral term and for $\mu_n > 0$ obtain

\[ \frac{\mu_n}{\Delta x_j}\left(\psi_{n,j+1/2} + \psi_{n,j-1/2} - 2\psi_{nj}\right) + \frac{\Sigma_t}{3}\left(\psi_{n,j+1/2} - \psi_{nj}\right) = \frac{\Sigma_s}{6}\phi_j^{(1)}. \tag{1.61} \]

Equations (1.59) and (1.61) are two linear algebraic equations for $\psi_{nj}$ and $\psi_{n,j\pm 1/2}$. [For $\mu_n < 0$, the term $(\psi_{n,j+1/2} - \psi_{nj})$ in Eq. (1.61) is replaced by $(\psi_{nj} - \psi_{n,j-1/2})$.]
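For a single cell and a single direction with $\mu_n > 0$, Eqs. (1.59) and (1.61) form a 2-by-2 linear system for the cell-average flux and the exiting edge flux once the incident edge flux is known. The following sketch solves that system directly; the function name `ld_cell` and its argument list are hypothetical and chosen only for illustration.

```python
import numpy as np

def ld_cell(mu, sigma_t, sigma_s, dx, psi_in, phi0, phi1):
    """Solve Eqs. (1.59) and (1.61) in one cell for mu > 0.

    psi_in is the incident edge flux psi_{n,j-1/2}; phi0 and phi1 are the
    scalar flux moments of Eq. (1.56). Returns (psi_avg, psi_out), i.e. the
    cell-average flux psi_{nj} and the exiting edge flux psi_{n,j+1/2}.
    """
    if mu <= 0:
        raise ValueError("this sketch covers mu > 0 only")
    a = mu / dx
    # Unknown vector is [psi_avg, psi_out]; the psi_in terms are moved to the right side.
    matrix = np.array([[sigma_t, a],                               # Eq. (1.59)
                       [-2.0 * a - sigma_t / 3.0, a + sigma_t / 3.0]])  # Eq. (1.61)
    rhs = np.array([0.5 * sigma_s * phi0 + a * psi_in,
                    sigma_s * phi1 / 6.0 - a * psi_in])
    psi_avg, psi_out = np.linalg.solve(matrix, rhs)
    return psi_avg, psi_out

if __name__ == "__main__":
    # Pure absorber: compare the LD transmission with the exact exponential.
    psi_avg, psi_out = ld_cell(mu=0.5, sigma_t=1.0, sigma_s=0.0, dx=0.25,
                               psi_in=1.0, phi0=0.0, phi1=0.0)
    print(psi_out, np.exp(-0.5))
```

Setting $\Sigma_s = 0$ and eliminating $\psi_{nj}$ by hand gives the transmission $\psi_{n,j+1/2}/\psi_{n,j-1/2} = (6 - 2\tau)/(6 + 4\tau + \tau^2)$ with $\tau = \Sigma_t \Delta x_j/\mu_n$, a rational approximation to $e^{-\tau}$ that the example above reproduces.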
The resulting LD method can be generalized to multidimensional problems on structured (rectangular or orthogonal) and unstructured (nonorthogonal) spatial grids [97]. Variants of the LD method have also been developed, such as the lumped LD method, which is more robust for optically thick spatial cells but less accurate for optically thin cells, and various corner balance methods [76, 83, 90, 96].

In general, LD-like methods are much more accurate and robust for the difficult physical problems described in Section 1.3 than finite-difference methods. LD methods require greater computer arithmetic and storage than finite-difference methods, but their increased accuracy and robustness usually more than compensate for these disadvantages.

1.4.3 Nodal Methods

Another class of spatial differencing techniques that have been developed and widely used in the nuclear reactor community are nodal methods [28–30, 43–45]. In essence, a nodal method approximates a multidimensional transport equation by a coupled system of 1-D transport equations. Discretization techniques that are highly accurate for 1-D can then be utilized.

To illustrate, let us consider an $(x, y)$-geometry version of Eq. (1.51a) on a spatial cell (assuming for simplicity $Q = 0$):

\[ \mu_n \frac{\partial \psi_n}{\partial x}(x, y) + \eta_n \frac{\partial \psi_n}{\partial y}(x, y) + \Sigma_t \psi_n(x, y) = \frac{\Sigma_s}{4}\phi(x, y), \qquad x_{i-1/2} < x < x_{i+1/2},\ \ y_{j-1/2} < y < y_{j+1/2}. \tag{1.62} \]

Transversely integrating this equation by the operator

\[ \langle\,\cdot\,\rangle_{y,j} = \frac{1}{\Delta y_j}\int_{y_{j-1/2}}^{y_{j+1/2}} (\cdot)\, dy, \tag{1.63a} \]

and defining

\[ \psi_{n,y,j}(x) = \langle \psi_n(x, y) \rangle_{y,j}, \tag{1.63b} \]

we obtain (after moving the transverse derivative to the right side of the equation)

\[ \mu_n \frac{d}{dx}\psi_{n,y,j}(x) + \Sigma_t \psi_{n,y,j}(x) = \frac{\Sigma_s}{4}\phi_{y,j}(x) - \frac{\eta_n}{\Delta y_j}\left[\psi_n(x, y_{j+1/2}) - \psi_n(x, y_{j-1/2})\right]. \tag{1.64} \]

Similarly, transverse-integrating Eq.
(1.62) by the operator

\[ \langle\,\cdot\,\rangle_{x,i} = \frac{1}{\Delta x_i}\int_{x_{i-1/2}}^{x_{i+1/2}} (\cdot)\, dx, \tag{1.65a} \]

and defining

\[ \psi_{n,x,i}(y) = \langle \psi_n(x, y) \rangle_{x,i}, \tag{1.65b} \]

we obtain (after moving the transverse derivative to the right side of the equation)

\[ \eta_n \frac{d}{dy}\psi_{n,x,i}(y) + \Sigma_t \psi_{n,x,i}(y) = \frac{\Sigma_s}{4}\phi_{x,i}(y) - \frac{\mu_n}{\Delta x_i}\left[\psi_n(x_{i+1/2}, y) - \psi_n(x_{i-1/2}, y)\right]. \tag{1.66} \]

In the context of the SN equations, Eqs. (1.64) and (1.66) are exact. These are two 1-D first-order transport equations containing the transversely integrated angular fluxes defined in Eqs. (1.63b) and (1.65b). Equations (1.64) and (1.66) can be closed by taking

\[ \psi_n(x, y_{j\pm 1/2}) = \psi_{n,x,i}(y_{j\pm 1/2}), \qquad x_{i-1/2} < x < x_{i+1/2}, \tag{1.67a} \]

\[ \psi_n(x_{i\pm 1/2}, y) = \psi_{n,y,j}(x_{i\pm 1/2}), \qquad y_{j-1/2} < y < y_{j+1/2}, \tag{1.67b} \]

which yield

\[ \mu_n \frac{d}{dx}\psi_{n,y,j}(x) + \Sigma_t \psi_{n,y,j}(x) = \frac{\Sigma_s}{4}\phi_{y,j}(x) - \frac{\eta_n}{\Delta y_j}\left[\psi_{n,x,i}(y_{j+1/2}) - \psi_{n,x,i}(y_{j-1/2})\right], \tag{1.68} \]

and

\[ \eta_n \frac{d}{dy}\psi_{n,x,i}(y) + \Sigma_t \psi_{n,x,i}(y) = \frac{\Sigma_s}{4}\phi_{x,i}(y) - \frac{\mu_n}{\Delta x_i}\left[\psi_{n,y,j}(x_{i+1/2}) - \psi_{n,y,j}(x_{i-1/2})\right]. \tag{1.69} \]

More accurate approximations can be obtained by calculating higher-order transverse spatial moments of Eq. (1.62) and using these to obtain more accurate closure relations than defined in Eqs. (1.67a) and (1.67b).

Nodal methods always consist of coupled 1-D transport equations for each multidimensional spatial cell. These equations can be fully discretized by applying a standard 1-D discretization to each transverse-integrated equation. For example, one can obtain a pseudo-characteristic method by replacing the analytic functions on the right side of each transverse-integrated equation by moment-preserving polynomial fits to those functions, and then inverting the 1-D transport operator on the left side of each transverse-integrated equation. The result is a transport solution for a polynomial source that depends on the solution itself. For instance, assuming a constant spatial dependence for the scattering source on the right side of Eq. (1.68), and inverting the 1-D transport operator on the left side of Eq.
(1.68), we obtain the following solution for $\mu_n > 0$ and $\eta_n > 0$:

\[ \psi_{n,y,j}(x) = \psi_{n,y,j}(x_{i-1/2})\exp\left[-\frac{\Sigma_t (x - x_{i-1/2})}{\mu_n}\right] + \frac{q_{n,y,j,i}}{\Sigma_t}\left\{1 - \exp\left[-\frac{\Sigma_t (x - x_{i-1/2})}{\mu_n}\right]\right\}, \tag{1.70} \]

where

\[ q_{n,y,j,i} = \frac{\Sigma_s}{4\Delta x_i}\int_{x_{i-1/2}}^{x_{i+1/2}} \phi_{y,j}(x)\, dx - \frac{\eta_n}{\Delta y_j}\left[\psi_{n,x,i}(y_{j+1/2}) - \psi_{n,x,i}(y_{j-1/2})\right]. \tag{1.71} \]

Following the same procedure for Eq. (1.69), we obtain

\[ \psi_{n,x,i}(y) = \psi_{n,x,i}(y_{j-1/2})\exp\left[-\frac{\Sigma_t (y - y_{j-1/2})}{\eta_n}\right] + \frac{q_{n,x,i,j}}{\Sigma_t}\left\{1 - \exp\left[-\frac{\Sigma_t (y - y_{j-1/2})}{\eta_n}\right]\right\}, \tag{1.72} \]

where

\[ q_{n,x,i,j} = \frac{\Sigma_s}{4\Delta y_j}\int_{y_{j-1/2}}^{y_{j+1/2}} \phi_{x,i}(y)\, dy - \frac{\mu_n}{\Delta x_i}\left[\psi_{n,y,j}(x_{i+1/2}) - \psi_{n,y,j}(x_{i-1/2})\right]. \tag{1.73} \]

Equation (1.70) is now evaluated at $x = x_{i+1/2}$, Eq. (1.72) is evaluated at $y = y_{j+1/2}$, and the resulting two equations [together with Eqs. (1.71) and (1.73)] are solved for the outgoing edge fluxes $\psi_{n,x,i}(y_{j+1/2})$ and $\psi_{n,y,j}(x_{i+1/2})$. This procedure constitutes the 2-D Constant-Constant Nodal (CCN) method. This method is so named because the transverse derivative term and the scattering source are approximated as constants in each transverse-integrated equation. The more accurate (but more expensive) 2-D Linear-Linear Nodal (LLN) method has four transverse-integrated equations with linear transverse derivatives and linear scattering sources in each equation. Extending these methods to 3-D is straightforward.

Nodal transport (and diffusion) methods have played an extremely important role in the nuclear engineering community during the past 20 years. As discussed in Section 1.3.3, nodal methods (on rectangular cells) were applied to oil-well logging problems in the 1980s but were abandoned for that application because of their lack of suitability to nonorthogonal grids. The applicability of nodal methods to nonorthogonal grids remains a research topic.

1.4.4 Solution Accuracy in the Thick Diffusion Limit

The last topic in this section is not a discretization scheme, but rather a theoretical technique, which, in the past 20 years, has become essential in predicting the accuracy of discretization schemes for diffusive problems with optically thick spatial cells ($\Sigma_t \Delta x \gg 1$).

Early (finite difference) spatial differencing schemes for the transport equation were experimentally and theoretically understood to be accurate only when spatial cells were optically thin ($\Sigma_t \Delta x \ll 1$), and the accuracy of these schemes was generally measured by the order of their truncation error [25]. For example, an $n$th-order scheme would satisfy

\[ \left\| \psi_{\mathrm{exact}} - \psi_{\Delta x} \right\| = O(\tau^n), \qquad \tau \ll 1, \]

where $\|\cdot\|$ is a suitable error norm and $\tau = \Sigma_t \Delta x$. In slab geometry, the DD and SC schemes are second-order, while the LD method is third-order.

Analyses to mathematically prove the order of convergence were always carried out in slab geometry; in multidimensional geometries the SN solutions have singular characteristics, across which the solution is not smooth [20], so the truncation error analyses that can be carried out in slab geometry are not applicable. In fact, computer experiments have shown [33] that because of the singular characteristics, the order of convergence of the DD scheme in $(x, y)$-geometry depends on the definition of the error norm.

Worse yet, the difficult thermal radiation and charged-particle transport problems described in Section 1.3 are so optically thick that it is impossible, because of limits in computer memory, to assign spatial grids for them that are optically thin. However, such calculations are associated with the diffusion and Fokker–Planck limits, respectively, and the spatial scale lengths for the solution associated with these limits are much larger than a mean free path.
It is not unreasonable to expect a transport spatial discretization scheme to yield accurate results with optically thick cells if the scale length of the solution is well-resolved by the mesh. Indeed, one would intuitively expect to get accurate results with such mesh resolution. The difficulty is that a truncation error analysis does not provide useful information for these types of problems. Such an analysis tells us only that accurate results will be obtained by using optically thin cells. To determine if accurate results can be obtained with a mesh that is optically thick but resolves the spatial scale length of the solution, it is necessary to perform a discrete asymptotic analysis [49, 113].

Although the system is optically thick in both the diffusion and Fokker–Planck limits, the requirements associated with each limit for spatial differencing schemes are quite different. Accurate and robust spatial discretization schemes are generally required for charged-particle transport, but the highly anisotropic scattering treatment is of primary importance in the Fokker–Planck limit rather than the spatial differencing scheme [113]. In contrast, the spatial discretization is of primary importance in the diffusion limit. We focus on the diffusion limit here.

A diffusive problem is optically thick with weak absorption; it is a problem for which the transport solution is well-approximated by the diffusion solution. A spatial discretization of the SN equations is of practical use for diffusive problems if it possesses the optically thick diffusion limit [49, 52, 66, 86, 88, 105]. Such a discretization scheme will yield accurate results for diffusive problems if the spatial mesh cells are thin with respect to a diffusion length (the spatial scale length for the diffusion solution), even if these cells are thick with respect to a mean free path.
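This property can be probed numerically with a small experiment, sketched here under several assumptions that are not part of the original text. It compares the SC method of Section 1.4.1 and the LD method of Section 1.4.2 on a slab whose cells are hundreds of mean free paths thick but only a fraction of a diffusion length wide. The assumptions are: an S8 Gauss-Legendre quadrature set, vacuum boundaries, a flat source added to the right sides of Eqs. (1.52) and (1.59), mirrored cell equations for negative directions, a reference diffusion solution with zero flux at both faces, and a direct linear solve of the fixed-point problem (obtained by probing each sweep with unit vectors) in place of slowly converging source iteration. The parameter `eps` applies the cross-section and source scaling of Eqs. (1.78). All function names are hypothetical.

```python
import numpy as np

MU, W = np.polynomial.legendre.leggauss(8)   # assumed S8 set; weights sum to 2

def sc_sweep(phi, sig_t, sig_s, q, dx):
    """One SC sweep, Eqs. (1.54)-(1.55) with a flat source q; returns new cell-average phi."""
    new = np.zeros_like(phi)
    src = (sig_s * phi + q) / (2.0 * sig_t)
    for mu, w in zip(MU, W):
        tau = sig_t * dx / abs(mu)
        att = np.exp(-tau)
        avg = (1.0 - att) / tau
        order = range(len(phi)) if mu > 0 else range(len(phi) - 1, -1, -1)
        psi_in = 0.0                                   # vacuum boundary
        for j in order:
            new[j] += w * (avg * psi_in + (1.0 - avg) * src[j])
            psi_in = att * psi_in + (1.0 - att) * src[j]
    return new

def ld_sweep(u, sig_t, sig_s, q, dx):
    """One LD sweep, Eqs. (1.58)-(1.61); u stacks the moments phi0 (first half) and phi1."""
    n = len(u) // 2
    phi0, phi1 = u[:n], u[n:]
    new = np.zeros_like(u)
    for mu, w in zip(MU, W):
        a, s = abs(mu) / dx, np.sign(mu)               # s mirrors Eq. (1.61) for mu < 0
        mat = np.array([[sig_t, a], [-2.0 * a - sig_t / 3.0, a + sig_t / 3.0]])
        order = range(n) if mu > 0 else range(n - 1, -1, -1)
        psi_in = 0.0                                   # vacuum boundary
        for j in order:
            rhs = np.array([0.5 * (sig_s * phi0[j] + q) + a * psi_in,
                            s * sig_s * phi1[j] / 6.0 - a * psi_in])
            psi_avg, psi_out = np.linalg.solve(mat, rhs)
            new[j] += w * psi_avg                      # Eq. (1.58a)
            new[n + j] += w * s * (psi_out - psi_avg)  # Eq. (1.58b)
            psi_in = psi_out
    return new

def fixed_point(sweep, size, sig_t, sig_s, q, dx):
    """Solve u = sweep(u) directly: the sweep is affine, so probe it with unit vectors."""
    b = sweep(np.zeros(size), sig_t, sig_s, q, dx)
    t = np.column_stack([sweep(e, sig_t, sig_s, 0.0, dx) for e in np.eye(size)])
    return np.linalg.solve(np.eye(size) - t, b)

def thick_limit_fluxes(eps, sig_t=1.0, sig_a=0.1, q=1.0, width=10.0, n_cells=20):
    """Return (phi_sc, phi_ld, phi_diffusion) cell values for the eps-scaled problem."""
    dx = width / n_cells
    st, sa, qs = sig_t / eps, eps * sig_a, eps * q     # scaling of Eqs. (1.78a)-(1.78c)
    ss = st - sa                                       # Eq. (1.78d)
    phi_sc = fixed_point(sc_sweep, n_cells, st, ss, qs, dx)
    phi_ld = fixed_point(ld_sweep, 2 * n_cells, st, ss, qs, dx)[:n_cells]
    x = (np.arange(n_cells) + 0.5) * dx
    length = np.sqrt(1.0 / (3.0 * sig_t * sig_a))      # diffusion length, independent of eps
    phi_diff = q / sig_a * (1.0 - np.cosh((x - width / 2) / length)
                            / np.cosh(width / (2 * length)))
    return phi_sc, phi_ld, phi_diff

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        sc, ld, diff = thick_limit_fluxes(eps)
        print(eps, sc[10], ld[10], diff[10])
```

In this setup the diffusion length stays fixed while the cells become optically thicker as `eps` shrinks, so one would expect the LD mid-slab flux to stay close to the diffusion value while the SC flux decays toward zero.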
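To describe the diffusion limit, let us consider the monoenergetic planar-geometry SN equations

\[ \mu_n \frac{d\psi_n}{dx}(x) + \Sigma_t \psi_n(x) = \frac{\Sigma_s}{2}\sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'} + \frac{Q(x)}{2}, \qquad 1 \le n \le N, \tag{1.74} \]

and their diffusion approximation

\[ -\frac{d}{dx}\frac{1}{3\Sigma_t}\frac{d\phi}{dx}(x) + \Sigma_a \phi(x) = Q(x). \tag{1.75} \]

Here, we have used the standard notation

\[ \Sigma_a = \Sigma_t - \Sigma_s = \text{absorption cross section}, \tag{1.76a} \]

and

\[ \phi(x) = \sum_{n=1}^{N} \psi_n(x)\, w_n = \text{scalar flux}. \tag{1.76b} \]

To motivate the subsequent analysis, we multiply the diffusion Eq. (1.75) by a positive constant $\varepsilon$:

\[ -\frac{d}{dx}\frac{\varepsilon}{3\Sigma_t}\frac{d\phi}{dx}(x) + \varepsilon\Sigma_a \phi(x) = \varepsilon Q(x). \tag{1.77} \]

Clearly, the solution of the diffusion equation is unchanged. This shows that if we define the following scaled cross sections and source,

\[ \Sigma_t \to \frac{\Sigma_t}{\varepsilon}, \tag{1.78a} \]

\[ \Sigma_a \to \varepsilon\Sigma_a, \tag{1.78b} \]

\[ Q(x) \to \varepsilon Q(x), \tag{1.78c} \]

which implies

\[ \Sigma_s = \Sigma_t - \Sigma_a \to \frac{\Sigma_t}{\varepsilon} - \varepsilon\Sigma_a, \tag{1.78d} \]

then the diffusion equation is invariant under this scaling, for any choice of $\varepsilon$. However, the SN equations are not invariant; they become

\[ \mu_n \frac{d\psi_n}{dx}(x) + \frac{\Sigma_t}{\varepsilon}\psi_n(x) = \frac{1}{2}\left(\frac{\Sigma_t}{\varepsilon} - \varepsilon\Sigma_a\right)\sum_{n'=1}^{N} \psi_{n'}(x)\, w_{n'} + \varepsilon\frac{Q(x)}{2}, \qquad 1 \le n \le N. \tag{1.79} \]

Now, one can show that for $\varepsilon \ll 1$, the solution of Eqs. (1.79) satisfies

\[ \psi_n(x) = \frac{\phi(x)}{2} + O(\varepsilon), \tag{1.80} \]

where $\phi(x)$ satisfies the diffusion Eq. (1.75).

To derive this result, we solve Eqs. (1.79) by assuming a solution that, for $\varepsilon \ll 1$, depends on $\varepsilon$ by a simple asymptotic expansion:

\[ \psi_n(x) = \sum_{i=0}^{\infty} \varepsilon^i \psi_n^{(i)}(x). \tag{1.81} \]

Introducing Eq. (1.81) into Eq. (1.79) and equating the coefficients of different powers of $\varepsilon$, we obtain for $i \ge 0$ the following system of equations:

\[ \Sigma_t\left(\psi_n^{(i)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(i)}(x)\, w_{n'}\right) = -\mu_n \frac{d}{dx}\psi_n^{(i-1)}(x) - \frac{\Sigma_a}{2}\sum_{n'=1}^{N} \psi_{n'}^{(i-2)}(x)\, w_{n'} + \delta_{i,2}\frac{Q(x)}{2}, \tag{1.82} \]

where $\psi_n^{(-1)}(x) = \psi_n^{(-2)}(x) = 0$. We solve this system recursively, by solving the first ($i = 0$) equation, then the second ($i = 1$) equation, etc.

The first ($i = 0$) equation is

\[ \Sigma_t\left(\psi_n^{(0)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(0)}(x)\, w_{n'}\right) = 0. \tag{1.83} \]

Assuming that the quadrature set satisfies

\[ \sum_{n=1}^{N} w_n = \int_{-1}^{1} d\mu = 2, \tag{1.84} \]

Eq.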
(1.83) has the general isotropic solution

\[ \psi_n^{(0)}(x) = \frac{\phi^{(0)}(x)}{2}, \tag{1.85} \]

where $\phi^{(0)}(x)$ is, for now, undetermined.

The second ($i = 1$) equation is, using Eq. (1.85),

\[ \Sigma_t\left(\psi_n^{(1)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(1)}(x)\, w_{n'}\right) = -\frac{\mu_n}{2}\frac{d\phi^{(0)}}{dx}(x). \tag{1.86} \]

Assuming that the quadrature set satisfies

\[ \sum_{n=1}^{N} \mu_n w_n = \int_{-1}^{1} \mu\, d\mu = 0, \tag{1.87} \]

the general solution of Eq. (1.86) is

\[ \psi_n^{(1)}(x) = \frac{\phi^{(1)}(x)}{2} - \frac{\mu_n}{2\Sigma_t}\frac{d\phi^{(0)}}{dx}(x), \tag{1.88} \]

where $\phi^{(1)}(x)$ is undetermined.

The third ($i = 2$) equation, using Eqs. (1.85) and (1.88), is

\[ \Sigma_t\left(\psi_n^{(2)}(x) - \frac{1}{2}\sum_{n'=1}^{N} \psi_{n'}^{(2)}(x)\, w_{n'}\right) = -\mu_n \frac{d}{dx}\left(\frac{\phi^{(1)}(x)}{2} - \frac{\mu_n}{2\Sigma_t}\frac{d\phi^{(0)}}{dx}(x)\right) - \frac{\Sigma_a}{2}\phi^{(0)}(x) + \frac{Q(x)}{2}. \tag{1.89} \]

Unlike Eqs. (1.83) and (1.86), this third equation does not automatically possess a solution. To see this, we multiply Eq. (1.89) by $w_n$ and sum over $1 \le n \le N$; assuming that the quadrature set satisfies

\[ \sum_{n=1}^{N} \mu_n^2 w_n = \int_{-1}^{1} \mu^2\, d\mu = \frac{2}{3}, \tag{1.90} \]

we obtain the solvability condition

\[ 0 = \frac{d}{dx}\frac{1}{3\Sigma_t}\frac{d\phi^{(0)}}{dx}(x) - \Sigma_a \phi^{(0)}(x) + Q(x). \tag{1.91} \]

If this equation is satisfied, then it can easily be shown that solutions to Eq. (1.89) exist. Equations (1.91), (1.85), and (1.81) confirm the result (1.80).

The asymptotic analysis outlined above provides a direct mathematical link between the SN equations (1.74) and the diffusion equation (1.75). This analysis shows that if the cross sections and source in the SN equations are scaled by Eqs. (1.78) with $\varepsilon \ll 1$, the diffusion equation (1.75) is obtained. The condition $\varepsilon \ll 1$ is consistent with the physical understanding of neutron diffusion: the mean free path [$\lambda = 1/\Sigma_t = O(\varepsilon)$] is small, the absorption rate [$\Sigma_a = O(\varepsilon)$] is small, and the source [$Q = O(\varepsilon)$] is small, with all of these "smallnesses" balanced so that the resulting angular flux is $O(1)$ and satisfies the diffusion equation.

In the thick diffusion limit, two limits occur that have significance for spatial discretizations:

1. $\Sigma_t \to O(1/\varepsilon)$,
2. $\psi_n(x) \to \phi(x)/2$.

Thus, the total cross section becomes unbounded, yet the SN solution limits to an $O(1)$ diffusion solution. This result applies to the spatially continuous SN equations (no spatial discretization).

We now ask: What happens if this same asymptotic analysis is applied to the spatially discretized SN equations? More precisely, let us consider a spatially discrete SN problem posed on a fixed spatial grid. We scale the cross sections and source in this problem exactly as in Eqs. (1.78). For $\varepsilon \ll 1$, we seek a solution of this discrete system in the form of Eq. (1.81), i.e., we expand all unknowns (cell-average fluxes, cell-edge fluxes, etc.) as power series in $\varepsilon$, and we solve the resulting hierarchy of equations as described above for the continuous SN equations. What happens to the spatially discrete SN solution in this limit?

There are two possible answers to this question. First, because as $\varepsilon \to 0$ the SN solution smoothly limits to the diffusion solution, it is plausible to hope that the spatially discrete SN solution will smoothly limit to the solution of a spatially discrete diffusion problem. (Then, if the chosen spatial grid is adequate to resolve the solution of this discrete diffusion problem, the resulting discrete solution will be accurate.) However, because $\Sigma_t = O(\varepsilon^{-1})$, the optical thickness of spatial cells $\Sigma_t \Delta x \to \infty$ as $\varepsilon \to 0$. This and the fact that SN solutions generally become inaccurate as $\Sigma_t \Delta x$ increases suggest that spatially discrete SN solutions may not limit to an accurate result as $\varepsilon \to 0$. Which of these two possibilities is correct?

The answer to this question depends on the chosen spatial discretization scheme. Some schemes are accurate in the thick diffusion limit; others are not. For example, the Step-Characteristic (SC) scheme fails as $\varepsilon \to 0$ (the SC solution $\to 0$).
The Diamond-Difference (DD) scheme fails unless all the diffusive regions of the problem have isotropic incident boundary fluxes (in the presence of nonisotropic boundary fluxes, DD solutions become corrupted by unphysical spatial oscillations). LD-like schemes perform successfully in the thick diffusion limit in 1-D geometries. LD methods also perform well in multi-D geometries with triangular (2-D) or tetrahedral (3-D) spatial grids, but they fail on quadrilateral (2-D) or hexahedral (3-D) grids. (However, bilinear-discontinuous methods work well for quadrilateral grids and trilinear-discontinuous methods work well for hexahedral grids.)

The thick diffusion limit analysis, which has been applied to these discretization schemes and many others, accurately predicts the performance of approximation schemes in realistic calculations. This analysis has enabled the successful development of spatial discretization methods for problems with optically thick, diffusive systems, in particular for the thermal radiation transport and charged-particle transport problems discussed in Section 1.3.

A reader may ask: if a transport problem is diffusive, why not solve a simpler diffusion problem instead? The answer is that in many applications, only a part of the physical system is diffusive, and it may not be obvious where this diffusive part is. Also, some energy groups may be diffusive, while others are not. Finally, for time-dependent problems, some regions of space-energy phase space may be diffusive for certain times but not for others. For these reasons, it is generally infeasible to calculate accurate transport solutions by using the diffusion approximation in subregions of phase space where it is accurate.

This leads to an important issue that can be discussed only briefly here: the behavior of SN spatial discretization schemes in the presence of unresolved boundary layers.
(These are thin volumes, typically only a few mean free paths in width, containing the material boundaries that separate diffusive and nondiffusive subregions of a problem. Across boundary layers, the flux usually has a rapid spatial variation; if the spatial grid is not sufficiently fine to resolve this fast variation, the boundary layer is said to be unresolved.) Many problems exist in which, due to computer memory limitations, it is not practical to prescribe a spatial grid that adequately resolves all boundary layers. Thus, one is led to the question of whether a given discretization scheme is accurate across an unresolved boundary layer. In particular, if an optically thick, diffusive region is adjacent to a nondiffusive region, can anything be said about the ability of a given discretization scheme to predict the changes in the flux across an unresolved boundary layer between two such regions?

The asymptotic thick diffusion limit analysis does make it possible to study unresolved boundary layers; the conclusion so far is that no known differencing scheme is completely adequate to model unresolved boundary layers accurately. For example, LD methods are generally inaccurate in the first cell (containing the boundary layer) within the thick diffusive region, and they incorrectly predict that the flux exiting the diffusive region is isotropic. Generally, to be certain that a discrete solution is accurate, all spatial boundary layers must be adequately resolved by the spatial grid.

For charged-particle transport problems, which are optically thick and have highly forward-peaked scattering, a more complicated asymptotic limit exists in which the total cross section $\Sigma_t = O(\varepsilon^{-1})$ and the mean scattering cosine $\bar{\mu}_0 = 1 - O(\varepsilon)$. As $\varepsilon \to 0$, the solution of the continuous transport equation limits to the solution of a Fokker–Planck equation (see the discussion in Section 1.3.2).
Space-angle discretization schemes have also been successfully analyzed in this asymptotic limit [113]. Ensuring that the discretized $S_N$ equations limit to a valid discretization of the Fokker–Planck equation is primarily related to the treatment of anisotropic scattering rather than the spatial differencing scheme. Nonetheless, the presence of very large and very small eigenvalues in the spectrum of the angular Fokker–Planck operator necessitates the use of accurate and robust spatial differencing schemes in Fokker–Planck calculations.

In the thick diffusion and Fokker–Planck problems discussed above, it is generally impossible, given computer memory limitations, to use optically thin spatial grids for the entire problem. To successfully simulate these problems, discretization schemes must produce accurate solutions for optically thick spatial grids away from boundary layers, and a theory is needed to justify the use of these schemes for these problems. The discontinuous finite-element schemes (such as LD and its variants) and the asymptotic (thick diffusion and Fokker–Planck) theories were developed to deal with just these practical difficulties.

1.5 Advances in Angular Discretizations

Next we discuss (i) advances in $S_N$ discretizations for the angular derivative terms that appear in curvilinear coordinate systems and (ii) improvements to the standard $S_N$ treatment for highly anisotropic scattering. Perhaps surprisingly, very little has been accomplished during the past 40 years to successfully reduce the classic ray effects in $S_N$ simulations [21, 24].

1.5.1 Angular Derivatives

The transport equation in curvilinear geometries contains one or more angular derivatives, in addition to spatial derivative terms.
The traditional technique for treating the angular derivative term in the 1-D spherical geometry equation, which is representative of the traditional treatment used in essentially all curvilinear geometries, is described in Section 1.2. This technique is characterized by:

• The use of special $\beta$-coefficients to represent the quantity $(1 - \mu^2)$ at each angular cell edge (see Eq. (1.23a))
• The use of the diamond-in-angle relationship to express each cell-average angular flux in terms of the adjacent cell-edge angular fluxes (see Eq. (1.24))
• The use of a starting-direction flux equation to obtain initial values for the angular flux at $\mu = -1$ (see Eq. (1.25))

This treatment has a deficiency, known as the discrete-ordinates flux dip, which consists of an erroneous suppression in the flux at the center of a sphere. Although the existence of the flux dip was recognized in the early 1960s, it was not eliminated until the early 1980s. Three features of the original method contributed to the existence of the flux dip:

• The starting-direction flux equation is a slab-geometry equation, but this was originally put in the following curvilinear-like form before being spatially discretized [22]:

\[
-\frac{d}{dr}\left[ r^2 \psi_{1/2}(r) \right] + 2r\,\psi_{1/2}(r) + \Sigma_t(r)\, r^2\, \psi_{1/2}(r) = r^2 \sum_{n'=1}^{N} \Sigma_s(r, -1, \mu_{n'})\, \psi_{n'}(r)\, w_{n'} + \frac{r^2 Q}{2}. \tag{1.92}
\]

This was thought to make the discretization for the starting-direction flux equation consistent with that of the other directions. However, it actually contributed to truncation errors that enhanced the flux dip.
• A boundary condition corresponding to specular reflection was used at the center of the sphere, even though the angular flux at the center of a sphere is rigorously isotropic and equal to the starting-direction flux. This incorrectly allowed the angular flux at $r = 0$ to be anisotropic.
• The diamond-in-angle equation is inconsistent with the location of the quadrature cosines within each angular bin.
As a result, the diamond-in-angle scheme does not preserve solutions that are linear in $\mu$.

In the late 1970s, it was proposed that the slab-geometry form of the starting-direction flux equation be discretized rather than the curvilinear-like form, and that all of the angular fluxes at the center of the sphere be set equal to the starting-direction flux value [26]. These two steps significantly reduced the severity of the flux dip. In the early 1980s, an angular weighted-diamond equation was proposed that related the angular edge and average fluxes in a manner consistent with the location of the cosine in each angular bin [38]:

\[
\psi_n(r) = \frac{\mu_n - \mu_{n-1/2}}{w_n}\, \psi_{n+1/2}(r) + \frac{\mu_{n+1/2} - \mu_n}{w_n}\, \psi_{n-1/2}(r). \tag{1.93}
\]

When all three of these measures were combined, the resulting angular discretization scheme eliminated the flux dip [38]. This scheme has been generalized to 2-D cylindrical geometry [38].

Very few practical improvements beyond the elimination of the flux dip have been made in $S_N$ angular derivative treatments. Discontinuous finite-element discretizations might have been expected to have had an impact, but this has not happened, partly because it is difficult to develop a discontinuous angular finite-element method that is compatible with the standard $S_N$ method in multidimensional geometries. It is interesting that discontinuous angular derivative treatments do not require a starting-direction flux. This would appear to be an advantage, but one of the few linear-discontinuous $S_N$ angular derivative treatments ever developed for the 1-D spherical geometry equation was found to be less accurate than the weighted-diamond scheme (Eq. (1.93)) for a series of test problems [60]. A reason for this is that the starting-direction flux is computed (by Eq. (1.25)) with greater accuracy than the other directions; hence, significant accuracy is actually lost if the starting-direction flux plays no role in the angular derivative treatment.
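The difference between the diamond-in-angle relation and the weighted-diamond relation of Eq. (1.93) can be checked numerically. The following is only a minimal sketch: it assumes a tabulated 4-point Gauss–Legendre set, cell edges built from $\mu_{1/2} = -1$ and $\mu_{n+1/2} = \mu_{n-1/2} + w_n$, and a hypothetical test flux `linear_flux` that is linear in $\mu$. All function names are illustrative and do not come from any production code.

```python
# Tabulated 4-point Gauss-Legendre (S4) cosines and weights on [-1, 1].
MU = [-0.8611363115940526, -0.3399810435848563,
      0.3399810435848563, 0.8611363115940526]
W = [0.3478548451374538, 0.6521451548625461,
     0.6521451548625461, 0.3478548451374538]


def cell_edges(weights):
    """Angular cell edges: mu_{1/2} = -1 and mu_{n+1/2} = mu_{n-1/2} + w_n."""
    edges = [-1.0]
    for w in weights:
        edges.append(edges[-1] + w)
    return edges


def diamond(psi_lo, psi_hi):
    """Standard diamond-in-angle: average of the two cell-edge fluxes."""
    return 0.5 * (psi_lo + psi_hi)


def weighted_diamond(mu_n, mu_lo, mu_hi, psi_lo, psi_hi):
    """Weighted diamond of Eq. (1.93): linear interpolation to mu_n."""
    w_n = mu_hi - mu_lo
    return ((mu_n - mu_lo) * psi_hi + (mu_hi - mu_n) * psi_lo) / w_n


def linear_flux(mu):
    """Hypothetical test flux that is exactly linear in mu."""
    return 2.0 + 3.0 * mu


EDGES = cell_edges(W)
ERR_DIAMOND, ERR_WEIGHTED = [], []
for n, mu_n in enumerate(MU):
    lo, hi = EDGES[n], EDGES[n + 1]
    psi_lo, psi_hi = linear_flux(lo), linear_flux(hi)
    exact = linear_flux(mu_n)
    ERR_DIAMOND.append(abs(diamond(psi_lo, psi_hi) - exact))
    ERR_WEIGHTED.append(
        abs(weighted_diamond(mu_n, lo, hi, psi_lo, psi_hi) - exact))

print("diamond errors:         ", ERR_DIAMOND)
print("weighted-diamond errors:", ERR_WEIGHTED)
```

Because the Gauss cosines do not sit at the midpoints of their angular cells, the plain average misses the linear flux in every cell, whereas the weighted relation reproduces it to round-off.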
However, superior accuracy relative to the weighted-diamond scheme was obtained by using a quadratic-continuous approximation in the first angular cell and using a linear-discontinuous approximation in the remaining angular cells [60]. All of these factors make it challenging to develop advanced $S_N$ angular derivative treatments [100].

1.5.2 Anisotropic Scattering

The standard $S_N$ treatment for the scattering source, which is based on a Legendre polynomial expansion for the scattering cross section in conjunction with quadrature-generated spherical-harmonic moments of the angular flux, is still the workhorse for modern discrete-ordinates calculations, even though it is not always satisfactory. There are several reasons why it remains in widespread use:

• Fundamentally different approaches usually require significant processing of raw cross-section data.
• Such techniques often have memory requirements that are significantly larger than those of the standard treatment.
• The standard technique is often much more accurate than one would expect, even when highly truncated cross-section expansions are used in a calculation.

Next, we describe the standard method, together with an improvement that has had a notable impact on charged-particle calculations [54]. For simplicity, we consider the monoenergetic 1-D slab-geometry scattering source denoted by $S\psi$:

\[
S\psi(x,\mu) = \int_{-1}^{+1} \sum_{m=0}^{\infty} \frac{2m+1}{2}\, P_m(\mu)\, P_m(\mu')\, \Sigma_{s,m}\, \psi(x,\mu')\, d\mu'. \tag{1.94}
\]

We assume that the angular flux is a Legendre series of degree $L$,

\[
\psi(x,\mu) = \sum_{m=0}^{L} \frac{2m+1}{2}\, P_m(\mu)\, \phi_m(x), \tag{1.95}
\]

where

\[
\phi_m(x) = \int_{-1}^{+1} P_m(\mu)\, \psi(x,\mu)\, d\mu. \tag{1.96}
\]

Substituting Eq. (1.95) into Eq. (1.94) and using the orthogonality of the Legendre polynomials, we find that the scattering source is

\[
S\psi(x,\mu) = \sum_{m=0}^{L} \frac{2m+1}{2}\, P_m(\mu)\, \Sigma_{s,m}\, \phi_m(x). \tag{1.97}
\]

Thus, the scattering source generated by an angular flux that is a Legendre series of degree $L$ is itself a Legendre series of degree no higher than $L$.
Furthermore, the only cross-section information appearing in the scattering source is the first $L + 1$ moments of the scattering cross section. This same result is obtained if a cross-section expansion of degree $L$ is used, rather than an exact expansion of infinite degree. In this case, the convergence of the cross-section expansion is irrelevant. This powerful result is not widely appreciated. An analogous result holds for multidimensional calculations when the angular flux takes the form of a finite spherical-harmonic expansion. These results follow from the fact that the spherical-harmonic functions (which include the Legendre polynomials) are eigenfunctions of the Boltzmann scattering operator.

We now discuss how this property impacts $S_N$ calculations. For simplicity, we consider the 1-D slab-geometry scattering source. Assuming that an $N$-point angular quadrature set is used in conjunction with a cross-section expansion of degree $N-1$, the $S_N$ scattering source takes the following form:

\[
(S\psi)_n(x) = \sum_{m=0}^{N-1} \frac{2m+1}{2}\, P_m(\mu_n)\, \Sigma_{s,m}\, \phi_m(x), \tag{1.98}
\]

where

\[
\phi_m(x) = \sum_{n=1}^{N} P_m(\mu_n)\, \psi_n(x)\, w_n. \tag{1.99}
\]

We assume a Gauss–Legendre quadrature set. With $N$ quadrature points, one can uniquely interpolate those points with a polynomial of degree $N-1$. Furthermore, since an $N$-point Gauss–Legendre set exactly integrates polynomials of degree $2N-1$ [19], the Legendre moments in Eq. (1.99) are exactly the moments of the interpolatory polynomial. Considering our previous results regarding the scattering source for a polynomial angular flux representation, we see that the discrete scattering source values given in Eq. (1.98) are exactly those of the scattering source generated with the polynomial interpolation for the angular flux.
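The exactness argument above can be illustrated with a short numerical check. This is a sketch under stated assumptions, not a production implementation: it uses tabulated 4-point Gauss–Legendre data, a hypothetical set of flux moments `PHI_EXACT` that defines a degree-3 polynomial flux, and hypothetical cross-section moments `SIGMA_S`. It then compares the quadrature moments of Eq. (1.99) and the discrete source of Eq. (1.98) with their analytic counterparts.

```python
# Tabulated 4-point Gauss-Legendre (S4) cosines and weights on [-1, 1].
MU = [-0.8611363115940526, -0.3399810435848563,
      0.3399810435848563, 0.8611363115940526]
W = [0.3478548451374538, 0.6521451548625461,
     0.6521451548625461, 0.3478548451374538]
N = len(MU)

# Hypothetical data chosen only for illustration.
PHI_EXACT = [1.0, 0.6, 0.3, 0.1]   # Legendre moments of the test flux
SIGMA_S = [1.0, 0.9, 0.7, 0.5]     # scattering cross-section moments


def legendre(m, mu):
    """Legendre polynomial P_m(mu) from the three-term recurrence."""
    p_prev, p = 1.0, mu
    if m == 0:
        return p_prev
    for k in range(1, m):
        p_prev, p = p, ((2 * k + 1) * mu * p - k * p_prev) / (k + 1)
    return p


def flux(mu):
    """Polynomial flux of degree N-1 built from PHI_EXACT, as in Eq. (1.95)."""
    return sum(0.5 * (2 * m + 1) * legendre(m, mu) * PHI_EXACT[m]
               for m in range(N))


def source_from_moments(mu, moments):
    """Scattering source at mu from a list of flux moments, Eqs. (1.97)/(1.98)."""
    return sum(0.5 * (2 * m + 1) * legendre(m, mu) * SIGMA_S[m] * moments[m]
               for m in range(N))


# Discrete flux values and their quadrature moments, Eq. (1.99).
PSI = [flux(mu) for mu in MU]
PHI_QUAD = [sum(legendre(m, mu) * psi * w for mu, psi, w in zip(MU, PSI, W))
            for m in range(N)]

# Discrete S_N source versus the analytic source of the polynomial flux.
SOURCE_SN = [source_from_moments(mu, PHI_QUAD) for mu in MU]
SOURCE_EXACT = [source_from_moments(mu, PHI_EXACT) for mu in MU]

print("moment errors:", [abs(a - b) for a, b in zip(PHI_QUAD, PHI_EXACT)])
print("source errors:", [abs(a - b) for a, b in zip(SOURCE_SN, SOURCE_EXACT)])
```

The products $P_m(\mu)\psi(\mu)$ have degree at most $2N-2$, which is within the $2N-1$ exactness of the Gauss set, so the two sets of numbers agree to round-off regardless of the values chosen for `SIGMA_S`.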
Thus, if the true angular flux is well-represented by the polynomial interpolation of the discrete angular flux values, the true scattering source will similarly be well-represented by the polynomial interpolation of the discrete scattering source values. We again stress that this is true regardless of the convergence of the cross-section expansion. This property does not guarantee positive discrete scattering source values, given positive discrete angular flux values, because the polynomial interpolation of the discrete angular fluxes can be negative at some points, even though the discrete values themselves are positive. However, since polynomial interpolation at the Gauss points is known to be stable, any negativities in the angular flux interpolation will be small relative to the maximum discrete angular flux value. Therefore, any negativities in the discrete scattering source values will also be small relative to the maximum discrete scattering source value. Hence, accurate $S_N$ solutions for angle-integrated quantities can be obtained in a wide variety of problems in 1-D slab geometry with highly anisotropic scattering using Gauss–Legendre quadrature, even if the scattering cross-section expansion is highly truncated.

If a Gauss–Legendre quadrature set is not used, some of the scattering source moments of the interpolatory polynomial will be properly computed, but others will not, depending on the accuracy of the quadrature set. It can be seen from Eq. (1.98) that the $m$th moment of the scattering source is just the product of the $m$th moment of the scattering cross section and the $m$th moment of the angular flux. Thus, any flux moment that is erroneous yields a corresponding scattering source moment that is erroneous. This deficiency can be treated by generating a separate set of quadrature weights for each moment.
In particular, for each $0 \le m \le N-1$, one can generate $N$ weights, $\{w_{m,n}\}_{n=1}^{N}$, that are defined by the $N$ linear equations

\[
\sum_{n=1}^{N} P_m(\mu_n)\, P_j(\mu_n)\, w_{m,n} = \frac{2}{2m+1}\, \delta_{mj}, \quad 0 \le j \le N-1. \tag{1.100}
\]

Then Eq. (1.99) is replaced by

\[
\phi_m(x) = \sum_{n=1}^{N} P_m(\mu_n)\, \psi_n(x)\, w_{m,n}. \tag{1.101}
\]

This method gives the desirable properties of Gauss quadrature to non-Gauss quadrature for the purpose of calculating the scattering source. (However, there is no guarantee that the weights generated in this way will be positive.) This is one variant of a more general technique known as Galerkin quadrature [54].

To present the more general method, we reexpress the standard $S_N$ technique for calculating the scattering source in terms of matrix algebra. In particular, we write Eqs. (1.98) and (1.99) as follows:

\[
\vec{S} = \mathbf{M} \boldsymbol{\Sigma} \mathbf{D} \vec{\psi}, \tag{1.102}
\]

where $\vec{\psi}$ is the vector of discrete angular flux values:

\[
\vec{\psi} \equiv (\psi_1, \psi_2, \ldots, \psi_N)^T; \tag{1.103}
\]

$\mathbf{D}$ is the $N \times N$ matrix,

\[
D_{m,n} \equiv P_m(\mu_n)\, w_n; \tag{1.104}
\]

$\boldsymbol{\Sigma}$ is the $N \times N$ diagonal matrix:

\[
\boldsymbol{\Sigma} \equiv \mathrm{diag}(\Sigma_0, \Sigma_1, \Sigma_2, \ldots); \tag{1.105}
\]

and $\mathbf{M}$ is the $N \times N$ matrix:

\[
M_{n,m} \equiv \frac{2m+1}{2}\, P_m(\mu_n). \tag{1.106}
\]

The discrete-to-moment matrix $\mathbf{D}$ maps a vector of discrete angular flux values to a corresponding vector of Legendre flux moments. We note from Eq. (1.104) that the first row of this matrix consists of the standard quadrature weights, because $P_0(\mu) = 1$. The matrix $\boldsymbol{\Sigma}$ is the scattering matrix in the Legendre basis, or equivalently, the scattering matrix for the $P_{N-1}$ approximation. It maps a vector of Legendre flux moments to a corresponding vector of Legendre scattering source moments. The moment-to-discrete matrix $\mathbf{M}$ maps a vector of Legendre scattering source moments to a corresponding vector of discrete scattering source values. Using the orthogonality property of the Legendre polynomials, one can show that with Gauss quadrature, $\mathbf{M} = \mathbf{D}^{-1}$:

\[
(\mathbf{D}\mathbf{M})_{i,j} = \sum_{k=1}^{N} D_{i,k}\, M_{k,j} = \sum_{k=1}^{N} P_i(\mu_k)\, \frac{2j+1}{2}\, P_j(\mu_k)\, w_k = \delta_{i,j}. \tag{1.107}
\]

Thus, using Eq. (1.107), we can reexpress Eq.
(1.102) as follows:

\[
\vec{S} = \mathbf{D}^{-1} \boldsymbol{\Sigma} \mathbf{D} \vec{\psi}. \tag{1.108}
\]

Equation (1.108) shows that the $S_N$ scattering matrix represents a similarity transformation of the Legendre scattering matrix, $\boldsymbol{\Sigma}$. This means that the standard $S_N$ scattering source with Gauss quadrature (and a Legendre cross-section expansion of degree $N-1$) is equivalent to the scattering source of the $P_{N-1}$ approximation. This is to be expected, considering the well-known equivalence between the $S_N$ and $P_{N-1}$ approximations in 1-D slab geometry [71].

If Gauss quadrature is not used, then $\mathbf{M} \ne \mathbf{D}^{-1}$, which is an undesirable result. The matrix $\mathbf{D}$ maps a vector of $N$ discrete function values to $N$ Legendre moments, and the matrix $\mathbf{M}$ maps a vector of $N$ Legendre moments to $N$ discrete function values. One can uniquely define a polynomial of degree $N-1$ either in terms of $N$ Legendre moments or in terms of $N$ discrete function values at $N$ distinct points. Therefore, $\mathbf{D}$ and $\mathbf{M}$ should be inverses of one another. The moment-dependent weights defined in Eq. (1.100) ensure that this will be the case. We note that it is not necessary to actually generate the moment-dependent weights; one can directly obtain the correct matrix $\mathbf{D}$ simply by calculating the inverse of $\mathbf{M}$.

This Galerkin quadrature method is useful for 1-D calculations when quadratures with special directions are desired. For example, Lobatto and double Radau quadrature sets, which have quadrature points at $\mu = \pm 1$, are particularly useful for simulating a normally incident plane-wave of radiation [54].

In 2-D and 3-D, the Galerkin quadrature method is based on spherical-harmonic interpolation of the discrete angular fluxes rather than polynomial interpolation. Choosing the correct spherical harmonics for interpolation is more complicated in multidimensions because the number of spherical-harmonic functions of order $N-1$ does not equal the number of discrete directions in a multidimensional $S_N$ quadrature set.
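For the 1-D case, the construction of a Galerkin discrete-to-moment matrix by inverting the moment-to-discrete matrix can be sketched in a few lines. The example below is illustrative only. It assumes a 4-point Lobatto set (points $\pm 1$, $\pm 1/\sqrt{5}$ with weights $1/6$, $5/6$), and the helper names `matmul` and `invert` are hypothetical, written in plain Python so that the block is self-contained.

```python
import math

# 4-point Lobatto set on [-1, 1]; exact only for polynomials of degree <= 5.
MU = [-1.0, -1.0 / math.sqrt(5.0), 1.0 / math.sqrt(5.0), 1.0]
W = [1.0 / 6.0, 5.0 / 6.0, 5.0 / 6.0, 1.0 / 6.0]
N = len(MU)


def legendre(m, mu):
    """Legendre polynomial P_m(mu) from the three-term recurrence."""
    p_prev, p = 1.0, mu
    if m == 0:
        return p_prev
    for k in range(1, m):
        p_prev, p = p, ((2 * k + 1) * mu * p - k * p_prev) / (k + 1)
    return p


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def invert(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Moment-to-discrete matrix, Eq. (1.106): rows are directions, columns moments.
M = [[0.5 * (2 * m + 1) * legendre(m, MU[n]) for m in range(N)]
     for n in range(N)]
# Standard discrete-to-moment matrix, Eq. (1.104).
D_STD = [[legendre(m, MU[n]) * W[n] for n in range(N)] for m in range(N)]
# Galerkin discrete-to-moment matrix: the inverse of M.
D_GAL = invert(M)

DM_STD = matmul(D_STD, M)
DM_GAL = matmul(D_GAL, M)
print("standard (DM)[3][3]:", DM_STD[3][3])   # differs from 1 for Lobatto
print("Galerkin (DM)[3][3]:", DM_GAL[3][3])
print("first row of Galerkin D:", D_GAL[0])
```

With the standard weights, the highest diagonal entry of $\mathbf{D}\mathbf{M}$ involves a degree-6 integrand that the Lobatto set cannot integrate exactly, so it departs from unity. With the inverted matrix, $\mathbf{D}\mathbf{M}$ is the identity by construction, and the first row of `D_GAL` reproduces the Lobatto weights, since those weights already integrate the cubic interpolant exactly.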
Nonetheless, suitable interpolation functions have been defined for triangular quadrature sets [54]. The Galerkin quadrature method in 2-D and 3-D can be much more accurate than the standard quadrature method with highly anisotropic scattering because there is no analog of Gauss quadrature in 2-D and 3-D, i.e., there is no 2-D or 3-D quadrature set that will exactly calculate all of the spherical-harmonic moments of the interpolated angular flux. In fact, fewer than half of the moments are exactly calculated with typical sets, e.g., even-moment symmetric sets [14].

The Galerkin quadrature method can also accommodate nonpolynomial or nonspherical-harmonic interpolation functions [54]. To demonstrate this in 1-D, we consider a general interpolatory basis set for a given set of $N$ discrete directions:

\[
\psi(x,\mu) = \sum_{n=1}^{N} \psi_n(x)\, B_n(\mu), \tag{1.109}
\]

where

\[
B_i(\mu_j) = \delta_{ij}. \tag{1.110}
\]

Multiplying Eq. (1.109) by $P_m(\mu)$ and integrating over all directions, we obtain

\[
\phi_m(x) = \sum_{n=1}^{N} \psi_n(x) \int_{-1}^{+1} P_m(\mu)\, B_n(\mu)\, d\mu. \tag{1.111}
\]

It follows from the definition of the discrete-to-moment matrix $\mathbf{D}$ and Eq. (1.111) that the components of $\mathbf{D}$ are

\[
D_{m,n} = \int_{-1}^{+1} P_m(\mu)\, B_n(\mu)\, d\mu. \tag{1.112}
\]

Equation (1.112) is valid for all types of interpolation functions, including polynomials. We note that the first row of the discrete-to-moment matrix consists of standard quadrature weights that are exact for integrating the interpolated angular flux:

\[
\phi(x) = \int_{-1}^{+1} \psi(x,\mu)\, d\mu = \sum_{n=1}^{N} \psi_n\, w_n, \tag{1.113}
\]

where

\[
w_n = \int_{-1}^{+1} B_n(\mu)\, d\mu. \tag{1.114}
\]

These are called the companion quadrature weights. Nonpolynomial interpolation requires much more computational effort to generate the discrete-to-moment matrix, because the interpolatory basis functions must be explicitly formed and their products with the Legendre polynomials must be integrated. Also, one must invert the discrete-to-moment matrix to obtain the moment-to-discrete matrix, because the standard $S_N$ expression for the moment-to-discrete matrix, Eq.
(1.106), is only correct for polynomial interpolation. We refer to the scattering source obtained by operating on the interpolated angular flux with the exact scattering kernel as the exact interpolation-generated scattering source. When the interpolation functions are nonpolynomial, the exact interpolation-generated scattering source is generally not expressible in terms of the interpolation functions. Thus, if the discrete scattering source values obtained from the Galerkin quadrature method are interpolated, one generally does not obtain the exact interpolation-generated scattering source. Rather, one obtains a scattering source that has the same Legendre moments of degree 0 through $N-1$ as the exact interpolation-generated scattering source [54].

As an example of a useful nonpolynomial interpolation scheme, we consider a linear-discontinuous angular trial space in 1-D spherical geometry. Such a trial space is fully compatible with a linear-discontinuous treatment for the angular derivative term [60]. An “$S_N$” trial space of this type is defined to consist of $N/2$ equal-width piecewise-linear segments in $\mu$, where $N$ is even and $N > 2$. There are two discrete angular flux unknowns per segment located at the local Gauss $S_2$ quadrature points, i.e., the points corresponding to $\pm 1/\sqrt{3}$ obtained by linearly mapping $[-1, +1]$ onto each segment. A Galerkin quadrature set is generated for this trial space by exactly evaluating the Legendre angular flux moments of degree 0 through $N-1$ associated with the linear-discontinuous interpolation of the $N$ discrete flux values [60]. The companion quadrature set corresponding to the Galerkin set, i.e., the standard quadrature set having the same quadrature points as the Galerkin set with quadrature weights that exactly integrate the interpolated angular flux representation, corresponds to a local Gauss $S_2$ set on each linear segment.
Since each local Gauss set exactly integrates cubic polynomials, it follows that the companion set will exactly evaluate the zeroth, first, and second Legendre moments of the interpolated angular flux. However, all higher flux moments will be inexactly evaluated, regardless of the quadrature order $N$. This is in contrast to the Galerkin quadrature set of order $N$, which always exactly evaluates the Legendre angular flux moments of degree 0 through $N-1$. Furthermore, because the companion quadrature set never exactly integrates polynomials of degree greater than 3, one cannot use a Legendre cross-section expansion of degree greater than 3 with the companion quadrature set (otherwise particle conservation will be lost). Thus, the accuracy of the scattering source with highly anisotropic scattering can be greatly improved for linear-discontinuous angular trial spaces in 1-D by using Galerkin quadrature. This enables one to use a linear-discontinuous approximation for the angular derivative term in 1-D spherical geometry in conjunction with an accurate treatment for highly anisotropic scattering.

Perhaps the most important property of the Galerkin quadrature method, independent of the type of functions used to interpolate the discrete angular flux values, is that straight-ahead delta-function scattering is exactly treated. This has a very strong impact on charged-particle calculations, because it enables the total scattering cross section to be dramatically reduced (with an attendant decrease in the scattering ratio) while leaving the $S_N$ solution invariant. To demonstrate how straight-ahead scattering is exactly treated, let us consider the following differential scattering cross section

\[
\Sigma_s(\mu_0) = \alpha\, \delta(\mu_0 - 1), \tag{1.115}
\]

where $\alpha$ is an arbitrary constant. The Boltzmann scattering operator associated with this cross section is $\alpha$ times the identity operator:

\[
S\psi = \int_{-1}^{+1} \alpha\, \delta(\mu_0 - 1)\, \psi(\mu')\, d\mu' = \alpha\, \psi(\mu). \tag{1.116}
\]

Furthermore, the Legendre moments of this cross section are all equal to $\alpha$:

\[
\Sigma_m = \alpha \int_{-1}^{+1} \delta(\mu_0 - 1)\, P_m(\mu_0)\, d\mu_0 = \alpha\, P_m(1) = \alpha, \quad 0 \le m < \infty. \tag{1.117}
\]

Thus, the diagonal matrix of cross-section moments used to construct the vector of discrete scattering source values is $\alpha$ times the identity matrix:

\[
\boldsymbol{\Sigma} = \alpha \mathbf{I}. \tag{1.118}
\]

Substituting from Eq. (1.118) into Eq. (1.108), and recognizing that $\mathbf{M} = \mathbf{D}^{-1}$, we obtain

\[
\vec{S} = \mathbf{M}\, \alpha \mathbf{I}\, \mathbf{D} \vec{\psi} = \alpha\, \mathbf{M} \mathbf{D} \vec{\psi} = \alpha \vec{\psi}, \tag{1.119}
\]

which agrees with Eq. (1.116). We have explicitly considered only the 1-D case, but this result also applies in multidimensions.

For charged particles, the scattering ratio for each group is generally very close to unity, and the mean free path is very small; nonetheless, the transport process is not diffusive. This is because the “transport-corrected” scattering ratio, $(\Sigma_0 - \Sigma_1)/\Sigma_t$, is not close to unity. Within-group straight-ahead scattering is equivalent to no scattering at all, since the particle scatters into the same group and direction it had before the scattering. Thus, one can add or subtract a straight-ahead differential scattering cross section from any physically correct within-group cross section without changing the analytic transport solution. Since all Galerkin quadratures treat straight-ahead scattering exactly, one can subtract the truncated expansion for a within-group straight-ahead scattering cross section from the physically correct cross-section expansion without changing the $S_N$ solution. For instance, let us consider the total Boltzmann scattering operator (outscatter minus inscatter) associated with a 1-D $S_N$ Galerkin quadrature:

\[
\Sigma_0 \vec{\psi} - \vec{S} = \left[ \mathbf{M} \Sigma_0 \mathbf{D} - \mathbf{M} \boldsymbol{\Sigma} \mathbf{D} \right] \vec{\psi} = \mathbf{M} \left[ \Sigma_0 \mathbf{I} - \boldsymbol{\Sigma} \right] \mathbf{D} \vec{\psi} = \mathbf{M} \left[ \mathrm{diag}(\Sigma_0 - \Sigma_0,\ \Sigma_0 - \Sigma_1,\ \ldots,\ \Sigma_0 - \Sigma_{N-1}) \right] \mathbf{D} \vec{\psi}. \tag{1.120}
\]

Subtracting the delta-function cross section given in Eq.
(1.115) from the physically correct cross section, we obtain the following modified outscatter matrix:

\[
\Sigma_0^{*} \mathbf{I} = (\Sigma_0 - \alpha)\, \mathbf{I}, \tag{1.121}
\]

and the following modified inscatter matrix:

\[
(\mathbf{M} \boldsymbol{\Sigma} \mathbf{D})^{*} = \mathbf{M} \boldsymbol{\Sigma}^{*} \mathbf{D}, \tag{1.122}
\]

where

\[
\boldsymbol{\Sigma}^{*} = \mathrm{diag}(\Sigma_0 - \alpha,\ \Sigma_1 - \alpha,\ \ldots,\ \Sigma_{N-1} - \alpha). \tag{1.123}
\]

We note that while the outscatter and inscatter matrices are modified by subtraction of the straight-ahead scattering cross section, the total Boltzmann matrix, Eq. (1.120), does not change, i.e.,

\[
\Sigma_0^{*} \mathbf{I} - \mathbf{M} \boldsymbol{\Sigma}^{*} \mathbf{D} = \Sigma_0 \mathbf{I} - \mathbf{M} \boldsymbol{\Sigma} \mathbf{D}. \tag{1.124}
\]

Thus, the $S_N$ solution does not change. However, the convergence properties of the source iteration process can be dramatically changed. A discussion of the optimal choice of $\alpha$ is beyond the scope of this review, but the traditional choice (which is nearly optimal) is to set $\alpha = \Sigma_N$. This extended transport correction can greatly reduce both the total scattering cross section and the scattering ratio in relativistic charged-particle transport calculations [9, 27]. We note that the significance of this cross-section modification depends on the convergence of the cross-section expansion. If the expansion is essentially converged, $\Sigma_{N-1}$ will be very small relative to $\Sigma_0$, resulting in a negligible reduction in $\Sigma_0$; and if the cross-section expansion is highly truncated, $\Sigma_{N-1}$ will be comparable to $\Sigma_0$, resulting in a significant reduction in $\Sigma_0$. Acceptable computational efficiency often cannot be achieved without the use of the extended transport correction in charged-particle calculations. Thus, with Galerkin quadrature, the extended transport correction (correctly) leaves the $S_N$ solution invariant. This is a powerful motivation for using the Galerkin method in charged-particle calculations.

1.6 Advances in Fokker–Planck Discretizations

Next, we discuss advances in Fokker–Planck angle and energy discretizations for charged-particle transport. We first consider the continuous-scattering operator, and then the continuous-slowing-down operator.
1.6.1 The Continuous-Scattering Operator

The continuous-scattering operator in 1-D slab geometry, denoted here by $\Gamma$, is

\[
\Gamma \psi(x,\mu,E) = \frac{\Sigma_{r,tr}}{2}\, \frac{\partial}{\partial \mu} \left[ (1 - \mu^2)\, \frac{\partial}{\partial \mu} \psi(x,\mu,E) \right]. \tag{1.125}
\]

Taking the zeroth and first angular moments of Eq. (1.125), we obtain

\[
\int_{-1}^{+1} \Gamma \psi(x,\mu)\, d\mu = 0 \tag{1.126}
\]

and

\[
\int_{-1}^{+1} \mu\, \Gamma \psi(x,\mu)\, d\mu = -\Sigma_{r,tr}\, J(x), \tag{1.127}
\]

respectively, where $J(x)$ is the current

\[
J(x) = \int_{-1}^{+1} \mu\, \psi(x,\mu)\, d\mu. \tag{1.128}
\]

It is highly desirable for a numerical approximation to the continuous-scattering operator to preserve both the zeroth and first angular moments of that operator. Also, since the continuous-scattering operator is a diffusion operator on the unit sphere [see Section 1.3.2], it is highly desirable that the discretization for this operator yield a coefficient matrix that is symmetric and monotone. These two properties ensure that the matrix (like the analytic operator) will have only positive real eigenvalues and will yield positive solutions given positive sources.

A straightforward discretization of Eq. (1.125) is

\[
(\Gamma \psi)_n = \frac{\Sigma_{r,tr}}{2}\, \frac{1}{w_n} \left[ \left( 1 - \mu_{n+1/2}^2 \right) \frac{\psi_{n+1} - \psi_n}{\mu_{n+1} - \mu_n} - \left( 1 - \mu_{n-1/2}^2 \right) \frac{\psi_n - \psi_{n-1}}{\mu_n - \mu_{n-1}} \right], \tag{1.129}
\]

where

\[
\mu_{n+1/2} = \mu_{n-1/2} + w_n, \quad 1 \le n \le N, \qquad \mu_{1/2} = -1. \tag{1.130}
\]

This discretization results in a symmetric and monotone coefficient matrix and preserves Eq. (1.126) under numerical integration, but it preserves Eq. (1.127) only if each quadrature point lies at the center of its associated angular interval. (As previously noted, this never occurs with standard quadrature sets.) A discretization that does preserve Eq. (1.127) with standard quadrature sets is [40]

\[
(\Gamma \psi)_n = \frac{\Sigma_{r,tr}}{2}\, \frac{1}{w_n} \left[ \beta_{n+1/2}\, \frac{\psi_{n+1} - \psi_n}{\mu_{n+1} - \mu_n} - \beta_{n-1/2}\, \frac{\psi_n - \psi_{n-1}}{\mu_n - \mu_{n-1}} \right], \tag{1.131}
\]

where the $\beta$-coefficients are defined by Eq. (1.23b) and all else remains as previously defined. Thus the $\beta$-coefficients used to admit the constant solution in the discretization of the angular derivative term in the spherical geometry transport equation are also used in the discretization of the continuous-scattering operator to preserve Eq. (1.127).
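The moment-preservation claims for Eqs. (1.129) and (1.131) can be checked with a small script. The sketch below rests on several assumptions that should be read as such: Eq. (1.23b) is not reproduced in this section, so the $\beta$-coefficients are taken to follow the customary recursion $\beta_{n+1/2} = \beta_{n-1/2} - 2\mu_n w_n$ with $\beta_{1/2} = 0$; the quadrature is a tabulated 4-point Gauss–Legendre set; and `SIGMA_TR` and `PSI` are hypothetical values chosen only for illustration.

```python
# Tabulated 4-point Gauss-Legendre (S4) cosines and weights on [-1, 1].
MU = [-0.8611363115940526, -0.3399810435848563,
      0.3399810435848563, 0.8611363115940526]
W = [0.3478548451374538, 0.6521451548625461,
     0.6521451548625461, 0.3478548451374538]

SIGMA_TR = 1.0                 # hypothetical restricted transport cross section
PSI = [0.5, 1.0, 2.0, 3.5]     # hypothetical positive discrete angular fluxes


def beta_coefficients(mu, w):
    """Assumed form of Eq. (1.23b): beta_{n+1/2} = beta_{n-1/2} - 2 mu_n w_n."""
    beta = [0.0]
    for m, wt in zip(mu, w):
        beta.append(beta[-1] - 2.0 * m * wt)
    return beta


def edge_one_minus_mu2(w):
    """Edge coefficients 1 - mu_{n+1/2}^2 with edges from Eq. (1.130)."""
    edge, coeffs = -1.0, [0.0]
    for wt in w:
        edge += wt
        coeffs.append(1.0 - edge * edge)
    return coeffs


def continuous_scattering(psi, mu, w, sigma_tr, edge_coeff):
    """Three-point stencil of Eqs. (1.129)/(1.131); edge_coeff[k] sits
    between directions k-1 and k (0-based), and vanishes at both ends."""
    n_dir = len(mu)
    result = []
    for n in range(n_dir):
        upper = lower = 0.0
        if n + 1 < n_dir:
            upper = edge_coeff[n + 1] * (psi[n + 1] - psi[n]) / (mu[n + 1] - mu[n])
        if n > 0:
            lower = edge_coeff[n] * (psi[n] - psi[n - 1]) / (mu[n] - mu[n - 1])
        result.append(0.5 * sigma_tr * (upper - lower) / w[n])
    return result


def moments(values):
    """Zeroth and first quadrature moments of a discrete angular function."""
    zeroth = sum(v * wt for v, wt in zip(values, W))
    first = sum(m * v * wt for m, v, wt in zip(MU, values, W))
    return zeroth, first


CURRENT = moments(PSI)[1]
ZERO_SIMPLE, FIRST_SIMPLE = moments(
    continuous_scattering(PSI, MU, W, SIGMA_TR, edge_one_minus_mu2(W)))
ZERO_BETA, FIRST_BETA = moments(
    continuous_scattering(PSI, MU, W, SIGMA_TR, beta_coefficients(MU, W)))

print("target first moment:", -SIGMA_TR * CURRENT)
print("Eq. (1.129):", ZERO_SIMPLE, FIRST_SIMPLE)
print("Eq. (1.131):", ZERO_BETA, FIRST_BETA)
```

Both stencils telescope to a vanishing zeroth moment, but only the $\beta$-weighted form returns $-\Sigma_{r,tr} J$ for the first moment, which is the behavior described in the text for quadrature points that are not centered in their angular intervals.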
This moment-preserving approach has been extended to multidimensions for product quadratures [120]. For instance, in three dimensions, the continuous-scattering operator $\Gamma$ is

\[
\Gamma = \frac{\Sigma_{r,tr}}{2} \left[ \frac{\partial}{\partial \mu} (1 - \mu^2) \frac{\partial}{\partial \mu} + \frac{1}{1 - \mu^2}\, \frac{\partial^2}{\partial \omega^2} \right]. \tag{1.132}
\]

Taking the zeroth and first angular moments of Eq. (1.132), we obtain

\[
\int_{0}^{2\pi} \int_{-1}^{+1} \Gamma \psi\, d\mu\, d\omega = 0 \tag{1.133}
\]

and

\[
\int_{0}^{2\pi} \int_{-1}^{+1} \vec{\Omega}\, \Gamma \psi\, d\mu\, d\omega = -\Sigma_{r,tr}\, \vec{J}, \tag{1.134}
\]

respectively, where $\vec{J}$ is the current

\[
\vec{J} = \int_{0}^{2\pi} \int_{-1}^{+1} \vec{\Omega}\, \psi\, d\mu\, d\omega. \tag{1.135}
\]

Standard triangular $S_N$ quadrature sets do not represent a rectangular angular mesh on the unit sphere, but product sets do. Hence, it is reasonably straightforward to derive a discretization for the continuous-scattering operator assuming a product quadrature set. An $S_N$ product quadrature set has $2N^2$ directions and is formed by the tensor product of an $N$-point quadrature defined over the polar cosine and a $2N$-point quadrature defined over the azimuthal angle. Each direction can be uniquely referenced in terms of a polar index $n$ and an azimuthal index $j$. A moment-preserving discretization for the 3-D continuous-scattering operator is

\[
(\Gamma \psi)_{n,j} = \frac{\Sigma_{r,tr}}{2} \left[ \frac{1}{w^{p}_{n}} \left( \beta_{n+1/2}\, \frac{\psi_{n+1,j} - \psi_{n,j}}{\mu_{n+1} - \mu_n} - \beta_{n-1/2}\, \frac{\psi_{n,j} - \psi_{n-1,j}}{\mu_n - \mu_{n-1}} \right) + \frac{1}{1 - \mu_n^2}\, \frac{\gamma_n}{w^{a}_{j}} \left( \frac{\psi_{n,j+1} - \psi_{n,j}}{\omega_{j+1} - \omega_j} - \frac{\psi_{n,j} - \psi_{n,j-1}}{\omega_j - \omega_{j-1}} \right) \right]. \tag{1.136}
\]

Here, $w^{p}_{n}$ and $w^{a}_{j}$ are weights associated with the polar and azimuthal quadratures that sum to $2$ and $2\pi$, respectively, the $\beta$-coefficients are identical to those defined for the 1-D case, and

\[
\gamma_n = \frac{\pi^2 K_n}{2 N^2 \left[ 1 - \cos(\pi / N) \right]}, \quad 1 \le n \le N, \tag{1.137}
\]

\[
K_n = 2 (1 - \mu_n^2) + c_n \sqrt{1 - \mu_n^2}, \tag{1.138}
\]

\[
c_n = \frac{\beta_{n+1/2}\, d_{n+1/2} - \beta_{n-1/2}\, d_{n-1/2}}{w^{p}_{n}}, \tag{1.139}
\]

\[
d_{n+1/2} = \frac{\sqrt{1 - \mu_{n+1}^2} - \sqrt{1 - \mu_n^2}}{\mu_{n+1} - \mu_n}. \tag{1.140}
\]

The above discretization is defined only for product quadrature sets constructed with azimuthal quadrature sets of the Chebychev type:

\[
\omega_j = \frac{(2j - 1)\, \pi}{2N}, \quad 1 \le j \le 2N, \tag{1.141}
\]

\[
w^{a}_{j} = \frac{\pi}{N}, \quad 1 \le j \le 2N. \tag{1.142}
\]

This discretization has a 5-point stencil, and is symmetric positive-definite (SPD) and monotone. It preserves Eqs. (1.133) and (1.134).
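The role of the equally spaced (Chebychev) azimuthal set can be illustrated in isolation. The sketch below is an assumption-laden illustration rather than the published scheme: it builds $2N$ equally spaced azimuthal angles with equal weights $\pi/N$, applies a periodic three-point second difference in $\omega$, and shows that a single scale factor $h^2 / [2(1 - \cos h)]$ with $h = \pi/N$ makes the difference operator return exactly $-\cos\omega$ and $-\sin\omega$. The polar factor that would multiply this scale in a full $\gamma$-coefficient is deliberately left out, and all names are hypothetical.

```python
import math


def chebyshev_azimuthal(n_polar):
    """2N equally spaced azimuthal angles on (0, 2*pi) with equal weights."""
    count = 2 * n_polar
    omegas = [(2 * j - 1) * math.pi / (2 * n_polar) for j in range(1, count + 1)]
    weights = [math.pi / n_polar] * count
    return omegas, weights


def azimuthal_second_difference(values, spacing, weight):
    """Periodic three-point second difference in omega divided by the weight."""
    count = len(values)
    result = []
    for j in range(count):
        forward = (values[(j + 1) % count] - values[j]) / spacing
        backward = (values[j] - values[j - 1]) / spacing   # j - 1 wraps at j = 0
        result.append((forward - backward) / weight)
    return result


N_POLAR = 4
OMEGAS, WEIGHTS = chebyshev_azimuthal(N_POLAR)
H = math.pi / N_POLAR                       # angular spacing and weight
SCALE = H * H / (2.0 * (1.0 - math.cos(H)))  # azimuthal correction factor

COS_VALUES = [math.cos(w) for w in OMEGAS]
SIN_VALUES = [math.sin(w) for w in OMEGAS]
RAW_COS = azimuthal_second_difference(COS_VALUES, H, H)
RAW_SIN = azimuthal_second_difference(SIN_VALUES, H, H)

# The exact second derivatives are -cos(omega) and -sin(omega).
ERR_RAW = max(abs(r + c) for r, c in zip(RAW_COS, COS_VALUES))
ERR_SCALED_COS = max(abs(SCALE * r + c) for r, c in zip(RAW_COS, COS_VALUES))
ERR_SCALED_SIN = max(abs(SCALE * r + s) for r, s in zip(RAW_SIN, SIN_VALUES))

print("unscaled error:", ERR_RAW)
print("scaled errors: ", ERR_SCALED_COS, ERR_SCALED_SIN)
```

Because the spacing is uniform, the same scale factor corrects both the cosine and the sine of the azimuthal angle at every point, which is why one coefficient per polar level can satisfy both azimuthal first-moment conditions for this type of azimuthal set.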
The restriction to Chebychev azimuthal quadrature arises from the fact that three first-moment equations must be met, but the $\beta$-coefficients and the $\gamma$-coefficients can only be defined to meet two of them. The $\beta$-coefficients are defined to preserve the moment equation associated with the polar cosine, i.e., $\cos\theta$, and for a general quadrature set, the $\gamma$-coefficients can preserve one of the two moment equations associated with the cosines that depend on the azimuthal angle, i.e., $\sin\theta \cos\omega$ or $\sin\theta \sin\omega$. However, when a Chebychev azimuthal quadrature set is used, the $\gamma$-coefficients can be defined to preserve both azimuthal cosines.

An alternative to a finite-difference representation for the continuous-scattering operator is a Legendre moment representation. Because the total Boltzmann scattering operator (outscatter minus inscatter) and the continuous-scattering operator have the same eigenfunctions, one can define effective cross-section moments to represent the continuous-scattering operator [31]. In particular, the $m$th eigenvalue of the continuous-scattering operator is

\[
\lambda_{c,m} = \Sigma_{r,tr}\, \frac{m(m+1)}{2}, \tag{1.143}
\]

while the $m$th eigenvalue of the total Boltzmann scattering operator is

\[
\lambda_{b,m} = \Sigma_0 - \Sigma_m. \tag{1.144}
\]

Without loss of generality, we assume that a 1-D $S_N$ calculation is performed with Galerkin quadrature. In this case, an effective cross-section expansion of degree $N-1$ is required. The first step in defining the effective cross-section moments is to equate the eigenvalues defined by Eqs. (1.143) and (1.144):

\[
\Sigma^{e}_{0} - \Sigma^{e}_{m} = \Sigma_{r,tr}\, \frac{m(m+1)}{2}, \quad 0 \le m \le N-1. \tag{1.145}
\]

This step does not uniquely define the effective cross-section moments, but rather leaves $\Sigma^{e}_{0}$ a free parameter. The only consideration for choosing $\Sigma^{e}_{0}$ is to minimize the effective scattering ratio. In analogy with the choice of $\alpha$ in the extended transport correction, $\Sigma^{e}_{0}$ is defined so that the last moment in the expansion is zero:

\[
\Sigma^{e}_{0} = \Sigma_{r,tr}\, \frac{(N-1) N}{2}. \tag{1.146}
\]

Substituting from Eq.
(1.146) into Eq. (1.145), we obtain an expression for the remaining effective moments:

$$ \Sigma^{e}_{m} = \Sigma_{r,tr}\,\frac{(N-1)N - m(m+1)}{2}, \quad 1 \le m \le N-1. \tag{1.147} $$

When used in conjunction with Galerkin quadrature, the number of eigenvalues preserved is always equal to the number of discrete directions. This is to be contrasted with finite-difference approximations that only preserve the zeroth and first moments. Thus, the moment representation can be considered to be more accurate than the finite-difference representations, but the coefficient matrix for the moment representation is not monotone. Thus the moment representation is less robust than the finite-difference representation.

1.6.2 The Continuous-Slowing-Down Operator

We next consider the continuous-slowing-down operator:

$$ C\,\psi(x,\mu,E) = \frac{\partial}{\partial E}\,\beta\,\psi(x,\mu,E). \tag{1.148} $$

Discretizations of this operator have advanced from a multigroup or step-like treatment [31] through a diamond treatment [41] to a linear-discontinuous finite-element treatment [47]. Because standard $S_N$ codes were not originally intended to solve charged-particle transport problems, early efforts in treating the continuous-slowing-down operator were spent defining effective cross sections to implement various discretization schemes via the standard $S_N$ scattering source representation [31, 40, 47]. Modern codes that were designed to solve the charged-particle transport equation use a linear-discontinuous finite-element discretization in space, but treat the energy derivative in the same way as the spatial derivatives. Thus, this operator is inverted via a space-energy sweep [101]. Self-adjoint codes must still treat the term as a scattering source, and invert it via source iteration [112]. Because the spectral radius associated with iterations on the continuous-slowing-down term can be very close to unity, these iterations must be accelerated.
A synthetic acceleration technique, based on the diamond-difference approximation as the low-order operator, has been used for this purpose [47].

Although the pure Fokker–Planck equation can be solved, most charged-particle calculations are carried out with the Boltzmann–Fokker–Planck equation [37]. In this case, the Boltzmann scattering operator is treated with the standard multigroup approximation. The resulting hybrid multigroup/linear-discontinuous operator is formally treated as a linear-discontinuous operator. One simply takes the scattering kernel to be piecewise-constant in energy; equivalently, one takes all energy slopes associated with the scattering kernel to be zero.

For instance, let us assume that the angular flux within group $g$ has the following linear-discontinuous dependence:

$$ \psi(E) = \psi_{a,g} + \psi_{e,g}\,\frac{2}{\Delta E_{g}}\left(E - E_{g}\right), \quad E_{g+1/2} \le E < E_{g-1/2}, \tag{1.149} $$

where $\psi_{a,g}$ is the group average flux:

$$ \psi_{a,g} = \frac{1}{\Delta E_{g}}\int_{E_{g+1/2}}^{E_{g-1/2}} \psi(E)\, dE, \tag{1.150} $$

$\psi_{e,g}$ is the group energy slope:

$$ \psi_{e,g} = \frac{6}{\Delta E_{g}^{2}}\int_{E_{g+1/2}}^{E_{g-1/2}} \left(E - E_{g}\right)\psi(E)\, dE, \tag{1.151} $$

and $\Delta E_{g} = E_{g-1/2} - E_{g+1/2}$ is the group width for group $g$. We note from Eq. (1.149) that the angular flux at the interface energy between two groups is defined by the solution in the higher energy group. Since the continuous-slowing-down operator causes particles to lose energy, this choice is consistent with the direction of particle flow in energy. Under the assumptions of a Legendre expansion of degree $L$ and zero energy slopes for the multigroup scattering kernel, and a constant dependence of the restricted stopping power within each group, the discretized 1-D transport equation takes the following form for group $g$:

$$ \mu\frac{\partial\psi_{a,g}}{\partial x} + \Sigma_{t,g}\psi_{a,g} - \frac{1}{\Delta E_{g}}\left[\beta_{r,g-1}\left(\psi_{a,g-1}-\psi_{e,g-1}\right) - \beta_{r,g}\left(\psi_{a,g}-\psi_{e,g}\right)\right] = \sum_{g'=1}^{G}\sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^{m}_{g'\to g}\,\frac{\Delta E_{g'}}{\Delta E_{g}}\,\phi_{a,m,g'}\,P_{m}(\mu) + Q_{a,g}, \tag{1.152a} $$

$$ \mu\frac{\partial\psi_{e,g}}{\partial x} + \Sigma_{t,g}\psi_{e,g} - \frac{3}{\Delta E_{g}}\left[\beta_{r,g-1}\left(\psi_{a,g-1}-\psi_{e,g-1}\right) - 2\beta_{r,g}\psi_{a,g} + \beta_{r,g}\left(\psi_{a,g}-\psi_{e,g}\right)\right] = Q_{e,g}. \tag{1.152b} $$

Here, $\phi_{a,m,g}$ denotes the group average Legendre flux moment of degree $m$ for group $g$, $Q_{a,g}$ and $Q_{e,g}$ respectively denote the group average source and group source energy slope for group $g$, and $\Sigma^{m}_{g'\to g}$ is the standard multigroup-Legendre coefficient of degree $m$ for a transfer from group $g'$ to group $g$:

$$ \Sigma^{m}_{g'\to g} = \frac{1}{\Delta E_{g'}}\int_{E_{g+1/2}}^{E_{g-1/2}}\int_{E_{g'+1/2}}^{E_{g'-1/2}}\int_{-1}^{+1} \Sigma_{s}\left(E'\to E,\mu_{0}\right) P_{m}(\mu_{0})\, d\mu_{0}\, dE'\, dE. \tag{1.153} $$

Equations (1.152a) and (1.152b) are solved via source iteration, but the continuous-slowing-down operator is inverted during the sweep. In particular, the source iteration process takes the form

$$ \mu\frac{\partial\psi^{(\ell+1)}_{a,g}}{\partial x} + \Sigma_{t,g}\psi^{(\ell+1)}_{a,g} - \frac{1}{\Delta E_{g}}\left[\beta_{r,g-1}\left(\psi^{(\ell+1)}_{a,g-1}-\psi^{(\ell+1)}_{e,g-1}\right) - \beta_{r,g}\left(\psi^{(\ell+1)}_{a,g}-\psi^{(\ell+1)}_{e,g}\right)\right] = Q_{a,g} + \sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^{m}_{g\to g}\,\phi^{(\ell)}_{a,m,g}\,P_{m}(\mu) + \sum_{g'=1}^{g-1}\sum_{m=0}^{L}\frac{2m+1}{2}\,\Sigma^{m}_{g'\to g}\,\frac{\Delta E_{g'}}{\Delta E_{g}}\,\phi^{(\ell+1)}_{a,m,g'}\,P_{m}(\mu), \tag{1.154a} $$

$$ \mu\frac{\partial\psi^{(\ell+1)}_{e,g}}{\partial x} + \Sigma_{t,g}\psi^{(\ell+1)}_{e,g} - \frac{3}{\Delta E_{g}}\left[\beta_{r,g-1}\left(\psi^{(\ell+1)}_{a,g-1}-\psi^{(\ell+1)}_{e,g-1}\right) - 2\beta_{r,g}\psi^{(\ell+1)}_{a,g} + \beta_{r,g}\left(\psi^{(\ell+1)}_{a,g}-\psi^{(\ell+1)}_{e,g}\right)\right] = Q_{e,g}, \tag{1.154b} $$

where $\ell$ is the iteration index. If the full linear-discontinuous treatment for the scattering source were used, one would have nonzero scattering source energy slopes. In this case, one would iterate on both the scattering source averages and energy slopes. If the scattering ratio for a given group is sufficiently large to require convergence acceleration, the scattering source energy slopes could require acceleration in addition to the scattering source averages. Accelerating the source energy slopes significantly complicates the acceleration process. (We will address this point again, regarding the convergence acceleration of the temporal scattering source slopes associated with a linear-discontinuous discretization of the time derivative.)
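The group average and energy slope of Eqs. (1.150) and (1.151) can be checked numerically. The sketch below is illustrative only: the helper name `ld_moments`, the test flux, and the group bounds are hypothetical, and $E_g$ is taken to be the group midpoint, which is the choice that makes Eqs. (1.149) through (1.151) mutually consistent for a linear flux.

```python
import numpy as np

def ld_moments(psi, E_lo, E_hi, order=8):
    """Hypothetical helper: group average and energy slope, Eqs. (1.150)-(1.151).

    E_lo and E_hi play the roles of E_{g+1/2} and E_{g-1/2}.
    E_g is assumed to be the group midpoint.
    """
    dE = E_hi - E_lo
    E_mid = 0.5 * (E_hi + E_lo)
    x, w = np.polynomial.legendre.leggauss(order)  # nodes and weights on [-1, 1]
    E = E_mid + 0.5 * dE * x                       # map nodes to [E_lo, E_hi]
    wE = 0.5 * dE * w
    psi_a = np.sum(wE * psi(E)) / dE                          # Eq. (1.150)
    psi_e = 6.0 / dE**2 * np.sum(wE * (E - E_mid) * psi(E))   # Eq. (1.151)
    return psi_a, psi_e

def psi(E):
    """Hypothetical linear test flux."""
    return 3.0 + 0.5 * E

E_lo, E_hi = 2.0, 6.0
psi_a, psi_e = ld_moments(psi, E_lo, E_hi)

# Edge values of the representation in Eq. (1.149).
psi_low_edge = psi_a - psi_e    # value at E_{g+1/2}
psi_high_edge = psi_a + psi_e   # limit at E_{g-1/2}
```

For a flux that is exactly linear in energy, the representation reproduces the flux at both group edges. The low-edge value `psi_a - psi_e` is the quantity that the next lower energy group receives through the slowing-down term in Eqs. (1.152a) and (1.152b).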
When combining the linear-discontinuous energy approximation with discontinuous finite-element approximations in space, one generally assumes a single energy slope per spatial cell rather than a separate energy slope for each spatial unknown within a cell. Although the latter assumption is more accurate, it can be excessively expensive. For example, if a trilinear-discontinuous spatial approximation is used for a 3-D rectangular cell, one gets eight spatial unknowns per cell per angle per group. If a separate energy slope is used for each spatial unknown, the number of unknowns per cell per angle per group increases to 16; but if only a single energy slope is used for the entire spatial cell, the number of unknowns only increases to nine.

The linear-discontinuous discretization of the continuous-slowing-down operator represents a major improvement relative to step and diamond-difference discretizations [47]. In particular, the linear-discontinuous method is much less numerically diffusive than the step method and much less oscillatory than the diamond method.

Nodal methods can also be applied to the continuous-slowing-down operator. One would expect such methods to be comparable to discontinuous finite-element methods. However, because they have rarely been used in practice, we will not explicitly discuss nodal methods here.

1.7 Advances in Time Discretizations

Next, we discuss advanced discretization techniques for the time derivative. As is the case for most of the derivative terms in the Boltzmann equation, the time derivative has been treated with the discontinuous finite-element method [91, 98] and the nodal method [59].
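The linear-discontinuous method assumes an angular flux dependence of the following form over the $k$th time step:

$$ \psi(t) = \psi^{k}_{a} + \psi^{k}_{t}\,\frac{2}{\Delta t^{k}}\left(t - t^{k}\right), \quad t^{k-1/2} < t \le t^{k+1/2}, \tag{1.155} $$

where $\psi^{k}_{a}$ is the average flux:

$$ \psi^{k}_{a} = \frac{1}{\Delta t^{k}}\int_{t^{k-1/2}}^{t^{k+1/2}} \psi(t)\, dt, \tag{1.156} $$

$\psi^{k}_{t}$ is the temporal slope:

$$ \psi^{k}_{t} = \frac{6}{\left(\Delta t^{k}\right)^{2}}\int_{t^{k-1/2}}^{t^{k+1/2}} \left(t - t^{k}\right)\psi(t)\, dt, \tag{1.157} $$

$t^{k} = \left(t^{k+1/2} + t^{k-1/2}\right)/2$ is the midpoint of the time step, and $\Delta t^{k} = t^{k+1/2} - t^{k-1/2}$ is the width of the time step. We note from Eq. (1.155) that the angular flux at the interface between two time steps is defined by the solution from the previous time step. For simplicity, let us consider the 1-D time-dependent slab-geometry monoenergetic transport equation with isotropic scattering and an isotropic inhomogeneous source:

$$ \frac{1}{v}\frac{\partial\psi}{\partial t} + \mu\frac{\partial\psi}{\partial x} + \Sigma_{t}\psi = \frac{\Sigma_{s}}{2}\phi + \frac{Q}{2}. \tag{1.158} $$

We obtain the following equations after applying the linear-discontinuous finite-element approximation in time to Eq. (1.158):

$$ \frac{1}{v\Delta t^{k}}\left[\psi^{k}_{a} + \psi^{k}_{t} - \left(\psi^{k-1}_{a} + \psi^{k-1}_{t}\right)\right] + \mu\frac{\partial\psi^{k}_{a}}{\partial x} + \Sigma_{t}\psi^{k}_{a} = \frac{\Sigma_{s}}{2}\phi^{k}_{a} + Q^{k}_{a}, \tag{1.159a} $$

$$ \frac{3}{v\Delta t^{k}}\left[\psi^{k}_{a} + \psi^{k}_{t} - 2\psi^{k}_{a} + \psi^{k-1}_{a} + \psi^{k-1}_{t}\right] + \mu\frac{\partial\psi^{k}_{t}}{\partial x} + \Sigma_{t}\psi^{k}_{t} = \frac{\Sigma_{s}}{2}\phi^{k}_{t} + Q^{k}_{t}, \tag{1.159b} $$

where $Q^{k}_{a}$ and $Q^{k}_{t}$ respectively denote the source temporal average and source temporal slope for time step $k$. Equations (1.159a) and (1.159b) can be simultaneously solved via source iteration:

$$ \frac{1}{v\Delta t^{k}}\left[\psi^{k,(\ell+1)}_{a} + \psi^{k,(\ell+1)}_{t} - \left(\psi^{k-1}_{a} + \psi^{k-1}_{t}\right)\right] + \mu\frac{\partial\psi^{k,(\ell+1)}_{a}}{\partial x} + \Sigma_{t}\psi^{k,(\ell+1)}_{a} = \frac{\Sigma_{s}}{2}\phi^{k,(\ell)}_{a} + Q^{k}_{a}, \tag{1.160a} $$

$$ \frac{3}{v\Delta t^{k}}\left[\psi^{k,(\ell+1)}_{a} + \psi^{k,(\ell+1)}_{t} - 2\psi^{k,(\ell+1)}_{a} + \psi^{k-1}_{a} + \psi^{k-1}_{t}\right] + \mu\frac{\partial\psi^{k,(\ell+1)}_{t}}{\partial x} + \Sigma_{t}\psi^{k,(\ell+1)}_{t} = \frac{\Sigma_{s}}{2}\phi^{k,(\ell)}_{t} + Q^{k}_{t}, \tag{1.160b} $$

where $\ell$ is the iteration index. We note that one must iterate on both the temporal averages and the temporal slopes of the scattering source. If the scattering ratio is close to unity, the source iterations for both the averages and slopes must be accelerated. Deriving fully consistent diffusion acceleration equations from Eqs.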
(1.160a) and (1.160b) yields a complicated and difficult-to-solve system of coupled diffusion equations. If one uses step or diamond differencing in time, the diffusion-synthetic acceleration algorithm requires the solution of only one diffusion equation and is essentially identical to the algorithm for steady-state calculations. An approximate method has been developed in which the fully coupled system of diffusion acceleration equations associated with the linear-discontinuous temporal discretization scheme is replaced by two independent diffusion equations [98]. This approximate method appears to work quite well, resulting in a cost increase for performing DSA of about a factor of 2 relative to that associated with traditional temporal differencing schemes. While the linear-discontinuous finite-element approximation in time is more accurate than the step scheme and more robust than the diamond scheme, it is also more expensive.

As with the continuous-slowing-down operator, when one combines the linear-discontinuous temporal approximation with discontinuous finite-element approximations in space, one generally assumes a single temporal slope per spatial cell rather than a separate temporal slope for each spatial unknown within a cell. Although the latter assumption is more accurate, it can be excessively expensive. As we noted before, if a trilinear-discontinuous spatial approximation is used for a 3-D rectangular cell, one gets eight spatial unknowns per cell per angle per group. If a separate temporal slope is used for each spatial unknown, the number of unknowns per cell per angle per group increases to 16; but if only a single temporal slope for the entire spatial cell is used, the number of unknowns only increases to nine.

In analogy with the derivation of Eqs. (1.70) and (1.72), we apply the constant-constant nodal method to Eq.
(1.158) to obtain the following equations for $\mu_{n} > 0$ (assuming for simplicity $Q = 0$):

$$ \psi_{n,x,i}(t) = \psi_{n,x,i}\left(t^{k-1/2}\right)\exp\left[-\Sigma_{t} v\left(t - t^{k-1/2}\right)\right] + \frac{q_{n,x,i,k}}{\Sigma_{t}}\left\{1 - \exp\left[-\Sigma_{t} v\left(t - t^{k-1/2}\right)\right]\right\}, \tag{1.161} $$

where

$$ q_{n,x,i,k} = \frac{\Sigma_{s}}{2\Delta t^{k}}\int_{t^{k-1/2}}^{t^{k+1/2}} \phi_{x,i}(t)\, dt - \frac{\mu_{n}}{\Delta x_{i}}\left[\psi_{n,t,k}\left(x_{i+1/2}\right) - \psi_{n,t,k}\left(x_{i-1/2}\right)\right], \tag{1.162} $$

and

$$ \psi_{n,t,k}(x) = \psi_{n,t,k}\left(x_{i-1/2}\right)\exp\left[-\frac{\Sigma_{t}\left(x - x_{i-1/2}\right)}{\mu_{n}}\right] + \frac{q_{n,t,k,i}}{\Sigma_{t}}\left\{1 - \exp\left[-\frac{\Sigma_{t}\left(x - x_{i-1/2}\right)}{\mu_{n}}\right]\right\}, \tag{1.163} $$

where

$$ q_{n,t,k,i} = \frac{\Sigma_{s}}{2\Delta x_{i}}\int_{x_{i-1/2}}^{x_{i+1/2}} \phi_{t,k}(x)\, dx - \frac{1}{v\Delta t^{k}}\left[\psi_{n,x,i}\left(t^{k+1/2}\right) - \psi_{n,x,i}\left(t^{k-1/2}\right)\right]. \tag{1.164} $$

The considerations for applying nodal methods in time are analogous to those for applying discontinuous finite-element methods. For instance, with a linear nodal method, one must be concerned with accelerating the temporal slopes of the scattering source. With a linear nodal method in both time and space [59], there would be only one temporal slope per space cell, but if one were to apply a nodal method in time in conjunction with another type of spatial discretization, multiple temporal slopes per space cell could arise.

The practical need for advanced temporal discretization schemes relative to traditional discretization schemes is not as strong for the time variable as for the space and energy variables. This is due to the relative ease with which adaptive techniques can be applied to time integration, making it feasible to avoid the regimes in which simple discretization schemes perform poorly. In any event, the transport community has little experience with advanced temporal discretization schemes, and little research has been performed in this area.

1.8 Advances in Iteration Acceleration

Next, we discuss major advances in iteration acceleration.
A plethora of $S_N$ iterative acceleration techniques have been developed over the years (we refer the reader to the recent comprehensive review by Adams and Larsen [110]), but there is little doubt that the practical application of diffusion-synthetic acceleration (DSA) to source iteration has been the most significant advance in iteration acceleration techniques in the history of discrete-ordinates methods. Early attempts to utilize DSA succeeded only for problems with optically thin spatial grids [16]. Later, the subtleties concerning how these methods should be discretized became understood. To motivate the DSA and DSA-like methods, we first discuss the Fourier analysis technique, which has become an invaluable theoretical tool for predicting the convergence rate of iterative solutions of continuous and discrete problems.

1.8.1 Fourier Analysis

The Source Iteration (SI) method is described in Section 1.2; see Eqs. (1.31) and (1.32). For a model infinite, homogeneous-medium transport problem with no discretization,

$$ \mu\frac{\partial\psi}{\partial x}(x,\mu) + \Sigma_{t}\psi(x,\mu) = \frac{1}{2}\left[\Sigma_{s}\phi(x) + Q(x)\right], \tag{1.165a} $$

$$ \phi(x) = \int_{-1}^{1} \psi\left(x,\mu'\right) d\mu', \tag{1.165b} $$

the SI process begins with an initial guess $\phi^{(0)}(x)$ of the scalar flux, and then for $\ell \ge 1$, the $\ell$th source iteration is defined by

$$ \mu\frac{\partial\psi^{(\ell-1/2)}}{\partial x}(x,\mu) + \Sigma_{t}\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\left[\Sigma_{s}\phi^{(\ell-1)}(x) + Q(x)\right], \tag{1.166a} $$

$$ \phi^{(\ell)}(x) = \int_{-1}^{1} \psi^{(\ell-1/2)}\left(x,\mu'\right) d\mu'. \tag{1.166b} $$

We now write an exact transport equation for the scalar and angular flux errors:

$$ \delta\phi^{(\ell-1)}(x) = \phi(x) - \phi^{(\ell-1)}(x), \tag{1.167a} $$

$$ \delta\psi^{(\ell-1/2)}(x,\mu) = \psi(x,\mu) - \psi^{(\ell-1/2)}(x,\mu). \tag{1.167b} $$

To do this, we subtract Eqs. (1.166) from Eqs. (1.165), obtaining

$$ \mu\frac{\partial\,\delta\psi^{(\ell-1/2)}}{\partial x}(x,\mu) + \Sigma_{t}\,\delta\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\Sigma_{s}\,\delta\phi^{(\ell-1)}(x), \tag{1.168a} $$

$$ \delta\phi^{(\ell)}(x) = \int_{-1}^{1} \delta\psi^{(\ell-1/2)}\left(x,\mu'\right) d\mu', \tag{1.168b} $$

which define $\delta\phi^{(\ell)}(x)$ in terms of $\delta\phi^{(\ell-1)}(x)$. Clearly, the rate of convergence of Eq. (1.166) is equal to the rate at which $\delta\phi^{(\ell)}(x) \to 0$.
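To calculate this rate, we introduce the Fourier transforms

$$ \delta\phi^{(\ell-1)}(x) = \int_{-\infty}^{\infty} a^{(\ell-1)}(\lambda)\, e^{i\Sigma_{t}\lambda x}\, d\lambda, \tag{1.169a} $$

$$ \delta\psi^{(\ell-1/2)}(x,\mu) = \int_{-\infty}^{\infty} b^{(\ell-1/2)}(\lambda,\mu)\, e^{i\Sigma_{t}\lambda x}\, d\lambda, \tag{1.169b} $$

into Eqs. (1.168) to obtain

$$ (i\lambda\mu + 1)\, b^{(\ell-1/2)}(\lambda,\mu) = \frac{c}{2}\, a^{(\ell-1)}(\lambda), \tag{1.170a} $$

$$ a^{(\ell)}(\lambda) = \int_{-1}^{1} b^{(\ell-1/2)}\left(\lambda,\mu'\right) d\mu', \tag{1.170b} $$

where $c = \Sigma_{s}/\Sigma_{t}$ is the scattering ratio. Equation (1.170a) gives

$$ b^{(\ell-1/2)}(\lambda,\mu) = \frac{c}{2}\,\frac{1}{1 + i\lambda\mu}\, a^{(\ell-1)}(\lambda), \tag{1.171} $$

and then Eq. (1.170b) gives

$$ a^{(\ell)}(\lambda) = \left[\frac{c}{2}\int_{-1}^{1}\frac{d\mu'}{1 + i\lambda\mu'}\right] a^{(\ell-1)}(\lambda) = \omega(\lambda)\, a^{(\ell-1)}(\lambda) = \cdots = \left[\omega(\lambda)\right]^{\ell} a^{(0)}(\lambda), \tag{1.172a} $$

where

$$ \omega(\lambda) = \frac{c}{2}\int_{-1}^{1}\frac{d\mu'}{1 + i\lambda\mu'} = \frac{c}{\lambda}\tan^{-1}(\lambda) \tag{1.172b} $$

is the iteration eigenvalue. Equations (1.172) and (1.169a) yield

$$ \delta\phi^{(\ell)}(x) = \int_{-\infty}^{\infty} \omega^{\ell}(\lambda)\, a^{(0)}(\lambda)\, e^{i\Sigma_{t}\lambda x}\, d\lambda. \tag{1.173} $$

Thus, the rate at which the Fourier mode corresponding to wave number $\lambda$ limits to zero is determined by $\omega(\lambda)$. If $|\omega(\lambda)| \ll 1$, the corresponding mode converges rapidly. If $|\omega(\lambda)| < 1$ and $|\omega(\lambda)| \approx 1$, the mode converges slowly. If $|\omega(\lambda)| \ge 1$, the mode does not converge. The overall rate of convergence is determined by the most slowly converging error mode, i.e., the largest value of $|\omega(\lambda)|$ over the Fourier variable $\lambda$. For large $\ell$, Eq. (1.173) implies

$$ \left\|\delta\phi^{(\ell)}(x)\right\| \approx \rho^{\ell} A, \tag{1.174} $$

where $A$ is a constant, and

$$ \rho = \sup_{-\infty<\lambda<\infty} |\omega(\lambda)| = \lim_{\ell\to\infty} \frac{\left\|\delta\phi^{(\ell)}(x)\right\|}{\left\|\delta\phi^{(\ell-1)}(x)\right\|} \tag{1.175} $$

is the spectral radius (see Eqs. (1.33)).

From Eqs. (1.175) and (1.172b), the spectral radius of the continuous SI scheme is

$$ \rho = \sup_{-\infty<\lambda<\infty} \left|\frac{c}{\lambda}\tan^{-1}\lambda\right| = c, \tag{1.176} $$

which is attained for $\lambda \approx 0$. Thus, the $\lambda \approx 0$ Fourier error modes are the most slowly converging modes in the SI scheme. From Eq. (1.171), we have for $\lambda \approx 0$:

$$ b^{(\ell+1/2)}(\lambda,\mu) = \frac{c}{2}\,\frac{1}{1 + i\lambda\mu}\,\omega^{\ell}(\lambda)\, a^{(0)}(\lambda) \approx \frac{c}{2}\,(1 - i\lambda\mu)\,\omega^{\ell}(\lambda)\, a^{(0)}(\lambda), $$

and thus the most slowly converging modes depend nearly linearly on $\mu$.

The Fourier analysis can be extended to multidimensional problems, to fully discrete (in space, angle, and energy) $S_N$ problems for infinite homogeneous (or spatially periodic) media on Cartesian grids, and to iteration strategies beyond Source Iteration [110].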
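The iteration eigenvalue of Eq. (1.172b) is simple enough to evaluate directly. The following sketch is illustrative: the function name `omega_si`, the wave-number grid, and the value of $c$ are arbitrary choices made here. It estimates the spectral radius of Eq. (1.176) on a grid and cross-checks the closed form against a Gauss-Legendre evaluation of the angular integral.

```python
import numpy as np

def omega_si(lam, c):
    """Source-iteration eigenvalue, Eq. (1.172b): (c / lam) * arctan(lam)."""
    lam = np.asarray(lam, dtype=float)
    return c * np.arctan(lam) / lam

c = 0.9                                   # scattering ratio (arbitrary example)
lam = np.linspace(1e-6, 50.0, 200001)     # wave numbers; lam = 0 is excluded
rho_si = np.max(np.abs(omega_si(lam, c))) # grid estimate of Eq. (1.175)

# Cross-check at one wave number using the integral form of Eq. (1.172b).
mu, w = np.polynomial.legendre.leggauss(64)
lam0 = 2.0
omega_quad = (0.5 * c * np.sum(w / (1.0 + 1j * lam0 * mu))).real
omega_exact = float(omega_si(lam0, c))
```

The maximum sits at the smallest wave number on the grid and approaches $c$, in line with Eq. (1.176), while the eigenvalue decays for rapidly varying modes.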
This analysis makes it possible to predict, with relative ease and great accuracy, the rate of convergence of transport iteration schemes before implementing them in test codes. It also provides a theoretical foundation for iteration schemes, making it possible to understand how a scheme works (if it works), or why it fails (if it fails). In the latter case, the Fourier analysis has often provided clues that have enabled researchers to modify schemes to perform more effectively. Overall, the Fourier analysis has become an invaluable tool in the development of advanced iteration strategies for particle transport problems.

For the SI scheme, the Fourier analysis predicts that the spectral radius $\rho = c$, even for fully discrete $S_N$ codes. The accuracy of this prediction has been observed in many calculations [110]. The Fourier analysis also predicts, even for discrete problems, that the most slowly converging modes correspond to $\lambda \approx 0$, and that for such modes, the $\mu$-dependence is nearly linear. This result provides a motivation for the DSA method, discussed next.

1.8.2 Diffusion-Synthetic Acceleration

The DSA method is based on the use of diffusion as an approximation to transport, for the purpose of calculating the iterative error after source iteration. The algorithm can be expressed as follows. We again suppose that the problem to be solved is given by Eqs. (1.165). The $\ell$th DSA iteration begins with a SI sweep (Eqs. (1.166)):

$$ \mu\frac{\partial\psi^{(\ell-1/2)}}{\partial x}(x,\mu) + \Sigma_{t}\psi^{(\ell-1/2)}(x,\mu) = \frac{1}{2}\left[\Sigma_{s}\phi^{(\ell-1)}(x) + Q(x)\right], \tag{1.177a} $$

$$ \phi^{(\ell-1/2)}(x) = \int_{-1}^{1} \psi^{(\ell-1/2)}\left(x,\mu'\right) d\mu'. \tag{1.177b} $$

Unlike the SI scheme, the DSA method does not define $\phi^{(\ell)}(x) = \phi^{(\ell-1/2)}(x)$. Instead, Eqs. (1.177) are subtracted from Eqs. (1.165) to obtain the following exact equation for the angular flux error (Eqs.
(1.167)):

$$ \mu\frac{\partial\,\delta\psi^{(\ell-1/2)}}{\partial x}(x,\mu) + \Sigma_{t}\,\delta\psi^{(\ell-1/2)}(x,\mu) - \frac{\Sigma_{s}}{2}\int_{-1}^{1}\delta\psi^{(\ell-1/2)}\left(x,\mu'\right) d\mu' = \frac{\Sigma_{s}}{2}\left[\phi^{(\ell-1/2)}(x) - \phi^{(\ell-1)}(x)\right]. \tag{1.178} $$

This equation for $\delta\psi^{(\ell-1/2)}(x,\mu)$ is exact, but it is just as difficult to solve as the original transport equation for $\psi(x,\mu)$ (Eq. (1.165)). However, a good estimate of the iterative error can be obtained by approximating Eq. (1.178) by its diffusion approximation:

$$ -\frac{d}{dx}\frac{1}{3\Sigma_{t}}\frac{d}{dx}\,\delta\phi^{(\ell-1/2)}(x) + \Sigma_{a}\,\delta\phi^{(\ell-1/2)}(x) = \Sigma_{s}\left[\phi^{(\ell-1/2)}(x) - \phi^{(\ell-1)}(x)\right], \tag{1.179} $$

which is much easier to solve than Eq. (1.178). This approximation is further motivated by the fact that the most slowly converging SI modes are linear in angle; these components are treated accurately in the above diffusion approximation. After solving Eq. (1.179) for $\delta\phi^{(\ell-1/2)}(x)$, the update equation is

$$ \phi^{(\ell)}(x) = \phi^{(\ell-1/2)}(x) + \delta\phi^{(\ell-1/2)}(x). \tag{1.180} $$

The DSA scheme is then defined by Eqs. (1.177) (a transport sweep), Eq. (1.179) (the low-order diffusion equation for the approximate scalar flux correction), and Eq. (1.180) (the update equation). A Fourier analysis can be employed to calculate the spectral radius of the DSA algorithm for the same model infinite homogeneous-medium problem used to analyze source iteration alone. The DSA spectral radius for this case is approximately $0.23c$, which is bounded less than unity for all $0 \le c \le 1$. Furthermore, with one exception that is discussed later, this excellent spectral radius can be achieved in practical calculations. Thus, for the most part, DSA is extremely effective.

Details of the Fourier analysis show why DSA performs well for the model problem. The transport sweep strongly attenuates scalar flux errors that vary rapidly in space, i.e., high-frequency errors. However, this sweep attenuates low-frequency errors that vary slowly in space by only a factor of $c$. Thus, when $c$ is close to unity, the low-frequency errors are essentially unattenuated.
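The DSA spectral radius quoted above can be reproduced with a short calculation. The closed form used below is a derivation made here under the model-problem assumptions, not a formula quoted from the text: a Fourier mode with wave number $\lambda$ is inserted into the sweep (Eqs. (1.177)), the diffusion correction (Eq. (1.179), with $\Sigma_{a}/\Sigma_{t} = 1 - c$), and the update (Eq. (1.180)). With $\omega_{SI}(\lambda) = (c/\lambda)\tan^{-1}\lambda$, this gives $\omega_{DSA}(\lambda) = \omega_{SI} - c\,(1-\omega_{SI})/(\lambda^{2}/3 + 1 - c)$. The function names and grid are illustrative choices.

```python
import numpy as np

def omega_si(lam, c):
    """Sweep eigenvalue, Eq. (1.172b)."""
    return c * np.arctan(lam) / lam

def omega_dsa(lam, c):
    """DSA eigenvalue for the continuous model problem (derived for this sketch).

    The sweep leaves an error factor omega_si; the diffusion equation (1.179),
    in Fourier space, removes c * (1 - omega_si) / (lam**2 / 3 + 1 - c).
    """
    w_si = omega_si(lam, c)
    correction = c * (1.0 - w_si) / (lam**2 / 3.0 + 1.0 - c)
    return w_si - correction

c = 1.0                                     # worst case for source iteration
lam = np.linspace(1e-3, 100.0, 400001)      # wave numbers; lam = 0 is excluded
rho_si = np.max(np.abs(omega_si(lam, c)))
rho_dsa = np.max(np.abs(omega_dsa(lam, c)))
lam_worst = lam[np.argmax(np.abs(omega_dsa(lam, c)))]
```

For $c = 1$ the grid maximum is close to the value of about $0.23$ quoted in the text, and it occurs at an intermediate wave number, where neither the sweep nor the diffusion step is at its best.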
In contrast, the diffusion step almost completely attenuates low-frequency errors (because such errors have an angular shape that is primarily linear in $\mu$) but does basically nothing to high-frequency errors. Thus when DSA is applied, the high-frequency errors are strongly attenuated by the sweep and the low-frequency errors are strongly attenuated by the diffusion step. The minimum level of attenuation occurs for intermediate-frequency errors, but this level of attenuation is very high relative to that of the unaccelerated iteration algorithm.

The concept of DSA existed long before 1968 [6], but the synthetic method for discrete problems was originally seen to be unstable for problems in which the cell thickness exceeded roughly a mean free path [11]. Alcouffe [23] made the DSA method practical for the diamond-differenced $S_N$ equations by showing that if the spatial discretization of the diffusion equation was chosen to be consistent with the spatial discretization of the $S_N$ equations, the instability was eliminated, and one obtained an algorithm that was unconditionally effective. Soon afterward, Alcouffe's ideas were extended and generalized to nondiamond schemes [34, 35, 99, 108, 110]. A nonlinear "quasidiffusion" method was also developed [8], which is rapidly convergent but is not strictly an acceleration method, because it produces converged solutions that differ from the discrete $S_N$ solution due to an extra truncation error.

In general, an acceleration scheme is said to be "unconditionally effective" when the accelerated spectral radius is bounded less than unity. Also, an acceleration scheme is said to be "unconditionally efficient" when the computational execution time associated with the accelerated scheme is always much smaller than that of the unaccelerated scheme. While the use of consistent diffusion discretizations makes the DSA method effective, it does not necessarily make it efficient.
This is because the discrete diffusion equations that are consistent with advanced $S_N$ spatial discretization methods can have a nonstandard form that makes them much more expensive to solve than standard discrete diffusion equations. This results in an effective-but-inefficient DSA algorithm [116]. The most modern $S_N$ codes generally use advanced discontinuous finite-element discretization schemes, but they do not use fully consistent diffusion discretizations. Rather, they either use "partially-consistent" diffusion discretizations [64] or they solve the fully consistent discretizations in an approximate manner that involves the solution of a standard discretization of the diffusion equation [61, 67]. Although such schemes generally result in a degradation of the accelerated spectral radius, they are usually much more efficient than fully consistent schemes because of the high cost of solving the fully consistent diffusion equations [116].

The DSA method is not limited to problems with isotropic scattering. When applied to problems with anisotropic scattering, the original DSA algorithm was nearly identical to that for isotropic scattering; all higher-order Legendre angular flux moments (above the scalar flux) were simply left unaccelerated. However, the DSA method becomes ineffective for problems with highly anisotropic forward-peaked scattering because the higher-order Legendre moments become slow to converge and require acceleration. The standard DSA method can be modified to accelerate the current (the $n = 1$ Legendre moment) in addition to the scalar flux (the $n = 0$ Legendre moment) [36]. In 1-D calculations, this improves performance with highly anisotropic scattering, but the modified method nonetheless becomes increasingly ineffective as the anisotropy of the scattering increases [36]. In multidimensional calculations, acceleration of the currents becomes unstable as the anisotropy of the scattering increases [68].
For 1-D calculations, an angular multigrid method has been developed, which is unconditionally effective and efficient with anisotropic scattering [62]. However, this approach has had only limited success in multidimensional calculations [93].

1.8.3 DSA-Like Methods for Outer Iteration Acceleration

The DSA method was originally applicable only to within-group source iterations, but variations on DSA have been developed to accelerate energy-dependent sources. In particular, one can accelerate the neutron upscatter source [69], the fission source in steady-state subcritical or time-dependent super-critical neutronics calculations [73], and the fission-like implicit emission source in radiative transfer calculations [42, 50]. To illustrate, we consider time-dependent radiative transfer calculations. The DSA-like scheme that is used to accelerate the outer iterations is called the linear multifrequency-grey method [42, 50]. In accordance with Eq. (1.43), an accelerated outer iteration can be described as follows. The first step is a standard outer iteration on the emission source:

$$ \boldsymbol{\Omega}\cdot\nabla I^{(\ell-1/2)}(\mathbf{r},\boldsymbol{\Omega},E) + \Sigma_{\tau}(E)\, I^{(\ell-1/2)}(\mathbf{r},\boldsymbol{\Omega},E) - \frac{\Sigma_{s}(E)}{4\pi}\,\phi^{(\ell-1/2)}(\mathbf{r},E) = \frac{1}{4\pi}\left[\nu\chi(E)\, f^{(\ell-1)}(\mathbf{r}) + \xi(\mathbf{r},E)\right], \tag{1.181} $$

where $I$ is the intensity, $\phi$ is the angle-integrated intensity, and $f$ is the total absorption rate:

$$ f(\mathbf{r}) = \int_{0}^{\infty} \Sigma_{a}\left(E'\right)\phi\left(\mathbf{r},E'\right) dE'. \tag{1.182} $$

We note from Eq. (1.181) that the monochromatic scattering sources (corresponding to within-group scattering sources in practice) carry an index of $\ell - 1/2$, and thus are assumed to be converged within each outer iteration. The convergence of these sources within each outer iteration is required for application of the linear multifrequency-grey method. Of course, one can use the DSA method to accelerate the convergence of these sources if desired. The iterative error in the angular intensity at step $\ell - 1/2$ is defined as $\delta I^{(\ell-1/2)} = I - I^{(\ell-1/2)}$, where $I$ is the converged transport solution.
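This error satisfies the following transport equation:

$$ \boldsymbol{\Omega}\cdot\nabla\,\delta I^{(\ell-1/2)}(\mathbf{r},\boldsymbol{\Omega},E) + \Sigma_{\tau}(E)\,\delta I^{(\ell-1/2)}(\mathbf{r},\boldsymbol{\Omega},E) - \frac{1}{4\pi}\Sigma_{s}(E)\,\delta\phi^{(\ell-1/2)}(\mathbf{r},E) - \frac{1}{4\pi}\nu\chi(E)\int_{0}^{\infty}\Sigma_{a}\left(E'\right)\delta\phi^{(\ell-1/2)}\left(\mathbf{r},E'\right) dE' = \frac{1}{4\pi}\nu\chi(E)\left[f^{(\ell-1/2)}(\mathbf{r}) - f^{(\ell-1)}(\mathbf{r})\right], \tag{1.183} $$

which is just as difficult to solve as Eq. (1.181). The DSA algorithm would approximate Eq. (1.183) by its (energy-dependent) diffusion approximation. Instead, in the multifrequency-grey method, Eq. (1.183) is approximated by a simpler monoenergetic or "grey" diffusion equation. The unknown in this approximate equation is the iterative error in the angle-energy-integrated intensity at step $\ell - 1/2$:

$$ \delta\Phi^{(\ell-1/2)}(\mathbf{r}) = \int_{0}^{\infty}\left[\phi\left(\mathbf{r},E'\right) - \phi^{(\ell-1/2)}\left(\mathbf{r},E'\right)\right] dE'. \tag{1.184} $$

The approximate diffusion equation is

$$ -\nabla\cdot\langle D\rangle\nabla\,\delta\Phi^{(\ell-1/2)}(\mathbf{r}) + \left[\tau + (1-\nu)\langle\Sigma_{a}\rangle\right]\delta\Phi^{(\ell-1/2)}(\mathbf{r}) = \nu\left[f^{(\ell-1/2)}(\mathbf{r}) - f^{(\ell-1)}(\mathbf{r})\right], \tag{1.185} $$

where

$$ \langle D\rangle = \frac{1}{3}\int_{0}^{\infty}\frac{\varsigma\left(E'\right)}{\Sigma_{\tau}\left(E'\right)}\, dE', \tag{1.186} $$

$$ \langle\Sigma_{a}\rangle = \int_{0}^{\infty}\varsigma\left(E'\right)\Sigma_{a}\left(E'\right) dE', \tag{1.187} $$

and

$$ \varsigma(E) = \frac{\chi(E)}{\Sigma_{a}(E)+\tau}\left[\int_{0}^{\infty}\frac{\chi\left(E'\right)}{\Sigma_{a}\left(E'\right)+\tau}\, dE'\right]^{-1}. \tag{1.188} $$

The final step in the algorithm is to add the estimate for the absorption rate error to the absorption rate iterate:

$$ f^{(\ell)}(\mathbf{r}) = f^{(\ell-1/2)}(\mathbf{r}) + \langle\Sigma_{a}\rangle\,\delta\Phi^{(\ell-1/2)}(\mathbf{r}). \tag{1.189} $$

A Fourier analysis of the unaccelerated emission source iteration shows that when the spectral radius is near unity, iterative errors that vary slowly in space are poorly attenuated by the transport sweep, while errors that vary rapidly in space are strongly attenuated by the sweep. (Such behavior can always be expected from a sweep, regardless of the particular type of source being iterated upon.) The Fourier analysis also shows that the low-frequency errors take on an angular shape that is linear in the components of the photon direction vector with an energy spectrum that is fixed. The linear angular dependence suggests an energy-dependent diffusion approximation to Eq.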
(1.183), and the specific, i.e., common, energy shape suggests that an energy-dependent diffusion approximation can be reduced without loss of accuracy to a grey diffusion approximation. This is exactly how Eq. (1.185) is derived. In particular, it is first assumed that

$$ \delta I(\mathbf{r},\boldsymbol{\Omega},E) = \frac{1}{4\pi}\left[\delta\phi(\mathbf{r},E) + 3\,\boldsymbol{\Omega}\cdot\delta\mathbf{F}(\mathbf{r},E)\right]. \tag{1.190} $$

A multigroup diffusion approximation is then obtained by substituting Eq. (1.190) into Eq. (1.183), taking angular moments with respect to the weight functions $1$ and $\boldsymbol{\Omega}$, and then eliminating $\delta\mathbf{F}$ from the resulting $P_{1}$ equations, to obtain

$$ -\nabla\cdot D(E)\nabla\,\delta\phi^{(\ell-1/2)}(\mathbf{r},E) + \left[\tau + \Sigma_{a}(E)\right]\delta\phi^{(\ell-1/2)}(\mathbf{r},E) - \nu\chi(E)\int_{0}^{\infty}\Sigma_{a}\left(E'\right)\delta\phi^{(\ell-1/2)}\left(\mathbf{r},E'\right) dE' = \nu\chi(E)\left[f^{(\ell-1/2)}(\mathbf{r}) - f^{(\ell-1)}(\mathbf{r})\right], \tag{1.191} $$

where

$$ D(E) = \frac{1}{3\Sigma_{\tau}(E)}. \tag{1.192} $$

Equation (1.185) is obtained from Eq. (1.191) by first assuming that

$$ \delta\phi^{(\ell-1/2)}(\mathbf{r},E) = \varsigma(E)\,\delta\Phi^{(\ell-1/2)}(\mathbf{r}), \tag{1.193} $$

where $\varsigma(E)$ is a normalized shape function defined by Eq. (1.188). The next step is to substitute Eq. (1.193) into Eq. (1.191) and integrate that equation over all energies. The resulting equation is identical to Eq. (1.185), except for a drift term that arises in systems with space-dependent cross sections. In practice, the drift term is dropped because doing this simplifies the grey equation with no significant degradation in the effectiveness of the acceleration algorithm. In analogy with DSA, the diffusion step in the linear multifrequency-grey method almost completely attenuates low-frequency errors but does not affect high-frequency errors. With one possible exception discussed in Section 1.8.4, the spectral radius is found to be excellent for practical radiative transfer calculations in the stellar regime.

The fission source acceleration method [73] is nearly identical to the linear multifrequency-grey method because, as previously noted, the numerical emission source in radiative transfer has the mathematical form of a fission source.
The neutron thermal upscatter source acceleration method [69] is very similar to the fission source acceleration method and the linear multifrequency-grey method, except that the upscatter operator is of full rank in energy as opposed to the fission and emission operators, which are of rank one in energy. Simultaneous upscatter and fission source acceleration has also been investigated [84]. A full discussion of the requirements imposed on acceleration techniques by a general "scattering" operator of rank greater than one is beyond the scope of this chapter. Suffice it to say that effective acceleration techniques for "scattering" operators of full rank in both energy and angle may not be achievable using a one-group diffusion approximation to estimate low-frequency errors. Rather, "multigrid" or multilevel approximations in angle, in energy, or in both angle and energy may be required.

1.8.4 A Deficiency in Multidimensional DSA and DSA-Like Methods

For many years, Alcouffe's consistent DSA approach appeared to yield an unconditionally effective DSA algorithm. However, Azmy [87] has recently shown that the DSA method becomes increasingly ineffective in heterogeneous multidimensional calculations as the jumps in cross sections across material discontinuities increase in magnitude. It is likely that all DSA-like methods will exhibit this same deficiency. This is not significant for neutronics calculations, but it is very significant for certain classes of radiative transfer calculations. Fortunately, this deficiency has been overcome by recasting DSA as a preconditioner and using it in conjunction with preconditioned Krylov methods to solve the $S_N$ equations. This topic is discussed next.

1.9 Krylov Methods

In the past few years, Krylov methods have had an enormous impact on the manner in which the $S_N$ equations are solved.
Although the first applications of Krylov methods to the $S_N$ equations were made by applied mathematicians in the late 1980s and early 1990s [53, 63], the numerical transport community was initially slow to embrace them. However, during the last several years, Krylov methods have entered the mainstream of $S_N$ solution techniques [74, 75, 79, 80, 95, 103, 115, 124, 126]. Indeed, almost every $S_N$ solution technique developed today can be expected to be based on a preconditioned Krylov method. There are three basic reasons for the mainstream acceptance of Krylov methods within the numerical transport community. First, the recent recognition that DSA and DSA-like methods are not unconditionally effective in multidimensional problems [94] has motivated a search both for modifications to the DSA method that would eliminate this deficiency and for fundamentally new unconditionally effective solution techniques. Second, it was demonstrated early on that an inconsistently discretized DSA method, when recast as a preconditioner for a Krylov method, can produce an unconditionally effective and efficient $S_N$ solution technique [63]. Third, it has been demonstrated that Krylov methods, even with relatively simple preconditioners, can be much more effective than simple source iteration [82].

In this section, we give a basic description of Krylov methods, and then we consider some of the specific issues that arise when applying preconditioned Krylov methods to the $S_N$ equations. This is not intended as a review of all forms of Krylov methods, but rather as a basic introduction to Krylov concepts with a focus on those types of Krylov methods that appear to be most relevant for particle transport calculations.

1.9.1 The Central Theme of Krylov Methods

Let us assume that we want to iteratively solve the following matrix equation:

$$ A\vec{x} = \vec{b}, \tag{1.194} $$

where $A$ is a nonsingular $N \times N$ matrix, $\vec{x}$ is a solution vector of length $N$, and $\vec{b}$ is a source vector of length $N$.
We assume an initial solution guess for Eq. (1.194) of $\vec{x}_0 = 0$. However, if $\vec{x}_0$ is not zero, one can always solve the following equivalent system with an initial zero guess:

$$\mathbf{A}\vec{x}' = \vec{b}', \tag{1.195a}$$

where

$$\vec{x}' = \vec{x} - \vec{x}_0, \tag{1.195b}$$

and

$$\vec{b}' = \vec{b} - \mathbf{A}\vec{x}_0. \tag{1.195c}$$

Fundamental to the central theme of a Krylov method is a Krylov space of dimension $m$, which is defined with respect to the matrix $\mathbf{A}$ and the vector $\vec{b}$. We denote this space by $\mathcal{K}_m(\mathbf{A},\vec{b})$. A basis for this space consists of $m$ vectors formed by applying successive powers of $\mathbf{A}$ to $\vec{b}$:

$$\mathcal{K}_m(\mathbf{A},\vec{b}) = \operatorname{span}\left\{\vec{b},\, \mathbf{A}\vec{b},\, \mathbf{A}^2\vec{b},\, \ldots,\, \mathbf{A}^{m-1}\vec{b}\right\}. \tag{1.196}$$

We refer to these basis vectors as the Krylov vectors. The basic idea of Krylov methods is to approximate the solution to Eq. (1.194) by a linear combination of Krylov vectors. Before we provide more details on how this is done, we first show that the solution of Eq. (1.194) lies within a Krylov space. We define the minimal polynomial of $\mathbf{A}$, denoted by $P_d(\mathbf{A})$, as the polynomial in $\mathbf{A}$ of minimum degree $d$ such that

$$P_d(\mathbf{A}) = \mathbf{I} + a_1\mathbf{A} + \cdots + a_d\mathbf{A}^d = 0. \tag{1.197}$$

If $\mathbf{A}$ is nonsingular, then zero is not a root of its minimal polynomial. Thus, the minimal polynomial can be scaled so that its constant coefficient is $a_0 = 1$, as has been done in Eq. (1.197). Equation (1.197) implies

$$\mathbf{I} = -\left(a_1\mathbf{I} + \cdots + a_d\mathbf{A}^{d-1}\right)\mathbf{A}. \tag{1.198}$$

Multiplying Eq. (1.198) from the right by $\mathbf{A}^{-1}$ gives

$$\mathbf{A}^{-1} = -\sum_{j=0}^{d-1} a_{j+1}\mathbf{A}^{j}. \tag{1.199}$$

Thus, the solution to Eq. (1.194) can be written as

$$\vec{x} = \mathbf{A}^{-1}\vec{b} = -\sum_{j=0}^{d-1} a_{j+1}\mathbf{A}^{j}\vec{b}, \tag{1.200}$$

which, by comparison with Eq. (1.196), is an element of $\mathcal{K}_d(\mathbf{A},\vec{b})$. Thus, one is motivated to seek an approximation to $\vec{x}$ from the Krylov space $\mathcal{K}_n(\mathbf{A},\vec{b})$, where $n \le d$.

Although Krylov methods are generally quite similar in spirit, they can be quite different in detail. Recognizing this fact, we nonetheless describe a “typical” Krylov method as follows. The $m$th solution iterate, $\vec{x}^{(m)}$, is an element of $\mathcal{K}_m(\mathbf{A},\vec{b})$. The first step in the algorithm is to build an orthogonal basis for $\mathcal{K}_m(\mathbf{A},\vec{b})$.
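As a small numerical check of Eq. (1.200), the following sketch builds the Krylov vectors for an arbitrary $3 \times 3$ test matrix and recovers the solution from them. It is only an illustration under stated assumptions: the matrix and source vector are made up, and the characteristic polynomial returned by NumPy's `np.poly` is used in place of the minimal polynomial (the two coincide here because this symmetric tridiagonal matrix has distinct eigenvalues).

```python
import numpy as np

# Small nonsingular test matrix and source vector (arbitrary illustrative values).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
n = A.shape[0]

# np.poly(A) returns c with  A^n + c[1] A^(n-1) + ... + c[n] I = 0.
c = np.poly(A)

# Rescale so that the constant coefficient is one, as in Eq. (1.197):
#   I + a[1] A + ... + a[n] A^n = 0,  where a[j] multiplies A^j.
a = c[::-1] / c[-1]

# Krylov vectors b, A b, ..., A^(n-1) b stored as the columns of K.
K = np.empty((n, n))
K[:, 0] = b
for j in range(1, n):
    K[:, j] = A @ K[:, j - 1]

# Eq. (1.200): x = -sum_{j=0}^{n-1} a_{j+1} A^j b.
x_krylov = -K @ a[1:]
x_direct = np.linalg.solve(A, b)

print(np.allclose(x_krylov, x_direct))
```

Forming explicit powers of the matrix in this way is useful only for demonstration; practical Krylov methods work with an orthogonalized basis instead.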
Orthogonalization of the Krylov vectors is necessary because these vectors become less linearly independent with increasing $m$. This follows from the fact that $\mathbf{A}^j\vec{b}$ approaches the direction of the fundamental eigenvector (the eigenvector associated with the eigenvalue of largest magnitude) as $j$ increases. Different methods achieve orthogonality in different ways, and in some cases, the vectors are orthogonal with respect to some particular inner product. For illustrative purposes we will simply assume that the vectors are orthogonal with respect to the standard dot product. For reasons that will be explained later, the orthogonal basis vectors are also normalized. The $m$th solution iterate is expressed as a linear combination of the orthonormal basis vectors:

$$\vec{x}^{(m)} = \mathbf{V}_m\vec{y}_m, \tag{1.201}$$

where $\mathbf{V}_m$ is an $N \times m$ matrix whose $j$th column consists of the $j$th orthonormal basis vector, and $\vec{y}_m$ is the vector of expansion coefficients of length $m$. The expansion coefficients are uniquely determined via a residual orthogonality condition. More specifically, an $m$-dimensional weighting space of vectors is first chosen,

$$\mathcal{W}_m = \operatorname{span}\left\{\vec{w}_1,\, \vec{w}_2,\, \ldots,\, \vec{w}_m\right\}, \tag{1.202}$$

and then expansion coefficients are chosen so that the residual vector associated with $\vec{x}^{(m)}$,

$$\vec{r}_m = \vec{b} - \mathbf{A}\vec{x}^{(m)} = \vec{b} - \mathbf{A}\mathbf{V}_m\vec{y}_m, \tag{1.203}$$

is orthogonal to the weighting space, i.e.,

$$\mathbf{W}_m^T\vec{r}_m = 0, \tag{1.204}$$

where $\mathbf{W}_m$ is an $N \times m$ matrix whose $j$th column consists of the $j$th weighting vector. Substituting from Eq. (1.203) into Eq. (1.204), we obtain the following $m \times m$ matrix equation for the coefficient vector:

$$\mathbf{W}_m^T\mathbf{A}\mathbf{V}_m\vec{y}_m = \mathbf{W}_m^T\vec{b}. \tag{1.205}$$

Different Krylov methods arise from different choices of the weighting space. For instance, if $\mathbf{A}$ is positive-definite, $\mathcal{W}_m = \mathcal{K}_m(\mathbf{A},\vec{b})$ is a good choice. This results in a classic Ritz–Galerkin approximation because the weighting space and the approximate solution space are identical:

$$\mathbf{V}_m^T\mathbf{A}\mathbf{V}_m\vec{y}_m = \mathbf{V}_m^T\vec{b}. \tag{1.206}$$

The $A$-norm of the error, defined by

$$\left\|\vec{x} - \vec{x}^{(m)}\right\|_A = \left(\vec{x} - \vec{x}^{(m)}\right)^T\mathbf{A}\left(\vec{x} - \vec{x}^{(m)}\right), \tag{1.207}$$

is minimized under this approximation. The quantity defined by Eq. (1.207) is always minimized, but this is not a true norm unless $\mathbf{A}$ is positive-definite. The conjugate-gradient (CG) method [81] uses the Ritz–Galerkin approximation.

If $\mathbf{A}$ is not positive-definite, a good choice for the weighting space is $\mathbf{A}\mathcal{K}_m(\mathbf{A},\vec{b})$, i.e., the space spanned by $\mathbf{A}$ times the Krylov vectors. In this case, Eq. (1.205) becomes

$$(\mathbf{A}\mathbf{V}_m)^T(\mathbf{A}\mathbf{V}_m)\vec{y}_m = (\mathbf{A}\mathbf{V}_m)^T\vec{b}. \tag{1.208}$$

Equation (1.208) is easily recognized as the least-squares approximation to the overdetermined linear system

$$\mathbf{A}(\mathbf{V}_m\vec{y}_m) = \vec{b}. \tag{1.209}$$

Thus, the residual associated with this approximation is minimized with respect to the $L_2$ norm, defined as

$$\left\|\vec{r}_m\right\|_{L_2} = \sqrt{\vec{r}_m^{\,T}\vec{r}_m}. \tag{1.210}$$

The generalized minimum-residual (GMRES) and minimum-residual (MINRES) methods [81] use the least-squares approximation.

Weighting spaces other than those discussed above can be used, but when this is done, neither the error nor the residual is minimized in any rigorous sense. However, there are advantages to using other weighting spaces, relating to the amount of computational work and the amount of data storage required to solve for $\vec{x}^{(m)}$.

In general, Eq. (1.205) is not actually solved to obtain $\vec{y}_m$. For instance, in the GMRES method, an equivalent matrix equation based on an Arnoldi decomposition of $\mathbf{A}$ is used. The details of the process used to obtain $\vec{x}^{(m)}$ are not important for our purposes, but the amount of computational work and the amount of data storage required to obtain $\vec{x}^{(m)}$ are important.
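The two weighting-space choices in Eqs. (1.206) and (1.208) can be compared directly on a small dense system. The sketch below is a minimal illustration, assuming a randomly generated symmetric positive-definite matrix and a QR factorization to orthonormalize the Krylov vectors; it solves Eq. (1.205) explicitly, which, as noted above, production Krylov solvers do not do. The helper name `projected_solution` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 8, 4

# SPD test matrix, so that the Ritz-Galerkin choice is appropriate.
G = rng.standard_normal((N, N))
A = G @ G.T + N * np.eye(N)
b = rng.standard_normal(N)

# Krylov vectors b, A b, ..., A^(m-1) b, orthonormalized by a QR factorization.
K = np.empty((N, m))
K[:, 0] = b
for j in range(1, m):
    K[:, j] = A @ K[:, j - 1]
V, _ = np.linalg.qr(K)          # the columns of V span K_m(A, b)

def projected_solution(W):
    """Solve W^T A V y = W^T b, Eq. (1.205), and return x = V y, Eq. (1.201)."""
    y = np.linalg.solve(W.T @ A @ V, W.T @ b)
    return V @ y

x_galerkin = projected_solution(V)       # weighting space K_m(A, b),   Eq. (1.206)
x_minres = projected_solution(A @ V)     # weighting space A K_m(A, b), Eq. (1.208)

r_galerkin = b - A @ x_galerkin
r_minres = b - A @ x_minres

print("residual norms:", np.linalg.norm(r_galerkin), np.linalg.norm(r_minres))
```

Each residual is orthogonal to its own weighting space, and the least-squares choice gives the smaller residual norm of the two, while the Ritz–Galerkin choice gives the smaller error in the sense of Eq. (1.207).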
The storage costs associated with the Krylov vectors can be prohibitive if one is solving very large linear systems for which many iterations are required to achieve convergence. This is not a difficulty if $\mathbf{A}$ is symmetric positive-definite and the CG method is used, because $\vec{x}^{(m)}$ can be calculated using a three-term recursion formula [81]. As a result, one need only save the previous solution iterate and one other vector of similar length to compute a new solution iterate. In this case, the required data storage does not increase with the number of iterations as it otherwise does. This is a great advantage associated with the CG method.

If $\mathbf{A}$ is not symmetric positive-definite, one can avoid the storage issue by choosing the weighting space such that a recursion formula can be used to compute $\vec{x}^{(m)}$. However, as previously noted, this approach generally fails to minimize the error or the residual at each iteration step, which can result in erratic convergence properties. The biconjugate-gradient (BCG) and quasi-minimum-residual (QMR) methods [81] use this type of approach.

If a recursion formula cannot be used, a restart strategy must generally be used. Under such a strategy, one chooses a maximum number of Krylov iterations to perform within each “stage” before restarting the Krylov process with the last iterate from the previous stage serving as the initial solution guess for the next stage. The difficulty with this approach is that convergence of the restarted process is not guaranteed for general matrices. However, the restarted GMRES method is guaranteed to converge if $\mathbf{A}$ is positive-definite, i.e., if $\vec{x}^T\mathbf{A}\vec{x} > 0$ for all $\vec{x} \neq 0$, or equivalently, if the eigenvalues of $\mathbf{A} + \mathbf{A}^T$ are all positive. As we later explain, this property is relevant for $S_N$ calculations.
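The fixed storage requirement of the CG method can be seen in a short implementation. The sketch below assumes the standard textbook form of the algorithm, written as two coupled short recurrences rather than as a single three-term recursion; the function name, tolerance, and tridiagonal test matrix are illustrative choices. Only a handful of work vectors are kept, regardless of the iteration count, and the matrix enters only through its action on a vector.

```python
import numpy as np

def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=None):
    """CG for a symmetric positive-definite operator, zero initial guess.

    Storage is a fixed set of vectors (x, r, p, A p) however many
    iterations are taken. Returns the iterate and the iteration count.
    """
    x = np.zeros_like(b)
    r = b.copy()                  # residual b - A x for x = 0
    p = r.copy()                  # search direction
    rs_old = r @ r
    b_norm = np.sqrt(b @ b)
    if max_iter is None:
        max_iter = 10 * b.size
    for k in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            return x, k
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, max_iter

# Diagonally dominant SPD tridiagonal test system (illustrative).
N = 50
A = 3.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
b = np.ones(N)
x, iterations = conjugate_gradient(lambda v: A @ v, b)
print(iterations, np.linalg.norm(b - A @ x))
```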
In summary, all Krylov methods (i) use a Krylov space of dimension $m$ to approximate the $m$th solution iterate, and (ii) force the residual associated with that iterate to be orthogonal to an $m$-dimensional weighting space of vectors. The main differences between the various types of Krylov methods arise from the nature of the weighting space of vectors and the computational cost (both in CPU time and memory) of obtaining a solution iterate. Some Krylov methods are designed for general linear systems, while others are specifically designed to be optimal for some particular type of linear system. For instance, the conjugate-gradient method is designed for symmetric positive-definite systems while the minimum-residual method is designed for symmetric indefinite systems.

1.9.2 Convergence and Preconditioning of Krylov Methods

The speed of convergence of Krylov methods very much depends on the characteristics of the coefficient matrix associated with the linear system being solved, i.e., the characteristics of the coefficient matrix $\mathbf{A}$ in Eq. (1.194). This matrix is normal if $\mathbf{A}\mathbf{A}^T = \mathbf{A}^T\mathbf{A}$. The convergence properties associated with normal coefficient matrices are fairly well understood, but characterizing the convergence properties associated with non-normal coefficient matrices remains an open problem.

A property associated with the convergence of the conjugate-gradient (CG) method is the condition number of the coefficient matrix, defined as the ratio of its largest singular value to its smallest singular value:

$$\kappa = \frac{\sigma_{\max}}{\sigma_{\min}}. \tag{1.211}$$

If $\kappa$ is not large, the convergence of the CG method will be rapid. However, if $\kappa$ is large, no conclusion can be made regarding the rate of convergence of CG iterations. The condition number is not particularly relevant to the convergence of the generalized minimum-residual (GMRES) method.
More relevant factors are the distribution of the eigenvalues of $\mathbf{A}$ in the complex plane, and (assuming that $\mathbf{A}$ is diagonalizable) the condition number of the matrix of eigenvectors of $\mathbf{A}$. An eigenvalue spectrum clustered away from zero and entirely contained in either half-plane and an eigenvector matrix of $\mathbf{A}$ with a small condition number are desirable.

If any single property of the coefficient matrix is desirable with a Krylov method, it is that its eigenvalues should be clustered away from zero. A set of eigenvalues is considered to be clustered if the distance between any two eigenvalues is much smaller than the distance of any eigenvalue from the origin. We stress that all of the eigenvalues need not be in one cluster or a collection of clusters. Any clustering of eigenvalues can be advantageous relative to a completely unclustered eigenvalue distribution.

In exact arithmetic, the maximum number of iterations required for a Krylov method to converge is $N$, the dimension of $\mathbf{A}$. However, it is desirable to use the technique of preconditioning to reduce the number of required iterations, both from the viewpoint of efficiency and from the fact that roundoff errors can be problematic when $N$ is large. A preconditioner for $\mathbf{A}$ is a matrix that approximates $\mathbf{A}^{-1}$. Given a preconditioning matrix $\mathbf{C}$, the preconditioning process can take two forms. The first, called left-preconditioning, corresponds to solving

$$\mathbf{C}\mathbf{A}\vec{x} = \mathbf{C}\vec{b}. \tag{1.212}$$

In the absence of round-off error, the above equation clearly has the same solution as Eq. (1.194). The second is called right-preconditioning and corresponds to solving

$$\mathbf{A}\mathbf{C}\vec{z} = \vec{b}, \tag{1.213}$$

where

$$\vec{x} = \mathbf{C}\vec{z}. \tag{1.214}$$

We note from Eq. (1.214) that once Eq. (1.213) has been solved, $\vec{x}$ is obtained from $\vec{z}$ via the action of $\mathbf{C}$ on $\vec{z}$. If $\mathbf{C} = \mathbf{A}^{-1}$, both Eqs. (1.212) and (1.213) can be solved in a single Krylov iteration. This is why a preconditioner should approximate $\mathbf{A}^{-1}$.
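The equivalence of Eqs. (1.212) through (1.214) with the original system, and the effect of a simple preconditioner on the eigenvalue spread, can be checked on a small example. The sketch below assumes a made-up $6 \times 6$ matrix and a Jacobi (inverse-diagonal) preconditioner chosen purely for illustration; direct solves stand in for the Krylov iteration, and the helper `spread` is hypothetical.

```python
import numpy as np

# Illustrative test matrix: a dominant diagonal plus weak uniform coupling.
d = np.arange(1.0, 7.0)
A = np.diag(d) + 0.05 * np.ones((6, 6))
b = np.ones(6)

# Jacobi preconditioner (an assumption for this sketch): C = diag(A)^{-1}.
C = np.diag(1.0 / np.diag(A))

x_ref = np.linalg.solve(A, b)             # Eq. (1.194)
x_left = np.linalg.solve(C @ A, C @ b)    # left-preconditioning, Eq. (1.212)
z = np.linalg.solve(A @ C, b)             # right-preconditioning, Eq. (1.213)
x_right = C @ z                           # Eq. (1.214)

def spread(M):
    """Ratio of largest to smallest eigenvalue magnitude."""
    lam = np.abs(np.linalg.eigvals(M))
    return lam.max() / lam.min()

print("solutions agree:", np.allclose(x_left, x_ref), np.allclose(x_right, x_ref))
print("eigenvalue spread of A and of C A:", spread(A), spread(C @ A))
```

For this matrix the eigenvalues of $\mathbf{C}\mathbf{A}$ lie close to unity, so the preconditioned spectrum is far more tightly clustered than that of $\mathbf{A}$.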
However, from a practical point of view, a preconditioner should approximate $\mathbf{A}^{-1}$ only in a limited sense, because it is not generally possible to find a nearly exact approximation to $\mathbf{A}^{-1}$ at a low computational cost. Also, if a nearly exact approximation to $\mathbf{A}^{-1}$ were known, a nearly exact solution to the problem, Eq. (1.194), would be immediately realizable without the need for iterations.

Experience in solving the transport equation with Krylov methods indicates that the best preconditioners move the smallest eigenvalues significantly away from zero, while leaving the largest eigenvalues relatively unaffected [124]. Assuming a preconditioned transport system that is symmetric positive-definite (SPD), the effectiveness of such preconditioners is easily explained when they are used in conjunction with the CG method. In particular, the eigenvalues and singular values of an SPD matrix are identical, so the condition number of such a matrix is just the ratio of the largest eigenvalue to the smallest eigenvalue. Thus, moving the eigenvalues closest to zero away from zero, while leaving the largest eigenvalues essentially unaffected, will decrease the condition number of the coefficient matrix and thereby result in faster convergence of the CG method. However, even though such preconditioners are observed to remain effective when used in conjunction with nonsymmetric transport systems and more general Krylov methods, the reasons for their effectiveness are not yet well understood.

To move the smallest eigenvalues away from zero, the preconditioner must accurately approximate the inverse of the coefficient matrix when operating on the eigenvectors associated with the smallest eigenvalues. To leave the largest eigenvalues relatively unaffected, the preconditioner must act approximately as the identity matrix when operating on the eigenvectors associated with the largest eigenvalues.
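The two requirements just stated can be made concrete with an idealized construction. The sketch below assumes an SPD matrix with a prescribed spectrum and builds a preconditioner that is the exact inverse on the eigenvectors with the two smallest eigenvalues and the identity on all others. Such a preconditioner is not available in practice, since it requires the eigenvectors; it is used here only to show the effect on the condition number.

```python
import numpy as np

rng = np.random.default_rng(2)

# SPD matrix with a prescribed spectrum: two eigenvalues very close to zero.
eigenvalues = np.array([1e-3, 1e-2, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))    # orthonormal eigenvectors
A = Q @ np.diag(eigenvalues) @ Q.T

# Idealized preconditioner (illustrative only): inverse of A on the span of the
# eigenvectors with small eigenvalues, identity on the orthogonal complement.
small = eigenvalues < 0.5
Q_s = Q[:, small]
C = np.eye(8) + Q_s @ np.diag(1.0 / eigenvalues[small] - 1.0) @ Q_s.T

cond_before = np.linalg.cond(A)
cond_after = np.linalg.cond(C @ A)
print(cond_before, cond_after)
```

By construction the two smallest eigenvalues are mapped to unity and the rest are untouched, so the condition number drops from about 6000 to about 6.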
Such preconditioners are analogous to the coarse-grid operators used in multigrid methods and the low-order operators used in synthetic acceleration schemes. Thus, it is somewhat clear how such preconditioners should be constructed.

As previously stated, the convergence of Krylov methods can be significantly improved by clustering some of the eigenvalues. Unfortunately, the properties required for a preconditioner to cluster eigenvalues are generally much less clear than those required to move the smallest eigenvalues away from zero. Nonetheless, as we shall later see, moderately effective preconditioners for transport calculations have been found that cluster some of the eigenvalues while leaving the smallest eigenvalues essentially unaltered [124].

Finally, moderately effective preconditioners for transport calculations have been found that move the smallest eigenvalues away from zero, but also significantly increase and spread out (uncluster) the largest eigenvalues [121, 122]. One can give a plausible explanation for the effectiveness of such preconditioners when they are used in conjunction with the CG method, but such an explanation is not currently possible with more general Krylov methods. For instance, if we assume a preconditioned transport system that is SPD, moving the smallest eigenvalues away from zero decreases the condition number of the coefficient matrix, while increasing the large eigenvalues increases the condition number of the coefficient matrix. We can conjecture that these preconditioners are moderately effective when used in conjunction with the CG method because the decrease in the condition number associated with moving the smallest eigenvalues away from zero weakly dominates the increase in the condition number associated with increasing the largest eigenvalues.
This results in a net decrease in the condition number of the coefficient matrix and an attendant increase in the convergence rate of the CG method. Preconditioners of this type are analogous to unstable acceleration schemes that strongly attenuate the low-frequency Fourier error modes but weakly amplify the high-frequency Fourier error modes. We emphasize that not every acceleration scheme that attenuates low-frequency Fourier error modes and amplifies high-frequency Fourier error modes can be expected to be effective when recast as a preconditioned Krylov method. If the amplification factors for the high-frequency Fourier error modes are too large, the corresponding preconditioner may increase the largest eigenvalues too much relative to the movement of the smallest eigenvalues away from zero, presumably resulting in a net decrease in convergence rate.

1.9.3 Applying Krylov Methods to the $S_N$ Equations

The optimal application of Krylov methods to the transport equation is not necessarily obvious. To illustrate, let us consider a monoenergetic approximation with isotropic scattering in 1-D slab geometry. The transport equation can be expressed as

$$\left(\mathbf{L} - \frac{1}{2}\Sigma_s\mathbf{P}\right)\psi = Q, \tag{1.215}$$

where

$$\mathbf{L} = \mu\frac{\partial}{\partial x} + \Sigma_t, \tag{1.216}$$

and

$$\mathbf{P} = \int_{-1}^{+1}(\cdot)\,d\mu. \tag{1.217}$$

A Krylov method could be used to solve a fully discretized $S_N$ version of Eq. (1.215), but it would not be optimal. A better approach is to left-precondition Eq. (1.215) with $\mathbf{L}^{-1}$:

$$\left(\mathbf{I} - \frac{1}{2}\mathbf{L}^{-1}\Sigma_s\mathbf{P}\right)\psi = \mathbf{L}^{-1}Q. \tag{1.218}$$

This approach is better for three reasons. First, the analytic operator on the left side of Eq. (1.218) represents the integral transport operator, which is bounded, whereas the differential transport operator is unbounded. This implies that the discretized $S_N$ version of the integral operator will have eigenvalues restricted to a bounded (hence much smaller) region of the complex plane than the discretized $S_N$ version of the differential operator.
Furthermore, the integral transport operator is a compact perturbation of the identity operator, hence many (but not all) of the eigenvalues of the discretized $S_N$ version of the integral operator will be clustered about unity. Thus, Krylov methods will generally converge faster when used to solve Eq. (1.218) than when used to solve Eq. (1.215). The second reason for solving Eq. (1.218) is that the action of $\mathbf{L}^{-1}$ is easily and economically obtained via a sweep. Thus, the improved convergence rate will not be outweighed by the cost of preconditioning with $\mathbf{L}^{-1}$. (We note that $\mathbf{L}^{-1}$ is an example of a preconditioner that clusters eigenvalues but does not significantly move the smallest eigenvalues away from zero. The improvement in convergence rate is significant, but preconditioning with $\mathbf{L}^{-1}$ will not result in rapid convergence for all problems.) The third reason for solving Eq. (1.218) is that it has been observed (but not rigorously proven [124]) that the matrices associated with $S_N$ discretizations of Eq. (1.218) are positive-definite. As previously noted, the restarted GMRES algorithm is guaranteed to converge when applied to linear systems with this property. This is a significant property of Eq. (1.218).

A further improvement can be obtained by multiplying Eq. (1.218) by $\mathbf{P}$. This results in an integral equation for the scalar flux $\phi = \mathbf{P}\psi$ (also known as Peierls’ equation):

$$\left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\phi = \mathbf{P}\mathbf{L}^{-1}Q. \tag{1.219}$$

The main advantage of this approach is that the dimension of the matrix associated with a discrete version of Eq. (1.219) is much less than that associated with a discrete version of Eq. (1.218), greatly reducing the amount of computation required per Krylov iteration. The derivation yielding Eq. (1.219) can be generalized for any order of anisotropic scattering, resulting in an integral equation for the Legendre (or spherical-harmonic) flux moments. Solving an integral moments equation will be advantageous relative to Eq.
(1.218) whenever the number of moments is less than the number of discrete angular flux directions. Another potential advantage of the integral moments approach is that analytic versions of such equations can be made self-adjoint and positive-definite (SAPD) via both a preconditioner and the use of a nonstandard inner product [107]. If a discretization for such an equation preserves this SAPD property, then the corresponding discrete system can be expressed in an SPD form and the conjugate-gradient method can be used to solve it. The SAPD property has been observed to be preserved for Eq. (1.219) on orthogonal meshes with a wide variety of $S_N$ spatial discretization schemes [82], but it is not preserved with the linear-discontinuous spatial discretization scheme on unstructured tetrahedral meshes [124]. The conditions under which the SAPD property is preserved are not currently well understood, but for reasons given later, this is not as important an issue as it might appear to be.

Finally, perhaps the optimal approach would be to multiply Eq. (1.219) by the operator $\mathbf{I} + \mathbf{D}^{-1}\Sigma_s$:

$$\left(\mathbf{I} + \mathbf{D}^{-1}\Sigma_s\right)\left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\phi = \left(\mathbf{I} + \mathbf{D}^{-1}\Sigma_s\right)\mathbf{P}\mathbf{L}^{-1}Q, \tag{1.220}$$

where $\mathbf{D}$ is the diffusion operator:

$$\mathbf{D} = -\frac{\partial}{\partial x}\frac{1}{3\Sigma_t}\frac{\partial}{\partial x} + \Sigma_a. \tag{1.221}$$

This is the analog of diffusion-synthetic acceleration, and it (presumably) results in rapid convergence under almost all practical conditions [124]. The deficiency of DSA in multidimensions with large discontinuities in the cross sections is strongly diminished when it is recast as a preconditioned Krylov method [124]. This is the great advantage of Krylov methods. A single eigenvalue of the iteration matrix very near unity can ruin the performance of an acceleration scheme, but a single anomalously large eigenvalue associated with the system being solved will generally have little effect on a Krylov method.

As previously noted, modern $S_N$ codes generally use discontinuous spatial discretization schemes.
If a discontinuous diffusion discretization is used, the left side of Eq. (1.220) will not be symmetric [124]. A great advantage of preconditioned Krylov methods is that the diffusion discretization does not have to be fully consistent with the $S_N$ spatial discretization to be effective. However, if a discontinuous $S_N$ spatial discretization is used, some capability for calculating a discontinuous solution must be included in the diffusion approximation to ensure unconditional effectiveness, and this will almost certainly cause the overall diffusion operator to be either nonsymmetric or symmetric-indefinite. Thus, even if a discontinuous discretization of Eq. (1.219) were to preserve the SAPD property of Peierls’ operator and result in an SPD system, applying a DSA-like preconditioner would result in an overall non-SPD system, precluding the use of the conjugate-gradient method to solve that system. This is why the failure of discontinuous methods to preserve the SAPD property of Peierls’ operator on unstructured meshes is not as important an issue as it might be. On the other hand, even when a discontinuous diffusion approximation is used, the operator on the left side of Eq. (1.220) remains positive-definite, ensuring the convergence of the restarted GMRES method. This is an important property of Eq. (1.220).

Next, we review the operations required to solve a fully discretized $S_N$ version of Eq. (1.220) and show that they are quite similar to the operations that take place in a standard $S_N$ code with diffusion-synthetic acceleration. We first note that to solve a general linear system such as Eq. (1.194), most Krylov routines do not require the formation of the matrix $\mathbf{A}$. Rather, it is required that the calling routine provide the Krylov routine with a vector that represents the action of $\mathbf{A}$.
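As an illustration of supplying only the action of the operator, the following sketch solves a discretized version of Eq. (1.219), without the diffusion preconditioner of Eq. (1.220), using both source iteration and a Krylov method. Everything specific in it is an assumption made for the example: a homogeneous 10-mean-free-path slab with vacuum boundaries, a scattering ratio of 0.99, diamond differencing, an $S_8$ Gauss quadrature, and a compact unrestarted GMRES written directly from the Arnoldi process. The names `sweep`, `apply_A`, `source_iteration`, and `gmres` are hypothetical.

```python
import numpy as np

# Illustrative problem data (assumptions, not taken from the text).
n_cells, width = 50, 10.0
sigma_t, sigma_s = 1.0, 0.99
h = width / n_cells
mu, wt = np.polynomial.legendre.leggauss(8)    # S8 quadrature, weights sum to 2

def sweep(q):
    """Return P L^{-1} q for an isotropic angular source q (vacuum boundaries)."""
    phi = np.zeros(n_cells)
    for mu_n, w_n in zip(mu, wt):
        a = 2.0 * abs(mu_n) / h
        order = range(n_cells) if mu_n > 0 else range(n_cells - 1, -1, -1)
        psi_in = 0.0
        for i in order:
            psi_avg = (q[i] + a * psi_in) / (sigma_t + a)   # diamond difference
            psi_in = 2.0 * psi_avg - psi_in                 # outflow feeds next cell
            phi[i] += w_n * psi_avg
    return phi

def apply_A(v):
    """Action of I - (1/2) P L^{-1} Sigma_s on a scalar-flux vector."""
    return v - sweep(0.5 * sigma_s * v)

def source_iteration(b, tol=1e-8, max_iter=20000):
    phi = np.zeros_like(b)
    for k in range(max_iter):
        r = b - apply_A(phi)
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return phi, k
        phi = phi + r              # same as phi <- (1/2) P L^{-1} Sigma_s phi + b
    return phi, max_iter

def gmres(b, tol=1e-8, max_iter=n_cells):
    """Unrestarted GMRES with a zero initial guess (compact sketch)."""
    beta = np.linalg.norm(b)
    V = np.zeros((b.size, max_iter + 1))
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = b / beta
    for k in range(max_iter):
        w = apply_A(V[:, k])
        for j in range(k + 1):                 # modified Gram-Schmidt (Arnoldi)
            H[j, k] = V[:, j] @ w
            w -= H[j, k] * V[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        e1 = np.zeros(k + 2)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        res = np.linalg.norm(e1 - H[:k + 2, :k + 1] @ y)
        if res <= tol * beta or H[k + 1, k] == 0.0:
            return V[:, :k + 1] @ y, k + 1
        V[:, k + 1] = w / H[k + 1, k]
    return V[:, :max_iter] @ y, max_iter

q_ang = np.full(n_cells, 0.5)      # unit isotropic source, expressed per direction
b = sweep(q_ang)                   # right side of Eq. (1.219)
phi_si, si_iters = source_iteration(b)
phi_gm, gmres_iters = gmres(b)
print("sweeps needed:", si_iters, "(source iteration),", gmres_iters, "(GMRES)")
```

Both solvers call the same `apply_A`, so each iteration costs one sweep in either case. Because the source-iteration iterates with a zero initial guess lie in the same Krylov spaces, the residual-minimizing GMRES iterate can never need more sweeps to reach a given residual tolerance, and for a scattering-dominated problem such as this one it needs far fewer.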
Specifically, the Krylov routine provides the calling program with a vector, which we denote by $\vec{v}$, and the calling routine returns the vector $\vec{v}'$, where

$$\vec{v}' = \mathbf{A}\vec{v}. \tag{1.222}$$

The Krylov space can easily be built up this way, with one matrix-vector multiply per iteration. The simplest Krylov methods (e.g., the conjugate-gradient method) require only one matrix-vector multiply per iteration, but others may require more (e.g., the biconjugate-gradient method requires two).

We now consider the solution of Eq. (1.220) in detail. Since the unknown in Eq. (1.220) is the monoenergetic or one-group scalar flux, a Krylov method used to solve this equation works with “scalar flux” vectors, i.e., vectors with a length equal to the number of spatial cells times the number of spatial unknowns per cell. The latter number clearly depends on the spatial discretization used for Eq. (1.220). In the process of generating the scalar flux vectors required by the Krylov solver, intermediate “angular flux vectors” arise. These vectors have a length equal to the length of a scalar flux vector times the number of discrete directions in the quadrature set. In the descriptions that follow, angular flux vectors carry an additional subscript “a” to distinguish them from scalar flux vectors.

The first vectors that must be provided to a Krylov solver are the initial solution guess, and the vector corresponding to the right side of Eq. (1.220). The latter “source” vector is generated as follows:

1. Perform a transport sweep with $\vec{Q}_a$ as the source to obtain
   $$\vec{v}_{1,a} = \mathbf{L}^{-1}\vec{Q}_a.$$
2. Integrate $\vec{v}_{1,a}$ over all directions to obtain
   $$\vec{v}_2 = \mathbf{P}\mathbf{L}^{-1}\vec{Q}_a.$$
3. Multiply $\vec{v}_2$ by $\Sigma_s$ (a diagonal matrix in this context) to obtain a diffusion source vector:
   $$\vec{v}_3 = \Sigma_s\mathbf{P}\mathbf{L}^{-1}\vec{Q}_a.$$
4. Solve the diffusion equation with $\vec{v}_3$ as the source to obtain
   $$\vec{v}_4 = \mathbf{D}^{-1}\Sigma_s\mathbf{P}\mathbf{L}^{-1}\vec{Q}_a.$$
5. Add $\vec{v}_2$ and $\vec{v}_4$ to obtain the desired source vector:
   $$\vec{v}_5 = \left(\mathbf{I} + \mathbf{D}^{-1}\Sigma_s\right)\mathbf{P}\mathbf{L}^{-1}\vec{Q}_a.$$

Next, every iteration requires the calling routine to provide the Krylov solver with the action of the operator on the left side of Eq. (1.220). The action of this operator on the vector $\vec{v}$ is calculated as follows:

1. Multiply $\vec{v}$ by $\Sigma_s/2$ and map it to an isotropic angle-dependent vector to obtain a transport source vector:
   $$\vec{v}_{1,a} = \frac{1}{2}\Sigma_s\vec{v}.$$
2. Perform a sweep with $\vec{v}_{1,a}$ as the source to obtain
   $$\vec{v}_{2,a} = \frac{1}{2}\mathbf{L}^{-1}\Sigma_s\vec{v}.$$
3. Integrate $\vec{v}_{2,a}$ over all angles to obtain
   $$\vec{v}_3 = \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\vec{v}.$$
4. Subtract $\vec{v}_3$ from $\vec{v}$ to obtain
   $$\vec{v}_4 = \left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\vec{v}.$$
5. Multiply $\vec{v}_4$ by $\Sigma_s$ to obtain a diffusion source vector:
   $$\vec{v}_5 = \Sigma_s\left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\vec{v}.$$
6. Perform a diffusion calculation with $\vec{v}_5$ as the source to obtain
   $$\vec{v}_6 = \mathbf{D}^{-1}\Sigma_s\left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\vec{v}.$$
7. Add $\vec{v}_4$ to $\vec{v}_6$ to obtain the desired action on $\vec{v}$:
   $$\vec{v}_7 = \left(\mathbf{I} + \mathbf{D}^{-1}\Sigma_s\right)\left(\mathbf{I} - \frac{1}{2}\mathbf{P}\mathbf{L}^{-1}\Sigma_s\right)\vec{v}.$$

Thus we see that only sweeps, diffusion solves, and vector additions and subtractions are required to solve the transport equation via a preconditioned Krylov method. Because these operations occur in standard $S_N$ codes with DSA, it is not difficult to modify existing $S_N$ codes to use preconditioned Krylov solution techniques.

1.10 Future Challenges

In this review, we have discussed some of the major improvements to numerical techniques, developed during the last three decades, for the $S_N$ equations. Perhaps the most far-reaching improvements have resulted from discontinuous finite-element methods, which have impacted the discretization of essentially all of the variables in the transport equation. The present challenge for discontinuous finite-element methods is to achieve both accuracy and robustness on general nonorthogonal meshes in the optically thin limit, the optically thick absorptive limit, and the optically thick diffusive limit.
These properties can be achieved on orthogonal meshes with standard lumping techniques, but such techniques are problematic on nonorthogonal meshes.

Achieving robust strictly positive solutions with at least second-order accuracy in regions where fixup is not required is a classic problem that arises in many fields. Typical fixup schemes for transport are highly nonlinear and nondifferentiable. For instance, a popular fixup scheme used with spatial diamond differencing is to set any negative outflow fluxes in a cell to zero and then recalculate the cell-averaged flux [71]. When a problem contains both highly absorptive and highly scattering regions, the resulting equations can be difficult to solve even when nonlinear solution techniques are used because of the nondifferentiable nature of the “fixup” scheme. A new concept is now emerging by which solutions are not “fixed up” during the iteration process, but rather are “repaired” after convergence [123]. New ideas, such as this, will be required to solve the old and difficult problem of ensuring positive and monotone second-order solutions.

Discrete-ordinates methods have traditionally been viewed as being limited to simple geometries, due to the use of orthogonal meshes. The recent development of $S_N$ methods for general unstructured meshes constitutes a major advance in the applicability of $S_N$ methods to geometrically complex problems. Highly efficient and scalable massively parallel $S_N$ solution techniques have been developed for orthogonal meshes [51], but massively parallel solution techniques for unstructured meshes are not yet as efficient or scalable [104, 114]. Also, fundamental problems occur in the $S_N$ method on meshes with curved cell faces. These problems have been addressed to a limited extent [109], but much more computational experience is needed before the efficacy of current approaches is fully demonstrated.
Tremendous progress in DSA and DSA-like acceleration schemes has been made during the last two decades. The field of acceleration techniques was dealt a significant setback when it was discovered that these schemes are not always unconditionally effective, even with fully consistent diffusion discretizations [87]. However, the advances recently made in the application of Krylov methods to transport calculations have enabled synthetic acceleration schemes to be recast as unconditionally effective preconditioners, thereby regaining a powerful potential for these methods [124]. Developing optimal preconditioned Krylov methods for transport calculations remains an outstanding research topic.

A significant challenge for charged-particle transport is to use the $S_N$ method to efficiently and accurately treat problems with pencil-beam or near pencil-beam sources. Some progress has recently been made in this regard [125], but more efficient methods requiring lower orders of approximation are needed. Since charged-particle scattering is extremely anisotropic, Galerkin quadratures represent an important alternative to standard quadratures for charged-particle transport calculations. However, much remains to be understood regarding the optimal properties of Galerkin quadrature sets. For example, it is desirable for standard quadrature sets to have positive quadrature weights. Galerkin quadrature sets have a (potentially different) set of weights for evaluating each flux moment. The Galerkin weights associated with the evaluation of the zeroth flux moment represent the weights associated with the standard companion quadrature set. These quadrature weights should be positive because (among other things) positivity indicates a stable spherical-harmonic interpolation. However, the connection between positive weights for evaluating the higher-order flux moments and the stability of the spherical-harmonic interpolation is not clear.
To our knowledge, a multidimensional Galerkin quadrature set beyond S2 with positive weights for all moments has never been constructed.

Finally, very little progress has been made with respect to perhaps the most significant of all SN deficiencies: the ray effect [119]. Although PN methods do not suffer from classic ray effects, they rarely provide the correct solutions to problems for which SN solutions exhibit ray effects. This is because the exact solutions to such problems usually have an extremely anisotropic angular dependence, and the global nature of spherical-harmonic functions makes them inefficient for representing such dependencies. Unfortunately, the introduction of angular coupling via continuous and discontinuous finite-element approximations does not appear promising for ray-effect mitigation [57, 119]. Because the solutions associated with ray effects are so complex, the accurate elimination of ray effects will probably require angular adaptation. Although adaptive methods are quite mature in many fields of computation, they are just beginning to be investigated for transport calculations [85, 92, 111, 117]; this possibly represents an important research area for the future.

Acknowledgments

We would like to acknowledge the invaluable assistance of Michele Benzi of Emory University, James Warsa of Los Alamos National Laboratory, and Carrie Beck. Michele graciously reviewed our section on Krylov methods. James provided us with basic information on Krylov methods and their application to the SN equations. Carrie accomplished the laborious task of typing this manuscript in its present format. We would also like to acknowledge that one of the authors (JEM) was an employee of Los Alamos National Laboratory when much of this manuscript was written. Los Alamos National Laboratory is operated by Los Alamos National Security, LLC, for the US Department of Energy under Contract DE-AC52-06NA25396.

Authors’ Note

The following reference list consists mostly of papers that, in the authors’ opinions, have significantly contributed to the topics discussed in this review. We have included several other important publications, but we do not intend for the list to be a full compilation of significant publications on SN methods.

References

1900–1964

1. Eddington AS (1922) The internal constitution of the stars. Cambridge University Press, Cambridge
2. Chandrasekhar S (1960) Radiative transfer. Dover, New York. First published by Oxford University Press, London (1950)
3. Davison B (1957) Neutron transport theory. Oxford University Press, London
4. Carlson BG (1961) Numerical solution of neutron transport problems. In: Proc. of symposia in applied mathematics, Vol. 11, Nuclear reactor theory, American Mathematical Society, Providence, RI, 219
5. Carlson BG (1963) The numerical theory of neutron transport. In: Methods in Computational Physics, Vol. 1, Academic, New York
6. Kopp HJ (1963) Synthetic method solution of the transport equation. Nucl Sci Eng 17:65
7. Vladimirov VS (1961) Mathematical problems in the one-velocity theory of particle transport, Transactions of V.A. Steklov Mathematical Institute, USSR Academy of Sciences, 61. English translation published by Atomic Energy of Canada Limited, Tech. Report AECL-1661 (1963)
8. Gol’din VYa (1964) A quasi-diffusion method for solving the kinetic equation. Zh Vych Mat I Mat Fiz 4:1078 (in Russian). English translation (1967) published in: USSR Comp Math Math Phys 4:136

1965–1969

9. Lathrop KD (1965) Anisotropic scattering approximations in the mono-energetic Boltzmann equation. Nucl Sci Eng 21:498
10. Carlson BG, Lathrop KD (1968) Transport theory – the method of discrete ordinates. In: Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics. Gordon & Breach, New York
11. Gelbard EM, Hageman LA (1969) The synthetic method as applied to the SN equations. Nucl Sci Eng 37:288
12.
Lathrop KD (1969) Spatial differencing of the transport equation: positivity vs accuracy. J Comp Phys 4:475

1970–1974

13. Bell GI, Glasstone S (1970) Nuclear Reactor Theory. Van Nostrand Reinhold, New York
14. Carlson BG (1970) Transport theory: discrete ordinates quadrature over the unit sphere. Los Alamos Scientific Laboratory report LA-4554
15. Carlson BG (1971) Tables of symmetric equal weight quadrature EQn over the unit sphere. Los Alamos Scientific Laboratory Report LA-4734
16. Reed WH (1971) The effectiveness of acceleration techniques for iterative methods in transport theory. Nucl Sci Eng 45:245
17. Pomraning GC (1973) The equations of radiation hydrodynamics. Pergamon Press, Oxford
18. Reed WH, Hill TR (1973) Triangular mesh methods for the neutron transport equation. Los Alamos Scientific Laboratory Report LA-UR-73-479
19. Dahlquist G, Björck Å (1974) Numerical methods. Prentice Hall, Englewood Cliffs, NJ
20. Kellogg RB (July 1974) First derivatives of solutions of the plane neutron transport equation, Technical Note BN-783. Institute for Fluid Dynamics and Applied Mathematics, University of Maryland, College Park, MD

1975–1979

21. Briggs LL, Miller WF Jr, Lewis EE (1975) Ray effect mitigation in discrete ordinate-like angular finite-element approximations in neutron transport. Nucl Sci Eng 57:205
22. Hill TR (1975) ONETRAN: a discrete-ordinates finite-element code for the solution of the one-dimensional multigroup transport equation. Los Alamos Scientific Laboratory Report LA-5990-MS
23. Alcouffe RE (1977) Diffusion synthetic acceleration methods for the diamond-differenced discrete-ordinates equations. Nucl Sci Eng 64:344
24. Miller WF Jr, Reed WH (1977) Ray effect mitigation methods for two-dimensional neutron transport theory. Nucl Sci Eng 62:391
25.
Alcouffe RE, Larsen EW, Miller WF Jr, Wienke BR (1979) Computational efficiency of numerical methods for the multigroup, discrete-ordinates neutron transport equations: the slab geometry case. Nucl Sci Eng 71:111
26. Dudziak DJ, O’Dell RD, Alcouffe RE (July 1979) Transport and reactor theory. Los Alamos National Laboratory Report, LA-7911-PR
27. Morel JE (1979) On the validity of the extended transport cross-section correction for low-energy electron transport. Nucl Sci Eng 71:64

1980–1984

28. Lawrence RD, Dorning JJ (1980) A discrete nodal integral transport theory method for multidimensional reactor physics and shielding calculations. In: Proc. ANS topical conference. Advances in reactor physics and shielding, 14–19 Sept 1980. Sun Valley, Idaho, 840
29. Walters WF, O’Dell RD (1980) Nodal methods for discrete-ordinates transport problems in (x-y) geometry. In: Proc. ANS topical conference. Advances in mathematical methods for the solution of nuclear engineering problems, 27–29 April 1980. 1, 115. Munich
30. Gupta NK (1981) Nodal methods for three-dimensional simulators. Prog Nucl Energy 7:127
31. Morel JE (1981) Fokker–Planck calculations using standard discrete ordinates codes. Nucl Sci Eng 79:340
32. Sanchez R, McCormick NJ (1982) A review of neutron transport approximations. Nucl Sci Eng 80:481 (review)
33. Larsen EW (1982) Spatial convergence properties of the diamond difference method in X,Y-geometry. Nucl Sci Eng 80:710
34. Larsen EW (1982) Unconditionally stable diffusion synthetic acceleration methods for the slab geometry discrete-ordinates equations. Part I: theory. Nucl Sci Eng 82:47
35. McCoy DR, Larsen EW (1982) Unconditionally stable diffusion synthetic acceleration methods for the slab geometry discrete-ordinates equations. Part II: numerical results. Nucl Sci Eng 82:64
36. Morel JE (1982) A synthetic acceleration method for discrete ordinates calculations with highly anisotropic scattering. Nucl Sci Eng 82:34
37.
Caro M, Ligou J (1983) Treatment of scattering anisotropy of neutrons through the Boltzmann–Fokker–Planck equation. Nucl Sci Eng 83:242
38. Morel JE, Montry GR (1984) Analysis and elimination of the discrete ordinates flux dip. Transp Theory Stat Phys 13:615
39. Mihalas D, Mihalas BW (1984) Foundations of radiation hydrodynamics. Oxford University Press, New York

1985–1989

40. Morel JE (1985) An improved Fokker–Planck angular differencing scheme. Nucl Sci Eng 89:131–136
41. Morel JE (1985) Diamond-difference multigroup-Legendre coefficients for continuous-slowing down operator. Nucl Sci Eng 91:324
42. Morel JE, Larsen EW, Matzen MK (1985) A synthetic acceleration scheme for radiative diffusion calculations. J Quant Spectrosc Radiat Transf 34:243
43. Badruzzaman A (1985) An efficient algorithm for nodal transport solutions in multidimensional geometry. Nucl Sci Eng 89:281
44. Khalil H (1985) A nodal diffusion technique for synthetic acceleration of nodal SN calculations. Nucl Sci Eng 90:263
45. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and transport equations. Prog Nucl Energy 17:271 (review)
46. Marchuk GI, Lebedev VI (1986) Numerical methods in the theory of neutron transport. Harwood Academic Publishers, London (originally published in Russian in 1981)
47. Lazo MS, Morel JE (1986) A linear discontinuous Galerkin approximation for the continuous slowing down operator. Nucl Sci Eng 92:98
48. Ullo JJ (1986) Use of multidimensional transport methodology on nuclear logging problems. Nucl Sci Eng 92:228
49. Larsen EW, Morel JE, Miller WF Jr (1987) Asymptotic solutions of numerical transport problems in optically thick, diffusive regimes. J Comp Phys 69:283
50. Larsen EW (1988) A grey transport acceleration method for thermal radiative transfer problems. J Comp Phys 78:459
51. Baker RS, Koch KR (1988) An SN algorithm for the massively parallel CM-200 computer. Nucl Sci Eng 128:312
52.
Larsen EW, Morel JE (1989) Asymptotic solutions of numerical transport problems in optically thick, diffusive regimes II. J Comp Phys 83:212. See also Corrigendum. J Comp Phys 91:246 (1990)
53. Faber V, Manteuffel TA (1989) A look at transport theory from the point of view of linear algebra. In: Nelson P et al. (eds) Lecture notes in pure and applied mathematics, Vol. 115. Marcel Dekker, New York, p 31
54. Morel JE (1989) A hybrid collocation-Galerkin-SN method for solving the Boltzmann transport equation. Nucl Sci Eng 101:72
55. Landesman M, Morel JE (1989) Angular Fokker–Planck decomposition and representation techniques. Nucl Sci Eng 103:1

1990–1994

56. Cefus GR, Larsen EW (1990) Stability analysis of coarse mesh rebalance. Nucl Sci Eng 105:31
57. Coppa GGM, Lapenta G, Ravetto P (1990) Angular finite-element techniques in neutron transport. Ann Nucl Energy 17:363
58. Lorence LJ Jr, Morel JE, Valdez GD (1990) Results guide to CEPXS/ONELD: a one-dimensional coupled electron-photon discrete ordinates code package. Sandia National Laboratories Report SAND89-211
59. Badruzzaman A (1991) Finite moments approaches to the time-dependent Boltzmann equation. Prog Nucl Energy 25:127
60. Walters WF, Morel JE (1991) Investigation of linear-discontinuous angular differencing for the 1-D spherical geometry SN equations. In: Proc. ANS M&C topical meeting. Advances in mathematics, computation, and reactor physics, 28 April–2 May 1991, Sec. 11.1, Pittsburgh
61. Wareing T, Larsen EW, Adams ML (1991) Diffusion accelerated discontinuous finite element schemes for the SN equations in slab and X,Y geometries. In: Proc. ANS topical meeting. Advances in mathematics, computations, and reactor physics, 29 April–2 May 1991, Vol. 3, Sec. 11.1. Pittsburgh, p 2–1
62. Morel JE, Manteuffel TA (1991) An angular multigrid acceleration technique for the SN equations with highly forward-peaked scattering. Nucl Sci Eng 107:330
63.
Ashby SF, Brown PN, Dorr MR, Hindmarsh AC (1991) Preconditioned iterative methods for discretized transport equations. In: Proc. International topical meeting. Advances in mathematics, computations, and reactor physics, 28 April–2 May 1991, Vol. 2. American Nuclear Society, Pittsburgh, PA, p. 6.1 2–1
64. Adams ML, Martin WR (1992) Diffusion synthetic acceleration of discontinuous finite element transport iterations. Nucl Sci Eng 111:145
65. Azmy YY (1992) Arbitrarily high order characteristic methods for solving the neutron transport equation. Ann Nucl Energy 19:593
66. Larsen EW (1992) The asymptotic diffusion limit of discretized transport problems. Nucl Sci Eng 112:336 (review)
67. Wareing T (November, 1992) Asymptotic diffusion accelerated discontinuous finite element methods for transport problems. Los Alamos National Laboratory Report LA-12425-T
68. Adams ML, Wareing TA (1993) Diffusion-synthetic acceleration given anisotropic scattering, general quadratures, and multi-dimensions. Trans Am Nucl Soc 68:203
69. Adams BT, Morel JE (1993) A two-grid acceleration scheme for the multigroup SN equations with neutron upscattering. Nucl Sci Eng 115:253
70. Morel JE, Dendy JE Jr, Wareing TA (1993) Diffusion accelerated solution of the 2-D SN equations with bilinear-discontinuous differencing. Nucl Sci Eng 115:304
71. Lewis EE, Miller WF Jr (1984) Computational methods of neutron transport. Wiley-Interscience, New York [reprinted: American Nuclear Society, La Grange Park, IL (1993)]
72. Briesmeister JF (ed) (1993) MCNP – A General Monte Carlo N-Particle Transport Code, Version 4A. Los Alamos National Laboratory Report, LA-12625
73. Morel JE, McGhee JM (1994) A fission-source acceleration technique for time-dependent even-parity SN calculations. Nucl Sci Eng 116:73–85

1995–1999

74. Brown PN (1995) A linear algebraic development of diffusion synthetic acceleration for three-dimensional transport equations. SIAM J Numer Anal 32:179
75.
Kelley CT (1995) Multilevel source iteration accelerators for the linear transport equation in slab geometry. Transp Theory Stat Phys 24:697
76. Castrianni CL, Adams ML (1995) A nonlinear corner-balance spatial discretization for transport on arbitrary grids. In: Proc. ANS M&C international conference on mathematics and computations, reactor physics, and environmental analyses, 30 April–4 May 1995, Vol. 2. Portland, p 916
77. Morel JE, Wareing TA, Smith K (1996) A linear-discontinuous spatial differencing scheme for SN radiative transfer calculations. J Comp Phys 128:445
78. Wareing TA, McGhee JM, Morel JE (1996) ATTILA: a three-dimensional unstructured tetrahedral mesh discrete ordinates transport code. Trans Am Nucl Soc 17:146
79. Patton BW, Holloway JP (1996) Application of Krylov subspace iterative methods to the slab geometry transport equation. In: Proc. 1996 ANS topical meeting. Advances and applications in radiation protection and shielding, 21–25 April 1996, Vol. 1. N. Falmouth, MA, p 384
80. Kelley CT, Xue ZQ (1996) GMRES and integral operators. SIAM J Sci Comp 17:217
81. Saad Y (1996) Iterative methods for sparse linear systems. PWS Publishing Company, Boston, MA
82. Ramone GL, Adams ML, Nowak PF (1997) A transport synthetic acceleration method for transport iterations. Nucl Sci Eng 125:257
83. Adams ML (1997) A subcell balance method for radiative transfer on arbitrary spatial grids. Transp Theory Stat Phys 26(4 & 5):385
84. Adams BT, Morel JE (1997) An acceleration scheme for the multigroup SN equations with fission and thermal upscatter. In: Proc. ANS M&C topical meeting. Joint international conference on mathematical methods and supercomputing for nuclear applications, 5–10 Oct 1997, Vol. 1. Saratoga Springs, New York, p. 343
85. Aussourd C (1997) An adapted DSN scheme for solving the two-dimensional neutron transport equation on a structured AMR grid.
In: Proc. international conference on mathematical methods and supercomputing for nuclear applications, October 5–9, 1997, Vol. 1. American Nuclear Society, Saratoga Springs, New York, p. 41
86. Adams ML, Nowak PF (1998) Asymptotic analysis of a computational method for time- and frequency-dependent radiative transfer. J Comp Phys 146:366
87. Azmy YY (2002) Unconditionally stable and robust adjacent-cell diffusive preconditioning of weighted-difference particle transport methods is impossible. J Comp Phys 182:213
88. Adams ML, Wareing TA, Walters WF (1998) Characteristic methods in thick diffusive problems. Nucl Sci Eng 130:18
89. Wareing TA, Morel JE (1998) ACTI Task 1 Summary. Los Alamos National Laboratory Report LA-UR-98-4749
90. Castrianni CL, Adams ML (1998) A nonlinear corner-balance spatial discretization for transport on arbitrary grids. Nucl Sci Eng 128:278
91. Warsa JS, Prinja AK (1998) Bilinear-discontinuous numerical solution of the time dependent transport equation in slab geometry. Ann Nucl Energy 26:195
92. Jessee JP, Fiveland WA, Howell LH, Colella P, Pember RB (1998) An adaptive mesh refinement algorithm for the radiative transport equation. J Comp Phys 139:380
93. Pautz SD, Morel JE, Adams ML (1999) An angular multigrid acceleration method for SN equations with highly forward-peaked scattering. In: Proc. international conference on mathematics and computation, reactor physics and environmental analyses in nuclear applications, 27–30 Sept 1999, Vol. 1. Madrid, Spain, pp 647–656
94. Azmy YY (1999) Iterative convergence acceleration of neutral particle transport methods via adjacent-cell preconditioners. J Comp Phys 152:359
95. Guthrie B, Holloway JP, Patton BW (1999) GMRES as a multi-step transport sweep accelerator. Transp Theory Stat Phys 28:83
96. Thompson KG, Adams ML (1999) A spatial discretization for solving the transport equation on unstructured grids of polyhedra. In: Proc. ANS M&C international topical conference.
Mathematics and computation, reactor physics and environmental analysis, 27–30 Sept 1999, Vol. 2. Madrid, p 1196
97. Wareing TA, McGhee JM, Morel JE, Pautz SD (1999) Discontinuous finite element SN methods on 3-D unstructured grids. In: Proc. ANS M&C international topical conference. Mathematics and computation, reactor physics and environmental analysis, 27–30 Sept 1999, Vol. 2. Madrid, p 1185
98. Wareing TA, Morel JE, McGhee JM (1999) A diffusion synthetic acceleration method for the SN equations with discontinuous finite element space and time differencing. In: Proc. ANS M&C international topical conference. Mathematics and computation, reactor physics and environmental analysis, 27–30 Sept 1999, Vol. 1. Madrid, p 45

2000–2005

99. Azmy YY (2000) Acceleration of multidimensional discrete ordinates methods via adjacent-cell preconditioners. Nucl Sci Eng 136:202
100. Lathrop KD (2000) A comparison of angular difference schemes for one-dimensional spherical geometry SN equations. Nucl Sci Eng 134:239
101. Wareing TA, Morel JE, McGhee JM (2000) Coupled electron–photon transport methods on 3-D unstructured grids. Trans Am Nucl Soc 83:240–242
102. Mathews KA, Miller RL, Brennan CR (2000) Split-cell linear characteristic transport method for unstructured tetrahedral meshes. Nucl Sci Eng 136:178
103. Zika MR, Adams ML (2000) Transport synthetic acceleration with opposing reflecting boundary conditions. Nucl Sci Eng 134:159
104. Plimpton S, Hendrickson B, Burns S, McLendon W III (2000) Parallel algorithms for radiation transport on unstructured grids. In: Proc. SuperComputing 2000, 4–10 Nov 2000. Dallas, TX
105. Adams ML (2001) Discontinuous finite element transport solutions in thick diffusive problems. Nucl Sci Eng 137:298
106. Brennan CR, Miller RL, Mathews KA (2001) Split-cell exponential characteristic transport method for unstructured tetrahedral meshes. Nucl Sci Eng 138:26
107.
Sanchez R, Santandrea S (2001) Symmetrization of the transport operator and Lanczos’ iterations. In: Proc. international meeting on mathematical methods for nuclear applications, 9–13 Sept 2001. American Nuclear Society, Salt Lake City, Utah
108. Sanchez R, Chetaine A (2001) A synthetic acceleration for a two-dimensional characteristic method in unstructured meshes. Nucl Sci Eng 136:122
109. Wareing TA, McGhee JM, Morel JE, Pautz SD (2001) Discontinuous finite element SN methods on three-dimensional unstructured grids. Nucl Sci Eng 138:256
110. Adams ML, Larsen EW (2002) Fast iterative methods for discrete-ordinates particle transport calculations. Prog Nucl Energy 40:3 (review)
111. Baker RS (2002) A block adaptive mesh refinement algorithm for the neutral particle transport equation. Nucl Sci Eng 141:1
112. Liscum-Powell JL, Prinja AK, Morel JE, Lorence LJ Jr (2002) Finite element solution of the self-adjoint angular flux equation for coupled electron–photon transport. Nucl Sci Eng 142:270
113. Pautz SD, Adams ML (2002) An asymptotic study of discretized transport equations in the Fokker–Planck limit. Nucl Sci Eng 140:51
114. Pautz SD (2002) An algorithm for parallel SN sweeps on unstructured meshes. Nucl Sci Eng 140:111
115. Patton BW, Holloway JP (2002) Application of preconditioned GMRES to the numerical solution of the neutron transport equation. Ann Nucl Energy 29:109
116. Warsa JS, Wareing TA, Morel JE (2002) Fully consistent diffusion synthetic acceleration of linear discontinuous SN transport discretizations on unstructured tetrahedral meshes. Nucl Sci Eng 141:236
117. Aussourd C (2003) Styx: a multidimensional AMR SN scheme. Nucl Sci Eng 143:281
118. Haghighat A, Wagner JC (2003) Monte Carlo variance reduction with deterministic importance functions. Prog Nucl Energy 42:25 (review)
119. Morel JE, Wareing TA, Lowrie RB, Parsons DK (2003) Analysis of ray-effect mitigation techniques. Nucl Sci Eng 144:1
120.
Morel JE, Prinja AK, McGhee JM, Wareing TA, Franke BC (2003) A discretization scheme for the 3-D continuous-scattering operator. Trans Am Nucl Soc 89:360
121. Hanshaw HL, Larsen EW (2003) The explicit slope SN discretization method. In: Proc. topical meeting on mathematics and computations in nuclear applications, 6–10 April 2003. American Nuclear Society, Gatlinburg, Tennessee
122. Hanshaw HL, Nowak PF, Larsen EW (2003) Stretched and filtered transport-synthetic acceleration of SN problems: part 1, homogeneous media. Trans Am Nucl Soc 89:354
123. Shashkov M, Wendroff B (2004) The repair paradigm and application to conservation laws. J Comp Phys 198:265
124. Warsa JS, Wareing TA, Morel JE (2004) Krylov iterative methods and the degraded effectiveness of diffusion synthetic acceleration for multidimensional SN calculations in problems with material discontinuities. Nucl Sci Eng 147:218
125. Sanchez R, McCormick NJ (2004) Discrete ordinates solutions for highly forward peaked scattering. Nucl Sci Eng 147:249
126. Warsa JS, Wareing TA, Morel JE, McGhee JM, Lehoucq RB (2004) Krylov subspace iterations for deterministic k-eigenvalue calculations. Nucl Sci Eng 147:26

Professor Edward Larsen was born in New York City in 1944. He attended Rensselaer Polytechnic Institute, graduating with a Ph.D. in Mathematics in 1971. His first position after graduate school was an assistant professorship in the Department of Mathematics at New York University. At the Courant Institute of Mathematical Sciences at NYU, he started a research program on neutron transport theory and made significant contributions to the applications of asymptotic expansions to this subject.
In 1977, after 5 years at NYU and 1 year at the University of Delaware, he accepted a staff position in the Transport Theory Group at Los Alamos National Laboratory, where he investigated the numerical methods that underlie the discretization and solution strategies of the group’s large-scale deterministic neutron transport codes. During his 9 years at Los Alamos, Larsen’s research focus underwent a fundamental shift from analytical problems of mainly theoretical interest to computational problems of practical interest. At Los Alamos he made significant contributions to the development of methods for accelerating the iterative convergence of transport calculations, and to theoretically predicting the accuracy of numerical solutions in regions of large optical thickness. Several of these techniques are now commonplace in the computational transport community. In 1986, Larsen accepted a professorship in the Department of Nuclear Engineering at the University of Michigan. At Michigan, he has expanded his research in the development of advanced methods for numerically simulating radiation interactions with matter. His recent work includes photon and electron transport problems in medical physics, and merging deterministic and Monte Carlo methodologies. In 1988 he was made a Fellow of the American Nuclear Society; in 1994 he was awarded the US Department of Energy’s E.O. Lawrence Award for his innovative contributions to nuclear technology; in 1996 he won the Arthur Holly Compton Award of the American Nuclear Society in recognition of his outstanding contributions to education in the nuclear field; and in 1999 he won the Special Recognition Award from the Mathematics and Computations Division of the American Nuclear Society. Larsen is the author of over 150 scholarly journal articles on the mathematical analysis and computer simulation of radiation transport, and he has graduated 28 Ph.D. students who have worked in this technical area.

Professor Jim E. Morel received a B.S. in mathematics from Louisiana State University in 1972, an M.S. in nuclear engineering from Louisiana State University in 1974, and a Ph.D. in nuclear engineering from the University of New Mexico in 1979. He began his career in 1974 as a nuclear research officer at the Air Force Weapons Laboratory. In 1976 he became a staff scientist at Sandia National Laboratories. In 1984 he became a staff member at Los Alamos National Laboratory, eventually serving as a group leader and scientific advisor. In 2005 he accepted a professorship in the Department of Nuclear Engineering at Texas A&M University. He has published over 140 refereed articles in journals and conference proceedings in the areas of neutron transport, coupled electron–photon transport, thermal radiation transport, and radiation-hydrodynamics.

Chapter 2
Second-Order Neutron Transport Methods

E.E. Lewis

2.1 Introduction

Among the approaches to obtaining numerical solutions for neutral particle transport problems, those classified as second-order or even-parity methods have found increased use in recent decades. First-order and second-order methods differ in a number of respects. Following discretization of the energy variable, invariably through some form of the multigroup approximation, the time-independent forms of both are differential in the spatial variable and integral in angle. They differ in that the more conventional first-order equation includes only first derivatives in the spatial variables, but requires solution over the entire angular domain. Conversely, the second-order form includes second derivatives but requires solution over one half of the angular domain. The two forms in turn lead to contrasting approaches to reducing the differential–integral equations to sets of linear equations and in the formulation of iterative methods suitable for the numerical solution of large engineering design problems.
In what follows, we explore the state of methods used to solve the second-order transport equation, comparing them, where possible, to first-order methods.

Historically, diffusion theory was the earliest and remains the most widely employed second-order computational method in reactor core design. Fine mesh spatial discretization has been performed through a variety of finite difference [1] and finite element techniques [2–4]. Along with the continuing advance of computing power, two events greatly influenced the development of methods that are emphasized in this work. The first was the translation in 1962 of Vladimirov’s variational formulation of the second-order form of the transport equation [5]. There followed increased interest in variational formulations and the systematic exploration of alternate variational principles and their interconnectedness [6]. The second was the rise of finite element methods, first in computational mechanics but then spreading to other fields [7–9]. Owing to their power, finite element methods largely supplanted earlier finite difference techniques, not only in computational mechanics but also in a number of related fields. Since finite element methods in computational mechanics were closely associated with variational principles, the method’s application to transport problems was greatly facilitated by the variational framework that had been developed for the Boltzmann equation. The first applications of finite element methods to neutronics problems appeared in the early 1970s and, not surprisingly, were made to the diffusion approximation [2–4].

Author affiliation: E.E. Lewis, Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA; e-mail: e-lewis@northwestern.edu. In: Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_2.
But soon thereafter finite element methods appeared coupled with spherical harmonics [10–14], discrete ordinates [14–20], and also with finite elements in angle [22, 23]. Through the 1970s and into the 1980s, however, second-order transport methods were largely limited to stand-alone codes applied to model problems. At that time, computer memory restrictions continued to favor the very effective marching algorithms with which first-order discrete ordinates methods were able to solve large engineering problems with very limited memory [24, 25]. Only with the rapid expansion of computer memories since that time have second-order methods, and production codes based on them, become a viable alternative for large-scale engineering computations [11–14, 17–21, 26, 27].

To present the current state of even-parity computational methods, we begin in Section 2.2 by deriving the second-order form of the linear Boltzmann equation from its first-order counterpart. We then examine the weak and variational forms of these equations that are used almost without exception as the points of departure for numerical approximation. For brevity and clarity, isotropic scattering is assumed. Section 2.3 reduces these forms to the diffusion approximation and then discretizes the spatial variables. The simplicity of diffusion theory provides a convenient vehicle for introducing finite element methods, and for establishing notation that then carries over for use in conjunction with more accurate angular approximations. Following the inclusion of anisotropic scattering into the even-parity formulation, Section 2.4 treats angular approximations: spherical harmonics and discrete ordinates, as well as their simplified forms. The section then combines these angular approximations with finite elements in space to obtain fully discretized forms of the even-parity equation.
Section 2.5 examines two variants of even-parity approaches: a hybrid or nodal method and an integral even-parity method. Section 2.6 then offers concluding remarks.

2.2 The Transport Equation

The numerical algorithms of the following sections are based primarily on the variational form of the even-parity transport equation. We begin, however, with the standard first-order form of the equation and proceed to arrive at the variational principle. Along the way, we also present both the mixed and the second-order even- and odd-parity equations. Before proceeding to the variational formulation, we examine briefly the weak form of the even-parity equation and of the mixed equations in their primal and dual weak forms. The even-parity weak form may be used interchangeably with the variational formulation. Mixed methods are less well developed but are included because of the future potential that they may hold.

In what follows, we assume the time-independent form of the neutron transport equation in which a multigroup discretization of the energy variable has already been performed. Thus, we need to deal only with the energy-independent or within-group form of the equation. For brevity, we also assume isotropic scattering in this section, reserving the introduction of anisotropic scattering to Section 2.4.

2.2.1 First- and Second-Order Forms

The first-order form of the transport equation is [28]

$$\hat\Omega\cdot\vec\nabla\psi(\vec r,\hat\Omega) + \sigma(\vec r)\,\psi(\vec r,\hat\Omega) = \sigma_s(\vec r)\int d\Omega'\,\psi(\vec r,\hat\Omega') + s(\vec r), \qquad \vec r \in V, \tag{2.1}$$

where $\vec r$ is the spatial location, and $\hat\Omega$ is the direction of neutron travel; $\sigma$ and $\sigma_s$ are the macroscopic total and scattering cross sections, respectively, $s$ is the group source, and we normalize the angular integrals such that $\int d\Omega = 1$. Integrating over angle yields the neutron conservation equation

$$\vec\nabla\cdot\vec J(\vec r) + \sigma_r(\vec r)\,\phi(\vec r) = s(\vec r), \tag{2.2}$$

where

$$\phi(\vec r) = \int d\Omega\,\psi(\vec r,\hat\Omega) \tag{2.3}$$

and

$$\vec J(\vec r) = \int d\Omega\,\hat\Omega\,\psi(\vec r,\hat\Omega) \tag{2.4}$$

are the scalar flux and current, respectively, and

$$\sigma_r = \sigma - \sigma_s. \tag{2.5}$$

Defining the even- and odd-parity flux components as

$$\psi^{\pm}(\vec r,\hat\Omega) = \tfrac{1}{2}\left[\psi(\vec r,\hat\Omega) \pm \psi(\vec r,-\hat\Omega)\right], \tag{2.6}$$

we may evaluate Eq. (2.1) at $\hat\Omega$ and $-\hat\Omega$ and add and subtract the results to obtain a set of coupled first-order equations

$$\hat\Omega\cdot\vec\nabla\psi^- + \sigma\psi^+ = \sigma_s\phi + s \tag{2.7}$$

and

$$\hat\Omega\cdot\vec\nabla\psi^+ + \sigma\psi^- = 0. \tag{2.8}$$

A pair of second-order equations may then be obtained. Eliminating $\psi^-$ between Eqs. (2.7) and (2.8) yields the even-parity equation

$$-\hat\Omega\cdot\vec\nabla\,\sigma^{-1}\,\hat\Omega\cdot\vec\nabla\psi^+ + \sigma\psi^+ = \sigma_s\phi + s, \tag{2.9}$$

while eliminating $\psi^+$ yields the odd-parity equation

$$-\hat\Omega\cdot\vec\nabla\,\sigma^{-1}\,\hat\Omega\cdot\vec\nabla\psi^- + \sigma\psi^- = \hat\Omega\cdot\vec\nabla\,(\sigma-\sigma_s)^{-1}\left[(\sigma_s/\sigma)\,\vec\nabla\cdot\vec J - s\right]. \tag{2.10}$$

These two equations contain only $\psi^+$ and $\psi^-$, respectively, since the scalar flux and current may be expressed as

$$\phi = \int d\Omega\,\psi^+ \tag{2.11}$$

and

$$\vec J = \int d\Omega\,\hat\Omega\,\psi^-. \tag{2.12}$$

Finally, we may also write a second-order equation for the angular flux by first solving Eq. (2.1) for $\psi = \sigma^{-1}\left(-\hat\Omega\cdot\vec\nabla\psi + \sigma_s\phi + s\right)$ and substituting into the streaming term of Eq. (2.1) to obtain

$$-\hat\Omega\cdot\vec\nabla\,\sigma^{-1}\,\hat\Omega\cdot\vec\nabla\psi + \sigma\psi = \left(1 - \hat\Omega\cdot\vec\nabla\,\sigma^{-1}\right)\left(\sigma_s\phi + s\right). \tag{2.13}$$

2.2.2 Weak Forms

The starting point for most space–angle approximations is the weak form of the foregoing equations. To obtain the weak forms of the coupled equations (2.7) and (2.8) we multiply the two by even- and odd-parity test functions, $\tilde\psi^+$ and $\tilde\psi^-$, respectively, and integrate over angle and space:

$$\int dV \int d\Omega\;\tilde\psi^+\left[\hat\Omega\cdot\vec\nabla\psi^- + \sigma\psi^+ - \sigma_s\phi - s\right] = 0, \tag{2.14}$$

$$\int dV \int d\Omega\;\tilde\psi^-\left[\hat\Omega\cdot\vec\nabla\psi^+ + \sigma\psi^-\right] = 0. \tag{2.15}$$

Applying the divergence theorem to the first term of the two equations yields

$$\int dV \int d\Omega\left[-(\hat\Omega\cdot\vec\nabla\tilde\psi^+)\,\psi^- + \sigma\tilde\psi^+\psi^+ - \tilde\psi^+(\sigma_s\phi + s)\right] + \int d\Gamma \int d\Omega\;\hat\Omega\cdot\hat n\;\tilde\psi^+\psi^- = 0 \tag{2.16}$$

and

$$\int dV \int d\Omega\left[-(\hat\Omega\cdot\vec\nabla\tilde\psi^-)\,\psi^+ + \sigma\tilde\psi^-\psi^-\right] + \int d\Gamma \int d\Omega\;\hat\Omega\cdot\hat n\;\tilde\psi^-\psi^+ = 0, \tag{2.17}$$

where $\int d\Gamma$ indicates integration over the surface $\Gamma$ of the spatial domain $V$, and $\hat n$ is the outward normal to $\Gamma$. Two classes of methods arise from these equations [8, 19]. Equation (2.16) is the basis for primal methods, and Eqs. (2.16) and (2.15) taken together constitute the mixed-primal formulation. Alternately, Eq.
(2.8) may be used to eliminate from the volume integral to obtain the second-order even-parity primal formulation: Z Z h * * O r Q C /. O r C/ C Q C d 1 . Z Z O nO Q C D 0 C d d dV C Q .s C s/ i (2.18) This even-parity result may also be obtained directly by multiplying Eq. (2.9) by Q C , integrating over space and angle, and then applying the divergence theorem. Equation (2.17) is the basis for dual methods. Taken together, Eqs. (2.17) and (2.14) constitute the mixed dual formulation. Alternately, Eq. (2.7) may be used to eliminate C from the volume integral in Eq. (2.17) to form the second-order oddparity dual formulation: Z Z * * O r Q /. O r d 1 . /C Q Z i Z * * * * Q h 1 O nO Q . s / .r J / .s = / r J s C d d dV C D0 (2.19) Multiplying Eq. (2.10) by Q , integrating over space and angle and then applying the divergence theorem also leads to this result. The surface terms in the above equations determine the form of the boundary conditions [9, 28]. Setting them equal to zero yields the homogeneous form of the * natural boundary conditions. Thus D 0, and C D 0 for r 2 are the natural boundary conditions, respectively, in the primal and dual formulations. Inhomogeneous natural conditions may be obtained by setting D , in the primal and C D C , in the dual formulation; hereafter, we employ the subscript to indicate a known function on the boundary. Conversely, if the boundary conditions specify C in the primal or in the dual formulations, they are essential; the surface integral is deleted, and they are imposed directly on the solution. Classically, it is the incoming angular flux that is known on the boundary. Hence C O C . r ; / * O . r ; / * * O D 0; * O < 0; r 2 ; nO . r ; / (2.20) 90 E.E. Lewis * O D 0, * O < 0 is the and in particular the vacuum boundary . r ; / r 2 , nO homogeneous form of this condition. These conditions may be treated as modified natural conditions by first forming weighted residuals Z 2 Z d O nO Q ˙ . 
d C C / D0 (2.21) n O O <0 for primal and dual formulations, respectively. By making use of angular parity arguments, it can be written Z Z Z Z Z O nO Q ˙ D d dj O nj d d O Q ˙ ˙ C 2 d Z O nO Q ˙ : d (2.22) n O O <0 By employing this expression to eliminate the surface integral on the left of Eq. (2.22) from the primal or dual formulations in Eq. (2.16) or (2.17), modified natural boundary conditions may be obtained for a known incoming flux distribu* O 0 / D .* O r ; /, tion. Reflected boundary conditions, which have the form . r ; 0 O O where and are the angles of incidence and reflection are essential since they apply directly to , with no contribution from its derivatives. Until quite recently, the vast majority of second-order transport work has applied to the even-parity form of the equation given by Eq. (2.9), whose weak form is given by Eq. (2.18). Thus, we focus our attention on it. With the incoming flux boundary conditions, Eq. (2.18) becomes Z Z i h * * O r Q C /. O r C / C Q C C Q .s C s/ dV d 1 . Z Z Z Z O nj O nO Q C D 0; C d dj O Q C C C 2 d d n O O <0 (2.23) where for vacuum boundary conditions we set D 0. For most computations, the problem domain is bounded by some combinations of reflected and vacuum boundaries. We indicate this by dividing the boundary as D r C . Then, instead of Eq. (2.23), we have Z Z h i * * O r Q C /. O r C / C Q C C Q .s C s/ dV d 1 . Z Z O nj C d dj O Q C C D 0; (2.24) since no surface integral appears for r , where the essential reflected conditions are applied. 2 Second-Order Neutron Transport Methods 91 2.2.3 Variational Formulation Both coupled and second-order forms of the primal and dual formulations may also be formulated as variational principles, and surface terms are added to impose modified natural boundary conditions. Here, we restrict attention to the primal second-order form, since it was the original variational principal of Vladimirov [5] and is that upon which virtually all second-order methods are based. 
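The variational principle corresponding to Eq. (2.24) derives from the requirement that the following functional be stationary with respect to variations in $\psi^{+}$:

$$F[\psi^{+}] = \int dV\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\psi^{+})^{2} + \sigma\psi^{+2} - \sigma_s\phi^{2} - 2\psi^{+}s\right] + \int_{\Gamma_v} d\Gamma\int d\Omega\;|\hat{\Omega}\cdot\hat{n}|\,\psi^{+2}. \tag{2.25}$$

To illustrate, suppose we make the substitution $\psi^{+}\to\psi^{+} + \delta\tilde{\psi}^{+}$, where $\delta$ is a very small number and $\tilde{\psi}^{+}$ is an arbitrary deviation about the true solution $\psi^{+}$. Writing $\tilde{\phi} = \int d\Omega\,\tilde{\psi}^{+}$, we then have

$$\begin{aligned}
F[\psi^{+} + \delta\tilde{\psi}^{+}] ={}& \int dV\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\psi^{+})^{2} + \sigma\psi^{+2} - \sigma_s\phi^{2} - 2\psi^{+}s\right] + \int_{\Gamma_v} d\Gamma\int d\Omega\;|\hat{\Omega}\cdot\hat{n}|\,\psi^{+2} \\
&+ 2\delta\left\{\int dV\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\tilde{\psi}^{+})(\hat{\Omega}\cdot\vec{\nabla}\psi^{+}) + \sigma\tilde{\psi}^{+}\psi^{+} - \tilde{\psi}^{+}(\sigma_s\phi + s)\right] + \int_{\Gamma_v} d\Gamma\int d\Omega\;|\hat{\Omega}\cdot\hat{n}|\,\tilde{\psi}^{+}\psi^{+}\right\} \\
&+ \delta^{2}\left\{\int dV\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\tilde{\psi}^{+})^{2} + \sigma\tilde{\psi}^{+2} - \sigma_s\tilde{\phi}^{2}\right] + \int_{\Gamma_v} d\Gamma\int d\Omega\;|\hat{\Omega}\cdot\hat{n}|\,\tilde{\psi}^{+2}\right\}.
\end{aligned} \tag{2.26}$$

For the functional to be stationary, the linear term in $\delta$ must vanish. This term, however, is identical to Eq. (2.24) derived from the weak form. Applying the divergence theorem to Eq. (2.24), or to the linear term in $\delta$ in Eq. (2.26), yields:

$$\int dV\int d\Omega\;\tilde{\psi}^{+}\left(-\hat{\Omega}\cdot\vec{\nabla}\,\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+} + \sigma\psi^{+} - \sigma_s\phi - s\right) + \int_{\Gamma} d\Gamma\int d\Omega\;\hat{\Omega}\cdot\hat{n}\,\tilde{\psi}^{+}\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+} + \int_{\Gamma_v} d\Gamma\int d\Omega\;|\hat{\Omega}\cdot\hat{n}|\,\tilde{\psi}^{+}\psi^{+} = 0. \tag{2.27}$$

For this expression to hold, both the volume and surface integrals must vanish. Since $\tilde{\psi}^{+}$ is an arbitrary function, the expression in parentheses must vanish. Thus, the even-parity transport equation given by Eq. (2.9) is satisfied. This is the Euler–Lagrange equation. Splitting the boundary into $\Gamma_r$ and $\Gamma_v$ and using parity on the vacuum boundary, the two surface integral terms may be rewritten as

$$\int_{\Gamma_r} d\Gamma\int d\Omega\;\hat{\Omega}\cdot\hat{n}\,\tilde{\psi}^{+}\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+} + 2\int_{\Gamma_v} d\Gamma\int_{\hat{\Omega}\cdot\hat{n}<0} d\Omega\;\hat{\Omega}\cdot\hat{n}\,\tilde{\psi}^{+}\left(\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+} - \psi^{+}\right). \tag{2.28}$$

Since from Eqs. (2.8) and (2.6) we have $\psi^{-} = -\sigma^{-1}\hat{\Omega}\cdot\vec{\nabla}\psi^{+}$ and $\psi = \psi^{+} - \sigma^{-1}\hat{\Omega}\cdot\vec{\nabla}\psi^{+}$, the first term in Eq. (2.28) yields the natural boundary condition $\psi^{-} = 0$. However, for a reflected boundary condition we take $\tilde{\psi}^{+} = 0$ on $\Gamma_r$ and impose the essential boundary condition on the solution. From the second term, the modified natural boundary condition on the vacuum boundary ($\psi = 0$ for $\hat{n}\cdot\hat{\Omega} < 0$ on $\Gamma_v$) is satisfied.

2.3 The Discretized Diffusion Equation

The diffusion approximation may be viewed as arising naturally as the lowest-order angular approximation to the even-parity equation.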
Indeed a straightforward exercise is to eliminate the angular variables from the weak or variational forms discussed in Section 2.2 by using the diffusion approximation. Here, a simple statement of the diffusion equation in variational form serves as a vehicle for introducing the finite element method. Finite element notation established for use in conjunction with the diffusion equation then allows for a more straightforward discretization of the spatial variable in conjunction with the higher order angular approximations treated in Section 2.4.

2.3.1 The Diffusion Formulation

The diffusion approximation may be obtained by making the approximations $\psi^{+}\approx\phi$ and $\psi^{-}\approx 3\hat{\Omega}\cdot\vec{J}$ and then operating on Eqs. (2.7) and (2.8) with $\int d\Omega$ and $\int d\Omega\,\hat{\Omega}$, respectively, to obtain

$$\vec{\nabla}\cdot\vec{J} + \sigma_r\phi = s \tag{2.29}$$

and

$$\tfrac{1}{3}\vec{\nabla}\phi + \sigma\vec{J} = 0. \tag{2.30}$$

The second-order form of the diffusion equation is obtained by eliminating $\vec{J}$ between Eqs. (2.29) and (2.30):

$$-\vec{\nabla}\cdot(3\sigma)^{-1}\vec{\nabla}\phi + (\sigma - \sigma_s)\phi = s. \tag{2.31}$$

Alternately, we may obtain this equation directly from the even-parity transport equation, or from its variational form. Let

$$\psi^{+}\approx\phi. \tag{2.32}$$

Then operating on Eq. (2.9) with $\int d\Omega$ yields Eq. (2.31). The approximation $\psi^{+}\approx\phi$ likewise reduces the even-parity functional $F[\psi^{+}]$, given by Eq. (2.25), to its diffusion form

$$F[\phi] = \int dV\left[\tfrac{1}{3}\sigma^{-1}(\vec{\nabla}\phi)^{2} + \sigma_r\phi^{2} - 2\phi s\right] + \tfrac{1}{2}\int_{\Gamma_v} d\Gamma\,\phi^{2}. \tag{2.33}$$

The diffusion equation can be shown to be the Euler–Lagrange equation by letting $\phi\to\phi + \delta\tilde{\phi}$, where $\tilde{\phi}$ is an arbitrary deviation from the solution, and requiring the linear term in $\delta$ to vanish:

$$\int dV\left[\tfrac{1}{3}\sigma^{-1}(\vec{\nabla}\tilde{\phi})\cdot(\vec{\nabla}\phi) + \sigma_r\tilde{\phi}\phi - \tilde{\phi}s\right] + \tfrac{1}{2}\int_{\Gamma_v} d\Gamma\,\tilde{\phi}\phi = 0. \tag{2.34}$$

This is identical to the diffusion approximation's weak form obtainable by inserting Eq. (2.32) into Eq. (2.27). If we apply the divergence theorem to Eq. (2.34), we obtain

$$\int dV\,\tilde{\phi}\left(-\vec{\nabla}\cdot\tfrac{1}{3}\sigma^{-1}\vec{\nabla}\phi + \sigma_r\phi - s\right) + \int_{\Gamma_r} d\Gamma\,\tilde{\phi}\,\tfrac{1}{3}\sigma^{-1}\,\hat{n}\cdot\vec{\nabla}\phi + \int_{\Gamma_v} d\Gamma\,\tilde{\phi}\left(\tfrac{1}{3}\sigma^{-1}\,\hat{n}\cdot\vec{\nabla}\phi + \tfrac{1}{2}\phi\right) = 0. \tag{2.35}$$

For arbitrary $\tilde{\phi}$, the quantity in parentheses in the first term must vanish, yielding Eq. (2.31) as the Euler–Lagrange equation. The surface integral over $\Gamma_r$ yields the reflected or zero current condition $\hat{n}\cdot\vec{\nabla}\phi = 0$ as the natural boundary condition, and in diffusion theory this corresponds to the reflected boundary condition. (Recall, however, that for the transport equation more generally reflective boundary conditions are essential.) Requiring the quantity in parentheses in the last term to vanish yields the modified natural boundary condition; it approximates the vacuum condition by extrapolating the flux to zero at a distance of $\tfrac{2}{3}\sigma^{-1}$ outside of $\Gamma_v$.

2.3.2 Finite Element Discretization

We discretize the spatial variable employing the finite element method. For clarity, we illustrate using simple triangular elements in two-dimensional Cartesian geometry [7]. We first divide the problem domain into elemental volumes $V_e$:

$$V = \sum_e V_e. \tag{2.36}$$

In two dimensions, these may appear, for example, as the triangles in Fig. 2.1. Within each element, we approximate $\phi(\vec{r})$ as the scalar product of a vector of polynomial trial functions, $\mathbf{n}_e(\vec{r})$, and an unknown coefficient vector, $\boldsymbol{\xi}_e$:

$$\phi(\vec{r})\approx\mathbf{n}_e^{T}(\vec{r})\,\boldsymbol{\xi}_e, \qquad \vec{r}\in V_e. \tag{2.37}$$

Representative linear and quadratic trial functions for triangular elements, for which the length of $\mathbf{n}_e(\vec{r})$ is 3 for the linear element and 6 for the quadratic element, respectively, are shown in Fig. 2.2. (These and other classes of finite elements are discussed in detail elsewhere [7].)

[Fig. 2.1 Triangular finite element grids (panels: linear elements, quadratic elements)]

[Fig. 2.2 Triangular finite element trial functions $\mathbf{n}_e(\vec{r})$ (panels: linear, quadratic)]

They have the properties that $[\mathbf{n}_e(\vec{r}_i)]_j = \delta_{ij}$, so that the components of $\boldsymbol{\xi}_e$ are equal to $\phi$ at $\vec{r}_i$, $i = 1, \ldots, 3$ or $6$, the mesh points shown in Fig. 2.2. Moreover, if the same sets of trial functions are used in adjoining elements, the flux approximation will be continuous across the interfaces.
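The interpolation property of the linear trial functions can be checked with a few lines of code. The following is a minimal sketch in Python, assuming a single linear triangle described by its three vertex coordinates; the function names `linear_trial_functions` and `interpolate` and the sample triangle are hypothetical and are not taken from the chapter or from any particular finite element code.

```python
def linear_trial_functions(vertices, point):
    """Return [n_1, n_2, n_3] for a linear triangle, evaluated at `point`.

    `vertices` is a list of three (x, y) tuples. Each n_j is the ratio of
    the signed area of the sub-triangle opposite vertex j to the signed
    area of the whole triangle (barycentric coordinates).
    """
    (x1, y1), (x2, y2), (x3, y3) = vertices
    x, y = point
    # Twice the signed area of the full triangle.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    n1 = ((x2 - x) * (y3 - y) - (x3 - x) * (y2 - y)) / det
    n2 = ((x3 - x) * (y1 - y) - (x1 - x) * (y3 - y)) / det
    n3 = ((x1 - x) * (y2 - y) - (x2 - x) * (y1 - y)) / det
    return [n1, n2, n3]


def interpolate(vertices, nodal_values, point):
    """Approximate the flux at `point` as the scalar product n_e^T xi_e."""
    n = linear_trial_functions(vertices, point)
    return sum(n_j * xi_j for n_j, xi_j in zip(n, nodal_values))


# Hypothetical element used only for illustration.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]

# Evaluating n_e at each vertex in turn should reproduce the identity
# matrix, which is the Kronecker-delta property quoted in the text.
kronecker = [linear_trial_functions(triangle, v) for v in triangle]

# A flux that is linear in x and y is reproduced exactly by linear elements.
flux = lambda x, y: 1.0 + 2.0 * x + 3.0 * y
nodal_values = [flux(*v) for v in triangle]
approx = interpolate(triangle, nodal_values, (0.5, 0.25))
```

Because the trial functions sum to one everywhere in the element, the nodal coefficients can be read directly as flux values at the mesh points, which is what makes the continuity argument for adjoining elements work.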
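Provided continuity is maintained, the functional given by Eq. (2.33) may be written as the sum of elemental contributions,

$$F[\boldsymbol{\xi}_e] = \sum_e \boldsymbol{\xi}_e^{T}\mathbf{A}_e\boldsymbol{\xi}_e - 2\sum_e \boldsymbol{\xi}_e^{T}\mathbf{q}_e, \tag{2.38}$$

where the elemental matrices involve integrals of the known trial functions,

$$\mathbf{A}_e = \tfrac{1}{3}\sigma_e^{-1}\int_e dV\,\vec{\nabla}\mathbf{n}_e\cdot\vec{\nabla}\mathbf{n}_e^{T} + \sigma_{re}\int_e dV\,\mathbf{n}_e\mathbf{n}_e^{T} + \tfrac{1}{2}\int_{ve} d\Gamma\,\mathbf{n}_e\mathbf{n}_e^{T}, \tag{2.39}$$

with the last term present only where one or more of the element surfaces lie along a vacuum boundary; the source term is given by

$$\mathbf{q}_e = \int_e dV\,\mathbf{n}_e\,s. \tag{2.40}$$

We may formally assemble the global functional from its elemental contributions as follows. The continuity of the trial functions across element interfaces allows the elemental unknowns to be mapped onto a globally numbered vector of $N$ unknowns:

$$\boldsymbol{\xi}_e = \boldsymbol{\Xi}_e\boldsymbol{\xi}, \tag{2.41}$$

where the $\boldsymbol{\Xi}_e$ are $3\times N$ or $6\times N$ Boolean transformation matrices. Inserting this equation into Eq. (2.38), we obtain the algebraic functional

$$F[\boldsymbol{\xi}] = \boldsymbol{\xi}^{T}\mathbf{A}\boldsymbol{\xi} - 2\boldsymbol{\xi}^{T}\mathbf{q}, \tag{2.42}$$

where

$$\mathbf{A} = \sum_e \boldsymbol{\Xi}_e^{T}\mathbf{A}_e\boldsymbol{\Xi}_e \tag{2.43}$$

and

$$\mathbf{q} = \sum_e \boldsymbol{\Xi}_e^{T}\mathbf{q}_e. \tag{2.44}$$

We obtain a set of linear equations for $\boldsymbol{\xi}$ by requiring the reduced functional $F[\boldsymbol{\xi}]$ to be stationary with respect to variations $\boldsymbol{\xi}\to\boldsymbol{\xi} + \delta\tilde{\boldsymbol{\xi}}$. Hence

$$F[\boldsymbol{\xi} + \delta\tilde{\boldsymbol{\xi}}] = \left(\boldsymbol{\xi}^{T}\mathbf{A}\boldsymbol{\xi} - 2\boldsymbol{\xi}^{T}\mathbf{q}\right) + 2\delta\,\tilde{\boldsymbol{\xi}}^{T}(\mathbf{A}\boldsymbol{\xi} - \mathbf{q}) + \delta^{2}\,\tilde{\boldsymbol{\xi}}^{T}\mathbf{A}\tilde{\boldsymbol{\xi}}, \tag{2.45}$$

and since the linear terms in $\delta$ must vanish we have

$$\mathbf{A}\boldsymbol{\xi} = \mathbf{q}. \tag{2.46}$$

Here, $\mathbf{A}$ is a symmetric matrix. Provided a consistent numbering scheme has been applied to the mesh points, the matrix will also be banded. A solution may be obtained using any number of direct methods for smaller problems, or iterative techniques such as preconditioned conjugate gradients for large problems [29–31].

2.4 The Discretized Transport Equation

In this section, we first write Eq. (2.1) with anisotropic scattering included, and then derive the corresponding form of the even-parity equation as well as stating the variational principle that provides the starting point for numerical approximation.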
We then derive spherical harmonic and discrete ordinates angular approximations (referred to frequently as Pn and Sn methods, respectively) along with their simplified spherical harmonic and discrete ordinates (SPn and SSn) counterparts. For each approximation, the result is a coupled set of second-order partial differential equations. The sets differ only in the structure of their coefficient matrices. Thus, we can apply the finite element methods discussed in Section 2.3 to discretize the spatial variables for all of the approximations simultaneously.

2.4.1 Anisotropic Scattering

With anisotropic scattering included, the form of the second-order equation becomes more complex [15, 17, 32]. To obtain it, we begin with the first-order form of the equation:

$$\hat{\Omega}\cdot\vec{\nabla}\psi(\vec{r},\hat{\Omega}) + \sigma(\vec{r})\,\psi(\vec{r},\hat{\Omega}) = \int d\Omega'\,\sigma_s(\vec{r},\hat{\Omega}\cdot\hat{\Omega}')\,\psi(\vec{r},\hat{\Omega}') + s(\vec{r},\hat{\Omega}), \tag{2.47}$$

where $\sigma_s(\vec{r},\hat{\Omega}\cdot\hat{\Omega}')$ is the macroscopic differential scattering cross section. In the multigroup approximation the group source is given by

$$s(\vec{r},\hat{\Omega}) = \sum_{g'\neq g}\int d\Omega'\,\sigma_{sgg'}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}')\,\psi_{g'}(\vec{r},\hat{\Omega}') + S_{fg}(\vec{r}), \tag{2.48}$$

where $g$ designates the group under consideration, $\sigma_{sgg'}$ represents scattering from group $g'$ to $g$, and $S_{fg}$ includes the isotropic fission and external sources.

We may evaluate Eq. (2.47) at $\hat{\Omega}$ and $-\hat{\Omega}$ and add and subtract the results to obtain a pair of coupled first-order equations

$$\hat{\Omega}\cdot\vec{\nabla}\psi^{\mp}(\vec{r},\hat{\Omega}) + \sigma(\vec{r})\,\psi^{\pm}(\vec{r},\hat{\Omega}) = \int d\Omega'\,\sigma_s^{\pm}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}')\,\psi^{\pm}(\vec{r},\hat{\Omega}') + s^{\pm}(\vec{r},\hat{\Omega}), \tag{2.49}$$

where

$$\sigma_s^{\pm}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}') = \tfrac{1}{2}\left[\sigma_s(\vec{r},\hat{\Omega}\cdot\hat{\Omega}') \pm \sigma_s(\vec{r},-\hat{\Omega}\cdot\hat{\Omega}')\right] \tag{2.50}$$

and

$$s^{\pm}(\vec{r},\hat{\Omega}) = \tfrac{1}{2}\left[s(\vec{r},\hat{\Omega}) \pm s(\vec{r},-\hat{\Omega})\right]. \tag{2.51}$$

Here, $\sigma_s^{\pm}$ and $s^{\pm}$ indicate the even- and odd-parity components of the anisotropic scattering cross section and source, respectively.
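The parity splitting of Eqs. (2.50) and (2.51) is easy to verify numerically. The sketch below is a minimal Python illustration, assuming a scattering kernel that depends only on the scattering cosine $\mu_0 = \hat{\Omega}\cdot\hat{\Omega}'$ and is given as a short Legendre series; the coefficient values and the names `legendre`, `kernel`, and `parity_parts` are hypothetical. Because $P_l(-\mu_0) = (-1)^l P_l(\mu_0)$, the even part should retain only the even-order terms and the odd part only the odd-order terms.

```python
def legendre(l, mu):
    """Legendre polynomial P_l(mu) from the three-term recurrence."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def kernel(mu0, coeffs):
    """Evaluate sum_l coeffs[l] * P_l(mu0); a stand-in for sigma_s(mu0)."""
    return sum(c * legendre(l, mu0) for l, c in enumerate(coeffs))


def parity_parts(f, mu0):
    """Return the even and odd parts of f at mu0, as in Eq. (2.50)."""
    even = 0.5 * (f(mu0) + f(-mu0))
    odd = 0.5 * (f(mu0) - f(-mu0))
    return even, odd


# Hypothetical Legendre coefficients for orders l = 0, 1, 2, 3.
coeffs = [1.0, 0.6, 0.3, 0.1]
sigma_s = lambda mu0: kernel(mu0, coeffs)

even, odd = parity_parts(sigma_s, 0.3)
# The even part matches the series with the odd orders zeroed out,
# and the odd part matches the series with the even orders zeroed out.
even_only = kernel(0.3, [1.0, 0.0, 0.3, 0.0])
odd_only = kernel(0.3, [0.0, 0.6, 0.0, 0.1])
```

The same splitting applies to the source $s(\vec{r},\hat{\Omega})$, with $\hat{\Omega}\to-\hat{\Omega}$ playing the role of $\mu_0\to-\mu_0$.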
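We next express the anisotropic scattering cross sections as expansions in an orthonormal set of spherical harmonics [28, 33]

$$Y_{lm}(\hat{\Omega}) = C_{lm}\,P_l^{|m|}(\mu)\left\{\begin{matrix}\cos(m\omega)\\ \sin(|m|\omega)\end{matrix}\right\}, \qquad l = 0, 1, 2, \ldots;\quad |m| = 0, 1, 2, \ldots, l, \tag{2.52}$$

where $P_l^{m}(\mu)$ are the associated Legendre polynomials, and the $C_{lm}$ are chosen such that $\int d\Omega\,Y_{lm}(\hat{\Omega})\,Y_{l'm'}(\hat{\Omega}) = \delta_{ll'}\delta_{mm'}$; we follow the convention that $m\geq 0$ signifies the cosine series and $m<0$ the sine series. We first write $\sigma_s^{\pm}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}')$ as expansions in Legendre polynomials

$$\sigma_s^{\pm}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}') = \sum_{l\ \mathrm{even/odd}} (2l+1)\,\sigma_l(\vec{r})\,P_l(\hat{\Omega}\cdot\hat{\Omega}'). \tag{2.53}$$

The Legendre addition theorem [33], written for harmonics normalized as above,

$$(2l+1)\,P_l(\hat{\Omega}\cdot\hat{\Omega}') = \sum_{m=-l}^{l} Y_{lm}(\hat{\Omega})\,Y_{lm}(\hat{\Omega}'), \tag{2.54}$$

then allows Eq. (2.53) to be written in the convenient vector form

$$\sigma_s^{\pm}(\vec{r},\hat{\Omega}\cdot\hat{\Omega}') = \mathbf{y}_{\pm}^{T}(\hat{\Omega})\,\boldsymbol{\Sigma}_{\pm}(\vec{r})\,\mathbf{y}_{\pm}(\hat{\Omega}'), \tag{2.55}$$

where we have formed vectors of the even- and odd-parity spherical harmonics

$$\mathbf{y}_{+}^{T}(\hat{\Omega}) = [Y_{00}, Y_{2,-2}, Y_{2,-1}, Y_{20}, Y_{21}, Y_{22}, Y_{4,-4}, \ldots], \tag{2.56}$$

$$\mathbf{y}_{-}^{T}(\hat{\Omega}) = [Y_{1,-1}, Y_{10}, Y_{11}, Y_{3,-3}, Y_{3,-2}, Y_{3,-1}, Y_{30}, Y_{31}, \ldots], \tag{2.57}$$

with $\int d\Omega\,\mathbf{y}_{\pm}(\hat{\Omega})\,\mathbf{y}_{\pm}^{T}(\hat{\Omega}) = \mathbf{I}$ and $\int d\Omega\,\mathbf{y}_{\pm}(\hat{\Omega})\,\mathbf{y}_{\mp}^{T}(\hat{\Omega}) = \mathbf{0}$. The diagonal matrices $\boldsymbol{\Sigma}_{\pm}$ are defined by

$$\boldsymbol{\Sigma}_{+} = \mathrm{diag}[\sigma_0, \sigma_2, \sigma_2, \sigma_2, \sigma_2, \sigma_2, \sigma_4, \ldots], \tag{2.58}$$

$$\boldsymbol{\Sigma}_{-} = \mathrm{diag}[\sigma_1, \sigma_1, \sigma_1, \sigma_3, \sigma_3, \sigma_3, \sigma_3, \ldots]. \tag{2.59}$$

The coupled pair of even- and odd-parity equations given by Eq. (2.49) then becomes

$$\hat{\Omega}\cdot\vec{\nabla}\psi^{-}(\vec{r},\hat{\Omega}) + \sigma(\vec{r})\,\psi^{+}(\vec{r},\hat{\Omega}) = \mathbf{y}_{+}^{T}(\hat{\Omega})\,\boldsymbol{\Sigma}_{+}(\vec{r})\int d\Omega'\,\mathbf{y}_{+}(\hat{\Omega}')\,\psi^{+}(\vec{r},\hat{\Omega}') + s^{+}(\vec{r},\hat{\Omega}) \tag{2.60}$$

and

$$\hat{\Omega}\cdot\vec{\nabla}\psi^{+}(\vec{r},\hat{\Omega}) + \sigma(\vec{r})\,\psi^{-}(\vec{r},\hat{\Omega}) = \mathbf{y}_{-}^{T}(\hat{\Omega})\,\boldsymbol{\Sigma}_{-}(\vec{r})\int d\Omega'\,\mathbf{y}_{-}(\hat{\Omega}')\,\psi^{-}(\vec{r},\hat{\Omega}') + s^{-}(\vec{r},\hat{\Omega}). \tag{2.61}$$

To obtain the even-parity equation, we employ Eq. (2.61) to obtain the odd-parity moments in terms of the even-parity moments:

$$\int d\Omega\,\mathbf{y}_{-}(\hat{\Omega})\,\psi^{-}(\hat{\Omega}) = \sigma^{-1}\left[\mathbf{I} - \sigma^{-1}\boldsymbol{\Sigma}_{-}\right]^{-1}\left[\mathbf{s}^{-} - \int d\Omega\,\mathbf{y}_{-}(\hat{\Omega})\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}(\hat{\Omega})\right], \tag{2.62}$$

where we define the even- and odd-parity group-source moments as

$$\mathbf{s}^{\pm}(\vec{r}) = \int d\Omega\,\mathbf{y}_{\pm}(\hat{\Omega})\,s^{\pm}(\vec{r},\hat{\Omega}). \tag{2.63}$$

Using Eq. (2.62) to eliminate the integral of $\psi^{-}$ on the right side of Eq. (2.61), we may solve for $\psi^{-}$ in terms of $\psi^{+}$ and the group-source terms:

$$\psi^{-}(\hat{\Omega}) = -\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}(\hat{\Omega}) + \sigma^{-1}s^{-}(\hat{\Omega}) - \sigma^{-2}\,\mathbf{y}_{-}^{T}(\hat{\Omega})\,\tilde{\boldsymbol{\Sigma}}_{-}\int d\Omega'\,\mathbf{y}_{-}(\hat{\Omega}')\,\hat{\Omega}'\cdot\vec{\nabla}\psi^{+}(\hat{\Omega}') + \sigma^{-2}\,\mathbf{y}_{-}^{T}(\hat{\Omega})\,\tilde{\boldsymbol{\Sigma}}_{-}\,\mathbf{s}^{-}, \tag{2.64}$$

where

$$\tilde{\boldsymbol{\Sigma}}_{\pm} = \boldsymbol{\Sigma}_{\pm}\left[\mathbf{I} - \sigma^{-1}\boldsymbol{\Sigma}_{\pm}\right]^{-1}. \tag{2.65}$$

Finally, substituting Eq. (2.64) into Eq. (2.60) we obtain the second-order even-parity equation:

$$-\hat{\Omega}\cdot\vec{\nabla}\,\sigma^{-1}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+} - \hat{\Omega}\cdot\vec{\nabla}\,\sigma^{-2}\,\mathbf{y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\int d\Omega'\,\mathbf{y}_{-}(\hat{\Omega}')\,\hat{\Omega}'\cdot\vec{\nabla}\psi^{+}(\hat{\Omega}') + \sigma\psi^{+} = \mathbf{y}_{+}^{T}\boldsymbol{\Sigma}_{+}\int d\Omega'\,\mathbf{y}_{+}(\hat{\Omega}')\,\psi^{+}(\hat{\Omega}') + \mathbf{y}_{+}^{T}\mathbf{s}^{+} - \hat{\Omega}\cdot\vec{\nabla}\,\sigma^{-1}\,\mathbf{y}_{-}^{T}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}, \tag{2.66}$$

where we have also expanded the group source in harmonic moments

$$\mathbf{s}^{+} = \sum_{g'\neq g}\boldsymbol{\Sigma}_{+gg'}\int d\Omega\,\mathbf{y}_{+}(\hat{\Omega})\,\psi^{+}_{g'}(\hat{\Omega}) + \mathbf{s}_{fg}, \tag{2.67}$$

$$\mathbf{s}^{-} = \sum_{g'\neq g}\boldsymbol{\Sigma}_{-gg'}\,\sigma_{g'}^{-1}\left[\mathbf{I} - \sigma_{g'}^{-1}\boldsymbol{\Sigma}_{-g'g'}\right]^{-1}\left[\mathbf{s}^{-}_{g'} - \int d\Omega\,\mathbf{y}_{-}(\hat{\Omega})\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}_{g'}(\hat{\Omega})\right]. \tag{2.68}$$

Comparing Eq. (2.9) to Eq. (2.66), it is apparent that the second-order forms of the equation become considerably more complex than the first-order form when anisotropic scattering is included.

The weak form of Eq. (2.66) may be obtained by multiplying by $\tilde{\psi}^{+}$, integrating over angle and volume, and then applying the divergence theorem:

$$\begin{aligned}
\int dV\Bigg\{&\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\tilde{\psi}^{+})(\hat{\Omega}\cdot\vec{\nabla}\psi^{+}) + \sigma\tilde{\psi}^{+}\psi^{+}\right] + \sigma^{-2}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\tilde{\psi}^{+}\right)^{T}\tilde{\boldsymbol{\Sigma}}_{-}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}\right) \\
&- \left(\int d\Omega\,\mathbf{y}_{+}\tilde{\psi}^{+}\right)^{T}\boldsymbol{\Sigma}_{+}\left(\int d\Omega\,\mathbf{y}_{+}\psi^{+}\right) - \left(\int d\Omega\,\mathbf{y}_{+}\tilde{\psi}^{+}\right)^{T}\mathbf{s}^{+} \\
&- \sigma^{-1}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\tilde{\psi}^{+}\right)^{T}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}\Bigg\} + \int_{\Gamma} d\Gamma\int d\Omega\;\hat{\Omega}\cdot\hat{n}\,\tilde{\psi}^{+}\psi^{-} = 0,
\end{aligned} \tag{2.69}$$

where $\psi^{-}$ is given by Eq. (2.64). Alternately, a variational principle may be written, which yields Eq. (2.66) as the Euler–Lagrange equation. The variational formulation of the within-group problem is

$$\begin{aligned}
F[\psi^{+}] = \int dV\Bigg\{&\int d\Omega\left[\sigma^{-1}(\hat{\Omega}\cdot\vec{\nabla}\psi^{+})^{2} + \sigma\psi^{+2}\right] + \sigma^{-2}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}\right)^{T}\tilde{\boldsymbol{\Sigma}}_{-}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}\right) \\
&- \left(\int d\Omega\,\mathbf{y}_{+}\psi^{+}\right)^{T}\boldsymbol{\Sigma}_{+}\left(\int d\Omega\,\mathbf{y}_{+}\psi^{+}\right) - 2\left(\int d\Omega\,\mathbf{y}_{+}\psi^{+}\right)^{T}\mathbf{s}^{+} \\
&- 2\sigma^{-1}\left(\int d\Omega\,\mathbf{y}_{-}\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}\right)^{T}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}\Bigg\}.
\end{aligned} \tag{2.70}$$

The natural boundary condition remains $\psi^{-} = 0$, but where $\psi^{-}$ is defined by Eq. (2.64).

2.4.2 Angular Approximations

We next use Eq. (2.70) as the point of departure for deriving spherical harmonics and discrete ordinates approximations.

2.4.2.1 Spherical Harmonics Expansions

The spherical harmonics approximations may be expressed as the scalar product of a row and a column vector, where $\mathbf{y}_{+}(\hat{\Omega})$ is given by Eq. (2.56):

$$\psi^{+}(\vec{r},\hat{\Omega})\approx\mathbf{y}_{+}^{T}(\hat{\Omega})\,\boldsymbol{\psi}^{+}(\vec{r}). \tag{2.71}$$

From the orthonormality of the $\mathbf{y}_{+}(\hat{\Omega})$, we then have the space-dependent coefficients to be

$$\boldsymbol{\psi}^{+}(\vec{r}) = \int d\Omega\,\mathbf{y}_{+}(\hat{\Omega})\,\psi^{+}(\vec{r},\hat{\Omega}). \tag{2.72}$$

Likewise, we may write

$$\mathbf{U}_{k'}\nabla_{k'}\boldsymbol{\psi}^{+}(\vec{r}) = \int d\Omega\,\mathbf{y}_{-}(\hat{\Omega})\,\hat{\Omega}\cdot\vec{\nabla}\psi^{+}(\vec{r},\hat{\Omega}), \tag{2.73}$$

where

$$\mathbf{U}_k = \int d\Omega\;\hat{\Omega}\cdot\hat{k}\;\mathbf{y}_{-}(\hat{\Omega})\,\mathbf{y}_{+}^{T}(\hat{\Omega}), \tag{2.74}$$

and hereafter repeated $k$ or $k'$ in the same term indicates summation over the spatial directions. We next substitute Eq. (2.71) into the functional given by Eq. (2.70) to obtain:

$$F[\boldsymbol{\psi}^{+}] = \int dV\Big\{(\mathbf{U}_k\nabla_k\boldsymbol{\psi}^{+})^{T}\sigma^{-1}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)(\mathbf{U}_{k'}\nabla_{k'}\boldsymbol{\psi}^{+}) + \boldsymbol{\psi}^{+T}(\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+})\boldsymbol{\psi}^{+} - 2\boldsymbol{\psi}^{+T}\mathbf{s}^{+} - 2(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{U}_k^{T}\sigma^{-1}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}\Big\}, \tag{2.75}$$

where we have used the orthonormal properties of $\mathbf{y}_{\pm}(\hat{\Omega})$ to obtain the identity

$$\mathbf{U}_k^{T}\mathbf{U}_{k'} = \int d\Omega\;(\hat{\Omega}\cdot\hat{k})(\hat{\Omega}\cdot\hat{k}')\;\mathbf{y}_{+}(\hat{\Omega})\,\mathbf{y}_{+}^{T}(\hat{\Omega}) \tag{2.76}$$

and thus simplify the streaming term in the reduced functional. Requiring the functional to be stationary with respect to variations in $\boldsymbol{\psi}^{+}(\vec{r})$ yields the spherical harmonics approximation as the Euler–Lagrange equation:

$$-\nabla_k\mathbf{U}_k^{T}\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right]\mathbf{U}_{k'}\nabla_{k'}\boldsymbol{\psi}^{+} + (\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+})\boldsymbol{\psi}^{+} = \mathbf{s}^{+} - \nabla_k\mathbf{U}_k^{T}\sigma^{-1}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}. \tag{2.77}$$

2.4.2.2 Discrete Ordinates Approximations

The discrete ordinates equation may be obtained from the functional, Eq. (2.70), by evaluating all the quantities at discrete directions $\hat{\Omega}_n$ and replacing integrals over angle with an appropriate quadrature approximation

$$\int d\Omega\,f(\vec{r},\hat{\Omega})\approx\sum_n w_n\,f(\vec{r},\hat{\Omega}_n), \tag{2.78}$$

where $w_n$ are the weights and $\hat{\Omega}_n$ are the discrete ordinates directions. The even-parity flux is then evaluated in the discrete ordinates directions by $\psi^{+}(\vec{r},\hat{\Omega}_n)$. Thus, to reduce the functional we create a vector of the even-parity flux approximated in the discrete ordinates directions:

$$\boldsymbol{\psi}^{+T}(\vec{r})\approx\left[\psi^{+}(\vec{r},\hat{\Omega}_1), \psi^{+}(\vec{r},\hat{\Omega}_2), \psi^{+}(\vec{r},\hat{\Omega}_3), \ldots, \psi^{+}(\vec{r},\hat{\Omega}_n), \ldots\right]. \tag{2.79}$$

To apply the quadrature rule of Eq. (2.78) to the functional, we also define two diagonal matrices

$$\boldsymbol{\Omega}_k = \mathrm{diag}\left[\hat{\Omega}_1\cdot\hat{k}, \hat{\Omega}_2\cdot\hat{k}, \hat{\Omega}_3\cdot\hat{k}, \ldots, \hat{\Omega}_n\cdot\hat{k}, \ldots\right] \tag{2.80}$$

and

$$\mathbf{W} = \mathrm{diag}[w_1, w_2, w_3, \ldots, w_n, \ldots]. \tag{2.81}$$

We also employ matrices made up of the $\mathbf{y}_{\pm}(\hat{\Omega})$ column vectors evaluated in the ordinates directions:

$$\mathbf{Y}_{\pm} = \left[\mathbf{y}_{\pm}(\hat{\Omega}_1), \mathbf{y}_{\pm}(\hat{\Omega}_2), \mathbf{y}_{\pm}(\hat{\Omega}_3), \ldots, \mathbf{y}_{\pm}(\hat{\Omega}_n), \ldots\right]. \tag{2.82}$$

Utilizing the angular quadrature approximation and Eqs. (2.79) through (2.82), the functional reduces to the form

$$\begin{aligned}
F[\boldsymbol{\psi}^{+}] = \int dV\Big\{&(\boldsymbol{\Omega}_k\nabla_k\boldsymbol{\psi}^{+})^{T}\sigma^{-1}\mathbf{W}\left[\mathbf{I} + \sigma^{-1}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}\right](\boldsymbol{\Omega}_{k'}\nabla_{k'}\boldsymbol{\psi}^{+}) + \boldsymbol{\psi}^{+T}\mathbf{W}\left[\sigma\mathbf{I} - \mathbf{Y}_{+}^{T}\boldsymbol{\Sigma}_{+}\mathbf{Y}_{+}\mathbf{W}\right]\boldsymbol{\psi}^{+} \\
&- 2\boldsymbol{\psi}^{+T}\mathbf{W}\mathbf{Y}_{+}^{T}\mathbf{s}^{+} - 2(\boldsymbol{\Omega}_k\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{W}\mathbf{Y}_{-}^{T}\sigma^{-1}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}\Big\},
\end{aligned} \tag{2.83}$$

where the group sources are obtained by applying the quadrature formula to Eqs. (2.67) and (2.68). If we require $F[\boldsymbol{\psi}^{+}]$ to be stationary by taking $F[\boldsymbol{\psi}^{+} + \delta\tilde{\boldsymbol{\psi}}^{+}]$ and requiring the linear term in $\delta$ to vanish, we obtain an Euler–Lagrange equation

$$-\nabla_k\boldsymbol{\Omega}_k\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}\right]\boldsymbol{\Omega}_{k'}\nabla_{k'}\boldsymbol{\psi}^{+} + \left[\sigma\mathbf{I} - \mathbf{Y}_{+}^{T}\boldsymbol{\Sigma}_{+}\mathbf{Y}_{+}\mathbf{W}\right]\boldsymbol{\psi}^{+} = \mathbf{Y}_{+}^{T}\mathbf{s}^{+} - \boldsymbol{\Omega}_k\nabla_k\sigma^{-1}\mathbf{Y}_{-}^{T}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}, \tag{2.84}$$

which is just the even-parity form of the discrete ordinates equations.

Both the spherical harmonics and discrete ordinates approximations can be written in forms that are identical except for the coefficient matrices. The variational principle has the form

$$F[\boldsymbol{\psi}^{+}] = \int dV\left\{(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{H}_{kk'}\nabla_{k'}\boldsymbol{\psi}^{+} + \boldsymbol{\psi}^{+T}\mathbf{K}\boldsymbol{\psi}^{+} - 2\boldsymbol{\psi}^{+T}\mathbf{G}_{+}\mathbf{s}^{+} - 2(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{G}_k\mathbf{s}^{-}\right\}, \tag{2.85}$$

and the corresponding even-parity equations are

$$-\nabla_k\mathbf{H}_{kk'}\nabla_{k'}\boldsymbol{\psi}^{+} + \mathbf{K}\boldsymbol{\psi}^{+} = \mathbf{G}_{+}\mathbf{s}^{+} - \nabla_k\mathbf{G}_k\mathbf{s}^{-}. \tag{2.86}$$

The coefficient matrices are given in Table 2.1, along with those for the simplified approximations treated in the following section.

Thus far, the derivations hold for three dimensions. For one or two dimensions they may be appropriately reduced by eliminating terms in the spherical harmonics or discrete ordinates expansions. In spherical harmonics all terms with $m<0$ for x–y geometry and all terms with $m\neq 0$ for plane geometry are eliminated from Eqs. (2.71) through (2.77).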
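The duplicate directions are likewise eliminated from discrete ordinates approximations (polar axis in the x direction, and azimuthal angle measured from the y direction; $k = 1, 2, 3$ for the x, y, and z axes). The plane-geometry spherical harmonics equation, in which the foregoing matrices are expressed in terms of Legendre polynomials, is particularly relevant here as the basis for the simplified spherical harmonics approximation.

Table 2.1 Uniform notation for Eqs. (2.85) and (2.86)

| | $\mathbf{H}_{kk'}$ | $\mathbf{K}$ | $\mathbf{G}_{+}$ | $\mathbf{G}_k$ |
|---|---|---|---|---|
| Pn | $\mathbf{U}_k^{T}\sigma^{-1}[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}]\mathbf{U}_{k'}$ | $\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+}$ | $\mathbf{I}$ | $\mathbf{U}_k^{T}\sigma^{-1}(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-})$ |
| Sn | $\boldsymbol{\Omega}_k\sigma^{-1}[\mathbf{W} + \sigma^{-1}\mathbf{W}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}]\boldsymbol{\Omega}_{k'}$ | $\sigma\mathbf{W} - \mathbf{W}\mathbf{Y}_{+}^{T}\boldsymbol{\Sigma}_{+}\mathbf{Y}_{+}\mathbf{W}$ | $\mathbf{W}\mathbf{Y}_{+}^{T}$ | $\boldsymbol{\Omega}_k\sigma^{-1}\mathbf{W}\mathbf{Y}_{-}^{T}(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-})$ |
| SPn | $\mathbf{U}_1^{T}\sigma^{-1}[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}]\mathbf{U}_1\,\delta_{kk'}$ | $\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+}$ | $\mathbf{I}$ | $\mathbf{U}_1^{T}\sigma^{-1}(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-})$ |
| SSn | $\boldsymbol{\Omega}_k\sigma^{-1}[\mathbf{W} + \sigma^{-1}\mathbf{W}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}]\boldsymbol{\Omega}_{k'}\,\delta_{kk'}$ | $\sigma\mathbf{W} - \mathbf{W}\mathbf{Y}_{+}^{T}\boldsymbol{\Sigma}_{+}\mathbf{Y}_{+}\mathbf{W}$ | $\mathbf{W}\mathbf{Y}_{+}^{T}$ | $\boldsymbol{\Omega}_k\sigma^{-1}\mathbf{W}\mathbf{Y}_{-}^{T}(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-})$ |

2.4.2.3 Simplified Angular Approximations

Cross-derivative terms $\nabla_k(\cdot)\nabla_{k'}$ with $k\neq k'$ appear in both the spherical harmonics and discrete ordinates approximations. Eliminating them greatly simplifies the spatial discretization. The widely employed simplified spherical harmonics, or SPn, approximation provides such simplification and reduces the number of coupled differential equations that must be solved, often without substantial loss of accuracy [34–41]. We obtain the SPn equations by taking the one-dimensional form of the Pn equations, given by Eq. (2.77),

$$-\frac{\partial}{\partial x}\mathbf{U}_1^{T}\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right]\mathbf{U}_1\frac{\partial}{\partial x}\boldsymbol{\psi}^{+} + (\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+})\boldsymbol{\psi}^{+} = \mathbf{s}^{+} - \frac{\partial}{\partial x}\mathbf{U}_1^{T}\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right]\mathbf{s}^{-}, \tag{2.87}$$

and replacing the derivatives with gradient and divergence operators

$$-\nabla_k\mathbf{U}_1^{T}\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right]\mathbf{U}_1\nabla_k\boldsymbol{\psi}^{+} + (\sigma\mathbf{I} - \boldsymbol{\Sigma}_{+})\boldsymbol{\psi}^{+} = \mathbf{s}^{+} - \sum_k\nabla_k\mathbf{U}_1^{T}\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right]\mathbf{s}^{-}. \tag{2.88}$$

The resulting equation may also be written in variational form and expressed as coupled partial differential equations; the coefficient matrices are included in Table 2.1. More importantly, Eq. (2.88) can be written as coupled sets of diffusion equations, allowing highly developed diffusion computational methods to be used to obtain solutions with great computational efficiency.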
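To make the plane-geometry matrix $\mathbf{U}_1$ concrete, the following minimal Python sketch builds it from normalized Legendre polynomials and checks the identity of Eq. (2.76) for $k = k' = 1$. It assumes the chapter's normalization $\int d\Omega = 1$, so that the angular integral becomes $\tfrac{1}{2}\int_{-1}^{1} d\mu$ and the orthonormal basis functions are $\sqrt{2l+1}\,P_l(\mu)$; the function names and the use of Simpson's rule for the integrals are illustrative choices, not the chapter's method.

```python
import math


def legendre(l, mu):
    """Legendre polynomial P_l(mu) from the three-term recurrence."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def y(l, mu):
    """Basis function orthonormal under (1/2) * integral over [-1, 1]."""
    return math.sqrt(2 * l + 1) * legendre(l, mu)


def angular_average(f, n=2000):
    """(1/2) * integral of f over [-1, 1] by composite Simpson's rule."""
    h = 2.0 / n
    total = f(-1.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return 0.5 * total * h / 3.0


def plane_geometry_matrices(order):
    """Return (U, S) for an odd P_N order.

    U[a][b] = <mu y_odd[a] y_even[b]>, the plane-geometry form of Eq. (2.74);
    S[a][b] = <mu^2 y_even[a] y_even[b]>, the right side of Eq. (2.76).
    """
    even = list(range(0, order, 2))
    odd = list(range(1, order + 1, 2))
    U = [[angular_average(lambda mu, a=a, b=b: mu * y(a, mu) * y(b, mu))
          for b in even] for a in odd]
    S = [[angular_average(lambda mu, a=a, b=b: mu * mu * y(a, mu) * y(b, mu))
          for b in even] for a in even]
    return U, S


def gram(U):
    """Return U^T U for a matrix stored as a list of rows."""
    rows, cols = len(U), len(U[0])
    return [[sum(U[k][i] * U[k][j] for k in range(rows))
             for j in range(cols)] for i in range(cols)]


U, S = plane_geometry_matrices(3)   # P3: even orders 0, 2; odd orders 1, 3
UtU = gram(U)
```

For an odd order N the product $\mu\,y_l$ with even $l\le N-1$ contains only odd orders up to $N$, all of which are in the odd-parity basis, so `UtU` and `S` should agree to within the quadrature error.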
The simplified discrete ordinates approximation is much more recent [42], and to date not widely employed. It simply eliminates the cross-derivative terms from Eq. (2.84) by letting

$$\boldsymbol{\Omega}_k\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}\right]\boldsymbol{\Omega}_{k'}\;\to\;\boldsymbol{\Omega}_k\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}\right]\boldsymbol{\Omega}_{k'}\,\delta_{kk'} \tag{2.89}$$

to yield

$$-\nabla_k\boldsymbol{\Omega}_k\sigma^{-1}\left[\mathbf{I} + \sigma^{-1}\mathbf{Y}_{-}^{T}\tilde{\boldsymbol{\Sigma}}_{-}\mathbf{Y}_{-}\mathbf{W}\right]\boldsymbol{\Omega}_k\nabla_k\boldsymbol{\psi}^{+} + \left[\sigma\mathbf{I} - \mathbf{Y}_{+}^{T}\boldsymbol{\Sigma}_{+}\mathbf{Y}_{+}\mathbf{W}\right]\boldsymbol{\psi}^{+} = \mathbf{Y}_{+}^{T}\mathbf{s}^{+} - \boldsymbol{\Omega}_k\nabla_k\sigma^{-1}\mathbf{Y}_{-}^{T}\left(\mathbf{I} + \sigma^{-1}\tilde{\boldsymbol{\Sigma}}_{-}\right)\mathbf{s}^{-}. \tag{2.90}$$

The SSn equations may also be written as coupled sets of partial differential equations as indicated in Table 2.1.

Before proceeding, it is instructive to briefly compare Pn and Sn solutions for which the spatial discretization error is insignificant. Figure 2.3 shows results for the widely used Azmy benchmark [43], which consists of a fixed source located in a highly absorbing medium. The flux plots along the surface far from the source accentuate errors in angular approximations. The P11 solution is converged in angle, and may be used as a reference.

[Fig. 2.3 Flux plots from Sn and Pn calculations (vertical axis: relative flux magnitude; horizontal axis: Y position (cm); curves: VARIANT P1, P3, P5, P11 and TWODANT S8, S16)]

Note that the Sn solutions oscillate about the reference, demonstrating the well-known ray effects. Low order Pn solutions appear more physically plausible, since they are smooth curves and their errors tend to be either systematically low (as in this problem) or high. For this problem, SPn solutions, which are not shown, closely track the corresponding Pn solutions. Since SPn has fewer angular degrees of freedom, computational algorithms for SPn solutions run faster and use less memory than those for the same level Pn approximation. Unlike discrete ordinates or spherical harmonics methods, however, SPn calculations do not converge to the true solution as n is increased. In many cases, however, the residual errors are acceptably small.
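The circumstances under which SPn solutions may be expected to reasonably approximate the transport equation are well examined elsewhere [34, 35, 39, 40].

2.4.3 Spatial Discretization

We spatially discretize the foregoing angular approximations using the same finite element techniques employed in Section 2.3. Thus, we begin by subdividing the problem domain into a finite number of volume elements,

$$V = \sum_e V_e. \tag{2.91}$$

Then, provided we consider only spatial trial functions that are continuous across element interfaces, we may write the functional given by Eq. (2.85) as a superposition of elemental contributions:

$$F[\boldsymbol{\psi}^{+}] = \sum_e\int_e dV\left\{(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{H}^{e}_{kk'}\nabla_{k'}\boldsymbol{\psi}^{+} + \boldsymbol{\psi}^{+T}\mathbf{K}^{e}\boldsymbol{\psi}^{+} - 2\boldsymbol{\psi}^{+T}\mathbf{G}^{e}_{+}\mathbf{s}^{+} - 2(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{G}^{e}_k\mathbf{s}^{-}\right\}. \tag{2.92}$$

Within each element, we represent the spatial distribution by trial functions, designated by the vector $\mathbf{n}_e(\vec{r})$, and a vector of unknown magnitudes $\boldsymbol{\xi}_e$. The even-parity flux moments become

$$\boldsymbol{\psi}^{+}(\vec{r}) = \left[\mathbf{I}\otimes\mathbf{n}_e^{T}(\vec{r})\right]\boldsymbol{\xi}_e, \qquad \vec{r}\in V_e. \tag{2.93}$$

Then, using the properties of the tensor product, we may write

$$\boldsymbol{\psi}^{+T}\mathbf{K}^{e}\boldsymbol{\psi}^{+} = \left(\left[\mathbf{I}\otimes\mathbf{n}_e^{T}\right]\boldsymbol{\xi}_e\right)^{T}\mathbf{K}^{e}\left[\mathbf{I}\otimes\mathbf{n}_e^{T}\right]\boldsymbol{\xi}_e = \boldsymbol{\xi}_e^{T}\left[\mathbf{I}\otimes\mathbf{n}_e\right]\left[\mathbf{K}^{e}\otimes\mathbf{n}_e^{T}\right]\boldsymbol{\xi}_e = \boldsymbol{\xi}_e^{T}\left[\mathbf{K}^{e}\otimes\mathbf{n}_e\mathbf{n}_e^{T}\right]\boldsymbol{\xi}_e. \tag{2.94}$$

Similarly, it follows that

$$(\nabla_k\boldsymbol{\psi}^{+})^{T}\mathbf{H}^{e}_{kk'}\nabla_{k'}\boldsymbol{\psi}^{+} = \boldsymbol{\xi}_e^{T}\left[\mathbf{H}^{e}_{kk'}\otimes(\nabla_k\mathbf{n}_e)(\nabla_{k'}\mathbf{n}_e^{T})\right]\boldsymbol{\xi}_e. \tag{2.95}$$

The functional given by Eq. (2.92) reduces to a superposition of subelement contributions:

$$F[\boldsymbol{\xi}_e] = \sum_e\boldsymbol{\xi}_e^{T}\mathbf{A}_e\boldsymbol{\xi}_e - 2\sum_e\boldsymbol{\xi}_e^{T}\mathbf{q}_e, \tag{2.96}$$

where the coefficient matrices are given in terms of the known trial functions of space and angle

$$\mathbf{A}_e = \mathbf{H}^{e}_{kk'}\otimes\int_e dV\,(\nabla_k\mathbf{n}_e)(\nabla_{k'}\mathbf{n}_e^{T}) + \mathbf{K}^{e}\otimes\int_e dV\,\mathbf{n}_e\mathbf{n}_e^{T}. \tag{2.97}$$

Here, the superscripts on $\mathbf{H}^{e}_{kk'}$ and $\mathbf{K}^{e}$ indicate that the cross sections are evaluated for element $V_e$. The source term is

$$\mathbf{q}_e = \int_e dV\,(\mathbf{G}^{e}_{+}\mathbf{s}^{+})\otimes\mathbf{n}_e + \int_e dV\,(\mathbf{G}^{e}_k\mathbf{s}^{-})\otimes\nabla_k\mathbf{n}_e. \tag{2.98}$$

With piecewise polynomial trial functions for the finite elements, the components of $\boldsymbol{\xi}_e$ are just the approximate values of $\boldsymbol{\psi}^{+}$ at the element vertices [7].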
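The tensor-product manipulation in Eq. (2.94) can be confirmed numerically at a single point. The following is a minimal sketch in plain Python, assuming two angular moments and three spatial trial functions; the matrix `K`, the trial-function values `n`, the coefficient vector `xi`, and the helper names are all hypothetical values chosen only for illustration.

```python
def kron(A, B):
    """Kronecker (tensor) product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Hypothetical data: 2 angular moments, 3 spatial trial functions.
K = [[2.0, -0.5],
     [-0.5, 1.5]]                       # stands in for K^e
n = [0.5, 0.25, 0.25]                   # values of n_e at one point r
xi = [1.0, 2.0, 3.0, -1.0, 0.5, 4.0]    # elemental coefficients xi_e
identity = [[1.0, 0.0], [0.0, 1.0]]

# I (x) n_e^T maps the 6 coefficients to the 2 moments, as in Eq. (2.93).
I_nT = kron(identity, [n])              # 2 x 6
psi = matmul(I_nT, [[x] for x in xi])   # 2 x 1 column of moments

# Left side of Eq. (2.94): psi^T K psi.
lhs = matmul(matmul(transpose(psi), K), psi)[0][0]

# Right side of Eq. (2.94): xi^T [K (x) n n^T] xi.
n_nT = [[a * b for b in n] for a in n]  # outer product n_e n_e^T, 3 x 3
K_nnT = kron(K, n_nT)                   # 6 x 6
xi_col = [[x] for x in xi]
rhs = matmul(matmul(transpose(xi_col), K_nnT), xi_col)[0][0]

# Matrix form of the same identity, independent of xi.
lhs_matrix = matmul(matmul(transpose(I_nT), K), I_nT)
```

Integrating `n_nT` over the element, rather than evaluating it at a point, gives the second term of Eq. (2.97); the angular matrix and the spatial integral factor cleanly because of this identity.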
Since these trial functions must be continuous across element interfaces, the process for assembling the elemental contributions and determining a set of linear equations is completely analogous to that of Eqs. (2.38) through (2.46): the components of $\boldsymbol{\xi}_e$ corresponding to the same physical location on either side of a subelement must have the same value. This continuity condition is enforced by creating a Boolean matrix $\boldsymbol{\Xi}_e$ for each subelement that maps the $\boldsymbol{\xi}_e$ onto $\boldsymbol{\xi}$, a node-wide vector of coefficients: $\boldsymbol{\xi}_e = \boldsymbol{\Xi}_e\boldsymbol{\xi}$. This transformation allows us to write the discretized functional in the form of Eq. (2.38). The set of algebraic equations is again obtained by requiring the discretized functional to be stationary. The result, once again, is a set of equations with the form $\mathbf{A}\boldsymbol{\xi} = \mathbf{q}$, but where the elemental contributions to $\mathbf{A}$ and $\mathbf{q}$ are now given by Eqs. (2.97) and (2.98). Regardless of the angular approximation employed, the $\mathbf{A}$ matrix is sparse and symmetric. For smaller problems, direct methods may be used for solving the matrix problem. For large problems, an array of sparse matrix iterative techniques may be employed [29–31].

2.5 Hybrid and Integral Methods

Thus far, we have examined a number of angular approximations to the even-parity transport equations coupled with the use of finite elements in space. There are, of course, other possibilities for the discretization of the second-order equations. Here, we briefly examine two of these, the variational nodal method [11, 13, 32, 41] and an integral method [44], and then discuss recent work in combining nodal, finite element, and integral methods.

2.5.1 A Variational Nodal Method

The variational forms of the even-parity equation that we have utilized have incoming flux, vacuum, or reflected boundary conditions. Suppose, however, that we create a functional that is easily shown to be equivalent to the weighted residual, Eq. (2.69), which has an inhomogeneous natural boundary condition on the odd-parity flux.
We simply append the following surface term to the functional given by Eq. (2.70):

$$F_{\nu}[\psi^{+},\psi^{-}] = F_{\nu}[\psi^{+}] + 2\int_{\Gamma_{\nu}} d\Gamma\int d\Omega\;\hat{\Omega}\cdot\hat{n}\,\psi^{+}\psi^{-}. \tag{2.99}$$

The subscript $\nu$ is appended to denote that the problem domain is a volume $V_{\nu}$ bounded by $\Gamma_{\nu}$. By requiring this functional to be stationary with respect to variations in $\psi^{+}$ within $V_{\nu}$ we obtain the Euler–Lagrange equation (2.66). Variations on the boundary, $\Gamma_{\nu}$, yield the relationship, Eq. (2.64), giving $\psi^{-}$ in terms of $\psi^{+}$. Thus, we may consider solving the transport problem within $V_{\nu}$, assuming that the odd-parity flux is known at the boundary. In the finite element literature the following would be classified as a hybrid element approach [7, 8].

Suppose we consider $V_{\nu}$ to be the volume of one "node" of the problem domain, and the volume of the entire domain to be the sum of the nodes' volumes:

$$V = \sum_{\nu} V_{\nu}. \tag{2.100}$$

Then, we may write the functional for the entire domain as

$$F[\psi^{+},\psi^{-}] = \sum_{\nu} F_{\nu}[\psi^{+},\psi^{-}], \tag{2.101}$$

where $\psi^{-}$ now appears as a Lagrange multiplier at the nodal interfaces. Requiring $F[\psi^{+},\psi^{-}]$ to be stationary with respect to variations in $\psi^{+}$ within $V_{\nu}$ just yields Eq. (2.66) within each $V_{\nu}$. Requiring it to be stationary with respect to $\psi^{-}\to\psi^{-} + \delta\tilde{\psi}^{-}$ yields the condition

$$\int_{\Gamma} d\Gamma\int d\Omega\;\hat{\Omega}\cdot\hat{n}_{\nu}\left(\psi^{+}_{\nu} - \psi^{+}_{\nu'}\right)\tilde{\psi}^{-} = 0, \tag{2.102}$$

where $\hat{n}_{\nu}$ and $\hat{n}_{\nu'} = -\hat{n}_{\nu}$ are the outward normal vectors from nodes $V_{\nu}$ and $V_{\nu'}$ on either side of $\Gamma$. Thus, $\psi^{+}$ must be continuous across the interfaces.

In the neutronics literature, a nodal method is generally defined as one in which neutron conservation is enforced for each node or subregion $V_{\nu}$ [45]. The variational method described here may be shown to meet this condition as follows: suppose we let $\bar{\phi} = V_{\nu}^{-1}\int_{\nu} dV\int d\Omega\,\psi^{+}$ be the scalar flux averaged over node $V_{\nu}$. We may then write $\psi^{+} = \bar{\phi} + \psi'^{+}$, where $\int_{\nu} dV\int d\Omega\,\psi'^{+} = 0$, and substitute this expression into Eq. (2.99) to obtain

$$F_{\nu}[\psi^{+},\psi^{-}] = (\sigma - \sigma_s)V_{\nu}\bar{\phi}^{2} - 2V_{\nu}\bar{\phi}\bar{s} + 2\bar{\phi}\int_{\Gamma_{\nu}} d\Gamma\;\hat{n}\cdot\vec{J} + \text{terms independent of } \bar{\phi}, \tag{2.103}$$

where $\bar{s}$ is the group source averaged over $V_{\nu}$ and angle.
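If we require the functional to be stationary with respect to variations $\bar{\phi} \to \bar{\phi} + \delta\tilde{\phi}$, we obtain the nodal balance condition

$$(\sigma - \sigma_s) V_\nu \bar{\phi} + \int_{\Gamma_\nu} d\Gamma\, \hat{n}\cdot\vec{J} = V_\nu \bar{s}. \tag{2.104}$$

This enforcement of neutron conservation over each $V_\nu$ allows larger nodes to be used than is typical of fine-mesh methods that do not enforce such balance.

To proceed, we apply spherical harmonics, discrete ordinates, or one of the simplified angular approximations given in Section 2.4 to $F_\nu[\psi^+]$ in Eq. (2.99). We must consistently discretize the surface term. We first divide the surface into a number of flat surfaces, $\Gamma_\gamma$. In the spherical harmonics approximation, we let

$$\psi^-(\vec{r}, \hat{\Omega}) \approx \mathbf{y}_\gamma^{T}(\hat{\Omega})\, \boldsymbol{\psi}^-_\gamma(\vec{r}), \quad \vec{r} \in \Gamma_\gamma. \tag{2.105}$$

Here, the subscript $\gamma$ is attached to $\mathbf{y}_\gamma(\hat{\Omega})$ to indicate that the odd-parity spherical harmonics are rotated such that the polar axis is perpendicular to the surface. In addition, the $Y_{ll}$ are deleted from $\mathbf{y}_\gamma(\hat{\Omega})$; this is necessary to obtain the correct number of linearly independent coupling conditions [14]. Employing Eqs. (2.71) and (2.105), we get

$$\int_{\Gamma_\nu} d\Gamma \int d\Omega\, \hat{\Omega}\cdot\hat{n}\, \psi^+ \psi^- = \sum_\gamma \int_\gamma d\Gamma\, \boldsymbol{\psi}^{+T} \mathbf{C}_\gamma \boldsymbol{\psi}^-_\gamma, \tag{2.106}$$

with $\mathbf{C}_\gamma = \int d\Omega\, \hat{\Omega}\cdot\hat{n}_\gamma\, \mathbf{y}^+ \mathbf{y}_\gamma^{T}$, while for the SPn approximation $\mathbf{C}_\gamma = \mathbf{U}^T$.

For discrete ordinates, we approximate the odd-parity flux on $\Gamma_\gamma$ by using Eqs. (2.79) through (2.81) with

$$\boldsymbol{\psi}^{-T}_\gamma(\vec{r}) \approx \left[\psi^-(\vec{r}, \hat{\Omega}_1),\ \psi^-(\vec{r}, \hat{\Omega}_2),\ \psi^-(\vec{r}, \hat{\Omega}_3),\ \ldots,\ \psi^-(\vec{r}, \hat{\Omega}_n),\ \ldots\right] \tag{2.107}$$

and

$$\boldsymbol{\Omega}_\gamma = \mathrm{diag}\left[\hat{\Omega}_1\cdot\hat{n}_\gamma,\ \hat{\Omega}_2\cdot\hat{n}_\gamma,\ \hat{\Omega}_3\cdot\hat{n}_\gamma,\ \ldots,\ \hat{\Omega}_n\cdot\hat{n}_\gamma,\ \ldots\right]. \tag{2.108}$$

The result is again Eq. (2.106), but with $\mathbf{C}_\gamma = \mathbf{W}\boldsymbol{\Omega}_\gamma$; this equality remains unchanged in the SSn approximation.

Next, we rewrite the reduced form of the functional given by Eq. (2.99) with the angularly discretized variables in a form similar to Eq. (2.92):

$$F_\nu[\boldsymbol{\psi}^+, \boldsymbol{\psi}^-] = \int_\nu dV \left\{ (\nabla_k \boldsymbol{\psi}^+)^T \mathbf{H}_{kk'} \nabla_{k'} \boldsymbol{\psi}^+ + \boldsymbol{\psi}^{+T} \mathbf{K} \boldsymbol{\psi}^+ - 2\boldsymbol{\psi}^{+T} \mathbf{G}^+ s^+ - 2(\nabla_k \boldsymbol{\psi}^+)^T \mathbf{G}_k^- s^- \right\} + 2\sum_\gamma \int_\gamma d\Gamma\, \boldsymbol{\psi}^{+T} \mathbf{C}_\gamma \boldsymbol{\psi}^-_\gamma. \tag{2.109}$$

Within the node, we approximate the spatial dependence of the even-parity flux by a vector $\mathbf{f}(\vec{r})$ containing the orthonormal components of a complete polynomial, and on the interfaces the spatial dependence of the odd-parity flux is given by a second set of orthonormal polynomials, $\mathbf{h}_\gamma(\vec{r})$, defined only on the interface. Thus,

$$\boldsymbol{\psi}^+(\vec{r}) = \left[\mathbf{I} \otimes \mathbf{f}^T(\vec{r})\right] \boldsymbol{\xi}, \quad \vec{r} \in V_\nu \tag{2.110}$$

and

$$\boldsymbol{\psi}^-_\gamma(\vec{r}) = \left[\mathbf{I} \otimes \mathbf{h}_\gamma^T(\vec{r})\right] \boldsymbol{\chi}_\gamma, \quad \vec{r} \in \Gamma_\gamma, \tag{2.111}$$

where $\boldsymbol{\xi}$ and $\boldsymbol{\chi}_\gamma$ are the unknown coefficients. The reduced functional then becomes

$$F_\nu[\boldsymbol{\xi}, \boldsymbol{\chi}] = \boldsymbol{\xi}^T \mathbf{A} \boldsymbol{\xi} - 2\boldsymbol{\xi}^T \mathbf{q} + 2\boldsymbol{\xi}^T \sum_\gamma \mathbf{M}_\gamma \boldsymbol{\chi}_\gamma, \tag{2.112}$$

where

$$\mathbf{A} = \mathbf{H}_{kk'} \otimes \int_\nu dV\, (\nabla_k \mathbf{f})(\nabla_{k'} \mathbf{f}^T) + \mathbf{K} \otimes \int_\nu dV\, \mathbf{f}\,\mathbf{f}^T, \tag{2.113}$$

$$\mathbf{q} = \mathbf{G}^+ \int_\nu dV\, s^+ \otimes \mathbf{f} + \mathbf{G}_k^- \int_\nu dV\, s^- \otimes \nabla_k \mathbf{f}, \tag{2.114}$$

and

$$\mathbf{M}_\gamma = \mathbf{C}_\gamma \otimes \int_\gamma d\Gamma\, \mathbf{f}\,\mathbf{h}_\gamma^T. \tag{2.115}$$

Requiring Eq. (2.112) to be stationary with respect to variations in $\boldsymbol{\xi}$ yields

$$\mathbf{A}\boldsymbol{\xi} = \mathbf{q} - \sum_\gamma \mathbf{M}_\gamma \boldsymbol{\chi}_\gamma, \tag{2.116}$$

or equivalently

$$\boldsymbol{\xi} = \mathbf{A}^{-1}\mathbf{q} - \mathbf{A}^{-1}\mathbf{M}\boldsymbol{\chi}, \tag{2.117}$$

where $\mathbf{M} = [\mathbf{M}_1, \mathbf{M}_2, \ldots]$ and $\boldsymbol{\chi}^T = [\boldsymbol{\chi}_1^T, \boldsymbol{\chi}_2^T, \ldots]$. To obtain the interface continuity conditions we first note that

$$F[\boldsymbol{\xi}, \boldsymbol{\chi}] = \sum_\nu F_\nu[\boldsymbol{\xi}_\nu, \boldsymbol{\chi}_\nu]. \tag{2.118}$$

Requiring that the functional be stationary with respect to the variations $\boldsymbol{\chi}_\gamma \to \boldsymbol{\chi}_\gamma + \delta\tilde{\boldsymbol{\chi}}_\gamma$, we obtain the condition

$$\sum_\gamma \boldsymbol{\xi}^T \mathbf{M}_\gamma \tilde{\boldsymbol{\chi}}_\gamma = 0, \tag{2.119}$$

where the sum is now over all interfaces in the problem domain. This condition can be met for arbitrary $\tilde{\boldsymbol{\chi}}_\gamma$ only if the even-parity moments

$$\boldsymbol{\varphi}_\gamma = \mathbf{M}_\gamma^T \boldsymbol{\xi} \tag{2.120}$$

are continuous across the interfaces. Thus for each node, we may combine this expression with Eq. (2.117) to express the even-parity interface moments in terms of the node source and the odd-parity interface moments:

$$\boldsymbol{\varphi}_\gamma = \mathbf{M}_\gamma^T \mathbf{A}^{-1}\mathbf{q} - \mathbf{M}_\gamma^T \mathbf{A}^{-1}\mathbf{M}\boldsymbol{\chi}. \tag{2.121}$$

Letting $\boldsymbol{\varphi}^T = [\boldsymbol{\varphi}_1^T, \boldsymbol{\varphi}_2^T, \ldots]$, and therefore $\boldsymbol{\varphi} = \mathbf{M}^T\boldsymbol{\xi}$, we may write for each node

$$\boldsymbol{\varphi} = \mathbf{M}^T \mathbf{A}^{-1}\mathbf{q} - \mathbf{M}^T \mathbf{A}^{-1}\mathbf{M}\boldsymbol{\chi}. \tag{2.122}$$

A convenient way to solve these equations is to convert them to response matrix form. We make a linear transformation of variables

$$\mathbf{j}^\pm = \tfrac{1}{4}\boldsymbol{\varphi} \pm \tfrac{1}{2}\boldsymbol{\chi}, \tag{2.123}$$

which in the diffusion approximation correspond to the partial currents. We then may rewrite Eq. (2.122) in response matrix form

$$\mathbf{j}^+ = \mathbf{R}\,\mathbf{j}^- + \mathbf{B}\,\mathbf{q}, \tag{2.124}$$

where the matrices are defined by $\mathbf{R} = \left[\tfrac{1}{2}\mathbf{M}^T\mathbf{A}^{-1}\mathbf{M} + \mathbf{I}\right]^{-1}\left[\tfrac{1}{2}\mathbf{M}^T\mathbf{A}^{-1}\mathbf{M} - \mathbf{I}\right]$ and $\mathbf{B} = \left[\tfrac{1}{2}\mathbf{M}^T\mathbf{A}^{-1}\mathbf{M} + \mathbf{I}\right]^{-1}\tfrac{1}{2}\mathbf{M}^T\mathbf{A}^{-1}$. Red-black iteration or other methods may then be applied to the solution of the resulting equations. Partitioning algorithms, over-relaxation methods, and other techniques have also been developed to accelerate the iterative solutions of the systems of nodal response matrix equations [46, 47].

2.5.2 An Even-Parity Integral Method

The functional given by Eq. (2.25) may also be used to derive an even-parity integral transport method [28, 44]. The treatment is restricted to isotropic scattering, and we also eliminate the surface term from Eq. (2.25) by assuming essential boundary conditions. To obtain the integral method, we reverse the discretization order used in the preceding sections and first discretize the spatial variables, using finite elements, while leaving the unknowns as functions of angle. Dividing the problem domain into elements, Eq. (2.25) reduces to

$$F[\psi^+] = \sum_e \int_e dV \left\{ \int d\Omega \left[ \sigma_e^{-1} \left(\hat{\Omega}_k \nabla_k \psi^+\right)^2 + \sigma_e \psi^{+2} \right] - \sigma_{se}\,\phi^2 - 2\phi s \right\}. \tag{2.125}$$

Then, employing the same finite element trial functions, $\mathbf{n}_e(\vec{r})$, used in Sections 2.4 and 2.5, we get

$$\psi^+(\vec{r}, \hat{\Omega}) = \mathbf{n}_e^T(\vec{r})\, \boldsymbol{\varphi}_e(\hat{\Omega}), \tag{2.126}$$

and correspondingly

$$\phi(\vec{r}) = \mathbf{n}_e^T(\vec{r})\, \boldsymbol{\phi}_e. \tag{2.127}$$

Here, the vectors $\boldsymbol{\varphi}_e(\hat{\Omega})$ and $\boldsymbol{\phi}_e$ approximate the even-parity angular flux and the scalar flux, respectively, at the spatial mesh points. Combining these approximations with Eq. (2.125) then yields

$$F[\boldsymbol{\varphi}_e] = \sum_e \left\{ \int d\Omega\, \boldsymbol{\varphi}_e^T(\hat{\Omega}) \left[ \hat{\Omega}_k \hat{\Omega}_{k'}\, \sigma_e^{-1} \int_e dV\, (\nabla_k \mathbf{n}_e)(\nabla_{k'} \mathbf{n}_e^T) + \sigma_e \int_e dV\, \mathbf{n}_e \mathbf{n}_e^T \right] \boldsymbol{\varphi}_e(\hat{\Omega}) - \boldsymbol{\phi}_e^T\, \sigma_{se} \int_e dV\, \mathbf{n}_e \mathbf{n}_e^T\, \boldsymbol{\phi}_e - 2\boldsymbol{\phi}_e^T \mathbf{s}_e \right\}, \tag{2.128}$$

where $\mathbf{s}_e = \int_e dV\, \mathbf{n}_e s$. As in Eqs. (2.38) through (2.46), we assemble the elemental contributions into global vectors of angularly dependent coefficients through the use of the Boolean transformations $\boldsymbol{\varphi}_e(\hat{\Omega}) = \boldsymbol{\Xi}_e \boldsymbol{\varphi}(\hat{\Omega})$ and $\boldsymbol{\phi}_e = \boldsymbol{\Xi}_e \boldsymbol{\phi}$. The reduced functional is then

$$F[\boldsymbol{\varphi}] = \int d\Omega\, \boldsymbol{\varphi}^T(\hat{\Omega})\, \mathbf{A}_I(\hat{\Omega})\, \boldsymbol{\varphi}(\hat{\Omega}) - \boldsymbol{\phi}^T \mathbf{B}_I \boldsymbol{\phi} - 2\boldsymbol{\phi}^T \mathbf{s}_I, \tag{2.129}$$

where

$$\mathbf{A}_I(\hat{\Omega}) = \hat{\Omega}_k \hat{\Omega}_{k'} \sum_e \boldsymbol{\Xi}_e^T\, \sigma_e^{-1} \int_e dV\, (\nabla_k \mathbf{n}_e)(\nabla_{k'} \mathbf{n}_e^T)\, \boldsymbol{\Xi}_e + \sum_e \boldsymbol{\Xi}_e^T\, \sigma_e \int_e dV\, \mathbf{n}_e \mathbf{n}_e^T\, \boldsymbol{\Xi}_e, \tag{2.130}$$

$$\mathbf{B}_I = \sum_e \boldsymbol{\Xi}_e^T\, \sigma_{se} \int_e dV\, \mathbf{n}_e \mathbf{n}_e^T\, \boldsymbol{\Xi}_e, \tag{2.131}$$

and

$$\mathbf{s}_I = \sum_e \boldsymbol{\Xi}_e^T \mathbf{s}_e. \tag{2.132}$$

Requiring Eq. (2.129) to be stationary with regard to variations in $\boldsymbol{\varphi}(\hat{\Omega})$ then yields the Euler–Lagrange equation

$$\mathbf{A}_I(\hat{\Omega})\, \boldsymbol{\varphi}(\hat{\Omega}) = \mathbf{B}_I \boldsymbol{\phi} + \mathbf{s}_I. \tag{2.133}$$

We may obtain scalar flux equations by first inverting $\mathbf{A}_I$ and then integrating over angle. We obtain

$$\left[ \mathbf{I} - \int d\Omega\, \mathbf{A}_I^{-1}(\hat{\Omega})\, \mathbf{B}_I \right] \boldsymbol{\phi} = \int d\Omega\, \mathbf{A}_I^{-1}(\hat{\Omega})\, \mathbf{s}_I. \tag{2.134}$$

This equation has a number of similarities to those obtained using collision probability theory [28]. The dimension of the problem is only that of the number of spatial nodes, but the coefficient matrix on the left is both dense and nonsymmetric. Moreover, since analytical inversion of $\mathbf{A}_I(\hat{\Omega})$ represents an intractable task, numerical quadrature must be used, inviting comparison to the ray tracing methods used in integral transport schemes. Treating curved boundaries and other irregular features is straightforward using isoparametric finite elements. Moreover, the truncation errors are of order $h^2$ for the lowest-order finite elements (triangles with piecewise linear trial functions), compared to $h$ for collision probability methods, allowing coarser element meshes to be used.

2.5.3 Combined Methods

The variational nodal method as described in Section 2.5.1 is applicable to homogeneous nodes. However, by using finite element trial functions within the nodes, heterogeneous nodes may also be treated. Recently, this has been accomplished in what is called the subelement nodal method and applied to reactor problems in which each node corresponds to one pin-cell [48]. Likewise, the integral method described in Section 2.5.2 may be placed within the nodal framework, and this too has resulted in the ability to treat heterogeneous nodes, often with substantial savings in memory over what is required using the corresponding differential method [49].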
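As an illustration of the response matrix transformation of Section 2.5.1, the following minimal Python sketch builds R and B from given nodal matrices and checks that the resulting partial currents reproduce the interface relation of Eq. (2.122). It assumes NumPy is available; the function names and the random test matrices are hypothetical stand-ins chosen only for illustration, not quantities from any production nodal code.

```python
import numpy as np


def response_matrices(A, M):
    """Return (R, B) such that j_plus = R @ j_minus + B @ q.

    A is the (square, nonsingular) nodal matrix and M the matrix coupling
    nodal coefficients to interface moments; both are assumed given.
    """
    Mt_Ainv = np.linalg.solve(A.T, M).T      # M^T A^{-1} without forming A^{-1}
    G = Mt_Ainv @ M                          # M^T A^{-1} M
    identity = np.eye(G.shape[0])
    lhs = 0.5 * G + identity
    R = np.linalg.solve(lhs, 0.5 * G - identity)
    B = np.linalg.solve(lhs, 0.5 * Mt_Ainv)
    return R, B


def interface_moments(A, M, q, chi):
    """Even-parity interface moments from the source q and odd-parity moments chi."""
    return M.T @ np.linalg.solve(A, q) - M.T @ np.linalg.solve(A, M @ chi)


def consistency_residual(A, M, q, j_minus):
    """Largest mismatch between the response-matrix and interface-moment forms."""
    R, B = response_matrices(A, M)
    j_plus = R @ j_minus + B @ q
    chi = j_plus - j_minus                   # from j± = phi/4 ± chi/2
    phi = 2.0 * (j_plus + j_minus)
    return np.max(np.abs(phi - interface_moments(A, M, q, chi)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 6))
    A = X @ X.T + 6.0 * np.eye(6)            # symmetric positive definite stand-in
    M = rng.standard_normal((6, 4))
    print(consistency_residual(A, M, rng.standard_normal(6), rng.standard_normal(4)))
```

The linear solves are used in place of explicit inverses purely as a numerical convenience; for a symmetric positive definite A the matrix being inverted in R and B is itself well conditioned.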
2.6 Discussion

The foregoing sections attempt to provide an overview of the second-order methods that are finding increased use in neutron transport calculations. The focus has been placed upon the discretization of the transport equation rather than on the algorithms that are employed to solve the resulting sets of linear equations. The vast literature on iterative methods used to solve large sets of sparse matrix equations, and in particular the intense efforts being made to effectively utilize parallel computing to speed such computations, lies beyond the scope of this overview. In addition, the foregoing exposition has been limited to the linear neutron transport equation. No attempt has been made to include recent work in which second-order transport methods are combined with computational fluid mechanics or other techniques to include nonlinear thermal-hydraulic feedback mechanisms into neutronics calculations. Likewise, the scope of this work has not included techniques needed specifically in treating the unique characteristics of thermal radiation or of electron transport problems.

Work in progress seems destined to expand significantly the value of second-order methods. In particular, interest in the primal and dual coupled forms of the transport equations is beginning to break down the sharp distinction between first- and second-order methods. Perhaps the greatest challenge facing the expanded use of second-order methods, however, is in the treatment of void regions. Since the total cross section appears in the denominator of the even-parity equations, they cannot be applied directly to vacuum regions. Rather, they must be coupled with ray tracing or other techniques in situations where neutron streaming in voids must be incorporated into second-order computations.

References

1. Hassitt A (1968) Diffusion theory in two and three dimensions.
In: Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics, Chap. 2. Gordon & Breach, New York
2. Semenza LA, Lewis EE, Rossow EC (1972) Application of the finite element method to the multigroup neutron diffusion equation. Nucl Sci Eng 47:302
3. Hansen KF, Kang CM (1975) Finite element methods in reactor physics analysis. Adv Nucl Sci Tech 8:173
4. Kavenoky A, Lautard JJ (1977) A finite element depletion diffusion calculations method with space-dependent cross-sections. Nucl Sci Eng 64:563
5. Vladimirov VS (1961) Mathematical problems in the one-velocity theory of particle transport. Trans V. A. Steklov Mathematical Institute 61 (English translation: Atomic Energy of Canada Ltd., Ontario, 1963)
6. Kaplan S, Davis JA (1967) Canonical and involutory transformations of the variational problems of transport theory. Nucl Sci Eng 28:166–176
7. Zienkiewicz OC (1989) The finite element method, 4th edn. McGraw-Hill, London
8. Brezzi F, Fortin M (1991) Mixed and hybrid finite element methods. Springer-Verlag, New York
9. Strang G, Fix GJ (1973) An analysis of the finite element method. Prentice-Hall, Englewood Cliffs, NJ
10. Blomquist RN, Lewis EE (1980) A rigorous treatment of transverse buckling effects in two-dimensional neutron transport computations. Nucl Sci Eng 73:125
11. Dilber I, Lewis EE (1985) Variational nodal methods for neutron transport. Nucl Sci Eng 91:132
12. deOliveira CRC (1986) An arbitrary geometry finite element method for multigroup neutron transport with anisotropic scattering. Prog Nucl Energy 18:227
13. Carrico CB, Lewis EE, Palmiotti G (1992) Three-dimensional variational nodal transport methods for Cartesian, triangular and hexagonal criticality calculations. Nucl Sci Eng 111:223
14. Lewis EE, Carrico CB, Palmiotti G (1996) Variational nodal formulation for the spherical harmonics equations. Nucl Sci Eng 122:194
15.
Lillie RA, Robinson JC (1976) A linear triangle finite element formulation for multigroup neutron transport analysis with anisotropic scattering, ORNL/TM-5281. Oak Ridge National Laboratory
16. Jung J, Kobayashi NO, Nishihara N (1973) Second-order discrete ordinate Pl equations in multi-dimensional geometry. J Nucl Energy 27:577
17. Morel JE, McGhee JM (1995) A diffusion-synthetic acceleration technique for the even-parity Sn equations with anisotropic scattering. Nucl Sci Eng 120:147–164
18. Morel JE, McGhee JM (1999) A self-adjoint angular flux equation. Nucl Sci Eng 132:312–325
19. Lautard JJ, Schneider D, Baudron AM (1999) Mixed dual methods for neutronic reactor core calculations in the CRONOS system. In: Proc. Int. Conf. Mathematics and Computation, Reactor Physics and Environmental Analysis of Nuclear Systems, 27–30 Sept 1999, Madrid
20. Fedon-Magnaud C (1999) Pin-by-pin transport calculations with CRONOS reactor code. In: Proc. Int. Conf. Mathematics and Computation, Reactor Physics and Environmental Analysis of Nuclear Systems, 27–30 Sept 1999, Madrid
21. Akherraz B, Fedon-Magnaud C, Lautard JJ, Sanchez R (1995) Anisotropic scattering treatment for the neutron transport equation with primal finite elements. Nucl Sci Eng 120:187–198
22. Miller WF Jr, Lewis EE, Rossow EC (1973) The application of phase-space finite elements to the two-dimensional neutron transport equation in X-Y geometry. Nucl Sci Eng 52:12
23. Briggs LL, Miller WF Jr, Lewis EE (1975) Ray-effect mitigation in discrete ordinate-like angular finite element approximations in neutron transport. Nucl Sci Eng 57:205–217
24. Carlson BG, Lathrop KD (1968) Transport theory: the method of discrete ordinates. In: Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics, Chap. 3. Gordon & Breach, New York
25. Gelbard EM (1968) Spherical harmonics methods: PL and double PL approximations. In: Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics, Chap. 4.
Gordon & Breach, New York
26. Fletcher JK (1994) The solution of the multigroup neutron transport equation using spherical harmonics. Nucl Sci Eng 116:73
27. Fedon-Magnaud C, Lautard JJ, Akherraz B, Wu GJ (1995) Coarse mesh methods for the transport calculations in the CRONOS reactor code. In: Proc. int. conf. mathematics, computations, reactor physics and environmental analysis, 30 Apr–4 May 1995, Portland, Oregon
28. Lewis EE, Miller WF Jr (1984) Computational methods of neutron transport. Wiley, New York
29. Greenbaum A (1997) Iterative methods for solving linear systems. SIAM, Philadelphia, PA
30. Saad Y (1996) Iterative methods for sparse linear systems. PWS Publishing Co, Boston, MA
31. Demmel JW (1997) Applied numerical linear algebra. SIAM, Philadelphia, PA
32. Palmiotti G, Carrico CB, Lewis EE (1993) Variational nodal transport methods with anisotropic scattering. Nucl Sci Eng 115:233
33. Morse PM, Feshbach H (1953) Methods of theoretical physics. McGraw-Hill, New York
34. Gelbard EM (1960) Application of spherical harmonics method to reactor problems, WAPD-BT-20. Bettis Atomic Power Laboratory
35. Gelbard EM (1961) Simplified spherical harmonics equations and their use in shielding problems, WAPD-T-1182 (Rev. 1). Bettis Atomic Power Laboratory
36. Smith KS (1986) Multidimensional nodal transport using the simplified PL method. In: Proc. topl. mtg. reactor physics and safety, 17–19 Sept 1986, Saratoga Springs, New York, p 223
37. Smith KS (1991) Multi-dimensional nodal transport using the simplified PL method. In: Proc. ANS topl. mtg. advances in mathematics, computations, and reactor physics, 29 Apr–2 May 1991, Pittsburgh, PA
38. Pomraning GC (1993) Asymptotic and variational derivations of the simplified PN equations. Ann Nucl Energy 20:623
39. Larsen EW, McGhee JM, Morel JE (1993) Asymptotic derivation of the simplified PN equations. In: Proc. topl. mtg.
mathematical methods and supercomputers in nuclear applications, M&C + SNA’93, 19–23 Apr 1993, Karlsruhe, Germany
40. Larsen EW, Morel JE, McGhee JM (1995) Asymptotic derivation of the multigroup P1 and simplified PN equations. In: Proc. int. conf. mathematics and computations, reactor physics and environmental analysis, 30 Apr–4 May 1995, Portland, Oregon
41. Lewis EE, Palmiotti G (1997) Simplified spherical harmonics in the variational nodal method. Nucl Sci Eng 126:48
42. Noh T, Miller WF Jr, Morel JE (1996) The even-parity and simplified even-parity transport equations in two-dimensional x-y geometry. Nucl Sci Eng 123:38–56
43. Azmy YY (1988) The weighted diamond-difference form of the nodal transport methods. Nucl Sci Eng 98:29
44. Lewis EE, Miller WF Jr, Henry TP (1975) A two-dimensional finite element method for integral neutron transport calculations. Nucl Sci Eng 58:202
45. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and transport equations. Prog Nucl Energy 17:271
46. Palmiotti G, Lewis EE, Carrico CB (1995) VARIANT: VARIational Anisotropic Nodal Transport for multidimensional Cartesian and hexagonal geometry calculation, ANL-95/40. Argonne National Laboratory
47. Yang WS, Palmiotti G, Lewis EE (2001) Numerical optimization of computing algorithms for the variational nodal method. Nucl Sci Eng 139:174–185
48. Smith MA, Tsoulfanidis N, Lewis EE, Palmiotti G, Taiwo TA (2003) A finite subelement generalization of the variational nodal method. Nucl Sci Eng 144:36
49. Smith MA, Palmiotti G, Lewis EE, Tsoulfanidis N (2004) An integral form of the variational nodal method. Nucl Sci Eng 146:141

Professor Elmer E. Lewis received his B.S. in engineering physics (1960) and an M.S. (1962) and Ph.D. (1964) in nuclear engineering at the University of Illinois, Urbana.
He served as a captain in the US Army Ordnance Corps and as a Ford Foundation Fellow and assistant professor of nuclear engineering at MIT before joining Northwestern’s faculty in 1968. In addition to serving as chair of Northwestern’s Department of Mechanical Engineering (1987–1997), he has held appointments as visiting professor at the University of Stuttgart and Guest Scientist at the Nuclear Research Center at Karlsruhe, Germany. He has been a frequent consultant to Argonne and Los Alamos National Laboratories and to a number of industrial firms. A Fellow of the American Nuclear Society, and winner of its Mathematics and Computation Distinguished Service and Arthur Holly Compton Awards, Professor Lewis serves on the Editorial Boards of the journals Nuclear Science and Engineering and Transport Theory and Statistical Physics. He has held a number of offices in the American Nuclear Society, including chair of its Mathematics and Computation Division. His research resulted in significant advances in a wide variety of topics including neutronics computational methods, radiation transport, the physics and safety of nuclear systems, reliability and quality modeling, and Monte Carlo simulation. Professor Lewis has taught a wide range of courses in mechanical and nuclear engineering, ranging from freshman seminars to graduate-level offerings, and currently serves as his department’s undergraduate curriculum coordinator. In addition to undergraduate advising, he has been a primary supervisor to more than 20 Ph.D. and approximately 30 M.S. students, three of them winning the American Nuclear Society’s Mark Mills Award for their doctoral work. He is the author or co-author of nearly 200 journal articles and conference proceeding papers. He has written four engineering textbooks as well as a historical appreciation of engineering intended for a more general audience.
Chapter 3
Monte Carlo Methods
Jerome Spanier

3.1 Introduction

Monte Carlo methods comprise a large and still growing collection of methods of repetitive simulation designed to obtain approximate solutions of various problems by playing games of chance. Often these methods are motivated by randomness inherent in the problem being studied (as, e.g., when simulating the random walks of “particles” undergoing diffusive transport), but this is not an essential feature of Monte Carlo methods. As long ago as the eighteenth century, the distinguished French naturalist Comte de Buffon [1] described an experiment that is by now well known: a thin needle of length $l$ is dropped repeatedly on a plane surface that has been ruled with parallel lines at a fixed distance $d$ apart. Then, as Laplace suggested many years later [2], an empirical estimate of the probability $P$ of an intersection obtained by dropping a needle at random a large number, $N$, of times and observing the number, $n$, of intersections provides a practical means for estimating $\pi$. The relationship is

$$P = \frac{2l}{\pi d} \quad\text{or}\quad \pi \approx \frac{2l}{\hat{P}\, d},$$

where $\hat{P} = n/N$ and we assume that $l < d$.

We introduce another example that can be used to illustrate several important features of Monte Carlo simulation. This “model” transport problem, one of the simplest random walk problems one might imagine, can be solved completely without resorting to sampling at all and yet exhibits characteristics of problems that are typical of more complex particle transport. The study of such model problems is crucial in obtaining a deeper understanding of the basic principles that underlie Monte Carlo methods.

J. Spanier, Beckman Laser Institute, University of California, Irvine, California, USA; e-mail: jspanier@uci.edu
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_3
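Buffon's needle experiment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the default needle length and line spacing, and the seed are hypothetical choices, and the needle's angle is sampled using the library value of π, so the sketch demonstrates the estimator rather than an independent way of computing π.

```python
import math
import random


def buffon_pi_estimate(num_tosses, needle_length=1.0, line_spacing=2.0, seed=12345):
    """Estimate pi from simulated needle tosses (requires needle_length < line_spacing)."""
    if needle_length >= line_spacing:
        raise ValueError("the relationship P = 2l/(pi d) assumes l < d")
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_tosses):
        # Distance from the needle's center to the nearest line: uniform on [0, d/2].
        center = rng.uniform(0.0, line_spacing / 2.0)
        # Acute angle between the needle and the lines: uniform on [0, pi/2].
        angle = rng.uniform(0.0, math.pi / 2.0)
        if center <= 0.5 * needle_length * math.sin(angle):
            hits += 1
    if hits == 0:
        raise ValueError("no intersections observed; increase num_tosses")
    p_hat = hits / num_tosses                  # empirical intersection probability
    return 2.0 * needle_length / (p_hat * line_spacing)


if __name__ == "__main__":
    print(buffon_pi_estimate(200_000))
```

Because the estimate is a ratio involving the observed frequency, its statistical error shrinks only like the inverse square root of the number of tosses, which is the behavior the error analysis discussed later in the chapter is meant to quantify.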
We imagine particles (random walkers) that are assumed to impinge on the left face of a vertical slab of unit thickness. Each particle moves only to the right in steps selected at random from a uniform distribution on [0,1] until it escapes from the slab. Let $X$ be the number of steps required to escape. The problem is to estimate $E[X]$, the average number of steps to escape, where $E[X]$ is the expectation of the random variable $X$. For this problem, one expresses the expectation as an infinite series, the $n$th term of which is the product of $n$ and the probability $p_n$ that the particle escapes after exactly $n$ steps. This infinite series representation of $E[X]$ is precisely analogous to the Neumann series representation of the solution of the transport equation that describes this problem, as well as so many problems that are solved using Monte Carlo methods. Making use of the fact that for this simple problem,

$$p_n = \frac{1}{n\,(n-2)!} \quad\text{for } n = 2, 3, \ldots,$$

the infinite series can be summed exactly, which yields

$$E[X] = e \approx 2.71828,$$

and the variance of $X$ is

$$\sigma^2[X] = 3e - e^2 \approx 0.76579.$$

This very simple test problem provides a very useful vehicle for analyzing more complex Monte Carlo random walk problems for which exact values of the moments of key random variables will be unknown in general. Of course, much more accurate and efficient (deterministic) methods may be used to estimate both $\pi$ and $e$.
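The slab random walk lends itself to a short numerical check. The sketch below, with hypothetical function names and an arbitrary seed, simulates the walkers directly and also sums the series using the escape probabilities quoted above, so that the sampled mean and variance can be compared with the exact values.

```python
import math
import random


def steps_to_escape(rng):
    """Number of uniform [0,1) steps a walker needs to pass beyond the unit slab."""
    position, steps = 0.0, 0
    while position <= 1.0:
        position += rng.random()
        steps += 1
    return steps


def estimate_moments(num_walkers, seed=2024):
    """Sample mean, sample variance, and standard error of the mean of X."""
    rng = random.Random(seed)
    samples = [steps_to_escape(rng) for _ in range(num_walkers)]
    mean = sum(samples) / num_walkers
    variance = sum((x - mean) ** 2 for x in samples) / (num_walkers - 1)
    return mean, variance, math.sqrt(variance / num_walkers)


def series_moments(num_terms=60):
    """Mean and variance of X from the series with p_n = 1 / (n (n-2)!)."""
    mean = 0.0
    second_moment = 0.0
    for n in range(2, num_terms):
        p_n = 1.0 / (n * math.factorial(n - 2))
        mean += n * p_n
        second_moment += n * n * p_n
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    print(estimate_moments(100_000))
    print(series_moments(), (math.e, 3.0 * math.e - math.e ** 2))
```

Reporting the standard error alongside the sample mean corresponds to ingredient (3) in the list of key ingredients of the method: the sampled mean should fall within a few standard errors of the exact expectation.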
However, even these very simple simulation problems serve to illustrate a number of key ingredients of the Monte Carlo method: (1) the need to generate random samples drawn from a variety of probability distributions; (2) the need to express the outcomes of a Monte Carlo experiment as estimates of a theoretical expected value of some random variable; (3) the need to perform an error analysis based on statistical fluctuations of a random variable from sample to sample; and (4) where possible, the desirability of reducing the sample-to-sample fluctuations by thinking deeply about the inherent cause of these fluctuations and taking appropriate measures to reduce them. For example, in the needle-tossing experiment, why not toss a cruciform-shaped needle, or even one with many needles of the same length equally distributed around the circumference of a circle of radius $l$? This would seem to provide a more efficient experiment since each toss produces many possible intersections, yet each “spoke” retains the same distributional characteristics as each single needle toss in the original experiment. But clearly these more sophisticated “needles” produce correlated sampling results. How does this affect the statistical analysis of the outcomes? Some of these questions are addressed in the interesting references [3–5].

For the one-dimensional random walk problem, one might imagine that a more systematic sampling (instead of random sampling) of the unit interval to obtain step sizes for each step to the right would lead to reduced statistical fluctuations in the Monte Carlo estimate. Suppose, then, that one were to subdivide the unit interval into a large, but fixed, number, $S$, of equal subdivisions and choose as individual step sizes the midpoints, say, of these subintervals ordered deterministically. For example, one could move through these in order, going from the least to the greatest, to generate steps for the random walkers.
Pretty clearly, this choice is not a very good one since, if S is very large compared to the number of samples generated, there would be a bias in the direction of short steps. Perhaps one should average a small step with a large one or run through the midpoints randomly. Again, what about the correlation introduced? And what is the impact of any such scheme on the sample to sample variability? In addition to containing many of the same critical features of most Monte Carlo problems, these simple model problems illustrate some of the advantages of the Monte Carlo method: (a) its appeal to intuition; (b) its simplicity of implementation; and (c) its accessibility to nonexperts. However, as more and more sophistication is considered in an attempt to speed up the computation or to reduce the sampling variability per unit computing cost, the model for the experiment can depart more and more from intuitive plausibility, and the need for mathematical rigor is accentuated. A firm theoretical foundation becomes not only desirable, but also essential. There are many problems for which analytic or good deterministic methods are simply unavailable that benefit from being formulated stochastically. Particle transport problems provide a fertile field of examples of such problems, and this field will be our main emphasis here. Since the earliest applications of Monte Carlo methods to neutron transport problems, however, the number and range of applications have grown well beyond the bounds of a single book chapter. The field of operations research is rife with such examples, which we do not discuss here (see, e.g., [7]). We have deliberately ignored the discussion of developments in computer architecture (e.g., use of parallel or vectorized computation) and many of the rapidly growing list of important application areas, such as financial modeling, design of radiation therapy plans, Markov chain Monte Carlo, and others. 
Because of the literal explosion in the number of applications of Monte Carlo methods and the avalanche of publications dealing with both the theory and applications, there is no possibility of dealing with the subject comprehensively here. Finally, we presume that the reader is familiar with at least the rudiments of Monte Carlo. Several books and articles can provide an introduction to the subject matter (e.g., [8–13]). For those more familiar with at least the rudiments, a review of [14] would provide an appropriate introduction since our ultimate goal here is to update that article.

3.2 Organizing Principles

Initially, a historical context was suggested by the organizers for each of the Gelbard lectures. Each lecture was intended to survey one of the topics traditionally important to the Mathematics and Computation Division of the American Nuclear Society in such a way as to update the 1968 publication of [14]. While this context seemed to serve well for the lectures at the Gatlinburg conference – at least the chronology featured prominently in the oral accounts – a division according to subject content seemed more appropriate for the written version. Our hope is that this switch in perspective might make the chapter more useful as a reference. Accordingly, we have abandoned the idea of dividing the content into two historical periods, one prior to 1968 and the other afterward. Instead, the material here will be organized into what we perceive to be the key elements of Monte Carlo methods development: generating sequences, error analysis, error reduction, and theoretical foundations. Here is a brief overview of what the reader might expect to find.

3.2.1 Generating Sequences

Section 3.4, dealing with generating sequences, shows that the relatively simple algorithms used to create a sequence of unpredictable, “random” numbers for early use in simulation have evolved into several quite different streams of research.
These involve both modern-day successors to the earliest pseudorandom number generators and algorithms that are completely deterministic, focusing only on uniformity in a manner that forsakes the idea of stochastic independence altogether.

3.2.2 Error Analysis

For the analysis of errors in Monte Carlo output, we will discuss in Section 3.5 the evolution, in the case of pseudorandomly generated samples, from reporting the sample mean and standard deviation to the now fairly common use of higher moments and additional statistical tests (see [15]). When completely deterministic sequences are used in place of pseudorandom ones, and no probabilistic model is invoked, however, we will see that a markedly different error analysis must be applied.

3.2.3 Error Reduction

Perhaps the greatest concentration of effort to improve Monte Carlo methods has been devoted to the topic of error reduction in the last 50 years or so. Whether the simulation makes use of pseudorandom number sequences or their deterministic cousins – quasi-random sequences – highly sophisticated techniques have been developed that are capable of producing not only increased rates of convergence, but also much lower error levels for a fixed sample size. Some of these developments are described in Section 3.6.

3.2.4 Foundations/Theoretical Developments

In discussing this last of our “big four” topics in Section 3.7, we shall attempt to list the major advances in understanding the foundations of the subject. Our (admittedly biased) perspective is that these have had a great deal to do with the practical advances made since the earliest uses of simulation methods to solve complex problems. Following our discussion of the development of each of these four major themes, we try to formulate, at the close of each section, a succinct summary of the present state of the art for each such theme.
We end this chapter with a short list, presented in Section 3.8, of major challenges that might serve to stimulate further thinking.

3.3 Historical Perspectives

Before we pursue our discussion of each theme, we provide an overview of the early history of the subject. Much has already been written about the development of a nuclear weapons program whose history began with the famous experiments conducted during World War II at the University of Chicago. These culminated on December 2, 1942 with the first controlled nuclear chain reaction on a squash court situated beneath Chicago’s football stadium. Working later in Los Alamos, New York, Oak Ridge, and other locations, Enrico Fermi and others (for instance, Stanislaw Ulam, John von Neumann, Robert Richtmyer, and Nicholas Metropolis) played an important part in solving the problems connected with the development of the atomic bomb. This work depended crucially on rudimentary numerical simulations of multiplying neutron populations. It is widely held that the first paper on the Monte Carlo method was [16]. The marriage of relatively unsophisticated simulation methods with the development of automatic digital computers provided the key ingredients for success in this rapidly expanding wartime undertaking.

Shortly after the end of World War II another major effort was initiated, spearheaded by Admiral Hyman Rickover, to design and implement nuclear propulsion systems for the Navy. This highly successful program began in the 1940s, with the first test reactor started up in the United States in 1953. The first nuclear-powered submarine – the USS Nautilus – put to sea in 1955. This development marked the transition of submarines from slow underwater vessels to warships capable of sustaining 20–25 knots while submerged for weeks at a time.
The success of the Nautilus effort led to the development of additional submarines, each powered by a single pressurized water reactor, and an aircraft carrier, the USS Enterprise, powered by eight reactor units, in 1960. A cruiser, the USS Long Beach, was placed into service in 1961 and by 1962 the United States boasted a nuclear fleet of 26 operational nuclear submarines with 30 more under construction. Today, the US Navy operates more than 80 nuclear-powered ships including 11 aircraft carriers and a number of cruisers.

The postwar activities to produce nuclear propulsion units for a nuclear navy were concentrated primarily at the Westinghouse-managed Bettis Atomic Power Laboratory in suburban Pittsburgh, PA and the General Electric-managed Knolls Atomic Power Laboratory in Schenectady, New York during the 1950s and 1960s. In that same time frame there was a very rapid expansion in the capability of digital computers at these and at the nation’s National Laboratories at Los Alamos, Oak Ridge, Brookhaven, Argonne, and Livermore. The development of the giant (using nearly 18,000 vacuum tubes) ENIAC machine by John W. Mauchly and J. Presper Eckert at the University of Pennsylvania during the period 1943–1946 and the EDVAC computer, for which conceptual design was completed in 1946 but which was not fully operational until 1952, led the way. It is perhaps no coincidence that von Neumann and Metropolis were heavily involved in both the development of modern computing machines and the Monte Carlo method as a practical numerical method. As a result of this dual evolution of science and technology, the same period was marked by a dramatic increase in the levels of sophistication of computations in support of the nuclear energy program, including the design and development of reactors for peacetime uses. The commercial development of nuclear power-generating plants provided added incentives for acceleration of this effort.
As a result, Monte Carlo methods began to find greater use as a design tool and as a partial replacement for the much more expensive (and risky!) criticality experiments that were needed to validate the various nuclear designs.

An unfortunate consequence of the fact that much of the work done was classified during this early period is that publication occurred mainly in classified government reports rather than in the open literature. This undoubtedly prevented many important ideas from becoming known to a much wider audience sooner. Indeed, many of these reports, such as the seminal work of Herman Kahn [17, 18], were never republished.

3.4 Generating Sequences

It seems appropriate to begin our review of this topic with an often cited quotation due to R.R. Coveyou [19]: “Random number generation is too important to be left to chance.” Indeed, this statement, which was certainly valid in 1969 when a good deal of effort was being devoted to understanding how to generate high quality pseudorandom numbers, is even more accurate today. This is so because there are very sophisticated new methods for generating pseudorandom sequences as well as methods for generating completely deterministic uniform sequences that serve in place of pseudorandom ones. We will deal with such deterministic sequences¹ in Section 3.4.2.

¹ Of course, even pseudorandom sequences are deterministic, a fact that has stirred some debate about whether a probabilistic analysis made any sense at all for pseudorandomly implemented Monte Carlo. This, in turn, was one of the motives for developing a mathematically rigorous analysis divorced from probability theory and based only on a notion of uniformity arising out of number-theoretic considerations.

3.4.1 Pseudorandom Sequences

Pseudorandom numbers are commonly understood to be computer substitutes for “truly random” numbers.
Their function is to simulate realizations of independent, identically distributed (iid) random variables on the unit interval. Other distributions that might be needed in a stochastic simulation are then obtained by transformation methods, of which there are many (see [20] and the software package C-Rand [21, 22]). The most common early source of pseudorandom numbers was the multiplicative congruential generator

$$\xi_{i+1} \equiv a\,\xi_i \pmod m$$

originally suggested by Lehmer [23], or one incorporating an additive component

$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod m.$$

The pseudorandom numbers themselves are then defined through division by the modulus:

$$r_i = \xi_i/m \in [0,1].$$

When employing numbers generated by such linear congruential generators, appropriately chosen “seeds” $\xi_0$, moduli $m$, and multipliers $a$ provide the means to generate reasonably high quality pseudorandom sequences, with sufficiently long periods for many problems (see [24]).

Following Lehmer’s original suggestion to use such sequences for simulation, substantial effort and analysis were invested in assessing their imperfections, such as the persistence of serial correlation [25, 26]. Knuth [24] devised a number of statistical tests to bolster confidence in using pseudorandom sequences for generating samples from the many distributions required during the course of a simulation experiment. For much of the early history of Monte Carlo computations, the emphasis was on obtaining results, not on looking carefully at the theoretical underpinnings of the computations.
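As a minimal sketch of the congruential recurrence and the division by the modulus described above, the following Python fragment may help. The function name `lcg` is hypothetical, and the default multiplier 16807 with modulus 2^31 − 1 is one widely used multiplicative pair, chosen here purely for illustration.

```python
from itertools import islice


def lcg(seed, a=16807, b=0, m=2**31 - 1):
    """Yield r_i = xi_i / m for the recurrence xi_{i+1} = (a*xi_i + b) mod m.

    The defaults (a = 16807, b = 0, m = 2^31 - 1) are an illustrative,
    widely used multiplicative choice; they are an assumption of this sketch.
    """
    xi = seed % m
    while True:
        xi = (a * xi + b) % m   # congruential update of the integer state
        yield xi / m            # scale the state into the unit interval


# First few pseudorandom numbers started from the seed xi_0 = 1.
first_five = list(islice(lcg(1), 5))
```

Because the state is updated with exact integer arithmetic, the same seed always reproduces the same stream, which is the property that makes test problems repeatable.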
When linear congruential generators were used as a source of nearly independent, nearly uniform realizations of iid uniform random variables on the unit interval, it seemed sufficient to ensure that the pseudorandom sequences had sufficiently long periods and serial correlation properties that seemed “safe.” The ideal desiderata for pseudorandom sequences were maximal periods and minimal “structure.” Here, the word structure is being used nearly synonymously with predictability. And indeed, recursive multiplications of large integers produce, as remainders, numbers that work surprisingly well in this regard for many Monte Carlo problem applications.

In 1967, however, George Marsaglia published his seminal paper [27] with the surprising conclusion that all multiplicative congruential generators possess a defect that makes them unsuitable for use in many applications. Furthermore, this defect cannot be eliminated by adjusting the starting values, multipliers, or moduli of the congruence. The fatal flaw revealed by Marsaglia’s paper is that if successive n-tuples $(u_1, u_2, \ldots, u_n)$, $(u_2, u_3, \ldots, u_{n+1})$, … produced by the generator are treated as points in a unit cube of dimension n, then all of these points will lie on a relatively small number of parallel hyperplanes. If one plots in two or three dimensions a set of successive pairs or triples of numbers obtained from a linear congruential generator, this hyperplane structure is readily apparent. In applications requiring as much uniformity as possible in two (or more) dimensional space, not just on each axis separately, this would be a crucial factor.
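The hyperplane structure can also be checked numerically rather than by plotting. The sketch below uses the multiplicative generator with a = 65539 and m = 2^31 (commonly known as RANDU), which is not discussed in the text and is chosen only because its defect is easy to verify: since 65539² ≡ 6·65539 − 9 (mod 2^31), every successive triple satisfies u_{k+2} − 6u_{k+1} + 9u_k = integer, so all triples fall on at most 15 parallel planes. The function name `randu_states` is hypothetical.

```python
def randu_states(seed, count, a=65539, m=2**31):
    """Return `count` successive integer states of x_{k+1} = a*x_k mod m."""
    states, x = [], seed
    for _ in range(count):
        x = (a * x) % m
        states.append(x)
    return states


M = 2**31
xs = randu_states(seed=1, count=10_000)

# For each successive triple, x_{k+2} - 6*x_{k+1} + 9*x_k is an exact
# multiple of m; that multiple identifies the plane containing the triple.
plane_ids = set()
for x0, x1, x2 in zip(xs, xs[1:], xs[2:]):
    combo = x2 - 6 * x1 + 9 * x0
    assert combo % M == 0
    plane_ids.add(combo // M)

number_of_planes = len(plane_ids)  # cannot exceed 15
```

Working with the integer states instead of the scaled values keeps the test exact, so the plane count is not affected by floating-point rounding.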
To see the practical consequence, it is easy to imagine specific piecewise linear functions of a single variable whose integrals, estimated crudely by selecting points randomly in the unit square and counting the fraction that fall below the graph of the function, would not be estimated very accurately even after many samples were generated.

After nearly 20 years of essentially unquestioning use of such generators, the stage had been set for an explosion of effort and analysis to create improved pseudorandom sequences as well as methods that do not rely on randomness at all. After all, as John von Neumann stated, “Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin” (quoted in [24]).

There are many excellent sources of additional material about pseudorandom numbers. Knuth’s book [24] may be the best reference for a rigorously mathematical discussion. A more recent reference, which belongs in any basic library on the subject, is Harald Niederreiter’s book [10], based on lectures he gave at a CBMS-NSF Regional Conference held at the University of Alaska, Fairbanks in 1990. In addition to these standard reference works, there are now several Internet web sites that provide not only lists of references but also links to various other sites, algorithms, and a variety of other relevant information. A good choice among these is the one maintained at the University of Salzburg at http://random.mat.sbg.ac.at.

Two important developments characterize the past 30 years of progress with respect to generating sequences. The first of these is the exhaustive, rigorous analysis of uniform pseudorandom number algorithms, including linear and nonlinear congruential generators, and the emergence of a host of other ideas for generating sequences that behave nearly randomly. The second development, which occurred more or less in parallel with the first, was much more radical.
This involved the abandonment of randomness as a requirement for decision making in favor of reliance on uniformity in an appropriately defined high-dimensional space. In this latter area of research, number-theoretic methods are used for generating and analyzing optimally regular sequences. The use of such “quasi-random” numbers, as opposed to pseudorandom ones, can in turn produce more rapid convergence of sample means to theoretical means for many practical problems (see Section 3.5.2).

Following the discovery by Marsaglia of the parallel hyperplanes phenomenon characteristic of linear congruential generators, it became clear that many simulations could be adversely affected by this behavior. As a result, much effort was devoted to careful theoretical analyses of the behavior and fundamental properties of uniform pseudorandom number generators and algorithms. Excellent material describing these developments can be found in [28–34] and in [10] and [35, 36]. A detailed exposition of this topic alone is beyond the scope of this chapter. The reference [28] lists 147 relevant papers, most of them published during the 10-year interval, 1985–1995. However, we try to summarize the salient facts here to provide a flavor of the material to be found in the literature. Taking into account the important, indeed pivotal, role played by pseudorandom numbers in any Monte Carlo simulation, the significance of these investigations can hardly be overemphasized.

Computer algorithms for generating uniform pseudorandom numbers all yield periodic sequences. Naturally, it is desirable that these sequences have long periods, good equidistribution properties, minimal serial correlation, and as little intrinsic structure as possible. Obviously, it is also important that the algorithm be amenable to a fast computer implementation.
The purpose of a pseudorandom number generator is to simulate independent, identically distributed (iid) random variables that are uniform in the unit interval. Ideally, stringing s such numbers together in sequence would then simulate iid uniform random variables in $[0,1]^s$ for every integer s. Clearly, this cannot hold for every s since there are, in any event, only finitely many numbers available from any pseudorandom generator. It remains, then, to find sensible criteria for assessing the quality of a given pseudorandom generator and to construct specific generators that satisfy these criteria. Presumably, this will provide a measure of the evenness of the distribution of successive s-tuples in all dimensions s up to some sufficiently large integer or some carefully selected subset of dimensions, if these can be identified.

A common quantification of quality is the discrepancy between the empirical distribution (i.e., one based on sample, rather than theoretical, averages) of a point set (such as the set of all s-tuples of pseudorandom numbers) and the uniform distribution over $[0,1]^s$. This same notion of discrepancy will play a fundamental role in analyzing so-called quasi-random sequences: sequences chosen solely based on their uniformity properties rather than on any probabilistically inspired ones.

Definition 1. For any N points $u_0, u_1, \ldots, u_{N-1}$ in the s-dimensional unit cube $I^s = [0,1]^s$, $s \ge 1$, their discrepancy is defined by

$$D_N(u_0, u_1, \ldots, u_{N-1}) = \sup_J |E_N(J) - V(J)|$$

where the supremum is extended over all subintervals $J$ of $I^s$ with one vertex at the origin, $E_N(J)$ is $N^{-1}$ times the number of points among the $u_0, u_1, \ldots, u_{N-1}$ that lie in $J$, and $V(J)$ is the s-dimensional volume of $J$.

Evidently, the first study of the concept of discrepancy is in a paper of Bergstrom [37]. The term “discrepancy” was most likely coined by van der Corput and the first intensive study of it appeared in a paper of van der Corput and Pisot [38].
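For very small point sets, the discrepancy of Definition 1 can be evaluated by brute force. The sketch below is one way to do this in Python, assuming the points lie in $[0,1)^s$. It relies on the fact that the supremum over anchored boxes is attained at corners built from the point coordinates (and 1), provided each corner is checked with both the open and the closed box. The function name `star_discrepancy` is hypothetical, and the cost grows so quickly with N and s that this is only a teaching aid.

```python
from itertools import product


def star_discrepancy(points):
    """Discrepancy of Definition 1 by brute force (small point sets only).

    Assumes every point lies in [0, 1)^s. The cost grows roughly like
    (N + 1)^s * N, so this is practical only for tiny examples.
    """
    n = len(points)
    s = len(points[0])
    # Candidate corner coordinates along each axis: point coordinates and 1.
    axes = [sorted({p[k] for p in points} | {1.0}) for k in range(s)]
    worst = 0.0
    for corner in product(*axes):
        volume = 1.0
        for t in corner:
            volume *= t
        open_count = sum(
            all(p[k] < corner[k] for k in range(s)) for p in points
        )
        closed_count = sum(
            all(p[k] <= corner[k] for k in range(s)) for p in points
        )
        worst = max(worst, closed_count / n - volume, volume - open_count / n)
    return worst


# A single point at the centre of the unit square.
d_single = star_discrepancy([(0.5, 0.5)])
```

For the single centred point, the worst box is one that just contains the point: it holds all of the sample but only a quarter of the area.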
The discrepancy provides the statistical test quantity for the s-dimensional Kolmogoroff test, which is a goodness-of-fit test for the empirical distribution of initial segments of a pseudorandom sequence. For fixed N and theoretically random $(u_0, u_1, \ldots, u_{N-1}) \in [0,1]$, the distribution of $D_N(u_0, u_1, \ldots, u_{N-1})$ is known, which gives rise to various formulas for

$$\mathrm{Prob}\{D_N(u_0, u_1, \ldots, u_{N-1}) \le t\}, \quad 0 \le t \le 1.$$

For $s = 1$ one obtains a test of one-dimensional uniformity (equidistribution in [0,1]), while for $s > 1$ one obtains a test of higher dimensional uniformity (which encompasses the independence, or lack thereof, of successive pseudorandom numbers).

As we saw earlier, the classical method for generating pseudorandom numbers is the linear congruential method

$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod m,$$

where $m$ is always chosen to be a large integer and $\xi_0$, $a$, and $b$ are suitably chosen integers. Algorithms based on this simple idea have been widely studied [10, 24, 34, 40–43], especially in the past 40 years. In spite of well-founded criticism of linear congruential generators stemming from the hyperplane structure they exhibit, they remain the most commonly used in practice, especially in standard computer software libraries. There are several reasons for this state of affairs:

• Linear congruential algorithms are fast. Alternative generators are essentially always considerably slower.
• Many large Monte Carlo programs that undergo periodic revision rely on reproducing output from test problems run with previous versions of the program. Introducing change in the basic pseudorandom sequence employed would destroy the reproducibility of results obtained with the earlier versions of the program.
• Those responsible for maintaining Monte Carlo programs may be unaware of the potential danger in using pseudorandom number generators that were devised many years earlier, generators that may well have outlived their usefulness.
Unfortunately, many of the “default” generators currently available in popular computer software are old and could be dangerous to use in modern applications that require much larger sample sizes than might have been needed earlier. An increasing demand for parallel or vector pseudorandom streams of numbers has aggravated the problem as well since these often make use of subsequences of sequences that may not have sufficiently long periods to justify this subdivision.

For example, consider the simple generator

$$\xi_{i+1} \equiv [a\,\xi_i + b] \pmod m \quad\text{with}\quad r_i = \xi_i/m$$

and the popular choices $m = 2^{31} - 1$, $a = 16{,}807$. The period length is $2^{31} - 2$, which is judged to be too small for serious applications [44, 45]. Furthermore, this generator has a lattice structure with rather large distances between the small number of hyperplanes that contain the higher dimensional s-tuples, which could easily prove to be a problem in simulation results [46, 47].

One method for controlling the lattice structure is to combine linear recursive generators [48]. For example, beginning with two such linear recurrences

$$x_{1,n} = (a_{1,1}\,x_{1,n-1} + \cdots + a_{1,k}\,x_{1,n-k}) \pmod{m_1}$$
$$x_{2,n} = (a_{2,1}\,x_{2,n-1} + \cdots + a_{2,k}\,x_{2,n-k}) \pmod{m_2}$$

where $k$, the $m_j$’s, and the $a_{i,j}$’s are fixed integers, one can define the pseudorandom numbers by

$$r_n = (x_{1,n}/m_1 - x_{2,n}/m_2) \pmod 1.$$

This is an example of a combined linear multiple recursive generator. At step n this generator produces the 2k-dimensional vector $(x_{1,n}, \ldots, x_{1,n-k+1}, x_{2,n}, \ldots, x_{2,n-k+1})$, whose first k components lie in $\{0, 1, \ldots, m_1 - 1\}$ and whose last k components lie in $\{0, 1, \ldots, m_2 - 1\}$. It is shown in [24, 44] that this generator gives good coverage of the unit hypercube $[0,1]^s$ for all dimensions $s \le k$. For dimensions higher than that there is the usual lattice structure, but parameters can be chosen that make the distance between the parallel hyperplanes of this lattice quite small.
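The combination of two linear recurrences can be sketched as follows. This is an illustration only: the function name `combined_mrg` is hypothetical, and the toy coefficients, moduli, and seeds are made up to keep the arithmetic readable. They are far too small for real use and are not parameters recommended in any of the cited references.

```python
from collections import deque
from itertools import islice


def combined_mrg(coeffs1, coeffs2, m1, m2, seed1, seed2):
    """Yield r_n = (x1_n/m1 - x2_n/m2) mod 1 from two order-k recurrences.

    coeffs_j = (a_{j,1}, ..., a_{j,k}); seed_j holds the k most recent
    states, newest first. All arguments are supplied by the caller.
    """
    state1, state2 = deque(seed1), deque(seed2)
    while True:
        x1 = sum(a * x for a, x in zip(coeffs1, state1)) % m1
        x2 = sum(a * x for a, x in zip(coeffs2, state2)) % m2
        # Push the new states and drop the oldest, keeping k values each.
        state1.appendleft(x1)
        state1.pop()
        state2.appendleft(x2)
        state2.pop()
        yield (x1 / m1 - x2 / m2) % 1.0


# Toy parameters (hypothetical, far too small for real use), with k = 2.
toy = combined_mrg((5, 3), (7, 2), m1=101, m2=103, seed1=(1, 2), seed2=(3, 4))
sample = list(islice(toy, 5))
```

With production-sized moduli the floating-point subtraction can round to a value indistinguishable from 1, so a careful implementation would combine the integer states before scaling.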
L’Ecuyer [47] recommends parameter choices that produce a generator with two main cycles of length $2^{192}$ each and whose lattice structure in dimensions up to 48 has been found to be excellent. This would seem to make it a good choice for many applications.

The lattice structure that Marsaglia [27] discovered plagues every linear congruential generator. Inversive congruential generators (introduced by Eichenauer and Lehn [48] in 1986) were designed to overcome this difficulty. By analogy with linear congruential methods, inversive congruential sequences are defined by the nonlinear congruence

$$\xi_{i+1} \equiv [a\,\bar{\xi}_i + b] \pmod p$$

with

$$\xi_i\,\bar{\xi}_i \equiv 1 \pmod p \quad\text{and}\quad r_i = \xi_i/p \in [0,1].$$

In these equations, $p$ is a prime modulus, $a$ is a multiplier, $b$ is an additive term, and $\xi_0$ is a starting value. From these definitions it follows that the $\xi_i$ take values in the set $\{0, 1, \ldots, p - 1\}$. We denote this generator by ICG$(p, a, b, \xi_0)$. A key feature of the ICG with prime modulus is the absence of any lattice structure, in sharp contrast to linear congruential generators. Figure 3.1 is a plot of pairs of consecutive pseudorandom numbers $(r_n, r_{n+1})$ generated by ICG$(2^{31} - 1, 1288490188, 1, 0)$ concentrated in a region of the unit square near the point (0.5, 0.5).

[Fig. 3.1 Points generated by an inversive congruential generator; both axes span 0.4996 to 0.5004]

The extra inversion step dramatically reduces the effect of the parallel hyperplanes phenomenon. For example, an inversive congruential algorithm modulo $p = 2^{31} - 1$ passes a rather stringent s-dimensional lattice test for all dimensions $s \le 2^{30}$, whereas, for the linear congruential algorithm with this same modulus, it is difficult to guarantee a nearly optimal lattice structure for $s \ge 10$ [10]. Inversive congruential algorithms also display better behavior for a rather severe test of serial correlation. They also enjoy robustness with respect to the choice of parameters. These algorithms are promising candidates for parallelization because, unlike linear congruential generators, they do not have long-range correlation problems. The only downside to their use is that they are substantially costlier to produce (by approximately a factor of 8) than their linear congruential counterparts because of the relative costliness of computing multiplicative inverses in modular arithmetic.

In addition to these linear and inversive congruential generators for uniform pseudorandom numbers, several other classes have been studied, some quite extensively, in an attempt to circumvent the problems perceived with the classical ones. Of course, no single generator can prove to be ideal for all applications. Any single generator can satisfy only a finite number of randomness tests. This, along with the finiteness of the set of computer numbers, means that a test can always be devised that cannot be passed by a specific generator. It has been said that pseudorandom number generators are like antibiotics in that respect: no single one is appropriate for all tasks. What is needed is an arsenal of possible choices with distinct properties. If two very different generators yield the same outcome in a simulation, then additional confidence is gained in the result.

In addition to the generator families mentioned above, there are lagged Fibonacci, generalized feedback shift register, matrix, Tausworthe and other classes of generators, and combinations of these. There are also various nonlinear generators. Literally thousands of publications dealing with these topics have appeared in print in the last 30 years. In [49], L’Ecuyer suggests the following generator families as having the essential requirements of good theoretical support, extensive testing, and ease of use: the Mersenne twister [50], the combined multiple recursive generators of L’Ecuyer [47], the combined linear congruential generators of L’Ecuyer and Andres [51], and the combined Tausworthe generators of L’Ecuyer [52].
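The inversive congruential recurrence described earlier in this section can be sketched in a few lines of Python. The function name `icg` is hypothetical, and treating the inverse of 0 as 0 is an assumption of this sketch (some convention is needed because 0 has no multiplicative inverse modulo p). The parameters are those quoted for Fig. 3.1.

```python
from itertools import islice


def icg(p, a, b, seed):
    """Yield r_i = xi_i / p for xi_{i+1} = (a * inv(xi_i) + b) mod p.

    inv(x) is the multiplicative inverse of x modulo the prime p; the
    convention inv(0) = 0 is an assumption of this sketch.
    """
    xi = seed % p
    while True:
        inverse = pow(xi, -1, p) if xi else 0   # modular inverse (Python 3.8+)
        xi = (a * inverse + b) % p
        yield xi / p


# Parameters quoted for Fig. 3.1: ICG(2^31 - 1, 1288490188, 1, 0).
P = 2**31 - 1
values = list(islice(icg(P, 1288490188, 1, 0), 1000))
```

The modular inverse computed at every step is the extra work, relative to a linear congruential update, that accounts for the higher cost noted above.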
More information can be found on the web pages:

http://www.iro.umontreal.ca/~lecuyer
http://random.mat.sbg.ac.at
http://cg.scs.carleton.ca/~luc/rng.html
http://www.robertnz.net/

3.4.2 Quasirandom Sequences

In the past 35 years or so there has been a surge in the number of publications (see, e.g., [10, 53–58]) that recommend the use of quasi-random sequences (i.e., sequences more regular than pseudorandom ones) in place of pseudorandom sequences for difficult problems. There is no universally accepted, rigorous definition of the term “quasi-random sequence,” but it has come to mean sequences with low discrepancy (see Definition 1 of Section 3.4.1). Indeed, the terms quasi-random and low-discrepancy sequences are frequently used synonymously. These quasi-Monte Carlo methods, as they are sometimes called, have become the methods of choice recently for many problems involving financial modeling [59–62], radiosity and global illumination problems [54, 55, 63, 64], and other applications.

Quasi-random methods offer the potential for improved asymptotic (i.e., for sufficiently large sample size) convergence rates when compared with pseudorandom methods and have performed even better than can easily be explained by existing theory in many applications. Additionally, deterministic rather than statistical error analysis can be applied to their use, even though sharp error bounds are not easily obtained. Consequently, a sizeable research effort is presently devoted to obtaining a deeper understanding of the potential of quasi-Monte Carlo methods.

In spite of this potential, most traditional nuclear applications continue to depend upon Monte Carlo programs, such as MCNP [15], that rely on pseudorandom sequences. An important reason for this is that a completely different (and more complex) error analysis must be applied for simulations based on quasi-random sequences.
Also, the generation of quasi-random sequences is costlier, in general, than that for pseudorandom sequences. In return for the extra computation, quasi-random sequences offer the prospect of accelerated convergence rates when compared with pseudorandom Monte Carlo. There is sufficient complexity involved in deciding which method might be best to use for a given problem, however, that no hard and fast rule can be applied. In any case, the development of simulation methods based on the use of quasi-random sequences represents, in our view, one of the most important lines of Monte Carlo research since 1968.

As stated above, the term low-discrepancy sequences is normally used in place of quasi-random sequences to characterize the highly regular sequences used in quasi-Monte Carlo implementations. The definition of discrepancy (Section 3.4.1) provides a quantitative measure of the regularity of a sequence of s-dimensional points. For the case $s = 1$ we are concerned with either a finite set $u_0, u_1, \ldots, u_{N-1}$ or with the first N members of an infinite sequence $u_0, u_1, \ldots$ of points drawn from the unit interval [0,1]. In this case the discrepancy reduces to

$$D_N(u_0, u_1, \ldots, u_{N-1}) = \sup_J |E_N(J) - V(J)|,$$

where the supremum is extended over all subintervals $J$ of [0,1] with one vertex at the origin, $E_N(J)$ is $N^{-1}$ times the number of points among the $u_0, u_1, \ldots, u_{N-1}$ that lie in $J$, and $V(J)$ is simply the length of the interval $J$.

A useful example of a low-discrepancy infinite sequence in [0,1) is the Van der Corput sequence [65], which is defined by

$$\phi_2(n) = \sum_{j=0}^{N} a_j(n)\,2^{-j-1}, \tag{3.1}$$

where the $a_j(n)$ are the binary digits of $n$:

$$n = \sum_{j=0}^{N} a_j(n)\,2^{j}. \tag{3.2}$$

These formulas produce $\phi_2(1) = 1/2$, $\phi_2(2) = 1/4$, $\phi_2(3) = 3/4$, $\phi_2(4) = 1/8$, $\phi_2(5) = 5/8, \ldots$ and the numbers $\{\phi_2(n)\}_{n=1}^{\infty}$ systematically run through the multiples of $2^{-k}$ without duplicating any that arose earlier. Such numbers are much more uniformly distributed in the unit interval than are pseudorandom numbers.
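Equations 3.1 and 3.2 amount to writing n in binary and reflecting its digits about the radix point. A minimal Python sketch is given below; the function name `radical_inverse` and its optional `base` argument are conveniences introduced for this illustration, not notation from the text.

```python
def radical_inverse(n, base=2):
    """Reflect the base-`base` digits of n about the radix point.

    With base=2 this is the Van der Corput value of Eqs. 3.1-3.2; the
    `base` argument is a convenience added in this sketch.
    """
    value, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)   # peel off the lowest digit a_j(n)
        value += digit * scale       # it contributes a_j(n) * base^-(j+1)
        scale /= base
    return value


# First five terms: 1/2, 1/4, 3/4, 1/8, 5/8, as listed in the text.
van_der_corput = [radical_inverse(n) for n in range(1, 6)]
```

Each new term lands in the largest gap left by the earlier ones, which is why the sequence fills the unit interval so evenly.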
In similar fashion, one can define the radical inverse function for any number base b by

$$\phi_b(n) = \sum_{j=0}^{N} a_j(n)\,b^{-j-1}, \tag{3.3}$$

where the $a_j(n)$ are now the digits of $n$ in base $b$; it enjoys properties very similar to the $b = 2$ case when b is a prime larger than 2. That is, $\{\phi_b(n)\}_{n=1}^{\infty}$ systematically runs through the multiples of $b^{-k}$ without duplication for any prime b.

The Halton sequence [66] is an infinite, s-dimensional, low-discrepancy sequence defined by $\{\phi_{b_1}(n), \phi_{b_2}(n), \ldots, \phi_{b_s}(n)\}$, where $b_1, b_2, \ldots, b_s$ are relatively prime in pairs (e.g., the first s primes). It is useful for generating very uniform s-dimensional vectors, as when random walks in an s-dimensional phase space are required. In Fig. 3.2, a visual comparison is made of 2,000 pseudorandom pairs (left) with 2,000 Halton pairs (right).

[Fig. 3.2 Visual comparison of pseudorandom (left) and quasi-random (right) sequences]

Another family of low-discrepancy sequences, called lattices, has been especially useful for integrating periodic functions. Their ancestor is the number-theoretic method of good lattice points, developed by Korobov [67] and Hlawka [68] for the approximate evaluation of integrals over $I^s = [0,1]^s$ under the assumption that the integrand is 1-periodic in each variable. Lattice methods, or lattice rules, generalize and extend this early work making use of algebraic, rather than number-theoretic, principles and techniques. Excellent references for lattice rules are the books [10, 69, 70].

Many other low-discrepancy sequences have been used in various quasi-Monte Carlo simulations. The reader is referred to [10] for a more thorough treatment of this general topic.

3.4.3 Hybrid Sequences

Hybrid sequences are meant to combine the best features of both pseudorandom sequences (convergence rate is independent of the problem dimension) and quasi-random sequences (asymptotic rate of convergence is faster than $N^{-1/2}$ but weakly dependent on dimension).
Ideas for generating hybrid sequences rely, in general, on combining both random and quasi-random elements in a single sequence. For example, randomly scrambling the elements of a low-discrepancy sequence or restricting the use of the low-discrepancy component to a lower dimensional portion of the problem and filling out the remaining dimensions (“padding”) with pseudorandom sequence elements can be effective strategies. Thus, Spanier [58] introduces both a “scrambled” and a “mixed” sequence based on these ideas, Owen [71] describes a method for scrambling certain low-discrepancy sequences called nets [10], Faure [72] describes a method for scrambling the Halton sequence to achieve lowered discrepancy, Wang and Hickernell [73] randomize Halton sequences, and Moskowitz [74], Coulibaly and Lecot [75], and Morokoff and Caflisch [56] present various methods for renumbering the components of a low-discrepancy sequence, in effect introducing randomness somewhat differently into the sequence.

Okten [76, 77] has introduced a generalization of Spanier’s mixed sequence and Moskowitz’s renumbering method and also indicated how error estimation can be performed when using such sequences. Because of its generality, we describe Okten’s ideas briefly here. In [77], Okten provides the following:

Definition 2. Let $\pi = \{i_1, \ldots, i_d\}$ $(i_1 < \cdots < i_d)$ be a subset of the index set $\{1, \ldots, s\}$. For a given d-dimensional sequence $\{q_n\}_{n=1}^{\infty}$, a mixed $(s, d)$ sequence is an s-dimensional sequence $\{m_n\}_{n=1}^{\infty}$ $(s \ge d)$ such that $m_n^{i_k} = q_n^{k}$, $k = 1, \ldots, d$, and all other components of $m_n$ (i.e., $m_n^{i}$ for $i \in \{1, \ldots, s\} \setminus \pi$) come from independent realizations of a random variable uniformly distributed on $[0,1]^{s-d}$.

This definition is useful inasmuch as it specializes to a number of interesting sequences introduced by other authors earlier.
For example, Spanier’s mixed sequence corresponds to the choices $\pi = \{1, \ldots, d\}$ with $s = \infty$; i.e., it is a mixed $(\infty, d)$ sequence that can be used for either high-dimensional integration or random walk problems. Also, the continuation method introduced in [78] (see also [71]) amounts to using a mixed $(s, d)$ sequence with $\pi = \{s - d + 1, s - d + 2, \ldots, s\}$. Furthermore, the Spanier mixed sequence obviously specializes to an ordinary pseudorandom sequence when $d = 0$ and $\pi$ is empty, while if $\pi = \{1, \ldots, d\}$ and $s = d$, the resulting mixed $(d, d)$ sequence is clearly completely deterministic with no random components at all.

Okten [79] also introduces a new family of hybrid sequences obtained by random sampling from a universe consisting of low-discrepancy sequences of the appropriate dimension for the problem. This idea permits conventional statistical analyses to be performed on the resulting estimates and it therefore attempts to overcome one of the major drawbacks of using low-discrepancy sequences: the unavailability of an effective and convenient error analysis.

Hybrid sequences are designed to produce good results for general problems with dimensions s that are too large for pure low-discrepancy sequences to be effective. The dimension that defines this threshold depends upon the details of the problem. For example, a number of problems arising in financial modeling involve integrations over 360 dimensions and have been successfully accomplished with purely low-discrepancy sequences, whereas it is not difficult to compose integrands of only 20 variables for which the use of pseudorandom sequences provides better results than when a 20-dimensional low-discrepancy sequence is used. This disparity exists because, in the 360-dimensional case, the integrand function does not depend strongly on all of its 360 variables: it is much more affected by fluctuations in only a handful of the variables while behaving very smoothly with respect to the others.
In other words, the partial derivatives of the integrand are all quite large in the case of the 20-dimensional integrand function while in the 360-dimensional problem cited, only a few of the partial derivatives are large and the remaining ones are much smaller.

Hybrid sequences should, therefore, be useful for s-dimensional integration of arbitrary functions or for random walk problems whose Neumann series converge rather slowly. But one should not overlook the possibility, to which we alluded in the previous paragraph, that for certain integrands or random walk problems, special features of that problem might suggest the use of special sequences designed to take advantage of additional information about the problem. For example, if the s-dimensional integrand is, in fact, independent of several of the s variables, the properties of the hybrid sequence with respect to these variables become much less important. More generally, if an s-dimensional integrand exhibits diminished dependence on some subset of the variables, it makes sense to design a hybrid sequence that takes advantage of that information. Similar special considerations would apply in the case of certain random walk problems also.

Based on this sort of reasoning, quite recently some authors have focused attention on restricting the class of integrands treated by each method in an attempt to explain why some sequences perform surprisingly well for certain problems. Interest in pursuing this point of view might have been piqued by provocative results reported when purely quasi-random sequences were used to estimate some very high-dimensional integrals arising in financial applications [80, 81]. This has led to a rash of publications in which the sensitivity of an integrand function with respect to its independent variables and/or parameters is studied [82–87].
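The mixed (s, d) sequence of Definition 2 can be sketched as follows, using Halton points as the d-dimensional low-discrepancy component and pseudorandom padding elsewhere. This is only one possible reading of the definition. The function names, the 0-based `positions` argument, and the use of Python's `random.Random` for the padded coordinates are all assumptions of this illustration.

```python
import random


def radical_inverse(n, base):
    """Radical inverse of the integer n in the given base."""
    value, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)
        value += digit * scale
        scale /= base
    return value


def halton_point(n, bases):
    """n-th point of the Halton sequence for pairwise coprime bases."""
    return [radical_inverse(n, b) for b in bases]


def mixed_sequence(count, s, positions, bases, rng):
    """First `count` points of a mixed (s, d) sequence.

    positions -- 0-based indices (the index set of Definition 2, shifted
                 by one) that receive the d Halton coordinates
    bases     -- d pairwise coprime bases for the Halton component
    rng       -- a random.Random instance supplying the other coordinates
    """
    points = []
    for n in range(1, count + 1):
        point = [rng.random() for _ in range(s)]   # pseudorandom draws
        for k, q in zip(positions, halton_point(n, bases)):
            point[k] = q                           # overwrite with Halton
        points.append(point)
    return points


# Mixed (5, 2) sequence: Halton coordinates first, pseudorandom after.
pts = mixed_sequence(4, s=5, positions=(0, 1), bases=(2, 3),
                     rng=random.Random(0))
```

Choosing `positions` to cover the variables on which the integrand depends most strongly is the kind of problem-specific tailoring discussed above.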
3.4.4 State of the Art

The idea of replacing pseudorandom sequences in Monte Carlo programs by sequences that are more regular, although correlated, has had a profound effect on the field in the past 40 years. Although use of pseudorandom sequences still dominates the more traditional applications involving neutron and charged particle transport, quasi-random sequences are used increasingly in the newer applications areas. Thus, low-discrepancy or hybrid sequences are used almost routinely for financial modeling, global illumination, and other problems for which Monte Carlo methods have recently been shown to be useful. This trend has also dramatically influenced the amount and kind of research needed to analyze errors and to develop error reduction strategies, which is described in Section 3.5.

3.5 Error Analysis

3.5.1 The Pseudorandom Case

Throughout the first 20 or more years (1942–1962) of the exciting period during which digital computers and Monte Carlo methods developed rapidly, only minimal information was available from most of the computer programs employing simulation. At that time, the primary output consisted of one or more sample means

$$m_N = \frac{1}{N} \sum_{i=1}^{N} \xi_i$$

of an estimating random variable, $\xi$, together with estimates of the sample standard deviation,

$$s = \left\{ \frac{N}{N-1} \left[ \frac{1}{N} \sum_{i=1}^{N} \xi_i^2 - \left( \frac{1}{N} \sum_{i=1}^{N} \xi_i \right)^{2} \right] \right\}^{1/2}.$$

The estimated variance, $s^2/N$, of the sample mean, $m_N$, then establishes the very slow $O(N^{-1/2})$ convergence of the Monte Carlo error (based on the standard deviation $\sqrt{s^2/N}$) to zero for pseudorandom Monte Carlo implementations. The central limit theorem states that the sample mean $m_N$ is approximately normally distributed for large sample sizes N and that this normal distribution has as its true mean the expected value of the random variable $\xi$ and as its variance $\sigma^2/N$, where $\sigma^2$ is the population variance.
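The sample mean, the sample standard deviation, and the resulting standard error $s/\sqrt{N}$ can be computed as in the following Python sketch. The function name `sample_statistics` and the test estimator $\xi = U^2$ (with U uniform on [0,1], so that the expected value is 1/3) are assumptions made for illustration.

```python
import math
import random


def sample_statistics(samples):
    """Return (m_N, s, s/sqrt(N)) for realizations of the estimator."""
    n = len(samples)
    mean = sum(samples) / n
    mean_of_squares = sum(x * x for x in samples) / n
    # s^2 = N/(N-1) * [mean of squares - (mean)^2]; clamp tiny negatives
    # that can arise from floating-point cancellation.
    variance = max(0.0, n / (n - 1) * (mean_of_squares - mean * mean))
    s = math.sqrt(variance)
    return mean, s, s / math.sqrt(n)


# Illustrative estimator (an assumption of this sketch): xi = U^2 with U
# uniform on [0, 1], whose expected value is 1/3.
rng = random.Random(12345)
for n in (1_000, 100_000):
    m_n, s, std_err = sample_statistics([rng.random() ** 2 for _ in range(n)])
    print(n, m_n, std_err)
```

Increasing N by a factor of 100 should shrink the reported standard error by roughly a factor of 10, which is the $O(N^{-1/2})$ behavior described above.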
It is the assumption of asymptotic normality that permits the derivation of various confidence intervals that, in turn, provide the foundation for assigning precision to each Monte Carlo result. Quite often it is the estimate of the relative error

\[ R = \frac{s/N^{1/2}}{m_N} \tag{3.4} \]

that is supplied with each estimated mean. It is common to use the size of R as a mechanism for interpreting the quality of the Monte Carlo estimates. Small values of R indicate high precision, whereas values near 1 suggest sample means that are suspect. The use of higher-order statistical quantities and other statistical tests has been recommended in conjunction with some Monte Carlo programs. For example, the program MCNP [15] estimates, in addition to tally means and variances, the variance of the variance (VOV), and the users' manual contains guidelines for interpreting these measures of precision, along with a number of other test quantities provided routinely with MCNP output.

Partly driven by the frustration caused by the slowness of the $O(N^{-1/2})$ convergence rate of pseudorandom Monte Carlo, the development of Monte Carlo methods that abandoned the probability theory model and the central limit theorem for error analysis was undertaken. Of course, this necessitated the construction of a deterministic model and error analysis. Radically different error analyses must be applied when simulations are based on quasi-random or hybrid generating sequences.

3.5.2 The Quasi-random Case

When quasi-random sequences are used as generators, deterministic error bounds are available. The key ingredients of the quasi-Monte Carlo theory can be illustrated using simple one-dimensional integration.

Definition 3.
For a given sequence $Q = \{x_1, x_2, \ldots\} \subset [0, 1)$, and any t, $0 \le t \le 1$, first define a counting function

\[ A([0,t); N, Q) = \text{number of } x_i,\ 1 \le i \le N,\ \text{with } x_i \in [0, t). \tag{3.5} \]

Using this notion, the discrepancy of the sequence Q becomes

\[ D_N(Q) = \sup_{0 \le t \le 1} \left| L_N([0,t)) \right| \tag{3.6} \]

where

\[ L_N([0,t)) = \frac{A([0,t); N, Q)}{N} - t. \tag{3.7} \]

The discrepancy $D_N(Q)$ plays a key role in bounding the error that results when estimating the theoretical mean of a random variable by its sample average. For finite-dimensional integrals, quasi-Monte Carlo methods replace probability with asymptotic frequency:

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i) = \int_{I^s} f(x)\,dx \tag{3.8} \]

for a reasonable class of f. Equation 3.8 means, then, that the sequence $x_1, x_2, \ldots$ produces convergent sums for the estimation of integrals of functions in the given class. Such sequences are said to be uniformly distributed in $I^s$, a condition that clearly has nothing to do with randomness. For integral equations, the analogous condition is

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \xi(\omega_i) = \int_{\Omega} \xi\,d\mu \tag{3.9} \]

and $\xi$ (which here ordinarily represents an estimator of a weighted integral $\int g(x)\Psi(x)\,dx$ of the solution $\Psi(x)$ of the integral equation) must satisfy mild smoothness restrictions (and Eq. 3.9 then defines the $\mu$-uniformity of $\omega_1, \omega_2, \ldots, \omega_N$ in $\Omega$). The idea of $\mu$-uniformity was introduced by Chelson [88], who showed that replacing pseudorandom sequences by appropriately chosen uniformly distributed sequences produces $\mu$-uniformity in $\Omega$. This is the critical result needed to ensure that quasi-random sequences can be used to provide asymptotically valid (as $N \to \infty$) estimating sums for solutions of integral equations. Chelson's construction, modified slightly in [89], is simply to sample the usual one-dimensional conditional probability density functions derived from the source and kernel of the transport equation by using low-discrepancy sequences, rather than pseudorandom ones.
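For a finite one-dimensional point set the supremum in Eq. 3.6 can be evaluated exactly, because the local discrepancy changes monotonically between consecutive sorted points. The sketch below assumes the base-2 van der Corput sequence as the low-discrepancy example and Python's built-in generator as the pseudorandom one; the function names are illustrative.

```python
import random


def van_der_corput(n, base=2):
    """n-th term (n >= 1) of the van der Corput sequence (radical inverse of n)."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def star_discrepancy_1d(points):
    """Exact sup over t of |A([0,t);N,Q)/N - t| for points in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # Just above x the count is i; at t = x the count is i - 1.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


N = 1024
vdc_points = [van_der_corput(i) for i in range(1, N + 1)]
rng = random.Random(2024)
prn_points = [rng.random() for _ in range(N)]

d_vdc = star_discrepancy_1d(vdc_points)
d_prn = star_discrepancy_1d(prn_points)
print(f"D_N van der Corput: {d_vdc:.5f}")
print(f"D_N pseudorandom:   {d_prn:.5f}")
```

For the same N, the van der Corput points should show a markedly smaller discrepancy than the pseudorandom ones, which is the sense in which they are "very uniformly distributed."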
However, if one were simply to use a one-dimensional low-discrepancy sequence, such as the van der Corput sequence, for all of the decisions needed to generate the random walks $\omega_1, \omega_2, \ldots$, a little thought shows that the random walks would not necessarily satisfy the Markov property. In other words, in switching from pseudorandom sequences that are approximately uniformly and independently distributed in the unit interval to low-discrepancy sequences that are very uniformly distributed but obviously serially "correlated," the required condition Eq. 3.9 may be lost. The way around this predicament is to use sequences that are uniform (in this new, deterministic way) in a unit cube $I^s$ of sufficiently high dimension s to suffice for generating all collisions of every random walk. The fact that there is no a priori upper bound for such a dimension s in the case of integral equations means that the sequence used must be uniform over the infinite-dimensional unit cube. A sequence such as the Halton sequence would suffice, for example, and this is the sequence that Chelson employed in [88].

These ideas can be illustrated using the simple, one-dimensional random walk problem introduced in Section 3.1. We first generated ten random walks using a conventional pseudorandom number generator to make the required decisions and computed the sample mean, $m_{10}^{(1)}$. The results of that simulation are listed in the table below:

Particle   1  2  3  4  5  6  7  8  9  10
X          5  2  2  2  5  5  3  3  6  3

\[ m_{10}^{(1)} = 3.6, \qquad \left| m_{10}^{(1)} - E[X] \right| = |3.6 - e| \approx 0.9 > \frac{\sigma}{\sqrt{10}} = \frac{0.87509}{3.162} = 0.28. \]

Here, $\sigma = \sqrt{\sigma^2}$ is the population standard deviation. These results show that we had rather bad luck with our sample of ten particles. If we want a 95% probability of a relative error no worse than 1/100, using simple probabilistic arguments we would need to take $n \approx 3.84\,(100)^2\,(0.77/2.72)^2 = 3{,}100$ random walk samples.
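The effect of serial correlation on the random walks can be explored with a short experiment. The sketch below rests on an assumption: the model problem is taken to be the count X of uniform step sizes needed for their running sum to exceed 1, for which E[X] = e and the variance is 3e − e² ≈ 0.77, consistent with the values quoted above. The function names are illustrative.

```python
import itertools
import math
import random


def van_der_corput(n, base=2):
    """n-th term (n >= 1) of the van der Corput sequence (radical inverse of n)."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def walk_length(next_uniform):
    """Number of steps taken until the running sum of step sizes exceeds 1."""
    total, steps = 0.0, 0
    while total <= 1.0:
        total += next_uniform()
        steps += 1
    return steps


def pseudorandom_walks(n_walks, seed=1):
    """Walks driven by an ordinary pseudorandom generator."""
    rng = random.Random(seed)
    return [walk_length(rng.random) for _ in range(n_walks)]


def van_der_corput_walks(n_walks):
    """Walks driven by CONSECUTIVE terms of one van der Corput sequence.

    Every decision of every walk draws from the same one-dimensional
    sequence, so successive steps are strongly correlated.
    """
    counter = itertools.count(1)
    return [walk_length(lambda: van_der_corput(next(counter)))
            for _ in range(n_walks)]


prn = pseudorandom_walks(20_000)
vdc = van_der_corput_walks(10)
print("pseudorandom mean:", sum(prn) / len(prn), " (e =", math.e, ")")
print("van der Corput walks:", vdc, " mean:", sum(vdc) / len(vdc))
```

Extending `van_der_corput_walks` to many more walks is a simple way to see whether the resulting sample mean approaches e or settles elsewhere.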
Next, we used the van der Corput sequence instead of pseudorandom numbers to select step sizes for the random walks. That produces, for n = 10 particles:

Particle   1  2  3  4  5  6  7  8  9  10
X          3  3  3  2  3  3  2  3  3  2

\[ m_{10}^{(2)} = 2.7, \qquad \left| m_{10}^{(2)} - E[X] \right| = |2.7 - e| \approx 0.0183. \]

It appears as though we have improved our estimate of e. However, continuing the process (i.e., incorporating the results of additional particle histories generated in this way) does not improve the estimate further. In fact, this particular quasi-random sequence produces estimates that converge (rapidly!) to $2\tfrac{2}{3}$ rather than to e. Obviously, great care is needed in implementing quasi-Monte Carlo methods, even for such simple problems. The difficulty here is that the correlation which is intrinsic in the van der Corput sequence has defeated the Markov property in the execution of particle random walks and has, therefore, not provided a faithful simulation of the underlying physical process. If we are careful to restore a kind of statistical independence in using quasi-random numbers, which is accomplished here by using components of a higher dimensional quasi-random vector sequence (in fact, the Halton sequence can again be used) to generate the individual steps of each random walk, we can indeed improve upon the use of pseudorandom sequences. When this was executed for model problem 1, we found:

Particle   1  2  3  4  5  6  7  8  9  10
X          3  3  3  3  2  7  2  3  3  3

\[ m_{10}^{(3)} = 3.2, \qquad \left| m_{10}^{(3)} - E[X] \right| = |3.2 - e| \approx 0.4818. \]

This provides a better estimate than the one obtained using pseudorandom numbers, and the advantage in employing the quasi-random sequence in place of the pseudorandom one can be shown to increase as the number of samples grows.

The basis for analyzing error when using quasi-Monte Carlo methods is the Koksma–Hlawka inequality [90, 91]. It provides a method for bounding the difference between an integral and an average of integrand values. In one variable, this result takes the following form:

Theorem 1.
Let $Q = \{x_1, x_2, \ldots\} \subset [0, 1)$ and let f be a function of bounded variation on [0,1]. Then

\[ |\delta_N(f)| = \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_0^1 f(t)\,dt \right| \le D_N(Q)\,V(f) \tag{3.10} \]

where $D_N(Q)$ = discrepancy of Q, $V(f)$ = total variation of f.

The Koksma–Hlawka inequality has been extended to functions of many variables and provides a rigorous (deterministic) upper bound for the difference between a Monte Carlo-type sum and an integral of a function f of bounded variation. It can be applied to any set of points $x_1, \ldots, x_N$ that lie in the domain of the function f. This idea of bounding such differences by a product of two factors, one of which describes the "smoothness" of the integrand, and the other describes the uniformity of the set of points used in the evaluation, has been generalized by making use of the theory of reproducing kernel Hilbert spaces. In this more general setting, the space of integrands is regarded as a reproducing kernel Hilbert space and the resulting inequality makes use of the norm in this space to characterize the smoothness of the integrand, while the factor that replaces the discrepancy $D_N$ measures the uniformity of the point set in a way that generalizes the classical definition of discrepancy. The interested reader should examine [93] and references cited therein.

A Koksma–Hlawka-type inequality has been established for the transport equation [88, 89, 93], which is shown in these references to be an infinite-dimensional extension of the finite-dimensional quadrature problem. That is,

\[ |\delta_N(\xi)| \le \hat{C}\,D_N \tag{3.11} \]

where $\delta_N(\xi) = \frac{1}{N} \sum_{i=1}^{N} \xi(\omega_i) - \int_{\Omega} \xi\,d\mu$. The constant $\hat{C}$ in Eq. 3.11 measures the variation of $\xi$; reductions in $\hat{C}$ can be accomplished by importance sampling and similar conventional variance reduction mechanisms, as described in [88, 93]. But improvements in the rate of convergence as $N \to \infty$ based on the Koksma–Hlawka inequality can be obtained only as a result of the rate of decrease of the factor $D_N$.
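The one-variable bound (3.10) can be checked numerically. The sketch below assumes the integrand f(x) = x², which is monotone on [0,1] with total variation V(f) = 1 and integral 1/3, and uses the first N terms of the base-2 van der Corput sequence as the point set; both choices are illustrative.

```python
def van_der_corput(n, base=2):
    """n-th term (n >= 1) of the van der Corput sequence (radical inverse of n)."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def star_discrepancy_1d(points):
    """Exact sup over t of |A([0,t);N,Q)/N - t| for points in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n)
               for i, x in enumerate(xs, start=1))


def f(x):
    return x * x  # monotone on [0, 1], so V(f) = f(1) - f(0) = 1


EXACT_INTEGRAL = 1.0 / 3.0
TOTAL_VARIATION = 1.0

checks = []
for n_points in (10, 100, 1000):
    pts = [van_der_corput(i) for i in range(1, n_points + 1)]
    error = abs(sum(f(x) for x in pts) / n_points - EXACT_INTEGRAL)
    bound = star_discrepancy_1d(pts) * TOTAL_VARIATION
    checks.append((n_points, error, bound))
    print(f"N = {n_points:5d}   |delta_N(f)| = {error:.2e}   D_N V(f) = {bound:.2e}")
```

In each row the observed error should sit at or below the product $D_N V(f)$; the gap between the two columns gives a feel for how conservative the bound can be.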
The key question then becomes: How rapidly can $D_N$ converge to 0 for arbitrary sequences or point sets? The answer is widely believed to be

\[ D_N = O\!\left[ (\log N)^S / N \right], \qquad S = \text{"effective" dimension of the problem} \]

(though slightly more rapid convergence cannot yet be ruled out theoretically). Integral equations are really infinite-dimensional problems in the sense that, in general, no a priori upper bound exists for the number of steps in a random walk. However, a finite effective dimension of such a transport problem might be given by the product of the average number of steps and dim($\Gamma$); here $\Gamma$ is the physical phase space for the underlying transport model. In this case, the effective dimension is essentially the average number of decisions needed to simulate a random walk in the transport process being modeled.

Sequences in s dimensions whose discrepancies are $O[(\log N)^s / N]$ are the ones typically used in quasi-Monte Carlo implementations, and these are often referred to as low-discrepancy sequences. A very general family of low-discrepancy sequences consists of the (t,s)-sequences [10, 94, 95]. These sequences generalize to arbitrary base previous constructions for base 2 [96] and any prime base [97, 98], and are generally believed to possess the lowest discrepancies of any known sequences. Accordingly, these sequences are in wide use for quasi-Monte Carlo implementations.

Let S denote either the actual dimension of a multidimensional integral to be estimated or the effective dimension of a discrete or continuous random walk problem to be simulated, as defined, for example, above. For fixed S, quasi-Monte Carlo methods will be superior to pseudorandom methods as $N \to \infty$ because of the inequality (3.11) and because $(\log N)^S / N$ becomes much smaller than $N^{-1/2}$ in this limit. However, for practical values of N and moderate values of S, such a comparison may well favor the pseudorandom convergence rate.
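The two asymptotic rates can be compared directly. The sketch below makes two assumptions that the bounds themselves do not fix: the logarithm is taken to be natural, and both implied constants are set to 1. The function names and the bisection bracket are illustrative.

```python
import math


def qmc_rate(n, s):
    """(log N)^S / N with a natural logarithm and unit constant (assumed)."""
    return math.log(n) ** s / n


def mc_rate(n):
    """N^(-1/2) with unit constant (assumed)."""
    return n ** -0.5


def crossover(s, lo_exp=3.0, hi_exp=30.0, tol=1e-6):
    """Largest N (found on a log10 scale) at which the two rates are equal.

    Assumes qmc_rate > mc_rate at 10**lo_exp and qmc_rate < mc_rate at
    10**hi_exp, which holds for the small values of S used below.
    """
    while hi_exp - lo_exp > tol:
        mid = 0.5 * (lo_exp + hi_exp)
        n = 10.0 ** mid
        if qmc_rate(n, s) > mc_rate(n):
            lo_exp = mid
        else:
            hi_exp = mid
    return 10.0 ** hi_exp


for n in (1e3, 1e5, 1e7, 1e9):
    print(f"N = {n:.0e}   (log N)^3/N = {qmc_rate(n, 3):.3e}   N^-1/2 = {mc_rate(n):.3e}")
print(f"crossover for S = 3: N is about {crossover(3):.3g}")
```

Under these assumptions the rate for S = 3 does not drop below the pseudorandom rate until N is in the tens of millions, and the crossover moves out very quickly as S grows.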
In fact, when S is only 3, N must be larger than $10^7$ for quasi-random sampling to be expected to improve upon pseudorandom sampling based on the error upper bounds expressed in (3.10) or (3.11). There is also some evidence in support of the conjecture that N must be exponential in S before the advantages of quasi-Monte Carlo methods over conventional Monte Carlo using pseudorandom sequences can be realized [56]. This means that, for all practical purposes, S cannot be too large. It is when S is too large that hybrid sequences have often been used successfully.

Another problem with the use of the Koksma–Hlawka inequality to bound quasi-Monte Carlo errors is that the individual terms $V(f)$, $D_N$ appearing on the right side of (3.10) (or the terms $\hat{C}$, $D_N$ on the right side of (3.11)) are difficult to estimate. This creates another argument in favor of hybrid sequences, especially those designed with components of randomness that enable a statistical analysis of the error, bypassing the Koksma–Hlawka inequality.

Even with these caveats against the routine use of purely quasi-random sequences for estimating high-dimensional integrals or solutions of transport problems, they have been amazingly successful in certain situations for which the Koksma–Hlawka bounds suggest they would not. Examples of this sort, arising for instance in stochastic financial modeling, have inspired a good deal of research aimed at obtaining a deeper understanding of the pros and cons of the use of quasi-random sequences.

Figure 3.3 compares the theoretical asymptotic convergence rates associated with pseudorandom Monte Carlo and quasi-random Monte Carlo implemented in two dimensions (S = 2). The line with slope −1 is shown since it provides a theoretical optimum (at least asymptotically) for quasi-Monte Carlo implementations whose error analysis is based on the Koksma–Hlawka inequality.
While this graph is instructive for values of the sample size N that are sufficiently large, for practical values of N there is no assurance that quasi-random errors will be smaller than pseudorandom ones, as discussed above. Notice that the quasi-random (S = 2) and pseudorandom lines cross for modest values of N.

Fig. 3.3 Comparison of pseudorandom, quasi-random (S = 1, 2) convergence rates (log-log plot of Error versus N = number of random samples; curves labeled Pseudorandom, Quasirandom (S = 2), and Line with slope = −1)

We have alluded earlier to the fact that good lattice point methods, or their extensions to lattice rules, can provide extremely good results (i.e., estimates of finite-dimensional integrals) when the integrand is a periodic function of each of its variables. In fact, the unusual effectiveness of the simple trapezoidal rule when applied to periodic functions, which has been well known for some time, is an illustration of this phenomenon. The explanation lies in the fact that the analysis of the error made in applying lattice rules to periodic functions is quite different from both of the error analyses discussed so far: statistically based for pseudorandom Monte Carlo or for some hybrid methods, and based on the Koksma–Hlawka inequality for quasi-Monte Carlo methods. In fact, the usual error analysis for lattice methods applied to periodic integrands makes use of number-theoretically based estimates of certain exponential sums. The interested reader might consult Chapter 5 of [10]. Any elaboration here would take us much too far afield.

3.5.3 The Hybrid Case

We have just seen that a major drawback to the use of purely quasi-random methods – especially for random walk problems – is the lack of an effective and low-cost error analysis when they are used. By contrast, use of the sample standard deviation or variance in conventional Monte Carlo applications as a measure of uncertainty in the output is simple and inexpensive.
Quite recently, Halton [99] has advocated analyzing low-discrepancy sequences as though they were independent and identically distributed uniform sequences, but this device entails generating low-discrepancy sequences in dimensions much higher than actually needed for the simulation itself, and there are some disadvantages in doing this. A similar error analysis has been suggested by Okten [100] when employing hybrid sequences. Okten's idea is to define a universe consisting of a number of different low-discrepancy sequences of the appropriate dimension for the problem under study (e.g., of dimension s when estimating s-dimensional integrals or infinite dimensional when solving integral equations). One then draws such a sequence at random from this universe and uses it to perform the simulation. Repetition of this process a finite number of times then permits conventional statistical analysis to be applied rigorously to the resulting estimates. Okten has presented evidence [77, 79, 100] that this can be an effective strategy.

3.5.4 Current State of the Art

Confidence interval theory is routinely applied to conventional pseudorandom Monte Carlo implementations. For quasi-Monte Carlo implementations, there are no known efficient ways of estimating the error bounds that result from application of the Koksma–Hlawka inequality. Certain hybrid sequences – for example, randomized low-discrepancy sequences – enable a conventional statistical error analysis to be used, and this seems to be one of the major reasons for employing hybrid sequences for difficult Monte Carlo simulations. Use of such an error analysis circumvents the thorny problem of estimating variations and discrepancies in order to obtain rigorous deterministic error bounds for quasi-random methods that have no random component at all.
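The idea of drawing a low-discrepancy sequence at random and repeating the simulation can be sketched as follows. This is not a reproduction of any specific construction cited above: the "universe" is assumed here to consist of copies of a two-dimensional Halton point set, each shifted modulo 1 by an independent uniform random vector, which is one common randomization. The integrand, function names, and sample sizes are likewise illustrative.

```python
import math
import random
import statistics


def radical_inverse(n, base):
    """Radical inverse of the integer n >= 1 in the given base."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def halton_point(n, bases=(2, 3)):
    """n-th point of the Halton sequence in len(bases) dimensions."""
    return [radical_inverse(n, b) for b in bases]


def shifted_estimate(f, n_points, shift):
    """Average of f over a Halton point set shifted modulo 1 (assumed randomization)."""
    total = 0.0
    for n in range(1, n_points + 1):
        x = [(u + s) % 1.0 for u, s in zip(halton_point(n), shift)]
        total += f(x)
    return total / n_points


def replicated_estimate(f, n_points=256, n_replicates=16, seed=0):
    """Mean and standard error over independently randomized replicates."""
    rng = random.Random(seed)
    estimates = [shifted_estimate(f, n_points, (rng.random(), rng.random()))
                 for _ in range(n_replicates)]
    mean = statistics.mean(estimates)
    std_err = statistics.stdev(estimates) / math.sqrt(n_replicates)
    return mean, std_err


# Illustrative integrand on the unit square; its exact integral is 1/4.
mean, std_err = replicated_estimate(lambda x: x[0] * x[1])
print(f"estimate = {mean:.5f} +/- {std_err:.5f}")
```

Because the replicates are independent and identically distributed, the usual confidence interval machinery applies to `mean` and `std_err` without any need to estimate a variation or a discrepancy.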
Many open questions remain concerning how best to utilize low-discrepancy and/or hybrid generating sequences, and the question of effective error estimation for these methods is far from settled.

3.6 Error Reduction

3.6.1 Introduction

The period following World War II was marked by a shift in emphasis from weapons development to peacetime applications of simulation methods. The design of nuclear reactors for power generation, both for use in naval propulsion systems and for commercial application, focused attention anew on Monte Carlo methods. This interest, in turn, accelerated the development of such methods, especially inasmuch as the limitations of deterministic solutions of the transport equation, or various approximations to that equation, became better understood in the 1950s and 1960s. For example, while formulation in terms of the transport equation was frequently replaced by the simpler diffusion equation (which was then often solved by finite difference methods), the inaccuracies inherent in this approach could not be assessed without having "benchmark" transport calculations available.

The early development of Monte Carlo methods concentrated on the solution of fairly specific families of reactor physics problems and the development of techniques designed to improve the efficiency of their solution. For example, the need to deal with shielding problems – for which most of the useful information is carried in analog histories that occur only very rarely – gave rise to the study of importance sampling [7, 8, 11, 101–106]. In the review [13], considerable attention was paid to the use of importance sampling to achieve variance reduction, especially in problems involving deep penetration.
Provided that sufficiently many random samples could be processed in early pseudorandom Monte Carlo codes, statistical fluctuations could be expected to be low enough to meet critical design criteria (at least most of the time!), albeit with sizeable computing expenditures. Eventually, however, the demands for increased accuracy and speed – especially in the naval reactors program and in the rapidly expanding peacetime applications – inspired the quest for additional clever error reduction strategies.

In spite of the gains made in understanding how to use Monte Carlo methods to solve an increasing number of important physical problems, the method seems to have been used only as a last resort during the early period following World War II. For reactor calculations generally, diffusion theory was dominant. Computations based on finite-difference approximations to the diffusion equation were routinely used for the design of nuclear reactors. There were, no doubt, several valid reasons for this. Monte Carlo calculations required large amounts of computer time and produced only statistical estimates of a few quantities at a time. While diffusion theory produced only an approximation to the solution of the transport equation of uncertain quality (and a rigorous examination of the approximation error was impractical), it provided at least qualitative knowledge of the particle behavior everywhere in the phase space – a distinct advantage. Another factor might have been the growing realization that Monte Carlo simulations making use of importance sampling could produce anomalous results. That is, certain rare occurrences of high-weight particles could produce an "effective bias" that was not well understood at the time.

For pseudorandom Monte Carlo implementations, error reduction is usually equated with variance reduction because the N-sample error declines at the rate $\sigma N^{-1/2}$, where $\sigma$ is the standard deviation of the estimating random variable.
The analogous issue for quasi-Monte Carlo methods is to seek reductions in the constant $\hat{C}$ appearing in the inequality (3.11). In both cases, it is the fluctuations in the sample-to-sample values of the estimator that determine how large the Monte Carlo error is for any sample size. It is natural to inquire whether variance reduction methods developed for pseudorandom Monte Carlo can be directly applied to quasi-Monte Carlo implementations, as one might perhaps expect. There has not been a systematic effort to convert variance reduction schemes for pseudorandom Monte Carlo to error reduction schemes for quasi-random Monte Carlo, but we will report on what seems to be known in this regard with respect to each strategy discussed in this section.

As the number and kind of error reduction strategies have grown, it has become increasingly difficult to organize them all into a short list of "types." In the first part of Section 3.6.2, however, we will mainly rely on the classification schemes used by earlier authors [8, 11, 107] for describing variance reduction strategies such as control variates, importance sampling, stratified sampling, and the use of expected values. We will complete this section by discussing a few other error reduction strategies that seemed sufficiently important to treat here. Since we cannot hope for an exhaustive treatment of the subject of error reduction, we will content ourselves with tracing what we think are the most important developments since the publication of [13], especially as they relate to impact on transport applications.

3.6.2 Control Variates

The classical method of control variates has been well known in the statistical community for many years. In that context, the method has found application in the design of certain types of sample surveys.
For instance, suppose one is interested in estimating the expected value y of a random variable $\eta$ and one can observe the outcomes of a control variable $\xi$ for which the expected value x is known. Then y can be estimated as $\bar{\eta}_N + x - \bar{\xi}_N$, where $\bar{\eta}_N$, $\bar{\xi}_N$ are the sample means of $\eta$, $\xi$. If $\eta$ and $\xi$ are highly positively correlated, this method will be much more effective than if the simple estimate $\bar{\eta}_N$ were used for y.

When formulated as a technique for estimating integrals by Monte Carlo, the idea is to represent the integrand f as a sum of a function, $\varphi$, that mimics the behavior of f but whose integral is known (or easier to estimate than that of f) and a remainder function, $f - \varphi$. Then, Monte Carlo is used only to estimate the integral of $f - \varphi$, and a substantial reduction in variance or variation can result.

A similar idea finds use in transport applications, based on the linearity of the transport operator

\[ L(F_\alpha + F_\beta) = Q_\alpha + Q_\beta \tag{3.12} \]

where $L(F_\alpha) = Q_\alpha$, $L(F_\beta) = Q_\beta$, and L is the transport operator. One way to make use of linearity, if the problem being studied is described by the equation $L(F_\alpha) = Q_\alpha$, is to identify a source $Q_\beta$ such that the sum problem (3.12) has a simple, analytically known solution. For example, in one-energy transport problems with an isotropic source $Q_\alpha$ one can simply define the source $Q_\beta$ to be isotropic and of sufficient strength in each region so that the ratio R of source strength to absorption cross section is constant across the entire geometry. This implies that the flux is also isotropic and constant; in fact, the scalar flux is also equal to R. The method then finds use in converting the "$\alpha$-problem" to the "$\beta$-problem"; solving the $\beta$-problem by Monte Carlo might provide significant advantages over simulating the $\alpha$-problem directly. In general terms, this technique was described in [11] as the superposition principle. It was used by these authors to estimate both thermal flux averages and resonance escape probabilities.
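The integral form of the control variate idea can be illustrated with a one-dimensional example. The sketch below assumes the integrand f(x) = exp(x) on [0,1] and the control function 1 + x, whose integral 3/2 is known exactly; these choices, and the names used, are purely illustrative.

```python
import math
import random
import statistics


def f(x):
    return math.exp(x)  # integrand; exact integral over [0, 1] is e - 1


def control(x):
    return 1.0 + x  # mimics f near 0; its integral over [0, 1] is 3/2


CONTROL_INTEGRAL = 1.5

rng = random.Random(2024)
us = [rng.random() for _ in range(20_000)]

plain_samples = [f(u) for u in us]
remainder_samples = [f(u) - control(u) for u in us]

# Monte Carlo is applied only to the remainder f - control.
plain_estimate = statistics.mean(plain_samples)
cv_estimate = CONTROL_INTEGRAL + statistics.mean(remainder_samples)

plain_var = statistics.variance(plain_samples)
cv_var = statistics.variance(remainder_samples)

print(f"exact             = {math.e - 1:.5f}")
print(f"plain estimate    = {plain_estimate:.5f}  (sample variance {plain_var:.4f})")
print(f"control estimate  = {cv_estimate:.5f}  (sample variance {cv_var:.4f})")
```

The better the control function tracks the integrand, the smaller the variance of the remainder, and the fewer samples are needed for a given precision.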
In both of these applications, the method was responsible for large gains in efficiency and accuracy. For the resonance escape application, this was especially true when the escape probability $p_{res}$ is close to 1, which is the situation that presents the greatest computational challenge when conventional simulation of the "$\alpha$-problem" is employed. The method makes use of an analytic calculation of the narrow resonance approximation as a control [108].

Although it is usually identified by a different name, the method of antithetic variates could equally well be described as a control variate method with negative rather than positive correlation. Instead of seeking a random variable that is strongly positively correlated with the original one and which has a known expectation, one seeks a random variable with the same expectation as the original one that is strongly negatively correlated to it. Then forming the average of the two random variables produces a new random variable with the same mean but reduced variance. The idea was first presented in [5] and can be a powerful method for lowering the resulting error. Furthermore, this very simple idea can be extended in several ways, as was already made apparent in [5]. For example, if one is able to construct n mutually antithetic random variables with identical means, their average, or some appropriately chosen weighted average, has the potential to provide unbiased estimates of the same mean with greatly reduced variance. The antithetic variates method has recently enjoyed a resurgence of interest in problems arising in the finance community, problems that entail estimating integrals of functions that are well approximated by linear functions, for which the method is exceptionally well suited.
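A minimal sketch of antithetic variates for a one-dimensional integral follows. It assumes the monotone integrand f(x) = exp(x) on [0,1] and pairs each uniform sample U with 1 − U, which has the same distribution but is negatively correlated with U; the integrand and names are illustrative.

```python
import math
import random
import statistics


def f(x):
    return math.exp(x)  # monotone integrand; exact integral over [0, 1] is e - 1


rng = random.Random(99)
us = [rng.random() for _ in range(10_000)]

# Ordinary estimator: one function evaluation per sample.
plain_samples = [f(u) for u in us]

# Antithetic estimator: average f(U) with f(1 - U). Each paired sample
# costs two function evaluations.
antithetic_samples = [0.5 * (f(u) + f(1.0 - u)) for u in us]

plain_var = statistics.variance(plain_samples)
antithetic_var = statistics.variance(antithetic_samples)
antithetic_estimate = statistics.mean(antithetic_samples)

print(f"exact                = {math.e - 1:.5f}")
print(f"antithetic estimate  = {antithetic_estimate:.5f}")
print(f"variance per sample: plain {plain_var:.4f}, antithetic pair {antithetic_var:.5f}")
```

Even after allowing for the doubled cost per paired sample, the variance reduction for a nearly linear integrand such as this one is substantial.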
In [109] (see also [110]), Halton applied the superposition principle iteratively to solve matrix problems by means of a technique that he called "sequential Monte Carlo." The goal was to improve the Monte Carlo estimates steadily so that geometric convergence of the error to zero could be accomplished. In our view, this idea – which has recently been extended to treat transport problems [111–114] – marks one of the more important developments in Monte Carlo over the last 40 years because the idea of using Monte Carlo sampling to produce global solutions of problems – e.g., the transport flux or collision density everywhere – is so striking. We will devote a bit more attention to it here for that reason.

The main idea underlying this method is easy to describe. How can one build a feedback loop, or "learning" mechanism, into the Monte Carlo algorithm by means of which a larger and larger part of the problem can serve as a control variate, leaving a smaller and smaller portion to be estimated stochastically? Since matrix problems can be solved by Monte Carlo by generating random walks on a discrete index set consisting of N objects, where the matrix is of order $N \times N$, it is possible to encompass both matrix and (continuous) transport problems within the same formulation by studying random walks on a general (either discrete or continuous) state space, and we will adopt this point of view shortly. Of course, in implementing such a sequential (or adaptive) algorithm, if the feedback mechanism is imposed after each random walk, the additional overhead might easily overwhelm the improvement caused by the increased information content. Accordingly, the approach taken has been to process random walks in stages, each consisting of many samples, and to revise the sampling method at the end of each adaptive stage. The goal then becomes to achieve

\[ E_k < \lambda E_{k-1} < \lambda^k E_0, \qquad 0 < \lambda < 1, \qquad k = \text{stage number}, \tag{3.13} \]

where $E_k$ = error after the kth stage.
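The staged feedback idea can be sketched for a small matrix problem x = Hx + a. The sketch below is an illustration under stated assumptions rather than a reproduction of the cited algorithms: H is a hypothetical 3 × 3 nonnegative matrix with row sums below 1, so that its entries can serve directly as analog transition probabilities with the deficit acting as an absorption probability, and the residual of the current approximation is used as the source for the next stage.

```python
import random

# Hypothetical test problem x = H x + a (values chosen only for illustration).
H = [[0.1, 0.3, 0.2],
     [0.2, 0.1, 0.3],
     [0.3, 0.2, 0.1]]
a = [1.0, 2.0, 3.0]


def analog_walk_score(start, source, rng):
    """One analog random walk from `start`, scoring source[state] at each visit."""
    state, score = start, 0.0
    while True:
        score += source[state]
        u, cumulative = rng.random(), 0.0
        for nxt, p in enumerate(H[state]):
            cumulative += p
            if u < cumulative:
                state = nxt
                break
        else:
            return score  # absorbed with probability 1 - sum(H[state])


def estimate_solution(source, walks, rng):
    """Unbiased estimate of y solving y = H y + source, one component at a time."""
    return [sum(analog_walk_score(i, source, rng) for _ in range(walks)) / walks
            for i in range(len(source))]


def sequential_solve(stages=8, walks=1000, seed=7):
    rng = random.Random(seed)
    x = [0.0] * len(a)
    residual = list(a)  # stage 0 uses the original source
    residual_norms = []
    for _ in range(stages):
        y_hat = estimate_solution(residual, walks, rng)
        x = [xi + yi for xi, yi in zip(x, y_hat)]
        Hy = [sum(h * y for h, y in zip(row, y_hat)) for row in H]
        # Reduced source for the next stage: what the correction failed to remove.
        residual = [r + hy - y for r, hy, y in zip(residual, Hy, y_hat)]
        residual_norms.append(max(abs(r) for r in residual))
    return x, residual_norms


x, residual_norms = sequential_solve()
for k, norm in enumerate(residual_norms, start=1):
    print(f"stage {k}: max |residual| = {norm:.3e}")
```

With a fixed number of walks per stage, the residual should shrink by a roughly constant factor from one stage to the next, in contrast to the $N^{-1/2}$ decay obtained by spending the same total number of walks in a single stage.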
For example, for continuous transport problems, $E_n = \| \Psi(P) - \tilde{\Psi}_n(P) \|$, and $\tilde{\Psi}_n(P)$ is an approximation to the transport solution obtained in the nth stage. In this context, iterative zero variance Monte Carlo algorithms for global solutions of transport equations make use of expansions

\[ \Psi(P) = \sum_{i=1}^{\infty} a_i B_i(P) \]

of the solution in a complete set of basis functions, $B_i$, and produce essentially exact truncated solutions $\tilde{\Psi}(P) = \sum_{i=1}^{M} a_i B_i(P) \approx \Psi(P)$, where $\Psi(P)$ satisfies the transport equation

\[ \Psi(P) = \int K(P, P')\,\Psi(P')\,dP' + S(P). \]

This is accomplished by estimating the expansion coefficients

\[ a_i = \int B_i(P)\,\Psi(P)\,dP \]

(if the $B_i$ are orthonormal) in adaptive stages of ever-increasing accuracy.

We consider now the abstract transport equation

\[ \Psi = \mathcal{K}\Psi + \mathcal{S}, \tag{3.14} \]

in which $\mathcal{S}$ represents a source term and $\mathcal{K}$ is either a matrix or an integral operator. The solution $\Psi$ is most commonly interpreted as a (discrete or continuous) collision density, so that $\mathcal{S}$ describes the density of initial collisions and $\mathcal{K}$ incorporates information that describes the probability of transfer from a given state in phase space to the next one. Then, an iterative or multistage procedure can be written quite generally as

\[ \Psi = \mathcal{K}^{(k)}\Psi + \mathcal{S}^{(k)}, \qquad \mathcal{K}^{(0)} = \mathcal{K}, \quad \mathcal{S}^{(0)} = \mathcal{S}. \tag{3.15} \]

Thus, using Eq. 3.15 we reserve the right to modify both the source term $\mathcal{S}^{(k)}$ and the operator kernel term $\mathcal{K}^{(k)}$ after each stage of our adaptive algorithm, but our first stage only makes use of the source and kernel of the original transport equation. The first method described by Halton alters only the source term $\mathcal{S}^{(k)}$ iteratively, so that $\mathcal{K}^{(k)} = \mathcal{K}$ for all k. We will make use of the more general iterative procedure (3.15) later when we describe an adaptive form of importance sampling.

In the context of Eq. 3.15, the control variate, or correlated sampling, algorithm may be characterized by setting

\[ \mathcal{S}^{(k+1)} = \mathcal{S}^{(k)} + \mathcal{K}\hat{y}^{(k)} - \hat{y}^{(k)}, \qquad \mathcal{S}^{(0)} = \mathcal{S}, \tag{3.16} \]

where $\hat{y}^{(k)}$ is an approximate solution to (3.15). Then as $\mathcal{S}^{(k)} \to 0$, $\hat{y}^{(k)} \to 0$,

\[ x^{(k)} = \hat{y}^{(k)} + \hat{y}^{(k-1)} + \cdots + \hat{y}^{(0)} \to \Psi \]

and convergence should be geometric with sufficient care in establishing the number of samples in each stage. Thus, we subtract an approximate solution from both sides of the transport equation and solve for the difference between the approximate and exact solutions in each stage. The geometric convergence of an algorithm based on this idea was established rigorously in [113]. For details about the implementation of this sequential correlated sampling (SCS) algorithm that produces geometric convergence, the interested reader should consult [112, 114]. Here, we summarize the apparent advantages and disadvantages of using SCS to solve either discrete or continuous transport problems.

We notice that Eq. 3.16 indicates that only the analog kernel is needed to implement SCS. This makes sampling the transition probabilities easiest, and is a distinct advantage. However, the reduced source $\mathcal{S}^{(k)}$ is nonvanishing everywhere, although it is normally small and of mixed sign. Most of the cost of using SCS stems from the complexity of dealing with this reduced source. In addition, the solutions of some transport problems vary by many orders of magnitude over the phase space, and for such problems, analog construction of random walk histories may not be sufficient to guarantee that sufficiently many "important" contributors to the solution are represented in the sample space obtained. For example, problems generally unsuitable for analog treatment (e.g., shielding or streaming-type problems) may require many random walks per sequential stage to achieve strict error reduction. For these reasons, a rather different strategy was developed for solving transport equations adaptively, one based on importance sampling. This will be described in Section 3.6.3.

As for the use of a control variate strategy in conjunction with quasi-Monte Carlo simulations, quite recently, Hickernell et al. [115] have explored its potential.
The context for their study was the estimation of definite integrals and they found, somewhat surprisingly, that functions that serve well as control variates in the stochastic setting may not work well in the deterministic one. The implications of this result for transport applications have apparently not been studied.

3.6.3 Importance Sampling

For the estimation of a weighted integral of the transport flux or collision density

$$I = \int_{\Gamma} g(P)\,\Psi(P)\,dP \tag{3.17}$$

where $g$ is a known (weighting) function, $\Gamma$ is the physical phase space, and $\Psi$ satisfies the transport equation

$$\Psi(P) = \int_{\Gamma} K(P, Q)\,\Psi(Q)\,dQ + S(P), \tag{3.18}$$

an importance function $\Psi^*$ could be defined as the solution of the adjoint transport equation

$$\Psi^*(P) = \int_{\Gamma} K^*(P, Q)\,\Psi^*(Q)\,dQ + g(P), \tag{3.19}$$

where $K^*(P, Q) \equiv K(Q, P)$. Under appropriate restrictions, then, it is quite easy to demonstrate that

$$I = \int_{\Gamma} S(P)\,\Psi^*(P)\,dP \tag{3.20}$$

so that the integral $I$ can be obtained either as a weighted integral (3.17) of the solution of the transport equation or as a weighted integral (3.20) of the solution of a “backward” transport equation. Notice that the roles of the source $S$ and detector $g$ are interchanged in Eqs. 3.18 and 3.19. The reason for describing $\Psi^*$ as an importance function will be made apparent in Section 3.7. The duality expressed by the equality of the integrals (3.17) and (3.20) means that it can be estimated by a Monte Carlo simulation of Eq. 3.18, with initial collisions described by the source function $S$, as well as by a Monte Carlo simulation of Eq. 3.19, with the adjoint source described by the “detector” function $g$. Furthermore, while the function $\Psi^*$ serves as an importance function for the estimation of (3.17), the collision density $\Psi$ serves as an importance function for the estimation of (3.20).
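The duality between (3.17) and (3.20) can be checked numerically on a discrete (matrix) analogue of the transport equation. The following is only a minimal sketch: the three-state kernel `K`, the source `S`, the detector `g`, and the helper names are all invented for illustration, and the forward and adjoint systems are solved by plain fixed-point iteration rather than by Monte Carlo.

```python
# Discrete analogue of the duality I = <g, Psi> = <S, Psi*>.
# All numerical values below are hypothetical.

def solve_fixed_point(matrix, source, sweeps=500):
    """Solve x = matrix @ x + source by fixed-point (Neumann) iteration."""
    n = len(source)
    x = list(source)
    for _ in range(sweeps):
        x = [source[i] + sum(matrix[i][j] * x[j] for j in range(n))
             for i in range(n)]
    return x

def transpose(matrix):
    """Return the transpose, the discrete counterpart of K*(P, Q) = K(Q, P)."""
    return [list(row) for row in zip(*matrix)]

def dot(u, v):
    """Discrete counterpart of the weighted integrals (3.17) and (3.20)."""
    return sum(a * b for a, b in zip(u, v))

# K[i][j]: transfer from state j to state i (rows and columns sum to < 1,
# so the iteration converges).
K = [[0.2, 0.3, 0.0],
     [0.3, 0.1, 0.4],
     [0.1, 0.2, 0.3]]
S = [1.0, 0.0, 0.0]   # source: all initial collisions occur in state 0
g = [0.0, 0.0, 1.0]   # detector: responds only to collisions in state 2

psi = solve_fixed_point(K, S)                   # forward problem, cf. (3.18)
psi_star = solve_fixed_point(transpose(K), g)   # adjoint problem, cf. (3.19)

I_forward = dot(g, psi)        # cf. (3.17)
I_adjoint = dot(S, psi_star)   # cf. (3.20)
print(I_forward, I_adjoint)
```

The two printed values agree to rounding error, which is the matrix counterpart of the statement that the source and detector exchange roles between the forward and adjoint problems.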
The references cited in Section 3.6.1 describe various aspects of this importance sampling theory and discuss how to use an importance function to alter the analog sampling functions so that a zero variance estimate of the weighted integral (3.17) or (3.20) can be obtained, given perfect information about the importance function. It appears to have been Maynard [116] who suggested that use of the reciprocity principle for the transport operator might be advantageous when applied to certain kinds of Monte Carlo calculations. In that paper he illustrated the use of reciprocity to estimate absorption in a small region and to solve a flux-peaking problem (literally obtaining the transport solution at a point by Monte Carlo). It was appreciated from the outset that accurate estimation of an importance function is as difficult as estimation of the global solution of the original transport equation, so that various practical approximations were brought into play. At their core, many of these variance reduction methods depend on the identification of an approximate importance function. For example, the assumption that underlies use of the exponential transformation is that the importance function is well described by parametrized exponential functions (see, e.g., [117]). The parametrization is provided by the optical distance between the source of radiation and the detection region under investigation. Use of a simple version of the Russian roulette and splitting techniques [11, 12, 118, 119] amounts to the application of a region-wise constant importance function. At the boundary between two subregions of differing “importance,” particles are split into two or more independent particle “fragments,” each with a reduced weight, when moving from a region of lower to higher importance. 
When moving from a region of higher to lower importance, however, a game of Russian roulette is played that is designed to terminate the random walk with a high probability while increasing the weight of any particles that survive this Russian roulette game in such a way that the unbiased nature of the game is maintained. In the case of shielding problems, use of the exponential transformation is designed to maintain a roughly constant population of particle fragments whose weights should roughly decline exponentially as the particle moves from source to detector. The ideal simulation would, of course, result in every particle reaching the detector with a weight which is precisely the quantity being estimated, for example, the probability of transmission associated with the shield.

Significant effort has continued to be invested in devising mechanisms for reducing variance in difficult Monte Carlo simulations using, for example, splitting and Russian roulette, which essentially rely on input of an approximate importance function that takes particularly simple forms. When this approximate importance function is not sufficiently accurate, history weights can fluctuate wildly, adding to the variance and making reliance on this method very risky. A weight windows strategy [120, 121] that controls weight fluctuations by setting upper and lower limits and using Russian roulette and splitting to maintain those limits assures that variances cannot become too large. However, the parameters needed to guarantee high precision using weight windows also rely on knowledge of how the importance function behaves over the phase space.
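The interplay of splitting, Russian roulette, and a weight window can be sketched in a few lines. This is an illustrative sketch only, not the scheme of [120, 121]: the function name `apply_weight_window`, the choice of survival weight (the midpoint of the window), and the splitting rule are assumptions made here for the example.

```python
import random

def apply_weight_window(weight, w_low, w_high, rng):
    """Return the list of particle weights that continue the random walk.

    Above the window the particle is split; below it, Russian roulette is
    played. In both cases the expected total weight equals `weight`, so the
    game remains unbiased.
    """
    if weight > w_high:
        n = int(weight / w_high) + 1         # number of fragments
        return [weight / n] * n              # total weight conserved exactly
    if weight < w_low:
        w_survive = 0.5 * (w_low + w_high)   # hypothetical survival weight
        if rng.random() < weight / w_survive:
            return [w_survive]               # survivor carries increased weight
        return []                            # random walk terminated
    return [weight]                          # inside the window: unchanged

rng = random.Random(12345)
print(apply_weight_window(3.7, 0.5, 2.0, rng))   # split into equal fragments
print(apply_weight_window(0.1, 0.5, 2.0, rng))   # roulette: [] or one survivor
```

With `w_high` at least twice `w_low`, every fragment produced by the splitting branch lands inside the window. How the window bounds should vary over phase space is exactly the importance information that the text says must be supplied.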
Partly as a natural consequence of this vast effort at creating “practical” variance reduction schemes making use of splitting and Russian roulette, there has been interest [122–127] in the development of an iterative form of importance sampling with goals similar to those achieved for sequential correlated sampling, as described in the previous section. An iterative importance sampling strategy can be based on the construction of ever more accurate approximations to an importance function throughout the entire phase space. Next, we sketch this development.

Returning to Eq. 3.15, use is made of an approximate importance function (as obtained from approximate solutions of an appropriate adjoint transport equation) to modify both the source and the kernel of the original transport equation at each adaptive stage. In other words, Eq. 3.15 is used in its full generality for adaptive importance sampling (AIS) and now both the source $\mathcal{S}^{(k)}$ and the kernel $\mathcal{K}^{(k)}$ will change with $k$. Provided that an improving approximation to the exact importance function can be assured, convergence will also be geometric for this method. Proof of this fact for matrix problems can be found in [109], and details about implementing AIS may be found in [127, 128]. In these references and in [109] one can also find descriptions of how use is made of importance sampling theory, based on the duality between the original transport equation and its adjoint, to modify both the source and the kernel in each adaptive stage.

As for the benefits and disadvantages of AIS, the most serious objection to its use arises from the fact that sampling the nonanalog, importance-modified kernel can be very costly. On the plus side, however, use of an importance function in the sampling usually results in the need for fewer random walks in each adaptive block of histories than is the case for sequential correlated sampling.
Also, convergence rates obtained from a fixed number of histories in each stage tend to be more favorable for especially difficult (i.e., slowly convergent) problems when adaptive importance sampling is used than when sequential correlated sampling is used. Overall, however, each of the two methods is very problem-dependent and one cannot state categorically that it is always better to use one algorithm or the other. It is an interesting and challenging question to determine a priori criteria for deciding which method will be superior for any given transport problem.

A third iterative method, a variant of adaptive importance sampling, was quite recently developed [129] in order to try to combine the best features of the previous two methods. By relaxing the requirement that the importance sampling estimator be unbiased, extra freedom in the choice of density functions used to generate the random walks is obtained. This idea had been suggested much earlier for estimating integrals [130] and was extended to transport equations in [131]. Although it makes use of estimators that are biased, they can be shown to be asymptotically (i.e., as the sample size $N \to \infty$) unbiased and to achieve rates of convergence comparable to those obtainable with (unbiased) importance sampling. It remains to be seen whether or not effective ways can be found to choose the sampling density functions so as to realize the full potential of this new adaptive method.

The investigation of importance sampling for quasi-Monte Carlo applications was taken up in the dissertation of Paul Chelson [88] where it was shown that the method could be used to achieve reduction in the variation of an integrand function or of a transport estimator. In a later dissertation [93], Earl Maize explored the question of quasi-Monte Carlo applications of the weighted analog sampling estimator. Some of these results are summarized in [57].
Although these results pave the way for a possible iterative application of these strategies for quasi-Monte Carlo implementation along the lines described above in the pseudorandom case, such studies have not yet been carried out to the author’s knowledge.

3.6.4 Stratified Sampling

Stratified sampling is another conventional statistical method that has been adapted for estimating integrals or solving transport problems with reduced sampling error. When using Monte Carlo to estimate an integral, stratified sampling amounts to decomposing the range of integration into a number of subsets and applying standard Monte Carlo methods to each subset separately. Optimization of this method then involves deciding how to define the subsets and how many sample values to be used in each.

It is less obvious how to make use of this idea for solving transport problems by Monte Carlo. One application of the principle involved would be to subdivide the phase space into mutually disjoint subsets and distribute the initial collision sites of random walks among them according to some prescription that lowers the sampling errors. In the book [11], one version of this idea is called systematic source sampling, and it was shown there that this method always reduces the variance. An extension of this idea might be to subdivide the sample space of random walks into disjoint subsets and attempt to distribute the random walks among these in an optimal way from the point of view of error reduction. This would require imposing a metric, or measure, on the sample space, rather than on the phase space.
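For the integral-estimation case, the effect of stratification is easy to see in a small experiment. The sketch below is illustrative only: the integrand $e^x$ on $[0, 1]$, the equal-width strata with one sample each, the seed, and the function names are assumptions chosen for the example, and no claim is made about optimal stratum design.

```python
import math
import random

def crude_mc(f, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n

def stratified_mc(f, n_strata, rng):
    """One uniform sample in each of n_strata equal subintervals of [0, 1]."""
    h = 1.0 / n_strata
    return sum(f((k + rng.random()) * h) for k in range(n_strata)) * h

rng = random.Random(2024)
exact = math.e - 1.0            # integral of exp(x) over [0, 1]
n = 10_000                      # same number of function evaluations for both

err_crude = abs(crude_mc(math.exp, n, rng) - exact)
err_strat = abs(stratified_mc(math.exp, n, rng) - exact)
print(err_crude, err_strat)
```

For a smooth integrand such as this one, the stratified error is typically several orders of magnitude smaller than the crude error at equal cost, because each sample only has to resolve the variation of the integrand within its own stratum.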
Recalling that the implementation of Monte Carlo methods for (discrete or continuous) transport problems involves generating the various collision points of each random walk one at a time by random sampling, the use of low-discrepancy sequences rather than pseudorandom ones might be seen as a kind of intensive stratified sampling employed over the unit hypercube rather than over the phase space or the sample space. Thus, the development of quasi-Monte Carlo for the solution of transport problems can be likened to using stratified sampling of the individual collision sites visited by the random walks. As we have seen earlier, this can lower the sampling error by increasing the rate of convergence of sample averages to theoretical expected values, depending upon the phase space dimension and the number of samples processed.

An idea that appears to be related to these is introduced in [132]. In that paper, the author replaces the continuous transport problem by a discrete one by representing the discrete states as obtained by averaging over a user-specified decomposition of the continuous phase space. This leads to a replacement of the integral transport equation by a matrix problem that is inherently much easier to solve than the original one. A similar point of view is taken in [88, 133, 134], where it was also demonstrated that the use of low-discrepancy, rather than pseudorandom, sequences would result in further advantages in terms of rate of convergence.

It is primarily because of the potentially useful connections between stratified sampling as a conventional statistical strategy for reducing error and the use of low-discrepancy sequences that we felt it important enough to warrant separate mention here.

3.6.5 Use of Expected Values

Many important devices for lowering simulation error succeed because some component of statistical fluctuation in the computation has been replaced by a nonstochastic, exact or approximate, analytic representation.
Perhaps the most obvious example of this involves the use of expected values wherein an estimating random variable is replaced by one that has been obtained by computing averages over certain events where this is possible. For example, the use of absorption weighting, or survival biasing, for transport simulations consists of disallowing nonproductive absorption events in favor of reductions of the particle weight through multiplication by the survival probability at each interaction where absorption would normally have been possible. This technique is easily shown to reduce variance at the expense, however, of increased cost per random walk.

A more sophisticated use of expected values for transport applications is normally called next-event estimation or expected value scoring. We sketch its use in estimating the tally at some detector in a problem. Instead of waiting until a random walk has reached the detector to record a tally, one can calculate exactly the expected direct (i.e., assuming no intervening collisions) contribution to the detector along the current direction of each random walk. This produces an exponential that involves the optical distance between the current position and detector (see, e.g., Section 3.6 of [11]). The general idea, of course, is to decrease the variability in the tally produced by each random walk simulated by eliminating event-level variability through this theoretical averaging device. By summing all of the event-level contributions (many of which may be zero) over all the events of each history, one is able to extract more information from each random walk without altering the walk probabilities. One might be tempted to conjecture that it is always advantageous to perform such averaging whenever theoretical averages can be calculated without adding too much to the computational cost.
However, reduction in the event-level variability does not necessarily lower the history-level variability, as simple model problem analyses can reveal. For example, if the size of the detector shrinks to a very small volume or even a point, difficulties can be anticipated (see [12]). Therefore, since event-level averaging does add to the computational cost, it is wise to apply next-event estimation with a certain amount of caution. Obtaining the flux or collision density at a point can, however, actually be accomplished by means of a simulation of the transport equation dual to the original one. This will be discussed further in the next section.

A nontrivial application of the use of expected values has been worked out recently [135] for the estimation of pulse height tallies in problems arising in oil well logging. In that application, the technique was responsible for improvements that were impressive in simulations of practical importance. When used in conjunction with a weight windows strategy, the gains in efficiency were even more pronounced.

An idea related to the use of expected values led to the development of Monte Carlo codes to estimate spatial density functions in infinite media. This approach makes use of the fact that, in a homogeneous medium, the spatial density of particles undergoing random walk can be determined analytically once the sequence of collision parameters is known. Thus, one can use Monte Carlo to sample the sequence of angles and energies and, using these as parameters, it is possible to calculate the probability that a random walking particle suffers any collision between parallel planes at distances $z$ and $z + \Delta z$ from the point of origin. This led to the creation of Monte Carlo codes [136] to calculate estimates of entire spatial density functions and other quantities related to these functions that avoided the need to perform any spatial sampling at all.
The method is mentioned here because, while it is not, strictly speaking, an example of the use of expected values in the sense described earlier, it does take advantage of the idea that it is most efficient to use Monte Carlo sampling only for those elements of the transport model that cannot be treated analytically. The interested reader can find details in [136].

3.6.6 Other Error Reduction Strategies

Space limitations permit only the mention of two other error reduction methods, perturbation Monte Carlo methods and condensed history methods, that seem too important to omit, even in such a highly compressed survey. Each of these methods has commanded a lot of attention over a long period of time and each has contributed greatly to the class of transport problems that can be solved effectively by Monte Carlo methods.

The correlated sampling method on which perturbation analysis relies is described in [12, 137–139]. Perturbation techniques have been valuable in estimating small differences accurately by Monte Carlo methods. There are several ways to achieve this, but the purest is to use the same random number or low-discrepancy sequence to generate a set of random walks that can serve to solve both a background, unperturbed, transport problem and a perturbation of that problem. It achieves this by associating two different sets of weights with the histories, weights that reflect the differences between the two problems in a statistically faithful way. Then the strong correlation between the pair of problems makes estimates of the small perturbing effects much more reliable than if the two problems were solved using independently generated random walks. More formally, the key is to track a pair of tallying random variables $\xi$, $\hat{\xi}$, where $\xi$ is an unbiased estimator in the unperturbed problem, and $\hat{\xi}$ estimates the same quantity for the perturbed problem.
For example, the unknown to be estimated might be the transmissivity of a shielding array and $\xi$ could be a straightforward analog estimate of transmissivity for some design specification of the shield geometry, while $\hat{\xi}$ might represent the estimate of transmissivity with one (or more) small changes introduced in the shielding configuration. Then if $\mu$ denotes the analog measure characteristic of the design shielding array, and if $\hat{\mu}$ denotes the analog measure characteristic of the perturbed configuration, the identity

$$\int \hat{\xi}\,d\mu = \int \xi\,d\hat{\mu} \tag{3.21}$$

holds provided

$$\hat{\xi} = \xi\,\frac{d\hat{\mu}}{d\mu}. \tag{3.22}$$

Defining $\hat{\xi}$ in this way serves to correlate estimates of $\int \xi\,d\mu$ and $\int \hat{\xi}\,d\mu$. In (3.21), the left-hand side is an expectation taken with respect to the measure $\mu$ of the design geometry, and the right-hand side is the expected transmissivity for the perturbed geometry. A single set of random walks generated according to $\mu$ is thus used to represent the solution of two (or more) problems: one with no perturbation, and the other with the perturbation of known size and composition. In other words, when both $\xi$ and $\hat{\xi}$ are averaged with respect to a single set of random walks generated according to the “background” measure $\mu$, they must be highly positively correlated for small perturbations.

Through an extension of this idea, estimates can be obtained of the rates of change of some output quantity with respect to changes in one or more parameters of the problem. For example, if the geometry of the perturbation is assumed to be known but its composition is unknown, this idea can be used to estimate the effect on various output quantities of changes in the composition of the perturbing region. This, then, provides an extremely powerful tool for performing sensitivity analysis by means of Monte Carlo. Finally, the availability of accurate estimates of these derivatives can be combined with the use of an optimization algorithm to provide solutions to inverse problems.
Such a solution would be provided, for example, by finding the best (in the sense of least squares) fit with a set of measured or experimentally observed quantities. A recent example of the use of this method to solve an inverse problem arising in biomedical optics can be found in [140, 141].

Condensed history models, which have been widely used for modeling the transport of ionizing radiation for many years (see [142–147]), have been shown to speed up Monte Carlo simulations significantly while retaining reasonable accuracy for many problems. They achieve this acceleration by replacing detailed, collision-by-collision sampling with multiple collision models or equivalent compressions of information. One such method makes use of similarity relationships for the transport equation [148]. The critical idea is to rewrite the transport equation in terms of altered parameters in such a way that the solution to the equation remains unchanged, and then to introduce an approximation that is designed to preserve certain moments of the solution, yet speed up the computation significantly. Specifically, the one-speed integro-differential transport equation is

$$\left[\frac{1}{v}\frac{\partial}{\partial t} + \Omega\cdot\nabla + \Sigma_t(r)\right]\psi(r, \Omega, t) = \int_{4\pi} \psi(r, \Omega', t)\,\Sigma_s(r, \Omega' \to \Omega)\,d\Omega' + q(r, \Omega, t) \tag{3.23}$$

where the solution $\psi$ describes the radiation flux, $q$ is the physical source, $\Sigma_t(r) = \Sigma_s + \Sigma_a$ is the total cross section, decomposed as the sum of scattering and absorption cross sections, $(r, \Omega)$ describe the position and unit vector along the direction of motion, respectively, of a typical particle, and $t$ denotes the time. This equation can be rewritten as

$$\left[\frac{1}{v}\frac{\partial}{\partial t} + \Omega\cdot\nabla\right]\psi(r, \Omega, t) - q(r, \Omega, t) + I(r, \Omega, t) = 0 \tag{3.24}$$

where

$$I(r, \Omega, t) = \Sigma_a(r)\,\psi(r, \Omega, t) + \Sigma_s(r)\left[\psi(r, \Omega, t) - \int_{4\pi} \psi(r, \Omega', t)\,\frac{f(r, \mu)}{2\pi}\,d\Omega'\right] \tag{3.25}$$

and $f(r, \mu)$ is the probability density function for $\mu = \Omega\cdot\Omega'$, the cosine of the scattering angle. Clearly, then, the solution of Eq.
3.24 is unchanged by the replacement of $\Sigma_a(r)$, $\Sigma_s(r)$, and $f(r, \mu)$ by $\Sigma_a^*(r)$, $\Sigma_s^*(r)$, and $f^*(r, \mu)$, respectively, that use a different set of physical parameters, provided that the quantity $I(r, \Omega, t)$ is unchanged. This substitution yields the condition

$$\left[\Sigma_a - \Sigma_a^* + \Sigma_s - \Sigma_s^*\right]\psi(r, \Omega, t) = \int_{4\pi} \psi(r, \Omega', t)\,\frac{\Sigma_s f(r, \mu) - \Sigma_s^* f^*(r, \mu)}{2\pi}\,d\Omega' \tag{3.26}$$

which, upon expanding $\psi$ into spherical harmonics and the angular probability density functions $f$ and $f^*$ into Legendre polynomials and simplifying, gives the following family of equivalence relations:

$$S = \frac{\Sigma_s^*}{\Sigma_s} = \frac{1 - f_n}{1 - f_n^*}, \qquad n = 1, 2, \ldots, N. \tag{3.27}$$

Here, the $n$th terms correspond to the $n$th coefficients in the Legendre expansions of the angular probability density functions, $f$ and $f^*$, and $N$ is the similarity relation order. By choosing larger $N$, one increases the simulation accuracy, admitting more directional anisotropy and thereby approximating the exact transport solution more closely. The assumption that only $N-1$ angular moments, i.e., only $N$ terms in the series expansion, are significant produces the efficiency $S = 1 - f_N$. From Eq. 3.27 it follows that the similarity efficiency, $S$, lies between 0 and 1. It is also clear from this equation that for optimal efficiency, one wants to choose $\Sigma_s^*$ as small as possible for the given $N$. For many problems involving optically dense media, $\Sigma_s \approx \Sigma_t$ (absorption is negligible compared to scattering), $\Sigma_t$ is very large, and reductions in $\Sigma_s$ accelerate the simulation by increasing the mean free path by a factor of approximately $1/S$.

A somewhat different approach assumes that each particle travels a fixed distance $s$ between collisions, where $s$ is chosen to be greater than the inherent mean free path and remains fixed throughout simulation. This artificial enlargement of the mean free path is then accompanied by treating each scattering event by means of a multiple-scattering model, rather than by sampling from the original single scattering phase function repeatedly.
The theory that underlies the multiple-scattering model is due to Goudsmit and Saunderson [142, 149]. This method has been carefully analyzed by various authors [145–147, 150] and has been shown to produce very good results, both with respect to speed and accuracy, over a wide range of problems arising in connection with radiation therapy planning.

3.6.7 State of the Art

A great deal of progress has been made in advancing the theory and practice of error reduction methods since the publication of [13]. Even so, the time has not yet arrived when Monte Carlo methods can be used routinely to solve general transport problems, even though codes such as MCNP [15] are in very wide use globally. The optimum choice of a sampling/weighting strategy depends in a complicated way on the details of the problem and, without fairly sophisticated controls, error reduction techniques, such as importance sampling, can produce poor results. The recent development of adaptive Monte Carlo algorithms offers the promise of a general-purpose code that automatically extracts the information it requires for each problem; such a code might then finally fulfill this promise. It remains to be seen whether an extremely efficient general-purpose code can be implemented, perhaps making use of advanced adaptive methods that converge geometrically to prescribed quantities with very small statistical errors. In the meantime, special codes for special applications will still be needed in order to meet very high precision goals for important problems/projects.

3.7 Foundations/Theoretical Developments

The fact that several of the most important publications of the early period were classified may well have prevented more rapid dissemination of the critical theoretical ideas. Then too, the emphasis throughout this period was on results, whether in support of the Manhattan project or to speed up development of the nuclear navy.
In both cases, the “proofs of principle” were established by successful implementations, not mathematically rigorous arguments. Indeed, because Monte Carlo methods are so intuitively plausible, there may well have been a reduced dependence on rigor tolerated by its early practitioners. Coupled with the fact that those directly involved with the early advances were preoccupied with races against the clock in one way or another, publication in scholarly journals was often postponed or foregone in the interest of national security.

In time, however, the stimuli for improved understanding of the mathematical foundations underlying Monte Carlo methods were at least twofold: the appearance of perplexing or anomalous numerical results (e.g., erratic behavior of sample means in some problems; the possibility of theoretically infinite variance importance sampling strategies that seem quite reasonable otherwise, etc.) and the pressure for constant improvement in new results obtained via Monte Carlo. In turn, not unexpectedly, these newer methods often depended on sophisticated transformations of the original problem into problems fully equivalent to it but easy to solve by Monte Carlo. With this came the imposition of increased mathematical sophistication and understanding.

Recognition that transport problems are natural infinite-dimensional extensions of finite-dimensional quadrature problems has provided the motivation for developing a rigorous measure-theoretic foundation for Monte Carlo solutions of transport problems. This allows a demonstration of the equivalence of formulations based on the transport equation (analytic model), Monte Carlo sampling (probability model), and quasi-Monte Carlo sampling (deterministic, number-theoretic model) [11, 58, 151].

It has also been valuable to recognize that Monte Carlo solutions of matrix problems can further our understanding of Monte Carlo solutions of continuous transport problems.
The mathematical models of the two classes of problems are very similar. In both the discrete and the continuous cases, the solution can be represented as an infinite series of terms, each of which accounts for random walks that make exactly $k$ collisions, $k = 1, 2, \ldots$. The degree of difficulty of each problem can be measured, to some extent, by the rate of convergence of this infinite series. For example, the effective dimension of either type of problem can be defined as the product of the average number of collisions made by the random walks and the phase space dimension, as we remarked earlier. If this effective dimension is higher, more computational effort must be invested, in general, in order to solve the problem.

An important contribution to our theoretical understanding of the analysis of error in Monte Carlo calculations was advanced in the 1970s in a series of publications by Harvey Amster and his collaborators [152–155]. The key idea involved is to relate the expected value of an estimating random variable to an adjoint transport equation. For example, consider the integral transport equation for the collision density

$$\Psi(P) = S(P) + \int K(P, Q)\,\Psi(Q)\,dQ, \tag{3.28}$$

where

$$K(P, Q) = \int C(P, P')\,T(P', Q)\,dP', \tag{3.29}$$

and the kernel $T$ describes free flight transport and the kernel $C$ describes the collision mechanics.
The function $S$ is the density of initial collision states and is related to the physical source density $q$ via

$$S(P) = \int T(P, P')\,q(P')\,dP'. \tag{3.30}$$

It is instructive to write down the equation adjoint to (3.28):

$$\Psi^*(P) = S^*(P) + \int K^*(P, Q)\,\Psi^*(Q)\,dQ \tag{3.31}$$

so that, as we showed in Section 3.6, we have the equality

$$\int \Psi(P)\,S^*(P)\,dP = \int \Psi^*(P)\,S(P)\,dP. \tag{3.32}$$

Substitution of (3.30) into (3.32) gives

$$\int \Psi(P)\,S^*(P)\,dP = \int\!\!\int \Psi^*(P)\,T(P, P')\,q(P')\,dP'\,dP.$$

Now if we denote

$$M_1(P) = \int \Psi^*(P')\,T(P, P')\,dP',$$

it follows quite readily that the adjoint solution $\Psi^*(P)$ must be the expected contribution to the estimation of the “reaction rate” $I = \int \Psi(P)\,S^*(P)\,dP$ due to a particle originating at $P$ from the physical source $q$. It is also straightforward to show that the integral equation satisfied by the function $M_1$ is

$$M_1(P) = \int T(P, P')\,S^*(P')\,dP' + \int L(P, P')\,M_1(P')\,dP' \tag{3.33}$$

where

$$L(P, Q) = \int T(P, P')\,C(P', Q)\,dP'.$$

Making use of uniqueness of solutions of the transport equation and its adjoint, it is not difficult to show that Eq. 3.33 is adjoint to the integro-differential form of the original transport process, making the function $M_1$ an importance function for this problem. Higher moment equations, that is, integral equations for the expected square, cube, etc. of the score can be derived in similar fashion. This technique was initially developed as a mechanism for comparing variances of competing Monte Carlo strategies, thereby providing useful information about whether or not a proposed Monte Carlo method might be superior to an alternative formulation. However, the connection between these moment equations and the original transport process has led to a variety of other uses over the years.
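The interpretation of an adjoint solution as an expected score can be illustrated on a discrete (matrix) analogue, in the spirit of the remark above that matrix problems help in understanding continuous ones. The sketch below is hypothetical throughout: the three-state kernel `K`, the detector `g`, the seed, and the function names are invented, and a simple collision estimator is assumed. It compares the deterministic adjoint solution with the average score of analog random walks started in each state.

```python
import random

# K[i][j]: probability of transfer from state j to state i; the deficit
# 1 - sum_i K[i][j] is the absorption probability in state j (made-up data).
K = [[0.2, 0.3, 0.0],
     [0.3, 0.1, 0.4],
     [0.1, 0.2, 0.3]]
g = [0.0, 0.0, 1.0]   # score recorded at each collision in state 2

def adjoint_solution(kernel, detector, sweeps=500):
    """Solve psi* = K^T psi* + g by fixed-point iteration."""
    n = len(detector)
    y = list(detector)
    for _ in range(sweeps):
        y = [detector[j] + sum(kernel[i][j] * y[i] for i in range(n))
             for j in range(n)]
    return y

def mean_score(start, n_walks, rng):
    """Average collision-estimator score of analog walks begun in `start`."""
    total = 0.0
    for _ in range(n_walks):
        state = start
        while state is not None:
            total += g[state]              # tally at every collision
            u, cumulative, nxt = rng.random(), 0.0, None
            for i in range(len(g)):
                cumulative += K[i][state]
                if u < cumulative:
                    nxt = i
                    break
            state = nxt                    # None means the walk was absorbed
    return total / n_walks

rng = random.Random(7)
psi_star = adjoint_solution(K, g)
estimates = [mean_score(j, 100_000, rng) for j in range(3)]
print(psi_star)
print(estimates)
```

Each entry of `estimates` agrees with the corresponding entry of `psi_star` to within statistical error, which is the discrete version of the statement that the adjoint solution measures the expected contribution of a particle according to where it starts.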
3.7.1 State of the Art

We now know how to demonstrate equivalence between an analytic model of a given problem derived from the transport equation with either a probabilistic model to be used for pseudorandom simulation or a deterministic model to be used for quasi-random simulation. Likewise, the derivation of coupled integral equations for expected values of higher moments of estimators is well understood. However, applying these models and derivations to achieve optimal performance is still as much art as it is science. Nevertheless, such theoretical advances have proven to be of great value in advancing the development of effective Monte Carlo techniques.

3.8 Challenges

In this chapter, we have focused on the Monte Carlo solution of integral equations, and especially transport equations, as seemed in keeping with the spirit of the Gelbard lecture series. We have, nevertheless, tried to indicate that estimation of definite integrals and other conventional and unconventional Monte Carlo application areas have also spurred developments highly relevant to the solution of transport problems. We close this chapter with some thoughts about open questions whose resolution would, in our opinion, further advance the theory and practice of Monte Carlo methods for traditional nuclear applications.

We have seen that high-dimensional integrals and transport problems can both benefit from the use of a judiciously chosen mixture of pseudorandom and quasirandom generating sequences. We have also suggested that these same techniques ought to play a useful role in the Monte Carlo solution of transport problems, but this has not been attempted in any systematic way as yet.

A relatively new branch of the theoretical analysis of Monte Carlo algorithms deals with their computational complexity (see, e.g., [87] and references cited therein). It would certainly be instructive to develop practical criteria for optimality with respect to complexity.
The use of perturbation and differential Monte Carlo tools, coupled with optimization algorithms, to solve inverse problems is a rather new and important development, in our view. Much more effort would seem warranted for such problems.

We have sketched the development of another rather new circle of ideas: the use of sequential or adaptive algorithms to achieve very rapid convergence of Monte Carlo estimates. Three different strategies have evolved to date: sequential correlated sampling, adaptive importance sampling, and generalized weighted analog sampling. It will be important to learn how these, or possibly others, can be used to solve various families of practical transport problems easily and efficiently.

Strengthening our theoretical understanding of the application of low-discrepancy sequences to difficult transport problems would seem to be overdue, inasmuch as most traditional Monte Carlo programs continue to rely on pseudorandom sequence generators. At a minimum it would be well to derive more precise criteria for estimating the overall efficiency of a program implemented with low-discrepancy sequences in terms of the parameters that characterize the problem. This will, no doubt, prove to be a difficult exercise.

Acknowledgments

The author gratefully acknowledges the support of the Laser Microbeam and Medical Program NIH P-41-RR-01192, and grants UCOP 41730 and NSF/DMS 0712853 during the preparation of this chapter.

References

1. Buffon GC (1777) Essai d’arithmétique morale. Supplément à l’histoire naturelle 4
2. Laplace MP-S (1886) Théorie analytique des probabilités, Livre 2, contained in Oeuvres complètes de Laplace, de l’Académie des Sciences, vol 7, part 2. Paris, pp 365–366
3. Mantel N (1953) An extension of the Buffon needle problem. Ann Math Stat 24:674–677
4. Kahan BC (1961) A practical demonstration of a needle experiment designed to give a number of concurrent estimates of π. J R Stat Soc Series A 124:227–239
5.
Hammersley JM, Morton KW (1956) A new Monte Carlo technique: antithetic variates. Proc Camb Phil Soc 52:449–475
6. Fishman G (1996) Monte Carlo: Concepts, Algorithms, and Applications, Springer Series in Operations Research
7. Cashwell ED, Everett J (1959) Monte Carlo method for random walk problems. Pergamon, New York
8. Hammersley JM, Handscomb DC (1964) Monte Carlo methods. Methuen & Co., Ltd., London
9. Kalos MH, Whitlock PA (1986) Monte Carlo methods, Volume I: basics. Wiley-Interscience, New York
10. Niederreiter H (1992) Random number generation and quasi-Monte Carlo methods, #63 in CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, PA
11. Spanier J, Gelbard EM (1969) Monte Carlo principles and neutron transport problems. Addison-Wesley, Reading, MA
12. Lux I, Koblinger L (1991) Monte Carlo particle transport methods: neutron and photon calculations. CRC Press, Boca Raton, FL
13. Kalos MH, Nakache FR, Celnik J (1968) Monte Carlo methods in reactor computations. In: Greenspan H, Kelber CN, Okrent D (eds) Computing methods in reactor physics. Gordon & Breach, New York, pp 365–438
14. Greenspan H, Kelber CN, Okrent D (eds) (1968) Computing methods in reactor physics. Gordon & Breach, New York, pp 365–438
15. X-5 Monte Carlo Team (2003) MCNP – a general N-particle transport code, Version 5. LA-UR-03–1987, Los Alamos National Laboratory
16. Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44:335–341
17. Kahn H (1954) Applications of Monte Carlo, RAND Corp. Report AECU-3259 (April 1954; revised April 1956)
18. Kahn H (1956) Use of different Monte Carlo sampling techniques. In: Meyer HA (ed) Symposium on Monte Carlo methods. Wiley, New York, pp 146–190
19. Coveyou RR (1969) Random number generation is too important to be left to chance. Appl Math 3:70–111
20. Devroye L (1986) Non-uniform random variate generation. Springer, New York
21.
Stadlober E, Kremer R (1992) Sampling from discrete and continuous distributions with C-Rand. In: Pflug G, Dieter U (eds) Simulation and optimization. Lecture notes in economics and math. systems, vol 374. Springer, Berlin, pp 154–162
22. Stadlober E, Niederl F (1994) C-Rand: a package for generating nonuniform random variates. In Compstat’94, Software Descriptions, pp 63–64
23. Lehmer DH (1964) Mathematical methods in large-scale computing units. Proc 2nd Symp on Large-Scale Calculating Machinery (1949), Ann Comp Lab Harvard Univ 26:141–146
24. Knuth DE (1998) The art of computer programming, Seminumerical Algorithms, vol 2, 3rd edn. Addison-Wesley, Reading, MA
25. Coveyou RR (1960) Serial correlation in the generation of pseudo-random numbers. J ACM 7:72–74
26. MacLaren MD, Marsaglia G (1965) Uniform random number generators. J ACM 12:83–89
27. Marsaglia G (1968) Random numbers fall mainly in the planes. Proc Natl Acad Sci USA 61:25–28
28. Anderson SL (1990) Random number generators on vector supercomputers and other advanced architectures. SIAM Rev 32:221–251
29. Dagpunar J (1988) Principles of random variate generation. Oxford University Press, Oxford
30. Deak I (1989) Random number generators and simulation. Akademiai Kiado, Budapest
31. Dieter U (1986) Non-uniform random variate generation. Springer, New York
32. James F (1990) A review of pseudorandom number generators. Comp Phys Commun 60:329–344
33. L’Ecuyer P (1990) Random numbers for simulation. Commun ACM 33:85–97
34. L’Ecuyer P (1994) Uniform random number generation. Ann Oper Res 53:77–120
35. Niederreiter H (1995) New developments in uniform random number and vector generation. In: Niederreiter H, Shiue PJ-S (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics #106. Springer, New York, pp 87–120
36. Niederreiter H (1993) Finite fields, pseudorandom numbers, and quasi-random points.
In: Mullen GL, Shiue PJ-S (eds) Finite fields, coding theory, and advances in communications and computing. Marcel Dekker, New York, pp 375–394
37. Bergstrom V (1936) Einige Bemerkungen zur Theorie der Diophantischen Approximationen. Fysiogr Salsk Lund Forh 6(13):1–19
38. Van der Corput JG, Pisot C (1939) Sur la Discrépance Modulo un. Indag Math 1:143–153, 184–195, 260–269
39. Bratley P, Fox BL, Schrage LE (1987) A guide to simulation, 2nd edn. Springer, New York
40. Fishman G, Moore III LS (1986) An exhaustive analysis of multiplicative congruential random number generators with modulus $2^{31} - 1$. SIAM J Sci Stat Comp 7:24–45
41. Fishman G (1989) Multiplicative congruential random number generators with modulus $2^{\beta}$: an exhaustive analysis for $\beta = 32$ and a partial analysis for $\beta = 48$. Math Comp 54:331–344
42. Ripley BD (1983) The lattice structure of pseudo-random number generators. Proc R Soc Lond Ser A 389:197–204
43. Law AM, Kelton WD (2002) Simulation modeling and analysis, 3rd edn. McGraw-Hill, New York
44. L’Ecuyer P (1998) Random number generation. Chapter 4. In: Banks J (ed) Handbook of simulation. Wiley, New York, pp 93–137
45. Hellekalek P, Larcher G (eds) (1998) Random and quasi-random point sets, vol 138 of Lecture Notes in Statistics. Springer, New York
46. L’Ecuyer P, Simard R (2000) On the performance of birthday spacings tests for certain families of random number generators. Math Comp Simul 55:131–137
47. L’Ecuyer P (1999) Good parameters and implementations for combined multiple recursive random number generators. Oper Res 47:159–164
48. Eichenauer J, Lehn J (1986) A nonlinear congruential pseudorandom number generator. Stat Papers 27:315–326
49. L’Ecuyer P (2002) Random numbers. In: Smelser NJ, Baltes PB (eds) The international encyclopedia of the social and behavioral sciences. Pergamon, Oxford, pp 12735–12738
50.
Matsumoto M, Nishimura T (1998) Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans Model Comp Simul 8:3–30
51. L’Ecuyer P, Andres TH (1997) A random number generator based on the combination of four LCGs. Math Comp Simul 44:99–107
52. L’Ecuyer P (1999) Tables of maximally equidistributed combined LFSR generators. Math Comp 68:261–269
53. Zaremba SK (1968) The mathematical basis of Monte Carlo and quasi-Monte Carlo methods. SIAM Rev 10:304–314
54. Keller A (1995) A Quasi-Monte Carlo Algorithm for the global illumination problem in the radiosity setting. In: Niederreiter H, Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics 106. Springer, New York, pp 239–251
55. Keller A (1998) The quasi-random walk. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and Quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 277–291
56. Morokoff WJ, Caflisch RE (1993) A Quasi-Monte Carlo approach to particle simulation of the heat equation. SIAM J Num Anal 30:1558–1573
57. Spanier J, Maize EH (1994) Quasi-random methods for estimating integrals using relatively small samples. SIAM Rev 36:18–44
58. Spanier J (1995) Quasi-Monte Carlo methods for particle transport problems. In: Niederreiter H, Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics 106. Springer, New York, pp 121–148
59. Boyle P (1977) Options: a Monte Carlo approach. J Fin Econ 4(4):323–338
60. Morokoff WJ, Caflisch RE (1997) Quasi-Monte Carlo simulation of random walks in finance. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 340–352
61. Joy C, Boyle P, Tan KS (1996) Quasi-Monte Carlo methods in numerical finance. Manage Sci 42:926–938
62.
Tezuka S (1998) Financial applications of Monte Carlo and Quasi-Monte Carlo methods. In: Hellekalek P, Larcher G (eds) Random and quasi-random point sets, Lecture Notes in Statistics 138. Springer, New York, pp 303–332
63. Cohen M, Wallace J (1993) Radiosity and realistic image synthesis. Academic Press Professional, Cambridge
64. Lafortune E (1996) Mathematical models and Monte Carlo algorithms for physically based rendering. Ph.D. dissertation, Katholieke Universiteit, Leuven, Belgium
65. Van der Corput JG (1935) Verteilungsfunktionen I, II, Nederl. Akad Wetensch Proc Ser B, 38:813–821, 1058–1066
66. Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Num Math 2:84–90
67. Korobov NM (1959) The approximate computation of multiple integrals. Dokl Akad Nauk SSSR 124:1207–1210 (in Russian)
68. Hlawka E (1962) Zur Angenäherten Berechnung Mehrfacher Integrale. Monatsch Math 66:140–151
69. Hua K, Wang Y (1981) Applications of number theory to numerical analysis. Springer, Berlin
70. Sloan IH, Joe S (1994) Lattice methods for multiple integrals. Oxford University Press, Oxford
71. Owen A (1995) Randomly permuted (t, m, s)-nets and (t, m, s)-sequences. In: Niederreiter H, Shiue PJ (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics 106. Springer, New York, pp 299–317
72. Faure H (1992) Good permutations for extreme discrepancy. J Num Theor 41:47–56
73. Wang J, Hickernell FJ (2000) Randomized Halton sequences. Math Comp Model 32:887–899
74. Moskowitz B (1995) Quasi-random diffusion Monte Carlo. In: Niederreiter H, Shiue PJ-S (eds) Monte Carlo and quasi-Monte Carlo methods in scientific computing, Lecture Notes in Statistics, 106. Springer, Berlin, pp 278–298
75. Coulibaly I, Lecot C (1998) Monte Carlo and quasi-Monte Carlo algorithms for a linear integro-differential equation.
In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 176–188
76. Okten G (1999) High dimensional integration: a construction of mixed sequences using sensitivity of the integrand. Technical Report, Ball State University, Muncie, IN
77. Okten G (2000) Applications of a hybrid Monte Carlo sequence to option pricing. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods, 1998. Springer, New York, pp 391–406
78. Moskowitz BS (1993) Application of quasi-random sequences to Monte Carlo methods. Ph.D. dissertation, UCLA
79. Okten G (1999) Random sampling from low discrepancy sequences: applications to option pricing. Technical Report, Ball State University, Muncie, IN
80. Paskov SH (1997) New methodologies for valuing derivatives. In: Pliska S, Dempster M (eds) Mathematics of securities. Isaac Newton Institute, Cambridge University Press, Cambridge
81. Paskov SH, Traub JF (1995) Faster valuation of financial derivatives. J Portfolio Manage 22:113–120
82. Cukier RI, Levine HB, Shuler KE (1978) Nonlinear sensitivity analysis of multiparameter model systems. J Comp Phys 26:1–42
83. Owen A (1992) Orthogonal arrays for computer experiments, integration and visualization. Statistica Sinica 2:439–452
84. Radovic I, Sobol’ IM, Tichy RF (1996) Quasi-Monte Carlo methods for numerical integration: comparison of different low discrepancy sequences. Monte Carlo Meth Appl 2:1–14
85. Sobol IM (1993) Sensitivity estimates for nonlinear mathematical models. MMCE 1:407–414
86. Sloan IH, Wozniakowski H (1998) When are quasi-Monte Carlo algorithms efficient for high dimensional integrals? J Complexity 14:1–33
87. Wozniakowski H (2000) Efficiency of quasi-Monte Carlo algorithms for high dimensional integrals. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, New York, pp 114–136
88.
Chelson P (1976) Quasi-random techniques for Monte Carlo methods. Ph.D. dissertation, The Claremont Graduate School, Claremont
89. Spanier J, Li L (1998) Quasi-Monte Carlo methods for integral equations. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 398–414
90. Koksma JF (1942–1943) Een Allgemeene Stelling uit de Theorie der Gelijkmatige Verdeeling Modulo 1. Mathematica B. (Zutphen) 11:7–11
91. Hlawka E (1961) Funktionen von Beschränkter Variation in der Theorie der Gleichverteilung. Ann Mat Pura Appl 54:325–333
92. Hickernell FJ (2006) Koksma-Hlawka inequality. In: Kotz S, Johnson NL, Read CB, Balakrishnan N, Vidakovic B (eds) Encyclopedia of statistical sciences, vol 6, 2nd edn. Wiley, Hoboken, NJ, pp 3862–3867
93. Maize EH (1981) Contributions to the theory of error reduction in quasi-Monte Carlo methods. Ph.D. dissertation, The Claremont Graduate School, Claremont
94. Niederreiter H, Xing C (1998) The algebraic-geometry approach to low-discrepancy sequences. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York, pp 139–160
95. Niederreiter H (2000) Construction of (t,m,s)-nets. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods, 1998. Springer, New York, pp 70–85
96. Sobol IM (1967) The distribution of points in a cube and the approximate evaluation of integrals. Zh Vychisl Mat i Mat Fiz 7:784–802 (in Russian)
97. Faure H (1981) Discrépances de Suites Associées à un Système de Numération (en Dimension un). Bull Soc Math France 109:143–182
98. Faure H (1982) Discrépances de Suites Associées à un Système de Numération (en Dimension S). Acta Arith 41:337–351
99. Halton J (1998) Independence of quasi-random sequences and sets.
Working Paper CB#3175, University of North Carolina, Chapel Hill, NC
100. Okten G (1998) Error estimation for Quasi-Monte Carlo methods. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996, Lecture Notes in Statistics 127. Springer, New York
101. Goertzel G, Kalos MH (1958) Monte Carlo methods in transport problems. In: Hughes DJ, Sanders JE, Horowitz J (eds) Progress in nuclear energy, vol II, series I, Physics and Mathematics. Pergamon, New York, pp 315–369
102. Leimdorfer M (1964) On the transformation of the transport equation for solving deep penetration problems by the Monte Carlo methods. Trans Chalmers University of Tech, #286. Goteborg, Sweden
103. Leimdorfer M (1964) On the use of Monte Carlo methods for calculating the deep penetration of neutrons in shields. Trans Chalmers University of Tech, #287. Goteborg, Sweden
104. Coveyou RR, Cain VR, Yost KJ (1967) Adjoint and importance in Monte Carlo application. Oak Ridge National Laboratory Report ORNL-4093
105. Kalos MH (1963) Importance sampling in Monte Carlo shielding calculations – neutron penetration through thick hydrogen shields. Nucl Sci Eng 16:227
106. Goertzel G (1949) Quota sampling and importance functions in stochastic solution of particle problems. Oak Ridge National Laboratory Report ORNL-434
107. Halton JH (1970) A retrospective and prospective survey of the Monte Carlo method. SIAM Rev 12:1–63
108. Gelbard EM, Spanier J (1964) Use of the superposition principle in Monte Carlo resonance escape calculations. Trans Am Nucl Soc 7:259–260
109. Halton J (1962) Sequential Monte Carlo. Proc Camb Phil Soc 58:57–73
110. Halton J (1994) Sequential Monte Carlo techniques for the solution of linear systems. J Sci Comp 9:213–257
111. Kong R (1999) Transport problems and Monte Carlo methods. Ph.D. dissertation, Claremont Graduate University, Claremont
112. Kong R, Spanier J (2000) Sequential correlated sampling methods for some transport problems.
In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, Berlin, pp 238–251
113. Kong R, Spanier J (2000) Error analysis of sequential Monte Carlo methods for transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, Berlin, pp 252–272
114. Spanier J (2000) Geometrically convergent learning algorithms for global solutions of transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, Berlin, pp 98–113
115. Hickernell FJ, Lemieux C, Owen AB (2005) Control variates for quasi-Monte Carlo, 2002. Stat Sci 20:1–31
116. Maynard CW (1961) An application of the reciprocity theorem to the acceleration of Monte Carlo calculations. Nucl Sci Eng 10:97–101
117. Spanier J (1970) An analytic approach to variance reduction. SIAM J Appl Math 18:172–190
118. Burn KW, Nava E (May 1997) Optimization of variance reduction parameters in Monte Carlo radiation transport calculations to a number of responses of interest. Proceedings of the international conference on nuclear data for science and technology. Italian Physical Society, Trieste, Italy
119. Burn KW, Gualdrini G, Nava E (2002) Variance reduction with multiple responses. In: Kling A, Barao F, Nakagawa M, Tavora L, Vaz P (eds) Advanced Monte Carlo for radiation physics, particle transport simulation and applications. Proceedings of the MC2000 conference. Lisbon, Portugal, 23–26 October 2000, pp 687–695
120. Hendricks J (1982) A code-generated Monte Carlo importance function. Trans Am Nucl Soc 41:307
121. Cooper MA, Larsen EW (2001) Automated weight windows for global Monte Carlo particle transport calculations. Nucl Sci Eng 137:1–13
122. Booth TE (1986) A Monte Carlo learning/biasing experiment with intelligent random numbers. Nucl Sci Eng 92:465–481
123. Booth TE (1988) The intelligent random number technique in MCNP. Nucl Sci Eng 100:248–254
124.
Booth T (1985) Exponential convergence for Monte Carlo particle transport. Trans Am Nucl Soc 50:267–268
125. Booth T (1997) Exponential convergence on a continuous Monte Carlo transport problem. Nucl Sci Eng 127:338–345
126. Kollman C (1993) Rare event simulation in radiation transport. Ph.D. dissertation, University of California, Berkeley, CA
127. Lai Y, Spanier J (2000) Adaptive importance sampling algorithms for transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, New York, pp 276–283
128. Hayakawa C, Spanier J (2000) Comparison of Monte Carlo algorithms for obtaining geometric convergence for model transport problems. In: Niederreiter H, Spanier J (eds) Monte Carlo and quasi-Monte Carlo methods 1998. Springer, New York, pp 214–226
129. Spanier J, Kong R (2004) A new adaptive method for geometric convergence. In: Niederreiter H (ed) Proceedings MCQMC 2002. 25–28 November 2002, Singapore, pp 439–449
130. Powell MJD, Swann J (1966) Weighted uniform sampling – a Monte Carlo technique for reducing variance. J Inst Math Appl 2:228–236
131. Spanier J (1979) A new family of estimators for transport problems. J Inst Math Appl 23:1–31
132. Booth TE (1990) A quasi-deterministic approximation of the Monte Carlo importance function. Nucl Sci Eng 104:374–384
133. Li L (1995) Quasi-Monte Carlo methods for transport equations. Ph.D. dissertation, The Claremont Graduate School, Claremont
134. Li L, Spanier J (1997) Approximation of transport equations by matrix equations and sequential sampling. Monte Carlo Meth Appl 3:171–198
135. Mosher S, Maucec M, Spanier J, Badruzzaman A, Chedester C, Evans M, Gadeken L Expected-value techniques for Monte Carlo modeling of well logging problems. In review
136. Amster HJ, Kuehn H, Spanier J (February 1960) Euripus-3 and Daedalus – Monte Carlo Density Codes for the IBM-704. Westinghouse Atomic Power Laboratory report WAPD-TM205
137.
Rief H (1984) Generalized Monte Carlo perturbation theory for correlated sampling and a second order Taylor series approach. Ann Nucl Energy 11:455–476
138. Rief H, Gelbard EM, Schaefer RW, Smith KS (1986) Review of Monte Carlo techniques for analyzing reactor perturbations. Nucl Sci Eng 92:289–297
139. Rief H (1996) Stochastic perturbation analysis applied to neutral particle transport. Adv Nucl Sci Tech 23:69–140
140. Hayakawa C, Spanier J, Bevilacqua F, Dunn AK, You JS, Tromberg BJ, Venugopalan V (2001) Perturbation Monte Carlo methods to solve inverse photon migration problems in heterogeneous tissues. Optics Lett 26(17):1335–1337
141. Hayakawa CK, Spanier J Perturbation Monte Carlo methods for the solution of inverse problems. In: Niederreiter H (ed) Proceedings MCQMC 2002. 25–28 November 2002, Singapore, Springer, pp 227–241 (to appear)
142. Goudsmit S, Saunderson JL (1940) Multiple scattering of electrons. Phys Rev 57:24–29
143. Lewis HW (1950) Multiple scattering in an infinite medium. Phys Rev 78:526–529
144. Berger MJ (1963) Monte Carlo calculations of the penetration and diffusion of fast charged particles. In: Alder B, Fernbach S, Rotenberg M (eds) Methods in computational physics, vol I. Academic, New York, pp 135–215
145. Larsen EW (1992) A theoretical derivation of the condensed history algorithm. Ann Nucl Energy 19(10–12):701–714
146. Fernandez-Varea JM, Mayol R, Baro J, Salvat F (1993) On the theory and simulation of multiple elastic scattering of electrons. Nucl Inst Meth Phys Res B73:447–473
147. Kawrakow I, Bielajew AF (1998) On the condensed history technique for electron transport. Nucl Inst Meth Phys Res B142:253–280
148. Wyman DR, Patterson MS, Wilson BC (1989) Similarity relation for anisotropic scattering in Monte Carlo simulations of deeply penetrating neutral particles. J Comp Phys 81:137–150
149. Bielajew AF, Salvat F (2000) Improved electron transport mechanics in the PENELOPE Monte Carlo model.
Nucl Inst Meth Phys Res B173:332–343
150. Tolar DR, Larsen EW (2001) A transport condensed history algorithm for electron Monte Carlo simulations. Nucl Sci Eng 139:47–65
151. Spanier J, Li L (1998) General sequential sampling techniques for Monte Carlo simulations: Part I – matrix problems. In: Niederreiter H, Hellekalek P, Larcher G, Zinterhof P (eds) Monte Carlo and quasi-Monte Carlo methods 1996. Springer Lecture Notes in Statistics #127, Springer, New York, pp 382–397
152. Amster HJ, Djomehri MJ (1976) Prediction of statistical error in Monte Carlo transport calculations. Nucl Sci Eng 60:131–142
153. Booth TE, Amster HJ (1978) Prediction of Monte Carlo errors by a theory generalized to treat track-length estimators. Nucl Sci Eng 65:273–281
154. Booth TE, Cashwell ED (1979) Analysis of error in Monte Carlo transport calculations. Nucl Sci Eng 71:128–142
155. Amster HJ (1971) Determining collision variances from adjoints. Nucl Sci Eng 43:114–116

Professor Jerome Spanier received a B.A. in mathematics and physics at the University of Minnesota in 1951 and an M.S. and Ph.D. in mathematics from the University of Chicago in 1952 and 1955, respectively. He spent the next 16 years at industrial research laboratories, first in suburban Pittsburgh at the Bettis Atomic Power Laboratory operated by Westinghouse, then at the North American Aviation (later Rockwell International) central research laboratory – the Science Center – in Thousand Oaks, California. In 1971, he moved to academia as Full Professor of Mathematics at the Claremont Graduate School (CGS) in Claremont, California. There he founded the Mathematics Clinic, a practicum course in which students and faculty solve real world problems, in 1973–1974. This course has played a central role in the applied mathematics curriculum at the Claremont Colleges since its inception.
In 1998, Spanier became Professor Emeritus in order to devote full time to research and he established a small research institute, the Claremont Research Institute of Applied Mathematical Sciences (CRIAMS), in Claremont. He devoted 8 of the intervening years to full-time administration at CGS: first, as Dean of Faculty in 1982 and subsequently as Vice President for Academic Affairs and Dean of the Graduate School until his return to the faculty in 1990. Spanier is the author of several books and numerous articles and technical reports, and he has spent much of his career contributing to both the theory and applications of Monte Carlo methods. He has applied these, and other numerical and analytical methods, to a variety of problems arising in chemistry, physics, and engineering. He now focuses much of his attention on medical applications at the University of California at Irvine, although he remains a Senior Fellow and Director of CRIAMS in Claremont.

Chapter 4
Reactor Core Methods
Robert Roy

4.1 Introduction

This chapter addresses the simulation flow chart that is currently used for reactor-physics simulations. The methodologies presented are more appropriate to the context of power reactors, and the chapter focuses particularly on the three-dimensional (3D) aspect of core calculations. Software design that is currently used to achieve accurate numerical simulations of reactor cores is also studied from a practical nuclear engineering point of view. The focus here is on processes and on what reactor physicists and nuclear engineers need in order to use modern-day software with confidence and reliability. Section 4.2 presents some early combinations of mathematics with a computational environment, with no intention of giving a global historical perspective. Section 4.3 is devoted to the concepts in lattice cell calculations by which most of the core databases are still constructed today.
Section 4.4 presents a selection of modern tools and methodologies that are now used for industrial applications. In Section 4.5, some applications of reactor core methods are given. Concluding remarks are drawn in Section 4.6.

R. Roy, Nuclear Engineering Institute, École Polytechnique de Montréal, P.O. Box 6079, Station Centre-Ville, Montréal (Québec) H3C 3A7, Canada; e-mail: robert roy@polymtl.ca
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_4

4.2 Analytic Methods and Early Calculation Schemes

A nuclear reactor must maintain a sustained neutron chain reaction. The ratio of the number of neutrons in any generation to the number of neutrons from the previous generation is called the multiplication factor. The first core calculation schemes tried to factor out the different important states that neutrons can reach during their lives. An idealized reactor core is a homogeneous infinite medium where neutrons of different energies interact with the medium. Criticality means equilibrium between neutrons produced and removed, that is, an infinite multiplication factor of unity. To take into account the neutron losses due to leakage out of a finite core, corrections were added. Early core calculations were based on the four- and six-factor formulas following the neutron life cycle:

$$k_{\mathrm{eff}} = (k_\infty)\, \Lambda_{\mathrm{ther}} \Lambda_{\mathrm{fast}} = (\varepsilon\, p\, f\, \eta_{\mathrm{ther}})\, \Lambda_{\mathrm{ther}} \Lambda_{\mathrm{fast}} \tag{4.1}$$

Of these six factors, two, $\eta_{\mathrm{ther}}$ and $\varepsilon$, essentially concern the fuel pins. The middle two, $p$ and $f$, strongly depend on the heterogeneity of the fuel assembly (coolant/moderator, structural materials, etc.). These first four factors are the following:

$\varepsilon$: The fast fission factor, or the total number of fission neutrons per thermal-fission neutron.
$p$: The probability that this neutron escapes resonance capture.
$f$: The thermal utilization factor, or the probability that this neutron is captured in fuel.
$\eta_{\mathrm{ther}}$: The thermal reproduction factor, or the average number of fission neutrons produced per absorption in fuel.

The last two factors depend on the overall core geometry:

$\Lambda_{\mathrm{ther}}$: The thermal nonleakage probability.
$\Lambda_{\mathrm{fast}}$: The fast nonleakage probability.

Another way to express the criticality is to use a formula such as

$$k_{\mathrm{eff}} = \frac{k_\infty}{1 + M^2 B_{\mathrm{eff}}^2} \tag{4.2}$$

where the new variables are the following:

$M^2$: The so-called migration area.
$B_{\mathrm{eff}}^2$: The critical buckling value.

Early calculation schemes dealt mostly with such formulas to describe the neutron multiplication inside physical systems. Homogeneous bare and lumped geometric models were analytically deduced from the transport or diffusion equation using two energy groups in order to get accurate slide-rule results. To carry more significant figures, the use of desk-type calculating machines was considered a great advance. However, such an automated process requires the validation of every significant result, as errors in the digital input could mean a bad tabulated value. It was very soon realized that power reactors could benefit from having a reflector zone surrounding the lumped fuel arrangement. To approximate how the reactor core could be reduced from an equivalent bare configuration, calculations of the reflector saving involved spatial coupling of different materials, and matrices were needed to interconnect the fuel and reflector regions together.

In his review on the development of nuclear reactor theory in the Montreal laboratory, Prof. M.M.R. Williams provides some letters to show the kind of personnel problems that such a laboratory can have [1]. Here is a memorandum to George Placzek in which Bengt Carlson is under stress because many projects are ready for computation . . . without a computer!

Memorandum to – Dr. Placzek
Concerning: the Computing Situation

At present, there is a great amount of accumulated work in the computing section.
In the following, a list is given of the projects ready for computation and the estimated time necessary for their completion with the present staff and equipment of the computing section (see Table 4.1). We have thus about 12 weeks of accumulated work. Improvement of this situation is impossible with the present staff, rather we expect it to become worse. We have also been advised of the desirability of an acceleration of output. Since the current rate of work thus is considered insufficient and since it is besides a result of a somewhat forced tempo, more manpower and calculating machines are necessary. The most effective remedy, in my opinion, would be to hire two additional men, one with a college education including mathematics courses, and one with an excellent high school record to be trained here; and at the same time acquire two Marchant calculators, model ACRM. If we have to operate with an increased staff before the new calculators arrive, we might try to arrange a double shift, but such a step with the present type of work, which requires frequent decisions on my part and hence my presence, is at best an emergency solution. No computing project with which I have ever had contact left the staff without supervision during half of the day. That would be too much of a menace to accuracy and efficiency over an extended period.

Bengt Carlson
BC:VL
Montreal, October 13, 1943

Can you imagine doing those kinds of projects using calculators like the one in Fig. 4.1? In early calculations, tabulations were always used for Bickley–Naylor functions, exponential integral functions, and all the others. No computer was needed to obtain crude solutions. However, the analytic work to reduce the problems into mathematics tractable by hand was tremendous. For example, consider Dr. Boris Davison and Jeanne Lecaine in their Montreal laboratory as they tried to estimate the linear extrapolation length using several approximations.
Their working process was to embed several approximations together into analytical formulations with the aim of obtaining observable results. Variational principles were then very important because the manual numerical results were easier to obtain.

Table 4.1 Projects ready for computation in the Montreal laboratory

| Project | Estimated time |
| Density distribution in systems with multiplication factor near 1 and nonmultiplying reflector (Dr. Volkoff) | 2 weeks |
| Integral equation for absorption in an aluminum slab (Dr. Wallace) | 2 weeks |
| Solution of transcendental equations and calculations of residues in connection with slowing-down problems (Dr. Marshak) | 2 weeks |
| Slowing-down length in water and related problems (Dr. Marshak) | 3 weeks |
| Albedo problems (Dr. Adler) | 1 week |
| Improved numerical solution of Milne equation (Dr. Mark) | 1 week |
| Miscellaneous | 1 week |
| Total | 12 weeks |

Fig. 4.1 Picture of a Marchant 8D calculator (The Marchant model ACRM has two more digits!)

Today's formulation of the linear extrapolation length as

\[ l_{ext} = \sup_{\rho \in L^2} \left\{ \frac{3}{8} + \frac{3}{2}\, \frac{\left[ \int_0^\infty dx\, \rho(x) E_3(x) \right]^2}{\int_0^\infty dx\, \rho(x) \left[ 2\rho(x) - \int_0^\infty dx'\, E_1(|x' - x|)\, \rho(x') \right]} \right\} \tag{4.3} \]

is quite easy to process using a mathematical scripting language to obtain accurate values of $l_{ext} \approx 0.7104$. The quality of the analytic methods used in the 1940s and 1950s can certainly be observed in Dr. Davison's book [2].

In Report LA-756 (also available in [1]), Bengt Carlson gives a list of functions that are needed to perform Serber calculations: exponential, circular hyperbolic, exponential integral, and exponential integral for complex arguments. In Report LA-2595, Clarence E. Lee states that [3]:

One of the first numerical experiments performed by Carlson after a computer was available was an attempt to perform Serber calculations by numerical integration and a semianalytical approach.
From the analysis of those experiments and recognition of the intrinsic directional derivative treatment, the original $S_n$ difference method was born.

Nowadays, the range of computing resources spans many scales: from hyperthreading and multicore processors up to clusters and grid computing. But we still rely on the coherency of physical approximations in order to tackle reactor core problems . . . since the number of unknowns is too large to have full-core “ready for computation” models. Even though we now have sophisticated tools for reactor simulations, there is still “too much of a menace to accuracy and efficiency over an extended period” if we use these tools without being aware of their limits.

Note: The material of this section is mainly extracted from the master work of Prof. Williams in analyzing a collection of papers on the development of the so-called Montreal theory involving many of the greatest nuclear reactor physicists. The interested reader should find in this work [1] valuable historical landmarks for much of early reactor theory.

4.3 Lattice Cell and Assembly Codes

A reactor core is made of several materials. The evaluation of the neutron flux distribution inside all regions of a reactor core still demands considerable computer resources. A direct transport solution for the entire core is not affordable for day-to-day use. One must proceed through various stages of calculations, connected to one another, in order to obtain reliable data at the core level. In an attempt to clarify these stages, let us describe the equation governing the core behavior. The neutron balance in a reactor can be expressed by the time-dependent transport equation [4]:

\[ \frac{1}{v} \frac{\partial \Phi(r, \Omega, E, t)}{\partial t} + L_{static}\, \Phi(r, \Omega, E, t) = \hat{Q}(r, \Omega, E, t) \tag{4.4} \]

where $v$, $\Phi$, and $\hat{Q}$ are, respectively, the speed, the neutron angular flux, and the source (including the scattering and the fission terms), and $L_{static}$ is a general static transport operator:

\[ L_{static}\, \Phi = \Omega \cdot \nabla \Phi(r, \Omega, E, t) + \Sigma_t(r, E, t)\, \Phi(r, \Omega, E, t) \tag{4.5} \]

Macroscopic cross sections, e.g., the macroscopic total cross section $\Sigma_t$ in Eq. 4.5, represent probabilities per unit path length of neutron interaction in the physical medium. In reactors at power, many types of nuclei can compose the medium, and the microscopic cross sections are weighted by the nuclide concentrations (number of nuclei per unit volume) as

\[ \Sigma_t(r, E, t) = \sum_i N_i(r, t)\, \sigma_{t,i}(E) \tag{4.6} \]

to give the local probability per unit path length of a neutron collision. Neutron collisions expressed by the total microscopic cross section $\sigma_{t,i}$ include different interactions: elastic and inelastic scattering, radiative capture, fission, etc. Some of these interactions exhibit resonances. A resonance self-shielding calculation is required in order to take the resonant behavior into account. Other interactions will affect the concentration $N_i(r, t)$. New nuclei will be formed (e.g., fission products), and others will disappear (e.g., burnable absorbers). The depletion (or burnup) equations can be linearized into

\[ \frac{dN_i}{dt} = \sum_j \left\{ \sigma_{i \leftarrow j}\, \Phi + \lambda_{i \leftarrow j} \right\} N_j \tag{4.7} \]

where $\sigma_{i \leftarrow j}$ and $\lambda_{i \leftarrow j}$ represent the production rate after fission or capture and the radioactive decay constant, respectively. At any given space point, the concentrations of all the nuclides will affect the neutron flux in a complicated manner. From the point of view of neutron kinetics, the production and elimination of some fission products can have a considerable influence on reactor operation (fuel poisoning products, delayed-neutron precursors, etc.).
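As an illustration of Eqs. 4.6 and 4.7, the following minimal sketch sums microscopic cross sections into a macroscopic one and integrates a linearized depletion system with a classical Runge-Kutta step, holding the flux constant over each step. The two-nuclide chain (A captures a neutron and becomes B, which then decays) and all numerical values are hypothetical and chosen only for illustration; they do not come from the chapter.

```python
import math

def macroscopic_xs(densities, micro_xs):
    """Eq. 4.6 at one energy point: Sigma = sum_i N_i * sigma_i."""
    return sum(n * s for n, s in zip(densities, micro_xs))

def depletion_rhs(n, flux, sigma, lam):
    """Right-hand side of Eq. 4.7: sum_j (sigma[i][j]*flux + lam[i][j]) * N_j."""
    size = len(n)
    return [sum((sigma[i][j] * flux + lam[i][j]) * n[j] for j in range(size))
            for i in range(size)]

def rk4_step(n, dt, flux, sigma, lam):
    """One classical Runge-Kutta step with the flux frozen over the step."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    k1 = depletion_rhs(n, flux, sigma, lam)
    k2 = depletion_rhs(shifted(n, k1, 0.5 * dt), flux, sigma, lam)
    k3 = depletion_rhs(shifted(n, k2, 0.5 * dt), flux, sigma, lam)
    k4 = depletion_rhs(shifted(n, k3, dt), flux, sigma, lam)
    return [ni + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for ni, a, b, c, d in zip(n, k1, k2, k3, k4)]

# Hypothetical chain: A --(capture)--> B --(decay)--> removed from the system.
SIGMA_A = 1.0e-22   # cm^2, illustrative capture cross section of A
LAMBDA_B = 2.0e-5   # 1/s, illustrative decay constant of B
FLUX = 1.0e14       # n/cm^2/s, assumed constant over every time step
sigma = [[-SIGMA_A, 0.0], [SIGMA_A, 0.0]]   # sigma[i][j]: production of i from j
lam = [[0.0, 0.0], [0.0, -LAMBDA_B]]        # lam[i][j]: decay of j into i

def deplete(n0, t_end, steps):
    """Advance the concentrations from 0 to t_end in equal steps."""
    n, dt = list(n0), t_end / steps
    for _ in range(steps):
        n = rk4_step(n, dt, FLUX, sigma, lam)
    return n

n_final = deplete([1.0, 0.0], 1.0e6, 1000)  # normalized concentrations [A, B]
```

For this simple chain the result can be checked against the analytic Bateman solution, which is a convenient sanity test before moving to a full depletion chain.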
In lattice cell calculations, it is generally assumed that the flux (or the power) remains constant over the span of a time step to allow for the solution of the depletion equation by standard numerical integration (Runge-Kutta, Kaps-Rentrop, etc.). In Fig. 4.2, a UML (Unified Modeling Language) class diagram describes the various objects that will be needed throughout the core follow-up:

BurnupHistory has a collection of depleted states that are ordered in time; each of these states is a snapshot of Eq. 4.7.
DepletionState is an association class that relates the microscopic to the macroscopic cross sections by Eq. 4.6.
MicroLib is a class that is composed of one depletion chain, as many isotopes as are needed to represent the problem, and particular self-shielding data.
MacroLib is a class used to represent macroscopic cross sections; the neutron travels through media that have these properties. For deterministic core solvers, data is generally ordered into energy groups.

Fig. 4.2 Class diagram for depletion calculations (classes shown: BurnupHistory, DepletionState, MicroLib, MacroLib, SelfShieldingOpt, Isotope, ChainOfDepletion, GroupData)

The linear approach presented here is not, strictly speaking, physically sound for nuclear engineering applications, because there are feedback loops in this calculation flow chart. The feedback effects for a power reactor are certainly important. For example, the power distribution and the different media temperatures interact, resulting in Doppler broadening of microscopic cross sections. In fact, a whole set of nonlinear effects may be introduced in order to keep a consistent physical model. However, these effects will not be considered in this chapter.

4.3.1 Lattice Physics Calculations

A lattice code is primarily used to compute the neutron flux distribution and the infinite multiplication factor. In most current power reactors, the solid fuel pins and the liquid (or gaseous) coolant are located so that the nuclear reaction and its thermal-hydraulic effects can be predicted and controlled. The relative geometric arrangement of fissile, absorbent, and coolant materials usually follows regular patterns inside the core. These patterns are referred to as lattices. Such lattices are composed of cells. At the cell level, the local flux may vary strongly and resonance effects are important. Broader interaction between cells, for example, in fuel assemblies, can be taken into account if the interface currents are known.

In the 1970s, the calculation flow chart involving reactor core simulations was generally the following:

1. Perform pin-cell calculations using a fine-group (a few thousand groups) cross-section library, condense into fewer (a few hundred) groups, and homogenize over the pin-cell's geometry.
2. Perform assembly calculations using homogenized pin-cell cross-section data from step 1, condense into few-group reactor data, and homogenize over the assembly's geometry.
3. Perform core calculations using the homogenized assemblies' few-group data from step 2.

With today's computer resources, the first two steps are generally performed together, and the neutron coupling between the pin cells can be evaluated using the transport equation. The unit cell used for these basic transport calculations depends on the kind of reactor: for light-water reactors, the cell can be a particular PWR fuel assembly with various enriched pins including gaps, or a BWR assembly along with cruciform rods; for heavy-water reactors, it can be a cluster of CANDU-like fuel elements surrounded by heavy-water moderator. Although these unit cell calculations do not provide the flux distribution inside the whole reactor, they are still important for design and follow-up of core behavior.

Fig. 4.3 Waterfall model for reactor core analysis (phases shown: :Experiments, :NuclearDataFile, :LibraryXS, :LatticeCalculation, :ReactorCalculation)
The reactor physicist progresses from one phase to the next according to Fig. 4.3, where problems with the physical approximations in one phase demand interventions back in the previous phase. Reactor methods encompass the last two phases, :LatticeCalculation and :ReactorCalculation, with knowledge of :LibraryXS to perform core analysis. Nuclear power plant engineers rely on these phases for operation and safety issues.

The :LibraryXS phase requires building an isotopic cross-section library from evaluated data files. In North America and Europe, the Evaluated Nuclear Data File (ENDF) format is generally used as the source of cross sections to process nuclear data relative to neutron and photon reactions. Using selected evaluations such as, for example, ENDF/B-V, -VI, and -VII, the NJOY code can produce a consistent set of pointwise or multigroup microscopic cross sections, covering both the resolved and unresolved resonance energy domains [5]. Multigroup cross-section libraries are generated using specific modules offered by lattice codes to recover the most interesting features for core analysis. In Section 4.3.1.1, we briefly discuss how to generate these cross-section libraries.

Fig. 4.4 Typical processing sequence in NJOY (elements shown: ENDF/B :NuclearDataFile, RECONR, BROADR, UNRESR, THERMR, PENDF :PointWiseXS, GROUPR, GENDF :GroupAvgXS, DRAGR, DRAGLIB :LibraryXS, with an [add isotope] loop)

4.3.1.1 Producing Cross-section Libraries

Figure 4.4 shows a typical sequence used for producing cross sections per isotope. The dashed lines denote objects that serve as input or output for the various modules. Here, only short descriptions of the NJOY modules are provided:

RECONR is used for cross-section reconstruction (in fact, it reads the ENDF file and reconstructs resonances to prepare pointwise ENDF data).
BROADR accounts for Doppler broadening of resonances.
UNRESR is used for unresolved resonance data processing (for main resonant isotopes, the PURR module can also be used).
THERMR treats the thermal scattering law.
GROUPR is used for group-averaged data processing.

Code-specific modules are then used for formatting multigroup cross sections. The example shown involves the posttreatment module DRAGR.

Once all necessary isotopes have been processed and cross-section libraries are available, reactor core simulations can be done. In Fig. 4.5, reactor analysis states are shown in an activity diagram. It can be seen that the :LibraryXS phase is essential for carrying out proper analysis. Note that the lattice and core calculations appear as parallel activities where the nuclear engineer takes into account simulation results at both levels: the fine-flux lattice and the global flux distribution in the design of a reactor core. In Fig. 4.5, the standard flow of objects needed for or generated during reactor core analysis is also shown; this includes

The :MeshGeom object representing cell and core geometry
The :UpdateCoreDB object containing nuclear properties after lattice calculation
The :PowerDist object representing the power distribution in the core.

4.3.1.2 Self-Shielding and Multigroup Approximation

In reactor physics, it is quite common to arrange energy groups in reverse order, so that the fastest energy group appears first and the most thermal one appears last, and to use lethargies instead of energies to describe neutron slowing down. The lethargy, $u$, of a neutron of energy $E$ is defined by $u = \ln(E_0 / E)$ where $E_0$ is some maximum energy, commonly taken as 10 MeV [4]. It is generally assumed that the energy range is divided into $G$ energy groups and that group $g$ spans the energies from $E_g$ to $E_{g-1}$, corresponding to the lethargy interval $U_g = [u_{g-1}, u_g]$. Suppose that there exists a typical energy-dependent spectral weighting function $\varphi(u)$ in group $g$ whose integration over $U_g$ is 1.
The microscopic group-averaged cross sections for reaction $x$ will generally be group-condensed using

\[ \sigma_x^g = \int_{u_{g-1}}^{u_g} du\, \varphi(u)\, \sigma_x(u) \tag{4.8} \]

Fig. 4.5 Activity diagram for reactor core analysis (activities shown: get cross sections, define geometry, validate data, perform lattice calculation, perform core calculation, compare with observed core data; objects shown: :LibraryXS, :MeshGeom, :PowerDist, :CoreBehaviourData, :UpdateCoreDB, :AcceptanceCrit)

This equation makes physical sense when the flux shape in energy is quite regular. An important feature to take into account in core analysis is the fact that cross sections for typical heavy nuclides exhibit resonances. The flux drops substantially in the resonance energy range, and it is no longer easy to assume an energy spectrum $\varphi(u)$. In that case, Eq. 4.8 is no longer valid. Self-shielding calculations must be done to recover effective microscopic cross sections for a resonant reaction $x$:

\[ \sigma_{x,eff}^g(\sigma_0) = \frac{\displaystyle \int_{u_{g-1}}^{u_g} du\, \varphi(u)\, \frac{\sigma_x(u)}{\sigma_t(u) + \sigma_0}}{\displaystyle \int_{u_{g-1}}^{u_g} du\, \varphi(u)\, \frac{1}{\sigma_t(u) + \sigma_0}} \tag{4.9} \]

where $\sigma_0$ is the total cross section outside the resonances, usually called the background cross section, and $\varphi(u)$ is the fine energy spectrum. In the subgroup method, the energy-spectrum function in group $g$ is approximated and probability tables on a fine energy group structure are used to tabulate the resonant cross sections:

\[ \sigma_{x,l}^g = \frac{\displaystyle \sum_{s \in g} \varphi_s\, \frac{\sigma_{x,s}}{\sigma_{t,s} + \sigma_{0,l}}}{\displaystyle \sum_{s \in g} \varphi_s\, \frac{1}{\sigma_{t,s} + \sigma_{0,l}}} \tag{4.10} \]

for various values of the background $\sigma_{0,l}$. Self-shielding effects are treated by specific modules in most lattice cell codes.

Once an appropriate library of self-shielded microscopic cross sections is available, lattice calculations are done in the multigroup framework. The time-dependent angular flux has to be solved from the equations

\[ \frac{1}{v^g} \frac{\partial \Phi(r, \Omega, t)}{\partial t} + L^g_{static}\, \Phi(r, \Omega, t) = \hat{Q}^g(r, \Omega, t) \tag{4.11} \]

with group-dependent solutions $\Phi = \Phi^g(r, \Omega, t)$ at various time steps representative of the neutron life cycle.
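Returning to Eqs. 4.8 and 4.9, the effect of the background cross section can be sketched numerically. The example below is only an illustration under stated assumptions: a single hypothetical Lorentzian-shaped resonance on a unit lethargy interval, a flat spectrum $\varphi(u)$, a constant potential contribution added to form $\sigma_t$, and trapezoidal quadrature. None of these values or function names come from the chapter.

```python
def trapz(y, x):
    """Trapezoidal rule for samples y on the mesh x."""
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k])
               for k in range(len(x) - 1))

def group_average(u, phi, sigma_x):
    """Eq. 4.8, with phi renormalized to unit integral over the group."""
    return trapz([p * s for p, s in zip(phi, sigma_x)], u) / trapz(phi, u)

def self_shielded(u, phi, sigma_x, sigma_t, sigma_0):
    """Eq. 4.9: average weighted by phi(u) / (sigma_t(u) + sigma_0)."""
    w = [p / (st + sigma_0) for p, st in zip(phi, sigma_t)]
    return trapz([wi * sx for wi, sx in zip(w, sigma_x)], u) / trapz(w, u)

# Hypothetical group: lethargy from 0 to 1 with one resonance centered at u = 0.5.
N_PTS = 2001
u = [k / (N_PTS - 1) for k in range(N_PTS)]
phi = [1.0] * N_PTS                                   # flat spectrum (assumption)
sigma_x = [1.0 + 1000.0 / (1.0 + ((uk - 0.5) / 0.01) ** 2) for uk in u]
sigma_t = [sx + 10.0 for sx in sigma_x]               # plus a constant potential part

infinite_dilution = group_average(u, phi, sigma_x)
shielded = self_shielded(u, phi, sigma_x, sigma_t, 50.0)
```

With a small background $\sigma_0$ the weight $1/(\sigma_t + \sigma_0)$ is depressed inside the resonance, so `shielded` falls below `infinite_dilution`; as $\sigma_0$ grows very large, Eq. 4.9 tends back to Eq. 4.8.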
The numerical integration of the term $\partial \Phi / \partial t$ can be done using various schemes (explicit, implicit, $\theta$-method, etc.). Assuming, for example, that the power (i.e., burnup) history of the lattice is known, lattice calculations are done to obtain steady-state solutions to transport problems over the full energy range. In each energy group, the problem is to compute the value of $\Phi^g$ satisfying

\[ L^g_{static}\, \Phi = \Omega \cdot \nabla \Phi^g(r, \Omega) + \Sigma^g_t(r)\, \Phi^g(r, \Omega) = Q^g(r, \Omega) \tag{4.12} \]

with appropriate group-dependent sources $Q^g$ and boundary conditions pertinent to the lattice. Finally, in some reactor dynamics studies, an exponential time-decay period is often sought so that the separation $\Phi(r, \Omega, t) \Leftarrow e^{\alpha t}\, \Phi(r, \Omega, \alpha)$ remains valid for all groups in Eq. 4.11, also resulting in a transformation of the sources as in Eq. 4.12.

4.3.1.3 Generic Multigroup Solver

The multigroup treatment has condensed the transport operator to $G$ energy groups. This operator now acts on the angular flux with two separate parts: the usual streaming part and the collision part, which depends on a group-dependent cross section $\Sigma^g_t(r)$:

\[ L^g_{static}\, \Phi \equiv \left[ L_{static} \right]_{\Sigma^g_t(r)} \Phi = \Omega \cdot \nabla \Phi(r, \Omega) + \Sigma^g_t(r)\, \Phi(r, \Omega) \tag{4.13} \]

If the group-dependent source $Q^g(r, \Omega)$ is known, the methods presented in the above sections can be used to solve the transport problem independently in each group, as in Eq. 4.12, to obtain the flux map. However, the source term includes scattering from other groups as well as fixed sources or fission effects, which couple the set of $G$ transport equations together:

\[ Q^g(r, \Omega) = S^g \Phi + F^g(r, \Omega) = \sum_{g'} \int_{4\pi} d^2\Omega'\, \Sigma_s^{g \leftarrow g'}(r, \Omega \cdot \Omega')\, \Phi^{g'}(r, \Omega') + F^g(r, \Omega) \tag{4.14} \]

Scattering Effects

There are three separate components comprising the scattering effect: the up-scattering $g < g'$, where the neutron scatters from lower to higher energy; the self-scattering $g = g'$, where the neutron stays in the same energy group; and the down-scattering $g > g'$, where the neutron is slowing down.
When the source $F^g(r, \Omega)$ is fixed, the standard iterative method of solution assumes an initial flux guess $\Phi^{g(0)}(r, \Omega)$ and relies on the convergence of fixed-point iterations of the type

\[ L^g_{static}\, \Phi^{g(m+1)} = S^g \Phi^{g(m)} + F^g(r, \Omega) = Q^{g(m)}(r, \Omega) \tag{4.15} \]

where the source $Q^{g(m)}(r, \Omega)$ is updated in each iteration using the flux $\Phi^{g(m)}(r, \Omega)$. Because there is generally no up-scattering of neutrons in the fast groups, the groups are processed from the fastest to the most thermal (i.e., $g = 1, \ldots, G$). In most solvers, the down-scattering sources are updated using the new flux just obtained from the previous groups. In some solvers, it is also possible to use an implicit form for the self-scattering source, that is, to compute the source using the on-the-fly flux value in the same group; in other solvers, this could lead to an iterative process on its own. Finally, the up-scattering sources are updated for the next iteration. This method of solution, which is basically of Gauss–Seidel type in energy groups, can be sped up using various acceleration schemes.

Fission Effects

The fission source (generally assumed to be isotropic) can also depend on the flux. In conventional lattice calculations where the fission spectrum does not depend on the incident neutron energy, the fission source takes the form

\[ \frac{1}{K} P^g \Phi = \frac{\chi^g(r)}{K} \sum_{g'} \nu\Sigma_f^{g'}(r) \int_{4\pi} d^2\Omega'\, \Phi^{g'}(r, \Omega') \tag{4.16} \]

where $K$ is an adjustable parameter to achieve criticality. Lattice codes are often interested in computing only the largest possible eigenvalue for $K$, with the associated eigenvector being the neutron flux. To find this critical value, the inverse power method is often used:

\[ \left\{ L^g_{static} - S^g \right\} \Phi^{(n+1)} = \frac{1}{k^{(n)}} P^g \Phi^{(n)} \tag{4.17} \]

where $\Phi^{(n)}$ and $k^{(n)}$ are approximations obtained for the multigroup flux map and the eigenvalue at iteration $n$, respectively. Because Eq. 4.16 implies a sum over all groups, an outer iteration (that is, a complete loop where all the energy groups are solved) is generally necessary to compute the new fission source. These remarks have led most developers to use a two-level solver strategy of the Gauss–Seidel type.

In the following sections, we first review the steady-state transport solutions that are generally used in order to recover few-group nuclear properties for core calculations. The basic calculational unit is a spatial convex domain $D$ split into $I$ homogeneous disjoint regions, each having volume $V_i$, surrounded by an external boundary $\partial D$ that can be split into several surfaces:

\[ D = \bigcup_i V_i, \qquad \partial D = \bigcup_\alpha S_\alpha \tag{4.18} \]

For the sake of simplicity, we omit the energy dependence (or the group index) of the variables in the following sections to concentrate on the angular and space variables and to see how we can obtain solutions to a single-group transport problem, as in Eq. 4.12.

4.3.1.4 Discrete Ordinates

In the discrete ordinates method, the neutron transport equation is solved for a number of discrete directions $\Omega_n$. The unit sphere is split into areas with weights $w_n = \Delta^2\Omega_n / 4\pi$, and the discrete solid angles represent the directions where transport solutions will be obtained. The scalar flux is approximated as follows:

\[ \int_{4\pi} d^2\Omega\, \Phi(r, \Omega) \simeq \Phi(r) = \sum_n w_n \Phi_n(r) \tag{4.19} \]

where $\Phi_n$ is the solution for the discrete ordinate $\Omega_n$. For this fixed direction, the steady-state transport equation is integrated over the volume:

\[ \int_D d^3r\, \nabla \cdot \left( \Omega_n \Phi(r, \Omega_n) \right) = \int_D d^3r \left[ Q(r, \Omega_n) - \Sigma_t(r)\, \Phi(r, \Omega_n) \right] \tag{4.20} \]

The divergence theorem is then applied to the left-hand term, leading to an equation representing the exact balance for neutrons that stream in the volume:

\[ \oint_{\partial D} d^2r_b\, \Omega_n \cdot N_+\, \Phi(r_b, \Omega_n) = \int_D d^3r \left[ Q(r, \Omega_n) - \Sigma_t(r)\, \Phi(r, \Omega_n) \right] \tag{4.21} \]

where $N_+$ is the unit outward normal and $r_b \in \partial D$.
The left-hand side is the flow across the boundary and the right-hand side is the difference between neutron sources, including secondary production, and losses from collisions inside the domain.

To illustrate the basics of discrete ordinates solvers, let us choose a 3D domain split into small homogeneous cells $[x_i^-, x_i^+] \times [y_j^-, y_j^+] \times [z_k^-, z_k^+]$ with centers at $r_{ijk} = (x_i, y_j, z_k)$, having volumes $V_{ijk} = \Delta x_i\, \Delta y_j\, \Delta z_k$ and with constant cross section $\Sigma_{ijk} = \Sigma_t(x_i, y_j, z_k)$. Select a direction of flight $\Omega_n = (\mu_n, \eta_n, \xi_n)$ in the principal octant $(\mu_n > 0, \eta_n > 0, \xi_n > 0)$, and assume that the cell-centered flux is constant inside the volume and that the surface (or cell-edged) fluxes are also constant over each of the cell's six faces. The balance Eq. 4.21 can be approximated by

\[ \frac{\mu_n}{\Delta x_i} \Delta_x \Phi_n + \frac{\eta_n}{\Delta y_j} \Delta_y \Phi_n + \frac{\xi_n}{\Delta z_k} \Delta_z \Phi_n = Q_{ijk} - \Sigma_{ijk}\, \Phi_{n,ijk} \tag{4.22} \]

where constant differences are taken for each coordinate:

\[ \begin{cases} \Delta_x \Phi_n = \Phi_n(x_i^+, \cdot, \cdot) - \Phi_n(x_i^-, \cdot, \cdot) \\ \Delta_y \Phi_n = \Phi_n(\cdot, y_j^+, \cdot) - \Phi_n(\cdot, y_j^-, \cdot) \\ \Delta_z \Phi_n = \Phi_n(\cdot, \cdot, z_k^+) - \Phi_n(\cdot, \cdot, z_k^-) \end{cases} \tag{4.23} \]

A 3D Cartesian solver will proceed as follows: assuming that the incoming surface fluxes (associated with the $-$ sign) are already known for neutrons entering the region, we march through $V_{ijk}$ following the neutron's direction of motion, $\Omega_n$, to determine the local angular flux value $\Phi_{n,ijk}$ inside the volume and the outgoing surface fluxes (associated with the $+$ sign) for neutrons going out of the volume. We still need some additional relationships to close the system, because there are four unknowns for each region $V_{ijk}$: $\Phi_{n,ijk}$ and the three outgoing fluxes. In the classical diamond difference scheme, we crudely assume that

\[ \Phi_{n,ijk} = \frac{\Phi_n(x_i^+, \cdot, \cdot) + \Phi_n(x_i^-, \cdot, \cdot)}{2} = \frac{\Phi_n(\cdot, y_j^+, \cdot) + \Phi_n(\cdot, y_j^-, \cdot)}{2} = \frac{\Phi_n(\cdot, \cdot, z_k^+) + \Phi_n(\cdot, \cdot, z_k^-)}{2} \tag{4.24} \]

but one may also use other difference schemes (step or weighted-diamond relationship; see Chapter 1).
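Combining Eqs. 4.22 and 4.24 gives a closed-form update for each cell, which can be sketched as follows for one direction in the principal octant. This is a minimal illustration only, assuming a uniform grid of cubic cells, a single material, a flat source, and vacuum inflow; the function names and all numbers are hypothetical.

```python
import math

def dd_cell(mu, eta, xi, dx, dy, dz, sigma_t, q, in_x, in_y, in_z):
    """Diamond-difference solve of one cell.

    Substituting out = 2*center - in (Eq. 4.24) into the balance (Eq. 4.22)
    gives the cell-centered flux; the three outgoing face fluxes follow.
    """
    cx, cy, cz = 2.0 * mu / dx, 2.0 * eta / dy, 2.0 * xi / dz
    center = (q + cx * in_x + cy * in_y + cz * in_z) / (sigma_t + cx + cy + cz)
    return center, 2.0 * center - in_x, 2.0 * center - in_y, 2.0 * center - in_z

def sweep_octant(mu, eta, xi, nx, ny, nz, h, sigma_t, q):
    """Sweep an nx*ny*nz grid of cubic cells (side h) for increasing i, j, k."""
    face_x = [[0.0] * nz for _ in range(ny)]   # flux crossing x faces, [j][k]
    face_y = [[0.0] * nz for _ in range(nx)]   # flux crossing y faces, [i][k]
    face_z = [[0.0] * ny for _ in range(nx)]   # flux crossing z faces, [i][j]
    flux = {}
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                center, out_x, out_y, out_z = dd_cell(
                    mu, eta, xi, h, h, h, sigma_t, q,
                    face_x[j][k], face_y[i][k], face_z[i][j])
                flux[i, j, k] = center
                # Outgoing fluxes become the inflow of the downstream cells.
                face_x[j][k], face_y[i][k], face_z[i][j] = out_x, out_y, out_z
    # After the sweep, the face arrays hold the fluxes leaving the domain.
    return flux, face_x, face_y, face_z

# Hypothetical direction in the principal octant (unit vector).
MU, ETA = 0.5, 0.6
XI = math.sqrt(1.0 - MU ** 2 - ETA ** 2)
flux, out_x, out_y, out_z = sweep_octant(MU, ETA, XI, 4, 3, 2, 0.5, 1.0, 2.0)
```

Because the diamond relations are substituted into the exact cell balance, the sweep conserves neutrons: leakage through the outgoing faces plus collisions equals the total source. The scheme does not guarantee positive fluxes, which is one motivation for the step and weighted-diamond alternatives mentioned above.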
If the domain is composed of several Cartesian cells, each angle sweep is done starting from the domain external boundary in the direction of neutron motion. In the case of the principal octant and using natural 3D cell numbering, the spatial calculation proceeds in each cell for increasing values of $i$, $j$, and $k$. The flux values propagate as a wave front in the 3D domain, much like a step-by-step Euler solver for ordinary differential equations. In curvilinear coordinate systems (spherical, cylindrical, etc.), the method can still be applied using conservative difference schemes. Newer schemes have been developed using finite element treatment of the spatial variable.

Typical discrete ordinates quadrature sets (also called $S_n$ quadratures) are chosen to preserve the maximum number of angular moments. The order of the Legendre expansion of the scattering kernel, as defined in Section 4.3.1, implies the use of a minimal number of angles to compute the angle-dependent sources.

4.3.1.5 Method of Characteristics

In the method of characteristics (MOC), the differential form of the Boltzmann equation is used [6]. This differential equation is solved by integrating along the characteristics of the differential operator, which correspond to the tracking lines. First, the streaming operator $\Omega \cdot \nabla$ in Eq. 4.12 is expressed as a derivative along the neutron's direction of motion, $\Omega$. This leads to the following equation:

\[ \frac{d}{ds} \Phi(r + s\Omega, \Omega) + \Sigma_t(r + s\Omega)\, \Phi(r + s\Omega, \Omega) = Q(r + s\Omega, \Omega) \tag{4.25} \]

Assume a starting point $r = r_0$ (e.g., located on the external boundary $\partial D$); $s$ is then the distance measured from this point along the characteristic line, extended in the neutron streaming direction.
For one line segment of length $L$ and constant properties, we may integrate this last equation along the line and obtain:

\[ \Phi_+ = \Phi_-\, e^{-\Sigma_t L} + \frac{Q}{\Sigma_t} \left( 1 - e^{-\Sigma_t L} \right) \tag{4.26} \]

where $\Phi_- = \Phi(r_0, \Omega)$ is the inward value of the angular flux at the beginning of the line segment and $\Phi_+ = \Phi(r_0 + L\Omega, \Omega)$ is the outward value at the line's ending point, and where $Q$ is approximated by a constant value. A single characteristic line is an ordered collection of such line segments $L_k$ crossing different region numbers $N_k$. The outward flux of one segment also serves as inward flux for the next segment. If the inward flux for the first segment of a characteristic line is known, a recursive segment-by-segment calculation is done to compute the outward flux and the segment-averaged flux $\bar{\Phi}_k$ according to the properties of the region $N_k$ traversed by segment $k$.

Characteristics multigroup solvers have been used since the 1970s to obtain accurate solutions for two-dimensional (2D) lattice cell problems. Nowadays, with greater computing resources, extensions of this method to 3D solvers are already in the development and benchmarking stages. The main advantage of MOC compared to many other deterministic transport methods is that the geometric description is rather flexible. It is possible to consider the use of constructive solid geometry packages to define the :MeshGeom object of Fig. 4.5.

Assuming a 3D lattice split into homogeneous regions and changing the phase space variables, the average flux $\Phi_j$ can be expressed by

\[ V_j \Phi_j = \int_{V_j} d^3r \int_{4\pi} d^2\Omega\, \Phi(r, \Omega) = \int d^4T \int_{-\infty}^{+\infty} dt\, \chi_{V_j}(T, t)\, \Phi(r_T + t\Omega, \Omega) = \int d^4T \sum_k \delta_{j N_k} L_k \bar{\Phi}_k \tag{4.27} \]

where the characteristic line $T$ is determined by its orientation $\Omega$ (solid angle) along with a reference starting point $r_T$ for the line. To cover the domain, Monte Carlo codes typically use pseudorandom number generators as elaborated in Chapter 3.
In MOC deterministic codes, a quadrature set of solid angles is selected ($S_n$ quadrature sets can be used) and the starting point is chosen by scanning the plane perpendicular to the selected direction. Hence, in the second form of Eq. 4.27, the element $d^4T$ is composed of a solid angle element times the corresponding perpendicular plane element. In the above, the variable $t$ refers to the local coordinate on the tracking line, and the function $\chi_{V_j}(\cdot, \cdot)$ is defined as 1 if the tracking line segment passes through region $j$ and 0 otherwise. The last form of Eq. 4.27 is of particular importance in MOC solvers; it states that the average flux over a volume $V_j$ is made up of all the local segment contributions. Note that the $k$ summation runs over all the segments of a characteristic line. However, only the contributions of segments crossing region $j$ are added together by virtue of the Kronecker delta, i.e., the $\delta$ symbol.

A generic lattice MOC iterative solver may proceed as follows: after estimation of the neutron sources, the outward angular flux at the external boundary of the domain is summed on every surface in order to keep outward current components, which serve as inward currents for the next inner iteration. It is also possible to use cyclic characteristics on lattice domains; in this case, boundary conditions are directly embedded into the periodic tracking procedure, allowing factorization of the infinite periodic source contribution to the local segment flux. Once all the segment-averaged flux values are obtained, Eq. 4.27 is used to compute the new flux distribution, which serves for updating the neutron sources of the next iteration.

Note: Unfortunately, the estimated (numerically computed) volumes do not generally agree with the true volumes of the regions. In most deterministic MOC codes, the segment lengths are usually renormalized to preserve the true volumes; this is done by multiplying each segment's length $L_k$ by an angle-dependent factor $V_j / V_j'(\Omega_n)$.
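The segment-by-segment recursion of Eq. 4.26 and the Kronecker-delta sum in the last form of Eq. 4.27 can be sketched for a single tracking line as follows. The segment-averaged flux used here, $\bar{\Phi}_k = Q/\Sigma_t + (\Phi_- - \Phi_+)/(\Sigma_t L_k)$, is obtained by integrating Eq. 4.25 over the segment; it is not written out in the text. The sketch assumes flat sources and $\Sigma_t > 0$ in every region (no voids), and the track data and names are hypothetical.

```python
import math

def sweep_track(segments, sigma_t, source, phi_in):
    """Propagate the angular flux along one characteristic line.

    segments: ordered list of (region number N_k, length L_k).
    Returns the outgoing flux and the list of segment-averaged fluxes.
    """
    phi = phi_in
    averages = []
    for region, length in segments:
        st, q = sigma_t[region], source[region]
        attenuation = math.exp(-st * length)
        phi_out = phi * attenuation + q / st * (1.0 - attenuation)  # Eq. 4.26
        # Segment average from integrating Eq. 4.25 over the segment.
        averages.append(q / st + (phi - phi_out) / (st * length))
        phi = phi_out          # outward flux feeds the next segment
    return phi, averages

def accumulate(segments, averages, n_regions):
    """Single-track part of Eq. 4.27: sum over k of delta(j, N_k) * L_k * avg_k."""
    totals = [0.0] * n_regions
    for (region, length), avg in zip(segments, averages):
        totals[region] += length * avg
    return totals

# Hypothetical track crossing region 0, then region 1, then region 0 again.
track = [(0, 1.0), (1, 2.0), (0, 0.5)]
sigma_t = [0.5, 0.25]   # 1/cm, illustrative
source = [0.3, 0.1]     # flat sources per region, illustrative
phi_out, seg_avg = sweep_track(track, sigma_t, source, 0.7)
region_sums = accumulate(track, seg_avg, 2)
```

In a full solver these per-track sums would be weighted by the angular and planar quadrature of $d^4T$ and divided by the region volumes, with the segment lengths renormalized as described in the note above.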
4.3.1.6 Collision Probability Method

The integral form of the transport equation is obtained after integration of the streaming operator $\Omega \cdot \nabla$ along the neutron traveling direction. Let us first express Eq. 4.12 assuming an actual position $r$ for the neutron:

\[ -\frac{d}{ds} \Phi(r - s\Omega, \Omega) + \Sigma_t(r - s\Omega)\, \Phi(r - s\Omega, \Omega) = Q(r - s\Omega, \Omega) \tag{4.28} \]

Equations 4.25 and 4.28 are similar; however, the latter equation looks back along the neutron trajectory. After multiplying by the integrating factor $e^{-\tau_s} = \exp\left\{ -\int_0^s dt\, \Sigma_t(r - t\Omega) \right\}$, we integrate backward along the neutron path up to a boundary point $r_b$ and obtain

\[ \Phi(r, \Omega) = \Phi(r - b\Omega, \Omega)\, e^{-\tau_b} + \int_0^b ds\, Q(r - s\Omega, \Omega)\, e^{-\tau_s} \tag{4.29} \]

Assuming flat isotropic sources $Q(r, \Omega) = (Q_i / 4\pi)\, \chi_{V_i}(r)$ and that the integration range covers an infinite lattice, the average scalar flux inside region $j$ can be computed as

\[ V_j \Phi_j = \int_{V_j} d^3r \int_{4\pi} d^2\Omega\, \Phi(r, \Omega) = \int_{V_j} d^3r \int_{4\pi} d^2\Omega \int_0^\infty ds\, Q(r - s\Omega, \Omega)\, e^{-\tau_s} = \sum_i p_{ij} V_i Q_i \tag{4.30} \]

where the matrix coupling coefficients in the third form of Eq. 4.30 are defined by

\[ p_{ij} = \frac{1}{4\pi V_i} \int_{V_j} d^3r \int_{V_i} d^3r'\, \frac{\exp(-\tau_s)}{s^2} \tag{4.31} \]

Equation 4.30 is commonly referred to as the Peierls' equation. The lattice CP matrices obey the reciprocity relation $V_i p_{ij} = V_j p_{ji}$, and conservation of neutrons forces the sum of normalized probabilities $P_{ij} = \Sigma_{t,j}\, p_{ij}$ over all regions $j$ to be one. CP solvers can use the same tracking files as MOC solvers since a change in variables allows the calculation to be performed using the formula

\[ p_{ij} = \frac{1}{4\pi V_i} \int_{4\pi} d^2\Omega \int_{\Pi_\Omega} d^2s \int_{V_j} dt \int_{V_i} dt'\, \exp(-\tau_{ij}) \tag{4.32} \]

where $\Pi_\Omega$ is the plane perpendicular to the solid angle $\Omega$. Figure 4.6 shows how this change of variables is done in order to translate the volume integration into characteristic-like tracking.

Fig. 4.6 Change of variables for 3D tracking

Assuming that $d^2s$ is a planar element of plane $\Pi_\Omega$, the transformation of coordinates of Eq. 4.31 into local coordinates of the tracking lines in Eq. 4.32 is:

\[ \frac{d^3r\, d^3r'}{s^2} = d^2\Omega\, d^2s\, dt\, dt' \]

The last two integrations are over line segments of the characteristics and can be done analytically. Under the condition that the segment lengths are renormalized to preserve the true volumes, it can be shown that results obtained by CP and MOC are equivalent if both use the same tracking data.

Note: In the 1980s, CP methods were very popular because they could be used to collapse the tracking information into energy-dependent region-to-region matrices. Nowadays, in most modern lattice codes, MOC is preferred because it has linear complexity, it makes anisotropic effects easier to integrate and, last but not least, current computer resources (memory and hard disk space) have grown sufficiently to accommodate its computational load. Although the CP method is now deprecated, it is interesting to see how we can express multigroup solvers assuming the existence of these matrices.

4.3.1.7 Bn Solutions and Diffusion Coefficients

In this section, we study more closely the total leakage phenomenon and how it can be accounted for using specially devised diffusion coefficients. Consistent $B_n$ solutions are used to obtain homogenized nuclear parameters. The flux is first factorized into a global form and a lattice-periodic fine-structure flux

\[ \Phi(r, \Omega) = \exp(iB \cdot r)\, \varphi(r, \Omega, B) \tag{4.33} \]

A critical buckling search consists of finding the vector $B_{eff}$ having a minimal modulus in a given orientation, such that the lattice is critical. A generic fundamental lattice mode consists of a fine-structure solution over all directions on the critical ellipsoid. Provided that the buckling vector complies with Corngold's inequality

\[ |\mathrm{Im}(B \cdot \Omega)| < \Sigma \tag{4.34} \]

a critical buckling search is possible [7]. This inequality implies that the separation of the global form is not always possible in a subcritical cell, in particular, when void slits occur in some regions.
When this method is applied to homogeneous media problems, the buckling value $b$ is searched for the one-dimensional (1D) planar transport equation. The factorization in Eq. 4.33 is combined with a Legendre polynomial expansion and anisotropic collision probabilities can be computed by

$$A_{l'l} = \frac{1}{2}\int_0^1 d\mu\,P_{l'}(\mu)P_l(\mu)\left[\frac{1}{1+ib\mu} + \frac{(-1)^{l'+l}}{1-ib\mu}\right] = (-1)^{l'+l}\,\frac{1}{ib}\,P_{l'}\!\left(\frac{1}{ib}\right)Q_l\!\left(\frac{1}{ib}\right) \quad \text{for } l' \le l \tag{4.35}$$

where $P_l$ and $Q_l$ are the usual Legendre functions of the first and second kind, respectively. If the transfer cross section is expanded up to Legendre order $L$, a system of equations is obtained for the angular flux moments in the homogeneous medium:

$$\Sigma_t\,\phi_{l'} = \sum_{l=0}^{L}(2l+1)\,A_{l'l}\,\Sigma_{s,l}\,\phi_l$$

For this system to have a nontrivial solution, the following characteristic equation must be solved: $\left|\Sigma_t\,\delta_{l'l} - (2l+1)\,A_{l'l}\,\Sigma_{s,l}\right| = 0$. This procedure can be shown to be equivalent to the so-called dispersion equation used to compute the discrete eigenvalues of the transport equation [4].

Probably the best-known model is the so-called $B_1$ homogeneous model, which attempts to solve these equations in a cell-equivalent infinite medium when considering linear anisotropy. Under the assumptions that absorption is weak and that the lattice has internal symmetries, it is possible to define axial-dependent diffusion coefficients to account for axial streaming effects. The most general form that can be obtained to get rid of the angle and collapse the transport problem into a diffusion form involves a tensor

$$\mathbf{J}^g(\mathbf{r}) = -\mathbf{D}^g(\mathbf{r})\cdot\nabla\phi^g(\mathbf{r})$$

where $\mathbf{J}^g(\mathbf{r})$ is the neutron current and $\mathbf{D}^g(\mathbf{r})$ is the diffusion matrix. Because the critical buckling vector length does not vary very much with direction (i.e., the critical ellipsoid being almost a sphere), the directional-dependent diffusion coefficients are seldom needed in everyday core calculations where nondirectional diffusion coefficients are used.
The only significant exception is a lattice cell with large voiding where neutrons can travel a very long way [7].

4.3.1.8 Lattice Solvers

In lattice calculations, the boundary conditions are generally taken so that neutrons do not leak from the system, which comprises a single assembly or a few assemblies. Neutrons arriving at the external boundaries of the lattice domain may change their geometric state by a simple transformation in the phase space $(\mathbf{r}_b,\boldsymbol{\Omega})$ like:

Specular (mirror-like) reflection: $\Phi(\mathbf{r}_b,\boldsymbol{\Omega}) = \Phi(\mathbf{r}_b,\boldsymbol{\Omega}-2(\boldsymbol{\Omega}\cdot\mathbf{N}_b)\mathbf{N}_b)$

Periodic or translation: $\Phi(\mathbf{r}_b,\boldsymbol{\Omega}) = \Phi(\mathbf{r}_b',\boldsymbol{\Omega})$

where $\mathbf{r}_b$ and $\mathbf{r}_b'$ are on $\partial D$ and $\mathbf{N}_b$ is the outward normal on $\partial D$. In these cases, the external boundary does not have to be discretized into surfaces. In many problems, however, the external boundary is split into surfaces as in Eq. 4.18, and there is a homogenization process that takes place on each of these surfaces. Neutrons arriving at an external surface of $\partial D$ may forget (partly or totally) their directions of motion. Neutron currents leaving surface $S_\alpha$ are defined by

$$J_{\alpha,+} = \int_{S_\alpha} d^2r_b \int_{\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+}>0} d^2\Omega\,(\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+})\,\Phi(\mathbf{r}_b,\boldsymbol{\Omega}) \tag{4.36}$$

and will be inserted back into the system with incoming currents

$$J_{\alpha,-} = \int_{S_\alpha} d^2r_b \int_{\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+}<0} d^2\Omega\,|\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+}|\,\Phi(\mathbf{r}_b,\boldsymbol{\Omega}) \tag{4.37}$$

either at the same surface (that is the case of isotropic reflection) $J_{\alpha,-} = J_{\alpha,+}$ or at another surface of the system $J_{\beta,-} = J_{\alpha,+}$. If white boundary conditions are applied, the outgoing neutron currents are returned back into the lattice as isotropic incoming currents using an orthogonal transformation $J_{\beta,-} = \sum_\alpha T_{\beta\alpha}\,J_{\alpha,+}$. The orthogonality of the transformation ensures that no neutron leaks from the lattice.

Let us return to Eqs. 4.13–4.17 where we introduced the generic multigroup solver. Infinite lattice calculations can be done using the generic solver in order to get eigenvalues $K = k_\infty$ for different configurations. Different search capabilities are usually programmed in lattice codes.
For example, a neutron leakage model may be introduced to represent how the finite physical reactor core interacts with the embedded lattice. Assuming a homogeneous macroscopic flux distribution, solution of $\nabla\cdot\nabla\Psi + \mathbf{B}\cdot\mathbf{B}\,\Psi = 0$ for a buckling vector $\mathbf{B}$, a critical buckling search can be done to adjust the buckling length $|\mathbf{B}|_{\text{eff}}$ to yield $K = k_{\text{eff}} = 1$. To find the critical buckling, there are several possibilities, but the two main branches assume basically that a new absorption term is included [8]:

On the left-hand side of Eq. 4.12, using a transformed static transport operator in which the total cross section is replaced as follows (or some approximation of this):

$$\mathcal{L}^g_{\text{static}}\Phi \leftarrow \left[\mathcal{L}_{\text{static}}\right]\Big|_{\Sigma_t^g(\mathbf{r})+D^g(\mathbf{r})B^2}\,\Phi \tag{4.38}$$

On the right-hand side of Eq. 4.12, by subtracting the absorption term from the diagonal of the scattering matrix:

$$\mathcal{S}^g\Phi \leftarrow \left[\mathcal{S}\right]\Big|_{\Sigma_s^{g\leftarrow g'}(\mathbf{r})-\delta^{gg'}D^g(\mathbf{r})B^2}\,\Phi \tag{4.39}$$

where $D^g(\mathbf{r})$ is the diffusion coefficient for group $g$ that will be discussed later. Finally, in cases where a critical period $\alpha_{\text{eff}}$ has to be found, a similar absorption correction term appears as a transformation of the transport operator

$$\mathcal{L}^g_{\text{static}}\Phi \leftarrow \left[\mathcal{L}_{\text{static}}\right]\Big|_{\Sigma_t^g(\mathbf{r})+\alpha/v^g}\,\Phi \tag{4.40}$$

and iterations are done in order to obtain the critical period.

Discrete ordinates, characteristics, and collision probabilities methods are similar in many aspects. The :MeshGeom object will be numerically tracked after choosing a set of angles. In trajectory-based transport calculations, numerical values for volumes and surfaces strongly depend on the number of segments used. Consider Fig. 4.7, where a triangular stretched region is represented in bold. On the left-hand side, the tracking angle is nearly parallel to the main stretching axis. On the right-hand side, the same region is tracked with another angle (nearly perpendicular to the stretching axis).
On the left, the segment length crossing the region on the middle track must be normalized to represent the real volume seen by the neutron (illustrated as a gray rectangle); the two other tracks do not even see the region. The normalization to preserve volume is huge and significantly disturbs the characteristics line. However, on the right, the normalization of segment length has very little effect. Lattice-based results are better when a coherent scheme is taken to mesh the basic geometry into elements with sound tracking data in order to represent the neutron's physical behavior.

Fig. 4.7 Stretched volumes in trajectory-based methods

Fig. 4.8 Milestones for introduction of new lattice codes (timeline from 1960 to 2007: THERMOS 1960, WIMS 1966, APOLLO 1973, CPM 1977, CASMO 1983, DRAGON 1991, ECCO 1995)

4.3.1.9 Putting It All Together into Lattice Codes

Since the 1960s, different lattice codes have been developed. Most of these codes have been maintained for several years, and their features have evolved with the available computing resources. Figure 4.8 gives a few milestone indications of when some codes appeared. Short descriptions are now provided:

THERMOS, originally developed at Brookhaven National Laboratory, was mostly a code devoted to neutron thermalization [9].

WIMS, developed at Winfrith laboratory, England, was the first full-feature lattice code [10]. It is currently available under version WIMS-9.

APOLLO, developed at CEA Saclay, France, included interesting features like resonance treatment and homogeneous $B_1$ treatment embedded in the flux solver [11]. The present version is APOLLO-2.

Both CPM and CASMO, developed at EPRI and Studsvik Scandpower, integrate most features necessary for LWR cores [12]. The present version CASMO-4 is widely used in the USA.

DRAGON, developed at École Polytechnique de Montréal, is an open-source lattice code with many features pertinent to (but not limited to) CANDU reactors [13].
The present major number release is DRAGON-3.

ECCO, developed mainly at CEA Cadarache, is a European cell code with many features pertinent to (but not limited to) fast reactors [14]. It is integrated into the ERANOS-2 code system.

TRITON, developed at Oak Ridge National Laboratory, is a high-fidelity lattice code based on the 2D extended step characteristic $S_n$ transport module NEWT, the continuous-energy resonance processing module CENTRM, and the ORIGEN-S depletion/decay package [15]. The current version is part of the SCALE 5.1 package.

4.3.2 Homogenization Process

This section presents some aspects of the homogenization problems in reactor physics. For obtaining a homogenized few-group zone with neutron properties totally equivalent to a heterogeneous medium, Koebke [16] postulates that two kinds of conservation should be fulfilled:

1. Volume-related conservation: the integral flux and the integral reaction rates must be conserved in the homogenized zone.
2. Surface-related conservation: the integral net currents and the integral fluxes must be conserved at each interface of the homogenized zone.

When these two conditions are fulfilled, the heterogeneous cell medium is perfectly well represented by the homogenized equivalent zone. To enforce these conservation relations, lattice solvers are expected to preserve volumes and surfaces of components in the first place. Embedding each homogenized zone inside a realistic core application composed of several levels of heterogeneity would imply using a multigrid approach where each different local heterogeneous cell configuration would react to its interface boundary conditions. Using this approach, each cell node is represented by a response matrix where its output flux values depend on input at interfaces, while preserving significant integral reaction rates.
Unfortunately, once these "exact" nodes are coupled together, the number of unknowns is as large as the one generated by the transport equation discretized over the reactor core. In this section, we will review some classical approximations that are used to reduce the number of spatial unknowns in core calculations.

4.3.2.1 Reaction Rates and Homogenized Cross Sections

A generic approach to cell homogenization is based upon the following formula for computing reaction rates in a volume

$$R = \langle\Sigma,\Phi\rangle = \int_V d^3r \int_{4\pi} d^2\Omega\,\Sigma(\mathbf{r},\boldsymbol{\Omega})\,\Phi(\mathbf{r},\boldsymbol{\Omega}) \tag{4.41}$$

It is sometimes possible to use the lattice adjoint calculation in order to take into account important local effects, such as detector response. Knowing the sources $Q$, the reaction rates are simply given by $R_d = \langle\Phi^*,Q\rangle$ with the appropriate importance function $\Phi^*$. A homogenization process that attempts to conserve these weighted reaction rates must ensure that

$$R = \langle\Sigma,\Phi\rangle = \langle\hat{\Sigma},\hat{\Phi}\rangle \tag{4.42}$$

where the hat symbol $\hat{\ }$ is used for the homogeneous equivalent zone. Under angular isotropy assumptions, the regular form of this equation asserts that a uniform homogenized cross section over a cell could be defined by

$$\hat{\Sigma}\int_V d^3r\,\hat{\Phi}(\mathbf{r}) = \int_V d^3r\,\Sigma(\mathbf{r})\,\Phi(\mathbf{r}) \tag{4.43}$$

When the integral flux value is preserved in the homogeneous equivalent, the homogenized cross section is simply the flux-volume weighted average of the heterogeneous cross section.
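As a small illustration of Eq. 4.43, the sketch below homogenizes a cell made of a few regions with piecewise-constant fluxes and cross sections, assuming that the integrated flux is preserved in the homogeneous equivalent. The function name `homogenize` and the three-region data are hypothetical and serve only to show the arithmetic; they are not taken from any lattice code.

```python
def homogenize(volumes, fluxes, sigmas):
    """Flux-volume weighted cross section for piecewise-constant regions.

    Returns the homogenized cross section together with the heterogeneous
    reaction rate and the integrated flux used to build it.
    """
    reaction_rate = sum(v * f * s for v, f, s in zip(volumes, fluxes, sigmas))
    integrated_flux = sum(v * f for v, f in zip(volumes, fluxes))
    return reaction_rate / integrated_flux, reaction_rate, integrated_flux


# Hypothetical three-region pin cell (fuel, clad, moderator); illustrative values only.
volumes = [0.50, 0.10, 1.00]   # region volumes per unit height (cm^3)
fluxes = [0.90, 0.95, 1.10]    # region-averaged scalar fluxes (arbitrary units)
sigmas = [0.30, 0.01, 0.02]    # region absorption cross sections (cm^-1)

sigma_hom, rate_het, flux_int = homogenize(volumes, fluxes, sigmas)

# Reaction rate recomputed in the homogeneous equivalent (same integrated flux).
rate_hom = sigma_hom * flux_int
```

By construction, `rate_hom` reproduces `rate_het`, which is the volume-related conservation requirement stated for the homogenized zone.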
If the integrated flux in the homogeneous equivalent is not known, correction factors $\mu$ may be applied to flux-volume average values to preserve reaction rates and integral values:

$$\hat{\Sigma} = \mu\,\bar{\Sigma} = \mu\,\frac{\displaystyle\int_V d^3r\,\Sigma(\mathbf{r})\,\Phi(\mathbf{r})}{\displaystyle\int_V d^3r\,\Phi(\mathbf{r})} \tag{4.44}$$

Similar reasoning could be applied to separate streaming effects in the volume into their contribution to currents at different interfaces while preserving

$$\int_{\partial V} d^2r_b \int_{4\pi} d^2\Omega\,(\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+})\,\hat{\Phi}(\mathbf{r}_b,\boldsymbol{\Omega}) = \int_{\partial V} d^2r_b \int_{4\pi} d^2\Omega\,(\boldsymbol{\Omega}\cdot\mathbf{N}_{b,+})\,\Phi(\mathbf{r}_b,\boldsymbol{\Omega}) \tag{4.45}$$

which can be translated into a current formulation

$$\sum_\alpha\left(\hat{J}_{\alpha,+}-\hat{J}_{\alpha,-}\right) = \sum_\alpha\left(J_{\alpha,+}-J_{\alpha,-}\right) \tag{4.46}$$

A homogenizer (reactor physicist on task) must be aware of what is needed: here, only the net neutron flow is preserved; in infinite lattice calculations (with no neutron leakage model), this means zero equals zero. However, this crude balance may not preserve the surface currents at different interfaces. The lattice could be rotated and there would be no visible effect for the homogenized equivalent. In the case of completely symmetric unit cells, a single outer boundary surface (a white boundary on $\partial D$) that behaves exactly the same way for neutrons entering the volume from any direction, together with a conservation relation like Eq. 4.41, may be sufficient. Using a leakage model, this behavior can sometimes be represented using uniform diffusion coefficients. Unfortunately, the true unit cell is embedded in the reactor core and currents on each interface are probably different, even when the basic lattice geometry is totally symmetric.

4.3.2.2 Generalized Equivalence Theory and Discontinuity Factors

Another important issue in multigroup equivalence theory is the neutron behavior at cell interfaces (inter-assembly effects). In order to deal with interfaces in coarse problems, the classical approach generally assumes neutron current continuity.
In the process of converting a large-scale transport problem into a diffusion-like problem, axial diffusion coefficients can give some degrees of freedom to conserve transverse leakage effects. However, if we add the last constraint of also preserving the surface fluxes, additional degrees of freedom not usually encountered in purely diffusive problems are needed. This supplementary constraint yields the well-known homogenization paradox: in the process of defining an equivalent homogeneous medium, some properties get lost.

The generalized equivalence theory proposes a framework for obtaining equivalent reactor core solutions. At each interface $S_\alpha = V_I \cap V_J$ between zones $I$ and $J$, the surface fluxes on the two sides of the interface are allowed to be discontinuous in the homogenized domain. Discontinuity factors are introduced on each side of the interface [17]:

$$f^{+}_{\alpha,J\leftarrow I} = \frac{\Phi_{\alpha,\text{side } I}}{\hat{\Phi}_{\alpha,\text{side } I}}\,;\qquad f^{-}_{\alpha,I\leftarrow J} = \frac{\Phi_{\alpha,\text{side } J}}{\hat{\Phi}_{\alpha,\text{side } J}} \tag{4.47}$$

and the core-level solution assumes that neutrons crossing interfaces respect the discontinuity.

Note: Work on homogenization theory and neutron leakage is certainly not exhaustively covered in this presentation. Many variations regarding the issue of defining homogenized parameters have evolved from the work of Behrens in the late 1950s to modern advanced nodal methods. In recent years, there have been significant achievements in heterogeneous whole-core analyses without homogenization or group condensation (e.g., see [18]).

4.4 Reactor Core Solvers

Power-plant reactor core calculations are usually done with few energy groups. Many approximations are made to condense and homogenize the lattice results, but the angular dependence is also important. Basically, approximations made over the angular variable serve as the basic glue joining together the core regions. In this section, some methods used to obtain the flux distribution over the core will be presented.
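4.4.1 Pn Approximations and Diffusion

In isotropic media, the angular transfer operator depends on the angular deflection $\mu_0 = \boldsymbol{\Omega}\cdot\boldsymbol{\Omega}'$ resulting after a neutron scattering collision. The scattering cross section is generally expanded up to Legendre order $L$ in anisotropy [8]

$$\Sigma_s(\mathbf{r},\mu_0) = \sum_{l=0}^{L}\frac{2l+1}{4\pi}\,\Sigma_{s,l}(\mathbf{r})\,P_l(\mu_0) \tag{4.48}$$

where $P_l$ are the usual Legendre polynomials. In order to determine the scattering source in Eq. 4.14, the flux must also be expanded into spherical harmonics components

$$\Phi(\mathbf{r},\boldsymbol{\Omega}) = \sum_{l=0}^{L}\frac{2l+1}{4\pi}\sum_{m=-l}^{+l}\phi_{lm}(\mathbf{r})\,R_{lm}(\boldsymbol{\Omega}) \tag{4.49}$$

where $R_{lm}$ are orthogonal real spherical harmonics. By the use of the addition theorem of spherical harmonics, it is possible to rewrite the neutron sources in a truncated form

$$Q(\mathbf{r},\boldsymbol{\Omega}) = \sum_{l=0}^{L}\frac{2l+1}{4\pi}\sum_{m=-l}^{+l}R_{lm}(\boldsymbol{\Omega})\left\{\Sigma_{s,l}(\mathbf{r})\,\phi_{lm}(\mathbf{r}) + f_{lm}(\mathbf{r})\right\} \tag{4.50}$$

where $f_{lm}$ are the expansion coefficients of the source terms of Eq. 4.14.

4.4.1.1 Spherical Harmonics and the Even-Parity Transport Equation

To take into account the angular variable, the even and odd components of the angular flux are often separated. Assuming an isotropic source, the steady-state one-group transport problem described in Eq. 4.12 can be transformed into:

$$\begin{cases}\boldsymbol{\Omega}\cdot\nabla\Phi^-(\mathbf{r},\boldsymbol{\Omega}) + \Sigma(\mathbf{r})\,\Phi^+(\mathbf{r},\boldsymbol{\Omega}) = \dfrac{1}{4\pi}Q(\mathbf{r})\\[2mm] \boldsymbol{\Omega}\cdot\nabla\Phi^+(\mathbf{r},\boldsymbol{\Omega}) + \Sigma(\mathbf{r})\,\Phi^-(\mathbf{r},\boldsymbol{\Omega}) = 0\end{cases} \tag{4.51}$$

where $\Phi^+(\mathbf{r},\boldsymbol{\Omega}) = [\Phi(\mathbf{r},\boldsymbol{\Omega})+\Phi(\mathbf{r},-\boldsymbol{\Omega})]/2$ and $\Phi^-(\mathbf{r},\boldsymbol{\Omega}) = [\Phi(\mathbf{r},\boldsymbol{\Omega})-\Phi(\mathbf{r},-\boldsymbol{\Omega})]/2$ are respectively the even and odd components of the angular flux. Extracting the odd component from the second equation, the even-parity form of the transport equation is

$$-\boldsymbol{\Omega}\cdot\nabla\left[\frac{1}{\Sigma(\mathbf{r})}\,\boldsymbol{\Omega}\cdot\nabla\Phi^+(\mathbf{r},\boldsymbol{\Omega})\right] + \Sigma(\mathbf{r})\,\Phi^+(\mathbf{r},\boldsymbol{\Omega}) = \frac{1}{4\pi}Q(\mathbf{r}) \tag{4.52}$$

Assume that the spherical harmonics approximation of Eq. 4.49 is written using scalar products $\Phi^+(\mathbf{r},\boldsymbol{\Omega}) = \mathbf{e}(\boldsymbol{\Omega})\cdot\boldsymbol{\varphi}^+(\mathbf{r})$ and $\Phi^-(\mathbf{r},\boldsymbol{\Omega}) = \mathbf{o}(\boldsymbol{\Omega})\cdot\boldsymbol{\varphi}^-(\mathbf{r})$. The even-parity spatial equations now reduce to a set of second-order differential equations:

$$-\nabla\cdot\frac{1}{\Sigma(\mathbf{r})}\,\mathbf{E}\mathbf{E}^T\,\nabla\boldsymbol{\varphi}^+(\mathbf{r}) + \Sigma(\mathbf{r})\,\boldsymbol{\varphi}^+(\mathbf{r}) = \mathbf{q}(\mathbf{r}) \tag{4.53}$$

where $\mathbf{E}\mathbf{E}^T = \int_{4\pi} d^2\Omega\,\mathbf{e}(\boldsymbol{\Omega})\,\mathbf{e}^T(\boldsymbol{\Omega})$ represents the even-component angular coupling.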
To enforce continuity conditions at interfaces between neighboring regions, the odd-parity components are also defined

$$\boldsymbol{\varphi}^-(\mathbf{r}) = -\frac{1}{\Sigma(\mathbf{r})}\,\nabla\,\mathbf{O}^T\boldsymbol{\varphi}^+(\mathbf{r}) \tag{4.54}$$

where $\mathbf{O}^T = \int_{4\pi} d^2\Omega\,\mathbf{o}(\boldsymbol{\Omega})\,\mathbf{e}^T(\boldsymbol{\Omega})$. Following these angular approximations, a complete spherical harmonics solver can be obtained from the spatial discretization of Eq. 4.53 with orthogonal functions correctly matching the homogeneous regions inside the domain, with Lagrange multipliers coupling the odd-parity interface components of Eq. 4.54. For anisotropic scattering or sources, the formulation is also possible, but a little more tedious [19]. For the sake of simplicity, only numerical schemes based on the diffusion approximation will be presented in this chapter. The reader is referred to Chapter 2 of this book for a more detailed presentation on spherical harmonics solvers.

4.4.1.2 The Diffusion Approximation

In the context of reactor codes, the $P_1$ expansion is well known and directly related to diffusion theory. In that case, the angular flux takes a simple form

$$\Phi(\mathbf{r},\boldsymbol{\Omega}) = \frac{1}{4\pi}\left\{\phi(\mathbf{r}) + 3\,\boldsymbol{\Omega}\cdot\mathbf{J}(\mathbf{r})\right\} \tag{4.55}$$

Using the generic multigroup formulation, the integration of Eqs. 4.13 and 4.14 in angle gives

$$\nabla\cdot\mathbf{J}^g(\mathbf{r}) + \Sigma_t^g(\mathbf{r})\,\phi^g(\mathbf{r}) = \sum_{g'=1}^{G}\Sigma_{s,0}^{g\leftarrow g'}(\mathbf{r})\,\phi^{g'}(\mathbf{r}) + f_0^g(\mathbf{r}) \tag{4.56}$$

Now, assuming isotropic fission sources, another equation to close the multigroup system can be obtained after multiplying by the angle and integrating once more

$$\frac{1}{3}\nabla\phi^g(\mathbf{r}) + \Sigma_t^g(\mathbf{r})\,\mathbf{J}^g(\mathbf{r}) = \sum_{g'=1}^{G}\Sigma_{s,1}^{g\leftarrow g'}(\mathbf{r})\,\mathbf{J}^{g'}(\mathbf{r}) \tag{4.57}$$

A multigroup $P_1$ solver employs these equations. Most of the core simulations are probably still done using diffusion codes.
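Equations 4.56 and 4.57 can be combined together into an elliptic form using the diffusion coefficient $D^g$

$$-\nabla\cdot\left(D^g(\mathbf{r})\,\nabla\phi^g(\mathbf{r})\right) + \Sigma_t^g(\mathbf{r})\,\phi^g(\mathbf{r}) = \sum_{g'=1}^{G}\Sigma_{s,0}^{g\leftarrow g'}(\mathbf{r})\,\phi^{g'}(\mathbf{r}) + f_0^g(\mathbf{r}) \tag{4.58}$$

Several recipes are possible for defining $D^g$:

If $\Sigma_{s,1}^{g\leftarrow g'}(\mathbf{r}) = 0$, use recipe $D^g(\mathbf{r}) = \dfrac{1}{3\,\Sigma_t^g(\mathbf{r})}$

If $\Sigma_{s,1}^{g\leftarrow g'}(\mathbf{r}) = \delta^{gg'}\,\Sigma_{s,1}^{g}(\mathbf{r})$, use recipe $D^g(\mathbf{r}) = \dfrac{1}{3\left(\Sigma_t^g(\mathbf{r}) - \Sigma_{s,1}^{g}(\mathbf{r})\right)}$

If macro-reversibility $\Sigma_{s,1}^{g\leftarrow g'}(\mathbf{r})\,\mathbf{J}^{g'}(\mathbf{r}) = \Sigma_{s,1}^{g'\leftarrow g}(\mathbf{r})\,\mathbf{J}^{g}(\mathbf{r})$ holds, use recipe $D^g(\mathbf{r}) = \dfrac{1}{3\left(\Sigma_t^g(\mathbf{r}) - \sum_{g'=1}^{G}\Sigma_{s,1}^{g'\leftarrow g}(\mathbf{r})\right)}$

and so on.

Transport-corrected cross sections were often used in older lattice codes (limited to isotropic sources and scattering) to account for linear anisotropy. In the second definition of $D^g$, the static-corrected transport operator of Eq. 4.12 can be taken as

$$\mathcal{L}^g_{\text{static}}\Phi \leftarrow \left[\mathcal{L}_{\text{static}}\right]\Big|_{\Sigma_t^g(\mathbf{r})-\Sigma_{s,1}^{g}(\mathbf{r})}\,\Phi$$

consistent with the scattering operator defined as

$$\mathcal{S}^g\Phi \leftarrow \left[\mathcal{S}\right]\Big|_{\Sigma_{s,0}^{g\leftarrow g'}(\mathbf{r})-\delta^{gg'}\Sigma_{s,1}^{g}(\mathbf{r})}\,\Phi$$

4.4.1.3 SPn and Improved Diffusion

In 3D geometries, the $P_n$ (or $B_n$) equations are quite complicated to implement due to the large number of unknowns (in fact, there are $(n+1)^2$ coupled equations in 3D). At the beginning of the 1960s, Gelbard introduced the simplified spherical harmonics equations [20]. These simplified equations were historically derived after some algebraic reductions that are formally exact in planar geometry. In 1D planar geometry, the $P_3$ expansion for the angular flux is the following:

$$\Phi(x,\mu) = \frac{1}{2}\phi_0(x) + \frac{3}{2}\,\mu\,\phi_1(x) + \frac{5}{2}\,\frac{3\mu^2-1}{2}\,\phi_2(x) + \frac{7}{2}\,\frac{5\mu^3-3\mu}{2}\,\phi_3(x) \tag{4.59}$$

The corresponding $P_3$ equations coupling the flux moments together are:

$$\begin{cases}\dfrac{d}{dx}\phi_1(x) + \Sigma_{r,0}\,\phi_0(x) = S(x)\\[2mm] 2\dfrac{d}{dx}\phi_2(x) + \dfrac{d}{dx}\phi_0(x) + 3\,\Sigma_{r,1}\,\phi_1(x) = 0\\[2mm] 3\dfrac{d}{dx}\phi_3(x) + 2\dfrac{d}{dx}\phi_1(x) + 5\,\Sigma_{r,2}\,\phi_2(x) = 0\\[2mm] 3\dfrac{d}{dx}\phi_2(x) + 7\,\Sigma_{r,3}\,\phi_3(x) = 0\end{cases} \tag{4.60}$$

where $\Sigma_{r,n} = \Sigma_t - \Sigma_{s,n}$.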
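The Legendre expansion of Eq. 4.59 can be checked numerically. The following minimal sketch assumes that the moments are defined as $\phi_n = \int_{-1}^{1} d\mu\,P_n(\mu)\,\Phi(\mu)$ (a convention adopted here for the example), evaluates the $P_3$ angular flux from a set of hypothetical moments at one spatial point, and recovers them with a 4-point Gauss-Legendre rule. The names `legendre`, `angular_flux`, and `moment` are illustrative only.

```python
# 4-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 7.
NODES = (-0.8611363115940526, -0.3399810435848563,
         0.3399810435848563, 0.8611363115940526)
WEIGHTS = (0.3478548451374538, 0.6521451548625461,
           0.6521451548625461, 0.3478548451374538)


def legendre(n, mu):
    """Legendre polynomial P_n(mu) for n = 0..3."""
    return (1.0,
            mu,
            0.5 * (3.0 * mu**2 - 1.0),
            0.5 * (5.0 * mu**3 - 3.0 * mu))[n]


def angular_flux(moments, mu):
    """P3 expansion: sum over n of (2n+1)/2 * phi_n * P_n(mu)."""
    return sum(0.5 * (2 * n + 1) * phi_n * legendre(n, mu)
               for n, phi_n in enumerate(moments))


def moment(n, moments):
    """Quadrature estimate of phi_n = integral of P_n(mu) * Phi(mu) over [-1, 1]."""
    return sum(w * legendre(n, mu) * angular_flux(moments, mu)
               for mu, w in zip(NODES, WEIGHTS))


# Hypothetical moments phi_0..phi_3 at some position x (illustrative values).
moments = [1.0, 0.2, -0.05, 0.01]
recovered = [moment(n, moments) for n in range(4)]
```

Because the integrands are polynomials of degree six at most, the quadrature is exact up to round-off and `recovered` matches `moments`; this orthogonality is what allows the transport equation to be projected onto the coupled moment equations.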
Elimination of the odd-order moments is easy and the two remaining equations can be written as:

$$\begin{cases}-\dfrac{d}{dx}\,\dfrac{1}{3\Sigma_{r,1}}\,\dfrac{d}{dx}\left\{\phi_0(x) + 2\phi_2(x)\right\} + \Sigma_{r,0}\,\phi_0(x) = S(x)\\[3mm] -\dfrac{d}{dx}\,\dfrac{9}{35\Sigma_{r,3}}\,\dfrac{d}{dx}\phi_2(x) + \Sigma_{r,2}\,\phi_2(x) = \dfrac{2}{5}\left\{\Sigma_{r,0}\,\phi_0(x) - S(x)\right\}\end{cases} \tag{4.61}$$

Using an ad hoc procedure to obtain the multidimensional simplified $P_3$ equations, the derivatives of the odd moments are replaced by divergence operators and the derivatives of the even moments by gradient operators. Equation 4.60 is then transformed into:

$$\begin{cases}-\nabla\cdot\dfrac{1}{3\Sigma_{r,1}}\,\nabla(\phi_0 + 2\phi_2) + \Sigma_{r,0}\,\phi_0 = S\\[3mm] -\nabla\cdot\dfrac{9}{35\Sigma_{r,3}}\,\nabla\phi_2 + \Sigma_{r,2}\,\phi_2 = \dfrac{2}{5}\left\{\Sigma_{r,0}\,\phi_0 - S\right\}\end{cases} \tag{4.62}$$

These two multidimensional equations look like a two-group diffusion problem. For a reactor core problem using $G$ energy groups, there will be $2G$ such equations to solve; this is far more tractable than the full $P_3$ formulation where there would be $16G$ coupled equations. Moreover, since the form of Eq. 4.62 is very similar to the usual diffusion equations, all the numerical methods suitable for diffusion problems can generally be extended easily to tackle these equations. In general, $(n+1)/2$ second-order diffusion-like equations are needed for developing an $SP_n$ method. Many current reactor codes use $SP_n$ calculation schemes, where the diffusion solutions are improved by taking into account higher angular order relations. It is also possible to obtain $SB_n$ simplified equations after replacing $\nabla$ by $\nabla + i\mathbf{B}$ and eliminating the complex terms. In the rest of this section, only the diffusion equation will be considered. However, the reader should be aware that this restriction is only in the interest of simplifying the presentation.

4.4.2 Diffusion-Like Methods

Although not always accurate, diffusion solvers are invaluable in obtaining inexpensive reactor core data. They are also used as accelerators for transport core solvers.
For these reasons, we will now explain a few basic core methods using the framework of a generic steady-state multigroup diffusion equation

$$-\nabla\cdot D^g\nabla\phi^g + \Sigma_r^g\,\phi^g = \sum_{g'\ne g}\Sigma_{s,0}^{g\leftarrow g'}\,\phi^{g'} + \lambda\,\chi^g\sum_{g'}\nu\Sigma_f^{g'}\,\phi^{g'} \tag{4.63}$$

where $\Sigma_r^g = \Sigma_t^g - \Sigma_{s,0}^{g\leftarrow g}$ is the removal cross section. A generic multigroup diffusion solver follows a pattern very similar to that of the generic transport solver of Section 4.2.1.3. However, the solution is much simpler to obtain since, upon discretization, the self-adjoint operator on the left-hand side will involve symmetric positive definite matrices. The multigroup diffusion system stays nonsymmetric due to the scattering transfers. However, the self-scattering term can now be directly integrated in the inner iteration. The external boundary conditions of the reactor core are normally taken into account:

For symmetry planes, by $\nabla\phi^g\cdot\mathbf{N} = 0$

For zero incoming current, by $\phi^g = -2D^g\,(\nabla\phi^g\cdot\mathbf{N})$

The flux distribution is the eigenvector solution corresponding to $\lambda_0 = 1/k_{\text{eff}}$; the eigenvector is normalized to the total reactor power as:

$$P_{\text{tot}} = \langle\kappa\Sigma_f,\phi\rangle_{\text{core}} = \sum_{g=1}^{G}\int_{\text{core}} d^3r\,\kappa\Sigma_f^g(\mathbf{r})\,\phi^g(\mathbf{r}) \tag{4.64}$$

where $\kappa$ is the energy release per fission.

4.4.2.1 Transverse Integrated Nodal Methods

Transverse integration processes the core solution in each dimension one at a time. Assuming a 3D Cartesian core split into volume elements $|V| = \Delta x\,\Delta y\,\Delta z$, the derivative with respect to one of the coordinates is kept on the left-hand side of Eq. 4.63 and the other terms are sent to the other side:

$$-\frac{d}{dx}D^g\frac{d}{dx}\Phi^g(x) + \Sigma_r^g\,\Phi^g(x) = Q^g(x) - \frac{1}{\Delta y}L_y^g(x) - \frac{1}{\Delta z}L_z^g(x) \tag{4.65}$$

Here, the flux is integrated in the two other directions:

$$\Phi^g(x) = \frac{1}{\Delta y}\int_{y_-}^{y_+} dy\,\frac{1}{\Delta z}\int_{z_-}^{z_+} dz\,\phi^g(x,y,z)$$

and so are the fission and scattering sources.
Finally, the transverse leakage terms are given by:

$$\begin{cases}L_y^g(x) = -\dfrac{1}{\Delta z}\displaystyle\int_{z_-}^{z_+} dz\,\left[D^g\,\dfrac{\partial}{\partial y}\phi^g(x,y,z)\right]_{y_-}^{y_+}\\[4mm] L_z^g(x) = -\dfrac{1}{\Delta y}\displaystyle\int_{y_-}^{y_+} dy\,\left[D^g\,\dfrac{\partial}{\partial z}\phi^g(x,y,z)\right]_{z_-}^{z_+}\end{cases} \tag{4.66}$$

In nodal expansion methods (NEM), polynomial flux expansions are combined with a quadratic transverse leakage fit to get a fast core solver. Using local coordinates where the origin is the center of the node, the flux is represented without taking into account cross-directional dependency:

$$\phi^g(x,y,z) = \bar{\phi}^g + \sum_{n=1}^{N}a_n^g\,p_n(x) + \sum_{n=1}^{N}b_n^g\,q_n(y) + \sum_{n=1}^{N}c_n^g\,r_n(z) \tag{4.67}$$

where $\bar{\phi}^g$ is the node-averaged flux and where the polynomials satisfy

$$\int_{x_-}^{x_+}p_n(x)\,dx = \int_{y_-}^{y_+}q_n(y)\,dy = \int_{z_-}^{z_+}r_n(z)\,dz = 0$$

Without any loss of generality, only the $x$-direction will be treated from now on. A classical example involves 4th degree polynomials, written here with the normalized coordinate $\xi = (x-\bar{x})/\Delta x$ measured from the node center $\bar{x}$:

$$\begin{cases}p_1(x) = \xi\\[1mm] p_2(x) = 3\xi^2 - \dfrac{1}{4}\\[2mm] p_3(x) = \xi\left(\xi^2 - \dfrac{1}{4}\right)\\[2mm] p_4(x) = \left(\xi^2 - \dfrac{1}{20}\right)\left(\xi^2 - \dfrac{1}{4}\right)\end{cases} \tag{4.68}$$

conveniently chosen such that $p_3(x_\pm) = p_4(x_\pm) = 0$. The coefficients of the expansions are related to flux and current values on each face of the node boundary. For example, in the $x$-direction, it can be shown that

$$\begin{cases}a_1^g = \Phi^g(x_+) - \Phi^g(x_-) \equiv \Phi^g_{x+} - \Phi^g_{x-}\\[2mm] a_2^g = \Phi^g_{x+} + \Phi^g_{x-} - 2\bar{\phi}^g\\[2mm] J^g_{x-} = -D^g\dfrac{d}{dx}\Phi^g(x_-) = -\dfrac{D^g}{\Delta x}\left[a_1^g - 3a_2^g + \dfrac{1}{2}a_3^g - \dfrac{1}{5}a_4^g\right]\\[3mm] J^g_{x+} = -D^g\dfrac{d}{dx}\Phi^g(x_+) = -\dfrac{D^g}{\Delta x}\left[a_1^g + 3a_2^g + \dfrac{1}{2}a_3^g + \dfrac{1}{5}a_4^g\right]\end{cases} \tag{4.69}$$

In order to obtain a well-posed system of equations involving the node-averaged flux and face-averaged partial currents across nodal interfaces, a weighted residual procedure is applied on the node. Integration over the node of Eq. 4.63 yields the nodal balance equation

$$\frac{1}{\Delta x}\left[J^g_{x+} - J^g_{x-}\right] + \frac{1}{\Delta y}\left[J^g_{y+} - J^g_{y-}\right] + \frac{1}{\Delta z}\left[J^g_{z+} - J^g_{z-}\right] + \Sigma_r^g\,\bar{\phi}^g = \bar{Q}^g \tag{4.70}$$

The incoming interface currents are the outgoing interface currents from adjacent nodes. It is possible to take into account the discontinuity factors of Eq. 4.47 to link together interface fluxes. To close the system, Eq.
4.65 is also multiplied by some weight functions $w_n(x)$ and integrated over $x \in [x_-, x_+]$. Using a constant weight function, $w_0(x) = 1$, leads to Eq. 4.70. In moment weighting, the next $N-2$ functions of the polynomial basis are used. In this case, the weights $w_1(x) = p_1(x)$ and $w_2(x) = p_2(x)$ will provide additional equations needed for the calculation of expansion coefficients. Flux values (node-averaged and face-averaged) can be eliminated in favor of the source and leakage terms, yielding a global interface current system of equations.

The solution of steady-state nodal diffusion equations follows the standard multigroup solver procedure: that is, inner group-by-group iterations nested in outer iterations where neutron sources are updated. Inner iterations consist of mesh sweeps through the domain up to the convergence of the interface currents for a given group. Further information on the classical NEM solvers can be found in the review paper of R.D. Lawrence [21].

4.4.2.2 Analytic Nodal Methods

In analytic nodal methods (ANM), an analytic solution to the one-dimensional Eq. 4.65 is sought. The derivation is based on the fact that hyperbolic sine and cosine functions are generic solutions of Eq. 4.65 with no source. Typical nodal solutions are a blend of polynomials and exponentials:

$$\Phi^g(x) = A^g\cosh\left(x/l_d^g\right) + B^g\sinh\left(x/l_d^g\right) + f^g_{\text{part}}(x) \tag{4.71}$$

where $l_d^g = \sqrt{D^g/\Sigma_r^g}$ is the well-known diffusion length [4]. Linear and quadratic approximations are normally employed for the sources and the transverse leakage terms, respectively. The function $f^g_{\text{part}}(x)$ is a second-degree polynomial $a_0^g + a_1^g\,p_1(x) + a_2^g\,p_2(x)$, computed as a particular solution of Eq. 4.65. The approximation of transverse leakage terms is of great help to reduce the complexity of ANM solvers. However, for nonrectangular (such as hexagonal) nodes, singular terms occur in the transverse leakages and these singularities have to be smoothed.
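The homogeneous part of Eq. 4.71 lends itself to a short numerical check. The sketch below assumes a one-group, source-free node with no transverse leakage and prescribed face fluxes (all numerical values are hypothetical); it fits the coefficients $A$ and $B$, evaluates the face currents with Fick's law, and verifies that the net leakage balances the removal rate, which is the one-dimensional form of the nodal balance. The function names are illustrative and do not come from any nodal code.

```python
import math


def anm_homogeneous(D, sigma_r, dx, phi_left, phi_right):
    """Return (A, B, ld) such that phi(x) = A cosh(x/ld) + B sinh(x/ld)
    matches the face fluxes at x = -dx/2 and x = +dx/2 (source-free node)."""
    ld = math.sqrt(D / sigma_r)      # diffusion length
    h = 0.5 * dx / ld                # half-width in units of diffusion lengths
    A = 0.5 * (phi_right + phi_left) / math.cosh(h)
    B = 0.5 * (phi_right - phi_left) / math.sinh(h)
    return A, B, ld


def flux(x, A, B, ld):
    """Analytic flux shape inside the node."""
    return A * math.cosh(x / ld) + B * math.sinh(x / ld)


def current(x, D, A, B, ld):
    """Net current from Fick's law, J = -D dphi/dx."""
    return -D / ld * (A * math.sinh(x / ld) + B * math.cosh(x / ld))


# Hypothetical node data: D in cm, sigma_r in cm^-1, dx in cm.
D, sigma_r, dx = 1.2, 0.03, 20.0
A, B, ld = anm_homogeneous(D, sigma_r, dx, phi_left=1.0, phi_right=0.8)

# Net leakage out of the node and removal rate integrated over the node.
net_leakage = current(0.5 * dx, D, A, B, ld) - current(-0.5 * dx, D, A, B, ld)
removal = sigma_r * 2.0 * A * ld * math.sinh(0.5 * dx / ld)
balance = net_leakage + removal      # vanishes for a source-free node
```

With a nonzero source or transverse leakage, the polynomial particular solution would have to be added before matching the face fluxes.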
In the analytic function expansion nodal method (AFENM), recently developed at KAIST, the transverse integration with an assumed shape is no longer necessary [22]. More nodal unknowns are needed to obtain a solvable system; AFENM thus includes the corner-point flux values as supplementary unknowns and corner-balance equations are needed to close the system of equations. Reference [22] offers a good review of the main principles and limitations behind the various modern nodal methods.

4.4.2.3 Core Harmonics and Modal Synthesis

Consider the following eigenvalue matrix system obtained after discretization of the transport or diffusion equation in energy, angle and space of a steady-state core problem [23]

$$\mathbf{M}\boldsymbol{\varphi} = \lambda\,\mathbf{F}\boldsymbol{\varphi} \tag{4.72}$$

When an adjoint solution $\boldsymbol{\varphi}^*$ is required, as for perturbation theory or kinetics, direct and adjoint calculations can be carried out simultaneously, as the eigenvalue is the same. Both eigenvectors are often combined by the Rayleigh ratio, which serves as an estimate for the eigenvalue for both systems

$$\lambda(\boldsymbol{\varphi}^*,\boldsymbol{\varphi}) = \frac{\langle\boldsymbol{\varphi}^*,\mathbf{M}\boldsymbol{\varphi}\rangle}{\langle\boldsymbol{\varphi}^*,\mathbf{F}\boldsymbol{\varphi}\rangle} \tag{4.73}$$

This ratio is stationary with respect to both $\boldsymbol{\varphi}$ and $\boldsymbol{\varphi}^*$. A technique similar to the well-known biconjugate gradient method developed for nonsymmetric matrices can be used to simultaneously solve direct and adjoint problems. The flux distribution in the core and its corresponding adjoint are the solutions (fundamental reactor mode) corresponding to the greatest real value $K = k_{\text{eff}} = 1/\lambda_0$. The core harmonics (direct and adjoint) are solutions that correspond to other eigenvalues $|K| = 1/|\lambda_n| < k_{\text{eff}}$; note that the other eigenvalues are not necessarily real or simple (that is, of algebraic multiplicity 1).

Modal synthesis is used for real-time core simulators. The trial functions used as "modes" are precalculated over the entire reactor core domain. These modes serve as a function basis for core perturbations or transient analysis. For example, let us consider a multigroup modal scheme.
Using the first $N$ natural harmonics specific to the reactor being studied, the time-dependent flux is synthesized by:

$$\Phi^g(t,\mathbf{r}) = \sum_{n=0}^{N}A_n^g(t)\,\phi_n^g(\mathbf{r})$$

4.4.3 Variational Formulation and Finite Elements

To allow the treatment of spatial variables in full-core calculations, current methods rely on many high-order accurate numerical schemes. In this section, we will choose (once more) the template diffusive-like problem [24]:

$$\begin{cases}\nabla\cdot\mathbf{J}(\mathbf{r}) + \Sigma(\mathbf{r})\,\phi(\mathbf{r}) = S(\mathbf{r})\\ \mathbf{J}(\mathbf{r}) + D(\mathbf{r})\,\nabla\phi(\mathbf{r}) = 0\end{cases} \tag{4.74}$$

in a reactor core domain $\mathbf{r} \in D$.

4.4.3.1 Classical Spatial Finite Elements

The classical way to express this problem as a bilinear form suitable for finite elements is to substitute the second equation into the first one, multiply by a test function $\psi$, and integrate over space. For the moment let us forget about $\mathbf{J}$ and focus on $\phi$. After integration by parts and assuming vanishing conditions $\phi(\mathbf{r}_b) = 0$ at the core's boundary (or at an extrapolated distance point) $\mathbf{r}_b \in \partial D$, this gives the following

$$\underbrace{\int_{\text{core}} d^3r\,\left\{D\,\nabla\phi\cdot\nabla\psi + \Sigma\,\phi\,\psi\right\}}_{a(\phi,\psi)} = \underbrace{\int_{\text{core}} d^3r\,S\,\psi}_{L(\psi)} \tag{4.75}$$

The bilinear form $a(\cdot,\cdot)$ defined on the left-hand side has nice properties: it is symmetric and coercive, and there is a unique solution in the Sobolev space $\phi \in H_0^1(\text{core})$ where $\phi$ and all its first derivatives are square integrable over the core. The classical Ritz–Galerkin method to approximate solutions of Eq. 4.75 consists of building finite subspaces $V_h$ and looking for a minimal solution of the functional

$$\mathcal{F}(\phi_h) = \frac{1}{2}a(\phi_h,\phi_h) - L(\phi_h) = \inf_{\varphi\in V_h}\left\{\frac{1}{2}a(\varphi,\varphi) - L(\varphi)\right\} \tag{4.76}$$

The core domain is generally seen as a partition of zones (or elements) in which local simple functions are defined and matched together at interfaces. The finite elements are built on these finite-dimensional subspaces, where the approximated flux is searched as a linear combination

$$\phi_h = \sum_n x_n\,\varphi_n$$

that will minimize the functional, and the best solution is the (unique) solution of the linear system

$$\begin{cases}\mathbf{A}\mathbf{x} = \mathbf{b}\\[1mm] a_{ij} = a(\varphi_i,\varphi_j) = \displaystyle\int_{\text{core}} d^3r\,\left\{D\,\nabla\varphi_i\cdot\nabla\varphi_j + \Sigma\,\varphi_i\,\varphi_j\right\}\\[3mm] b_j = L(\varphi_j) = \displaystyle\int_{\text{core}} d^3r\,\varphi_j\,S\end{cases} \tag{4.77}$$

Implemented in the 1980s for reactor core calculations, this method has been in use for several years. This is now known as the primal formulation of our diffusive template problem: the flux distribution is the unknown to be found from the first equation of Eq. 4.74, and the second equation acts as a constraint. In the dual formulation, the current has to be found using the second equation while the first equation acts as a constraint on the divergence of the current.

4.4.3.2 Mixed and Hybrid Finite Elements

Nowadays, weaker variational formulations are often used to better represent the lack of regularity of the flux distribution in the core. Let us consider again the basic template problem in Eq. 4.74 and transform both equations into a variational system. The first equation, simply multiplied by the test function $\psi$ and integrated over the core, leads to

$$\int_{\text{core}} d^3r\,\left\{\psi\,\nabla\cdot\mathbf{J} + \Sigma\,\phi\,\psi\right\} = \int_{\text{core}} d^3r\,S\,\psi$$

while the second equation can be transformed into

$$\int_{\text{core}} d^3r\,\left\{D^{-1}\,\mathbf{J}\cdot\mathbf{I} - \phi\,\nabla\cdot\mathbf{I}\right\} = 0$$

where $\mathbf{I}$ is a test vector function. The solution to this mixed problem does not only provide the flux distribution, but the combination of $(\phi,\mathbf{J})$ unknowns satisfying the variational system. This leads to a larger linear system, as the unknowns will be of the two kinds. However, this form is weaker because it involves a new functional space for currents [25]

$$H(\text{div};\text{core}) = \left\{\mathbf{I}\;\middle|\;\mathbf{I}\in\left[L^2(\text{core})\right]^3 \wedge \nabla\cdot\mathbf{I}\in L^2(\text{core})\right\}$$

with a well-defined trace on the boundary of the core. In order to close this variational system uncoupling the flux and current components, boundary conditions are also needed for the current component. To simplify this presentation, let us choose an essential boundary condition as $\mathbf{I}(\mathbf{r}_b)\cdot\mathbf{N}_{b,+} = 0$. The flux and the source term in the first equation need only to be square integrable on the core domain.
The symmetric form for this variational system is

\[ \begin{cases} \displaystyle\int_{\text{core}} d^3r\, \{ D^{-1}\, \mathbf{J} \cdot \mathbf{I} - \phi\, \nabla \cdot \mathbf{I} \} = 0 \\ \displaystyle\int_{\text{core}} d^3r\, \{ -\psi\, \nabla \cdot \mathbf{J} - \Sigma\, \phi\, \psi \} = -\displaystyle\int_{\text{core}} d^3r\, S\, \psi \end{cases} \tag{4.78} \]

It is possible to combine Eq. 4.78 into a mixed formulation

\[ \inf_{\mathbf{J} \in H_0(\text{div}; \text{core})}\; \sup_{\phi \in L^2(\text{core})} \int_{\text{core}} d^3r\, \left\{ D^{-1}\, \mathbf{J} \cdot \mathbf{J} - \Sigma\, \phi^2 - 2\, \phi\, \nabla \cdot \mathbf{J} + 2\, S\, \phi \right\} \]

where the system solution is a saddle point. The integration of various boundary conditions (reflection, zero incoming current, etc.) is easier than in the primal formulation, as the flux–current components are uncoupled. As the flux and source terms are only square integrable, the function basis is much easier to choose. With a core domain meshed into volumes $V_i$, the most obvious choice is to take the characteristic function of $V_i$, with a value of 1 over the volume and 0 otherwise, as the flux function basis. For a 3D Cartesian core mesh, an obvious (nontrivial) current basis would consist of linear components over each normal direction $J_x(\mathbf{r})$, $J_y(\mathbf{r})$, $J_z(\mathbf{r})$, while the other directions can be constant on each interface. This formulation can yield efficient calculation schemes, reducing the number of flux unknowns and enabling certain discontinuities at domain interfaces.

4.4.4 Putting It All Together into Reactor Codes

Reactor codes have been developed since the 1960s. Figure 4.9 indicates when some milestone codes appeared. Short descriptions are now provided:

FLARE, originally developed as an inexpensive 3D BWR simulator to determine core reactivity and power distribution, was a prototype for nodal codes [26].

CITATION, developed at Oak Ridge National Laboratory, is a generic code that solves various kinds of 3D multigroup diffusion problems (XYZ, RZΘ, etc.) [27].

QUABOX/CUBBOX uses a coarse-mesh flux expansion with multidimensional polynomials for the solution of the nodal balance equations [28].

QUANDRY, developed at MIT, implemented a full two-energy-group analytic nodal method with quadratic approximation of the transverse leakage [29].

DIF3D, originally developed by R.D. Lawrence at Argonne National Laboratory, uses nodal diffusion and transport methods for the analysis of fast reactors [30].

VARIANT, also developed at Argonne National Laboratory, includes a significantly expanded set of solution techniques using variational nodal methods [31].

Fig. 4.9 Milestones for introduction of reactor codes (timeline from 1960 to 2007: FLARE 1964, CITATION 1971, QUABOX/CUBBOX 1976, QUANDRY 1978, DIF3D 1982, VARIANT 1993)

4.5 Core Applications

The reactor-physics calculation for core tracking or other transient analysis is a multistep process, generally involving a core database. The hierarchical local parameter database, generated with a lattice cell code, is used to compute the nuclear properties associated with each cell according to its local parameters. Every fuel pin (or fuel bundle) can have its own properties, which depend not only on the instantaneous environment conditions surrounding the pin, but also on the historical effects that the fuel has experienced. Structural materials, reactivity mechanisms, and reflector properties must also be generated consistently within the same database, but these usually vary much less than the fuel.

The hierarchical reactor core database is generally built as follows [32]. Depending on the reactor type and core simulations, a set of local parameters ($\vec{l}_p$) of interest is first selected (temperatures, densities, etc.). Some nominal conditions for the reactor core are then identified, $(\vec{l}_0, \vec{h}_0)$. Detailed cell calculations are performed for these nominal conditions and homogenized reference cross sections $\Sigma_\alpha^{\text{ref}}(\vec{l}_0, \vec{h}_0)$ are saved. Then, multidimensional interpolation is used to obtain feedback coefficients for potential operational situations.
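The interpolation step can be sketched as follows. This is only an illustration under stated assumptions: the table has two hypothetical local parameters (fuel temperature and coolant density), the cross-section values are made up, and simple bilinear interpolation stands in for whatever scheme a production database such as that of [32] actually uses. The function name `interpolate_xs` is illustrative.

```python
import numpy as np

def interpolate_xs(table, t_grid, rho_grid, t_fuel, rho_cool):
    """Bilinear interpolation of a tabulated cross section at (t_fuel, rho_cool)."""
    # Locate the grid interval containing each parameter (clipped to the table range)
    i = np.clip(np.searchsorted(t_grid, t_fuel) - 1, 0, len(t_grid) - 2)
    j = np.clip(np.searchsorted(rho_grid, rho_cool) - 1, 0, len(rho_grid) - 2)
    # Fractional position inside the interval along each axis
    u = (t_fuel - t_grid[i]) / (t_grid[i + 1] - t_grid[i])
    v = (rho_cool - rho_grid[j]) / (rho_grid[j + 1] - rho_grid[j])
    return ((1 - u) * (1 - v) * table[i, j] + u * (1 - v) * table[i + 1, j]
            + (1 - u) * v * table[i, j + 1] + u * v * table[i + 1, j + 1])

# Hypothetical grids and homogenized absorption cross sections (1/cm), made-up values
t_grid = np.array([600.0, 900.0, 1200.0])   # fuel temperature (K)
rho_grid = np.array([0.6, 0.7, 0.8])        # coolant density (g/cm^3)
table = np.array([[0.0100, 0.0104, 0.0108],
                  [0.0102, 0.0106, 0.0110],
                  [0.0104, 0.0108, 0.0112]])

nominal = interpolate_xs(table, t_grid, rho_grid, 900.0, 0.7)
perturbed = interpolate_xs(table, t_grid, rho_grid, 750.0, 0.65)
feedback = perturbed - nominal  # correction relative to the nominal state
print(nominal, perturbed, feedback)
```

Bilinear interpolation ignores the mixed effects between parameters beyond the product term. This is one reason the text goes on to require perturbed cell calculations for every local parameter.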
A sensitivity analysis for cross-section variations must be undertaken for every local parameter: perturbed cell calculations are done and mixed effects must be tracked. The burnup and other history effects ($\vec{h}_b$) add another level to the hierarchy of data. Each perturbed state $(\vec{l}_p, \vec{h}_b)$ can affect the multigroup macroscopic cell cross sections either directly, by introducing new microscopic data, or via a perturbed cell flux. The perturbed cross sections can be written as

\[ \Sigma_\alpha^{\text{pert}}(\vec{l}_p, \vec{h}_b) = \sum_i \left( \sigma_{\alpha i} + \Delta\sigma_{\alpha i} \right) \left( N_i + \Delta N_i \right) = \Sigma_\alpha^{\text{ref}} + \sum_i \Delta\sigma_{\alpha i}\, N_i + \sum_i \sigma_{\alpha i}\, \Delta N_i + \sum_i \Delta\sigma_{\alpha i}\, \Delta N_i \]

Added to the reference cross section, there are three types of macroscopic cross-section corrections: the first term takes into account temperature effects and any spectrum perturbations; the second term is devoted to nuclide concentration variations, e.g., fuel burnup; and the third (second-order) term is for the mixed effects between the nuclide concentrations and the microscopic data variation.

In power reactors, a traditional macroscopic depletion model is not accurate enough for core analysis. On the other hand, data tabulation of every microscopic cross-section effect is not affordable. Core databases use a blend composition for representing cross sections:

\[ \Sigma_\alpha^{\text{core}}(\vec{l}_p, \vec{h}_b) = \Sigma_\alpha^{\text{MACRO}}(\vec{l}_p, \vec{h}_b) + \sum_{\text{micro}} \sigma_\alpha^{\text{micro}}(\vec{l}_p, \vec{h}_b)\, N^{\text{micro}} \]

where some microscopic data will be extracted from the cell calculations. The concentration value $N^{\text{micro}}$ can be adjusted during the core simulations. These adjustments are sometimes done by automatic regulating or control systems included in the core; otherwise, these concentrations can be driven by factors external to the depletion or history-based cell models. The isotopes extracted depend on the reactor type. For example, in LWRs, some actinides and burnable absorbers (such as gadolinium and boron) are always extracted for obvious reasons. In Fig.
4.10, a class diagram collecting a whole set of cell calculations is shown with some typical local parameters. In that sketch, some standard parameters are taken into account for generating the database, namely:

FuelCond includes the fuel conditions as fuelTemp, the fuel temperature; fuelDens, the fuel density; and fissProd, some fission production data.

CoolantCond includes the coolant conditions as coolDens, the coolant density; modTemp, the moderator temperature; and voidFrac, a void fraction.

ReactDev includes data for the reactor-driven mechanisms as spacers, detectors, and poisonRods.

Fig. 4.10 Building of a reactor core database (UML class diagram; classes: IntCoreDB with fitHomXS, fitIntProp, and unitCalc : UnitCalc; CoreSimulation; UnitCalc with latticeGeom[1], deplState[1..*] : DepletionState, and concHet[0..*]; DepletionState; FuelCond; CoolantCond; ReactDev)

The preceding parameters will influence the behavior of different cell patterns present in the reactor core. Each unit cell calculation sequence is organized in a class UnitCalc identified by a specific lattice cell geometry latticeGeom. The various concentrations concHet of nuclides will vary depending upon the parameter values and the depletion state of the cell. The reactor database contains the nuclear properties of all these unit cell calculations and acts as an interface for restoring these properties in core simulations.

4.5.1 Pin Power Reconstruction in LWR Reactors

LWR design has evolved over the years. MOX fuel and extended cycles have pushed the need for more sophisticated reactor analysis tools. The calculation of pin powers when tracking an LWR core is still a very challenging problem for present-day computer resources. A modern LWR core can contain more than 200 assemblies of 17 × 17 pins.
In addition to various transient safety analyses, LWR core studies usually include simulations for setting operation margins and for core monitoring. In the late 1980s, pin power reconstruction capability was added to the SIMULATE-3 nodal reactor analysis code [33]. In this example, Studsvik's CASMO-4/SIMULATE-4 calculation scheme will be presented [34].

In SIMULATE-4, the global core model is based on the multigroup diffusion equation solved by the analytic nodal method. The diffusion equations are integrated over the transverse directions with the transverse leakage approximated by a quadratic fit. This gives a full set of 1D multigroup equations where cross sections inside each node are considered uniform. The analytic nodal method is not sensitive to the mesh-spacing limitations that could occur in a finite difference approach and can provide reliable 3D flux solutions.

As material discontinuities can appear in the axial direction, a separate multigroup 1D diffusion equation is solved for each fuel assembly. The influence of neighboring assemblies is taken into account by radial leakage. In the radial direction, the core is split into slices and full 2D calculations are performed with the SP3 approximation using N × N submeshes for each fuel assembly. Assuming zero net current at the fuel assembly boundary, CASMO-4 is used to generate the cross sections $\Sigma^{\text{CASMO}}$ and discontinuity factors homogenized for each group and each submesh. For each fuel assembly, an equivalent SP3 calculation is done using these homogenized parameters and zero net current. The assembly cross sections $\Sigma^{\text{SA}}$ and the side-average discontinuity factors obtained from this last calculation are computed and used to correct the node-average parameters. The global model stitches the 2D submesh slices together with the corrected cross sections.

The pin power is reconstructed in the following way.
For each group and each submesh, the nodal flux is assumed to be given by

\[ \phi_g^{2D}(x, y) = P(x) + Q(y) + a_x e^{-K_x x} + b_x e^{+K_x x} + a_y e^{-K_y y} + b_y e^{+K_y y} \]

where $P(x)$ and $Q(y)$ are polynomials of degree 2 and the coefficients $K$ are obtained in the global solver. This nodal equation is consistent with the development presented in Section 4.3.2.2 on analytic nodal methods. To account for the heterogeneity of the submesh, the form factors are superimposed on the homogeneous $\phi^{2D}$ recovered from the solver. This gives the following expression for the pin powers:

\[ P_i = \sum_{g=1}^{G} \frac{p_{i,g}^{\text{CASMO}}}{p_{i,g}^{\text{SA}}}\; p_{i,g}^{2D} \]

4.5.2 Estimates of Zonal Powers in CANDU Reactors

CANDU reactors are refueled at power [13]. If a nuclear engineer has to select channels to refuel today to sustain the total power, she has to take into account the perturbation to the flux shape in the core. The 14 liquid zone controllers of CANDU reactors (see Fig. 4.11) serve as zone control units that can be emptied or filled with light water upon request of the regulating system.

Fig. 4.11 Control power zones in a CANDU core. The fill patterns indicate the zones controlled by the liquid zone controllers, which are the black needles. There are seven zones in the front part of the reactor and seven others in the rear part. The fuel channels are horizontal.

Let us say that our nuclear engineer estimates that 16 new bundles of fresh fuel are the target; where should she place these bundles? The characteristics of interest here are the zonal power fractions, which are defined for the unperturbed state by:

\[ p_\ell^o = \frac{\left\langle \Sigma_f^o, \phi_o \right\rangle_{Z_\ell}}{\left\langle \Sigma_f^o, \phi_o \right\rangle_{\text{core}}} \]

where $\phi_o$ is the unperturbed flux and $Z_\ell$ is the volume of control zone $\ell$ ($\ell = 1, \ldots, 14$). We wish to obtain the variation in zonal power fractions caused by a single refueling perturbation. For example, refueling in channel $j$ will produce perturbations to the power fractions $\delta p_\ell^{(j)}$.
The perturbation will depend on the perturbation to the system matrices $M$ and $F$ on the left- and right-hand sides of Eq. 4.68 and the variation $\delta\Sigma_f$ in energy production involved during refueling. Since there are 380 fuel channels and the refueling scheme could be a combination in batches of four or eight bundles in a channel, selecting the channels using a full-core reactor calculation model can lead to millions of core combinations, each one comprising a small perturbation.

Generalized perturbation theory can be used for evaluating the power fraction increments to second-order accuracy with regard to variations in the original core model. In that particular case, the generalized adjoints $\Gamma_\ell^*$ are obtained by solving, for each control zone $\ell$, an adjoint source problem:

\[ \left( M^* - \lambda_o F^* \right) \Gamma_\ell^* = \frac{\left( \chi_{Z_\ell} - p_\ell^o \right) \Sigma_f^o}{\left\langle \Sigma_f^o, \phi_o \right\rangle_{\text{core}}} \]

where $\chi_{Z_\ell}$ is 1 inside zone $Z_\ell$ and 0 otherwise. This problem is much simpler to solve than the perturbed equation since it does not involve another eigenvalue search, but rather a singular source problem. The unique solution to this problem is a generalized adjoint; by the Fredholm alternative, it must comply with $\left\langle \Gamma_\ell^*, F \phi_o \right\rangle = 0$. Using this solution, estimates of the power perturbation are provided by bilinear products

\[ \delta p_\ell^{(j)} \simeq \frac{\left\langle \left( \chi_{Z_\ell} - p_\ell^o \right) \delta\Sigma_f^{(j)}, \phi_o \right\rangle_{\text{core}}}{P_o^{\text{tot}}} - \left\langle \Gamma_\ell^*, \left( \delta M^{(j)} - \lambda_o\, \delta F^{(j)} \right) \phi_o \right\rangle_{\text{core}} \]

where $P_o^{\text{tot}}$ is the total unperturbed reactor power. The above procedure has been found very reliable compared to reference eigenvalue calculations of the perturbed core configurations. To obtain the perturbed reference calculations, the engineer may think that it is easier to restart the calculation from the original state $\{ \lambda_o, \phi_o \}$. However, a very accurate convergence criterion must be used because variations in the eigenvalue are small compared to the full-core properties. More results concerning this methodology can be found in [35].
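The generalized perturbation estimate of this section can be checked on a toy problem. The following is a minimal sketch under loud assumptions, not the CANDU model of [35]: a one-group, 1D finite-difference slab with 30 cells and zero flux at both ends, made-up cross sections, a single "zone" made of the first ten cells, power taken proportional to the production cross section, and a hypothetical refueling perturbation in three cells. All function and variable names are illustrative.

```python
import numpy as np

def operators(sig_a, nu_sig_f, D=1.0, h=1.0):
    """Finite-difference loss matrix M and production matrix F (zero flux at both ends)."""
    n = len(sig_a)
    M = np.diag(2.0 * D / h**2 + sig_a)
    M -= (D / h**2) * (np.eye(n, k=1) + np.eye(n, k=-1))
    return M, np.diag(nu_sig_f)

def fundamental(M, F):
    """Return (lambda, phi) of M phi = lambda F phi for the largest k = 1/lambda."""
    k, vecs = np.linalg.eig(np.linalg.solve(M, F))
    i = np.argmax(k.real)
    phi = vecs[:, i].real
    return 1.0 / k[i].real, phi / phi.sum()  # normalized, positive flux

def zone_fraction(power_xs, phi, chi):
    """Zonal power fraction <chi * power_xs, phi> / <power_xs, phi>."""
    return np.dot(chi * power_xs, phi) / np.dot(power_xs, phi)

n = 30
sig_a = np.full(n, 0.020)      # made-up absorption cross section (1/cm)
nu_sig_f = np.full(n, 0.030)   # made-up production cross section (1/cm)
chi = np.zeros(n)
chi[:10] = 1.0                 # characteristic function of the zone

M, F = operators(sig_a, nu_sig_f)
lam, phi = fundamental(M, F)
_, phi_adj = fundamental(M.T, F.T)
p0 = zone_fraction(nu_sig_f, phi, chi)

# Generalized adjoint: singular source problem, then enforce <gamma, F phi> = 0
A = M - lam * F
source = (chi - p0) * nu_sig_f / np.dot(nu_sig_f, phi)
gamma = np.linalg.lstsq(A.T, source, rcond=1e-10)[0]
gamma -= np.dot(gamma, F @ phi) / np.dot(phi_adj, F @ phi) * phi_adj

# Hypothetical refueling perturbation in cells 6..8
d_sig_a = np.zeros(n)
d_sig_a[6:9] = -0.01 * 0.020
d_nu_sig_f = np.zeros(n)
d_nu_sig_f[6:9] = 0.02 * 0.030
dM, dF = np.diag(d_sig_a), np.diag(d_nu_sig_f)

# Bilinear estimate: direct term minus the generalized-adjoint term
dp_gpt = (np.dot((chi - p0) * d_nu_sig_f, phi) / np.dot(nu_sig_f, phi)
          - np.dot(gamma, (dM - lam * dF) @ phi))

# Reference: full eigenvalue recalculation of the perturbed slab
M1, F1 = operators(sig_a + d_sig_a, nu_sig_f + d_nu_sig_f)
_, phi1 = fundamental(M1, F1)
dp_exact = zone_fraction(nu_sig_f + d_nu_sig_f, phi1, chi) - p0
print(p0, dp_gpt, dp_exact)
```

The adjoint solve is done once per zone and reused for every candidate perturbation, whereas the reference value needs a new eigenvalue calculation each time. That reuse is what makes screening many refueling combinations affordable.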
4.5.3 Teaching Modern Reactor Core Methods

In a recent graduate course given at École Polytechnique, the reactor core was presented using a project approach. The challenge is to motivate students in collecting input data, defining core simulations, and extracting and analyzing pertinent outputs. In order to teach modern reactor core analysis, some software tools used with reactor data are necessary. At École Polytechnique, we use our own reactor-physics chain of codes: DRAGON, a lattice cell and supercell code, and DONJON, a reactor code [36,37]. Both are open-source codes that can be used with WLUP libraries [38].

The choice of a "real" reactor core for simulation is difficult: the core must be simple enough so that the student is not overwhelmed by modeling details. Among many research reactors that have been used for neutron physics studies (production of radioisotopes and testing of fuel samples), the Argonne National Laboratory Chicago Pile-Five (CP-5) (see Fig. 4.12) was chosen. It is a small heavy-water graphite research reactor built in 1952 and abandoned in 1979. The CP-5 decontamination and decommissioning was initiated in 1991 and completed in 2000. The whole life cycle of the core is thus completed, so that the student can be confident that this core is entirely "feasible."

Fig. 4.12 View of the CP5 Argonne reactor building (with kind permission of D&D – Nuclear Engineering Division of Argonne) (http://www.dd.anl.gov/projects/cp5.html)

The CP-5 reactor used highly enriched uranium elements in a tank of heavy water; the core has two reflectors: heavy water and graphite. The basic core layout is simple:

There are 17 loading positions for inserting fuel elements.

Two feet of heavy water acts as a first reflector on the sides and bottom of the core (2.5 ft on top).

Two feet of graphite acts as a second reflector on the sides and bottom.

The core layout is depicted in Fig. 4.13.

Fig. 4.13 CP5 reactor core with its two reflectors (labeled regions: D2O moderator, D2O reflector, graphite reflector). The number of fuel elements in the core varies with time from 12 (white pins only) up to 16 positions (grayer pins). The central black-pin location is generally not loaded with fuel

Students are asked to represent the core with different models using a fixed microscopic library. For the project, the functional requirements are the following:

Cell models in 2D transport are done in order to recover nuclear properties for the various material zones and important isotopes.

Core models in 3D transport are done using Cartesian grids based on the fuel cells and preserving volumes for each different material region.

The core analysis provided must report on:
– Convergence analysis of core models.
– Study of critical load and core evolution.
– Calculation of the temperature feedback coefficient.

Among the quality requirements, consistency and physical sense of the analysis are most important; explanation of key ideas, report presentation, and conciseness are also expected.

Students learn reactor analysis through the use of codes. They must describe and input materials and geometries; they must choose mesh splitting and options in the numerical solvers. Homogenized and condensed nuclear properties are obtained after the 2D cell calculation from collapsing the fuel and gap diluted with a quantity of heavy water to obtain a volume equivalent to the square pitch of 6 in. Then, these properties are sent to the 3D core model for studying core behavior.

Fig. 4.14 Thermal flux at the CP5 reactor core center

As an example of output, Fig. 4.14 gives the thermal flux at the center of the 12-element core during start-up using 1/4-symmetry.
Most students have found consistent results for the temperature feedback coefficient (about 0.04 ΔK/K/°C) and the spatial core convergence analysis is generally well done. Major defects found in the analyses are related to the fact that some students do not preserve reaction rates when going from the cell to the core model. For example, they may input too much (or not enough) fuel in the core model by using improper dilution factors.

4.6 Concluding Remarks

The future developments in reactor analysis methods are difficult to foresee. One may speculate that the most probable development paths will depend on the availability of CPU resources. Hardware resources are still growing exponentially, but the clock speed of a single-processor node has now reached a limit where power consumption begins to be a severe handicap. The design of high-end desktop computer nodes is now based on dual or quad cores with three levels of cache. Moreover, interconnection network technology (copper or fiber optics) has also improved considerably, providing smaller latency and increased effective bandwidth. A wide range of small-scale clusters is now available at affordable cost, and these clusters can significantly reduce the processing times needed for reactor analysis.

However, the investment involved in producing large-scale parallel systems is still substantial, and most of these systems do not fit all scientific needs. High-performance computing will be more and more based on the efficient combination of various parallelisms: not only instruction-level parallelism (branch prediction, loop unrolling, etc.), but also thread-level parallelism and message passing. A well-balanced strategy for assigning computing tasks and data to processors will have a large positive influence on performance.

Let us go back to the context of reactor physics. More and more demanding reactor core analyses will require optimization and multiphysics strategies.
Multiscale computations will be necessary to achieve efficient calculations with the large number of unknowns needed to represent modern reactor cores. Using the increased power provided by multiple interconnected nodes, future reactor analysis methods shall efficiently use parallel computing tools in an integrated and consistent multiphysics environment. A number of core simulations can potentially benefit from hybrid shared-distributed parallelism, where a distributed-memory machine has embedded symmetric multiprocessor nodes. Inside each node, a shared-memory programming model is applied. Between nodes or at the cluster level, message passing is used. Functional and domain decomposition techniques provide ways to distribute the reactor core domain across cluster nodes. Dynamic graph partitioning tools will be used to map core unknowns onto the nodes, and adaptive multigrid schemes will allow nuclear engineers to obtain accurate solutions to reactor problems. In the coming years, the reactor analyst will also need good graphics rendering and data mining tools to extract useful information from large amounts of data.

More than ever, the teaching of reactor core analysis is far more complex than the simple understanding of some basic physical models. Before understanding core behavior for design or operation purposes, the apprentice physicist or nuclear engineer will have to deal with many computational simulations involving a great amount of data. The use of modern software tools, with their complex input and output files, carries the risk of a loss of physical insight and common sense. From the apprentice's point of view, these complex simulations may appear as black box operations with (or without) magic results. The teaching of reactor physics must help the apprentice to fill the gap between physical knowledge and numerical results. Validation and verification (V&V) steps are more than ever necessary for complex core simulations.
The steep learning curve to acquire V&V aptitudes is certainly the most important challenge in reactor physics education, and these aptitudes are certainly required if we want to succeed in sustaining new reactor designs over the coming years.

Final note: The Unified Modelling Language (UML) was invented to help understand and communicate complex systems with fewer ambiguities. The UML diagrams (Figs. 4.2–4.5 and 4.9) shown in this chapter are based on the book of Jon Holt [39]. Nuclear engineering is a type of systems engineering where physical and nontangible artifacts (such as software and data processing) collaborate in core modeling and, in such integrated complex systems, good software engineering techniques are needed to ensure safety and reliability.

References

1. Williams MMR (2003) NEA-1706/01 CD package. Canadian and early British Energy Reports on Nuclear Reactor Theory (1940–1946). Nuclear Energy Agency Data Bank, OECD, Paris. Available at http://www.nea.fr/abs/html/nea-1706.html
2. Davison B (with coll. Sykes JB) (1957) Neutron transport theory. Oxford University Press, Oxford, England
3. Lee CE (1962) The discrete Sn approximation to transport theory. Report LA-2595, Los Alamos Scientific Laboratory, New Mexico, USA
4. Bell GI, Glasstone S (1970) Nuclear reactor theory. Van Nostrand Reinhold, New York
5. MacFarlane RE, Boicourt RM (1975) NJOY: a neutron and photon processing system. Trans Am Nucl Soc 22:720
6. Askew JR (1972) A characteristics formulation of the neutron transport equation in complicated geometries. Report AEEW-M 1108. United Kingdom Atomic Energy Establishment, Winfrith, England
7. Roy R (1996) Application of the Bn theory to unit cell calculations. Nucl Sci Eng 123:358–368
8. Bussac J, Reuss P (1978) Traité de neutronique. Hermann, Paris, France
9. Honeck HC (1961) THERMOS, a thermalization transport theory code for reactor lattice calculations.
Report BNL-5826, Brookhaven National Laboratory (code available at NEA data bank: http://www.nea.fr/abs/html/nea-0043.html)
10. Askew JR, Fayers FJ, Kemshell PB (1966) A general description of the lattice code WIMS. J Br Nucl Energy Soc 5:564–585
11. Hoffman A, Jeanpierre F, Kavenoky A, Livolant M, Lorrain H (1973) APOLLO: Code Multigroupe de résolution de l'équation du transport pour les neutrons thermiques et rapides. Report CEA-N-1610. Commissariat à l'Énergie Atomique, Paris, France
12. Ahlin A, Edenius M (1977) CASMO – a fast transport theory assembly depletion code for LWR analysis. Trans Am Nucl Soc 26:604–605
13. Roy R, Marleau G, Tajmouati J, Rozon D (1994) Modelling of CANDU reactivity control devices with the lattice code DRAGON. Ann Nucl Energy 21:115–132 (code available at NEA data bank: http://www.oecdnea.org/abs/html/ccc-0647.html)
14. Rahlfs S, Rimpault G, Ribon P, Finck P (1994) Recent developments of the sub-group method for use in the European cell code ECCO. Algorithms and Codes for Neutronics Calculations. Obninsk, Russia, 25–27 October
15. DeHart MD, Gauld IC, Williams ML (2007) High-fidelity lattice physics capabilities of the SCALE code system using TRITON. Proc. Math. & Comp. and Supercomputing in Nuclear Applications (M&C + SNA2007), 15–19 April
16. Koebke K (1981) Advances in homogenization and dehomogenization. Int. Top. Mtg advances in mathematical methods for the solution of nuclear engineering problems. München, Germany, 27–29 April
17. Smith KS (1980) Spatial homogenization methods for light water reactor analysis. Ph.D. thesis, Department of Nuclear Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
18. Smith MA, Lewis EE, Na BC (2003) Benchmark on deterministic 2-D/3-D MOX fuel assembly transport calculations without spatial homogenization (C5G7 MOX Benchmark). Report NEA/NSC/DOC(2003)16, OECD/NEA, Paris, France
19. Lewis EE, Palmiotti G, Taiwo T (1999) Space-angle approximations in the variational nodal method.
Proc. Math. & Comp., Reactor physics and environmental analysis in nuclear applications. Madrid, Spain, 27–30 September
20. Gelbard EM (1961) Simplified spherical harmonics equations and their use in shielding problems. Report WAPD-T-1182. Bettis Atomic Power Laboratory, West Mifflin, PA, USA
21. Lawrence RD (1986) Progress in nodal methods for the solution of the neutron diffusion and transport equations. Prog Nucl Energy 17(3):271–301
22. Cho NZ (2005) Fundamentals and recent developments of reactor physics methods. Nucl Eng Tech 37(1):25–78
23. Rozon D (1992) Introduction à la cinétique des réacteurs nucléaires. Presses de l'École Polytechnique de Montréal, Québec (translated to English as Introduction to Nuclear Reactor Kinetics (1998))
24. Hennart JP (1999) From primal to mixed-hybrid finite elements: a survey. Proc. Math. & Comp., Reactor physics and environmental analysis in nuclear applications. Madrid, Spain, 27–30 September
25. Brezzi F, Fortin M (1991) Mixed and hybrid finite element methods. Springer, New York
26. Delp DL, Fisher DL, Harriman JM, Stedwell MJ (1964) FLARE – a three-dimensional boiling water reactor simulator. Report GEAP-4598, General Electric Company (code available at NEA data bank: http://www.nea.fr/abs/html/nesc0167.html)
27. Fowler TB, Vondy DR, Cunningham GW (1971) Nuclear reactor core analysis code CITATION. Report ORNL-TM-2496, Oak Ridge National Laboratory, USA (code available at NEA data bank: http://www.nea.fr/abs/html/nesc0387.html)
28. Langenbuch S, Velkov K, Pevec D, Grgic D (1996) Capability of the QUABOX/CUBBOX-ATHLET coupled code system. Int. Conf. Nucl. Option in Countries with small and medium electricity grids. Opatija, Croatia, 7–9 October
29. Greenman G, Smith KS, Henry AF (1979) Recent advances in an analytic nodal method for static and transient reactor analysis. Comp. Methods in Nucl. Eng. Williamsburg, VA, USA, 23–25 April
30. Derstine KL (1982) DIF3D: a code to solve one-, two-, and three-dimensional finite difference diffusion theory problems. Report ANL-82-64, Argonne National Laboratory, USA
31. Palmiotti G, Carrico CB, Lewis EE (1993) Variational nodal methods with anisotropic scattering. Nucl Sci Eng 115:223–243
32. Sissaoui MT, Marleau G, Rozon D (1999) CANDU reactor simulations using the feedback model with Actinide burnup history. Nucl Tech 125:197–212
33. Rempe KR, Smith KS, Henry AF (1989) SIMULATE-3 pin power reconstruction: methodology and benchmarking. Nucl Sci Eng 103:334–342
34. Bahadir T, Lindahl S-T, Palmtag SP (2005) SIMULATE-4 multigroup nodal code with microscopic depletion. Math. and Comp., Supercomputing, reactor physics and nuclear and biological applications. Avignon, France, 12–15 September
35. Rozon D, Varin E, Roy R, Brissette D (1997) Generalized perturbation theory estimates of zone level response to refuelling perturbations in a CANDU600 reactor. Advances in nuclear fuel management. Myrtle Beach, USA, 23–26 March
36. Varin E, Hébert A, Roy R, Koclas J (2005) A user guide for DONJON 3.01, Report IGE-208 Rev. 1. Institut de génie nucléaire, École Polytechnique de Montréal, Québec
37. Marleau G, Hébert A, Roy R (2006) A user guide for DRAGON 3.05C, Report IGE-174 Rev. 6C. Institut de génie nucléaire, École Polytechnique de Montréal, Québec
38. Trkov A, Leszczynski F, Lopez Aldama D (2006) WIMS-D library update, final report of a co-ordinated research project, IAEA, Vienna (WLUP libraries available at NEA data bank: http://www.nea.fr/abs/html/iaea1408.html)
39. Holt J (2004) UML for Systems Engineering – 2nd Edn. IEE Prof Appl Comp Ser 4

Professor Robert Roy graduated from Université de Montréal with an M.Sc. degree in Mathematics in 1981. Then, he studied in France, earning a D.E.A. in Numerical Analysis. In 1984, he had the chance to work briefly at E.D.F.
Clamart where he was introduced to nuclear engineering by Gérard LeCoq and his team. He returned to École Polytechnique de Montréal (ÉPM) and received his Ph.D. degree in Nuclear Engineering in 1987. His thesis research involved the development of 3D transport models for reactivity devices in CANDU reactors. He presented a paper at the Topical Meeting on Advances in Reactor Physics, Mathematics and Computation, Paris (1987) and won the best student paper award. He was invited to join the S.E.R.M.A. team at C.E.A. Saclay where he spent a postdoctoral year working on the first release of Apollo-2 with Richard Sanchez and Zarko Stankovski. Returning to ÉPM, he spent 10 years as a research scientist at the Nuclear Engineering Institute. While his research work remained focused on transport theory, he made important contributions to the development and validation of the lattice code DRAGON along with Alain Hébert and Guy Marleau of the Institute. He is also one of the authors of the open-source reactor-physics codes DRAGON/DONJON used by the Canadian nuclear industry. In 2001, he decided to join the new Department of Computer Engineering at ÉPM, where he has been the first coordinator for a new undergraduate program in software engineering. Professor Roy is well known for his contributions to the development of parallel algorithms, in particular, for neutron transport applications. He has supervised seven Ph.D. and 14 M.Sc. students, more than half of whom are enrolled in Nuclear Engineering. Presently, he is a professor in the Department of Computer and Software Engineering at ÉPM, where he leads the DRAP research laboratory that exploits various computer cluster facilities available for students. He is also a member of the control board of the Réseau québécois de calcul haute performance, a network providing the most powerful computer resources in Québec.

Chapter 5 Resonance Theory in Reactor Applications

R.N. Hwang

5.1 Introduction

The most essential objective in reactor physics is to provide an accurate account of the intricate balance between the neutrons produced by the fission process and those lost due to the absorption process as well as those leaking out of the reactor. The presence of resonance structures in neutron cross sections obviously plays an important role in such processes. Therefore, the treatment of neutron resonance phenomena has constituted one of the most fundamental subjects in reactor physics since its conception. It is the area where the concepts of nuclear reaction and the treatment of the neutronic balance in reactor lattices over a wide span of energy become intertwined. The basic issue here is how to apply the microscopic neutron cross sections in the macroscopic reactor systems. Because of its importance to reactor physics, much of the existing nuclear data and a significant portion of all cross-section processing codes downstream are devoted to the treatment of resonance phenomena prior to any meaningful neutronic calculations via either the deterministic or Monte Carlo approaches.

The primary purpose of this chapter is to provide an overview of resonance theory in reactor applications as it evolved through the last 50 years. The subject under consideration can be divided into four major topics. First, various practical cross-section representations currently in use and their theoretical bases will be described. Second, various analytical and numerical methods for Doppler-broadening of cross sections will be discussed. Third, various methods for treating the resolved resonances in both homogeneous media and heterogeneous reactor lattices will be presented. Fourth, statistical methods for treating the unresolved resonances in the high energy region where individual resonance parameters are not available will also be given.
For the sake of clarity, the historical perspective on this subject and some fundamental issues will be discussed briefly prior to other specific topics.

(R.N. Hwang: deceased.) Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_5

5.1.1 Historical Perspective

It all started with the practical interest in resonance absorption due to low-lying s-wave resonances of $^{238}$U and their implications for the feasibility of a natural uranium-fueled system to achieve a self-sustaining chain reaction [1, 2]. Of particular interest were two early observations which were later identified as the natural consequences of the so-called self-shielding effects. The first was the reduction in absorption rate per atom for neutrons slowed down in a water and uranium mixture as the uranium concentration was increased. The second was that a substantial reduction in resonance absorption was possible if natural uranium was made into a lump surrounded by moderator. In fact, the latter discovery is commonly recognized as one of the key factors that signaled the beginning of the nuclear era. Practical needs have since provided the motivation for better understanding of the resonance phenomena exhibited by neutron cross sections and their effects on various reactor physics parameters of practical interest.

To provide better understanding of the sequence of discussions to follow, a brief outline is presented on how resonance theory has evolved in conjunction with three main periods of reactor development, namely, the period of earlier thermal reactor development, the period of emergence of the fast reactor program, and the current period of uncertainty.

In the period of earlier thermal reactor development, efforts were focused on a handful of low-lying s-wave resonances of a few actinides.
The extent of their absorption depends on the number of high energy neutrons that can be slowed down to reach thermal energy via elastic scattering process. The fraction of neutrons that actually reaches the thermal energy was referred to as resonance escape probability. In the West, the first practical method for estimating the resonance escape probability in a reactor cell was pioneered by Wigner et al. in the early 1950s [2, 3]. A great deal of work in this area also began to emerge in the 1950s and most notable was the work by Chernick et al. [4, 5] that provided a concise description of resonance absorption in a two-region cell as neutrons were slowed down. One early book on resonance absorption that summarized a great deal of work carried out in the USA was that of Dresner published in 1960 [6]. At approximately the same time period, the issue was also examined in Russia, and the most representative works were those of Gourevich and Pomeranchouk [7]. A collection of papers on this subject that followed can be found in a book edited by Marchuk [8]. Two books on this subject that covered much of the developments up to the early 1970s were subsequently published by Lukyanov [9, 10]. The conceptual basis and computational methodologies began with relatively simple models not only in the cross-section representation but also in dealing with the localized flux as neutrons slowed down past the resonances in question. Two obvious reasons for this are attributed to the general lack of good quality resonance data and efficient computational tools. Consequently, efforts were focused on low-lying resonances of a few actinides with known parameters evaluated on the basis of the Breit–Wigner approximation [11]. 
From the practical point of view, the relative ease of such an approximation, by which the microscopic cross sections can be concisely expressed as a function of energy, made possible the development of various plausible means of treating the macroscopic effects due to the presence of resonance structures in reactor physics calculations. These include various practical approximations to be discussed in sections to follow, notably, the analytical expression for Doppler-broadening, the resonance integral concept, viable approximations for the localized flux in the presence of a resonance, and the rational approximation for treating the collision probability.

In the early 1960s when the liquid metal fast breeder reactor began to emerge as an attractive alternative for future generation reactors, there was a general shift of interest in resonance theory to the relatively high energy region, and many intermediate weight nuclides began to play an important role in addition to the major and minor actinides. Because of safety considerations [12], a great deal of emphasis began to focus on the accurate estimation of the Doppler coefficient and the sodium void coefficient, both of which relied heavily on extensive knowledge of resonance data for a large number of nuclides and on the ability to compute those reactor parameters arising directly from the presence of resonance structures. Such practical needs motivated a great deal of development in the area of resonance data evaluations as well as in method development. From the consideration of nuclear data and their appropriate representations, several developments are noteworthy.
One important milestone was the formation of the “Cross Section Evaluation Working Group” (CSEWG) in 1966, which was intended to serve two important functions: (1) to provide the reference nuclear data base known as the ENDF/B files; and (2) to motivate continuous improvement of nuclear data, of which resonance data constitute a major portion. With improvement in high resolution measurements, the time had come to venture beyond the traditional Breit–Wigner approximation [11]. More accurate cross-section representations gradually began to emerge [13]. At the same time, the availability of data made possible a significant extension of the resolved resonance ranges, especially for various major actinides and structural isotopes of interest. Thus, from considerations of reactor physics calculations downstream, one must not only deal with the improved cross-section formalisms but also cope with an increasingly large number of resonances with potentially different characteristics. This was particularly true when the exceedingly wide s-wave resonances spanning several hundred keV were considered. Because of the emphasis on the relatively high energy region, the statistical treatment of unresolved resonances began to play an important role. Many methods in this area were developed, as will be elaborated in the discussions to be presented in this chapter. Furthermore, various challenging issues were associated with the need to analyze various experiments performed in fast critical assemblies, which were constructed as experimental facilities to verify the ability to estimate important fast reactor parameters. Unlike traditional reactors, thermal or fast, the lattices of these assemblies were highly heterogeneous and special considerations were required. Therefore, there was a large body of work in this area that will be covered in the discussions to follow.
The period from the early 1990s, when the fast reactor program was cancelled in the USA, to the present can be viewed as a period of uncertainty for the future of nuclear energy. Apparently, the reactor program has reached a crossroads. Perhaps less impact is felt elsewhere in the world. There is no question that nuclear research and development in the USA have suffered. It is unfortunate that the lack of funding also reduces the traditional interest within the nuclear community in the pursuit of rigor and better understanding of the basic concepts essential to the future of nuclear energy. The dilemma is that a great many difficult issues encountered previously could probably be resolved today given better knowledge of nuclear data, improved methodologies, and the availability of efficient but less costly computational facilities. From the perspective of resonance theory, there are still some interesting works worth noting. They will be described in later sections where appropriate.

5.1.2 Self-shielding Effects in Perspective

Aside from the Doppler-broadening effect that one inevitably encounters in all practical problems involving cross sections, one macroscopic effect specifically associated with reactor applications is the so-called self-shielding effect. It is the effect that the localized fluctuations in neutron cross sections resulting from resonance structures have on the averaged reaction rates of a reactor cell in energy and space at a given temperature. With no loss of generality, one convenient way is to view the self-shielding concept within the context of the commonly used multigroup approach in reactor calculations. One essential principle of this approach is the separation of the fine structure effect attributed to resonance structures from the global neutronic calculations of the entire reactor.
This can be best accomplished via the use of the effective group cross sections for each nuclide and reaction type at a given cell in the reactor lattice, defined as

\[ \tilde{\sigma}_x = \frac{\langle \sigma_x(E,\vec{r},T)\,\phi(E,\vec{r},T) \rangle_{E,\vec{r}}}{\langle \phi(E,\vec{r},T) \rangle_{E,\vec{r}}} \tag{5.1} \]

where the width of the group is usually taken to be much greater than the extent of the resonance for actinides. One quantity widely used as a measure of the self-shielding effect is the self-shielding factor, defined as the ratio of the effective cross section and its corresponding value at infinite dilution. Conceptually, its physical meaning can be better understood if it is cast into a somewhat unconventional form [14],

\[ f_x = \frac{\tilde{\sigma}_x}{\langle \sigma_x(E) \rangle_E} = 1 + \frac{\mathrm{COV}[\sigma_x(E,\vec{r},T),\,\phi(E,\vec{r},T)]}{\langle \sigma_x(E) \rangle_E\,\langle \phi(E,\vec{r},T) \rangle_{E,\vec{r}}} \tag{5.2} \]

whereby the degree of the self-shielding effects is directly expressed in terms of the covariance of the localized reaction cross section and the neutron flux. Thus, the degree of self-shielding effects can be construed as a measure of correlation between the reaction cross section and neutron flux in energy and in space at a given temperature. Physically, the two are anti-correlated and the correlation is temperature-dependent, which gives rise to the temperature coefficient of a reactor system. All averages here can be cast into the form of either the usual Riemann integral or the Lebesgue integral, as one so chooses. For resonances in the unresolved energy range, these averages can be viewed as the expectation values based on the statistical properties of resonance parameters. Such a description provides a plausible basis, yet with no loss of generality, for much of the discussion to follow.

5.2 Representation of Microscopic Cross Sections

The theory of nuclear reactions provides the basis for describing the behavior of resonance structures observed in cross sections. The most comprehensive discussions on this subject were given by Lane and Thomas [15], Lynn [16], and Fröhner [17].
For our purpose here, it is useful to retrace some of the fundamentals pertinent to practical applications. One obvious place to start is a brief description of the R-matrix theory from which various forms of representations with various degrees of sophistication were derived.

5.2.1 Brief Description of R-Matrix Theory

In R-matrix theory, the reaction cross section for any incident channel $c$ and exit channel $c'$ and the total cross section are generally expressed in terms of the collision matrix $U_{cc'}$, a symmetric and unitary matrix,

\[ \sigma_{cc'} = \frac{\pi}{k^2}\, g_c\, |\delta_{cc'} - U_{cc'}|^2, \qquad \sigma_t = \sum_{c'} \sigma_{cc'} = \frac{2\pi}{k^2}\, g_c\, (1 - \Re e\{U_{cc}\}) \tag{5.3} \]

where $k$ is the wave number (or momentum), i.e., the reciprocal of the neutron wavelength divided by $2\pi$, and $g_c$ is the statistical spin factor, a function of the compound nucleus spin $J$ and the target nucleus spin $I$. They are defined as $k \approx 2.196771 \times 10^{-3}\,[A/(A+1)]\,\sqrt{E}$ and $g_c = (2J+1)/[2(2I+1)]$, respectively.

There are two versions of R-matrix theory. The most commonly used basis for practical applications is the Wigner–Eisenbud version [18]. The other is the Kapur–Peierls [19] version, which is also of some practical importance. For our purposes here, brief discussions of both will be presented.

5.2.1.1 Wigner–Eisenbud Version

In this version, the explicit energy-dependent behavior of the resonance structure specified by $U_{cc'}$ is given either in terms of the channel matrix characterized by a real matrix $R$, or equivalently in terms of the level matrix $A$. Using Fröhner’s notation [17], one obtains

\[ U_{cc'} = e^{-i(\varphi_c + \varphi_{c'})} \left\{ \delta_{cc'} + 2i\, P_c^{1/2} \left[ (I - R L^0)^{-1} R \right]_{cc'} P_{c'}^{1/2} \right\} = e^{-i(\varphi_c + \varphi_{c'})} \left( \delta_{cc'} + i \sum_{\lambda,\mu} \Gamma_{\lambda c}^{1/2}\, A_{\lambda\mu}\, \Gamma_{\mu c'}^{1/2} \right) \tag{5.4} \]

where the two essential matrices are

\[ R_{cc'} = \sum_{\lambda} \frac{\gamma_{\lambda c}\,\gamma_{\lambda c'}}{E_\lambda - E}, \qquad (A^{-1})_{\lambda\mu} = (E_\lambda - E)\,\delta_{\lambda\mu} - \sum_{c} \gamma_{\lambda c}\, L^0_c\, \gamma_{\mu c} \tag{5.5} \]

and $L^0$ is a diagonal matrix defined as

\[ L^0_{cc'} = L^0_c\,\delta_{cc'} = (L_c - B_c)\,\delta_{cc'} \tag{5.6} \]

The quantity $L_c$ above is the logarithmic derivative of the outgoing wave function $O_c(r_c)$ at the channel radius $r_c$, with the boundary parameter $B_c$ taken to be a real constant.
$L_c$ and the hard-sphere phase shift factor $\varphi_c$, another $O_c(r_c)$-dependent quantity, are explicitly given as

\[ L_c = r_c\, \frac{O_c'(r_c)}{O_c(r_c)} = S_c + i P_c, \qquad \varphi_c = \arg O_c(r_c) \tag{5.7} \]

Two generic resonance parameters for a given level $\lambda$ that appear in the formulations are the reduced width amplitude $\gamma_{\lambda c}$ for a given channel $c$ (or alternatively, the partial width $\Gamma_{\lambda c} = 2 P_c \gamma_{\lambda c}^2$) and the eigenvalue $E_\lambda$. These parameters are real and their statistical properties are well known. Such attributes are not only important in resonance data evaluations but also essential in reactor physics applications when it comes to the treatment of the unresolved resonances to be addressed later.

Aside from the obvious energy dependence that appears in the matrix $R$ or in the level matrix $A$, there are three energy-dependent factors that must be specified, namely, $S_c$, $P_c$, and $\varphi_c$. The quantity $S_c$, the real component of $L_c$, is referred to as the level shift factor whereas $P_c$ is referred to as the penetration factor. These quantities, along with the hard-sphere phase shift factor $\varphi_c$, are weakly energy-dependent when the elastic scattering and inelastic channels are considered, whereas they are usually taken to be constant for photon and fission channels, usually by setting $B_c = S_c$ and $P_c$ to be constant. Thus, the explicit energy dependence of the collision matrix requires the specification of those energy-dependent factors.

The outgoing wave function for a neutron channel with a given angular momentum state $l$ at the channel radius is expressible in terms of the spherical Hankel function of the first kind, a linear combination of spherical Bessel functions,

\[ O_c(r_c) = i k r_c\, h_l^{(1)}(k r_c) = (i k r_c)\,[\, j_l(k r_c) + i\, y_l(k r_c) \,] \tag{5.8} \]

where $j_l(x)$ and $y_l(x)$ are the usual spherical Bessel functions of the first and the second kind, respectively.
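As a numerical illustration of Eqs. 5.7 and 5.8, the following minimal Python sketch evaluates $S_c$, $P_c$ and $\varphi_c$ directly from the outgoing wave function. It is a sketch under stated assumptions, not a production routine: the function names are hypothetical, the spherical Hankel function is evaluated from its standard finite-sum form, and the derivative in Eq. 5.7 is approximated by a central finite difference purely for brevity.

```python
import cmath
import math


def spherical_hankel1(l, x):
    """h_l^(1)(x) from its finite closed-form sum (integer l >= 0, x > 0)."""
    total = sum(
        math.factorial(l + j)
        / (math.factorial(j) * math.factorial(l - j))
        * (-2j * x) ** (-j)
        for j in range(l + 1)
    )
    return 1j ** (-l - 1) * cmath.exp(1j * x) / x * total


def outgoing_wave(l, rho):
    """O_l(rho) = i * rho * h_l^(1)(rho), with rho = k * r_c (Eq. 5.8)."""
    return 1j * rho * spherical_hankel1(l, rho)


def shift_penetration_phase(l, rho, step=1e-6):
    """Return (S, P, phi) with L = rho * O'/O = S + iP and phi = arg O (Eq. 5.7).

    The derivative is a central finite difference: an illustrative shortcut,
    not how a processing code would evaluate it.
    """
    o = outgoing_wave(l, rho)
    do = (outgoing_wave(l, rho + step) - outgoing_wave(l, rho - step)) / (2 * step)
    log_derivative = rho * do / o
    return log_derivative.real, log_derivative.imag, cmath.phase(o)


if __name__ == "__main__":
    rho = 0.3  # hypothetical value of k * r_c
    for l in (0, 1, 2):
        print(l, shift_penetration_phase(l, rho))
```

For $l = 0$ and $l = 1$ the results can be compared with the familiar closed forms $P_0 = \rho$, $S_0 = 0$, $\varphi_0 = \rho$ and $P_1 = \rho^3/(1+\rho^2)$, $S_1 = -1/(1+\rho^2)$, $\varphi_1 = \rho - \tan^{-1}\rho$. Note that `cmath.phase` returns the phase only modulo $2\pi$.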
The incoming wave function, $I_c(r_c)$, is equal to $O_c^*(r_c)$ for the neutron-induced cross sections. One useful identity from which the spherical Hankel function and its derivatives can be defined explicitly is

\[ h_l^{(1)}(x) = i^{-l-1}\, \frac{e^{ix}}{x} \sum_{j=0}^{l} \frac{(l+j)!}{j!\;\Gamma(l-j+1)}\, (-2ix)^{-j} \tag{5.9} \]

By utilizing Eqs. 5.7–5.9, the weakly energy-dependent factors $S_c$, $P_c$, and $\varphi_c$ can be defined explicitly. Table 5.1 shows these factors as a function of $\rho = k r_c$ for four $l$-states of practical interest.

Table 5.1 Momentum-dependent factors for various $l$-states defined at channel radius $r_c$ ($\rho = k r_c$)

Factor | $l = 0$ | $l = 1$ | $l = 2$ | $l = 3$
$P_l$ | $\rho$ | $\rho^3/(1+\rho^2)$ | $\rho^5/(9+3\rho^2+\rho^4)$ | $\rho^7/(225+45\rho^2+6\rho^4+\rho^6)$
$S_l$ | $0$ | $-1/(1+\rho^2)$ | $-(18+3\rho^2)/(9+3\rho^2+\rho^4)$ | $-(675+90\rho^2+6\rho^4)/(225+45\rho^2+6\rho^4+\rho^6)$
$\varphi_l$ | $\rho$ | $\rho - \tan^{-1}\rho$ | $\rho - \tan^{-1}[3\rho/(3-\rho^2)]$ | $\rho - \tan^{-1}[\rho(15-\rho^2)/(15-6\rho^2)]$

One limiting property of the level shift factor when $k \to 0$ is that $S_c \to -l$, which can be shown directly via the use of Eqs. 5.7–5.9. Thus, for the relatively low energy range, one can set $B_c = -l$ so that $L^0_c$ becomes dependent on the penetration factor alone.

5.2.1.2 Kapur–Peierls Version

The Kapur–Peierls version [19] amounts to a different way of choosing the boundary parameter defined in Eq. 5.6. A simpler expression can be obtained if one sets $B_c = S_c + i P_c$. Because of its energy-dependent nature, it is sometimes referred to as the energy-dependent boundary condition. The consequence is that $L^0_c = 0$ and $I - R L^0 = I$. Thus, it eliminates the need for inverting the channel matrix, so that the collision matrix can be cast into the much simpler form given below,

\[ U_{cc'} = e^{-i(\varphi_c + \varphi_{c'})} \left( \delta_{cc'} + i \sum_{\lambda} \frac{g_{\lambda c}^{1/2}\, g_{\lambda c'}^{1/2}}{\varepsilon_\lambda - E} \right) \tag{5.10} \]

where the two generic parameters $g_{\lambda c}$ and $\varepsilon_\lambda$ become complex and energy-dependent. Obtaining such a set of parameters requires a computation at each incident neutron energy, as implied by the use of the energy-dependent boundary condition, a trade-off that is not necessarily attractive in practical applications.
Furthermore, there does not appear to be any plausible means to derive their statistical properties analytically. Thus, the Kapur–Peierls version has not been taken seriously as the tool for data evaluations, with the exception of perhaps some special cases. However, it does have two major merits quite appealing to reactor physics applications. First, the general form of Eq. 5.10 is readily amenable to analytical Doppler-broadening, a major task in the processing of point-wise cross sections. Second, it resembles the single-level Breit–Wigner approximation on which much of the reactor physics concept was based. For these reasons, various attempts were made, at least for some limited cases, to convert the known set of evaluated R-matrix parameters based on the Wigner–Eisenbud version to pole parameters of the Kapur–Peierls type.

5.2.2 Practical Representations Currently in Use

For practical applications, it is convenient to separate the resonance component from the nonresonance component (i.e., potential scattering cross section) in the cross-section representation [13]. This can be accomplished symbolically by letting

\[ 2\varrho_{nc} = -i \sum_{\lambda,\mu} \Gamma_{\lambda n}^{1/2}\, A_{\lambda\mu}\, \Gamma_{\mu c}^{1/2} \tag{5.11} \]

in Eq. 5.4. For our purposes here, let $\sigma_{tR}$ denote the resonance component of $\sigma_t$ so that the total and partial cross sections of reaction process $x$ can be written, with $\bar{\lambda} = 1/k$, as

\[ \sigma_t = \sigma_p + 4\pi\bar{\lambda}^2 \sum_{l,J} g_J\, \Re e\{ e^{-i2\varphi_l}\, \varrho_{nn} \}, \qquad \sigma_p = 4\pi\bar{\lambda}^2 \sum_{l,J} g_J \sin^2 \varphi_l \tag{5.12} \]

\[ \sigma_x = 4\pi\bar{\lambda}^2 \sum_{l,J} g_J\, |\varrho_{nc}|^2, \qquad c \notin n,\; x \in \{\gamma, f\} \tag{5.13} \]

\[ \sigma_a = 4\pi\bar{\lambda}^2 \sum_{l,J} g_J \left[ \Re e\{\varrho_{nn}\} - |\varrho_{nn}|^2 \right], \qquad \sigma_{sR} = \sigma_{tR} - \sigma_a \tag{5.14} \]

It should be noted that the resonance scattering component $\sigma_{sR}$ is usually taken to be the difference of the compound nucleus cross section and the absorption cross section. Similarly, for some approximations to be described, the capture cross section is taken to be $\sigma_\gamma = \sigma_t - \sigma_s - \sigma_f$ to ensure the unitary nature of the collision matrix.
5.2.2.1 Single Level Breit–Wigner Approximation (SLBW)

In the limit when resonances are well isolated, the inverse level matrix can be represented by the matrix of rank 1 attributed to a single resonance, i.e.,

\[ A^{-1} \approx E_0 - E - \sum_{c} L^0_c\, \gamma_c^2 = E_0 + \Delta - E - \frac{i\,\Gamma_t}{2} \tag{5.15} \]

where the total width $\Gamma_t$ is equal to the sum of all partial widths and the level shift is $\Delta = -(S_c - B_c)\,\gamma_c^2$, associated with the elastic scattering channel $c$. Thus, strictly speaking, the corresponding cross section should reflect the discrete nature of isolated levels.

In practice, there are very few situations where resonances are completely isolated. The question arose as to how one could maintain the continuity in flux calculations if the tails of neighboring resonances ran into one another. In the absence of other options beyond the SLBW in earlier days, one solution to circumvent this difficulty was to consider the cross sections as a superposition of individual resonances accompanied by a set of precomputed point-wise “smooth” cross-section files. The latter was intended to provide the necessary corrections to the inadequacies of the approximation. In this context, the single-level approximation actually in use can be specified as

\[ \sigma_x = \sum_{l,J} \sum_{k} \frac{\sigma_{0xk}}{1 + x_k^2} = 4\pi\bar{\lambda}^2 \sum_{l,J} g_J \sum_{k} \frac{\Gamma_{nk}\,\Gamma_{xk}}{2\,\Gamma_{tk}}\; \Re e\left\{ \frac{-i}{z_k - E} \right\}, \qquad x \in \{\gamma, f\} \tag{5.16} \]

\[ \sigma_{tR} = \sum_{l,J} \sum_{k} \sigma_{0tk} \left[ \frac{\cos 2\varphi_l}{1 + x_k^2} + \frac{x_k \sin 2\varphi_l}{1 + x_k^2} \right] = 4\pi\bar{\lambda}^2 \sum_{l,J} g_J \sum_{k} \frac{\Gamma_{nk}}{2}\; \Re e\left\{ \frac{-i\, e^{-i2\varphi_l}}{z_k - E} \right\} \tag{5.17} \]

where $\sigma_{0xk} = 4\pi\bar{\lambda}^2 g_J\, \Gamma_{nk}\Gamma_{xk}/\Gamma_{tk}^2$, $x_k = (E - E_k - \Delta_k)/(\Gamma_{tk}/2)$, $z_k = E_k + \Delta_k - i\,\Gamma_{tk}/2$, and $\bar{\lambda} = 1/k$. The equivalent representations of the traditional expressions in the complex domain are introduced here to signify the meromorphic nature of cross sections pertinent to the discussions to follow. The energy-dependent factors in Eqs. 5.16 and 5.17 that characterize a given resonance are also referred to as the Lorentzians. They are readily amenable to analytical Doppler-broadening, an attribute essential to much of the resonance integral concept in reactor physics.
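The equivalence of the traditional Lorentzian form and the complex pole form of the single-level cross section can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, the resonance parameters are round numbers loosely resembling a low-lying s-wave capture resonance of a heavy nuclide (they are not evaluated data), the neutron width is held constant across the resonance, and energies are assumed to be in eV so that $\pi/k^2$ comes out in barns with the numerical constant quoted for $k$.

```python
import math


def wave_number(energy, mass_ratio):
    """k for energy in eV (assumed units), using the constant quoted in the text."""
    return 2.196771e-3 * mass_ratio / (mass_ratio + 1.0) * math.sqrt(energy)


def slbw_lorentzian(energy, e0, gamma_n, gamma_x, gamma_t, g_j, mass_ratio, shift=0.0):
    """sigma_x = sigma_0 / (1 + x^2), the traditional real form."""
    lam2 = 1.0 / wave_number(energy, mass_ratio) ** 2
    sigma0 = 4.0 * math.pi * lam2 * g_j * gamma_n * gamma_x / gamma_t**2
    x = (energy - e0 - shift) / (gamma_t / 2.0)
    return sigma0 / (1.0 + x * x)


def slbw_pole(energy, e0, gamma_n, gamma_x, gamma_t, g_j, mass_ratio, shift=0.0):
    """The same cross section written as Re{-i / (z - E)} with a complex pole z."""
    lam2 = 1.0 / wave_number(energy, mass_ratio) ** 2
    z = complex(e0 + shift, -gamma_t / 2.0)
    amplitude = 4.0 * math.pi * lam2 * g_j * gamma_n * gamma_x / (2.0 * gamma_t)
    return amplitude * (-1j / (z - energy)).real


# Hypothetical, illustrative parameters (eV); not taken from any evaluation.
PARAMS = dict(e0=6.67, gamma_n=1.5e-3, gamma_x=23.0e-3, gamma_t=24.5e-3,
              g_j=1.0, mass_ratio=236.0)

if __name__ == "__main__":
    for energy in (6.0, 6.6, 6.67, 6.7, 7.5):
        print(energy, slbw_lorentzian(energy, **PARAMS), slbw_pole(energy, **PARAMS))
```

The two columns agree to rounding error, which is the property exploited when the Lorentzian is Doppler-broadened analytically in the complex domain.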
5.2.2.2 Multilevel Breit–Wigner Approximation (MLBW)

In the late 1960s, improvement of the single-level approximation began to receive a great deal of attention as the fast reactor development was in full swing. To account, at least in part, for the overlap effects due to neighboring levels, the multilevel parameters based on the MLBW approximation became available. This method is based on the assumption that the inverse level matrix $A^{-1}$ can be taken as a diagonal matrix consisting of various SLBW elements, i.e.,

\[ (A^{-1})_{\lambda\mu} = \left( E_\lambda + \Delta_\lambda - E - \frac{i\,\Gamma_{t\lambda}}{2} \right) \delta_{\lambda\mu} \tag{5.18} \]

Thus, the resulting collision matrix can be expressed in the form of a pole expansion similar to that of Kapur–Peierls, with easily derivable parameters and without matrix inversion [20]. With no loss of generality, the corresponding cross sections can be cast into the same form as those of the SLBW in the complex domain,

\[ \sigma_x = \frac{1}{E} \sum_{l,J} \sum_{\lambda} \Re e\left\{ \frac{-i\, r^{(x)}_{l,J,\lambda}}{z_\lambda - E} \right\}, \qquad x \in \{\gamma, f\} \tag{5.19} \]

where the pole $z_\lambda$ is clearly identifiable with that defined in Eq. 5.16 for an SLBW resonance and $r^{(x)}_{l,J,\lambda}$ is related to the corresponding residue $c^{(x)}_{l,J,\lambda}$ defined as

\[ c^{(x)}_{l,J,\lambda} = \frac{1}{4} \sum_{\mu} \frac{(2i)\,\sqrt{E}\, \left( \Gamma^{(0)}_{n\lambda}\, \Gamma_{x\lambda}\, \Gamma^{(0)}_{n\mu}\, \Gamma_{x\mu} \right)^{1/2}}{z_\mu^* - z_\lambda}, \qquad r^{(x)}_{l,J,\lambda} = 4\pi\bar{\lambda}^2\, g_J\, E\, c^{(x)}_{l,J,\lambda} \tag{5.20} \]

where $\Gamma_n^{(0)} = \Gamma_n/\sqrt{E}$ is referred to as the “reduced” neutron width and is independent of energy for s-wave neutrons. Note that the residue here, unlike the case of SLBW, is complex and involves the overlap effect of all levels. The total resonance cross section can be cast into a similar form,

\[ \sigma_{tR} = \frac{1}{E} \sum_{l,J} \sum_{\lambda} \Re e\left\{ \frac{-i\, e^{-i2\varphi_l}\, r^{(tR)}_{l,J,\lambda}}{z_\lambda - E} \right\} \tag{5.21} \]

where the residue is real and readily identifiable with the parameters defined for the SLBW approximation given by Eq. 5.17. The same procedure is applicable to the absorption cross section defined by Eq. 5.14. It can be shown readily that

\[ \sigma_a = \frac{1}{E} \sum_{l,J} \sum_{\lambda} \Re e\left\{ \frac{(-i)\, r^{(a)}_{l,J,\lambda}}{z_\lambda - E} \right\}, \qquad r^{(a)}_{l,J,\lambda} = 4\pi\bar{\lambda}^2\, g_J\, E \left[ \frac{\Gamma_{n\lambda}}{2} - c^{(n)}_{l,J,\lambda} \right] \tag{5.22} \]

where the residue $c^{(n)}_{l,J,\lambda}$, and hence $r^{(n)}_{l,J,\lambda}$, is obtainable from Eq. 5.20 by replacing the index $x$ with $n$.
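The step from the absolute square of a sum of single-level terms to a pole expansion with complex residues (the content of Eqs. 5.19 and 5.20) rests on a partial-fraction identity that is easy to verify numerically. The Python sketch below is a check under simplifying assumptions: the function names and the three-level parameter set are hypothetical, the widths are treated as energy-independent, and the common factor $4\pi\bar{\lambda}^2 g_J$ is omitted so that only $|\varrho_{nx}|^2$ is compared.

```python
import math


def rho_nx(energy, levels):
    """Resonance amplitude for diagonal (MLBW) level matrix:
    rho = -(i/2) * sum_k sqrt(Gamma_n * Gamma_x) / (z_k - E)."""
    return -0.5j * sum(math.sqrt(gn * gx) / (z - energy) for z, gn, gx in levels)


def pole_residues(levels):
    """c_k = (1/4) * sum_m 2i * sqrt(Gn_k Gx_k Gn_m Gx_m) / (conj(z_m) - z_k)."""
    residues = []
    for z_k, gn_k, gx_k in levels:
        c_k = 0.25 * sum(
            2j * math.sqrt(gn_k * gx_k * gn_m * gx_m) / (z_m.conjugate() - z_k)
            for z_m, gn_m, gx_m in levels
        )
        residues.append(c_k)
    return residues


def abs_square_from_poles(energy, levels):
    """|rho|^2 rewritten as sum_k Re{ -i c_k / (z_k - E) }."""
    return sum(
        (-1j * c_k / (z_k - energy)).real
        for c_k, (z_k, _, _) in zip(pole_residues(levels), levels)
    )


# Hypothetical overlapping levels: (pole z = E_k - i*Gamma_t/2, Gamma_n, Gamma_x), in eV.
LEVELS = [
    (complex(10.0, -0.15), 0.02, 0.25),
    (complex(10.4, -0.20), 0.05, 0.30),
    (complex(11.1, -0.10), 0.01, 0.18),
]

if __name__ == "__main__":
    for energy in (9.5, 10.2, 10.8, 11.5):
        print(energy, abs(rho_nx(energy, LEVELS)) ** 2,
              abs_square_from_poles(energy, LEVELS))
```

For a single level the residue reduces to the real number $\Gamma_n\Gamma_x/(2\Gamma_t)$, which recovers the SLBW form; with several levels the residues become complex because each one collects the cross terms with all other levels.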
The MLBW representations given in the complex domain here are directly identifiable with those originally proposed by Adler–Adler [20] when expressed explicitly in terms of real parameters. It should be noted that the approach described above represents the generic form including multilevel effects in all cross sections. In the version suggested by the ENDF/B user’s manual [13], such effects are only accounted for in the resonance scattering cross section while other partial cross sections are taken to be the same as those of the single-level approximation described earlier.

5.2.2.3 Adler–Adler Approximation (AA)

The cross sections based on the MLBW approximation account for the multilevel effects attributed to the diagonal elements alone, while the multichannel effects arising from the off-diagonal elements are not accounted for. Such an approximation may be satisfactory for nuclides with relatively isolated resonances. It can become unsatisfactory when fissionable nuclides with closely spaced resonances are considered. One improved alternative was pioneered by Adler and Adler [20] in order to deal with the low-lying s-wave resonances of fissionable isotopes. For such resonances, the total width is practically dominated by the sum of the fission and capture widths. Hence, all weakly energy-dependent factors attributable to the elastic scattering process described earlier will have little impact on the level matrix. The penetration factor $P_c(E)$ that appears in $A^{-1}$ can be approximated by $P_c(E_k)$ when applied to the neutron width of each resonance considered. Consequently, one can deduce the Kapur–Peierls-like, yet energy-independent, parameters via diagonalization of the inverse level matrix, and the resulting cross sections can be expressed in the same general form in terms of the single-level-like pole expansion described for the MLBW approximation.
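The diagonalization step mentioned above can be illustrated for two overlapping levels. The Python sketch below is a minimal illustration under stated assumptions, not the procedure of any particular code: it takes $B_c = S_c$ with constant penetration factors so that the energy-independent part of the inverse level matrix is the complex symmetric matrix $B_{\mu\nu} = E_\mu\delta_{\mu\nu} - (i/2)\sum_c \Gamma_{\mu c}^{1/2}\Gamma_{\nu c}^{1/2}$, solves the $2\times 2$ eigenproblem in closed form, and normalizes each eigenvector bilinearly (sum of squares equal to one, without complex conjugation). The function names and the signed width amplitudes are hypothetical.

```python
import cmath


def build_b(levels):
    """B[m][k] = E_m * delta_mk - (i/2) * sum_c a_mc * a_kc, where
    a_mc = signed sqrt(Gamma_mc) is the width amplitude of level m in channel c."""
    n = len(levels)
    return [
        [
            (levels[m][0] if m == k else 0.0)
            - 0.5j * sum(a * b for a, b in zip(levels[m][1], levels[k][1]))
            for k in range(n)
        ]
        for m in range(n)
    ]


def pole_parameters(levels):
    """Complex poles and residue amplitudes for two levels.
    Assumes a nonzero off-diagonal element and distinct eigenvalues."""
    b_mat = build_b(levels)
    a, b, d = b_mat[0][0], b_mat[0][1], b_mat[1][1]
    root = cmath.sqrt((a - d) ** 2 + 4 * b * b)
    n_channels = len(levels[0][1])
    params = []
    for eps in ((a + d + root) / 2, (a + d - root) / 2):
        vec = (b, eps - a)                                # solves (B - eps*I) v = 0
        norm = cmath.sqrt(vec[0] ** 2 + vec[1] ** 2)      # bilinear norm, no conjugate
        z = [component / norm for component in vec]
        g_half = [sum(z[m] * levels[m][1][c] for m in range(2))
                  for c in range(n_channels)]
        params.append((eps, g_half))
    return params


def sum_over_poles(energy, levels, c1, c2):
    """sum_k g_k,c1^(1/2) * g_k,c2^(1/2) / (eps_k - E)."""
    return sum(g[c1] * g[c2] / (eps - energy) for eps, g in pole_parameters(levels))


def sum_over_level_matrix(energy, levels, c1, c2):
    """sum_{m,k} a_m,c1 * A_mk * a_k,c2 with A = (B - E*I)^(-1), inverted explicitly."""
    b_mat = build_b(levels)
    a, b, d = b_mat[0][0] - energy, b_mat[0][1], b_mat[1][1] - energy
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return sum(levels[m][1][c1] * inv[m][k] * levels[k][1][c2]
               for m in range(2) for k in range(2))


# Hypothetical levels: (E_m in eV, [amplitude in neutron channel, amplitude in fission channel]).
LEVELS = [(1.0, [0.03, 0.25]), (1.4, [0.04, -0.30])]

if __name__ == "__main__":
    for eps, g_half in pole_parameters(LEVELS):
        print("pole:", eps, "residue amplitudes:", g_half)
```

Because the matrix is complex symmetric rather than Hermitian, the bilinear normalization is what makes the level matrix collapse to a sum of simple pole terms; the two functions `sum_over_poles` and `sum_over_level_matrix` return the same value at any real energy.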
With no loss of generality, one convenient way to derive the Adler–Adler parameters directly from the inverse level matrix is to consider a complex eigenvalue problem, a procedure also used in [21] for computing S-matrix parameters, as given below

\[ B\, Z^{(\lambda)} = \varepsilon_\lambda\, Z^{(\lambda)}, \qquad B_{\mu\nu} = E_\mu\, \delta_{\mu\nu} - \sum_{c} \gamma_{\mu c}\, L^0_c\, \gamma_{\nu c} \tag{5.23} \]

where $\varepsilon_\lambda$ is the $\lambda$-th complex eigenvalue and $Z^{(\lambda)}$ is the normalized eigenvector corresponding to the $\lambda$-th eigenvalue. The normalization is such that $\sum_\mu \left( Z^{(\lambda)}_\mu \right)^2 = 1$. Once the eigenvalues and eigenvectors are known, the collision matrix can be expressed in the Kapur–Peierls form where the residue amplitude $g_{\lambda c}^{1/2}$ is given by

\[ g_{\lambda c}^{1/2} = \sum_{\mu} \sqrt{2 P_c}\; Z^{(\lambda)}_\mu\, \gamma_{\mu c} \tag{5.24} \]

It is important to note that $g_{\lambda c}^{1/2}$, the residue amplitude, is physically equivalent to $\Gamma_{\lambda c}^{1/2}$ except that it is defined in the complex domain. For the elastic scattering channel, $g_{kn}$, like $\Gamma_{nk}$, is proportional to $\sqrt{E}$ for s-wave resonances. Hence, in the same context, one can define $g_n^{(0)} = g_n/\sqrt{E}$ as the “reduced” neutron residue amplitude.

Thus, the generic forms defined in Eqs. 5.19, 5.21 and 5.22 are equally applicable here except that $z_\lambda$ and $\Gamma_{\lambda x}^{1/2}$ must be replaced by the pole $\varepsilon_\lambda$ and the residue amplitude $g_{\lambda x}^{1/2}$, respectively. For the sake of completeness, the corresponding residue parameters are explicitly given below:

\[ r^{(x)}_{l,J,\lambda} = \frac{4\pi\bar{\lambda}^2\, g_J\, E}{4} \sum_{\mu} \frac{(2i)\,\sqrt{E}\, \left( g^{(0)}_{n\lambda}\, g_{x\lambda}\, g^{(0)*}_{n\mu}\, g^{*}_{x\mu} \right)^{1/2}}{\varepsilon_\mu^* - \varepsilon_\lambda} \tag{5.25} \]

\[ r^{(tR)}_{l,J,\lambda} = 4\pi\bar{\lambda}^2\, g_J\, E\, e^{-i2\varphi_l}\, g_{n\lambda}, \qquad r^{(a)}_{l,J,\lambda} = r^{(tR)}_{l,J,\lambda} - r^{(n)}_{l,J,\lambda} \tag{5.26} \]

where $r^{(n)}_{l,J,\lambda}$ can be obtained by Eq. 5.20 if $\Gamma_n$ and $\Gamma_x$ are replaced by $g_n$ and $g_x$, respectively. The complex parameters here are readily identifiable with those originally derived by Adler–Adler [20] when cast into real arithmetic expressions.

It is noteworthy that the traditional approach of converting the R-matrix parameters into the Kapur–Peierls-type parameters presented here is by no means the only choice.
An attractive alternative pioneered by deSaussure and Perez [22] is believed to be more efficient for practical applications when used in conjunction with converting the Reich–Moore parameters [23] into pole parameters and is conceptually more amenable to further generalization.

5.2.2.4 Reich–Moore Approximation

This is an approximation that best preserves the rigor of the R-matrix theory and is most preferred for resonance data evaluations. The only significant assumption is given below [23],

\[ \sum_{c \in \gamma} \gamma_{\lambda c}\, L^0_c\, \gamma_{\mu c} = \delta_{\lambda\mu} \sum_{c \in \gamma} L^0_c\, \gamma_{\lambda c}^2 \tag{5.27} \]

i.e., the photon channels that appear in the above matrix are taken to be diagonal. The physical justification for this assumption is based on the fact that a large number of photon channels with $\gamma_{\lambda c}$ assuming either positive or negative signs are expected, so that the off-diagonal elements are unlikely to be important due to cancellations. This assumption, in effect, provides a significant reduction in the number of channels that one has to deal with. The inverse level matrix after such channel reduction becomes

\[ (A^{-1})_{\lambda\mu} = \left( E_\lambda - E - \frac{i\,\Gamma_{\lambda\gamma}}{2} \right) \delta_{\lambda\mu} - \sum_{c \notin \gamma} \gamma_{\lambda c}\, L^0_c\, \gamma_{\mu c} \tag{5.28} \]

and the corresponding “reduced” R-matrix is expressible as

\[ R_{cc'} = \sum_{\lambda} \frac{\gamma_{\lambda c}\, \gamma_{\lambda c'}}{E_\lambda - E - i\,\Gamma_{\lambda\gamma}/2}, \qquad c, c' \notin \gamma \tag{5.29} \]

The collision matrix in the channel matrix form after elimination of photon channels can be written as

\[ U_{cc'} = e^{-i(\varphi_c + \varphi_{c'})} \left\{ 2\left[ (I - K)^{-1} \right]_{cc'} - \delta_{cc'} \right\}, \qquad K_{cc'} = i\, P_c^{1/2}\, R_{cc'}\, P_{c'}^{1/2} \tag{5.30} \]

Thus, the Reich–Moore approximation can be readily cast into the same generic forms defined by Eqs. 5.12–5.14 in terms of the resonance component

\[ \varrho_{nc} = \delta_{nc} - \left[ (I - K)^{-1} \right]_{nc}; \qquad 2\varrho_{nc} = -i \sum_{\lambda,\mu} \Gamma_{\lambda n}^{1/2}\, A_{\lambda\mu}\, \Gamma_{\mu c}^{1/2} \tag{5.31} \]

for the channel matrix or level matrix representations, respectively.

The fact that only a small number of elastic scattering and fission channels are present makes the approach much more manageable than the formal R-matrix representation. There are, however, two trade-offs here.
First, the “reduced” R-matrix is no longer real, whereas a real R-matrix is required to maintain the unitarity of the collision matrix. To circumvent this issue, the capture cross section must be defined as the difference of the total cross section and the sum of the non-capture cross sections. The common practice is to compute $\sigma_t$, $\sigma_f$ and $\sigma_a$ via Eqs. 5.12–5.14 respectively, from which $\sigma_s$ and $\sigma_\gamma$ can be deduced. Second, unlike other approximations, the generic form of pole representation can no longer be maintained. This has two significant implications for reactor physics applications, where many of the basic concepts and codes were based on the single-level approximation. It is particularly so in conjunction with the analytical Doppler-broadening of cross sections and utilization of the resonance integral concept to be described later.

5.2.3 Other Alternative: Generalized Pole Representation

In the mid-1980s when the fast reactor program reached its peak, a large body of Reich–Moore resonance data for practically all major nuclides of interest began to become available. The need to utilize the vastly improved resonance data in many reactor applications motivated the development of the generalized representation, whereby both the attractive features of the pole representations for reactor applications and the rigor of the Reich–Moore formalism [23] could be preserved.

One important development prior to the conception of the generalized pole representation was the method pioneered by deSaussure and Perez [22] for converting the R-matrix parameters based on the Reich–Moore approximation to the Kapur–Peierls-type parameters (or Adler–Adler parameters) described earlier. Instead of the traditional approach based on diagonalization of the level matrix described previously, their rationale was based on the explicit mathematical behavior of $\varrho_{nc}$, the resonance component of the collision matrix, as a function of energy.
One simple way to illustrate the basis of their approach is to examine the energy dependence of $A^{-1}$ given by Eq. 5.5. In the absence of energy dependence of $L^0_c$ in this equation under the Adler–Adler assumption [20] for the s-wave resonances, it is quite obvious that the $N \times N$ matrix $A$ is mathematically equivalent to a rational function of energy of order $N$. This can be proved readily by examining the cofactor and determinant of $A^{-1}$. The former must be expressible as a polynomial in $E$ of order $N-1$ while the latter as a polynomial in $E$ of order $N$, the total number of s-wave resonances in question. The same conclusion can also be reached if one examines the channel matrix $(I - K)^{-1}$ defined by Eq. 5.30. Mathematically, it is equivalent to state that $A$ and/or $(I - K)^{-1}$ must be single-valued and meromorphic in the energy plane. Thus, the Adler–Adler parameters can be obtained directly from a given set of Reich–Moore parameters via the computation of the poles and their corresponding residues of the channel matrix. The former are equivalent to the roots of $\det(I - K) = 0$, which can be computed efficiently via the usual Newton–Raphson scheme. Once the poles are known, the residues can be determined readily. The POLLA code [22] was developed for this purpose.

The rationale of deSaussure and Perez [22] can be readily extended to the rigorous pole expansion of the collision matrix in the k-plane [24]. Conceptually, such a representation can be construed as the natural consequence of the physical condition that the collision matrix must be single-valued and meromorphic in the k-plane (or momentum domain). From a theorem of complex analysis, such a function is expressible as a rational function with simple poles. With no loss of generality, the same rationale can be extended to the general situation for all possible $l$-states if one substitutes their respective explicit energy dependence of the level shift factor and penetration factor, given in Table 5.1, into $A^{-1}$.
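The pole search described above can be sketched for the simplest possible case. The Python example below is an illustration under stated assumptions and is not a description of the POLLA code: it considers a single neutron channel with no fission, a constant penetration factor (so that $P\gamma_\lambda^2 = \Gamma_{n\lambda}/2$ is energy-independent), and Reich–Moore-type levels, in which case $\det(I - K)$ reduces to the scalar $1 - K(E)$. The function names, the starting guesses and the two-level parameter set are hypothetical.

```python
def one_minus_k(energy, levels):
    """f(E) = 1 - K(E), with K = (i/2) * sum_k Gamma_n,k / (E_k - E - i*Gamma_gamma,k/2)."""
    return 1 - 0.5j * sum(gn / (e0 - energy - 0.5j * gg) for e0, gn, gg in levels)


def one_minus_k_prime(energy, levels):
    """Analytic derivative df/dE."""
    return -0.5j * sum(gn / (e0 - energy - 0.5j * gg) ** 2 for e0, gn, gg in levels)


def find_poles(levels, max_iter=50, tol=1e-13):
    """Newton-Raphson in the complex energy plane, one start per level.
    The start E_k - i*(Gamma_n + Gamma_gamma)/2 is the isolated-level pole."""
    poles = []
    for e0, gn, gg in levels:
        energy = complex(e0, -0.5 * (gn + gg))
        for _ in range(max_iter):
            step = one_minus_k(energy, levels) / one_minus_k_prime(energy, levels)
            energy -= step
            if abs(step) < tol:
                break
        poles.append(energy)
    return poles


def pole_residues(poles, levels):
    """Residue amplitudes g_k such that rho = sum_k (-i/2) * g_k / (eps_k - E)."""
    return [2j / one_minus_k_prime(eps, levels) for eps in poles]


def rho_direct(energy, levels):
    """rho_nn = 1 - 1/(1 - K), the scalar form of the channel-matrix definition."""
    return 1 - 1 / one_minus_k(energy, levels)


def rho_from_poles(energy, poles, residues):
    return sum(-0.5j * g / (eps - energy) for eps, g in zip(poles, residues))


# Hypothetical levels: (E_k, Gamma_n, Gamma_gamma), all in eV.
LEVELS = [(6.0, 0.2, 0.03), (7.0, 0.3, 0.03)]

if __name__ == "__main__":
    poles = find_poles(LEVELS)
    residues = pole_residues(poles, LEVELS)
    for eps, g in zip(poles, residues):
        print("pole:", eps, "residue amplitude:", g)
```

For a single level the pole is $E_0 - i(\Gamma_n + \Gamma_\gamma)/2$ and the residue amplitude is $\Gamma_n$, as expected from the single-level form; with overlapping levels both become shifted and the residue amplitudes acquire an imaginary part. Overlapping levels with nearly degenerate poles may need better starting guesses than the ones used here.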
Upon examining the ratio of the cofactor and determinant, one can show that $\varsigma_{nc}$ defined by Eq. 5.11 is representable by a rational function of order $2(N+l)$. Thus, the generalized pole representations of cross sections based on such a rational function and its absolute square can be expressed as [24],

\[
\sigma_t = \sigma_p + \frac{1}{E} \sum_{l,J} \sum_{\lambda=1}^{N+l} \sum_{j=1}^{2} \operatorname{Re}\left\{ r^{(t)}_{l,J,j,\lambda}\, \frac{(-i)\, e^{-i 2\phi_l}}{p_\lambda^{(j)} - \sqrt{E}} \right\} \tag{5.32}
\]

\[
\sigma_x = \frac{1}{E} \sum_{l,J} \sum_{\lambda=1}^{N+l} \sum_{j=1}^{2} \operatorname{Re}\left\{ r^{(x)}_{l,J,j,\lambda}\, \frac{(-i)}{p_\lambda^{(j)} - \sqrt{E}} \right\}, \qquad x \in f, a \tag{5.33}
\]

where the pole $p_\lambda^{(j)}$ and the residue $r^{(x)}_{l,J,j,\lambda}$ are genuinely energy-independent. The inner sum reflects the behavior of the poles that appear in pairs with $p_\lambda^{(1)}$ nearly equal to $-p_\lambda^{(2)}$. The AA can be considered as a special case for which $p_\lambda^{(1)} = -p_\lambda^{(2)}$. The pole expansion here also preserves the same general form as that for the Breit–Wigner approximation, readily amenable to various existing codes. Studies have shown that these pole parameters can also be computed efficiently via the Newton–Raphson scheme provided extended precision is used. The algorithms have been coded in the WHOPPER code [24] for routine applications.

It is also noteworthy that, given these parameters, one can deduce the Humblet–Rosenfeld [25]-type parameters directly as shown in [26]. This feature further enhances its compatibility with existing codes.

5.3 Doppler-Broadening of Cross Sections

The representations of microscopic cross sections described in the previous section have been based on the assumption that the target nucleus is at rest in the laboratory system in the neutron–nucleus interaction. Realistically, the target nuclei must be subject to thermal motion at the elevated temperatures of practical interest. The procedure of taking such an effect into account in computing cross sections is generally referred to as the Doppler-broadening of cross sections. In fact, it is the first macroscopic effect one encounters in the reactor physics calculations to follow.
5.3.1 Practical Doppler-Broadening Kernels in Use

For practical applications, the classical theory of velocity distribution has generally been used while the solid state effects of quantum mechanical nature are usually considered separately. Two models have been widely used to provide the Doppler-broadening kernel required for computation of the temperature-dependent cross sections. They will be addressed in order as follows.

5.3.1.1 Ideal Gas Model

One commonly used model is based on the assumption that the velocities of the target nuclei follow the classical Maxwell–Boltzmann distribution. One attractive simplification of such a distribution derived by Solbrig [27] is of particular interest for practical applications. The resulting Doppler-broadened cross section can be expressed in terms of the following integral,

\[
\sqrt{E}\,\sigma_x\!\left(\sqrt{E},\Delta_m\right) = \int_0^\infty \left[u\,\sigma_x(u,0)\right] S\!\left(\sqrt{E},u\right) du \tag{5.34}
\]

where $\sigma_x(\sqrt{E},0)$ will henceforth be used to denote the microscopic cross section at 0 K and $S(u,u')$, referred to as Solbrig’s kernel, is given by

\[
S(u,u') = \frac{u'}{2\sqrt{\pi t}\,u}\left\{ \exp\left[-\frac{(u-u')^2}{4t}\right] - \exp\left[-\frac{(u+u')^2}{4t}\right] \right\} \tag{5.35}
\]

with $\Delta_m = 2\sqrt{t} = (KT/A)^{1/2}$ denoting the Doppler width in the $u = \sqrt{E}$ domain. As pointed out by Solbrig, Eq. 5.34 preserves the $1/\sqrt{E}$-dependence of the absorption cross section in the limit of low energy. The use of Solbrig’s kernel in conjunction with Eq. 5.34 provides the rigorous Doppler-broadened cross sections so long as the ideal gas model is assumed to be valid. It is particularly amenable to applications in conjunction with the generalized pole representation described in the previous section.

In the early period of reactor physics, the broadening kernel actually used was a Gaussian in the $E$-domain, which can be considered as a limiting case of Solbrig’s kernel defined by Eq. 5.35 when applied in the relatively high energy region, i.e.,

\[
\frac{u}{u'}\,S(u,u')\,du' \approx G(E,E')\,dE' = \frac{1}{\Delta\sqrt{\pi}} \exp\left[-\frac{(E-E')^2}{\Delta^2}\right] dE' \tag{5.36}
\]

where $\Delta = [(4KTE)/A]^{1/2}$ is specifically referred to as the Doppler width when applied in conjunction with the Gauss kernel. Hence, the corresponding Doppler-broadened cross section becomes

\[
\sigma_x(E,\Delta) \approx \int_0^\infty \sigma_x(E',0)\,G(E,E')\,dE' \tag{5.37}
\]

5.3.1.2 Accommodation of Crystalline Binding Effects via Effective Temperature Model

The fact that the thermal motion of the target nuclei may not necessarily be represented adequately by the traditional ideal gas model has been a long-standing issue in the nuclear community since the early days of reactor development. Strictly speaking, the target nucleus and the compound nucleus subsequently formed are bound in the energy states of the atom. Hence, this is an area where nuclear physics and solid state physics become intertwined. For details, readers are referred to an excellent book on the fundamental aspects of this subject by Osborn and Yip [28]. For our purposes here, a widely accepted model will be described.

The model, sometimes referred to as the “effective temperature” model, was pioneered by Lamb [29] in the study of the binding effects for purely absorbing nuclei. It was further extended by Egelstaff [30], Nelkin and Parks [31], and others [32]. Two general assumptions were used in this model. First, the motion of the nuclei about the lattice sites is harmonic in nature, characterized by its frequency distribution, which reflects the solid state properties of the lattice in question. In the harmonic approximation, all normal modes of vibration are taken to be independent. Second, the motion of nuclei is independent of that of the lattice. Of particular interest here is the case of “weak” lattice binding considered by Lamb that most closely resembles the practical situation.
The weak binding approximation is based on the assumption that the compound nucleus’ lifetime for an energy level in the nuclide is small compared to the maximum relaxation period of the lattice, the reciprocal of the maximum frequency of oscillations. Let $\omega_{\max}$ be the maximum frequency. The maximum binding energy within the crystal is given by $\hbar\omega_{\max}$, where $\hbar$ is Planck’s constant divided by $2\pi$. The “weak binding” approximation is assumed to be valid if $(\Gamma_t + \Delta)/2 \gg \hbar\omega_{\max}$. Here, $\Gamma_t$ and $\Delta$ are the total width of a Breit–Wigner resonance and the Doppler width in the energy domain, respectively, as defined previously. Physically, it is quite obvious that this condition will be met as the neutron’s energy increases.

One quantity of particular physical interest is the average energy characteristic of the thermal motion. For the ideal gas model, the average energy $\bar{E}$ is well-known from the thermodynamic nature of the gas, i.e., $\bar{E} = KT$, the product of the Boltzmann constant and temperature. The corresponding average energy using the effective temperature model is given in terms of the average over the modes of oscillation, i.e.,

\[
\bar{E} = K T_{\mathrm{eff}} = \int_0^{\omega_{\max}} \frac{\hbar\omega}{2} \coth\left(\frac{\hbar\omega}{2KT}\right) g(\omega)\, d\omega \tag{5.38}
\]

where $g(\omega)$ is the normalized frequency distribution, also known as the phonon distribution, describing the wave number per unit frequency interval, and $T_{\mathrm{eff}}$ is the effective temperature deduced from the average energy. The equation above provides the general relationship between the effective temperature and the thermodynamic temperature. Thus, all broadening kernels described earlier remain intact so long as the effective temperature is known.

It is important to realize, however, that our knowledge of the phonon distribution for various lattices of interest is, at best, sketchy. Earlier work in this area was focused mostly on the UO2 lattice [32]. Most notable results were given by Dolling et al.
[33] based on the shell model fitting the experimental dispersion curves obtained by inelastic scattering. For practical applications, one usually resorts to the much simplified model known as the Debye model, commonly used to calculate the specific heat. It is based on the assumption that $g(\omega)$ varies parabolically in $\omega$ between zero and the cut-off frequency $\omega_{\max}$. Thus, the effective temperature defined by Eq. 5.38 is characterized by $\omega_{\max}$ alone. Substituting $g(\omega) \propto \omega^2$, normalized to unity over $0 \le \omega \le \omega_{\max}$, into Eq. 5.38, one immediately obtains an equation of manageable form,

\[
T_{\mathrm{eff}} = \frac{3\theta_D}{2} \int_0^1 x^3 \coth\left(\frac{x\,\theta_D}{2T}\right) dx \tag{5.39}
\]

where $\theta_D$ is commonly referred to as the Debye temperature, defined as $\theta_D = \hbar\omega_{\max}/K$. Again, reliable information on the Debye temperature is also generally lacking. It is quite obvious that more studies in this area are needed.

5.3.2 Analytical Broadening via Doppler-Broadened Line-Shape Functions

As long as the cross sections can be expressed in the form of a pole expansion as described in Section 5.2, the Doppler-broadening of each Lorentzian-like term leads to the analytical forms commonly referred to as the Doppler-broadened line-shape functions, in which the energy and temperature dependence are explicitly defined. There are two types of approaches by which these Doppler-broadened line-shape functions can be derived. One is based on the approximate Gaussian kernel while the other is based on Solbrig’s kernel.

5.3.2.1 Traditional Doppler-Broadened Line-Shape Functions

The traditional Doppler-broadened line-shape functions based on the approximate Gauss kernel become quite obvious when the Lorentzian term-based cross sections are expressed in the complex domain. Upon broadening via Eq. 5.37, each Lorentzian term described in Section 5.2.2 becomes

\[
\left(\sqrt{\pi}\,\xi_k/2\right) W(z) = \left[\psi(x,\xi_k) + i\,\chi(x,\xi_k)\right], \qquad W(z) = \frac{i}{\pi} \int_{-\infty}^{\infty} \frac{e^{-t^2}}{z-t}\, dt \tag{5.40}
\]

where $z = (E - z_k)/\Delta$. For the SLBW and MLBW approximations, $z = (E - E_k)/\Delta + i\xi_k/2$ with $\xi_k = \Gamma_t/\Delta$.
For the AA, $z_k$ is identifiable with $\varepsilon_k$ described previously. Here, $W(z)$, referred to as the complex probability integral, is a well-known function, while the $\psi$- and $\chi$-functions, its symmetric and asymmetric components, are the traditional Doppler-broadened line-shape functions. Thus, the corresponding Doppler-broadened cross sections for the case of the SLBW approximation become

\[
\sigma_x(E,T) = \sum_{l,J} \sum_k \sigma_{0xk}\, \psi(x,\xi_k) \tag{5.41}
\]

\[
\sigma_{tR}(E,T) = \sum_{l,J} \sum_k \sigma_{0tk} \left[\cos 2\phi_l\, \psi(x,\xi_k) + \sin 2\phi_l\, \chi(x,\xi_k)\right] \tag{5.42}
\]

respectively. Other comparable approximations can also be expressed in the same general forms accordingly.

5.3.2.2 Generalization of Doppler-Broadened Line-Shape Functions

As discussed in Section 5.2.3, the rigor of the energy dependence of cross sections as defined by any formalism can always be preserved if cast into the form of a generalized pole expansion in the k-plane (or $\sqrt{E}$-domain). Therefore, the exact Doppler-broadening of each Lorentzian-like term can be carried out analytically as before if the rigorous Solbrig’s kernel is used. The substitution of Eqs. 5.32 and 5.33 into Eq. 5.34 leads to [24],

\[
\sigma_x\!\left(\sqrt{E},\Delta_m\right) = \frac{1}{E} \sum_{l,J} \sum_{\lambda=1}^{N+l} \sum_{j=1}^{2} \operatorname{Re}\left\{ r^{(x)}_{l,J,j,\lambda}\, B_x\!\left(\sqrt{E},\Delta_m,p_\lambda^{(j)}\right) \right\} \tag{5.43}
\]

\[
\sigma_{tR}\!\left(\sqrt{E},\Delta_m\right) = \frac{1}{E} \sum_{l,J} \sum_{\lambda=1}^{N+l} \sum_{j=1}^{2} \operatorname{Re}\left\{ r^{(tR)}_{l,J,j,\lambda}\, e^{-2i\phi_l}\, B_x\!\left(\sqrt{E},\Delta_m,p_\lambda^{(j)}\right) \right\} \tag{5.44}
\]

where $B_x(\sqrt{E},\Delta_m,p_\lambda^{(j)})$, the Doppler-broadened line-shape function of each pole, is given as

\[
B_x\!\left(\sqrt{E},\Delta_m,p_\lambda^{(j)}\right) = \frac{\sqrt{\pi}}{\Delta_m}\, W\!\left(\frac{\sqrt{E}-p_\lambda^{(j)}}{\Delta_m}\right) - C_x\!\left(\sqrt{E},\Delta_m,p_\lambda^{(j)}\right), \qquad x \in a, f \tag{5.45}
\]

\[
C_x\!\left(\sqrt{E},\Delta_m,p_\lambda^{(j)}\right) = \frac{2i\,p_\lambda^{(j)}}{\sqrt{\pi}\,\Delta_m^2} \exp\left(-\frac{E}{\Delta_m^2}\right) \int_0^\infty \frac{\exp\left[-t^2 - 2\sqrt{E}\,t/\Delta_m\right]}{t^2 - \left[p_\lambda^{(j)}/\Delta_m\right]^2}\, dt \tag{5.46}
\]

The quantity $C_x(\sqrt{E},\Delta_m,p_\lambda^{(j)})$ can be viewed as the low energy correction term, which vanishes exponentially as a function of $E/\Delta_m^2$. Typically, it becomes unimportant when $E \ge 1$ eV for most nuclides of practical interest. Thus, the Doppler-broadened line-shape function exhibits the same mathematical properties in the $\sqrt{E}$-domain as the traditional one in the $E$-domain except in the low energy limit.
It should be noted that, strictly speaking, Eq. 5.44 implicitly assumes that the weakly energy-dependent $\phi_l$ is unimportant in the Doppler-broadening process. For the sake of rigor, the exact broadening to account for such an effect has been developed and incorporated into the $W$-function approach for practical applications [34].

5.3.2.3 Evaluations of the Doppler-Broadened Functions

For the widely used Doppler-broadening based on the Gauss kernel, the evaluation of the $W$-function alone will suffice. For exact Doppler-broadening based on Solbrig’s kernel, the additional integral defined by Eq. 5.46 may also be required when the extremely low energy region is considered. Therefore, the evaluation of the $W$-function and that of the correction term will be described separately.

Method for Evaluating the W-Function

Two identities of the $W$-function are of particular practical interest, namely,

\[
W(-z^*) = W^*(z), \qquad W(-z) = 2e^{-z^2} - W(z) \tag{5.47}
\]

which are also referred to as the symmetry relations. Thus, computation of $W(z)$ in the first quadrant (i.e., $0 < \arg z < \pi/2$) will suffice because the values for the other quadrants can be deduced directly from these relations. For illustration purposes, two contrasting methods will be described briefly. The first method is based on the efficient algorithm developed by O’Shea and Thacher [35]. The second method is known as Turing’s method, using the Cauchy integral approach [36].

First Method

The fundamental basis of the O’Shea–Thacher algorithm [35] is to use a Taylor expansion based logic. The well-known Taylor expansion of the $W$-function, upon separation of the even and odd terms, yields the following identity,

\[
W(z) = \sum_{n=0}^{\infty} \frac{(iz)^n}{\Gamma\!\left(\frac{n}{2}+1\right)} = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{\Gamma(k+1)} + i \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{\Gamma(k+3/2)} = e^{-z^2} + \frac{2i}{\sqrt{\pi}}\, D(z) \tag{5.48}
\]

where $D(z) = e^{-z^2} \int_0^z e^{t^2} dt$ is the Dawson integral.

Thus, it is natural to divide $|z|$ into three basic regions.
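As a small numerical illustration of Eq. 5.48 and the symmetry relations of Eq. 5.47, the power series can be summed directly for moderate $|z|$. This is only a sketch of the series itself, not the O’Shea–Thacher algorithm; the function name `w_series`, the tolerance, and the stopping rule are assumptions made for this example.

```python
import cmath
import math


def w_series(z, tol=1e-17, max_terms=300):
    """Sum W(z) = sum_n (i z)^n / Gamma(n/2 + 1), practical for moderate |z|."""
    z = complex(z)
    total = 0j
    power = 1 + 0j                      # holds (i z)^n
    small_in_a_row = 0
    for n in range(max_terms):
        term = power / math.gamma(0.5 * n + 1.0)
        total += term
        # Stop once two consecutive terms are negligible.
        if abs(term) <= tol * abs(total):
            small_in_a_row += 1
        else:
            small_in_a_row = 0
        if small_in_a_row == 2:
            break
        power *= 1j * z
    return total


# On the imaginary axis, W(iy) = exp(y^2) erfc(y) is a convenient check.
y = 0.5
reference = math.exp(y * y) * math.erfc(y)
value = w_series(1j * y)
```

For large $|z|$ the alternating terms grow before they decay and the direct sum loses accuracy, which is one reason why other representations are preferred outside the small-$|z|$ region.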
In region 1 ($|z| \ll 1$), it is quite obvious that the use of the low-order terms given by Eq. 5.48 will suffice. O’Shea and Thacher [35] have demonstrated that the continued fraction method provides an exceedingly accurate and efficient tool for the other regions. In region 2 ($|z| \le 4$), the coefficients required for the continued fraction approach can be deduced from the portion of the expansion for the Dawson integral given by Eq. 5.48 via the quotient-difference algorithm developed by Henrici [37]. In region 3 ($|z| \ge 4$), there are two ways of computing the $W$-function. The original version of the O’Shea–Thacher algorithm [35] was also based on the continued fraction approach in this region, except that its coefficients via the quotient-difference algorithm were based on the asymptotic series of the $W$-function, i.e.,

\[
W(z) \sim \frac{i}{\pi} \sum_{k=0}^{\infty} \frac{\Gamma(k+1/2)}{z^{2k+1}} \tag{5.49}
\]

Alternatively, another version adopted later was based on the more straightforward Gauss–Hermite quadrature approach.

Second Method

The $W$-function can also be computed via the Cauchy integral approach originally developed by Turing [36] for computing the zeta-function. Turing’s method was later extended by Goodwin [38] to compute the integral of a meromorphic function with a Gaussian weight, i.e., of the form $\int_{-\infty}^{\infty} e^{-t^2} f(t)\,dt$, via consideration of a contour integral,

\[
I = \oint_C \frac{e^{-z^2} f(z)}{1 - e^{-2\pi i z/h}}\, dz \tag{5.50}
\]

Aside from the poles of $f(t)$, all poles of the sinusoidal function in the denominator are equally spaced with a spacing of $h$ on the real axis. The contour of integration is taken to be a rectangular box enclosing these poles. The integral of interest can be related to the contour integral by extending the contour to infinity, while the latter can be determined via the Cauchy integral theorem if the poles of $f(z)$ are also known.
The same method was first used by Bhat and Lee-Whiting [39] to compute the $W$-function and later by Fröhner [40] to broaden the zero-temperature Reich–Moore cross sections assuming that the poles were known. For the case in point, $f(t) = 1/(z-t)$. Bhat and Lee-Whiting [39] showed via this method that,

\[
W(z) = \frac{ih}{\pi} \sum_{n=-\infty}^{\infty} \frac{e^{-n^2 h^2}}{z - nh} + \frac{2P(h)\, e^{-z^2}}{1 - e^{-2\pi i z/h}} + R(h) \tag{5.51}
\]

where $P(h) = 1$, $1/2$, or $0$ for $\operatorname{Im}\{z\} < \pi/h$, $\operatorname{Im}\{z\} = \pi/h$, or $\operatorname{Im}\{z\} > \pi/h$, respectively. Mathematically, the first and the second terms in Eq. 5.51 are attributed to the poles of the sinusoidal function in the denominator of Eq. 5.50 and the single pole of $f(t)$, respectively, while the remainder $R(h)$ is of the order of $\exp(-\pi^2/h^2)$. Therefore, by choosing the value of $h$ in the range $0.5 < h < 1$, optimum efficiency and accuracy can be achieved.

Evaluation of the Correction Terms

For the sake of rigor, one may include the low-energy correction terms defined by Eqs. 5.45 and 5.46 as well if the generalized pole representation is considered. It was found that the pertinent integrals involved can be treated efficiently via the use of the semi-infinite Gauss–Hermite quadrature developed by Steen et al. [41], under the condition where these terms may play a role, i.e., small $E/\Delta_m^2$. A code for generating the weights and abscissas of such a quadrature in extended precision was developed in order to alleviate the round-off problem resulting from the recursive relations required for the higher-order orthogonal polynomials of interest. For our purpose here, a quadrature of 24 points was found to be sufficient. For details, the reader is referred to [34].

5.3.3 Direct Numerical Doppler-Broadening of Point-Wise Cross Sections

It is important to point out that the Doppler-broadening of cross sections via the traditional line-shape functions is by no means unique.
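Before turning to the numerical methods, the structure of Eq. 5.51 can be made concrete with a short sketch. It assumes $\operatorname{Im} z > 0$ (and not so small that $z$ nearly coincides with a grid point $nh$), takes $h = 0.5$, truncates the sum where the Gaussian factor is negligible, and neglects $R(h)$. The function names `w_trapezoid` and `psi_chi` are hypothetical, and the line-shape functions follow the convention of Eq. 5.40 with $z = (x + i)\xi/2$, which presumes the usual definition $x = 2(E - E_k)/\Gamma_t$.

```python
import cmath
import math


def w_trapezoid(z, h=0.5):
    """W(z) from Eq. 5.51 with R(h) neglected; assumes Im z > 0 and 0.5 <= h <= 1."""
    z = complex(z)
    if z.imag <= 0.0:
        raise ValueError("this sketch assumes Im z > 0")
    nmax = int(math.ceil(7.0 / h))           # exp(-49) is negligible beyond this
    total = 0j
    for n in range(-nmax, nmax + 1):
        total += math.exp(-(n * h) ** 2) / (z - n * h)
    result = 1j * h / math.pi * total
    limit = math.pi / h
    if z.imag < limit:
        p = 1.0
    elif z.imag == limit:
        p = 0.5
    else:
        p = 0.0
    if p:
        # Contribution of the pole of f(t) = 1/(z - t).
        result += 2.0 * p * cmath.exp(-z * z) / (1.0 - cmath.exp(-2j * math.pi * z / h))
    return result


def psi_chi(x, xi, h=0.5):
    """Line-shape functions in the convention of Eq. 5.40, with z = (x + i) xi / 2."""
    z = 0.5 * xi * complex(x, 1.0)
    value = 0.5 * math.sqrt(math.pi) * xi * w_trapezoid(z, h)
    return value.real, value.imag


# Low-temperature check: for large xi, psi tends to the Lorentzian 1 / (1 + x^2).
psi, chi = psi_chi(1.0, 200.0)
```

With $h = 0.5$ the neglected remainder is of the order of $\exp(-4\pi^2)$, so the truncation of the sum and floating-point round-off dominate the error in this sketch.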
In fact, various numerical techniques can also be utilized for computing the Doppler-broadened cross sections at any elevated temperature directly from a predetermined set of point-wise cross sections at zero temperature. Unlike the analytical broadening, where one deals only with a known function with known characteristics, one must deal with numerical algorithms for treating an extremely large body of mesh points over a large energy range of interest.

There are two types of numerical methods for reactor applications in existence, namely, the kernel-broadening method and the method based on the solution of the traditional heat equation via the finite-difference method. Conceptually, the validity of the low-order Taylor expansion within any mesh interval for the cross sections to be broadened must be implicitly or explicitly assumed in these methods. Brief discussions of this subject will be presented in the following.

5.3.3.1 Kernel-Broadening Approach

With no loss of generality, the Doppler-broadened cross section defined by Eq. 5.34 will be considered. The kernel-broadening schemes in question are based on the piece-wise integration of $\sqrt{E}\,\sigma_x(\sqrt{E},0)$ over Solbrig’s kernel defined by Eq. 5.35 in each predetermined segment. Upon changing variables, Eq. 5.34 can be cast into the following discretized form,

\[
f(y,\Delta_m) = \frac{1}{\sqrt{\pi}\,y} \sum_{i=1}^{N} \left( \int_{a_{i1}}^{b_{i1}} e^{-t^2} (t+y)\, f\left[\Delta_m (t+y), 0\right] dt - \int_{a_{i2}}^{b_{i2}} e^{-t^2} (t-y)\, f\left[\Delta_m (t-y), 0\right] dt \right) \tag{5.52}
\]

where $f(\sqrt{E},\Delta_m) = \sqrt{E}\,\sigma_x(\sqrt{E},\Delta_m)$, $y = \sqrt{E}/\Delta_m$, and $\{x_i\}$ is a set of precomputed mesh points with $x_i = \sqrt{E_i}/\Delta_m$, the $i$-th pair of integrals covering the mesh interval between $x_i$ and $x_{i+1}$. Here, the limits of integration are defined as $a_{i1} = x_i - y$, $b_{i1} = x_{i+1} - y$, $a_{i2} = x_i + y$ and $b_{i2} = x_{i+1} + y$ for these two integrals, respectively.

There are two general ways of carrying out the integration numerically in each mesh interval in question. One method is to assume that the cross section to be broadened is representable by a low-order Taylor expansion explicitly, typically no more than two terms.
The schemes based on this assumption will henceforth be referred to as the explicit kernel-broadening schemes. Alternatively, an integral of this type is obviously readily amenable to the low-order Gauss–Hermite quadrature for arbitrary limits of integration if the pertinent weights and abscissas can be computed efficiently. The latter implicitly assumes that the integrand excluding the Gaussian weight must be representable by a polynomial of reasonable order.

Explicit Kernel-Broadening Method

The integral in Eq. 5.52 can be carried out analytically if Taylor’s expansion is used, i.e.,

\[
\int_{a_{ij}}^{b_{ij}} e^{-t^2} (t \pm y)\, f\left[\Delta_m (t \pm y), 0\right] dt = \sum_{n=0}^{N} c_n^{(ij)} M_n^{(ij)}(a_{ij}, b_{ij}), \qquad j \in 1 \text{ or } 2 \tag{5.53}
\]

where the $n$-th order moment of the Gaussian is defined as $M_n^{(ij)}(a_{ij}, b_{ij}) = \int_{a_{ij}}^{b_{ij}} t^n e^{-t^2} dt$, and these moments can be readily computed recursively in terms of error functions and exponential functions. There are two comparable methods, which differ only in the domain of expansion one assumes.

Linearity in Energy

The most commonly used method [42, 43] is based on the assumption that the cross section at zero temperature varies linearly in energy between two mesh points, $x_i < x < x_{i+1}$, i.e.,

\[
\sigma_y(x,0) = \sigma_y^{(i)} + s_i \left(x^2 - x_i^2\right), \qquad s_i = \frac{\sigma_y^{(i+1)} - \sigma_y^{(i)}}{x_{i+1}^2 - x_i^2} \tag{5.54}
\]

Physically, this expansion cannot rigorously reproduce the $1/\sqrt{E}$ limit of the absorption cross section as energy approaches zero. It should also be noted that, although the linear equation in energy becomes a quadratic equation in the $\sqrt{E}$-domain, it is not equivalent to a second-order approximation from the perspective of Taylor’s expansion because the linear term is absent. The substitution of Eq. 5.54 into Eq. 5.53 leads to the explicit specification of Eq. 5.52, in which a linear combination of moments up to the fourth order is required.
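The recursive evaluation of the Gaussian moments mentioned above can be sketched as follows. Integration by parts gives $M_n = \tfrac{1}{2}\left[(n-1) M_{n-2} + a^{n-1} e^{-a^2} - b^{n-1} e^{-b^2}\right]$, started from $M_0$ (error functions) and $M_1$ (exponentials). The function name `gaussian_moments` and the sample interval are assumptions made for this illustration.

```python
import math


def gaussian_moments(a, b, nmax):
    """Return [M_0, ..., M_nmax] with M_n = integral_a^b t^n exp(-t^2) dt."""
    ea, eb = math.exp(-a * a), math.exp(-b * b)
    moments = [0.5 * math.sqrt(math.pi) * (math.erf(b) - math.erf(a))]
    if nmax >= 1:
        moments.append(0.5 * (ea - eb))
    for n in range(2, nmax + 1):
        # Integration by parts: t^n e^{-t^2} = t^(n-1) * (t e^{-t^2}).
        boundary = a ** (n - 1) * ea - b ** (n - 1) * eb
        moments.append(0.5 * ((n - 1) * moments[n - 2] + boundary))
    return moments


# Moments up to fourth order, as needed for the linear-in-energy scheme.
m = gaussian_moments(-0.5, 1.2, 4)
```

When both limits lie far out on the same tail of the Gaussian, the difference of error functions in $M_0$ suffers from cancellation; a production code would switch to complementary error functions there, which this sketch does not attempt.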
Linearity in $\sqrt{E}$

In contrast, the linear approximation for $x_i < x < x_{i+1}$ in the $\sqrt{E}$-domain, where Solbrig’s kernel was defined, can be represented by [44]

\[
x\,\sigma_y(x,0) = x_i\,\sigma_y^{(i)} + s_i\,(x - x_i), \qquad s_i = \frac{x_{i+1}\,\sigma_y^{(i+1)} - x_i\,\sigma_y^{(i)}}{x_{i+1} - x_i} \tag{5.55}
\]

Upon substituting Eq. 5.55 into Eq. 5.53, the resulting integrals in Eq. 5.52 become a linear combination of moments up to the second order. Thus, it is numerically more efficient than the former approach. It is also important to note that this approximation clearly preserves the $1/\sqrt{E}$-dependence of the absorption cross section as energy approaches zero, the necessary criterion from which Solbrig defined the Doppler-broadening given by Eq. 5.34. Extensive benchmark calculations have shown that both approximations, based on identical sets of precomputed mesh points and internally consistent coding, give practically the same results except in the low energy limit.

Implicit Kernel-Broadening Method

The basic integral for the kernel-broadening scheme defined by Eq. 5.52 immediately suggests that one simple, yet accurate, means to evaluate this type of integral is via the generalized Gauss–Hermite quadrature for integrals with finite limits of integration. One obvious advantage of the Gauss quadrature is that the $N$-th order quadrature will provide the exact value of the integral if the integrand excluding the weight function is representable by a polynomial of order $(2N-1)$. Such an approach requires the specification of the corresponding weights and abscissas which, in turn, must be determined from the associated orthogonal polynomials and their roots. For a two-point quadrature of this type, the latter can be determined analytically with equal ease as the other two methods described above using the same Gaussian moments up to third order, as demonstrated in [44]. Much better accuracy than the previous two methods can be achieved via this method using the same mesh structures, as also reported in [44].
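A two-point rule of the kind just described can be constructed directly from the moments $M_0$ to $M_3$. The sketch below is one standard construction and is not claimed to be the procedure of [44]: the monic quadratic orthogonal to $1$ and $t$ under the weight $e^{-t^2}$ on $[a,b]$ is found from a $2 \times 2$ linear system, its roots are the abscissas, and the weights follow from matching $M_0$ and $M_1$. The function names are hypothetical.

```python
import math


def gaussian_moments(a, b, nmax):
    """Return [M_0, ..., M_nmax] with M_n = integral_a^b t^n exp(-t^2) dt."""
    ea, eb = math.exp(-a * a), math.exp(-b * b)
    moments = [0.5 * math.sqrt(math.pi) * (math.erf(b) - math.erf(a))]
    if nmax >= 1:
        moments.append(0.5 * (ea - eb))
    for n in range(2, nmax + 1):
        boundary = a ** (n - 1) * ea - b ** (n - 1) * eb
        moments.append(0.5 * ((n - 1) * moments[n - 2] + boundary))
    return moments


def two_point_rule(a, b):
    """Abscissas and weights of the 2-point Gauss rule for weight exp(-t^2) on [a, b]."""
    m0, m1, m2, m3 = gaussian_moments(a, b, 3)
    # Monic polynomial t^2 + c1 t + c0 orthogonal to 1 and t.
    det = m1 * m1 - m0 * m2
    c1 = (m0 * m3 - m1 * m2) / det
    c0 = (m2 * m2 - m1 * m3) / det
    disc = math.sqrt(c1 * c1 - 4.0 * c0)
    t1, t2 = 0.5 * (-c1 - disc), 0.5 * (-c1 + disc)
    # Weights reproduce the zeroth and first moments exactly.
    w1 = (m1 - m0 * t2) / (t1 - t2)
    w2 = m0 - w1
    return (t1, t2), (w1, w2)


nodes, weights = two_point_rule(-0.5, 1.0)
```

By construction the rule integrates any cubic times the Gaussian weight exactly over the interval, which is the property exploited by the implicit scheme.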
5.3.3.2 Heat Equation Approach Based on the Finite Difference Method

It is well-known that the Doppler-broadening process is equivalent to the solution of the linear heat flow equation in a semi-infinite or infinite medium, which can be described by a second-order partial differential equation of the form,

\[
\frac{\partial^2 f(u,t)}{\partial u^2} = \frac{\partial f(u,t)}{\partial t} \tag{5.56}
\]

if the initial condition $f(u,0)$ is specified. It can be shown [44], via the use of the Fourier transform, that the solution of Eq. 5.56 is directly identifiable with Eq. 5.34 if one sets $u = \sqrt{E}$, $t = \Delta_m^2/4$ and $f(u,t) = E\,\sigma_x(\sqrt{E},\Delta_m)$.

For reactor applications, there are two finite-difference methods in use for this purpose. One is based on the implicit finite-difference method developed by Dunford and Bramblett [45] and the other is based on the explicit finite-difference method developed by Leal and Hwang [46]. For details, the reader is referred to the respective references cited.

5.4 Resonance Absorption in Homogeneous Media

Much of the earlier development in the theory of resonance absorption in nuclear reactor systems began with the treatment of such phenomena in an infinite homogeneous medium. In the absence of spatial dependence, the steady state Boltzmann transport equation that defines the balance of the neutron population at a given energy $E$ reduces to the following form, generally referred to as the slowing-down equation [47],

\[
\phi(E)\,\Sigma_t(E) = \sum_i \int dE'\, \phi(E')\,\Sigma_{si}(E')\, f_i(E' \to E) + Q(E) \tag{5.57}
\]

where $\Sigma_{si}(E)$ and $f_i(E' \to E)$ are the macroscopic elastic scattering cross section and the average scattering kernel for a given nuclide $i$, respectively. Here, $Q(E)$ denotes the neutron source of non-elastic scattering nature. For the laboratory system, where the target nucleus is usually assumed to be at rest, the average scattering kernel $f_i(E' \to E)$ is the average of $f_i(\mu_0; E' \to E)$ over all solid angles,

\[
f_i(E' \to E) = \frac{1}{4\pi} \int d\Omega'\, f_i(\mu_0; E' \to E) \tag{5.58}
\]

where $\mu_0 = \vec{\Omega}' \cdot \vec{\Omega}$ and the energy/angle correlation is determined by the kinematics of elastic collision.

5.4.1 Average Scattering Kernel for Practical Applications

For practical applications, the energy-angle correlation can be simplified considerably if the kinematics of elastic collision is cast into the center-of-mass system, where the center of mass remains stationary upon collision. In other words, the magnitudes of the initial and final neutron velocities are invariant. By utilizing such a property and the requirement of energy conservation, it can be shown that the energy-angle correlation can be specified by the following relation [47],

\[
\frac{E' - E}{E'} = \frac{1}{2}\left[1 - \left(\frac{A_i - 1}{A_i + 1}\right)^2\right] (1 - \mu_c) = \frac{1}{2}\,(1 - \alpha_i)(1 - \mu_c) \tag{5.59}
\]

where $(1 - \alpha_i)$ is the maximum fractional neutron energy loss per collision with a nuclide of atomic weight $A_i$ and $\mu_c$ is the directional cosine in the center-of-mass system. Equation 5.59 implies that the corresponding scattering kernel is identifiable with a $\delta$-function of the form,

\[
f(\mu_c; E' \to E) = \delta\left(E - E' + \frac{2A_i E'}{(1 + A_i)^2}\,(1 - \mu_c)\right) \tag{5.60}
\]

If one further assumes that the scattering is isotropic in the center-of-mass system, i.e., $d\Omega_c = 2\pi \sin\theta_c\, d\theta_c$, a condition clearly true for the relatively low energy region dominated by the s-wave resonances, one obtains,

\[
f_i(E' \to E)\,dE =
\begin{cases}
\dfrac{dE}{(1 - \alpha_i)\,E'}, & \alpha_i E' \le E \le E' \\
0, & \text{elsewhere}
\end{cases} \tag{5.61}
\]

Physically, Eq. 5.61 signifies the fact that, upon elastically colliding with nuclide $A_i$, a neutron acquires an energy $E$ with equal probability in the range $\alpha_i E' \le E \le E'$. For practical calculations, it is more convenient to examine slowing-down issues on the basis of a logarithmic scale instead of the energy scale, which is particularly attractive in conjunction with the widely used multigroup approach. This can be accomplished by introducing a new variable $u = \ln(E_0/E)$, generally referred to as “lethargy,” where $E_0$ is an arbitrary reference energy. It follows that the scattering kernel in the $u$-domain becomes

\[
f_i(E' \to E)\,dE = K_i(u - u')\,du =
\begin{cases}
\dfrac{e^{-(u-u')}}{1 - \alpha_i}\,du, & 0 \le u - u' \le \varepsilon_i \\
0, & \text{elsewhere}
\end{cases} \tag{5.62}
\]

where $\varepsilon_i = \ln(1/\alpha_i)$ is the maximum increment in lethargy per collision. Given these scattering kernels, the slowing-down equations in the respective domains are defined.

5.4.2 Characteristics of the Slowing-Down Equation

In the lethargy domain, the slowing-down equation given by Eq. 5.57 becomes,

\[
F(u) = \sum_i \int_{u-\varepsilon_i}^{u} K_i(u - u')\, h_i(u')\, F(u')\, du' + Q(u) \tag{5.63}
\]

where $F(u) = \Sigma_t(u)\,\phi(u) = E\,\Sigma_t(E)\,\phi(E)$, known as the collision density, physically signifies the total collision rate per unit volume by those neutrons within the element $du$, and $h_i(u) = \Sigma_{si}(u)/\Sigma_t(u)$ is the ratio of the macroscopic scattering cross section of a given nuclide $i$ to the macroscopic total cross section of the system. Unlike $\phi(u)$, $F(u)$ is a much smoother function of $u$ and, therefore, Eq. 5.63 is preferable when resonances are present.

From the perspective of resonance treatment, the energy region of interest is usually far away from the neutron source of non-elastic nature. Hence, it suffices to focus on the Green’s function solution of Eq. 5.63, equivalent to the case of a mono-energetic source, whereby

\[
F_0(u) = \sum_i \int_{u-\varepsilon_i}^{u} K_i(u - u')\, h_i(u')\, F_0(u')\, du' + \delta(u); \qquad F(u) = \int_0^u Q(u')\, F_0(u - u')\, du' \tag{5.64}
\]

The solution to Eq. 5.64, in turn, can be pictured as a linear combination of the form $F_0(u) = \delta(u) + F_s(u)$, where $F_s(u)$ signifies the collision density away from the mono-energetic source. Thus, it follows that

\[
F_s(u) = \sum_i \int_{u-\varepsilon_i}^{u} K_i(u - u')\, h_i(u')\, F_s(u')\, du' + \sum_i h_i(0)\, K_i(u) \tag{5.65}
\]

where $h_i(0)$ will henceforth be taken as a constant independent of $u$. It should be noted that the source function $\sum_i h_i(0) K_i(u)$ above exhibits discontinuities at $u = 0$ and $u = \varepsilon_i$. Although Eq.
5.65 is seldom used in practical calculations, it is, nevertheless, of great interest conceptually because it provides the means to illustrate the fluctuations of collision density as a function of $u$, commonly referred to as the Placzek oscillations [48], as one shall see.

5.4.2.1 Slowing-Down Density

One quantity of great practical importance is generally referred to as the slowing-down density, which represents a measure of the total number of neutrons slowed down past a given energy $E$ (or above the corresponding lethargy $u$), defined as

\[
q(u) = \sum_i \int_{u-\varepsilon_i}^{u} h_i(u')\, F(u')\, du' \int_{u}^{u'+\varepsilon_i} K_i(u'' - u')\, du'' = \sum_i \int_{u-\varepsilon_i}^{u} h_i(u')\, F(u')\, k_i(u - u')\, du' \tag{5.66}
\]

where $k_i(u - u') = [\exp(u' - u) - \alpha_i]/(1 - \alpha_i)$ is the explicit form of the inner integral on the first line of Eq. 5.66. For the special case in the absence of resonances, the slowing-down density simply becomes

\[
q(u) = \bar{\xi}\, F_s; \qquad \bar{\xi} = \sum_i h_i(0) \int_{u-\varepsilon_i}^{u} k_i(u - u')\, du' = \sum_i \left[h_i(0)\, \xi_i\right] \tag{5.67}
\]

where the constant $\bar{\xi}$ is referred to as the average increment of lethargy per collision in the system consisting of many nuclides. Differentiating Eq. 5.66 with respect to $u$ and utilizing Eq. 5.63, one obtains the following first-order differential equation,

\[
\frac{dq(u)}{du} = Q(u) - \Sigma_a(u)\,\phi(u) \tag{5.68}
\]

where $\Sigma_a(u) = \Sigma_t(u) - \Sigma_s(u)$ is the macroscopic absorption cross section of the system. If one considers a mono-energetic source $Q(u) = q_{\mathrm{asym}}\,\delta(u)$, one obtains

\[
q(u) = q_{\mathrm{asym}} \left[1 - \frac{1}{q_{\mathrm{asym}}} \int_0^u \Sigma_a(u')\,\phi(u')\, du'\right] \tag{5.69}
\]

Physically, the term inside the square brackets is commonly referred to as the resonance escape probability, while the second term therein denotes the resonance absorption probability. These equations have provided the theoretical basis for many useful methods in reactor applications. Their impact will be further addressed in later sections.

5.4.2.2 Concept of Placzek Oscillations

One fundamental phenomenon associated with the neutron slowing-down process is known as the Placzek oscillations.
It was first examined by Placzek [48] for the case of a medium with a single nuclide in the absence of resonance absorption. Two specific features of the slowing-down process are of great theoretical interest here, namely, a general description of the Placzek oscillations in the presence of many nuclides and their impact on the slowing-down equation when resonances are present. In the following discussions, the emphasis will be on the conceptual aspects of these topics.

One conceptually plausible way of providing a better understanding of this subject is to cast Eq. 5.65 into an alternative form, as first proposed by Corngold [49], via the use of the Laplace transform. Define the Laplace transform of a function and its inverse symbolically as

\[
\tilde{f}(p) = L\{f(u)\} = \int_0^\infty f(u)\, e^{-pu}\, du; \qquad L^{-1}\{\tilde{f}(p)\} = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} e^{up}\, \tilde{f}(p)\, dp = f(u) \tag{5.70}
\]

Taking the Laplace transform of Eq. 5.65, one obtains

\[
\tilde{F}_s(p) = \frac{\sum_{i=1}^{I} h_i(0)\, \tilde{K}_i(p)}{1 - \sum_{i=1}^{I} h_i(0)\, \tilde{K}_i(p)} - \frac{\sum_{i=1}^{I} \tilde{K}_i(p)\, L\{g_i(u)\, F_s(u)\}}{1 - \sum_{i=1}^{I} h_i(0)\, \tilde{K}_i(p)} = \tilde{P}(p) - \tilde{R}(p) \tag{5.71}
\]

where $g_i(u) = h_i(0) - h_i(u)$ and $I$ is the total number of nuclides in the system. In the absence of resonance absorption, $g_i(u)$ must vanish so that $\tilde{R}(p) = 0$. Thus, upon inversion, $F_s(u) = P(u)$. Here, $P(u)$ will henceforth be referred to as the Placzek function in the presence of many nuclides unless otherwise stated. In the presence of resonance structures, the inversion of Eq. 5.71 leads to an alternative form of the slowing-down equation derived by Corngold [49],

\[
F_s(u) = P(u) - \sum_{i=1}^{I} \int_0^u du'\, G_i(u - u')\, g_i(u')\, F_s(u') \tag{5.72}
\]

where the kernel $G_i(u - u')$ and the Placzek function can be defined as

\[
G_i(u) = L^{-1}\left\{ \frac{\tilde{K}_i(p)}{1 - \sum_{i'=1}^{I} h_{i'}(0)\, \tilde{K}_{i'}(p)} \right\}, \qquad P(u) = \sum_{i=1}^{I} h_i(0)\, G_i(u) \tag{5.73}
\]

Hence, Eq. 5.72 provides the direct connection between the solution to the slowing-down equation without resonances and that with resonances.
The Laplace transform of the scattering kernel can be determined analytically. The substitution of $\tilde{K}_i(p)$ into Eq. 5.71 yields

$$ L\{P(u)\} = \frac{\displaystyle\sum_{i=1}^{I} \frac{h_i(0)}{(1-\alpha_i)(p+1)} \left[1 - e^{-\varepsilon_i(1+p)}\right]}{\displaystyle 1 - \sum_{i=1}^{I} \frac{h_i(0)}{(1-\alpha_i)(p+1)} \left[1 - e^{-\varepsilon_i(1+p)}\right]} \tag{5.74} $$

By examining the above equation, the asymptotic properties of $P(u)$ and the short-range nature of its fluctuations can be readily established analytically. From the final-value property of the Laplace transform, the asymptotic limit of the Placzek function must be

$$ \lim_{u\to\infty} P(u) = \lim_{p\to 0} \left[p\,L\{P(u)\}\right] = \frac{1}{\displaystyle 1 - \sum_{i=1}^{I} h_i(0)\,\frac{\alpha_i\varepsilon_i}{1-\alpha_i}} = \frac{1}{\bar{\xi}} \tag{5.75} $$

where $\bar{\xi}$ is the average lethargy increment per collision also defined in Eq. 5.67. Similarly, the short-range nature of the fluctuations in $P(u)$ and/or $G_i(u)$ can be determined via the expansion of Eq. 5.74 for large $p$. For illustration purposes, consider a single nuclide in the limit of large $p$, so that Eq. 5.74 can be represented by the following series,

$$ \lim_{p\to\infty} L\{P(u)\} = \frac{1}{1-\alpha}\,\frac{1 - \alpha e^{-\varepsilon p}}{p - \frac{\alpha}{1-\alpha}} \sum_{n=0}^{\infty} (-1)^n \left(\frac{\alpha}{1-\alpha}\right)^n \frac{e^{-n\varepsilon p}}{\left(p - \frac{\alpha}{1-\alpha}\right)^n} \tag{5.76} $$

where the subscript $i$ is dropped for convenience. Note that the presence of $\exp(-n\varepsilon p)$ indicates that its inverse must contain a unit step function for each interval $n\varepsilon$. The inversion of Eq. 5.76 can be carried out term by term and, upon rearrangement, one obtains

$$ P(u) = \frac{1}{1-\alpha}\left\{ e^{\frac{\alpha}{1-\alpha}u} + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \left(\frac{\alpha}{1-\alpha}\right)^n e^{\frac{\alpha}{1-\alpha}(u-n\varepsilon)} \left[n(1-\alpha) + (u-n\varepsilon)\right] (u-n\varepsilon)^{n-1}\,H(u-n\varepsilon) \right\} \tag{5.77} $$

where $H(u-n\varepsilon)$ is the Heaviside function (or unit step function).

Eq. 5.77 reproduces the results originally derived by Placzek when the sum is written out explicitly. A similar approach can be extended readily to $P(u)$ and/or $G_i(u)$ for a mixture with many nuclides. It can be shown that the results so obtained are identifiable with those given in [50]. Some results of $G_i(u)$ and $P(u)$ for a typical fast reactor composition are given both quantitatively and graphically in [50].
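The single-nuclide series can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes the term-by-term inversion is written with the bracket $[n(1-\alpha) + (u - n\varepsilon)]$, and the function names `placzek` and `xi` and the choice of mass number 12 are illustrative only. It evaluates $P(u)$ interval by interval and prints $\xi P(u)$, which should start above unity, drop at $u = \varepsilon$, and settle toward 1 after a few scattering intervals.

```python
import math

def placzek(u, alpha):
    """Single-nuclide Placzek function P(u), summed over all intervals n*eps <= u."""
    eps = -math.log(alpha)              # maximum lethargy gain per collision
    a = alpha / (1.0 - alpha)
    total = math.exp(a * u)             # n = 0 term, present on every interval
    n = 1
    while u - n * eps >= 0.0:           # Heaviside factor H(u - n*eps)
        s = u - n * eps
        total += ((-1) ** n / math.factorial(n)) * a ** n * math.exp(a * s) \
                 * (n * (1.0 - alpha) + s) * s ** (n - 1)
        n += 1
    return total / (1.0 - alpha)

def xi(alpha):
    """Average lethargy gain per collision, xi = 1 - alpha*eps/(1 - alpha)."""
    eps = -math.log(alpha)
    return 1.0 - alpha * eps / (1.0 - alpha)

if __name__ == "__main__":
    alpha = (11.0 / 13.0) ** 2          # assumed example: mass number A = 12
    eps = -math.log(alpha)
    for m in (0.5, 0.999, 1.001, 1.5, 2.5, 6.0):
        u = m * eps
        print(f"u = {m:5.3f} eps   xi * P(u) = {xi(alpha) * placzek(u, alpha):.4f}")
```

The loop stops at the last interval that contains $u$, so only a finite number of terms is ever summed and no truncation parameter is needed.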
The fluctuations are noticeably less striking in the first few intervals when other scatterers with much smaller atomic weights are present. From Eq. 5.72, it is evident that the fluctuations in the kernel $G_i(u-u')$ will affect the collision density as well when resonances are present. In the relatively high energy region, where the extent of a resonance is usually small compared with the corresponding $\varepsilon$, only the first scattering interval is of practical interest from the perspective of resonance integral considerations. As a general rule, under the latter condition, $F(u)$ exhibits either a bump above its asymptotic value or a drop below it, depending on whether $g_i(u)$ assumes predominantly positive or negative values. As discussed in [50], the former represents the scenario where the neutron width $\Gamma_n$ is much greater than the other partial widths, whereas the latter represents the reverse scenario. The impact can be substantial when the neutron width is very large. For more details, the reader is referred to [50].

5.4.3 Resonance Integrals and Their Applications

The resonance integral concept was introduced in the early stages of reactor physics development as an effective means to account for resonance absorption attributed to a handful of low-lying s-wave resonances of a few actinides using the Breit–Wigner approximation. The general concept is readily extendable to other types of cross-section representations when cast into the form of the generalized pole expansion described in Section 5.2.

5.4.3.1 Traditional Resonance Integral Concept

In addition to the use of the Breit–Wigner approximation, the three main assumptions used in earlier days, which eventually led to four widely used approximations, are: (1) flux recovery between resonances; (2) $\phi(0) \approx 1$ above each resonance; (3) the collision density is taken to be constant for all nuclides in the mixture except for the resonant isotope in question.
Hence, the slowing-down equation at energies far below the source energy becomes

$$ \Sigma_t(u)\,\phi(u) = \Sigma_m + \int_{u-\varepsilon}^{u} du'\,\frac{e^{-(u-u')}}{1-\alpha}\,\Sigma_{sr}(u')\,\phi(u') \tag{5.78} $$

where $\Sigma_m$ is the energy-independent macroscopic scattering cross section of the non-resonance nuclides, and the subscripts for various parameters of the resonant isotope will henceforth be dropped for convenience. The simplified equation above also implies that the absorption of the medium is due entirely to the resonance under consideration. These assumptions lead immediately to various means to determine the flux and absorption rate for each Breit–Wigner resonance, from which the slowing-down density can be determined in the context of Eq. 5.69. The resonance integral, defined as the absorption rate per atom, i.e.,

$$ RI = \int_0^{\infty} \sigma_x(u)\,\phi(u)\,du \tag{5.79} $$

has been used as one of the major tools in studies of resonance absorption.

5.4.3.2 Various Resonance Integral Approximations

During the period preceding the mid-1960s, there were four widely used approximations for treating the resonance integrals of well-isolated Breit–Wigner resonances. Brief discussions of these methods are presented below.

Narrow Resonance Approximation (NR)

If the extent of the resonance is small compared to the maximum energy loss per collision, the resonance contribution to the integral term in Eq. 5.78 becomes negligible, so that Eq. 5.78 reduces to

$$ \phi(u) = \frac{\Sigma_p}{\Sigma_t(u)} = \frac{\Sigma_p}{\Sigma_p + \Sigma_{Rt}(u)} \tag{5.80} $$

where $\Sigma_{Rt}(u)$ is the total macroscopic resonance component and $\Sigma_p = \Sigma_m + \Sigma_{pr}$ is the total macroscopic potential scattering cross section. The corresponding resonance integral becomes

$$ (RI)_{NR} = \frac{\sigma_p\,\Gamma_{ar}}{E_r \cos 2\varphi_l}\,J(\theta_r, \beta_r, a_r) = \frac{\sigma_p\,\Gamma_{ar}}{E_r \cos 2\varphi_l}\;\frac{1}{2} \int_{-\infty}^{\infty} \frac{\psi(\theta_r, x)\,dx}{\beta_r + \psi(\theta_r, x) + a_r\,\chi(\theta_r, x)} \tag{5.81} $$

where $\sigma_p = \Sigma_p/N_r$, $\Gamma_{ar} = \Gamma_{\gamma r} + \Gamma_{fr}$, $\beta_r = \Sigma_p/(\Sigma_{0r}\cos 2\varphi_l)$ and $a_r = \tan 2\varphi_l$. Here, $\Sigma_{0r}$ is the total macroscopic peak resonance cross section of level $r$.
It is worth noting that the hard-sphere phase shift angle, $\varphi_l$, was usually assumed to be small in the low energy range, so that $\cos 2\varphi_l \approx 1$ and $\sin 2\varphi_l \approx 2\varphi_l$. In fact, the asymmetric Doppler-broadened line-shape function was often ignored for the sake of expediency in many earlier works [6]. Since $J(\theta_r, \beta_r)$, in the absence of $a_r$, is readily amenable to a precomputed table in a two-dimensional array, it was most widely used in earlier days. It will be shown that the integral of the form given in Eq. 5.81 can be computed efficiently without resorting to tabulation.

Infinite Mass Approximation (NRIM or WR)

In the limit of infinite mass, or when the extent of the resonance is much wider than the maximum lethargy increment per collision, i.e., $\varepsilon \to 0$ in Eq. 5.78, the flux and its corresponding resonance integral also reduce to the simple forms [6],

$$ \phi(u) = \frac{\Sigma_m}{\Sigma_m + \Sigma_a(u)}, \qquad (RI)_{NRIM} = \frac{\sigma_m\,\Gamma_t}{E_r}\,J(\theta_r, \bar{\beta}_r) \tag{5.82} $$

respectively, where $\bar{\beta}_r = \Sigma_m/(\Sigma_{0r}\,\Gamma_a/\Gamma_t)$ and $\sigma_m$ is the diluent cross section per absorber atom.

Intermediate Resonance Approximation (IR)

Some means to bridge the gap between the two approximations given above is clearly needed. One widely used approximation that serves this purpose is the IR approximation originally proposed by Goldstein and Cohen [51]. From the foregoing discussions, it is reasonable to conjecture that the neutron flux across a resonance generally resembles the following approximate form,

$$ \phi(u) = \frac{\Sigma_m + \lambda\,\Sigma_{pr}}{\Sigma_m + \Sigma_a(u) + \lambda\,\Sigma_{sr}} \tag{5.83} $$

where $\lambda$ is a parameter characteristic of the resonance to be determined. This expression leads to the NR and WR approximations as $\lambda$ approaches 1 and 0, respectively. Furthermore, the corresponding resonance integral based on such a flux shape must retain the same general forms defined by Eqs. 5.81 and 5.82 provided that $\lambda$ is insensitive to energy. By substituting the Doppler-broadened Breit–Wigner cross sections into Eq.
5.83, one obtains the same general form of the resonance integral as that for the NR-approximation with different arguments, i.e.,

$$ (RI)_{IR} = \frac{\sigma_p^{(\lambda)}\,\Gamma_{ar}}{E_r}\,J\!\left(\theta_r, \beta_r^{(\lambda)}, a_r^{(\lambda)}\right) \tag{5.84} $$

where $\sigma_p^{(\lambda)} = (\Sigma_m + \lambda\Sigma_{pr})/N_r$, $a_r^{(\lambda)} = \lambda\,a_r\,\Gamma_t/(\Gamma_{ar} + \lambda\Gamma_{nr})$ and $\beta_r^{(\lambda)} = \beta_r\,\Gamma_t/(\Gamma_{ar} + \lambda\Gamma_{nr})$, respectively. Thus, it amounts to redefining the parameters for the expression based on the NR-approximation defined by Eq. 5.81. Here, the parameter $\lambda$ must reflect the higher-order effects of the slowing-down equation. Goldstein and Cohen [51] argued that Eq. 5.83 can be viewed as the first-order iterant of the integral equation of the Fredholm type defined by Eq. 5.78, i.e., $\phi^{(1)} = \phi(u)$. Substituting $\phi^{(1)}(u)$ into Eq. 5.78, one obtains the second-order solution, denoted by $\phi^{(2)}(u)$. If the iterative process converges rapidly, $\phi^{(1)}(u)$ and $\phi^{(2)}(u)$ must not be significantly different. Therefore, one plausible criterion for determining $\lambda$ is to set

$$ \int_0^{\infty} \Sigma_{ar}(u)\,\phi^{(1)}(u)\,du = \int_0^{\infty} \Sigma_{ar}(u)\,\phi^{(2)}(u)\,du \tag{5.85} $$

From this transcendental equation, one may deduce the value of $\lambda$ provided that the integration on the right-hand side can be carried out analytically into a manageable form. One obvious problem that hinders such a procedure is its complexity when the Doppler-broadened function along with its asymmetric component is considered. If one neglects the inherent temperature dependence and uses the Lorentzian shape for the resonance, a procedure commonly used, $\lambda$ can be determined explicitly.

Direct Numerical Approach (Nordheim's Method)

In the mid-1960s, the availability of modern computers made possible a relatively rigorous treatment of the isolated resonance integral without resorting to the significant approximations and complications illustrated by the IR approximation. One such method widely used in thermal reactor applications, especially in the United States, is that pioneered by Nordheim [52].
Nordheim's method was intended for the treatment of the resonance integral in two-region repeated cells embedded in an infinite reactor lattice via the collision probability method. For the purposes of this section, it suffices to focus only on its basic algorithm for treating the slowing-down equation and resonance integral in an infinite homogeneous medium. The subject of collision probabilities will be addressed separately in the next section. This method amounts to a breakup of the resonance integral into the following form,

$$ (RI)_{Nord} = \int_0^{\infty} \sigma_x(u)\,\phi(u)\,du = \int_{u_1}^{u_2} \sigma_x(u)\,\phi(u)\,du + \Delta I \tag{5.86} $$

where the lethargies $u_1$ and $u_2$ correspond to the predetermined energy boundaries $E_1$ and $E_2$, respectively, around the resonance peak, with $E_j = E_0 \pm m(\Gamma_p/2)$, $j \in \{1, 2\}$. Physically, the interval is taken to be an "m" multiple of the "practical" width $\Gamma_p = \Gamma_t/(2\beta^{1/2})$, approximately equivalent to the "half width" of the integrand based on the NR or WR approximations. If $m\Gamma_p$ is taken to be much greater than the Doppler width, where the Doppler-broadened line-shape functions approach their Lorentzian limits, $\Delta I$, consisting of two integrals attributed to the tails on both sides of the resonance outside the interval $(u_1, u_2)$, becomes analytically integrable.

Evaluating the integral defined in Eq. 5.86 requires the solution of the slowing-down equation given by Eq. 5.78 at each mesh point $u_j$ between $u_1$ and $u_2$, as well as the subsequent resonance integral itself. Thus, it amounts to the evaluation of a double integral, which can be accomplished via Simpson's integration scheme with equally spaced mesh points, the mesh spacing being taken much smaller than the maximum increment of lethargy per collision for the resonance absorber in question. Nordheim [52] showed that the discretized integrand for the integral defined in Eq. 5.86 can be readily obtained recursively.
Unlike other approximations, this method preserves the rigor of the flux behavior within the crucial region around the peak of the Breit–Wigner resonance. It was eventually superseded by other more rigorous methods during the emergence of the fast reactor program when more advanced computational tools became available.

5.4.4 Various Developments Motivated by the Emergence of the Fast Reactor Program

The emergence of fast reactor development and modern computing facilities had a significant impact on our philosophies for treating resonance phenomena in practical applications. As discussed in Section 5.1.1, the former casts the role of resonance cross sections into a somewhat different light, while the latter makes possible the development of many rigorous methods for routine applications unimaginable in earlier days.

There were several challenges directly associated with fast reactor development. For resonance treatments in homogeneous media, there are three major challenges. First, one must account for overlap effects resulting from neighboring resonances, either from the same nuclide or from different nuclides, even if the Breit–Wigner approximation is assumed. Second, one must also account for the spectral effects resulting from resonance scattering of intermediate weight nuclides. Third, one must deal with compatibility issues of the vastly improved nuclear data and their representations in conjunction with the existing reactor physics concepts and codes based on the Breit–Wigner approximation. As discussed in Section 5.2.3, the latter can be resolved via the use of the generalized pole representation. Hence, it suffices to focus on various developments in the first area for our purposes here.

5.4.4.1 Generalization and Computation of the J-Integral

Our shift of interest to the higher energy region clearly justifies the extensive use of the NR-approximation for treating many sharp resonances of actinides.
One way to handle these issues is to generalize the integral within the context of the NR-approximation. If the total macroscopic resonance cross section defined in Eq. 5.80 is taken to be a linear combination of all Breit–Wigner resonances in the system, the NR-approximation at the $k$-th resonance can be cast into the following form via partial fractions [53],

$$ \phi(u) = \Sigma_p \left( \frac{1}{\Sigma_p + \Sigma_{tk}(u)} - \frac{\sum_{k'\neq k} \Sigma_{tk'}(u)}{\left[\Sigma_p + \Sigma_{tk}(u)\right]\left[\Sigma_p + \Sigma_{tk}(u) + \sum_{k'\neq k} \Sigma_{tk'}(u)\right]} \right) \tag{5.87} $$

where $k'$ denotes the neighboring resonances. Thus, the corresponding generalized $J$-integral, referred to as the $J^*$-integral, must exhibit the following form,

$$ J_k^* = J(\theta_k, \beta_k, a_k, b_k) - \sum_{k'\neq k} O_{kk'} \tag{5.88} $$

where

$$ J(\theta_k, \beta_k, a_k, b_k) = \frac{1}{2} \int_{-\infty}^{\infty} \frac{\psi(\theta_k, x_k) + b_k\,\chi(\theta_k, x_k)}{\beta_k + \psi(\theta_k, x_k) + a_k\,\chi(\theta_k, x_k)}\,dx_k \tag{5.89} $$

and $O_{kk'}$ is referred to as the overlap term attributed to the $k$-th resonance, with the integrand deducible from Eq. 5.87 in terms of the Doppler-broadened line-shape functions. The principal component $J(\theta_k, \beta_k, a_k, b_k)$ represents the contribution from the $k$-th resonance alone. The parameter $b_k = 0$ is taken when used in conjunction with the capture and fission resonance integrals, so that it is consistent with Eq. 5.81. For total and/or scattering resonance integrals, one sets $b_k = a_k$. For details, the reader is referred to the discussions given in [53].

It is important to point out that such an integral is composition-dependent and the required calculations must be carried out at run time. Hence, an efficient method for such a purpose is needed. One proven method that has been extensively used in fast reactor applications is based on the special form of the Gauss–Jacobi quadrature, sometimes also referred to as the Gauss–Chebyshev quadrature. While the detailed discussions are given in [53], some general features of this method will be discussed briefly for our purposes here.
The rationale for this approach is based on the fact that the Doppler-broadened line-shape functions can become relatively well-behaved in a new domain upon changing the variable of integration. One simple way to accomplish this is via either one of the following two types of rational transformations [53],

$$ u^2 = \frac{C^2 x_k^2}{1 + C^2 x_k^2} \qquad \text{or} \qquad u^2 = \frac{1}{1 + C^2 x_k^2} \tag{5.90} $$

where $C$ is a constant to be chosen. With no loss of generality, an integral of the following form can be cast into a form readily amenable to Gauss–Jacobi quadrature; with the transformation $u^2 = 1/(1 + C^2 x_k^2)$, for which $dx_k = du/(C u^2\sqrt{1-u^2})$ in magnitude, one has

$$ \int_{-\infty}^{\infty} f\big(\psi(x_k, \theta_k), \chi(x_k, \theta_k)\big)\,dx_k = \int_{-1}^{1} \frac{du}{\sqrt{1-u^2}} \left[ \frac{1}{C}\,\frac{f\big(\psi(x_k, \theta_k), \chi(x_k, \theta_k), \theta_k\big)}{u^2} \right] = \frac{\pi}{C N} \sum_{n=1}^{N} \frac{f\big(\psi(x_k(u_n), \theta_k), \chi(x_k(u_n), \theta_k), \theta_k\big)}{u_n^2} + R_N \tag{5.91} $$

where $u_n = \cos[(2n-1)\pi/2N]$ and $N$ is the total number of points considered. For such a quadrature to be effective, the part of the integrand inside the brackets in the equation above must be a smooth function of the new variable $u$, so that it can be accurately approximated by a low-order Chebyshev polynomial expansion. For computation of the integral $J(\theta_k, \beta_k, a_k, b_k)$, it was shown in [53] that only a few mesh points are required to ensure accurate results if the constant $C$ is carefully chosen in various ranges of $\theta_k$ and $\beta_k$. The same algorithm is equally applicable to the evaluation of the overlap correction term, which usually consists of a few contributing neighboring resonances, if a larger number of mesh points is used. This algorithm has been incorporated into the MC$^2$-2 code [54] for routine applications.
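The quadrature can be illustrated on a case with a known closed form. The sketch below is an assumption-laden example rather than the algorithm of [53]: it uses the zero-temperature (Lorentzian) limit $\psi = 1/(1+x^2)$ with $a_k = b_k = 0$, for which $J = \pi/[2\sqrt{\beta(1+\beta)}]$, and applies the Gauss–Chebyshev rule through the mapping $u^2 = 1/(1+C^2x^2)$. The function names are hypothetical, and an even $N$ is required here so that no node falls on $u = 0$.

```python
import math

def gauss_chebyshev_infinite(f, C, N):
    """Approximate the integral of f(x) over (-inf, inf) with u^2 = 1/(1 + C^2 x^2)."""
    if N % 2:
        raise ValueError("use an even N so that no node falls on u = 0")
    total = 0.0
    for n in range(1, N + 1):
        u = math.cos((2 * n - 1) * math.pi / (2 * N))
        x = math.sqrt(1.0 - u * u) / (C * u)    # inverse map; x takes the sign of u
        total += f(x) / (u * u)
    return math.pi * total / (C * N)

def j_lorentz(beta, C=1.0, N=32):
    """J(beta) for a Lorentzian line shape (assumed zero-temperature limit)."""
    psi = lambda x: 1.0 / (1.0 + x * x)
    return 0.5 * gauss_chebyshev_infinite(lambda x: psi(x) / (beta + psi(x)), C, N)

def j_lorentz_exact(beta):
    """Closed form of the same integral, used only as a reference value."""
    return 0.5 * math.pi / math.sqrt(beta * (1.0 + beta))

if __name__ == "__main__":
    beta = 0.1
    for C, N in ((1.0, 8), (1.0, 32), (math.sqrt(beta / (1.0 + beta)), 2)):
        err = abs(j_lorentz(beta, C, N) - j_lorentz_exact(beta))
        print(f"C = {C:.4f}  N = {N:2d}  abs. error = {err:.2e}")
```

In this special case the choice $C^2 = \beta/(1+\beta)$ makes the bracketed integrand constant in $u$, so even two points reproduce the closed form to round-off, which illustrates why the choice of $C$ matters.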
5.4.4.2 Connections Between the Resonance Integral and Traditional Multigroup Cross Section Processing

With no loss of generality, the multigroup cross section for a given reaction process $x$ in a predetermined lethargy group $\Delta u_g = u_g - u_{g-1}$ can be expressed in terms of the resonance integrals within this group boundary as follows [53],

$$ \tilde{\sigma}_x^{(g)} = \frac{\sigma_p^{(g)} \sum_{k\in g} \Gamma_{xk}\,J^*_{xk}\,F_k / E_{0k}}{\Delta u_g\,f_g} \tag{5.92} $$

where $f_g$, sometimes referred to as the flux correction factor, is defined as

$$ f_g = \langle F_k \rangle - \frac{1}{\Delta u_g} \sum_{k\in g} \frac{\Gamma_{tk}\,J^*_{tk}\,F_k}{E_{0k}} \tag{5.93} $$

and $F_k$, the collision density at the $k$-th resonance, is generally taken to be constant.

There are two ways in which the multigroup cross sections can be deployed in reactor applications. One simple approach is the Bondarenko scheme, in which the self-shielding factors for a chosen group structure are precomputed by using Eq. 5.92, without the overlap term, as a function of $\sigma_p^{(g)}$ at various temperatures, with the results tabulated and stored in one-dimensional arrays. In lieu of the overlap term, the quantity $\sigma_p^{(g)}$ carries the approximate composition dependence of the multigroup cross sections, so that this scheme provides an efficient means for survey-type calculations. Another approach is to process these multigroup cross sections taking into account the actual composition dependence before the results are passed on to neutronic calculations [54].

As mentioned earlier, accurate accounts of compositions involving many nuclides with resonance structures are extremely important in fast reactor applications, and the multigroup cross sections must undergo further refinements prior to their deployment. For our purposes here, it suffices to present the rationale for two widely used methods that provide a better account of the global spectral effects.
Ultra-Fine Group/Fundamental Mode Approach

The ultra-fine group (ufg) approach was pioneered by Hummel [12] in order to account for the fine spectrum effects attributed to resonances of nuclides with intermediate atomic weights, which were not accounted for in the earlier resonance theory. It was intended specifically for the treatment of resonance structures of the structural and metal coolant materials present in a fast reactor composition. The dominant resonances of these materials are characterized by relatively wide neutron widths not sensitive to Doppler broadening. One conceptually simple means to deal with this issue is to divide the entire energy span of interest into 2,000 ufgs with equally spaced lethargy widths $\Delta u = 1/120$, approximately equivalent to half of the maximum lethargy increment per collision of an actinide. One way to account for fine structure effects on the broad group cross sections is to use the ufg flux, computed from the predetermined ufg cross sections, as the tool to collapse the latter into the desired group structure. This can be accomplished in two steps. First, compute the ufg cross sections according to Eq. 5.92. Second, once the ufg cross section set for a given composition is generated, the ufg spectrum can be computed via the usual fundamental mode spectrum calculations utilizing the consistent or inconsistent $P_N$ or $B_N$ approximation for diffusion equations widely used in reactor applications. This algorithm was coded in the MC$^2$-2 code [54].

Continuous Slowing-Down Approach

The continuous slowing-down approach is an alternative in which the ufg weighting spectrum is determined by solving the slowing-down density equation given by Eq. 5.66. It is based on the rationale that the effects attributed to the relatively smooth-varying cross sections and those attributed to the sharp resonances can be treated separately, a method particularly amenable in conjunction with the resonance integral concept.
If the slowing-down density defined by Eq. 5.66 can be determined in the absence of sharp resonances, the corresponding local slowing-down density with sharp resonances, and thus the local flux, can also be specified via the attenuation of the former using the resonance escape probability in the same context defined by Eq. 5.69. A comprehensive review of this subject was presented by Stacey [55]. For our purposes here, it suffices to focus only on the conceptual basis of this approach.

The best known approximation for solving Eq. 5.66 for the case of relatively slow-varying cross sections was pioneered by Goertzel and Greuling [56] using the synthetic kernel approach. Their rationale can also be viewed as a natural consequence of applying a low-order Taylor expansion to the quantity $\Sigma_s(u)\phi(u)$, the scattering component of the collision density [55]. The substitution of the expansion into Eq. 5.66 yields

$$ q(u) = \sum_{n=0}^{\infty} \left( \sum_i \frac{(-1)^n}{n!}\,m_n^{(i)}\,\frac{d^n}{du^n}\left[\Sigma_{si}(u)\,\phi(u)\right] \right), \qquad m_n^{(i)} = \int_{u-\varepsilon_i}^{u} (u-u')^n\,k_i(u-u')\,du' \tag{5.94} $$

where $k_i(u-u')$ is defined in Eq. 5.66 and $m_n^{(i)}$ is the $n$-th order moment for the $i$-th nuclide. By retaining the first two terms in the expansion and using the relation given by Eq. 5.68, the resulting first-order differential equation for $q(u)$ can be represented in terms of moderating parameters that depend on the low-order $m_n^{(i)}$ and local cross sections. In the absence of sharp resonances, such an equation can be solved readily.

Because of its importance in fast reactor applications, a great deal of improvement has since been made to the original version by Goertzel and Greuling [56], most notably by Stacey [55]. The improved version of Stacey [55] was based on a somewhat different low-order Taylor expansion from that given by Eq. 5.94. Instead of $\Sigma_s(u)\phi(u)$, a similar expansion was made on $\Sigma_t(u)\phi(u)$, the total collision density, which conceptually exhibits smoother behavior than the former.
By so doing, the same type of first-order differential equation as the former was derived, except that all moderating parameters must be redefined. For fast reactor applications, higher-order Legendre moments were also included in computing the moderating parameters. In particular, this improved method has been incorporated into the MC$^2$-2 code [54] as an option for computing the ufg fundamental mode spectrum in the resolved energy range. For details, the reader is referred to [54] and [55].

5.4.4.3 Rigorous Treatment of Resonance Absorption via Numerical Means

There exist various degrees of inherent limitations in all resonance integral methods described so far. In general, they have two limitations in common. First, the rigor in the treatment of the slowing-down equation is lacking, especially when resonances of many nuclides are present. Second, the range of integration is generally taken from minus infinity to infinity without due consideration of the finite group structures in conjunction with the multigroup applications downstream. This gives rise to the so-called boundary effects. These limitations, along with the need for accurate treatment of heterogeneous effects of reactor lattices to be addressed in the next section, provided a strong motivation for the development of more rigorous methods particularly useful for benchmarking purposes.

For our purpose here, one rigorous numerical treatment of the resonance absorption in homogeneous media will be presented. The method pioneered by Kier [57] and later improved by Olson [58], sometimes referred to as the "hyper-fine group" method, is believed to be tailor-made for such a purpose. For illustration purposes, first let us consider the simplest case involving only one nuclide. The rationale is to divide a given scattering interval into many equally spaced hyper-fine groups (hfgs) with spacing $\Delta u$ much smaller than the extent of the resonances under consideration.
This allows us to describe the flux defined by Eq. 5.63 for any given group, say $k$, discretely if the elastic scattering process for the hfgs within a span of $\varepsilon_i$ is known. Thus, it suffices to isolate the neutronic balance for one scattering interval in the presence of a single nuclide for illustration purposes. Let the total number of hfgs within the span of $\varepsilon$ below the hfg in question be

$$ L = \varepsilon/\Delta u; \qquad \alpha = \exp(-L\,\Delta u) \tag{5.95} $$

The neutronic balance in the hfg sense can be achieved via the use of the "effective" scattering kernel according to Kier [57]. Physically, if the scattering kernel $K(u-u')$ is viewed as the probability density function (p.d.f.), Kier's "effective" scattering kernel [57] can be construed as the corresponding cumulative density function (c.d.f.) as a function of the successive hfgs within $\varepsilon$ above the hfg in question. If $P_l$ denotes the c.d.f. for the $l$-th hfg above the initial group with lower boundary $u_0$, one can express it as

$$ P_l\,\Delta u = \begin{cases} \tilde{K}_l\,\Delta u, & 1 \le l \le L-1 \\ \tilde{K}_{LST}\,\Delta u + \tilde{K}_s\,\Delta u, & l = L \end{cases} \tag{5.96} $$

where the c.d.f. for $l = L$ represents the probability of neutrons initiated at $u_0$ reaching the hfg in question and must take into account in-group scattering of that hfg in order to preserve neutron balance. These effective scattering kernels can be specified in the form of double integrals,

$$ \tilde{K}_l\,\Delta u = \frac{1}{1-\alpha} \int_{u_0}^{u_0+\Delta u} du \int_{u_0 - l\Delta u}^{u_0-(l-1)\Delta u} e^{-(u-u')}\,du' = \tilde{K}_1\,\Delta u\;e^{-(l-1)\Delta u} \tag{5.97} $$

$$ \tilde{K}_{LST}\,\Delta u = \frac{1}{1-\alpha} \int_{u_0}^{u_0+\Delta u} du \int_{u - L\Delta u}^{u_0-(L-1)\Delta u} e^{-(u-u')}\,du' = \tilde{K}_L\,\Delta u - \alpha\,\tilde{K}_s\,\Delta u \tag{5.98} $$

$$ \tilde{K}_s\,\Delta u = \frac{1}{1-\alpha} \int_{u_0}^{u_0+\Delta u} du \int_{u_0}^{u} e^{-(u-u')}\,du' = \frac{1}{1-\alpha}\left(\Delta u - 1 + e^{-\Delta u}\right) \tag{5.99} $$

Here, $\tilde{K}_s\,\Delta u$ is the probability that the scattering is in-group. It can be readily shown that the sum of $P_l$ over all $l$ is equal to unity, as expected. If one denotes the flux per unit lethargy for a given group $k$ by $\phi_k$ and the corresponding collision density per unit lethargy by $S_k$, the corresponding slowing-down equation can be discretized as follows:

$$ S_k = \left( \sum_{l=1}^{L} \tilde{K}_l\,(\Sigma_s\phi)_{k-l} - \alpha\,\tilde{K}_s\,(\Sigma_s\phi)_{k-L} \right) + (\Sigma_s\phi)_k\,\tilde{K}_s = S_k^{(in)} + S_k^{(s)} \tag{5.100} $$

where $S_k^{(in)}$ and $S_k^{(s)}$ physically signify the in-scattering source from other hfgs and the self-scattering source of group $k$, respectively. One potential issue is the evaluation of the in-scattering source when an exceedingly large number of hfgs is considered. Kier [57] showed that the computational efficiency could be significantly enhanced via the use of the following recurrence relation, derivable from the properties of $\tilde{K}_l$ given by Eq. 5.97, i.e.,

$$ S_k^{(in)} = e^{-\Delta u}\,S_{k-1}^{(in)} - \left(\tilde{K}_1 - \tilde{K}_s\,e^{-\Delta u}\right)\alpha\,(\Sigma_s\phi)_{k-1-L} + \tilde{K}_1\,(\Sigma_s\phi)_{k-1} - \alpha\,\tilde{K}_s\,(\Sigma_s\phi)_{k-L} \tag{5.101} $$

Hence, for a scattering interval, one only needs to deal with four terms coming from the lethargy groups below $k$ (or energy groups above $k$). For practical applications in conjunction with the multigroup approach, the calculation usually begins in the energy region slightly above the resolved energy region, where the cross sections are taken to be constant. Once $S_k^{(in)}$ is known, the corresponding collision rate, $(CR)_k$, and flux, $\phi_k$, for the given $k$-th hfg can also be defined:

$$ (CR)_k = \frac{S_k^{(in)}}{1 - (\Sigma_{sk}/\Sigma_{tk})\,\tilde{K}_s}; \qquad \phi_k = \frac{(CR)_k}{\Sigma_{tk}} \tag{5.102} $$

The procedure described above can be repeated until the hfg fluxes and the corresponding reaction rates are determined. The effective group cross section for a lethargy group, $u_1 - u_2$, consisting of $N$ hyper-fine groups is simply

$$ \tilde{\sigma}_x = \left( \sum_{k=1}^{N} \sigma_{xk}\,\phi_k \right) \Bigg/ \left( \sum_{k=1}^{N} \phi_k \right) \tag{5.103} $$

The same procedure can be readily generalized to the practical case involving many nuclides. This approach is, in principle, rigorous so long as the slowing-down equation is valid. These basic algorithms have been incorporated into the RABBLE code [57] and the MC$^2$-2 code [54].

5.5 Resonance Absorption in Heterogeneous Media

The treatment of resonance absorption in heterogeneous media obviously presents a greater challenge than that in homogeneous media. Unlike the latter, the energy and space dependence become intertwined.
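As an aside on the hyper-fine group scheme of Section 5.4.4.3, the single-nuclide balance of Eqs. 5.97 to 5.103 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the RABBLE or MC$^2$-2 implementation: the kernels are taken per unit lethargy (divided by $\Delta u$), the recurrence carries a factor $e^{-\Delta u}$ on the previous in-scattering source, the first $L+1$ groups are assumed to carry unit flux, and all function names and the Lorentzian test resonance are hypothetical.

```python
import math

def hfg_kernels(du, L):
    """Effective kernels per unit lethargy for one nuclide with eps = L*du."""
    alpha = math.exp(-L * du)
    one_minus_e = -math.expm1(-du)                     # 1 - exp(-du), accurately
    k1 = one_minus_e ** 2 / ((1.0 - alpha) * du)       # transfer from the adjacent group
    ks = (du + math.expm1(-du)) / ((1.0 - alpha) * du) # in-group (self) scattering
    return alpha, k1, ks

def inscatter_direct(x, k, du, L):
    """S_k^(in) by explicit summation over the L groups preceding k in lethargy."""
    alpha, k1, ks = hfg_kernels(du, L)
    total = sum(k1 * math.exp(-(l - 1) * du) * x[k - l] for l in range(1, L + 1))
    return total - alpha * ks * x[k - L]

def sweep(sig_s, sig_t, du, L):
    """March toward higher lethargy using the four-term recurrence for S_k^(in)."""
    alpha, k1, ks = hfg_kernels(du, L)
    e = math.exp(-du)
    n = len(sig_t)
    phi = [1.0] * n                                    # assumed unit flux in groups 0..L
    x = [sig_s[k] * phi[k] for k in range(n)]          # scattering rate Sigma_s * phi
    s_in = inscatter_direct(x, L, du, L)               # seed value at k = L
    for k in range(L + 1, n):
        s_in = (e * s_in - (k1 - ks * e) * alpha * x[k - 1 - L]
                + k1 * x[k - 1] - alpha * ks * x[k - L])
        cr = s_in / (1.0 - sig_s[k] / sig_t[k] * ks)   # collision rate
        phi[k] = cr / sig_t[k]
        x[k] = sig_s[k] * phi[k]
    return phi

def collapse(sig_x, phi):
    """Flux-weighted effective cross section over the hyper-fine groups."""
    return sum(s * p for s, p in zip(sig_x, phi)) / sum(phi)

if __name__ == "__main__":
    du, L, n = 0.001, 17, 300                          # hypothetical mesh, eps = 0.017
    sig_s = [1.0] * n
    sig_a = [50.0 / (1.0 + ((k - 150) / 5.0) ** 2) for k in range(n)]
    sig_t = [s + a for s, a in zip(sig_s, sig_a)]
    phi = sweep(sig_s, sig_t, du, L)
    print("flux at resonance peak:", phi[150])
    print("effective vs. unweighted absorption:", collapse(sig_a, phi), sum(sig_a) / n)
```

The recurrence replaces an $L$-term sum per group by four terms, which is the efficiency gain attributed to Kier; the direct summation is kept only to seed the sweep and to allow a consistency check.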
It is clearly unrealistic to specify the detailed distribution of the neutron flux throughout the entire reactor simultaneously at any arbitrary energy and temperature. Hence, the general starting point for all existing deterministic approaches is to focus on the examination of the resonance absorption within a localized unit cell. A realistic reactor can generally be viewed as an ensemble of lattices consisting of unit cells. These unit cells, in principle, can have either identical or different compositions depending on the design under consideration.

5.5.1 Traditional Collision Probability Methods for a Two-Region Cell

One convenient way of treating resonance absorption in a reactor lattice is via the use of the collision probability method. The rationale for using such a method can be best illustrated via the two-region cell embedded in an infinite reactor lattice considered by Chernick [4, 5]. The cell consists of a fuel region and a large moderator region, designated by 0 and 1, respectively, and no inter-cell interactions are assumed. If $F_0(u)$ and $F_1(u)$ denote the collision densities in these regions, respectively, the neutron conservation of this cell can be specified by the coupled integral equations given below,

$$ V_0 F_0(u) = (1 - P_0)\,S_0 + P_1 S_1 \tag{5.104} $$

$$ V_1 F_1(u) = (1 - P_1)\,S_1 + P_0 S_0 \tag{5.105} $$

where the respective volumes and escape probabilities are denoted by $V_0$, $V_1$, $P_0$, and $P_1$. Alternatively, $(1 - P_n)$ is generally referred to as the collision probability for the $n$-th region. The quantity $S_n$ represents the scattering source in the specific region and is given by

$$ S_n = \sum_{i=1}^{I(n)} \int_{u-\varepsilon_i}^{u} K_i(u-u')\,\frac{\Sigma_{si}^{(n)}(u')}{\Sigma_t^{(n)}(u')}\,V_n F_n(u')\,du', \qquad n \in \{0, 1\} \tag{5.106} $$

where $I(n)$ is the total number of nuclides in the $n$-th region under consideration.

Thus, the coupled equations amount to the combination of two slowing-down equations previously defined for homogeneous media, connected together via the respective escape (or collision) probabilities.
To specify the collision density as a function of $u$ requires the explicit description of $P_n$ as a function of lethargy (or energy) and the means to solve the coupled integral equations. In earlier days, the direct numerical solution of these equations, in conjunction with resonance integral calculations, was prohibitive, and the task required a great deal of simplification. One general identity that is crucial to lattice physics studies is the widely used reciprocity relation. For a two-region cell, it is simply

$$ P_0\,\Sigma_t^{(0)}(u)\,V_0 = P_1\,\Sigma_t^{(1)}(u)\,V_1 \tag{5.107} $$

so that $P_1$ can be computed once $P_0$ is known. Another simplification is made possible by the fact that region 1 is usually assigned to the moderator with constant cross sections. Hence, it is reasonable to assume the NR-approximation in that region. By taking $F_1(u) = \Sigma_t^{(1)}(u)$ and utilizing the reciprocity relation, the coupled integral equations can be reduced to a familiar form similar to that for homogeneous media described in the previous section,

$$ F_0(u) = (1 - P_0) \sum_{i}^{\text{fuel}} \frac{1}{1-\alpha_i} \int_{u-\varepsilon_i}^{u} e^{-(u-u')}\,\frac{\Sigma_{si}^{(0)}(u')}{\Sigma_t^{(0)}(u')}\,F_0(u')\,du' + P_0\,\Sigma_t^{(0)}(u) \tag{5.108} $$

The simplified equation above is generally considered the starting point for the various approximations of the two-region cell configurations to follow. The key issue here is how to define the escape probability $P_0$ as a function of $\Sigma_t^{(0)}(u)$ so that the traditional approximations for homogeneous media can be applied consistently.

5.5.1.1 General Features of Collision Probability of Practical Interest

For the sake of completeness, the transport theory origin of the escape probability method [59] will be briefly addressed. With no loss of generality, let us start with the generic representation of the escape probability for an arbitrary region with volume $V$ and surface area $S$, on which all subsequent simplifications are based,

$$ P_{esc} = \frac{1}{\Sigma_t \bar{l}} \left( \frac{1}{\pi S} \int_S d\vec{r}_s{}' \int (\vec{\Omega}\cdot\vec{n}) \left[1 - \exp\!\left(-\Sigma_t\,|\vec{r}_s - \vec{r}_s{}'|\right)\right] d\vec{\Omega} \right) = \frac{1}{\Sigma_t \bar{l}}\,\left[1 - T\right] \tag{5.109} $$

where $\bar{l} = 4V/S$ is commonly referred to as the average chord length, $\vec{\Omega}$ is the unit vector along the direction of motion of the neutron, and $\vec{n}$ is the unit vector normal to the surface $S$. The quantity inside the parentheses is denoted by $1 - T$, where $T$ is known as the transmission probability, signifying the fraction of neutrons transmitted through the region in question without suffering any collision. Alternatively, $T$ also represents the neutron current of unit strength at the surface that passes through the region without collision. The corresponding collision probability is simply

$$ P_c = 1 - P_{esc} \tag{5.110} $$

Let $l = |\vec{r}_s - \vec{r}_s{}'|$ be the chord length along the neutron path, a straight line connecting the intersections of the two vectors $\vec{r}_s$ and $\vec{r}_s{}'$ with the surface. For practical configurations of convex geometries, the outer integral over the surface in Eq. 5.109 becomes independent of the inner integral, so that

$$ P_{esc} = \frac{S}{4\pi V\,\Sigma_t(u)} \int (\vec{\Omega}\cdot\vec{n}) \left[1 - \exp\!\left(-\Sigma_t(u)\,l(d, \vec{\Omega})\right)\right] d\vec{\Omega} \tag{5.111} $$

where $l(d, \vec{\Omega})$ is a function of the "diameter" (or thickness where appropriate) and the direction of flight $\vec{\Omega}$. For the sake of clarity, the general attributes of the geometry-related quantities in the integrand of Eq. 5.111 are given in Table 5.2 for the three most commonly used configurations in reactor physics applications. These are the fundamental geometric relationships in terms of $\varphi$, the angle of the projection of $l$ with respect to the $x$-axis, and $\theta$, the azimuthal angle, used in defining their respective escape and/or collision probabilities.

Let us begin with the simple scenario of an isolated unit cell, in the sense that all neutrons escaping from the fuel lump will suffer their next collision in the surrounding moderator. For the isolated fuel configurations of practical interest, the escape probabilities are analytically derivable, as illustrated in [59].
Figure 5.1 illustrates their behavior as a function of $\Sigma_t(u)\,\bar{l}$, a quantity commonly referred to as the "optical" thickness, for slab, cylinder, and sphere unit cells. In spite of their differences in shape, the escape probabilities very much follow the same pattern. In particular, they share the same limiting values, i.e., $P_{esc} = 1$ for $\Sigma_t \bar{l} \to 0$ and $P_{esc} = 1/(\Sigma_t \bar{l})$ for $\Sigma_t \bar{l} \to \infty$, known as the "white" and "black" limits, respectively. These limiting properties lead immediately to two well-known approximations, referred to as the mean-chord approximation and Wigner's rational approximation [2, 3]. The former is defined as

$$P_{esc} = \frac{1 - \exp\!\left(-\Sigma_t(u)\, \bar{l}\right)}{\Sigma_t(u)\, \bar{l}} \tag{5.112}$$

while the latter is defined as

$$P_{esc} = \frac{1}{1 + \Sigma_t(u)\, \bar{l}} \tag{5.113}$$

It is quite apparent that these functions exhibit the same limiting properties as those of the rigorous representations. The discrepancies of the rational approximation, for instance, are usually no more than a few percent for the isolated lumps considered, as illustrated also in Fig. 5.1. These approximations made possible much of the earlier work on lattice physics.

Table 5.2 Geometry-dependent quantities required for computing collision probabilities

| Configuration | $(\vec{\Omega}\cdot\vec{n})\, d\vec{\Omega}$ | $l(d, \vec{\Omega})$ |
| Infinite slab, $d = t$ | $\cos\varphi\, \sin\varphi\, d\varphi\, d\theta$ | $t / \sin\varphi$ |
| Infinite rod, diameter $d$ | $\sin\varphi\, \cos\theta\, \sin\varphi\, d\varphi\, d\theta$ | $d \cos\theta / \sin\varphi$ |
| Sphere, diameter $d_s$ | $\cos\varphi\, \sin\varphi\, d\varphi\, d\theta$ | $d_s \cos\varphi$ |

Fig. 5.1 Analytically based results and Wigner's rational approximation (escape probability versus optical thickness for the slab, cylinder, and sphere configurations and for the rational approximation)

5.5.1.2 Various Earlier Methods Based on Approximate Escape Probabilities for Isolated Fuel Lumps

The similarity between the simplified version of the slowing-down equation defined by Eq. 5.108 and that of the homogeneous media previously given makes possible the application of the comparable approximations for reactor physics calculations.
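As a numerical aside on Eqs. 5.112 and 5.113, the short sketch below evaluates the mean-chord and rational approximations as functions of the optical thickness x (the total cross section times the mean chord length) and tabulates them against the black-limit value 1/x. The function names and the sample values of x are illustrative choices, not taken from any code cited in this chapter.

```python
import math


def p_esc_mean_chord(x: float) -> float:
    """Mean-chord approximation, Eq. 5.112: (1 - exp(-x)) / x, x = Sigma_t * lbar."""
    if x == 0.0:
        return 1.0  # white limit, taken analytically to avoid 0/0
    return -math.expm1(-x) / x


def p_esc_rational(x: float) -> float:
    """Wigner's rational approximation, Eq. 5.113: 1 / (1 + x)."""
    return 1.0 / (1.0 + x)


if __name__ == "__main__":
    print(f"{'x':>8} {'mean-chord':>12} {'rational':>12} {'1/x':>12}")
    for x in (0.01, 0.1, 1.0, 5.0, 20.0, 100.0):
        print(f"{x:8.2f} {p_esc_mean_chord(x):12.5f} "
              f"{p_esc_rational(x):12.5f} {1.0 / x:12.5f}")
```

Both functions tend to 1 as x goes to zero and to 1/x for large x. For every x > 0 the mean-chord value lies above the rational one, because (1 + x)(1 - exp(-x)) - x = 1 - (1 + x) exp(-x) is positive.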
For our purposes, the most widely used NR and WR approximations are presented for illustration.

NR-Approximation

Let $F_0(u) = \Sigma_t^{(0)}(u)\, \phi_0(u)$ be the collision density for the fuel region. Under the NR-approximation as before, Eq. 5.108 for a single resonance absorber mixed with diluent becomes

$$\Sigma_t^{(0)}(u)\, \phi_0(u) = (1 - P_0)\, \Sigma_p^{(0)} + P_0\, \Sigma_t^{(0)}(u) \tag{5.114}$$

Thus, the corresponding resonance integral can be expressed in terms of two integrals,

$$RI = \Sigma_p^{(0)} \int_0^{\infty} \frac{\sigma_a^{(0)}(u)}{\Sigma_t^{(0)}(u)}\, du + \int_0^{\infty} \frac{P_0\, \sigma_a^{(0)}(u)\, \Sigma_{Rt}^{(0)}(u)}{\Sigma_t^{(0)}(u)}\, du = I_v + I_s \tag{5.115}$$

where $\Sigma_p^{(0)}$ and $\Sigma_{Rt}^{(0)}(u)$ are the total macroscopic potential scattering cross section and the total macroscopic resonance cross section, respectively, so that $\Sigma_t^{(0)}(u) = \Sigma_p^{(0)} + \Sigma_{Rt}^{(0)}(u)$. The quantities $I_v$ and $I_s$ are referred to as the "volume" integral and the "surface" integral, respectively, according to Wigner et al. [3]. The former is identical to that for homogeneous media described in the previous section, while the latter involves the escape probability characteristic of the heterogeneous nature of the cell. The general form of the surface integral becomes

$$I_s = \frac{S_0}{4V_0} \left\langle \int_0^{\infty} \frac{\sigma_a^{(0)}(u)\, \Sigma_{Rt}^{(0)}(u)}{\left[\Sigma_t^{(0)}(u)\right]^2} \left[ 1 - \exp\!\left(-\Sigma_t^{(0)}(u)\, l\right) \right] du \right\rangle \tag{5.116}$$

where the angular bracket denotes the integration over the geometric configuration in the context of Eq. 5.111. From the perspective of reactor applications, this approach is not widely used. In contrast, the simplest possible means of computing the resonance integral for the fuel lump is the direct use of Wigner's rational approximation. The substitution of Eq. 5.113 into Eq. 5.108 under the NR-approximation, i.e., into Eq. 5.114, leads to the well-known "equivalence relation" whereby the flux exhibits the same functional form as that for homogeneous media,

$$\phi_0(u) = \frac{\Sigma_p^{(0)} + \Sigma_e}{\Sigma_p^{(0)} + \Sigma_e + \Sigma_{Rt}^{(0)}(u)}, \qquad \Sigma_e = \frac{1}{\bar{l}} \tag{5.117}$$

where $\Sigma_e$ is referred to as the "equivalent" cross section. It follows that

$$(RI)_k = \sigma_p^{(eq)}\, \frac{\Gamma_a}{E_0 \cos 2\delta_l}\, J(\beta_k', \theta_k, a_k) \tag{5.118}$$

where $\sigma_p^{(eq)} = \left[\Sigma_p^{(0)} + \Sigma_e\right] / N_0$ is commonly referred to as the "equivalent" potential scattering cross section and $\beta_k' = \sigma_p^{(eq)} / (\sigma_{0k} \cos 2\delta_l)$. Hence, Eq. 5.118 is identical to the corresponding resonance integral for homogeneous media defined by Eq. 5.81 if $\sigma_p^{(eq)}$ is replaced by $\sigma_p$. It was the early version of this equivalence relation that was later generalized to the closely packed lattice, as will be seen. This approximation was used extensively not only in reactor physics calculations but also as a tool to analyze resonance integral measurements. If one neglects the temperature dependence and the asymmetric component of the resonance line shapes, Eq. 5.118 for a low-energy resonance immediately reduces to the following form,

$$(RI)_k = \frac{\pi}{2}\, \frac{\sigma_{0k}\, \Gamma_a}{E_0} \left[ \frac{\beta'}{1 + \beta'} \right]^{1/2} \tag{5.119}$$

In the limit of a strong resonance, the above expression can be approximated by

$$(RI)_k \approx \text{const} \cdot \left( \Sigma_p^{(0)} + \frac{S_0}{4V_0} \right)^{1/2} \propto \left( A + B\, \frac{S_0}{M} \right)^{1/2} \tag{5.120}$$

which was used as the means to estimate the approximate behavior of the resonance integral as a function of the surface-to-mass ratio in analyzing experimental results. Similar results were also obtained by Gourevich and Pomeranchouk [7] based on Eq. 5.116.

NRIM (or WR) Approximation

In contrast to the NR-approximation, Eq. 5.108 under the NRIM-approximation, in the same context described for homogeneous media, must assume the following form:

$$\Sigma_t^{(0)}(u)\, \phi_0(u) = (1 - P_0) \left[ \Sigma_s^{(0)}(u) - \Sigma_m^{(0)} \right] \phi_0(u) + P_0 \left[ \Sigma_t^{(0)}(u) - \Sigma_m^{(0)} \right] + \Sigma_m^{(0)} \tag{5.121}$$

Again, as in the case of the NR-approximation, one practical method to circumvent the task of computing the escape probability rigorously is the use of the rational approximation. Upon substituting Eq. 5.113 into Eq. 5.121, one obtains the pertinent equivalence relation as before, i.e.,

$$\phi_0(u) = \frac{\Sigma_m^{(0)} + 1/\bar{l}}{\Sigma_m^{(0)} + 1/\bar{l} + \Sigma_a^{(0)}(u)} \tag{5.122}$$

Thus, the corresponding resonance integral for a Breit–Wigner resonance becomes

$$(RI)_k = \sigma_p^{(eq)}\, \frac{\Gamma_t}{E_0}\, J(\beta_k', \theta_k), \qquad \sigma_p^{(eq)} = \frac{\Sigma_m^{(0)} + \Sigma_e}{N_0} \tag{5.123}$$

where $\beta_k' = \sigma_p^{(eq)} / (\sigma_{0k}\, \Gamma_{ak} / \Gamma_{tk})$ and the equivalent cross section is the same as that given in Eq. 5.117. It is quite clear that the same rationale is equally applicable to the IR approximation described in the previous section.

5.5.2 Traditional Collision Probability Treatment in a Closely Packed Lattice

The discussions so far have focused on the idealized case in which the fuel lumps are considered isolated from one another, separated by large moderator regions. The basic principle was later extended to the more realistic situations involving closely packed reactor lattices.

5.5.2.1 General Features of the Escape Probability

In the presence of many closely packed unit cells, neutrons escaping from the fuel lump in a given cell may not necessarily suffer their next collision in the surrounding moderator. This issue was first addressed by Dancoff and Ginsburg [60]. The "shadowing" effect of neighboring fuel lumps on the escape probability of the fuel lump in question has henceforth been referred to as the Dancoff effect. The most comprehensive discussions on this subject are believed to be those given by Rothenstein [61], Nordheim [52], and Lukyanov [10]. The physical attributes of this phenomenon are best illustrated by following a straight-line trajectory of the neutron path through the fuel lumps and the moderator regions. Let subscript 0 denote the fuel lump in question and $2, 4, \ldots, 2N$ denote all neighboring fuel lumps, while $1, 3, 5, \ldots, 2N+1$ denote the moderator regions sandwiched between the fuel lumps.
If $l_n$ is the chord length of the path in the n-th region, and the local transmission probability between two surface points is denoted by $\tau_n$, then for the isolated fuel lumps previously described, one has

$$P_0^{(iso)} = \frac{1}{\Sigma_t^{(0)} \bar{l}} \left( 1 - \langle \tau_0 \rangle \right) \tag{5.124}$$

To account for the Dancoff effect, one must take into account the probability that neutrons survive without suffering any collision along the trajectory through the successive regions. Let $P_0$ be the improved escape probability for lump 0 in such a lattice. It is not difficult to see that $P_0$ is expressible symbolically in the general form

$$P_0 = \frac{1}{\Sigma_t^{(0)} \bar{l}} \left\langle (1 - \tau_0) \left[ (1 - \tau_1) + \tau_1 \tau_2 (1 - \tau_3) + \cdots \right] \right\rangle \tag{5.125}$$

Similarly, the escape probability for the surrounding moderator, $P_1$, can be obtained by interchanging the indices in the equation above. However, the escape probability in its general form is obviously too complicated for practical applications when resonance structures in cross sections are involved. One commonly used simplifying assumption is that all unit cells are identical, a configuration referred to as an infinite lattice with repeated cells. Under such an assumption, it follows that

$$\langle \tau_{2n} \rangle = \langle \tau_0 \rangle = T_0, \qquad \langle \tau_{2n+1} \rangle = \langle \tau_1 \rangle = T_1, \qquad n = 1, 2, 3, \ldots, N \tag{5.126}$$

The repeated cell assumption leads to four approximations most frequently used in applications. In particular, the first two of these provide the theoretical basis for extending Wigner's rational approximation to accommodate the Dancoff effect in closely packed lattices.

"Black" Limit Approximation

If the total cross section of the absorber is assumed to be large, all higher-order terms in the square bracket of Eq. 5.125 diminish. In addition, if the average of the product is replaced by the product of the averages, one obtains

$$P_0 \approx \frac{1}{\Sigma_t^{(0)} \bar{l}} \left( 1 - \langle \tau_0 \rangle \right) \left( 1 - \langle \tau_1 \rangle \right) = P_0^{(iso)} (1 - C) \tag{5.127}$$

where $C = \langle \tau_1 \rangle$ is referred to as the Dancoff correction factor, which depends on the cross section and geometric configuration of the moderator alone.
Thus, the rational-function-based approaches derived for isolated lumps are equally applicable to closely packed lattices if the Dancoff factor is known.

Nordheim's Approximation

The fact that the absorbers may not always be considered "black" creates the need for further improvement. Since the linear terms in Eq. 5.125 are generally dominant, all higher-order products of the transmission probabilities can be adequately approximated by the respective products of the averages, as pointed out by Nordheim [52]. Thus, upon substitution of the averages and noting the geometric nature of the series for the repeated cells, Eq. 5.125 can be expressed in the closed form

$$P_0 = \frac{1}{\Sigma_t^{(0)} \bar{l}}\, \frac{\left( 1 - \langle \tau_0 \rangle \right) \left( 1 - \langle \tau_1 \rangle \right)}{1 - \langle \tau_0 \rangle \langle \tau_1 \rangle} = \frac{(1 - C)\, P_0^{(iso)}}{1 - \left( 1 - \bar{l}\, \Sigma_t^{(0)} P_0^{(iso)} \right) C} \tag{5.128}$$

Infinite Slabs

The case of closely packed repeated cells consisting of infinite slabs is probably one of the few cases where the escape probability can be represented analytically without simplifying assumptions. It is quite obvious from Eqs. 5.111 and 5.125 and Table 5.2 that the escape and transmission probabilities for this type of configuration are generally expressible as a single integral. If the absorber and moderator, with thicknesses $t_0$ and $t_1$, respectively, are in either a periodic or a reflective arrangement, Eq. 5.125 can be reduced to a single series of manageable form, as pointed out by Corngold [62] and Rothenstein [61], namely,

$$P_0 = \frac{1}{2 \Sigma_t^{(0)} t_0} \left\{ 1 - 2 \sum_{n=0}^{\infty} \left[ E_3\big( (n+1)\delta_0 + n\delta_1 \big) - 2 E_3\big( (n+1)(\delta_0 + \delta_1) \big) + E_3\big( n\delta_0 + (n+1)\delta_1 \big) \right] \right\} \tag{5.129}$$

where $\delta_0 = \Sigma_t^{(0)} t_0$ and $\delta_1 = \Sigma_t^{(1)} t_1$ are the optical thicknesses of the fuel and of the moderator, respectively, and $E_3(x)$ is the exponential integral of order 3. It is quite obvious that the escape probability of an isolated fuel plate defined by Eq. 5.124 corresponds to the $n = 0$ term in the limit of a thick moderator, with transmission probability $\langle \tau_0 \rangle = 2 E_3(\Sigma_t^{(0)} t_0)$. On the other hand, under the black limit assumption, the Dancoff factor in Eq. 5.127 is simply $C = 2 E_3(\Sigma_t^{(1)} t_1)$.
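The slab formulas above lend themselves to a quick numerical check. In the sketch below, d0 and d1 stand for the optical thicknesses of the fuel and moderator plates (total cross section times plate thickness). E3 is evaluated by Simpson quadrature of its integral representation, the integral over mu from 0 to 1 of mu exp(-x/mu); this is a deliberately simple choice, and a production code would use a dedicated routine. The sketch then sums the series of Eq. 5.129 and compares it with the isolated-plate value, the black-limit value of Eq. 5.127, and Nordheim's closed form, Eq. 5.128. Function names, the truncation rule, and the sample optical thicknesses are assumptions made for illustration.

```python
import math


def e3(x: float, n_panels: int = 2000) -> float:
    """E3(x) = int_0^1 mu * exp(-x / mu) dmu, by composite Simpson's rule."""
    if x == 0.0:
        return 0.5
    h = 1.0 / n_panels  # n_panels must be even for Simpson's rule

    def f(mu: float) -> float:
        return mu * math.exp(-x / mu) if mu > 0.0 else 0.0

    total = f(0.0) + f(1.0)
    for i in range(1, n_panels):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


def p0_isolated_slab(d0: float) -> float:
    """Isolated plate, Eq. 5.124, with average transmission 2*E3(d0) and lbar = 2*t0."""
    return (1.0 - 2.0 * e3(d0)) / (2.0 * d0)


def p0_lattice_slab(d0: float, d1: float, cutoff: float = 40.0) -> float:
    """Series of Eq. 5.129, truncated once n*(d0 + d1), a lower bound on every
    E3 argument in term n, reaches `cutoff` (the remaining terms are negligible)."""
    s = 0.0
    n = 0
    while n * (d0 + d1) < cutoff:
        s += (e3((n + 1) * d0 + n * d1)
              - 2.0 * e3((n + 1) * (d0 + d1))
              + e3(n * d0 + (n + 1) * d1))
        n += 1
    return (1.0 - 2.0 * s) / (2.0 * d0)


def p0_nordheim(d0: float, d1: float) -> float:
    """Closed form of Eq. 5.128 with T0 = 2*E3(d0) and C = T1 = 2*E3(d1)."""
    t0 = 2.0 * e3(d0)
    c = 2.0 * e3(d1)
    return (1.0 - t0) * (1.0 - c) / ((1.0 - t0 * c) * 2.0 * d0)


if __name__ == "__main__":
    d0, d1 = 1.0, 0.5  # illustrative optical thicknesses
    print("isolated plate :", p0_isolated_slab(d0))
    print("black limit    :", p0_isolated_slab(d0) * (1.0 - 2.0 * e3(d1)))
    print("Nordheim       :", p0_nordheim(d0, d1))
    print("series (5.129) :", p0_lattice_slab(d0, d1))
```

With a very thick moderator the series collapses to the isolated-plate value, and for any finite moderator thickness the lattice value is smaller, which is the shadowing effect in numerical form.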
Repeated Cells with Cylindrical Configuration

The infinite lattice in this case can be pictured as an ensemble of unit cells, each consisting of a fuel rod and a cylindricized moderator region. In contrast to the infinite slabs, one general complication here is that the resulting expression generally involves a double integral, according to the basic geometric properties of the cylinder given in Table 5.2. As discussed by Takahashi [63] and also by Rothenstein [61], it is possible to derive the same type of expression as that for infinite slabs except that it involves a series consisting of double integrals. In particular, the inner integrals are identifiable with the Bickley–Nayler function [64] of order 3. From the perspective of practical applications, especially fast reactor applications involving an extremely large number of resonances over a large energy span, the method derived on this basis is far more difficult to use for routine calculations than that for the infinite slabs. Other practical alternatives will be discussed later.

5.5.2.2 Fine-Tuning of the Rational Approximation for Routine Applications

As discussed earlier, the dramatic simplification of the lattice physics calculations for the case of an isolated cell was made possible by Wigner's rational approximation, leading to the equivalence relation whereby the lattice physics calculations can be treated in the same context as those for homogeneous media. The extension to closely packed lattices is obviously possible via either the black limit approximation or Nordheim's approximation discussed in the previous subsection. By substituting Eq. 5.113 into Eq. 5.128, one obtains

$$P_0 = \frac{(1 - C)/\bar{l}}{\Sigma_t^{(0)}(u) + (1 - C)/\bar{l}} = \frac{\left[ S_0/(4V_0) \right] (1 - C)}{\Sigma_t^{(0)}(u) + \left[ S_0/(4V_0) \right] (1 - C)} \tag{5.130}$$

It is interesting to note that, physically, the Dancoff effect is equivalent to a reduction of the surface-to-volume ratio.
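The algebra behind Eq. 5.130 is short enough to spell out. Writing $x = \Sigma_t^{(0)}(u)\,\bar{l}$ as shorthand (a symbol introduced here only for compactness), Wigner's rational form for the isolated lump inserted into Nordheim's closed form gives:

```latex
\begin{aligned}
P_0^{(\mathrm{iso})} &= \frac{1}{1+x}, \qquad
1 - x\,P_0^{(\mathrm{iso})} = \frac{1}{1+x}, \\[4pt]
P_0 &= \frac{(1-C)\,P_0^{(\mathrm{iso})}}{1-\bigl(1-x\,P_0^{(\mathrm{iso})}\bigr)\,C}
     = \frac{(1-C)/(1+x)}{1 - C/(1+x)}
     = \frac{1-C}{x + (1-C)}
     = \frac{(1-C)/\bar{l}}{\Sigma_t^{(0)}(u) + (1-C)/\bar{l}} .
\end{aligned}
```

The result keeps the rational form of the isolated lump with $1/\bar{l}$ replaced by $(1-C)/\bar{l}$, which is why the Dancoff effect acts like a smaller surface-to-volume ratio.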
As long as the moderator cross sections are taken to be constant, so that $C$ is independent of energy, the equivalence relation is clearly applicable as well. From the NR-approximation consideration, Eq. 5.130 leads to an equivalent cross section of the following form,

$$\Sigma_e = \frac{1}{\bar{l}}\, \frac{a\,(1 - C)}{1 + (a - 1)\, C} \tag{5.131}$$

where $a$, referred to as the Bell–Levine factor [65, 66], provides further fine-tuning of the rational approximation. Typically, $a$ is set to 1.33 and 1.08 for cylindrical and slab geometries, respectively. The same argument can be extended to the NRIM-approximation defined by Eq. 5.123. The second issue is how to compute the Dancoff factor simply at run time. One obvious choice is to apply the same rational approximation proposed by Bell [65],

$$\gamma_B = 1 - C = \frac{\bar{l}_1\, \Sigma_t^{(1)}}{1 + \bar{l}_1\, \Sigma_t^{(1)}} \tag{5.132}$$

where the moderator cross section, $\Sigma_t^{(1)}$, is taken to be constant and $\bar{l}_1$ here is the average chord length for the moderator region. Hummel et al. [67] have found that Eq. 5.132 can be significantly improved by the following modification:

$$\gamma = 1 - C = \gamma_B + \gamma_B^4\, (1 - \gamma_B) \tag{5.133}$$

The approximation above was found to be adequate and has been incorporated into the MC²-2 code [54] for routine applications.

5.5.3 Connections to Resonance Integral and Multigroup Cross-section Calculations

For closely packed repeated cells, there are two types of resonance integral-based methods widely used in reactor applications, which are summarized below.

5.5.3.1 Rational Approximation and Approximate Flux Based Approaches

As discussed in the previous section, the direct equivalence between the treatment of the resonance integral in heterogeneous media and that in homogeneous media can be established to various degrees of sophistication when applied in conjunction with traditional approximations to the slowing-down equation. As in the previously described case of isolated fuel lumps, one only needs to redefine the "equivalent" cross section $\Sigma_e$ according to Eq. 5.131.
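The run-time recipe of Eqs. 5.131 to 5.133 reduces to a few arithmetic steps, sketched below together with the NR equivalence-relation flux of Eq. 5.117. Here C is the Dancoff factor, a the Bell–Levine factor, and the mean chord lengths and cross sections in the example are hypothetical numbers chosen only to exercise the functions; the function names are likewise illustrative and are not those of MC²-2.

```python
def dancoff_bell(lbar1: float, sigma_t1: float) -> float:
    """gamma_B = 1 - C from the rational form of Eq. 5.132."""
    x = lbar1 * sigma_t1
    return x / (1.0 + x)


def dancoff_hummel(gamma_b: float) -> float:
    """Modified 1 - C of Eq. 5.133: gamma_B + gamma_B**4 * (1 - gamma_B)."""
    return gamma_b + gamma_b ** 4 * (1.0 - gamma_b)


def sigma_equivalent(lbar0: float, c: float, a: float = 1.0) -> float:
    """Equivalent cross section of Eq. 5.131 with Bell-Levine factor a."""
    return (1.0 / lbar0) * a * (1.0 - c) / (1.0 + (a - 1.0) * c)


def nr_flux(sigma_t0: float, sigma_p0: float, sigma_e: float) -> float:
    """NR equivalence-relation flux, Eq. 5.117: (Sigma_p + Sigma_e) / (Sigma_t + Sigma_e)."""
    return (sigma_p0 + sigma_e) / (sigma_t0 + sigma_e)


if __name__ == "__main__":
    # Hypothetical rod lattice: fuel mean chord 1.0 cm, moderator mean chord
    # 1.2 cm, moderator total cross section 1.0 cm^-1 (illustrative values only).
    gamma = dancoff_hummel(dancoff_bell(lbar1=1.2, sigma_t1=1.0))
    c = 1.0 - gamma
    sig_e = sigma_equivalent(lbar0=1.0, c=c, a=1.33)
    print("Dancoff factor C        :", c)
    print("equivalent cross section:", sig_e)
    # Flux dip at a resonance peak with Sigma_t = 50 and Sigma_p = 0.4 (cm^-1, hypothetical)
    print("NR flux at the peak     :", nr_flux(50.0, 0.4, sig_e))
```

Setting a = 1 recovers the (1 - C)/lbar form of Eq. 5.130, and setting C = 0 as well recovers the isolated-lump value 1/lbar of Eq. 5.117.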
For the NR-approximation, the resulting equivalence relation is readily amenable to the J-integral approach discussed in Section 5.4, from which the multigroup cross sections can be obtained via Eq. 5.92. The viability of this type of approach has been demonstrated by the MC²-2 code. The same logic applies if one chooses to use the NRIM or IR approximation. The equivalence relation also makes possible the use of the Bondarenko scheme widely used for routine reactor calculations. In that case, the self-shielding factors at a given temperature are precomputed as a function of $\sigma_p^{(eq)}$, with $\Sigma_e$ defined by Eq. 5.131, for various preselected groups and stored prior to their deployment at run time.

5.5.3.2 Nordheim's Method

As in the case of homogeneous media, Nordheim's method, which uses a numerical method for solving the slowing-down equation and the subsequent calculation of the resonance integral described in Section 5.4.3.2, is also applicable here. The only difference is that the slowing-down equation is given by Eq. 5.108 instead of Eq. 5.78. Like Eq. 5.78, Eq. 5.108 can also be further simplified by assuming the NR-approximation for the diluents in the fuel lump in the same context as the former [52]. For fuel lumps with a single resonance absorber, Eq. 5.108 becomes

$$F_0(u) = (1 - P_0) \left( \frac{1}{1 - \alpha_r} \int_{u-\varepsilon_r}^{u} e^{-(u-u')}\, \frac{\Sigma_s^{(r)}(u')}{\Sigma_t^{(0)}(u')}\, F_0(u')\, du' \right) + P_0\, \Sigma_t^{(r)}(u) + \Sigma_m^{(0)} \tag{5.134}$$

where the indices $r$ and $m$ denote the resonance absorber and the diluent in the fuel lump, respectively, so that $\Sigma_t^{(0)}(u) = \Sigma_t^{(r)}(u) + \Sigma_m^{(0)}$. The above equation would become identical to Eq. 5.78 if one sets the escape probability to zero. Thus, by using the escape probability defined by Eq. 5.128, the same numerical algorithm described in Section 5.4.3.2 for computing the resonance integral can be deployed. It is important to note that the original numerical scheme was developed primarily for computing a well-isolated Breit–Wigner resonance for thermal reactor applications.
Questions arise when the mutual self-shielding effects resulting from the presence of other resonances in the vicinity of the one under consideration become significant, a scenario quite common in fast reactor physics calculations. Therefore, more rigorous methods were subsequently developed in order to cope with the overlapping of resonances in the slowing-down process.

5.5.4 Rigorous Treatment of Resonance Absorption in a Unit Cell with Multiple Regions and Many Resonance Isotopes

Motivated by needs in conjunction with fast reactor development, rigorous methods for treating resonance absorption were developed for various types of calculations where accuracy is required. The case in point can be viewed as a natural extension of the rigorous treatment of the slowing-down equation in homogeneous media if the same hyper-fine group (hfg) approach is used as the means to discretize the lethargy domain. For a unit cell with multiple regions and many resonance isotopes, it generally requires the numerical solution of a system of $N$ slowing-down equations analogous to the coupled equations for the two-region cell defined by Eqs. 5.104 and 5.105, i.e.,

$$V_i F_i(u) = \sum_{j=1}^{N} P_{ij}(u)\, V_j S_j(u) \tag{5.135}$$

where $P_{ij}(u)$ is the collision probability for neutrons originating in region $j$ that will make their next collision in region $i$, and $S_j(u)$ is the scattering source from region $j$ as defined in Eq. 5.106. In principle, the numerical solution of these slowing-down equations can be carried out in the same context as the hyper-fine group approach described in Section 5.4.4.3. The analogy to Eq. 5.102 can be readily established via the use of matrix notation. For a given hfg $k$, the collision rates and scattering sources in the various regions of the cell can be represented as vectors. In the same context as Eq. 5.102 for the homogeneous case, the collision rate can be expressed as

$$\vec{CR} = \mathbf{P} \left( \vec{S}^{(in)} + \mathbf{R}\, \vec{CR} \right), \qquad \vec{CR} = \left( \mathbf{P}^{-1} - \mathbf{R} \right)^{-1} \vec{S}^{(in)} \tag{5.136}$$

where $\mathbf{R}$ is a diagonal matrix with elements $R_{ii} = (\Sigma_{sk}/\Sigma_{tk})_i\, (\tilde{K}_s)_i$ for a given region $i$, and the remaining quantities appearing in Eq. 5.136 are consistent with Eq. 5.102 except that they are written in vector form. Thus, the collision rate for a given region can be specified once the collision matrix is known, and the corresponding neutron flux is simply

$$\phi_i(u_k) = \frac{CR_i(u_k)}{\left[ \Sigma_t(u_k) \right]_i} \tag{5.137}$$

Given the hfg flux in every subregion of the unit cell, the desired broad group cell-averaged cross section for reaction process $x$ required for practical applications becomes

$$\tilde{\sigma}_{xG} = \frac{\displaystyle \sum_{i=1}^{N} \sum_{k \in G} \sigma_x(u_k)\, \phi_i(u_k)}{\displaystyle \sum_{i=1}^{N} \sum_{k \in G} \phi_i(u_k)}, \qquad x \in i \tag{5.138}$$

The practical issue here is how to compute the collision probabilities for cells involving many regions and resonance isotopes at a large number of mesh points (or hfgs). For our purposes, two proven methods will be outlined.

5.5.4.1 Kier's Method for Cylindrical Unit Cells

Consider a typical unit cell in the form of an infinite cylinder consisting of a fuel rod with cladding surrounded by a cylindricized moderator region [57]. Let us subdivide the cell into $N$ intervals, all of which are in the form of annuli except for the interval at the center. To compute the detailed neutron flux at a given hfg and spatial interval requires an explicit or implicit description of the collision probability. This can be accomplished in steps as follows.

Intra-Cell Neutronic Balance

One way to reduce the complexity of computing the collision probabilities for the multiregion cell is to cast the intra-cell neutronic balance into a somewhat different perspective.
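Before the interface-current formulation is taken up, the matrix form of Eqs. 5.136 to 5.138 can be made concrete. The following sketch solves (I - P R) CR = P S_in, which is algebraically the same as CR = (P^-1 - R)^-1 S_in, for a single hyper-fine group by Gaussian elimination, then forms the fluxes and a flux-weighted average. The two-region collision probability matrix, the diagonal of R, the in-scatter source, and the cross sections are made-up numbers chosen only for illustration, and the averaging helper assumes that every region contains the nuclide of interest.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # backward substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def collision_rates(p: Matrix, r_diag: Vector, s_in: Vector) -> Vector:
    """Collision-rate vector of Eq. 5.136 from (I - P R) CR = P S_in."""
    n = len(s_in)
    lhs = [[(1.0 if i == j else 0.0) - p[i][j] * r_diag[j] for j in range(n)]
           for i in range(n)]
    rhs = [sum(p[i][j] * s_in[j] for j in range(n)) for i in range(n)]
    return solve(lhs, rhs)


def fluxes(cr: Vector, sigma_t: Vector) -> Vector:
    """Region fluxes of Eq. 5.137: phi_i = CR_i / Sigma_t,i."""
    return [c / s for c, s in zip(cr, sigma_t)]


def broad_group_xs(sigma_x: List[Vector], phi: List[Vector]) -> float:
    """Flux-weighted average of Eq. 5.138; outer index is region, inner is hfg."""
    num = sum(sx * f for sx_i, phi_i in zip(sigma_x, phi)
              for sx, f in zip(sx_i, phi_i))
    den = sum(f for phi_i in phi for f in phi_i)
    return num / den


if __name__ == "__main__":
    p = [[0.7, 0.2], [0.3, 0.8]]   # hypothetical P_ij; each column sums to one
    r_diag = [0.1, 0.3]            # hypothetical self-scatter fractions R_ii
    s_in = [1.0, 2.0]              # hypothetical scatter-in sources
    cr = collision_rates(p, r_diag, s_in)
    print("collision rates:", cr)
    print("fluxes         :", fluxes(cr, [5.0, 1.0]))
```

Solving the linear system directly avoids forming the inverse of P explicitly; in an actual hfg sweep the same solve would be repeated for each group with the source updated from the groups above.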
In Kier's approach [57], the interface currents can be described in the form of a matrix equation,

$$\mathbf{T} \vec{J} = \mathbf{P} \vec{S} \tag{5.139}$$

where the elements of $\mathbf{T}$ and $\vec{J}$ denote the transmission probabilities and the currents at the surfaces of each region, respectively, while $\mathbf{P}$ denotes the escape probabilities from a given region to its immediate neighbors and $\vec{S}$ is the neutron source in the region considered. Within the context of the hyper-fine group approach described in Section 5.4.4.3, this matrix equation must be solved at each hyper-fine group, with the source in each region computed in the same way as before, starting at the hfgs beyond the resonance region. Physically, one only needs to deal with the transmission and escape probabilities from the region under consideration to the immediately adjacent regions, while the effects of the remaining regions are implicitly accounted for via boundary conditions. If one further assumes isotropic return of the currents at the boundaries, the corresponding $\mathbf{T}$ and $\mathbf{P}$ for annuli can be specified readily. Determining the interface currents from the above equation requires the solution of a tri-diagonal matrix equation of order $2N \times 2N$, since two currents per region are required. Thus, Eq. 5.139 can be expressed in terms of $2N$ linear equations as specified in [54, 57]. As proposed by Kier [57], this matrix equation can be solved efficiently via the usual Gauss elimination procedure followed by backward substitution once the pertinent transmission and escape probabilities are known.

Computation of the Probabilities of a Given Annulus

Consider the i-th annulus bounded by circles with radii $r_{i-1}$ and $r_i$, respectively. For an annulus, three first-flight transmission probabilities and two escape probabilities are required to specify the neutronic balance defined by Eq. 5.139.
The former can be denoted symbolically by $T_i^{OI}$, $T_i^{IO}$, and $T_i^{OO}$, signifying the transmission of neutrons from the inner to the outer, the outer to the inner, and the outer to the outer surface of region $i$, respectively. The latter can be denoted by $P_i^{+}$ and $P_i^{-}$, signifying the fraction of neutrons escaping through the outer and inner surfaces of region $i$, respectively. These probabilities can be defined in manageable forms if the cosine current assumption is used, i.e.,

$$T_i^{OO} = \frac{4}{\pi} \int_{\sin^{-1} a_i}^{\pi/2} \cos\theta\; Ki_3(b_i \cos\theta)\, d\theta = \frac{4}{\pi} \int_0^{\alpha_i} \frac{x}{\sqrt{1 - x^2}}\, Ki_3(b_i x)\, dx \tag{5.140}$$

$$T_i^{OI} = \frac{4}{\pi} \int_0^{1} dx\; Ki_3\!\left( \frac{z}{1 - a_i} \left[ \sqrt{1 - a_i^2 x^2} - a_i \sqrt{1 - x^2} \right] \right), \qquad T_i^{IO} = a_i\, T_i^{OI} \tag{5.141}$$

where $a_i = r_{i-1}/r_i$, $\alpha_i = (1 - a_i^2)^{1/2}$, $b_i = 2 \Sigma_{ti} r_i$, and $z = b_i (1 - a_i)/2$. Here, $Ki_3(x)$ is the well-known Bickley–Nayler function [64] of order 3. The corresponding outward and inward escape probabilities for the i-th annulus are, respectively,

$$P_i^{+} = \frac{1}{\Sigma_{ti} \bar{l}} \left( 1 - T_i^{OO} - T_i^{IO} \right), \qquad P_i^{-} = \frac{1}{\Sigma_{ti} \bar{l}} \left( 1 - T_i^{OI} \right) \tag{5.142}$$

where $\bar{l} = 2 (r_i^2 - r_{i-1}^2)/r_{i-1}$. Hence, all probabilities are specified once $T_i^{OO}$ and $T_i^{OI}$ are known. For practical calculations, these transmission probabilities can be either precomputed and stored in two-dimensional tables or, alternatively, computed at run time via a low-order quadrature described in [68].

Determination of Collision Probabilities

From neutron-conservation considerations, the collision rate for a given region can be specified symbolically as

$$(CR)_i = \left( 1 - T_i^{OO} \right) J_i^{-} + \left( 1 - P_i^{+} \right) S_i, \qquad i = 1$$

$$(CR)_i = \left( 1 - T_i^{OI} \right) J_{i-1}^{+} + \left( 1 - T_i^{OO} - T_i^{IO} \right) J_i^{-} + \left( 1 - P_i^{+} - P_i^{-} \right) S_i, \qquad i = 2, \ldots, N \tag{5.143}$$

where $i = 1$ denotes the innermost region, which has the configuration of a circle. In the original development of Kier [57], the corresponding flux for region $i$ and hfg $k$ is simply $\phi_i(u_k) = (CR)_i / \Sigma_{ti}(u_k)$. To put the results in the same context as Eq. 5.136, the traditional collision probabilities can be readily deduced from Eq. 5.143.
Physically, it is not difficult to see that the collision rate for region $j$ is identifiable with $P_{ij}$ if one sets $S_i = 1$ and $S_j = 0$ for $j \neq i$ in the equation above, i.e.,

$$P_{ij} = (CR)_j; \qquad S_i = 1, \quad S_j = 0 \ \text{if}\ j \neq i \tag{5.144}$$

Thus, the hfg flux for all subregions can be determined via Eq. 5.137, from which the cell-averaged broad group cross section defined by Eq. 5.138 can be specified. This approach has been incorporated into the RABBLE code [57] as well as the MC²-2 code [54].

5.5.4.2 Olson's Method for Unit Cell with Many Plates

During the period of fast reactor development, a great deal of emphasis was placed on the analysis of measurements from various fast critical assemblies made of drawers with a large number of plates. A much more rigorous alternative for such configurations was developed by Olson [58] without resorting to the cosine or cosine-square assumptions for the interface current made by Kier [57], conditions that are plausible for cylindrical cells. The rationale was that the neutron transport in an infinite lattice consisting of repeated cells with infinite slabs can be specified directly by the integral transport equation without resorting to the use of Eq. 5.139, as was evident from the earlier work of Corngold [62] and Rothenstein [61].

In the context of the hfg approach described previously, the scattering source for a given hfg $k$ and the i-th region, $S_i^{(k)}$, encompasses the sum of the scatter-in and self-scattering terms described in Section 5.4.4.3. In the approach proposed by Olson [58], the source is allowed to vary linearly within each plate and is then used in conjunction with the integral transport approach to compute the intra- and inter-cell currents. The resulting current at mean free path $\lambda$, $\lambda = \Sigma_{ti}\, x$, beyond the plate with optical thickness $\delta_i$ can be expressed as

$$\vec{J}(\lambda, \delta_i) = \frac{S_i^{(k)}}{2 \delta_i} \left[ E_3(\lambda) - E_3(\lambda + \delta_i) \right] + f(\lambda, \delta_i) \tag{5.145}$$

where the first term is identifiable with the traditional result using a constant source and the second term is a correction that takes into account the linear variation of the source. The function $f(\lambda, \delta_i)$ involves a linear combination of exponential integrals of both orders 3 and 4 [58]. If a plate $j$ is at mean free path $\lambda$ away from the plate $i$, the collision rate in $j$ due to the source in $i$ is

$$\overrightarrow{CR}(i \to j) = \vec{J}(\lambda, \delta_i) - \vec{J}(\lambda + \delta_j, \delta_i) \tag{5.146}$$

For infinite repeated cells with a periodic plate arrangement, the contribution of all plates $i$ to plate $j$ becomes

$$\overrightarrow{CR}_{\infty}(i \to j) = \sum_{m=0}^{\infty} \left[ \vec{J}(\lambda + mh, \delta_i) - \vec{J}(\lambda + \delta_j + mh, \delta_i) \right] \tag{5.147}$$

where $h$ is the optical thickness of the unit cell. A similar expression can also be derived for cells with a reflective boundary condition. The infinite sum involving exponential integrals of order $n \geq 3$ can be evaluated readily via the Gauss quadrature developed for this purpose [58]. Thus, the collision probability $P_{ij}$ for a unit cell is expressible as

$$P_{ij} = \frac{\overrightarrow{CR}_{\infty}(i \to j) + \overrightarrow{CR}_{\infty}(j \to i)}{S_i} \tag{5.148}$$

By substituting Eq. 5.148 into Eq. 5.136 and inverting the resulting matrix, the flux in a given plate for the hfg $k$ can be determined. The substitution of the flux so obtained into Eq. 5.138 yields the desired broad group cell-averaged cross section for applications. The approach described here was first coded in the RABID code [58] and later incorporated into the current MC²-2 code [54] as a valuable option.

5.6 Treatment of Unresolved Resonances

The treatment of the resonance self-shielding effect in the unresolved energy range constitutes one of the most important topics in resonance theory, especially in fast reactor applications. The range under consideration is usually defined as the energy span between the upper boundary of the resolved energy region and an upper bound beyond which the self-shielding effect becomes unimportant.
For major actinides, it spans the range from the low keV region up to about 100 keV. This is the energy region where the Doppler width becomes much larger than the resolution width of the instrument, so that resonances may not be parameterized deterministically. The methods for treating resonance absorption in this range can be viewed as a natural extension of the statistical theories of average cross sections such as those described by Moldauer [69] and Ericson [70]. Since the theoretical foundations may often be obscured in routine applications, it is useful to summarize briefly some conceptual aspects of the problem prior to discussing the basis for the computational methods. For our purposes, the discussion will focus on current methods based on the single-level Breit–Wigner approximation, although the same concept is extendable to the more rigorous S-matrix approach if its parameters are expressed in terms of the R-matrix parameters.

5.6.1 Statistical Theory Basis

The statistical basis for evaluating the average cross section $\langle \sigma_x \rangle_E$ is assumed to be extendable to the estimation of the expectation values of the reaction rate $\langle \sigma_x \phi \rangle_{E, E_r}$ and the flux $\langle \phi \rangle_{E, E_r}$ as well. Without loss of generality, each microscopic cross section is represented by the R-matrix formalism defined previously in terms of the parameters $E_\lambda$ and $\gamma_{\lambda c}$. From the statistical theory of spectra [71], the distributions of these parameters are well known. These distributions, in effect, define the joint density function (p.d.f.) required for evaluating the averages attributed to an ensemble of resonances in the vicinity of an energy, say $E^{*}$. Given information on $\langle |E_i - E_{i+1}| \rangle$ and $\langle \gamma_{ci}^2 \rangle$, and through explicit knowledge of the behavior of $\sigma_x$ and $\phi$ as functions of energy, the expectation values of interest are, in principle, completely specified.
5.6.1.1 Some Statistical Theory Fundamentals

A brief outline of some pertinent fundamentals of statistical theory is a good starting point.

Basic Rule

The probability distribution of an event $A$ characterized by statistically independent variables $(x_1, x_2, x_3, \ldots, x_i) \in A$ is defined as

\[ P(A) = \int\!\!\int\cdots\int \prod_i p_i(x_i)\,dx_1\,dx_2\cdots dx_i \tag{5.149} \]

where $p_i(x_i)$ is referred to as the probability density function (p.d.f.) for $x_i$, and the product of all p.d.f.'s is known as the joint density function for these variables. For the case under consideration here, these quantities are identifiable with the partial width distributions and the level correlation functions. The direct use of the joint density function provides the most widely used basis for computing the averages of interest to be described.

Addition Theorem

The probability distribution function for the union of two events is given by

\[ P(A\cup B) = P(A) + P(B) - P(AB) \tag{5.150} \]

Multiplication Theorem

The probability distribution of the product of two correlated events is given by

\[ P(AB) = P(A)\,P(B|A) = P(B)\,P(A|B) \tag{5.151} \]

where $P(B|A)$ is known as the conditional probability of $B$ for a given occurrence of the event $A$. The above relation must be symmetric when $A$ and $B$ are interchanged. This theorem is of great practical interest since it provides another basis for computing the averages of interest, known as the "probability table" method, to be described shortly.

5.6.1.2 Statistical Distributions of Practical Interest

There are three types of distributions for the resonance parameters of practical interest. They are summarized as follows.

Partial Width Distributions

It is well known that the reduced width amplitude $\gamma_{\lambda c}$ for a given level $\lambda$ and channel $c$ follows the normal distribution with zero mean and unit variance according to Porter and Thomas [72].
For a given reaction process, there may exist several channels and, consequently, the evaluation of the average quantities can require many multiple integrals over many variables. For the Breit–Wigner approximation, the partial width of a given reaction type is actually used. For a given reaction process $x$, it is defined as

\[ \Gamma_{\lambda x} = \sum_{c\in x} 2P_c\,\gamma_{\lambda c}^2 \tag{5.152} \]

where $P_c$ is the penetration factor described in Section 5.2.1.1. If one assumes that $\langle\gamma_{\lambda c}^2\rangle$ is the same for all $c \in x$, the probability density function for the partial width becomes

\[ p_\nu(y)\,dy = \frac{\nu}{2\,\Gamma(\nu/2)}\left(\frac{\nu y}{2}\right)^{\nu/2-1} e^{-\nu y/2}\,dy \tag{5.153} \]

where $p_\nu(y)$ is the well-known $\chi^2$-distribution with the number of degrees of freedom $\nu$ equal to the total number of exit channels for process $x$, $\Gamma(\nu/2)$ denotes the gamma function, and $y = \Gamma_x/\langle\Gamma_x\rangle$ is the local-to-average ratio of the partial width. It should be noted that Eq. 5.153 is no longer valid if $\langle\gamma_{\lambda c}^2\rangle$ differs among the channels $c \in x$, a scenario where the multichannel effect is important.

Level Spacing Distribution

The distribution of $E_\lambda$ is characterized by the Wigner distribution [73] or the long-range correlation described by Dyson [74]. The former, also known as Wigner's surmise (developed in 1957), states that the probability distribution of the level spacing $|E_\lambda - E_{\lambda'}|$ for a given $l$ and $J$ state should follow a simple analytical form,

\[ W(x)\,dx = \frac{\pi}{2}\,x\,\exp\!\left(-\frac{\pi}{4}x^2\right)dx \tag{5.154} \]

where $x = |E_\lambda - E_{\lambda'}|/\langle|E_\lambda - E_{\lambda'}|\rangle$ is the local-to-average ratio of the level spacing. The fact that $W(0) = 0$ signifies the repulsion of eigenvalues of a real and symmetric Hamiltonian matrix. Ensembles of such matrices were studied further by Wigner and others [71, 73, 74]. Of particular interest for practical applications is the so-called Gaussian orthogonal ensemble (GOE), in which the distributions of the matrix elements are taken to be statistically independent Gaussians and invariant under orthogonal transformation.
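As a minimal numerical illustration of the connection between a GOE-type ensemble and Eq. 5.154, the sketch below samples real symmetric 2 × 2 matrices with independent Gaussian elements and compares the distribution of their normalized eigenvalue spacing with the c.d.f. of the Wigner distribution. The function names, the sample size, and the choice of variances (unit variance on the diagonal, one half off the diagonal) are assumptions made for this illustration only.

```python
import math
import random

def goe_2x2_spacings(n_samples, seed=12345):
    """Normalized eigenvalue spacings of random real symmetric 2x2 matrices."""
    rng = random.Random(seed)
    spacings = []
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)               # diagonal element H11
        c = rng.gauss(0.0, 1.0)               # diagonal element H22
        b = rng.gauss(0.0, math.sqrt(0.5))    # off-diagonal element H12 = H21
        # eigenvalues of [[a, b], [b, c]] differ by sqrt((a - c)^2 + 4 b^2)
        spacings.append(math.sqrt((a - c) ** 2 + 4.0 * b * b))
    mean = sum(spacings) / len(spacings)
    return [s / mean for s in spacings]       # local-to-average ratio x

def wigner_cdf(x):
    """Integral of (pi/2) t exp(-pi t^2 / 4) from 0 to x (c.d.f. of Eq. 5.154)."""
    return 1.0 - math.exp(-math.pi * x * x / 4.0)

x = goe_2x2_spacings(200_000)
frac_below_one = sum(1 for v in x if v <= 1.0) / len(x)
frac_small = sum(1 for v in x if v <= 0.1) / len(x)
print(frac_below_one, wigner_cdf(1.0))   # sampled vs. analytical fraction
print(frac_small, wigner_cdf(0.1))       # small spacings are rare (level repulsion)
```

The small fraction of spacings below one tenth of the mean is the numerical counterpart of $W(0) = 0$.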
The Wigner distribution given above is identifiable with the behavior of the difference of eigenvalues of a $2\times 2$ matrix. In contrast, the latter approach proposed by Dyson [74] is free from the assumption used by the former. Here, the ensemble under consideration is not the Hamiltonian directly but an ensemble of eigenvalues $\{E_\lambda\}$ of a unitary matrix $S$, of the form $\exp(i\Theta_\lambda)$, uniformly distributed around a unit circle. For an orthogonal ensemble, a general expression for the $n$-level correlation function as a function of $\Theta_\lambda$ with $\lambda = 1, 2, 3, \ldots, n$ was developed by Dyson [74]. From the practical point of view, the general form of the distribution is difficult to use. In the limit of two levels, however, it is expressible in a simple analytical form that can be utilized for routine applications, as shown below.

Level Correlation Functions

In practical calculations, another distribution of interest when the direct integration approach is used for averaging is $\Omega(|E_k - E_j|)$, the probability of finding any level within an interval at a distance $|E_k - E_j|$ away from a given level $E_k$. It is of particular interest when the overlapping of levels is dominated by the immediate neighbors. If one sets $x = |E_k - E_j|/\langle|E_k - E_j|\rangle$ as before, this distribution can be defined by the following convolution integral equation,

\[ \Omega(x) = W(x) + \int_0^x W(t)\,\Omega(x-t)\,dt \tag{5.155} \]

where $W(x)$ is the Wigner distribution of the level spacing belonging to a resonance sequence of a given $l$ and $J$ previously defined. A closed-form analytical solution to this equation is difficult to derive, but the equation can be handled numerically.
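One way to handle Eq. 5.155 numerically is sketched below. It is only an illustration under stated assumptions: the convolution is discretized on a uniform grid with the trapezoidal rule, which turns the integral equation into a forward recursion, and the function names, grid size, and cutoff `x_max` are hypothetical choices rather than anything prescribed by the text.

```python
import math

def wigner_pdf(x):
    """Wigner distribution of Eq. 5.154 for the spacing ratio x."""
    return 0.5 * math.pi * x * math.exp(-0.25 * math.pi * x * x)

def level_correlation(spacing_pdf, x_max, n_steps):
    """Solve Omega(x) = W(x) + int_0^x W(t) Omega(x - t) dt on a uniform grid.

    The trapezoidal rule is applied to the convolution, so Omega at grid
    point n depends only on values already computed at earlier points.
    """
    h = x_max / n_steps
    xs = [i * h for i in range(n_steps + 1)]
    w = [spacing_pdf(x) for x in xs]
    omega = [w[0]]                        # at x = 0 the integral vanishes
    for n in range(1, n_steps + 1):
        conv = 0.5 * w[n] * omega[0]      # end-point term at t = x_n
        for m in range(1, n):
            conv += w[m] * omega[n - m]   # interior points of the convolution
        # the t = 0 end-point term contains the unknown Omega(x_n) itself
        omega.append((w[n] + h * conv) / (1.0 - 0.5 * h * w[0]))
    return xs, omega

xs, omega = level_correlation(wigner_pdf, x_max=4.0, n_steps=800)
for i in (0, 100, 200, 400, 800):
    print(f"x = {xs[i]:.2f}  Omega = {omega[i]:.4f}")
```

The solution starts from zero at the origin, reflecting level repulsion, and settles toward unity at large separations, where the position of a distant level is no longer correlated with the reference level.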
It is interesting to note that a closed-form analytical solution can be derived via the Laplace transform if $W(x)$ is replaced by the $\chi^2$-distribution of even order [75], i.e.,

\[ \Omega(x) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{(\nu/2)^{\nu/2}\,e^{xp}\,dp}{(p+\nu/2)^{\nu/2} - (\nu/2)^{\nu/2}} = \frac{(\nu/2)\,e^{-(\nu/2)x}}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{e^{(\nu/2)xz}\,dz}{z^{\nu/2} - 1} \tag{5.156} \]

whereby, for all even $\nu \ge 2$, it is expressible in closed form as a damped oscillatory function, as is readily shown via the Cauchy integral theorem with the poles of the integrand computed via De Moivre's theorem. In particular, for $\nu = 8$ the corresponding $\chi^2$-distribution resembles the Wigner distribution, and Eq. 5.156 yields the following simple analytical form,

\[ \Omega(x) = 1 - e^{-8x} - 2\sin(4x)\,e^{-4x} \tag{5.157} \]

which was used in earlier work based on the "nearest level" approximation for estimating the overlap effects on the self-shielded cross sections attributed to immediate neighbors. Physically, Dyson's two-level correlation function can be considered the rigorous alternative to $\Omega(x)$. It was later introduced into reactor applications by replacing $\Omega(x)$ with $r_2(x)$ derived by Dyson [74], i.e.,

\[ r_2(x) = 1 - [s(x)]^2 - \frac{ds(x)}{dx}\int_x^{\infty} s(y)\,dy, \qquad s(x) = \frac{\sin \pi x}{\pi x} \tag{5.158} \]

There is another type of level correlation function of practical interest. If the level ensembles $\{E_k\}$ and $\{E_j\}$ belong to either different spin states or different nuclides, the distribution of $|E_k - E_j|$ must follow the Poisson distribution. Thus, if one substitutes the $\chi^2$-distribution with $\nu = 2$ into Eq. 5.156 in place of the Wigner distribution, the corresponding correlation function becomes $\Omega(x) = 1$. Physically, this signifies that finding a level at any distance $|E_k - E_j|$ is equally probable so long as these levels are statistically independent.
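The two correlation functions of Eqs. 5.157 and 5.158 can be compared with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, and the tail integral of $s(y)$ is obtained from the known value $\int_0^\infty s(y)\,dy = 1/2$ minus a Simpson's-rule integral over $[0, x]$, a numerical choice made here for simplicity.

```python
import math

def omega_chi8(x):
    """Eq. 5.157: closed form for chi-square distributed spacing with nu = 8."""
    return 1.0 - math.exp(-8.0 * x) - 2.0 * math.sin(4.0 * x) * math.exp(-4.0 * x)

def s_func(x):
    """s(x) = sin(pi x) / (pi x), with the limit s(0) = 1."""
    if abs(x) < 1e-8:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def s_derivative(x):
    """ds/dx; the leading Taylor term -pi^2 x / 3 is used very close to zero."""
    if abs(x) < 1e-4:
        return -math.pi ** 2 * x / 3.0
    px = math.pi * x
    return (math.cos(px) - math.sin(px) / px) / x

def s_tail_integral(x, n=2000):
    """Integral of s(y) from x to infinity (n must be even for Simpson's rule)."""
    if x == 0.0:
        return 0.5
    h = x / n
    total = s_func(0.0) + s_func(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * s_func(i * h)
    return 0.5 - total * h / 3.0

def dyson_r2(x):
    """Eq. 5.158: Dyson's two-level correlation function."""
    s = s_func(x)
    return 1.0 - s * s - s_derivative(x) * s_tail_integral(x)

for x in (0.0, 0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"x = {x:4.1f}  Omega = {omega_chi8(x):.4f}  r2 = {dyson_r2(x):.4f}")
```

Both functions vanish at the origin and approach unity at large separations. They differ near the origin, where $r_2(x)$ rises linearly (approximately as $\pi^2 x/6$) whereas the $\nu = 8$ form rises much more slowly.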
5.6.1.3 Conceptual Aspects of Computing Average Cross Sections

Conceptually, the computation of these averages can be considered as the natural extension of the previously described treatment of resonance cross sections in the resolved energy range. With no loss of generality, consider an ensemble of a generic quantity $q_k^{(l,J)}(E)$ attributed to a Breit–Wigner resonance $k$ for a given $l$ and $J$ state. The statistically averaged value, $\langle q_k^{(l,J)}(E)\rangle$, can be pictured as the population average in the following context,

\[ \left\langle q_k^{(l,J)}(E)\right\rangle_{E^*,\,\mathrm{par.}} = \frac{1}{E_2-E_1}\sum_{k=1}^{N}\int_{E_1}^{E_2} q_k^{(l,J)}(E)\,dE = \frac{1}{\langle D\rangle}\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N}\int_{-\infty}^{\infty} q_k^{(l,J)}(E)\,dE = \frac{1}{\langle D\rangle}\left\langle\int_{-\infty}^{\infty} q_k^{(l,J)}(E)\,dE\right\rangle_{\mathrm{par.}} \tag{5.159} \]

where the average resonance parameters for the distributions are specified at the predetermined energy $E^*$ provided by the evaluated data file. Here, $\langle D\rangle$ denotes the average level spacing. The equivalence of the discrete average and the statistical average clearly requires the ergodicity and stationarity of the samples inside the angular bracket in the vicinity of $E^*$.

This concept has been widely used as a basis for computing the unshielded, as well as the shielded, group cross sections. For the former, $q_k^{(l,J)}(E)$ is set to $\sigma_{xk}^{(l,J)}(E)$ as defined in Eqs. 5.41 and 5.42. The latter requires the average reaction rate as well as the average flux, which can be obtained by setting $q_k^{(l,J)}(E)$ equal to $\sigma_{xk}^{(l,J)}(E)\,\phi_k(E)$ and $\phi_k(E)$, respectively.

The relation defined in Eq. 5.159 can be cast into a different context from the standpoint of the widely used Monte Carlo technique. Given the probability density functions for level spacing and partial widths, one can construct the corresponding discrete resonance sequence whereby the averages can be treated in the same manner as the resolved resonances, as shown in the first part of Eq. 5.159. This can be accomplished by the following procedure. First, one specifies the c.d.f.
from the known p.d.f., say $p(x)$,

\[ P(x) = \int_0^x p(y)\,dy; \qquad x = P^{-1}[P(x)] \tag{5.160} \]

where $P^{-1}$ symbolically represents the inverse relation, i.e., the value $x$ obtainable from a given value of $P(x)$. By definition, $\int_0^\infty p(x)\,dx = 1$, so that the range of the c.d.f. must be between 0 and 1. $P(x)$ can be either the c.d.f. for level spacing or that for partial widths where appropriate. Second, a resonance sequence can be generated if one assumes that $P(x)$ is equally probable within the range $0 \le P(x) \le 1$. This can be accomplished by choosing a random number $\xi_i$ from a random number generator, a computer program for generating statistically uncorrelated numbers with $0 \le \xi_i \le 1$, and setting $P_i = \xi_i$, from which the corresponding parameter $x_i$ can be obtained from the inverse relation given by Eq. 5.160. Thus, by repeating this process for the c.d.f.'s of all resonance parameters, a resonance sequence consisting of discrete resonances can be generated.

A sequence of discrete resonances so generated is commonly referred to as a "resonance ladder." Given the resonance ladder, the desired averages can be treated in the same manner as if the resonances were resolved. It is important to realize, however, that this procedure is always accompanied by significant statistical uncertainties reflecting the highly fluctuating nature of the resonance phenomena. Reducing such uncertainties therefore generally requires the inclusion of many statistically uncorrelated ladders. This undoubtedly hinders the routine usage of such a method, especially when high accuracy is required.

It should be noted that the desired averages can also be specified from a somewhat different perspective via the use of the multiplication theorem defined by Eq. 5.151. This concept leads to an alternative approach commonly referred to as the "probability table" method, which is particularly attractive in conjunction with reactor calculations using the Monte Carlo method. A discussion of this method will be presented later.
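The two-step sampling procedure of this subsection can be sketched as follows. This is an illustration under stated assumptions rather than a production algorithm: level spacings are drawn by inverting the c.d.f. of the Wigner distribution (Eq. 5.154), neutron widths by inverting the c.d.f. of the $\chi^2$-distribution with one degree of freedom (Eq. 5.153), and the capture width is held constant. The function names and all numerical parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal, used to invert the nu = 1 c.d.f.

def sample_spacing_ratio(xi):
    """Invert P(x) = 1 - exp(-pi x^2 / 4), the c.d.f. of the Wigner distribution."""
    return math.sqrt(-4.0 * math.log(1.0 - xi) / math.pi)

def sample_width_ratio(xi):
    """Invert the chi-square (nu = 1) c.d.f., P(y) = 2 Phi(sqrt(y)) - 1."""
    z = _STD_NORMAL.inv_cdf(0.5 * (1.0 + xi))
    return z * z

def make_ladder(e_start, n_levels, mean_spacing, mean_gn, gamma_g, seed=1):
    """Return a list of (E_k, Gamma_n, Gamma_gamma) tuples forming one ladder."""
    rng = random.Random(seed)
    ladder = []
    energy = e_start
    for _ in range(n_levels):
        energy += mean_spacing * sample_spacing_ratio(rng.random())
        gn = mean_gn * sample_width_ratio(rng.random())
        ladder.append((energy, gn, gamma_g))
    return ladder

# Illustrative values only (eV): <D> = 20, <Gamma_n> = 2e-3, Gamma_gamma = 23e-3
ladder = make_ladder(1000.0, 20000, 20.0, 2.0e-3, 23.0e-3)
mean_d = (ladder[-1][0] - ladder[0][0]) / (len(ladder) - 1)
mean_gn = sum(level[1] for level in ladder) / len(ladder)
print(mean_d / 20.0, mean_gn / 2.0e-3)   # both ratios should be close to 1
```

Even with 20,000 levels the sampled mean neutron width fluctuates at the percent level, which illustrates why many uncorrelated ladders are needed when high accuracy is required.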
5.6.2 Average Unshielded Cross Sections and Fluctuation Integrals

The simplest possible averages based on the statistical concept given in the previous subsection are the average unshielded cross sections, also referred to as the "infinitely dilute" cross sections. Such averages are generally temperature-independent because the Doppler-broadened line-shape functions are invariant when integrated over energy. One average of particular importance is the average compound nucleus cross section. Substituting Eq. 5.42 into 5.159, one obtains

\[ \langle\sigma_{tR}\rangle = 2\pi^2\bar{\lambda}^2\sum_{l,J} g_J\cos 2\phi_l\; s_{l,J} \tag{5.161} \]

where $s_{l,J} = \langle\Gamma_n\rangle/\langle D\rangle$ is known as the "strength function," a quantity directly relatable to transmission measurements. For partial cross sections, the unshielded averages are directly expressible in terms of fluctuation integrals. The substitution of Eq. 5.41 into 5.159 leads to

\[ \langle\sigma_x\rangle = 2\pi^2\bar{\lambda}^2\sum_{l,J}\frac{g_J}{\langle D\rangle_J}\left\langle\frac{\Gamma_n\Gamma_x}{\Gamma_t}\right\rangle, \qquad x\in\gamma \text{ or } f \tag{5.162} \]

\[ \langle\sigma_{sR}\rangle = 2\pi^2\bar{\lambda}^2\sum_{l,J}\frac{g_J}{\langle D\rangle_J}\left[\left\langle\frac{\Gamma_n^2}{\Gamma_t}\right\rangle - 2\sin^2\phi_l\,\langle\Gamma_n\rangle\right] \tag{5.163} \]

for the reaction $x$ cross section and the resonance scattering cross section, respectively. It should be noted that the total width in practical applications is taken to be the sum of all partial widths plus a "competitive width" $\Gamma_c$, i.e., $\Gamma_t = \Gamma_n + \Gamma_\gamma + \Gamma_f + \Gamma_c$. The latter provides an approximate means of accounting for the inelastic scattering channels in the energy range above the first excited state of the nuclide in question.

The quantity $\langle\Gamma_n\Gamma_x/\Gamma_t\rangle$ is referred to as the "fluctuation integral," which can be computed accurately via direct integration over all distributions of the partial widths. Let $f_x = \langle\Gamma_n\Gamma_x/\Gamma_t\rangle$ be the multiple integral over the $\chi^2$-distributions of the partial widths involved. Taking advantage of the exponential nature of these distributions, one can readily show that $f_x$ can be reduced to a single integral of the generic form

\[ f_x = aC\int_0^\infty\frac{e^{-t}\,dt}{(1+c_1t)^{(\nu_1/2)+j}\,(1+c_2t)^{(\nu_2/2)+k}\,(1+c_3t)^{(\nu_3/2)+m}} \tag{5.164} \]

for $x \in \gamma$, $s$, or $f$, where $c_i = 2\langle\Gamma_{x_i}\rangle/(\nu_i\Gamma_\gamma)$, with $\nu_i$ the number of degrees of freedom in the context given by Eq. 5.153 and $i \in \{1, 2, 3\}$ corresponding to $\langle\Gamma_{x_i}\rangle = \langle\Gamma_n\rangle$, $\langle\Gamma_f\rangle$, and $\langle\Gamma_c\rangle$, respectively, and $C = \langle\Gamma_n\rangle\langle\Gamma_x\rangle/\Gamma_\gamma$. The constant $a$ is equal to unity for all $x \ne s$, while $a = 3$ when the fluctuation integral for the scattering cross section that appears in the first term of Eq. 5.163 is considered. The integer constants $j$, $k$, and $m$ depend on the type of reaction considered. For $x \in \gamma$, $j = 1$, $k = 0$, and $m = 0$. For $x \in f$, $j = 1$, $k = 1$, and $m = 0$. For $x \in s$, $j = 2$, $k = 0$, and $m = 0$. For $x \in c$, $j = 1$, $k = 0$, and $m = 1$. It should be noted that $f_c$ is not used in practical calculations.

One way to evaluate the single integral given by Eq. 5.164 is via a Gauss quadrature. In particular, the integral can be computed accurately and efficiently if a change of variable that maps the semi-infinite range of $t$ onto a finite interval, e.g., $t = (1+y)/(1-y)$ with $-1 \le y < 1$, is made prior to integration and is followed by the use of the Gauss–Legendre quadrature in the $y$-domain [77].

The average unshielded cross sections computed accurately via the fluctuation integrals provide the necessary criteria for verifying those computed by other methods, whether deterministic or Monte Carlo. Any disagreement in the unshielded cross sections clearly will be passed on to the calculations of the self-shielded cross sections as well.

5.6.3 Traditional Approaches for Computing Self-shielded Average Cross Sections

The discussion in this section focuses on the traditional methods for computing the self-shielded cross sections in the unresolved resonance range that directly utilize the known joint probability distribution of resonance parameters described in Section 5.6.1.1. Two different types of approaches based on the same general principle are currently in use. The most commonly used approach is based on direct integration, a procedure similar to that of computing the unshielded averages.
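Since direct integration builds on the same machinery as the unshielded averages, a small sketch of the single-integral reduction of the fluctuation integral (Eq. 5.164) may be helpful. It covers only the simplest case, assumed here for illustration: capture ($j = 1$, $k = m = 0$) with a constant capture width, a neutron width following the $\chi^2$-distribution with $\nu_n$ degrees of freedom, and no fission or competitive width. The mapping $t = (1+y)/(1-y)$, the quadrature order, the function names, and the width values are choices of this sketch.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))    # initial guess for root i
        for _ in range(100):                              # Newton iteration on P_n(x)
            p0, p1 = 1.0, x
            for k in range(2, n + 1):                     # Legendre three-term recurrence
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)        # derivative P_n'(x)
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def capture_fluctuation_integral(mean_gn, gamma_g, nu_n=1, order=64):
    """<Gamma_n Gamma_g / Gamma_t> from the single-integral form of Eq. 5.164."""
    c1 = 2.0 * mean_gn / (nu_n * gamma_g)
    nodes, weights = gauss_legendre(order)
    total = 0.0
    for y, w in zip(nodes, weights):
        t = (1.0 + y) / (1.0 - y)           # maps [-1, 1) onto [0, infinity)
        jacobian = 2.0 / (1.0 - y) ** 2     # dt/dy
        total += w * jacobian * math.exp(-t) / (1.0 + c1 * t) ** (0.5 * nu_n + 1.0)
    return mean_gn * total                  # C = <Gamma_n> for capture

mean_gn, gamma_g = 2.0e-3, 23.0e-3          # illustrative widths in eV
f_gamma = capture_fluctuation_integral(mean_gn, gamma_g)
no_fluctuation = mean_gn * gamma_g / (mean_gn + gamma_g)
print(f_gamma, no_fluctuation, f_gamma / no_fluctuation)
```

The printed ratio is below unity: averaging over the width distribution always lowers the result relative to the value obtained by inserting the average widths directly.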
Another less commonly used approach, referred to as the "resonance ladder" method, is based on statistically generated discrete resonance sequences via the use of the Monte Carlo technique described in Section 5.6.1.3. A brief discussion of these methods is now presented.

5.6.3.1 Methods Based on Direct Integrations

For the direct integration approach, the self-shielded cross section for a given reaction process $x$ associated with a given $l$ and $J$ state is defined on a statistical basis in the vicinity of a given energy $E^*$ and can be viewed as the natural extension of the J-integral approach described in Section 5.4.4.2 if the NR-approximation is assumed [53]. The only difference is that the sum over individual resonances is replaced by a statistical average in the context of Eq. 5.159, i.e.,

\[ \tilde{\sigma}_x^{(l,J)} = \frac{1}{f}\left\langle\frac{\Sigma_b\,\sigma_x^{(l,J)}(E)}{\Sigma_b+\Sigma_{Rt}(E)}\right\rangle; \qquad f = 1 - \left\langle\frac{\Sigma_{Rt}(E)}{\Sigma_b+\Sigma_{Rt}(E)}\right\rangle \tag{5.165} \]

where $\Sigma_b$ denotes the macroscopic energy-independent background cross section, including all potential scattering cross sections and smooth cross sections given by the data file as well as the "equivalent" cross section for the cell where appropriate.

As in the case of the resolved resonances, the presence of a neighboring resonance can affect the resonance integral of a given resonance through its impact on the local flux. From statistical considerations, such overlap effects can come from two types of resonances. They may belong either to resonances of the same spin sequence or to those of a different spin sequence of the same nuclide and/or sequences of different nuclides. The former requires statistical averaging over the level correlation function defined by Eq. 5.157 or 5.158, while the latter requires integration over the uniform distribution, or $\Omega(x) = 1$, as described in Section 5.6.1.2, in addition to averaging over the partial widths. The two scenarios are described separately as follows.
Scenario 1: One Spin Sequence Only

In the absence of other statistically independent resonance sequences, Eq. 5.165 can be expressed as [53]

\[ \tilde{\sigma}_x^{(l,J)} = \frac{\langle\sigma_x\phi\rangle_E}{\langle\phi\rangle_E} = \frac{\dfrac{\sigma_b}{\langle D\rangle}\left\langle\Gamma_x\left[J(\theta_k,\beta_k,a_k) - O^{(x)}_{kk'}\right]\right\rangle}{1 - \dfrac{1}{\langle D\rangle}\left\langle\Gamma_t\left[J(\theta_k,\beta_k,a_k,a_k) - O^{(t)}_{kk'}\right]\right\rangle} \tag{5.166} \]

where $\sigma_b = \Sigma_b/N_0$ is the background cross section per atom density of the nuclide under consideration. As demonstrated in [53, 77], the statistical average can be computed efficiently via the use of Gauss quadratures. In particular, the integration involving the level correlation function can be significantly simplified if the overlap effects can be attributed to the nearest level, an approximation usually referred to as the "nearest neighbor" approximation.

Scenario 2: Presence of Many Statistically Independent Resonance Sequences

As described in [53], the integrand in Eq. 5.165 can, in principle, also be partial-fractioned into a form similar to that of Eq. 5.166 in the presence of other statistically independent resonance sequences. Let the subscript $i$ denote such a sequence and $J_k$ denote the corresponding J-integral. The statistical averaging of such an integral requires not only integration over the correlation function of neighbors of the $k$-th sequence but also integration over a uniform distribution of those belonging to the $i$-th sequence. The latter integration can be approximated readily to first order, and the resulting expressions for the reaction rate and the flux become

\[ \langle\sigma_{xk}\phi\rangle \approx \frac{\sigma_b}{\langle D_k\rangle}\left\langle\Gamma_{xk}J_k^{(x)}\right\rangle\left[1 - \sum_{i\ne k}\frac{1}{\langle D_i\rangle}\left\langle\Gamma_{ti}J_i^{(t)}\right\rangle\right] \tag{5.167} \]

\[ f \approx 1 - \sum_{\mathrm{all}\ m}\frac{1}{\langle D_m\rangle}\left\langle\Gamma_{tm}J_m^{(t)}\right\rangle \tag{5.168} \]

where $J_k^{(x)}$ denotes the quantity inside the square bracket in the numerator of Eq. 5.166, while $J_k^{(t)}$ denotes the quantity inside the square bracket in the denominator of Eq. 5.166. Note that the second-order terms in Eqs.
5.167 and 5.168 are usually small and the inter-sequence overlap terms can be approximated by

\[ 1 - \sum_{\mathrm{all}\ i}\frac{1}{\langle D_i\rangle}\left\langle\Gamma_{ti}J_i^{(t)}\right\rangle \approx \prod_{\mathrm{all}\ i}\left[1 - \frac{1}{\langle D_i\rangle}\left\langle\Gamma_{ti}J_i^{(t)}\right\rangle\right] \tag{5.169} \]

Thus, the substitution of Eq. 5.169 into 5.167 and 5.168 leads to an extremely simple expression for the average self-shielded cross section that is particularly useful for reactor applications,

\[ \tilde{\sigma}_x = \sum_{l,J}\tilde{\sigma}_x^{(l,J)} \tag{5.170} \]

where $\tilde{\sigma}_x^{(l,J)}$ depends on the in-sequence overlap term but is independent of the inter-sequence overlap terms, as given by Eq. 5.166.

Physically, the direct integration approach is directly compatible with the fluctuation integral approach for computing the unshielded average cross sections. In the limit of infinite dilution, the average shielded cross section approaches the unshielded cross section based on the fluctuation integral.

5.6.3.2 Methods Using the Statistically Generated Resonance Ladders

As described in Section 5.6.1.3, the unresolved resonances can also be pictured as an ensemble of discrete resonances generated from the known distribution functions of resonance parameters via the usual Monte Carlo techniques. One advantage of this approach is that subsequent calculations of the shielded average cross sections are no longer constrained by the NR-approximation required by the previous methods based on direct integration. However, the resulting cross sections computed on this basis are subject to large statistical uncertainties that are difficult to circumvent. Therefore, this approach is not as widely used in practical applications as the others.

5.6.4 Probability Table Methods

One method particularly amenable to use in conjunction with the Monte Carlo approach for reactor applications is the "probability table" method pioneered by Levitt [78].
The method bypasses the need to use the ladder approach directly at run time, as was done in earlier Monte Carlo codes, a procedure that would not only require large storage space and excessive computing time but would also lead to less manageable statistical uncertainties in its results. Conceptually, the general idea is to utilize the probability and conditional probability distributions themselves, deducible numerically from the known distributions of resonance parameters, so that the averages of interest can be cast into the form of a Lebesgue integral instead of the Riemann integral described earlier.

5.6.4.1 Conceptual Basis

Consider a simple case where the neutron flux $\phi(\sigma_t)$ is a function of $\sigma_t$ alone, i.e., the NR-approximation. One alternative joint probability distribution required to specify the average of the type $\langle\sigma_x\phi(\sigma_t)\rangle$ is, according to the multiplication theorem,

\[ \langle\sigma_x\phi(\sigma_t)\rangle = \int_0^{\sigma_t^{(\max)}} p(\sigma_t)\,E(\sigma_x|\sigma_t)\,\phi(\sigma_t)\,d\sigma_t; \qquad \langle\phi(\sigma_t)\rangle = \int_0^{\sigma_t^{(\max)}}\phi(\sigma_t)\,p(\sigma_t)\,d\sigma_t \tag{5.171} \]

where $p(\sigma_t)$ is the p.d.f. for the total cross section and $E(\sigma_x|\sigma_t)$ is referred to as the conditional mean of the partial cross section, i.e., the average of $\sigma_x$ over the conditional probability $p(\sigma_x|\sigma_t)$, defined as

\[ E(\sigma_x|\sigma_t) = \int_0^{\sigma_x^{(\max)}}\sigma_x\,p(\sigma_x|\sigma_t)\,d\sigma_x \tag{5.172} \]

which can be precomputed once the conditional probability is known. Thus, if these probability distributions are deducible from the known distributions of resonance parameters, the averages of practical interest can be expressed in terms of single integrals of the Lebesgue form given by Eq. 5.171.

For practical applications, only prior knowledge of $p(\sigma_t)$ and the conditional means of the various partial cross sections is required at run time. The main attraction of this approach is that $p(\sigma_t)$, along with its c.d.f. and the various conditional means, can be predetermined and tabulated in one-dimensional arrays for a prescribed $E^*$ and different temperatures before their deployment.
The same principle is extendable to the treatment of the resolved resonances in conjunction with the subgroup methods developed independently by Nikolaev [79], Cullen [80] and Ribon [75]. Various numerical methods for computing these tabulated quantities are addressed below.

One word of caution applies if this approach is used beyond the NR-approximation. Strictly speaking, this would require the conditional distribution of the form $p(\sigma_x|\sigma_s,\sigma_t)$, which may negate, at least in part, the advantage of simplicity described above.

5.6.4.2 Methods for Computing the Tabulated Quantities

There are two widely used methods for computing the discrete values of $p(\sigma_t)$ and $E(\sigma_x|\sigma_t)$ as a function of $\sigma_t$. One is the numerical scheme based on statistically generated resonance "ladders" originally developed by Levitt [78]. The other is the moment-based approach, which matches the moments and partial moments of cross sections, developed by Ribon and Maillard [75]. A brief outline of these methods is presented below.

Method Based on Resonance Ladders

The rationale is relatively straightforward. The discretized $p(\sigma_t)$ and the corresponding conditional means are determined through the following steps.

1. For a given reference energy $E^*$ where the averages are to be computed, a set of total cross section bands $\{\sigma_{ti}\}$ with $0 < \sigma_{ti} < \sigma_t^{(\max)}$ is preselected as the mesh points for the tables to be constructed. The values $\sigma_t = \sigma_{t,i-1}$ and $\sigma_t = \sigma_{ti}$, graphically equivalent to two parallel lines in a plot of $\sigma_t$ vs. $E$, constitute the $i$-th "band." The index $i$ will henceforth be used to denote the $\sigma_t$ bands that serve as entries to the tables.
2. Select an energy range $\Delta E = E_2 - E_1$ around $E^*$ with $\Delta E \gg \langle D\rangle$, the average spacing of the resonance ladder in question.
3. Generate a "ladder" of resonances with the peak resonance energies falling inside $\Delta E$ using the usual Monte Carlo scheme described in Section 5.6.1.3.
4. Construct a fine energy mesh $\{E_k\}$ within the preselected interval $\Delta E$, from which the corresponding point-wise total cross sections $\{\sigma_{tk}\}$ and partial cross sections $\{\sigma_{xk}\}$ are computed.
5. Determine a subset of energy pairs $\{E_{2j}, E_{2j-1}\}$, signifying the consecutive pairs of intersections created between the upper and lower lines of the $i$-th "band" with the plot of $\sigma_t$ vs. $E$, which can be obtained via the criterion $\sigma_{t,i-1} < \sigma_{tk} < \sigma_{ti}$. The index $j$ will henceforth be used to denote the consecutive energy pairs sandwiched by a given band $i$.
6. Store the localized average partial cross sections $\bar{\sigma}_{xj}^{(L)}(E_{2j}, E_{2j-1})$ and total cross section $\bar{\sigma}_{tj}^{(L)}(E_{2j}, E_{2j-1})$ of this ladder $L$ for all energy pairs within the given band.
7. The localized average of the probability density function (p.d.f.) of the total cross section at the $i$-th band amounts to the fraction of $\bar{\sigma}_{tj}^{(L)}(E_{2j}, E_{2j-1})$ that falls in the range between $\sigma_{t,i-1}$ and $\sigma_{ti}$, i.e.,

\[ p_i^{(L)} = \begin{cases}\dfrac{\sum_{j=1}^{J}|E_{2j}-E_{2j-1}|}{\Delta E}, & \sigma_{t,i-1} < \bar{\sigma}_{tj}^{(L)}(E_{2j},E_{2j-1}) < \sigma_{ti}\\[2ex] 0, & \text{elsewhere}\end{cases} \tag{5.173} \]

if and only if the band width is sufficiently small.
8. The c.d.f. for the given ladder is simply $P_i^{(L)} = \sum_{k=1}^{i} p_k^{(L)}$.
9. The corresponding conditional mean of the partial cross section at the $i$-th band becomes

\[ E(\sigma_x|\sigma_{ti})^{(L)} = \begin{cases}\dfrac{1}{\Delta E}\sum_j (E_{2j}-E_{2j-1})\,\bar{\sigma}_{xj}^{(L)}(E_{2j},E_{2j-1}), & \sigma_{t,i-1} < \bar{\sigma}_{tj}^{(L)}(E_{2j},E_{2j-1}) < \sigma_{ti}\\[2ex] 0, & \text{elsewhere}\end{cases} \tag{5.174} \]

Steps 7 through 9 are repeated for all predetermined bands.
10. Repeat steps 4 through 8 as many times as one desires. The final quantities to be tabulated are just the population averages of all ladders considered.

Moment Method

The quantities of interest can also be deduced from the known moments and partial moments of cross sections via the method developed by Ribon and Maillard [75]. Their rationale is that the usual Riemann integral, in principle, can be truncated into the form of a discretized Lebesgue integral.
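The ladder-based steps above, and the sense in which a table replaces a Riemann sum over energy by a Lebesgue-type sum over cross-section bands, can be illustrated with a toy calculation. Everything in this sketch is an assumption made for illustration: the cross sections are unbroadened symmetric Breit–Wigner shapes in arbitrary units with no interference, a single ladder is used, the bands are logarithmically spaced, a uniform energy mesh stands in for the energy pairs of step 5, and the conditional mean is normalized by the band probability (so that summing $p_i$ times the band mean reproduces the average partial cross section).

```python
import bisect
import math
import random

def make_ladder(n_levels, mean_spacing, mean_gn, gamma_g, rng):
    """Hypothetical ladder: Wigner spacings, chi-square (nu = 1) neutron widths."""
    ladder, energy = [], 0.0
    for _ in range(n_levels):
        energy += mean_spacing * math.sqrt(-4.0 * math.log(1.0 - rng.random()) / math.pi)
        ladder.append((energy, mean_gn * rng.gauss(0.0, 1.0) ** 2, gamma_g))
    return ladder

def cross_sections(e, ladder, sigma_pot, peak_scale):
    """Toy total and capture cross sections at energy e (arbitrary units)."""
    sig_t, sig_g = sigma_pot, 0.0
    for e0, gn, gg in ladder:
        gt = gn + gg
        shape = peak_scale * (gn / gt) / (1.0 + (2.0 * (e - e0) / gt) ** 2)
        sig_t += shape
        sig_g += shape * gg / gt
    return sig_t, sig_g

def probability_table(sig_t, sig_x, edges):
    """Rows (p_i, band mean of sigma_t, band mean of sigma_x) for non-empty bands."""
    n_bands = len(edges) - 1
    count, sum_t, sum_x = [0] * n_bands, [0.0] * n_bands, [0.0] * n_bands
    for st, sx in zip(sig_t, sig_x):
        i = min(max(bisect.bisect_right(edges, st) - 1, 0), n_bands - 1)
        count[i] += 1
        sum_t[i] += st
        sum_x[i] += sx
    return [(count[i] / len(sig_t), sum_t[i] / count[i], sum_x[i] / count[i])
            for i in range(n_bands) if count[i]]

rng = random.Random(2024)
ladder = make_ladder(40, 1.0, 0.05, 0.05, rng)
e_lo, e_hi = ladder[0][0], ladder[-1][0]
n_mesh = 10000
mesh = [e_lo + (e_hi - e_lo) * (k + 0.5) / n_mesh for k in range(n_mesh)]
pairs = [cross_sections(e, ladder, 10.0, 1000.0) for e in mesh]
sig_t = [p[0] for p in pairs]
sig_g = [p[1] for p in pairs]

edges = [min(sig_t)]                     # logarithmic band boundaries
while edges[-1] <= max(sig_t):
    edges.append(edges[-1] * 1.1)
table = probability_table(sig_t, sig_g, edges)

sigma_b = 50.0                           # background cross section, NR flux ~ 1/(sigma_t + sigma_b)
direct = (sum(x / (t + sigma_b) for t, x in zip(sig_t, sig_g))
          / sum(1.0 / (t + sigma_b) for t in sig_t))
from_table = (sum(p * x / (t + sigma_b) for p, t, x in table)
              / sum(p / (t + sigma_b) for p, t, x in table))
unshielded_direct = sum(sig_g) / n_mesh
unshielded_table = sum(p * x for p, t, x in table)
print(len(table), unshielded_direct, unshielded_table, direct, from_table)
```

With a few dozen bands the table reproduces the unshielded average identically and the self-shielded average closely, while storing far less information than the point-wise mesh. A practical table would average over many ladders, as step 10 indicates.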
For the case in point, the $n$-th order moment and partial moments can be expressed respectively as

\[ M_n = \frac{1}{\Delta E}\int_{\Delta E}\sigma_t^n(E)\,dE = \sum_{i=1}^{N} p_i\,\sigma_{ti}^n; \qquad M_n^{(x)} = \frac{1}{\Delta E}\int_{\Delta E}\sigma_x(E)\,\sigma_t^n(E)\,dE = \sum_{i=1}^{N} p_i\,\sigma_{xi}\,\sigma_{ti}^n \tag{5.175} \]

where $\sigma_{xi} = E(\sigma_{xi}|\sigma_{ti})$. The question is how to compute these discretized quantities. The same rationale is equally applicable to the resolved as well as the unresolved treatments in the context of Eq. 5.159. Ribon and Maillard [75] conjectured that the discretized quantities $\sigma_{ti}$ and $p_i$ in the first equation above can be pictured as the Gauss quadrature abscissae and weights in conjunction with an integral of the following form:

\[ F(z) = \int\frac{p(\sigma_t)}{1 - z\sigma_t}\,d\sigma_t = \sum_{i=1}^{N}\frac{p_i}{1 - \sigma_{ti}z} + R_{2N} \tag{5.176} \]

As usual, determining the abscissae and weights of the Gauss quadrature of order $N$ requires knowledge of the moments up to order $2N-1$. This can be accomplished by using the Padé approximation for $F(z)$, whereby

\[ F(z) = \frac{a_0 + a_1z + a_2z^2 + \cdots + a_{N-1}z^{N-1}}{1 + b_1z + b_2z^2 + \cdots + b_Nz^N} + R_{2N} = \sum_{i=1}^{N}\frac{w_i}{1 - z/z_i} + R_{2N} \tag{5.177} \]

The coefficients in the numerator and denominator form a system of $2N$ linear equations, in which each equation is a linear combination of moments, if the function is matched with all moments from $n = 0$ to $n = 2N-1$. The solution for $\{a_i\}$ and $\{b_i\}$ provides an unambiguous basis for computing the weights $w_i$, the residues identifiable with $p_i$, and the abscissae $1/z_i$, obtained from the poles $z_i$ and identifiable with $\sigma_{ti}$.

The same rationale can be extended to the matching of the partial moments given by Eq. 5.175. It should be noted, however, that the determination of $\sigma_{xi}$, the conditional means, requires a total of only $N$ partial moments, in contrast to the $2N$ moments required for computing $p_i$ and $\sigma_{ti}$. Thus, there is an issue of uniqueness when the conditional means are determined this way.

5.7 Future Challenges

Given our vastly improved computational capability and tools for evaluating nuclear data, many of the topics discussed can be greatly enhanced.
Some of the future challenges in various areas, in the author's opinion, are listed as follows:

1. Nuclear Data and Cross-Section Representation
Continuous efforts to improve the resonance parameters for a wide range of nuclides are needed. More utilization of the self-indication ratio measurement at various temperatures as a means of data verification is needed, particularly in the unresolved energy range. It is the only viable, yet simple, means of verifying the computed self-shielding effect directly against measurements.
One cross-section representation not widely recognized in the West is the so-called combined method pioneered by Lukyanov and Sirakov [81]. The method certainly deserves more attention from the nuclear data communities around the globe.

2. Point-wise Doppler-Broadening of Cross Sections
One long-standing issue in Doppler-broadening of cross sections is the potential impact of crystalline binding effects not accounted for in the traditional approach based on the ideal gas model. In addition, such effects can affect the elastic scattering kernel currently in use, according to recent studies [82, 83]. Although other recent investigations were also made in this area [84], the fundamental issues remain unresolved. In particular, more in-depth theoretical, as well as experimental, studies of the solid-state aspects pertinent to the chemical binding of various lattices of interest need to be pursued.
To enhance the accuracy and efficiency of calculations at run time, one area of practical interest is the optimization of the point-wise cross section libraries at a given temperature currently used in many production codes.

3. Multigroup Cross-Section Processing
One obvious challenge is the improvement of the current collision probability treatment at the resonance level in order to accommodate complex reactor cells. For many realistic reactor lattices, a two-dimensional collision probability approach is apparently needed.
In the same context, improvement of the current methods for computing the global weighting spectrum for cross-section collapsing in various multigroup constant codes is also in order.

4. Improvement of Unresolved Resonance Treatment
A new analytical approach for computing probability tables via the integral transform techniques is currently under consideration as outlined in [85]. Further development is underway. One method that has not received much attention in the West is the characteristic function approach for treating averages involving the S-matrix pioneered by Lukyanov et al. [86, 87]. Some recent results have demonstrated its viability to compute the self-shielded cross sections of practical interest. It is yet another alternative method that is worth pursuing further. The method based on the concept of maximizing information entropy as applied to the treatment of S-matrix averages pioneered by Fröhner [88] has provided a new alternative quite different from all the traditional methods described here. The challenge is whether one can utilize such a concept in calculations of practical interest in reactor applications.

5. Book on Resonance Theory in Reactor Applications
In this author's opinion, what has been missing is a comprehensive book that provides an up-to-date reference basis for a large body of work in this area. There were two representative books of this kind in earlier days. One was authored by L. Dresner [6] in the early 1960s while the other by Lukyanov [10] was published in the mid 1970s. It is quite evident from the foregoing discussions that a new book in this area is needed.

References

1. Fermi E (1962) Collected papers of E. Fermi, vol 1. University of Chicago Press, Chicago, IL
2. Wigner EP (1955) Nuclear reactor issue. J Appl Phys 26:257
3. Wigner EP, Creutz JH, Snyder TJ (1955) J Appl Phys 26:260
4. Chernick J (1956) The theory of uranium-water lattices. In: Proceedings of the international conference on peaceful uses of atomic energy, New York, pp 603
5. Chernick J, Vernon R (1958) Nucl Sci Eng 4:649
6. Dresner L (1960) Resonance absorption in nuclear reactors. Pergamon, New York
7. Gourevitch II, Pomeranchouk IY (1955) Theory of resonance absorption in heterogeneous systems. First Geneva Conference on Peaceful Use of Atomic Energy, vol 5, pp 466
8. Marchuk GI (1964) Methods of reactor calculations. Atomizdat, Moscow (1961) (in Russian) (English trans: Consultant Bureau, New York, 1964)
9. Lukyanov AA (1978) Structures of neutron cross sections. Atomizdat, Moscow (in Russian)
10. Lukyanov AA (1974) Slowing-down and absorption of resonance neutrons. Atomizdat, Moscow (in Russian)
11. Breit G, Wigner EP (1936) Phys Rev 49:519
12. Hummel HH, Okrent D (1970) Reactivity coefficients in large fast power reactors. Monograph series on nuclear science and technology, American Nuclear Society
13. Rose PF, Dunford CL (eds) (1990) ENDF-102 Data formats and procedures for evaluated nuclear data file ENDF-6, BNL-NCS-44945. Brookhaven National Laboratory
14. Hwang RN (1992) R-Matrix parameters in reactor applications. In: Dunford CL (ed) Proceedings of the symposium on nuclear data evaluation methodology, Brookhaven National Laboratory, New York, pp 316–326
15. Lane M, Thomas RG (1958) Rev Mod Phys 30:257
16. Lynn JE (1968) The theory of neutron resonance reactions. Clarendon, Oxford
17. Fröhner FH (1978) Applied resonance theory, KFK-2669. Kernforschungszentrum Karlsruhe
18. Wigner EP, Eisenbud L (1947) Phys Rev 72:29
19. Kapur PL, Peierls R (1938) Proc R Soc Lond A 166:277
20. Adler DB, Adler FT (1963) Neutron cross sections in fissile elements. In: Proceedings of the conference on breeding, economics and safety in fast reactors, ANL-6792, 695, Argonne National Laboratory
21. Moldauer PA, Hwang RN, Garbow BS (1969) MATDIAG, a program for computing multilevel S-Matrix resonance parameters, ANL-7569, Argonne National Laboratory
22. deSaussure G, Perez RB (1969) POLLA, a Fortran program to convert R-Matrix-type multilevel resonance parameters for Fissile nuclei into equivalent Kapur–Peierls-type parameters. ORNL-2599, Oak Ridge National Laboratory
23. Reich CW, Moore MS (1958) Phys Rev 111:929
24. Hwang RN (1987) Nucl Sci Eng 96:192
25. Humblet J, Rosenfeld L (1961) Nucl Phys 26:529
26. Hwang RN (1992) Nucl Sci Eng 111:113
27. Solbrig AW Jr (1961) Nucl Sci Eng 10:167
28. Osborn RK, Yip S (1964) Foundations of neutron transport theory. Am Nucl Soc Monograph
29. Lamb WE (1939) Phys Rev 55:190
30. Egelstaff PA (1962) Nucl Sci Eng 12:250
31. Nelkin M, Parks DE (1960) Phys Rev 119:1060
32. Adkins CR (1966) Effects of chemical binding on the Doppler broadened cross section of Uranium in a UO2 lattice. Ph.D. thesis, Carnegie Institute of Technology
33. Dolling G, Cowley RA, Woods ADB (1965) The crystal dynamics of uranium dioxide. Can J Phys 43(8):1397
34. Hwang RN (1998) An analytical method for computing Doppler-broadening of cross sections, ANL-NT-69, Argonne National Laboratory
35. O'Shea DM, Thacher HC (1963) Computations of resonance line-shape functions. Trans Am Nucl Soc 6:36
36. Turing AM (1943) Proc Lond Math Soc 48:130
37. Henrici P (1963) Some applications of the quotient-difference algorithm. Proc Symp Appl Math 15:159
38. Goodwin ET (1949) Proc Cambridge Phil Soc 45:241
39. Bhat MR, Lee-Whiting GE (1967) A computer program to evaluate the complex probability integral. Nucl Instr Meth 47:277–280
40. Fröhner FH (1980) New techniques for multi-level cross section calculation and fittings. KFK-3081, Kernforschungszentrum Karlsruhe
41. Steen NM, Byrne GD, Gelbard EM (1969) Math Comp 23:107
42. Cullen DE (1977) Program SIGMA1 (Version 77-1): Doppler broaden evaluated cross sections in the evaluated nuclear data file/Version B (ENDF/B) Format, UCRL-50400, vol. 17, Part B
43. MacFarlane RE, Muir DW (1994) NJOY nuclear data processing system, Version 91. LA-12740-M, Los Alamos National Laboratory
44. Hwang RN (1998) Critical examinations of commonly used numerical methods for Doppler-broadening of cross section, ANL-NT-72, Argonne National Laboratory
45. Dunford CL, Bramblett ET (1966) Doppler broadening of resonance data for Monte Carlo calculations, MAI-CE-MEMO-21, Atomic International
46. Leal LC, Hwang RN (1987) A finite difference method for treating the Doppler-broadening of cross sections. Trans Am Nucl Soc 55:340
47. Weinberg AM, Wigner EP (1958) The physical theory of neutron chain reactors. University of Chicago Press, Chicago, IL
48. Placzek G (1946) Phys Rev 69:423
49. Corngold (1957) Slowing down of neutrons in infinite homogeneous media. Proc Phy Soc Lond A 70:793
50. Hwang RN (1965) Effects of the fluctuations in collision density on fast reactor Doppler effect calculations. In: Proceedings of the conference on safety, fuels, and core designs in large fast reactors, 449, ANL-7120
51. Goldstein R, Cohen ER (1962) Nucl Sci Eng 13:132
52. Nordheim LW (1961) A program of research and calculations of resonance absorption. GA-2527, General Atomic
53. Hwang RN (1973) Nucl Sci Eng 52:157
54. Henryson H, Toppel BJ, Stenberg CG (1976) MC2-2: a code to calculate fast-neutron spectra and multigroup cross sections. ANL-8144, Argonne National Laboratory
55. Stacey WM Jr (1972) Advances in continuous slowing-down theory. In: Proceedings of the National Topical Meeting on New Developments in Reactor Physics and Shielding, CONF-720901 Book 1, 143
56. Goertzel G, Greuling E (1960) Nucl Sci Eng 7:69
57. Kier PH, Robba AA (1967) RABBLE, a program for computation of resonance absorption in multiregion reactor cells, ANL-7326
58. Olson AP (1970) An integral transport-theory code for neutron slowing-down in slab cells, ANL-7645
59. Case KM, de Hoffmann F, Placzek G (1963) Introduction to the theory of neutron diffusion, vol 1. Los Alamos Scientific Laboratory
60. Dancoff SM, Ginsburg M (1944) Surface resonance absorption in a closely packed lattice, USAEC-Report cp-2157
61. Rothenstein W (1959) Collision probabilities and resonance integrals for lattices. BNL-563 (T-151), Brookhaven National Laboratory
62. Corngold NC (1957) Resonance escape probability in slab lattices. J Nucl Energy 4:293
63. Takahashi H (1960) First flight collision probability in lattice systems. J Nucl Energy A12:1
64. Bickley WC, Nayler J (1935) Phil Mag 20(Series 7):343
65. Bell GL (1959) Nucl Sci Eng 5:138
66. Levine MM (1963) Nucl Sci Eng 16:271
67. Hummel HH, Hwang RN, Phillips K (1965) Recent investigations of fast reactor reactivity coefficients. In: Proceedings of the conference on safety, fuels, and core design in large fast power reactors, ANL-7120, pp 413–420
68. Hwang RN, Toppel BJ (1979) Mathematical behavior probabilities for annular regions. In: Proceedings of the topical meeting on computational methods in nuclear engineering, vol 2. Williamsburg, VA, pp 7–85
69. Moldauer PA (1964) Phys Rev 135(3B):642–659; 136B:947
70. Ericson T (1963) Ann Phys 23:390–414
71. Porter CE (ed) (1965) Statistical theory of spectra: fluctuations. Academic, New York and London
72. Porter CE, Thomas RG (1956) Phys Rev 104:483
73. Wigner EP (1957) Ann Math 53:36 (1951); 62:548 (1955); 65:203 (1957)
74. Dyson FJ (1962) J Math Phys 3(1):140
75. Ribon P, Maillard JM (1986) Les tables de probabilité applications au traitement des sections efficaces pour la neutronique. Report CEA-N, NEACRP-L-294
76. Hwang RN (1965) Nucl Sci Eng 21:523
77. Hwang RN, Henryson H II (1975) Critical examination of low-order quadratures for statistical integrations. Trans Am Nucl Soc 22:712
78. Levitt LB (1971) The probability table method for treating unresolved resonances in Monte Carlo criticality calculations. Trans Am Nucl Soc 14:648
79. Nikolaev MN, et al. (1971) At Energy 29:11 (1970); 30:426 (1971)
80. Cullen DE (1974) Nucl Sci Eng 55:378
81. Lukyanov AA, Sirakov IA (1998) Combined method for resonance cross section parameterization. Bulg J Phys 25(3–4):112
82. Ouisloumen M, Sanchez R (2000) Nucl Sci Eng 107:189–200
83. Bouland O, Kolesov V, Rowlands JL (1994) The effects of approximations in the energy distributions of scattered neutrons on thermal reactor Doppler effects. International conference on nuclear data for science and technology, Gatlinburg, Tenn, pp 1006–1008
84. Gressier V, Naberejnev D, Mounier C (2000) Ann Nucl Energ 27:1115–1129
85. Hwang RN (1998) Recent developments pertinent to processing of ENDF/B 6 resonance cross section data. In: Proceedings of the international conference on nuclear science and technology, Long Island, New York, pp 1241–1250
86. Koyumdjieva N, Sovova N, Janeva N, Lukyanov AA (1989) Bulg J Phys 16(1):13
87. Alami MN, Janeva NB, Lukyanov AA (1991) Bulg J Phys 9:16
88. Fröhner FH (1987) Information theory applied to unresolved resonance. In: Proceedings of the international conference on nuclear data for basic and applied science, Santa Fe, N. M. (1985); also Recent progress in compound nuclear theory and calculation of resonance-averaged cross section, IAEA Meeting on Nuclear Model for Fast Neutron Nuclear Data Evaluation, Beijing, IAEA TEC-DOC Series

Dr. Richard N. Hwang was born in Canton, China on January 24, 1935 and came to the USA as a student in 1956. He joined the staff at the Argonne National Laboratory in 1962 after he received his B.S. in Mechanical Engineering and Ph.D. in Nuclear Physics from Iowa State University. He retired from Argonne National Laboratory as a senior physicist after a 45-year career.
His work there was focused on nuclear resonance theory and its nuclear reactor physics applications. His work included: generalization of the traditional resonance integral concept; development of accurate analytical and numerical methods for computing Doppler-broadening of cross sections; development of efficient methods for treating the generalized resonance integral; contribution to a better understanding of Placzek oscillations on resonance integrals; studies of the effect of crystalline binding on Doppler-broadening; original contributions to the theory of statistical treatment of self-shielding effects in the unresolved resonance range; development of efficient numerical algorithms for treating self-shielded cross sections in the resolved as well as the unresolved regions; an accurate method for computing transmission probabilities in annular cell geometry; development of statistical distributions for S-matrix parameters; development of a method for converting statistically generated R-matrix parameters to Kapur–Peierls type parameters; introduction of rigorous pole representation of cross sections; development of a method to extract the Humblet–Rosenfeld type parameters from rigorous pole parameters. He was responsible for the development of the MATDIAG code, resolved and unresolved resonance modules for the MC2-2 code, the WHOPPER code, the SUPERW code, and the IMPLIC code. Dr. Hwang was a member and fellow of the American Nuclear Society. He was the author of over 66 journal articles, conference papers, and reports. In 2004 he received the Eugene P. Wigner Reactor Physicist award for his work in nuclear resonance theory. The Wigner award, given by the American Nuclear Society for "outstanding achievements in the field of nuclear reactor physics," is the highest honor a reactor physicist can receive. He passed away on December 20, 2007.

Argonne National Laboratory
February, 2008
Composed by Dr. Roger Blomquist

Chapter 6
Sensitivity and Uncertainty Analysis of Models and Data

Dan Gabriel Cacuci

D.G. Cacuci, Institute for Nuclear Technology and Reactor Safety, Karlsruhe Institute of Technology, Germany; e-mail: Dan.Cacuci@KIT.Edu
In: Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_6

6.1 Introduction

This chapter highlights the characteristic features of statistical and deterministic methods currently used for sensitivity and uncertainty analysis of measurements and computational models. The symbiotic linchpin between the objectives of uncertainty analysis and those of sensitivity analysis is provided by the "propagation of errors" equations, which combine parameter uncertainties with the sensitivities of responses (i.e., results of measurements and/or computations) to these parameters. It is noted that all statistical uncertainty and sensitivity analysis methods commence with the "uncertainty analysis" stage, and only subsequently proceed to the "sensitivity analysis" stage. This procedural path is the reverse of the procedural (and conceptual) path underlying the deterministic methods of sensitivity and uncertainty analysis, where the sensitivities are determined prior to using them for uncertainty analysis. In particular, it is emphasized that the Adjoint Sensitivity Analysis Procedure (ASAP) is the most efficient method for computing exactly the local sensitivities for large-scale nonlinear problems comprising many parameters. This efficiency is underscored with illustrative examples. The computational resources required by the most popular statistical and deterministic methods are discussed comparatively. A brief discussion of unsolved fundamental problems, open for future research, concludes this chapter.

6.2 Sensitivities and Uncertainties in Measurements and Computational Models: Basic Concepts

In practice, scientists and engineers often face questions such as: How well does the model under consideration represent the underlying physical phenomena? What confidence can one have that the numerical results produced by the model are correct? How far can the calculated results be extrapolated? How can the predictability and/or extrapolation limits be extended and/or improved? Answers to such questions are provided by sensitivity and uncertainty analyses. As computer-assisted modeling and analyses of physical processes have continued to grow and diversify, sensitivity and uncertainty analyses have become indispensable investigative scientific tools in their own right.

Since computers operate on mathematical models of physical reality, computed results must be compared to experimental measurements whenever possible. Such comparisons, though, invariably reveal discrepancies between computed and measured results. The sources of such discrepancies are the inevitable errors and uncertainties in the experimental measurements and in the mathematical models. In practice, the exact forms of mathematical models and/or exact values of data are not known, so their mathematical form must be estimated. The use of observations to estimate the underlying features of models forms the objective of statistics. This branch of mathematical science embodies both inductive and deductive reasoning, encompassing procedures for estimating parameters from incomplete knowledge and for refining prior knowledge by consistently incorporating additional information. Thus, assessing and, subsequently, reducing uncertainties in models and data require the combined use of statistics together with the axiomatic, frequency, and Bayesian interpretations of probability.

As is well known, a mathematical model comprises independent variables, dependent variables, and relationships (e.g., equations, look-up tables, etc.)
between these quantities. Mathematical models also include parameters whose actual values are not known precisely, but may vary within some ranges that reflect our incomplete knowledge or uncertainty regarding them. Furthermore, the numerical methods needed to solve the various equations themselves introduce numerical errors. The effects of such errors and/or parameter variations must be quantified in order to assess the respective model’s range of validity. Moreover, the effects of uncertainties in the model’s parameters on the uncertainty in the calculated results must also be quantified. Generally speaking, the objective of sensitivity analysis is to quantify the effects of parameter variations on calculated results. On the other hand, the objective of uncertainty analysis is to assess the effects of parameter uncertainties on the uncertainties in calculated results. Sensitivity and uncertainty analyses can be considered as formal methods for evaluating data and models because they are associated with the computation of specific quantitative measures that allow, in particular, assessment of variability in output variables and importance of input variables. Models of complex physical systems usually involve two distinct sources of uncertainties: (i) stochastic uncertainty, which arises because the system under investigation can behave in many different ways, and (ii) subjective or epistemic uncertainty, which arises from the inability to specify an exact value for a parameter that is assumed to have a constant value in the investigation. Epistemic (or subjective) uncertainties characterize a degree of belief regarding the location of the appropriate value of each parameter. 
In turn, these subjective uncertainties lead to subjective uncertainties for the response, thus reflecting a corresponding degree of belief regarding the location of the appropriate response values as the outcome of analyzing the model under consideration. A typical example of a complex system that involves both stochastic and epistemic uncertainties is a nuclear reactor power plant: in a typical risk analysis of a nuclear power plant, stochastic uncertainty arises due to the hypothetical accident scenarios that are considered in the respective risk analysis, while epistemic uncertainties arise because of uncertain parameters that underlie the estimation of the probabilities and consequences of the respective hypothetical accident scenarios.

This section commences with a brief description of the main sources and features of errors and uncertainties associated with measurements. The fundamental concepts used for assessing the magnitude and effects of uncertainties for complex measurements and computational simulations of physical phenomena are also presented. The practical consequences of these fundamental concepts are embodied in the "propagation of errors (moments)" equations. As will be shown in this section, the "propagation of errors" equations provide a systematic way for assessing the uncertainties in the results of measurements and computations arising not only from uncertainties in the parameters that enter the computational model, but also from numerical approximations. The "propagation of errors" equations systematically and consistently combine the parameter errors with the sensitivities of responses (i.e., results of measurements and/or computations) to the respective parameters, thus providing the mathematical connection between the objectives of uncertainty analysis and those of sensitivity analysis.
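To make the combination of parameter errors and sensitivities concrete, the following minimal sketch applies the first-order (linearized) propagation formula, var(R) ≈ S V Sᵀ, which this section derives later as Eq. 6.10. Everything specific in it is an assumption made for illustration only: the response function `response`, the nominal values, the standard deviations, the correlation coefficient, and the use of central finite differences to estimate the sensitivities (the chapter itself advocates more efficient procedures for large problems).

```python
def finite_difference_sensitivities(f, alpha0, rel_step=1e-6):
    """Central-difference estimate of S_i = dR/d(alpha_i) at the nominal point alpha0."""
    sens = []
    for i, a in enumerate(alpha0):
        h = rel_step * max(abs(a), 1.0)
        up = list(alpha0)
        dn = list(alpha0)
        up[i] = a + h
        dn[i] = a - h
        sens.append((f(up) - f(dn)) / (2.0 * h))
    return sens


def sandwich_variance(sens, cov):
    """First-order response variance: sum over i, j of S_i * V_ij * S_j."""
    k = len(sens)
    return sum(sens[i] * cov[i][j] * sens[j] for i in range(k) for j in range(k))


def response(alpha):
    """Hypothetical response R = alpha_1 * alpha_2 (illustration only)."""
    return alpha[0] * alpha[1]


alpha0 = [2.0, 3.0]   # assumed nominal parameter values
sigma = [0.1, 0.2]    # assumed parameter standard deviations
rho = 0.5             # assumed correlation coefficient

# Parameter covariance matrix V built from sigma and rho.
cov = [
    [sigma[0] ** 2, rho * sigma[0] * sigma[1]],
    [rho * sigma[0] * sigma[1], sigma[1] ** 2],
]

S = finite_difference_sensitivities(response, alpha0)  # approximately [3.0, 2.0]
var_R = sandwich_variance(S, cov)                      # approximately 0.37
print(S, var_R)
```

For this product response the exact sensitivities are (3, 2), so the linearized variance is 9(0.01) + 4(0.04) + 2(3)(2)(0.5)(0.1)(0.2) = 0.37; the positive correlation accounts for 0.12 of that total.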
The efficient computation of sensitivities and, subsequently, uncertainties in results produced by various models (algebraic, differential, integral, etc.) will then form the objectives of subsequent sections in this chapter.

6.2.1 Measurement Uncertainties

A measurable quantity is a property of phenomena, bodies, or substances that can be defined qualitatively and expressed quantitatively. Measurable quantities are also called physical quantities. The term quantity is used both in a general sense, when referring to the general properties of objects (e.g., length, mass, temperature, electric resistance, etc.), and in a particular sense, when referring to the properties of a specific object (e.g., the length of a given rod, the electric resistance of a given segment of wire, etc.). Measurement is the process of finding the value of a physical quantity experimentally with the help of special devices called measuring instruments. The result of a measurement is a numerical value, together with a corresponding unit, for a physical quantity. The true value of a measurable quantity is the value of the measured physical quantity, which, if it were known, would ideally reflect, both qualitatively and quantitatively, the corresponding property of the object. The theory of measurement relies on the following postulates:

(a) True value of the measurable quantity exists.
(b) True value of the measurable quantity is constant (relative to the conditions of the measurement).
(c) True value cannot be found.

Since measuring instruments are imperfect, and since every measurement is an experimental procedure, the results of measurements cannot be absolutely accurate. This unavoidable imperfection of measurements is generally expressed as measurement inaccuracy, and is quantitatively characterized by measurement errors.
Thus, the result of any measurement always contains an error, which is reflected by the deviation of the result of measurement from the true value of the measurable quantity. Knowledge of measurement errors would allow statements about measurement accuracy, which is the most important quality of a measurement: the smaller the underlying measurement errors, the more accurate is the respective measurement. However, since the true value of a measurable quantity is always unknown, the errors of measurements must be estimated theoretically, by computations, using a variety of methods, each with its own degree of accuracy. The basic sources of measurement errors are errors arising from the method of measurement, errors due to the measuring instrument, and personal errors committed by the person performing the experiment. These errors are assumed to be additive. Methodological errors are caused by unavoidable discrepancies between the actual quantity to be measured and its model used in the measurement. Most commonly, such discrepancies arise from inadequate theoretical knowledge of the phenomena on which the measurement is based, and also from inaccurate and/or incomplete relations employed to find an estimate of the measurable quantity. Instrumental measurement errors are caused by imperfections of measuring instruments. Finally, the individual characteristics of the person performing the measurement may give rise to personal errors. If the results of separate measurements of the same quantity differ from one another, and the respective differences cannot be predicted individually, then the error owing to this scatter of the results is called random error. Random errors can be identified by repeatedly measuring the same quantity under the same conditions. On the other hand, a measurement error is called systematic if it remains constant or changes in a regular fashion when the measurements of that quantity are repeated. 
Systematic errors can be discovered experimentally either by using a more accurate measuring instrument or by comparing a given result with a measurement of the same quantity, but performed by a different method. In addition, systematic errors are estimated by theoretical analysis of the measurement conditions, based on the known properties of the measuring instruments and the quantity being measured. Although the estimated systematic error can be reduced by introducing corrections, it is impossible to eliminate all systematic errors completely from experiments. Ultimately, a residual error will always remain, which will then constitute the systematic component of the measurement error. The smallest of the measurement errors are customarily referred to as elementary errors (of a measurement), and are defined as those components of the overall measurement error that are associated with a single source of inaccuracy for the respective measurement. The total measurement error is calculated, in turn, by using the estimates of the component elementary errors. Even though it is sometimes possible to partially correct certain elementary errors (e.g., systematic ones), no amount or combination of corrections can produce an absolutely accurate measurement. In particular, the corrections themselves cannot be absolutely accurate, and, even after they are implemented, there remain residuals of the corresponding errors, which cannot be eliminated and which later assume the role of elementary errors. Since a measurement error can only be calculated indirectly, based on models and experimental data, it is important to identify and classify the underlying elementary errors. This identification and classification is subsequently used to develop mathematical models for the respective elementary errors.
Finally, the resulting (overall) measurement error is obtained by synthesizing the mathematical models of the underlying elementary errors. In the course of developing mathematical models for elementary errors, it has become customary to distinguish four types of elementary errors: absolutely constant errors, conditionally constant errors, purely random errors, and quasi-random errors.

Absolutely constant errors are defined as elementary errors that remain the same (i.e., are constant) in repeated measurements performed under the same conditions, for all measuring instruments of the same type. Absolutely constant errors have definite limits, but their actual values within these limits are unknown. For example, an absolutely constant error arises from inaccuracies in the formula used to determine the quantity being measured, once the limits of the respective inaccuracies have been established. Typical situations of this kind arise in indirect measurements of quantities determined by linearized or truncated simplifications of nonlinear formulas (e.g., analog/digital instruments where the effects of electromotive forces are linearized). Based on their properties, absolutely constant elementary errors are purely systematic errors, since each such error has a constant value in every measurement, but this constant is nevertheless unknown. Only the limits of these errors are known. Therefore, absolutely constant errors are modeled mathematically by a determinate (as opposed to random) quantity, whose magnitude lies within an interval of known limits.

Conditionally constant errors are, by definition, elementary errors that have definite limits (just like the absolutely constant errors) but (as opposed to the absolutely constant errors) such errors can vary within their limits due to both the non-repeatability and the non-reproducibility of the results.
A typical example of such an error is the measurement error due to the intrinsic error of the measuring instrument, which can vary randomly between fixed limits. Usually, the conditionally constant error is mathematically modeled by a random quantity with a uniform probability distribution within prescribed limits. This mathematical model is chosen because the uniform distribution has the highest uncertainty (in the sense of information theory) among distributions with fixed limits. Note, in this regard, that the round-off error also has known limits, and this error has traditionally been regarded in mathematics as a random quantity with a uniform probability distribution.

Purely random errors appear in measurements due to noise or other random errors produced by the measuring device. The form of the distribution function for random errors can, in principle, be found using data from each multiple measurement. In practice, however, the number of measurements performed in each experiment is insufficient for determining the actual form of the distribution function. Therefore, a purely random error is usually modeled mathematically by using a normal distribution characterized by a standard deviation that is computed from the experimental data.

Quasi-random errors occur when measuring a quantity defined as the average of nonrandom quantities that differ from one another such that their aggregate behavior can be regarded as a collection of random quantities. In contrast to the case of purely random errors, though, the parameters of the probability distribution for quasi-random errors cannot be unequivocally determined from experimental data. Therefore, a quasi-random error is modeled by a probability distribution with parameters (e.g., standard deviation) determined by expert opinion.
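The information-theoretic argument for modeling conditionally constant errors by a uniform distribution can be checked numerically. The sketch below compares the differential entropy of a uniform density on fixed limits [-a, a] with that of one alternative density on the same limits. The symmetric triangular density, the limit a = 1, and the function names are illustrative assumptions, not part of the original discussion; the closed-form entropies are the standard results ln(2a) and 1/2 + ln(a), and a simple midpoint-rule integration is included as a cross-check.

```python
import math


def entropy_uniform(a):
    """Differential entropy (in nats) of the uniform density on [-a, a]."""
    return math.log(2.0 * a)


def entropy_triangular(a):
    """Differential entropy (in nats) of the symmetric triangular density on [-a, a]."""
    return 0.5 + math.log(a)


def entropy_numeric(pdf, a, n=20000):
    """Midpoint-rule estimate of the integral of -p(x) ln p(x) over [-a, a]."""
    h = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        p = pdf(x)
        if p > 0.0:
            total -= p * math.log(p) * h
    return total


a = 1.0  # assumed error limit (hypothetical value)


def uniform_pdf(x):
    return 1.0 / (2.0 * a)


def triangular_pdf(x):
    return (a - abs(x)) / a ** 2


h_uniform = entropy_uniform(a)        # ln 2
h_triangular = entropy_triangular(a)  # 0.5
print(h_uniform, h_triangular)
print(entropy_numeric(uniform_pdf, a), entropy_numeric(triangular_pdf, a))
```

With identical limits, the uniform density carries the larger entropy, that is, it encodes the least additional information about where the error lies within its limits. This comparison covers only one alternative density and is not a proof of the general statement.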
It is very difficult to identify systematic errors; for example, variable systematic errors can be identified by using statistical methods, correlation, and regression analysis. Systematic errors can also be identified by measuring the same quantity using two different instruments (methods) or by measuring periodically a known (instead of an unknown) quantity. If a systematic error has been identified, then it can usually be estimated and eliminated. However, making rational estimates of the magnitude of the residual systematic errors and, in particular, assigning consistent levels of confidence to these residual errors is an extremely difficult task. In practice, therefore, residual systematic errors are assumed to follow a continuous uniform distribution, within ranges that are conservatively estimated based on experience and expert judgment. When the random character of the observational results is caused by measurement errors, the respective observations are assumed to have a normal distribution. This assumption rests on two premises: (i) since measurement errors consist of many components, the central limit theorem implies a normal distribution for such errors; and (ii) measurements are performed under controlled conditions, so that the distribution function of their error is actually bounded. Hence, approximating a bounded distribution by a normal distribution (for which the random quantity can take any real value) is a conservative procedure since such an approximation leads to larger confidence intervals than would be obtained if the true bounded distribution were known. Nevertheless, the hypothesis that the distribution of the observations is normal must be verified, since the measured results do not always correspond to a normal distribution. For example, when the measured quantity is an average value, the distribution of the observations can have any form. 
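The statement that a normal approximation is conservative can be illustrated with one simple case. The sketch below assumes, purely for illustration, that the true bounded error distribution is uniform on [-a, a] and that the approximating normal distribution is given the same standard deviation, a/√3; it then compares the half-widths of the central confidence intervals at the 95% and 99% levels. The function names and the choice a = 1 are hypothetical; `statistics.NormalDist` is part of the Python standard library (version 3.8 and later).

```python
import math
from statistics import NormalDist


def uniform_half_width(a, level):
    """Half-width of the central interval of Uniform(-a, a) holding probability `level`."""
    return level * a


def normal_half_width(sigma, level):
    """Half-width of the central interval of Normal(0, sigma) holding probability `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * sigma


a = 1.0                     # assumed limit of the bounded error
sigma = a / math.sqrt(3.0)  # standard deviation of Uniform(-a, a)

for level in (0.95, 0.99):
    u = uniform_half_width(a, level)
    g = normal_half_width(sigma, level)
    print(f"level={level:.2f}  uniform: +/-{u:.3f}  normal: +/-{g:.3f}")
```

At both levels the normal interval is wider than the interval of the bounded distribution, and at the 99% level it even extends beyond the physical limit a. The comparison is specific to these high confidence levels and to the matched-variance assumption made here.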
An indirect measurement is a measurement in which the value of the unknown quantity is calculated by using matched measurements of other quantities, called measured arguments or, briefly, arguments, which are related through a known relation to the measured quantity. In an indirect measurement, the true but unknown value of the measured quantity or response, denoted by R, is related to the true but unknown values of arguments, denoted as .˛1 ; : : : ; ˛k /, by a known relationship (i.e., function) f . This relationship is called the measurement equation, and can be generally represented in the form R D f .˛1 ; : : : ; ˛k /: (6.1) 6 Sensitivity and Uncertainty Analysis of Models and Data 297 In practice, only the nominal parameter values .˛10 ; : : : ; ˛k0 / are known together with their uncertainties or errors .ı˛1 ; : : : ; ı˛k /. The nominal parameter values are given by their respective expectations, while the associated errors and/or uncertainties are given by their respective standard deviations. Processing the experimental data obtained in an indirect measurement is performed with the same objectives as for direct measurements, namely to calculate the expected value E.R/, of the measured response R, and to calculate the error and/or uncertainty, including confidence intervals, associated with E.R/. The higher-order moments of the distribution of R are also of interest, if they can be calculated. It is very important to note here that the measurement equation, Eq. 6.1, can be interpreted to represent not only results of indirect measurements, but also results of computations. 
In this interpretation, $(\alpha_1,\ldots,\alpha_k)$ are considered to be the parameters underlying the respective computation, $R$ is considered to represent the result or response of the computation, while $f$ represents not only the explicit relationships between parameters and response, but also implicitly the relationships among the parameters and the independent and dependent variables comprising the respective mathematical model.

6.2.2 Propagation of Errors (Moments)

The true, but unknown, parameter values $(\alpha_1,\ldots,\alpha_k)$ can be expressed in vector form as $\boldsymbol{\alpha} = \boldsymbol{\alpha}^0 + \delta\boldsymbol{\alpha} = (\alpha_1^0+\delta\alpha_1,\ldots,\alpha_k^0+\delta\alpha_k)$. As noted in Eq. 6.1, the response is related to the parameters via the measurement equation or computational model. Traditionally, Eq. 6.1 is written in the simpler functional form

$$R = R(\alpha_1,\ldots,\alpha_k) = R(\alpha_1^0+\delta\alpha_1,\ldots,\alpha_k^0+\delta\alpha_k). \tag{6.2}$$

In the functional relation above, $R$ is used in the dual role of both a random function and the numerical realization of this function, which is consistent with the notation used for random variables and functions. Expanding $R(\alpha_1^0+\delta\alpha_1,\ldots,\alpha_k^0+\delta\alpha_k)$ in a Taylor series around the nominal values $\boldsymbol{\alpha}^0 = (\alpha_1^0,\ldots,\alpha_k^0)$ and retaining only the terms up to the $n$th order in the variations $\delta\alpha_i \equiv \alpha_i - \alpha_i^0$ around $\alpha_i^0$ gives:

$$
\begin{aligned}
R(\alpha_1,\ldots,\alpha_k) &\equiv R(\alpha_1^0+\delta\alpha_1,\ldots,\alpha_k^0+\delta\alpha_k) \\
&= R(\boldsymbol{\alpha}^0) + \sum_{i_1=1}^{k}\left(\frac{\partial R}{\partial\alpha_{i_1}}\right)_{\boldsymbol{\alpha}^0}\delta\alpha_{i_1}
+ \frac{1}{2}\sum_{i_1,i_2=1}^{k}\left(\frac{\partial^2 R}{\partial\alpha_{i_1}\partial\alpha_{i_2}}\right)_{\boldsymbol{\alpha}^0}\delta\alpha_{i_1}\delta\alpha_{i_2} \\
&\quad + \frac{1}{3!}\sum_{i_1,i_2,i_3=1}^{k}\left(\frac{\partial^3 R}{\partial\alpha_{i_1}\partial\alpha_{i_2}\partial\alpha_{i_3}}\right)_{\boldsymbol{\alpha}^0}\delta\alpha_{i_1}\delta\alpha_{i_2}\delta\alpha_{i_3} + \cdots \\
&\quad + \frac{1}{n!}\sum_{i_1,i_2,\ldots,i_n=1}^{k}\left(\frac{\partial^n R}{\partial\alpha_{i_1}\partial\alpha_{i_2}\cdots\partial\alpha_{i_n}}\right)_{\boldsymbol{\alpha}^0}\delta\alpha_{i_1}\cdots\delta\alpha_{i_n}.
\end{aligned}
\tag{6.3}
$$

Using the above Taylor-series expansion, the various moments of the random variable $R(\alpha_1,\ldots,\alpha_k)$, namely its mean, variance, skewness, and kurtosis, can be computed by considering that the system parameters $(\alpha_1,\ldots,\alpha_k)$ are random variables distributed according to a joint probability density function $p(\alpha_1,\ldots,\alpha_k)$, with mean values, variances, and covariances defined as:

$$E(\alpha_i) \equiv \alpha_i^0, \tag{6.4}$$

$$\mathrm{var}(\alpha_i,\alpha_i) \equiv \sigma_i^2 \equiv \int_{S_\alpha}(\alpha_i-\alpha_i^0)^2\, p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k, \tag{6.5}$$

$$\mathrm{cov}(\alpha_i,\alpha_j) \equiv \int_{S_\alpha}(\alpha_i-\alpha_i^0)(\alpha_j-\alpha_j^0)\, p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k. \tag{6.6}$$

Even when the joint probability density function $p(\alpha_1,\ldots,\alpha_k)$ is unknown, as is often the case in practice, the means, variances, and covariances of the system parameters $(\alpha_1,\ldots,\alpha_k)$ can be "propagated" to compute the various moments of the response $R(\alpha_1,\ldots,\alpha_k)$. This procedure is called the method of propagation of errors or propagation of moments, and the resulting equations for the various moments of $R(\alpha_1,\ldots,\alpha_k)$ are called the moment propagation equations.

For large complex systems, with many parameters, it is impractical to consider the nonlinear terms in Eq. 6.3. In such cases, the response $R(\alpha_1,\ldots,\alpha_k)$ becomes a linear function of the parameters $(\alpha_1,\ldots,\alpha_k)$ of the form

$$R(\alpha_1,\ldots,\alpha_k) = R(\boldsymbol{\alpha}^0) + \sum_{i=1}^{k}\left(\frac{\partial R}{\partial\alpha_i}\right)_{\boldsymbol{\alpha}^0}\delta\alpha_i = R^0 + \sum_{i=1}^{k} S_i\,\delta\alpha_i, \tag{6.7}$$

where $R^0 \equiv R(\boldsymbol{\alpha}^0)$, and where the quantity $S_i \equiv (\partial R/\partial\alpha_i)_{\boldsymbol{\alpha}^0}$ denotes the sensitivity of the response $R(\alpha_1,\ldots,\alpha_k)$ to the parameter $\alpha_i$. The mean value of $R(\alpha_1,\ldots,\alpha_k)$ is obtained from Eq. 6.7 as

$$
\begin{aligned}
E(R) &= \int_{S_\alpha}\left(\sum_{i=1}^{k} S_i\,\delta\alpha_i\right) p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k + R^0 \\
&= \sum_{i=1}^{k} S_i\int_{S_\alpha}(\alpha_i-\alpha_i^0)\, p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k + R^0 \\
&= R^0,
\end{aligned}
\tag{6.8}
$$

where the last step follows because each integral vanishes by Eq. 6.4.

The various moments of $R(\alpha_1,\ldots,\alpha_k)$ can be calculated by using Eqs. 6.7 and 6.8; thus, the $l$th central moment $\mu_l(R)$ of $R(\alpha_1,\ldots,\alpha_k)$ is obtained as the following $k$-fold integral over the domain $S_\alpha$ of the parameters $\boldsymbol{\alpha}$:

$$\mu_l(R) \equiv E\left\{[R-E(R)]^l\right\} = \int_{S_\alpha}\left(\sum_{i=1}^{k} S_i\,\delta\alpha_i\right)^{l} p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k. \tag{6.9}$$

The variance of $R(\alpha_1,\ldots,\alpha_k)$ is calculated by setting $l = 2$ in Eq. 6.9 and by using the result obtained in Eq. 6.8; the detailed calculations are as follows:

$$
\begin{aligned}
\mu_2(R) \equiv \mathrm{var}(R) &\equiv E\left[(R-R^0)^2\right]
= \int_{S_\alpha}\left(\sum_{i=1}^{k} S_i\,\delta\alpha_i\right)^{2} p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k \\
&= \sum_{i=1}^{k} S_i^2\int_{S_\alpha}(\delta\alpha_i)^2\, p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k \\
&\quad + 2\sum_{1\le i<j\le k} S_i S_j\int_{S_\alpha}(\delta\alpha_i)(\delta\alpha_j)\, p(\alpha_1,\ldots,\alpha_k)\, d\alpha_1\, d\alpha_2\cdots d\alpha_k \\
&= \sum_{i=1}^{k} S_i^2\,\mathrm{var}(\alpha_i) + 2\sum_{1\le i<j\le k} S_i S_j\,\mathrm{cov}(\alpha_i,\alpha_j).
\end{aligned}
$$

The result obtained in the above equation can be written in matrix form as

$$\mathrm{var}(R) = \mathbf{S}\mathbf{V}_\alpha\mathbf{S}^T, \tag{6.10}$$

where the superscript "$T$" denotes transposition, and $\mathbf{V}_\alpha$ denotes the covariance matrix for the parameters $(\alpha_1,\ldots,\alpha_k)$, with elements defined as

$$
(\mathbf{V}_\alpha)_{ij} =
\begin{cases}
\mathrm{cov}(\alpha_i,\alpha_j) = \rho_{ij}\sigma_i\sigma_j, & i\ne j \quad (\rho_{ij}\equiv \text{correlation coefficient}),\\
\mathrm{var}(\alpha_i) = \sigma_i^2, & i = j,
\end{cases}
$$

and the row vector $\mathbf{S} = (S_1,\ldots,S_k)$, with components $S_i = (\partial R/\partial\alpha_i)_{\boldsymbol{\alpha}^0}$, denotes the sensitivity vector. Equation 6.10 is colloquially known as the sandwich rule. If the system parameters are uncorrelated, Eq. 6.10 takes on the simpler form

$$\mathrm{var}(R) = \sum_{i=1}^{k} S_i^2\,\mathrm{var}(\alpha_i) = \sum_{i=1}^{k} S_i^2\sigma_i^2. \tag{6.11}$$

The above concepts can be readily extended from a single response to $n$ responses that are functions of the parameters $(\alpha_1,\ldots,\alpha_k)$. In vector notation, the $n$ responses are represented as the column vector

$$\mathbf{R} = (R_1,\ldots,R_n). \tag{6.12}$$

In this case, the vector-form equivalent of Eq.
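6.7 is the following linear, first-order Taylor expansion of $\mathbf{R}(\boldsymbol{\alpha})$:

$$\mathbf{R}(\boldsymbol{\alpha}^0+\delta\boldsymbol{\alpha}) = \mathbf{R}(\boldsymbol{\alpha}^0) + \delta\mathbf{R} \cong \mathbf{R}(\boldsymbol{\alpha}^0) + \mathbf{S}\,\delta\boldsymbol{\alpha}, \tag{6.13}$$

where $\mathbf{S}$ is a rectangular matrix of order $n\times k$ with elements representing the sensitivity of the $j$th response to the $i$th system parameter, namely

$$(\mathbf{S})_{ji} = \partial R_j/\partial\alpha_i. \tag{6.14}$$

The expectation $E(\mathbf{R})$ of $\mathbf{R}$ is obtained by following the same procedure as that leading to Eq. 6.8, which gives

$$E(\mathbf{R}) = \mathbf{R}^0. \tag{6.15}$$

The covariance matrix $\mathbf{V}_R$ for $\mathbf{R}$ is obtained by following the same procedure as that leading to Eq. 6.10; this yields

$$\mathbf{V}_R = E\left[\mathbf{S}\,\delta\boldsymbol{\alpha}\,(\mathbf{S}\,\delta\boldsymbol{\alpha})^T\right] = \mathbf{S}\,E\left(\delta\boldsymbol{\alpha}\,\delta\boldsymbol{\alpha}^T\right)\mathbf{S}^T = \mathbf{S}\mathbf{V}_\alpha\mathbf{S}^T, \tag{6.16}$$

where the superscript "$T$" denotes transposition. Note that Eq. 6.16 has the same "sandwich" form as Eq. 6.10 for a single response.

The equations for the propagation of higher-order moments become increasingly complex and are seldom used in practice. For example, for a single response $R(\alpha_1,\ldots,\alpha_k)$ and uncorrelated parameters $(\alpha_1,\ldots,\alpha_k)$, the respective propagation of moments equations can be obtained from Eq. 6.3, after a considerable amount of algebra, as follows:

$$
\begin{aligned}
E(R) &= R(\alpha_1^0,\ldots,\alpha_k^0) + \frac{1}{2}\sum_{i=1}^{k}\left\{\frac{\partial^2 R}{\partial\alpha_i^2}\right\}_{\boldsymbol{\alpha}^0}\mu_2(\alpha_i)
+ \frac{1}{6}\sum_{i=1}^{k}\left\{\frac{\partial^3 R}{\partial\alpha_i^3}\right\}_{\boldsymbol{\alpha}^0}\mu_3(\alpha_i) \\
&\quad + \frac{1}{24}\sum_{i=1}^{k}\left\{\frac{\partial^4 R}{\partial\alpha_i^4}\right\}_{\boldsymbol{\alpha}^0}\mu_4(\alpha_i)
+ \frac{1}{24}\sum_{i=1}^{k-1}\sum_{j=i+1}^{k}\left\{\frac{\partial^4 R}{\partial\alpha_i^2\,\partial\alpha_j^2}\right\}_{\boldsymbol{\alpha}^0}\mu_2(\alpha_i)\,\mu_2(\alpha_j);
\end{aligned}
\tag{6.17}
$$

$$
\begin{aligned}
\mu_2(R) &= \sum_{i=1}^{k}\left\{\left(\frac{\partial R}{\partial\alpha_i}\right)^{2}\right\}_{\boldsymbol{\alpha}^0}\mu_2(\alpha_i)
+ \sum_{i=1}^{k}\left\{\frac{\partial R}{\partial\alpha_i}\frac{\partial^2 R}{\partial\alpha_i^2}\right\}_{\boldsymbol{\alpha}^0}\mu_3(\alpha_i)
+ \frac{1}{3}\sum_{i=1}^{k}\left\{\frac{\partial R}{\partial\alpha_i}\frac{\partial^3 R}{\partial\alpha_i^3}\right\}_{\boldsymbol{\alpha}^0}\mu_4(\alpha_i) \\
&\quad + \frac{1}{4}\sum_{i=1}^{k}\left\{\left(\frac{\partial^2 R}{\partial\alpha_i^2}\right)^{2}\right\}_{\boldsymbol{\alpha}^0}\left[\mu_4(\alpha_i) - \left(\mu_2(\alpha_i)\right)^2\right];
\end{aligned}
\tag{6.18}
$$

$$
\mu_3(R) = \sum_{i=1}^{k}\left\{\left(\frac{\partial R}{\partial\alpha_i}\right)^{3}\right\}_{\boldsymbol{\alpha}^0}\mu_3(\alpha_i)
+ \frac{3}{2}\sum_{i=1}^{k}\left\{\left(\frac{\partial R}{\partial\alpha_i}\right)^{2}\frac{\partial^2 R}{\partial\alpha_i^2}\right\}_{\boldsymbol{\alpha}^0}\left[\mu_4(\alpha_i) - \left(\mu_2(\alpha_i)\right)^2\right];
\tag{6.19}
$$

$$
\mu_4(R) = \sum_{i=1}^{k}\left\{\left(\frac{\partial R}{\partial\alpha_i}\right)^{4}\right\}_{\boldsymbol{\alpha}^0}\left[\mu_4(\alpha_i) - 3\left(\mu_2(\alpha_i)\right)^2\right] + 3\left[\mu_2(R)\right]^2.
\tag{6.20}
$$

In Eqs. 6.17–6.20, the quantities $\mu_l(R)$, $(l = 1,\ldots,4)$, denote the central moments of the response $R(\alpha_1,\ldots,\alpha_k)$, while the quantities $\mu_l(\alpha_i)$, $(i = 1,\ldots,k;\ l = 1,\ldots,4)$, denote the respective central moments of the parameters $(\alpha_1,\ldots,\alpha_k)$.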
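As a rough numerical illustration of the sandwich rule of Eqs. 6.10 and 6.16, the sketch below estimates the sensitivity matrix by central finite differences and then forms $\mathbf{S}\mathbf{V}_\alpha\mathbf{S}^T$. The two-response model, the nominal values, the standard deviations, the correlation coefficient, and all function names are assumptions made up for this example; they are not taken from the chapter, and finite differences merely stand in for whatever sensitivity method is actually available.

```python
import numpy as np

def finite_difference_sensitivities(model, alpha0, rel_step=1e-6):
    """Approximate S[j, i] = dR_j/d(alpha_i) at alpha0 by central differences."""
    alpha0 = np.asarray(alpha0, dtype=float)
    n_resp = np.atleast_1d(model(alpha0)).size
    S = np.zeros((n_resp, alpha0.size))
    for i in range(alpha0.size):
        h = rel_step * max(abs(alpha0[i]), 1.0)
        up, down = alpha0.copy(), alpha0.copy()
        up[i] += h
        down[i] -= h
        S[:, i] = (np.atleast_1d(model(up)) - np.atleast_1d(model(down))) / (2.0 * h)
    return S

def propagate_covariance(S, V_alpha):
    """First-order ("sandwich") propagation: V_R = S V_alpha S^T."""
    return S @ V_alpha @ S.T

def example_model(alpha):
    """Hypothetical model with two responses of two parameters."""
    return np.array([alpha[0] * alpha[1], alpha[0] + 2.0 * alpha[1]])

alpha0 = np.array([2.0, 5.0])   # assumed nominal parameter values
sigma = np.array([0.1, 0.2])    # assumed parameter standard deviations
rho = 0.3                       # assumed correlation coefficient
V_alpha = np.array([[sigma[0] ** 2, rho * sigma[0] * sigma[1]],
                    [rho * sigma[0] * sigma[1], sigma[1] ** 2]])

S = finite_difference_sensitivities(example_model, alpha0)
V_R = propagate_covariance(S, V_alpha)
print("S =\n", S)
print("V_R =\n", V_R)
```

The diagonal of `V_R` holds the first-order response variances and the off-diagonal entries hold the response covariances induced by the shared parameters; for a strongly nonlinear model these values are only as good as the linearization of Eq. 6.13.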
Note that $E(R) \ne R^0$ when the response $R(\alpha_1,\ldots,\alpha_k)$ is a nonlinear function of the parameters $(\alpha_1,\ldots,\alpha_k)$. As has been already mentioned, Eqs. 6.17–6.20 are valid for uncorrelated parameters only.

It is important to emphasize that the "propagation of moments" equations are used not only for processing experimental data obtained from indirect measurements, but also for performing uncertainty analysis of computational models. In the latter case, the "propagation of errors" equations provide a systematic way of obtaining the uncertainties in computed results, arising not only from uncertainties in the parameters that enter the respective computational model, but also from the numerical approximations themselves. The efficient computation of sensitivities and, subsequently, of uncertainties in results produced by various models (algebraic, differential, integral, etc.) is the objective of subsequent sections in this chapter.

As a simple illustrative example of using Eq. 6.10, consider the computation of the standard deviation $\sigma_{R_1}$ of a response $R_1 \equiv \alpha_1\alpha_2$, where $\alpha_1$ and $\alpha_2$ are two correlated parameters, with variances $\sigma_1^2$, $\sigma_2^2$, and covariance $V_{12}$. The sensitivities of $R_1$ to $\alpha_1$ and $\alpha_2$ can be readily calculated as

$$S_1 = \left.\frac{\partial R_1}{\partial\alpha_1}\right|_{\boldsymbol{\alpha}^0} = \alpha_2^0, \qquad S_2 = \left.\frac{\partial R_1}{\partial\alpha_2}\right|_{\boldsymbol{\alpha}^0} = \alpha_1^0. \tag{6.21}$$

Substituting the above sensitivities in Eq. 6.10 yields the following result for the relative standard deviation of $R_1$, where $R_1^0 = \alpha_1^0\alpha_2^0$:

$$\frac{\sigma_{R_1}}{R_1^0} = \left[\frac{\sigma_1^2}{(\alpha_1^0)^2} + \frac{\sigma_2^2}{(\alpha_2^0)^2} + 2\,\frac{V_{12}}{\alpha_1^0\alpha_2^0}\right]^{1/2}. \tag{6.22}$$

6.3 Statistical Methods for Sensitivity and Uncertainty Analysis

There are many methods, based either on deterministic or statistical concepts, for performing sensitivity and uncertainty analysis. However, despite this variety of methods, or perhaps because of it, a precise, unified terminology, across all methods, does not seem to exist yet, even though many of the same words are used by the practitioners of the various methods.
For example, even the word “sensitivity” as used by analysts employing statistical methods may not necessarily mean or refer to the same quantity as would be described by the same word, “sensitivity,” when used by analysts employing deterministic methods. Care must therefore be exercised, since identical words may not necessarily describe identical quantities, particularly when comparing deterministic to statistical methods. Furthermore, conflicting and contradictory claims are often made about the relative strengths and weaknesses of the various methods. Sensitivity and uncertainty analysis procedures can be either local or global in scope. The objective of local analysis is to analyze the behavior of the system response locally around a chosen point (for static systems) or chosen trajectory (for dynamical systems) in the combined phase space of parameters and state variables. On the other hand, the objective of global analysis is to determine all of the system’s critical points (bifurcations, turning points, response maxima, minima, and/or saddle points) in the combined phase space formed by the parameters and dependent (state) variables, and subsequently analyze these critical points by local sensitivity and uncertainty analysis. The methods for sensitivity and uncertainty analysis are based on either statistical or deterministic procedures. In principle, both types of procedures can be used for either local or for global sensitivity and uncertainty analysis, although, in practice, deterministic methods are used mostly for local analysis while statistical methods are used for both local and global analysis. This section, 6.3, highlights the salient features of the most popular statistical procedures currently used for local and global sensitivity and uncertainty analysis. 
These statistical procedures can be classified as follows: sampling-based methods (random sampling, stratified importance sampling, and Latin Hypercube sampling), first- and second-order reliability methods (FORM and SORM, respectively), variance-based methods (correlation ratio-based methods, the Fourier amplitude sensitivity test (FAST), and Sobol's method), and screening design methods (classical one-at-a-time (OAT) experiments, global one-at-a-time design methods, systematic fractional replicate designs (SFRD), and sequential bifurcation (SB) designs). It is important to note that all statistical uncertainty and sensitivity analysis methods commence with the "uncertainty analysis" stage, and only subsequently proceed to the "sensitivity analysis" stage. This procedural path is the reverse of the procedural (and conceptual) path underlying the deterministic methods of sensitivity and uncertainty analysis, where the sensitivities are determined prior to using them for uncertainty analysis.

6.3.1 Sampling-Based Methods

If the uncertainty associated with the parameters $\boldsymbol{\alpha}$ were known unambiguously, then the uncertainty in the response $R(\boldsymbol{\alpha})$ could also be assessed unambiguously. In practice, however, the uncertainty in $\boldsymbol{\alpha}$ can rarely be specified unambiguously; most often, many possible values of $\boldsymbol{\alpha}$, of varying levels of plausibility, could be considered. Such uncertainties can be characterized by assigning distributions of plausible values

$$D_1, D_2, \ldots, D_I, \tag{6.23}$$

one to each component $\alpha_i(x)$ of $\boldsymbol{\alpha}$. Correlations and other restrictions can also be considered to affect the parameters $\alpha_i(x)$. Uncertainties characterized by distributions of the form (6.23) are called epistemic or subjective uncertainties, and characterize a degree of belief regarding the location of the appropriate value of each $\alpha_i(x)$.
In turn, these subjective uncertainties for the parameters $\alpha_i(x)$ lead to subjective uncertainties for the response $R(\boldsymbol{\alpha})$, which reflect a corresponding degree of belief regarding the location of the appropriate response values as the outcome of analyzing the model under consideration. Sampling-based methods for sensitivity and uncertainty analysis are based on a sample

$$\boldsymbol{\alpha}_k = [\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{kI}], \quad (k = 1, 2, \ldots, n_S), \tag{6.24}$$

of size $n_S$, taken from the possible values of $\boldsymbol{\alpha}$ as characterized by the distributions in Eq. 6.23. The response evaluations corresponding to the sample $\boldsymbol{\alpha}_k$ defined in Eq. 6.24 can be represented in vector form as

$$\mathbf{R}(\boldsymbol{\alpha}_k) = [R_1(\boldsymbol{\alpha}_k), R_2(\boldsymbol{\alpha}_k), \ldots, R_J(\boldsymbol{\alpha}_k)], \quad (k = 1, 2, \ldots, n_S), \tag{6.25}$$

where the subscript $J$ denotes the number of components of the response $\mathbf{R}$. The pairs

$$[\boldsymbol{\alpha}_k, \mathbf{R}(\boldsymbol{\alpha}_k)], \quad (k = 1, 2, \ldots, n_S), \tag{6.26}$$

represent a mapping of the uncertain "inputs" $\boldsymbol{\alpha}_k$ to the corresponding uncertain "outputs" $\mathbf{R}(\boldsymbol{\alpha}_k)$, which result from the "sampling-based uncertainty analysis." Subsequent examination and post-processing (e.g., scatter plots, regression analysis, partial correlation analysis) of the mapping represented by Eq. 6.26 constitute procedures for "sampling-based sensitivity analysis," in that such procedures provide means of investigating the effects of the elements of $\boldsymbol{\alpha}$ on the elements of $\mathbf{R}(\boldsymbol{\alpha})$. Specifically, a "sampling-based uncertainty and sensitivity analysis" involves five steps, as follows:

Step 1: Define the subjective distributions $D_i$ described by Eq. 6.23 for characterizing the uncertain input parameters. Note that this first step is actually the most important step of the entire sampling-based procedure! Because of its fundamental importance, the characterization of subjective uncertainty has been widely studied (see, e.g. [2, 6, 36]). In practice, this step invariably involves formal expert review processes. Two of the largest
examples of analyses that used formal expert review processes to assign subjective uncertainties to input parameters are the US Nuclear Regulatory Commission's reassessment of the risks from commercial nuclear reactor power stations (1990), and the assessment of seismic risk in the Eastern USA (1990–1991) [62]. Although formal statistical procedures can occasionally be used for constructing subjective distributions, experience has shown that it is more useful to specify selected quantile (minimum, median, maximum, etc.) values, rather than attempting to specify a particular type of distribution (e.g., normal, beta, etc.) and its associated parameters. This is because experts are more likely to be able to justify the selection of specific quantile values than the selection of a particular form of distribution with specific parameters. When distributions from several expert opinions are combined, it is practically very difficult to assign weights to the respective opinions; these difficulties are discussed, for example, by Clement and Winkler [20]. Once a subjective distribution $D_i$ has been assigned to each element $\alpha_i(x)$ of $\boldsymbol{\alpha}$, the collection of distributions of the form given by Eq. 6.23 defines a probability space $(S, E, p)$, which is a formal structure where: (i) $S$ denotes the sample space (containing everything that could occur in the particular universe under consideration; the elements of $S$ are elementary events); (ii) $E$ denotes an appropriately restricted subspace of $S$, for which probabilities are defined; and (iii) $p$ denotes a probability measure.

Step 2: Use the distributions described by Eq. 6.23 to generate the sample $\boldsymbol{\alpha}_k$ described by Eq. 6.24. The most widely used sampling procedures are: random sampling, importance sampling, and Latin Hypercube sampling. In random sampling, each sample point is selected independently of all other sample points.
However, there is no guarantee that points will be sampled from all subregions of $S$; furthermore, if sampled values fall closely together, the sampling of $S$ is quite inefficient. To alleviate these shortcomings, importance sampling is used to ensure that specified regions in the sample space are fully covered. The idea of fully covering the range of each parameter is further extended in the Latin Hypercube sampling procedure (see, e.g. [46]). In this procedure, the range of each parameter $\alpha_i$ is divided into $n_{LH}$ intervals of equal probability, and one value is randomly selected from each interval. The $n_{LH}$ values thus obtained for the first parameter, $\alpha_1$, are then randomly paired, without replacement, with the $n_{LH}$ values obtained for $\alpha_2$. In turn, these pairs are combined randomly, without replacement, with the $n_{LH}$ values for $\alpha_3$ to form $n_{LH}$ triples. This process is continued until one obtains a set of $n_{LH}$ $I$-tuples, of the form $\boldsymbol{\alpha}_k = [\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{kI}]$, $(k = 1, 2, \ldots, n_{LH})$, which is called a Latin Hypercube sample. Latin Hypercube sampling is suitable for uncorrelated parameters only; if the parameters are correlated, then the respective correlation structure must be incorporated into the sample, for otherwise the ensuing uncertainty/sensitivity analysis would yield false results. To incorporate parameter correlations into the sample, Iman and Conover [37] proposed a restricted pairing technique for generating Latin Hypercubes based on rank correlations (i.e., correlations between rank-transformed parameters) rather than sample correlations (i.e., correlations between the original, untransformed, parameters). Since random sampling is easy to implement and provides unbiased estimates for the means, variances, and distribution functions, it is the preferred technique in practice, if large samples are available.
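The Latin Hypercube construction described above can be sketched as follows for uncorrelated parameters. This is a minimal illustration, not a reference implementation: the function name, the sample size, the seed, and the uniform parameter ranges are assumptions chosen for the example, and the restricted pairing technique for correlated parameters is not included.

```python
import numpy as np

def latin_hypercube(n_lh, n_params, rng):
    """Return an (n_lh, n_params) Latin Hypercube sample on the unit cube.

    Each column has exactly one point in each of the n_lh equal-probability
    intervals [k/n_lh, (k+1)/n_lh); columns are paired at random.
    """
    sample = np.empty((n_lh, n_params))
    for i in range(n_params):
        # One uniformly random value inside each equal-probability interval.
        points = (np.arange(n_lh) + rng.random(n_lh)) / n_lh
        # Random pairing without replacement is a random permutation.
        sample[:, i] = rng.permutation(points)
    return sample

rng = np.random.default_rng(seed=0)
u = latin_hypercube(10, 3, rng)

# Map the unit-cube sample to parameter values through the inverse CDF of each
# distribution; here all three parameters are assumed uniform on [lower, upper].
lower = np.array([0.0, 1.0, 10.0])
upper = np.array([1.0, 2.0, 20.0])
alpha_sample = lower + (upper - lower) * u
print(alpha_sample)
```

Each row of `alpha_sample` plays the role of one $I$-tuple of the Latin Hypercube sample; for non-uniform distributions the last step would use the corresponding inverse cumulative distribution function instead of the linear map.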
However, a sufficiently "large sample" for producing meaningful results by random sampling cannot be generated for complex models (with many parameters) and/or for estimating extremely high quantiles (e.g., the 0.99999 quantile), since the computation of the required sample becomes prohibitively expensive and impractical. In such cases, the sampling method of choice becomes stratified sampling, in which the sampling space is partitioned into regions called "strata." The main difficulty in implementing stratified sampling lies in defining the strata and calculating the probabilities for the respective strata, unless considerable a priori knowledge is already available for this purpose. For example, the fault and event trees used in risk assessment studies of nuclear power plants and other complex engineering facilities can be used as algorithms for defining stratified sampling procedures. Latin Hypercube sampling is used when very high quantiles need not be estimated, but the calculations needed for generating the "large sample" required for random sampling still remain impractical. This is often the case in practice when assessing the effects of subjective uncertainty in medium-sized problems (e.g., ca. 30 parameters) for which a 0.9–0.95 quantile is adequate for indicating the location of a likely outcome. For such problems, random sampling is still not feasible computationally, but the unbiased means and distribution functions provided by the full stratification (i.e., each parameter is treated equally) of Latin Hypercube sampling make it the preferred alternative over importance sampling, since the unequal strata probabilities used in importance sampling produce results that are difficult to interpret (particularly for subsequent sensitivity analysis).
In this sense, Latin Hypercube sampling provides a compromise form of importance sampling when a priori knowledge of the relationships between the sampled parameters and predicted responses is not available.

Step 3: Use each of the elements of the sample $\boldsymbol{\alpha}_k$ to perform model recalculations, which then generate the responses $\mathbf{R}(\boldsymbol{\alpha}_k)$ described by Eq. 6.25. These model recalculations can become the most expensive computational part of the entire uncertainty and sensitivity analysis and, if the model is complex, the limited number of feasible model recalculations may severely limit the sample size and the other aspects of the overall analysis.

Step 4: Perform "uncertainty analysis" of the response $\mathbf{R}(\boldsymbol{\alpha})$, by generating displays of the uncertainty in $\mathbf{R}(\boldsymbol{\alpha})$ using the results for $\mathbf{R}(\boldsymbol{\alpha}_k)$ obtained above, in Step 3. It is customary to display the estimated expected value and the estimated variance of the response (as estimated from the sample). However, these quantities may not be the most useful indicators about the response, since information about the physical system under consideration is always lost in the computations of means and variances. In particular, the mean and variance are less useful for summarizing information about the distribution of subjective uncertainties; by comparison, quantiles associated with the respective distribution provide a more meaningful locator for the quantity under consideration. Distribution functions (e.g., cumulative and/or complementary distribution functions, density functions) provide the complete information that can be extracted from the sample under consideration.

Step 5: Perform "sensitivity analysis" of the response $\mathbf{R}(\boldsymbol{\alpha})$ to parameters $\boldsymbol{\alpha}$, by exploring the mappings represented by Eq. 6.26, to assess the effects of the components of $\boldsymbol{\alpha}$ on the components of $\mathbf{R}(\boldsymbol{\alpha})$.
In the context of sampling-based methods, statistical sensitivity analysis (as opposed to deterministic sensitivity analysis) involves the exploration of the mapping represented by Eq. 6.26 to assess the effects of some, but not all, of the individual components of $\boldsymbol{\alpha}$ on the response $\mathbf{R}(\boldsymbol{\alpha})$. This exploration includes examination of scatter plots, regression and stepwise regression analysis, correlation and partial correlation analysis, rank transformation, identification of non-monotonic patterns, and identification of nonrandom patterns. It is important to note that correlated variables introduce unstable regression coefficients, in that the values of these coefficients become sensitive to the specific variables introduced into the regression model. In such situations, the regression coefficients of a regression model that includes all of the parameters are likely to give misleading indications of parameter importance. If several input parameters are suspected (or known) to be highly correlated, it is usually recommended to transform the respective parameters so as to remove the correlations or, if this is not possible, to analyze the full model by using a sequence of regression models with all but one of the parameters removed, in turn. Note, however, that the regression model should attempt to match the trend displayed by the collective sample rather than match the predictions associated with individual sample parameters; otherwise over-fitting of data could arise if parameters are arbitrarily forced into the regression model. Regression models based on linear representations of the impact of parameters on the response will perform poorly when the relationships between the parameters and the response are nonlinear. In such cases, the rank transformation may be used to improve the construction of the respective regression model.
The conceptual framework underlying rank transformation involves simply replacing the parameters by their respective ranks, and then performing the customary regression analysis on the ranks rather than the corresponding parameters (see, e.g. [37, 55, 57]). Thus, if the number of observations is $M$, then the smallest value of each parameter is assigned rank 1, the next largest value is assigned rank 2, and so on, until the largest value, which is assigned rank $M$; if several observations have the same value, then they are assigned an averaged rank. The regression analysis is then performed by using the ranks as input/output parameters, as replacements for the actual parameter/response values. This replacement has the effect of replacing the linearized parameter/response relationships by rank-transformed monotonic input/output relationships in an otherwise conventional regression analysis. In practice, a regression analysis using the rank-transformed (instead of raw) data may yield better results, but only as long as the relationships between parameters and responses are nonlinear but monotonic. Otherwise, the rank transformation does not significantly improve the quality of the results produced by regression analysis.

6.3.2 Reliability Algorithms: FORM and SORM

In many practical problems, the primary interest of the analyst may be focused on a particular mode of failure of the system under consideration, while the detailed spectrum of probabilistic outcomes may be of secondary concern. For such problems, the so-called reliability algorithms provide much faster and more economical answers (in comparison to the sampling-based methods discussed in Section 6.3.1) regarding the particular mode of failure of the system under consideration.
The typical problems that can be analyzed by using reliability algorithms must be characterized by a mathematical model (whose solution can be obtained analytically or numerically), by input parameters that can be treated as being affected by subjective (epistemic) uncertainties, and by a threshold level that specifies mathematically the concept of "failure." The reliability algorithms most often used are those known as first-order reliability methods (FORM) and second-order reliability methods (SORM), respectively. Both of these methods use optimization algorithms to seek "the most likely failure point" in the space of uncertain parameters, using the mathematical model and the response functional that defines failure. Once this most likely failure point (referred to as the "design point") has been determined, the probability of failure is approximately evaluated by fitting a first- or second-order surface at that point in parameter space. Reliability algorithms have been applied to a variety of problems, including structural safety (see, e.g. [45]), offshore oil field design and operation (see, e.g. [8]) and multiphase flow and transport in subsurface hydrology (see, e.g. [66]). The FORM and SORM algorithms are susceptible to non-convergence or convergence to an erroneous design point, particularly when the failure probability approaches the extreme values of 0.0 or 1.0. Therefore, the numerical optimization algorithm and convergence tolerances underlying FORM and SORM should be tailored, whenever possible, to the specific problem under investigation.

6.3.3 Variance-Based Methods

These methods use variance, among other indicators, as a measure of the importance of a parameter in contributing to the overall uncertainty in the response.
The concept of variance as a measure of the importance of a parameter also underlies the conceptual foundation of three further methods for statistical uncertainty and sensitivity analysis, namely the Fourier Amplitude Sensitivity Test (FAST), Sobol's method, and the correlation-ratio method (including variants thereof). It is important to note that, in contrast to the sampling-based methods discussed in Section 6.3.1, the correlation ratio, the FAST, and Sobol's methods do not make the a priori assumption that the input model parameters are linearly related to the model's response.

The FAST procedure was originally proposed by Cukier et al. [23], and was subsequently extended by Cukier's group and other authors. This procedure uses the following Fourier transformation of the parameters $\alpha_i$:

$$\alpha_i = F_i \sin(\omega_i z), \quad i = 1, \ldots, I, \tag{6.27}$$

where $\{\omega_i\}$ is a set of integer frequencies, while $z \in (-\pi, \pi)$ is a scalar variable. The expectation $E(R)$ and variance of the response $R$ can be approximated, respectively, as follows:

$$E(R) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(z)\,dz, \quad\text{and}\quad \mathrm{Var}(R) \cong 2\sum_{j=1}^{\infty}\left(A_j^2 + B_j^2\right), \tag{6.28}$$

where

$$f(z) \equiv f\left[F_1\sin(\omega_1 z), F_2\sin(\omega_2 z), \ldots, F_I\sin(\omega_I z)\right], \tag{6.29}$$

while

$$A_j \equiv \frac{1}{2\pi}\int_{-\pi}^{\pi} f(z)\cos(jz)\,dz, \tag{6.30}$$

$$B_j \equiv \frac{1}{2\pi}\int_{-\pi}^{\pi} f(z)\sin(jz)\,dz. \tag{6.31}$$

The transformation given by Eq. 6.27 should provide, for each parameter $\alpha_i$, a uniformly distributed sample within the unit $I$-dimensional cube. As $z \in (-\pi, \pi)$ varies for a given transformation, all parameters change simultaneously; however, their respective ranges of uncertainty are systematically and exhaustively explored (i.e., the search curve is space-filling) if and only if the set of frequencies $\{\omega_i\}$ is incommensurate (i.e., if none of the frequencies $\omega_i$ may be obtained as a linear combination, with integer coefficients, of the remaining frequencies).
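The Fourier-coefficient formulas of Eqs. 6.28, 6.30 and 6.31 can be checked numerically on a toy problem. The sketch below assumes the simplest possible search curve, $\alpha_i = \sin(\omega_i z)$, a hypothetical linear response $R = \alpha_1 + 2\alpha_2$, and the frequencies $(3, 5)$; none of these choices comes from the chapter, and a practical FAST analysis would use a transformation chosen to reproduce the actual parameter distributions. The integrals over $(-\pi, \pi)$ are approximated by averages over a uniform periodic grid.

```python
import numpy as np

n_grid = 64
z = -np.pi + 2.0 * np.pi * np.arange(n_grid) / n_grid   # uniform grid on [-pi, pi)
omega = np.array([3, 5])                                 # assumed integer frequencies

def toy_response(alpha):
    """Hypothetical linear response R = alpha_1 + 2*alpha_2."""
    return alpha[0] + 2.0 * alpha[1]

alpha_on_curve = np.sin(np.outer(omega, z))   # row i holds alpha_i along the search curve
f_vals = toy_response(alpha_on_curve)         # f(z) of Eq. 6.29 for this toy case

def fourier_coefficients(f_vals, z, j_max):
    """A_j and B_j of Eqs. 6.30-6.31; (1/2pi) * integral is a mean over the grid."""
    js = np.arange(1, j_max + 1)
    A = np.array([np.mean(f_vals * np.cos(j * z)) for j in js])
    B = np.array([np.mean(f_vals * np.sin(j * z)) for j in js])
    return A, B

mean_R = f_vals.mean()                              # E(R) of Eq. 6.28
var_direct = np.mean(f_vals ** 2) - mean_R ** 2     # variance along the curve
A, B = fourier_coefficients(f_vals, z, j_max=8)
var_fourier = 2.0 * np.sum(A ** 2 + B ** 2)         # Var(R) of Eq. 6.28, truncated at j = 8

print(mean_R, var_direct, var_fourier)
```

For this toy response the only nonzero coefficients sit at $j = 3$ and $j = 5$, so the truncated sum already reproduces the variance computed directly along the curve; for a nonlinear model, harmonics and combination frequencies would also contribute.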
The first-order sensitivity indices are computed by evaluating the coefficients $A_j$ and $B_j$ for the fundamental frequencies $\{\omega_i\}$ and their higher harmonics $p\omega_i$ $(p = 1, 2, \ldots)$. If the frequencies $\{\omega_i\}$ are integers, the contribution to the total variance $\mathrm{Var}(R)$ coming from the variance $D_i$ corresponding to parameter $\alpha_i$ is approximately obtained as

$$D_i \cong 2\sum_{p=1}^{M}\left(A_{p\omega_i}^2 + B_{p\omega_i}^2\right), \tag{6.32}$$

where $M$ is the maximum harmonic taken into consideration (usually $M \le 6$). The ratio of the partial variance $D_i$ to the total variance $\mathrm{Var}(R)$ provides the so-called first-order sensitivity index. The minimum sample size required to compute $D_i$ is $(2M\omega_{\max} + 1)$, where $\omega_{\max}$ is the maximum frequency in the set $\{\omega_i\}$ (see, e.g. [56]). Furthermore, the frequencies that do not belong to the set $\{p_1\omega_1, p_2\omega_2, \ldots, p_I\omega_I\}$ for $(p_i = 1, 2, \ldots, \infty)$, and for any $(i = 1, 2, \ldots, I)$, contain information about the residual variance $[\mathrm{Var}(R) - D_i]$ that is not accounted for by the first-order indices. Saltelli et al. have proposed a method that extracts information regarding this residual variance in $(I \times N_S)$ computations, where $N_S$ is the respective sample size.

The method due to Sobol' relies on a particular case of a theorem originally proven by Kolmogorov, in which a multivariate function $f(x_1, x_2, \ldots, x_n)$ is decomposed into summands of increasing dimensionality of the form

$$f(x_1, x_2, \ldots, x_n) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1\le i<j\le n} f_{ij}(x_i, x_j) + \cdots + f_{12\ldots n}(x_1, x_2, \ldots, x_n). \tag{6.33}$$

The above decomposition of $f(x_1, x_2, \ldots, x_n)$ is reminiscent of the ANOVA decomposition. When the quantities $x_i$ are uncorrelated, the above decomposition is unique, and has the following properties.
(i) The integral of any summand over any of its own variables is zero, i.e.,

$$\int_0^1 f_{i_1 i_2\ldots i_n}(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\,dx_{i_m} = 0, \quad\text{if } 1\le m\le n; \tag{6.34}$$

(ii) The summands are orthogonal, i.e.,

$$\int_{[0,1]^n} f_{i_1 i_2\ldots i_n}\, f_{j_1 j_2\ldots j_m}\,d\mathbf{x} = 0, \quad\text{if } (i_1, i_2, \ldots, i_n) \ne (j_1, j_2, \ldots, j_m); \tag{6.35}$$

(iii) $f_0$ is a constant, i.e.,

$$f_0 = \int_{[0,1]^n} f(\mathbf{x})\,d\mathbf{x}. \tag{6.36}$$

By squaring Eq. 6.33 and integrating the resulting expression over the unit cube $[0,1]^n$, the following relation is obtained for the total variance $D$ of $f(\mathbf{x})$:

$$D \equiv \int_{[0,1]^n} f^2(\mathbf{x})\,d\mathbf{x} - f_0^2 = \sum_{i=1}^{n} D_i + \sum_{1\le i<j\le n} D_{ij} + \cdots + D_{12\ldots n}, \tag{6.37}$$

where the partial variances of $f(\mathbf{x})$ are defined as

$$D_{i_1 i_2\ldots i_m} \equiv \int_0^1\!\cdots\!\int_0^1 f_{i_1 i_2\ldots i_m}^2(x_{i_1}, x_{i_2}, \ldots, x_{i_m})\,dx_{i_1}\cdots dx_{i_m}, \tag{6.38}$$

for $1\le i_1 < \ldots < i_m \le n$; $m = 1, \ldots, n$. The sensitivity indices are defined as

$$S_{i_1 i_2\ldots i_m} \equiv D_{i_1 i_2\ldots i_m}/D, \quad\text{for } 1\le i_1 < \ldots < i_m \le n;\ m = 1, \ldots, n. \tag{6.39}$$

The first-order sensitivity index, $S_i$, for the parameter $x_i$ indicates the fractional contribution of $x_i$ to the variance $D$ of $f(\mathbf{x})$; the second-order sensitivity index, $S_{ij}$, $(i \ne j)$, measures the part of the variation in $f(\mathbf{x})$ due to $x_i$ and $x_j$ that cannot be explained by the sum of the individual effects of $x_i$ and $x_j$; and so on. Note also that Eqs. 6.37 and 6.39 imply that

$$\sum_{i=1}^{n} S_i + \sum_{1\le i<j\le n} S_{ij} + \cdots + S_{12\ldots n} = 1. \tag{6.40}$$

6.3.4 Design of Experiments and Screening Design Methods

Design of Experiments (DOE) was first introduced by Fisher [31], and can be defined as the process of selecting those combinations of parameter values, called design points, which will provide the most information on the input–output relationship embodied by a model in the presence of parameter variations. However, the basic question underlying DOE is often a circular one: if the response function were known, then it would be easy to select the optimal design points, but the response is actually the object of the investigation, to begin with!
Often used in practice is the so-called Factorial Design (FD), which aims at measuring the additive and interactive effects of input parameters on the response. An FD simulates all possible combinations of assigned values, $l_i$, called levels, to each (uncertain) system parameter $\alpha_i$. Thus, even though an FD can account for interactions among parameters, the computational cost required by an FD is $l_1 \times l_2 \times \cdots \times l_I$, where $I$ denotes the total number of parameters in the model; such a computational effort is prohibitively high for large-scale systems. A useful alternative is the Fractional Factorial Design (FFD) introduced by Box et al. [7], which assumes a priori that higher-order interactions between parameters are unimportant.

Screening design methods refer to preliminary numerical experiments designed to identify the parameters that have the largest influence on a particular model response. The objective of screening is to arrive at a short list of important factors. In turn, this objective can only be achieved if the underlying numerical experiments are judiciously designed. An assumption often used as a working hypothesis in screening design is that the number of parameters that are truly important to the model response is small in comparison with the total number of parameters underlying the model. This assumption is based on the idea that the influence of parameters in models follows Pareto's law of income distribution within nations, characterized by a few, very important parameters and a majority of noninfluential ones. Since screening designs are organized to deal with models containing very many parameters, they should be computationally economical. There is an inevitable trade-off, however, between computational costs and information extracted from a screening design.
Thus, computationally economical methods often provide only qualitative, rather than quantitative information, in that they provide a parameter importance ranking rather than a quantification of how much more important a given parameter is than another.

Falling within the simplest class of screening designs are the so-called one-at-a-time (OAT) experiments, in which the impact of changing the values of each parameter is evaluated in turn [24]. The standard OAT experiment is defined as the experiment that uses standard or nominal values for each of the $I$ parameters underlying the model. The combination of nominal values for the $I$ parameters is called the control experiment (or scenario). Two extreme values are then selected to represent the range of each of the $I$ parameters. The nominal values are customarily selected midway between the two extremes. The magnitudes of the residuals, defined as the difference between the perturbed and nominal response (output) values, are then compared to assess which factors are most significant in affecting the response.

Since classical OAT cannot provide information about interactions between parameters, the model’s behavior can only be assessed in a small interval around the “control” scenario. In other words, the classical OAT experiments yield information only about the local behavior of the system’s response. Therefore, the results of a classical OAT experiment are meaningful only if the model’s input–output relation can be adequately represented by a first-order polynomial in the model’s parameters. If the model is affected by nonlinearities (as is often the case in practice), then parameter changes around the “control” scenario would provide drastically different “sensitivities,” depending on the chosen “control” scenario.

To address this severe limitation of the classical OAT designs, Morris [47] proposed a global OAT design method.
In this method, the entire space in which the parameters may vary is covered independently of the specific initial “control” scenario used to initiate the experiment. A global OAT design assumes that the model is characterized by a large number of parameters and/or is computationally expensive (regarding computational time and computational resources) to run. The range of variation of each component of the vector $\alpha$ of parameters is standardized to the unit interval, and each component is then considered to take on $p$ values in the set $\{0, (p-1)^{-1}, 2(p-1)^{-1}, \ldots, 1\}$, so that the region of experimentation becomes an $I$-dimensional grid with $p$ levels. An elementary effect of the $i$th parameter at a point $\alpha$ is then defined as $[R(\alpha_1, \ldots, \alpha_{i-1}, \alpha_i + \Delta, \ldots, \alpha_I) - R(\alpha)]/\Delta$, where $\Delta$ is a predetermined multiple of $1/(p-1)$, such that $\alpha_i + \Delta$ is still within the region of experimentation. A finite distribution $F_i$ of elementary effects for the $i$th parameter is obtained by sampling $\alpha$ from within the region of experimentation. The number of elements for each $F_i$ is $p^{I-1}[p - \Delta(p-1)]$. The distribution $F_i$ is then characterized by its mean and standard deviation. A high mean indicates a parameter with an important overall influence on the response; a high standard deviation indicates either a parameter interacting with other parameters or a parameter whose effect is nonlinear.

The alternative systematic fractional replicate design (SFRD), proposed by Cotter [21], does not require any prior assumptions about parameter interactions.
For a model with $I$ parameters, an SFRD involves the following steps: (i) one model computation with all parameters at their low levels; (ii) $I$ model computations with each parameter, in turn, at its upper level, while the remaining $(I-1)$ parameters remain at their low levels; (iii) $I$ model computations with each parameter, in turn, at its low level, while the remaining $(I-1)$ parameters remain at their upper levels; and (iv) one model computation with all parameters at their upper levels. Thus, an SFRD requires $2(I+1)$ computations. Denoting by $(R_0, R_1, \ldots, R_I, R_{I+1}, \ldots, R_{2I}, R_{2I+1})$ the values of the responses computed in steps (i)–(iv) within an SFRD, the measures $M(j) \equiv |C_e(j)| + |C_o(j)|$, where the quantities $C_e(j)$ and $C_o(j)$ are defined as

\[ C_e(j) \equiv \left[ (R_{2I+1} - R_{I+j}) - (R_j - R_0) \right] / 4, \qquad C_o(j) \equiv \left[ (R_{2I+1} - R_{I+j}) + (R_j - R_0) \right] / 4, \]

respectively, are used to estimate the order of importance of the $I$ parameters $\alpha_i$. It is apparent from these definitions that the measures $M(j)$ may fail when a parameter induces cancellation effects in the response; such a parameter would remain undetected by an SFRD. Worse yet, it is not possible to protect oneself a priori against such occurrences. Furthermore, an SFRD is not sufficiently precise, since the above definitions imply that, for one replicate, the variances are $\mathrm{var}[C_o(j)] = \mathrm{var}[C_e(j)] = \sigma^2/4$, whereas a fractional replicate with $n$ computations would allow the estimation of parameter effects (on the response) with variances $\sigma^2/n$.

In addition to screening designs that consider each parameter individually, the (originally) individual parameters can be clustered into groups that are subsequently treated by group screening designs. Perhaps the most efficient modern group screening design techniques are the iterated fractional factorial design (IFFD)
In principle, the IFFD requires fewer model computations, n, than there are parameters, I . To identify an influential parameter, an IFFD investigates the groups through a fractional factorial design; the procedure is then repeated with different random groupings. Influential parameters are then sought at the intersection of influential groups. The IFFD samples three levels per parameter, designated low, middle, and high, while ensuring that the sampling is balanced: different combinations of values for two or three parameters appear with equal frequency. Hence, IFFD can be considered as a composite design consisting of multiple iterations of a basic FFD. The sequential bifurcation (SB) design combines two design techniques: (i) the sequential design, in which the parameter combinations are selected based on the results of preceding computations, and (ii) bifurcation, in which each group that seems to include one or more important parameters is split into two subgroups of the same size. However, the SB design must a priori assume that the analyst knows the signs of the effects of the individual parameter, in order to ensure that effects of parameters assigned to the same group do not cancel out. Furthermore, the sequential nature of SB implies a more cumbersome data handling and analysis process than other screening design methods. To assess the effects of interactions between parameters, the number of SB computations becomes the double of the number of computations required to estimate solely the “main effects”; quadratic effects cannot be currently analyzed with the SB design technique. The screening designs surveyed in the foregoing are the most representative and the most widely used methods aimed at identifying at the outset, in the initial phase of sensitivity and uncertainty analysis, the (hopefully not too many!) important parameters in a model. 
Each type of design has its own advantages and disadvantages, which can be summarized as follows. The advantages of OAT designs are: (i) no assumption of a monotonic input–output relation; (ii) no assumption that the model contains only “a few” important parameters; and (iii) the computational cost increases linearly with the number of parameters. The major disadvantage of OAT designs is the neglect of parameter interactions. Although such an assumption drastically simplifies the analysis of the model, it can rarely be accepted in practice. This simplifying assumption is absent in the global OAT design of Morris, which aims at determining the parameters that have (i) negligible effects, (ii) linear and additive effects, and (iii) nonlinear or interaction effects. Although the global OAT is easy to implement, it requires a high computational effort for large-scale models, and provides only a qualitative (but not quantitative) indication of the interactions of a parameter with the rest of the model; the global OAT cannot provide specific information about the identity of individual parameter interactions. The SFRD does not require a priori assumptions about parameter interactions and/or about which few parameters are important. Although the SFRD is relatively efficient computationally, it lacks precision and cannot detect parameters whose effects cancel each other out. The IFFD estimates the main and quadratic effects, and two-parameter interactions between the most influential parameters. Although the IFFD requires fewer computations than the total number of model parameters, it gives good results only if the model’s response is actually influenced by only a few truly important parameters. The SB design is simple and relatively cost-effective (computationally), but assumes that (i) the signs of the main effects are a priori known, and (ii) the model under consideration is adequately described by two-parameter interactions.
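The elementary effects used in the global OAT design of Morris can be illustrated with a short numerical sketch. The code below is a simplified illustration only, not Morris’s full design: it draws independent random base points on the $p$-level grid instead of constructing trajectories, and it assumes an even $p$ with $\Delta = p/[2(p-1)]$. The function name `elementary_effects` and the test model are hypothetical.

```python
import numpy as np

def elementary_effects(model, num_params, p=4, num_samples=50, seed=0):
    """Estimate the mean and standard deviation of elementary effects.

    Simplified sketch: base points are sampled independently on the
    p-level grid of the unit hypercube (no trajectory construction).
    """
    rng = np.random.default_rng(seed)
    # Assumed step: an integer multiple of 1/(p-1) when p is even.
    delta = p / (2.0 * (p - 1))
    levels = np.arange(p) / (p - 1)
    # Keep only base levels for which alpha_i + delta stays inside [0, 1].
    admissible = levels[levels + delta <= 1.0 + 1e-12]
    effects = np.empty((num_samples, num_params))
    for s in range(num_samples):
        alpha = rng.choice(admissible, size=num_params)
        base = model(alpha)
        for i in range(num_params):
            perturbed = alpha.copy()
            perturbed[i] += delta
            effects[s, i] = (model(perturbed) - base) / delta
    return effects.mean(axis=0), effects.std(axis=0, ddof=1)

def example_model(alpha):
    """Hypothetical response: linear in alpha[0], interaction between
    alpha[1] and alpha[2], and no dependence on alpha[3]."""
    return 2.0 * alpha[0] + alpha[1] * alpha[2]

mean_effect, std_effect = elementary_effects(example_model, num_params=4)
```

For this hypothetical model, the first parameter shows a nonzero mean with essentially zero spread (a linear, additive effect), the second and third show a nonzero spread (interaction), and the fourth shows neither (negligible effect), mirroring the three categories the design aims to separate.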
6.4 Deterministic Methods for Sensitivity and Uncertainty Analysis

The importance of parameters in large-scale complex models is not a priori obvious, and may often be counterintuitive. To analyze such complex models, information about the slopes of the model’s response at a given set of nominal parameter values in parameter space is of paramount importance. The exact slopes are provided by the local partial functional derivatives $\partial R / \partial \alpha_i$ of the response $R$ with respect to the model parameters $\alpha_i$; these local partial functional derivatives are called the local sensitivities of the model’s response to parameter variations.

The simplest way of estimating local sensitivities is by recalculations of the model’s response, using parameter values that deviate by small amounts, $\delta\alpha_i$, of the order of 1%, from their nominal values $\alpha_i^0$. The sensitivities are then estimated by using a finite difference approximation to $\partial R / \partial \alpha_i$ of the form

\[ \left. \frac{\partial R}{\partial \alpha_i} \right|_{\alpha^0} \cong \frac{R(\alpha_1^0, \ldots, \alpha_i^0 + \delta\alpha_i, \ldots, \alpha_I^0) - R(\alpha^0)}{\delta\alpha_i}, \quad (i = 1, \ldots, I). \]

This procedure, occasionally called the “brute-force method,” requires $(I+1)$ model computations; if central differences are used, the number of model computations could increase up to a total of $2I$. Although this method is conceptually simple to use and requires no additional model development, it is slow, relatively expensive computationally, and involves a trial-and-error process when selecting the parameter perturbations $\delta\alpha_i$. Note that erroneous sensitivities will be obtained if: (i) $\delta\alpha_i$ is chosen to be too small, in which case computational round-off errors will overwhelm the correct values, and (ii) the parameter dependence is nonlinear and $\delta\alpha_i$ is chosen too large, in which case the assumption of local linearity is violated.

Historically, limited considerations of sensitivity analysis already appeared a century ago, in conjunction with studies of the influence of the coefficients of a differential equation on its solution.
For a long time, however, those considerations remained merely of mathematical interest. The first systematic methodology for performing sensitivity analysis was formulated by Bode [4] for linear electrical circuits. Subsequently, sensitivity analysis provided a fundamental motivation for the use of feedback, leading to the development of modern control theory, including optimization, synthesis, and adaptation. The introduction of state-space methods in control theory, which commenced in the late 1950s, and the rapid development of digital computers have provided the proper conditions for establishing sensitivity theory as a branch of control theory and computer science. The number of publications dealing with sensitivity analysis applications in this field grew enormously (see, e.g., the books by Kokotovic 1972 [42]; Tomovic and Vucobratovic [61]; Cruz [22]; Frank [32]; Fiacco [30]; Deif [25]; Eslami [29]; Rosenwasser and Yusupov [53]). In parallel, and mostly independently, ideas of sensitivity analysis have also permeated other fields of scientific and engineering activities; notable developments in this respect have occurred in the nuclear, atmospheric, geophysical, socioeconomic, and biological sciences.

When the parameter variations are small, the traditional way to assess their effects on calculated responses is by using perturbation theory, either directly or indirectly, via variational principles. The basic aim of perturbation theory is to predict the effects of small parameter variations without actually calculating the perturbed configuration but rather by using solely unperturbed quantities (see, e.g. [5,28,35,40–42,50,51,54]). As a branch of applied mathematics, perturbation theory relies on concepts and methods of functional analysis.
Since the functional analysis of linear operators is quite mature and well established, the regular analytic perturbation theory for linear operators is also well established. Even for linear operators, though, the results obtained by using analytic perturbation theory for the continuous spectrum are less satisfactory than the results delivered by analytic perturbation theory for the essential spectrum. This is because the continuous spectrum is less stable than the essential spectrum, as can be noted by recalling that the fundamental condition underlying analytic perturbation theory is the continuity in the norm of the perturbed resolvent operator (see, e.g. [40, 67]). If this analytical property of the resolvent is lost, then isolated eigenvalues need no longer remain stable to perturbations, and the corresponding series developments may diverge, have only a finite number of significant terms, or may cease to exist (e.g., an unstable isolated eigenvalue may be absorbed into the continuous spectrum as soon as the perturbation is switched on). The analysis of such divergent series falls within the scope of asymptotic perturbation theory, which comprises: (i) regular or uniform asymptotic perturbation expansions, where the respective expansion can be constructed in the entire domain, and (ii) singular asymptotic perturbation expansions, characterized by the presence of singular manifolds across which the solution behavior changes qualitatively. A nonexhaustive list of typical examples of singular perturbations includes: the presence or occurrence of singularities, passing through resonances, loss of the highest-order derivative, change of type of a partial differential equation, and leading operator becoming nilpotent. 
A variety of methods have been developed for analyzing such singular perturbations; among the most prominent are the method of matched asymptotic expansions, the method of strained coordinates, the method of multiple scales, the WKB (Wentzel–Kramers–Brillouin, or the phase-integral) method, the KBM (Krylov–Bogoliubov–Mitropolsky) method, Whitham’s method, and variations thereof. Many fundamental issues in asymptotic perturbation theory are still unresolved, and a comprehensive theory encompassing all types of operators (in particular, differential operators) does not exist yet. Actually, the problems tackled by singular perturbation theory are so diverse that this part of applied mathematics appears to the nonspecialist as a collection of almost disparate methods that often require some clever a priori guessing at the structure of the very answer one is looking for. The lack of a unified method for singularly perturbed problems is particularly evident for nonlinear systems, and this state of affairs is not surprising in view of the fact that nonlinear functional analysis is much less well developed than its linear counterpart. As can be surmised from the above arguments, although perturbation theory can be a valuable tool in certain instances for performing sensitivity analysis, it should be noted already at this stage that the aims of perturbation theory and sensitivity analysis do not coincide, and the two scientific disciplines are evolving separately from each other.

For models that involve a large number of parameters and comparatively few responses, sensitivity analysis can be performed very efficiently by using deterministic methods based on adjoint functions. The use of adjoint functions for analyzing the effects of small perturbations in a linear system was introduced by Wigner [65].
Specifically, he showed that the effects of perturbing the material properties in a critical nuclear reactor can be assessed most efficiently by using the adjoint neutron flux, defined as the solution of the adjoint neutron transport equation. Since the neutron transport operator is linear, its adjoint operator is straightforward to obtain. In the same report, Wigner was also the first to show that the adjoint neutron flux can be interpreted as the importance of a neutron in contributing to the detector response. Wigner’s original work on the linear neutron diffusion and transport equations laid the foundation for the development of a comprehensive and efficient deterministic methodology, using adjoint fluxes, for performing systematic sensitivity and uncertainty analyses of eigenvalues and reaction rates to uncertainties in the cross sections in nuclear reactor physics problems (see, e.g. [33, 34, 58–60, 63, 64]). Since the neutron transport and neutron diffusion equations underlying problems in nuclear reactor physics are linear in the dependent variable (i.e., the neutron flux), the respective adjoint operators and adjoint fluxes are easy to obtain, a fact that undoubtedly facilitated the development of the respective sensitivity and uncertainty analysis methods. The responses considered in all of these works were functionals of the neutron forward and/or adjoint fluxes, and the sensitivities were defined as the derivatives of the respective responses with respect to scalar parameters, such as atomic number densities and energy-group-averaged cross sections. Local sensitivities can be computed exactly only by using deterministic methods that involve some form of differentiation of the system under investigation. 
The (comparatively few) deterministic methods for calculating sensitivities exactly are as follows: the Green’s function method (GFM), the direct method (including its decoupled direct method [DDM] variant), the Forward Sensitivity Analysis Procedure (FSAP), and the Adjoint Sensitivity Analysis Procedure (ASAP).

The GFM commences by differentiating the model under consideration with respect to its initial conditions in order to obtain a Green’s function, which is subsequently convolved with the matrix of parameter derivatives, and is finally integrated in time to obtain the respective time-dependent sensitivities. There are several variants of the GFM; the integrated Magnus version (GFM/AIM) proposed by Kramer et al. [43] appears to be the most efficient (computationally) GFM. In practice, though, the GFM is seldom used, since it is computationally very expensive and difficult to implement.

The so-called direct method has been applied predominantly to systems of ordinary differential and/or algebraic equations describing chemical kinetics (including combustion kinetics) and molecular dynamics. This method is practically identical to the sensitivity analysis methods used in control theory, involving differentiation of the equations describing the model with respect to a parameter. The resulting set of equations is solved for the derivative of all the model variables with respect to that parameter. The actual form of the differentiated equations depends on the parameter under consideration. Consequently, for each parameter, a different set of equations must be solved to obtain the corresponding sensitivity.
The most advanced and computationally economical version of the direct method is the decoupled direct method (DDM), originally introduced by Dunker [26, 27], in which the Jacobian matrix needed to solve the original system at a given time-step is also used to solve the sensitivity equations at the respective time-step, before proceeding to solve both the original and sensitivity systems at the next time-step. It is important to note that the computational effort increases linearly with the number of parameters.

The direct method, on the one hand, and the variational/perturbation methods using adjoint functions, on the other hand, were unified and generalized by Cacuci et al. [17], who presented a comprehensive methodology, based on Fréchet derivatives, for performing systematic and efficient sensitivity analyses of large-scale continuous and/or discrete linear and/or nonlinear systems. Shortly thereafter, this methodology was further generalized by Cacuci [9, 10], who used methods of nonlinear functional analysis to introduce a rigorous definition of the concept of sensitivity as the first Gâteaux-differential (in general a nonlinear operator) of the system response along an arbitrary direction in the hyperplane tangent to the base-case solution in the phase-space of parameters and dependent variables. These works presented a rigorous theory not only for sensitivity analysis of functional-type responses, but also for responses that are general nonlinear operators, and for responses defined at critical points. Furthermore, these works have also set apart sensitivity analysis from perturbation theory, by defining the scope of the former to be the exact and efficient calculation of all sensitivities, regardless of their use.

As detailed in the recent book by Cacuci [12], the two most general and effective procedures for computing local sensitivities are the Forward Sensitivity Analysis Procedure (FSAP) and the Adjoint Sensitivity Analysis Procedure (ASAP).
The FSAP constitutes a generalization of the decoupled direct method (DDM), since the concept of Gâteaux-differential (which underlies the FSAP) constitutes the generalization of the concept of total differential in the calculus sense, which underlies the DDM. Notably, the Gâteaux-differential exists for operators and generalized functions (e.g., distributions) that are not continuous in the ordinary calculus sense, and therefore do not admit the “nice” derivatives required for using the DDM. As expected, the FSAP reduces to the DDM, whenever the continuity assumptions required by the DDM are satisfied. Finally, even though the FSAP represents a generalization of the DDM, the FSAP requires the same computational and programming effort to develop and implement as the DDM.

Most problems of practical interest comprise a large number of parameters and comparatively few responses. In such situations, it is by far more advantageous to employ the ASAP. Note, though, that the ASAP is not easy to implement for complicated nonlinear models, particularly in the presence of structural discontinuities (i.e., when the structure of the model itself changes). Furthermore, Cacuci [9, 10] has underscored the fact that the adjoint functions needed for the sensitivity analysis of nonlinear systems depend on the unperturbed forward (i.e., nonlinear) solution, a fact that is in contradistinction to the case of linear systems. These works have also shown that the adjoint functions corresponding to the Gâteaux-differentiated nonlinear systems can be interpreted as importance functions, in that they measure the importance of a region and/or event in phase space in contributing to the system’s response under consideration; this interpretation is similar to that originally assigned by Wigner [65] to the adjoint neutron flux for linear neutron diffusion and/or transport problems in reactor physics and shielding.
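The difference in cost between forward and adjoint procedures can be seen in a minimal finite-dimensional sketch. The example below assumes a hypothetical linear algebraic model $A(\alpha)u = q(\alpha)$ with a scalar response $R = c^T u$ that has no explicit dependence on $\alpha$; it is not the general operator formulation, and all function names and the 2×2 system are illustrative assumptions. The forward route solves one linear system per parameter, whereas the adjoint route solves a single system with $A^T$ per response.

```python
import numpy as np

def build_system(alpha):
    """Hypothetical 2x2 model A(alpha) u = q(alpha), for illustration only."""
    a1, a2, a3 = alpha
    A = np.array([[a1, 1.0], [1.0, a2]])
    q = np.array([a3, 1.0])
    return A, q

def parameter_derivatives():
    """Analytic derivatives dA/dalpha_i and dq/dalpha_i of the model above."""
    dA = [np.array([[1.0, 0.0], [0.0, 0.0]]),
          np.array([[0.0, 0.0], [0.0, 1.0]]),
          np.zeros((2, 2))]
    dq = [np.zeros(2), np.zeros(2), np.array([1.0, 0.0])]
    return dA, dq

def forward_sensitivities(alpha, c):
    """One linear solve per parameter: A (du/dalpha_i) = dq_i - dA_i u."""
    A, q = build_system(alpha)
    u = np.linalg.solve(A, q)
    dA, dq = parameter_derivatives()
    return np.array([c @ np.linalg.solve(A, dq_i - dA_i @ u)
                     for dA_i, dq_i in zip(dA, dq)])

def adjoint_sensitivities(alpha, c):
    """One adjoint solve A^T psi = c, then inner products for all parameters."""
    A, q = build_system(alpha)
    u = np.linalg.solve(A, q)
    psi = np.linalg.solve(A.T, c)
    dA, dq = parameter_derivatives()
    return np.array([psi @ (dq_i - dA_i @ u) for dA_i, dq_i in zip(dA, dq)])

alpha0 = np.array([3.0, 2.0, 1.0])   # assumed nominal parameter values
c = np.array([1.0, 0.0])             # response R = u_1
forward = forward_sensitivities(alpha0, c)
adjoint = adjoint_sensitivities(alpha0, c)
```

Both routes return the same three sensitivities; with $I$ parameters and one response, the forward route needs $I$ additional solves and the adjoint route needs one, which is the trade-off described above.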
Once they become available, the sensitivities can be used for various purposes, such as for ranking the respective parameters in order of their relative importance to the response, for assessing changes in the response due to parameter variations, for performing uncertainty analysis using either the Bayesian approach or the response surface approach, or for data adjustment and/or assimilation. As highlighted by Cacuci [12], it is necessary to define rigorously the concept of sensitivity and to separate the calculation of sensitivities from their use, in order to compare clearly, for each particular problem, the relative advantages and disadvantages of using one or the other of the competing deterministic methods, statistical methods, or Monte Carlo methods of sensitivity and uncertainty analysis. The exact local sensitivities obtained by using deterministic methods can be used for the following purposes: (i) understand the system by highlighting important data; (ii) eliminate unimportant data; (iii) determine effects of parameter variations on system behavior; (iv) design and optimize the system (e.g., maximize availability/minimize maintenance); (v) reduce over-design; (vi) prioritize the improvements effected in the system; (vii) prioritize introduction of data uncertainties; and (viii) perform local uncertainty analysis by using the “propagation of errors” method. Important applications of deterministically computed local sensitivities include core reactor physics and shielding (see, e.g. [33,34,44,52]; and references therein), reactor thermal-hydraulics and neutron dynamics [16, 19], two-phase flows with phase transition [14, 38], geophysical fluid dynamics (see, e.g. [13, 48, 49, 68]), reliability and risk analysis [15, 39]. 
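The brute-force recalculation described at the start of this section and the “propagation of errors” use of sensitivities listed above can be combined in a short sketch. The code assumes a scalar response, forward differences with a relative step of 1%, and a first-order variance estimate of the form $S\,C\,S^T$, where $S$ is the row vector of sensitivities and $C$ the parameter covariance matrix; the function names and the test response are hypothetical.

```python
import numpy as np

def finite_difference_sensitivities(response, alpha0, rel_step=0.01):
    """Estimate dR/dalpha_i by forward differences (I + 1 model runs)."""
    alpha0 = np.asarray(alpha0, dtype=float)
    r0 = response(alpha0)                      # nominal ("base-case") run
    sens = np.empty_like(alpha0)
    for i in range(alpha0.size):
        # Relative perturbation; fall back to an absolute step at zero.
        step = rel_step * alpha0[i] if alpha0[i] != 0.0 else rel_step
        perturbed = alpha0.copy()
        perturbed[i] += step
        sens[i] = (response(perturbed) - r0) / step
    return sens

def propagate_errors(sens, cov):
    """First-order response variance: S C S^T for a scalar response."""
    sens = np.asarray(sens, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return float(sens @ cov @ sens)

def example_response(alpha):
    """Hypothetical linear response, chosen so the differences are exact."""
    return 3.0 * alpha[0] - 2.0 * alpha[1]

sens = finite_difference_sensitivities(example_response, [1.0, 2.0])
variance = propagate_errors(sens, np.diag([0.01, 0.04]))
```

For a nonlinear response, the estimate would depend on `rel_step`, which reflects the trial-and-error selection of perturbation sizes noted earlier in this section.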
6.4.1 The Forward Sensitivity Analysis Procedure (FSAP) for Nonlinear Systems

Consider that the physical system is represented mathematically by means of $K_u$ coupled nonlinear operator equations of the form

\[ N[u(x), \alpha(x)] = Q[\alpha(x)], \quad x \in \Omega_x, \tag{6.41} \]

where

1. $x = (x_1, \ldots, x_{J_x})$ denotes the $J_x$-dimensional phase-space position vector for the primary system; $x \in \Omega_x \subset \mathbb{R}^{J_x}$, where $\Omega_x$ is a subset of the $J_x$-dimensional real vector space $\mathbb{R}^{J_x}$.
2. $u(x) = [u_1(x), \ldots, u_{K_u}(x)]$ denotes a $K_u$-dimensional (column) vector whose components are the primary system’s dependent (i.e., state) variables; $u(x) \in E_u$, where $E_u$ is a normed linear space over the scalar field $F$ of real numbers.
3. $\alpha(x) = [\alpha_1(x), \ldots, \alpha_I(x)]$ denotes an $I$-dimensional (column) vector whose components are the primary system’s parameters; $\alpha \in E_\alpha$, where $E_\alpha$ is also a normed linear space.
4. $Q[\alpha(x)] = [Q_1(\alpha), \ldots, Q_{K_u}(\alpha)]$ denotes a $K_u$-dimensional (column) vector whose elements represent inhomogeneous source terms that depend either linearly or nonlinearly on $\alpha$; $Q \in E_Q$, where $E_Q$ is also a normed linear space; the components of $Q$ may be operators, rather than just functions, acting on $\alpha(x)$ and $x$.
5. $N(u, \alpha) \equiv [N_1(u, \alpha), \ldots, N_{K_u}(u, \alpha)]$ denotes a $K_u$-component column vector whose components are nonlinear operators (including differential, difference, integral, distributions, and/or infinite matrices) acting on $u$ and $\alpha$.

In view of the definitions given above, $N$ represents the mapping $N: D \subset E \to E_Q$, where $D = D_u \times D_\alpha$, $D_u \subset E_u$, $D_\alpha \subset E_\alpha$, and $E = E_u \times E_\alpha$. Note that an arbitrary element $e \in E$ is of the form $e = (u, \alpha)$. If differential operators appear in Eq. 6.41, then a corresponding set of boundary and/or initial conditions (which are essential to define $D$) must also be given.
The respective boundary conditions are represented in operator form as

\[ [B(u, \alpha) - A(\alpha)]_{\partial\Omega_x} = 0, \quad x \in \partial\Omega_x, \tag{6.42} \]

where $A$ and $B$ are nonlinear operators, and $\partial\Omega_x$ denotes the boundary of $\Omega_x$.

The vector-valued function $u(x)$ is considered to be the unique nontrivial solution of the physical problem described by Eqs. 6.41 and 6.42. The system response (i.e., performance parameter) $R(u, \alpha)$ associated with the problem modeled by Eqs. 6.41 and 6.42 is a phase-space-dependent mapping that acts nonlinearly on the system’s state vector $u$ and parameters $\alpha$, and is represented in operator form as

\[ R(e): D_R \subset E \to E_R, \tag{6.43} \]

where $E_R$ is a normed vector space.

In practice, the exact values of the parameters $\alpha$ are not known; usually, only the nominal (mean) parameter values, $\alpha^0$, and their covariances, $\mathrm{cov}(\alpha_i, \alpha_j)$, are available (in exceptional cases, higher moments may also be available). The nominal parameter values $\alpha^0(x)$ are used in Eqs. 6.41 and 6.42 to obtain the nominal solution $u^0(x)$ by solving the equations

\[ N(u^0, \alpha^0) = Q(\alpha^0), \quad x \in \Omega_x, \tag{6.44} \]

\[ B(u^0, \alpha^0) = A(\alpha^0), \quad x \in \partial\Omega_x. \tag{6.45} \]

Thus, Eqs. 6.44 and 6.45 represent the “base-case” (nominal) state of the primary (non-augmented) system, and $e^0 = (u^0, \alpha^0)$ represents the nominal solution of the non-augmented system. Once the nominal solution $e^0 = (u^0, \alpha^0)$ has been obtained, the nominal value $R(e^0)$ of the response $R(e)$ is obtained by evaluating Eq. 6.43 at $e^0 = (u^0, \alpha^0)$.

As was generally shown by Cacuci [9], the sensitivity of the response $R$ to variations $h$ in the system parameters is given by the Gâteaux- (G)-differential $\delta R(e^0; h)$ of the response $R(e)$ at $e^0 = (u^0, \alpha^0)$ with increment $h$, defined as

\[ \delta R(e^0; h) \equiv \left\{ \frac{d}{dt} \left[ R(e^0 + th) \right] \right\}_{t=0} = \lim_{t \to 0} \frac{R(e^0 + th) - R(e^0)}{t}, \tag{6.46} \]

for $t \in F$, and all (i.e., arbitrary) vectors $h \in E$. For the non-augmented system considered here, it follows that $h = (h_u, h_\alpha)$, since $E = E_u \times E_\alpha$.
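The limit in Eq. 6.46 can be checked numerically for a simple case. The sketch below assumes a hypothetical scalar response $R(e) = \alpha u^2$ with $e = (u, \alpha)$, for which the G-differential is $\delta R(e^0; h) = 2\alpha^0 u^0 h_u + (u^0)^2 h_\alpha$, and approximates the derivative with respect to $t$ at $t = 0$ by a central difference; the function name `g_differential` and the chosen values are illustrative assumptions.

```python
import numpy as np

def g_differential(response, e0, h, t=1e-6):
    """Approximate d/dt R(e0 + t h) at t = 0 by a central difference in t."""
    e0 = np.asarray(e0, dtype=float)
    h = np.asarray(h, dtype=float)
    return (response(e0 + t * h) - response(e0 - t * h)) / (2.0 * t)

def example_response(e):
    """Hypothetical response R(u, alpha) = alpha * u**2, with e = (u, alpha)."""
    u, alpha = e
    return alpha * u ** 2

e0 = [2.0, 3.0]     # nominal point (u0, alpha0), assumed for illustration
h = [0.5, -1.0]     # increment (h_u, h_alpha), assumed for illustration
numerical = g_differential(example_response, e0, h)
analytical = 2.0 * 3.0 * 2.0 * 0.5 + 2.0 ** 2 * (-1.0)
```

In an actual model, $h_u$ is not free but is determined by $h_\alpha$ through the model equations; here both components are prescribed only to exercise the definition.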
The G-differential $\delta R(e^0; h)$ is related to the total variation $[R(e^0 + th) - R(e^0)]$ of $R$ at $e^0$ through the relation

\[ R(e^0 + th) - R(e^0) = \delta R(e^0; th) + \Delta(th), \quad \text{with } \lim_{t \to 0} \left[ \Delta(th) \right] / t = 0. \tag{6.47} \]

The objective of local sensitivity analysis is to evaluate $\delta R(e^0; h)$. Recall that the system’s state vector $u$ and parameters $\alpha$ are related to each other through Eqs. 6.41 and 6.42, which implies that $h_u$ and $h_\alpha$ are also related to each other. Therefore, the sensitivity $\delta R(e^0; h)$ of $R(e)$ at $e^0$ can only be evaluated after determining the vector of variations $h_u$ in terms of the vector of parameter variations $h_\alpha$. The first-order relationship between $h_u$ and $h_\alpha$ is determined by taking the G-differentials of Eqs. 6.41 and 6.42, to obtain the forward sensitivity system (FSS)

\[ N'_u(u^0, \alpha^0)\, h_u = \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha, \quad x \in \Omega_x, \tag{6.48} \]

\[ B'_u(u^0, \alpha^0)\, h_u = \delta A(\alpha^0; h_\alpha) - B'_\alpha(u^0, \alpha^0)\, h_\alpha, \quad x \in \partial\Omega_x. \tag{6.49} \]

For a given vector of parameter variations $h_\alpha$ around $\alpha^0$, the forward sensitivity system represented by Eqs. 6.48 and 6.49 is solved to obtain $h_u$. Once $h_u$ is available, it is, in turn, used in Eq. 6.46 to calculate the sensitivity $\delta R(e^0; h)$ of $R(e)$ at $e^0$, for a given vector of parameter variations $h_\alpha$.

Equations (6.48) and (6.49) represent the “forward sensitivity equations (FSE),” also called occasionally the “forward sensitivity model (FSM),” or the “forward variational model (FVM),” or the “tangent linear model (TLM).” The direct computation of the response sensitivity $\delta R(e^0; h)$ by using the ($h_\alpha$-dependent) solution $h_u$ of Eqs. 6.48 and 6.49 constitutes the Forward Sensitivity Analysis Procedure (FSAP). From the standpoint of computational costs and effort, the FSAP is advantageous to employ only if, in the problem under consideration, the number of different responses of interest exceeds the number of system parameters and/or parameter variations to be considered.
This is rarely the case in practice, however, since most problems of practical interest are characterized by many parameters (i.e., $\alpha$ has many components) and comparatively few responses. In such situations, it is not economical to employ the FSAP to answer all sensitivity questions of interest, since it becomes prohibitively expensive to solve the $h_\alpha$-dependent FSE repeatedly to determine $h_u$ for all possible vectors $h_\alpha$.

6.4.2 The Adjoint Sensitivity Analysis Procedure (ASAP) for Nonlinear Systems

When the response $R(e)$ is an operator of the form $R: D_R \to E_R$, the sensitivity $\delta R(e^0; h)$ is also an operator, defined on the same domain, and with the same range as $R(e)$. To implement the ASAP for such responses, the spaces $E_u$, $E_Q$, and $E_R$ are henceforth considered to be Hilbert spaces and denoted as $H_u(\Omega_x)$, $H_Q(\Omega_x)$, and $H_R(\Omega_R)$, respectively. The elements of $H_u(\Omega_x)$ and $H_Q(\Omega_x)$ are, as before, vector-valued functions defined on the open set $\Omega_x \subset \mathbb{R}^{J_x}$, with smooth boundary $\partial\Omega_x$. The elements of $H_R(\Omega_R)$ are vector or scalar functions defined on the open set $\Omega_R \subset \mathbb{R}^m$, $1 \le m \le J_x$, with a smooth boundary denoted as $\partial\Omega_R$. Of course, if $J_x = 1$, then $\partial\Omega_x$ merely consists of two endpoints; similarly, if $m = 1$, then $\partial\Omega_R$ consists of two endpoints only. The inner products on $H_u(\Omega_x)$, $H_Q(\Omega_x)$, and $H_R(\Omega_R)$ are denoted by $\langle \cdot, \cdot \rangle_u$, $\langle \cdot, \cdot \rangle_Q$, and $\langle \cdot, \cdot \rangle_R$, respectively.

Furthermore, the ASAP also requires that $\delta R(e^0; h)$ be linear in $h$, which implies that $R(e)$ must satisfy a weak Lipschitz condition at $e^0$, and that

\[ R(e^0 + th_1 + th_2) - R(e^0 + th_1) - R(e^0 + th_2) + R(e^0) = o(t); \quad h_1, h_2 \in H_u \times H_\alpha; \; t \in F. \tag{6.50} \]

If $R(e)$ satisfies the two conditions above, then the response sensitivity $\delta R(e^0; h)$ is indeed linear in $h$, and can therefore be denoted as $DR(e^0; h)$.
Consequently, $R(e)$ admits a total G-derivative at $e^0 = (u^0, \alpha^0)$, such that the relationship

$$DR(e^0; h) = R'_u(e^0)\, h_u + R'_\alpha(e^0)\, h_\alpha \tag{6.51}$$

holds, where $R'_u(e^0)$ and $R'_\alpha(e^0)$ are the partial G-derivatives at $e^0$ of $R(e)$ with respect to $u$ and $\alpha$. Note also that $R'_u(e^0)$ is a linear operator, on $h_u$, from $H_u$ into $H_R$, i.e., $R'_u(e^0) \in \mathcal{L}(H_u(\Omega_x), H_R(\Omega_R))$. It is convenient to refer to the quantities $R'_u(e^0)\, h_u$ and $R'_\alpha(e^0)\, h_\alpha$ appearing in Eq. 6.51 as the "indirect-effect term" and the "direct-effect term," respectively. The direct-effect term can be evaluated efficiently at this stage. To proceed with the evaluation of the indirect-effect term, we consider that the orthonormal set $\{p_s\}_{s \in S}$, where $s$ runs through an index set $S$, is an orthonormal basis of $H_R(\Omega_R)$. Then, since $R'_u(e^0)\, h_u \in H_R(\Omega_R)$, it follows that $R'_u(e^0)\, h_u$ can be represented as the Fourier series

$$R'_u(e^0)\, h_u = \sum_{s \in S} \langle R'_u(e^0)\, h_u, p_s \rangle_R \, p_s. \tag{6.52}$$

In the above sum, at most a countable number of elements are different from zero, and the series extended over the nonzero elements converges unconditionally. The functionals $\langle R'_u(e^0)\, h_u, p_s \rangle_R$ are the Fourier coefficients of $R'_u(e^0)\, h_u$ with respect to the basis $\{p_s\}$. These functionals are linear in $h_u$, since $R(e)$ was required to satisfy the conditions stated in Eq. 6.50. Since $R'_u(e^0) \in \mathcal{L}(H_u(\Omega_x), H_R(\Omega_R))$, and since Hilbert spaces are self-dual, it follows that the following relationship holds:

$$\langle R'_u(e^0)\, h_u, p_s \rangle_R = \langle \Lambda(e^0)\, p_s, h_u \rangle_u, \quad s \in S. \tag{6.53}$$

In Eq. 6.53, the operator $\Lambda(e^0) \in \mathcal{L}(H_R(\Omega_R), H_u(\Omega_x))$ is the adjoint of $R'_u(e^0)$; recall that $\Lambda(e^0)$ is unique if $R'_u(e^0)$ is densely defined.
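The adjoint relationship in Eq. 6.53 can be illustrated in finite dimensions, where a linear operator is a matrix and its adjoint (for the ordinary dot product) is the transpose. The following Python sketch is only an illustration under that assumption; the matrix `A` (standing in for $R'_u(e^0)$), the vectors `h_u` and `p_s`, and the dimensions are hypothetical and do not come from the text.

```python
import random

def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    """Return the transpose of A, i.e. its adjoint for the dot product."""
    return [list(col) for col in zip(*A)]

def dot(x, y):
    """Ordinary Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

random.seed(0)
n_u, n_R = 4, 3  # hypothetical dimensions of H_u and H_R

# A stands in for R'_u(e0); h_u is a state variation; p_s is one basis vector of H_R.
A = [[random.uniform(-1.0, 1.0) for _ in range(n_u)] for _ in range(n_R)]
h_u = [random.uniform(-1.0, 1.0) for _ in range(n_u)]
p_s = [random.uniform(-1.0, 1.0) for _ in range(n_R)]

lhs = dot(matvec(A, h_u), p_s)             # <R'_u h_u, p_s>_R
rhs = dot(matvec(transpose(A), p_s), h_u)  # <Lambda p_s, h_u>_u with Lambda = A^T

print(lhs, rhs)
```

The point of the identity is that the unknown variation `h_u` appears only as the second argument of an inner product on the right side, which is what later allows it to be eliminated.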
To eliminate the unknown values of $h_u$ from the expression of each of the functionals $\langle h_u, \Lambda(e^0)\, p_s \rangle_u$, $s \in S$, the next step of the ASAP is to construct the operator $L^+(e^0)$, which is the operator formally adjoint to $N'_u(u^0, \alpha^0)$, by means of the relationship

$$\langle \psi_s, N'_u(u^0, \alpha^0)\, h_u \rangle_Q = \langle L^+(e^0)\, \psi_s, h_u \rangle_u + \{P(h_u; \psi_s)\}_{\partial\Omega_x}, \quad s \in S, \tag{6.54}$$

which holds for every vector $\psi_s \in H_Q$, $s \in S$. Recall that the operator $L^+(e^0)$ is defined as the $K_u \times K_u$ matrix

$$L^+(e^0) \equiv [L^+_{ji}(e^0)], \quad (i, j = 1, \ldots, K_u), \tag{6.55}$$

obtained by transposing the formal adjoints of the operators $[N'_u(u^0, \alpha^0)]_{ij}$, while $\{P(h_u; \psi_s)\}_{\partial\Omega_x}$ is the associated bilinear form evaluated on $\partial\Omega_x$. The domain of $L^+(e^0)$ is determined by selecting appropriate adjoint boundary conditions, represented here in operator form as

$$\{B^+(\psi_s; e^0) - A^+(\alpha^0)\}_{\partial\Omega_x} = 0, \quad s \in S. \tag{6.56}$$

The above boundary conditions for $L^+(e^0)$ are obtained by requiring that:
(a) They must be independent of $h_u$, $h_\alpha$, and G-derivatives with respect to $\alpha$.
(b) The substitution of Eqs. 6.49 and 6.56 into the expression of the so-called bilinear concomitant $\{P(h_u; \psi_s)\}_{\partial\Omega_x}$ must cause all terms containing unknown values of $h_u$ to vanish.

This selection of the boundary conditions for $L^+(e^0)$ reduces the bilinear concomitant to a quantity that contains boundary terms involving only known values of $h_\alpha$, $\psi_s$, and, possibly, $\alpha^0$; this quantity will be denoted by $\hat{P}(h_\alpha, \psi_s; \alpha^0)$. In general, $\hat{P}$ does not automatically vanish as a result of these manipulations, although it may do so in particular instances; in principle, $\hat{P}$ could be forced to vanish by considering extensions of $N'_\alpha(u^0, \alpha^0)$, in the operator sense, but this is seldom needed in practice. Introducing Eqs. 6.49 and 6.56 into 6.54 reduces the latter to

$$\langle L^+(e^0)\, \psi_s, h_u \rangle_u = \langle \psi_s, \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha \rangle_Q - \hat{P}(h_\alpha, \psi_s; \alpha^0), \quad s \in S. \tag{6.57}$$

The left side of Eq. 6.57 and the right side of Eq.
6.53 are now required to represent the same functional; this is accomplished by imposing the relation

$$L^+(e^0)\, \psi_s = \Lambda(e^0)\, p_s, \quad s \in S, \tag{6.58}$$

which holds uniquely in view of the Riesz representation theorem. This last step completes the construction of the desired adjoint system, which consists of Eq. 6.58 and the adjoint boundary conditions given in Eq. 6.56. Furthermore, Eqs. 6.52–6.58 can now be used to obtain the following expression for the sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$:

$$DR(e^0; h) = R'_\alpha(e^0)\, h_\alpha + \sum_{s \in S} \left[ \langle \psi_s, \delta Q(\alpha^0; h_\alpha) - N'_\alpha(u^0, \alpha^0)\, h_\alpha \rangle_Q - \hat{P}(h_\alpha, \psi_s; \alpha^0) \right] p_s. \tag{6.59}$$

As Eq. 6.59 indicates, the desired elimination of all unknown values of $h_u$ from the expression of the sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$ has thus been accomplished. Note that Eq. 6.59 includes the particular case of functional-type responses, in which case the respective summation would contain a single term ($s = 1$) only. To evaluate the sensitivity $DR(e^0; h)$ by means of Eq. 6.59, one needs to compute as many adjoint functions $\psi_s$ from Eqs. 6.58 and 6.56 as there are nonzero terms in the representation of $R'_u(e^0)\, h_u$ given in Eq. 6.52. Although the linear combination of basis elements $p_s$ given in Eq. 6.52 may, in principle, contain infinitely many terms, obviously only a finite number of the corresponding adjoint functions $\psi_s$ can be calculated in practice. Therefore, special attention is required to select the Hilbert space $H_R(\Omega_R)$, a basis $\{p_s\}_{s \in S}$ for this space, and a notion of convergence for the representation given in Eq. 6.52 to best suit the problem at hand. This selection is guided by the need to represent the indirect-effect term $R'_u(e^0)\, h_u$ as accurately as possible with the smallest number of basis elements; a related consideration is the viability of deriving bounds and/or asymptotic expressions for the remainder after truncating Eq. 6.52 to the first few terms.

6.4.3 The Adjoint Sensitivity Analysis Procedure (ASAP) for Responses Defined at Critical Points

Often, the response functional $R(e)$, where $e = (u, \alpha)$, is located at a critical point $y(\alpha)$ of a function $F(u, x, \alpha)$ which depends on the system's state vector and parameters. In such situations, the components $y_i(\alpha)$, $(i = 1, \ldots, M)$, of the critical point $y(\alpha)$ must be treated as responses in addition to $R(e)$. Such a response can be represented in the form

$$R(e) = \int F(u, x, \alpha) \prod_{i=1}^{M} \delta[x_i - y_i(\alpha)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx_1 \ldots dx_J. \tag{6.60}$$

The quantities appearing in the integrand of Eq. 6.60 are defined as follows:
(i) $F$ is the nonlinear function under consideration.
(ii) $\delta(x)$ is the customary "Dirac-delta" functional.
(iii) $\alpha \in \mathbb{R}^I$, i.e., the components $\alpha_i$, $(i = 1, \ldots, I)$, are restricted throughout this section to be real numbers.
(iv) $y(\alpha) = [y_1(\alpha), \ldots, y_M(\alpha)]$, $M \le J$, is a critical point of $F$.

If the G-differential of $F$ vanishes at $y(\alpha)$, then $y(\alpha)$ is a critical point defined implicitly as the solution of the system of equations

$$\{\partial F / \partial x_i\}_{y(\alpha)} = 0; \quad (i = 1, \ldots, J). \tag{6.61}$$

In this case, $y(\alpha)$ has $J$ components (i.e., $M = J$), and $\prod_{j=M+1}^{J} \delta(x_j - z_j) \equiv 1$ in the integrand of Eq. 6.60. Note that, in general, $y(\alpha)$ is a function of $\alpha$. Occasionally, it may happen that $\partial F / \partial x_j$ takes on nonzero constant values (i.e., values that do not depend on $x$) for some of the variables $x_j$, $(j = M+1, \ldots, J)$. Then, as a function of these variables $x_j$, $F$ attains its extreme values at the points $x_j = z_j$, $z_j \in \partial\Omega_x$. Evaluating $F$ at $z_j$, $(j = M+1, \ldots, J)$, yields a function $G$, which depends on the remaining phase-space variables $x_i$, $(i = 1, \ldots, M)$; $G$ may then have a critical point at $y(\alpha) = [y_1(\alpha), \ldots, y_M(\alpha)]$ defined implicitly as the solution of

$$\{\partial G / \partial x_i\}_{y(\alpha)} = 0, \quad (i = 1, \ldots, M). \tag{6.62}$$

With the above specifications, the definition of $R(e)$ given in Eq.
6.60 is sufficiently general to include treatment of extrema (local, relative, or absolute), saddle, and inflexion points of the function $F$ of interest. Note that, in practice, the base-case solution path, and therefore the specific nature and location of the critical point under consideration, are completely known prior to initiating the sensitivity studies.

As first shown by Cacuci [9, 10], the objective of sensitivity analysis for such systems is twofold, namely:
(a) To determine the G-differential $\delta R(e^0; h)$ of $R(e)$ at the "base-case point" $e^0 = (u^0, \alpha^0)$, which gives the sensitivity of $R(e)$ to changes $h = (h_u, h_\alpha)$ in the system's state functions and parameters.
(b) To determine the (column) vector $\delta y(\alpha^0; h_\alpha) = (\delta y_1, \ldots, \delta y_M)$ whose components $\delta y_m(\alpha^0; h_\alpha)$ are the G-differentials of $y_m(\alpha)$ at $\alpha^0$, for $(m = 1, \ldots, M)$. The vector $\delta y(\alpha^0; h_\alpha)$ gives the sensitivity of the critical point $y(\alpha)$ to changes $h_\alpha$.

Applying the definition of the G-differential to Eq. 6.60 shows that

$$\delta R(e^0; h) = \int dx\, [F'_u(e^0)\, h_u + F'_\alpha(e^0)\, h_\alpha] \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) - \sum_{m=1}^{M} \left( \frac{dy_m}{d\alpha} \right)_{\alpha^0} h_\alpha \int dx\, F\, \delta'(x_m - y_m) \prod_{i=1, i \ne m}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j). \tag{6.63}$$

The last term on the right side of Eq. 6.63 vanishes, since

$$\int F\, \delta'(x_m - y_m) \prod_{i=1, i \ne m}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = -\int (\partial F / \partial x_m) \prod_{i=1}^{M} \delta(x_i - y_i) \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = 0, \quad (m = 1, \ldots, M), \tag{6.64}$$

in view of the well-known definition of the $\delta'$ functional and in view of either Eq. 6.61 if $M = J$, or of Eq. 6.62 if $M < J$. Therefore, the expression of $\delta R(e^0; h)$ simplifies to

$$\delta R(e^0; h) = \int dx\, [F'_u(e^0)\, h_u + F'_\alpha(e^0)\, h_\alpha] \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j). \tag{6.65}$$

As already mentioned, the sensitivity of the location in phase space of the critical point is given by the G-differential $\delta y(\alpha^0; h_\alpha)$ of $y(\alpha)$ at $\alpha^0$. In view of either Eq.
6.61 or 6.62, each of the components $y_1(\alpha), \ldots, y_M(\alpha)$ of $y(\alpha)$ is a real-valued function of the real variables $(\alpha_1, \ldots, \alpha_I)$, and may be viewed as a functional defined on a subset of $\mathbb{R}^I$. Therefore, each G-differential $\delta y_m(\alpha^0; h_\alpha)$ of $y_m(\alpha)$ at $\alpha^0$ is given by

$$\delta y_m(\alpha^0; h_\alpha) = \left( \frac{dy_m}{d\alpha} \right)_{\alpha^0} h_\alpha = \sum_{i=1}^{I} \left( \frac{\partial y_m}{\partial \alpha_i} \right)_{\alpha^0} h_{\alpha_i}; \quad (m = 1, \ldots, M), \tag{6.66}$$

provided that $\partial y_m / \partial \alpha_i$, $(i = 1, \ldots, I)$, exist at $\alpha^0$ for all $(m = 1, \ldots, M)$. The explicit expression of $\delta y(\alpha^0; h_\alpha)$ is obtained as follows. First, it is observed that both Eqs. 6.61 and 6.62 can be represented as

$$\int (\partial F / \partial x_m) \prod_{i=1}^{M} \delta[x_i - y_i(\alpha)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = 0; \quad (m = 1, \ldots, M). \tag{6.67}$$

Taking the G-differential of Eq. 6.67 at $e^0$ yields the following system of equations involving the components $\delta y_m$:

$$\int dx\, \{\partial (F'_u h_u + F'_\alpha h_\alpha) / \partial x_m\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) - \sum_{s=1}^{M} \delta y_s(\alpha^0; h_\alpha) \int dx\, \{\partial F / \partial x_m\}_{e^0}\, \delta'[x_s - y_s(\alpha^0)] \prod_{i=1, i \ne s}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) = 0, \quad (m = 1, \ldots, M). \tag{6.68}$$

The above system is algebraic and linear in the components $\delta y_s(\alpha^0; h_\alpha)$; therefore, it can be represented in matrix form as

$$\Phi\, (\delta y) = \Gamma \tag{6.69}$$

by defining $\Phi \equiv [\phi_{ms}]$ to be the $M \times M$ matrix with elements

$$\phi_{ms} \equiv -\int dx\, \{\partial^2 F / \partial x_m \partial x_s\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \quad (m, s = 1, \ldots, M), \tag{6.70}$$

and by defining $\Gamma$ to be the $M$-component (column) vector

$$\Gamma \equiv (f_1 + g_1, \ldots, f_M + g_M)^T, \tag{6.71}$$

where

$$f_m \equiv \int dx\, \{\partial (F'_\alpha h_\alpha) / \partial x_m\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \quad (m = 1, \ldots, M), \tag{6.72}$$

and

$$g_m \equiv \int dx\, \{\partial (F'_u h_u) / \partial x_m\}_{e^0} \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j), \quad (m = 1, \ldots, M). \tag{6.73}$$

Notice that the definition of the $\delta'$ functional has been used to recast the second integral in Eq. 6.68 into the equivalent expression given in Eq. 6.70.
Equation 6.69 can be solved by employing methods of linear algebra to obtain

$$\delta y(\alpha^0; h_\alpha) = \Phi^{-1}\, \Gamma. \tag{6.74}$$

Both $\delta R(e^0; h)$ and $\delta y(\alpha^0; h_\alpha)$ can be evaluated after obtaining the solution $h_u$ of the "forward sensitivity equations" given in Eqs. 6.48 and 6.49. However, this formalism is ill-suited for sensitivity analysis of problems with large data bases (i.e., when $\alpha$ has many components). For such problems, the ASAP should be applied to circumvent the need for having to solve repeatedly the "forward sensitivity equations." The ASAP can be developed if and only if the following three conditions, labeled as C.1 through C.3, are satisfied:
(C.1) The partial G-derivatives at $e^0$ of $R(e)$ with respect to $u$ and $\alpha$ exist.
(C.2) The partial G-derivatives at $e^0$ of the operators $N$ and $B$ with respect to $u$ and $\alpha$ exist.
(C.3) The spaces $E_u$ and $E_Q$ are real Hilbert spaces, denoted by $H_u$ and $H_Q$, respectively. For $u_1, u_2 \in H_u$, the inner product in $H_u$ will be denoted by $[u_1, u_2]$, and is given by the integral $\int u_1 \cdot u_2\, dx$. The inner product in $H_Q$ will be denoted by $\langle \cdot, \cdot \rangle$.

An examination of Eq. 6.65 shows that $\delta R(e^0; h)$ is linear in $h$. Hence, the condition (C.1) mentioned above is satisfied, and the $h_u$-dependent component of $\delta R(e^0; h)$, i.e., the "indirect-effect term," can be written in inner product form as

$$\int F'_u(e^0)\, h_u \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx = [\nabla_u R(e^0), h_u], \tag{6.75}$$

where

$$\nabla_u R(e^0) \equiv \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j) \left( \frac{\partial F(e^0)}{\partial u_1}, \ldots, \frac{\partial F(e^0)}{\partial u_K} \right)^T. \tag{6.76}$$

Following the procedure set forth in Section 6.4.2, the sensitivity $\delta R(e^0; h)$ is obtained as

$$\delta R(e^0; h) = \int F'_\alpha(e^0)\, h_\alpha \prod_{i=1}^{M} \delta[x_i - y_i(\alpha^0)] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx + \langle \delta Q(\alpha^0; h_\alpha) - N'_\alpha(e^0)\, h_\alpha, v \rangle - \hat{P}(h_\alpha, v; e^0), \tag{6.77}$$

where the adjoint function $v$ is the solution of the adjoint sensitivity system

$$L^+(e^0)\, v = \nabla_u R(e^0), \tag{6.78}$$

$$\{B^+(v; e^0) - A^+(\alpha^0)\}_{\partial\Omega_x} = 0. \tag{6.79}$$

Unknown values of $h_u$ can be eliminated from the expression of $\delta y(\alpha^0; h_\alpha)$ given in Eq.
6.74 only if they can be eliminated from Eq. 6.73. Examination of Eq. 6.73 reveals that each quantity $g_m$ is a functional that can be expressed in the equivalent form

$$g_m = -\int F'_u(e^0)\, h_u\, \delta'(x_m - y_m) \prod_{i=1, i \ne m}^{M} \delta[x_i - y_i] \prod_{j=M+1}^{J} \delta(x_j - z_j)\, dx \tag{6.80}$$

by employing the definition of the $\delta'$ functional. In turn, the above expression can be written as the inner product

$$g_m = [\gamma_m(e^0), h_u], \tag{6.81}$$

where

$$\gamma_m(e^0) \equiv -\delta'[x_m - y_m(\alpha^0)] \prod_{i=1, i \ne m}^{M} \delta[x_i - y_i] \prod_{j=M+1}^{J} \delta(x_j - z_j) \left( \frac{\partial F(e^0)}{\partial u_1}, \ldots, \frac{\partial F(e^0)}{\partial u_K} \right)^T. \tag{6.82}$$

The desired elimination of the unknown values of $h_u$ from Eq. 6.73 can now be accomplished by letting each of the functions $\gamma_m(e^0)$ play, in turn, the role previously played by $\nabla_u R(e^0)$, and by following the same procedure as that leading to Eq. 6.78. The end result is

$$g_m = \langle \delta Q(\alpha^0; h_\alpha) - N'_\alpha(e^0)\, h_\alpha, w_m \rangle - \hat{P}(h_\alpha, w_m; e^0), \tag{6.83}$$

where each function $w_m$ is the solution of the adjoint system

$$L^+(e^0)\, w_m = \gamma_m(e^0), \quad \{B^+(w_m; e^0) - A^+(e^0)\}_{\partial\Omega_x} = 0, \quad (m = 1, \ldots, M). \tag{6.84}$$

It is important to note that $L^+(e^0)$, $B^+(w_m; e^0)$, and $A^+(e^0)$ appearing in Eq. 6.84 are the same operators as those appearing in Eqs. 6.78 and 6.79. Only the source term $\gamma_m(e^0)$ in Eq. 6.84 differs from the corresponding source term $\nabla_u R(e^0)$ in Eq. 6.78. Therefore, the computer code employed to solve the adjoint system given in Eqs. 6.78 and 6.79 can be used, with relatively trivial modifications, to compute the functions $w_m$ from Eq. 6.84. Comparing the right sides of Eqs. 6.77 and 6.83 reveals that the quantity $\hat{P}(h_\alpha, v; e^0)$ is formally identical to the quantity $\hat{P}(h_\alpha, w_m; e^0)$ and that the vector $\delta Q(\alpha^0; h_\alpha) - N'_\alpha(e^0)\, h_\alpha$ appears in both of the inner products denoted by $\langle \cdot, \cdot \rangle$. This indicates that the computer program employed to evaluate the second and third terms on the right side of Eq. 6.77 can also be used to evaluate the functionals $g_m$, $(m = 1, \ldots, M)$, given by Eq. 6.83.
Of course, the values of $v$ required to compute $\delta R(e^0; h)$ are to be replaced by the respective values of $w_m$ when computing the $g_m$'s. In most practical problems, the total number of parameters $I$ greatly exceeds the number of phase-space variables $J$, and hence $M$, since $M \le J$. Therefore, if the ASAP can be developed as described in this section, then a large amount of computing costs can be saved by employing this formalism rather than the FSAP. In this case, only $M + 2$ "large" computations (one for the "base-case computation," one for computing the adjoint function $v$, and $M$ computations for obtaining the adjoint functions $w_1, \ldots, w_M$) are needed to obtain the sensitivities $\delta R(e^0; h)$ and $\delta y(\alpha^0; h_\alpha)$ to changes in all of the parameters. By contrast, at least $(I + 1)$ computations (one for the "base-case," and $I$ to obtain the vector $h_u$) would be required if the "forward sensitivity formalism" were employed.

6.4.4 ASAP for Linear Systems

For a physical system that is linear in the system's vector of dependent (state) variables, $u(x)$, the operators $N(u, \alpha)$ and $B(u, \alpha)$ in Eq. 6.41 and, respectively, Eq. 6.42 take on the forms $N(u, \alpha) \to L(\alpha)\, u$ and $B(u, \alpha) \to B(\alpha)\, u$, where $L(\alpha)$ and $B(\alpha)$ act linearly on $u(x)$. Hence, Eqs. 6.41 and 6.42 take on the forms

$$L(\alpha)\, u = Q[\alpha(x)], \quad x \in \Omega_x, \tag{6.85}$$

$$[B(\alpha)\, u - A(\alpha)]_{\partial\Omega_x} = 0, \quad x \in \partial\Omega_x. \tag{6.86}$$

Note that although the components of $L = [L_1(\alpha), \ldots, L_K(\alpha)]$ act linearly on the state vector $u(x)$, they depend nonlinearly (in general) on the vector of system parameters $\alpha(x)$. The G-differentiated forms of Eqs. 6.85 and 6.86, i.e., the forward sensitivity system (or the "tangent linear model"), take on the forms

$$L(\alpha^0)\, h_u + [L'_\alpha(\alpha^0)\, u^0]\, h_\alpha - \delta Q(\alpha^0; h_\alpha) = 0, \quad x \in \Omega_x, \tag{6.87}$$

$$\{B(\alpha^0)\, h_u + [B'_\alpha(\alpha^0)\, u^0]\, h_\alpha - \delta A(\alpha^0; h_\alpha)\}_{\partial\Omega_x} = 0, \quad x \in \partial\Omega_x, \tag{6.88}$$

respectively. Furthermore, Eqs.
6.58 and 6.56 take on the forms

$$L^+(\alpha^0)\, \psi_s = \Lambda(e^0)\, p_s, \quad s \in S, \tag{6.89}$$

$$\{B^+(\psi_s; \alpha^0) - A^+(\alpha^0)\}_{\partial\Omega_x} = 0, \quad s \in S, \tag{6.90}$$

while the expression for the sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$, cf. Eq. 6.59, becomes:

$$DR(e^0; h) = R'_\alpha(e^0)\, h_\alpha + \sum_{s \in S} \left[ \langle \psi_s, \delta Q(\alpha^0; h_\alpha) - [L'_\alpha(\alpha^0)\, u^0]\, h_\alpha \rangle_Q - \hat{P}(h_\alpha, \psi_s; \alpha^0) \right] p_s. \tag{6.91}$$

The following important characteristics must be noted when dealing with linear problems: (i) the adjoint operators $L^+(\alpha^0)$ and $B^+(\psi_s; \alpha^0)$ are independent of the nominal solution $u^0$ of the original system; (ii) if the response $R(e)$ is also linear in $u(x)$, then the operator $\Lambda(e^0) \to \Lambda(\alpha^0)$ (i.e., $\Lambda$ also becomes independent of $u^0$) in Eq. 6.89, which leads to the very important consequence that the adjoint sensitivity system for linear problems can be solved independently of solving the original (linear) system for $u^0$. This important property of linear systems will be highlighted in several illustrative paradigm examples presented in the next section.

6.5 Paradigm Applications of the ASAP

6.5.1 Application of the ASAP to Compute the Variance of the Maximum Flux of Particles in a Particle Diffusion Problem

The application of the ASAP for responses defined at critical points, presented in the foregoing section, will now be illustrated by considering a simple diffusion problem of neutral particles (e.g., neutrons) within a slab of material of extrapolated thickness $a$ [cm], placed in vacuum, which contains distributed particle sources of strength $Q$ [particles cm$^{-3}$ s$^{-1}$], and is also driven externally by a flux of particles of strength $\varphi_{in}$ [particles cm$^{-2}$ s$^{-1}$], which impinges on one side (e.g., on the left side) of the slab. Consider, also for simplicity, that the material within the slab only scatters the particles, but does not absorb them.
The linear particle diffusion equation that mathematically simulates this problem is

$$L(\alpha)\, \varphi \equiv D \frac{d^2 \varphi}{dx^2} = -Q, \quad x \in (0, a), \tag{6.92}$$

where $\varphi(x)$ denotes the (everywhere positive) particle flux, $D$ is the diffusion coefficient, and $Q$ is the corresponding distributed source term within the slab. The boundary conditions considered for $\varphi(x)$, namely

$$\varphi(0) = \varphi_{in}, \quad \varphi(a) = 0, \tag{6.93}$$

simulate an incoming flux $\varphi_{in}$ at the boundary $x = 0$, and a vanishing flux at the extrapolated distance $x = a$. Thus, the vacuum at the right side of the slab plays the role of a perfect absorber, in that particles that have left the slab can never return. The response $R$ considered for the diffusion problem modeled by Eqs. 6.92 and 6.93 is the reading of a particle detector placed at the position $y$ within the slab where the flux attains its maximum value, $\varphi_{max}$. Such a response (i.e., the reading of the particle reaction rate at $y$) would be simulated mathematically by the following particular form of Eq. 6.60:

$$R(e) \equiv \int_0^a \Sigma_d\, \varphi(x)\, \delta[x - y(\alpha)]\, dx, \tag{6.94}$$

where $\Sigma_d$ represents the detector's equivalent reaction cross section, in units of [cm$^{-1}$]. The location $y(\alpha)$ where the flux attains its maximum value is defined implicitly as the solution of the equation

$$\int_0^a \frac{d\varphi(x)}{dx}\, \delta[x - y(\alpha)]\, dx = 0. \tag{6.95}$$

The parameters for this problem are the positive constants $\Sigma_d$, $D$, $Q$, and $\varphi_{in}$, which will be considered to be the components of the vector $\alpha$ of system parameters, defined as

$$\alpha \equiv (\Sigma_d, D, Q, \varphi_{in}). \tag{6.96}$$

The vector $e(x)$ appearing in the functional dependence of $R$ in Eq. 6.94 denotes the concatenation of $\varphi(x)$ with $\alpha$, i.e., $e \equiv (\varphi, \alpha)$. The parameters $\alpha$ are considered to be experimentally determined quantities, with known nominal values $\alpha^0 \equiv (\Sigma_d^0, D^0, Q^0, \varphi_{in}^0)$ and known variances $[\mathrm{var}(\Sigma_d), \mathrm{var}(D), \mathrm{var}(Q), \mathrm{var}(\varphi_{in})]$, but being otherwise uncorrelated to one another.
These parameter variances will give rise to variances in the value of the response $R$ (i.e., variance in the maximum flux measured by the detector) and the location $y(\alpha)$ of $R$ within the slab. The goal of this illustrative example is to determine the variances of $R$ and $y(\alpha)$ by using the "Sandwich Rule" of the "Propagation of Moments" method presented in Section 6.2. The nominal value $\varphi^0(x)$ of the flux is determined by solving Eqs. 6.92 and 6.93 for the nominal parameter values $\alpha^0 \equiv (\Sigma_d^0, D^0, Q^0, \varphi_{in}^0)$. For this simple example, the expression of $\varphi^0(x)$ can be readily obtained in closed form as

$$\varphi^0(x) = \frac{Q^0}{2D^0}(ax - x^2) + \left( 1 - \frac{x}{a} \right) \varphi_{in}^0. \tag{6.97}$$

Note that even though Eq. 6.92 is linear in $\varphi$, the solution $\varphi(x)$ depends nonlinearly on $\alpha$, as evidenced by Eq. 6.97. Using Eq. 6.97, the nominal value, $\varphi_{max}^0$, of the maximum of $\varphi^0(x)$, and $y(\alpha^0)$, respectively, are readily obtained as

$$\varphi_{max}^0 = \frac{Q^0 a^2}{8D^0} + \frac{D^0 (\varphi_{in}^0)^2}{2a^2 Q^0} + \frac{\varphi_{in}^0}{2} \tag{6.98}$$

and

$$y(\alpha^0) = \frac{a}{2} - \frac{\varphi_{in}^0 D^0}{a Q^0}. \tag{6.99}$$

From Eqs. 6.94 and 6.98, it follows that the nominal value, $R(e^0)$, of the response $R(e)$ is obtained as

$$R(e^0) = \Sigma_d^0\, \varphi_{max}^0. \tag{6.100}$$

Of course, for the complex large-scale problems analyzed in practice, it is not possible to obtain exact, closed-form expressions for $\varphi^0(x)$, $\varphi_{max}^0$, $R(e^0)$, and $y(\alpha^0)$. For uncorrelated parameters, the "sandwich rule" formula presented in Section 6.2, cf. Eq. 6.11, can be readily applied to this illustrative problem, to obtain the following expression for the variance, $\mathrm{var}(R)$, of $R$:

$$\mathrm{var}(R) = \left( \frac{\delta R}{\delta \Sigma_d} \right)^2 \mathrm{var}(\Sigma_d) + \left( \frac{\delta R}{\delta D} \right)^2 \mathrm{var}(D) + \left( \frac{\delta R}{\delta Q} \right)^2 \mathrm{var}(Q) + \left( \frac{\delta R}{\delta \varphi_{in}} \right)^2 \mathrm{var}(\varphi_{in}), \tag{6.101}$$

where the symbol $(\delta R / \delta \alpha_i)$ denotes the partial sensitivity (i.e., the partial G-derivative) of $R(e)$ to a generic parameter $\alpha_i$.
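The closed-form results in Eqs. 6.97–6.100 and the variance formula in Eq. 6.101 can be checked numerically. The Python sketch below is only an illustration: the numerical values of $a$, $\Sigma_d$, $D$, $Q$, and $\varphi_{in}$, as well as the 5% relative standard deviations, are hypothetical and are not taken from the text, and the sensitivities in Eq. 6.101 are approximated here by central finite differences of Eq. 6.100 rather than by the adjoint procedure.

```python
# Hypothetical nominal values (not from the text), chosen so that 0 < y < a.
A_SLAB = 10.0                    # extrapolated thickness a [cm]
NOMINAL = (0.1, 1.0, 2.0, 5.0)   # (Sigma_d, D, Q, phi_in)

def flux(x, a, D, Q, phi_in):
    """Closed-form flux, Eq. 6.97."""
    return Q / (2.0 * D) * (a * x - x * x) + (1.0 - x / a) * phi_in

def y_max(a, D, Q, phi_in):
    """Location of the flux maximum, Eq. 6.99."""
    return a / 2.0 - phi_in * D / (a * Q)

def phi_max(a, D, Q, phi_in):
    """Maximum flux, Eq. 6.98."""
    return Q * a**2 / (8.0 * D) + D * phi_in**2 / (2.0 * a**2 * Q) + phi_in / 2.0

def response(a, sigma_d, D, Q, phi_in):
    """Detector response at the flux maximum, Eq. 6.100."""
    return sigma_d * phi_max(a, D, Q, phi_in)

def variance_R(a, nominal, variances, rel_step=1e-6):
    """Eq. 6.101 for uncorrelated parameters, with finite-difference sensitivities."""
    total = 0.0
    for i, (p, var) in enumerate(zip(nominal, variances)):
        step = rel_step * abs(p)
        up = list(nominal)
        dn = list(nominal)
        up[i] = p + step
        dn[i] = p - step
        sens = (response(a, *up) - response(a, *dn)) / (2.0 * step)
        total += sens**2 * var
    return total

sigma_d, D, Q, phi_in = NOMINAL
y0 = y_max(A_SLAB, D, Q, phi_in)
# Assumed 5 % relative standard deviation on every parameter.
variances = tuple((0.05 * p) ** 2 for p in NOMINAL)
var_R = variance_R(A_SLAB, NOMINAL, variances)

print("y =", y0, " phi_max =", phi_max(A_SLAB, D, Q, phi_in), " var(R) =", var_R)
```

Evaluating `flux` at `y0` reproduces `phi_max`, which confirms that Eqs. 6.98 and 6.99 are consistent with Eq. 6.97 for these values.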
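In turn, the sensitivities of the response $R(e)$ are given by the G-differential $\delta R(e^0; h)$ of $R(e)$ at $e^0$, for variations

$$h \equiv (h_\varphi; h_\alpha), \quad h_\alpha \equiv (\delta\Sigma_d, \delta D, \delta Q, \delta\varphi_{in}). \tag{6.102}$$

The G-differential $\delta R(e^0; h)$ is readily obtained from Eq. 6.94 as

$$\delta R(e^0; h) = \int_0^a \delta\Sigma_d\, \varphi^0(x)\, \delta[x - y(\alpha^0)]\, dx + \int_0^a \Sigma_d^0\, h_\varphi(x)\, \delta[x - y(\alpha^0)]\, dx - \left( \frac{dy}{d\alpha} \right)_{\alpha^0} h_\alpha \int_0^a \Sigma_d^0\, \varphi^0(x)\, \delta'[x - y(\alpha^0)]\, dx. \tag{6.103}$$

The last term on the right side of Eq. 6.103 vanishes in view of Eq. 6.95, namely

$$\int_0^a \varphi\, \delta'(x - y)\, dx = -\int_0^a (\partial \varphi / \partial x)\, \delta(x - y)\, dx = 0.$$

Therefore, Eq. 6.103 reduces to

$$\delta R(e^0; h) = R'_\alpha(e^0)\, h_\alpha + R'_\varphi(e^0)\, h_\varphi, \tag{6.104}$$

where the "direct-effect" term $R'_\alpha h_\alpha$ is defined as

$$R'_\alpha(e^0)\, h_\alpha \equiv \int_0^a \delta\Sigma_d\, \varphi^0(x)\, \delta[x - y(\alpha^0)]\, dx = \delta\Sigma_d\, \varphi_{max}^0, \tag{6.105}$$

while the "indirect-effect" term $R'_\varphi h_\varphi$ is defined as

$$R'_\varphi(e^0)\, h_\varphi \equiv \int_0^a \Sigma_d^0\, h_\varphi(x)\, \delta[x - y(\alpha^0)]\, dx. \tag{6.106}$$

The "direct-effect" term $R'_\alpha h_\alpha$ can be evaluated at this stage by substituting Eq. 6.98 into 6.105, to obtain

$$R'_\alpha(e^0)\, h_\alpha = \delta\Sigma_d \left[ \frac{Q^0 a^2}{8D^0} + \frac{D^0 (\varphi_{in}^0)^2}{2a^2 Q^0} + \frac{\varphi_{in}^0}{2} \right]. \tag{6.107}$$

The "indirect-effect" term $R'_\varphi h_\varphi$, though, cannot be evaluated at this stage, since $h_\varphi(x)$ is not yet available. The first-order (in $\|h_\alpha\|$) approximation to the exact value of $h_\varphi(x)$ is obtained by calculating the G-differentials of Eqs. 6.92 and 6.93 and solving the resulting "forward sensitivity equations" (FSE)

$$L(\alpha^0)\, h_\varphi + [L'_\alpha(\alpha^0)\, \varphi^0]\, h_\alpha = O(\|h_\alpha\|^2), \tag{6.108}$$

together with the boundary conditions

$$h_\varphi(0) = \delta\varphi_{in}, \quad h_\varphi(a) = 0. \tag{6.109}$$

In Eq. 6.108, the operator $L(\alpha^0)$ is defined as

$$L(\alpha^0) \equiv D^0 \frac{d^2}{dx^2}, \tag{6.110}$$

while the quantity

$$[L'_\alpha(\alpha^0)\, \varphi^0]\, h_\alpha \equiv \delta D\, \frac{d^2 \varphi^0}{dx^2} + \delta Q \tag{6.111}$$

is the partial G-differential of $L\varphi$ at $\alpha^0$ with respect to $\alpha$, and contains all of the first-order parameter variations $h_\alpha$. The Adjoint Sensitivity Analysis Procedure (ASAP) can be readily applied since, as indicated by Eq. 6.106, the "indirect-effect" term $R'_\varphi(e^0)\, h_\varphi$ is expressible as a linear functional of $h_\varphi$.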
Therefore, $R'_\varphi(e^0)\, h_\varphi$ can be represented as an inner product in an appropriately defined Hilbert space $H_u$; for our illustrative example, $H_u$ is chosen to be the real Hilbert space $H_u \equiv L^2(\Omega)$, with $\Omega \equiv (0, a)$, equipped with the inner product

$$\langle f(x), g(x) \rangle \equiv \int_0^a f(x)\, g(x)\, dx, \quad \text{for } f, g \in H_u \equiv L^2(\Omega), \; \Omega \equiv (0, a). \tag{6.112}$$

In $H_u \equiv L^2(\Omega)$, the linear functional $R'_\varphi(e^0)\, h_\varphi$ defined in Eq. 6.106 can be represented as the inner product

$$R'_\varphi(e^0)\, h_\varphi = \langle \Sigma_d^0\, \delta[x - y(\alpha^0)], h_\varphi \rangle. \tag{6.113}$$

The operator $L^+(\alpha^0)$, which is formally adjoint to $L(\alpha^0)$, is readily obtained as

$$L^+(\alpha^0) \equiv D^0 \frac{d^2}{dx^2}. \tag{6.114}$$

Note that $L^+(\alpha^0)$ and $L(\alpha^0)$ are formally self-adjoint. The qualifier "formal" must still be kept at this stage, since the boundary conditions for $L^+(\alpha^0)$ have not been determined yet. These boundary conditions are derived by applying Eq. 6.54 to the operators $L(\alpha^0)\, h_\varphi$ and $L^+(\alpha^0)\, \psi$, to obtain

$$\int_0^a \psi(x)\, D^0 \frac{d^2 h_\varphi(x)}{dx^2}\, dx = \int_0^a D^0 \frac{d^2 \psi(x)}{dx^2}\, h_\varphi(x)\, dx + \{P(h_\varphi, \psi)\}_{x=0}^{x=a}. \tag{6.115}$$

Note that the function $\psi(x)$ is still arbitrary at this stage, except for the requirement that $\psi \in H_Q = L^2(\Omega)$; note also that the Hilbert spaces $H_u$ and $H_Q$ have now both become the same space, i.e., $H_u = H_Q = L^2(\Omega)$. Integrating the left side of Eq. 6.115 by parts twice and canceling terms yields the following expression for the bilinear boundary form:

$$\{P(h_\varphi, \psi)\}_{x=0}^{x=a} = D^0 \left[ \psi \frac{dh_\varphi}{dx} - h_\varphi \frac{d\psi}{dx} \right]_{x=0}^{x=a}. \tag{6.116}$$

The boundary conditions for $L^+(\alpha^0)$ can now be selected by applying to Eq. 6.116 the general principles expressed in items (a) and (b) immediately following Eq. 6.56. From Eq. 6.109, $h_\varphi$ is known at $x = 0$ and $x = a$; however, the quantities $\{dh_\varphi / dx\}_{x=0}^{x=a}$ are not known. These unknown quantities can be eliminated from Eq. 6.116 by choosing

$$\psi(0) = 0, \quad \psi(a) = 0, \tag{6.117}$$

as boundary conditions for the adjoint function $\psi(x)$. Implementing Eqs. 6.117 and 6.109 into 6.116 yields

$$\{P(h_\varphi, \psi)\}_{x=0}^{x=a} = D^0\, \delta\varphi_{in} \left( \frac{d\psi}{dx} \right)_{x=0} \equiv \hat{P}(h_\alpha, \psi; \alpha^0). \tag{6.118}$$

Note that the quantity $\hat{P}(h_\alpha, \psi; \alpha^0)$ does not vanish; furthermore, the boundary conditions in Eq. 6.117 for the adjoint operator $L^+(\alpha^0)$ differ from the boundary conditions in Eq. 6.109 for $L(\alpha^0)$. Hence, even though the operators $L^+(\alpha^0)$ and $L(\alpha^0)$ are formally self-adjoint, they are not self-adjoint for this illustrative example.

The last step in the construction of the adjoint system is the identification of the source term, which is done by noting from Eq. 6.113 that

$$\nabla_\varphi R(e^0) = \Sigma_d^0\, \delta[x - y(\alpha^0)]. \tag{6.119}$$

Thus, the complete adjoint system becomes

$$L^+(\alpha^0)\, \psi \equiv D^0 \frac{d^2 \psi}{dx^2} = \Sigma_d^0\, \delta[x - y(\alpha^0)], \tag{6.120}$$

where the adjoint function $\psi(x)$ is subject to the boundary conditions given in Eq. 6.117. Using Eqs. 6.118–6.120 and 6.114 in 6.113 gives the following expression for the "indirect-effect" term $R'_\varphi(e^0)\, h_\varphi$:

$$R'_\varphi(e^0)\, h_\varphi = -\int_0^a \psi(x) \left[ \delta D\, \frac{d^2 \varphi^0(x)}{dx^2} + \delta Q \right] dx - D^0 (\delta\varphi_{in}) \left( \frac{d\psi(x)}{dx} \right)_{x=0}, \tag{6.121}$$

where $\psi(x)$ is the solution of the adjoint sensitivity system defined by Eqs. 6.120 and 6.117. Note that the adjoint sensitivity system is, as expected, independent of parameter variations $h_\alpha$; thus, the adjoint sensitivity system needs to be solved only once to obtain the adjoint function $\psi(x)$. Very important for our illustrative example is also the fact (characteristic of linear systems) that the adjoint system is independent of the original solution $\varphi^0(x)$, and can therefore be solved directly, without any knowledge of the (original) flux $\varphi(x)$. Of course, the adjoint sensitivity system depends on the response, which provides the source term as shown in Eqs. 6.119 and 6.120. Solving the adjoint sensitivity system, namely Eqs.
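6.120 and 6.117, yields the following expression for the adjoint function $\psi(x)$:

$$\psi(x) = \frac{\Sigma_d^0}{D^0} \left\{ [x - y(\alpha^0)]\, H[x - y(\alpha^0)] - [a - y(\alpha^0)]\, \frac{x}{a} \right\}, \tag{6.122}$$

where $H[x - y(\alpha^0)]$ is the Heaviside step function defined as

$$H(x) = \begin{cases} 0, & \text{for } x < 0, \\ 1, & \text{for } x \ge 0. \end{cases}$$

Noting from Eq. 6.92 that

$$\frac{d^2 \varphi^0(x)}{dx^2} = -\frac{Q^0}{D^0},$$

using the above result together with Eq. 6.122 in 6.121, and carrying out the respective integrations over $x$ yields the following expression for the "indirect-effect" term $R'_\varphi(e^0)\, h_\varphi$:

$$R'_\varphi(e^0)\, h_\varphi = \Sigma_d^0 \left[ \frac{1}{2D^0} \left( \delta Q - \frac{Q^0}{D^0}\, \delta D \right) \left( \frac{a^2}{4} - \left( \frac{D^0 \varphi_{in}^0}{a Q^0} \right)^2 \right) + \delta\varphi_{in} \left( \frac{1}{2} + \frac{D^0 \varphi_{in}^0}{a^2 Q^0} \right) \right]. \tag{6.123}$$

From Eq. 6.105, it follows that the partial sensitivity (i.e., the partial G-derivative) of $R$ with respect to $\Sigma_d$ has the expression

$$\frac{\delta R}{\delta \Sigma_d} = \varphi_{max}^0 = \frac{Q^0 a^2}{8D^0} + \frac{D^0 (\varphi_{in}^0)^2}{2a^2 Q^0} + \frac{\varphi_{in}^0}{2}. \tag{6.124}$$

Similarly, from Eq. 6.123, it follows that the partial sensitivities (i.e., the partial G-derivatives) of $R$ with respect to $D$, $Q$, and $\varphi_{in}$ are:

$$\frac{\delta R}{\delta D} = -\frac{\Sigma_d^0 Q^0}{2(D^0)^2} \left[ \frac{a^2}{4} - \left( \frac{D^0 \varphi_{in}^0}{a Q^0} \right)^2 \right], \tag{6.125}$$

$$\frac{\delta R}{\delta Q} = \frac{\Sigma_d^0}{2D^0} \left[ \frac{a^2}{4} - \left( \frac{D^0 \varphi_{in}^0}{a Q^0} \right)^2 \right], \tag{6.126}$$

$$\frac{\delta R}{\delta \varphi_{in}} = \Sigma_d^0 \left( \frac{1}{2} + \frac{D^0 \varphi_{in}^0}{a^2 Q^0} \right). \tag{6.127}$$

Since the sensitivities of the response $R(e)$ have thus been determined, Eqs. 6.124–6.127 can be replaced in Eq. 6.101 to compute the variance $\mathrm{var}(R)$. The "sandwich rule," i.e., Eq. 6.11, can also be applied to obtain the variance, $\mathrm{var}(y)$, of the critical point $y(\alpha)$, where the flux attains its maximum (and where the particle detector is located in this illustrative problem). The corresponding expression for $\mathrm{var}(y)$ is:

$$\mathrm{var}(y) = \left( \frac{\delta y}{\delta D} \right)^2 \mathrm{var}(D) + \left( \frac{\delta y}{\delta Q} \right)^2 \mathrm{var}(Q) + \left( \frac{\delta y}{\delta \varphi_{in}} \right)^2 \mathrm{var}(\varphi_{in}), \tag{6.128}$$

where the symbol $(\delta y / \delta \alpha_i)$ denotes the partial sensitivity (i.e., the partial G-derivative) of $y(\alpha)$ to a generic parameter $\alpha_i$. Using the ASAP, the sensitivities of the critical point $y(\alpha)$ can be computed by specializing Eqs. 6.82–6.84 to our illustrative example. Thus, applying Eq. 6.82 to our example shows that

$$\gamma_1(e^0) \equiv -\delta'[x - y(\alpha^0)]. \tag{6.129}$$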
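As a cross-check on the response sensitivities in Eqs. 6.125–6.127, the adjoint-based expression in Eq. 6.121 can be evaluated numerically with the adjoint function of Eq. 6.122. The Python sketch below is only an illustration: the nominal values are hypothetical (not from the text), the integral is approximated with a simple midpoint rule, and the slope of the adjoint function at $x = 0$ is taken from Eq. 6.122 under the assumption $0 < y(\alpha^0) < a$.

```python
# Hypothetical nominal values (not from the text), chosen so that 0 < y < a.
a, sigma_d, D, Q, phi_in = 10.0, 0.1, 1.0, 2.0, 5.0

def y_max(a, D, Q, phi_in):
    """Location of the flux maximum, Eq. 6.99."""
    return a / 2.0 - phi_in * D / (a * Q)

def psi(x, a, sigma_d, D, y):
    """Adjoint function, Eq. 6.122 (Heaviside step equal to 1 for x >= y)."""
    step = 1.0 if x >= y else 0.0
    return sigma_d / D * ((x - y) * step - (a - y) * x / a)

def indirect_effect(dD, dQ, dphi_in, n=20000):
    """Evaluate Eq. 6.121 for the parameter variations (dD, dQ, dphi_in)."""
    y = y_max(a, D, Q, phi_in)
    # From Eq. 6.92, d2(phi0)/dx2 = -Q/D, so the bracket in Eq. 6.121 is constant in x.
    bracket = dD * (-Q / D) + dQ
    h = a / n
    integral = h * sum(psi((k + 0.5) * h, a, sigma_d, D, y) for k in range(n))
    dpsi_dx_0 = -sigma_d / D * (a - y) / a  # slope of Eq. 6.122 at x = 0, valid for y > 0
    return -bracket * integral - D * dphi_in * dpsi_dx_0

# Unit variations of one parameter at a time give the partial sensitivities.
adj_dD = indirect_effect(1.0, 0.0, 0.0)
adj_dQ = indirect_effect(0.0, 1.0, 0.0)
adj_dphi = indirect_effect(0.0, 0.0, 1.0)

# Closed-form sensitivities, Eqs. 6.125-6.127.
common = a**2 / 4.0 - (D * phi_in / (a * Q)) ** 2
ref_dD = -sigma_d * Q / (2.0 * D**2) * common
ref_dQ = sigma_d / (2.0 * D) * common
ref_dphi = sigma_d * (0.5 + D * phi_in / (a**2 * Q))

print(adj_dD, ref_dD)
print(adj_dQ, ref_dQ)
print(adj_dphi, ref_dphi)
```

Note that the adjoint function is evaluated once and then reused for all three parameter variations, which mirrors the cost argument made for the ASAP in the text.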
Furthermore, Eq. 6.83 reduces to

$$g_1 = -\int_0^a w_1(x) \left[ \delta D\, \frac{d^2 \varphi^0(x)}{dx^2} + \delta Q \right] dx - D^0 (\delta\varphi_{in}) \left( \frac{dw_1(x)}{dx} \right)_{x=0}, \tag{6.130}$$

where the function $w_1$ is the solution of the following particular form taken on by the adjoint system shown in Eq. 6.84 for our illustrative example:

$$L^+(\alpha^0)\, w_1(x) \equiv D^0 \frac{d^2 w_1(x)}{dx^2} = -\delta'[x - y(\alpha^0)], \quad w_1(0) = 0, \quad w_1(a) = 0. \tag{6.131}$$

It is important to note that the same operator, namely $L^+(\alpha^0)$, appears in both Eqs. 6.120 and 6.131. Furthermore, the functions $\psi(x)$ and $w_1$ satisfy formally identical boundary conditions, as can be noted by comparing Eq. 6.131 to 6.117, respectively. Only the source term, $\gamma_1(e^0)$, in Eq. 6.131 differs from the corresponding source term $\nabla_u R(e^0)$ in Eq. 6.120. Therefore, the computer code employed to solve the adjoint system given in Eq. 6.120 to compute the adjoint function $\psi(x)$ can be used, with relatively trivial modifications, to compute the function $w_1$ by solving Eq. 6.131. Comparing the right sides of Eqs. 6.130 and 6.121 reveals that they are formally identical, except that the function $\psi(x)$, which appears in Eq. 6.121, is formally replaced by the function $w_1$ in Eq. 6.130. This indicates that the computer program employed to evaluate the "indirect-effect term" $R'_\varphi(e^0)\, h_\varphi$ can also be used, with only formal modifications, to evaluate the functional $g_1$ in Eq. 6.130. To complete this illustrative example, we note that the explicit form of $w_1$ can be readily obtained by solving Eq. 6.131, as

$$w_1(x) = -\frac{1}{D^0} \left\{ H[x - y(\alpha^0)] - \frac{x}{a} \right\}, \tag{6.132}$$

where $H[x - y(\alpha^0)]$ is the previously defined Heaviside step function. Replacing Eq. 6.132 in 6.130 and carrying out the respective integrations over $x$ yields

$$g_1 = \frac{1}{2D^0} \left( \delta Q - \frac{Q^0}{D^0}\, \delta D \right) [a - 2y(\alpha^0)] - \frac{\delta\varphi_{in}}{a} = \frac{\varphi_{in}^0}{a} \left( \frac{\delta Q}{Q^0} - \frac{\delta D}{D^0} - \frac{\delta\varphi_{in}}{\varphi_{in}^0} \right). \tag{6.133}$$

The sensitivity $\delta y(\alpha^0; h_\alpha)$ of the critical (i.e., maximum) point $y(\alpha)$ at $\alpha^0$ is now obtained by applying Eq. 6.74 to our illustrative example. Thus, specializing Eq.
6.70 to this illustrative problem yields:

$$\phi_{11} \equiv -\int_0^a \{\partial^2 \varphi / \partial x^2\}_{e^0}\, \delta[x - y(\alpha^0)]\, dx = \frac{Q^0}{D^0}. \tag{6.134}$$

Replacing Eqs. 6.133 and 6.134 into the particular form taken on by Eq. 6.74 for this illustrative example leads to

$$\delta y(\alpha^0; h_\alpha) = \frac{D^0 \varphi_{in}^0}{a Q^0} \left( \frac{\delta Q}{Q^0} - \frac{\delta D}{D^0} - \frac{\delta\varphi_{in}}{\varphi_{in}^0} \right). \tag{6.135}$$

The above expression indicates that the partial sensitivities (i.e., the partial G-derivatives) of $y(\alpha)$ with respect to $D$, $Q$, and $\varphi_{in}$ are as follows:

$$\frac{\delta y}{\delta D} = -\frac{\varphi_{in}^0}{a Q^0}, \tag{6.136}$$

$$\frac{\delta y}{\delta Q} = \frac{D^0 \varphi_{in}^0}{a (Q^0)^2}, \tag{6.137}$$

$$\frac{\delta y}{\delta \varphi_{in}} = -\frac{D^0}{a Q^0}. \tag{6.138}$$

The variance, $\mathrm{var}(y)$, of the critical point $y(\alpha)$ can now be computed by replacing Eqs. 6.136–6.138 in 6.128. Further details regarding this illustrative example are given by Cacuci et al. [18].

6.5.2 ASAP for a Riccati Equation

Consider the computation, using the ASAP, of the local sensitivities of a response associated with a nonlinear initial-value problem modeled by the Riccati equation

$$N(u) \equiv \frac{du(t)}{dt} - c\, u^2 = 0, \tag{6.139}$$

where $c$ is an experimentally determined, time-independent quantity with nominal (mean) value $c^0$, and where the dependent variable $u(t)$ satisfies the initial condition

$$u(0) = u_{in}, \quad \text{for } t = 0. \tag{6.140}$$

The initial condition $u_{in}$ is also considered to be an experimentally determined quantity, having the nominal value $u_{in}^0$. For illustrative purposes, consider that the response is a simple functional of $u(t)$, of the form:

$$R \equiv \int_0^{t_f} f(t)\, u(t)\, dt, \tag{6.141}$$

where $t_f$ represents some final-time value. The function $f(t)$ is defined as

$$f(t) \equiv f_u\, d(t), \tag{6.142}$$

where $f_u$ is a time-independent uncertain parameter with nominal value $f_u^0$, while $d(t)$ is a time-dependent function that contains no uncertain parameters. Thus, the nominal value $f^0(t)$ of $f(t)$ is

$$f^0(t) \equiv f_u^0\, d(t). \tag{6.143}$$

For this simple problem, the vector of parameters, $\alpha$, is a three-component column vector of the form

$$\alpha \equiv (c, u_{in}, f_u), \tag{6.144}$$

with nominal value

$$\alpha^0 \equiv (c^0, u_{in}^0, f_u^0). \tag{6.145}$$

Solving Eqs.
6.139 and 6.140 for the nominal parameter values given by Eq. 6.145 yields the following base-case nominal value, $u^0(t)$, for the dependent variable $u(t)$:

$$ u^0(t) = \frac{u_{in}^0 - c^0\,(u_{in}^0)^2\,t}{(c^0)^2\,(u_{in}^0)^2\,t^2 - 2c^0 u_{in}^0\,t + 1}. \tag{6.146} $$

In view of Eqs. 6.141, 6.144, and 6.146, the nominal value of the response is

$$ R^0 \equiv \int_0^{t_f} f^0(t)\,u^0(t)\,dt. \tag{6.147} $$

As usual, the known variations in the parameters $\boldsymbol{\alpha}$ around the nominal value $\boldsymbol{\alpha}^0$ will be denoted by the vector of parameter variations $h_\alpha$, defined as

$$ h_\alpha \equiv (\delta c, \delta u_{in}, \delta f_u). \tag{6.148} $$

The variations $h_\alpha$ will induce variations $h_u(t)$ in $u(t)$ around $u^0(t)$, and will also induce variations in the response $R$ around $R^0$. The sensitivity of $R$ to variations in the vector of parameters $\boldsymbol{\alpha}$ is obtained by computing, as usual, the G-differential of $R$ at $e^0 \equiv (u^0, \boldsymbol{\alpha}^0)$ in the arbitrary direction $h \equiv (h_u, h_\alpha)$; performing the respective operations shows that the G-differential of $R$ is linear in $h$. Therefore, it can be written as:

$$ DR(e^0; h) = R'_u(e^0)\,h_u + R'_\alpha(e^0)\,h_\alpha, \tag{6.149} $$

where

$$ R'_u(e^0)\,h_u \equiv \int_0^{t_f} f^0(t)\,h_u(t)\,dt \tag{6.150} $$

and

$$ R'_\alpha(e^0)\,h_\alpha \equiv \int_0^{t_f} \delta f_u\,d(t)\,u^0(t)\,dt. \tag{6.151} $$

Applying the FSAP to Eqs. 6.139 and 6.140 yields the following “forward sensitivity equations (FSE)”:

$$ L(\alpha^0)\,h_u(t) \equiv \left[\frac{d}{dt} - 2c^0 u^0(t)\right] h_u(t) = \delta c\,\left[u^0(t)\right]^2, \tag{6.152} $$

$$ h_u(0) = \delta u_{in}, \quad \text{for } t = 0. \tag{6.153} $$

In Eq. 6.152, the operator $L(\alpha^0)$ is defined as

$$ L(\alpha^0) \equiv \frac{d}{dt} - 2c^0 u^0(t). \tag{6.154} $$

In principle, Eqs. 6.152 and 6.153 could be solved, repeatedly, to obtain $h_u(t)$ for each parameter variation $h_\alpha$. The need to solve the FSE repeatedly can be circumvented by using the Adjoint Sensitivity Analysis Procedure (ASAP). The first prerequisite for applying the ASAP is that the “indirect-effect” term $R'_u(e^0)h_u$ be expressible as a linear functional of $h_u(t)$. An examination of Eq.
6.150 readily reveals that $R'_u(e^0)h_u$ is indeed a linear functional of $h_u(t)$, which can be represented as an inner product in the real Hilbert space $H_u \equiv L_2(\Omega)$, with $\Omega \equiv (0, t_f)$, equipped with the inner product

$$ \langle f(t), g(t)\rangle \equiv \int_0^{t_f} f(t)\,g(t)\,dt, \quad \text{for } f, g \in H_u \equiv L_2(\Omega),\ \Omega \equiv (0, t_f). \tag{6.155} $$

In $H_u \equiv L_2(\Omega)$, the linear functional $R'_u(e^0)h_u$ defined in Eq. 6.150 can be represented as the inner product

$$ R'_u(e^0)\,h_u = \left\langle f^0(t), h_u \right\rangle. \tag{6.156} $$

The next step, namely the construction of the operator $L^+(\alpha^0)$, which is formally adjoint to $L(\alpha^0)$, readily yields

$$ L^+(\alpha^0) \equiv -\frac{d}{dt} - 2c^0 u^0(t). \tag{6.157} $$

Note that $L(\alpha^0)$ is not even formally self-adjoint, since $L^+(\alpha^0) \neq L(\alpha^0)$. The boundary conditions for $L^+(\alpha^0)$ are derived by specializing Eq. 6.54 to the operators $L(\alpha^0)h_u$ and $L^+(\alpha^0)\psi$, to obtain

$$ \int_0^{t_f} \psi(t)\left[\frac{d}{dt} - 2c^0 u^0(t)\right]h_u(t)\,dt = \int_0^{t_f} h_u(t)\left[-\frac{d}{dt} - 2c^0 u^0(t)\right]\psi(t)\,dt + \left\{P[h_u, \psi]\right\}_{t=0}^{t=t_f}. \tag{6.158} $$

Note that the function $\psi(t)$ is still arbitrary at this stage, except for the requirement that $\psi \in H_Q = L_2(\Omega)$; note also that the Hilbert spaces $H_u$ and $H_Q$ have now both become the same space, i.e., $H_u = H_Q = L_2(\Omega)$. Integrating the left side of Eq. 6.158 by parts and canceling terms yields the following expression for the bilinear boundary form:

$$ \left\{P[h_u, \psi]\right\}_{t=0}^{t=t_f} = h_u(t_f)\,\psi(t_f) - \delta u_{in}\,\psi(0). \tag{6.159} $$

The boundary conditions for $L^+(\alpha^0)$ can now be selected by noting that $h_u$ is not known at $t = t_f$; it can therefore be eliminated from Eq. 6.159 by choosing

$$ \psi(t_f) = 0 \tag{6.160} $$

as the boundary condition for the adjoint function $\psi(t)$. Replacing Eq. 6.160 into 6.159 yields

$$ \left\{P[h_u, \psi]\right\}_{t=0}^{t=t_f} = -\delta u_{in}\,\psi(0) \equiv \hat{P}(h_\alpha, \psi; \alpha^0). \tag{6.161} $$

The last step in the construction of the adjoint system is the identification of the source term, which is done by using Eq. 6.156 to obtain

$$ \nabla_u R(e^0) = f^0(t), \tag{6.162} $$

so that the complete adjoint system becomes

$$ L^+(\alpha^0)\,\psi \equiv -\frac{d\psi(t)}{dt} - 2c^0 u^0(t)\,\psi(t) = f^0(t), \tag{6.163} $$

where the adjoint function $\psi(t)$ is subject to the boundary condition given in Eq. 6.160. Using Eqs.
6.158–6.163 in 6.156 gives the following expression for the “indirect-effect” term $R'_u(e^0)h_u$:

$$ R'_u(e^0)\,h_u = \int_0^{t_f} (\delta c)\,\psi(t)\left[u^0(t)\right]^2 dt + \delta u_{in}\,\psi(0), \tag{6.164} $$

where $\psi(t)$ is the solution of the adjoint sensitivity system defined by Eqs. 6.163 and 6.160. As expected, Eqs. 6.163 and 6.160, which underlie the adjoint sensitivity system, are independent of the parameter variations $h_\alpha$; thus, the adjoint sensitivity system needs to be solved only once to obtain the adjoint function $\psi(t)$. Note, though, that the adjoint sensitivity system depends on the nominal solution $u^0(t)$, since the original equation is nonlinear in $u(t)$. The adjoint sensitivity system also depends on the response, which provides the source term as shown in Eq. 6.163.

6.5.3 ASAP for a System of Linear Ordinary Differential Equations

Consider the application of the ASAP for computing the sensitivity of a response associated with the following system of coupled linear ordinary differential equations:

$$ \begin{cases} \beta\,\dfrac{du_1}{dx} - u_2 = Q_1, \\[4pt] u_1 + \beta\,\dfrac{du_2}{dx} = x\,Q_2, \qquad x \in (0, \ell), \\[4pt] u_1(0) = u_{10}; \quad u_2(0) = u_{20}, \end{cases} \tag{6.165} $$

and consider that the response of interest for this system is simply the functional

$$ R(e) \equiv \int_0^{\ell} \begin{pmatrix} r_1 & r_2 \end{pmatrix} \begin{pmatrix} u_1(x) \\ u_2(x) \end{pmatrix} dx, \tag{6.166} $$

where $r_1$ and $r_2$ are “experimentally determined” constants. In this example, the column vector $e \equiv (u, \alpha)$ has components $u(x) = [u_1(x), u_2(x)]$ and $\alpha \equiv (\beta, u_{10}, u_{20}, Q_1, Q_2, r_1, r_2)$. The components of the column vector $\alpha$ are considered to be uncertain parameters that can vary by amounts $h_\alpha \equiv (\delta\beta, \delta u_{10}, \delta u_{20}, \delta Q_1, \delta Q_2, \delta r_1, \delta r_2)$ around their nominal values $\alpha^0 \equiv (\beta^0, u_{10}^0, u_{20}^0, Q_1^0, Q_2^0, r_1^0, r_2^0)$. Note that, since all vectors are considered to be column (rather than row) vectors, the symbol “T,” denoting “transposition,” will be omitted, to keep the notation as simple as possible. The system defined by Eq.
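6.165 can readily be solved for the nominal parameter values to obtain

$$ u^0 = \begin{pmatrix} u_1^0 \\ u_2^0 \end{pmatrix} = \begin{pmatrix} (Q_1^0 - \beta^0 Q_2^0 + u_{20}^0)\sin(x/\beta^0) + u_{10}^0\cos(x/\beta^0) + x\,Q_2^0 \\ (Q_1^0 - \beta^0 Q_2^0 + u_{20}^0)\cos(x/\beta^0) - u_{10}^0\sin(x/\beta^0) - Q_1^0 + \beta^0 Q_2^0 \end{pmatrix}. \tag{6.167} $$

Using Eq. 6.167 in 6.166 and performing the integration over $x$ gives the nominal response value as

$$ R(e^0) = \begin{pmatrix} r_1 & r_2 \end{pmatrix} \begin{pmatrix} (Q_1^0 - \beta^0 Q_2^0 + u_{20}^0)\,\beta^0\left[1 - \cos(\ell/\beta^0)\right] + u_{10}^0\,\beta^0 \sin(\ell/\beta^0) + Q_2^0\,\ell^2/2 \\ (Q_1^0 - \beta^0 Q_2^0 + u_{20}^0)\,\beta^0 \sin(\ell/\beta^0) - u_{10}^0\,\beta^0\left[1 - \cos(\ell/\beta^0)\right] + (\beta^0 Q_2^0 - Q_1^0)\,\ell \end{pmatrix}. \tag{6.168} $$

The sensitivity $DR(e^0; h)$ of $R(e)$ at $e^0$ for a variation $h = (h_u, h_\alpha)$, where $h_u \equiv [h_1(x), h_2(x)]$, is obtained by computing the G-differential of Eq. 6.166; this yields

$$ DR(e^0; h) = R'_u(e^0)\,h_u + R'_\alpha(e^0)\,h_\alpha, \tag{6.169} $$

where the indirect-effect term is defined as

$$ R'_u(e^0)\,h_u \equiv \int_0^{\ell} \begin{pmatrix} r_1^0 & r_2^0 \end{pmatrix} \begin{pmatrix} h_1(x) \\ h_2(x) \end{pmatrix} dx = \int_0^{\ell} \left[ r_1^0\,h_1(x) + r_2^0\,h_2(x) \right] dx, \tag{6.170} $$

whereas the direct-effect term is defined as

$$ R'_\alpha(e^0)\,h_\alpha \equiv \int_0^{\ell} \begin{pmatrix} \delta r_1 & \delta r_2 \end{pmatrix} \begin{pmatrix} u_1^0(x) \\ u_2^0(x) \end{pmatrix} dx = \int_0^{\ell} \left[ (\delta r_1)\,u_1^0(x) + (\delta r_2)\,u_2^0(x) \right] dx. \tag{6.171} $$

The direct-effect term given by Eq. 6.171 can already be evaluated at this stage by inserting the nominal solution $u^0(x) = [u_1^0(x), u_2^0(x)]$ from Eq. 6.167 into Eq. 6.171 and carrying out the integration over $x$. The indirect-effect term, though, must be evaluated by using either the FSAP or the ASAP. If the FSAP is used, then the Forward Sensitivity System (FSS) is obtained by G-differentiating Eq. 6.165, which gives

$$ \begin{cases} L(\alpha^0)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} \equiv \begin{pmatrix} \beta^0 \dfrac{d}{dx} & -1 \\ 1 & \beta^0 \dfrac{d}{dx} \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = \begin{pmatrix} \delta Q_1 - \delta\beta\,\dfrac{du_1^0}{dx} \\ x\,\delta Q_2 - \delta\beta\,\dfrac{du_2^0}{dx} \end{pmatrix}, \qquad x \in (0, \ell), \\[10pt] \begin{pmatrix} h_1(0) \\ h_2(0) \end{pmatrix} = \begin{pmatrix} \delta u_{10} \\ \delta u_{20} \end{pmatrix}. \end{cases} \tag{6.172} $$

The vector of variations $h_u \equiv [h_1(x), h_2(x)]$ in the state function $u(x) = [u_1(x), u_2(x)]$ around the nominal values $e^0$ can be computed by solving the Forward Sensitivity System (FSS) given by Eq. 6.172. In turn, the values $h_u \equiv [h_1(x), h_2(x)]$ thus obtained could be used to compute the indirect-effect term $R'_u(e^0)h_u$ from Eq. 6.170.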
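The FSAP route just described can be sketched numerically. The following is a minimal illustration, not the chapter's own implementation: it integrates the FSS of Eq. 6.172 once per parameter with a classical Runge-Kutta scheme and compares the indirect-effect integral of Eq. 6.170 with a central finite difference of the closed-form response of Eq. 6.168. The nominal values in `P0`, `R1`, `R2`, and `ELL`, and all function names, are hypothetical choices made only for this sketch; the signs in Eqs. 6.165 and 6.167 are taken as $\beta u_1' - u_2 = Q_1$ and $u_1 + \beta u_2' = xQ_2$.

```python
import math

# Hypothetical nominal values, chosen only for illustration.
P0 = {"beta": 0.7, "u10": 1.0, "u20": 0.5, "Q1": 2.0, "Q2": 1.5}
R1, R2, ELL = 1.0, 2.0, 2.0


def nominal_derivs(x, p):
    """du1/dx and du2/dx of the closed-form nominal solution (Eq. 6.167)."""
    b = p["beta"]
    k = p["Q1"] - b * p["Q2"] + p["u20"]
    s, c = math.sin(x / b), math.cos(x / b)
    return (k * c - p["u10"] * s) / b + p["Q2"], (-k * s - p["u10"] * c) / b


def response(p):
    """Closed-form response (Eq. 6.168)."""
    b = p["beta"]
    k = p["Q1"] - b * p["Q2"] + p["u20"]
    s, c = math.sin(ELL / b), math.cos(ELL / b)
    int_u1 = k * b * (1 - c) + p["u10"] * b * s + p["Q2"] * ELL**2 / 2
    int_u2 = k * b * s - p["u10"] * b * (1 - c) + (b * p["Q2"] - p["Q1"]) * ELL
    return R1 * int_u1 + R2 * int_u2


def solve_fss(p, d, n=2000):
    """RK4 solution of the FSS (Eq. 6.172) for the variation d.

    Returns the indirect-effect integral of Eq. 6.170, carried along
    as a third component of the integrated state.
    """
    beta = p["beta"]

    def rhs(x, y):
        du1, du2 = nominal_derivs(x, p)
        h1, h2, _ = y
        return ((h2 + d["Q1"] - d["beta"] * du1) / beta,
                (-h1 + x * d["Q2"] - d["beta"] * du2) / beta,
                R1 * h1 + R2 * h2)

    def shifted(y, k, a):
        return tuple(yi + a * ki for yi, ki in zip(y, k))

    y, x, dx = (d["u10"], d["u20"], 0.0), 0.0, ELL / n
    for _ in range(n):
        k1 = rhs(x, y)
        k2 = rhs(x + dx / 2, shifted(y, k1, dx / 2))
        k3 = rhs(x + dx / 2, shifted(y, k2, dx / 2))
        k4 = rhs(x + dx, shifted(y, k3, dx))
        y = tuple(yi + dx / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
                  for yi, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4))
        x += dx
    return y[2]


fsap, fd = {}, {}
for name in P0:  # one FSS solve per parameter
    d = {key: 0.0 for key in P0}
    d[name] = 1.0
    fsap[name] = solve_fss(P0, d)
    eps = 1e-6
    up, dn = dict(P0), dict(P0)
    up[name] += eps
    dn[name] -= eps
    fd[name] = (response(up) - response(dn)) / (2 * eps)
    print(f"{name:5s} FSAP {fsap[name]: .8f}   finite difference {fd[name]: .8f}")
```

Each of the five parameters that enter the state equations costs one full solution of the coupled system, which is the cost the adjoint procedure avoids. The sensitivities to $r_1$ and $r_2$ are pure direct effects (Eq. 6.171) and need no FSS solve.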
However, the ASAP is more advantageous to use than the FSAP since there are seven parameters but only one response, which means that the sensitivities to all parameters could be obtained by computing the adjoint function once (rather than solving the FSS seven times, if the FSAP were employed). Proceeding with the application of the ASAP, we construct next the operator $L^+(\boldsymbol{\alpha}^0)$, which is formally adjoint to $L(\boldsymbol{\alpha}^0)$. For this illustrative example, $L^+(\boldsymbol{\alpha}^0)$ is readily obtained as

$$ L^+(\boldsymbol{\alpha}^0) = \begin{pmatrix} -\beta^0 \dfrac{d}{dx} & 1 \\ -1 & -\beta^0 \dfrac{d}{dx} \end{pmatrix}. \tag{6.173} $$

Using Eqs. 6.172 and 6.173 in 6.54, performing the respective vector-matrix multiplications, and the subsequent integration over $x$ yields the following form for the bilinear concomitant:

$$ \left\{P[h_u, \psi]\right\}_{x=0}^{x=\ell} = \beta\left[\psi_1(\ell)\,h_1(\ell) - \psi_1(0)\,h_1(0) + \psi_2(\ell)\,h_2(\ell) - \psi_2(0)\,h_2(0)\right]. \tag{6.174} $$

The boundary conditions for the adjoint function $\psi$ are determined by noting that the values $h_1(0)$ and $h_2(0)$ are known from Eq. 6.172, but the values of $h_1(\ell)$ and $h_2(\ell)$ are unknown. The latter can be eliminated from the expression of the bilinear concomitant by requiring that

$$ \begin{pmatrix} \psi_1(\ell) \\ \psi_2(\ell) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{6.175} $$

Replacing the boundary conditions for the adjoint function selected in Eq. 6.175 together with the boundary conditions for $h_1(0)$ and $h_2(0)$ from Eq. 6.172 reduces the bilinear concomitant given in Eq. 6.174 to the expression

$$ \hat{P}(h_\alpha, \psi; \alpha^0) = -\beta\left[\psi_1(0)\,\delta u_{10} + \psi_2(0)\,\delta u_{20}\right]. \tag{6.176} $$

The construction of the adjoint system can now be completed by using the expression of the indirect-effect term defined by Eq. 6.170 to obtain

$$ L^+(\boldsymbol{\alpha}^0)\,\psi \equiv \begin{pmatrix} -\beta^0 \dfrac{d}{dx} & 1 \\ -1 & -\beta^0 \dfrac{d}{dx} \end{pmatrix} \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \end{pmatrix} = \begin{pmatrix} r_1^0 \\ r_2^0 \end{pmatrix}. \tag{6.177} $$

As expected, the adjoint sensitivity system, consisting of Eqs. 6.175 and 6.177, is independent of any parameter variations $h_\alpha \equiv (\delta\beta, \delta u_{10}, \delta u_{20}, \delta Q_1, \delta Q_2, \delta r_1, \delta r_2)$, and is also independent of the variations $h_u \equiv [h_1(x), h_2(x)]$. Thus, the adjoint sensitivity system needs to be solved once only to obtain the adjoint function $\psi$.
For this illustrative example, Eqs. 6.177 and 6.175 can be readily solved to obtain

$$ \psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} A \sin(x/\beta^0) - B \cos(x/\beta^0) - r_2 \\ B \sin(x/\beta^0) + A \cos(x/\beta^0) + r_1 \end{pmatrix}, \tag{6.178} $$

where the constants $A$ and $B$ are defined as

$$ A \equiv r_2 \sin(\ell/\beta^0) - r_1 \cos(\ell/\beta^0), \qquad B \equiv -r_1 \sin(\ell/\beta^0) - r_2 \cos(\ell/\beta^0). \tag{6.179} $$

Once $\psi$ has been obtained, it is used in conjunction with Eq. 6.91, specialized to this illustrative example, to obtain the following expression for the sensitivity $DR(e^0; h)$ in terms of the adjoint function $\psi$:

$$ DR(e^0; h) = \int_0^{\ell} \left[ (\delta r_1)\,u_1^0(x) + (\delta r_2)\,u_2^0(x) \right] dx + \beta\left[\psi_1(0)\,\delta u_{10} + \psi_2(0)\,\delta u_{20}\right] + \int_0^{\ell} \left\{ \psi_1(x)\left[\delta Q_1 - \delta\beta\,\frac{du_1^0}{dx}\right] + \psi_2(x)\left[x\,\delta Q_2 - \delta\beta\,\frac{du_2^0}{dx}\right] \right\} dx. \tag{6.180} $$

As indicated by the above expression, the sensitivity $DR(e^0; h)$ is calculated most efficiently in terms of the adjoint function $\psi$, by quadratures. Thus, even this simple illustrative example highlights the efficiency of the ASAP, as can be seen from the following considerations: (i) to compute the sensitivity $DR(e^0; h)$ to the parameter variations by the FSAP, one would need to solve (at least) seven times (since there are seven parameters) the coupled differential equations shown in Eq. 6.172, in order to compute $h_u \equiv [h_1(x), h_2(x)]$ for the vector of variations $h_\alpha \equiv (\delta\beta, \delta u_{10}, \delta u_{20}, \delta Q_1, \delta Q_2, \delta r_1, \delta r_2)$; (ii) alternatively, to compute the sensitivity $DR(e^0; h)$ to the parameter variations $h_\alpha$ by the ASAP, one would need to solve the adjoint sensitivity system once only, which is no more difficult than solving the forward sensitivity system once, and then perform seven quadratures (i.e., integrations over $x$), as indicated in Eq. 6.180, to obtain the same sensitivities as would be provided by the FSAP. If one needs sensitivities to more than seven parameter variations, then, for each parameter variation, one would need to solve a system of coupled differential equations by using the FSAP, as opposed to one quadrature by using the ASAP.
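The quadrature route of Eq. 6.180 can be checked with a short script. The sketch below is illustrative only: it evaluates the closed-form adjoint function of Eqs. 6.178 and 6.179 once, obtains all seven sensitivities by Simpson quadrature or point evaluation, and compares them with central finite differences of the closed-form response of Eq. 6.168. The nominal values in `P0` and `ELL`, the helper names, and the choice of Simpson's rule are assumptions made for this example.

```python
import math

# Hypothetical nominal values, chosen only for illustration.
P0 = {"beta": 0.7, "u10": 1.0, "u20": 0.5, "Q1": 2.0, "Q2": 1.5,
      "r1": 1.0, "r2": 2.0}
ELL = 2.0


def nominal(x, p):
    """Nominal solution (Eq. 6.167) and its x-derivatives."""
    b = p["beta"]
    k = p["Q1"] - b * p["Q2"] + p["u20"]
    s, c = math.sin(x / b), math.cos(x / b)
    u1 = k * s + p["u10"] * c + x * p["Q2"]
    u2 = k * c - p["u10"] * s - p["Q1"] + b * p["Q2"]
    du1 = (k * c - p["u10"] * s) / b + p["Q2"]
    du2 = (-k * s - p["u10"] * c) / b
    return u1, u2, du1, du2


def response(p):
    """Closed-form response (Eq. 6.168)."""
    b = p["beta"]
    k = p["Q1"] - b * p["Q2"] + p["u20"]
    s, c = math.sin(ELL / b), math.cos(ELL / b)
    int_u1 = k * b * (1 - c) + p["u10"] * b * s + p["Q2"] * ELL**2 / 2
    int_u2 = k * b * s - p["u10"] * b * (1 - c) + (b * p["Q2"] - p["Q1"]) * ELL
    return p["r1"] * int_u1 + p["r2"] * int_u2


def adjoint(x, p):
    """Adjoint function (Eqs. 6.178 and 6.179); satisfies psi(ELL) = 0."""
    b, r1, r2 = p["beta"], p["r1"], p["r2"]
    s, c = math.sin(ELL / b), math.cos(ELL / b)
    a_const = r2 * s - r1 * c
    b_const = -r1 * s - r2 * c
    sx, cx = math.sin(x / b), math.cos(x / b)
    return a_const * sx - b_const * cx - r2, b_const * sx + a_const * cx + r1


def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def asap_sensitivities(p):
    """All seven sensitivities from one adjoint function (Eq. 6.180)."""
    psi1_0, psi2_0 = adjoint(0.0, p)
    return {
        "r1": simpson(lambda x: nominal(x, p)[0], 0.0, ELL),
        "r2": simpson(lambda x: nominal(x, p)[1], 0.0, ELL),
        "u10": p["beta"] * psi1_0,
        "u20": p["beta"] * psi2_0,
        "Q1": simpson(lambda x: adjoint(x, p)[0], 0.0, ELL),
        "Q2": simpson(lambda x: x * adjoint(x, p)[1], 0.0, ELL),
        "beta": simpson(lambda x: -(adjoint(x, p)[0] * nominal(x, p)[2]
                                    + adjoint(x, p)[1] * nominal(x, p)[3]),
                        0.0, ELL),
    }


asap = asap_sensitivities(P0)
fd = {}
for name in P0:
    eps = 1e-6
    up, dn = dict(P0), dict(P0)
    up[name] += eps
    dn[name] -= eps
    fd[name] = (response(up) - response(dn)) / (2 * eps)
    print(f"{name:5s} ASAP {asap[name]: .8f}   finite difference {fd[name]: .8f}")
```

No differential equation is solved per parameter here: once $\psi$ is available, each additional parameter costs only one quadrature or one point evaluation.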
Comparing the computational effort needed to solve a system of coupled differential equations as opposed to performing a quadrature clearly underscores the powerful advantages of using the ASAP, whenever possible.

6.6 Computational Considerations and Open Problems

Recall from Section 6.3 that statistical uncertainty and sensitivity analysis methods aim at assessing the contributions of parameter uncertainties to the overall uncertainty of the model response (output). The relative magnitude of this uncertainty contribution is assigned a measure of the statistical sensitivity of the response uncertainty to the respective parameter, and this measure is also used to rank the importance of the respective parameter. The simplest conceptual attempt at “sensitivity analysis” is to use screening design methods to identify a short list of parameters that have the largest influence on a particular model response. The fundamental assumption underlying all screening design methods is that the influence of parameters in models follows Pareto’s law of income distribution within nations, i.e., the number of parameters that are truly important to the model response is small by comparison to the total number of parameters in the model. There is an inevitable trade-off between the computational costs and the information extracted from a screening design. Thus, computationally economical methods provide only qualitative, rather than quantitative, information, in that they provide a parameter importance ranking rather than a quantification of how much more important a given parameter is than another. Furthermore, the importance of parameters in large-scale, complex models is seldom a priori obvious (and may even be counterintuitive). Hence, screening design methods should be used cautiously, since they may be a priori inadequate to identify the truly important parameters.
As has also been discussed in Section 6.3, “sampling-based uncertainty and sensitivity analysis” is performed in order to ascertain if model predictions fall within some region of concern (“uncertainty in model responses due to uncertainties in model parameters”) and to identify the dominant parameters in contributing to the response uncertainty (“statistical sensitivity analysis”). It is particularly important to recall from Section 6.3 that the very first step (of the five specific steps described there) of “sampling-based uncertainty and sensitivity analysis,” in which subjective uncertainties are assigned through expert review processes, is crucial to the results produced by the subsequent steps in the analysis. This exceptional importance of the very first step is due to the fact that the subsequent results produced by sampling-based uncertainty and sensitivity analysis methods depend entirely on the distributions assigned to the sampled parameters; hence, the proper assignment of these distributions is essential to avoid producing spurious results. Furthermore, the statistical “sensitivity analysis” of the response to the parameters is performed in the fifth (and last) step of these procedures (by using scatter plots, regression analysis, partial correlation analysis, etc.). Therefore, it is also important to note that correlated variables introduce unstable regression coefficients when performing the “statistical sensitivity analysis” part, because these coefficients become sensitive to the specific variables introduced into the regression model. 
In such situations, the regression coefficients of a regression model that includes all of the parameters are likely to give misleading indications of parameter “sensitivities.” If several input parameters are suspected (or known) to be highly correlated, it is usually recommended to transform the respective parameters so as to remove the correlations or, if this is not possible, to analyze the full model by using a sequence of regression models with all but one of the parameters removed, in turn. From the material presented in the preceding sections, it has also become apparent that all statistical uncertainty and sensitivity analysis procedures commence with the “uncertainty analysis” stage, and only subsequently proceed to the “sensitivity analysis” stage; this path is the exact reverse of the conceptual path underlying the methods of deterministic sensitivity and uncertainty analysis, where the sensitivities are determined prior to using them for uncertainty analysis. Without any a priori assumption regarding the relationship between the parameters and the response, the construction of a full-space statistical uncertainty analysis requires $O(s^I)$ computations, where $s$ denotes the number of sample values for each parameter and $I$ denotes the number of parameters. If a local polynomial regression is used, the rate of convergence of the approximate to the true parameters-to-response mapping is $s_N = N^{-p/(2p+I)}$, where $N$ denotes the number of sample points, $p$ denotes the degree of smoothness of the function representing the response in terms of the parameters, and $I$ denotes the number of parameters. This relation indicates that the parameters-to-response mapping (function) can be approximated to a resolution of $s^{-1}$ with $O(s^{I/p})$ sample points. The FAST method appears to be the most efficient of the global statistical methods, needing $I(8\omega_i + 1)N_r$ computations for each frequency, where $N_r$ denotes the number of replicates.
For example, if the response is a function of eight parameters, and if the sample size is 64, then Sobol’s method requires 1,088 model evaluations, while the FAST method requires 520 model evaluations; when the sample size increases to 1,024, then Sobol’s method requires 17,408 model evaluations, while the FAST method requires 8,200 model evaluations. It becomes clear that even for the most efficient statistical methods (e.g., the FAST method) the number of required model evaluations rapidly becomes impractical for realistic, large-scale models involving many parameters. Thus, since many thousands of simulations are needed, statistical methods are at best expensive (for small systems), or, at worst, impracticable (e.g., for large time-dependent systems). Furthermore, since the response sensitivities and parameter uncertainties are inherently amalgamated, improvements in parameter uncertainties cannot be directly propagated to improve response uncertainties; rather, the entire set of simulations must be repeated anew. Currently, a general-purpose “fool-proof” statistical method for correctly analyzing mathematical models of physical processes involving highly correlated parameters does not seem to exist, so that particular care must be taken while interpreting regression results for such models.

Summarizing the computational effort required by the various deterministic methods for computing local sensitivities, we recall that the “brute-force method” is conceptually simple to use and requires no additional model development, but it is slow, relatively expensive computationally, and involves a trial-and-error process when selecting the parameter perturbations $\delta\alpha_i$, in order to avoid erroneous results for the computed sensitivities. The “brute-force method” requires $(I + 1)$ model computations; if central differences are used, the number of model computations could increase up to a total of $2I$.
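The evaluation counts quoted above can be tabulated with a few lines of code. This is a rough sketch under stated assumptions: the FAST count uses the expression $I(8\omega_i + 1)N_r$ from the text with a single frequency $\omega = N/8$ and $N_r = 1$, and the Sobol count $N(2I + 1)$ is a formula inferred here from the quoted numbers (1,088 and 17,408) rather than one stated in the chapter. The function names are hypothetical.

```python
def full_space_count(s, n_params):
    """Full-space sampling: s sample values for each of n_params parameters."""
    return s ** n_params


def sobol_count(sample_size, n_params):
    """Assumed form N(2I + 1), inferred from the quoted Sobol counts."""
    return sample_size * (2 * n_params + 1)


def fast_count(omega, n_params, n_replicates=1):
    """I(8*omega + 1)*N_r, as quoted in the text for the FAST method."""
    return n_params * (8 * omega + 1) * n_replicates


def brute_force_count(n_params, central=False):
    """One-sided differences need I + 1 runs; central differences need 2I."""
    return 2 * n_params if central else n_params + 1


n_params = 8
for sample_size in (64, 1024):
    omega = sample_size // 8  # assumption: sample size of about 8*omega
    print(f"N = {sample_size:5d}: "
          f"Sobol {sobol_count(sample_size, n_params):6d}, "
          f"FAST {fast_count(omega, n_params):5d}")
print("full space, s = 10:", full_space_count(10, n_params))
print("brute force:", brute_force_count(n_params),
      "one-sided,", brute_force_count(n_params, central=True), "central")
```

The contrast between the exponential full-space count and the linear deterministic counts is the point of the comparison; the statistical counts scale with the sample size as well as with the number of parameters.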
Of the deterministic methods for obtaining the local first-order sensitivities exactly, the Green’s function method is the most expensive computationally. The DDM requires at least as many model evaluations as there are parameters, and the computational effort increases linearly with the number of parameters. Since it uses Gâteaux-differentials, the FSAP represents a generalization of the DDM; nevertheless, the FSAP requires the same computational and programming effort to develop and implement as the DDM. Hence, just like the DDM, the FSAP is advantageous to employ only if the number of different responses of interest for the problem under consideration exceeds the number of system parameters and/or parameter variations to be considered. Otherwise, the use of either the FSAP or the DDM becomes impractical for large systems with many parameters, because the computational requirements become unaffordable.

By far the most efficient local sensitivity analysis method is the ASAP, but the ASAP requires development of an appropriate adjoint sensitivity model. If the adjoint model is developed simultaneously with the original model, then it requires very few additional resources to develop. If, however, the adjoint sensitivity system is developed a posteriori, considerable skill may be required for its successful implementation and use. Note that the adjoint sensitivity system is independent of parameter variations, but depends on the response, which contributes the source term for this system. Hence, the adjoint sensitivity system needs to be solved only once per response in order to obtain the adjoint function. Furthermore, the adjoint sensitivity system is linear in the adjoint function. In particular, for linear problems, the adjoint sensitivity system is independent of the original state variables, which means that it can be solved independently of the original system.
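These properties can be illustrated with the Riccati example of Section 6.5.2. The following sketch assumes $d(t) = 1$ and hypothetical nominal values; it integrates the linear adjoint equation (Eq. 6.163) backward from the final-time condition (Eq. 6.160) once, then reads off all three sensitivities from Eqs. 6.151 and 6.164. The comparison values come from differentiating the closed-form response $R = -(f_u/c)\ln(1 - c\,u_{in}\,t_f)$, which follows from Eq. 6.146 under the same assumption. The Runge-Kutta and Simpson routines are choices made for this example, not the chapter's method.

```python
import math

# Hypothetical nominal values; d(t) = 1 is assumed so that f0(t) = FU0.
C0, UIN0, FU0, TF = 0.5, 1.0, 2.0, 1.0


def u0(t):
    """Nominal solution of the Riccati equation (Eq. 6.146, simplified)."""
    return UIN0 / (1.0 - C0 * UIN0 * t)


def f0(t):
    """Nominal response weight f0(t) = fu0 * d(t) with d(t) = 1."""
    return FU0


def solve_adjoint(n=2000):
    """RK4 integration of -dpsi/dt - 2*c0*u0*psi = f0 backward from psi(tf) = 0."""
    dt = TF / n

    def rhs(t, psi):  # dpsi/dt; linear in psi, depends on the nominal u0(t)
        return -2.0 * C0 * u0(t) * psi - f0(t)

    psi = [0.0] * (n + 1)
    for k in range(n, 0, -1):
        t = k * dt
        k1 = rhs(t, psi[k])
        k2 = rhs(t - dt / 2, psi[k] - dt / 2 * k1)
        k3 = rhs(t - dt / 2, psi[k] - dt / 2 * k2)
        k4 = rhs(t - dt, psi[k] - dt * k3)
        psi[k - 1] = psi[k] - dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return psi, dt


def simpson_grid(values, dt):
    """Composite Simpson rule on a uniform grid with an even number of intervals."""
    n = len(values) - 1
    total = values[0] + values[-1]
    for i in range(1, n):
        total += (4 if i % 2 else 2) * values[i]
    return total * dt / 3


psi, dt = solve_adjoint()          # a single adjoint solve
times = [k * dt for k in range(len(psi))]

dR_dc = simpson_grid([p * u0(t) ** 2 for p, t in zip(psi, times)], dt)  # Eq. 6.164
dR_duin = psi[0]                                                        # Eq. 6.164
dR_dfu = simpson_grid([u0(t) for t in times], dt)                       # Eq. 6.151

# Reference values from the closed-form response R = -(fu/c) ln(1 - c*uin*tf).
q = C0 * UIN0 * TF
exact_dc = FU0 / C0**2 * math.log(1 - q) + FU0 / C0 * UIN0 * TF / (1 - q)
exact_duin = FU0 * TF / (1 - q)
exact_dfu = -math.log(1 - q) / C0

print(f"dR/dc   adjoint {dR_dc: .8f}   exact {exact_dc: .8f}")
print(f"dR/duin adjoint {dR_duin: .8f}   exact {exact_duin: .8f}")
print(f"dR/dfu  adjoint {dR_dfu: .8f}   exact {exact_dfu: .8f}")
```

The forward problem is nonlinear, yet the adjoint equation integrated here is linear in $\psi$; its coefficients depend on the nominal solution $u^0(t)$, which is why the nominal problem must be solved first for nonlinear systems.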
In summary, the ASAP is the most efficient method to use for sensitivity analysis of systems in which the number of parameters exceeds the number of responses under consideration. It is important to emphasize that the “propagation of moments” equations are used both for processing experimental data obtained from indirect measurements and also for performing statistical analysis of computational models. The “propagation of moments” equations provide a systematic way for obtaining the uncertainties in computed results, arising not only from uncertainties in the parameters that enter the computational model, but also from the numerical approximations themselves. The major advantages of using the “propagation of moments” method are: (i) if all sensitivities are available, then all of the objectives of sensitivity analysis (enumerated above) can be pursued efficiently and exhaustively; and (ii) since the response sensitivities and parameter uncertainties are obtained separately from each other, improvements in parameter uncertainties can immediately be propagated to improve the uncertainty in the response, without the need for expensive model recalculations. On the other hand, the major disadvantage of the “propagation of moments” method is that the local sensitivities need to be calculated a priori; as we have already emphasized, such calculations are very expensive computationally, particularly for large (and/or time-dependent) systems. It hence follows that a very efficient overall methodology for performing local sensitivity and uncertainty analysis would be to combine the ASAP (which would provide the local response sensitivities) with the “propagation of moments” method, to obtain the local response uncertainties. Several techniques have been proposed (see, e.g., the reviews by Greenspan [34], Ronen [52]; and references therein) for computing higher-order response derivatives with respect to the system’s parameters. 
However, none of these techniques has proven routinely practicable for large-scale problems. This is because the systems of equations that need to be solved for obtaining the second-order (and higher-order) Gâteaux-differentials of the response and system’s operator equations are very large and depend on the perturbation $\delta\alpha$. Thus, even the calculation of the second-order (local) Gâteaux-differentials of the response and system’s operator equations is just as difficult as undertaking the complete task of computing the exact value of the perturbed response $R(\alpha_1^0 + \delta\alpha_1, \ldots, \alpha_k^0 + \delta\alpha_k)$.

It appears that the only genuinely global deterministic method for sensitivity analysis is the Global Adjoint Sensitivity Analysis Procedure (GASAP) proposed by Cacuci [11]. Instead of attempting to extend the validity of local Taylor series, the GASAP is based on a global homotopy-based concept for exploring the entire phase space spanned by the parameters and state variables. The GASAP yields information about the important global features of the physical system, namely the critical points of $R(\alpha)$ and the bifurcation branches and/or turning points of the system’s state variables. Although the GASAP is both exhaustive and computationally efficient, its general utility for large-scale models is still untested at the time of this writing.

Regarding future developments in sensitivity and uncertainty analysis, two of the outstanding issues, whose solution would greatly advance the state of overall knowledge, would be: (i) to extend the adjoint sensitivity analysis procedure (ASAP) to problems describing turbulent flows, and (ii) to combine the GASAP with global statistical uncertainty analysis methods, striving to perform, efficiently and accurately, global sensitivity and uncertainty analyses for large-scale systems.

References

1.
Andres TH, Hajas WC (April 1993) Using iterated fractional factorial design to screen parameters in sensitivity analysis of a probabilistic risk assessment model. Proceedings of the Joint International Conference on Mathematical Methods and Supercomputing in Nuclear Applications, vol 2, 328. Karlsruhe, Germany, pp 19–23 2. Berger J (1985) Statistical decision theory and Bayesian analysis, 2nd edn. Springer, New York 3. Bettonvil B (1990) Detection of important factors by sequential bifurcation. Tilburg University Press, Tilburg 4. Bode HW (1945) Network analysis and feedback amplifier design. Van Nostrand-Reinhold, Princeton, NJ 5. Bogaevski VN, Povzner A (1991) Algebraic methods in nonlinear perturbation theory. Springer, New York 6. Bonano EJ, Apostolakis GE (1991) Theoretical foundations and practical issues for using expert judgments in uncertainty analysis of high-level radioactive waste disposal. Radioact Waste Manag Nucl Fuel Cycle 16:137 7. Box GEP, Hunter WG, Hunter JS (1978) Statistics for experimenters. Chap.15. Wiley, New York 8. Bysveen S et al. (1990) Experience from application of probabilistic methods in offshore field activities. In Proceedings of the ninth international conference on off-shore mechanics and arctic engineering. Saga Petroleum, AS, Norway 9. Cacuci DG (1981) Sensitivity theory for nonlinear systems. I. Nonlinear functional analysis approach. J Math Phys 22:2794 10. Cacuci DG (1981) Sensitivity theory for nonlinear systems. II. Extensions to additional classes of responses. J Math Phys 22:2803 11. Cacuci DG (1990) Global optimization and sensitivity analysis. Nucl Sci Eng 104:78 12. Cacuci DG (2003) Sensitivity and uncertainty analysis: I. theory, vol 1. Chapman & Hall/CRC Press, Boca Raton, FL 13. Cacuci DG, Hall MCG (1984) Efficient estimation of feedback effects with application to climate models. J Atmos Sci 41:2063 14. 
Cacuci DG, Ionescu-Bujor M (2000) Adjoint sensitivity analysis of the RELAP5/MOD3.2 two-fluid thermal-hydraulic code system: I. Theory. Nucl Sci Eng 136:59 15. Cacuci DG, Ionescu-Bujor M (2002) Adjoint sensitivity and uncertainty analysis for reliability/availability models, with application to the international fusion materials irradiation facility. A&QT-R 2002 (THETA 13). International conference on automation, quality and testing, robotics. Cluj-Napoca, Romania, May 23–25 16. Cacuci DG, Wacholder E (1982) Adjoint sensitivity analysis for transient two-phase flow. Nucl Sci Eng 82:461 17. Cacuci DG et al. (1980) Sensitivity theory for general systems of nonlinear equations. Nucl Sci Eng 75:88 18. Cacuci DG, Ionescu-Bujor M, Navon IM (2005) Sensitivity and uncertainty analysis: II. Applications to large-scale systems, vol 2. Chapman & Hall/CRC Press, Boca Raton, FL 19. Cacuci DG, Maudlin PJ, Parks CV (1983) Adjoint sensitivity analysis of extremum-type responses in reactor safety. Nucl Sci Eng 83:112 20. Clement RT, Winkler RL (1999) Combining probability distributions from experts in risk analysis. Risk Anal 19:187 21. Cotter SC (1979) A screening design for factorial experiments with interactions. Biometrika 66:317 22. Cruz JB (1973) System sensitivity analysis. Dowden, Hutchinson and Ross, Stroudsburg, PA 23. Cukier RI et al. (1973) Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I: theory. J Chem Phys 59:3873 24. Daniel C (1973) One-at-a-time-plans. J Am Statist Assoc 68:353 25. Deif AS (1986) Sensitivity analysis in linear systems. Springer, New York 26. Dunker AM (1981) Efficient calculation of sensitivity coefficients for complex atmospheric models. Atmos Environ 15:1155 27. Dunker AM (1984) The decoupled direct method for calculating sensitivity coefficients in chemical kinetics.
J Chem Phys 81:2385 28. Eckhaus W (1979) Asymptotic analysis of singular perturbations. North-Holland, Amsterdam 29. Eslami M (1994) Theory of sensitivity in dynamic systems. Springer, Heidelberg 30. Fiacco AV (ed) Sensitivity, stability, and parametric analysis (a publication of the Mathematical Programming Society). North-Holland, Amsterdam 31. Fischer RA (1935) The design of experiments. Oliver & Boyd, Edinburgh 32. Frank PM (1978) Introduction to system sensitivity theory. Academic, New York 33. Gandini A (1987) Generalized Perturbation Theory (GPT) methods: a heuristic approach. In: Lewins J, Becker M (eds) Advances in nuclear science and technology, vol 19. Plenum, New York 34. Greenspan E (1982) New developments in sensitivity theory. In: Lewins J, Becker M (eds) Advances in nuclear science and technology, vol 14. Plenum, New York 35. Holmes MH (1995) Introduction to perturbation methods. Springer, New York 36. Hora SC, Iman RL (1989) Expert opinion in risk analysis: the NUREG-1150 methodology. Nucl Sci Eng 102:323 37. Iman RL, Conover WJ (1982) A distribution-free approach to inducing rank correlation among input variables. Commun Stat Simul Comput B 11:311 38. Ionescu-Bujor M, Cacuci DG (2000) Adjoint sensitivity analysis of the RELAP5/MOD3.2 two-fluid thermal-hydraulic code system: II. Applications. Nucl Sci Eng 136:85 39. Ionescu-Bujor M, Cacuci DG (2003) Illustrative application of the adjoint sensitivity analysis procedure to reliability models of electromechanical devices, SIELMEN 2003. The 4th international conference on electromechanical and energetic systems. Chisinau, Republic of Moldavia, September 26–27 40. Kato T (1963) Perturbation theory for linear operators. Springer, Berlin 41. Kevorkian J, Cole JD (1996) Multiple scale and singular perturbation methods. Springer, Heidelberg 42. Kokotovic PV et al. (1972) Singular perturbations: order reduction in control system design, JACC 43. Kramer MA et al. (1981) An improved computational method for sensitivity analysis: Green function method with AIM.
Appl Math Model 5:432–441 44. Lillie RA et al. (1988) Sensitivity/Uncertainty analysis for free-in-air tissue kerma at Hiroshima and Nagasaki due to initial radiation. Nucl Sci Eng 100:105 45. Madsen HO et al. (1986) Methods of structural safety. Prentice Hall, Englewood Cliffs, NJ 46. McKay MD et al. (1979) A comparison of three methods of selecting values of input variables in the analysis of output from a computer code. Technometrics 21:239 47. Morris MD (1991) Factorial sampling plans for preliminary computational experiments. Technometrics 33:161 48. Navon IM (1998) Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography. Dyn Atmos Ocean 27(1–4):55 49. Navon IM et al. (1992) Variational data assimilation with an adiabatic version of the NMC spectral model. Mon Wea Rev 120:1433 50. Nayfeh AH (1973) Perturbation methods. Wiley, New York 51. O’Malley RE, Jr (1991) Singular perturbation methods for ordinary differential equations. Springer, New York 52. Ronen Y (1988) Uncertainty analysis based on sensitivity analysis. In: Ronen Y (ed) Uncertainty analysis. CRC Press, Boca Raton, FL 53. Rosenwasser E, Yusupov R (2000) Sensitivity of automatic control systems. CRC Press, Boca Raton, FL 54. Sagan H (1969) Introduction to the calculus of variations. McGraw-Hill, New York 55. Saltelli A, Sobol IM (1995) About the use of rank transformation in sensitivity analysis of model output. Reliab Eng Syst Safety 50:225 56. Saltelli A et al. (1999) A quantitative, model independent method for global sensitivity analysis of model output. Technometrics 41:39 57. Sanchez MA, Blower SM (1997) Uncertainty and sensitivity analysis of the basic reproductive rate. Tuberculosis as an example. Am J Epidemiol 145:1127 58. Selengut DS (1959) Variational analysis of multidimensional systems. Rep. HW-59129, HEDL, Richland, WA, pp 97 59. Selengut DS (1963) On the derivation of a variational principle for linear systems.
Nucl Sci Eng 17:310
60. Stacey WM, Jr (1974) Variational methods in nuclear reactor physics. Academic, New York
61. Tomovic R, Vukobratovic M (1972) General sensitivity theory. Elsevier, New York
62. US NRC (Nuclear Regulatory Commission) (1990–1991) Severe accident risks: an assessment for five U.S. nuclear power plants, NUREG-1150. US Nuclear Regulatory Commission, Washington, DC
63. Usachev LN (1964) Perturbation theory for the breeding ratio and for other number ratios pertaining to various reactor processes. J Nucl Energy A/B 18:571
64. Weinberg AM, Wigner EP (1958) The physical theory of neutron chain reactors. University of Chicago Press, Chicago, IL
65. Wigner EP (1945) Effect of small perturbations on pile period. Manhattan Project Report CPG-3048
66. Xiang Y, Mishra S (1997) Probabilistic multiphase flow modelling using the limit-state method. Groundwater 35(5):820
67. Yosida K (1971) Functional analysis. Springer, Berlin
68. Zou X et al. (1993) An adjoint sensitivity study of blocking in a two-layer isentropic model. Mon Wea Rev 121:2833

Professor Dan Gabriel Cacuci received his degrees (M.Sc. in 1973, M.Phil. in 1977, and Ph.D. in Applied Physics and Nuclear Engineering, in 1978) from Columbia University. He worked at Oak Ridge National Laboratory, becoming Senior Section Head overseeing activities in several high-profile projects, including the Strategic Defense Initiative and the Advanced Neutron Source, while actively conducting theoretical and experimental research on a variety of topics including sensitivity and uncertainty analysis of large-scale nonlinear systems. In 1988, Dr. Cacuci joined the Department of Chemical and Nuclear Engineering, University of California, Santa Barbara, as a Full Professor. He continued his research and teaching activities in the Department of Nuclear Engineering, University of Illinois, Urbana-Champaign, where he was a Full Professor from 1990 to 1993.
In 1992, he moved to the University of Karlsruhe, Germany, as Ordinarius Chair Professor and Director, a lifetime appointment, of the Institute for Nuclear Technology and Reactor Safety, while concurrently serving as director of the Institute for Reactor Safety at the National Research Center in Karlsruhe (FZK). Over the years, Professor Cacuci held faculty positions in various leading universities, including Distinguished Professor of Engineering and Applied Science at the University of Virginia, Visiting Professor in the Department of Nuclear Engineering and Radiological Sciences, University of Michigan, Visiting Professor at the Royal Institute of Technology, Stockholm, and Adjunct Professor of Nuclear Engineering at the University of California, Berkeley, graduating 51 Ph.D. students in Germany. In 2004, Dr. Cacuci resigned his position as Institute Director at FZK to become the Scientific Director for the Nuclear Energy Sector (Division) of France’s Commissariat à l’Énergie Atomique. He was elected as Fellow of the American Nuclear Society (ANS) in 1986, and has been Editor, since 1984, of the Society’s leading international research journal, Nuclear Science and Engineering. Included among the awards and recognitions received by Dr. Cacuci are four honorary doctorates, the Alexander von Humboldt Prize for Senior Scholars, Germany, 1990; the Science Prize and title of Honorary Member of the Romanian Cultural Foundation, Bucharest, Romania, 1995; the A. A. Harms International Award, 1998; the Glenn Seaborg Medal of the ANS, 2000; and the Eugene P. Wigner Reactor Physics Award of the ANS, 2001. In 1998, he won the Department of Energy’s prestigious E. O. Lawrence Memorial Award. He is an honorary member of the Romanian Academy and European Academy of Sciences and Arts. He has published three books and ca. 180 refereed publications in international journals and conferences.

Chapter 7
Criticality Safety Methods

G.E. Whitesides, R.M. Westfall, and C.M. Hopper

G.E. Whitesides, R.M. Westfall, and C.M. Hopper, Oak Ridge National Laboratory, e-mail: whitesidesge@ornl.gov; westfallrm@ornl.gov; hoppercm@ornl.gov
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_7

7.1 Introduction

7.1.1 Overview

The objective of this chapter is to examine the history of nuclear criticality safety calculations. To this end, we will review the history of criticality safety concerns and look at the various approaches in dealing with this particular type of calculation. The criticality safety methods that are the subject of this chapter are those that are used to determine the criticality safety of the handling, transportation, and storage of fissile materials outside nuclear reactors. The object of criticality safety studies is defined in American National Standards Institute, Inc./American Nuclear Society Standard ANSI/ANS-8.1–1998 for Nuclear Criticality Safety in Operations with Fissionable Materials Outside Reactors, specifically:

4.1.2 Process Analysis. Before a new operation with fissionable material is begun, or before an existing operation is changed, it shall be determined that the entire process will be subcritical under both normal and credible abnormal conditions.

4.2.5 Subcritical Limit. Where applicable data are available, subcritical limits shall be established on bases derived from experiments, with adequate allowance for uncertainties in the data. In the absence of directly applicable experimental measurements, the limits may be derived from calculations made by a method shown by comparison with experimental data to be valid in accordance with 4.3. (i.e., 4.3 Validation of a Calculational Method).

7.1.2 Historical Background

While we normally think of criticality occurrences as events that have taken place since the 1940s, we now know that one occurred many years ago. The first known
“criticality” took place 1,800 million years ago in the Republic of Gabon [1]; hence, the issue of criticality outside a nuclear reactor has been with us for a long time. From the perspective of personnel safety, we presumably will never know the consequences of this occurrence. The most interesting aspect of the Oklo criticality is that the material has remained in place, without migration, for a very long time.

7.2 Nuclear Criticality Safety: The Early Years

7.2.1 The First Criticality Concerns

In modern history, criticality safety studies originated in the early 1940s. In determining whether a nuclear weapon could be built, the US government decided on three approaches to obtaining fissile material: electromagnetic separation of U-235, conversion of U-238 into Pu-239 by irradiation, and separation by gaseous diffusion. In the first approach, a batch-processing method, the amount of fissile material was relatively small and could not pose a criticality safety problem. In the second, the starting material was natural uranium in solid form, which was converted into plutonium in small batches, far below critical quantities. Thus, it raised no concerns. However, the last approach was a continuous-process gaseous diffusion method that did cause concerns about a criticality accident. Because the material initially was in the form of natural uranium, no criticality concerns would normally exist. However, because the quantity of material was large and the process was intended to continuously enrich the material in U-235 content, a question arose that gave birth to modern nuclear criticality safety concerns.

This slightly humorous story about one of the first criticality safety concerns was recalled by Edward Teller in a criticality safety experiments conference at Lawrence Livermore National Laboratory in 1969 (with kind permission of the Lawrence Livermore National Laboratory) [2].
Since the total uranium quantity in the gaseous diffusion plant was large and, hence, the U-235 inventory significant, Dr. Teller was given the task of evaluating the criticality safety concerns. At one meeting to address safety concerns, as Teller recalls, one of those present asked a question. Since uranium was in gaseous form and uranium atoms could move around freely, would it be possible for the U-238 atoms in the U-238–U-235 mixture to collect, by chance, in one end of a tube and the U-235 atoms to collect in the other end in sufficient quantity to achieve criticality? To this Teller responded positively and added that, though the possibility existed, its probability was exactly the same as that of all the oxygen atoms in the room where they were sitting congregating under the table and all of them dying due to lack of oxygen. Thus, the first criticality accident concerns were allayed without the need to resort to calculations. This was, of course, before the time of highly enriched U-235 in the gaseous diffusion plants.

7.2.2 Early Attempts at Criticality Safety Computations

Needless to say, it was not long before sufficient fissile material was produced that real concerns about criticality safety did indeed arise. The early pioneers had only rudimentary computational tools available, and most computations were made with either one or two energy groups and with cross sections that had been determined from the relatively few cross-section measurements that had been made at that time. The concerns centered primarily on two physical forms: fissile materials in liquids and solids. The latter was mainly in metal form since that was the material required for building a nuclear weapon. In the liquid form, the thermal-energy cross sections were most important; in the metal form, the high-energy cross sections were dominant.
Hence, these one- or two-energy-group calculations, made on a mechanical calculator, provided the tools that initiated the field of nuclear criticality safety calculations.

The field of nuclear criticality safety calculations has always been intimately bound to criticality experiments. As the weapons development program progressed, experimental criticality facilities were developed, first at Los Alamos and then at Oak Ridge. The first critical experiments for demonstrating the safety of enrichment plant design were conducted at Los Alamos by Oak Ridgers, including Clifford Beck, Dixon Callihan, and Raymond Murray. These early experiments were intended to establish critical mass values for various forms of fissile materials, with little attention paid to the needs for validating criticality safety calculations. This emphasis, as will be discussed later, changed slowly over the years as the need to validate calculations became the major thrust of the fissile materials critical experiments programs.

The one- and two-group calculations coupled with critical experimental data provided the information to guide the criticality safety specialist for a substantial period in the early nuclear program. The major criticality accidents during this time occurred in the critical experiments facilities and not in other fissile-handling operations. Most of the accidents in operations occurred in the 1950s, with accidents occurring at both Oak Ridge (the Y-12 Plant) and Los Alamos (TA-55) in 1958.

7.3 Criticality Safety Versus Reactor Design Calculations

7.3.1 Computational Requirements for Reactor Design

During the 1950s, the emphasis in the nuclear industry shifted to developing nuclear reactors. During this period, the first electronic computers appeared on the scene. This combination provided the environment for the development of multigroup diffusion theory computer programs and the first capabilities to perform neutronic calculations for multiregion systems.
The need for a computational tool for reactors, at that time, was quite adequately met by these codes.

7.3.2 Special Requirements for Criticality Safety Calculations

While the development of these new multigroup diffusion computation programs was hailed as a major development for the design of nuclear reactors, those involved in nuclear criticality calculations found that this was only a partial solution to their need for a computational tool. Most reactor designs under consideration at that time relied on thermal fission for their principal reaction. Furthermore, the systems were designed with a minimum of leakage and a minimum of materials with large absorption cross sections. These are precisely the conditions for which the diffusion theory approximation is applicable, that is, large single units with low leakage, no voids, and low neutron absorption. On the other hand, the criticality safety practitioners were seeking to prevent criticality and to design safe systems. They relied on minimizing moderators, increasing leakage, and using materials with large absorption cross sections. It was clear that better computational tools were needed.

This need resulted in the development of a number of computational tools. We will briefly describe each method and then provide more specific information regarding each tool and its use in criticality safety calculations. The specific time periods for the development of each of these methods were intertwined, and the development of each in some way was dependent on the other methods.

7.3.3 Early Computational Tools for Criticality Safety Calculations

The methods discussed here are the diffusion theory method, the solid angle method, the surface density method, and the density analog method. These are later followed by the Sn, or discrete ordinates method, the Monte Carlo method, and the $N(B_N)^2$ method.
The diffusion theory method was clearly the first of the computational methods and found extensive use in the development of nuclear reactors. Its usefulness was greatest for systems that met the conditions of the limits that arose from the approximation to the Boltzmann transport equation, which gave rise to the diffusion approximation. From a criticality safety viewpoint, this allowed computation for well-moderated fissile material in a single unit. This was a good first computational step, but clearly something better was needed. Transformations of the diffusion theory, as supported by critical data, from the “four-factor formula” permitted treatments for neutron leakage due to geometric effects (buckling, including neutron extrapolation), reflector effects (reflector savings), and neutron nonleakage probability effects (Fermi age, migration area) as solved with a one- or two-energy-group “five-factor formula” or “six-factor formula,” respectively. Even with its limitations, modifications of diffusion theory provided the basis for the early development of other approximate methods such as the solid angle, surface density, and density analog methods for determining the criticality safety of arrays of fissile materials. These methods were developed to take the output from the diffusion calculation for a single fissile unit and use this information to determine the multiplication factor for an array of units. While the methods were not necessarily rigorous, they were born of necessity and nurtured by ingenuity. A complete mathematical description of these various approximate methods can be found in an article authored by Douglas Hunt [3].

The first was the solid angle method. In this method, the leakage from the single unit was input into a calculation that computed the solid angle between each unit in an array and all other units in the array.
The solid angle method was first proposed by criticality methods staff at the Oak Ridge Gaseous Diffusion Plant to determine the safe spacing of process equipment at the plant. The original idea was a very simple method, based on the optical view of leaking neutrons from a unit. The solid angle of a unit was that angle subtended by all visible surrounding units. By using this information, an approximate keff of the array could be determined.

The second was the surface density method. In this method, the fissile material in an array was assumed to have somehow flowed downward into a slab. An empirical model was developed to determine whether the array was safe, based on the knowledge of the infinite critical slab thickness for the material involved.

The third was the density analog method, which was based on the premise that if the densities of a just critical system are increased to x times their initial value and all the linear dimensions are reduced to 1/x times their initial value, the system will remain critical. This premise supported a theory that allowed single-unit data for a material to be used to evaluate array criticality.

At a later time, a method called the $N(B_N)^2$ method was developed by the melding of diffusion theory with interaction potential for limiting surface densities. This method treated an array of fissile units from both the array leakage and the multiunit coupling perspectives.

All of these methods had the same goal: to take output from a one-dimensional calculation for a given material and determine whether an array of units containing the same material would be subcritical.

7.4 Role of the Sn, or Discrete Ordinates Method

The Sn, or discrete ordinates method, arose from development work being performed by Bengt Carlson and his colleagues at Los Alamos in the mid-1950s.
As the name implies, this method involves a direct solution of the integro-differential transport equation by spatial differencing and angular quadrature approximate representations. This work is more completely described in Chapter 1.

7.4.1 Impetus for the Early Sn Method Development

Clearly, this work had its first applications in nuclear weapons design, but it quickly became clear that this was a very useful tool for criticality safety calculations. The method appeared to provide exactly what the criticality safety specialists needed. It had the capability to analyze systems with the properties of a nuclear reactor as well as those with properties often applied in safety design to prevent criticality. This computational tool was able to perform calculations in one dimension for spheres, infinitely long cylinders, and planes infinite in two dimensions. The combination of the Sn method with neutron cross sections, which had been provided by Hansen and Roach [4], became the standard computational tool for criticality safety calculations.

In the late 1950s, a two-dimensional version of the Sn method was developed. In practice, the two-dimensional Sn method provides capability for the computation of certain simple arrays through the combined use of reflecting and true-surface boundary conditions or for a set of individual cylinders stacked along the same axis. However, the more general applications would require another method.

7.4.2 Later Sn Method Development

Subsequent to the early Sn method development at Los Alamos, several newer Sn codes were developed at both Los Alamos and Oak Ridge. These codes have found extensive use for criticality safety calculations as their capabilities have been extended to include many additional geometrical and cross-section-handling capabilities.
Of particular importance were the additions of flux anisotropy and anisotropic scattering, which can strongly influence neutron transport, leakage, and coupling between systems.

7.5 Role of the Monte Carlo Method

During this same time frame, the Monte Carlo method, which had previously been used by the weapon designers at Los Alamos, began to be examined as the basis for calculations by criticality safety specialists. Development of this method was centered in Los Alamos, Oak Ridge, and the UK. The Monte Carlo method involves a solution of the integral form of the transport equation with various neutron tracking schemes for treating the spatial variable and statistical sampling of the neutron kinematics. As discussed below, the energy variable is treated in either a continuous or piecewise continuous (multigroup) fashion. For the first time, this method allowed the evaluation of the criticality safety of any system for which the mathematical equations defining the geometry could be written. The major drawback to this method at that time was the difficulty in using the available computer programs, primarily because of the complexity in input specifications and interpretation of results.

7.5.1 Early Monte Carlo Calculational Methods

The two most notable early codes employing the now widely used Monte Carlo methods were the O5R code (from Oak Ridge) and the GEM code (from the UK). These two codes had unique capabilities, which are worth examining.

Of the two, the O5R code was the first truly general Monte Carlo neutron transport code. It was developed for application on the ORACLE computer, an early Oak Ridge National Laboratory mainframe based on cathode-ray-tube memory technology. It was designed to accommodate pseudo point neutron cross sections and could be used to deal with any geometry for which second-order equations could be written for the surfaces.
While this was a noble and energetic effort, the tremendous generality of the code made it almost impossible to use for complicated geometrical arrangements of material. Moreover, the cross-section data were of such volume and complexity that they were nearly impossible to validate. Hence, the generality of O5R in treating the geometry and neutron energy became its greatest burden.

The GEM code from the UK was designed specifically for nuclear criticality calculations and had the unique capability of performing calculations of regular arrays of fissile materials as long as the units were made up of nested spheres, cylinders, or rectangular parallelepipeds. The cross-section capability of the code was almost as general as that for O5R. However, in practice, the cross-section data that accompanied the code consisted of a more limited and simplified description of the cross sections.

The unique feature of the GEM code was the tracking method. The developer of the code incorporated an interesting concept. For the purpose of computation, the neutrons, $N_1$, were initially started at the “reflector/core interface.” The neutrons were directed initially into the core and followed until they were either absorbed or returned across this interface. The number of neutrons returned was defined as $N_2$. At this point, a value $M$ was calculated as $N_2/N_1$. The $N_2$ neutrons were then tracked into the reflector until they leaked, were absorbed, or returned across the reflector–core interface. The number returned into the core was defined as $N_3$. From this value, $R$ was defined as $N_3/N_2$. From this calculation a quantity defined as $MR$ ($M \times R$) was determined as being related to keff. When $MR$ is equal to 1.0, it is exactly equal to the value for keff. While this code was easy to use and provided useful information, it was not easy to interpret the results.

No written documentation exists that details the reasoning that went into the tracking concept built into GEM.
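The bookkeeping behind the MR quantity described above can be sketched in a few lines of Python. This is only an illustration of the ratios defined in the text, not a reconstruction of GEM's tracking scheme; the function name `gem_mr` and the tallies in the usage lines are hypothetical.

```python
def gem_mr(n1, n2, n3):
    """Return (M, R, MR) from the three interface tallies described in the text.

    n1: neutrons started at the reflector/core interface, directed into the core
    n2: neutrons that came back across the interface from the core
    n3: of those n2 neutrons, the number the reflector returned into the core
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive to form the ratios")
    m = n2 / n1   # core factor, M = N2/N1
    r = n3 / n2   # reflector factor, R = N3/N2
    return m, r, m * r


# Hypothetical tallies, chosen only to illustrate the MR = 1.0 case.
m, r, mr = gem_mr(1000, 1250, 1000)
print(f"M = {m:.3f}, R = {r:.3f}, MR = {mr:.3f}")
```

Algebraically, MR reduces to N3/N1, the number of neutrons re-entering the core per neutron started at the interface.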
The first author of this chapter had the opportunity to listen to GEM’s author, Ed Woodcock, as he described the method to Robert Coveyou of Oak Ridge National Laboratory. Woodcock believed that source convergence would occur more quickly on a surface than in a volume. Hence, he indicated that this was the basis for the tracking method used in GEM. Coveyou countered by using a Green’s function argument to prove that convergence on a surface would take exactly the same computational time as that for the volume. At the end of the discussion, Woodcock agreed with Coveyou. When the UK staff produced a successor code to GEM, they abandoned the MR method.

7.6 Critical Experiments, Benchmarks, and Validation

With the advent of rigorous computational methods, it became clear that many of the critical experiments that had been performed in the past were not sufficiently defined for use in the validation of these new methods. This finding resulted in a rush to provide these data. From the late 1950s through the early 1970s, critical experiments facilities at Hanford, Los Alamos, Oak Ridge, and Rocky Flats in the USA and several facilities in Europe performed most of the experiments on which today’s calculations rely for validation.

The need for validation became such an important part of the criticality safety calculation field that the need for an American National Standard became clear. To meet this need, a representative group of individuals working under the auspices of the American Nuclear Society’s Subcommittee 8 produced a standard titled “Validation of Calculational Methods for Nuclear Criticality Safety,” ANS-8.11, ANSI N16.9–1975, that has subsequently been incorporated into ANSI/ANS-8.1–1998.
To ensure that the experimental data are properly documented for use as benchmark data, the US Department of Energy initiated an Evaluated Criticality Safety Benchmarks project to collect, document, and determine which experiments were qualified to be used as benchmark data. Subsequently, this project became an international effort under the auspices of the Organisation for Economic Cooperation and Development’s Nuclear Energy Agency, which has published data for literally hundreds of experiments that qualify as benchmark data.

Unfortunately, many of the world’s critical experiments facilities no longer operate. Obtaining the required new experimental data appears very unlikely at the moment. This scenario has the potential to limit the range of applications for validation, thus creating a need for larger uncertainty allowances in evaluating the results of criticality safety calculations for several new applications.

7.7 Evaluation of the Various Methods and Their Role in Current Criticality Safety Calculations

Each of the methods discussed in this chapter can still play an important role in a total criticality safety computational study.

7.7.1 Role of the Sn Method

The Sn method has played a very important role in nuclear criticality safety. This method allowed computation for almost any situation encountered (except for arrays of fissile units), although some simple array systems can be treated. The Sn method provided the capability to handle systems with voids, strongly anisotropic scattering, and the presence of strong absorbers. As a matter of solution technique, this method provides the neutron flux at every mesh point, in addition to the value of keff. Furthermore, it can provide an adjoint solution that can be of significant value in understanding the importance of various regions as they contribute to the keff of a system.
Another advantage of the Sn method was the speed with which calculations could be performed, particularly in one-dimensional systems. While these data were often of direct value to the criticality safety specialist, more importantly, they provided input to other methods (e.g., solid angle, surface density, and density analog) in evaluating the criticality of arrays of fissile materials. The latter methods were very simple calculations that made it easy to perform many array parameter studies at small computational cost and could often be performed on a calculator.

7.7.2 Evaluation of the Role of the Sn Method

Today we have a number of computer programs that use the Sn method. Both the Los Alamos and Oak Ridge National Laboratories have developed one-, two-, and three-dimensional codes with the capability of computing the keff of fissile systems. The great advantage of this method is its ability to produce the neutron flux and fission density at every mesh point as well as to compute the keff of the system. Furthermore, the keff of a system, as well as the change in keff resulting from small changes in the material or geometry, can be computed much more precisely than with the Monte Carlo method. The Sn method is particularly useful when performing parameter studies, especially if the system can be described in one dimension. Often great insight into the neutronic behavior of a system can be obtained for little expenditure of personnel and computer time.

7.7.3 Role of the Monte Carlo Method

As discussed earlier, O5R and GEM were early codes that introduced many of us to the Monte Carlo technique. Even though these two codes were very different in technique and in actual application, each played a key role in the development of the Monte Carlo method. The O5R code was perhaps years ahead of its time in that it provided many capabilities that have been incorporated into later codes.
Because the O5R code had been developed initially for use in shielding calculations for which the geometrical descriptions were relatively simple, the main problem was its generality, particularly the great difficulty in generating the input to describe the geometry.

The GEM code, on the other hand, provided a very easy-to-use technique for describing geometries that were likely to be encountered in criticality safety calculations. In fact, the sole purpose in developing the code was to provide the capability to determine the criticality of single fissile units as well as arrays of fissile units. GEM’s main problem was the fact that the code did not yield the keff of a system directly. Furthermore, since the technique required that a core–reflector interface be defined, many users seemed to have difficulty with the concept when a reflector was not present. The fact that an actual reflector was not required was a very confusing concept for those who did not fully understand the technique. On the other hand, the primary advantage of GEM was its geometry package, as well as its reasonably good set of neutron cross sections.

The O5R and GEM codes provided the basis for the development of most of the Monte Carlo criticality computational tools on which we rely today.

7.7.4 Evaluation of the Monte Carlo Method

Today we have a number of codes that use the Monte Carlo method to determine the criticality of fissile systems. The more widely used codes were developed at Los Alamos and Oak Ridge National Laboratories and in the UK. The great strength of the Monte Carlo method is its ability to determine the criticality for almost any geometry using a variety of cross-section treatment methods. Moreover, the dimensionality of a system as well as the method of cross-section treatment has little effect on the total computational time. Thus, Monte Carlo is the method of choice for the most difficult problems.
The main weakness of the Monte Carlo method centers on its statistical limitations. While the statistical uncertainty can be reduced to almost any given value, the cost grows as the square of the reduction factor: that is, to reduce the uncertainty by a factor of 10 requires 100 times more computation time. While the neutron flux can be determined in a Monte Carlo computation, the uncertainty in the flux in any small region of the system is usually too large for most practical uses.

The remaining weakness of the Monte Carlo method stems from its great geometrical generality. Because it is so easy to model systems, there is a tendency to expand the system to include many loosely coupled features, some of which are not important to integral quantities, such as the multiplication factor. This can, and often does, result in undersampling, which can cause the keff of the system to be underestimated. This concept has been discussed widely since it was first discovered by Whitesides [5]. Although numerous attempts have been made to find a general solution to the problem, the following cautions remain the best advice: be aware of the problem and use a large number of histories per generation and a large number of generations.

7.7.5 Summary of the Monte Carlo Criticality Safety Software

Beginning with O5R, a number of general-purpose Monte Carlo transport codes have been developed.
These are listed below, along with the parent organization and the unique and/or significant features of each code:

UNCLE SAM code: United Nuclear (MAGI); Combinatorial Geometry
MORSE code: Oak Ridge National Laboratory; Multigroup, Biasing Techniques, Anisotropic Scattering
VIM: Argonne National Laboratory; Benchmarking of ZPPR Assemblies, Improved Unresolved Resonance Shielding, Source Convergence Studies
COG: Lawrence Livermore National Laboratory; Advanced Geometry Modeling, CAD Setup and Display

Additionally, both Bettis Atomic Power Laboratory and Knolls Atomic Power Laboratory have sophisticated Monte Carlo codes that are utilized in reactor design and fuel exposure analysis.

Some Monte Carlo codes have been developed that are specific to criticality safety analysis. The KENO program at Oak Ridge National Laboratory and the MONK computer program at Winfrith, UK, have undergone extensive modification and enhancement over three decades. Geometry capabilities have been enhanced to include more complex region boundaries, automated array specifications, and nested arrays. The kinematics has been enhanced to be fully compatible with modern nuclear data specifications. The MCNP program at Los Alamos National Laboratory has been enhanced for criticality applications by the inclusion of neutron thermal scatter, an automated array feature, and improved unresolved resonance shielding.

7.7.6 The $NB_N^2$ Method

The $NB_N^2$ approximate method was briefly mentioned earlier. The method combines the density analog and surface density methods and diffusion theory geometric buckling concepts into a single picture of array criticality, resulting in a limiting surface density, $\sigma(m)$, in g/cm$^2$. The method expresses the geometric buckling of the external boundary of an array in terms of the number of orthogonally arranged air-spaced units (i.e., $N = n_x n_y n_z$) and a relating parameter, $\sigma(m)$, for various fissionable materials and reflector conditions.
The relating parameter is determined from the critical mass of a single bare unit of the fissionable material and the mass of units in a reflected array of $N$ units. Numerous transformations of the basic relations among the values for $NB_N^2$ and $\sigma(m)$ have been developed that permit high degrees of reliability in the prediction and equating of interpolated and extrapolated values for critical and subcritical arrays that are reflected with various material thicknesses and that may have units of similar and dissimilar fissionable materials and unit spacing. The basic parameters of the $NB_N^2$ method are determined either from experiment or by Monte Carlo calculation for particular fissionable material compositions and reflector conditions. From these data, it is possible to determine the array criticality or subcriticality for a particular fissionable material composition for an array of nearly any size. This method has been useful in evaluating a large number of array combinations with substantially reduced computing costs.

7.8 The Role of Cross-Section Representation

The two options for cross-section representation are the multigroup and the pointwise models. It is important to understand these options and to be able to evaluate their use.

7.8.1 Multigroup Cross Sections

Because of the discrete representation of the neutron energy dependence through the group-wise transfer array in the Sn codes, the multigroup cross-section model gained great acceptance. As a result, significant effort was expended in providing multigroup cross sections that were benchmarked against much of the available experimental data. As discussed earlier, the O5R code had a very general cross-section representation, which could represent the cross sections to almost any detail desired. This very general representation was one of the factors that kept the method from being used effectively for criticality safety calculations.
While the data were very detailed, they had not been adequately evaluated and benchmarked for criticality calculations. On the other hand, the multigroup cross sections, particularly those provided by Hansen and Roach, had been effectively evaluated and benchmarked for criticality for use in the Sn method. This provided the incentive for the development of a Monte Carlo computer program that used these cross sections. The effective use of multigroup cross sections in the Sn method led to their introduction into Monte Carlo methods being developed in the early 1960s. Through the development of a computer program with the geometrical capabilities of a code such as GEM, along with the use of the multigroup cross sections that had been used in the Sn method, a very useful computational tool resulted. The effectiveness of these new Monte Carlo methods using multigroup cross sections was the major driver in pushing the Monte Carlo method to the forefront of criticality safety calculations. Furthermore, the large number of evaluated multigroup cross sections, which have been evaluated against a very large number of critical experiments, continues to provide an incentive for their use. A further incentive for the use of multigroup cross sections is the ability to directly compare the results of Sn method calculations with those produced by multigroup Monte Carlo calculations. Finally, there is an additional incentive in that the multigroup approximation greatly facilitates the calculation of the adjoint flux for use in perturbation analyses and sensitivity/uncertainty (S/U) calculations.

7.8.2 Point-Wise Cross Sections

As time evolved, it was inevitable that the use of point-wise cross sections in the Monte Carlo method would be implemented. After all, the capability exists and, in actuality, the cross sections do indeed change at every possible energy point.
Several codes that effectively use point-wise cross sections have now been developed and are in widespread use. As the cross sections are evaluated and validated against critical-experiment data, the dependence on these codes will grow. The major concern that still remains in the use of point-wise cross sections is the possibility of undersampling. The geometrical undersampling concern raised earlier has a direct counterpart in the energy domain when point-wise cross sections are used. For the energy-sampling of system-integrated quantities, such as keff, the undersampling tends to occur in less important energy-reaction type domains, which mitigates the potentially adverse effect on results. The advice is the same: be aware of the problem and follow many neutron histories. Unfortunately, undersampling of the cross sections might not be so obvious to the casual user. The use of point-wise cross sections will always be required when performing calculations in which iron, nickel, or chromium has significant effects on neutron transport through core and/or reflector regions. These materials have scattering resonances with major gaps and overlap for which there is no means of treatment via multigroup representations of the cross sections.

7.9 Elements of a Complete Nuclear Criticality Safety Computational Tool Set

In order to provide the most effective evaluation of problems involving criticality safety, it is necessary to implement a variety of computational tools and cross sections.

7.9.1 Cross-section Selection

The choice of cross-section model will depend on several factors: the materials involved, the computational tools to be used, and the benchmark data available. While some versions of the Sn method that use point-wise cross sections have been developed, these versions are not in widespread use. Hence, if the Sn method is used, then the only available option is the multigroup cross-section representation.
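The concern about resonance gaps noted in Section 7.8.2 can be illustrated with a small numerical sketch. The example below is not taken from the chapter: it assumes a made-up two-level cross section within one energy group (a high value over most of the group and a deep minimum, or "window", over a small fraction), a flat weighting spectrum, and uncollided transmission through a slab only. All names and numbers are hypothetical, and production multigroup libraries use more careful weighting than this toy model.

```python
import math

def group_average(sigmas, weights):
    """Flux-weighted average of pointwise cross sections (weights sum to 1)."""
    return sum(s * w for s, w in zip(sigmas, weights))

def transmission_pointwise(sigmas, weights, thickness):
    """Uncollided transmission, averaged over the pointwise energy bins."""
    return sum(w * math.exp(-s * thickness) for s, w in zip(sigmas, weights))

def transmission_multigroup(sigmas, weights, thickness):
    """Uncollided transmission using one group-averaged cross section."""
    return math.exp(-group_average(sigmas, weights) * thickness)

# Hypothetical macroscopic cross sections (1/cm): 95% of the group at 1.0,
# and a 5% wide "window" at 0.05. These values are illustrative assumptions.
sigmas = [1.0, 0.05]
weights = [0.95, 0.05]

for thickness in (1.0, 10.0, 30.0):
    tp = transmission_pointwise(sigmas, weights, thickness)
    tm = transmission_multigroup(sigmas, weights, thickness)
    print(f"{thickness:5.1f} cm  pointwise={tp:.3e}  multigroup={tm:.3e}  ratio={tp / tm:.2e}")
```

In this toy model the two estimates agree closely for a thin slab, while for thick slabs the group-averaged value falls far below the pointwise one, because neutrons streaming through the window dominate the transmission.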
If one plans to use both the Sn and the Monte Carlo methods, it may be desirable to use the same cross-section set, which, for practical purposes, requires the multigroup model. This is particularly important if direct comparison of the results is required. If the Monte Carlo method is the only method to be used in an analysis, then point-wise cross sections can be an attractive alternative. In practice, point-wise cross sections eliminate most of the problems that result from inadequate cross-section representation that can occur in multigroup cross-section formulation. As mentioned earlier, if there are materials with scattering resonances present, particularly when the neutron transmission through thick regions of these materials is important, it is imperative that point-wise cross sections be used.

7.9.2 Using All Available Tools to Ensure Economical and Accurate Computations

Although the advances in computer speed and memory have led to attractive and powerful computational methods, incentives still exist to look at all alternatives in evaluating a criticality safety issue. In the analysis of a specific situation, the use of several of the computational methods, along with simple hand methods, can add significantly to the validity of a computational evaluation of a criticality safety problem.

7.10 The Future

The future of nuclear criticality safety calculations lies in the continued development of new geometrical packages that minimize data input effort, the acquisition of better cross-section data, the acquisition of additional critical experiments data, and the development of sensitivity methods for evaluating the uncertainties of the computed results. The last of these issues, which has been addressed in Chapter 6, is the subject of intensive effort in the criticality safety computational methods community that is involved in the development of S/U methods.
Development of S/U methods is needed to quantify the applicability of experiment benchmark computational validations to criticality safety computational evaluations and to determine appropriately safe margins of subcriticality accounting for the availability and applicability of critical experiment benchmarks, neutron cross-section data, and their associated uncertainties. The need for S/U methods development is increasing due to the changing emphases of the nuclear industry (e.g., decontamination, waste processing/disposal, spent nuclear reactor fuel burnup credit, new mixed oxide fuel compositions) that are confronting nuclear criticality safety specialists, particularly in a scientific environment of declining activity of nuclear criticality safety experimentation and cross-section measurements/evaluation.

7.11 Summary

To the best of our knowledge, no criticality accident has ever resulted from an erroneous computation. However, this record should not be a cause for complacency and no safety analysis should ever be made solely on the basis of the use of one computational method. While we have become accustomed to the use of the Sn, and now more particularly the Monte Carlo, methods for much of our safety analysis, we need to be aware of the importance of the older computational tools. While the methods such as solid angle, surface density, and density analog may not produce the accurate results to which we are accustomed from our more rigorous methods, they do provide valuable insights into the characteristics of a problem and help provide bounding conditions. The use of the Sn method to evaluate bounding conditions can be extremely valuable when one uses the Monte Carlo method as the primary tool in an analysis. For large single units satisfying the constraints of the neutron diffusion theory, diffusion theory codes can be very useful in accelerating the space-energy solution for the neutron flux.
On the other hand, the use of a one- or two-dimensional Monte Carlo calculation to validate whether the proper angular and spatial mesh has been used in an Sn calculation can also be very useful.

As we have demonstrated, the mathematics of criticality safety calculations is precisely the same as that required for rigorous reactor design calculations. From the neutronics perspective, the important differences are in the materials present and the geometrical configuration. It is precisely these differences that cause a divergence in the computational tools required. The reactor designer has a fairly rigid physical arrangement of fissile material and moderator and is interested in very precise values of keff. His concerns are with relatively small variations of the materials present and how the value of keff varies as a function of the material changes. With the much broader variation of the materials present – along with introduction of voids and strongly absorbing materials, compounded by the consideration of system upset conditions – criticality safety calculations present a very different challenge. It is this challenge that has provided the basis for the development of computational tools specifically for use in criticality safety calculations.

References

1. Neuilly M, Bussac J, Frejacques C, Nief G, Vendryes G, Yvon J (1972) C R Acad Sci Paris 275D:1847–1849
2. Proceedings of the Livermore Array Symposium, Conf-680909, 23–25 Sept 1968
3. Douglas CH (1976) A review of criticality safety models used in evaluating arrays of fissile materials. Nucl Technol 30(2):138–165
4. Six and sixteen group cross sections for fast and intermediate critical assemblies, LAMS-2543. Los Alamos National Laboratory, December 1961
5. Whitesides GE (1971) A difficulty in computing the keff of the world. Trans Am Nucl Soc 14:680

G. Elliott Whitesides graduated from North Carolina State University with a B.S.
in Nuclear Engineering in 1960. Upon graduation he accepted a position with the nuclear facilities at Oak Ridge operated by Union Carbide for the (then) Atomic Energy Commission. One of his first assignments was to implement the Sn computer programs that had been developed at Los Alamos for the solution of criticality safety problems that existed at the Oak Ridge facilities. In 1963, he was named to head the Nuclear Computer Programs Group that was charged with the task of developing computer programs for nuclear criticality safety, shielding, and neutronic cross-section processing. In this capacity he initiated the development of the DOT and ANISN Sn programs with an emphasis on solving neutron and gamma-ray shielding problems. He then turned his personal attention to developing the first neutronically rigorous Monte Carlo program designed especially for the solution of nuclear criticality problems. This program, KENO, was developed in conjunction with the Critical Experiments Facility activities at Oak Ridge. This development was especially important because it provided the ability to compute systems with arrays of fissile materials, a problem which had been particularly vexing to the safety of many operations involving fissile materials. The organization he headed also produced the SCALE code system that integrated the shielding, criticality safety, and cross-section computer programs that had been developed at Oak Ridge. This system has become a major computational tool for the nuclear industry. He has had an active role with the American Nuclear Society, having served on the Executive Board of the Mathematics and Computation Division, and on the Board of Directors, as vice chairman, and then as chairman of the Nuclear Criticality Safety Division. In 1980, he was elected as Fellow of the American Nuclear Society. He has been active in the Standards activities of the American Nuclear Society.
He has served on the ANS-8 Standards Subcommittee since 1975, and as chairman of the working groups that produced two standards entitled, “Validation of Calculational Methods for Nuclear Criticality Safety” and “Criticality Safety Criteria for the Handling, Storage, and Transportation of LWR Fuel Outside Reactors.” In 1981, he was promoted to site manager for Computing at Oak Ridge National Laboratory and subsequently director of the Computing Applications Division. He has maintained a strong international interest by serving as the chairman of the Criticality Calculations Working Group at the Organization for Economic Cooperation and Development’s Nuclear Energy Agency (OECD-NEA) in Paris from 1980 to 1996. He also was a principal factor in initiating and promoting the series of international criticality conferences now known as ICNC.

Robert Michael (Mike) Westfall completed his Ph.D. in nuclear engineering at the University of Virginia in 1974. His dissertation research was the development of a highly precise transport solution for the Radially Reflected Critical Cylinder. From 1963 to 1973 he worked on methods development for processing nuclear data for reactor design at the NASA-Lewis Research Center. In 1973, he joined the Oak Ridge National Laboratory. An early ORNL project was the development of the ROLAIDS software, which treats resonance overlap in multiregion geometries. From 1976 to 1980, he initiated the development of the SCALE system. In 1980–1981, Westfall served as a guest scientist with the Service for Reactor Studies and Applied Mathematics at CEN-Saclay, where he worked on resonance processing methods for the APOLLO code system. From 1981 to 2001, he led the Nuclear Engineering Applications Section at ORNL. In the 1980s, he led the technical support efforts for the nuclear criticality safety (NCS) qualification of TMI-2 recovery and the initial DOE/NCS burnup credit studies.
In the 1990s, he served on technical support groups for DOE responses to Defense Nuclear Facility Safety Board (DNFSB) Recommendations 93-2 and 97-2. Westfall currently serves as a member of the DOE Criticality Safety Support Group formed as part of the DOE response to DNFSB Rec. 97-2, as well as the DOE Nuclear Data Advisory Group. Regarding international consensus standards, he is a vice chair of the Nuclear Technical Advisory Group, which administers US participation in ISO Technical Committee 85, “Nuclear Technology.” For domestic standards, Westfall serves as a member of ANS/N-16, the consensus committee for NCS standards, as well as being a member of the ANS Standards Board. Also, since 1977, he has served as a technical expert on the working group for ANSI/ANS-8.15, “Nuclear Criticality Control of Special Actinide Elements.” In 1994, Westfall was named a Fellow of the American Nuclear Society with the citation: “For innovation in the development and application of advanced problem-solving methodologies in the nuclear field with major contributions to nuclear safety.” He served as technical program chair of the 1985 Mathematics and Computation (M&C) Division topical meeting “Advances in Nuclear Engineering Computational Methods” in Knoxville. He chaired the M&C Division in 1987–1988. He also served as general chair of the 1993 Nuclear Criticality Safety Division topical meeting “Physics and Methods in Criticality Safety” in Nashville. In his current position as the ORNL Manager of the DOE Nuclear Criticality Safety Program, with major responsibilities for the Analytical Methods and Nuclear Data work elements, he has had technical leadership roles on a DOE-wide basis.

Calvin Mitchell Hopper, a distinguished senior development engineer at Oak Ridge National Laboratory (ORNL), graduated from Southern Colorado State College with a B.S. in engineering physics.
Between 1968 and 1970, he provided radiation protection services to the Oak Ridge Critical Experiments Facility and the Oak Ridge Y-12 Development Division. Between 1970 and 1978, he provided nuclear criticality safety (NCS) engineering evaluations and analysis for the Oak Ridge Y-12 Plant and K-25 Gaseous Diffusion Plant. Between 1978 and 1980, he was the manager of Nuclear Material Licensing, Nuclear Material Accountability, and Nuclear Safety (health physics and nuclear criticality safety) at the Texas Instruments, Inc. – HFIR Project, resulting in TI receiving their first comprehensive special nuclear material license from the US Nuclear Regulatory Commission (NRC). In 1980, he returned to the Oak Ridge Y-12 Plant as the Technical Manager of the Health Physics Department, responsible for developing a corporate-wide external dosimetry service and managing the internal and external dosimetry programs at Y-12. In 1982, he became the department head for the Y-12 Nuclear Criticality Safety Department. In 1984, he transferred to ORNL to provide Nuclear Criticality Safety Officer services and to develop an ORNL NCS organization, while holding an interim position as ORNL Nuclear Criticality Safety Section Head. In 1995, he engaged the US NRC to support the seminal application of first-order linear-perturbation-theory sensitivity and uncertainty analysis at ORNL for NCS engineering applications and critical experiment design and evaluations. Between 1997 and 2007, he was the principal investigator of the US Department of Energy Nuclear Criticality Safety Program for Applicable Ranges of Bounding Curves and Data that led to major neutronics code enhancements at ORNL.
He has been a member of the American Nuclear Society (ANS) since 1970, serving in all offices of the Nuclear Criticality Safety Division, is the Chair of the ANSI/ANS Consensus Committee N16 on Criticality Safety, was a chair and is a member of various standards Working Groups, and is a member of the ANS Standards Board. Internationally, he has been the convener (chair) of Working Group 8 on Criticality Safety within ISO TC85/SC5 since 1998. He has participated in OECD-NEA Expert Groups on MOX critical experiment needs and reference critical values and is a contributor and peer-reviewer for the OECD-NEA International Handbook of Evaluated Criticality Safety Benchmark Experiments. His latest publications have focused on the use and benefits of sensitivity and uncertainty analyses as applied to NCS problems and critical experiment design.

Chapter 8
Nuclear Reactor Kinetics: 1934–1999 and Beyond

Jack Dorning

8.1 Introduction

When one of the editors of this book, Professor Yousry Azmy, asked me to give a lecture on the development of reactor kinetics during the twentieth century and the future direction of research in this area in the twenty-first century, my reaction was, “Wow! Review the developments of a century of reactor kinetics. That’s a lot to cover.” But then, upon reflection on the fact that the discipline of reactor kinetics, and reactor physics in general, did not even exist until the 1930s, I realized that I did not have to review a whole century of development, but rather, a mere two thirds of the century! Still a somewhat daunting task! Very soon after beginning that task, I became somewhat intrigued, one might even say obsessed, with the question of the precise origins of the equations of reactor kinetics – specifically the precise origins, i.e., the first appearance in the literature, of the time-dependent neutron diffusion equation and of the point reactor kinetics equations.
Thus, this lecture and chapter begin with the historical origins of the equations of reactor kinetics – more precisely, the results of my best efforts to uncover them.

J. Dorning, Le Carlina Lodge, Biarritz, France; e-mail: dorning@virginia.edu. Permanent address: University of Virginia, Charlottesville, VA, USA.
Y. Azmy and E. Sartori, Nuclear Computational Science: A Century in Review, © Springer Science+Business Media B.V. 2010, DOI 10.1007/978-90-481-3411-3_8

The lecture and chapter are composed of seven sections. Following this introduction, the development of nuclear reactor kinetics during the twentieth century, starting with its historical origins in the 1930s and 1940s, is reviewed in a prologue which comprises Section 8.2. The introduction of the time-dependent neutron diffusion equation and the point reactor kinetics equations for reactor analysis is chronicled, and the roles played by Enrico Fermi and Eugene Wigner in these events are discussed. Subsequent derivations of more general point reactor kinetics equations during the 1950s, 1960s, and 1970s are summarized in Section 8.3. Then, during a digression that evolved into Section 8.4, the theory of pulsed neutron experiments, which was developed primarily in the 1960s and 1970s and which played an important role in the development of reactor physics, is described with emphasis on the resolution of apparent contradictions between theory and experiment and some very interesting connections between several mathematical subtleties and initially perplexing experimental observations. Next, in Section 8.5, the evolution of elementary and advanced numerical methods for the solution of space-time reactor kinetics problems, from the advent of large-scale mainframe computers to the present, is discussed. Then, in Section 8.6, a short summary of some special topics in reactor dynamics is given.
Finally, the chapter ends with a closing section, or epilogue, which presents some thoughts about the possible directions of research and development in the future – a short prologue on reactor kinetics and reactor dynamics in the twenty-first century.

8.2 Prologue: The Historical Origins of the Equations of Reactor Kinetics

8.2.1 The Time-Dependent Neutron Diffusion Equation

To find the first use in the literature of the neutron diffusion equation to describe the motion of neutrons in a nonmultiplying or subcritical system, I originally thought that it would only be necessary to look quickly at a few standard textbooks on reactor physics, at least some of which, I had assumed, would cite the original reference. However, after checking a few of the elementary reactor physics and reactor theory books currently in use [1, 2] and even some of the vintage books on the subject [3–6] that I used as a student, I was very surprised and disappointed to find no reference to the original introduction of the subsequently so widely used neutron diffusion equation. Further efforts expended trying to locate a citation of the seminal publication in many, many textbooks on reactor physics [7, 8], neutron transport theory [9–11], and reactor kinetics and reactor dynamics [12–15] – many of which are not cited here – led to the same disappointing result. Even after consulting with some of the most “mature” members of our nuclear engineering and neutron physics communities, who provided a wellspring of information on the history of these disciplines, no path to the elusive reference was opened. After a fair effort searching through the post-World War II literature of declassified articles led to no success in my quest, I turned to two obvious references. The first was Enrico Fermi – Collected Papers [16, 17], with which I was already very familiar from my Ph.D.
dissertation days because of the many important papers on neutron thermalization published by Fermi and members of his youthful group at the Istituto Fisico dell’Università di Roma in the 1930s. The other was The Collected Works of Eugene Paul Wigner [18]. All I needed, plus lots more of course, was in these seven volumes – the two-volume Fermi collection plus the five-volume Wigner set.

The first published article that I was able to find in which the neutron diffusion equation appeared was by Edoardo Amaldi and Enrico Fermi. (Actually, I was “rooting” for Fermi from the outset, since he made so many important contributions to neutron physics, reactor physics, and reactor theory via his pioneering experiments, his keen physical insight and his profound theoretical ideas – and to so, so many other areas of physics.) The article, a letter (in Italian), “Sull’Assorbimento dei Neutroni Lenti – III” (“On the Absorption of Slow Neutrons – III”), appeared in Ricerca Scientifica in 1936 [19]. It is also reprinted in Italian in Enrico Fermi – Collected Papers Volume I, Italy 1921–1938 as article number 114 on pages 823–826. It is noteworthy that this letter opens with a reference to two letters published in 1935 (also in Ricerca Scientifica) in which they discussed the existence of groups of slow neutrons with different absorption and diffusion properties. The first section of this letter [19], entitled “I. Diffusione dei singoli gruppi di neutroni lenti” (“I. Diffusion of the individual groups of slow neutrons”), begins with “Già nella lettera precedente abbiamo accennato alle differenze tra le proprietà di diffusione dei neutroni dei gruppi A e C” (“Already in the preceding letter we mentioned the differences between the diffusion properties of the neutrons of the groups A and C”).
It continues after a few lines with “Nell’ipotesi che i neutroni del gruppo C nella paraffina obbediscano alle solite leggi della diffusione, ed abbiano in più la possibilità di essere distrutti con vita media $\tau$, la loro densità $n$ soddisfa all’equazione differenziale” (“Under the hypothesis that the group C neutrons in paraffin obey the usual laws of diffusion, and have in addition the possibility of being destroyed with mean lifetime $\tau$, their density $n$ satisfies the differential equation”)

$$\frac{dn}{dt} = D\,\Delta n - \frac{n}{\tau},$$

“dove $D$ è il coefficiente di diffusione dato da $\frac{1}{3}\lambda\bar{v}$; essendo $\lambda$ il cammino libero medio e $\bar{v}$ la velocità media” (“where $D$ is the diffusion coefficient given by $\frac{1}{3}\lambda\bar{v}$; $\lambda$ being the mean free path and $\bar{v}$ the average velocity”).

Finally, there is an earlier letter published in Ricerca Scientifica in 1934 by E. Fermi, E. Amaldi, B. Pontecorvo, F. Rasetti, and E. Segrè [20] in which there is a detailed explanation of the laboratory observations they made, given in terms of a diffusion model of neutron migration in water or paraffin. An English translation by Emilio Segrè of that letter appears on page 761 ff as paper number 105b in Enrico Fermi – Collected Papers Volume I, Italy 1921–1938 [16], and the last major paragraph states the following:

A possible explanation of these facts seems to be the following: neutrons rapidly lose their energy by repeated collisions with hydrogen nuclei. It is plausible that the neutron-proton collision cross section increases for decreasing energy and one may expect that after some collisions the neutrons move in a manner similar to that of the molecules of a diffusing gas, eventually reaching the energy corresponding to thermal agitation. One would form in this way something similar to a solution of neutrons in water or paraffin, surrounding the neutron source.
The concentration of this solution at each point depends on the intensity of the source, on the geometrical conditions of the diffusion process and on possible neutron-capture processes due to hydrogen or other nuclei present.

A longer article was published in Italian by Fermi in Ricerca Scientifica in 1936 [21]; in it he discussed, in some detail, neutron slowing-down and neutron diffusion in hydrogenous materials. On the penultimate page of the 40-page article, which is reprinted in his collected works [16] as article number 119a on pages 943–983, he points out that “given the considerable number of collisions a thermal neutron undergoes before being captured, it is clear that its motion can be accurately described as diffusive motion, and that one can easily write a differential equation for the neutron density $n$

$$\frac{\partial n}{\partial t} = \frac{\lambda v}{3}\,\Delta n - \frac{1}{\tau}\,n + q,$$

where $q$ is the number of thermal neutrons produced per unit volume and time from the slowing down of fast neutrons and the first term on the right-hand side represents the effect of diffusion.” He goes on to emphasize the importance of the stationary case and explicitly includes the reduction of the above equation to the steady-state diffusion equation for thermal neutrons.

Clearly, the young leader – referred to as the “Pope” by his colleagues because of his theoretical prowess – of the youthful group at the Istituto Fisico dell’Università di Roma, and the members of that group – known as the “Via Panisperna Group” – understood neutron slowing-down and neutron diffusion as a theoretical description of neutron motion and used the neutron diffusion equation to explain the results of their pioneering experiments at least as early as 1934 [20] through 1936 [19, 21]. So, it seems, based on the results of my limited search efforts, that the time-dependent (and steady-state) neutron diffusion equation originated with Enrico Fermi and his colleagues.
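For reference, the stationary case mentioned above follows by setting the time derivative to zero in Fermi's 1936 equation, $\partial n/\partial t = (\lambda v/3)\,\Delta n - n/\tau + q$. The short derivation below is a sketch in that notation; the diffusion length $L$ is a symbol introduced here for illustration and is an assumption, not part of the quoted passage.

```latex
\begin{align}
0 &= \frac{\lambda v}{3}\,\Delta n - \frac{n}{\tau} + q
  && \text{(stationary case, } \partial n/\partial t = 0\text{)} \\
\Delta n - \frac{n}{L^{2}} + \frac{3q}{\lambda v} &= 0,
  \qquad L^{2} = \frac{\lambda v \tau}{3}
  && \text{(divide by } \lambda v/3\text{)}
\end{align}
```

Written this way, the absorption term sets a single length scale over which the thermal-neutron density relaxes away from the region where the source $q$ acts.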
Of course, it was a small modification to add the prompt and delayed neutron fission production terms later to arrive at the neutron diffusion equation so widely used even today in reactor physics in general, and in reactor kinetics and reactor dynamics in particular. A photo of Enrico Fermi, taken during those “early days” in Rome and reprinted from his collected works, appears on the following page.

8.2.2 The Point Reactor Kinetics Model

The point reactor kinetics equations, of course, can be derived easily from the coupled time-dependent diffusion equation and delayed neutron precursor equations, as will be done in Section 8.3. However, as in the case of the time-dependent neutron diffusion equation, I thought it would be appropriate to include in this lecture and the book chapter a citation of the article in which the equations, in this case the “point reactor kinetics model” or point reactor kinetics equations, first appeared. Again, as in the case of the time-dependent neutron diffusion equation, a simple perusal of the standard textbooks on reactor physics [1–8], reactor kinetics and reactor dynamics [12–15], and neutron transport theory [9–11] failed to reveal a reference that was cited as the article or report in which the point kinetics equations first appeared. Thus, I again turned to the collected works of two of the greatest men and women of physics in the twentieth century and pioneers of neutron physics and reactor physics, Eugene Paul Wigner and Enrico Fermi. In The Collected Works of Eugene Paul Wigner, Part A, The Scientific Papers, Vol. V [18], annotations are provided by Alvin M. Weinberg on the part entitled “Macroscopic Reactor Theory.” Among those annotations, Weinberg wrote the following: In paper 34 Wigner derives the usual kinetic equations, in which the delayed neutrons dominate the variations of power following a small reactivity change.
Similar results had been obtained by Fermi, by Wheeler and Ibser, and a little later, by J. Schwinger (who spent one month in Chicago in 1943 working with Wigner’s group) [18]. Paper number 34 beginning on page 520 is “On Variations of the Power Output in a Running Pile,” by E. P. Wigner, November 11, 1942. Wigner begins that paper with the statement, “The importance of the delayed neutrons for the steady operation of the pile has been early recognized by Allison, Fermi, and Szilard (Conferences on the Power Plant, February 1942).” He goes on, in the second paragraph, to introduce Enrico Fermi during the “early days” in the 1930s in Rome. (Photographer unknown.) the point kinetics equations with the statement, “If we denote the number of neutrons by n, by sj the number of radioactive nuclei which emit a delayed neutron of the kind j , we have the equations sPj D Aj sj C fj .n C S / X X X nP D Aj sj C n ke fj :00 fj C S 1 Although the notation differs slightly from that which is currently in use, these equations are easily recognized as the point kinetics equations, so familiar to even beginning students of reactor kinetics. In this paper, which was originally Chicago Project Report CP-351 [22], Wigner uses the solutions to these equations to discuss the initially rapid and subsequently more gradual increase in neutron density during the transient that follows the introduction of a small increase in reactivity into an initially critical system. A 1958 photo of Eugene Wigner, copied from his collected works, appears on the following page. On his lap in this photo, very appropriately, 380 J. Dorning is a copy of the classic textbook, The Physical Theory of Neutron Chain Reactors, by A. M. Weinberg and E. P. Wigner [8]. 
The references to the earlier work of Fermi (and others) in both Wigner’s paper and Weinberg’s annotation, although not specific, led me to a slightly earlier Chicago Project Report CP-291 (Notes on Lecture of October 7, 1942) by Enrico Fermi entitled “Problem of Time Dependence of the Reaction Rate: Effect of Delayed Neutrons Emission” [23]. In the first four pages of those lecture notes, Fermi first estimates the “time to change the number of neutrons by a factor e” following an increase of $k_{\mathrm{eff}}$ from 1.000 to 1.001 as 1 s – based on the assumption that all neutrons are prompt neutrons – in Part A. Simple Theory Neglecting Delayed Emission of Neutrons. Then, in Part B. Simple Theory Including Delayed Emission of Neutrons, he introduces $\tau$, the “normal time of one generation;” $T$, the “time of one generation of delayed neutrons;” $n$, “the number of neutrons present in the reacting mass;” $c$, “the number of existing radioactive atoms which will decay to give delayed neutrons (c stands for “credits”);” $dn/dt$, the “rate of change of the number of neutrons;” and $p = 1\%$, the “fraction of neutrons which produce delayed neutrons.” He then adds “$c$ atoms decay with lifetime $T$ to give $c/T$ new neutrons per second.” “All neutrons are absorbed after an average lifetime $\tau$ at the rate $n/\tau$ neutrons per second.” “Each absorbed neutron forms $k$ new neutrons. $k$ includes emission of both credit neutrons and instantaneous neutrons. These statements may be expressed by the equations (1) and (2).
[Figure: Eugene Paul Wigner with a copy of his classic textbook on reactor theory on his lap (with kind permission of Springer Science + Business Media)]

$$\frac{dn}{dt} = \frac{c}{T} - \frac{n}{\tau} + k(1-p)\,\frac{n}{\tau}, \tag{1}$$

where $k(1-p)$ neutrons produce instantaneous neutrons,

$$\frac{dc}{dt} = kp\,\frac{n}{\tau} - \frac{c}{T}, \tag{2}$$

where $kp\,n/\tau$ = new credits formed per second (and) $c/T$ = credits lost per second from radioactive decay.” These equations, of course, are the point kinetics equations written in almost the same notation we use today – although, interestingly, the “c” denoted “credits” in Fermi’s description whereas it subsequently was taken to indicate the “concentration” of the precursors of the various delayed neutron groups, as it is to this day. (Although Fermi lumped the delayed neutrons into one effective delayed neutron group in these equations, earlier in these notes he mentions that the “appreciable time lag,” after which the delayed neutrons are emitted, is “described by a complicated law which depends on about 3 lifetimes.”) After developing the exact solutions to the above point kinetics equations for a step change in $k_{\mathrm{eff}}$, Fermi introduces the simple jump approximation to generate even simpler solutions and uses those to arrive at the conclusion that “The relaxation time for $k = 1.001$ from this theory is 1990 s, while if the delayed neutrons are neglected, simple theory gives the quite different value of 1 s.” Thus, recognizing the significance of the delayed neutrons, he used the point kinetics equations (in the notes of October 7, 1942) to show that a graphite/natural-uranium pile could be controlled – less than 2 months prior to the historic event on December 2, 1942, during which he and his colleagues achieved mankind’s first controlled neutron chain reaction in the graphite/natural-uranium pile in a squash court under the west stands of Stagg Field at the University of Chicago. (The pile had been moved from Columbia University in New York City earlier that year.)
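Fermi’s equations (1) and (2) lend themselves to a quick numerical check. The sketch below is not part of Fermi’s notes; it solves the characteristic equation of (1) and (2) for the growing root and compares the resulting e-folding time with the prompt-only estimate. The function names are hypothetical, $\tau = 10^{-3}$ s is inferred from the quoted prompt-only estimate of 1 s for $k = 1.001$, $p = 0.01$ is the value quoted above, and $T = 10$ s is purely an assumed illustrative lifetime, since the value Fermi used is not quoted here.

```python
import math


def efolding_time_prompt_only(k, tau):
    """e-folding time when every neutron is prompt: dn/dt = (k - 1) n / tau."""
    return tau / (k - 1.0)


def efolding_time_with_credits(k, tau, T, p):
    """e-folding time of the growing mode of equations (1) and (2), for k > 1.

    Substituting n, c ~ exp(w t) gives w**2 + B*w - q = 0 with
    B = 1/T - (k*(1 - p) - 1)/tau and q = (k - 1)/(tau*T).
    """
    a = (k * (1.0 - p) - 1.0) / tau
    B = 1.0 / T - a
    q = (k - 1.0) / (tau * T)
    # Positive root written in a cancellation-free form.
    w = 2.0 * q / (B + math.sqrt(B * B + 4.0 * q))
    return 1.0 / w


if __name__ == "__main__":
    k, tau, p = 1.001, 1.0e-3, 0.01   # tau inferred, p as quoted
    T = 10.0                          # assumed value, for illustration only
    print("prompt only :", efolding_time_prompt_only(k, tau), "s")
    print("with credits:", efolding_time_with_credits(k, tau, T, p), "s")
```

With these assumed inputs the delayed neutrons lengthen the e-folding time by nearly two orders of magnitude; the exact figure depends entirely on the assumed $T$.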
It is clear that in anticipation of the historic event of December 2, 1942, interest in the neutron kinetics of a critical, or near critical, pile was very high in the experimental and theoretical physics groups at Chicago led, respectively, by Fermi and Wigner. No doubt, the related point kinetics equations were floating around in the air there, and were well-known to the few people in those groups. In fact, although the first appearances of these equations that I was able to find were in the reports by Fermi and by Wigner discussed above, statements by Wigner in the opening paragraph of his paper suggest that they appeared in earlier reports. He begins his report with “The importance of the delayed neutrons for the steady operation of the pile has been early recognized by Allison, Fermi, and Szilard (Conferences on the Power Plant, February 1942),” and his third sentence states, “The solution of the equations which give the power output of the pile as a function of time has been given by Ibser, Manley and Wheeler (C-65) in their discussion of the effect of a sudden neutron burst.” Unfortunately, I was not able to obtain any information on Conferences on the Power Plant, February 1942 or on C-65. However, it is clear that these two giants of twentieth century physics were among the first to introduce and use the point kinetics equations.

In fact, Fermi and Wigner’s seminal contributions to the antecedents of our field – neutron physics and reactor physics – although extensive and profound, were but tiny fractions – in both scope and importance – of their numerous brilliant contributions in so many diverse areas of physics. To generate partial evidence of this fact instantly, one need only make a mental list of the enormous number of terms developed in twentieth century physics that are based on their names – especially Fermi’s name.
In our field: the Fermi four factor formula, Fermi age theory, the Fermi pseudo-potential – in slow-neutron scattering [24], the Fermi synthetic slowing-down kernel [25, 26], etc. (There also is a Wigner synthetic slowing-down kernel. [25, 26]) and in other areas of physics: the Fermi solid, the Fermi sea, the Fermi surface, the Fermi distribution, Fermi–Dirac statistics, Fermions, and the list goes on and on! Enrico Fermi probably has more phenomena and concepts named after him than any other physicist. And well he should! A picture of the still fairly young Fermi appears on the opposite page. It is from his collected works and was taken at Los Alamos in 1946, a short 4 years after he led the Chicago group in achieving the first controlled nuclear chain reaction and only 10 years after “the early days” in Rome during which he and the young members of his group performed the pioneering neutron physics experiments (and developed the related seminal theories) that led to his Nobel Prize in 1938 – “For the transmutation of the elements by means of slow moving neutrons,” according to his obituary which appeared in the New York Herald Tribune on November 28, 1954. 8.3 The Point Reactor Kinetics Equations 8.3.1 The Basics: From the One-Group Diffusion Equation with Delayed Neutrons for a Bare Homogeneous Reactor to the Point Reactor Kinetics Equations Although Fermi (and others), no doubt, originally developed the point kinetics equations from simple physical arguments based on neutron balances for the overall assembly, the standard “classroom” derivation of these equations at an introductory level starts from the one-speed neutron diffusion equation for a bare homogeneous reactor, with delayed neutrons, and the coupled delayed neutron precursor concentration equations. And that will be the course followed here. 
The time-dependent one-group neutron diffusion equation with delayed neutrons for neutron migration in a bare homogeneous reactor, written for the space- and time-dependent neutron number density $N(\mathbf{r},t)$, is

$$\frac{\partial}{\partial t} N(\mathbf{r},t) = vD\nabla^2 N - v\Sigma_a N + (1-\beta)\,k_\infty v\Sigma_a N + \sum_{i=1}^{I} \lambda_i C_i + S(\mathbf{r},t), \tag{8.1}$$

with boundary conditions $N(\tilde{\mathbf{r}},t) = 0$ at the extrapolated boundary $\tilde{\mathbf{r}}$, and the coupled equations for the space- and time-dependent delayed-neutron precursor concentrations (Fermi’s “credits” [23]) $C_i(\mathbf{r},t)$, $i = 1, \dots, I$, are

$$\frac{\partial}{\partial t} C_i(\mathbf{r},t) = \beta_i k_\infty v\Sigma_a N - \lambda_i C_i, \quad i = 1, \dots, I, \tag{8.2}$$

where $I$ is the effective number of delayed-neutron groups, and the other notation is standard [1–15].

[Figure: Enrico Fermi at Los Alamos, NM after World War II in 1946. (Photographer unknown.)]

In many developments, the fixed source term $S(\mathbf{r},t)$ is taken as zero, and the neutron number density $N(\mathbf{r},t)$ and the precursor concentrations are factored at this point into a space-dependent factor – the fundamental “buckling” mode – and a time-dependent factor $n(t)$, often interpreted as the total number of neutrons in the reactor, and $c_i(t)$, $i = 1, \dots, I$, interpreted as the total number of the $i$-th group delayed-neutron precursors in the reactor. However, it is slightly more satisfactory simply to separate variables as is so frequently done in solving partial differential equations (PDEs) like the neutron diffusion equation. It follows from the separation

$$\begin{pmatrix} N(\mathbf{r},t) \\ C_1(\mathbf{r},t) \\ \vdots \\ C_I(\mathbf{r},t) \end{pmatrix} = \begin{pmatrix} f(\mathbf{r})\,n(t) \\ g_1(\mathbf{r})\,c_1(t) \\ \vdots \\ g_I(\mathbf{r})\,c_I(t) \end{pmatrix}, \tag{8.3}$$

via the precursor equations, Eq. 8.2, that the $g_i(\mathbf{r})$, $i = 1, \dots, I$, are proportional to $f(\mathbf{r})$. Hence, each $g_i(\mathbf{r})$ can be written as a constant multiple of $f(\mathbf{r})$. Further, since these proportionality constants can be incorporated into the definitions of the $c_i(t)$, the ratios $g_i(\mathbf{r})/f(\mathbf{r})$ become unity, and the precursor equations reduce to
$$\dot{c}_i(t) = \beta_i k_\infty v\Sigma_a\, n(t) - \lambda_i c_i(t), \quad i = 1, \dots, I, \tag{8.4}$$

and Eq. 8.1, the diffusion equation, becomes

$$\frac{1}{vD}\frac{\dot{n}(t)}{n(t)} - \frac{1}{vD}\left[(1-\beta)k_\infty - 1\right] v\Sigma_a - \frac{1}{vD}\sum_{i=1}^{I} \lambda_i \frac{c_i(t)}{n(t)} = \frac{\nabla^2 f(\mathbf{r})}{f(\mathbf{r})} = -B^2, \tag{8.5}$$

in this special case of separation of variables (special case because there are no derivatives with respect to $\mathbf{r}$ in the precursor equations), and $-B^2$ is the usual separation constant. The equation for the time-dependent function $n(t)$ follows, as usual, from setting the time-dependent terms – the terms that precede the first equal sign – equal to $-B^2$, which after some very elementary manipulations becomes

$$\dot{n}(t) = \frac{\rho - \beta}{\ell}\, n(t) + \sum_{i=1}^{I} \lambda_i c_i(t), \tag{8.6}$$

where $\rho \equiv (k_e - 1)/k_e$, of course, is the reactivity, and $\ell \equiv \ell_0/k_e$ is the neutron generation time, both of which appear as a result of the following two definitions: (1) the neutron lifetime $\ell_0 \equiv \ell_\infty/(1 + L^2B^2)$, where $\ell_\infty \equiv 1/(v\Sigma_a)$ is the infinite-medium neutron lifetime, and $1/(1 + L^2B^2)$ is the thermal neutron non-leakage probability written in terms of the diffusion length $L = \sqrt{D/\Sigma_a}$ and the separation constant $-B^2$; and (2) the effective multiplication constant $k_e \equiv k_\infty/(1 + L^2B^2)$. In terms of these parameters, Eq. 8.4 becomes

$$\dot{c}_i(t) = \frac{\beta_i}{\ell}\, n(t) - \lambda_i c_i(t), \quad i = 1, \dots, I. \tag{8.7}$$

The corresponding equation obtained by setting the space-dependent terms in Eq. 8.5 – the terms between the two equal signs in that equation – equal to the separation constant $-B^2$ is the familiar Helmholtz equation

$$\nabla^2 f(\mathbf{r}) + B^2 f(\mathbf{r}) = 0, \tag{8.8}$$

with boundary conditions $f(\tilde{\mathbf{r}}) = 0$, which of course follow from the boundary conditions on $N(\mathbf{r},t)$ stated immediately below Eq. 8.1. Of course, this linear homogeneous PDE for $f(\mathbf{r})$ with homogeneous boundary conditions is an eigenvalue problem, with the eigenvalues being the eigenvalues of the Laplacian $B^2 = B_n^2$, $n = 0, 1, 2, \dots$, and the corresponding eigenfunctions are $\varphi_n(\mathbf{r})$, $n = 0, 1, 2, \dots$.
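The definitions that follow Eq. 8.6 can be made concrete with a short numerical sketch. The one below evaluates $k_e$, $\rho$, $\ell_0$ and $\ell$ for a bare slab, for which the Laplacian eigenvalues with zero boundary conditions at the extrapolated faces are $B_n = (n+1)\pi/a$. The one-group constants, the slab geometry and the function names are hypothetical choices made only for illustration; they are not taken from the chapter.

```python
import math


def kinetics_parameters(k_inf, D, sigma_a, v, B2):
    """Return (k_e, rho, l0, gen_time) from the definitions after Eq. 8.6."""
    L2 = D / sigma_a                       # diffusion area L^2
    non_leakage = 1.0 / (1.0 + L2 * B2)    # non-leakage probability
    l_inf = 1.0 / (v * sigma_a)            # infinite-medium neutron lifetime
    l0 = l_inf * non_leakage               # neutron lifetime
    k_e = k_inf * non_leakage              # effective multiplication constant
    rho = (k_e - 1.0) / k_e                # reactivity
    gen_time = l0 / k_e                    # neutron generation time
    return k_e, rho, l0, gen_time


# Hypothetical one-group constants in cm and s units (illustration only).
K_INF, D, SIGMA_A, V = 1.05, 0.9, 0.02, 2.2e5

# Buckling that makes k_e = 1, and the slab width whose fundamental mode has it.
B2_CRITICAL = (K_INF - 1.0) * SIGMA_A / D
WIDTH = math.pi / math.sqrt(B2_CRITICAL)


def slab_mode_reactivities(n_modes):
    """Reactivity rho_n of the first n_modes buckling modes of the slab."""
    rhos = []
    for n in range(n_modes):
        B2_n = ((n + 1) * math.pi / WIDTH) ** 2
        rhos.append(kinetics_parameters(K_INF, D, SIGMA_A, V, B2_n)[1])
    return rhos


if __name__ == "__main__":
    print(kinetics_parameters(K_INF, D, SIGMA_A, V, B2_CRITICAL))
    print(slab_mode_reactivities(3))
```

Because $\ell = \ell_0/k_e = \ell_\infty/k_\infty$, the generation time in this one-group model does not depend on the buckling, whereas the reactivity of each higher mode is increasingly negative.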
According to long-standing tradition in reactor physics, the eigenvalues $B_n^2$ and eigenfunctions $\varphi_n(\mathbf{r})$ of Eq. 8.8 are referred to, respectively, as “bucklings” and “buckling modes,” with the fundamental eigenvalue being the buckling. The explicit expressions for the $B_n$ and $\varphi_n(\mathbf{r})$, of course, depend upon the specific geometry of the reactor. The solution to the initial-value problem given by the linear PDEs, Eqs. 8.1 and 8.2, thus is

$$N(\mathbf{r},t) = \sum_{n=0}^{\infty} n_n(t)\,\varphi_n(\mathbf{r}), \tag{8.9}$$

and

$$C_i(\mathbf{r},t) = \sum_{n=0}^{\infty} c_{i,n}(t)\,\varphi_n(\mathbf{r}), \quad i = 1, \dots, I. \tag{8.10}$$

Here, the $n_n(t)$ and the $c_{i,n}(t)$ are determined from the time-dependent equations, Eqs. 8.6 and 8.7, with the parameters defined below Eq. 8.6 in terms of $B$ now written in terms of the $B_n$, $n = 1, 2, \dots$, and therefore having index $n$ (i.e., $(\rho, k_e, \ell, \ell_0)$ become $(\rho_n, k_{e,n}, \ell_n, \ell_{0,n})$), as do the unknowns $n_n(t)$ and $c_{i,n}(t)$. Thus, Eqs. 8.6 and 8.7 become

$$\dot{n}_n(t) = \frac{\rho_n - \beta}{\ell_n}\, n_n(t) + \sum_{i=1}^{I} \lambda_i c_{i,n}(t), \quad n = 0, 1, 2, \dots, \tag{8.11}$$

and

$$\dot{c}_{i,n}(t) = \frac{\beta_i}{\ell_n}\, n_n(t) - \lambda_i c_{i,n}(t), \quad i = 1, \dots, I, \quad n = 0, 1, 2, \dots. \tag{8.12}$$

This set of coupled linear ordinary differential equations (ODEs) with constant coefficients is solved for each value of the index $n$ using standard procedures developed in an introductory undergraduate ODEs course to arrive at exponential solutions,

$$n_n(t) = \sum_{j=0}^{I} A_{n,j}\, e^{\alpha_{n,j} t}, \quad n = 0, 1, 2, \dots, \tag{8.13}$$

and

$$c_{i,n}(t) = \sum_{j=0}^{I} A_{i,n,j}\, e^{\alpha_{n,j} t}, \quad i = 1, \dots, I, \quad n = 0, 1, 2, \dots, \tag{8.14}$$

where the $\alpha_{n,j}$, $n = 0, 1, 2, \dots$; $j = 0, 1, \dots, I$, are determined from the characteristic equation that arises in the solution to the ODEs,

$$\begin{vmatrix} \dfrac{\rho_n - \beta}{\ell_n} - \alpha_n & \lambda_1 & \lambda_2 & \cdots & \lambda_I \\ \dfrac{\beta_1}{\ell_n} & -(\lambda_1 + \alpha_n) & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ \dfrac{\beta_I}{\ell_n} & 0 & 0 & \cdots & -(\lambda_I + \alpha_n) \end{vmatrix} = 0, \tag{8.15}$$

the $A_{i,n,j}$, $i = 1, \dots, I$, are written in terms of the $A_{n,j}$ using Eqs. 8.11 and 8.12, and the $A_{n,j}$ are determined from the initial conditions – not previously mentioned
here – associated with Eqs. 8.1 and 8.2, $N(\mathbf{r},0)$ and $C_i(\mathbf{r},0)$, using the orthogonality properties of the eigenfunctions $\varphi_n(\mathbf{r})$, $n = 0, 1, 2, \dots$, of the Laplacian. When a reactor is near critical – the case in which the point kinetics equations normally are used – the $\alpha_{n,j}$, $n = 1, 2, \dots$; $j = 0, 1, \dots, I$, are negative and much smaller (more negative) than the $\alpha_{0,j}$, $j = 0, 1, \dots, I$; hence, the higher spatial modes $\varphi_n(\mathbf{r})$, $n = 1, 2, \dots$, die away and only the fundamental “buckling” mode $\varphi_0(\mathbf{r})$ remains at long times. Thus, in the development of the point kinetics equations, only this fundamental mode – corresponding to $n = 0$ – is retained, and Eqs. 8.11 and 8.12 are rewritten with $n$ set equal to zero and, in fact, with that zero suppressed, to give the standard point kinetics equations,

$$\dot{n}(t) = \frac{\rho - \beta}{\ell}\, n(t) + \sum_{i=1}^{I} \lambda_i c_i(t), \tag{8.16}$$

$$\dot{c}_i(t) = \frac{\beta_i}{\ell}\, n(t) - \lambda_i c_i(t), \quad i = 1, \dots, I, \tag{8.17}$$

with the spatial distribution or “shape function” given by the fundamental buckling mode solution to Eq. 8.8, $\varphi_0(\mathbf{r})$. The solutions, Eqs. 8.13 and 8.14, become for $n = 0$ (with the zero suppressed)

$$n(t) = \sum_{j=0}^{I} A_j\, e^{\alpha_j t}, \tag{8.18}$$

$$c_i(t) = \sum_{j=0}^{I} A_{i,j}\, e^{\alpha_j t}, \quad i = 1, \dots, I, \tag{8.19}$$

and the characteristic equation for these time-dependent solutions associated with the fundamental spatial mode, Eq. 8.15, becomes

$$\begin{vmatrix} \dfrac{\rho - \beta}{\ell} - \alpha & \lambda_1 & \lambda_2 & \cdots & \lambda_I \\ \dfrac{\beta_1}{\ell} & -(\lambda_1 + \alpha) & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ \dfrac{\beta_I}{\ell} & 0 & 0 & \cdots & -(\lambda_I + \alpha) \end{vmatrix} = 0, \tag{8.20}$$

which is of course, in reactor kinetics, called the “in-hour equation,” for the characteristic roots $\alpha = \alpha_j$, $j = 0, 1, \dots, I$, that explicitly give the exponential solutions in time, $\exp(\alpha_j t)$, $j = 0, 1, \dots, I$, in Eqs. 8.18 and 8.19. If the reactor is slightly supercritical, $\alpha_0$ is positive and $\alpha_j$, $j = 1, \dots, I$, are negative, and if it is subcritical all the $\alpha_j$, $j = 0, 1, \dots, I$, are negative. Finally, the case in which the fixed source $S(\mathbf{r},t)$ in Eq. 8.1 is non-zero should be included.
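The sign pattern of the roots of Eq. 8.20 can be checked numerically. Expanding the determinant gives the equivalent scalar form $\rho = \alpha\ell + \sum_i \beta_i\alpha/(\alpha + \lambda_i)$, whose right-hand side increases monotonically between successive poles $\alpha = -\lambda_i$, so each interval between poles holds exactly one root. The sketch below finds them by bisection. The six-group constants are illustrative textbook-style thermal-fission values and the generation time is hypothetical; neither is taken from this chapter, and the function names are likewise assumptions.

```python
# Illustrative six-group delayed-neutron data (assumed; not from this chapter).
LAMBDAS = [0.0124, 0.0305, 0.111, 0.301, 1.14, 3.01]            # 1/s
BETAS = [0.000215, 0.001424, 0.001274, 0.002568, 0.000748, 0.000273]
GEN_TIME = 1.0e-4                                                # s, hypothetical


def inhour_residual(alpha, rho, gen_time, betas, lambdas):
    """Expanded form of Eq. 8.20, written as g(alpha) - rho."""
    delayed = sum(b * alpha / (alpha + lam) for b, lam in zip(betas, lambdas))
    return alpha * gen_time + delayed - rho


def bisect(f, lo, hi, iterations=200):
    """Plain bisection for an increasing f with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inhour_roots(rho, gen_time, betas, lambdas):
    """Return the I + 1 real roots alpha_j, most negative first."""
    def f(alpha):
        return inhour_residual(alpha, rho, gen_time, betas, lambdas)

    poles = sorted(-lam for lam in lambdas)   # ascending, all negative
    gap = 1.0e-12                             # relative offset from a pole
    roots = []

    # Interval to the left of the most negative pole: widen until f < 0.
    step = 1.0
    while f(poles[0] - step) > 0.0:
        step *= 2.0
    roots.append(bisect(f, poles[0] - step, poles[0] * (1.0 + gap)))

    # One root between each pair of neighbouring poles.
    for left, right in zip(poles, poles[1:]):
        roots.append(bisect(f, left * (1.0 - gap), right * (1.0 + gap)))

    # Interval to the right of the least negative pole: widen until f > 0.
    step = 1.0
    while f(poles[-1] + step) < 0.0:
        step *= 2.0
    roots.append(bisect(f, poles[-1] * (1.0 - gap), poles[-1] + step))
    return roots


if __name__ == "__main__":
    for rho in (0.001, -0.001):
        print(rho, inhour_roots(rho, GEN_TIME, BETAS, LAMBDAS))
```

Running the sketch for a small positive and a small negative reactivity reproduces the pattern stated above: one positive root in the supercritical case and none in the subcritical case.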
To include the fixed source, it is convenient to expand $S(\mathbf{r},t)$, if it is an $L_2$ function in $\mathbf{r}$ and $t$ (i.e., $|S(\mathbf{r},t)|^2$ is integrable over the spatial domain of the reactor and over all time), for practical geometries in the complete set (in $L_2$) of eigenfunctions of Eq. 8.8 – the “buckling” modes,

$$S(\mathbf{r},t) = \sum_{n=0}^{\infty} S_n(t)\,\varphi_n(\mathbf{r}). \tag{8.21}$$

Substituting this expansion for $S(\mathbf{r},t)$ into Eq. 8.1 and also substituting the eigenfunction expansions of $N(\mathbf{r},t)$ and $C_i(\mathbf{r},t)$, $i = 1, \dots, I$ (which also belong to $L_2$ if $S(\mathbf{r},t)$ does) into Eqs. 8.1 and 8.2, forming the inner product (in $\mathbf{r}$) of the adjoint eigenfunctions $\varphi_m^\dagger(\mathbf{r})$ with Eqs. 8.1 and 8.2, and using the bi-orthogonality properties of the adjoint and “forward” eigenfunctions, $(\varphi_m^\dagger(\mathbf{r}), w(\mathbf{r})\varphi_n(\mathbf{r})) = C_n\delta_{nm}$, where $w(\mathbf{r})$ is the appropriate weight function for the reactor geometry and the inner product ( , ) corresponds to integration over the spatial domain of the reactor, leads to Eq. 8.11, with a source term,

$$\dot{n}_n(t) = \frac{\rho_n - \beta}{\ell_n}\, n_n(t) + \sum_{i=1}^{I} \lambda_i c_{i,n}(t) + S_n(t), \quad n = 0, 1, 2, \dots, \tag{8.22}$$

and the coupled equations for the precursor concentrations, Eq. 8.12, which are unchanged. When the system is near critical, as discussed above in this section, only the fundamental spatial eigenfunction, or “buckling” mode, corresponding to $n = 0$ remains at long times, and – under the usual assumption that only this fundamental spatial mode is present in the source term $S(\mathbf{r},t)$ – Eqs. 8.22 and 8.12 become the standard point reactor kinetics equations with a fixed source term:

$$\dot{n}(t) = \frac{\rho - \beta}{\ell}\, n(t) + \sum_{i=1}^{I} \lambda_i c_i(t) + S(t), \tag{8.23}$$

$$\dot{c}_i(t) = \frac{\beta_i}{\ell}\, n(t) - \lambda_i c_i(t), \quad i = 1, \dots, I, \tag{8.24}$$

where Eq. 8.23 replaces Eq. 8.16, and Eq. 8.24 is identical to Eq. 8.17. The solutions to these nonhomogeneous equations (and also to the more general equations, Eqs. 8.22 and 8.12) are given in terms of the complementary solution, Eqs. 8.18 and 8.19 (Eqs.
8.13 and 8.14 in the more general case) plus the particular solution obtained using standard techniques for the solution of elementary ODEs. Of course, this simple and very explicit and concrete development of the point kinetics equations from the one-speed neutron diffusion theory description of neutron migration in a reactor, Eqs. 8.1 and 8.2, is only the beginning of the story. Much more general developments of the point kinetics equations, starting from the energy- (or speed-), angle-, space-, and time-dependent neutron transport theory description have been given. And some of these will be summarized in the next section.

8.3.2 More General Developments of the Point Reactor Kinetics Equations: “Shape Functions,” “Time Functions,” and “Neutron Importance”

More general developments of the point reactor kinetics equations were first reported almost simultaneously by L. N. Ussachoff [27] and A. F. Henry [28], with a more detailed derivation published a few years later by Henry [29]. (A photograph of the late Allan F. Henry, who had a long and distinguished career in reactor kinetics, reactor physics, and reactor computations, appears on the opposite page.) It is this more detailed derivation that we shall summarize here. A more complete, and very nice, development appears in [14]. Although it is possible to begin this derivation from the one-speed neutron diffusion equation, as was done above in Section 8.3.1 for the simple introductory first-course classroom development, it is more general, therefore more useful – and certainly more elegant – to start from the space-, speed- (or energy-), direction-vector- (or angle-), and time-dependent neutron transport (or Boltzmann) equation – as is done in [14].
The transport equation and the coupled delayed-neutron precursor equations are

$$\frac{\partial n(\mathbf{r},v,\hat{\Omega},t)}{\partial t} = L_t n + M_t^0 n + \sum_{i=1}^{I} \frac{1}{4\pi}\chi_i(v)\lambda_i C_i + S(\mathbf{r},v,\hat{\Omega},t), \tag{8.25}$$

$$\frac{\partial}{\partial t}\left[\frac{1}{4\pi}\chi_i(v)\,C_i(\mathbf{r},t)\right] = M_t^i n - \frac{1}{4\pi}\chi_i(v)\lambda_i C_i, \quad i = 1, \dots, I. \tag{8.26}$$

The first step [14, 29] is to decompose the neutron number density $n(\mathbf{r},v,\hat{\Omega},t)$ into a product of a “Shape Function” $\Psi(\mathbf{r},v,\hat{\Omega},t)$ and a “Time Function” $T(t)$,

$$n(\mathbf{r},v,\hat{\Omega},t) = \Psi(\mathbf{r},v,\hat{\Omega},t)\,T(t), \tag{8.27}$$

and substitute this decomposition into the transport equation, Eq. 8.25. It is worthy of comment that at this point no approximation has been made, and this is still completely general because the shape function $\Psi(\mathbf{r},v,\hat{\Omega},t)$ still depends upon time $t$, and therefore this decomposition is not a “separation of variables” – at least not yet. It is useful now to introduce the concept of a “steady-state reference reactor” by which is usually meant the reactor under consideration in its critical state – although this does not have to be the case. The corresponding steady-state Boltzmann equation and adjoint Boltzmann equation for the reference reactor are

$$H_0 N_0(\mathbf{r},v,\hat{\Omega}) = 0, \tag{8.28}$$

and

$$H_0^\dagger N_0^\dagger(\mathbf{r},v,\hat{\Omega}) = 0, \tag{8.29}$$

where $H_t = L_t + M_t$, $H_t^\dagger = L_t^\dagger + M_t^\dagger$, $H_0 = H_{t=0}$, $H_0^\dagger = H_{t=0}^\dagger$, and the dagger ($\dagger$) indicates adjoint operators and adjoint functions.

[Figure: The late Professor Allan F. Henry (courtesy of the Nuclear Science and Engineering Department of The Massachusetts Institute of Technology)]

The time-dependent streaming, removal and inscatter operator $L_t$ above is given by

$$L_t = -\hat{\Omega}\cdot\nabla - \Sigma_T(\mathbf{r},v,t) + \int_0^\infty dv' \int_{4\pi} d\hat{\Omega}'\, \Sigma_s(\mathbf{r},t;\, v' \to v,\, \hat{\Omega}' \to \hat{\Omega}), \tag{8.30}$$

the time-dependent fission neutron production operators $M_t^i$, $i = 0, \dots, I$, for the prompt neutrons ($i = 0$) and the delayed neutrons (in group $i$), $i = 1, \dots, I$, are given by

$$M_t^0 = \frac{1}{4\pi}(1-\beta)\chi_0(v) \int_0^\infty dv' \int_{4\pi} d\hat{\Omega}'\, v'\,\nu\Sigma_f(\mathbf{r},v',t), \tag{8.31}$$

$$M_t^i = \frac{1}{4\pi}\beta_i\chi_i(v) \int_0^\infty dv' \int_{4\pi} d\hat{\Omega}'\, v'\,\nu\Sigma_f(\mathbf{r},v',t), \quad i = 1, \dots, I, \tag{8.32}$$

$$M_t = \sum_{i=0}^{I} M_t^i, \tag{8.33}$$

where here, and also in Eqs.
8.25 and 8.26, $\chi_0(v)$ denotes the fission spectrum of the prompt neutrons and $\chi_i(v)$, $i = 1, \dots, I$, that of the neutrons in the $i$-th delayed group, and $\beta_i$, $i = 1, \dots, I$, is the fraction of delayed neutrons in the $i$-th group. In Eqs. 8.31 and 8.32, and in the remainder of this Chapter, $\nu$ denotes the average number of neutrons emitted per fission event and $\Sigma_f$ is the macroscopic fission cross section. Substitution of the above decomposition, Eq. 8.27, into the precursor equations, Eq. 8.26, leads to an analogous decomposition

$$\frac{1}{4\pi}\chi_i(v)\,C_i(\mathbf{r},t) = \Psi(\mathbf{r},v,\hat{\Omega},t)\,C_i(t), \quad i = 1, \dots, I, \tag{8.34}$$

which is sufficient, but not necessary, for consistent solutions in the forms given by Eqs. 8.27 and 8.34. After substituting these decompositions into the Boltzmann equation, Eq. 8.25, and the precursor equations, Eqs. 8.26, the inner products – over the space, speed and direction-vector variables $(\mathbf{r},v,\hat{\Omega})$ – of the solution $N_0^\dagger(\mathbf{r},v,\hat{\Omega})$ to the adjoint steady-state (critical) reference reactor equation, Eq. 8.29, and both the Boltzmann and precursor equations are formed to obtain

$$\left[\frac{d}{dt}\left(N_0^\dagger, \Psi\right)\right] T(t) + \left(N_0^\dagger, \Psi\right)\frac{d}{dt}T(t) = \left(N_0^\dagger, L_t\Psi\right)T(t) + \left(N_0^\dagger, M_t^0\Psi\right)T(t) + \sum_{i=1}^{I} \lambda_i\left(N_0^\dagger, \Psi\right)C_i(t) + \left(N_0^\dagger, S\right), \tag{8.35}$$

and

$$\left[\frac{d}{dt}\left(N_0^\dagger, \Psi\right)\right] C_i(t) + \left(N_0^\dagger, \Psi\right)\frac{d}{dt}C_i(t) = \left(N_0^\dagger, M_t^i\Psi\right)T(t) - \lambda_i\left(N_0^\dagger, \Psi\right)C_i(t), \quad i = 1, \dots, I. \tag{8.36}$$

Here, the inner product ( , ) is the inner product in a complex $L_2$ space (or Hilbert space)

$$(f, g) = \int_V d^3r \int_0^\infty dv \int_{4\pi} d\hat{\Omega}\, f(\mathbf{r},v,\hat{\Omega})\,g(\mathbf{r},v,\hat{\Omega}), \tag{8.37}$$

and the spatial integral is over the volume of the reactor $V$. Next, the “Normalization Condition” [14, 29]

$$\frac{d}{dt}\left(N_0^\dagger(\mathbf{r},v,\hat{\Omega}),\, \Psi(\mathbf{r},v,\hat{\Omega},t)\right) = 0, \tag{8.38}$$

which renders the first term on the left-hand side of the Boltzmann equation, Eq. 8.35, and of each of the precursor equations, Eqs. 8.36, equal to zero, is introduced. This implies that $(N_0^\dagger, \Psi)$ is constant – independent of time – even though $\Psi(\mathbf{r},v,\hat{\Omega},t)$ depends upon time, which, indeed, imposes a very special condition on the shape function. Of course, if the shape function is independent of time,
Of course, if the shape function is independent of time, 8 Nuclear Reactor Kinetics: 1934–1999 and Beyond 391 the normalization condition, Eq. 8.38, is satisfied immediately, but in this case, the decompositions in Eqs. 8.27 and 8.34 become a separability assumption. O correWith the usual interpretation of the adjoint eigenfunction, N0 .r ; v; /, sponding to the zero eigenvalue of H0 , as the neutron importance [1, 2, 5, 7–11, 13, 14,30] the inner product of N0 with Eq. 8.27 shows that the time function is propor tional to .N0 ; n/, the total neutron importance in the critical reference reactor due to the instantaneous space-, speed-, and direction-vector-dependent neutron number density in the actual reactor described by Eqs. 8.25 and 8.26. Hence, the time function T .t/ is often interpreted as the total neutron importance (for sustaining the chain reaction) in the actual reactor under consideration. Division of Eqs. 8.35 and 8.36, with the first term in each now eliminated (via the normalization condition), first by ; Ft N0 ; Mt where Z 1 Mt D .v/ 4 1 dv 0 0 Z (8.39) O 0 v0 †f r; v0 ; t ; d (8.40) 4 and .v/ D .1 ˇ/0 .v/ C I X ˇi i .v/; (8.41) i D1 (see also Eqs. 8.31–8.33), and then by .N0 ; /=Ft leads immediately to the point reactor kinetics equations, d T .t/ D dt .t/ ˇ ` t t ! T .t/ C I X i C i .t/ C S .t/; (8.42) I D1 and t d ˇ C i .t/ D ti T .t/ i C i .t/; i D 1; : : : ; I: dt ` Here, the reactivity .t/ is given by .t/ D 1 .N ; .Ht H0 / /; Ft 0 (8.43) (8.44) and the term in Eq. 8.42, in which it appears, was derived by adding and subtracting Mt to Lt C Mt0 in Eq. 8.35, identifying Lt C Mt0 as Ht , and subtracting .N0 ; H0 / from .N0 ; Ht /, the former of which is equal to zero since .N0 ; H0 / D .H0 N0 ; / D 0. 392 J. Dorning t The effective delayed neutron fractions ˇ i in Eq. 8.43 are defined as t ˇi D 1 N0 ; Mti ; i D 1; : : : ; I; Ft (8.45) and the effective neutron generation time `t and the effective source S .t/ in Eqs. 
8.42 and 8.43 are given by

$$\ell^t = \frac{1}{F_t}\left(N_0^\dagger, \Psi\right), \tag{8.46}$$

and

$$S(t) = \frac{\left(N_0^\dagger, S\right)}{\left(N_0^\dagger, \Psi\right)}. \tag{8.47}$$

The effective total delayed neutron fraction $\beta^t$, which appears in Eq. 8.42, is consistent with Eq. 8.45 because the addition and subtraction of $M_t$ that led to the expression for the reactivity $\rho(t)$ in Eq. 8.44 also leads to the appearance of $(N_0^\dagger, (M_t - M_t^0)\Psi)/F_t$, which is precisely the sum over $i = 1, \dots, I$ of the $\beta_i^t$ as defined in Eq. 8.45.

This more general development of the point reactor kinetics equations, which has become, to some extent, the standard development used in more advanced presentations and textbooks (see [14] for example) seems quite elegant and convincing. However, it does have some mathematical shortcomings. Although, as noted above, the decomposition of the neutron number density into the product of a “Shape Function” and “Time Function” is completely general – since the shape function depends upon time in Eq. 8.27 (see also Eq. 8.34) – this generality ceases to exist when the “Normalization Condition” is imposed on the shape function, in Eq. 8.38. The result is that the shape function is restricted in some not-very-well-defined way which, to some, makes the development seem a little magical, or at least somewhat ad hoc. If the shape function were taken as time-independent, which surely would satisfy the normalization condition, the decomposition would become a simple separation of variables – as was also noted above – and the resulting analysis using the Boltzmann equation (with constant coefficients) would lead to a set of shape functions, the eigenfunctions of

$$H_0\Psi_n = \lambda_n\Psi_n, \quad n = 0, 1, \dots, \tag{8.48}$$

which are the so-called lambda-modes [13, 31] in which the solutions to Eqs. 8.25 and 8.26 are often expanded.
Alternatively, when the analysis resulting from the separation of variables is carried out using the Boltzmann equation and the coupled precursor equations (again with constant coefficients) the eigenfunctions are the so-called omega-modes [13, 32, 33] (sometimes called omega-d modes) [13], in which the solutions to Eqs. 8.25 and 8.26 are also often expanded – which is more appealing physically – although mathematically both sets of eigenfunctions form complete basis sets in $L_2$. These two eigenfunction expansions are different in the context of the speed-dependent transport theory description adopted here, and even in the context of a two-energy-group diffusion theory description; however, they are equivalent to each other in the context of one-speed diffusion theory for a bare reactor, since in that context the two sets of eigenfunctions become identical and the two expansions reduce to the “buckling mode” expansion discussed at the end of Section 8.3.1.

Although the general development of the point reactor kinetics equations by L. N. Ussachoff [27] and A. F. Henry [28, 29] summarized here has some mathematically ad hoc features, which are bothersome, it does have a great deal of appeal. First, it has the very practical advantage of accommodating time-dependent coefficients, i.e., time-dependent cross sections – essential for “real-world” applications; and, second, it seems to work fairly well when applied to the analysis of mild transients, i.e., when it is used in the simulation of near-critical real-world reactors. Of course, the fact that the shape function, in practical applications, is an instantaneous fundamental-mode eigenfunction naturally should restrict the resulting point kinetics equations to applications involving fairly slow reactor transients. In practice, the above point reactor kinetics equations, Eqs.
8.42 and 8.43, are usually applied using the quasistatic method [34–36] (which evolved from the so-called adiabatic method) [31], the names of which both rather faithfully convey the basic idea of these approaches. The time-discretized point kinetics equations are solved for several successive time steps using coefficients that were calculated based on the shape function computed at the beginning of the first time step for the instantaneous critical reactor. Then, after these several time steps – with the number of these steps depending on how rapid the transient is – the shape function is recalculated based on the new state of the reactor at the end of the just completed sequence of time steps, and the resulting point kinetics coefficients are used for the next several time steps, etc. In the adiabatic method, the lambda-mode was used as the instantaneous shape function, while in the quasistatic methods, various approximations to the omega-mode – which generally lead to improvements – are used. For a very short and somewhat dated, but very nice, discussion of the differences among the various quasistatic methods, see [13].

Notwithstanding the fact that the point reactor kinetics equations that result from the general development summarized above [14, 27–29] usually work reasonably well in practice, other developments of point kinetics equations are worthy of mention [37–40, 42]. In many ways they lead to truer approximations to the original equations, e.g., Eqs. 8.25 and 8.26, and – perhaps of equal importance – they add additional insight. Thus, formulations of the point reactor kinetics equations based on a variational principle [39] and on asymptotic expansions [40, 42] will be summarized in the next section.

8.3.3 Variational Formulations and Asymptotic Formulations of Point Reactor Kinetics and the Appearance of “Additional Terms”

Subsequent to the seminal works by L. N. Ussachoff [27] and A. F.
Henry [28, 29] on the general development of point reactor kinetics equations, a few other general formulations have appeared, three of which will be mentioned here. Not surprisingly, in all three the somewhat controversial "Normalization Condition" was avoided; and, of course, additional terms – terms that are not present in the point kinetics equations derived by Ussachoff [27] and by Henry [28, 29] – appear in the final equations.

The first of these formulations – which was motivated by early applications to reactor safety – was reported by E. P. Gyftopoulos [38] in The Technology of Nuclear Reactor Safety in 1964. The end result of that development was the point reactor equations

$$\frac{d}{dt}T(t) = \frac{\rho(t)-\bar{\beta}^{t}}{\ell^{t}}\,T(t) + \sum_{i=1}^{I}\lambda_i \bar{C}_i(t) + \bar{S}(t) - W(t)\,T(t) \tag{8.49}$$

and

$$\frac{d}{dt}\bar{C}_i(t) = \frac{\bar{\beta}_i^{t}}{\ell^{t}}\,T(t) - \lambda_i \bar{C}_i(t) - W(t)\,\bar{C}_i(t), \quad i = 1,\dots,I, \tag{8.50}$$

with one term – the last one on the RHS – in each equation which does not appear in Eqs. 8.42 and 8.43 in Section 8.3.2. The first factor in each of these terms is

$$W(t) = \frac{d}{dt}\Big[\ln\,(N_0^{*},\phi)\Big] = \frac{1}{(N_0^{*},\phi)}\,\frac{d}{dt}\,(N_0^{*},\phi), \tag{8.51}$$

from which it is very obvious that these "new terms" result from the fact that the normalization condition has not been introduced in the derivation, since clearly $W(t)$ would be equal to zero if $(N_0^{*},\phi)$ were taken to be independent of time.

A few years later M. Becker [39] reported a derivation of the point kinetics equations starting from the coupled Boltzmann equation and delayed neutron precursor equations, Eqs. 8.25 and 8.26 above, and the corresponding adjoint equations for $n^{*}(\mathbf{r},v,\hat{\Omega},t)$ and $C_i^{*}(\mathbf{r},t)$, $i = 1,\dots,I$. In this variational formulation, he used a functional for non-self-adjoint operators

$$\langle y^{*}, Lx\rangle - \langle y^{*}, Q\rangle = 0, \tag{8.52}$$

(with additional terms to allow the approximation of initial and final conditions in time) where the inner product $\langle\,\cdot\,,\cdot\,\rangle$ here is over space (the volume of the reactor), speed, direction vector and time $(0, t_{\mathrm{final}})$, and $L$ is the matrix of linear operators that appear in the Boltzmann equation and the precursor equations, Eqs. 8.25 and 8.26, rewritten here as $Lx = Q$, where $x$ is the vector of the neutron number density $n(\mathbf{r},v,\hat{\Omega},t)$ and the precursor concentrations $C_i(\mathbf{r},t)$, $i = 1,\dots,I$, and $Q$ is the vector of fixed-source terms in these equations – which, of course, are zero in the precursor equations. The adjoint vector $y^{*}$, which satisfies the adjoint equation $L^{*}y^{*} = Q^{*}$, comprises the adjoint neutron number density $n^{*}(\mathbf{r},v,\hat{\Omega},t)$ and the adjoint precursor concentrations $C_i^{*}(\mathbf{r},t)$, $i = 1,\dots,I$, which satisfy the equations that are adjoint (with respect to all the independent variables, including time) to Eqs. 8.25 and 8.26. The adjoint source vector $Q^{*}$ is made up of the fixed-source term $S^{*}(\mathbf{r},v,\hat{\Omega},t)$ in the adjoint Boltzmann equation and zeros corresponding to the zero fixed sources in the adjoint precursor equations. The minimization of this functional, after the introduction of decompositions for the neutron number density and the precursor concentrations of the type given in Eqs. 8.27 and 8.34 – and analogous decompositions for the adjoint neutron number density and adjoint precursor concentrations – led to (forward) Euler–Lagrange equations and adjoint Euler–Lagrange equations that resulted in the following generalized point reactor kinetics equations (when the same weight function, the adjoint neutron number density, is used with all the forward equations):

$$\frac{d}{dt}T(t) = \frac{\rho(t)-\bar{\beta}^{t}}{\ell^{t}}\,T(t) + \sum_{i=1}^{I}\lambda_i \bar{C}_i(t) + \bar{S}(t) - W(t)\,T(t), \tag{8.53}$$

and

$$\frac{d}{dt}\bar{C}_i(t) = \frac{\bar{\beta}_i^{t}}{\ell^{t}}\,T(t) - \lambda_i \bar{C}_i(t) - W(t)\,\bar{C}_i(t), \quad i = 1,\dots,I, \tag{8.54}$$

and analogous adjoint reactor kinetics equations. Here, the reactivity $\rho$ and parameters $\bar{\beta}_i^{t}$ and $\ell^{t}$ and $\bar{S}(t)$ are given by expressions analogous to those in Eqs. 8.44–8.47 with the fundamental adjoint eigenfunction $N_0^{*}$ replaced by the adjoint shape function $\phi^{*}(\mathbf{r},v,\hat{\Omega},t)$, and $W(t)$ is given by an expression analogous to Eq. 8.51 above with $N_0^{*}$ again replaced by $\phi^{*}(\mathbf{r},v,\hat{\Omega},t)$.

A decade later another derivation of generalized point reactor kinetics equations from the coupled Boltzmann equation and precursor equations, Eqs. 8.25 and 8.26 – this one based on an asymptotic expansion technique – was developed by J. Dorning and G. Spiga [40]. In this development, the small parameter $\varepsilon$, defined as the ratio of the prompt neutron lifetime to the average delayed neutron precursor half-life,

$$\varepsilon = \ell_1/\tau_d = \lambda_d / v\Sigma_a, \tag{8.55}$$

was exploited. (This parameter is, indeed, very small since the prompt neutron lifetime is of the order of $10^{-3}$ s in a thermal reactor (and $10^{-5}$ s in a fast reactor) and the average delayed neutron precursor half-life is of the order of seconds – 0.23 to 55.72 s [13].) After introducing the small parameter, and the appropriate ordering of the time derivatives and the fixed source (for a near critical reactor), the neutron number density, the precursor concentrations, and $1/k_{\mathrm{eff}}$ were expanded in powers of $\varepsilon$,

$$n(\mathbf{r},v,\hat{\Omega},t) = n_0(\mathbf{r},v,\hat{\Omega},t) + n_1(\mathbf{r},v,\hat{\Omega},t)\,\varepsilon + \cdots, \tag{8.56}$$

$$C_i(\mathbf{r},t) = C_{i,0}(\mathbf{r},t) + C_{i,1}(\mathbf{r},t)\,\varepsilon + \cdots, \tag{8.57}$$

and

$$1/k_{\mathrm{eff}} = (1/k)_0 + (1/k)_1\,\varepsilon + \cdots, \tag{8.58}$$

and substituted into the original equations, Eqs. 8.25 and 8.26. In the resulting hierarchy of equations associated with successively higher powers of $\varepsilon$, in the first equations, the $O(\varepsilon^0)$ equations, the time dependence appears only through the cross sections. The time derivatives and the time-dependent source term do not appear. Hence, these zero-order equations combine to become an instantaneous steady-state Boltzmann equation for the critical state of the reactor under consideration, with cross sections that are those associated with the instantaneous state of the reactor undergoing the transient. The solutions to this equation are the products of the eigenfunctions $\phi_0^{(n),t}(\mathbf{r},v,\hat{\Omega})$ of this homogeneous transport equation (with parametric dependence upon time due to the dependence of the cross sections upon time) times arbitrary functions of time $T_0^{(n),t}(t)$, $n = 0, 1, \dots$. The fundamental eigenfunction $\phi_0^{(0),t}(\mathbf{r},v,\hat{\Omega})$ of this equation is the instantaneous distribution in space, speed, and direction vector of the neutron number density in the instantaneous critical reactor associated with the cross-section values in the actual reactor (the one undergoing the transient) at the instant of time $t$.
Since the actual reactor is near critical, only the fundamental eigenfunction is relevant, and the leading order term in the expansion of the neutron number density is given by

$$n_0(\mathbf{r},v,\hat{\Omega},t) = \phi_0^{(0),t}(\mathbf{r},v,\hat{\Omega})\,T_0(t), \tag{8.59}$$

which provides, directly as a result of the asymptotic expansion, the "decomposition" of the neutron number density into the product of a time-dependent "shape function" – the instantaneous critical distribution in the reactor – times an arbitrary "time function" $T_0(t)$.

Both the time derivatives and the fixed source appear in the first-order equations, the $O(\varepsilon^1)$ equations, as does $\phi_0^{(0),t}(\mathbf{r},v,\hat{\Omega})\,T_0(t)$. These equations for $n_1(\mathbf{r},v,\hat{\Omega},t)$ and $C_{i,1}(\mathbf{r},t)$, $i = 1,\dots,I$, are identical to the zero-order equation, except they are nonhomogeneous equations in which the nonhomogeneous terms include the time derivatives and the fixed source $S(\mathbf{r},v,\hat{\Omega},t)$ and also involve $\phi_0^{(0),t}(\mathbf{r},v,\hat{\Omega})\,T_0(t)$. The solvability condition (the Fredholm Alternative Theorem [41]) for this non-self-adjoint nonhomogeneous set of equations requires that the sums of these nonhomogeneous terms in these equations be orthogonal to the fundamental eigenfunction $N_0^{(0),t,*}(\mathbf{r},v,\hat{\Omega})$ of the instantaneous steady-state adjoint equations for the instantaneous critical reactor. This solvability condition leads directly to equations for the up-to-this-point arbitrary function of time $T_0(t)$ – now written without the subscript as $T(t)$ – and the related $\bar{C}_i(t)$, $i = 1,\dots,I$, in the form of generalized point reactor kinetics equations

$$\frac{d}{dt}T(t) = \frac{\rho(t)-\bar{\beta}^{t}}{\ell^{t}}\,T(t) + \sum_{i=1}^{I}\lambda_i \bar{C}_i(t) + \bar{S}(t) - W(t)\,T(t) - V(t)\,T(t), \tag{8.60}$$

$$\frac{d}{dt}\bar{C}_i(t) = \frac{\bar{\beta}_i^{t}}{\ell^{t}}\,T(t) - \lambda_i \bar{C}_i(t) - W(t)\,\bar{C}_i(t) - U_i(t), \quad i = 1,\dots,I. \tag{8.61}$$

Here, the reactivity $\rho$ and parameters $\bar{\beta}_i^{t}$ and $\ell^{t}$ and source term $\bar{S}(t)$ are again given by expressions analogous to those in Eqs.
8.44–8.47 with the “shape func.0/;t O the fundamental (lambda-mode) eigenfunction tion” replaced by 0 .r; v; /, of the instantaneous critical reactor, and with the fundamental (lambda-mode) adO replaced by the fundamental (lambda-mode) adjoint joint eigenfunction N0 .r; v; / .0/;t O eigenfunction o .r; v; / of the instantaneous critical reactor. In the new “additional terms” W .t/T .t/, V .t/T .t/, W .t/C i .t/ and Ui .t/ on the RHSs of these equations, W .t/, V .t/ and Ui .t/ are given by d t ln ` ; dt @ .0/;t V .t/ D ; @t W .t/ D and Ui .t/ D @ @t .0/;t ; (8.62) .0/;t .0/;t ; .0/;t ; 1 0 i Ci;0 ; i D 1; : : : ; I: 4 (8.63) (8.64) All four of the “additional terms” that appear in these generalized point reactor kinetics equations, W .t/T .t/, V .t/T .t/, W .t/C i .t/, and Ui .t/; i D 1; : : : ; I , will reduce to zero if, in the spirit of A. F. Henry’s “Normalization Condition,” the inner product . 0.0/;t ; 0.0/;t / is forced to be independent of time, and these generalized point reactor equations will reduce to those originally derived by L. N. Ussachoff [27] and A. F. Henry [28, 29], although with slightly different definitions of the reactivity, the effective parameters, and the effective source. But within the context of this asymptotic formulation, there is absolutely no reason to do this. This asymptotic development of the generalized point reactor equations [41] led deductively to (a) the decomposition of the leading order neutron number density into the product of a time-dependent shape function and a time function; (b) the identification of the shape function as the fundamental lambda-mode eigenfunction of the steady-state Boltzmann equation for the instantaneous critical state of the reactor; (c) the arrival at the fundamental lambda-mode eigenfunction of the corresponding adjoint Boltzmann equation as the weight function – used to form the inner product that leads to the generalized point reactor kinetics equations, 398 J. 
and, with the shape function, to generate the expressions for the reactivity, the effective neutron generation time, and the effective delayed neutron fractions, and with the source to generate the effective source, in the generalized point reactor kinetics equations; and finally, (d) the generalized point reactor equations for the time function $T(t)$ and precursor concentrations $\bar{C}_i(t)$, $i = 1,\dots,I$. All these results follow directly and deductively from the initial asymptotic expansion; no ad hoc factorizations, weight functions, or normalization conditions were introduced. Of course, without the ad hoc normalization condition, the additional terms $W(t)T(t)$, $V(t)T(t)$, $W(t)\bar{C}_i(t)$, and $U_i(t)$, $i = 1,\dots,I$, appear in the generalized point reactor kinetics equations. (It would be an interesting exercise to compare the results of simulations done using quasi-static methods based on Eqs. 8.60 and 8.61, with and without these additional terms, with the results of comparable simulations of the full time-dependent Boltzmann equations and coupled precursor concentration equations.)

Finally, in closing our discussion of the derivation of the point reactor kinetics equations, and generalizations thereof, it should be mentioned that a few years after the above-mentioned derivation of the generalized point kinetics equations as an asymptotic approximation to the coupled Boltzmann equation and precursor equations appeared, an extension of this approach was reported [42]. Although the same small parameter $\varepsilon = \ell_1/\bar{\tau}_d = \bar{\lambda}_d/\bar{v}\bar{\Sigma}_a$ was used, two time scales were introduced – a "fast time scale" $t_1 = \varepsilon t$, based on the prompt neutron time scale, and a "slow time scale" $t_2 = t$, based on the average precursor half-life time scale.
There, when the neutron number density $n(\mathbf{r},v,\hat{\Omega},t_1,t_2)$, the precursor concentrations $C_i(\mathbf{r},t_1,t_2)$ and $1/k_{\mathrm{eff}}$ were expanded in powers of $\varepsilon$, the $O(\varepsilon^0)$ equations in the resulting hierarchy were not instantaneous steady-state equations; rather, they were time-dependent equations in which there appeared time derivatives with respect to the fast time variable $t_1$ and cross sections that depended on both $t_1$ and the slow time variable $t_2$. Hence, they were coupled differential equations in the fast time variable $t_1$ with instantaneous parametric dependence on the slow time variable $t_2$. This led again to a leading order solution $n_0(\mathbf{r},v,\hat{\Omega},t_1,t_2)$ that was the product of a shape function $\phi_0^{t_2}(\mathbf{r},v,\hat{\Omega},t_1)$ – that was a function of the fast time variable $t_1$ and also had additional parametric dependence on the slow time variable $t_2$ – and a time function $T(t_2)$ – that was a function of only $t_2$. The solution to these zero-order equations leads to the instantaneous (in the slow time variable) fundamental omega-mode eigenfunction as the time-dependent shape function; and the solvability condition – the Fredholm Alternative Theorem [41] – for the $O(\varepsilon^1)$ equations leads to the instantaneous fundamental omega-mode adjoint eigenfunction as the time-dependent weight function. Otherwise the results were essentially the same as those summarized above, obtained using just one time scale, which led to lambda-mode forward and adjoint eigenfunctions.

The point reactor kinetics equations are often used to describe the time evolution of a reactor when the reactor is just one component of a much larger system being simulated – for example, an entire nuclear power plant. They are also used in the context of the adiabatic method [31] and quasi-static methods [34–36] in the simulation of reactor transients when only the reactor is represented – i.e., it is not coupled
In these simulations, the shape functions – used to calculate the reactivity and update the effective point kinetics parameters – sometimes are calculated using the multigroup transport (Boltzmann) equation; however, more often the multigroup diffusion equations or the few group diffusion equations are used. Of course, the point reactor kinetics equations can also be derived starting from any of these descriptions [2] instead of the energy-dependent or speed-dependent transport theory description, Eqs. 8.25 and 8.26, used here – and in some textbooks [14] – for generality. Notwithstanding the fact that the quasi-static method may frequently yield sufficiently accurate results, it often is necessary to solve the full space- and timedependent multigroup diffusion equations – and in some applications the multigroup transport equations – along with the coupled delayed neutron precursor equations. Thus, the solution of these so-called space–time kinetics equations, which, in practice, must be done numerically, is an extremely important subject, and a summary of the developments of reactor kinetics in the twentieth century surely could not omit it. Therefore, the main methods that have evolved – principally since the 1950s – for practical numerical solution of the space–time reactor kinetics equations will be discussed in Section 8.5. That discussion, however, will be preceded by a digression on the kinetics of neutron pulses. This subject, which was an integral part of reactor kinetics at the time,1 was of substantial importance in the 1950s and 1960s and even afterward, first in connection with neutron thermalization problems related to the development of thermal reactors and later in connection with the physics of fast reactors. 
8.4 A Digression on the Kinetics of a Pulse of Neutrons in Non-multiplying Systems and Subcritical Multiplying Systems: Pulsed Neutron Experiments and Their Analysis

8.4.1 Neutron Thermalization, Exponential Decay, and Diffusion Cooling

The generation of accurate thermal neutron cross sections and thermal neutron scattering data was an important priority for the development of thermal reactors during the late 1950s, the 1960s, and into the early 1970s. The generation and validation of cross sections and slow neutron scattering kernels for thermal reactor core materials was essential for the advancement of thermal reactor design. And there was an experiment that could be done in many laboratories because all it required was a "neutron generator" – i.e., a small Cockcroft–Walton accelerator in which deuterium ions were accelerated to bombard a tritium target, producing a burst, or "pulse," of 14 MeV neutrons via the resulting $(d,t)$ reaction. These neutrons then slowed down on a very fast time scale ($\sim 10^{-5}$ s), and thermalized, and diffused on a slower time scale ($\sim 10^{-3}$ s) in the "assembly" made up of a single material of interest in the physics of thermal reactors. Moreover, relatively inexpensive BF-3 tube detectors, or later (slightly more expensive) He-3 tube detectors, were used to measure the time evolution of the thermal neutron population – or the "decay of the neutron pulse." (Actually, the experiments were performed using many well-spaced, narrow pulses and the detector counts were combined using the appropriate delay time following each pulse.)

Footnote 1: For a collection of articles on reactor kinetics, which included the kinetics of neutron pulses, see the following conference proceedings, which contain many articles that were representative of the state of the art in the late 1960s: Hetrick, D. L. (Editor), Dynamics of Nuclear Systems. 1972, Tucson, AZ: University of Arizona Press.
So, the experiment was conceptually fairly simple – although practical technical details such as background noise, room return, etc. often plagued the physicists doing the experiments. In fact, the basic experiment was conceptually sufficiently simple that a "Pulsed Neutron Decay Experiment" was routinely included by the 1960s and early 1970s in the graduate lab course in most nuclear science and engineering departments in the United States and elsewhere.

While the basic pulsed neutron experiment was conceptually very simple, the "theory" on which it was based was even simpler! Well, at least, at first glance! It starts from the one-speed diffusion equation for the neutron flux in a homogeneous non-multiplying medium (cannot be much simpler than that!) – which, of course, is actually the speed-dependent neutron diffusion equation averaged over a thermal neutron speed distribution,

$$\frac{\partial}{\partial t}\bar{\Phi}(\mathbf{r},t) = \overline{vD}\,\nabla^2\bar{\Phi}(\mathbf{r},t) - \overline{v\Sigma_a}\,\bar{\Phi}(\mathbf{r},t), \tag{8.65}$$

where the overbar indicates the average over the Maxwellian thermal neutron spectrum. Separation of variables, $\bar{\Phi}(\mathbf{r},t) = \varphi(\mathbf{r})T(t)$, leads immediately to

$$\nabla^2\varphi_n(\mathbf{r}) + B_n^2\,\varphi_n(\mathbf{r}) = 0, \tag{8.66}$$

$$\bar{\Phi}(\mathbf{r},t) = \sum_{n=0}^{\infty} A_n\,\varphi_n(\mathbf{r})\,e^{-\alpha_n t}, \tag{8.67}$$

where the time eigenvalues $\alpha_n$ are

$$\alpha_n = \overline{v\Sigma_a} + \overline{vD}\,B_n^2, \quad n = 0, 1, 2, \dots \tag{8.68}$$

and $B_n^2$ is the geometric "buckling" (spatial eigenvalue) for the $n$th spatial mode $\varphi_n(\mathbf{r})$ (spatial eigenfunction), all of which result from the application of the boundary condition (zero neutron flux at the "extrapolated" boundary) for the specific geometry of the assembly used in the experiment.

The total experiment actually comprises a sequence of experiments carried out in a sequence of assemblies (usually) of the same geometric shape but of successively larger sizes.
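As a worked illustration of Eq. 8.68 for such a sequence of assemblies, the sketch below evaluates the fundamental-mode decay constant for bare cubes of increasing size. It rests on assumptions that are not in the text: the assemblies are taken to be cubes, for which the fundamental buckling is the standard $3(\pi/\tilde{a})^2$ with $\tilde{a}$ the extrapolated side length, and the values used for the spectrum-averaged $v\Sigma_a$, $vD$ and the extrapolation distance are illustrative placeholders rather than measured data. The function names are hypothetical.

```python
import math


def cube_fundamental_buckling(side_cm, extrapolation_cm=0.0):
    """Fundamental geometric buckling B_0^2 (cm^-2) of a bare cube whose flux
    vanishes on the extrapolated boundary: 3 * (pi / a_tilde)^2."""
    a_tilde = side_cm + 2.0 * extrapolation_cm
    return 3.0 * (math.pi / a_tilde) ** 2


def decay_constant(v_sigma_a, v_d, buckling):
    """Eq. 8.68 for one mode: alpha = v*Sigma_a + v*D * B^2 (s^-1)."""
    return v_sigma_a + v_d * buckling


# Illustrative placeholder values (assumed, not measured data).
V_SIGMA_A = 5.0e3    # s^-1
V_D = 3.5e4          # cm^2 s^-1
EXTRAPOLATION = 0.3  # cm

for side in (10.0, 15.0, 20.0, 30.0, 40.0):
    b2 = cube_fundamental_buckling(side, EXTRAPOLATION)
    alpha0 = decay_constant(V_SIGMA_A, V_D, b2)
    print(f"side = {side:5.1f} cm   B0^2 = {b2:.5f} cm^-2   alpha0 = {alpha0:9.1f} s^-1")
```

Larger assemblies have smaller bucklings, so the computed decay constants decrease toward the absorption term $v\Sigma_a$ as the side length grows.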
For a specific assembly, say the smallest one, the time-asymptotic decay constant $\alpha_0$ is measured, using the counts recorded after all the higher modes have damped away (but before long-time background noise becomes a significant fraction of the count rate). That is, the time-asymptotic flux

$$\bar{\Phi}(\mathbf{r},t) \sim A_0\,\varphi_0(\mathbf{r})\,e^{-\alpha_0 t}, \tag{8.69}$$

is measured – usually at a node (zero) of the first spatial harmonic $\varphi_1(\mathbf{r})$ if possible, or, if the initial pulse is spatially symmetric, at a node of the second harmonic $\varphi_2(\mathbf{r})$ – to obtain the time-asymptotic, fundamental-mode decay constant (or time-eigenvalue)

$$\alpha_0 = \overline{v\Sigma_a} + \overline{vD}\,B_0^2, \tag{8.70}$$

in terms of the fundamental-mode buckling (or spatial eigenvalue) $B_0^2$. This experiment is done for each of the successively larger assemblies (with smaller fundamental-mode bucklings $B_0^2$) and the sequence of measured decay constants $\alpha_0$ is plotted vs these bucklings $B_0^2$. (See sketch in Fig. 8.1.) Of course, the extrapolation of this curve, Eq. 8.70, to zero buckling gives, as the $y$-intercept, the experimental value of the thermal neutron spectrum averaged (Maxwellian averaged – at the temperature of the assemblies) absorption frequency $\overline{v\Sigma_a}$, which, happily, is precisely the cross-section data needed in the one-speed (thermal spectrum averaged) description of a thermal reactor (or, in practice, the thermal group averaged cross-section data needed in a two-group description). Further, the experimental value of the neutron speed times the diffusion coefficient, needed in the one-speed or two-group diffusion theory description of a thermal reactor, is given by the slope of the curve (Fig. 8.1). (Actually, the experimental points for smaller assemblies (larger bucklings $B_0^2$) lie below the straight line given by Eq. 8.70, as shown in the figure. This is easily understood, even in terms of a simple two-group diffusion theory model, as a diffusion cooling phenomenon in which the preferential leakage of the faster neutrons in the thermal spectrum in the smaller assemblies cools the neutron energy distribution, lowering the decay constant $\alpha_0$ – by decreasing the spectrum-averaged value $\overline{vD}$. This phenomenon was included by adding a (negative) diffusion cooling term in the expression for $\alpha_0$ that is quadratic in $B_0^2$,

$$\alpha_0 = \overline{v\Sigma_a} + \overline{vD}\,B_0^2 - C B_0^4 + \cdots, \tag{8.71}$$

and the measured value of the diffusion cooling coefficient was often reported along with the other experimental data $\overline{v\Sigma_a}$, $\overline{vD}$ and the values of $\bar{\Sigma}_a$ and $\bar{D}$ deduced from those data.)

Fig. 8.1 Sketch of the decay constant $\alpha_0$ vs the buckling $B_0^2$ (straight line $\alpha_0 = v\Sigma_a + vDB_0^2$ with intercept $v\Sigma_a$ and slope $vD$, and the diffusion-cooled curve $\alpha_0 = v\Sigma_a + vDB_0^2 - CB_0^4$ lying below it at large $B_0^2$)

So, in many ways everything associated with the pulsed neutron experiment and the theory of the pulsed neutron experiment seemed to be very simple and quite nice and tidy. And it was – until the transport equation reared its ugly head! Ah, the demon Boltzmann equation strikes again – as it so often does! But, of course, since the Boltzmann equation is a more precise description of particle migration, difficulties to which it sometimes leads usually reflect complexities that are physically real, often subtle, and sometimes very important – and should not be overlooked, and certainly never "swept under the carpet!"

8.4.2 Non-Exponential Decay and the Theory of Pulsed Neutron Die-Away: The Continuous Spectrum of the Boltzmann Operator

In order to introduce the complications that arise when the better description of particle migration provided by the speed-dependent (or equivalently, the energy-dependent) Boltzmann equation is introduced, it is convenient to first discuss the eigenvalue spectrum of the one-speed neutron diffusion operator. Thus, returning to Eq.
8.65 and "looking for solutions" of exponential form in time (or, equivalently, separating variables as in Section 8.4.1 above), $\varphi(\mathbf{r},t) = a\,\varphi(\mathbf{r})\,e^{\lambda t}$, one immediately arrives at

$$\lambda\,\varphi(\mathbf{r}) = \overline{vD}\,\nabla^2\varphi(\mathbf{r}) - \overline{v\Sigma_a}\,\varphi(\mathbf{r}) = A_{\mathrm{Diff}}\,\varphi(\mathbf{r}), \tag{8.72}$$

where $A_{\mathrm{Diff}} = \overline{vD}\,\nabla^2 - \overline{v\Sigma_a}$ is the one-speed diffusion operator, and the $\lambda$'s that result from the related eigenvalue equation

$$(A_{\mathrm{Diff}} - \lambda)\,\varphi = 0, \tag{8.73}$$

are the so-called time-eigenvalues $\lambda = -\alpha_0, -\alpha_1, -\alpha_2, \dots$ (the same $\alpha$'s that appear above in Eq. 8.68). Since it is well known that the Laplacian operator $\nabla^2$ has an infinite set of eigenvalues that are real, negative, isolated (a finite distance apart from each other) and countable, and because division by $\overline{vD}$ does not change any of these properties and $\overline{v\Sigma_a}$ just shifts every eigenvalue to a more negative value, the time-eigenvalue "spectrum" or eigenvalue "spectrum" of the one-speed diffusion operator – or the point spectrum, written as $P\sigma(A_{\mathrm{Diff}})$ – is just $\lambda_n = -\alpha_n$, $n = 0, 1, 2, \dots$ (see Fig. 8.2).

Fig. 8.2 Sketch of the eigenvalue spectrum of the one-speed diffusion operator (discrete points $\dots, -\alpha_3, -\alpha_2, -\alpha_1, -\alpha_0$ on the negative real axis of the complex $\lambda$-plane)

Now, upon the introduction of the velocity-dependent Boltzmann equation to describe the transport of the neutrons more precisely, this mathematical result changes quite significantly. Starting from the time- and velocity-dependent transport equation for the neutron flux $\Psi(\mathbf{r},\mathbf{v},t)$ in a homogeneous medium

$$\frac{\partial}{\partial t}\Psi(\mathbf{r},\mathbf{v},t) = -\mathbf{v}\cdot\nabla\Psi(\mathbf{r},\mathbf{v},t) - v\Sigma_T(v)\,\Psi(\mathbf{r},\mathbf{v},t) + \int d^3v'\,v'\,\Sigma_s(\mathbf{v}'\to\mathbf{v})\,\Psi(\mathbf{r},\mathbf{v}',t), \tag{8.74}$$

and again "looking for solutions" of exponential form in time (or, equivalently, separating variables – more precisely in this case "trying" to separate variables), $\Psi(\mathbf{r},\mathbf{v},t) = \psi(\mathbf{r},\mathbf{v})\,e^{\lambda t}$ leads directly to

$$\lambda\,\psi(\mathbf{r},\mathbf{v}) = -\mathbf{v}\cdot\nabla\psi(\mathbf{r},\mathbf{v}) - v\Sigma_T(v)\,\psi(\mathbf{r},\mathbf{v}) + \int d^3v'\,v'\,\Sigma_s(\mathbf{v}'\to\mathbf{v})\,\psi(\mathbf{r},\mathbf{v}') = A_{\mathrm{Trans}}\,\psi(\mathbf{r},\mathbf{v}), \tag{8.75}$$

where $A_{\mathrm{Trans}}$ is the velocity-dependent transport operator and the $\lambda$'s for which there are bounded solutions to the resulting corresponding homogeneous equation

$$(A_{\mathrm{Trans}} - \lambda)\,\psi = 0, \tag{8.76}$$

are the time-eigenvalues of the velocity-dependent transport equation. But the situation here is not so simple as it was in the case of the one-speed diffusion operator. In addition to real, negative, isolated discrete eigenvalues – which belong to the point spectrum of the transport operator $P\sigma(A_{\mathrm{Trans}})$ – the transport operator has a (nonempty) continuous spectrum – written as $C\sigma(A_{\mathrm{Trans}})$ – and this makes the problem "much more interesting!" Very loosely speaking, the continuous spectrum of an operator provides additional "eigenstuff" (usually a continuously distributed curve or area in the complex $\lambda$-plane or "spectral plane") which leads, in the eigenfunction expansion – or more precisely, the spectral expansion – of the solution to the related nonhomogeneous operator equation, to a (continuous) integral over the boundary of the continuous spectrum in addition to the (discrete) summation over the point spectrum, i.e., over the (discrete) eigenvalues. (An example, familiar to many physicists because it arises in elementary quantum mechanics, is the (continuous) integral over the unbound states in addition to the (discrete) summation over bound states in the eigenstate expansion of the wave function in the quantum mechanical scattering of a particle by a potential well of finite depth and height (see footnote 2). This familiar example simply corresponds to the integral over the continuous spectrum plus the summation over the discrete spectrum of the Schrödinger operator.)
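The purely discrete, real, negative spectrum of the one-speed diffusion operator described above can be checked numerically in a simple case. The sketch below is an illustration under assumptions not in the text: slab geometry of (extrapolated) width `width`, a second-order finite-difference approximation of the Laplacian with zero flux at both ends, and arbitrary illustrative values for $vD$ and $v\Sigma_a$. The sampled sine modes are exact eigenvectors of this discrete operator, so the Rayleigh quotient returns the discrete time-eigenvalue, which can be compared with $-(v\Sigma_a + vD\,B_n^2)$ for $B_n = (n+1)\pi/\text{width}$. All function names are hypothetical.

```python
import math


def apply_diffusion_operator(phi, h, v_d, v_sigma_a):
    """Apply the discretized A_Diff = vD * d^2/dx^2 - v*Sigma_a to the vector
    phi of interior nodal values (flux is zero at both slab surfaces)."""
    n = len(phi)
    result = []
    for j in range(n):
        left = phi[j - 1] if j > 0 else 0.0
        right = phi[j + 1] if j < n - 1 else 0.0
        laplacian = (left - 2.0 * phi[j] + right) / h ** 2
        result.append(v_d * laplacian - v_sigma_a * phi[j])
    return result


def sine_mode(n_mode, n_points):
    """Nodal values of sin((n_mode + 1) * pi * x / width) at interior nodes."""
    return [math.sin((n_mode + 1) * math.pi * j / (n_points + 1))
            for j in range(1, n_points + 1)]


def time_eigenvalue(n_mode, width, n_points, v_d, v_sigma_a):
    """Return (lambda_n, residual): the Rayleigh quotient of the discrete
    operator for mode n_mode and the max-norm of A*phi - lambda*phi."""
    h = width / (n_points + 1)
    phi = sine_mode(n_mode, n_points)
    a_phi = apply_diffusion_operator(phi, h, v_d, v_sigma_a)
    lam = sum(p * q for p, q in zip(phi, a_phi)) / sum(p * p for p in phi)
    residual = max(abs(q - lam * p) for p, q in zip(phi, a_phi))
    return lam, residual


# Illustrative, arbitrary parameter values (assumed).
WIDTH, N_POINTS, V_D, V_SIGMA_A = 10.0, 200, 1.0, 0.5

for n in range(4):
    lam, res = time_eigenvalue(n, WIDTH, N_POINTS, V_D, V_SIGMA_A)
    alpha = V_SIGMA_A + V_D * ((n + 1) * math.pi / WIDTH) ** 2
    print(f"n = {n}: lambda = {lam:.6f}, -alpha_n = {-alpha:.6f}, residual = {res:.1e}")
```

A finite matrix can only ever exhibit a point spectrum, so a check of this kind illustrates the diffusion-operator result but says nothing about the continuous spectrum of the transport operator discussed next.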
At this point, before discussing the continuous spectrum of the Boltzmann operator and its implications for the time decay of the neutron flux in a pulsed neutron experiment, a very short review of the definitions of the components of the spectrum of a linear operator might be helpful – in that it should make that discussion somewhat easier to follow, and therefore clearer and more informative. Starting from a general nonhomogeneous linear operator equation

$$(A - \lambda)f = S, \tag{8.77}$$

and the associated normed linear vector space $X$ (Banach space) of which $S$ is an element and in which the solution $f$ is sought, it is helpful here to restrict $X$ to be a Hilbert space since that is the space in which essentially all the early spectral analyses of the linear Boltzmann operator were done. In this space $H$ – which is also the Lebesgue space $L^2$ of (Lebesgue) square-integrable functions – the norm $\|\cdot\|$ of a function $f$ is defined as the square root of the complex inner product of $f$ with itself

$$\|f\| = (f,f)^{1/2}. \tag{8.78}$$

Here, the inner product of two functions $f$ and $g$ in the space is given by

$$(f,g) = \int_D dx\,\bar{f}(x)\,g(x), \tag{8.79}$$

where $x$ is the vector-valued independent variable (for a multidimensional problem) and it belongs to $D$, the domain over which $f(x)$ and $g(x)$ (and $S(x)$) are defined; and the overbar indicates the complex conjugate. Hence, the norm (or "length") of $f$ satisfies

$$\|f\|^2 = \int_D dx\,|f(x)|^2, \tag{8.80}$$

which, of course, must be finite for $f(x)$ to belong to $L^2$.

Footnote 2: For example see: Schiff, L. I., Quantum Mechanics, 3rd Edition. 1968, New York: McGraw-Hill; or Messiah, A., Quantum Mechanics, Volume I. 1958, Amsterdam, The Netherlands: North Holland; or Dirac, P.A.M., Principles of Quantum Mechanics, 4th Edition. 1958, Oxford: The Oxford University Press.

Now that a function space has been introduced, the definitions of the components of the spectrum of the linear operator $A$ – written as $\sigma(A)$ – can be clearly stated.

1. The point spectrum $P\sigma(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ of the operator $(A-\lambda)$ does not exist.
2. The continuous spectrum $C\sigma(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ of the operator $(A-\lambda)$ exists but is unbounded. That is, the values of $\lambda$ for which $(A-\lambda)^{-1}$ operating on an $S \in L^2$ gives a function $\hat{f}$ outside the space $L^2$ – or more mathematically, the values of $\lambda$ for which the domain of $(A-\lambda)^{-1}$ is $L^2$ but its range is outside $L^2$.
3. The residual spectrum $R\sigma(A)$ comprises the set of values of $\lambda$ for which the inverse $(A-\lambda)^{-1}$ is not one-to-one. For most linear operators that describe physical problems the residual spectrum is empty.
4. The (total) spectrum is $\sigma(A) = P\sigma(A) \cup C\sigma(A) \cup R\sigma(A)$, and $P\sigma(A) \cap C\sigma(A) \cap R\sigma(A) = \emptyset$, the null set.
5. The complement of the spectrum of $A$ in the complex plane, $\rho(A) = \mathbb{C}\setminus\sigma(A)$, is called the resolvent set. For values of $\lambda \in \rho(A)$, the inverse of the operator in Eq. 8.77 exists and is bounded; hence, it maps known nonhomogeneous terms $S(x) \in L^2$ to solutions $f(x) \in L^2$. That is, the resolvent set $\rho(A)$ is the set of values of $\lambda$ for which the nonhomogeneous equation, Eq. 8.77, has (bounded) solutions $f(x) \in L^2$ for "source" terms $S(x) \in L^2$.

Before ending this short informal review of the definitions related to the generalization of the "eigenvalue" spectrum of a linear operator, two final points should be made. The first, a minor one in practice, is that elements belonging to the point spectrum, $\lambda \in P\sigma(A)$, need not be discrete points; they can be continuously distributed in the complex $\lambda$-plane. Conversely, elements belonging to the continuous spectrum, $\lambda \in C\sigma(A)$, need not be continuously distributed; they can be discrete points in the complex $\lambda$-plane. (Examples in which these two somewhat counter-intuitive situations arise are given in many introductory texts on functional analysis and even in [41, 43]. More complete and more rigorous discussions of the spectral theory of linear operators are available in many textbooks on functional analysis [43].)
The second of these two final points is a little more important – or at least a little more useful – because it is helpful in developing an intuitive interpretation of the definitions just summarized. If the above equation, Eq. 8.77, results from an initial time-dependent equation

$$\frac{\partial}{\partial t}\hat{f}(x,t) = (A\hat{f})(x,t) + \hat{S}(x,t), \tag{8.81}$$

with initial condition $\hat{f}(x,0) = \hat{f}_0(x)$ – as is the case when the Boltzmann equation, Eq. 8.74, is used to describe the pulsed neutron experiment which, of course, is the circumstance that initially motivated this brief review – and a Laplace transform, with which most applied mathematicians, physicists, and engineers are quite comfortable, is introduced to solve it, the result is

$$(A - s)f_s = S_s, \tag{8.82}$$

where $s$ is the new variable that was introduced by the Laplace transform with respect to time, $f_s = f(x,s)$ is the Laplace transform of $\hat{f}(x,t)$, and $S_s = -[S(x,s) + \hat{f}_0(x)]$, with $S(x,s)$ the Laplace transform of $\hat{S}(x,t)$. Clearly, this equation is formally the same as Eq. 8.77. Now if $f_s(x)$ is obtained by inverting $(A - s)$, and then the inverse Laplace transform is applied to obtain the solution for $\hat{f}(x,t)$ and the Bromwich contour is deformed, the only contributions to the solution will be from the singularities of $f_s(x)$ in the complex $s$-plane. These singularities occur where $s$ equals a value of $\lambda$ that belongs to $\sigma(A)$. This means that contributions to $\hat{f}(x,t)$ will be in the form of a sum of residues associated with $P\sigma(A)$ plus a contour integral around $C\sigma(A)$, assuming $P\sigma(A)$ comprises discrete points (the usual case of discrete eigenvalues), $C\sigma(A)$ comprises one or more continuously distributed sets of points (also the usual case when it is not empty), and $R\sigma(A)$ is empty (the usual case in physically motivated problems). This solution, then, is clearly equivalent to the spectral representation of the solution to Eq.
8.77, i.e., the “generalized” eigenfunction expansion of that solution in terms of a discrete summation of the discrete eigenfunction components over $P_\sigma(A)$ plus the continuous contour integration (or set of such integrations) of the continuously distributed “singular” eigenfunction components over the boundary of $C_\sigma(A)$.

Now returning to the Boltzmann equation description of the pulsed neutron experiment, there are two seminal papers to which it is essential to refer here. The first – which in some ways represents a mathematical perspective – is by R. Lehner and G. Milton Wing [44], not surprisingly both mathematicians, and the second – which more or less reflects a physicist’s point of view – is by Noel Corngold [45], a physicist. Lehner and Wing studied the time-evolution of a pulse of neutrons using the one-speed transport equation in one-dimensional infinite-slab geometry with vacuum boundary conditions (zero incoming angular flux on the two infinite plane surfaces). Using the techniques of functional analysis and the spectral theory of linear operators, they proved that the spectrum of the one-speed transport operator in this geometry consisted of a point spectrum – made up of a countable number of discrete eigenvalues with negative real parts, the algebraically largest of which was real and isolated, and the associated single eigenfunction was everywhere real and positive, a physically essential property for the fundamental space-angle mode – plus a continuous spectrum that occupies the entire half-plane $\mathrm{Re}\,\lambda \le -v\Sigma_T$, where $v\Sigma_T$ is the energy-spectrum averaged value of the total collision frequency in the one-speed transport theory description they employed (see sketch of the spectral plane – the $\lambda$-plane – in Fig. 8.3).
Moreover, they also proved that as the width of the slab is decreased all the (discrete) eigenvalues move to the left and for a sufficiently thin slab (but one still of finite width) the last eigenvalue $-\alpha_0$ disappears into the continuous spectrum $\mathrm{Re}\,\lambda \le -v\Sigma_T$. The implications of this last result were profound. For a pulsed neutron experiment performed in a sufficiently thin slab, there would not exist a fundamental space-angle eigenmode; moreover, the time-asymptotic die-away of the neutron population would not be the exponential decay associated with a real, isolated, discrete eigenvalue $-\alpha_0$, but rather it would be given by a non-exponential asymptotic time dependence that results from a contour integral along the edge of the continuous spectrum at $\mathrm{Re}\,\lambda = -v\Sigma_T$. But these results were for the one-speed transport equation – which is not a particularly faithful representation of the thermal neutron population in a pulsed neutron experiment, especially in small assemblies in which, as already mentioned, a thermal spectral shift to lower energies occurs, due to the greater leakage of fast neutrons, resulting in diffusion cooling.

Fig. 8.3 Sketch of the spectrum of the one-speed transport operator (complex $\lambda$-plane with axes $\mathrm{Re}(\lambda)$ and $\mathrm{Im}(\lambda)$; continuous spectrum $C_\sigma(A_{Tr})$ bounded by the line $\mathrm{Re}(\lambda) = -v\Sigma_T$; discrete eigenvalues $\ldots, -\alpha_2, -\alpha_1, -\alpha_0$ on the real axis)

No doubt N. Corngold, whose picture appears on the next page, was at least partially motivated by concern about these limitations on the one-speed transport equation when he studied the full velocity-dependent Boltzmann equation, Eq. 8.74, in the context of the pulsed neutron experiment. In order to make the treatment of the spatial operator $(\vec{v}\cdot\nabla)$ more manageable, he introduced the ansatz $\Psi(\mathbf{r},\mathbf{v},t) = \psi(\mathbf{B},\mathbf{v},t)\exp(i\vec{B}\cdot\vec{r})$ to represent the spatial dependence of the neutron flux, based on so-called image reactor theory [8], which also is simply a fundamental-buckling-mode approximation in the context of the transport equation.
The result of this simplification is a form of the Boltzmann equation in which the spatial operator, $(\vec{v}\cdot\nabla)$, is replaced by a simple (complex-valued) multiplicative operator added to $v\Sigma_T(v)$,

$$\frac{\partial}{\partial t}\psi(\mathbf{B},\mathbf{v},t) = -\left[i\vec{v}\cdot\vec{B} + v\Sigma_T(v)\right]\psi(\mathbf{B},\mathbf{v},t) + \int d^3v'\, v'\Sigma_s(\mathbf{v}'\to\mathbf{v})\,\psi(\mathbf{B},\mathbf{v}',t). \tag{8.83}$$

(Clearly, this equation also corresponds to the spatially Fourier transformed Boltzmann equation – for a spatially infinite medium – but it was not treated as such and the solution for $\psi(\mathbf{B},\mathbf{v},t)$ was not inverse Fourier transformed back from the $\mathbf{B}$ variable to the $\mathbf{r}$ variable.) Treating this equation as an initial-value problem that describes the experiment, Corngold Laplace transformed it (with respect to time) to obtain

$$\left[s + i\vec{v}\cdot\vec{B} + v\Sigma_T(v)\right]\tilde{\psi}(\mathbf{B},\mathbf{v},s) = \int d^3v'\, v'\Sigma_s(\mathbf{v}'\to\mathbf{v})\,\tilde{\psi}(\mathbf{B},\mathbf{v}',s) + \psi(\mathbf{B},\mathbf{v},0), \tag{8.84}$$

and proceeded with tender loving care. (No heavy-duty functional analysis and spectral theory of linear operators here! Ah! Different ships, different long splices!)

Photograph: Professor Noel Corngold of Caltech (courtesy of Professor Noel Corngold)

Upon inverting the Laplace transformed solution $\tilde{\psi}(\mathbf{B},\mathbf{v},s)$ by deforming the Bromwich contour in the complex $s$-plane, he arrived at a sum of contributions due to residues at isolated poles $s_n$ – equivalent to eigenvalues $\lambda_n = -\alpha_n$ – plus a contribution due to a contour integral along the boundary of the region $s = -i\vec{v}\cdot\vec{B} - v\Sigma_T(v)$, which arises because it corresponds to $[s + i\vec{v}\cdot\vec{B} + v\Sigma_T(v)] = 0$ and this quantity appeared in the denominator of the transformed solution. These contributions are indicated schematically in the sketch of the $s$-plane in Fig. 8.4. He also showed that when the buckling $B^2$ becomes sufficiently large (the assembly becomes sufficiently small), all the poles $s = -\alpha_n$ – including $-\alpha_0$, associated with the fundamental velocity-mode – “disappeared” into the shaded region on the left of the contour integral along $s = -i\vec{v}\cdot\vec{B} - v\Sigma_T(v)$.
Hence, these results on the velocity-dependent Boltzmann equation also showed that, for a sufficiently small assembly, the time-asymptotic behavior of the neutron flux in the pulsed neutron experiment would not be exponential. And, since the fundamental pole $-\alpha_0$ moves along the real axis, it crosses into the shaded region at $s = -\min[v\Sigma_T(v)]$; therefore, the upper bound on a discrete $\alpha_0$ was $\min[v\Sigma_T(v)]$, and exponential decay would not occur in assemblies of dimensions less than the dimension corresponding to this value. This result was sometimes cited as Corngold’s $(v\Sigma)_{\min}$ theorem or Corngold’s $(v\Sigma)_{\min}$ limit [7, 25, 26, 45]. Subsequent, more rigorous mathematical spectral analyses based on the full space- and velocity-dependent Boltzmann equation, largely building on the techniques introduced by Lehner and Wing, led to essentially the same results [46, 47] as those reported by Corngold. Those analyses showed that the continuous spectrum occupied the region $\mathrm{Re}\,\lambda \le -\min[v\Sigma_T(v)]$ in the spectral plane – which the result reported by Corngold would approach in the large buckling (small assembly) limit. They also showed that for a sufficiently small assembly all the discrete eigenvalues moved into the continuous spectrum – hence, no exponential decay.

Alas, there was one “small” difficulty – some experimental values of the decay constant $\alpha_0$ that exceeded the $(v\Sigma)_{\min}$ limit had been reported! And notwithstanding the intimidation of all the heavy mathematics, some stalwart experimentalists, after carefully re-evaluating their data, firmly stood by the exponential decay – within experimental error – with $\alpha_0 > (v\Sigma)_{\min}$ that they observed, specifically in small heavy water assemblies and small beryllium assemblies. This apparent paradox was subsequently resolved by analysis using simple cross-section and scattering kernel models – that retained the salient physical features – in conjunction with the $\exp(i\vec{B}\cdot\vec{r})$ spatial ansatz.
Because of the simple models used, the analysis – done using a simple Laplace transform in time – could be carried out rather explicitly, and it showed that for a detector response the continuous spectrum or singular region, shown shaded in Fig. 8.4, leads to a branch cut (see Fig. 8.5), and that, for a sufficiently small system (large $B^2$), where the fundamental eigenvalue $-\alpha_0$ (pole in the Laplace transformed solution) has passed through the branch point at $-(v\Sigma)_{\min}$, it bifurcates, and the two poles that are born in this bifurcation move onto the adjacent sheets of the Riemann surface (see Fig. 8.5). The resulting time dependence for the detector response, given by the integral around the branch cut, is very close to exponential for a long time (but not as $t$ goes to infinity) if the bifurcated poles are very close to the branch cut [48]. Of course, if this pseudo-exponential decay is very close to exponential and if it lasts for a long time, it would be observed in a pulsed experiment as exponential decay – since, by the time (slower) non-exponential behavior would appear, the count rate would most likely be so low as to be indistinguishable from the background count in the experiment. A separate, not unrelated analysis, based on a more specialized cross-section model for a crystalline material with a Bragg peak (e.g., beryllium), gave an earlier somewhat different explanation of exponential decay rates above $(v\Sigma)_{\min}$ in crystalline materials [49].

Fig. 8.4 Sketch of the complex s-plane for the velocity-dependent transport operator in the image reactor theory approximation (axes $\mathrm{Re}(s)$ and $\mathrm{Im}(s)$; labels $s = -i\vec{v}\cdot\vec{B} - v\Sigma_T(v)$ and $s = -\min[v\Sigma_T(v)]$; poles $\ldots, -\alpha_2, -\alpha_1, -\alpha_0$ on the real axis)

Fig. 8.5 Sketch of the complex s-plane for the energy-dependent diffusion operator (axes $\mathrm{Re}(s)$ and $\mathrm{Im}(s)$; branch cut with branch point labeled $(v\Sigma)_{\min}$; bifurcated poles on adjacent sheets of the Riemann surface)

Many important theoretical and experimental problems in neutron thermalization related to thermal reactor physics – not just those related to pulsed neutron experiments – were put to rest during those years. Cross-section measurements and time-of-flight measurements of energy and angular distributions in thermal neutron scattering experiments, combined with the development of theoretical models for slow-neutron scattering, led to very much improved cross-section and scattering kernel data for use in reactor design and analysis. (See [24] for a very nice summary of some of this research written shortly after the research was completed.) These data were the antecedents (and in many cases the origins) of the Evaluated Nuclear Data Files (ENDF), and many of them are still used today. These were exciting times! They were great fun! And some really good science and engineering was done along the way! And the present author was thrilled to be part of it – albeit only at the very end.

Like great wines, some data improves with age! (And even becomes perfect!) Before leaving the subject of pulsed neutron experiments in thermal systems – now so distant in time, but still so close to my heart – a personal anecdote seems irresistible (and perhaps even justified, since it was part of the original oral presentation of this material). Figure 8.6 from [50] shows experimental values (all well below $(v\Sigma)_{\min}$) of the decay constant in pulsed neutron experiments in light water assemblies along with dashed and solid curves that represent the calculated values of $\alpha_0$ based on the Nelkin model [51] and its slight improvement, the so-called anisotropic model [52]. It is clear that the data point for the smallest experimental system – actually a ping-pong ball into which water was injected using a hypodermic needle – does not lie on the solid curve.
And in the original graph given to the draftsman (by me), it was even oh so slightly further from the curve. (It took a few iterations to get where it is on this figure. After that I gave up!) Thus, Fig. 8.6 shows the initial improvement of that data point with age. But the real improvement was yet to come – when, soon after [50] appeared, the textbook by G. Bell and S. Glasstone [7] was published with an apparently redrafted version of this figure in it, on which this data point lies smack dead center on the solid curve! “Like great wines. . . .!”

Fig. 8.6 Experimental points and theoretical curves for the decay constant in spheres of water as a function of their radius (with permission from the American Nuclear Society, Copyright July 1968 by the American Nuclear Society, La Grange, IL). Vertical axis: $\alpha_0$ (sec$^{-1}$), $\times 10^4$; horizontal axis: $R$ (cm); curves: Nelkin model and anisotropic model.

8.4.3 Exponential and Non-exponential Decay in Subcritical Fast Multiplying Assemblies

The simplest description of a pulsed neutron experiment performed in a subcritical, multiplying assembly rather than a nonmultiplying system is again the time-dependent one-speed diffusion equation, Eq. 8.65, but with $\overline{v\Sigma_a}$ replaced by $\overline{v\Sigma_a}\,[1 - (1-\beta)k_\infty]$. Then the same simple steps reviewed at the beginning of Section 8.4.1 lead to

$$\alpha_0 = \overline{v\Sigma_a}\left[1 - (1-\beta)k_\infty\right] + \overline{vD}B^2, \tag{8.85}$$

from which it immediately follows that

$$k_{\mathrm{eff}} \equiv \frac{k_\infty}{1 + L^2B^2} = \frac{1 - \dfrac{\alpha_0\ell_\infty}{1 + L^2B^2}}{(1-\beta)} = \frac{1 - \alpha_0\ell_0}{(1-\beta)}, \tag{8.86}$$

where, as indicated below Eq. 8.6, $\ell_\infty = 1/\overline{v\Sigma_a}$, etc. and the overbar has been added here as a reminder that the $v\Sigma_a$ which appears in the one-speed equation has been averaged over the neutron energy spectrum. It is clear from Eq. 8.86 that a pulsed neutron experiment in a subcritical multiplying assembly is an integral experiment that yields the effective multiplication constant $k_{\mathrm{eff}}$ for the assembly; and from Eq.
8.85 that a sequence of such experiments on successively larger assemblies (smaller values of $B^2$) gives $k_\infty$ for the system, given $\overline{v\Sigma_a}$ and $\beta$ are known.

Not unexpectedly, when the velocity-dependent Boltzmann equation is used to describe even a subcritical thermal multiplying assembly, all the subtleties summarized in Section 8.4.2 above arise. And in a subcritical fast multiplying assembly they are more pronounced. This is because, unlike in a thermal assembly in which the up-scattering of the thermal neutrons provides an energy-spectrum regeneration mechanism (in addition to that which results from the fission process) that tends to lead to the establishment of a collective energy mode, in a fast assembly, in which only down-scattering occurs, the resulting downward shift of the energy spectrum that occurs continuously in time does not lead to such a mode. Only when a fast assembly is quite close to critical, and the fission regeneration is dominant over the slowing down, is a persistent (decaying) collective energy mode established.

From the mid-1960s to the mid-1970s, when various Western European countries, the Soviet Union, Japan, and the United States were developing plans to build fast reactors, pulsed neutron experiments were done as integral experiments to measure $k_{\mathrm{eff}}$ and other integral parameters, and to verify computational capabilities and models. Because they required considerably more financial and other investment (subcritical fast multiplying assemblies with plutonium or enriched uranium fuel) than non-multiplying thermal assemblies (e.g., spheres of water or graphite stacks), only a limited number of these pulsed neutron experiments were done. Experimental data from two such experiments [53], done at the SUAK facility in the mid-1960s at the Kernforschungszentrum in Karlsruhe, Germany, are shown in Fig. 8.7 (adapted from [54]).
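Returning briefly to the one-speed relations, Eqs. 8.85 and 8.86, the algebra that links a measured decay constant to $k_{\mathrm{eff}}$ can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the parameter values are placeholders chosen to exercise the formulas, not data from any of the experiments discussed here. It computes $\alpha_0$ from Eq. 8.85, recovers $k_{\mathrm{eff}}$ from Eq. 8.86, and compares the result with $k_\infty/(1+L^2B^2)$, using $L^2 = \overline{vD}/\overline{v\Sigma_a}$.

```python
def decay_constant(v_sigma_a, v_d, b2, k_inf, beta):
    """Eq. 8.85: alpha_0 = vSigma_a [1 - (1 - beta) k_inf] + vD B^2."""
    return v_sigma_a * (1.0 - (1.0 - beta) * k_inf) + v_d * b2


def k_eff_from_alpha(alpha0, v_sigma_a, v_d, b2, beta):
    """Eq. 8.86: k_eff = (1 - alpha_0 l_0) / (1 - beta).

    l_0 = l_inf / (1 + L^2 B^2), with l_inf = 1 / vSigma_a and
    L^2 = vD / vSigma_a (one-speed, spectrum-averaged quantities).
    """
    l_squared = v_d / v_sigma_a
    ell_inf = 1.0 / v_sigma_a
    ell_0 = ell_inf / (1.0 + l_squared * b2)
    return (1.0 - alpha0 * ell_0) / (1.0 - beta)


# Placeholder values (assumptions for illustration only).
V_SIGMA_A = 4840.0   # 1/s
V_D = 3.5e4          # cm^2/s
B2 = 0.01            # 1/cm^2
K_INF = 0.9
BETA = 0.0065

if __name__ == "__main__":
    alpha0 = decay_constant(V_SIGMA_A, V_D, B2, K_INF, BETA)
    k_eff = k_eff_from_alpha(alpha0, V_SIGMA_A, V_D, B2, BETA)
    print(alpha0, k_eff, K_INF / (1.0 + (V_D / V_SIGMA_A) * B2))
```

The round trip reproduces $k_\infty/(1+L^2B^2)$ to rounding error, which confirms that Eq. 8.86 is just Eq. 8.85 rearranged.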
The die-away data for SUAK-B – the larger, closer to critical assembly – clearly is exponential; whereas that for SUAK-A – the smaller, farther subcritical assembly – clearly is not. Both sets of data initially appear to be exponential (upper portions of the straight lines); however, while the data for SUAK-B remain on the straight line at later times, those for SUAK-A depart from the straight line, indicating slower than exponential decay at long times. This corresponds to pseudo-mode decay in which the die-away appears to be exponential for a finite time but then becomes slower at later times when the pseudo-mode ceases to be sustained and collapses. Because the $(v\Sigma)_{\min}$ limit and measurements of related non-exponential die-away in thermal non-multiplying assemblies were well known by the time these experiments in fast assemblies were carried out, the nonexponential die-away in SUAK-A was not a surprise – and some experimentalists were, indeed, very well prepared for it. Similarly, some theorists also were prepared to do the analysis. Because of the fast, rather than thermal energy spectra in these assemblies – and especially because of the continuously downward-shifting fast spectrum in the pseudo-mode die-away – a one-speed diffusion theory description certainly was inadequate. A speed-dependent (or energy-dependent) representation was essential. And, since these fast assemblies were larger (even in mean free paths) than the most interesting thermal assemblies in previous studies, a speed-dependent diffusion theory description seemed adequate.

Fig. 8.7 Experimental die-away curves for the two SUAK assemblies (Adapted from [54]). Curve A: 25 elements, $k_{\mathrm{eff}} = 0.78$, experimental points, $1/\alpha = 122$ nsec; curve B: 36 elements, $k_{\mathrm{eff}} = 0.869$, experimental points and curve, $1/\alpha = 230$ nsec. Horizontal axis: $t$ ($\mu$sec), 0 to 2.0; logarithmic vertical axis, $10^2$ to $10^5$.
Whether this description, or a speed-dependent transport theory description, is used, it is simple and convenient to exploit the separability of the kernel in the Fredholm integral associated with the fission process. Starting from the speed-dependent diffusion equation with isotropic scattering (slowing down) in the center-of-mass system, introducing the ansatz $\Phi(\mathbf{r},v,t) = \varphi(\mathbf{B},v,t)\exp(i\vec{B}\cdot\vec{r})$ and Laplace transforming from time to the variable $s$ leads immediately to

$$\left[s + vD(v)B^2 + v\Sigma_T(v)\right]\tilde{\varphi}(B^2,v,s) - \int_v^\infty dv'\, v'\Sigma_s(v'\to v)\,\tilde{\varphi}(B^2,v',s) = (1-\beta)\chi(v)\int_0^\infty dv'\, v'\nu\Sigma_f(v')\,\tilde{\varphi}(B^2,v',s) + \varphi(B^2,v,0) \equiv S(B^2,v,s), \tag{8.87}$$

where $\tilde{\varphi}(B^2,v,s)$ depends only on $B^2$ not $\mathbf{B}$, and $\varphi(B^2,v,0)$ is the initial condition. Now, introducing the Green’s function of the associated slowing-down equation

$$\left[s + vD(v)B^2 + v\Sigma_T(v)\right]\tilde{G}(v|v_0;B^2,s) - \int_0^\infty dv'\, v'\Sigma_s(v'\to v)\,\tilde{G}(v'|v_0;B^2,s) = \delta(v - v_0), \tag{8.88}$$

which is simply the textbook slowing-down equation with $v\Sigma_T(v)$ augmented by $s + vD(v)B^2$, the solution to Eq. 8.87 then can be written in terms of this Green’s function as

$$\tilde{\varphi}(B^2,v,s) = \int_0^\infty dv_0\, \tilde{G}(v|v_0;B^2,s)\,S(B^2,v_0,s) = (1-\beta)\int_0^\infty dv_0\, \tilde{G}(v|v_0;B^2,s)\,\chi(v_0)\,\tilde{\rho}_f(B^2,s) + \int_0^\infty dv_0\, \tilde{G}(v|v_0;B^2,s)\,\varphi(B^2,v_0,0), \tag{8.89}$$

where the Laplace transformed fission neutron production rate

$$\tilde{\rho}_f(B^2,s) \equiv \int_0^\infty dv\, v\nu\Sigma_f(v)\,\tilde{\varphi}(B^2,v,s), \tag{8.90}$$

has been introduced. Then multiplying Eq. 8.89 by $v\nu\Sigma_f(v)$ and integrating with respect to $v$ from 0 to $\infty$ yields a formal solution for the transformed fission neutron production rate, or the transform of the time dependence of a fission detector,
$$\tilde{\rho}_f(B^2,s) = \frac{Q(B^2,s)}{1 - K(B^2,s)} = \frac{Q(B^2,s)}{D(B^2,s)}, \tag{8.91}$$

where

$$Q(B^2,s) = \int_0^\infty dv\, v\nu\Sigma_f(v)\int_0^\infty dv_0\, \tilde{G}(v|v_0;B^2,s)\,\varphi(B^2,v_0,0), \tag{8.92}$$

$$K(B^2,s) = \int_0^\infty dv\, v\nu\Sigma_f(v)\,(1-\beta)\int_0^\infty dv_0\, \tilde{G}(v|v_0;B^2,s)\,\chi(v_0), \tag{8.93}$$

and $D(B^2,s) \equiv 1 - K(B^2,s)$ – sometimes referred to as the dispersion relation since its zeros are poles of $\tilde{\rho}_f(B^2,s)$ that relate the discrete time decay constants $\alpha_n$ to $B^2$ – has been introduced. It follows from this simple development that, if the slowing-down equation, Eq. 8.88, can be solved for $\tilde{G}(v|v_0;B^2,s)$, an expression for the time-dependent response of a fission detector $\rho_f(B^2,t)$ in the experiment can be obtained via the inverse Laplace transform of Eq. 8.91. This was done using model cross sections and, first, a synthetic (hydrogen-like) slowing-down kernel [55], and then the exact elastic slowing kernel [56, 57]. The key results, which were essentially the same for the two studies, showed that the expressions for $\tilde{\rho}_f(B^2,s)$ had branch points at $s_{bp} = -[v\Sigma(v) + vD(v)B^2]_{\min}$ and poles $s_n(B^2)$ – at the zeros of $D(B^2,s)$ – and that as $B^2$ is increased – i.e., the assembly becomes smaller and farther subcritical – the last pole $s_0(B^2)$ “disappears” into the branch point. Prior to this there is a time-asymptotic exponentially decaying solution associated with the real, negative, isolated pole $s_0(B^2) = -\alpha_0(B^2)$. When that pole “disappears,” as $B^2$ is increased through a critical value, it disappears only from the principal sheet of the Riemann surface for $\tilde{\rho}_f(B^2,s)$. Actually, the pole bifurcates, as $B^2$ is increased through this critical value, resulting in a complex conjugate pair of poles $s_0^{\pm}(B^2)$ which were obtained by analytically continuing the expression for $\tilde{\rho}_f(B^2,s)$ in the counter-clockwise (+) and clockwise (–) directions around the branch point $s_{bp}$ (see Fig. 8.8). The inverse Laplace transform is then given by a contour integral (path shown as dashed line in Fig.
8.8) that results from deforming the original integral along the Bromwich contour. When the integrand of this contour integral was expanded it led to time-dependence, for the detector response, of the form

$$\rho_f(B^2,t) \sim \exp\left[-\left|s_R^+\right|t + \tfrac{1}{2}\left(s_I^+\right)^2 t^2\right], \tag{8.94}$$

on an intermediate time scale. Here, $s_R^+$ is the (negative) real part of the complex conjugate poles on the +1 and –1 Riemann sheets and $s_I^+$ is the (very small) imaginary part.

Fig. 8.8 The Riemann surface with the trajectories of the bifurcated poles onto the adjacent (+1 and –1) sheets. The dashed curve represents the deformed contour of the Laplace inversion integral (Adapted from [54])

If $\alpha(t)$ is defined as $\alpha(t) \equiv -\dot{\rho}_f(t)/\rho_f(t)$, then on the intermediate time scale when Eq. 8.94 applies, it will be given by $\alpha(t) \approx |s_R^+| - (s_I^+)^2 t$, and $|s_R^+|$ will be the y-intercept of this local curve – an inclined plateau – and $-(s_I^+)^2$ will be its slope. (See sketches of (a) $\log\rho_f(B^2,t)$ vs $t$ and (b) $\alpha(t) \equiv -\dot{\rho}_f(B^2,t)/\rho_f(B^2,t)$ vs $t$ in Fig. 8.9 (adapted from [54]).) Experimental data [58] comprising such an inclined plateau, indicating the existence of a pseudo mode and quasi-exponential die-away on a 10–40 $\mu$sec time scale in a 13 × 14 EURECA subcritical fast multiplying assembly, is shown in Fig. 8.10a (adapted from [54]). Analogous data [58] is shown in Fig. 8.10b (adapted from [54]), for a smaller, farther subcritical, 9 × 9 EURECA assembly, where the inclined plateau is less well-defined and much shorter-lived – 4–7 $\mu$sec [54].

Fig. 8.9 (a) Sketch of the metastable (pseudomode) decay of a detector response, $\log\rho_f$ vs $t$. (b) Sketch of the time dependence of the logarithmic derivative of $\rho_f(B^2,t)$, $\alpha(t) = -\dot{\rho}_f/\rho_f$ vs $t$ (Adapted from [54])

Fig. 8.10 (a) Experimental measurements of the “time-dependent” decay constant, $\alpha(t) \equiv -\dot{\rho}_f(B^2,t)/\rho_f(B^2,t)$ vs $t$ for a EURECA 13 × 14 element assembly. Solid line is sketched in. (b) Similar data for a 9 × 9 EURECA assembly (Adapted from [54]). Axes: $\alpha$ ($1/\mu$sec) vs $t$ ($\mu$sec); panel (a) spans about 0.10–0.20 over 10–50 $\mu$sec, panel (b) about 0.5–0.9 over 1–12 $\mu$sec.

As priorities in fast reactor development shifted toward other physics problems, e.g., sodium void effects in liquid metal fast breeder reactors (LMFBRs), and toward many pressing technological problems – and in general toward nuclear engineering problems directly related to LMFBR design in France, Japan, the Soviet Union and the United States – support waned for experiments of this type in bare fast subcritical assemblies; hence, the understanding of these and various other fundamental problems related to fast reactor physics did not progress nearly as far as the understanding that resulted from the earlier, more extensive studies of problems related to thermal reactor physics. Perhaps the budding resurgence of interest in fast reactors in France will change this, and put us on the path to developing a deep understanding of pulsed neutron experiments in subcritical fast reactor assemblies, and, therefore, a deep understanding of some important aspects of the kinetics of fast reactors.

8.5 Space–Time Reactor Kinetics

In the early days of reactor development – the mid-1940s to the mid-1950s – when high speed, large digital computers did not operate at very high speeds, and did not have very large memory capacities, and were not very widely available – reactor physics analysis and reactor kinetics analysis were largely done using basic theoretical techniques and hand calculations. (In fact, when I was a graduate student, I met a computer programmer who told me his first job title had been “Computer.” Yes, he was a Computer! Early in his career he had done numerical calculations – by hand, and using a desk-top mechanical calculator – based on the expressions the physicists gave to him (e.g., the solution to the two-group, one-dimensional, steady-state diffusion equations in slab geometry).)
Then, when the early antecedents of modern digital computers became available, e.g., in the US Navy’s nuclear reactor program, numerical solution techniques based on finite-difference schemes were programmed for the one- and two-group steady-state diffusion equations in slab geometry and for the one-group steady-state P-1 and P-3 equations in slab geometry. For some delightful reminiscences of that era the reader is strongly encouraged (dare I say, “commanded!”) to see the text of a wonderful after-dinner talk given by the late Dr. Ely Gelbard at an early American Nuclear Society Mathematics and Computation Division National Topical Meeting [59]. (I was there to enjoy it live, and it was one of the most interesting and entertaining talks I have ever heard!) As time passed, one-dimensional and later two-dimensional reactor kinetics codes based on finite-difference schemes in space and time were developed using the few-group and multigroup diffusion equations and the coupled delayed neutron precursor concentration equations. Some of these developments will be summarized briefly in Section 8.5.1.

Because the memory capacity and speed of digital computers were still very limited in the late 1950s and early 1960s, alternative methods to those based on simple direct finite-difference schemes were developed. These included methods based on variational functionals which were fairly widely used in reactor analysis (again, especially in the US Navy’s nuclear reactor program) in both steady-state reactor calculations and reactor kinetics calculations. Developed primarily during the early 1960s, they gradually evolved by the late 1960s into so-called synthesis methods in which the trial functions used in the variational formulations were the numerical outputs of the computer-generated solutions to related simpler problems.
These methods were extensively developed and widely used in reactor analysis and design at both of the US Navy nuclear laboratories: Bettis Atomic Power Laboratory (BAPL) and Knolls Atomic Power Laboratory (KAPL). The basis for these methods will be reviewed in Section 8.5.2, and anomalies that arise in their application will be discussed briefly there. The early 1970s saw the development of another general class of computational methods for both reactor criticality calculations and reactor kinetics calculations. Also motivated by the goals of more accurate numerical solutions and more realistic representations of reactors using limited computer resources, they, to some extent, represented a logical compromise between the decomposition of a reactor into numerical cells (or boxes, or computational elements) characteristic of finite-difference methods and use of analytical solutions characteristic of variational methods. The basic idea of these so-called coarse mesh methods – some of which later bore the appellation nodal methods – was to decompose the reactor into computational elements or cells, as is done in finite-difference methods, and then rather than using the linearly truncated Taylor series approximation to the solution within the cell – as also is done in finite-difference schemes – introduce some more complicated, but better analytical approximations to the solution – as is done in variational methods. The distinction, of course, was that the solution is not approximated by global analytical functions over the whole reactor domain but rather by local analytical functions within each cell or computational element. This typically results in more accurate solutions – in comparison with those obtained using traditional finite difference methods – for a given cell size; or, conversely, it permits the use of larger, thus fewer, cells to achieve a solution of given accuracy requirements. 
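The accuracy argument can be illustrated with a toy one-dimensional problem. The sketch below is not any of the published coarse-mesh or nodal methods; it is a minimal Python illustration, with hypothetical function names, of the underlying idea. For the source-free equation $\phi'' = \kappa^2\phi$ on $[0,a]$ with $\phi(0)=1$ and $\phi(a)=0$, the standard finite-difference scheme gives the three-point relation $\phi_{i-1} - [2 + (\kappa h)^2]\phi_i + \phi_{i+1} = 0$, whereas using the local analytical (exponential) solution within each cell gives $\phi_{i-1} - 2\cosh(\kappa h)\phi_i + \phi_{i+1} = 0$, which the exact solution satisfies for any cell width $h$.

```python
import math


def solve_three_point(d, n_cells, phi_left=1.0, phi_right=0.0):
    """Solve phi[i-1] - d*phi[i] + phi[i+1] = 0 at the interior mesh points.

    Uses the Thomas algorithm for a tridiagonal system whose sub- and
    super-diagonal entries are all 1 and whose diagonal entries are -d.
    Returns the interior values phi[1], ..., phi[n_cells - 1].
    """
    m = n_cells - 1
    diag = [-d] * m
    rhs = [0.0] * m
    rhs[0] -= phi_left      # known boundary value moved to the right side
    rhs[-1] -= phi_right
    for i in range(1, m):   # forward elimination
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] -= w * rhs[i - 1]
    phi = [0.0] * m
    phi[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        phi[i] = (rhs[i] - phi[i + 1]) / diag[i]
    return phi


def max_errors(kappa, a, n_cells):
    """Maximum pointwise errors of the two schemes against the exact solution."""
    h = a / n_cells
    exact = [math.sinh(kappa * (a - i * h)) / math.sinh(kappa * a)
             for i in range(1, n_cells)]
    fd = solve_three_point(2.0 + (kappa * h) ** 2, n_cells)
    analytic = solve_three_point(2.0 * math.cosh(kappa * h), n_cells)
    err_fd = max(abs(u - v) for u, v in zip(fd, exact))
    err_analytic = max(abs(u - v) for u, v in zip(analytic, exact))
    return err_fd, err_analytic


if __name__ == "__main__":
    # Illustrative values: kappa = 1 per cm, slab width 4 cm, four coarse cells.
    print(max_errors(1.0, 4.0, 4))
```

With only four cells of width one diffusion length, the finite-difference values are in error in the second decimal place, while the scheme built on the local analytical solution reproduces the exact solution to rounding error. In realistic multigroup, multidimensional problems the local approximations are not exact, so the gain is smaller than in this idealized case.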
Thus, specified solution accuracy was achieved using both less computer memory and less computer central processor unit (CPU) time on the mainframe computers of that era. Some of these coarse-mesh [60] and nodal [61–67] methods for the multigroup diffusion equations, originally developed for criticality calculations, were soon afterwards extended to space–time reactor kinetics applications [62, 64, 65, 67] and some were also extended to the multigroup transport equations [68–73]. An early review of coarse-mesh and nodal methods for the diffusion equation appeared in 1979 [69], and an analogous review of nodal methods for the transport equation was written in 1985 [73]. Excellent, much more comprehensive, reviews of nodal methods for the diffusion equation and a related fuel-assembly homogenization procedure [74], and of nodal methods for the diffusion and transport equations [75] were published in 1986. The main ideas involved in the development of these coarse-mesh and nodal methods will be presented in Section 8.5.3 in which some results obtained using them will also be discussed.

The success of these coarse-mesh and nodal methods in achieving very high computational accuracy using very coarse meshes, or large computational elements or cells – in some cases as large as a light water reactor (LWR) fuel assembly or larger – led to the need for a homogenization procedure that could be used to homogenize the fuel pins and associated lattice cells over a fuel assembly or a significant fraction thereof. One such homogenization procedure, called “equivalence theory,” in which an exact transport theory solution is introduced as a hypothetical solution on a computational element (fuel assembly) boundary, was introduced by K. Koebke [76], and subsequently used extensively and extended by G. Greenman, K. Smith and A. F. Henry [65] and K. Smith [74].
Others, based on asymptotic expansions, were developed by several authors over a period of years. Most of these other homogenization theories began with an asymptotic expansion of the solution to the transport equation for a lattice of fuel cells in powers of the small parameter $\varepsilon = \lambda/R$, where $\lambda$ is an appropriate mean free path and $R$ is a characteristic overall reactor dimension [77–81]. Clearly, this ratio is very small (of the order of $10^{-2}$) for a light water thermal reactor. One development, however, started from the asymptotic expansion of the solution to the diffusion equation in the small parameter $\varepsilon = L/R$ where $L$ is the diffusion length; hence, this parameter is equally small (also of the order of $10^{-2}$) for a LWR. The essential ingredients that enter into these homogenization theories will be discussed in Section 8.5.4 in which a few results obtained using them in conjunction with nodal methods will be summarized.

It is clear that, for different applications, it is appropriate to use different descriptions of reactor kinetics – the point reactor kinetics model, quasi-static models, space–time multigroup diffusion theory, and in some cases even space–time multigroup transport theory. It is also clear that, in some complicated transients, it is necessary to use a more precise model such as the time-dependent multigroup transport equations to adequately represent the portion of the transient in which crucial local space–time variations of the neutron flux occur. However, it typically is very costly if such a precise computational representation is used throughout a long simulation of a transient. Thus, just as adaptive spatial grids and adaptive variable time steps are very effectively used in a whole host of computational physics and computational engineering problems – spatial grids in the neighborhood of shock-wave fronts in gas dynamics, refined time steps in phase transition problems, etc.
– the use of adaptive procedures to represent the evolution of the neutron flux during different epochs of a complex reactor transient by different models is an efficient way to proceed. Hence, this section on space–time reactor kinetics will close with a brief summary in Section 8.5.5 of two articles in which the development and application of precisely such an adaptive model procedure was reported [82, 83].

8.5.1 Finite-Difference Schemes for the Time-Dependent Multigroup Neutron Diffusion Equations

The origins of numerical methods based on finite-difference schemes for the approximation of the solutions to differential equations go far back in time, as is evidenced by the names some of these schemes bear – for example, the forward Euler scheme and the backward Euler scheme, both of which are named after the Swiss mathematician Leonhard Euler, who lived from 1707 to 1783. These schemes were used extensively in "hand calculations" long before modern computers – or even desk-top mechanical calculators – had arrived on the scene. They are very simple and straightforward, mathematically well-understood, easy to program, and very widely applicable; and they are particularly suitable for the solution of diffusion equations. It is, therefore, not surprising that they were the first schemes employed in serious attempts at the numerical solution of reactor physics equations, particularly the neutron diffusion equation.
In the 1950s and 1960s, and even now some 50 years later, most whole-core thermal reactor calculations begin, not from the multigroup transport equations, but rather from the few-group or multigroup diffusion equations

$$\frac{1}{v_g}\frac{\partial}{\partial t}\phi_g(\mathbf{r},t) = \nabla\cdot D_g(\mathbf{r},t)\nabla\phi_g(\mathbf{r},t) - \Sigma_{a,g}(\mathbf{r},t)\,\phi_g(\mathbf{r},t) + \sum_{\substack{g'=1\\ g'\neq g}}^{G}\Sigma_{s,g'\to g}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t) + (1-\beta)\,\chi_{0,g}\sum_{g'=1}^{G}\nu\Sigma_{f,g'}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t) + \sum_{i=1}^{I}\lambda_i\,\chi_{i,g}\,C_i(\mathbf{r},t) + Q_g(\mathbf{r},t), \quad g = 1,\ldots,G \tag{8.95}$$

and the coupled delayed neutron precursor equations

$$\frac{\partial}{\partial t}C_i(\mathbf{r},t) = \beta_i\sum_{g'=1}^{G}\nu\Sigma_{f,g'}(\mathbf{r},t)\,\phi_{g'}(\mathbf{r},t) - \lambda_i\,C_i(\mathbf{r},t), \quad i = 1,\ldots,I, \tag{8.96}$$

for reactor kinetics calculations, and from the steady-state version of these equations for reactor criticality calculations (k-calculations). Due to the dramatic difference between the time scale on which the prompt neutrons arrive – of the order of the prompt neutron lifetime – and that on which the delayed neutrons arrive – of the order of the precursor delay times – these equations are stiff differential equations. Thus, the time-eigenvalues associated with the prompt neutrons are much greater in magnitude than those associated with the delayed neutrons, and because of this an explicit finite difference scheme in time (such as the forward Euler scheme) would lead to an algorithm that would be numerically unstable in many applications unless impractically small time steps were used. Hence, implicit schemes in time – such as the backward (implicit) Euler scheme (first order) or the Crank–Nicolson scheme (second order) – usually are used. Because most thermal reactor core layouts are very amenable to Cartesian geometry representation, the discussion that follows will use two-dimensional x–y spatial coordinates. (The inclusion of the z-coordinate would be completely straightforward, but cumbrous.)
Further, since there is no differential operator in the spatial variables in the precursor equations, and their treatment is therefore straightforward, they will be omitted from the discussion. Finally, since the linear algebraic equations that result from applying difference schemes to the multigroup (and even the few-group) diffusion equations comprise very large systems, which typically are solved via source iteration of the fission source, or the fission plus inscatter sources, the basic finite-difference scheme for the space–time solution is developed here for just the g-th group equation. Hence, the essential part of Eqs. 8.95 and 8.96 that is relevant to this discussion is the time-dependent g-th group diffusion removal equation with a fixed source in x–y spatial coordinates

$$\frac{1}{v_g}\frac{\partial}{\partial t}\phi_g(x,y,t) = D_g(t)\frac{\partial^2}{\partial x^2}\phi_g(x,y,t) + D_g(t)\frac{\partial^2}{\partial y^2}\phi_g(x,y,t) - \Sigma_{r,g}(t)\,\phi_g(x,y,t) + S_g(x,y,t), \tag{8.97}$$

where the group removal cross section and diffusion coefficient have been taken to be uniform in space to avoid tedious and possibly confusing details; and $S_g(x,y,t)$ represents the fission, inscatter, and precursor decay sources, along with any possible fixed sources. A simple central difference scheme applied to the second-order partial derivatives in x and y leads to

$$\frac{1}{v}\frac{d}{dt}\phi_{j,k}(t) = \frac{D(t)}{\Delta x^2}\phi_{j+1,k}(t) - \frac{2D(t)}{\Delta x^2}\phi_{j,k}(t) + \frac{D(t)}{\Delta x^2}\phi_{j-1,k}(t) + \frac{D(t)}{\Delta y^2}\phi_{j,k+1}(t) - \frac{2D(t)}{\Delta y^2}\phi_{j,k}(t) + \frac{D(t)}{\Delta y^2}\phi_{j,k-1}(t) - \Sigma_r(t)\,\phi_{j,k}(t) + S_{j,k}(t), \quad j = 0,\ldots,J;\ k = 0,\ldots,K \tag{8.98}$$

where $\phi_{j,k}(t) = \phi(x_j, y_k, t)$, a uniform spatial mesh of cells $\Delta x\,\Delta y$ has been used, and the group index g has been dropped. Clearing the $1/v$ and writing these equations in matrix form for the column vector $\boldsymbol{\phi}(t)$ of unknowns $\phi_{j,k}(t)$, $j = 0,\ldots,J$; $k = 0,\ldots,K$ yields

$$\frac{d}{dt}\boldsymbol{\phi}(t) = \mathbf{L}(t)\,\boldsymbol{\phi}(t) + \mathbf{S}(t), \tag{8.99}$$

where the definitions of the square matrix $\mathbf{L}(t)$ and the vector $\mathbf{S}(t)$ follow directly from Eq. 8.98.
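As a rough illustration of how the matrix in Eq. 8.99 can be assembled from Eq. 8.98, the following Python sketch builds it with Kronecker products. It is not taken from any of the codes discussed in this chapter: the function names, the ordering of the unknowns (index = j(K+1) + k), and the zero-flux condition assumed just outside the grid are illustrative assumptions, and D, the removal cross section and v are treated as constants at a single instant.

```python
import numpy as np

def second_difference(n, h):
    """1D central second-difference matrix on n points with spacing h.

    Assumption (not from the text): the flux is zero just outside the grid.
    """
    T = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    return T / h**2

def build_L(J, K, dx, dy, D, sigma_r, v):
    """Matrix L of Eq. 8.99 for the five-point scheme of Eq. 8.98.

    Unknowns phi_{j,k}, j = 0..J, k = 0..K, are stored at index j*(K+1) + k.
    The factor v comes from clearing the 1/v on the left-hand side.
    """
    nx, ny = J + 1, K + 1
    Tx = second_difference(nx, dx)   # acts on the j (x) index
    Ty = second_difference(ny, dy)   # acts on the k (y) index
    laplacian = np.kron(Tx, np.eye(ny)) + np.kron(np.eye(nx), Ty)
    return v * (D * laplacian - sigma_r * np.eye(nx * ny))

# Small hypothetical example: 4 x 3 grid points.
L = build_L(J=3, K=2, dx=1.0, dy=2.0, D=1.5, sigma_r=0.1, v=2.0)
```

For realistic mesh sizes a sparse storage format would be used instead of dense arrays; the dense form is kept here only so that individual entries are easy to inspect against Eq. 8.98.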
Finally, introducing a simple finite difference scheme for the time derivative leads to

$$\boldsymbol{\phi}^{\ell} - \boldsymbol{\phi}^{\ell-1} = (1-\theta)\,\Delta t_\ell\,\mathbf{L}^{\ell}\boldsymbol{\phi}^{\ell} + \theta\,\Delta t_\ell\,\mathbf{L}^{\ell-1}\boldsymbol{\phi}^{\ell-1} + (1-\theta)\,\Delta t_\ell\,\mathbf{S}^{\ell} + \theta\,\Delta t_\ell\,\mathbf{S}^{\ell-1}, \quad \ell = 1, 2, 3, \ldots \tag{8.100}$$

where $\boldsymbol{\phi}^{\ell} \equiv \boldsymbol{\phi}(t_\ell)$, $\mathbf{L}^{\ell} \equiv \mathbf{L}(t_\ell)$, $\mathbf{S}^{\ell} \equiv \mathbf{S}(t_\ell)$; $t_\ell$ is the discretized time, $\Delta t_\ell = t_\ell - t_{\ell-1}$ is the $\ell$th time step, and $\theta$ has been introduced so that a $\theta$-weighted average of the right-hand side of Eq. 8.99 appears in Eq. 8.100. Solving these equations for $\boldsymbol{\phi}^{\ell}$, the vector of fluxes at spatial grid points $(x_j, y_k)$ at time level $t_\ell$, in terms of $\boldsymbol{\phi}^{\ell-1}$ and the sources $\mathbf{S}^{\ell}$ and $\mathbf{S}^{\ell-1}$ yields

$$\boldsymbol{\phi}^{\ell} = \left[\mathbf{I} - (1-\theta)\,\Delta t_\ell\,\mathbf{L}^{\ell}\right]^{-1}\left\{\left[\mathbf{I} + \theta\,\Delta t_\ell\,\mathbf{L}^{\ell-1}\right]\boldsymbol{\phi}^{\ell-1} + (1-\theta)\,\Delta t_\ell\,\mathbf{S}^{\ell} + \theta\,\Delta t_\ell\,\mathbf{S}^{\ell-1}\right\}, \quad \ell = 1, 2, 3, \ldots \tag{8.101}$$

Here, the vector of sources $\mathbf{S}^{\ell}$ depends upon the group fluxes at the current (the $\ell$th) time level. Some of these group fluxes have been computed in the current outer source iteration and therefore are available; the others have to be approximated by their values at the previous time level $t_{\ell-1}$. When $\theta$, in this so-called theta-difference method in time, is set equal to unity, Eq. 8.101 becomes the result for the forward Euler scheme, which has a first-order global error in $\Delta t$ (for a uniform time step) and is explicit (no matrix inversion required) – but often unstable. For $\theta$ equal to 0, this equation gives the time-advancement operator for the backward Euler scheme, which also is first order, but implicit (matrix inversion required) and stable. Finally, for $\theta$ equal to one half, it gives the advancement operator for the Crank–Nicolson scheme, which has a second-order global error in $\Delta t$ (for a uniform time step) and also is implicit and stable. Of course, there are many very important details – cell-centered versus cell-edge spatial finite-difference schemes, acceleration schemes such as coarse-mesh rebalancing, asymptotic source extrapolation, Wielandt (eigenvalue shift) iteration, etc. – that have been omitted here in this brief summary. However, many of these are discussed at some length in well-known textbooks [84–87].
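The behavior of the three special cases of Eq. 8.101 can be checked on a small model problem. The Python sketch below is only an illustration under simplifying assumptions that are not part of the text: a one-dimensional grid with zero flux outside it, a matrix and source that do not change in time (so the source at the new time level is known and no source iteration is needed), a uniform time step, and hypothetical parameter values. It follows the convention of Eq. 8.101, in which theta weights the old time level.

```python
import numpy as np

def theta_step(phi_old, L_old, L_new, S_old, S_new, dt, theta):
    """Advance one time step with Eq. 8.101 (theta weights the old level)."""
    I = np.eye(phi_old.size)
    rhs = (I + theta * dt * L_old) @ phi_old \
        + dt * ((1.0 - theta) * S_new + theta * S_old)
    if theta == 1.0:
        return rhs                      # forward Euler: nothing to solve
    return np.linalg.solve(I - (1.0 - theta) * dt * L_new, rhs)

def march(phi0, L, S, dt, steps, theta):
    """Repeat theta_step with a time-independent L and S (an assumption)."""
    phi = phi0.copy()
    for _ in range(steps):
        phi = theta_step(phi, L, L, S, S, dt, theta)
    return phi

# Hypothetical 1D diffusion-removal model problem.
n, h = 50, 0.1
D, sigma_r, v = 1.0, 0.5, 1.0
T = (-2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / h**2
L = v * (D * T - sigma_r * np.eye(n))
S = np.zeros(n)
phi0 = np.ones(n)

dt, steps = 0.01, 200    # dt is about twice the explicit limit 2/|lambda_max|
forward = march(phi0, L, S, dt, steps, theta=1.0)   # grows without bound
backward = march(phi0, L, S, dt, steps, theta=0.0)  # decays
crank = march(phi0, L, S, dt, steps, theta=0.5)     # decays

for name, phi in [("forward Euler", forward),
                  ("backward Euler", backward),
                  ("Crank-Nicolson", crank)]:
    print(f"{name:15s} ||phi|| = {np.linalg.norm(phi):.3e}")
```

With this time step the forward Euler result is dominated by rapidly growing short-wavelength components, while both implicit schemes decay as the physical solution does. Reducing the time step below the explicit limit restores stability of the forward scheme, at the cost of many more steps.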
The $\theta$-difference scheme just summarized, albeit rather briefly, was the basis for the one-dimensional space–time reactor kinetics code WIGLE [88], developed at BAPL in the early 1960s – and also for TWIGLE [89], its extension later in the 1960s to two-dimensional space–time reactor kinetics. These codes were very widely and effectively used in the 1960s and 1970s; and their "guts" can still be found in some reactor kinetics and reactor dynamics software used today. Unfortunately, in order to achieve reasonable accuracy in calculations for LWRs – in which the neutron diffusion length is of the order of a couple of centimeters – very fine spatial meshes must be used when these standard finite difference schemes are employed. Thus, the vectors of group fluxes, $\boldsymbol{\phi}^{\ell}$ in Eq. 8.101, become quite large even in two-dimensional calculations. For example, if only one hundred points were used in each direction (x and y) in a quarter-core calculation, that vector would have dimension 10,000 and the matrix that would have to be inverted (in Eq. 8.101) in each group, in each iteration, at each time step would be $10{,}000 \times 10{,}000$. And for three-dimensional quarter-core calculations – which are important in order to properly represent the motion of control rods or blades in reactor kinetics calculations – these numbers become $2 \times 10^{6}$ and $(2 \times 10^{6}) \times (2 \times 10^{6})$! Thus, in those days, when "large jobs" were run on (32K!) mainframe computers (e.g., the IBM 360) only at night or over weekends, two-dimensional LWR kinetics calculations were not done routinely! And three-dimensional calculations were not done at all – at least not using codes based on finite difference schemes. Rather, other methods that required less computer storage and time were developed both for criticality calculations and space–time reactor kinetics calculations.
These included methods developed in the 1960s based on variational principles – in which the space-dependence of the neutron flux typically was expanded in functions defined over the whole reactor spatial domain – and later, in the 1970s and 1980s, methods that were capable of achieving very high numerical accuracy while using very large computational elements, or, equivalently, coarse meshes. The main ideas used in the development of these methods based on variational principles, along with some results obtained using them, will be summarized in the next section. And the key steps in the formulation of the so-called coarse-mesh and nodal methods and some examples of their application will be described in the section after that.

8.5.2 Variational, Modal, Synthesis, and Related Methods for the Time-Dependent Multigroup Diffusion Equations

The development of variational methods, (variational) synthesis methods and so-called modal methods for space–time reactor kinetics (and reactor statics) begins from a simple variational principle [90–94]; hence, it seems apropos to include here a brief review of the use of these principles to develop techniques for obtaining approximate solutions to differential equations, integral equations, and other functional equations. Even though none of the equations of interest in space–time reactor kinetics – except the one-group steady-state diffusion equation – is self-adjoint, it is convenient for expository purposes to begin from a linear self-adjoint equation

$$L\varphi = S \tag{8.102}$$

and introduce the classic functional [90–92]

$$G[\varphi] = (\varphi, L\varphi) - 2(\varphi, S). \tag{8.103}$$

Here, all the variables are taken to be real, and the real inner product is given by the integral over the multidimensional domain $X$ over which the unknown function $\varphi(x)$ is defined. The function that minimizes this functional is the $\varphi(x)$ that is the solution to Eq. 8.102.
This becomes obvious when the variation of $G[\varphi]$ with respect to $\varphi$ is set equal to zero

$$\delta G[\varphi] = (\delta\varphi, L\varphi) + (\varphi, L\,\delta\varphi) - 2(\delta\varphi, S) = 0, \tag{8.104}$$

$$= (\delta\varphi, L\varphi) + (L\varphi, \delta\varphi) - 2(\delta\varphi, S) = 0, \tag{8.105}$$

$$= 2(\delta\varphi, \{L\varphi - S\}) = 0. \tag{8.106}$$

Here, $\delta\varphi$ is the variation in $\varphi$, and since it is arbitrary, the inner product is zero only if $\varphi(x)$ is the solution to Eq. 8.102. (If this development began from Eq. 8.103 in which $G[\varphi]$ represented some physical quantity – for example, the energy in a mechanical system – its minimization would yield Eq. 8.102 as the equation(s) of motion for the system, the so-called Euler–Lagrange equation(s) corresponding to the functional $G[\varphi]$.) The second variation of $G[\varphi]$ with respect to $\varphi$ is $2(\delta\varphi, L\,\delta\varphi)$, where $L\varphi - S = 0$ has been used; hence, if $L$ is a positive operator the extremum that follows from $\delta G[\varphi] = 0$ is a minimum. In practice, when the original operator in Eq. 8.102 is known – as is the case in reactor kinetics and reactor physics in general – an approximation to its solution can be generated by minimizing $G[\varphi]$, not with respect to an arbitrary variation of $\varphi$, but rather with respect to the variation of a trial function – comprising, for example, a linear combination of acceptable functions, i.e., functions that belong to some specific class. This procedure is made very explicit by introducing the trial function

$$\varphi_T(x) = \sum_{n=1}^{N} C_n\,\psi_n(x), \tag{8.107}$$

into the functional $G[\varphi]$ and then minimizing this functional by setting all its partial derivatives with respect to the $C_n$ equal to zero – or equivalently setting its first variation $\{\delta G[\varphi_T]/\delta C_n\}\,\delta C_n$, $n = 1,\ldots,N$, with respect to all the $C_n$ equal to zero – which leads to

$$\sum_{m=1}^{N} (\psi_n, L\psi_m)\,C_m - (\psi_n, S) = 0, \quad n = 1,\ldots,N, \tag{8.108}$$

where a common factor of 2 has been cancelled, and the facts that $L$ is self-adjoint and the inner product is real have been used.
This is a simple nonhomogeneous linear system of algebraic equations, which can be solved, as long as its coefficient matrix does not have a zero eigenvalue, to obtain the $C_n$, $n = 1,\ldots,N$, and therefore, via Eq. 8.107, the specific $\varphi_T(x)$ that minimizes the functional $G[\varphi_T]$ over functions of the form given by that equation, and thereby provides an approximation of this form to the solution of Eq. 8.102. This procedure is known as the Rayleigh–Ritz method [91], and it is equivalent to the Bubnov–Galerkin method [91] of approximation to solutions in linear function spaces. In that method, the solution to Eq. 8.102 is simply approximated by the expansion

$$\varphi(x) \cong \varphi_N(x) \equiv \sum_{n=1}^{N} C_n\,\psi_n(x), \tag{8.109}$$

in the set of functions $\psi_n(x)$, $n = 1,\ldots,N$, which is substituted directly into Eq. 8.102; then, the inner product of the resulting equation with each of the $\psi_n(x)$, $n = 1,\ldots,N$, is separately set equal to zero, forcing the residual $R_N(x) = L\varphi_N(x) - S(x)$ to be orthogonal to each of the expansion functions $\psi_n(x)$, $n = 1,\ldots,N$. The final equations, therefore, are identical to Eq. 8.108; hence, this method is sometimes called the Ritz–Galerkin method, even though the two mathematicians developed their methods independently. (Equation 8.108 is equivalent to setting the gradient of $G[\varphi_T]$ with respect to the vector $C$ of coefficients $C_n$, $n = 1,\ldots,N$, equal to zero, i.e., $\nabla_C G[\varphi_T] = 0$.) This, of course, establishes that $G[\varphi_T]$ undergoes an extremum for the resulting value of $C$. To ensure that this extremum actually is a minimum – and not a maximum or inflection point – the Hessian $\nabla_C \nabla_C G[\varphi_T]$ also should be calculated and evaluated at the value of $C$ that corresponds to the extremum to show that it is positive definite and, therefore, that $G[\varphi_T]$ has been minimized, not maximized. But this long and tedious step is usually omitted in real-world calculations.
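The steps leading to Eq. 8.108 can be made concrete with a small numerical experiment. The Python sketch below is an illustration only, under assumptions that do not come from the text: the self-adjoint operator is taken to be $L = -d^2/dx^2$ on $0 < x < 1$ with $\varphi(0) = \varphi(1) = 0$, the expansion functions are chosen as the polynomials $\psi_n(x) = x^n(1-x)$, which satisfy those boundary conditions, and the inner products are evaluated by Gauss–Legendre quadrature. The function name `rayleigh_ritz` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.legendre import leggauss

def rayleigh_ritz(source, N, quad_order=20):
    """Solve Eq. 8.108 for L = -d^2/dx^2 on (0, 1) with psi_n = x^n (1 - x).

    Returns the coefficients C_n, the matrix (psi_n, L psi_m), and a callable
    trial function phi_T(x) as in Eq. 8.107.
    """
    t, w = leggauss(quad_order)
    x, w = 0.5 * (t + 1.0), 0.5 * w            # map nodes from [-1, 1] to [0, 1]

    # psi_n(x) = x^n - x^(n+1), stored by ascending polynomial coefficients.
    psi = [Polynomial([0.0] * n + [1.0, -1.0]) for n in range(1, N + 1)]
    L_psi = [-p.deriv(2) for p in psi]         # apply L = -d^2/dx^2

    # Inner products (psi_n, L psi_m) and (psi_n, S) by quadrature.
    A = np.array([[np.sum(w * pn(x) * Lpm(x)) for Lpm in L_psi] for pn in psi])
    b = np.array([np.sum(w * pn(x) * source(x)) for pn in psi])

    C = np.linalg.solve(A, b)

    def trial(s):
        return sum(c * p(s) for c, p in zip(C, psi))

    return C, A, trial

# Source chosen so that the exact solution is sin(pi x).
C, A, phi_T = rayleigh_ritz(lambda x: np.pi**2 * np.sin(np.pi * x), N=6)
xs = np.linspace(0.0, 1.0, 11)
print("max error:", np.max(np.abs(phi_T(xs) - np.sin(np.pi * xs))))
```

Because $L$ is self-adjoint and positive for these boundary conditions, the matrix of inner products is symmetric and positive definite, which is the Hessian condition mentioned above. A monomial-type basis such as this one becomes ill-conditioned as N grows, so an orthogonal basis would be preferred for larger expansions.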
Now that the development for self-adjoint operators has been summarized, the extension to non-self-adjoint operators can be made clear very easily. This development – which applies to the time-dependent (and steady-state) multigroup diffusion equations and multigroup transport equations, and the time- and energy-dependent diffusion and transport equations as well – begins from the linear non-self-adjoint operator equation

$$A\varphi = S, \tag{8.110}$$

and the related adjoint equation

$$A^{*}\varphi^{*} = S^{*}, \tag{8.111}$$

where, for example, $A$ might be a nonsymmetric matrix of non-self-adjoint integrodifferential operators, and $A^{*}$, of course, is its adjoint. As for the self-adjoint case, a functional is introduced and its variation is set equal to zero. Here, however, the classic functional $G[\varphi]$, given by Eq. 8.103, is replaced by its generalization to the non-self-adjoint case – the Roussopoulos functional [93, 94]

$$F[\varphi^{*}, \varphi] = (\varphi^{*}, A\varphi) - (\varphi^{*}, S) - (S^{*}, \varphi), \tag{8.112}$$

where, in general, all the variables and the operator $A$ may be complex and the inner product is the standard complex inner product over the multidimensional domain $X$. (Reference [94], cited here, is a somewhat obscure, but lovely, introduction to variational methods in general and variational methods for reactor calculations in particular; and everyone interested in reactor calculations and calculational methods should have a copy in her or his library. My copy is a not-very-good photocopy of an old mimeographed copy; nevertheless, I cherish it!) The variation of $F[\varphi^{*}, \varphi]$ with respect to both $\varphi^{*}$ and $\varphi$, set equal to zero, is

$$\delta F[\varphi^{*}, \varphi] = (\delta\varphi^{*}, \{A\varphi - S\}) + (\{A^{*}\varphi^{*} - S^{*}\}, \delta\varphi) = 0, \tag{8.113}$$

and since both $\delta\varphi^{*}$ and $\delta\varphi$ are arbitrary variations this leads to Eqs. 8.110 and 8.111 for the so-called forward and adjoint solutions, respectively, as the generalized Euler–Lagrange equations. (Analogously to the self-adjoint case, if the original equations for $\varphi$ and $\varphi^{*}$, Eqs.
8.110 and 8.111, had not been known and this development began from the functional $F[\varphi^{*}, \varphi]$, setting its first variation equal to zero would have generated these equations.) Now, approximations to the solutions to the forward equation and the adjoint equation can be generated via steps closely analogous to those just summarized for the self-adjoint case. Introducing trial functions for both the forward and adjoint solutions

$$\varphi_T(x) = \sum_{n=1}^{N} C_n\,\psi_n(x), \tag{8.114}$$

and

$$\varphi_T^{*}(x) = \sum_{k=1}^{N} C_k^{*}\,\psi_k^{*}(x), \tag{8.115}$$

into the Roussopoulos functional and setting its first variation with respect to all the $C_k^{*}$ and all the $C_n$ equal to zero

$$\sum_{k=1}^{N}\left\{\partial F[\varphi_T^{*}, \varphi_T]/\partial C_k^{*}\right\}\delta C_k^{*} + \sum_{n=1}^{N}\left\{\partial F[\varphi_T^{*}, \varphi_T]/\partial C_n\right\}\delta C_n = 0, \tag{8.116}$$

or equivalently, since all the $\delta C_k^{*}$ and all the $\delta C_n$ are arbitrary, setting all the partial derivatives of $F[\varphi_T^{*}, \varphi_T]$ with respect to both the $C_k^{*}$, $k = 1, N$ and the $C_n$, $n = 1, N$ equ