Sensitivity Analysis: An introduction
Andrea Saltelli, European Commission, Joint Research Centre
Copenhagen, October 2007

Based on: Global sensitivity analysis. Gauging the worth of scientific models, John Wiley, 2007
Andrea Saltelli, Marco Ratto, Terry Andres, Francesca Campolongo, Jessica Cariboni, Debora Gatelli, Michaela Saisana, Stefano Tarantola

Outline
• Models: an unprecedented critique
• Definition of UA, SA
• Strategies
• Model quality
• Type I, II and III errors

The critique of models and what sensitivity analysis has to do with it

“They talk as if simulation were real-world data. They’re not. That’s a problem that has to be fixed. I favor a stamp: WARNING: COMPUTER SIMULATION – MAY BE ERRONEOUS and UNVERIFIABLE. Like on cigarettes […]”, p. 556

Modelling is certainly subject today to an unprecedented critique. Have models fallen out of grace? Is modelling just useless arithmetic, as claimed by Pilkey and Pilkey-Jarvis (2007)?

Useless Arithmetic: Why Environmental Scientists Can't Predict the Future, by Orrin H. Pilkey and Linda Pilkey-Jarvis
Quantitative mathematical models used by policy makers and government administrators to form environmental policies are seriously flawed.

One of the examples discussed concerns the Yucca Mountain repository for radioactive waste disposal, where a very large model called TSPA (for total system performance assessment) is used to guarantee the safe containment of the waste. TSPA is composed of 286 sub-models.

TSPA (like any other model) relies on assumptions, a crucial one being the low permeability of the geological formation and hence the long time needed for the water to percolate from the desert surface to the level of the underground disposal. The confidence of the stakeholders in TSPA was not helped when evidence was produced which could lead to an upward revision of 4 orders of magnitude of this parameter.

We just can’t predict, concludes N. N. Taleb, and we are victims of the ludic fallacy, of delusion of uncertainty, and so on. Modelling is just another attempt to ‘Platonify’ reality…
Nassim Nicholas Taleb, The Black Swan, Penguin, London 2007

You may disagree with the Talebs and the Pilkeys ... But the cat is out of the bag. Stakeholders and media will tend to expect or suspect instrumental use of computational models, or mistreatment of their uncertainty.

The critique of models
The nature of models, after Rosen
[Diagram: natural system N and formal system F, each with its own internal entailment, linked by encoding and decoding]

The critique of models
After Robert Rosen (1991), “World” (the natural system) and “Model” (the formal system) are each internally entailed, that is, driven by a causal structure [efficient, material and final causes for ‘world’; formal cause for ‘model’]. Nothing entails “World” and “Model” with one another; the association is hence the result of craftsmanship.
[Diagram: Rosen's modelling relation between natural system N and formal system F]

The critique of models
Since Galileo's times scientists have had to deal with the limited capacity of the human mind to create useful maps of ‘world’ into ‘model’. The emergence of ‘laws’ can be seen in this context as the painful process of simplification, separation and identification which leads to a model of uncharacteristic simplicity and beauty.

The critique of models
<<Groundwater models cannot be validated [!]>> Konikow and Bredehoeft, 1992.
Konikow and Bredehoeft’s work was reviewed in Science in “Verification, Validation and Confirmation of numerical models in the earth sciences”, by Oreskes et al. 1994. Both papers focused on the impossibility of model validation.

The (post modern) critique of models
The post-modern French thinker Jean Baudrillard (1990) presents 'simulation models' as unverifiable artefacts which, used in the context of mass communication, produce a fictitious hyper reality that annihilates truth.
Science for the post normal age is discussed in Funtowicz and Ravetz (1990, 1993, 1999), mostly in relation to science for policy use.
[Photos: Jerry Ravetz, Silvio Funtowicz]

The critique of models <-> Uncertainty
<<I have proposed a form of organised sensitivity analysis that I call “global sensitivity analysis” in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.>> Edward E. Leamer, 1990

Jerry Ravetz: GIGO (Garbage In, Garbage Out) Science, where uncertainties in inputs must be suppressed lest outputs become indeterminate.

The critique of models <-> Uncertainty
Peter Kennedy, A Guide to Econometrics
Anticipating criticism by applying sensitivity analysis. This is one of the ten commandments of applied econometrics according to Peter Kennedy: “Thou shall confess in the presence of sensitivity. Corollary: Thou shall anticipate criticism”

The critique of models <-> Uncertainty
When reporting a sensitivity analysis, researchers should explain fully their specification search so that the readers can judge for themselves how the results may have been affected. This is basically an `honesty is the best policy' approach, advocated by Leamer.

The critique of models - LAST!
George Box, the industrial statistician, is credited with the quote ‘all models are wrong, some are useful’. Probably the first to say that was W. Edwards Deming.
[Photos: W. E. Deming, G. Box]
Box, G.E.P., Robustness in the strategy of scientific model building, in Robustness in Statistics, R.L. Launer and G.N. Wilkinson, Editors. 1979, Academic Press: New York.
The critique of models: back to Rosen
If modelling is a craft, then it can help the craftsman if the uncertainty in the inference (the substance of the decoding exercise) is apportioned to the uncertainty in the assumptions (encoding).
[Diagram: Rosen's modelling relation between natural system N and formal system F]

Models: Uncertainty
ASSUMPTIONS <-> ENCODING; INFERENCES <-> DECODING
Apportioning inferences to assumptions is an ingredient of decoding. How can this be done?
[Diagram: Rosen's modelling relation between natural system N and formal system F]

Definition
A possible definition of sensitivity analysis is the following: the study of how uncertainty in the output of a model (numerical or otherwise) can be apportioned to different sources of uncertainty in the model input.
A related practice is `uncertainty analysis', which focuses rather on quantifying uncertainty in model output.
Ideally, uncertainty and sensitivity analyses should be run in tandem, with uncertainty analysis preceding in current practice.

Models map assumptions onto inferences … (Parametric bootstrap version of UA/SA)
[Diagram labels: Input data; Model; (Estimation); Estimated parameters; Uncertainty and sensitivity analysis; (Parametric bootstrap: we sample from the posterior parameter probability); Inference]

About models. A model can be:
1. Diagnostic or prognostic.
- models used to understand a law and
- models used to predict the behaviour of a system given a supposedly understood law.
Models can thus range from wild speculations used to play what-if games (e.g. models for the existence of extraterrestrial intelligence) to models which can be considered accurate and trusted predictors of a system (e.g. a control system for a chemical plant).

About models. A model can be:
2. Data-driven or law-driven.
- A law-driven model tries to put together accepted laws which have been attributed to the system, in order to predict its behaviour.
For example, we use Darcy's and Fick's laws to understand the motion of a solute in water flowing through a porous medium.
- A data-driven model tries to treat the solute as a signal and to derive its properties statistically.

More about law-driven versus data-driven
Advocates of data-driven models like to point out that these can be built so as to be parsimonious, i.e. to describe reality with a minimum of adjustable parameters (Young, Parkinson 1996).
Law-driven models, by contrast, are customarily overparametrized, as they may include more relevant laws than the amount of available data would support.

More about law-driven versus data-driven
For the same reason, law-driven models may have a greater capacity to describe the system under unobserved circumstances, while data-driven models tend to adhere to the behaviour associated with the data used in their estimation. Statistical models (such as hierarchical or multilevel models) are another example of data-driven models.

More categorizations of models are possible, e.g. Bell D., Raiffa H., Tversky A. (eds.) (1988), Decision making: Descriptive, normative and prescriptive interactions, Cambridge University Press, Cambridge.
• Formal models: axiomatic systems characterized by internal consistency. No need to have relations with the real world.
• Descriptive models: these models are factual in the sense that their basic assumptions should be as close as possible to the real world.
• Normative models: they propose a series of rules that an agent should follow to reach specific objectives.

Wikipedia’s entry for mathematical model
A mathematical model is an abstract model that uses mathematical language to describe the behaviour of a system.
Mathematical models are used particularly in the natural sciences and engineering disciplines (such as physics, biology, and electrical engineering) but also in the social sciences (such as economics, sociology and political science); physicists, engineers, computer scientists, and economists use mathematical models most extensively.
Eykhoff (1974) defined a mathematical model as 'a representation of the essential aspects of an existing system (or a system to be constructed) which presents knowledge of that system in usable form'.

Sample matrix for parametric bootstrap (ignoring the covariance structure)
Each row is a sample trial for one model run. Each column is a sample of size N from the marginal distribution of the parameters as generated by the estimation procedure.

Each row is the error-free result of the model run.

Bootstrapping-of-the-modelling-process version of UA/SA, after Chatfield, 1995
[Diagram labels: Model; (Model Identification); Loop on bootreplica of the input data; (Estimation); (Bootstrap of the modelling process); Estimation of parameters; Inference]

Bayesian Model Averaging
[Diagram labels: Prior of Model(s); Posterior of Model(s); Model; Data; (Sampling); Prior of Parameters; Posterior of Parameters; Inference]

The analysis can thus be set up in various ways … what are the implications for model quality?
What constitutes an input for the analysis depends upon how the analysis is set up … and will instruct the modeller about those factors which have been included.
A consequence of this is that the modeller will remain ignorant of the importance of those variables which have been kept fixed.

The spectre of type III errors:
“Assessing the wrong problem by incorrectly accepting the false meta-hypothesis that there is no difference between the boundaries of a problem, as defined by the analyst, and the actual boundaries of the problem” (Dunn, 1997).
= answering the wrong question
= framing issue (Peter Kennedy’s second commandment of applied econometrics: ‘Thou shall answer the right question’, Kennedy 2007)
Dunn, W. N.: 1997, Cognitive Impairment and Social Problem Solving: Some Tests for Type III Errors in Policy Analysis, Graduate School of Public and International Affairs, University of Pittsburgh, Pittsburgh.

The spectre of type III errors: Donald Rumsfeld's version
"Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know."

In sensitivity analysis:
Type I error: assessing as important a non-important factor
Type II error: assessing as non-important an important factor
Type III error: analysing the wrong problem

Type III in sensitivity analysis: examples
• In the case of TSPA (Yucca Mountain) a range of 0.02 to 1 millimetre per year was used for the percolation flux rate. Applying sensitivity analysis to TSPA might or might not identify this as a crucial factor, but this would be of scarce use if the value of the percolation flux were later found to be of the order of 3,000 millimetres per year.
• Another example: the result of a model is found to depend on the mesh size employed.

Our suggestions on useful requirements of a sensitivity analysis
Requirement 1. Focus
About the output Y of interest: the target of interest should not be the model output per se, but the question that the model has been called to answer.

Requirement 1 - Focus
Another implication: models must change as the question put to them changes. The optimality of a model must be weighed with respect to the task.
According to Beck et al. 1997, a model is relevant when its input factors cause variation in the ‘answer’.

Requirements
Requirement 2. Multidimensional averaging.
In a sensitivity analysis all known sources of uncertainty should be explored simultaneously, to ensure that the space of the input uncertainties is thoroughly explored and that possible interactions are captured by the analysis.

Requirements
Requirement 3. Important how?
Define unambiguously what you mean by ‘importance’ in relation to input factors / assumptions.

Requirements
Requirement 4. Pareto.
Be quantitative. Quantify relative importance by exploiting factors’ unequal influence on the output.
…
Requirement N. Look at uncertainties before going public with findings.

All models have use for sensitivity analysis …
[Figure: volume statistics histograms; panels labelled TRAPS, OIL and GAS]
Atmospheric chemistry, transport emission modelling, fish population dynamics, composite indicators, risk of portfolios, oil basins models, macroeconomic modelling, radioactive waste management ...

Prescriptions have been issued for sensitivity analysis of models when these are used for policy analysis
In Europe, the European Commission recommends sensitivity analysis in the context of the extended impact assessment guidelines and handbook (European Commission SEC(2005) 791 IMPACT ASSESSMENT GUIDELINES, 15 June 2005)
http://ec.europa.eu/governance/docs/index_en.htm

European Commission IMPACT ASSESSMENT GUIDELINES (2005)
13.5. Sensitivity analysis
Sensitivity analysis explores how the outcomes or impacts of a course of action would change in response to variations in key parameters and their interactions.
Useful techniques are presented in a book published by the JRC(*) […]
Advantages
• it is often the best way to handle the analysis of uncertainties.

Sources: a multi-author book published in 2000. Methodology and applications by several practitioners.
Chapter 1, Introduction, and Chapter 2, Hitch Hiker guide to sensitivity analysis, offer a useful introduction to the topic.

Sources: a ‘primer’, an introductory book on the topic. Its examples are based on a software package, SIMLAB, that can be freely downloaded from the web.

Prescriptions for sensitivity analysis (continued)
• Similar recommendations appear in the United States EPA’s 2004 guidelines on modelling
http://cfpub.epa.gov/crem/cremlib.cfm
Models Guidance Draft - November 2003
Draft Guidance on the Development, Evaluation, and Application of Regulatory Environmental Models
Prepared by: The Council for Regulatory Environmental Modeling

Prescriptions for sensitivity analysis (continued)
“methods should preferably be able to
(a) deal with a model regardless of assumptions about a model’s linearity and additivity;
(b) consider interaction effects among input uncertainties; and

… EPA prescriptions (continued)
(c) cope with differences in the scale and shape of input PDFs;
(d) cope with differences in input spatial and temporal dimensions; and
(e) evaluate the effect of an input while all other inputs are allowed to vary as well […].”

Other prescriptions
While the EPA prescriptions seem modern from a practitioner's viewpoint, those of the Intergovernmental Panel on Climate Change (IPCC, 1999, 2000) are rather conservative. The IPCC mentions the existence of “…sophisticated computational techniques for determining the sensitivity of a model output to input quantities...", while in fact recommending merely local (derivative-based) methods.

Sensitivity analysis and the White House
Odd though it might be, in the US the OFFICE OF MANAGEMENT AND BUDGET (OMB), in its controversial ‘Proposed Risk Assessment Bulletin’, also puts forward prescriptions for sensitivity analysis.
(next story)

Other prescriptions
Although the IPCC background papers advise the reader that [… the sensitivity is a local approach and is not valid for large deviations in non-linear functions…], they do not provide any prescription for non-linear models.

Some of the questions …
The space of the model-induced choices (the inference) swells and shrinks as we swell and shrink the space of the input assumptions. How many of the assumptions are relevant at all for the choice? And those that are relevant, how do they act on the outcome: singly or in more or less complex combinations?

Some of the questions …
If I desire a given degree of robustness in the choice, which factors/assumptions should be tested more rigorously? (Should I look at how much “fixing” any given f/a can potentially reduce the variance of the output?)

Some of the questions …
Can I confidently “fix” a subset of the input factors/assumptions? The Beck and Ravetz “relevance” issue. How do I find these f/a?

Global sensitivity analysis
Global* sensitivity analysis: “The study of how the uncertainty in the output of a model (numerical or otherwise) can be apportioned to different sources of uncertainty in the model input”.
*Global could be an unnecessary specification, were it not for the fact that most analyses met in the literature are local or one-factor-at-a-time.

Uncertainty analysis = Mapping assumptions onto inferences
Sensitivity analysis = The reverse process
[Simplified diagram, fixed model. Labels: resolution levels; errors; model structures; data; parameters; Simulation Model; model output; uncertainty analysis; sensitivity analysis; feedbacks on input data and model factors]

How to play uncertainties in environmental regulation …
Scientific American, Jun 2005, Vol. 292, Issue 6

Fabrication (and politicisation) of uncertainty
The example of the US Data Quality Act and of the OMB “Peer Review and Information Quality”, which “seemed designed to maximize the ability of corporate interests to manufacture and magnify scientific uncertainty”.

And the story goes on …
OFFICE OF MANAGEMENT AND BUDGET (OMB) Proposed Risk Assessment Bulletin (January 9, 2006)
http://www.whitehouse.gov/omb/inforeg/
OMB under attack by US legislators and scientists
“Main Man. John Graham has led the White House mission to change agencies' approach to risk”, ibidem in Nature
“The aim is to bog the process down, in the name of transparency” (Robert Shull).
[…] the proposed bulletin resembles several earlier efforts, including rules on 'information quality' and requirements for cost–benefit analyses, that make use of the OMB's extensive powers to weaken all forms of regulation.
Colin Macilwain, Safe and sound? Nature, 19 July 2006.

And still sensitivity analysis is part of the story:
4. Standard for Characterizing Uncertainty
Influential risk assessments should characterize uncertainty with a sensitivity analysis and, where feasible, through use of a numeric distribution […] Sensitivity analysis is particularly useful in pinpointing which assumptions are appropriate candidates for additional data collection to narrow the degree of uncertainty in the results. Sensitivity analysis is generally considered a minimum, necessary component of a quality risk assessment report.
OFFICE OF MANAGEMENT AND BUDGET Proposed Risk Assessment Bulletin (January 9, 2006)
http://www.whitehouse.gov/omb/inforeg/

(SA-based) considerations from Michaels' paper and the OMB story:
1) High uncertainty is not the same as low quality.
2) One should focus on the ability to sort policy options' outcomes: are they distinguishable from one another given the uncertainties?
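Point 2 above can be illustrated with a small Monte Carlo sketch. It is only an illustration under stated assumptions: the two options, their lognormal uncertainty and all parameter values are hypothetical and do not come from the sources cited in these slides. The sketch propagates the assumed input uncertainty through the log-ratio of two option scores and reports how often option A scores higher than option B. A share close to 0 or 1 suggests the options remain distinguishable despite the uncertainty; a share near 0.5 suggests they do not.

```python
import math
import random


def sample_log_ratio(n_runs=10_000, seed=0):
    """Draw Monte Carlo samples of log(score_a / score_b).

    Both scores are hypothetical and given lognormal uncertainty;
    the parameters below are illustrative assumptions only.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_runs):
        score_a = rng.lognormvariate(0.5, 0.3)  # assumed uncertainty, option A
        score_b = rng.lognormvariate(0.0, 0.3)  # assumed uncertainty, option B
        samples.append(math.log(score_a / score_b))
    return samples


def share_positive(samples):
    """Fraction of runs in which option A scores higher than option B."""
    return sum(1 for y in samples if y > 0) / len(samples)


samples = sample_log_ratio()
share = share_positive(samples)
print(f"Option A scores higher in {share:.1%} of runs")
```

Widening the assumed input uncertainty (the second argument of `lognormvariate`) moves the share towards 0.5, which is one way to see when uncertainty starts to blur the ranking of the options.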
High uncertainty is not the same as low quality
Example: imagine the inference is Y = the logarithm of the ratio between the two pressure-on-decision indices PI1 and PI2, Y = Log(PI1/PI2).
[Figure: plot of Y = Log(PI1/PI2), marking the region where incineration is preferred and the region where landfill is preferred]

Useful inference versus falsification of the analysis

Post Normal Science
Funtowicz and Ravetz, Science for the Post Normal age, Futures, 1993

Post Normal Science
Post-Normal Science: a mode of scientific problem-solving appropriate to policy issues where facts are uncertain, values in dispute, stakes high and decisions urgent.

Post Normal Science
Elements of Post Normal Science
• Appropriate management of uncertainty, quality and value-ladenness
• Plurality of commitments and perspectives
• Internal extension of peer community (involvement of other disciplines)
• External extension of peer community (involvement of stakeholders in environmental assessment & quality control)

Post Normal Science
Remark: on the Post Normal Science diagram, increasing stakes increases uncertainty
Funtowicz and Ravetz, Science for the Post Normal age, Futures, 1993

Remark: “Rising political stakes catalyze scientific uncertainty”
The critique of Daniel Sarewitz, Arizona State University
‘The notion that science can be used to reconcile political disputes is fundamentally flawed’
The example of the 2000 Bush-Gore count (American Scientist, March-April 2006, Volume 94, Number 2, Page 104)

Pedigree matrix for evaluating models
Courtesy of Jeroen van der Sluijs

Example result of pedigree analysis of emission monitoring of acidifying substances
Courtesy of Jeroen van der Sluijs
[Plot labels: Danger zone; Safe zone; strong; weak]
Labels of source activity combinations plotted:
1. NH3 dairy cows, application of manure
2. NOx mobile sources agriculture
3. NOx agricultural soils
4. NH3 meat pigs, application of manure
5. NOx highway: gasoline personal cars
6. NH3 dairy cows, animal housings and storage
7. NOx highway: truck trailers
8. NH3 breeding stock pigs, application of manure
9. NH3 calves, yearlings, application of manure
10. NH3 application of synthetic fertilizer

Summary of SA input to post-normal science to the science-law-policy interfaces
• Surprise the analyst
• Find errors
• Falsify the analysis (Popperian demarcation)*
• Make sure that you are not falsified

• Falsify the analysis (Popperian demarcation):
* ‘Scientific mathematical modelling should involve constant efforts to falsify the model’ (Pilkey and Pilkey-Jarvis, op. cit.)
** The white swan syndrome (Nassim N. Taleb, 2007)

Summary of SA input to post-normal science to the science-law-policy interfaces
• Check if policies are distinguishable
• Obtain minimal model representation
• Contribute to the pedigree of the assessment
• … other practitioner’s chores such as mesh or grid size identification

The critique of models <-> Uncertainty
Uncertainty is not an accident of the scientific method, but its substance.
Peter Høeg, a Danish novelist, writes in Borderliners (Høeg, 1995): "That is what we meant by science. That both question and answer are tied up with uncertainty, and that they are painful. But that there is no way around them. And that you hide nothing; instead, everything is brought out into the open".