4 – 6 November 2009

QUANTIFYING HAZARDS AND RISKS WITH EXPERT JUDGMENT
Willy Aspinall
Bristol University / Aspinall & Associates
Willy.Aspinall@Bristol.ac.uk

Promises, promises
“.... our work/project/research will reduce uncertainty”

The Three Horsemen of Risk Apocalypse (with apologies to Roger Cooke)
UNCERTAINTY
AMBIGUITY
INDECISION

The Three Horsemen: example
UNCERTAINTY: PF goes which way, how far and how fast?
AMBIGUITY: What is understood by term “pyroclastic flow”?
INDECISION: Do we evacuate?

The Three Horsemen: responses
UNCERTAINTY: Do measurements, quantify uncertainty
AMBIGUITY: Define concepts, domain of application
INDECISION: Assess utilities, preferences

The Three Horsemen: roles
UNCERTAINTY: Experts' role to quantify
AMBIGUITY: Analyst/facilitator's job to clarify
INDECISION: Stakeholder, problem owner's responsibility

Elicitation of expert judgment
In climate change modelling, for instance, the challenges are exemplified by:
“…. We explore a high rate of refusal to participate in this expert survey: many scientists prefer to rely on output from future climate model simulations.”
Arnell, N. W., E. L. Tompkins, et al. (2005). Eliciting Information from Experts on the Likelihood of Rapid Climate Change. Risk Analysis 25: 1419-1431.
“…The past performance of such projections has been systematically overconfident. Analysts have often used scenarios based on detailed story lines…. for evaluating uncertainty. No probabilities are typically assigned to such scenarios.”
Morgan, M.G. and D. Keith (2008). Improving the way we think about projecting future energy use and emissions of carbon dioxide. Climatic Change 90: 189-215.

The Classical Model
A performance-based procedure for quantifying uncertainties from expert judgments
Qi = Pr(event|data)???
Wj = Cj * Ij
Cooke, R.M. (1991) Experts in Uncertainty. OUP.
Cooke, R.M. and L.L.H.J. Goossens (2008) TU Delft expert judgment data base. Reliability Engineering & System Safety Expert Judgement 93: 657-674.
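As a minimal sketch of the weighting Wj = Cj * Ij, the Python below reads Cj as an expert's calibration score and Ij as the information score, forms the products, and pools the experts' probabilities with the resulting weights. Several things here are assumptions rather than anything stated on the slides: the weights are normalised to sum to 1, each expert supplies a single probability rather than a full distribution, and the function names and all numbers are purely illustrative.

```python
from typing import Sequence


def performance_weights(calibration: Sequence[float],
                        information: Sequence[float]) -> list[float]:
    """Return weights proportional to C_j * I_j, normalised to sum to 1.

    Normalisation is an assumption of this sketch; the slide only
    states the product W_j = C_j * I_j.
    """
    if len(calibration) != len(information):
        raise ValueError("need one calibration and one information score per expert")
    raw = [c * i for c, i in zip(calibration, information)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("at least one expert must have a positive score")
    return [r / total for r in raw]


def pooled_probability(weights: Sequence[float],
                       expert_probs: Sequence[float]) -> float:
    """Weighted linear pool of the experts' probabilities for one event."""
    if len(weights) != len(expert_probs):
        raise ValueError("need one probability per expert")
    return sum(w * q for w, q in zip(weights, expert_probs))


if __name__ == "__main__":
    # Hypothetical scores for three experts (illustrative only).
    calibration = [0.50, 0.10, 0.01]
    information = [1.0, 2.0, 1.5]
    weights = performance_weights(calibration, information)

    # Hypothetical probabilities each expert assigns to the same event.
    expert_probs = [0.2, 0.4, 0.6]
    print("weights:", [round(w, 3) for w in weights])
    print("pooled probability:", round(pooled_probability(weights, expert_probs), 3))
```

In this toy setting a poorly calibrated expert contributes little to the pooled value even when highly informative, which is the intent of multiplying the two scores.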
Synthesised group “Decision-Maker”
DMi = Wj*Qi

One case history (of several)
DEFRA study objective: to develop a generic quantitative model for accelerated internal erosion in Britain's population of 2,500 ageing dams, using elicited quantities for key variables
Cowlyd Reservoir inspection party - 1917
Warmwithens Dam failure - 1970
..risk assessment and reservoir safety in the UK

Experts' spreads of opinion for one parameter
Opinions on the time-to-failure (in days from first detection) for the 10%ile of slowest cases….
….. and outcomes obtained by alternative ways of weighting and pooling opinions
Note the “two schools of thought” effect…and the strong 'opinionation' of many experts

The reservoir engineers: performance-based scores, and mutual weighting rankings
Calibration weights versus mutual weights

Equal weights, performance-based weights and an expert census approach
…hypothetical SSHAC-4 expert census uncertainty spread??

Advanced 3D computational fluid dynamics modelling
Courtesy INGV and EU EXPLORIS Project
Elicitation of 'realistic' physical uncertainties on model outputs

Analysing expert elicitations with Cooke's “Classical Model”
The procedure relies on cornerstones of the scientific method:
Empirical control - evaluates weights for experts on basis of measures of performance
Accountability - inputs are traceable in terms of scientific inputs of individuals
Reproducibility - can replicate and review all calculations used
Advantages:
Impartiality - experts are treated equally prior to calibration
Equity - individual experts’ scores are maximised by stating true scientific views
Diagnostic - procedure can highlight discrepancies in reasoning or inconsistencies in interpretation
……this approach produces a “rational consensus”, and sits squarely within the Bayesian paradigm for decision-support

Montserrat - 11 October 2009
Probabilistic forecasting for Montserrat volcano using the structured expert elicitation approach
2.
GIVEN current conditions, what is the probability that within the next year the first significant development will be the resumption of lava extrusion?
SAC elicitation
Credible interval lower bound: 6.3%
Median estimate: 34.1%
Credible interval upper bound: 66.1%

Forecast metric - Brier Skill Score
Brier Score
$$ BS = \frac{1}{n} \sum_{k=1}^{n} (f_k - o_k)^2 $$
• $o_k$ = 1 if the event occurs, = 0 if the event does not occur
• $f_k$ is the probability of occurrence according to the forecast system
• BS can take on values in the range [0,1], a perfect forecast having BS = 0
Brier Skill Score
$$ BSS = \frac{BS_{cli} - BS}{BS_{cli}}, \qquad BS_{cli} = \bar{o}\,(1 - \bar{o}) $$
The forecast system has predictive skill relative to some reference (e.g. climate record) if BSS is positive, a perfect system having BSS = 1.
$\bar{o}$ = total frequency of the event o (e.g. sample climatology / global data / other reference basis)

Forecast skill performance of Montserrat SAC
Probabilistic forecast scorecard
                                    +ve BSS     zero or -ve BSS
All forecasts (110 no.)             84 (76%)    26 (24%)
Life critical forecasts (75 no.)    61 (83%)    14* (17%)
* includes some 'most threatening' scenarios (cautious)

Communicating forecast skill
Surrogate metrics for forecast skill
[Figure: Cumulative Return on Investment (ROI), in € with 1 € staked per forecast, Sep-1994 to Sep-2008]
Montserrat case, following Lenny Smith & colleagues……
[Hagedorn, R., Smith, L.A. (2008) Communicating the value of probabilistic forecasts with weather roulette. Meteorol. Appl. Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/met.9.]

Forecast performance versus outlook period
[Figure: Brier Score by outlook period; axes: Months, Brier Score rel. uniform probs.]
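The Brier Score and Brier Skill Score defined on the forecast-metric slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAC's actual scoring code: the reference score is taken as ō(1 - ō) with ō computed from the sample of outcomes unless a base rate is supplied, and the function names and the four forecasts in the demo are hypothetical.

```python
from typing import Optional, Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities f_k and outcomes o_k."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal, non-zero numbers of forecasts and outcomes")
    n = len(forecasts)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n


def brier_skill_score(forecasts: Sequence[float],
                      outcomes: Sequence[int],
                      base_rate: Optional[float] = None) -> float:
    """BSS = (BS_cli - BS) / BS_cli, with BS_cli = o_bar * (1 - o_bar).

    If base_rate is None, o_bar is the observed frequency in `outcomes`
    (sample climatology); passing another reference rate is also possible.
    """
    bs = brier_score(forecasts, outcomes)
    o_bar = sum(outcomes) / len(outcomes) if base_rate is None else base_rate
    bs_cli = o_bar * (1.0 - o_bar)
    if bs_cli == 0:
        raise ValueError("reference score is zero; skill score is undefined")
    return (bs_cli - bs) / bs_cli


if __name__ == "__main__":
    # Hypothetical forecasts and outcomes (1 = event occurred, 0 = did not).
    forecasts = [0.9, 0.1, 0.8, 0.2]
    outcomes = [1, 0, 1, 0]
    print("BS  =", round(brier_score(forecasts, outcomes), 4))
    print("BSS =", round(brier_skill_score(forecasts, outcomes), 4))
```

A positive value from `brier_skill_score` corresponds to the "+ve BSS" column of the scorecard; a forecast that always states the base rate scores zero.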
Challenging elicitations of scientific expert judgment
The Harvard study on Kuwait’s First Gulf War reparations claim
More Than 700 Fires
First Fires – Air War ~ 17 January 1991
Ground War ~ 23 February 1991
Liberation ~ 28 February 1991
Last Fire - 6 November 1991
Oil Burned ~ 4 x 10^6 barrels per day
PM Emissions ~ 3 x 10^9 kg
PM10 levels – typical 300 µg/m³, sometimes 2000
• Health effects claim based on expert elicitation: ~ 35 deaths
Individual experts’ best mortality estimates: 13, 32, 54, 110, 164, 2874
Equal Weights (82 deaths; 90% conf. range: 18 to 400)
Performance Weights (35 deaths; 16 to 54)
The judicial decision of the UN Commission eventually rejected the admissibility of this form of evidence: “…not actual data…..”
…and we won't mention Prof Nutt and cannabis!

Estimating dose-response curves for cancer risk from airborne arsenic using expert inputs
Work with the late Joey Hanzich (Env. Epid. MPhil 2006-07) and Dr Peter Baxter at IPH Cambridge

Extracting signal from expert noise
Example self-weighted curves from one individual expert for one risk ratio value…..
….and pooled results for group, combined with EXCALIBUR weights
[Figure: Weighted Cumulative Probability vs Cumulative Exposure in (mg/cubic m)*years, with curves for Estimated Risk Ratio 1.01, 1.05, 1.10, 1.50, 2.00]

A supplementary approach
The Cooke Classical Model and EXCALIBUR procedure for eliciting quantitative values and uncertainty distributions from multiple experts.
Two-factor ranking of option items by Paired Comparison with Probabilistic Inversion
For more qualitative assessments of uncertain factors, simple paired comparison analysis using Probabilistic Inversion (PI) model fitting provides an alternative way of characterizing relative rankings (“revealed preferences”) from a group, with quantitative estimates of associated uncertainties:
[Table: Importance and Performance scores, each with a standard deviation, for Items 1 to 10]
[Figure: Performance plotted against Importance for Items 1 to 10, both axes running from 0 to 1]

In almost all circumstances, and at all times, we find ourselves in a state of uncertainty - Bruno de Finetti
….and scientists will continue to be perplexed, bemused and uncertain!

Summing up
“.... our work/project/research will reduce uncertainty……”
…. a laudable goal, but the opposite is likely to emerge when exhaustive and formalized investigations of scientific uncertainty are undertaken – and scientists will have to think how best to communicate the implications for hazard and risk management!

Thank you!