InFER 2011 (Inference For Epidemic-related Risk), 28th March - 1st April 2011

Assessing the Adequacy of Epidemic Models Using Hybrid Approaches
Gavin J. Gibson & George Streftaris
Maxwell Institute for Mathematical Sciences, Heriot-Watt University
Collaborators: Alex Cook, Chris Gilligan, Tim Gottwald, Glenn Marion, Mark Woolhouse, Joao Filipe, ...
Funding: BBSRC, USDA

Outline
• Modern algorithms allow computation of solutions to complex problems in inference
• Understanding and interpreting the solutions is not always so easy!
• This is true for fitting, testing and comparing epidemic models
• Hybrid approaches & the Freudian metaphor
• Examples and alternatives
• Questions and challenges

Bayesian inference for epidemics
• Experiment: yields partial data y.
• Stochastic model: parameter θ specifying p(y | θ).
• Aim: express belief about θ as a probability density p(θ | y).
• Bayesian solution: assign a prior distribution p(θ), yielding the posterior p(θ | y) ∝ p(θ) p(y | θ).
• Problem: p(y | θ) is often an intractable integral.
• Data augmentation: consider data x from a 'richer' experiment (i.e. y = f(x)) for which p(x | θ) is tractable, and work with p(θ, x | y) ∝ p(θ) p(x | θ) p(y | x). It is often straightforward to simulate from p(θ, x | y), e.g. using MCMC.

Basic SEIR model
• S → E: if j is in state S at time t, then Pr(j is exposed in (t, t + dt)) = βI(t)dt
• E → I: sojourn time T_E^j (random time in E) follows a distribution with parameter θ_E
• I → R: sojourn time T_I^j (random time in I) follows a distribution with parameter θ_I
• Parameters: θ = (β, θ_E, θ_I)
• If the times of all transitions are observed (x), then the likelihood p(x | θ) is tractable.
• Data y are usually heavily censored/filtered (e.g. only removals are observed, or weekly totals of new infections).

Fitting using MCMC
p(θ, x | y) ∝ p(θ) p(x, y | θ)
• Construct a Markov chain with stationary distribution p(θ, x | y).
• Iterate by proposing and accepting/rejecting changes to the current state (θ_i, x_i) to obtain (θ_{i+1}, x_{i+1}).
• Updates to θ can often be carried out by Gibbs steps; updates to x usually require Metropolis-Hastings and reversible-jump type approaches.
• Iterate the chain to produce a sample from p(θ, x | y).
(See e.g. Gibson & Renshaw, 1998; O'Neill & Roberts, 1999; Streftaris & Gibson, 2004; Forrester et al., 2007; Gibson et al., 2006; Chis-Ster et al., 2008; Starr et al., 2009; ...)

Sensitivity to prior
• Markovian SEIR model: β contact rate, δ E → I rate, γ removal rate.
• Removals from a smallpox epidemic (Bailey).
[Figure: 1000 samples from the posterior using a uniform prior over a cuboid in (β, δ, γ). (Gibson & Renshaw, 2001)]
• Similar difficulties arise e.g. when considering infection processes incorporating primary (α) and secondary (β) infection rates.

Extension to spatio-temporal SI models
• Susceptible j acquires infection at rate R_j(t) = f(t; γ) (ε + β Σ_i K(d_ij, α)), the sum being over currently infected sites i.
• Spatial kernels (examples):
  1. K(d, α) = exp(-αd)
  2. K(d, α) = exp(-αd²)
  3. K(d, α) = (d + 1)^(-α)
• Can be fitted using the standard Bayesian/data-augmentation/MCMC approach. See GJG (1997), Jamieson (2004), Cook et al. (2008), Chis-Ster & Ferguson (2009).
• If models are used to design control strategies - e.g. spatial eradication programmes - then model choice can be crucial.

Example: Miami citrus canker epidemic
(Gottwald et al., Phytopathology, 2002; Jamieson, PhD thesis, U. Cambridge, 2004; Cook et al., 2008)
• Data: Dade County, Miami; 6056 susceptibles, 1124 infections after 12 30-day periods.
• The optimal strategy for eradication is sensitive to model choice.
• Control strategies can be controversial.
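As a concrete illustration of the censoring problem that motivates the data-augmentation scheme above, the following Python sketch simulates the Markovian SEIR model and then retains only the removal times. The full event history plays the role of x (tractable likelihood) and the removal times the role of y. The function and parameter names, and the density-dependent total contact rate βSI, are illustrative assumptions rather than details taken from the talk.

```python
import random

def simulate_seir(n, beta, delta, gamma, initial_exposed=1, seed=1):
    """Gillespie simulation of the Markovian SEIR model sketched above.

    Returns (x, y): the full event history x = [(time, event_type), ...],
    for which the likelihood is tractable, and the censored observation
    y = the removal times only.  The rates (S->E at beta*S*I, E->I at
    delta*E, I->R at gamma*I) and all names here are illustrative
    assumptions, not details taken from the talk.
    """
    random.seed(seed)
    S, E, I = n - initial_exposed, initial_exposed, 0
    t = 0.0
    x, removals = [], []
    while E + I > 0:
        r_exp = beta * S * I   # total S -> E rate
        r_inf = delta * E      # total E -> I rate
        r_rem = gamma * I      # total I -> R rate
        total = r_exp + r_inf + r_rem
        t += random.expovariate(total)   # time to the next event
        u = random.uniform(0.0, total)   # choose which event occurs
        if u < r_exp:
            S, E = S - 1, E + 1
            x.append((t, "exposure"))
        elif u < r_exp + r_inf:
            E, I = E - 1, I + 1
            x.append((t, "infection"))
        else:
            I = I - 1
            x.append((t, "removal"))
            removals.append(t)
    return x, removals
```

With only the removal times retained, evaluating p(y | θ) requires integrating over all unobserved exposure and infection times; the augmented target p(θ, x | y) sidesteps this by imputing those times within the MCMC.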
Classical-Bayesian spectrum
• 'Classical' model: θ fixed; the model specifies p(x | θ), where x represents quantities varying between replicate experiments. Predicts frequencies for x given θ.
• 'Bayesian' model: uncertainty in θ modelled by a prior p(θ), giving p(θ, x) = p(θ) p(x | θ). A framework for both prediction of x and learning about θ.
• How 'large' should our space of possible θ be?
  - Very large: less need to benchmark against alternatives, but problems of prior representation and sensitivity, and computational complexity.
  - Very small: greater need to assess adequacy, but sensitivity and complexity of inference are reduced.

Hybrid approaches to model adequacy
• Example: posterior predictive p-values (e.g. Rubin, 1984; Meng, 1994).
• To test H0: the model is valid. Observe y, calculate a test statistic T(y), then consider
  ∫ P(T ≥ T(y) | θ) p(θ | y) dθ.
• Interpretation:
  - the posterior probability of a more extreme value of T in the next experiment;
  - the posterior expectation of the classical p-value P(T > T(y); θ), as computed by a classical statistician with knowledge of θ.

The Freudian metaphor (see e.g. Gigerenzer*, 1993)
Co-existence of multiple facets of the statistical personality:
• SUPEREGO (conscience, criticism of the EGO): Neyman-Pearson (Gigerenzer); Classical (GJG)
• EGO (reason, common sense, translates the appetites of the ID into action): Fisher (Gigerenzer); Bayesian (GJG)
• ID (basic instincts & drives): Bayesians (Gigerenzer); Physicists? (GJG)
*in Handbook for Data Analysis in the Behavioral Sciences

Schematic diagram of the PP p-value
[Schematic, 'world of the EGO': E asserts the model and prior p(θ), observes y and imputes θ ~ p(θ | y); S computes a p-value from T(y, θ), yielding a distribution p(p | y) on [0, 1].]
A large probability of a small p-value indicates conflict between E and S.

PP p-value for model comparison
• Suppose S applies a likelihood ratio test, for example T(y, θ) = p(y | θ)/p₁(y), where p₁(y) denotes the sampling density of y under an alternative model.
• The problem of the intractable likelihood p(y | θ) arises again!
• Instead, impute S's response to observation of the latent process x.
• So long as both p and p₁ specify a tractable sampling density for x, the ratio p(x | θ)/p₁(x) can be imputed (e.g. from MCMC).

Imputed p-value from a latent process
[Schematic: E asserts the model and p(θ), observes y and imputes (θ, x) ~ p(θ, x | y); S computes a p-value from T(x, θ), again yielding p(p | y) on [0, 1].]
A large probability of a small p-value indicates conflict between E and S.
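To make the construction concrete, here is a minimal Monte Carlo sketch of a posterior predictive p-value computed from MCMC output; it assumes posterior draws of θ, a simulator for replicate data and a test statistic are already available, and all names are placeholders rather than anything from the talk. For the imputed (latent-process) version one would instead record, for each draw (θ, x), the p-value S would report from x, keeping the whole collection as a posterior sample of p-values rather than averaging it.

```python
import numpy as np

def pp_pvalue(posterior_draws, simulate_y, statistic, y_obs):
    """Monte Carlo estimate of the posterior predictive p-value
    ∫ P(T(Y) >= T(y) | θ) p(θ | y) dθ   (Rubin, 1984; Meng, 1994).

    posterior_draws : iterable of θ values sampled from p(θ | y), e.g. MCMC output
    simulate_y      : function θ -> a replicate data set drawn from p(· | θ)
    statistic       : function data -> scalar test statistic T
    All argument names are placeholders for model-specific code.
    """
    t_obs = statistic(y_obs)
    t_rep = np.array([statistic(simulate_y(theta)) for theta in posterior_draws])
    # Fraction of posterior predictive replicates at least as extreme as the data
    return float(np.mean(t_rep >= t_obs))
```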
Model checking with imputed p-values: R. solani in radish (GJG et al., 2006)
• 18 × 23 grids of plants, (approximately) daily sampling.
• High inoculum: 45 randomly chosen inoculum sites (13 replicates); low inoculum: 15 randomly chosen sites (13 replicates).
• Model: SI with primary infection (α) and nearest-neighbour secondary infection parameterised by (β₀, β₁, β₂), representing maximum rate, variability and peak timing.
• Fitted using MCMC methods.
• Results: replicates fitted jointly (assuming common parameters) and separately.
[Figure: sample of the high-inoculum data; legend: O = missing, + = primary inoculum site, X = symptomatic by day 9.]
[Figures: posterior densities of the primary infection rate α, peak secondary rate β₀, secondary variability β₁ and secondary peak time β₂, high and low inoculum.]
[Figure: posterior predictive envelope of I(t) at the joint-fit posterior mean parameters, high and low inoculum.]

Checking using 'Sellke' residuals
• If R_i(t) denotes the infectious challenge to site i at time t, then under the model
  ∫₀^{x_i} R_i(t) dt ~ Exp(1),
  where x_i denotes the infection time of i.
• Impute the latent 'Sellke' thresholds for each site, and S's p-value from a K-S test, generating a posterior distribution of p-values.
[Schematic: E asserts p(θ), observes y and imputes (θ, x, thresholds); S applies the K-S test, yielding p(p | y).]
• The posteriors for p indicate lack of fit...

High inoculum, joint fit: p-value posterior summaries
rep   mean      median    LQ        UQ        Pr(p < 5%)
H1    0.013454  0.008933  0.003938  0.017715  0.9706
H2    0.000024  0.000006  0.000002  0.000020  1.0000
H3    0.000048  0.000015  0.000005  0.000046  1.0000
H4    0.002916  0.002039  0.001029  0.003757  1.0000
H5    0.136859  0.121301  0.070754  0.189353  0.1508
H6    0.075692  0.061481  0.033481  0.101138  0.4054
H7    0.014030  0.010434  0.005457  0.018893  0.9794
H8    0.000000  0.000000  0.000000  0.000000  1.0000
H9    0.000024  0.000009  0.000003  0.000026  1.0000
H10   0.552893  0.540598  0.388507  0.707525  0.0014
H11   0.000002  0.000000  0.000000  0.000002  0.9844
H12   0.000000  0.000000  0.000000  0.000000  1.0000
H13   0.086550  0.067288  0.035668  0.116220  0.3708

Low inoculum, joint fit: p-value posterior summaries
rep   mean      median    LQ        UQ        Pr(p < 5%)
L1    0.085966  0.063765  0.030051  0.118423  0.4036
L2    0.016515  0.011087  0.004239  0.022995  0.9412
L3    0.025164  0.018007  0.008519  0.034758  0.8724
L4    0.001911  0.001254  0.000602  0.002506  1.0000
L5    0.000126  0.000034  0.000009  0.000116  1.0000
L6    0.067819  0.052805  0.024010  0.097047  0.4804
L7    0.000045  0.000017  0.000005  0.000048  1.0000
L8    0.000004  0.000000  0.000000  0.000001  1.0000
L9    0.000030  0.000006  0.000001  0.000022  1.0000
L10   0.003341  0.001897  0.000737  0.004192  0.9998
L11   0.000001  0.000000  0.000000  0.000001  1.0000
L12   0.001243  0.000678  0.000257  0.001549  1.0000
L13   0.051377  0.027275  0.008475  0.069362  0.6560
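A minimal sketch of how such a posterior distribution of K-S p-values can be assembled from MCMC output is given below. It assumes that, for each retained draw (θ, x), model-specific code can return the integrated infectious challenges ∫₀^{x_i} R_i(t) dt for the infected sites; under the asserted model these should resemble an i.i.d. Exp(1) sample, so a one-sample K-S test against the unit exponential is applied to each draw. The function names are placeholders.

```python
import numpy as np
from scipy.stats import kstest

def sellke_ks_pvalues(draws, integrated_challenge):
    """Posterior distribution of the K-S p-value for imputed Sellke residuals.

    draws                : iterable of (theta, x) pairs sampled from p(theta, x | y)
    integrated_challenge : function (theta, x) -> array of integrated challenges
                           over [0, x_i] for each infected unit i
    Both arguments are placeholders for model-specific code; under the asserted
    model the integrated challenges form an i.i.d. Exp(1) sample.
    """
    pvals = []
    for theta, x in draws:
        residuals = integrated_challenge(theta, x)
        # One-sample K-S test against the standard (unit-mean) exponential
        pvals.append(kstest(residuals, "expon").pvalue)
    return np.array(pvals)

# Summaries like those tabulated above:
#   p = sellke_ks_pvalues(mcmc_draws, challenge_fn)
#   p.mean(), np.median(p), np.quantile(p, [0.25, 0.75]), (p < 0.05).mean()
```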
Model comparison using imputed p-values
Streftaris & Gibson (PRSB, 2004) implicitly followed this approach.
• Analysed data from 2 experiments on foot-and-mouth disease (FMD) in 2 populations of sheep.
• 2 groups of 32 sheep, each subdivided into 4 sub-groups.
• Group 1 exposed to FMD; the subsequent epidemic was observed.
• Data: censored estimates of the infectious period for each sheep, and measures of peak viraemic load for each animal.
• Question: does viraemia decline as we go down the infection tree?

Experiment (Hughes et al., J. Gen. Virol., 2002)
• 32 sheep allocated to 4 groups G1, ..., G4. Animals in G1 were inoculated with FMD virus (4 at t = 0, 4 at t = 1 day). Thereafter the animals mix according to a day-by-day scheme; the idea is to 'force' higher-numbered groups further down the chain of infection.
• Data: daily tests on each animal, summarised by y = (time of first +ve test, time of last +ve test, peak viraemic level).

Model: SEIR
• Relationship between infectivity and viraemic load.
• Weibull distributions for sojourn in the E and I classes.
• Peak viraemia independent of depth in the infection chain.
• Vague priors p(θ) for the model parameters.
• Let x denote the infection network (animals at depths 0, 1, 2, ... from the inoculated group); this must be imputed using MCMC.
• S conducts a one-way ANOVA on peak viraemic levels by depth, to generate a p-value; E considers p(p | y) to identify potential conflict.
[Figure, from Streftaris & Gibson, PRSB (2004): posterior distributions of the ANOVA p-value for Groups 1 and 2.]
Arguably, a little too strongly stated. If we accept the modelling assumptions we must nevertheless concede that, with high posterior probability, an ANOVA test would provide significant evidence of differences in viraemia with depth in the infection chain.

A general infection process (Streftaris & Gibson, 2011, in preparation)
• Assume 'Sellke' thresholds drawn from a unit-mean Weibull distribution with shape parameter ν (NB ν = 1 is the exponential case).
• Is there evidence against the exponential model in favour of this new model for Experiments 1 & 2?
  A: full Bayesian - include ν as an additional parameter and consider p(ν | data);
  B: latent K-S test applied to the thresholds imputed under the exponential model;
  C: 'latent' LRT (against the Weibull alternative) applied to the thresholds imputed under the exponential model.
[Figure: results for the Weibull threshold model, Experiment 1, approaches A, B and C.]

Imputation 'reinforces' the model
[Schematic, as before: E asserts the model and p(θ), observes y and imputes (θ, x) ~ p(θ, x | y); S computes a p-value from T(x, θ). A large probability of a small p-value indicates conflict between E and S.]
• The power of tests applied to x should be expected to diminish with the amount of imputation.

Loss of 'power' as the 'richness' of x increases
• Consider simple hypotheses regarding the distribution of x: E asserts x ~ p₀(x) = p(x | θ₀); S checks against the alternative p₁(x).
• We observe y = f(x). E imputes x ~ p₀(x | y) and the result of S's test based on p₀(x)/p₁(x).
• Ideally the test should use p₀(x)/(p₁(y) p₀(x | y)) = p₀(y)/p₁(y).
• If we use z instead of x, where x = f(z), the corresponding mismatch between the denominators increases (as measured by K-L divergence).

Comparing models - 'symmetric' approaches
1. Bayesian model choice
• Embed the 'competing' models i = 1, ..., k in an expanded model space equipped with a prior over models (p₁, ..., p_k) and priors for the parameters p(θ_i), i = 1, ..., k.
• The increased complexity makes implementation of MCMC harder.
• Model posterior probabilities are sensitive to the choice of the priors p(θ_i).

CTV spread by the melon aphid
• Model: R_j = β + Σ_i d_ij^(-α), the sum being over infected trees i.
• Snapshot data: n₁ infections after 1 year, n₂ after a further year. (Gottwald et al., 1996; GJG, 1997)
• Posterior contour plots: melon aphid (3 epidemics) v brown citrus aphid (3 epidemics) (Gottwald et al., 1999).
[Figure: posterior contours in the plane of background infection β against (transformed) power-law decay α; melon aphid panels show background + nearest-neighbour spread, brown citrus aphid panels show local but not nearest-neighbour spread.]
• Analysis of such historical data could provide informative priors (an 'MA prior' and a 'BCA prior' over (β, α)) for comparison of melon-aphid and brown-citrus-aphid 'models' fitted to a new data set.
• A third model, representing an unspecified alternative and characterised by a vague prior, may not be favoured in a Bayesian model comparison.
• This leads to comparisons based on separate fitting of the models.

2. Posterior Bayes factors / DIC
• PBF (Aitkin, 1991): compare models on the basis of
  ∫ p₁(θ₁ | y) p₁(y | θ₁) dθ₁  /  ∫ p₂(θ₂ | y) p₂(y | θ₂) dθ₂,
  the ratio of posterior expectations of the likelihood.
• DIC (Spiegelhalter et al., 2002) uses the deviance D(θ) = -2 log p(y | θ). Formally
  DIC = D(θ̃) + 2p_D, where p_D = D̄ - D(θ̃)
  is a measure of complexity, θ̃ is a posterior point estimate (e.g. the posterior mean) and D̄ is the expectation of D(θ) over p(θ | y). DIC is then computed for each of the models to be compared.

DIC for epidemic modelling?
• We may need to consider an augmented parameter vector θ′ = (θ, x), where x contains the unobserved components, so that p(y, x | θ) is tractable.
• There is no unique choice of x!
• The dimension of the imputed x may approach (or exceed) the dimension of the data set y.
• See Celeux et al. (2006) for an extensive range of alternative ways to define DIC for missing-data models.
• What is the Bayesian relevance of comparing DICs across models?
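For the simple case where an observed-data likelihood is available, the DIC defined above can be computed directly from MCMC output, as in the sketch below; the function and argument names are placeholders. The complications just listed, namely which latent quantities x to include and which likelihood to place 'in focus', are exactly what this simple version hides.

```python
import numpy as np

def dic(theta_draws, log_lik, y):
    """DIC (Spiegelhalter et al., 2002) from posterior samples.

    D(theta) = -2 log p(y | theta);  pD = mean(D) - D(theta_bar);
    DIC = D(theta_bar) + 2 * pD.

    theta_draws : (M, d) array of posterior samples of theta
    log_lik     : function (theta, y) -> log p(y | theta); a placeholder for
                  whichever tractable likelihood is placed 'in focus'
    """
    theta_draws = np.asarray(theta_draws)
    deviances = np.array([-2.0 * log_lik(theta, y) for theta in theta_draws])
    theta_bar = theta_draws.mean(axis=0)     # posterior mean as plug-in estimate
    d_bar = deviances.mean()                 # posterior mean deviance
    d_at_mean = -2.0 * log_lik(theta_bar, y)
    p_d = d_bar - d_at_mean                  # effective number of parameters
    return d_at_mean + 2.0 * p_d, p_d
```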
Philosophical difficulties with DIC/PBF?
• Interpretation 1: two or more Egos required! "Batesian" rather than Bayesian?
• Interpretation 2: two or more statisticians, with the DICs interpreted by some external arbiter.
[Schematics: the observation y is shared; E1 asserts p₁(θ₁) and E2 asserts p₂(θ₂), each yielding p_i(θ_i | y) and DIC_i(y); in the second interpretation each E_i is also paired with its own Superego S_i, producing p_i(p | y).]

Summing up
• Many 'tensions' in Bayesian methods come to the fore in the context of dynamical epidemic models.
• Hybrid approaches may offer a way of addressing these tensions, by applying Bayesian methods to low-complexity models that are then checked in a classical manner.
• Perhaps we need to underplay the importance of models as predictive tools, as opposed to interpretive tools.
• Qualitative conclusions that are robust to model choice may be seen as extremely valuable.

Final example: R. solani in radish re-visited
• Model: SEI with 'quenching', primary and secondary infection, and a constant latent period; fitted to data with and without Trichoderma.
• 3 'submodels':
  1. latent period = 0 (SI);
  2. SEI with observations recording I;
  3. SEI with observations recording E + I.
[Figures: posterior densities of the primary, secondary, quenching and (where applicable) latent-period parameters under each submodel, with and without Trichoderma.]
• Although the quantitative parameter estimates change with the model, the qualitative conclusion seems robust: there is consistent evidence that T. viride affects the primary infection parameter.

Models are useful 'lenses' even if they cannot be used as 'crystal balls'!