Using Discrete Time Survival Models to Model Breakdown with TB of Cattle Using the Randomised Badger Culling Trial Dataset

William Browne§, Laura Green*, Graham Medley* and Camille Szmaragd§
§University of Bristol  *University of Warwick

Slide 2: Summary
• Description of the dataset
• Discrete time survival models with missing data
• Model extension to test imperfect TB test sensitivity
• Multiple state models

Slide 3: Randomised Badger Culling Trial (RBCT)
• Where: high-prevalence areas of the UK
• When: 1998–2005
• What: a real-time control/treatment experiment
• How:
  – 10 triplets defined (A–J)
  – 3 areas in each triplet; control: survey only; treatment: reactive or proactive culling
  – Treatment: culling badgers either around a herd breakdown (reactive) or everywhere in the area, annually (proactive)
  – Survey of signs of badger activity
  – TB testing of all herds in the areas

Slide 4: DEFRA grant on modelling TB in cattle and badgers
• Using RBCT data
• Investigate spatial and temporal patterns in TB incidence in both cattle and badgers
• Project started in December 2008 – one of four that DEFRA funded in the same call

Slide 5: DEFRA grant on modelling TB in cattle and badgers
• The data are rich in that the quantity collected by the trial was large
• Data on badgers' social groups, infection status and other characteristics come from the trial database
• We link these data to the VETNET cattle data for TB tests and to records of cattle movements from the CTS
• There are, however, several challenges to overcome

Slide 6: Challenges in modelling TB in cattle and badgers
• Badgers:
  – are not kept in fields
  – their social dynamics are disrupted by culling their fellows (there is a current DEFRA call on this)
  – survey areas give only baseline estimates
• Cattle:
  – We cannot test every cow on one day, so the time of the test needs to be considered
  – Farmers have several fields, and we are not sure which field each cow frequents with regard to transmission
  – The TB test is not perfect

Slide 7: Data selection
• Cattle data – collected for each CPHH (County Parish Holding Herd); includes many test variables along with cattle movement information (collected at the CPH level)
• Badger data – collected for individual badgers, with further information at the "social group" level
• GIS files – used to calculate neighbouring relationships and trapping effort, dealing with multiple land parcels

Slide 8: Modelling TB in cattle
• CPHH is the observation unit, within CPH (combining multiple parcels of land) as the spatial unit
• "Badger year" is the temporal unit
• Discrete time survival model (response variable: herd breakdown, HBD) based on the outcome of TB tests during that time period
• Possible predictors:
  – demographic characteristics of the farm
  – concurrent and past levels of TB infection in local cattle
  – the risk posed by importing animals onto the farm
  – badger-related variables

Slide 9: Cattle models

$$\mathrm{logit}[h_{ij}(t)] = \alpha(t) + \beta X_{ij}(t) + u_j$$

where
• h_ij(t) is the hazard of HBD during episode i of individual CPHH j
• α(t) is a function of the "time at risk" variable
• X_ij(t) are covariates, which may be time-varying or defined at the episode or individual level
• u_j is a random effect representing unobserved characteristics of individual j: a shared frailty common to all episodes for individual j. We generally assume u_j ~ N(0, σ²_u)
• The model is also extended to include spatial (CAR) random effects
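As a concrete illustration, a discrete time survival model of this form is just a binary-response multilevel logistic regression fitted to the expanded herd-year data. The authors fitted theirs by MCMC using purpose-written code; the sketch below is only a rough likelihood-based analogue in R using lme4, and all column names (hbd, time_at_risk, n_tested, n_sold, cphh) are hypothetical.

# A minimal sketch, NOT the authors' MCMC implementation. Assumes a data
# frame `expanded` with one row per CPHH per year at risk:
#   hbd          - 1 if the herd broke down in that year, 0 otherwise
#   time_at_risk - the "clock"; factor() makes alpha(t) a step function
#   n_tested     - number of cattle tested that year (example covariate)
#   n_sold       - number of cattle sold that year (example covariate)
#   cphh         - herd identifier carrying the shared frailty u_j
library(lme4)

fit <- glmer(hbd ~ factor(time_at_risk) + n_tested + n_sold + (1 | cphh),
             data = expanded, family = binomial("logit"))
summary(fit)  # fixed effects play the role of beta; the cphh variance is sigma_u^2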
Slide 10: Fitting data to the DTSM framework: constructing the response
• Expand the response variable so that there is a response for each 12-month time interval. This was done in two stages, as follows:
• Herd tested positive at some time during a 12-month period and the last test performed is not clear → herd status coded as 1
• Herd tested positive at some time during a 12-month period and the last two tests performed are clear → herd status coded as 2, indicating that the herd was no longer under restriction at the end of the period
• Herd tested negative during a 12-month period → herd status coded as 0
• Herd not tested during a 12-month period → missing value
• An episode is then defined as a period during which a herd is at risk of breakdown (our response variable)

Slide 11: Constructing the response variable (continued)
• The purpose of first constructing the herd status is to work out when herds are actually at risk.
• For example, if the herd status pattern over 5 years is 00111, then the herd is at risk in years 1–3 but not at risk in years 4 and 5 → response 001..
• Whereas the pattern 00121 means the herd is at risk in years 1–3, clears in year 4 and is back at risk in year 5 → response 001.1

Slide 12: Missing test data
• Some years are missing herd tests. We looked at three ways of dealing with this (the first two bound the probabilities):
1. Assume all missing tests are clear, i.e. fill in all missing values as 0s.
2. Assume all missing tests are positive, i.e. fill in all missing values as 1s.
3. A model-based solution in which the "true" value of the missing test is treated as a parameter and estimated by the Bayesian model.

Slide 13: Multiple pattern approach
• For each herd with missing test result(s), a set of possible patterns (and associated covariates) is determined.
• E.g. test result sequence 0, 0, M, M, 1, 0, 0, 1 → 3 possible patterns, each with the same a priori probability:
  – 0,0,1,-,-,0,0,1, with corresponding time at risk 1, 2, 3, 1, 2, 3
  – 0,0,0,1,-,0,0,1, with corresponding time at risk 1, 2, 3, 4, 1, 2, 3
  – 0,0,0,0,1,0,0,1, with corresponding time at risk 1, 2, 3, 4, 5, 1, 2, 3
• Here "-" indicates a period not at risk of event occurrence.

Slide 14: Construction of pattern sets
• The set of possible patterns is constructed following six rules (the validity of which can be debated!), depending on the location of the missing test (rules a–d are sketched in code below):
  – rule a: a missing value between a 1 and a 1 is assumed to be a 1
  – rule b: a missing value between a 1 and a 0 is assumed to be either a 0 or a 1
  – rule c: a missing value between a 0 and a 0 is assumed to be a 0
  – rule d: a missing value between a 0 and a 1 is assumed to be either a 0 or a 1
  – rule e: if the first records are missing, assume the previous non-missing value is a 0 and follow rule c or d
  – rule f: if the last records are missing, replace them by a 4 (this is equivalent to removing them)
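The switch-point structure implied by rules a–d can be made concrete with a small sketch (a hypothetical helper, not the project's actual code): between two observed statuses, rules a and c give a single deterministic completion, while rules b and d allow the status to switch once from the left value to the right value, which reproduces exactly the three patterns of the 0, 0, M, M, 1, 0, 0, 1 example.

# Sketch only: enumerate the feasible completions of one run of k missing
# values lying between observed statuses `left` and `right` (rules a-d).
enumerate_run <- function(left, right, k) {
  if (left == right) {
    return(list(rep(left, k)))  # rules a and c: deterministic completion
  }
  # rules b and d: the status switches from left to right at some point,
  # giving k + 1 feasible completions
  lapply(0:k, function(m) c(rep(left, m), rep(right, k - m)))
}

# Example from slide 13: 0 0 M M 1 0 0 1 -> three feasible patterns
fills <- enumerate_run(0, 1, 2)  # list(c(1,1), c(0,1), c(0,0))
patterns <- lapply(fills, function(f) c(0, 0, f, 1, 0, 0, 1))
# Rules e and f are then handled by padding a leading 0 or dropping
# trailing missing values.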
Slide 15: Multiple pattern approach
• These steps allow the construction, for each CPHH, of a pattern set containing all feasible patterns for that CPHH.
• For a multilevel DTS model with J herds and n_j(p) records under pattern p, herd j's contribution to the full posterior is proportional to

$$p(\beta)\, p(\sigma_u^2)\, p(\theta_j)\, p(u_j \mid \sigma_u^2) \prod_{i=1}^{n_j(p)} L\big(y_{ij}(p) \mid X_{ij}(p), \beta, u_j\big)\, I(\theta_j = p)$$

• Here θ_j is the currently chosen pattern for CPHH j, and we assume a priori that each feasible pattern is equally likely.
• Different models (sets of predictors) were then fitted using an MCMC algorithm (a mixture of Gibbs and Metropolis sampling).

Slide 16: Model fitting
• Initially we fit each trial area separately.
• The badger variables were constructed by aggregating, for each herd, the number of badgers (trapped, estimated alive at the time of the trial (Szmaragd et al., submitted) or infected) whose territory (identified through their allocation to a social group) overlapped any parcel of land to which the herd had access.
• There were large numbers of cattle movement variables, due to the different movement types.

Slide 17: Model fitting
• The method is a variant of that suggested by Cox & Wermuth:
• Begin by adding each predictor univariately to a baseline model.
• Add in all those that are significant, but remove those that are strongly correlated (ρ > 0.7).
• Knock out non-significant predictors and use the resulting model as the starting point for a new round of univariate addition.
• Continue until no further additions are required.

Slide 18: Model fitting
• We will only briefly discuss the results of fitting each trial area separately. We focussed on the Polwhele areas, due to issues with the definition of badger social territories in the Aston Down-administered areas.
• We then went on to fit the 5 proactive areas in one combined analysis (without spatial random effects).
• Finally we combined all 10 areas (5 proactive + 5 control), for which model fitting took several months!
• Results are being written up in Szmaragd et al. (in preparation a).

Slide 19: Results
• A different set of significant predictor variables was identified for each trial area.
• Only the number of cattle tested (positive effect – increased risk) and the number of cattle sold in the year of the test (negative effect – protective factor) came up consistently across all areas.
• Proactive area F1 was the only area for which a specific "badger effect" was detected.
  – For this area, the proportion of infected badgers caught and the number of badgers estimated alive had strong positive effects.
  – Note that there is less power when focussing on single areas.

Slide 20: Results for the proactive B area (OR with 2.5% and 97.5% interval limits)
• Intercept: 0.084 (0.057, 0.120)
• Post-2001 FMD (0/1): 1.973 (1.359, 2.861)
• Nb cattle tested (Y): 1.006 (1.003, 1.008)
• Nb reactors (Y-2): 1.101 (1.012, 1.202)
• Nb of positive neighbours (Y): 1.433 (1.198, 1.714)
• Nb of calves (Y-1): 1.019 (1.013, 1.026)
• Nb of cattle sold (Y): 0.972 (0.965, 0.978)
• Nb of cattle bought through market (Y) from farms tested positive the following year: 1.100 (1.066, 1.138)
• Nb of cattle bought directly (Y) from low-risk farms tested positive the previous year: 1.231 (1.034, 1.522)
• Nb of cattle bought through market (Y) from low-risk farms tested positive the following year: 1.605 (1.138, 2.572)
• Nb of cattle bought through market (Y) from high-risk farms tested negative the previous year: 0.790 (0.614, 0.951)

Slide 21: Extension to imperfect testing
• In the above analysis we assumed that the specificity of the TB test is near perfect, i.e. no false positives.
• But the sensitivity of the test may be as low as 50%: negative tests may actually be obtained for positive herds! What is the impact on the parameter estimates?

Slide 22: Extension to imperfect testing
• For each herd, if the herd status in a given year is 0 or 2, create two alternative patterns:
  – the test was a true negative, with probability 1 - p
  – the herd was actually positive (a false negative), with probability p
• 1 - p represents the negative predictive value (NPV) and can be linked to the sensitivity (Se) of the test using testing data – here Se = 0.5 is equivalent to p = 0.153 and Se = 0.95 to p = 0.008 (see Szmaragd et al., in prep. b).

Slide 23: Extension to imperfect testing
• Construct a first set of possible patterns to account for the uncertainty surrounding negative tests.
• Use the rules defined previously to deal with missing values (extending the set of possible patterns).
• Test a range of values for p (corresponding to Se between 0.5 and 1).
• Pattern selection in the MCMC algorithm is done by a Metropolis step, using the prior distribution of the (cumulative) probabilities of each pattern as the proposal distribution.

Slide 24: Extension to imperfect testing: example
• Original data: 1 0 M 2 1 0
• First we deal with the years that actually have negative tests – there are 3, giving 8 patterns:

Pattern   Probability
10M210    (1-p)^3
10M211    p(1-p)^2
10M110    p(1-p)^2
10M111    p^2(1-p)
11M210    p(1-p)^2
11M211    p^2(1-p)
11M110    p^2(1-p)
11M111    p^3

• These are simple binomial probabilities – now, conditional on these patterns, we deal with the missing data. Note that in the first 4 patterns the M can be either a 0 or a 1, whilst in the other 4 it is deterministically a 1. (A short sketch generating these probabilities follows.)
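These binomial probabilities can be generated mechanically, as in the sketch below (hypothetical code, not the project's implementation): each year with status 0 or 2 is flipped to "actually positive" with probability p, and the prior for each of the 2^3 = 8 combinations follows directly.

# Sketch only: enumerate the 8 imperfect-test patterns for 1 0 M 2 1 0
# and their binomial prior probabilities, as in the table above.
p <- 0.153                        # false-negative probability for Se = 0.5
status <- c(1, 0, NA, 2, 1, 0)    # NA is the missing test (M)
neg <- which(status %in% c(0, 2)) # years a false negative could hide in

flips <- expand.grid(rep(list(c(FALSE, TRUE)), length(neg)))
pattern_set <- apply(flips, 1, function(f) {
  s <- status
  s[neg[f]] <- 1                  # flipped years were actually positive
  paste(ifelse(is.na(s), "M", s), collapse = "")
})
prior <- apply(flips, 1, function(f) p^sum(f) * (1 - p)^sum(!f))
# pattern_set contains 10M210, 11M210, 10M110, ... and sum(prior) == 1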
Slide 25: Extension to imperfect testing: example (continued)
• Original data: 1 0 M 2 1 0. Applying the missing-data rules to each of the 8 patterns gives the full pattern set:

No.  Pattern  Prior probability  Response  Clock variable
1    100210   ½(1-p)^3           100110    112311
2    101210   ½(1-p)^3           101.10    112.11
3    100211   ½p(1-p)^2          10011.    11231.
4    101211   ½p(1-p)^2          101.1.    112.1.
5    100110   ½p(1-p)^2          1001.0    1123.1
6    101110   ½p(1-p)^2          101..0    112..1
7    100111   ½p^2(1-p)          1001..    1123..
8    101111   ½p^2(1-p)          101...    112...
9    111210   p(1-p)^2           1...10    1...11
10   111211   p^2(1-p)           1...1.    1...1.
11   111110   p^2(1-p)           1....0    1....1
12   111111   p^3                1.....    1.....

Slide 26: Effect of the imperfect test
• Tested for the proactive B area only.
• Most of the previously significant predictors were kept in the models.
• Additional significant predictors were found, mostly related to the number of cattle bought in from different types of farm.
• Confidence intervals around the parameter estimates are larger, and widen as sensitivity decreases.
• With lower test sensitivity (≤ 0.75), for some herds a pattern with a lower prior likelihood is selected as best.
• Badger predictors are still not significant here.

Slide 27: Other effects of the imperfect test
• When sensitivity is lowered, the MCMC algorithm can exhibit convergence issues.
• Here we believe the posterior may become multimodal, with some modes hard to escape:
  – a loop of "if test x was actually positive then predictor y is highly significant, so test x is definitely positive, …"
• This is an issue with a small dataset and some sparse predictors (the cattle movements).

Slide 28: Multiple state models
• Assuming a perfect test, model both transitions:
  – from "at risk" to under restriction (the current model) (state 1):

$$\mathrm{logit}[h^{(1)}_{ij}(t)] = \alpha^{(1)}(t) + \beta^{(1)} x^{(1)}_{ij}(t) + u^{(1)}_j$$

  – from under restriction to "at risk" (state 2):

$$\mathrm{logit}[h^{(2)}_{ij}(t)] = \alpha^{(2)}(t) + \beta^{(2)} x^{(2)}_{ij}(t) + u^{(2)}_j$$

  – and allow correlations between the herd-level residuals.
• Specify a single-equation model with dummy variables for each state. Interact the dummies with duration and covariates to obtain state-specific duration and covariate effects (see Steele, Goldstein and Browne, 2004; a code sketch follows below).
• Currently being written up (Szmaragd et al., in prep. c).
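To make the single-equation formulation concrete, here is a rough likelihood-based sketch in R (the authors fit this by MCMC; the data frame and the column names event, d1, d2, dur, x and cphh are hypothetical): the episodes for both transitions are stacked, the state dummies pick out state-specific intercepts, duration and covariate effects, and a bivariate herd-level random effect allows the two frailties to be correlated.

# Sketch of the single-equation two-state setup of Steele, Goldstein and
# Browne (2004). Assumes a stacked data frame with one row per herd-year:
#   event - 1 if the transition occurred in that year
#   d1/d2 - dummies: 1 if the row belongs to state 1 / state 2
#   dur   - duration in the current state; x - an example covariate
#   cphh  - herd identifier
library(lme4)

fit2 <- glmer(event ~ 0 + d1 + d2 + d1:dur + d2:dur + d1:x + d2:x +
                (0 + d1 + d2 | cphh),
              data = stacked, family = binomial("logit"))
# (0 + d1 + d2 | cphh) gives the pair (u_j^(1), u_j^(2)) a bivariate normal
# distribution with an estimated correlation between the two states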
Slide 29: Project team
• William Browne, University of Bristol
• Camille Szmaragd, University of Bristol
• Laura Green, University of Warwick
• Graham Medley, University of Warwick
• Sam Mason, University of Warwick
• Andy Mitchell, VLA
• Paul Upton, VLA

Slide 30: A plug for some other work!
• Much of my research over the years has been in statistical software development (funded largely by the ESRC).
• For the badger work we wrote stand-alone C code for the model fitting, and Camille wrote many R scripts to manipulate the data and manage the model fitting.
• In other current work we are developing a new piece of software, STAT-JR, to follow on from our work on MLwiN.

Slide 31: STAT-JR
• Named in memory of my long-term collaborator and senior partner in the MLwiN software development, Jon Rasbash.
• Our take on Jon's vision for where statistics software goes next.
• A team of programmers is working on the project (Chris Charlton, Danius Michaelides, Camille Szmaragd, Bruce Cameron and me).
• I will have a laptop with me this week to discuss the software with anyone interested.

Slide 32: The E-STAT project and STAT-JR
• STAT-JR is developed jointly by the LEMMA II and E-STAT ESRC nodes.
• It consists of a set of components, many of which exist in alpha versions, including:
  – templates for model fitting, data manipulation, input and output, controlled via a web browser interface
  – model fitting for 90% of the models that MLwiN can fit via MCMC, plus some that it can't, including greatly sped-up REALCOM templates
  – some interoperability with MLwiN, WinBUGS, R, Stata and SPSS (written by Camille)

Slides 33–34: An example of STAT-JR – setting up a model [screenshots]

Slide 35: Equations for model and model code
• Note that the equations use MathJax, so the underlying LaTeX can be copied and pasted.
• The model code is based around the WinBUGS language, with some variation.
• This is a more complex template for 2-level models.

Slide 37: Model code in detail

model {
  for (i in 1:length(normexam)) {
    normexam[i] ~ dnorm(mu[i], tau)
    mu[i] <- cons[i] * beta0 + standlrt[i] * beta1 + u[school[i]] * cons[i]
  }
  for (j in 1:length(u)) {
    u[j] ~ dnorm(0, tau_u)
  }
  # Priors
  beta0 ~ dflat()
  beta1 ~ dflat()
  tau ~ dgamma(0.001, 0.001)
  tau_u ~ dgamma(0.001, 0.001)
}

• For this template the code is, aside from the length function, standard WinBUGS model code.

Slide 38: Bruce's (demo) algebra system step for parameter u [screenshot]

Slide 40: Output of generated C++ code
• The package can output C++ code that can then be taken away by software developers and modified.

Slide 42: Output from the E-STAT engine
• Here the six-way plot functionality is carried over, in part, to STAT-JR after the model has run.
• In fact, graphs for all parameters are calculated and stored as picture files, so they can easily and quickly be viewed.

Slides 44–45: Interoperability with WinBUGS
• Interoperability in the user interface is obtained via a few extra inputs.
• In fact, in the template code, user-written functions are required for all packages apart from WinBUGS. The transfer of data between packages is, however, generic.

Slides 46–47: Output from WinBUGS with multiple chains
• STAT-JR generates the appropriate files and then fires up WinBUGS.
• Multiple chains are superimposed in the six-way plot output.
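To give a flavour of this kind of interoperability outside STAT-JR's own browser interface, the sketch below drives WinBUGS from R via the R2WinBUGS package, using the model code from the "Model code in detail" slide saved to a file. Note that standard WinBUGS has no length() function, so the loop bounds must be passed in as constants; the file name, installation path and data objects here are all assumptions.

# Sketch only: running the 2-level model from R via R2WinBUGS. Assumes the
# slide's model code is saved as "twolevel.txt" with length(normexam) and
# length(u) replaced by constants N and J.
library(R2WinBUGS)

dat <- list(normexam = normexam, standlrt = standlrt, cons = cons,
            school = school, N = length(normexam), J = max(school))
fit <- bugs(data = dat, inits = NULL,  # inits = NULL: WinBUGS generates them
            parameters.to.save = c("beta0", "beta1", "tau", "tau_u"),
            model.file = "twolevel.txt", n.chains = 3, n.iter = 5000,
            bugs.directory = "C:/Program Files/WinBUGS14")  # assumed path
print(fit)  # summaries pool the three superimposed chains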