§, Laura Green*, Graham Medley* William Browne § and Camille Szmaragd

advertisement
William Browne§, Laura Green*, Graham Medley*
and Camille Szmaragd§
§University of Bristol *University of Warwick
Using Discrete time survival models to
model breakdown with TB of cattle using
the Randomised Badger Culling Trial
dataset
2
Summary
• Description of the dataset
• Discrete time survival models with missing
data
• Model extension to test imperfect TB test
sensitivity
• Multiple state models
3
Randomised Badger Culling Trial (RBCT)
• Where: High prevalence areas of the UK
• When: 1998-2005
• What: Real-time control/treatment experiment
• How:
– 10 triplets defined (A-J)
– 3 areas in each; control: survey, treatment: reactive
or proactive
– Treatment: culling badgers either around a herd
breakdown (reactive), or everywhere and annually
in the area (proactive)
– Survey of signs of badger’s activity
– TB testing of all herds in the areas
4
DEFRA grant on modelling TB in cattle and badgers
• Using RBCT data
• Investigate spatial and temporal
patterns in TB incidence in both
cattle and badgers
• Project started in December
2008 – one of four that DEFRA
funded in the same call.
5
DEFRA grant on modelling TB in cattle and badgers
• Data are rich in so much as the
quantity of data collected by the
trial was large
• Data on badgers’ social group,
infection status and other
characteristics from trial database.
• Link these data to the VETNET
cattle data for TB tests and to
records of cattle movements from
the CTS.
• There are however several
.
challenges
to overcome
6
Challenges in modelling TB in cattle and badgers
• Badgers
• not kept in fields
• social dynamics are disrupted by
culling their fellows (current DEFRA
call on this)
• survey areas only baseline
estimates
• Cattle
• Cannot test every cow on one day so time of test needs to be
considered.
• Farmers have several fields and we are not sure which field
each cow frequents with regard transmission.
• TB test is not perfect
Data selection
7
• Cattle data
- Collected for each CPHH (County Parish Holding Herd)
includes many test variables along with cattle movement
information (collected at the CPH level).
• Badger Data
- Collected for individual badgers but also information at the
“social group” level.
• GIS files
- Used to calculate neighbouring relationships and trapping
efforts, dealing with multiple land parcels.
8
Modelling TB in cattle
• CPHH is the observation unit
within CPH (combining multiple
parcels of land) as spatial unit
• “badger year” as temporal unit
• Discrete time survival model (response variable: HBD)
based on the outcome of TB tests during that time period;
•Possible predictors:
•demographic characteristics of the farm
•concurrent and past level of TB infection in local cattle
•risk posed by importing animals onto the farm
•badger-related variables
9
Cattle models
logit[hij  t ]   (t)    X ij  t   u j ,
With hij(t) is the hazard of HBD during episode i of
individual CPHH j .
(t) is a function of the “time at risk” variable
Xij(t) are covariates which might be time-varying or defined
at the episode or individual level
uj random effect representing unobserved characteristics
of individual j shared-frailty (common to all episodes for
individual j). We generally assume that uj~N(0,σ2u)
Model also extended to
include spatial (CAR) random effects
Fitting data to DTSM framework:
Constructing response
10
Expand the response variable so there is a response for each 12-month
time interval. This was done in 2 stages as follows:
•
Herd tested positive at any time during a 12-month period and the last test
performed is not clear = Herd Status coded as 1
•
Herd tested positive at any time during a 12-month period and the last two tests
performed are clear = Herd Status coded as 2, indicating that the herd was no
longer under restriction at the end of the period
•
Herd tested negative during a 12-month period= Herd status coded as 0
•
Herd not tested during a 12-month period= Missing value
• An episode is then defined as a period where a herd is at risk of
breakdown (our response variable)
11
Constructing response variable (continued)
• The purpose of first constructing the herd status is
to work out when herds are actually at risk:
• For example if herd status pattern for 5 years is
00111 then the herd is at risk in years 1-3 but not at
risk in years 4 and 5 → 001..
Whilst pattern 00121 means that the herd is at risk in
years 1-3 clears in year 4 and then is back at risk in
year 5 → 001.1
12
Missing test data
• Existence of some years that are missing herd
tests; We looked at three ways to deal with this
(the first two to bound probabilities):
1. Assume all missing tests are clear i.e. fill in all
missing values as 0s.
2. Assume all missing tests are positive i.e. fill in all
missing values as 1s.
3. A model-based solution with “true” value for the
missing test treated as a parameter and estimated
by the Bayesian model.
13
Multiple pattern approach
• For each herd with missing test(s) results, a set of possible
patterns (and associated covariates) will be determined
– E.g. Test result sequence: 0, 0, M, M, 1, 0, 0, 1 → 3 possible
patterns with the same probability of occurring a priori:
• 0,0,1, -,-, 0,0,1; and corresponding time at risk 1, 2, 3, 1, 2, 3
• 0,0,0,1,-, 0,0,1 ; and corresponding time at risk 1, 2, 3, 4, 1, 2, 3
• 0,0,0,0,1,0,0,1 ; and corresponding time at risk 1, 2, 3, 4, 5,1, 2, 3
with – indicating a period not at risk for event occurrence.
14
Construction of pattern sets
• Set of possible patterns constructed following a set of 6
rules (validity of rules can be debated!) depending on
location of the missing test:
– rule a: a missing value between a 1 and a 1 is assumed to be a 1
– rule b: a missing value between a 1 and a 0 is assumed to be
either a 0 or a 1
– rule c: a missing value between a 0 and a 0 is assumed to be a 0
– rule d: a missing value between a 0 and a 1 is assumed to be
either a 0 or a 1
– rule e: If the first records are missing, assume the previous not
missing is a 0 and follow either rule c or d
– rule f: if the last records are missing, then replace by a 4
(this is equivalent to removing them)
Multiple pattern approach
15
• These steps will allow construction for each CPHH of a pattern
set containing all feasible patterns for that CPHH.
• The full posterior for a multilevel (DTS) model with J herds and
nj records can be estimated from the likelihood for pattern p
given parameters β
n (j p )
p (  ) p ( ) p ( j ) p (u j |  ) L( yij
2
u
2
u
j
p
( p)
| X ij
( p)
,  ,u j )
I ( j  p )
i
• Here Өj is the currently chosen pattern for CPHH j and we
assume each feasible pattern is equally likely.
• Different models (sets of predictors) were then fitted using an
MCMC algorithm (mixture of Gibbs and Metropolis sampling)
16
Model fitting
• Initially fit each trial area separately
• The badger variables were constructed by
aggregating for each herd the number of badgers
(trapped, estimated alive at the time of the trial
(Szmaragd et al. submitted ) or infected) whose
territory (identified through their allocation to a
social group) overlapped any parcels of land to
which the herd had accessed
• There were large numbers of cattle movement
variables due to different movement types.
17
Model Fitting
• Method variant of that suggested by Cox &
Wermuth.
• Begin with adding each predictor univariately to a
baseline model.
• Add in all that are significant but remove those that
are strongly correlated (ρ>0.7)
• Knock out non-significant predictors and use
resulting model as starting point for a new fit with
univariate addition.
• Continue until no addition required.
18
Model fitting
• We will only discuss briefly results of fitting each trial
area separately. We focussed only on Polwhele areas
due to issues with the definition of badger social
territories in Aston Down administered areas.
• We did then continue on to fitting the 5 proactive
areas in one combined analysis (without spatial
random effects)
• Finally we combined 10 (5 proactive + 5 control)
areas where model fitting took several months!
• Results being written up in
Szmaragd et al. (In preparation a)
19
Results
• Different set of significant predictor variables were
identified for each trial area
• Only number of cattle tested (positive effect-increased
risk) and number of cattle sold the year of the test
(negative effect – protective factor) came up
consistently for all the areas
• Proactive area F1 was the only area for which a specific
“badger-effect” was detected.
– For this area, the proportion of infected badgers caught
and the number of badger estimated alive had a strong
positive effect.
– Note less power when focussing on single areas
20
Results – for proactive B area
Variables
OR
2.5CI
97.5CI
Intercept
0.084 0.057
0.120
Post 2001- FMD (0/1)
1.973 1.359
2.861
Nb Cattle Tested (Y)
1.006 1.003
1.008
Nb Reactors (Y-2)
1.101 1.012
1.202
Nb of positive Neighbours (Y)
1.433 1.198
1.714
Nb of Calves (Y-1)
1.019 1.013
1.026
Nb of Cattle Sold (Y)
0.972 0.965
0.978
Nb of Cattle bought through Market (Y) from farms
tested positive the following year
1.100 1.066
1.138
Nb of Cattle bought directly (Y) from low risk farms
tested positive the previous year
1.231 1.034
1.522
Nb of Cattle bought through Market (Y) from low risk
farms tested positive the following year
1.605 1.138
2.572
Nb of Cattle bought through Market (Y) from high risk
farms tested negative the previous year
0.790 0.614
0.951
21
Extension to imperfect testing
• We assumed in above analysis that the
specificity of the TB test is near perfect; i.e.
No false positive
• But sensitivity of the test may be as low as
50%: negative tests may actually be
obtained for positive herds !?!?
What is the impact on the parameter
estimates?
22
Extension to imperfect testing
• For each herd, if herd status in a specific year
is 0 or 2, then create two alternative patterns:
– The test was a true negative with probability
1-p
– The herd was actually positive (false
negative), with probability p
• 1-p represents the Negative Predictive Value
(NPV) and can be linked to the sensitivity (Se)
of the test, using testing data – here Se of 0.5
is equivalent to p = 0.153 and Se of 0.95 to p
= 0.008 (see Szmaragd et al. in prep. b)
23
Extension to imperfect testing
• Construct a first set of possible patterns to
account for uncertainty related to negative tests
• Use rules defined previously to deal with missing
values (extend the set of possible patterns)
• Test a range of values for p (corresponding to Se
between 0.5 and 1)
• Pattern selection in the MCMC algorithm by
Metropolis using the prior distribution of the
(cumulative) probabilities of each pattern as a
proposal distribution
24
Extension to Imperfect testing:
Example: Original data is 1 0 M 2 1 0
Firstly we need to deal with the actual years with negative tests –
there are 3 resulting in 8 patterns
Pattern
1
2
3
4
5
6
7
8
Pattern
10M210
10M211
10M110
10M111
11M210
11M211
11M110
11M111
Prob
(1-p)3
p(1-p)2
p(1-p)2
p2(1-p)
p(1-p)2
p2(1-p)
p2(1-p)
p3
Here we have simple binomial probabilities – now conditional on these patterns
we deal with the missing data. Note in first 4 patterns the M can be either a 0 or
a 1 whilst in the other 4 it is deterministically decided to be a 1.
Extension to imperfect testing:
Example: Original data: 1 0 M 2 1 0
25
Pattern No
Pattern
Prior Prob
Response
Clock Variable
1
100210
½ (1-p)3
100110
112311
2
101210
½ (1-p)3
101.10
112.11
3
100211
½ p(1-p)2
10011.
11231.
4
101211
½ p(1-p)2
101.1.
112.1.
5
100110
½ p(1-p)2
1001.0
1123.1
6
101110
½ p(1-p)2
101..0
112..1
7
100111
½ p2 (1-p)
1001..
1123..
8
101111
½ p2 (1-p)
101...
112...
9
111210
p (1-p) 2
1...1 0
1...11
10
111211
p2(1-p)
1...1.
1...1.
11
111110
p2 (1-p)
1....0
1....1
12
111111
p3
1.....
1.....
26
Effect of imperfect test
• Tested for proactive B area only
• Most of previously significant predictors kept in the
models
• Additional significant predictors found, mostly related to
number of cattle bought in from different type of farms
• Larger confidence intervals surrounding parameter
estimates which increase as sensitivity decreases
• With lower test sensitivity (≤0.75), for some herds a
pattern with lower prior likelihood is selected as best
• Badger predictors still not significant here.
27
Other effects of imperfect test
• When sensitivity is lowered the MCMC algorithm
can exhibit convergence issues.
• Here we believe the posterior may become multimodal with some modes hard to escape.
• A loop of ‘if test x was actually positive then
predictor y is highly significant then test x is
definitely positive …’
• An issue with small dataset and some sparse
predictors (cattle movements)
28
Multiple state models
• Assuming perfect test, model both transitions:
– from “at risk” to under-restriction (current model)
(state 1) logit[hij(1) (t )]   (1) (t )   (1) xij(1) (t )  u (1)
j
– From under-restriction to “at risk” (state 2)
logit[hij(2) (t )]   (2) (t )   (2) xij(2) (t )  u(2)
j
– Allow correlations between the herd level residuals
• Specify a single equation model with dummy variables
for each state. Interact dummies with duration and
covariates to obtain state-specific duration and covariate
effects (See Steele, Goldstein and Browne, 2004)
• Currently writing up (Szmaragd et al. in prep. c)
29
Project Team
• William Browne, University of Bristol
• Camille Szmaragd, University of Bristol
• Laura Green, University of Warwick
• Graham Medley, University of Warwick
• Sam Mason, University of Warwick
• Andy Mitchell, VLA
• Paul Upton, VLA
30
A plug for some other work!
Much of my research over the years has been into
statistical software development (funded largely by the
ESRC).
For the badger work we wrote stand-alone C code for
the model fitting and Camille wrote lots of R scripts to
manipulate data and manage model fitting.
In other current work we are working on a new piece of
software STAT-JR to follow on from our work on MLwiN
31
STAT-JR
• Named in memory of my long-term collaborator
and senior partner in the MLwiN software
development, Jon Rasbash.
• Our take on Jon’s vision for where statistics
software goes next.
• A team of programmers working on the project
(Chris Charlton, Danius Michaelides, Camille
Szmaragd, Bruce Cameron and me).
• Will have laptop with me to discuss software with
interested people this week.
32
The E-STAT project and STAT-JR
STAT-JR developed jointly by LEMMA II and E-STAT ESRC
nodes.
Consists of a set of components many of which we have an
alpha version for which contains:
Templates for model fitting, data manipulation, input and output
controlled via a web browser interface.
Currently model fitting for 90% of the models that MLwiN can fit
in MCMC plus some it can’t including greatly sped up
REALCOM templates
Some interoperability with MLwiN, WinBUGS, R,
Stata and SPSS (written by Camille)
33
An example of STAT-JR – setting up a model
34
An example of STAT-JR – setting up a model
35
Equations for model and model code
Note Equations use MATHJAX and so underlying LaTeX can
be copied and paste. The model code is based around the
WinBUGS language with some variation. This is a more
complex template for 2 level models.
36
37
Model code in detail
model {
for (i in 1:length(normexam)) {
normexam[i] ~ dnorm(mu[i], tau)
mu[i] <- cons[i] * beta0 + standlrt[i] * beta1 + u[school[i]] * cons[i]
}
for (j in 1:length(u)) {
u[j] ~ dnorm(0, tau_u)
}
# Priors
beta0 ~ dflat()
beta1 ~ dflat()
tau ~ dgamma(0.001000, 0.001000)
tau_u ~ dgamma(0.001000, 0.001000)
}
For this template the code is, aside from the length function,
standard WinBUGS model code.
38
Bruce’s (Demo) algebra system step for parameter u
39
40
Output of generated C++ code
The package can output C++ code that can then be taken
away by software developers and modified.
41
42
Output from the E-STAT engine
Here the six-way plot functionality is in part taken over to
STAT-JR after the model has run. In fact graphs for all
parameters are calculated and stored as picture files so
can be easily viewed quickly.
43
44
Interoperability with WinBUGS
Interoperability in the user interface is obtained via a few extra
inputs. In fact in the template code user written functions are
required for all packages apart from WinBUGS. The transfer of
data between packages is however generic.
45
Interoperability with WinBUGS
Interoperability in the user interface is obtained via a few extra
inputs. In fact in the template code user written functions are
required for all packages apart from WinBUGS. The transfer of
data between packages is however generic.
46
Output from WinBUGS with multiple chains
STAT-JR generates appropriate files and then fires up
WinBUGS. Multiple Chains are superimposed in the sixway plot
output.
47
Output from WinBUGS with multiple chains
Download