Inferring Reproductive Numbers from Final Size Epidemic

Prologues
Estimating the Reproduction Number
Epilogue
References
Inferring Reproductive Numbers from Final Size Epidemic
Data on Hepatitis A in Flanders, Belgium: Truncation,
Censoring and Heterogeneity
Niel Hens
InFER - Warwick 2011
Hens et al.
InFER - Warwick 2011
Outline
1 Prologues
    Prologue I: Immunological Assessment
    Prologue II: Serological Assessment
2 Estimating the Reproduction Number
    Data
    Objectives
    Known Number of Initial Cases
    Unknown Number of Initial Cases
    Known and Unknown Number of Initial Cases
    Handling Reporting Issues
        Truncation
        Censoring
        Heterogeneity
3 Epilogue
Introduction
Case Study on Hepatitis A
‘Federal’ knowledge centre:
If any, targeted or universal vaccination?
Five steps:
Immunological assessment
Serological assessment
Outbreak modelling
Mathematical modelling
Cost-effectiveness analysis
Hepatitis A Immunology
Hepatitis A vaccine study: post-vaccination antibody levels
follow-up over 10 years (113 individuals)
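Consider long-lived and short-lived plasma cells producing antibodies:
\[
\frac{dP_s}{dt} = -\mu_s P_s, \qquad
\frac{dP_l}{dt} = -\mu_l P_l, \qquad
\frac{dA}{dt} = \varphi_s P_s + \varphi_l P_l - \mu A,
\]
where the $\mu$ terms are mortality rates and the $\varphi$ terms are production rates.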
Analytical solution
Inference using nonlinear mixed models (SAS & Monolix)
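Because the system is linear, the antibody level has a closed form. The sketch below is one way to write it down and check it numerically; it is not taken from the talk. The function names, the initial values and the rate values are hypothetical, and the formula assumes the antibody decay rate differs from both plasma-cell decay rates.

```python
import math

def antibody_level(t, a0, ps0, pl0, phi_s, phi_l, mu_s, mu_l, mu_a):
    """Closed-form A(t) for the linear plasma-cell/antibody system.

    Assumes mu_a differs from mu_s and mu_l; otherwise the expression
    divides by zero and a separate limiting formula is needed.
    """
    decay = math.exp(-mu_a * t)
    short = phi_s * ps0 * (math.exp(-mu_s * t) - decay) / (mu_a - mu_s)
    long_ = phi_l * pl0 * (math.exp(-mu_l * t) - decay) / (mu_a - mu_l)
    return a0 * decay + short + long_

def antibody_derivative(t, params):
    """dA/dt = phi_s * Ps + phi_l * Pl - mu_a * A, using the closed-form states."""
    a0, ps0, pl0, phi_s, phi_l, mu_s, mu_l, mu_a = params
    ps = ps0 * math.exp(-mu_s * t)   # short-lived plasma cells
    pl = pl0 * math.exp(-mu_l * t)   # long-lived plasma cells
    return phi_s * ps + phi_l * pl - mu_a * antibody_level(t, *params)

# Hypothetical values (time in months), chosen only for illustration:
# (a0, ps0, pl0, phi_s, phi_l, mu_s, mu_l, mu_a)
PARAMS = (100.0, 50.0, 10.0, 2.0, 1.0, 1.0 / 7.0, 0.001, 1.0)

def finite_difference(t, h=1e-5):
    """Central difference of the closed form, used as a consistency check."""
    return (antibody_level(t + h, *PARAMS) - antibody_level(t - h, *PARAMS)) / (2 * h)
```

Comparing `finite_difference(t)` with `antibody_derivative(t, PARAMS)` at a few time points is a quick way to confirm that the closed form satisfies the differential equation.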
Hepatitis A Immunology
Results:
antibody levels: lifespan of 1 month
short-lived plasma cells: average lifespan of 7 months
long-lived plasma cells persist for life
Assumption of lifelong immunity seems reasonable
Hepatitis A Serial Seroprevalence
Serological samples from 1993 and 2002
Serial seroprevalence
Ades and Nokes [1993], Batter et al. [1994], Marschner [1997],
Nagelkerke et al. [1999]: multiplicative age & time specific model
\[
\lambda(a, t) = \exp\Big(\mu_0 + \sum_{i=1}^{I} \mu_i a^i + \sum_{j=1}^{J} \theta_j t^j\Big), \qquad I, J = 1, 2, \ldots,
\]
assuming differential selection and a (semi-)parametric model.
Focus:
hepatitis A: SIR
non-differential selection
general framework
Hepatitis A Serial Seroprevalence
Nagelkerke et al. [1999]: proportional hazards model
for two calendar times $t_1$ and $t_2$:
\[ \lambda(a, t_2) = \exp(\beta (t_2 - t_1))\, \lambda(a, t_1). \tag{1} \]
a cohort born at calendar time $b$ has age $a$ at calendar time $a + b$:
\[ \lambda_b(a) = \lambda(a, a + b). \]
the proportional hazards assumption:
\[ \lambda_b(a) = \exp(\beta (b - b_0))\, \lambda_{b_0}(a), \tag{2} \]
where $\lambda_{b_0}(a) = \lambda(a, a + b_0)$ is the baseline force of infection
MM-algorithm: maximization-maximization with isotonic stepwise estimate to achieve $\lambda_{b_0}(a) \geq 0$
Hepatitis A Serial Seroprevalence
fully nonparametric proportional hazards model:
\[ \lambda(a, t_2) = \exp(g(t_2 - t_1))\, \lambda(a, t_1), \tag{3} \]
where $g(t)$ is a smooth function with the constraint that $g(0) = 0$.
In terms of the proportion susceptible
\[ x_b(a) = x(a, b + a) = \exp\{\exp(g(b - b_0)) \log x_{b_0}(a)\}, \tag{4} \]
where, with $x_{b_0}(a) = x(a, b_0 + a)$,
\[ x_{b_0}(a) = \exp\Big\{-\int_0^a \lambda_{b_0}(s)\, ds\Big\}. \]
Hepatitis A Serial Seroprevalence
Denote by $\pi_b(a) = 1 - x_b(a)$ the proportion of subjects infected at age $a$ or before:
\[ \log(-\log(1 - \pi_b(a))) = s_1(a) + s_2(b), \tag{5} \]
where
\[ s_1(a) = \log(-\log(1 - \pi_{b_0}(a))), \qquad s_2(b) = g(b - b_0). \]
Cohort-specific force of infection:
\[ \lambda_b(a) = \frac{\pi_b'(a)}{1 - \pi_b(a)} = \exp(s_2(b))\, \lambda_{b_0}(a), \tag{6} \]
with
\[ \lambda_{b_0}(a) = \frac{\pi_{b_0}'(a)}{1 - \pi_{b_0}(a)} = \frac{d}{da} \exp(s_1(a)). \tag{7} \]
Extensions are possible
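The additive structure on the complementary log-log scale can be checked numerically. The sketch below assumes a constant baseline force of infection and a linear cohort effect $g(d) = \beta d$; both choices, and all numeric values and names, are hypothetical and serve only to illustrate how the proportion susceptible under proportional hazards decomposes into an age term plus a cohort term.

```python
import math

LAMBDA0 = 0.02   # hypothetical constant baseline force of infection (per year)
BETA = -0.03     # hypothetical slope of the cohort effect g(d) = BETA * d
B0 = 1950        # hypothetical baseline birth cohort b0

def g(d):
    """Cohort effect on the log-hazard scale; satisfies g(0) = 0."""
    return BETA * d

def susceptible_baseline(a):
    """x_{b0}(a) = exp(-integral_0^a lambda_{b0}(s) ds) for a constant hazard."""
    return math.exp(-LAMBDA0 * a)

def susceptible(a, b):
    """x_b(a) = exp{exp(g(b - b0)) * log x_{b0}(a)} under proportional hazards."""
    return math.exp(math.exp(g(b - B0)) * math.log(susceptible_baseline(a)))

def cloglog(p):
    """Complementary log-log transform log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def s1(a):
    """Age component: cloglog of the baseline proportion infected."""
    return cloglog(1.0 - susceptible_baseline(a))

def s2(b):
    """Cohort component: g(b - b0)."""
    return g(b - B0)
```

For any age `a` and birth year `b`, `cloglog(1 - susceptible(a, b))` should agree with `s1(a) + s2(b)` up to rounding, which is what makes a generalized additive model with a cloglog link a natural fitting tool here.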
Hepatitis A Serial Seroprevalence
Standard statistical software: ‘mgcv’-package in R: thin plate
regression splines
Test for cohort-effect: approximate F-test
Results:
Model     Components     edf     AIC       rank   BIC       rank
Model 1   a + b           3.00   6062.03   5      6082.07   4
Model 2   s(a) + b       10.03   5976.90   4      6043.94   2
Model 3   a + s(b)       10.55   5958.68   3      6082.07   5
Model 4   s(a) + s(b)    13.85   5934.30   1      6026.83   1
Model 5   s(a, b)        19.04   5936.16   2      6063.34   3
Table: Results for different models: parametric model 1, semi-parametric
models 2 and 3, non-parametric models 4 and 5.
Note: TB example of Nagelkerke et al. [1999]: Model 5 - best model
Hepatitis A Serial Seroprevalence
Ensuring positivity for the cohort-specific FOI:
maximization-maximization
age: monotone P-spline
time: P-spline
Backfitting algorithm
Iterate until convergence
Two-dimensional tuning parameter selection
Hepatitis A Serial Seroprevalence
[Figure, three panels. Left and middle: seroprevalence (0.0 to 1.0) against age (0 to 80). Right: proportion immunized between 1993 and 2002 (0.00 to 0.20) against year of birth (1940 to 1990).]
Figure: HAV-seroprevalence in Flanders based on serological sample from 1993
(left panel) and 2002 (middle panel) together with the proportion infected in
the period 1993-2002 (right panel). Model-based estimates (solid lines) and
95% bootstrap-based percentile confidence intervals (dashed lines) are based
on the analysis reported by Hens et al. [2011].
Hepatitis A Serial Seroprevalence
Seroprevalence data 2002 for Belgium
Odds ratios for the spatial effect (tensor-product spline for age)
[Figure legend, odds-ratio classes: [0,0.75), [0.75,1), [1,1.25), [1.25,1.5), [1.5,3), [3,9]]
Hepatitis A Outbreak Data
113 outbreaks (provincial health agencies)
5 Flemish provinces
5 foodborne outbreaks: excluded from the analysis
available information:
final sizes
auxiliary information: school, family, children, . . .
[Figure: histogram of observed outbreak size (x-axis: observed size, 0 to 40; y-axis: frequency, 0 to 80).]
reporting issues:
number of initial cases mostly unknown
outbreaks with size at least 2
underreporting due to long incubation period: 15-45 days
Objectives
Estimate the reproduction number from final size data
Dealing with reporting issues:
marginalization
truncation
censoring
heterogeneity
Input for a mathematical model with
demography
migration rates
→ confirmation of no transmission, only importation
Estimating the Reproduction Number
Mortal branching process
Assume we know the number of initial cases
Final size: total number of infected persons with links
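Becker (1974): final size distribution
\[ P(X = x; s) = b(x, s)\, \frac{\theta^{x - s}}{A(\theta)^x}, \]
where
$S = s$ is the number of initial cases
$X = x$ is the final size
$\theta$ is the reproductive ratio
$b(x, s)$ is a constant that does not depend on $\theta$
$A(\theta)^x$ is a normalizing factor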
Estimating the Reproduction Number
Offspring distribution: generalized power series distribution
Poisson offspring distribution: Borel-Tanner distribution [Haight and Breuer, 1960]
\[ P(X = x; s) = \frac{s\, x^{x-s-1}\, \theta^{x-s}\, e^{-x\theta}}{(x - s)!}, \qquad x = s, s+1, \ldots \]
Geometric offspring distribution:
\[ P(X = x; s) = \frac{s}{2x - s} \binom{2x - s}{x - s} \frac{\theta^{x-s}}{(1 + \theta)^{2x-s}}, \qquad x = s, s+1, \ldots \]
Defined for $X < \infty$ and $\theta \leq 1$
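Both final size distributions are straightforward to evaluate on the log scale. The following is a minimal sketch, with function names of our own choosing, that computes the two probability mass functions and can be used to confirm that each sums to one and has mean $s / (1 - \theta)$ when $\theta < 1$.

```python
import math

def borel_tanner_pmf(x, s, theta):
    """Final size pmf for Poisson offspring with mean theta (0 < theta <= 1)."""
    if x < s:
        return 0.0
    log_p = (math.log(s)
             + (x - s - 1) * math.log(x)
             + (x - s) * math.log(theta)
             - x * theta
             - math.lgamma(x - s + 1))        # log (x - s)!
    return math.exp(log_p)

def geometric_final_size_pmf(x, s, theta):
    """Final size pmf for geometric offspring with mean theta (0 < theta <= 1)."""
    if x < s:
        return 0.0
    # log of the binomial coefficient C(2x - s, x - s)
    log_binom = (math.lgamma(2 * x - s + 1)
                 - math.lgamma(x - s + 1)
                 - math.lgamma(x + 1))
    log_p = (math.log(s) - math.log(2 * x - s)
             + log_binom
             + (x - s) * math.log(theta)
             - (2 * x - s) * math.log(1.0 + theta))
    return math.exp(log_p)

def total_and_mean(pmf, s, theta, x_max=2000):
    """Sum of the pmf and of x * pmf over x = s, ..., x_max."""
    probs = [(x, pmf(x, s, theta)) for x in range(s, x_max + 1)]
    return sum(p for _, p in probs), sum(x * p for x, p in probs)
```

With a single initial case, `borel_tanner_pmf(1, 1, theta)` reduces to `exp(-theta)` and `geometric_final_size_pmf(1, 1, theta)` to `1 / (1 + theta)`, the probabilities that the initial case infects nobody.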
Applied to the Data
Assume 1 source case for each outbreak (ignoring truncation):
[Figure, two panels: profile log-likelihood against R (0.2 to 1.0); the maximum is marked with an x.]
profile likelihood confidence intervals
Borel-Tanner: R̂ = 0.76 (0.68, 0.84)
Geometric Offspring: R̂ = 0.76 (0.66, 0.88)
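For the Borel-Tanner model with a known number of initial cases, setting the score $\sum_i (x_i - s)/\theta - \sum_i x_i$ to zero gives the closed-form estimate $\hat{\theta} = 1 - ns / \sum_i x_i$. The sketch below computes this estimate and a likelihood-ratio interval by bisection. The outbreak sizes are made up for illustration and are not the Flemish data; the function names are ours.

```python
import math

def borel_tanner_loglik(theta, sizes, s=1):
    """Log-likelihood in theta, dropping terms that do not involve theta."""
    return sum((x - s) * math.log(theta) - x * theta for x in sizes)

def borel_tanner_mle(sizes, s=1):
    """Closed-form maximizer: theta_hat = 1 - n * s / sum(sizes)."""
    return 1.0 - len(sizes) * s / sum(sizes)

def profile_interval(sizes, s=1, cutoff=1.92):
    """All theta whose log-likelihood lies within `cutoff` of the maximum.

    cutoff = 1.92 is half of 3.84, the 95% quantile of chi-square(1).
    The upper limit is not clipped at 1.
    """
    theta_hat = borel_tanner_mle(sizes, s)
    top = borel_tanner_loglik(theta_hat, sizes, s)

    def drop(theta):
        return top - borel_tanner_loglik(theta, sizes, s) - cutoff

    def bisect(lo, hi):
        # drop(lo) and drop(hi) have opposite signs by construction
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if (drop(lo) > 0) == (drop(mid) > 0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    hi = 2.0 * theta_hat
    while drop(hi) < 0:          # widen until the upper bracket is outside
        hi *= 2.0
    return bisect(1e-9, theta_hat), bisect(theta_hat, hi)

# Hypothetical final sizes, one initial case each (illustration only).
SIZES = [2, 2, 3, 2, 5, 2, 4, 2, 2, 8, 3, 2]
THETA_HAT = borel_tanner_mle(SIZES)
LOWER, UPPER = profile_interval(SIZES)
```

The geometric offspring model has no equally simple closed form, so there a one-dimensional numerical maximization would take the place of `borel_tanner_mle`.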
Unknown Number of Initial Cases
Crucial assumption: conditioning S = s
assume initial case distribution
derive the joint and marginal likelihood {x ≥ s}
Initial case distribution
one-parameter distributions
degenerate: S = 1
discrete cases: S = 1, 2
truncated Poisson
two-parameter distributions
discrete cases: S = 1, 2, 3
truncated negative binomial
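When the number of initial cases is unknown, the final size probability is obtained by summing over the initial case distribution: $P(X = x) = \sum_{s=1}^{x} P(S = s)\, P(X = x; s)$. The sketch below does this for a Borel-Tanner final size combined with a Poisson distribution truncated at zero. Reading "truncated Poisson" as zero-truncated is our assumption, as are the function names and the parameter name `m`.

```python
import math

def borel_tanner_pmf(x, s, theta):
    """Final size pmf given s initial cases, Poisson offspring with mean theta."""
    if x < s:
        return 0.0
    log_p = (math.log(s) + (x - s - 1) * math.log(x)
             + (x - s) * math.log(theta) - x * theta
             - math.lgamma(x - s + 1))
    return math.exp(log_p)

def truncated_poisson_pmf(s, m):
    """Zero-truncated Poisson: P(S = s) for s = 1, 2, ... with parameter m > 0."""
    if s < 1:
        return 0.0
    log_p = (-m + s * math.log(m) - math.lgamma(s + 1)
             - math.log1p(-math.exp(-m)))     # divide by 1 - P(S = 0)
    return math.exp(log_p)

def marginal_final_size_pmf(x, theta, m):
    """P(X = x) = sum over s = 1..x of P(S = s) * P(X = x; s)."""
    return sum(truncated_poisson_pmf(s, m) * borel_tanner_pmf(x, s, theta)
               for s in range(1, x + 1))

def marginal_loglik(theta, m, sizes):
    """Marginal log-likelihood of observed final sizes (no truncation handled)."""
    return sum(math.log(marginal_final_size_pmf(x, theta, m)) for x in sizes)

def expected_initial_cases(m):
    """E(S) for the zero-truncated Poisson: m / (1 - exp(-m))."""
    return m / (1.0 - math.exp(-m))
```

Maximizing `marginal_loglik` jointly over `theta` and `m` (for instance on a grid) gives a likelihood surface in two parameters, with `expected_initial_cases(m)` translating the fitted `m` into an estimate of $E(S)$.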
Applied to the Data
Unknown initial cases (ignoring truncation):
Illustration: truncated Poisson
Borel-Tanner: R̂ = 0.37, Ê(S) = 2.41
Geometric offspring: R̂ = 0.42, Ê(S) = 2.41
[Figure, two panels: log-likelihood contours over R (0.0 to 1.0) and ms (1.0 to 4.0); the maximum is marked with a dot.]
Known and Unknown Number of Initial Cases
Marginalization induces assumption-dependent results
What if we do have some information on initial cases (7 × 1)?
ignorance leads to inefficiency provided a correct model is specified
missing data approach? direct likelihood
Illustration: truncated Poisson & Borel-Tanner
[Figure, two panels: log-likelihood contours over R (0.0 to 1.0) and ms (1 to 6); the maximum is marked with a dot.]
Truncation: Known Number of Initial Cases
Assume 1 source case for each outbreak:
[Figure, two panels: profile log-likelihood against R (0.2 to 1.0); the maximum is marked with an x.]
profile likelihood confidence intervals
Truncated Borel-Tanner: R̂ = 0.58 (0.51, 0.66)
Truncated Geometric Offspring: R̂ = 0.52 (0.44, 0.61)
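Since only outbreaks of size at least 2 are reported, each contribution is conditioned on $X \geq 2$. With one initial case and Poisson offspring, $P(X = 1) = e^{-\theta}$, so the truncated probability is $P(X = x; 1) / (1 - e^{-\theta})$. The sketch below solves the resulting score equation by bisection. The outbreak sizes are invented for illustration, the function names are ours, and the bracket assumes at least one outbreak larger than 2.

```python
import math

def naive_mle(sizes):
    """Untruncated Borel-Tanner estimate with one initial case per outbreak."""
    return 1.0 - len(sizes) / sum(sizes)

def truncated_score(theta, sizes):
    """Derivative of the truncated log-likelihood (one initial case, X >= 2).

    log-likelihood = sum[(x - 1) log(theta) - x theta] - n log(1 - exp(-theta)),
    up to terms free of theta.
    """
    n = len(sizes)
    k = sum(x - 1 for x in sizes)
    return k / theta - sum(sizes) - n / math.expm1(theta)

def truncated_mle(sizes):
    """Root of the truncated score, bracketed between 0 and the naive estimate."""
    lo, hi = 1e-9, naive_mle(sizes)   # score > 0 near 0, score < 0 at the naive estimate
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if truncated_score(mid, sizes) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical final sizes, all at least 2 (illustration only).
SIZES = [2, 2, 3, 2, 5, 2, 4, 2, 2, 8, 3, 2]
THETA_NAIVE = naive_mle(SIZES)
THETA_TRUNCATED = truncated_mle(SIZES)
```

The truncated estimate is always below the naive one in this setting, because ignoring the missing size-1 outbreaks makes transmission look more intense than it is.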
Truncation: Unknown Number of Initial Cases
Illustration: truncated Poisson
Truncated Borel-Tanner: R̂ = 0.58, Ê(S) = 1.00
Truncated Geometric offspring: R̂ = 0.52, Ê(S) = 1.01
[Figure, two panels: log-likelihood contours over R (0.0 to 1.0) and ms (1.0 to 4.0); the maximum is marked with a dot.]
Truncation: Known and Unknown Number of Initial Cases
Illustration: truncated Poisson
Truncated Borel-Tanner: R̂ = 0.58, Ê(S) = 1.00
[Figure, two panels: log-likelihood contours over R (0.0 to 1.0) and ms (1.0 to 4.0); the maximum is marked with a dot.]
Censoring & Conditioning on Extinction
Farrington et al. [2003]:
$X = \infty$:
\[ P(X = \infty) = 1 - q(\theta)^s \]
Censored likelihood:
\[
L(\theta; s, x, c) = \prod_{i=1}^{n}
\Big\{ 1 - \sum_{j=s_i}^{x_i - 1} P(X = j; s_i) \Big\}^{c_i}
\Big\{ P(X = x_i; s_i) \Big\}^{1 - c_i}
\]
Only two clusters affected by censoring: R̂ = 0.59, Ê(S) = 1.00
Assumption: non-informative censoring
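A censored outbreak (one whose final size is only known to be at least the observed size) contributes the probability of reaching that size or more, while a completed outbreak contributes the usual point probability. The following sketch evaluates such a censored log-likelihood for the Borel-Tanner model. The function names and the boolean censoring indicator are our own conventions.

```python
import math

def borel_tanner_pmf(x, s, theta):
    """Final size pmf given s initial cases, Poisson offspring with mean theta."""
    if x < s:
        return 0.0
    log_p = (math.log(s) + (x - s - 1) * math.log(x)
             + (x - s) * math.log(theta) - x * theta
             - math.lgamma(x - s + 1))
    return math.exp(log_p)

def censored_loglik(theta, sizes, initial, censored):
    """Log-likelihood with right-censored final sizes.

    sizes[i]    observed size x_i of outbreak i
    initial[i]  number of initial cases s_i
    censored[i] True if the outbreak was still ongoing, so X_i >= x_i
    """
    total = 0.0
    for x, s, c in zip(sizes, initial, censored):
        if c:
            # 1 - sum_{j = s}^{x - 1} P(X = j; s): probability of size >= x
            below = sum(borel_tanner_pmf(j, s, theta) for j in range(s, x))
            total += math.log(1.0 - below)
        else:
            total += math.log(borel_tanner_pmf(x, s, theta))
    return total
```

An outbreak censored at its initial size contributes `log(1) = 0`, so it carries no information about `theta`, and a censored observation always contributes at least as much as the same size treated as final.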
Heterogeneity
Spread depends on the setting
calendar time
school vs non-school
...
Auxiliary information can be used:
regression approach:
additive effect
multiplicative effect
random effects approach: ωR where ω ∼ Γ(θ, 1/θ)
empirical Bayes estimates for individual effects
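For the random effects approach with a Borel-Tanner final size, the gamma random effect can be integrated out analytically. The sketch below assumes that $\Gamma(\theta, 1/\theta)$ denotes shape $\theta$ and scale $1/\theta$, so that $\omega$ has mean 1; the code calls this parameter `shape` to avoid a clash with the reproductive ratio. This is our own derivation under that assumption, not necessarily the computation used in the talk.

```python
import math

def borel_tanner_pmf(x, s, r):
    """Final size pmf given s initial cases, Poisson offspring with mean r."""
    if x < s:
        return 0.0
    log_p = (math.log(s) + (x - s - 1) * math.log(x)
             + (x - s) * math.log(r) - x * r
             - math.lgamma(x - s + 1))
    return math.exp(log_p)

def gamma_mixed_borel_tanner_pmf(x, s, r, shape):
    """P(X = x; s) when the reproduction number is omega * r,
    omega ~ Gamma(shape, scale = 1 / shape).

    Integrating omega^(x - s) * exp(-x * omega * r) against the gamma density
    gives shape^shape * Gamma(x - s + shape)
          / (Gamma(shape) * (x * r + shape)^(x - s + shape)).
    """
    if x < s:
        return 0.0
    d = x - s
    log_p = (math.log(s) + (d - 1) * math.log(x) + d * math.log(r)
             - math.lgamma(d + 1)
             + shape * math.log(shape)
             + math.lgamma(d + shape) - math.lgamma(shape)
             - (d + shape) * math.log(x * r + shape))
    return math.exp(log_p)
```

As `shape` grows the random effect concentrates at 1 and the mixed probability approaches the plain Borel-Tanner value. Note that when `omega * r` can exceed 1, the probabilities over finite `x` sum to less than one, the remainder being the probability of a non-terminating outbreak.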
Applied to the Data
Illustration: regression approach & marginal truncated Borel-Tanner
Provinces (significant effect due to Antwerp):
Antwerp: R̂ = 0.84
East-Flanders: R̂ = 0.43
Flemish-Brabant: R̂ = 0.45
Limburg: R̂ = 0.56
West-Flanders: R̂ = 0.57
School vs Non-school:
School: R̂ = 0.47
Non-school: R̂ = 0.72
Epilogue
Results were used as input parameters for the mathematical model
We used WAIFW and contact data
Mathematical model with demography, including migration
→ confirms no transmission only migration
Cost-effectiveness analysis
→ vaccination of second generation children
Difficult to ‘sell’ that story
Acknowledgements
Co-authors: Christel Faes, Joke Bilcke, Steven Abrams, Stijn
Jaspers, Yannick Vandendijck, Marc Aerts and Philippe Beutels
Prologue I: Olivier Lejeune, Mathieu Andraud, Zebedee Musoro
Epilogue: Chellafe Ensoy
This work has been funded by “SIMID”, a strategic basic research
project funded by the Institute for the Promotion of Innovation by
Science and Technology in Flanders (IWT), project number 060081.
References
A. E. Ades and D. J. Nokes. Modeling age- and time-specific incidence from seroprevalence: toxoplasmosis. Am J Epidemiol, 137(9):1022–1034, May 1993.
V. Batter, B. Matela, M. Nsuami, M. T., M. Kamenga, F. Behets, R. Ryder, W. Heyward, J. Karon, and M. St Louis. High HIV-1
incidence in young women masked by stable overall seroprevalence in young childbearing women in Kinshasa, Zaire: Estimating
incidence from seroprevalence data. AIDS, 8:811–817, 1994.
C. P. Farrington, M. N. Kanaan, and N. J. Gay. Branching process models for surveillance of infectious diseases controlled by mass
vaccination. Biostatistics, 4(2):279–295, Apr 2003. doi: 10.1093/biostatistics/4.2.279.
F. Haight and M. Breuer. The Borel-Tanner distribution. Biometrika, 47:143–150, 1960.
N. Hens, Z. Shkedy, M. Aerts, C. Faes, P. Van Damme, and P. Beutels. Infectious Disease Parameters for Transmission Models: A
Modern Statistical Perspective. Springer-Verlag: Forthcoming, 2011.
I. Marschner. A method for assessing age-time disease incidence using serial prevalence data. Biometrics, 53:1384–1398, 1997.
N. Nagelkerke, S. Heisterkamp, M. Borgdorff, J. Broekmans, and H. Van Houwelingen. Semi-parametric estimation of age-time specific
infection incidence from serial prevalence data. Statistics in Medicine, 18:307–320, 1999.