Supplementary Information Text S1
Quantifying spatiotemporal heterogeneity of MERS-CoV transmission in the
Middle East region: a combined modeling approach
Chiara Poletto (a), Vittoria Colizza (a, b), Pierre-Yves Boëlle (a)
(a) Sorbonne Universités, UPMC Univ Paris 06, INSERM, Institut Pierre Louis d’épidémiologie et de
Santé Publique (IPLESP UMRS 1136), F75012, 27 rue Chaligny, 75012 Paris, France.
(b) Institute for Scientific Interchange Foundation, via Alassio 11/c, 10126 Torino, Italy.
Contents
Data
  Available information
  Imputation of epidemic curves
    Distribution of time from onset to hospitalization and notification
    Imputed epidemic curves
Model design (regional analysis)
Results
  Model selection
  Posterior distribution summary
    Estimated geographical scaling factors
    Estimated parameters in the complete information on transmission scenario
  Comparison of simulated and observed epidemics
  Correlation between parameters
  Seasonal model
Sensitivity analysis
  Sensitivity on other modelling assumptions
References
Data
Available information
Data were obtained from the website [Rambaut, A. “MERS-cov Spatial, Temporal and Epidemiological
Information”, http://epidemic.bio.ed.ac.uk/coronavirus background]. Table S1 shows the completeness
of the data used for the analysis.
Variable                     % of reported cases for whom
                             information is available
Region                       100%
Date of onset                63%
Date of hospitalisation      57%
Date of notification         96%
Onset imputed from
  Known                      63%
  Hospitalisation            8%
  Notification               29%
Secondary cases              34%
Table S1: Completeness of the dataset
Imputation of epidemic curves
Distribution of time from onset to hospitalization and notification
The time from onset to hospitalisation was available for 31% of the cases (mean 4.6 days, standard
deviation 4.5 days). The time from onset to notification was available for 49% of the cases (mean
10.5 days, standard deviation 9.2 days).
Figure S1: Time from onset to hospitalisation (left) and from onset to notification (right) of MERS-CoV cases in the
Middle East from March 2012 to September 2014.
For imputing missing onset dates, we applied the following approach:
- If the hospitalisation date t_h was available, the onset date was imputed at t_h - d, where d was
sampled from the distribution of time from onset to hospitalisation, determined from other cases
hospitalized in the same period (i.e. [t_h - 30 days, t_h + 30 days]);
- If only the notification date t_r was known, the onset date was imputed at t_r - d, where d was
sampled from the distribution of time from onset to notification, determined from other cases notified in
the same period (i.e. [t_r - 45 days, t_r + 45 days]).
This approach allowed taking into account potential changes in care and notification over time.
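The imputation rule above can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the record layout (dictionaries with `onset`, `hosp` and `notif` keys holding day numbers or `None`), the helper name `impute_onset`, and the choice to leave the onset missing when no comparable case exists in the window are all assumptions made for the example.

```python
import random

def impute_onset(cases, rng=None, hosp_window=30, notif_window=45):
    """Return one imputed list of onset days (hypothetical helper).

    Each case is a dict with keys 'onset', 'hosp', 'notif' holding a day
    number or None. Known onset dates are kept unchanged.
    """
    rng = rng or random.Random()

    def delays(event, t_ref, window):
        # Observed onset-to-event delays among cases whose event date
        # falls within [t_ref - window, t_ref + window].
        return [c[event] - c["onset"] for c in cases
                if c["onset"] is not None and c[event] is not None
                and abs(c[event] - t_ref) <= window]

    imputed = []
    for c in cases:
        if c["onset"] is not None:
            imputed.append(c["onset"])
            continue
        # Use the hospitalisation date if available, else the notification date.
        if c["hosp"] is not None:
            event, window = "hosp", hosp_window
        elif c["notif"] is not None:
            event, window = "notif", notif_window
        else:
            imputed.append(None)
            continue
        pool = delays(event, c[event], window)
        # Assumption: the onset stays missing if no comparable case exists.
        imputed.append(c[event] - rng.choice(pool) if pool else None)
    return imputed

# Toy data (day numbers), for illustration only.
cases = [
    {"onset": 10, "hosp": 14, "notif": 20},
    {"onset": 12, "hosp": 15, "notif": 25},
    {"onset": None, "hosp": 18, "notif": 27},    # imputed from hospitalisation
    {"onset": None, "hosp": None, "notif": 30},  # imputed from notification
]
# Twenty imputed data sets, one per seed.
curves = [impute_onset(cases, random.Random(seed)) for seed in range(20)]
```

Repeating the draw with different seeds gives a set of imputed data sets, each of which can be aggregated into its own epidemic curve.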
Imputed epidemic curves
Twenty epidemic curves were imputed from the original data to account for missing onset dates. The
overall epidemic profile changed little across imputations.
Figure S2: MERS-CoV epidemic curves in the Middle East. The gray bars show the variability in incidence due to
imputation of missing onset dates.
Model design (regional analysis)
The step 1 model is an analysis of incident cases time series in different regions. As transmission is
still low, we hypothesized that the epidemic process was independent in the geographical regions.
In each region, we postulated (dropping the r superscript for region):
$D(t) \sim \mathrm{Poisson}(E(D^{r}(t)))$, where $D(t)$ is overall incidence at time $t$;
$s(t) \sim \mathrm{Binomial}(\pi \beta R(t-1) D(t-1), D(t))$, where $s(t)$ is incidence of cases described as secondary;
$\ln(\beta) \sim N(0, \sigma_B^2)$, random effect for transmission strength;
$\ln(\alpha) \sim N(0, \sigma_A^2)$, random effect for sporadic cases.
The prior distributions were:
$\sigma_B^2 \sim \mathrm{Exp}(1)$
$\sigma_A^2 \sim \mathrm{Exp}(1)$
$\pi \sim \mathrm{Beta}(1,1)$
$R \sim \mathrm{Exp}(0.1)$
$p'_{sp} \sim \mathrm{Exp}(0.1)$
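An example BUGS script is shown below:
# Node declarations. pop[k] is not defined in the script and must be supplied as data.
var pR[3],pr[2],alpha[n.province],beta[n.province],
    count[n.province,n.week],count.secondary[n.province,n.week], p[n.province],
    r[n.week], R[n.week]
model {
  # priors
  s.alpha ~ dexp(1)
  s.beta ~ dexp(1)
  pR[1] ~ dexp(0.1)
  pR[2] ~ dexp(0.1)
  pR[3] ~ dexp(0.1)
  pr[1] ~ dexp(0.1)
  pr[2] ~ dexp(0.1)
  # first observations
  for (k in 1:n.province) {
    # dnorm is parameterised by precision, hence 1/(s.alpha*s.alpha)
    alpha[k] ~ dnorm(0, 1/(s.alpha*s.alpha))
    count[k,1] ~ dpois(pr[1] * exp(alpha[k]) * pop[k])
    count.secondary[k,1] ~ dpois(0.01)
    p[k] ~ dbeta(1,1)
  }
  # rest of time series
  for (t in 2:n.week) {
    # sporadic-case rate: pr[1] at baseline, rising linearly for t in 87..90,
    # reaching pr[2] at t = 91 and falling back linearly for t up to 94
    r[t] <- pr[1] +
      (pr[2]-pr[1]) * (t - 87)/4 * step(t - 87)*step(90-t) +
      (pr[2]-pr[1]) * (95-t)/4 * step(t - 91)*step(94-t)
    # reproduction number: pR[2] for t in 87..90, pR[3] for t in 91..94, pR[1] otherwise
    R[t-1] <- pR[1] +
      (pR[2]-pR[1]) * step(t - 87)*step(90-t) +
      (pR[3]-pR[1]) * step(t - 91)*step(94-t)
    # in each province
    for (k in 1:n.province) {
      # expected incidence: sporadic cases plus transmission from the previous week's cases
      e[k,t] <- r[t] * exp(alpha[k]) * pop[k] + R[t-1] * count[k,t-1]
      count[k,t] ~ dpois(e[k,t])
      # secondary cases: fraction p[k] of the share of expected incidence due to transmission
      count.secondary[k,t] ~ dbinom(p[k] * R[t-1] * count[k,t-1]/e[k,t], count[k,t])
    }
  }
}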
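To make the time-varying terms of the script easier to inspect, the following Python sketch mirrors the expressions for `r[t]`, `R[t-1]` and `e[k,t]`. The week indices 87 to 94 come from the script; the function names and the numerical values used at the end are illustrative assumptions, not estimates from the analysis.

```python
import math

def step(x):
    """BUGS/JAGS step function: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def sporadic_rate(t, pr1, pr2):
    """r[t] from the script: baseline pr1 with a triangular bump peaking at pr2."""
    rise = (pr2 - pr1) * (t - 87) / 4 * step(t - 87) * step(90 - t)
    fall = (pr2 - pr1) * (95 - t) / 4 * step(t - 91) * step(94 - t)
    return pr1 + rise + fall

def reproduction_number(t, R1, R2, R3):
    """Value the script assigns to R[t-1] at loop index t."""
    return (R1
            + (R2 - R1) * step(t - 87) * step(90 - t)
            + (R3 - R1) * step(t - 91) * step(94 - t))

def expected_incidence(t, prev_count, alpha_k, pop_k, pr, pR):
    """e[k,t]: sporadic cases plus cases generated by the previous week's cases."""
    return (sporadic_rate(t, *pr) * math.exp(alpha_k) * pop_k
            + reproduction_number(t, *pR) * prev_count)

# Illustrative values only (not estimates from the paper).
pr = (1.0, 5.0)
pR = (0.5, 1.2, 0.6)
profile = [(t, sporadic_rate(t, *pr), reproduction_number(t, *pR))
           for t in range(85, 97)]
```

Listing `profile` shows the sporadic rate rising linearly to its peak value at index 91 and returning to baseline afterwards, while the reproduction number switches between three plateaus.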
Results
Model selection
Gibbs sampling was performed using JAGS and rjags. We obtained posterior samples of size 1000
from 100000 iterations sampled every 100 steps to limit autocorrelation. The DIC was computed by
the JAGS module “dic”. DICs for all models were obtained for the 20 imputed epidemics and
averaged.
Partial information
                      Geographical variation
Temporal variation    none    R       psp     both psp and R
none                  2169    2131    1973    1959
psp                   2148    2115    1891    1882
R                     2088    2056    1888    1878
both psp and R        2072    2043    1850    1837

Complete information
                      Geographical variation
Temporal variation    none    R       psp     both psp and R
none                  4288    4273    3128    3113
psp                   3382    3368    2222    2207
R                     4276    4250    3116    3090
both psp and R        3370    3344    2209    2183

Table S2: DIC for all 32 models tested in the scenarios partial information on transmission and complete
information on transmission.
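As a reading aid for Table S2, the sketch below ranks the 16 models of each scenario by DIC (lower is better) and expresses each value as a difference from the best model. The DIC values are transcribed from the table; reading the rows as temporal variation and the columns as geographical variation is an assumption about the table layout, and `rank_models` is a hypothetical helper.

```python
TEMPORAL = ["none", "psp", "R", "both"]      # row labels (assumed layout)
GEOGRAPHICAL = ["none", "R", "psp", "both"]  # column labels (assumed layout)

# DIC values transcribed from Table S2.
PARTIAL = [
    [2169, 2131, 1973, 1959],
    [2148, 2115, 1891, 1882],
    [2088, 2056, 1888, 1878],
    [2072, 2043, 1850, 1837],
]
COMPLETE = [
    [4288, 4273, 3128, 3113],
    [3382, 3368, 2222, 2207],
    [4276, 4250, 3116, 3090],
    [3370, 3344, 2209, 2183],
]

def rank_models(grid):
    """Return (delta_DIC, temporal, geographical) tuples, best model first."""
    best = min(min(row) for row in grid)
    ranked = [(dic - best, TEMPORAL[i], GEOGRAPHICAL[j])
              for i, row in enumerate(grid)
              for j, dic in enumerate(row)]
    return sorted(ranked)

partial_ranking = rank_models(PARTIAL)
complete_ranking = rank_models(COMPLETE)
```

In both scenarios the first entry of the ranking is the model with variation in both psp and R along both dimensions, so the choice of best model does not depend on how rows and columns are read.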
Posterior distribution summary
Estimated geographical scaling factors
Posterior means and credible intervals are provided for the best-fit model (partial information;
geographical and temporal variation in all regions).
Region              $\alpha_r$             $\beta_r$
UAE                 1.79 [0.59 - 3.89]     1.07 [0.68 - 1.82]
Aseer               0.57 [0.15 - 1.36]     1.06 [0.46 - 2.13]
Al Bahah            0.27 [0.03 - 0.81]     1.02 [0.31 - 2.29]
Border Region       0.81 [0.09 - 2.64]     1.01 [0.30 - 2.27]
Jordan              0.34 [0.09 - 0.80]     0.85 [0.32 - 1.54]
Al-Jawf             2.84 [0.63 - 7.27]     0.78 [0.20 - 1.44]
Kuwait              0.32 [0.06 - 0.89]     1.03 [0.36 - 2.22]
Al Madinah          2.14 [0.50 - 5.42]     1.32 [0.79 - 2.40]
Makkah              2.28 [0.64 - 5.62]     1.45 [0.96 - 2.47]
Nejran              1.67 [0.33 - 4.57]     0.82 [0.22 - 1.55]
Oman                0.46 [0.10 - 1.16]     0.77 [0.21 - 1.40]
Qatar               2.73 [0.79 - 6.38]     0.83 [0.25 - 1.54]
Al-Qassim           0.46 [0.08 - 1.31]     0.94 [0.28 - 1.95]
Riyadh              6.23 [2.27 - 13.07]    1.25 [0.83 - 2.09]
Eastern Province    3.19 [0.99 - 7.13]     1.86 [1.03 - 3.43]
Tabuk               1.25 [0.23 - 3.37]     1.11 [0.57 - 2.19]
Yemen               0.04 [0.01 - 0.10]     0.87 [0.24 - 1.69]
Table S3: Posterior means and credible intervals for parameters $\alpha_r$ and $\beta_r$ are provided for the best-fit model
(partial information; geographical and temporal variation in all regions).
Estimated parameters in the complete information on transmission scenario
The parameters estimated in the “complete information” model showed, as expected, less
transmission and more sporadic cases.
Parameter      Estimate
$q_{sp,1}$     0.027 × 10^-6 [0.010 – 0.065]
$q_{sp,2}$     1.2 × 10^-6 [0.45 – 2.8]
$R_1$          0.26 [0.15 – 0.38]
$R_2$          0.82 [0.46 – 1.3]
$R_3$          0.33 [0.20 – 0.48]
Table S4: Parameter estimates obtained in the complete information on transmission scenario.
Comparison of simulated and observed epidemics
We simulated outbreaks in the Middle East using parameters sampled from the posterior distribution of
the best-fitting model. Each week, the number of detected cases was sampled in each region from a
Poisson distribution with mean $E(D^{r}(t))$ as described in the text. The envelope of the predicted values
was in accordance with the observed epidemic and showed that large stochastic variability was
possible.
Figure S3: Observed MERS-CoV incidence in the Middle East (line), with the median (dashed) and pointwise 95% prediction
interval from 1000 simulations.
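The simulation procedure described above can be sketched for a single region as follows. This is a simplified illustration under stated assumptions: the list `posterior_draws` holds made-up (sporadic mean, R) pairs standing in for real posterior samples, both parameters are held constant over time, and the Poisson sampler uses Knuth's multiplication method, which is adequate only for the small means used here.

```python
import math
import random

def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (suitable for small lam)."""
    if lam <= 0:
        return 0
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate(n_weeks, sporadic_mean, R, rng):
    """One epidemic: weekly mean = sporadic cases + R * previous week's cases."""
    counts = [poisson(sporadic_mean, rng)]
    for _ in range(1, n_weeks):
        counts.append(poisson(sporadic_mean + R * counts[-1], rng))
    return counts

def envelope(runs, lower=0.025, upper=0.975):
    """Pointwise (lower, median, upper) quantiles across simulated epidemics."""
    bands = []
    for week in zip(*runs):
        s = sorted(week)
        pick = lambda q: s[min(len(s) - 1, int(q * len(s)))]
        bands.append((pick(lower), pick(0.5), pick(upper)))
    return bands

rng = random.Random(1)
# Hypothetical posterior draws of (sporadic mean, R); not values from the paper.
posterior_draws = [(0.5, 0.6), (0.4, 0.8), (0.6, 0.7)]
runs = [simulate(52, *rng.choice(posterior_draws), rng) for _ in range(1000)]
bands = envelope(runs)
```

Each simulation draws one parameter set and then propagates weekly counts, so the width of `bands` reflects both parameter uncertainty and stochastic variability.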
Correlation between parameters
Overall, mixing in the chains was good, and quantiles of the posterior distribution stabilized over time. We
limited autocorrelation in the posterior samples by retaining only 1 iteration every 1000. The scatterplots of
the final distributions are shown in Figure S4. The $R$ and $q_{sp}$ distributions were roughly independent, as shown
by the shape of the posterior scatterplots. Parameters of the same nature, ($R_1$, $R_2$, $R_3$) and
($q_{sp,1}$, $q_{sp,2}$), were more correlated, but the scatterplots do not suggest problems in estimation such as
multiple maxima, bimodality, or non-identifiability.
Figure S4: Bivariate scatterplot of posterior distributions for main model parameters. Parameter distributions are
from the best-fitting model. The labels R.base, R.before, R.after, p[sp]base and p[sp]peak denote, in order, $R_1$,
$R_2$, $R_3$ and $q_{sp,1}$, $q_{sp,2}$.
Seasonal model
                           Geographical variation
Temporal variation         R       psp     both psp and R
original                   12      206     0
psp seasonal               52      220     42
R seasonal                 14      213     2
both psp and R seasonal    64      228     53
Table S5: Fit of a model with seasonal change in R and psp. The results show the difference in DIC relative to
the best-fitting model reported in the manuscript, averaged over 20 imputed epidemics.
Sensitivity analysis
Sensitivity on other modelling assumptions
We explored the impact of arbitrary modelling choices on the distribution of the parameters. We
considered five variations of the original model, described below (the corresponding setting in the
original model is given in parentheses):
- wide: change in $R$ and $p_{sp}$ on a 10-week-long period (8-week-long) centered around 2014-17.
- peak 2014-16: change in $R$ and $p_{sp}$ in the two periods 2014-12 to 2014-16 and 2014-17 to 2014-20 (change in the two periods 2014-13 to 2014-17 and 2014-18 to 2014-21).
- narrow step: change in $R$ on a 4-week-long period from 2014-13 to 2014-17 (two changes,
from 2014-13 to 2014-17 and from 2014-18 to 2014-21).
- large step: change in $R$ on an 8-week-long period from 2014-13 to 2014-21 (two changes,
from 2014-13 to 2014-17 and from 2014-18 to 2014-21).
- distrib: taking into account the distribution of the generation time over 3 weeks (generation
time was 1 week). The distribution of the generation time was obtained from [1] (lognormal distribution
with meanlog=1.9, sdlog=0.49) and discretized over weeks (week 1: 81%, week 2: 17%, week 3: 2%),
compared to (week 1: 100%) in the original model.
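The discretization used in the "distrib" variation can be illustrated with the sketch below. It assumes that the lognormal distribution is expressed in days and that durations are rounded to the nearest week (cut points at 10.5 and 17.5 days, with the last week absorbing the tail). The binning convention is not stated in the text, so the resulting weights are close to, but need not equal, the reported 81%, 17% and 2%. The function `transmission_term` is likewise a hypothetical illustration of how weekly weights could enter the expected incidence.

```python
import math

def lognormal_cdf(x, meanlog, sdlog):
    """Cumulative distribution function of a lognormal variable."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / (sdlog * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))

def weekly_weights(meanlog=1.9, sdlog=0.49, cuts=(10.5, 17.5)):
    """Probability mass per week; the last week absorbs the upper tail.

    Assumption: `cuts` are upper bounds in days for all but the last week.
    """
    cdf = [0.0] + [lognormal_cdf(c, meanlog, sdlog) for c in cuts] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def transmission_term(counts, t, R, weights):
    """R times past incidence weighted by the generation-time distribution."""
    return R * sum(w * counts[t - lag]
                   for lag, w in enumerate(weights, start=1)
                   if t - lag >= 0)

weights = weekly_weights()
# With weights [1.0] the term reduces to R * counts[t-1], as in the original model.
original = transmission_term([3, 5, 8], 2, 0.8, [1.0])
```

Changing `cuts` shows how sensitive the weekly weights are to the binning convention.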
Table S6 summarizes the fits, as measured by DIC. There was little difference in overall goodness of
fit: in all cases, a model allowing changes in both psp and R had better fit than others. More precisely,
the shift by one week of the peak time had no effect on the overall DIC (model “peak 2014-16”), and
other changes led to small increases in the DIC.
Geographical      Temporal    original   peak      narrow   wide    large   distrib
                                         2014-16   step             step
none              constant    2169       2147      2147     2147    2147    2098
none              both        1959       1959      2097     1958    1958    1974
psp               constant    2148       2129      2132     2131    2131    2088
psp               both        1882       1879      1882     1878    1882    1901
R                 constant    2088       2088      2109     2083    2109    2012
R                 both        1888       1887      1921     1897    1921    1878
both psp and R    constant    2072       2054      2096     2071    2096    2006
both psp and R    both        1837       1837      1867     1850    1867    1849
Table S6: DICs of the models in the sensitivity analysis. We consider here the four levels of geographical
heterogeneity and only two levels of temporal heterogeneity (constant and both).
The distributions obtained in these models are shown in Figure S5 for each region (we multiplied the
parameter value by the region-specific modifier). Overall, there were no major changes in the posterior
distributions. As expected, in the “large step” model, the $R$ estimates were lower during the period of
epidemic increase ($R_2$) and higher during the decreasing part of the epidemic wave ($R_3$). In the “distrib”
model, the $p_{sp}$ parameters decreased while the $R$ parameters increased relative to the original, but the changes
were small in magnitude.
Figure S5: Sensitivity analysis: parameter distributions according to the five model formulations considered in the
analysis. Parameter distributions are presented for all regions under study. The labels R.base, R.before, R.after,
p[sp]base and p[sp]peak denote, in order, $R_1$, $R_2$, $R_3$ and $q_{sp,1}$, $q_{sp,2}$.
References
1. Assiri A, McGeer A, Perl TM, Price CS, Al Rabeaah AA, Cummings DA et al. Hospital outbreak of
Middle East respiratory syndrome coronavirus. New England Journal of Medicine 2013;
369(5):407-16. http://dx.doi.org/10.1056/NEJMoa1306742.