1 - Lancaster University

A Comparison of INLA and MCMC for the Estimation
of Smoothed Risk Maps in Epidemiology
Rebeca Ramis1,3, Virgilio Gómez-Rubio2, Peter J. Diggle1 & Gonzalo López-Abente3
CIBERESP: Centro de Investigación Biomédica en red, Epidemiología y Salud Pública
1 Division of Medicine, Lancaster University, UK. {r.ramis,p.diggle}@lancaster.ac.uk
2 Departamento de Matemáticas, Universidad de Castilla-La Mancha, Spain. Virgilio.Gomez@uclm.es
3 Centro Nacional de Epidemiología, Instituto de Salud ’Carlos III’, Madrid, Spain. glabente@isciii.es
Background
In epidemiology, spatial disease mapping models are commonly used to estimate
smoothed risk maps. In this context, the Besag, York and Mollié model has been
widely used.
[Figure: maps of RME (standardised mortality ratio) for oesophageal cancer ("RME. Esófago"), with legend classes 0 to 0.67, 0.67 to 0.77, 0.77 to 0.91, 0.91 to 0.95, 0.95 to 1.05, 1 to 1.1, 1.1 to 1.3, 1.3 to 1.5 and 1.5 to 200, and a count for each class.]
Besag, York and Mollié model

$O_i \sim \mathrm{Po}(\mu_i)$
$\log(\mu_i) = \rho + h_i + b_i$
$h_i \sim N(0, v_h)$
$b_i \sim \mathrm{CAR}(\,\cdot\,, v_b)$

$O_i$ is the number of cases in area $i$, $\rho$ is the prevalence of the disease, $h_i$ is a
heterogeneity random effect and $b_i$ is a spatially correlated random effect.

As in many other Bayesian models, MCMC is often used to estimate the posterior
distribution of the parameters of interest, with the main disadvantage of being
computationally intensive, especially when the number of areas is high (we work
with the 8068 Spanish towns).

In contrast, INLA (Rue, Martino and Chopin, 2009, JRSS-B, 71:319–392) has recently
offered an alternative with a negligible computational burden.
Our aim is to compare the performance and accuracy of both techniques using a
factorial experiment over the real geographical distribution of the 8068 small
areas into which Spain is divided.
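To make the Background model concrete, the following is a minimal simulation sketch, not the authors' code. It assumes a log-linear predictor with rho as intercept, a proper CAR prior with precision (D - alpha W) / v_b, and a small regular lattice in place of the real map of Spanish towns. The names `grid_adjacency`, `simulate_bym` and the parameter `alpha` are hypothetical and do not come from the poster.

```python
import numpy as np

def grid_adjacency(nrow, ncol):
    """Rook adjacency matrix W for an nrow x ncol lattice (stand-in for a real map)."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            a = r * ncol + c
            if c + 1 < ncol:              # neighbour to the right
                W[a, a + 1] = W[a + 1, a] = 1.0
            if r + 1 < nrow:              # neighbour below
                W[a, a + ncol] = W[a + ncol, a] = 1.0
    return W

def simulate_bym(W, rho, v_h, v_b, alpha=0.9, seed=0):
    """Draw one dataset: O_i ~ Po(mu_i), log(mu_i) = rho + h_i + b_i."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    # Unstructured heterogeneity: h_i ~ N(0, v_h), with v_h a variance.
    h = rng.normal(0.0, np.sqrt(v_h), size=n)
    # Spatial effect: proper CAR with precision Q (assumption, see lead-in).
    D = np.diag(W.sum(axis=1))
    Q = (D - alpha * W) / v_b
    L = np.linalg.cholesky(Q)             # Q = L @ L.T
    z = rng.standard_normal(n)
    b = np.linalg.solve(L.T, z)           # Cov(b) = inverse of Q
    mu = np.exp(rho + h + b)
    O = rng.poisson(mu)
    return O, mu, h, b

W = grid_adjacency(5, 5)
O, mu, h, b = simulate_bym(W, rho=0.0, v_h=0.1, v_b=0.2)
```

With `alpha` strictly below 1 the precision matrix is positive definite, so the Cholesky factorisation is valid; an intrinsic CAR (alpha = 1) would need a different sampling scheme.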
Factorial experiment
We carry out a factorial experiment with 3 factors: $\rho$ (prevalence), $v_h$ (variance of
the heterogeneity term) and $v_b$ (variance of the spatial autocorrelation term). We define 3
levels for each factor: low (1), medium (2) and high (3). These combinations of
factors are intended to reproduce various real scenarios of chronic disease outcomes in Spain.
We simulate 25 datasets for each of the 27 scenarios and compute the 90% CI
Empirical Coverage for each dataset. We then take the mean over the 25 replications
in each scenario:
$y_{ijk} = \mathrm{mean}(\text{90\% CI Empirical Coverage for scenario } ijk)$
where:
• i = 1,2,3 levels of the spatial autocorrelation term variance ($v_b$)
• j = 1,2,3 levels of the heterogeneity term variance ($v_h$)
• k = 1,2,3 levels of prevalence ($\rho$)
We repeat the experiment with INLA and MCMC (WinBUGS) to assess and compare
their performance.
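The design above can be sketched in a few lines. This is an illustrative outline rather than the authors' implementation: it assumes that coverage is measured as the fraction of areas whose simulated true value falls inside its 90% interval, and the names `SCENARIOS`, `empirical_coverage` and `scenario_mean_coverage` are hypothetical.

```python
from itertools import product
from statistics import mean

LEVELS = (1, 2, 3)  # low, medium, high

# Each scenario is a triple (i, j, k): levels of v_b, v_h and rho.
SCENARIOS = list(product(LEVELS, repeat=3))

def empirical_coverage(lower, upper, truth):
    """Fraction of areas whose true value lies inside [lower, upper]."""
    hits = [lo <= t <= hi for lo, hi, t in zip(lower, upper, truth)]
    return sum(hits) / len(hits)

def scenario_mean_coverage(replicates):
    """Mean coverage over replicates; each replicate is (lower, upper, truth)."""
    return mean(empirical_coverage(lo, hi, t) for lo, hi, t in replicates)

# Toy illustration with made-up interval limits (not data from the poster).
rep_a = ([0.5, 0.8, 1.0], [1.5, 1.2, 2.0], [1.0, 1.3, 1.5])  # 2 of 3 covered
rep_b = ([0.5, 0.8, 1.0], [1.5, 1.4, 2.0], [1.0, 1.3, 1.5])  # 3 of 3 covered
y_toy = scenario_mean_coverage([rep_a, rep_b])
```

In the experiment described here, each scenario would contribute 25 replicates instead of the two toy replicates shown, and the interval limits would come from the INLA or MCMC fit.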
Results
Our results show that both techniques produce similar estimates.
In the examples shown for some of the simulated data, $\hat{\mu}_{INLA}$ and $\hat{\mu}_{MCMC}$
are very similar, but the standard errors (se) behave differently. For INLA,
$\hat{\mu}$ and se are independent, whereas for MCMC, se increases with
increasing $\hat{\mu}$.

[Figure: plots for Scenario 1.1.1 and Scenario 2.2.2 (simulation 1). Panels: $\hat{\mu}$ vs se for INLA, $\hat{\mu}$ vs se for MCMC (axes: Mean, Standard Error), and INLA vs MCMC.]

90% CI Empirical Coverage

              y_INLA                    y_MCMC
 j   k    i=1    i=2    i=3         i=1    i=2    i=3
 1   1   0.899  0.913  0.912       0.895  0.905  0.904
 1   2   0.918  0.920  0.914       0.901  0.899  0.904
 1   3   0.929  0.928  0.924       0.901  0.899  0.902
 2   1   0.919  0.937  0.911       0.895  0.908  0.894
 2   2   0.929  0.940  0.915       0.895  0.906  0.897
 2   3   0.932  0.940  0.920       0.897  0.901  0.899
 3   1   0.933  0.966  0.923       0.945  0.925  0.873
 3   2   0.943  0.965  0.941       0.939  0.910  0.905
 3   3   0.943  0.962  0.945       0.932  0.911  0.909
• The 90% CI Empirical Coverage for INLA estimations exceeds 90% for all
combinations except 1.1.1.
[Figure: plots for Scenario 3.3.3 (simulation 1), with axes Mean and Standard Error, and INLA against MCMC.]
• MCMC coverage is close to 90% for scenarios with j=1 and j=2; for j=3 it is
above 90%, except for scenario 3.3.1.
• INLA results vary across factor levels. Increases in the heterogeneity term
variance (j) and in prevalence (k) increase the 90% CI Empirical Coverage, but
increases in the spatial autocorrelation term variance (i) do not have
the same effect.
• MCMC results are not affected by changes in the factor levels.
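The statements above can be checked directly against the reported mean coverage values. The sketch below transcribes those values (rows indexed by j and k, columns by i) and uses two hypothetical helpers, `below_nominal` and `marginal_means`, which are not part of the original analysis.

```python
from statistics import mean

# Mean 90% CI empirical coverage: table[(j, k)] = (i=1, i=2, i=3).
INLA = {
    (1, 1): (0.899, 0.913, 0.912), (1, 2): (0.918, 0.920, 0.914),
    (1, 3): (0.929, 0.928, 0.924), (2, 1): (0.919, 0.937, 0.911),
    (2, 2): (0.929, 0.940, 0.915), (2, 3): (0.932, 0.940, 0.920),
    (3, 1): (0.933, 0.966, 0.923), (3, 2): (0.943, 0.965, 0.941),
    (3, 3): (0.943, 0.962, 0.945),
}
MCMC = {
    (1, 1): (0.895, 0.905, 0.904), (1, 2): (0.901, 0.899, 0.904),
    (1, 3): (0.901, 0.899, 0.902), (2, 1): (0.895, 0.908, 0.894),
    (2, 2): (0.895, 0.906, 0.897), (2, 3): (0.897, 0.901, 0.899),
    (3, 1): (0.945, 0.925, 0.873), (3, 2): (0.939, 0.910, 0.905),
    (3, 3): (0.932, 0.911, 0.909),
}

def below_nominal(table, nominal=0.90):
    """Scenarios (i, j, k) whose mean coverage is below the nominal level."""
    return sorted(
        (i, j, k)
        for (j, k), row in table.items()
        for i, value in enumerate(row, start=1)
        if value < nominal
    )

def marginal_means(table, factor):
    """Mean coverage at each level (1, 2, 3) of factor 'i', 'j' or 'k'."""
    groups = {1: [], 2: [], 3: []}
    for (j, k), row in table.items():
        for i, value in enumerate(row, start=1):
            level = {"i": i, "j": j, "k": k}[factor]
            groups[level].append(value)
    return {level: mean(values) for level, values in groups.items()}

inla_low = below_nominal(INLA)
mcmc_low_j3 = [s for s in below_nominal(MCMC) if s[1] == 3]
inla_by_j = marginal_means(INLA, "j")
```

Calling `marginal_means(INLA, "k")` and `marginal_means(INLA, "i")` in the same way gives the marginal trends for prevalence and for the spatial variance.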
Concluding Remarks
For situations with a large number of small areas, the following points should be taken into account when
choosing a technique to estimate risk maps.
• INLA and MCMC techniques estimate similar smoothed risk maps.
• INLA standard errors are larger.
• For scenarios with higher heterogeneity both techniques produce wider intervals.