Bayesian and Penalised Regression Methods for Epidemiological Analysis
Lab 1. Information-weighted averaging
Consider the association between chocolate consumption and risk of stroke in a
prospective cohort of middle-aged and elderly men (Larsson et al. Neurology, 2012).
From a Cox proportional-hazards model, the adjusted rate ratio for stroke comparing
the highest quartile of chocolate consumption (median 62.9 g/week) with the lowest
quartile (median 0 g/week) was 0.83 (95% CL 0.70, 0.99). Assuming there is upward
skewing of the chocolate-consumption distribution, the means must be higher than
the medians, so we assume that the RRs at issue are for roughly 70 g (2.5 oz) per
week (units are especially important to keep in mind when considering priors).
Because stroke incidence was under 10% over the study period, we will ignore
distinctions among risk, rate, and odds ratios.
Question 1
Prior with null centre
Although a few studies had reported an inverse association between chocolate
consumption and risk of stroke, strong associations seemed unlikely.
We start by modeling this a priori idea by placing 2:1 odds on a RR between ½ and 2,
and 0.95 probability on RR between 1/4 and 4, assuming a normal distribution for
our prior.
The implied prior distribution for the natural log of the rate ratio, ln(RR), is a
normal distribution that satisfies:
exp(prior mean – 1.96×prior standard deviation) = ¼
exp(prior mean + 1.96×prior standard deviation) = 4
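The two equations above can be solved directly for the prior mean and standard deviation on the ln(RR) scale. The following is a minimal Python sketch, not part of the original lab; the function names `normal_prior_from_limits` and `normal_cdf` are hypothetical helpers introduced for illustration. It also checks how close the resulting prior comes to the stated 2:1 odds on an RR between ½ and 2.

```python
import math

def normal_prior_from_limits(lower, upper, z=1.96):
    """Solve exp(mean - z*sd) = lower and exp(mean + z*sd) = upper (hypothetical helper)."""
    mean = (math.log(upper) + math.log(lower)) / 2
    sd = (math.log(upper) - math.log(lower)) / (2 * z)
    return mean, sd

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# 95% prior limits for RR of 1/4 and 4, as stated in the text
prior_mean, prior_sd = normal_prior_from_limits(0.25, 4.0)
prior_var = prior_sd ** 2

# Prior probability that RR lies between 1/2 and 2
p_half_to_two = (normal_cdf((math.log(2.0) - prior_mean) / prior_sd)
                 - normal_cdf((math.log(0.5) - prior_mean) / prior_sd))

print(f"prior mean = {prior_mean:.3f}, prior variance = {prior_var:.3f}")
print(f"P(1/2 < RR < 2) = {p_half_to_two:.3f}")
```

Under these limits the prior mean is 0 and the prior variance is close to ½; the probability of an RR between ½ and 2 comes out near 2/3, which corresponds to roughly 2:1 odds.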
a) What is the prior mean and prior variance of ln(RR)?
b) What are the estimated ln(RR) and estimated variance from the observed (actual)
data?
c) What is the approximate posterior median and 95% posterior limits for RR (the
50th, 2.5th and 97.5th posterior percentiles for RR) based on information-weighted
averaging?
[Note: assuming normality for both the prior and the estimate allows us to calculate the posterior
mean of ln(RR) as a weighted average of the prior mean and the maximum-likelihood estimate from the
data, where the weights are the inverse variances (the inverse variance measures the precision of, or
information in, an estimate). The variance of the posterior distribution for ln(RR) is then one over the
sum of the weights.]
d) Check the distribution of the prior. Is the specified null prior compatible with the
results from the analysis of the data alone?
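The information-weighted averaging described in the note to part c) can be sketched as follows. This is an illustrative Python sketch, not the lab's own solution: the function name `information_weighted_average` is hypothetical, the data estimate and variance are taken from the reported 95% limits (0.70, 0.99), and the compatibility check for part d) uses one common choice (the standardised difference between the data estimate and the prior mean), which is an assumption rather than something specified in the text.

```python
import math

def information_weighted_average(prior_mean, prior_var, est, est_var):
    """Combine a normal prior and a normal estimate using inverse-variance weights."""
    w_prior = 1 / prior_var          # information in the prior
    w_data = 1 / est_var             # information in the data
    post_var = 1 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * est)
    return post_mean, post_var

Z = 1.96

# Null-centred prior with 95% limits for RR of 1/4 and 4
prior_mean = 0.0
prior_var = (math.log(4.0) / Z) ** 2

# Data: adjusted RR with 95% limits 0.70 and 0.99
est = (math.log(0.99) + math.log(0.70)) / 2
est_var = ((math.log(0.99) - math.log(0.70)) / (2 * Z)) ** 2

post_mean, post_var = information_weighted_average(prior_mean, prior_var, est, est_var)
post_sd = math.sqrt(post_var)

rr_median = math.exp(post_mean)
rr_lower = math.exp(post_mean - Z * post_sd)
rr_upper = math.exp(post_mean + Z * post_sd)
print(f"posterior RR = {rr_median:.2f} ({rr_lower:.2f}, {rr_upper:.2f})")

# Assumed compatibility check: standardised difference between data and prior
z_check = (est - prior_mean) / math.sqrt(est_var + prior_var)
p_check = 2 * (1 - 0.5 * (1 + math.erf(abs(z_check) / math.sqrt(2))))
print(f"prior-data compatibility: z = {z_check:.2f}, p = {p_check:.2f}")
```

Because the prior variance (about 0.5) is far larger than the data variance (about 0.008), the data receive nearly all of the weight and the posterior stays close to the data-only result of about 0.83 (0.70, 0.99).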
Note: ln(RR) and var(ln(RR)) can be derived from the upper and lower
confidence limits using the following formulae:

$$\ln(RR) = \frac{\ln(RR_{upper}) + \ln(RR_{lower})}{2}$$

$$\operatorname{var}(\ln(RR)) = \left[\frac{\ln(RR_{upper}) - \ln(RR_{lower})}{2 \times 1.96}\right]^{2}$$
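As a quick check of the two formulae above, the sketch below applies them to the reported 95% limits (0.70, 0.99). It is a minimal Python illustration; the function name `log_rr_from_limits` is a hypothetical helper, not something defined in the lab.

```python
import math

def log_rr_from_limits(rr_lower, rr_upper, z=1.96):
    """Return (ln(RR), var(ln(RR))) recovered from a confidence interval for RR."""
    log_rr = (math.log(rr_upper) + math.log(rr_lower)) / 2
    var_log_rr = ((math.log(rr_upper) - math.log(rr_lower)) / (2 * z)) ** 2
    return log_rr, var_log_rr

# Reported 95% limits for the adjusted rate ratio
log_rr, var_log_rr = log_rr_from_limits(0.70, 0.99)
print(f"ln(RR) = {log_rr:.4f}, var(ln(RR)) = {var_log_rr:.5f}")
print(f"back-transformed RR = {math.exp(log_rr):.3f}")
```

The back-transformed point estimate agrees with the reported RR of 0.83 to two decimal places; small differences arise because the published limits are rounded.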
Question 2
Prior with non-null centre
Four cohort studies had reported an inverse association between chocolate
consumption and risk of stroke. Previous findings are pooled with a meta-analysis.
Study                |     RR     [95% Conf. Interval]    % Weight
---------------------+---------------------------------------------
Mink PJ 2007         |   0.850     0.700     1.030         46.99
Janszky I 2009       |   0.620     0.330     1.160          4.44
Buijsse B 2010       |   0.520     0.300     0.890          5.93
Larsson SC 2011      |   0.800     0.660     0.990         42.64
---------------------+---------------------------------------------
Fixed-effect         |
  Pooled RR          |   0.793     0.695     0.906        100.00
Random-effects       |
  Pooled RR†         |   0.786     0.677     0.913        100.00
---------------------+---------------------------------------------
† DerSimonian & Laird method
Heterogeneity chi-squared = 3.41 (d.f. = 3), p = 0.333
I-squared (percent of variance in ln(RR) attributable to heterogeneity) = 11.9%
Estimate of between-study variance, tau-squared = 0.0031
The random-effects pooled relative risk of stroke for approximately 70 g per week
of chocolate consumption was 0.786 (95% CL 0.677, 0.913). Ordinarily, we should
derive our prior from the random-effects results because they refer to the
distribution of RRs across studies, and a good prior will allow for any potential RR
variation across studies (the fixed-effect model simply assumes the RRs are the same
across studies). However, the random-effects results are for the average ln(RR)
across studies; they are not a prediction (prior) for a new study. To get a prediction
for the current study, we will add the estimated variance of ln(RR) across studies,
tau-squared (τ² = 0.0031), to the variance we calculate from the random-effects
interval, to get our prior variance for the current study.
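The variance expansion described above can be sketched in Python as follows. This is an illustrative sketch only: it takes the prior mean as the midpoint of the log random-effects limits (which matches the pooled RR of 0.786 to rounding) and the variable names are assumptions introduced here.

```python
import math

Z = 1.96

# Random-effects pooled 95% limits and between-study variance from the table
re_lower, re_upper = 0.677, 0.913
tau_squared = 0.0031

# Mean and variance of the average ln(RR) across studies
prior_mean = (math.log(re_upper) + math.log(re_lower)) / 2
var_average = ((math.log(re_upper) - math.log(re_lower)) / (2 * Z)) ** 2

# Expand the variance so the prior predicts ln(RR) in a new study
prior_var = var_average + tau_squared
prior_sd = math.sqrt(prior_var)

ln_lower = prior_mean - Z * prior_sd
ln_upper = prior_mean + Z * prior_sd
print(f"prior mean ln(RR) = {prior_mean:.4f}, prior variance = {prior_var:.5f}")
print(f"95% prior limits for ln(RR): ({ln_lower:.3f}, {ln_upper:.3f})")
print(f"95% prior limits for RR: ({math.exp(ln_lower):.3f}, {math.exp(ln_upper):.3f})")
```

Adding tau-squared widens the prior interval relative to the random-effects confidence interval, reflecting that a single new study can differ from the across-study average.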
a) What are the prior mean, prior variance, and 95% prior limits for ln(RR) using the
random-effects results (variance expanded to allow for prediction to a new study)?
b) What are the approximate posterior median and 95% posterior limits for RR?
Question 3
Reverse-Bayes analysis
a) What normal prior would make the 95% posterior interval include RR=1?
b) What is the hypothetical RCT result corresponding to such a prior if the incidence
rate of stroke in the study population is about 5 per 1000 person-years?
[Hint: Work out the number of stroke cases and person-time for an RCT with a 1:1 allocation of
participants to high and low intake of chocolate. Var(ln(RR)) = 2/A, where A is the number of events in
each exposure group (A = A1 = A0, since the prior for ln(RR) is symmetric about the null).]
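One way to approach the reverse-Bayes calculation is sketched below in Python. It is a sketch under stated assumptions, not the lab's official solution: the prior is assumed to be normal and centred on the null (prior mean of ln(RR) = 0), the data estimate and variance are recovered from the reported limits (0.70, 0.99), and the variable names are introduced here for illustration. With a null-centred prior, the posterior mean is w_data × est / (w_data + w_prior) and the posterior standard deviation is 1 / sqrt(w_data + w_prior); setting the upper 95% posterior limit to ln(1) = 0 and solving gives the threshold prior weight.

```python
import math

Z = 1.96

# Data estimate and variance on the ln(RR) scale, from the limits 0.70 and 0.99
est = (math.log(0.99) + math.log(0.70)) / 2
est_var = ((math.log(0.99) - math.log(0.70)) / (2 * Z)) ** 2
w_data = 1 / est_var

# Assumption: null-centred normal prior. Solve |posterior mean| = Z * posterior sd:
#   w_data * |est| / w_total = Z / sqrt(w_total)  =>  w_total = (w_data * est / Z)^2
w_total = (w_data * est / Z) ** 2
w_prior = w_total - w_data      # positive only if the data interval excludes RR = 1
prior_var = 1 / w_prior
prior_sd = math.sqrt(prior_var)

prior_rr_lower = math.exp(-Z * prior_sd)
prior_rr_upper = math.exp(Z * prior_sd)
print(f"threshold prior variance = {prior_var:.4f}")
print(f"95% prior limits for RR: ({prior_rr_lower:.2f}, {prior_rr_upper:.2f})")

# Hypothetical RCT equivalent: Var(ln(RR)) = 2/A with A events per arm
events_per_arm = 2 / prior_var
rate_per_person_year = 5 / 1000
person_years_per_arm = events_per_arm / rate_per_person_year
print(f"events per arm = {events_per_arm:.1f}, "
      f"person-years per arm = {person_years_per_arm:.0f}")

# Check: the upper 95% posterior limit for RR sits at 1 under this prior
post_mean = w_data * est / w_total
post_upper_rr = math.exp(post_mean + Z / math.sqrt(w_total))
```

Any null-centred prior with a smaller variance than this threshold (that is, more prior information) would pull the 95% posterior interval across RR = 1; a more diffuse prior would leave it excluding 1.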
Greenland S., Orsini N., Sullivan S., Simpson J.A. Melbourne July 24-25, 2014