The regression discontinuity design in epidemiology S.Geneletti , G.Baio

The regression discontinuity design in
epidemiology
S. Geneletti (1), G. Baio (2) and A. P. Dawid (3)
(1) London School of Economics and Political Science
(2) University College London
(3) University of Cambridge
30/11/2010
Outline
- What is the RD design?
- Causal inference
- RD design applied to statins
- THIN data
- Results
- Further work
What is the RD design?
- The regression discontinuity (RD) design was first introduced in the educational econometrics literature in the 1960s [5]
- Recently other econometricians have become interested in its formal causal aspects [3, 6]
- The original idea was to exploit policy thresholds to estimate the causal effect of an educational intervention
What is the RD design?
Example
- We want to know what the effect of going to college is on income
- Comparing the income of individuals who attend college and those who do not will not tell us the effect of college attendance alone
- Confounders such as social class, ability, motivation etc. will make this difficult
- Classic problem of observational studies
What is the RD design?
Example cont’d
- Often college scholarships are given on the basis of grades obtained in final school examinations
- For example: if all exam grades are above 75% the student gets a scholarship
- If one student gets 74% and another 76%
- Can we really consider them as coming from different populations, especially if in other respects (e.g. family income etc.) they are the same?
- And given that there is natural variability in exam performance even for the same individual?
What is the RD design?
Public health Example
- Many medicines are prescribed according to a particular guideline
  - Antiretroviral HIV drugs are prescribed when a patient’s CD4 count is less than 200 cells/mm3
  - Blood pressure medication is prescribed when a patient’s BP is 140/90 mmHg or above
  - Statins are prescribed when e.g. the 10 year Framingham risk score is over 20%
What is the RD design?
Public Health Example cont’d
- Consider the HIV patients.
- If one patient has a CD4 count of 195 and another of 205 cells/mm3
- Theoretically, one patient gets the drugs while the other doesn’t
- If the two are the same in every other relevant respect
- Can we really consider them as coming from different populations?
- Given that there is a natural variability in CD4 counts and in the instruments used to measure them?
RD design and confounding
Sharp Design
- The idea of the RD design is that the threshold behaves like a randomising device
- If we imagine that the thresholds are adhered to very strictly
  - this is termed a sharp design
- Then we can think of the RD design as removing the confounding due to unobserved factors
  - For education these could be e.g. academic history, talent, motivation
  - For HIV these could also be unobserved health/personal characteristics
RD design and confounding
Fuzzy Design
- In public health contexts the sharp threshold is unlikely to be adhered to
- Often GPs override guidelines, generally because they feel patients will benefit from medication even when they do not fit the guidelines
- Often patients do not take the prescribed drugs as recommended
- There are statistical methods that cater for these cases
  - this is termed a fuzzy design
RD design and compliance
- For RD applied to the GP prescription context there are two layers of compliance
  1. Compliance of the GP with prescription guidelines [i.e. only give patients with CD4 count below 200 cells/mm3 the antiretroviral drug]
  2. Compliance of the patient with the prescription [i.e. take the antiretroviral drug twice a day every day]
- The RD design is related to compliance of the first type
- The RD’s relation to compliance means it is also related to intention-to-treat experiments
RD design and compliance
- RD with sharp threshold = randomised trial with perfect compliance
- RD with fuzzy threshold = randomised trial with partial compliance
- Mathematically the LHS and RHS of both equations are identical
- So in the fuzzy design we don’t estimate an average causal effect but rather a complier causal effect
- The compliers are those who “respect” the threshold
- For the GP prescription it is those to whom the GP prescribes the drug in accordance with the guidelines
- Whether the patients take the drugs as recommended needs to be dealt with separately
Causality in Statistics
Motivation
- Causation = intervention
- However we cannot always intervene and randomise
- The trick is to understand what mechanisms behave in the same way under intervention and under observation
- These mechanisms are then causal
Decision theoretic (DT) set-up
- F is the intervention variable, X denotes other variables
- $p(T = t \mid F = t, X) = 1$ means set $T = t$, e.g. by randomisation in a trial
- $p(T \mid F = \emptyset, X) = p(T \mid X)$: T arises “naturally” in the observational regime
- We estimate effects as predictive expectations (or other functions), i.e. we answer: which treatment would benefit a new unit exchangeable with those we have observed?
Simple problem first
- Consider the
  $$ATE = E(Y \mid F = 1, T = 1) - E(Y \mid F = 0, T = 0)$$
- where we leave out X for simplicity
- This is not necessarily the same as the “naive” treatment effect
  $$NTE = E(Y \mid F = \emptyset, T = 1) - E(Y \mid F = \emptyset, T = 0)$$
- unless Y does not depend on how the treatment was administered
- i.e. $F \perp\!\!\!\perp Y \mid T$
Simple problem cont
[Figure: DAG over the nodes U, F, T and Y]
1. $Y \perp\!\!\!\perp F \mid T$ means only the value of treatment matters for Y
2. However that does not tend to hold...
3. Usually there is a confounder U such that $U \perp\!\!\!\perp F$ and $Y \perp\!\!\!\perp F \mid (U, T)$
4. If U is unobserved and there is no randomisation then $ATE \neq NTE$
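The gap between the two contrasts can be illustrated with a small simulation. This is only a sketch under invented assumptions: a binary confounder U, a hypothetical true effect of -1, and made-up uptake probabilities of 0.8 and 0.2. None of these numbers come from the slides. With these choices the interventional contrast is -1 in expectation, while the naive observational contrast is -1 + 2(0.8 - 0.2) = 0.2.

```python
import random

def simulate_regimes(n=100_000, seed=0):
    """Return (ATE, NTE) estimated from simulated data with a confounder U."""
    rng = random.Random(seed)
    effect = -1.0                      # hypothetical causal effect of T on Y
    interventional = {0: [], 1: []}    # F = t: treatment set by randomisation
    observational = {0: [], 1: []}     # F = empty: treatment arises "naturally"

    def outcome(u, t):
        # Y depends on the confounder U and on the treatment actually received.
        return 2.0 * u + effect * t + rng.gauss(0.0, 1.0)

    for _ in range(n):
        u = 1 if rng.random() < 0.5 else 0
        t_set = rng.randrange(2)                   # assigned independently of U
        interventional[t_set].append(outcome(u, t_set))
        p_treat = 0.8 if u == 1 else 0.2           # U drives treatment uptake
        t_nat = 1 if rng.random() < p_treat else 0
        observational[t_nat].append(outcome(u, t_nat))

    def mean(values):
        return sum(values) / len(values)

    ate = mean(interventional[1]) - mean(interventional[0])
    nte = mean(observational[1]) - mean(observational[0])
    return ate, nte

ate, nte = simulate_regimes()
print(f"ATE ~ {ate:.2f}, NTE ~ {nte:.2f}")
```

In this configuration the naive contrast even has the opposite sign to the causal effect, because U raises both the chance of treatment and the outcome.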
Simple problem first
- If we look at adherence to the threshold as compliance
- We can introduce another binary variable Z, the threshold indicator:
  - If Z = 1 the individual is above the threshold
  - If Z = 0 the individual is below the threshold
- When the threshold is strict then Z = F
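As a rough illustration of the indicator Z, the sketch below computes it from an assignment variable and measures how often the treatment received agrees with it. The function names, the default threshold of 0.20, the choice to count a score exactly at the threshold as "above", and the four data points are all hypothetical.

```python
def threshold_indicator(score, threshold=0.20):
    """Z = 1 if the assignment variable is at or above the threshold, else 0."""
    return 1 if score >= threshold else 0

def adherence_rate(scores, treated, threshold=0.20):
    """Share of individuals whose treatment T agrees with the indicator Z."""
    z = [threshold_indicator(s, threshold) for s in scores]
    return sum(1 for zi, ti in zip(z, treated) if zi == ti) / len(z)

# Hypothetical scores and treatment indicators, for illustration only.
scores = [0.05, 0.18, 0.22, 0.35]
treated = [0, 1, 1, 1]
print(adherence_rate(scores, treated))   # 0.75: one individual is treated below the threshold
```

An adherence rate of exactly 1 corresponds to the strict case in which Z = F. Anything lower means the threshold is not strictly adhered to.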
RD design
[Figure: DAG over the nodes Z, U, F, T and Y]
- Z and F both have the same relationship with U, T and Y
- This means Z can be used for causal inference
The RD design
Assumptions
A1 The threshold is set prior to the observed data and is not changed after observation
  - Generally plausible as the threshold is set by the powers that be, e.g. gov’t agencies, NICE etc.
A2.1 Individuals close to the threshold are exchangeable
  - We have no reason to believe that the individuals just above and below the threshold are different
  - This is violated if individuals can manipulate their value of the assignment variable to fall above or below the threshold
  - Benefit fraud: individuals might say their income is below a threshold in order to fall into a category that receives benefits
The RD design
Assumptions cont’d
Another way of expressing A2.1:
A2.1 The threshold is a randomising device
  - This means that a comparison of above and below gives us a causal effect estimate of the treatment at the threshold
  - This is because randomisation is the gold standard for causal inference, as it controls for confounding
  - The question is how far above and how far below?
The RD design
Assumptions cont’d
A3 The assignment variable is continuous
  - There cannot be a threshold without a continuous variable
  - Means we don’t have to worry about choosing bands
  - We fit two separate regressions, one above and one below the threshold
  - Or assume a common slope and fit one regression; this assumes the effect is the same everywhere
The RD design
The causal effect
The continuous case: Sharp threshold
- Let Y be the outcome, W the assignment variable and T the treatment indicator
- If the regressions are given by
  $$E(Y)_s = \alpha_s + \beta_s W$$
  where:
  - w is the value of W at the threshold;
  - s = b ⇒ W < w (below)
  - s = a ⇒ W ≥ w (above)
- An estimate of the causal effect of the treatment, when treatment is given above the threshold, is
  $$ACE = E(Y \mid T = 1) - E(Y \mid T = 0) = \alpha_a - \alpha_b + (\beta_a - \beta_b) w$$
- There are more sophisticated estimates [3, 6]
The causal effect
The continuous case: Fuzzy threshold
- Often there is not strict adherence to the threshold
- Use the relationship between RD design and compliance to estimate the effect in this situation
- If Z = 1 when the individual is above the threshold and Z = 0 below, then the RD fuzzy estimate is the same as the partial compliance estimate
- The local average treatment effect (LATE), i.e. the complier effect [? ]
- Can be equated to the fuzzy average causal effect (FACE)
The causal effect
The continuous case: Fuzzy threshold
- The formula for the fuzzy estimator is
  $$FACE = \frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{E(T \mid Z = 1) - E(T \mid Z = 0)}$$
- One estimate is:
  $$\frac{\alpha_a - \alpha_b + (\beta_a - \beta_b) w}{\hat{p}_{1|1} - \hat{p}_{1|0}}$$
- where $\hat{p}_{t|z}$ is an estimate of $p(T = t \mid Z = z)$
- This is partly based on the compliance literature [1]
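The two estimators can be sketched in plain Python as follows. This is a minimal illustration, not the analysis code behind the slides. It assumes treatment is assigned at or above the threshold, fits the two regressions by closed-form least squares without extra covariates, and estimates the denominator by raw proportions. The function names and the simulated data (a hypothetical effect of -1 with uptake of 0.75 above and 0.25 below the threshold) are invented for the example.

```python
import random

def ols_line(w, y):
    """Simple least-squares fit of y on w; returns (intercept, slope)."""
    n = len(w)
    mw, my = sum(w) / n, sum(y) / n
    sxx = sum((wi - mw) ** 2 for wi in w)
    sxy = sum((wi - mw) * (yi - my) for wi, yi in zip(w, y))
    slope = sxy / sxx
    return my - slope * mw, slope

def rd_estimates(w, t, y, threshold):
    """Return (ACE, FACE) from separate regressions above and below the threshold."""
    above = [i for i, wi in enumerate(w) if wi >= threshold]
    below = [i for i, wi in enumerate(w) if wi < threshold]
    alpha_a, beta_a = ols_line([w[i] for i in above], [y[i] for i in above])
    alpha_b, beta_b = ols_line([w[i] for i in below], [y[i] for i in below])
    # Jump in the fitted regression lines, evaluated at the threshold.
    ace = (alpha_a - alpha_b) + (beta_a - beta_b) * threshold
    # Raw proportions treated on each side: estimates of p(T=1|Z=1) and p(T=1|Z=0).
    p11 = sum(t[i] for i in above) / len(above)
    p10 = sum(t[i] for i in below) / len(below)
    return ace, ace / (p11 - p10)

# Hypothetical fuzzy design: treatment effect -1, imperfect adherence.
rng = random.Random(0)
w = [rng.uniform(0.0, 0.4) for _ in range(20_000)]
t = [1 if rng.random() < (0.75 if wi >= 0.2 else 0.25) else 0 for wi in w]
y = [4.0 + 2.0 * wi - 1.0 * ti + rng.gauss(0.0, 0.5) for wi, ti in zip(w, t)]
ace, face = rd_estimates(w, t, y, threshold=0.2)
print(f"ACE = {ace:.2f}, FACE = {face:.2f}")
```

In this simulated setting the jump at the threshold is diluted to about half of the true effect, and dividing by the difference in treatment proportions rescales it. If treatment were assigned below the threshold instead, the sign of the jump would flip.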
The RD design for binary outcomes
- Many outcomes in public health are binary (death, CVD event)
- The RD design can be used for binary outcomes by using logistic regressions
- And then looking at treatment risk ratios (RR)
- We don’t want to use odds ratios because we don’t necessarily have rare outcomes
- Also, we want to be able to evaluate the RR at the threshold
The RD design for binary outcomes
The causal risk ratio
The binary case: sharp threshold
- If we fit two separate logistic regressions
  $$\mathrm{logit}(p)_s = \alpha_s + \beta_s W,$$
  where $s \in \{a, b\}$ for above and below,
- then the causal risk ratio at the threshold w is given by
  $$RR = \frac{1 + \exp(-\{\alpha_b + \beta_b w\})}{1 + \exp(-\{\alpha_a + \beta_a w\})}$$
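The risk ratio above is the fitted risk from the above-threshold regression divided by the fitted risk from the below-threshold regression, both evaluated at the threshold. A short sketch makes the equivalence explicit. The coefficient values are hypothetical and the function names are not from the slides.

```python
import math

def risk_at_threshold(alpha, beta, w):
    """Inverse logit: fitted probability at assignment value w."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * w)))

def causal_risk_ratio(alpha_a, beta_a, alpha_b, beta_b, w):
    """Risk above the threshold divided by risk below, both evaluated at w."""
    return risk_at_threshold(alpha_a, beta_a, w) / risk_at_threshold(alpha_b, beta_b, w)

# Hypothetical logistic regression coefficients, for illustration only.
alpha_a, beta_a = -2.0, 1.0   # fitted above the threshold
alpha_b, beta_b = -1.5, 1.0   # fitted below the threshold
w = 0.2                       # threshold value

rr = causal_risk_ratio(alpha_a, beta_a, alpha_b, beta_b, w)
# Same quantity written in the form used on the slide.
rr_slide = (1 + math.exp(-(alpha_b + beta_b * w))) / (1 + math.exp(-(alpha_a + beta_a * w)))
print(rr, rr_slide)
```

With these made-up coefficients the fitted risk is lower above the threshold, so the ratio is below 1.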
The causal risk ratio
The binary case: fuzzy threshold
- The fuzzy design for a binary outcome was originally developed in the compliance literature by [2]
  $$FRR = 1 - \frac{p(Y \mid Z = 1) - p(Y \mid Z = 0)}{p(Y \mid T = 1, Z = 1)\, p(T \mid Z = 1) - p(Y \mid T = 1, Z = 0)\, p(T \mid Z = 0)}$$
- The different parts are estimated using logistic regressions evaluated at the threshold
- The FRR
  - = RR when the design is sharp
  - is further from the RR the more fuzzy the design
- This can also be derived along the same lines as the LATE but is much harder work!
The trouble with statins
- Statins are a class of drugs used to lower cholesterol and prescribed to prevent heart disease
- They are amongst the most prescribed drugs in the UK
- Some even suggest handing them out with fast food!
The trouble with statins
- Trials [7] show an average reduction of LDL cholesterol of approximately 2 mmol/l
- Also, NHS guidelines are to prescribe statins to individuals without previous CVD if their 10 year CVD score exceeds 20% [4]
  - CVD scores are predicted probabilities of an event in the next 10 years and are based on age, sex, smoking status, blood pressure, cholesterol and, depending on the type of score, also diabetes, LVH etc.
- We could use the RD design with the threshold to see whether the effect of statins is the same as in the trials
The trouble with statins
- In a second instance we can also try to determine whether the prescription threshold is ideal
- By looking at CVD events and incorporating a cost-effectiveness analysis
RD design for statins
How do we measure the effects?
- We have two outcomes of interest:
  - Change in LDL cholesterol after treatment
  - Occurrence of CVD events after treatment
- The threshold variable is the 10 year Framingham CVD score
- Or another continuous variable that might be used by GPs to determine statin prescription
Example — RD design in the THIN data
- The THIN data set contains data from routine general practice prescriptions as well as information on the variables that determine these prescriptions
  - Individual characteristics (sex, date of birth, date of registration with practice, proxies of socioeconomic status)
  - Medical history (GP visits, prescriptions, exams)
- This information can be used to characterise the patients with respect to
  - Measurements of health indicators that allow estimation of the risk of experiencing cardiovascular events
  - Treatment with statins
  - Measurements of suitable outcomes (e.g. LDL level, CHD events, deaths)
Example (cont’d)
Preliminary analysis
- Data from THIN10 (a subsample of 10 practices as of February 2009)
- Already existing “code lists” to identify and manage cardiovascular events & related variables
  - Identify relevant Read codes & select records of patients with measurements for suitable variables
  - Will need to update and perhaps modify this code list
- Created new (provisional) lists to identify records of prescription for statin treatment
Example (cont’d)
- Estimated a cardiovascular risk predictor
  - Based on University of Edinburgh risk calculator (http://cvrisk.mvm.ed.ac.uk/calculator/calc.asp)
  - Combines two dimensions from Framingham risk calculator
  - NB: Framingham risk calculator would be ideal, but it is not consistently recorded in THIN
  - Requires measurements of
    - HDL and total cholesterol;
    - systolic blood pressure;
    - smoking and diabetes status and the presence of left ventricular hypertrophy;
    - age and sex
  - Problems with recording of smoking status, so will need to make this estimation more robust
Example (cont’d)
Preliminary analysis
- For the sake of simplicity we considered a simple continuous outcome
  - Measure of LDL cholesterol following the estimation of CVD risk
- To simplify the analysis, we grouped the patients according to their age at the risk prediction
  - Bins of 5 years (50-54 to 85+)
- Each patient was assigned to the treatment group if they had a prescription for statins in the year following the risk prediction
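The grouping and treatment definitions above could be coded along the following lines. This is only a sketch: the function names are hypothetical, ages below 50 are assumed to fall outside the analysis, and "the year following" is read as a 365-day window that excludes the prediction date itself.

```python
from datetime import date, timedelta

def age_bin(age):
    """Map an age in years to a 5-year bin label from 50-54 up to 85+."""
    age = int(age)
    if age < 50:
        return None                      # assumption: below the youngest bin
    if age >= 85:
        return "85+"
    lower = 50 + 5 * ((age - 50) // 5)
    return f"{lower}-{lower + 4}"

def treated_within_year(prediction_date, prescription_dates):
    """True if any statin prescription falls in the 365 days after the prediction."""
    window_end = prediction_date + timedelta(days=365)
    return any(prediction_date < d <= window_end for d in prescription_dates)

# Hypothetical patient, for illustration only.
print(age_bin(67))
print(treated_within_year(date(2008, 1, 10), [date(2008, 6, 2)]))
```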
Example — Sharp design
- Assume that the design is sharp (i.e. “perfect” treatment allocation)
- Run two regression analyses
  - Control for sex, risk and age at LDL measurement
  - Treatment effect measured as ACE
    $$ACE = E(Y \mid T = 1) - E(Y \mid T = 0)$$
Example — Sharp design
[Figure: LDL (mmol/l) against predicted risk score, treated vs. not treated, one panel per age group at prediction]
  - 50-54 (n = 1484): ACE = -0.271
  - 55-59 (n = 2016): ACE = -0.0334
  - 60-64 (n = 2188): ACE = -0.098
  - 65-69 (n = 2485): ACE = -0.554
  - 70-74 (n = 2142): ACE = 0.0552
  - 75-79 (n = 1167): ACE = 0.120
  - 80-84 (n = 613): ACE = 0.064
  - 85+ (n = 251): ACE = 3.32
Example — Sharp design
- ACE reasonably stable and negative (i.e. treatment decreases level of LDL) for age groups 50-54 up to 70-74
- Older age groups show very unstable estimates (few data points in the treatment group!)
- Overall, treatment effect is small
- More importantly, the design is not sharp!
Example — Fuzzy design
[Figure: LDL (mmol/l) against predicted risk score, treated vs. not treated, one panel per age group at prediction: 50-54 (n = 1484), 55-59 (n = 2016), 60-64 (n = 2188), 65-69 (n = 2485), 70-74 (n = 2142)]
Example — Fuzzy design
- Under these circumstances, we cannot use ACE to estimate the causal effect, but need to build FACE
- For this preliminary analysis, we estimate the denominator using the observed raw proportions
- There are a few possible ways of computing the estimand
  - By threshold only
  - By treatment only
  - By treatment & threshold
Example (cont’d)
[Figure: LDL (mmol/l) against predicted risk score, treated vs. not treated, one panel per age group at prediction]
  - 50-54 (n = 1484): FACE = -0.326
  - 55-59 (n = 2016): FACE = -0.509
  - 60-64 (n = 2188): FACE = -0.916
  - 65-69 (n = 2485): FACE = -5.53
Some results
Age group   50-54     55-59     60-64     65-69     70-74
ACE         -0.2709   -0.0334   -0.0980   -0.5535    0.0550
FACE        -2.0254   -0.2734   -0.7816   -6.9494    0.5801
FACE*       -0.3255   -0.5085   -0.9161   -5.5263    4.2267

- ACE: two regressions on data defined by threshold and compliance
- $FACE = ACE / (\hat{p}_{1|1} - \hat{p}_{1|0})$
- ACE*: two regressions on data defined by thresholds but with treatment as predictor
- $FACE^* = ACE^* / (\hat{p}_{1|1} - \hat{p}_{1|0})$
Some results
- Estimates of FACE are very unstable
- Need to come up with more robust estimates of the denominator
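One way to see where the instability comes from is to back out the denominators implied by the results table. The sketch below assumes the FACE row was obtained exactly as ACE divided by the difference in treatment proportions, as the table notes state. The implied values are therefore a reconstruction, not figures reported in the slides.

```python
# ACE and FACE rows copied from the results table.
age_groups = ["50-54", "55-59", "60-64", "65-69", "70-74"]
ace = [-0.2709, -0.0334, -0.0980, -0.5535, 0.0550]
face = [-2.0254, -0.2734, -0.7816, -6.9494, 0.5801]

# If FACE = ACE / (p11 - p10), then the implied denominator is ACE / FACE.
implied = {g: a / f for g, a, f in zip(age_groups, ace, face)}

for group, denom in implied.items():
    print(f"{group}: implied denominator = {denom:.3f}")

# Sensitivity check: shift the 65-69 denominator by a hypothetical 0.02.
shifted = ace[3] / (implied["65-69"] - 0.02)
print(f"65-69 FACE with denominator reduced by 0.02: {shifted:.2f}")
```

Under that assumption every implied denominator is small (well below 0.2), so modest changes in the estimated proportions translate into large swings in FACE.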
Example — Comments
- The results are only indicative of the underlying causal mechanism, due to a series of factors
  - Data need to be made more robust (include more practices & more precise information on crucial predictors, such as smoking status)
  - Account properly for the two layers of “non compliance”
    - GPs prescribing below threshold (or not prescribing above)
    - Individual compliance (patients prescribed statins who do not take them continuously)
- There seems to be an effect of treatment, especially in some age groups, but more analyses are required
  - Careful stratification by sex
  - Control for more health conditions
Where to next?
- Clean up data more and apply to whole THIN dataset
- Find more stable/robust estimates of the denominator of the FACE
- Incorporate cost-effectiveness analysis
- Apply RD design to other drugs/screening
References
[1] A. P. Dawid. Causal inference using influence diagrams: The problem of partial compliance (with Discussion). In P. J. Green, N. L. Hjort, and S. Richardson, editors, Highly Structured Stochastic Systems, pages 45–81. Oxford University Press, 2003.
[2] M. A. Hernan and J. M. Robins. Instruments for causal inference - An epidemiologist’s dream? Epidemiology, 17(4):360–372, 2006.
[3] G. W. Imbens and T. Lemieux. Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142(2):615–635, 2008. The regression discontinuity design: Theory and applications.
[4] NICE. Quick reference guide: Statins for the prevention of cardiovascular events, 2008.
[5] D. L. Thistlethwaite and D. T. Campbell. Regression-discontinuity analysis - An alternative to the ex-post-facto experiment. Journal of Educational Psychology, 51(6):309–317, 1960.
[6] G. van der Klaauw. Regression-discontinuity analysis: A survey of recent developments in economics. Labour, 22(2):219–245, 2008.
[7] S. Ward, L. Jones, A. Pandor, M. Holmes, R. Ara, A. Ryan, W. Yeo, and N. Payne. A systematic review and economic evaluation of statins for the prevention of coronary events. Health Technology Assessment, 11(14), 2007.
Deriving the LATE
- Pretend we’re looking at a randomised trial with partial compliance
- Introduce three variables
  - Z, the randomised treatment, not necessarily complied with
  - U, the unobserved confounders
  - $C_Z$, the preferred treatment under Z
Deriving the LATE
[Figure: DAG over the nodes U, Z, T and Y, followed by the same DAG with U replaced by $C_Z$]
- If the DAG above describes the situation
- Then we can replace U with $C_Z$
Deriving the LATE
- The $C_Z$’s look a bit like counterfactuals
- But they aren’t, as they represent preferences that you can elicit prior to any treatment being assigned
- So they are random variables
- We assume that $T = C_Z$,
  - i.e. the treatment actually taken is the preferred treatment
- We also assume monotonicity
  - Individuals do not want to do the opposite of what they are recommended
  - $p(C_0 = 1, C_1 = 0) = 0$
Deriving the LATE
- By using this set-up it is possible to derive an estimate of the LATE
  - based on only the Z’s and the T’s
  - rather than the $C_Z$’s, which we cannot directly observe
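The set-up can be checked numerically with a small simulation. This is a sketch under invented assumptions, not part of the original derivation. It uses three preference types satisfying monotonicity (no unit has $C_0 = 1, C_1 = 0$), with hypothetical shares, baselines and type-specific effects, and sets $T = C_Z$. The ratio built only from the Z’s and T’s should then recover the effect among those with $C_0 = 0, C_1 = 1$.

```python
import random

def simulate_late(n=100_000, seed=1):
    """Return the ratio (E[Y|Z=1]-E[Y|Z=0]) / (E[T|Z=1]-E[T|Z=0]) on simulated data."""
    rng = random.Random(seed)
    totals = {0: [0.0, 0, 0], 1: [0.0, 0, 0]}   # per Z: sum of Y, sum of T, count
    for _ in range(n):
        r = rng.random()
        if r < 0.3:
            prefs, base, effect = (0, 0), 5.0, 0.0    # never takes treatment
        elif r < 0.8:
            prefs, base, effect = (0, 1), 4.0, -1.0   # follows the assignment
        else:
            prefs, base, effect = (1, 1), 6.0, -3.0   # always takes treatment
        z = rng.randrange(2)            # randomised assignment
        t = prefs[z]                    # T = C_Z: treatment taken is the preferred one
        y = base + effect * t + rng.gauss(0.0, 1.0)
        totals[z][0] += y
        totals[z][1] += t
        totals[z][2] += 1
    mean_y = {z: s[0] / s[2] for z, s in totals.items()}
    mean_t = {z: s[1] / s[2] for z, s in totals.items()}
    return (mean_y[1] - mean_y[0]) / (mean_t[1] - mean_t[0])

late = simulate_late()
print(f"ratio estimate ~ {late:.2f}")
```

In this example the ratio is close to the hypothetical effect of -1 assigned to the type that follows the assignment, even though the always-treated type has a different effect of -3. That type takes the treatment whatever the value of Z, so it cancels from the numerator.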