The regression discontinuity design in epidemiology

S. Geneletti (1), G. Baio (2) and A. P. Dawid (3)
(1) London School of Economics and Political Science, (2) University College London, (3) University of Cambridge
30/11/2010

Outline
- What is the RD design?
- Causal inference
- RD design applied to statins
- THIN data
- Results
- Further work

What is the RD design?
- The regression discontinuity (RD) design was first introduced in the educational econometrics literature in the 1960s [5]
- More recently, other econometricians have become interested in its formal causal aspects [3, 6]
- The original idea was to exploit policy thresholds to estimate the causal effect of an educational intervention

What is the RD design? Example
- We want to know the effect of going to college on income
- Comparing the income of individuals who attend college with that of those who do not will not tell us the effect of college attendance alone
- Confounders such as social class, ability and motivation make this difficult
- This is the classic problem of observational studies

What is the RD design? Example cont'd
- College scholarships are often awarded on the basis of grades obtained in final school examinations
- For example: if all exam grades are above 75%, the student gets a scholarship
- Suppose one student gets 74% and another 76%
- Can we really consider them as coming from different populations, especially if they are the same in other respects (e.g. family income)?
- And given that there is natural variability in exam performance, even for the same individual?

What is the RD design? Public health example
- Many medicines are prescribed according to a particular guideline
  - Antiretroviral HIV drugs are prescribed when a patient's CD4 count is less than 200 cells/mm3
  - Blood pressure medication is prescribed when a patient's BP is 140/90 mmHg or above
  - Statins are prescribed when e.g. the 10-year Framingham risk score is over 20%

What is the RD design? Public health example cont'd
- Consider the HIV patients.
- Suppose one patient has a CD4 count of 195 cells/mm3 and another of 205 cells/mm3
- In theory, one patient gets the drugs while the other does not
- If the two are the same in every other relevant respect
- Can we really consider them as coming from different populations?
- Given that there is natural variability in CD4 counts and in the instruments used to measure them?

RD design and confounding: sharp design
- The idea of the RD design is that the threshold behaves like a randomising device
- Imagine that the thresholds are adhered to very strictly
  - This is termed a sharp design
- Then we can think of the RD design as removing the confounding due to unobserved factors
- For education these could be e.g. academic history, talent, motivation
- For HIV these could be unobserved health or personal characteristics

RD design and confounding: fuzzy design
- In public health contexts the sharp threshold is unlikely to be adhered to
- GPs often override guidelines, generally because they feel patients will benefit from medication even when they do not fit the guidelines
- Patients often do not take the prescribed drugs as recommended
- There are statistical methods that cater for these cases
  - This is termed a fuzzy design

RD design and compliance
- When the RD design is applied to GP prescriptions there are two layers of compliance
  1. Compliance of the GP with prescription guidelines [i.e. only give the antiretroviral drug to patients with a CD4 count below 200 cells/mm3]
  2. Compliance of the patient with the prescription [i.e. take the antiretroviral drug twice a day every day]
- The RD design is related to compliance of the first type
- Because of this relation to compliance, the RD design is also related to intention-to-treat experiments

RD design and compliance
- RD with sharp threshold = randomised trial with perfect compliance
- RD with fuzzy threshold = randomised trial with partial compliance
- Mathematically, the left- and right-hand sides of both equations are identical
- So in the fuzzy design we do not estimate an average causal effect but rather a complier causal effect
- The compliers are those who "respect" the threshold
- For GP prescription, they are the patients to whom the GP prescribes the drug in accordance with the guidelines
- Whether the patients take the drugs as recommended needs to be dealt with separately

Causality in statistics: motivation
- Causation = intervention
- However, we cannot always intervene and randomise
- The trick is to understand which mechanisms behave in the same way under intervention and under observation
- These mechanisms are then causal

Decision theoretic (DT) set-up
- $F$ is the intervention variable, $X$ the other variables
- $p(T = t \mid F = t, X) = 1$ means we set $T = t$, e.g. by randomisation in a trial
- $p(T \mid F = \emptyset, X) = p(T \mid X)$: $T$ arises "naturally" in the observational regime
- We estimate effects as predictive expectations (or other functions), i.e. we answer the question: which treatment would benefit a new unit exchangeable with those we have observed?
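The difference between the interventional regimes ($F = t$) and the observational regime ($F = \emptyset$) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the binary variable `u`, the treatment probabilities 0.8 and 0.2, and the outcome model with a true treatment effect of 1.0 are all hypothetical and do not come from the THIN data.

```python
import random


def mean_outcomes(regime, n=100_000, seed=0):
    """Mean of Y in each treatment arm under one regime.

    regime=None is the observational regime (F = ∅): T depends on u.
    regime=0 or 1 is an interventional regime (F = t): T is set to t.
    """
    rng = random.Random(seed)
    total = {0: 0.0, 1: 0.0}
    count = {0: 0, 1: 0}
    for _ in range(n):
        u = rng.random() < 0.5  # hypothetical unobserved variable
        if regime is None:
            # T arises "naturally": units with u = True are treated more often
            t = int(rng.random() < (0.8 if u else 0.2))
        else:
            # T is set by intervention, whatever the value of u
            t = regime
        # hypothetical outcome model: the true effect of T on Y is 1.0
        y = 1.0 * t + 2.0 * u + rng.gauss(0.0, 1.0)
        total[t] += y
        count[t] += 1
    return {arm: total[arm] / count[arm] for arm in (0, 1) if count[arm] > 0}


# Contrast between the two interventional regimes
interventional = mean_outcomes(1)[1] - mean_outcomes(0)[0]

# Contrast between treated and untreated units in the observational regime
observed = mean_outcomes(None)
observational = observed[1] - observed[0]

print(f"interventional contrast: {interventional:.3f}")
print(f"observational contrast:  {observational:.3f}")
```

Because `u` drives both the treatment and the outcome in the observational regime, the observational contrast overstates the effect that the interventional contrast recovers.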
Simple problem first
- Consider the average treatment effect $ATE = E(Y \mid F = 1, T = 1) - E(Y \mid F = 0, T = 0)$
- We leave out $X$ for simplicity
- This is not necessarily the same as the "naive" treatment effect $NTE = E(Y \mid F = \emptyset, T = 1) - E(Y \mid F = \emptyset, T = 0)$
- Unless $Y$ does not depend on how the treatment was administered
- I.e. $F \perp\!\!\!\perp Y \mid T$

Simple problem cont'd
[Diagram: DAG on the nodes U, F, T, Y]
1. $Y \perp\!\!\!\perp F \mid T$ means only the value of the treatment matters for $Y$
2. However, that does not tend to hold...
3. Usually there is a confounder $U$ such that $U \perp\!\!\!\perp F$ and $Y \perp\!\!\!\perp F \mid (U, T)$
4. If $U$ is unobserved and there is no randomisation then $ATE \neq NTE$

Simple problem first
- If we look at adherence to the threshold as compliance
- We can introduce another binary variable $Z$, the threshold indicator:
  - If $Z = 1$ the individual is above the threshold
  - If $Z = 0$ the individual is below the threshold
- When the threshold is strict then $Z = F$

RD design
[Diagram: DAG on the nodes Z, U, F, T, Y]
- $Z$ and $F$ both have the same relationship with $U$, $T$ and $Y$
- This means $Z$ can be used for causal inference

The RD design: assumptions
A1 The threshold is set prior to the observed data and is not changed after observation
  - Generally plausible, as the threshold is set by the powers that be, e.g. government agencies, NICE etc.
A2.1 Individuals close to the threshold are exchangeable
  - We have no reason to believe that the individuals just above and just below the threshold are different
  - This is violated if individuals can manipulate their value of the assignment variable to fall above or below the threshold
  - Benefit fraud: individuals might report an income below a threshold in order to fall into a category that receives benefits

The RD design: assumptions cont'd
Another way of expressing A2.1:
A2.1 The threshold is a randomising device
  - This means that a comparison of above and below gives us a causal effect estimate of the treatment, at the threshold
  - This is because randomisation is the gold standard for causal inference, as it controls for confounding
  - The question is: how far above and how far below?

The RD design: assumptions cont'd
A3 The assignment variable is continuous
  - There cannot be a threshold without a continuous variable
  - It means we do not have to worry about choosing bands
  - We fit two separate regressions, one above and one below the threshold
  - Or we assume a common slope and fit one regression; this assumes the effect is the same everywhere

The causal effect. The continuous case: sharp threshold
- Let $Y$ be the outcome, $W$ the assignment variable and $T$ the treatment indicator
- Suppose the regressions are given by $E(Y)_s = \alpha_s + \beta_s W$, where:
  - $w$ is the value of $W$ at the threshold;
  - $s = b \Rightarrow W < w$ (below)
  - $s = a \Rightarrow W \geq w$ (above)
- With treatment assigned above the threshold, an estimate of the causal effect of the treatment is
  \[ ACE = E(Y \mid T = 1) - E(Y \mid T = 0) = \alpha_a - \alpha_b + (\beta_a - \beta_b) w \]
- There are more sophisticated estimates [3, 6]

The causal effect. The continuous case: fuzzy threshold
- Often there is not strict adherence to the threshold
- We use the relationship between the RD design and compliance to estimate the effect in this situation
- If $Z = 1$ when the individual is above the threshold and $Z = 0$ below, then the fuzzy RD estimate is the same as the partial compliance estimate
- This is the local average treatment effect (LATE), or complier effect [?]
- It can be equated to the fuzzy average causal effect (FACE)

The causal effect. The continuous case: fuzzy threshold
- The formula for the fuzzy estimator is
  \[ FACE = \frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{E(T \mid Z = 1) - E(T \mid Z = 0)} \]
- One estimate is
  \[ \frac{\alpha_a - \alpha_b + (\beta_a - \beta_b) w}{\hat{p}_{1|1} - \hat{p}_{1|0}} \]
- where $\hat{p}_{t|z}$ is an estimate of $p(T = t \mid Z = z)$
- This is partly based on the compliance literature [1]

The RD design for binary outcomes
- Many outcomes in public health are binary (death, CVD event)
- The RD design can be used for binary outcomes by fitting logistic regressions
- And then looking at treatment risk ratios (RR)
- We do not want to use odds ratios because the outcomes are not necessarily rare
- Also, we want to be able to evaluate the RR at the threshold

The causal risk ratio. The binary case: sharp threshold
- If we fit two separate logistic regressions, $\mathrm{logit}(p)_s = \alpha_s + \beta_s W$, where $s \in \{a, b\}$ for above and below,
- then the causal risk ratio at the threshold $w$ is given by
  \[ RR = \frac{1 + \exp(-\{\alpha_b + \beta_b w\})}{1 + \exp(-\{\alpha_a + \beta_a w\})} \]

The causal risk ratio. The binary case: fuzzy threshold
- The fuzzy design for a binary outcome was originally developed in the compliance literature [2]
  \[ FRR = 1 - \frac{p(Y \mid Z = 1) - p(Y \mid Z = 0)}{p(Y \mid T = 1, Z = 1)\, p(T \mid Z = 1) - p(Y \mid T = 1, Z = 0)\, p(T \mid Z = 0)} \]
- The different parts are estimated using logistic regressions evaluated at the threshold
- The FRR
  - equals the RR when the design is sharp
  - is further from the RR the fuzzier the design
- This can also be derived along the same lines as the LATE, but it is much harder work!

The trouble with statins
- Statins are a class of drugs used to lower cholesterol and prescribed to prevent heart disease
- They are amongst the most prescribed drugs in the UK
- Some even suggest handing them out with fast food!
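The estimators from the slides on the causal effect and the causal risk ratio can be sketched in code. What follows is a minimal illustration and not the authors' implementation. The helper names (`fit_line`, `jump_at_threshold`, `fuzzy_estimate`, `risk_ratio_at_threshold`) and the synthetic data are assumptions made for this example. The jump is computed as "above minus below", which corresponds to treated minus untreated when treatment is given above the threshold. The denominator of the fuzzy estimate uses raw proportions of treated individuals on each side.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


def jump_at_threshold(w, y, threshold):
    """Fitted outcome just above the threshold minus fitted outcome just below."""
    below = [(wi, yi) for wi, yi in zip(w, y) if wi < threshold]
    above = [(wi, yi) for wi, yi in zip(w, y) if wi >= threshold]
    alpha_b, beta_b = fit_line(*zip(*below))
    alpha_a, beta_a = fit_line(*zip(*above))
    return (alpha_a + beta_a * threshold) - (alpha_b + beta_b * threshold)


def fuzzy_estimate(w, t, y, threshold):
    """Jump in the outcome divided by the jump in the proportion treated."""
    treated_above = [ti for wi, ti in zip(w, t) if wi >= threshold]
    treated_below = [ti for wi, ti in zip(w, t) if wi < threshold]
    p_above = sum(treated_above) / len(treated_above)  # estimate of p(T=1 | Z=1)
    p_below = sum(treated_below) / len(treated_below)  # estimate of p(T=1 | Z=0)
    return jump_at_threshold(w, y, threshold) / (p_above - p_below)


def risk_ratio_at_threshold(alpha_a, beta_a, alpha_b, beta_b, threshold):
    """Ratio of the fitted logistic risks (above over below) at the threshold."""
    numerator = 1.0 + math.exp(-(alpha_b + beta_b * threshold))
    denominator = 1.0 + math.exp(-(alpha_a + beta_a * threshold))
    return numerator / denominator


# Synthetic fuzzy design (all values hypothetical): threshold at 0.2,
# 80% treated above and 20% treated below, treatment lowers y by 1.0.
rng = random.Random(1)
threshold = 0.2
w = [rng.uniform(0.0, 0.4) for _ in range(20_000)]
t = [int(rng.random() < (0.8 if wi >= threshold else 0.2)) for wi in w]
y = [4.0 + wi - 1.0 * ti + rng.gauss(0.0, 0.3) for wi, ti in zip(w, t)]

sharp_style_jump = jump_at_threshold(w, y, threshold)
fuzzy = fuzzy_estimate(w, t, y, threshold)
print(f"jump by threshold side: {sharp_style_jump:.3f}")
print(f"fuzzy estimate:         {fuzzy:.3f}")
```

In this construction the jump by threshold side is diluted by non-compliance, while dividing by the difference in treated proportions rescales it towards the effect of 1.0 built into the simulation. A denominator close to zero makes the ratio unstable.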
The trouble with statins
- Trials [7] show an average reduction in LDL cholesterol of approximately 2 mmol/l
- Also, NHS guidelines are to prescribe statins to individuals without previous CVD if their 10-year CVD score exceeds 20% [4]
  - CVD scores are predicted probabilities of an event in the next 10 years and are based on age, sex, smoking status, blood pressure, cholesterol and, depending on the type of score, also diabetes, LVH etc.
- We could use the RD design with this threshold to see whether the effect of statins is the same as in the trials

The trouble with statins
- As a second step, we can also try to determine whether the prescription threshold is ideal
- By looking at CVD events and incorporating a cost-effectiveness analysis

RD design for statins: how do we measure the effects?
- We have two outcomes of interest:
  - Change in LDL cholesterol after treatment
  - Occurrence of CVD events after treatment
- The threshold variable is the 10-year Framingham CVD score
- Or another continuous variable that GPs might use to determine statin prescription

Example: RD design in the THIN data
- The THIN data set contains data from routine general practice prescriptions as well as information on the variables that determine these prescriptions
  - Individual characteristics (sex, date of birth, date of registration with practice, proxies of socioeconomic status)
  - Medical history (GP visits, prescriptions, exams)
- This information can be used to characterise the patients with respect to
  - Measurements of health indicators that allow us to estimate the risk of experiencing cardiovascular events
  - Treatment with statins
  - Measurements of suitable outcomes (e.g. LDL level, CHD events, deaths)

Example (cont'd): preliminary analysis
- Data from THIN10 (a subsample of 10 practices as of February 2009)
- Already existing "code lists" to identify and manage cardiovascular events and related variables
  - Identify relevant Read codes and select records of patients with measurements for suitable variables
  - Will need to update and perhaps modify this code list
  - Created new (provisional) lists to identify records of prescription for statin treatment

Example (cont'd)
- Estimated a cardiovascular risk predictor
  - Based on the University of Edinburgh risk calculator (http://cvrisk.mvm.ed.ac.uk/calculator/calc.asp)
  - Combines two dimensions from the Framingham risk calculator
  - NB: the Framingham risk calculator would be ideal, but it is not consistently recorded in THIN
- Requires measurements of
  - HDL and total cholesterol;
  - systolic blood pressure;
  - smoking and diabetes status and the presence of left ventricular hypertrophy;
  - age and sex
- Problems with the recording of smoking status, so we will need to make this estimation more robust

Example (cont'd): preliminary analysis
- For the sake of simplicity we considered a simple continuous outcome
  - Measure of LDL cholesterol following the estimation of CVD risk
- To simplify the analysis, we grouped the patients according to their age at the risk prediction
  - Bins of 5 years (50-54 up to 85+)
- Each patient was assigned to the treatment group if they had a prescription for statins in the year following the risk prediction

Example: sharp design
- Assume that the design is sharp (i.e. "perfect" treatment allocation)
- Run two regression analyses
  - Control for sex, risk and age at LDL measurement
  - Treatment effect measured as ACE
  \[ ACE = E(Y \mid T = 1) - E(Y \mid T = 0) \]

Example: sharp design
[Figure: LDL (mmol/l) against predicted risk score, treated vs not treated, one panel per age band at prediction]
  50-54 (n = 1484): ACE = -0.271
  55-59 (n = 2016): ACE = -0.0334
  60-64 (n = 2188): ACE = -0.098
  65-69 (n = 2485): ACE = -0.554
  70-74 (n = 2142): ACE = 0.0552
  75-79 (n = 1167): ACE = 0.120
  80-84 (n = 613): ACE = 0.064
  85+ (n = 251): ACE = 3.32

Example: sharp design
[Figure (detail): age at prediction 65-69 (n = 2485), ACE = -0.554; LDL (mmol/l) against predicted risk score, treated vs not treated]

Example: sharp design
- ACE reasonably stable and negative (i.e. treatment decreases the level of LDL) for age groups 50-54 up to 70-74
- Older age groups show very unstable estimates (few data points in the treatment group!)
- Overall, the treatment effect is small
- More importantly, the design is not sharp!

Example: fuzzy design
[Figure: LDL (mmol/l) against predicted risk score, treated vs not treated; panels for age at prediction 50-54 (n = 1484), 55-59 (n = 2016), 60-64 (n = 2188), 65-69 (n = 2485) and 70-74 (n = 2142)]

Example: fuzzy design
- Under these circumstances, we cannot use the ACE to estimate the causal effect, but need to build the FACE
- For this preliminary analysis, we estimate the denominator using the observed raw proportions
- There are a few possible ways of computing the estimand
  - By threshold only
  - By treatment only
  - By treatment and threshold

Example (cont'd)
[Figure: LDL (mmol/l) against predicted risk score, treated vs not treated, one panel per age band at prediction]
  50-54 (n = 1484): FACE = -0.326
  55-59 (n = 2016): FACE = -0.509
  60-64 (n = 2188): FACE = -0.916
  65-69 (n = 2485): FACE = -5.53

Some results

         50-54     55-59     60-64     65-69     70-74
ACE     -0.2709   -0.0334   -0.0980   -0.5535    0.0550
FACE    -2.0254   -0.2734   -0.7816   -6.9494    0.5801
FACE*   -0.3255   -0.5085   -0.9161   -5.5263    4.2267

- ACE: two regressions on data defined by threshold and compliance
- $FACE = ACE / (\hat{p}_{1|1} - \hat{p}_{1|0})$
- ACE*: two regressions on data defined by thresholds but with treatment as predictor
- $FACE^* = ACE^* / (\hat{p}_{1|1} - \hat{p}_{1|0})$

Some results
- Estimates of the FACE are very unstable
- Need to come up with more robust estimates of the denominator

Example: comments
- The results are only indicative of the underlying causal mechanism, due to a series of factors
  - Data need to be made more robust (include more practices and more precise information on crucial predictors, such as smoking status)
  - Account properly for the two layers of "non-compliance"
    - GPs prescribing below the threshold (or not prescribing above)
    - Individual compliance (patients prescribed statins who do not take them continuously)
- There seems to be an effect of treatment, especially in some age groups, but more analyses are required
  - Careful stratification by sex
  - Control for more health conditions

Where to next?
- Clean up the data further and apply to the whole THIN dataset
- Find more stable/robust estimates of the denominator of the FACE
- Incorporate cost-effectiveness analysis
- Apply the RD design to other drugs/screening

References
[1] A. P. Dawid. Causal inference using influence diagrams: The problem of partial compliance (with Discussion). In P. J. Green, N. L. Hjort, and S. Richardson, editors, Highly Structured Stochastic Systems, pages 45-81. Oxford University Press, 2003.
[2] M. A. Hernan and J. M. Robins. Instruments for causal inference: An epidemiologist's dream? Epidemiology, 17(4):360-372, 2006.
[3] G. W. Imbens and T. Lemieux. Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142(2):615-635, 2008. The regression discontinuity design: Theory and applications.
[4] NICE. Quick reference guide: Statins for the prevention of cardiovascular events, 2008.
[5] D. L. Thistlethwaite and D. T. Campbell. Regression-discontinuity analysis: An alternative to the ex-post-facto experiment. Journal of Educational Psychology, 51(6):309-317, 1960.
[6] G. van der Klaauw. Regression-discontinuity analysis: A survey of recent developments in economics. Labour, 22(2):219-245, 2008.
[7] S. Ward, L. Jones, A. Pandor, M. Holmes, R. Ara, A. Ryan, W. Yeo, and N. Payne. A systematic review and economic evaluation of statins for the prevention of coronary events. Health Technology Assessment, 11(14), 2007.

Deriving the LATE
- Pretend we are looking at a randomised trial with partial compliance
- Introduce three variables
  - $Z$, the randomised treatment, which is not necessarily complied with
  - $U$, the unobserved confounders
  - $C_Z$, the preferred treatment under $Z$

Deriving the LATE
[Diagram: DAG on the nodes U, Z, T, Y, followed by the same DAG with U replaced by C_Z]
- If the DAG above describes the situation
- Then we can replace $U$ with $C_Z$

Deriving the LATE
- The $C_Z$'s look a bit like counterfactuals
- But they are not, as they represent preferences that can be elicited prior to any treatment being assigned
- So they are random variables
- We assume that $T = C_Z$,
  - i.e. the treatment actually taken is the preferred treatment
- We also assume monotonicity
  - Individuals do not want to do the opposite of what they are recommended
  - $p(C_0 = 1, C_1 = 0) = 0$

Deriving the LATE
- By using this set-up it is possible to derive an estimate of the LATE
  - based on only the $Z$'s and the $T$'s rather than the $C_Z$'s, which we cannot directly observe
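The derivation itself is not shown on the slides. The block below is a sketch of the standard argument and may differ in detail from the authors' decision-theoretic version. Besides $T = C_Z$ and monotonicity, it assumes (these are assumptions of this sketch) that $Z$ is randomised, so that $Z$ is independent of $(C_0, C_1)$, and that $Z$ affects $Y$ only through $T$ once the preferences are given.

```latex
% Sketch only; requires amsmath. Assumptions: Z independent of (C_0, C_1),
% T = C_Z, monotonicity p(C_0 = 1, C_1 = 0) = 0, and
% Y independent of Z given (T, C_0, C_1).
\begin{align*}
E(T \mid Z = 1) - E(T \mid Z = 0)
  &= p(C_1 = 1) - p(C_0 = 1) \\
  &= p(C_0 = 0, C_1 = 1) - p(C_0 = 1, C_1 = 0) \\
  &= p(C_0 = 0, C_1 = 1)
     && \text{(monotonicity)} \\[6pt]
E(Y \mid Z = 1) - E(Y \mid Z = 0)
  &= \sum_{c_0, c_1} p(c_0, c_1)
     \bigl[ E(Y \mid Z = 1, c_0, c_1) - E(Y \mid Z = 0, c_0, c_1) \bigr] \\
  &= p(C_0 = 0, C_1 = 1)
     \bigl[ E(Y \mid T = 1, C_0 = 0, C_1 = 1)
          - E(Y \mid T = 0, C_0 = 0, C_1 = 1) \bigr] \\[6pt]
\text{LATE}
  &= E(Y \mid T = 1, C_0 = 0, C_1 = 1) - E(Y \mid T = 0, C_0 = 0, C_1 = 1) \\
  &= \frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{E(T \mid Z = 1) - E(T \mid Z = 0)}
\end{align*}
```

In the second display the terms with $c_0 = c_1$ vanish because such individuals take the same treatment whatever the value of $Z$, and the term with $(c_0, c_1) = (1, 0)$ has probability zero by monotonicity. The final ratio involves only quantities that can be estimated from the observed data, and it has the same form as the FACE.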