Methods of explanatory analysis for psychological treatment trials workshop

Session 1
Introduction to causal inference and the analysis of treatment effects in the presence of departures from random allocation

Ian White

Funded by: MRC Methodology Grant G0600555
MHRN Methodology Research Group

Plan of session 1
1. Describe departures from random allocation
2. Intention-to-treat analysis, per-protocol analysis and their limitations
3. What do we want to estimate?
4. Estimation methods: principal stratification
5. Instrumental variables
6. Structural mean model
7. Extensions: complex departures, missing data, covariates
8. Small group discussion
Illustrated with data from the ODIN and SoCRATES trials

Parallel-group trial
[Diagram: participants are recruited and randomised to standard treatment (S) or experimental treatment (E). Before the outcome is measured, a participant in either arm may get the allocated treatment, switch to the other trial treatment (S or E), or change to a non-trial treatment (X) or no treatment (0).]

Aim of session 1
• Infer causal effect of treatment in the presence of departures from randomised intervention
  – Better term than “non-compliance”: includes both non-adherence and changes in prescribed treatment
• Types of departure:
  – Switches to other trial treatment or changes to non-trial (or no) treatment
  – Yes / no or quantitative (e.g. attend some sessions)
  – Constant or time-dependent
• We’ll start by considering the simplest case: all-or-nothing switches to the other trial treatment
• The methods introduced here will be used in later sessions

Intention-To-Treat (ITT) Principle
http://www.consort-statement.org/ glossary:
• “A strategy for analyzing data in which all participants are included in the group to which they were assigned, whether or not they completed the intervention given to the group.
• “Intention-to-treat analysis prevents bias caused by the loss of participants, which may disrupt the baseline equivalence established by random assignment and which may reflect non-adherence to the protocol.”
• Now the standard analysis, and rightly so

Intention-to-treat analysis
• Compare groups as randomised, ignoring any departures
• Answers an important pragmatic question
  – e.g. the public health impact of prescribing E
• Disadvantage: this may be the wrong question!
  – may want to explore public health impact of prescribing E outside the trial, when compliance might be less
    » alternative pragmatic question
  – may want to know the effect of receiving E
    » explanatory question

Disadvantage of ITT
• “Doctor doctor, will psychotherapy cure my depression?”
• “I don’t know, but I expect prescribing psychotherapy to reduce your BDI score by 5 units …
  – on average …
  – that’s on average over whether you attend or not”
• Clearly, judgements about whether a patient is likely to attend, take a drug, etc., should be a part of prescribing
• But we often need to know effects of attendance, the drug, etc. in themselves

Per-protocol (PP) analysis
• Alternative to ITT
• Exclude any data collected after a departure from randomised treatment
  – requires careful pre-definition: what will be counted as departures?
• Idea is to exclude data that doesn’t allow for the full effect of treatment
• However, PP implicitly assumes that individuals with different treatment experience are comparable
  – rarely true
  – in practice there can be substantial selection bias

Alternative to ITT and PP
• We adopt a “causal modelling” approach that carefully considers what we want to estimate and what assumptions are needed to do so
• Estimation will avoid assumptions of comparability between groups as treated
  – will instead be based on comparisons of randomised groups

What do we want to estimate?
• The effect of the intervention, if everyone had received their randomised intervention?
  – “average causal effect”, ACE
  – “average treatment effect”, ATE
  – conceptual difficulties:
    » how could we make them receive their randomised intervention?
    » would this be ethical?
    » would it have other consequences?
  – technical difficulties:
    » turns out to be unidentified (unestimable) without further strong assumptions

What do we want to estimate? (2)
Alternatives to the average causal effect:
• “Average treatment effect in the treated”, ATT
• “Complier-average causal effect”, CACE
  – to be defined below
• Note how we separate what we want to estimate from analysis methods

Counterfactuals
• Consider a trial of intervention E vs. control S
• Define “counterfactual” or “potential” outcomes:
  – Yi(1) = outcome for individual i if they received intervention
  – Yi(0) = outcome for individual i if they received control
  – We can only observe one of these!
• Intervention effect for individual i is Δi = Yi(1) - Yi(0)
• Then average causal effect of intervention is E[Δi]
  – the average difference between outcome with intervention and outcome with control

Estimation with perfect compliance
• With perfect compliance, we observe
  – Yi(1) in everyone in the intervention arm
  – Yi(0) in everyone in the control arm
• Randomisation means that mean outcome with intervention can be estimated by mean outcome of those who got intervention:
    E[Yi | R=E] - E[Yi | R=S]
      = E[Yi(1) | R=E] - E[Yi(0) | R=S]
      = E[Yi(1)] - E[Yi(0)]
      = E[Δi]
  – Not true with imperfect compliance!
• So, with perfect compliance, ITT estimates the average causal effect of intervention

Estimation with imperfect compliance
• Assume “all-or-nothing” compliance
  – everyone gets either intervention or control
• In the intervention arm, we observe
  – Yi(1) in compliers
  – Yi(0) in non-compliers
• In the control arm, we observe
  – Yi(0) in compliers
  – Yi(1) in “contaminators”
• Need assumptions to estimate the average causal effect of intervention
• A very simple assumption is
  – Yi(1) - Yi(0) = b
  – b is the (average) causal effect of intervention

Estimation with imperfect compliance (2)
• Continuing with “causal model” Yi(1) - Yi(0) = b
  – can be written as Yi = Yi(0) + b Di
  – Di = 1 if intervention was received, else 0 (Di here is treatment received, not the individual effect Δi)
• Implies that
    expected difference in outcome (between randomised groups)
      = causal effect of intervention × expected difference in intervention receipt
  – E[Yi | R=E] - E[Yi | R=S] = b {E[Di | R=E] - E[Di | R=S]}
• This gives the simplest causal estimator:
    causal effect of intervention
      = expected difference in outcome / expected difference in intervention receipt

But …
• Angrist, Imbens and Rubin (1996) took a different perspective and showed that this estimator isn’t what it seems
• To see this, consider “counterfactual treatments”:
  – DiE = treatment if randomised to intervention
  – DiS = treatment if randomised to control
  – both are 0/1 (received standard / intervention)
• Implies 4
types of person (“compliance-types”):
  – DiE=1, DiS=1: always-takers
  – DiE=1, DiS=0: compliers
  – DiE=0, DiS=0: never-takers
  – DiE=0, DiS=1: defiers (assumed absent)

Introducing the complier-average causal effect

                  Outcome if randomised to
                  intervention     control
  Always-taker    Yi(1)            Yi(1)
  Complier        Yi(1)            Yi(0)
  Never-taker     Yi(0)            Yi(0)

• The observed data tell us nothing about the causal effects of treatment in always-takers and never-takers
• In fact, our simple estimator estimates the “complier-average causal effect” (CACE) = E[Yi(1) - Yi(0) | DiE=1, DiS=0]
• This is all we can hope to estimate in RCTs!

Problems with the CACE
• We don’t know who is a “complier”
• In practice, we may want to know what will be observed
  – if compliance is worse than in the trial (e.g. if rolled out in clinical practice)
  – if compliance is better than in the trial (e.g. because intervention is well publicised / marketed)
This means we want to know the average causal effect in a different subgroup. We might assume this equals the CACE, but that is an assumption.

Summary of things we can estimate
• ITT:     E[Y | R=E] - E[Y | R=S]
• PP:      E[Y | R=E, DE=1] - E[Y | R=S, DS=0]
• ACE/ATE: E[Y(1) - Y(0)]
• ATT:     E[Y(1) - Y(0) | DE=1]
• CACE:    E[Y(1) - Y(0) | DE=1, DS=0]
We are going to explore ways to estimate the CACE

Principal stratification
• An idea of Frangakis and Rubin (1999), generalising the simple compliance-types above
• Again, let
  – DiE = treatment if randomised to intervention
  – DiS = treatment if randomised to control
  where both could be complex (e.g.
numbers of sessions of psychotherapy)
• Principal strata are the levels of the pair (DiE, DiS)

Using principal stratification
• We should model outcomes conditional on principal strata
  – typically allow a different mean for each principal stratum
  – avoids assuming they are comparable
  – allow differences between randomised groups within principal strata
  – these parameters have a causal meaning
• Of course this may not be easy, since for every individual we only know one of (DiE, DiS), so we don’t know their principal stratum

Example: ODIN trial
• Trial of 2 psychological interventions to reduce depression (Dowrick et al, 2000)
• Randomised individuals:
  – 236 to the psychological interventions (E)
  – 191 to treatment as usual (S)
• Outcome: Beck Depression Inventory (BDI) at 6 months
  – recorded on 317 randomised individuals

ITT results
                               Mean (SD) of BDI6                 Difference in BDI6
                               E (n=177)       S (n=140)         (std error)
  Unadjusted                   13.29 (9.85)    15.16 (10.42)     -1.87 (1.14)
  Adjusted for baseline BDI                                      -2.28 (1.02)

ODIN trial: compliance
• Of 236 individuals randomised to psychological interventions, 128 (54%) attended in full
  – others refused, did not attend or discontinued
• Psychological interventions weren’t available to the control arm (no “contaminators”) so DS=0 for all
• Only 2 principal strata:
  – would attend if randomised to intervention
    » DE=1, “compliers”
  – would not attend if randomised to intervention
    » DE=0, “never-takers”

Exclusion restriction
• Key assumption used to identify the CACE
• In individuals for whom randomisation has no effect on treatment (e.g. in never-takers and always-takers), randomisation has no effect on outcome
• Often reasonable: e.g. in a double-blind drug trial, not taking active drug is the same as not taking placebo
• But not always reasonable: e.g.
not attending counselling despite being invited could be different from not attending because uninvited
  – “I wouldn’t have gone, but I’d like to have been invited”

Exclusion restriction in ODIN
• In ODIN, the exclusion restriction means that randomisation has no effect on outcomes in those who would not attend if randomised to psychological intervention
• But recall that we included those who discontinued as “non-attenders”
  – their partial attendance is very likely to have had some effect on them
  – the exclusion restriction would be more plausible if we defined compliance as any attendance
  – we’ll return to this later

CACE analysis (complete cases)

                 Compliers           Never-takers        All
                 n       mean BDI    n       mean BDI    n       mean BDI
  Therapy (E)    118     13.32       59      13.22       177     13.29
  Control (S)    ?       ?           ?       ?           140     15.16

CACE analysis (2)

                 Compliers           Never-takers        All
                 n       mean BDI    n       mean BDI    n       mean BDI
  Therapy (E)    118     13.32       59      13.22       177     13.29
  Control (S)    93.3    16.13       46.7    13.22       140     15.16

• Randomisation balance: expected number of never-takers in the control arm is 59*140/177 = 46.7, leaving 93.3 compliers
• Exclusion restriction: never-takers have the same mean BDI (13.22) in both arms
• The control-arm complier mean (16.13) then follows from the overall control mean (15.16)
• Complier-average causal effect: CACE = 13.32 - 16.13 = -2.81 (cf ITT = 13.29 - 15.16 = -1.87)
• Note: 66.7% compliance (118/177), and ITT / 0.667 = CACE

CACE vs. PP
Using the same table as in CACE analysis (2):
• CACE compares therapy compliers (13.32) with control compliers (16.13)
• PP compares therapy compliers (13.32) with the whole control arm (15.16)
• CACE is based on the “exclusion restriction” assumption
• Per-protocol analysis estimates the CACE under the “random non-compliance” assumption

Instrumental variables (IV)
• Popular in econometrics
• Model:
  – Model of interest: Yi = a + b Di + ei
  – Error ei may be correlated with Di (“endogenous”)
  – Example in econometrics: D is years of education, Y is adult wage, e includes unobserved confounders
• We can’t estimate b by ordinary linear regression
• Instead, we assume error ei is independent of a third variable, the instrumental variable Ri
  – i.e. Ri only affects outcome through its effect on Di
  – or: randomisation only affects outcome through its effect on treatment actually received

IV estimation
• Estimation by “two-stage least squares”: model implies
  – E[Yi | Ri] = a + b E[Di | Ri]
  – so first regress Di on Ri to get E[Di | Ri]
  – then regress Yi on E[Di | Ri]
  – NB standard errors not quite correct by this method: general IV uses different standard errors
• More generally, we use an estimating equation based on Σi Ri (Yi - a - b Di) = 0

Instrumental variables for ODIN

  . ivreg bdi6 (treata=z)

  Instrumental variables (2SLS) regression

        Source |       SS       df       MS              Number of obs =     317
  -------------+------------------------------           F(  1,   315) =    2.64
         Model | -58.5115086     1 -58.5115086           Prob > F      =  0.1049
      Residual |  32532.4232   315  103.277534           R-squared     =       .
  -------------+------------------------------           Adj R-squared =       .
         Total |  32473.9117   316  102.765543           Root MSE      =  10.163

  ------------------------------------------------------------------------------
          bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
        treata |  -2.803511   1.724143    -1.63   0.105    -6.195802    .5887801
         _cons |   15.15714   .8588927    17.65   0.000     13.46725    16.84703
  ------------------------------------------------------------------------------
  Instrumented:  treata
  Instruments:   z
  ------------------------------------------------------------------------------

Same estimate as before!

Easy to extend to include covariates
  . ivreg bdi6 (treata=z) bdi0

  Instrumental variables (2SLS) regression

        Source |       SS       df       MS              Number of obs =     317
  -------------+------------------------------           F(  2,   314) =   43.26
         Model |  6808.64828     2  3404.32414           Prob > F      =  0.0000
      Residual |  25665.2634   314  81.7365076           R-squared     =  0.2097
  -------------+------------------------------           Adj R-squared =  0.2046
         Total |  32473.9117   316  102.765543           Root MSE      =  9.0408

  ------------------------------------------------------------------------------
          bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
        treata |  -3.428509   1.539881    -2.23   0.027    -6.458298   -.3987196
          bdi0 |   .5813933   .0630405     9.22   0.000     .4573581    .7054285
         _cons |   2.395561   1.546673     1.55   0.122    -.6475924    5.438714
  ------------------------------------------------------------------------------
  Instrumented:  treata
  Instruments:   bdi0 z
  ------------------------------------------------------------------------------

Usual gain in precision

Structural mean model (SMM)
• Extends our simple model Yi(1) - Yi(0) = b
• SMM is E[YiE - YiC | DiE, DiC, X] = b Di*
  – where Di* is a summary of treatment thought to have a causal effect, e.g.:
    » Di* = DiE - DiC: causal effect of treatment is proportional to amount of treatment
    » Di* = (DiE - DiC, Xi(DiE - DiC)): and X is an effect modifier
• Goetghebeur and Lapp, 1997 (assumed DiC=0)
• Estimation is equivalent to instrumental variables with R and R*X as instruments
  – in other words, we also assume that X does not modify the causal effect of treatment

Summary for binary compliance
• The principal stratification approach divides individuals into always-takers, compliers and never-takers
• We can then identify the complier-average causal effect, provided we make the exclusion restriction assumption
• This works for binary or continuous outcomes
• Instrumental variables and structural mean models approaches lead to the same estimates for continuous outcomes
• For binary outcomes, instrumental variables are problematic, and generalised structural mean models are needed (Vansteelandt and Goetghebeur, 2003)

Example with missing outcome data
• Our IV analyses of ODIN used complete cases only
• This is a bad idea
• Follow-up rates were worse in non-attenders (55%) than in attenders (92%)
• So we modify the previous analysis
• We will now assume the data are “missing at random” given randomised group and attendance
  – e.g.
among non-attenders, there is no difference on average between non-responders and responders

CACE analysis under MAR

  Arm           Stratum         n (complete cases)   n (MAR)   Mean BDI
  Therapy (E)   Compliers       118                  128       13.32
  Therapy (E)   Never-takers    59                   108       13.22
  Therapy (E)   All             177                  236       13.29
  Control (S)   Compliers       93.3                 103.6     16.80 (16.13 in complete cases)
  Control (S)   Never-takers    46.7                 87.4      13.22
  Control (S)   All             140                  191       15.16

• Randomisation balance: expected number of never-takers in the control arm is 108*191/236 = 87.4
• Exclusion restriction: never-takers have the same mean BDI (13.22) in both arms
• CACE (MAR) = 13.32 - 16.80 = -3.48
• cf CACE (CC) = 13.32 - 16.13 = -2.81

A more general approach
• We can allow for missing data by using inverse probability weights
• Suppose a certain group of individuals has only 50% chance of responding
  – give each responder in that group a weight of 2
  – accounts for their non-responding fellows
• In ODIN, we will consider the baseline-adjusted analysis
• We will construct weights depending on baseline BDI, randomised group and attendance

Constructing the weights

  . logistic resp6 z treata bdi0

  Logistic regression                               Number of obs   =        427
                                                    LR chi2(3)      =      49.84
                                                    Prob > chi2     =     0.0000
  Log likelihood = -218.70364                       Pseudo R2       =     0.1023

  ------------------------------------------------------------------------------
         resp6 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
             z |   .4327186   .1102412    -3.29   0.001     .2626333    .7129535
        treata |    10.1753   3.909568     6.04   0.000     4.791789    21.60713
          bdi0 |   .9750455   .0136551    -1.80   0.071     .9486461     1.00218
  ------------------------------------------------------------------------------

  . predict presp
  (option pr assumed; Pr(resp6))

  . gen wt=1/presp

Examining the weights
[Figure: weight (wt, about 1 to 2.5) plotted against BDI at baseline (about 10 to 50), with separate curves labelled therapy non-compliers, control, and therapy compliers.]

Weighted IV analysis
  . ivreg bdi6 (treata=z) bdi0 [pw=wt]
  (sum of wgt is   4.2710e+02)

  Instrumental variables (2SLS) regression               Number of obs =     317
                                                         F(  2,   314) =   37.28
                                                         Prob > F      =  0.0000
                                                         R-squared     =  0.2183
                                                         Root MSE      =  9.0521

  ------------------------------------------------------------------------------
               |               Robust
          bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
        treata |  -3.953868   1.944846    -2.03   0.043    -7.780444   -.1272916
          bdi0 |   .5810663   .0680343     8.54   0.000     .4472056     .714927
         _cons |    2.37602   1.554941     1.53   0.128    -.6834003    5.435441
  ------------------------------------------------------------------------------
  Instrumented:  treata
  Instruments:   bdi0 z
  ------------------------------------------------------------------------------

Back to the exclusion restriction
• Recall that partial attenders were included as non-compliers
• If instead we include them as compliers, the exclusion restriction is much more plausible
• The estimated causal effect is smaller because it is an average over a wider group that includes partial compliers

  Definition of complier    Causal effect (95% CI)
  Full attendance           -3.95 (-7.78 to -0.13)
  Any attendance            -3.19 (-6.17 to -0.21)

Summary of ODIN results

  Analysis   Missing data?   Adjusted?   Complier definition   Causal effect (std. error)
  ITT        CC (=MAR)       n                                 -1.87 (1.14)
  ITT        CC (=MAR)       y                                 -2.28 (1.02)
  CACE       CC              n           full                  -2.81 (1.72)
  CACE       CC              y           full                  -3.43 (1.54)
  CACE       MAR             n           full                  -3.48 (2.15)
  CACE       MAR             y           full                  -3.95 (1.94)
  CACE       MAR             y           any                   -3.19 (1.51)

Example with continuous compliance: the SoCRATES trial
• SoCRATES was a multi-centre RCT designed to evaluate the effects of cognitive behaviour therapy (CBT) and supportive counselling (SC) on the outcomes of an early episode of schizophrenia.
• 201 participants were allocated to one of three groups:
  – Control: Treatment as Usual (TAU)
  – Treatment: TAU plus psychological intervention, either CBT + TAU or SC + TAU
  – The two treatment groups are combined in our analyses
• Outcome: psychotic symptoms score (PANSS) at 18 months

SoCRATES: ITT results

                                            Mean (SD) of 18m PANSS        Difference in 18m PANSS
                                            E (n=84)       S (n=69)       (95% CI)
  Unadjusted                                61.1 (20.0)    66.3 (18.2)    -5.2 (-11.3 to +1.0)
  Adjusted for baseline PANSS and centre                                  -6.7 (-11.7 to -1.6)

SoCRATES: compliance
• We have a record of the number of sessions attended
  – ranges from 2 to 29 in the intervention group
  – 0 for all in the control group
• We could dichotomise
  – e.g. split at the median (17)
  – attending <17 sessions is “non-compliance”
  – BUT the exclusion restriction is implausible
• Instead, we keep number of sessions as continuous

Model for continuous compliance
• Structural mean model: Yi(1) - Yi(0) = b Di(1)
• The causal effect of d sessions is proportional to the number of sessions
  – 20 sessions are twice as good as 10 sessions
• This is an assumption that you have to believe
• Q: is this assumption wrong if individuals continue with sessions until they feel they have achieved an adequate benefit?
• Estimation can be done by instrumental variables just as before

IV model in SoCRATES

  . ivregress 2sls pant18 (sessions=rg) i.centre pantot [pw=1/presp], small
  (sum of wgt is   2.0101e+02)

NB: I’ve used Stata 11 here

  ------------------------------------------------------------------------------
               |               Robust
        pant18 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
  -------------+----------------------------------------------------------------
      sessions |  -.4243381   .1632735    -2.60   0.010    -.7469866   -.1016897
               |
        centre |
            2  |   5.927803   4.013788     1.48   0.142    -2.003934    13.85954
            3  |  -11.32247   2.523946    -4.49   0.000     -16.3101   -6.334842
               |
        pantot |   .4236632    .091294     4.64   0.000      .243255    .6040714
         _cons |   30.27006    7.72171     3.92   0.000     15.01101     45.5291
  ------------------------------------------------------------------------------
  Instrumented:  sessions
  Instruments:   2.centre 3.centre pantot rgroup

Each extra session reduces PANSS by 0.4 points

Summary for continuous compliance
• There are too many principal strata for the principal stratification approach to work
• Instrumental variables and structural mean models approaches work for continuous outcomes

Practical session
• Please work in small groups.
• We’ll consider the “Down your drink” (DYD) trial
  – internet users seeking help with their drinking were randomised to a new interactive website or control.
  – the intervention group’s use of the new website is measured by the number of page hits. The mean was 60 hits over a 3-month period.
  – outcome: weekly alcohol consumption at 3 months
• I will list some possible analyses of this trial, all aiming to estimate the causal effect of treatment. In each case, please:
  – identify the underlying assumption
  – decide how plausible you think that assumption is.

Analyses to consider (1)
1. Regarding those who hit fewer than 60 pages as “non-compliers”:
   a) A per-protocol analysis: intervention group compliers compared with the control group
   b) A CACE analysis: intervention group compliers compared with those members of the control group who would have complied if they had been randomised to intervention
2. The same, but regarding those who hit fewer than 10 pages as “non-compliers”
3. A structural mean model analysis, modelling the causal effect of the intervention as proportional to the number of pages hit

Analyses to consider (2)
The control group had access to a different web site, and averaged 30 page hits.
• A per-protocol analysis: intervention group with >60 page hits compared with the control group with >30 page hits
• A SMM analysis modelling the causal effect of each intervention as proportional to the number of pages hit (with different parameters)
• Do you have any other suggestions for the analysis?

References
• Dowrick C, Dunn G, et al. Problem solving treatment and group psychoeducation for depression: multicentre randomised controlled trial. BMJ 2000; 321: 1450–4.
• Goetghebeur E, Lapp K. The effect of treatment compliance in a placebo-controlled trial: Regression with unpaired data. JRSS(C) 1997; 46: 351–364.
• Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. JASA 1996; 91: 444–455.

Suggested further reading
• Dunn G et al. Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up. British Journal of Psychiatry 2003; 183: 323–331.
  – simple CACE methods
• Maracy M, Dunn G. Estimating dose-response effects in psychological treatment trials: the role of instrumental variables. SMiMR 2008.
  – IV methods
• White IR. Uses and limitations of randomization-based efficacy estimators. SMiMR 2005; 14: 327–347.
  – overview of ideas
• Fischer-Lapp K, Goetghebeur E.
Practical properties of some structural mean analyses of the effect of compliance in randomized trials. Controlled Clinical Trials 1999; 20: 531–546.
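
Worked check of the ODIN complete-case CACE arithmetic
As a cross-check on the complete-case CACE slides for ODIN, the arithmetic can be written out step by step. This is only a sketch using the rounded figures shown on the slides. The symbols n_N^S, n_C^S and \bar{Y}_C^S (control-arm never-taker count, complier count and complier mean BDI) are notation introduced here for illustration and do not appear in the original slides.

```latex
% Assumed notation (not from the slides):
%   n_N^S, n_C^S : expected never-takers / compliers in the control arm
%   \bar{Y}_C^S  : mean BDI of control-arm compliers
\begin{align*}
n_N^{S} &= 59 \times \frac{140}{177} \approx 46.7
  && \text{(randomisation balance)} \\
n_C^{S} &= 140 - 46.7 = 93.3 \\
\bar{Y}_C^{S} &= \frac{140 \times 15.16 \;-\; 46.7 \times 13.22}{93.3} \approx 16.13
  && \text{(exclusion restriction: never-taker mean is 13.22 in both arms)} \\
\text{CACE} &= 13.32 - 16.13 = -2.81 \\
\frac{\text{ITT}}{\text{proportion complying}} &= \frac{13.29 - 15.16}{118/177}
  = \frac{-1.87}{0.667} \approx -2.80
\end{align*}
```

The small gap between -2.81 and the ratio form (about -2.80, in line with the ivreg coefficient of -2.8035) appears to come only from rounding the group means to two decimal places.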