2/20/2012
Study Designs Appropriate for Comparative Effectiveness Research
Matthew L. Maciejewski, PhD
Durham VA HSR&D and Duke University

Objectives of Session
• Motivation
• Incident vs. prevalent user cohorts
• Counterfactuals (aka comparators, controls)
• Understand definition of and threats to internal validity and external validity
  – Fundamental to study design selection
• Appreciate most rigorous quasi-experimental study designs
  – Quasi-experiments as “broken trials”

Motivation & Value of Quasi-Experimental Studies
• Quasi-experiment is 1st step to RCT
  – Big secondary data yield sample estimates of outcome variation and differences
  – May generate/support hypotheses of causes
• Quasi-experiment is only way to assess change in policy or clinical practice
  – Population effect in real world is parameter of interest

Ideal Quasi-Experiment
• Answers the research question
• Great internal validity
  – Treatment effect estimated is “true” effect
• External validity
  – Generalizable across settings, time

Competing Demands in CER
• As scientists, internal validity is #1 concern
  – Why RCTs are considered the gold standard
  – Why evidence from observational studies is considered suspect or invalid
  – Historically, this is of greatest concern
• As policy/clinical experts, generalizability is of concern
  – Want results to be definitive across space (& time?)
  – In CER, this is more prominent and quasi-experimental studies can address this issue

Four Design Choices
• In absence of randomization, four study design choices determine internal validity of your study
  – Incident or prevalent users
  – Control group
  – Pre-period measurement
  – Post-period measurement
• The more thoughtfully these design elements are chosen a priori, the stronger the study’s internal validity and causal statements

Concept of Internal Validity
• Internal validity refers to ability to make causal inferences (e.g., Y is caused by X)
  – With successful randomization, X is only source of variation between groups
  – In quasi-experimental studies, X is no longer only source of variation b/c tx & control not balanced as in RCT
• Internal validity is not generalizability, reproducibility or reliability

Conditions for Causality
• Treatment precedes outcome
  – Temporal relationship
• Treatment and outcome covary
  – Significant correlation
• No other plausible competing hypotheses
  – Identification of “true” cause is process of eliminating all competing hypotheses but 1
  – If all other explanations for outcome variation are eliminated but treatment, treatment is cause

Incident User & Prevalent User Designs

Three Cohort Choices
• Incident (aka new) users
  – Index date refers to first use with no history of prior use
  – Useful for medication and surgery studies
• Prevalent users
  – Index date refers to treatment use in reference to some other date (e.g., policy change)
  – Represents majority of medication users at point in time
• Both incident & prevalent users (‘wedge design’)

Tradeoffs of Incident User Cohort
• Advantages
  – Well suited to assess acute meds or initial risks of chronic meds
  – Captures events that occur early in therapy
  – Can control for risk factors before they are altered by exposure
• Disadvantages
  – Likely small sample
  – Limited precision/efficiency
  – Need to be careful about characterizing risk of adverse events if risk varies greatly over time in early period
Ray WA, 2003 Am J Epi, 158(9): 915-920

Defining Pre-Period for Incident Cohort Design
• Two reasons to define pre-period carefully
  – Assess covariates before patient exposed to tx
  – Ensure no prior history of treatment exposure (wash out)
• Need to choose lookback period carefully
  – 6 months: 32 of 1000 pts were eligible
  – 12 months: 27.5 of 1000
  – 24 months: 23.5 of 1000
  – 108 months: 17.2 of 1000
Gardarsdottir H, Heerdink ER, Egberts AC, 2006 PharmacoEpi Drug Safety 15(5): 338

Tradeoffs of Prevalent User Cohort
• Advantages
  – Can characterize event risk because risk likely stable over time
  – Large sample
  – Improved precision/efficiency
  – Addresses experience of most current medication users
• Disadvantages
  – Under-ascertainment of events occurring early due to treatment (you miss folks who stopped meds due to adverse event)
  – “Baseline” covariates may have been impacted by treatment exposure
  – Treatment duration unknown and variable
  – Patients who adhere to treatment over long term are not random sample
Ray WA, 2003 Am J Epi, 158(9): 915-920

Defining Pre-Period for Prevalent Cohort
• Need to choose lookback period and history of treatment exposure carefully

Study | Pre-Period Duration (months) | Number of Intervals | Number of Fills per Interval | Each Interval Length
Blais 2003 | 70 | 1 | 1+ | 70
Roblin 2005 | 12 | 2 | 4+, 1+ | 6, 6
Landsman 2005 | 24 | 2 | 1+ | 3, 24
Schneeweiss et al., 2007 | 12 | 2 | 0, 1+ | 6, 12
Yin, 2008 | 24 | 1 | 1+ | 24
Doshi 2009 | 24 | 2 | 1+ | 3, 24
Maciejewski 2010 | 12 | 2 | 1+ | 3, 9

The Value of a Control Group for Internal Validity

Importance of Control Group
• Terminology
  – Control group = counterfactual = comparator
• When observing treatment, how do we know what would have happened in the absence of treatment?
  – Counterfactual evidence is needed
• Control group is needed to compare
  – Outcomes under treatment
  – Outcomes NEVER under treatment

What is a Counterfactual?
• Counterfactual is measurement on subject that mirrors subject in every respect
  – Observed and unobserved
• Perfect counterfactual represents
  – What would have happened to treated patient had they not received treatment
  – What would have happened to control patient had they received treatment
• Ideal counterfactual: two universes
  – Same person in treatment & in control at same time

Ways of Estimating Counterfactuals
• Second best: Identical twins in RCT
• Third best: RCT
• Fourth best: Quasi-experiment w/ equivalent controls
  – Recommended by Shadish (2008)
• Fifth best: Quasi-experiment w/ non-equivalent controls
  – This is typical counterfactual in quasi-experiment

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment; x-axis: Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment, Control; x-axis: Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

Counterfactuals in Quasi-Experiments
• Self-selection in quasi-experiments
  – Imbalance in observed covariates
  – Can assume imbalance in unobserved factors
  – Imbalance = nonequivalence (compared to RCT)
• Non-equivalence complicates interpretation of treatment effect b/c several explanations
  – Treatment? Selection bias? Both?
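
Imbalance in observed covariates is often summarized with a standardized difference: the difference in group means divided by a pooled standard deviation, times 100. The Python sketch below is illustrative only. The function name std_diff is hypothetical, and the pooled-SD formula is an assumption about how such a statistic might be computed, not a method stated in these slides. The inputs are the age and BMI means and SDs reported for surgical cases and non-surgical controls in the bariatric surgery example later in this deck.

```python
import math


def std_diff(mean_tx, sd_tx, mean_ctl, sd_ctl):
    """Standardized difference (in %) between treated and control means.

    Assumed formula (not stated in the slides):
        100 * (mean_tx - mean_ctl) / sqrt((sd_tx**2 + sd_ctl**2) / 2)
    """
    pooled_sd = math.sqrt((sd_tx ** 2 + sd_ctl ** 2) / 2)
    return 100 * (mean_tx - mean_ctl) / pooled_sd


# Summary statistics from the bariatric surgery slide:
# surgical cases (N=850) vs. non-surgical controls (N=41,244).
age = std_diff(49.5, 8.3, 54.7, 10.2)
bmi = std_diff(47.4, 7.8, 42.0, 5.0)

print(f"Age standardized difference: {age:.2f}")
print(f"BMI standardized difference: {bmi:.2f}")
```

Under this assumed formula the results round to -55.92 for age and 82.43 for BMI, which agree with the Std Diff column on that slide. Unlike a p-value, the standardized difference does not shrink or grow with sample size, which may be why it is reported alongside the p-values when the control group is this large.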
Threats to Internal Validity

Biggest Internal Validity Threats in Quasi-Experimental CER
• Ambiguous Temporal Precedence
• Selection Bias
• Regression
• History

Threats to Internal Validity
• Ambiguous Temporal Precedence
  – Major concern in cross-sectional studies
  – E.g., volume-outcome relationship
• Selection
  – “confounding of treatment effects with population differences”
  – Observed (easier to fix) or Unobserved
  – May require knowledge of how people get treatment, not just knowledge of outcome

Example of Selection Bias on Observables (Bariatric Surgery)

Characteristic | Surgical Cases (N=850) | Non-Surg Controls (N=41,244) | Statistical Significance (p-value) | Std Diff
Age (Mean, SD) | 49.5 (8.3) | 54.7 (10.2) | <0.001 | -55.92
Male (%) | 73.9 | 91.7 | <0.001 | -48.54
Caucasian (%) | 77.9 | 67.8 | <0.001 | 22.86
Non-Caucasian (%) | 16.0 | 19.3 | 0.015 | -8.66
Married (%) | 52.1 | 57.7 | 0.001 | -11.27
Previously Married (%) | 30.4 | 27.3 | 0.048 | 6.85
Never Married (%) | 16.4 | 13.8 | 0.035 | 7.00
Unknown Marital (%) | 1.2 | 1.2 | 0.957 | 0
BMI (Mean, SD) | 47.4 (7.8) | 42.0 (5.0) | <0.001 | 82.43
Diagnostic Cost Group score (Mean, SD) | 0.60 (0.92) | 0.47 (0.75) | <0.001 | 15.49

Addressing Temporal Precedence and Selection Threats
• Ambiguous Temporal Precedence
  – Design: Choose sample to identify precedence
  – Statistics: Examine Y = f(X) & X = f(Y)
• Selection
  – Design: Careful a priori choice of primary comparator
  – Measurement: Additional data collection
  – Statistics: Propensity score, selection methods

History Threat to Internal Validity
• History
  – Co-occurring events (e.g., pt in multiple RCTs) that could impact outcome of interest
  – Attitudes about race, homosexuality change
• Address by understanding, measuring and adjusting for co-occurring changes over time

History Threat: Near Simultaneous Copay Changes
Timeline, 2000 to 2003:
• December 6, 2001: Specialty care copay up from $15 to $50; primary care copay ($15) introduced
• February 4, 2002: Medication copay up from $2 to $7
Maciejewski, et al., 2010 AJMC

Attempt to Account for History Threat (1 of 2)
Full model (includes lagged OP visits)

Variable | β | S.E.
Copay non-exempt | 0.030 | .075
Short-term (ST) post | 0.292*** | .055
Long-term (LT) post | -0.078 | .076
Copay * ST post | -0.082 | .060
Copay * LT post | -0.150 | .082
Wang, Liu & Maciejewski, HSR in press

Attempt to Account for History Threat (2 of 2)

Variable | Full model (includes lagged OP visits): β (S.E.) | Reduced model (includes current OP visits): β (S.E.) | Reduced model (excludes lagged OP visits): β (S.E.)
Copay non-exempt | 0.030 (.075) | 0.032 (.075) | 0.033 (.075)
Short-term (ST) post | 0.292*** (.055) | 0.288*** (.055) | 0.288*** (.055)
Long-term (LT) post | -0.078 (.076) | -0.082 (.076) | -0.082 (.076)
Copay * ST post | -0.082 (.060) | -0.078 (.060) | -0.078 (.060)
Copay * LT post | -0.150 (.082) | -0.145 (.082) | -0.144 (.082)
Wang, Liu & Maciejewski, HSR in press

Regression Threat to Internal Validity
• Regression (to the mean)
  – Units selected because of extreme pre-period values tend to move back toward their average at the next measurement, which can be mistaken for a treatment effect
• Address by measuring more than one pre-period (e.g., Yr -2 as well as Yr -1) and by including a control group

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment; x-axis: Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment; x-axis: Yr -2, Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment, Control; x-axis: Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

What is Effect of QI Program on Hospital Acquired Infections?
Figure: line chart (y-axis 0 to 50); series: Treatment, Control; x-axis: Yr -2, Yr -1, Yr 1, Yr 2
Harris, Maciejewski, in press Health Affairs

Measurement Issues for Internal Validity

Pre-tests, Post-tests & Non-equivalent Dependent Variables
• Pre-test measurements
  – Can assess selection bias
• Post-test measurements
  – Can reduce/eliminate ambiguity about temporal precedence of cause & effect
  – The more the better to see pattern of association
• Non-equivalent outcomes (post-tests)
  – Assesses degree of change due to other (unmeasured) contextual factors

Natural Experiments
• Definition: A study that contrasts a naturally occurring event, such as an earthquake, with a comparable condition (Shadish, Cook & Campbell)
  – Intervention cannot be manipulated by researcher or research subject

Example of a Natural Experiment
• Value-based insurance design (VBID) program implemented by BCBSNC in Jan 2008
  – Lowered generic copays for some, brand copays for all
  – Population based implementation, so tons of patients across multiple drug classes
  – Many employers opted out, so good controls
• Controls weren’t equivalent, but made so via propensity score matching
• Non-equivalent outcomes
  – ACEIs & CAIs only had brand Rx available
Maciejewski, Health Affairs Nov 2010

Design of VBID Evaluation
• Pre-tests
  – One year prior to VBID implementation
• Post-tests
  – Two years after VBID implementation
• Controls
  – Non-equivalent, but made so via propensity score matching
• Non-equivalent outcomes (ACEIs & CAIs)
  – Only brand Rx available
Maciejewski, Health Affairs Nov 2010

Unadjusted Trends in Diuretic Adherence (MPR)
Figure: chart of MPR (y-axis 0% to 100%) for Cases and Controls in 2007 and 2008; data labels 79%, 79%, 79%, 76%; difference-in-difference = 3%

Adjusted Adherence Difference Associated with VBID Program
Figure: chart (y-axis 0.0% to 4.5%) by drug class: Diuretics, OHAs, ACEIs, Beta blockers, Statins, Calcium CBs, ARBs,
CAIs
Note: GLM in unmatched sample, controlling for age, gender, Episode Risk Groups, count of unique medications filled, generic dispensing rate, total copay, and whether an enrollee filled one or more prescriptions with a 90-day supply; Maciejewski et al. Nov 2010 Health Affairs

Characteristics of Quasi-Experiments with Strong Internal Validity
• At least one pre-test measurement
• At least one post-test measurement
  – Outcomes of interest
  – Non-equivalent outcomes
• Control group
  – The more equivalent the better
• Choice of incident or prevalent user design depends on research question

External Validity

Concept of External Validity
• External validity = generalizability of results
  – Persons, settings, treatments outside of study
• Fundamental question: Is response to treatment homogeneous or heterogeneous?
  – Assessment by subgroup (defined by pt characteristic, site, time) can inform this issue

Threats to External Validity
• Due to Heterogeneity in Units: Response may differ across subgroups
  – Age groups, genders, ethnicities
  – Income levels
  – Health systems (VBID example)
• Due to Treatment Variation
  – Interventionist variation, if 2+ interventionists
  – In multi-component trials, intervention delivered may vary if some pts got components #2-5 but others got components #6-9

Example of External Validity in VBID Results
Figure: chart (y-axis 0% to 6%) by drug class (OHAs, ACEI, Beta block, Statin); series by study: Maciejewski, Choudhry, Chernew, Sedjo, Kelly, Chang, Rodin, Mahoney

External Validity & Treatment Variation in DPP

Characteristic | Original DPP Lifestyle Arm | DEPLOY | East Harlem DPP
Study Design | RCT | RCT | RCT
Setting | AHCs | YMCA | Urban low income
Intervention | Individual lifestyle | 16 group meetings at YMCA | ??
Weight loss compared to UC | 6% greater | 5.5% greater | 1.9% greater
Intervention cost | $2,269 over 3 years | NR | NR
Cost/QALY | $1,100 | NR | NR
Cite | NEJM 2002; Ann Int Med 05 | Am J Prev Med 2008 | Am J Pub Health 2010

External Validity and CER/Translation Research
• Fundamental issue in disseminating RCTs to communities (one we are struggling with)
• If intervention we test in RCT is implemented in community clinic, would results differ?
  – Protocol: We reduce intervention intensity and/or change interventionist
  – Setting: Non-academic
  – Subjects: Lower severity & motivation
  – Funding: None (unlike RCT)

Take-Away Points
• Quasi-experimental study designs are widely used b/c secondary data analysis is common
  – Several threats to internal validity
  – Greatest external validity
• Have more design/statistical issues than RCTs
  – Require more data than RCTs
  – More assumptions about data (not observed)
  – Creativity in control group and statistics selection
• Carefully consider trade-offs in incident & prevalent user designs

References & Resources
1. Gardarsdottir, H., Heerdink, E. R., & Egberts, A. C. (2006). Potential bias in pharmacoepidemiological studies due to the length of the drug free period: a study on antidepressant drug use in adults in the Netherlands. Pharmacoepidemiol Drug Saf, 15(5), 338-343. doi: 10.1002/pds.1223
2. Harris, B. D., Hanson, C., Christy, C., Adams, T., Banks, A., Willis, T. S., & Maciejewski, M. L. (2011). Strict hand hygiene and other practices shortened stays and cut costs and mortality in a pediatric intensive care unit. Health Aff (Millwood), 30(9), 1751-1761. doi: 10.1377/hlthaff.2010.1282
3. Maciejewski, M. L., Bryson, C. L., Perkins, M., Blough, D. K., Cunningham, F. E., Fortney, J. C., . . . Liu, C. F. (2010). Increasing copayments and adherence to diabetes, hypertension, and hyperlipidemic medications.
   Am J Manag Care, 16(1), e20-34.
4. Maciejewski, M. L., Farley, J. F., Parker, J., & Wansink, D. (2010). Copayment reductions generate greater medication adherence in targeted patients. Health Aff (Millwood), 29(11), 2002-2008. doi: 10.1377/hlthaff.2010.0571
5. Maciejewski, M. L., Livingston, E. H., Smith, V. A., Kavee, A. L., Kahwati, L. C., Henderson, W. G., & Arterburn, D. E. (2011). Survival among high-risk patients after bariatric surgery. JAMA, 305(23), 2419-2426. doi: 10.1001/jama.2011.817
6. Ray, W. A. (2003). Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol, 158(9), 915-920.
7. Wang, V., Liu, C. F., Bryson, C. L., Sharp, N. D., & Maciejewski, M. L. (2011). Does medication adherence following a copayment increase differ by disease burden? Health Serv Res, 46(6pt1), 1963-1985. doi: 10.1111/j.1475-6773.2011.01286.x