Caution should be used in applying propensity scores estimated in a full cohort to adjust for confounding in subgroup analyses
Sue M. Marcus, Columbia University
Robert D. Gibbons, University of Chicago
JSM 2012, San Diego

Testimony of Andrew Leon: Medication and Veteran Suicide
• ‘All of us here today share a common goal: to do the very best for our veterans’
• ‘doing the best requires the discipline to use empirical methods to understand optimal mental health care and prevention of suicide.’

Outline: Caution should be used…
• Context: automated propensity score analyses of large observational databases for drug safety surveillance
• When to use caution (Rosenbaum and Rubin 1983; Marcus and Gibbons 2012)
• Illustration: Do antiepileptic drugs cause suicide?

Drug Safety
• Spontaneous reports collected through FDA’s Adverse Event Reporting System
• Analysis of large-scale integrated medical claims data
• Large potential for bias

Propensity scores estimated in full cohort for subgroup?
• If so, one step closer to automated drug safety system for which separate analysis for each subgroup is unnecessary
• A correctly specified propensity score should (at least in expectation) remain valid in a subgroup population (Rosenbaum and Rubin 1983)
• When can this go wrong?

Illustration: Do AEDs cause suicide?
• 1/2008 FDA alert: AEDs can increase suicidal thoughts and behaviors
• 7/2008 FDA scientific advisory committee: association between AEDs and suicidality
• American Epilepsy Society: unintended dire consequences, do not want to discontinue effective seizure medication if it does not cause suicide

Causal question?
• AEDs given for bipolar disorder, major depression, epilepsy, pain disorders, migraines, alcohol craving, others
• Do AEDs cause suicide or do people with higher propensity for suicide tend to have higher propensity to take AEDs?
• Goal: disentangle who takes AEDs from the biological effect of the drugs

Conflicting conclusions following FDA alert for two propensity-score-adjusted analyses
Paper | Gibbons et al 2009 | Patorno et al 2010
Population | Bipolar disorder | BD, epilepsy, migraine, pain
Comparison | AED vs no AED | Each AED vs topiramate
Conclusion | AEDs do not increase SA (suicide attempts) | Some AEDs may have increased risk

AED A (↑BP) vs AED B (↑epilepsy)
• Answers public health question: more suicide among those who take A vs B?
• Does not address whether cause of suicide is biological effect of drug or reflects who is taking drug
• Higher suicide rate for A reflects higher suicide rate for BP compared to epilepsy

Correct specification for full vs subgroup
• Propensity to use drug depends on different characteristics for different disorders (e.g. bipolar disorder vs epilepsy)
• Can we correctly specify propensity for each subgroup using full cohort?
• Propensity to use AED vs topiramate does not balance comparison of AED vs no treatment for particular disorder

Potential Outcomes Framework
• r1 = response if AED, r0 = response if no AED
• Z = 1 for AED, Z = 0 for no AED
• In general, E(r1 − r0) is not equal to E(r1 | Z = 1) − E(r0 | Z = 0)
• E(r1 − r0) may be equal to E(r1 | Z = 1, x) − E(r0 | Z = 0, x)

What is being estimated?
• Gibbons et al: E(r1 | Z = 1, x, BP) − E(r0 | Z = 0, x, BP)
• Patorno et al: E(r1 | Z = particular AED, x, BP or epilepsy or pain) − E(r0 | Z = topiramate, x, BP or epilepsy or pain)
• Patorno et al estimate reflects who takes each AED, rather than biologic effect of each AED

Correctly specified PS?
• Generally more difficult to correctly specify PS for full cohort when many subgroups have different processes related to confounding by indication
• Those with epilepsy have different reason for choosing particular AED compared to those with BP and also have different underlying suicide rates
• Better to analyze each subgroup separately?

Covariance adjustment on PS
• Known to perform poorly when PS is poorly estimated (Rosenbaum and Rubin, 1983; Marcus and Gibbons 2011)
• Can happen when the variance in the PS for the treatment group is smaller than for control (those who receive new treatment more homogeneous)
• Univariate covariance adjustment can greatly increase bias (Rubin, 1973)

Conclusions
• Potential outcome framework can help to clarify whether what is being estimated makes sense
• AED vs no AED for single disorder better than AED 1 vs AED 2 for many disorders
• Goal is to ‘add efficiency to studies with many subgroups’ which could greatly facilitate automatic large-scale drug safety screening
• Is this worth the cost of increased bias: ‘stopping or refusing to start AEDs in epilepsy may result in serious harm, including death’ (Fountoulakis et al 2012)
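
Illustrative sketch (not part of the talk or of either cited analysis)
The concern about specifying one propensity score for a full cohort whose subgroups have different treatment-selection processes can be made concrete with a small simulation. Everything below is a hypothetical assumption chosen for illustration: two equal-sized subgroups (labelled 0 and 1, standing in for two disorders), a single covariate x, selection into treatment that depends on x with opposite signs in the two subgroups, and the helper names fit_logistic and weighted_smd. The full-cohort model omits the subgroup-by-covariate interaction, so it is misspecified; the sketch compares inverse-probability-weighted balance of x inside subgroup 0 under that model and under a model fitted to subgroup 0 alone.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(X, z, n_iter=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (z - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def weighted_smd(x, z, ps):
    """Difference in inverse-probability-weighted means of x (treated minus
    control), divided by the unweighted standard deviation of x."""
    w = np.where(z == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    m1 = np.average(x[z == 1], weights=w[z == 1])
    m0 = np.average(x[z == 0], weights=w[z == 0])
    return (m1 - m0) / x.std()

# Hypothetical data: selection on x runs in opposite directions by subgroup.
rng = np.random.default_rng(0)
n_per_group = 20000
g = np.repeat([0, 1], n_per_group)           # subgroup indicator
x = rng.normal(size=2 * n_per_group)         # single confounder
slope = np.where(g == 0, 1.0, -1.0)          # assumed subgroup-specific slopes
z = rng.binomial(1, sigmoid(slope * x))      # treatment indicator

# Full-cohort PS model: intercept, x, subgroup (no x-by-subgroup interaction).
X_full = np.column_stack([np.ones_like(x), x, g])
ps_full = sigmoid(X_full @ fit_logistic(X_full, z))

# Subgroup-specific PS model, fitted within subgroup 0 only.
sub = g == 0
X_sub = np.column_stack([np.ones(sub.sum()), x[sub]])
ps_sub = sigmoid(X_sub @ fit_logistic(X_sub, z[sub]))

smd_full = weighted_smd(x[sub], z[sub], ps_full[sub])
smd_sub = weighted_smd(x[sub], z[sub], ps_sub)
print(f"Subgroup 0, weighted SMD of x, full-cohort PS: {smd_full:.3f}")
print(f"Subgroup 0, weighted SMD of x, subgroup PS:    {smd_sub:.3f}")
```

Under these assumptions the full-cohort score leaves x clearly imbalanced within subgroup 0, while the subgroup-specific score brings the weighted difference close to zero. Adding the interaction term to the full-cohort model would also restore balance here, which is consistent with the point that a correctly specified score remains valid in a subgroup; the difficulty is knowing which interactions are needed when many subgroups are pooled.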