Interpretation of Medical Literature Statistics

Karen Pieper, MS
Duke Clinical Research Institute

www.mc.vanderbilt.edu/prevmed/ps
Website for the Power and Sample Size program used in the second class

To be an informed consumer of published statistical analyses, one should know:
■ What needs to appear in a manuscript
■ What to keep in mind when looking at data
■ The dangers of subgroup analyses

Death or MI at 30 Days

Subgroup            OR (95% CI)
Overall             0.89 (0.79, 0.99)
Male                0.79 (0.69, 0.91)
Female              1.10 (0.91, 1.34)
Under 65            0.78 (0.66, 0.93)
65 or Older         0.98 (0.84, 1.13)
No DM               0.87 (0.76, 1.00)
DM                  0.96 (0.77, 1.19)
Enrolling MI        0.93 (0.80, 1.08)
No Enrolling MI     0.85 (0.72, 1.00)
No Prior CABG       0.89 (0.79, 1.00)
Prior CABG          0.90 (0.66, 1.24)

[Forest plot; odds ratio axis marked at 0.5, 1, and 2.]

Female and Male: OR and 95% CI

[Figure: two forest plots, one for females and one for males, showing the OR and 95% CI for PRISM, PRISM+, PARAGON-A, PURSUIT, PARAGON-B, GUSTO-IV, and all trials combined. Axis marked at 0.5, 1, and 2; left of 1 = GP IIb/IIIa better, right of 1 = placebo/control better.]

Sex Issue

OR (95% CI)
                    Males               Females             P-value for Heterogeneity
All patients
  Death             0.83 (0.71-0.96)    1.08 (0.89-1.33)    0.030
  Death or MI       0.81 (0.75-0.89)    1.15 (1.01-1.30)    < 0.0001
Patients with missing data on baseline cardiac troponin
  Death             0.81 (0.67-0.98)    1.20 (0.93-1.55)    0.011
  Death or MI       0.78 (0.70-0.86)    1.18 (1.02-1.36)    < 0.0001
Patients with data on baseline cardiac troponin
  Death             0.85 (0.65-1.11)    0.91 (0.66-1.27)    0.83
  Death or MI       0.93 (0.78-1.11)    1.07 (0.85-1.35)    0.38
Patients with baseline cardiac troponin T or I < 0.1 μg/L
  Death             1.07 (0.67-1.71)    1.20 (0.69-2.10)    0.84
  Death or MI       1.10 (0.84-1.43)    1.29 (0.91-1.83)    0.65
Patients with baseline cardiac troponin T or I ≥ 0.1 μg/L
  Death             0.75 (0.54-1.04)    0.80 (0.53-1.21)    0.88
  Death or MI       0.82 (0.65-1.03)    0.93 (0.68-1.28)    0.48

Do Tirofiban And ReoPro Give Similar Efficacy Outcomes Trial

Primary Hypothesis

Tirofiban will have comparable efficacy to abciximab in reducing the incidence of adverse cardiac ischemic events during the first 30 days after intracoronary stent placement.
N Engl J Med 2001;344:1888-94

Statistical Considerations

Sample size provides 88% power to declare tirofiban noninferior to abciximab, based on the relative efficacy of abciximab to placebo in EPISTENT.*

* The upper bound of the one-sided 95% C.I. for the odds ratio (tirofiban relative to abciximab) must be below 1.47.

Primary Endpoint

30-day composite of:
■ Death
■ Myocardial infarction
    - CK-MB > 3x ULN in two samples
    - New Q waves
■ Urgent TVR
    - PCI or CABG

Primary Endpoint: 30-day Death, MI, Urgent TVR (%)

Tirofiban    7.6%
Abciximab    6.0%
R.R. = 1.26, p = 0.038
Upper bound of 95% C.I. = 1.51

[Figure: bar chart of the two event rates beside a relative-risk axis (tirofiban better / abciximab better) with the noninferiority boundary marked.]

Because the upper bound of 1.51 lies above the 1.47 boundary, noninferiority was not shown.

Primary Endpoint Analysis

Event Rate      Tirofiban    Abciximab    p
Composite       7.6%         6.0%         0.038
Death           0.5%         0.4%         0.66
MI              6.9%         5.4%         0.04
Death/MI        7.2%         5.7%         0.04
Urgent TVR      0.8%         0.7%         0.49

Subgroup Analysis

                Tirofiban %    Abciximab %    RR      CI
Diabetes        6.3            5.4            1.16    0.72, 1.90
No Diabetes     7.9            6.2            1.29    1.01, 1.64
Age < 65        6.6            4.6            1.45    1.05, 2.01
Age ≥ 65        8.8            7.8            1.13    0.82, 1.50
Male            7.2            6.5            1.10    0.86, 1.43
Female          8.7            4.7            1.86    1.19, 2.89

[Forest plot axis: Tirofiban Better | 1 | Abciximab Better]

Subgroup Analysis

                             Tirofiban %    Abciximab %    RR      CI
Pre-procedure Clopidogrel
  Yes                        7.2            5.8            1.24    1.00, 1.58
  No                         12.5           8.3            1.50    0.73, 2.68
ACS                          9.3            6.3            1.49    1.15, 1.94
Non-ACS                      4.5            5.6            0.82    0.54, 1.24
U.S.                         7.7            6.7            1.14    0.91, 1.45
Ex-U.S.                      6.9            2.9            2.42    1.27, 4.64

[Forest plot axis: Tirofiban Better | 1 | Abciximab Better]

30-Day Conclusions

■ Abciximab was superior to tirofiban in reducing the incidence of the composite endpoint of death/MI/urgent target vessel revascularization at 30 days after intracoronary stent placement.
■ There were no differences in rates of TIMI major bleeding, but significant differences in minor bleeding and thrombocytopenia were observed and favored tirofiban.

Composite Endpoint (Death/MI/TVR)

            Tirofiban (N = 2398)    Abciximab (N = 2411)    p        95% CI
30-Day      7.6%                    6.0%                    0.038    1.05, 1.52
6-Month     14.4%                   13.8%                   0.509    0.90, 1.22

Death/MI

            Tirofiban (N = 2398)    Abciximab (N = 2411)    p        95% CI
30-Day      7.2%                    5.7%                    0.04     1.01, 1.58
6-Month     8.7%                    7.4%                    NS       0.96, 1.44

Target Vessel Revascularization

                        Tirofiban (N = 2398)    Abciximab (N = 2411)    p     95% CI
30-Day (Urgent TVR)     0.8%                    0.7%                    NS    0.65, 2.44
6-Month                 7.5%                    8.0%                    NS    0.76, 1.14

Diabetics: Composite Endpoint (Death/MI/TVR)

Tirofiban (N = 2398), Abciximab (N = 2411)
30-Day:  6.3% vs. 5.4%, p = NS, 95% CI: 0.72, 1.90
6-Month: 16.7% and 15.2% (bar labels in the order extracted; arm not recoverable), p = NS, 95% CI: 0.67, 1.21

Subgroups: EPISTENT Diabetes Paper

6-month Event Rates for Study Group

                      Stent/Placebo    Stent/Abciximab    p
Death, MI, TVR
  Diabetics           43 (25.2)        21 (13.0)          0.005
  Nondiabetics        104 (16.5)       81 (13.0)          0.062
Death or MI
  Diabetics           22 (12.7)        10 (6.2)           0.041
  Nondiabetics        70 (11.0)        34 (5.4)           < 0.001
MI
  Diabetics           19 (11.0)        10 (6.2)           0.11
  Nondiabetics        64 (10.1)        31 (4.9)           < 0.001
Death
  Diabetics           3 (1.7)          1 (0.6)            0.35
  Nondiabetics        7 (1.1)          3 (0.5)            0.21
  Diabetics (1 yr)    8 (4.8)          2 (0.9)            0.08

Post-randomization Subgroups

Issues Specific to the “Post-randomization” Component

Example:
■ A clinical trial evaluated the treatment effect of a new drug (A) versus placebo (P) in ACS patients. The primary endpoint of the trial was 30-day death or MI. Of special interest was the effectiveness of the new drug in patients who had received a PCI versus those who had not.
Sample Patient 1
Randomization -> PCI -> Death or MI -> 30-day Assessment

Sample Patient 2
Randomization -> Death or MI -> 30-day Assessment

Improper Subgroups

Incidence of 1° Endpoint

                      Eptifibatide    Placebo       P        Odds Ratio (95% CI)
PCI < 72 hours        (N = 606)       (N = 622)
  96 hours            57 (9.4)        95 (15.3)     0.002    0.576 (0.406, 0.817)
  7 days              62 (10.2)       100 (16.1)    0.003    0.595 (0.424, 0.835)
  30 days             70 (11.6)       104 (16.7)    0.010    0.650 (0.469, 0.901)
No PCI < 72 hours     (N = 4116)      (N = 4117)
  96 hours            302 (7.3)       334 (8.1)     0.188    0.897 (0.763, 1.055)
  7 days              415 (10.1)      452 (11.0)    0.185    0.909 (0.790, 1.047)
  30 days             602 (14.6)      641 (15.6)    0.232    0.929 (0.823, 1.048)

PCI = percutaneous coronary intervention; CI = confidence interval

Sample Patient 3
Randomization -> MI -> PCI -> 30-day Assessment

Time Interval            Eptifibatide (N = 606)    Placebo (N = 622)    Absolute Reduction    P-value
Before PTCA Death/MI     1.7%                      5.5%                 3.8%                  < 0.001
96 hours Death/MI*       8.1%                      10.9%                2.9%                  0.090
7 days Death/MI*         8.9%                      11.7%                2.8%                  0.105
30 days Death/MI*        10.2%                     12.4%                2.2%                  0.235

*Composite only includes myocardial infarctions (MI) occurring after the percutaneous

SUPPORT

Study of effectiveness of Right Heart Catheterization (RHC) in the initial care of critically ill patients.

• Issues:
    • Timing of RHC relative to hospital arrival
    • Timing of RHC relative to death
    • Need to survive long enough to get a RHC: patients who die early will be in the no-RHC group.
    • Bias in who is chosen for a RHC

JAMA 1996; 276:889-897

Unadjusted Survival Rates from SUPPORT study

            No RHC (N = 3551)    RHC (N = 2184)    p
30-Day      69.4%                62.0%             < 0.0001
6-Month     53.7%                46.3%             < 0.0001

SUPPORT study: Adjustments

• Only included RHC that occurred within the first 24 hrs. This cut down on the problem of survivor bias.
• Calculated the “propensity” to receive a RHC and adjusted for both baseline differences and this propensity score.
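The survivor-bias issue above can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not the SUPPORT analysis: the function name `thirty_day_mortality`, the 60-day mean survival, the 40% share of patients slated for the procedure, and the 10-day scheduling range are all hypothetical. The procedure is given no effect on survival, so any difference between the groups comes purely from who lives long enough to receive it.

```python
import random

def thirty_day_mortality(n=200_000, window_days=None, seed=0):
    """Simulate a procedure with NO effect on survival.

    Returns (mortality_with_procedure, mortality_without_procedure) at 30 days.
    All numeric settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_patients = {True: 0, False: 0}
    n_deaths = {True: 0, False: 0}
    for _ in range(n):
        death_day = rng.expovariate(1 / 60)  # same survival distribution for everyone
        candidate = rng.random() < 0.4       # 40% are slated for the procedure
        proc_day = rng.uniform(0, 10)        # planned day of the procedure
        # A patient only receives the procedure if still alive on that day.
        treated = candidate and proc_day < death_day
        # Optional restriction: count only procedures done within the window.
        if window_days is not None and proc_day > window_days:
            treated = False
        n_patients[treated] += 1
        n_deaths[treated] += death_day <= 30
    return (n_deaths[True] / n_patients[True],
            n_deaths[False] / n_patients[False])

naive = thirty_day_mortality()
early = thirty_day_mortality(window_days=1)
print(f"All procedures:    with {naive[0]:.3f}  without {naive[1]:.3f}")
print(f"First 24 hrs only: with {early[0]:.3f}  without {early[1]:.3f}")
```

In this sketch the unrestricted comparison shows clearly lower mortality in the procedure group even though the procedure does nothing, because patients who die early fall into the no-procedure group. Counting only procedures done in the first 24 hours removes most of that gap, which is the rationale given for the first adjustment above.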
SUPPORT study: Adjusted results

Hazard Ratio: 1.21 (1.09, 1.25), p < 0.001

EMBARGOED FOR RELEASE Tuesday, November 9, 2004
NHLBI Communications Office

No Increase in Deaths or Hospitalizations for Heart Failure Patients Who Have a Pulmonary Artery Catheter

Safeguards

■ Remember that the primary hypothesis is the only one the study was designed to answer.
■ Nonsignificant results may indicate that there were too few patients studied to detect a small meaningful difference.
■ Subgroup results should be confirmed in subsequent studies before acceptance.

Baseline Table

■ The baseline table is needed to show where the differences in baseline characteristics exist, especially when the sample size is small or the groups being compared are not randomized.
■ If baseline characteristics are not equally distributed, be sure that there is at least one analysis of the endpoint(s) adjusted for the other factors associated with the endpoint(s).

Example: ESPRIT Diabetes

Baseline Characteristics for Diabetics and Nondiabetics

Variable                Diabetics (%)         Nondiabetics (%)      P-value
                        N = 466               N = 1595
Age, yrs                62.0 (55.0, 70.0)     62.0 (54.0, 71.0)     0.668
Weight, kg              89.0 (77.0, 102.0)    83.5 (73.2, 94.5)     < 0.001
Women                   175 (37.6)            386 (24.2)            0.001
Previous MI             142 (30.5)            509 (31.9)            0.556
Previous PCI            125 (26.8)            357 (22.4)            0.046
Previous CABG           55 (11.8)             156 (9.8)             0.205
Hypertension            334 (71.7)            877 (55.0)            0.001
Hypercholesterolemia    294 (63.1)            905 (56.8)            0.015
PVD                     51 (10.9)             86 (5.4)              0.001
Previous stroke         29 (6.2)              60 (3.8)              0.021
Current smoker          89 (19.3)             389 (24.6)            0.012

Be Careful of Axes Sizes!

Compare this…

[Figure: Event Rates for Death, diabetic vs. non-diabetic. Y-axis: event rate, 0 to 0.03. X-axis: days, 0 to 300. p = 0.167]

To this… (It’s the same data!)
[Figure: Event Rates for Death, diabetic vs. non-diabetic. Y-axis: event rate, 0 to 1. X-axis: days, 0 to 300. p = 0.167]

Other Precautions

■ Check the Ns: do things add up? You’d be surprised how many times they don’t!
■ Watch for missing data. Does it appear that the sample size has dropped for some variables? What has been done about missing data? How will that influence the results?
■ Can you account for every patient in how the sample for this study was drawn from the original study population?
■ Are confidence intervals or error bars included for estimates?

References:

■ Bailar JC III, Mosteller F. Guidelines for statistical reporting in articles for medical journals. Ann Intern Med, 108:266-273, 1988.
■ DerSimonian R, Charette LJ, McPeek B, Mosteller F. Reporting on methods in clinical trials. In Medical Uses of Statistics, 2nd ed, Bailar JC III et al (eds), Boston, NEJM, 1992, pp 333-348.
■ Gardner MJ, Machin D, Campbell MJ. Use of check lists in assessing the statistical content of medical studies. In Statistics with Confidence, Gardner MJ, et al (eds), London, British Medical Journal, 1989, pp 101-108.