Primer on Statistics for Interventional Cardiologists
Giuseppe Sangiorgi, MD
Pierfrancesco Agostoni, MD
Giuseppe Biondi-Zoccai, MD

What you will learn
• Introduction
• Basics
• Descriptive statistics
• Probability distributions
• Inferential statistics
• Finding differences in mean between two groups
• Finding differences in mean between more than 2 groups
• Linear regression and correlation for bivariate analysis
• Analysis of categorical data (contingency tables)
• Analysis of time-to-event data (survival analysis)
• Advanced statistics at a glance
• Conclusions and take home messages

What you will learn
• Analysis of categorical data (contingency tables)
  – Estimating a proportion with the binomial test
  – Comparing proportions in two-way contingency tables
  – Relative risk and odds ratio
  – Fisher exact test for small samples
  – McNemar test for proportions using paired samples
  – Comparing proportions in three-way contingency tables with the Cochran-Mantel-Haenszel test

Types of variables
Variables
• CATEGORY
  – nominal: Death: yes/no; TLR: yes/no; Radial/brachial/femoral
  – ordinal (ordered categories, ranks): TIMI flow
• QUANTITY
  – discrete (counting): Stent diameter; Stent length
  – continuous (measuring): BMI; Blood pressure; QCA data (MLD, late loss)

Binomial test
Variable type   Nominal    Ordinal        Continuous
Patient ID      Diabetes   AHA/ACC Type   Lesion Length
1               Y          A              18
2               N          B1             24
3               N          A              17
4               N          C              25
5               Y          B2             23
6               N          A              15
7               N          A              16
8               Y          B2             18
9               N          B1             21
10              Y          B2             19
11              N          B1             14
12              Y          C              22
13              N          C              27
Diabetes (n=13): Yes 5 (38.5%), No 8 (61.5%)

Binomial test
Is the percentage of diabetics in this sample comparable with the known CAD population?
We fix the population rate at 15%

Binomial Test
DIABETES   Category   N    Observed Prop.   Test Prop.   Exact Sig. (1-tailed)
Group 1    yes        5    0.38             0.15         0.034
Group 2    no         8    0.62
Total                 13   1.00

Binomial test
Agostoni et al. AJC 2007

Compare discrete variables
χ2 test or chi-square test
The first basis for the chi-square test is the contingency table
ENDEAVOR II. Circulation 2006

STENT * TVF Crosstabulation (Count)
STENT      TVF no   TVF yes   Total
Driver     502      89        591
Endeavor   545      47        592
Total      1047     136       1183
[Bar chart: Count (0 to 600) by STENT (Driver, Endeavor), bars split by TVF (no, yes)]

Compare discrete variables
           No TVF   TVF
Driver     a        b      r1
Endeavor   c        d      r2
           s1       s2     N

STENT * TVF Crosstabulation (Count, % within STENT)
STENT      TVF no          TVF yes        Total
Driver     502 (84.9%)     89 (15.1%)     591 (100.0%)
Endeavor   545 (92.1%)     47 (7.9%)      592 (100.0%)
Total      1047 (88.5%)    136 (11.5%)    1183 (100.0%)

Compare discrete variables
The second basis is the "observed" vs "expected" relation

STENT * TVF Crosstabulation (Count, Expected Count)
STENT      TVF no            TVF yes        Total
Driver     502 (523.1)       89 (67.9)      591 (591.0)
Endeavor   545 (523.9)       47 (68.1)      592 (592.0)
Total      1047 (1047.0)     136 (136.0)    1183 (1183.0)

Compare discrete variables
χ2 test or chi-square test

Chi-Square Tests
                               Value      df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             14.736 b   1    0.000
Continuity Correction a        14.044     1    0.000
Likelihood Ratio               14.951     1    0.000
Fisher's Exact Test                                                    0.000                  0.000
Linear-by-Linear Association   14.723     1    0.000
N of Valid Cases               1183
a. Computed only for a 2x2 table
b. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 67.94.
(A significance printed as 0.000 means P<0.001)

Compare discrete variables
More than 2x2 contingency tables
Post-hoc comparisons

DIABETES * AHA/ACC type (Count, % within DIABETES)
DIABETES   A           B1          B2          C           Total
no         3 (37.5%)   3 (37.5%)   0 (0.0%)    2 (25.0%)   8 (100.0%)
yes        1 (20.0%)   0 (0.0%)    3 (60.0%)   1 (20.0%)   5 (100.0%)
Total      4 (30.8%)   3 (23.1%)   3 (23.1%)   3 (23.1%)   13 (100.0%)

Is there a difference between diabetics and nondiabetics in the rate of AHA/ACC type lesions?
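The "observed" vs "expected" logic of the chi-square slides can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS procedure used for the slides: the function names are hypothetical, and the P value shortcut via the complementary error function is valid only for 1 degree of freedom (a 2x2 table). The expected count of each cell is assumed to be row total x column total / N, which reproduces the values in the ENDEAVOR II crosstabulation.

```python
import math

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    expected = expected_counts(table)
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# ENDEAVOR II counts from the slides: rows = Driver, Endeavor; columns = TVF no, TVF yes
endeavor = [[502, 89],
            [545, 47]]

expected = expected_counts(endeavor)
chi2, df = pearson_chi_square(endeavor)

# Shortcut that holds for df = 1 only: P = erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print("expected counts:", [[round(e, 1) for e in row] for row in expected])
print("chi-square = %.3f, df = %d, P = %.5f" % (chi2, df, p_value))
```

The same two functions accept larger tables (for example the 2x4 diabetes by AHA/ACC type table), but with expected counts that small the asymptotic chi-square P value should not be trusted.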
Post-hoc groups
The chi-square test was used to determine differences between groups with respect to the primary and secondary end points. Odds ratios and their 95 percent confidence intervals were calculated. Comparisons of patient characteristics and survival outcomes were tested with the chi-square test, the chi-square test for trend, Fisher's exact test, or Student's t-test, as appropriate.
Wenzel et al, NEJM 2004
This is a sub-group! Bonferroni!
• The significance threshold for the P value should be divided by the number of tests performed…
• Or the computed P value should be multiplied by the number of tests…
• P=0.12 and not P=0.04!!

Compare event rates
           No TVF   TVF
Driver     a        b
Endeavor   c        d
Absolute Risk = [ d / ( c + d ) ]
Absolute Risk Reduction = [ d / ( c + d ) ] - [ b / ( a + b ) ]
Relative Risk = [ d / ( c + d ) ] / [ b / ( a + b ) ]
Relative Risk Reduction = 1 - RR
Odds Ratio = ( d / c ) / ( b / a ) = ( a * d ) / ( b * c )

Compare event rates
STENT * TVF Crosstabulation (Count)
STENT      TVF no   TVF yes   Total
Driver     502      89        591
Endeavor   545      47        592
Total      1047     136       1183
• Absolute Risk (AR) 7.9% (47/592) & 15.1% (89/591)
• Absolute Risk Reduction (ARR) 7.9% (47/592) – 15.1% (89/591) = -7.2% (negative: fewer events with Endeavor)
• Relative Risk (RR) 7.9% (47/592) / 15.1% (89/591) = 0.52 or 52% (equivalence value: 1)
• Relative Risk Reduction (RRR) 1 – 0.52 = 0.48 or 48%
• Odds Ratio (OR) 8.6% (47/545) / 17.7% (89/502) = 0.49 or 49% (equivalence value: 1)
• Odds Ratio Reduction (ORR) 1 – 0.49 = 0.51 or 51%
• For small event rates (b and d) OR ~ RR
(These figures are computed from the rounded percentages; the raw counts give ARR = -7.1% and RR = 0.53)

SHOCK, NEJM 1999 (152 pts in the invasive vs 150 in the medical group)
ARc: 56%   ARt: 46.7%   ARR: 9.3%   RR: 0.83   RRR: 17%   OR: 0.69   ROR: 31%

Compare event rates
NNT = 1 / ARR
Testa, Biondi Zoccai et al. EHJ 2005

Compare event rates
• Absolute Risk Reduction (ARR) 7.9% (47/592) – 15.1% (89/591) = -7.2%
• Number Needed to Treat (NNT) 1 / 0.072 = 13.8 ~ 14
• I need to treat 14 patients with Endeavor instead of Driver to avoid 1 TVF
• The larger the ARR, the smaller the NNT
Low NNT => Large benefit
ENDEAVOR II. Circulation 2006

Compare event rates
To compute Confidence Intervals for ARR, RR, OR, NNT, SPSS is not so good…
Confidence Interval Analysis (CIA), downloadable software [with the book "Statistics with Confidence", Editor: DG Altman, BMJ Books London (2000)]
https://www.som.soton.ac.uk/cia/

Compare event rates
"Incidence study" (RCTs) for Relative Risk

Compare event rates
"Unmatched case control study" for Odds Ratio

Compare event rates
http://www.quantitativeskills.com/sisa/statistics/twoby2.htm
Free on the internet, always available!
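As a complement to the tools listed for event rates, the measures on the slides (AR, ARR, RR, RRR, OR, NNT) can also be recomputed directly from the 2x2 counts. The sketch below is a minimal illustration, not the method implemented in CIA or SISA: the function name is hypothetical, and the confidence-interval formulas (Wald interval for the ARR, log-scale intervals for RR and OR) are assumptions chosen because they are common textbook approximations. It works from the raw counts, so results can differ in the second decimal from values derived from rounded percentages.

```python
import math

def risk_measures(events_t, n_t, events_c, n_c, z=1.96):
    """Event-rate measures for treated (t) vs control (c), with approximate 95% CIs."""
    ar_t, ar_c = events_t / n_t, events_c / n_c
    arr = ar_t - ar_c                      # slide convention: treated minus control
    rr = ar_t / ar_c
    odds_ratio = (events_t / (n_t - events_t)) / (events_c / (n_c - events_c))

    # Standard errors (assumed methods: Wald for ARR, log scale for RR and OR)
    se_arr = math.sqrt(ar_t * (1 - ar_t) / n_t + ar_c * (1 - ar_c) / n_c)
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    se_log_or = math.sqrt(1 / events_t + 1 / (n_t - events_t)
                          + 1 / events_c + 1 / (n_c - events_c))

    def log_ci(estimate, se):
        # Interval built on the log scale, then transformed back
        return (estimate * math.exp(-z * se), estimate * math.exp(z * se))

    return {
        "AR_treated": ar_t, "AR_control": ar_c,
        "ARR": arr, "ARR_CI": (arr - z * se_arr, arr + z * se_arr),
        "RR": rr, "RR_CI": log_ci(rr, se_log_rr),
        "RRR": 1 - rr,
        "OR": odds_ratio, "OR_CI": log_ci(odds_ratio, se_log_or),
        "NNT": 1 / abs(arr),               # NNT uses the size of the ARR, not its sign
    }

# ENDEAVOR II: TVF in 47/592 Endeavor patients vs 89/591 Driver patients
endeavor = risk_measures(events_t=47, n_t=592, events_c=89, n_c=591)
for name, value in endeavor.items():
    print(name, value)
```

An RR or OR interval that excludes 1, or an ARR interval that excludes 0, corresponds to a significant difference at the chosen level.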
Exact tests
• Every time we use conventional tests or formulas, we ASSUME that the sample we have is a random sample drawn from a specific distribution (usually normal, chi-square, or binomial…)
• As N increases, an established and specific distribution may be ASYMPTOTICALLY assumed (usually N≥30 is ok)

Exact tests
• Whenever asymptotic assumptions cannot be met (small, non-random, skewed samples, with sparse data, major imbalances or few events), EXACT TESTS should be employed
• Exact tests are computationally burdensome (they involve PERMUTATIONS)*, but they do not rely on asymptotic (large-sample) approximations
• If in a 2x2 table a cell has an expected count <5, the Pearson chi-square test is biased (ie ↑alpha error), and the Fisher exact test is warranted
*6! (6 factorial) is the number of permutations of 6 items, and equals 6x5x4x3x2x1=720

Fisher Exact test
           Exp   Ctrl
Event      a     b      r1
No event   c     d      r2
           s1    s2     N
P = ( s1! * s2! * r1! * r2! ) / ( N! * a! * b! * c! * d! )
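The factorial formula above gives the probability of one specific table with fixed margins; a P value is obtained by summing such probabilities over tables at least as extreme as the observed one. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the 2x2 counts in the example are made up for illustration (they are not from the slides), and the two-sided rule used (sum all tables whose probability does not exceed that of the observed table) is one common convention among several.

```python
from math import factorial

def table_probability(a, b, c, d):
    """Probability of one 2x2 table with fixed margins (formula from the slide)."""
    r1, r2, s1, s2 = a + b, c + d, a + c, b + d
    n = a + b + c + d
    numerator = factorial(s1) * factorial(s2) * factorial(r1) * factorial(r2)
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided P: sum over all tables no more probable than the observed one."""
    r1, s1, n = a + b, a + c, a + b + c + d
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # Enumerate every value of the top-left cell compatible with the margins
    for x in range(max(0, r1 + s1 - n), min(r1, s1) + 1):
        p = table_probability(x, r1 - x, s1 - x, n - r1 - s1 + x)
        if p <= p_observed * (1 + 1e-9):   # small tolerance for float rounding
            p_value += p
    return p_value

# Hypothetical small-sample table: 3 events / 1 no event (Exp), 1 event / 3 no event (Ctrl)
p_single = table_probability(3, 1, 1, 3)
p_two_sided = fisher_exact_two_sided(3, 1, 1, 3)

print("6! =", factorial(6))
print("P(observed table) = %.4f" % p_single)
print("two-sided exact P = %.4f" % p_two_sided)
```

With only 8 patients the enumeration covers 5 possible tables; the number of tables grows with the margins, which is why exact tests become burdensome for large samples.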
McNemar test
• The McNemar test is a nonparametric test applicable to 2x2 contingency tables
• It is used to show differences in dichotomous data (presence/absence; +/-; Y/N) before and after a certain event / therapy / intervention (and thus to evaluate its efficacy), if data are available as frequencies

McNemar test
Migraine and PFO closure
                     Migraine after   No migraine after   TOT
Migraine before      a                b                   a+b
No migraine before   c                d                   c+d
TOT                  a+c              b+d                 n
The test determines whether the row and column marginal frequencies are equal:
a+b = a+c and c+d = b+d, i.e. b = c

3-way contingency tables
DIABETES * AHA/ACC type (Count, % within DIABETES)
DIABETES   A           B1          B2          C           Total
no         3 (37.5%)   3 (37.5%)   0 (0.0%)    2 (25.0%)   8 (100.0%)
yes        1 (20.0%)   0 (0.0%)    3 (60.0%)   1 (20.0%)   5 (100.0%)
Total      4 (30.8%)   3 (23.1%)   3 (23.1%)   3 (23.1%)   13 (100.0%)
This is a 2-way 2x4 contingency table…
And what if we also know who is a smoker?
3-way 2x4x2 contingency table!
That means 2 different 2-way 2x4 contingency tables, one for each stratum

3-way contingency tables
DIABETES * AHA/ACC type * SMOKER Crosstabulation (Count)
SMOKER   DIABETES   A   B1   B2   C   Total
no       no         1   1    0    1   3
         yes        0   0    2    1   3
         Total      1   1    2    2   6
yes      no         2   2    0    1   5
         yes        1   0    1    0   2
         Total      3   2    1    1   7

The Cochran-Mantel-Haenszel chi-square tests the null hypothesis that two nominal variables are conditionally independent in each stratum, assuming that there is no three-way interaction.
It works in a 3-way (3-dimensional) contingency table, where the last dimension refers to the strata

3-way contingency tables
SAINT I, NEJM 2006

Thank you for your attention
For any correspondence: gbiondizoccai@gmail.com
For further slides on these topics feel free to visit the metcardio.org website: http://www.metcardio.org/slides.html