# c! - metcardio.org ```Primer on Statistics
for Interventional
Cardiologists
Giuseppe Sangiorgi, MD
Pierfrancesco Agostoni, MD
Giuseppe Biondi-Zoccai, MD
What you will learn
• Introduction
• Basics
• Descriptive statistics
• Probability distributions
• Inferential statistics
• Finding differences in mean between two groups
• Finding differences in mean between more than 2 groups
• Linear regression and correlation for bivariate analysis
• Analysis of categorical data (contingency tables)
• Analysis of time-to-event data (survival analysis)
• Conclusions and take home messages
What you will learn
• Analysis of categorical data (contingency tables)
– Estimating a proportion with the binomial test
– Comparing proportions in two-way contingency tables
– Relative risk and odds ratio
– Fisher exact test for small samples
– McNemar test for proportions using paired samples
– Comparing proportions in three-way contingency tables with the Cochran-Mantel-Haenszel test
Types of variables
Variables
• CATEGORY
– nominal: Death: yes/no; TLR: yes/no
– ordinal (ordered categories, ranks): TIMI flow
• QUANTITY
– discrete (counting): Stent diameter, Stent length
– continuous (measuring): BMI, Blood pressure, QCA data (MLD, late loss)
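Binomial test

| Patient ID | Diabetes | AHA/ACC Type | Lesion Length |
| --- | --- | --- | --- |
| (variable type) | Nominal | Ordinal | Continuous |
| 1 | Y | A | 18 |
| 2 | N | B1 | 24 |
| 3 | N | A | 17 |
| 4 | N | C | 25 |
| 5 | Y | B2 | 23 |
| 6 | N | A | 15 |
| 7 | N | A | 16 |
| 8 | Y | B2 | 18 |
| 9 | N | B1 | 21 |
| 10 | Y | B2 | 19 |
| 11 | N | B1 | 14 |
| 12 | Y | C | 22 |
| 13 | N | C | 27 |

Diabetes (n=13): Yes 5 (38.5%), No 8 (61.5%)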
Binomial test
Is the percentage of diabetics in this sample comparable with the known CAD population?
We fix the population rate at 15%

Binomial Test (SPSS output)

| DIABETES | Category | N | Observed Prop. | Test Prop. | Exact Sig. (1-tailed) |
| --- | --- | --- | --- | --- | --- |
| Group 1 | yes | 5 | 0.38 | 0.15 | 0.034 |
| Group 2 | no | 8 | 0.62 | | |
| Total | | 13 | 1.00 | | |
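The one-tailed exact significance in the output above can be reproduced by summing binomial probabilities directly. The following is a minimal sketch using only the Python standard library; the function name `binomial_upper_tail` is an illustrative choice, not something taken from SPSS.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """Exact one-tailed P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# 5 diabetics among 13 patients, reference population rate fixed at 15%
observed_proportion = 5 / 13
p_one_tailed = binomial_upper_tail(5, 13, 0.15)

print(f"observed proportion = {observed_proportion:.2f}")
print(f"exact one-tailed p  = {p_one_tailed:.3f}")
```

The upper tail is used because the observed proportion (38%) lies above the test proportion (15%).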
Binomial test
Agostoni et al. AJC 2007
Compare discrete variables
χ2 test or chi-square test
The first basis for the chi-square test is the contingency table
ENDEAVOR II. Circulation 2006

STENT * TVF Crosstabulation (Count)

| STENT | TVF: no | TVF: yes | Total |
| --- | --- | --- | --- |
| Driver | 502 | 89 | 591 |
| Endeavor | 545 | 47 | 592 |
| Total | 1047 | 136 | 1183 |

[Figure: bar chart of Count (0 to 600) by STENT (Driver, Endeavor), with separate bars for TVF no and yes]
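Compare discrete variables

| | No TVF | TVF | Total |
| --- | --- | --- | --- |
| Driver | a | b | r1 |
| Endeavor | c | d | r2 |
| Total | s1 | s2 | N |

STENT * TVF Crosstabulation

| STENT | | TVF: no | TVF: yes | Total |
| --- | --- | --- | --- | --- |
| Driver | Count | 502 | 89 | 591 |
| | % within STENT | 84.9% | 15.1% | 100.0% |
| Endeavor | Count | 545 | 47 | 592 |
| | % within STENT | 92.1% | 7.9% | 100.0% |
| Total | Count | 1047 | 136 | 1183 |
| | % within STENT | 88.5% | 11.5% | 100.0% |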
Compare discrete variables
The second basis is the “observed”-“expected” relation

STENT * TVF Crosstabulation

| STENT | | TVF: no | TVF: yes | Total |
| --- | --- | --- | --- | --- |
| Driver | Count | 502 | 89 | 591 |
| | Expected Count | 523.1 | 67.9 | 591.0 |
| Endeavor | Count | 545 | 47 | 592 |
| | Expected Count | 523.9 | 68.1 | 592.0 |
| Total | Count | 1047 | 136 | 1183 |
| | Expected Count | 1047.0 | 136.0 | 1183.0 |
Compare discrete variables
χ2 test or chi-square test

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
| --- | --- | --- | --- | --- | --- |
| Pearson Chi-Square | 14.736 (b) | 1 | .000 | | |
| Continuity Correction (a) | 14.044 | 1 | .000 | | |
| Likelihood Ratio | 14.951 | 1 | .000 | | |
| Fisher's Exact Test | | | | .000 | .000 |
| Linear-by-Linear Association | 14.723 | 1 | .000 | | |
| N of Valid Cases | 1183 | | | | |

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 67.94.
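The Pearson chi-square value and the expected counts can be checked by hand from the four observed cells. The sketch below is one way to do it with the Python standard library; for 1 degree of freedom the p-value follows from the normal distribution through `math.erfc`. The function name and return layout are illustrative assumptions.

```python
from math import erfc, sqrt


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]; returns (chi2, p_value, expected_counts)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    # Expected count of each cell = row total * column total / N
    expected = [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With df = 1, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, expected


# Driver: 502 without TVF, 89 with TVF; Endeavor: 545 without, 47 with
chi2, p_value, expected = pearson_chi_square_2x2(502, 89, 545, 47)
print(f"chi-square = {chi2:.3f}, p = {p_value:.5f}")
print("expected counts:", [[round(x, 1) for x in row] for row in expected])
```

The smallest expected count is the quantity to inspect before trusting the asymptotic p-value.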
Compare discrete variables
More than 2x2 contingency tables
Post-hoc comparisons

| DIABETES | | AHA/ACC type A | B1 | B2 | C | Total |
| --- | --- | --- | --- | --- | --- | --- |
| no | Count | 3 | 3 | 0 | 2 | 8 |
| | % within DIABETES | 37.5% | 37.5% | 0.0% | 25.0% | 100.0% |
| yes | Count | 1 | 0 | 3 | 1 | 5 |
| | % within DIABETES | 20.0% | 0.0% | 60.0% | 20.0% | 100.0% |
| Total | Count | 4 | 3 | 3 | 3 | 13 |
| | % within DIABETES | 30.8% | 23.1% | 23.1% | 23.1% | 100.0% |

Is there a difference between diabetics and non-diabetics in the rate of AHA/ACC type lesions?
Post-hoc groups
The chi-square test was used to determine differences between groups with respect to the primary and secondary end points. Odds ratios and their 95 percent confidence intervals were calculated. Comparisons of patient characteristics and survival outcomes were tested with the chi-square test, the chi-square test for trend, Fisher's exact test, or Student's t-test, as appropriate.
This is a sub-group!
Bonferroni!
The significance threshold for the p-value should be divided by the number of tests performed…
Or the computed p-value should be multiplied by the number of tests… P=0.12 and not P=0.04!!
Wenzel et al, NEJM 2004
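The Bonferroni arithmetic is short enough to sketch. The number of tests is not stated on the slide; 3 is assumed here only because it is the factor that turns P=0.04 into P=0.12.

```python
def bonferroni(p_value: float, n_tests: int, alpha: float = 0.05):
    """Return (adjusted p-value, adjusted significance threshold)."""
    adjusted_p = min(1.0, p_value * n_tests)  # multiply the p-value, capped at 1
    adjusted_alpha = alpha / n_tests  # or, equivalently, divide the threshold
    return adjusted_p, adjusted_alpha


# Assumption: 3 tests were performed (inferred from 0.12 / 0.04)
adjusted_p, adjusted_alpha = bonferroni(0.04, 3)
print(f"adjusted p = {adjusted_p:.2f}, adjusted threshold = {adjusted_alpha:.4f}")
```

Both routes lead to the same decision: 0.12 is above 0.05, and 0.04 is above 0.05/3.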
Compare event rates

| | No TVF | TVF |
| --- | --- | --- |
| Driver | a | b |
| Endeavor | c | d |

Absolute Risk = [ d / ( c + d ) ]
Absolute Risk Reduction = [ d / ( c + d ) ] - [ b / ( a + b ) ]
Relative Risk = [ d / ( c + d ) ] / [ b / ( a + b ) ]
Relative Risk Reduction = 1 - RR
Odds Ratio = (d/c)/(b/a) = ( a * d ) / ( b * c )
Compare event rates

STENT * TVF Crosstabulation (Count)

| STENT | TVF: no | TVF: yes | Total |
| --- | --- | --- | --- |
| Driver | 502 | 89 | 591 |
| Endeavor | 545 | 47 | 592 |
| Total | 1047 | 136 | 1183 |

• Absolute Risk (AR)
7.9% (47/592) and 15.1% (89/591)
• Absolute Risk Reduction (ARR)
7.9% (47/592) – 15.1% (89/591) = -7.2%
• Relative Risk (RR)
7.9% (47/592) / 15.1% (89/591) = 0.52
(given an equivalence value of 1)
• Relative Risk Reduction (RRR)
1 – 0.52 = 0.48 or 48%
• Odds Ratio (OR)
8.6% (47/545) / 17.7% (89/502) = 0.49
(given an equivalence value of 1)
• Odds Ratio Reduction (ORR)
1 – 0.49 = 0.51 or 51%
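These measures can be computed from the four cell counts in a few lines. The sketch below uses the cell naming of the slides (a, b for Driver without and with TVF; c, d for Endeavor) and takes the Driver risk as b / (a + b). The function name and the dictionary keys are illustrative. Working from unrounded counts gives an RR of about 0.527, slightly different from the 0.52 obtained above from rounded percentages.

```python
def event_rate_measures(a, b, c, d):
    """a, b: control arm without / with event; c, d: treated arm without / with event."""
    risk_control = b / (a + b)
    risk_treated = d / (c + d)
    relative_risk = risk_treated / risk_control
    odds_ratio = (d / c) / (b / a)  # same as (a * d) / (b * c)
    return {
        "AR control": risk_control,
        "AR treated": risk_treated,
        "ARR": risk_treated - risk_control,  # negative when treatment lowers risk
        "RR": relative_risk,
        "RRR": 1 - relative_risk,
        "OR": odds_ratio,
        "ORR": 1 - odds_ratio,
    }


# Driver: 502 without TVF, 89 with TVF; Endeavor: 545 without, 47 with
measures = event_rate_measures(502, 89, 545, 47)
for name, value in measures.items():
    print(f"{name:>10}: {value:.3f}")
```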
Compare event rates

| | No TVF | TVF |
| --- | --- | --- |
| Driver | a | b |
| Endeavor | c | d |

RR = [ d / ( c + d ) ] / [ b / ( a + b ) ]
OR = (d/c)/(b/a) = ( a * d ) / ( b * c )
• Relative Risk (RR)
7.9% (47/592) / 15.1% (89/591) = 0.52 or 52%
(given an equivalence value of 1)
• Odds Ratio (OR)
8.6% (47/545) / 17.7% (89/502) = 0.49 or 49%
(given an equivalence value of 1)
• For small event rates (b and d), OR ~ RR
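To see why OR approaches RR only when events are rare, the relative risk can be held fixed while the control event rate grows. The sketch below does this with a hypothetical RR of 0.5 and hypothetical control risks; none of these numbers come from a trial.

```python
def odds_ratio_from_risks(risk_control: float, risk_treated: float) -> float:
    """Odds ratio implied by two event risks (probabilities strictly between 0 and 1)."""
    odds_control = risk_control / (1 - risk_control)
    odds_treated = risk_treated / (1 - risk_treated)
    return odds_treated / odds_control


RELATIVE_RISK = 0.5  # hypothetical, kept constant across scenarios

for risk_control in (0.01, 0.05, 0.15, 0.40, 0.60):
    risk_treated = RELATIVE_RISK * risk_control
    odds_ratio = odds_ratio_from_risks(risk_control, risk_treated)
    print(f"control risk {risk_control:.2f}: RR = {RELATIVE_RISK:.2f}, OR = {odds_ratio:.3f}")
```

As the control risk rises, the OR drifts further from the RR, so an OR should not be read as a risk ratio when events are common.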
*152 pts in the invasive vs 150 in the medical group
ARc: 56%
ARt: 46.7%
ARR: 9.3%
RR: 0.83
RRR: 17%
OR: 0.69
ROR: 31%
SHOCK, NEJM 1999
Compare event rates
NNT=1/ARR
Testa, Biondi Zoccai et al. EHJ 2005
Compare event rates
• Absolute Risk Reduction (ARR)
7.9% (47/592) – 15.1% (89/591) = -7.2%
• Number Needed to Treat (NNT)
1 / 0.072 = 13.8 ~ 14
• I need to treat 14 patients with Endeavor instead of Driver to avoid 1 TVF
• The larger the ARR, the smaller the NNT
Low NNT => Large benefit
ENDEAVOR II. Circulation 2006
Compare event rates
To compute confidence intervals for ARR, RR, OR and NNT, SPSS is not so good…
[with the book “Statistics with Confidence”, Editor: DG Altman, BMJ Books London (2000)]
https://www.som.soton.ac.uk/cia/
“Incidence study” (RCTs) for Relative Risk
“Unmatched case control study” for Odds Ratio
http://www.quantitativeskills.com/sisa/statistics/twoby2.htm
Free on the internet, always available!
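For readers who want to see what such tools compute, here is a minimal sketch of 95% confidence intervals for ARR, RR, OR and NNT on the ENDEAVOR II counts. It assumes simple Wald (normal approximation) intervals, on the log scale for RR and OR; the tools cited above may use other methods, so results can differ slightly. The NNT interval is obtained by inverting the ARR limits, which is only meaningful when the ARR interval excludes zero.

```python
from math import exp, log, sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_intervals(a, b, c, d, z=Z95):
    """a, b: control arm without / with event; c, d: treated arm without / with event.
    Returns {measure: (estimate, lower, upper)} using Wald approximations."""
    n_control, n_treated = a + b, c + d
    p_control, p_treated = b / n_control, d / n_treated

    arr = p_treated - p_control
    se_arr = sqrt(p_control * (1 - p_control) / n_control
                  + p_treated * (1 - p_treated) / n_treated)

    rr = p_treated / p_control
    se_log_rr = sqrt(1 / d - 1 / n_treated + 1 / b - 1 / n_control)

    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    arr_low, arr_high = arr - z * se_arr, arr + z * se_arr
    result = {
        "ARR": (arr, arr_low, arr_high),
        "RR": (rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)),
        "OR": (odds_ratio, exp(log(odds_ratio) - z * se_log_or),
               exp(log(odds_ratio) + z * se_log_or)),
    }
    if arr_low * arr_high > 0:  # ARR interval excludes zero
        limits = sorted((1 / abs(arr_low), 1 / abs(arr_high)))
        result["NNT"] = (1 / abs(arr), limits[0], limits[1])
    return result


for name, (estimate, low, high) in wald_intervals(502, 89, 545, 47).items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```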
Exact tests
• Every time we use conventional tests or formulas, we ASSUME that the sample we have is a random sample drawn from a specific distribution (usually normal, chi-square, or binomial…)
• It is well known that as N increases, an established and specific distribution may be ASYMPTOTICALLY assumed (usually N≥30 is ok)
• Whenever asymptotic assumptions cannot be met (small, non-random, skewed samples, with sparse data, major imbalances or few events), EXACT TESTS should be employed
• Exact tests are computationally burdensome (they involve PERMUTATIONS)*, but they do not rely on asymptotic assumptions
• If in a 2x2 table a cell has an expected count less than 5, the Pearson chi-square test is biased (ie ↑alpha error), and the Fisher exact test is warranted
*6! (6 factorial) is the number of permutations of 6 items, and equals 6x5x4x3x2x1=720
Fisher Exact test

| | Exp | Ctrl | Total |
| --- | --- | --- | --- |
| Event | a | b | r1 |
| No event | c | d | r2 |
| Total | s1 | s2 | N |

$$P = \frac{s_1!\, s_2!\, r_1!\, r_2!}{N!\, a!\, b!\, c!\, d!}$$
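The factorial formula gives the probability of one specific table with fixed margins; a p-value needs the sum over all tables at least as unlikely as the observed one. The sketch below does this with binomial coefficients, which are algebraically the same as the factorial formula. The counts (3, 1, 1, 3) are hypothetical, and the two-sided rule used here (sum the probabilities not exceeding that of the observed table) is one common convention, assumed for illustration.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] given its margins.
    Equals s1! s2! r1! r2! / (N! a! b! c! d!)."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Return (probability of the observed table, two-sided exact p-value)."""
    r1, r2, s1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # Enumerate every table compatible with the fixed margins
    for x in range(max(0, s1 - r2), min(r1, s1) + 1):
        p = table_probability(x, r1 - x, s1 - x, r2 - (s1 - x))
        if p <= p_observed * (1 + 1e-9):  # tolerance for floating-point ties
            p_value += p
    return p_observed, p_value


# Hypothetical small sample: 3 events / 1 event in the first row, 1 / 3 in the second
p_observed, p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P(observed table) = {p_observed:.4f}")
print(f"two-sided exact p = {p_value:.4f}")
```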
McNemar test
• The McNemar test is a non-parametric test applicable to 2x2 contingency tables
• It is used to show differences in dichotomous data (presence/absence; +/-; Y/N) before and after a certain event / therapy / intervention (thus to evaluate their efficacy), if data are available as frequencies
McNemar test
Migraine and PFO closure

| | Migraine after | No migraine after | TOT |
| --- | --- | --- | --- |
| Migraine before | a | b | a+b |
| No migraine before | c | d | c+d |
| TOT | a+c | b+d | n |

Null hypothesis: a+b = a+c and c+d = b+d, that is b = c
The test determines whether the row and column marginal frequencies are equal
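Because the null hypothesis reduces to b = c, only the two discordant cells enter the calculation. The sketch below computes the uncorrected McNemar chi-square, (b - c)² / (b + c) with 1 degree of freedom, and an exact binomial alternative for small numbers of discordant pairs. The counts b = 15 and c = 5 are hypothetical and are not data from any PFO closure study.

```python
from math import comb, erfc, sqrt


def mcnemar(b: int, c: int):
    """McNemar test on the discordant pairs b and c.
    Returns (chi2, asymptotic p-value, exact two-sided binomial p-value)."""
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs: the test is not defined")
    chi2 = (b - c) ** 2 / n  # uncorrected statistic, 1 degree of freedom
    p_asymptotic = erfc(sqrt(chi2 / 2))
    # Exact version: under H0 each discordant pair falls in b or c with probability 0.5
    k = min(b, c)
    p_exact = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)
    return chi2, p_asymptotic, p_exact


# Hypothetical: 15 patients with migraine before but not after, 5 the other way round
chi2, p_asymptotic, p_exact = mcnemar(15, 5)
print(f"chi-square = {chi2:.2f}, asymptotic p = {p_asymptotic:.4f}, exact p = {p_exact:.4f}")
```

The concordant cells a and d do not affect the result, since they carry no information about change.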
3-way contingency tables

| DIABETES | | AHA/ACC type A | B1 | B2 | C | Total |
| --- | --- | --- | --- | --- | --- | --- |
| no | Count | 3 | 3 | 0 | 2 | 8 |
| | % within DIABETES | 37.5% | 37.5% | 0.0% | 25.0% | 100.0% |
| yes | Count | 1 | 0 | 3 | 1 | 5 |
| | % within DIABETES | 20.0% | 0.0% | 60.0% | 20.0% | 100.0% |
| Total | Count | 4 | 3 | 3 | 3 | 13 |
| | % within DIABETES | 30.8% | 23.1% | 23.1% | 23.1% | 100.0% |

This is a 2-way 2x4 contingency table…
And what if we also know who is a smoker?
3-way 2x4x2 contingency table!
That means 2 different 2-way 2x4 contingency tables
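3-way contingency tables
DIABETES * AHA/ACC type * SMOKER Crosstabulation (Count)

| SMOKER | DIABETES | AHA/ACC type A | B1 | B2 | C | Total |
| --- | --- | --- | --- | --- | --- | --- |
| no | no | 1 | 1 | 0 | 1 | 3 |
| no | yes | 0 | 0 | 2 | 1 | 3 |
| no | Total | 1 | 1 | 2 | 2 | 6 |
| yes | no | 2 | 2 | 0 | 1 | 5 |
| yes | yes | 1 | 0 | 1 | 0 | 2 |
| yes | Total | 3 | 2 | 1 | 1 | 7 |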
The Cochran-Mantel-Haenszel chi-square tests the null hypothesis
that two nominal variables are conditionally independent in
each stratum, assuming that there is no three-way interaction. It
works in a 3-way (3-dimensional) contingency table, where the
last dimension refers to the strata
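For the special case of a 2x2 table repeated across K strata, the Cochran-Mantel-Haenszel statistic can be sketched in a few lines. Note the assumptions: the diabetes by AHA/ACC type by smoker example is 2x4x2 and needs the generalized form of the test, which is not shown here, and the stratum counts below are hypothetical. The continuity correction is optional and enabled by default in this sketch.

```python
from math import erfc, sqrt


def cmh_2x2xk(strata, continuity=True):
    """Cochran-Mantel-Haenszel chi-square (1 df) for a list of 2x2 tables.
    Each stratum is a tuple (a, b, c, d) for the table [[a, b], [c, d]]."""
    sum_observed = sum_expected = sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        r1, r2, s1, s2 = a + b, c + d, a + c, b + d
        sum_observed += a
        sum_expected += r1 * s1 / n  # expected value of cell a under independence
        sum_variance += r1 * r2 * s1 * s2 / (n**2 * (n - 1))  # hypergeometric variance
    difference = abs(sum_observed - sum_expected)
    if continuity:
        difference = max(0.0, difference - 0.5)
    chi2 = difference**2 / sum_variance
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return chi2, p_value


# Hypothetical strata (e.g. non-smokers and smokers), each a 2x2 table of counts
strata = [(10, 20, 5, 25), (8, 12, 4, 16)]
chi2, p_value = cmh_2x2xk(strata)
print(f"CMH chi-square = {chi2:.3f}, p = {p_value:.4f}")
```

Pooling the differences between observed and expected counts before squaring is what lets consistent small effects in each stratum add up, while adjusting for the stratifying variable.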
3-way contingency tables
SAINT I, NEJM 2006