Group Comparisons Part 3

Group Comparisons Part 3:
Nonparametric Tests,
Chi-squares and Fisher Exact
Robert Boudreau, PhD
Co-Director of Methodology Core
PITT-Multidisciplinary Clinical Research Center
for Rheumatic and Musculoskeletal Diseases
Core Director for Biostatistics
Center for Aging and Population Health
Dept. of Epidemiology, GSPH
Flow chart for group comparisons
Measurements to be compared:
• continuous
    Distribution approx normal or N ≥ 20?
      Yes => T-tests
      No  => Non-parametrics
• discrete (binary, nominal, ordinal with few values)
      => Chi-square, Fisher’s Exact
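The flow chart can be read as a small decision function. The sketch below is only an illustration: the function name, argument names and returned labels are hypothetical, and the chart does not say whether N is per group or total, so `n` is simply passed through as given.

```python
def choose_test(measure_type: str, approx_normal: bool = False, n: int = 0) -> str:
    """Return the family of tests suggested by the flow chart (illustrative only)."""
    if measure_type == "continuous":
        # Chart: "Distribution approx normal or N >= 20?"
        if approx_normal or n >= 20:
            return "t-test"
        return "non-parametric"
    if measure_type == "discrete":
        # binary, nominal, or ordinal with few values
        return "chi-square or Fisher's exact"
    raise ValueError("measure_type must be 'continuous' or 'discrete'")


print(choose_test("continuous", approx_normal=False, n=10))
print(choose_test("discrete"))
```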
A physiologic index of comorbidity – relationship to
mortality and disability.
Anne B. Newman, MD, MPH, Robert M. Boudreau,
PhD, Barbara L. Naydeck, MPH, Linda F. Fried, MD,
MPH and Tamara B. Harris, MD, MS
J Gerontol Med Sci. 2008
5 Physiologic System Measures
• Cystatin C
• Internal Carotid Artery Wall Thickness (ICA)
• Pulmonary: Forced Vital Capacity (FVC)
• Fasting Glucose
• White Matter Grade
N=2928 elderly participants in longitudinal cohort study
0-2 scale on each: 0=healthiest, 2=worst
• tertiles or clinical cutpoints
  (e.g. glucose <100, 100-126, 126+)
Physiologic Index = sum (range = 0 to 10)
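A minimal sketch of the scoring, assuming each of the five measures has already been mapped to 0, 1 or 2. Only the glucose cutpoints are given on the slide. Treating exactly 126 as the worst category is an assumption, and both function names are hypothetical.

```python
def glucose_score(fasting_glucose: float) -> int:
    """Score fasting glucose on the 0-2 scale using the slide's cutpoints.

    Assumption: a value of exactly 126 falls in the worst category ("126+").
    """
    if fasting_glucose < 100:
        return 0
    if fasting_glucose < 126:
        return 1
    return 2


def physiologic_index(component_scores):
    """Sum five 0-2 component scores into the 0-10 index."""
    if len(component_scores) != 5:
        raise ValueError("expected one score for each of the 5 measures")
    if any(score not in (0, 1, 2) for score in component_scores):
        raise ValueError("each component score must be 0, 1 or 2")
    return sum(component_scores)


# Hypothetical participant: the other four scores are made up for illustration.
scores = [0, 1, 2, glucose_score(110), 0]
print(physiologic_index(scores))
```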
* Mortality rates based on 9 yrs follow-up
[Table from the paper, shown with slide annotations; table values not recovered]
Rows marked √: Comparisons using 2-sample independent T-tests?
Rows marked √: Comparisons using Chi-Square? (categorical)
Pooled or unequal variance 2-sample T-test?
  Pooled: df = (1237-1) + (1691-1) = 2926
  Unequal Vars: Satterthwaite
2-Sample T-test,
Non-parametric: Wilcoxon Rank-Sum Test
Three-dimensional and thermal surface imaging
produces reliable measures of joint shape and
temperature: a potential tool for quantifying arthritis
Steven J Spalding, C Kent Kwoh, Robert Boudreau,
Joseph Enama, Julie Lunich, Daniel Huber, Louis Denes
and Raphael Hirsch
Arthritis Research & Therapy 2008
• Will focus on HDI
Heat Distribution Index =
SD of temps in standard
reproducibly defined
HDI of MCPs: RA vs Controls
[Figure: MCP region]
HDI (Heat Distribution Index) of MCPs
10 adult controls vs 9 adults with active RA
T-test (2-sample independent)
vs Wilcoxon Rank-Sum (aka Mann-Whitney)

          Control   Arthritis
          (n=10)    (n=9)
          1.2       1.4
          1.1       2.4
          1.0       2.3
          1.2       2.1
          0.6       3.0
          0.5       1.1
          1.0       1.4
          1.0       1.3
          1.3       1.1
          1.2
Mean      1.01      1.79
SD        0.26      0.70
Median    1.05      1.40
T-test (2-sample independent)
“pooled” df = 10+9-2 = 17

T-Tests
Variable   Method          Variances   DF     t Value   Pr > |t|
HDI        Pooled          Equal       17     3.36      0.0037
HDI        Satterthwaite   Unequal     10.2   3.23      0.0089

Test for Equality of Variances
Variable   Method     Num DF   Den DF   F Value   Pr > F
HDI        Folded F   8        9        6.60      0.0105

Test of equality of variances is rejected
=> Use Unequal Variance t-test (Satterthwaite)
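The t values, the Satterthwaite df and the folded F in the output above can be recomputed from the raw HDI values. This is a sketch using only the Python standard library, with variable names chosen for illustration. It stops at the test statistics, because the standard library has no t or F distribution from which to get the p-values.

```python
from math import sqrt
from statistics import mean, variance

# HDI values from the data table
control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]

n1, n2 = len(control), len(arthritis)
v1, v2 = variance(control), variance(arthritis)  # sample variances (n-1 divisor)
diff = mean(arthritis) - mean(control)

# Pooled (equal-variance) t-test
df_pooled = n1 + n2 - 2
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_pooled
t_pooled = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Unequal-variance t-test with Satterthwaite degrees of freedom
se_sq = v1 / n1 + v2 / n2
t_unequal = diff / sqrt(se_sq)
df_satterthwaite = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

# Folded F: larger sample variance over the smaller one
f_folded = max(v1, v2) / min(v1, v2)

print(f"pooled:  df={df_pooled}, t={t_pooled:.2f}")
print(f"unequal: df={df_satterthwaite:.1f}, t={t_unequal:.2f}")
print(f"folded F={f_folded:.2f}")
```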
Wilcoxon Rank-Sum (aka Mann-Whitney)
The idea/motivation:
• Method should work for any distribution
    • non-parametric
• Base statistical test on ranks
    • rank = order when all data is sorted from lowest to highest
    • each group then gets a “rank sum”
• Largely unaffected by outliers (only the rank of an extreme value matters, not its size)
• Like all statistical tests, p-value is based on the distribution (of the difference in rank-sums here) assuming there is no difference between the groups
    • just like shuffling cards
      (with only two colors on cards; even if different n’s)
    • the critical values are the “extreme” differences in rank-sums between the two groups
      (α = 0.05 => the most extreme 5% of differences)
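The card-shuffling idea can be written out directly: pool the ranks, deal every possible "hand" of 9 ranks to the arthritis group, and count how often the rank sum is at least as far from its expected value as the one observed. The sketch below uses the HDI values from the data table. The helper name `average_ranks` is hypothetical, and full enumeration is practical here only because the groups are small.

```python
from itertools import combinations

control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]


def average_ranks(values):
    """Rank from lowest to highest; tied values share the average rank."""
    ranks = []
    for v in values:
        below = sum(1 for x in values if x < v)
        tied = sum(1 for x in values if x == v)
        ranks.append(below + (tied + 1) / 2)
    return ranks


pooled = control + arthritis
ranks = average_ranks(pooled)
n_arthritis = len(arthritis)

observed_sum = sum(ranks[len(control):])             # arthritis rank sum
expected_sum = n_arthritis * (len(pooled) + 1) / 2   # rank sum under "no difference"
observed_dev = abs(observed_sum - expected_sum)

# Every way of dealing 9 of the 19 ranks to the arthritis group is equally
# likely if the groups do not differ.
n_hands = 0
n_extreme = 0
for hand in combinations(ranks, n_arthritis):
    n_hands += 1
    if abs(sum(hand) - expected_sum) >= observed_dev:
        n_extreme += 1

p_two_sided = n_extreme / n_hands
print(f"{n_extreme} of {n_hands} deals are at least as extreme: p = {p_two_sided:.4f}")
```

With larger groups the number of deals grows too quickly to enumerate, which is one reason software also reports approximations to the exact p-value.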
Sorted then assigned ranks

Obs   group       HDI   HDI_rank
1     Control     0.5   1.0
2     Control     0.6   2.0
3     Control     1.0   4.0
4     Control     1.0   4.0
5     Control     1.0   4.0
6     Control     1.1   7.0
7     Arthritis   1.1   7.0
8     Arthritis   1.1   7.0
9     Control     1.2   10.0
10    Control     1.2   10.0
11    Control     1.2   10.0
12    Control     1.3   12.5
13    Arthritis   1.3   12.5
14    Arthritis   1.4   14.5
15    Arthritis   1.4   14.5
16    Arthritis   2.1   16.0
17    Arthritis   2.3   17.0
18    Arthritis   2.4   18.0
19    Arthritis   3.0   19.0

Tied values get the average rank (e.g. Obs 12 and 13: average rank = 12.5)
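The ranking step, including average ranks for ties, can be reproduced with a few lines of Python. This is a sketch with a hypothetical helper name. Each value's rank is the number of smaller values plus the middle position within its own tie group.

```python
control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]


def average_ranks(values):
    """Rank from lowest to highest; tied values share the average rank."""
    ranks = []
    for v in values:
        below = sum(1 for x in values if x < v)
        tied = sum(1 for x in values if x == v)
        ranks.append(below + (tied + 1) / 2)
    return ranks


values = control + arthritis
groups = ["Control"] * len(control) + ["Arthritis"] * len(arthritis)
ranks = average_ranks(values)

# Print the sorted table: Obs, group, HDI, rank
rows = sorted(zip(values, groups, ranks), key=lambda row: row[0])
for obs, (hdi, group, rank) in enumerate(rows, start=1):
    print(f"{obs:>3}  {group:<10} {hdi:.1f}  {rank:.1f}")

# Each group's rank sum
rank_sum = {"Control": 0.0, "Arthritis": 0.0}
for group, rank in zip(groups, ranks):
    rank_sum[group] += rank
print(rank_sum)
```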
Wilcoxon Scores (Rank Sums) for Variable HDI
Classified by Variable Group

Group       N    Sum of Scores   Expected Under H0   Std Dev Under H0   Mean Score
Control     10   64.50           100.0               12.172013          6.45000
Arthritis   9    125.50          90.0                12.172013          13.94444

Average scores were used for ties.
Wilcoxon Two-Sample Test

Statistic (S)                   125.5000

Normal Approximation
  Z                             2.8754
  One-Sided Pr > Z              0.0020
  Two-Sided Pr > |Z|            0.0040

t Approximation
  One-Sided Pr > Z              0.0050
  Two-Sided Pr > |Z|            0.0101

Exact Test
  One-Sided Pr >= S             0.0012
  Two-Sided Pr >= |S - Mean|    0.0023

Z includes a continuity correction of 0.5.
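The normal-approximation lines of this output can be retraced by hand. The sketch below assumes the usual tie-corrected variance for the rank sum and moves the deviation 0.5 toward zero as the continuity correction. It reproduces the Std Dev Under H0 and Z shown above, but it is an illustration, not the software's actual code.

```python
from collections import Counter
from math import erfc, sqrt

control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]


def average_ranks(values):
    """Rank from lowest to highest; tied values share the average rank."""
    ranks = []
    for v in values:
        below = sum(1 for x in values if x < v)
        tied = sum(1 for x in values if x == v)
        ranks.append(below + (tied + 1) / 2)
    return ranks


pooled = control + arthritis
n1, n2 = len(control), len(arthritis)
n_total = n1 + n2
ranks = average_ranks(pooled)

s = sum(ranks[n1:])                     # rank sum of the arthritis group
expected = n2 * (n_total + 1) / 2       # its mean under H0

# Variance under H0, reduced for ties (t = size of each group of tied values)
tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (n_total * (n_total - 1))
sd = sqrt(n1 * n2 / 12 * ((n_total + 1) - tie_term))

# Continuity correction: move the deviation 0.5 toward zero
deviation = s - expected
correction = 0.5 if deviation > 0 else (-0.5 if deviation < 0 else 0.0)
z = (deviation - correction) / sd

p_one_sided = 0.5 * erfc(abs(z) / sqrt(2))   # upper tail of the standard normal
p_two_sided = erfc(abs(z) / sqrt(2))

print(f"S={s}, expected={expected}, sd={sd:.6f}")
print(f"Z={z:.4f}, one-sided p={p_one_sided:.4f}, two-sided p={p_two_sided:.4f}")
```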
Comparing Groups in the
Percentage Falling into Categories
Example: Treatment for RA
• Compare MTX vs MTX+ETN
Outcomes (@ 3 months)
• Dichotomous: e.g. % in remission
                    % with DAS28 drop > 1.2 pts
• Multiple Categories: ACR 20/50/70
                    % of pts reaching each level (sum to 100%)
Comparing Groups on the
Percentage Falling into Categories
Rule of thumb:
[1] All cell sizes ≥ 5 => Use Chi-square
[2] Any cell size < 5 => Use Fisher’s Exact
Reason: Criterion [1] is a condition for the
Central Limit Theorem to hold with good
accuracy (… so p-values are accurate)
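The rule of thumb is easy to apply mechanically. A small sketch follows, with a hypothetical function name. It checks the observed cell counts, as the rule is worded here.

```python
def choose_categorical_test(table):
    """Apply the rule of thumb to a table given as a list of rows of counts."""
    smallest_cell = min(min(row) for row in table)
    if smallest_cell >= 5:
        return "chi-square"
    return "Fisher's exact"


print(choose_categorical_test([[28, 10], [20, 20]]))  # all cells >= 5
print(choose_categorical_test([[12, 3], [9, 8]]))     # hypothetical table with a cell < 5
```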
Sharma L, et al. Quadriceps Strength and OA Progression in Malaligned
and Lax Knees, Ann Intern Med. 2003
Inclusions:
• KL grade ≥ 2
• At least a little difficulty (Likert category) on at least two items
  in the Western Ontario and McMaster University osteoarthritis
  index physical function scale
Exclusions:
• corticosteroid injection < 3 months, avascular necrosis,
  rheumatoid or other inflammatory arthritis, periarticular
  fracture, Paget disease, villonodular synovitis, … (etc.)
                             JSN Progression
                             No             Yes           # Knees
More neutral alignment (< 5 degrees)
  Low quadriceps strength    111 (88.8%)    14 (11.2%)    125
  High quadriceps strength   111 (88.8%)    14 (11.2%)    125
Malalignment (≥ 5 degrees)
  Low quadriceps strength    28 (73.7%)     10 (26.3%)    38
  High quadriceps strength   20 (50.0%)     20 (50.0%)    40
Malalignment (≥ 5 degrees)
                             JSN Progression
                             No             Yes           # Knees
Low quadriceps strength      28 (73.7%)     10 (26.3%)    38 (48.7%)
High quadriceps strength     20 (50.0%)     20 (50.0%)    40 (51.3%)
Column totals                48 (61.5%)     30 (38.5%)    Total = 78
Chi-square Statistic
$\chi^2 = \sum_{i,j} (n_{ij} - e_{ij})^2 / e_{ij}$
df = (rows-1) x (cols-1)
Note: n_ij = observed (actual) cell count
e_ij = (row %) x (col %) x (total # knees)
= (# knees in row) x (col %)
= expected cell count as if groups are the “same”
(e_ij effectively applies the “pooled” average
JSN Progression rate to both groups)
Cells are: # observed (# expected)
Malalignment (≥ 5 degrees)
                             JSN Progression
                             No           Yes          Row %’s
Low quadriceps strength      28 (23.4)    10 (14.6)    38 (48.7%)
High quadriceps strength     20 (24.6)    20 (15.4)    40 (51.3%)
Column %’s                   61.5%        38.5%        Total = 78

High quadriceps strength: Expected # Yes = 0.513*0.385*78 = 0.1975*78
= 0.385 * 40 knees = 15.4
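The same arithmetic gives all four expected counts at once: each is (row total) x (column total) / (grand total). Below is a short sketch using the observed counts for the malaligned knees. Variable names are illustrative.

```python
# Rows: low, high quadriceps strength; columns: JSN progression No, Yes
observed = [[28, 10],
            [20, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count = (knees in row) x (column proportion)
expected = [[r * c / total for c in col_totals] for r in row_totals]

for obs_row, exp_row in zip(observed, expected):
    print("  ".join(f"{o} ({e:.1f})" for o, e in zip(obs_row, exp_row)))
```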
Malalignment (≥ 5 degrees): low vs high quadriceps strength
(28 No / 10 Yes vs 20 No / 20 Yes, Total = 78)
Chi-square = 4.6184, df = (2-1) x (2-1) = 1, p = 0.0316
Fisher’s Exact: p = 0.0383
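The chi-square value and its p-value can be checked from the observed counts. In this sketch the p-value line is valid only for df = 1, where a chi-square variable is the square of a standard normal, so its upper tail can be written with `math.erfc`. Larger tables would need a chi-square distribution function from a statistics library.

```python
from math import erfc, sqrt

# Rows: low, high quadriceps strength; columns: JSN progression No, Yes
observed = [[28, 10],
            [20, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, n_ij in enumerate(row):
        e_ij = row_totals[i] * col_totals[j] / total   # expected count
        chi_square += (n_ij - e_ij) ** 2 / e_ij

df = (len(observed) - 1) * (len(observed[0]) - 1)

# df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.4f}, df = {df}, p = {p_value:.4f}")
```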
Cells are: Obs # (Alt #)
Fisher’s Exact uses all (Alt #)’s that retain the same row/col counts
Malalignment (≥ 5 degrees)
                             JSN Progression
                             No            Yes           # Knees
Low quadriceps strength      28 (29)       10 (9)        38
High quadriceps strength     20 (19)       20 (21)       40
Column totals                48 (61.5%)    30 (38.5%)    Total = 78

Fisher’s Exact p-value is the total hypergeometric probability of the tables that are
at least as “extreme” as the observed table. (The Alt # table above is more “extreme”.)
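A sketch of that calculation for a 2 x 2 table follows. It makes one assumption the slide does not spell out: "at least as extreme" is taken to mean every table with the same margins whose hypergeometric probability is no larger than that of the observed table, which is a common two-sided convention. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that tables tied with the observed one are included
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


observed = [[28, 10], [20, 20]]
print(f"Fisher's exact p = {fisher_exact_two_sided(observed):.4f}")
```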
Comparing Groups on the
Percentage Falling into Categories
Rule of thumb:
[1] All cell sizes ≥ 5 => Use Chi-square
[2] Any cell size < 5 => Use Fisher’s Exact