Comparing Three or More Groups

Comparing Three or More Groups:
Multiple Comparisons vs Planned Comparisons
Robert Boudreau, PhD
Co-Director of Methodology Core
PITT-Multidisciplinary Clinical Research Center
for Rheumatic and Musculoskeletal Diseases
First a simple thought experiment
Flip a fair coin 100 times: Let H = # heads
- H = 0, 1, 2, …, 100 are the possible outcomes
- H has a binomial distribution with known probabilities
- Prob[ 40 < H < 60 ] is close to 0.95 (about 0.943)
- Prob[ H ≤ 40 ] + Prob[ H ≥ 60 ] ≈ 0.05 (about 0.057)
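The two probabilities quoted for the coin can be checked exactly from the binomial distribution. The following is a minimal sketch using only the Python standard library; the helper name `prob_heads_between` is made up for this illustration.

```python
from math import comb

N = 100  # number of flips of a fair coin


def prob_heads_between(lo, hi, n=N):
    """Exact P[lo <= H <= hi] for H ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n


# P[40 < H < 60] means H is one of 41, ..., 59
p_middle = prob_heads_between(41, 59)
# P[H <= 40] + P[H >= 60]
p_tails = prob_heads_between(0, 40) + prob_heads_between(60, N)

print(f"P[40 < H < 60]        = {p_middle:.4f}")
print(f"P[H <= 40 or H >= 60] = {p_tails:.4f}")
```

The exact values come out slightly off the round figures 0.95 and 0.05, which are used on the slides as convenient approximations.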
-------------------------------------------------------------------------------------------------------
Experiment: 20 people each flip their own fair coin 100 times
Q: Approximately how many will get 40 or fewer heads, or 60 or more heads?
Answer: About one (20 × 0.05 = 1, i.e. 1/20 = 5%)
First a simple thought experiment
Experiment: 20 people flip their own coin 100 times

On average, one of the 20 (1/20 = 0.05) will flip an unusually small or unusually large number of heads.
Q: Can we conclude that this person “X” flips an “unfair” coin, or is the result explainable by “chance”?
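A short calculation shows why "chance" is a plausible explanation. The sketch below assumes the 20 coins are fair and independent and uses the rounded per-person probability of 0.05; the variable names are illustrative only.

```python
N_PEOPLE = 20
P_UNUSUAL = 0.05  # rounded per-person chance of an "unusual" head count

# Expected number of people with an unusual result
expected_unusual = N_PEOPLE * P_UNUSUAL

# Chance that at least one of the 20 fair coins looks "unfair"
p_at_least_one = 1 - (1 - P_UNUSUAL) ** N_PEOPLE

print(f"Expected number of unusual results: {expected_unusual:.2f}")
print(f"P[at least one unusual result]:     {p_at_least_one:.3f}")
```

With 20 fair coins, seeing at least one unusual result is more likely than not, so one such result alone is weak evidence that person X's coin is unfair.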
Controlling Experiment-wise Error
Experiment: 20 people flip their own coin 100 times
- Person X’s confidence interval didn’t cover 0.5
Q: What alpha level should be used so that 95% of the time all 20 confidence intervals cover 0.5 (i.e. so that the correct conclusion is drawn about every single coin)?
→ Equivalent to drawing a “wrong” conclusion about at least one of the coins only 5% of the time (experiment-wise Type I error)
Controlling Experiment-wise Error
Q: What alpha level should be used so that there’s a 95% probability that all 20 confidence intervals cover 0.5? (aka experiment-wise correct conclusion)
Experiment-wise α = 0.05; solve for comparison-wise α*:
α = Prob[ at least one C.I. misses 0.5 ]
  = 1 – Prob[ all C.I.s cover 0.5 ]
  = 1 – (1 – α*)^20   (the 20 coins are independent)
Sidak: comparison-wise α* = 1 – (1 – α)^(1/n)
n = 20 “comparisons”: α* = 1 – (1 – 0.05)^(1/20) = 0.00256
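The Sidak calculation is easy to reproduce and to check by plugging the result back into the experiment-wise formula. This is a minimal sketch; the function name `sidak_alpha` is chosen here for illustration.

```python
def sidak_alpha(alpha, n):
    """Comparison-wise level that gives experiment-wise level alpha
    across n independent comparisons."""
    return 1 - (1 - alpha) ** (1 / n)


alpha_star = sidak_alpha(0.05, 20)

# Round trip: the experiment-wise error implied by alpha_star
experimentwise = 1 - (1 - alpha_star) ** 20

print(f"comparison-wise alpha* = {alpha_star:.5f}")
print(f"experiment-wise alpha  = {experimentwise:.5f}")
```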
Bonferroni: α* = α/n (0.05/20 = 0.0025)
Controlling Experiment-wise Error

Mathematically: α/n < 1 – (1 – α)^(1/n) for n > 1
Bonferroni α* < Sidak α* (i.e. Sidak uses a slightly higher comparison-wise α-level)
The two are usually very close → Sidak is slightly more powerful

Bonferroni works in all situations to guarantee control
of experimentwise error (but may be conservative)

Sidak (derived assuming independence) is not guaranteed to control experimentwise error when the comparisons are correlated
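To see how close the two corrections are, the comparison-wise levels can be tabulated for a few values of n. The sketch below also reports the experiment-wise error Bonferroni would give if the comparisons were independent (an assumption made only for this illustration), which shows the sense in which it is conservative.

```python
ALPHA = 0.05


def bonferroni_alpha(alpha, n):
    return alpha / n


def sidak_alpha(alpha, n):
    return 1 - (1 - alpha) ** (1 / n)


rows = []
for n in (2, 6, 20, 100):
    b = bonferroni_alpha(ALPHA, n)
    s = sidak_alpha(ALPHA, n)
    # Experiment-wise error of Bonferroni under independence
    fwer_b = 1 - (1 - b) ** n
    rows.append((n, b, s, fwer_b))
    print(f"n={n:3d}  Bonferroni={b:.6f}  Sidak={s:.6f}  "
          f"Bonferroni experiment-wise={fwer_b:.5f}")
```

In every row the Bonferroni level is slightly below the Sidak level, and its experiment-wise error under independence stays slightly below 0.05.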
Comparison of Adverse Effect
of 4 Drugs on Systolic BP
Unadjusted pairwise t-tests
(α = 0.05 each comparison)
critical value of t=2.13145
Pairwise t-tests (Bonferroni)
critical value of t=3.03628
Pairwise t-tests (Sidak)
critical value of t=3.02585
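The three critical values above can be reproduced from the t distribution. The slides do not state the error degrees of freedom; the sketch below assumes 15, inferred from the unadjusted value 2.13145, and 6 pairwise comparisons among the 4 drugs. To stay within the standard library it computes t quantiles by numerical integration and bisection; a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-sided critical value t with P[|T| > t] = alpha, by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


ALPHA = 0.05
DF = 15      # ASSUMPTION: error degrees of freedom, inferred from t = 2.13145
N_PAIRS = 6  # 4 drugs -> 4 * 3 / 2 pairwise comparisons

t_unadjusted = t_critical(ALPHA, DF)
t_bonferroni = t_critical(ALPHA / N_PAIRS, DF)
t_sidak = t_critical(1 - (1 - ALPHA) ** (1 / N_PAIRS), DF)

print(f"unadjusted: {t_unadjusted:.5f}")
print(f"Bonferroni: {t_bonferroni:.5f}")
print(f"Sidak:      {t_sidak:.5f}")
```

Under these assumptions the adjusted critical values are about 40% larger than the unadjusted one, so a pairwise difference must be correspondingly larger to be declared significant.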
Comparison of critical values
Scheffe:
- Designed for arbitrary post-hoc testing
- Controls experimentwise error for all possible simultaneous comparisons and contrasts
Comparison of Adverse Effect
of 4 Drugs on Systolic BP (v2)
Note: For Drug 4, I’ve subtracted 6 from the previous values
ANOVA F-test
Unadjusted pairwise t-tests (v2)
(α = 0.05 each comparison)
critical value of t=2.13145
Pairwise t-tests (Bonferroni) (v2)
critical value of t=3.03628
Pairwise t-tests (Sidak) (v2)
critical value of t=3.02585
Tukey’s Studentized Range Test
- Related in concept to Scheffe’s Method
- Designed for all pairwise comparisons exclusively (recall: Scheffe applies to all possible simultaneous pairwise comparisons and contrasts)
- Exact experimentwise error coverage if sample sizes are equal
- Critical values smaller than Bonferroni or Sidak
→ More powerful in finding differences
Pairwise t-tests (Tukey) (v2)
critical value of t=2.88215
Dunnett’s Method
(Comparison vs a Control)
- Related in concept to Scheffe and Tukey Methods
- Designed exclusively for pairwise comparisons vs a single control
- Exact experimentwise error coverage of those comparisons if sample sizes are equal
- Critical values smaller than Bonferroni, Sidak or Tukey
→ More powerful in finding differences vs control
Comparison vs Control (Dunnett) (v2)
critical value of t=2.61702
Controlling for Multiple Comparisons
in Exploratory Analyses
Caterina Rosano, Howard J. Aizenstein, Stephanie
Studenski, Anne B. Newman.
A Regions-of-Interest Volumetric Analysis of
Mobility Limitations in Community-Dwelling
Older Adults. Journal of Gerontology: Medical
Sciences 2007
Thank you !
Any Questions?