licensed under a . Your use of this Creative Commons Attribution-NonCommercial-ShareAlike License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this
material constitutes acceptance of that license and the conditions of use of materials on this site.
Copyright 2008, The Johns Hopkins University and Brian Caffo. All rights reserved. Use of these materials
permitted only in accordance with license rights granted. Materials provided “AS IS”; no representations or
warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently
review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for
obtaining permissions for use from third parties as needed.
Lecture 27
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
Lecture 27
FDR
Brian Caffo
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
Johns Hopkins University
December 13, 2007
Lecture 27
Table of contents
Brian Caffo
Table of
contents
Outline
Multiplicity
1 Table of contents
Bonferoni
FDR
2 Outline
3 Multiplicity
4 Bonferoni
5 FDR
Lecture 27
Outline
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
1
Familywise error rates
2
Bonferoni procedure
3
Performance of Bonferoni with multiple independent tests
4
False discovery rate procedure
Lecture 27
Multiplicity
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
• After rejecting a χ2 omnibus test you do all pairwise
comparisons
• You conducted a study with 20 outcomes and 30 different
combinations of covariates. You consider significance at all
combinations.
• You compare diseased tissue versus normal tissue
expression levels for 20k genes
• You compare rest versus active at 300k voxels in an fMRI
study
Lecture 27
Multiplicity
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
• Performing two α-level tests:
H01 versus Ha1 and H02 versus Ha2
E1 Reject H01 and E2 Reject H02
FWE
P(one or more false rej | H01 , H02 )
= P(E1 ∪ E2 | H01 , H02 )
= P(E1 | H01 , H02 ) + P(E2 | H01 , H02 )
− P(E1 ∩ E2 | H01 , H02 )
≤ P(E1 | H01 , H02 ) + P(E2 | H01 , H02 )
= 2×α
Result : The familywise error rate for k hypotheses tested at
level α is bounded by kα
Lecture 27
Proof
Brian Caffo
Table of
contents
Outline
Multiplicity
Ei - false rejection for test i
All probabilities are conditional on all of the nulls being true
Bonferoni
FDR
FWE
= P(one or more false rej)
= P(∪ki=1 Ei )
o
n
= P E1 ∪ (∪ki=2 Ei )
≤ P(E1 ) + P(∪ki=2 Ei )
..
.
≤ P(E1 ) + P(E2 ) + . . . + P(Ek )
= kα
Lecture 27
Other direction
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
• The FWE is no larger than kα where k is the number of
tests
• The FWE is no smaller than α
P(∪ki=1 Ei ) ≥ P(E1 ) = α
• The lower bound is obtained when the Ei are identical
E1 = E2 = . . . = Ek
• Bonferoni’s tests each individual hypothesis at level
α∗ = α/k
• The FWE is no larger than kα∗ = kα/k = α
• The FWE is no smaller than α/k
Lecture 27
Bonferoni’s procedure
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
If α∗ is small and the tests are independent, then the upper
bound on the FWE is nearly obtained
FDR
FWE
= P(one or more false rej)
= 1 − P(no false rej)
= 1 − P(∩ki=1 Ēi )
= 1 − (1 − α∗ )k
≈ 1 − (1 − kα∗ )
= kα∗ = α
Lecture 27
Scratch work
Brian Caffo
Table of
contents
Outline
Multiplicity
Recall the approximation for α∗ near 0
Bonferoni
f (α∗ ) − f (0)
≈ f 0 (0)
α∗ − 0
FDR
hence
f (α∗ ) ≈ f (0) + α∗ f 0 (0)
In our case f (α∗ ) = (1 − α∗ )k so f (0) = 1
f 0 (α∗ ) = −k(1 − α∗ )k−1 so f 0 (0) = −k
Therefore (1 − α∗ )k ≈ 1 − kα∗
Lecture 27
Notes
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
• For Bonferoni’s procedure α∗ = α/k so will be close to 0
for a large number of tests
FDR
• When there are lots of tests that are (close to)
independent, the upper bound on the FWE used is
appropriate
• When the test are closely related, then the FWE will be
closer to the lower bound, and Bonferoni’s procedure is
conservative
• Is the familywise error rate always the most appropriate
quantity to control for?
Lecture 27
FDR
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
• The false discovery rate is the proportion of tests that
are falsely declared significant
• Controlling the FDR is less conservative than controlling
the FWE rate
• Introduced by Benjamini and Hochberg
Lecture 27
Benjamini and Hochberg
procedure
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
1
Order your k p-values, say p1 < p2 < . . . < pk
2
Define qi = kpi /i
3
Define Fi = min(qi , . . . , qk )
4
Reject for all i so that Fi is less than the desired FDR
Note that the Fi are increasing, so you only need to find the
largest one so that Fi < FDR
Lecture 27
Example
Brian Caffo
Table of
contents
Outline
1st 10 of 50 SNPs (Rosner page 581)
Multiplicity
Bonferoni
FDR
Gene
30
20
48
50
4
40
7
14
26
47
i
1
2
3
4
5
6
7
8
9
10
pi
<.0001
.011
.017
.017
.018
.019
.026
.034
.042
.048
qi = kpi /i
.0035
.28
.28
.22
.18
.16
.18
.21
.23
.24
Fi
.0035
.16
.16
.16
.16
.16
.18
.21
.23
.24
Lecture 27
Example
Brian Caffo
Table of
contents
Outline
Multiplicity
Bonferoni
FDR
• Bonferoni cutoff .05/50 = .001; only the first Gene is
significant
• For a FDR of 0 − 15%; only the first Gene would be
declared significant
• For a FDR of 16 − 20%, the first 7 would be significant