Multiplicity

Multiplicity in Clinical
Trials
Ziad Taib
Biostatistics
AstraZeneca
March 12, 2012
Issues
• The multiplicity problem
• Sources of multiplicity in clinical trials
• Bonferroni
• Holm
• Hochberg
• Closed test procedures
• FDR (Benjamini-Hochberg)
The multiplicity problem
• When we perform one test there is a risk
of α (typically 5%) for a false significant result, i.e.
rejecting H0 (no effect) when it is actually
true.
• What about the risk for at least one false
significant result when performing many
tests?
Greater or smaller?
When performing 20 independent tests at the 5% level, we expect on average
one significant result (20 × 0.05 = 1) even though no difference exists.
Probability of at least one false significant result
Number of tests    Probability
1                  0.05
2                  0.0975
5                  0.226
10                 0.401
50                 0.923
P(at least one false positive result)
= 1 - P(zero false positive results)
= 1 – (1 - .05) ^ k
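The formula above can be checked numerically. The following is a minimal sketch, assuming independent tests that each use the same level alpha = 0.05; the function name `familywise_error_rate` is illustrative and not from the slides.

```python
def familywise_error_rate(k, alpha=0.05):
    """P(at least one false positive) across k independent tests at level alpha."""
    # P(zero false positives) = (1 - alpha) ** k, so take the complement.
    return 1.0 - (1.0 - alpha) ** k


# Recompute the probabilities for the numbers of tests listed in the table.
for k in (1, 2, 5, 10, 50):
    print(f"{k:>3} tests: {familywise_error_rate(k):.3f}")
```

The independence assumption matters: for correlated tests the probability is generally different, so this should be read as an illustration rather than a general result.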
Multiplicity Dimensions
• A. Multiple treatments
• B. Multiple variables
• C. Multiple time points
• D. Interim analyses
• E. Subgroup analyses
The multiplicity problem
• Doing a lot of tests will give us significant
results just by chance.
• We want to find methods to control this
risk (error rate).
• The same problem arises when
considering many confidence intervals
simultaneously.
Family wise error rate
• FWE = probability of observing a false positive
finding in any of the tests undertaken
• While there may be different opinions about
needing to adjust:
– Regulatory authorities are concerned about any false
claims for the effectiveness of a drug, not just for the
claim based on the primary endpoint(s)
– So we will need to demonstrate adequate control of the
FWE rate
• It's not just about the p-value!
– True! Estimates and confidence intervals are important
too
– Ideally, multiplicity methods need to handle these as well
Procedures for controlling the
probability of false significances
• Bonferroni
• Holm
• Hochberg
• Closed tests
• FDR
Bonferroni
• N different null hypotheses H1, … HN
• Calculate corresponding p-values p1, … pN
• Reject Hk if and only if pk < α/N
Variation: The limits may be unequal as long
as they sum up to α
Conservative
Bonferroni’s inequality
• P(Ai) = P(reject H0i when it is true)
• P(reject at least one hypothesis falsely):
$$P\left(\bigcup_{i=1}^{N} A_i\right) \le \sum_{i=1}^{N} P(A_i) \le \sum_{i=1}^{N} \frac{\alpha}{N} = \alpha$$
Example of Bonferroni correction
• Suppose we have N = 3 t-tests.
• Assume target alpha = 0.05.
• Bonferroni-corrected significance threshold is alpha/N = 0.05/3
= 0.0167
• Unadjusted p-values are p1 = 0.001; p2 = 0.013; p3 = 0.074
– p1 = 0.001 < 0.0167, so reject H01
– p2 = 0.013 < 0.0167, so reject H02
– p3 = 0.074 > 0.0167, so do not reject H03
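The Bonferroni rule used in this example can be sketched in a few lines. This is an illustrative helper (the name `bonferroni` is an assumption, not from the slides) that uses the equal-split version of the rule, comparing each p-value with alpha/N.

```python
def bonferroni(p_values, alpha=0.05):
    """Return one reject/keep decision per hypothesis (True = reject H0)."""
    # Every test is compared with the same corrected threshold alpha / N.
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# The three p-values from the example above.
decisions = bonferroni([0.001, 0.013, 0.074])
print(decisions)
```

Run on the example p-values, the first two hypotheses are rejected and the third is not, matching the slide.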
Holm
• N different null hypotheses H01, … H0N
• Calculate corresponding p-values p1, … pN
• Order the p-values from the smallest to the
largest, p(1) ≤ … ≤ p(N)
• Start with the smallest p-value and reject
H(j) as long as p(j) < α/(N-j+1); stop at the first j where this
fails and retain all remaining hypotheses
Example of Holm’s test
• Suppose we have N = 3 t-tests.
• Assume target alpha = 0.05.
• Unadjusted p-values are
p1 = 0.001; p2 = 0.013; p3 = 0.074 (already ordered from smallest to largest)
• For the jth test, calculate
alpha(j) = alpha/(N – j + 1)
• For test j = 1,
– alpha(1) = 0.05/(3 – 1 + 1) = 0.05/3 = 0.0167
– the observed p1 = 0.001 is less than alpha(1) =
0.0167, so we reject the null hypothesis.
• For test j = 2,
– alpha(2) = 0.05/(3 – 2 + 1) = 0.05/2 = 0.025
– the observed p2 = 0.013 is less than alpha(2) =
0.025, so we reject the null hypothesis.
• For test j = 3,
– alpha(3) = 0.05/(3 – 3 + 1) = 0.05
– the observed p3 = 0.074 is greater than alpha(3) =
0.05, so we do not reject the null hypothesis.
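The step-down logic of Holm's test can be written as a short loop. The sketch below is an illustration only; the function name `holm` and the choice to return decisions in the original input order are assumptions made for this example.

```python
def holm(p_values, alpha=0.05):
    """Holm step-down procedure; returns True where H0 is rejected."""
    n = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for j, idx in enumerate(order, start=1):
        # The j-th smallest p-value is compared with alpha / (N - j + 1).
        if p_values[idx] < alpha / (n - j + 1):
            reject[idx] = True
        else:
            break  # first failure: keep this and all larger p-values
    return reject


print(holm([0.001, 0.013, 0.074]))
```

For the three p-values in the example this rejects the first two hypotheses and keeps the third. Because the function sorts internally, the p-values can be passed in any order.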
Hochberg
• N different null hypotheses H1, … HN
• Calculate corresponding p-values p1, … pN
• Order the p-values from the smallest to the largest, p(1) ≤ … ≤ p(N)
• Start with the largest p-value. If p(N) < α stop and declare all
comparisons significant at level α (i.e. reject H(1) … H(N) at level α).
Otherwise accept H(N) and go to the next step
• If p(N-1) < α/2 stop and declare H(1) … H(N-1) significant. Otherwise
accept H(N-1) and go to the next step
• ….
• At step k: if p(N-k+1) < α/k stop and declare H(1) … H(N-k+1) significant.
Otherwise accept H(N-k+1) and go to the next step
• If no step succeeds, no hypothesis is rejected
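Hochberg's step-up procedure walks the ordered p-values in the opposite direction to Holm. A minimal sketch follows, assuming that step k compares the k-th largest p-value with alpha/k (so the largest is compared with alpha, the second largest with alpha/2, and so on); the function name `hochberg` is illustrative.

```python
def hochberg(p_values, alpha=0.05):
    """Hochberg step-up procedure; returns True where H0 is rejected."""
    n = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for k in range(1, n + 1):
        position = n - k  # 0-based position of the k-th largest p-value
        if p_values[order[position]] < alpha / k:
            # Stop here: reject this hypothesis and all with smaller p-values.
            for idx in order[: position + 1]:
                reject[idx] = True
            break
    return reject


print(hochberg([0.001, 0.013, 0.074]))
print(hochberg([0.03, 0.04]))
```

The second call illustrates a case where the two procedures differ: with p-values 0.03 and 0.04, Hochberg rejects both hypotheses because 0.04 < 0.05, whereas Holm would reject neither because 0.03 is not below 0.025.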
Closed procedures - stepwise
• Pre-specify the order of the tested hypotheses.
Test each at the 5% level, in that order, until the
first non-significant result; then stop.
• Order of tested hypotheses stated in
protocol
– Dose-response
– Factorial designs
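The pre-specified ordering can be sketched as a simple loop that stops at the first non-significant result. The function name `fixed_sequence` and the p-values below are hypothetical and made up purely for illustration; they do not come from any trial.

```python
def fixed_sequence(p_values_in_protocol_order, alpha=0.05):
    """Test hypotheses in their pre-specified order, each at the full level alpha.

    Returns True where H0 is rejected. After the first non-significant
    result, every later hypothesis is left untested (not rejected).
    """
    reject = []
    stopped = False
    for p in p_values_in_protocol_order:
        if not stopped and p < alpha:
            reject.append(True)
        else:
            stopped = True
            reject.append(False)
    return reject


# Hypothetical p-values, listed in the order fixed in the protocol.
example = fixed_sequence([0.01, 0.03, 0.20, 0.02])
print(example)
```

Note that the fourth hypothesis is not rejected even though its p-value is below 0.05, because testing stopped at the third. This is the price of testing every hypothesis at the unadjusted 5% level.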
Example
• Assume we performed N=5 tests of hypotheses simultaneously and
want the result to be at the level 0.05. The p-values obtained were

p(1)     p(2)     p(3)     p(4)     p(5)
0.009    0.011    0.012    0.134    0.512
• Bonferroni: 0.05/5=0.01. Since only p(1) is less
than 0.01 we reject H(1) but accept the
remaining hypotheses.
• Holm: p(1), p(2) and p(3) are less than 0.05/5,
0.05/4 and 0.05/3 respectively so we reject the
corresponding hypotheses H(1), H(2) and H(3).
But p(4) = 0.134 > 0.05/2=0.025 so we stop and
accept H(4) and H(5).
• Hochberg:
– 0.512 is not less than 0.05 so we accept H(5)
– 0.134 is not less than 0.025 so we accept H(4)
– 0.012 is less than 0.05/3 = 0.0167 so we reject H(1), H(2) and
H(3)
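The three procedures can be compared side by side on the five p-values of this example. The following is a compact sketch, assuming the p-values are tested at alpha = 0.05; the function name `count_rejections` is illustrative, and it reports only how many of the smallest p-values each procedure rejects.

```python
def count_rejections(p_values, alpha=0.05):
    """Number of hypotheses rejected by Bonferroni, Holm and Hochberg."""
    p = sorted(p_values)
    n = len(p)

    # Bonferroni: every p-value is compared with alpha / N.
    bonferroni = sum(x < alpha / n for x in p)

    # Holm: step down from the smallest p-value, stop at the first failure.
    holm = 0
    for j, x in enumerate(p, start=1):
        if x < alpha / (n - j + 1):
            holm = j
        else:
            break

    # Hochberg: step up from the largest p-value, stop at the first success.
    hochberg = 0
    for j in range(n, 0, -1):
        if p[j - 1] < alpha / (n - j + 1):
            hochberg = j
            break

    return {"Bonferroni": bonferroni, "Holm": holm, "Hochberg": hochberg}


result = count_rejections([0.009, 0.011, 0.012, 0.134, 0.512])
print(result)
```

On these p-values Bonferroni rejects one hypothesis while Holm and Hochberg each reject three, which agrees with the worked example.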
Questions or Comments?