Multiple pairwise comparisons procedures

ST 524
NCSU - Fall 2008
Mean Separation Procedures
1. Estimating and comparing means: LSMEANS, CONTRAST, ESTIMATE
2. Multiple comparisons procedures and multiple tests about means
 For data from experiments with unstructured treatments: variety trials, insecticide
screening trials.
 When treatments have structure, analysis should be done accordingly.
o Quantitative treatments: Increasing dosages of a quantitative factor: row
spacing, concentration of a pesticide, times of application, temperatures.
Analyze “dose-response” relationship. Fit “meaningful mathematical”
equation, if suitable.
o Factorial Experiments: Study the interaction of factors. Do not apply a multiple
comparison procedure to the complete set of treatment combination means.
 If the interaction effect is not significant, compare main effect means;
mean separation procedures may be used.
 If interaction is present, compare the levels of one factor within each level
of the other factor (or factor combination).
o Contrasts and pre-planned comparisons
 Use set of meaningful orthogonal contrasts to partition Treatment Sum
of Squares into (t-1) 1-df contrasts that may respond to specific
research questions.
 The overall F-test need not be significant when testing pre-planned
comparisons.
 Relevance of contrasts to research objectives is more important than
orthogonality.
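The partition of the Treatment Sum of Squares into (t-1) single-df contrasts can be checked numerically. The sketch below is only an illustration: the four treatment means, the replication r = 5, and the contrast coefficients are hypothetical values chosen for the example, and equal replication is assumed.

```python
from itertools import combinations

# Hypothetical data (not from the course): t = 4 treatment means,
# each based on r = 5 replications.
means = [20.0, 24.0, 26.0, 30.0]
r = 5

# An illustrative set of t - 1 = 3 contrasts.
contrasts = {
    "trt 1,2 vs trt 3,4": [1, 1, -1, -1],
    "trt 1 vs trt 2": [1, -1, 0, 0],
    "trt 3 vs trt 4": [0, 0, 1, -1],
}

def is_contrast(c):
    """Coefficients of a contrast sum to zero."""
    return sum(c) == 0

def are_orthogonal(c1, c2):
    """With equal replication, two contrasts are orthogonal when the
    sum of products of their coefficients is zero."""
    return sum(a * b for a, b in zip(c1, c2)) == 0

def contrast_ss(c, means, r):
    """1-df sum of squares for a contrast: r * (sum c_i*mean_i)^2 / sum c_i^2."""
    estimate = sum(ci * m for ci, m in zip(c, means))
    return r * estimate ** 2 / sum(ci ** 2 for ci in c)

# Treatment sum of squares computed directly from the means.
grand_mean = sum(means) / len(means)
treatment_ss = r * sum((m - grand_mean) ** 2 for m in means)

ss = {name: contrast_ss(c, means, r) for name, c in contrasts.items()}

print(all(is_contrast(c) for c in contrasts.values()))
print(all(are_orthogonal(a, b) for a, b in combinations(contrasts.values(), 2)))
for name, value in ss.items():
    print(name, value)
print("sum of contrast SS:", sum(ss.values()), "treatment SS:", treatment_ss)
```

For an orthogonal set, the contrast sums of squares add up to the Treatment Sum of Squares; a non-orthogonal set would not partition it this way.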
 For a two sample mean comparison test: $H_0: \mu_1 - \mu_2 = 0$ vs $H_1: \mu_1 - \mu_2 \neq 0$
o Type I error rate $\alpha$: risk of rejecting a true null hypothesis.
o Type II error rate $\beta$: risk of not rejecting a false null hypothesis.
 Multiple comparisons among the set of treatment means.
o Set of p treatment means: p – 1 independent (orthogonal) comparisons available
o Pairwise comparisons: $p(p-1)/2$
o $\alpha_C$: Comparison-wise error rate: P(difference is significant | $H_0$ is true),
proportion of differences erroneously considered significant.
Thursday October 16, 2008
“… a hodgepodge treatment set often suggests that the experimental objectives were not well thought out.”
(Swallow, 1984)
o $\alpha_E$: Experiment-wise error rate: proportion of experiments in which one or
more differences are considered significant when in reality they are not
different.
Note: when each comparison is tested at a fixed comparison-wise rate, the
probability of wrongly finding at least one difference to be significant
increases as the number of means increases.
o For independent (orthogonal) pairwise tests: suppose the experiment has four
treatments, so there are n = 3 independent comparisons.
 If $\alpha_C = 0.05$, then $\alpha_E = 1 - (1 - \alpha_C)^n = 1 - 0.95^3 = 0.1426$.
 If $\alpha_E = 0.05$, then $\alpha_C = 1 - (1 - \alpha_E)^{1/n} = 1 - 0.95^{1/3} = 0.0170$.
Note: when the experiment-wise rate is held fixed, the power to detect real
differences decreases as the number of means increases.
 Control Type I Error rates because of multiple testing: strategies to control the
family-wise error rate (significance level $\alpha$).
o Liberal test: high sensitivity (low $\beta$ or high power) and high $\alpha$
o Conservative test: low sensitivity (high $\beta$ or low power) and low $\alpha$

 Fisher’s protected Least Significant Difference (Fisher’s LSD)
o Comparison-wise error rate: P(difference is significant | $H_0$ is true),
proportion of differences erroneously considered significant.
 Tukey’s Honestly Significant Difference (Tukey’s HSD)
o Experiment-wise error rate: proportion of experiments in which one or more
differences are considered significant when in reality they are not different.
 Student-Newman-Keuls (SNK)
o For means adjacent in rank: comparison-wise error rate (as in LSD)
o For the means farthest apart in rank: experiment-wise error rate (as in Tukey’s HSD)
 Waller and Duncan Bayes LSD (Waller-Duncan)
o Does not use the significance level $\alpha$ to determine the test statistic.
o Significance level is related to k, a factor that measures the ratio of the cost of
a Type I Error to the cost of a Type II Error.
o Takes the magnitude of the ANOVA F value into account: for a low F,
Waller-Duncan behaves like an experiment-wise test (conservative; the ANOVA
suggests small differences, if any). For a larger F (> 4), Waller-Duncan
behaves like a comparison-wise test (liberal; the ANOVA indicates a greater
likelihood of real differences).
o Power does not depend on number of means being compared.
 Sidak test:
for $\alpha_E = 0.05$ and 3 comparisons, use $\alpha_C = 1 - (1 - 0.05)^{1/3} = 0.0170$
 Bonferroni test:
for $\alpha_E = 0.05$ and 3 comparisons, use $\alpha_C = 0.05/3 = 0.0167$
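Instead of lowering the per-comparison level, the same two corrections can be applied to the p-values themselves: comparing an adjusted p-value with the experiment-wise level is equivalent to comparing the raw p-value with the adjusted per-comparison level. The sketch below assumes three comparisons; the raw p-values are hypothetical and the function names are made up for the illustration.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: m * p, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

def sidak_adjust(p_values):
    """Sidak-adjusted p-values: 1 - (1 - p)^m."""
    m = len(p_values)
    return [1 - (1 - p) ** m for p in p_values]

# Hypothetical raw p-values from 3 comparisons (not real data).
raw = [0.004, 0.020, 0.300]

bon = bonferroni_adjust(raw)
sid = sidak_adjust(raw)

for p, b, s in zip(raw, bon, sid):
    # Declare a difference significant when the adjusted p-value is <= 0.05.
    print(f"raw={p:.3f}  Bonferroni={b:.4f}  Sidak={s:.4f}  "
          f"significant (Bonferroni)={b <= 0.05}  significant (Sidak)={s <= 0.05}")
```

The Sidak-adjusted value is never larger than the Bonferroni-adjusted one, which matches the slightly larger per-comparison level it allows (0.0170 versus 0.0167).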
 Pairwise comparisons (LSMEANS/DIFF, ESTIMATE)
 Family-wise Type I Error rates control: LSMEANS/ADJUST=
Carmer, S.G. and M.R. Swanson. "An Evaluation of Ten Pairwise Multiple Comparison
Procedures by Monte Carlo Methods." Journal of the American Statistical Association 68:66-74. 1973.