analysis of variance - LISA (Laboratory for Interdisciplinary Statistical

ANALYSIS OF VARIANCE
Jennifer Kensler
ONE-WAY ANOVA
ANOVA is used to determine whether three or more populations have different means.
[Figure: response distributions by Medical Treatment (A, B, C)]
ANOVA STRATEGY
The first step is to use the ANOVA F test to determine if there are any significant differences among means.
If the ANOVA F test shows that the means are not all the same, then follow-up tests can be performed to see which pairs of means differ.
ANOVA ASSUMPTIONS
The samples are random and independent of each other.
The populations are normally distributed.
The populations all have the same variance.
The ANOVA F test is robust to moderate departures from the assumptions of normality and equal variances.
THE ANOVA MODEL
The one-way ANOVA is a linear model.
ANOVA can be formulated as a regression model.
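As an illustration of the regression formulation, here is a sketch for the three treatments A, B and C. The indicator variables $x_B$, $x_C$ and the coefficient names are notation assumed here, not taken from the slides, and treatment A is arbitrarily chosen as the reference level.

```latex
y = \beta_0 + \beta_1 x_B + \beta_2 x_C + \varepsilon,
\qquad
x_B = \begin{cases} 1 & \text{treatment B} \\ 0 & \text{otherwise} \end{cases},
\quad
x_C = \begin{cases} 1 & \text{treatment C} \\ 0 & \text{otherwise} \end{cases}
```

Under this coding, $\beta_0$ is the mean for treatment A, $\beta_1$ is the difference between the B and A means, and $\beta_2$ is the difference between the C and A means. Testing $\beta_1 = \beta_2 = 0$ is then the same as testing that all three means are equal.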
ONE-WAY ANOVA MODEL
In other words, for each group the observed
value is the group mean plus some random
variation.
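The model equation from the original slide did not survive extraction. A common way to write the statement above, with symbols assumed here ($y_{ij}$ for observation $j$ in group $i$, $\mu_i$ for the mean of group $i$, $\varepsilon_{ij}$ for the random variation), is:

```latex
y_{ij} = \mu_i + \varepsilon_{ij},
\qquad
\varepsilon_{ij} \sim N(0, \sigma^2) \text{ independently}
```

The normal, equal-variance error term restates the ANOVA assumptions of normality and a common variance across groups.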
ONE-WAY ANOVA HYPOTHESIS
  We
test whether there is a difference in the
means.
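The hypotheses themselves are missing from the extracted slide. In the usual notation, assumed here, with $\mu_1, \dots, \mu_k$ denoting the $k$ group means, they would read:

```latex
H_0 : \mu_1 = \mu_2 = \cdots = \mu_k
\qquad \text{versus} \qquad
H_a : \text{not all of the } \mu_i \text{ are equal}
```

Note that the alternative does not say that every mean differs, only that at least one differs from the others.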
ANOVA F TEST
[Figure: two panels of response distributions by Medical Treatment (A, B, C)]
Compare the variation within the samples to the
variation between the samples.
ANOVA TEST STATISTIC
Variation within groups small compared with variation between groups → Large F
Variation within groups large compared with variation between groups → Small F
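The formula on this slide was lost in extraction. The standard form of the statistic is the ratio of the mean square for groups to the mean square error; the symbols $k$ (number of groups) and $n$ (total number of observations) are notation assumed here.

```latex
F = \frac{\text{MSG}}{\text{MSE}},
\qquad
F \sim F_{k-1,\; n-k} \ \text{under } H_0
```

A large F therefore means that the group averages vary more than the within-group variation alone would explain.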
MSG
The mean square for groups, MSG, measures the variability of the sample averages.
SSG stands for the sum of squares for groups.
MSE
Mean square error, MSE, measures the variability within the groups.
SSE stands for the sum of squares for error.
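To make these quantities concrete, here is a minimal sketch in plain Python that computes them from the usual textbook formulas, MSG = SSG / (k - 1) and MSE = SSE / (n - k). The function name `one_way_anova_f` and the tiny data set are made up for illustration; they are not part of the presentation or of JMP.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (MSG, MSE, F) for a list of samples, one list per group."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]

    # SSG: squared distance of each group mean from the grand mean,
    # weighted by the group size.
    ssg = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSE: squared distance of each observation from its own group mean.
    sse = sum((y - m) ** 2
              for g, m in zip(groups, group_means)
              for y in g)

    msg = ssg / (k - 1)   # mean square for groups
    mse = sse / (n - k)   # mean square error
    return msg, mse, msg / mse


# Hypothetical data: three observations per treatment.
a = [4.0, 5.0, 6.0]
b = [6.0, 7.0, 8.0]
c = [8.0, 9.0, 10.0]
msg, mse, f_stat = one_way_anova_f([a, b, c])
print(msg, mse, f_stat)  # prints: 12.0 1.0 12.0
```

Here the group means (5, 7, 9) are far apart relative to the spread inside each group, so F is large. Turning F into a p-value requires the F distribution with k - 1 and n - k degrees of freedom, which this sketch leaves to a statistics package.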
EXAMPLE 1
We would like to determine if there is a difference in a health index depending on which medical treatment (A, B or C) is used.
150 patients are randomly assigned to a treatment (50 people in each treatment).
JMP demonstration
Analyze → Fit Y By X
Y, Response: Health Index
X, Factor: Treatment
EXAMPLE 1: JMP OUTPUT
FOLLOW-UP TEST
The p-value of the overall F test indicates that the mean health index is not the same for all treatments.
We would like to know which pairs of treatments are different.
One method is to use Tukey’s HSD (honestly significant difference).
TUKEY TESTS
Tukey’s test simultaneously tests for a difference in means for all pairs of factor levels.
Tukey’s HSD controls the overall (familywise) type I error rate.
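The pairwise hypotheses and decision rule are not in the extracted text. As a sketch in assumed notation ($\bar{y}_i$ and $n_i$ for the sample mean and size of group $i$, $q_{\alpha;\,k,\,n-k}$ for the upper-$\alpha$ critical value of the studentized range distribution), the Tukey-Kramer form of the rule declares groups $i$ and $j$ different when:

```latex
|\bar{y}_i - \bar{y}_j| \;>\; q_{\alpha;\,k,\,n-k}\,
\sqrt{\frac{\text{MSE}}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
```

With equal group sizes $n_i = n_j = m$, the right-hand side reduces to $q_{\alpha;\,k,\,n-k}\sqrt{\text{MSE}/m}$. Because a single critical value is used for every pair, the chance of at least one false positive across all comparisons stays at or below $\alpha$.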
JMP demonstration
Oneway Analysis of Health Index By Treatment → Compare Means → All Pairs, Tukey HSD
JMP OUTPUT
The JMP output shows that all pairs of treatments are significantly different from one another.
ANALYSIS OF COVARIANCE
(ANCOVA)
Covariates are variables that may affect the response but cannot be controlled.
Covariates are not of primary interest to the researcher.
We will look at an example with two covariates: the model adds one term for each covariate to the one-way ANOVA model.
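The equation on the original slide is not recoverable from the extracted text. One common way to write a two-covariate ANCOVA model, in notation assumed here ($x_{1,ij}$ and $x_{2,ij}$ for the covariate values of observation $j$ in group $i$, $\beta_1$ and $\beta_2$ for their slopes), is:

```latex
y_{ij} = \mu_i + \beta_1 x_{1,ij} + \beta_2 x_{2,ij} + \varepsilon_{ij},
\qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

This form assumes the covariate slopes are the same in every group, so the group terms $\mu_i$ are compared after adjusting for the covariates.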
EXAMPLE 2
Consider the previous example where we tested whether the health index was different depending on the treatment. Age and gender may also influence the health index, so we can use them as covariates.
JMP demonstration
Analyze → Fit Model
Y: Health Index
Add: Treatment, Age, Gender
JMP OUTPUT
CONCLUSION
ANOVA and ANCOVA methods allow us to determine whether the means of several groups differ significantly.
For information about using SAS and SPSS to do ANOVA:
http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm