Analysis of variance (ANOVA)

Purpose: to test for statistically significant differences among two or more
population means.
H0: all of the population means are equal.
Ha: not all of the population means are equal.
Rejection of H0 means that there is statistically significant evidence that at
least two of the population means differ.
Estimation (point and interval) of population means is also possible.
One-way ANOVA (completely randomized design)
Types of variation -- variation is measured by sums of squared deviations.
Between-group variation -- signal-related
Measured by "SST" -- sum of squared deviations for treatments
(groups).
Within-group variation -- noise-related
Measured by "SSE" -- sum of squared deviations for error.
Total variation
Measured by "TSS" -- total sum of squared deviations.
TSS = SST + SSE.
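The partition above can be checked numerically. The following is a minimal sketch, assuming three made-up groups of three observations each; the data and the function name sums_of_squares are illustrative only and are not part of the original notes.

```python
def sums_of_squares(groups):
    """Return (SST, SSE, TSS) for a list of samples, one list per group."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    means = [sum(g) / len(g) for g in groups]
    # Between-group variation: group means around the grand mean,
    # weighted by group size.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: observations around their own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Total variation: observations around the grand mean.
    tss = sum((x - grand_mean) ** 2 for x in all_obs)
    return sst, sse, tss


# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
sst, sse, tss = sums_of_squares(groups)
print(sst, sse, tss)  # 24.0 6.0 30.0
```

Here SST + SSE = 24 + 6 = 30 = TSS, as the identity requires.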
Degrees of freedom in ANOVA:
d.f. for treatments -- the "numerator" d.f.
d.f. = (no. of treatments - 1)
d.f. for error -- the "denominator" d.f.
d.f. = (total no. of observations - no. of treatments)
overall d.f. -- sum of the other two d.f.'s, also
d.f. = (total no. of observations - 1)
SST and SSE are divided by the appropriate number of degrees of freedom,
yielding MST (mean of squared deviations for treatments) and MSE (mean of
squared deviations for error), respectively.
MST = SST / numerator d.f.
MSE = SSE / denominator d.f.
MST is the "signal."
MSE is the "noise."
The larger the MST relative to the MSE, the stronger the evidence against H0.
Fc, the test statistic (the calculated F), is the ratio of MST to MSE:
Fc = MST / MSE.
Ft, the table F, is based on α and the numerator and denominator degrees of freedom.
Reject H0 if Fc > Ft.
Equivalently, reject H0 if p < α.
ANOVA table: a standardized way of presenting the results of ANOVA computations.
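As an illustration of the one-way computations and the table layout, here is a minimal sketch using only the Python standard library. The data are made up, the function name one_way_anova is hypothetical, and the table F of 5.14 is the standard F-table value for α = 0.05 with 2 and 6 degrees of freedom; a different design needs a different table value.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples."""
    k = len(groups)                      # no. of treatments
    n = sum(len(g) for g in groups)      # total no. of observations
    grand_mean = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_t, df_e = k - 1, n - k            # numerator and denominator d.f.
    mst, mse = sst / df_t, sse / df_e
    return {"df_t": df_t, "df_e": df_e, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "Fc": mst / mse}


# Hypothetical data: three treatments, three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
F_TABLE = 5.14  # assumed: table F for alpha = 0.05, d.f. = (2, 6)

a = one_way_anova(groups)
reject_h0 = a["Fc"] > F_TABLE

# Conventional ANOVA table layout: source, d.f., SS, MS, F.
print(f"{'Source':<12}{'d.f.':>6}{'SS':>8}{'MS':>8}{'F':>8}")
print(f"{'Treatments':<12}{a['df_t']:>6}{a['SST']:>8.2f}{a['MST']:>8.2f}{a['Fc']:>8.2f}")
print(f"{'Error':<12}{a['df_e']:>6}{a['SSE']:>8.2f}{a['MSE']:>8.2f}")
print(f"{'Total':<12}{a['df_t'] + a['df_e']:>6}{a['SST'] + a['SSE']:>8.2f}")
print("Reject H0:", reject_h0)
```

With these numbers MST = 12, MSE = 1 and Fc = 12, which exceeds the table value, so H0 is rejected.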
When there are only two groups and a t-test could be used, the Fc will be equal to
the square of the tc.
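The relationship between Fc and tc can be verified on a small example. This is a sketch under the assumption that the t-test in question is the pooled-variance (equal variances) two-sample t-test; the data and function names are made up for illustration.

```python
import math


def f_statistic(a, b):
    """One-way ANOVA Fc for exactly two groups."""
    n = len(a) + len(b)
    grand_mean = (sum(a) + sum(b)) / n
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sst = len(a) * (mean_a - grand_mean) ** 2 + len(b) * (mean_b - grand_mean) ** 2
    sse = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return (sst / 1) / (sse / (n - 2))   # d.f. = 1 and n - 2


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    std_err = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean_a - mean_b) / std_err


# Hypothetical data for two groups.
a, b = [4, 5, 6], [8, 9, 10]
fc = f_statistic(a, b)
tc = pooled_t(a, b)
print(fc, tc ** 2)  # both are 24 up to floating-point rounding
```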
ANOVA estimation -- single population mean and difference between two population
means
Similar to z and t procedures.
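tt in the following equations is based on the number of degrees of freedom for
error.
single population mean:
$\mu = \bar{X} \pm t_t \, \hat{\sigma}_{\bar{X}}$
where
$\hat{\sigma}_{\bar{X}} = \sqrt{\mathrm{MSE} / n}$
difference between two population means:
$(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_t \, \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$
where
$\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\mathrm{MSE} \times \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$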
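The two interval formulas can be sketched in a few lines of Python. The inputs here are assumptions for illustration: MSE = 1 with 6 error degrees of freedom, sample means of 5 and 9 from groups of size 3, and a table t of 2.447 (the standard two-sided 95% value for 6 d.f.). The function names are hypothetical.

```python
import math


def ci_single_mean(xbar, mse, n, t_table):
    """Interval for one population mean: xbar +/- t * sqrt(MSE / n)."""
    half_width = t_table * math.sqrt(mse / n)
    return xbar - half_width, xbar + half_width


def ci_difference(xbar1, xbar2, mse, n1, n2, t_table):
    """Interval for mu1 - mu2: (xbar1 - xbar2) +/- t * sqrt(MSE * (1/n1 + 1/n2))."""
    half_width = t_table * math.sqrt(mse * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


MSE = 1.0        # assumed: taken from a one-way ANOVA with 6 error d.f.
T_TABLE = 2.447  # assumed: table t for 95% confidence, 6 d.f.

mean_ci = ci_single_mean(5.0, MSE, 3, T_TABLE)
diff_ci = ci_difference(5.0, 9.0, MSE, 3, 3, T_TABLE)
print(mean_ci)  # roughly (3.587, 6.413)
print(diff_ci)  # roughly (-5.998, -2.002)
```

Because the interval for the difference does not contain zero, it agrees with rejecting equality of those two population means at the same confidence level.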
Assumptions -- same as t-tests
Samples: random, independent
Populations: normal, equal variances
Two-way ANOVA (randomized block design)
Two independent variables -- treatments and blocks.
Two between-group variations -- signal-related:
SST & SSB (SSB = sum of squared deviations for blocks).
TSS = SST + SSB + SSE.
Degrees of freedom:
d.f. for treatments -- "numerator" d.f.
d.f. = (no. of treatments - 1)
d.f. for blocks -- "numerator" d.f.
d.f. = (no. of blocks - 1)
d.f. for error -- "denominator" d.f.
d.f. = (total no. observations - no. treatments - no. blocks + 1)
overall d.f. -- sum of the other three d.f.'s, also
d.f. = (total no. of observations - 1)
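As a quick check that the three d.f. values sum to the overall d.f., write t for the number of treatments, b for the number of blocks, and n for the total number of observations (these symbols are introduced here for illustration). The second line assumes one observation per treatment-block combination, so that n = tb.

```latex
(t - 1) + (b - 1) + (n - t - b + 1) = n - 1
% with n = tb, the error d.f. factor as:
n - t - b + 1 = tb - t - b + 1 = (t - 1)(b - 1)
% example: t = 3, b = 4, n = 12 gives 2 + 3 + 6 = 11 = n - 1
```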
SSB is divided by d.f. for blocks, yielding MSB
(mean of squared deviations for blocks).
MSB = SSB / d.f. for blocks
MST & MSB are both "signals."
MSE is "noise."
Two hypothesis tests (one for treatments, one for blocks), two calculated F's, two
table F's, two p-values.
Fc treatments = MST / MSE.
Fc blocks = MSB / MSE.
Reject each H0 if Fc > Ft.
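The randomized block computations can be sketched as follows, assuming one observation per treatment-block combination. The data, the function name randomized_block_anova, and the table F of 6.94 (the standard F-table value for α = 0.05 with 2 and 4 degrees of freedom) are assumptions made for this example.

```python
def randomized_block_anova(table):
    """table[i][j] is the observation for block i and treatment j."""
    b, t = len(table), len(table[0])     # no. of blocks, no. of treatments
    n = b * t
    grand = sum(sum(row) for row in table) / n
    block_means = [sum(row) / t for row in table]
    treat_means = [sum(row[j] for row in table) / b for j in range(t)]
    sst = b * sum((m - grand) ** 2 for m in treat_means)
    ssb = t * sum((m - grand) ** 2 for m in block_means)
    # Error: what is left after removing treatment and block effects.
    sse = sum((table[i][j] - treat_means[j] - block_means[i] + grand) ** 2
              for i in range(b) for j in range(t))
    df_t, df_b, df_e = t - 1, b - 1, n - t - b + 1
    mst, msb, mse = sst / df_t, ssb / df_b, sse / df_e
    return {"SST": sst, "SSB": ssb, "SSE": sse, "df_e": df_e,
            "F_treatments": mst / mse, "F_blocks": msb / mse}


# Hypothetical data: rows are blocks, columns are treatments.
table = [
    [4, 6, 8],
    [5, 8, 8],
    [6, 7, 11],
]
F_TABLE = 6.94  # assumed: table F for alpha = 0.05, d.f. = (2, 4), used for both tests

r = randomized_block_anova(table)
print(r["F_treatments"], r["F_treatments"] > F_TABLE)  # 12.0 True
print(r["F_blocks"], r["F_blocks"] > F_TABLE)          # 3.0 False
```

In this example the treatment effect is significant while the block effect is not, which shows that the two tests are evaluated separately against their own table values (here the same, since both numerator d.f. equal 2).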
Two-way ANOVA estimation -- valid only for differences between population means.
Confidence intervals cannot be obtained for individual treatment means.
tt in the following equation is based on the number of degrees of freedom for error.
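Difference between two population means:
$(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_t \, \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$
where
$\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\mathrm{MSE} \times \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$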
Assumptions -- same as t-tests
Samples: random, independent
Populations: normal, equal variances
Three-way analysis of variance
"Latin Square" design