ANALYSIS OF VARIANCE

One-Way ANOVA
Assumptions:
1. Each of the k population or group distributions is normal. Check with a Normal Quantile Plot (or boxplot) of each group.
2. These distributions have identical variances (standard deviations). Check the sample sds: the assumption is in doubt if the largest sd is > 2 times the smallest sd.
3. Each of the k samples is a random sample.
4. Each of the k samples is selected independently of one another.
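The rule of thumb for assumption 2 can be checked with a few lines of Python. This is only a sketch: the function name `sd_ratio_check`, the cutoff parameter `max_ratio`, and both data sets are made up for illustration.

```python
from statistics import stdev

def sd_ratio_check(groups, max_ratio=2.0):
    """Return (ratio, ok), where ratio = largest sample sd / smallest sample sd.

    ok is True when the ratio does not exceed max_ratio (the rule of thumb).
    Assumes every group has at least 2 observations and a nonzero sd.
    """
    sds = [stdev(g) for g in groups]
    ratio = max(sds) / min(sds)
    return ratio, ratio <= max_ratio

# Hypothetical data: three groups with the same spread.
similar = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
# Hypothetical data: the third group is three times as spread out.
unequal = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [6.0, 9.0, 12.0]]

print(sd_ratio_check(similar))  # ratio near 1, rule of thumb satisfied
print(sd_ratio_check(unequal))  # ratio near 3, equal-variance assumption in doubt
```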
H0: μ_1 = μ_2 = . . . = μ_k (no effect) vs. HA: not all k means are equal (the effect of the ‘treatment’ is significant)
These will always be the hypotheses; the only difference is the number of groups, k.
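ANOVA Table:
Source            degrees of freedom   Sum of Squares                  Mean Squares         F value         p-value
Group (between)   k-1                  SSG = Σ n_i (x̄_i - x̄)^2         MSG = SSG/(k-1)      F* = MSG/MSE    Pr(F_{k-1,N-k} > F*)
Error (within)    N-k                  SSE = Σ (n_i - 1) s_i^2         MSE = SSE/(N-k)
Total             N-1                  SSTot = ΣΣ (x_ij - x̄)^2         MST = SSTot/(N-1)
Here x̄_i and s_i are the sample mean and sd of group i, x̄ is the mean of all N observations, x_ij is observation j in group i, and F* is the observed F value.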
N = total number of observations = Σ n_i, where n_i = number of observations for group i
The F test statistic has two different degrees of freedom: the numerator = k-1,
and the denominator = N-k, written F_{k-1,N-k}
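As a rough illustration of how the table entries fit together, the sketch below computes each one from raw data using only the Python standard library. The function name `one_way_anova` and the three small groups are hypothetical; the p-value column is left out because the standard library has no F distribution.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Compute the one-way ANOVA table entries for a list of samples."""
    k = len(groups)                       # number of groups
    sizes = [len(g) for g in groups]      # n_i
    N = sum(sizes)                        # total number of observations
    grand_mean = sum(x for g in groups for x in g) / N

    # Between-group, within-group, and total sums of squares.
    ssg = sum(n_i * (mean(g) - grand_mean) ** 2 for n_i, g in zip(sizes, groups))
    sse = sum((n_i - 1) * variance(g) for n_i, g in zip(sizes, groups))
    sstot = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msg = ssg / (k - 1)
    mse = sse / (N - k)
    return {
        "df_group": k - 1, "df_error": N - k, "df_total": N - 1,
        "SSG": ssg, "SSE": sse, "SSTot": sstot,
        "MSG": msg, "MSE": mse, "MST": sstot / (N - 1),
        "F": msg / mse,
    }

# Hypothetical data: k = 3 groups of 3 observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
for name, value in one_way_anova(groups).items():
    print(f"{name:>8}: {value}")
```

For these made-up numbers the group means are 5, 7 and 9 around a grand mean of 7, so SSG = 24, SSE = 6 and F = 12 on 2 and 6 degrees of freedom.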
NOTE: SSE/(N-k) = MSE = s_p^2 (pooled sample variance) = [(n_1 - 1)s_1^2 + ... + (n_k - 1)s_k^2] / [(n_1 - 1) + ... + (n_k - 1)] = σ̂^2 = estimate for the assumed equal variance; this is the ‘average’ variance for each group.
SSTot/(N-1) = MST = s^2 = the total variance of the data (assuming NO groups)
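The two variances in this note can be computed directly. The sketch below uses a hypothetical helper `pooled_variance` and made-up data: it weights each group's sample variance by n_i - 1 to get the pooled variance (MSE), then treats all observations as a single sample to get the total variance (MST).

```python
from statistics import variance

def pooled_variance(groups):
    """Weighted average of the group variances, with weights n_i - 1."""
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)   # equals N - k
    return numerator / denominator

# Hypothetical data: k = 3 groups of 3 observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
all_data = [x for g in groups for x in g]

s_p2 = pooled_variance(groups)   # MSE: 'average' variance within a group
s2_total = variance(all_data)    # MST: variance ignoring the groups

print("pooled variance (MSE):", s_p2)
print("total variance (MST): ", s2_total)
```

With these numbers each group has variance 1, so the pooled variance is 1, while the total variance is 3.75 because it also picks up the differences between the group means.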
F compares the variance of the (between) sample means to the ~average variance of the data within each group. The larger the F (the smaller the p-value), the more varied the means are and the stronger the evidence against H0. We reject H0 when the p-value < α.
R^2 = proportion of the total variation explained by the difference in means = SSG/SSTot
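A small sketch of the R^2 calculation, again with a hypothetical function name (`r_squared`) and made-up data. Because SSG + SSE = SSTot, the remaining share 1 - R^2 is SSE/SSTot, the part of the variation left within the groups.

```python
def r_squared(groups):
    """Return SSG / SSTot for a list of samples."""
    N = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / N
    # Between-group sum of squares: n_i * (group mean - grand mean)^2.
    ssg = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Total sum of squares: every observation against the grand mean.
    sstot = sum((x - grand_mean) ** 2 for g in groups for x in g)
    return ssg / sstot

# Hypothetical data: k = 3 groups of 3 observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
print("R^2 =", r_squared(groups))
```

Here SSG = 24 and SSTot = 30, so about 80% of the total variation is explained by the difference in group means.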
One-way ANOVA F-test
Hypotheses: H0: μ_1 = μ_2 = … = μ_k (there is no effect on the means due to the categorical variable)
HA: the means are NOT all equal (the effect due to the categorical variable is stat sig)
F test statistic has 2 dfs: k-1 and N-k, where k is the number of groups (means being tested) and N is the total number of obs
Assumptions:
1. Each of the k population or group distributions is normal. Check with a Normal Quantile Plot (or boxplot) of each group.
2. These distributions have identical variances (standard deviations). Check whether the largest sd is > 2 times the smallest sd (if so, the assumption is in doubt), or use
Levene’s test. (MSE = pooled variance and is the estimate of the true variance within each group.)
3. Each of the k samples is a random sample.
4. Each of the k samples is selected independently of one another.
These assumptions are exactly the same as those of the pooled t-test.
Type I error: We claim that there is a stat sig effect when there actually is not one. We claim the means are not all the same
when actually they are.
Type II error: We fail to prove there is a stat sig effect even though it does exist. We fail to prove the means are not all the
same even though they are not all the same.
p-value interpretation: How often we would see at least this strong of an effect when there actually is not one (that is, when H0 is true).
Conclusion: If p-value < α, we reject H0 and conclude that there is a statistically significant effect.
If p-value is NOT < α, we fail to reject H0 and cannot prove that there is a statistically significant effect.
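The decision rule can be sketched in Python. In general the p-value needs an F-distribution routine from a statistics package (for example `scipy.stats.f.sf`). To stay within the standard library, this sketch covers only the special case of k = 3 groups, where the numerator df is 2 and the tail probability has the closed form (1 + 2F/(N-k))^(-(N-k)/2). The function names, the F value, and the choice α = 0.05 are assumptions made for illustration.

```python
def f_tail_prob_df2(f_value, denom_df):
    """Pr(F_{2, denom_df} > f_value). Valid ONLY when the numerator df is 2 (k = 3)."""
    return (1.0 + 2.0 * f_value / denom_df) ** (-denom_df / 2.0)

def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0: statistically significant effect"
    return "fail to reject H0: cannot prove a statistically significant effect"

# Hypothetical result: F = 12 from k = 3 groups and N = 9 observations (df = 2 and 6).
p = f_tail_prob_df2(12.0, 6)
print("p-value:", p)
print(decide(p, alpha=0.05))
```

For F = 12 on 2 and 6 degrees of freedom the p-value works out to 5^(-3), about 0.008, so H0 would be rejected at α = 0.05.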