A) One-way ANOVA with Polynomial Effects

Statistics 512
Study Guide 2
Spring 2002
1. A substantial percentage of the potatoes raised in this country never have a chance to reach the table.
Instead they fall victim to potato rot while being stored for later use. To find out what could be done to
reduce this loss, an experiment was carried out at the University of Wisconsin. Potatoes were injected
with bacteria known to cause rot, then stored. After 5 days the diameter of the rotted portion on each
potato was measured (in millimeters).
The levels of injection were low, medium and high, with 18 potatoes receiving each level of treatment.
The ANOVA table is below:
Analysis of Variance For: rot
Source   df   Sum of Squares   Mean Square   F-ratio   Prob
bact      2       651.815        325.907      8.0873   0.0009
Error    51      2055.22          40.2985
Total    53      2707.04

Summary statistics for: low
NumNumeric = 18
Mean = 5.2778
Standard Deviation = 4.1981

Summary statistics for: medium
NumNumeric = 18
Mean = 9.1667
Standard Deviation = 6.6177

Summary statistics for: high
NumNumeric = 18
Mean = 13.778
Standard Deviation = 7.7121

Normal probability plots were drawn for each group, and the data appear to be normally distributed.
a) Does the level of bacteria injected have an effect on the diameter of rot in the potato?
b) Does the amount of rot at the low level differ significantly from amount at the medium level?
c) Does the amount of rot at the high level differ significantly from the amount at the other two levels?
d) Are the two contrasts above orthogonal?
e) What are the sums of squares for the contrasts? What is the sum of the two sums of squares?
f) The investigator wasn't sure of the exact inoculum level in the injections. However, the levels were
determined by dilution of a liquid medium containing the bacteria. Therefore, she was fairly certain that
the medium level is about 2 times the low level, and the high level is about 4 times the low level. How
can she test whether the relationship between inoculum level and rot diameter is linear?
g) What is the highest degree polynomial that can be fitted to this data?
h) What is the difference in this case between fitting a linear regression and testing the lack of fit statistic,
and fitting a quadratic regression and determining if the quadratic term is significant?
i) Below is the regression output from regressing rot diameter on inoculum level (1, 2 or 4). Compute the
lack of fit sum of squares. Is there evidence of lack of fit?
Dependent variable is: rot
R^2 = 23.6%   R^2(adjusted) = 22.1%
s = 6.306 with 54 - 2 = 52 degrees of freedom

Source       Sum of Squares   df   Mean Square   F-ratio
Regression       638.922       1     639           16.1
Residual        2068.12       52      39.7714

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant     2.97222        1.821          1.63
level        2.75794        0.6881         4.01
Solutions
1. a. The null hypothesis is either $\beta_1 = \beta_2 = 0$ (regression on two indicator variables), or $\mu_1 = \mu_2 = \mu_3$ (cell means), or $\tau_1 = \tau_2 = \tau_3$ (treatment effects). In any case, the
question is answered by the F-test in the ANOVA table.
F* = 8.0873, and under the null hypothesis it should be compared to an F(2, 51) distribution. The P-value is 0.0009. There appears to be a highly significant effect.
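
The ANOVA table can be cross-checked from the group summaries alone. The sketch below is illustrative only: it assumes the printed means and standard deviations (which are rounded), so the results should agree with the table only to about two decimal places. The function name `anova_from_summaries` is made up for this example.

```python
from math import fsum

# Group summaries copied from the output above: (n, mean, standard deviation).
GROUPS = {
    "low": (18, 5.2778, 4.1981),
    "medium": (18, 9.1667, 6.6177),
    "high": (18, 13.778, 7.7121),
}


def anova_from_summaries(groups):
    """Return (ss_treatment, ss_error, df_treatment, df_error, f_ratio)."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = fsum(n * m for n, m, _ in groups.values()) / n_total
    # Between-group sum of squares: n_i * (group mean - grand mean)^2.
    ss_treatment = fsum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-group sum of squares: (n_i - 1) * s_i^2.
    ss_error = fsum((n - 1) * s ** 2 for n, _, s in groups.values())
    df_treatment = len(groups) - 1
    df_error = n_total - len(groups)
    f_ratio = (ss_treatment / df_treatment) / (ss_error / df_error)
    return ss_treatment, ss_error, df_treatment, df_error, f_ratio


ss_tr, ss_err, df_tr, df_err, f_star = anova_from_summaries(GROUPS)
print(f"SS(bact)  = {ss_tr:.2f} on {df_tr} df")
print(f"SS(Error) = {ss_err:.2f} on {df_err} df")
print(f"F*        = {f_star:.3f}")
```

The P-value still has to come from an F(2, 51) table or a statistics package; this sketch only rebuilds the sums of squares and the F-ratio.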
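b. $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$.
This can most readily be determined by a contrast:
$$t^* = \frac{\bar{y}_1 - \bar{y}_2}{\mathrm{s.e.}(\bar{y}_1 - \bar{y}_2)} = \frac{5.2778 - 9.1667}{\sqrt{40.2985\left(\frac{1}{18} + \frac{1}{18}\right)}} = -1.84$$
t(.95, 51) = 1.68
t(.975, 51) = 2.00
The difference between the low and medium levels is marginally significant: |t*| exceeds 1.68 but not 2.00, so the two-sided P-value lies between 0.05 and 0.10.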
1+1
- 3 = 0
. Ho: 2
versus not equal.
This can most readily be determined by a contrast:
y1 +y2
5.2778 + 9.1667 - 13.778
-y3
2
2
y1 +y2
40.2985( 1 + 1 + 1 ) -6.55575
s.e.(
-y3 )
*
2
4*18 4*18 18 = 1.8325 = -3.577
t=
=
The contrast is highly significant.
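
Both contrast tests follow the same recipe: estimate the contrast from the group means, then divide by its standard error, which uses the MSE from the ANOVA table. A minimal sketch, assuming the rounded means printed above; `contrast_t` is a hypothetical helper name, not part of any package.

```python
from math import sqrt

MEANS = (5.2778, 9.1667, 13.778)   # low, medium, high
N_PER_GROUP = (18, 18, 18)
MSE = 40.2985                      # error mean square on 51 df


def contrast_t(coefs, means=MEANS, ns=N_PER_GROUP, mse=MSE):
    """Return (estimate, standard error, t-ratio) for a contrast of group means."""
    estimate = sum(c * m for c, m in zip(coefs, means))
    std_err = sqrt(mse * sum(c * c / n for c, n in zip(coefs, ns)))
    return estimate, std_err, estimate / std_err


# Part b: low versus medium.
est_b, se_b, t_b = contrast_t((1, -1, 0))
# Part c: average of low and medium versus high.
est_c, se_c, t_c = contrast_t((0.5, 0.5, -1))

print(f"b: estimate = {est_b:.4f}, s.e. = {se_b:.4f}, t* = {t_b:.3f}")
print(f"c: estimate = {est_c:.5f}, s.e. = {se_c:.4f}, t* = {t_c:.3f}")
```

Each t-ratio is compared to a t distribution with 51 degrees of freedom, the error df from the ANOVA table.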
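d. The contrasts are orthogonal, since the products of their coefficients, each divided by the group sample size, sum to zero:
$$\frac{(1)\left(\frac{1}{2}\right)}{18} + \frac{(-1)\left(\frac{1}{2}\right)}{18} + \frac{(0)(-1)}{18} = 0.$$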
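
The orthogonality condition is easy to check mechanically. The sketch below uses exact fractions so that the comparison with zero is not affected by floating-point error; `are_orthogonal` is a hypothetical helper, and the unequal sample sizes in the second call are invented purely to show that orthogonality depends on the design.

```python
from fractions import Fraction


def are_orthogonal(c, d, ns):
    """True when sum(c_i * d_i / n_i) is exactly zero."""
    total = sum(Fraction(ci) * Fraction(di) / n for ci, di, n in zip(c, d, ns))
    return total == 0


low_vs_medium = (1, -1, 0)
others_vs_high = (Fraction(1, 2), Fraction(1, 2), -1)

# The balanced design from the potato experiment.
balanced = are_orthogonal(low_vs_medium, others_vs_high, (18, 18, 18))
# A made-up unbalanced design, for comparison only.
unbalanced = are_orthogonal(low_vs_medium, others_vs_high, (10, 20, 18))

print("balanced design:  ", balanced)
print("unbalanced design:", unbalanced)
```

With equal group sizes the condition reduces to the coefficient products summing to zero, which is why these two contrasts are orthogonal here.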
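e. The sum of squares for the first contrast is
$$\frac{(5.2778 - 9.1667)^2}{\frac{1}{18} + \frac{1}{18}} = (t^*)^2 \cdot MSE \approx (-1.84)^2 (40.2985) = 136.435$$
The sum of squares for the second contrast is
$$\frac{\left(\frac{5.2778 + 9.1667}{2} - 13.778\right)^2}{\frac{1}{4 \cdot 18} + \frac{1}{4 \cdot 18} + \frac{1}{18}} = (t^*)^2 \cdot MSE \approx (-3.577)^2 (40.2985) = 515.616$$
Since the contrasts are orthogonal, their sums of squares should add up to the treatment sum of squares, SSR = 651.815.
136.435 + 515.616 = 652.051. The difference is due to round-off error (the t-ratios were rounded before squaring).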
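
To see how much of the discrepancy is round-off, the contrast sums of squares can be computed directly from the means, without going through rounded t-ratios. This is a sketch under the assumption that the printed (rounded) group means are used, so the individual values come out slightly different from the 136.435 and 515.616 quoted above; `contrast_ss` is a hypothetical helper name.

```python
MEANS = (5.2778, 9.1667, 13.778)   # low, medium, high
N_PER_GROUP = (18, 18, 18)


def contrast_ss(coefs, means=MEANS, ns=N_PER_GROUP):
    """Sum of squares for a contrast: estimate^2 / sum(c_i^2 / n_i)."""
    estimate = sum(c * m for c, m in zip(coefs, means))
    return estimate ** 2 / sum(c * c / n for c, n in zip(coefs, ns))


ss_b = contrast_ss((1, -1, 0))        # low versus medium
ss_c = contrast_ss((0.5, 0.5, -1))    # average of low and medium versus high
ss_total = ss_b + ss_c

print(f"SS(contrast 1) = {ss_b:.2f}")
print(f"SS(contrast 2) = {ss_c:.2f}")
print(f"Sum            = {ss_total:.2f}  (ANOVA table: 651.815)")
```

Computed this way, the sum lands within a few hundredths of the treatment sum of squares; the remaining gap comes from the rounding of the printed means.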
f. It is not necessary to know the "units" of measure (i.e. how much inoculum is in the
"low" injection) in order to fit a polynomial regression, as long as the relative sizes of the
levels are known. That is, we know there is some concentration, C, so that the levels are
1C, 2C, and 4C. The regression coefficients will vary with C, but the fitted values will
not. We can determine whether there is a linear (or polynomial) fit by regressing on the
continuous variable with values 1, 2, and 4, and then seeing if there is lack of fit (by
testing lack of fit, and looking at the residual plots).
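
The claim that the fitted values do not depend on C can be illustrated numerically. The sketch below fits a straight line to the three group means (with equal group sizes this gives the same coefficients as regressing all 54 observations) for two values of C. The value C = 250 is an arbitrary, made-up concentration used only for the comparison, and `fit_line` and `fitted_values` are hypothetical helper names.

```python
MEANS = (5.2778, 9.1667, 13.778)   # low, medium, high
RELATIVE_LEVELS = (1, 2, 4)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def fitted_values(c):
    """Fit the line with levels c, 2c, 4c and return (a, b, fitted values)."""
    xs = [c * level for level in RELATIVE_LEVELS]
    a, b = fit_line(xs, MEANS)
    return a, b, [a + b * x for x in xs]


a1, b1, fit1 = fitted_values(1)
a250, b250, fit250 = fitted_values(250)   # hypothetical concentration

print(f"C = 1:   intercept = {a1:.4f}, slope = {b1:.5f}")
print(f"C = 250: intercept = {a250:.4f}, slope = {b250:.7f}")
print("fitted (C = 1):  ", [round(v, 4) for v in fit1])
print("fitted (C = 250):", [round(v, 4) for v in fit250])
```

The slope shrinks by the factor C while the intercept and the fitted values stay the same, so any test based on the fitted values, including the lack of fit test, is unaffected by the unknown units.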
g. Since there are 3 distinct values of the independent variable, a quadratic polynomial is
the highest degree that can be fit.
h. There is no difference between these methods. If a quadratic polynomial is fit, the
sequential sum of squares for the quadratic term will be the same as the lack of fit
sum of squares. If there were more levels, so that a higher degree polynomial could be fit, the
lack of fit statistic would be the simultaneous test of whether any of the higher order terms
improve the fit.
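
The equivalence can be checked numerically. With three levels, a quadratic passes exactly through the three group means, so its regression sum of squares equals the treatment sum of squares, and the extra sum of squares for the quadratic term is whatever the straight line leaves unexplained between groups. The sketch below assumes the rounded means printed earlier, so the results match the tables only approximately.

```python
N = 18                              # potatoes per group
LEVELS = (1, 2, 4)                  # relative inoculum levels
MEANS = (5.2778, 9.1667, 13.778)    # low, medium, high

grand_mean = sum(MEANS) / len(MEANS)
x_bar = sum(LEVELS) / len(LEVELS)
sxx = sum((x - x_bar) ** 2 for x in LEVELS)
sxy = sum((x - x_bar) * (m - grand_mean) for x, m in zip(LEVELS, MEANS))
slope = sxy / sxx

# Fitted values of the straight line at the three levels.
line = [grand_mean + slope * (x - x_bar) for x in LEVELS]

# Between-group variation, and its split into linear and lack-of-fit parts.
ss_treatment = N * sum((m - grand_mean) ** 2 for m in MEANS)
ss_linear = N * sum((f - grand_mean) ** 2 for f in line)
ss_lack_of_fit = N * sum((m - f) ** 2 for m, f in zip(MEANS, line))

print(f"SS(treatment)   = {ss_treatment:.3f}")
print(f"SS(linear)      = {ss_linear:.3f}")
print(f"SS(lack of fit) = {ss_lack_of_fit:.3f}")
```

The linear and lack of fit pieces add up to the treatment sum of squares, and the lack of fit piece carries the single degree of freedom that the quadratic term would use.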
i. The lack of fit sum of squares is:
SS(lack of fit) = SSR(categorical) - SSR(polynomial)
= 651.815 - 638.922
= 12.893
The degrees of freedom for lack of fit is:
df(lack of fit) = df(categorical) - df(polynomial)
= 2 - 1
= 1
The test for lack of fit is:
F* = [SS(lack of fit)/df(lack of fit)] / MSE(categorical)
= 12.893/40.2985
= .320
This should be compared to F(.95, 1, 51) = 4.0. (However, an F-statistic must be greater
than 1 to be significant, so we don't really need to use the table.) Since F* = .320 is well below 4.0, there is no evidence of lack of fit.
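
The lack of fit calculation uses only numbers from the two output tables, so it can be written as a small function. This is a sketch, not a general-purpose routine: `lack_of_fit_f` is a hypothetical name, and the critical value 4.0 is the table value quoted above rather than something the code computes.

```python
def lack_of_fit_f(ssr_categorical, df_categorical,
                  ssr_polynomial, df_polynomial, mse_categorical):
    """Return (SS, df, F*) for the lack of fit test."""
    ss_lof = ssr_categorical - ssr_polynomial
    df_lof = df_categorical - df_polynomial
    f_lof = (ss_lof / df_lof) / mse_categorical
    return ss_lof, df_lof, f_lof


# SSR and df from the one-way ANOVA, SSR and df from the linear regression,
# and the MSE (pure error) from the one-way ANOVA.
ss_lof, df_lof, f_lof = lack_of_fit_f(651.815, 2, 638.922, 1, 40.2985)

CRITICAL_F = 4.0   # F(.95, 1, 51), as quoted in the text
evidence_of_lack_of_fit = f_lof > CRITICAL_F

print(f"SS(lack of fit) = {ss_lof:.3f} on {df_lof} df")
print(f"F* = {f_lof:.3f}")
print("Evidence of lack of fit:", evidence_of_lack_of_fit)
```

Note that the denominator is the MSE from the categorical (one-way ANOVA) model, not the residual mean square from the regression, because the regression residual still contains the lack of fit component.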