HW 9 SOLUTIONS
Regression and Correlation
1. 12.41. The three residual plots, (i), (ii), and (iii), were generated after fitting regression lines to the three
scatterplots, (a), (b), and (c). Which residual plot goes with which scatterplot? How do you know?
Correct:
Scatterplot (b) shows curvature, so it goes with residual plot (ii). In scatterplot (a), the points fan out as X
increases, so this scatterplot goes with residual plot (iii). Finally, there are no unusual features in scatterplot (c),
which goes with residual plot (i).
2. 12.5 (modified). Twenty plots, each 10 x 4 meters, were randomly chosen in a large field of corn. For each
plot, the plant density (number of plants in the plot) and the mean cob weight (g of grain per cob) were
observed. The results are given in the table.
Plant Density X   Cob Weight Y   Plant Density X   Cob Weight Y
137               212            173               194
107               241            124               241
132               215            157               196
135               225            184               193
115               250            112               224
103               241            80                257
102               237            165               200
65                282            160               190
149               206            157               208
85                246            119               224
a. Calculate the linear regression of Y on X.
Correct:
TI-84 After entering x’s in L1 and y’s in L2->STAT->TESTS and LinRegTTest->ENTER->Xlist: L1 Ylist: L2
Calculate->ENTER yields Y = 316.376 – 0.7206X
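As a cross-check on the calculator output, the least-squares slope and intercept can be computed directly from the table. The sketch below uses Python, which the assignment itself does not use; the function name `fit_line` is only illustrative. The result should agree with the TI-84 line to the digits shown.

```python
# Corn data from the table: plant density (X) and mean cob weight in g (Y).
x = [137, 107, 132, 135, 115, 103, 102, 65, 149, 85,
     173, 124, 157, 184, 112, 80, 165, 160, 157, 119]
y = [212, 241, 215, 225, 250, 241, 237, 282, 206, 246,
     194, 241, 196, 193, 224, 257, 200, 190, 208, 224]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = fit_line(x, y)
print(f"Y = {b0:.3f} + ({b1:.4f})X")
```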
b. Calculate sY and specify the units.
TI-84 STAT->CALC->ENTER->L2->ENTER yields sY = 24.954 g
c. Calculate the value of sY|X and specify the units.
Correct:
TI-84 From LinRegTTest output we find sY|X = 8.619254138 ≈ 8.619 g
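The difference between sY and sY|X is easiest to see when both are computed side by side: sY measures spread around the mean of Y (divisor n - 1), while sY|X measures spread around the fitted line (divisor n - 2). A minimal Python sketch, with variable names chosen here for illustration:

```python
import math

# Corn data from the table: plant density (X) and mean cob weight in g (Y).
x = [137, 107, 132, 135, 115, 103, 102, 65, 149, 85,
     173, 124, 157, 184, 112, 80, 165, 160, 157, 119]
y = [212, 241, 215, 225, 250, 241, 237, 282, 206, 246,
     194, 241, 196, 193, 224, 257, 200, 190, 208, 224]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Spread of Y around its own mean.
s_y = math.sqrt(syy / (n - 1))

# Spread of Y around the regression line (residual standard deviation).
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_y_given_x = math.sqrt(ss_resid / (n - 2))

print(f"sY   = {s_y:.3f} g")
print(f"sY|X = {s_y_given_x:.3f} g")
```

Both quantities carry the units of Y (grams), which is why the answers above are reported in g.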
d. Interpret the value of sY|X in the context of this setting.
Correct:
Predictions of cob weight based on the regression model tend to be off by 8.6 g, on average.
e. Calculate the value of r².
Correct:
TI-84 From LinRegTTest output, we find r² = 0.887
f. Interpret the value of r² in the context of the setting.
Correct:
88.7% of the variability in grams of grain per cob is explained by variability in the number of plants per plot.
g. Now use the QQplot of the residuals and a plot of residuals vs. predicted (fitted) values to comment on the
assumptions (that can be checked here).
Correct:
The residuals appear to be centered around 0 at each slice of X, giving no indication that the errors do not
have mean zero.
The residual plot shows a fairly even spread at each "slice" of X. The points are a little more spread out where
there are more points, but this is to be expected. There is little evidence against the errors having equal variance.
The QQplot of the residuals shows no systematic departure from the line. There is little to no indication of
nonnormality in the errors here.
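The two diagnostic plots are built from the residuals and fitted values. The Python sketch below computes the coordinates that would be plotted, without drawing anything. The plotting positions (i - 0.5)/n for the normal quantiles are one common convention and are an assumption here; the software that produced the plots in the solution may use a slightly different one.

```python
from statistics import NormalDist

# Corn data from the table: plant density (X) and mean cob weight in g (Y).
x = [137, 107, 132, 135, 115, 103, 102, 65, 149, 85,
     173, 124, 157, 184, 112, 80, 165, 160, 157, 119]
y = [212, 241, 215, 225, 250, 241, 237, 282, 206, 246,
     194, 241, 196, 193, 224, 257, 200, 190, 208, 224]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residual vs. fitted plot: one (fitted, residual) point per field plot.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# QQ plot: sorted residuals against standard normal quantiles.
sorted_residuals = sorted(residuals)
normal_quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

for q, r in zip(normal_quantiles, sorted_residuals):
    print(f"{q:7.3f}  {r:8.3f}")
```

If the errors are roughly normal, the printed pairs fall close to a straight line when plotted.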
h. Assuming the linear model is correctly specified, compute a 95% confidence interval for β1.
Correct:
The CI is
-0.7206 ± (2.101)(0.0605)
-0.7206 ± 0.1271
(-0.8477,-0.5935) or -0.8477 < β1 < -0.5935.
TI-84 STAT->TESTS and LinRegTInt->ENTER->Xlist: L1 Ylist: L2 C-Level:0.95 Calculate->ENTER
(-.848, -0.593)
TI-83 Using the output from LinRegTTest, we have [equation lost in extraction; the only surviving value is 0.606].
The CI is
-0.7206 ± (2.101)(0.0605)
-0.7206 ± 0.1271
(-0.8477,-0.5935) or -0.8477 < β1 < -0.5935
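The interval b1 ± t·SE(b1) can also be reproduced from the raw data. The Python sketch below computes SE(b1) as sY|X divided by the square root of Σ(x - x̄)², and takes the critical value 2.101 (df = 18) from the hand calculation above rather than computing it. At full precision the standard error comes out near 0.0606, so the last digit of the limits can differ slightly from the hand calculation, which uses the rounded 0.0605.

```python
import math

# Corn data from the table: plant density (X) and mean cob weight in g (Y).
x = [137, 107, 132, 135, 115, 103, 102, 65, 149, 85,
     173, 124, 157, 184, 112, 80, 165, 160, 157, 119]
y = [212, 241, 215, 225, 250, 241, 237, 282, 206, 246,
     194, 241, 196, 193, 224, 257, 200, 190, 208, 224]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard deviation and standard error of the slope.
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_y_given_x = math.sqrt(ss_resid / (n - 2))
se_b1 = s_y_given_x / math.sqrt(sxx)

T_CRIT = 2.101  # t multiplier for 95% confidence, df = n - 2 = 18 (from the solution)
lower = b1 - T_CRIT * se_b1
upper = b1 + T_CRIT * se_b1

print(f"SE(b1) = {se_b1:.4f}")
print(f"95% CI for beta1: ({lower:.3f}, {upper:.3f})")
```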
i. Interpret the interval you just computed in part (h) in the context of the setting.
Correct:
We are 95% confident that, for each additional plant per plot, mean cob weight decreases by between 0.5935
and 0.8477 grams of grain per cob.
3. 12.6. Laetisaric acid is a compound that holds promise for control of fungus diseases in crop plants. The
accompanying data show the results of growing the fungus Pythium ultimum in various concentrations of
laetisaric acid. Each growth value is the average of four radial measurements of a P. ultimum colony grown in a
petri dish for 24 hours; there were two petri dishes at each concentration.
Laetisaric Acid Concentration X (μg/ml)   Fungus Growth Y (mm)
0                                         33.3
0                                         31.0
3                                         29.8
3                                         27.8
6                                         28.0
6                                         29.0
10                                        25.5
10                                        23.8
20                                        18.3
20                                        15.5
30                                        11.7
30                                        10.0
a. Calculate the linear regression of Y on X.
TI-84 After entering x’s in L1 and y’s in L2->STAT->TESTS and LinRegTTest->ENTER->Xlist: L1 Ylist: L2
Calculate->ENTER yields Y = 31.83 - 0.712X
b. Calculate sY|X. What are the units of sY|X?
Correct:
TI-84 From LinRegTTest output we find sY|X = 1.295 mm
c. Calculate the value of r².
Correct:
TI-84 From LinRegTTest output, r² = 0.975
d. Interpret the value of r² in the context of the setting.
Correct:
97.5% of the variability in Pythium ultimum growth can be explained by the variability in laetisaric acid
concentration.
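The value of r² in part (c) can be checked from the sums of squares, since r² = Sxy² / (Sxx·Syy). A short Python sketch (the list names `conc` and `growth` are chosen here for illustration):

```python
# Fungus data from the table: concentration (ug/ml) and growth (mm).
conc = [0, 0, 3, 3, 6, 6, 10, 10, 20, 20, 30, 30]
growth = [33.3, 31.0, 29.8, 27.8, 28.0, 29.0, 25.5, 23.8, 18.3, 15.5, 11.7, 10.0]

n = len(conc)
x_bar = sum(conc) / n
y_bar = sum(growth) / n
sxx = sum((xi - x_bar) ** 2 for xi in conc)
syy = sum((yi - y_bar) ** 2 for yi in growth)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(conc, growth))

# Coefficient of determination for simple linear regression.
r_squared = sxy ** 2 / (sxx * syy)
print(f"r^2 = {r_squared:.3f}")
```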
e. Suppose a second investigator were to replicate the experiment, using concentrations of 0, 2, 4, 6, 8, and 10
μg/ml, with two petri dishes at each concentration. Would you predict that the value of r calculated by this second
investigator would be about the same as that calculated in part (a), smaller in magnitude, or larger in
magnitude? Explain.
Correct:
The second investigator would have less spread in the X values, so we would expect the second investigator to
obtain a smaller correlation (a value of r closer to zero).
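The effect of a narrower X range can be illustrated with the data already in hand. The Python sketch below is not the second investigator's experiment (no such data exist here); as a stand-in, it simply keeps the observed dishes with concentration at most 10 μg/ml and compares the resulting r with the r from all 12 dishes.

```python
import math

# Fungus data from the table: concentration (ug/ml) and growth (mm).
conc = [0, 0, 3, 3, 6, 6, 10, 10, 20, 20, 30, 30]
growth = [33.3, 31.0, 29.8, 27.8, 28.0, 29.0, 25.5, 23.8, 18.3, 15.5, 11.7, 10.0]


def correlation(x, y):
    """Pearson correlation coefficient r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


r_full = correlation(conc, growth)

# Stand-in for a restricted design: keep only concentrations of 10 or less.
narrow = [(c, g) for c, g in zip(conc, growth) if c <= 10]
r_narrow = correlation([c for c, _ in narrow], [g for _, g in narrow])

print(f"r, all concentrations:   {r_full:.3f}")
print(f"r, concentrations <= 10: {r_narrow:.3f}")
```

The restricted subset gives an r that is closer to zero, in line with the answer above.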
f. Use the QQplot of the residuals and the residual vs. X plot to comment on the assumptions
(that can be checked here).
Correct:
The residual plot shows a fairly even spread, centered around 0, at each "slice" of X. I do not see an obvious
shape being made by this plot. There is little indication that the assumptions of the errors having mean zero and
equal variance have been violated.
The QQplot of the residuals shows what could be a slight pattern, but the points return to the line every two or
three points, so the departure does not look systematic. With only 12 residuals, it is hard to tell whether this is a
real pattern; there are not enough points to call the departure systematic. There is little evidence
against normality.
g. Consider the null hypothesis that laetisaric acid has no effect on growth of the fungus. Assuming that the
linear model is applicable, formulate this as a hypothesis about the true regression line, and test the hypothesis
against the alternative that laetisaric acid inhibits growth of the fungus. Let α = 0.05.
Correct:
(1) α = 0.05
(2) H0: Laetisaric acid has no effect on fungus growth (β1 = 0)
HA: Laetisaric acid inhibits fungus growth (β1 < 0)
TI-84 After entering x’s in L1 and y’s in L2->STAT->TESTS and LinRegTTest->ENTER->Xlist: L1 Ylist: L2
β & ρ: < 0 Calculate->ENTER yields
(3) t = -19.840
(4) P = 0.00000000116
(5) P < α, reject H0
(6) Conclude that mean Pythium ultimum growth decreases significantly as laetisaric acid concentration is
increased.
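The test statistic in step (3) is t = b1 / SE(b1), with SE(b1) = sY|X / sqrt(Σ(x - x̄)²). The Python sketch below recomputes it from the data. Rather than computing the P-value, it compares t with the one-sided 5% critical value for df = 10, which is 1.812 in a standard t table; that table lookup is an addition here and is not part of the calculator steps above.

```python
import math

# Fungus data from the table: concentration (ug/ml) and growth (mm).
conc = [0, 0, 3, 3, 6, 6, 10, 10, 20, 20, 30, 30]
growth = [33.3, 31.0, 29.8, 27.8, 28.0, 29.0, 25.5, 23.8, 18.3, 15.5, 11.7, 10.0]

n = len(conc)
x_bar = sum(conc) / n
y_bar = sum(growth) / n
sxx = sum((xi - x_bar) ** 2 for xi in conc)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(conc, growth))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(conc, growth))
s_y_given_x = math.sqrt(ss_resid / (n - 2))
se_b1 = s_y_given_x / math.sqrt(sxx)

t_stat = b1 / se_b1

# One-sided test of H0: beta1 = 0 vs HA: beta1 < 0 at alpha = 0.05.
T_CRIT = 1.812  # upper 5% point of the t distribution with 10 df (t table)
reject_h0 = t_stat < -T_CRIT

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```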
h. Assuming that the linear model is applicable, find estimates of the mean and standard deviation of fungus
growth at a laetisaric acid concentration of 15 μg/ml.
Correct:
Substituting X = 15 into the fitted regression equation yields
Y = 31.83 - (0.7120)(15) = 21.15.
Thus, we estimate that the mean radial measurement of the Pythium ultimum colony would be 21.15 mm at a
laetisaric acid concentration of 15 μg/ml.
According to the linear model, the standard deviation of fungus growth does not depend on X. Our estimate of
this standard deviation, σY|X , is the residual standard deviation from the regression line, sY|X.
Thus, we estimate that the standard deviation of fungus growth would be 1.295 mm at a laetisaric acid
concentration of 15 μg/ml.
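Both estimates can be reproduced in a few lines of Python: the mean comes from plugging X = 15 into the fitted line, and the standard deviation is sY|X regardless of X. The helper name `predict_mean` is hypothetical and used only for this illustration.

```python
import math

# Fungus data from the table: concentration (ug/ml) and growth (mm).
conc = [0, 0, 3, 3, 6, 6, 10, 10, 20, 20, 30, 30]
growth = [33.3, 31.0, 29.8, 27.8, 28.0, 29.0, 25.5, 23.8, 18.3, 15.5, 11.7, 10.0]

n = len(conc)
x_bar = sum(conc) / n
y_bar = sum(growth) / n
sxx = sum((xi - x_bar) ** 2 for xi in conc)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(conc, growth))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar


def predict_mean(x_new):
    """Estimated mean growth (mm) at concentration x_new (ug/ml)."""
    return b0 + b1 * x_new


# Under the linear model the SD about the line is the same at every X.
ss_resid = sum((yi - predict_mean(xi)) ** 2 for xi, yi in zip(conc, growth))
s_y_given_x = math.sqrt(ss_resid / (n - 2))

mean_at_15 = predict_mean(15)
print(f"Estimated mean growth at 15 ug/ml: {mean_at_15:.2f} mm")
print(f"Estimated SD of growth at 15 ug/ml: {s_y_given_x:.3f} mm")
```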
4. In a study of the tufted titmouse (Parus bicolor), an ecologist captured seven male birds, measured their
wing lengths and other characteristics, and then marked and released them. During the ensuing winter, he
repeatedly observed the marked birds as they foraged for insects and seeds on tree branches. He noted the
branch diameter on each occasion, and calculated (from 50 observations) the average branch diameter for each
bird. The results are shown in the table.
Bird   Wing Length X (mm)   Branch Diameter Y (cm)
1      79.0                 1.02
2      80.0                 1.04
3      81.5                 1.20
4      84.0                 1.51
5      79.5                 1.21
6      82.5                 1.56
7      83.5                 1.29
a. Calculate the correlation coefficient between wing length and branch diameter.
Correct:
r = 0.803
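For reference, r is Sxy / sqrt(Sxx·Syy). A minimal Python sketch that recomputes it from the seven birds (the list names are chosen here for illustration):

```python
import math

# Titmouse data from the table.
wing_length = [79.0, 80.0, 81.5, 84.0, 79.5, 82.5, 83.5]   # X, mm
branch_diam = [1.02, 1.04, 1.20, 1.51, 1.21, 1.56, 1.29]   # Y, cm

n = len(wing_length)
x_bar = sum(wing_length) / n
y_bar = sum(branch_diam) / n
sxx = sum((xi - x_bar) ** 2 for xi in wing_length)
syy = sum((yi - y_bar) ** 2 for yi in branch_diam)
sxy = sum((xi - x_bar) * (yi - y_bar)
          for xi, yi in zip(wing_length, branch_diam))

r = sxy / math.sqrt(sxx * syy)
print(f"r = {r:.3f}")
```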
b. Construct a 90% confidence interval for the population correlation coefficient, ρ.
Correct:
We utilize the Fisher Z-Transform from the Chapter 12 slides.
From part (a), we have r = 0.803344979.
So, Z(r) = (1/2) ln((1+r)/(1-r)) = 0.5 ln(1.803344979/0.196655021) = 0.5 ln(9.170093734) = 1.107973755.
invNorm(.95,0,1) = Z.05 = 1.645.
Now, calculate (1/(n-3))^(1/2) = (1/(7-3))^(1/2) = 0.5.
We find the 90% CI on Z(ρ) as 1.107973755 ± (1.645)(0.5) = 1.107973755 ± 0.8225,
or equivalently, 0.285473755 < Z(ρ) < 1.930473755.
Lower limit on ρ is (e^(2(0.285473755)) - 1)/(e^(2(0.285473755)) + 1) = 0.769943296/2.769943296 = 0.277963559.
Upper limit on ρ is (e^(2(1.930473755)) - 1)/(e^(2(1.930473755)) + 1) = 46.51034658/48.51034658 = 0.958771682.
Finally, we report the 90% CI for ρ as (0.278, 0.959) or equivalently 0.278 < ρ < 0.959.
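The same interval can be reproduced compactly, because the Fisher transform Z(r) is the inverse hyperbolic tangent and its back-transform (e^(2z) - 1)/(e^(2z) + 1) is tanh(z). The Python sketch below takes r from part (a) and uses the standard library's normal quantile in place of invNorm, so the critical value is 1.6449 rather than the rounded 1.645.

```python
import math
from statistics import NormalDist

r = 0.803344979   # correlation from part (a)
n = 7             # number of birds
conf_level = 0.90

# Fisher Z-transform: Z(r) = 0.5 * ln((1 + r) / (1 - r)) = atanh(r).
z_r = math.atanh(r)

# Two-sided 90% interval, so 5% in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
half_width = z_crit / math.sqrt(n - 3)

# Back-transform the limits on Z(rho) to limits on rho.
lower = math.tanh(z_r - half_width)
upper = math.tanh(z_r + half_width)

print(f"Z(r) = {z_r:.4f}")
print(f"90% CI for rho: ({lower:.3f}, {upper:.3f})")
```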