Stat 301 – HW 5 answers

1. Enzyme activity, 2 pts each part.
a. No, independence of errors is not reasonable. Errors for observations from the same solution are
associated. Two possible explanations, either of which is sufficient:
Most of the observations for solution 1 are above the fitted regression line (i.e., positive errors).
Most of those for solution 2 are below the fitted regression line (i.e., negative errors).
Measurements are repeatedly made on the same solution. Anything unusual about one solution
will be present in all the observations from that solution.
b. Yes, there is no sign of lack of fit. The residuals are approximately centered at 0 for all predicted
values. (Or, no smile or frown).
c. No, constant variance is not reasonable. The residuals show a trumpet shape.
d. Yes, normality is reasonable. The normal QQ plot has observations falling along the expected line.
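The first explanation in 1a can be checked numerically by counting the signs of the residuals within each solution. Here is a minimal sketch, assuming a straight-line fit by ordinary least squares. The data values and the names `fit_line` and `residual_signs_by_group` are invented for illustration; they are not the homework data or the output of any particular package.

```python
# Hypothetical data, invented for illustration (NOT the homework data).
# Each tuple is (solution label, x, y).
data = [
    ("sol1", 1.0, 2.4), ("sol1", 2.0, 4.5), ("sol1", 3.0, 6.6),
    ("sol2", 1.0, 1.6), ("sol2", 2.0, 3.5), ("sol2", 3.0, 5.4),
]


def fit_line(xs, ys):
    """Ordinary least squares estimates for y = b0 + b1 * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def residual_signs_by_group(rows):
    """Fit one line to all rows, then count (positive, negative)
    residuals separately for each group label."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    b0, b1 = fit_line(xs, ys)
    counts = {}
    for group, x, y in rows:
        resid = y - (b0 + b1 * x)
        pos, neg = counts.get(group, (0, 0))
        if resid > 0:
            pos += 1
        elif resid < 0:
            neg += 1
        counts[group] = (pos, neg)
    return counts


print(residual_signs_by_group(data))
```

If the errors were independent, each solution should show a mix of positive and negative residuals. A solution whose residuals are nearly all one sign is the pattern described in 1a.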
2. Enzyme activity in solution 4, 2 pts each part.
a. Your explanations are more important than your specific answer. Here are mine:
Lack of fit: the plot suggests a problem. The residuals wiggle up, down, and up again.
Equal variances: again, the plot suggests a problem. The residuals are more variable at smaller predicted values.
You could look at the plot of concentration vs time and argue these effects are small and ignorable.
b. F = 206.9, p < 0.0001
3. Plots of residuals and predicted values. My interpretations. Again, your explanation is more
important than your conclusion. 1 pt each part for A, B, C, and D
A: no lack of fit, equal variances. But you might claim unequal variances because the spread decreases.
B: lack of fit, equal variances (the vertical separation about the same all the way along the X axis)
C: no lack of fit, unequal variances. You could argue for lack of fit because the residuals are centered
around 1 instead of 0. But, there isn’t a frown or smile.
D: both lack of fit and unequal variances
E and F: all sorts of answers might be reasonable.
Note for 2a, 3E and 3F: I simulated all three of these data sets, so I know the truth. In all three cases,
the model used to generate the data was no lack of fit and equal variances. 3E and 3F in particular
illustrate the difficulty of interpreting patterns in small (or smallish) data sets. In general, I look for
blatantly obvious problems and ignore more subtle effects, especially when the data set is not large.
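To see how easily small data sets suggest patterns that are not there, you can repeat this kind of simulation yourself. The following is a minimal sketch, not the code used to generate 2a, 3E, or 3F. The function name `simulate_residuals`, the parameter values, the sample size, and the seeds are all assumptions chosen for illustration. The generating model has a straight-line mean and constant error variance, so any wiggle or trumpet in the residuals is chance.

```python
import random


def simulate_residuals(n, seed, b0=1.0, b1=2.0, sigma=1.0):
    """Simulate y = b0 + b1 * x + Normal(0, sigma) error, fit a line by
    least squares, and return a list of (predicted, residual) pairs."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [b0 + b1 * x + rng.gauss(0.0, sigma) for x in xs]

    # Ordinary least squares estimates of the slope and intercept.
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar

    pairs = []
    for x, y in zip(xs, ys):
        predicted = intercept + slope * x
        pairs.append((predicted, y - predicted))
    return pairs


# Print the residuals in order of predicted value for a few small data sets.
for seed in (1, 2, 3):
    ordered = sorted(simulate_residuals(12, seed))
    print(seed, [round(resid, 2) for _, resid in ordered])
```

Reading each printed row left to right corresponds to scanning a residual plot from small to large predicted values. Increasing `n` to a few hundred usually makes the chance patterns much less convincing.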
4. TCDD measurements
a. 1 pt. r = 0.938 or 0.94
b. 2 pts. The two most obvious things are:
positive association between FAT and PLASMA (because r > 0)
relationship very close to linear (r close to 1).
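For reference, the correlation coefficient in 4a can be computed directly from its definition. This is a minimal sketch with invented numbers, not the TCDD data. The function name `pearson_r` is mine, and the code assumes both variables have nonzero variance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy).
    Assumes xs and ys have equal length and nonzero variance."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, invented for illustration (NOT the TCDD measurements).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
print(round(pearson_r(x, y), 3))
```

The sign of r gives the direction of the association, and values near 1 or -1 indicate points lying close to a straight line.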
Total of 19 points, 1 point for free.