Stat 301 HW 8 Due: 13 Nov / 16 Nov 2015

1. 5.30 (p. 303) as given in the text
2. The data in pace.txt are the pace of life data used on the 2014 example midterm. Fit a model
with bank, walk, talk and the interaction of bank*walk to predict the heart attack mortality.
The question of interest is whether bank and all terms involving bank could be omitted from
the model.
(a) What two models should be compared to assess whether bank and all terms involving
bank could be omitted from the model?
(b) Fit those two models and calculate the F statistic using an ANOVA table to organize
the computations.
(c) Use JMP to construct the F test and obtain the p-value.
(d) Is it appropriate to remove bank and all terms involving bank from the model?
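The ANOVA-table arithmetic in part (b) can be sketched in a few lines of Python. This is only an illustration of the usual full-versus-reduced model comparison, not the JMP procedure. The function name extra_ss_f and the numbers passed to it are made up for the example and are not taken from pace.txt.

```python
def extra_ss_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic comparing a reduced model nested inside a full model.

    sse_* are error (residual) sums of squares and df_* are the matching
    error degrees of freedom, as read from each model's ANOVA table.
    """
    if df_reduced <= df_full:
        raise ValueError("reduced model must have more error df than the full model")
    if sse_full <= 0 or df_full <= 0:
        raise ValueError("full model needs positive error SS and error df")
    df_diff = df_reduced - df_full                  # number of terms dropped
    ms_diff = (sse_reduced - sse_full) / df_diff    # mean square for the dropped terms
    ms_full = sse_full / df_full                    # MSE of the full model
    return ms_diff / ms_full, df_diff, df_full


# Hypothetical values for illustration only (NOT results from pace.txt).
f_stat, df_num, df_den = extra_ss_f(sse_reduced=120.0, df_reduced=17,
                                    sse_full=90.0, df_full=15)
print(f"F = {f_stat:.2f} on {df_num} and {df_den} df")
```

The p-value is the upper-tail probability of an F distribution with the numerator and denominator degrees of freedom returned above, which is what JMP reports in part (c).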
3. Some bat species locate prey by echolocation, the biological version of the sonar equipment
used to locate submarines and other objects underwater. Echolocation is energy intensive.
So is flight. However, there may be an “energy savings” if an animal echolocates
while flying. To evaluate this possibility, data were collected on the energetic cost of flying for
4 species of echo-locating bats, 4 species of non-echo-locating bats, and 12 species of birds,
which can’t echolocate. The energetic cost to fly is known to depend on the mass of the
animal, so the comparisons of biological interest are those between types of animals at
the same mass (or log mass). The echolocate.txt data set includes the log transformed mass
of the animal, the log transformed energy expenditure while flying (or flying and echolocating
because echolocating bats don’t “turn off” that ability), and the type of the animal.
(a) Fit a model to predict the log energy expenditure using the log mass and two indicator
variables for type of animal. I suggest you let JMP create the indicator variables for you.
What are the estimated regression coefficients for the 3 variables in this model? What
is the root Mean-Squared-Error for this model?
(b) The model you fit can be written as:
E log energy = β0 + β1 X1 + β2 X2 + β3 log mass,
where X1 and X2 are the indicator variables representing the three types of animals. If
there is no difference in average flying cost between the three types of animals, when
compared at the same mass, what must be true of β1 and β2 ?
(c) Test the hypothesis that there is no difference in average cost of flying between the three
types of animals, when compared at the same log mass. Report the F statistic and
p-value.
(d) Interpret your results of the test in part 3c by writing a one-sentence conclusion.
(e) The model above assumes that the slope (for log mass) is the same for all three types
of animals. Test whether this is a reasonable assumption. Report your F statistic and
p-value.
(f) The echo2.txt data file has two indicator variables for the type of animal. I have constructed these indicators so that β1 , the parameter for X1, in the model
E log energy = β0 + β1 X1 + β2 X2 + β3 log mass
estimates the difference in the intercepts between echolocating and non-echolocating bats.
Fit this model. Report the estimate of β1 and the p-value for the test of β1 = 0.
(g) The biologists are focusing on the comparison between echolocating and non-echolocating
bats of the same mass. What benefit is there to collecting data on 12 species of birds
when they aren’t involved in the comparison of interest?
(h) As good data analysis practice, look at the plot of residuals vs predicted values. Any
concerns about lack of fit or equal variance?
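To make the role of the two indicator variables in problem 3 concrete, here is a small Python sketch of one possible coding. It is an assumption for illustration: the type labels, the choice of non-echolocating bats as the reference group, and the coefficient values are all hypothetical and need not match echolocate.txt, echo2.txt, or the coding JMP produces.

```python
# Hypothetical type labels; the labels in the data files may differ.
TYPES = ("echo_bat", "non_echo_bat", "bird")


def design_row(animal_type, log_mass, reference="non_echo_bat"):
    """Return [1, X1, X2, log_mass] using 0/1 indicators for the two
    types other than the reference type."""
    if animal_type not in TYPES or reference not in TYPES:
        raise ValueError("unknown animal type")
    others = [t for t in TYPES if t != reference]   # order fixes X1, X2
    indicators = [1 if animal_type == t else 0 for t in others]
    return [1] + indicators + [log_mass]


def expected_log_energy(beta, row):
    """beta0 + beta1*X1 + beta2*X2 + beta3*log_mass for one design row."""
    return sum(b * x for b, x in zip(beta, row))


# Made-up coefficients (beta0, beta1, beta2, beta3), not fitted values.
beta = [0.5, 0.1, -0.2, 0.8]
echo = expected_log_energy(beta, design_row("echo_bat", 2.0))
non_echo = expected_log_energy(beta, design_row("non_echo_bat", 2.0))

# With this coding, X1 marks echolocating bats, so at a common log mass
# the two bat groups differ by exactly beta1.
print(echo - non_echo)
```

Under this coding, setting both indicator coefficients to zero gives all three types the same intercept, which is the sense in which the two indicators together carry the "type of animal" effect.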