Stat 301 Lab 7: Due October 28 Fall 2013

1. Dugongs are large aquatic mammals similar to manatees but native to the Indian and Pacific
Oceans. Data were collected on the age (years) and length (meters) of 27 dugongs captured
near Townsville in north Queensland, Australia. The data are given below.
Age (years)  Length (m)    Age (years)  Length (m)    Age (years)  Length (m)
 1           1.80           8           2.47          13           2.47
 1.5         1.85           8.5         2.19          14.5         2.56
 1.5         1.87           9           2.26          15.5         2.65
 1.5         1.77           9.5         2.40          15.5         2.47
 2.5         2.02           9.5         2.39          16.5         2.64
 4           2.27          10           2.41          17           2.56
 5           2.15          12           2.50          22.5         2.70
 5           2.26          12           2.32          29           2.72
 7           2.35          13           2.43          31.5         2.57
a) Plot Length versus Age. Describe the general pattern.
b) Fit a simple linear model relating Length to Age.
• Give the equation of the least squares line.
• Interpret both the estimated intercept and the estimated slope within the context of
  the problem.
• Comment on how well the simple linear model fits the data. Be sure to mention the
  R² value, RMSE, model utility, significance of variables in the model, and the plot of
  residuals versus Age.
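As a rough illustration of the quantities requested in b), the following sketch fits the simple linear model by least squares. It assumes Python with NumPy, which the lab does not specify, and it takes RMSE to be sqrt(SSE / (n - 2)), the usual residual standard error for a two-parameter model. The variable names are illustrative only.

```python
import numpy as np

# Age (years) and length (meters) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])

# np.polyfit returns coefficients from the highest power down: slope, then intercept.
slope, intercept = np.polyfit(age, length, 1)

fitted = intercept + slope * age
residuals = length - fitted

sse = np.sum(residuals ** 2)                  # error sum of squares
sst = np.sum((length - length.mean()) ** 2)   # total sum of squares
r_squared = 1 - sse / sst
rmse = np.sqrt(sse / (len(age) - 2))          # assumed definition: sqrt(SSE / (n - 2))

print(f"Predicted Length = {intercept:.4f} + {slope:.4f} * Age")
print(f"R^2 = {r_squared:.4f}, RMSE = {rmse:.4f}")
```

Plotting `residuals` against `age` with any plotting tool gives the residual plot mentioned in the last bullet.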
c) Fit a polynomial regression (degree=2) model with Age and Age² as the explanatory
variables. Do not center variables.
• Give the equation of the least squares line.
• Why is it difficult to interpret the parameter estimates for this model?
• Comment on how well the model fits the data. Be sure to mention the R² value,
  RMSE, model utility, significance of variables in the model, and the plot of residuals
  versus Age.
d) Fit a polynomial regression (degree=3) with Age, Age² and Age³ as the explanatory
variables. Do not center variables.
• Give the equation of the least squares line.
• Comment on how well the model fits the data. Be sure to mention the R² value,
  RMSE, model utility, and the significance of variables in the model.
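The uncentered polynomial fits in c) and d) can be sketched with one helper that builds the design matrix from raw powers of Age. This is a minimal sketch assuming Python with NumPy. The helper name `fit_poly` is hypothetical, and RMSE is assumed to be sqrt(SSE / (n - p)), where p is the number of estimated coefficients.

```python
import numpy as np

# Age (years) and length (meters) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])


def fit_poly(x, y, degree):
    """Least squares fit of y on 1, x, ..., x**degree (no centering).

    Returns (coefficients in increasing power order, R^2, RMSE).
    """
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sse = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    n, p = X.shape
    return coef, 1 - sse / sst, np.sqrt(sse / (n - p))


results = {d: fit_poly(age, length, d) for d in (1, 2, 3)}
for d, (coef, r2, rmse) in results.items():
    print(f"degree {d}: coefficients {coef}, R^2 = {r2:.4f}, RMSE = {rmse:.4f}")
```

Because the three models are nested, R² cannot decrease as the degree increases, so RMSE and the tests on individual terms are more informative when comparing them.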
e) For the model with Age, Age² and Age³ as the explanatory variables, look at the
distribution of residuals. Comment on the conditions of identically and normally
distributed errors. Be sure to refer to the appropriate plots in your comments.
f) Which model, b), c) or d), does a better job of predicting the lengths of dugongs? To
answer this question you should look at the predictions especially for older dugongs.
Note: This question is not asking which model is the best statistical model but rather
which model gives more realistic predictions especially for older dugongs.
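One way to look at the predictions for older dugongs is to evaluate each fitted polynomial at a few large ages. The sketch below assumes Python with NumPy. The ages in `new_age` are hypothetical values chosen for illustration and are not part of the lab.

```python
import numpy as np

# Age (years) and length (meters) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])

# Hypothetical ages at which to compare the models; 40 and 50 lie beyond the data.
new_age = np.array([20.0, 30.0, 40.0, 50.0])

predictions = {}
for degree in (1, 2, 3):
    coef = np.polyfit(age, length, degree)            # highest power first
    predictions[degree] = np.polyval(coef, new_age)   # predicted lengths (m)

for degree, pred in predictions.items():
    print(f"degree {degree}: {np.round(pred, 3)}")
```

Comparing the rows shows how each model behaves at and beyond the oldest observed age, which is the comparison the note in f) asks for.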
g) Report the correlations between Age and Age², Age and Age³, Age² and Age³. Is there
statistically significant multicollinearity? Explain briefly.
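The pairwise correlations in g) can be computed directly from the raw powers of Age. This is a minimal sketch assuming Python with NumPy. It reports the correlations only and does not carry out any significance test.

```python
import numpy as np

# Age (years) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])

# np.corrcoef treats each row as one variable.
terms = np.vstack([age, age ** 2, age ** 3])
corr = np.corrcoef(terms)

labels = ["Age", "Age^2", "Age^3"]
for i in range(3):
    for j in range(i + 1, 3):
        print(f"corr({labels[i]}, {labels[j]}) = {corr[i, j]:.4f}")
```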
h) Fit a polynomial regression (degree=2) with Age and (Age – Mean Age)².
• Give the equation of the least squares line.
• Comment on how well the model fits the data. Be sure to mention the R² value,
  RMSE, model utility, significance of variables in the model, and the plot of residuals
  versus Age.
• How does this model compare to the model in c)?
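To compare the centered quadratic model with the uncentered one numerically, both can be fit to the same data and their fitted values placed side by side. The sketch below assumes Python with NumPy. It relies on the algebraic identity b0 + b1·Age + b2·(Age – m)² = (b0 + b2·m²) + (b1 – 2·b2·m)·Age + b2·Age², where m is the mean age.

```python
import numpy as np

# Age (years) and length (meters) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])

ones = np.ones_like(age)
mean_age = age.mean()

# Design matrices: intercept, Age, and either Age^2 or (Age - mean)^2.
X_raw = np.column_stack([ones, age, age ** 2])
X_cen = np.column_stack([ones, age, (age - mean_age) ** 2])

b_raw, *_ = np.linalg.lstsq(X_raw, length, rcond=None)
b_cen, *_ = np.linalg.lstsq(X_cen, length, rcond=None)

fitted_raw = X_raw @ b_raw
fitted_cen = X_cen @ b_cen

print("uncentered coefficients:", b_raw)
print("centered coefficients:  ", b_cen)
print("largest difference in fitted values:",
      np.max(np.abs(fitted_raw - fitted_cen)))
```

The two design matrices span the same column space, so the fitted values, residuals, R² and RMSE agree up to rounding. Only the intercept and the coefficient on Age change.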
i) Fit a polynomial regression (degree=3) with Age, (Age – Mean Age)² and (Age – Mean
Age)³ as the explanatory variables.
• Give the equation of the least squares line.
• Comment on how well the model fits the data. Be sure to mention the R² value,
  RMSE, model utility, and the significance of variables in the model.
• How does this model compare to the model in d)?
j) Report the correlations between Age and (Age – Mean Age)², Age and (Age – Mean
Age)³, (Age – Mean Age)² and (Age – Mean Age)³. Is there statistically significant
multicollinearity? Explain briefly.
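The correlations among the centered terms can be computed the same way as for the raw powers, which makes the effect of centering easy to see. This is a minimal sketch assuming Python with NumPy. It prints correlations only and does not test their significance.

```python
import numpy as np

# Age (years) for the 27 dugongs in the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])

centered = age - age.mean()

# Each row is one variable: raw powers versus centered powers.
corr_raw = np.corrcoef(np.vstack([age, age ** 2, age ** 3]))
corr_cen = np.corrcoef(np.vstack([age, centered ** 2, centered ** 3]))

pairs = [(0, 1, "Age vs squared term"),
         (0, 2, "Age vs cubed term"),
         (1, 2, "squared vs cubed term")]
for i, j, name in pairs:
    print(f"{name}: raw = {corr_raw[i, j]:.4f}, centered = {corr_cen[i, j]:.4f}")
```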