Stat 301          Lab 7: Due October 28          Fall 2013

1. Dugongs are large aquatic mammals similar to manatees but native to the Indian and Pacific Oceans. Data were collected on the age (years) and length (meters) of 27 dugongs captured near Townsville in north Queensland, Australia. The data are given below.

   Age (years)  1     1.5   1.5   1.5   2.5   4     5     5     7
   Length (m)   1.80  1.85  1.87  1.77  2.02  2.27  2.15  2.26  2.35

   Age (years)  8     8.5   9     9.5   9.5   10    12    12    13
   Length (m)   2.47  2.19  2.26  2.40  2.39  2.41  2.50  2.32  2.43

   Age (years)  13    14.5  15.5  15.5  16.5  17    22.5  29    31.5
   Length (m)   2.47  2.56  2.65  2.47  2.64  2.56  2.70  2.72  2.57

a) Plot Length versus Age. Describe the general pattern.

b) Fit a simple linear model relating Length to Age.
   • Give the equation of the least squares line.
   • Interpret both the estimated intercept and the estimated slope within the context of the problem.
   • Comment on how well the simple linear model fits the data. Be sure to mention the R² value, RMSE, model utility, significance of variables in the model, and the plot of residuals versus Age.

c) Fit a polynomial regression (degree=2) model with Age and Age² as the explanatory variables. Do not center variables.
   • Give the equation of the least squares line.
   • Why is it difficult to interpret the parameter estimates for this model?
   • Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, significance of variables in the model, and the plot of residuals versus Age.

d) Fit a polynomial regression (degree=3) model with Age, Age² and Age³ as the explanatory variables. Do not center variables.
   • Give the equation of the least squares line.
   • Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, and the significance of variables in the model.

e) For the model with Age, Age² and Age³ as the explanatory variables, look at the distribution of residuals. Comment on the conditions of identically and normally distributed errors. Be sure to refer to the appropriate plots in your comments.
f) Which model, b), c) or d), does the best job of predicting the lengths of dugongs? To answer this question, look at the predictions, especially for older dugongs. Note: This question is not asking which model is the best statistical model, but rather which model gives the most realistic predictions, especially for older dugongs.

g) Report the correlations between Age and Age², Age and Age³, and Age² and Age³. Is there statistically significant multicollinearity? Explain briefly.

h) Fit a polynomial regression (degree=2) model with Age and (Age – Mean Age)² as the explanatory variables.
   • Give the equation of the least squares line.
   • Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, significance of variables in the model, and the plot of residuals versus Age.
   • How does this model compare to the model in c)?

i) Fit a polynomial regression (degree=3) model with Age, (Age – Mean Age)² and (Age – Mean Age)³ as the explanatory variables.
   • Give the equation of the least squares line.
   • Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, and the significance of variables in the model.
   • How does this model compare to the model in d)?

j) Report the correlations between Age and (Age – Mean Age)², Age and (Age – Mean Age)³, and (Age – Mean Age)² and (Age – Mean Age)³. Is there statistically significant multicollinearity? Explain briefly.
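
The lab does not name a software package, so the following is only a minimal sketch of how the fits in b), c), d), h) and i) and the correlations in g) and j) could be set up, assuming Python with NumPy. The helper name fit_ols, the model labels, and the choice RMSE = sqrt(SSE / (n - p)), with p counting the intercept, are assumptions made for this illustration. Other software may report these quantities in a different form. The sketch does not produce the plots, the model utility test, or the significance tests the lab asks for.

```python
import numpy as np

# Age (years) and Length (meters) for the 27 dugongs, copied from the data table.
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])


def fit_ols(predictors, y):
    """Least squares fit of y on an intercept plus the given predictor columns.

    Hypothetical helper for this sketch. Returns (coefficients, R^2, RMSE),
    where RMSE = sqrt(SSE / (n - p)) and p counts the intercept (an assumption).
    """
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    n, p = X.shape
    return beta, 1.0 - sse / sst, float(np.sqrt(sse / (n - p)))


# Centered age, used for the (Age - Mean Age) terms in parts h), i) and j).
centered = age - age.mean()

# Predictor columns for each model; the labels are illustrative only.
models = {
    "linear": [age],
    "quadratic, uncentered": [age, age**2],
    "cubic, uncentered": [age, age**2, age**3],
    "quadratic, centered": [age, centered**2],
    "cubic, centered": [age, centered**2, centered**3],
}
fits = {name: fit_ols(cols, length) for name, cols in models.items()}
for name, (beta, r2, rmse) in fits.items():
    print(f"{name:24s} R^2 = {r2:.4f}  RMSE = {rmse:.4f}  "
          f"coefficients = {np.round(beta, 5)}")

# Pairwise correlations among the polynomial terms, before and after centering.
corr_uncentered = np.corrcoef([age, age**2, age**3])
corr_centered = np.corrcoef([age, centered**2, centered**3])
print(corr_uncentered.round(3))
print(corr_centered.round(3))
```

The centered and uncentered polynomial terms span the same set of fitted curves, so the sketch should report matching R² and RMSE values for each centered and uncentered pair. Only the individual coefficients and the correlations among the predictors change.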