Stat 401B, Fall 2007
Lab 6: Due October 23

1. What is the relationship between the distance walked and the time it takes to deliver soft drinks to vending machines? An industrial engineer collected 20 observations on the delivery time (in minutes), number of cases delivered and distance walked (in feet). Delivery time includes time to unload the cases from the delivery truck, walk to the machine and load the machine with the soft drinks. The data are given below.

Cases         7      3      3      4      6      7      2      7      5     10
Distance    560    220    340     80    150    330    110    210    605    215
Time      16.68  11.50  12.03  14.88  13.75  18.11   8.00  17.83  21.50  21.00

Cases         4      6      9      6      7      3     10      9      8      4
Distance    255    462    448    200    132     36    140    450    635    150
Time      13.50  19.75  24.00  15.35  19.00   9.50  17.90  18.75  19.83  10.75

a) Fit a simple linear model with Time as your response variable and Distance as your explanatory variable. Use the JMP output to answer the following questions. Be sure to turn in JMP output with your answers.
• What is the least squares regression equation for this model?
• Give an interpretation of the estimated slope within the context of the problem.
• Give an interpretation of the estimated intercept within the context of the problem.
• How much of the variation in Time is explained by the linear relationship with Distance?
• Is Distance a statistically significant variable for predicting Time? Support your answer statistically.

b) Fit a multiple regression model with Time as your response variable and Distance and Cases as explanatory variables. Use the JMP output to answer the following questions. Be sure to turn in JMP output with your answers.
• What is the least squares regression equation for this model?
• Give an interpretation of the estimated slope for Distance within the context of the problem.
• Give an interpretation of the estimated slope for Cases within the context of the problem.
• Why is there not an interpretation of the estimated intercept within the context of the problem for this model?
• Does Cases add significantly to the model with Distance? Support your answer statistically.

c) Fit a multiple regression model with Time as your response variable and Distance, Cases, and Distance*Cases as explanatory variables. Be careful! Remember to turn off the Center Polynomial option before you fit the model. Use the JMP output to answer the following questions. Be sure to turn in JMP output with your answers.
• Does Distance*Cases add significantly to the model with Distance and Cases? Support your answer statistically.
• What does the result above tell you about interaction between Distance and Cases?
• What does the result above tell you about the relationship between Time and Distance as you change the number of cases delivered?

d) For the “best” model for these data, compute the residuals. Plot the residuals versus Distance. Plot the residuals versus Cases. Analyze the distribution of the residuals. Use the JMP output to answer the following questions. Be sure to turn in JMP output with your answers.
• What do the plots of residuals versus the explanatory variables tell you about the adequacy of the model you have chosen as “best”? What do they tell you about the equal standard deviation condition? Be sure to support your answers by referring to the plots.
• What does the analysis of the distribution of residuals tell you about the conditions of identically and normally distributed residuals? Be sure to support your answers by referring to the analysis.

2. Dugongs are large aquatic mammals similar to manatees but native to the Indian and Pacific Oceans. Data was collected on the age (years) and length (meters) of 27 dugongs captured near Townsville in north Queensland, Australia. The data are given below.
Age        1   1.5   1.5   1.5   2.5     4     5     5     7
Length  1.80  1.85  1.87  1.77  2.02  2.27  2.15  2.26  2.35

Age        8   8.5     9   9.5   9.5    10    12    12    13
Length  2.47  2.19  2.26   2.4  2.39  2.41  2.50  2.32  2.43

Age       13  14.5  15.5  15.5  16.5    17  22.5    29  31.5
Length  2.47  2.56  2.65  2.47  2.64  2.56  2.70  2.72  2.57

a) Plot Length versus Age. Describe the general pattern.

b) Fit a simple linear model relating Length to Age.
i. Give the equation of the least squares line.
ii. Interpret both the estimated intercept and the estimated slope within the context of the problem.
iii. Comment on how well the simple linear model fits the data. Be sure to mention the R² value, RMSE, model utility, significance of variables in the model, and the plot of residuals versus Age.

c) Fit a polynomial regression (degree=2) model with Age and Age² as the explanatory variables.
i. Give the equation of the least squares line.
ii. Why is it difficult to interpret the parameter estimates for this model?
iii. Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, significance of variables in the model, and the plot of residuals versus Age.

d) Fit a polynomial regression (degree=3) with Age, Age² and Age³ as the explanatory variables.
i. Give the equation of the least squares line.
ii. Comment on how well the model fits the data. Be sure to mention the R² value, RMSE, model utility, and the significance of variables in the model.

e) Which model b), c) or d) does a better job of predicting the lengths of dugongs? For this model analyze the residuals and comment on the condition of normally distributed errors.
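
The lab itself is done in JMP, but the fits in Problem 2 can be cross-checked outside JMP. The following is a minimal sketch in Python with NumPy, not part of the assignment. The helper name fit_polynomial and the dictionary keys are illustrative assumptions, and the powers of Age are left uncentered, which is also an assumption; JMP may center polynomial terms, which changes the individual coefficients but not the fitted values, R² or RMSE.

```python
import numpy as np

# Dugong data transcribed from the table in Problem 2 (27 animals).
age = np.array([1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
                8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
                13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5])
length = np.array([1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
                   2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.43,
                   2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57])


def fit_polynomial(x, y, degree):
    """Least squares fit of y on 1, x, ..., x**degree (uncentered powers).

    Hypothetical helper for illustration; returns coefficients in
    increasing order of power, plus R^2, RMSE and the residuals.
    """
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sse = float(resid @ resid)                     # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())       # total sum of squares
    n, p = X.shape                                 # p counts the intercept
    return {
        "coef": coef,
        "r2": 1.0 - sse / sst,
        "rmse": float(np.sqrt(sse / (n - p))),     # root mean square error
        "resid": resid,
    }


# Fit the models from parts b), c) and d).
fits = {d: fit_polynomial(age, length, d) for d in (1, 2, 3)}

for d, f in fits.items():
    print(f"degree {d}: coef={np.round(f['coef'], 5)}, "
          f"R2={f['r2']:.4f}, RMSE={f['rmse']:.4f}")
```

R² can never decrease as higher powers are added, so it is not enough on its own for part e); RMSE (which adjusts for the number of parameters), the significance of the highest-order term and the residual plots carry the comparison. The residuals stored under "resid" can be plotted against age to mirror the residual plots requested in parts b) and c).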