Stat 401B Lab 6: Due October 23 Fall 2007

1. What is the relationship between the distance walked and the time it takes to deliver soft drinks
to vending machines? An industrial engineer collected 20 observations on the delivery time (in
minutes), number of cases delivered and distance walked (in feet). Delivery time includes time
to unload the cases from the delivery truck, walk to the machine and load the machine with the
soft drinks. The data are given below.
Cases  Distance (ft)  Time (min)
    7            560       16.68
    3            220       11.50
    3            340       12.03
    4             80       14.88
    6            150       13.75
    7            330       18.11
    2            110        8.00
    7            210       17.83
    5            605       21.50
   10            215       21.00
    4            255       13.50
    6            462       19.75
    9            448       24.00
    6            200       15.35
    7            132       19.00
    3             36        9.50
   10            140       17.90
    9            450       18.75
    8            635       19.83
    4            150       10.75
a) Fit a simple linear model with Time as your response variable and Distance as your
explanatory variable. Use the JMP output to answer the following questions. Be sure to turn
in JMP output with your answers.
• What is the least squares regression equation for this model?
• Give an interpretation of the estimated slope within the context of the problem.
• Give an interpretation of the estimated intercept within the context of the problem.
• How much of the variation in Time is explained by the linear relationship with Distance?
• Is Distance a statistically significant variable for predicting Time? Support your answer
statistically.
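The lab asks for JMP output, so the following Python sketch is only an optional cross-check of the quantities part a) refers to: the least squares intercept and slope, R², RMSE, and the t ratio for the slope. The function name simple_ols and the variable names are illustrative assumptions, not part of the lab, and the data are transcribed from the table above.

```python
import math

# Data transcribed from the delivery table in problem 1 (20 observations).
distance = [560, 220, 340, 80, 150, 330, 110, 210, 605, 215,
            255, 462, 448, 200, 132, 36, 140, 450, 635, 150]
time_min = [16.68, 11.50, 12.03, 14.88, 13.75, 18.11, 8.00, 17.83, 21.50, 21.00,
            13.50, 19.75, 24.00, 15.35, 19.00, 9.50, 17.90, 18.75, 19.83, 10.75]


def simple_ols(x, y):
    """Least squares fit of y = b0 + b1 * x (hypothetical helper)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    b1 = sxy / sxx                      # estimated slope
    b0 = ybar - b1 * xbar               # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))     # root mean square error
    se_b1 = rmse / math.sqrt(sxx)       # standard error of the slope
    return {"b0": b0, "b1": b1, "r2": 1 - sse / syy,
            "rmse": rmse, "t_slope": b1 / se_b1, "df": n - 2}


fit = simple_ols(distance, time_min)
print(f"Time = {fit['b0']:.4f} + {fit['b1']:.5f} * Distance")
print(f"R^2 = {fit['r2']:.4f}, RMSE = {fit['rmse']:.4f}")
print(f"t ratio for slope = {fit['t_slope']:.3f} on {fit['df']} df")
```

The t ratio is compared with a t distribution on n - 2 = 18 degrees of freedom; the p-value itself is easiest to read from the JMP Parameter Estimates table.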
b) Fit a multiple regression model with Time as your response variable and Distance and Cases
as explanatory variables. Use the JMP output to answer the following questions. Be sure to
turn in JMP output with your answers.
• What is the least squares regression equation for this model?
• Give an interpretation of the estimated slope for Distance within the context of the
problem.
• Give an interpretation of the estimated slope for Cases within the context of the problem.
• Why is there not an interpretation of the estimated intercept within the context of the
problem for this model?
• Does Cases add significantly to the model with Distance? Support your answer
statistically.
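As an optional cross-check of the JMP Parameter Estimates table in part b), the sketch below fits Time on Distance and Cases by solving the normal equations in plain Python and reports each estimate with its standard error and t ratio. The helpers solve and fit_ols are hypothetical names introduced here for illustration; this is one way to reproduce the numbers, not the method JMP uses internally.

```python
import math

# Data transcribed from the delivery table in problem 1.
distance = [560, 220, 340, 80, 150, 330, 110, 210, 605, 215,
            255, 462, 448, 200, 132, 36, 140, 450, 635, 150]
cases = [7, 3, 3, 4, 6, 7, 2, 7, 5, 10, 4, 6, 9, 6, 7, 3, 10, 9, 8, 4]
time_min = [16.68, 11.50, 12.03, 14.88, 13.75, 18.11, 8.00, 17.83, 21.50, 21.00,
            13.50, 19.75, 24.00, 15.35, 19.00, 9.50, 17.90, 18.75, 19.83, 10.75]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y on an intercept plus the given predictor columns.

    Returns (coefficients, standard errors, SSE, error df).
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    mse = sse / (n - p)
    # Diagonal of (X'X)^-1, obtained with one unit-vector solve per coefficient.
    se = []
    for j in range(p):
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        se.append(math.sqrt(mse * solve(xtx, unit)[j]))
    return beta, se, sse, n - p


beta, se, sse, df = fit_ols([distance, cases], time_min)
for name, b, s in zip(["Intercept", "Distance", "Cases"], beta, se):
    print(f"{name:9s} estimate = {b:9.5f}  SE = {s:8.5f}  t = {b / s:7.3f}")
print(f"error df = {df}, RMSE = {math.sqrt(sse / df):.4f}")
```

The t ratio printed for Cases is the statistic that addresses whether Cases adds significantly to the model that already contains Distance.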
c) Fit a multiple regression model with Time as your response variable and Distance, Cases,
and Distance*Cases as explanatory variables. Be careful! Remember to turn off the Center
Polynomial option before you fit the model. Use the JMP output to answer the following
questions. Be sure to turn in JMP output with your answers.
• Does Distance*Cases add significantly to the model with Distance and Cases? Support
your answer statistically.
• What does the result above tell you about interaction between Distance and Cases?
• What does the result above tell you about the relationship between Time and Distance as
you change the number of cases delivered?
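The test in part c) can also be viewed as a comparison of two nested models. The sketch below builds the uncentered product Distance*Cases (matching the instruction to turn off Center Polynomial), fits the model with and without it, and forms the partial F statistic from the two error sums of squares. The helper names solve and sse_of are assumptions made for this illustration.

```python
# Data transcribed from the delivery table in problem 1.
distance = [560, 220, 340, 80, 150, 330, 110, 210, 605, 215,
            255, 462, 448, 200, 132, 36, 140, 450, 635, 150]
cases = [7, 3, 3, 4, 6, 7, 2, 7, 5, 10, 4, 6, 9, 6, 7, 3, 10, 9, 8, 4]
time_min = [16.68, 11.50, 12.03, 14.88, 13.75, 18.11, 8.00, 17.83, 21.50, 21.00,
            13.50, 19.75, 24.00, 15.35, 19.00, 9.50, 17.90, 18.75, 19.83, 10.75]

# Uncentered interaction column: the plain product of the two predictors.
interaction = [d * c for d, c in zip(distance, cases)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse_of(predictors, y):
    """Return (SSE, number of parameters) for y on intercept + predictors."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return sse, p


sse_reduced, p_reduced = sse_of([distance, cases], time_min)
sse_full, p_full = sse_of([distance, cases, interaction], time_min)
df_error = len(time_min) - p_full
f_stat = ((sse_reduced - sse_full) / (p_full - p_reduced)) / (sse_full / df_error)
print(f"SSE without interaction = {sse_reduced:.4f}")
print(f"SSE with interaction    = {sse_full:.4f}")
print(f"partial F = {f_stat:.4f} on {p_full - p_reduced} and {df_error} df")
```

Because only one term is added, this F statistic equals the square of the t ratio reported for Distance*Cases, so either statistic supports the same conclusion.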
d) For the “best” model for these data, compute the residuals. Plot the residuals versus
Distance. Plot the residuals versus Cases. Analyze the distribution of the residuals. Use the
JMP output to answer the following questions. Be sure to turn in JMP output with your
answers.
• What do the plots of residuals versus the explanatory variables tell you about the
adequacy of the model you have chosen as “best”? What do they tell you about the
equal standard deviation condition? Be sure to support your answers by referring to the
plots.
• What does the analysis of the distribution of residuals tell you about the conditions of
identically and normally distributed residuals? Be sure to support your answers by
referring to the analysis.
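The residual checks in part d) can be sketched numerically as well. The code below assumes, purely for illustration, that the chosen model is Time on Distance and Cases; which model is "best" is for the analysis above to decide. It computes the residuals, lists them against the explanatory variables as a text stand-in for the plots, and computes the correlation between the ordered residuals and normal scores (Blom plotting positions), a rough numerical companion to a normal quantile plot. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Data transcribed from the delivery table in problem 1.
distance = [560, 220, 340, 80, 150, 330, 110, 210, 605, 215,
            255, 462, 448, 200, 132, 36, 140, 450, 635, 150]
cases = [7, 3, 3, 4, 6, 7, 2, 7, 5, 10, 4, 6, 9, 6, 7, 3, 10, 9, 8, 4]
time_min = [16.68, 11.50, 12.03, 14.88, 13.75, 18.11, 8.00, 17.83, 21.50, 21.00,
            13.50, 19.75, 24.00, 15.35, 19.00, 9.50, 17.90, 18.75, 19.83, 10.75]


def mean(v):
    return sum(v) / len(v)


def centered_cross(u, v):
    """Sum of products of deviations from the means."""
    ubar, vbar = mean(u), mean(v)
    return sum((a - ubar) * (b - vbar) for a, b in zip(u, v))


# Closed-form least squares for two predictors (assumed model, see above).
s11 = centered_cross(distance, distance)
s22 = centered_cross(cases, cases)
s12 = centered_cross(distance, cases)
s1y = centered_cross(distance, time_min)
s2y = centered_cross(cases, time_min)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(time_min) - b1 * mean(distance) - b2 * mean(cases)

residuals = [t - (b0 + b1 * d + b2 * c)
             for d, c, t in zip(distance, cases, time_min)]

# Text stand-in for the residual plots: look for curvature or changing spread.
for d, c, e in sorted(zip(distance, cases, residuals)):
    print(f"Distance = {d:4d}  Cases = {c:2d}  residual = {e:7.3f}")

# Correlation between ordered residuals and normal scores.
n = len(residuals)
ordered = sorted(residuals)
scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
r_normal = centered_cross(ordered, scores) / math.sqrt(
    centered_cross(ordered, ordered) * centered_cross(scores, scores))
print(f"normal-score correlation = {r_normal:.4f}")
```

A correlation close to 1 is consistent with roughly normal residuals, but it does not replace looking at the histogram, box plot, and normal quantile plot from the JMP Distribution platform.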
2. Dugongs are large aquatic mammals similar to manatees but native to the Indian and Pacific
Oceans. Data were collected on the age (years) and length (meters) of 27 dugongs captured near
Townsville in north Queensland, Australia. The data are given below.
Age (years)  Length (m)
        1        1.80
      1.5        1.85
      1.5        1.87
      1.5        1.77
      2.5        2.02
        4        2.27
        5        2.15
        5        2.26
        7        2.35
        8        2.47
      8.5        2.19
        9        2.26
      9.5        2.4
      9.5        2.39
       10        2.41
       12        2.50
       12        2.32
       13        2.43
       13        2.47
     14.5        2.56
     15.5        2.65
     15.5        2.47
     16.5        2.64
       17        2.56
     22.5        2.70
       29        2.72
     31.5        2.57
a) Plot Length versus Age. Describe the general pattern.
b) Fit a simple linear model relating Length to Age.
i. Give the equation of the least squares line.
ii. Interpret both the estimated intercept and the estimated slope within the context of the
problem.
iii. Comment on how well the simple linear model fits the data. Be sure to mention the R²
value, RMSE, model utility, significance of variables in the model, and the plot of
residuals versus Age.
c) Fit a polynomial regression (degree=2) model with Age and Age² as the explanatory
variables.
i. Give the least squares regression equation.
ii. Why is it difficult to interpret the parameter estimates for this model?
iii. Comment on how well the model fits the data. Be sure to mention the R² value, RMSE,
model utility, significance of variables in the model, and the plot of residuals versus
Age.
d) Fit a polynomial regression (degree=3) model with Age, Age² and Age³ as the explanatory
variables.
i. Give the least squares regression equation.
ii. Comment on how well the model fits the data. Be sure to mention the R² value, RMSE,
model utility, and the significance of variables in the model.
e) Which model, b), c) or d), does the best job of predicting the lengths of dugongs? For this
model, analyze the residuals and comment on the condition of normally distributed errors.
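To compare the three dugong models side by side, the sketch below fits polynomials of degree 1, 2 and 3 in Age by least squares and prints R² and RMSE for each. Two assumptions are worth flagging: the helper names (solve, poly_fit_summary) are invented for this illustration, and Age is centered at its mean before powers are taken, purely for numerical stability. Centering changes the coefficient values but not the fitted values, R² or RMSE, so only those summaries are reported here.

```python
import math

# Data transcribed from the dugong table in problem 2 (27 animals).
age = [1, 1.5, 1.5, 1.5, 2.5, 4, 5, 5, 7,
       8, 8.5, 9, 9.5, 9.5, 10, 12, 12, 13,
       13, 14.5, 15.5, 15.5, 16.5, 17, 22.5, 29, 31.5]
length = [1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.35,
          2.47, 2.19, 2.26, 2.4, 2.39, 2.41, 2.50, 2.32, 2.43,
          2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def poly_fit_summary(x, y, degree):
    """Return (R^2, RMSE) for a least squares polynomial of the given degree."""
    n = len(y)
    xbar = sum(x) / n
    # Powers of centered x; column 0 is the intercept column of ones.
    rows = [[(xi - xbar) ** k for k in range(degree + 1)] for xi in x]
    p = degree + 1
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - sse / sst, math.sqrt(sse / (n - p))


summary = {d: poly_fit_summary(age, length, d) for d in (1, 2, 3)}
for d, (r2, rmse) in summary.items():
    print(f"degree {d}: R^2 = {r2:.4f}, RMSE = {rmse:.4f}")
```

R² can never decrease when a higher power is added, so it should be read together with RMSE and the significance of the highest-order term rather than used alone to pick a model.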