252solnK1 12/02/03 Still problem 14.18 [14.9]

[Figure: Normal Probability Plot of the Residuals (response is DistCost). Vertical axis: Normal Score; horizontal axis: Residual.]
Comment: The text says that a straight-line Normal Probability plot indicates a Normal
distribution.
Comment: There doesn’t seem to be much of a pattern here.
Comment: A pattern here would indicate autocorrelation.
Comment: These two graphs show no pattern either.
MTB > Stepwise c1 c2 c3;
SUBC>   AEnter 0.15;
SUBC>   ARemove 0.15;
SUBC>   Constant.
Stepwise Regression: DistCost versus Sales, Orders
Alpha-to-Enter: 0.15
Alpha-to-Remove: 0.15
Response is DistCost on 2 predictors, with N = 24

Step              1         2
Constant     0.4576   -2.7282

Orders       0.0161    0.0119
T-Value       10.92      5.31
P-Value       0.000     0.000

Sales                   0.047
T-Value                  2.32
P-Value                 0.031

S              5.22      4.77
R-Sq          84.42     87.59
R-Sq(adj)     83.71     86.41
C-p             6.4       3.0
More? (Yes, No, Subcommand, or Help)
SUBC> yes
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> no
MTB >
Comment: This was a stepwise regression. It was done with no options, so all the
subcommands that you see here were generated by Minitab. The independent variable with the
most explanatory power was ‘Orders,’ and the first regression was
Y = 0.4576 + 0.0161 Orders, with an R-squared of 84.42%. Minitab then added the other independent variable
and got a regression of Y = -2.7282 + 0.0119 Orders + 0.047 Sales, which is the same regression we got with
the ‘Regress’ command. The C-p statistic, also explained in the text, should be near k + 1, which it is for
this regression (C-p = 3.0 with k = 2). After adding two independent variables, Minitab paused and asked me
if I wanted to try for more independent variables. I foolishly said ‘yes,’ whereupon Minitab discovered that
it didn’t have any more variables to add.
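As a rough check on the stepwise table, the adjusted R-squared at each step can be recomputed from the reported R-squared, and C-p can be compared with k + 1. This is only a sketch in Python, not part of the Minitab session; the function name is made up for illustration, and N = 24 is taken from the output above.

```python
def adjusted_r_squared(r_sq, n, k):
    """Adjusted R-squared from R-squared (as a proportion), sample size n and k predictors."""
    return 1.0 - (1.0 - r_sq) * (n - 1) / (n - k - 1)


n = 24  # sample size reported in the stepwise output

# Values copied from the two steps of the stepwise output.
steps = [
    {"k": 1, "r_sq": 0.8442, "c_p": 6.4},
    {"k": 2, "r_sq": 0.8759, "c_p": 3.0},
]

for step in steps:
    adj = adjusted_r_squared(step["r_sq"], n, step["k"])
    gap = abs(step["c_p"] - (step["k"] + 1))  # how far C-p is from k + 1
    print(f"k = {step['k']}: R-Sq(adj) = {100 * adj:.2f}, |C-p - (k + 1)| = {gap:.1f}")
```

The recomputed values agree with the 83.71 and 86.41 that Minitab printed, and only the two-variable model has C-p equal to k + 1.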
Dummy Variables
Exercise 14.38 [14.33 in 9th] (15.6 in 8th edition): The equation is $\hat{Y} = 6 + 4X_1 + 2X_2$.
(a) Holding constant the effect of X2, the estimated average value of the dependent variable
will increase by 4 units for each increase of one unit of X1.
(b) Holding constant the effects of X1, the presence of the condition represented by X2 = 1 is
estimated to increase the average value of the dependent variable by 2 units.
(c) $t = 3.27$ with $n - k - 1 = 20 - 2 - 1 = 17$ degrees of freedom. This is larger than
$t_{.025}^{(17)} = 2.11$. You can reject H0 and say that the presence of X2 makes a significant
contribution to the model.
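The three parts of this exercise can be traced numerically. The sketch below is in Python, not Minitab. The X1 values 2 and 3 are arbitrary illustrative choices, and the critical value 2.11 is simply copied from part (c) rather than computed.

```python
def predict(x1, x2):
    """Fitted value from the exercise's equation: 6 + 4*X1 + 2*X2."""
    return 6 + 4 * x1 + 2 * x2


# (a) One-unit increase in X1 with the dummy X2 held fixed.
effect_x1 = predict(3, 1) - predict(2, 1)

# (b) Switching the dummy from 0 to 1 with X1 held fixed.
effect_x2 = predict(3, 1) - predict(3, 0)

# (c) Two-tailed t test on the dummy coefficient.
n, k = 20, 2
df = n - k - 1          # degrees of freedom for the t test
t_stat = 3.27           # t ratio given in the solution
t_crit = 2.11           # critical value quoted in the solution for 17 df
reject_h0 = abs(t_stat) > t_crit

print(effect_x1, effect_x2, df, reject_h0)
```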
Exercise 14.39 [14.34 in 9th] (15.7 in 8th edition):
(a) First develop a multiple regression model using X1 as the variable for the SAT score and
X2 as a dummy variable, with X2 = 1 if a student had a grade of B or better in the
introductory statistics course. If the dummy variable coefficient is significantly different
from zero, you need to develop a model with the interaction term X1 X2 to make sure that
the coefficient of X1 is not significantly different when X2 = 0 than when X2 = 1.
(b) If a student received a grade of B or better in the introductory statistics course, the
student would be expected to have a grade point average in accountancy that is 0.30
higher than a student who had the same SAT score but did not get a grade of B or better
in the introductory statistics course.
Exercise 14.41 [14.35 in 9th] (15.8 in 8th edition):
To run this regression I used the Statistics pull-down menu and then picked Regression twice. I had put
headings on my columns – the data is in the text and on your CD, but, since I’m lazy, I identified the
columns as C1, C2 and C3. So C2 was my response (dependent - Y) variable and C1 and C3 were my
predictor (independent – X) variables. There are just too many subcommands here to use the session
window to drive Minitab. On the Regression menu I went into Graphs and checked all the residual plots
except residuals vs. order. Under Options I picked Variance Inflation Factors and set up for confidence and
prediction intervals by telling it that the independent variables for this prediction were in C5 and C6. Under
Results I took the last and most complete option, though this can also be done by using the session
command ‘Brief 3’ before you start. Under storage I picked nothing. When this regression was finished and
I had copied all the graphs into a Word document, I ran the regression again with the Interaction variable
requested in part n) of the problem. To confirm my results, I ran Stepwise from the Regression menu using
C1, C3 and C4 as candidates to explain C2. The output from the run follows with comments.
————— 12/2/2003 9:15:15 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\petfood.MTW".
Retrieving worksheet from file: C:\Berenson\Data_Files-9th\Minitab\petfood.MTW
# Worksheet was saved on Mon Apr 27 1998
Comment: I downloaded the data from the text CD, but stored it where I could get it
more easily if I needed it again.
Results for: petfood.MTW
MTB > Save "C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\petfood3";
SUBC>   Replace.
Saving file as: C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\petfood3.MTW
* NOTE * Existing file replaced.
Results for: petfood3.MTW
MTB > regress c2 1 c1
Regression Analysis: Sales versus Space
The regression equation is
Sales = 1.45 + 0.0740 Space
Predictor       Coef   SE Coef      T      P
Constant      1.4500    0.2178   6.66  0.000
Space        0.07400   0.01591   4.65  0.001

S = 0.3081   R-Sq = 68.4%   R-Sq(adj) = 65.2%

Analysis of Variance

Source           DF      SS      MS      F      P
Regression        1  2.0535  2.0535  21.64  0.001
Residual Error   10  0.9490  0.0949
Total            11  3.0025
Comment: This is the regression referred to in Problems 13.3 and 13.14.
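For readers who want to see where these numbers come from, the simple regression can be reproduced by hand from the Space and Sales columns of the worksheet (the values are the ones printed in the Data Display). This is a plain least-squares sketch in Python, not Minitab's own algorithm, and the variable names are illustrative.

```python
# Space (C1) and Sales (C2) as printed in the worksheet.
space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
sales = [1.6, 2.2, 1.4, 1.9, 2.4, 2.6, 2.3, 2.7, 2.8, 2.6, 2.9, 3.1]

n = len(space)
mean_x = sum(space) / n
mean_y = sum(sales) / n

# Sums of squares and cross products about the means.
sxx = sum((x - mean_x) ** 2 for x in space)
syy = sum((y - mean_y) ** 2 for y in sales)          # total SS
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(space, sales))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

ss_regression = slope * sxy
ss_error = syy - ss_regression
r_sq = ss_regression / syy
f_stat = ss_regression / (ss_error / (n - 2))        # 1 and n - 2 degrees of freedom

print(f"Sales = {intercept:.2f} + {slope:.4f} Space")
print(f"R-Sq = {100 * r_sq:.1f}%, F = {f_stat:.2f}")
```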
MTB > let c4 = c1 * c3
Comment: This command creates the interaction variable, which I have labeled ‘Inter’ in C4.
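The same element-by-element product can be written outside Minitab. Here is a small Python sketch, using the Space and Locatn values from the worksheet; the list names are illustrative.

```python
# Space (C1) and Locatn (C3) as printed in the worksheet.
space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
locatn = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

# Interaction variable: row-by-row product, the analogue of "let c4 = c1 * c3".
inter = [s * d for s, d in zip(space, locatn)]

print(inter)
```

The interaction column equals Space wherever the dummy is 1 and is zero everywhere else.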
MTB > print c1-c6
Data Display

Row  Space  Sales  Locatn  Inter  C5  C6
  1      5    1.6       0      0   8   0
  2      5    2.2       1      5
  3      5    1.4       0      0
  4     10    1.9       0      0
  5     10    2.4       0      0
  6     10    2.6       1     10
  7     15    2.3       0      0
  8     15    2.7       0      0
  9     15    2.8       1     15
 10     20    2.6       0      0
 11     20    2.9       0      0
 12     20    3.1       1     20
Comment: This is the data I will use. Because part c) of this problem asks for confidence and
prediction intervals, I have set up values of space and location for these intervals in C5 and C6.
MTB > Regress c2 2 c1 c3;
SUBC>   GHistogram;
SUBC>   GNormalplot;
SUBC>   GFits;
SUBC>   GVars c1 c3;
SUBC>   RType 1;
SUBC>   Constant;
SUBC>   VIF;
SUBC>   Predict c5 c6;
SUBC>   Brief 3.
Regression Analysis: Sales versus Space, Locatn
The regression equation is
Sales = 1.30 + 0.0740 Space + 0.450 Locatn
Predictor       Coef   SE Coef      T      P  VIF
Constant      1.3000    0.1569   8.29  0.000
Space        0.07400   0.01101   6.72  0.000  1.0
Locatn        0.4500    0.1305   3.45  0.007  1.0

S = 0.2132   R-Sq = 86.4%   R-Sq(adj) = 83.4%

Analysis of Variance

Source           DF      SS      MS      F      P
Regression        2  2.5935  1.2967  28.53  0.000
Residual Error    9  0.4090  0.0454
Total            11  3.0025

Source  DF  Seq SS
Space    1  2.0535
Locatn   1  0.5400
Comment: These results look great. Note that all my coefficients are significant, with p-values
below 1%. Both VIFs are 1.0, way below 5, which indicates a lack of collinearity. The ANOVA gives me a
p-value of 0.000 (that is, below 0.0005), indicating that the regression is quite useful.
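The VIF of 1.0 can be checked directly. With only two predictors, the VIF of each is 1 / (1 - r^2), where r is the correlation between the two predictors. The Python sketch below applies that formula to the Space and Locatn columns; it illustrates the definition and is not a claim about how Minitab computes VIF internally.

```python
import math

# Space (C1) and Locatn (C3) as printed in the worksheet.
space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
locatn = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1]


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(space, locatn)
vif = 1.0 / (1.0 - r ** 2)  # same value for both predictors when there are only two

print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

Each level of Space has exactly one store with Locatn = 1, so the two predictors are uncorrelated and the VIF comes out at 1.0.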
Obs  Space   Sales     Fit  SE Fit  Residual  St Resid
  1    5.0  1.6000  1.6700  0.1118   -0.0700     -0.39
  2    5.0  2.2000  2.1200  0.1348    0.0800      0.48
  3    5.0  1.4000  1.6700  0.1118   -0.2700     -1.49
  4   10.0  1.9000  2.0400  0.0802   -0.1400     -0.71
  5   10.0  2.4000  2.0400  0.0802    0.3600      1.82
  6   10.0  2.6000  2.4900  0.1101    0.1100      0.60
  7   15.0  2.3000  2.4100  0.0802   -0.1100     -0.56
  8   15.0  2.7000  2.4100  0.0802    0.2900      1.47
  9   15.0  2.8000  2.8600  0.1101   -0.0600     -0.33
 10   20.0  2.6000  2.7800  0.1118   -0.1800     -0.99
 11   20.0  2.9000  2.7800  0.1118    0.1200      0.66
 12   20.0  3.1000  3.2300  0.1348   -0.1300     -0.79
Predicted Values for New Observations

New Obs     Fit  SE Fit            95.0% CI            95.0% PI
1        1.8920  0.0902  ( 1.6880, 2.0960)  ( 1.3684, 2.4156)

Values of Predictors for New Observations

New Obs  Space    Locatn
1         8.00  0.000000
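The fit and both intervals can be rebuilt from numbers already in the output. The Python sketch below uses the fitted equation, S = 0.2132 and SE Fit = 0.0902 from the Minitab output. The critical value 2.262 (t with 9 degrees of freedom, 2.5% in each tail) does not appear in the output; it is an assumption taken from a standard t table.

```python
import math

# Coefficients of the fitted equation Sales = 1.30 + 0.0740 Space + 0.450 Locatn.
b0, b1, b2 = 1.30, 0.0740, 0.450
new_space, new_locatn = 8.0, 0       # values stored in C5 and C6

fit = b0 + b1 * new_space + b2 * new_locatn

s = 0.2132        # standard error of the regression, from the output
se_fit = 0.0902   # standard error of the fit, from the output
t_crit = 2.262    # assumed: t table value for 9 df, 95% two-sided

# Confidence interval for the mean response.
ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)

# Prediction interval for a single new observation.
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)

print(f"Fit = {fit:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"95% PI = ({pi[0]:.4f}, {pi[1]:.4f})")
```

The prediction interval is wider than the confidence interval because it adds the variance of an individual observation to the variance of the fitted mean.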
[Figures: Residual Histogram for Sales; Normplot of Residuals for Sales; Residuals vs Fits for Sales.]