252x0541 4/22/05 Name and Class hour:_________________________

ECO252 QBA2
Final EXAM
May 4, 2005
Name and Class hour:_________________________
I. (25+ points) Do all the following. Note that answers without reasons receive no credit. Most answers
require a statistical test, that is, stating or implying a hypothesis and showing why it is true or false by citing
a table value or a p-value. If you haven’t done it lately, take a fast look at ECO 252 - Things That You
Should Never Do on a Statistics Exam (or Anywhere Else)
The next 12 pages contain computer output. This comes from a data set on the text CD-ROM called
Auto2002. There are 121 observations. The dependent variable is MPG (miles per gallon). The columns in
the data set are:
Name: The make and model
SUV: ‘Yes’ if it’s an SUV, ‘No’ if not.
Drive Type: All wheel, front wheel, rear wheel or four wheel.
Horsepower: An independent variable
Fuel Type: Premium or regular
MPG: The dependent variable
Length: In inches – an independent variable
Width: In inches – an independent variable
Weight: In pounds – an independent variable
Cargo Volume: Square feet – an independent variable
Turning Circle: Feet – an independent variable
I added the following
SUV_D: A dummy variable based on ‘SUV’, 1 for an SUV, otherwise zero.
Fuel_D: A dummy variable based on ‘Fuel Type’, 1 for Premium fuel, otherwise zero.
SUVwt: An interaction variable, the product of ‘SUV_D’ and ‘Weight’
SUVtc: An interaction variable, the product of ‘SUV_D’ and ‘Turning Circle’
HPsq: The square of ‘Horsepower’
AWD_D: A dummy variable based on ‘Drive Type’, 1 for all wheel drive, otherwise zero
FWD_D: A dummy variable based on ‘Drive Type’, 1 for front wheel drive, otherwise zero
RWD_D: A dummy variable based on ‘Drive Type’, 1 for rear wheel drive, otherwise zero
SUV_L: An interaction variable, the product of ‘SUV_D’ and ‘Length’
Questions are included with the regressions and thus cannot be in order of difficulty. It’s probably a good
idea to look over the questions and explanations before you do anything.
————— 4/28/2005 6:18:32 PM ————————————————————
Welcome to Minitab, press F1 for help.
Results for: 252x0504-4.MTW
MTB > Stepwise 'MPG' 'Horsepower' 'Length' 'Width' 'Weight' 'Cargo Volume' &
CONT> 'Turning Circle' 'SUV_D' 'Fuel_D' 'SUVwt' 'HPsq' 'AWD_D' &
CONT> 'FWD_D' 'RWD_D' 'SUV_L';
SUBC> AEnter 0.15;
SUBC> ARemove 0.15;
SUBC> Best 0;
SUBC> Constant.
Because I had relatively little idea of what to do, I ran a stepwise regression. You probably have
not seen one of these before, but they are relatively easy to read. Note that it dropped 2 observations so that
the results will not be quite the same as I got later.
The first numbered column represents the single independent variable that seems to have the most
explanatory effect on MPG. The equation reads MPG = 38.31 - 0.00491 Weight, and the t-ratio of the
Weight coefficient is -15.34. The fact that Weight entered first with a negative coefficient should
surprise no one. At the bottom appear $s_e$, $R^2$, $\bar{R}^2$ and the $C_p$ statistic mentioned in
your text. The value of the t-ratio and its p-value appear below the coefficient.
Stepwise Regression: MPG versus Horsepower, Length, ...
Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15
Response is MPG on 14 predictors, with N = 119
N(cases with missing observations) = 2 N(all cases) = 121
Step                     1         2         3         4         5         6
Constant             38.31     36.75     41.59     50.06     50.15     59.00

Weight            -0.00491  -0.00436  -0.00578  -0.00495  -0.00424  -0.00339
T-Value             -15.34    -11.87    -12.82     -9.31     -6.74     -5.61
P-Value              0.000     0.000     0.000     0.000     0.000     0.000

SUV_D                          -1.72    -33.71    -35.29    -35.12    -18.68
T-Value                        -2.84     -4.99     -5.36     -5.40     -2.71
P-Value                        0.005     0.000     0.000     0.000     0.008

SUV_L                                    0.180     0.185     0.182     0.088
T-Value                                   4.75      5.04      5.01      2.26
P-Value                                  0.000     0.000     0.000     0.026

Turning Circle                                    -0.285    -0.292    -0.255
T-Value                                            -2.79     -2.90     -2.75
P-Value                                            0.006     0.004     0.007

Horsepower                                                 -0.0124   -0.1619
T-Value                                                      -2.01     -5.04
P-Value                                                      0.046     0.000

HPsq                                                                 0.00040
T-Value                                                                 4.73
P-Value                                                                0.000

S                     2.50      2.43      2.23      2.17      2.14      1.96
R-Sq                 66.78     68.94     74.04     75.70     76.55     80.45
R-Sq(adj)            66.50     68.40     73.36     74.85     75.51     79.41
Mallows C-p           71.5      61.4      34.8      27.4      24.7       4.8
More? (Yes, No, Subcommand, or Help)
SUBC> y
I’m greedy, so while I was surprised that Minitab had found six explanatory (independent) variables that
actually seemed to affect miles per gallon, I wanted more. For the first time ever (for me), Minitab found
another variable.
Step                     7
Constant             58.50

Weight            -0.00342
T-Value              -5.74
P-Value              0.000

SUV_D                -19.0
T-Value              -2.79
P-Value              0.006

SUV_L                0.090
T-Value               2.36
P-Value              0.020

Turning Circle      -0.210
T-Value              -2.24
P-Value              0.027

Horsepower          -0.175
T-Value              -5.43
P-Value              0.000

HPsq               0.00042
T-Value               5.03
P-Value              0.000

Fuel_D                0.92
T-Value               2.11
P-Value              0.037

S                     1.93
R-Sq                 81.21
R-Sq(adj)            80.02
Mallows C-p            2.5
More? (Yes, No, Subcommand, or Help)
SUBC> y
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> n
Because I was worried about Collinearity, I had the computer do a table of correlations between all the
independent variables. The table is triangular since the correlation between, say, Length and Horsepower is
going to be the same as the correlation between Horsepower and Length. So, for example, the correlation
between Horsepower and Length is .648, and the p-value of zero reported with it tests the null hypothesis
that the population correlation is zero (that is, insignificant). The explanation of Predicted $R^2$ that
appears below the correlation table was a new one on me, but could help you in comparing the regressions.
MTB > Correlation 'Horsepower' 'Length' 'Width' 'Weight' 'Cargo Volume' &
CONT> 'Turning Circle' 'SUV_D' 'Fuel_D' 'SUVwt' 'SUVtc' 'HPsq' 'AWD_D' &
CONT> 'FWD_D' 'RWD_D' 'SUV_L'.
Correlations: Horsepower, Length, Width, Weight, Cargo Volume, ...

               Horsepower       Length           Width            Weight
Length          0.648 (0.000)
Width           0.660 (0.000)    0.825 (0.000)
Weight          0.673 (0.000)    0.634 (0.000)    0.780 (0.000)
Cargo Volume    0.296 (0.001)    0.395 (0.000)    0.546 (0.000)    0.716 (0.000)
Turning Circ    0.497 (0.000)    0.750 (0.000)    0.658 (0.000)    0.650 (0.000)
SUV_D           0.160 (0.080)   -0.102 (0.265)    0.180 (0.049)    0.535 (0.000)
Fuel_D          0.321 (0.000)   -0.013 (0.886)   -0.042 (0.645)    0.057 (0.540)
SUVwt           0.182 (0.045)   -0.077 (0.403)    0.206 (0.023)    0.562 (0.000)
SUVtc           0.185 (0.042)   -0.062 (0.502)    0.211 (0.020)    0.577 (0.000)
HPsq            0.989 (0.000)    0.632 (0.000)    0.645 (0.000)    0.668 (0.000)
AWD_D           0.059 (0.523)   -0.118 (0.199)   -0.037 (0.691)    0.065 (0.483)
FWD_D          -0.370 (0.000)   -0.001 (0.994)   -0.163 (0.076)   -0.453 (0.000)
RWD_D           0.334 (0.000)    0.070 (0.445)    0.151 (0.101)    0.351 (0.000)
SUV_L           0.197 (0.030)   -0.053 (0.564)    0.219 (0.016)    0.582 (0.000)

               Cargo Volume     Turning Circ     SUV_D            Fuel_D
Turning Circ    0.486 (0.000)
SUV_D           0.459 (0.000)    0.139 (0.127)
Fuel_D         -0.245 (0.007)   -0.069 (0.456)   -0.147 (0.110)
SUVwt           0.473 (0.000)    0.161 (0.078)    0.999 (0.000)   -0.141 (0.125)
SUVtc           0.484 (0.000)    0.196 (0.031)    0.996 (0.000)   -0.142 (0.121)
HPsq            0.289 (0.001)    0.480 (0.000)    0.173 (0.058)    0.296 (0.001)
AWD_D           0.021 (0.823)   -0.068 (0.461)    0.185 (0.043)    0.218 (0.017)
FWD_D          -0.165 (0.071)   -0.027 (0.771)   -0.517 (0.000)   -0.280 (0.002)
RWD_D           0.108 (0.239)    0.015 (0.874)    0.364 (0.000)    0.098 (0.288)
SUV_L           0.487 (0.000)    0.181 (0.047)    0.996 (0.000)   -0.145 (0.114)

               SUVwt            SUVtc            HPsq             AWD_D
SUVtc           0.998 (0.000)
HPsq            0.198 (0.030)    0.200 (0.028)
AWD_D           0.184 (0.044)    0.174 (0.057)    0.040 (0.667)
FWD_D          -0.522 (0.000)   -0.526 (0.000)   -0.369 (0.000)   -0.366 (0.000)
RWD_D           0.367 (0.000)    0.374 (0.000)    0.347 (0.000)   -0.137 (0.135)
SUV_L           0.999 (0.000)    0.998 (0.000)    0.215 (0.018)    0.176 (0.054)

               FWD_D            RWD_D
RWD_D          -0.810 (0.000)
SUV_L          -0.529 (0.000)    0.381 (0.000)

Cell Contents: Pearson correlation (P-Value in parentheses)
PRESS: Assesses your model's predictive ability. In general, the smaller the prediction sum of squares
(PRESS) value, the better the model's predictive ability. PRESS is used to calculate the predicted $R^2$.
PRESS, similar to the error sum of squares (SSE), is the sum of squares of the prediction error. PRESS
differs from SSE in that each fitted value, $\hat{y}_i$, for PRESS is obtained by deleting the ith
observation from the data set, estimating the regression equation from the remaining n - 1 observations,
then using the fitted regression function to obtain the predicted value for the ith observation.
Predicted R2: Similar to $R^2$. Predicted $R^2$ indicates how well the model predicts responses for new
observations, whereas $R^2$ indicates how well the model fits your data. Predicted $R^2$ can prevent
overfitting the model and is more useful than adjusted $R^2$ for comparing models because it is calculated
with observations not included in model calculation. Predicted $R^2$ is between 0 and 1 and is calculated
from the PRESS statistic. Larger values of predicted $R^2$ suggest models of greater predictive ability.
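The leave-one-out idea behind PRESS can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny data set and the helper names `fit_line` and `press_summary` are hypothetical, and a one-variable regression stands in for the multiple regressions in this exam. The last lines check the relation predicted $R^2$ = 1 - PRESS/SST against the PRESS and Total SS values printed for Regression 1.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a one-variable regression."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def press_summary(xs, ys):
    """Return SSE, PRESS, R-squared and predicted R-squared."""
    b0, b1 = fit_line(xs, ys)
    ybar = sum(ys) / len(ys)
    sst = sum((y - ybar) ** 2 for y in ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    press = 0.0
    for i in range(len(xs)):
        # Delete observation i, refit, then predict the deleted point.
        a0, a1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a0 + a1 * xs[i])) ** 2
    return {"SSE": sse, "PRESS": press,
            "R2": 1 - sse / sst, "R2_pred": 1 - press / sst}


# Hypothetical data, not taken from the exam.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
summary = press_summary(xs, ys)

# Check against Regression 1 in the printout: PRESS = 752.906, Total SS = 2627.29.
r2_pred_reg1 = 1 - 752.906 / 2627.29
print(summary, round(100 * r2_pred_reg1, 2))
```

Because each deleted-point error is at least as large in absolute value as the ordinary residual, PRESS is never smaller than SSE, so predicted $R^2$ never exceeds $R^2$.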
So now it’s time to get serious. My first regression was based on what I had learned from the
stepwise regression. The only one of the variables that I left out from the stepwise regression was FUEL_D.
1. Look at the results of Regression 1. But don’t forget what has gone before.
a. What does the Analysis of variance show us? Why? (1)
b. What suggests that the relation of MPG to one of the variables is nonlinear? (1)
c. What does the equation suggest that the difference is between an extra inch on an SUV
and a non-SUV? (1)
d. Why did I leave out FUEL_D? (2)
e. Which coefficients are not significant? Why? (2)
f. What do the values of the VIFs tell us? (2)
MTB > Regress 'MPG' 6 'Weight' 'SUV_D' 'SUV_L' 'Turning Circle' &
CONT> 'Horsepower' 'HPsq';
SUBC> Constant;
SUBC> Brief 2.
MTB > Regress 'MPG' 6 'Weight' 'SUV_D' 'SUV_L' 'Turning Circle' &
CONT> 'Horsepower' 'HPsq';
SUBC> GNormalplot;
SUBC> NoDGraphs;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Press;
SUBC> Brief 2.
Regression Analysis: MPG versus Weight, SUV_D, ... (Regression 1)

The regression equation is
MPG = 63.1 - 0.00303 Weight - 14.8 SUV_D + 0.0653 SUV_L - 0.264 Turning Circle
      - 0.213 Horsepower + 0.000522 HPsq

Predictor              Coef     SE Coef       T      P    VIF
Constant             63.105       3.978   15.86  0.000
Weight           -0.0030345   0.0006859   -4.42  0.000    5.6
SUV_D               -14.812       7.957   -1.86  0.065  282.1
SUV_L               0.06527     0.04478    1.46  0.148  307.9
Turning Circle      -0.2639      0.1050   -2.51  0.013    2.0
Horsepower         -0.21251     0.03575   -5.94  0.000   63.5
HPsq             0.00052249  0.00009459    5.52  0.000   61.3

S = 2.27485   R-Sq = 77.5%   R-Sq(adj) = 76.4%
PRESS = 752.906   R-Sq(pred) = 71.34%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2037.34  339.56  65.62  0.000
Residual Error  114   589.95    5.17
Total           120  2627.29

Source          DF   Seq SS
Weight           1  1605.19
SUV_D            1    47.29
SUV_L            1   132.83
Turning Circle   1    52.31
Horsepower       1    41.83
HPsq             1   157.89

Unusual Observations
Obs  Weight     MPG     Fit  SE Fit  Residual  St Resid
 16    5590  13.000  15.361   1.137    -2.361     -1.20 X
 34    7270  10.000   6.856   1.461     3.144      1.80 X
 40    5590  13.000  15.361   1.137    -2.361     -1.20 X
 62    4065  19.000  14.633   0.654     4.367      2.00R
108    2150  38.000  30.489   0.632     7.511      3.44R
111    2750  41.000  33.473   1.133     7.527      3.82RX
114    2935  41.000  29.806   0.777    11.194      5.24R
115    2940  24.000  29.791   0.778    -5.791     -2.71R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
2. Look at the results of Regression 2. But don’t forget what has gone before.
a. What variable did I drop? Why? (2)
b. Are there any coefficients that have a sign that you would not expect? Why? (1)
c. A Chevrolet Suburban is an SUV with rear wheel drive and 285 horsepower, that takes
Regular fuel, has a length of 219 inches, a width of 79 inches, a weight of 5590 pounds, a
cargo volume of 77.0 square feet and a turning circle of 46 Feet (!!! Maybe it was
inches?). What miles per gallon does the equation predict? What would it be if the vehicle
was not classified as an SUV? (3)
d. Why do I like this regression better than the previous one? (2)
[17]
MTB > Regress 'MPG' 5 'Weight' 'SUV_D' 'Turning Circle' 'Horsepower' &
CONT> 'HPsq';
SUBC> GNormalplot;
SUBC> NoDGraphs;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Press;
SUBC> Brief 2.
Regression Analysis: MPG versus Weight, SUV_D, ... (Regression 2)
The regression equation is
MPG = 63.1 - 0.00250 Weight - 3.25 SUV_D - 0.250 Turning Circle
      - 0.239 Horsepower + 0.000593 HPsq

Predictor              Coef     SE Coef       T      P    VIF
Constant             63.137       3.998   15.79  0.000
Weight           -0.0025020   0.0005834   -4.29  0.000    4.0
SUV_D               -3.2492      0.6272   -5.18  0.000    1.7
Turning Circle      -0.2501      0.1051   -2.38  0.019    1.9
Horsepower         -0.23928     0.03082   -7.76  0.000   46.7
HPsq             0.00059313  0.00008163    7.27  0.000   45.2

S = 2.28595   R-Sq = 77.1%   R-Sq(adj) = 76.1%
PRESS = 744.047   R-Sq(pred) = 71.68%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        5  2026.35  405.27  77.56  0.000
Residual Error  115   600.94    5.23
Total           120  2627.29

Source          DF   Seq SS
Weight           1  1605.19
SUV_D            1    47.29
Turning Circle   1    46.32
Horsepower       1    51.65
HPsq             1   275.90

Unusual Observations
Obs  Weight     MPG     Fit  SE Fit  Residual  St Resid
 16    5590  13.000  14.381   0.921    -1.381     -0.66 X
 34    7270  10.000   5.945   1.328     4.055      2.18RX
 40    5590  13.000  14.381   0.921    -1.381     -0.66 X
108    2150  38.000  30.081   0.570     7.919      3.58R
111    2750  41.000  33.910   1.098     7.090      3.54RX
114    2935  41.000  30.060   0.761    10.940      5.08R
115    2940  24.000  30.047   0.762    -6.047     -2.81R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
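As a check on how a fitted value in the Unusual Observations table comes out of the coefficient table, the Regression 2 equation can be evaluated directly. This is a sketch only: the function name `predict_mpg_reg2` is hypothetical, the coefficients are copied from the unrounded Coef column above, and it is an assumption that the inputs (weight 5590, turning circle 46, horsepower 285, SUV) describe observation 16, whose printed Fit is 14.381.

```python
def predict_mpg_reg2(weight, suv_d, turning_circle, horsepower):
    """Evaluate the Regression 2 equation using the Coef column of the printout."""
    return (63.137
            - 0.0025020 * weight
            - 3.2492 * suv_d
            - 0.2501 * turning_circle
            - 0.23928 * horsepower
            + 0.00059313 * horsepower ** 2)  # HPsq is horsepower squared


# Assumed inputs for the 5590-pound vehicle (see the lead-in).
fit_suv = predict_mpg_reg2(5590, 1, 46, 285)
# Same vehicle with the SUV dummy switched off.
fit_not_suv = predict_mpg_reg2(5590, 0, 46, 285)
print(round(fit_suv, 3), round(fit_not_suv, 3))
```

The two calls differ only by the SUV_D coefficient, which is what a dummy variable with no interaction term implies.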
Because I wanted to look at the effect of the three drive variables on MPG, I ran another stepwise
regression. The first part of this is identical to the last stepwise regression, but after the 6th regression, I
forced out SUV_L and forced in AWD_D, FWD_D and RWD_D. Because I had to make the regressions
comparable, I threw an observation with an anomalous drive variable out and redid my two regressions as
Regressions 3 and 4. I then added in all the drive variables as a package in Regression 5.
MTB > Stepwise 'MPG' 'Horsepower' 'Length' 'Width' 'Weight' 'Cargo Volume' &
CONT> 'Turning Circle' 'SUV_D' 'Fuel_D' 'SUVwt' 'HPsq' 'AWD_D' &
CONT> 'FWD_D' 'RWD_D' 'SUV_L';
SUBC> AEnter 0.15;
SUBC> ARemove 0.15;
SUBC> Best 0;
SUBC> Constant.
Stepwise Regression: MPG versus Horsepower, Length, ...
Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15
Response is MPG on 14 predictors, with N = 119
N(cases with missing observations) = 2 N(all cases) = 121
Step                     1         2         3         4         5         6
Constant             38.31     36.75     41.59     50.06     50.15     59.00

Weight            -0.00491  -0.00436  -0.00578  -0.00495  -0.00424  -0.00339
T-Value             -15.34    -11.87    -12.82     -9.31     -6.74     -5.61
P-Value              0.000     0.000     0.000     0.000     0.000     0.000

SUV_D                          -1.72    -33.71    -35.29    -35.12    -18.68
T-Value                        -2.84     -4.99     -5.36     -5.40     -2.71
P-Value                        0.005     0.000     0.000     0.000     0.008

SUV_L                                    0.180     0.185     0.182     0.088
T-Value                                   4.75      5.04      5.01      2.26
P-Value                                  0.000     0.000     0.000     0.026

Turning Circle                                    -0.285    -0.292    -0.255
T-Value                                            -2.79     -2.90     -2.75
P-Value                                            0.006     0.004     0.007

Horsepower                                                 -0.0124   -0.1619
T-Value                                                      -2.01     -5.04
P-Value                                                      0.046     0.000

HPsq                                                                 0.00040
T-Value                                                                 4.73
P-Value                                                                0.000

S                     2.50      2.43      2.23      2.17      2.14      1.96
R-Sq                 66.78     68.94     74.04     75.70     76.55     80.45
R-Sq(adj)            66.50     68.40     73.36     74.85     75.51     79.41
Mallows C-p           71.5      61.4      34.8      27.4      24.7       4.8
More? (Yes, No, Subcommand, or Help)
SUBC> remove 'SUV_L'.
Step                     7         8         9
Constant             59.15     59.00     58.50

Weight            -0.00267  -0.00339  -0.00342
T-Value              -5.10     -5.61     -5.74
P-Value              0.000     0.000     0.000

SUV_D                -3.13    -18.68    -18.95
T-Value              -5.51     -2.71     -2.79
P-Value              0.000     0.008     0.006

SUV_L                          0.088     0.090
T-Value                         2.26      2.36
P-Value                        0.026     0.020

Turning Circle      -0.236    -0.255    -0.210
T-Value              -2.51     -2.75     -2.24
P-Value              0.013     0.007     0.027

Horsepower          -0.199    -0.162    -0.175
T-Value              -7.09     -5.04     -5.43
P-Value              0.000     0.000     0.000

HPsq               0.00050   0.00040   0.00042
T-Value               6.75      4.73      5.03
P-Value              0.000     0.000     0.000

Fuel_D                                    0.92
T-Value                                   2.11
P-Value                                  0.037

S                     2.00      1.96      1.93
R-Sq                 79.56     80.45     81.21
R-Sq(adj)            78.66     79.41     80.02
Mallows C-p            7.8       4.8       2.5
More? (Yes, No, Subcommand, or Help)
SUBC> enter 'AWD_D' 'FWD_D' 'RWD_D'.
Step                    10        11        12        13
Constant             60.14     59.11     58.50     58.50

Weight            -0.00355  -0.00346  -0.00344  -0.00342
T-Value              -5.75     -5.72     -5.72     -5.74
P-Value              0.000     0.000     0.000     0.000

SUV_D                -19.5     -19.1     -18.8     -19.0
T-Value              -2.82     -2.77     -2.74     -2.79
P-Value              0.006     0.007     0.007     0.006

SUV_L                0.092     0.090     0.089     0.090
T-Value               2.37      2.32      2.30      2.36
P-Value              0.020     0.022     0.023     0.020

Turning Circle      -0.207    -0.205    -0.202    -0.210
T-Value              -2.10     -2.09     -2.07     -2.24
P-Value              0.038     0.039     0.041     0.027

Horsepower          -0.175    -0.177    -0.176    -0.175
T-Value              -5.33     -5.42     -5.41     -5.43
P-Value              0.000     0.000     0.000     0.000

HPsq               0.00042   0.00043   0.00042   0.00042
T-Value               4.98      5.04      5.02      5.03
P-Value              0.000     0.000     0.000     0.000

Fuel_D                0.73      0.80      0.87      0.92
T-Value               1.49      1.66      1.92      2.11
P-Value              0.139     0.099     0.057     0.037

AWD_D                 -1.1
T-Value              -0.76
P-Value              0.451

FWD_D                -1.36     -0.51     -0.17
T-Value              -0.98     -0.62     -0.32
P-Value              0.331     0.535     0.752

RWD_D                -1.23     -0.42
T-Value              -0.93     -0.55
P-Value              0.353     0.586

S                     1.95      1.95      1.94      1.93
R-Sq                 81.37     81.27     81.22     81.21
R-Sq(adj)            79.65     79.73     79.86     80.02
Mallows C-p            7.6       6.1       4.4       2.5
More? (Yes, No, Subcommand, or Help)
SUBC> no
Results for: 252x0504-41.MTW
MTB > WSave "C:\Documents and Settings\rbove\My Documents\Minitab\252x050441.MTW";
SUBC> Replace.
Saving file as: 'C:\Documents and Settings\rbove\My Documents\Minitab\252x0504-41.MTW'
MTB > erase c21
MTB > Regress 'MPG' 6 'Weight' 'SUV_D' 'SUV_L' 'Turning Circle' &
CONT> 'Horsepower' 'HPsq' ;
SUBC> GNormalplot;
SUBC> NoDGraphs;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Press;
SUBC> Brief 2.
Regression Analysis: MPG versus Weight, SUV_D, ...
(Regression 3)
The regression equation is
MPG = 64.4 - 0.00284 Weight - 15.8 SUV_D + 0.0694 SUV_L - 0.305 Turning Circle
      - 0.214 Horsepower + 0.000524 HPsq

Predictor              Coef     SE Coef       T      P    VIF
Constant             64.364       3.973   16.20  0.000
Weight           -0.0028431   0.0006832   -4.16  0.000    5.7
SUV_D               -15.843       7.867   -2.01  0.046  276.4
SUV_L               0.06943     0.04423    1.57  0.119  301.7
Turning Circle      -0.3045      0.1055   -2.89  0.005    2.0
Horsepower         -0.21444     0.03528   -6.08  0.000   63.1
HPsq             0.00052386  0.00009332    5.61  0.000   61.0

S = 2.24427   R-Sq = 78.3%   R-Sq(adj) = 77.2%
PRESS = 725.963   R-Sq(pred) = 72.34%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        6  2055.21  342.54  68.01  0.000
Residual Error  113   569.15    5.04
Total           119  2624.37

Source          DF   Seq SS
Weight           1  1602.61
SUV_D            1    49.58
SUV_L            1   135.39
Turning Circle   1    61.04
Horsepower       1    47.88
HPsq             1   158.71

Unusual Observations
Obs  Weight     MPG     Fit  SE Fit  Residual  St Resid
 16    5590  13.000  15.259   1.123    -2.259     -1.16 X
 34    7270  10.000   6.907   1.442     3.093      1.80 X
 36    2715  24.000  28.432   0.493    -4.432     -2.02R
 40    5590  13.000  15.259   1.123    -2.259     -1.16 X
107    2150  38.000  30.543   0.624     7.457      3.46R
110    2750  41.000  33.747   1.126     7.253      3.74RX
113    2935  41.000  30.000   0.772    11.000      5.22R
114    2940  24.000  29.985   0.774    -5.985     -2.84R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
MTB > Regress 'MPG' 5 'Weight' 'SUV_D' 'Turning Circle' 'Horsepower' &
CONT> 'HPsq' ;
SUBC> GNormalplot;
SUBC> NoDGraphs;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Press;
SUBC> Brief 2.
Regression Analysis: MPG versus Weight, SUV_D, ... (Regression 4)
The regression equation is
MPG = 64.4 - 0.00228 Weight - 3.53 SUV_D - 0.288 Turning Circle
      - 0.243 Horsepower + 0.000599 HPsq

Predictor              Coef     SE Coef       T      P    VIF
Constant             64.352       3.999   16.09  0.000
Weight           -0.0022848   0.0005871   -3.89  0.000    4.2
SUV_D               -3.5330      0.6366   -5.55  0.000    1.8
Turning Circle      -0.2884      0.1057   -2.73  0.007    2.0
Horsepower         -0.24278     0.03051   -7.96  0.000   46.6
HPsq             0.00059879  0.00008071    7.42  0.000   45.0

S = 2.25865   R-Sq = 77.8%   R-Sq(adj) = 76.9%
PRESS = 720.507   R-Sq(pred) = 72.55%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        5  2042.80  408.56  80.09  0.000
Residual Error  114   581.57    5.10
Total           119  2624.37

Source          DF   Seq SS
Weight           1  1602.61
SUV_D            1    49.58
Turning Circle   1    52.45
Horsepower       1    57.33
HPsq             1   280.82

Unusual Observations
Obs  Weight     MPG     Fit  SE Fit  Residual  St Resid
 16    5590  13.000  14.223   0.914    -1.223     -0.59 X
 34    7270  10.000   5.938   1.312     4.062      2.21RX
 40    5590  13.000  14.223   0.914    -1.223     -0.59 X
107    2150  38.000  30.108   0.563     7.892      3.61R
110    2750  41.000  34.201   1.095     6.799      3.44RX
113    2935  41.000  30.262   0.759    10.738      5.05R
114    2940  24.000  30.251   0.760    -6.251     -2.94R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
3. Look at the results of Regression 5 and Regression 4. But don’t forget what has gone before.
a. Do an F test to see if Regression 5 is better than Regression 4. If you can, include the
results from my forcing variables in the last stepwise regression. (6)
b. Should I have included another dummy variable to represent 4-wheel drive? Why? (2)
c. Are there any coefficients in Regression 5 that have a sign that you would not expect?
Why? (1)
d. A Chevrolet Suburban is an SUV with rear wheel drive and 285 horsepower, that takes
Regular fuel, has a length of 219 inches, a width of 79 inches, a weight of 5590 pounds, a
cargo volume of 77.0 square feet and a turning circle of 46 Feet (!!! Maybe it was
inches?). How do the predictions for MPG in Equations 2 and 4 differ in percentage
terms? (3)
[29]
MTB > Regress 'MPG' 8 'Weight' 'SUV_D' 'Turning Circle' 'Horsepower' &
CONT> 'HPsq' 'AWD_D' 'FWD_D' 'RWD_D';
SUBC> GNormalplot;
SUBC> NoDGraphs;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Press;
SUBC> Brief 2.
Regression Analysis: MPG versus Weight, SUV_D, ... (Regression 5)
The regression equation is
MPG = 66.4 - 0.00248 Weight - 3.83 SUV_D - 0.254 Turning Circle
      - 0.251 Horsepower + 0.000618 HPsq - 1.21 AWD_D - 2.10 FWD_D - 1.70 RWD_D

Predictor              Coef     SE Coef       T      P    VIF
Constant             66.435       4.400   15.10  0.000
Weight           -0.0024795   0.0006077   -4.08  0.000    4.4
SUV_D               -3.8302      0.6814   -5.62  0.000    2.0
Turning Circle      -0.2541      0.1116   -2.28  0.025    2.2
Horsepower         -0.25082     0.03122   -8.03  0.000   48.6
HPsq             0.00061833  0.00008244    7.50  0.000   46.7
AWD_D                -1.213       1.620   -0.75  0.455    3.4
FWD_D                -2.103       1.490   -1.41  0.161   11.2
RWD_D                -1.697       1.434   -1.18  0.239    8.6

S = 2.26416   R-Sq = 78.3%   R-Sq(adj) = 76.8%
PRESS = 727.840   R-Sq(pred) = 72.27%

Analysis of Variance
Source           DF       SS      MS      F      P
Regression        8  2055.33  256.92  50.12  0.000
Residual Error  111   569.03    5.13
Total           119  2624.37

Source          DF   Seq SS
Weight           1  1602.61
SUV_D            1    49.58
Turning Circle   1    52.45
Horsepower       1    57.33
HPsq             1   280.82
AWD_D            1     2.00
FWD_D            1     3.36
RWD_D            1     7.17

Unusual Observations
Obs  Weight     MPG     Fit  SE Fit  Residual  St Resid
 34    7270  10.000   5.609   1.377     4.391      2.44RX
 57    4735  14.000  13.622   1.447     0.378      0.22 X
 72    4720  15.000  15.901   1.374    -0.901     -0.50 X
107    2150  38.000  30.231   0.574     7.769      3.55R
109    5435  14.000  13.477   1.338     0.523      0.29 X
110    2750  41.000  34.346   1.106     6.654      3.37RX
113    2935  41.000  30.341   0.765    10.659      5.00R
114    2940  24.000  30.329   0.766    -6.329     -2.97R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
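The mechanics of comparing a smaller regression with a larger one that adds a package of variables can be sketched as a partial F computation. This is an illustration, not the official solution: the variable names are hypothetical, and the error sums of squares and degrees of freedom are copied from the Residual Error lines of Regressions 4 and 5. The resulting statistic would still have to be compared with an F table value for (3, 111) degrees of freedom.

```python
# Residual Error lines from the printouts.
sse_reduced, df_reduced = 581.57, 114   # Regression 4 (no drive dummies)
sse_full, df_full = 569.03, 111         # Regression 5 (adds AWD_D, FWD_D, RWD_D)

extra_vars = df_reduced - df_full       # number of variables added as a package
# Partial F: extra explained SS per added variable over the full-model MSE.
f_partial = ((sse_reduced - sse_full) / extra_vars) / (sse_full / df_full)
print(extra_vars, round(f_partial, 3))
```

The numerator sum of squares, 581.57 - 569.03 = 12.54, agrees up to rounding with the sum of the last three sequential sums of squares in Regression 5 (2.00 + 3.36 + 7.17 = 12.53).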
II. Do at least 4 of the following 6 Problems (at least 15 each), or do sections adding to at least 60 points.
Anything extra you do helps, and grades wrap around. Show your work! State $H_0$ and $H_1$ where
applicable. Use a significance level of 5% unless noted otherwise. Do not answer questions without
citing appropriate statistical tests. That is, explain your hypotheses and what values from what table
were used to test them. Clearly label what section of each problem you are doing! The entire test has
175 points, but 100 is considered a perfect score.
Exhibit 1. A tear-off copy of this exhibit appears at the end of the exam.
An entrepreneur believes that her business is growing steadily and wants to compute a trend line for her
output Y against time $x_1 = T$. She also decides to repeat the regression after adding $x_2 = T^2$ as a
second independent variable. Her data and results follow. The t statistics have been relabeled ‘t-ratio’
to prevent confusion with T.
Row      Y     T    T2
  1  53.43     1     1
  2  59.09     2     4
  3  59.58     3     9
  4  64.75     4    16
  5  68.65     5    25
  6  65.53     6    36
  7  68.44     7    49
  8  70.93     8    64
  9  72.85     9    81
 10  73.60    10   100
 11  72.93    11   121
 12  75.14    12   144
 13  73.88    13   169
 14  76.55    14   196
 15  79.05    15   225

Regression Analysis: Y versus T
The regression equation is
Y = 56.7 + 1.54 T
Predictor     Coef  SE Coef  t-ratio      P
Constant    56.659    1.283    44.15  0.000
T           1.5377   0.1411    10.89  0.000
S = 2.36169   R-Sq = 90.1%   R-Sq(adj) = 89.4%
Regression Analysis: Y versus T, TSQ
The regression equation is
Y = 52.4 + 3.04 T - 0.0939 TSQ
Predictor      Coef  SE Coef  t-ratio      P
Constant     52.401    1.545    33.91  0.000
T            3.0405   0.4444     6.84  0.000
TSQ        -0.09392  0.02701    -3.48  0.005
S = 1.73483   R-Sq = 95.1%
If you need them, her means and spare parts are below.
$\bar{Y} = 68.9600$, $\bar{X}_1 = 8.00$, $\bar{X}_2 = 82.6667$
$\sum X_1^2 - n\bar{X}_1^2 = SSX_1 = 280$
$\sum X_2^2 - n\bar{X}_2^2 = SSX_2 = 75805.3$
$\sum Y^2 - n\bar{Y}^2 = SST = SSY = 734.556$
$\sum X_1 Y - n\bar{X}_1\bar{Y} = SX_1Y = 430.550$
$\sum X_2 Y - n\bar{X}_2\bar{Y} = SX_2Y = 6501.33$
$\sum X_1 X_2 - n\bar{X}_1\bar{X}_2 = SX_1X_2 = 4480.00$
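To show how the spare parts connect to the printout, the simple regression of Y on T can be rebuilt from them. This is a sketch with hypothetical variable names; it uses only the standard simple-regression formulas (slope = SX1Y / SSX1, intercept = mean of Y minus slope times mean of X1, and $R^2$ = slope times SX1Y / SSY) and the values listed above.

```python
# Spare parts from Exhibit 1.
y_bar, x1_bar = 68.9600, 8.00
ssx1 = 280.0        # sum of squares of T about its mean
sx1y = 430.550      # cross-product of T and Y about their means
ssy = 734.556       # total sum of squares of Y

slope = sx1y / ssx1                    # coefficient of T
intercept = y_bar - slope * x1_bar     # constant term
r_squared = slope * sx1y / ssy         # regression SS over total SS
print(round(intercept, 3), round(slope, 4), round(100 * r_squared, 1))
```

The results can be compared with the first printout, which reports Constant 56.659, T coefficient 1.5377 and R-Sq = 90.1%.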
1. Do the following using Exhibit 1.
a) Explain what numbers in the printout were used to compute the t-ratio 6.84, what table value
you would compare it with to do a 2-sided 1% significance test and whether and why the
coefficient is significant. (3)
b) The entrepreneur looked at the residual analysis of the first regression and decided that she
needs a time squared term. What is she likely to have seen to cause her to make that decision? (1)
c) Use the values of $R^2$ to do an F test to see if the addition of the $T^2$ term makes a significant
improvement in the regression. (4)
d) Get $\bar{R}^2$ ($R^2$ adjusted for degrees of freedom) for the second regression and explain what it
seems to show. (2)
e) In the first regression, the Durbin-Watson statistic was 1.07 and for the second it was 1.94.
What do these numbers indicate? (Do a significance test.) (5)
f) For the second regression, make a prediction of the output in the 16th year and use the suggestion
in the outline to make it into a prediction interval. Why would a confidence interval be
inappropriate here? (3)
[15]
2. Do the following using Exhibit 1.
a) Compute the (Pearson’s) sample correlation between output and time and test it for significance.
(5)
b) Test the hypothesis that the population correlation $\rho = 0.8$. (5)
c) Do Spearman’s rank correlation between output and time and test it for significance. Why is the
rank correlation higher than Pearson’s? (6)
[16, 31]
3. (Berenson et. al.) The operations manager of a light bulb factory wants to determine if there is any
difference between the average life expectancy of a light bulb manufactured by two different machines. A
random sample of 25 light bulbs from machine 1 has a sample mean of 375 hours. A random sample of 25
light bulbs from machine 2 has a sample mean of 362 hours.
a) Test whether the mean lives of the bulbs differ at the 5% significance level. Assume that 110 is
the population standard deviation for machine 1 and that 125 is the population standard deviation
for machine 2. Do not use a confidence interval. State your null hypothesis! (4)
b) Find a p-value for the null hypothesis in part a) and interpret it. Do not use the t-table. (3)
c) Test whether the mean lives of the bulbs differ at the 5% significance level. Assume that 110 is
the sample standard deviation for machine 1 and that 125 is the sample standard deviation for
machine 2. Do not use a confidence interval. State your null hypothesis! Make and state an
assumption about the equality of the two standard deviations. (3 or 5)
d) Test the assumption about the standard deviations that you made in c). State your null
hypothesis! (2)
e) Make the following confidence intervals.
(i) A confidence interval for the difference between the means in a). (1)
(ii) A confidence interval for the difference between the means in c) (2)
(iii) A confidence interval for the ratio of the population variances in d) (2)
[15, 46]
4. (Berenson et. al.) A student team is investigating the size of the bubbles that can be blown with various
brands of bubble gum. The data below is the total diameter in inches of the bubbles and is presented in two
different ways. These Exhibits are repeated as a tear-off sheet at the end of the exam with the sums
and sums of squares computed for you.
Exhibit 2: Size of Bubbles Blocked by Blower
                 Brand 1  Brand 2  Brand 3  Brand 4
Row  Student          x1       x2       x3       x4
  1  Loopy          8.75      9.5      8.5     11.5
  2  Percival       9.50      4.0      8.5     11.0
  3  Poopsy         9.25      5.5      7.5      7.5
  4  Dizzy          9.50      8.5      7.5      7.5
  5  Booger         9.25      4.5      8.0      8.0

Exhibit 3: Size of Bubbles Blocked by Blower but ranked as four independent random samples.
Row  Student       x1    r1     x2    r2     x3    r3     x4    r4
  1  Loopy       8.75  13.0    9.5    17    8.5  11.0   11.5  20.0
  2  Percival    9.50  17.0    4.0     1    8.5  11.0   11.0  19.0
  3  Poopsy      9.25  14.5    5.5     3    7.5   5.5    7.5   5.5
  4  Dizzy       9.50  17.0    8.5    11    7.5   5.5    7.5   5.5
  5  Booger      9.25  14.5    4.5     2    8.0   8.5    8.0   8.5
Compare the data in 3 different ways.
a) Do only one of the following.
(i) Consider the data random samples from Normally distributed populations and
compare means. (6)
(ii) Consider the data blocked data from Normally distributed populations and compare
means. (8)
b) Consider the data random samples from non-Normally distributed populations and compare
medians. (5)
c) Consider the data blocked data from non-Normally distributed populations and compare
medians. (5)
[24, 70]
5. (Berenson et. al.) The time it takes to design and launch a marketing campaign is called a cycle time.
Marketing campaigns are classified by cycle time (in months) and effectiveness. Don’t even think of
answering any part of this question without doing a statistical test!
                           Duration
Effectiveness     < 1 mo.  1-2 mo.  2-4 mo.  > 4 mo.  Total
Very Effective         15       28       24        6     73
Effective               9       26       33       19     87
Ineffective             5        2        3        5     15
Total                  29       56       60       30    175
a. Test to see if the proportion in the various effectiveness categories is related to cycle time. (8)
b. Of the campaigns that took 0 – 2 months 7 were ineffective. Of the campaigns that took more
than two months, 8 were ineffective. Is the fraction that were ineffective in the first category below
the fraction in the second category? (5)
c. Test the hypothesis that 45% of campaigns are very effective. (4)
d. As you know a Jorcillator has two components, a Phillinx and a Flubberall. We recorded the
order in which they were replaced over the last year to see if there was a pattern or the replacement
sequence was just random. We got PPPFFFPPPFFPPFFPPFFF. Test it! (3)
e. (Anderson et al.) The number of emergency calls our Fire department receives is believed to
have a Poisson distribution with a parameter of 3. Test this against data for a period of 120 days:
0 calls on 9 days, 1 call on 12 days, 2 calls on 30 days, 3 calls on 27 days, 4 calls on 22 days, 5
calls on 13 days and 6 calls on 7 days. (5)
[25, 95]
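For the cycle-time table in problem 5, the expected counts under independence come from (row total × column total) / grand total. The following is a minimal sketch of that arithmetic with hypothetical variable names. It stops at the chi-squared statistic and its degrees of freedom, so the comparison with a table value and any check on small expected counts are left to the reader.

```python
# Observed counts: rows are Very Effective, Effective, Ineffective;
# columns are < 1 mo., 1-2 mo., 2-4 mo., > 4 mo.
observed = [
    [15, 28, 24, 6],
    [9, 26, 33, 19],
    [5, 2, 3, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell if effectiveness and duration were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(row_totals, col_totals, n, df, round(chi_sq, 3))
```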
6. Test to see if the price of new homes rose between 2001 and 2002. The following data represents a
random sample of typical prices in thousands in 10 zip codes in 2001 and 2002.
Row  Location          2001      2002     change
                         x1        x2     d = x1 - x2
  1  Alexandria     245.795   293.266    -47.471
  2  Boston         391.750   408.803    -17.053
  3  Decatur        205.270   227.561    -22.291
  4  Kirkland       326.524   333.569     -7.045
  5  New York       545.363   531.098     14.265
  6  Philadelphia   185.736   197.874    -12.138
  7  Phoenix        170.413   175.030     -4.617
  8  Raleigh        210.015   196.094     13.921
  9  San Bruno      385.387   391.409     -6.022
 10  Tampa          194.205   199.858     -5.653
Some of the following data may be of use to you.
$\sum x_1 = 2860.458$, $\sum x_1^2 = 953941.216$, $\sum x_2 = 2954.562$, $\sum x_2^2 = 999628.915$,
$\sum d = -94.104$, $\sum d^2 = 3724.975$
If you want to receive full credit, you must clearly label each section that you do!
a) Remember that the data is cross classified. Assume that the underlying distribution is not
Normal and compare medians. (5)
b) Remember that the data is cross classified. Assume that the underlying distribution is Normal
and compare means. (4)
c) Forget that the data is cross classified. Assume instead that it represents two random samples
from Chester County, one for each year and that the underlying distribution is not Normal.
Compare medians. (6)
[15, 110]
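The sums supplied with problem 6 can be reproduced from the table, and the same differences give the ingredients of a paired comparison. This sketch uses hypothetical variable names and shows only the arithmetic for the mean difference, its standard deviation and the resulting t-ratio. Stating the hypotheses and choosing the table value are still up to you.

```python
import math

# Prices in thousands, in the row order of the table (Alexandria ... Tampa).
x2001 = [245.795, 391.750, 205.270, 326.524, 545.363,
         185.736, 170.413, 210.015, 385.387, 194.205]
x2002 = [293.266, 408.803, 227.561, 333.569, 531.098,
         197.874, 175.030, 196.094, 391.409, 199.858]

d = [a - b for a, b in zip(x2001, x2002)]   # d = x1 - x2 for each location
n = len(d)
sum_d = sum(d)
sum_d_sq = sum(v * v for v in d)

d_bar = sum_d / n
# Sample standard deviation of the differences (computational formula).
s_d = math.sqrt((sum_d_sq - n * d_bar ** 2) / (n - 1))
t_ratio = d_bar / (s_d / math.sqrt(n))      # n - 1 = 9 degrees of freedom
print(round(sum_d, 3), round(sum_d_sq, 3), round(t_ratio, 3))
```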
ECO252 QBA2
Final EXAM
May 2-6, 2005
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
Class days and time : _________________________
1) Please Note: computer problems 2, 3 and 4 should be turned in with the exam (2). In problem 2, the
2-way ANOVA table should be checked. The three F tests should be done with a 5% significance level and
you should note whether there was (i) a significant difference between drivers, (ii) a significant difference
between cars and (iii) significant interaction. In problem 3, you should show on your third graph where the
regression line is. Check what your text says about normal probability plots and analyze the plot you did.
Explain the results of the t and F tests using a 5% significance level. (2)
2) 4th computer problem (4+)
This is an internet project. You are trying to answer the question, ‘how well does manufacturing explain
differences in income?’ You should use some measure of income per person or family in each state as your
dependent variable and try to explain it as a function of (to start with) percent of output or labor force in
manufacturing. This should start out as a simple regression. Then you should try to see whether there are
other variables that explain the differences as well. One possibility is the per cent of the adult population
with college or high school diplomas. Possible sources of data are below, but think about what you use, and
try to find some other sources. Total income of a state, for example, is a very poor choice compared with some
per capita measure, because it is simply going to be high for places with a lot of people without indicating
how well off they are. Similarly, the fraction of the workforce with a certain education level is far better than
the number. For instructions on how to do a regression, try the material in Doing a Regression.
http://www.nam.org/s_nam/sec.asp?CID=5&DID=3 Manufacturing share in state economies
(http://www.nam.org/Docs/IEA/26767_20002001ManufacturingShareandChangeinStateEconomies.pdf?DocTypeID=9&TrackID=&Param=@CategoryID=1156@TPT=2002-2001+Manufacturing+Share+and+Change+in+State+Economics)
http://www.nemw.org/data.htm Per capita income by state.
http://www.nemw.org/data.htm State personal income per capita.
http://www.bea.doc.gov/bea/regional/data.htm Personal income per capita by state.
http://www.census.gov/statab/www/ Many state statistics, including persons with bachelor’s degrees.
http://www.epinet.org/content.cfm/datazone_index Income inequality, median income, unemployment rates.
Anyway, your job is to add whatever variable you think ought to explain your income measure. Consider all
50 states your sample. Your report should tell what numbers you used, from where and from what years.
What coefficients were significant and do you think on the basis of your results that manufacturing is an
important predictor of a state’s prosperity? Mark all significant F and t coefficients using a 5% significance
level. Explain VIFs.
Of course, if you don’t like this assignment, get approval to research something else on the internet. For
example, does the per cent of the population in prison affect the crime rate (maybe with a few years’ lag)?
Or are there better predictors? And get out the Durbin-Watson; prison vs. crime rate is a time series project.
[8]
3) Hotshot Associates is afraid of sex discrimination charges and collects the data below. The dependent
variable is income in thousands of dollars and the two independent variables are education in years and a
dummy variable indicating sex (1 means a female). The lines in the middle are missing because the totals
are reliable and you don’t need them. The only thing that is missing is you. Add yourself to the sample as a
21st observation with 12 years of education and an income of 100.0 (thousand) plus the last two digits of
your student number as hundreds. For example Roland Dough’s student number is 123689, so he adds
$8900 to $100000 to get 108900, which he records as 108.9.
         INC    EDUC    SEX
Row      $y$   $x_1$  $x_2$  $x_1^2$  $x_2^2$      $y^2$  $x_1 y$  $x_2 y$  $x_1 x_2$
  1     39.0       2      0        4        0    1521.00     78.0      0.0          0
  2     43.7       4      1       16        1    1909.69    174.8     43.7          4
  3     62.6       8      0       64        0    3918.76    500.8      0.0          0
  4     42.8       8      1       64        1    1831.84    342.4     42.8          8
  5     55.0       8      0       64        0    3025.00    440.0      0.0          0
...
 17     72.9      16      0      256        0    5314.41   1166.4      0.0          0
 18     56.1      16      1      256        1    3147.21    897.6     56.1         16
 19     67.1      17      0      289        0    4502.41   1140.7      0.0          0
 20     82.3      21      0      441        0    6773.29   1728.3      0.0          0
Total 1168.5     241      7     3285        7   70091.67  14783.9    370.6         81
a. Compute the regression equation $\hat{Y} = b_0 + b_1 x_1$ to predict salaries on the basis of education only.
(2)
b. Compute $R^2$. (2)
c. Compute $s_e$. (2)
d. Compute $s_{b_1}$ and do a significance test on $b_1$. (1.5)
e. Compute $s_{b_0}$ and do a confidence interval for $b_0$. (1.5)
f. You are about to hire your nephew for the summer and want to know how much to pay him. He
has 14 years of education. Using this, create a prediction interval for his salary. Explain why a
confidence interval for the salary is inappropriate. (3)
g. Do an ANOVA table for the regression. What conclusion can you draw from the hypothesis test
in the ANOVA? (2)
[22]
Extra credit from here on.
h. Do a multiple regression of income against education and sex. (5)
i. Compute R-squared and R-squared adjusted for degrees of freedom for this regression and
compare them with the values for the previous problem. (2)
j. Using either R-squared values or SST, SSR and SSE, do F tests (ANOVA). First check the usefulness
of the simple regression and then the value of ‘sex’ as an improvement to the regression. How
should this impact Hotshot Associates’ discrimination problem? (Don’t say a word without
referring to a statistical test.) (3)
k. Predict what you will pay your nephew now. How much change is there from your last
prediction? (2)
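Parts a) through d) all follow from the column totals once the 'spare parts' are computed. Here is a minimal Python sketch of that arithmetic. The helper name `simple_regression` is hypothetical, and the numbers passed in are the totals for the 20 printed observations only, so they must be updated after the 21st observation is added.

```python
import math

def simple_regression(n, sum_x, sum_y, sum_xx, sum_xy, sum_yy):
    """Simple regression of y on x computed from raw sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    ss_x = sum_xx - n * x_bar ** 2        # SSx
    s_xy = sum_xy - n * x_bar * y_bar     # Sxy
    ss_y = sum_yy - n * y_bar ** 2        # SST
    b1 = s_xy / ss_x
    b0 = y_bar - b1 * x_bar
    ssr = b1 * s_xy                       # regression sum of squares
    r_sq = ssr / ss_y
    s_e = math.sqrt((ss_y - ssr) / (n - 2))
    s_b1 = s_e / math.sqrt(ss_x)
    return {"b0": b0, "b1": b1, "r_sq": r_sq, "s_e": s_e,
            "s_b1": s_b1, "t_b1": b1 / s_b1}

# Totals of the 20 printed rows (y = INC, x = EDUC); NOT the final 21-row totals.
fit = simple_regression(n=20, sum_x=241, sum_y=1168.5,
                        sum_xx=3285, sum_xy=14783.9, sum_yy=70091.67)
print({k: round(v, 4) for k, v in fit.items()})
```

To use it for the exam, add the 21st observation's contribution to each total and call the function with n=21.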
4) An airport authority wants to compare training of air traffic controllers at three locations. Data is on the
next page. To personalize these data, add the last two digits of your student number as a 9th number to
column C.
a. Compare the performance of locations A, B, and C assuming that the underlying distribution is non-Normal. (4)
[26]
b. Use a one-way ANOVA to test the hypothesis of equal means. (5) It is legitimate to check your results by
computer, but I expect to see hand computations every step of the way.
[31]
c. (Extra Credit) Decide between the methods that you used in a) and b). To do this, test for equal variances
and for Normality on the computer. What is your decision? Why? (4)
You can do most of this with the following commands in Minitab if you put your data in 3 columns of
Minitab with A, B, and C above them.
MTB > AOVOneway A B C            #Does a 1-way ANOVA
MTB > stack A B C c11;           #Stacks the data in c11, col. no. in c12.
SUBC> subscripts c12;
SUBC> UseNames.
MTB > rank c11 c13               #Puts the ranks of the stacked data in c13
MTB > vartest c11 c12            #Does a bunch of tests, including Levene's,
                                 # on stacked data in c11 with IDs in c12.
MTB > Unstack (c13);             #Unstacks the ranks in the next 5 available
SUBC> Subscripts c12;            # columns. Uses IDs in c12.
SUBC> After;
SUBC> VarNames.
MTB > NormTest 'A';              #Does a test (apparently Lilliefors) for Normality
SUBC> KSTest.                    # on column A.
Data for Problem 4
Row    A    B    C
  1   96   65   60
  2   82   74   73
  3   88   72   85
  4   70   66   61
  5   90   79   79
  6   91   82   85
  7   87   73   88
  8   88        79
This might help.
MTB > sum c1
Sum of A
Sum of A = 692
MTB > ssq c1
Sum of Squares of A
Sum of squares (uncorrected) of A = 60278
MTB > sum c2
Sum of B
Sum of B = 511
MTB > ssq c2
Sum of Squares of B
Sum of squares (uncorrected) of B = 37535
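The Minitab sums can be reproduced, and the one-way ANOVA arithmetic checked, with a few lines of Python. This is only a sketch for checking hand computations. The function name `one_way_anova` is illustrative, and column C holds just the eight printed values, so the F value it gives is not the answer until your own 9th number has been appended.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

A = [96, 82, 88, 70, 90, 91, 87, 88]
B = [65, 74, 72, 66, 79, 82, 73]
C = [60, 73, 85, 61, 79, 85, 88, 79]   # printed values only; append your 9th number

# Reproduce the 'sum' and 'ssq' output for columns A and B.
print(sum(A), sum(v * v for v in A))
print(sum(B), sum(v * v for v in B))

ss_between, ss_within, df_between, df_within, f_stat = one_way_anova([A, B, C])
print(df_between, df_within, round(f_stat, 3))
```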
Name:_______________________
Exhibit 1.
An entrepreneur believes that her business is growing steadily and wants to compute a trend line for her
output Y against time $x_1 = T$. She also decides to repeat the regression after adding $x_2 = T^2$ as a second
independent variable. Her data and results follow. The t statistics have been relabeled ‘t-ratio’ to prevent
confusion with $T$.
Row      Y     T   TSQ ($T^2$)
  1  53.43     1     1
  2  59.09     2     4
  3  59.58     3     9
  4  64.75     4    16
  5  68.65     5    25
  6  65.53     6    36
  7  68.44     7    49
  8  70.93     8    64
  9  72.85     9    81
 10  73.60    10   100
 11  72.93    11   121
 12  75.14    12   144
 13  73.88    13   169
 14  76.55    14   196
 15  79.05    15   225

Regression Analysis: Y versus T
The regression equation is
Y = 56.7 + 1.54 T
Predictor     Coef  SE Coef  t-ratio      P
Constant    56.659    1.283    44.15  0.000
T           1.5377   0.1411    10.89  0.000
S = 2.36169   R-Sq = 90.1%   R-Sq(adj) = 89.4%
Regression Analysis: Y versus T, TSQ
The regression equation is
Y = 52.4 + 3.04 T - 0.0939 TSQ
Predictor      Coef  SE Coef  t-ratio      P
Constant     52.401    1.545    33.91  0.000
T            3.0405   0.4444     6.84  0.000
TSQ        -0.09392  0.02701    -3.48  0.005
S = 1.73483   R-Sq = 95.1%
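Both sets of coefficients in Exhibit 1 can be recovered from the means and spare parts supplied with the exhibit. The Python sketch below assumes the standard normal equations for a two-variable regression and solves them by Cramer's rule. The variable names are illustrative, and small differences from the Minitab output reflect rounding in the supplied sums.

```python
# Means and spare parts as supplied with Exhibit 1.
n = 15
y_bar, x1_bar, x2_bar = 68.9600, 8.00, 82.6667
ss_x1, ss_x2, ss_y = 280.0, 75805.3, 734.556       # SSX1, SSX2, SST
s_x1y, s_x2y, s_x1x2 = 430.550, 6501.33, 4480.00   # SX1Y, SX2Y, SX1X2

# Simple regression of Y on T.
b1_simple = s_x1y / ss_x1
b0_simple = y_bar - b1_simple * x1_bar
r_sq_simple = b1_simple * s_x1y / ss_y

# Regression of Y on T and TSQ. Normal equations:
#   SSX1 * b1  + SX1X2 * b2 = SX1Y
#   SX1X2 * b1 + SSX2 * b2  = SX2Y
det = ss_x1 * ss_x2 - s_x1x2 ** 2
b1 = (s_x1y * ss_x2 - s_x1x2 * s_x2y) / det
b2 = (ss_x1 * s_x2y - s_x1x2 * s_x1y) / det
b0 = y_bar - b1 * x1_bar - b2 * x2_bar
r_sq = (b1 * s_x1y + b2 * s_x2y) / ss_y

print(round(b0_simple, 3), round(b1_simple, 4), round(r_sq_simple, 3))
print(round(b0, 3), round(b1, 4), round(b2, 5), round(r_sq, 3))
```

The same sums of squares feed the F tests: SSR is the numerator of each R-squared and SSE is SST minus SSR.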
If you need them, her means and spare parts are below.
$\bar{Y} = 68.9600$, $\bar{X}_1 = 8.00$, $\bar{X}_2 = 82.6667$
$\sum X_1^2 - n\bar{X}_1^2 = SSX_1 = 280$
$\sum X_2^2 - n\bar{X}_2^2 = SSX_2 = 75805.3$
$\sum Y^2 - n\bar{Y}^2 = SST = SSY = 734.556$
$\sum X_1 Y - n\bar{X}_1\bar{Y} = SX_1Y = 430.550$
$\sum X_2 Y - n\bar{X}_2\bar{Y} = SX_2Y = 6501.33$
$\sum X_1 X_2 - n\bar{X}_1\bar{X}_2 = SX_1X_2 = 4480.00$
Exhibit 2: Size of Bubbles Blocked by Blower
                Brand 1  Brand 2  Brand 3  Brand 4
Row  Student      $x_1$    $x_2$    $x_3$    $x_4$
  1  Loopy         8.75      9.5      8.5     11.5
  2  Percival      9.50      4.0      8.5     11.0
  3  Poopsy        9.25      5.5      7.5      7.5
  4  Dizzy         9.50      8.5      7.5      7.5
  5  Booger        9.25      4.5      8.0      8.0
Column sums and sums of squares are as follows.
$\sum x_1 = 46.25$, $\sum x_1^2 = 428.188$, $\sum x_2 = 32$, $\sum x_2^2 = 229$,
$\sum x_3 = 40$, $\sum x_3^2 = 321$, $\sum x_4 = 45.5$, $\sum x_4^2 = 429.75$
Exhibit 3: Size of Bubbles Blocked by Blower but ranked as four independent random samples.
Row  Student    $x_1$  $r_1$   $x_2$  $r_2$   $x_3$  $r_3$   $x_4$  $r_4$
  1  Loopy       8.75   13.0     9.5     17     8.5   11.0    11.5   20.0
  2  Percival    9.50   17.0     4.0      1     8.5   11.0    11.0   19.0
  3  Poopsy      9.25   14.5     5.5      3     7.5    5.5     7.5    5.5
  4  Dizzy       9.50   17.0     8.5     11     7.5    5.5     7.5    5.5
  5  Booger      9.25   14.5     4.5      2     8.0    8.5     8.0    8.5
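The ranks in Exhibit 3 come from pooling all 20 measurements and giving tied values their average rank. The short Python sketch below reproduces that ranking and, as one way the ranks might be used, computes the Kruskal-Wallis statistic $H = \frac{12}{n(n+1)}\sum \frac{R_j^2}{n_j} - 3(n+1)$. It applies no correction for ties, which is an assumption, and the helper name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

brands = [
    [8.75, 9.50, 9.25, 9.50, 9.25],    # x1
    [9.5, 4.0, 5.5, 8.5, 4.5],         # x2
    [8.5, 8.5, 7.5, 7.5, 8.0],         # x3
    [11.5, 11.0, 7.5, 7.5, 8.0],       # x4
]

pooled = [v for col in brands for v in col]
ranks = average_ranks(pooled)
n = len(pooled)

# Sum the ranks that belong to each brand.
rank_sums, start = [], 0
for col in brands:
    rank_sums.append(sum(ranks[start:start + len(col)]))
    start += len(col)

h = 12 / (n * (n + 1)) * sum(r ** 2 / len(c) for r, c in zip(rank_sums, brands)) - 3 * (n + 1)
print(rank_sums, round(h, 4))
```

H is compared with a chi-squared value with 3 degrees of freedom, or with a Kruskal-Wallis table when the samples are small enough to be listed there.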
Row sums and sums of squares are as follows.
Row   $\sum x$   $\sum x^2$
  1     38.25      371.313
  2     33.00      299.500
  3     29.75      228.313
  4     33.00      275.000
  5     29.75      233.813