Class 28 Extra Exam 3 practice problems with

EXAM3 PRACTICE PROBLEMS
The data from Exam2 reported Car Class (Compact, Midsize, and Large), Displacement (liters), Fuel Type
(Premium or Regular), and MPG for 60 US car models. The first two and last three observations and
summary stats appear below.
Car       Class     Displacement   Fuel Type   Hwy MPG
1         Midsize   3.5            R           28
2         Midsize   3              R           26
.         .         .              .           .
.         .         .              .           .
58        Compact   6              P           20
59        Midsize   2.5            R           30
60        Midsize   2              R           32
Average             3.287                      25.683
Std Dev             1.059                      3.721
Min                 2                          15
Max                 6.2                        33
Count               60                         60
Answer these first five questions without using the data.
1. Al regressed MPG on displacement and got an estimated coefficient (𝑏̂) of -2.97. Al forgot to write
down his estimated intercept (𝑎̂). Please calculate it for him.
We know that Al’s regression will go through the sample averages. This means that
25.683 = 𝑎̂ – 2.97 (3.287). So 𝑎̂ = 25.683 + 2.97*3.287 = 35.4.
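As a quick check of this arithmetic, here is a minimal Python sketch. The variable names are illustrative and do not come from the handout; the numbers are the sample averages and Al's slope quoted above.

```python
# Summary statistics from the data table and Al's reported slope.
mean_mpg = 25.683           # average Hwy MPG
mean_displacement = 3.287   # average displacement (liters)
slope = -2.97               # Al's estimated coefficient on displacement

# A least-squares line passes through the point of sample averages,
# so the intercept is mean(y) - slope * mean(x).
intercept = mean_mpg - slope * mean_displacement
print(round(intercept, 1))  # 35.4
```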
2. The engine in car B is a full liter larger than the engine in car A. Using Al’s regression, how much lower
do we expect the mileage (MPG) of B to be?
Could the question get any easier? We expect car B’s MPG to be 2.97 MPG lower than A’s.
3. Is Al’s t-stat (the one associated with the coefficient of his X-variable) positive or negative?
The sign of the t-stat of a regression coefficient always matches the sign of the coefficient. Since Al’s
coefficient is negative, his t-stat will be negative.
4. Car B has a 4.2 liter engine. Will car B get more than 30 MPG? (You phone Al and discover that the
standard error of his regression model is 2.00.)
The point forecast of B’s MPG is 35.4 – 2.97 (4.2) = 22.93. The probability its MPG will exceed 30 is
t.dist.rt[(30-22.93)/2.00,60-2] = 0.0004, so almost certainly not. This calculation requires the four
regression assumptions (MPG is linearly related to displacement, homoskedasticity, normality, and
independence).
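The same tail probability can be reproduced outside a spreadsheet. The sketch below is one way to do it in plain Python, under the same four regression assumptions. `t_right_tail` is a hypothetical helper written for this illustration (it integrates the Student's t density numerically with Simpson's rule); it stands in for Excel's T.DIST.RT and is not a library function.

```python
import math

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom."""
    if t < 0:
        # Symmetry: P(T > t) = 1 - P(T > -t).
        return 1.0 - t_right_tail(-t, df, steps)
    # Log of the normalizing constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    # Simpson's rule on [0, t]; steps must be even.
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

intercept, slope = 35.4, -2.97   # Al's fitted line
std_error = 2.00                 # standard error of the regression
n_obs = 60

forecast = intercept + slope * 4.2        # point forecast for a 4.2 liter engine
t_value = (30 - forecast) / std_error     # how many standard errors 30 MPG is above it
prob_above_30 = t_right_tail(t_value, n_obs - 2)
print(round(forecast, 2), prob_above_30)
```

The degrees of freedom are 60 - 2 because the simple regression estimates two parameters, the intercept and the slope.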
5. Bo created dummy variable P defined to be 1 if the car used premium gasoline and 0 if it used regular.
Bo then regressed MPG on DISPLACEMENT and P and got
               Coefficients
Intercept      35.518
Displacement   -2.761
P              -1.270
Do the cars in the sample that use premium gasoline have a higher or lower average displacement?
The coefficient of Displacement went from -2.97 to -2.76 when P was added to the model. With P left
out, Displacement picks up part of the negative effect of Premium (coefficient -1.27), which is what
moves its coefficient from -2.76 down to -2.97. That can only happen if higher Displacement cars
tended to use Premium (that makes sense). So the Premium cars in the sample had higher average
displacement.
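The direction of this shift can be made concrete with the standard omitted-variable identity for least squares: the Displacement coefficient without P equals the Displacement coefficient with P, plus the P coefficient times the slope from regressing P on Displacement. The sketch below backs out that implied slope from the three reported numbers; the variable names are illustrative, and because the inputs are rounded the result is approximate.

```python
b_short = -2.97      # Displacement coefficient in Al's model (P omitted)
b_long = -2.761      # Displacement coefficient in Bo's model (P included)
c_premium = -1.270   # coefficient of P in Bo's model

# Omitted-variable identity: b_short = b_long + c_premium * delta,
# where delta is the slope from regressing P on Displacement.
delta = (b_short - b_long) / c_premium
print(round(delta, 3))  # 0.165
```

A positive implied slope means P tends to rise with Displacement, which is the same as saying the Premium cars have the higher average displacement.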
You may use the data to help answer the remaining questions. You can find the data linked to the
Class 19 assignment.
6. A new midsize car with a 3.1-liter engine uses Premium gasoline. Will its MPG be less than 27?
To answer this question, we need a model. We know three things about the car: its class, that it uses
Premium, and that it has a 3.1-liter engine. First we create dummy variables DM and DL for the
Midsize and Large car classes (Compact is the base class). Then we fit the four-variable model.
               Coefficients   Standard Error   t Stat   P-value
Intercept      29.001         1.263            22.953   0.000
DM             4.262          0.701            6.077    0.000
DL             2.055          0.584            3.518    0.001
Displacement   -1.613         0.275            -5.859   0.000
P              -0.569         0.445            -1.279   0.206
Notice that Displacement is a significant predictor of MPG but gasoline type is not. Also notice that
Midsize cars get the highest MPGs (for their engine size and fuel type), and Large cars the next
highest. Compact cars get the lowest. It is pretty clear that many of the compact cars are low-mileage
sports cars. The Midsize cars are the higher mileage economy cars.
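Each t Stat in the four-variable output is the coefficient divided by its standard error. The sketch below recomputes them from the rounded values in that output, so the results agree with the reported t Stats only to about two decimals. The cutoff of 2 is a rule of thumb assumed here for illustration, not something stated in the handout.

```python
# (coefficient, standard error) pairs from the four-variable model.
model = {
    "Intercept":    (29.001, 1.263),
    "DM":           (4.262, 0.701),
    "DL":           (2.055, 0.584),
    "Displacement": (-1.613, 0.275),
    "P":            (-0.569, 0.445),
}

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in model.items()}

for name, t in t_stats.items():
    # Rule-of-thumb flag (an assumption of this sketch): |t| above about 2.
    flag = "significant" if abs(t) > 2 else "not significant"
    print(f"{name:13s} t = {t:7.3f}  {flag}")
```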
Since P is not significant, we drop it from the model and fit the simpler 3-variable model.
Regression Statistics
Multiple R           0.917
R Square             0.840
Adjusted R Square    0.832
Standard Error       1.527
Observations         60

ANOVA
              df    SS        MS        F        Significance F
Regression    3     686.365   228.788   98.088   2.8283E-22
Residual      56    130.618   2.332
Total         59    816.983

               Coefficients   Standard Error   t Stat   P-value
Intercept      28.641         1.239            23.123   0.000
DM             4.487          0.683            6.570    0.000
DL             2.115          0.585            3.614    0.001
Displacement   -1.640         0.276            -5.944   0.000
Note that the overall model is significant (Significance F of 2.8E-22), displacement is significant
(P-value of 0.000 to three decimals), and car class is significant (both the DM and DL coefficients
are significant).
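The summary statistics of the three-variable model all follow from the sums of squares and degrees of freedom in its ANOVA output. The following sketch recomputes them as a consistency check; the variable names are illustrative, and the inputs are the SS and df values reported for Regression and Residual.

```python
import math

ss_regression, ss_residual = 686.365, 130.618
df_regression, df_residual = 3, 56

ss_total = ss_regression + ss_residual   # 816.983
df_total = df_regression + df_residual   # 59

ms_regression = ss_regression / df_regression   # mean square, regression
ms_residual = ss_residual / df_residual         # mean square, residual

f_stat = ms_regression / ms_residual
r_square = ss_regression / ss_total
adj_r_square = 1 - ms_residual / (ss_total / df_total)
std_error = math.sqrt(ms_residual)   # standard error of the regression

print(round(f_stat, 2), round(r_square, 3),
      round(adj_r_square, 3), round(std_error, 3))
```

The standard error computed here, about 1.527, is the value used in the forecast probability for question 6.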
Our point forecast of MPG for the new car is
                 Coefficients   New Car
Intercept        28.641         1
DM               4.487          1
DL               2.115          0
Displacement     -1.640         3.1
Point Forecast                  28.043
Assuming the four regression assumptions hold, the probability the new car will get more than 27
MPG is t.dist.rt[(27-28.043)/1.527,56] = 0.75. The probability its MPG will be less than 27 is
therefore about 0.25, so it probably will not be.
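The forecast and the tail probability for the new car can be sketched in plain Python as follows. `t_right_tail` is a hypothetical helper written for this illustration (Simpson's rule applied to the Student's t density), standing in for Excel's T.DIST.RT. Because the coefficients are rounded to three decimals, the forecast comes out at 28.044 rather than the 28.043 obtained from the unrounded output.

```python
import math

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom."""
    if t < 0:
        return 1.0 - t_right_tail(-t, df, steps)   # symmetry of the t density
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                                  # Simpson's rule, steps is even
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

# Three-variable model coefficients and the new car's values.
coefficients = {"Intercept": 28.641, "DM": 4.487, "DL": 2.115, "Displacement": -1.640}
new_car = {"Intercept": 1, "DM": 1, "DL": 0, "Displacement": 3.1}

forecast = sum(coefficients[k] * new_car[k] for k in coefficients)
std_error, df_residual = 1.527, 56

prob_above_27 = t_right_tail((27 - forecast) / std_error, df_residual)
prob_below_27 = 1 - prob_above_27
print(round(forecast, 3), round(prob_above_27, 2), round(prob_below_27, 2))
```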
7. Which car’s mileage is most unusually high (low)?
When I ran the above regression, I checked the box for RESIDUALS (not shown here). Car 10 has the
highest positive residual of 3.97, meaning its MPG is 3.97 higher than expected for its class and engine
size. (10 is a Midsize 2.5 liter getting 33 MPG.) Car 49 has the most negative residual. Its MPG is 4.29
below its expectation. (49 is a Compact 5.7 liter getting only 15 MPG.) Interestingly, cars 10 and 49
also have the overall highest and lowest MPGs. These cars’ MPGs are both the most extreme AND the
most extreme for their class/displacement.
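The two residuals quoted here can be recomputed from the three-variable model, since a residual is the actual MPG minus the fitted MPG. The sketch below does this for cars 10 and 49 only; `predicted_mpg` is a hypothetical helper for this illustration, and the coefficients are the rounded values from the regression output.

```python
# Coefficients of the three-variable model (Compact is the base class).
INTERCEPT = 28.641
CLASS_EFFECT = {"Compact": 0.0, "Midsize": 4.487, "Large": 2.115}
DISPLACEMENT_COEF = -1.640

def predicted_mpg(car_class, displacement):
    """Fitted MPG for a car of the given class and engine size (liters)."""
    return INTERCEPT + CLASS_EFFECT[car_class] + DISPLACEMENT_COEF * displacement

# Residual = actual MPG - fitted MPG.
residual_car_10 = 33 - predicted_mpg("Midsize", 2.5)
residual_car_49 = 15 - predicted_mpg("Compact", 5.7)
print(round(residual_car_10, 2), round(residual_car_49, 2))  # 3.97 -4.29
```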