EXAM3 PRACTICE PROBLEMS

The data from Exam2 reported Car Class (Compact, Midsize, and Large), Displacement (liters), Fuel Type (Premium or Regular), and MPG for 60 US car models. The first two and last three observations and summary stats appear below.

Car       Class     Displacement   Fuel Type   Hwy MPG
1         Midsize   3.5            R           28
2         Midsize   3              R           26
.         .         .              .           .
.         .         .              .           .
58        Compact   6              P           20
59        Midsize   2.5            R           30
60        Midsize   2              R           32
Average             3.287                      25.683
Std Dev             1.059                      3.721
Min                 2                          15
Max                 6.2                        33
Count               60                         60

Answer these first five questions without using the data.

1. Al regressed MPG on displacement and got an estimated coefficient (b̂) of -2.97. Al forgot to write down his estimated intercept (â). Please calculate it for him.

We know that Al’s regression will go through the sample averages. This means that 25.683 = â - 2.97(3.287). So â = 25.683 + 2.97*3.287 = 35.4.

2. The engine in car B is a full liter larger than the engine in car A. Using Al’s regression, how much lower do we expect the mileage (MPG) of B to be?

Could the question be any easier? We expect car B’s MPG to be 2.97 MPG lower than A’s.

3. Is Al’s t-stat (the one associated with the coefficient of his X-variable) positive or negative?

The sign of the t-stat of a regression coefficient always matches the sign of the coefficient. Since Al’s coefficient is negative, his t-stat will be negative.

4. Car B has a 4.2 liter engine. Will car B get more than 30 MPG? (You phone Al and discover that the standard error of his regression model is 2.00.)

The point forecast of B’s MPG is 35.4 - 2.97(4.2) = 22.93. The probability its MPG will exceed 30 is T.DIST.RT((30-22.93)/2.00, 60-2) = 0.0004, so almost certainly not. This calculation requires the four regression assumptions (MPG is linearly related to displacement, homoskedasticity, normality, and independence).

5. Bo created dummy variable P defined to be 1 if the car used premium gasoline and 0 if it used regular.
Bo then regressed MPG on DISPLACEMENT and P and got

               Intercept   Displacement   P
Coefficients   35.518      -2.761         -1.270

Do the cars in the sample that use premium gasoline have a higher or lower average displacement?

The coefficient of Displacement went from -2.97 to -2.76 when P was added to the model. With P left out, Displacement was also picking up part of the negative effect of Premium (P’s coefficient is -1.270), which moved its coefficient from -2.76 down to -2.97. That happens only if higher Displacement cars tended to use Premium (that makes sense). So the Premium cars in the sample had higher average displacement.

You may use the data to help answer the remaining questions. You can find the data linked to Class 19 assignment.

6. A new midsize car with a 3.1-liter engine uses Premium gasoline. Will its MPG be less than 27?

To answer this question, we need a model. We know three things about the car: its class, it uses P, and it has a 3.1-liter engine. First we create dummy variables DM (midsize) and DL (large) for the car classes, leaving Compact as the baseline. We fit the four-variable model.

                 Intercept   DM       DL       Displacement   P
Coefficients     29.001      4.262    2.055    -1.613         -0.569
Standard Error   1.263       0.701    0.584    0.275          0.445
t Stat           22.953      6.077    3.518    -5.859         -1.279
P-value          0.000       0.000    0.001    0.000          0.206

Notice that Displacement is a significant predictor of MPG but gasoline type is not. Also notice that Midsize cars get the highest MPGs (for their engine size and fuel type), and Large cars the next highest. Compact cars get the lowest. It is pretty clear that many of the compact cars are low-mileage sports cars. The Midsize cars are the higher mileage economy cars. Since P is not significant, we drop it from the model and fit the simpler 3-variable model.
Regression Statistics
Multiple R          0.917
R Square            0.840
Adjusted R Square   0.832
Standard Error      1.527
Observations        60

ANOVA
             df   SS        MS        F        Significance F
Regression   3    686.365   228.788   98.088   2.8283E-22
Residual     56   130.618   2.332
Total        59   816.983

               Coefficients   Standard Error   t Stat   P-value
Intercept      28.641         1.239            23.123   0.000
DM             4.487          0.683            6.570    0.000
DL             2.115          0.585            3.614    0.001
Displacement   -1.640         0.276            -5.944   0.000

Note that the overall model is significant (P-value of 2.8E-22), displacement is significant (P-value of 0.000 to three decimals), and car class is significant (both the DM and DL coefficients are significant). Our point forecast of MPG for the new car is

               Intercept   DM      DL      Displacement   Point Forecast
Coefficients   28.641      4.487   2.115   -1.640
New Car        1           1       0       3.1            28.043

Assuming the four regression assumptions hold, the probability the new car will get more than 27 MPG is T.DIST.RT((27-28.043)/1.527, 56) = 0.75. So the probability its MPG will be less than 27 is about 0.25.

7. Which car’s mileage is most unusually high (low)?

When I ran the above regression, I checked the box for RESIDUALS (not shown here). Car 10 has the highest positive residual of 3.97, meaning its MPG is 3.97 higher than expected for its class and engine size. (10 is a Midsize, 2.5 liter getting 33 MPG.) Car 49 has the most negative residual. Its MPG is 4.29 below its expectation. (49 is a Compact, 5.7 liter getting only 15 MPG.) Interestingly, cars 10 and 49 also have the overall highest and lowest MPGs, respectively. These cars’ MPGs are both the most extreme AND the most extreme for their class/displacement.
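
The forecast in question 6 and the two residuals in question 7 can be re-derived from the reported three-variable coefficients alone, without the data. The sketch below is one way to check them in Python. The names predict_mpg and t_right_tail are illustrative and not part of the original solution, and t_right_tail is a standard-library stand-in for Excel's T.DIST.RT that integrates the t density numerically (an assumption made to avoid extra packages).

```python
import math

# Coefficients and standard error as reported in the three-variable output.
COEF = {"intercept": 28.641, "DM": 4.487, "DL": 2.115, "displacement": -1.640}
STD_ERROR = 1.527
RESIDUAL_DF = 56


def predict_mpg(car_class, displacement):
    """Fitted MPG; Compact is the baseline class (DM = DL = 0)."""
    dm = 1 if car_class == "Midsize" else 0
    dl = 1 if car_class == "Large" else 0
    return (COEF["intercept"] + COEF["DM"] * dm + COEF["DL"] * dl
            + COEF["displacement"] * displacement)


def t_right_tail(x, df, steps=2000):
    """P(T > x) for Student's t, via Simpson's rule on the density over [0, |x|]."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # P(0 < T < |x|)
    return 0.5 - area if x >= 0 else 0.5 + area


# Question 6: new Midsize car with a 3.1-liter engine.
forecast = predict_mpg("Midsize", 3.1)
p_above_27 = t_right_tail((27 - forecast) / STD_ERROR, RESIDUAL_DF)

# Question 7: residual = actual MPG minus fitted MPG.
res_10 = 33 - predict_mpg("Midsize", 2.5)  # car 10
res_49 = 15 - predict_mpg("Compact", 5.7)  # car 49

print(round(forecast, 2), round(p_above_27, 2))
print(round(res_10, 2), round(res_49, 2))
```

Because the coefficients are rounded to three decimals, the results should agree with the reported 28.043, 0.75, 3.97, and -4.29 only up to small rounding differences.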