252solnK1 12/02/03

Still problem 14.18 [14.9]

[Figure: Normal Probability Plot of the Residuals (response is DistCost). Vertical axis: Normal Score; horizontal axis: Residual.]

Comment: The text says that a straight-line Normal Probability plot indicates a Normal distribution.

Comment: There doesn’t seem to be much of a pattern here.

Comment: A pattern here would indicate autocorrelation.

Comment: These two graphs show no pattern either.

MTB > Stepwise c1 c2 c3;
SUBC> AEnter 0.15;
SUBC> ARemove 0.15;
SUBC> Constant.

Stepwise Regression: DistCost versus Sales, Orders

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.15

Response is DistCost on 2 predictors, with N = 24

    Step             1        2
    Constant    0.4576  -2.7282

    Orders      0.0161   0.0119
    T-Value      10.92     5.31
    P-Value      0.000    0.000

    Sales                 0.047
    T-Value                2.32
    P-Value               0.031

    S             5.22     4.77
    R-Sq         84.42    87.59
    R-Sq(adj)    83.71    86.41
    C-p            6.4      3.0

More? (Yes, No, Subcommand, or Help)
SUBC> yes
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> no
MTB >

Comment: This was a stepwise regression. It was done with no options, so all the subcommands that you see here were generated by Minitab. The independent variable with the most explanatory power was ‘Orders,’ and the first regression was Y = 0.4576 + 0.0161 Orders, with an R-squared of 84.42%. Minitab then added the other independent variable and got a regression of Y = -2.7282 + 0.0119 Orders + 0.047 Sales, which is the same regression we got with the ‘Regress’ command. The C-p statistic, also explained in the text, should be near k + 1. With k = 2 independent variables that target is 3, and C-p is 3.0 for this regression. After adding the two independent variables, Minitab paused and asked me if I wanted to try for more independent variables. I foolishly said ‘yes,’ whereupon Minitab discovered that it didn’t have any more variables to add.
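The C-p values in the stepwise output can be checked by hand from the printed R-squared values. The Python sketch below is not part of the Minitab session. It uses one common way of writing C-p in terms of R-squared, and it assumes that the "full" model is the two-predictor model from Step 2. The function name `mallows_cp` and its arguments are made up for this illustration.

```python
def mallows_cp(r2_subset, r2_full, n, k_subset, k_full):
    """C-p computed from R-squared values (as proportions, not percents).

    k_subset and k_full count independent variables, excluding the constant.
    Assumption: the full model is the one with all candidate predictors.
    """
    t_params = k_full + 1  # parameters in the full model, constant included
    return ((1 - r2_subset) * (n - t_params) / (1 - r2_full)
            - (n - 2 * (k_subset + 1)))


n = 24             # N from the stepwise output
r2_step1 = 0.8442  # R-Sq at Step 1 (Orders only)
r2_step2 = 0.8759  # R-Sq at Step 2 (Orders and Sales)

cp_step1 = mallows_cp(r2_step1, r2_step2, n, k_subset=1, k_full=2)
cp_step2 = mallows_cp(r2_step2, r2_step2, n, k_subset=2, k_full=2)

print(round(cp_step1, 1), round(cp_step2, 1))  # 6.4 3.0
```

For the full model the first term reduces to n minus the number of parameters, so C-p comes out at exactly k + 1 = 3. For that last step, the "near k + 1" rule is satisfied automatically.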
Dummy Variables

Exercise 14.38 [14.33 in 9th] (15.6 in 8th edition): The equation is $\hat{Y} = 6 + 4X_1 + 2X_2$.
(a) Holding constant the effect of X2, the estimated average value of the dependent variable will increase by 4 units for each increase of one unit of X1.
(b) Holding constant the effects of X1, the presence of the condition represented by X2 = 1 is estimated to increase the average value of the dependent variable by 2 units.
(c) $t = 3.27$ with $n - k - 1 = 20 - 2 - 1 = 17$ degrees of freedom. This is larger than $t_{.025}^{(17)} = 2.11$, the critical value for a two-sided test at the .05 level. You can reject H0 and say that the presence of X2 makes a significant contribution to the model.

Exercise 14.39 [14.34 in 9th] (15.7 in 8th edition):
(a) First develop a multiple regression model using X1 as the variable for the SAT score and X2 as a dummy variable with X2 = 1 if a student had a grade of B or better in the introductory statistics course. If the dummy variable coefficient is significantly different from zero, you need to develop a model with the interaction term X1X2 to make sure that the coefficient of X1 does not differ significantly between X2 = 0 and X2 = 1.
(b) If a student received a grade of B or better in the introductory statistics course, the student would be expected to have a grade point average in accountancy that is 0.30 higher than a student who had the same SAT score but did not get a grade of B or better in the introductory statistics course.

Exercise 14.41 [14.35 in 9th] (15.8 in 8th edition): To run this regression I used the Statistics pull-down menu and then picked Regression twice. I had put headings on my columns (the data is in the text and on your CD), but, since I’m lazy, I identified the columns as C1, C2 and C3. So C2 was my response (dependent, Y) variable and C1 and C3 were my predictor (independent, X) variables. There are just too many subcommands here to use the session window to drive Minitab. On the Regression menu I went into Graphs and checked all the residual plots except residuals vs. order.
Under Options I picked Variance Inflation Factors and set up for confidence and prediction intervals by telling it that the independent variables for this prediction were in C5 and C6. Under Results I took the last and most complete option, though this can also be done by using the session command ‘Brief 3’ before you start. Under Storage I picked nothing. When this regression was finished and I had copied all the graphs into a Word document, I ran the regression again with the Interaction variable requested in part n) of the problem. To confirm my results, I ran Stepwise from the Regression menu using C1, C3 and C4 as candidates to explain C2. The output from the run follows with comments.

----- 12/2/2003 9:15:15 PM -----

Welcome to Minitab, press F1 for help.
MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\petfood.MTW".
Retrieving worksheet from file: C:\Berenson\Data_Files-9th\Minitab\petfood.MTW
# Worksheet was saved on Mon Apr 27 1998

Comment: I downloaded the data from the text CD, but stored it where I could get it more easily if I needed it again.

Results for: petfood.MTW

MTB > Save "C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\petfood3";
SUBC> Replace.
Saving file as: C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\petfood3.MTW
* NOTE * Existing file replaced.

Results for: petfood3.MTW

MTB > regress c2 1 c1

Regression Analysis: Sales versus Space

The regression equation is
Sales = 1.45 + 0.0740 Space

    Predictor      Coef   SE Coef      T      P
    Constant     1.4500    0.2178   6.66  0.000
    Space       0.07400   0.01591   4.65  0.001

    S = 0.3081   R-Sq = 68.4%   R-Sq(adj) = 65.2%

Analysis of Variance

    Source           DF      SS      MS      F      P
    Regression        1  2.0535  2.0535  21.64  0.001
    Residual Error   10  0.9490  0.0949
    Total            11  3.0025

Comment: This is the regression referred to in Problems 13.3 and 13.14.

MTB > let c4 = c1 * c3

Comment: This command creates the interaction variable, which I have labeled ‘Inter’ in C4.
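The simple regression and the `let c4 = c1 * c3` step can be reproduced outside Minitab. The Python sketch below is only an illustration, not the method used for this handout. It types in the twelve worksheet rows as they are printed in the Data Display for this problem, and the variable names are made up for the example.

```python
# Worksheet columns as printed in the Data Display (C1, C2, C3).
space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
sales = [1.6, 2.2, 1.4, 1.9, 2.4, 2.6, 2.3, 2.7, 2.8, 2.6, 2.9, 3.1]
locatn = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

n = len(space)
x_bar = sum(space) / n
y_bar = sum(sales) / n

# Least-squares slope and intercept for Sales on Space.
sxx = sum((x - x_bar) ** 2 for x in space)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(space, sales))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# ANOVA pieces: SST = SSR + SSE, then R-squared and F with 1 and n - 2 df.
sst = sum((y - y_bar) ** 2 for y in sales)
ssr = slope * sxy
sse = sst - ssr
r_sq = ssr / sst
f_stat = ssr / (sse / (n - 2))

# Equivalent of "let c4 = c1 * c3": the interaction column.
inter = [x * d for x, d in zip(space, locatn)]

print(round(intercept, 2), round(slope, 4))    # 1.45 0.074
print(round(100 * r_sq, 1), round(f_stat, 2))  # 68.4 21.64
print(inter)
```

The interaction column is zero wherever Locatn is zero and equals Space otherwise, which is what lets the slope on Space differ between the two locations when ‘Inter’ is added to the model.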
MTB > print c1-c6

Data Display

    Row  Space  Sales  Locatn  Inter  C5  C6
      1      5    1.6       0      0   8   0
      2      5    2.2       1      5
      3      5    1.4       0      0
      4     10    1.9       0      0
      5     10    2.4       0      0
      6     10    2.6       1     10
      7     15    2.3       0      0
      8     15    2.7       0      0
      9     15    2.8       1     15
     10     20    2.6       0      0
     11     20    2.9       0      0
     12     20    3.1       1     20

Comment: This is the data I will use. Because part c) of this problem asks for confidence and prediction intervals, I have set up values of space and location for these intervals in C5 and C6.

MTB > Regress c2 2 c1 c3;
SUBC> GHistogram;
SUBC> GNormalplot;
SUBC> GFits;
SUBC> GVars c1 c3;
SUBC> RType 1;
SUBC> Constant;
SUBC> VIF;
SUBC> Predict c5 c6;
SUBC> Brief 3.

Regression Analysis: Sales versus Space, Locatn

The regression equation is
Sales = 1.30 + 0.0740 Space + 0.450 Locatn

    Predictor      Coef   SE Coef      T      P  VIF
    Constant     1.3000    0.1569   8.29  0.000
    Space       0.07400   0.01101   6.72  0.000  1.0
    Locatn       0.4500    0.1305   3.45  0.007  1.0

    S = 0.2132   R-Sq = 86.4%   R-Sq(adj) = 83.4%

Analysis of Variance

    Source           DF      SS      MS      F      P
    Regression        2  2.5935  1.2967  28.53  0.000
    Residual Error    9  0.4090  0.0454
    Total            11  3.0025

    Source  DF  Seq SS
    Space    1  2.0535
    Locatn   1  0.5400

Comment: These results look great. Note that all my coefficients are significant, with p-values below 1%. Both VIFs are 1.0, way below 5, which indicates a lack of collinearity. The ANOVA gives me a p-value of 0.000, indicating that the regression is quite useful.
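The coefficients and VIFs above can be checked with a short pure-Python sketch. It is an illustration under stated assumptions, not Minitab's algorithm: it solves the normal equations by elimination and takes each VIF as 1 / (1 - R-squared) from regressing one predictor on the other. The helper names `solve`, `ols` and `r_squared` are made up for this example.

```python
space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
sales = [1.6, 2.2, 1.4, 1.9, 2.4, 2.6, 2.3, 2.7, 2.8, 2.6, 2.9, 3.1]
locatn = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        # Partial pivoting: bring the largest entry in this column up.
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]


def ols(columns, y):
    """Least-squares coefficients, constant first, via (X'X) b = X'y."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(columns, y):
    """R-squared of the least-squares fit of y on the given columns."""
    b = ols(columns, y)
    fits = [b[0] + sum(c * v for c, v in zip(b[1:], vals))
            for vals in zip(*columns)]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


coefs = ols([space, locatn], sales)  # constant, Space, Locatn
r_sq = r_squared([space, locatn], sales)
vif_space = 1 / (1 - r_squared([locatn], space))
vif_locatn = 1 / (1 - r_squared([space], locatn))

print([round(c, 4) for c in coefs], round(100 * r_sq, 1))
print(round(vif_space, 1), round(vif_locatn, 1))
```

In this data set the average Space is 12.5 at both locations, so Space and Locatn are uncorrelated. That is why both VIFs come out at 1.0 and why the Space coefficient (0.0740) is the same as in the one-variable regression.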
    Obs  Space   Sales     Fit  SE Fit  Residual  St Resid
      1    5.0  1.6000  1.6700  0.1118   -0.0700     -0.39
      2    5.0  2.2000  2.1200  0.1348    0.0800      0.48
      3    5.0  1.4000  1.6700  0.1118   -0.2700     -1.49
      4   10.0  1.9000  2.0400  0.0802   -0.1400     -0.71
      5   10.0  2.4000  2.0400  0.0802    0.3600      1.82
      6   10.0  2.6000  2.4900  0.1101    0.1100      0.60
      7   15.0  2.3000  2.4100  0.0802   -0.1100     -0.56
      8   15.0  2.7000  2.4100  0.0802    0.2900      1.47
      9   15.0  2.8000  2.8600  0.1101   -0.0600     -0.33
     10   20.0  2.6000  2.7800  0.1118   -0.1800     -0.99
     11   20.0  2.9000  2.7800  0.1118    0.1200      0.66
     12   20.0  3.1000  3.2300  0.1348   -0.1300     -0.79

Predicted Values for New Observations

    New Obs     Fit  SE Fit          95.0% CI            95.0% PI
    1        1.8920  0.0902  (1.6880, 2.0960)    (1.3684, 2.4156)

Values of Predictors for New Observations

    New Obs  Space    Locatn
    1         8.00  0.000000

[Figure: Residual Histogram for Sales]

[Figure: Normplot of Residuals for Sales]

[Figure: Residuals vs Fits for Sales]
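The arithmetic behind the ‘Predicted Values for New Observations’ output can be sketched in Python. This is a hand check under stated assumptions, not Minitab's own computation. The inputs are the rounded values printed in the output, and the critical value 2.2622 (t with 9 degrees of freedom and .025 in the upper tail) is typed in from a t table. The endpoints therefore agree with the printout only to about three decimal places.

```python
import math

# Rounded values copied from the regression output for Sales on Space, Locatn.
b0, b1, b2 = 1.30, 0.0740, 0.450  # Constant, Space, Locatn coefficients
s = 0.2132                        # S, the standard error of the estimate
se_fit = 0.0902                   # SE Fit for the new observation
t_crit = 2.2622                   # assumption: t table value, 9 df, two-sided 95%

# New observation from C5 and C6: Space = 8, Locatn = 0.
space_new, locatn_new = 8, 0
fit = b0 + b1 * space_new + b2 * locatn_new

# Confidence interval for the mean response uses SE Fit alone.
ci_half = t_crit * se_fit
ci = (fit - ci_half, fit + ci_half)

# Prediction interval for one new value adds the residual variance S^2.
pi_half = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - pi_half, fit + pi_half)

print(round(fit, 4))
print(tuple(round(v, 3) for v in ci))
print(tuple(round(v, 3) for v in pi))
```

The prediction interval is wider than the confidence interval because it covers the scatter of a single new Sales value around the regression plane, not just the uncertainty in the estimated mean.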