4/28/00 252cp400s (Open this document in 'Page Layout' view!)

Last Computer Problem

The data is below. You need to set up columns for REV (revenue of the Coca-Cola company, 1986-1996), YEAR (year, with 1986 as zero), YEARSQ (YEAR squared), GDP (real GDP) and RMINW (real minimum wage), as well as resid and pred.

Row    REV   YEAR   GDP   RMINW
  1    7.0      0   5.5    3.06
  2    7.7      1   5.6    2.95
  3    8.3      2   5.9    2.83
  4    9.0      3   6.1    2.70
  5   10.2      4   6.1    2.91
  6   11.6      5   6.1    3.12
  7   13.0      6   6.2    3.03
  8   14.0      7   6.4    2.94
  9   16.2      8   6.6    2.87
 10   18.0      9   6.7    2.79
 11   18.5     10   6.9    2.94

To set up YEARSQ use

    LET 'YEARSQ' = 'YEAR'*'YEAR'

To do the problem with all independent variables, you need to do:

    Brief K = 3
    Regress 'REV' on 4 'YEAR' 'YEARSQ' 'GDP' 'RMINW' 'resid' 'pred'
    Plot 'REV' * 'pred'

Explanation: The 4 in the Regress command tells Minitab that there are 4 explanatory (independent) variables. This number must be correct. For example, if it were 3, 'RMINW' would be zapped. Brief K=3 needs to be set only once. The plot shows how well you did: a perfect prediction gives a 45 degree line.

Check the significance tests on the coefficients of your independent variables. Which are not significant? Which have the wrong sign? Can you suggest why?

The rest of the assignment:

First: Try replacing the Regress in the instruction above with Stepwise Regression, as is done on page 844 of the text.

Second: Do more regressions using Regress and Plot, with appropriate changes to the number in the Regress instruction. First regress 'REV' against YEAR alone, then against YEAR and YEARSQ, then against YEAR, YEARSQ and GDP. Then try REV against GDP alone. Create a new variable equal to GDP squared. Do GDP and GDP squared do as well as YEAR and YEARSQ in predicting REV? What about GDP, GDP squared and YEAR? Try the stepwise regression command with GDP squared added to your independent variables.

Worksheet size: 100000 cells
MTB > RETR 'C:\MINITAB\CP4-00.MTW'.
Retrieving worksheet from file: C:\MINITAB\CP4-00.MTW
Worksheet was saved on 4/26/2000
MTB > brief k=3
MTB > regress 'rev' on 4 'year''yearsq''gdp''rminw''resid''pred'

Regression Analysis

The regression equation is
rev = 42.6 + 1.24 year + 0.0618 yearsq - 4.94 gdp - 2.79 rminw

Predictor        Coef      Stdev   t-ratio        p
Constant        42.56      14.98      2.84    0.029
year           1.2360     0.2648      4.67    0.003
yearsq        0.06181    0.01117      5.53    0.001
gdp            -4.944      2.028     -2.44    0.051
rminw          -2.792      1.455     -1.92    0.103

s = 0.3231      R-sq = 99.6%      R-sq(adj) = 99.4%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     4    169.639     42.410    406.29    0.000
Error          6      0.626      0.104
Total         10    170.265

SOURCE        DF     SEQ SS
year           1    166.173
yearsq         1      2.844
gdp            1      0.238
rminw          1      0.384

Obs.   year       rev       Fit  Stdev.Fit  Residual  St.Resid
  1     0.0    7.0000    6.8231     0.2637    0.1769      0.95
  2     1.0    7.7000    7.9336     0.2276   -0.2336     -1.02
  3     2.0    8.3000    8.2067     0.1767    0.0933      0.34
  4     3.0    9.0000    9.1259     0.2466   -0.1259     -0.60
  5     4.0   10.2000   10.2084     0.1608   -0.0084     -0.03
  6     5.0   11.6000   11.4144     0.2345    0.1856      0.84
  7     6.0   13.0000   13.0872     0.1963   -0.0872     -0.34
  8     7.0   14.0000   14.3892     0.1464   -0.3892     -1.35
  9     8.0   16.2000   15.7590     0.1384    0.4410      1.51
 10     9.0   18.0000   17.7748     0.2524    0.2252      1.12
 11    10.0   18.5000   18.7776     0.2912   -0.2776     -1.99

MTB > plot 'rev' * 'pred'

MTB > stepwise 'rev''year''yearsq''gdp''rminw'

Stepwise Regression

F-to-Enter:      4.00    F-to-Remove:      4.00

Response is  rev  on  4 predictors, with N =   11

    Step         1         2
Constant     5.991     6.855

year         1.229     0.653
T-Ratio      19.12      4.67

yearsq                 0.058
T-Ratio                 4.27

S            0.674     0.395
R-Sq         97.60     99.27
More? (Yes, No, Subcommand, or Help)
SUBC> y
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> n

MTB > regress 'rev' on 1 'year''resid''pred'

Regression Analysis

The regression equation is
rev = 5.99 + 1.23 year

Predictor        Coef      Stdev   t-ratio        p
Constant       5.9909     0.3804     15.75    0.000
year          1.22909    0.06429     19.12    0.000

s = 0.6743      R-sq = 97.6%      R-sq(adj) = 97.3%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     1     166.17     166.17    365.45    0.000
Error          9       4.09       0.45
Total         10     170.27

Obs.   year       rev       Fit  Stdev.Fit  Residual  St.Resid
  1     0.0     7.000     5.991      0.380     1.009      1.81
  2     1.0     7.700     7.220      0.328     0.480      0.81
  3     2.0     8.300     8.449      0.280    -0.149     -0.24
  4     3.0     9.000     9.678      0.241    -0.678     -1.08
  5     4.0    10.200    10.907      0.213    -0.707     -1.11
  6     5.0    11.600    12.136      0.203    -0.536     -0.83
  7     6.0    13.000    13.365      0.213    -0.365     -0.57
  8     7.0    14.000    14.595      0.241    -0.595     -0.94
  9     8.0    16.200    15.824      0.280     0.376      0.61
 10     9.0    18.000    17.053      0.328     0.947      1.61
 11    10.0    18.500    18.282      0.380     0.218      0.39

MTB > regress 'rev' on 2 'year''yearsq''resid''pred'

Regression Analysis

The regression equation is
rev = 6.85 + 0.653 year + 0.0576 yearsq

Predictor        Coef      Stdev   t-ratio        p
Constant       6.8545     0.3009     22.78    0.000
year           0.6533     0.1400      4.67    0.000
yearsq        0.05758    0.01348      4.27    0.003

s = 0.3950      R-sq = 99.3%      R-sq(adj) = 99.1%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     2    169.017     84.509    541.67    0.000
Error          8      1.248      0.156
Total         10    170.265

SOURCE        DF     SEQ SS
year           1    166.173
yearsq         1      2.844

Obs.   year       rev       Fit  Stdev.Fit  Residual  St.Resid
  1     0.0     7.000     6.855      0.301     0.145      0.57
  2     1.0     7.700     7.565      0.208     0.135      0.40
  3     2.0     8.300     8.392      0.165    -0.092     -0.25
  4     3.0     9.000     9.333      0.162    -0.333     -0.92
  5     4.0    10.200    10.389      0.174    -0.189     -0.53
  6     5.0    11.600    11.561      0.180     0.039      0.11
  7     6.0    13.000    12.847      0.174     0.153      0.43
  8     7.0    14.000    14.249      0.162    -0.249     -0.69
  9     8.0    16.200    15.766      0.165     0.434      1.21
 10     9.0    18.000    17.398      0.208     0.602      1.79
 11    10.0    18.500    19.145      0.301    -0.645     -2.52R

R denotes an obs. with a large st. resid.
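The fit of rev on year and yearsq can be cross-checked outside Minitab. The following is a minimal sketch in plain Python, not part of the assignment: it solves the least-squares normal equations for the worksheet data. The helper names `solve` and `ols` are made up for this illustration.

```python
# Data copied from the worksheet in the assignment.
rev = [7.0, 7.7, 8.3, 9.0, 10.2, 11.6, 13.0, 14.0, 16.2, 18.0, 18.5]
year = list(range(11))  # 1986 is year 0


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, *predictors):
    """Return (coefficients with the constant first, R-squared as a fraction)."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fit = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    ybar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return coef, 1 - sse / sst


yearsq = [t * t for t in year]
coef, r_sq = ols(rev, year, yearsq)
print([round(c, 4) for c in coef], round(100 * r_sq, 1))
```

The printed coefficients and R-sq should agree, up to rounding, with the Coef column and R-sq line of the year and yearsq regression above. Calling `ols(rev, year)` gives the corresponding check for the regression on year alone.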
MTB > regress 'rev' on 3 'year''yearsq''gdp''resid''pred'

Regression Analysis

The regression equation is
rev = 16.7 + 0.870 year + 0.0588 yearsq - 1.77 gdp

Predictor        Coef      Stdev   t-ratio        p
Constant       16.705      7.684      2.17    0.066
year           0.8698     0.2159      4.03    0.005
yearsq        0.05882    0.01301      4.52    0.000
gdp            -1.773      1.382     -1.28    0.240

s = 0.3800      R-sq = 99.4%      R-sq(adj) = 99.2%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     3    169.255     56.418    390.81    0.000
Error          7      1.011      0.144
Total         10    170.265

SOURCE        DF     SEQ SS
year           1    166.173
yearsq         1      2.844
gdp            1      0.238

Obs.   year       rev       Fit  Stdev.Fit  Residual  St.Resid
  1     0.0     7.000     6.954      0.300     0.046      0.20
  2     1.0     7.700     7.705      0.228    -0.005     -0.02
  3     2.0     8.300     8.219      0.208     0.081      0.25
  4     3.0     9.000     9.029      0.284    -0.029     -0.11
  5     4.0    10.200    10.310      0.178    -0.110     -0.33
  6     5.0    11.600    11.709      0.208    -0.109     -0.34
  7     6.0    13.000    13.049      0.230    -0.049     -0.16
  8     7.0    14.000    14.329      0.168    -0.329     -0.96
  9     8.0    16.200    15.726      0.161     0.474      1.38
 10     9.0    18.000    17.419      0.201     0.581      1.80
 11    10.0    18.500    19.051      0.299    -0.551     -2.35R

R denotes an obs. with a large st. resid.

MTB > regress 'rev' on 1 'gdp''resid''pred'

Regression Analysis

The regression equation is
rev = - 44.1 + 9.09 gdp

Predictor        Coef      Stdev   t-ratio        p
Constant      -44.139      5.297     -8.33    0.000
gdp            9.0900     0.8537     10.65    0.000

s = 1.179       R-sq = 92.6%      R-sq(adj) = 91.8%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     1     157.74     157.74    113.39    0.000
Error          9      12.52       1.39
Total         10     170.27

Obs.    gdp       rev       Fit  Stdev.Fit  Residual  St.Resid
  1    5.50     7.000     5.856      0.689     1.144      1.19
  2    5.60     7.700     6.765      0.617     0.935      0.93
  3    5.90     8.300     9.492      0.434    -1.192     -1.09
  4    6.10     9.000    11.310      0.364    -2.310     -2.06R
  5    6.10    10.200    11.310      0.364    -1.110     -0.99
  6    6.10    11.600    11.310      0.364     0.290      0.26
  7    6.20    13.000    12.219      0.356     0.781      0.69
  8    6.40    14.000    14.037      0.398    -0.037     -0.03
  9    6.60    16.200    15.855      0.498     0.345      0.32
 10    6.70    18.000    16.764      0.562     1.236      1.19
 11    6.90    18.500    18.582      0.702    -0.082     -0.09

R denotes an obs. with a large st. resid.
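When comparing these regressions, it helps to see how s, R-sq and R-sq(adj) follow from the Analysis of Variance table. The sketch below uses the standard textbook formulas with the Error and Total SS printed above. The function name `fit_summary` is invented for this illustration, and small differences from the printed values come from rounding in the printed SS.

```python
import math


def fit_summary(sse, sst, n, k):
    """Return (s, R-sq in %, R-sq(adj) in %) from Error SS and Total SS.

    n is the number of observations, k the number of predictors.
    """
    df_error = n - k - 1
    mse = sse / df_error                       # Error MS
    r_sq = 1 - sse / sst                       # share of variation explained
    r_sq_adj = 1 - mse / (sst / (n - 1))       # penalizes extra predictors
    return math.sqrt(mse), 100 * r_sq, 100 * r_sq_adj


# Error SS and Total SS as printed in the two ANOVA tables above; N = 11.
year_yearsq_gdp = fit_summary(sse=1.011, sst=170.265, n=11, k=3)
gdp_alone = fit_summary(sse=12.52, sst=170.27, n=11, k=1)

for name, (s, r_sq, r_sq_adj) in [("year yearsq gdp", year_yearsq_gdp),
                                  ("gdp alone", gdp_alone)]:
    print(f"{name:16s} s = {s:.3f}  R-sq = {r_sq:.1f}%  R-sq(adj) = {r_sq_adj:.1f}%")
```

Because R-sq(adj) divides each sum of squares by its degrees of freedom, it is the fairer figure to use when the regressions being compared have different numbers of predictors.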
MTB > regress 'rev' on 2 'gdp''gdpsq''resid''pred'

Regression Analysis

* NOTE *       gdp is highly correlated with other predictor variables
* NOTE *     gdpsq is highly correlated with other predictor variables

The regression equation is
rev = 65.7 - 26.5 gdp + 2.88 gdpsq

Predictor        Coef      Stdev   t-ratio        p
Constant        65.67      67.70      0.97    0.360
gdp            -26.54      21.92     -1.21    0.261
gdpsq           2.877      1.769      1.63    0.143

s = 1.085       R-sq = 94.5%      R-sq(adj) = 93.1%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     2    160.855     80.428     68.38    0.000
Error          8      9.410      1.176
Total         10    170.265

SOURCE        DF     SEQ SS
gdp            1    157.745
gdpsq          1      3.111

Obs.    gdp       rev       Fit  Stdev.Fit  Residual  St.Resid
  1    5.50     7.000     6.733      0.832     0.267      0.38
  2    5.60     7.700     7.273      0.648     0.427      0.49
  3    5.90     8.300     9.237      0.428    -0.937     -0.94
  4    6.10     9.000    10.835      0.444    -1.835     -1.85
  5    6.10    10.200    10.835      0.444    -0.635     -0.64
  6    6.10    11.600    10.835      0.444     0.765      0.77
  7    6.20    13.000    11.720      0.449     1.280      1.30
  8    6.40    14.000    13.662      0.432     0.338      0.34
  9    6.60    16.200    15.835      0.458     0.365      0.37
 10    6.70    18.000    17.008      0.538     0.992      1.05
 11    6.90    18.500    19.526      0.868    -1.026     -1.58

MTB > regress 'rev' on 3 'gdp''gdpsq''year''resid''pred'

Regression Analysis

* NOTE *       gdp is highly correlated with other predictor variables
* NOTE *     gdpsq is highly correlated with other predictor variables

The regression equation is
rev = 114 - 34.2 gdp + 2.68 gdpsq + 1.36 year

Predictor        Coef      Stdev   t-ratio        p
Constant       114.01      27.61      4.13    0.004
gdp           -34.202      8.705     -3.93    0.006
gdpsq          2.6765     0.6971      3.84    0.006
year           1.3642     0.2042      6.68    0.000

s = 0.4269      R-sq = 99.3%      R-sq(adj) = 98.9%

Analysis of Variance

SOURCE        DF         SS         MS         F        p
Regression     3    168.990     56.330    309.13    0.000
Error          7      1.276      0.182
Total         10    170.265

SOURCE        DF     SEQ SS
gdp            1    157.745
gdpsq          1      3.111
year           1      8.135

Obs.    gdp       rev       Fit  Stdev.Fit  Residual  St.Resid
  1    5.50     7.000     6.862      0.328     0.138      0.51
  2    5.60     7.700     7.777      0.266    -0.077     -0.23
  3    5.90     8.300     8.114      0.238     0.186      0.52
  4    6.10     9.000     9.062      0.318    -0.062     -0.22
  5    6.10    10.200    10.426      0.185    -0.226     -0.59
  6    6.10    11.600    11.790      0.226    -0.190     -0.53
  7    6.20    13.000    13.027      0.263    -0.027     -0.08
  8    6.40    14.000    14.295      0.195    -0.295     -0.78
  9    6.60    16.200    15.778      0.181     0.422      1.09
 10    6.70    18.000    17.282      0.216     0.718      1.95
 11    6.90    18.500    19.086      0.348    -0.586     -2.37R

R denotes an obs. with a large st. resid.

MTB > stepwise 'rev''year''yearsq''gdp''gdpsq''rminw'

Stepwise Regression

F-to-Enter:      4.00    F-to-Remove:      4.00

Response is  rev  on  5 predictors, with N =   11

    Step         1         2
Constant     5.991     6.855

year         1.229     0.653
T-Ratio      19.12      4.67

yearsq                 0.058
T-Ratio                 4.27

S            0.674     0.395
R-Sq         97.60     99.27
More? (Yes, No, Subcommand, or Help)
SUBC> y
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> y
No variables entered or removed
More? (Yes, No, Subcommand, or Help)
SUBC> n

MTB > print c1-c8

Data Display

Row    rev  year  yearsq   gdp  rminw   gdpsq      resid      pred
  1    7.0     0       0   5.5   3.06   30.25    0.50585    6.8618
  2    7.7     1       1   5.6   2.95   31.36   -0.22994    7.7768
  3    8.3     2       4   5.9   2.83   34.81    0.52356    8.1145
  4    9.0     3       9   6.1   2.70   37.21   -0.21764    9.0620
  5   10.2     4      16   6.1   2.91   37.21   -0.58829   10.4262
  6   11.6     5      25   6.1   3.12   37.21   -0.52584   11.7905
  7   13.0     6      36   6.2   3.03   38.44   -0.07928   13.0266
  8   14.0     7      49   6.4   2.94   40.96   -0.77751   14.2953
  9   16.2     8      64   6.6   2.87   43.56    1.09065   15.7782
 10   18.0     9      81   6.7   2.79   44.89    1.94876   17.2820
 11   18.5    10     100   6.9   2.94   47.61   -2.37060   19.0860
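Both stepwise runs stop after entering year and yearsq. The sketch below illustrates why gdp is not added, assuming the usual stepwise rule: a candidate enters only if its partial F statistic exceeds F-to-Enter. That F equals its sequential SS when entered last, divided by the Error MS of the larger model, which is also the square of its t-ratio. The inputs are the printed values from the regression of rev on year, yearsq and gdp; the function name `f_to_enter` is invented for this illustration.

```python
F_TO_ENTER = 4.00  # value shown in the stepwise output


def f_to_enter(seq_ss, sse_full, df_error_full):
    """Partial F for the variable entered last: its SEQ SS over the larger model's Error MS."""
    return seq_ss / (sse_full / df_error_full)


# gdp entered after year and yearsq: SEQ SS = 0.238, Error SS = 1.011 on 7 DF.
f_gdp = f_to_enter(seq_ss=0.238, sse_full=1.011, df_error_full=7)
t_gdp = -1.28  # printed t-ratio for gdp in the same regression

print(f"F for gdp = {f_gdp:.2f}, t squared = {t_gdp ** 2:.2f}")
print("gdp enters" if f_gdp > F_TO_ENTER else "gdp does not enter")
```

The same check cannot be made for rminw or gdpsq from the output shown, because no regression adding either of them to year and yearsq alone was run.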