Stat 328 Exam II (Regression)
Summer 2001
Prof. Vardeman

This exam treats the analysis of some data taken from Data Analysis Using Regression Models by Frees. They concern the earnings of $n = 34$ stock insurance companies. For each company there are values for

$y$ = EPS = second quarter 1991 earnings per share (presumably in dollars?)

and for the 7 predictor variables

$x_1$ = EPS90 = 1990 earnings per share (presumably in dollars?)
$x_2$ = TAPCE = total assets as a percentage of common stockholders equity
$x_3$ = LDPTC = long term debt as a percent of total capital
$x_4$ = NTSLS = net sales in thousands of dollars
$x_5$ = PFDIV = preferred dividends (presumably in dollars?)
$x_6$ = CSO = common shares outstanding
$x_7$ = TAPRE = total assets as a percent of retained earnings

The JMP reports for these data attached to this exam are:

page 1: output from JMP's "Multivariate" routine
page 2: output from a run of JMP's "Fit Y by X" routine
page 3: output from a first run of JMP's "Fit Model" routine
page 4: output from a second run of JMP's "Fit Model" routine
page 5: some plots based on the (page 4) second run of JMP's "Fit Model" routine
page 6: the JMP data table, with some added columns based on the (page 4) second run of "Fit Model"

a) Based on page 1 of the printouts, which single variable is the best predictor of EPS? Explain.

b) Exactly what on page 1 of the printout indicates that there is strong multicollinearity present in this data set?

Consider first the problem of predicting EPS from only EPS90 using the normal SLR model.

c) What do you estimate to be the standard deviation of 2nd quarter 1991 earnings per share for companies of this kind having a particular 1990 earnings performance?

d) Give a single number estimate and 95% confidence limits for the increase in mean EPS that accompanies a one dollar increase in EPS90. Explain why you should not be surprised that the estimate is roughly this size.

single number estimate: ___________
lower confidence limit: __________
upper confidence limit: __________
rationale:

e) Is the hypothesis $H_0\colon \beta_1 = 0$ plausible in this SLR? (Give and interpret an appropriate $p$-value. Say precisely where you found it on the printouts.)

$p$-value: __________
from exactly where?: __________
interpretation:

f) Give 95% confidence limits for the mean 2nd quarter 1991 earnings per share of companies of this type with 1990 earnings per share of $2.

g) Give 95% prediction limits for the 2nd quarter 1991 earnings per share of an additional company of this type with 1990 earnings per share of $2.

h) Two particular additional companies of this type had 1990 earnings per share of $2 and $4.50. I'd like to predict the difference in their 2nd quarter 1991 earnings per share. Notice that a sensible single number prediction is $\hat{L} = (b_0 + 4.5 b_1) - (b_0 + 2 b_1) = 2.5 b_1$. You can get a standard error for $\hat{L}$ without using anything beyond the pages already in your possession. Give 95% prediction limits for this difference. (If you can't see how to find the necessary standard error, write "standard error for $\hat{L}$" where it is needed in the formula.)

Now consider the multiple regressions represented on the JMP reports.

i) Find an $F$ value, its degrees of freedom and (as best you can) the corresponding $p$-value for judging whether, after accounting for EPS90, some among the other 6 predictors provide statistically detectable explanatory power for modeling EPS.

$F$ = __________
d.f. = ______ , ______
$p$-value = __________

j) What do you estimate to be the increase in mean EPS for a one dollar increase in EPS90, holding all 6 other predictors fixed? Give 95% confidence limits. (No need to simplify.)

k) Based on pages 3 and 4 of the printout, Vardeman would be inclined to use only some of the 7 possible predictor variables. Make his case. Point out at least 3 features of pages 3 and 4 that support this judgment.
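
Note on part i): the comparison asked for there is the usual partial F test for nested linear models. The following is a minimal sketch of that arithmetic, not an official solution. It assumes the error sums of squares and error degrees of freedom are read from the page 2 ("Fit Y by X") and page 3 (first "Fit Model") reports; the function name `partial_f` is hypothetical.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for a reduced model nested in a full model.

    Returns (F, numerator d.f., denominator d.f.).
    """
    num_df = df_reduced - df_full  # number of predictors being tested
    f_stat = ((sse_reduced - sse_full) / num_df) / (sse_full / df_full)
    return f_stat, num_df, df_full


# Assumed readings from the printouts:
#   reduced model (EPS90 only, page 2): SSE = 3.448377 on 32 error d.f.
#   full model (all 7 predictors, page 3): SSE = 1.194534 on 26 error d.f.
f_stat, num_df, den_df = partial_f(3.448377, 32, 1.194534, 26)
print(f"F = {f_stat:.3f} on {num_df} and {den_df} d.f.")
```

The sketch stops at the statistic and its degrees of freedom; the p-value would then be bounded from an F table with those degrees of freedom, which is presumably what "as best you can" refers to.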
Henceforth consider modeling EPS using only the 3 predictors EPS90, LDPTC, and PFDIV.

l) Say what the plots on pages 4 and 5 of the printout indicate about the appropriateness of such modeling.

m) For which two cases (which two companies) do you judge the fitted equation to do the poorest job of predicting EPS? Explain.

n) Does it appear that one could drop any single predictor from this equation without significantly reducing one's ability to account for EPS? Explain.

o) Give 95% prediction limits for the EPS of an additional company with EPS90 = 2.15, LDPTC = 0 and PFDIV = 0. (Plug in, but there's no need to simplify.)


Stat 328 MLR Exam Printout

Printout page 1

Multivariate

Correlations
            EPS    EPS90    TAPCE    LDPTC    NTSLS    PFDIV      CSO    TAPRE
EPS      1.0000   0.8926  -0.0761  -0.3339   0.3620   0.0371   0.1155  -0.2536
EPS90    0.8926   1.0000  -0.0580  -0.1137   0.2678  -0.1166   0.1408  -0.2106
TAPCE   -0.0761  -0.0580   1.0000   0.0610   0.2734  -0.0822   0.1447   0.8158
LDPTC   -0.3339  -0.1137   0.0610   1.0000  -0.0140   0.4312   0.1961   0.2793
NTSLS    0.3620   0.2678   0.2734  -0.0140   1.0000   0.3472   0.7288   0.0575
PFDIV    0.0371  -0.1166  -0.0822   0.4312   0.3472   1.0000   0.2466  -0.0659
CSO      0.1155   0.1408   0.1447   0.1961   0.7288   0.2466   1.0000   0.0199
TAPRE   -0.2536  -0.2106   0.8158   0.2793   0.0575  -0.0659   0.0199   1.0000

Scatterplot Matrix
[Figure: scatterplot matrix of EPS, EPS90, TAPCE, LDPTC, NTSLS, PFDIV, CSO, and TAPRE]

Printout page 2

Bivariate Fit of EPS By EPS90
[Figure: scatterplot of EPS versus EPS90 with the fitted line]

Linear Fit
EPS = -0.06612 + 0.2930231 EPS90

Summary of Fit
RSquare                       0.796728
RSquare Adj                   0.790376
Root Mean Square Error        0.328271
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       1        13.515976       13.5160   125.4246     <.0001
Error      32         3.448377        0.1078
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|   Lower 95%   Upper 95%
Intercept    -0.06612    0.109919     -0.60     0.5517   -0.290018   0.1577778
EPS90       0.2930231    0.026164     11.20     <.0001    0.239728   0.3463182

Printout page 3

Response EPS
Whole Model

Actual by Predicted Plot
[Figure: EPS Actual versus EPS Predicted; P<.0001 RSq=0.93 RMSE=0.2143]

Summary of Fit
RSquare                       0.929586
RSquare Adj                   0.910628
Root Mean Square Error        0.214345
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       7        15.769819       2.25283    49.0347     <.0001
Error      26         1.194534       0.04594
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|
Intercept   0.1768909    0.114681      1.54     0.1350
EPS90       0.2942356    0.019265     15.27     <.0001
TAPCE        -0.00015    0.000142     -1.05     0.3013
LDPTC       -0.014567    0.002694     -5.41     <.0001
NTSLS       0.0000109    0.000015      0.71     0.4859
PFDIV       0.0362552    0.008572      4.23     0.0003
CSO         -7.665e-7    0.000001     -0.54     0.5926
TAPRE       0.0000968     0.00007      1.39     0.1776

Effect Tests
Source   Nparm   DF   Sum of Squares    F Ratio   Prob > F
EPS90        1    1        10.717653   233.2785     <.0001
TAPCE        1    1         0.051108     1.1124     0.3013
LDPTC        1    1         1.342878    29.2288     <.0001
NTSLS        1    1         0.022958     0.4997     0.4859
PFDIV        1    1         0.821897    17.8893     0.0003
CSO          1    1         0.013482     0.2934     0.5926
TAPRE        1    1         0.088208     1.9199     0.1776

Residual by Predicted Plot
[Figure: EPS Residual versus EPS Predicted]

Press
1.9350050355

Printout page 4

Response EPS
Whole Model

Actual by Predicted Plot
[Figure: EPS Actual versus EPS Predicted; P<.0001 RSq=0.92 RMSE=0.2095]

Summary of Fit
RSquare                       0.9224
RSquare Adj                   0.91464
Root Mean Square Error        0.209478
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       3        15.647918       5.21597   118.8658     <.0001
Error      30         1.316435       0.04388
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|   Lower 95%   Upper 95%
Intercept   0.1630568    0.089807      1.82     0.0794   -0.020354   0.3464675
EPS90       0.2908916    0.016853     17.26     <.0001   0.2564729   0.3253103
LDPTC       -0.013677    0.002142     -6.38     <.0001   -0.018052   -0.009302
PFDIV       0.0363115    0.006933      5.24     <.0001    0.022152    0.050471

Effect Tests
Source   Nparm   DF   Sum of Squares    F Ratio   Prob > F
EPS90        1    1        13.073125   297.9210     <.0001
LDPTC        1    1         1.788888    40.7666     <.0001
PFDIV        1    1         1.203646    27.4297     <.0001

Residual by Predicted Plot
[Figure: EPS Residual versus EPS Predicted]

Press
1.7181931456

Printout page 5

[Figure: plots of Studentized Resid EPS against Pred Formula EPS, EPS90, TAPCE, LDPTC, NTSLS, PFDIV, CSO, and TAPRE]

Printout page 6

Rows     EPS   EPS90     TAPCE    LDPTC     NTSLS    PFDIV      CSO    TAPRE  Pred Formula EPS  PredSE EPS  Studentized Resid EPS
   1    0.33    1.29    658.38   29.778    1338.4        0    40600     2726  0.1310351         0.06122086  0.9931724
   2    1.44    4.74   1262.67   11.965   19020.5        0   110113   1520.2  1.3782384         0.04801077  0.30289799
   3     1.1    4.69    817.01   40.339      4481        2   111275    902.8  1.04824739        0.05557923  0.25623832
   4    1.88    6.76    587.04   44.156     16214    9.688   212143    697.7  1.87735098        0.08254454  0.01375902
   5    2.46    4.88    716.34    0.734    9944.4    9.787    61793    828.3  1.92794965        0.09172789  2.82513651
   6    0.28    0.91    192.57    6.184     449.2        0    35543    264.3  0.34318998        0.0720207   -0.3212368
   7   -0.06     0.5   -697.29   75.129     443.1   21.215    38715   -264.4  0.05131654        0.13328286  -0.6888097
   8     1.1    3.56     836.3   12.565    2929.2    0.112    47502    919.5  1.03084705        0.04503791  0.33802485
   9    1.75    4.28   1232.05   13.547    8489.6   10.625    41496     1338  1.60860113        0.07523226  0.72325808
  10     1.1    4.15    222.17   22.765      2723        0    73521    235.4  1.05890143        0.04258077  0.20037821
  11    0.03    2.05   2479.29   63.804    3683.8        0    74550   5791.7  -0.1132588        0.10393915  0.78768535
  12    0.69    3.29    1138.5   68.763    6703.1   15.759    76352   1473.7  0.75185589        0.1022246   -0.3383018
  13     0.9    2.28   1306.39   17.733   11313.4       17   102170   1597.4  1.20105214        0.10933127  -1.684833
  14    0.26    1.53    938.04   28.028     870.1        0   105600   1274.9  0.22478373        0.05765498  0.17486788
  15    0.43    1.44   1015.83   10.026    2678.4        0    81451   1335.9  0.44481573        0.06236511  -0.0740863
  16    0.95    3.61     716.2   26.168    2626.4        0    65109    793.2  0.85527735        0.04348473  0.46225283
  17    1.39    3.39   1188.47   19.922    2577.3   10.432    44783   1044.8  1.255509          0.06524952  0.6756407
  18    1.28    4.41    329.26        0    1162.6        0    34523    340.1  1.44588875        0.06097201  -0.8277528
  19    1.05    3.06     580.2   35.484     334.1        0     7847    666.8  0.56787261        0.05346186  2.38038959
  20    0.94     4.9   1650.65   27.136    1331.7        0    12264   2306.9  1.21728823        0.04852117  -1.3607138
  21    0.75    3.85   1007.16    13.39    2751.3        0    46481   1043.6  1.09985525        0.04438589  -1.7089289
  22    1.16    4.44    587.18    34.54    1796.1    6.898    51315    738.5  1.23269082        0.04641522  -0.3558541
  23    1.65    5.01    769.27    5.768    2170.3        0    33198   1699.8  1.54153512        0.05554727  0.53700958
  24     0.3    2.28    855.98   23.225     456.4        0    10786   1138.8  0.50864276        0.04871886  -1.0240926
  25    1.16    4.27    492.98   27.256    1235.6    0.463    15358    418.7  1.04919752        0.04354933  0.54075967
  26    0.23    0.09     875.6    5.678     702.6    0.366    10105    938.6  0.12486942        0.08243071  0.54591115
  27    1.52    5.79    425.57   22.111    4284.1        0    81912    543.2  1.54490836        0.05426177  -0.1231085
  28    0.35    2.53    675.87   27.671    5694.7      3.8    54388    784.2  0.65854172        0.04028192  -1.5009168
  29     0.7    2.15    432.75        0        86        0     4885    967.4  0.78847375        0.06756913  -0.4462025
  30    3.49   12.45    368.65   17.929    1931.6        0    14851    243.2  3.53944333        0.15206464  -0.3431784
  31    0.93    3.81    232.88   18.903     562.4        0    20833    248.3  1.01281862        0.04186013  -0.4034948
  32    0.96    2.77   1128.74   37.987     783.9    7.469     6354   3405.9  0.72049132        0.04759604  1.17406501
  33    0.38    3.58    659.61    53.69      1357        0    23100   1040.7  0.4701339         0.08234599  -0.4679499
  34    0.82    3.94    534.14   15.174    3053.2        0    62722    590.1  1.10163583        0.04329175  -1.3741276
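
The added columns of the data table can be related back to the page 4 report. The sketch below is illustrative only. It assumes that "Pred Formula EPS" is the fitted value from the 3-predictor equation, that "PredSE EPS" is the standard error of the fitted mean (so the leverage is (PredSE / RMSE)^2), and that "Studentized Resid EPS" is the internally studentized residual. The function names are hypothetical, and the coefficients are the rounded values printed on page 4.

```python
import math

# Rounded coefficients and root mean square error from the 3-predictor
# "Fit Model" report (printout page 4).
B0, B_EPS90, B_LDPTC, B_PFDIV = 0.1630568, 0.2908916, -0.013677, 0.0363115
RMSE = 0.209478


def predict_eps(eps90, ldptc, pfdiv):
    """Fitted EPS from the 3-predictor equation."""
    return B0 + B_EPS90 * eps90 + B_LDPTC * ldptc + B_PFDIV * pfdiv


def studentized_residual(eps, fitted, pred_se):
    """Internally studentized residual, assuming PredSE = RMSE * sqrt(leverage)."""
    leverage = (pred_se / RMSE) ** 2
    return (eps - fitted) / (RMSE * math.sqrt(1.0 - leverage))


# Row 5 of the data table: EPS = 2.46, EPS90 = 4.88, LDPTC = 0.734,
# PFDIV = 9.787, PredSE EPS = 0.09172789.
fitted_5 = predict_eps(4.88, 0.734, 9.787)
resid_5 = studentized_residual(2.46, fitted_5, 0.09172789)
print(f"fitted = {fitted_5:.4f}, studentized residual = {resid_5:.3f}")
```

Under these assumptions the two results agree with the row 5 entries of the table (1.92794965 and 2.82513651) up to the rounding of the printed coefficients, which suggests the columns were computed in this way.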