Stat 328 Exam II (Regression)
Summer 2001
Prof. Vardeman
This exam treats the analysis of some data taken from Data Analysis Using Regression
Models by Frees. They concern the earnings of \(n = 34\) stock insurance companies. For
each company there are values for
\(y\) = EPS = second quarter 1991 earnings per share (presumably in dollars?)
and for the 7 predictor variables
\(x_1\) = EPS90 = 1990 earnings per share (presumably in dollars?)
\(x_2\) = TAPCE = total assets as a percentage of common stockholders equity
\(x_3\) = LDPTC = long term debt as a percent of total capital
\(x_4\) = NTSLS = net sales in thousands of dollars
\(x_5\) = PFDIV = preferred dividends (presumably in dollars?)
\(x_6\) = CSO = common shares outstanding
\(x_7\) = TAPRE = total assets as a percent of retained earnings
The JMP reports for these data attached to this exam are:
page 1: output from JMP's "Multivariate" routine
page 2: output from a run of JMP's "Fit Y by X" routine
page 3: output from a first run of JMP's "Fit Model" routine
page 4: output from a second run of JMP's "Fit Model" routine
page 5: some plots based on the (page 4) second run of JMP's "Fit Model" routine
page 6: the JMP data table, with some added columns based on the (page 4) second run of "Fit Model"
a) Based on page 1 of the printouts, which single variable is the best predictor of EPS?
Explain.
b) Exactly what on page 1 of the printout indicates that there is strong multicollinearity
present in this data set?
Consider first the problem of predicting EPS from only EPS90 using the normal
SLR model.
c) What do you estimate to be the standard deviation of 2nd quarter 1991 earnings per
share for companies of this kind having a particular 1990 earnings performance?
d) Give a single number estimate and 95% confidence limits for the increase in mean
EPS that accompanies a one dollar increase in EPS90. Explain why you should not be
surprised that the estimate is roughly this size.
single number estimate: ___________
lower confidence limit: __________
upper confidence limit: __________
rationale:
e) Is the hypothesis \(H_0\colon \beta_1 = 0\) plausible in this SLR? (Give and interpret an
appropriate p-value. Say precisely where you found it on the printouts.)
p-value: __________
from exactly where?: __________
interpretation:
f) Give 95% confidence limits for the mean 2nd quarter 1991 earnings per share of
companies of this type with 1990 earnings per share of $2.
g) Give 95% prediction limits for the 2nd quarter 1991 earnings per share of an
additional company of this type with 1990 earnings per share of $2.
h) Two particular additional companies of this type had 1990 earnings per share of $2
and $4.50. I'd like to predict the difference in their 2nd quarter 1991 earnings per share.
Notice that a sensible single number prediction is
\(\hat{L} = (b_0 + 4.5 b_1) - (b_0 + 2 b_1) = 2.5 b_1\). You can get a standard error for \(\hat{L}\) without using
anything beyond the pages already in your possession. Give 95% prediction limits for
this difference. (If you can't see how to find the necessary standard error, write "standard
error for \(\hat{L}\)" where it is needed in the formula.)
Now consider the multiple regressions represented on the JMP reports.
i) Find an F value, its degrees of freedom and (as best you can) the corresponding
p-value for judging whether after accounting for EPS90, some among the other 6
predictors provide statistically detectable explanatory power for modeling EPS.
F = __________
d.f. = ______ , ______
p-value = __________
j) What do you estimate to be the increase in mean EPS for a one dollar increase in
EPS90, holding all 6 other predictors fixed? Give 95% confidence limits. (No need to
simplify.)
k) Based on pages 3 and 4 of the printout, Vardeman would be inclined to use only some
of the 7 possible predictor variables. Make his case. Point out at least 3 features of pages
3 and 4 that support this judgment.
Henceforth consider modeling EPS using only the 3 predictors EPS90,
LDPTC, and PFDIV.
l) Say what the plots on pages 4 and 5 of the printout indicate about the appropriateness
of such modeling.
m) For which two cases (which two companies) do you judge the fitted equation to do
the poorest job of predicting EPS? Explain.
n) Does it appear that one could drop any single predictor from this equation without
significantly reducing one's ability to account for EPS? Explain.
o) Give 95% prediction limits for the EPS of an additional company with
EPS90 = 2.15, LDPTC = 0 and PFDIV = 0. (Plug in, but there's no need to
simplify.)
Stat 328 MLR Exam Printout
Multivariate
Correlations
             EPS    EPS90    TAPCE    LDPTC    NTSLS    PFDIV      CSO    TAPRE
EPS       1.0000   0.8926  -0.0761  -0.3339   0.3620   0.0371   0.1155  -0.2536
EPS90     0.8926   1.0000  -0.0580  -0.1137   0.2678  -0.1166   0.1408  -0.2106
TAPCE    -0.0761  -0.0580   1.0000   0.0610   0.2734  -0.0822   0.1447   0.8158
LDPTC    -0.3339  -0.1137   0.0610   1.0000  -0.0140   0.4312   0.1961   0.2793
NTSLS     0.3620   0.2678   0.2734  -0.0140   1.0000   0.3472   0.7288   0.0575
PFDIV     0.0371  -0.1166  -0.0822   0.4312   0.3472   1.0000   0.2466  -0.0659
CSO       0.1155   0.1408   0.1447   0.1961   0.7288   0.2466   1.0000   0.0199
TAPRE    -0.2536  -0.2106   0.8158   0.2793   0.0575  -0.0659   0.0199   1.0000
Scatterplot Matrix
[Scatterplot matrix of EPS, EPS90, TAPCE, LDPTC, NTSLS, PFDIV, CSO, and TAPRE; plot not reproduced]
[printout page 1]
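As a reading aid for page 1 (not part of the original printout), the following Python sketch scans the correlation matrix for the predictor most strongly correlated with EPS and for pairs of predictors that are strongly correlated with each other. The 0.7 cutoff is an arbitrary illustrative assumption, not a rule taken from the exam.

```python
# Correlations copied from the "Multivariate" printout (page 1).
names = ["EPS", "EPS90", "TAPCE", "LDPTC", "NTSLS", "PFDIV", "CSO", "TAPRE"]
corr = [
    [1.0000, 0.8926, -0.0761, -0.3339, 0.3620, 0.0371, 0.1155, -0.2536],
    [0.8926, 1.0000, -0.0580, -0.1137, 0.2678, -0.1166, 0.1408, -0.2106],
    [-0.0761, -0.0580, 1.0000, 0.0610, 0.2734, -0.0822, 0.1447, 0.8158],
    [-0.3339, -0.1137, 0.0610, 1.0000, -0.0140, 0.4312, 0.1961, 0.2793],
    [0.3620, 0.2678, 0.2734, -0.0140, 1.0000, 0.3472, 0.7288, 0.0575],
    [0.0371, -0.1166, -0.0822, 0.4312, 0.3472, 1.0000, 0.2466, -0.0659],
    [0.1155, 0.1408, 0.1447, 0.1961, 0.7288, 0.2466, 1.0000, 0.0199],
    [-0.2536, -0.2106, 0.8158, 0.2793, 0.0575, -0.0659, 0.0199, 1.0000],
]

# Predictor with the largest absolute correlation with the response EPS (row 0).
best_index = max(range(1, len(names)), key=lambda j: abs(corr[0][j]))
best_predictor = names[best_index]

# Pairs of predictors (response excluded) whose absolute correlation exceeds
# an illustrative cutoff; the value 0.7 is an assumption made for this sketch.
THRESHOLD = 0.7
strong_pairs = [
    (names[i], names[j], corr[i][j])
    for i in range(1, len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i][j]) > THRESHOLD
]

print("Largest |r| with EPS:", best_predictor, corr[0][best_index])
for a, b, r in strong_pairs:
    print(f"Strongly correlated predictors: {a} and {b} (r = {r})")
```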
Bivariate Fit of EPS By EPS90
[Scatterplot of EPS versus EPS90 with the fitted line; plot not reproduced]

Linear Fit
EPS = -0.06612 + 0.2930231 EPS90

Summary of Fit
RSquare                       0.796728
RSquare Adj                   0.790376
Root Mean Square Error        0.328271
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       1        13.515976       13.5160   125.4246     <.0001
Error      32         3.448377        0.1078
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|   Lower 95%   Upper 95%
Intercept    -0.06612    0.109919     -0.60     0.5517   -0.290018   0.1577778
EPS90       0.2930231    0.026164     11.20     <.0001    0.239728   0.3463182
[printout page 2]
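The entries on page 2 are tied together by a few standard simple linear regression identities. The sketch below (an illustrative aside, not part of the original printout) recomputes RSquare, Root Mean Square Error, and the F Ratio from the sums of squares, and backs out the t multiplier that the printed 95% limits imply. All inputs are copied from the printout; small discrepancies come from rounding in the printed values.

```python
import math

# Values copied from the "Fit Y by X" printout (page 2).
n = 34
ss_model, ss_error, ss_total = 13.515976, 3.448377, 16.964353
b1, se_b1 = 0.2930231, 0.026164          # slope estimate and its standard error
lower, upper = 0.239728, 0.3463182       # printed 95% limits for the slope

df_error = n - 2                         # SLR: two fitted coefficients
r_square = ss_model / ss_total           # fraction of variation explained
mse = ss_error / df_error                # mean square error
rmse = math.sqrt(mse)                    # "Root Mean Square Error"
f_ratio = ss_model / mse                 # the model has 1 degree of freedom
t_ratio = b1 / se_b1                     # in SLR, t_ratio**2 equals the F ratio

# The printed limits are b1 +/- t * se_b1, so the multiplier t can be
# recovered from their width without consulting a t table.
t_mult = (upper - lower) / (2 * se_b1)

print(f"RSquare = {r_square:.6f}")
print(f"RMSE    = {rmse:.6f}")
print(f"F Ratio = {f_ratio:.4f}, t Ratio squared = {t_ratio ** 2:.4f}")
print(f"implied t multiplier on {df_error} d.f. = {t_mult:.4f}")
```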
Response EPS
Whole Model
Actual by Predicted Plot
[Plot of EPS Actual versus EPS Predicted; P<.0001, RSq=0.93, RMSE=0.2143; plot not reproduced]

Summary of Fit
RSquare                       0.929586
RSquare Adj                   0.910628
Root Mean Square Error        0.214345
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       7        15.769819       2.25283    49.0347     <.0001
Error      26         1.194534       0.04594
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|
Intercept   0.1768909    0.114681      1.54     0.1350
EPS90       0.2942356    0.019265     15.27     <.0001
TAPCE        -0.00015    0.000142     -1.05     0.3013
LDPTC       -0.014567    0.002694     -5.41     <.0001
NTSLS       0.0000109    0.000015      0.71     0.4859
PFDIV       0.0362552    0.008572      4.23     0.0003
CSO         -7.665e-7    0.000001     -0.54     0.5926
TAPRE       0.0000968     0.00007      1.39     0.1776

Effect Tests
Source   Nparm   DF   Sum of Squares    F Ratio   Prob > F
EPS90        1    1        10.717653   233.2785     <.0001
TAPCE        1    1         0.051108     1.1124     0.3013
LDPTC        1    1         1.342878    29.2288     <.0001
NTSLS        1    1         0.022958     0.4997     0.4859
PFDIV        1    1         0.821897    17.8893     0.0003
CSO          1    1         0.013482     0.2934     0.5926
TAPRE        1    1         0.088208     1.9199     0.1776

Residual by Predicted Plot
[Plot of EPS Residual versus EPS Predicted; plot not reproduced]

Press
1.9350050355
[printout page 3]
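A reduced model can be compared with a full model that contains it through the usual partial F statistic, built from the two Error lines of the Analysis of Variance tables. The sketch below is an illustrative aside (not part of the original printout); the helper name `partial_f` is hypothetical. It compares the page 2 fit (EPS90 only) with the page 3 fit (all 7 predictors). It does not compute a p-value; that would come from an F table or statistical software.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for a reduced model nested inside a full model.

    sse_* are error sums of squares and df_* are the matching error
    degrees of freedom. Returns (F, numerator d.f., denominator d.f.).
    """
    num_df = df_reduced - df_full
    f_value = ((sse_reduced - sse_full) / num_df) / (sse_full / df_full)
    return f_value, num_df, df_full


# Error lines copied from the printout:
#   page 2 (EPS90 only):      SSE = 3.448377 on 32 d.f.
#   page 3 (all 7 predictors): SSE = 1.194534 on 26 d.f.
f_value, df1, df2 = partial_f(3.448377, 32, 1.194534, 26)
print(f"F = {f_value:.3f} on {df1} and {df2} d.f.")
```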
Response EPS
Whole Model
Actual by Predicted Plot
[Plot of EPS Actual versus EPS Predicted; P<.0001, RSq=0.92, RMSE=0.2095; plot not reproduced]

Summary of Fit
RSquare                       0.9224
RSquare Adj                   0.91464
Root Mean Square Error        0.209478
Mean of Response              0.991176
Observations (or Sum Wgts)    34

Analysis of Variance
Source     DF   Sum of Squares   Mean Square    F Ratio   Prob > F
Model       3        15.647918       5.21597   118.8658     <.0001
Error      30         1.316435       0.04388
C. Total   33        16.964353

Parameter Estimates
Term         Estimate   Std Error   t Ratio   Prob>|t|   Lower 95%   Upper 95%
Intercept   0.1630568    0.089807      1.82     0.0794   -0.020354   0.3464675
EPS90       0.2908916    0.016853     17.26     <.0001   0.2564729   0.3253103
LDPTC       -0.013677    0.002142     -6.38     <.0001   -0.018052   -0.009302
PFDIV       0.0363115    0.006933      5.24     <.0001    0.022152    0.050471

Effect Tests
Source   Nparm   DF   Sum of Squares    F Ratio   Prob > F
EPS90        1    1        13.073125   297.9210     <.0001
LDPTC        1    1         1.788888    40.7666     <.0001
PFDIV        1    1         1.203646    27.4297     <.0001

Residual by Predicted Plot
[Plot of EPS Residual versus EPS Predicted; plot not reproduced]

Press
1.7181931456
[printout page 4]
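The Summary of Fit entries on pages 3 and 4 can be recomputed from the Analysis of Variance tables, which makes the two fits easier to compare side by side. The sketch below is an illustrative aside (not part of the original printout): it recomputes RSquare, RSquare Adj, and Root Mean Square Error for both models and lists the printed PRESS values next to them. The function name `adjusted_r_square` is hypothetical.

```python
n = 34                    # observations
ss_total = 16.964353      # "C. Total" sum of squares, common to both fits

# (label, number of predictors, error sum of squares, printed PRESS)
models = [
    ("all 7 predictors (page 3)", 7, 1.194534, 1.9350050355),
    ("EPS90, LDPTC, PFDIV (page 4)", 3, 1.316435, 1.7181931456),
]


def adjusted_r_square(sse, p):
    """1 - MSE / (total mean square) for a model with p predictors."""
    return 1 - (sse / (n - p - 1)) / (ss_total / (n - 1))


results = {}
for label, p, sse, press in models:
    r_square = 1 - sse / ss_total
    adj = adjusted_r_square(sse, p)
    rmse = (sse / (n - p - 1)) ** 0.5
    results[label] = (r_square, adj, rmse, press)
    print(f"{label}: RSquare={r_square:.6f}  RSquare Adj={adj:.6f}  "
          f"RMSE={rmse:.6f}  PRESS={press:.4f}")
```

In this output the smaller model has the slightly larger RSquare Adj and the smaller RMSE and PRESS, even though its RSquare is (necessarily) a little lower.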
[Plots of Studentized Resid EPS against each of Pred Formula EPS, EPS90, TAPCE, LDPTC, NTSLS, PFDIV, CSO, and TAPRE; plots not reproduced]
[printout page 5]
Row    EPS  EPS90    TAPCE   LDPTC    NTSLS   PFDIV     CSO   TAPRE  Pred Formula EPS  PredSE EPS  Studentized Resid EPS
  1   0.33   1.29   658.38  29.778   1338.4       0   40600    2726  0.1310351         0.06122086  0.9931724
  2   1.44   4.74  1262.67  11.965  19020.5       0  110113  1520.2  1.3782384         0.04801077  0.30289799
  3    1.1   4.69   817.01  40.339     4481       2  111275   902.8  1.04824739        0.05557923  0.25623832
  4   1.88   6.76   587.04  44.156    16214   9.688  212143   697.7  1.87735098        0.08254454  0.01375902
  5   2.46   4.88   716.34   0.734   9944.4   9.787   61793   828.3  1.92794965        0.09172789  2.82513651
  6   0.28   0.91   192.57   6.184    449.2       0   35543   264.3  0.34318998        0.0720207   -0.3212368
  7  -0.06    0.5  -697.29  75.129    443.1  21.215   38715  -264.4  0.05131654        0.13328286  -0.6888097
  8    1.1   3.56    836.3  12.565   2929.2   0.112   47502   919.5  1.03084705        0.04503791  0.33802485
  9   1.75   4.28  1232.05  13.547   8489.6  10.625   41496    1338  1.60860113        0.07523226  0.72325808
 10    1.1   4.15   222.17  22.765     2723       0   73521   235.4  1.05890143        0.04258077  0.20037821
 11   0.03   2.05  2479.29  63.804   3683.8       0   74550  5791.7  -0.1132588        0.10393915  0.78768535
 12   0.69   3.29   1138.5  68.763   6703.1  15.759   76352  1473.7  0.75185589        0.1022246   -0.3383018
 13    0.9   2.28  1306.39  17.733  11313.4      17  102170  1597.4  1.20105214        0.10933127  -1.684833
 14   0.26   1.53   938.04  28.028    870.1       0  105600  1274.9  0.22478373        0.05765498  0.17486788
 15   0.43   1.44  1015.83  10.026   2678.4       0   81451  1335.9  0.44481573        0.06236511  -0.0740863
 16   0.95   3.61    716.2  26.168   2626.4       0   65109   793.2  0.85527735        0.04348473  0.46225283
 17   1.39   3.39  1188.47  19.922   2577.3  10.432   44783  1044.8  1.255509          0.06524952  0.6756407
 18   1.28   4.41   329.26       0   1162.6       0   34523   340.1  1.44588875        0.06097201  -0.8277528
 19   1.05   3.06    580.2  35.484    334.1       0    7847   666.8  0.56787261        0.05346186  2.38038959
 20   0.94    4.9  1650.65  27.136   1331.7       0   12264  2306.9  1.21728823        0.04852117  -1.3607138
 21   0.75   3.85  1007.16   13.39   2751.3       0   46481  1043.6  1.09985525        0.04438589  -1.7089289
 22   1.16   4.44   587.18   34.54   1796.1   6.898   51315   738.5  1.23269082        0.04641522  -0.3558541
 23   1.65   5.01   769.27   5.768   2170.3       0   33198  1699.8  1.54153512        0.05554727  0.53700958
 24    0.3   2.28   855.98  23.225    456.4       0   10786  1138.8  0.50864276        0.04871886  -1.0240926
 25   1.16   4.27   492.98  27.256   1235.6   0.463   15358   418.7  1.04919752        0.04354933  0.54075967
 26   0.23   0.09    875.6   5.678    702.6   0.366   10105   938.6  0.12486942        0.08243071  0.54591115
 27   1.52   5.79   425.57  22.111   4284.1       0   81912   543.2  1.54490836        0.05426177  -0.1231085
 28   0.35   2.53   675.87  27.671   5694.7     3.8   54388   784.2  0.65854172        0.04028192  -1.5009168
 29    0.7   2.15   432.75       0       86       0    4885   967.4  0.78847375        0.06756913  -0.4462025
 30   3.49  12.45   368.65  17.929   1931.6       0   14851   243.2  3.53944333        0.15206464  -0.3431784
 31   0.93   3.81   232.88  18.903    562.4       0   20833   248.3  1.01281862        0.04186013  -0.4034948
 32   0.96   2.77  1128.74  37.987    783.9   7.469    6354  3405.9  0.72049132        0.04759604  1.17406501
 33   0.38   3.58   659.61   53.69     1357       0   23100  1040.7  0.4701339         0.08234599  -0.4679499
 34   0.82   3.94   534.14  15.174   3053.2       0   62722   590.1  1.10163583        0.04329175  -1.3741276
[printout page 6]
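The three added columns on page 6 can be reproduced from the page 4 fit. The sketch below is an illustrative aside (not part of the original printout) that does this for row 1, assuming that "Studentized Resid EPS" is the residual divided by sqrt(s^2 - PredSE^2); that assumption agrees with the tabled value to about four decimals. It also forms 95% prediction limits at row 1's predictor values, using a t multiplier backed out of the printed page 4 confidence limits for the EPS90 coefficient rather than a t table.

```python
import math

# Estimates and Root Mean Square Error copied from page 4 of the printout.
b0, b_eps90, b_ldptc, b_pfdiv = 0.1630568, 0.2908916, -0.013677, 0.0363115
rmse = 0.209478

# Row 1 of the data table (page 6).
eps, eps90, ldptc, pfdiv = 0.33, 1.29, 29.778, 0.0
pred_se = 0.06122086                     # "PredSE EPS" for row 1

# "Pred Formula EPS": plug the predictor values into the fitted equation.
fitted = b0 + b_eps90 * eps90 + b_ldptc * ldptc + b_pfdiv * pfdiv
residual = eps - fitted

# Assumed form of the studentized residual: the standard error of a residual
# is sqrt(s^2 - PredSE^2), where s is the root mean square error.
studentized = residual / math.sqrt(rmse ** 2 - pred_se ** 2)

# t multiplier on 30 d.f. implied by the printed 95% limits for EPS90.
t_mult = (0.3253103 - 0.2564729) / (2 * 0.016853)

# Prediction limits for a new company with the same predictor values:
# fitted +/- t * sqrt(s^2 + PredSE^2).
half_width = t_mult * math.sqrt(rmse ** 2 + pred_se ** 2)
limits = (fitted - half_width, fitted + half_width)

print(f"fitted = {fitted:.6f}, studentized residual = {studentized:.5f}")
print(f"95% prediction limits: {limits[0]:.4f} to {limits[1]:.4f}")
```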