5/3/99 252x9943 ECO252 QBA2 Name

FINAL EXAM
May 5, 1999
Name
Hour of Class Registered (Circle)
MWF 10 11 TR 12:30 2:00
I. (16 points) Do all the following.
1. Hand in your fourth regression problem (2 points) and answer the following questions.
a. For the regression of the number of hours of work against the number of machines, what
coefficients are significant at the 1% level? Why? What about the 5% level? (2)
b. Would you say that the regression of number of hours of work against the number of machines and
months of experience is more successful than the regression against machines alone? Why? (3)
c. What was the surprise that occurred when you did the stepwise regression? (2)
2. The following pages show the regression of the variable 'mins', the winning time in minutes in a
triathlon, against some of the following independent variables:
'female'  A dummy variable that is 1 if the contestant is female.
'swim'    Number of miles of swimming
'bike'    Number of miles of biking
'run'     Number of miles of running
c6        ‘swim’ multiplied by ‘female’
c7        ‘bike’ multiplied by ‘female’
c8        ‘run’ multiplied by ‘female’
c9        ‘swim’ squared
c10       ‘bike’ squared
c11       ‘run’ squared
a. In the regression of ‘mins’ against ‘female’, ‘swim’, ‘bike’ and ‘run’, which coefficients have
signs that look wrong? Why? Which coefficients are not significant at the 99% confidence level?
(3)
b. Look at the regression of ‘mins’ against ‘run’, c8 and c11 and the regression of ‘mins’ against
‘run’ and c8. Use α = .10. Does either seem to be an improvement over the regression of ‘mins’
against ‘run’ alone? Why? (2)
c. Explain the meaning of the F test in the regression of ‘mins’ against ‘female’, ‘swim’, ‘bike’ and
‘run’. What is being tested and what are the conclusions? (2)
d. The printout concludes with a printout of the data and of a correlation matrix. What does this
suggest about the problems that are occurring with these regressions? (2)
Worksheet size: 100000 cells
MTB > RETR 'C:\MINITAB\LR13-49.MTW'.
Retrieving worksheet from file: C:\MINITAB\LR13-49.MTW
Worksheet was saved on 5/ 3/1999
MTB > regress c1 on 4 c2 c3 c4 c5
Regression Analysis
The regression equation is
mins = - 24.6 + 35.5 female - 25.0 swim + 7.13 bike - 6.37 run
Predictor       Coef     Stdev   t-ratio       p
Constant      -24.57     20.13     -1.22   0.241
female         35.47     14.77      2.40   0.030
swim          -25.01     45.75     -0.55   0.593
bike           7.130     1.331      5.36   0.000
run           -6.372     5.384     -1.18   0.255

s = 33.02    R-sq = 98.0%    R-sq(adj) = 97.4%

Analysis of Variance
SOURCE        DF        SS        MS        F       p
Regression     4    786104    196526   180.29   0.000
Error         15     16351      1090
Total         19    802455

SOURCE        DF    SEQ SS
female         1      6291
swim           1    726098
bike           1     52189
run            1      1526

Unusual Observations
Obs.  female    mins     Fit  Stdev.Fit  Residual  St.Resid
  1     0.00  489.25  547.00      17.48    -57.75    -2.06R
 18     1.00  660.48  582.47      17.48     78.01     2.79R
R denotes an obs. with a large st. resid.
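The summary statistics in this printout are tied together by simple arithmetic on the Analysis of Variance table. As a minimal Python sketch (the variable names are illustrative and are not part of the Minitab session), the figures for the four-variable regression can be cross-checked like this:

```python
# Figures copied from the Analysis of Variance table of the regression of
# mins on female, swim, bike and run.
ss_regression, ss_error, ss_total = 786104, 16351, 802455
df_regression, df_error = 4, 15
df_total = df_regression + df_error

ms_regression = ss_regression / df_regression  # mean square for regression
ms_error = ss_error / df_error                 # mean square for error
f_ratio = ms_regression / ms_error             # F statistic
r_sq = ss_regression / ss_total                # R-sq as a fraction
r_sq_adj = 1 - ms_error / (ss_total / df_total)  # R-sq adjusted for degrees of freedom
s = ms_error ** 0.5                            # standard error of the estimate

print(round(f_ratio, 2), round(100 * r_sq, 1), round(100 * r_sq_adj, 1), round(s, 2))
```

The printed values should agree with F, R-sq, R-sq(adj) and s in the printout up to rounding.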
MTB > regress c1 on 1 c5
Regression Analysis
The regression equation is
mins = - 19.2 + 23.6 run
Predictor       Coef     Stdev   t-ratio       p
Constant      -19.25     23.19     -0.83   0.417
run           23.615     1.582     14.92   0.000

s = 57.74    R-sq = 92.5%    R-sq(adj) = 92.1%

Analysis of Variance
SOURCE        DF        SS        MS        F       p
Regression     1    742445    742445   222.69   0.000
Error         18     60011      3334
Total         19    802455

Unusual Observations
Obs.     run    mins     Fit  Stdev.Fit  Residual  St.Resid
  1     26.2   489.2   599.5       25.7    -110.2    -2.13R
 12     18.6   589.1   420.0       16.4     169.1     3.05R
R denotes an obs. with a large st. resid.
MTB > regress c1 on 2 c5 c8
Regression Analysis
The regression equation is
mins = - 19.2 + 22.1 run + 3.02 C8
Predictor       Coef     Stdev   t-ratio       p
Constant      -19.25     21.83     -0.88   0.390
run           22.106     1.705     12.96   0.000
C8             3.017     1.659      1.82   0.087

s = 54.36    R-sq = 93.7%    R-sq(adj) = 93.0%

Analysis of Variance
SOURCE        DF        SS        MS        F       p
Regression     2    752216    376108   127.27   0.000
Error         17     50240      2955
Total         19    802455

SOURCE        DF    SEQ SS
run            1    742445
C8             1      9771

Unusual Observations
Obs.     run    mins     Fit  Stdev.Fit  Residual  St.Resid
  2     18.6   505.1   391.9       21.9     113.2     2.27R
 11     26.2   540.9   639.0       32.5     -98.1    -2.25R
 12     18.6   589.1   448.0       21.9     141.0     2.83R
R denotes an obs. with a large st. resid.
MTB > regress c1 on 2 c5 c11
Regression Analysis
The regression equation is
mins = - 102 + 39.6 run - 0.519 C11
Predictor       Coef     Stdev   t-ratio       p
Constant     -101.71     49.18     -2.07   0.054
run           39.550     8.654      4.57   0.000
C11          -0.5192    0.2778     -1.87   0.079

s = 54.11    R-sq = 93.8%    R-sq(adj) = 93.1%

Analysis of Variance
SOURCE        DF        SS        MS        F       p
Regression     2    752675    376337   128.52   0.000
Error         17     49780      2928
Total         19    802455

SOURCE        DF    SEQ SS
run            1    742445
C11            1     10230

Unusual Observations
Obs.     run    mins     Fit  Stdev.Fit  Residual  St.Resid
 12     18.6   589.1   454.3       24.0     134.8     2.78R
R denotes an obs. with a large st. resid.
MTB > regress c1 on 3 c5 c8 c11
Regression Analysis
The regression equation is
mins = - 102 + 38.0 run + 3.02 C8 - 0.519 C11
Predictor       Coef     Stdev   t-ratio       p
Constant     -101.71     45.45     -2.24   0.040
run           38.042     8.033      4.74   0.000
C8             3.017     1.526      1.98   0.066
C11          -0.5192    0.2567     -2.02   0.060

s = 50.01    R-sq = 95.0%    R-sq(adj) = 94.1%

Analysis of Variance
SOURCE        DF        SS        MS        F       p
Regression     3    762446    254149   101.64   0.000
Error         16     40009      2501
Total         19    802455

SOURCE        DF    SEQ SS
run            1    742445
C8             1      9771
C11            1     10230

Unusual Observations
Obs.     run    mins     Fit  Stdev.Fit  Residual  St.Resid
 12     18.6   589.1   482.3       26.3     106.7     2.51R
R denotes an obs. with a large st. resid.
MTB > print c1-c11
Data Display
Row     mins  female  swim   bike   run    C6     C7    C8      C9
  1  489.250       0  2.40  112.0  26.2  0.00    0.0   0.0  5.7600
  2  505.150       0  2.00  100.0  18.6  0.00    0.0   0.0  4.0000
  3  245.500       0  1.20   55.3  13.1  0.00    0.0   0.0  1.4400
  4  204.400       0  1.50   48.0  10.0  0.00    0.0   0.0  2.2500
  5  114.533       0  0.93   24.8   6.2  0.00    0.0   0.0  0.8649
  6  108.267       0  0.93   24.8   6.2  0.00    0.0   0.0  0.8649
  7   79.417       0  0.50   18.0   5.0  0.00    0.0   0.0  0.2500
  8  566.500       0  2.40  112.0  26.2  0.00    0.0   0.0  5.7600
  9   74.983       0  0.50   20.0   4.0  0.00    0.0   0.0  0.2500
 10  116.117       0  0.60   25.0   6.2  0.00    0.0   0.0  0.3600
 11  540.933       1  2.40  112.0  26.2  2.40  112.0  26.2  5.7600
 12  589.067       1  2.00  100.0  18.6  2.00  100.0  18.6  4.0000
 13  280.100       1  1.20   55.3  13.1  1.20   55.3  13.1  1.4400
 14  235.033       1  1.50   48.0  10.0  1.50   48.0  10.0  2.2500
 15  127.167       1  0.93   24.8   6.2  0.93   24.8   6.2  0.8649
 16  120.750       1  0.93   24.8   6.2  0.93   24.8   6.2  0.8649
 17   90.317       1  0.50   18.0   5.0  0.50   18.0   5.0  0.2500
 18  660.483       1  2.40  112.0  26.2  2.40  112.0  26.2  5.7600
 19   83.150       1  0.50   20.0   4.0  0.50   20.0   4.0  0.2500
 20  131.817       1  0.60   25.0   6.2  0.60   25.0   6.2  0.3600

Row      C10     C11
  1  12544.0  686.44
  2  10000.0  345.96
  3   3058.1  171.61
  4   2304.0  100.00
  5    615.0   38.44
  6    615.0   38.44
  7    324.0   25.00
  8  12544.0  686.44
  9    400.0   16.00
 10    625.0   38.44
 11  12544.0  686.44
 12  10000.0  345.96
 13   3058.1  171.61
 14   2304.0  100.00
 15    615.0   38.44
 16    615.0   38.44
 17    324.0   25.00
 18  12544.0  686.44
 19    400.0   16.00
 20    625.0   38.44
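Columns C6 through C11 in this display follow from the variable definitions given at the start of question 2: products with 'female' and squares of the distances. A minimal Python sketch, assuming the base columns as printed (the list names are illustrative, not Minitab commands):

```python
# Base columns as printed; the ten course distances repeat for the
# male rows (female = 0) and the female rows (female = 1).
swim = [2.40, 2.00, 1.20, 1.50, 0.93, 0.93, 0.50, 2.40, 0.50, 0.60] * 2
bike = [112.0, 100.0, 55.3, 48.0, 24.8, 24.8, 18.0, 112.0, 20.0, 25.0] * 2
run = [26.2, 18.6, 13.1, 10.0, 6.2, 6.2, 5.0, 26.2, 4.0, 6.2] * 2
female = [0] * 10 + [1] * 10

# Interaction columns: each distance multiplied by the female dummy.
c6 = [s * f for s, f in zip(swim, female)]
c7 = [b * f for b, f in zip(bike, female)]
c8 = [r * f for r, f in zip(run, female)]

# Squared columns.
c9 = [s ** 2 for s in swim]
c10 = [b ** 2 for b in bike]
c11 = [r ** 2 for r in run]

for row in range(20):
    print(row + 1, c6[row], c7[row], c8[row],
          round(c9[row], 4), round(c10[row], 1), round(c11[row], 2))
```

Because every derived column is built from the same three distances, they move together closely, which is worth keeping in mind when reading the correlation matrix that follows.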
MTB > Correlation c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11.
Correlations (Pearson)
           mins   female     swim     bike      run       C6       C7       C8
female    0.089
swim      0.951    0.000
bike      0.984    0.000    0.973
run       0.962    0.000    0.965    0.985
C6        0.510    0.792    0.432    0.420    0.417
C7        0.584    0.716    0.480    0.494    0.487    0.982
C8        0.564    0.726    0.470    0.479    0.487    0.980    0.993
C9        0.956    0.000    0.985    0.979    0.982    0.426    0.483    0.478
C10       0.975    0.000    0.954    0.989    0.983    0.412    0.488    0.478
C11       0.928    0.000    0.932    0.954    0.985    0.403    0.471    0.479

             C9      C10
C10       0.982
C11       0.974    0.975
II. Do at least 4 of the following 7 problems (at least 15 points each), or do sections adding to at least 60 points. Anything extra you do helps, and grades wrap around. Show your work! State H0 and H1 where
applicable. Use a significance level of 5% unless noted otherwise.
1. a. Premiums on a group of 11 closed-end mutual funds were as follows. (These are in per cent, but that
shouldn’t affect your analysis.) Test the hypothesis that the mean is 3 per cent using (i) either a test ratio or a
critical value and (ii) a confidence interval. (6)
+4.7 -0.7 +5.3 +9.2 -0.3 -0.3 +5.0 +0.4 -1.9 +0.5 -3.1
b. Test that the following data (i) have a Poisson distribution (6) and (ii) have a Poisson distribution with a
mean of 4.5 (6). If you do both parts, do only one with a chi-square method.
x 0 1 2 3 4 5 6 7
O 23 19 42 60 89 79 48 40
2. Eight technicians are asked to take a test and are then rated by their supervisors. Scores and ratings follow,
with the addition of productivity figures. (Use α = .01)
Technician    Test Score   Performance ranking   Productivity
                  x1               x2                 x3
Armstrong         83                3                180
Brubecker         68                7                170
Cooper            60                6                164
Dollfuss          81                4                182
Ezekiel           74                5                174
Fassbinder        95                1                191
Goodwrench        90                2                195
Hingle            66                8                160

Σx1 = 617, Σx1² = 48631, Σx2 = 36, Σx2² = 204, Σx3 = 1416, Σx3² = 251702.
a. Compute the correlation between x1 and x2 and test it for significance. (5)
b. Compute the rank correlation between x1 and x2 and test it for significance. Which of these two
measures (rank or conventional correlation) is most appropriate here? Why? (5)
c. Compute Kendall’s W for these data and test it for significance. (6)
d. Test the hypothesis that the correlation between x1 and x2 is .8. (5)
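The sums and sums of squares quoted with the technician data can be reproduced from the eight rows. A short Python sketch, with the scores typed in from the table and a hypothetical helper named `sums`:

```python
# Columns typed in from the technician table (Armstrong through Hingle).
x1 = [83, 68, 60, 81, 74, 95, 90, 66]          # test score
x2 = [3, 7, 6, 4, 5, 1, 2, 8]                  # performance ranking
x3 = [180, 170, 164, 182, 174, 191, 195, 160]  # productivity


def sums(values):
    """Return the sum and the sum of squares of a column."""
    return sum(values), sum(v * v for v in values)


for name, column in (("x1", x1), ("x2", x2), ("x3", x3)):
    total, total_sq = sums(column)
    print(name, total, total_sq)
```

Each printed pair should match the corresponding Σx and Σx² given with the data.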
3. Samples of demand for four types of sailboat sold by your firm are as follows:
                        West Coast   East Coast   Total
Pirates Revenge             74          146        220
Jolly Roger                 54          110        164
Bluebeard’s Treasure        46          100        146
Ahab’s Quest                50          120        170
Total                      224          476        700
Do all tests at the 95% confidence level.
a. Management had initially assumed that the proportion of total sales of “Pirates Revenge” would
be at most 30% of sales. Test this. (3)
b. Test the hypothesis that sales of the “Pirates Revenge” are the same proportion of sales on both
the East and West Coast (4)
c. Test the hypothesis that sales on the West Coast follow a uniform distribution (i.e. that each
model is the same proportion of West Coast sales) (5)
d. Test the hypothesis that the proportions of each boat sold are the same on both coasts. (5)
4. Data on passengers (in thousands), advertising (in $thousands) and national income (in $trillions)
appears below. (Use α = .05)
a) Compute Σx3y. (2) (This must be done correctly to get full credit for b.)
b) Compute a simple regression of passengers against National income. (6)
c) Compute R² (4)
d) Compute s_e (3)
e) Compute s_b1 (the std deviation of the coefficient of National Income) and do a confidence interval for
β1. (3)
f) Do a confidence interval for Passengers, when income is $4.10 billion. (3) At what income will
this interval be smallest? (1)

Row   pass   adv    inc   season
        y     x1     x2     x3
  1    15     10   2.40      1
  2    17     12   2.72      1
  3    13      8   2.08      1
  4    23     17   3.68      1
  5    16     10   2.56      1
  6    21     15   3.36      0
  7    14     10   2.24      0
  8    20     14   3.20      0
  9    26     19   3.84      0
 10    18     10   2.72      0
 11    17     11   2.07      0
 12    18     13   2.33      0
 13    23     16   2.98      0
 14    15     10   1.94      1
 15    16     12   2.17      1

Σy = 272, Σy² = 5128, Σx1 = 187, Σx1² = 2469, Σx2 = 40.29, Σx2² = 113.339,
Σx1y = 3549, Σx2y = 759.090, Σx1x2 = 525.38 and n = 15. You do not need all of these.
5. Data from problem 4 is repeated below. (Use α = .01)
Σy = 272, Σy² = 5128, Σx1 = 187, Σx1² = 2469, Σx2 = 40.29, Σx2² = 113.339,
Σx1y = 3549, Σx2y = 759.090, Σx1x2 = 525.38 and n = 15.
a. Do a multiple regression of passengers against advertising and National Income. (12)
b. Compute R² and R² adjusted for degrees of freedom for both this and the previous problem. Compare
the values of R² adjusted between this and the previous problem. Use an F test to compare R² here with
the R² from the previous problem. (5)
c. Compute the regression sum of squares and use it in an F test to test the usefulness of this regression. (5)
d. Use your regression to predict the number of passengers when we spend $13 (thousand) on advertising
and National Income is $3.5 (trillion).(2)
e. The regression on the previous page was run with the command
MTB > regress C1 on 1 C3;
SUBC> dw.
As a result, the last line of the regression read
Durbin-Watson statistic = 0.71
What did I test for and what was the meaning of the last line of the regression? Assume a confidence level
and use the tables in the text.
6. (Watch it!) Three methods are used to train candidates for the FAA pilots exam. Scores for trainees are
shown below classified by method.
a. Assume that the data is normal and compare the means for the first two methods (Assume unequal variances) (5)
b. Do the same for all three methods (You may assume equal variances now) (7)
c. Test column 1 to see if it has the normal distribution (5)

              Method
  Video       Audio
 Cassette    Cassette    Classroom
    72          73           68
    86          75           83
    80          60           50
    91          52           91
    46          84           84
    68          76           77
    75                       94
                             81
                             92
                             90

Note: For the first column:
x1              72      86      80      91      46      68      75
(x1 - x̄)/s   -0.14    0.82    0.41    1.16   -1.91   -0.41    0.07

Note: Σx1 = 518, Σx1² = 39626, Σx3 = 810, Σx3² = 67240.
4/29/99 252x9942
7. (Watch it!) Three methods are used to train candidates for the FAA pilots exam. Scores for trainees are
shown below classified by method.
a. Using a sign test, check x3 to see if it has a median of 85. (4)
b. Repeat the test on x3 using a more powerful method. (5)

              Method
  Video       Audio
 Cassette    Cassette    Classroom
    x1          x2           x3
    72          73           68
    86          75           83
    80          60           50
    91          52           91
    46          84           84
    68          76           77
    75                       94
                             81
                             92
                             90
c. Apply the Runs Test as follows:
Write down the numbers in x1 and x3 together in order.
Underneath the numbers write down A if the number comes from
x1 and C if it comes from x3 . You will have a sequence like AACACC
….. . In case of a tie remove both tying numbers from your test.
Do a runs test on the resulting sequence to see if the A’s and C’s
appear randomly.
Congratulations! You have just done a Wald-Wolfowitz Test for the
equality of means in two (nonnormal) samples. If the sequence is
random, the means are equal. (6)
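The bookkeeping described in part c (merge, label, drop ties, count runs) can be sketched in Python. This is only an illustration of the sequence-building step under the stated tie rule, not the significance test itself; it assumes no value repeats within a single sample, which holds for these data:

```python
# Scores from problem 7; A marks x1 (video cassette), C marks x3 (classroom).
x1 = [72, 86, 80, 91, 46, 68, 75]
x3 = [68, 83, 50, 91, 84, 77, 94, 81, 92, 90]

# Values present in both samples are ties; both copies are removed.
ties = set(x1) & set(x3)
labelled = [(v, "A") for v in x1 if v not in ties]
labelled += [(v, "C") for v in x3 if v not in ties]

# Order the pooled scores and keep only the labels.
sequence = "".join(label for _, label in sorted(labelled))

# A new run starts wherever a label differs from the one before it.
runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
n_a, n_c = sequence.count("A"), sequence.count("C")
print(sequence, runs, n_a, n_c)
```

The number of runs and the two label counts are the inputs to the runs test tables or the normal approximation used for the test.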