4/19/02 252x0232 c ECO252 QBA2 Name Correct

ECO252 QBA2
THIRD HOUR EXAM
April 18, 2002
Name Correct
Hour of Class Registered (Circle)
MWF TR 10 12 12:30 2:00
I. (10+ points) Do all the following:
1. Hand in your computer printouts for problems 2 and 3. (5 points; 3-point penalty for not handing them in.)
Remember that the ANOVA printout must be completed, using a 5% significance level, for full credit. I
should be able to tell what is tested and what the conclusions are.
2. a. In particular, is the interaction between car and driver significant? Which numbers made you think
that? (2)
b. Create two confidence intervals for the difference between the means for drivers 2 and 3, one that is
valid alone, and one that is valid simultaneously with other similar intervals. Do these intervals show a
significant difference between these two means? Why? (4)
c. In your income and education regression,
(i) Explain which coefficients are significant and why. (2)
(ii) What income would you predict for someone with 3 years of education? (1)
(iii) Make a confidence interval for the income of someone with 3 years of education using some
of the information generated by Minitab below. (2)
Descriptive Statistics

Variable        N      Mean    Median    TrMean     StDev    SEMean
Educ           32    12.000    12.000    12.071     4.363     0.771

Variable      Min       Max        Q1        Q3
Educ        4.000    20.000     8.000    16.000

Column Sum of Squares

Sum of squares (uncorrected) of Educ = 5198.0
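As a rough aid for part c(iii), the Minitab figures for Educ can be turned into the corrected sum of squares that a regression confidence interval needs. The sketch below is only an illustration in Python; the variable names are assumptions, and the inputs are copied from the printout.

```python
import math

# Values copied from the Minitab printout for Educ.
n = 32
mean_educ = 12.000
uncorrected_ss = 5198.0  # sum of x^2

# Corrected sum of squares: sum (x - xbar)^2 = sum x^2 - n * xbar^2
ss_x = uncorrected_ss - n * mean_educ ** 2

# Cross-check against the printed standard deviation and standard error.
stdev = math.sqrt(ss_x / (n - 1))
se_mean = stdev / math.sqrt(n)

print(f"SSx = {ss_x:.1f}, StDev = {stdev:.3f}, SEMean = {se_mean:.3f}")
```

The recomputed StDev and SEMean should agree with the printout, which confirms that the uncorrected sum of squares was read correctly.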
4/12/02 252x0232
II. Do at least 4 of the following 5 problems (at least 10 points each), or do sections adding to at least 40 points. Anything extra you do helps, and grades wrap around. Show your work! State $H_0$ and $H_1$ where
applicable. Never say 'yes' or 'no' without a statistical test.
1. On the following pages there are printouts from two computer problems.
a. The One-way ANOVA Problem ( Albright, Winston, Zappe - abbreviated): An automobile parts producer
has instituted an employee empowerment program in five plants. Random samples of employees in
each plant are asked to rate the success of the program on a 1 to 10 scale. 10 being the highest
rating. They want to know if the program is being implemented with equal success at each plant
and are thus looking to see if there is a significant difference between mean ratings at each plant.
They are assuming that the results are distributed according to Normal distributions with similar
variances.
(i) Indicate what hypothesis was tested, what the p-value was and whether, using the p-value, you
would reject the null if (α) the significance level was 5% and (β) the significance level was 1%.
Explain why. Does this mean that the success was equal in all plants? (3)
(ii) Do a 'normal' and a Scheffe confidence interval ($\alpha = .05$) for the difference between the
means in the two plants that were least successful. Do these intervals indicate a difference in the
success of the program between these two plants? Why? (4.5).
(iii) The printout gives 95% confidence intervals for the means for each plant. Find the numbers
for the confidence interval for 'Midwest.' Why is this interval smaller than the others? (2.5)
(iv) I would question whether ANOVA was appropriate for this problem because there is no
evidence that the underlying populations are Normally distributed. What method would I prefer for
this problem? (1)
b. The Regression Problem: This relates the number of shares in thousands to the age of board members of
a corporation.
(i) Looking at significance tests and the value of R-squared, how successful is this regression?
Why? Why shouldn't this surprise you? (3)
(ii) Note that c1 contains 'shares' and that c4 contains predicted values of 'shares.' Add a regression
line to the graph. (1)
(iii) What equation relates the number of shares owned to the age of the board member? How many
shares does it say that we should expect an 83-year-old board member to own? Would you take this
seriously? Why? (2)
One-way ANOVA problem
MTB > RETR 'C:\MINITAB\2X0232-1.MTW'.
Retrieving worksheet from file: C:\MINITAB\2X0232-1.MTW
Worksheet was saved on 4/ 9/2002
MTB > print c1-c5
Data Display

Row  south  midwest  n-east  s-west  west
  1      7        7       7       6     6
  2      1        6       5       4     6
  3      8       10       5       7     6
  4      7        3       5      10     6
  5      2        9       4       7     3
  6      9       10       3       6     4
  7      3        8       4       6     8
  8      8        4       5       7     6
  9      5        3       5       4     2
 10      7        2       3       3     4
 11      4        7       3       7     5
 12               7       3       8     6
 13               5       5       9     4
 14              10       5      10     7
 15              10               4     4
 16               6              10     3
 17               3               4     5
 18               5               6     4
 19               2                     7
 20               6                     6
 21               4                     4
 22               5
 23               2
 24               7
 25               8
 26               7
MTB > AOVOneway c1 c2 c3 c4 c5.
One-Way Analysis of Variance

Analysis of Variance
Source     DF        SS        MS        F        p
Factor      4     46.24     11.56     2.50    0.049
Error      85    393.55      4.63
Total      89    439.79
                                    Individual 95% CIs For Mean
                                    Based on Pooled StDev
Level        N      Mean     StDev  ---+---------+---------+---------+---
south       11     5.545     2.697          (----------*----------)
midwest     26     6.000     2.623                  (------*------)
n-east      14     4.429     1.158  (---------*--------)
s-west      18     6.556     2.229                     (--------*-------)
west        21     5.048     1.532         (-------*-------)
                                    ---+---------+---------+---------+---
Pooled StDev =    2.152
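The entries of the ANOVA table above are tied together by simple arithmetic, which gives a quick way to check a completed printout. The following is a minimal sketch in Python, assuming only the SS and DF columns are taken as given; the variable names are illustrative, not part of the Minitab output.

```python
import math

# Sums of squares and degrees of freedom copied from the printout.
ss_factor, df_factor = 46.24, 4
ss_error, df_error = 393.55, 85

# Mean squares are SS divided by DF; F is their ratio.
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_ratio = ms_factor / ms_error

# The pooled standard deviation is the square root of the error mean square.
pooled_stdev = math.sqrt(ms_error)

# Total SS is the sum of the factor and error SS.
ss_total = ss_factor + ss_error

print(f"MS = {ms_factor:.2f}, {ms_error:.2f}; F = {f_ratio:.2f}; "
      f"pooled StDev = {pooled_stdev:.3f}; SS total = {ss_total:.2f}")
```

The p-value itself requires an F distribution with 4 and 85 degrees of freedom, which the standard library does not provide, so it is left to the printout or an F table.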
Regression Problem
Worksheet size: 100000 cells
MTB > RETR 'C:\MINITAB\2X0232-5.MTW'.
Retrieving worksheet from file: C:\MINITAB\2X0232-5.MTW
Worksheet was saved on 4/12/2002
MTB > echo
MTB > Execute 'C:\MINITAB\252SOLS3.MTB' 1.
Executing from file: C:\MINITAB\252SOLS3.MTB
MTB > #252sols3
MTB > print c1 c2
Data Display

Row  shares  age
  1     7.9   53
  2    66.4   60
  3    29.7   69
  4    60.5   49
  5    10.4   67
  6    28.7   68
  7    86.9   46
  8   121.1   62
  9    35.3   63
 10     2.8   55
 11    74.4   57
 12    13.1   71
 13     9.1   66
 14    19.1   70
 15    18.8   66
 16     3.1   57
 17    96.5   54
 18    47.0   64
 19    31.1   56
MTB > plot c1*c2 (plot omitted)
MTB > regress c1 on 1 c2 c3 c4
Regression Analysis

The regression equation is
shares = 153 - 1.86 age

Predictor       Coef     Stdev   t-ratio        p
Constant      152.95     64.82      2.36    0.031
age           -1.860     1.061     -1.75    0.098

s = 33.01     R-sq = 15.3%     R-sq(adj) = 10.3%

Analysis of Variance

SOURCE        DF        SS        MS        F        p
Regression     1      3348      3348     3.07    0.098
Error         17     18522      1090
Total         18     21870

Unusual Observations
Obs.     age    shares       Fit  Stdev.Fit  Residual  St.Resid
  8     62.0    121.10     37.65       7.70     83.45     2.60R

R denotes an obs. with a large st. resid.

MTB > plot c4*c2 (plot omitted)
MTB > plot c4*c2 c1*c2;
SUBC> symbol;
SUBC> type 3 1;
SUBC> color 8 9;
SUBC> overlay.
MTB > end

(Plot: C4 and shares against age. Vertical axis C4, ticks at 0, 50, 100; horizontal axis age, ticks at 50, 60, 70.)
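The pieces of the regression printout above can be cross-checked against one another. This is a small sketch in Python under the assumption that the rounded coefficients and sums of squares are copied from the printout; `predict_shares` is a hypothetical helper name, and because the coefficients are rounded its fitted values differ slightly from Minitab's.

```python
# Coefficients and ANOVA entries copied from the printout.
b0, b1 = 152.95, -1.860
stdev_b1 = 1.061
ss_regression, ss_error, df_error = 3348.0, 18522.0, 17


def predict_shares(age):
    """Fitted shares (in thousands) from the printed equation."""
    return b0 + b1 * age


# R-squared is the regression share of the total sum of squares.
r_squared = ss_regression / (ss_regression + ss_error)

# F is the regression mean square over the error mean square.
f_ratio = ss_regression / (ss_error / df_error)

# With one predictor, the slope's t-ratio squared equals F (up to rounding).
t_slope = b1 / stdev_b1

print(f"R-sq = {r_squared:.3f}, F = {f_ratio:.2f}, t = {t_slope:.2f}")
print(f"Fit at age 62: {predict_shares(62):.2f}")
```

The fit at age 62 can be compared with the value listed for observation 8 under Unusual Observations.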
2. A researcher believes that the data below has a Normal distribution with a mean of 80 and a standard
deviation of 5. For your convenience the values of $z = \frac{x - \mu}{\sigma} = \frac{x - 80}{5}$ are computed for you.
a. Use a chi-squared test to find out if the distribution is correct. (9)
b. Is there a better way to do this problem than chi-squared? Why? Do it. (5)
c. Assume that, instead of using the population mean and standard deviation given above, we actually checked the data and
found that $\bar{x} = 80$ and $s = 5$. How would this change what we did in a)? (1)
d. Assume that, instead of using the population mean and standard deviation given above, we actually checked the data and
found that $\bar{x} = 80$ and $s = 5$. How would this change what we did in b)? (1)
x interval    z interval      Observed Frequency
below 74      below -1.2               23
74-78         -1.2 to -0.4             53
78-82         -0.4 to 0.4              52
82-86         0.4 to 1.2               46
86-90         1.2 to 2.0               24
above 90      above 2.0                 2
Total                                 200
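For part a, the expected frequency in each interval is the sample size times the Normal probability of that interval. The sketch below is one way to set that up in Python using only the standard library; the helper `normal_cdf` and the variable names are assumptions for illustration, and the z cut points and observed counts come from the table.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Interval boundaries on the z scale and observed counts from the table.
z_cuts = [-1.2, -0.4, 0.4, 1.2, 2.0]
observed = [23, 53, 52, 46, 24, 2]
n = sum(observed)

# Cumulative probabilities at the boundaries, padded with 0 and 1
# for the open-ended first and last intervals.
cdf = [0.0] + [normal_cdf(z) for z in z_cuts] + [1.0]
probs = [cdf[i + 1] - cdf[i] for i in range(len(observed))]
expected = [n * p for p in probs]

# Chi-squared statistic: sum of (O - E)^2 / E over the intervals as given.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for o, e in zip(observed, expected):
    print(f"O = {o:3d}   E = {e:6.2f}")
print(f"chi-squared = {chi_sq:.3f}")
```

Note that the expected count in the last interval comes out below 5. A common rule of thumb is to merge such a cell with its neighbor before computing the statistic; this sketch does not do that.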
3. (Weirs) A maker of stain removers is testing the effectiveness of four different formulations of a new
product. Columns represent formulations 1-4 of the product and the 6 rows represent different stains
(Creosote, crayon, motor oil, grape juice, ink, coffee). Each formulation is rated on a 1-10 scale for its
effectiveness.
Stain           Form 1  Form 2  Form 3  Form 4    Sum  Count  Sum of squares
1                    1       7       2       5     15      4              79
2                    9      10       7       5     31      4             255
3                    4       6       1       4     15      4              69
4                    9       7       4       5     25      4             171
5                    6       8       4       4     22      4             132
6                    9       4       2       6     21      4             137
Sum                 38      42      20      29    129     24             843
Count                6       6       6       6     24
Sum of Squares     296     314      90     143
a. Assume that the parent distribution is Normal and compare the mean ratings for the four formulations,
noting the fact that it is cross-classified. Use $\alpha = .10$. (14) Note: If you wish to ignore the fact that the
data is classified by stain type, indicate this now and compare the column means assuming that the data is
four independent random samples from a Normal distribution. (10) ($\alpha = .10$)
b. Using the same significance level, assume that Formulation 1 is the current formula and use Scheffe
intervals to see which formulations have mean ratings that differ significantly from the current formulation.
(4)
c. Using a significance level of 15%, repeat the analysis in b) using Bonferroni intervals. (4)
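The sums, counts and sums of squares supplied with the table are the ingredients of the two-way ANOVA sums of squares in part a. A minimal sketch in Python follows, assuming the 6 by 4 table of ratings is entered row by row (stains as rows, formulations as columns); the variable names are illustrative only.

```python
# Ratings: one row per stain (1-6), one column per formulation (1-4).
ratings = [
    [1, 7, 2, 5],
    [9, 10, 7, 5],
    [4, 6, 1, 4],
    [9, 7, 4, 5],
    [6, 8, 4, 4],
    [9, 4, 2, 6],
]
rows, cols = len(ratings), len(ratings[0])
n = rows * cols

grand_total = sum(sum(row) for row in ratings)
correction = grand_total ** 2 / n  # n * (grand mean)^2

# Total sum of squares from the raw sum of squared ratings.
ss_total = sum(x * x for row in ratings for x in row) - correction

# Between-formulation (column) and between-stain (row) sums of squares.
col_sums = [sum(row[j] for row in ratings) for j in range(cols)]
row_sums = [sum(row) for row in ratings]
ss_formulation = sum(s * s for s in col_sums) / rows - correction
ss_stain = sum(s * s for s in row_sums) / cols - correction

# Whatever is left over is the error sum of squares.
ss_error = ss_total - ss_formulation - ss_stain

# F ratio for formulations, with (cols - 1) and (rows - 1)(cols - 1) d.f.
f_formulation = (ss_formulation / (cols - 1)) / (
    ss_error / ((rows - 1) * (cols - 1))
)

print(ss_total, ss_formulation, ss_stain, ss_error, round(f_formulation, 3))
```

The F ratio still has to be compared with a tabled critical value at the stated significance level; the sketch does not look that value up.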
3(ctd.). d. Actually, when Weirs presented the data in the previous problem, repeated below, he assumed
that the underlying distribution was not Normal. So compare the median ratings using a 10% significance
level. (6)
Stain           Form 1  Form 2  Form 3  Form 4    Sum  Count  Sum of squares
1                    1       7       2       5     15      4              79
2                    9      10       7       5     31      4             255
3                    4       6       1       4     15      4              69
4                    9       7       4       5     25      4             171
5                    6       8       4       4     22      4             132
6                    9       4       2       6     21      4             137
Sum                 38      42      20      29    129     24             843
Count                6       6       6       6     24
Sum of Squares     296     314      90     143
4. Use methods appropriate to testing goodness of fit.
a. Test the hypothesis that the numbers below came from a Normal distribution. Use a 10%
significance level. (6) Note that Minitab says the following:
mean    303.000
stdev   64.0878
n       9.00000
b. Test the hypothesis that the numbers below came from a Normal distribution with a mean of
240 and a standard deviation of 50 (6)
238 222 272 280 292 301 333 357 432
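One goodness-of-fit approach for a small ungrouped sample compares the empirical cumulative distribution with the hypothesized Normal one, as in a Kolmogorov-Smirnov type test. The sketch below is a hedged illustration in Python; `normal_cdf` and `ks_statistic` are hypothetical helper names, and the critical values (from a Kolmogorov-Smirnov table when the mean and standard deviation are specified, or a Lilliefors table when they are estimated from the sample) are not computed here.

```python
import math
import statistics

data = [238, 222, 272, 280, 292, 301, 333, 357, 432]


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a Normal(mu, sigma) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(values, mu, sigma):
    """Largest gap between the empirical and hypothesized Normal CDFs."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        # Check the gap just after and just before the jump at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Cross-check the Minitab summary figures.
sample_mean = statistics.mean(data)
sample_stdev = statistics.stdev(data)

# Part b: mean and standard deviation specified in advance.
d_specified = ks_statistic(data, 240, 50)

# Part a: mean and standard deviation estimated from the sample.
d_estimated = ks_statistic(data, sample_mean, sample_stdev)

print(sample_mean, round(sample_stdev, 4))
print(round(d_specified, 4), round(d_estimated, 4))
```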
5. (Weirs) The following data gives years of membership and numbers of shares (in thousands) owned for 8
board members of our corporation. The number of shares is the dependent variable and years of membership is the independent
variable.
Data Display

Row    shares  years  years squared  shares squared
  1       300      6             36           90000
  2       408     12            144          166464
  3       560     14            196          313600
  4       252      6             36           63504
  5       288      9             81           82944
  6       650     13            169          422500
  7       600     15            225          360000
  8       522      9             81          272484
Total    3580     84            968         1771496
Note that $n = 8$ and that you will have to compute $\sum xy$.
a. Compute the regression equation $Y = b_0 + b_1 x$ to predict thousands of shares owned on the basis
of years on the board. (6)
b. On the basis of your regression, how many thousands of shares do you expect to be owned by
someone who has been on the board for 3 years? (1)
c. Compute $R^2$. (4)
d. Compute $s_e$. (3)
e. Compute $s_{b_0}$ and do a significance test on $b_0$. (4)
f. Do an interval that shows the average number of shares that would be owned by someone who
has been on the board for 3 years. (3)
g. Using your SST etc., put together the ANOVA table. (6)
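The column totals in the data display, together with the cross-product sum that has to be computed by hand, are enough to get the slope and intercept for part a. A minimal sketch in Python is given below, assuming the shares and years columns are entered as lists; the names `s_xy` and `s_xx` are illustrative labels for the corrected cross-product and corrected sum of squares.

```python
# Data from the display: shares in thousands (y) and years on the board (x).
shares = [300, 408, 560, 252, 288, 650, 600, 522]
years = [6, 12, 14, 6, 9, 13, 15, 9]
n = len(years)

# Raw sums; the first four should match the totals in the data display.
sum_x = sum(years)
sum_y = sum(shares)
sum_x2 = sum(x * x for x in years)
sum_y2 = sum(y * y for y in shares)
sum_xy = sum(x * y for x, y in zip(years, shares))

# Corrected sums: subtract n times the product of the means.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

# Least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = sum_y / n - b1 * sum_x / n

print(f"sum xy = {sum_xy}, b1 = {b1:.3f}, b0 = {b0:.3f}")
```

The remaining parts (R-squared, the standard errors and the ANOVA table) build on the same sums plus the corrected sum of squares for shares.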