252y0571 11/28/05 ECO252 QBA2 Name KEY

252y0571 11/28/05 (Page layout view!)
ECO252 QBA2
THIRD HOUR EXAM
Dec 1 2005
Name KEY
Hour of Class Registered
MWF 2, MWF3, TR 12:30, TR2
I. (40 points) Do all the following (2 points each unless noted otherwise). Do not answer a question ‘yes’ or ‘no’ without giving reasons. Show your work in questions that are not multiple choice.
1. Turn in your computer problems 2 and 3 marked to show the following: (5 points, 2 point penalty for
not doing.)
a) In problem 2 – what is tested and what are the results?
b) In problem 3 – what coefficients are significant? What is your evidence?
c) In the last graph in problem 3, where is the regression line?
[5]
2. (Dummeldinger) As part of a study to investigate the effect of helmet design on football injuries,
head width measurements were taken for 30 subjects randomly selected from each of 3 groups (High
school football players, college football players and college students who do not play football – so that
there are a total of 90 observations) with the object of comparing the typical head widths of the three
groups. If the researchers assume that the data in each of these three groups comes from a Normally
distributed population, they should use the following method.
a) The Kruskal-Wallis test.
b) *One-way ANOVA
c) The Friedman test
d) Two-Way ANOVA
[7]
3. (Sandy) Which of the following is not an assumption required for 1-way ANOVA with 4 columns?
a) * $\mu_1 = \mu_2 = \mu_3 = \mu_4$.
b) All of the columns are random samples.
c) All of the populations have to be Normally distributed.
d) $\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4$
[9]
4. If we are comparing the means of 4 random samples and find the following:
$\bar x_1 = 10$, $\bar x_2 = 12$, $\bar x_3 = 11$, $\bar x_4 = 13$
$s_1 = 2.0$, $s_2 = 2.2$, $s_3 = 2.3$, $s_4 = 2.5$
$n_1 = 5$, $n_2 = 5$, $n_3 = 5$, $n_4 = 5$
The appropriate test statistic is:
a) $\chi^2$ with 12 degrees of freedom
b) $\chi^2$ with 19 degrees of freedom
c) F with 5 and 4 degrees of freedom (0.5)
d) * F with 3 and 16 degrees of freedom (2)
e) F with 4 and 5 degrees of freedom. (0.5)
f) $D = \dfrac{\bar x_1 + \bar x_2 + \bar x_3 + \bar x_4}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} + \dfrac{s_3^2}{n_3} + \dfrac{s_4^2}{n_4}}}$
[11]
Solution: We use ANOVA for multiple comparison of means. $\chi^2$ is only used in comparing medians and proportions. $n = \sum n_j = 20$, so total degrees of freedom are 19. There are 4 columns, so degrees of freedom between are 3. Thus there are 19 – 3 = 16 degrees of freedom within, and for an ANOVA, $F = \dfrac{MSB}{MSW}$ has 3 and 16 DF.
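The degrees-of-freedom count above can be checked numerically. The following is a minimal sketch that rebuilds the one-way ANOVA F ratio from the summary statistics in Question 4; the variable names are illustrative and the exam itself only asks for the degrees of freedom, not the value of F.

```python
# One-way ANOVA from summary statistics (numbers from Question 4).
means = [10.0, 12.0, 11.0, 13.0]   # sample means
sds = [2.0, 2.2, 2.3, 2.5]         # sample standard deviations
ns = [5, 5, 5, 5]                  # sample sizes

m = len(means)                     # number of columns
n = sum(ns)                        # total observations
grand_mean = sum(nj * xj for nj, xj in zip(ns, means)) / n

# Between-column and within-column sums of squares.
ssb = sum(nj * (xj - grand_mean) ** 2 for nj, xj in zip(ns, means))
ssw = sum((nj - 1) * sj ** 2 for nj, sj in zip(ns, sds))

df_between = m - 1                 # 3
df_within = n - m                  # 16
f_stat = (ssb / df_between) / (ssw / df_within)

print(df_between, df_within, round(f_stat, 4))
```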
5. If we are doing a 2-way ANOVA and find the following:
Two-way ANOVA: C8 versus C9, C10
Source        DF       SS       MS     F      P
Rows           3   8.2963  2.76542  2.95  0.040
Columns        2   3.7183  1.85916  1.98  0.147
Interaction    6  25.8108  4.30180  4.58  0.001
Error         60  56.3071  0.93845
Total         71  94.1325
S = 0.9687   R-Sq = 40.18%   R-Sq(adj) = 29.22%
The following are significant at the 1% level. (3)
a) Differences between Row means only
b) Differences between Column means only
c) Differences between both Row and Column means
d) *Interaction only
e) All are significant at the 1% level
f) None are significant at the 1% level
g) Not enough information.
Note that Interaction is the only F with a p-value below .01.
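As a rough check on the printout, the sketch below recomputes each F ratio as MS divided by the error MS and lists the sources whose p-values fall below 1%. The p-values are copied from the printout rather than recomputed, and the data layout is only an illustration.

```python
# (source, DF, SS, p-value) rows copied from the two-way ANOVA printout.
table = [
    ("Rows", 3, 8.2963, 0.040),
    ("Columns", 2, 3.7183, 0.147),
    ("Interaction", 6, 25.8108, 0.001),
]
error_df, error_ss = 60, 56.3071
mse = error_ss / error_df          # mean square error

alpha = 0.01
# F = (SS / DF) / MSE for each source.
f_ratios = {name: (ss / df) / mse for name, df, ss, _ in table}
# A source is significant when its p-value is below alpha.
significant = [name for name, _, _, p in table if p < alpha]

for name, f in f_ratios.items():
    print(f"{name:12s} F = {f:.2f}")
print("Significant at 1%:", significant)
```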
[14]
6. If we do a 1-way ANOVA and find the following.
One-way ANOVA: C1, C2, C3, C4
Source  DF      SS     MS     F      P
Factor   3   32.37  10.79  2.76  0.049
Error   68  266.27   3.92
Total   71  298.64

Level   N    Mean  StDev
C1     18  11.916  1.095
C2     18  12.436  2.195
C3     18  12.927  1.929
C4     18  13.736  2.434
[Plot: Individual 95% CIs For Mean Based on Pooled StDev, one interval per level, axis from 11.0 to 14.0]
Give a 1% Tukey confidence interval (or equivalent test) for $\mu_1 - \mu_4$ and explain whether this shows a significant difference between these two means. (3) [17]
Extra Credit – do the same with a Scheffé interval. (2)
Extra Credit – Do the same for an individual confidence interval for the difference and explain why it is more likely to show a significant difference than the other two. (2)
Solution: From the printout $n - m = 68$, $m = 4$, $s^2 = MSW = 3.92$, $n_1 = 18$, $n_4 = 18$, $\bar x_{.1} = 11.916$ and $\bar x_{.4} = 13.736$.
First, $s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_4}} = \sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_4}\right)} = \sqrt{3.92\left(\dfrac{1}{18} + \dfrac{1}{18}\right)} = \sqrt{0.4356} = 0.65997$ and $\bar x_{.1} - \bar x_{.4} = 11.916 - 13.736 = -1.82$.
a) Tukey Confidence Interval: $\mu_1 - \mu_4 = (\bar x_{.1} - \bar x_{.4}) \pm q_{\alpha}^{(m,\,n-m)} \dfrac{s}{\sqrt{2}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_4}}$.
$\dfrac{1}{\sqrt{2}}\, q_{.01}^{(4,\,68)} = \dfrac{4.59}{\sqrt{2}} = 3.2456$. So $\mu_1 - \mu_4 = -1.82 \pm 3.2456(0.65997) = -1.82 \pm 2.14$.
b) Scheffé Confidence Interval: $\mu_1 - \mu_4 = (\bar x_{.1} - \bar x_{.4}) \pm \sqrt{(m-1)F_{\alpha}^{(m-1,\,n-m)}}\; s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_4}}$.
Note that $F_{.01}^{(3,68)}$ is between $F_{.01}^{(3,65)} = 4.10$ and $F_{.01}^{(3,70)} = 4.07$. So $F_{.01}^{(3,68)}$ must be about 4.08.
$\sqrt{3F_{.01}^{(3,68)}} = \sqrt{3(4.08)} = 3.4986$. Our interval is now $\mu_1 - \mu_4 = -1.82 \pm 3.4986(0.65997) = -1.82 \pm 2.31$.
c) Individual Confidence Interval: $\mu_1 - \mu_4 = (\bar x_{.1} - \bar x_{.4}) \pm t_{\alpha/2}^{(n-m)}\, s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_4}}$.
$t_{\alpha/2}^{(n-m)} = t_{.005}^{(68)} = 2.650$. Our interval is now $\mu_1 - \mu_4 = -1.82 \pm 2.650(0.65997) = -1.82 \pm 1.75$.
Looking back, recall that the ANOVA found the four means significantly different at the 5% level but not at the 1% level (the p-value was 0.049). In this case only the individual confidence interval excludes zero, so only it shows a significant difference between the two means. We know that as confidence levels go up, confidence intervals have to get wider. The individual confidence interval by itself has a confidence level of 99%. The Tukey and Scheffé intervals have a collective confidence level of 99%, so each interval within those families must have an individual confidence level above 99% and is therefore wider.
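The three intervals differ only in the multiplier applied to the same standard error, which the sketch below makes explicit. The critical values (4.59, 4.08 and 2.650) are the table values quoted in the solution, not values computed by the code, and the dictionary names are illustrative.

```python
import math

# Values from the one-way ANOVA printout in Question 6.
msw = 3.92                      # pooled variance s^2
n1 = n4 = 18
m = 4                           # number of columns
diff = 11.916 - 13.736          # x-bar(.1) minus x-bar(.4)
se = math.sqrt(msw * (1 / n1 + 1 / n4))

# Table values quoted in the solution (alpha = .01, 68 error DF).
q_crit = 4.59                   # studentized range, (4, 68)
f_crit = 4.08                   # F(3, 68), interpolated from the table
t_crit = 2.650                  # t(68), upper .005 point

half_widths = {
    "Tukey": q_crit / math.sqrt(2) * se,
    "Scheffe": math.sqrt((m - 1) * f_crit) * se,
    "individual": t_crit * se,
}
# The difference is significant when the interval excludes zero.
significant = {name: abs(diff) > hw for name, hw in half_widths.items()}

for name, hw in half_widths.items():
    print(f"{name:10s} {diff:.2f} +/- {hw:.2f}  significant: {significant[name]}")
```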
7. If we do a 1-way ANOVA and find the following: (Sandy 12.50, 12.51)
One-way ANOVA:
Source   DF       SS        MS      F  P
Factor    ?  6.76792  0.615264  0.636
Error     ?  162.448  0.966951
Total   179  169.216
The degrees of freedom for the F test are (2)
a) 10, 168
b) 11, 158
c) 10, 158
d) *11, 168.
e) 9, 178
f) 10, 178
Explanation: 6.76792/11 = 0.615264. 162.448/168 = 0.96695. 11+168 = 179.
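The explanation works backwards from the fact that each mean square is its sum of squares divided by its degrees of freedom. A minimal sketch of that recovery, using only the numbers in the printout:

```python
# Numbers from the partial ANOVA table in Question 7.
ss_factor, ms_factor = 6.76792, 0.615264
ss_error, ms_error = 162.448, 0.966951
df_total = 179

# MS = SS / DF, so DF = SS / MS (rounded, since the printout is rounded).
df_factor = round(ss_factor / ms_factor)
df_error = round(ss_error / ms_error)
f_stat = ms_factor / ms_error

print(df_factor, df_error, df_factor + df_error == df_total, round(f_stat, 3))
```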
8. If we do a 1-way ANOVA and assume that your answer in 7 is correct, pick an appropriate value for F with a 10% significance level from the table and explain your results. (2) [21]
Solution: From the F table, $F_{.10}^{(10,168)}$ is between 1.63 and 1.65; $F_{.10}^{(11,158)}$ is between 1.60 and 1.62; $F_{.10}^{(10,158)}$ is between 1.63 and 1.65; $F_{.10}^{(11,168)}$ is between 1.60 and 1.62; $F_{.10}^{(9,178)}$ is between 1.66 and 1.68; $F_{.10}^{(10,178)}$ is between 1.63 and 1.65. In any case, the computed value of F (0.636) is below the table value, so we cannot reject the null hypothesis of equal factor means.
9. If we do a simple regression and find the following: (Sandy 13.1, 13.2)
$\sum xy = 1150$, $\bar x = 5$, $\bar y = 10$, $n = 20$, $\sum x^2 = 550$. The predicted value of y when $x = 6$ is:
a) 10
b) 11
c) 12
d) * 13
e) Answer can’t be obtained with information given. (4)
[25]
$\bar x = \dfrac{\sum x}{n} = \dfrac{?}{20} = 5$, $\bar y = \dfrac{\sum y}{n} = \dfrac{?}{20} = 10$
$SS_x = \sum x^2 - n\bar x^2 = 550 - 20(5)^2 = 550 - 500 = 50$
$S_{xy} = \sum xy - n\bar x \bar y = 1150 - 20(5)(10) = 1150 - 1000 = 150$
$b_1 = \dfrac{S_{xy}}{SS_x} = \dfrac{\sum xy - n\bar x \bar y}{\sum x^2 - n\bar x^2} = \dfrac{150}{50} = 3.00$
$b_0 = \bar y - b_1 \bar x = 10 - 3.00(5) = -5$. So the equation is $\hat Y = -5 + 3X$ and if $x = 6$, $\hat Y = -5 + 3(6) = 13$.
10. Assume the following data:
        x   y
        4   2
        3   0
        7   5
        2   1
Sum    16   8
Find the following. Show your work! $\sum x$, $\sum xy$, $R^2$ (4) [29]
Row     x   y  x^2  y^2  xy
1       4   2   16    4   8
2       3   0    9    0   0
3       7   5   49   25  35
4       2   1    4    1   2
Sum    16   8   78   30  45
$\bar x = \dfrac{\sum x}{n} = \dfrac{16}{4} = 4$, $\bar y = \dfrac{\sum y}{n} = \dfrac{8}{4} = 2$
$SS_x = \sum x^2 - n\bar x^2 = 78 - 4(4)^2 = 78 - 64 = 14$ *
$S_{xy} = \sum xy - n\bar x \bar y = 45 - 4(4)(2) = 45 - 32 = 13$
$SST = SS_y = \sum y^2 - n\bar y^2 = 30 - 4(2)^2 = 30 - 16 = 14$ *
$b_1 = \dfrac{S_{xy}}{SS_x} = \dfrac{\sum xy - n\bar x \bar y}{\sum x^2 - n\bar x^2} = \dfrac{13}{14} = 0.92857$
$SSR = b_1 S_{xy} = 0.92857(13) = 12.0714$ *
So $R^2 = \dfrac{SSR}{SST} = \dfrac{12.0714}{14} = .8622$ or $R^2 = \dfrac{S_{xy}^2}{SS_x\, SS_y} = \dfrac{13^2}{14(14)} = \dfrac{169}{196} = .8622$. This must be between zero and one.
Starred quantities must be positive.
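For readers who want to verify the sums, here is a small sketch that recomputes the Question 10 quantities directly from the four data points. Variable names are illustrative, and both routes to $R^2$ are computed so they can be compared.

```python
# Data from Question 10.
x = [4, 3, 7, 2]
y = [2, 0, 5, 1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross products.
ss_x = sum(v * v for v in x) - n * x_bar ** 2
ss_y = sum(v * v for v in y) - n * y_bar ** 2
s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar

b1 = s_xy / ss_x
ssr = b1 * s_xy
r2_from_ssr = ssr / ss_y                 # SSR / SST
r2_from_sxy = s_xy ** 2 / (ss_x * ss_y)  # Sxy^2 / (SSx * SSy)

print(ss_x, s_xy, ss_y, round(b1, 5), round(r2_from_ssr, 4))
```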
11. The percentage of the total (squared) variation of the y variable around its mean accounted
for by the x variable is measured by the
a) Coefficient of Correlation
b) Coefficient of Explanation
c) *Coefficient of Determination
d) Standard error of the estimate $s_e$
(2)
[31]
————— 11/28/2005 8:40:25 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > Regress c1 1 c2;
SUBC>   Constant;
SUBC>   Brief 3.

Regression Analysis: Y versus X

The regression equation is
Y = 12.5 + 5.10 X

Predictor    Coef  SE Coef     T      P
Constant   12.464    2.465  5.06  0.001
X          5.0990   0.6282  8.12  0.000

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       1   998.38  998.38  65.89  0.000
Residual Error   8   121.22   15.15
Total            9  1119.60

Obs     X      Y  Fitted Y
  1  0.00  11.00     12.46
  2  5.00  34.00     37.96
  3  3.00  34.00     27.76
  4  6.00  48.00     43.06
  5  3.00  23.00     27.76
  6  6.00  40.00     43.06
  7  3.00  26.00     27.76
  8  1.00  19.00     17.56
  9  5.00  39.00     37.96
 10  2.00  24.00     22.66

$\sum y = 298$, $\sum x = 34$ and $\sum x^2 = 154$
12. From the computer output above, find the following:
a) $R^2$ (2)
b) $s_e$ (2)
c) A 95% confidence interval for $\beta_0$ (2)
d) A 95% confidence interval for Y when $X = 5$. (3)
[40]
General Comment: You should understand both the table giving the regression equation and the ANOVA table. The values $b_0 = 12.464$ and $b_1 = 5.0990$ appear in the first table. To their right are $s_{b_0} = 2.465$ and $s_{b_1} = 0.6282$. These are used in the t ratios that appear next. But you should be able to ignore the t ratios and note that because the p-value for the constant is below any significance level that you might use, the constant is highly significant. For the coefficient of X the p-value is also below any significance level that you might use, so that coefficient is highly significant too.
a) $R^2$ Solution: The ANOVA table says $SSR = 998.38$, $SST = 1119.60$ and $SSE = SST - SSR = 121.22$.
$R^2 = \dfrac{SSR}{SST} = \dfrac{998.38}{1119.60} = .8917$
b) $s_e$ Solution: $s_e^2 = \dfrac{SSE}{n-2} = \dfrac{\sum (Y - \hat Y)^2}{n-2} = \dfrac{SS_y - b_1^2 SS_x}{n-2} = \dfrac{SST - SSR}{n-2} = \dfrac{121.22}{8} = 15.1525$. This can be copied from the ANOVA table.
$s_e = \sqrt{15.1525} = 3.8926$
c) A 95% interval for $\beta_0$. The regression output has
Predictor    Coef  SE Coef     T      P
Constant   12.464    2.465  5.06  0.001
X          5.0990   0.6282  8.12  0.000
$df = n - 2 = 8$, $t_{.025}^{(8)} = 2.306$
$\beta_0 = b_0 \pm 2.306\, s_{b_0} = 12.464 \pm 2.306(2.465) = 12.464 \pm 5.684$
d) A 95% confidence interval for Y when $X = 5$. Solution: The Confidence Interval is $\mu_{Y_0} = \hat Y_0 \pm t\, s_{\hat Y}$, where $s_{\hat Y}^2 = s_e^2\left[\dfrac{1}{n} + \dfrac{(X_0 - \bar X)^2}{SS_x}\right]$. The table above says that if $x = 5$, $\hat Y = 37.96$.
$\bar x = \dfrac{\sum x}{n} = \dfrac{34}{10} = 3.4$ and $SS_x = \sum x^2 - n\bar x^2 = 154 - 10(3.4)^2 = 154 - 115.6 = 38.4$.
$s_{\hat Y}^2 = s_e^2\left[\dfrac{1}{n} + \dfrac{(X_0 - \bar X)^2}{SS_x}\right] = 15.1525\left[\dfrac{1}{10} + \dfrac{(5 - 3.4)^2}{38.4}\right] = 15.1525(0.1 + 0.0667) = 2.5254$
$s_{\hat Y} = \sqrt{2.5254} = 1.5892$, so $\mu_{Y_0} = 37.96 \pm 2.306(1.5892) = 37.96 \pm 3.66$.
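As a cross-check on parts a) through d), the sketch below recomputes the regression quantities directly from the ten observations listed in the printout. The t value 2.306 is the table value for 8 degrees of freedom quoted in the solution, not something the code derives, and the variable names are illustrative. Computed from the raw X values, $SS_x = 154 - 10(3.4)^2 = 38.4$, which also reproduces the printed SE Coef for X.

```python
import math

# The ten (X, Y) observations from the Minitab printout.
x = [0, 5, 3, 6, 3, 6, 3, 1, 5, 2]
y = [11, 34, 34, 48, 23, 40, 26, 19, 39, 24]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_x = sum(v * v for v in x) - n * x_bar ** 2
ss_y = sum(v * v for v in y) - n * y_bar ** 2
s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar

b1 = s_xy / ss_x
b0 = y_bar - b1 * x_bar
ssr = b1 * s_xy
r2 = ssr / ss_y                                   # part a)
s_e = math.sqrt((ss_y - ssr) / (n - 2))           # part b)

t_crit = 2.306                                    # table t, 8 DF, .025
s_b1 = s_e / math.sqrt(ss_x)
s_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / ss_x)
half_b0 = t_crit * s_b0                           # part c) half-width

x0 = 5
y_hat0 = b0 + b1 * x0
s_yhat = s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_x)
half_mean = t_crit * s_yhat                       # part d) half-width

print(round(r2, 4), round(s_e, 4))
print(f"beta0: {b0:.3f} +/- {half_b0:.3f}")
print(f"mean Y at X=5: {y_hat0:.2f} +/- {half_mean:.2f}")
```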