252y0561 11/9/05 ECO252 QBA2 Name

252y0561 11/9/05 (Page layout view!)
ECO252 QBA2
SECOND HOUR EXAM
November 8, 2005
Name
KEY
Circle Hour of Class Registered
MWF2, MWF3, TR12:30, TR3
Show your work! Make Diagrams! Exam is normed on 50 points. Answers without reasons are not
usually acceptable.
I. (8 points) Do all the following. Make diagrams!
$x \sim N(24, 7)$. If you are not using the supplement table, make sure that I know it.
1. $P(0.50 \le x \le 30.4) = P\left(\frac{0.50 - 24}{7} \le z \le \frac{30.4 - 24}{7}\right) = P(-3.36 \le z \le 0.91)$
$= P(-3.36 \le z \le 0) + P(0 \le z \le 0.91) = .4996 + .3186 = .8182$
Make a diagram: For x draw a Normal curve with a vertical line at 24 in the middle. Shade the entire area
between 0.50 and 30.4. This will cover areas on both sides of 24. Or for z draw a Normal curve with a
vertical line at zero in the middle. Shade the area from -3.36 to zero and from zero to 0.91.
2. $P(x \le 34) = P\left(z \le \frac{34 - 24}{7}\right) = P(z \le 1.43) = P(z \le 0) + P(0 \le z \le 1.43) = .5 + .4236 = .9236$
Make a diagram: For x draw a Normal curve with a vertical line at 24 in the middle. Shade the entire area
below 34. This will cover areas on both sides of 24. Or for z draw a Normal curve with a vertical line at
zero in the middle. Shade the entire area below zero and from zero to 1.43.
3. $P(0 \le x \le 24.00) = P\left(\frac{0 - 24}{7} \le z \le \frac{24 - 24}{7}\right) = P(-3.43 \le z \le 0) = .4997$
Make a diagram: For x draw a Normal curve with a vertical line at 24 in the middle. Shade the entire area
between zero and 24. This will be an area on the left side of 24. Or for z draw a Normal curve with a
vertical line at zero in the middle. Shade the entire area from -3.43 to zero.
4. $x_{.045}$
To find $z_{.045}$ make a Normal diagram for z showing a mean at 0 and 50% above 0, divided into 4.5% above $z_{.045}$ and 45.5% below $z_{.045}$. So $P(0 \le z \le z_{.045}) = .4550$. The closest we can come is $P(0 \le z \le 1.69) = .4545$ or $P(0 \le z \le 1.70) = .4554$. So use $z_{.045} = 1.695$ (or 1.69 or 1.70).
$x_{.045} = \mu + z_{.045}\sigma = 24 + 1.695(7) = 35.865$.
Check: $P(x \ge 35.865) = P\left(z \ge \frac{35.865 - 24}{7}\right) = P(z \ge 1.695) = P(z \ge 0) - P(0 \le z \le 1.695) = .5 - .4549 = .0451 \approx 4.5\%$
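The four Normal-table answers in Section I can be checked by computer. The sketch below is only a cross-check, not part of the exam method: it assumes Python's standard-library `statistics.NormalDist` and uses unrounded z values, so its results differ slightly from the table answers, which round z to two decimals.

```python
from statistics import NormalDist

# x ~ N(24, 7): mean 24 and standard deviation 7, as given in the problem.
x = NormalDist(mu=24, sigma=7)

p1 = x.cdf(30.4) - x.cdf(0.50)   # problem 1: P(0.50 <= x <= 30.4)
p2 = x.cdf(34)                   # problem 2: P(x <= 34)
p3 = x.cdf(24) - x.cdf(0)        # problem 3: P(0 <= x <= 24)
x_045 = x.inv_cdf(1 - 0.045)     # problem 4: the point with 4.5% of the area above it

print(round(p1, 4), round(p2, 4), round(p3, 4), round(x_045, 3))
```

Each value should agree with the hand answer to about two or three decimal places.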
II. (22+ points) Do all the following. (2 points each unless noted otherwise). Look them over first – the
Note the following:
1. This test is normed on 50 points, but there are more points possible including the take-home.
You are unlikely to finish the exam and might want to skip some questions.
2. A table identifying methods for comparing 2 samples is at the end of the exam.
3. If you answer ‘None of the above’ in any question, you should provide an alternative
answer and explain why. You may receive credit for this even if you are wrong.
4. Use a 5% significance level unless the question says otherwise.
Computer problem is at the end. Note that some formulas have been squashed by a bug in Word. They
should print correctly and will read right if you click on them.
Exhibit 1
The director of the MBA program of a state university wanted to know if a one week orientation would
change the proportion among potential incoming students who would perceive the program as being
good. Given below is the result from 215 students’ view of the program before and after the
orientation.
                           After the Orientation
Before the Orientation     Good    Not Good    Total
Good                         93          37      130
Not Good                     71          14       85
Total                       164          51      215

1. Referring to Exhibit 1, which test should she use?
a) $\chi^2$-test for difference in proportions
b) Z-test for difference in proportions
c) *McNemar test for difference in proportions
d) Wilcoxon rank sum test
ANSWER: c
TYPE: MC DIFFICULTY: Moderate
KEYWORDS: McNemar test, assumption
In Method D6b, the McNemar Test, we compare two proportions taken from the same sample. Assume that two different questions are asked of the same group with the following responses.

                    question 2
                    yes         no
question 1   yes    $x_{11}$    $x_{12}$
             no     $x_{21}$    $x_{22}$

In this case

                    question 2
                    good    not
question 1   good     93     37
             not      71     14

If we wish to test $H_0: p_1 = p_2$ against $H_1: p_1 \ne p_2$, where $p_1$ is the proportion saying ‘yes’ before and $p_2$ is the proportion saying ‘yes’ after, let
$z = \frac{x_{12} - x_{21}}{\sqrt{x_{12} + x_{21}}} = \frac{37 - 71}{\sqrt{37 + 71}} = \frac{-34}{\sqrt{108}} = \frac{-34}{10.3923} = -3.2717$ (The test is valid only if $x_{12} + x_{21} \ge 10$.)
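The McNemar ratio above uses only the two off-diagonal cells of the table. A minimal sketch of the same arithmetic follows, assuming Python's standard library; the variable names are illustrative and not from the course outline.

```python
from math import sqrt
from statistics import NormalDist

# Off-diagonal (discordant) cells of the Exhibit 1 table.
x12 = 37   # good before, not good after
x21 = 71   # not good before, good after

# McNemar test ratio as written in the solution: z = (x12 - x21) / sqrt(x12 + x21)
z = (x12 - x21) / sqrt(x12 + x21)

# Two-sided p-value from the standard Normal distribution.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# Validity condition quoted in the solution: x12 + x21 must be at least 10.
valid = (x12 + x21) >= 10

print(f"z = {z:.4f}, two-sided p-value = {p_two_sided:.4f}, valid = {valid}")
```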
2. Referring to Exhibit 1, what is the null hypothesis?
Solution: See above.
3. Referring to Exhibit 1, what is the value of the computed test statistic? [6]
Solution: See above. ANSWER: -3.37 or 3.37
TYPE: PR DIFFICULTY: Moderate
KEYWORD: McNemar test, test statistic
4. Referring to Exhibit 1, what is the p-value of the test statistic using a 5% level of significance? [8]
ANSWER: 0.0008. Actually $2P(z \ge 3.37) = 2(.5 - .4995) = .0010$ is less exact, but fine.
TYPE: PR DIFFICULTY: Moderate
KEYWORD: McNemar test, p-value
Exhibit 2
(This was Problem D7)
In a study of sleep gotten with a sleeping pill and with a placebo the results were (Keller, Warren, Bartel, 2nd ed. p. 354)

$x_1$ (Pill)    $x_2$ (Placebo)    $d$ (difference)
7.3             6.8                .5
8.5             7.9                .6
6.4             6.0                .4
9.0             8.4                .6
6.9             6.5                .4

$\bar x_1 = 7.620$, $\bar x_2 = 7.120$, $\bar d = 0.500$
$s_1^2 = 1.197$, $s_2^2 = 0.997$, $s_d^2 = 0.010$
We want to see if the means or medians, as appropriate, are different.
Assume that these are independent samples from populations with a Normal distribution and that $\sigma_1^2 = \sigma_2^2$.
5. Referring to Exhibit 2, what should be the degrees of freedom for this test?
a) $DF = 4$
b) * $DF = 8$
c) $DF = 9$
d) $DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{0.4388^2}{\frac{0.2394^2}{4} + \frac{0.1994^2}{4}} = 7.9341$ (Rounded to 7.)
e) Degrees of freedom are irrelevant because we should use a (Mann-Whitney-) Wilcoxon rank sum test.
f) Degrees of freedom are irrelevant because we should use a Wilcoxon signed rank test.
g) We do not have enough information to answer this question. (You must explain what information is missing)
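The two degrees-of-freedom candidates in question 5, the pooled value in b) and the unequal-variance formula in d), can be compared numerically. This is a sketch assuming plain Python; the names `df_pooled` and `df_unequal` are illustrative only.

```python
# Sample variances and sample sizes from Exhibit 2.
s1_sq, s2_sq = 1.197, 0.997
n1 = n2 = 5

# Equal-variance (pooled) method: DF = n1 + n2 - 2, the starred answer b).
df_pooled = n1 + n2 - 2

# Unequal-variance formula shown in option d).
a = s1_sq / n1          # 0.2394
b = s2_sq / n2          # 0.1994
df_unequal = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(df_pooled, round(df_unequal, 4))
```

Because the exhibit says to assume equal variances, the pooled value is the one that applies; the second figure is what option d) evaluates to.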
6. Referring to Exhibit 2, in the formula [formula missing], which is used to compute [symbol missing], we should use the following.
[The formula options, including the starred answer, are missing. The options that survive are:]
The standard error is irrelevant because we are using a (Mann-Whitney-) Wilcoxon signed rank test.
The standard error is irrelevant because we are using a (Mann-Whitney-) Wilcoxon signed rank test.
We do not have enough information to answer this question. (You must explain what information is missing)
7.
Referring to Exhibit 2, assume that the correct alternate hypothesis is $\mu_1 \ne \mu_2$, that use of the formula $t = \frac{\bar d - D_0}{s_{\bar d}} = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{s_{\bar d}}$ is correct, that $t = 2.263$ and that there are 27 degrees of freedom (Assume that all of these are correct, even though it is very unlikely!), we should do the following.
a) Reject the null hypothesis only if the significance level is a value below .01
b) Reject the null hypothesis if the significance level is any value below .02
c) Reject the null hypothesis if the significance level is any value above .025
d) *Reject the null hypothesis if the significance is any value above .05
e) Reject the null hypothesis only if the significance level is above .10
f) None of the above.
[14]
Explanation: Note $t^{(27)}_{.025} = 2.052$ and $t^{(27)}_{.01} = 2.473$ so that, for a 2-sided test, the p-value is between .05 and .02. If the p-value is below the significance level, reject the null hypothesis.
8. You are having a part produced in two different machines. $x_1$ is 201 randomly selected data points that represent the length of parts from machine one; $x_2$ is 501 randomly selected data points that represent the length of parts from machine two. You want to test your suspicion that parts from machine two are more variable in length than parts from machine one (This is the same as saying that machine 1 is more reliable than machine 2). Test this suspicion after stating your hypotheses. Your sample means are 25.593 inches for machine 1 and 25.592 for machine 2. Sample standard deviations are 8.379 for machine 1 and 9.293 for machine 2. (2) [17]
Solution: $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$, or $H_0: \sigma_1 \ge \sigma_2$, $H_1: \sigma_1 < \sigma_2$. In terms of the variance ratio $\frac{\sigma_1^2}{\sigma_2^2}$ or $\frac{\sigma_2^2}{\sigma_1^2}$, the alternate hypothesis rules, so $H_0: \frac{\sigma_2^2}{\sigma_1^2} \le 1$ and $H_1: \frac{\sigma_2^2}{\sigma_1^2} > 1$.
Since you are comparing variances, use Method D7. Compare the ratio $\frac{s_2^2}{s_1^2}$ against F. $\frac{s_2^2}{s_1^2} = \left(\frac{9.293}{8.379}\right)^2 = 1.230$. This has the F distribution with 500 and 200 degrees of freedom. From the table $F^{(500,200)}_{.05} = 1.22$. Since the computed F is larger than the table F, reject the null hypothesis.
9. (Extra credit) Compute a two-sided confidence interval for the ratio of the two variances in the previous problem. (3)
Solution: From the outline, $\frac{s_2^2}{s_1^2}\,\frac{1}{F^{(n_2-1,\,n_1-1)}_{\alpha/2}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F^{(n_1-1,\,n_2-1)}_{\alpha/2}$. $\frac{s_2^2}{s_1^2} = \left(\frac{9.293}{8.379}\right)^2 = 1.230$ has the F distribution with 500 and 200 degrees of freedom. So $1.230\,\frac{1}{F^{(500,200)}_{.025}} \le \frac{\sigma_2^2}{\sigma_1^2} \le 1.230\,F^{(200,500)}_{.025}$. $F^{(500,200)}_{.025} = 1.27$, and $F^{(200,500)}_{.025}$ must be between $F^{(200,400)}_{.025} = 1.27$ and $F^{(200,1000)}_{.025} = 1.23$, probably about 1.26. The interval thus becomes $1.230\left(\frac{1}{1.27}\right) \le \frac{\sigma_2^2}{\sigma_1^2} \le 1.230(1.26)$ or $0.968 \le \frac{\sigma_2^2}{\sigma_1^2} \le 1.55$.
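The variance-ratio test in question 8 and the interval in question 9 reduce to a few lines of arithmetic once the F table values are in hand. The sketch below assumes plain Python and simply hard-codes the table values quoted in the solutions (1.22, 1.27 and the interpolated 1.26); it does not compute F quantiles itself.

```python
# Sample standard deviations from question 8.
s1, s2 = 8.379, 9.293

# Test ratio s2^2 / s1^2, with 500 and 200 degrees of freedom.
f_ratio = (s2 / s1) ** 2

# Table values as quoted in the solutions (assumed inputs, not computed here).
F_05_500_200 = 1.22    # one-sided 5% critical value for the test
F_025_500_200 = 1.27   # used for the lower limit of the interval
F_025_200_500 = 1.26   # interpolated in the solution, used for the upper limit

reject_h0 = f_ratio > F_05_500_200

# Two-sided 95% interval for sigma2^2 / sigma1^2.
lower = f_ratio / F_025_500_200
upper = f_ratio * F_025_200_500

print(f"F = {f_ratio:.3f}, reject H0: {reject_h0}")
print(f"{lower:.3f} <= sigma2^2/sigma1^2 <= {upper:.2f}")
```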
10. The following problem is an easier version of a problem in the text.
A pet food canning factory produces 8 oz cans of cat food. The manager suspects that
the amount of cat food put into the cans by machine 1 is significantly larger than that put
in by machine 2. A sample of output is taken with the results below.
$\bar x_1 = 8.005$, $\bar x_2 = 7.997$, $s_1 = 0.048$, $s_2 = 0.015$, $n_1 = 176$, $n_2 = 144$
a) What are the manager’s null and alternate hypotheses? (1)
b) You will use a test ratio of the form $\frac{\bar d - D_0}{s_{\bar d}}$ to test the hypothesis. Find $s_{\bar d}$. (3)
Warning: Be accurate! $s_{\bar d}^2$ is roughly the size of .0000175. If you start rounding excessively, your answers will be completely wrong. If you absolutely cannot do this section, say so and use .000175, which is very wrong.
c) Compute the test ratio and find a p-value for your result. (2)
d) If the manager had, instead, suspected that a larger amount was going into cans filled
by machine 2, what would the p-value be? (1)
e) Find a 91% two-sided confidence interval for the difference between the average
amount of food put in the cans by the two methods. (2)
[25]
Solution: a) $H_1: \mu_1 > \mu_2$, so $H_0: \mu_1 \le \mu_2$. Note $\bar d = 8.005 - 7.997 = 0.008$.
b) $\frac{s_1^2}{n_1} = \frac{0.048^2}{176} = .0000131$ and $\frac{s_2^2}{n_2} = \frac{0.015^2}{144} = .00000156$, so
$s_{\bar d} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{0.048^2}{176} + \frac{0.015^2}{144}} = \sqrt{.0000131 + .00000156} = \sqrt{.0000147} = 0.003828$
c) $z = \frac{8.005 - 7.997}{0.003828} = 2.0899$, p-value $= P(z \ge 2.09) = .5 - .4817 = .0183$
d) 1 - .0183 = .9817.
e) From page 1 $z_{.045} = 1.695$ (or 1.69 or 1.70), so $D = \mu_1 - \mu_2 = \bar d \pm z_{.045}\, s_{\bar d} = 0.008 \pm 1.695(.003828) = 0.008 \pm 0.006$ or 0.002 to 0.014.
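Question 10 warns against rounding, so it is a natural place for a machine check. The following is a sketch of parts b) through e), assuming Python's standard library; it carries full precision throughout, so the interval endpoints differ from the hand answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the problem statement.
xbar1, xbar2 = 8.005, 7.997
s1, s2 = 0.048, 0.015
n1, n2 = 176, 144

d_bar = xbar1 - xbar2                       # difference between sample means
s_dbar = sqrt(s1**2 / n1 + s2**2 / n2)      # b) standard error of the difference
z = d_bar / s_dbar                          # c) test ratio with D0 = 0

std_normal = NormalDist()
p_right = 1 - std_normal.cdf(z)             # c) p-value when H1 is mu1 > mu2
p_left = std_normal.cdf(z)                  # d) p-value when H1 is mu1 < mu2

# e) 91% two-sided interval: 4.5% in each tail.
z_045 = std_normal.inv_cdf(1 - 0.09 / 2)
ci = (d_bar - z_045 * s_dbar, d_bar + z_045 * s_dbar)

print(f"s = {s_dbar:.6f}, z = {z:.4f}, p = {p_right:.4f}, 1 - p = {p_left:.4f}")
print(f"91% CI: {ci[0]:.4f} to {ci[1]:.4f}")
```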
11. Over 104 weeks, the following numbers of mortgages were approved by a bank. Do the results below follow a Poisson distribution?

Number Approved      O
0                   16
1                   25
2                   31
3                   17
4                    8
5                    5
6                    1
7                    1
8 or more            0
Total              104

a) Find the average number of mortgages approved per week. Hint: The original version of the problem came up with a mean of 2.1058 approvals per week. You will come out with something closer to a mean that you actually can find in your tables. (1)
b) If the data followed an appropriate Poisson distribution exactly, what would the numbers of weeks
be with 0, 1, 2, 3, 4, 5, 6, 7 and 8 or more approvals be? (3)
c) Use a statistical test to compare the number of actual approvals with the distribution you found in the
last section. (3)
d) If all this sounds like too much work, guess the mean and compare the observed data with a Poisson
distribution with the mean that you guessed. Do not use a Chi-squared method in d. (5)
[37]
Solution: a)

Number Approved $x$      $O$     $xO$
0                     16       0
1                     25      25
2                     31      62
3                     17      51
4                      8      32
5                      5      25
6                      1       6
7                      1       7
8 or more              0       0
Total                104     208

This implies mean $= \frac{208}{104} = 2$.
b) The f column below was copied from the Poisson table for a parameter of 2. It actually included the following probabilities: for 5, .036089; for 6, .012030; for 7, .003437; for 8, .000859; for 9, .000191; for 10, .000038; for 11, .000007 and for 12, .000001. If you multiply these numbers by 104, you will get values of E that are less than 5. They were lumped together in a single class for 5 and over. Note that, if you are rounding these quantities, the easiest way to get the last probability is $P(x \ge 5) = 1 - P(x \le 4) = 1 - .94735 = .05265$. So $E = fn = f(104)$.

Row    $x$     $f$           $E$
1      0     0.135335    14.0748
2      1     0.270671    28.1498
3      2     0.270671    28.1498
4      3     0.180447    18.7665
5      4     0.090224     9.3833
6      5+    0.052652     5.4758
c) The Chi-squared computations are below. Both the traditional and short-cut method are shown.

Row      $O$         $E$     $D = E - O$     $D^2$     $D^2/E$     $O^2/E$
1         16     14.0748     -1.92516    3.70624    0.263324     18.1885
2         25     28.1498      3.14978    9.92114    0.352441     22.2027
3         31     28.1498     -2.85022    8.12373    0.288589     34.1388
4         17     18.7665      1.76649    3.12048    0.166279     15.3998
5          8      9.3833      1.38330    1.91351    0.203927      6.8206
6          7      5.4758     -1.52419    2.32316    0.424259      8.9485
Total    104    104.0000      0.00000               1.69882     105.6988
4 
The table chi-square for 6 – 1 – 1 = 4 degrees of freedom is  2 .05  9.4877 . (We lost a degree of freedom
because we estimated a parameter from the data.) Our computed chi-square is 105 .6988  104  1.6988 .
Our null hypothesis is that the distribution is Poisson, and since our computed chi-square is less than the
table value we cannot reject the null hypothesis.
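The goodness-of-fit computation can be reproduced without a Poisson table. The sketch below assumes Python's standard library, takes the six observed class counts exactly as they appear in the O column of the chi-squared computation, and computes the Poisson probabilities directly, so the expected values differ from the table-based ones only in the last digits.

```python
from math import exp, factorial

# Observed weeks for 0, 1, 2, 3, 4 and "5 or more" approvals, as in the O column.
observed = [16, 25, 31, 17, 8, 7]
n = sum(observed)        # 104 weeks
mean = 2.0               # 208 approvals / 104 weeks

# Poisson probabilities for 0..4, with the last class taken as 1 - P(x <= 4).
probs = [exp(-mean) * mean**k / factorial(k) for k in range(5)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

# Traditional formula: sum of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# Short-cut formula: sum of O^2 / E, minus n.
chi_sq_shortcut = sum(o * o / e for o, e in zip(observed, expected)) - n

df = len(observed) - 1 - 1   # one more degree of freedom lost for the estimated mean
critical = 9.4877            # 5% table value for 4 degrees of freedom, from the solution

print(f"chi-square = {chi_sq:.4f} (short-cut {chi_sq_shortcut:.4f}), df = {df}")
print("reject H0" if chi_sq > critical else "cannot reject H0")
```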
12. A researcher randomly samples female (Sample 1) and male graduates of an MBA program. The
figures represent starting salaries. The Minitab command used here is a pull-down command that is
equivalent to the two-sample command that you used in your computer assignment.
MTB > TwoT 18 48266.7 13577.63 12 55000 11741.29;
SUBC> Alternative -1.

Two-Sample T-Test and CI

Sample    N     Mean    StDev    SE Mean
1        18    48267    13578       3200
2        12    55000    11741       3389

Difference = mu (1) - mu (2)
Estimate for difference: -6733.30
95% upper bound for difference: 1229.26
T-Test of difference = 0 (vs <): T-Value = -1.44  DF = 25  P-Value = 0.081
a) Turn in your first computer assignment only. (2)
b) For this assignment make 3 curves with centers at zero and with the t ratio marked by a vertical line. Indicate, using the three diagrams and information from the printout, what the p-value would be if the null hypothesis was
(i) $H_0: \mu_1 \le \mu_2$
(ii) $H_0: \mu_1 \ge \mu_2$
(iii) $H_0: \mu_1 = \mu_2$
Give me a number for the p-value. (1.5)
c) Using the style that I use for null hypotheses, which of the three hypotheses in b) is the null hypothesis used here and would it be rejected if the significance level was .05? (1)
d) Nothing in the command told Minitab which method to use. Of the four methods you learned to
compare two means, which one did Minitab use? (1)
[42.5]
Solution:
b)
(i) $H_0: \mu_1 \le \mu_2$, p-value $= 1 - .081 = .919$
(ii) $H_0: \mu_1 \ge \mu_2$, p-value $= .081$
(iii) $H_0: \mu_1 = \mu_2$, p-value $= 2(.081) = .161$
Diagram: It says T-Value = -1.44 so make three diagrams with an almost Normal curve and a mean at zero. (i) .919 is the area above -1.44, (ii) .081 is the area below -1.44, (iii) .161 is the area above +1.44 and the area below -1.44. (1.5)
c) It says T-Test of difference = 0 (vs <). This says $H_1: \mu_1 < \mu_2$, so the null hypothesis is $H_0: \mu_1 \ge \mu_2$. P-value is above .05, so do not reject the null hypothesis.
d) Of the four methods you learned to compare two means, Minitab uses D3 when there are
independent samples and no explicit directions that say to use a pooled or equal variance method.
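The T-Value and DF on the printout can be reproduced from the summary numbers in the TwoT command, which is one way to confirm that an unequal-variance method was used. The sketch below assumes plain Python; it takes the one-sided p-value .081 from the printout rather than computing it, since the standard library has no t distribution.

```python
from math import sqrt, floor

# Summary statistics as typed into the TwoT command.
n1, mean1, sd1 = 18, 48266.7, 13577.63
n2, mean2, sd2 = 12, 55000.0, 11741.29

v1 = sd1**2 / n1                 # squared standard error of mean 1
v2 = sd2**2 / n2                 # squared standard error of mean 2
se_diff = sqrt(v1 + v2)          # standard error of the difference (variances not pooled)
t = (mean1 - mean2) / se_diff

# Unequal-variance degrees of freedom, rounded down.
df = floor((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))

# p-values for the three null hypotheses in b), built from the printout's .081.
p_printout = 0.081
p_values = {
    "(i) H0: mu1 <= mu2": 1 - p_printout,
    "(ii) H0: mu1 >= mu2": p_printout,
    "(iii) H0: mu1 = mu2": 2 * p_printout,
}

print(f"t = {t:.2f}, df = {df}")
for label, p in p_values.items():
    print(label, round(p, 3))
```

Doubling the rounded .081 gives .162 rather than the .161 in the solution; the small gap presumably comes from rounding in the printed p-value.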
ECO252 QBA2
SECOND EXAM
Nov 8-9, 2005
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
III. Do sections adding to at least 20 points (Anything extra you do helps, and grades wrap around). Show
your work! State H 0 and H 1 where appropriate. You have not done a hypothesis test unless you
have stated your hypotheses, run the numbers and stated your conclusion. (Use a 95% confidence
level unless another level is specified.) Answers without reasons are not usually acceptable. Neatness
counts! Check the website regularly for hints or corrections.
1) A state is trying to figure out whether the background on highway signs makes a difference. In order to
do this two samples of 15 individuals are shown a number of slides rapidly. The slides have either a green
or a red background. You are trying to find out whether there is a difference between the number of slides
correctly read between those with a red or a green background. To do so you will compare the mean or
median as appropriate to the distribution. To personalize the data, look at the third digit from the end of your student number to
decide what red data you will use. Call the column that you pick rj and compute a column called dj with the
formula dj = green – rj. (Example: Seymour Butz’s student number is 976512, so he picks column 5 and
uses d5 = green – r5.) Tell me what column you are using! If you compare means state your hypotheses
both in terms of $\mu_1$ and $\mu_2$ and in terms of $D = \mu_1 - \mu_2$.
Row  green   r0   r1   r2   r3   r4   r5   r6   r7   r8   r9
  1      8    9    5    7    6    5    8    9    7   10    5
  2     10   10    9   10   12    6    9    7    9    9    6
  3      6    9    5   11    7    5    6    6   10    7    6
  4      5    4    7    9   10    6   10    7   11   10    7
  5      9    9    6    8   12    5    7    8   11    6   10
  6      7    9    7    8   10   11    7    7    6    6   10
  7      3    8    9    8    9    8    5    6   11    7   10
  8      7    7    8    7   11    8    6    6    6    8   13
  9      7    5    7    9    9    7   11    7    9    8    9
 10      3    6    5    9    8    9    9    7    7    7    6
 11      6   10    7    7    7   10   10    5    7    9    8
 12      6    8    5    7    6    7    7    7    7   11    6
 13      8    8    8   10   11    7    7   10    9    6    7
 14      3    9    7    7   13    6    8    7    8    8    8
 15     10    9    7    6    8    5    8    6    8    7   10
Minitab computed some basic statistics from the data which will help you in some parts of this problem.
Variable    N    Mean   SE Mean   StDev   Minimum      Q1   Median       Q3   Maximum
green      15   6.533     0.601   2.326     3.000   5.000    7.000    8.000    10.000
r0         15   8.000     0.458   1.773     4.000   7.000    9.000    9.000    10.000
r1         15   6.800     0.355   1.373     5.000   5.000    7.000    8.000     9.000
r2         15   8.200     0.368   1.424     6.000   7.000    8.000    9.000    11.000
r3         15   9.267     0.581   2.251     6.000   7.000    9.000   11.000    13.000
r4         15   7.000     0.488   1.890     5.000   5.000    7.000    8.000    11.000
r5         15   7.867     0.435   1.685     5.000   7.000    8.000    9.000    11.000
r6         15   7.000     0.324   1.254     5.000   6.000    7.000    7.000    10.000
r7         15   8.400     0.456   1.765     6.000   7.000    8.000   10.000    11.000
r8         15   7.933     0.408   1.580     6.000   7.000    8.000    9.000    11.000
r9         15   8.067     0.573   2.219     5.000   6.000    8.000   10.000    13.000
a) Display the numbers that you are using in columns and compute a sample mean and sample standard
deviation for the d column. (1)
In this problem assume that the red and green data are two independent samples. Use a confidence level of
95%.
b) Assume that you believe that the normal distribution does not apply to the data and compare the means or
medians as appropriate. (4)
c) You suspect that the data has the Normal distribution. Test to see if the Normal distribution applies. Use
a test that I taught you. (3)
d) You decide that the Normal distribution applies to the data, but do not know if the variances are equal.
Test them for equality. (1)
e) You conclude that the underlying distributions are Normal and that the population variances are equal.
Compare the means or medians as appropriate. Use a test ratio, critical value or a confidence interval (4) or
all three (6).
[15]
f) (Extra credit) You conclude that the underlying distributions are Normal and that the population
variances are not equal. Compare the means or medians as appropriate. Use a test ratio, critical value or a
confidence interval (5) or all three (7)
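For part a) of problem 1, the d column and its mean and standard deviation can be checked as follows. This is a sketch for the worked example only: it assumes Python's standard library and uses column r5, the one Seymour Butz picks; substitute your own column.

```python
from statistics import mean, stdev

# Columns from the data table: green, and r5 for the example student.
green = [8, 10, 6, 5, 9, 7, 3, 7, 7, 3, 6, 6, 8, 3, 10]
r5 = [8, 9, 6, 10, 7, 7, 5, 6, 11, 9, 10, 7, 7, 8, 8]

# d5 = green - r5, row by row.
d5 = [g - r for g, r in zip(green, r5)]

d_bar = mean(d5)     # sample mean of the differences
s_d = stdev(d5)      # sample standard deviation (divides by n - 1)

print(d5)
print(f"mean = {d_bar:.4f}, standard deviation = {s_d:.4f}")
```

The mean of d5 should equal the difference between the Minitab means for green and r5, which is a quick consistency check on the column.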
2) In fact the data on the previous page applies to a single sample of 15 individuals. That is, the first line of
your worksheet tells you how the first person in the sample did when shown the same slides with red or
green backgrounds. This applies to a) and b) in this question. Use a confidence level of 95%.
a) Assume that you believe that the normal distribution does not apply to the data and compare the means or
medians as appropriate. (3)
b) You assume that the data has the Normal distribution. Compare the means or medians as appropriate. (3)
c) For any part of one of these problems (tell me which one!), compute a confidence interval that you would
use to compare means if your alternate hypothesis was H 1 :  2  1 . (2)
[23]
d) For the same part as you used in c), find a p-value for the null hypothesis. (2) [25]
These results are all supposed to look to me as if you did them by hand. But what I don’t know won’t hurt me.
If you want to check your results by computer, you might try to use the following Minitab routine. Put
green in C1 and label columns with headings like rj, dj and dsqj (Seymour called his green, r5, d5 and
dsq5). The routine below, with appropriate changes to rj, dj and dsqj, will compute much of the stuff above,
though not in the right order. Note that to do a Wilcoxon signed rank test by hand, you will have to drop all
zeroes from the d column.
Computations for comparing c1 and rj
MTB > let dj = c1 - rj
MTB > let dsqj = dj *dj
MTB > print c1 rj dj dsqj
MTB > describe c1 rj dj
MTB > sum dj
MTB > ssq dj
MTB > TwoSample c1 rj;
SUBC> Pooled.
MTB > TwoSample c1 rj.
MTB > Paired c1 'rj'.
MTB > VarTest c1 'rj';
SUBC> Unstacked.
MTB > WTest 0.0 'dj';
SUBC> Alternative 0.
MTB > Mann-Whitney 95.0 c1 'rj';
SUBC> Alternative 0.
If you want to fake the calculations for the Mann-Whitney test, try this.
Procedure for setting up Mann-Whitney Test
#c1 is green, c2 is rj, c3 is difference.
# Mann Whitney Test
MTB > Stack c1 c2 c5;
SUBC> Subscripts c6.
MTB > Rank c5 c7.
MTB > Unstack (c7);
SUBC> Subscripts c6;
SUBC> After;
SUBC> VarNames.
MTB > sum c8
MTB > sum c9
MTB > print c1 c8 c2 c9 #The rest is up to you.
If you want to fake the calculations for the Wilcoxon signed rank test, try this. Unfortunately, I know no
good way to remove the zeros or change the signs except by hand.
Procedure for setting up Wilcoxon Signed Rank Test
#c1 is green, c2 is rj, c3 is difference.
MTB > Let c3 = c1-c2 #Maybe you already did this.
MTB > # Wilcoxon signed rank test
MTB > let c10 = c3
MTB > #Remove zeroes from c10. (Just use delete on the cells with zeros.)
MTB > #Notice that n has gotten smaller.
MTB > let c11 = abs(c10)
MTB > rank c11 c12
MTB > let c13 = c12
MTB > #Change signs in c13 to agree with signs in c10.
MTB > let c14 = c13 *c10
#Check on signs. All should be positive.
# Aside from this, consider c14 garbage.
MTB > print c10 c11 c12 c13 #You now have the four columns that I computed in
MTB > #the examples. The totals are up to you.
3) The results of a Gallup phone survey appear below. Consumers were asked if they objected to having
their medical records shared with different types of organizations. Results follow.
The proportion in a sample of 1000 who objected to sharing with insurance companies was $p_1 = .820$.
The proportion in a sample of 1000 who objected to sharing with pharmacies was $p_2 = .590$.
The proportion in a sample of 1000 who objected to sharing with medical researchers was $p_3 = .670$.
Personalize the data by using the second to last digit of your student number, call it d. Multiply it by .001.
Call the result .00d. If the second to last number is zero, use .00d = .010. Add .00d to .820 and subtract
.00d from .670. (Example: Seymour Butz’s student number is 976532, so he adds .003 to .820 and gets
.823 and subtracts .003 from .670, getting .667. He leaves .590 alone.)
a) Is the proportion of people who object different for different institutions? $\alpha = .01$. (4)
b) If appropriate, use the Marascuilo procedure to determine which organizations are different. Discuss. (3)
[32]