252y0321

3/26/03 252y0321
(Page layout view!)
ECO252 QBA2
Name KEY
SECOND HOUR EXAM Hour of Class Registered _______
March 27 - 28, 2003
I. (40 points) Do all the following (2 points each unless noted otherwise).
A table identifying methods for comparing 2 samples is at the end of the exam.
1.
A powerful women’s group has claimed that men and women differ in attitudes about sexual
discrimination. A group of 50 men (group 1) and 40 women (group 2) were asked if they thought
sexual discrimination is a problem in the United States. Of those sampled, 11 of the men and 19 of
the women did believe that sexual discrimination is a problem. If the p-value turns out to be 0.035
(which is not the real value in this data set), then
a) at $\alpha$ = 0.05, we should fail to reject H0
b) *at $\alpha$ = 0.04, we should reject H0
c) at $\alpha$ = 0.03, we should reject H0
d) None of the above would be correct statements.
Explanation: The rule on p-value says if the p-value is less than the significance level ($\alpha$),
reject the null hypothesis; if the p-value is greater than or equal to the significance level, do not reject
the null hypothesis.
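The p-value rule for question 1 can be sketched in a few lines of Python. The function name `decide` is hypothetical and chosen only for this illustration.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the p-value rule: reject H0 only when p-value < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

# The three significance levels offered in question 1, with p-value 0.035.
for alpha in (0.05, 0.04, 0.03):
    print(f"alpha = {alpha}: {decide(0.035, alpha)}")
```

Only the 0.03 level leads to not rejecting, which is why answer b) is the correct statement.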
Table 12-4
A few years ago, Pepsi invited consumers to take the “Pepsi
Challenge.” Consumers were asked to decide which of two sodas, Coke
or Pepsi, they preferred in a blind taste test. Pepsi was interested
in determining what factors played a role in people’s taste
preferences. One of the factors studied was the gender of the
consumer. Below are the results of analyses comparing the taste
preferences of men and women with the proportions depicting
preference for Pepsi.
Males: n = 109, pSM = 0.422018
Females: n = 52, pSF = 0.25
pSM – pSF = 0.172018, z = 2.11825
(Note that H1 may differ in questions 2 and 3.)
2.
Referring to Table 12-4, to determine if a difference exists in the taste preferences of men and
women, give the correct alternative hypothesis that Pepsi would test.
a) H1:  M –  F  0
b) H1:  M –  F  0
c) *H1: pM – pF ≠ 0
d) H1: pM – pF  0
3.
Referring to Table 12-4, suppose Pepsi wanted to test to determine if the males preferred Pepsi less
than the females. Using the test statistic given ($z = 2.11825$), compute the appropriate p-value for
the test.
a) 0.0170
b) 0.0340
c) 0.9660
d) *0.9830
Explanation: The alternative hypothesis is now $H_1: p_M < p_F$ or $H_1: p_M - p_F < 0$. If $z = 2.12$, we
want $P(z \le 2.12) = .5 + .4830 = .9830$.
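As a check on Table 12-4, the following Python sketch recomputes z and the three possible p-values. The counts 46 and 13 are an assumption: they are back-calculated from the reported proportions (0.422018 × 109 ≈ 46 and 0.25 × 52 = 13) and do not appear in the table itself.

```python
from math import sqrt
from statistics import NormalDist

# Assumed counts, back-calculated from the reported sample proportions.
x_m, n_m = 46, 109   # males preferring Pepsi
x_f, n_f = 13, 52    # females preferring Pepsi

p_m, p_f = x_m / n_m, x_f / n_f
p_pooled = (x_m + x_f) / (n_m + n_f)
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))
z = (p_m - p_f) / se

p_left = NormalDist().cdf(z)        # H1: pM - pF < 0 (question 3)
p_right = 1 - p_left                # H1: pM - pF > 0
p_two = 2 * min(p_left, p_right)    # H1: pM - pF != 0 (question 2)

print(round(z, 5), round(p_left, 4), round(p_right, 4), round(p_two, 4))
```

The left-tail value is the one question 3 asks for; it is close to 1 because the observed difference points in the opposite direction from the alternative.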
4.
The amount of time necessary for assembly line workers to complete a product is a normal random
variable with a mean of 15 minutes and a standard deviation of 2.1 minutes. The probability is
__________ that a product is assembled in more than 20 minutes.
Solution: $x \sim N(15, 2.1)$. $P(x > 20) = P\left(z > \frac{20 - 15}{2.1}\right) = P(z > 2.38) = .5 - .4913 = .0087$
Make a diagram!
5.
The amount of time necessary for assembly line workers to complete a product is a normal random
variable with a mean of 15 minutes and a standard deviation of 2.3 minutes.
Find $x_{.275}$ for this distribution.
Solution: $x \sim N(15, 2.3)$. We want a point, $x_{.275}$, so that $P(x > x_{.275}) = .2750$. The corresponding value of z has $P(z > z_{.275}) = .2750$ and $P(z < z_{.275}) = .7250$. If this is true, and zero is
the median, we must have $P(0 < z < z_{.275}) = .2250$. The closest we can come on the Normal table
is $P(0 \le z \le 0.60) = .2257$. So $z_{.275} = 0.60$, and $x_{.275} = \mu + z\sigma = 15 + 0.60(2.3) = 16.38$.
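Questions 4 and 5 can be checked without a Normal table. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only. Small differences from the table answers come from rounding z to two decimals.

```python
from statistics import NormalDist

# Question 4: P(x > 20) when x ~ N(15, 2.1).
assembly_q4 = NormalDist(mu=15, sigma=2.1)
p_over_20 = 1 - assembly_q4.cdf(20)

# Question 5: the point with 27.5% of the distribution above it, x ~ N(15, 2.3).
assembly_q5 = NormalDist(mu=15, sigma=2.3)
x_275 = assembly_q5.inv_cdf(1 - 0.275)

print(round(p_over_20, 4), round(x_275, 2))
```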
6.
A local real estate appraiser compared the sales prices of homes in 2 neighborhoods to the
corresponding appraised values of the homes. The goal of the analysis was to compare the
distribution of sale-to-appraised ratios from homes in the 2 neighborhoods. Random and
independent samples were selected from the 2 neighborhoods from last year’s homes sales, 8 from
each of the 2 neighborhoods. Identify the nonparametric method that would be used to analyze the
data.
a) the Wilcoxon Signed-Ranks Test, D5b, using the test statistic Z
b) the Wilcoxon Signed-Ranks Test, D5b, using the table test statistic W
c) *the Wilcoxon Rank Sum Test, D5a, using the table test statistic T
d) the Wilcoxon Rank Sum Test, using the test statistic Z
7.
Turn in your computer output from computer problem 1 only tucked inside this exam paper. (3
points - 2 point penalty for not handing this in.)
TABLE 10-5
To test the effects of a business school preparation course, 8
students took a general business test before the course. The same
eight students took the test after the course. The data are given
below.
Student   Before x1   After x2   d = x2 - x1     d^2
   1         530         670         140       19600
   2         690         770          80        6400
   3         910        1000          90        8100
   4         700         710          10         100
   5         450         550         100       10000
   6         820         870          50        2500
   7         820         770         -50        2500
   8         630         610         -20         400
 Sum                                 400       49600
Note: The difference columns and the sums are not in the original problem.
Two tests were run using Minitab. (Only one should have been run!). The results follow:
Table 1:
Results for: 2x0321-1.MTW
MTB > TwoSample c2 c1;
SUBC>   Alternative 1.

Two-Sample T-Test and CI: after, before
Two-sample T for after vs before
          N    Mean   StDev   SE Mean
after     8     744     144        51
before    8     694     155        55

Difference = mu after - mu before
Estimate for difference: 50.0
95% lower bound for difference: -82.6
T-Test of difference = 0 (vs >): T-Value = 0.67  P-Value = 0.258  DF = 13

MTB > Paired c2 c1;
SUBC>   Alternative 1.

Paired T-Test and CI: after, before
Paired T for after - before
              N    Mean   StDev   SE Mean
after         8   743.8   143.9      50.9
before        8   693.8   155.4      54.9
Difference    8    50.0    65.0      23.0

95% lower bound for mean difference: 6.4
T-Test of mean difference = 0 (vs > 0): T-Value = 2.17  P-Value = 0.033
8.
Compute the standard deviation of the d column and use it to fill in the underlined blanks in the
second of the two tests in Table 1. Show your work! (4 points - 2 point penalty for not computing
the variance.)
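Solution: With $\bar{d} = \frac{400}{8} = 50$, $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n - 1} = \frac{49600 - 8(50)^2}{7} = 4228.5714$, so $s_d = \sqrt{4228.5714} = 65.027$.
$s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{65.027}{\sqrt{8}} = \sqrt{\frac{4228.5714}{8}} = 22.9907$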
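The computation in question 8 can be reproduced from the raw scores in Table 10-5. This is a minimal sketch using the same computational formula for the variance; the variable names are illustrative only.

```python
from math import sqrt

before = [530, 690, 910, 700, 450, 820, 820, 630]
after = [670, 770, 1000, 710, 550, 870, 770, 610]

d = [a - b for a, b in zip(after, before)]   # d = x2 - x1
n = len(d)
d_bar = sum(d) / n

# Computational formula: (sum of d^2 - n * d_bar^2) / (n - 1)
var_d = (sum(x * x for x in d) - n * d_bar ** 2) / (n - 1)
s_d = sqrt(var_d)          # standard deviation of the differences
se_d_bar = s_d / sqrt(n)   # standard error of the mean difference
t_ratio = d_bar / se_d_bar

print(round(var_d, 4), round(s_d, 3), round(se_d_bar, 4), round(t_ratio, 2))
```

The last two values correspond to the StDev and SE Mean entries on the Difference row of the paired output, and the ratio matches the reported T-Value of 2.17.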

9.
What was the alternate hypothesis tested in Table 1? (1)
$H_1: D > 0$ (with $D = \mu_2 - \mu_1$, after minus before) or $H_1: \mu_1 - \mu_2 < 0$ or $H_1: \mu_1 < \mu_2$
10. Using the correct test, state the null hypothesis and conclusion, explaining what numbers in the
tests led you to your conclusion. (2)
Solution: Since this is paired data, we use the second test. With $D = \mu_2 - \mu_1$, the null hypothesis is $H_0: D \le 0$ or
$H_0: \mu_1 - \mu_2 \ge 0$ or $H_0: \mu_1 \ge \mu_2$. The p-value is .033. Assuming $\alpha = .05$, since the p-value is
below the significance level, reject the null hypothesis.
11. The assertion has been made that tests using paired data are more powerful than tests using
independent samples. Can you point to any numbers in Table 1 that illustrate this?
Solution: Power is the probability of rejecting a false null hypothesis. Notice that the first test,
which is appropriate for independent samples, gives a p-value (.258) that is much higher than .033. So
using the same numbers and significance levels like 5% or 10%, we reject the null hypothesis in
the second test, but not the first. A more sophisticated answer might be that the standard error for
$\bar{d}$ in the first version is $\sqrt{50.9^2 + 54.9^2} \approx 74.9$, which is much larger than the 23.0 in the second
version, so that the second method produces larger t's. (This is the hardest question on the exam.)
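The contrast between the two tests in Table 1 can be reproduced from the raw data. The sketch below uses only the Python standard library and recomputes both standard errors and both t ratios; it illustrates the comparison and is not the Minitab procedure itself.

```python
from math import sqrt
from statistics import mean, stdev, variance

before = [530, 690, 910, 700, 450, 820, 820, 630]
after = [670, 770, 1000, 710, 550, 870, 770, 610]
n = len(before)
diff_of_means = mean(after) - mean(before)

# Independent-samples version: the two variances add.
se_independent = sqrt(variance(after) / n + variance(before) / n)
t_independent = diff_of_means / se_independent

# Paired version: work with the differences directly.
d = [a - b for a, b in zip(after, before)]
se_paired = stdev(d) / sqrt(n)
t_paired = mean(d) / se_paired

print(round(se_independent, 1), round(t_independent, 2))
print(round(se_paired, 1), round(t_paired, 2))
```

Both versions use the same numerator of 50; only the standard error in the denominator changes, and that is what moves the t ratio from about 0.67 to about 2.17.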
12. Given the following information, calculate the degrees of freedom that should be used in the
pooled-variance t test.
$s_1^2 = 4$, $s_2^2 = 6$, $n_1 = 16$, $n_2 = 25$
a) df = 41
b) *df = 39
c) df = 16
d) df = 25
Solution: In the pooled-variance t test, $df = n_1 + n_2 - 2 = 16 + 25 - 2 = 39$
13. Given the following information, calculate $s_p^2$, the pooled sample variance that should be used in
the pooled-variance t test.
$s_1^2 = 4$, $s_2^2 = 6$, $n_1 = 16$, $n_2 = 25$
a) $\hat{s}_p^2 = 6$
b) $\hat{s}_p^2 = 5$
c) * $\hat{s}_p^2 = 5.23$
d) $\hat{s}_p^2 = 4$
Solution: From the outline, t has $n_1 + n_2 - 2$ degrees of freedom and $s_{\bar{d}} = \sqrt{\hat{s}_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where
$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{15(4) + 24(6)}{15 + 24} = \frac{60 + 144}{39} = \frac{204}{39} = 5.23$.
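Questions 12 and 13 use the same two ingredients, so one short Python sketch covers both. The helper name `pooled_variance` is hypothetical and exists only for this illustration.

```python
def pooled_variance(s1_sq: float, n1: int, s2_sq: float, n2: int) -> tuple[float, int]:
    """Return the pooled sample variance and its degrees of freedom."""
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    return sp_sq, df

sp_sq, df = pooled_variance(s1_sq=4, n1=16, s2_sq=6, n2=25)
print(round(sp_sq, 2), df)
```

The pooled variance is a weighted average of the two sample variances, so it always lies between them; here it sits closer to 6 because the second sample is larger.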
TABLE 10-7
A perfume manufacturer is trying to choose between 2 magazine
advertising layouts. An expensive layout would include a small
package of the perfume. A cheaper layout would include a "scratch-and-sniff" sample of the product. The manufacturer would use the
more expensive layout only if there is evidence that it would lead to
a higher approval rate. The manufacturer presents the more expensive
layout to 4 groups and determines the approval rating for each group.
He presents the "scratch-and-sniff" layout to 5 groups and again
determines the approval rating of the perfume for each group. The
data are given below. Use this to test the appropriate hypotheses
with the Wilcoxon Rank Sum Test with a level of significance of 0.05.
Package   52   68   43   48
Scratch   37   43   53   39   47
14. Referring to Table 10-7, the hypotheses that should be used are:
a) H 0 : 1   2 versus H 1 : 1   2
b) H 0 : 1   2 versus H 1 : 1   2
c) H 0 : 1   2 versus H 1 : 1   2
d) * H 0 : 1   2 versus H 1 : 1   2
15. Referring to Table 10-7, the rank given to the second observation in the "scratch-and-sniff" group
is 6.5 or 3.5.
Explanation: The numbers are written down in order. The r1 and r2 orderings are correct, since
the rule is to work from the extreme of the smaller group. In this case, the 6th and 7th numbers are
identical, so they are given the average rank.
 x1    x2     r1     r2    r1*    r2*
       37            9             1
       39            8             2
 43    43    6.5    6.5    3.5    3.5
       47            5             5
 48           4             6
 52           3             7
       53            2             8
 68           1             9
Sum         14.5   30.5   25.5   19.5
16. Referring to Table 10-7, the calculated value of the test statistic is 14.5 or 19.5.
17. Referring to Table 10-7, the critical values or p-value of the test is ________.
Solution: Critical values from Table 6 are 12 and 28. From Table 5, p-value for 14.5 is between .0952
and .1429, p-value for 19.5 is between .4524 and .5476. All result in not rejecting the null hypothesis.
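The rank sums for Table 10-7 can be checked with a short pure-Python sketch. The helper `average_ranks` is hypothetical; it gives tied values the average of the positions they occupy, and it ranks either from the top (largest value gets rank 1) or from the bottom.

```python
def average_ranks(values, descending=False):
    """Map each distinct value to its rank, averaging the ranks of ties."""
    ordered = sorted(values, reverse=descending)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}

package = [52, 68, 43, 48]
scratch = [37, 43, 53, 39, 47]
combined = package + scratch

from_top = average_ranks(combined, descending=True)
from_bottom = average_ranks(combined)

r1 = sum(from_top[v] for v in package)          # ranks counted from the top
r2 = sum(from_top[v] for v in scratch)
r1_star = sum(from_bottom[v] for v in package)  # ranks counted from the bottom
r2_star = sum(from_bottom[v] for v in scratch)

print(r1, r2, r1_star, r2_star)
```

Either direction of ranking gives rank sums that add to 9(10)/2 = 45, which is a quick check that no rank was lost.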
18. Referring to Table 10-7, the perfume manufacturer will
a) *use the "scratch-and-sniff" layout because there is insufficient evidence to do otherwise.
b) use the package layout because there is insufficient evidence to do otherwise.
c) use the "scratch-and-sniff" layout because there is sufficient evidence to conclude that this
is the best course of action.
d) use the package layout because there is sufficient evidence to conclude that this is the best
course of action
TABLE 10-1
Are Japanese managers more motivated than American managers? A
randomly selected group of each were administered the Sarnoff Survey
of Attitudes Toward Life (SSATL), which measures motivation for
upward mobility. The SSATL scores are summarized below.
            Sample Size   Mean SSATL Score   Population Std. Dev.
American        211            65.75               11.07
Japanese        100            79.83                6.41
19. Referring to Table 10-1, assuming the independent samples procedure was used, choose the value
of the test statistic.
a) $z = \frac{65.75 - 79.83}{\sqrt{\frac{9.82}{211} + \frac{9.82}{100}}}$
b) $z = \frac{65.75 - 79.83}{\sqrt{\frac{11.07}{211} + \frac{6.41}{100}}}$
c) $z = \frac{65.75 - 79.83}{\sqrt{\frac{9.82^2}{211} + \frac{9.82^2}{100}}}$
d) * $z = \frac{65.75 - 79.83}{\sqrt{\frac{11.07^2}{211} + \frac{6.41^2}{100}}}$

                                                        Paired Samples   Independent Samples
Location - Normal distribution. Compare means.          Method D4        Methods D1-D3
Location - Distribution not Normal. Compare medians.    Method D5b       Method D5a
Proportions                                                              Method D6
Variability - Normal distribution. Compare variances.                    Method D7
Let’s try p-value again! Say we end up with $z = 3.00$.
If $H_1$ is $D > 0$, $\Delta p > 0$, $p > p_0$ or $\mu > \mu_0$, $pval = P(z \ge 3) = .5 - P(0 \le z \le 3)$.
If $H_1$ is $D < 0$, $\Delta p < 0$, $p < p_0$ or $\mu < \mu_0$, $pval = P(z \le 3) = .5 + P(0 \le z \le 3)$.
If $H_1$ is $D \ne 0$, $\Delta p \ne 0$, $p \ne p_0$ or $\mu \ne \mu_0$, $pval = 2P(z \ge 3) = 2[.5 - P(0 \le z \le 3)]$.
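The three p-value cases can be tabulated in a few lines. This is a minimal sketch using the standard-library `NormalDist`, with z = 3.00 taken as a positive value (an assumption, since the sign is not legible in the source).

```python
from statistics import NormalDist

z = 3.00
std_normal = NormalDist()   # mean 0, standard deviation 1

p_right_tail = 1 - std_normal.cdf(z)            # H1 uses ">"
p_left_tail = std_normal.cdf(z)                 # H1 uses "<"
p_two_tail = 2 * (1 - std_normal.cdf(abs(z)))   # H1 uses "not equal"

print(round(p_right_tail, 4), round(p_left_tail, 4), round(p_two_tail, 4))
```

Note how the same test ratio gives a tiny p-value for a right-tail alternative and a p-value near 1 for a left-tail alternative.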
ECO252 QBA2
SECOND EXAM
February 20, 21 2003
TAKE HOME SECTION
Name: _________________________
Social Security Number: _________________________
TABLE 12-6
One criterion used to evaluate employees in the assembly section of a large
factory is the number of defective pieces per 1,000 parts produced. The
quality control department wants to find out whether there is a relationship
between years of experience and defect rate. Since the job is repetitious,
after the initial training period any improvement due to a learning effect
might be offset by a loss of motivation. A defect rate is calculated for
each worker in a yearly evaluation. The results for 100 workers are given in
the table below. Before you start, replace the 9 in the upper right hand
(northeast) corner with the last digit of your Social Security Number. The total
will be between 91 and 100.
                  Years Since Training Period
Defect Rate:   < 1 Year   1 – 4 Years   5 – 9 Years
High               6           9             9
Average            9          19            23
Low                7           8            10
1) Use $\alpha = .05$ in this question.
a) Test the hypothesis that the proportion of workers with a low defect rate is larger for workers with 5-9
years experience than for workers with 1-4 years experience. State your hypotheses, find a test ratio and
explain your conclusion. (2)
b) Get a p-value for your test ratio.(1)
c) Do a 95% 2-sided confidence interval for the difference between the proportion of 5-9 year workers
with low defect rates and the proportion of 1-4 year workers with low defect rates. (2)
d) Using all the data above, test whether there is a relationship between years of experience and the defect
rate. (3)
Variations on Solution: This is the solution to the problem shown above. It is your solution if your Social
Security number ends in 9. For other solutions see 252y032app. To summarize the information in the
problem - $\alpha = .05$
Defect Rate   <1 yr   1-4 yr   5-9 yr   Total
High             6       9        9       24
Average          9      19       23       51
Low              7       8       10       25
Total           22      36       42      100
a) We are comparing $x_2 = 8$, $n_2 = 36$, $\bar{p}_2 = \frac{8}{36} = .2222$ and $x_3 = 10$, $n_3 = 42$, $\bar{p}_3 = \frac{10}{42} = .2381$.
We are testing $H_1: p_3 > p_2$. So the null hypothesis is $H_0: p_3 \le p_2$.
Let $\Delta p = p_2 - p_3$. So $\Delta\bar{p} = \bar{p}_2 - \bar{p}_3 = .2222 - .2381 = -.0159$ and our hypotheses become
$H_0: p_2 - p_3 \ge 0$ and $H_1: p_2 - p_3 < 0$, or $H_0: \Delta p \ge 0$ and $H_1: \Delta p < 0$.
$s_{\Delta p} = \sqrt{\frac{\bar{p}_2\bar{q}_2}{n_2} + \frac{\bar{p}_3\bar{q}_3}{n_3}} = \sqrt{\frac{.2222(.7778)}{36} + \frac{.2381(.7619)}{42}} = \sqrt{.004801 + .004319} = \sqrt{.009120} = .0955$
$\bar{p}_0 = \frac{n_2\bar{p}_2 + n_3\bar{p}_3}{n_2 + n_3} = \frac{36(.2222) + 42(.2381)}{36 + 42} = \frac{8 + 10}{36 + 42} = \frac{18}{78} = .2308$
$\alpha = .05$, $z_\alpha = z_{.05} = 1.645$, $z_{\alpha/2} = z_{.025} = 1.960$. Note that $q = 1 - p$ and that q and p are between zero
and one.
$\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_2} + \frac{1}{n_3}\right)} = \sqrt{.2308(.7692)\left(\frac{1}{36} + \frac{1}{42}\right)} = \sqrt{.17753(.05159)} = \sqrt{.009158} = .0957$
Use one of the following:
Confidence interval: Since the alternate hypothesis is $H_1: \Delta p < 0$, the confidence interval will be
$\Delta p \le \Delta\bar{p} + z_\alpha s_{\Delta p} = -0.0159 + 1.645(.0955)$ or $\Delta p \le 0.1412$. This does not contradict $H_0: \Delta p \ge 0$ since
any value of $\Delta p$ between 0 and .1412 satisfies both the null hypothesis and the confidence interval, so do
not reject $H_0$.
Test ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.0159 - 0}{.0957} = -0.166$. Make a diagram of a Normal curve with zero in the
middle. The ‘reject’ zone is the area below $-z_\alpha = -z_{.05} = -1.645$. Since the test ratio is not in this zone, do
not reject $H_0$.
Critical value: Because the alternate hypothesis is $H_1: \Delta p < 0$, we need a critical value below zero. Use
$\Delta\bar{p}_{cv} = \Delta p_0 - z_\alpha\sigma_{\Delta p} = 0 - 1.645(.0957) = -.1574$. Make a diagram of a Normal curve with zero in the
middle. The ‘reject’ zone is the area below -.1574. Since $\Delta\bar{p} = -.0159$ is not in this zone, do not reject $H_0$.
b) The p-value for this problem is $P(\Delta\bar{p} \le -.0159) = P(z \le -0.17) = .5 - .0675 = .4325$. Since this is not
below $\alpha = .05$, do not reject $H_0$.
c) $\Delta p = \Delta\bar{p} \pm z_{\alpha/2} s_{\Delta p} = -0.0159 \pm 1.960(.0955) = -.0159 \pm .1872$ or -.2031 to .1713
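Parts a) through c) can be reproduced with the sketch below, which assumes the version of the table with 9 in the upper right corner. It carries full precision, so the p-value differs slightly from the table-based .4325, which used z rounded to -0.17. Variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x2, n2 = 8, 36    # low defect rate, 1-4 years
x3, n3 = 10, 42   # low defect rate, 5-9 years

p2, p3 = x2 / n2, x3 / n3
diff = p2 - p3

# Standard error for the confidence interval (no pooling).
s_diff = sqrt(p2 * (1 - p2) / n2 + p3 * (1 - p3) / n3)

# Standard error for the test ratio (pooled under H0).
p0 = (x2 + x3) / (n2 + n3)
sigma_diff = sqrt(p0 * (1 - p0) * (1 / n2 + 1 / n3))

z = diff / sigma_diff
p_value = NormalDist().cdf(z)           # left tail, since H1 is diff < 0

z_half = NormalDist().inv_cdf(0.975)    # about 1.960
ci = (diff - z_half * s_diff, diff + z_half * s_diff)

print(round(z, 3), round(p_value, 4), [round(b, 4) for b in ci])
```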
d) $n = 100$. The proportions in rows, $p_r$, are used with column totals to get the items in E. O is the table of
observed counts above. Note that row and column sums in E are the same as in O except for a possible small rounding
error. (Note that $\chi^2$ is computed two different ways here - only one way is needed.)
$H_0$: years and defect rate independent

E        <1 yr   1-4 yr   5-9 yr   Total    p_r
H         5.28     8.64    10.08      24    .24
A        11.22    18.36    21.42      51    .51
L         5.50     9.00    10.50      25    .25
Total    22.00    36.00    42.00     100   1.00

Row      O       E      E-O    (E-O)^2   (E-O)^2/E      O^2/E
1        6     5.28   -0.72    0.5184    0.098182      6.8182
2        9    11.22    2.22    4.9284    0.439251      7.2193
3        7     5.50   -1.50    2.2500    0.409091      8.9091
4        9     8.64   -0.36    0.1296    0.015000      9.3750
5       19    18.36   -0.64    0.4096    0.022309     19.6623
6        8     9.00    1.00    1.0000    0.111111      7.1111
7        9    10.08    1.08    1.1664    0.115714      8.0357
8       23    21.42   -1.58    2.4964    0.116545     24.6965
9       10    10.50    0.50    0.2500    0.023810      9.5238
Total  100   100       0.00              1.35101     101.3510

$DF = (r - 1)(c - 1) = 2(2) = 4$, $\chi^{2\,(4)}_{.05} = 9.4877$
$\chi^2 = \sum\frac{(E - O)^2}{E} = 1.351$ or $\chi^2 = \sum\frac{O^2}{E} - n = 101.3510 - 100 = 1.351$
Since this is less than 9.4877, do not reject $H_0$.
(Diagram!)
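The chi-squared computation in part d) can be sketched in pure Python. The critical value 9.4877 is copied from the solution above rather than computed, and the variable names are illustrative only.

```python
observed = [
    [6, 9, 9],     # High
    [9, 19, 23],   # Average
    [7, 8, 10],    # Low
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row proportion times column total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

pairs = [(o, e) for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row)]
chi_sq = sum((e - o) ** 2 / e for o, e in pairs)
chi_sq_shortcut = sum(o * o / e for o, e in pairs) - n   # second method

df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical_value = 9.4877   # chi-squared table value for 4 df at alpha = .05
reject_h0 = chi_sq > critical_value

print(round(chi_sq, 3), round(chi_sq_shortcut, 3), df, reject_h0)
```

Both formulas give the same statistic, which is the point of the remark that only one of them is needed.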


2) A firm has been experimenting with two separate assembly line arrangements and finds the following for
the number of finished units a day. Each sample represents 21 days work. For the first arrangement the
(sample) mean units produced per day were 85 with a (sample) variance of 1200. For the second
arrangement the mean was 87 and the variance 3500. Use $\alpha = .10$.
a) Test the variances for equality. (2)
b) (Extra credit) Using the results of the test in a), test the equality of the means. You may use a test ratio,
critical value or a confidence interval (4 points) or all three of these (6 points – assuming that you get the
same conclusion for all of them) .
c) (Extra credit) Given the results of both tests, write a short essay with your recommendations as to
which of the two arrangements to use. (2)
Solution: The facts given in the problem are $n_1 = 21$, $\bar{x}_1 = 85$, $s_1^2 = 1200$, $n_2 = 21$, $\bar{x}_2 = 87$, $s_2^2 = 3500$
and $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$. The null hypothesis is the same as $H_0: D = 0$, $H_1: D \ne 0$ if
$D = \mu_1 - \mu_2$ and $\bar{d} = \bar{x}_1 - \bar{x}_2 = 85 - 87 = -2$. $\alpha = .10$.
a) Use an F test on the sample variances to see if the population variances are equal.
Since we are comparing variances, we use Method D7. $DF_1 = n_1 - 1 = 20$ and $DF_2 = n_2 - 1 = 20$. Since the
table is set up for one-sided tests, if we wish to test $H_0: \sigma_1^2 = \sigma_2^2$, we must do two separate one-sided
tests. First test $\frac{s_1^2}{s_2^2} = \frac{1200}{3500} = 0.3429$ against $F_{.05}^{(20,20)} = 2.12$ and then test
$\frac{s_2^2}{s_1^2} = \frac{3500}{1200} = 2.9167$ against $F_{.05}^{(20,20)} = 2.12$. If either ratio exceeds the table F, we reject the null
hypothesis. Since 2.9167 is above the table F, we
reject the null hypothesis of equal variances and say that the variances are not equal. We should use Method
D3, a method for comparing the means that allows unequal variances.
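The two one-sided F comparisons can be sketched as follows. The table value 2.12 is copied from the solution, not computed, and the variable names are illustrative only.

```python
s1_sq, s2_sq = 1200, 3500
f_table = 2.12   # F table value for (20, 20) df at .05, taken from the text

# Two one-sided tests: each ratio is compared with the same table value.
ratios = (s1_sq / s2_sq, s2_sq / s1_sq)
reject_equal_variances = any(ratio > f_table for ratio in ratios)

print([round(r, 4) for r in ratios], reject_equal_variances)
```

In practice only the ratio with the larger variance on top can exceed the table value, so checking that one ratio is enough.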
b) First find degrees of freedom and a value of $s_{\bar{d}}$ for this problem.
$\frac{s_1^2}{n_1} = \frac{1200}{21} = 57.1429$, $\frac{s_2^2}{n_2} = \frac{3500}{21} = 166.667$, so $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 57.1429 + 166.667 = 223.810$.
$\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} = \frac{(57.1429)^2}{20} = 163.265$, $\frac{\left(s_2^2/n_2\right)^2}{n_2 - 1} = \frac{(166.667)^2}{20} = 1388.89$,
so $\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1} = 163.265 + 1388.89 = 1552.15$.
Finally $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{(223.810)^2}{1552.15} = 32.2717$. To be conservative, use 32 degrees of freedom.
$s_{\bar{d}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{223.810} = 14.9603$
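The degrees-of-freedom formula for unequal variances is easy to mistype by hand, so here is a minimal Python sketch of the same arithmetic. Variable names are illustrative only.

```python
from math import floor, sqrt

n1, s1_sq = 21, 1200
n2, s2_sq = 21, 3500

v1 = s1_sq / n1   # s1^2 / n1
v2 = s2_sq / n2   # s2^2 / n2

# Approximate degrees of freedom when variances are not assumed equal.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
df_conservative = floor(df)        # round down to be conservative
se_d_bar = sqrt(v1 + v2)           # standard error of the difference in means

print(round(df, 4), df_conservative, round(se_d_bar, 4))
```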
Now do at least one of the following using $t_{\alpha/2} = t_{.05}^{(32)} = 1.694$.
Confidence Interval: $D = \bar{d} \pm t_{\alpha/2} s_{\bar{d}} = -2 \pm 1.694(14.9603) = -2 \pm 25.343$. This interval obviously includes
zero, so do not reject $H_0$.
Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-2 - 0}{14.9603} = -0.134$. Make a diagram of an almost Normal curve with a mean at
zero and ‘reject’ zones above $t_{\alpha/2} = t_{.05}^{(32)} = 1.694$ and below $-t_{\alpha/2} = -t_{.05}^{(32)} = -1.694$. Since the test ratio does
not fall into the ‘reject’ zones, do not reject $H_0$.
Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2} s_{\bar{d}} = 0 \pm 1.694(14.9603) = \pm 25.343$. Make a diagram of an almost Normal
curve with a mean at zero and ‘reject’ zones above 25.343 and below -25.343. Since $\bar{d} = -2$ does not
fall into the ‘reject’ zones, do not reject $H_0$.
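The three approaches must agree, and a short sketch makes that visible. The t value 1.694 is copied from the solution (a table lookup), and the other inputs are the values computed above; variable names are illustrative only.

```python
d_bar = -2.0          # difference between the sample means
se_d_bar = 14.9603    # standard error from the unequal-variance formula
t_crit = 1.694        # t table value for 32 df, two-sided alpha = .10
d_null = 0.0          # hypothesized difference

margin = t_crit * se_d_bar

# 1. Confidence interval: reject if the null value lies outside it.
ci = (d_bar - margin, d_bar + margin)
reject_by_ci = not (ci[0] <= d_null <= ci[1])

# 2. Test ratio: reject if |t| exceeds the table value.
t_ratio = (d_bar - d_null) / se_d_bar
reject_by_ratio = abs(t_ratio) > t_crit

# 3. Critical value: reject if d_bar lies outside d_null +/- margin.
reject_by_cv = abs(d_bar - d_null) > margin

print(round(margin, 3), round(t_ratio, 3), reject_by_ci, reject_by_ratio, reject_by_cv)
```

All three flags come out the same because each method rearranges the same inequality.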
If you stubbornly tried to assume equal variances, the degrees of freedom for the problem are
$DF = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2 = 21 + 21 - 2 = 40$, and $t_{\alpha/2} = t_{.05}^{(40)} = 1.684$. The formula for the pooled
variance is $\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{20(1200) + 20(3500)}{40} = 2350$.
$s_{\bar{d}} = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{2350\left(\frac{1}{21} + \frac{1}{21}\right)} = \sqrt{223.810} = 14.9603$
Confidence Interval: $D = \bar{d} \pm t_{\alpha/2} s_{\bar{d}} = -2 \pm 1.684(14.9603) = -2 \pm 25.193$. This interval obviously includes
zero, so do not reject $H_0$.
Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-2 - 0}{14.9603} = -0.134$. Make a diagram of an almost Normal curve with a mean at
zero and ‘reject’ zones above $t_{\alpha/2} = t_{.05}^{(40)} = 1.684$ and below $-t_{\alpha/2} = -t_{.05}^{(40)} = -1.684$. Since the test ratio does
not fall into the ‘reject’ zones, do not reject $H_0$.
Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2} s_{\bar{d}} = 0 \pm 1.684(14.9603) = \pm 25.193$. Make a diagram of an almost Normal
curve with a mean at zero and ‘reject’ zones above 25.193 and below -25.193. Since $\bar{d} = -2$ does not
fall into the ‘reject’ zones, do not reject $H_0$.
c) Any report should emphasize that the major difference is the unreliability of the second
method, shown by its significantly larger variance.