3/29/99 252z9922
2.
a. A Gallup survey of 100 US entrepreneurs asks about the origin of the car that they drive most
frequently. The answers are below. (This material will be covered in the third exam in Spring
2000)
Origin:    US    Europe    Japan
Count:     45    46        9
(i) If there is no preference among entrepreneurs as to the origin of the car that they drive, the proportions in the population with each type should be equal. Test this hypothesis using $\alpha = .10$. (4)
(ii) Redo this test using another method. (3)
b. A researcher wishes to test whether a set of data fits the distribution $z \sim N(0,1)$. The researcher observes the following: (This material will be covered in the third exam in Spring 2000)
Interval              O
Below -1.282          7
-1.282 to -0.842      5
-0.842 to -0.524      9
-0.524 to -0.253      6
-0.253 to 0.000       2
0.000 to 0.253        5
0.253 to 0.524        2
0.524 to 0.842        5
0.842 to 1.282        5
Above 1.282           4
Total                 50
(i) This problem isn’t as hard as it looks. Set up $E$. You might find this much easier to do if you look at the bottom of the t-table rather than using the normal table. (4)
(ii) Do the $\chi^2$ test and explain why this grouping might be superior to the one suggested in class for Normal data. (4)
Solution: a. $H_0$: Uniformity
(i)
O      f        E = fn      O - E       (O - E)^2/E    O^2/E
45     .3333    33.3333     11.6667     4.0833         60.75
46     .3333    33.3333     12.6667     4.8133         63.48
9      .3333    33.3333     -24.3333    17.7633        2.43
100    1.000    100.000     0.000       26.6600        126.66
Depending on which method we use, $\chi^2 = \sum \frac{(O-E)^2}{E} = 26.66$ or $\chi^2 = \sum \frac{O^2}{E} - n = 126.66 - 100 = 26.66$. We have 3 cells, so 2 degrees of freedom. Since $\chi^{2(2)}_{.10} = 4.6052$ is less than the chi-square we computed, reject $H_0$.

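The two chi-square formulas in part (i) can be checked numerically. The following is a minimal Python sketch and not part of the original solution; the function name `chi_square_gof` is an assumption made for illustration, and the critical value is simply the table value quoted in the solution.

```python
def chi_square_gof(observed, proportions):
    """Return chi-square computed two ways: sum (O-E)^2/E and sum O^2/E - n."""
    n = sum(observed)
    expected = [p * n for p in proportions]          # E = f * n
    direct = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    shortcut = sum(o ** 2 / e for o, e in zip(observed, expected)) - n
    return direct, shortcut

observed = [45, 46, 9]                               # US, Europe, Japan
direct, shortcut = chi_square_gof(observed, [1 / 3] * 3)
critical = 4.6052                                    # chi-square, alpha = .10, 2 df (table value)
print(round(direct, 2), round(shortcut, 2), direct > critical)
```

Both formulas should agree, and a statistic above the critical value means the hypothesis of equal proportions is rejected.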
(ii) Use the Kolmogorov-Smirnov Method.
O      O/n     Fo      E = fn      f = E/n    Fe        D
45     0.45    .45     33.3333     .3333      .3333     .1167
46     0.46    .91     33.3333     .3333      .6667     .2433
9      0.09    1.00    33.3333     .3333      1.0000    .0000
100    1.00            100.000     1.000
Since the critical value is $\frac{1.22}{\sqrt{100}} = 0.122$ and the maximum $D$ is larger, reject $H_0$.
b. (i) $H_0: N(0,1)$. According to the t-table $1.282 = z_{.10}$, $0.842 = z_{.20}$, $0.524 = z_{.30}$ etc. So $P(z \le -1.282) = .10$, $P(-1.282 \le z \le -0.842) = .10$ etc.
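The claim that each of the ten intervals carries a probability of about .10 can be verified directly. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are hypothetical and the sample size of 50 is taken from the problem.

```python
from statistics import NormalDist

# Interval boundaries from the problem (the z values at the bottom of the t-table).
cuts = [-1.282, -0.842, -0.524, -0.253, 0.000, 0.253, 0.524, 0.842, 1.282]

std_normal = NormalDist(mu=0.0, sigma=1.0)
cdf = [0.0] + [std_normal.cdf(c) for c in cuts] + [1.0]

# Probability of each interval is the difference of adjacent cumulative values.
probs = [hi - lo for lo, hi in zip(cdf, cdf[1:])]
expected = [50 * p for p in probs]                   # E = f * n with n = 50

print([round(p, 3) for p in probs])
print([round(e, 2) for e in expected])
```

Every expected count comes out very close to 5, which is what makes the E column trivial to set up.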
(ii)
O      f       E = fn    O - E    (O - E)^2/E    O^2/E
7      .10     5         2        0.8            9.8
5      .10     5         0        0.0            5.0
9      .10     5         4        3.2            16.2
6      .10     5         1        0.2            7.2
2      .10     5         -3       1.8            0.8
5      .10     5         0        0.0            5.0
2      .10     5         -3       1.8            0.8
5      .10     5         0        0.0            5.0
5      .10     5         0        0.0            5.0
4      .10     5         -1       0.2            3.2
50     1.00    50        0        8.0            58.0
Either $\chi^2 = \sum \frac{(O-E)^2}{E} = 8.0$ or $\chi^2 = \sum \frac{O^2}{E} - n = 58.0 - 50 = 8.0$. We have 10 cells, so 9 degrees of freedom. Since $\chi^{2(9)}_{.10} = 14.6837$ is greater than the chi-square we computed, do not reject $H_0$.
This grouping is superior to the one taught in class because if $n \ge 50$ there will be no items in the $E$ column that are below 5. Items this small usually complicate the test by compelling one to merge cells.
3.
a. In an ad that appeared in the Sunday Inquirer Parade Magazine (numbers slightly
modified), Astra Pharmaceuticals reported that the most frequent adverse reaction to its
heartburn drug, Prilosec, was a headache. In the reported test 35 out of 465 getting Prilosec
reported a headache, while 5 out of 73 getting a placebo reported a headache as did 15 out of
195 getting Ranitidine, another heartburn medication. Test if there is a significant difference
between these three proportions. ($\alpha = .05$) (6)
b. Why can’t the problem above be done by the Kolmogorov-Smirnov method? (1)
c. (Extra Credit) I lied. Actually the number that received the placebo was 62 and 4 had an
adverse reaction. Why did I have to change the numbers? (2)
d. Test the hypothesis that the proportion reporting headaches with Prilosec was lower than
with Ranitidine. (4)
Solution: (An awful lot of people tried to use the method for d. in this part of the problem - it could never work!)
a. $H_0$: Homogeneous, $H_1$: Not homogeneous. $DF = (r-1)(c-1) = 1(2) = 2$, $\chi^{2(2)}_{.05} = 5.9915$.
O
               Prilosec    Placebo    Ranitidine    sum    p_r
Headache       35          5          15            55     .075034
No headache    430         68         180           678    .924966
sum            465         73         195           733    1.00000
E
               Prilosec    Placebo    Ranitidine    p_r
Headache       34.891      5.477      14.632        .075034
No headache    430.109     67.523     180.368       .924966
sum            465.000     73.000     195.000       1.00000
The proportions in rows, $p_r$, are used with column totals to get the items in $E$. Note that row and column sums in $E$ are the same as in $O$. (Note that $\chi^2 = .055$ is computed two different ways here - only one way is needed.)
O      E          O^2/E      O - E     (O - E)^2/E
35     34.891     35.109     0.109     .00034
5      5.477      4.565      -0.477    .04154
15     14.632     15.377     0.368     .00926
430    430.109    429.891    -0.109    .00003
68     67.523     68.480     0.477     .00337
180    180.368    179.633    -0.368    .00075
733    733.000    733.055    0.000     .05530
$\chi^2 = \sum \frac{(O-E)^2}{E} = 0.055$ or $\chi^2 = \sum \frac{O^2}{E} - n = 733.055 - 733 = 0.055$. Since this is less than 5.9915, do not reject $H_0$. (Diagram!)
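The expected counts and the chi-square statistic for the homogeneity test can be reproduced with a few lines of code. This is a hedged sketch rather than the method used in class; the function name `chi_square_homogeneity` is an assumption, and the critical value 5.9915 is the table value quoted in the solution.

```python
def chi_square_homogeneity(table):
    """Chi-square for an r x c table: E = (row total / n) * column total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for r_total, row in zip(row_totals, table):
        exp_row = [r_total / n * c_total for c_total in col_totals]
        expected.append(exp_row)
        chi2 += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Rows: headache, no headache. Columns: Prilosec, placebo, Ranitidine.
observed = [[35, 5, 15], [430, 68, 180]]
chi2, df, expected = chi_square_homogeneity(observed)
print(round(chi2, 3), df, [round(e, 3) for e in expected[0]])
```

Running the same function on the numbers in part c shows the placebo cell dropping below 5, which is the complication the extra-credit question is about.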
b. (This material will be covered in the third exam in Spring 2000) The Kolmogorov-Smirnov test can
only be used when the parameters are known. In tests of independence or homogeneity, the proportions in
each row and column are the parameters and are estimated in the process of putting together E .
c. The problem is partially set up with the correct values below.
O
               Prilosec    Placebo    Ranitidine    sum    p_r
Headache       35          4          15            54     .07479
No headache    430         58         180           668    ?
sum            465         62         195           722    1.00000
E
               Prilosec    Placebo    Ranitidine    p_r
Headache       34.779      4.637      14.584        .07479
No headache    ?           ?          ?             ?
sum            465.000     62.000     195.000       1.00000
We get a cell with a value below 5, which can complicate the solution.
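d. From Table 3 of the Syllabus Supplement:
Interval for: Difference between proportions, $\Delta p = p_1 - p_2$, $q = 1 - p$.
Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$, where $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$ and $s_{\Delta p} = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_2 \bar{q}_2}{n_2}}$.
Hypotheses: $H_0: \Delta p = \Delta p_0$, $H_1: \Delta p \ne \Delta p_0$, where $\Delta p_0 = p_{01} - p_{02}$ or $\Delta p_0 = 0$.
Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}$.
If $\Delta p_0 = 0$: $\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ with $\bar{p}_0 = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$.
If $\Delta p_0 \ne 0$: $\sigma_{\Delta p} = \sqrt{\frac{p_{01} q_{01}}{n_1} + \frac{p_{02} q_{02}}{n_2}}$, or use $s_{\Delta p}$.
Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$.

$H_0: p_1 \ge p_3$, $H_1: p_1 < p_3$, or $H_0: p_1 - p_3 \ge 0$, $H_1: p_1 - p_3 < 0$. This is the same as $H_0: \Delta p \ge 0$, $H_1: \Delta p < 0$.
$\bar{p}_1 = \frac{35}{465} = .07527$, $\bar{p}_3 = \frac{15}{195} = .076923$, $\Delta\bar{p} = \bar{p}_1 - \bar{p}_3 = -.00165$, $\bar{p}_0 = \frac{35 + 15}{465 + 195} = .07576$, $\alpha = .05$, $z_{\alpha} = 1.645$.
$\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left(\frac{1}{n_1} + \frac{1}{n_3}\right)} = \sqrt{.07576(.92424)\left(\frac{1}{465} + \frac{1}{195}\right)} = .02258$
(Only one of the following methods is needed!)
Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.00165 - 0}{.02258} = -0.073$. This is above -1.645. (Diagram!)
or Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 - z_{\alpha}\, \sigma_{\Delta p} = 0 - 1.645(.02258) = -.03714$. $\Delta\bar{p} = -.00165$ is above this value.
or Confidence Interval: $\Delta p \le \Delta\bar{p} + z_{\alpha}\, s_{\Delta p}$ where $s_{\Delta p} = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_3 \bar{q}_3}{n_3}}$. (I’ll do it if you do it!) The interval includes 0. In all cases do not reject $H_0$.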
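The test ratio for the difference between two proportions can be checked with a short calculation. This sketch follows the pooled-proportion formula from Table 3 for the case where the hypothesized difference is zero; the function name `pooled_two_proportion_z` is hypothetical.

```python
from math import sqrt

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z ratio for H0: p1 - p2 = 0 using the pooled proportion p0."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)                       # pooled proportion
    sigma = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))  # standard error under H0
    return (p1 - p2) / sigma, p1 - p2, sigma

# Prilosec: 35 of 465 with headache. Ranitidine: 15 of 195.
z, diff, sigma = pooled_two_proportion_z(35, 465, 15, 195)
print(round(diff, 5), round(sigma, 5), round(z, 3))
print("reject H0" if z < -1.645 else "do not reject H0")
```

Because the alternative is that the Prilosec proportion is lower, only a z ratio below -1.645 would lead to rejection.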
4.
Two fuel additives are being tested to see whether there is a significant difference in miles per gallon for the two additives. The data is below.
x1      x2      difference
16.7    21.2    -4.5
17.3    18.6    -1.3
17.5    19.7    -2.2
18.2    22.0    -3.8
18.4    17.7    0.7
18.4    18.1    0.3
18.6    18.7    -0.1
19.1    21.2    -2.1
You may need some of the following numbers: $\alpha = .05$, $\bar{x}_1 = 18.025$, $s_1 = 0.788$, $n_1 = 8$, $\bar{x}_2 = 19.650$, $s_2 = 1.627$, $n_2 = 8$, and $\bar{d} = -1.625$, $s_d = 1.893$.
a. Test for a significant difference in miles per gallon if these are independent samples and the underlying distribution is not Normal. (5)
b. Test for a significant difference in miles per gallon if each line represents results for a single vehicle and the underlying distribution is not Normal. (5)
c. Test for a significant difference in miles per gallon if each line represents results for a single vehicle and the underlying distribution is Normal. (5)
Solution: a. Wilcoxon-Mann-Whitney Method. $H_0: \eta_1 = \eta_2$, $H_1: \eta_1 \ne \eta_2$. This is not paired data!
Rank all 16 values together. Tied values are starred (*tie) in the initial ranking; if we correct the starred items by giving tied values their average rank, we get the corrected ranks.
x1      initial r1    corrected r1        x2      initial r2    corrected r2
16.7    1             1                   17.7    4             4
17.3    2             2                   18.1    5             5
17.5    3             3                   18.6    10*           9.5
18.2    6             6                   18.7    11            11
18.4    7*            7.5                 19.7    13            13
18.4    8*            7.5                 21.2    14*           14.5
18.6    9*            9.5                 21.2    15*           14.5
19.1    12            12                  22.0    16            16
sum                   48.5                sum                   87.5
Check: $48.5 + 87.5 = 136 = \frac{16(17)}{2}$.
For a 5% two-tailed test, Table 6 says that the lower critical value is 49. The lower of the two rank sums, $W = 48.5$, is below this value, so reject $H_0$.
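The ranking with ties can be reproduced mechanically. The sketch below is an illustration under the assumption that tied values receive the average of the ranks they occupy, as in the corrected ranks above; the helper name `midranks` is hypothetical.

```python
def midranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2          # mean of ranks i+1 .. j
        i = j
    return ranks

x1 = [16.7, 17.3, 17.5, 18.2, 18.4, 18.4, 18.6, 19.1]
x2 = [21.2, 18.6, 19.7, 22.0, 17.7, 18.1, 18.7, 21.2]

ranks = midranks(x1 + x2)                            # rank the pooled sample
w1 = sum(ranks[x] for x in x1)
w2 = sum(ranks[x] for x in x2)
n = len(x1) + len(x2)
print(w1, w2, w1 + w2 == n * (n + 1) / 2)
```

The smaller rank sum is then compared with the lower critical value from Table 6.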
b. Wilcoxon Signed rank test for paired data. $H_0: \eta_1 = \eta_2$, $H_1: \eta_1 \ne \eta_2$.
difference    rank
-4.5          8-
-1.3          4-
-2.2          6-
-3.8          7-
0.7           3+
0.3           2+
-0.1          1-
-2.1          5-
If we add items with + and - signs separately, we find $T^- = 31$, $T^+ = 5$. To check this, compute $T^+ + T^- = 31 + 5 = 36 = \frac{8(9)}{2}$. From Table 7 with $n = 8$, $T_L(\alpha/2) = T_L(.025) = 4$, and since 5, the smaller $T$, is above the critical value, do not reject $H_0$.
c. Test of equality of means for paired data.
$H_0: D = 0$, $H_1: D \ne 0$ where $D = \mu_1 - \mu_2$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.
$\bar{d} = -1.625$, $s_d = 1.893$, $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{1.893}{\sqrt{8}} = \sqrt{\frac{3.5834}{8}} = 0.66927$, $DF = n - 1 = 7$, $t^{(7)}_{.025} = 2.365$.
(Only one of the following methods is needed!)
Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-1.625 - 0}{0.66927} = -2.428$. This is not on the interval between -2.365 and +2.365.
or Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}} = 0 \pm 2.365(0.66927) = \pm 1.583$. $\bar{d} = -1.625$ is not on this interval.
or Confidence Interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = -1.625 \pm 1.583$ or -3.208 to -0.042. This interval does not include 0.
With all methods reject $H_0$. Note that this method is more powerful than the one in b. However, it still should not be used unless the conditions justify it.
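The paired-data test ratio in part c can be recomputed from the raw differences. This is a minimal sketch using the Python standard library; the variable names are assumptions, and the critical value 2.365 is the table value quoted in the solution. Small differences in the last decimal place relative to the hand calculation come from rounding $s_d$.

```python
from math import sqrt
from statistics import mean, stdev

x1 = [16.7, 17.3, 17.5, 18.2, 18.4, 18.4, 18.6, 19.1]
x2 = [21.2, 18.6, 19.7, 22.0, 17.7, 18.1, 18.7, 21.2]

d = [a - b for a, b in zip(x1, x2)]                  # one difference per vehicle
d_bar = mean(d)
s_d = stdev(d)                                       # sample standard deviation (n - 1)
se = s_d / sqrt(len(d))                              # standard error of the mean difference
t = (d_bar - 0) / se

t_crit = 2.365                                       # t(.025, 7 df), table value
print(round(d_bar, 3), round(s_d, 3), round(se, 4), round(t, 3))
print("reject H0" if abs(t) > t_crit else "do not reject H0")
```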
5. The second column from the last page is repeated. (Use $\alpha = .05$)
x2: 21.2, 18.6, 19.7, 22.0, 17.7, 18.1, 18.7, 21.2
You may need some of the following numbers: $\bar{x}_2 = 19.650$, $s_2 = 1.627$, $n_2 = 8$.
a. Test these data to see if the distribution is Normal. (5) (This material will be covered in the third exam in Spring 2000)
b. Test these data to see if the distribution is Normal with a mean of 17 and a standard deviation of 0.5. (5) (This material will be covered in the third exam in Spring 2000)
c. Assume that the distribution is not normal and test whether these data have a median of 17.2 (3 - 5 if you use a method learned recently)
Solution: a. $H_0: N(?, ?)$, $H_1$: Not Normal.
Because the mean and standard deviation are unknown, this is a Lilliefors problem. The x values must be in order. From the data we find that $\bar{x} = 19.650$ and $s = 1.627$. $t = \frac{x - \bar{x}}{s} = \frac{x - 19.650}{1.627}$. This is often called $z$ as in a K-S problem and $F(t)$ is a cumulative Normal probability computed just like $F(z)$ below.
x       t        F(t)     O    O/n      Fo       D
17.7    -1.20    .1151    1    0.125    0.125    .0099
18.1    -0.95    .1711    1    0.125    0.250    .0789
18.6    -0.65    .2578    1    0.125    0.375    .1172
18.7    -0.58    .2810    1    0.125    0.500    .2190
19.7    0.03     .5120    1    0.125    0.625    .1130
21.2    0.95     .8289    1    0.125    0.750    .0789
21.2    0.95     .8289    1    0.125    0.875    .0461
22.0    1.44     .9251    1    0.125    1.000    .0749
$\sum O = n = 8$, Max $D = .2190$. Since the Critical Value for $\alpha = .05$ is .285, do not reject $H_0$.
b. H 0 : N 17 ,0.5
H 1 : Not N 17 ,0.5
Because the mean and standard deviation are known, this is a Kolmogorov-Smirnov problem.
x   x  17

The x values must be in order z 
.

0 .5
x
17 .7 18 .1 18 .6 18 .7 19 .7 21 .2 21 .2 22 .0
z
F z 
O
O
n
Fo
D
1.40
.9192
1
2.20
.9861
1
3.20
.9993
1
3.40
.9997
1
5.40
1.000
1
8.40
1.000
1
8.40
1.000
1
10 .00
1.000
1
0.125
0.125
.7942
0.125 0.125 0.125 0.125 0.125 0.125 0.125
0.250 0.375 0.500 0.625 0.750 0.825 1.000
.7361 .6243 .4997 .3750 .2500 .1250 .0000
O  n  8
Note that it might be better to group the repeated values together with O  2 and
MaxD   .7942
Since the Critical
Value for   .05
is .454 , reject H 0 .
O
 .25 . It would not
n
affect the results.
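The maximum difference $D$ in parts a and b can be recomputed with one helper. This is a sketch that mirrors the tables, comparing the Normal cumulative probability at each ordered observation with the cumulative relative frequency $i/n$; it is not a full textbook Kolmogorov-Smirnov routine, which would also compare against $(i-1)/n$. The function name `max_d` is hypothetical, and the critical values are the table values quoted in the solutions. Because the code does not round $t$ to two decimals, the result for part a differs slightly from the hand-computed value.

```python
from statistics import NormalDist

def max_d(data, mu, sigma):
    """Largest |F(x) - Fo| over the ordered data, with Fo = i/n."""
    xs = sorted(data)
    n = len(xs)
    dist = NormalDist(mu, sigma)
    return max(abs(dist.cdf(x) - (i + 1) / n) for i, x in enumerate(xs))

x2 = [21.2, 18.6, 19.7, 22.0, 17.7, 18.1, 18.7, 21.2]

d_lilliefors = max_d(x2, 19.650, 1.627)              # part a: sample mean and s
d_ks = max_d(x2, 17.0, 0.5)                          # part b: hypothesized parameters

print(round(d_lilliefors, 4), d_lilliefors > 0.285)  # Lilliefors critical value
print(round(d_ks, 4), d_ks > 0.454)                  # K-S critical value
```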
c. Wilcoxon Signed rank test for paired data. $H_0: \eta = 17.2$, $H_1: \eta \ne 17.2$.
The x values need not be in order (but it makes things easier). ‘difference’ below is $x - \eta$. ($\alpha = .05$)
x       difference    rank
17.7    0.5           1+
18.1    0.9           2+
18.6    1.4           3+
18.7    1.5           4+
19.7    2.5           5+
21.2    4.0           6.5+
21.2    4.0           6.5+
22.0    4.8           8+
So $T^- = 0$, $T^+ = 36$. To check this, compute $T^+ + T^- = 0 + 36 = 36 = \frac{8(9)}{2}$. From Table 7 with $n = 8$, $T_L(\alpha/2) = T_L(.025) = 4$, and since 0, the smaller $T$, is below the critical value, reject $H_0$.
An alternative and less powerful test is the sign test. Here, note that there are 8 numbers above the median. $pvalue = 2P(x \ge 8) = 2[1 - P(x \le 7)] = 2(1 - .99609) = 2(.00391) = .00782$. Since this is below the significance level, reject $H_0$.
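Both the signed-rank sums and the sign-test p-value can be checked in a few lines. The following is a sketch under the assumptions that tied absolute differences share their average rank and that zero differences are dropped; the function name `signed_rank_sums` is hypothetical.

```python
from math import comb

def signed_rank_sums(data, median0):
    """Return (T+, T-) for differences x - median0, using average ranks for ties."""
    diffs = [round(x - median0, 6) for x in data]    # rounding guards against float noise
    diffs = [d for d in diffs if d != 0]             # zero differences are dropped
    magnitudes = sorted(abs(d) for d in diffs)

    def midrank(m):
        first = magnitudes.index(m) + 1              # lowest rank held by this magnitude
        return first + (magnitudes.count(m) - 1) / 2

    t_plus = sum(midrank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(midrank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus

x2 = [21.2, 18.6, 19.7, 22.0, 17.7, 18.1, 18.7, 21.2]
t_plus, t_minus = signed_rank_sums(x2, 17.2)

# Sign test: all 8 values lie above 17.2, so use Binomial(n = 8, p = .5).
n_above = sum(1 for x in x2 if x > 17.2)
p_value = 2 * sum(comb(8, k) for k in range(n_above, 9)) / 2 ** 8

print(t_plus, t_minus, round(p_value, 5))
```

The same function applied to the paired differences in problem 4b, with a hypothesized median of 0, gives the rank sums 5 and 31.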
6.
In a wage discrimination case the following hourly wage data is reported. ($\alpha = .05$) A Normal distribution is assumed.
Men                    Women
$n_1 = 31$             $n_2 = 15$
$\bar{x}_1 = 9.25$     $\bar{x}_2 = 8.70$
$s_1 = 0.90$           $s_2 = 1.35$
a. Test the statement that the variance for women is greater than the variance for men. (2)
b. Test the statement that the variance for women is not equal to the variance for men. (2)
c. Test the hypothesis that men have a wage less than or equal to that of women, assuming that the variances differ between the two populations. (6) (Note - this method was not covered in Spring 2000 - If you have studied this material and want an extra credit question on it, please tell me in advance.)
d. Repeat the test in c using the same sample means and variances, but assuming that $n_1 = 310$ and $n_2 = 150$. (4)
Solution: a. $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$, $\alpha = .05$. $\frac{s_2^2}{s_1^2} = \frac{1.35^2}{0.90^2} = 2.25$. Since $F^{(14,30)}_{.05} = 2.04$ is smaller than our ratio, reject $H_0$.
b. $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$. $\frac{s_1^2}{s_2^2} = \frac{0.90^2}{1.35^2} < 1$ and $\frac{s_2^2}{s_1^2} = 2.25$. Since the first ratio is below 1, the only one we need check is $\frac{s_2^2}{s_1^2}$. Since $F^{(14,30)}_{.025} = 2.34$ is larger than our ratio, do not reject $H_0$.
c. $H_0: D \le 0$, $H_1: D > 0$ where $D = \mu_1 - \mu_2$, or $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$. $\alpha = .05$. See problem II for formulas.
$n_1 = 31$, $\bar{x}_1 = 9.25$, $s_1 = 0.90$, $n_2 = 15$, $\bar{x}_2 = 8.70$, $s_2 = 1.35$, $\bar{d} = \bar{x}_1 - \bar{x}_2 = 0.55$.
$\frac{s_1^2}{n_1} = \frac{0.90^2}{31} = 0.02613$, $\frac{s_2^2}{n_2} = \frac{1.35^2}{15} = 0.12150$, $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 0.14763$.
$s_{\bar{d}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{0.14763} = 0.3842$
$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{0.14763^2}{\frac{0.02613^2}{30} + \frac{0.12150^2}{14}} = 20.233$, so use 20 degrees of freedom.
$t^{(20)}_{.05} = 1.725$. (Only one of the following methods is needed!)
Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{0.55 - 0}{0.3842} = 1.432$. This is below 1.725.
or Critical Value: $\bar{d}_{cv} = D_0 + t_{\alpha}\, s_{\bar{d}} = 0 + 1.725(0.3842) = 0.663$. 0.55 lies below this value.
or Confidence Interval: $D \ge \bar{d} - t_{\alpha}\, s_{\bar{d}} = 0.55 - 1.725(0.3842) = -0.113$. This interval includes 0. In all cases do not reject $H_0$.
d. $H_0: D \le 0$, $H_1: D > 0$ where $D = \mu_1 - \mu_2$, or $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$.
$n_1 = 310$, $\bar{x}_1 = 9.25$, $s_1 = 0.90$, $n_2 = 150$, $\bar{x}_2 = 8.70$, $s_2 = 1.35$, $\bar{d} = \bar{x}_1 - \bar{x}_2 = 0.55$.
Because of the large sample size, we can act as if the variances were known. From Table 3 in the syllabus supplement:
Interval for: Difference Between Two Means ($\sigma$ Known), $D = \mu_1 - \mu_2$, $\bar{d} = \bar{x}_1 - \bar{x}_2$.
Confidence Interval: $D = \bar{d} \pm z_{\alpha/2}\, \sigma_{\bar{d}}$, where $\sigma_{\bar{d}} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.
Hypotheses: $H_0: D = D_0$, $H_1: D \ne D_0$.
Test Ratio: $z = \frac{\bar{d} - D_0}{\sigma_{\bar{d}}}$.
Critical Value: $\bar{d}_{cv} = D_0 \pm z_{\alpha/2}\, \sigma_{\bar{d}}$.
$z_{.05} = 1.645$, $s_{\bar{d}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{0.90^2}{310} + \frac{1.35^2}{150}} = 0.1215$
(Only one of the following methods is needed!)
Test Ratio: $z = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{0.55 - 0}{0.1215} = 4.527$. This is above 1.645.
or Critical Value: $\bar{d}_{cv} = D_0 + z_{\alpha}\, s_{\bar{d}} = 0 + 1.645(0.1215) = 0.1999$. 0.55 lies above this value.
or Confidence Interval: $D \ge \bar{d} - z_{\alpha}\, s_{\bar{d}} = 0.55 - 1.645(0.1215) = 0.350$. This interval does not include 0. In all cases reject $H_0$.
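The standard error, the approximate degrees of freedom in part c, and the large-sample ratio in part d all come from the same two quantities $s_1^2/n_1$ and $s_2^2/n_2$. The sketch below recomputes them; the function name `unequal_variance_ratio` is an assumption, and the critical values are the table values quoted in the solutions.

```python
from math import sqrt

def unequal_variance_ratio(mean1, s1, n1, mean2, s2, n2):
    """Test ratio, approximate DF, and standard error when variances differ."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df, se

# Part c: small samples, compare with t(.05, 20 df) = 1.725.
t, df, se = unequal_variance_ratio(9.25, 0.90, 31, 8.70, 1.35, 15)
print(round(t, 3), round(df, 3), round(se, 4), t > 1.725)

# Part d: same means and standard deviations, ten times the sample sizes,
# compare with z(.05) = 1.645.
z, _, se_large = unequal_variance_ratio(9.25, 0.90, 310, 8.70, 1.35, 150)
print(round(z, 3), round(se_large, 4), z > 1.645)
```

The difference in sample means is identical in both parts; only the standard error shrinks, which is why the conclusion changes.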
IV. Computer Problem. There is no computer problem on the Spring 2000 exam, but you are
responsible for the rule on p-value below.
1. Hand in your first problem (3 – 2point penalty for not handing in).
2. Assume that your output is:
MTB > ttest mu = 30 'glop';
SUBC > alt=1.
TEST OF MU=30 VS MU > 30
Variable    N     Mean     StDev    SE Mean    T       P-Value
glop        20    38.00    9.23     2.06       3.88    0.0025
(Don’t do this problem unless you handed in the computer problem.)
a. Show how the value of t was computed from the values of the mean and standard deviation (1)
b. Give the null hypothesis and tell, using the p-value, whether (and why) you would accept it if $\alpha = .020$. (1)
c. What would the p-value be for the following tests (2):
(i)
MTB > ttest mu = 30 'glop'
(ii)
MTB > ttest mu = 30 'glop';
SUBC > alt = -1.
The rule on p-value: if the p-value is less than the significance level ($\alpha$), reject the null hypothesis; if the p-value is greater than or equal to the significance level, do not reject the null hypothesis.
Solution: a. $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{38.00 - 30}{2.06} = 3.88$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{9.23}{\sqrt{20}} = 2.06$.
b. $H_0: \mu \le 30$, $H_1: \mu > 30$. Since the p-value of .0025 is less than the significance level ($\alpha = .020$), reject $H_0$.
c. See diagrams.
(i) Since this is a 2-sided test, double the probability between t and the nearest corner. Thus the p-value is 2(.0025) = .005. (If $\alpha = .020$, reject $H_0$.)
(ii) This is the opposite test to the test in b., so the p-value is 1 - .0025 = .9975. (If $\alpha = .020$, do not reject $H_0$.)
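The arithmetic behind parts a and c can be laid out as a short sketch. It assumes only the numbers printed in the output above; the function name `p_values_from_upper_tail` is hypothetical and simply encodes the doubling and complement rules described in the solution.

```python
from math import sqrt

# Part a: recompute SE Mean and T from the printed mean and standard deviation.
n, x_bar, s, mu0 = 20, 38.00, 9.23, 30
se_mean = s / sqrt(n)
t = (x_bar - mu0) / se_mean
print(round(se_mean, 2), round(t, 2))

def p_values_from_upper_tail(p_upper):
    """Convert the p-value of the 'mu > mu0' test to the other two alternatives."""
    return {
        "upper": p_upper,                             # alt = 1
        "two_sided": 2 * min(p_upper, 1 - p_upper),   # no alt subcommand
        "lower": 1 - p_upper,                         # alt = -1
    }

alpha = 0.020
for name, p in p_values_from_upper_tail(0.0025).items():
    # The rule on p-value: reject only when p is below the significance level.
    print(name, round(p, 4), "reject H0" if p < alpha else "do not reject H0")
```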