252y0222 3/29/02 ECO252 QBA2 Name

SECOND HOUR EXAM
March 19, 2002
(Open this document in 'Page Layout')
Hour of Class Registered (Circle): MWF 10 11 TR 12:30 2:00
Hour of Class Attended (If Different): __
I. (14 points) Do all the following. Diagrams will help!
$x \sim N(7, 11)$. Probabilities still can't be negative!
1. $P(-3 \le x \le 16) = P\left(\frac{-3-7}{11} \le z \le \frac{16-7}{11}\right) = P(-0.91 \le z \le 0.82)$
$= P(-0.91 \le z \le 0) + P(0 \le z \le 0.82) = .3186 + .2939 = .6125$
2. $P(0 \le x \le 7) = P\left(\frac{0-7}{11} \le z \le \frac{7-7}{11}\right) = P(-0.64 \le z \le 0) = .2389$
3. $P(-30 \le x \le 14) = P\left(\frac{-30-7}{11} \le z \le \frac{14-7}{11}\right) = P(-3.36 \le z \le 0.64)$
$= P(-3.36 \le z \le 0) + P(0 \le z \le 0.64) = .4996 + .2389 = .7385$
4. $P(-3 \le x \le 4) = P\left(\frac{-3-7}{11} \le z \le \frac{4-7}{11}\right) = P(-0.91 \le z \le -0.27)$
$= P(-0.91 \le z \le 0) - P(-0.27 \le z \le 0) = .3186 - .1064 = .2122$
5. $F(2)$ (the cumulative probability up to 2). $P(x \le 2) = P\left(z \le \frac{2-7}{11}\right) = P(z \le -0.45)$
$= P(z \le 0) - P(-0.45 \le z \le 0) = .5 - .1736 = .3264$
6. A symmetrical interval about the mean with 33% probability.
We want two points $x_{.665}$ and $x_{.335}$ so that $P(x_{.665} \le x \le x_{.335}) = .3300$. Make a diagram showing 7 at the center of a 33% region split into two areas with probabilities of 16.5% each. From the diagram, if we replace x by z, $P(0 \le z \le z_{.335}) = .1650$.
The closest we can come is $P(0 \le z \le 0.42) = .1628$ or $P(0 \le z \le 0.43) = .1664$. Since neither of these is much closer than the other, and the probability we want is about 60% of the way between these two probabilities, the best compromise is $z_{.335} = 0.426$, and $x = \mu \pm z_{.335}\sigma = 7 \pm 0.426(11) = 7 \pm 4.686$, or 2.314 to 11.686.
To check this, note that $P(2.314 \le x \le 11.686) = P\left(\frac{2.314-7}{11} \le z \le \frac{11.686-7}{11}\right) = P(-0.43 \le z \le 0.43) = 2P(0 \le z \le 0.43) = 2(.1664) = .3328 \approx 33\%$.
Of course $z_{.335} = 0.42$ and $z_{.335} = 0.43$ are perfectly acceptable.
7. $x_{.04}$ - We want a point $x_{.04}$ so that $P(x \ge x_{.04}) = .04$. Make a diagram of z showing zero in the middle, .4600 between 0 and $z_{.04}$, and .04 above $z_{.04}$. From the diagram, if we replace x by z, $P(0 \le z \le z_{.04}) = .4600$. The Normal table says $P(0 \le z \le 1.75) = .4599$, which is the closest we can come to .4600. So $z_{.04} = 1.75$, and $x = \mu + z_{.04}\sigma = 7 + 1.75(11) = 26.25$.
To check this, note that $P(x \ge 26.25) = P\left(z \ge \frac{26.25-7}{11}\right) = P(z \ge 1.75) = P(z \ge 0) - P(0 \le z \le 1.75) = .5 - .4599 = .0401 \approx .04$.
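The table lookups in Part I can be cross-checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` and reading N(7, 11) as mean 7 and standard deviation 11 (which is how the solutions standardize). The helper name `prob_between` is hypothetical. Exact values differ slightly from the table answers because the table method rounds z to two decimals.

```python
from statistics import NormalDist

# x ~ N(7, 11): mean 7, standard deviation 11 (assumed reading of the notation).
x = NormalDist(mu=7, sigma=11)

def prob_between(lo, hi):
    """Return P(lo <= x <= hi) for the distribution above."""
    return x.cdf(hi) - x.cdf(lo)

p1 = prob_between(-3, 16)    # problem 1
p2 = prob_between(0, 7)      # problem 2
p3 = prob_between(-30, 14)   # problem 3
p4 = prob_between(-3, 4)     # problem 4
p5 = x.cdf(2)                # problem 5, F(2)

# Problem 6: symmetric interval holding 33% of the probability.
lo6, hi6 = x.inv_cdf(0.335), x.inv_cdf(0.665)

# Problem 7: the point with 4% of the probability above it.
x04 = x.inv_cdf(0.96)

for name, value in [("p1", p1), ("p2", p2), ("p3", p3), ("p4", p4),
                    ("p5", p5), ("lo6", lo6), ("hi6", hi6), ("x04", x04)]:
    print(f"{name} = {value:.4f}")
```

Agreement with the table answers to about two decimal places is what one should expect from this kind of check.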
II. (6 points; 2-point penalty for not trying part a.) Show your work! Do not answer part b with a yes or no unless you have stated your hypotheses!
According to your text, a study was made to compare ages of purchasers of Crest with nonpurchasers, yielding the following results. These are two independent samples taken from an approximately normal population.
Row   crest (x1)   nocrest (x2)
 1        34            60
 2        33            54
 3        52            53
 4        41            58
 5        32            52
 6        34            52
 7        49            66
 8        50            35

The Minitab 'describe' function gave the following results for the 'nocrest' column.

Variable    N     Mean   Median   StDev
nocrest     8    53.75    53.50    8.99
a. Compute the standard deviation, $s_1$, for the 'crest' column. Show your work! (3)
b. Compute a 90% confidence interval for the difference between the two population means $\mu_1$ and $\mu_2$ on the assumption that these are independent samples taken from approximately normal populations with similar variances. According to your confidence interval, is there a significant difference between the population means? Why? (3)
Solution: a) $n_1 = 8$.

Row      x1     x1^2
 1       34     1156
 2       33     1089
 3       52     2704
 4       41     1681
 5       32     1024
 6       34     1156
 7       49     2401
 8       50     2500
Total   325    13711

$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{325}{8} = 40.625$
$s_1^2 = \frac{\sum x_1^2 - n_1\bar{x}_1^2}{n_1 - 1} = \frac{13711 - 8(40.625)^2}{7} = \frac{507.875}{7} = 72.553571$
$s_1 = 8.51784$

b) From Table 3 of the Syllabus Supplement:
Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed equal)
Confidence Interval: $\Delta = d \pm t_{\alpha/2}\, s_d$, where $s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ and $DF = n_1 + n_2 - 2$
Hypotheses: $H_0: \Delta = \Delta_0$, $H_1: \Delta \ne \Delta_0$, where $\Delta = \mu_1 - \mu_2$
Test Ratio: $t = \frac{d - \Delta_0}{s_d}$
Critical Value: $d_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$
Pooled variance: $\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$\bar{x}_1 = 40.625$, $s_1^2 = 72.553571$, $s_1 = 8.51784$
$\bar{x}_2 = 53.75$, $s_2 = 8.99$, $s_2^2 = 80.8201$
$d = \bar{x}_1 - \bar{x}_2 = 40.625 - 53.75 = -13.125$
$DF = n_1 + n_2 - 2 = 8 + 8 - 2 = 14$
$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{7(72.553571) + 7(80.8201)}{14} = \frac{72.553571 + 80.8201}{2} = 76.6868$
$s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{76.6868\left(\frac{1}{8} + \frac{1}{8}\right)} = \sqrt{76.6868(.25)} = \sqrt{19.1717} = 4.37855$
$\alpha = .10$, $t_{.05}^{(14)} = 1.761$
Confidence Interval: $\Delta = d \pm t_{\alpha/2}\, s_d = -13.125 \pm 1.761(4.37855) = -13.125 \pm 7.711$, or -20.836 to -5.414.
The interval does not include 0, so there is a significant difference between the means. Formally, our hypotheses are $H_0: \Delta = 0$, $H_1: \Delta \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$, where $\Delta = \mu_1 - \mu_2$. We reject $H_0$.
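The pooled-variance interval in Part II can be reproduced from the raw data. This is a minimal sketch using only the Python standard library. The constant `T_05_14` is the table value 1.761 quoted in the solution (the standard library has no t distribution), and the variable names are illustrative. Because it computes the 'nocrest' variance from the data instead of squaring the rounded 8.99, the results differ from the worked solution in the third decimal place.

```python
from math import sqrt
from statistics import mean, variance

crest = [34, 33, 52, 41, 32, 34, 49, 50]
nocrest = [60, 54, 53, 58, 52, 52, 66, 35]

n1, n2 = len(crest), len(nocrest)
s1_sq = variance(crest)      # sample variance, n - 1 divisor
s2_sq = variance(nocrest)
d = mean(crest) - mean(nocrest)

# Pooled variance and the standard error of the difference in means.
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
s_d = sqrt(pooled_var * (1 / n1 + 1 / n2))

T_05_14 = 1.761              # t table value for alpha/2 = .05, 14 DF
lower = d - T_05_14 * s_d
upper = d + T_05_14 * s_d

print(f"s1 = {sqrt(s1_sq):.5f}, d = {d:.3f}, s_d = {s_d:.5f}")
print(f"90% interval: {lower:.3f} to {upper:.3f}")
```

An interval that lies entirely below zero corresponds to rejecting the null hypothesis of equal means at the 10% level.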
III. Do at least 3 of the following 4 problems (at least 10 points each), or do sections adding to at least 30 points. Anything extra you do helps, and grades wrap around. You must do problem 2a! Show your work! State $H_0$ and $H_1$ where applicable. Do not answer a question 'yes' or 'no' without citing a statistical test. Use a 95% confidence level unless another level is specified.
1. For your convenience, data is repeated from the previous page. Use a 90% confidence level in this
problem.
Row   crest (x1)   nocrest (x2)
 1        34            60
 2        33            54
 3        52            53
 4        41            58
 5        32            52
 6        34            52
 7        49            66
 8        50            35

The Minitab 'describe' function gave the following results for the 'nocrest' column.

Variable    N     Mean   Median   StDev
nocrest     8    53.75    53.50    8.99
a. Test the hypothesis that the mean age of Crest buyers is lower than the mean for those who did not buy
Crest. Assume that these are independent samples taken from approximately normal populations with
similar variances and .
(i) State your null and alternate hypotheses. (2)
(ii) Find a critical value for the difference between the sample means and use it to test your
hypothesis. (2)
(iii) Repeat the test using a test ratio and find an approximate p-value for the hypothesis. (3)
(iv) Repeat the test using a confidence interval. (2).
b. Test the equality of the standard deviations of the two samples.
(i) Do the test without using a confidence interval. (2)
(ii) Create a confidence interval for the variance ratio and use it to find a confidence interval for
the ratio of the standard deviations. (3.5)
c. (Extra credit) Repeat the tests in a(ii) - a(iv) dropping the assumption of equal variances. (6)
Solution: a. (i) Because we are being asked if the mean of Crest users is less than the mean for non-Crest users, we are asking if $\mu_1 < \mu_2$. Because this does not contain an equality, it must be an alternate hypothesis. Thus we are testing
$H_0: \mu_1 \ge \mu_2$ against $H_1: \mu_1 < \mu_2$, or
$H_0: \mu_1 - \mu_2 \ge 0$ against $H_1: \mu_1 - \mu_2 < 0$, or
$H_0: \Delta \ge 0$ against $H_1: \Delta < 0$, with $\alpha = .10$.
From the previous page $\bar{x}_1 = 40.625$, $\bar{x}_2 = 53.75$, $d = \bar{x}_1 - \bar{x}_2 = 40.625 - 53.75 = -13.125$, $DF = n_1 + n_2 - 2 = 8 + 8 - 2 = 14$ and $s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 4.37855$. Because this is a one-sided hypothesis, we use $t_{.10}^{(14)} = 1.345$.
(ii) The formula for a Critical Value is $d_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. Because this is a one-sided test, we want one critical value below zero. The critical value formula becomes $d_{cv} = \Delta_0 - t_{\alpha}\, s_d = 0 - 1.345(4.37855) = -5.889$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below -5.889. Since -13.125 is in this region, we reject $H_0$.
(iii) The formula for a Test Ratio is $t = \frac{d - \Delta_0}{s_d}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$.
$t = \frac{d - \Delta_0}{s_d} = \frac{-13.125 - 0}{4.37855} = -2.998$. To do a conventional test, make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below $-t_{.10}^{(14)} = -1.345$. Since -2.998 is in this region, we reject $H_0$. To find an approximate p-value, compare this value of t with the values on the $DF = 14$ line of the t table. Because this is a left-sided test, we want to know the area below -2.998. Since $t_{.005}^{(14)} = 2.977$ and $t_{.001}^{(14)} = 3.787$, we can say that $.001 < \text{p-value} < .005$.
(iv) The formula for a Confidence Interval is $\Delta = d \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$. Since the alternate hypothesis is $H_1: \mu_1 < \mu_2$ or $\Delta < 0$, this interval becomes $\Delta \le d + t_{\alpha}\, s_d = -13.125 + 1.345(4.37855) = -13.125 + 5.889 = -7.236$. Since this interval does not include zero and numbers above zero, we reject $H_0$.
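The three equivalent versions of the one-sided test (critical value, test ratio, one-sided confidence bound) can be laid side by side in a few lines. This is a minimal sketch that takes the summary figures from the solution as given. `T_10_14` is the table value 1.345, and the variable names are illustrative only.

```python
# Summary figures quoted in the solution.
d = 40.625 - 53.75      # difference between the sample means
s_d = 4.37855           # standard error of the difference (pooled)
T_10_14 = 1.345         # t table value, alpha = .10, 14 DF
delta_0 = 0             # hypothesized difference under H0

# (ii) Critical value: reject when d falls below it.
critical_value = delta_0 - T_10_14 * s_d
reject_by_cv = d < critical_value

# (iii) Test ratio: reject when t falls below -t(.10, 14).
t_ratio = (d - delta_0) / s_d
reject_by_ratio = t_ratio < -T_10_14

# (iv) One-sided interval Delta <= upper_bound: reject when it excludes delta_0.
upper_bound = d + T_10_14 * s_d
reject_by_ci = upper_bound < delta_0

print(critical_value, t_ratio, upper_bound)
print(reject_by_cv, reject_by_ratio, reject_by_ci)
```

All three flags must agree, since each comparison is an algebraic rearrangement of the same inequality.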
b. The best place to find the formulas for comparing variances to test $H_0: \sigma_1 = \sigma_2$ against $H_1: \sigma_1 \ne \sigma_2$ is the outline. Recall that $s_1^2 = 72.553571$, $s_2^2 = 80.8201$, $n_1 = n_2 = 8$, and $\alpha = .10$.
(i) If we want to do a 2-sided test where $DF_1 = n_1 - 1 = 7$ and $DF_2 = n_2 - 1 = 7$, we compare $\frac{s_1^2}{s_2^2}$ against $F_{\alpha/2}^{(DF_1, DF_2)} = F_{.05}^{(7,7)} = 3.79$ and $\frac{s_2^2}{s_1^2}$ against $F_{\alpha/2}^{(DF_2, DF_1)} = F_{.05}^{(7,7)}$. Here $\frac{s_2^2}{s_1^2} = \frac{80.8201}{72.553571} = 1.114$, so $\frac{s_1^2}{s_2^2}$ must be below one. Since neither ratio is above the corresponding table value of F, we cannot reject the null hypothesis of equality.
(ii) A 2-sided confidence interval is
$\frac{s_2^2}{s_1^2}\,\frac{1}{F_{\alpha/2}^{(n_2-1,\,n_1-1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2}^{(n_1-1,\,n_2-1)}$,
which becomes $1.114\left(\frac{1}{3.79}\right) \le \frac{\sigma_2^2}{\sigma_1^2} \le 1.114(3.79)$ or $0.2939 \le \frac{\sigma_2^2}{\sigma_1^2} \le 4.2221$. The opposite interval is
$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}^{(n_1-1,\,n_2-1)}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\,F_{\alpha/2}^{(n_2-1,\,n_1-1)}$,
which becomes $\frac{1}{1.114}\left(\frac{1}{3.79}\right) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1}{1.114}(3.79)$ or $0.2369 \le \frac{\sigma_1^2}{\sigma_2^2} \le 3.4022$. If we take the square roots, we get confidence intervals for the ratios of standard deviations: $0.542 \le \frac{\sigma_2}{\sigma_1} \le 2.05$ or $0.487 \le \frac{\sigma_1}{\sigma_2} \le 1.84$. Since this interval includes one, we cannot reject $H_0$.
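The variance-ratio test and its intervals involve only a handful of arithmetic steps, sketched below under the assumption that the F table value 3.79 quoted in the solution is supplied by hand (`F_05_7_7` is an illustrative name). The sketch uses the unrounded ratio, so the variance bounds agree with the worked figures to about three decimals.

```python
from math import sqrt

s1_sq, s2_sq = 72.553571, 80.8201   # sample variances from the solution
F_05_7_7 = 3.79                     # F table value, alpha/2 = .05, (7, 7) DF

# (i) Two-sided test: reject only if either ratio exceeds the table value.
ratio_21 = s2_sq / s1_sq
ratio_12 = s1_sq / s2_sq
reject = ratio_21 > F_05_7_7 or ratio_12 > F_05_7_7

# (ii) Interval for sigma_2^2 / sigma_1^2, then for sigma_2 / sigma_1.
var_lo, var_hi = ratio_21 / F_05_7_7, ratio_21 * F_05_7_7
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)

print(f"s2^2/s1^2 = {ratio_21:.3f}, reject = {reject}")
print(f"variance ratio: {var_lo:.4f} to {var_hi:.4f}")
print(f"std dev ratio:  {sd_lo:.3f} to {sd_hi:.3f}")
```

Because both degrees of freedom are 7 here, the same table value serves for both tails; with unequal sample sizes two different F values would be needed.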
c. (i) We are testing $H_0: \mu_1 \ge \mu_2$ against $H_1: \mu_1 < \mu_2$, or $H_0: \mu_1 - \mu_2 \ge 0$ against $H_1: \mu_1 - \mu_2 < 0$, or $H_0: \Delta \ge 0$ against $H_1: \Delta < 0$, with $\alpha = .10$.
From the formula table:
Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
Confidence Interval: $\Delta = d \pm t_{\alpha/2}\, s_d$, where $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and
$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$
Hypotheses: $H_0: \Delta = \Delta_0$, $H_1: \Delta \ne \Delta_0$, where $\Delta = \mu_1 - \mu_2$. Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $\Delta_0 = 0$.
Test Ratio: $t = \frac{d - \Delta_0}{s_d}$
Critical Value: $d_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$

From the previous pages $\bar{x}_1 = 40.625$, $\bar{x}_2 = 53.75$, $d = \bar{x}_1 - \bar{x}_2 = 40.625 - 53.75 = -13.125$, $s_1^2 = 72.553571$, $s_2^2 = 80.8201$, $n_1 = n_2 = 8$. For these calculations, done by Minitab, I used $s_1 = 8.51784$ and $s_2 = 8.9900$, so that Minitab reported $s_1^2 = 72.5536$ and $s_2^2 = 80.8201$.
$\frac{s_1^2}{n_1} = \frac{72.5536}{8} = 9.0692$
$\frac{s_2^2}{n_2} = \frac{80.8201}{8} = 10.1025$
$\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 19.1717$
$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{19.1717} = 4.37855$
$DF = \frac{(19.1717)^2}{\frac{(9.0692)^2}{7} + \frac{(10.1025)^2}{7}} = \frac{367.555}{11.7501 + 14.5801} = \frac{367.555}{26.3302} = 13.9594$, so use 13 degrees of freedom.
Because this is a one-sided hypothesis, we use $t_{.10}^{(13)} = 1.350$.
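The degrees-of-freedom formula for the unequal-variance case is easy to get wrong by hand, so a short numerical check may help. This is a minimal sketch; the function name `welch_df` is hypothetical, and rounding the result down follows the solution's choice of 13.

```python
from math import floor, sqrt

def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate DF for a difference of means with unequal variances."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

s1_sq, s2_sq, n1, n2 = 72.5536, 80.8201, 8, 8

df = welch_df(s1_sq, n1, s2_sq, n2)
df_used = floor(df)                   # round down, as in the solution
s_d = sqrt(s1_sq / n1 + s2_sq / n2)   # standard error of the difference

print(f"DF = {df:.4f}, use {df_used}; s_d = {s_d:.5f}")
```

With equal sample sizes the standard error matches the pooled value, which is why only the degrees of freedom (and hence the table value of t) change in this part.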
(ii) The formula for a Critical Value is $d_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. Because this is a one-sided test, we want one critical value below zero. The critical value formula becomes $d_{cv} = \Delta_0 - t_{\alpha}\, s_d = 0 - 1.350(4.37855) = -5.911$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below -5.911. Since -13.125 is in this region, we reject $H_0$.
(iii) The formula for a Test Ratio is $t = \frac{d - \Delta_0}{s_d}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$.
$t = \frac{d - \Delta_0}{s_d} = \frac{-13.125 - 0}{4.37855} = -2.998$. To do a conventional test, make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below $-t_{.10}^{(13)} = -1.350$. Since -2.998 is in this region, we reject $H_0$.
(iv) The formula for a Confidence Interval is $\Delta = d \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$. Since the alternate hypothesis is $H_1: \mu_1 < \mu_2$ or $\Delta < 0$, this interval becomes $\Delta \le d + t_{\alpha}\, s_d = -13.125 + 1.350(4.37855) = -13.125 + 5.911 = -7.214$. Since this interval does not include zero and numbers above zero, we reject $H_0$.
Two reminders
1) A one-sided hypothesis is tested by a one-sided test, which includes a one-sided critical value or a one-sided confidence interval.
2) A table from the outline:

Methods for Comparing Two Samples.
                                                         Paired Samples   Independent Samples
Location - Normal distribution. Compare means.           Method D4        Methods D1-D3
Location - Distribution not Normal. Compare medians.     Method D5b       Method D5a
Proportions                                                               Method 6
Variability - Normal distribution. Compare variances.                     Method 7
2.
a. Turn in your computer output from computer problem 1 only, tucked inside this exam paper. (3 - 2 point penalty for not handing this in.)
b. A new battery is being tested for use in a tiny stuffed animal. We will use the new battery if it is longer-lasting than the old one. Use a 95% confidence level. Slightly edited Minitab output is below. The assumption was that the old battery had an average life of 4.7 hours, and this is tested for both the old and new battery before they are compared. Can we say that either battery has a life that is significantly different from 4.7 hours? What evidence in what tests led you to your conclusion? (3)
c. Continuing to use the data below, which one of these tests would you use to decide whether to switch batteries? What would you do? Why? (2)
d. Using an almost Normal curve and the appropriate values of t, shade the areas represented by the p-values in tests 3, 4 and 5. (3)
e. Using means and standard deviations from the printout, explain how the computer got the values of t in tests 3, 4 and 5. (2) Part f is at the end.
Worksheet size: 100000 cells
MTB > RETR 'C:\MINITAB\2X0222-2.MTW'.
Retrieving worksheet from file: C:\MINITAB\2X0222-2.MTW
Worksheet was saved on 3/14/2002
MTB > print 'new'
Data Display
new
   4.2   7.3   3.9   5.4   5.1   7.3   4.5   5.8   6.4
   4.5   4.6   4.9   7.2   6.1   3.9   4.0   3.5   5.1
MTB > print 'old'
Data Display
old
   5.1   3.2   4.0   3.5   5.3   4.4   4.5   3.8   5.1
   3.8   4.5   4.9   5.0   4.0   5.2   3.5   3.0   5.2
MTB > describe 'new''old'
Descriptive Statistics
Variable     N     Mean   Median   TrMean    StDev   SEMean
new         18    5.206    5.000    5.181    1.228    0.289
old         18    4.333    4.450    4.356    0.755    0.178

Variable       Min      Max       Q1       Q3
new          3.500    7.300    4.150    6.175
old          3.000    5.300    3.725    5.100

Test 1: MTB > ttest mu=4.7 'new'
T-Test of the Mean
Test of mu = 4.700 vs mu not = 4.700
Variable     N     Mean    StDev   SE Mean       T   P-Value
new         18    5.206    1.228     0.289    1.75     0.099

Test 2: MTB > ttest mu=4.7 'old'
T-Test of the Mean
Test of mu = 4.700 vs mu not = 4.700
Variable     N     Mean    StDev   SE Mean       T   P-Value
old         18    4.333    0.755     0.178   -2.06     0.055
b) Solution: The two tests above are $H_0: \mu_{new} = 4.7$ against $H_1: \mu_{new} \ne 4.7$ and $H_0: \mu_{old} = 4.7$ against $H_1: \mu_{old} \ne 4.7$. We are using $\alpha = .05$. In both cases the p-value (.099 for the new battery, .055 for the old) is above the significance level, so do not reject the null hypothesis.
Test 3: MTB > twosamplet 'new''old'
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
        N     Mean    StDev   SE Mean
new    18     5.21     1.23      0.29
old    18    4.333    0.755      0.18
95% C.I. for mu new - mu old: ( 0.18, 1.57)
T-Test mu new = mu old (vs not =): T= 2.57  P=0.016  DF= 28

Test 4: MTB > twosamplet 'new''old';
SUBC> alt=1.
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
        N     Mean    StDev   SE Mean
new    18     5.21     1.23      0.29
old    18    4.333    0.755      0.18
95% C.I. for mu new - mu old: ( 0.18, 1.57)
T-Test mu new = mu old (vs >): T= 2.57  P=0.0079  DF= 28

Test 5: MTB > twosamplet 'new''old';
SUBC> alt=-1.
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
        N     Mean    StDev   SE Mean
new    18     5.21     1.23      0.29
old    18    4.333    0.755      0.18
95% C.I. for mu new - mu old: ( 0.18, 1.57)
T-Test mu new = mu old (vs <): T= 2.57  P=0.99  DF= 28
c. Solution: If the new battery is 'longer-lasting,' we will find that $\mu_{new} > \mu_{old}$, so that our hypotheses are $H_0: \mu_{new} \le \mu_{old}$ and $H_1: \mu_{new} > \mu_{old}$. These are the hypotheses tested in test 4. The p-value reported in this test is .0079, which is certainly less than $\alpha = .05$. So reject the null hypothesis and buy the new battery.
d. Solution: Make a diagram showing an almost-normal curve with a mean at zero.
In test 3, a 2-sided test, the p-value is twice the probability that $t \ge 2.57$, so shade the area above 2.57 and below -2.57 and label it 1.6%.
In test 4, a 1-sided test, the p-value is the probability that $t \ge 2.57$, so shade the area above 2.57 and label it 0.79%, which must be half the p-value in test 3.
In test 5, a 1-sided test, the p-value is the probability that $t \le 2.57$, so shade the area below 2.57 and label it 99%. Except for rounding, this is one minus the p-value in test 4.
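e. Solution: As is shown on page 6, above, the formulas are $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $t = \frac{d - \Delta_0}{s_d}$. The 'describe' printout shows that $\bar{x}_{new} = 5.206$, $\bar{x}_{old} = 4.333$, $d = \bar{x}_{new} - \bar{x}_{old} = 5.206 - 4.333 = 0.873$, $\frac{s_{new}}{\sqrt{n_{new}}} = 0.289$ and $\frac{s_{old}}{\sqrt{n_{old}}} = 0.178$. So $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{(0.289)^2 + (0.178)^2} = 0.3394186$. Finally $t = \frac{d - \Delta_0}{s_d} = \frac{0.873 - 0}{0.3394186} = 2.5720$.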
f. Assume that you got the following output from a 'describe' command.
Descriptive Statistics
Variable     N     Mean   Median   TrMean    StDev   SEMean
new        150    5.106    5.000    5.081    1.500    0.122
old        150    4.233    4.300    4.250    0.800    0.065
Construct a 92% confidence interval for $\mu_{new} - \mu_{old}$. (3)
Solution: The formula that is used in the large sample case is $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{(0.122)^2 + (0.065)^2} = 0.13824$. $d = 5.106 - 4.233 = 0.873$. On Page 1, we found $z_{.04} = 1.75$. The Confidence Interval formula is $\Delta = d \pm z_{\alpha/2}\, s_d = 0.873 \pm 1.75(0.13824) = 0.873 \pm 0.242$.
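For the large-sample interval in part f, the only table lookup is the Normal quantile for a 92% level. The sketch below, which assumes Python's standard-library `statistics.NormalDist`, computes that quantile directly from the confidence level; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the 'describe' output in part f.
mean_new, mean_old = 5.106, 4.233
se_new, se_old = 0.122, 0.065        # SEMean column

confidence = 0.92
alpha = 1 - confidence

d = mean_new - mean_old
s_d = sqrt(se_new ** 2 + se_old ** 2)

# Two-sided interval: put alpha/2 in each tail.
z = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * s_d

print(f"z = {z:.4f}, s_d = {s_d:.5f}")
print(f"interval: {d - half_width:.3f} to {d + half_width:.3f}")
```

Changing `confidence` to another level reuses the same steps, which is the main reason to compute the quantile from the level.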
3. (McClave et al.) I am a razor manufacturer and claim that my disposable razor gives more shaves than my competitor's. A sample is taken, with $x_m$ representing the number of shaves per razor on my product and $x_c$ representing the number of shaves on my competitor's product.
Row   me (xm)   rankm (rm)   compet (xc)   rankc (rc)   diff (d)
 1       8                       10                        -2
 2      16         16             6                        10
 3       9                        4            1            5
 4      11                        7                         4
 5      15         15            13           13            2
 6      10                       14           14           -4
 7       6                        5            2            1
 8      12         12             7                         5

For your convenience some calculations have been made. The columns $r_m$ and $r_c$ represent the beginning and end of the ranking of the numbers in $x_m$ and $x_c$. $d$ represents the difference between the numbers in $x_m$ and $x_c$. For the three data columns we have the following.

Variable        Sample Mean   Sample Std Dev
Me (xm)            10.88           3.40
Compet (xc)         8.25           3.69
d                   2.62           4.41
Note that you will probably not need all the information that I am giving you.
a. The authors say that x m and x c represent two independent samples. Because of uncertainty about the
underlying distribution, the authors imply that you should compare medians to see if my blade is better.
State your hypotheses and your conclusion after doing an appropriate test. Use a 95% confidence level. (5)
b. The authors then concede that this was an inefficient way to do the problem and that we ought to compare
medians using paired samples. Assume that each line represents the experience of one shaver and repeat the
test. (5)
c. Now assume that we find out that the underlying distribution is Normal after all and repeat the test in part
b using means instead of medians. (5).
Solution: The data are repeated below with the ranks filled in. Rank totals are found for m and c. Absolute values of d are found and |d| is ranked. The ranks are then corrected for ties and marked by the sign of the difference. $\alpha = .05$.

Row     xm     rm     xc     rc      d    |d|    rd    rd*
 1       8    7.0     10    9.5     -2      2     3    2.5-
 2      16   16.0      6    3.5     10     10     8    8.0+
 3       9    8.0      4    1.0      5      5     7    6.5+
 4      11   11.0      7    5.5      4      4     5    4.5+
 5      15   15.0     13   13.0      2      2     2    2.5+
 6      10    9.5     14   14.0     -4      4     4    4.5-
 7       6    3.5      5    2.0      1      1     1    1.0+
 8      12   12.0      7    5.5      5      5     6    6.5+
Total        82.0          54.0
T+ = 29, T- = 7
a. $H_0: \eta_m \le \eta_c$, $H_1: \eta_m > \eta_c$, where $\eta$ denotes a population median. For a test of the correctness of the ranking, note that the two rank sums in a add to 82 + 54 = 136. This should be the same as the sum of the first 16 numbers, $\frac{16(17)}{2} = 136$. For a 1-sided test use the Mann-Whitney-Wilcoxon rank sum test. For $n_1 = n_2 = 8$, Table 6b gives critical values of 52 and 84. Since our rank sums fall between these, we do not reject the null hypothesis.
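The joint ranking with tie correction that underlies the rank sum test can be sketched in a few lines. This is an illustration only: the helper `average_ranks` is a hypothetical name, and the critical values 52 and 84 are simply the Table 6b values quoted in the solution.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    # A value first appearing at 0-based index f with c copies occupies
    # ranks f+1 .. f+c, whose mean is (2f + c + 1) / 2.
    return [(2 * ordered.index(v) + ordered.count(v) + 1) / 2 for v in values]

me = [8, 16, 9, 11, 15, 10, 6, 12]
compet = [10, 6, 4, 7, 13, 14, 5, 7]

# Rank both samples together, then split the ranks back out.
ranks = average_ranks(me + compet)
rank_sum_me = sum(ranks[:len(me)])
rank_sum_compet = sum(ranks[len(me):])

# Check of the ranking: the total must equal n(n + 1)/2 for n = 16.
n = len(me) + len(compet)
ranking_ok = rank_sum_me + rank_sum_compet == n * (n + 1) / 2

LOWER_CV, UPPER_CV = 52, 84   # Table 6b values quoted in the solution
reject = not (LOWER_CV < rank_sum_me < UPPER_CV
              and LOWER_CV < rank_sum_compet < UPPER_CV)

print(rank_sum_me, rank_sum_compet, ranking_ok, reject)
```

The `index`/`count` approach is quadratic in the sample size, which is fine for 16 observations but would be replaced by a single sorted pass for large data sets.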
b. $H_0: \eta_m \le \eta_c$, $H_1: \eta_m > \eta_c$. For a test of the correctness of the ranking, note that the two Ts in b add to 29 + 7 = 36. This should be the same as the sum of the first eight numbers, $\frac{8(9)}{2} = 36$. For a Wilcoxon Signed Rank Sum Test, compare $T = 7$, the smaller of the totals, with Table 7. For $n = 8$, the 5% critical value is 6. Since T is above this value, we do not reject the null hypothesis.
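The signed-rank totals can be checked the same way. The sketch below is illustrative: `average_ranks` is a hypothetical helper, the critical value 6 is the Table 7 figure quoted in the solution, and the rule 'reject when T is at or below the critical value' is an assumption inferred from the solution's wording.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + ordered.count(v) + 1) / 2 for v in values]

me = [8, 16, 9, 11, 15, 10, 6, 12]
compet = [10, 6, 4, 7, 13, 14, 5, 7]

# Paired differences; zero differences would be dropped (none occur here).
diffs = [m - c for m, c in zip(me, compet) if m != c]
ranks = average_ranks([abs(d) for d in diffs])

t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
t_stat = min(t_plus, t_minus)

CRITICAL_VALUE = 6   # Table 7, n = 8, 5% level, as quoted in the solution
reject = t_stat <= CRITICAL_VALUE   # assumed decision rule

print(t_plus, t_minus, t_stat, reject)
```

As a check, the two totals should add to n(n + 1)/2 = 36 for the eight nonzero differences.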
c. If the parent distribution is Normal, we use Method D4. From the outline, there are three ways of approaching a problem involving two means. You should have chosen one! We know that $\bar{d} = \bar{x}_m - \bar{x}_c = 2.62$, $\Delta = \mu_m - \mu_c$, $s_d = 4.41$, and $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.41}{\sqrt{8}} = 1.5592$. We are testing $H_0: \mu_1 \le \mu_2$ against $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$ against $H_1: \mu_1 - \mu_2 > 0$, or $H_0: \Delta \le 0$ against $H_1: \Delta > 0$. $df = n - 1 = 7$, $t_{\alpha}^{(n-1)} = t_{.05}^{(7)} = 1.895$.
(i) Confidence Interval: $\Delta = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_{\bar{d}}$. This interval becomes $\Delta \ge \bar{d} - t_{\alpha}\, s_{\bar{d}} = 2.62 - 1.895(1.5592) = 2.62 - 2.95 = -0.33$. Since this interval includes zero, we cannot reject $H_0$.
(ii) Test Ratio: $t = \frac{\bar{d} - \Delta_0}{s_{\bar{d}}}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_{\bar{d}}}$. $t = \frac{\bar{d} - \Delta_0}{s_{\bar{d}}} = \frac{2.62 - 0}{1.5592} = 1.680$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region above $t_{\alpha}^{(n-1)} = t_{.05}^{(7)} = 1.895$. Since 1.680 is not in this region, we cannot reject $H_0$.
(iii) Critical Value: $\bar{d}_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_{\bar{d}}$ or $(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_{\bar{d}}$. Because this is a one-sided test, we want one critical value above zero. The critical value formula becomes $\bar{d}_{cv} = \Delta_0 + t_{\alpha}\, s_{\bar{d}} = 0 + 1.895(1.5592) = 2.955$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region above 2.955. Since 2.62 is not in this region, we cannot reject $H_0$.
4. According to your authors, when a sample of 74 women students were asked whether they would consent to an interview with a travelling recruiter in a local office building, 73 said yes. However, when another sample of 74 women were asked whether they would consent to a similar interview in a hotel room, only 47 said yes. This represents a tremendous difference in the proportion that said yes, and the researcher was asked to verify her findings by repeating the question about the hotel room interview to a second sample of 100 students. This time 66 out of the 100 students said yes.
a. Was the proportion that consented to the hotel interview significantly different between the sample of 74 and the sample of 100? State your hypotheses and test them using a 90% confidence level. (4)
b. The sponsoring firm was so upset by the results that the researcher was asked to interview another sample of 200 women. This time, out of the 200 women, 140 said that they would consent. Check to see if the proportions of the samples of 74, 100 and 200 women who consented differ. Again use a 90% confidence level. (6)
c. Do a 2-sided confidence interval for the difference between the proportion of women in the samples of 74 who will consent to interviews in an office building and a hotel room. (3)
d. A business has just completed a switch to a new invoicing system. Previously the number of errors per
invoice seemed to follow a Poisson distribution with a mean of 0.2. After the switch to the new system was
made, a sample of 500 invoices was taken. Of these 461 had no errors, 28 had one error, 8 had 2 errors, 2
had 3 errors and one had 4 errors. Does the Poisson(0.2) distribution still apply? (5)
e. The business wants to know if some Poisson distribution works for the data in part d. Using your Poisson
table as best you can, how do you go about this and what do you conclude? (3)
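Solution: To summarize the information in parts a and b, $\alpha = .10$ and

Hotel Room               Sample 1   Sample 2   Sample 3
Yes                         47         66        140
No                          27         34         60
Total                       74        100        200
Proportion saying yes     .6351      .6600      .7000

We are comparing $\bar{p}_1 = .6351$, $n_1 = 74$ and $\bar{p}_2 = .6600$, $n_2 = 100$.

Interval for: Difference between proportions, $\Delta p = p_1 - p_2$, with $q = 1 - p$
Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$, where $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$ and $s_{\Delta p} = \sqrt{\frac{\bar{p}_1\bar{q}_1}{n_1} + \frac{\bar{p}_2\bar{q}_2}{n_2}}$
Hypotheses: $H_0: \Delta p = \Delta p_0$, $H_1: \Delta p \ne \Delta p_0$, where $\Delta p_0 = p_{01} - p_{02}$ or $\Delta p_0 = 0$
Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}$
  If $\Delta p_0 = 0$: $\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, with $\bar{p}_0 = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$
  If $\Delta p_0 \ne 0$: $\sigma_{\Delta p} = \sqrt{\frac{p_{01}q_{01}}{n_1} + \frac{p_{02}q_{02}}{n_2}}$, or use $s_{\Delta p}$
Critical Value: $\Delta p_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$

$s_{\Delta p} = \sqrt{\frac{\bar{p}_1\bar{q}_1}{n_1} + \frac{\bar{p}_2\bar{q}_2}{n_2}} = \sqrt{\frac{.6351(.3649)}{74} + \frac{.6600(.3400)}{100}} = \sqrt{.0031317296 + .00224400} = \sqrt{.0053757} = .0733191$
$\Delta\bar{p} = \bar{p}_1 - \bar{p}_2 = -.0249$, $\bar{p}_0 = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2} = \frac{74(.6351) + 100(.6600)}{74 + 100} = \frac{47 + 66}{74 + 100} = .6494$,
$\alpha = .10$, $z_{\alpha/2} = z_{.05} = 1.645$. Note that $q = 1 - p$ and that $q$ and $p$ are between 0 and 1.
$\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{.6494(.3506)\left(\frac{1}{74} + \frac{1}{100}\right)} = \sqrt{.005354} = .073167$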
a) $H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$. Same as $H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \ne 0$, or $H_0: \Delta p = 0$, $H_1: \Delta p \ne 0$.
There are three ways to do this problem. Only one is needed.
(i) Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.0249 - 0}{.073167} = -0.3403$. Make a diagram showing 'reject' regions above 1.645 and below -1.645. Since -0.3403 is between these values, do not reject $H_0$.
(ii) Critical Value: $\Delta p_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p} = 0 \pm 1.645(.073167) = \pm 0.1204$. Make a diagram showing a 'reject' region above 0.1204 and below -0.1204. Since -0.0249 is between these values, do not reject $H_0$.
(iii) Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p} = -.0249 \pm 1.645(.0733191) = -.0249 \pm .1206$, or -0.1455 to 0.0957. Since zero is between these values, do not reject $H_0$.
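The comparison of the two hotel-room samples can be reproduced from the raw counts. This is a minimal sketch assuming Python's standard-library `statistics.NormalDist` for the critical z; variable names are illustrative. Because it carries unrounded proportions, the test ratio differs from the worked value in the third decimal.

```python
from math import sqrt
from statistics import NormalDist

yes1, n1 = 47, 74      # first hotel-room sample
yes2, n2 = 66, 100     # second hotel-room sample
alpha = 0.10

p1, p2 = yes1 / n1, yes2 / n2
dp = p1 - p2

# Pooled proportion and standard error under H0: p1 = p2.
p0 = (yes1 + yes2) / (n1 + n2)
sigma_dp = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

z = dp / sigma_dp
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

# Unpooled standard error for the confidence interval.
s_dp = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci_low, ci_high = dp - z_crit * s_dp, dp + z_crit * s_dp

print(f"z = {z:.4f}, critical z = {z_crit:.3f}, reject = {reject}")
print(f"90% interval: {ci_low:.4f} to {ci_high:.4f}")
```

The pooled standard error is used for the test because the null hypothesis asserts a common proportion; the interval makes no such assumption and so uses the unpooled form.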
b) $H_0$: Homogeneous (or $p_1 = p_2 = p_3$)
$H_1$: Not homogeneous (not all ps are equal)
$DF = (r - 1)(c - 1) = (1)(2) = 2$, $\chi^{2(2)}_{.10} = 4.6052$

O          1        2        3     Total      pr
Yes       47       66      140      253     .6765
No        27       34       60      121     .3235
Total     74      100      200      374    1.0000

E          1        2        3     Total      pr
Yes     50.06    67.65   135.30   253.01    .6765
No      23.94    32.35    64.70   120.99    .3235
Total   74.00   100.00   200.00   374.00   1.0000

The proportions in rows, $p_r$, are used with column totals to get the items in E. Note that row and column sums in E are the same as in O. (Note that $\chi^2 = 1.207 = 375.207 - 374$ is computed two different ways here - only one way is needed.)

Row       O        E       E-O    (E-O)^2   (E-O)^2/E    O^2/E
 1       47     50.06     3.06     9.3636    0.187048    44.127
 2       27     23.94    -3.06     9.3636    0.391128    30.451
 3       66     67.65     1.65     2.7225    0.040244    64.390
 4       34     32.35    -1.65     2.7225    0.084158    35.734
 5      140    135.30    -4.70    22.0900    0.163267   144.863
 6       60     64.70     4.70    22.0900    0.341422    55.641
Total   374    374.00     0.00               1.207267   375.206

Since the $\chi^2$ computed here is less than the $\chi^2$ from the table, we do not reject $H_0$.
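The expected counts and the chi-squared statistic for the three hotel-room samples follow mechanically from the observed table. Below is a minimal sketch in plain Python; `CHI_SQ_10_2` is the table value 4.6052 quoted in the solution, and the other names are illustrative. It works with unrounded expected counts, so the statistic agrees with the worked value to about three decimals.

```python
observed = [
    [47, 66, 140],   # yes
    [27, 34, 60],    # no
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Shortcut form: sum of O^2/E minus the grand total.
chi_sq_shortcut = sum(
    o ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
) - total

df = (len(observed) - 1) * (len(col_totals) - 1)
CHI_SQ_10_2 = 4.6052   # chi-squared table value, alpha = .10, 2 DF
reject = chi_sq > CHI_SQ_10_2

print(f"chi-squared = {chi_sq:.4f} (shortcut {chi_sq_shortcut:.4f}), DF = {df}, reject = {reject}")
```

The two forms of the statistic are algebraically identical, which is why only one of them is needed in a hand calculation.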
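c) Let us call the proportion of women who consented to the office interview $\bar{p}_4 = \frac{73}{74} = .9865$. Recall that $\bar{p}_1 = \frac{47}{74} = .6351$, so $\Delta\bar{p} = \bar{p}_4 - \bar{p}_1 = .9865 - .6351 = .3514$. $\alpha = .10$, $z_{\alpha/2} = z_{.05} = 1.645$.
$s_{\Delta p} = \sqrt{\frac{\bar{p}_4\bar{q}_4}{n_4} + \frac{\bar{p}_1\bar{q}_1}{n_1}} = \sqrt{\frac{.9865(.0135)}{74} + \frac{.6351(.3649)}{74}} = \sqrt{.00017996959 + .0031317296} = \sqrt{.0033116} = .0575465$
$\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p} = .3514 \pm 1.645(.0575465) = .3514 \pm .0947$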
d) If we take the column in the Poisson table for the Poisson distribution with a mean of .2 and multiply it by 500, we get E. O is given in the problem.

x    Poisson(0.2)        E        O
0      0.818731       409.366    461
1      0.163746        81.873     28
2      0.016375         8.187      8
3      0.001092         0.546      2
4      0.000055         0.027      1

But the E column has items in it below 5 (and 2) and thus can be used only if the last three cells are added together. $DF = 3 - 1 = 2$, $\chi^{2(2)}_{.10} = 4.6052$.

x         O         E          E-O     (E-O)^2   (O-E)^2/E     O^2/E
0        461    409.366    -51.6345    2666.12     6.5128    519.147
1         28     81.873     53.8730    2902.30    35.4488      9.576
2+        11      8.761     -2.2390       5.01     0.5722     13.811
Total    500    500.000     -0.0005               42.5338    542.534

Since $\chi^2 = 42.534 = 542.534 - 500$ is larger than the table value of 4.6052, we reject the null hypothesis.
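The Poisson(0.2) goodness-of-fit calculation can be checked without a Poisson table by computing the probabilities directly. This is a minimal sketch: `poisson_pmf` is a hypothetical helper built on `math.exp` and `math.factorial`, the '2 or more' cell takes whatever probability is left over (matching the pooled last cell of the solution), and `CHI_SQ_10_2` is the table value 4.6052.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """Poisson probability of exactly k events."""
    return exp(-mean) * mean ** k / factorial(k)

counts = [461, 28, 8, 2, 1]      # invoices with 0, 1, 2, 3, 4 errors
n = sum(counts)
hypothesized_mean = 0.2

# Pool the sparse cells: 0 errors, 1 error, 2 or more errors.
observed = [counts[0], counts[1], sum(counts[2:])]
p0 = poisson_pmf(0, hypothesized_mean)
p1 = poisson_pmf(1, hypothesized_mean)
expected = [n * p0, n * p1, n * (1 - p0 - p1)]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1           # no parameter was estimated from the data
CHI_SQ_10_2 = 4.6052             # chi-squared table value, alpha = .10, 2 DF
reject = chi_sq > CHI_SQ_10_2

print([round(e, 3) for e in expected])
print(f"chi-squared = {chi_sq:.3f}, DF = {df}, reject = {reject}")
```

Most of the statistic comes from the first two cells: far more error-free invoices and far fewer one-error invoices were observed than Poisson(0.2) predicts.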
e) There isn't much choice here, but our estimate of the mean is $\bar{x} = \frac{461(0) + 28(1) + 8(2) + 2(3) + 1(4)}{500} = .108$.
The closest we can come on the table is Poisson(0.1). If we do the same thing we did in d), we get a ridiculous situation, since the only numbers in E that are above 5 are the first two, and scrunching the bottom 3 cells together produces a number that results in a gigantic contribution to $\chi^2$.

x    Poisson(0.1)        E        O
0      0.904837       452.419    461
1      0.090484        45.242     28
2      0.004524         2.262      8
3      0.000151         0.075      2
4      0.000004         0.002      1

Our table reads:

x         O         E          E-O      (E-O)^2   (O-E)^2/E     O^2/E
0        461    452.419    -8.58148    73.6418     0.16277    469.744
1+        39     47.581     8.58100    73.6336     1.54754     31.967
Total    500    500.000    -0.00048                1.71031    501.711

so that $\chi^2 = 1.71031$ (or $501.711 - 500 = 1.711$).
But the degrees of freedom seem to be 2 - 1 - 1 = 0. (The second -1 is because we estimated a parameter from the data.) This is worse than the dreaded $2 \times 2$ case, which needs corrections or should be done with proportions directly. However, given the fact that $\chi^{2(1)}_{.10} = 2.70554$ is larger than the $\chi^2$ that we computed, we have a very good fit here.