252y0221 3/29/02 ECO252 QBA2 Name KEY

SECOND HOUR EXAM
March 19, 2002
(Open this document in 'Page Layout')
Hour of Class Registered (Circle): MWF 10 11 TR 12:30 2:00
Hour of Class Attended (If Different) __
I. (14 points) Do all the following. Diagrams will help!
x ~ N 5,9
Probabilities still can't be negative!
16  5 
 3  5
z
 P 0.89  z  1.22 
1. P 3  x  16   P 
9
9 

 P 0.89  z  0  P0  z  1.22   .3133  .3888  .7021
5  5
0  5
z
 P 0.56  z  0  .2123
2. P0  x  5  P 
9 
 9
14  5 
  25  5
z
 P 3.33  z  1.00 
3. P25  x  14   P 
9
9 

 P 3.33  z  0  P0  z  1.00   .4996  .3413  .8409
4  5
 2  5
z
 P 0.78  z  0.11
4. P2  x  4  P 
9 
 9
 P 0.78  z  0  P 0.11  z  0  .2823  .0438  .2385
2  5

5. F 2  (The Cumulative probability up to 2) . Px  2  P  z 
9 

 Pz  0.33  Pz  0  P 0.33  z  0  .5  .1293  .3707
6. A symmetrical interval about the mean with 37% probability. 6.
We want two points x.698 and x.315 , so that Px.685  x  x315   .3700 . Make a diagram
Showing 5 in the middle at the center of a 37% region split into two areas with probabilities
of 18.5%. From the diagram, if we replace x by z, P0  z  z.315   .1850 . The closest we
can come is P0  z  0.48   .1844 or P0  z  0.49   .1879 . Since neither of these is much
closer than the other, use z.315  0.485 , and x    z.315  5  0.485 9  5  4.365 , or 0.635
9.365  5 
 0.635  5
z
to 9.365. To check this note that P0.635  x  9.365   P

9
9


 .1844  .1879 
 P 0.485  z  0.485   2 P0  z  0.485   2
  .3723  37 % .
2


Of course z.315  0.48 or z.315  0.49 is perfectly acceptable.
7.
x.02 - We want a point x.02 , so that Px x.02   .02 . Make a diagram of z showing zero
in the middle, .4800 between 0 and z.02 and .02 above z.02 . From the diagram, if we replace x by z,
P0  z  z.02   .4800 . The Normal table says P0  z  2.05   .4798 , which is the closest we can
come to.4800. So z.02  2.05 , and x    z.02  5  2.059  23.45. To check this note
23 .45  5 

Px  23 .45   P  z 
  Pz  2.05   Pz  0  P0  z  2.05   .5  .4798  .0202  .02.
9


252y0221 3/29/02
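The table lookups in Part I can be cross-checked numerically. The following is a minimal sketch, assuming Python's standard-library `NormalDist`; the helper name `prob_between` and the variable names are illustrative and are not part of the exam. Exact values differ slightly from the table answers because the table rounds z to two decimals.

```python
from statistics import NormalDist

# x ~ N(5, 9): mean 5 and standard deviation 9, as in Part I.
x = NormalDist(mu=5, sigma=9)

def prob_between(lo, hi):
    """P(lo <= x <= hi) from the exact Normal cdf (hypothetical helper)."""
    return x.cdf(hi) - x.cdf(lo)

p1 = prob_between(-3, 16)   # compare with the table answer .7021
p4 = prob_between(-2, 4)    # compare with the table answer .2385
f2 = x.cdf(2)               # compare with the table answer .3707

# Problem 6: a symmetric 37% interval leaves 31.5% in each tail.
lower6, upper6 = x.inv_cdf(0.315), x.inv_cdf(0.685)
# Problem 7: the point with 2% of the distribution above it.
x02 = x.inv_cdf(0.98)

print(round(p1, 4), round(p4, 4), round(f2, 4))
print(round(lower6, 3), round(upper6, 3), round(x02, 2))
```
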
II. (6 points-2 point penalty for not trying part a.) Show your work! Do not answer part b with a yes or
no unless you have stated your hypotheses!
a. According to your text, a study was made to compare ages of purchasers of Crest with nonpurchasers,
yielding the following results. These are two independent samples taken from an approximately normal
population.
Row   crest ($x_1$)   nocrest ($x_2$)
1     34              28
2     35              22
3     23              44
4     44              33
5     52              55
6     46              63
7     28              45
8     48              31

The Minitab 'describe' function gave the following results for the 'nocrest' column.

Variable   N   Mean    Median   StDev
nocrest    8   40.12   38.50    14.11
a. Compute the standard deviation, $s_1$, for the 'crest' column. Show your work! (3)
b. Compute a 90% confidence interval for the difference between the two population means $\mu_1$ and $\mu_2$ on the assumption that these are independent samples taken from approximately normal populations with similar variances. According to your confidence interval, is there a significant difference between the population means? Why? (3)
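Solution: a) $n_1 = 8$.

Row   $x_1$   $x_1^2$
1     34      1156
2     35      1225
3     23      529
4     44      1936
5     52      2704
6     46      2116
7     28      784
8     48      2304
Sum   310     12754

$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{310}{8} = 38.75$

$s_1^2 = \frac{\sum x_1^2 - n_1\bar{x}_1^2}{n_1 - 1} = \frac{12754 - 8(38.75)^2}{7} = \frac{741.5}{7} = 105.92857$

$s_1 = \sqrt{105.92857} = 10.29216$

b) From Table 3 of the Syllabus Supplement:

Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed equal)
Confidence Interval: $\Delta = \bar{d} \pm t_{\alpha/2}\, s_d$, where $s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ and $DF = n_1 + n_2 - 2$
Hypotheses: $H_0: \Delta = \Delta_0$, $H_1: \Delta \ne \Delta_0$, where $\Delta = \mu_1 - \mu_2$
Test Ratio: $t = \frac{\bar{d} - \Delta_0}{s_d}$
Critical Value: $\bar{d}_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$
Pooled variance: $\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$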
$\bar{x}_1 = 38.75$, $s_1^2 = 105.92857$, $s_1 = 10.29216$
$\bar{x}_2 = 40.12$, $s_2 = 14.11$, $s_2^2 = 199.0921$
$\bar{d} = \bar{x}_1 - \bar{x}_2 = 38.75 - 40.12 = -1.37$
$DF = n_1 + n_2 - 2 = 8 + 8 - 2 = 14$, $\alpha = .10$, $t_{.05}^{(14)} = 1.761$

$\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{7(105.92857) + 7(199.0921)}{14} = \frac{105.92857 + 199.0921}{2} = 152.510335$

$s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{152.510335\left(\frac{1}{8} + \frac{1}{8}\right)} = \sqrt{152.510335(.25)} = \sqrt{38.12758375} = 6.17475$

Confidence Interval: $\Delta = \bar{d} \pm t_{\alpha/2}\, s_d = -1.37 \pm 1.761(6.17475) = -1.37 \pm 10.87$, or -12.24 to 9.50. The interval includes 0, so there is no significant difference between the means. Formally, our hypotheses are
$H_0: \Delta = 0$, $H_1: \Delta \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$, where $\Delta = \mu_1 - \mu_2$. We do not reject $H_0$.
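The Part II arithmetic can be reproduced from the raw ages. This is a sketch under stated assumptions: the t value 1.761 is copied from the t table as in the solution (the standard library has no t distribution), and all variable names are illustrative. Because it uses the unrounded 'nocrest' mean and standard deviation, the endpoints agree with the solution only to about two decimals.

```python
from math import sqrt
from statistics import mean, variance

crest = [34, 35, 23, 44, 52, 46, 28, 48]
nocrest = [28, 22, 44, 33, 55, 63, 45, 31]
T_05_14 = 1.761  # t-table value for alpha/2 = .05 and 14 DF, taken from the text

n1, n2 = len(crest), len(nocrest)
s1_sq, s2_sq = variance(crest), variance(nocrest)  # sample variances (n - 1 divisor)
d_bar = mean(crest) - mean(nocrest)

# Pooled variance and the standard error of the difference between means.
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
s_d = sqrt(pooled_var * (1 / n1 + 1 / n2))

ci_low = d_bar - T_05_14 * s_d
ci_high = d_bar + T_05_14 * s_d
print(round(sqrt(s1_sq), 5), round(ci_low, 2), round(ci_high, 2))
```

Zero lies inside the interval, which is the numerical form of "no significant difference".
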
III. Do at least 3 of the following 4 Problems (at least 10 each) (or do sections adding to at least 30 points. Anything extra you do helps, and grades wrap around). You must do problem 2a! Show your work! State $H_0$ and $H_1$ where applicable. Do not answer a question 'yes' or 'no' without citing a statistical test. Use a 95% confidence level unless another level is specified.
1. For your convenience, data is repeated from the previous page. Use a 90% confidence level in this
problem.
Row   crest ($x_1$)   nocrest ($x_2$)
1     34              28
2     35              22
3     23              44
4     44              33
5     52              55
6     46              63
7     28              45
8     48              31

Variable   N   Mean    Median   StDev
nocrest    8   40.12   38.50    14.11
a. Test the hypothesis that the mean age of Crest buyers is lower* than the mean for those who did not buy
Crest. Assume that these are independent samples taken from approximately normal populations with
similar variances and .
(i) State your null and alternate hypotheses. (2)
(ii) Find a critical value for the difference between the sample means and use it to test your
hypothesis. (2)
(iii) Repeat the test using a test ratio and find an approximate p-value for the hypothesis. (3)
(iv) Repeat the test using a confidence interval. (2).
b. Test the equality of the standard deviations of the two samples.
(i) Do the test without using a confidence interval. (2)
(ii) Create a confidence interval for the variance ratio and use it to find a confidence interval for
the ratio of the standard deviations. (3.5)
c. (Extra credit) Repeat the tests in a(ii) - a(iv) dropping the assumption of equal variances. (6)
Solution: a. (i) Because we are being asked if the mean of Crest users is less than the mean for non-Crest users, we are asking if $\mu_1 < \mu_2$. Because this does not contain an equality, this must be an alternate hypothesis. Thus we are testing $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$, or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$, or $H_0: \Delta \ge 0$, $H_1: \Delta < 0$, with $\alpha = .10$.
From the previous page $\bar{x}_1 = 38.75$, $\bar{x}_2 = 40.12$, $\bar{d} = \bar{x}_1 - \bar{x}_2 = 38.75 - 40.12 = -1.37$, $DF = n_1 + n_2 - 2 = 8 + 8 - 2 = 14$ and $s_d = \hat{s}_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 6.17475$. Because this is a one-sided hypothesis, we use $t_{.10}^{(14)} = 1.345$.
(ii) The formula for a Critical Value is $\bar{d}_{CV} = \Delta_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. Because this is a one-sided test, we want one critical value below zero. The critical value formula becomes $\bar{d}_{CV} = \Delta_0 - t_{\alpha}\, s_d = 0 - 1.345(6.175) = -8.305$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below -8.305. Since -1.37 is not in this region, we cannot reject $H_0$.
*Note: Because I confused a few people on this, I tried to figure out what you thought I was asking before
grading this. Your tests had to agree with your null hypothesis.
(iii) The formula for a Test Ratio is $t = \frac{\bar{d} - \Delta_0}{s_d}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. Here $t = \frac{\bar{d} - \Delta_0}{s_d} = \frac{-1.37 - 0}{6.17475} = -0.222$. To do a conventional test make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below $-t_{\alpha}^{(DF)} = -t_{.10}^{(14)} = -1.345$. Since $t = -0.222$ is not in this region, we cannot reject $H_0$. To find an approximate p-value, compare this value of t with the values on the $DF = 14$ line of the t table. Because this is a left-sided test, we want to know the area below -0.222. Since $t_{.40}^{(14)} = 0.258$ and $t_{.45}^{(14)} = 0.128$, we can say that $.40 < pval < .45$.
(iv) The formula for a Confidence Interval is $\Delta = \bar{d} \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$. Since the alternate hypothesis is $H_1: \mu_1 < \mu_2$ or $\Delta < 0$, this interval becomes $\Delta \le \bar{d} + t_{\alpha}\, s_d = -1.37 + 1.345(6.17475) = -1.37 + 8.305 = 6.935$. Since this interval includes zero and numbers above zero, we cannot reject $H_0$.
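The three equivalent one-sided procedures in a(ii) to a(iv) can be laid side by side. This is only a sketch: the summary values and the t-table value are copied from the text, and the variable names are illustrative.

```python
# Summary values quoted in the text (pooled-variance case).
d_bar = -1.37     # x1_bar - x2_bar
s_d = 6.17475     # standard error of the difference
t_crit = 1.345    # t-table value for alpha = .10 and 14 DF
delta_0 = 0.0     # hypothesized difference under H0

# H1: mu1 < mu2, so the single 'reject' region lies below zero.
d_cv = delta_0 - t_crit * s_d        # (ii) critical value for d_bar
t_ratio = (d_bar - delta_0) / s_d    # (iii) test ratio
upper_bound = d_bar + t_crit * s_d   # (iv) one-sided interval: Delta <= upper_bound

reject_by_cv = d_bar < d_cv
reject_by_ratio = t_ratio < -t_crit
reject_by_ci = upper_bound < delta_0
print(round(d_cv, 3), round(t_ratio, 3), round(upper_bound, 3))
print(reject_by_cv, reject_by_ratio, reject_by_ci)
```

All three flags necessarily agree, since each is an algebraic rearrangement of the same inequality.
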
b. The best place to find the formulas for comparing variances to test $H_0: \sigma_1 = \sigma_2$, $H_1: \sigma_1 \ne \sigma_2$ is the outline. Recall that $s_1^2 = 105.9285$, $s_2^2 = 199.0921$, $n_1 = n_2 = 8$, and $\alpha = .10$.
(i) If we want to do a 2-sided test where $DF_1 = n_1 - 1 = 7$ and $DF_2 = n_2 - 1 = 7$, we compare $\frac{s_1^2}{s_2^2}$ against $F_{\alpha/2}^{(DF_1, DF_2)} = F_{.05}^{(7,7)} = 3.79$ and $\frac{s_2^2}{s_1^2}$ against $F_{\alpha/2}^{(DF_2, DF_1)} = F_{.05}^{(7,7)}$. Here $\frac{s_2^2}{s_1^2} = \frac{199.0921}{105.9285} = 1.879$, so $\frac{s_1^2}{s_2^2}$ must be below one. Since both ratios are not above the corresponding table values for F, we cannot reject the null hypothesis of equality.
(ii) A 2-sided confidence interval is $\frac{s_2^2}{s_1^2}\,\frac{1}{F_{\alpha/2}^{(n_2-1,\,n_1-1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\,F_{\alpha/2}^{(n_1-1,\,n_2-1)}$, which becomes $1.879\left(\frac{1}{3.79}\right) \le \frac{\sigma_2^2}{\sigma_1^2} \le 1.879(3.79)$ or $0.496 \le \frac{\sigma_2^2}{\sigma_1^2} \le 7.12$. The opposite interval is $\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}^{(n_1-1,\,n_2-1)}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\,F_{\alpha/2}^{(n_2-1,\,n_1-1)}$, which becomes $\frac{1}{1.879}\left(\frac{1}{3.79}\right) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1}{1.879}(3.79)$ or $0.140 \le \frac{\sigma_1^2}{\sigma_2^2} \le 2.02$. If we take the square roots, we get confidence intervals for the ratios of standard deviations: $0.704 \le \frac{\sigma_2}{\sigma_1} \le 2.67$ or $0.375 \le \frac{\sigma_1}{\sigma_2} \le 1.42$. Since this interval includes one, we cannot reject $H_0$.
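A short numerical check of the variance-ratio test and intervals, offered as a sketch: the F value 3.79 is the table value quoted in the text, and the variable names are illustrative.

```python
from math import sqrt

s1_sq, s2_sq = 105.9285, 199.0921  # sample variances from the text
F_05_7_7 = 3.79                    # F-table value for alpha/2 = .05 with (7, 7) DF

ratio = s2_sq / s1_sq              # larger sample variance over the smaller
# In the 2-sided test only the ratio that exceeds one can exceed the F value.
reject = ratio > F_05_7_7

# 90% interval for sigma2^2 / sigma1^2, then square roots for sigma2 / sigma1.
var_low, var_high = ratio / F_05_7_7, ratio * F_05_7_7
sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(round(ratio, 3), reject)
print(round(var_low, 3), round(var_high, 2), round(sd_low, 3), round(sd_high, 2))
```
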
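c. We are testing $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$, or $H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$, or $H_0: \Delta \ge 0$, $H_1: \Delta < 0$, with $\alpha = .10$.
From the formula table:

Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
Confidence Interval: $\Delta = \bar{d} \pm t_{\alpha/2}\, s_d$, where $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$
Hypotheses: $H_0: \Delta = \Delta_0$, $H_1: \Delta \ne \Delta_0$, where $\Delta = \mu_1 - \mu_2$. Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $\Delta_0 = 0$.
Test Ratio: $t = \frac{\bar{d} - \Delta_0}{s_d}$
Critical Value: $\bar{d}_{cv} = \Delta_0 \pm t_{\alpha/2}\, s_d$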
From the previous pages $\bar{x}_1 = 38.75$, $\bar{x}_2 = 40.12$, $\bar{d} = \bar{x}_1 - \bar{x}_2 = 38.75 - 40.12 = -1.37$, $s_1^2 = 105.9285$, $s_2^2 = 199.0921$, $n_1 = n_2 = 8$. For these calculations, done by Minitab, I used $s_1 = 10.2916$ and $s_2 = 14.1100$, so that Minitab reported $s_1^2 = 105.917$ and $s_2^2 = 199.092$.

$\frac{s_1^2}{n_1} = \frac{105.917}{8} = 13.2397$, $\frac{s_2^2}{n_2} = \frac{199.092}{8} = 24.8865$, $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 38.1262$

$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{38.1262} = 6.17464$

$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{(38.1262)^2}{\frac{(13.2397)^2}{7} + \frac{(24.8865)^2}{7}} = \frac{1453.60}{25.0412 + 88.4769} = \frac{1453.60}{113.518} = 12.8050$

So use 12 degrees of freedom. Because this is a one-sided hypothesis, we use $t_{.10}^{(12)} = 1.356$.
(ii) The formula for a Critical Value is $\bar{d}_{CV} = \Delta_0 \pm t_{\alpha/2}\, s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. Because this is a one-sided test, we want one critical value below zero. The critical value formula becomes $\bar{d}_{CV} = \Delta_0 - t_{\alpha}\, s_d = 0 - 1.356(6.17464) = -8.373$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below -8.373. Since -1.37 is not in this region, we cannot reject $H_0$.
(iii) The formula for a Test Ratio is $t = \frac{\bar{d} - \Delta_0}{s_d}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. Here $t = \frac{\bar{d} - \Delta_0}{s_d} = \frac{-1.37 - 0}{6.17464} = -0.222$. To do a conventional test make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region below $-t_{\alpha}^{(DF)} = -t_{.10}^{(12)} = -1.356$. Since $t = -0.222$ is not in this region, we cannot reject $H_0$.
(iv) The formula for a Confidence Interval is $\Delta = \bar{d} \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$. Since the alternate hypothesis is $H_1: \mu_1 < \mu_2$ or $\Delta < 0$, this interval becomes $\Delta \le \bar{d} + t_{\alpha}\, s_d = -1.37 + 1.356(6.17464) = -1.37 + 8.373 = 7.003$. Since this interval includes zero and numbers above zero, we cannot reject $H_0$.
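The unequal-variance degrees of freedom are the step most prone to slips on a calculator, so a sketch of that computation may help. The variances are the Minitab values quoted in the text, the t value is the table value for 12 DF, and the variable names are illustrative.

```python
from math import floor, sqrt

s1_sq, s2_sq = 105.917, 199.092  # variances as reported by Minitab in the text
n1 = n2 = 8
d_bar = -1.37
T_10_12 = 1.356                  # t-table value for alpha = .10 and 12 DF

v1, v2 = s1_sq / n1, s2_sq / n2
s_d = sqrt(v1 + v2)
# Approximate degrees of freedom for the unequal-variance case.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
df_used = floor(df)              # round down, as the text does

t_ratio = d_bar / s_d
d_cv = -T_10_12 * s_d
print(round(s_d, 5), round(df, 3), df_used, round(t_ratio, 3), round(d_cv, 3))
```
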
Two reminders
1) A one-sided hypothesis is tested by a one-sided test, which includes a one-sided critical value or a one-sided confidence interval.
2) A table from the outline:

Methods for Comparing Two Samples.
                                                        Paired Samples   Independent Samples
Location - Normal distribution. Compare means.          Method D4        Methods D1-D3
Location - Distribution not Normal. Compare medians.    Method D5b       Method D5a
Proportions                                                              Method 6
Variability - Normal distribution. Compare variances.                    Method 7
2. a. Turn in your computer output from computer problem 1 only tucked inside this exam paper. (3 - 2 point penalty for not handing this in.)
b. A new battery is being tested for use in a tiny stuffed animal. We will use the new battery if it is longer-lasting than the old one. Use a 95% confidence level. Slightly edited Minitab output is below.
The assumption was that the old battery had an average life of 4.6 hours and this is tested for both the old
and new battery before they are compared. Can we say that either battery has a life that is significantly
different from 4.6 hours? What evidence in what tests led you to your conclusion? (3)
c. Continuing to use the data below, which one of these tests would you use to decide whether to switch
batteries. What would you do? Why? (2)
d. Using an almost Normal curve, and the appropriate values of t, shade the areas represented by the p-values in tests 3, 4 and 5. (3)
e. Using means and standard deviations from the printout, explain how the computer got the values of t in
tests 3, 4 and 5. (2) Part f is at the end of the printout.
Worksheet size: 100000 cells
MTB > RETR 'C:\MINITAB\2X0221-2.MTW'.
Retrieving worksheet from file: C:\MINITAB\2X0221-2.MTW
Worksheet was saved on 3/13/2002
MTB > print 'new'
Data Display
new
   3.3   6.4   3.9   5.4   5.1   7.3   4.5   5.8   6.4   4.5   4.6
   4.9   7.2   6.1   3.9   5.1   4.0   3.5
MTB > print 'old'
Data Display
old
   4.2   2.9   4.5   4.9   5.0   4.0   5.2   3.5   3.0   5.2   4.4
   3.2   4.0   5.3   3.5   5.1   4.5   3.8
MTB > describe 'new' 'old'
Descriptive Statistics
Variable   N    Mean    Median   TrMean   StDev   SEMean
new        18   5.106   5.000    5.081    1.215   0.286
old        18   4.233   4.300    4.250    0.793   0.187

Variable   Min     Max     Q1      Q3
new        3.300   7.300   3.975   6.175
old        2.900   5.300   3.500   5.025
Test 1: MTB > ttest mu=4.6 'new'
T-Test of the Mean
Test of mu = 4.600 vs mu not = 4.600
Variable   N    Mean    StDev   SE Mean   T       P-Value
new        18   5.106   1.215   0.286     1.76    0.096

Test 2: MTB > ttest mu=4.6 'old'
T-Test of the Mean
Test of mu = 4.600 vs mu not = 4.600
Variable   N    Mean    StDev   SE Mean   T       P-Value
old        18   4.233   0.793   0.187     -1.96   0.066
b) Solution: The two tests above are $H_0: \mu_{new} = 4.6$, $H_1: \mu_{new} \ne 4.6$ and $H_0: \mu_{old} = 4.6$, $H_1: \mu_{old} \ne 4.6$. We are using $\alpha = .05$. In both cases the p-value is above the significance level, so do not reject the null hypothesis.
Test 3: MTB > twosamplet 'new' 'old'
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
      N    Mean    StDev   SE Mean
new   18   5.11    1.22    0.29
old   18   4.233   0.793   0.19
95% C.I. for mu new - mu old: ( 0.17, 1.57)
T-Test mu new = mu old (vs not =): T= 2.55 P=0.016 DF= 29

Test 4: MTB > twosamplet 'new' 'old';
SUBC> alt= 1.
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
      N    Mean    StDev   SE Mean
new   18   5.11    1.22    0.29
old   18   4.233   0.793   0.19
95% C.I. for mu new - mu old: ( 0.17, 1.57)
T-Test mu new = mu old (vs >): T= 2.55 P=0.0082 DF= 29

Test 5: MTB > twosample t 'new' 'old';
SUBC> alt = -1.
Two Sample T-Test and Confidence Interval
Twosample T for new vs old
      N    Mean    StDev   SE Mean
new   18   5.11    1.22    0.29
old   18   4.233   0.793   0.19
95% C.I. for mu new - mu old: ( 0.17, 1.57)
T-Test mu new = mu old (vs <): T= 2.55 P=0.99 DF= 29
c. Solution: If the new battery is 'longer-lasting,' we will find that $\mu_{new} > \mu_{old}$, so that our hypotheses are $H_0: \mu_{new} \le \mu_{old}$, $H_1: \mu_{new} > \mu_{old}$. These are the hypotheses tested in test 4. The p-value reported in this test is .0082, which is certainly less than $\alpha = .05$. So reject the null hypothesis and buy the new battery.
d. Solution: Make a diagram showing an almost-normal curve with a mean at zero.
In test 3, a 2-sided test, the p-value is twice the probability that $t \ge 2.55$, so shade the area above 2.55 and below -2.55 and label it 1.6%.
In test 4, a 1-sided test, the p-value is the probability that $t \ge 2.55$, so shade the area above 2.55 and label it 0.82%, which must be half the p-value in test 3.
In test 5, a 1-sided test, the p-value is the probability that $t \le 2.55$, so shade the area below 2.55 and label it 99%. Except for rounding, this is one minus the p-value in test 4.
e. Solution: As is shown on page 6, above, the formulas are $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $t = \frac{\bar{d} - \Delta_0}{s_d}$. The 'describe' printout shows that $\bar{x}_{new} = 5.106$, $\bar{x}_{old} = 4.233$, $\bar{d} = \bar{x}_{new} - \bar{x}_{old} = 5.106 - 4.233 = 0.873$, $\frac{s_{new}}{\sqrt{n_{new}}} = 0.286$ and $\frac{s_{old}}{\sqrt{n_{old}}} = 0.187$. So $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{0.286^2 + 0.187^2} = 0.34171$. Finally $t = \frac{\bar{d} - \Delta_0}{s_d} = \frac{0.873 - 0}{0.34171} = 2.5548$.
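The same computation can be sketched in a few lines, using only the means and the SE Mean column from the printout. Variable names are illustrative.

```python
from math import sqrt

mean_new, mean_old = 5.106, 4.233
se_new, se_old = 0.286, 0.187   # SE Mean column, that is s / sqrt(n)

d_bar = mean_new - mean_old
# Each squared SE Mean is s^2 / n, so their sum is the variance of d_bar.
s_d = sqrt(se_new ** 2 + se_old ** 2)
t = d_bar / s_d
print(round(d_bar, 3), round(s_d, 5), round(t, 2))
```
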
f. Assume that you got the following output from a 'describe' command
Descriptive Statistics
Variable   N     Mean    Median   TrMean   StDev   SEMean
new        150   5.106   5.000    5.081    1.500   0.122
old        150   4.233   4.300    4.250    0.800   0.065
Construct a 96% confidence interval for $\mu_{new} - \mu_{old}$. (3)
Solution: The formula that is used in the large sample case is $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{0.122^2 + 0.065^2} = 0.13824$. $\bar{d} = 5.106 - 4.233 = 0.873$. On Page 1, we found $z_{.02} = 2.05$. The Confidence Interval formula is $\Delta = \bar{d} \pm z_{\alpha/2}\, s_d = 0.873 \pm 2.05(0.13824) = 0.873 \pm 0.283$.
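As a check on the large-sample interval, the sketch below computes the half-width with the table value 2.05 and, for comparison, with the exact Normal quantile from the standard library. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

d_bar = 5.106 - 4.233
s_d = sqrt(0.122 ** 2 + 0.065 ** 2)   # from the SEMean column

z_table = 2.05                        # table value used in the text
z_exact = NormalDist().inv_cdf(0.98)  # exact z with 2% in the upper tail

half_table = z_table * s_d
half_exact = z_exact * s_d
print(round(d_bar - half_table, 3), round(d_bar + half_table, 3))
print(round(z_exact, 4), round(half_exact, 3))
```

The two half-widths differ only in the fourth decimal, so the table value is adequate here.
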
3. (McClave et al.) I am a razor manufacturer and claim that my disposable razor gives more shaves than my competitor's. A sample is taken, with $x_m$ representing the number of shaves per razor on my product and $x_c$ representing the number of shaves on my competitor's product.

Row   me ($x_m$)   rankm ($r_m$)   compet ($x_c$)   rankc ($r_c$)   diff ($d$)
1     8                            10                               -2
2     17           16              6                                11
3     9                            3                1               6
4     11                           7                                4
5     15           15              13               13              2
6     10                           14               14              -4
7     6                            5                2               1
8     12           12              7                                5
For your convenience some calculations have been made. The columns $r_m$ and $r_c$ represent the beginning and end of the ranking of the numbers in $x_m$ and $x_c$. $d$ represents the difference between the numbers in $x_m$ and $x_c$. For the three data columns we have the following.

Variable         Sample Mean   Sample Std Dev
me ($x_m$)       11.00         3.63
compet ($x_c$)   8.12          3.87
diff ($d$)       2.88          4.73

Note that you will probably not need all the information that I am giving you.
a. The authors say that $x_m$ and $x_c$ represent two independent samples. Because of uncertainty about the underlying distribution, the authors imply that you should compare medians to see if my blade is better. State your hypotheses and your conclusion after doing an appropriate test. Use a 95% confidence level. (5)
b. The authors then concede that this was an inefficient way to do the problem and that we ought to compare medians using paired samples. Assume that each line represents the experience of one shaver and repeat the test. (5)
c. Now assume that we find out that the underlying distribution is Normal after all and repeat the test in b, using means instead of medians. (5)
Solution: The data are repeated below with the ranks filled in. Rank totals are found for m and c. Absolute values of d are found and $|d|$ is ranked. The ranks are then corrected for ties and marked by the sign of the difference. $\alpha = .05$.
Row   me ($x_m$)   rankm ($r_m$)   compet ($x_c$)   rankc ($r_c$)   diff ($d$)   $|d|$   $r_d$   $r_d^*$
1     8            7               10               9.5             -2           2       3       2.5-
2     17           16              6                3.5             11           11      8       8+
3     9            8               3                1               6            6       7       7+
4     11           11              7                5.5             4            4       5       4.5+
5     15           15              13               13              2            2       2       2.5+
6     10           9.5             14               14              -4           4       4       4.5-
7     6            3.5             5                2               1            1       1       1+
8     12           12.0            7                5.5             5            5       6       6+
Sum                82.0                             54.0
T+ = 29, T- = 7
a. $H_0: \eta_m \le \eta_c$, $H_1: \eta_m > \eta_c$ (medians). To check the ranking, note that the rank sums add to 82 + 54 = 136. This should be the same as the total of the numbers 1 through 16, $\frac{16(17)}{2} = 136$. For a 1-sided test use the Mann-Whitney-Wilcoxon rank sum test. For $n_1 = n_2 = 8$, Table 6b gives critical values of 52 and 84. Since our rank sums fall between these, we do not reject the null hypothesis.
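The ranking with tie correction can be checked mechanically. This is a minimal sketch: `average_ranks` is a hypothetical helper written for this illustration, and the critical values 52 and 84 are the Table 6b values quoted in the text.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the average of their ranks."""
    ordered = sorted(values)
    # A value first appears at 0-based position i and fills count(v) slots,
    # so its average rank is i + (count + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

x_m = [8, 17, 9, 11, 15, 10, 6, 12]
x_c = [10, 6, 3, 7, 13, 14, 5, 7]

ranks = average_ranks(x_m + x_c)   # rank all 16 observations together
sum_m = sum(ranks[:len(x_m)])
sum_c = sum(ranks[len(x_m):])

LOWER, UPPER = 52, 84              # critical values quoted in the text
reject = not (LOWER < sum_m < UPPER)
print(sum_m, sum_c, reject)
```
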
b. $H_0: \eta_m \le \eta_c$, $H_1: \eta_m > \eta_c$. For a test of the correctness of the ranking, note that the two Ts in b add to 29 + 7 = 36. This should be the same as the sum of the first eight numbers, $\frac{8(9)}{2} = 36$. For a Wilcoxon Signed Rank Sum Test, compare $T = 7$, the smaller of the totals, with Table 7. For $n = 8$, the 5% critical value is 6. Since T is above this value, we do not reject the null hypothesis.
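The signed-rank totals can be verified the same way. Again a sketch: `average_ranks` is a hypothetical helper, and the critical value 6 is the Table 7 value quoted in the text.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

x_m = [8, 17, 9, 11, 15, 10, 6, 12]
x_c = [10, 6, 3, 7, 13, 14, 5, 7]

d = [m - c for m, c in zip(x_m, x_c)]
d = [v for v in d if v != 0]                 # zero differences would be dropped
ranks = average_ranks([abs(v) for v in d])   # rank the absolute differences

t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
t_minus = sum(r for r, v in zip(ranks, d) if v < 0)

CRITICAL = 6                                 # 5% value for n = 8 quoted in the text
reject = min(t_plus, t_minus) <= CRITICAL
print(t_plus, t_minus, reject)
```
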
c. If the parent distribution is Normal, we use Method D4. From the outline, there are three ways of approaching a problem involving two means. You should have chosen one! We know that $\bar{d} = \bar{x}_m - \bar{x}_c = 2.88$, $\Delta = \mu_m - \mu_c$, $s_d = 4.73$, $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.73}{\sqrt{8}} = 1.6723$. We are testing $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$, or $H_0: \Delta \le 0$, $H_1: \Delta > 0$, with $df = n - 1 = 7$, $t_{\alpha}^{(n-1)} = t_{.05}^{(7)} = 1.895$.
(i) Confidence Interval: $\Delta = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_{\bar{d}}$. This interval becomes $\Delta \ge \bar{d} - t_{\alpha}\, s_{\bar{d}} = 2.88 - 1.895(1.6723) = 2.88 - 3.17 = -0.29$. Since this interval includes zero, we cannot reject $H_0$.
(ii) Test Ratio: $t = \frac{\bar{d} - \Delta_0}{s_{\bar{d}}}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_{\bar{d}}}$. Here $t = \frac{\bar{d} - \Delta_0}{s_{\bar{d}}} = \frac{2.88 - 0}{1.6723} = 1.722$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region above $t_{\alpha}^{(n-1)} = t_{.05}^{(7)} = 1.895$. Since 1.722 is not in this region, we cannot reject $H_0$.
(iii) Critical Value: $\bar{d}_{CV} = \Delta_0 \pm t_{\alpha/2}\, s_{\bar{d}}$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_{\bar{d}}$. Because this is a one-sided test, we want one critical value above zero. The critical value formula becomes $\bar{d}_{CV} = \Delta_0 + t_{\alpha}\, s_{\bar{d}} = 0 + 1.895(1.6723) = 3.169$. Make a diagram showing an almost Normal curve with a mean at zero and a 'reject' region above 3.169. Since 2.88 is not in this region, we cannot reject $H_0$.
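A sketch of the paired-sample calculation from the raw differences follows. The t value 1.895 is the table value quoted in the text and the variable names are illustrative. Since it uses the unrounded mean 2.875 rather than 2.88, the test ratio comes out marginally below the 1.722 shown above.

```python
from math import sqrt
from statistics import mean, stdev

x_m = [8, 17, 9, 11, 15, 10, 6, 12]
x_c = [10, 6, 3, 7, 13, 14, 5, 7]
T_05_7 = 1.895                       # t-table value for alpha = .05 and 7 DF

d = [m - c for m, c in zip(x_m, x_c)]
n = len(d)
d_bar = mean(d)
se = stdev(d) / sqrt(n)              # standard error of the mean difference

t_ratio = d_bar / se                 # (ii) test ratio
lower_bound = d_bar - T_05_7 * se    # (i) one-sided interval: Delta >= lower_bound
d_cv = T_05_7 * se                   # (iii) critical value for d_bar
reject = t_ratio > T_05_7
print(round(d_bar, 3), round(se, 4), round(t_ratio, 2), round(lower_bound, 2), round(d_cv, 2))
```
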
4. According to your authors, when a sample of 74 women students were asked whether they would consent
to an interview with a travelling recruiter in a local office building, 73 said yes. However, when another
sample of 74 women were asked whether they would consent to a similar interview in a hotel room, only 46
said yes. This represents a tremendous difference in the proportion that said yes and the researcher was
asked to verify her findings by repeating the question about the hotel room interview to a second sample of
100 students. This time 64 out of the 100 students said yes.
a. Was the proportion that consented to the hotel interview significantly different between the sample of 74
and the sample of 100? State your hypotheses and test them using a 90% confidence level? (4)
b. The sponsoring firm was so upset by the results that the researcher was asked to interview another sample
of 200 women. This time, out of the 200 women, 132 said that they would consent. Check to see if the
proportions of the samples of 74, 100 and 200 women who consented differ. Again use a 90% confidence
level. (6)
c. Do a 2-sided confidence interval for the difference between the proportion of women in the samples of 74
who will consent to interviews in an office building and a hotel room. (3)
d. A business has just completed a switch to a new invoicing system. Previously the number of errors per
invoice seemed to follow a Poisson distribution with a mean of 0.2. After the switch to the new system was
made, a sample of 500 invoices was taken. Of these 479 had no errors, 10 had one error, 8 had 2 errors, 2
had 3 errors and one had 4 errors. Does the Poisson(0.2) distribution still apply? (5)
e. The business wants to know if some Poisson distribution works for the data in part d. Using your Poisson
table as best you can, how do you go about this and what do you conclude? (3)
Solution: To summarize the information in parts a and b: $\alpha = .10$ and

Hotel Room              Sample 1   Sample 2   Sample 3
Yes                     46         64         132
No                      28         36         68
Total                   74         100        200
Proportion saying yes   .6216      .6400      .6600

We are comparing $\bar{p}_1 = .6216$, $n_1 = 74$ and $\bar{p}_2 = .6400$, $n_2 = 100$.
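Interval for: Difference between proportions ($q = 1 - p$)
Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$, where $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$ and $s_{\Delta p} = \sqrt{\frac{\bar{p}_1\bar{q}_1}{n_1} + \frac{\bar{p}_2\bar{q}_2}{n_2}}$
Hypotheses: $H_0: \Delta p = \Delta p_0$, $H_1: \Delta p \ne \Delta p_0$, where $\Delta p_0 = p_{01} - p_{02}$ or $\Delta p_0 = 0$
Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}}$
If $\Delta p_0 = 0$: $\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, with $\bar{p}_0 = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$
If $\Delta p_0 \ne 0$: $\sigma_{\Delta p} = \sqrt{\frac{p_{01}q_{01}}{n_1} + \frac{p_{02}q_{02}}{n_2}}$, or use $s_{\Delta p}$
Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$

$s_{\Delta p} = \sqrt{\frac{\bar{p}_1\bar{q}_1}{n_1} + \frac{\bar{p}_2\bar{q}_2}{n_2}} = \sqrt{\frac{.6216(.3784)}{74} + \frac{.6400(.3600)}{100}} = \sqrt{.00317856 + .00230400} = \sqrt{.00548256} = .074044$

$\Delta\bar{p} = \bar{p}_1 - \bar{p}_2 = -.0184$, $\bar{p}_0 = \frac{46 + 64}{74 + 100} = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2} = \frac{74(.6216) + 100(.6400)}{74 + 100} = .6322$,
$\alpha = .10$, $z_{\alpha/2} = z_{.05} = 1.645$. Note that $q = 1 - p$ and that q and p are between 0 and 1.

$\sigma_{\Delta p} = \sqrt{\bar{p}_0\bar{q}_0\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{.6322(.3678)\left(\frac{1}{74} + \frac{1}{100}\right)} = \sqrt{.005467} = .073942$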
a) $H_0: \Delta p = 0$, $H_1: \Delta p \ne 0$. Same as $H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$, or $H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \ne 0$.
There are three ways to do this problem. Only one is needed.
(i) Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.0184 - 0}{.073942} = -0.2488$. Make a diagram showing 'reject' regions above 1.645 and below -1.645. Since -0.2488 is between these values, do not reject $H_0$.
(ii) Critical Value: $\Delta\bar{p}_{cv} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p} = 0 \pm 1.645(.073942) = \pm 0.1216$. Make a diagram showing a 'reject' region above 0.1216 and below -0.1216. Since -0.0184 is between these values, do not reject $H_0$.
(iii) Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p} = -.0184 \pm 1.645(.074044) = -.0184 \pm .1218$, or -0.1402 to 0.1034. Since zero is between these values, do not reject $H_0$.
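The two standard errors used in part a (pooled for the test, unpooled for the interval) are easy to mix up, so a sketch from the raw counts is given below. The z value 1.645 is the table value from the text and the variable names are illustrative.

```python
from math import sqrt

yes1, n1 = 46, 74     # first hotel-room sample
yes2, n2 = 64, 100    # second hotel-room sample
Z_05 = 1.645          # z-table value for alpha/2 = .05

p1, p2 = yes1 / n1, yes2 / n2
diff = p1 - p2

# Test ratio: pooled proportion under H0: p1 = p2.
p0 = (yes1 + yes2) / (n1 + n2)
sigma = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
z = diff / sigma
reject = abs(z) > Z_05

# Confidence interval: unpooled standard error.
s = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (diff - Z_05 * s, diff + Z_05 * s)
print(round(z, 4), reject, round(ci[0], 4), round(ci[1], 4))
```
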
b) $H_0$: Homogeneous (or $p_1 = p_2 = p_3$), $H_1$: Not homogeneous (not all ps are equal). $DF = (r - 1)(c - 1) = (1)(2) = 2$, $\chi^{2(2)}_{.10} = 4.6052$.

O       1     2     3     Total   $p_r$
Yes     46    64    132   242     .647
No      28    36    68    132     .353
Total   74    100   200   374     1.000

E       1       2        3        Total    $p_r$
Yes     47.88   64.70    129.40   241.98   .647
No      26.12   35.30    70.60    132.02   .353
Total   74.00   100.00   200.00   374.00   1.000

The proportions in rows, $p_r$, are used with column totals to get the items in E. Note that row and column sums in E are the same as in O. (Note that $\chi^2 = 0.379 = 374.379 - 374$ is computed two different ways here; only one way is needed.)

Row   O     E        E - O      $(E - O)^2$   $\frac{(E - O)^2}{E}$   $\frac{O^2}{E}$
1     46    47.88    1.88000    3.53440       0.073818                44.194
2     28    26.12    -1.88000   3.53440       0.135314                30.015
3     64    64.70    0.70000    0.49000       0.007573                63.308
4     36    35.30    -0.70000   0.49000       0.013881                36.714
5     132   129.40   -2.60001   6.76003       0.052241                134.652
6     68    70.60    2.60000    6.75999       0.095751                65.496
Sum   374   374.00   0.00001                  0.378576                374.379

Since the $\chi^2$ computed here is less than the $\chi^2$ from the table, we do not reject $H_0$.
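The chi-squared homogeneity computation, including the shortcut formula, can be sketched as follows. The critical value is the table value from the text and the variable names are illustrative. Expected counts here use unrounded row proportions, so the statistic matches the table above to about three decimals.

```python
observed = [[46, 64, 132],   # Yes
            [28, 36, 68]]    # No
CHI2_10_2 = 4.6052           # chi-squared table value for alpha = .10 and 2 DF

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count = row proportion times column total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]
pairs = [(o, e) for orow, erow in zip(observed, expected) for o, e in zip(orow, erow)]

chi2 = sum((o - e) ** 2 / e for o, e in pairs)
shortcut = sum(o ** 2 / e for o, e in pairs) - grand   # the second method in the text
df = (len(observed) - 1) * (len(observed[0]) - 1)
reject = chi2 > CHI2_10_2
print(df, round(chi2, 3), round(shortcut, 3), reject)
```
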
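c) Let us call the proportion of women who consented to the office interview $\bar{p}_4 = \frac{73}{74} = .9864$. Recall that $\bar{p}_1 = \frac{46}{74} = .6216$, so $\Delta\bar{p} = \bar{p}_4 - \bar{p}_1 = .9864 - .6216 = .3648$. $\alpha = .10$, $z_{\alpha/2} = z_{.05} = 1.645$.

$s_{\Delta p} = \sqrt{\frac{\bar{p}_4\bar{q}_4}{n_4} + \frac{\bar{p}_1\bar{q}_1}{n_1}} = \sqrt{\frac{.9864(.0136)}{74} + \frac{.6216(.3784)}{74}} = \sqrt{.00018128 + .00317856} = \sqrt{.00335984} = .057964$

$\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p} = .3648 \pm 1.645(.057964) = .3648 \pm .0954$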
d) If we take the column in the Poisson table for the Poisson distribution with a mean of .2 and multiply it by 500, we get E. O is given in the problem.

x   Poisson(0.2)   E         O
0   0.818731       409.366   479
1   0.163746       81.873    10
2   0.016375       8.187     8
3   0.001092       0.546     2
4   0.000055       0.027     1

But the E column has items in it below 5 (and 2) and thus can be used only if the last three cells are added together. $DF = 3 - 1 = 2$, $\chi^{2(2)}_{.10} = 4.6052$.

x       O     E         E - O      $(E - O)^2$   $\frac{(E - O)^2}{E}$   $\frac{O^2}{E}$
0       479   409.366   -69.6345   4848.96       11.8451                 560.480
1       10    81.873    71.8730    5165.73       63.0944                 1.221
2+      11    8.761     -2.2390    5.01          0.5722                  13.811
Total   500   500.000   0.0005                   75.5117                 575.512

Since $\chi^2 = 75.512 = 575.512 - 500$ is larger than the table value of 4.6052, we reject the null hypothesis.
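The goodness-of-fit calculation in d) can be sketched without a Poisson table by computing the probabilities directly. The helper `poisson_pmf` and the variable names are illustrative, and the critical value is the table value from the text.

```python
from math import exp, factorial

observed = [479, 10, 8, 2, 1]   # invoices with 0, 1, 2, 3, 4 errors
n = sum(observed)
MEAN_H0 = 0.2                   # Poisson mean under the null hypothesis
CHI2_10_2 = 4.6052              # chi-squared table value for alpha = .10 and 2 DF

def poisson_pmf(k, m):
    """P(x = k) for a Poisson distribution with mean m."""
    return exp(-m) * m ** k / factorial(k)

# Pool the cells for 2 or more errors so that no expected count is tiny.
e0 = n * poisson_pmf(0, MEAN_H0)
e1 = n * poisson_pmf(1, MEAN_H0)
expected = [e0, e1, n - e0 - e1]
pooled_obs = [observed[0], observed[1], sum(observed[2:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, expected))
reject = chi2 > CHI2_10_2
print([round(e, 3) for e in expected], round(chi2, 2), reject)
```
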
e) There isn't much choice here, but our estimate of the mean is $\bar{x} = \frac{479(0) + 10(1) + 8(2) + 2(3) + 1(4)}{500} = .072$. The closest we can come on the table is Poisson(0.1). If we do the same thing we did in d), we get a ridiculous situation, since the only numbers in E that are above 5 are the first two and scrunching the bottom 3 cells together produces a number that results in a gigantic contribution to $\chi^2$.

x   Poisson(0.1)   E         O
0   0.904837       452.419   479
1   0.090484       45.242    10
2   0.004524       2.262     8
3   0.000151       0.075     2
4   0.000004       0.002     1

Our table reads:

x       O     E         E - O      $(E - O)^2$   $\frac{(E - O)^2}{E}$   $\frac{O^2}{E}$
1       479   452.419   -26.5815   706.575       1.5618                  507.143
2       21    47.581    26.5810    706.550       14.8494                 9.268
Total   500   500.000   0.0005                   16.4112                 516.412

But the degrees of freedom seem to be 2 - 1 - 1 = 0. (The second -1 is because we estimated a parameter from the data.) This is worse than the dreaded $2 \times 2$ case, which needs corrections or should be done with proportions directly. However, given the fact that $\chi^{2(1)}_{.10} = 2.70554$ is smaller than the $\chi^2$ that we computed, we don't have a very good fit here.
© 2002 Roger Even Bove