252solnD1 2/26/03 (Open this document in 'Page Layout' view!) Re-edited to replace  or  with D .
D. COMPARISON OF TWO SAMPLES
1. Two Means, Two Independent Samples, Large Samples.
9.2, 9.3
2. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances Assumed Equal.
9.4, 9.6a,b†, 9.19†
3. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances not Assumed Equal.
(D3, D4)
4. Two Means, Paired Samples (If samples are small, populations should be normally distributed).
D1, D2, 9.33*
5. Rank Tests.
Downing & Clark 17-15, 17-9. Text 15-12†, 15-28†
a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
b. Wilcoxon Signed Rank Test for Paired Samples.
D5
6. Proportions.
9.42†, 9.44*, look at 9.40†
7. Variances.
D6, D7, 9.70†, 9.71†, 9.76*, 9.77*
Solutions to outline points 1 through 3 are in this document.
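From the formula table

Difference between Two Means ($\sigma$ known)
  Confidence Interval: $D = d \pm z_{\alpha/2}\,\sigma_d$
  Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
  Test Ratio: $z = \dfrac{d - D_0}{\sigma_d}$
  Critical Value: $d_{cv} = D_0 \pm z_{\alpha/2}\,\sigma_d$
  where $\sigma_d = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$ and $d = \bar{x}_1 - \bar{x}_2$

Difference between Two Means ($\sigma$ unknown, variances assumed equal)
  Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d$
  Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
  Test Ratio: $t = \dfrac{d - D_0}{s_d}$
  Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
  where $s_d = \hat{s}_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, $\hat{s}_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ and $DF = n_1 + n_2 - 2$

Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
  Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d$
  Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
  Test Ratio: $t = \dfrac{d - D_0}{s_d}$
  Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
  where $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ and $DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$

* Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $D_0 = 0$.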
Problems with 2 means and independent samples
Exercise 9.2: In this problem the population variances are known. Our data is as follows:
For sample 1: $\mu_1 = 12$, $\sigma_1 = 4$, $n_1 = 64$, $\bar{x}_1$ = sample mean.
For sample 2: $\mu_2 = 10$, $\sigma_2 = 3$, $n_2 = 64$, $\bar{x}_2$ = sample mean.
If $\sigma$ is known and the data is chosen from a normal population, $\bar{x}$ will have the normal distribution with a standard deviation of $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ or a variance of $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$. Sums or differences of sample means will also have the normal distribution and the variances of sums or differences of sample means from independent samples will be the sums of the variances of the individual sample means.
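The rule above can be checked numerically. The following is a minimal sketch using the Exercise 9.2 figures; the function names `se_mean` and `se_difference` are illustrative and do not come from the text.

```python
from math import sqrt

def se_mean(sigma, n):
    """Standard deviation of a sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def se_difference(sigma1, n1, sigma2, n2):
    """Standard deviation of x1_bar - x2_bar for independent samples.

    The variances of the two sample means add, so we take the square
    root of their sum.
    """
    return sqrt(sigma1**2 / n1 + sigma2**2 / n2)

se1 = se_mean(4, 64)                 # sample 1: sigma = 4, n = 64
se2 = se_mean(3, 64)                 # sample 2: sigma = 3, n = 64
se_d = se_difference(4, 64, 3, 64)   # difference of the two means
print(se1, se2, se_d)                # 0.5 0.375 0.625
```

Note that `se_d` is not `se1 + se2`: it is the variances, not the standard deviations, that add.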

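a) For sample 1, $\mu_1 = 12$, $\sigma_{\bar{x}_1} = \dfrac{\sigma_1}{\sqrt{n_1}} = \dfrac{4}{\sqrt{64}} = 0.5$. So $\bar{x}_1 \sim N(12, 0.5)$.
b) For sample 2, $\mu_2 = 10$, $\sigma_{\bar{x}_2} = \dfrac{\sigma_2}{\sqrt{n_2}} = \dfrac{3}{\sqrt{64}} = 0.375$. So $\bar{x}_2 \sim N(10, 0.375)$.
c) Last semester we learned that if $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$, then
$E(d) = E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 = D$, and here $\mu_1 - \mu_2 = 12 - 10 = 2 = D$, and
$\sigma_d^2 = Var(d) = Var(\bar{x}_1 - \bar{x}_2) = Var(\bar{x}_1) - 2Cov(\bar{x}_1, \bar{x}_2) + Var(\bar{x}_2)$.
But if the first and second samples are independent, $Cov(\bar{x}_1, \bar{x}_2) = 0$, $Var(\bar{x}_1) = \sigma_{\bar{x}_1}^2 = \dfrac{\sigma_1^2}{n_1} = \dfrac{4^2}{64} = \dfrac{16}{64}$ and $Var(\bar{x}_2) = \sigma_{\bar{x}_2}^2 = \dfrac{\sigma_2^2}{n_2} = \dfrac{3^2}{64} = \dfrac{9}{64}$. So
$\sigma_d^2 = Var(\bar{x}_1) - 2Cov(\bar{x}_1, \bar{x}_2) + Var(\bar{x}_2) = \dfrac{\sigma_1^2}{n_1} - 0 + \dfrac{\sigma_2^2}{n_2} = \dfrac{16}{64} + \dfrac{9}{64} = \dfrac{25}{64}$.
Finally $\sigma_d = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{25}{64}} = \dfrac{5}{8} = 0.625$. (The formula in the outline is $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.) So we conclude $d = \bar{x}_1 - \bar{x}_2 \sim N(2, 0.625)$.
d) Yes.
Exercise 9.3: a) From the outline, if $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$, the formula for a confidence interval is $D = d \pm t_{\alpha/2}\,s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_d$, and for large samples we replace $t$ with $z$ and use $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
For sample 1: $s_1 = 150$, $n_1 = 400$, $\bar{x}_1 = 5275$.
For sample 2: $s_2 = 200$, $n_2 = 400$, $\bar{x}_2 = 5240$.
Note that $n_1 + n_2 - 2 = 400 + 400 - 2 = 798$. Since this is a large sample, we replace $t$ with $z$. $\alpha = .05$, so $z_{\alpha/2} = z_{.025} = 1.960$,
$s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{150^2}{400} + \dfrac{200^2}{400}} = \sqrt{\dfrac{62500}{400}} = \sqrt{156.25} = 12.5$ and
$d = \bar{x}_1 - \bar{x}_2 = 5275 - 5240 = 35$. Finally $D = d \pm z_{\alpha/2}\,s_d = 35 \pm 1.96(12.5) = 35 \pm 24.5$ or 10.5 to 59.5.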
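As a check on the arithmetic in Exercise 9.3 a), here is a minimal sketch of the large-sample confidence interval. The function name `z_confidence_interval` is illustrative; the critical value comes from Python's `statistics.NormalDist` rather than from a printed table, so it is 1.95996 rather than the rounded 1.960.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x1, s1, n1, x2, s2, n2, alpha=0.05):
    """Two-sided interval D = d +/- z_{alpha/2} * s_d for large samples."""
    d = x1 - x2
    s_d = sqrt(s1**2 / n1 + s2**2 / n2)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point
    return d - z * s_d, d + z * s_d

low, high = z_confidence_interval(5275, 150, 400, 5240, 200, 400)
print(round(low, 1), round(high, 1))   # 10.5 59.5
```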
b) $H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.
(i) From the outline or Table 3 b. Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 0}{12.5} = 2.8$. Make a diagram with zero in the middle showing 'reject' regions below -1.96 and above 1.96. Since 2.8 falls in the upper 'reject' region, reject $H_0$.
Or use the p-value: $pval = 2P(z \ge 2.80) = 2(.5 - .4974) = .0052$. Since the p-value is below the significance level $\alpha = .05$, reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not reject $H_0$. $d_{CV} = D_0 \pm z_{\alpha/2}\,s_d = 0 \pm 1.960(12.5) = 0 \pm 24.5$. Make a diagram with zero in the middle showing 'reject' regions below -24.5 and above 24.5. Since $d = 35$ falls in the upper 'reject' region, reject $H_0$.
(iii) Confidence Interval: Since $D_0 = 0$ does not fall in the confidence interval in a), reject $H_0$.
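The test ratio, p-value and critical-value methods of part b) can be put side by side in code. This is a sketch only: `two_sided_z_test` is a hypothetical helper, and the p-value comes from the exact normal distribution, so it differs slightly from the table-based .0052.

```python
from statistics import NormalDist

def two_sided_z_test(d, s_d, d0=0.0, alpha=0.05):
    """Two-sided large-sample test of H0: D = d0 against H1: D != d0."""
    std_normal = NormalDist()
    z = (d - d0) / s_d                               # test ratio
    p_value = 2 * (1 - std_normal.cdf(abs(z)))       # two-tailed p-value
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    d_cv = (d0 - z_crit * s_d, d0 + z_crit * s_d)    # critical values for d
    reject = abs(z) > z_crit
    return z, p_value, d_cv, reject

z, p, d_cv, reject = two_sided_z_test(35, 12.5)
print(round(z, 2), round(p, 4), reject)
```

All three methods must agree: the test ratio exceeds the critical z exactly when d lies outside the critical values and exactly when the p-value is below the significance level.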
c) $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$, or $H_0: D \le 0$, $H_1: D > 0$.
(i) Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies below $t_\alpha$, do not reject $H_0$. $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 0}{12.5} = 2.8$. Make a diagram with zero in the middle showing a 'reject' region above $z_{.05} = 1.645$. Since 2.8 falls in the 'reject' region, reject $H_0$.
Or use the p-value: $pval = P(z \ge 2.80) = .5 - .4974 = .0026$. Since the p-value is below the significance level, .05, reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 + t_\alpha\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) + t_\alpha\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is not above the critical value, do not reject $H_0$. $d_{CV} = D_0 + z_\alpha\,s_d = 0 + 1.645(12.5) = 20.5625$. Make a diagram with zero in the middle showing a 'reject' region above 20.5625. Since $d = 35$ falls in the upper 'reject' region, reject $H_0$.
(iii) Confidence Interval: The formula for a two-sided confidence interval was $D = d \pm z_{\alpha/2}\,s_d$. But since the alternate hypothesis is now $H_1: D > 0$, the confidence interval becomes $D \ge d - z_\alpha\,s_d = 35 - 1.645(12.5) = 14.4375$. Make a diagram. Shade the area above 14.4375 to represent the confidence interval, $D \ge 14.4375$. Shade the area below zero to represent the null hypothesis, $H_0: D \le 0$. Since the two areas do not touch, the confidence interval contradicts the null hypothesis, and we must reject it.
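The one-sided version in part c) changes only the tail. A minimal sketch follows, with `upper_tail_z_test` as an illustrative name; it uses the exact z value 1.64485 instead of the rounded 1.645, so its results differ from the text in the third decimal place.

```python
from statistics import NormalDist

def upper_tail_z_test(d, s_d, d0=0.0, alpha=0.05):
    """One-sided large-sample test of H0: D <= d0 against H1: D > d0."""
    std_normal = NormalDist()
    z = (d - d0) / s_d
    p_value = 1 - std_normal.cdf(z)          # upper tail only
    z_crit = std_normal.inv_cdf(1 - alpha)   # all of alpha in one tail
    d_cv = d0 + z_crit * s_d                 # single critical value for d
    lower_bound = d - z_crit * s_d           # one-sided interval: D >= lower_bound
    return z, p_value, d_cv, lower_bound

z, p, d_cv, lower_bound = upper_tail_z_test(35, 12.5)
print(round(p, 4), round(d_cv, 2), round(lower_bound, 2))
```

Because the whole significance level sits in one tail, the p-value is half the two-sided p-value for the same test ratio.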
d) $H_0: \mu_1 - \mu_2 = 25$, $H_1: \mu_1 - \mu_2 \ne 25$, or $H_0: D = 25$, $H_1: D \ne 25$, so this time $D_0 = 25$.
(i) Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$. If this test ratio lies between $\pm t_{\alpha/2} = \pm z_{\alpha/2}$, do not reject $H_0$. $z = \dfrac{d - D_0}{s_d} = \dfrac{35 - 25}{12.5} = 0.8$. Make a diagram with zero in the middle showing 'reject' regions below -1.96 and above 1.96. Since 0.8 does not fall in one of the 'reject' regions, do not reject $H_0$.
Or use the p-value: $pval = 2P(z \ge 0.80) = 2(.5 - .2881) = .4238$. Since the p-value is above the significance level, do not reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d$ or $(\bar{x}_1 - \bar{x}_2)_{CV} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\,s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not reject $H_0$. $d_{CV} = D_0 \pm z_{\alpha/2}\,s_d = 25 \pm 1.960(12.5) = 25 \pm 24.5$, or 0.5 and 49.5. Make a diagram with 25 in the middle showing 'reject' regions below 0.5 and above 49.5. Since $d = 35$ does not fall in a 'reject' region, do not reject $H_0$.
(iii) Confidence Interval: Since $D_0 = 25$ falls in the confidence interval in a) (10.5 to 59.5), do not reject $H_0$.
e) You must assume that $\bar{x}_1$ and $\bar{x}_2$ are two independent random variables.
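The confidence-interval method in parts b) and d) uses the same interval from part a) and asks only whether the hypothesized value lies inside it. A small sketch, with the hypothetical helper `rejects_h0` and the interval endpoints taken from part a):

```python
def rejects_h0(d0, interval):
    """Reject H0: D = d0 when d0 lies outside the two-sided interval."""
    low, high = interval
    return not (low <= d0 <= high)

interval = (10.5, 59.5)            # 95% interval for D from part a)
print(rejects_h0(0, interval))     # True: part b) rejects D = 0
print(rejects_h0(25, interval))    # False: part d) does not reject D = 25
```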
Exercise 9.4: To use the t statistic, the narrowest assumptions are that we have two independent samples and that each population is approximately normal. If we wish to use the method presented in the text (the second method in the outline), where $t = t^{(n_1 + n_2 - 2)}$, $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$ and $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, we must also assume that the population variances are equal.
Exercise 9.6a,b†: All this problem wants is computation of $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ to use in $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$.
a) For sample 1: $s_1^2 = 120$, $n_1 = 25$. For sample 2: $s_2^2 = 100$, $n_2 = 25$.
So $s_p^2 = \dfrac{(25 - 1)120 + (25 - 1)100}{25 + 25 - 2} = \dfrac{24(120) + 24(100)}{48} = \dfrac{120 + 100}{2} = 110$.
b) For sample 1: $s_1^2 = 12$, $n_1 = 20$. For sample 2: $s_2^2 = 20$, $n_2 = 10$.
So $s_p^2 = \dfrac{(20 - 1)12 + (10 - 1)20}{20 + 10 - 2} = \dfrac{19(12) + 9(20)}{28} = \dfrac{408}{28} = 14.5714$.
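The pooled variance is a weighted average of the two sample variances, weighted by degrees of freedom. A minimal sketch of the Exercise 9.6 computations, with `pooled_variance` as an illustrative name:

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled variance: sample variances weighted by n - 1."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

sp_a = pooled_variance(120, 25, 100, 25)   # part a)
sp_b = pooled_variance(12, 20, 20, 10)     # part b)
print(sp_a, round(sp_b, 4))                # 110.0 14.5714
```

With equal sample sizes, as in part a), the pooled variance is the simple average of the two variances. In part b) it is pulled toward the variance of the larger sample.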
Exercise 9.19†: a) For sample 1: $\bar{x}_1 = 0.0491$, $s_1^2 = 0.009800$, $n_1 = 27$. For sample 2: $\bar{x}_2 = 0.0307$, $s_2^2 = 0.002465$, $n_2 = 23$. $\alpha = .05$. We are assuming $\sigma_1^2 = \sigma_2^2$. $d = \bar{x}_1 - \bar{x}_2 = 0.0491 - 0.0307 = 0.0184$.
So $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{26(0.009800) + 22(0.002465)}{48} = 0.006438$
and $s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.006438\left(\dfrac{1}{27} + \dfrac{1}{23}\right)} = \sqrt{0.006438(0.037037 + 0.043478)} = \sqrt{0.006438(0.080515)} = \sqrt{0.00051836} = 0.02276747$. $df = n_1 + n_2 - 2 = 27 + 23 - 2 = 48$.
$H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.
(i) If we use a test ratio, $t = \dfrac{d - D_0}{s_d} = \dfrac{0.0184 - 0}{0.02276747} = 0.80817$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Make a diagram with zero in the middle showing 'reject' regions below $-t_{.025}^{48} = -2.011$ and above $t_{.025}^{48} = 2.011$. Since 0.80817 does not fall in a 'reject' region, do not reject $H_0$.
Or use the p-value: $pval = 2P(t \ge 0.80817)$. Since 0.80817 lies between $t_{.25}^{48} = 0.680$ and $t_{.20}^{48} = 0.849$, we can say $.40 < pval < .50$. Since the p-value is above the significance level, do not reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.011(0.02276747) = \pm 0.0458$. Make a diagram with zero in the middle showing 'reject' regions below -0.0458 and above 0.0458. Since $d = 0.0184$ does not fall in a 'reject' region, do not reject $H_0$.
b) The confidence interval is $D = d \pm t_{\alpha/2}\,s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_d = 0.0184 \pm 2.011(0.02276747) = 0.0184 \pm 0.0458$ or -0.0274 to 0.0642. Since this interval includes $D_0 = 0$, do not reject $H_0$.
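The whole equal-variance procedure of Exercise 9.19 fits in one short function. This is a sketch under stated assumptions: `pooled_t_test` is a hypothetical helper, and because the Python standard library has no t distribution, the critical value 2.011 (48 degrees of freedom, upper .025 point) is passed in from the t table quoted in the text.

```python
from math import sqrt

def pooled_t_test(x1, s1_sq, n1, x2, s2_sq, n2, t_crit, d0=0.0):
    """Two-sided equal-variance t test using a tabled critical value."""
    d = x1 - x2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    s_d = sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (d - d0) / s_d                                   # test ratio
    interval = (d - t_crit * s_d, d + t_crit * s_d)      # two-sided interval
    return t, s_d, interval, abs(t) > t_crit

t, s_d, interval, reject = pooled_t_test(
    0.0491, 0.009800, 27, 0.0307, 0.002465, 23, t_crit=2.011
)
print(round(t, 3), round(s_d, 5), reject)
print(round(interval[0], 4), round(interval[1], 4))
```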
Problem D3 (Optional): A secretary types 16 pages on word processor 1 and 16 pages on word processor 2. Her times are:
$\bar{x}_1 = 8.20$, $s_1^2 = 4.10$
$\bar{x}_2 = 7.10$, $s_2^2 = 4.20$
If $D = \mu_1 - \mu_2$, test $D \le 0$ at the 90 per cent confidence level. Assume that these are independent samples and that $\sigma_1^2 \ne \sigma_2^2$. ($\alpha = .10$)
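Solution: We have $n_1 = 16$ and $n_2 = 16$. The two-sided confidence interval for this problem was done in class.
From the Syllabus supplement:

Difference Between Two Means ($\sigma$ Unknown, Variances Assumed Unequal)
  Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d$
  Hypotheses: $H_0: D = D_0$, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
  Test Ratio: $t = \dfrac{d - D_0}{s_d}$
  Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\,s_d$
  where $s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ and $DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$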
We found the following in class:
$\dfrac{s_1^2}{n_1} = \dfrac{4.1}{16} = 0.25625$, $\dfrac{s_2^2}{n_2} = \dfrac{4.2}{16} = 0.26250$, so $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} = 0.25625 + 0.26250 = 0.51875$,
$s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{0.51875} = 0.720$, $d = \bar{x}_1 - \bar{x}_2 = 8.20 - 7.10 = 1.10$ and
$DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \dfrac{(0.51875)^2}{\dfrac{(0.25625)^2}{15} + \dfrac{(0.26250)^2}{15}} = \dfrac{0.26910}{0.00438 + 0.00459} = 29.996$.
I will follow my own advice this time and round the degrees of freedom down to 29. (If we had followed this advice in class, we would have used $t_{.025}^{29} = 2.045$ and the two-sided confidence interval would have been $D = d \pm t_{\alpha/2}\,s_d = 1.10 \pm 2.045(0.720) = 1.10 \pm 1.47$.)
We are now testing $H_0: \mu_1 \le \mu_2$, $H_1: \mu_1 > \mu_2$, or $H_0: D \le 0$, $H_1: D > 0$. Since our hypotheses are one-sided we use $t_{.10}^{29} = 1.311$.
(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d} = \dfrac{1.10 - 0}{0.720} = 1.527$. Make a diagram with zero in the middle showing a 'reject' region above $t_{.10}^{29} = 1.311$. Since 1.527 falls in the 'reject' region, reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 + t_\alpha\,s_d = 0 + 1.311(0.720) = 0.944$. Make a diagram with zero in the middle showing a 'reject' region above 0.944. Since $d = 1.10$ falls in the 'reject' region, reject $H_0$.
(iii) Confidence interval: $D = d \pm t_{\alpha/2}\,s_d$ becomes $D \ge d - t_\alpha\,s_d = 1.10 - 1.311(0.720) = 0.156$. $D \ge 0.156$ contradicts the null hypothesis $D \le 0$, so reject $H_0$.
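The degrees-of-freedom formula is the most error-prone step in Problem D3, so a numerical check is useful. A minimal sketch, with `satterthwaite_df` as an illustrative name:

```python
from math import floor, sqrt

def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom when variances are not assumed equal."""
    v1 = s1_sq / n1    # estimated variance of the first sample mean
    v2 = s2_sq / n2    # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

df = satterthwaite_df(4.10, 16, 4.20, 16)
s_d = sqrt(4.10 / 16 + 4.20 / 16)
t = (8.20 - 7.10 - 0) / s_d            # test ratio with D0 = 0
print(round(df, 3), floor(df), round(s_d, 3), round(t, 3))
```

Rounding down with `floor` is the conservative choice made in the solution: fewer degrees of freedom give a larger critical value.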
Problem D4: (Old Minitab Manual - modified) In a study of tool life, two independent samples of wear are taken. The first of these represents volume loss in millionths of a cubic inch from 10 untreated tools. The second represents loss in the same units from 10 tools that were treated by a new wear retardant process.
Untreated
.56 .50 .69 .59 .47 .42 .45 .47 .50 .50
Treated
.13 .13 .18 .23 .18 .31 .35 .23 .31 .33
On the assumption that the parent populations are Normal, test the hypothesis that the means are equal and do a confidence interval for the difference between the means (a) assuming that the variances are equal and (b) assuming that the variances are not equal.
Solution: First compute the sample statistics.

  $x_1$    $x_1^2$    $x_2$    $x_2^2$
  0.56    0.3136    0.13    0.0169
  0.50    0.2500    0.13    0.0169
  0.69    0.4761    0.18    0.0324
  0.59    0.3481    0.23    0.0529
  0.47    0.2209    0.18    0.0324
  0.42    0.1764    0.31    0.0961
  0.45    0.2025    0.35    0.1225
  0.47    0.2209    0.23    0.0529
  0.50    0.2500    0.31    0.0961
  0.50    0.2500    0.33    0.1089
  Sum:
  5.15    2.7085    2.38    0.6280

$\bar{x}_1 = \dfrac{\sum x_1}{n_1} = \dfrac{5.15}{10} = 0.515$, $s_1^2 = \dfrac{\sum x_1^2 - n_1\bar{x}_1^2}{n_1 - 1} = \dfrac{2.7085 - 10(0.515)^2}{9} = 0.00625$.
$\bar{x}_2 = \dfrac{\sum x_2}{n_2} = \dfrac{2.38}{10} = 0.238$, $s_2^2 = \dfrac{\sum x_2^2 - n_2\bar{x}_2^2}{n_2 - 1} = \dfrac{0.628 - 10(0.238)^2}{9} = 0.00684$.
$H_0: D = 0$, $H_1: D \ne 0$, or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$.
$d = \bar{x}_1 - \bar{x}_2 = 0.515 - 0.238 = 0.277$
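The sample means and variances for Problem D4 can be reproduced from the raw data with the same computational formula. A sketch, with `shortcut_stats` as an illustrative name:

```python
def shortcut_stats(xs):
    """Return (mean, variance) using s^2 = (sum x^2 - n * mean^2) / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sum_sq = sum(x * x for x in xs)
    return x_bar, (sum_sq - n * x_bar**2) / (n - 1)

untreated = [0.56, 0.50, 0.69, 0.59, 0.47, 0.42, 0.45, 0.47, 0.50, 0.50]
treated = [0.13, 0.13, 0.18, 0.23, 0.18, 0.31, 0.35, 0.23, 0.31, 0.33]

x1_bar, s1_sq = shortcut_stats(untreated)
x2_bar, s2_sq = shortcut_stats(treated)
d = x1_bar - x2_bar
print(round(x1_bar, 3), round(s1_sq, 5))   # 0.515 0.00625
print(round(x2_bar, 3), round(s2_sq, 5))   # 0.238 0.00684
print(round(d, 3))                         # 0.277
```

The shortcut formula subtracts two nearly equal numbers, so it is sensitive to rounding of the sum of squares. Carrying 2.7085 rather than 2.709 matters here.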
a) We assume that the variances are equal ($\sigma_1^2 = \sigma_2^2$). So we use the traditional method for this problem.
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{9(0.00625) + 9(0.00684)}{18} = \dfrac{0.00625 + 0.00684}{2} = 0.006545 \approx 0.00655$ and
$s_d = \sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = \sqrt{0.006545\left(\dfrac{1}{10} + \dfrac{1}{10}\right)} = \sqrt{0.006545(0.2)} = 0.03618$.
$df = n_1 + n_2 - 2 = 10 + 10 - 2 = 18$.
(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{0.277 - 0}{0.03618} = 7.66$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Make a diagram with zero in the middle showing 'reject' regions below $-t_{.025}^{18} = -2.101$ and above $t_{.025}^{18} = 2.101$. Since 7.66 falls in a 'reject' region, reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.101(0.03618) = \pm 0.076$. Make a diagram with zero in the middle showing 'reject' regions below -0.076 and above 0.076. Since $d = 0.277$ falls in a 'reject' region, reject $H_0$.
(iii) Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d = 0.277 \pm 2.101(0.03618) = 0.277 \pm 0.076$ or 0.201 to 0.353. Since zero is not on this interval, reject $H_0$.
b) (Optional) We assume that the variances are not equal ($\sigma_1^2 \ne \sigma_2^2$). So use the Satterthwaite approximation.
$\dfrac{s_1^2}{n_1} = \dfrac{0.00625}{10} = 0.000625$, $\dfrac{s_2^2}{n_2} = \dfrac{0.00684}{10} = 0.000684$, so $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} = 0.000625 + 0.000684 = 0.001309$,
$s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{0.001309} = 0.0362$, $d = \bar{x}_1 - \bar{x}_2 = 0.515 - 0.238 = 0.277$ and
$DF = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \dfrac{(0.001309)^2}{\dfrac{(0.000625)^2}{9} + \dfrac{(0.000684)^2}{9}} = 17.96$.
We probably should round down to 17 degrees of freedom, but note that, if we use 18 degrees of freedom, our results with this method are the same as those with the traditional method.
(i) Test Ratio: $t = \dfrac{d - D_0}{s_d} = \dfrac{0.277 - 0}{0.0362} = 7.66$. If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$. Make a diagram with zero in the middle showing 'reject' regions below $-t_{.025}^{17} = -2.110$ and above $t_{.025}^{17} = 2.110$. Since 7.66 falls in a 'reject' region, reject $H_0$.
(ii) Critical Value: $d_{CV} = D_0 \pm t_{\alpha/2}\,s_d = 0 \pm 2.110(0.03618) = \pm 0.076$. Make a diagram with zero in the middle showing 'reject' regions below -0.076 and above 0.076. Since $d = 0.277$ falls in a 'reject' region, reject $H_0$.
(iii) Confidence Interval: $D = d \pm t_{\alpha/2}\,s_d = 0.277 \pm 2.110(0.03618) = 0.277 \pm 0.076$ or 0.201 to 0.353. Since zero is not on this interval, reject $H_0$.
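The remark that the two methods agree when 18 degrees of freedom are used follows from the equal sample sizes: when $n_1 = n_2$, the pooled and unpooled formulas for $s_d$ give the same number, so only the degrees of freedom differ. A sketch that checks this with the Problem D4 figures; all function names are illustrative.

```python
from math import isclose, sqrt

def pooled_se(s1_sq, n1, s2_sq, n2):
    """s_d under the equal-variance (pooled) method."""
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return sqrt(sp_sq * (1 / n1 + 1 / n2))

def unpooled_se(s1_sq, n1, s2_sq, n2):
    """s_d when the variances are not assumed equal."""
    return sqrt(s1_sq / n1 + s2_sq / n2)

def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variance method."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

se_pooled = pooled_se(0.00625, 10, 0.00684, 10)
se_unpooled = unpooled_se(0.00625, 10, 0.00684, 10)
df = satterthwaite_df(0.00625, 10, 0.00684, 10)

# Same standard error with equal n; df is just below n1 + n2 - 2 = 18.
print(isclose(se_pooled, se_unpooled), round(df, 2))
```

With unequal sample sizes the two standard errors generally differ, so this agreement should not be expected in problems such as Exercise 9.19.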
Parts not copied ©2003 Roger Even Bove