Comparing 2 Samples

252mean 10/20/05 (Open this document in 'Outline' view!) Note that this document has been re-edited to
replace  with D . Another possibility would be to use  for the difference between 2 means. Tell me
what you think!
D. COMPARISON OF TWO SAMPLES
The first five methods shown here are appropriate when one wants to compare the locations (means or,
for the rank tests of method 5, medians) of two samples. The first is the most general, since it is
acceptable for most large samples. The second and third methods are usually thought of as small-sample
methods, but are also appropriate for large samples. However, methods 2, 3, and 4 assume that the
underlying distributions are normal, so the methods based on ranking must be used in small-sample
situations where the underlying distribution is not normal. For the first four methods, the tests are
very similar and thus are summarized together.
Suppose that we have two samples. The first sample is a sample of $n_1$ items, has a sample mean of
$\bar{x}_1$, a sample variance of $s_1^2$, and comes from a population with a mean of $\mu_1$ and a
variance of $\sigma_1^2$. The second sample is a sample of $n_2$ items, has a sample mean of $\bar{x}_2$,
a sample variance of $s_2^2$, and comes from a population with a mean of $\mu_2$ and a variance of
$\sigma_2^2$. Our hypotheses are
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$
or more generally
$H_0: D = D_0$, $H_1: D \ne D_0$,
where $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$.
If you use the second representation of the hypotheses to test equality of population means, then
you set $D_0 = 0$.
For each of the first four methods, there are three different approaches to testing these hypotheses.
Each of these can be expressed in similar notation. For examples use 252meanx1.
a. Confidence Interval: $D = d \pm t_{\alpha/2}\, s_d$ or $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$.
Form a confidence interval using this formula. If $D_0$ (which may be 0) is in the interval,
do not reject $H_0$.
b. Test Ratio: $t = \frac{d - D_0}{s_d}$ or $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$.
If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$.
c. Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\, s_d$ or
$(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$. If $d = \bar{x}_1 - \bar{x}_2$ is
between the two critical values, do not reject $H_0$.
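The three approaches always lead to the same decision, which a short sketch can make concrete. The function name `three_approaches` and the numbers in the usage lines are illustrative assumptions, not taken from the 252meanx1 examples. The critical value is supplied by the caller (for instance, from a t or z table).

```python
def three_approaches(d, d0, s_d, t_crit):
    """Apply the three equivalent two-sided tests of H0: D = d0.

    d      : observed difference in sample means (x1_bar - x2_bar)
    d0     : hypothesized difference D0 under H0
    s_d    : standard error of the difference
    t_crit : two-sided critical value t_{alpha/2} (or z_{alpha/2}), from a table
    """
    # a. Confidence interval for D: do not reject H0 if d0 lies inside it.
    ci = (d - t_crit * s_d, d + t_crit * s_d)
    keep_by_ci = ci[0] <= d0 <= ci[1]

    # b. Test ratio: do not reject H0 if it lies between -t_crit and +t_crit.
    ratio = (d - d0) / s_d
    keep_by_ratio = -t_crit <= ratio <= t_crit

    # c. Critical values for d: do not reject H0 if d lies between them.
    cv = (d0 - t_crit * s_d, d0 + t_crit * s_d)
    keep_by_cv = cv[0] <= d <= cv[1]

    return {
        "interval": ci,
        "ratio": ratio,
        "critical_values": cv,
        "keep_by_ci": keep_by_ci,
        "keep_by_ratio": keep_by_ratio,
        "keep_by_cv": keep_by_cv,
    }


# Hypothetical numbers: d = 2.5, D0 = 0, s_d = 1.0, z_{.025} = 1.96.
result = three_approaches(2.5, 0.0, 1.0, 1.96)
print(result)  # all three "keep" flags agree (here: H0 is rejected)
```

Because all three rules rearrange the same inequality, the three flags can only differ through floating-point rounding when a value falls almost exactly on a boundary.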
The difference between the cases comes down to the choice of t and the formula for s d . Let us
now consider the first four cases.
1. Two Means, Two Independent Samples, Large Samples.
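If the total number of degrees of freedom is large (or the two samples come from normally distributed
populations with known variances $\sigma_1^2$ and $\sigma_2^2$), then replace $t$ with $z$ and use
$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ (with the known population variances in place of
the sample variances when they are available).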
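A minimal sketch of the large-sample test ratio, using only the Python standard library. The function name, argument names, and sample figures are illustrative assumptions and are not drawn from the course examples.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_z_test(mean1, var1, n1, mean2, var2, n2, d0=0.0, alpha=0.05):
    """Two-sided large-sample test of H0: mu1 - mu2 = d0."""
    # Standard error of the difference between the two sample means.
    s_d = sqrt(var1 / n1 + var2 / n2)
    # Test ratio, with z replacing t because the samples are large.
    z = ((mean1 - mean2) - d0) / s_d
    # Two-sided critical value z_{alpha/2}.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject = abs(z) > z_crit
    return s_d, z, z_crit, reject


# Hypothetical data: two samples of 100 items each.
s_d, z, z_crit, reject = large_sample_z_test(52.0, 36.0, 100, 50.0, 64.0, 100)
print(s_d, z, round(z_crit, 3), reject)
```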
2. Two Means, Two Independent Samples, Populations Normally
Distributed, Population Variances Assumed Equal.
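$t = t^{(n_1 + n_2 - 2)}$ (that is, $t$ with $n_1 + n_2 - 2$ degrees of freedom) and
$s_d = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$, where
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.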
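The pooled-variance calculation can be sketched as follows. The function name and the sample figures are hypothetical. The critical value of $t$ for the returned degrees of freedom would still be looked up in a table.

```python
from math import sqrt


def pooled_standard_error(var1, n1, var2, n2):
    """Return (s_p^2, s_d, degrees of freedom) for the equal-variance case."""
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    s_p2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the sample means.
    s_d = sqrt(s_p2 * (1 / n1 + 1 / n2))
    return s_p2, s_d, df


# Hypothetical data: s1^2 = 4 with n1 = 10, s2^2 = 6 with n2 = 12.
s_p2, s_d, df = pooled_standard_error(4.0, 10, 6.0, 12)
print(s_p2, s_d, df)  # pooled variance about 5.1 with 20 degrees of freedom
```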
3. Two Means, Two Independent Samples, Populations Normally
Distributed, Population Variances not Assumed Equal.
This time the degrees of freedom for $t$ must be calculated by the Satterthwaite approximation. The
formula is
$$df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \frac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}},$$
but the formula for the standard deviation is the same as in method 1,
$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$. This method tends to give nearly identical answers to
method 1 when the degrees of freedom are large. It also tends to give answers similar to method 2 when
the sample variances are of similar size. For Minitab examples see 252meanx3 and 252meanx5.
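A small sketch of the Satterthwaite degrees-of-freedom calculation. The function name and inputs are illustrative assumptions. The result is generally not an integer, and how it is rounded before a table lookup is left to the reader, since the text above does not specify it.

```python
def satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom when variances are not assumed equal."""
    a = var1 / n1  # s1^2 / n1
    b = var2 / n2  # s2^2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Equal variances and equal sample sizes: approximately n1 + n2 - 2 = 18,
# which agrees with method 2.
print(satterthwaite_df(4.0, 10, 4.0, 10))

# Very unequal variances: the degrees of freedom drop toward n - 1 = 9.
print(satterthwaite_df(4.0, 10, 40.0, 10))
```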
4. Two Means, Paired Samples (If samples are small, populations should
be normally distributed).
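If $n$ is the number of pairs of data, then $t = t^{(n-1)}$ and
$s_d = \sqrt{\frac{1}{n} \cdot \frac{\sum d^2 - n\bar{d}^2}{n - 1}}$. In this case
$d_1 = x_{11} - x_{21}$, $d_2 = x_{12} - x_{22}$, etc., so that $\bar{d} = \bar{x}_1 - \bar{x}_2$.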
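The paired calculation can be sketched as below, working from the differences within each pair. The function name and the data are hypothetical and are not taken from the course examples.

```python
from math import sqrt


def paired_test_ratio(x1, x2, d0=0.0):
    """Return (d_bar, s_d, t, df) for paired samples x1 and x2."""
    diffs = [a - b for a, b in zip(x1, x2)]  # d_i = x_{1i} - x_{2i}
    n = len(diffs)
    d_bar = sum(diffs) / n
    sum_sq = sum(d * d for d in diffs)
    # Sample variance of the differences, computational form.
    var_d = (sum_sq - n * d_bar ** 2) / (n - 1)
    # Standard error of the mean difference.
    s_d = sqrt(var_d / n)
    t = (d_bar - d0) / s_d
    return d_bar, s_d, t, n - 1


# Hypothetical before/after measurements on four items.
d_bar, s_d, t, df = paired_test_ratio([12, 15, 11, 14], [10, 12, 11, 11])
print(d_bar, s_d, t, df)  # mean difference 2.0 with 3 degrees of freedom
```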
Switch to document 252meanx2 here for examples for points 5 and 6.
5. Rank Tests. (The remainder of this document is expanded in 252meanx)
Especially in the case where samples are small and the underlying distributions are not normal, it is not
appropriate to compare means.
a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
If samples are independent, this test is appropriate to test whether the two samples come from the same
distribution. If the distributions are similar in shape, it is often called a test of equality of medians.
b. Wilcoxon Signed Rank Test for Paired Samples.
This is a more powerful test for equality of medians when the data is paired. It can also be used for the
median of a single sample. The Sign Test for paired data is a simpler test to use in this situation, but it is
less powerful.
6. Proportions.
a. For independent samples - If $p_1$ is the proportion of successes in the first population, and $p_2$ is the
proportion of successes in the second population, we define $\Delta p = p_1 - p_2$. Then our hypotheses will be
$H_0: p_1 = p_2$, $H_1: p_1 \ne p_2$
or more generally
$H_0: \Delta p = \Delta p_0$, $H_1: \Delta p \ne \Delta p_0$.
Let $\bar{p}_1 = \frac{x_1}{n_1}$, $\bar{p}_2 = \frac{x_2}{n_2}$ and $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$, where $x_1$ is the
number of successes in the first sample, $x_2$ is the number of successes in the second sample and $n_1$
and $n_2$ are the sample sizes. Throughout, $q = 1 - p$. The usual three approaches to testing the
hypotheses can be used.
(i). Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$ or
$(p_1 - p_2) = (\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\Delta p}$, where
$s_{\Delta p} = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_2 \bar{q}_2}{n_2}}$. Compare this interval with $\Delta p_0$.
(ii). Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{(\bar{p}_1 - \bar{p}_2) - (p_{10} - p_{20})}{\sigma_{\Delta p}}$,
where $p_{10}$ and $p_{20}$ come from the null hypothesis if specified and
$\sigma_{\Delta p} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, although $s_{\Delta p}$ may have to be used if $p_1$ and
$p_2$ are unknown. Also note that if the null hypothesis is $p_1 = p_2$ or $\Delta p_0 = 0$, we use
$\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$, where
$\bar{p}_0 = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$ and $x_1$ and $x_2$ are the number of
successes in sample 1 and sample 2, respectively.
(iii). Critical Value: $\Delta p_{CV} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$ or
$(\bar{p}_1 - \bar{p}_2)_{CV} = (p_{10} - p_{20}) \pm z_{\alpha/2}\, \sigma_{\Delta p}$.
Test this against $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$. For calculation of $\sigma_{\Delta p}$, see Test Ratio above.
b. For paired samples, use the McNemar Test. This is described in 252meanx2.
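For the independent-samples case with the null hypothesis $p_1 = p_2$, the test ratio with the pooled proportion can be sketched as follows. The function name and the counts are illustrative assumptions. The McNemar test for paired samples is not covered by this sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-sided test of H0: p1 = p2 using the pooled proportion."""
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    # Pooled proportion, used because H0 says the two proportions are equal.
    p0_bar = (x1 + x2) / (n1 + n2)
    sigma = sqrt(p0_bar * (1 - p0_bar) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / sigma
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return p0_bar, sigma, z, abs(z) > z_crit


# Hypothetical counts: 60 successes out of 100 versus 40 out of 100.
p0_bar, sigma, z, reject = two_proportion_z_test(60, 100, 40, 100)
print(p0_bar, sigma, z, reject)
```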
7. Variances.
Switch to document 252meanx4 here for examples.
Test the ratios $\frac{s_1^2}{s_2^2}$ and $\frac{s_2^2}{s_1^2}$ against values of $F$.
If we want to test $H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$, where $DF_1 = n_1 - 1$ and
$DF_2 = n_2 - 1$, compare $\frac{s_1^2}{s_2^2}$ against $F_{\alpha}^{(DF_1, DF_2)}$. If the ratio of sample variances
is larger than $F_{\alpha}^{(DF_1, DF_2)}$, reject $H_0$.
If we want to do the opposite test $H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$, compare
$\frac{s_2^2}{s_1^2}$ against $F_{\alpha}^{(DF_2, DF_1)}$. For the 2-sided test $H_0: \sigma_1^2 = \sigma_2^2$,
$H_1: \sigma_1^2 \ne \sigma_2^2$, do both tests above, but use $F_{\alpha/2}$. A 2-sided confidence interval is
$$\frac{s_2^2}{s_1^2} \cdot \frac{1}{F_{\alpha/2}^{(DF_2, DF_1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2}\, F_{\alpha/2}^{(DF_1, DF_2)}.$$
8. Appendix: Sample sizes for confidence intervals for differences between
means and proportions.
For means $n_1 = n_2 = \frac{z_{\alpha/2}^2 \left( \sigma_1^2 + \sigma_2^2 \right)}{e^2}$. For proportions
$n_1 = n_2 = \frac{z_{\alpha/2}^2 \left( p_1 q_1 + p_2 q_2 \right)}{e^2}$.
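A short sketch of the two sample-size formulas. It assumes that $e$ is the desired half-width of the confidence interval and that the result is rounded up to a whole number of items. Both points are assumptions, since the appendix does not state them, and the function names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_for_means(var1, var2, e, alpha=0.05):
    """Common sample size n1 = n2 for a difference between two means."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Rounding up is an assumption; the formula itself gives a real number.
    return ceil(z ** 2 * (var1 + var2) / e ** 2)


def n_for_proportions(p1, p2, e, alpha=0.05):
    """Common sample size n1 = n2 for a difference between two proportions."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / e ** 2)


# Hypothetical planning values.
print(n_for_means(25.0, 25.0, 2.0))       # 49
print(n_for_proportions(0.5, 0.5, 0.05))  # 769
```

Using $p_1 = p_2 = 0.5$ gives the largest value of $p_1 q_1 + p_2 q_2$, so it is a conservative choice when nothing is known about the proportions in advance.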