D. COMPARISON OF TWO SAMPLES

252meanl 10/21/05
The first five methods shown here are appropriate when one wants to compare
the means of two samples. The first is the most general, since it is
acceptable for most large samples. The second and third methods are usually
thought of as small-sample methods, but they are also appropriate for large
samples. However, methods 2, 3, and 4 assume that the underlying
distributions are normal, so the methods based on ranking must be used in
small-sample situations where the underlying distribution is not normal. For
the first four methods, the tests are very similar and thus are summarized
together.
Suppose that we have two samples. The first sample is a sample of $n_1$
items, has a sample mean of $\bar{x}_1$, a sample variance of $s_1^2$, and
comes from a population with a mean of $\mu_1$ and a variance of
$\sigma_1^2$. The second sample is a sample of $n_2$ items, has a sample mean
of $\bar{x}_2$, a sample variance of $s_2^2$, and comes from a population
with a mean of $\mu_2$ and a variance of $\sigma_2^2$. Our hypotheses are

$$H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \ne \mu_2$$

or more generally

$$H_0: D = D_0 \qquad H_1: D \ne D_0,$$

where $D = \mu_1 - \mu_2$ and $d = \bar{x}_1 - \bar{x}_2$.
If you use the second representation of the hypotheses to test equality of
population means, then you set $D_0 = 0$.
For each of the first four methods, there are three different approaches
to testing these hypotheses. Each of these can be expressed in similar notation.
For examples use 252meanx1.
a. Confidence Interval: $D = d \pm t_{\alpha/2}\, s_d$ or
$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_d$.
Form a confidence interval using this formula. If $D_0$ (which may be 0) is
in the interval, do not reject $H_0$.
b. Test Ratio: $t = \dfrac{d - D_0}{s_d}$ or
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_{10} - \mu_{20})}{s_d}$.
If this test ratio lies between $\pm t_{\alpha/2}$, do not reject $H_0$.
c. Critical Value: $d_{cv} = D_0 \pm t_{\alpha/2}\, s_d$ or
$(\bar{x}_1 - \bar{x}_2)_{cv} = (\mu_{10} - \mu_{20}) \pm t_{\alpha/2}\, s_d$.
If $d = \bar{x}_1 - \bar{x}_2$ is between the two critical values, do not
reject $H_0$.
The difference between the cases comes down to the choice of $t$ and
the formula for $s_d$. Let us now consider the first four cases.
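The three approaches above always lead to the same decision, which a short sketch can make concrete. The function name, the argument names, and the numbers in the example call are illustrative assumptions, not values from this handout; in particular `t_crit` stands for a value of t (or z) for alpha/2 that would be looked up in a table.

```python
def three_approaches(d, s_d, t_crit, d0=0.0):
    """Apply approaches a, b and c to a two-sided test of H0: D = D0.

    d      -- observed difference between the sample means
    s_d    -- standard error of that difference
    t_crit -- table value of t (or z) for alpha/2 (supplied by the caller)
    """
    # a. Confidence interval, centred on the observed difference d.
    ci = (d - t_crit * s_d, d + t_crit * s_d)
    # b. Test ratio, compared with +/- t_crit.
    t_ratio = (d - d0) / s_d
    # c. Critical values, centred on the hypothesised difference D0.
    cv = (d0 - t_crit * s_d, d0 + t_crit * s_d)
    return {
        "confidence_interval": ci,
        "reject_by_interval": not (ci[0] <= d0 <= ci[1]),
        "test_ratio": t_ratio,
        "reject_by_ratio": abs(t_ratio) > t_crit,
        "critical_values": cv,
        "reject_by_critical_value": not (cv[0] <= d <= cv[1]),
    }


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
result = three_approaches(d=2.5, s_d=1.0, t_crit=2.0)
print(result)
```

With these made-up inputs the interval is (0.5, 4.5), the test ratio is 2.5, and the critical values are -2 and 2, so all three approaches reject the null hypothesis.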
1. Two Means, Two Independent Samples, Large Samples.
If the total number of degrees of freedom is large (or the two samples
come from normally distributed populations with known variances
$\sigma_1^2$ and $\sigma_2^2$), then replace $t$ with $z$ and use
$s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
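As a rough illustration of method 1, the sketch below computes the standard error as the square root of the sum of each sample variance divided by its sample size, then forms the test ratio with z in place of t. The function name and the summary statistics in the example are hypothetical; the z value comes from Python's standard library rather than a table.

```python
import math
from statistics import NormalDist


def large_sample_z(n1, mean1, var1, n2, mean2, var2, d0=0.0, alpha=0.05):
    """Two-sided large-sample test of H0: mu1 - mu2 = d0."""
    # Standard error of the difference between the two sample means.
    s_d = math.sqrt(var1 / n1 + var2 / n2)
    # Test ratio, using z instead of t.
    z = ((mean1 - mean2) - d0) / s_d
    # Upper alpha/2 point of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return s_d, z, z_crit, abs(z) > z_crit


# Hypothetical summary statistics, not data from the handout.
s_d, z, z_crit, reject = large_sample_z(100, 52.0, 64.0, 100, 50.0, 36.0)
print(s_d, z, z_crit, reject)
```

Here the standard error is 1 and the test ratio is 2, which exceeds the 5% two-sided z value of about 1.96, so equality of the means would be rejected.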
2. Two Means, Two Independent Samples, Populations Normally Distributed,
Population Variances Assumed Equal.
$t = t^{(n_1 + n_2 - 2)}$ (that is, $t$ has $n_1 + n_2 - 2$ degrees of
freedom) and $s_d = \sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$,
where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.
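The pooled variance in method 2 is a weighted average of the two sample variances, with weights equal to their degrees of freedom. A minimal sketch of that arithmetic follows; the function name and the example inputs are assumptions made for illustration.

```python
import math


def pooled_std_error(n1, var1, n2, var2):
    """Return (degrees of freedom, pooled variance, standard error)."""
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by n - 1.
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the sample means.
    s_d = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return df, sp2, s_d


# Hypothetical sample sizes and variances.
df, sp2, s_d = pooled_std_error(10, 4.0, 12, 6.0)
print(df, sp2, s_d)
```

For these inputs the pooled variance is (9(4) + 11(6))/20 = 5.1, which lies between the two sample variances as a weighted average must.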
3. Two Means, Two Independent Samples, Populations Normally Distributed,
Population Variances not Assumed Equal.
This time the degrees of freedom for $t$ must be calculated by the
Satterthwaite approximation. The formula is

$$df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}
{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}},$$

but the formula for the standard deviation is the same as in method 1,
$s_d = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$. This formula tends
to give identical answers to method 1 when the degrees of freedom are large.
It also tends to give answers similar to method 2 when sample variances are
of similar size.
For Minitab examples see 252meanx3 and 252meanx5.
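The Satterthwaite degrees of freedom are easy to get wrong by hand, so a small sketch of the calculation may help. The function name and the example inputs are illustrative assumptions; the result is generally not a whole number, and rounding it down before using a t table is a common convention rather than something this handout specifies.

```python
def satterthwaite_df(n1, var1, n2, var2):
    """Approximate degrees of freedom when variances are not assumed equal."""
    a = var1 / n1  # squared standard error contributed by sample 1
    b = var2 / n2  # squared standard error contributed by sample 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Equal sizes and equal variances reproduce the method 2 value n1 + n2 - 2.
df_equal = satterthwaite_df(10, 4.0, 10, 4.0)
# Very different variances pull the degrees of freedom down.
df_unequal = satterthwaite_df(10, 4.0, 12, 36.0)
print(df_equal, df_unequal)
```

The first call gives 18, matching method 2, while the second gives a value between 11 (the smaller of the two individual degrees of freedom) and 20.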
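4. Two Means, Paired Samples (If samples are small, populations should be
normally distributed).
If $n$ is the number of pairs of data, then $t = t^{(n-1)}$ and
$s_d = \sqrt{\dfrac{1}{n} \cdot \dfrac{\sum d^2 - n\bar{d}^2}{n - 1}}$,
where $\bar{d}$ is the mean of the paired differences. In this case
$d_1 = x_{11} - x_{21}$, $d_2 = x_{12} - x_{22}$, etc.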
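For paired data the calculation works on the differences within each pair. The sketch below assumes the reading of the formula in which the sample variance of the differences is divided by n before taking the square root, so that the result is the standard error of the mean difference. The function name and the data are made up for illustration.

```python
import math
from statistics import stdev


def paired_std_error(x1, x2):
    """Return (mean difference, its standard error, degrees of freedom)."""
    d = [a - b for a, b in zip(x1, x2)]  # d_i = x_1i - x_2i
    n = len(d)
    d_bar = sum(d) / n
    sum_sq = sum(v * v for v in d)
    # (sum of d^2 - n * d_bar^2) / (n - 1) is the variance of the differences;
    # dividing by n gives the squared standard error of d_bar.
    s_d = math.sqrt((sum_sq - n * d_bar ** 2) / (n - 1) / n)
    return d_bar, s_d, n - 1


# Hypothetical paired observations.
before = [12, 15, 11, 14]
after = [10, 14, 12, 11]
d_bar, s_d, df = paired_std_error(before, after)
print(d_bar, s_d, df)
# Cross-check against the library standard deviation of the differences.
check = stdev([a - b for a, b in zip(before, after)]) / math.sqrt(len(before))
print(check)
```

The differences here are 2, 1, -1 and 3, so the mean difference is 1.25 with 3 degrees of freedom, and the shortcut formula agrees with the library calculation.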
5. Rank Tests. (The remainder of this document is expanded in 252meanx2)
Especially in the case where samples are small and the underlying
distributions are not normal, it is not appropriate to compare means.
a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
If samples are independent, this test is appropriate to test whether
the two samples come from the same distribution. If the
distributions are similar, it is often called a test of equality of
medians.
b. Wilcoxon Signed Rank Test for Paired Samples.
This is a more powerful test for equality of medians when
the data are paired. It can also be used for the median of a
single sample. The Sign Test for paired data is a simpler test
to use in this situation, but it is less powerful.
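This handout leaves the mechanics of the rank tests to 252meanx2, so the following is only a sketch of the usual first step of the Wilcoxon-Mann-Whitney test under standard textbook assumptions: pool the two samples, rank them (giving tied values the average of their ranks), and sum the ranks for each sample. The function name and data are hypothetical, and the resulting rank sum would still have to be compared with a table or a normal approximation.

```python
def rank_sums(sample1, sample2):
    """Return (rank sum of sample 1, rank sum of sample 2, U for sample 1)."""
    pooled = [(v, 1) for v in sample1] + [(v, 2) for v in sample2]
    pooled.sort(key=lambda pair: pair[0])
    totals = {1: 0.0, 2: 0.0}
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; ties share their average rank.
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            totals[pooled[k][1]] += midrank
        i = j
    n1 = len(sample1)
    u1 = totals[1] - n1 * (n1 + 1) / 2
    return totals[1], totals[2], u1


# Hypothetical data with one tie (2.3 appears in both samples).
t1, t2, u1 = rank_sums([1.1, 2.3, 3.0], [2.3, 4.5, 5.2, 6.0])
print(t1, t2, u1)
```

The two rank sums always add up to N(N+1)/2 for N pooled observations, which is a convenient check on the ranking.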
6. Proportions.
For independent samples - If $p_1$ is the proportion of successes in the
first population, and $p_2$ is the proportion of successes in the second
population, we define $\Delta p = p_1 - p_2$. Then our hypotheses will be

$$H_0: p_1 = p_2 \qquad H_1: p_1 \ne p_2$$

or more generally

$$H_0: \Delta p = \Delta p_0 \qquad H_1: \Delta p \ne \Delta p_0.$$

Let $\bar{p}_1 = \dfrac{x_1}{n_1}$, $\bar{p}_2 = \dfrac{x_2}{n_2}$ and
$\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$, where $x_1$ is the number of
successes in the first sample, $x_2$ is the number of successes in the
second sample and $n_1$ and $n_2$ are the sample sizes. The usual three
approaches to testing the hypotheses can be used.
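a. Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$ or
$(p_1 - p_2) = (\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\Delta p}$, where
$s_{\Delta p} = \sqrt{\dfrac{\bar{p}_1 \bar{q}_1}{n_1} + \dfrac{\bar{p}_2 \bar{q}_2}{n_2}}$
and $\bar{q}_i = 1 - \bar{p}_i$.
Compare this interval with $\Delta p_0$.
b. Test Ratio:
$z = \dfrac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \dfrac{(\bar{p}_1 - \bar{p}_2) - (p_{10} - p_{20})}{\sigma_{\Delta p}}$,
where $p_{10}$ and $p_{20}$ come from the null hypothesis if specified and
$\sigma_{\Delta p} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$,
although $s_{\Delta p}$ may have to be used if $p_1$ and $p_2$ are unknown.
Also note that if the null hypothesis is $p_1 = p_2$ or $\Delta p_0 = 0$, we
use $\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$,
where $\bar{p}_0 = \dfrac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2}$
and $x_1$ and $x_2$ are the number of successes in sample 1 and sample 2,
respectively.
c. Critical Value: $\Delta\bar{p}_{CV} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$ or
$(\bar{p}_1 - \bar{p}_2)_{CV} = (p_{10} - p_{20}) \pm z_{\alpha/2}\, \sigma_{\Delta p}$.
Test this against $\bar{p}_1 - \bar{p}_2$. For calculation of
$\sigma_{\Delta p}$, see Test Ratio above.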
For paired samples, use the McNemar Test. This is described in
252meanx2.
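For independent samples and the null hypothesis of equal proportions, the test ratio uses the pooled proportion (total successes over total sample size) in the standard error, while the confidence interval uses the two separate sample proportions. The sketch below shows both side by side; the function name and the counts in the example are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, alpha=0.05):
    """Test H0: p1 = p2 and give a confidence interval for p1 - p2."""
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    diff = p1_bar - p2_bar
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Pooled proportion, used because H0 says the two proportions are equal.
    p0_bar = (x1 + x2) / (n1 + n2)
    sigma = math.sqrt(p0_bar * (1 - p0_bar) * (1 / n1 + 1 / n2))
    z = diff / sigma
    # Unpooled standard error for the confidence interval.
    s = math.sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    ci = (diff - z_crit * s, diff + z_crit * s)
    return z, z_crit, ci


# Hypothetical counts: 60 of 100 successes versus 45 of 100.
z, z_crit, ci = two_proportion_z(60, 100, 45, 100)
print(z, z_crit, ci)
```

With these counts the pooled proportion is 0.525, the test ratio is a little above 2.1, and the interval for the difference lies entirely above zero, so both approaches reject equality at the 5% level.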
7. Variances.
Switch to document 252meanx4l here for examples.
Test the ratios $\dfrac{s_1^2}{s_2^2}$ and $\dfrac{s_2^2}{s_1^2}$ against
values of $F$.
If we want to test
$H_0: \sigma_1^2 \le \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$,
compare $\dfrac{s_1^2}{s_2^2}$ against $F_{\alpha}^{(DF_1, DF_2)}$, where
$DF_1 = n_1 - 1$ and $DF_2 = n_2 - 1$. If the ratio of sample
variances is larger than $F_{\alpha}^{(DF_1, DF_2)}$, reject $H_0$.
If we want to do the opposite test
$H_0: \sigma_1^2 \ge \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$,
compare $\dfrac{s_2^2}{s_1^2}$ against $F_{\alpha}^{(DF_2, DF_1)}$.
For the 2-sided test
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \ne \sigma_2^2$,
do both tests above, but use $F_{\alpha/2}$. A 2-sided confidence interval is

$$\frac{s_2^2}{s_1^2} \cdot \frac{1}{F_{\alpha/2}^{(DF_2, DF_1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2} \, F_{\alpha/2}^{(DF_1, DF_2)}.$$
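The variance tests reduce to forming a ratio and comparing it with a value of F, as the sketch below illustrates. The function names are assumptions, the sample variances are made up, and the F values in the example (3.18 for the upper 5% point and 4.03 for the upper 2.5% point, both with 9 and 9 degrees of freedom) are rounded table lookups supplied by hand rather than computed.

```python
def variance_ratio_test(var1, n1, var2, n2, f_crit):
    """One-sided test of H0: sigma1^2 <= sigma2^2 against H1: sigma1^2 > sigma2^2.

    f_crit is the table value of F for alpha with (n1 - 1, n2 - 1) degrees
    of freedom, supplied by the caller.
    """
    df1, df2 = n1 - 1, n2 - 1
    ratio = var1 / var2
    return df1, df2, ratio, ratio > f_crit


def variance_ratio_interval(var1, var2, f_half_12, f_half_21):
    """Two-sided interval for sigma2^2 / sigma1^2.

    f_half_12 is F for alpha/2 with (DF1, DF2) degrees of freedom and
    f_half_21 is F for alpha/2 with (DF2, DF1) degrees of freedom.
    """
    ratio = var2 / var1
    return ratio / f_half_21, ratio * f_half_12


# Hypothetical samples of 10 items each with variances 40 and 10.
df1, df2, ratio, reject = variance_ratio_test(40.0, 10, 10.0, 10, f_crit=3.18)
lower, upper = variance_ratio_interval(40.0, 10.0, f_half_12=4.03, f_half_21=4.03)
print(df1, df2, ratio, reject)
print(lower, upper)
```

In this example the ratio of 4 exceeds 3.18, so the one-sided test rejects, but it falls just short of 4.03, which is why the two-sided interval for the ratio of population variances still contains 1.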
8. Appendix: Sample sizes for confidence intervals for differences between
means and proportions.
For means
$n_1 = n_2 = \dfrac{z_{\alpha/2}^2 \left( \sigma_1^2 + \sigma_2^2 \right)}{e^2}$.
For proportions
$n_1 = n_2 = \dfrac{z_{\alpha/2}^2 \left( p_1 q_1 + p_2 q_2 \right)}{e^2}$.
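The sample-size formulas can be evaluated directly once planning values are chosen. The sketch below assumes that e is the desired half-width of the confidence interval and that the result is rounded up to the next whole item; neither point is stated in this handout. The function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_means(var1, var2, e, alpha=0.05):
    """Common sample size n1 = n2 for an interval on mu1 - mu2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Rounding up is an assumption: a fractional item cannot be sampled.
    return math.ceil(z ** 2 * (var1 + var2) / e ** 2)


def n_for_proportions(p1, p2, e, alpha=0.05):
    """Common sample size n1 = n2 for an interval on p1 - p2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / e ** 2)


# Hypothetical planning values.
n_means = n_for_means(var1=25.0, var2=25.0, e=2.0)
n_props = n_for_proportions(p1=0.5, p2=0.5, e=0.1)
print(n_means, n_props)
```

Using 0.5 for both proportions gives the largest possible value of the product terms, so it is a conservative choice when nothing is known about the proportions in advance.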