D. COMPARISON OF TWO SAMPLES (CTD.) 5. Rank Tests.

252meanx 3/26/01 (Open this document in 'Outline' view!)
When the underlying distributions are not normal, and especially when samples are small, it is not
appropriate to compare means.
a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
If the samples are independent, this test is appropriate for testing whether the two samples come from the same
distribution. If the distributions are similar in shape, it is often called a test of equality of medians.
Example: Let us assume that we have two very small samples, from New York ($n_2 = 6$) and Pennsylvania
($n_1 = 4$), and we wish to compare their medians. Let us call the smaller sample (Pennsylvania) 'sample 1' and
the larger sample 'sample 2', so that $n_1 \le n_2$. If we use $\eta$ for the median, our hypotheses are
$H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 < \eta_2$, and $\alpha = .05$.
Assume that our data are as below:
Pennsylvania: 11000, 16000, 80000, 85000
New York: 17000, 30000, 50000, 70000, 80000, 90000
Our first step is to rank the numbers from 1 to $n = n_1 + n_2 = 4 + 6 = 10$. Note that the 7th and 8th numbers are
tied, so that both are ranked 7.5. The numbers can be ranked from the largest to the smallest or from the
smallest to the largest. To decide which to do, look at the smaller sample. If the smallest number is in the
smaller sample, rank from smallest to largest; if the largest number is in the smaller sample, rank from the
largest to the smallest. Since 11000, the smallest number, is in the smaller sample, let it be rank 1.
Pennsylvania            New York
x1        r1            x2        r2
11000     1             17000     3
16000     2             30000     4
80000     7.5           60000     5
85000     9             70000     6
                        80000     7.5
                        90000     10
Sum       19.5          Sum       35.5
Now compute the sums of the ranks: $SR_1 = 19.5$ and $SR_2 = 35.5$. As a check, note that these two rank sums
must add to the sum of the first $n$ numbers, that this is $\frac{n(n+1)}{2} = \frac{10(11)}{2} = 55$, and that
$SR_1 + SR_2 = 19.5 + 35.5 = 55$.
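The ranking and rank-sum check above can be sketched in Python. This is only an illustration and not part of the original notes: the helper name `average_ranks` is an assumption introduced here, and the third New York value is taken as 50000 from the data list (the ranking table shows 60000 in that position, which receives rank 5 either way).

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1           # 1-based position of the first occurrence
        count = ordered.count(v)               # how many times the value is tied
        rank_of[v] = first + (count - 1) / 2   # average of the tied positions
    return [rank_of[v] for v in values]

pennsylvania = [11000, 16000, 80000, 85000]               # sample 1 (smaller sample)
new_york = [17000, 30000, 50000, 70000, 80000, 90000]     # sample 2

ranks = average_ranks(pennsylvania + new_york)
sr1 = sum(ranks[:len(pennsylvania)])   # rank sum for sample 1
sr2 = sum(ranks[len(pennsylvania):])   # rank sum for sample 2

n = len(ranks)
assert sr1 + sr2 == n * (n + 1) / 2    # the check described in the text
w = min(sr1, sr2)                      # test statistic W
print(sr1, sr2, w)
```

Ranking the combined list once and then splitting it keeps the tie between the two values of 80000 consistent across both samples.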
The smaller of $SR_1$ and $SR_2$ is called $W$ and is compared with Table 5 or 6. To use Table 5, first find the
part for $n_2 = 6$, and then the column for $n_1 = 4$. Then try to locate $W = 19.5$ in that column. In this case,
since for $W = 19$ the p-value is .3048, and for $W = 20$ the p-value is .3810, we can say that
$.3048 < \text{p-value} < .3810$. Since both bounds are above the significance level, we cannot reject the null hypothesis.
$W$ can also be compared against the critical values $T_L$ and $T_U$ ($T_U$ is actually only needed for a 2-sided test) in Table 14b. These are 14 and 30. Since $W = 19.5$ lies between these values, we cannot reject
the null hypothesis.
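For samples this small, the two tabled p-values quoted above can be reproduced by brute-force enumeration. The sketch below assumes that, under the null hypothesis, every choice of $n_1$ ranks out of $1, \dots, n$ is equally likely for sample 1, and it ignores ties; the function name `lower_tail_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def lower_tail_probability(w, n1, n2):
    """P(W <= w) when sample 1's ranks are a random n1-subset of 1..n1+n2 (no ties)."""
    n = n1 + n2
    rank_sums = [sum(c) for c in combinations(range(1, n + 1), n1)]
    favorable = sum(1 for s in rank_sums if s <= w)
    return Fraction(favorable, len(rank_sums))

p19 = lower_tail_probability(19, 4, 6)
p20 = lower_tail_probability(20, 4, 6)
print(f"{float(p19):.4f} {float(p20):.4f}")   # 0.3048 0.3810
```

Because the observed $W = 19.5$ involves a tie, it falls between the two enumerated values, which is why the text brackets the p-value rather than stating it exactly.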
For values of $n_1$ and $n_2$ that are too large for the tables, $W$ has approximately the normal distribution with mean
$\mu_W = \frac{1}{2} n_1 (n_1 + n_2 + 1)$ and variance $\sigma_W^2 = \frac{1}{6} n_2 \mu_W$. Though the example above is too small for this
treatment, for continuity its data will be used here. If the significance level is 5% and the test is one-sided,
we reject our null hypothesis if $z = \frac{W - \mu_W}{\sigma_W}$ lies below $-z_{.05} = -1.645$. In this case
$\mu_W = \frac{1}{2} n_1 (n_1 + n_2 + 1) = \frac{1}{2}(4)(4 + 6 + 1) = 22$ and $\sigma_W^2 = \frac{1}{6} n_2 \mu_W = \frac{1}{6}(6)(22) = 22$, so that
$z = \frac{W - \mu_W}{\sigma_W} = \frac{19.5 - 22}{\sqrt{22}} = -0.53$. Since this is not below $-1.645$, we cannot reject $H_0$.
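The normal approximation for $W$ is easy to check numerically. The following is a minimal sketch using the mean and variance formulas given above; the function name `rank_sum_z` is an assumption made for illustration.

```python
import math

def rank_sum_z(w, n1, n2):
    """z statistic for the rank sum W of the smaller sample (size n1)."""
    mu_w = 0.5 * n1 * (n1 + n2 + 1)   # mean of W
    var_w = n2 * mu_w / 6             # variance of W
    return (w - mu_w) / math.sqrt(var_w)

z = rank_sum_z(19.5, 4, 6)
reject = z < -1.645                   # one-sided 5% test, lower tail
print(f"{z:.2f}", reject)             # -0.53 False
```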
b. Wilcoxon Signed Rank Test for Paired Samples.
This is a test for equality of medians when the data are paired. It can also be used for the median of a single
sample. The Sign Test for paired data is simpler to use in this situation, but it is less powerful.
As in many tests for measures of central tendency with paired data, the original numbers are discarded, and
the differences between the pairs are used. If there are $n$ pairs, the differences are ranked according to absolute value
from 1 to $n$, either top to bottom or bottom to top. After replacing tied absolute values with their average
rank, each rank is marked with a + or - sign and two rank sums are taken, $T^+$ and $T^-$. The smaller of
these is compared with Table 7.
Example: We wish to compare sales of a product before and after an advertisement appeared in a nationally
televised football game. Sales in a sample of eight stores before the game are $x_1$ and sales after are $x_2$.
Define $d = x_2 - x_1$ as the improvement in sales. Though the appropriate test here would be one-sided, a
two-sided test is demonstrated here instead. Our hypotheses about the medians are $H_0: \eta_1 = \eta_2$ and
$H_1: \eta_1 \ne \eta_2$, with $n = 8$ and $\alpha = .05$. The data are below. The column $|d|$ is the absolute value of $d$, the
column $r$ ranks the absolute values, and the column $r^*$ is the ranks corrected for ties and marked with the
signs of the differences.
x1      x2      d = x2 - x1   |d|     r    r*
7600    8600    +1000         1000    8    8+
8700    8900    +200          200     2    2.5+
9600    9400    -200          200     3    2.5-
8400    8700    +300          300     4    4+
7600    8100    +500          500     6    6+
6900    7500    +600          600     7    7+
7300    7700    +400          400     5    5+
8200    8100    -100          100     1    1-
If we add together the numbers in $r^*$ with a + sign we get $T^+ = 32.5$. If we do the same for numbers
with a - sign, we get $T^- = 3.5$. To check this, note that these two numbers must sum to the sum of the first
$n$ numbers, that this is $\frac{n(n+1)}{2} = \frac{8(9)}{2} = 36$, and that $T^+ + T^- = 32.5 + 3.5 = 36$.
We check 3.5, the smaller of the two rank sums, against the numbers in Table 7. For a two-sided 5% test, we
use the $\alpha = .025$ column. For $n = 8$, the critical value is 4, and we reject the null hypothesis only if our test
statistic is below this critical value. Since our test statistic is 3.5, we reject the null hypothesis.
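The signed-rank calculation for the sales example can be sketched as follows. This is an illustrative sketch only: the variable names are assumptions, the critical value 4 is the Table 7 value quoted in the text, and no pair has a zero difference, so the question of how to treat zeros does not arise here.

```python
before = [7600, 8700, 9600, 8400, 7600, 6900, 7300, 8200]   # x1
after = [8600, 8900, 9400, 8700, 8100, 7500, 7700, 8100]    # x2

d = [a - b for b, a in zip(before, after)]   # d = x2 - x1
abs_d = [abs(x) for x in d]
ordered = sorted(abs_d)

def average_rank(v):
    """Average 1-based position of v among the sorted absolute differences."""
    first = ordered.index(v) + 1
    count = ordered.count(v)
    return first + (count - 1) / 2

ranks = [average_rank(v) for v in abs_d]                  # ranks corrected for ties
t_plus = sum(r for r, x in zip(ranks, d) if x > 0)        # sum of ranks with + sign
t_minus = sum(r for r, x in zip(ranks, d) if x < 0)       # sum of ranks with - sign

n = len(d)
assert t_plus + t_minus == n * (n + 1) / 2                # check from the text
t_small = min(t_plus, t_minus)
reject = t_small < 4                                      # critical value from Table 7
print(t_plus, t_minus, reject)
```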
For values of $n$ that are too large for the table, $T_L$, the smaller of $T^+$ and $T^-$, has approximately the normal distribution
with mean $\mu_T = \frac{1}{4} n(n+1)$ and variance $\sigma_T^2 = \frac{1}{6}(2n+1)\mu_T$. Though the example above is too small for
this treatment, for continuity its data will be used here. If the significance level is 5% and the test is two-sided,
we reject our null hypothesis if $z = \frac{T_L - \mu_T}{\sigma_T}$ does not lie between $\pm z_{\alpha/2} = \pm z_{.025} = \pm 1.960$. In this
case $\mu_T = \frac{1}{4} n(n+1) = \frac{1}{4}(8)(8+1) = 18$ and $\sigma_T^2 = \frac{1}{6}(2n+1)\mu_T = \frac{1}{6}(16+1)(18) = 51$, so that
$z = \frac{T_L - \mu_T}{\sigma_T} = \frac{3.5 - 18}{\sqrt{51}} = -2.03$. Since this is not between $\pm 1.960$, we reject $H_0$.
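As with the rank-sum test, the normal approximation for the signed-rank statistic can be checked with a few lines of Python. The function name `signed_rank_z` is hypothetical; the formulas are the mean and variance given above.

```python
import math

def signed_rank_z(t, n):
    """z statistic for the smaller signed-rank sum t from n pairs."""
    mu_t = n * (n + 1) / 4            # mean of T
    var_t = (2 * n + 1) * mu_t / 6    # variance of T
    return (t - mu_t) / math.sqrt(var_t)

z = signed_rank_z(3.5, 8)
reject = abs(z) > 1.960               # two-sided 5% test
print(f"{z:.2f}", reject)             # -2.03 True
```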
6. Proportions.
If $p_1$ is the proportion of successes in the first population, and $p_2$ is the proportion of successes in the second
population, we define $\Delta p = p_1 - p_2$. Then our hypotheses will be $H_0: p_1 = p_2$ and $H_1: p_1 \ne p_2$
or, more generally, $H_0: \Delta p = \Delta p_0$ and $H_1: \Delta p \ne \Delta p_0$.
Let $\bar{p}_1 = \frac{x_1}{n_1}$, $\bar{p}_2 = \frac{x_2}{n_2}$ and $\Delta\bar{p} = \bar{p}_1 - \bar{p}_2$, where $x_1$ is the number of successes in the first sample, $x_2$ is the
number of successes in the second sample and $n_1$ and $n_2$ are the sample sizes. The usual three approaches
to testing the hypotheses can be used.
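a. Confidence Interval: $\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p}$ or $(p_1 - p_2) = (\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\, s_{\Delta p}$, where
$s_{\Delta p} = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_2 \bar{q}_2}{n_2}}$ and $\bar{q}_i = 1 - \bar{p}_i$. Compare this interval with $\Delta p_0$.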
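b. Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{(\bar{p}_1 - \bar{p}_2) - (p_{10} - p_{20})}{\sigma_{\Delta p}}$, where
$\sigma_{\Delta p} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, although $s_{\Delta p}$ may have to be used if $p_1$ and $p_2$ are unknown.
Also note that if the null hypothesis is $p_1 = p_2$ or $\Delta p_0 = 0$, we use
$\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where $\bar{p}_0 = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$, $\bar{q}_0 = 1 - \bar{p}_0$, and $x_1$ and $x_2$ are the
number of successes in sample 1 and sample 2, respectively.
c. Critical Value: $\Delta\bar{p}_{CV} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p}$ or $(\bar{p}_1 - \bar{p}_2)_{CV} = (p_{10} - p_{20}) \pm z_{\alpha/2}\, \sigma_{\Delta p}$.
Test this against $\bar{p}_1 - \bar{p}_2$. For calculation of $\sigma_{\Delta p}$, see Test Ratio above.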
Example: An insurance company operating in its home state (region 1) has 18 claims on
1000 policies, a ratio of .018. In another state (region 2) it has 12 claims on 400 policies,
a ratio of .030. Are these two ratios significantly different at the $\alpha = .01$ significance
level?
Our facts are $n_1 = 1000$, $x_1 = 18$, $\bar{p}_1 = .018$, $\bar{q}_1 = 1 - .018 = .982$ and
$n_2 = 400$, $x_2 = 12$, $\bar{p}_2 = .030$, $\bar{q}_2 = 1 - .030 = .970$. We are testing $H_0: p_1 = p_2$ against
$H_1: p_1 \ne p_2$ or, equivalently, $H_0: \Delta p = 0$ against $H_1: \Delta p \ne 0$. For the Critical Value or Test Ratio method, we need
$\bar{p}_0 = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} = \frac{1000(.018) + 400(.030)}{1000 + 400} = .0214$ or, more easily,
$\bar{p}_0 = \frac{x_1 + x_2}{n_1 + n_2} = \frac{18 + 12}{1000 + 400} = .0214$. This implies that $\bar{q}_0 = 1 - \bar{p}_0 = 1 - .0214 = .9786$ and that
$\sigma_{\Delta p} = \sqrt{\bar{p}_0 \bar{q}_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{.0214(.9786)\left(\frac{1}{1000} + \frac{1}{400}\right)} = \sqrt{.000073297} = .00856$.
$\Delta\bar{p} = \bar{p}_1 - \bar{p}_2 = .018 - .030 = -.012$. For a two-sided test, we will need $z_{\alpha/2} = z_{.005} = 2.576$.
Critical Value: $\Delta\bar{p}_{CV} = \Delta p_0 \pm z_{\alpha/2}\, \sigma_{\Delta p} = 0 \pm 2.576(.00856) = \pm .0221$. Since $\Delta\bar{p} = -.012$ falls between
$-.0221$ and $.0221$, do not reject the null hypothesis.
Test Ratio: $z = \frac{\Delta\bar{p} - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.012 - 0}{.00856} = -1.402$. Since this falls between $-2.576$ and $2.576$, do not reject
the null hypothesis.
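Confidence Interval: $s_{\Delta p} = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_2 \bar{q}_2}{n_2}} = \sqrt{\frac{.018(.982)}{1000} + \frac{.030(.970)}{400}} = \sqrt{.00009043} = .00951$ and
$\Delta p = \Delta\bar{p} \pm z_{\alpha/2}\, s_{\Delta p} = -.012 \pm 2.576(.00951) = -.012 \pm .024$, or $-.036$ to $.012$. Since $\Delta p_0 = 0$ falls between $-.036$ and $.012$, do not reject the null hypothesis.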
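The three approaches in the insurance example can be reproduced in a short Python sketch. The variable names are assumptions made for illustration, and $z_{.005} = 2.576$ is taken from the text. Because the code carries the pooled proportion at full precision instead of rounding it to .0214, the test ratio comes out near -1.40 rather than exactly -1.402; the conclusion is the same.

```python
import math

n1, x1 = 1000, 18      # region 1: policies, claims
n2, x2 = 400, 12       # region 2: policies, claims
z_crit = 2.576         # z_.005 for a two-sided test at alpha = .01

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2                                   # observed difference, about -.012

# Pooled proportion and standard error under H0: p1 = p2
p0 = (x1 + x2) / (n1 + n2)
sigma = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

# Test ratio and critical value (hypothesized difference is 0)
z = (diff - 0) / sigma
cv = z_crit * sigma
reject_by_ratio = abs(z) > z_crit
reject_by_cv = abs(diff) > cv

# Confidence interval uses the unpooled standard error
s = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (diff - z_crit * s, diff + z_crit * s)
reject_by_ci = not (ci[0] < 0 < ci[1])

print(f"sigma={sigma:.5f} z={z:.2f} cv={cv:.4f} s={s:.5f}")
print(reject_by_ratio, reject_by_cv, reject_by_ci)
```

All three decisions agree, as they should for the test ratio and critical value methods, which use the same standard error.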
7. Variances.
Test the ratios $\frac{s_1^2}{s_2^2}$ and $\frac{s_2^2}{s_1^2}$ against values of $F$.
If we want to test $H_0: \sigma_1^2 \le \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$, where $DF_1 = n_1 - 1$ and $DF_2 = n_2 - 1$, we effectively test
$H_0: \frac{\sigma_1^2}{\sigma_2^2} \le 1$ against $H_1: \frac{\sigma_1^2}{\sigma_2^2} > 1$ by comparing $\frac{s_1^2}{s_2^2}$ against $F_{\alpha}^{(DF_1, DF_2)}$. If the ratio of sample variances is larger than
$F_{\alpha}^{(DF_1, DF_2)}$, reject $H_0$.
If we want to do the opposite test, $H_0: \sigma_1^2 \ge \sigma_2^2$ against $H_1: \sigma_1^2 < \sigma_2^2$, we test
$H_0: \frac{\sigma_2^2}{\sigma_1^2} \le 1$ against $H_1: \frac{\sigma_2^2}{\sigma_1^2} > 1$ by comparing $\frac{s_2^2}{s_1^2}$ against $F_{\alpha}^{(DF_2, DF_1)}$.
For the 2-sided test, $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$, do both tests above, but use $F_{\alpha/2}$.
A 2-sided confidence interval is
$\frac{s_2^2}{s_1^2} \cdot \frac{1}{F_{\alpha/2}^{(DF_2, DF_1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2} \cdot F_{\alpha/2}^{(DF_1, DF_2)}$.
For examples, see the syllabus supplement article, “Confidence Intervals and Hypothesis Testing for
Variances.”
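The mechanics of the variance-ratio comparison can be sketched as below. Everything here is hypothetical and for illustration only: the function names, the sample variances, and the placeholder F values are assumptions, not figures from the notes or from an F table. In practice the F values must be looked up for the actual degrees of freedom and significance level.

```python
def one_sided_variance_test(s1_sq, s2_sq, f_crit):
    """Return True if H0 is rejected in favor of sigma1^2 > sigma2^2.

    f_crit is the upper-tail F value for (n1 - 1, n2 - 1) degrees of freedom.
    """
    return s1_sq / s2_sq > f_crit

def variance_ratio_interval(s1_sq, s2_sq, f_12, f_21):
    """Two-sided interval for sigma2^2 / sigma1^2.

    f_12 is F_{alpha/2} with (DF1, DF2) degrees of freedom;
    f_21 is F_{alpha/2} with (DF2, DF1) degrees of freedom.
    """
    ratio = s2_sq / s1_sq
    return ratio / f_21, ratio * f_12

# Hypothetical sample variances and placeholder F values (not from a table)
s1_sq, s2_sq = 40.0, 10.0
reject = one_sided_variance_test(s1_sq, s2_sq, f_crit=3.0)
lower, upper = variance_ratio_interval(s1_sq, s2_sq, f_12=4.0, f_21=4.0)
print(reject, lower, upper)
```

Passing the F values in as arguments keeps the sketch independent of any particular table or statistics library.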
8. Summary
It may help to use the following table.

                                        Paired Samples    Independent Samples
Location - Normal distribution.         Method D4         Methods D1-D3
  Compare means.
Location - Distribution not Normal.     Method D5b        Method D5a
  Compare medians.
Proportions                                               Method 6
Variability - Normal distribution.                        Method 7
  Compare variances.