252meanx2 11/04/07 (Open this document in 'Outline' view!)
D. COMPARISON OF TWO SAMPLES (CTD.)
5. Rank Tests.
Especially in the case where samples are small and the underlying distributions are not normal, it is not
appropriate to compare means.
a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples.
If the samples are independent, this test is appropriate for testing whether the two samples come from the same distribution. If the distributions are similar in shape, it is often called a test of equality of medians.
Example: Let us assume that we have two very small samples from New York (n₂ = 6) and Pennsylvania (n₁ = 4) and we wish to compare their medians. Let us call the smaller sample (Pennsylvania) 'sample 1' and the larger sample 'sample 2', so that n₁ ≤ n₂. If we use η for the median, our hypotheses are
H₀: η₁ = η₂
H₁: η₁ ≠ η₂
and α = .05.
Assume that our data are as below:
Pennsylvania: 11000, 16000, 80000, 85000
New York: 17000, 30000, 60000, 70000, 80000, 90000
Our first step is to rank the numbers from 1 to n = n₁ + n₂ = 4 + 6 = 10. Note that the 7th and 8th numbers are tied, so both are numbered 7.5. The numbers can be ranked from the largest to the smallest or from the smallest to the largest. To decide which to do, look at the smaller sample: if the smallest number overall is in the smaller sample, rank from smallest to largest; if the largest number overall is in the smaller sample, rank from largest to smallest. Since 11000, the smallest number, is in Pennsylvania, let it have rank 1.
Pennsylvania          New York
   x₁      r₁            x₂      r₂
 11000      1          17000      3
 16000      2          30000      4
 80000      7.5        60000      5
 85000      9          70000      6
                       80000      7.5
                       90000     10
          ----                  ----
          19.5                  35.5
Now compute the sums of the ranks: SR₁ = 19.5 and SR₂ = 35.5. As a check, note that these two rank sums must add to the sum of the first n numbers, which is n(n + 1)/2 = 10(11)/2 = 55, and that SR₁ + SR₂ = 19.5 + 35.5 = 55.
The smaller of SR₁ and SR₂ is called W and is compared with Table 5 or 6. {Wpval} {WCV} To use Table 5, first find the part for n₂ = 6, and then the column for n₁ = 4. Then try to locate W = 19.5 in that column. In this case, since the p-value for W = 19 is .3048 and the p-value for W = 20 is .3810, we can say that .3048 < p-value < .3810. Since both are above the significance level, we cannot reject the null hypothesis. W can also be compared against the critical values T_L and T_U in Table 6b; these are 13 and 31. Since W = 19.5 is between these values, we cannot reject the null hypothesis.
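The ranking, tie-averaging, and rank-sum check above can be reproduced with a short sketch in Python (the data are the example's; the helper function and variable names are ours, not from the text):

```python
# Minimal sketch of the Wilcoxon-Mann-Whitney rank sums for the example above.

def average_ranks(values):
    """Map each value in the pooled sample to its rank, averaging ties."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1                              # j is one past the tie group
        rank_of[ordered[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    return rank_of

pennsylvania = [11000, 16000, 80000, 85000]               # sample 1 (smaller)
new_york = [17000, 30000, 60000, 70000, 80000, 90000]     # sample 2

r = average_ranks(pennsylvania + new_york)
SR1 = sum(r[x] for x in pennsylvania)     # rank sum of sample 1: 19.5
SR2 = sum(r[x] for x in new_york)         # rank sum of sample 2: 35.5
n = len(pennsylvania) + len(new_york)
assert SR1 + SR2 == n * (n + 1) / 2       # check: rank sums total n(n+1)/2 = 55
W = min(SR1, SR2)                         # 19.5, compared with the table
```

Note that the tied 80000s both receive the averaged rank 7.5, exactly as in the hand computation.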
For values of n₁ and n₂ that are too large for the tables, W has approximately the normal distribution with mean μ_W = ½n₁(n₁ + n₂ + 1) and variance σ²_W = (1/6)n₂μ_W. Though the example above is too small for this treatment, for continuity its data will be used here. If the significance level is 5% and the test is one-sided, we reject our null hypothesis if z = (W − μ_W)/σ_W lies below −z.05 = −1.645. In this case
μ_W = ½n₁(n₁ + n₂ + 1) = ½(4)(4 + 6 + 1) = 22 and σ²_W = (1/6)n₂μ_W = (1/6)(6)(22) = 22, so that
z = (W − μ_W)/σ_W = (19.5 − 22)/√22 = −0.53.
Since this is not below −1.645, we cannot reject H₀.
b. Wilcoxon Signed Rank Test for Paired Samples.
This is a test for equality of medians when the data are paired. It can also be used for the median of a single sample. The Sign Test for paired data is a simpler test to use in this situation, but it is less powerful.
As in many tests for measures of central tendency with paired data, the original numbers are discarded, and the differences between the pairs are used. If there are n pairs, these are ranked according to absolute value from 1 to n, either top to bottom or bottom to top. After replacing tied absolute values with their average rank, each rank is marked with the + or − sign of its difference, and two rank sums, T⁺ and T⁻, are taken. The smaller of these is compared with Table 7.
Example: We wish to compare sales of a product before and after an advertisement appeared in a nationally televised football game. Sales in a sample of eight stores before the game are x₁ and sales after are x₂. Define d = x₂ − x₁ as the improvement in sales. Though the appropriate test here would be one-sided, a two-sided test is demonstrated here instead.
H₀: η₁ = η₂
H₁: η₁ ≠ η₂
with n = 8 and α = .05. The data are below: the column |d| is the absolute value of d, the column r ranks the absolute values, and the column r* contains the ranks corrected for ties and marked with the signs of the differences.
  x₁     x₂    d = x₂ − x₁    |d|    r    r*
 7600   8600      +1000       1000   8    8+
 8700   8900       +200        200   2    2.5+
 9600   9400       −200        200   3    2.5−
 8400   8700       +300        300   4    4+
 7600   8100       +500        500   6    6+
 6900   7500       +600        600   7    7+
 7300   7700       +400        400   5    5+
 8200   8100       −100        100   1    1−
If we add together the numbers in r* with a + sign, we get T⁺ = 32.5. If we do the same for the numbers with a − sign, we get T⁻ = 3.5. To check this, note that these two numbers must sum to the sum of the first n numbers, which is n(n + 1)/2 = 8(9)/2 = 36, and that T⁺ + T⁻ = 32.5 + 3.5 = 36.
We check 3.5, the smaller of the two rank sums, against the numbers in Table 7. {wsignedr} For a two-sided 5% test, we use the α = .025 column. For n = 8, the critical value is 4, and we reject the null hypothesis only if our test statistic is below this critical value. Since our test statistic is 3.5, we reject the null hypothesis.
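The signed ranks and the rank sums T⁺ and T⁻ for this example can be reproduced with a minimal sketch (pure Python; the tie-averaging mirrors the r* column):

```python
# Sketch of the Wilcoxon signed rank sums T+ and T- for the example above.
before = [7600, 8700, 9600, 8400, 7600, 6900, 7300, 8200]   # x1
after  = [8600, 8900, 9400, 8700, 8100, 7500, 7700, 8100]   # x2
d = [x2 - x1 for x1, x2 in zip(before, after)]              # differences

abs_sorted = sorted(abs(x) for x in d)

def signed_rank(diff):
    """Rank of |diff| among all |d|, averaging ties, carrying the sign of diff."""
    first = abs_sorted.index(abs(diff)) + 1    # first position of the tie group
    count = abs_sorted.count(abs(diff))        # size of the tie group
    rank = first + (count - 1) / 2             # average rank over the tie group
    return rank if diff > 0 else -rank

T_plus  = sum(r for r in map(signed_rank, d) if r > 0)    # 32.5
T_minus = -sum(r for r in map(signed_rank, d) if r < 0)   # 3.5
n = len(d)
assert T_plus + T_minus == n * (n + 1) / 2                # check: 8(9)/2 = 36
```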
For values of n that are too large for the table, T_L, the smaller of T⁺ and T⁻, has approximately the normal distribution with mean μ_T = ¼n(n + 1) and variance σ²_T = (1/6)(2n + 1)μ_T. Though the example above is too small for this treatment, for continuity its data will be used here. If the significance level is 5% and the test is two-sided, we reject our null hypothesis if z = (T_L − μ_T)/σ_T does not lie between ±z.025 = ±1.960. In this case
μ_T = ¼n(n + 1) = ¼(8)(8 + 1) = 18 and σ²_T = (1/6)(2n + 1)μ_T = (1/6)(16 + 1)(18) = 51, so that
z = (T_L − μ_T)/σ_T = (3.5 − 18)/√51 = −2.03.
Since this is not between ±1.960, we reject H₀.
6. Proportions.
6a. Independent Samples: If p₁ is the proportion of successes in the first population, and p₂ is the proportion of successes in the second population, we define Δp = p₁ − p₂. Then our hypotheses will be
H₀: p₁ = p₂
H₁: p₁ ≠ p₂
or, more generally,
H₀: Δp = Δp₀
H₁: Δp ≠ Δp₀
Let p̄₁ = x₁/n₁, p̄₂ = x₂/n₂ and Δp̄ = p̄₁ − p̄₂, where x₁ is the number of successes in the first sample, x₂ is the number of successes in the second sample, and n₁ and n₂ are the sample sizes. The usual three approaches to testing the hypotheses can be used.
a. Confidence Interval: Δp = Δp̄ ± z_{α/2} s_Δp̄, or (p₁ − p₂) = (p̄₁ − p̄₂) ± z_{α/2} s_Δp̄, where
s_Δp̄ = √(p̄₁q̄₁/n₁ + p̄₂q̄₂/n₂).
Compare this interval with Δp₀.
b. Test Ratio: z = (Δp̄ − Δp₀)/σ_Δp̄ = [(p̄₁ − p̄₂) − (p₁₀ − p₂₀)]/σ_Δp̄, where
σ_Δp̄ = √(p₁q₁/n₁ + p₂q₂/n₂),
although s_Δp̄ may have to be used if p₁ and p₂ are unknown. Also note that if the null hypothesis is p₁ = p₂ or Δp₀ = 0, we use
σ_Δp̄ = √[p₀q₀(1/n₁ + 1/n₂)], where p₀ = (n₁p̄₁ + n₂p̄₂)/(n₁ + n₂) = (x₁ + x₂)/(n₁ + n₂)
and x₁ and x₂ are the number of successes in sample 1 and sample 2, respectively.
c. Critical Value: Δp̄_CV = Δp₀ ± z_{α/2} σ_Δp̄, or (p̄₁ − p̄₂)_CV = (p₁₀ − p₂₀) ± z_{α/2} σ_Δp̄.
Test this against Δp̄ = p̄₁ − p̄₂. For the calculation of σ_Δp̄, see the Test Ratio above.
Example: An insurance company operating in its home state (region 1) has 18 claims on 1000 policies, a ratio of .018. In another state (region 2) it has 12 claims on 400 policies, a ratio of .030. Are these two ratios significantly different at the α = .01 significance level?
Our facts are n₁ = 1000, x₁ = 18, p̄₁ = .018, q̄₁ = 1 − .018 = .982 and n₂ = 400, x₂ = 12, p̄₂ = .030, q̄₂ = 1 − .030 = .970. We are testing
H₀: p₁ = p₂          or          H₀: Δp = 0
H₁: p₁ ≠ p₂                      H₁: Δp ≠ 0
For the Critical Value or Test Ratio method, we need
p₀ = (n₁p̄₁ + n₂p̄₂)/(n₁ + n₂) = [1000(.018) + 400(.030)]/(1000 + 400) = .0214
or, more easily,
p₀ = (x₁ + x₂)/(n₁ + n₂) = (18 + 12)/(1000 + 400) = .0214.
This implies that q₀ = 1 − p₀ = 1 − .0214 = .9786 and that
σ_Δp̄ = √[p₀q₀(1/n₁ + 1/n₂)] = √[(.0214)(.9786)(1/1000 + 1/400)] = √.000073297 = .00856.
Δp̄ = p̄₁ − p̄₂ = .018 − .030 = −.012. For a two-sided test, we will need z_{α/2} = z.005 = 2.576. {ttable}
Critical Value: Δp̄_CV = Δp₀ ± z_{α/2}σ_Δp̄ = 0 ± 2.576(.00856) = ±.0221. Since Δp̄ = −.012 falls between −.0221 and .0221, do not reject the null hypothesis.
Test Ratio: z = (Δp̄ − Δp₀)/σ_Δp̄ = (−.012 − 0)/.00856 = −1.402. Since this falls between −2.576 and 2.576, do not reject the null hypothesis.
Confidence Interval: s_Δp̄ = √(p̄₁q̄₁/n₁ + p̄₂q̄₂/n₂) = √[(.018)(.982)/1000 + (.030)(.970)/400] = √.00009043 = .00951, so
Δp = Δp̄ ± z_{α/2}s_Δp̄ = −.012 ± 2.576(.00951) = −.012 ± .0245, or −.0365 to .0125. Since Δp₀ = 0 falls inside this interval, do not reject the null hypothesis.
6b. Paired Samples: In Method D6a, we assume that we are comparing proportions from two independent samples. In the McNemar Test we compare two proportions taken from the same sample, which is equivalent to paired samples. Assume that two different questions are asked of the same group, with the following responses:

                question 2
question 1      yes    no
   yes          x₁₁    x₁₂
   no           x₂₁    x₂₂

So, for example, x₂₁ is the number of people who answered no to question 1 and yes to question 2. x₁₁ + x₁₂ + x₂₁ + x₂₂ = n, p̄₁ = (x₁₁ + x₁₂)/n and p̄₂ = (x₁₁ + x₂₁)/n. If we wish to test
H₀: p₁ = p₂
H₁: p₁ ≠ p₂
where p₁ is the proportion saying 'yes' to the first question and p₂ is the proportion saying 'yes' to the second question, let
z = (x₁₂ − x₂₁)/√(x₁₂ + x₂₁).
(The test is valid only if x₁₂ + x₂₁ ≥ 10.)
Example: A famous example of this concerns a debate between candidates; question 1 is whether the respondent supports candidate 1 before the debate and question 2 is whether the respondent supports candidate 1 after the debate. The data are

                question 2
question 1      yes    no
   yes           27     7
   no            13    28

and the question is whether the debate has changed the fraction supporting candidate 1. Write this out as a hypothesis test and do the test.
Solution: H₀: p₁ = p₂ against H₁: p₁ ≠ p₂, or H₀: p₁ − p₂ = 0 against H₁: p₁ − p₂ ≠ 0. This is a two-sided test, so if we use a 5% significance level, our rejection regions are below −z.025 = −1.96 {ttable} and above z.025 = 1.96.
z = (x₁₂ − x₂₁)/√(x₁₂ + x₂₁) = (7 − 13)/√(7 + 13) = −6/√20 = −√1.8 = −1.34,
and we cannot reject the null hypothesis. If we use a p-value, {norm}
p-value = 2P(z ≤ −1.34) = 2(.5 − .4099) = .1802,
so we could not reject the null hypothesis even at a 10% significance level. If you (wrongly, but understandably) thought that the hypotheses were H₀: p₁ ≥ p₂ against H₁: p₁ < p₂, or H₀: p₁ − p₂ ≥ 0 against H₁: p₁ − p₂ < 0, the one-sided p-value would be .0901, the 5% rejection region would be below −z.05 = −1.645, and we still could not reject the null hypothesis.
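The McNemar statistic for this example reduces to a few lines (standard-library sketch):

```python
from math import sqrt

# McNemar test for the debate example above.
x11, x12 = 27, 7     # yes before & yes after, yes before & no after
x21, x22 = 13, 28    # no before & yes after, no before & no after
assert x12 + x21 >= 10               # validity condition for the test
z = (x12 - x21) / sqrt(x12 + x21)    # -6/sqrt(20), about -1.34
reject = abs(z) > 1.96               # two-sided 5% test
```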
Note: This is a version of the Chi-Square Test. Recall that χ² = Σ[(O − E)²/E]. If we take x₁₁ and x₂₂ as given, and assume that the null hypothesis is correct, then the table already given,

                question 2
question 1      yes    no
   yes          x₁₁    x₁₂
   no           x₂₁    x₂₂

is our O, and the numbers in the x₁₂ and x₂₁ slots must be equal for there to be no change in preferences, so that our E is

                question 2
question 1      yes               no
   yes          x₁₁               (x₁₂ + x₂₁)/2
   no           (x₁₂ + x₂₁)/2     x₂₂

This means that two of the four terms in χ² = Σ[(O − E)²/E] are zero and the remaining terms are
χ² = [x₁₂ − (x₁₂ + x₂₁)/2]²/[(x₁₂ + x₂₁)/2] + [x₂₁ − (x₁₂ + x₂₁)/2]²/[(x₁₂ + x₂₁)/2] = (x₁₂ − x₂₁)²/(x₁₂ + x₂₁).
But this χ² has only one degree of freedom, and, since χ² with one degree of freedom is defined as z², we can take a square root and say
z = (x₁₂ − x₂₁)/√(x₁₂ + x₂₁).
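The algebraic identity χ² = z² claimed here can be checked directly on the example's off-diagonal counts (sketch):

```python
from math import sqrt

# Check that the McNemar chi-square with E = (x12 + x21)/2 equals z squared.
x12, x21 = 7, 13
E = (x12 + x21) / 2                            # expected count under no change: 10.0
chi2 = (x12 - E)**2 / E + (x21 - E)**2 / E     # (-3)²/10 + 3²/10 = 1.8
z = (x12 - x21) / sqrt(x12 + x21)              # -6/sqrt(20)
assert abs(chi2 - z * z) < 1e-9                # chi-square(1) equals z²
```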
Switch to document 252meanx4 here for examples for point 7.
7. Variances.
Test the ratios s₁²/s₂² and s₂²/s₁² against values of F.
If we want to test
H₀: σ₁² ≤ σ₂²
H₁: σ₁² > σ₂²
where DF₁ = n₁ − 1 and DF₂ = n₂ − 1, we effectively test H₀: σ₁²/σ₂² ≤ 1 against H₁: σ₁²/σ₂² > 1 by comparing s₁²/s₂² against F_α(DF₁, DF₂). If the ratio of sample variances is larger than F_α(DF₁, DF₂), reject H₀.
If we want to do the opposite test,
H₀: σ₁² ≥ σ₂²
H₁: σ₁² < σ₂²
we test H₀: σ₂²/σ₁² ≤ 1 against H₁: σ₂²/σ₁² > 1 by comparing s₂²/s₁² against F_α(DF₂, DF₁).
For the two-sided test
H₀: σ₁² = σ₂²
H₁: σ₁² ≠ σ₂²
do both tests above, but use F_{α/2}. A two-sided confidence interval is
(s₂²/s₁²)[1/F_{α/2}(DF₂, DF₁)] ≤ σ₂²/σ₁² ≤ (s₂²/s₁²)F_{α/2}(DF₁, DF₂).
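As an illustration of the one-sided decision rule, here is a sketch with made-up samples; the critical value F.05(5, 6) ≈ 4.39 is taken from a standard F table and, like the data, is an assumption of this example, not from the text:

```python
from statistics import variance

# Illustrative one-sided F test: H0: var1 <= var2 against H1: var1 > var2.
# The samples below are made up for demonstration purposes.
sample1 = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3]        # n1 = 6, so DF1 = 5
sample2 = [5.2, 5.9, 4.1, 6.0, 4.4, 5.8, 4.9]   # n2 = 7, so DF2 = 6

ratio = variance(sample1) / variance(sample2)   # s1²/s2², the test ratio
F_CRIT = 4.39        # assumed table value F.05(DF1 = 5, DF2 = 6)
reject = ratio > F_CRIT
```

Here the sample-variance ratio is well below the critical value, so H₀ would not be rejected.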
For examples, see the syllabus supplement article, “Confidence Intervals and Hypothesis Testing for
Variances.”
8. Summary
It may help to use the following table.

                                      Paired Samples    Independent Samples
Location - Normal distribution.       Method D4         Methods D1-D3
Compare means.
Location - Distribution not normal.   Method D5b        Method D5a
Compare medians.
Proportions                           Method D6b        Method D6a
Variability - Normal distribution.                      Method D7
Compare variances.

© 2005 Roger Even Bove