The Wilcoxon Rank Sum Test

Statistical Methods II
Session 9
Non Parametric Testing –
The Wilcoxon Rank Sum Test
(also known as the Mann Whitney Test)
Wilcoxon Rank Sum Test
Recall that Non-Parametric tests (in all forms) should be your
“Plan B”.
In the previous two sessions, we covered the Sign Test and
the Wilcoxon Signed Rank Test – both of which can be used
when testing the center location of a single population (or a
pair).
In the current session, we will be covering the Wilcoxon Rank
Sum Test – used with two independent samples.
Wilcoxon Rank Sum Test
Situation | Parametric test | Non-parametric test
One quantitative response variable | One-sample t-test | Sign Test
One quantitative response variable, two values from paired samples | Paired-sample t-test | Wilcoxon Signed Rank Test
One quantitative response variable, one qualitative independent variable with two groups | Two independent sample t-test | Wilcoxon Rank Sum or Mann-Whitney Test
One quantitative response variable, one qualitative independent variable with three or more groups | ANOVA | Kruskal-Wallis
Wilcoxon Rank Sum Test
Although this test does not rely on parametric assumptions (in
particular, the number of observations can be small), it
does require two things:
1. The two groups being tested are independent of each
other.
2. The two groups should have similarly shaped
distributions (this test evaluates the “shift” between the
distributions).
Wilcoxon Rank Sum Test
The hypothesis statements function the same way as in the
two-sample t-test, but we are focused on the medians
rather than on the means:
H0: η1 – η2 = 0
H1: η1 – η2 ≠ 0
These could also be expressed as one-tailed tests.
Wilcoxon Rank Sum Test
Step 1: List the data values from both samples in a single list
arranged from smallest to largest.
Step 2: In the next column, assign the numbers 1 to N (where
N = n1+n2). These are the ranks of the observations. As
before, if there are ties, assign the average of the ranks the
values would receive to each of the tied values.
Step 3: Let W denote the sum of the ranks for the observations
from Population 1.
Note that if there is no difference between the two medians
(the null is true), the value of W will be close to its expected
value, n1(N+1)/2. When n1 = n2, this is exactly half the sum
of all the ranks.
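Steps 1 to 3 can be sketched in Python as follows. This is only an illustrative sketch: the function name rank_sum and the toy data are hypothetical (they do not come from the slides), and the toy data were chosen so that one tie appears and the average-rank rule is exercised.

```python
def rank_sum(sample1, sample2):
    """Return (W, expected_W), where W is the rank sum of sample1."""
    # Step 1: pool both samples, remembering which sample each value came from,
    # and sort from smallest to largest.
    pooled = sorted([(v, 1) for v in sample1] + [(v, 2) for v in sample2])
    n_total = len(pooled)

    # Step 2: assign ranks 1..N; tied values all receive the average of the
    # ranks they would otherwise occupy.
    ranks = [0.0] * n_total
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    # Step 3: W is the sum of the ranks belonging to sample 1.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 1)

    # Expected value of W when the null hypothesis is true: n1(N+1)/2.
    expected_w = len(sample1) * (n_total + 1) / 2
    return w, expected_w


# Hypothetical toy data (not from the slides), with one tie at 2.5.
group1 = [1.2, 2.5, 3.1]
group2 = [2.5, 4.0, 4.4, 5.0]
w, expected_w = rank_sum(group1, group2)
print(w, expected_w)
```

Here the two 2.5 values would occupy ranks 2 and 3, so each receives 2.5, and W for the first group (7.5) falls below its expected value under the null (12).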
Wilcoxon Rank Sum Test
The following data measures the reaction times of two
samples of people – one set drank alcohol, one set drank a
placebo.
Alcohol   Placebo
1.56      0.90
1.56      0.37
1.76      1.63
1.44      0.83
1.11      0.95
3.07      0.78
0.98      0.86
1.27      0.61
2.56      0.38
1.32      1.97
Wilcoxon Rank Sum Test
From this dataset, the hypothesis statements will be:
H0: The median reaction time for the placebo group is the
same as or slower than the median reaction time for the
alcohol group.
H1: The median reaction time for the placebo group is
faster than the median reaction time for the alcohol group.
Wilcoxon Rank Sum Test
Data   Rank   Alcohol or Placebo Group
0.37    1     Placebo
0.38    2     Placebo
0.61    3     Placebo
0.78    4     Placebo
0.83    5     Placebo
0.86    6     Placebo
0.90    7     Placebo
0.95    8     Placebo
0.98    9     Alcohol
1.11   10     Alcohol
1.27   11     Alcohol
1.32   12     Alcohol
1.44   13     Alcohol
1.45   14     Alcohol
1.46   15     Alcohol
1.63   16     Placebo
1.76   17     Alcohol
1.97   18     Placebo
2.56   19     Alcohol
3.07   20     Alcohol
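The ranking above can be reproduced with a few lines of Python. This is a sketch only: the values are typed in as they appear in the ranked table, the variable names are illustrative, and the simple 1-to-N numbering used here is valid only because these values contain no ties.

```python
# Reaction times as they appear in the ranked table (no tied values).
placebo = [0.37, 0.38, 0.61, 0.78, 0.83, 0.86, 0.90, 0.95, 1.63, 1.97]
alcohol = [0.98, 1.11, 1.27, 1.32, 1.44, 1.45, 1.46, 1.76, 2.56, 3.07]

# Pool the two groups and sort from smallest to largest.
pooled = sorted(
    [(v, "Placebo") for v in placebo] + [(v, "Alcohol") for v in alcohol]
)

# Number the sorted values 1..N and accumulate the rank sum of each group.
rank_sums = {"Placebo": 0, "Alcohol": 0}
for rank, (value, group) in enumerate(pooled, start=1):
    print(f"{value:5.2f}  {rank:2d}  {group}")
    rank_sums[group] += rank

n_total = len(pooled)
print(rank_sums)

# Sanity check: the two rank sums must add up to 1 + 2 + ... + N = N(N+1)/2.
assert sum(rank_sums.values()) == n_total * (n_total + 1) // 2
```

The final check is a useful habit when ranking by hand: if the group rank sums do not total N(N+1)/2, a rank has been skipped or duplicated.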
Wilcoxon Rank Sum Test
If we sum the ranks of the Placebo group, we get
W = 1+2+3+4+5+6+7+8+16+18 = 70.
Under the null hypothesis, the expected rank sum is
(10*21)/2 = 105. Since the placebo rank sum is much lower,
we have initial evidence that the placebo group had quicker
reaction times than did the alcohol group.
A z-score approximation can be found on page S2-11 of
your book.
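As a rough illustration of how such an approximation works, the sketch below uses the standard large-sample result that, under the null, W has mean n1(N+1)/2 and variance n1*n2*(N+1)/12. It assumes no tie correction and no continuity correction, so the formula in the book may differ in those details; the function name rank_sum_z is hypothetical.

```python
import math


def rank_sum_z(w, n1, n2):
    """Normal approximation for a rank sum W taken from the sample of size n1.

    Assumes no ties and applies no continuity correction.
    """
    n_total = n1 + n2
    mean_w = n1 * (n_total + 1) / 2                 # expected W under the null
    sd_w = math.sqrt(n1 * n2 * (n_total + 1) / 12)  # standard deviation of W
    z = (w - mean_w) / sd_w
    # Lower-tail probability P(Z <= z), computed from the error function.
    p_lower = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_lower


# Placebo group from the example: W = 70 with n1 = n2 = 10.
z, p_lower = rank_sum_z(w=70, n1=10, n2=10)
print(round(z, 2), round(p_lower, 4))
```

For the placebo group this gives z of about -2.65, with a lower-tail probability of roughly 0.004, which agrees with the informal comparison of 70 against 105. With only 10 observations per group the normal approximation is itself rough, so exact tables or software output are preferable where available.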
Wilcoxon Rank Sum Test
Let's do this same test using SAS…