Nonparametric Tests.

Nonparametric Statistics
In previous testing, we assumed that our samples were drawn
from normally distributed populations.
This chapter introduces some techniques that do not make
that assumption.
These methods are called distribution-free or nonparametric
tests.
In situations where the normal assumption is appropriate,
nonparametric tests are less efficient than traditional
parametric methods.
Nonparametric tests frequently make use only of the order of
the observations and not the actual values.
In this section, we will discuss four nonparametric tests:
the Wilcoxon Rank Sum Test (or Mann-Whitney U test),
the Wilcoxon Signed Ranks Test,
the Kruskal-Wallis Test, and
the one sample test of runs.
The Wilcoxon Rank Sum Test (or Mann-Whitney U Test)
This test is used to test whether 2 independent samples have
been drawn from populations with the same median.
It is a nonparametric substitute for the t-test on the difference
between two means.
Wilcoxon Rank Sum Test Example:
Based on the following samples from two universities, test at the 10% level
whether graduates from the two schools have the same average grade on an
aptitude test.

university A    university B
50              70
52              73
56              77
60              80
64              83
68              85
71              87
74              88
89              96
95              99
First merge and rank the grades.
Sum the ranks for each sample.
rank sum for university A: 74
rank sum for university B: 136
rank    grade    university
1       50       A
2       52       A
3       56       A
4       60       A
5       64       A
6       68       A
7       70       B
8       71       A
9       73       B
10      74       A
11      77       B
12      80       B
13      83       B
14      85       B
15      87       B
16      88       B
17      89       A
18      95       A
19      96       B
20      99       B
Note: If there are ties, each value gets the average rank. For example,
if 2 values tie for 3rd and 4th place, both are ranked 3.5.
If three values would be ranked 7, 8, and 9, rank them all 8.
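The average-rank rule for ties can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `average_ranks` and the sample values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` over the block of equal values that starts at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end (0-based) occupy ranks pos+1..end+1.
        shared = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Hypothetical values in which two observations tie for 3rd and 4th place.
print(average_ranks([10, 20, 30, 30, 40]))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```

With three values tied for ranks 7, 8, and 9, the same function assigns each of them rank 8.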
Here, the group from university A is considered the 1st sample.
When the samples differ in size, designate the smaller of the
2 samples as the 1st sample.
Define $T_1$ = sum of the ranks for the 1st sample.
The mean of $T_1$ is
$\mu_{T_1} = \frac{n_1 (n_1 + n_2 + 1)}{2}$,
and the standard deviation is
$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$.
If $n_1$ and $n_2$ are each at least 10, $T_1$ is approximately normal.
So,
$Z = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}}$
has approximately a standard normal distribution.
(For small sample sizes, the Z approximation is sometimes used as well.)
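For our example, $T_1 = 74$, $n_1 = n_2 = 10$, and $n = n_1 + n_2 = 20$.
$\mu_{T_1} = \frac{n_1 (n + 1)}{2} = \frac{10(20 + 1)}{2} = 105$
$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{(10)(10)(20 + 1)}{12}} = 13.229$
$Z = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{74 - 105}{13.229} = -2.343$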
Since the critical values for a 2-tailed Z test at the 10% level are
1.645 and -1.645, and Z = -2.343 falls below -1.645, we reject H0 that
the medians are the same and accept H1 that the medians are different.
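As a check on the arithmetic, the rank sums and the Z statistic for the university example can be reproduced with a short Python sketch. The variable names are illustrative, and the simple position-based ranking relies on the fact that this data set has no tied grades.

```python
import math

a = [50, 52, 56, 60, 64, 68, 71, 74, 89, 95]  # university A (1st sample)
b = [70, 73, 77, 80, 83, 85, 87, 88, 96, 99]  # university B

# Merge and rank. There are no ties here, so each grade's rank is simply
# its 1-based position in the sorted merged list.
merged = sorted(a + b)
rank = {grade: i + 1 for i, grade in enumerate(merged)}

t1 = sum(rank[g] for g in a)  # rank sum for the 1st sample
t2 = sum(rank[g] for g in b)  # rank sum for the 2nd sample

n1, n2 = len(a), len(b)
mu_t1 = n1 * (n1 + n2 + 1) / 2
sigma_t1 = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (t1 - mu_t1) / sigma_t1

# Two-tailed test at the 10% level: reject H0 when |Z| > 1.645.
reject = abs(z) > 1.645

print(t1, t2)                                         # 74 136
print(mu_t1, round(sigma_t1, 3), round(z, 3), reject)  # 105.0 13.229 -2.343 True
```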
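[Figure: standard normal curve for Z, with critical regions of area .05 in each tail (below -1.645 and above 1.645) and area .45 on each side of 0 between the critical values.]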
For small sample sizes, you can use Table E.6 in
your textbook, which provides the lower and upper
critical values for the Wilcoxon Rank Sum Test.
That table shows that for our 10% 2-tailed test, the
lower critical value is 82 and the upper critical value
is 128.
Since the rank sum for our first sample is 74, which is
outside the interval (82, 128) indicated in the table,
we reject the null hypothesis that the medians are
the same and conclude that they are different.
Equivalently, since the rank sum for the other sample is
136, which is also outside the interval (82, 128), we
again reject the null hypothesis that the medians are
the same and conclude that they are different.
The Wilcoxon Signed Rank Test
This test is used to test whether 2 dependent samples have
been drawn from populations with the same median.
It is a nonparametric substitute for the paired t-test on the
difference between two means.
Wilcoxon Signed Rank Test Procedure
1. Calculate the differences in the paired values ($D_i = X_{1i} - X_{2i}$).
2. Take absolute values of the differences and rank them. (Discard
   all differences that equal 0.)
3. Assign ranks $R_i$ with the smallest rank equal to 1.
   As in the rank sum test, if two or more of the differences are
   equal, each difference gets the average rank. (That is, if two
   differences would be ranked 3 and 4, rank them both 3.5. If
   three differences would be ranked 7, 8, and 9, rank them all 8.)
4. Assign the symbol + to positive differences and – to negative
   differences.
5. Calculate the Wilcoxon statistic W as the sum of the positive
   ranks. So, $W = \sum_i R_i^{(+)}$, where the sum runs over the
   positive differences only.
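The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `signed_rank_statistic` and the small data set are hypothetical, and differences are taken as x1 minus x2, as in step 1.

```python
def signed_rank_statistic(x1, x2):
    """Return (W, n): the sum of the positive ranks and the number of
    non-zero differences, with differences defined as D_i = x1_i - x2_i."""
    # Steps 1-2: compute the differences and discard the zeros.
    diffs = [a - b for a, b in zip(x1, x2) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    # Step 3: each distinct absolute difference gets the average of the
    # ranks it occupies in the sorted list (smallest rank is 1).
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value) + 1  # lowest rank held by this value
        count = abs_sorted.count(value)      # number of tied differences
        rank_of[value] = first + (count - 1) / 2

    # Steps 4-5: W is the sum of the ranks of the positive differences.
    w = sum(rank_of[abs(d)] for d in diffs if d > 0)
    return w, len(diffs)


# Hypothetical paired data: differences are 2, 0, -3, 3, 1.
w, n = signed_rank_statistic([12, 15, 9, 20, 14], [10, 15, 12, 17, 13])
print(w, n)  # 6.5 4
```

In the toy data the zero difference is dropped, the two differences of absolute value 3 share rank 3.5, and the positive differences 2, 3, and 1 contribute ranks 2, 3.5, and 1.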
Wilcoxon Signed Rank Test Procedure (cont’d)
In the following, n refers to the number of non-zero differences.
The mean of the Wilcoxon statistic W is
$\mu_W = \frac{n(n + 1)}{4}$.
The standard deviation of the Wilcoxon statistic W is
$\sigma_W = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$.
If n is at least 20, the test statistic W is approximately normal. So we have:
$Z = \frac{W - \mu_W}{\sigma_W}$
(For small sample sizes, the Z approximation is sometimes used as well.)
Example
Suppose we have
a class with 22
students, each of
whom has two
exam grades.
We want to test at
the 5% level
whether there is a
difference in the
median grade for
the two exams.
exam1    exam2
95       97
76       76
82       75
48       54
27       31
34       39
58       61
98       97
45       45
77       94
27       36
72       68
78       94
58       55
73       75
71       70
69       66
57       62
84       92
91       81
83       90
67       73
We calculate the
difference between the
exam grades:
diff = exam2 – exam 1.
exam1    exam2    diff (ex2-ex1)
95       97         2
76       76         0
82       75        -7
48       54         6
27       31         4
34       39         5
58       61         3
98       97        -1
45       45         0
77       94        17
27       36         9
72       68        -4
78       94        16
58       55        -3
73       75         2
71       70        -1
69       66        -3
57       62         5
84       92         8
91       81       -10
83       90         7
67       73         6
Then we rank the
absolute values of the
differences from
smallest to largest,
omitting the two zero
differences.
The smallest non-zero
|differences| are the
two |-1|’s. Since they
are tied for ranks 1
and 2, we rank them
both 1.5.
Since the differences
were negative, we put
the ranks in the
negative column.
The next smallest non-zero |differences| are the two |2|’s.
Since they are tied for ranks 3 and 4, we rank them both 3.5.
Since the differences were positive, we put the ranks in the
positive column.
The next smallest non-zero |differences| are the two |-3|’s and
the |3|. Since they are tied for ranks 5, 6, and 7, we rank them
all 6.
Then we put the ranks in the appropriately signed columns.
We continue until
we have ranked all
the non-zero
|differences| .
exam1    exam2    diff (ex2-ex1)    rank (+)    rank (-)
95       97         2                3.5
76       76         0
82       75        -7                            14.5
48       54         6               12.5
27       31         4                8.5
34       39         5               10.5
58       61         3                6
98       97        -1                             1.5
45       45         0
77       94        17               20
27       36         9               17
72       68        -4                             8.5
78       94        16               19
58       55        -3                             6
73       75         2                3.5
71       70        -1                             1.5
69       66        -3                             6
57       62         5               10.5
84       92         8               16
91       81       -10                            18
83       90         7               14.5
67       73         6               12.5
Then we total the signed ranks. We get 154 for the sum of
the positive ranks and 56 for the sum of the negative ranks.
The Wilcoxon test statistic is the sum of the positive ranks.
So W = 154.
Since we had 22 students and 2 zero differences, the number of
non-zero differences is n = 20.
Recall that the mean of W is
$\mu_W = \frac{n(n + 1)}{4} = \frac{(20)(21)}{4} = 105$.
The standard deviation of W is
$\sigma_W = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{20(21)(41)}{24}} = 26.786$.
So we have:
$Z = \frac{W - \mu_W}{\sigma_W} = \frac{154 - 105}{26.786} = 1.829$.
Since the critical values for a 2-tailed Z test at the 5% level
are 1.96 and -1.96, and Z = 1.829 lies between them, we cannot
reject the null hypothesis H0 that the medians are the same.
The data do not show a significant difference between the medians.
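The normal approximation for the signed rank statistic can be checked with a short Python sketch. The function name `signed_rank_z` is illustrative; the inputs W = 154 and n = 20 are the values from the exam example.

```python
import math


def signed_rank_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W,
    where n is the number of non-zero differences."""
    mu_w = n * (n + 1) / 4
    sigma_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mu_w) / sigma_w, mu_w, sigma_w


z, mu_w, sigma_w = signed_rank_z(154, 20)
print(mu_w, round(sigma_w, 3), round(z, 3))  # 105.0 26.786 1.829

# Two-tailed test at the 5% level: reject H0 only when |Z| > 1.96.
print(abs(z) > 1.96)  # False
```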
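[Figure: standard normal curve for Z, with critical regions of area .025 in each tail (below -1.96 and above 1.96) and area .475 on each side of 0 between the critical values.]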
For small sample sizes, you can use Table 12.19 in
the online material associated with section 12.8 of
your textbook, which provides the lower and upper
critical values for the Wilcoxon Signed Rank Test.
This table is shown on the next slide.
Lower & Upper Critical Values, W, of Wilcoxon Signed Ranks Test

ONE-TAIL:   α = 0.05    α = 0.025   α = 0.01    α = 0.005
TWO-TAIL:   α = 0.10    α = 0.05    α = 0.02    α = 0.01
n           (Lower, Upper)
5           0,15        —,—         —,—         —,—
6           2,19        0,21        —,—         —,—
7           3,25        2,26        0,28        —,—
8           5,31        3,33        1,35        0,36
9           8,37        5,40        3,42        1,44
10          10,45       8,47        5,50        3,52
11          13,53       10,56       7,59        5,61
12          17,61       13,65       10,68       7,71
13          21,70       17,74       12,79       10,81
14          25,80       21,84       16,89       13,92
15          30,90       25,95       19,101      16,104
16          35,101      29,107      23,113      19,117
17          41,112      34,119      27,126      23,130
18          47,124      40,131      32,139      27,144
19          53,137      46,144      37,153      32,158
20          60,150      52,158      43,167      37,173
Recall that we have 20 non-zero differences and are performing a 5%
2-tailed test. Here we see that the lower critical value is 52 and the
upper critical value is 158.
Our statistic W, the sum of the positive ranks, is 154, which is
inside the interval (52, 158) indicated in the table.
So we cannot reject the null hypothesis that the medians are the same.
The Kruskal-Wallis Test
This test is used to test whether several populations have the
same median.
It is a nonparametric substitute for a one-factor ANOVA F-test.
The test statistic is
$K = \left[ \frac{12}{n(n + 1)} \sum_{j=1}^{c} \frac{R_j^2}{n_j} \right] - 3(n + 1)$,
where $n_j$ is the number of observations in the jth sample,
n is the total number of observations, and
$R_j$ is the sum of ranks for the jth sample.
If each $n_j \geq 5$ and the null hypothesis is true,
then the distribution of K is approximately $\chi^2$ with dof = c - 1,
where c is the number of sample groups.
In the case of ties, a corrected statistic should be computed:
$K_c = \frac{K}{1 - \frac{\sum_j (t_j^3 - t_j)}{n^3 - n}}$,
where $t_j$ is the number of tied observations in the jth group of ties.
Kruskal-Wallis Test Example: Test at the 5% level whether
average employee performance is the same at 3 firms, using
the following standardized test scores for 20 employees.

Firm 1 score    Firm 2 score    Firm 3 score
78              68              82
95              77              65
85              84              50
87              61              93
75              62              70
90              72              60
80                              73
n1 = 7          n2 = 6          n3 = 7
We rank all the scores. Then we sum the ranks for each firm.
Then we calculate the K statistic.
Firm 1            Firm 2            Firm 3
score   rank      score   rank      score   rank
78      12        68      6         82      14
95      20        77      11        65      5
85      16        84      15        50      1
87      17        61      3         93      19
75      10        62      4         70      7
90      18        72      8         60      2
80      13                          73      9
n1 = 7            n2 = 6            n3 = 7
R1 = 106          R2 = 47           R3 = 57
$K = \left[ \frac{12}{n(n + 1)} \sum_j \frac{R_j^2}{n_j} \right] - 3(n + 1) = \frac{12}{20(21)} \left[ \frac{106^2}{7} + \frac{47^2}{6} + \frac{57^2}{7} \right] - 3(21) = 6.641$
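[Figure: density f($\chi^2$) of the $\chi^2$ distribution with 2 dof, with the acceptance region below 5.991 and the critical region of area .05 above 5.991.]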
From the $\chi^2$ table, we see that the 5% critical value for a $\chi^2$
with 2 dof is 5.991.
Since our value for K was 6.641, which exceeds 5.991, we reject H0 that
the medians are the same and accept H1 that the medians are
different.
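The rank sums and the K statistic for the three firms can be reproduced with the following Python sketch. The variable names are illustrative, the position-based ranking assumes no tied scores (true for this data set), and the critical value 5.991 is taken from the text rather than computed.

```python
firms = {
    "Firm 1": [78, 95, 85, 87, 75, 90, 80],
    "Firm 2": [68, 77, 84, 61, 62, 72],
    "Firm 3": [82, 65, 50, 93, 70, 60, 73],
}

# Rank all 20 scores together. There are no ties, so each score's rank
# is its 1-based position in the sorted pooled list.
all_scores = sorted(s for scores in firms.values() for s in scores)
rank = {score: i + 1 for i, score in enumerate(all_scores)}
n = len(all_scores)

# Sum of ranks R_j for each firm.
rank_sums = {name: sum(rank[s] for s in scores) for name, scores in firms.items()}

# K = 12 / (n (n + 1)) * sum(R_j^2 / n_j) - 3 (n + 1)
k = 12 / (n * (n + 1)) * sum(
    rank_sums[name] ** 2 / len(scores) for name, scores in firms.items()
) - 3 * (n + 1)

print(rank_sums)    # {'Firm 1': 106, 'Firm 2': 47, 'Firm 3': 57}
print(round(k, 3))  # 6.641
print(k > 5.991)    # True, so H0 is rejected at the 5% level
```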
One sample test of runs
a test for randomness of order of occurrence
A run is a sequence of identical occurrences
that are followed and preceded by different
occurrences.
Example: The list of X’s & O’s below consists of 7 runs.
xxxooooxxooooxxxxoox
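Counting runs amounts to counting the places where the symbol changes and adding one. A minimal Python sketch (the function name `count_runs` is illustrative):

```python
def count_runs(sequence):
    """Count maximal blocks of identical consecutive symbols."""
    if not sequence:
        return 0
    # Every change from one symbol to a different one starts a new run.
    changes = sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    return changes + 1


print(count_runs("xxxooooxxooooxxxxoox"))  # 7
```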
Suppose r is the number of runs, n1 is the number of
type 1 occurrences and n2 is the number of type 2
occurrences.
The mean number of runs is
$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1$.
The standard deviation of the number of runs is
$\sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$.
If n1 and n2 are each at least 10, then r is
approximately normal.
So,
$Z = \frac{r - \mu_r}{\sigma_r}$
is approximately a standard normal variable.
Example: A stock exhibits the following price increase (+) and
decrease (-) behavior over 25 business days. Test at the 1% level
whether the pattern is random.
+ + + - - + - - - + + - + - + - - + + - + + - + -
r = 16, n1 (+) = 13, n2 (-) = 12
$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(13)(12)}{13 + 12} + 1 = 13.48$
$\sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(13)(12)[2(13)(12) - 13 - 12]}{(13 + 12)^2 (13 + 12 - 1)}} = 2.44$
$Z = \frac{r - \mu_r}{\sigma_r} = \frac{16 - 13.48}{2.44} = 1.03$
Since the critical values for a 2-tailed 1% test are 2.575 and -2.575,
and Z = 1.03 lies between them, we cannot reject H0 that the pattern
is random.
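The runs test arithmetic can be verified with a short Python sketch. The function name `runs_test_z` is illustrative; the inputs r = 16, n1 = 13, and n2 = 12 are the counts stated in the example.

```python
import math


def runs_test_z(r, n1, n2):
    """Normal approximation for the one-sample runs test.
    r is the observed number of runs; n1 and n2 are the counts of each type."""
    mu_r = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_r = math.sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return (r - mu_r) / sigma_r, mu_r, sigma_r


z, mu_r, sigma_r = runs_test_z(16, 13, 12)
print(round(mu_r, 2), round(sigma_r, 2), round(z, 2))  # 13.48 2.44 1.03

# Two-tailed test at the 1% level: reject H0 only when |Z| > 2.575.
print(abs(z) > 2.575)  # False
```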
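[Figure: standard normal curve for Z, with critical regions of area .005 in each tail (below -2.575 and above 2.575) and an acceptance region of area .495 on each side of 0 between the critical values.]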