Nonparametric test

Independent samples: Wilcoxon rank sum test

Example

The main outcome measure in MS is the expanded disability status scale (EDSS)

The EDSS is a 0-10 scale with steps of 0.5

Ordinal scale

– Ordered, but magnitude between steps is uncertain

Dr. Kurtzke, who developed the scale, believes its steps are just ranks, not a measure of magnitude

– This makes a t-test inappropriate

Pediatric vs. adult

Most MS patients develop the disease between ages 20 and 40, but a subset of patients develop MS at a younger age

What is different about these patients?

If we investigated patients at similar disease duration, is there a significant difference in EDSS?

Since we have two independent samples, we could have used a two-sample t-test

Unfortunately, there seem to be outliers in the adult group

Also, we know that we have ordinal data so a t-test is not appropriate

Wilcoxon rank sum test

Since we have two independent samples and the t-test is not appropriate, we need a nonparametric test. The nonparametric test for two independent samples is the Wilcoxon rank sum test.

Again, we are interested in the median rather than the mean.

The hypothesis test of interest is

– H0: median adult = median pediatric

– HA: median adult != median pediatric

Wilcoxon rank sum

Again, we use the rank of the data points, rather than the actual values.

An exact Wilcoxon rank sum test can be used, but we focus on the approximate test.

Patient  EDSS  Group  Rank
1        0     P      1
2        1.5   P      4.5
3        1.5   P      4.5
4        1     P      2.5
5        2     A      6
6        1     A      2.5
7        3     A      7
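The ranking step in the seven-patient illustration can be sketched in a few lines of Python. This is a minimal sketch, not the course's own code: the function name `midranks` is hypothetical, and the EDSS values are listed in patient order 1 to 7 as read from the table. Tied values share the average of the positions they occupy, which is why the two patients with EDSS 1 both get rank 2.5.

```python
def midranks(values):
    """Return 1-based ranks; tied values get the average of their positions."""
    # Indices of the observations, sorted by value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


# EDSS values for patients 1-7 (assumed patient order, as read from the table).
edss = [0, 1.5, 1.5, 1, 2, 1, 3]
print(midranks(edss))  # [1.0, 4.5, 4.5, 2.5, 6.0, 2.5, 7.0]
```

The ranks always sum to N(N+1)/2, here 7 * 8 / 2 = 28, which is a quick check on a hand ranking.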

Approximate Wilcoxon test

If the sample size is large enough (rule of thumb, n=20), an approximate Wilcoxon test based on the normal approximation can be used:

$$z = \frac{W - m_W}{s_W}$$

– $W$ = sum of ranks in smaller group

– $m_W$ = expected sum of ranks in smaller group under null

– $s_W$ = standard deviation of sum of ranks in smaller group under null

$m_W$ and $s_W$

Under the null of no difference between the groups, the expected sum of ranks in the smaller group is

$$m_W = \frac{n_S (n_S + n_L + 1)}{2}$$

The standard deviation is given by

$$s_W = \sqrt{\frac{n_S \, n_L \, (n_S + n_L + 1)}{12}}$$

where $n_S$ and $n_L$ are the sizes of the smaller and larger groups.
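A brief sketch of where the mean and standard deviation come from, assuming no ties. The symbols $N = n_S + n_L$ and $R$ (the rank of a single observation) are notation introduced here, not in the slides. Under the null, the ranks in the smaller group behave like $n_S$ values drawn without replacement from $1, \dots, N$.

```latex
\begin{aligned}
E[R] &= \frac{N + 1}{2}, \qquad \operatorname{Var}(R) = \frac{N^2 - 1}{12} \\[4pt]
m_W &= n_S \, E[R] = \frac{n_S (n_S + n_L + 1)}{2} \\[4pt]
s_W^2 &= n_S \operatorname{Var}(R) \cdot \frac{N - n_S}{N - 1}
       = n_S \cdot \frac{(N - 1)(N + 1)}{12} \cdot \frac{n_L}{N - 1}
       = \frac{n_S \, n_L \, (n_S + n_L + 1)}{12}
\end{aligned}
```

The factor $(N - n_S)/(N - 1)$ is the usual finite-population correction for sampling without replacement.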

Results

From our results,

– sum of the ranks in smaller group: $W = 1526$

– expected sum of ranks in smaller group: $m_W = \frac{21 \, (21 + 110 + 1)}{2} = 1386$

– standard deviation of sum of ranks in smaller group: $s_W = \sqrt{\frac{21 \cdot 110 \cdot (21 + 110 + 1)}{12}} = 159.4$

Our approximate test statistic is

$$z = \frac{1526 - 1386}{159.4} = 0.88$$
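The arithmetic above can be checked with a short Python sketch. The function name `rank_sum_z` is hypothetical, and the calculation uses the uncorrected normal approximation (no adjustment for ties).

```python
import math


def rank_sum_z(w, n_small, n_large):
    """Normal-approximation z for a rank sum w observed in the smaller group."""
    total = n_small + n_large
    mean_w = n_small * (total + 1) / 2                       # expected rank sum under the null
    sd_w = math.sqrt(n_small * n_large * (total + 1) / 12)   # SD under the null, no ties
    return mean_w, sd_w, (w - mean_w) / sd_w


# Values from the slides: W = 1526, group sizes 21 and 110.
mean_w, sd_w, z = rank_sum_z(1526, 21, 110)
print(f"m_W = {mean_w:.0f}, s_W = {sd_w:.1f}, z = {z:.2f}")
# m_W = 1386, s_W = 159.4, z = 0.88
```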

Ties

In this example, we have many ties

As with the Wilcoxon signed rank test, a correction for ties can be made to the variance (see Rosner or another textbook)

This correction is included in Stata and most other statistical packages
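As an illustration of what such a correction looks like, here is a minimal Python sketch of the standard tie adjustment to the variance. This is an assumption about the form of the correction, since the slides defer to Rosner for the details. The function name `tie_corrected_sd` is hypothetical. For each group of t tied values, the term t^3 - t is summed and used to shrink the variance.

```python
import math
from collections import Counter


def tie_corrected_sd(values, n_small):
    """SD of the rank sum in the smaller group, adjusted for tied values."""
    total = len(values)
    n_large = total - n_small
    # Each set of t tied observations contributes t^3 - t (zero when t = 1).
    tie_term = sum(t**3 - t for t in Counter(values).values())
    variance = n_small * n_large / 12 * (
        (total + 1) - tie_term / (total * (total - 1))
    )
    return math.sqrt(variance)


# Seven-patient illustration: two pairs of ties, smaller group of size 3.
edss = [0, 1.5, 1.5, 1, 2, 1, 3]
print(round(tie_corrected_sd(edss, n_small=3), 3))
```

With no ties the tie term is zero and the function reduces to the uncorrected formula. Ties can only reduce the variance, so the corrected z-statistic is at least as large in absolute value as the uncorrected one.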

Hypothesis test

1) H0: median adult = median pediatric

2) Ordinal outcome from two independent samples

3) Wilcoxon rank sum test

4) Test statistic: z = 0.91 (software output, which includes the correction for ties)

5) p-value = 0.36

6) Since the p-value is more than 0.05, we fail to reject the null hypothesis

7) We conclude that there is no significant difference in EDSS between pediatric and adult MS patients
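The step from the z-statistic to the two-sided p-value can be sketched with the Python standard library alone. The function name `two_sided_p` is hypothetical. It uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)) for the standard normal distribution.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


print(round(two_sided_p(0.91), 2))  # 0.36
```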

Comments

Wilcoxon rank sum test is becoming more prominent because computers allow this statistic to be calculated very quickly

There is not a large loss of power in using a Wilcoxon rank sum test compared to a t-test, even when the normality assumption holds.

If normality does not hold or the data are ordinal, the Wilcoxon test is better

Parametric tests and their nonparametric equivalents

Paired t-test – Wilcoxon signed rank

Two sample t-test – Wilcoxon rank sum

ANOVA – Kruskal-Wallis test

– When you have two or more independent samples and the assumptions of ANOVA are not met, you can use the Kruskal-Wallis test.

This is a rank based test.
