Introduction to Non-Parametric Testing

Statistical Methods II
Session 7
Non Parametric Testing –
The Sign Test
STAT 3130 – Non Parametric Testing
In the previous course we learned a lot about one-sample,
two-sample and paired t-tests. All of these tests had some
basic assumptions:
1. the individual samples were approximately normal.
2. the individual samples came from populations with
approximately equal variance.
3. we preferred that the individual samples were of a size
greater than 30.
In some cases, our data will violate these assumptions.
Specifically, we may find ourselves with small samples that
are not normal and contain extreme observations. If the
samples are still independent, with approximately equal variance,
we use non-parametric tests in lieu of t-tests.
Non-parametric tests are also referred to as “distribution
free”, because they do not require the data to follow a
particular distribution (such as the normal).
You should know that when the parametric assumptions hold,
non-parametric tests almost always have lower Power than the
corresponding parametric tests, and therefore should be the
“Plan B” option.
Non-parametric tests are typically focused on the median
(rather than on the mean) and involve fairly straightforward
procedures like ordering and counting.
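As a rough illustration of why the median is the preferred target when extreme observations are present, the sketch below compares how the mean and the median react to one added value. The ages are borrowed from this session's Ph.D. example; the extra value of 95 is a hypothetical observation invented only for this illustration.

```python
from statistics import mean, median

# Ages borrowed from this session's Ph.D. example.
ages = [42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29]

# Hypothetical extreme observation, added only for illustration.
ages_with_outlier = ages + [95]

# How far does each measure of center move when the outlier is added?
mean_shift = mean(ages_with_outlier) - mean(ages)
median_shift = median(ages_with_outlier) - median(ages)

print(f"mean shifts by {mean_shift:.2f}, median shifts by {median_shift:.2f}")
```

The median depends only on the ordering of the values, so a single extreme observation moves it far less than it moves the mean.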
Data                                     | Parametric Test       | Non-Parametric Test
-----------------------------------------+-----------------------+-----------------------
One quantitative response variable       | One-sample t-test     | Sign Test
One quantitative response variable,      | Paired-sample t-test  | Wilcoxon Signed Rank
two values from paired samples           |                       | Test
One quantitative response variable,      | Two independent       | Wilcoxon Rank Sum or
one qualitative independent variable     | sample t-test         | Mann-Whitney Test
with two groups                          |                       |
One quantitative response variable,      | ANOVA                 | Kruskal-Wallis
one qualitative independent variable     |                       |
with three or more groups                |                       |
STAT 3130 - Sign Test
There are two non-parametric tests which can be used to
test the center of a single population – The Sign Test and the
Wilcoxon Signed Rank Test (covered in the next session).
Like the parametric version (a one-sample t-test), the Sign
Test can be used with one sided or two sided hypotheses.
However, unlike the parametric version, the Sign Test will
execute a test around the median rather than the mean.
We will use the symbol η to represent the median.
Let's take a look at an example.
The following values are the ages of students in a Ph.D.
program. Determine if the median is less than 40.
42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29
Let's start by trying a t-test… and determining the Power of
the test.
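A minimal sketch of the one-sample t statistic for these ages, using only the Python standard library. It assumes the hypothesized value is 40, matching the example. It does not compute the p-value or the Power, which need the t distribution (and, for Power, an assumed effect size) from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

ages = [42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29]
mu0 = 40  # hypothesized value, taken from the example

n = len(ages)
std_error = stdev(ages) / sqrt(n)      # sample standard deviation / sqrt(n)
t_stat = (mean(ages) - mu0) / std_error
df = n - 1                             # degrees of freedom for the t-test

print(f"n = {n}, mean = {mean(ages):.2f}, t = {t_stat:.2f}, df = {df}")
```

Note that this statistic tests the mean, not the median, and relies on the normality assumption that a sample this small may not satisfy.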
Now let's develop the null and alternative hypotheses using
a Sign Test:
H0: η ≥ 40
H1: η < 40
To execute the Sign Test:
1. Count the number of values in the dataset greater than η0
(in this case, 40). This test statistic is referred to as S+.
2. Count the number of values in the dataset less than η0.
This test statistic is referred to as S-.
3. If S+ is greater than S-, then we fail to reject the null hypothesis,
because a majority of values above η0 gives no evidence that the
median is less than η0.
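The counting steps above can be sketched in a few lines of Python. The function name `sign_counts` is a hypothetical helper introduced for this illustration. Values equal to η0 are tallied separately so they can be left out of the test.

```python
def sign_counts(values, eta0):
    """Return (S+, S-, ties) for a sign test against the hypothesized median eta0."""
    s_plus = sum(1 for v in values if v > eta0)   # values above eta0
    s_minus = sum(1 for v in values if v < eta0)  # values below eta0
    ties = sum(1 for v in values if v == eta0)    # values equal to eta0 (not counted)
    return s_plus, s_minus, ties

ages = [42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29]
s_plus, s_minus, ties = sign_counts(ages, 40)

print(f"S+ = {s_plus}, S- = {s_minus}, ties = {ties}")
```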
Given the present dataset, S+ = 2 and S- = 9 (note that 2
observations are exactly 40; they are not counted, which leaves
n = 11).
SAS will provide you with (S+ - S-)/2, or (2 - 9)/2 = -3.5. SAS will
also provide you with a p-value associated with this
outcome that is generated using the Binomial Distribution
(testing 2/11 or .1818 versus .5).
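The binomial calculation behind the p-value can be sketched as follows. Under the null hypothesis each non-tied value is equally likely to fall above or below 40, so S+ follows a Binomial(11, 0.5) distribution. The helper `binom_cdf` is a hypothetical name introduced here. Whether SAS reports the one-sided or the doubled (two-sided) value is an assumption to check against the SAS output, so both are computed.

```python
from math import comb

s_plus, s_minus = 2, 9
n = s_plus + s_minus            # ties at 40 are excluded, so n = 11

m_stat = (s_plus - s_minus) / 2  # the (S+ - S-)/2 statistic described above

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# One-sided p-value for H1: median < 40 (few values above 40 is the evidence).
p_lower = binom_cdf(s_plus, n)

# Two-sided p-value: double the smaller tail, capped at 1.
p_two_sided = min(1.0, 2 * binom_cdf(min(s_plus, s_minus), n))

print(f"M = {m_stat}, one-sided p = {p_lower:.4f}, two-sided p = {p_two_sided:.4f}")
```

For the stated hypothesis H1: η < 40, the one-sided value is the relevant one. A two-sided p-value would need to be halved, provided the data fall on the side the alternative predicts.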
Let's go through a slightly more sophisticated version of this
analysis (and S2.4) using SAS…