Statistical Methods II
Session 7
Non Parametric Testing – The Sign Test

STAT 3130 – Non Parametric Testing

In the previous course we learned a lot about one-sample, two-sample and paired t-tests. All of these tests had some basic assumptions:

1. The individual samples were approximately normal.
2. The individual samples came from populations with approximately equal variance.
3. We preferred that the individual samples were of a size greater than 30.

In some cases, our data will violate these assumptions. Specifically, we may find ourselves with small samples that are not normal and contain extreme observations. If the samples are still independent, with approximately equal variance, we use non-parametric tests in lieu of t-tests.

Non-parametric tests are also referred to as “distribution free”, because they do not require the data to follow a particular distribution, such as the normal. You should know that non-parametric tests almost always have lower Power than the typical parametric tests when the parametric assumptions hold, and therefore should always be the “Plan B” option.

Non-parametric tests are typically focused on the median (rather than on the mean) and involve fairly straightforward procedures like ordering and counting.

Test | Parametric | Non-parametric
One quantitative response variable | One-sample t-test | Sign Test
One quantitative response variable, two values from paired samples | Paired-sample t-test | Wilcoxon Signed Rank Test
One quantitative response variable, one qualitative independent variable with two groups | Two independent sample t-test | Wilcoxon Rank Sum or Mann-Whitney Test
One quantitative response variable, one qualitative independent variable with three or more groups | ANOVA | Kruskal-Wallis

STAT 3130 - Sign Test

There are two non-parametric tests which can be used to test the center of a single population: the Sign Test and the Wilcoxon Signed Rank Test (covered in the next session).
Like the parametric version (a one-sample t-test), the Sign Test can be used with one-sided or two-sided hypotheses. However, unlike the parametric version, the Sign Test tests the median rather than the mean. We will use the symbol η to represent the median.

Let's take a look at an example. The following values are the ages of students in a Ph.D. program. Determine if the median is less than 40.

42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29

Let's start by trying a t-test…and determining the Power of the test.

Now let's develop the null and alternative hypotheses using a Sign Test:

H0: η ≥ 40
H1: η < 40

To execute the Sign Test:

1. Count the number of values in the dataset greater than η0 (in this case, 40). This test statistic is referred to as S+.
2. Count the number of values in the dataset less than η0. This test statistic is referred to as S-.
3. If S+ is greater than S-, then we fail to reject the null hypothesis, because the data give no evidence that the median is below η0.

Given the present dataset, S+ = 2 and S- = 9 (note that 2 observations are exactly 40; they are not counted). SAS will provide you with (S+ - S-)/2…or (2-9)/2 = -3.5. SAS will also provide you with a p-value associated with this outcome that is generated using the Binomial Distribution (testing 2/11 or .1818 versus .5, with n = 11 because the two ties are dropped).

Let's go through a slightly more sophisticated version of this analysis (and S2.4) using SAS…
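As a cross-check on the hand calculation above, here is a minimal sketch of the same counts and binomial p-value in Python (not SAS). The function name `sign_test_lower` and the dictionary keys are made up for illustration. The sketch assumes the lower-tailed alternative H1: η < 40 and drops values equal to η0, as in the example. It also computes a two-sided p-value for comparison, since software often reports that form.

```python
from math import comb


def sign_test_lower(values, eta0):
    """Sign test for H1: median < eta0 (hypothetical helper, for illustration)."""
    s_plus = sum(1 for v in values if v > eta0)   # values above eta0
    s_minus = sum(1 for v in values if v < eta0)  # values below eta0
    n = s_plus + s_minus                          # ties with eta0 are dropped

    # Under H0 (at the boundary), S+ follows Binomial(n, 0.5).
    # Lower-tailed p-value: P(X <= S+).
    p_lower = sum(comb(n, k) for k in range(s_plus + 1)) / 2 ** n

    # Two-sided p-value: double the smaller tail, capped at 1.
    k_min = min(s_plus, s_minus)
    p_two_sided = min(1.0, 2 * sum(comb(n, k) for k in range(k_min + 1)) / 2 ** n)

    return {
        "s_plus": s_plus,
        "s_minus": s_minus,
        "n": n,
        "m": (s_plus - s_minus) / 2,  # the (S+ - S-)/2 statistic from the text
        "p_lower": p_lower,
        "p_two_sided": p_two_sided,
    }


ages = [42, 22, 24, 25, 32, 34, 38, 40, 40, 44, 25, 26, 29]
result = sign_test_lower(ages, 40)
for name, value in result.items():
    print(f"{name}: {value}")
```

With these ages the sketch gives S+ = 2, S- = 9 and (S+ - S-)/2 = -3.5, matching the values above. The lower-tailed p-value is 67/2048, about 0.033, and the two-sided value is about 0.065, so the conclusion at the 0.05 level depends on which form is used.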