Christopher Knapp
University of Akron, Fall 2011
Statistical Data Management
Project #1

Problem Statement

Problem #1: Please research all the non-parametric techniques discussed in lecture. Please report the source, formula, characteristics, and an example of usage for each.

Problem #2: Please use the hospital data that was included in assignment #5. Please make sure you use the data set after you have fixed all the issues. Recall this was a set of observations on 27 people from two separate hospitals.

In SAS:
H0: µLead,M = µLead,F
H0: µNeutro = 22.1
H0: µLead = µHaemo

In SPSS:
H0: µLympho,Hosp A,F = µLympho,Hosp B,F
H0: µLead = 16.2
H0: µLympho = µNeutro

For each hypothesis test perform the appropriate t-test and the appropriate nonparametric test. Discuss which test should be used and justify your choice as well as possible. Discuss everything possible with regards to hypothesis testing (set-up, assumptions, …). When the parametric approach is appropriate, complete the corresponding confidence interval technique. Please type up your report and include all necessary output (SAS and SPSS). Be sure to use the appropriate labels and formats. Email your SAS code. For the students enrolled in 480, only complete the hypotheses that are bolded.

Contents

Review of Non-parametric Techniques
  Single Sample: Sign Test (Description, Example)
  Single Sample: Wilcoxon Signed-rank Test (Description, Example)
  Two Independent Samples: Wilcoxon Rank Sum Test (Description, Example)
  Matched Pairs Tests (Description, Example)
Hypothesis Tests in SAS
  H0: µLead,M = µLead,F
  H0: µNeutro = 22.1
  H0: µLead = µHaemo
Hypothesis Tests in SPSS
  H0: µLympho,Hosp A,F = µLympho,Hosp B,F
  H0: µLead = 16.2
  H0: µLympho = µNeutro
References and Appendices
  References
  Appendix A: SAS Code
  Appendix B: SPSS Code

Project 1, Section 1: Review of Non-parametric Techniques

Single Sample: Sign Test

Goals and Assumptions: This test checks whether the population median equals a given value η0. That is, H0: η = η0. If symmetry of the population can be assumed (a weaker assumption than normality), then the hypothesis H0: µ = η0 is equivalent to H0: η = η0, because symmetry implies η = µ. Its parametric counterpart, the t-test, can be used if the population is normally distributed or if the sample size is larger than 30. The sign test (less powerful than the t-test) should be used if these conditions are not met. The alternative hypotheses: Ha: η ≠ η0, Ha: η > η0, or Ha: η < η0.

Description: The decision of the sign test is based on the binomial distribution. From a population with median η0, there is a 50% chance of randomly observing a value larger than η0. If n random observations are made (a sample of size n), then the number of values larger than η0 (random variable X) follows a binomial distribution with parameters n and 50%. Represent the actual number of values larger than η0 by x, and define k as follows (dependent on Ha):

If Ha: η > η0 then k is the smallest integer between 0 and n such that P[X ≥ k] < α. The null hypothesis is rejected if x is larger than or equal to k.
If Ha: η < η0 then k is the largest integer between 0 and n such that P[X ≤ k] < α. The null hypothesis is rejected if x is smaller than or equal to k.
If Ha: η ≠ η0 then k2 is the smallest integer between 0 and n such that P[X ≥ k2] < α/2, and k1 is the largest integer between 0 and n such that P[X ≤ k1] < α/2. The null hypothesis is rejected if x is smaller than or equal to k1 or larger than or equal to k2.

Note: when an observation has the value η0, it is discarded. This applies even if continuous data is rounded, that is, even if the probability of observing the value η0 is nonzero. A discarded value is counted in neither x nor n.

Example: (Example from Mathematical Statistics, page 475.) The following are measurements of the breaking strength of a certain kind of 2-inch cotton ribbon in pounds:

163, 165, 160, 189, 161, 171, 158, 151, 169, 162, 163, 139, 172, 165, 148, 166, 172, 163, 187, 173

The goal of this example is to test the null hypothesis H0: η = 160 against the alternative Ha: η > 160 with a level of significance of α = .05. The sample mean and median are 164.85 and 164, both higher than η0, so this alternative hypothesis makes sense. One of the 20 values is equal to η0, so n = 19. Furthermore, there are x = 15 values larger than η0. The value k = 14 satisfies the definition for b(19, .5), since P[X ≥ 14] ≈ .032 < .05 while P[X ≥ 13] ≈ .084. Because x is larger than or equal to k, the null hypothesis is rejected, and with a significance level of α = .05 we conclude “The median breaking strength exceeds 160 pounds”.

Single Sample: Wilcoxon Signed-rank Test

Goals and Assumptions: Notice that the sign test only considers the signs of the differences between the observations and η0. Ignoring the magnitudes of the differences is wasteful; therefore, the Wilcoxon Signed-rank Test was created to take these magnitudes into account. The null hypothesis is the same (H0: η = η0, where η represents the population median).
This test does require a symmetric population, so this is equivalent to H0: µ = η0. The alternative hypotheses: Ha: µ ≠ η0, Ha: µ > η0, or Ha: µ < η0.

Description: The algorithm is simple. Consider the sequence S* = (xi - η0) over the observations xi of the sample. Discard any values in the sequence that equal 0, so that n values remain. Rearrange S* in ascending order of absolute value, and call the resulting sequence S = (a1, ..., an). Assign rank i to the i-th term ai. For each maximal run of terms with equal absolute values, reassign to each term the average of the ranks within the run. Let T+ be the sum of the ranks assigned to the positive values ai. For small values of n, the sampling distribution of T+ is read from a special table (which is easily derived with a computer by enumerating all possible assignments of positive and negative signs); however, when n is larger than 14, T+ is approximately normal with mean n(n+1)/4 and variance n(n+1)(2n+1)/24. Note that the distribution of T+ follows from the formula T+ = 1·I1 + ∙∙∙ + n·In, where each Ii is the indicator that ai is positive. Therefore T+ is a linear function of Bernoulli variables, which under H0 are independent with success probability 1/2.

Example: Consider the following dataset and test the hypothesis H0: η = 0 against Ha: η > 0 with α = .05:

9.1, 7.3, 13.1, 0, -2.2, 4.0, 4.9, -3.6, 12.2, -1.3, 12.7, 8.3, -2.5, 1.0, 8.1, 0.1, -1.5, 0

The two observations equal to 0 are discarded, leaving n = 16. Because n is larger than 14, normality can be assumed for T+. The excel spreadsheet to the right describes the decision to reject the null hypothesis and to conclude, for α = .05, that η > 0. Following the steps above, T+ = 111, compared with a null mean of 68 and variance of 374, which gives z ≈ 2.22 and a one-sided p-value of about .013. Also notice that no two nonzero observations had the same absolute value, so the ranks remained “simple”. If, for example, the smallest nonzero observation (0.1) were 1.0 instead, it would tie with the existing 1.0 and both observations would receive rank 1.5.
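
The signed-rank steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the spreadsheet used in the report: the function names (average_ranks, signed_rank_z) are hypothetical, and the sketch assumes the large-sample normal approximation with no continuity correction and no tie correction to the variance.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_z(data, eta0):
    """Return (n, T+, z) using the normal approximation for T+."""
    diffs = [x - eta0 for x in data if x != eta0]  # discard zero differences
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return n, t_plus, (t_plus - mean) / math.sqrt(var)

data = [9.1, 7.3, 13.1, 0, -2.2, 4.0, 4.9, -3.6, 12.2, -1.3,
        12.7, 8.3, -2.5, 1.0, 8.1, 0.1, -1.5, 0]
n, t_plus, z = signed_rank_z(data, 0)
p_upper = 0.5 * math.erfc(z / math.sqrt(2))  # P[Z >= z] for Ha: eta > eta0
print(n, t_plus, round(z, 3), round(p_upper, 4))
```

For samples with n of 14 or fewer, the exact table mentioned in the description should be used instead of the normal tail probability.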

Two Independent Samples: Wilcoxon Rank Sum Test

Goals and Assumptions: This test (also called the Mann-Whitney Test) is used to check if two independent samples come from identically distributed populations. The null hypothesis is H0: the two samples have the same distribution (hence µ1=µ2). Its parametric counterpart is the t-test for two independent samples (pooled for equal variances and Satterthwaite for unequal variances), which is sensitive to departures from normality; unlike the t-test, the rank sum test does not require population normality, and the sample sizes need not be equal. The alternative hypothesis: Ha: One population distribution tends to be higher than the other. Note: this test applies only to two samples; with more than two samples, more complicated tests (like ANOVA) need to be applied.

Method: The idea is to first combine the data from sample A and the data from sample B into one set C = (ci). Sort C in ascending order and assign rank i to the i-th value ci. For each maximal run of terms with equal values, reassign to each term the average of the ranks within the run. Let W1 be the sum of the ranks for values coming from sample A and let n1 represent the number of observations in sample A. Define U1 = W1 - n1(n1+1)/2. U1 will take on values from 0 to n1n2 (where n1 and n2 represent the number of observations in samples A and B, respectively), with a sampling distribution that is symmetric about n1n2/2. When n1 and n2 are both bigger than 8, it is reasonable to assume that the distribution of U1 is approximately normal with E[U1] = n1n2/2 and VAR[U1] = n1n2(n1 + n2 + 1)/12. For smaller values of ni, the distribution of U1 is read from special tables.
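
As a rough sketch of the method above, the following Python computes W1, U1, and the normal-approximation z statistic for the two samples used in the example that follows. The function names are hypothetical, and the sketch assumes both sample sizes exceed 8 and applies no continuity or tie correction.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_u(sample_a, sample_b):
    """Return (W1, U1, z) for sample_a against sample_b."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    w1 = sum(ranks[:n1])               # rank sum of sample A
    u1 = w1 - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    return w1, u1, (u1 - mean) / math.sqrt(var)

sample_1 = [14.9, 11.3, 13.2, 16.6, 17.0, 14.1, 15.4, 13.0, 16.9]
sample_2 = [15.2, 19.8, 14.7, 18.3, 16.2, 21.2, 18.9, 12.2, 15.3, 19.4]
w1, u1, z = rank_sum_u(sample_1, sample_2)
p_lower = 0.5 * math.erfc(-z / math.sqrt(2))  # P[Z <= z] for Ha: mu1 < mu2
print(w1, u1, round(z, 3), round(p_lower, 4))
```

A small U1 relative to n1n2/2 indicates that sample A tends to hold the lower ranks, which is why the lower tail is used for the one-sided alternative here.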

Example: Consider the following two datasets and test the hypothesis H0: Population distributions are the same (µ1=µ2) against Ha: µ1<µ2 with α = .05:

Sample 1: 14.9, 11.3, 13.2, 16.6, 17.0, 14.1, 15.4, 13.0, 16.9
Sample 2: 15.2, 19.8, 14.7, 18.3, 16.2, 21.2, 18.9, 12.2, 15.3, 19.4

The excel worksheet below demonstrates the steps. With n1 = 9 and n2 = 10, the rank sum for Sample 1 is W1 = 69, so U1 = 24, compared with a null mean of 45 and variance of 150; this gives z ≈ -1.71 and a one-sided p-value of about .043. The conclusion is that the null hypothesis is rejected and the mean of the second sample is larger than the mean of the first for α = .05.

Matched Pairs Tests

Description: When a sample involves two dependent variables, V1 and V2 (for example, scores on a pretest and on a posttest for several students), the matched pairs tests can be applied. The null hypothesis is H0: µ1=µ2 (where µi represents the population mean of variable i). For the distribution of V1 - V2, normality is not required, but symmetry is. The method applies the tests from the previous sections: for each observation compute V1 - V2 and apply either of the single sample tests to these differences, where the null hypothesis is H0: µ[V1 - V2] = 0. Notice that the alternative hypothesis Ha: µ[V1 - V2] > 0 is equivalent to Ha: µ1>µ2; Ha: µ[V1 - V2] < 0 is equivalent to Ha: µ1<µ2; Ha: µ[V1 - V2] ≠ 0 is equivalent to Ha: µ1 ≠ µ2.

Example: Consider the following data of pretest scores (out of 20), posttest scores (out of 20), and the difference between the two. We wish to test, at α=.05, the hypothesis H0: µpre=µpost against the alternative hypothesis Ha: µpre > µpost. This alternative hypothesis makes sense because the average grade on the pretest is almost 4 points higher than on the posttest. Notice that the values of the differences are identical to the data in the Wilcoxon Signed-rank Test example. We can apply the same test from the previous example to this example and get the same result: reject the null hypothesis and conclude the alternative.
That is, with a significance level of .05, the pretest average is higher than the posttest average.

Project 1, Section 2: Hypothesis Tests in SAS

H0: µLead,M = µLead,F

Hypothesis: At a significance level of α=.05: H0: µLead,M = µLead,F and Ha: µLead,M ≠ µLead,F

Parametric Test: If the assumptions for the parametric approach are met, the high p-value for the F test implies it is reasonable to assume the variances are equal, so the Pooled test is the appropriate one. Notice that the p-value is .7507, so the null hypothesis cannot be rejected. That is, the data are consistent with µLead,M = µLead,F.

Non-parametric Test: If the assumptions for the parametric approach are not met, the Wilcoxon Rank Sum Test (under some assumptions) can be applied here, since we have two independent samples: lead information for males and lead information for females. The SAS output is displayed to the right. The p-value for the two-sided test is .8845, so we cannot reject the null hypothesis; there is no significant evidence against µLead,M = µLead,F.

Choosing the Appropriate Test: The tests for normality for the males all result in small p-values, so there is substantial evidence that normality does not hold. With a mean of 20.7, median of 19, and standard deviation of 5.5, the Pearson check passes because the difference of 1.7 is less than 2/3 of a standard deviation (about 3.7), but this is not a very good indicator of normality. You can see in the picture to the right that the most extreme values lie on the right side of the graph. The most extreme value is 2.8 standard deviations away from the mean, which is high considering the small sample size. Furthermore, the QQPlot suggests a right-skewed distribution because it is concave up rather than linear. Lastly, skewness and kurtosis are much larger than 0, which is inconsistent with a normal distribution.
There is enough evidence to conclude that this sample did not come from a normal population, so a nonparametric approach should be applied. To apply the Wilcoxon Rank Sum Test, the assumption of equal shape must be met. From the side-by-side histogram to the right, it appears that this is a reasonable assumption. You can see that both shapes are skewed to the right; however, with a small sample it is hard to determine how accurate this assumption truly is. Because small deviations from this assumption do not greatly affect the validity of the test, the nonparametric approach will be utilized.

Conclusion: As described above, the nonparametric test is appropriate, and the p-value for the two-sided test is .8845, so we cannot reject the null hypothesis; the data are consistent with µLead,M = µLead,F.

H0: µNeutro = 22.1

Hypothesis: At a significance level of α=.05: H0: µNeutro = 22.1 and Ha: µNeutro ≠ 22.1

Parametric Test: If normality holds for the population, then the Student’s t test should be used. From the SAS output below, the p-value is .0756 for the two-tailed test. Therefore, at a significance level of α=.05, the null hypothesis cannot be rejected. That is, the data do not contradict µNeutro = 22.1.

Non-parametric Test: Both the sign and the signed rank test produce the same conclusion. For a significance level of .05, there is not enough evidence to reject the null hypothesis (p-values of .23 and .111 for the sign and signed rank test, respectively). That is, the data do not contradict µNeutro = 22.1.

Choosing the Appropriate Test: The tests for normality all result in p-values larger than .05. Also, the difference between mean and median is .12, which is smaller than the Pearson restriction. Furthermore, from the histogram, the data has no extreme values and appears to follow the empirical rule. Lastly, the normality plot appears fairly linear.
Therefore it is likely that this data came from a population that is close to being normally distributed, and the t-test is the appropriate test.

Conclusion: Because the t-test is appropriate, we use its conclusion. Therefore, at a significance level of α=.05, the null hypothesis cannot be rejected. That is, the data do not contradict µNeutro = 22.1.

H0: µLead = µHaemo

Hypothesis: At a significance level of α=.05: H0: µLead = µHaemo and Ha: µLead ≠ µHaemo

Parametric Test: Notice that this is a matched pairs test, so if normality holds for the population of differences lead - haemo, then the Student’s t test should be used. From the SAS output to the right, the p-value is .0001 for the two-tailed test. Therefore, at a high confidence level, the null hypothesis can be rejected. That is, the evidence supports µLead ≠ µHaemo.

Non-parametric Test: Both the sign and the signed rank test produce the same conclusion. With p-values smaller than .0001, there is evidence to reject the null hypothesis. That is, the evidence supports µLead ≠ µHaemo.

Choosing the Appropriate Test: The tests for normality result in p-values between .035 and .083, so at a significance level of α=.05 some of the tests reject normality and others do not. Also, the difference between mean and median is .57, which is smaller than the Pearson restriction, which supports normality. From the histogram, the data contains values that are more extreme than expected under normal conditions and appears to be slightly skewed to the right. Lastly, the normality plot looks close to linear, but has a slight curve, which suggests some skewness. The skewness statistic is 1.03842 with a standard error of .448, so it is more than two standard errors away from 0. This indicates a right-skewed distribution.
The kurtosis is 1.74367 with a standard error of .872, so it is almost exactly two standard errors away from 0 (2 × .872 = 1.744). This suggests that the tails are longer than what is expected from a normal distribution. These checks are inconclusive, because some support normality while others reject it. Notice that the single outlier may be the cause of this deviation from normality, and without more information on the dataset we don’t know if this is a meaningful value or not. This outlier certainly has an effect on the kurtosis and skewness values. Because slight deviations from normality are acceptable, and because the sample size of 27 is close to 30, the appropriate test is the parametric one.

Conclusion: There is enough evidence (at α=.05 and p-value=.0001) to reject the null hypothesis; that is, the evidence supports µLead ≠ µHaemo.

Project 1, Section 3: Hypothesis Tests in SPSS

H0: µLympho,Hosp A,F = µLympho,Hosp B,F

Hypothesis: At a significance level of α=.05: H0: µLympho,Hosp A,F = µLympho,Hosp B,F and Ha: µLympho,Hosp A,F ≠ µLympho,Hosp B,F

Parametric Test: I first built a new dataset with just females, then ran a two-sample t-test comparing Lympho between hospital A and hospital B. The output is displayed below. As you can see, whether or not equal variances are assumed, the 95% confidence interval for the difference includes 0, so there is not enough evidence to reject H0. That is, the data are consistent with µLympho,Hosp A,F = µLympho,Hosp B,F.

Non-parametric Test: The table to the right displays the result of two nonparametric tests. The first is the Mann-Whitney U Test, which was described in section one. The second was the default test for SPSS given the dataset. The Mann-Whitney U Test fails to reject the null hypothesis, with a large p-value (.902).

Choosing the Appropriate Test: Hospital A is fairly normal, as you can tell from the four displays on the right.
The skewness and kurtosis values are fairly small, which is consistent with normality. Furthermore, the QQplot is close to linear and the boxplot appears to be closer to symmetric than skewed. SPSS offers two tests for normality, Kolmogorov-Smirnov and Shapiro-Wilk, and neither rejects the null hypothesis of normality. Therefore we can assume hospital A has a normally distributed population for Lympho and Female.

Hospital B is also normal, as you can tell from the four displays on the left. The skewness and kurtosis values are fairly small, which is consistent with normality. Furthermore, the QQplot is close to linear and the boxplot appears to be closer to symmetric than skewed. Again, neither the Kolmogorov-Smirnov test nor the Shapiro-Wilk test rejects the null hypothesis of normality. Therefore we can assume hospital B has a normally distributed population for Lympho and Female.

Therefore, the appropriate test is the t-test.

Conclusion: Because the t-test is the appropriate test, its result is used: the 95% confidence interval for the difference includes 0, so the null hypothesis cannot be rejected at α=.05. The Mann-Whitney U Test agrees, with a large p-value (.902). The data are therefore consistent with µLympho,Hosp A,F = µLympho,Hosp B,F.

H0: µLead = 16.2

Hypothesis: At a significance level of α=.05: H0: µLead = 16.2

Parametric Test: From the one sample t-test on the right, the value 16.2 does not fall within the 95% confidence interval. Therefore the null hypothesis is rejected.

Non-parametric Test: Applying the one-sample Wilcoxon Signed Rank Test, the table to the right displays the decision to reject the null hypothesis at a significance level of α=.05. If symmetry cannot be assumed, then the basic sign test can be applied by first computing the variable “lead-16.2” and observing that 24 of the 27 values were positive. Notice that P[X ≥ 24] = 3304/2^27 ≈ .0000246 for the binomial variable X~b(27,50%), which is far below α/2 = .025, so the two-sided test at α=.05 results in a rejection of H0.
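
The binomial tail probability used by the sign test can be evaluated exactly from its definition. The sketch below is an illustration under the stated setup (24 positive differences out of 27, success probability 50%); the helper names are hypothetical. As a cross-check, it also reproduces the counts and the critical value of the cotton ribbon example from Section 1.

```python
from math import comb

def upper_tail(n, k):
    """P[X >= k] for X ~ Binomial(n, 0.5), computed exactly."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

def sign_test_counts(data, eta0):
    """Return (n, x): observations kept after discarding ties, and how many exceed eta0."""
    kept = [v for v in data if v != eta0]
    return len(kept), sum(1 for v in kept if v > eta0)

def critical_value_upper(n, alpha):
    """Smallest k in 0..n with P[X >= k] < alpha (for Ha: eta > eta0), or None."""
    for k in range(n + 1):
        if upper_tail(n, k) < alpha:
            return k
    return None

# Lead data: 24 of the 27 differences (lead - 16.2) are positive.
p_one_sided = upper_tail(27, 24)
p_two_sided = min(1.0, 2 * p_one_sided)
print(p_one_sided, p_two_sided)

# Cotton ribbon example from Section 1: H0: eta = 160, Ha: eta > 160.
ribbon = [163, 165, 160, 189, 161, 171, 158, 151, 169, 162,
          163, 139, 172, 165, 148, 166, 172, 163, 187, 173]
n, x = sign_test_counts(ribbon, 160)
k = critical_value_upper(n, 0.05)
print(n, x, k, x >= k)
```

Doubling the one-sided tail is one common convention for a two-sided sign test p-value; it is an assumption here and may differ slightly from what a given statistics package reports.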

The differences in SPSS are sorted and displayed to the right. The alternative hypothesis for this test is Ha: µLead ≠ 16.2.

Choosing the Appropriate Test: All of the evidence points toward a rejection of normality. Skewness and kurtosis are both more than 3 standard errors away from 0, the QQplot is concave down, the boxplot looks skewed with an outlier, and the two tests for normality reject H0: Normally distributed. Therefore the nonparametric test should be used. More specifically, the sign test should be applied because the distribution is skewed, so the symmetry required by the signed rank test cannot be assumed.

Conclusion: The conclusion of the sign test is to reject the null hypothesis at a significance level of α=.05. That is, evidence shows µLead ≠ 16.2.

H0: µLympho = µNeutro

Hypothesis: At a significance level of α=.05: H0: µLympho = µNeutro and Ha: µLympho ≠ µNeutro

Parametric Test: The 95% confidence interval, displayed to the right, does not include 0, so the null hypothesis is rejected. That is, µLympho ≠ µNeutro. It should be noted that the p-value (.049) was only slightly smaller than .05.

Non-parametric Test: The conclusion of the Wilcoxon Signed Rank Test is that the null hypothesis should be retained. That is, at a significance level of α=.05 this test does not find a significant difference between the means.

Choosing the Appropriate Test: The statistics on the right support normality of the difference between the two variables. The Kolmogorov-Smirnov test rejects normality; however, all other evidence points to normality, including the Shapiro-Wilk test. Furthermore, the QQPlot is fairly linear and the boxplot contains no outliers. The skewness and kurtosis are also both within one standard error of 0. If a deviation from normality exists, it is likely small. A small deviation from normality, combined with a sample size close to 30, indicates the parametric test is the appropriate one.

Conclusion: At α=.05, the null hypothesis is rejected. That is, µLympho ≠ µNeutro.

Project 1, Section 4: References and Appendices

References

Freund and Walpole. Mathematical Statistics, 3rd ed. Prentice-Hall, 1980.
Fridline, Mark. Class Lecture. Statistical Data Management. University of Akron, Akron, OH. Fall 2011.
Hollander, Myles, and Douglas A. Wolfe. Nonparametric Statistical Methods, 2nd ed. Wiley-Interscience, 1999. <http://books.google.com/books/feeds/volumes?q=0471190454>.

Appendix A: SAS Code

/* IMPORT DATA INTO NEW LIBRARY */
libname mylib '\\uanet.edu\ZIPSpace\C\crk32\Classes\F11 Statistical Data Management\Project 1\mylib\';

proc format library=mylib;
  value $gender 'M'='Male' 'F'='Female';
run;

options fmtsearch=(mylib);

data mylib.dataset;
  infile '\\uanet.edu\ZIPSpace\C\crk32\Classes\F11 Statistical Data Management\Project 1\dataset.txt' dsd delimiter='09'x;
  input hospital$ gender$ haemo pcv wbc lympho neutro lead;
  hospital = upcase(hospital);
  gender = upcase(gender);
  difference = lead - haemo;  /* paired difference used in problem3 */
  format gender $gender.;
  label hospital = 'Hospital Data was Collected From'
        gender = 'Gender of Person'
        haemo = 'Iron in Blood (Hemoglobin)'
        pcv = 'Packed Cell Volume'
        wbc = 'White Blood Cell Count'
        lympho = 'Number of Lymphocytes'
        neutro = 'Neutrophil'
        lead = 'Serum Lead Concentration';
run;

/* problem1 - GET DESCRIPTIVE STATS FOR LEAD BY GENDER */
proc means data=mylib.dataset mean;
  var lead;
  class gender;
run;

/* problem1 - NON PARAMETRIC TEST (WILCOXON RANK SUM) */
proc npar1way data=mylib.dataset wilcoxon;
  class gender;
  var lead;
run;

/* problem2 - T, SIGN AND SIGNED RANK TESTS (MU0=22.1), NORMALITY TESTS AND PLOTS */
proc univariate data=mylib.dataset loccount mu0 = 22.1 normal;
  var neutro;
  histogram neutro / midpoints =10 to 45 by 5 normal;
  qqplot neutro;
run;

/* problem3 - T, SIGN AND SIGNED RANK TESTS ON LEAD - HAEMO (MU0=0), NORMALITY TESTS AND PLOTS */
proc univariate data=mylib.dataset loccount mu0 = 0 normal;
  var difference;
  histogram difference / midpoints =-15 to 25 by 2.5 normal;
  qqplot difference;
run;

Appendix B: SPSS Code

T-TEST GROUPS=Hospital('A' 'B')
  /MISSING=ANALYSIS
  /VARIABLES=Lympho
  /CRITERIA=CI(.95).

*Nonparametric Tests: Independent Samples.
NPTESTS
  /INDEPENDENT TEST (Lympho) GROUP (Hospital) MANN_WHITNEY MEDIAN(TESTVALUE=SAMPLE COMPARE=PAIRWISE)
  /MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE
  /CRITERIA ALPHA=0.05 CILEVEL=95.

DESCRIPTIVES VARIABLES=Neutro
  /STATISTICS=KURTOSIS SKEWNESS.

EXAMINE VARIABLES=Lympho
  /PLOT BOXPLOT STEMLEAF NPPLOT
  /COMPARE GROUPS
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

COMPUTE difference=Lympho-Neutro.
EXECUTE.