Psych 522, 10/25/05 p. 1/6

Assessing Normality (based on Kirk, Ch 3)

Because the one-sample t makes a stronger normality assumption (about the population) than the one-sample z, it is important to examine the plausibility of the normality assumption, given the sample that you have obtained. Your sample data will very rarely be exactly normally distributed, but this is okay. Again, it is the plausibility of a normally distributed population that is important. In addition to simply examining the distribution of your data using a histogram, boxplot, or stem-and-leaf plot, SPSS offers a few other tools to help you assess this plausibility:

1) Kolmogorov-Smirnov test: This is an actual statistical test that tells you whether your sample’s deviation from normality is statistically significant.

2) Normal Q-Q Plots: This is a graphical procedure that plots the observed values on the X-axis and the expected values (assuming a normal distribution) on the Y-axis. Note that if the sample is distributed exactly like a normal distribution, the points will fall on a straight line.

[Figure: Q-Q Plot]

3) Normal P-P Plots: These are similar to Q-Q plots, but instead of plotting observed values, they plot cumulative probabilities (values range from 0 to 1), with observed probabilities (cumulative proportion of cases) on the X-axis and expected probabilities given the normal curve on the Y-axis. Again, if the sample were exactly normally distributed, the points would lie on a straight line.

[Figure: P-P Plot]

Try these out using the exam variable in the examanxiety.sav dataset:

1. The Kolmogorov-Smirnov test and Q-Q plots can easily be obtained using Analyze → Descriptive Statistics → Explore. Move exam over to the Dependent List box, click on Plots, and check “Normality plots with tests.” Click Continue, then OK. Your output will look like:

[SPSS output]

Note that the Kolmogorov-Smirnov test is significant, which suggests that the exam scores do not approximate normality (generate a histogram to think about why this might be!).

[Figure: Normal Q-Q Plot of Exam Performance (%). X-axis: Observed Value; Y-axis: Expected Normal.]

The normal Q-Q plot confirms this. The dots do not fall right on the line and, in fact, create an S-like pattern (a systematic departure that can reflect skew or unusually light or heavy tails). If you look at the histogram, you notice some negative skew along with an interesting pattern of “valleys” in the data (i.e., just below 20, 40, and 60). This is probably contributing to the non-normality as well.

[Figure: Histogram of Exam Performance (%). Y-axis: Frequency. Mean = 56.5728, Std. Dev. = 25.94058, N = 103.]

In addition to the normal Q-Q plot, SPSS also gives us a “detrended” Q-Q plot. Here, the Y-axis is the deviation (difference) between what was observed and what was expected. This detrended plot sometimes makes the pattern easier to decipher (note the clear “S” pattern).

[Figure: Detrended Normal Q-Q Plot of Exam Performance (%). X-axis: Observed Value; Y-axis: Dev from Normal.]

2. To produce the P-P plots (again, similar to Q-Q, but with cumulative probabilities), you can use Graphs → P-P… The defaults in the dialogue box are fine (including the “Test Distribution” being Normal), so just move exam over to the Variables box and click OK. The resulting plots will look slightly different, but yield a similar interpretation:

[Figure: Normal P-P Plot of Exam Performance (%). X-axis: Observed Cum Prob; Y-axis: Expected Cum Prob.]

[Figure: Detrended Normal P-P Plot of Exam Performance (%). X-axis: Observed Cum Prob; Y-axis: Deviation from Normal.]

3. How should we proceed? Have we met the normality assumption? There is not a clear-cut answer.
Knowledge of the variable of interest will come into play. If we went strictly by the results of the Kolmogorov-Smirnov test, we would say that we could not consider our data to have come from a normal population. The patterns of the Q-Q and P-P plots would support this. However, is this a big enough deviation to make a t-test invalid? Again, this depends. The t-test is fairly robust to violations of normality, so we might be OK proceeding with the t. But we would certainly want to report the normality results that we obtained. We may also want to try some remedial measures (e.g., transformations). We may also decide that an “assumption freer” test is more appropriate. We will cover both of these topics at the end of the course.

4. Now open a new data file and enter the birthweight data that we used for the hand-calculation example (6.4, 7.0, 7.4, 8.0, and 8.2 pounds). Run the Kolmogorov-Smirnov test, and generate P-P and Q-Q plots. Does it look like the normality assumption has been met?
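
If you want to see what goes into these diagnostics, the following is a minimal Python sketch (not part of the SPSS procedure) that computes a Kolmogorov-Smirnov statistic and the Q-Q and P-P plot coordinates for the five birthweights. Several choices here are assumptions of the sketch rather than anything stated in this handout: the function names are hypothetical, the normal curve uses the sample mean and SD, the p-value comes from a simple simulation (a Lilliefors-style approach, because the parameters are estimated from the data), and the plotting positions use Blom's formula. The numbers may therefore differ somewhat from what SPSS prints.

```python
import random
from statistics import NormalDist, mean, stdev

# Birthweights (pounds) from item 4 of the handout.
weights = [6.4, 7.0, 7.4, 8.0, 8.2]


def ks_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF
    whose mean and SD are estimated from the sample itself."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def lilliefors_p_value(sample, n_sims=2000, seed=0):
    """Simulated p-value (assumption of this sketch): the share of normal
    samples of the same size whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(sample)
    n = len(sample)
    hits = sum(
        ks_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= observed
        for _ in range(n_sims)
    )
    return hits / n_sims


def qq_points(sample):
    """(observed value, expected normal z-score) pairs for a Q-Q plot.
    Plotting positions (r - 3/8) / (n + 1/4) are an assumed convention."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()
    return [(x, std_normal.inv_cdf((r - 0.375) / (n + 0.25)))
            for r, x in enumerate(xs, start=1)]


def pp_points(sample):
    """(observed cumulative proportion, expected cumulative probability)
    pairs for a P-P plot, using the same assumed plotting positions."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    return [((r - 0.375) / (n + 0.25), fitted.cdf(x))
            for r, x in enumerate(xs, start=1)]


if __name__ == "__main__":
    print("D =", round(ks_statistic(weights), 3))
    print("Simulated p =", lilliefors_p_value(weights))
    for observed, expected in qq_points(weights):
        print(f"Q-Q: observed {observed:.1f}, expected z {expected:+.2f}")
    for obs_p, exp_p in pp_points(weights):
        print(f"P-P: observed {obs_p:.2f}, expected {exp_p:.2f}")
```

Plotting each pair of coordinates as a scatterplot gives the same kind of picture as the SPSS plots: the closer the points are to a straight line, the more plausible a normal population. With only five cases, keep in mind that any normality test has very little power to detect a departure.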