Psych 522, 10/25/05
p. 1/6
Assessing Normality (based on Kirk, Ch 3)
Because the one-sample t has a stronger normality assumption (of the population)
than the one-sample z, it is important to examine the plausibility of the
normality assumption, given the sample that you have obtained.
It will be a very rare case that your sample data are actually normally distributed,
but this is okay. Again, it is the plausibility of a normally distributed
population that is important.
In addition to simply examining the distribution of your data using a histogram,
boxplot, or stem-and-leaf plot, SPSS offers a few other tools to help you
assess this plausibility:
1) Kolmogorov-Smirnov test: This is an actual statistical test that tells you
whether your sample’s deviation from normality is statistically significant.
2) Normal Q-Q Plots: This is a graphical procedure that plots the observed
values on the X-axis and the expected values (assuming a normal distribution)
on the Y-axis. If the sample were distributed exactly like a normal
distribution, the points would fall on a straight line.
Q-Q Plot
3) Normal P-P Plots: These are similar to Q-Q plots, but instead of plotting
observed values, these plot cumulative probabilities (values range from 0 to
1), with observed probabilities (cumulative proportion of cases) on the X-axis
and expected probabilities given the normal curve on the Y-axis. Again, if the
sample were exactly normally distributed, the points would lie on a straight
line:
P-P Plot
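The coordinates behind both plots can also be computed by hand. The Python sketch below is an illustration only, not SPSS's own code. It assumes Blom's plotting positions, (r - 3/8)/(n + 1/4), for the cumulative proportion of the case with rank r, it does not average ranks for tied values, and it uses made-up scores rather than the examanxiety.sav data. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def plotting_positions(n):
    """Blom's plotting positions (r - 3/8) / (n + 1/4) for ranks r = 1..n (an assumption)."""
    return [(r - 0.375) / (n + 0.25) for r in range(1, n + 1)]


def qq_points(sample):
    """Q-Q plot: (observed value, expected standard-normal quantile) pairs."""
    xs = sorted(sample)
    std_normal = NormalDist()  # mean 0, SD 1
    return [(x, std_normal.inv_cdf(p))
            for x, p in zip(xs, plotting_positions(len(xs)))]


def pp_points(sample):
    """P-P plot: (observed cumulative proportion, expected cumulative probability) pairs."""
    xs = sorted(sample)
    fitted = NormalDist(mean(xs), stdev(xs))  # normal curve with the sample mean and SD
    return [(p, fitted.cdf(x))
            for x, p in zip(xs, plotting_positions(len(xs)))]


scores = [35, 48, 52, 57, 61, 66, 70, 83]  # hypothetical scores, for illustration only

print("Q-Q: observed value vs. expected normal quantile")
for observed, expected in qq_points(scores):
    print(f"{observed:6.1f}  {expected:6.3f}")

print("P-P: observed vs. expected cumulative probability")
for observed, expected in pp_points(scores):
    print(f"{observed:6.3f}  {expected:6.3f}")
```

In both cases, plotting the first member of each pair on the X-axis and the second on the Y-axis gives the layout described above, and points close to a straight line are consistent with a normal population.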
Try these out using the exam variable in the examanxiety.sav dataset:
1. The Kolmogorov-Smirnov test and Q-Q plots can easily be obtained using
Analyze → Descriptive Statistics → Explore. Move exam over to the Dependent
List box, click on Plots, and check “Normality plots with tests.” Click
Continue, then OK.
Your output will look like:
Note that the Kolmogorov-Smirnov test is significant, which suggests that the
exam scores do not approximate normality (generate a histogram to think
about why this might be!)
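To see what the test is measuring, it can help to compute the statistic by hand. The Kolmogorov-Smirnov statistic D is the largest vertical gap between the sample's cumulative proportions and the cumulative probabilities of a normal curve with the sample's mean and standard deviation. The Python sketch below computes only D, using made-up scores rather than the exam data. It does not reproduce the significance value: the SPSS output notes a Lilliefors significance correction, which depends on special tables and is not attempted here. The function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # parameters estimated from the sample
    d = 0.0
    for i, x in enumerate(xs, start=1):
        expected = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - expected, expected - (i - 1) / n)
    return d


scores = [35, 48, 52, 57, 61, 66, 70, 83]  # hypothetical scores, for illustration only
print(f"D = {ks_statistic(scores):.3f}")
```

A larger D means a larger departure from the fitted normal curve; whether it is statistically significant depends on the sample size.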
[Figure: Normal Q-Q Plot of Exam Performance (%). X-axis: Observed Value; Y-axis: Expected Normal.]
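The normal Q-Q plot confirms this. The dots do not fall right on the line and, in
fact, create an S-like pattern (a systematic departure from normality; an S shape
typically reflects tails that are lighter or heavier than a normal distribution's,
whereas pure skew tends to bend the points into a single arc). If you look at the
histogram, you notice some negative skew along with an interesting pattern of
“valleys” in the data (i.e., just below 20, 40, 60). These are probably
contributing to the non-normality as well.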
[Figure: Histogram of Exam Performance (%). Y-axis: Frequency. Mean = 56.5728, Std. Dev. = 25.94058, N = 103.]
In addition to the normal Q-Q plot, SPSS also gives us a “detrended” Q-Q plot
(see below). Here, the Y-axis is the deviation (difference) between what was
observed and what was expected. This detrended plot sometimes makes the
pattern easier to decipher (note the clear “S” pattern).
[Figure: Detrended Normal Q-Q Plot of Exam Performance (%). X-axis: Observed Value; Y-axis: Dev from Normal.]
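The detrended values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of SPSS's computation: it assumes each deviation is the standardized observed value (z score) minus the expected standard-normal quantile, with Blom's plotting positions, (r - 3/8)/(n + 1/4), supplying the expected quantiles. The scores are made up and the function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def detrended_qq(sample):
    """(observed value, deviation from normal) pairs for a detrended Q-Q plot."""
    xs = sorted(sample)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    std_normal = NormalDist()
    points = []
    for r, x in enumerate(xs, start=1):
        # Expected z score for rank r (Blom's plotting position; an assumption).
        expected_z = std_normal.inv_cdf((r - 0.375) / (n + 0.25))
        observed_z = (x - m) / s
        points.append((x, observed_z - expected_z))
    return points


scores = [35, 48, 52, 57, 61, 66, 70, 83]  # hypothetical scores, for illustration only
for observed, deviation in detrended_qq(scores):
    print(f"{observed:6.1f}  {deviation:+.3f}")
```

If the data were consistent with normality, the deviations would scatter around zero with no pattern; a systematic run of positive and then negative deviations is what shows up as a curve in the detrended plot.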
2. To plot the P-P plots (again, similar to Q-Q, but with cumulative probabilities),
you can use Graphs → P-P… The defaults in the dialogue box are fine
(including the “Test Distribution” being Normal), so just move exam over to
the Variables box, and click OK:
The resulting plots will look slightly different, but yield a similar interpretation:
[Figure: Normal P-P Plot of Exam Performance (%). X-axis: Observed Cum Prob; Y-axis: Expected Cum Prob.]
[Figure: Detrended Normal P-P Plot of Exam Performance (%). X-axis: Observed Cum Prob; Y-axis: Deviation from Normal.]
3. How should we proceed? Have we met the normality assumption? There is
not a clear-cut answer. Knowledge of the variable of interest will come into
play. If we went strictly by the results of the Kolmogorov-Smirnov test, we
would say that we could not consider our data to have come from a normal
population. The patterns of the Q-Q and P-P plots would support this.
However, is this a big enough deviation to make a t-test invalid? Again, this
depends. The t-test is fairly robust to violations of normality, so we might be
ok in proceeding with the t. But we would certainly want to report on the
normality data that we collected. We may also want to try some remedial
measures (e.g., transformations). We may also decide that an “assumption
freer” test is more appropriate. We will cover both of these topics at the end
of the course.
4. Now open a new data file and enter the birthweight data that we used for the
hand-calculation example (6.4, 7.0, 7.4, 8.0, and 8.2 pounds). Calculate the
Kolmogorov-Smirnov test, and generate P-P and Q-Q plots. Does it look like
the normality assumption has been met?
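As a cross-check on the SPSS output for this exercise, the Python sketch below tabulates, for each birthweight, the observed cumulative proportion, the cumulative probability expected under a normal curve with the sample's mean and standard deviation, and the gap between them. The largest gap is the Kolmogorov-Smirnov D. This is a hand-calculation aid under those assumptions; it does not compute a significance value, and the variable names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

weights = [6.4, 7.0, 7.4, 8.0, 8.2]  # birthweights in pounds, from the exercise

xs = sorted(weights)
n = len(xs)
fitted = NormalDist(mean(xs), stdev(xs))  # normal curve with the sample mean and SD

rows = []
for i, x in enumerate(xs, start=1):
    expected = fitted.cdf(x)  # expected cumulative probability at x
    # The observed cumulative proportion steps from (i - 1)/n to i/n at x.
    gap = max(i / n - expected, expected - (i - 1) / n)
    rows.append((x, i / n, expected, gap))

d = max(gap for _, _, _, gap in rows)  # Kolmogorov-Smirnov D

print("weight  cum.prop  expected    gap")
for x, prop, expected, gap in rows:
    print(f"{x:6.1f}  {prop:8.3f}  {expected:8.3f}  {gap:.3f}")
print(f"mean = {fitted.mean:.2f}, SD = {fitted.stdev:.3f}, D = {d:.3f}")
```

Keep in mind that with only five cases the test has very little power, so a nonsignificant result is weak evidence that the population is normal.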