Chapter 4 Probability and Probability Distributions
4.14 Evaluating Whether or Not a Population Distribution Is Normal

Solution

TABLE 4.12 Sample and normal quantiles for cholesterol readings

Patient   Cholesterol Reading   $(i - .5)/20$   Normal Quantile
   1             133                .025            -1.960
   2             137                .075            -1.440
   3             148                .125            -1.150
   4             149                .175             -.935
   5             152                .225             -.755
   6             167                .275             -.598
   7             174                .325             -.454
   8             179                .375             -.319
   9             189                .425             -.189
  10             192                .475             -.063
  11             201                .525              .063
  12             209                .575              .189
  13             210                .625              .319
  14             211                .675              .454
  15             218                .725              .598
  16             238                .775              .755
  17             245                .825              .935
  18             248                .875             1.150
  19             253                .925             1.440
  20             257                .975             1.960

A plot of the sample quantiles versus the corresponding normal quantiles is displayed in Figure 4.27. The plotted points generally follow a straight line pattern.

FIGURE 4.27 Normal quantile plot (vertical axis: cholesterol readings; horizontal axis: normal quantiles)

Using Minitab, we can obtain a plot with a fitted line that assists us in assessing how close the plotted points fall relative to a straight line. This plot is displayed in Figure 4.28. The 20 points appear to be relatively close to the fitted line and thus the normal quantile plot would appear to suggest that the normality of the population distribution is plausible.

FIGURE 4.28 Normal quantile plot with fitted line (vertical axis: cholesterol readings; horizontal axis: normal quantiles). Cholesterol = 195.5 + 39.4884 Normal Quantiles; S = 8.30179, R-Sq = 95.9%, R-Sq(adj) = 95.7%

Using a graphical procedure, there is a high degree of subjectivity in making an assessment of how well the plotted points fit a straight line. The scales of the axes on the plot can be increased or decreased, resulting in a change in our assessment of fit. Therefore, a quantitative assessment of the degree to which the plotted points fall near a straight line will be introduced.
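The quantile calculations behind Table 4.12 and the fitted line in Figure 4.28 can be sketched in a few lines of Python. This is only an illustrative sketch, not the Minitab procedure itself: the function names `normal_quantiles` and `least_squares_line` are hypothetical, and the standard library's `statistics.NormalDist` is assumed as the source of the normal quantiles. Under those assumptions the fitted intercept and slope should come out close to the values reported with Figure 4.28.

```python
from statistics import NormalDist, mean

# Cholesterol readings from Table 4.12 (already in increasing order).
readings = [133, 137, 148, 149, 152, 167, 174, 179, 189, 192,
            201, 209, 210, 211, 218, 238, 245, 248, 253, 257]


def normal_quantiles(n, a=0.5, b=0.0):
    """Standard normal quantiles at the positions (i - a)/(n + b), i = 1..n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - a) / (n + b)) for i in range(1, n + 1)]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


y = sorted(readings)                 # sample quantiles y(1), ..., y(n)
q = normal_quantiles(len(y))         # normal quantiles at (i - .5)/n
intercept, slope = least_squares_line(q, y)

# Print the rows of the table: i, y(i), (i - .5)/n, normal quantile.
for i, (yi, qi) in enumerate(zip(y, q), start=1):
    print(f"{i:2d} {yi:4d} {(i - 0.5) / len(y):.3f} {qi:7.3f}")
print(f"fitted line: cholesterol = {intercept:.1f} + {slope:.4f} * normal quantile")
```

Plotting `y` against `q` with any plotting tool gives the normal quantile plot; the defaults `a=0.5, b=0.0` correspond to the $(i - .5)/n$ positions used in the table.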
In Chapter 3, we introduced the sample correlation coefficient r to measure the degree to which two variables satisfied a linear relationship. We will now discuss how this coefficient can be used to assess our certainty that the sample data were selected from a population having a normal distribution. First, we must alter which normal quantiles are associated with the ordered data values. In the above discussion, we used the normal quantiles corresponding to $(i - .5)/n$. In calculating the correlation between the ordered data values and the normal quantiles, a more precise measure is obtained if we associate the $(i - .375)/(n + .25)$ normal quantiles for $i = 1, \ldots, n$ with the n data values $y_{(1)}, \ldots, y_{(n)}$. We then calculate the value of the correlation coefficient, r, from the n pairs of values. To provide a more definitive assessment of our level of certainty that the data were sampled from a normal distribution, we then obtain a value from Table 16 in the Appendix. This value, called a p-value, can then be used along with the following criteria (Table 4.13) to rate the degree of fit of the data to a normal distribution.

TABLE 4.13 Criteria for assessing fit of normal distribution

p-value                 Assessment of Normality
$p < .01$               Very poor fit
$.01 \le p < .05$       Poor fit
$.05 \le p < .10$       Acceptable fit
$.10 \le p < .50$       Good fit
$p \ge .50$             Excellent fit

It is very important that the normal quantile plot accompany the calculation of the correlation because large sample sizes may result in an assessment of a poor fit when the graph would indicate otherwise. The following example will illustrate the calculations involved in obtaining the correlation.

EXAMPLE 4.28
Consider the cholesterol data in Example 4.27. Calculate the correlation coefficient and make a determination of the degree of fit of the data to a normal distribution.
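As a rough sketch of the calculation this example calls for, the Python below pairs the ordered readings with the $(i - .375)/(n + .25)$ normal quantiles and computes their correlation. The helper names `adjusted_quantiles`, `correlation`, and `assess_fit` are hypothetical, the quantiles are assumed to come from the standard library's `statistics.NormalDist`, and the interval endpoints in `assess_fit` are an assumed reading of Table 4.13. The p-value itself is not computed here; it still has to be read from Table 16 in the Appendix.

```python
from math import sqrt
from statistics import NormalDist, mean

# Cholesterol readings from Example 4.27.
readings = [133, 137, 148, 149, 152, 167, 174, 179, 189, 192,
            201, 209, 210, 211, 218, 238, 245, 248, 253, 257]


def adjusted_quantiles(n):
    """Standard normal quantiles at (i - .375)/(n + .25), i = 1..n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def correlation(x, y):
    """Sample correlation coefficient r for the paired values (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def assess_fit(p_value):
    """Rate the fit using the Table 4.13 criteria (endpoints are assumed)."""
    if p_value < 0.01:
        return "Very poor fit"
    if p_value < 0.05:
        return "Poor fit"
    if p_value < 0.10:
        return "Acceptable fit"
    if p_value < 0.50:
        return "Good fit"
    return "Excellent fit"


y = sorted(readings)              # ordered data values y(1), ..., y(n)
q = adjusted_quantiles(len(y))    # matching normal quantiles
r = correlation(q, y)
print(f"n = {len(y)}, r = {r:.3f}")
```

Once a p-value for the computed r and n = 20 has been looked up in Table 16, it can be passed to `assess_fit` to obtain the corresponding rating.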