Normal Probability Plots

The following example shows how to construct a normal probability plot (also called a normal quantile-quantile plot) to determine whether it is reasonable to believe that a data set was sampled from a normal distribution.

A soft drink bottler is studying the internal pressure strength of 1-liter glass bottles. A random sample of 16 bottles is tested, and the pressure strengths are obtained. The random variable X in this case is the internal pressure strength of a randomly selected bottle. We want to decide whether it is reasonable to conclude that X is normally distributed.

The data values are listed in the fourth column of the table below, which gives the Excel development of the normal quantile-quantile plot. The values in the second column are the cumulative relative frequencies, found by calculating $(i - 0.5)/n$ for $i = 1, \dots, n$ (here $n = 16$). The third column lists the normal quantiles, found using

$$z_i = \mathrm{NORMINV}\left(\frac{i - 0.5}{n},\ 0,\ 1\right).$$

Each of these numbers is the cut-off point on the standard normal scale corresponding to a left-hand area of $(i - 0.5)/n$. The NORMINV function of Excel has three arguments: the left-hand tail probability (here given in column 2), the mean of the particular normal distribution (here taken to be 0), and the standard deviation of the normal distribution (here taken to be 1).

The fourth column of the table below lists the ordered values of the data. The values in the last column are the standardized ordered values of the data, found by subtracting the sample mean from each data value, then dividing the result by the sample standard deviation. These standardized scores will be plotted against the expected standard normal quantiles in the third column.
 i    (i - 0.5)/n    z_i = NORMINV((i - 0.5)/n, 0, 1)    x_(i)     (x_(i) - x̄)/s
 1    0.03125        -1.862731                           188.12    -1.54904
 2    0.09375        -1.318011                           193.71    -1.06597
 3    0.15625        -1.009990                           193.73    -1.06424
 4    0.21875        -0.776422                           195.45    -0.9156
 5    0.28125        -0.579132                           200.81    -0.4524
 6    0.34375        -0.402250                           201.63    -0.38154
 7    0.40625        -0.237202                           202.20    -0.33228
 8    0.46875        -0.078412                           202.21    -0.33141
 9    0.53125         0.078412                           203.62    -0.20956
10    0.59375         0.237202                           204.55    -0.12919
11    0.65625         0.402250                           208.15     0.18191
12    0.71875         0.579132                           211.14     0.440299
13    0.78125         0.776422                           219.54     1.16621
14    0.84375         1.009990                           221.31     1.31917
15    0.90625         1.318011                           224.39     1.585337
16    0.96875         1.862731                           226.16     1.738297

To construct the graph, do the following:

1) Insert two columns before column 5. Copy column 3 into each of these two columns.
2) Highlight the last three columns in the table (the two columns just inserted, plus the column of standardized order statistics).
3) Go to Insert, Chart, and choose Scatterplot. Follow the prompts in the dialog boxes. You will need to create a title for the graph (e.g., Normal Q-Q Plot for Bottle Strength Data). You will also need to label the axes. The vertical axis should be labeled, “Standardized Order Statistics of the Data.” The horizontal axis should be labeled, “Standard Normal Quantiles.”
4) The resulting graph is shown below. It appears that the data points do not differ substantially from the straight line, so it is reasonable to conclude that the data were sampled from a normal distribution.

[Figure: Normal Probability Plot, Ch. 3, Exercise 3-63. Vertical axis: Standardized Order Statistics. Horizontal axis: Standard Normal Quantiles.]

Note: The points indicated by squares are obtained from the data by plotting the standardized data values against the calculated standard normal quantiles. The points indicated by diamonds are obtained by plotting the standard normal quantiles against themselves.
The closer the squares are to the diamonds, overall, the more plausible it is that the data were sampled from a normally distributed population.
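
For readers who want to check the spreadsheet columns outside Excel, the calculation can be sketched in Python. This is only an illustration under stated assumptions: the standard library's `statistics.NormalDist.inv_cdf` stands in for Excel's NORMINV with mean 0 and standard deviation 1, the sample standard deviation uses the n - 1 divisor, and the names `strengths` and `qq_columns` are hypothetical, not part of the original exercise. The last lines report the largest vertical distance between a square and its diamond as one rough way to see how close the two sets of points are.

```python
from statistics import NormalDist, mean, stdev

# Pressure strengths copied from the fourth column of the table.
strengths = [188.12, 193.71, 193.73, 195.45, 200.81, 201.63, 202.20, 202.21,
             203.62, 204.55, 208.15, 211.14, 219.54, 221.31, 224.39, 226.16]


def qq_columns(data):
    """Return one (p, z, x, u) tuple per observation (hypothetical helper).

    p = (i - 0.5)/n, z = standard normal quantile at p,
    x = i-th ordered data value, u = (x - sample mean)/sample std. dev.
    """
    ordered = sorted(data)
    n = len(ordered)
    xbar = mean(ordered)
    s = stdev(ordered)  # sample standard deviation (n - 1 divisor), an assumption
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    rows = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.5) / n
        z = std_normal.inv_cdf(p)  # plays the role of NORMINV(p, 0, 1)
        rows.append((p, z, x, (x - xbar) / s))
    return rows


rows = qq_columns(strengths)
for i, (p, z, x, u) in enumerate(rows, start=1):
    print(f"{i:2d}  {p:.5f}  {z:9.6f}  {x:7.2f}  {u:9.5f}")

# Largest vertical gap between a square (standardized value) and its diamond (quantile).
max_gap = max(abs(u - z) for _, z, _, u in rows)
print(f"largest gap between squares and diamonds: {max_gap:.3f}")
```

The printed rows should agree with the table to the precision shown there. The gap figure is a descriptive aid only; the judgment of normality in the example still rests on inspecting the plot.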