CD MATERIAL
6.5: THE NORMAL APPROXIMATION TO THE BINOMIAL AND POISSON DISTRIBUTIONS

In the earlier sections of this chapter the normal probability distribution was discussed. In this section another useful aspect of the normal distribution is considered: how it may be used to approximate the binomial and Poisson probability distributions.

Need for a Correction for Continuity Adjustment

There are two major reasons to employ a correction for continuity adjustment here. First, recall that a discrete random variable can take on only specified values while a continuous random variable can take on any values within a continuum or interval around those specified values. Hence, when using the normal distribution to approximate the binomial or the Poisson distribution, more accurate approximations of the probabilities are likely to be obtained if a correction for continuity adjustment is employed. Second, recall that with a continuous distribution (such as the normal), the probability of obtaining a particular value of a random variable is zero. On the other hand, when the normal distribution is used to approximate a discrete distribution, a correction for continuity adjustment can be employed so that the probability of a specific value of the discrete distribution can be approximated.

As a case in point, consider an experiment in which you toss a fair coin 10 times and observe the number of heads. Suppose you want to compute the probability of obtaining exactly 4 heads. Whereas a discrete random variable can have only a specified value (such as 4), a continuous random variable used to approximate it could take on any values whatsoever within an interval around that specified value, as demonstrated on the accompanying scale:

[Scale: X axis marked in half-unit steps around 4, including 3, 3.5, 4, 4.5, and 5]

The correction for continuity adjustment requires adding or subtracting 0.5 from the value or values of the discrete random variable X as needed.
Hence, to use the normal distribution to approximate the probability of obtaining exactly 4 heads (i.e., X = 4), you need to find the area under the normal curve from X = 3.5 to X = 4.5, the lower and upper boundaries of 4. To determine the approximate probability of observing at least 4 heads, you find the area under the normal curve from X = 3.5 and above since, on a continuum, 3.5 is the lower boundary of X. Similarly, to determine the approximate probability of observing at most 4 heads, you find the area under the normal curve from X = 4.5 and below since, on a continuum, 4.5 is the upper boundary of X.

When using the normal distribution to approximate discrete probability distributions, semantics are important. To determine the approximate probability of observing fewer than 4 heads, you find the area under the normal curve from X = 3.5 and below; to determine the approximate probability of observing more than 4 heads, you find the area under the normal curve from X = 4.5 and above; and to determine the approximate probability of observing 4 through 7 heads, you find the area under the normal curve from X = 3.5 to X = 7.5.

Approximating the Binomial Distribution

In Section 5.3 you learned that the binomial distribution is symmetric (like the normal distribution) whenever p = .5. When p ≠ .5 the binomial distribution will not be symmetric. However, the closer p is to .5 and the larger the number of sample observations n, the more symmetric the distribution becomes. On the other hand, the larger the number of observations in the sample, the more tedious it is to compute the exact probabilities of success by use of Equation (5.11). Fortunately, though, whenever the sample size is large, the normal distribution can be used to approximate the exact probabilities of success that otherwise would have to be obtained through laborious computations.
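As an illustration of the boundary rules for the coin-toss experiment (n = 10, p = .5), the following Python sketch compares each exact binomial probability with its normal approximation. It is a minimal sketch rather than part of the text's method: the function names `normal_cdf`, `exact_binomial`, and `normal_approx` are hypothetical helpers introduced here, and the normal area is computed with the error function from Python's standard library instead of being read from Table E.2. An event is written as an inclusive range of integer values, with `None` meaning that the range is open on that side.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exact_binomial(lo, hi, n, p):
    """Exact P(lo <= X <= hi); None means no bound on that side."""
    lo = 0 if lo is None else lo
    hi = n if hi is None else hi
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(lo, hi + 1))

def normal_approx(lo, hi, n, p):
    """Normal approximation with the correction for continuity:
    the area from lo - 0.5 to hi + 0.5 (open-ended where a bound is None)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    lower_area = 0.0 if lo is None else normal_cdf((lo - 0.5 - mu) / sigma)
    upper_area = 1.0 if hi is None else normal_cdf((hi + 0.5 - mu) / sigma)
    return upper_area - lower_area

# The six events discussed for 10 tosses of a fair coin.
events = {
    "exactly 4":     (4, 4),        # area from 3.5 to 4.5
    "at least 4":    (4, None),     # area from 3.5 and above
    "at most 4":     (None, 4),     # area from 4.5 and below
    "fewer than 4":  (None, 3),     # area from 3.5 and below
    "more than 4":   (5, None),     # area from 4.5 and above
    "4 through 7":   (4, 7),        # area from 3.5 to 7.5
}

for label, (lo, hi) in events.items():
    exact = exact_binomial(lo, hi, 10, 0.5)
    approx = normal_approx(lo, hi, 10, 0.5)
    print(f"{label:13s} exact = {exact:.4f}   normal approx = {approx:.4f}")
```

Note that "fewer than 4" is entered as the integers up to 3 and "more than 4" as the integers from 5, so the half-unit adjustment lands on 3.5 and 4.5, exactly as described above.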
As a general rule this normal approximation can be used whenever np and n(1 − p) are at least 5. Recall from Section 5.3 that the mean of the binomial distribution is given by

$$\mu = np$$

and the standard deviation of the binomial distribution is obtained from

$$\sigma = \sqrt{np(1-p)}$$

Substituting into the transformation formula (6.2),

$$Z = \frac{X - \mu}{\sigma}$$

and therefore

$$Z = \frac{X - np}{\sqrt{np(1-p)}}$$

so that, for large enough n, the random variable Z is approximately normally distributed. Hence, to find approximate probabilities corresponding to the values of the discrete random variable X, Equation (6.9) is used.

$$Z \cong \frac{X_a - np}{\sqrt{np(1-p)}} \tag{6.9}$$

where
µ = np, mean of the binomial distribution
$\sigma = \sqrt{np(1-p)}$, standard deviation of the binomial distribution
Xa = adjusted number of successes for the discrete random variable X, such that Xa = X − .5 or Xa = X + .5 as appropriate

Example 6.9 USING THE NORMAL DISTRIBUTION TO APPROXIMATE THE BINOMIAL DISTRIBUTION

Suppose that a sample of n = 1,600 tires of the same type is obtained at random from an ongoing production process in which 8% of all such tires produced are defective. What is the probability that in such a sample 150 or fewer tires will be defective?

SOLUTION

Since both np = 1,600(.08) = 128 and n(1 − p) = 1,600(0.92) = 1,472 exceed 5, you use the normal distribution to approximate the binomial:

$$Z \cong \frac{X_a - np}{\sqrt{np(1-p)}} = \frac{150.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = \frac{22.5}{10.85} = +2.07$$

Here Xa, the adjusted number of successes, is 150.5 and the Z value is +2.07. Using Table E.2, the area under the curve to the left of Z = +2.07 is 0.9808 (see Figure 6.35).
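The calculation in Example 6.9 can be checked with a short Python sketch. This is an illustration under stated assumptions, not the text's procedure: the helper names `normal_cdf`, `binomial_pmf`, and `normal_approx_at_most` are hypothetical, the normal area comes from the standard library's error function rather than Table E.2, and the exact binomial terms are computed on the log scale to avoid very large and very small intermediate values. Because the sketch does not round Z to two decimal places before finding the area, its approximate probability may differ from the table value in the fourth decimal place.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_pmf(x, n, p):
    """Exact binomial probability P(X = x), computed via logarithms."""
    log_coeff = (math.lgamma(n + 1) - math.lgamma(x + 1)
                 - math.lgamma(n - x + 1))
    return math.exp(log_coeff + x * math.log(p) + (n - x) * math.log(1.0 - p))

def normal_approx_at_most(x, n, p):
    """Return (Z, approximate P(X <= x)) from Equation (6.9), with Xa = x + 0.5."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    z = (x + 0.5 - mu) / sigma
    return z, normal_cdf(z)

n, p = 1600, 0.08

# Rule of thumb: np and n(1 - p) should both be at least 5.
assert n * p >= 5 and n * (1 - p) >= 5

z, approx = normal_approx_at_most(150, n, p)

# The "laborious" route: sum the 151 exact terms for X = 0, 1, ..., 150.
exact = sum(binomial_pmf(x, n, p) for x in range(151))

print(f"Z = {z:.2f}")
print(f"normal approximation of P(X <= 150) = {approx:.4f}")
print(f"exact binomial P(X <= 150)          = {exact:.4f}")
```

The same helper handles other "at most" questions by changing its first argument; an "at least" question would instead use Xa = X − 0.5 and the area to the right of Z.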
FIGURE 6.35 Approximating the binomial distribution
[Normal curve with the X scale marked at µ = 128 and 150.5 and the Z scale marked at 0 and +2.07; the area to the left of Z = +2.07 is .9808]

Under the binomial distribution the probability of obtaining not more than 150 defective tires consists of all events up to and including 150 defectives; that is, P(X ≤ 150) = P(X = 0) + P(X = 1) + … + P(X = 150), and the true probability is laboriously computed from

$$\sum_{X=0}^{150} \binom{1{,}600}{X} (.08)^{X} (.92)^{1{,}600 - X}$$

To appreciate the amount of work saved by using the normal approximation to the binomial model in lieu of the exact probability computations, just imagine making the following 151 computations from Equation (5.11) on page 175 before summing up the results:

$$\binom{1{,}600}{0} (.08)^{0} (.92)^{1{,}600} + \binom{1{,}600}{1} (.08)^{1} (.92)^{1{,}599} + \cdots + \binom{1{,}600}{150} (.08)^{150} (.92)^{1{,}450}$$

Obtaining a Probability Approximation for an Individual Value

Suppose that you want to approximate the probability of obtaining exactly 150 defectives. The correction for continuity defines the integer value of interest to range from one-half unit below it to one-half unit above it. Therefore, the probability of obtaining exactly 150 defective tires is defined as the area (under the normal curve) between 149.5 and 150.5. Thus, by using Equation (6.9), the probability can be approximated as follows:

$$Z \cong \frac{150.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = \frac{22.5}{10.85} = +2.07$$

and

$$Z \cong \frac{149.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = \frac{21.5}{10.85} = +1.98$$

From Table E.2, note that the area under the normal curve to the left of X = 150.5 (Z = +2.07) is 0.9808 and the area under the curve to the left of X = 149.5 (Z = +1.98) is 0.9761. Thus, the approximate probability of obtaining exactly 150 defective tires is the difference in the two areas, 0.9808 − 0.9761 = 0.0047.

Approximating the Poisson Distribution

The normal distribution can also be used to approximate the Poisson distribution whenever the parameter λ, the expected number of successes, equals or exceeds 5.
Since the mean and the variance of a Poisson distribution are the same,

$$\mu = \sigma^2 = \lambda$$

the standard deviation is

$$\sigma = \sqrt{\lambda}$$

Substituting into the transformation Equation (6.2) on page 199,

$$Z = \frac{X - \mu}{\sigma} = \frac{X - \lambda}{\sqrt{\lambda}}$$

so that, for large enough λ, the random variable Z is approximately normally distributed. Hence, to find approximate probabilities corresponding to the values of the Poisson random variable X, Equation (6.10) is used.

$$Z \cong \frac{X_a - \lambda}{\sqrt{\lambda}} \tag{6.10}$$

where
λ = expected number of successes or mean of the Poisson distribution
$\sigma = \sqrt{\lambda}$, the standard deviation of the Poisson distribution
Xa = adjusted number of successes for the discrete random variable X, such that Xa = X − 0.5 or Xa = X + 0.5 as appropriate

Example 6.10 USING THE NORMAL DISTRIBUTION TO APPROXIMATE THE POISSON DISTRIBUTION

Suppose that at a certain automobile plant the average number of work stoppages per day due to equipment problems during the production process is 12.0. What is the approximate probability of having 15 or fewer work stoppages due to equipment problems on any given day?

SOLUTION

Using Equation (6.10),

$$Z \cong \frac{X_a - \lambda}{\sqrt{\lambda}} = \frac{15.5 - 12.0}{\sqrt{12.0}} = +1.01$$

Here Xa, the adjusted number of successes, is 15.5. Hence the approximate probability that X does not exceed this value corresponds to a Z value of not more than +1.01. From Table E.2, note that the area under the normal curve to the left of Z = +1.01 is 0.8438. Therefore, the approximate probability of having 15 or fewer work stoppages due to equipment problems on any given day is 0.8438. This approximation compares quite favorably to the exact Poisson probability, 0.8445.

PROBLEMS FOR SECTION 6.5

Learning the Basics

6.53 Why is a correction for continuity adjustment needed?
6.54 When can the normal distribution be used to approximate the binomial distribution?
6.55 When can the normal distribution be used to approximate the Poisson distribution?
Applying the Concepts

6.56 Consider an experiment in which you toss a fair coin 10 times and observe the number of heads. Use Equation (5.11) on page 175 or Table E.6 or PHStat or Minitab to determine the probability of observing
a. 4 heads
b. at least 4 heads
c. at most 4 heads
d. fewer than 4 heads
e. more than 4 heads
f. 4 through 7 heads
g. Use the normal approximation to the binomial distribution to approximate the probabilities in (a)–(f).
h. Compare and contrast your findings in (a)–(f) and (g). Do you think that the normal distribution provides a good approximation to the binomial distribution in (g)?

6.57 For overseas flights, an airline has three different choices on its dessert menu: ice cream, apple pie, and chocolate cake. Based on past experience the airline feels that each dessert is equally likely to be chosen.
a. If a random sample of four passengers is selected, what is the probability that at least two will choose ice cream for dessert?
b. If a random sample of 21 passengers is selected, what is the approximate probability that at least two will choose ice cream for dessert?

6.58 Based upon past experience, 40% of all customers at Miller’s Automotive Service Station pay for their purchases with a credit card. If a random sample of three customers is selected, what is the probability that
a. none pay with a credit card?
b. two pay with a credit card?
c. at least two pay with a credit card?
d. not more than two pay with a credit card?
If a random sample of 200 customers is selected, what is the approximate probability that
e. at least 75 pay with a credit card?
f. not more than 70 pay with a credit card?
g. between 70 and 75 customers, inclusive, pay with a credit card?

6.59 On average, 10.0 persons per minute are waiting for an elevator in the lobby of a large office building between the hours of 8 A.M. and 9 A.M.
a. What is the probability that in any one-minute period at most four persons are waiting?
b. What is the approximate probability that in any one-minute period at most four persons are waiting?
c. Compare your results in (a) and (b).

6.60 The number of cars arriving per minute at a toll booth on a particular bridge is Poisson distributed with a mean of 2.5. What is the probability that in any given minute
a. no cars arrive?
b. not more than two cars arrive?
If the expected number of cars arriving at the toll booth per ten-minute interval is 25.0, what is the approximate probability that in any given ten-minute period
c. not more than 20 cars arrive?
d. between 20 and 30 cars arrive?

6.61 Cars arrive at Kenny’s Car Wash at a rate of nine per half-hour.
a. What is the probability that in any given half-hour period at least three cars arrive?
b. What is the approximate probability that in any given half-hour period at least three cars arrive?
c. Compare your results in (a) and (b).

6.62 Suppose that the number of defective videocassette tapes that are returned to a video rental store has averaged seven per day.
a. What is the (exact) probability that two tapes will be returned today?
b. What is the (exact) probability that at least two tapes will be returned today?
c. What assumptions were made about the probability distribution selected in (a) and (b)? Discuss.
d. Obtain approximate answers for (a) and (b) using a different probability distribution model. Discuss the differences in your findings.