6.5: the normal approximation to the binomial and poisson

CD MATERIAL
6.5: THE NORMAL APPROXIMATION TO THE BINOMIAL AND POISSON DISTRIBUTIONS
In the earlier sections of this chapter the normal probability distribution was discussed. In this section another useful aspect of the normal distribution is considered—how it may be used to approximate the binomial and Poisson probability distributions.
Need for a Correction for Continuity Adjustment
There are two major reasons to employ a correction for continuity adjustment here.
First, recall that a discrete random variable can take on only specified values while a continuous
random variable can take on any values within a continuum or interval around those specified values. Hence, when using the normal distribution to approximate the binomial or the Poisson distributions, more accurate approximations of the probabilities are likely to be obtained if a correction
for continuity adjustment is employed.
Second, recall that with a continuous distribution (such as the normal), the probability of
obtaining a particular value of a random variable is zero. On the other hand, when the normal distribution is used to approximate a discrete distribution, a correction for continuity adjustment
can be employed so that the probability of a specific value of the discrete distribution can be
approximated.
As a case in point, consider an experiment in which you toss a fair coin 10 times and observe the
number of heads. Suppose you want to compute the probability of obtaining exactly 4 heads.
Whereas a discrete random variable can have only a specified value (such as 4), a continuous random variable used to approximate it could take on any values whatsoever within an interval around
that specified value, as demonstrated on the accompanying scale:
[Scale for X: . . . 2.5, 3, 3.5, 4, 4.5, 5, . . .]
The correction for continuity adjustment requires adding or subtracting 0.5 from the value or
values of the discrete random variable X as needed. Hence to use the normal distribution to approximate the probability of obtaining exactly 4 heads (i.e., X = 4), you need to find the area under the
normal curve from X = 3.5 to X = 4.5, the lower and upper boundaries of 4. To determine the
approximate probability of observing at least 4 heads, you find the area under the normal curve
from X = 3.5 and above since, on a continuum, 3.5 is the lower boundary of X = 4. Similarly, to determine the approximate probability of observing at most 4 heads, you find the area under the
normal curve from X = 4.5 and below since, on a continuum, 4.5 is the upper boundary of X = 4.
When using the normal distribution to approximate discrete probability distributions, semantics are important. To determine the approximate probability of observing fewer than 4 heads, you
find the area under the normal curve from X = 3.5 and below; to determine the approximate probability of observing more than 4 heads, you find the area under the normal curve from X = 4.5 and
above; and to determine the approximate probability of observing 4 through 7 heads, you find the
area under the normal curve from X = 3.5 to X = 7.5.
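The wording rules above can be checked numerically for the coin-toss case. The following is a minimal sketch, not part of the original text: the helper names `normal_cdf` and `binom_pmf` and the `events` table are illustrative assumptions, and the normal areas are computed with the error function rather than read from Table E.2.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def binom_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5                      # 10 tosses of a fair coin
mu, sigma = n * p, sqrt(n * p * (1 - p))

# (wording, discrete values covered, lower bound, upper bound)
# None means the interval is unbounded on that side.
events = [
    ("exactly 4",    range(4, 5),  3.5,  4.5),
    ("at least 4",   range(4, 11), 3.5,  None),
    ("at most 4",    range(0, 5),  None, 4.5),
    ("fewer than 4", range(0, 4),  None, 3.5),
    ("more than 4",  range(5, 11), 4.5,  None),
    ("4 through 7",  range(4, 8),  3.5,  7.5),
]

results = {}
for label, values, lo, hi in events:
    exact = sum(binom_pmf(k, n, p) for k in values)
    upper = 1.0 if hi is None else normal_cdf(hi, mu, sigma)
    lower = 0.0 if lo is None else normal_cdf(lo, mu, sigma)
    results[label] = (exact, upper - lower)
    print(f"{label:>12}: exact = {exact:.4f}, normal approx = {upper - lower:.4f}")
```

Each row pairs a phrase with the half-unit boundaries described above, so a wrong boundary choice shows up as a visibly larger gap between the two columns.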
Approximating the Binomial Distribution
In section 5.3 you learned that the binomial distribution is symmetric (like the normal distribution)
whenever p = .5. When p ≠ .5 the binomial distribution will not be symmetric. However, the closer p
is to .5 and the larger the number of sample observations n, the more symmetric the distribution
becomes.
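One way to see this numerically is through the skewness coefficient of the binomial distribution, (1 − 2p)/√(np(1 − p)), a standard result that is not stated in this section. The sketch below is illustrative only; the function name `binomial_skewness` and the chosen values of n and p are assumptions.

```python
from math import sqrt

def binomial_skewness(n, p):
    """Skewness of a binomial distribution; 0 means perfectly symmetric."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

# Skewness is zero at p = .5 and shrinks toward zero as n grows for p != .5.
for n, p in [(10, 0.5), (10, 0.08), (100, 0.08), (1600, 0.08)]:
    print(f"n = {n:>4}, p = {p:.2f}: skewness = {binomial_skewness(n, p):.4f}")
```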
On the other hand, the larger the number of observations in the sample, the more tedious it is
to compute the exact probabilities of success by use of Equation (5.11). Fortunately, though, whenever the sample size is large, the normal distribution can be used to approximate the exact probabilities of success that otherwise would have to be obtained through laborious computations.
As a general rule, this normal approximation can be used whenever np and n(1 − p) are at least 5. Recall from section 5.3 that the mean of the binomial distribution is given by
$$\mu = np$$
and the standard deviation of the binomial distribution is obtained from
$$\sigma = \sqrt{np(1 - p)}$$
Substituting into the transformation formula (6.2),
$$Z = \frac{X - \mu}{\sigma}$$
and therefore
$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
so that, for large enough n, the random variable Z is approximately normally distributed.
Hence, to find approximate probabilities corresponding to the values of the discrete random variable X, Equation (6.9) is used.
$$Z \cong \frac{X_a - np}{\sqrt{np(1 - p)}} \tag{6.9}$$
where
$\mu = np$, mean of the binomial distribution
$\sigma = \sqrt{np(1 - p)}$, standard deviation of the binomial distribution
$X_a$ = adjusted number of successes for the discrete random variable X, such that $X_a = X - 0.5$ or $X_a = X + 0.5$ as appropriate
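Equation (6.9) and the rule of thumb that precedes it can be sketched as two small functions. This is an illustrative sketch only: the names `approximation_ok` and `z_binomial`, and the convention of passing the adjustment as +0.5 or −0.5, are assumptions rather than anything defined in the text.

```python
from math import sqrt

def approximation_ok(n, p, threshold=5):
    """Rule of thumb: both np and n(1 - p) should be at least 5."""
    return n * p >= threshold and n * (1 - p) >= threshold

def z_binomial(x, n, p, adjust):
    """Z value from Equation (6.9); adjust is +0.5 or -0.5 as appropriate."""
    x_a = x + adjust                      # adjusted number of successes
    return (x_a - n * p) / sqrt(n * p * (1 - p))

# Coin-toss case: "at most 4 heads" uses the upper boundary 4.5.
print(approximation_ok(10, 0.5))          # np = 5 and n(1 - p) = 5
print(round(z_binomial(4, 10, 0.5, +0.5), 2))
```

Leaving the sign of the adjustment to the caller keeps the decision about "at least", "at most", "fewer than", and "more than" explicit, since that choice depends on the wording of the question rather than on the formula.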
Example 6.9
USING THE NORMAL DISTRIBUTION TO APPROXIMATE THE BINOMIAL DISTRIBUTION
Suppose that a sample of n = 1,600 tires of the same type is obtained at random from an ongoing
production process in which 8% of all such tires produced are defective. What is the probability that
in such a sample 150 or fewer tires will be defective?
SOLUTION Since both np = 1,600(0.08) = 128 and n(1 − p) = 1,600(0.92) = 1,472 exceed 5, you use the normal distribution to approximate the binomial:
$$Z \cong \frac{X_a - np}{\sqrt{np(1 - p)}} = \frac{150.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = \frac{22.5}{10.85} = +2.07$$
Here Xa, the adjusted number of successes, is 150.5 and the Z value is +2.07.
Using Table E.2, the area under the curve to the left of Z = +2.07 is 0.9808 (see Figure 6.35).
FIGURE 6.35 Approximating the binomial distribution [normal curve; X scale marked at µ = 128 and 150.5, Z scale marked at 0 and +2.07; area to the left of Z = +2.07 is .9808]
Under the binomial distribution the probability of obtaining not more than 150 defective tires consists of all events up to and including 150 defectives, that is, P(X ≤ 150) = P(X = 0) + P(X = 1) + … + P(X = 150), and the true probability is laboriously computed from
$$\sum_{X=0}^{150} \binom{1{,}600}{X} (.08)^{X} (.92)^{1{,}600 - X}$$
To appreciate the amount of work saved by using the normal approximation to the binomial
model in lieu of the exact probability computations, just imagine making the following 151 computations from Equation (5.11) on page 175 before summing up the results:
1,600
1,600
1,600
0
1,600
1
1,599
150
1, 450
+L+ 
+
 (.08) (.92)
 (.08) (.92)
 (.08) (.92)

 150 
 1 
 0 
Obtaining a Probability Approximation for an Individual Value
Suppose that
you want to approximate the probability of obtaining exactly 150 defectives. The correction for continuity defines the integer value of interest to range from one-half unit below it to one-half unit
above it. Therefore, the probability of obtaining 150 defective tires is defined as the area (under the
normal curve) between 149.5 and 150.5. Thus by using Equation (6.9), the probability can be
approximated as follows:
$$Z \cong \frac{150.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = \frac{22.5}{10.85} = +2.07$$
and
$$Z \cong \frac{149.5 - 128}{\sqrt{(1{,}600)(0.08)(0.92)}} = +1.98$$
From Table E.2, note that the area under the normal curve to the left of X = 150.5 (Z = +2.07) is
0.9808 and the area under the curve to the left of X = 149.5 (Z = +1.98) is 0.9761. Thus, the approximate probability of obtaining 150 defective tires is the difference in the two areas, 0.0047.
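Both tire calculations, the 151-term cumulative sum and the single value X = 150, can be checked by computer. The sketch below is one possible way to do so and is not the text's method: the helper names are illustrative, the exact terms are evaluated in log space with `lgamma` (an implementation choice made here to keep the very large binomial coefficients manageable), and the normal areas use unrounded Z values, so they may differ slightly in the last digit from Table E.2.

```python
from math import erf, exp, lgamma, log, sqrt

def normal_cdf(z):
    """Area under the standardized normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_pmf(k, n, p):
    """Exact binomial probability, computed via logarithms."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

n, p, x = 1600, 0.08, 150
mu, sigma = n * p, sqrt(n * p * (1 - p))

z_upper = (x + 0.5 - mu) / sigma      # boundary 150.5
z_lower = (x - 0.5 - mu) / sigma      # boundary 149.5

# P(X <= 150): area to the left of 150.5 versus the 151-term exact sum.
approx_cum = normal_cdf(z_upper)
exact_cum = sum(binom_pmf(k, n, p) for k in range(x + 1))

# P(X = 150): area between 149.5 and 150.5 versus the single exact term.
approx_one = normal_cdf(z_upper) - normal_cdf(z_lower)
exact_one = binom_pmf(x, n, p)

print(f"Z values: {z_lower:.2f}, {z_upper:.2f}")
print(f"P(X <= 150): approx = {approx_cum:.4f}, exact = {exact_cum:.4f}")
print(f"P(X  = 150): approx = {approx_one:.4f}, exact = {exact_one:.4f}")
```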
Approximating the Poisson Distribution
The normal distribution can also be used to approximate the Poisson distribution whenever the
parameter λ, the expected number of successes, equals or exceeds 5. Since the value of the mean and
the variance of a Poisson distribution are the same,
$$\mu = \sigma^2 = \lambda$$
then the standard deviation is
$$\sigma = \sqrt{\lambda}$$
Substituting into the transformation Equation (6.2) on page 199,
$$Z = \frac{X - \mu}{\sigma} = \frac{X - \lambda}{\sqrt{\lambda}}$$
so that, for large enough λ, the random variable Z is approximately normally distributed.
Hence, to find approximate probabilities corresponding to the values of the Poisson random variable X, Equation (6.10) is used.
$$Z \cong \frac{X_a - \lambda}{\sqrt{\lambda}} \tag{6.10}$$
where
$\lambda$ = expected number of successes or mean of the Poisson distribution
$\sigma = \sqrt{\lambda}$, the standard deviation of the Poisson distribution
$X_a$ = adjusted number of successes, x, for the discrete random variable X, such that $X_a = X - 0.5$ or $X_a = X + 0.5$ as appropriate
Example 6.10
USING THE NORMAL DISTRIBUTION TO APPROXIMATE THE POISSON DISTRIBUTION
Suppose that at a certain automobile plant the average number of work stoppages per day due to
equipment problems during the production process is 12.0. What is the approximate probability of
having 15 or fewer work stoppages due to equipment problems on any given day?
SOLUTION Using Equation (6.10),
$$Z \cong \frac{X_a - \lambda}{\sqrt{\lambda}} = \frac{15.5 - 12.0}{\sqrt{12.0}} = +1.01$$
Here Xa, the adjusted number of successes, is 15.5. Hence the approximate probability that X does
not exceed this value corresponds to a Z value of not more than +1.01. From Table E.2, note that the
area under the normal curve less than Z = +1.01 is 0.8438. Therefore, the approximate probability of
having 15 or fewer work stoppages due to equipment problems on any given day is 0.8438. This
approximation compares quite favorably to the exact Poisson probability, 0.8445.
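The comparison in Example 6.10 can be reproduced with a short script. This is a minimal sketch under stated assumptions: the variable names are illustrative, the exact probability is obtained by summing the Poisson terms for X = 0 through 15 directly, and the normal area uses the unrounded Z value instead of Table E.2.

```python
from math import erf, exp, factorial, sqrt

def normal_cdf(z):
    """Area under the standardized normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

lam, x = 12.0, 15                     # mean stoppages per day, cutoff value

# Equation (6.10) with the adjusted value Xa = 15.5 for "15 or fewer".
z = (x + 0.5 - lam) / sqrt(lam)
approx = normal_cdf(z)

# Exact Poisson probability P(X <= 15), summed term by term.
exact = sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

print(f"Z = {z:+.2f}, approx = {approx:.4f}, exact = {exact:.4f}")
```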
PROBLEMS FOR SECTION 6.5
Learning the Basics
6.53 Why is a correction for continuity adjustment needed?
6.54 When can the normal distribution be used to approximate
the binomial distribution?
6.55 When can the normal distribution be used to approximate
the Poisson distribution?
Applying the Concepts
6.56 Consider an experiment in which you toss a fair coin 10
times and observe the number of heads. Use Equation (5.11)
on page 175 or Table E.6 or PHStat or Minitab to determine
the probability of observing
a. 4 heads
b. at least 4 heads
c. at most 4 heads
d. fewer than 4 heads
e. more than 4 heads
f. 4 through 7 heads
g. Use the normal approximation to the binomial distribution to approximate the probabilities in (a)–(f).
h. Compare and contrast your findings in (a)–(f) and (g).
Do you think that the normal distribution provides a
good approximation to the binomial distribution in (g)?
6.57 For overseas flights, an airline has three different choices on
its dessert menu—ice cream, apple pie, and chocolate cake.
Based on past experience the airline feels that each dessert is
equally likely to be chosen.
a. If a random sample of four passengers is selected, what is
the probability that at least two will choose ice cream for
dessert?
b. If a random sample of 21 passengers is selected, what is
the approximate probability that at least two will choose
ice cream for dessert?
6.58 Based upon past experience, 40% of all customers at Miller’s
Automotive Service Station pay for their purchases with a
credit card. If a random sample of three customers is
selected, what is the probability that
a. none pay with a credit card?
b. two pay with a credit card?
c. at least two pay with a credit card?
d. not more than two pay with a credit card?
If a random sample of 200 customers is selected, what is the
approximate probability that
e. at least 75 pay with a credit card?
f. not more than 70 pay with a credit card?
g. between 70 and 75 customers, inclusive, pay with a
credit card?
6.59 On average, 10.0 persons per minute are waiting for an elevator in the lobby of a large office building between the
hours of 8 A.M. and 9 A.M.
a. What is the probability that in any one-minute period at
most four persons are waiting?
b. What is the approximate probability that in any one-minute period at most four persons are waiting?
c. Compare your results in (a) and (b).
6.60 The number of cars arriving per minute at a toll booth on a
particular bridge is Poisson distributed with a mean of 2.5.
What is the probability that in any given minute
a. no cars arrive?
b. not more than two cars arrive?
If the expected number of cars arriving at the toll booth per
ten-minute interval is 25.0, what is the approximate probability that in any given ten-minute period
c. not more than 20 cars arrive?
d. between 20 and 30 cars arrive?
6.61 Cars arrive at Kenny’s Car Wash at a rate of nine per half-hour.
a. What is the probability that in any given half-hour period
at least three cars arrive?
b. What is the approximate probability that in any given
half-hour period at least three cars arrive?
c. Compare your results in (a) and (b).
6.62 Suppose that the number of defective videocassette tapes that
are returned to a video rental store has averaged seven per day.
a. What is the (exact) probability that two tapes will be
returned today?
b. What is the (exact) probability that at least two tapes will
be returned today?
c. What assumptions were made about the probability distribution selected in (a) and (b)? Discuss.
d. Obtain approximate answers for (a) and (b) using a different probability distribution model. Discuss the differences in your findings.