Approximation of the Poisson, Binomial and Hypergeometric Distributions by the Normal Distribution.

251distrex5l 1/14/08 (Open this document in 'Page Layout' view!)
4. Normal Approximation to the Binomial Distribution.
If an appropriate Binomial table is not available, it is common to use the Normal
distribution in place of the Binomial with $\mu = np$ and $\sigma = \sqrt{npq}$,
where $q = 1 - p$.
Usually this is done when n is large, but there are rules of thumb available to decide
on the appropriateness of a substitution.
The traditional criterion is $np > 5$ and $nq > 5$, which is easy to remember because
$\mu = np$ is the expected number of successes and $nq = n - \mu$ is the expected number of
failures. A more modern criterion, $0 < \mu \pm 3\sigma < n$, probably works better but is harder
to remember.
a. Without Continuity Correction.
There is a problem with this sort of substitution which will be handled in section b
below.
Let us try to find the Binomial probability $P(5 \le x \le 15)$ when $n = 20$ and $p = .4$,
and let us assume that we do not have a Binomial table. First the criteria. If we use
the traditional criterion, we note that $\mu = np = 20(.4) = 8$ and that $nq = 20 - 8 = 12$.
Since these are both above 5, we can use the Normal distribution. If we wish to use the
more modern criterion, find $\sigma = \sqrt{npq} = \sqrt{8(.6)} = \sqrt{4.8} = 2.191$, so
$\mu \pm 3\sigma = 8 \pm 3(2.191) = 8 \pm 6.573$, or 1.427 and 14.573. Since these numbers are
between zero and $n = 20$, we can use the Normal distribution.
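The two rules of thumb above can be checked mechanically. The following is a minimal sketch; the function name `normal_approx_checks` is hypothetical, and the use of strict inequalities is an assumption about how the criteria are meant to be read.

```python
import math

def normal_approx_checks(n, p):
    """Evaluate both rules of thumb for approximating Binomial(n, p) by a Normal."""
    q = 1.0 - p
    mu = n * p                        # expected number of successes
    sigma = math.sqrt(n * p * q)      # Binomial standard deviation
    lower = mu - 3 * sigma
    upper = mu + 3 * sigma
    return {
        "mu": mu,
        "sigma": sigma,
        "lower": lower,
        "upper": upper,
        # Traditional rule (assumed strict): np > 5 and nq > 5.
        "traditional": mu > 5 and n * q > 5,
        # Modern rule (assumed strict): 0 < mu - 3 sigma and mu + 3 sigma < n.
        "modern": 0 < lower and upper < n,
    }

checks = normal_approx_checks(20, 0.4)
print(checks)
```

For $n = 20$ and $p = .4$ both flags come out true, matching the hand calculation.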
We transform $x$ into
$$z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{npq}} = \frac{x - 8}{2.191}.$$
$$P(5 \le x \le 15) = P\left(\frac{5 - 8}{2.191} \le z \le \frac{15 - 8}{2.191}\right) = P(-1.37 \le z \le 3.19)$$
$$= P(-1.37 \le z \le 0) + P(0 \le z \le 3.19) = .4147 + .4993 = .9140$$
b. With Continuity Correction.
There is a basic incompatibility between the Normal distribution, which is continuous,
and the Binomial distribution, which is discrete. Individual probabilities like
$P(x = 5)$ are zero in a continuous distribution, so they cannot be read off the Normal
curve directly. However, we can approximate the Binomial probability $P(x = 5)$ by the
Normal probability $P(4.5 \le x \le 5.5)$. If we continue in this vein, we will expand each
interval by lowering its lower limit by 0.5 and raising its upper limit by 0.5. So the
Binomial problem $P(5 \le x \le 15)$ is approximated by the Normal problem
$P(4.5 \le x \le 15.5)$. If we do this,
$$P(4.5 \le x \le 15.5) = P\left(\frac{4.5 - 8}{2.191} \le z \le \frac{15.5 - 8}{2.191}\right) = P(-1.60 \le z \le 3.42)$$
$$= P(-1.60 \le z \le 0) + P(0 \le z \le 3.42) = .4452 + .4997 = .9449$$
If $n = 20$ and $p = .4$, the Binomial table gives us
$P(5 \le x \le 15) = P(x \le 15) - P(x \le 4) = .99968 - .05095 = .94873$, so that our error
with the continuity correction was about 0.4%, against about 3.7% without it.
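As a check on the comparison above, the exact Binomial probability can be summed directly and set beside the two Normal approximations. This is a sketch under stated assumptions: `binomial_cdf` and `normal_cdf` are hypothetical helper names, and `math.comb` requires Python 3.8 or later.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(k, n, p):
    """P(x <= k) for a Binomial(n, p) variable, by direct summation."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 20, 0.4
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Exact value: P(5 <= x <= 15) = P(x <= 15) - P(x <= 4).
exact = binomial_cdf(15, n, p) - binomial_cdf(4, n, p)

# Normal approximation with and without the 0.5 continuity correction.
uncorrected = normal_cdf((15 - mu) / sigma) - normal_cdf((5 - mu) / sigma)
corrected = normal_cdf((15.5 - mu) / sigma) - normal_cdf((4.5 - mu) / sigma)

for label, value in [("exact", exact), ("uncorrected", uncorrected), ("corrected", corrected)]:
    print(f"{label:12s} {value:.5f}  relative error {abs(value - exact) / exact:.2%}")
```

The corrected value lands much closer to the exact probability than the uncorrected one, which is the point of widening the interval by 0.5 on each side.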
As $n$ gets larger, individual probabilities become quite small and the continuity
correction becomes negligible. The only rule of thumb that I have seen on this is that
one should always use the correction if $npq < 9$.
c. Extensions
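The Binomial distribution is often expressed as the probabilities of a proportion of
successes $\bar{p} = \frac{x}{n}$, for which the mean is $p$ and the standard deviation is
$\sqrt{\frac{pq}{n}}$. This means that for a Binomial distribution with $n = 20$ and $p = .4$,
we have $\mu_{\bar{p}} = p = .4$, $\sigma_{\bar{p}} = \sqrt{\frac{.4(.6)}{20}} = 0.1095445$ and
$$z = \frac{\bar{p} - \mu_{\bar{p}}}{\sigma_{\bar{p}}} = \frac{\bar{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{\bar{p} - .4}{0.1095445}.$$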
Our Binomial problem $P(5 \le x \le 15)$ becomes the Normal problem
$$P(.25 \le \bar{p} \le .75) = P\left(\frac{.25 - .4}{0.1095445} \le z \le \frac{.75 - .4}{0.1095445}\right) = P(-1.37 \le z \le 3.20)$$
$$= .4147 + .4993 = .9140.$$
If we insist on a continuity correction (and we should), we expand the interval by
$\frac{0.5}{n} = \frac{0.5}{20} = .025$ on each side. The expression becomes the Normal problem
$$P(.225 \le \bar{p} \le .775) = P\left(\frac{.225 - .4}{0.1095445} \le z \le \frac{.775 - .4}{0.1095445}\right)$$
$$= P(-1.60 \le z \le 3.42) = P(-1.60 \le z \le 0) + P(0 \le z \le 3.42) = .4452 + .4997 = .9449.$$
If we have a Hypergeometric problem, with $M$ successes in a population of size $N$, we can
exploit the similarity of the Hypergeometric distribution to the Binomial distribution by
using our original value of $x$ with $p = \frac{M}{N}$, $\mu = np$ and
$\sigma = \sqrt{\frac{N - n}{N - 1}\, npq}$. A great advantage is that we can also
work with $\bar{p} = \frac{x}{n}$, which has a mean of $p$ and a standard deviation of
$\sqrt{\left(\frac{N - n}{N - 1}\right) \frac{pq}{n}}$.
But note that if $N > 20n$, the finite population correction $\frac{N - n}{N - 1}$ would be
likely to have little effect, so that we may be better off using the Binomial distribution
or the Normal approximation to it.
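To see how much the finite population correction matters, the sketch below computes the Normal parameters for two populations with the same $p = .4$ and $n = 20$. The function name `hypergeometric_normal_params` and the population sizes 50 and 1000 are hypothetical values chosen for illustration, not taken from the handout.

```python
import math

def hypergeometric_normal_params(N, M, n):
    """Mean, standard deviation and correction factor for a Normal approximation
    to a Hypergeometric count (population N, M successes, sample size n)."""
    p = M / N
    q = 1 - p
    fpc = (N - n) / (N - 1)              # finite population correction
    mu = n * p
    sigma = math.sqrt(fpc * n * p * q)
    return mu, sigma, fpc

# Small population: N is well under 20n, so the correction is substantial.
mu_small, sigma_small, fpc_small = hypergeometric_normal_params(50, 20, 20)
# Large population: N exceeds 20n, so the correction is close to 1.
mu_large, sigma_large, fpc_large = hypergeometric_normal_params(1000, 400, 20)

print(round(fpc_small, 4), round(sigma_small, 3))
print(round(fpc_large, 4), round(sigma_large, 3))
```

With $N = 1000$ the standard deviation is within about 1% of the Binomial value $\sqrt{4.8} = 2.191$, which illustrates why the correction can often be dropped when $N > 20n$.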
5. Normal approximation to the Poisson Distribution.
If an appropriate Poisson table is not available, it is common to use the Normal
distribution in place of the Poisson with $\mu = m$ and $\sigma = \sqrt{m}$, where $m$ is the
parameter (mean and variance) of the Poisson distribution.
The rule of thumb is that this works fairly well if $m \ge 25$. For example, let us
find $P(5 \le x \le 15)$ when $m = 25$. Obviously, we are on the edge of the acceptable
values for the parameter, but let's try.
We transform $x$ into
$$z = \frac{x - \mu}{\sigma} = \frac{x - m}{\sqrt{m}} = \frac{x - 25}{5}.$$
Without the continuity correction,
$$P(5 \le x \le 15) = P\left(\frac{5 - 25}{5} \le z \le \frac{15 - 25}{5}\right) = P(-4.00 \le z \le -2.00)$$
$$= P(-4.00 \le z \le 0) - P(-2.00 \le z \le 0) = .5000 - .4772 = .0228.$$
With the continuity correction we will expand each interval by lowering its lower limit
by 0.5 and raising its upper limit by 0.5. So the Poisson problem $P(5 \le x \le 15)$ is
approximated by the Normal problem $P(4.5 \le x \le 15.5)$. If we do this we get
$$P(4.5 \le x \le 15.5) = P\left(\frac{4.5 - 25}{5} \le z \le \frac{15.5 - 25}{5}\right) = P(-4.10 \le z \le -1.90)$$
$$= P(-4.10 \le z \le 0) - P(-1.90 \le z \le 0) = .5000 - .4713 = .0287.$$
The Poisson table with $m = 25$ gives us $P(5 \le x \le 15) = P(x \le 15) - P(x \le 4)$
$= .02229 - .00000 = .02229$, so that our error without the continuity
correction was about 2% and with it was much larger, about 29%. It seems that the continuity
correction works badly for very high or low values of $x$.
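The Poisson comparison can be rerun without tables. This is a minimal sketch; `poisson_cdf` and `normal_cdf` are hypothetical helper names, and the Poisson terms are built up recursively to avoid large factorials.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def poisson_cdf(k, m):
    """P(x <= k) for a Poisson variable with mean m, summing terms recursively."""
    term = math.exp(-m)      # P(x = 0)
    total = term
    for i in range(1, k + 1):
        term *= m / i        # P(x = i) = P(x = i - 1) * m / i
        total += term
    return total

m = 25.0
sigma = math.sqrt(m)

exact = poisson_cdf(15, m) - poisson_cdf(4, m)
uncorrected = normal_cdf((15 - m) / sigma) - normal_cdf((5 - m) / sigma)
corrected = normal_cdf((15.5 - m) / sigma) - normal_cdf((4.5 - m) / sigma)

for label, value in [("exact", exact), ("uncorrected", uncorrected), ("corrected", corrected)]:
    print(f"{label:12s} {value:.5f}  relative error {abs(value - exact) / exact:.1%}")
```

In this tail region the uncorrected approximation is the closer of the two, consistent with the observation that the continuity correction works badly far from the mean.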