251distrex5l 1/14/08 (Open this document in 'Page Layout' view!)

Approximation of the Poisson, Binomial and Hypergeometric Distributions by the Normal Distribution.

4. Normal Approximation to the Binomial Distribution.

If an appropriate Binomial table is not available, it is common to use the Normal distribution in place of the Binomial with $\mu = np$ and $\sigma = \sqrt{npq}$, where $q = 1 - p$. Usually this is done when $n$ is large, but there are rules of thumb available to decide on the appropriateness of a substitution. The traditional criterion is $np \ge 5$ and $nq \ge 5$, which is easy to remember because $np$ is the expected number of successes and $nq = n - np$ is the expected number of failures. A more modern criterion, $0 < \mu \pm 3\sigma < n$, probably works better, but is harder to remember.

a. Without Continuity Correction.

There is a problem with this sort of substitution which will be handled in section b below. Let us try to find the Binomial probability $P(5 \le x \le 15)$ when $n = 20$ and $p = .4$, and let us assume that we do not have a Binomial table.

First the criteria. If we use the traditional criterion, we note that $np = 20(.4) = 8$ and that $nq = 20 - 8 = 12$. Since these are both above 5, we can use the Normal distribution. If we wish to use the more modern criterion, find $\sigma = \sqrt{npq} = \sqrt{8(.6)} = \sqrt{4.8} = 2.191$, so $\mu \pm 3\sigma = 8 \pm 3(2.191) = 8 \pm 6.573$, or 1.427 and 14.573. Since these numbers are between zero and $n = 20$, we can use the Normal distribution.

We transform $x$ into $z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{npq}} = \frac{x - 8}{2.191}$.

$$P(5 \le x \le 15) \approx P\left(\frac{5 - 8}{2.191} \le z \le \frac{15 - 8}{2.191}\right) = P(-1.37 \le z \le 3.19)$$

$$= P(-1.37 \le z \le 0) + P(0 \le z \le 3.19) = .4147 + .4993 = .9140$$

b. With Continuity Correction.

There is a basic incompatibility between the Normal distribution, which is continuous, and the Binomial distribution, which is discrete. In a continuous distribution, the probability of any single value, like $P(5)$, is zero. However, we can approximate the Binomial probability $P(5)$ by the Normal probability $P(4.5 \le x \le 5.5)$. If we continue in this vein, we will expand each interval by lowering its lower limit by 0.5 and raising its upper limit by 0.5.
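The two rules of thumb and the widening of the interval by 0.5 on each side can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the function names (`normal_cdf`, `rules_of_thumb`, `normal_approx_binomial`, `exact_binomial`) are hypothetical, and the Normal table is replaced by the error function from the standard library, so $z$ is not rounded to two decimals.

```python
from math import comb, erf, sqrt  # math.comb needs Python 3.8+


def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def rules_of_thumb(n, p):
    """Return (traditional_ok, three_sigma_ok) for a Binomial(n, p)."""
    q = 1.0 - p
    mu, sigma = n * p, sqrt(n * p * q)
    traditional_ok = n * p >= 5 and n * q >= 5          # np >= 5 and nq >= 5
    three_sigma_ok = 0 < mu - 3 * sigma and mu + 3 * sigma < n
    return traditional_ok, three_sigma_ok


def normal_approx_binomial(lo, hi, n, p, correction=False):
    """Normal approximation to P(lo <= x <= hi) for a Binomial(n, p)."""
    mu, sigma = n * p, sqrt(n * p * (1.0 - p))
    shift = 0.5 if correction else 0.0                  # continuity correction
    z_lo = (lo - shift - mu) / sigma
    z_hi = (hi + shift - mu) / sigma
    return normal_cdf(z_hi) - normal_cdf(z_lo)


def exact_binomial(lo, hi, n, p):
    """Exact P(lo <= x <= hi), summing individual Binomial probabilities."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k)
               for k in range(lo, hi + 1))


if __name__ == "__main__":
    print(rules_of_thumb(20, 0.4))
    print(normal_approx_binomial(5, 15, 20, 0.4))
    print(normal_approx_binomial(5, 15, 20, 0.4, correction=True))
    print(exact_binomial(5, 15, 20, 0.4))
```

Because the sketch does not round $z$, its results can differ from table-based figures in the third or fourth decimal place.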
Applied to our example, the Binomial problem $P(5 \le x \le 15)$ is approximated by the Normal problem $P(4.5 \le x \le 15.5)$. If we do this,

$$P(4.5 \le x \le 15.5) = P\left(\frac{4.5 - 8}{2.191} \le z \le \frac{15.5 - 8}{2.191}\right) = P(-1.60 \le z \le 3.42)$$

$$= P(-1.60 \le z \le 0) + P(0 \le z \le 3.42) = .4452 + .4997 = .9449$$

If $n = 20$ and $p = .4$, the Binomial table gives us $P(5 \le x \le 15) = P(x \le 15) - P(x \le 4) = .99968 - .05095 = .94873$, so that our error with the continuity correction was about 0.4%, compared with almost 4% without it.

As $n$ gets larger, individual probabilities become quite small and the continuity correction becomes negligible. The only rule of thumb that I have seen on this is that one should always use the correction if $npq < 9$.

c. Extensions

The Binomial distribution is often expressed as the probabilities of a proportion of successes $\bar{p} = \frac{x}{n}$, when the mean is $\mu_{\bar{p}} = p$ and the standard deviation is $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}}$. This means that for a Binomial distribution with $n = 20$ and $p = .4$, we have $\mu_{\bar{p}} = .4$, $\sigma_{\bar{p}} = \sqrt{\frac{.4(.6)}{20}} = 0.1095445$ and $z = \frac{\bar{p} - p}{\sqrt{pq/n}} = \frac{\bar{p} - .4}{0.1095445}$.

Our Binomial problem $P(5 \le x \le 15)$ becomes the Normal problem

$$P(.25 \le \bar{p} \le .75) \approx P\left(\frac{.25 - .4}{0.1095445} \le z \le \frac{.75 - .4}{0.1095445}\right) = P(-1.37 \le z \le 3.20) = .4147 + .4993 = .9140.$$

If we insist on a continuity correction (and we should), we expand the interval by $\frac{0.5}{n} = \frac{0.5}{20} = .025$ at each end. The expression becomes the Normal problem

$$P(.225 \le \bar{p} \le .775) = P\left(\frac{.225 - .4}{0.1095445} \le z \le \frac{.775 - .4}{0.1095445}\right) = P(-1.60 \le z \le 3.42)$$

$$= P(-1.60 \le z \le 0) + P(0 \le z \le 3.42) = .4452 + .4997 = .9449.$$

If we have a Hypergeometric problem, we can exploit the similarity of the Hypergeometric distribution to the Binomial distribution by using our original value of $x$ with $p = \frac{M}{N}$, $\mu = np$ and $\sigma = \sqrt{npq\,\frac{N - n}{N - 1}}$. A great advantage is that we can also work with $\bar{p} = \frac{x}{n}$, with a mean of $p$ and a standard deviation of $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}\,\frac{N - n}{N - 1}}$. But note that if $N > 20n$, the finite population correction $\frac{N - n}{N - 1}$ would be likely to have little effect, so that we may be better off using the Binomial distribution or the Normal approximation to it.

5. Normal approximation to the Poisson Distribution.
If an appropriate Poisson table is not available, it is common to use the Normal distribution in place of the Poisson with $\mu = m$ and $\sigma = \sqrt{m}$, where $m$ is the parameter (mean and variance) of the Poisson distribution. The rule of thumb is that this works fairly well if $m \ge 25$. For example, let us find $P(5 \le x \le 15)$ when $m = 25$. Obviously, we are on the edge of the acceptable values for the parameter, but let's try.

We transform $x$ into $z = \frac{x - \mu}{\sigma} = \frac{x - m}{\sqrt{m}} = \frac{x - 25}{5}$.

Without the continuity correction,

$$P(5 \le x \le 15) \approx P\left(\frac{5 - 25}{5} \le z \le \frac{15 - 25}{5}\right) = P(-4.00 \le z \le -2.00)$$

$$= P(-4.00 \le z \le 0) - P(-2.00 \le z \le 0) = .5000 - .4772 = .0228.$$

With the continuity correction we will expand each interval by lowering its lower limit by 0.5 and raising its upper limit by 0.5. So the Poisson problem $P(5 \le x \le 15)$ is approximated by the Normal problem $P(4.5 \le x \le 15.5)$. If we do this we get

$$P(4.5 \le x \le 15.5) = P\left(\frac{4.5 - 25}{5} \le z \le \frac{15.5 - 25}{5}\right) = P(-4.10 \le z \le -1.90)$$

$$= P(-4.10 \le z \le 0) - P(-1.90 \le z \le 0) = .5000 - .4713 = .0287.$$

The Poisson table with $m = 25$ gives us $P(5 \le x \le 15) = P(x \le 15) - P(x \le 4) = .02229 - .00000 = .02229$, so that our error without the continuity correction was about 2% and with it was much larger, about 29%. It seems that the continuity correction works badly for very high or low values of $x$.
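The Poisson comparison above can be reproduced without tables. This is a minimal Python sketch under the same assumptions as the text (Normal with mean $m$ and standard deviation $\sqrt{m}$); the function names are hypothetical, and because the Normal probabilities come from the error function rather than a table, the far-left tail is not rounded to zero.

```python
from math import erf, exp, sqrt


def normal_cdf(z):
    """P(Z <= z) for a standard Normal variable, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def poisson_cdf(k, m):
    """Exact P(x <= k) for a Poisson variable with mean m."""
    if k < 0:
        return 0.0
    term = exp(-m)          # P(x = 0)
    total = term
    for i in range(1, k + 1):
        term *= m / i       # P(x = i) from P(x = i - 1)
        total += term
    return total


def normal_approx_poisson(lo, hi, m, correction=False):
    """Normal approximation to P(lo <= x <= hi) for a Poisson with mean m."""
    sigma = sqrt(m)
    shift = 0.5 if correction else 0.0   # continuity correction
    z_lo = (lo - shift - m) / sigma
    z_hi = (hi + shift - m) / sigma
    return normal_cdf(z_hi) - normal_cdf(z_lo)


if __name__ == "__main__":
    m = 25
    exact = poisson_cdf(15, m) - poisson_cdf(4, m)
    plain = normal_approx_poisson(5, 15, m)
    corrected = normal_approx_poisson(5, 15, m, correction=True)
    for label, value in (("plain", plain), ("corrected", corrected)):
        rel_error = abs(value - exact) / exact
        print(f"{label}: {value:.5f}  relative error {rel_error:.1%}")
    print(f"exact: {exact:.5f}")
```

Running the same functions with an interval nearer the mean is one way to explore the closing remark that the correction behaves badly only for very high or low values of $x$.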