Chapter 6 Continuous Probability Distributions
Continuous Random Variables
In the previous chapter we examined discrete probability distributions where the RV X was allowed
only to take on a limited number of values. It can be the case that the RV can take on an unlimited
(infinite) number of values. Suppose the RV is
X = Annual snowfall in Flagstaff (in inches).
Let's consider some potential values X might take on:
X=123.1234567123445…
X=123.1234567123444…
X=123.1234567123443…
X=123.1234567123442…
where … represents an infinite string of digits I am too lazy to write out. There are an infinite
number of possible values for snowfall between 123 inches and 124 inches. Don't object that you
can't measure snowfall this precisely; that is a limitation of the measuring equipment. Snowfall
can physically take on any value and is not limited by the precision of the measuring
equipment.
It should become immediately obvious that we can't treat this sort of variable in the same way we
treated the discrete case. Suppose we want to find P(123 < X < 124), the probability that the
annual snowfall will be between 123 inches and 124 inches. Using the rules for a discrete
probability distribution we would have to add together all the individual probabilities between
123 and 124 inches. But there are an infinite number of these and we can't possibly add them all
up. In addition there is the problem of finding the individual probabilities. What would be a
reasonable value for P(X = 123.12342432...)? Nobody knows these probabilities and no one
wants to find out. Mathematically this probability is treated as 0.
In the continuous case, such as the snowfall example, we are not interested in the probability that
snowfall takes on a particular value. We will be interested in finding the probability that the RV
takes on a value in a certain range. For example the probability that snowfall will be in the range of
100 to 120 inches might be of great interest to the managers of Snowbowl.
We may also treat certain discrete distributions as continuous. The probability distribution of
profits for some large firm, like Exxon, is a discrete distribution. The probability that Exxon's
profits would be exactly $10,034,666,123.04 is tiny but not zero. Exxon's profits can take on only a finite
number of values, a very large number of values but still finite. But no one, including Exxon, is
interested in finding the probability of Exxon's profits being exactly $10,034,666,123.04.
A number of people would be interested in finding P($10 billion < X < $12 billion). This is a discrete
probability distribution that we would find convenient to treat as continuous.
The binomial distribution is one which we often find useful to treat as continuous. If 1200 voters
are asked if they prefer candidate A or candidate B, there are 1201 possible outcomes for the number
who prefer A (don't forget 0). We could add up the probabilities of the individual outcomes, but we
would really like to do something easier if possible. It will often be possible to treat the binomial
as continuous even though it really is discrete.
The Uniform Probability Distribution
The uniform probability distribution is an excellent tool to introduce the mechanics of continuous
probability distributions. The uniform distribution has the graph shown in Figure 1.
Figure 1. The uniform distribution (vertical axis: P(X); horizontal axis: X)
The uniform distribution has the following interpretation. The RV X can take on any value
between 0 and 1. The probability that X will lie between two values, say a and b, is equal to the
area under the curve between a and b. Expressed mathematically this is
P(a < X < b) = the area under the curve from a to b
First let's find the probability that X lies between 0 and 1. Because the uniform distribution
is a rectangle, finding the area is easy. It is
Area = height times width
For the uniform distribution
height = 1
width = 1
area = height times width = (1)(1) = 1
so the probability that X will take on a value between 0 and 1 is one. Since the height of the curve
anywhere outside 0 and 1 is zero, the probability that X will take on a value outside 0 and 1 is zero.
One rule for a continuous distribution is
Total Area Under the Distribution = 1
Now let's find P(X < 0.5). It seems reasonable that this probability should be 50%: because X can
take on any value between 0 and 1, there should be a 50% probability that X will be to the left of 0.5
and a 50% probability that X will be to the right. The area in question is shown in Fig. 2.
Figure 2. P(X<0.5) = area under the curve to the left of 0.5
Using the formula for the area
P(X < 0.5) = height × width = (1)(0.5) = 0.5
and the probability is 0.5 as we thought.
We can use Fig. 3 to find P(0.3 < X < 0.6).
Figure 3. P(0.3 < X < 0.6) = height × width = (1)(0.6 - 0.3) = 0.3
P(0.3 < X < 0.6) = height × width = (1)(0.6 - 0.3) = 0.3
You might note that we could also find the probability by using the following operation
P(a < X < b) = P(X < b) - P(X < a)
So that
P(0.3 < X < 0.6) = P(X < 0.6) - P(X < 0.3)
P(0.3 < X < 0.6) = 0.6 - 0.3 = 0.3
and we get the same answer either way. This occurs because we find the area to the left of b, then
subtract the area to the left of a, and what is left over is the area between a and b. We will find
that we have to do this for most continuous probability distributions.
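The area calculations for the uniform distribution can be sketched in a few lines of Python. This is only an illustration: the function names uniform_cdf and uniform_between are invented for this sketch and do not come from any library.

```python
def uniform_cdf(x, low=0.0, high=1.0):
    """Area under the uniform curve to the left of x, i.e. P(X < x)."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    height = 1.0 / (high - low)   # chosen so that the total area equals 1
    width = x - low
    return height * width


def uniform_between(a, b, low=0.0, high=1.0):
    """P(a < X < b): area to the left of b minus area to the left of a."""
    return uniform_cdf(b, low, high) - uniform_cdf(a, low, high)


print(uniform_cdf(0.5))                      # P(X < 0.5)
print(round(uniform_between(0.3, 0.6), 4))   # P(0.3 < X < 0.6)
```

The same "area to the left of b minus area to the left of a" pattern carries over to any other continuous distribution once a function for the left-hand area is available.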
The Standard Normal Distribution
The standard normal distribution is the most frequently used distribution in
all of statistics. The standard normal, or z-distribution, is the normal distribution with μ = 0 and σ = 1.
The standard normal distribution has the graph shown in Fig. 4.
Figure 4. The normal distribution
The area under the entire curve from -∞ to +∞ is equal to one. Using the empirical rule we can
also determine some probabilities of this distribution. Recall that the empirical rule suggested that
approximately 68% of a distribution would lie in (μ - σ, μ + σ). In the context of a continuous
distribution we would say that approximately 68% of the area lies in (μ - σ, μ + σ). Also, because
the standard normal has μ = 0 and σ = 1, this means that approximately 68% of the area is between
-1 and +1. The previous section showed that areas were equivalent to probabilities for continuous
distributions. That is,
P(-1 < Z < 1) = area under the curve from -1 to 1
What we would like to do is to be able to find the area under the curve for any values of z, not just
the areas we can find using the empirical rule. We would like to be able to find
P(-1.33 < Z < 0.55), for example. You will not be able to perform simple mathematical operations (such
as multiplying height times width) to find the areas as you did in the previous section. The area
under the normal curve has no simple closed-form formula. Good approximations to many of these
areas have been found and put in tables. Such a table has been provided to you.
If we want to find P(Z < 1.00) we will have to find the area under the curve to the left of Z=1.00.
That is the shaded area in the graph in Fig. 5. To find the area we use the standard normal Z-table.
A portion of the table is shown in Table 1.
Figure 5. P(Z < 1.00) is the area to the left of Z=1.0
Table 1. A portion of the standard normal distribution table
The area under the curve to the left of Z=1.00 can be found by finding where the row for Z=1.0 and
the column labeled 0.00 intersect. That value is 0.8413. This gives
P(Z < 1.00) = 0.8413
To find P(Z < 1.03) look in the row for Z = 1.0 and the column labeled 0.03. You will find
P(Z < 1.03) = 0.8485.
To find P(Z < 0.52) look in the row for Z = 0.5 and the column labeled 0.02. This gives
P(Z < 0.52) = 0.6985
Using the table in the same way, we find the following values
P(Z < 0.02) = 0.5080
P(Z < 0.33) = 0.6293
P(Z < 1.11) = 0.8665
Keep in mind that the tables give the area to the left of a point. If you need the area to the right
you must use the complement. These are complements because if it's not to the left, it must be to
the right. So
P(Z > 1.00) = 1 - P(Z < 1.00) = 1 - 0.8413 = 0.1587
P(Z > 1.03) = 1 - P(Z < 1.03) = 1 - 0.8485 = 0.1515
P(Z > 0.33) = 1 - P(Z < 0.33) = 1 - 0.6293 = 0.3707
P(Z > 0.02) = 1 - P(Z < 0.02) = 1 - 0.5080 = 0.4920
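The table lookups and complements above can be cross-checked in software. The sketch below assumes Python 3.8 or later, whose standard library provides statistics.NormalDist; the helper names area_left and area_right are invented for this illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # the standard normal distribution


def area_left(z):
    """P(Z < z): the quantity the Z-table reports."""
    return Z.cdf(z)


def area_right(z):
    """P(Z > z), found with the complement rule 1 - P(Z < z)."""
    return 1.0 - area_left(z)


for z in (1.00, 1.03, 0.33, 0.02):
    print(f"P(Z < {z:.2f}) = {area_left(z):.4f}   P(Z > {z:.2f}) = {area_right(z):.4f}")
```

The printed values should agree with the table to four decimal places, apart from occasional differences in the last digit caused by rounding.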
Next we want to find the probability that Z will lie between two values, say a and b, which is
P(a < Z < b). First we want to bring out once again a distinction between discrete and continuous
distributions. For both distributions, discrete and continuous, the following is true:
P(Z ≤ b) = P(Z < b) + P(Z = b)
For the continuous distribution the probability that the random variable will take on any particular
value is zero (see the snowfall example above). This means that P(Z = b) = 0, so
P(Z ≤ b) = P(Z < b) + P(Z = b) = P(Z < b)
and we can be sloppy with the <, ≤, >, ≥ relations. So we will usually write
P(a ≤ Z ≤ b) = P(a < Z < b)
But remember you can't do this with discrete distributions.
Now to find the probability that the RV Z will occur between two points a and b we need to find
the area under the curve from a to b. We will do that by first finding the area to the left of b, then
we will subtract the area to the left of a from the area to the left of b. What will be left is the area
between a and b. This gives
P(a < Z < b) = P(Z < b) - P(Z < a)
Figure 6. P(-1.0 < Z < 1.0) is the area to the left of Z = 1 minus the area to the left of Z = -1.
Table 2. Another portion of the Z distribution
Example 1
Find P(-1 < Z < 1)
Use Table 1 and Table 2
From Table 1, P(Z < 1.00) = 0.8413
From Table 2, P(Z < -1.00) = 0.1587
Then P(-1 < Z < 1) = P(Z < 1) - P(Z < -1) = 0.8413 - 0.1587 = 0.6826, or about 68%, as the
empirical rule would suggest.
Example 2
P(-2 < Z < 2) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544
P(1.04 < Z < 2.36) = P(Z < 2.36) - P(Z < 1.04) = 0.9909 - 0.8508 = 0.1401
P(-2.22 < Z < -0.54) = P(Z < -0.54) - P(Z < -2.22) = 0.2946 - 0.0132 = 0.2814
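The subtraction used in Examples 1 and 2 can be written as a small function. This is a sketch that assumes Python 3.8+ and its statistics.NormalDist class; the name prob_between is made up for the illustration.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1


def prob_between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)


for a, b in [(-1, 1), (-2, 2), (1.04, 2.36), (-2.22, -0.54)]:
    print(f"P({a} < Z < {b}) = {prob_between(a, b):.4f}")
```

Because the table entries are rounded to four decimals before subtracting, a software result can differ from the hand calculation in the last digit (for instance 0.6827 rather than 0.6826 for the first interval).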
Note what we have been doing. Given a value of z we found a probability. Now we are going to
reverse this: we will be given a probability and then find the value of z.
Suppose we want to find z_α such that
P(Z < z_α) = α
for example. The symbol z_α may seem strange, but it will be useful later. We have to put some
symbol there and this will serve as well as any. If the problem is to find a value of z such that
25% of the area is in the tail, we will refer to that value as z_α = z_0.25.
Figure 7. Finding a value of z when you have been given a probability
Probabilities are located inside the table. So to solve P(Z < z_α) = 0.20 we must look inside the
table for a value of 0.2000. Unfortunately that value is not present. Take the value closest to the
desired value. The closest we can find is 0.2005. Note that value has a z-value of -0.84. So
P(Z < z_α) = 0.20
P(Z < -0.84) = 0.2005
P(Z < -0.84) ≈ 0.20
The symbol ≈ means approximately equal, which we will take as being close enough. That is,
0.2005 is as close as we can get using these tables. So find the closest probability value in the
table and use it.
Suppose the problem had been to find
P(Z > z_α) = 0.20
In that case we want a value of z such that 20% of the area lies to the right. But the table only
contains areas to the left of a z value. To find 20% to the right we must use 80% to the left. We must
restate the problem to be
P(Z < z_α) = 0.80
So
P(Z > z_α) = 0.20
P(Z < z_α) = 0.80
P(Z < 0.84) = 0.7995
P(Z < 0.84) ≈ 0.80
P(Z > 0.84) ≈ 0.20
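Searching the body of the table for a probability is the inverse of the usual lookup. As a sketch (assuming Python 3.8+; the function names z_left_tail and z_right_tail are hypothetical), the inv_cdf method of statistics.NormalDist does this search without rounding to the nearest table entry:

```python
from statistics import NormalDist

Z = NormalDist()


def z_left_tail(alpha):
    """The z value with P(Z < z) = alpha."""
    return Z.inv_cdf(alpha)


def z_right_tail(alpha):
    """The z value with P(Z > z) = alpha, restated as P(Z < z) = 1 - alpha."""
    return Z.inv_cdf(1.0 - alpha)


print(round(z_left_tail(0.20), 2))    # compare with the table answer -0.84
print(round(z_right_tail(0.20), 2))   # compare with the table answer 0.84
```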
Another operation we will perform frequently is to find values of z such that
P(-z_(α/2) < Z < z_(α/2)) = 1 - α
This means we want to find two values of z, so that half of α lies in each tail.
Figure 8. Finding values of z when half of α lies in each tail.
If the problem is to find
P(-z_(α/2) < Z < z_(α/2)) = 0.90
that means we wish to find values of z such that 5% of the area is in each tail and 90% of the area is
between the two values. For the lower tail we have
P(Z < -z_(α/2)) = 0.05
P(Z < -1.65) = 0.0495
P(Z < -1.65) ≈ 0.05
and for the upper tail
P(Z > z_(α/2)) = 0.05
P(Z < z_(α/2)) = 0.95
P(Z < 1.65) = 0.9505
P(Z < 1.65) ≈ 0.95
There are a couple of points to note here. For the upper tail we want the 5% of the area to the right.
The tables only give areas to the left, so to get the 5% of the area to the right we must use the 95% of
the area to the left. Next, the target value in P(Z < -z_0.05) = 0.05 falls exactly halfway between
P(Z < -1.64) = 0.0505 and P(Z < -1.65) = 0.0495. Both are equally distant from 0.05. Our previous
advice was to take the closest value. In the case of equal distances take the z value that is larger in
absolute value. Another way of looking at this is that in the case of a tie, move out into the tail.
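The two-tailed lookup can also be sketched in code. The function name two_tailed_z is invented for this illustration, and Python 3.8+ is assumed. Note that software does not need the tie-breaking rule: it returns a value between the two table entries 1.64 and 1.65.

```python
from statistics import NormalDist

Z = NormalDist()


def two_tailed_z(middle_area):
    """Return (-z, z) such that P(-z < Z < z) = middle_area."""
    alpha = 1.0 - middle_area          # total area in the two tails
    z = Z.inv_cdf(1.0 - alpha / 2.0)   # leaves alpha/2 in the upper tail
    return -z, z


for middle in (0.80, 0.90, 0.95, 0.99):
    lower, upper = two_tailed_z(middle)
    print(f"middle area {middle:.2f}: z from {lower:.3f} to {upper:.3f}")
```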
Using the Standard Normal
Only by a rare accident would we have a distribution with μ = 0 and σ = 1. However, we can use
the operations above with any normal distribution by using Z-scores to transform from a normal
distribution (with RV X) to the standard normal (with RV Z) and back again.
Example 3
Suppose that we know that the distribution of gasoline usage in a certain type of automobile is
normally distributed with
μ = 30 mpg
σ = 2 mpg
The random variable here is X = gasoline usage (in miles per gallon). We want to find the
proportion of automobiles with gasoline usage in the range of 28 to 32 miles per gallon,
P(28 < X < 32)
Note that this range is one standard deviation on each side of the mean, and by using the empirical
rule we would expect the probability that X would be in this range to be about 68%. We will use
Z-scores to transform from values in terms of X to values in terms of Z. The formula for the Z-score
is
z = (x - μ) / σ
For the upper value of X this gives
z = (32 - 30) / 2 = 1
and for the lower value of X
z = (28 - 30) / 2 = -1
The Z-scores tell us that X=32 is 1 standard deviation to the right of the mean, and that X=28 is 1
standard deviation to the left of the mean. You could have figured this out in your head, but this
shows what Z-scores do: they transform X values into numbers of standard deviations from the mean.
All normal distributions are the same in this sense. The proportion of the area within 1 standard
deviation of the mean is the same for all of them.
We have
P(28 < X < 32) = P(-1 < Z < 1)
P(-1 < Z < 1) = P(Z < 1) - P(Z < -1)
P(-1 < Z < 1) = 0.8413 - 0.1587
P(-1 < Z < 1) = 0.6826
so P(28 < X < 32) = 0.6826 and the probability of finding an automobile of this type with fuel
usage between 28 and 32 miles per gallon is 0.6826. Putting it in different terms, about 68% of the
automobiles of this type have fuel usage values in this range. Next, let's look at a problem where we
couldn't use the empirical rule. Suppose we want to find the proportion of automobiles that have
gasoline usage in the range of 29 to 33 miles per gallon.
P(29 < X < 33) = ?
The Z-scores give
z = (x - μ) / σ = (33 - 30) / 2 = 1.5 and z = (x - μ) / σ = (29 - 30) / 2 = -0.5
So
P(29 < X < 33) = P(-0.5 < Z < 1.5)
P(-0.5 < Z < 1.5) = P(Z < 1.5) - P(Z < -0.5)
P(-0.5 < Z < 1.5) = 0.9332 - 0.3085 = 0.6247
P(29 < X < 33) = 0.6247
and we conclude that 62.47% of the automobiles have gasoline usage in the range of 29 to 33 miles
per gallon.
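The two steps of Example 3, converting X values to Z-scores and then taking a difference of left-hand areas, can be sketched as follows. The helper names z_score and normal_between are hypothetical, and Python 3.8+ is assumed for statistics.NormalDist.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def normal_between(low, high, mu, sigma):
    """P(low < X < high) for a normal X, computed through Z-scores."""
    Z = NormalDist()   # standard normal
    return Z.cdf(z_score(high, mu, sigma)) - Z.cdf(z_score(low, mu, sigma))


print(z_score(33, 30, 2), z_score(29, 30, 2))    # 1.5 and -0.5
print(round(normal_between(29, 33, 30, 2), 4))   # compare with 0.6247 from the table
```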
Example 4
Find the proportion of automobiles that get in excess of 32.5 miles per gallon,
P(X > 32.5).
z = (x - μ) / σ = (32.5 - 30) / 2 = 1.25
P(X > 32.5) = P(Z > 1.25) = 1 - P(Z < 1.25)
P(Z > 1.25) = 1 - 0.8944 = 0.1056
P(X > 32.5) = 0.1056
So about 11% of the automobiles would get gas mileage in excess of 32.5 miles per gallon.
Example 5
A company that makes light bulbs advertises that their bulbs will last for 700 hours before burning
out. The mean lifetime of the bulbs is 800 hours and the standard deviation is 150 hours. Assuming
the lifetimes are normally distributed, what proportion of the bulbs will burn out before the
advertised value?
We need to find
P(X < 700) = ?
where μ = 800 and σ = 150, so
z = (x - μ) / σ = (700 - 800) / 150 = -0.67
P(X < 700) = P(Z < -0.67) = 0.2514
which means that about 25% of the bulbs will burn out prior to the 700 hour advertised value.
Example 6
A company gives prospective employees an aptitude test. The mean test score is 50 and the test
scores have a standard deviation of 5. Test scores are normally distributed. The company only
wants to make job offers to those who score in the upper 15% on the test. What should the cutoff
score be for the test? Previously we had been given a value for X. We used this value to find a
value of Z, which in turn led us to a probability value. In this case we reverse things. We are given
a probability (15%). We will use this to find the corresponding Z-value z_α = z_0.15 that we will then
use to find X, the score that only 15% of the applicants exceed.
First we need to find the z-value.
P(Z > z_α) = P(Z > z_0.15) = 0.15
P(Z < z_0.15) = 0.85
P(Z < 1.04) = 0.8508
P(Z < 1.04) ≈ 0.85
Now consider the Z-score formula
z = (x - μ) / σ
We know z, μ, and σ so we can find x. It is convenient to rewrite the Z-score formula as
x = μ + zσ
so
x = 50 + (1.04)(5) = 55.2
This gives the cutoff score. Only 15% of the applicants' scores will exceed this value.
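The reverse calculation in Example 6 (probability to z, then z to x) can be sketched like this. The function name cutoff_score is invented for the illustration and Python 3.8+ is assumed. Because software does not round z to two decimals, the result differs slightly from the hand calculation.

```python
from statistics import NormalDist


def cutoff_score(top_fraction, mu, sigma):
    """Score x with P(X > x) = top_fraction for a normal X."""
    # An upper-tail area of top_fraction is a left-hand area of 1 - top_fraction.
    z = NormalDist().inv_cdf(1.0 - top_fraction)
    return mu + z * sigma   # x = mu + z * sigma


print(round(cutoff_score(0.15, 50, 5), 2))
```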
Example 7
A machine is filling bottles of soda. The machine cannot always fill the bottles with exactly the
same amount of soda, but fills the bottles with a standard deviation of 0.1 oz. While the standard
deviation can't be controlled, the average fill can be adjusted to any level desired. In order not to
cheat the customers, management decides to set the average fill, μ, so that at least 95% of the bottles
will contain 10 or more oz. of soda. What value should the mean be set to?
Again look at the Z-score formula,
z = (x - μ) / σ
We know σ = 0.1 oz. and x = 10 oz. The
95% probability will give us the value for z, so we have enough information to find μ.
To find z we know
P(Z > z) = 0.95
P(Z < z) = 0.05
P(Z < -1.65) = 0.0495
P(Z < -1.65) ≈ 0.05
The Z-score formula can be rewritten as
μ = x - zσ
so
μ = x - zσ = 10 - (-1.65)(0.1) = 10.165
Setting the machine to have an average fill of 10.165 oz. ensures that about 95% of the bottles will
contain 10 oz. or more of soda.
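Example 7 solves the Z-score formula for the mean instead of for x. A minimal sketch, assuming normally distributed fill amounts and Python 3.8+ (the function name required_mean is hypothetical):

```python
from statistics import NormalDist


def required_mean(x, sigma, prob_at_least):
    """Mean mu such that P(X >= x) = prob_at_least for a normal X."""
    # P(X >= x) = prob_at_least means P(Z < z) = 1 - prob_at_least, so z is negative here.
    z = NormalDist().inv_cdf(1.0 - prob_at_least)
    return x - z * sigma   # mu = x - z * sigma


mu = required_mean(10, 0.1, 0.95)
print(round(mu, 3))
# Check: the share of bottles holding at least 10 oz. at this setting.
print(round(1.0 - NormalDist(mu, 0.1).cdf(10), 4))
```

The unrounded z gives a setting just under the 10.165 oz. found by hand, because the table value -1.65 lies slightly further out in the tail.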
The Normal Approximation to the Binomial
Fig. 9 is a plot of the binomial distribution for n=20 and p=0.5.
Figure 9. A plot of the binomial distribution for n=20 and p=0.5
Note that this distribution looks like a normal distribution, even though it is discrete. We will use
the normal distribution as an approximation to the binomial in many cases.
Rule: The binomial distribution can be approximated by the normal if np ≥ 5 and nq ≥ 5, where q = 1 - p.
For the binomial problem above with n=20 and p=0.5 we have np=20(.5) = 10 and nq=20(.5)=10
and the approximation should be a good one according to the rule.
The binomial formula gives P(X = 10) = C(20,10)(0.5)^10(0.5)^10 = 0.1762. We cannot use the normal
distribution to find P(X=10) precisely. Recall that the probability that a continuous distribution
will take on any specific value is zero. The probabilities for continuous distributions must be
calculated over a range. Let's try approximating this by calculating P(9.5 < X < 10.5). The mean
and standard deviation of this binomial distribution are given by
μ = np = (20)(0.5) = 10
σ² = npq = (20)(0.5)(0.5) = 5
σ = √5 = 2.236
Using these values for z-scores gives
z = (x - μ) / σ = (9.5 - 10) / 2.236 = -0.224
z = (x - μ) / σ = (10.5 - 10) / 2.236 = 0.224
Then using the normal distribution
P(9.5 < X < 10.5) = P(-0.22 < Z < 0.22)
P(-0.22 < Z < 0.22) = P(Z < 0.22) - P(Z < -0.22)
P(-0.22 < Z < 0.22) = 0.5871 - 0.4129
P(9.5 < X < 10.5) = 0.1742
The binomial formula gives 0.1762 while the normal approximation gives 0.1742. The
approximation is very close and would be even better for larger n. Another thing that makes the
approximation attractive is that we can calculate the probabilities for values of X over an entire
range. If we were to find P(8 ≤ X ≤ 12) using the binomial formula we would have to evaluate the
formula for X=8, X=9, X=10, X=11, and X=12 and then add the probabilities. With the normal
approximation we can calculate this in one step. The mechanics for using the normal approximation
to the binomial distribution are shown in Table 3. Note that you must ensure that every value
included in the binomial problem is enclosed in the range used for the normal approximation.
Binomial problem      Normal approximation
P(X ≤ a)              P(X < a + 0.5)
P(X ≥ b)              P(X > b - 0.5)
P(a ≤ X ≤ b)          P(a - 0.5 < X < b + 0.5)
Table 3. Normal approximation to the binomial
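The half-unit adjustment in Table 3 can be checked numerically by comparing the exact binomial sum with the normal area. This is a sketch only: the function names are invented here, and Python 3.8+ is assumed for math.comb and statistics.NormalDist.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_exact(a, b, n, p):
    """Exact P(a <= X <= b): add the binomial formula over every value in the range."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))


def binom_normal_approx(a, b, n, p):
    """P(a <= X <= b) approximated by the normal area from a - 0.5 to b + 0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    normal = NormalDist(mu, sigma)
    return normal.cdf(b + 0.5) - normal.cdf(a - 0.5)


# P(X = 10) for n = 20, p = 0.5, treated as the range 10 <= X <= 10
print(round(binom_exact(10, 10, 20, 0.5), 4))
print(round(binom_normal_approx(10, 10, 20, 0.5), 4))
```

The approximate value printed here is a little different from the 0.1742 found by hand, because the hand calculation rounds z to 0.22 before using the table.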
Example 8
Suppose a coin is flipped n=1000 times. What is the probability that the number of heads will
be between 490 and 510 if the coin is a fair one (i.e. p=0.5)? Using the normal approximation we
want to compute P(489.5 < X < 510.5). The mean of the binomial distribution is
μ = np = (1000)(0.5) = 500. The variance is σ² = npq = (1000)(0.5)(0.5) = 250 so that the
standard deviation is σ = √250 = 15.81. Using these values for Z-scores gives
z = (x - μ) / σ = (510.5 - 500) / 15.81 = 0.66
z = (x - μ) / σ = (489.5 - 500) / 15.81 = -0.66
Then using the normal distribution
P(489.5 < X < 510.5) = P(-0.66 < Z < 0.66)
P(-0.66 < Z < 0.66) = P(Z < 0.66) - P(Z < -0.66)
P(-0.66 < Z < 0.66) = 0.7454 - 0.2546
P(-0.66 < Z < 0.66) = 0.4908
So there is a 49% probability that the number of heads that show will be between 490 and 510.
We could also calculate the probability that we would get 480 heads or fewer, P(X ≤ 480), which
the normal approximation treats as P(X < 480.5):
z = (x - μ) / σ = (480.5 - 500) / 15.81 = -1.23
P(X < 480.5) = P(Z < -1.23) = 0.1093
There is about an 11% probability that 480 or fewer heads would appear.
Example 9
Suppose an airline has planes that seat 100 people. The airline's problem is that on average 20%
of the people that book flights are no-shows. This means that on average 20 seats would be empty
and the airline would lose revenue. To avoid this sort of loss the airline decides to overbook its
flights. The policy is to sell up to 120 tickets per flight. We want to find the probability that at least
one passenger will be bumped on such a flight. Being bumped means that more than 100
passengers show up and those in excess of 100 will be forced to take a later flight (possibly at cost
to the airline).
Call it a success if a passenger shows up for the flight. Then we want to calculate P(X ≥ 101)
where n = 120 and p = 0.8. Then
μ = np = (120)(0.8) = 96
σ = √(npq) = √((120)(0.8)(0.2)) = 4.38
For the normal approximation we need
P(X ≥ 101) ≈ P(X > 100.5)
which leads to the Z-score of
z = (x - μ) / σ = (100.5 - 96) / 4.38 = 1.03
so
P(X > 100.5) = P(Z > 1.03) = 1 - P(Z < 1.03)
P(X > 100.5) = 1 - 0.8485 = 0.1515
thus P(X ≥ 101) ≈ 0.1515 and there is about a 15% chance that this policy will lead to at least one
passenger being bumped on a given flight. This may be a greater probability than the company would
want, but that is a management decision.
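For an upper-tail problem like the overbooking example, the exact binomial tail and its normal approximation can be compared directly. The sketch below assumes Python 3.8+; the function names and parameter names (capacity, tickets, p_show) are invented for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def prob_bumped_exact(capacity, tickets, p_show):
    """Exact binomial P(X > capacity), where X is the number of passengers who show up."""
    return sum(
        comb(tickets, k) * p_show**k * (1 - p_show)**(tickets - k)
        for k in range(capacity + 1, tickets + 1)
    )


def prob_bumped_approx(capacity, tickets, p_show):
    """Normal approximation: P(X >= capacity + 1) is treated as P(X > capacity + 0.5)."""
    mu = tickets * p_show
    sigma = sqrt(tickets * p_show * (1 - p_show))
    return 1.0 - NormalDist(mu, sigma).cdf(capacity + 0.5)


print(round(prob_bumped_exact(100, 120, 0.8), 4))
print(round(prob_bumped_approx(100, 120, 0.8), 4))
```

Changing the tickets argument shows how quickly the chance of bumping a passenger falls as fewer seats are oversold, which is the management trade-off the example describes.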
Problems
1. Find the following probabilities
a) P(Z < 1.66)
b) P(-1.33 < Z < -0.54)
c) P(Z < 0.39)
d) P(Z < 0.55)
e) P(2.33 < Z < 2.85)
f) P(-0.19 < Z < 0.23)
2. Find the following z values
a) P(Z < z) = 0.20
b) P(Z < z) = 0.05
c) P(Z > z) = 0.01
d) P(Z > z) = 0.10
e) P(Z < z) = 0.40
f) P(Z > z) = 0.005
3. Find the following z values
a) P(-z_(α/2) < Z < z_(α/2)) = 0.80
b) P(-z_(α/2) < Z < z_(α/2)) = 0.90
c) P(-z_(α/2) < Z < z_(α/2)) = 0.95
d) P(-z_(α/2) < Z < z_(α/2)) = 0.99
4. Suppose we are producing a critical bolt for a machine assembly. The assembly's specifications
require that the bolt have a diameter of 1000 ± 1 millimeters. Our bolt making machine produces
bolts which have an average diameter of 999.8 millimeters and a standard deviation of 1.5
millimeters. What percentage of these bolts meet the specifications of the assembly?
5. The Federal Government is thinking about instituting a new small business incentive program.
The program will give entrepreneurs grants for job training if they will form new businesses in
neighborhoods where at least 20% of the families have incomes less than $12500.00. Suppose that
a certain neighborhood has a mean family income of $14000.00 and a standard deviation of
incomes of $2000.00. Does this neighborhood qualify for the grant program?
6. Suppose that 10% of all market research surveys are returned. If a company sends out n=10,000
surveys find the probability that between 975 and 1010 of these surveys will be returned. Use the
normal approximation to the binomial.
7. Suppose NAU's basketball team has a 60% chance of winning a game. If the team plays 20 ball
games, what is the probability that the team will win between 10 and 13 games? Use the
normal approximation to the binomial.
Answers
1. a) 0.9515, b) 0.2028, c) 0.6517, d) 0.7088, e) 0.0077, f) 0.1663
2. a) -0.84, b) -1.65, c) 2.33, d) 1.28, e) -0.25, f) 2.58
3. a) ±1.28, b) ±1.65, c) ±1.96, d) ±2.58
4. P(999<X<1001) = 0.7881 - 0.2981 = 0.4900 so 49% of the bolts will meet the specifications.
5. P(X<12500) = P(Z<-0.75) = 0.2266, so about 23% of the families have incomes below $12500,
thus the neighborhood qualifies.
6. P(974.5<X<1010.5) = P(-0.85<Z<0.35) = 0.6368 - 0.1977 = 0.4391
7. Check to see if the normal approximation to the binomial is
good: np = 20(0.6) = 12 ≥ 5 and nq = 20(0.4) = 8 ≥ 5 so the normal approximation is good.
P(9.5<X<13.5) = P(-1.14<Z<0.68) = 0.7517 - 0.1271
P(9.5<X<13.5) = 0.6246.
Using Excel to find probabilities for the normal distribution
You can use Excel to find probability values for various probability distributions. The Excel
command to find the cumulative normal distribution (i.e. the area under the curve) for the normal
distribution is
NORMDIST(x, mean, standard deviation, TRUE)
where x is the value of the RV you are interested in, mean is the mean of the distribution, standard
deviation is the value of the standard deviation of the distribution, and TRUE tells Excel that
it is to calculate the cumulative normal distribution (again, the area under the curve to the left of x).
If you choose FALSE rather than TRUE, Excel will compute the normal probability density function
(the height of the curve itself).
The probability density function is the bell shaped curve.
The cumulative distribution function is the area under the bell shaped curve to the left of x.
So if you have a distribution with μ = 13.45 and σ = 3.23
you could find P(X < 4.47) using the Excel command NORMDIST(4.47,13.45,3.23,TRUE).
Probability formula    Result    Excel command
P(X<4.47)              0.0027    =NORMDIST(4.47,13.45,3.23,TRUE)
Specifically for the standard normal distribution function (the one with mean zero and standard
deviation one)
Probability formula          Result    Excel command
P(Z<1.0)                     0.8413    =NORMDIST(1.0,0,1,TRUE)
P(Z>1.0)=1-P(Z<1.0)          0.1587    =1-NORMDIST(1.0,0,1,TRUE)
P(-1<Z<1)=P(Z<1)-P(Z<-1)     0.6827    =NORMDIST(1,0,1,TRUE)-NORMDIST(-1,0,1,TRUE)
P(Z<-2.45)                   0.0071    =NORMDIST(-2.45,0,1,TRUE)
P(Z>-1.33)                   0.9082    =1-NORMDIST(-1.33,0,1,TRUE)
P(-2.45<Z<-1.33)             0.0846    =NORMDIST(-1.33,0,1,TRUE)-NORMDIST(-2.45,0,1,TRUE)
Values of the standard normal distribution function using Excel
The inverse problem.
The problem in the previous section required that we find the probability that the given random
variable X or Z would be in a certain range. Briefly put, given a value of X, what is the probability?
Symbolically
P(X < x) = ?
where x stands for the given value.
The inverse problem reverses this process. Briefly put, given the probability, what is the value of
X? Symbolically
P(X < ?) = some given value.
The Excel command used to compute this is
NORMINV(probability, mean, standard deviation)
So if you have a distribution with μ = 13.45 and σ = 3.23 you could find a value for X such that,
say, 33% of the area is to the left of the point and 67% of the area is to the right. Or
P(X < a) = 0.33
Probability formula    Result     Excel command
P(X<a) = 0.33          12.0291    =NORMINV(0.33,13.45,3.23)
So 33% of the area is to the left of 12.029 and 67% of the area is to the right.
Example 3 (above). Suppose that we know that the distribution of gasoline usage in a certain type
of automobile is normally distributed with μ = 30 mpg and σ = 2 mpg. Find the proportion of
automobiles with gasoline usage in the range of 28 to 32 mpg. Use Excel.
P(28<X<32)    0.6827    =NORMDIST(32,30,2,TRUE)-NORMDIST(28,30,2,TRUE)
Next find P(29 < X < 33)
P(29<X<33)    0.6247    =NORMDIST(33,30,2,TRUE)-NORMDIST(29,30,2,TRUE)
Example 4 (above). Find the proportion of automobiles that get in excess of 32.5 mpg.
P(X>32.5)     0.1056    =1-NORMDIST(32.5,30,2,TRUE)
Example 5 (above). A company that makes light bulbs advertises that the bulbs will last for 700
hours before burning out. The mean lifetime of the bulbs is 800 hours and the standard deviation is
150 hours. What proportion of the bulbs will burn out before the advertised value? Find P(X<700)
if the lifetimes are normally distributed.
P(X<700)    0.2525    =NORMDIST(700,800,150,TRUE)
Example 6 (above). A company gives prospective employees an aptitude test. The mean test score
is 50 with a standard deviation of 5. The test scores are normally distributed. The company only
wants to make job offers to those who score in the top 15% on the test. Find x such that
P(X > x) = 0.15, that is, P(X < x) = 0.85.
P(X<x) = 0.85    55.1822    =NORMINV(0.85,50,5)
Example 7 (above). Well, some problems just have to be worked by hand. You could use Excel to
get a value for z and to do the hand calculation. Here cell B1 is assumed to hold the value of z.
x=10, sigma = 0.1
P(Z>z) = 0.95, so P(Z<z) = 0.05    -1.64485    =NORMINV(0.05,0,1)
mean = x - z*sigma                 10.16449    =10-B1*(0.1)
Example 8 (above). The text uses the normal approximation to the binomial. Here we could use
Excel and avoid the approximation, because the value of n is too large to use the binomial formula
by hand but not so large as to cause Excel to fail.
Note carefully that P(490 ≤ X ≤ 510) = P(X ≤ 510) - P(X ≤ 489).
Solving the problem using Excel but avoiding the approximation
P(490<=X<=510)    0.49334    =BINOMDIST(510,1000,0.5,TRUE)-BINOMDIST(489,1000,0.5,TRUE)
Using the approximation
n=                        1000
p=                        0.5
Mean                      500         =1000*0.5
Sigma                     15.81139    =SQRT(1000*0.5*0.5)
Upper value of Z          0.664078    =(510.5-500)/15.81139
Lower value of Z          -0.66408    =(489.5-500)/15.81139
P(Low Z < Z < Upper Z)    0.49336     =NORMDIST(0.664078,0,1,TRUE)-NORMDIST(-0.66408,0,1,TRUE)
Example 9 (above). Suppose an airline has planes that seat 100 people. The airline's problem is
that on average 20% of the people that book flights are no-shows. This means that on average
20 seats would be empty and the airline would lose revenue. To avoid this sort of loss the airline
decides to overbook its flights. The policy is to sell up to 120 tickets per flight. We want to find the
probability that at least one passenger will be bumped on such a flight. Being bumped means that
more than 100 passengers show up and those in excess of 100 will be forced to take a later flight
(possibly at cost to the airline).
We want to find P(X > 100) if n = 120 and p = 0.8. Note that you must be very careful here.
P(X > 100) = P(X ≥ 101) = 1 - P(X ≤ 100).
Using Excel without the approximation
P(X>100) = 1-P(X<=100) =    0.151714    =1-BINOMDIST(100,120,0.8,TRUE)
Using the approximation
n=                 120
p=                 0.8
mean=              96          =120*0.8
sigma=             4.38178     =SQRT(120*0.8*0.2)
Z=                 1.02698     =(100.5-96)/4.38178
P(Z>1.02698) =     0.152215    =1-NORMDIST(1.02698,0,1,TRUE)
Problem 4 (above). Suppose we are producing a critical bolt for a machine assembly. The
assembly's specifications require that the bolt have a diameter of 1000 ± 1 millimeters. Our bolt
making machine produces bolts which have an average diameter of 999.8 millimeters and a
standard deviation of 1.5 millimeters. What percentage of these bolts meet the specifications of the
assembly?
Given μ = 999.8 and σ = 1.5, find P(999<X<1001)
Z = (X - μ) / σ = (999 - 999.8) / 1.5 = -0.53
Z = (X - μ) / σ = (1001 - 999.8) / 1.5 = 0.80
P(999<X<1001) = P(-0.53<Z<0.80) = 0.7881-0.2981 = 0.4900 so 49% of the parts meet the
specifications.
Using Excel
mean=              999.8
sigma=             1.5
P(999<X<1001)=     0.491243    =NORMDIST(1001,999.8,1.5,TRUE)-NORMDIST(999,999.8,1.5,TRUE)
Problem 5 (above). The Federal Government is thinking about instituting a new small business
incentive program. The program will give entrepreneurs grants for job training if they will form
new businesses in neighborhoods where at least 20% of the families have incomes less than
$12500.00. Suppose that a certain neighborhood has a mean family income of $14000.00 and a
standard deviation of incomes of $2000.00. Does this neighborhood qualify for the grant program?
Given μ = 14000 and σ = 2000, find P(X<12500)
Z = (X - μ) / σ = (12500 - 14000) / 2000 = -0.75
P(X<12500) = P(Z < -0.75) = 0.2266
so about 23% of the families have incomes below $12500, thus the neighborhood qualifies.
Using Excel
mean=            14000
sigma=           2000
P(X<12500) =     0.226627    =NORMDIST(12500,14000,2000,TRUE)
or we can find the level of income that marks off the lower 20% of the distribution
P(X < x)=0.20    12316.76    =NORMINV(0.2,14000,2000)
Problem 6 (above). Suppose that 10% of all market research surveys are returned. If a company
sends out n=10,000 surveys find the probability that between 975 and 1010 of these surveys will be
returned. Use the normal approximation to the binomial. Here we want to find
P(975 ≤ X ≤ 1010) = P(X ≤ 1010) - P(X ≤ 974) ≈ P(974.5 < X < 1010.5).
Excel cannot compute the binomial formula because the value of n is too large. We can use the
normal approximation to the binomial, however, and get the answer.
First note that Excel cannot solve this problem directly
P(X<=1010)=    #NUM!    =BINOMDIST(1010,10000,0.1,TRUE)
However we can use the normal approximation to the binomial
n=                                       10000
p=                                       0.1
mean=np=                                 1000      =10000*0.1
sigma=sqrt(npq)                          30        =SQRT(10000*0.1*0.9)
Upper Z =                                0.35
Lower Z =                                -0.85
P(974.5<X<1010.5) = P(-0.85<Z<0.35)      0.4392    =NORMDIST(0.35,0,1,TRUE)-NORMDIST(-0.85,0,1,TRUE)