Chapter 6
Continuous Probability Distributions

Continuous Random Variables

In the previous chapter we examined discrete probability distributions, where the random variable (RV) X could take on only a limited number of values. An RV can also take on an unlimited (infinite) number of values. Suppose the RV is X = annual snowfall in Flagstaff (in inches). Let's consider some potential values X might take on:

X = 123.1234567123445…
X = 123.1234567123444…
X = 123.1234567123443…
X = 123.1234567123442…

where … represents an infinite string of digits I am too lazy to write out. There are an infinite number of possible values for snowfall between 123 inches and 124 inches. Don't object that you can't measure snowfall this precisely; that is a problem with the measuring equipment. Snowfall physically can take on any value it wants to and is not limited by the precision of measuring equipment.

It should be immediately obvious that we can't treat this sort of variable in the same way we treated the discrete case. Suppose we want to find \(P(123 \le X \le 124)\), the probability that the annual snowfall will be between 123 inches and 124 inches. Using the rules for a discrete probability distribution, we would have to add together all the individual probabilities between 123 and 124 inches. But there are an infinite number of these and we can't possibly add them all up. In addition, there is the problem of finding the individual probabilities. What would be a reasonable value for \(P(X = 123.12342432\ldots)\)? Nobody knows these probabilities and no one wants to find out. Mathematically this probability is treated as 0.

In the continuous case, such as the snowfall example, we are not interested in the probability that snowfall takes on a particular value. We are interested in the probability that the RV takes on a value in a certain range. For example, the probability that snowfall will be in the range of 100 to 120 inches might be of great interest to the managers of Snowbowl.
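A compact way to state this idea, sketched under the assumption that X has a probability density function \(f(x)\) (a symbol this chapter does not otherwise use), is that probabilities are areas under \(f\):

\[
P(a \le X \le b) = \int_a^b f(x)\,dx, \qquad P(X = c) = \int_c^c f(x)\,dx = 0 .
\]

An interval of zero width has zero area, which is why a single snowfall value has probability 0 while a range such as 100 to 120 inches can have a positive probability.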
We may also treat certain discrete distributions as continuous. The probability distribution of profits for some large firm, like Exxon, is a discrete distribution. The probability that Exxon's profits would be equal to $10,034,666,123.04 is not zero. Exxon's profits can take on only a finite number of values: a very large number of values, but still finite. But no one, including Exxon, is interested in finding the probability of Exxon's profits being exactly $10,034,666,123.04. A number of people would be interested in finding \(P(10\ \text{billion} \le X \le 12\ \text{billion})\). This is a discrete probability distribution that we would find convenient to treat as continuous.

The binomial distribution is one which we often find useful to treat as continuous. If 1200 voters are asked if they prefer candidate A or candidate B, there are 1201 possible outcomes (don't forget 0). We could add up the probabilities of all the relevant outcomes, but we would really like to do something easier if possible. It will often be possible to treat the binomial as continuous even though it really is discrete.

The Uniform Probability Distribution

The uniform probability distribution is an excellent tool to introduce the mechanics of continuous probability distributions. The uniform distribution has the graph shown in Figure 1.

Figure 1. The uniform distribution (horizontal axis: X; vertical axis: P(X))

The uniform distribution has the following interpretation. The RV X can take on any value between 0 and 1. The probability that X will lie between two values, say a and b, is equal to the area under the curve between a and b. Expressed mathematically this is

\[ P(a \le X \le b) = \text{the area under the curve from } a \text{ to } b \]

First let's find the probability that X lies between 0 and 1. Because the uniform distribution is a rectangle, finding the area is easy.
The area is height times width:

Area = height × width

For the uniform distribution, height = 1 and width = 1, so area = (1)(1) = 1, and the probability that X will take on a value between 0 and 1 is one. Since the height of the curve anywhere outside 0 and 1 is zero, the probability that X will take on a value outside 0 and 1 is zero. One rule for a continuous distribution is

Total Area Under the Distribution = 1

Now let's find \(P(X < 0.5)\). It seems reasonable that this probability should be 50%: because X can take on any value between 0 and 1, there should be a 50% probability that X will be to the left of 0.5 and a 50% probability that X will be to the right. The area in question is shown in Fig. 2.

Figure 2. P(X<0.5) = area under the curve to the left of 0.5

Using the formula for the area,

\[ P(X < 0.5) = \text{height} \times \text{width} = (1)(0.5) = 0.5 \]

and the probability is 0.5 as we thought. We can use Fig. 3 to find \(P(0.3 < X < 0.6)\).

Figure 3. \(P(0.3 < X < 0.6)\)

\[ P(0.3 < X < 0.6) = \text{height} \times \text{width} = (1)(0.6 - 0.3) = 0.3 \]

You might note that we could also find the probability by using the following operation:

\[ P(a < X < b) = P(X < b) - P(X < a) \]

so that

\[ P(0.3 < X < 0.6) = P(X < 0.6) - P(X < 0.3) = 0.6 - 0.3 = 0.3 \]

and we get the same answer either way. This occurs because we find the area to the left of b, then subtract the area to the left of a, and what is left over is the area between a and b. We will find that we have to do this for most continuous probability distributions.

The Standard Normal Distribution

The standard normal distribution is the most frequently used distribution in all statistics. The standard normal, or z-distribution, is the normal distribution with \(\mu = 0\) and \(\sigma = 1\). The standard normal distribution has the graph shown in Fig. 4.

Figure 4. The normal distribution

The area under the entire curve from \(-\infty\) to \(+\infty\) is equal to one. Using the empirical rule we can also determine some probabilities of this distribution. Recall that the empirical rule suggested that approximately 68% of a distribution would lie in \((\mu - \sigma,\ \mu + \sigma)\).
In the context of a continuous distribution we would say that approximately 68% of the area lies in \((\mu - \sigma,\ \mu + \sigma)\). Also, because the standard normal has \(\mu = 0\) and \(\sigma = 1\), this means that approximately 68% of the area is between -1 and +1. The previous section showed that areas are equivalent to probabilities for continuous distributions. That is,

\[ P(-1 < Z < 1) = \text{area under the curve from } -1 \text{ to } 1 \]

What we would like to do is find the area under the curve for any values of z, not just the areas we can find using the empirical rule. We would like to be able to find \(P(-1.33 < Z < 0.55)\), for example. You will not be able to perform simple mathematical operations (such as multiplying height times width) to find the areas as you did in the previous section. The area under the normal curve has no simple closed-form formula, so it must be computed numerically. Good approximations to many of these areas have been found and put in tables. Such a table has been provided to you.

If we want to find \(P(Z < 1.00)\) we will have to find the area under the curve to the left of Z = 1.00. That is the shaded area in the graph in Fig. 5. To find the area we use the standard normal Z-table. A portion of the table is shown in Table 1.

Figure 5. \(P(Z < 1.00)\) is the area to the left of Z = 1.0

Table 1. A portion of the standard normal distribution table

The area under the curve to the left of Z = 1.00 can be found by finding where the row for Z = 1.0 and the column labeled 0.00 intersect. That value is 0.8413. This gives

\[ P(Z < 1.00) = 0.8413 \]

To find \(P(Z < 1.03)\), look in the row for Z = 1.0 and the column labeled 0.03. You will find \(P(Z < 1.03) = 0.8485\). To find \(P(Z < 0.52)\), look in the row for Z = 0.5 and the column labeled 0.02. This gives \(P(Z < 0.52) = 0.6985\).

Use the table to find the following values:

\(P(Z < 0.02) = 0.5080\)
\(P(Z < 0.33) = 0.6293\)
\(P(Z < 1.11) = 0.8665\)

Keep in mind that the tables give the area to the left of a point. If you need the area to the right you must use the complement. These are complements because if the value is not to the left, it must be to the right.
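The table lookups above can also be reproduced numerically. The following is a minimal sketch, assuming Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the helper names `area_left` and `area_right` are illustrative and do not come from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_left(z):
    """Area under the standard normal curve to the left of z, i.e. P(Z < z)."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z, found as the complement of the left area."""
    return 1.0 - Z.cdf(z)

# Reproduce the table lookups from the text, rounded to four decimals.
for z in (1.00, 1.03, 0.52):
    print(f"P(Z < {z:.2f}) = {area_left(z):.4f}")

# Area to the right of Z = 1.00, using the complement.
print(f"P(Z > 1.00) = {area_right(1.00):.4f}")
```

Rounded to four decimals, these values agree with the table entries 0.8413, 0.8485, and 0.6985 quoted above.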
Using the complement:

\(P(Z > 1.00) = 1 - P(Z < 1.00) = 1 - 0.8413 = 0.1587\)
\(P(Z > 1.03) = 1 - P(Z < 1.03) = 1 - 0.8485 = 0.1515\)
\(P(Z > 0.33) = 1 - P(Z < 0.33) = 1 - 0.6293 = 0.3707\)
\(P(Z > 0.02) = 1 - P(Z < 0.02) = 1 - 0.5080 = 0.4920\)

Next we want to find the probability that Z will lie between two values, say a and b, which is \(P(a < Z < b)\). First we want to bring out once again a distinction between discrete and continuous distributions. For both kinds of distribution, discrete and continuous, the following is true:

\[ P(Z \le b) = P(Z < b) + P(Z = b) \]

For the continuous distribution, the probability that the random variable will take on any particular value is zero (see the snowfall example above). This means that \(P(Z = b) = 0\), so

\[ P(Z \le b) = P(Z < b) + P(Z = b) = P(Z < b) \]

and we can be sloppy with the \(<\), \(\le\), \(>\), \(\ge\) relations. So we will usually write

\[ P(a \le Z \le b) = P(a < Z < b) \]

But remember, you can't do this with discrete distributions.

Now, to find the probability that the RV Z will lie between two points a and b, we need to find the area under the curve from a to b. We will do that by first finding the area to the left of b, then subtracting the area to the left of a. What is left is the area between a and b. This gives

\[ P(a < Z < b) = P(Z < b) - P(Z < a) \]

Figure 6. \(P(-1.0 < Z < 1.0)\) is the area to the left of Z = 1 minus the area to the left of Z = -1.

Table 2. Another portion of the Z distribution

Example 1

Find \(P(-1 < Z < 1)\). Use Table 1 and Table 2.

From Table 1, \(P(Z < 1.00) = 0.8413\).
From Table 2, \(P(Z < -1) = 0.1587\).

Then

\[ P(-1 < Z < 1) = P(Z < 1) - P(Z < -1) = 0.8413 - 0.1587 = 0.6826 \]

or about 68%, as the empirical rule would suggest.

Example 2

\(P(-2 < Z < 2) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544\)
\(P(1.04 < Z < 2.36) = P(Z < 2.36) - P(Z < 1.04) = 0.9909 - 0.8508 = 0.1401\)
\(P(-2.22 < Z < -0.54) = P(Z < -0.54) - P(Z < -2.22) = 0.2946 - 0.0132 = 0.2814\)

Note what we have been doing. Given a value of z, we found a probability. Now we are going to reverse this: we will be given a probability and then find the value of z. Suppose we want to find the value \(z_\alpha\) such that \(P(Z < z_\alpha) = \alpha\), for example. The symbol \(z_\alpha\) may seem strange, but it will be useful later.
We have to put some symbol there, and this will serve as well as any. If the problem is to find a value of z such that 25% of the area is in the tail, we will refer to that value as \(z_\alpha = z_{0.25}\).

Figure 7. Finding a value of z when you have been given a probability

Suppose we want \(P(Z < z_\alpha) = 0.20\). Probabilities are located inside the table, so we must look inside the table for a value of 0.2000. Unfortunately that value is not present. Take the value closest to the desired value. The closest we can find is 0.2005. Note that this value has a z-value of -0.84. So

\(P(Z < z_\alpha) = 0.20\)
\(P(Z < -0.84) = 0.2005\)
\(P(Z < -0.84) \approx 0.20\)

The symbol \(\approx\) means approximately equal, which we will take as being close enough. That is, 0.2005 is as close as we can get using these tables. So find the closest probability value in the table and use it.

Suppose the problem had been to find \(P(Z > z_\alpha) = 0.20\). In that case we want a value of z such that 20% of the area lies to the right. But the table only contains areas to the left of a z value. To find 20% to the right we must use 80% to the left. We must restate the problem as \(P(Z < z_\alpha) = 0.80\). So

\(P(Z > z_\alpha) = 0.20\)
\(P(Z < z_\alpha) = 0.80\)
\(P(Z < 0.84) = 0.7995\)
\(P(Z < 0.84) \approx 0.80\)
\(P(Z > 0.84) \approx 0.20\)

Another operation we will perform frequently is to find values of z such that

\[ P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha \]

This means we want to find two values of z, so that half of \(\alpha\) lies in each tail.

Figure 8. Finding values of z when half of \(\alpha\) lies in each tail.

If the problem is to find

\[ P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 0.90 \]

that means we wish to find values of z such that 5% of the area is in each tail and 90% of the area is between the two values. For the lower tail we have

\(P(Z < -z_{\alpha/2}) = 0.05\)
\(P(Z < -1.65) = 0.0495\)
\(P(Z < -1.65) \approx 0.05\)

and for the upper tail

\(P(Z > z_{\alpha/2}) = 0.05\)
\(P(Z < z_{\alpha/2}) = 0.95\)
\(P(Z < 1.65) = 0.9505\)
\(P(Z < 1.65) \approx 0.95\)

There are a couple of points to note here. For the upper tail we want 5% of the area to the right. The tables only give areas to the left, so to get 5% of the area to the right we must use 95% of the area to the left. Next, the value for, say, \(P(Z < z_{0.05}) = 0.05\) was split equally between \(P(Z < -1.64) = 0.0505\) and \(P(Z < -1.65) = 0.0495\).
Both are equally distant from 0.05. Our previous advice was to take the closest value. In the case of equal distances, take the value that is larger in magnitude. Another way of looking at this is that in the case of a tie, move out into the tail.

Using the Standard Normal

Only by a rare accident would we have a distribution with \(\mu = 0\) and \(\sigma = 1\). However, we can use the operations above with any normal distribution by using Z-scores to transform from a normal distribution (with RV X) to the standard normal (with RV Z) and back again.

Example 3

Suppose that we know that the distribution of gasoline usage in a certain type of automobile is normally distributed with

\(\mu = 30\) mpg
\(\sigma = 2\) mpg

The random variable here is X = gasoline usage (in miles per gallon). We want to find the proportion of automobiles with gasoline usage in the range of 28 to 32 miles per gallon, \(P(28 < X < 32)\).

Note that this range is one standard deviation on each side of the mean, and by using the empirical rule we would expect the probability that X is in this range to be about 68%. We will use Z-scores to transform from values in terms of X to values in terms of Z. The formula for the Z-score is

\[ z = \frac{x - \mu}{\sigma} \]

For the upper value of X this gives

\[ z = \frac{32 - 30}{2} = 1 \]

and for the lower value of X

\[ z = \frac{28 - 30}{2} = -1 \]

The Z-scores tell us that X = 32 is 1 standard deviation to the right of the mean, and that X = 28 is 1 standard deviation to the left of the mean. You could have figured this out in your head, but this shows you what Z-scores do: they transform X values into numbers of standard deviations. All normal distributions are the same in this sense. The proportion of the area within 1 standard deviation of the mean is the same for all of them. We have

\(P(28 < X < 32) = P(-1 < Z < 1)\)
\(P(-1 < Z < 1) = P(Z < 1) - P(Z < -1)\)
\(P(-1 < Z < 1) = 0.8413 - 0.1587\)
\(P(-1 < Z < 1) = 0.6826\)

so \(P(28 < X < 32) = 0.6826\), and the probability of finding an automobile of this type with fuel usage between 28 and 32 miles per gallon is 0.6826. Putting it in different terms, about 68% of the automobiles of this type have fuel usage values in this range.
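The Z-score transformation in Example 3 can be sketched in a few lines of code. This is a minimal illustration, assuming Python 3.8 or later with the standard library's `statistics.NormalDist`; the function names `z_score` and `prob_between` are made up for this sketch and are not from the text.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Transform a value x into standard deviations from the mean: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X, computed as P(Z < z_b) - P(Z < z_a)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_a = z_score(a, mu, sigma)
    z_b = z_score(b, mu, sigma)
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Example 3: mu = 30 mpg, sigma = 2 mpg, range 28 to 32 mpg.
print(z_score(32, 30, 2), z_score(28, 30, 2))   # 1.0 -1.0
print(round(prob_between(28, 32, 30, 2), 4))    # 0.6827
```

The computed value 0.6827 differs from the table-based 0.6826 only in the last digit, because the table entries are themselves rounded to four decimals before being subtracted.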
Next, let's look at a problem where we couldn't use the empirical rule. Suppose we want to find the proportion of automobiles that have gasoline usage in the range of 29 to 33 miles per gallon, \(P(29 < X < 33)\). The Z-scores give

\[ z = \frac{x - \mu}{\sigma} = \frac{33 - 30}{2} = 1.5 \quad\text{and}\quad z = \frac{x - \mu}{\sigma} = \frac{29 - 30}{2} = -0.5 \]

So

\(P(29 < X < 33) = P(-0.5 < Z < 1.5)\)
\(P(-0.5 < Z < 1.5) = P(Z < 1.5) - P(Z < -0.5)\)
\(P(-0.5 < Z < 1.5) = 0.9332 - 0.3085 = 0.6247\)
\(P(29 < X < 33) = 0.6247\)

and we conclude that 62.47% of the automobiles have gasoline usage in the range of 29 to 33 miles per gallon.

Example 4

Find the proportion of automobiles that get in excess of 32.5 miles per gallon, \(P(X > 32.5)\).

\[ z = \frac{x - \mu}{\sigma} = \frac{32.5 - 30}{2} = 1.25 \]

\(P(X > 32.5) = P(Z > 1.25) = 1 - P(Z < 1.25)\)
\(P(Z > 1.25) = 1 - 0.8944 = 0.1056\)
\(P(X > 32.5) = 0.1056\)

So about 11% of the automobiles would get gas mileage in excess of 32.5 miles per gallon.

Example 5

A company that makes light bulbs advertises that their bulbs will last for 700 hours before burning out. The mean lifetime of the bulbs is 800 hours and the standard deviation is 150 hours. Assume the lifetimes are normally distributed. What proportion of the bulbs will burn out before the advertised value? We need to find \(P(X < 700)\), where \(\mu = 800\) and \(\sigma = 150\), so

\[ z = \frac{x - \mu}{\sigma} = \frac{700 - 800}{150} = -0.67 \]

\(P(X < 700) = P(Z < -0.67) = 0.2514\)

which means that about 25% of the bulbs will burn out prior to the 700 hour advertised value.

Example 6

A company gives prospective employees an aptitude test. The mean test score is 50 and the test scores have a standard deviation of 5. Test scores are normally distributed. The company only wants to make job offers to those who score in the upper 15% on the test. What should the cutoff score be for the test?

Previously we had been given a value for X. We used this value to find a value of Z, which in turn led us to a probability value. In this case we reverse things. We are given a probability (15%). We will use this to find the corresponding Z-value, \(z_\alpha = z_{0.15}\), that we will then use to find X, the score that only 15% of the applicants exceed. First we need to find the z-value.
\(P(Z > z_\alpha) = P(Z > z_{0.15}) = 0.15\)
\(P(Z < z_{0.15}) = 0.85\)
\(P(Z < 1.04) = 0.8508\)
\(P(Z < 1.04) \approx 0.85\)

Now consider the Z-score formula

\[ z = \frac{x - \mu}{\sigma} \]

We know z, \(\mu\), and \(\sigma\), so we can find x. It is convenient to rewrite the Z-score formula as

\[ x = \mu + z\sigma \]

so

\[ x = 50 + (1.04)(5) = 55.2 \]

This gives the cutoff score. Only 15% of the applicants' scores will exceed this value.

Example 7

A machine is filling bottles of soda. The machine cannot always fill the bottles with exactly the same amount of soda, but fills the bottles with a standard deviation of 0.1 oz. While the standard deviation can't be controlled, the average fill can be adjusted to any level desired. In order not to cheat the customers, management decides to set the average fill so that at least 95% of the bottles will contain 10 or more oz. of soda. What value should the mean be set to?

We know \(\sigma = 0.1\) oz. and \(x = 10\) oz. The 95% probability will give us the value for z, so we have enough information to find \(\mu\). Again look at the Z-score formula,

\[ z = \frac{x - \mu}{\sigma} \]

To find z we know

\(P(Z > z) = 0.95\)
\(P(Z < z) = 0.05\)
\(P(Z < -1.65) = 0.0495\)
\(P(Z < -1.65) \approx 0.05\)

The Z-score formula can be rewritten as

\[ \mu = x - z\sigma \]

so

\[ \mu = x - z\sigma = 10 - (-1.65)(0.1) = 10.165 \]

Setting the machine to have an average fill of 10.165 oz. ensures that 95% of the bottles will contain 10 oz. or more of soda.

The Normal Approximation to the Binomial

Fig. 9 is a plot of the binomial distribution for n = 20 and p = 0.5.

Figure 9. A plot of the binomial distribution for n=20 and p=0.5

Note that this distribution looks like a normal distribution, even though it is discrete. We will use the normal distribution as an approximation to the binomial in many cases.

Rule: The binomial distribution can be approximated by the normal if \(np \ge 5\) and \(nq \ge 5\), where \(q = 1 - p\).

For the binomial problem above with n = 20 and p = 0.5, we have np = 20(0.5) = 10 and nq = 20(0.5) = 10, and the approximation should be a good one according to the rule. The binomial formula gives

\[ P(X = 10) = C^{20}_{10}\,(0.5)^{10}(0.5)^{10} = 0.1762 \]

We cannot use the normal distribution to find P(X = 10) precisely.
Recall that the probability that a continuous distribution will take on any specific value is zero. The probabilities for continuous distributions must be calculated over a range. Let's try approximating this by calculating P(9.5 < X < 10.5). The mean and standard deviation of this binomial distribution are given by

\(\mu = np = 20(0.5) = 10\)
\(\sigma^2 = npq = 20(0.5)(0.5) = 5\)
\(\sigma = \sqrt{5} = 2.236\)

Using these values for z-scores gives

\[ z = \frac{x - \mu}{\sigma} = \frac{9.5 - 10}{2.236} = -0.224 \qquad z = \frac{x - \mu}{\sigma} = \frac{10.5 - 10}{2.236} = 0.224 \]

Then using the normal distribution

\(P(9.5 < X < 10.5) = P(-0.22 < Z < 0.22)\)
\(P(-0.22 < Z < 0.22) = P(Z < 0.22) - P(Z < -0.22)\)
\(P(-0.22 < Z < 0.22) = 0.5871 - 0.4129\)
\(P(9.5 < X < 10.5) = 0.1742\)

The binomial formula gives 0.1762 while the normal approximation gives 0.1742. The approximation is very close and would be even better for larger n.

Another thing that makes the approximation attractive is that we can calculate the probabilities for values of X over an entire range. If we were to find \(P(8 \le X \le 12)\) using the binomial formula, we would have to evaluate the formula for X = 8, X = 9, X = 10, X = 11, and X = 12 and then add the probabilities. With the normal approximation we can calculate this in one step.

The mechanics for using the normal approximation to the binomial distribution are shown in Table 3. Note that you must ensure that the value used in the binomial problem is enclosed in the range used for the normal approximation.

Binomial problem | Normal approximation
\(P(X \le a)\) | \(P(X < a + 0.5)\)
\(P(X \ge b)\) | \(P(X > b - 0.5)\)
\(P(a \le X \le b)\) | \(P(a - 0.5 < X < b + 0.5)\)

Table 3. Normal approximation to the binomial

Example 8

Suppose a coin is flipped n = 1000 times. What is the probability that the number of heads will be between 490 and 510 if the coin is a fair one (i.e. p = 0.5)? Using the normal approximation we want to compute \(P(489.5 < X < 510.5)\). The mean of the binomial distribution is \(\mu = np = 1000(0.5) = 500\). The variance is \(\sigma^2 = npq = 1000(0.5)(0.5) = 250\), so that the standard deviation is \(\sigma = \sqrt{250} = 15.81\).
Using these values for Z-scores gives

\[ z = \frac{x - \mu}{\sigma} = \frac{510.5 - 500}{15.81} = 0.66 \qquad z = \frac{x - \mu}{\sigma} = \frac{489.5 - 500}{15.81} = -0.66 \]

Then using the normal distribution

\(P(489.5 < X < 510.5) = P(-0.66 < Z < 0.66)\)
\(P(-0.66 < Z < 0.66) = P(Z < 0.66) - P(Z < -0.66)\)
\(P(-0.66 < Z < 0.66) = 0.7454 - 0.2546\)
\(P(-0.66 < Z < 0.66) = 0.4908\)

So there is a 49% probability that the number of heads that show will be between 490 and 510.

We could also calculate the probability that we would get 480 heads or fewer, \(P(X \le 480)\):

\[ z = \frac{x - \mu}{\sigma} = \frac{480.5 - 500}{15.81} = -1.23 \]

\(P(X < 480.5) = P(Z < -1.23) = 0.1093\)

There is about an 11% probability that 480 or fewer heads would appear.

Example 9

Suppose an airline has planes that seat 100 people. The airline's problem is that on average 20% of the people who book flights are no-shows. This means that on average 20 seats would be empty and the airline would lose revenue. To avoid this sort of loss the airline decides to overbook its flights. The policy is to sell up to 120 tickets per flight. We want to find the probability that at least one passenger will be bumped on such a flight. Being bumped means that more than 100 passengers show up and those in excess of 100 will be forced to take a later flight (possibly at cost to the airline).

Call it a success if a passenger shows up for the flight. Then we want to calculate \(P(X \ge 101)\) where n = 120 and p = 0.8. Then

\(\mu = np = 120(0.8) = 96\)
\(\sigma = \sqrt{npq} = \sqrt{120(0.8)(0.2)} = 4.38\)

For the normal approximation to the binomial we need

\[ P(X \ge 101) \approx P(X > 100.5) \]

which leads to the Z-score

\[ z = \frac{x - \mu}{\sigma} = \frac{100.5 - 96}{4.38} = 1.03 \]

so

\(P(X > 100.5) = P(Z > 1.03) = 1 - P(Z < 1.03)\)
\(P(X > 100.5) = 1 - 0.8485 = 0.1515\)

thus \(P(X \ge 101) \approx 0.1515\), and there is about a 15% chance that this policy will lead to passengers being bumped on a given flight. This may be a greater probability than the company would want, but that is a management decision.

Problems

1. Find the following probabilities
a) \(P(Z < 1.66)\)
b) \(P(-1.33 < Z < -0.54)\)
c) \(P(Z < 0.39)\)
d) \(P(Z < 0.55)\)
e) \(P(2.33 < Z < 2.85)\)
f) \(P(-0.19 < Z < 0.23)\)

2.
Find the following z values
a) \(P(Z < z) = 0.20\)
b) \(P(Z < z) = 0.05\)
c) \(P(Z > z) = 0.01\)
d) \(P(Z > z) = 0.10\)
e) \(P(Z < z) = 0.40\)
f) \(P(Z > z) = 0.005\)

3. Find the following z values
a) \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 0.80\)
b) \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 0.90\)
c) \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 0.95\)
d) \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 0.99\)

4. Suppose we are producing a critical bolt for a machine assembly. The assembly's specifications require that the bolt have a diameter of 1000 ± 1 millimeter. Our bolt making machine produces bolts which have an average diameter of 999.8 millimeters and a standard deviation of 1.5 millimeters. What percentage of these bolts meet the specifications of the assembly?

5. The Federal Government is thinking about instituting a new small business incentive program. The program will give entrepreneurs grants for job training if they will form new businesses in neighborhoods where at least 20% of the families have incomes less than $12500.00. Suppose that a certain neighborhood has a mean family income of $14000.00 and a standard deviation of incomes of $2000.00. Does this neighborhood qualify for the grant program?

6. Suppose that 10% of all market research surveys are returned. If a company sends out n=10,000 surveys, find the probability that between 975 and 1010 of these surveys will be returned. Use the normal approximation to the binomial.

7. Suppose NAU's basketball team has a 60% chance of winning a game. If the team plays 20 ball games, what is the probability that the team will win between 10 and 13 games? Use the normal approximation to the binomial.

Answers

1. a) 0.9515, b) 0.2028, c) 0.6517, d) 0.7088, e) 0.0077, f) 0.1663

2. a) -0.84, b) -1.65, c) 2.33, d) 1.28, e) -0.25, f) 2.58

3. a) ±1.28, b) ±1.65, c) ±1.96, d) ±2.58

4. P(999<X<1001)=0.7881-0.2981=0.4900, so 49% of the bolts will meet the specifications.

5. P(X<12500)=P(Z<-0.75)=0.2266, so about 23% of the families have incomes below $12500, thus the neighborhood qualifies.

6.
P(974.5<X<1010.5)=P(-0.85<Z<0.35)=0.6368-0.1977=0.4391

7. Check to see if the normal approximation to the binomial is good: np=20(.6)=12 ≥ 5 and nq=20(.4)=8 ≥ 5, so the normal approximation is good.
P(9.5<X<13.5)=P(-1.14<Z<0.68)=0.7517-0.1271
P(9.5<X<13.5)=0.6246.

Using Excel to find probabilities for the normal distribution

You can use Excel to find probability values for various probability distributions. The Excel command to find the cumulative normal distribution (i.e. the area under the curve) for the normal distribution is

NORMDIST(x, mean, standard deviation, true)

where x is the value of the RV you are interested in, mean is the mean of the distribution, standard deviation is the standard deviation of the distribution, and true tells Excel to calculate the cumulative normal distribution (again, the area under the curve to the left of x). If you choose false rather than true, Excel will compute the normal probability density function (the curve itself). The probability density function is the bell-shaped curve. The cumulative distribution function is the area under the bell-shaped curve.

So if you have a distribution with \(\mu = 13.45\) and \(\sigma = 3.23\), you could find P(X < 4.47) using the Excel command NORMDIST(4.47,13.45,3.23,TRUE).

Probability formula | Result | Excel command
P(X<4.47) | 0.0027 | =NORMDIST(4.47,13.45,3.23,TRUE)

Specifically for the standard normal distribution function (the one with mean zero and standard deviation one):

Probability formula | Result | Excel command
P(Z<1.0) | 0.8413 | =NORMDIST(1.0,0,1,TRUE)
P(Z>1.0)=1-P(Z<1.0) | 0.1587 | =1-NORMDIST(1.0,0,1,TRUE)
P(-1<Z<1)=P(Z<1)-P(Z<-1) | 0.6827 | =NORMDIST(1,0,1,TRUE)-NORMDIST(-1,0,1,TRUE)
P(Z<-2.45) | 0.0071 | =NORMDIST(-2.45,0,1,TRUE)
P(Z>-1.33) | 0.9082 | =1-NORMDIST(-1.33,0,1,TRUE)
P(-2.45<Z<-1.33) | 0.0846 | =NORMDIST(-1.33,0,1,TRUE)-NORMDIST(-2.45,0,1,TRUE)

Values of the standard normal distribution function using Excel

The inverse problem
The problems in the previous section required that we find the probability that the given random variable X or Z would be in a certain range. Briefly put: given a value of X, what is the probability? Symbolically, P(X < x) = ?, where x stands for the given value. The inverse problem reverses this process. Briefly put: given the probability, what is the value of X? Symbolically, P(X < ?) = some given value. The Excel command used to compute this is

NORMINV(probability, mean, standard deviation)

So if you have a distribution with \(\mu = 13.45\) and \(\sigma = 3.23\), you could find a value for X such that, say, 33% of the area is to the left of the point and 67% of the area is to the right, or P(X < a) = 0.33.

Probability formula | Result | Excel command
P(X<a) = 0.33 | 12.0291 | =NORMINV(0.33,13.45,3.23)

So 33% of the area is to the left of 12.029 and 67% of the area is to the right.

Example 3 (above). Suppose that we know that the distribution of gasoline usage in a certain type of automobile is normally distributed with \(\mu = 30\) mpg and \(\sigma = 2\) mpg. Find the proportion of automobiles with gasoline usage in the range of 28 to 32 mpg. Use Excel.

Probability formula | Result | Excel command
P(28<X<32) | 0.6827 | =NORMDIST(32,30,2,TRUE) - NORMDIST(28,30,2,TRUE)

Next find P(29 < X < 33).

Probability formula | Result | Excel command
P(29<X<33) | 0.6247 | =NORMDIST(33,30,2,TRUE) - NORMDIST(29,30,2,TRUE)

Example 4 (above). Find the proportion of automobiles that get in excess of 32.5 mpg.

Probability formula | Result | Excel command
P(X>32.5) | 0.1056 | =1 - NORMDIST(32.5,30,2,TRUE)

Example 5 (above). A company that makes light bulbs advertises that the bulbs will last for 700 hours before burning out. The mean lifetime of the bulbs is 800 hours and the standard deviation is 150 hours. What proportion of the bulbs will burn out before the advertised value? Find P(X<700) if the lifetimes are normally distributed.

Probability formula | Result | Excel command
P(X<700) | 0.2525 | =NORMDIST(700,800,150,TRUE)

Example 6 (above). A company gives prospective employees an aptitude test. The mean test score is 50 with a standard deviation of 5. The test scores are normally distributed.
The company only wants to make job offers to those who score in the top 15% on the test. Find x such that P(X > x) = 0.15, which is the same as P(X < x) = 0.85.

Probability formula | Result | Excel command
P(X<x) = 0.85 | 55.1822 | =NORMINV(0.85,50,5)

Example 7 (above). Well, some problems just have to be worked by hand. You could use Excel to get a value for z and to do the hand calculation. Here x = 10 and sigma = 0.1, and cell B1 holds the value of z.

Quantity | Result | Excel command
z such that P(Z<z) = 0.05 | -1.64485 | =NORMINV(0.05,0,1)
mean = x - z*sigma | 10.16449 | =10-B1*(0.1)

Example 8 (above). The text uses the normal approximation to the binomial. Here we could use Excel and avoid the approximation, because the value of n is too large to use the binomial formula by hand but not so large as to cause Excel to fail. Note carefully that \(P(490 \le X \le 510) = P(X \le 510) - P(X \le 489)\).

Solving the problem using Excel but avoiding the approximation:

Probability formula | Result | Excel command
P(490<=X<=510) | 0.49334 | =BINOMDIST(510,1000,0.5,TRUE)-BINOMDIST(489,1000,0.5,TRUE)

Using the approximation:

Quantity | Result | Excel command
n | 1000 |
p | 0.5 |
Mean | 500 | =1000*0.5
Sigma | 15.81139 | =SQRT(1000*0.5*0.5)
Upper value of Z | 0.664078 | =(510.5-500)/15.81139
Lower value of Z | -0.66408 | =(489.5-500)/15.81139
P(Low Z < Z < Upper Z) | 0.49336 | =NORMDIST(0.664078,0,1,TRUE)-NORMDIST(-0.66408,0,1,TRUE)

Example 9 (above). Suppose an airline has planes that seat 100 people. The airline's problem is that on average 20% of the people who book flights are no-shows. This means that on average 20 seats would be empty and the airline would lose revenue. To avoid this sort of loss the airline decides to overbook its flights. The policy is to sell up to 120 tickets per flight. We want to find the probability that at least one passenger will be bumped on such a flight. Being bumped means that more than 100 passengers show up and those in excess of 100 will be forced to take a later flight (possibly at cost to the airline). We want to find \(P(X > 100)\) if n = 120 and p = 0.8. Note that you must be very careful here: \(P(X > 100) = P(X \ge 101) = 1 - P(X \le 100)\).
Using Excel without the approximation:

Probability formula | Result | Excel command
P(X>100) = 1-P(X<=100) | 0.151714 | =1-BINOMDIST(100,120,0.8,TRUE)

Using the approximation:

Quantity | Result | Excel command
n | 120 |
p | 0.8 |
mean | 96 | =120*0.8
sigma | 4.38178 | =SQRT(120*0.8*0.2)
Z | 1.02698 | =(100.5-96)/4.38178
P(Z>1.02698) | 0.152215 | =1-NORMDIST(1.02698,0,1,TRUE)

Problem 4 (above). Suppose we are producing a critical bolt for a machine assembly. The assembly's specifications require that the bolt have a diameter of 1000 ± 1 millimeter. Our bolt making machine produces bolts which have an average diameter of 999.8 millimeters and a standard deviation of 1.5 millimeters. What percentage of these bolts meet the specifications of the assembly?

Given \(\mu = 999.8\) and \(\sigma = 1.5\), find P(999<X<1001).

\[ z = \frac{x - \mu}{\sigma} = \frac{999 - 999.8}{1.5} = -0.53 \qquad z = \frac{x - \mu}{\sigma} = \frac{1001 - 999.8}{1.5} = 0.8 \]

P(999<X<1001) = P(-0.53<Z<0.80) = 0.7881-0.2981 = 0.4900, so 49% of the parts meet the specifications.

Using Excel (mean = 999.8, sigma = 1.5):

Probability formula | Result | Excel command
P(999<X<1001) | 0.491243 | =NORMDIST(1001,999.8,1.5,TRUE)-NORMDIST(999,999.8,1.5,TRUE)

Problem 5 (above). The Federal Government is thinking about instituting a new small business incentive program. The program will give entrepreneurs grants for job training if they will form new businesses in neighborhoods where at least 20% of the families have incomes less than $12500.00. Suppose that a certain neighborhood has a mean family income of $14000.00 and a standard deviation of incomes of $2000.00. Does this neighborhood qualify for the grant program?

Given \(\mu = 14000\) and \(\sigma = 2000\), find P(X<12500).

\[ z = \frac{x - \mu}{\sigma} = \frac{12500 - 14000}{2000} = -0.75 \]

P(X<12500) = P(Z < -0.75) = 0.2266, so about 23% of the families have incomes below $12500, thus the neighborhood qualifies.

Using Excel (mean = 14000, sigma = 2000):

Probability formula | Result | Excel command
P(X<12500) | 0.226627 | =NORMDIST(12500,14000,2000,TRUE)

Or we can find the level of income that marks off the lower 20% of the distribution:

Probability formula | Result | Excel command
P(X < x)=0.20 | 12316.76 | =NORMINV(0.2,14000,2000)

Problem 6 (above). Suppose that 10% of all market research surveys are returned.
If a company sends out n=10,000 surveys, find the probability that between 975 and 1010 of these surveys will be returned. Use the normal approximation to the binomial.

Here we want to find \(P(975 \le X \le 1010) = P(X \le 1010) - P(X \le 974) \approx P(974.5 < X < 1010.5)\). Excel cannot compute the binomial formula because the value of n is too large. We can use the normal approximation to the binomial, however, and get the answer. First note that Excel cannot solve this problem directly:

Probability formula | Result | Excel command
P(X<=1010) | #NUM! | =BINOMDIST(1010,10000,0.1,TRUE)

However, we can use the normal approximation to the binomial:

Quantity | Result | Excel command
n | 10000 |
p | 0.1 |
mean = np | 1000 | =10000*0.1
sigma = sqrt(npq) | 30 | =SQRT(10000*0.1*0.9)
Upper Z | 0.35 |
Lower Z | -0.85 |
P(974.5<X<1010.5) = P(-0.85<Z<0.35) | 0.4392 | =NORMDIST(0.35,0,1,TRUE)-NORMDIST(-0.85,0,1,TRUE)
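The continuity-correction recipe from Table 3, applied to Problem 6, can also be sketched outside of Excel. The following is a minimal illustration, assuming Python 3.8 or later and only its standard library; the function names are hypothetical, and the direct binomial sum is an added cross-check rather than something the text computes. It works with logarithms (via `math.lgamma`) so that the very large and very small factors for n = 10,000 do not overflow or underflow.

```python
import math
from statistics import NormalDist

def normal_approx_between(a, b, n, p):
    """Approximate P(a <= X <= b) for a binomial X by P(a - 0.5 < X < b + 0.5) under the normal."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_low = (a - 0.5 - mu) / sigma
    z_high = (b + 0.5 - mu) / sigma
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

def log_binom_pmf(k, n, p):
    """log of the binomial probability P(X = k), using lgamma for the log of the combination count."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

def direct_between(a, b, n, p):
    """Sum the individual binomial probabilities for k = a, ..., b (floating point)."""
    return sum(math.exp(log_binom_pmf(k, n, p)) for k in range(a, b + 1))

# Problem 6: n = 10,000 surveys, p = 0.1, between 975 and 1010 returned.
approx = normal_approx_between(975, 1010, 10_000, 0.1)
direct = direct_between(975, 1010, 10_000, 0.1)
print(f"normal approximation: {approx:.4f}")
print(f"direct binomial sum:  {direct:.4f}")
```

The normal approximation here uses the same Z values (0.35 and -0.85) as the Excel table above, so it should reproduce 0.4392; the direct sum is expected to be close to that value but not identical.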