251y0341 05/05/03
ECO 251 QBA1 FINAL EXAM MAY 7, 2003
Name KEY Class ________________

Part I. Do all the Following (14 Points) Make Diagrams! Show your work!

$x \sim N(3, 4)$. How many of you really believe that a probability can be negative?

1. $P(-2.30 \le x \le 1.85) = P\left(\frac{-2.30-3}{4} \le z \le \frac{1.85-3}{4}\right) = P(-1.32 \le z \le -0.29) = P(-1.32 \le z \le 0) - P(-0.29 \le z \le 0) = .4066 - .1141 = .2925$
or $P(-1.33 \le z \le 0) - P(-0.29 \le z \le 0) = .4082 - .1141 = .2941$
or, better, $P(-1.325 \le z \le 0) - P(-0.29 \le z \le 0) = .4074 - .1141 = .2933$

2. $P(1 \le x \le 17) = P\left(\frac{1-3}{4} \le z \le \frac{17-3}{4}\right) = P(-0.50 \le z \le 3.50) = P(-0.50 \le z \le 0) + P(0 \le z \le 3.50) = .1915 + .4998 = .6913$

3. $P(x \ge 1.85) = P\left(z \ge \frac{1.85-3}{4}\right) = P(z \ge -0.29) = P(-0.29 \le z \le 0) + P(z \ge 0) = .1141 + .5 = .6141$

4. $F(4.00)$ (Cumulative Probability) $= P(x \le 4) = P\left(z \le \frac{4-3}{4}\right) = P(z \le 0.25) = P(z \le 0) + P(0 \le z \le 0.25) = .5 + .0987 = .5987$

5. $P(-4.00 \le x \le 4.00) = P\left(\frac{-4-3}{4} \le z \le \frac{4-3}{4}\right) = P(-1.75 \le z \le 0.25) = P(-1.75 \le z \le 0) + P(0 \le z \le 0.25) = .4599 + .0987 = .5586$

6. $x_{.015}$ (Find $z_{.015}$ first) Make a diagram for $z$. Show a Normal curve with a mean of zero in its center. Remember that $z_{.015}$ is a point with 1.5% above it and 98.5% below it. Since 50% of the distribution is below zero, $P(0 \le z \le z_{.015}) = .9850 - .5 = .4850$. According to the Normal table $P(0 \le z \le 2.17) = .4850$. So $z_{.015} = 2.17$ and $x = \mu + z_{.015}\sigma = 3 + 2.17(4) = 3 + 8.68 = 11.68$.
Check: $P(x \ge 11.68) = P\left(z \ge \frac{11.68-3}{4}\right) = P(z \ge 2.17) = P(z \ge 0) - P(0 \le z \le 2.17) = .5 - .4850 = .0150$

7. A symmetrical region around the mean with a probability of 25%. Make a diagram for $z$. Show a Normal curve with a mean of zero in its center. If we split 25% in two, we get two areas, one on either side of the mean, with probabilities of 12.5%. We can call the point we want $z_{.375}$ because, since the area above zero is 50%, the area above $z_{.375}$ must be 50% - 12.5% = 37.5%. But we have already decided that the probability between $z_{.375}$ and zero is 12.5%. The closest we can come on the Normal table is $P(0 \le z \le 0.32) = .1255$, so $z_{.375} = 0.32$ and $x = \mu \pm z_{.375}\sigma = 3 \pm 0.32(4) = 3 \pm 1.28$, or 1.72 to 4.28.
Check: $P(1.72 \le x \le 4.28) = P\left(\frac{1.72-3}{4} \le z \le \frac{4.28-3}{4}\right) = P(-0.32 \le z \le 0.32) = 2P(0 \le z \le 0.32) = 2(.1255) = .2510$

Exam is normed on 75 points. There are actually 128 possible points.

II. (10 points+, -2 point penalty for not trying part a.) Show your work!
The following numbers apply to 9 developed countries and give deaths per 100 million miles and speed limits. (The xy column was not supplied on the exam; it is calculated here.)

Row   deaths (x)   SpLim (y)      xy
 1       3.1          55         170.5
 2       3.4          55         187.0
 3       3.5          55         192.5
 4       3.6          70         252.0
 5       4.2          55         231.0
 6       4.4          60         264.0
 7       4.8          55         264.0
 8       5.0          60         300.0
 9       6.2          75         465.0
Total                           2326.0

These sums have been calculated for you: $\sum x = 38.2$, $\sum x^2 = 169.86$, $\sum y = 540$ and $\sum y^2 = 32850$. Please calculate the following:

a. The sample standard deviation of $x$ (4) Note that $\bar y = 60.00$ and $s_y = 7.50$. Many of you wasted time and energy computing the x squared and y squared columns and then decided that you could get the xy sum by multiplying the sums of x and y. Where have you been?
b. The sample covariance between $x$ and $y$. (3)
c. The sample correlation between $x$ and $y$. (2)
d. Given the size and sign of the correlation, what conclusion might you draw on the relation between speed and safety if this were the only evidence available? (1)
e. Assume that the death rate in all 9 countries fell by .1. What would be the new values of $\bar x$, $s_x$, $s_{xy}$ and $r_{xy}$? Use only the values you computed in a-c and rules for functions of $x$ and $y$ to get your results. If you state the results without explaining why, or change $x$ and recompute the results, you will receive no credit. (4) How many of you recomputed the results anyway?

Solution:
a) $\bar x = \frac{\sum x}{n} = \frac{38.2}{9} = 4.24444$.
$s_x^2 = \frac{\sum x^2 - n\bar x^2}{n-1} = \frac{169.86 - 9(4.24444)^2}{8} = \frac{7.72256}{8} = 0.9653$, so $s_x = \sqrt{0.9653} = 0.9825$.
$\bar y = \frac{\sum y}{n} = \frac{540}{9} = 60.00$.
$s_y^2 = \frac{\sum y^2 - n\bar y^2}{n-1} = \frac{32850 - 9(60.00)^2}{8} = \frac{450}{8} = 56.25$, so $s_y = \sqrt{56.25} = 7.50$.

b) We found above that $\sum xy = 2326$, so $s_{xy} = \frac{\sum xy - n\bar x \bar y}{n-1} = \frac{2326 - 9(4.24444)(60.00)}{8} = \frac{34}{8} = 4.25$.

c) $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{4.25}{0.9825(7.50)} = .5768$. This must be between -1 and 1.

d) If we square the correlation we get 0.333, which on a zero-to-one scale is not impressive. I would not want to conclude that speed limits and safety are closely related, though speed limits may be a factor.
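The arithmetic in parts a) through c) can be verified numerically. The following is a minimal sketch in Python, assuming the nine (x, y) pairs from the table; the function name `sample_stats` is illustrative and not part of the exam.

```python
import math

# Data from the table: x = deaths per 100 million miles, y = speed limit.
x = [3.1, 3.4, 3.5, 3.6, 4.2, 4.4, 4.8, 5.0, 6.2]
y = [55, 55, 55, 70, 55, 60, 55, 60, 75]


def sample_stats(x, y):
    """Sample statistics via the computational formulas (divisor n - 1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    var_x = (sum(v * v for v in x) - n * xbar ** 2) / (n - 1)
    var_y = (sum(v * v for v in y) - n * ybar ** 2) / (n - 1)
    cov_xy = (sum(a * b for a, b in zip(x, y)) - n * xbar * ybar) / (n - 1)
    s_x, s_y = math.sqrt(var_x), math.sqrt(var_y)
    return {
        "xbar": xbar,
        "ybar": ybar,
        "s_x": s_x,
        "s_y": s_y,
        "s_xy": cov_xy,
        "r_xy": cov_xy / (s_x * s_y),
    }


stats = sample_stats(x, y)
for name, value in stats.items():
    print(f"{name} = {value:.4f}")
```

The printed values should agree with the hand computation up to rounding; small differences in the last digit come from rounding $\bar x$ to 4.24444 in the hand calculation.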
e) From the syllabus supplement article: “Let us introduce two new variables, $w$ and $v$, so that $w = ax + b$ and $v = cy + d$, where $a$, $b$, $c$ and $d$ are constants. From the earlier part of this section we know the following:
$\mu_w = E(w) = E(ax + b) = aE(x) + b$, $\sigma_w^2 = Var(w) = a^2 Var(x) = a^2\sigma_x^2$,
$\mu_v = E(v) = E(cy + d) = cE(y) + d$, $\sigma_v^2 = Var(v) = c^2 Var(y) = c^2\sigma_y^2$.
To this we now add a new rule: $Cov(w, v) = \sigma_{wv} = ac\,Cov(x, y) = ac\,\sigma_{xy}$.
To find the correlation between $w$ and $v$, recall that $\rho_{wv} = \frac{\sigma_{wv}}{\sigma_w \sigma_v}$. But since $\sigma_w^2 = a^2\sigma_x^2$ and $\sigma_v^2 = c^2\sigma_y^2$, then $\rho_{wv} = \frac{ac\,\sigma_{xy}}{\sqrt{a^2\sigma_x^2 c^2\sigma_y^2}} = \frac{ac\,\sigma_{xy}}{|ac|\,\sigma_x\sigma_y} = \frac{ac}{|ac|}\rho_{xy} = sign(ac)\,\rho_{xy}$.”

Since these rules work for sample statistics too, and $w = x - 0.1$, $v = y$, we have $a = 1$, $b = -0.1$, $c = 1$ and $d = 0$.
$\bar w = \bar x - .1 = 4.24444 - .1 = 4.14444$.
$s_w^2 = 1^2 s_x^2 = s_x^2$, so $s_w = s_x = \sqrt{0.9653} = 0.9825$.
$s_{wv} = (1)(1)s_{xy} = 4.25$.
$r_{wv} = \frac{ac\,s_{xy}}{\sqrt{a^2 s_x^2 c^2 s_y^2}} = \frac{ac\,s_{xy}}{|ac|\,s_x s_y} = sign(ac)\,r_{xy} = .5768$.

III. Do at least 4 of the following 6 Problems (at least 12 each) (or do sections adding to at least 48 points. Anything extra you do helps, and grades wrap around). Show your work! Please indicate clearly what sections of the problem you are answering! If you are following a rule like $E(ax) = aE(x)$ please state it! If you are using a formula, state it! If you answer a 'yes' or 'no' question, explain why! If you are using the Poisson or Binomial table, state things like $n$, $p$ or the mean. Avoid crossing out answers that you think are inappropriate, since you might get partial credit. Choose the problems that you do carefully: most of us are unlikely to be able to do more than half of the entire possible credit in this section!

1. Assume that the amount of paid time (in days) lost by a blue-collar worker during a 3-month period is $N(1.4, 1.3)$. I take a random sample of 10 workers and record the time they lost in the last 3 months.
a. What is the probability that a randomly picked worker lost paid time exceeding 1.5 days in the 3-month period? (2)
b.
What is the probability that all 10 workers in the sample lost paid time exceeding 1.5 days in the 3-month period? (2)
c. What is the probability that at least one of the workers in the sample lost paid time exceeding 1.5 days in the 3-month period? (2)
d. What is the probability that the average amount of paid time lost exceeded 1.5 days in the 3-month period? (2)
e. What is the probability that the total amount of time lost by the sample of 10 workers exceeded 15 days in the 3-month period? (2)
f. Looking at the distribution of the sample mean in this problem, give a value of the sample mean that will be above the mean we actually observe 95% of the time (the 95th percentile). (2)

Solution:
a) $x \sim N(1.4, 1.3)$. Make a diagram. $P(x > 1.5) = P\left(z > \frac{1.5-1.4}{1.3}\right) = P(z > 0.08) = P(z \ge 0) - P(0 \le z \le 0.08) = .5 - .0319 = .4681$
b) Binomial, $n = 10$, $p = .4681$. $P(10) = (.4681)^{10} = .0005051$.
c) Binomial, $n = 10$, $p = .4681$. $P(x \ge 1) = 1 - P(0) = 1 - (1 - .4681)^{10} = 1 - .0018126 = .99819$.
d) $\bar x \sim N(\mu, \sigma_{\bar x}) = N\left(1.4, \frac{1.3}{\sqrt{10}}\right) = N(1.4, 0.411)$. $P(\bar x > 1.5) = P\left(z > \frac{1.5-1.4}{0.411}\right) = P(z > 0.24) = P(z \ge 0) - P(0 \le z \le 0.24) = .5 - .0948 = .4052$
e) Answer is the same as for d because if the total time for 10 workers was 15, the mean was 15 divided by 10.
f) According to the t table, the 95th percentile for $z$ was $z_{.05} = 1.645$. $\bar x \sim N(\mu, \sigma_{\bar x}) = N(1.4, 0.411)$, so $\bar x_{.05} = \mu + z_{.05}\sigma_{\bar x} = 1.4 + 1.645(0.411) = 2.076$.

2. (Bowerman and O’Connell) A retailer that sells home entertainment systems accumulated 10,451 sales invoices during the last year.
a. An auditor takes a sample of 16 invoices and computes mean sales of $\bar x = \$532$. If the population standard deviation was known to be $168, find a 99% confidence interval for the mean sales per invoice. (4)
b. I lied. Though the sample mean was $532, $168 was a sample standard deviation. Do the 99% confidence interval again. (4)
c. I lied. Though it is true that the sample mean was $532 and the sample standard deviation was $168, the actual sample was 650 invoices out of the 10451 invoices that were collected. Do the 99% confidence interval again. (4)
d.
(Extra credit) Assume that the confidence interval in c is correct, and that the 10451 invoices were all that were generated. Using these two facts, create a confidence interval for total sales in the last year. (3)
e. (Extra credit) The firm claims that its total sales were above $5.75 million last year. In view of your results in d, does that seem likely? Would you change your mind if I insisted on a confidence level of 99.8%? (3)
f. Using the data in a) create a 97% confidence interval for the mean sales per invoice. (You might want to look at page 1.) (2)

Solution: The solution to problem O1 gives the following formulas:
i) $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x}$ and $\sigma_{\bar x} = \frac{\sigma}{\sqrt n}$ when $\sigma$ is known and the sample is small relative to the population.
ii) $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x}$ and $\sigma_{\bar x} = \frac{\sigma}{\sqrt n}\sqrt{\frac{N-n}{N-1}}$ when $\sigma$ is known and the sample is large relative to the population.
iii) $\mu = \bar x \pm t^{(n-1)}_{\alpha/2} s_{\bar x}$ and $s_{\bar x} = \frac{s}{\sqrt n}$ when $\sigma$ is unknown and the sample is small relative to the population.
iv) $\mu = \bar x \pm t^{(n-1)}_{\alpha/2} s_{\bar x}$ and $s_{\bar x} = \frac{s}{\sqrt n}\sqrt{\frac{N-n}{N-1}}$ when $\sigma$ is unknown and the sample is large relative to the population.

a) $\bar x = \$532$, $\sigma = 168$, $n = 16$ and $N$ is more than 20 times 16. $\sigma_{\bar x} = \frac{\sigma}{\sqrt n} = \frac{168}{\sqrt{16}} = 42$. $z_{\alpha/2} = z_{.005} = 2.576$. $\mu = 532 \pm 2.576(42) = 532 \pm 108$ or $P(424 \le \mu \le 640) = .99$.

b) $\bar x = \$532$, $s = 168$, $n = 16$ and $N$ is more than 20 times 16. $s_{\bar x} = \frac{s}{\sqrt n} = \frac{168}{\sqrt{16}} = 42$. $t^{(n-1)}_{\alpha/2} = t^{(15)}_{.005} = 2.947$. $\mu = 532 \pm 2.947(42) = 532 \pm 124$ or $P(408 \le \mu \le 656) = .99$.

c) $\bar x = \$532$, $s = 168$, $n = 650$ and $N = 10451$, which is less than 20 times the sample size. $s_{\bar x} = \frac{s}{\sqrt n}\sqrt{\frac{N-n}{N-1}} = \frac{168}{\sqrt{650}}\sqrt{\frac{10451-650}{10451-1}} = \sqrt{\frac{168^2}{650}\cdot\frac{9801}{10450}} = \sqrt{40.7248} = 6.38$. $t^{(n-1)}_{\alpha/2} = t^{(649)}_{.005} = 2.576$. $\mu = 532 \pm 2.576(6.38) = 532 \pm 16.43$ or $P(515.57 \le \mu \le 548.43) = .99$.
The only place to find $z$ is the very bottom of the t table, unless you have to figure it out as in part f. On the other hand, if you need $t$, you need the rest of the t table. If you don’t know the difference between $\sigma$ and $s$, please find out!

d) Let’s say the average sales invoice was $548.43; then 10451 customers would generate 10451($548.43) = $5,731,642. Likewise if average sales were $515.57, 10451 customers would generate $5,388,222.
We can say $P(5388222 \le \text{Total sales} \le 5731642) = .99$.

e) Obviously the high estimate in the last interval was below $5.75 million. If we wanted to use a confidence level of 99.8%, we would use $t^{(n-1)}_{\alpha/2} = t^{(649)}_{.001} = 3.092$. $\mu = 532 \pm 3.092(6.38) = 532 \pm 19.73$. The upper limit is now $551.73. If we multiply this by 10451, we get $5,766,130, which might make us change our minds.

f) On page 1 we found $z_{.015} = 2.17$, which is the value of $z_{\alpha/2}$ that we would use if $\alpha = .03$. So we have $\bar x = \$532$, $\sigma = 168$, $n = 16$ and $N$ is unspecified and assumed much larger than 16. $\sigma_{\bar x} = \frac{\sigma}{\sqrt n} = \frac{168}{\sqrt{16}} = 42$. $\mu = \bar x \pm z_{\alpha/2}\sigma_{\bar x} = 532 \pm 2.17(42) = 532 \pm 91$.

3. a. Assume that the entire amount of a product made by a supplier is a population of 100 units and that you buy the whole batch. Assume that 15% of the batch is defective. Take a sample of 5 items and give me the probability that at least one is defective. (3)
b. Assume that the batch you buy is much larger, well over 200, and that you still take a sample of 5. What is the chance that at least one is defective? (You should not need to use the number 200 or any larger number in your calculations.) (2)
c. Assume that you have bought at least a million units, and that 15% still represents the proportion of the product that is defective. This time you take a sample of 80. Find the probability that at least 10 are defective using the Poisson distribution. First, show that it is legitimate to use the Poisson distribution in this case. (3)
d. Do part c using the Normal distribution. That is, assume that you have a large population that is 15% defective and that you take a sample of 80. Show that the Normal distribution can be used here and find the probability of at least 10 defective items in the sample. (3 points without continuity correction, 3.5 with)
e. (Extra credit) In section 7.3 of the text (on the CD) the author tells us we should use a finite population correction with the variance if it is justified.
Assume that in part d the population is 200 and we take a sample of 80. What is the probability of at least 10 defective items in the sample now? (2)
f. If we are taking a sample from a large population, find the probability that the first defective item is between the 6th and the 10th item we test. (2)

Solution: At least half of you ignored the fact that I changed the sample size from 10 to 5; the only penalty was that it made the problem easier if you made the change. Much of this problem was a repeat of the last take-home exam. Most of you seem to have completely forgotten the following:
(i) If you are looking for numbers of successes when the number of tries is given and the probability of success is constant, you want the Binomial distribution.
(ii) If you are looking for the try on which the first success occurs out of many possible tries when the probability of success is constant, you want the Geometric distribution.
(iii) If you are looking for numbers of successes when the number of tries is given and the average number of successes per unit time or space is given, you want the Poisson distribution.
(iv) If you are looking for numbers of successes when the number of tries is given and the probability of success is not constant because the total number of successes in the population is limited, you want the Hypergeometric distribution.

a) Hypergeometric, $n = 5$, $N = 100$, $p = .15$ and $M = pN = .15(100) = 15$ are defective. $P(x) = \frac{C^M_x C^{N-M}_{n-x}}{C^N_n}$, where $C^n_r = \frac{n!}{(n-r)!\,r!}$.
$P(x \ge 1) = 1 - P(0) = 1 - \frac{C^{15}_0 C^{85}_5}{C^{100}_5} = 1 - \frac{85!/(80!\,5!)}{100!/(95!\,5!)} = 1 - \frac{85 \cdot 84 \cdot 83 \cdot 82 \cdot 81}{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96} = 1 - .43568 = .5643$

b) Binomial, $n = 5$ and $p = .15$. $P(x \ge 1) = 1 - P(0) = 1 - .44371 = .55629$.

c) We can use the Poisson distribution if $\frac{n}{p} > 500$. In this case $\frac{n}{p} = \frac{80}{.15} = 533.33 > 500$. $m = np = 80(.15) = 12$. From the table, $P(x \ge 10) = 1 - P(x \le 9) = 1 - .24239 = .7576$.

d) We can use the Normal distribution if expected successes $np$ and expected failures $nq$ are both above five.
In this case $n = 80$, $p = .15$, $q = 1 - p = 1 - .15 = .85$, so $np = 80(.15) = 12 > 5$ and $nq = 80(.85) = 80 - 12 = 68 > 5$, and the Normal distribution is valid. $\sigma = \sqrt{npq} = \sqrt{12(.85)} = \sqrt{10.2} = 3.1937$. So without the continuity correction $P(x \ge 10) = P\left(z \ge \frac{10-12}{3.1937}\right) = P(z \ge -0.63) = P(-0.63 \le z \le 0) + P(z \ge 0) = .2357 + .5 = .7357$, and with the continuity correction $P(x \ge 9.5) = P\left(z \ge \frac{9.5-12}{3.1937}\right) = P(z \ge -0.78) = P(-0.78 \le z \le 0) + P(z \ge 0) = .2823 + .5 = .7823$.

e) $n = 80$, $N = 200$, $p = .15$. $\sigma = \sqrt{npq\frac{N-n}{N-1}} = \sqrt{12(.85)\frac{200-80}{200-1}} = \sqrt{10.2\left(\frac{120}{199}\right)} = \sqrt{6.15075} = 2.4801$, and we are doing a normal approximation to the Hypergeometric distribution. So without the continuity correction $P(x \ge 10) = P\left(z \ge \frac{10-12}{2.4801}\right) = P(z \ge -0.81) = P(-0.81 \le z \le 0) + P(z \ge 0) = .2910 + .5 = .7910$, and with the continuity correction $P(x \ge 9.5) = P\left(z \ge \frac{9.5-12}{2.4801}\right) = P(z \ge -1.01) = P(-1.01 \le z \le 0) + P(z \ge 0) = .3438 + .5 = .8438$.

f) This has the geometric distribution with $p = .15$, $q = 1 - p = 1 - .15 = .85$. $F(x) = 1 - q^x$. $P(6 \le x \le 10) = P(x \le 10) - P(x \le 5) = F(10) - F(5) = \left(1 - .85^{10}\right) - \left(1 - .85^5\right) = .85^5 - .85^{10} = .44371 - .19687 = .24684$.

4. As everyone knows, a jorcillator has two components, a Phillinx and a Flubberall. It seems that the jorcillator only works as long as both components work.
a. The distribution of failure times for the Phillinx is Normal with a mean of 8 months and a standard deviation of 2 months. (i) What is the probability that the Phillinx dies in the first six months, $P(x \le 6)$? (ii) What is the probability that the Phillinx dies in months 6-12? (iii) What is the probability that the Phillinx lasts beyond 12 months? (4 total)
b. The distribution of the failure times for the Flubberall is described by the continuous uniform distribution between 0 and 11. (i) What is the probability that the Flubberall dies in the first six months? (ii) What is the probability that the Flubberall dies in months 6-12? (iii) What is the probability that the Flubberall lasts beyond 12 months? (iv) What are the mean and standard deviation of the Flubberall’s life? (4.5 total)
c. Now let’s see if you learned anything about combining these probabilities.
(i) What is the probability that the jorcillator will fail in the first six months? (ii) What is the probability that the jorcillator will fail in the second six months? (iii) What is the probability that the jorcillator will last beyond 12 months? (5 total)
d. Rework (c) assuming that the jorcillator works as long as one component works. (6)

Solution:
a) $x \sim N(8, 2)$. Make a diagram. Draw a Normal curve with a mean at 8. Represent Event $A_1$ by the area below 6, Event $A_2$ by the area between 6 and 12 and Event $A_3$ by the area above 12.
(i) Event $A_1$: The Phillinx dies in the first six months. $P(x \le 6) = P\left(z \le \frac{6-8}{2}\right) = P(z \le -1.00) = P(z \le 0) - P(-1.00 \le z \le 0) = .5 - .3413 = .1587$.
(ii) Event $A_2$: The Phillinx dies in months 6-12. $P(6 \le x \le 12) = P\left(\frac{6-8}{2} \le z \le \frac{12-8}{2}\right) = P(-1.00 \le z \le 2.00) = P(-1.00 \le z \le 0) + P(0 \le z \le 2.00) = .3413 + .4772 = .8185$.
(iii) Event $A_3$: The Phillinx lasts beyond 12 months. $P(x \ge 12) = P\left(z \ge \frac{12-8}{2}\right) = P(z \ge 2.00) = P(z \ge 0) - P(0 \le z \le 2.00) = .5 - .4772 = .0228$.

b) $y$ is continuous uniform with $c = 0$ and $d = 11$. Make a diagram. Draw a box between 0 and 11 with a height of $\frac{1}{d-c} = \frac{1}{11-0} = \frac{1}{11}$. Represent Event $B_1$ by the area below 6 and Event $B_2$ by the area above 6. Event $B_3$ is not in the box.
(i) Event $B_1$: The Flubberall dies in the first six months. $P(y \le 6) = \frac{6-0}{11-0} = .54545$
(ii) Event $B_2$: The Flubberall dies in months 6-12. $P(6 \le y \le 12) = \frac{11-6}{11-0} = .45455$.
(iii) Event $B_3$: The Flubberall lasts beyond 12 months. $P(y \ge 12) = 0$.
(iv) $\mu = \frac{c+d}{2} = \frac{0+11}{2} = 5.5$. $\sigma = \sqrt{\frac{(d-c)^2}{12}} = \sqrt{\frac{(11-0)^2}{12}} = \sqrt{10.0833} = 3.175$

c, d) Now we put this all together in a humongous table. From here on, c is just a rerun of a take-home exam problem.

Joint Event   Probability               Period of failure if both   Period of failure if only
                                        components are needed       one component is needed
A1 ∩ B1       .1587(.54545) = .0866     0-6                         0-6
A1 ∩ B2       .1587(.45455) = .0721     0-6                         6-12
A1 ∩ B3       .1587(0) = 0              0-6                         >12
A2 ∩ B1       .8185(.54545) = .4465     0-6                         6-12
A2 ∩ B2       .8185(.45455) = .3720     6-12                        6-12
A2 ∩ B3       .8185(0) = 0              6-12                        >12
A3 ∩ B1       .0228(.54545) = .0124     0-6                         >12
A3 ∩ B2       .0228(.45455) = .0104     6-12                        >12
A3 ∩ B3       .0228(0) = 0              >12                         >12

c) So we get the following if both components are required:
(i) Probability that the jorcillator will fail in the first six months = .0866 + .0721 + 0 + .4465 + .0124 = .6176
(ii) Probability that the jorcillator will fail in the second six months = .3720 + 0 + .0104 = .3824
(iii) Probability that the jorcillator will last beyond 12 months = 0
Note that these add to one.

d) So we get the following if only one component is required:
(i) Probability that the jorcillator will fail in the first six months = .0866
(ii) Probability that the jorcillator will fail in the second six months = .0721 + .4465 + .3720 = .8906
(iii) Probability that the jorcillator will last beyond 12 months = 0 + 0 + .0124 + .0104 + 0 = .0228
Note that these add to one.

5. I am a sales representative and I have two visits scheduled for this morning. Let $A$ represent the event that I make a sale on visit 1 and $B$ represent the event that I make a sale on visit 2.
a. The joint probability table for the two events is
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & & & .4 \\ \bar B & & & \\ \hline & .7 & & \end{array}$$
and the distribution table for $w$, the total number of sales, is
$$\begin{array}{c|c} w & P(w) \\ \hline 0 & \\ 1 & \\ 2 & \end{array}$$
Fill in these tables on the assumption that $A$ and $B$ are independent (How many of you ignored independence?) and find the expected value and standard deviation for $w$. (5)

Solution: Because $A$ and $B$ are independent, $P(A \cap B) = P(A)P(B) = .7(.4) = .28$. We also know that $P(A) + P(\bar A) = 1$ and $P(B) + P(\bar B) = 1$, so we have
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & .28 & & .4 \\ \bar B & & & .6 \\ \hline & .7 & .3 & 1.0 \end{array}$$
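If we just fill in the rest, we get
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & .28 & .12 & .4 \\ \bar B & .42 & .18 & .6 \\ \hline & .7 & .3 & 1.0 \end{array}$$
Now if we use this information to fill in the distribution, we note that, first, probabilities must be between zero and one; second, the probabilities in a valid distribution must add to one; and third, $P(w = 0) = P(\bar A \cap \bar B)$, $P(w = 1) = P(A \cap \bar B) + P(\bar A \cap B)$ and $P(w = 2) = P(A \cap B)$.
$$\begin{array}{c|ccc} w & P(w) & wP(w) & w^2P(w) \\ \hline 0 & .18 & 0 & 0 \\ 1 & .54 & .54 & .54 \\ 2 & .28 & .56 & 1.12 \\ \hline & 1.00 & 1.10 & 1.66 \end{array}$$
We get $\mu = 1.10$ and $\sigma^2 = 1.66 - (1.10)^2 = 0.45$. Finally $\sigma = \sqrt{0.45} = 0.6708$.

b. Fill in the two tables again on the assumption that $A$ and $B$ are collectively exhaustive, and compute the expected value and variance for $w$. (4)
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & & & .4 \\ \bar B & & & \\ \hline & .7 & & \end{array} \qquad \begin{array}{c|c} w & P(w) \\ \hline 0 & \\ 1 & \\ 2 & \end{array}$$

Solution: Because $A$ and $B$ are collectively exhaustive, $P(A \cup B) = 1$, and its complement $P(\bar A \cap \bar B) = 0$. If we use this information we get
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & & & .4 \\ \bar B & & 0 & .6 \\ \hline & .7 & .3 & 1.0 \end{array}$$
Now just fill in the blanks
$$\begin{array}{c|cc|c} & A & \bar A & \\ \hline B & .10 & .30 & .4 \\ \bar B & .60 & 0 & .6 \\ \hline & .7 & .3 & 1.0 \end{array}$$
and do the distribution table.
$$\begin{array}{c|ccc} w & P(w) & wP(w) & w^2P(w) \\ \hline 0 & 0 & 0 & 0 \\ 1 & .90 & .90 & .90 \\ 2 & .10 & .20 & .40 \\ \hline & 1.00 & 1.10 & 1.30 \end{array}$$
We get $\mu = 1.10$ and $\sigma^2 = 1.30 - (1.10)^2 = 0.09$. Finally $\sigma = \sqrt{0.09} = 0.3$.

c. Let $x$ represent the number of sales I make on visit 1 ($x$ can only be 0 or 1) and $y$ represent the number of sales I make on visit 2. What relation must exist between the variances of $x$ and $y$ and the variance of $w$ in the case where $A$ and $B$ are independent that cannot exist when they are mutually exclusive? Why? (2)

Solution: In general $w = x + y$ and $Var(w) = Var(x) + Var(y) + 2Cov(x, y)$. But if $x$ and $y$ are independent, $Cov(x, y) = 0$. So $Var(w) = Var(x) + Var(y)$.
I was afraid that I’d get questioned on this, so I actually did a general proof of this. Don’t read it unless you really like to think about this stuff. Assume that $P(A) = P(x = 1) = a \le 1$, that $P(B) = P(y = 1) = b \le 1$, that $a + b \le 1$, and that $A$ and $B$ are mutually exclusive. The joint probability table and computation of the means etc. follow.
$$\begin{array}{c|cc|ccc} & x = 0 & x = 1 & P(y) & yP(y) & y^2P(y) \\ \hline y = 0 & 1-a-b & a & 1-b & 0 & 0 \\ y = 1 & b & 0 & b & b & b \\ \hline P(x) & 1-a & a & 1.0 & b & b \\ xP(x) & 0 & a & a & & \\ x^2P(x) & 0 & a & a & & \end{array}$$
Thus $\mu_x = E(x) = a$, $E(x^2) = a$, $\mu_y = E(y) = b$, $E(y^2) = b$, and we can see that $E(xy) = (1-a-b)(0)(0) + a(1)(0) + b(0)(1) + 0(1)(1) = 0$.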
So we can say $Var(x) = E(x^2) - \mu_x^2 = a - a^2$, $Var(y) = E(y^2) - \mu_y^2 = b - b^2$ and $Cov(x, y) = E(xy) - \mu_x\mu_y = 0 - ab = -ab$. So if $w = x + y$, $Var(w) = Var(x) + Var(y) + 2Cov(x, y) = a - a^2 + b - b^2 - 2ab = (a + b) - (a + b)^2$.

d. Assume that $P(B) = .4$. Use the addition rule to show that $A$ and $B$ cannot be both collectively exhaustive and independent if $A$ has a probability below 1. (3)

Solution: If $A$ and $B$ are collectively exhaustive, $P(A \cup B) = 1$; $P(B) = .4$; and if $A$ and $B$ are independent, $P(A \cap B) = P(A)P(B)$. So we have $P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) - P(A)P(B) = 1$. This means that $P(A) + .4 - P(A)(.4) = 1$. But this can only be true if $P(A) = 1$. More generally, $1 - P(B) = P(A) - P(A)P(B)$. If we factor out $P(A)$, we get $1 - P(B) = P(A)\left[1 - P(B)\right]$, which can only be true if $P(A) = 1$. If we do this with $P(B) = .4$, we get $1 - .4 = P(A) - P(A)(.4)$ or $.6 = .6P(A)$. If we divide through by 0.6, we get $1 = P(A)$.

e. Let us define the following events: $S_1$ = No Oil, $S_2$ = Some Oil and $S_3$ = Much Oil. Assume $P(S_1) = .7$, $P(S_2) = .2$ and $P(S_3) = .1$. We run a seismic experiment that has three possible readings: $H$ (high), $M$ (medium) and $L$ (low). All you have to know for this problem is $P(H|S_1) = .04$, $P(H|S_2) = .02$ and $P(H|S_3) = .96$.
(i) Explain the difference between $P(H|S_3)$ and $P(H \cap S_3)$ and show how we get the second of these from the first. (3)
(ii) Find $P(H)$ (2)
(iii) Find $P(S_1|H)$ (4)

Solution:
(i) $P(H|S_3)$ is a conditional probability, which means that it is the probability of getting a ‘high’ reading when it is true that there is much oil. $P(H \cap S_3)$ is the joint probability of both getting a ‘high’ reading and much oil out of all 9 possible combinations of readings and oil. The multiplication rule says that $P(H \cap S_3) = P(H|S_3)P(S_3) = .96(.1) = .096$.
(ii) If we get a ‘high’ reading, it must be true that there is either no oil, some oil or much oil. Thus it must be true that $P(H) = P(H \cap S_1) + P(H \cap S_2) + P(H \cap S_3) = P(H|S_1)P(S_1) + P(H|S_2)P(S_2) + P(H|S_3)P(S_3) = .04(.7) + .02(.2) + .96(.1) = .028 + .004 + .096 = .128$.
(iii) We had to bring in Bayes’ rule sometime! $P(S_1|H) = \frac{P(H|S_1)P(S_1)}{P(H)} = \frac{.04(.7)}{.128} = \frac{.028}{.128} = .21875$.
Another way of looking at this is to assume that we check 1000 locations. Then 70% or 700 will have no oil, 20% or 200 will have some oil and 10% or 100 will have much oil. $P(H|S_1) = .04$ means that of the 700 locations with no oil, 4% or 28 will give a ‘high’ reading. $P(H|S_2) = .02$ means that of the 200 locations with some oil, 2% or 4 will give a ‘high’ reading. $P(H|S_3) = .96$ means that of the 100 locations with much oil, 96% or 96 will give a ‘high’ reading. We thus get a total of 28 + 4 + 96 = 128 ‘high’ readings. 28 of these locations have no oil, so $P(S_1|H) = \frac{28}{128} = .21875$.

6. The Phillies are in the 2003 World Series. I estimate that they have a constant .6 chance of winning a game. There are seven games in a series and the series stops if one team wins four games.
a. If there are seven games played, what are the mean and variance of the number of games the Phillies win? (2)
b. What is the chance that they will win at least 4 of the seven games? (You can assume all seven games are played.) (2)
c. What is the chance that they will win the series by winning the first four games? (1)
d. What is the chance that the first game that they win is the third game? (2)
e. What is the chance that they win the series on the fifth game (but not the 4th)? (Ask yourself what has to happen in the first 4 games so that they do not win their 4th game until the 5th try.) (3)
f. What is the chance that they lose the first three games and win the series? (2)
g. Let $x$ represent the number of the first game they win (so that $P(x = 3)$ is the probability that the first game that they win is the third game). What are the mean and standard deviation of $x$? (2)

Solution:
a) This starts out as a binomial problem with $n = 7$ and $p = .6$, so $q = 1 - .6 = .4$. For this distribution $\mu = np = 7(.6) = 4.2$ and $\sigma^2 = npq = 7(.6)(.4) = 4.2(.4) = 1.68$. This starts out as the simplest binomial problem available, 7 tries with a constant probability of .6. Why did so many of you have to make it more complicated?
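The binomial formulas in a) can be cross-checked by summing over the full distribution. This is a minimal sketch, assuming all seven games are played with a constant win probability of .6; the helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent tries with success chance p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 7, 0.6
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the definition.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean     = {mean:.4f}  (np  = {n * p:.4f})")
print(f"variance = {variance:.4f}  (npq = {n * p * (1 - p):.4f})")
```

Both the direct sums and the shortcut formulas $np$ and $npq$ should agree to within floating-point rounding.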
b) We want the probability of 4 or more games won in 7 tries. It is probably easier to think of 4 to 7 wins. Since we do not have a Binomial table for probabilities above .5, we must recast the problem in terms of failures. 4 successes in 7 tries would be 3 failures and 7 successes would be no failures. We want the probability of 3 or fewer failures in 7 tries when the probability of failure is .4. According to the table $P(x \le 3) = .71021$.

c) This is a binomial problem too. But it is easier to just figure out the probability of the intersection of four independent events: $(.6)^4 = .1296$.

d) This is a geometric distribution problem. $P(x) = q^{x-1}p$, so $P(3) = (.4)^2(.6) = .0960$

e) In order to win on the 5th game we must win exactly 3 of the first four games and then win again. First we find the probability of winning 3 out of 4 games. If we want to do this by hand, $P(x) = C^n_x p^x q^{n-x}$, so $P(3) = C^4_3 p^3 q^1 = 4(.6)^3(.4) = .3456$. Using the table, we want the probability of 1 failure in 4 tries when the chance of failure is .4: $P(1) = P(x \le 1) - P(x \le 0) = .4752 - .1296 = .3456$. To finish, we must multiply this by the .6 probability of success: $.6(.3456) = .20736$.

f) To lose 3 games and win the series, we need 3 failures followed by 4 successes. The probability is $q^3p^4 = (.4)^3(.6)^4 = .0082944$.

g) This is the mean and standard deviation of the geometric distribution. The outline says $\mu = \frac{1}{p} = \frac{1}{.6} = 1.6667$ and $\sigma^2 = \frac{q}{p^2} = \frac{.4}{(.6)^2} = 1.1111$, so $\sigma = \sqrt{1.1111} = 1.0541$.
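Parts b) through g) can be checked numerically as well. The sketch below assumes a constant win probability of .6 and independent games, as stated in the problem; all variable names are illustrative.

```python
from math import comb, sqrt

p, q = 0.6, 0.4  # chance of winning / losing a single game

# b) at least 4 wins when all 7 games are played
p_at_least_4 = sum(comb(7, k) * p ** k * q ** (7 - k) for k in range(4, 8))

# c) win the first four games
p_sweep = p ** 4

# d) first win comes in game 3 (geometric: two losses, then a win)
p_first_win_game_3 = q ** 2 * p

# e) exactly 3 wins in the first 4 games, then a win in game 5
p_win_in_game_5 = comb(4, 3) * p ** 3 * q * p

# f) three losses followed by four wins
p_lose_3_then_win = q ** 3 * p ** 4

# g) geometric distribution: mean 1/p, variance q/p^2
geo_mean = 1 / p
geo_sd = sqrt(q) / p

print(f"b) {p_at_least_4:.5f}")
print(f"c) {p_sweep:.4f}")
print(f"d) {p_first_win_game_3:.4f}")
print(f"e) {p_win_in_game_5:.5f}")
print(f"f) {p_lose_3_then_win:.7f}")
print(f"g) mean = {geo_mean:.4f}, sd = {geo_sd:.4f}")
```

Summing the binomial terms directly in b) avoids the need to recast the problem in terms of failures, which is only a workaround for tables that stop at p = .5.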