251solnH3 2/13/08 (Open this document in 'Page Layout' view!)

H. Introduction to Probability

1. Experiments and Probability. Text problems 4.1, 4.2.
2. The Venn Diagram and the Addition Rule. Downing & Clark, pg. 96 (pg. 85 in 3rd ed) Basics 1, Application 2, 13. H1 (H0A), Text problems 4.3, 4.4, 4.10, 4.11, 4.8!, 4.9! [4.3, 4.4, 4.8*, 4.9*] (4.3, 4.4, 4.8*, 4.9*). H4, H5 (H1, H2).
3. Conditional and Joint Probability, Bayes' Rule. Text 4.16a-c, 4.18 [4.14a-c, 4.16] (4.13a-c, 4.15). H2, H3 (H0B, H0C), D&C pg. 113 (pg. 103 in 3rd ed) 14 (Note error in text: 5/6 of the people in the city support Jones, 5/9 of the people in the country support Jones), 15, 16. H6 (H3).
4. Statistical Independence. Text 4.16d, 4.22, 4.21!, 4.24, 4.30, 4.31, 4.33 [4.14d*, 4.19*, 4.20*, 4.22*, 4.28, 4.29, 4.31, 4.68*] (4.18*, 4.19*, 4.21*, 4.26, 4.27, 4.29). H8, H9 (H5, H6). pg. 97 (pg. 85 in 3rd ed) Applications 4, 5, 8, 9, 10, 11, 44. H7 (H4).
5. Review.

Section 4 is in this document. Remember that $A$ and $B$ are statistically independent if $P(A \mid B) = P(A)$, and that this implies $P(A \cap B) = P(A)P(B)$.

Exercise 4.16d [4.14d in 9th] (Not in 8th edition): The text gave the following contingency table.

              $B$    $\bar{B}$
  $A$          10     20
  $\bar{A}$    20     40

We found the following probabilities: (a) $P(A \mid B) = .3333$; (b) $P(A \mid \bar{B}) = .3333$ and (c) $P(\bar{A} \mid \bar{B}) = .6667$. Are the events $A$ and $B$ statistically independent?

Solution: We say that the events $A$ and $B$ are statistically independent if $P(A \mid B) = P(A)$. Alternately, we can note that if the events $A$ and $B$ are statistically independent, $P(A \cap B) = P(A)P(B)$. Note that we found the following joint probability table.

              $B$      $\bar{B}$
  $A$         .1111    .2222    .3333
  $\bar{A}$   .2222    .4444    .6667
              .3333    .6667    1.0000

This represents

              $B$                     $\bar{B}$
  $A$         $P(A \cap B)$           $P(A \cap \bar{B})$          $P(A)$
  $\bar{A}$   $P(\bar{A} \cap B)$     $P(\bar{A} \cap \bar{B})$    $P(\bar{A})$
              $P(B)$                  $P(\bar{B})$                 1

Since we found $P(A) = P(A \mid B) = P(A \mid \bar{B}) = 0.3333$, it is extremely obvious that the occurrence of the event $A$ does not in any way depend on whether event $B$ occurs, so they are statistically independent.
We can also note that every number on the inside of the joint probability table is the product of the numbers on the outside. We can also see that the second column is proportional to the first and that the second row is proportional to the first. These are all symptoms of independence.

Exercise 4.22 [4.19 in 9th] (Not in 8th edition): The table reads:

                  Condition of Die
  Quality    No Particles   Particles   Total
  Good            320           14        334
  Bad              80           36        116
  Total           400           50        450

The table represents the condition of the dies used and the acceptability of 450 wafers. 'Particles' means that particles were on the die that produced a given wafer. A wafer can be classified as 'Good' or 'Bad.'
a) If a wafer is bad, what is the probability that it was produced from a die that had particles?
b) If a wafer is good, what is the probability that it was produced from a die that had particles?
c) Are the two events 'Particles' and 'Good' independent?

Solution: Define the following events:
  $Y$: "Yes, there are particles."
  $N$: "There are no particles."
  $G$: "The wafer is good."
  $B$: "The wafer is bad."
If we use the events above, we get the first table below. If we divide by 450, we get the second table.

         $N$    $Y$                   $N$      $Y$
  $G$    320     14    334     $G$   .7111    .0311    .7422
  $B$     80     36    116     $B$   .1778    .0800    .2578
         400     50    450           .8889    .1111    1.0000

According to the Instructor's Solutions Manual (edited):
(a) $P(Y \mid B)$ = P(had particles | bad) = 36/116 = 0.3103
(b) $P(Y \mid G)$ = P(had particles | good) = 14/334 = 0.0419
(c) $P(N \mid G)$ = P(no particles | good) = 320/334 = 0.9581 and $P(N)$ = P(no particles) = 400/450 = 0.8889. Since P(no particles | good) $\neq$ P(no particles), "a good wafer" and "a die with no particles" are not statistically independent. The same conclusion therefore holds for 'Particles' and 'Good.'
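The independence test used in Exercises 4.16d and 4.22 can be sketched in a few lines of Python. This is only an illustration, not part of the text or the Instructor's Solutions Manual: the dictionary layout and the helper names `marginal`, `conditional` and `independent` are assumptions made for this example, and the counts are the wafer counts from the table above.

```python
import math

def marginal(counts, axis, label):
    """Sum the cells whose row (axis 0) or column (axis 1) label matches."""
    return sum(n for key, n in counts.items() if key[axis] == label)

def conditional(counts, col, row):
    """P(col | row): the share of row `row` that falls in column `col`."""
    return counts[(row, col)] / marginal(counts, 0, row)

def independent(counts, row, col, tol=1e-9):
    """Compare P(col | row) with the unconditional P(col)."""
    total = sum(counts.values())
    p_col = marginal(counts, 1, col) / total
    return math.isclose(conditional(counts, col, row), p_col, abs_tol=tol)

# Hypothetical encoding of the Exercise 4.22 table: keys are (row, column).
wafers = {("G", "N"): 320, ("G", "Y"): 14,
          ("B", "N"): 80,  ("B", "Y"): 36}

p_y_given_b = conditional(wafers, "Y", "B")             # 36/116
p_y_given_g = conditional(wafers, "Y", "G")             # 14/334
p_n_given_g = conditional(wafers, "N", "G")             # 320/334
p_n = marginal(wafers, 1, "N") / sum(wafers.values())   # 400/450
```

Because real sample counts rarely satisfy the equality exactly, the tolerance `tol` is a design choice for this sketch; a formal test of independence is a separate topic.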
Exercise 4.21 [Not in 8th or 9th]: 2000 community members were sampled with the following results:

  Drives to Work   Homeowner   Renter   Total
  Yes                 824        681     1505
  No                  176        319      495
  Total              1000       1000     2000

Find a) the probability that someone who drives to work is a homeowner, b) the probability that a homeowner drives to work, c) the difference between a) and b) and d) whether driving to work and being a homeowner are independent.

Solution: This table was made into a joint probability table by dividing through by 2000. Use $H$ for homeowner, $R$ for renter and $D$ for "drives to work."

              $H$      $R$
  $D$         .4120    .3405    .7525
  $\bar{D}$   .0880    .1595    .2475
              .5000    .5000    1.0000

So, for example, $P(H \cap D) = .4120$. This is the probability that a respondent is both a homeowner and drives to work.

According to the Instructor's Solutions Manual (edited):
(a) P(a homeowner | drives to work) = 824/1505 = 0.5475, or $P(H \mid D) = \frac{P(H \cap D)}{P(D)} = \frac{.4120}{.7525} = .5475$
(b) P(drives to work | a homeowner) = 824/1000 = 0.8240, or $P(D \mid H) = \frac{P(D \cap H)}{P(H)} = \frac{.4120}{.5000} = .8240$
(c) The conditional events are reversed.
(d) Since P(a homeowner) = 1000/2000 = 0.50 is not equal to P(a homeowner | drives to work) = 824/1505 = 0.5475, driving to work and whether the respondent is a homeowner or a renter are not statistically independent.

Exercise [4.20 in 9th] (Not in 8th edition): The table reads:

  Book Airline Tickets     Research Airline Tickets on the Internet?
  on the Internet?            Yes       No      Total
  Yes                          88       20       108
  No                          124      168       292
  Total                       212      188       400

Find a) the probability that someone who researches ticket prices on the internet books on the internet, b) the probability that someone who books on the internet researches prices on the internet and c) the difference between a) and b).
Solution: Define the following events:
  $B$: "Books airline tickets on the Internet."
  $\bar{B}$: "Does not book airline tickets on the Internet."
  $R$: "Researches airline prices on the Internet."
  $\bar{R}$: "Does not research airline prices on the Internet."
If we use the events above, we get the first table below. If we divide by 400, we get the second table.

              $B$   $\bar{B}$                      $B$    $\bar{B}$
  $R$          88     124     212     $R$          .22      .31      .53
  $\bar{R}$    20     168     188     $\bar{R}$    .05      .42      .47
              108     292     400                  .27      .73     1.00

According to the Instructor's Solutions Manual (edited):
(a) $P(B \mid R)$ = P(book tickets on the internet | research ticket prices on the internet) = 88/212 = 0.4151
(b) $P(R \mid B)$ = P(researches ticket prices on the internet | book tickets on the internet) = 88/108 = 0.8148
(c) The conditional events are reversed.
(d) Since $P(B)$ = P(book tickets on the internet) = 108/400 = 0.27 is not equal to $P(B \mid R)$ = P(book tickets on the internet | research ticket prices on the internet) = 88/212 = 0.4151, researching airline ticket prices on the internet and booking airline tickets on the internet are not statistically independent.

Exercise 4.24 [4.22 in 9th] (Not in 8th edition): Of 56 white workers terminated, 29 claimed bias. Of 407 black workers terminated, 126 claimed bias. Find a) the probability that a white worker claims bias, b) the probability that a worker who has claimed bias is white, c) the difference between the meaning of a) and b) and d) whether claiming bias and being white are independent.

Solution: Given: 56 + 407 = 463 workers were terminated. Of these 56 were white, so 56 out of 463 or 12.10% were white. 29 + 126 = 155 claimed bias.

According to the Instructor's Solutions Manual:
(a) P(claimed bias | white) = 29/56 = 0.5179
(b) P(white | claim bias) = 29/155 = 0.1871
(c) The conditional events are reversed.
(d) Since P(white | claim bias) = 0.1871 is not equal to P(white) = 0.1210, being white and claiming bias are not statistically independent.
As you may have guessed, I would be much happier with a formal solution using tables. Let us define the following events: $W$ is 'white' and $B$ is 'claims bias.' The problem says 'Of 56 white workers terminated, 29 claimed bias,' so $P(B \mid W) = \frac{29}{56} = .5179$. It also says 'Of 407 black workers terminated, 126 claimed bias,' or $P(B \mid \bar{W}) = \frac{126}{407} = .3096$. It asks for $P(W \mid B)$. If we do this as a table and use numbers rather than probabilities we find the following. You can read the probabilities from the table.

              $B$    $\bar{B}$
  $W$          29      __       56
  $\bar{W}$   126      __      407
              155      __      463

In terms of probabilities, we may recall that $P(W) = .1210$. This means that $P(\bar{W}) = 1 - .1210 = .8790$. Bayes' rule says $P(W \mid B) = \frac{P(B \mid W)P(W)}{P(B)}$. We need to find

$P(B) = P(B \mid W)P(W) + P(B \mid \bar{W})P(\bar{W}) = .5179(.1210) + .3096(.8790) = .06267 + .27214 = .3348$,

which agrees with $\frac{155}{463} = .3348$. So $P(W \mid B) = \frac{.5179(.1210)}{.3348} = .1872$.

Exercise 4.30 [4.28 in 9th] (4.26 in 8th edition): If $P(B) = .05$, $P(\bar{B}) = .95$, $P(A \mid B) = .80$ and $P(A \mid \bar{B}) = .40$, find $P(B \mid A)$.

Solution: According to the Instructor's Solutions Manual (using B' for $\bar{B}$):

  First branch   Second branch   Joint probability
  B   0.05       A   0.80        0.04
                 A'  0.20        0.01
  B'  0.95       A   0.40        0.38
                 A'  0.60        0.57

$P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B')P(B')} = \frac{0.8(0.05)}{0.8(0.05) + 0.4(0.95)} = \frac{0.04}{0.42} = 0.095$

Exercise 4.31 [4.29 in 9th] (4.27 in 8th edition): If $P(B) = .30$, $P(A \mid B) = .60$ and $P(A \mid \bar{B}) = .50$, find $P(B \mid A)$.

Solution: According to the Instructor's Solutions Manual:

$P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B')P(B')} = \frac{0.6(0.3)}{0.6(0.3) + 0.5(0.7)} = \frac{0.18}{0.53} = 0.340$

Exercise 4.33 [4.31 in 9th] (4.29 in 8th edition): The problem says that husbands watch television in prime time 60% of the time. When the husband is watching TV, the wife is watching 40% of the time and when the husband is not watching television, the wife is watching TV 30% of the time. It asks for a) the probability that, if the wife is watching TV, the husband is also watching and b) the probability that the wife is watching.
Solution: According to the Instructor's Solutions Manual, if we define $H$ = husband watching and $W$ = wife watching, the facts that are given in the problem are $P(H) = .60$, $P(W \mid H) = .40$ and $P(W \mid \bar{H}) = .30$. We deduce that $P(\bar{H}) = 1 - P(H) = .40$.

(a) $P(H \mid W) = \frac{P(W \mid H)P(H)}{P(W \mid H)P(H) + P(W \mid \bar{H})P(\bar{H})} = \frac{0.4(0.6)}{0.4(0.6) + 0.3(0.4)} = \frac{0.24}{0.36} = \frac{2}{3} = 0.667$

(b) $P(W) = 0.24 + 0.12 = 0.36$.

To do this by the box method, note that of 100 husbands 60%, or 60 husbands, are watching in prime time. So 40 are not watching.

              $H$   $\bar{H}$
  $W$
  $\bar{W}$
               60      40     100

Of the 60 wives whose husbands are watching television, 40% or 24 are also watching television. Of the 40 wives whose husbands are not watching television, 30% or 12 are watching television.

              $H$   $\bar{H}$
  $W$          24      12
  $\bar{W}$
               60      40     100

If we just add up the stuff in the box and fill in the blanks, we get a complete table.

              $H$   $\bar{H}$
  $W$          24      12      36
  $\bar{W}$    36      28      64
               60      40     100

We find that 36 wives are watching TV. 24 of these have husbands that are also watching, so that our conditional probability is $P(H \mid W) = \frac{24}{36} = 0.667$. Using the box, we also find that 36 women out of 100 are watching, so $P(W) = 0.36$.

Exercise 4.68 (Not in 10th or 8th edition): In 1997, 24.0% of all highway fatalities involved a rollover. 15.8% of all accidents involving a rollover involved SUVs, vans and pickups. Given that a rollover was not involved, 5.6% of fatalities involved SUVs, vans and pickups. Define the following events:
  $A$ = {Fatality involved an SUV, van or pickup}
  $B$ = {Fatality involved a rollover}
a. Use Bayes' theorem to find the probability that the fatality involved a rollover, given that the fatality involved an SUV, van or pickup.
b. Compare the result in (a) to the probability that the fatality involved a rollover, and comment on whether SUVs, vans and pickups are more prone to rollover accidents.

Solution: Given: $P(A \mid B) = .158$, $P(B) = .240$, $P(A \mid \bar{B}) = .056$. Try this: 24% of all highway fatalities involved a rollover.
76% of fatalities did not involve a rollover. Since 15.8% of rollover accidents involved an SUV etc., out of 100 accidents 24 involved a rollover and (.158)24 = 3.792 involved both a rollover and an SUV. Also, out of the 76 accidents out of 100 that did not involve a rollover, 5.6% or (.056)76 = 4.256 involved an SUV. Thus 4.256 + 3.792 = 8.048 accidents involved an SUV and 3.792 also had a rollover, so the fraction of SUV accidents that involved a rollover was 3.792 out of 8.048 or 47.12%.

Bayes' rule says $P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A)}$. And

$P(A) = P(A \cap B) + P(A \cap \bar{B}) = P(A \mid B)P(B) + P(A \mid \bar{B})P(\bar{B}) = .158(.24) + .056(.76) = .03792 + .04256 = .08048$.

So $P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A)} = \frac{.158(.24)}{.08048} = .4712$.

According to the Instructor's Solutions Manual (4.65):
(a) P(B | A) = (0.158)(0.24)/((0.158)(0.24) + (0.056)(0.76)) = 0.4712
(b) Since the probability that a fatality involved a rollover, given that the fatality involved an SUV, van or pickup, is 0.4712, which is almost twice the probability of 0.24 that a fatality involved a rollover with any vehicle type, SUVs, vans and pickups are generally more prone to rollover accidents.

PROBLEM H8: Explain why mutually exclusive events are also dependent events.

Solution: If $A$ and $B$ are mutually exclusive, $P(A \cap B) = 0$ and so $P(A \mid B) = 0$. If they are independent, $P(A \mid B) = P(A)$. If $P(A) \neq 0$ and $P(B) \neq 0$ and $A$ and $B$ are independent, can $P(A \mid B)$ be zero? Think about it!

PROBLEM H9: (McClave, Benson and Sincich): Flip a coin three times. Define the following events: $A$ = {At least one head}, $B$ = {Exactly two heads}, $C$ = {Exactly two tails} and $D$ = {At most one head}. You have seen a tree diagram for this in class that showed that $P(HHH) = P(HHT) = P(HTH) = \dots = P(TTT) = \frac{1}{8}$. We can do this in a more sophisticated way now. Let $H_1$ be a head on the first try, $H_2$ be a head on the second try and $H_3$ be a head on the third try. These three events are independent so, by the extended Multiplication Rule,

$P(HHH) = P(H_3 \cap H_2 \cap H_1) = P(H_3 \mid H_2 \cap H_1)P(H_2 \mid H_1)P(H_1) = P(H_3)P(H_2)P(H_1) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$.

This applies to all the possible events, which are $HHH$, $HHT$, $HTH$, $HTT$, $THH$, $THT$, $TTH$ and $TTT$.
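All have probability $\frac{1}{8}$.
a) Find $P(A)$, $P(B)$, $P(C)$, $P(D)$, $P(A \cap B)$, $P(A \cap D)$, $P(B \cap C)$ and $P(B \cap D)$.
b) Use your answers in a) to calculate $P(A \mid B)$, $P(A \mid D)$ and $P(C \mid B)$.
c) Which pairs of events are independent?

Solution:
a) $P(A) = P(\overline{TTT}) = 1 - P(TTT) = 1 - \frac{1}{8} = \frac{7}{8}$,
$P(B) = P(HHT) + P(HTH) + P(THH) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{3}{8}$,
$P(C) = \frac{3}{8}$ (since heads and tails are equally likely),
$P(D) = P(TTT) + P(HTT) + P(THT) + P(TTH) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$,
$P(A \cap D) = P(HTT) + P(THT) + P(TTH) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{3}{8}$,
$P(A \cap B) = P(B) = \frac{3}{8}$, and $P(B \cap C) = P(B \cap D) = 0$.

b) $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{3/8}{3/8} = 1$. $P(A \mid D) = \frac{P(A \cap D)}{P(D)} = \frac{3/8}{4/8} = \frac{3}{4}$. $P(C \mid B) = \frac{P(B \cap C)}{P(B)} = \frac{0}{3/8} = 0$.

c) Testing for independence.

  Pair   Conditional Probability                                                        Probability
  AB     $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{3/8}{3/8} = 1$                 $P(A) = \frac{7}{8}$
  AC     $P(A \mid C) = \frac{P(A \cap C)}{P(C)} = \frac{P(C)}{P(C)} = 1$               $P(A) = \frac{7}{8}$
  AD     $P(A \mid D) = \frac{P(A \cap D)}{P(D)} = \frac{3/8}{1/2} = \frac{3}{4}$       $P(A) = \frac{7}{8}$
  BC     $P(B \mid C) = \frac{P(B \cap C)}{P(C)} = \frac{0}{3/8} = 0$                   $P(B) = \frac{3}{8}$
  BD     $P(B \mid D) = \frac{P(B \cap D)}{P(D)} = \frac{0}{1/2} = 0$                   $P(B) = \frac{3}{8}$
  CD     $P(C \mid D) = \frac{P(C \cap D)}{P(D)} = \frac{3/8}{1/2} = \frac{3}{4}$       $P(C) = \frac{3}{8}$

Note that the definition of independence for $A$ and $B$ is $P(A \mid B) = P(A)$. Because in every row the conditional probability differs from the unconditional probability, none of the above pairs are independent.

Downing and Clark, pg. 85, Application 4: The probability of at least one head on the first three of 5 coin flips.
Solution: You only need to consider the first three tosses. We have shown in class that $P(TTT) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$. Now we want $P(\overline{TTT}) = 1 - P(TTT) = 1 - \frac{1}{8} = \frac{7}{8}$.

Downing and Clark, pg. 85, Application 5: If you roll a die 3 times, what is the chance of a 1 on at least one of the 3 tosses?
Solution: The probability of 1 on 1 toss is $\frac{1}{6}$, so the probability of not getting 1 on 1 toss is $1 - \frac{1}{6} = \frac{5}{6}$. The probability of not getting 1 on all 3 tosses is $\left(\frac{5}{6}\right)^3$. Since a 1 on at least one of the 3 tosses is the complement of not getting 1 on all 3 tosses, its probability is $1 - \left(\frac{5}{6}\right)^3 = 1 - .5787 = .4213$.

Downing and Clark, pg. 85, Application 8: Find the probability of at least one head on $n$ flips.
Solution: Let $T_i$ = Tails on flip $i$. Then the probability of all tails on $n$ flips is $P(T_1 \cap T_2 \cap T_3 \cap \dots \cap T_n) = \prod_{i=1}^{n} P(T_i) = \left(\frac{1}{2}\right)^n$. So the probability of not getting all tails is $1 - \left(\frac{1}{2}\right)^n$.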
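The 'at least one' reasoning in Applications 4, 5 and 8 is the same complement calculation each time, and a short Python sketch makes the pattern explicit. The function name `at_least_once` and its parameters are hypothetical names chosen for this illustration; the only assumption is that the trials are independent with the same success probability.

```python
def at_least_once(p_success, n_trials):
    """P(at least one success in n independent trials) = 1 - (1 - p)^n."""
    return 1 - (1 - p_success) ** n_trials

# Application 4: at least one head in three flips, 1 - (1/2)^3.
heads_in_3_flips = at_least_once(1 / 2, 3)

# Application 5: at least one 1 in three rolls of a die, 1 - (5/6)^3.
one_in_3_rolls = at_least_once(1 / 6, 3)

# Application 8: the same rule for any number of flips n.
heads_in_n_flips = [at_least_once(1 / 2, n) for n in range(1, 6)]
```

The function works with the complement ('no successes at all') because that event is a single product of independent probabilities, while 'at least one success' is a union of many overlapping events.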
Downing and Clark, pg. 85, Application 9: If your 2 favorite prime-time shows are on different networks and there are 21 prime-time slots, what is the chance that they will both be on at the same time?
Solution: Arbitrarily assume that the first show is on at 8PM on Monday. Since this is one of the 21 slots, the chance that the other is on at the same time is $\frac{1}{21}$. This is actually the answer to the problem, since it would be true no matter what time the first show was on.
Note: If you look at the next two problems, you will see that the probability that both shows appear at 8PM on Monday is $\frac{1}{21} \cdot \frac{1}{21}$, but you have to do this for all 21 possible slots, so the answer is $21 \cdot \frac{1}{21} \cdot \frac{1}{21} = \frac{1}{21}$.

Downing and Clark, pg. 85, Application 10: If your 2 favorite prime-time shows are on different networks, what is the chance that they will both be on the same night?
Solution: By the same logic as in the previous problem, the answer is $\frac{1}{7}$.

Downing and Clark, pg. 85, Application 11: If your 2 favorite prime-time shows are on different networks, what is the chance that they will both be shown on Monday?
Solution: The probability that the first show will be televised on Monday is $\frac{1}{7}$. The probability that the second show will appear on the same day is also $\frac{1}{7}$, as we showed in the previous problem. Therefore the joint probability of the two events occurring is $\frac{1}{7} \cdot \frac{1}{7} = \frac{1}{49}$.

Downing and Clark, pg. 85, Application 44: If you roll a die $n$ times, what is the probability that at least one 1 will turn up?
Solution: 'At least one' is a code word for 'not none!' On any one roll the probability of not rolling a 1 is $\frac{5}{6}$, so the probability of not rolling a 1 on $n$ tries is $\left(\frac{5}{6}\right)^n$. The probability of the complement of this event is $1 - \left(\frac{5}{6}\right)^n$.
If you want to work this out, the probability rises surprisingly slowly. If $n = 1$, the probability is $1 - \frac{5}{6} = .1667$. If $n = 2$, the probability is $1 - \left(\frac{5}{6}\right)^2 = .3056$.
If $n = 3$, the probability is $1 - \left(\frac{5}{6}\right)^3 = .4213$. If $n = 4$, the probability is $1 - \left(\frac{5}{6}\right)^4 = .5177$. Only when $n = 25$ does the probability rise to about 99%.

PROBLEM H7: A machine has two components with the following probabilities of failure:

  Day   Component 1   Component 2
   1       .25            0
   2       .25           .50
   3       .25           .50
   4       .25            0

Notice that, since neither component has a life of more than four days, the machine cannot last beyond day 4 if it requires both or either component to function. Thus the probabilities of it failing on the four days must add to 1.
a. If the machine requires both components to operate, it cannot fail on day 4, since component 2 cannot last beyond day 3. What is the probability of the machine failing on day 1? day 2? day 3?
b. Instead of assuming that the machine needs both components to work, assume that it will operate as long as either component 1 or component 2 is working. Notice that this means that it cannot fail on day 1. What about the probability of failure on day 2? day 3? day 4?

Solution: Let the table below define events $A$ through $F$.

  Day   Component 1 Fails   Component 2 Fails
   1          A
   2          B                    E
   3          C                    F
   4          D

For example, Event $B$ is 'Component 1 fails on Day 2.' Though Events $A$, $B$, $C$ and $D$ are mutually exclusive, as are events $E$ and $F$, events involving component 1 are independent of events involving component 2. This means, for example, that $P(B \cap F) = P(B)P(F) = .25(.50) = .125$. In Part a of the problem, where the machine needs both components to work, if event $B \cap F$ occurs, the machine fails on day 2. If we check each possible joint event, we find that the event 'Machine fails on Day 2' is the union of Events $B \cap E$, $C \cap E$, $D \cap E$ and $B \cap F$. Because these joint events are mutually exclusive, we can find the probability of their union by adding their probabilities. The table below names the joint events, gives their probabilities and tells when the machine fails under the assumptions of Part a and Part b.
  Joint Event    Probability   Under Assumptions of Part a,   Under Assumptions of Part b,
                               Machine fails on day           Machine fails on day
  $A \cap E$        .125                 1                              2
  $B \cap E$        .125                 2                              2
  $C \cap E$        .125                 2                              3
  $D \cap E$        .125                 2                              4
  $A \cap F$        .125                 1                              3
  $B \cap F$        .125                 2                              3
  $C \cap F$        .125                 3                              3
  $D \cap F$        .125                 3                              4

Note that in Part a the component that fails first determines when the machine fails, while in Part b the component that fails last determines when the machine fails. The final step is to add together all the probabilities associated with each day, so that, for example, there are two joint events that lead to failure on day 2 in Part b. This means that the probability of machine failure on day 2 is .125 + .125 = .250. The last table summarizes the results.

  Day   Probability of Failure under   Probability of Failure under
        Assumptions of Part a          Assumptions of Part b
   1            .250                           0
   2            .500                          .250
   3            .250                          .500
   4             0                            .250

If we handle this with a joint probability table we can write it as:

  Events    A      B      C      D     Total
  E        .125   .125   .125   .125    .50
  F        .125   .125   .125   .125    .50
  Total    .25    .25    .25    .25    1.00

In Part a, where both components are needed for the machine to operate, the machine will fail on day 1 only if event $A$ occurs. It will fail on the third day only if both components last until the third day, that is, only if event $F$ occurs at the same time as $C$ or $D$. The probability of event $A$ is .25 and the probability of either event $C \cap F$ or $D \cap F$ is .125 + .125 = .25. Since the machine cannot fail on day 4 and the probabilities must add to 1, the probability of failure on day 2 is 1 - .25 - .25 = .50. Similar reasoning can be used in part b.

Parts not copied ©2003 Roger Even Bove

Appendix: Solutions to problems that were in the 8th edition.

Exercise 4.18: Again, we have a contingency table.
              $T$   $\bar{T}$
  $B$          60      60     120
  $\bar{B}$    15      65      80
               75     125     200

According to the Instructor's Solutions Manual (edited):
(a) $P(T \mid B)$ = P(has travel/entertainment credit card | has bank credit card) = 60/120 = 1/2 = 0.5
(b) $P(B \mid \bar{T})$ = P(has bank credit card | does not have travel/entertainment credit card) = 60/125 = 12/25 = 0.48
(c) Since $P(T \mid B)$ = P(has travel/entertainment credit card | has bank credit card) = 60/120 or 0.5 and $P(T)$ = P(has travel/entertainment credit card) = 75/200 or 0.375, the two events are not statistically independent.

Exercise 4.19: Again, we have a contingency table. Here $T$ is 'received product in time' and $S$ is 'satisfied.'

                    $T$ (yes)   $\bar{T}$ (no)
  $S$ (yes)           1197          33         1230
  $\bar{S}$ (no)       127         143          270
                      1324         176         1500

According to the Instructor's Solutions Manual (edited):
(a) $P(\bar{T} \mid S)$ = P(did not receive product in time | satisfied) = 33/1230 = 0.0268
(b) $P(S \mid T)$ = P(satisfied | received product in time) = 1197/1324 = 0.904
(c) The conditioned event is different in (a) and (b). In (a) it is known that the customer is satisfied, while in (b) it is known that the customer did receive the product in time for the holiday.
(d) Since $P(S)$ = P(satisfied) = 1230/1500 = 0.82, which is not equal to $P(S \mid T)$ = P(satisfied | received product in time) = 0.904, being satisfied with their experience and receiving their product in time for the holidays are not statistically independent.

Exercise 4.21: According to the Instructor's Solutions Manual:
(a) P(does not trade online | bullish) = 240/585 = 0.410
(b) P(does not trade online | not bullish) = 260/415 = 0.627
(c) Since P(does not trade online) = 500/1000 = 0.50 is not equal to P(does not trade online | bullish) = 240/585 = 0.410, being bullish on the market and the type of investor are not statistically independent.