Slide set 6
Stat 330 (Spring 2015)
Last update: January 13, 2015

Stat 330 (Spring 2015): slide set 6

Total Probability Law and Bayes’ Rule

Experiment: Treasure Hunt
• Box 1 has two gold coins.
• Box 2 has one gold coin and one silver coin.
• Box 3 has two silver coins.
• Suppose that you select one of the boxes at random and then select one of the coins from this box. What is the probability that the coin you select is a gold coin?

For a problem like this, which consists of a step-wise procedure, it is often useful to draw a tree (a flow chart) of the choices we can make in each step.

Define $B_1$, $B_2$, $B_3$ to be the events that Box 1, 2, or 3 is selected.

Treasure Hunt... continued

A tree diagram shows all possible outcomes of this two-step procedure:

[Figure: tree diagram]

Choosing one box at random means that all boxes are equally likely to be chosen: $P(B_i) = \frac{1}{3}$ for $i = 1, 2, 3$.

$P(\text{select a gold coin} \mid B_1) = 1$. Why?
$P(\text{select a gold coin} \mid B_2) = 0.5$. Why?

Define new events $E_1$ and $E_2$:
$E_1$ = choose Box 1 and pick a gold coin
$E_2$ = choose Box 2 and pick a gold coin

Treasure Hunt... continued

We use the definition of conditional probability to get $P(E_1)$ and $P(E_2)$:

$$P(E_1) = P(\text{choose Box 1 and pick a gold coin}) = P(\text{pick a gold coin} \mid B_1) \cdot P(B_1) = 1 \cdot \frac{1}{3} = \frac{1}{3}.$$

$$P(E_2) = P(\text{choose Box 2 and pick a gold coin}) = P(\text{pick a gold coin} \mid B_2) \cdot P(B_2) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}.$$

The probability of choosing a gold coin is the sum of $P(E_1)$ and $P(E_2)$. This is because, as we have seen in the tree diagram, these are the only ways to get a gold coin, and they are mutually exclusive (disjoint). Thus we have:

$$P(\text{gold coin}) = \frac{1}{3} + \frac{1}{6} = 0.5.$$

We just used the Law of Total Probability to compute the probability of choosing a gold coin.

Law of Total Probability

Definition. A collection of events $B_1, \ldots,$
$B_k$ is called a cover or partition of $\Omega$ if
(i) the events are mutually exclusive (i.e., $B_i \cap B_j = \emptyset$ for $i \neq j$), and
(ii) the union of the events is $\Omega$ (i.e., $\bigcup_{i=1}^{k} B_i = \Omega$).

• We can represent a cover using a Venn diagram:

[Figure: Venn diagram of a cover]

• If we represent a multi-step procedure with a tree diagram, then the branches of the tree form a cover.

Law of Total Probability (continued...)

Theorem: Law of Total Probability. If the collection of events $B_1, \ldots, B_k$ is a cover of $\Omega$, and $A$ is an event, then

$$P(A) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i).$$

Proof of the Law of Total Probability:
• By the definition of conditional probability, $P(A \mid B_i) P(B_i) = P(A \cap B_i)$.
• Because $B_1, \ldots, B_k$ partition $\Omega$, the events $A \cap B_1, \ldots, A \cap B_k$ are disjoint, and $\bigcup_{i=1}^{k} A_i = A$, where $A_i = A \cap B_i$.
• By Axiom (iii) (slide set 2, p. 5), $P(A) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i)$.

Law of Total Probability (continued...)

We can depict event $A$ using a Venn diagram:

[Figure: Venn diagram of event $A$ intersecting the cover]

The probability of event $A$ is the sum of the probabilities of the intersections, $\sum_{i=1}^{k} P(A \cap B_i)$.

In our Treasure Hunt example, the events $B_1$, $B_2$, and $B_3$ form a cover of $\Omega$, as a coin can be drawn from only one of the boxes. Defining event $A$ = drawing a gold coin, we have

$$P(A) = P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) = 1 \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{1}{3} = 0.5$$

as before. Note that the term for $B_3$ drops out: event $A$ does not intersect the event $B_3$, so $P(A \mid B_3) = 0$, as there are no gold coins in Box 3.

Bayes’ Rule

Theorem: Bayes’ Rule. If $B_1, \ldots, B_k$ is a cover or partition of $\Omega$, and $A$ is an event, then

$$P(B_j \mid A) = \frac{P(A \mid B_j) P(B_j)}{\sum_{i=1}^{k} P(A \mid B_i) P(B_i)}.$$

Proof of Bayes’ Rule:

$$P(B_j \mid A) = \frac{P(B_j \cap A)}{P(A)} = \frac{P(A \mid B_j) P(B_j)}{P(A)} = \frac{P(A \mid B_j) P(B_j)}{\sum_{i=1}^{k} P(A \mid B_i) P(B_i)}.$$

Example 1.7.3 (Hofmann notes)

A given lot of chips contains 2% defective chips. Each chip is tested before delivery.
However, the tester is not wholly reliable:

P(“tester says chip is good” | “chip is good”) = 0.95
P(“tester says chip is defective” | “chip is defective”) = 0.94

If the test device says the chip is defective, what is the probability that the chip actually is defective?

We will apply Bayes’ Rule, using $C_d$ and $\bar{C}_d$ as the cover:

$$P(\underbrace{\text{chip is defective}}_{:= C_d} \mid \underbrace{\text{tester says it’s defective}}_{:= T_d}) = P(C_d \mid T_d)$$

Continue Example 1.7.3

$$
\begin{aligned}
P(C_d \mid T_d) &= \frac{P(T_d \mid C_d) P(C_d)}{P(T_d)} \\
&= \frac{P(T_d \mid C_d) P(C_d)}{P(T_d \mid C_d) P(C_d) + P(T_d \mid \bar{C}_d) P(\bar{C}_d)} \\
&= \frac{0.94 \cdot 0.02}{0.94 \cdot 0.02 + (1 - P(\bar{T}_d \mid \bar{C}_d)) \cdot 0.98} \\
&= \frac{0.94 \cdot 0.02}{0.94 \cdot 0.02 + (1 - 0.95) \cdot 0.98} \approx 0.28
\end{aligned}
$$
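
The two computations in this slide set can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names `total_probability` and `bayes_posterior` and the variable names are hypothetical, chosen only for illustration. It assumes the cover is given as a list of prior probabilities $P(B_i)$ together with a matching list of conditional probabilities $P(A \mid B_i)$.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """Law of Total Probability: P(A) = sum_i P(A|B_i) * P(B_i)."""
    return sum(l * p for l, p in zip(likelihoods, priors))


def bayes_posterior(priors, likelihoods, j):
    """Bayes' Rule: P(B_j|A) = P(A|B_j) P(B_j) / sum_i P(A|B_i) P(B_i)."""
    return likelihoods[j] * priors[j] / total_probability(priors, likelihoods)


# Treasure Hunt: the cover is B1, B2, B3, each chosen with probability 1/3.
box_priors = [Fraction(1, 3)] * 3
# P(gold | B1) = 1, P(gold | B2) = 1/2, P(gold | B3) = 0.
gold_given_box = [Fraction(1), Fraction(1, 2), Fraction(0)]
p_gold = total_probability(box_priors, gold_given_box)

# Example 1.7.3: the cover is (C_d, not C_d); the event A is T_d.
chip_priors = [0.02, 0.98]
# P(T_d | C_d) = 0.94 and P(T_d | not C_d) = 1 - 0.95.
says_defective_given = [0.94, 1 - 0.95]
p_defective_given_flagged = bayes_posterior(chip_priors, says_defective_given, 0)

print(p_gold)                               # 1/2
print(round(p_defective_given_flagged, 2))  # 0.28
```

Exact fractions are used for the Treasure Hunt so the result matches the hand calculation without rounding; the chip example uses floats because its inputs are given as decimals. The same helper also answers the reverse Treasure Hunt question: `bayes_posterior(box_priors, gold_given_box, 0)` gives the probability that Box 1 was chosen given that a gold coin was drawn.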