Chapter 3. Conditional Probability and Independence

Introduction
• Statistics deals with uncertainty
  – Weather forecast
  – Stock prices
  – Hurricane prediction
• Availability of information reduces uncertainty
  – Weather forecast with more information

• Toss two dice and suppose that each of the 36 possible outcomes is equally likely. If we observe that the first die is a 3, what is the probability that the sum of the two dice equals 8?
• Given that the first die is 3, the sample space reduces to {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)}, and these outcomes are still equally likely. Only (3,5) gives a sum of 8, so the desired probability is 1/6.

[Figure: Venn diagram of the sample space S (all 36 outcomes) with E = {(2,6), (3,5), (4,4), (5,3), (6,2)} and F = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)}; the two events share the single outcome (3,5).]
E: the sum of the two dice is 8
F: the first die is 3
• P(E|F) = (# outcomes in EF) / (# outcomes in F) = [(# outcomes in EF) / (# outcomes in S)] / [(# outcomes in F) / (# outcomes in S)] = P(EF)/P(F)

• A coin is flipped twice. Assuming that all four points in the sample space S = {(h, h), (h, t), (t, h), (t, t)} are equally likely, what is the conditional probability that both flips land on heads, given that (a) the first flip lands on heads; (b) at least one flip lands on heads?
• Let B = {(h,h)} be the event that both flips land on heads; let F = {(h,h),(h,t)} be the event that the first flip lands on heads; and let A = {(h,h),(h,t),(t,h)} be the event that at least one flip lands on heads.
\[ P(B \mid F) = \frac{P(BF)}{P(F)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t)\})} = \frac{1/4}{2/4} = \frac{1}{2} \]
\[ P(B \mid A) = \frac{P(BA)}{P(A)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t),(t,h)\})} = \frac{1/4}{3/4} = \frac{1}{3} \]

• Toss two dice and suppose that each of the 36 possible outcomes is equally likely. Suppose you observe that the first die is a 3 and then bet on the sum, choosing one of 4, 5, 6, 7, 8, 9, each of which has probability 1/6. Do you gain any advantage compared to not seeing the first die?
• If you had not seen the first die, there are 11 possible sums: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, with probabilities 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36, respectively. The best bet is 7, with probability 6/36 = 1/6, the same as each bet available after seeing the first die.

Digitalis therapy
• Digitalis therapy is often beneficial to patients who have suffered congestive heart failure, but there is a risk of digitalis intoxication, a serious side effect that is, moreover, difficult to diagnose. To improve the chances of a correct diagnosis, the concentration of digitalis in the blood can be measured. Bellar (1971) conducted a study of the relation of the concentration of digitalis in the blood to digitalis intoxication in 135 patients. Their results are simplified slightly in the following table.

• T+ = high blood concentration (positive test)
• T- = low blood concentration (negative test)
• D+ = toxicity (disease present)
• D- = no toxicity (disease absent)

         D+     D-     Total
T+       25     14     39
T-       18     78     96
Total    43     92     135

For example, 25 of the 135 patients had a high blood concentration of digitalis and suffered toxicity.

Dividing each count by 135 gives the corresponding proportions:

         D+     D-     Total
T+       .185   .104   .289
T-       .133   .578   .711
Total    .318   .682   1.000

• P(T+) = .289, P(D+) = .318
• If the patient has a high blood concentration (T+), what is the probability of disease (D+)?
• P(D+|T+) = 25/39 = .64
• P(D+|T+) = P(D+T+)/P(T+) = .185/.289 = .64
• P(D+|T-) = P(D+T-)/P(T-) = .133/.711 = .187

• A student is taking a makeup examination with a one-hour time limit. Suppose the probability that the student will finish the exam in less than x hours is x/2, for all 0 ≤ x ≤ 1. Given that the student is still working after 0.75 hours, what is the conditional probability that the full hour is used?
• F: the full hour is used
• L_x: the exam is finished in less than x hours
\[ P(F) = P(L_1^c) = 1 - P(L_1) = .5 \]
Since using the full hour implies still working at 0.75 hours, F L_{.75}^c = F, and
\[ P(F \mid L_{.75}^c) = \frac{P(F L_{.75}^c)}{P(L_{.75}^c)} = \frac{P(F)}{1 - P(L_{.75})} = \frac{.5}{.625} = .8 \]

• Ex 2e. Celine is undecided as to whether to take a French course or a chemistry course.
She estimates that her probability of receiving an A grade would be 1/2 in a French course and 2/3 in a chemistry course. If she decides to base her decision on the flip of a fair coin, what is the probability that she gets an A in chemistry?
• What are the events?
  – A: receiving an A grade; C: taking chemistry; F: taking French.
  – P(A|F) = 1/2, P(A|C) = 2/3, P(C) = P(F) = 1/2.
• P(CA)?
• P(A|C) = P(AC)/P(C), so P(AC) = P(A|C)P(C) = (2/3)(1/2) = 1/3

Multiplication rule
\[ P(E_1 E_2 E_3 \cdots E_n) = P(E_1)\,P(E_2 \mid E_1)\,P(E_3 \mid E_1 E_2) \cdots P(E_n \mid E_1 \cdots E_{n-1}) \]
Proof: applying the definition of conditional probability to each factor on the right-hand side, the product telescopes:
\[ P(E_1) \cdot \frac{P(E_1 E_2)}{P(E_1)} \cdot \frac{P(E_1 E_2 E_3)}{P(E_1 E_2)} \cdots \frac{P(E_1 E_2 \cdots E_n)}{P(E_1 E_2 \cdots E_{n-1})} = P(E_1 E_2 \cdots E_n) \]

• What is the probability that Celine gets an A in either French or chemistry?
• P(A) = P(AC) + P(AF) = P(C)P(A|C) + P(F)P(A|F) = (1/2)(2/3) + (1/2)(1/2) = 7/12

A useful formula for calculating probabilities
\[ E = EF \cup EF^c \]
\[ P(E) = P(EF) + P(EF^c) = P(E \mid F)P(F) + P(E \mid F^c)P(F^c) = P(E \mid F)P(F) + P(E \mid F^c)[1 - P(F)] \]

• Ex 3a part 1
• An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. Their statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability .4, whereas this probability decreases to .2 for a non-accident-prone person. If we assume that 30 percent of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?
• The policyholder is either accident prone or not.
• A1: the policyholder will have an accident within a year of purchase.
• A: the policyholder is accident prone.
• P(A1|A) = .4; P(A1|A^c) = .2; P(A) = .3; P(A^c) = .7; P(A1) = ?
• P(A1) = P(A1|A)P(A) + P(A1|A^c)P(A^c) = (.4)(.3) + (.2)(.7) = .26

• Ex 3a part 2
• Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that he or she is accident prone?
• P(A1|A) = .4; P(A1|A^c) = .2; P(A) = .3; P(A^c) = .7
• P(A|A1)?
• P(A|A1) = P(AA1)/P(A1) = P(A1|A)P(A)/P(A1) = (.4)(.3)/.26 = 6/13

• Ex 3d. A laboratory blood test is 99 percent effective in detecting a certain disease when it is, in fact, present. However, the test also yields a “false positive” result for 1 percent of the healthy persons tested. (That is, if a healthy person is tested, then with probability 0.01 the test result will imply that he or she has the disease.) If .2 percent of the population actually has the disease, what is the probability that a person has the disease given that the test result is positive?
• D: the event that the tested person has the disease.
• E: the event that the test result is positive.
\[ P(D \mid E) = \frac{P(DE)}{P(E)} = \frac{P(E \mid D)P(D)}{P(E \mid D)P(D) + P(E \mid D^c)P(D^c)} = \frac{0.99 \times 0.002}{0.99 \times 0.002 + 0.01 \times 0.998} = \frac{0.00198}{0.00198 + 0.00998} \approx .166 \]

• Ex 3f. At a certain stage of a criminal investigation the inspector in charge is 60 percent convinced of the guilt of a certain suspect. Suppose now that a new piece of evidence that shows the criminal has a certain characteristic (such as left-handedness, baldness, or brown hair) is uncovered. If 15 percent of the population possesses this characteristic, how certain of the guilt of the suspect should the inspector now be if it turns out that the suspect has this characteristic?
• G: the event that the suspect is guilty
• C: the event that he possesses the characteristic of the criminal
• P(G|C)?
• Taking P(C|G) = 1 (a guilty suspect certainly has the characteristic) and P(C|G^c) = .15:
  P(G|C) = P(GC)/P(C) = P(C|G)P(G) / [P(C|G)P(G) + P(C|G^c)P(G^c)] = 1(.6)/[1(.6) + (.15)(.4)] ≈ .91

Monty Hall problem
• Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
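The question can also be explored empirically. The following is a minimal simulation sketch, not part of the original slides. It assumes the host always opens a goat door that the contestant did not pick, choosing at random when two such doors are available; the function names `monty_hall_trial` and `win_rate` are hypothetical.

```python
import random


def monty_hall_trial(rng, switch):
    """Play one round and return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)   # door hiding the car
    pick = rng.choice(doors)  # contestant's initial choice
    # Assumption: the host opens a door that is neither the pick nor the car,
    # choosing uniformly at random when two such doors exist.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the only door that is still closed and not already picked.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning the car under a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(monty_hall_trial(rng, switch) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumptions, the switching strategy wins roughly twice as often as staying in repeated runs.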
Monty Hall problem
• Cswitch: get a car by switching
• Cstay: get a car by staying
• Ecar: originally picked the car
• Egoat: originally picked the goat
• P(Cswitch) = P(Cswitch|Ecar)P(Ecar) + P(Cswitch|Egoat)P(Egoat) = (0)(1/3) + (1)(2/3) = 2/3
• Staying wins only if the original pick was the car, so P(Cstay) = P(Ecar) = 1/3, and switching is advantageous.

• We can express the change in the probability of a hypothesis when new evidence is introduced in a compact form using the change in the odds of the hypothesis.
• The odds of an event A are defined by P(A)/P(A^c) = P(A)/[1 - P(A)]
• The odds of an event A tell how much more likely it is that A occurs than that it does not occur.

Change of probability with new evidence
• Hypothesis H with probability P(H).
• P(H|E) = P(E|H)P(H)/P(E)
• P(H^c|E) = P(E|H^c)P(H^c)/P(E)
• Dividing the first equation by the second:
\[ \frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E \mid H)}{P(E \mid H^c)} \]

• Ex 3g. In the world bridge championships held in Buenos Aires in May 1965, the famous British bridge partnership of Terrence Reese and Boris Schapiro was accused of cheating by using a system of finger signals that could indicate the number of hearts held by the players. Reese and Schapiro denied the accusation, and eventually a hearing was held by the British bridge league. The hearing was in the form of a legal proceeding with a prosecuting and a defense team, both having the power to call and cross-examine witnesses. During the course of these proceedings the prosecutor examined specific hands played by Reese and Schapiro and claimed that their playing in these hands was consistent with the hypothesis that they were guilty of having illicit knowledge of the heart suit. At this point, the defense attorney pointed out that their play of these hands was also perfectly consistent with their standard line of play. However, the prosecution then argued that as long as their play was consistent with the hypothesis of guilt, it must be counted as evidence toward this hypothesis. What do you think of the reasoning of the prosecution?
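The odds form of the update lends itself to a short numerical check. The sketch below is an illustration rather than part of the original slides, and the helper names `odds`, `probability`, and `posterior_odds` are hypothetical. It applies the rule to the inspector example (Ex 3f), where the prior is .6, P(C|G) is taken to be 1, and P(C|G^c) = .15. It then shows that evidence equally likely under H and H^c leaves the odds unchanged, which may be a useful way to think about Ex 3g.

```python
from fractions import Fraction


def odds(p):
    """Odds of an event with probability p: P(A) / [1 - P(A)]."""
    return p / (1 - p)


def probability(o):
    """Convert odds back into a probability."""
    return o / (1 + o)


def posterior_odds(prior_prob, p_e_given_h, p_e_given_not_h):
    """New odds = prior odds * P(E|H) / P(E|H^c)."""
    likelihood_ratio = Fraction(p_e_given_h) / Fraction(p_e_given_not_h)
    return odds(prior_prob) * likelihood_ratio


if __name__ == "__main__":
    # Ex 3f: prior P(G) = 3/5, P(C|G) = 1 (assumed), P(C|G^c) = 3/20.
    new_odds = posterior_odds(Fraction(3, 5), 1, Fraction(3, 20))
    print(new_odds, probability(new_odds), float(probability(new_odds)))

    # Evidence that is equally likely under H and H^c: odds stay at 3/2.
    print(posterior_odds(Fraction(3, 5), Fraction(1, 2), Fraction(1, 2)))
```

Exact fractions are used so that the result can be compared directly with the hand calculation; the posterior probability 10/11 agrees with the value of about .91 obtained from Bayes' formula.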
Bayes’ Formula
\[ P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^c)P(B^c)} \]
More generally, if F_1, ..., F_n are mutually exclusive events with
\[ \bigcup_{i=1}^{n} F_i = S, \]
then
\[ E = \bigcup_{i=1}^{n} EF_i, \qquad P(E) = \sum_{i=1}^{n} P(EF_i) = \sum_{i=1}^{n} P(E \mid F_i)P(F_i) \]
\[ P(F_j \mid E) = \frac{P(EF_j)}{P(E)} = \frac{P(E \mid F_j)P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i)P(F_i)} \]

Occupational Mobility
• Suppose that occupations are grouped into upper (U), middle (M), and lower (L) levels. U1 will denote the event that a father’s occupation is upper-level; U2 will denote the event that a child’s occupation is upper-level, etc. (the subscripts index generations). Glass and Hall (1954) compiled the following statistics on occupational mobility in England and Wales:

       U2     M2     L2
U1     .45    .48    .07
M1     .05    .70    .25
L1     .01    .50    .49

• This table is called a transition probability matrix.
• If a father is in U, the probability that his son is in U is .45, the probability that his son is in M is .48, etc.
• The entries are conditional probabilities, such as P(U2|U1) = .45.

• Suppose that of the father’s generation, 10% are in U, 40% in M, and 50% in L. What is the probability that a child in the next generation is in U?
• P(U2) = P(U2|U1)P(U1) + P(U2|M1)P(M1) + P(U2|L1)P(L1) = .45×.10 + .05×.40 + .01×.50 = .07

• Suppose we ask: if a child has occupational status U2, what is the probability that his father had occupational status U1? That is, what is P(U1|U2)?
• P(U1|U2) = P(U1U2)/P(U2) = P(U2|U1)P(U1) / [P(U2|U1)P(U1) + P(U2|M1)P(M1) + P(U2|L1)P(L1)] = .45×.10 / .07 = .64

• Suppose that we have 3 cards identical in form except that both sides of the first card are colored red, both sides of the second card are colored black, and one side of the third card is colored red and the other side black. The 3 cards are mixed up in a hat, and 1 card is randomly selected and put down on the ground. If the upper side of the chosen card is colored red, what is the probability that the other side is colored black?
• Let
  – RR: the all-red card
  – BB: the all-black card
  – RB: the red-black card
  – R: the upturned side of the chosen card is red
• P(RB|R)?
\[ P(RB \mid R) = \frac{P(RB \cap R)}{P(R)} = \frac{P(R \mid RB)P(RB)}{P(R \mid RR)P(RR) + P(R \mid RB)P(RB) + P(R \mid BB)P(BB)} = \frac{(1/2)(1/3)}{1(1/3) + (1/2)(1/3) + 0(1/3)} = \frac{1}{3} \]
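The same answer can be checked by direct enumeration. The following is a minimal sketch, not part of the original slides. It assumes that each card is equally likely to be drawn and that either side is equally likely to land face up, as the values P(R|RB) = 1/2 and P(RB) = 1/3 above imply. The names `CARDS` and `prob_other_side_black_given_red_up` are hypothetical.

```python
from fractions import Fraction

# Each card is described by the colors of its two sides.
CARDS = {
    "RR": ("red", "red"),
    "BB": ("black", "black"),
    "RB": ("red", "black"),
}


def prob_other_side_black_given_red_up():
    """Return P(other side is black | upturned side is red) by enumeration."""
    # Assumption: all 6 (card, upturned side) combinations are equally likely.
    outcomes = [
        (name, sides[i], sides[1 - i])  # (card, face-up color, face-down color)
        for name, sides in CARDS.items()
        for i in (0, 1)
    ]
    red_up = [o for o in outcomes if o[1] == "red"]
    black_down = [o for o in red_up if o[2] == "black"]
    return Fraction(len(black_down), len(red_up))


if __name__ == "__main__":
    print(prob_other_side_black_given_red_up())
```

Of the three equally likely ways to see a red side face up, two come from the all-red card and one from the red-black card, which is why the result is 1/3 and not 1/2.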