Chapter 5: A Survey of Probability Concepts

Homework #5 (Week 7): Chapter 5, Exercises 8, 12, 20, 26, 28, 30 and 32.
Homework #6 (Week 8): Chapter 5, Exercises 36, 38, 40, 70 and 74.

Permutations and Combinations

Our standard approach to probability questions:

P(A) = #S/#N = (Number of successful outcomes)/(Number of possible outcomes)

Example: Event A = even number on throw of die
S = 3 {2, 4, 6}
N = 6 {1, 2, 3, 4, 5, 6}
P(A) = 3/6 = 0.5

This approach, without some revision, is often very difficult to implement, e.g. as N (and S) increase.

Example: The Irish National Lottery works as follows. Six numbers from the range 1 to 42 are chosen at random. If you have correctly guessed all six you win/share the first prize. What are your chances of winning/sharing if you are allowed to choose six numbers?

Useful Mathematical Notation

Factorial Notation: n! = n.(n-1).(n-2). ... .2.1
Examples: 5! = 5x4x3x2x1 = 120; 1! = 1; (Aside: 0! = 1 by convention)

Examples Leading to General Principles

Given the letters {a, b, c, d}

(1) How many 2-letter "words" can be formed, allowing for the repeated use of letters?
aa ab ac ad ba bb bc bd ca ...
4x4 = (4 possible entries in the first cell).(4 possible entries in the second cell) = 4^2 = 16

(2) How many 4-letter "words" can be formed, allowing for the repeated use of letters?
aaaa aaab aaac aaad aaba aabb aabc ...
4x4x4x4 = (4 possible entries in the first cell). ... .(4 possible entries in the fourth cell) = 4^4 = 256

General Principle (1) and (2): There are a^n "words" of length n from an alphabet of a letters, allowing for the repeated use of letters.

(3) How many 2-letter "words" can be formed, not allowing for the repeated use of letters?
ab ac ad ba bc bd ca cb cd ... [Note: ab is regarded as different from ba]
4x3 = (4 possible entries in the first cell).(3 possible entries in the second cell) = 12 { = 4!/2! = 4!/(4-2)! }

(4) How many 4-letter "words" can be formed, not allowing for the repeated use of letters?
abcd abdc acbd acdb adbc adcb bacd badc bcad bcda ... [Note: abcd is regarded as different from abdc]
4x3x2x1 = (4 possible entries in the first cell). ... .(1 possible entry in the fourth cell) = 24 = 4! { = 4!/1 = 4!/0! = 4!/(4-4)! }

General Principle (3) and (4): There are n!/(n-r)! "words" of length r from an alphabet of n letters if the repeated use of letters is not allowed, i.e. letters can only be used once.
"Proof"/Intuition: n.(n-1). ... .(n-r+1) = n!/(n-r)!
(3) and (4): There are n!/(n-r)! "permutations" of n distinct objects taken r at a time. (Notation: nPr = n!/(n-r)!)

(5) How many 2-letter "words" can be formed, not allowing for the repeated use of letters and with the extra stipulation that order becomes unimportant, e.g. ab = ba and hence should only be included/counted once?
ab ac ad bc bd cd [Note: ab is regarded as the same as ba]
(4x3)/2 = 6 (i.e. divide the answer to part (3) by 2 (= 2!) in order to avoid double-counting)

General Principle (5): There are n!/[r!.(n-r)!] "words" of length r from an alphabet of n letters if the repeated use of letters is not allowed and order is disregarded.
(5): There are n!/[r!.(n-r)!] "combinations" of n distinct objects taken r at a time. (Notation: nCr = n!/[r!.(n-r)!])
"Proof"/Intuition: As we move from (3) and (4) to (5), i.e. from "permutations" to "combinations", order must be disregarded. In how many ways can we permute r objects? The answer is r!. Therefore we must divide our formula from (3) and (4) by r!.

Example: The Irish National Lottery works as follows. Six numbers from the range 1 to 42 are chosen at random. If you have correctly guessed all six you win/share the first prize. What are your chances of winning/sharing if you are allowed to choose six numbers?

How many 6-number "sets" can be formed from a 42-number "alphabet", not allowing for the repeated use of numbers and with the extra stipulation that order is disregarded, e.g.
9, 12, 18, 23, 34, 42 = 18, 12, 34, 23, 42, 9 and should only be included/counted once?

42C6 = 42!/[6!.(42-6)!] = 5,245,786
P(success) = 1/5,245,786, i.e. approximately 1 in 5 million

Aside: 36C6 = 1,947,792 and P(success) is approximately 1 in 2 million
       39C6 = 3,262,623 and P(success) is approximately 1 in 3 million

Probability: Subjective Approach

"Probability is the degree of belief that someone holds about the likelihood of an event occurring."
Problem: Should the belief be rational? What does rational mean?

Bayesian Statistics

Prior subjective "beliefs", evidence and rational posterior "beliefs".

BAYES' THEOREM

Intuition: Prior Probabilities + Information = Posterior Probabilities

Example:

Prior Probabilities:
P(Faulty) = P(F) = .3
P(Not Faulty) = P(NF) = .7

Information: Mechanic's Record
P("Guess" Faulty | Faulty) = P(GF | F) = .9
P("Guess" Not Faulty | Faulty) = P(GNF | F) = .1
and
P("Guess" Faulty | Not Faulty) = P(GF | NF) = .2
P("Guess" Not Faulty | Not Faulty) = P(GNF | NF) = .8

Posterior Probabilities?
(a) P(Faulty | "Guess" Faulty) = P(F | GF) = ? Answer > .3 surely
(b) P(Faulty | "Guess" Not Faulty) = P(F | GNF) = ? Answer < .3 surely

Method: Bayes' Formula

P(A | B) = P(A and B)/P(B)

(i) P(A and B)?
P(B | A) = P(B and A)/P(A) = P(A and B)/P(A)
P(A and B) = P(B | A).P(A)

(ii) P(B)?
P(B) = P(B and A) + P(B and Not A)
P(B) = P(B | A).P(A) + P(B | Not A).P(Not A)

Substituting (i) and (ii) into P(A | B) = P(A and B)/P(B):

P(A | B) = P(B | A).P(A) / [P(B | A).P(A) + P(B | Not A).P(Not A)]

(a) P(Faulty | "Guess" Faulty) = P(F | GF) = ?
P(F | GF) = P(GF | F).P(F) / [P(GF | F).P(F) + P(GF | NF).P(NF)]
          = (.9).(.3) / [(.9).(.3) + (.2).(.7)]
          = (.27)/(.41) = .66 > .3 (as expected)

(b) P(Faulty | "Guess" Not Faulty) = P(F | GNF) = ?
P(F | GNF) = P(GNF | F).P(F) / [P(GNF | F).P(F) + P(GNF | NF).P(NF)]
           = (.1).(.3) / [(.1).(.3) + (.8).(.7)]
           = (.03)/(.59) = .05 < .3 (as expected)

Example: There are six male candidates and four female candidates for four positions. All ten candidates are equally qualified. The four selected are male.
(a) Can this outcome be explained by chance, i.e. what is the probability of the observed evidence occurring given no discrimination?
P(Evidence | No Discrimination) = P(E | ND) = 6C4/10C4 = 15/210 = .0714285
P(Evidence | Discrimination) = P(E | D) = 1.0

(b) Given the above information/evidence, calculate the posterior probabilities of discrimination given the following prior probabilities of discrimination:
(i) P(Discrimination) = P(D) = .8, P(No Discrimination) = P(ND) = .2
(ii) P(Discrimination) = P(D) = .5, P(No Discrimination) = P(ND) = .5
(iii) P(Discrimination) = P(D) = .2, P(No Discrimination) = P(ND) = .8

Prior Probabilities + Information = Posterior Probabilities

The information is the same in all three cases:
P(Evidence | No Discrimination) = P(E | ND) = .0714285
P(Evidence | Discrimination) = P(E | D) = 1.0

(i) Prior Probabilities: P(D) = .8, P(ND) = .2
Posterior Probabilities: P(D | E)?
P(D | E) = P(E | D).P(D) / [P(E | D).P(D) + P(E | ND).P(ND)]
         = (1.0).(.8) / [(1.0).(.8) + (.0714285).(.2)]
         = .9824561
Aside: P(ND | E) = .0175439

(ii) Prior Probabilities: P(D) = .5, P(ND) = .5
Posterior Probabilities: P(D | E)?
P(D | E) = P(E | D).P(D) / [P(E | D).P(D) + P(E | ND).P(ND)]
         = (1.0).(.5) / [(1.0).(.5) + (.0714285).(.5)]
         = .9333
Aside: P(ND | E) = .0667

(iii) Prior Probabilities: P(D) = .2, P(ND) = .8
Posterior Probabilities: P(D | E)?
P(D | E) = P(E | D).P(D) / [P(E | D).P(D) + P(E | ND).P(ND)]
         = (1.0).(.2) / [(1.0).(.2) + (.0714285).(.8)]
         = .778
Aside: P(ND | E) = .222
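
The three posterior calculations in the example above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name posterior and its parameter names are illustrative assumptions, and math.comb requires Python 3.8 or later. P(E | D) = 1.0 is taken as given, as in the example.

```python
from math import comb


def posterior(prior, likelihood, likelihood_alt):
    """Return P(H | E) by Bayes' formula for a hypothesis H and its complement.

    prior          -- P(H)
    likelihood     -- P(E | H)
    likelihood_alt -- P(E | Not H)
    (Hypothetical helper; names chosen for illustration only.)
    """
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_alt * (1 - prior))


# P(E | ND): all four positions go to men purely by chance = 6C4/10C4 = 15/210.
p_e_given_nd = comb(6, 4) / comb(10, 4)
# P(E | D): taken as 1.0, as stated in the example.
p_e_given_d = 1.0

# Posterior probability of discrimination for each of the three priors.
for prior_d in (0.8, 0.5, 0.2):
    post_d = posterior(prior_d, p_e_given_d, p_e_given_nd)
    print(f"P(D) = {prior_d:.1f}  ->  P(D | E) = {post_d:.4f}, "
          f"P(ND | E) = {1 - post_d:.4f}")
```

The same helper reproduces the mechanic example: posterior(0.3, 0.9, 0.2) corresponds to P(F | GF) = .27/.41, and posterior(0.3, 0.1, 0.8) corresponds to P(F | GNF) = .03/.59. Likewise, comb(42, 6) gives the lottery count 42C6.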