Probability
Chapter 3
Prof. Felix Apfaltrer
fapfaltrer@bmcc.cuny.edu
Office: N518
Office Hours: 10:30am-noon
Phone: 212-220 7421

False positives and negatives

Pregnancy test results
                        Positive test result          Negative test result
                        (test indicates pregnant)     (test indicates not pregnant)
Subject pregnant        80                            5
Subject not pregnant    3                             11

• False positive: test incorrectly indicates the woman is pregnant when she is not.
• False negative: test incorrectly indicates the woman is not pregnant when she is.
• True positive: test correctly indicates the woman is pregnant when she is.
• True negative: test correctly indicates the woman is not pregnant when she is not.
• Test sensitivity: the probability of a true positive, i.e. of a positive result given that the subject is pregnant.
• Test specificity: the probability of a true negative, i.e. of a negative result given that the subject is not pregnant.
• Ex: Abbott test pack indicates that their urine test has a 0.2% false positive and a 0.6% false negative rate.

Overview
• Rare event rule: If, under a given assumption (the lottery is fair), the probability of a particular observed event (5 consecutive lottery wins by the same person) is extremely small, the assumption is probably not correct.

Fundamentals
Definitions:
• Procedure: an action whose outcome is random (rolling a die, rolling 2 dice, tossing a coin, …).
• Event: any collection of outcomes of a procedure.
• Simple event: an event that cannot be broken down any further.
• Sample space of a procedure: the set of all simple events.
Notation:
• P denotes a probability.
• A, B, C denote specific events.
• P(A) denotes the probability of the event A occurring.
Examples:
• Procedure: rolling a die, rolling 2 dice.
• Event: for 1 die, any of 1, 2, 3, 4, 5, 6, "even", "greater than 3". For 2 dice: "sum is 7", "sum is bigger than 10", "1-1", "1-2", "2-1", "both even".
• Simple events: for 1 die: 1, 2, 3, 4, 5, 6. For 2 dice: 1-1, 1-2, 1-3, 1-4, 1-5, 1-6, 2-1, 2-2, 2-3, 2-4, 2-5, 2-6, 3-1, …, 6-6.
• Sample space: the set of all simple events (6 of them for 1 die, 36 for 2 dice).

Defining a probability
• Relative Frequency Approach: Observe a procedure a large number of times and count the number of times that event A occurs. Then P(A) is estimated by
    P(A) = (number of times A occurs) / (number of trials)
  – Example: A tack lands point up. Repeat the experiment 1000 times and count how many times the tack lands point up; P(A) is the ratio of that count to the number of times the tack was thrown.
• Classical Approach: If a procedure has n different simple events that are equally likely, and A can occur in s of these ways, then
    P(A) = (number of ways A can occur) / (number of simple events) = s/n
  – Example: Rolling a die. Assuming the die is not loaded, each face has the same chance of landing face up:
    P(even) = (# of even faces) / (total # of faces) = 3/6
• Subjective Probability: P(A), the probability of the event A, is estimated based on knowledge of the relevant circumstances.
  – Example: Weather forecast. One needs to be an expert to estimate wisely whether it will rain tomorrow or not.

More examples
• Flying on a commercial plane. Find the probability that a randomly selected adult has flown on a plane.
  – 2 events: flown, or not.
  – The events are not equally likely (cannot use the classical approach), so use the relative frequency approach.
  – Gallup poll: of 815 randomly selected adults, 710 indicated that they have flown.
• Roulette: Bet on number 13 in a roulette game. What is the probability that you will lose?
  – 38 slots, all equally likely, so use the classical approach. 37 slots result in a loss.
    P(loss) = 37/38
• Meteorites: What is the probability that your house will be hit by a meteorite?
  – In the absence of historical data, we need the 3rd approach. We know the chance is very small, say 0.000,000,001. This is a subjective estimate, a general ballpark figure.
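The classical and relative frequency approaches above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names classical_probability and relative_frequency are made up for this sketch. It uses exact fractions for the classical cases (die, roulette) and the Gallup counts for the relative frequency case.

```python
from fractions import Fraction


def classical_probability(ways: int, simple_events: int) -> Fraction:
    """Classical approach: s favorable outcomes out of n equally likely ones."""
    if simple_events <= 0 or not 0 <= ways <= simple_events:
        raise ValueError("need 0 <= ways <= simple_events and simple_events > 0")
    return Fraction(ways, simple_events)


def relative_frequency(occurrences: int, trials: int) -> float:
    """Relative frequency approach: observed count divided by number of trials."""
    if trials <= 0 or not 0 <= occurrences <= trials:
        raise ValueError("need 0 <= occurrences <= trials and trials > 0")
    return occurrences / trials


# Classical: a fair die has 6 equally likely faces, 3 of them even.
die_faces = range(1, 7)
even_faces = [face for face in die_faces if face % 2 == 0]
p_even = classical_probability(len(even_faces), len(die_faces))

# Classical: 38 equally likely roulette slots, 37 of which lose a bet on 13.
p_roulette_loss = classical_probability(37, 38)

# Relative frequency: 710 of the 815 adults polled said they have flown.
p_flown = relative_frequency(710, 815)

print("P(even) =", p_even)
print("P(loss) =", p_roulette_loss)
print("P(flown) is estimated as", round(p_flown, 3))
```

The subjective approach has no counterpart here: it is an expert judgment, not a computation.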
• Flying example, result: P(flew on commercial plane) = 710/815 = 0.871

Law of large numbers
Law of large numbers: As a procedure is repeated again and again, the relative frequency probability of an event tends to approach the actual probability:
    s/n → P(A)
• Example: 2 boys, 1 girl. What is the probability that, when a couple has 3 children, exactly 2 of the 3 are boys?
  – Assuming that having a boy or a girl is equally likely, use the classical approach. The options are:
      boy-boy-boy     boy-boy-girl     boy-girl-boy     boy-girl-girl
      girl-boy-boy    girl-boy-girl    girl-girl-boy    girl-girl-girl
  – 8 possible outcomes, 3 of which correspond to exactly 2 boys.
  – P(exactly 2 boys) = 3/8 = 0.375
• Example: Death penalty. In a Gallup poll, adults are randomly selected and asked if they are in favor of or against the death penalty. The responses include 319 who are for it, 133 who are against it, and 39 who have no opinion. Based on these results, estimate the probability that a randomly selected person is in favor of the death penalty.
      for           319
      against       133
      no opinion     39
      total         491
  – P(for) = 319/491 = 0.65

Complementary probabilities and properties
• Thanksgiving day. What is the probability that Thanksgiving day falls on a
  a) Wednesday?
  b) Thursday?
  Thanksgiving is always on a Thursday!
  a) Impossible: P(Thxgiv. Wed) = 0
  b) Always true: P(Thxgiv. Thu) = 1
• Properties:
  – The probability of the impossible event is 0: P(impossible event) = 0.
  – The probability of the certain event is 1: P(certain event) = 1.
  – For any event A, 0 ≤ P(A) ≤ 1.
  – P(A) = 0 only if A cannot happen; P(A) = 1 only if A happens for sure.
  – If A^c denotes the complement of the event A, then P(A) + P(A^c) = 1.
• Examples: If X denotes the number a die shows when it lands, then
  – P(X = 7) = 0
  – P(X ≤ 7) = 1
  – P(X not even) = 1 - P(X even)
  – P({X ≤ 2}^c) = 1 - P({X ≤ 2}) = 1 - 2/6 = 4/6 = 2/3 = P(X > 2)
  – P(X ≥ 0) = 1
• If Y denotes the sum of the numbers on the faces when throwing 2 dice:
  – P(Y = 1) = 0
  – P(2 ≤ Y ≤ 12) = 1
  – P(Y = 4) = 3/36, namely 1-3, 2-2, and 3-1
  – P({Y = 2}^c) = 1 - P(Y = 2) = 1 - 1/36 = 35/36
HW: p.120 #1-7

Venn diagrams
Addition Rule
A compound event is an event combining 2 or more simple events.
Notation:
• P(A ∩ B): intersection of A and B (both A and B occur)
• P(A ∪ B): union of A and B (either A or B or both occur)
[Venn diagrams: overlapping events A and B with intersection A ∩ B; non-overlapping (disjoint) events A and B]
Addition Rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Mendel: hybridization experiments. 14 peas with purple (p) or white (w) flowers and green (g) or yellow (y) pods: 8 p, 6 w, 9 g, 5 y.
    P(g ∪ p) = P(g) + P(p) - P(g ∩ p) = 9/14 + 8/14 - 5/14 = 12/14
Idea: count data only once!
Events A and B are disjoint (or mutually exclusive) if they cannot both occur together. In such a case, the intersection of the events is empty: A ∩ B = ø, and we recall that P(ø) = 0. We then have
    P(A ∪ B) = P(A) + P(B)

Examples: addition rule

Pregnancy test results
                        Positive test result          Negative test result
                        (test indicates pregnant)     (test indicates not pregnant)
Subject pregnant        80                            5
Subject not pregnant    3                             11

Clinical trials of pregnancy test: Assuming that 1 person is selected at random from the 99 people in the test, find the probability of selecting a subject who is pregnant or had a positive test result.
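One way to set up the question just posed is to count directly from the table. The sketch below is illustrative only (the helper prob and the dictionary layout are assumptions of this sketch, not part of the slides); it applies the addition rule and compares it with a direct count of the subjects who are pregnant or tested positive.

```python
from fractions import Fraction

# Counts from the pregnancy-test table, keyed by (pregnant?, test positive?).
counts = {
    (True, True): 80,    # pregnant, positive test
    (True, False): 5,    # pregnant, negative test
    (False, True): 3,    # not pregnant, positive test
    (False, False): 11,  # not pregnant, negative test
}
total = sum(counts.values())


def prob(event) -> Fraction:
    """Probability that one randomly selected subject satisfies `event`."""
    favorable = sum(n for key, n in counts.items() if event(*key))
    return Fraction(favorable, total)


p_pregnant = prob(lambda pregnant, positive: pregnant)
p_positive = prob(lambda pregnant, positive: positive)
p_both = prob(lambda pregnant, positive: pregnant and positive)

# Addition rule: subtract the overlap so it is counted only once.
p_either_rule = p_pregnant + p_positive - p_both

# Direct count of subjects who are pregnant or positive (or both).
p_either_direct = prob(lambda pregnant, positive: pregnant or positive)

print("addition rule:", p_either_rule)
print("direct count: ", p_either_direct)
```

Both routes count the 80 subjects in the overlap exactly once, which is the point of the subtraction.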
P(pregnant) = (80 + 5)/99 = 85/99
P(test positive) = (80 + 3)/99 = 83/99
P(pregnant and test positive) = 80/99
P(pregnant or test positive) = P(pregnant) + P(test positive) - P(pregnant and test positive)
                             = 85/99 + 83/99 - 80/99 = 88/99 = 8/9 = 0.889
Alternatively:
P(pregnant or positive) = P(pregnant and positive) + P(pregnant and negative) + P(not pregnant but positive)
                        = 80/99 + 5/99 + 3/99 = 88/99
Note that
    Pregnant = (pregnant & pos.) + (pregnant & neg.)
    Positive = (pregnant & pos.) + (pos. & not pregnant)
Subtract to avoid double counting!

Multiplication rule
• P(A and B) = P(A ∩ B)
Example: Answer at random
1. True/false: A pound of feathers is heavier than a pound of gold.
2. Which has affected society most:
   a) Remote control
   b) Sneakers with high heels
   c) Hostess Twinkies
   d) Computers
   e) Phone
• To answer q. 1 at random, each choice has probability 1/2.
• To answer q. 2 at random, each choice has probability 1/5.
• P(both answers correct) = P(T and (d)) = P(T) P(d) = 1/2 · 1/5 = 1/10
[Tree diagram: T and F, each branching into a, b, c, d, e, for 10 equally likely outcomes]
HW: p.130 #13-20

Multiplication rule: independent events
Independence: the occurrence of one event does not affect the probability of the other.
If events A and B are independent, then P(A ∩ B) = P(A) P(B).
Example: Throwing 2 dice. What is the probability that the first number is even and the second one is larger than 4?
    A: first die even
    B: second die larger than 4
Independent? YES!
    P(A) = 3/6 = 1/2
    P(B) = P("face shows 5 or 6") = 2/6 = 1/3
Answer: P(A ∩ B) = P(A) P(B) = 1/2 · 1/3 = 1/6

Outcomes (die #1 - die #2):
    1-1  1-2  1-3  1-4  1-5  1-6
    2-1  2-2  2-3  2-4  2-5  2-6
    3-1  3-2  3-3  3-4  3-5  3-6
    4-1  4-2  4-3  4-4  4-5  4-6
    5-1  5-2  5-3  5-4  5-5  5-6
    6-1  6-2  6-3  6-4  6-5  6-6
Alternatively: from the table, 6 of the 36 outcomes are favorable (2-5, 2-6, 4-5, 4-6, 6-5, 6-6), so P(A ∩ B) = 6/36 = 1/6.

Multiplication rule (without replacement)
Genetics experiments: If 2 peas are chosen at random without replacement, what is the probability that the first one has a green pod and the second one a yellow one?
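The question just posed can be explored by brute force. The sketch below is an illustration, not the slides' own method: it labels the 14 peas by pod color (9 green, 5 yellow, as in the Mendel example), enumerates every ordered pair of two different peas, and compares the resulting fraction with the product of the two selection probabilities. The variable names are made up for this sketch.

```python
from fractions import Fraction
from itertools import permutations

# 14 peas from the Mendel example: 9 with green pods, 5 with yellow pods.
pods = ["g"] * 9 + ["y"] * 5

# Without replacement: every ordered pair of two *different* peas.
ordered_pairs = list(permutations(range(len(pods)), 2))

# Favorable pairs: first pea green, second pea yellow.
favorable = [(i, j) for i, j in ordered_pairs
             if pods[i] == "g" and pods[j] == "y"]

p_enumerated = Fraction(len(favorable), len(ordered_pairs))

# Multiplication rule: the second factor is computed out of the 13 peas
# that remain after the first (green) pea has been removed.
p_rule = Fraction(9, 14) * Fraction(5, 13)

print(len(favorable), "favorable out of", len(ordered_pairs), "ordered pairs")
print("enumeration:", p_enumerated, " rule:", p_rule, " decimal:", float(p_rule))
```

Replacing permutations with itertools.product(range(14), repeat=2) would model sampling with replacement instead, where the second factor stays at 5/14.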
1st selection: P(g) = 9/14 (14 peas, 9 green pods)
2nd selection: P(y) = 5/13 (13 peas left, 5 yellow pods)
P(g first, y second) = P(g) P(y) = 9/14 · 5/13 = 0.247
• Must take into account: without replacement.
• The second pea is chosen out of only 13 peas!
• The second event should take into account the fact that the first one occurred!

Conditional Probability
• P(B | A): conditional probability of B given A, i.e. the probability of event B occurring given that event A has occurred.
• P(A | B): conditional probability of A given B, i.e. the probability of event A occurring given that event B has occurred.
General multiplication rule: P(A and B) = P(A | B) P(B)
Examples: CD quality control. Types of damage: Water (W), Crushing (C), Puncture (P), Marking (M).
5 damaged items: W, C, C, P, M. 2 items are selected randomly.
a) With replacement: probability that the first item is C and the second item is C.
   P(CC) = 2/5 · 2/5 = 4/25 = 0.16
b) Without replacement:
   P(CC) = 2/5 · 1/4 = 2/20 = 0.1

More multiplication examples
• Probability of 4 aces in 4 cards: P(4 aces) = 4/52 · 3/51 · 2/50 · 1/49 = 0.00000369
• Pollsters sample without replacement, but treat events as independent if the sample size is no more than 5% of the population.
• Quality control:
  – Former DVD defect rate: 3%
  – New DVD process. Claimed better!
  – 5000 DVDs produced
  – 200 sampled, 0 defective
  Is the claim plausible?
  – P(1 DVD ok) = 0.97 (assuming the old 3% defect rate, 97% are good).
  – Assume independence (the sample of 200 is 4% of the 5000 produced, under the 5% guideline):
    P(0 DVDs defective) = P(all 200 ok)
                        = P(DVD 1 ok and DVD 2 ok and … and DVD 200 ok)
                        = P(DVD 1 ok) P(DVD 2 ok) … P(DVD 200 ok)
                        = 0.97 · 0.97 · … · 0.97 = 0.97^200 = 0.00226
  This probability is so small (rare event) that finding no defective DVDs among the 200 sampled is very unlikely to have happened "by chance" under the old defect rate. Instead, it is more likely that the defect rate is now lower.

Homework problems
7.138 Defective gas masks: 19,218 gas masks from the US military were tested; 10,332 were defective.
Find the probability that 2 randomly selected gas masks from this population are defective, if the sampling is done
a) with replacement.
b) without replacement.
c) Compare and decide which to choose in this case.

a) 10332/19218 · 10332/19218 = 0.289036
b) 10332/19218 · 10331/19217 = 0.289023
c) The results are VERY similar. It makes sense to use the first case, with replacement: in a random selection of only 2 masks out of 19,218 it is very unlikely that the same mask is chosen twice, so we can assume independence.

14.139 Poll confidence levels:
• Public opinion polls usually have a "confidence level" of 95%, meaning that with a probability of 0.95, the poll results are within the margin of error.
• If 5 different groups conduct independent polls, what is the probability that all of them fall within the margin of error?
• P(all 5 polls within margin) = P(poll within margin)^5 = 0.95^5 = 0.77378
• Does the result suggest that with a confidence level of 95%, we can expect that almost all polls will be within the margin of error?
• Not quite. Each single poll is within the margin of error 95% of the time, but all 5 polls are within it only about 77% of the time. In roughly 23 out of 100 such sets of 5 polls, at least one poll falls outside the margin of error.

"At least one" event
• At least one = one or more
• Complement: none!
P(at least one girl among 3 children) = 1 - P(no girl) = 1 - 1/8 = 7/8 = 0.875
    boy-boy-boy     boy-boy-girl     boy-girl-boy     boy-girl-girl
    girl-boy-boy    girl-boy-girl    girl-girl-boy    girl-girl-girl
P(at least one of 5 polls within the margin of error) = 1 - P(no poll good) = 1 - 0.05^5 = 0.9999997

Conditional probability
The conditional probability of the event A given B is denoted by P(A | B). It is the probability that A occurs knowing that B has already occurred.
Example:
    Subject         Test positive   Test negative   Total
    pregnant        80              5               85
    not pregnant    3               11              14
    Total           83              16              99
• 1 subject is selected randomly. Find the probability of a subject being positive, given that she is pregnant.
    P(pos | pregnant) = 80/85 = 0.941
  or, by the general formula (which always holds),
    P(pos | pregnant) = P(positive and pregnant) / P(pregnant)
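The conditional probability formula can be tried out on the table counts. This is a minimal sketch under the assumption that one subject is drawn uniformly at random from the 99 in the table; the helpers prob and conditional are hypothetical names introduced for illustration only.

```python
from fractions import Fraction

# Counts from the pregnancy-test table, keyed by (pregnant?, test positive?).
counts = {
    (True, True): 80,
    (True, False): 5,
    (False, True): 3,
    (False, False): 11,
}
total = sum(counts.values())


def prob(event) -> Fraction:
    """P(event) for one subject drawn at random from the table."""
    return Fraction(sum(n for key, n in counts.items() if event(*key)), total)


def conditional(event, given) -> Fraction:
    """P(event | given) = P(event and given) / P(given)."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return prob(lambda *key: event(*key) and given(*key)) / p_given


def is_pregnant(pregnant, positive):
    return pregnant


def is_positive(pregnant, positive):
    return positive


p_pos_given_preg = conditional(is_positive, is_pregnant)
p_preg_given_pos = conditional(is_pregnant, is_positive)

print("P(pos | pregnant) =", p_pos_given_preg, "=", round(float(p_pos_given_preg), 3))
print("P(pregnant | pos) =", p_preg_given_pos, "=", round(float(p_preg_given_pos), 3))
```

The two results differ because the denominators differ (85 pregnant subjects versus 83 positive tests): P(A | B) and P(B | A) are not interchangeable.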
P(pos | pregnant) = (80/99) / (85/99) = 80/85 = 0.941
HW: p.138 #7, 11, 13, 14

Conditional Probability: Titanic
Find:
• P(man | died)
• P(died | man)
• P(boy or girl | survived)
• P(man or woman | died)

Titanic Mortality Rate
                Men     Women   Boys    Girls   Total
    Survived    332     318     29      27      706
    Died        1360    104     35      18      1517
    Total       1692    422     64      45      2223

P(man | died) = P(man & died) / P(died) = 1360/1517 = 0.897
P(died | man) = P(man & died) / P(man) = 1360/1692 = 0.804
P(boy or girl | survived) = P({boy or girl} & survived) / P(survived) = 56/706 = 0.079
P(man or woman | died) = P({man or woman} & died) / P(died) = 1464/1517 = 0.965
NOTE: P(man & died) / P(died) = (1360/2223) / (1517/2223) = 1360/1517 = 0.897

Counting
Fundamental Counting rule: If event A can occur in m ways and event B in n ways, the events together can occur in a total of m·n ways.
Examples:
• Combination lock:
  – 3 dials with digits 0-9. Total # of combinations: 10·10·10 = 10^3 = 1000.
  – Bike lock, 4 dials with digits 1-6: 6·6·6·6 = 6^4 = 1296.
• Arrangements of ABC: ABC, ACB, BAC, BCA, CAB, CBA = 6
• In 3 spots, we have __3__ __2__ __1__: 3 choices for the first spot, 2 for the second and 1 for the last, or 3·2·1 = 6.
NOTATION: 3·2·1 = 3! or 3 factorial.
    4! = 4·3·2·1 = 24, 5! = 5·4·3·2·1 = 120
    also 5! = 5·4!
    and note as well that, for example, 7!/3! = (7·6·5·4·3·2·1)/(3·2·1) = 7·6·5·4

Factorials and permutations
A collection of n different objects can be arranged in n! different ways. In the 1st spot there are n possible items to place, in the second one n-1, in the third one n-2, …, in the penultimate one 2, and in the last one only 1 choice.
Examples:
• Ways of seating 20 students in a class with 20 chairs: 20! = 2,432,902,008,176,640,000
• Arrangements of ABC: 3 choices for the first spot, 2 for the second and 1 for the last, or 3! = 3·2·1 = 6 arrangements.
• Ways of seating 4 of the 20 students in 4 preassigned chairs: 20·19·18·17 = 20!/16! = 20!/(20-4)! = 20P4 = 116,280
• 20!
/ 16! = (20·19·18·17·16·15·…·3·2·1) / (16·15·…·3·2·1) = 20·19·18·17
• In general, nPk = n!/(n-k)! = n(n-1)(n-2)…(n-k+1)
• nPk is the number of permutations of k objects out of n total objects.

Permutations and Combinations
Permutations Rule (all items differ, order does count): The number of permutations (or sequences) of k items selected from n available items (without replacement) is
    nPk = n! / (n-k)!
Permutations (some items equal): If there are n items, n1 alike, n2 alike, …, nk alike, the number of permutations of all items is
    n! / (n1! n2! … nk!)
Combinations (order does not count): The number of combinations of k items selected from n different items (without replacement) is
    nCk = n! / ((n-k)! k!)

Example: elected officers
The board of trustees of a college has 9 members. Each year, a 3-person committee is chosen. At the same time, the board elects 3 officers (president, vice president, and secretary).
a) How many slates of candidates for officers are possible?
b) How many different 3-person committees can be chosen?

a) Order does count for the officers: it matters who is P, VP and S. Therefore, permutations of k=3 people out of n=9 different people, or
    9P3 = 9!/6! = 9·8·7 = 504
b) Order does not count for the committee. Therefore, combinations of k=3 people out of n=9 different people, or
    9C3 = 9!/(6! 3!) = 84

p.14 Is the pollster lying?
• A pollster claims that 12 voters were randomly selected from a population of 200,000 voters (of which 30% are Republican) and all 12 are Republican. He claims this can easily happen by chance. Find the probability that all 12 are Republican when randomly selected, to see if we believe the claim.
• Assuming independence:
• P(12 Republicans) = P(#1 R) P(#2 R) P(#3 R) … P(#12 R) = P(R)^12 = 0.3^12 = 0.000,000,53
• Something is very fishy!
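The counting examples and the pollster calculation can be reproduced with Python's standard library. This is a sketch for checking the arithmetic, not part of the original slides; it relies on math.perm and math.comb (available from Python 3.8), and the variable names are illustrative.

```python
import math

# Officers: order matters (president, vice president, secretary), so count
# permutations of 3 people chosen from the 9 board members.
officer_slates = math.perm(9, 3)

# Committee: order does not matter, so count combinations instead.
committees = math.comb(9, 3)

# Each committee of 3 can be ordered in 3! ways, which links the two counts.
assert officer_slates == committees * math.factorial(3)

# Seating 4 of 20 students in 4 preassigned chairs: 20!/(20-4)!.
seatings = math.factorial(20) // math.factorial(20 - 4)
assert seatings == math.perm(20, 4)

# Pollster: 12 independent selections, each Republican with probability 0.3.
p_all_republican = 0.3 ** 12

print("officer slates:", officer_slates)
print("committees:    ", committees)
print("seatings:      ", seatings)
print("P(12 Republicans) = %.2e" % p_all_republican)
```

Treating the 12 selections as independent follows the slides' 5% guideline: 12 voters are a tiny fraction of the 200,000 in the population.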