Probability
• Formal study of uncertainty
• The engine that drives statistics

Introduction
• Nothing in life is certain
• We gauge the chances of successful outcomes in business, medicine, weather, and other everyday situations such as the lottery (recall the birthday problem)

History
• For most of human history, probability, the formal study of the laws of chance, was used for only one thing: gambling

History (cont.)
• Nobody knows exactly when gambling began; it goes back at least as far as ancient Egypt, where 4-sided “astragali” (made from animal heel bones) were used

History (cont.)
• The Roman emperor Claudius (10 BC-54 AD) wrote the first known treatise on gambling.
• The book, “How to Win at Gambling”, was lost. Rule 1: Let Caesar win IV out of V times

Approaches to Probability
• Relative frequency: event probability = x/n, where x = # of occurrences of the event of interest and n = total # of observations
• Examples: coin, die tossing; nuclear power plants?
• Limitation: repeated observations are not always practical

Approaches to Probability (cont.)
• Subjective probability: an individual assigns a probability based on personal experience, anecdotal evidence, etc.
• Classical approach: every possible outcome has equal probability (more later)

Basic Definitions
• Experiment: act or process that leads to a single outcome that cannot be predicted with certainty
• Examples:
1. Toss a coin
2. Draw 1 card from a standard deck of cards
3. Arrival time of a flight from Atlanta to RDU

Basic Definitions (cont.)
• Sample space: all possible outcomes of an experiment; denoted by S
• Event: any subset of the sample space S; typically denoted A, B, C, etc.
• Simple event: event with only 1 outcome
• Null event: the empty set ∅
• Certain event: S

Examples
1. Toss a coin once: S = {H, T}; A = {H} and B = {T} are simple events
2. Toss a die once; count dots on upper face: S = {1, 2, 3, 4, 5, 6}; A = even # of dots on upper face = {2, 4, 6}; B = 3 or fewer dots on upper face = {1, 2, 3}

Laws of Probability
1. 0 ≤ P(A) ≤ 1, for any event A
2.
P(∅) = 0, P(S) = 1

Laws of Probability (cont.)
3. P(A') = 1 − P(A)
• For an event A, A' is the complement of A: everything in S that is not in A.
[Venn diagram: S divided into A and A']

Birthday Problem
• What is the smallest number of people you need in a group so that the probability of 2 or more people having the same birthday is greater than 1/2?
• Answer: 23

No. of people    23     30     40     60
Probability     .507   .706   .891   .994

Example: Birthday Problem
• A = {at least 2 people in the group have a common birthday}
• A' = {no two people have a common birthday}
• 3 people: P(A') = (364/365) × (363/365)
• 23 people: P(A') = (364/365) × (363/365) × … × (343/365) = .493, so P(A) = 1 − P(A') = 1 − .493 = .507

Unions and Intersections
[Venn diagrams: A ∪ B and A ∩ B within S]

Mutually Exclusive Events
• Mutually exclusive events: no outcomes from S in common, i.e. A ∩ B = ∅
[Venn diagram: non-overlapping A and B within S]

Laws of Probability (cont.)
Addition Rule for Disjoint Events:
4. If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B)
5. For two independent events A and B, P(A ∩ B) = P(A) × P(B)

Laws of Probability (cont.)
General Addition Rule
6. For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
[Venn diagram: overlapping A and B within S]

Example: toss a fair die once
• S = {1, 2, 3, 4, 5, 6}
• A = even # appears = {2, 4, 6}
• B = 3 or fewer = {1, 2, 3}
• P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = P({2, 4, 6}) + P({1, 2, 3}) − P({2}) = 3/6 + 3/6 − 1/6 = 5/6

Laws of Probability: Summary
1. 0 ≤ P(A) ≤ 1 for any event A
2. P(∅) = 0, P(S) = 1
3. P(A') = 1 − P(A)
4. If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B)
5. If A and B are independent events, then P(A ∩ B) = P(A) × P(B)
6.
For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Probability Models
The Equally Likely Approach (also called the Classical Approach)

Assigning Probabilities
• If an experiment has N equally likely outcomes, then each outcome has probability 1/N of occurring
• If an event A1 has n1 outcomes, then P(A1) = n1/N

We Need Efficient Methods for Counting Outcomes

Product Rule for Ordered Pairs
• A student wishes to commute to a junior college for 2 years and then commute to a state college for 2 years. Within commuting distance there are 4 junior colleges and 3 state colleges. How many junior college-state college pairs are available to her?

Product Rule for Ordered Pairs
• junior colleges: 1, 2, 3, 4
• state colleges: a, b, c
• possible pairs:
(1, a) (1, b) (1, c)
(2, a) (2, b) (2, c)
(3, a) (3, b) (3, c)
(4, a) (4, b) (4, c)
• 4 junior colleges, 3 state colleges: total number of possible pairs = 4 × 3 = 12
• In general, if there are n1 ways to choose the first element of the pair and n2 ways to choose the second element, then the number of possible pairs is n1 × n2. Here n1 = 4, n2 = 3.

Counting in “Either-Or” Situations
• NCAA Basketball Tournament: how many ways can the “bracket” be filled out?
1. How many games? 63
2. 2 choices for each game
3. Number of ways to fill out the bracket: 2^63 = 9.2 × 10^18
• Earth's population is
about 6 billion; if everyone filled out 1 million different brackets, the chance that any of them got all games correct would still be only about 1 in 1,000

Counting Example
• Pollsters minimize lead-in effect by rearranging the order of the questions on a survey
• If Gallup has a 5-question survey, how many different versions of the survey are required if all possible arrangements of the questions are included?

Solution
• There are 5 possible choices for the first question, 4 remaining choices for the second question, 3 choices for the third question, 2 choices for the fourth question, and 1 choice for the fifth question.
• The number of possible arrangements is therefore 5 × 4 × 3 × 2 × 1 = 120

Efficient Methods for Counting Outcomes
• Factorial notation: n! = 1 × 2 × … × n
• Examples: 1! = 1; 2! = 1 × 2 = 2; 3! = 1 × 2 × 3 = 6; 4! = 24; 5! = 120
• Special definition: 0! = 1

Factorials with calculators and Excel
• Calculator: non-graphing: x! (second function); graphing: bottom of p. 9, TI Calculator Commands (math button)
• Excel: Paste: math, fact

Factorial Examples
• 20! = 2.43 × 10^18
• 1,000,000 seconds? About 11.5 days
• 1,000,000,000 seconds? About 31 years
• 31 years ≈ 10^9 seconds
• 10^18 = 10^9 × 10^9
• 31 × 10^9 years ≈ 10^9 × 10^9 = 10^18 seconds
• 20! is roughly the age of the universe in seconds

Permutations
A B C D E
• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is important?
• 5 × 4 = 20

Permutations (cont.)
• 5 × 4 = 20 = 5!/(5 − 2)! = 5!/3!
• Notation: 5P2 = 5!/(5 − 2)! = 20

Permutations with calculator and Excel
• Calculator: non-graphing: nPr; graphing: p. 9 of TI Calculator Commands (math button)
• Excel: Paste: Statistical, Permut

Combinations
A B C D E
• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is not important?
• 5 × 4 = 20 when order is important
• Divide by 2: (5 × 4)/2 = 10 ways

Combinations (cont.)
• 5C2 = 5!/((5 − 2)! 2!) = 5!/(3! 2!) = (5 × 4)/(1 × 2) = 20/2 = 10
• In general: nCr = n!/((n − r)! r!)
• Binomial-coefficient notation: 5C2 is also written (5 choose 2), and nCr is also written (n choose r)

ST 101 Powerball Lottery
• From the numbers 1 through 20, choose 6 different numbers. Write them on a piece of paper.

Chances of Winning?
• Choose 6 numbers from 20, without replacement, order not important.
• Number of possibilities? 20C6 = 20!/((20 − 6)! 6!) = 38,760

North Carolina Powerball Lottery
Prior to Jan. 1, 2009:
• 5 from 1-55: 55!/(5! 50!) = 3,478,761
• 1 from 1-42 (p'ball #): 42!/(1! 41!) = 42
• 3,478,761 × 42 = 146,107,962
After Jan. 1, 2009:
• 5 from 1-59: 59!/(5! 54!) = 5,006,386
• 1 from 1-39 (p'ball #): 39!/(1! 38!) = 39
• 5,006,386 × 39 = 195,249,054

Visualize Your Lottery Chances
• How large is 195,249,054?
• $1 bill and $100 bill are both 6” in length
• 10,560 bills = 1 mile
• Let’s start with 195,249,053 $1 bills and one $100 bill …
• … and take a long walk, putting down bills end-to-end as we go
• Raleigh to Ft. Lauderdale … still plenty of bills remaining, so continue from …
• … Ft. Lauderdale to San Diego … still plenty of bills remaining, so continue from …
• … San Diego to Seattle … still plenty of bills remaining, so continue from …
• … Seattle to New York … still plenty of bills remaining, so continue from …
• … New York back to Raleigh … still plenty of bills remaining, so …
• Go around again! Lay a second path of bills
• Still have ~5,000 bills left!!

Chances of Winning NC Powerball Lottery?
• Remember: one of the bills you put down is a $100 bill; all others are $1 bills
• Your chance of winning the lottery is the same as bending over and picking up the $100 bill while walking the route blindfolded.

Example: Illinois State Lottery
• Choose 6 numbers from 54 numbers without replacement; order not important
• 54C6 = 54!/(48! 6!) = 25,827,165
• (about 1 second in 10 months)
• (1200 ft² house, 16.5 million ping pong balls)

Virginia State Lottery
• Pick 5: 50C5 = 50!/(45! 5!) = 2,118,760
• 2,118,760 × 25C1 = 2,118,760 × 25!/(24! 1!) = 52,969,000
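
The lottery counts above can be checked with a few lines of Python. This is a minimal sketch, assuming Python 3.8+ (for `math.comb`); the helper name `ticket_count` and its parameters are illustrative and do not come from the slides.

```python
from math import comb


def ticket_count(pool: int, picks: int, bonus_pool: int = 1) -> int:
    """Number of distinct tickets: choose `picks` numbers from `pool`
    (order not important), times the choices for one bonus ball."""
    return comb(pool, picks) * bonus_pool


# Each entry mirrors one lottery from the slides.
counts = {
    "ST 101 (6 of 20)": ticket_count(20, 6),
    "NC Powerball before 2009": ticket_count(55, 5, bonus_pool=42),
    "NC Powerball after 2009": ticket_count(59, 5, bonus_pool=39),
    "Illinois (6 of 54)": ticket_count(54, 6),
    "Virginia (5 of 50, 1 of 25)": ticket_count(50, 5, bonus_pool=25),
}

for name, n in counts.items():
    print(f"{name}: {n:,} possible tickets, chance of winning 1 in {n:,}")
```

Multiplying by `bonus_pool` is the product rule for ordered pairs: each set of main numbers can be paired with each bonus ball.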
Probability Trees
A Graphical Method for Complicated Probability Problems

Example: AIDS Testing
• V = {person has HIV}; CDC: P(V) = .006
• +: test outcome is positive (test indicates HIV present)
• -: test outcome is negative
• Clinical reliabilities for a new HIV test:
1. If a person has the virus, the test result will be positive with probability .999
2. If a person does not have the virus, the test result will be negative with probability .990

Question 1
• What is the probability that a randomly selected person will test positive?

Probability Tree Approach
• A probability tree is a useful way to visualize this problem and to find the desired probability.

Probability Tree
[Probability tree: first-level branches for V and not V; second-level branches for + and -, labeled with the clinical reliabilities; multiply branch probabilities along each path]

Question 1 Answer
• What is the probability that a randomly selected person will test positive?
• P(+) = P(V and +) + P(not V and +) = (.006)(.999) + (.994)(.010) = .00599 + .00994 = .01593

Question 2
• If your test comes back positive, what is the probability that you have HIV? (Remember: we know that if a person has the virus, the test result will be positive with probability .999; if a person does not have the virus, the test result will be negative with probability .990.)
• Looks very reliable

Question 2 Answer
• Two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV.
• P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376

Summary
• Question 1: P(+) = .00599 + .00994 = .01593
• Question 2: two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV. P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376

Recap
• We have a test with very high clinical reliabilities:
• If a person has the virus, the test result will be positive with probability .999
• If a person does not have the virus, the test result will be negative with probability .990
• But we have extremely poor performance when the test is positive: P(person has HIV given that test is positive) = .376
• In other words, 62.4% of the positives are false positives! Why?
• When the characteristic the test is looking for is rare, most positives will be false.

Examples
1. P(A) = .3, P(B) = .4; if A and B are mutually exclusive events, then P(A ∩ B) = ? Answer: A ∩ B = ∅, so P(A ∩ B) = 0
2. 15 entries in a pie baking contest at the state fair. The judge must determine 1st, 2nd, 3rd place winners. How many ways can the judge make the awards? 15P3 = 15 × 14 × 13 = 2,730
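
The probability-tree calculation from the HIV testing example can be reproduced numerically. The sketch below multiplies branch probabilities along the two paths that end in a positive test, then divides to get the conditional probability. The function name and the parameter names `prevalence`, `sensitivity`, and `specificity` are labels chosen here for P(V), P(+ given V), and P(- given not V); the slides do not use those terms.

```python
def positive_test_tree(prevalence: float, sensitivity: float, specificity: float):
    """Walk the two tree paths that end in a positive test.

    prevalence  = P(V)            (person has the virus)
    sensitivity = P(+ | V)        (first clinical reliability)
    specificity = P(- | not V)    (second clinical reliability)
    Returns (P(+), P(V | +)).
    """
    # Path 1: has virus AND tests positive.
    true_positive = prevalence * sensitivity
    # Path 2: no virus AND tests positive (a false positive).
    false_positive = (1 - prevalence) * (1 - specificity)
    # Question 1: add the probabilities of both positive paths.
    p_positive = true_positive + false_positive
    # Question 2: share of the positive paths that really have the virus.
    p_virus_given_positive = true_positive / p_positive
    return p_positive, p_virus_given_positive


p_pos, p_v_given_pos = positive_test_tree(0.006, 0.999, 0.990)
print(f"P(+)     = {p_pos:.5f}")
print(f"P(V | +) = {p_v_given_pos:.3f}")
print(f"False positives among positives: {1 - p_v_given_pos:.1%}")
```

Raising `prevalence` in the call shows why rarity drives the result: with the same reliabilities, a more common condition makes a positive test far more trustworthy.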