Chapter 4: Probability

Slide set to accompany "Statistics Using Technology" by Kathryn Kozak (Slides by David H Straayer) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.tacomacc.edu/home/dstraayer/published/Statistics/Book/StatisticsUsingTechnology112314b.pdf.

Section 4.1: Empirical Probability
• Experiment: an activity that has specific results that can occur, but it is unknown which results will occur.
• Outcomes: the results of an experiment.
• Event: a set of certain outcomes of an experiment that you want to have happen.
• Sample Space: the collection of all possible outcomes of the experiment. Usually denoted as SS.
• Event space: the set of outcomes that make up an event. The symbol is usually a capital letter.

Trials for Die Experiment
n       Number of 6s    Relative Frequency
10      2               0.2
50      6               0.12
100     18              0.18
500     81              0.162
1000    163             0.163

Experimental Probabilities
• P(A) = (number of times A occurs) / (number of times the experiment was repeated)
• On the prior slide, 163/1000 = 0.163

Law of large numbers
• As n increases, the relative frequency tends towards the actual probability value.

Some words
Note: probability, relative frequency, percentage, and proportion are all different words for the same concept. Also, probabilities can be given as percentages, decimals, or fractions.

Section 4.2: Theoretical Probability
• It is not always feasible to conduct an experiment over and over again, so it would be better to be able to find the probabilities without conducting the experiment. These probabilities are called Theoretical Probabilities.
• To calculate theoretical probabilities, you need one assumption: all of the outcomes in the sample space must be equally likely. This means that every outcome of the experiment needs to have the same chance of happening.
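The die experiment and the law of large numbers from Section 4.1 can be tried out with a short simulation. This is only a sketch: the function name relative_frequency_of_six, the seed, and the list of trial counts are illustrative choices, not part of the slides, and it assumes Python's pseudo-random generator is a fair enough stand-in for a real die. The simulated counts will not match the table from the slides, but the relative frequency should drift towards the theoretical value 1/6 as n grows.

```python
import random


def relative_frequency_of_six(n_rolls, rng):
    """Roll a simulated fair die n_rolls times; return the fraction of sixes."""
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls


# Fixed seed (an arbitrary choice) so that repeated runs print the same numbers.
rng = random.Random(2024)

for n in (10, 50, 100, 500, 1000, 100_000):
    freq = relative_frequency_of_six(n, rng)
    print(f"n = {n:>6}: relative frequency of a 6 = {freq:.4f}")

# Theoretical probability under the equally-likely assumption of Section 4.2.
print(f"theoretical probability 1/6 = {1 / 6:.4f}")
```

Small values of n can land well away from 1/6, which is the same behavior the table shows for n = 10 and n = 50.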
Theoretical Probabilities
If the outcomes of an experiment are equally likely, then the probability of event A happening is:
P(A) = (# of outcomes in event space) / (# of outcomes in sample space)

Flip a pair of coins
• What is the sample space?
• What is the probability of getting exactly one head?
• What is the probability of getting at least one head?
• What is the probability of getting a head and a tail?
• What is the probability of getting a head or a tail?
• What is the probability of getting a foot?
• What is the probability of each outcome? What is the sum of these probabilities?

Probability Properties
1. 0 ≤ P(event) ≤ 1
2. If P(event) = 1, then it will happen and is called the certain event.
3. If P(event) = 0, then it cannot happen and is called the impossible event.
4. ∑ P(outcome) = 1

Pull a card from a 52-card deck
• What is the sample space?
• What is the probability of getting a Spade?
• What is the probability of getting a Jack?
• What is the probability of getting an Ace?
• What is the probability of not getting an Ace?
• What is the probability of getting a Spade and an Ace?
• What is the probability of getting a Spade or an Ace?
• What is the probability of getting a Jack and an Ace?
• What is the probability of getting a Jack or an Ace?

Complementary events
• If A is an event, the complementary event can be notated: AC, not A, ~A, or something similar.
• P(A) + P(not A) = 1
• P(not A) = 1 - P(A)
• This is more computationally useful than it may seem at first.
• Sometimes it’s a lot easier to calculate P(not A) than P(A) (see: shared birthdays).

Shared Birthdays
• What is the probability that at least two students in this class share a birthday?
• There are a lot of ways this can happen!
• But there is only one way of it not happening: everybody has a different birthday.
• Calculating that is a bit beyond where we are now, but it turns out to be a lot easier.
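The shared-birthday question can be sketched in a few lines using the complement rule. The calculation relies on multiplying conditional probabilities, which the slides cover later, so treat it as a preview. The function name prob_shared_birthday and the class sizes are illustrative, and the sketch assumes 365 equally likely birthdays (no leap day) and that students' birthdays are independent of each other.

```python
def prob_shared_birthday(n_students, days_in_year=365):
    """P(at least two of n_students share a birthday), via P(A) = 1 - P(not A)."""
    # P(not A): every student has a different birthday. Student k (counting
    # from 0) must avoid the k birthdays already taken by earlier students.
    p_all_different = 1.0
    for k in range(n_students):
        p_all_different *= (days_in_year - k) / days_in_year
    return 1.0 - p_all_different


# Hypothetical class sizes, chosen only for illustration.
for n in (10, 23, 30, 40):
    print(f"{n} students: P(shared birthday) = {prob_shared_birthday(n):.3f}")
```

Under these assumptions the probability passes 50% at 23 students, which is why the direct "count all the ways it can happen" approach is so much harder than working with the complement.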
Two critical distinctions
• Mutual exclusivity
• Independence
• For each distinction, remember a definition and a canonical example for both possibilities.

Mutual Exclusivity
• Two events are mutually exclusive if they can’t happen at the same time.
• Canonical examples: (rolling a pair of dice)
– Exclusive: Rolling a pair and rolling a 7 – there is no roll of two dice that totals 7 and has the same number on each die.
– Not Exclusive: Rolling a pair and rolling an 8 – the roll of 4, 4 is a pair and totals 8.

Independence
• Two events are independent if the fact that one happens does not alter the probability of the second happening.
• Canonical examples: (Drawing cards from a 52-card deck. Event A is “the first card is a Queen”, and Event B is “the second card is a Queen”)
– Independent: draw a card, note whether it is a Queen or not, put it back in the deck, re-shuffle, and draw a second card.
– Not independent: draw two cards out of the deck. The probability of the second card being a Queen changes depending on whether the first card was a Queen.

Addition Rules
• If two events A and B are mutually exclusive, then P(A or B) = P(A) + P(B) and P(A and B) = 0
• If two events A and B are not mutually exclusive, then P(A or B) = P(A) + P(B) - P(A and B)

Two Dice Sample Space
[Figure: two-dice sample space]

52 cards
[Figure: 52-card deck]

Roll a pair of dice
a) What is the sample space?
b) What is the probability of getting a sum of 5?
c) What is the probability of getting the first die a 2?
d) What is the probability of getting a sum of 7?
e) What is the probability of getting a sum of 5 and the first die a 2?
f) What is the probability of getting a sum of 5 or the first die a 2?
g) What is the probability of getting a sum of 5 and sum of 7?
h) What is the probability of getting a sum of 5 or sum of 7?

Odds

Section 4.3: Conditional Probability
• Probabilities calculated after information is given. This is where you want to find the probability of event A happening after you know that event B has happened.
If you know that B has happened, then you don’t need to consider the rest of the sample space. You only need the outcomes that make up event B. Event B becomes the new sample space, which is called the restricted sample space, R.

Restricted sample space
• If you always write a restricted sample space when doing conditional probabilities and use this as your sample space, you will have no trouble with conditional probabilities. The notation for conditional probabilities is P(A, given B) = P(A|B). The event following the vertical line is always the restricted sample space.

New Information
• One way of looking at this conditional probability issue is “How does this new information cause me to revise my estimate of likelihood (probability)?”
• There is a whole branch of Statistics known as Bayesian Analysis that deals with this.
• And some genuinely weird history.

Suppose you roll two dice. What is the probability of getting a sum of 5, given that the first die is a 2?
Solution: Since you know that the first die is a 2, this is your restricted sample space, so
R = {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)}
Out of this restricted sample space, the only way to get a sum of 5 is {(2,3)}. Thus
P(sum of 5 | the first die is a 2) = 1/6

Probability of a 5?
• When we considered all 36 possible outcomes, 4 of them, (1,4), (2,3), (3,2), (4,1), had a total of 5. Without any knowledge of the first die, the probability of getting a 5 is 4/36 = 1/9, about 11.1%.
• But when we know the first die was a 2, the probability changes to 1/6, or roughly 16.7%.

Suppose you roll two dice. What is the probability of getting a sum of 7, given the first die is a 4?
Solution: Since you know that the first die is a 4, this is your restricted sample space, so
R = {(4,1), (4,2), (4,3), (4,4), (4,5), (4,6)}
Out of this restricted sample space, the only way to get a sum of 7 is {(4,3)}. Thus
P(sum of 7 | first die is a 4) = 1/6

Probability of a 7?
• When we considered all 36 outcomes, 6 of them totaled 7, so we got the same probability for getting a 7: 6/36 = 1/6
• This means that knowing that the first die is a 4 did not change the probability that the sum is a 7. This added knowledge did not help you in any way. It is as if that information was not given at all.

Dependent and Independent Events
• In the second case, the events “sum of 7” and “first die is a 4” are called independent events.
• In the first case, the events “sum of 5” and “first die is a 2” are called dependent events.
• Events A and B are considered independent events if the fact that one event happens does not change the probability of the other event happening.

A and B are independent if
• P(A|B) = P(A) or
• P(B|A) = P(B)

a) Suppose you roll two dice. Are the events “sum of 7” and “first die is a 3” independent?
b) Suppose you roll two dice. Are the events “sum of 6” and “first die is a 4” independent?
c) Suppose you pick a card from a deck. Are the events “Jack” and “Spade” independent?
d) Suppose you pick a card from a deck. Are the events “Heart” and “Red” card independent?
e) Suppose you have two children via separate births. Are the events “the first is a boy” and “the second is a girl” independent?
f) Suppose you flip a coin 50 times and get a head every time. What is the probability of getting a head on the next flip?

Multiplication Rule
• If two events are dependent, then: P(A and B) = P(A)*P(B|A)
• If two events are independent, then: P(A and B) = P(A)*P(B)
• Solving for the conditional probability: P(B|A) = P(A and B) / P(A)
• Unless the sample space is large, it is often easier to find a conditional probability by writing the restricted sample space and counting.

Multiplication rule examples
a) Suppose you pick three cards from a deck. What is the probability that they are all Queens if the cards are not replaced after they are picked?
b) Suppose you pick three cards from a deck. What is the probability that they are all Queens if the cards are replaced after they are picked and before the next card is picked?

Two-Way Table: Leprosy Cases
                         World Bank Income Group
WHO Region               High      Upper Middle   Lower Middle   Low       Row
                         Income    Income         Income         Income    Total
Americas                 174       36028          615            0         36817
Eastern Mediterranean    54        6              1883           604       2547
Europe                   10        0              0              0         10
Western Pacific          26        216            3689           1155      5086
Africa                   0         39             1986           15928     17953
South-East Asia          0         0              149896         10236     160132
Column Total             264       36289          158069         27923     222545

Problems from this table
a) Find the probability that a person with leprosy is from the Americas.
b) Find the probability that a person with leprosy is from a high-income country.
c) Find the probability that a person with leprosy is from the Americas and a high-income country.
d) Find the probability that a person with leprosy is from a high-income country, given they are from the Americas.
e) Find the probability that a person with leprosy is from a low-income country.
f) Find the probability that a person with leprosy is from Africa.
g) Find the probability that a person with leprosy is from Africa and a low-income country.
h) Find the probability that a person with leprosy is from Africa, given they are from a low-income country.
i) Are the events that a person with leprosy is from “Africa” and “low-income country” independent events? Why or why not?
j) Are the events that a person with leprosy is from “Americas” and “high-income country” independent events? Why or why not?

Bayes Theorem Supplement

The Theorem
• Begin with: P(A & B) = P(A)*P(B|A) (this is the “and” rule)
• Goal: calculate P(H|E), the new probability of a Hypothesis H given new evidence E.
• Begin by re-naming: P(E & H) = P(E)*P(H|E)

Proof, continued
• But note “&” is symmetric: P(E & H) = P(H & E)
• P(E)*P(H|E) = P(H)*P(E|H)
• Solve for our “target” P(H|E)
• P(H|E) = P(H)*P(E|H) / P(E)
• Hmm… but what is P(E)?
• P(E) = P(E & H) + P(E & ~H)
• P(H|E) = P(H)*P(E|H) / [P(E & H) + P(E & ~H)]
• P(H|E) = P(H)*P(E|H) / [P(H)*P(E|H) + P(~H)*P(E|~H)]
• But P(~H) = 1 - P(H) (complement rule)
• So P(H|E) = P(H)*P(E|H) / [P(H)*P(E|H) + (1 - P(H))*P(E|~H)]
• P(H|E) can be calculated from three terms.

The three terms of Bayes
• P(H) is called the Prior Probability of H, or just the Prior. Sometimes called the base rate. This is the best estimate of the probability of the hypothesis before considering the new evidence E.
• P(E|H) is the probability of getting the evidence we got under the assumption that H is true.
• P(E | ~H) is the probability of getting the evidence we got under the assumption that H is false. Note that we can use the complement rule to say P(E | ~H) = 1 - P(~E | ~H).

Bayes as a tool to avoid Fallacies
• Confirmation bias: only focusing on P(E|H) (“See, the evidence is consistent with my theory!”) and ignoring P(E | ~H). (Ignoring that there could be other explanations for the evidence, even if the theory is not correct.)
• Base Rate: Forgetting to factor in the base rate. (“The test is 99% accurate, you have a positive result, hence there is a 99% chance you have the disease.”)

Apply to medical diagnosis
• Sensitivity: Assume a test is 99% accurate in true positive rate, which means that if the patient has the disease, 99% of the time the test will be positive. This is P(E | H). We could say the false negative rate is 1% (P(~E | H) = 1%).
• Specificity: Assume that the test is 98% accurate in the true negative rate, which means that if the patient does not have the disease, 98% of the time the test will be negative. This is P(~E | ~H). We could say the false positive rate is 2% (P(E | ~H) = 2%).
• Base Rate: Assume that 1/1000 of people in the population have the disease.
This is the prior: P(H)

Bayes “Do I have it?”
• P(H|E) = P(H)*P(E|H) / [P(H)*P(E|H) + (1 - P(H))*P(E|~H)]
• Or
• P(H|E) = P(H)*P(E|H) / [P(H)*P(E|H) + (1 - P(H))*(1 - P(~E|~H))]
• = 0.001*0.99 / [0.001*0.99 + (1 - 0.001)*(1 - 0.98)]
• ≈ 4.7%

“Fake population” table approach

History and Bayes
• For a long time, there was a great controversy between “Bayesians” and “Frequentists” among statisticians.
• Lately, a consensus seems to be developing that both approaches are just different ways of viewing the world, and are in fact fully compatible.

Summary of logic to probability rules
• NOT: P(not A) = 1 - P(A)
• OR:
– P(A or B) = P(A) + P(B) (if P(A and B) = 0)
– P(A or B) = P(A) + P(B) - P(A and B) (in general)
• AND:
– P(A and B) = P(A) * P(B) (if independent)
– P(A and B) = P(A) * P(B | A) (if the probability of B depends on whether A occurred)
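The medical-diagnosis calculation from the Bayes supplement can be checked with a short sketch. The function name posterior and its parameter names are illustrative, not from the slides. The second half is one possible reading of the “fake population” approach, using a hypothetical population of 1,000,000 people; the slide's own table is not reproduced in the text, so that part is an assumption.

```python
def posterior(prior, sensitivity, specificity):
    """P(H|E) for a positive test result, from the three terms of Bayes."""
    false_positive_rate = 1.0 - specificity          # P(E|~H) = 1 - P(~E|~H)
    true_pos = prior * sensitivity                   # P(H) * P(E|H)
    false_pos = (1.0 - prior) * false_positive_rate  # (1 - P(H)) * P(E|~H)
    return true_pos / (true_pos + false_pos)


# Numbers from the slides: base rate 1/1000, sensitivity 99%, specificity 98%.
p = posterior(prior=0.001, sensitivity=0.99, specificity=0.98)
print(f"P(disease | positive test) = {p:.4f}")

# "Fake population" check: count people instead of multiplying probabilities.
population = 1_000_000                    # hypothetical size, chosen for round numbers
sick = population // 1000                 # base rate 1/1000
healthy = population - sick
true_positives = sick * 0.99              # sick people who test positive
false_positives = healthy * (1 - 0.98)    # healthy people who test positive
p_table = true_positives / (true_positives + false_positives)
print(f"same answer from counts     = {p_table:.4f}")
```

Both routes give roughly 4.7%, far below the 99% that the base-rate fallacy would suggest, because the false positives among the many healthy people outnumber the true positives among the few sick ones.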