Chapter 5 Probability

Created by Kathy Fritz

Can ultrasound accurately predict the gender of a baby?

The paper “The Use of Three-Dimensional Ultrasound for Fetal Gender Determination in the First Trimester” (The British Journal of Radiology [2003]: 448-451) describes a study of ultrasound gender prediction. An experienced radiologist looked at 159 first trimester ultrasound images and made a gender prediction for each one. When each baby was born, the ultrasound gender prediction was compared to the baby’s actual gender. This table summarizes the resulting data:

| Radiologist 1 | Predicted Male | Predicted Female |
| --- | --- | --- |
| Baby is Male | 74 | 12 |
| Baby is Female | 14 | 59 |

Notice that the gender prediction based on the ultrasound image is NOT always correct. Questions about the accuracy of these predictions can be answered using the methods introduced in this chapter.

The paper also included gender predictions by a second radiologist, who looked at 154 first trimester ultrasound images.

| Radiologist 2 | Predicted Male | Predicted Female |
| --- | --- | --- |
| Baby is Male | 81 | 8 |
| Baby is Female | 7 | 58 |

Interpreting Probabilities

Probability
Relative Frequency
Law of Large Numbers
Basic Properties

Probability

We often find ourselves in situations where the outcome is uncertain:

When a ticketed passenger shows up at the airport, she faces two possible outcomes: (1) she is able to take the flight, or (2) she is denied a seat as a result of overbooking by the airline and must take a later flight. Based on her past experience, the passenger believes that the chance of being denied a seat is small or unlikely.

To quantify the likelihood of an occurrence, a number between 0 and 1 can be assigned to an outcome. A probability is a number between 0 and 1 that reflects the likelihood of occurrence of some outcome.

Subjective Approach to Probability

In the subjective interpretation of probability, a probability is interpreted as a personal measure of the strength of the belief that an outcome will occur.
A probability of 1 represents a belief that the outcome will certainly occur. A probability of 0 represents a belief that the outcome will certainly NOT occur. All other probabilities fall between these two extremes.

Because different people may have different subjective beliefs, they may assign different probabilities to the same outcome.

Relative Frequency Approach

In the relative frequency interpretation of probability, the probability of an outcome, denoted by P(outcome), is interpreted as the proportion of the time that the outcome occurs in the long run. Relative frequency can be computed by:

$P(\text{outcome}) = \dfrac{\text{number of times outcome occurs}}{\text{total number of trials}}$

A probability of 1 corresponds to an outcome that occurs 100% of the time. A probability of 0 corresponds to an outcome that occurs 0% of the time.

A package delivery service promises 2-day delivery between 2 cities in California but is often able to deliver the packages in just 1 day. The company reports that the probability of next-day delivery is 0.3.

One way to interpret this probability would be to say that in the long run, about 30 out of every 100 packages shipped arrive in 1 day.

Suppose you track the delivery of packages shipped with this company. With each new package shipped, you could compute the relative frequency of packages shipped so far that have arrived in 1 day:

$\dfrac{\text{number of packages that arrived in 1 day}}{\text{total number of packages shipped}}$

[Graph: relative frequencies for each of the first 15 packages shipped]

[Graph: relative frequencies for each of the first 50 packages shipped]

As the number of packages in the sequence increases, the relative frequency does not continue to fluctuate wildly, but instead settles down and approaches a specific value, which is the probability of interest.

[Graph: relative frequencies for each of the first 1000 packages shipped]
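The settling-down behavior described for the package example can be imitated with a short simulation. This is only a sketch: it assumes each package independently arrives in 1 day with probability 0.3, and the function name, seed, and number of simulated packages are illustrative choices rather than anything taken from the delivery company's data.

```python
import random

def running_relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials shipments and return the relative frequency of
    next-day arrivals after each shipment (assumes independent packages)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    arrivals = 0
    frequencies = []
    for shipped in range(1, n_trials + 1):
        if rng.random() < p:           # this simulated package arrived in 1 day
            arrivals += 1
        frequencies.append(arrivals / shipped)
    return frequencies

freqs = running_relative_frequency(p=0.3, n_trials=100_000)

# Relative frequency after 15, 50, 1000, and 100,000 packages
for n in (15, 50, 1000, 100_000):
    print(n, round(freqs[n - 1], 3))
```

Early values can be far from 0.3, while the value after many packages should be close to it, which is the pattern the three graphs illustrate.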
Law of Large Numbers

As the number of observations increases, the proportion of the time that an outcome occurs gets close to the probability of that outcome.

The Law of Large Numbers is the basis for the relative frequency interpretation of probabilities.

Some Basic Properties of Probability

1. The probability of any outcome is a number between 0 and 1.
2. If outcomes can’t occur at the same time, then the probability that any one of them will occur is the sum of their individual probabilities.

A large auto center sells cars made by many different manufacturers. Two of these are Honda and Toyota. Suppose:

P(Honda) = 0.25 and P(Toyota) = 0.14

An interpretation for the first value is that about 25 out of every 100 cars sold would be Hondas. Why don’t these two probabilities have a sum of 1?

Consider the make of the next car sold. Can the outcomes Honda and Toyota happen at the same time?

What is the probability that the next car sold is either a Honda or a Toyota?

P(Honda or Toyota) = 0.25 + 0.14 = 0.39

Some Basic Properties of Probability

3. The probability that an outcome will not occur is equal to 1 minus the probability that the outcome will occur.

Recall the car dealership (P(Honda) = 0.25):

What is the probability that the next car sold is not a Honda?

P(not Honda) = 1 - 0.25 = 0.75

Because a probability represents a long-run relative frequency, in situations where exact probabilities are not known, it is common to estimate probabilities based on observation.

Computing Probabilities

Chance Experiment
Sample Space
Event
Classical Approach to Probability

Chance Experiment

A chance experiment is any activity or situation in which there is uncertainty about which of two or more possible outcomes will result.

Suppose two six-sided dice are rolled and they both land on sixes. Or a coin is flipped and it lands on heads. Or record the color of the next 20 cars to pass an intersection.

These are all examples of chance experiments. Both dice landing on sixes and the coin landing on heads are outcomes of chance experiments.
Sample Space

The collection of all possible outcomes of a chance experiment is the sample space for the experiment.

Consider a chance experiment to investigate whether men or women are more likely to choose a hybrid engine over a traditional internal combustion engine when purchasing a Honda Civic at a particular dealership. The type of vehicle purchased (hybrid or traditional) will be determined and the customer’s gender will be recorded. A list of all possible outcomes is:

male, hybrid
female, hybrid
male, traditional
female, traditional

Sample space = {MH, FH, MT, FT}

This is an example of a sample space.

Event

An event is any collection of outcomes from the sample space of a chance experiment. An event can be represented by a name, such as hybrid, or by an uppercase letter, such as A, B, or C.

A simple event is an event consisting of exactly one outcome.

Recall the situation in which a person purchases a Honda Civic:

Sample space = {MH, FH, MT, FT}

Each of these 4 outcomes is a simple event.

Identify the following events:

traditional = {MT, FT}
female = {FH, FT}

Classical Approach to Probability

When the outcomes in the sample space of a chance experiment are equally likely, the probability of an event E, denoted by P(E), is the ratio of the number of outcomes favorable to E to the total number of outcomes in the sample space:

$P(E) = \dfrac{\text{number of outcomes favorable to } E}{\text{number of outcomes in the sample space}}$

The classical approach to probability works well for chance experiments that have a finite set of outcomes that are equally likely.

Four students (Adam (A), Bettina (B), Carlos (C), and Debra (D)) submitted correct solutions to a math contest that had two prizes. The contest rules specify that if more than two correct responses are submitted, the winners will be selected at random from those submitting correct responses.

What is the sample space for selecting the two winners from the four correct responses?
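One systematic way to list the outcomes is to generate every unordered pair of students. The sketch below does that and then applies the classical formula to one illustrative event, "Adam is one of the winners". That event and the helper name `classical_probability` are assumptions made for this example and are not part of the contest problem.

```python
from itertools import combinations

students = ["A", "B", "C", "D"]  # Adam, Bettina, Carlos, Debra

# Each outcome is an unordered pair of winners, written as a two-letter string
sample_space = ["".join(pair) for pair in combinations(students, 2)]

def classical_probability(event, outcomes):
    """Favorable outcomes divided by total outcomes.
    Only valid when all outcomes are equally likely."""
    favorable = [o for o in outcomes if event(o)]
    return len(favorable) / len(outcomes)

def adam_wins(outcome):
    # Hypothetical event used only for illustration
    return "A" in outcome

print(sample_space)
print(classical_probability(adam_wins, sample_space))
```

Because the winners are chosen at random, each pair is equally likely, which is the condition the classical approach relies on.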
Sample space = {AB, AC, AD, BC, BD, CD}

Because the winners are selected at random, the six possible outcomes are equally likely.

Let E be the event that both selected winners are the same sex. What is the probability of E?

$P(E) = \dfrac{2}{6} = 0.333$

Let F be the event that at least one of the selected winners is female. What is the probability of F?

$P(F) = \dfrac{5}{6} = 0.833$

Relative Frequency Approach to Probability

When a chance experiment is performed, some events may be likely to occur, whereas others may not be as likely to occur. In cases like these, the classical approach is not appropriate.

The probability of an event E, denoted by P(E), is defined to be the value approached by the relative frequency of occurrence of E in a very long series of observations from a chance experiment. If the number of observations is large,

$P(E) \approx \dfrac{\text{number of times } E \text{ occurs}}{\text{number of repetitions}}$

Suppose that you perform a chance experiment that consists of flipping a cap from a 20-ounce bottle of soda and noting whether the cap lands with the top up or down.

Do you think that the event U, the cap landing top up, and event D, the cap landing top down, are equally likely? Why or why not?

You carry out this chance experiment by flipping the cap 1000 times and recording whether it lands top up or top down. The cap lands top up 694 times.
$P(U) \approx \dfrac{694}{1000} = 0.694$

Probabilities of More Complex Events

Union
Intersection
Complement
Mutually Exclusive Events
Independent Events

Consider the chance experiment that consists of selecting a student at random from those enrolled at a particular college. There are 9000 students enrolled at the college.

Here are some possible events:

F = event that the selected student is female
O = event that the selected student is older than 30
A = event that the selected student favors the expansion of the athletic program
S = event that the selected student is majoring in one of the lab sciences

Complement

If E is an event, the complement of E, denoted $E^C$, is the event that E does not occur. The probability of $E^C$ can be computed from the probability of E as follows:

$P(E^C) = 1 - P(E)$

Suppose that 4300 of the 9000 students favor the expansion of the athletic program.

$P(A) = \dfrac{4300}{9000} = 0.48$

What is the probability of event A not occurring?

$P(A^C) = 1 - 0.48 = 0.52$

Intersection

If E and F are events, the intersection of E and F is denoted by $E \cap F$ and is the new event that both E and F occur. The symbol $\cap$ is the symbol for “intersection”.

Consider the events:

O = event that the selected student is older than 30
S = event that the selected student is majoring in one of the lab sciences

This table summarizes the occurrence of these events:

| | S (Majoring in Lab Science) | $S^C$ (Not Majoring in Lab Science) | Total |
| --- | --- | --- | --- |
| O (Over 30) | 400 | 1700 | 2100 |
| $O^C$ (Not over 30) | 1100 | 5800 | 6900 |
| Total | 1500 | 7500 | 9000 |

The four interior cells of the table correspond to the intersections of the events. For example, the upper-left cell counts students who are over 30 AND majoring in a lab science, and the lower-right cell counts students who are not over 30 AND not majoring in a lab science.

What is the probability that a randomly selected student is older than 30 AND is majoring in a lab science?

$P(O \cap S) = \dfrac{400}{9000} = 0.04$

Union

If E and F are events, the union is denoted by $E \cup F$.
The event $E \cup F$ is the new event that E or F (or both) occurs. The symbol $\cup$ is the symbol for “union”.

Consider the events:

O = event that the selected student is older than 30
A = event that the selected student favors the expansion of the athletic program

This table summarizes the occurrence of these events:

| | A (Favors Expansion) | $A^C$ (Does Not Favor Expansion) | Total |
| --- | --- | --- | --- |
| O (Over 30) | 1600 | 500 | 2100 |
| $O^C$ (Not over 30) | 2700 | 4200 | 6900 |
| Total | 4300 | 4700 | 9000 |

What is the probability that a randomly selected student is older than 30 OR favors the expansion of the athletic program?

$P(O \cup A) = \dfrac{1600 + 500 + 2700}{9000} = 0.53$

Hypothetical 1000

You can use tables to compute the probability of an intersection of two events and the probability of a union of two events. In the previous examples, this was possible because a student was to be selected at random and because the number of students falling into each of the cells of the appropriate table was given.

In many situations, you may ONLY know the probabilities of some events. In this case, it is often possible to create a “hypothetical 1000” table and then use the table to compute probabilities.

The report “TV Drama/Comedy Viewers and Health Information” (www.cdc.gov/healthmarketing) describes a large survey that was conducted by the Centers for Disease Control (CDC). The CDC believed that the sample was representative of adult Americans.

Let’s investigate these events (taken from questions on the survey):

L = event that a randomly selected adult American reports learning something new about a health issue or disease from a TV show in the previous 6 months.
F = event that a randomly selected adult American is female.
Data from the survey were used to estimate the following probabilities:

$P(L) = 0.58 \qquad P(F) = 0.5 \qquad P(L \cap F) = 0.31$

CDC study continued

Begin by labeling the rows and columns of the table, and put the “hypothetical 1000” in the bottom right cell.

P(F) tells you that the F row total is (0.50)(1000) = 500 and that the Not F row total is 1000 - 500 = 500.

P(L) tells you that 58% of the 1000 people should be in the L column: (0.58)(1000) = 580. The L column and the Not L column totals have a sum of 1000.

P(L ∩ F) tells you that 31% of the 1000 people are both female and learned health information from a TV show. The cell for L and F is (0.31)(1000) = 310.

Fill in the remaining cells to complete the table.

| | L (learned from TV) | Not L | Total |
| --- | --- | --- | --- |
| F (female) | 310 | 190 | 500 |
| Not F | 270 | 230 | 500 |
| Total | 580 | 420 | 1000 |

What is the probability that a randomly selected adult American has learned something new about a health issue or disease from a TV show in the previous 6 months or is female?

$P(L \cup F) = \dfrac{310 + 270 + 190}{1000} = 0.770$

Let’s look at the hypothetical table once more. Suppose:

$P(A) = 0.6$, $P(B^C) = 0.7$, and $P(A \cap B) = 0.2$

| | B | $B^C$ | Total |
| --- | --- | --- | --- |
| A | 200 | 400 | 600 |
| $A^C$ | 100 | 300 | 400 |
| Total | 300 | 700 | 1000 |

It does not matter which event goes on the side or on the top.

What is the probability of A or B happening?

$P(A \cup B) = \dfrac{200 + 100 + 400}{1000} = \dfrac{700}{1000} = 0.7$

Mutually Exclusive Events

Two events E and F are mutually exclusive if they can NOT occur at the same time.

Sometimes people call the emergency 9-1-1 number to report situations that are not considered emergencies (such as to report a lost dog). Let two events be:

M = event that the next call to 9-1-1 is for a medical emergency
N = event that the next call to 9-1-1 is not considered an emergency

Suppose that you know P(M) = 0.30 and P(N) = 0.20.

Events M and N are mutually exclusive because the next call can’t be both a medical emergency and a call that is not considered an emergency.
Mutually Exclusive Events

P(M) = 0.30 and P(N) = 0.20

A “hypothetical 1000” table is shown below. The upper-left cell (M and N) must be 0 when the two events are mutually exclusive.

| | N (Non-emergency) | Not N | Total |
| --- | --- | --- | --- |
| M (Medical Emergency) | 0 | 300 | 300 |
| Not M | 200 | 500 | 700 |
| Total | 200 | 800 | 1000 |

$P(M \cap N) = 0$

$P(M \cup N) = \dfrac{300 + 200}{1000} = 0.50$

Addition Rule for Mutually Exclusive Events

If E and F are mutually exclusive events, then

$P(E \cap F) = 0$

and

$P(E \cup F) = P(E) + P(F)$

Independent Events

Two events are independent if the probability that one event occurs is not affected by knowledge of whether the other event has occurred.

Suppose that you purchase a desktop computer system with a separate monitor and keyboard. Two possible events are:

Event 1: The monitor needs service while under warranty.
Event 2: The keyboard needs service while under warranty.

Because the two components operate independently of each other, learning that the monitor has needed warranty service would not affect your assessment of the likelihood that the keyboard will need repair.

Dependent Events

Two events are dependent if knowing that one event has occurred changes the probability that the other event occurs.

Consider a university’s course registration process, which divides students into 12 priority groups. Overall, only 10% of all students receive all requested classes, but 75% of those in the first priority group receive all requested classes.

You would say that the probability that a randomly selected student at this university receives all requested classes is 0.10. However, if you know that the selected student is in the first priority group, you revise the probability that the student receives all requested classes to 0.75. These two events are said to be dependent events.

Multiplication Rule for Two Independent Events

If two events, E and F, are independent, the probability that both events occur is the product of the individual event probabilities.
$P(E \cap F) = P(E)P(F)$

More generally, if there are k independent events, the probability that all the events occur is the product of all individual event probabilities.

The Diablo Canyon nuclear power plant in California has a warning system that includes a network of sirens. When the system is tested, individual sirens sometimes fail. The sirens operate independently of one another.

Imagine that you live near Diablo Canyon and that there are two sirens that can be heard from your home. You might be concerned about the probability that both Siren 1 and Siren 2 fail. (When the siren system is activated, about 5% of the individual sirens fail.)

Using the multiplication rule for independent events:

$P(\text{Siren 1 fails} \cap \text{Siren 2 fails}) = (0.05)(0.05) = 0.0025$

Conditional Probability

Sometimes the knowledge that one event has occurred changes our assessment of the likelihood that another event occurs.

Consider a population in which 0.1% of all the individuals have a certain disease. The presence of the disease cannot be discerned from appearances, but there is a diagnostic test available. Unfortunately, the test is not always correct. Suppose that 80% of those with positive test results actually have the disease and the other 20% of those with positive test results actually do NOT have the disease (false positive).

Disease example continued . . .

Consider the chance experiment in which an individual is randomly selected from the population. Let:

E = event that the individual has the disease
F = event that the individual's diagnostic test is positive

P(E|F) denotes the probability of event E (has disease) GIVEN that event F (tested positive) occurs. The vertical line is read “given”. This is an example of conditional probability.

Conditional Probability

Conditional probability is a probability that takes into account that a given condition has occurred.

P(A|B) is read as the probability of event A occurring GIVEN event B has occurred.
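To see how a condition changes a probability, the college table from earlier in the chapter (O = older than 30, S = majoring in a lab science) can be reused. The sketch below restricts attention to the students who satisfy the condition and then computes a proportion within that group. The dictionary layout and variable names are illustrative assumptions, and this particular calculation is an added illustration rather than one worked in the slides.

```python
# Counts from the college table: (age group, major group) -> number of students
counts = {
    ("O", "S"): 400,
    ("O", "not S"): 1700,
    ("not O", "S"): 1100,
    ("not O", "not S"): 5800,
}
total = sum(counts.values())  # 9000 students in all

# Unconditional probability that a randomly selected student is a lab science major
n_S = sum(c for (age, major), c in counts.items() if major == "S")
p_S = n_S / total

# Conditional probability: restrict to the students older than 30,
# then ask what proportion of THAT group majors in a lab science
n_O = sum(c for (age, major), c in counts.items() if age == "O")
p_S_given_O = counts[("O", "S")] / n_O

print(round(p_S, 3), round(p_S_given_O, 3))
```

The denominator changes from all 9000 students to the 2100 students who satisfy the condition, which is the key idea behind P(S|O).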
Recall the example in the Chapter Preview section about gender predictions based on ultrasounds performed during the first trimester of pregnancy. The table below summarizes the data for Radiologist 1.

| Radiologist 1 | Predicted Male | Predicted Female | Total |
| --- | --- | --- | --- |
| Baby is Male | 74 | 12 | 86 |
| Baby is Female | 14 | 59 | 73 |
| Total | 88 | 71 | 159 |

How likely is it that a predicted gender is correct?

This question is about ALL 159 ultrasound predictions.

$P(\text{predicted gender is correct}) = \dfrac{74 + 59}{159} = 0.836$

Gender prediction example continued.

Is a predicted gender more likely to be correct when the baby is male than when the baby is female?

This question is based on two conditions: the 86 male babies or the 73 female babies. The appropriate row total or column total is used as the denominator in the probability calculation.

$P(\text{predicted gender is male} \mid \text{baby is male}) = \dfrac{74}{86} = 0.86$

$P(\text{predicted gender is female} \mid \text{baby is female}) = \dfrac{59}{73} = 0.81$

Radiologist 1 is slightly more likely to be correct when the baby is male than when the baby is female.

Gender prediction example continued.

If the predicted gender is female, should you paint the nursery pink?

Here, “predicted female” is a condition. In the probability statement, the condition follows the vertical line “|”.

$P(\text{female} \mid \text{predicted female}) = \dfrac{59}{71} = 0.83$

For Radiologist 1, when the predicted gender is female, about 83% of the time the baby is actually female. So, if you painted the room pink, then the probability that you would need to repaint is about 0.17 (1 − 0.83).

Let’s take the gender prediction example a little further.

Suppose that two radiologists both work in the same clinic; Radiologist 1 works part-time while Radiologist 2 (from the Chapter Preview section) works full-time.
$P(\text{prediction is made by Radiologist 1}) = 0.30$

$P(\text{prediction is made by Radiologist 2}) = 0.70$

Let’s answer these questions:

1. What is the probability that a gender prediction based on a first-trimester ultrasound at this clinic is correct?
2. If the first-trimester ultrasound gender prediction is incorrect, what is the probability that the prediction was made by Radiologist 2?

Gender prediction example continued.

From the data we know:

$P(\text{prediction is correct} \mid \text{prediction made by Radiologist 1}) = 0.836$

$P(\text{prediction is correct} \mid \text{prediction made by Radiologist 2}) = 0.903$

$P(\text{prediction is made by Radiologist 1}) = 0.30$

$P(\text{prediction is made by Radiologist 2}) = 0.70$

Let’s create a “hypothetical 1000” table to answer the two questions. The row totals are (0.30)(1000) = 300 for Radiologist 1 and (0.70)(1000) = 700 for Radiologist 2.

Since the probability that the prediction is correct given that the prediction was made by Radiologist 1 is 0.836, the value for the Radiologist 1, Prediction Correct cell is:

(300)(0.836) = 250.8 ≈ 251

Similarly, the value for the Radiologist 2, Prediction Correct cell is:

(700)(0.903) = 632.1 ≈ 632

(Cell values MUST be whole numbers since we are counting how many are in each event.)

You can now fill in the values for the remaining cells.

| | Prediction Correct | Prediction Incorrect | Total |
| --- | --- | --- | --- |
| Radiologist 1 | 251 | 49 | 300 |
| Radiologist 2 | 632 | 68 | 700 |
| Total | 883 | 117 | 1000 |

What is the probability that a gender prediction based on a first-trimester ultrasound at this clinic is correct?

$P(\text{prediction correct}) = \dfrac{883}{1000} = 0.883$

If the first-trimester ultrasound gender prediction is incorrect, what is the probability that the prediction was made by Radiologist 2?

$P(\text{Radiologist 2} \mid \text{prediction incorrect}) = \dfrac{68}{117} = 0.581$

About 58.1% of the incorrect gender predictions at this clinic are made by Radiologist 2. This seems high, but remember that Radiologist 2 does more than twice as many predictions as Radiologist 1.

Calculating Probabilities – A More Formal Approach

Probability Formulas

The Complement Rule

For any event E,

$P(E^C) = 1 - P(E)$

The Addition Rule

For any two events E and F,

$P(E \cup F) = P(E) + P(F) - P(E \cap F)$

For mutually exclusive events, this simplifies to

$P(E \cup F) = P(E) + P(F)$

Probability Formulas Continued

The Multiplication Rule

For any two events E and F,

$P(E \cap F) = P(E \mid F)P(F)$

For independent events, this simplifies to

$P(E \cap F) = P(E)P(F)$

Conditional Probabilities

For any two events E and F with P(F) ≠ 0,

$P(E \mid F) = \dfrac{P(E \cap F)}{P(F)}$

Revisit CDC’s study . . .

Recall:

L = event that a randomly selected adult American reports learning something new about a health issue or disease from a TV show in the previous 6 months.
F = event that a randomly selected adult American is female.

Data from the survey were used to estimate the following probabilities:

$P(L) = 0.58 \qquad P(F) = 0.5 \qquad P(L \cap F) = 0.31$

What is the probability that a randomly selected adult American reports learning something new about a health issue or disease from a TV show in the previous 6 months or is female?

$P(L \cup F) = P(L) + P(F) - P(L \cap F) = 0.58 + 0.5 - 0.31 = 0.77$

The article “Chances Are You Know Someone with a Tattoo, and He’s Not a Sailor” (Associated Press, June 11, 2006) summarized data from a representative sample of adults ages 18 to 50.
T = the event that a randomly selected person has a tattoo
A = the event that a randomly selected person is between 18 and 29 years old

The following probabilities were estimated based on data from the sample:

$P(T) = 0.24 \qquad P(A) = 0.50 \qquad P(T \cap A) = 0.18$

$P(A \mid T) = \dfrac{P(T \cap A)}{P(T)} = \dfrac{0.18}{0.24} = 0.75$

$P(T \mid A) = \dfrac{P(T \cap A)}{P(A)} = \dfrac{0.18}{0.50} = 0.36$

Notice that the probabilities of “A given T” and “T given A” are NOT the same!

Another Approach to Probability

A large electronics store sells two different portable DVD players, Brand 1 and Brand 2. Based on past records, the store manager reports that 70% of the DVD players sold are Brand 1 and 30% are Brand 2. The manager also reports that 20% of the people who buy Brand 1 also purchase an extended warranty, and 40% of the people who buy Brand 2 purchase an extended warranty.

Consider selecting a person at random from those who purchased a DVD player from this store. What is the probability that the person purchased an extended warranty?

One way to do this problem would be to set up a Hypothetical 1000 table.

DVD Players Continued

P(Brand 1) = 0.7
P(Brand 2) = 0.3

| | Brand 1 | Brand 2 | Total |
| --- | --- | --- | --- |
| Bought Extended Warranty | 140 | 120 | 260 |
| Not Bought Extended Warranty | 560 | 180 | 740 |
| Total | 700 | 300 | 1000 |

$P(\text{Extended Warranty}) = \dfrac{260}{1000} = 0.26$
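The hypothetical 1000 table for the DVD player example can also be built programmatically. The following is a minimal sketch, assuming the store manager's percentages are taken at face value; the function name `hypothetical_table` and the dictionary layout are illustrative choices, not part of the original example.

```python
def hypothetical_table(group_probs, event_given_group, n=1000):
    """Build a 'hypothetical n' table.

    group_probs:        P(group) for each group (should sum to 1)
    event_given_group:  P(event | group) for each group
    Returns counts of people with and without the event in each group.
    """
    table = {}
    for group, p_group in group_probs.items():
        group_total = round(n * p_group)                       # e.g. (1000)(0.7) = 700
        with_event = round(group_total * event_given_group[group])
        table[group] = {
            "event": with_event,
            "no event": group_total - with_event,
            "total": group_total,
        }
    return table

table = hypothetical_table(
    group_probs={"Brand 1": 0.7, "Brand 2": 0.3},
    event_given_group={"Brand 1": 0.2, "Brand 2": 0.4},  # P(warranty | brand)
)

bought_warranty = sum(row["event"] for row in table.values())
p_warranty = bought_warranty / 1000

print(table)
print(p_warranty)
```

Rounding each cell mirrors the rule that cell values must be whole numbers; with other inputs the rounded cells may give answers that differ slightly from the exact probabilities.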
Another approach to this problem is to use a tree diagram. Here B1 and B2 denote purchasing Brand 1 and Brand 2, and E denotes purchasing an extended warranty.

B1 = 0.7
    E = 0.2, so B1 and E: (0.7)(0.2) = 0.14
    EC = 0.8
B2 = 0.3
    E = 0.4, so B2 and E: (0.3)(0.4) = 0.12
    EC = 0.6

The extended warranty is purchased with Brand 1 or with Brand 2, so

P(E) = 0.14 + 0.12 = 0.26

This is an example of the Law of Total Probability!

The Law of Total Probability

If B1 and B2 are disjoint events with P(B1) + P(B2) = 1, then for any event E

$P(E) = P(E \cap B_1) + P(E \cap B_2) = P(E \mid B_1)P(B_1) + P(E \mid B_2)P(B_2)$

More generally, if B1, B2, ⋯, Bk are disjoint events with P(B1) + P(B2) + ⋯ + P(Bk) = 1, then for any event E

$P(E) = P(E \cap B_1) + P(E \cap B_2) + \cdots + P(E \cap B_k) = P(E \mid B_1)P(B_1) + P(E \mid B_2)P(B_2) + \cdots + P(E \mid B_k)P(B_k)$

Let’s consider another type of problem . . .

Suppose the conditional probability of “a positive test result given that the person has cancer” is known. However, you would like to know the converse probability. That is, you would like to know the probability of the person having cancer given a positive test result.

A converse probability is the reversal of a conditional probability. This converse probability can be computed using Bayes’ Rule.

This formula was discovered in the 1700’s by the Reverend Thomas Bayes, an English Presbyterian minister.

Bayes’ Rule

If B1 and B2 are disjoint events with P(B1) + P(B2) = 1, then for any event E

$P(B_1 \mid E) = \dfrac{P(E \mid B_1)P(B_1)}{P(E \mid B_1)P(B_1) + P(E \mid B_2)P(B_2)}$

More generally, if B1, B2, ⋯, Bk are disjoint events with P(B1) + P(B2) + ⋯ + P(Bk) = 1, then for any event E

$P(B_i \mid E) = \dfrac{P(E \mid B_i)P(B_i)}{P(E \mid B_1)P(B_1) + P(E \mid B_2)P(B_2) + \cdots + P(E \mid B_k)P(B_k)}$

Let’s look at an example.

Internet addiction has been defined by researchers as a disorder characterized by excessive time spent on the Internet, impaired judgment and decision-making ability, social withdrawal, and depression. In a study of adolescents, each participant was assessed using the Chen Internet Addiction Scale to determine if he or she suffered from Internet addiction.
The following probabilities are based on survey results, where F and M denote that the adolescent is female or male and I denotes that the adolescent has Internet addiction:

P(F) = 0.518
P(M) = 0.482
P(I|F) = 0.131
P(I|M) = 0.248

What is the probability that a randomly selected adolescent from the survey is female given that the adolescent has Internet addiction?

$P(F \mid I) = \dfrac{P(I \mid F)P(F)}{P(I \mid F)P(F) + P(I \mid M)P(M)} = \dfrac{(0.131)(0.518)}{(0.131)(0.518) + (0.248)(0.482)} = 0.362$

Although Bayes’ Rule is not listed in the AP® Statistics course description, you are expected to be able to solve “Bayes’-like” problems. Besides using the formula, you can also solve using tables or tree diagrams.

Probability as a Basis for Making Decisions

Probability plays an important role in drawing conclusions from data.

A professor planning to give a quiz that consists of 20 true-false questions is interested in knowing how someone who answers by guessing would do on such a quiz. To investigate, he asks the 500 students in his introductory psychology course to write the numbers from 1 to 20 on a piece of paper and then to arbitrarily write T or F next to each number.

The students are forced to guess at the answer to each question, because they are not even told what the questions are! These answers are then collected and graded using the key for the quiz.

This table summarizes the number of correct answers on the quiz.

| Number of Correct Responses | Number of Students | Proportion of Students |
| --- | --- | --- |
| 0 | 0 | 0.000 |
| 1 | 0 | 0.000 |
| 2 | 1 | 0.002 |
| 3 | 1 | 0.002 |
| 4 | 2 | 0.004 |
| 5 | 8 | 0.016 |
| 6 | 18 | 0.036 |
| 7 | 37 | 0.074 |
| 8 | 58 | 0.116 |
| 9 | 81 | 0.162 |
| 10 | 88 | 0.176 |
| 11 | 79 | 0.158 |
| 12 | 61 | 0.122 |
| 13 | 39 | 0.078 |
| 14 | 18 | 0.036 |
| 15 | 7 | 0.014 |
| 16 | 1 | 0.002 |
| 17 | 1 | 0.002 |
| 18 | 0 | 0.000 |
| 19 | 0 | 0.000 |
| 20 | 0 | 0.000 |

Would you be surprised if someone guessing on a 20-question true-false quiz got only 3 correct?

Only about 2 in 1000 guessers would get exactly 3 correct. Since this is so unlikely, this outcome is surprising!
Quiz example continued.

If a score of 15 or more correct is a passing grade on the quiz, is it likely that someone who is guessing will pass?

Using the proportions in the table of quiz results:

P(passing quiz) ≈ 0.014 + 0.002 + 0.002 + 0 + 0 + 0 = 0.018

It would be unlikely that a student who is guessing would be able to pass.

Quiz example continued.

The professor actually gives the quiz, and a student scores 16 correct. Do you think that the student was just guessing?

Begin by assuming that the student was just guessing and determine whether a score at least as high as 16 is a likely or an unlikely occurrence.

P(scores 16 or higher) ≈ 0.002 + 0.002 + 0 + 0 + 0 = 0.004

There are two possible explanations for a score of 16:

1) The student was guessing and was REALLY lucky
2) The student was not guessing

Since the first explanation is highly unlikely, you could conclude that a student with a score of 16 was not just guessing.

Quiz example continued.

What score on the quiz would it take to convince you that a student was not just guessing?
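The "this score or better" probabilities used in the quiz example can be computed directly from the student counts in the table of quiz results. This is a small sketch; the list `counts` simply transcribes the Number of Students column for scores 0 through 20, and the function name `prob_at_least` is an illustrative choice.

```python
# Number of students (out of 500) with each score from 0 to 20 correct
counts = [0, 0, 1, 1, 2, 8, 18, 37, 58, 81, 88,
          79, 61, 39, 18, 7, 1, 1, 0, 0, 0]
n_students = sum(counts)  # 500

def prob_at_least(score):
    """Approximate probability that a guesser gets `score` or more correct,
    estimated as the proportion of students who did."""
    return sum(counts[score:]) / n_students

# Tail probabilities for scores 20 down to 13
for score in range(20, 12, -1):
    print(f"{score} or better: {prob_at_least(score):.3f}")
```

Summing counts before dividing avoids accumulating rounding error from adding the already-rounded proportions.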
Consider this table showing approximate probabilities for a certain score or higher.

Score           Approximate Probability
20              0.000
19 or better    0.000 + 0.000 = 0.000
18 or better    0.000 + 0.000 + 0.000 = 0.000
17 or better    0.002 + 0.000 + 0.000 + 0.000 = 0.002
16 or better    0.002 + 0.002 + 0.000 + 0.000 + 0.000 = 0.004
15 or better    0.014 + 0.002 + 0.002 + 0.000 + 0.000 + 0.000 = 0.018
14 or better    0.036 + 0.014 + 0.002 + 0.002 + 0.000 + 0.000 + 0.000 = 0.054
13 or better    0.078 + 0.036 + 0.014 + 0.002 + 0.002 + 0.000 + 0.000 + 0.000 = 0.132

You might say that a score of 14 or higher is reasonable evidence that someone is not just guessing, because the approximate probability that a guesser would score this high is only 0.054.

Estimating Probabilities Empirically and Using Simulation

Estimating Probabilities Empirically

It is fairly common practice to use observed long-run proportions to estimate probabilities. The process used to estimate probabilities is simple:
1. Observe a large number of chance outcomes under controlled circumstances.
2. Interpreting probability as a long-run relative frequency, estimate the probability of an event by using the observed proportion of occurrence.

To recruit a new faculty member, a university biology department intends to advertise for someone with a Ph.D. in biology and at least 10 years of college-level teaching experience. A member of the department expresses the belief that requiring at least 10 years of teaching experience will exclude most potential applicants and will exclude more female applicants than male applicants.

The biology department would like to determine the probability that an applicant would be excluded because of the experience requirement.

A similar university just completed a search in which there was no requirement for prior teaching experience. However, prior teaching experience was recorded. The resulting data are summarized in the following table.

Number of Applicants
          Less than 10 years experience    10 or more years experience    Total
Male      178                              112                            290
Female     99                               21                            120
Total     277                              133                            410

$$P(\text{applicant is excluded}) = \frac{277}{410} \approx 0.675$$

The probability that an applicant would be excluded due to the requirement of at least 10 years experience is 67.5%. This is just a little more than two-thirds of the applicants.

New faculty member example continued.

Now let's determine if more females than males are excluded due to the experience requirement.

About 61.4% of the male applicants (178 of 290) are excluded due to the experience requirement.

About 82.5% of the female applicants (99 of 120) are excluded due to the experience requirement.

It appears that female applicants are more likely to be excluded due to the experience requirement than male applicants.

Estimating Probabilities by Using Simulation

Simulation provides a way to estimate probabilities when:
• You are unable to determine probabilities analytically
• You do not have the time or resources to determine probabilities
• It is impractical to estimate probabilities empirically by observation

Simulation involves generating "observations" in a situation that is similar to the real situation of interest.

Using Simulation to Approximate a Probability

1. Design a method that uses a random mechanism (such as a random number generator or table, the selection of a ball from a box, or the toss of a coin) to represent an observation. Be sure that the important characteristics of the actual process are preserved.
2. Generate an observation using the method in Step 1, and determine if the event of interest has occurred.
3. Repeat Step 2 a large number of times.
4. Calculate the estimated probability by dividing the number of observations for which the event of interest occurred by the total number of observations generated.
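The four steps above can be sketched in code. As an illustration only, the sketch below revisits the 20-question true-false quiz and estimates the probability that a guesser gets 15 or more correct. It assumes each guess is correct with probability 0.5, independently of the others; that assumption, the function names, the seed, and the choice of 20,000 trials are not from the slides.

```python
import random


def simulate_quiz_score(rng, num_questions=20):
    """Steps 1-2: one observation is the number of correct guesses on one quiz.

    Assumption for this sketch: each guess is correct with probability 0.5,
    independently of the other guesses.
    """
    return sum(rng.random() < 0.5 for _ in range(num_questions))


def estimate_probability(event, num_trials=20000, seed=1):
    """Steps 3-4: repeat many times, then divide occurrences by total trials."""
    rng = random.Random(seed)
    occurrences = sum(event(simulate_quiz_score(rng)) for _ in range(num_trials))
    return occurrences / num_trials


estimate = estimate_probability(lambda score: score >= 15)
print(f"Estimated P(15 or more correct) = {estimate:.3f}")
```

With this many trials the estimate should fall in the same neighborhood as the 0.018 observed among the professor's 500 students, although the two values are not expected to match exactly.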
Suppose that couples who wanted children were to continue having children until a boy was born. Would this change the proportion of boys in the population?

We will use simulation to estimate the proportion of boys in the population if couples were to continue having children until a boy was born.

1. You can use a single random digit to represent a child, where odd digits represent a male birth and even digits represent a female birth.
2. An observation is constructed by selecting a sequence of random digits. If the first random number obtained is odd (a boy), the observation is complete. If the first random number obtained is even (a girl), another digit is chosen. You would continue in this way until an odd digit is obtained.

Baby Boy Simulation Continued . . .

Below are four rows from the random digit table.

Row    Digits
6      0 9 3 8 7 6 7 9 9 5 6 2 5 6 5 8 4 2 6 4
7      4 1 0 1 0 2 2 0 4 7 5 1 1 9 4 7 9 7 5 1
8      6 4 7 3 6 3 4 5 1 2 3 1 1 8 0 0 4 8 2 0
9      8 0 2 8 7 9 3 8 4 0 4 2 0 8 9 1 2 3 3 2

Trial 1: girl, boy
Trial 2: boy
Trial 3: girl, boy
Trial 4: girl, boy
Trial 5: boy
Trial 6: boy
Trial 7: boy
Trial 8: girl, girl, boy
Trial 9: girl, boy
Trial 10: girl, girl, girl, girl, girl, girl, boy

Notice that even with only 10 trials, the proportion of boys is 10/22, which is close to 0.5!
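The digit scheme above translates directly into code. The sketch below first replays the digits from rows 6 and 7 of the random digit table to reproduce the 10 trials, then repeats the process with a pseudorandom generator for many more families. The function names, the seed, and the choice of 20,000 families are hypothetical; like the digit scheme, the sketch treats boys and girls as equally likely.

```python
import random

# Digits transcribed from rows 6 and 7 of the random digit table.
TABLE_DIGITS = [int(d) for d in "0938767995625658426441010220475119479751"]


def replay_digits(digits, num_trials):
    """Read digits in order (odd = boy, even = girl) until num_trials boys are born.

    Returns (boys, children). Each trial ends at its first boy, so the number
    of boys equals the number of completed trials.
    """
    boys = children = 0
    for digit in digits:
        if boys == num_trials:
            break
        children += 1
        if digit % 2 == 1:
            boys += 1
    return boys, children


def simulate_proportion_of_boys(num_families, seed=0):
    """Each family has children until the first boy; return boys / children."""
    rng = random.Random(seed)
    children = 0
    for _ in range(num_families):
        children += 1                      # this child is counted, boy or girl
        while rng.randrange(10) % 2 == 0:  # even digit: a girl, so keep going
            children += 1
    return num_families / children         # exactly one boy per family


boys, children = replay_digits(TABLE_DIGITS, num_trials=10)
print(f"Table digits: {boys} boys out of {children} children")
print(f"20,000 families: proportion of boys = {simulate_proportion_of_boys(20000):.3f}")
```

Replaying the table digits is a check on the hand-worked trials, while the larger run shows what happens to the proportion as the number of trials grows.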