AP Statistics - Introduction to Probability

Random Babies

Suppose that on one night at a certain hospital, four mothers (named Johnson, Miller, Smith, and Williams) give birth to baby boys. Each mother gives her child a first name alliterative to his last: Jerry Johnson, Marvin Miller, Sam Smith, and Willy Williams. As a very sick joke, the hospital staff decides to return the babies to their mothers completely at random. We want to investigate questions such as: How often will at least one mother get the right baby? How often will every mother get the right baby? What is the most likely outcome? On average, how many mothers will get the right baby?

Part I: Since it is clearly not feasible to actually carry out this exercise over and over to investigate what would happen in the long run, we will use simulation instead. Simulation is an artificial representation of a random process used to study its long-term properties. We will represent the process of distributing babies to mothers at random by shuffling and dealing cards (representing the babies) to regions on a sheet of paper (representing the mothers).

a) Take a sheet of paper and tear it into four equal pieces; get out another sheet of paper and leave it whole. Write a baby’s first name on each small piece of paper, and divide the full sheet of paper into four areas with a mother’s last name written on each area. Shuffle the small pieces of paper well, and then deal them out face down at random, with one going to each area of the sheet. Finally, turn over the pieces to reveal which babies were randomly assigned to which mothers. Record the requested information below.

Did Mrs. Johnson get the right baby? __________________
Did all the mothers get the wrong baby? _________________
How many of the mothers got the right baby? ______________

b) Repeat the random “dealing” of babies a total of five times, recording in each case the information requested in the chart below:

Repetition Number      1       2       3       4       5
Johnson Match?       _____   _____   _____   _____   _____
All Wrong?           _____   _____   _____   _____   _____
Number of Matches?   _____   _____   _____   _____   _____

c) Combine your yes/no results sequentially with those of your classmates until you obtain a total of 100 repetitions. For each pair of classmates, record in the table how often in their five repetitions Johnson received the right baby and how often all mothers received the wrong baby. Then calculate the cumulative totals and cumulative proportions. You will transfer your information onto the overhead slide for the class and then copy the data onto the same chart on the back of this sheet for yourself.

                    ------- Johnson Matches? -------    ---------- All Wrong? ----------
Pair  Cumulative    Of these 5  Cum.     Cum.           Of these 5  Cum.     Cum.
      Repetitions               Total    Proportion                 Total    Proportion
  1        5        ______      ______   ______         ______      ______   ______
  2       10        ______      ______   ______         ______      ______   ______
  3       15        ______      ______   ______         ______      ______   ______
  4       20        ______      ______   ______         ______      ______   ______
  5       25        ______      ______   ______         ______      ______   ______
  6       30        ______      ______   ______         ______      ______   ______
  7       35        ______      ______   ______         ______      ______   ______
  8       40        ______      ______   ______         ______      ______   ______
  9       45        ______      ______   ______         ______      ______   ______
 10       50        ______      ______   ______         ______      ______   ______
 11       55        ______      ______   ______         ______      ______   ______
 12       60        ______      ______   ______         ______      ______   ______
 13       65        ______      ______   ______         ______      ______   ______
 14       70        ______      ______   ______         ______      ______   ______
 15       75        ______      ______   ______         ______      ______   ______
 16       80        ______      ______   ______         ______      ______   ______
 17       85        ______      ______   ______         ______      ______   ______
 18       90        ______      ______   ______         ______      ______   ______
 19       95        ______      ______   ______         ______      ______   ______
 20      100        ______      ______   ______         ______      ______   ______

d) Plot the cumulative relative frequencies as a function of the number of repetitions for both of these variables below:

[Blank plot: vertical axis is cumulative proportion, from 0.0 to 1.0; horizontal axis is number of repetitions, from 5 to 100 in steps of 5.]

e) Do the relative frequencies appear to be “settling down” and approaching one particular value? What might you guess that one value is?

The probability of a random event is the long-run proportion (or relative frequency) of times the event would occur if the random process were repeated over and over under identical conditions. One can approximate a probability by simulating the process a large number of times. Simulation leads to an empirical estimate of the probability.

f) Now combine your results on the number of matches with the rest of the class, obtaining a tally of how often each outcome occurred. Record the counts and proportions in the table below.

Number of Matches     0       1       2       3       4     Total
Count               _____   _____   _____   _____   _____   _____
Proportion          _____   _____   _____   _____   _____   1.00

g) In what proportion of these simulated cases did at least one mother get the correct baby?

h) Based on the simulation results of the class, what is your empirical estimate of the probability of no matches?

i) Based on the simulation results of the class, what is your empirical estimate of the probability of at least one match?
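j) Explain why an outcome of exactly three matches is impossible.

k) Is it impossible to get four matches? Would you call it rare? Unlikely? Explain.

l) Would you consider a result of 0 matches, or of 1 match, or of 2 matches, to be unlikely? Explain.

                    ------- Johnson Matches? -------    ---------- All Wrong? ----------
Pair  Cumulative    Of these 5  Cum.     Cum.           Of these 5  Cum.     Cum.
      Repetitions               Total    Proportion                 Total    Proportion
  1        5        ______      ______   ______         ______      ______   ______
  2       10        ______      ______   ______         ______      ______   ______
  3       15        ______      ______   ______         ______      ______   ______
  4       20        ______      ______   ______         ______      ______   ______
  5       25        ______      ______   ______         ______      ______   ______
  6       30        ______      ______   ______         ______      ______   ______
  7       35        ______      ______   ______         ______      ______   ______
  8       40        ______      ______   ______         ______      ______   ______
  9       45        ______      ______   ______         ______      ______   ______
 10       50        ______      ______   ______         ______      ______   ______
 11       55        ______      ______   ______         ______      ______   ______
 12       60        ______      ______   ______         ______      ______   ______
 13       65        ______      ______   ______         ______      ______   ______
 14       70        ______      ______   ______         ______      ______   ______
 15       75        ______      ______   ______         ______      ______   ______
 16       80        ______      ______   ______         ______      ______   ______
 17       85        ______      ______   ______         ______      ______   ______
 18       90        ______      ______   ______         ______      ______   ______
 19       95        ______      ______   ______         ______      ______   ______
 20      100        ______      ______   ______         ______      ______   ______

Part II: In situations where the outcomes of a random process are equally likely, exact probabilities can be calculated by listing all of the possible outcomes and counting the proportion that corresponds to the event of interest. The listing of all possible outcomes is called the sample space.

The sample space for the “random babies” activity consists of all possible ways to distribute the four babies to the four mothers. Let 1234 mean that the first baby went to the first mother, the second baby went to the second mother, the third baby went to the third mother, and the fourth baby went to the fourth mother. In this scenario all four mothers get the correct baby. As another example, 1243 would mean that the first two mothers got the right baby, but the third and fourth mothers had their babies switched. All of the possibilities are listed below:

1234   2134   3124   4123
1243   2143   3142   4132
1324   2314   3214   4213
1342   2341   3241   4231
1423   2413   3412   4312
1432   2431   3421   4321

a) How many different arrangements are there for returning the four babies to their mothers?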
b) For each of these arrangements, indicate how many mothers get the correct baby:

1234: 4 matches    2134: ________    3124: ________    4123: ________
1243: 2 matches    2143: ________    3142: ________    4132: ________
1324: ________     2314: ________    3214: ________    4213: ________
1342: ________     2341: ________    3241: ________    4231: ________
1423: ________     2413: ________    3412: ________    4312: ________
1432: ________     2431: ________    3421: ________    4321: ________

c) In how many arrangements is the number of “matches” equal to exactly:

4:_____________ 3:_____________ 2:_____________ 1:_____________ 0:___________

d) Calculate the (exact) probabilities by dividing your answers to (c) by your answer to (a).

4:_____________ 3:_____________ 2:_____________ 1:_____________ 0:___________

Comment below on how closely the exact (theoretical) probabilities correspond to the empirical probabilities from the simulation recorded in part (f) from Part I of this activity.

An empirical estimate from a simulation generally gets closer to the actual probability as the number of repetitions increases. Look at the histograms below:

[Figure: histograms from three simulations]

e) Generally speaking, which of these three simulations produces empirical estimates closest to the actual probabilities? Explain your answer fully.

f) For your class simulation results summarized in (f) from Part I, calculate the average (mean) number of matches per repetition of the process by multiplying each outcome by the number of occurrences, summing the products, and then dividing by the total number of repetitions.

The long-run average value achieved by a numerical random process is called its expected value. To calculate this expected value from the (exact) probability distribution, multiply each outcome by its probability, and then add these up over all the possible outcomes.

g) Calculate the expected number of matches from the (exact) probability distribution, and compare that to the average number of matches from the simulated data.
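
Once the paper-and-pencil work is done, the same two approaches can be checked with a short program. The Python sketch below is one possible way to do it, not part of the original activity: it repeats the random dealing many times to get empirical proportions (as in Part I), lists the whole sample space to get exact probabilities (as in Part II), and computes the expected value by multiplying each outcome by its probability and adding. The function names, the numbering of babies and mothers as 0 to 3 instead of 1 to 4, the 10,000 repetitions, and the seed are all illustrative assumptions.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Mother i "owns" baby i. Indices 0..3 are an assumption of this sketch;
# the worksheet numbers them 1..4.
MOTHERS = ["Johnson", "Miller", "Smith", "Williams"]


def count_matches(assignment):
    """Count the mothers who received their own baby.

    assignment[i] is the index of the baby handed to mother i.
    """
    return sum(1 for mother, baby in enumerate(assignment) if mother == baby)


def simulate(repetitions, seed=None):
    """Deal the babies at random `repetitions` times; tally the match counts."""
    rng = random.Random(seed)
    babies = list(range(len(MOTHERS)))
    tally = Counter()
    for _ in range(repetitions):
        rng.shuffle(babies)  # one random dealing of babies to mothers
        tally[count_matches(babies)] += 1
    return tally


def exact_distribution():
    """Exact probabilities found by listing every outcome in the sample space."""
    outcomes = list(permutations(range(len(MOTHERS))))
    tally = Counter(count_matches(outcome) for outcome in outcomes)
    return {
        matches: Fraction(tally[matches], len(outcomes))
        for matches in range(len(MOTHERS) + 1)
    }


def expected_value(distribution):
    """Multiply each outcome by its probability and add up the products."""
    return sum(outcome * prob for outcome, prob in distribution.items())


if __name__ == "__main__":
    repetitions = 10_000  # arbitrary; try 100 to mimic the class total
    tally = simulate(repetitions, seed=1)
    exact = exact_distribution()

    print("matches  empirical  exact")
    for matches in range(len(MOTHERS) + 1):
        empirical = tally[matches] / repetitions
        print(f"{matches:>7}  {empirical:>9.4f}  {float(exact[matches]):.4f}")

    # Mean of the simulated data: outcome times count, summed, over repetitions.
    average = sum(m * n for m, n in tally.items()) / repetitions
    print("average matches (simulation):", average)
    print("expected matches (exact):", float(expected_value(exact)))
```

Running the program with a small number of repetitions and then a large one gives a feel for how the empirical proportions settle toward the exact probabilities as the number of repetitions grows.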