Lecture 11 – End of Chapter 17 Probability Models & Chapter 18 Sampling Distributions

We did a simulation by flipping coins. SORRY ABOUT THE RUSH AT THE END!

The Binomial Model
* Number of trials is n : n = 3 or n = 6 (depending on directions)
* Each trial is independent : on each flip the probabilities ‘re-set’
* Only 2 outcomes : Head or Tail
* Probability of a success on each trial is p (a probability) : p = ½ = .5
* We are interested in X = number of successes : COUNT THE NUMBER OF HEADS

Discovering the Characteristics of 1) a Binomial and of 2) a Geometric Model

Q. In the real world a coin flip could imitate what event?
Answer:

Experiment: Toss your coin 3 times and record each toss as either Head (H) or Tail (T). Do your experiment 8 times. For each experiment, answer:
1. How many heads?
2. On which toss did you get the first head?

#1 ___ ___ ___    #2 ___ ___ ___    #3 ___ ___ ___    #4 ___ ___ ___
#5 ___ ___ ___    #6 ___ ___ ___    #7 ___ ___ ___    #8 ___ ___ ___

Describe the outcomes

Total number of Heads    Frequency (how many experiments    Prob
in 3 tosses              showed that number of Heads?)      (Freq/Total)
0                        ___                                ___
1                        ___                                ___
2                        ___                                ___
3                        ___                                ___
Total:                   ___                                1.0

Which classmates' counts or probabilities seem correct?

The probability of exactly k successes in n trials is
nCk * p^k * (1-p)^(n-k)
where nCk = n!/(k!*(n-k)!) and, for example, 4! = 4*3*2*1.

Theoretical

Total number of Heads    Frequency (how many experiments    Prob
in 3 tosses              showed that number of Heads?)      (Freq/Total)
0                        ___                                .125
1                        ___                                .375
2                        ___                                .375
3                        ___                                .125
Total:                   ___                                1.000

One Simulated In-class Example

Total number of Heads    Frequency (how many experiments    Prob
in 3 tosses              showed that number of Heads?)      (Freq/Total)
0                        ___                                ___
1                        ___                                ___
2                        ___                                ___
3                        ___                                ___
Total:                   ___                                1.0

Expected Value (from simulated): ____
Variance (from simulated): ____

The mean of X (average number of successes in n trials) is calculated as n*p = ____
And the variance is n*p*(1-p) = ____
How close?

WHAT DO YOU NEED TO KNOW?
1. Be able to identify a situation as Binomial
2. Pick out what n and p are
3. Calculate the chance of k (given) successes in n (given) trials when you have p (given or easily calculated, like a coin flip)
4. Find the mean n*p and variance n*p*(1-p)

Example: I am going to flip a coin 6 times (the other in-class example)
1. Is it Binomial?
2. What is n? What is p?
3. What is the prob of exactly 1 Head in 6 tosses? (could ask 0 and 2 and 3…)
4. Mean = ? Variance = ?
Which of the classmates' simulations seem to be correct?

The Geometric Model
* Conduct trials : we flipped a coin
* Each trial is independent : yes
* Only 2 outcomes : Head or Tail
* Probability of a success on each trial is p (a probability) : p = ½ = .5
* Interested in the first trial when a success occurs : like, when did we get the first head?

WHAT DO YOU NEED TO KNOW?
1. Be able to identify a situation as Geometric
2. Pick out what p is
3. Calculate the chance that the first success happens on the k (given) trial. The probability that the first success happens on trial k is (1-p)^(k-1) * p; for the 3rd trial (k=3) that is (1-p)^2 * p
4. Find the mean 1/p and variance (1-p)/(p*p)

Discovering the Characteristics of 2) a Geometric Model

Q. In the real world a wait for a head on a coin flip could imitate what event?
Answer:

Experiment: Toss your coin 3 times and record each toss as either Head (H) or Tail (T). Do your experiment 8 times. For each experiment, answer:
2. On which toss did you get the first head?

#1 ___ ___ ___    #2 ___ ___ ___    #3 ___ ___ ___    #4 ___ ___ ___
#5 ___ ___ ___    #6 ___ ___ ___    #7 ___ ___ ___    #8 ___ ___ ___

Describe when the first Head appears

Toss on which the         Frequency (how many experiments     Prob
first Head appears        showed the first Head then?)        (Freq/Total)
1                         ___                                 ___
2                         ___                                 ___
3                         ___                                 ___
4+ (no Head in 3 tosses)  ___                                 ___
Total:                    ___                                 1.0

Which classmates' counts or probabilities seem correct?

Problem: A claim is that 65% of AASU students have an ssn beginning with the number 2. I might guess then that 65% of the students in here do. I am going to “randomly” ask 5 of you. What is the probability that exactly 2 of the 5 have 2 as the 1st digit of their ssn? How many do you expect to have this (mean)? And what is the variance associated with the number of 2ers?

ANS. Could this be Binomial?
(If drawing n items from N, make sure n/N < .10. That is, sampling without replacement will not create massive dependence if you select less than 10% to inspect; each inspection = trial.)
Any problem with independence? n/N = ____
Exactly 2 of 5?
Mean?
Variance?
Standard Deviation?

Q. If I selected a different 5 students would I get the same answer? And then a different 5?

Use (# with 2)/5 as the sample's proportion (call it p-hat, p̂; note that it estimates the probability of finding such a person) of those whose ssn begins with 2. This will vary depending on which 5 students I randomly select.

Note that p̂ is a mean = average, since it is the sum of raw data (0 or 1) divided by the total number of data points. Note also that we theoretically know everything about the population (I said 65% had a 2 at the start of their ssn). So the CENTRAL LIMIT THEOREM applies, as long as these criteria check out:
1. Taking a large sample (n)? “The larger the better.” n = 30 usually works.
2. Make sure each sample = item = trial = person is independent from the others.

Then my mean looks like many other possible means from different samples consisting of ‘n’ number of items. In fact it comes from a (nearly) normal, bell-shaped distribution that centers about the entire population mean, with a spread away (standard deviation = standard error) equal to the population standard deviation divided by the square root of the sample size n.
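The CLT claim above can be checked with a quick simulation. This is only a sketch under stated assumptions: the population is 0/1 data with p = .65 (the ssn claim), each sample has n = 30 items, and the function name, the number of repeated samples (5000) and the seed are illustrative choices, not part of the lecture. For 0/1 data the population standard deviation is √(p*(1-p)).

```python
import random
import statistics

def simulate_sample_means(p=0.65, n=30, num_samples=5000, seed=11):
    """Draw num_samples independent samples of n 0/1 values; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so reruns match
    means = []
    for _ in range(num_samples):
        # Each item is 1 (a "success") with probability p, otherwise 0.
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        means.append(sum(sample) / n)
    return means

p, n = 0.65, 30
means = simulate_sample_means(p, n)

center = statistics.mean(means)                  # should sit near the population mean p
spread = statistics.pstdev(means)                # observed spread of the sample means
theory_spread = (p * (1 - p)) ** 0.5 / n ** 0.5  # population sd / sqrt(n)

print(f"center of the sample means: {center:.3f} (population mean {p})")
print(f"spread of the sample means: {spread:.3f} (theory {theory_spread:.3f})")
```

A histogram of `means` should look roughly bell-shaped. Rerunning with a much smaller n (say n = 5) is one way to see why the sample-size criterion matters.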
Notation: x̄ is N(μ, σ/√n), where μ is the population mean and σ is the population standard deviation.

What about p̂? Normal, centered at p (population percentage = proportion), with a spread of √(p*(1-p)/n), so long as n*p and n*(1-p) are both > 10.

Finally, let's use the CLT on our ssn problem: It is modeled by a Binomial (because we noted that the 5 we picked was less than 10% of the AASU population).
p = .65 (65%)
n = 5; note 5*.65 = 3.25 and 5*.35 = 1.75, oops, not greater than 10… anyway,
so our sample proportion p̂ = _________ ; is it near .65?
It is spread away by √(p*(1-p)/n) = √(.65*.35/5) ≈ .21 (stdev).

Think normal distribution.

We call these sampling distributions = distributions that would result from repeatedly calculating a mean from this sample, then a different one, and another…
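As a recap, the numbers for the ssn problem can be worked out directly from the lecture's formulas. The sketch below assumes the claimed p = .65 and n = 5; the helper name `binomial_pmf` and the variable names are illustrative, not from the lecture.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials: nCk * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.65  # 5 students asked, claimed 65% with an ssn starting with 2

prob_exactly_2 = binomial_pmf(2, n, p)  # exactly 2 of the 5
mean_count = n * p                      # expected number of "2ers"
var_count = n * p * (1 - p)             # variance of that count
sd_count = sqrt(var_count)              # standard deviation of that count
se_phat = sqrt(p * (1 - p) / n)         # spread (standard error) of p-hat

# The normal approximation for p-hat wants both n*p and n*(1-p) above 10.
condition_ok = n * p > 10 and n * (1 - p) > 10

print(f"P(exactly 2 of 5) = {prob_exactly_2:.4f}")
print(f"mean = {mean_count:.2f}, variance = {var_count:.4f}, sd = {sd_count:.4f}")
print(f"standard error of p-hat = {se_phat:.4f}")
print(f"n*p and n*(1-p) both > 10? {condition_ok}")
```

With these inputs the chance of exactly 2 is about .181, the mean is 3.25, the variance is 1.1375, and the condition check fails, matching the “oops” above. The same helper reproduces the theoretical 3-toss table: `binomial_pmf(1, 3, 0.5)` gives .375.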