Uncertainty 17: The binomial distribution

Previous clip: Two outcomes, either
1) 2, 3, 10, 11 or 12 on two dice = "success"
2) 4, 5, 6, 7, 8 or 9 on two dice = "failure"

[Images: the dice combinations that count as a success]

Models: p = Pr("success" on any given event) for independent events
1) p1 = Pr(success | M1) = 1/4
2) p2 = Pr(success | M2) = 5/11

Observed after 100 trials: 24 successes. Observed success rate: 0.24.

Likelihood ratio:
$$\frac{\Pr(D \mid M_1)}{\Pr(D \mid M_2)} = \frac{p_1^{24}\,(1-p_1)^{76}}{p_2^{24}\,(1-p_2)^{76}} \approx 19000$$

Bayes formula, for Pr(M1) = Pr(M2) = 1/2:
$$\Pr(M_1 \mid D) = \frac{\Pr(M_1)\Pr(D \mid M_1)}{\Pr(M_1)\Pr(D \mid M_1) + \Pr(M_2)\Pr(D \mid M_2)} \approx 99.995\%$$
$$\Pr(M_2 \mid D) = 1 - \Pr(M_1 \mid D) \approx 0.005\%$$

'k' successes out of 'n' trials.

Pr(k successes out of n trials in a given ordering) = $p^k (1-p)^{n-k}$, where p = Pr(success on single trial).

k out of n in a given ordering is mutually exclusive from k out of n in another ordering.
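The likelihood ratio and the posterior from the previous clip can be checked numerically. The sketch below is only an illustration: the function name `ordering_likelihood` and the variable names are made up for this example, and exact fractions are used so that no rounding enters before the final conversion to a decimal number. It uses the probability of one particular ordering, which is enough here because the number of orderings is the same under both models and cancels in the ratio.

```python
from fractions import Fraction

# Success probabilities under the two models (taken from the slides).
p1 = Fraction(1, 4)    # Pr(success | M1)
p2 = Fraction(5, 11)   # Pr(success | M2)

# Observed data: k successes in n trials.
k, n = 24, 100


def ordering_likelihood(p, k, n):
    """Probability of one particular sequence with k successes in n trials."""
    return p**k * (1 - p)**(n - k)


likelihood_m1 = ordering_likelihood(p1, k, n)
likelihood_m2 = ordering_likelihood(p2, k, n)
likelihood_ratio = likelihood_m1 / likelihood_m2

# Equal prior probabilities for the two models, as on the slide.
prior_m1 = Fraction(1, 2)
prior_m2 = 1 - prior_m1

# Bayes formula for the posterior probability of model 1.
posterior_m1 = (prior_m1 * likelihood_m1) / (
    prior_m1 * likelihood_m1 + prior_m2 * likelihood_m2
)
posterior_m2 = 1 - posterior_m1

print("likelihood ratio:", float(likelihood_ratio))
print("Pr(M1 | D):", float(posterior_m1))
print("Pr(M2 | D):", float(posterior_m2))
```

With equal priors the posterior reduces to LR / (1 + LR), so a likelihood ratio of roughly 19000 leaves only about one part in 19000 for model 2.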
Because the orderings are mutually exclusive, their probabilities add:
Pr(k out of n in a given ordering or k out of n in another given ordering)
= Pr(k out of n in a given ordering) + Pr(k out of n in another given ordering)
= $2\,p^k (1-p)^{n-k}$

Pr(k out of n successes in any ordering) = $p^k (1-p)^{n-k}$ × number of orderings of k successes and n-k failures.

Number of ways to order k successes from n trials

Number of ways to order n items: for n=3, there are 3*2*1 = 6 ways.

Number of ways to order n items:
n items to put in first place, times
n-1 items to put in the second place, times
...
2 items to put in the second last place, times
1 item to put in the last place
= n*(n-1)*...*2*1 = n! (n factorial)

Number of ways to order n items with two identical items: for n=3, there are 3*2*1/2 = 3 ways.

Number of ways to order n items containing k identical items
= (number of ways to order n different items) / (number of ways to order the k identical items if they were different)
= n! / k!

Number of ways to order n items containing k identical items and n-k other identical items:
$$\frac{n!}{k!\,(n-k)!} \overset{\text{def}}{=} \binom{n}{k}$$
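The counting argument can be checked by brute force for small n. The following sketch (the function names are chosen for this illustration only) lists every arrangement of k successes and n-k failures, discards duplicates, and compares the count with n! / (k! (n-k)!).

```python
from itertools import permutations
from math import factorial


def count_orderings(n, k):
    """Count distinct sequences of k successes ('S') and n-k failures ('F')."""
    items = "S" * k + "F" * (n - k)
    # permutations() treats identical letters as different items,
    # so the set removes the arrangements that look the same.
    return len(set(permutations(items)))


def binomial_coefficient(n, k):
    """n! / (k! (n-k)!), the formula derived on the slides."""
    return factorial(n) // (factorial(k) * factorial(n - k))


# Compare brute-force counting with the formula for all small cases.
for n in range(1, 8):
    for k in range(n + 1):
        assert count_orderings(n, k) == binomial_coefficient(n, k)

print(count_orderings(3, 2))       # three items, two of them identical
print(binomial_coefficient(3, 2))
```

Brute force is only practical for small n, since the number of permutations grows as n!. For larger n, Python's standard library offers `math.comb(n, k)`, which returns the same value.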
Binomial distribution:
$$\Pr(k \text{ successes from } n \text{ trials}) = \binom{n}{k}\, p^k (1-p)^{n-k}$$

[Figure: binomial distribution (probability of getting k successes in n trials) for n=100 trials, with the peak value, the observed outcome and the most probable outcomes marked.]

[Figure: outcome distribution for model 1, p=1/4, and outcome distribution for model 2, p=5/11, with the outcomes that are evidence for model 1 and the outcomes that are evidence for model 2 marked.]

Marked in red: probability of getting evidence for model 2 when the data has been produced by model 1 = 1.64%.
Probability of getting evidence for model 1 given that it has produced the data = 98.36%.

Why n=100 trials? Wouldn't something else function just as well?

[Figures: model 1 probabilities and model 2 probabilities, with the most probable outcomes for each model marked.]

k successes out of n trials means an observed rate of k/n.

[Figure, n=100: most probable outcomes between 0.15 and 0.35.]

[Figure, n=10: most probable outcomes for model 1 between 0 and 0.5; most probable outcomes for model 2 between 0.2 and 0.8.]
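The outcome distributions shown in the figures can be tabulated directly from the binomial formula. This is a minimal sketch, assuming n=100 and the two success probabilities from the slides; `binomial_pmf` and the other names are invented for the example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(k successes from n trials) = C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n = 100
p_model_1 = 1 / 4
p_model_2 = 5 / 11

# Full outcome distribution (k = 0..n) under each model.
pmf_m1 = [binomial_pmf(k, n, p_model_1) for k in range(n + 1)]
pmf_m2 = [binomial_pmf(k, n, p_model_2) for k in range(n + 1)]

# The peak value is the outcome k with the highest probability.
peak_m1 = max(range(n + 1), key=lambda k: pmf_m1[k])
peak_m2 = max(range(n + 1), key=lambda k: pmf_m2[k])

print("peak for model 1 at k =", peak_m1)
print("peak for model 2 at k =", peak_m2)
print("total probability, model 1:", sum(pmf_m1))
print("Pr(k=24 | model 1):", pmf_m1[24])
print("Pr(k=24 | model 2):", pmf_m2[24])
```

The peaks sit close to n*p for each model, and the probabilities over all k add up to 1, which is a quick sanity check on the formula.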
Marked in red, for n = 10:
Pr(evidence for model 2 | model 1)
= Pr(k/n=0.4 | p=1/4) + Pr(k/n=0.5 | p=1/4) + Pr(k/n=0.6 | p=1/4) + Pr(k/n=0.7 | p=1/4) + Pr(k/n=0.8 | p=1/4) + Pr(k/n=0.9 | p=1/4) + Pr(k/n=1.0 | p=1/4)
= 22.4%

For n = 1000:
Pr(evidence for model 2 | model 1)
= Pr(k/n=347/1000 | p=1/4) + Pr(k/n=348/1000 | p=1/4) + ... + Pr(k/n=1 | p=1/4)
≈ $3.2 \times 10^{-12}$
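How the probability of misleading evidence shrinks with the number of trials can be sketched as follows. The code rests on an assumption that is not spelled out on the slides: an outcome k is counted as "evidence for model 2" when its likelihood under model 2 is strictly larger than under model 1. The function names are hypothetical, and exact fractions are used so that very small tail probabilities are not lost to rounding.

```python
from fractions import Fraction
from math import comb

P1 = Fraction(1, 4)    # Pr(success | model 1)
P2 = Fraction(5, 11)   # Pr(success | model 2)


def sequence_likelihood(p, k, n):
    """Probability of one particular ordering of k successes in n trials."""
    return p**k * (1 - p)**(n - k)


def prob_evidence_for_m2_given_m1(n):
    """Sum Pr(k | model 1) over all k whose likelihood favours model 2.

    Assumption: 'evidence for model 2' means Pr(D | M2) > Pr(D | M1).
    """
    total = Fraction(0)
    for k in range(n + 1):
        if sequence_likelihood(P2, k, n) > sequence_likelihood(P1, k, n):
            # Binomial probability of k successes when model 1 is true.
            total += comb(n, k) * sequence_likelihood(P1, k, n)
    return total


results = {n: prob_evidence_for_m2_given_m1(n) for n in (10, 100, 1000)}
for n, prob in results.items():
    print(f"n = {n:4d}: Pr(evidence for model 2 | model 1) = {float(prob):.3g}")
```

Under this assumption, n = 10 gives about 22.4%, in line with the sum over k/n = 0.4 to 1.0 above, and the probability drops steeply as n grows.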