Probability Models
Chapter 17

Objectives:
• Binomial Distribution
– Conditions
– Calculate binomial probabilities
– Cumulative distribution function
– Calculate means and standard deviations
– Use normal approximation to the binomial distribution
• Geometric Distribution
– Conditions
– Cumulative distribution function
– Calculate means and standard deviations

Bernoulli Trials
• The basis for the probability models we will examine in this chapter is the Bernoulli trial.
• We have Bernoulli trials if:
– there are two possible outcomes (success and failure).
– the probability of success, p, is constant.
– the trials are independent.
• Examples of Bernoulli Trials
– Flipping a coin
– Looking for defective products
– Shooting free throws

The Binomial Distribution

The Binomial Model
• A Binomial model tells us the probability for a random variable that counts the number of successes in a fixed number of Bernoulli trials.
• Two parameters define the Binomial model: n, the number of trials; and, p, the probability of success. We denote this Binom(n, p).

Independence
• One of the important requirements for Bernoulli trials is that the trials be independent.
• When we don’t have an infinite population, the trials are not independent. But, there is a rule that allows us to pretend we have independent trials:
– The 10% condition: Bernoulli trials must be independent. If that assumption is violated, it is still okay to proceed as long as the sample is smaller than 10% of the population.

Binomial Setting
• Conditions required for a binomial distribution
1. Each observation falls into one of just two categories, “success” or “failure” (only two possible outcomes).
2. There are a fixed number n of observations.
3. The n observations are all independent.
4. The probability of success, called p, is the same for each observation.

Definition:
• Binomial Random Variable
– The random variable X = number of successes produced in a binomial setting.
– Examples:
• The number of doubles in 4 rolls of a pair of dice (on each roll you either get doubles or you don’t).
• The number of patients with type A blood in a random sample of 10 patients (each person either has type A blood or does not).

Definition:
• Binomial Distribution (model)
– A class of discrete random variable distributions, where the count X = the number of successes in the binomial setting with parameters n and p.
– n is the number of observations.
– p is the probability of a success on any one observation.
– Notation: B(n,p), where B = binomial, n = number of observations, p = probability of success.

Example
• Blood type is inherited. If both parents carry genes for the O and A blood types, each child has probability 0.25 of getting two O genes and so of having blood type O. Different children inherit independently of each other. The number of O blood types among 5 children of these parents is the count X of successes in 5 independent observations.
• How would you describe this with “B” notation?
• X is B(5, .25)

Example
• Deal 10 cards from a shuffled deck and count the number X of red cards. There are 10 observations, and each gives either a red or a black card. A “success” is a red card.
• How would you describe this using “B” notation?
• This is not a Binomial distribution: the cards are dealt without replacement, so once you pull one card out, the probabilities change.

Determining if a situation is (yes) or is not (no) binomial
1. Surveying people by asking them what they think of the current president.
2. Surveying 1012 people and recording whether there is a “should not” response to this question: “do you think the cloning of humans should or should not be allowed?”
3. Rolling a fair die 50 times.
4. Rolling a fair die 50 times and finding the number of times that a 5 occurs.
5. Recording the gender of 250 newborn babies.
6. Spinning a roulette wheel 12 times.
7. Spinning a roulette wheel 12 times and finding the number of times that the outcome is an odd number.

Answers:
1. NO
2. YES
3. NO
4. YES
5. YES
6. NO
7. YES

Combinations - nCk
• In n trials, there are $_nC_k = \dfrac{n!}{k!\,(n-k)!}$ ways to have k successes.
– Read nCk as “n choose k.”
• Note: $n! = n \times (n-1) \times \cdots \times 2 \times 1$, and n! is read as “n factorial.”

Binomial Formula
• Binomial Coefficient
– The number of ways of arranging k successes among n observations (combinations).
– The binomial coefficient counts the number of ways in which k successes can be distributed among n observations.
– $\binom{n}{k}$ is read “combinations n choose k”

Binomial Coefficient
• Uses factorial notation: for any positive whole number n, its factorial is $n! = n \times (n-1) \times \cdots \times 2 \times 1$.
– Also, $0! = 1$.
– TI-83/84 can find the combination function under the math menu/prb.
– Find = 10

Binomial Probability Model
• If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2,…,n. If k is any one of these values, $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$.

The Binomial Model (cont.)
Binomial probability model for Bernoulli trials: Binom(n,p)
n = number of trials
p = probability of success
1 – p = probability of failure = q
k = # of successes in n trials
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{np(1-p)} = \sqrt{npq}$

Binomial Probabilities
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} = \binom{n}{k} p^k q^{n-k}$

Binomial Mean & Standard Deviation
$\mu = np$, $\sigma = \sqrt{np(1-p)}$

Example:
A biologist is studying a new hybrid tomato. It is known that the seeds of this hybrid tomato have probability 0.70 of germinating. The biologist plants 10 seeds.
a) What is the probability that exactly 8 seeds will germinate?
b) What is the probability that at least 8 seeds will germinate?

Solution: Binomial Setting?
1. Each trial has only two outcomes, success or failure? Yes
2. Fixed number of trials? Yes
3. Trials are independent? Yes
4. Probability of success is the same for each trial? Yes

Solution:
a) We wish to find P(X = 8), the probability of exactly eight successes.
n = 10, p = .7, (1-p) = .3, k = 8
$P(X = 8) = \binom{10}{8}(.7)^8(.3)^2 \approx .233$

Solution:
b) In this case, we are interested in the probability of 8 or more seeds germinating, P(X ≥ 8).
P(X ≥ 8) = P(X = 8) + P(X = 9) + P(X = 10)
P(X ≥ 8) = .233 + .121 + .028
P(X ≥ 8) = .382 (summing the rounded terms; carrying more decimals gives about .383)

Finding Binomial Probabilities using the TI-83/84
• pdf
– Given a discrete random variable X, the probability distribution function (pdf) assigns a probability to each value of X.
– Used to find binomial probabilities for single values of X (e.g., X = 3).
– Found under 2nd (DISTR)/0:binompdf
– Input binompdf(n,p,X) for one value of X or binompdf(n,p,{X1,X2,…,Xn}) for multiple values of X.

Finding Binomial Probabilities using the TI-83/84
• cdf
– Given a discrete random variable X, the cumulative distribution function (cdf) calculates a cumulative probability from X=0 up to and including a specified value of X.
– Used to find binomial probabilities for an interval of values of X (e.g., 0≤X≤3).
– Found under 2nd (DISTR)/binomcdf (listed directly after binompdf; the exact menu position depends on the calculator model).
– Input binomcdf(n,p,X) for the interval from 0 up to and including X.

Binomial Distributions on the calculator
• Binomial Probabilities
• B(n,p) with k successes
• binompdf(n,p,k)
• Corinne makes 75% of her free throws.
• What is the probability of making exactly 7 of 12 free throws?
• binompdf(12,.75,7) = .1032
$\binom{n}{k} p^k (1-p)^{n-k} = \binom{12}{7}(.75)^7(.25)^5$

Binomial Distributions on the calculator
• Binomial Probabilities
• B(n,p) with k successes
• binomcdf(n,p,k)
• Corinne makes 75% of her free throws.
• What is the probability of making at most 7 of 12 free throws?
• binomcdf(12,.75,7) = .1576
$\binom{12}{7}(.75)^7(.25)^5 + \binom{12}{6}(.75)^6(.25)^6 + \binom{12}{5}(.75)^5(.25)^7 + \binom{12}{4}(.75)^4(.25)^8 + \binom{12}{3}(.75)^3(.25)^9 + \binom{12}{2}(.75)^2(.25)^{10} + \binom{12}{1}(.75)^1(.25)^{11} + \binom{12}{0}(.75)^0(.25)^{12}$

Binomial Distributions on the calculator
• Binomial Probabilities
• B(n,p) with k successes
• binomcdf(n,p,k)
• Corinne makes 75% of her free throws.
• What is the probability of making at least 7 of 12 free throws?
• 1-binomcdf(12,.75,6) = .9456
$\binom{12}{7}(.75)^7(.25)^5 + \binom{12}{8}(.75)^8(.25)^4 + \binom{12}{9}(.75)^9(.25)^3 + \binom{12}{10}(.75)^{10}(.25)^2 + \binom{12}{11}(.75)^{11}(.25)^1 + \binom{12}{12}(.75)^{12}(.25)^0$

Example:
• A quality engineer selects an SRS of 10 switches from a large shipment for detailed inspection.
Unknown to the engineer, 10% of the switches in the shipment fail to meet the specifications.
1. Let X be the number of bad switches in the sample. What is the distribution of X?
• B(10,.1)
2. What is the probability of exactly two bad switches?
• P(X = 2)
• binompdf(10,.1,2)
• .1937

Continued
3. What is the probability that no more than 1 of the 10 switches in the sample fails inspection?
• P(X ≤ 1) = P(X = 0) + P(X = 1)
• binomcdf(10,.1,1)
• P(X ≤ 1) = .7361

Binomial Simulations
• Corinne makes 75% of her free throws.
• Simulate shooting 12 free throws.
• randBin(n,p) will do one simulation
• randBin(n,p,t) will do t simulations

The Normal Model to the Rescue!
• When dealing with a large number of trials in a Binomial situation, making direct calculations of the probabilities becomes tedious (or outright impossible).
• Fortunately, the Normal model comes to the rescue…

The Normal Model to the Rescue!
• As long as the Success/Failure Condition holds, we can use the Normal model to approximate Binomial probabilities.
– Success/failure condition: A Binomial model is approximately Normal if we expect at least 10 successes and 10 failures: np ≥ 10 and nq ≥ 10

Continuous Random Variables
• When we use the Normal model to approximate the Binomial model, we are using a continuous random variable to approximate a discrete random variable.
• So, when we use the Normal model, we no longer calculate the probability that the random variable equals a particular value, but only that it lies between two values.

Normal Approximation of the Binomial Distribution
• Before the technology became available, this was the preferred technique for calculation of binomial probabilities when n was large or when there were a great number of cases or successes to consider.
• This approximation method works best for binomial situations when n is large and when the value of p is not close to either 0 or 1.
• In this approximation, we use the mean and standard deviation of the binomial distribution as the mean and standard deviation needed for calculations using the normal distribution.

Normal Approximation for Binomial Distribution
• Suppose a count X has the binomial distribution with n trials and success probability p.
• When n is large, the distribution of X is approximately normal, $N\left(np, \sqrt{np(1-p)}\right)$.
• When is the approximation valid?
– np ≥ 10 and n(1-p) ≥ 10
– The accuracy of the normal approximation improves as the sample size n increases. It is most accurate for any fixed n when p is close to ½ and least accurate when p is near 0 or 1.

Normal Approximation of Binomial Distribution
• Remember: $\mu = np$, $\sigma = \sqrt{np(1-p)}$

Example 1: Binomial Distribution
np = (3)(.25) = .75
n(1-p) = (3)(.75) = 2.25
(both below 10: condition not met)

Example 2: Binomial Distribution
np = (10)(.25) = 2.5
n(1-p) = (10)(.75) = 7.5
(both below 10: condition not met)

Example 3: Binomial Distribution
np = (25)(.25) = 6.25
n(1-p) = (25)(.75) = 18.75
(np below 10: condition not met)

Example 4: Binomial Distribution
np = (50)(.25) = 12.5
n(1-p) = (50)(.75) = 37.5
(both at least 10: condition met)

Problem:
• A shipment of ice cream cones has the manufacturer’s claim that no more than 15% of the shipment will be defective (broken cones). What is the probability that in a shipment of 1 million cones, Dairy Heaven Corporate Distribution Center will find more than 151,000 broken cones?
• Define the random variable and specify the model.
– Let X = the number of defective cones in the shipment (successes).
– X is binomial with n = 1,000,000 and p = 0.15.
– Most calculators will not handle a problem of this magnitude.
– Use a normal model with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$ as an approximation.
• Check that conditions of the model are met.
– np ≥ 10 and n(1-p) ≥ 10
– 1,000,000(.15) ≥ 10 & 1,000,000(.85) ≥ 10.
• Find the mean & standard deviation
– $\mu = np$ and $\sigma = \sqrt{np(1-p)}$
– μ = 1,000,000(.15) = 150,000
– $\sigma = \sqrt{1{,}000{,}000(.15)(.85)} = 357.07$
• Calculate the probability of 151,000 or more successes
– P(X ≥ 151,000)
– z = (151,000 – 150,000)/357.07 ≈ 2.80, so P(X ≥ 151,000) ≈ P(Z ≥ 2.80) ≈ .0026
• State your conclusion
– In a shipment of 1 million cones, the probability of getting more than 151,000 defective cones is .0026.

Example:
• A recent survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying new clothes, but shopping is often frustrating and time-consuming.” Suppose that in fact 60% of all adults would “agree”. What is the probability that 1520 or more of the sample “agree”?

TI-83 calculator
• B(2500,.6) and P(X ≥ 1520)
• 1-binomcdf(2500,.6,1519)
• .2131390887

Review: Binomial Probability

Geometric Distributions

The Geometric Model
• A single Bernoulli trial is usually not all that interesting.
• A Geometric probability model tells us the probability for a random variable that counts the number of Bernoulli trials until the first success.
• Geometric models are completely specified by one parameter, p, the probability of success, and are denoted Geom(p).

Geometric Distributions
• Definition
– A random variable built on the same Bernoulli trials as the binomial, but with no fixed number of trials: the random variable X is defined as the number of trials needed to obtain the first success.

Geometric Setting (Conditions)
1. Each observation falls into one of just two categories, success (p) or failure (1-p).
2. The probability of a success (p) is the same for each observation.
3. The observations are all independent.
4. The variable of interest is the number of trials required to obtain the first success.
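
The geometric setting listed above can be acted out in a few lines of code. The following is a minimal Python sketch, not part of the original slides: the function name `trials_until_first_success`, the seed, and the choice of p = 1/6 (rolling a fair die until a 6 appears) are all illustrative assumptions.

```python
import random

def trials_until_first_success(p, rng):
    """Repeat independent Bernoulli(p) trials and return the number of the
    trial on which the first success occurs (hypothetical helper)."""
    count = 1
    # rng.random() is uniform on [0, 1); a draw >= p is treated as a failure,
    # so each trial succeeds with the same probability p.
    while rng.random() >= p:
        count += 1
    return count

rng = random.Random(17)   # fixed seed (an assumption) so runs are repeatable
p = 1 / 6                 # illustrative success probability: rolling a 6

draws = [trials_until_first_success(p, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
share_first_try = draws.count(1) / len(draws)

print(sample_mean)       # average number of rolls needed; lands near 6
print(share_first_try)   # share of runs that succeed on roll 1; lands near 1/6
```

Each run checks the four conditions in order: two outcomes per trial, the same p every time, independent draws, and a count that stops at the first success. Note that the smallest possible value is 1, never 0.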
Geometric Probability Model
Geometric probability model for Bernoulli trials: Geom(p)
p = probability of success
1 – p = probability of failure = q
k = number of trials until the first success occurs
$P(X = k) = (1-p)^{k-1} p = q^{k-1} p$
Mean: $E(X) = \mu = \dfrac{1}{p}$
Standard Deviation: $\sigma = \sqrt{\dfrac{1-p}{p^2}} = \sqrt{\dfrac{q}{p^2}}$

Calculating Geometric Probabilities
• The probability that the first success occurs on the nth trial is
• $P(X = n) = (1-p)^{n-1} p$
• Example:
– If p = .25, find P(X=3).
– P(X=3) means “failure on the 1st trial and failure on the 2nd trial and success on the 3rd trial.” Each trial is independent, so $P(X=3) = (1-p)(1-p)(p) = (1-p)^2 p$.
– Therefore, $P(X=3) = (.75)^2(.25) = .1406$

Geometric Probability Distribution
• A geometric random variable has no largest value: X can be 1, 2, 3, … without end.
• The probabilities are the terms of a geometric sequence, $ar^{n-1}$ (hence the name). Here a is the probability of success (p), r is the probability of failure (1-p), and n is the value of X.
•
X:     1    2         3            …    n
P(X):  p    (1-p)p    (1-p)^2 p    …    (1-p)^(n-1) p
• Geometric probability distributions are skewed right.

Example: Geometric Probability Distributions

Geometric Probability Distribution
• The sum of all the probabilities is still equal to one, even though the geometric probability distribution is infinite.
• For a geometric series $S_\infty = \dfrac{a}{1-r}$, where a = p and r = (1-p); therefore, $\sum P(X_i) = \dfrac{p}{1-(1-p)} = \dfrac{p}{p} = 1$.

Using the TI-83/84
• Single values
– Distr/geometpdf(p,x)
– Distr/geometpdf(p,{x1,x2,x3,…,xn})
• Cumulative values
– Distr/geometcdf(p,x)
– Calculates the sum of the probabilities from X=1 up to and including X=x.

Calculating Probabilities
• The probability of rolling a 6 = 1/6
• The probability of rolling the first 6 on the first roll:
– P(X=1) = 1/6.
– geometpdf(1/6,1) = 1/6.
• The probability of rolling the first 6 after the first roll:
– P(X>1) = 1-1/6 = 5/6.
– 1-geometpdf(1/6,1) = 5/6.

Calculating Probabilities
• The probability of rolling a 6 = 1/6
• The probability of rolling the first 6 on the second roll:
– P(X=2) = (5/6)∙(1/6) = 5/36.
– geometpdf(1/6,2) = .1388888889 = 5/36.
• The probability of rolling the first 6 on the second roll or before:
– P(X≤2) = (1/6) + (5/6)∙(1/6) = 11/36.
– geometcdf(1/6,2) = .305555556 = 11/36.

Calculating Probabilities
• The probability of rolling a 6 = 1/6
• The probability of rolling the first 6 on the second roll:
– P(X=2) = (5/6)∙(1/6) = 5/36.
– geometpdf(1/6,2) = 5/36.
• The probability of rolling the first 6 after the second roll:
– P(X>2) = 1-((1/6) + (5/6)∙(1/6)) = 25/36
– 1-geometcdf(1/6,2) = .69444444 = 25/36.

Mean and Standard Deviation
• Mean: $\mu = \dfrac{1}{p}$
• Variance: $\sigma^2 = \dfrac{1-p}{p^2}$
• The probability that X takes more than n trials to see the first success is $P(X > n) = (1-p)^n$

Geometric Distribution Mean & Standard Deviation
$\mu_X = \dfrac{1}{p}$, $\sigma_X^2 = \dfrac{1-p}{p^2}$, $\sigma_X = \sqrt{\dfrac{1-p}{p^2}}$

Useful Geometric Probability Formulas
$P(X = n) = p\,q^{n-1}$, geometpdf(p,n)
$P(X > n) = (1-p)^n$, 1-geometcdf(p,n)
$P(X \le n) = 1 - (1-p)^n$, geometcdf(p,n)

Review:

What Can Go Wrong?
• Be sure you have Bernoulli trials.
– You need two outcomes per trial, a constant probability of success, and independence.
– Remember that the 10% Condition provides a reasonable substitute for independence.
• Don’t confuse Geometric and Binomial models.
• Don’t use the Normal approximation with small n.
– You need at least 10 successes and 10 failures to use the Normal approximation.

What have we learned?
• Bernoulli trials show up in lots of places.
• Depending on the random variable of interest, we might be dealing with a
– Geometric model
– Binomial model
– Normal model

What have we learned?
– Geometric model
• When we’re interested in the number of Bernoulli trials until the next success.
– Binomial model
• When we’re interested in the number of successes in a certain number of Bernoulli trials.
– Normal model
• To approximate a Binomial model when we expect at least 10 successes and 10 failures.

Assignment
• Pg. 401 – 404: #1 - 19 odd, 25, 29, 35
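
For anyone working the exercises without a TI-83/84, the chapter's formulas can be checked in a short script. The following is a minimal Python sketch, not part of the original slides: the helper names (`binom_pmf`, `binom_cdf`, `geom_pmf`, `geom_cdf`, `normal_upper_tail`) are hypothetical stand-ins that mirror binompdf, binomcdf, geometpdf, geometcdf, and the normal approximation. It assumes Python 3.8 or later for `math.comb` and `statistics.NormalDist`.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p); plays the role of binompdf(n,p,k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p); plays the role of binomcdf(n,p,k)."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

def geom_pmf(p, k):
    """P(X = k) for X ~ Geom(p), k = 1, 2, ...; like geometpdf(p,k)."""
    return (1 - p)**(k - 1) * p

def geom_cdf(p, k):
    """P(X <= k) for X ~ Geom(p); like geometcdf(p,k)."""
    return 1 - (1 - p)**k

def normal_upper_tail(n, p, x):
    """Normal approximation to P(X >= x) using mu = np, sigma = sqrt(np(1-p)).
    Only reasonable when np >= 10 and n(1-p) >= 10."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(x)

# Compare against the values quoted on the slides.
print(binom_pmf(12, 0.75, 7))                       # slides: .1032
print(binom_cdf(12, 0.75, 7))                       # slides: .1576
print(1 - binom_cdf(12, 0.75, 6))                   # slides: .9456
print(binom_cdf(10, 0.1, 1))                        # slides: .7361
print(geom_pmf(1 / 6, 2), geom_cdf(1 / 6, 2))       # slides: 5/36 and 11/36
print(normal_upper_tail(1_000_000, 0.15, 151_000))  # slides: .0026
```

The exact binomial helpers here are meant for small n such as the free-throw and switch examples; for the million-cone shipment the sketch uses only the normal approximation, as the slides do.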