4.3 The Binomial Distribution

Key concepts: Binomial experiment, success, failure, number of successes, binomial distribution

Characteristics of a Binomial Experiment
1. The experiment consists of n identical trials (repetitions).
2. There are only two possible outcomes on each trial. We will denote one outcome by S (for success) and the other by F (for failure).
3. The probability of S remains the same from trial to trial. This probability is denoted by p, and the probability of F is denoted by q. Note that q = 1 − p.
4. The trials are independent.
5. The binomial random variable x is the number of successes S in n trials.

Examples:
Rolling a die 10 times and counting the number of times it turns up “six”: success = six, p = 1/6, q = 5/6, n = 10.
Taking a multiple-choice exam (choices A-E) completely unprepared: x = number of correct answers, success = correct answer, p = 1/5, n = number of questions.
Sampling products coming off a production line: x = number of defective items in the sample, success = defective item, p = proportion of defectives, n = sample size.
Selecting a sample of 100 students at a university and counting graduate students: success = graduate student, p = proportion of graduate students at the university, n = 100.

NOTE: If we select a sample without replacement, then the trials are not independent. But if the population is more than 10 times the size of the sample, then the trials are almost independent and we can treat the sampling as a binomial experiment.

Example (based on Example 4.10, p. 201)
A computer retailer sells both desktop and laptop personal computers (PCs) online. Assume that 80% of the PCs that the retailer sells online are desktops and 20% are laptops, and that sales are independent. Let x represent the number of the next four online PC purchases that are laptops.
1. Explain why x is a binomial random variable; state what the success is, what the failure is, and what the values of p, q, and n are. What are the possible values of x?
2. Derive a formula for the probability distribution p(x) of x.
3. Find the expected value and the standard deviation of x.
Extend (guess) the formula for p(x) to the case of an arbitrary binomial experiment with arbitrary n and p.

SOLUTION
1. Success = a laptop is purchased (L), failure = a desktop is purchased (D), p = P(L) = 0.20, q = P(D) = 1 − p = 0.80, n = 4, x = 0, 1, 2, 3, 4.
2. Each sample point with x laptops has probability (.2)^x (.8)^(4−x); multiplying by the number of such sample points gives P(x):

x    P(a sample point)    # of sample points    P(x)
0    (.8)^4               1                     P(x=0) = (.8)^4 = 0.4096
1    (.2)^1 (.8)^3        4                     P(x=1) = 4 (.2)^1 (.8)^3 = 0.4096
2    (.2)^2 (.8)^2        6                     P(x=2) = 6 (.2)^2 (.8)^2 = 0.1536
3    (.2)^3 (.8)^1        4                     P(x=3) = 4 (.2)^3 (.8)^1 = 0.0256
4    (.2)^4               1                     P(x=4) = (.2)^4 = 0.0016

The coefficients represent the number of ways that we may have x successes in n trials, $\binom{n}{x}$.
Indeed, if n = 4, then $\binom{4}{0} = 1$, $\binom{4}{1} = 4$, $\binom{4}{2} = 6$, $\binom{4}{3} = 4$, $\binom{4}{4} = 1$. Therefore
$p(x) = \binom{4}{x} (0.2)^x (0.8)^{4-x}$, x = 0, 1, 2, 3, 4
3. From TI-83: mean = µ = 0.8, standard deviation = σ = 0.8 (these agree with µ = np = 4×0.2 and σ = √(npq) = √0.64).

Example (Ex. 4.13, p. 206)
Suppose a poll of 20 employees is taken in a large company. The purpose is to determine x, the number who favor unionization. Suppose that 60% of all the company's employees favor unionization (p = .6, q = .4, n = 20).
a. Find the mean and standard deviation of x.
µ = 20×0.6 = 12, σ² = 20×0.6×0.4 = 4.8, σ = √4.8 = 2.19
b. Use Table I in Appendix D to find the probability that x ≤ 10. Repeat using TI-83.
Tables: P(x ≤ 10) = 0.245
TI-83: P(x ≤ k) = P(at most k successes) = binomcdf(n,p,k)
2nd→DISTR→A:binomcdf(….. →ENTER
binomcdf(20,.6,10) = .2446628
c. Use Table I to find the probability that x > 12. Repeat using TI-83.
Tables: P(x > 12) = 1 - P(x ≤ 12) = 1 - .584 = .416
TI-83: 1 - binomcdf(20,.6,12) = .415893
d. Find the probability that x ≥ 8.
P(x ≥ 8) = 1 - P(x ≤ 7) = 1 - binomcdf(20,.6,7) = .97897
e. Use Table I to find the probability that x = 11. Repeat using TI-83. Also compute it using the binomial formula.
Tables: P(x = 11) = P(x ≤ 11) - P(x ≤ 10) = .404 - .245 = .159
TI-83: P(x = k) = P(exactly k successes) = binompdf(n,p,k)
2nd→DISTR→0:binompdf(……. →ENTER
P(x = 11) = binompdf(20,.6,11) = .159738
f. Graph the probability distribution of x and locate the interval µ ± 2σ on the graph.

Exercise 1. Let x be a binomial random variable with p = 0.2 and n = 20.
a. Compute P(x = 0) using the binomial formula.
b. Compute P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2) using the binomial formula (p. 191).
c. Compute the expected value µ of x.
d. Compute the standard deviation σ of x.
e. Compute the following probabilities (using tables and then a calculator…):
i. P(x ≤ 4)
ii. P(x > 3)
iii. P(x = 5)
iv. P(x ≥ 16)
v. P(2 ≤ x < 8)
vi. P(1 < x ≤ 6)

Exercise 2. A multiple-choice test has 25 questions, each of which has 5 possible answers, only one of which is correct. If Judy guesses on all questions, what is the probability that she will answer more than 15 questions correctly (i.e., that she will pass the test)?

Exercise 3. According to a published study, 1 in every 7 men has been involved in a minor traffic accident. Suppose we have randomly and independently sampled twenty-five men and asked each whether he has been involved in a minor traffic accident. How many of the 25 men do we expect to have never been involved in a minor traffic accident?

Exercise 4 (4.51, p. 210). According to the Canadian Journal of Information and Library Science (Vol. 33, 2009), nearly 90% of workers in law libraries are satisfied with their job. Assume the true proportion of law librarians in Canada who are satisfied with their job is .9. In a random sample of 20 law librarians in Canada, what is the probability that at most 2 are unsatisfied with their job?

Exercise 5 (4.55, p. 211). According to the National Bridge Inspection Standard (NBIS), public bridges over 20 feet in length must be inspected and rated every 2 years. The NBIS rating scale ranges from 0 (poorest rating) to 9 (highest rating).
University of Colorado engineers used a probabilistic model to forecast the inspection ratings of all major bridges in Denver (Journal of Performance of Constructed Facilities, Feb. 2005). For the year 2020, the engineers forecast that 9% of all major Denver bridges will have ratings of 4 or below.
a. Use the forecast to find the probability that in a random sample of 10 major Denver bridges, at least 3 will have an inspection rating of 4 or below in 2020.
b. Suppose that you actually observe 3 or more of the sample of 10 bridges with inspection ratings of 4 or below in 2020. What inference can you make? Why?
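
Numerical check of the worked examples. The two worked examples in this section (the laptop purchases with n = 4, p = 0.2 and the unionization poll with n = 20, p = 0.6) can be verified without tables or a TI-83. The following is a minimal sketch in Python, assuming the general binomial formula p(x) = C(n, x) p^x q^(n−x), which is the natural extension of the n = 4 formula derived above, together with µ = np and σ = √(npq). The function names binom_pmf, binom_cdf, and binom_mean_sd are illustrative choices made here; they are not part of the textbook or of the calculator.

```python
from math import comb, sqrt


def binom_pmf(n, p, k):
    """P(x = k) for a binomial random variable (plays the role of binompdf(n,p,k))."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(n, p, k):
    """P(x <= k), obtained by summing the pmf (plays the role of binomcdf(n,p,k))."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))


def binom_mean_sd(n, p):
    """Return (mu, sigma) using mu = n*p and sigma = sqrt(n*p*q)."""
    return n * p, sqrt(n * p * (1 - p))


# Laptop example: n = 4, p = 0.2.
# The handout reports 0.4096, 0.4096, 0.1536, 0.0256, 0.0016 for x = 0..4.
for k in range(5):
    print("P(x = %d) = %.4f" % (k, binom_pmf(4, 0.2, k)))
print("mean, sd:", binom_mean_sd(4, 0.2))

# Unionization poll: n = 20, p = 0.6.
print("P(x <= 10):", binom_cdf(20, 0.6, 10))     # part b
print("P(x > 12): ", 1 - binom_cdf(20, 0.6, 12))  # part c
print("P(x >= 8): ", 1 - binom_cdf(20, 0.6, 7))   # part d
print("P(x = 11): ", binom_pmf(20, 0.6, 11))      # part e
```

The cumulative probability is computed by direct summation, which is adequate for the small n used here. The printed values should agree with the table and TI-83 results quoted in the examples, up to rounding.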