Chapter 6 – The Binomial Probability Distribution

Defn: A random variable is a variable whose values are determined by chance.

We will denote a random variable by a capital letter, such as X, and denote particular values of the variable by the corresponding lower case letter, x. Thus we read P(X = x) as: “the probability that the random variable X takes on the value x.”

Defn: A discrete random variable is a r.v. that has either a finite number of possible values or a countable number of possible values.

Example: Consider the random experiment of rolling two dice, a green one and a red one. Let the variable X be the sum of the numbers showing on the top faces. X can have the possible values 2, 3, 4, 5, 6, …, 11, or 12. When we roll the two dice once, we cannot predict with certainty which value of X will occur.

Example: Consider the population of all families in the United States. Let the random variable X be the number of children in a randomly selected family. Then X has possible values 0, 1, 2, …, up to some maximum number (certainly less than 24!).

Defn: A continuous random variable is a r.v. that has an uncountably infinite number of possible values.

Example: A man is randomly selected from the population of all adult males in the U.S. Let X = man’s height.

Defn: The probability distribution of a r.v. X provides the possible values of X and their corresponding probabilities. A probability distribution may be in the form of a table or a mathematical formula.

Example: For our random experiment of rolling two dice, if we assume that the dice are fair, then each of the 6 × 6 = 36 possible outcomes in the sample space has the same probability of occurring, namely 1/36. If we let the random variable X be the sum of the numbers on the top faces, then there is exactly one way for X to take on the value 2, namely if the numbers showing on the top faces are both 1. Thus P(X = 2) = 1/36. There are two ways that X can take on the value 3: the outcome is either (1, 2) or (2, 1).
Thus P(X = 3) = 2/36 = 1/18. The complete probability distribution for the random variable X is given in the table below:

x          2     3     4     5    6     7    8     9    10    11    12
P(X = x)   1/36  1/18  1/12  1/9  5/36  1/6  5/36  1/9  1/12  1/18  1/36

Required Properties of a Probability Distribution

1) The probability of the occurrence of any event must be a number between 0 and 1.
2) For probability distributions with a countable number of possible values of the random variable, the sum of the probabilities for the various values must be one; i.e., $\sum P(X = x) = 1$, where the sum is over all possible values of the random variable.

Example: p. 250, Exercises 4, 6, 7
Example: p. 251, Exercise 12a

Mean, Variance and Expectation

Defn: The expectation, or mean, of a probability distribution for a discrete random variable X is defined by $\mu = \sum x P(X = x)$.

I.e., we multiply each possible value of the random variable by the probability of occurrence of that value, and add all of these products together. We also call $\mu$ the mean of the random variable X, or the expectation of the random variable X.

Example: Let the random experiment be rolling two fair dice, one green and one red. Let X be the sum of the numbers on the dice. The probability distribution is given in the table above. Then the mean of the distribution is

$\mu = (2)(1/36) + (3)(1/18) + (4)(1/12) + (5)(1/9) + (6)(5/36) + (7)(1/6) + (8)(5/36) + (9)(1/9) + (10)(1/12) + (11)(1/18) + (12)(1/36) = 7$

In this case, the mean of the distribution is the most likely value. This is not always the case, however.

Example: p. 251, Exercise 12c
Example: p. 251, Exercise 14c

The mean of the random variable (or of the distribution) tells us the long-run average of the variable when the random experiment is performed many times. We also want a measure of the variability of the random variable.

Defn: The variance of a discrete random variable X is $\sigma^2 = \sum (x - \mu)^2 P(X = x)$.

I.e., we first find the mean, $\mu$, of the random variable, then for each value of x, we square $x - \mu$ and multiply the result by the probability that the value x occurs. Then we add all of these quantities together to get the variance of X. The standard deviation, $\sigma$, of X is the positive square root of the variance.

Note: A simpler way to calculate the variance is to use the computational formula: $\sigma^2 = \sum_{\text{all } x} x^2 P(X = x) - \mu^2$.

Note: As with the sample variance and standard deviation, larger values of $\sigma^2$ or of $\sigma$ mean that the distribution is more spread out. Also, if $\sigma^2 = 0$, or if $\sigma = 0$, then there is no variability in X; X has only one possible value, namely its mean.

Example: Roll a fair die. Let X be the number showing on the top face. The probability distribution of X is given in the table below:

x          1    2    3    4    5    6
P(X = x)   1/6  1/6  1/6  1/6  1/6  1/6

The mean of X is
$\mu = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 3.5$.
The variance of X is
$\sigma^2 = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + (3 - 3.5)^2(1/6) + (4 - 3.5)^2(1/6) + (5 - 3.5)^2(1/6) + (6 - 3.5)^2(1/6) = 35/12 \approx 2.92$,
and the standard deviation of X is $\sigma \approx 1.71$.

Example: p. 251, Exercise 12de
Example: p. 251, Exercise 14de

The Binomial Distribution

An important special probability distribution, which appears often when doing surveys, is called the binomial distribution.

A binomial experiment is a random experiment which has the following characteristics:
1) There are n independent and identical trials.
2) Each trial results in one of two possible outcomes, Success or Failure.
3) The probability of Success, p, is the same for each individual trial.

We then define a random variable X to be the number of Successes which occur in the n trials. The random variable X has a binomial probability distribution.

The binomial probability distribution is given by the following equation:

$P(X = x) = {}_nC_x \, p^x (1 - p)^{n - x}$, for x = 0, 1, 2, 3, …, n,

where ${}_nC_x = n! / (x!(n - x)!)$ is the number of ways to choose which x of the n trials are Successes.

By “independent” in condition 1, we mean that the outcome of one trial does not affect the outcome of any other trial.
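The dice examples earlier in this chapter (the two-dice table, and the mean and variance of a single fair die) can be checked with a short Python sketch. This is only an illustration, not part of the course materials: the function names `distribution_of_sum`, `mean`, `variance`, and `variance_shortcut` are made up for this example, and exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction
from itertools import product


def distribution_of_sum(faces=6):
    """Return {x: P(X = x)} for the sum of two fair dice, by listing all outcomes."""
    outcomes = list(product(range(1, faces + 1), repeat=2))  # 36 (green, red) pairs
    prob = Fraction(1, len(outcomes))                        # each pair is equally likely
    dist = {}
    for green, red in outcomes:
        dist[green + red] = dist.get(green + red, 0) + prob
    return dist


def mean(dist):
    """Mean: sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance from the definition: sum of (x - mean)^2 * P(X = x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def variance_shortcut(dist):
    """Variance from the computational formula: sum of x^2 * P(X = x), minus mean^2."""
    return sum(x * x * p for x, p in dist.items()) - mean(dist) ** 2


two_dice = distribution_of_sum()
one_die = {x: Fraction(1, 6) for x in range(1, 7)}

print("two dice:", {x: str(p) for x, p in sorted(two_dice.items())})
print("sum of probabilities:", sum(two_dice.values()))
print("mean of two dice:", mean(two_dice))
print("one die: mean =", mean(one_die), " variance =", variance(one_die))
print("shortcut agrees:", variance(one_die) == variance_shortcut(one_die))
```

Both variance functions should give the same answer for any valid probability table, so comparing them is a quick way to catch an arithmetic slip in a hand calculation.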
By “identical” we mean that the trials are performed in exactly the same way.

Example: Which of the following are binomial experiments or can be made into binomial experiments?
a) Surveying 100 people to determine whether they like Sudsy Soap.
b) Tossing a coin 100 times to see how many heads occur.
c) Drawing a card from a deck and getting a heart.
d) Asking 1000 people which brand of cigarette they smoke.
e) Testing four different brands of aspirin to see which brands are effective.
f) Testing one brand of aspirin using 10 people to determine whether the brand is effective at relieving headaches.
g) Asking 100 people whether they smoke.

We can find binomial probabilities using the TI-83 calculator:

Example: According to the Information Please almanac, 6% of the human population has blood type O-negative. A simple random sample of size 10 is selected from the population. Since the selection is done randomly, the 10 trials are independent of each other (what does “independent” mean?). Since the same information is being sought for each person, the 10 trials are identical to each other. Hence, condition (1) is satisfied. Either a person has blood type O-negative (Success) or does not (Failure), so condition (2) is satisfied. The probability that a person in the sample has blood type O-negative is 0.06 for each of the 10 people selected, due to random sampling, so condition (3) is satisfied.

Let X = number of people in the sample who have blood type O-negative. Then X has a binomial distribution with n = 10 and p = 0.06.

a) What is the probability that exactly one person in the sample has O-negative blood?
   (i) Choose 2nd, DISTR, and binompdf(.
   (ii) Enter 10, 0.06, and 1.
   (iii) What is the result?
b) What is the probability that no more than 3 people in the sample have O-negative blood?
   (i) Choose 2nd, DISTR, and binomcdf(.
   (ii) Enter 10, 0.06, and 3.
   (iii) What is the result?
c) What is the probability that at least 5 people in the sample have O-negative blood?
How would we find this probability? What rule would we use?

Mean, Variance and Standard Deviation for the Binomial Distribution

For a random variable X which has a binomial distribution with parameters n and p, we have
(1) $\mu = np$
(2) $\sigma^2 = np(1 - p)$
(3) $\sigma = \sqrt{np(1 - p)}$

Example: p. 264, Exercise 30
Is this a binomial experiment? What is the mean? The standard deviation?

Example: p. 264, Exercise 32
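For readers without a TI-83 at hand, the blood-type example (n = 10, p = 0.06) can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions: `binom_pmf` and `binom_cdf` are hypothetical helper names chosen to mirror the calculator's binompdf( and binomcdf( entries, and part (c) is handled with the complement rule, P(X ≥ 5) = 1 − P(X ≤ 4), which is one natural choice rather than the only one.

```python
from math import comb, sqrt


def binom_pmf(n, p, x):
    """P(X = x) from the binomial formula: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(n, p, x):
    """P(X <= x): add the binomial probabilities for 0, 1, ..., x."""
    return sum(binom_pmf(n, p, k) for k in range(x + 1))


n, p = 10, 0.06  # sample size and P(O-negative) from the example

exactly_one = binom_pmf(n, p, 1)        # part (a): P(X = 1)
at_most_three = binom_cdf(n, p, 3)      # part (b): P(X <= 3)
at_least_five = 1 - binom_cdf(n, p, 4)  # part (c): complement of P(X <= 4)

mu = n * p                       # mean of the binomial distribution
sigma = sqrt(n * p * (1 - p))    # standard deviation

print(f"P(X = 1)  = {exactly_one:.4f}")
print(f"P(X <= 3) = {at_most_three:.4f}")
print(f"P(X >= 5) = {at_least_five:.6f}")
print(f"mean = {mu:.2f}, standard deviation = {sigma:.4f}")
```

The same two helpers work for the textbook exercises: substitute the exercise's n and p, and compare the printed mean and standard deviation with the values from the formulas np and the square root of np(1 − p).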