CHAPTER 2
DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

1. Random Variable
2. Probability Distribution
3. Expected Value of a Random Variable
4. The Variance and Standard Deviation of a Random Variable
   4.1. Arithmetic Properties of Expected Value and Variance
      4.1.1. Standardized Random Variable
5. The Binomial Distribution
   5.1. Cumulative Probability Distribution
   5.2. How to Use Excel to Find Binomial Probabilities
   5.3. Expected Value and Variance of the Binomial Distribution

1. Random Variable

A random variable is a variable whose values are determined through a random experiment or process. In other words, a random variable is a variable whose value cannot be predicted exactly. The value is not known in advance; it is not known until after the random experiment is conducted.

Example 1

As a random experiment, toss a coin. This random experiment has two outcomes: H and T. Assign the value 0 to H (H = 0) and 1 to T (T = 1). If you toss two coins, each outcome is a pair of 0's and 1's, and counting the number of 1's in each outcome provides the value assigned to x, the number of tails. The number of tails is either 0, 1, or 2:

| Outcomes of the random experiment | Value of random variable x (number of tails) |
|---|---|
| (0,0) | 0 |
| (0,1), (1,0) | 1 |
| (1,1) | 2 |

Example 2

When you toss a pair of dice, let x denote the sum of the number of dots appearing on top. The values of x are assigned through the outcomes of the random experiment, as shown below.
| Outcomes of the random experiment | Value of random variable x (sum of dots) |
|---|---|
| (1,1) | 2 |
| (1,2), (2,1) | 3 |
| (1,3), (2,2), (3,1) | 4 |
| (1,4), (2,3), (3,2), (4,1) | 5 |
| (1,5), (2,4), (3,3), (4,2), (5,1) | 6 |
| (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) | 7 |
| (2,6), (3,5), (4,4), (5,3), (6,2) | 8 |
| (3,6), (4,5), (5,4), (6,3) | 9 |
| (4,6), (5,5), (6,4) | 10 |
| (5,6), (6,5) | 11 |
| (6,6) | 12 |

Example 3

When you are guessing the answers to a set of 5 multiple-choice questions, you are conducting a random experiment. Assign 0 to an incorrect answer and 1 to a correct answer for each question, and let x denote the number of correct answers: 0, 1, 2, 3, 4, or 5. These numbers are assigned to x through the outcomes of the random experiment as shown below. There are 32 possible outcomes in this random experiment (see footnote 1).

| Outcomes of the random experiment | x (correct guesses) |
|---|---|
| (0,0,0,0,0) | 0 |
| (1,0,0,0,0), (0,1,0,0,0), (0,0,1,0,0), (0,0,0,1,0), (0,0,0,0,1) | 1 |
| (1,1,0,0,0), (1,0,1,0,0), (1,0,0,1,0), (1,0,0,0,1), (0,1,1,0,0), (0,1,0,1,0), (0,1,0,0,1), (0,0,1,1,0), (0,0,1,0,1), (0,0,0,1,1) | 2 |
| (1,1,1,0,0), (1,1,0,1,0), (1,1,0,0,1), (1,0,1,1,0), (1,0,1,0,1), (1,0,0,1,1), (0,1,1,1,0), (0,1,1,0,1), (0,1,0,1,1), (0,0,1,1,1) | 3 |
| (1,1,1,1,0), (1,1,1,0,1), (1,1,0,1,1), (1,0,1,1,1), (0,1,1,1,1) | 4 |
| (1,1,1,1,1) | 5 |

2. Probability Distribution

The probability distribution of a (discrete) random variable is the set of all possible values of the random variable along with the probability corresponding to each value. Since a probability distribution lists all possible values of the random variable, the sum of the probabilities must equal one (1).

Example 4

Let x be the random variable denoting the number of tails when tossing two coins as in Example 1. Write the probability distribution of x. As Example 1 showed, there are four outcomes when tossing two coins, each assigning a discrete value to x.
Since each outcome is equally likely, the probability distribution is:

| x | f(x) |
|---|---|
| 0 | 0.25 |
| 1 | 0.50 |
| 2 | 0.25 |
| Total | 1.00 |

The graph of the above probability distribution is:

[Figure: Probability Distribution of Number of Tails in Tossing Two Coins. Bar chart of f(x) against x (number of tails), with bars of height 0.25, 0.50, and 0.25 at x = 0, 1, 2.]

Footnote 1: Each trial has two outcomes (failure or success, 0 or 1). The experiment has five trials. Therefore, the total number of outcomes is 2⁵ = 32.

Example 5

Let x be the random variable denoting the sum of dots appearing on top when tossing a pair of dice. Write the probability distribution of x. As Example 2 showed, there are 36 outcomes of (ordered) pairs of numbers generating the values of x. Since each outcome is equally likely, the probability distribution of x can be written as follows:

| x | f(x) |
|---|---|
| 2 | 1/36 = 0.0278 |
| 3 | 2/36 = 0.0556 |
| 4 | 3/36 = 0.0833 |
| 5 | 4/36 = 0.1111 |
| 6 | 5/36 = 0.1389 |
| 7 | 6/36 = 0.1667 |
| 8 | 5/36 = 0.1389 |
| 9 | 4/36 = 0.1111 |
| 10 | 3/36 = 0.0833 |
| 11 | 2/36 = 0.0556 |
| 12 | 1/36 = 0.0278 |

[Figure: Probability Distribution of Sum of Dots in Tossing Two Dice. Bar chart of f(x) against x (sum of dots) for x = 2 through 12.]

3. Expected Value of a Random Variable

The expected value of the random variable x, denoted by E(x), is simply the mean of all the values taken by the random variable. Since E(x) represents the mean of all possible values of x, we can also use the symbol µ (population mean) to represent the expected value.

If the probabilities assigned to each value were the same, then the mean or the expected value would simply be the sum of all the values divided by the number of values of x. But the assigned probabilities are rarely equal for all the values of a random variable.
Therefore, to find the expected value of x, the weighted average formula must be used. The weights assigned to each value are the probabilities.

Example 6

The following table shows the probability distribution of the number of automobiles sold on any given business day at a car dealership.

| Number of automobiles sold, x | Probability, f(x) |
|---|---|
| 0 | 0.18 |
| 1 | 0.39 |
| 2 | 0.24 |
| 3 | 0.14 |
| 4 | 0.04 |
| 5 | 0.01 |

Find the expected value of the number of automobiles sold in a day.

| x | f(x) | x f(x) |
|---|---|---|
| 0 | 0.18 | 0.00 |
| 1 | 0.39 | 0.39 |
| 2 | 0.24 | 0.48 |
| 3 | 0.14 | 0.42 |
| 4 | 0.04 | 0.16 |
| 5 | 0.01 | 0.05 |
| | | ∑ x f(x) = 1.50 |

The expected value of a random variable is the sum of the products of the values of the random variable and the corresponding probabilities.

$$E(x) = \sum x f(x)$$

Example 7

A state lottery issues 100,000 instant lottery tickets. The possible prizes and the number of tickets containing each prize are shown below:

| Prize ($) | Number of tickets issued |
|---|---|
| 0 | 86,246 |
| 1 | 8,000 |
| 2 | 3,200 |
| 5 | 1,540 |
| 10 | 740 |
| 25 | 260 |
| 500 | 13 |
| 5000 | 1 |
| Total | 100,000 |

Let x denote the prize value. Using the number (frequency) of tickets associated with each prize, develop the probability distribution of x and compute the expected value of the prize amount.

The following worksheet shows the calculations. The first and second columns show the probability distribution of x. The probability of winning each prize amount is simply the relative frequency of the tickets containing that prize amount. The third column in the worksheet shows the calculation of E(x).

| x | f(x) | x f(x) |
|---|---|---|
| 0 | 0.86246 | 0.000 |
| 1 | 0.08000 | 0.080 |
| 2 | 0.03200 | 0.064 |
| 5 | 0.01540 | 0.077 |
| 10 | 0.00740 | 0.074 |
| 25 | 0.00260 | 0.065 |
| 500 | 0.00013 | 0.065 |
| 5000 | 0.00001 | 0.050 |
| Total | 1.00000 | E(x) = ∑ x f(x) = 0.475 |

4. The Variance and Standard Deviation of a Random Variable

The variance of a random variable, denoted by var(x), is a measure of the dispersion of the values of the random variable.
Generally, the variance, as explained in Chapter 1, is computed as the mean squared deviation of a variable from the mean:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

If the variable is a random variable, then we must compute the weighted average of the squared deviations, using the probabilities as the weights. The mean of the random variable is µ. Then the variance of the random variable x is defined as the expected value of the squared deviations:

$$\mathrm{var}(x) = E[(x - \mu)^2]$$

The above expression means that the variance of the random variable x is the weighted mean of squared deviations:

$$\mathrm{var}(x) = \sum (x - \mu)^2 f(x)$$

Variance of a random variable: E[(x − µ)²] is the expected value of the squared deviations and is computed as var(x) = ∑(x − µ)² f(x), the weighted mean of the squared deviations.

Example 8

Given E(x) = µ = 1.5, find the variance of the number of cars sold in a day in Example 6.

| x | f(x) | (x − µ)² | (x − µ)² f(x) |
|---|---|---|---|
| 0 | 0.18 | 2.25 | 0.4050 |
| 1 | 0.39 | 0.25 | 0.0975 |
| 2 | 0.24 | 0.25 | 0.0600 |
| 3 | 0.14 | 2.25 | 0.3150 |
| 4 | 0.04 | 6.25 | 0.2500 |
| 5 | 0.01 | 12.25 | 0.1225 |
| Total | | | 1.2500 |

The variance of x is:

$$\mathrm{var}(x) = \sum (x - \mu)^2 f(x) = 1.25$$

The standard deviation of x is the square root of the variance:

$$\mathrm{sd}(x) = \sqrt{\sum (x - \mu)^2 f(x)} = \sqrt{1.25} = 1.118$$

Example 9

Given E(x) = µ = 0.475, find the variance of prize values in the lottery example (Example 7).

| x | f(x) | (x − µ)² | (x − µ)² f(x) |
|---|---|---|---|
| 0 | 0.86246 | 0.2256 | 0.1946 |
| 1 | 0.08 | 0.2756 | 0.0221 |
| 2 | 0.032 | 2.3256 | 0.0744 |
| 5 | 0.0154 | 20.4756 | 0.3153 |
| 10 | 0.0074 | 90.7256 | 0.6714 |
| 25 | 0.0026 | 601.4756 | 1.5638 |
| 500 | 0.00013 | 249525.2256 | 32.4383 |
| 5000 | 0.00001 | 24995250.2256 | 249.9525 |
| Total | | | 285.2324 |

The standard deviation is:

$$\mathrm{sd}(x) = \sqrt{285.2324} = 16.89$$

Example 10

A company employs salespersons to market its product. Keeping track of sales for 100 weeks, the following table shows the number of units of the product sold per week and the frequency of the weeks. For example, in 15 of the weeks, a salesperson sold 11 units, and in 40 of the weeks, 12 units were sold.
| Number of units sold, x | Weeks |
|---|---|
| 10 | 10 |
| 11 | 15 |
| 12 | 40 |
| 13 | 30 |
| 14 | 5 |
| Total | 100 |

Using this table, we can generate the probability (relative frequency) distribution of the random variable x (the number of units sold in a given week), as shown below.

| Number of units sold, x | f(x) |
|---|---|
| 10 | 0.10 |
| 11 | 0.15 |
| 12 | 0.40 |
| 13 | 0.30 |
| 14 | 0.05 |

Find the expected value and the standard deviation of the number of units sold per salesperson per week. The following worksheet shows the calculations.

| x | f(x) | x f(x) | (x − µ)² | (x − µ)² f(x) |
|---|---|---|---|---|
| 10 | 0.10 | 1.00 | 4.2025 | 0.4203 |
| 11 | 0.15 | 1.65 | 1.1025 | 0.1654 |
| 12 | 0.40 | 4.80 | 0.0025 | 0.0010 |
| 13 | 0.30 | 3.90 | 0.9025 | 0.2708 |
| 14 | 0.05 | 0.70 | 3.8025 | 0.1901 |
| | | µ = 12.05 | | var(x) = 1.0475 |
| | | | | sd(x) = 1.0235 |

Each salesperson sells, on average, 12.05 units per week, and the number of units sold deviates from the mean by 1.0235, on average.

A simpler formula to compute the variance of a random variable

A simple algebraic manipulation of the variance formula provides an alternative, much simpler, way to compute the variance of a discrete random variable (see footnote 2).

$$\sum (x - \mu)^2 f(x) = \sum x^2 f(x) - \mu^2$$

Simpler formula to compute the variance:

$$E[(x - \mu)^2] = E(x^2) - \mu^2$$

Use the computational formula to obtain the variance in Example 8 and Example 10.
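As a numerical check on the shortcut, the sketch below computes the variance both ways for the distributions in Example 8 and Example 10. The function names (`mean`, `var_definitional`, `var_computational`) and the dictionary layout are illustrative assumptions, not part of the original notes; the two results should agree with 1.25 and 1.0475 up to floating-point rounding.

```python
import math

def mean(dist):
    """Expected value: sum of x * f(x) over the distribution."""
    return sum(x * p for x, p in dist.items())

def var_definitional(dist):
    """Weighted mean of squared deviations: sum of (x - mu)^2 * f(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def var_computational(dist):
    """Shortcut: sum of x^2 * f(x), minus mu^2."""
    mu = mean(dist)
    return sum(x ** 2 * p for x, p in dist.items()) - mu ** 2

# Probability distributions copied from Example 8 (cars sold per day)
# and Example 10 (units sold per week); keys are x, values are f(x).
cars = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}
units = {10: 0.10, 11: 0.15, 12: 0.40, 13: 0.30, 14: 0.05}

for name, dist in [("Example 8", cars), ("Example 10", units)]:
    v_def = var_definitional(dist)
    v_short = var_computational(dist)
    print(f"{name}: mean={mean(dist):.4f}  var={v_def:.4f}  "
          f"shortcut={v_short:.4f}  sd={math.sqrt(v_def):.4f}")
```

The shortcut needs only one pass over x² f(x) once µ is known, which is why it is convenient for hand worksheets.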
Variance for Example 8

| x | f(x) | x² | x² f(x) |
|---|---|---|---|
| 0 | 0.18 | 0 | 0.00 |
| 1 | 0.39 | 1 | 0.39 |
| 2 | 0.24 | 4 | 0.96 |
| 3 | 0.14 | 9 | 1.26 |
| 4 | 0.04 | 16 | 0.64 |
| 5 | 0.01 | 25 | 0.25 |
| Total | | | 3.50 |

$$\sum x^2 f(x) - \mu^2 = 3.5 - 1.5^2 = 1.25$$

Variance for Example 10

| x | f(x) | x² | x² f(x) |
|---|---|---|---|
| 10 | 0.10 | 100 | 10.00 |
| 11 | 0.15 | 121 | 18.15 |
| 12 | 0.40 | 144 | 57.60 |
| 13 | 0.30 | 169 | 50.70 |
| 14 | 0.05 | 196 | 9.80 |
| Total | | | 146.25 |

$$\sum x^2 f(x) - \mu^2 = 146.25 - 12.05^2 = 1.0475$$

Footnote 2: The computational formula follows from expanding the square and using ∑ x f(x) = µ and ∑ f(x) = 1:

$$\sum (x - \mu)^2 f(x) = \sum (x^2 - 2\mu x + \mu^2) f(x)$$
$$\sum (x - \mu)^2 f(x) = \sum [x^2 f(x) - 2\mu x f(x) + \mu^2 f(x)]$$
$$\sum (x - \mu)^2 f(x) = \sum x^2 f(x) - 2\mu \sum x f(x) + \mu^2 \sum f(x)$$
$$\sum (x - \mu)^2 f(x) = \sum x^2 f(x) - 2\mu^2 + \mu^2$$
$$\sum (x - \mu)^2 f(x) = \sum x^2 f(x) - \mu^2$$

Example 11

In Example 10, suppose each salesperson receives a sales commission of $20 per unit sold plus a fixed weekly wage of $500. What is the expected weekly income?

Here the random variable X (the number of units sold) is linearly transformed to the random variable Y (the weekly income), where

$$y = 500 + 20x$$

By linearly transforming X we have a new random variable Y. However, the corresponding probabilities remain intact. The following shows the probability distribution of y and the expected value of y.

| y | f(y) | y f(y) |
|---|---|---|
| 500 + 20(10) = 700 | 0.10 | 70 |
| 500 + 20(11) = 720 | 0.15 | 108 |
| 500 + 20(12) = 740 | 0.40 | 296 |
| 500 + 20(13) = 760 | 0.30 | 228 |
| 500 + 20(14) = 780 | 0.05 | 39 |
| | | E(y) = 741 |

A salesperson expects to earn $741 per week. (He earns on average $741 per week.)

Note that you can find the expected value of y using E(x). Since y is a linear transformation of x,

$$E(y) = E(500 + 20x)$$
$$E(y) = E(500) + E(20x)$$
$$E(y) = 500 + 20E(x)$$
$$E(y) = 500 + 20(12.05) = 741$$

4.1. Arithmetic Properties of Expected Value and Variance

The arithmetic properties of E(x) and var(x) show what happens to the expected value and variance of x if the random variable is linearly transformed.
The random variable y is a linear transformation of x if, for any constant a and a nonzero constant b,

$$y = a + bx$$

Then for any linear transformation of x, the following holds (see footnote 3):

$$E(y) = E(a + bx)$$
$$E(y) = E(a) + bE(x)$$
$$E(y) = a + bE(x)$$

Footnote 3:

$$E(a + bx) = \sum (a + bx) f(x)$$
$$E(a + bx) = \sum a f(x) + \sum b x f(x)$$
$$E(a + bx) = a \sum f(x) + b \sum x f(x)$$
$$E(a + bx) = a + bE(x)$$

and,

$$\mathrm{var}(y) = \mathrm{var}(a + bx)$$
$$\mathrm{var}(y) = \mathrm{var}(a) + \mathrm{var}(bx)$$
$$\mathrm{var}(y) = b^2 \mathrm{var}(x)$$

(See footnote 4.) Note that the variance of a constant is always zero. Therefore, var(a) = 0. Also note that

$$\mathrm{sd}(y) = |b|\,\mathrm{sd}(x)$$

which reduces to sd(y) = b sd(x) when b is positive, as in the examples that follow.

Example 12

Find the variance and standard deviation of the weekly income of the salespersons in Example 11.

First find the variance of y by using the worksheet method.

| y | f(y) | (y − µ)² f(y) |
|---|---|---|
| 700 | 0.10 | 168.10 |
| 720 | 0.15 | 66.15 |
| 740 | 0.40 | 0.40 |
| 760 | 0.30 | 108.30 |
| 780 | 0.05 | 76.05 |
| | | var(y) = ∑(y − µ)² f(y) = 419.00 |
| | | sd(y) = 20.47 |

The standard deviation of income means that, on average, a salesperson should expect his income to vary from the mean by $20.47 per week.

Next use the arithmetic property of variance to find var(y).

$$\mathrm{var}(y) = \mathrm{var}(500 + 20x) = 20^2 \mathrm{var}(x) = 400(1.0475) = 419$$

and

$$\mathrm{sd}(y) = \sqrt{419} = 20.47$$

or

$$\mathrm{sd}(y) = b\,\mathrm{sd}(x) = 20(1.0235) = 20.47$$

Footnote 4:

$$\mathrm{var}(a + bx) = \sum [a + bx - E(a + bx)]^2 f(x)$$
$$\mathrm{var}(a + bx) = \sum [a + bx - a - bE(x)]^2 f(x)$$
$$\mathrm{var}(a + bx) = \sum (bx - b\mu)^2 f(x)$$
$$\mathrm{var}(a + bx) = \sum b^2 (x - \mu)^2 f(x) = b^2 \sum (x - \mu)^2 f(x)$$
$$\mathrm{var}(a + bx) = b^2 \mathrm{var}(x)$$

Example 13

Given the following probability distribution of x, find the expected value and variance of x and the expected value and variance of y = 2 + 5x.

First, find E(x) and E(y):

| x | f(x) | x f(x) | y = 2 + 5x | y f(x) |
|---|---|---|---|---|
| 1 | 0.05 | 0.05 | 7 | 0.35 |
| 2 | 0.15 | 0.30 | 12 | 1.80 |
| 3 | 0.35 | 1.05 | 17 | 5.95 |
| 4 | 0.30 | 1.20 | 22 | 6.60 |
| 5 | 0.15 | 0.75 | 27 | 4.05 |
| | | E(x) = ∑ x f(x) = 3.35 | | E(y) = 18.75 |

Using the linear transformation property E(y) = a + bE(x):
$$E(y) = 2 + 5E(x) = 2 + 5(3.35) = 18.75$$

Once you determine E(x), you can use the arithmetic properties of expected value to find E(y). You do not need to perform the worksheet computations.

Next, compute var(x) and var(y):

| x | f(x) | [x − E(x)]² f(x) | y = 2 + 5x | [y − E(y)]² f(x) |
|---|---|---|---|---|
| 1 | 0.05 | 0.2761 | 7 | 6.9031 |
| 2 | 0.15 | 0.2734 | 12 | 6.8344 |
| 3 | 0.35 | 0.0429 | 17 | 1.0719 |
| 4 | 0.30 | 0.1268 | 22 | 3.1688 |
| 5 | 0.15 | 0.4084 | 27 | 10.2094 |
| | | var(x) = 1.1275 | | var(y) = 28.1875 |

Note, again, that:

$$\mathrm{var}(y) = \mathrm{var}(2 + 5x) = 5^2 \mathrm{var}(x) = 25(1.1275) = 28.1875$$
$$\mathrm{sd}(x) = \sqrt{1.1275} = 1.0618$$
$$\mathrm{sd}(y) = 5\,\mathrm{sd}(x) = 5(1.0618) = 5.3092$$

5. The Binomial Distribution

Certain random experiments or processes, such as guessing the answers to a multiple-choice test, tossing a coin or a pair of dice, or drawing a card from a deck of playing cards (with replacement), possess properties that generate special random variables. The probability assigned to each value of such a random variable is then determined using a general formula. What are the properties of such random experiments?

1. The experiment consists of identical and independent repeated trials.
2. Each trial consists of two mutually exclusive outcomes: success or failure. The answer for each multiple-choice question is either correct (success) or incorrect (failure). If you define heads as success in tossing a coin, or a double-six as success in tossing a pair of dice, then the other outcomes are "failures". Or, you may define drawing an ace as a success. Then drawing any of the other cards is a failure.
3. The probability of success, P(S) = π, and the probability of failure, P(F) = 1 − π, remain the same for all repeated trials. In the quiz example, π = 0.25 and 1 − π = 0.75. In the coin toss, π = 0.5 and 1 − π = 0.5. In tossing a pair of dice, the probability of a double-six is π = 1/36. And in drawing an ace, π = 4/52.

In general, any random process having these properties is called a Bernoulli process.
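The three properties can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `count_successes`, the seed, and the number of repetitions are arbitrary choices made for illustration, and the quiz setting (five questions, π = 0.25) is borrowed from the examples in these notes.

```python
import random

def count_successes(n, pi, rng):
    """Run n independent trials, each a success with the same
    probability pi, and return the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < pi)

# Fixed seed so the run is repeatable (the value itself is arbitrary).
rng = random.Random(2024)

# Guessing on a 5-question quiz with pi = 0.25, repeated many times.
n, pi, repetitions = 5, 0.25, 20_000
counts = [count_successes(n, pi, rng) for _ in range(repetitions)]

# Relative frequency of each possible number of successes, 0 through n.
rel_freq = {x: counts.count(x) / repetitions for x in range(n + 1)}
for x, freq in rel_freq.items():
    print(x, round(freq, 4))
```

Each call to `count_successes` uses the same π for every trial and draws each trial independently, which mirrors properties 1 and 3; the comparison `rng.random() < pi` gives exactly two outcomes per trial, which mirrors property 2.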
The probability distribution of a random variable defining the number of "successes" in a Bernoulli process is called a binomial distribution. Thus, the probability distribution of x, the number of correct answers in the multiple-choice quiz example, is a binomial distribution.

Each binomial distribution is defined by two parameters, or identifying characteristics: the number of trials, n, and the probability of success, π. In symbols, the distribution is expressed as follows,

$$X \sim B(n, \pi)$$

which reads: "X is binomially distributed with parameters n and π".

We can now explain the method to determine the probabilities associated with each value of a binomial random variable using the following example.

Example 14

The table from Example 3 is reproduced below. When you are guessing the answers to a set of 5 multiple-choice questions, you are conducting a random experiment. Assign 0 to an incorrect answer and 1 to a correct answer for each question, and let x be the random variable representing the number of correct answers out of 5 questions. The values of x are shown in the first column of the table: 0, 1, 2, 3, 4, 5.

| x | Outcomes |
|---|---|
| 0 | (0,0,0,0,0) |
| 1 | (1,0,0,0,0), (0,1,0,0,0), (0,0,1,0,0), (0,0,0,1,0), (0,0,0,0,1) |
| 2 | (1,1,0,0,0), (1,0,1,0,0), (1,0,0,1,0), (1,0,0,0,1), (0,1,1,0,0), (0,1,0,1,0), (0,1,0,0,1), (0,0,1,1,0), (0,0,1,0,1), (0,0,0,1,1) |
| 3 | (1,1,1,0,0), (1,1,0,1,0), (1,1,0,0,1), (1,0,1,1,0), (1,0,1,0,1), (1,0,0,1,1), (0,1,1,1,0), (0,1,1,0,1), (0,1,0,1,1), (0,0,1,1,1) |
| 4 | (1,1,1,1,0), (1,1,1,0,1), (1,1,0,1,1), (1,0,1,1,1), (0,1,1,1,1) |
| 5 | (1,1,1,1,1) |

Develop the probability distribution of x by determining the probability associated with each value of the random variable. Note that each value of x is generated by its own number of outcomes. For example, five outcomes generate a value of x = 1, ten outcomes generate x = 2, and so on. Each outcome has its own probability.
If there are four choices per question, the probability of answering each question correctly (the probability of success) is P(1) = 1/4 = 0.25, and the probability of failure is P(0) = 3/4 = 0.75. The probability that all questions are guessed incorrectly is then:

$$f(0) = P(0,0,0,0,0) = (0.75)(0.75)(0.75)(0.75)(0.75) = 0.75^5 = 0.2373$$

The probability that one question is guessed correctly is the sum of the probabilities of the five outcomes with exactly one correct answer:

| Outcome | Probability |
|---|---|
| P(1,0,0,0,0) | (0.25)(0.75)(0.75)(0.75)(0.75) = 0.25 × 0.75⁴ = 0.0791 |
| P(0,1,0,0,0) | (0.75)(0.25)(0.75)(0.75)(0.75) = 0.25 × 0.75⁴ = 0.0791 |
| P(0,0,1,0,0) | (0.75)(0.75)(0.25)(0.75)(0.75) = 0.25 × 0.75⁴ = 0.0791 |
| P(0,0,0,1,0) | (0.75)(0.75)(0.75)(0.25)(0.75) = 0.25 × 0.75⁴ = 0.0791 |
| P(0,0,0,0,1) | (0.75)(0.75)(0.75)(0.75)(0.25) = 0.25 × 0.75⁴ = 0.0791 |

$$f(1) = 5 \times 0.25 \times 0.75^4 = 5 \times 0.0791 = 0.3955$$

The probability of two correct guesses is the sum over the ten outcomes with exactly two correct answers, each of which has the same probability:

| Outcome | Probability |
|---|---|
| P(1,1,0,0,0) | (0.25)(0.25)(0.75)(0.75)(0.75) = 0.25² × 0.75³ = 0.02637 |
| P(1,0,1,0,0) | (0.25)(0.75)(0.25)(0.75)(0.75) = 0.25² × 0.75³ = 0.02637 |
| ⋮ | ⋮ |
| P(0,0,0,1,1) | (0.75)(0.75)(0.75)(0.25)(0.25) = 0.25² × 0.75³ = 0.02637 |

$$f(2) = 10 \times 0.25^2 \times 0.75^3 = 10 \times 0.02637 = 0.2637$$
$$f(3) = 10 \times 0.25^3 \times 0.75^2 = 10 \times 0.00879 = 0.0879$$
$$f(4) = 5 \times 0.25^4 \times 0.75 = 5 \times 0.00293 = 0.0146$$
$$f(5) = 1 \times 0.25^5 \times 0.75^0 = 0.0010$$

Putting these calculations compactly, we have:

| x | Number of outcomes × probability of each | f(x) |
|---|---|---|
| 0 | 1 × 0.25⁰ × 0.75⁵ | 0.2373 |
| 1 | 5 × 0.25¹ × 0.75⁴ | 0.3955 |
| 2 | 10 × 0.25² × 0.75³ | 0.2637 |
| 3 | 10 × 0.25³ × 0.75² | 0.0879 |
| 4 | 5 × 0.25⁴ × 0.75¹ | 0.0146 |
| 5 | 1 × 0.25⁵ × 0.75⁰ | 0.0010 |

The probability distribution is then:

| x | f(x) |
|---|---|
| 0 | 0.2373 |
| 1 | 0.3955 |
| 2 | 0.2637 |
| 3 | 0.0879 |
| 4 | 0.0146 |
| 5 | 0.0010 |

Note that the number of ways you can arrange a given number of x successes among n trials is obtained using the combination counting formula:

$$C(n, x) = \frac{n!}{(n - x)!\,x!}$$
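The compact table can be reproduced in a few lines. The sketch below assumes Python 3.8 or later, where `math.comb(n, x)` returns the combination count C(n, x); the function name `prob_x_correct` is a hypothetical label chosen for this illustration.

```python
from math import comb

def prob_x_correct(x, n, pi):
    """Number of arrangements of x successes among n trials,
    times the probability of any single such arrangement."""
    arrangements = comb(n, x)                  # C(n, x)
    one_arrangement = pi ** x * (1 - pi) ** (n - x)
    return arrangements * one_arrangement

n, pi = 5, 0.25  # five questions, four choices per question

# Map each possible number of correct guesses to its probability.
distribution = {x: prob_x_correct(x, n, pi) for x in range(n + 1)}

for x, p in distribution.items():
    print(x, comb(n, x), round(p, 4))

# The probabilities of all possible values must add up to one.
print("total:", sum(distribution.values()))
```

Separating the count of arrangements from the probability of one arrangement mirrors the hand calculation: every outcome with the same number of 1's has the same probability, so only the count changes from row to row.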
As an example of the combination counting formula, the number of arrangements of 3 items selected without replacement from 7 items is:

$$C(7, 3) = \frac{7!}{(7 - 3)!\,3!} = 35$$

The general formula for the binomial distribution is:

The Binomial Distribution Formula

$$f(x) = C(n, x)\,\pi^x (1 - \pi)^{n - x}$$

where
   n = number of trials
   x = number of successes
   C(n, x) = combinations (number of arrangements) of x successes in n trials
   π = probability of "success" in each trial

Example 15

Suppose you find a parking spot at IUPUI within ten minutes 65 percent of the time. You always arrive at the campus ten minutes before your class starts. Let X be the random variable denoting the number of times you will be late to your class in a seven-day period. Find the probability distribution of X.

Since you find a spot in ten minutes or under 65 percent of the time, the probability of being late to your class each time is 1 − 0.65 = 0.35. X is binomially distributed with the following parameters:

$$X \sim B(n = 7, \pi = 0.35)$$

| x | C(7, x) × 0.35^x × 0.65^(7−x) | f(x) |
|---|---|---|
| 0 | C(7,0)(0.35⁰)(0.65⁷) | 0.0490 |
| 1 | C(7,1)(0.35¹)(0.65⁶) | 0.1848 |
| 2 | C(7,2)(0.35²)(0.65⁵) | 0.2985 |
| 3 | C(7,3)(0.35³)(0.65⁴) | 0.2679 |
| 4 | C(7,4)(0.35⁴)(0.65³) | 0.1442 |
| 5 | C(7,5)(0.35⁵)(0.65²) | 0.0466 |
| 6 | C(7,6)(0.35⁶)(0.65¹) | 0.0084 |
| 7 | C(7,7)(0.35⁷)(0.65⁰) | 0.0006 |
| Total | | 1.0000 |

5.1. Cumulative Probability Distribution

The cumulative probability distribution gives the probability that the random variable X takes on a value which is at most (less than or equal to) a given value.

In the previous example, what is the probability that you will be late to class 3 or fewer times in a seven-day period? That is, find f(x ≤ 3). To answer this question, first write the cumulative probability distribution of X.
| x | f(x) | f(X ≤ x) |
|---|---|---|
| 0 | 0.0490 | 0.0490 |
| 1 | 0.1848 | 0.2338 |
| 2 | 0.2985 | 0.5323 |
| 3 | 0.2679 | 0.8002 |
| 4 | 0.1442 | 0.9444 |
| 5 | 0.0466 | 0.9910 |
| 6 | 0.0084 | 0.9994 |
| 7 | 0.0006 | 1.0000 |
| Total | 1.0000 | |

Thus, f(x ≤ 3) = f(x = 0) + f(x = 1) + f(x = 2) + f(x = 3) = 0.8002.

Using the above schedule, find the following probabilities:

1. f(2 ≤ x ≤ 5)
   f(2 ≤ x ≤ 5) = f(x = 2) + f(x = 3) + f(x = 4) + f(x = 5) = 0.7572
   Alternatively, using the cumulative probability column,
   f(2 ≤ x ≤ 5) = f(x ≤ 5) − f(x ≤ 1) = 0.9910 − 0.2338 = 0.7572

2. f(3 ≤ x < 5)
   f(3 ≤ x < 5) = f(x = 3) + f(x = 4) = 0.4121
   Alternatively, using the cumulative probability column,
   f(3 ≤ x ≤ 4) = f(x ≤ 4) − f(x ≤ 2) = 0.9444 − 0.5323 = 0.4121

3. f(x ≥ 5)
   f(x ≥ 5) = f(x = 5) + f(x = 6) + f(x = 7) = 0.0556
   Alternatively, using the cumulative probability column,
   f(x ≥ 5) = 1 − f(x ≤ 4) = 1 − 0.9444 = 0.0556

5.2. How to Use Excel to Find Binomial Probabilities

In Excel, you can find the above binomial probabilities using the following function:

   =BINOM.DIST(number_s, trials, probability_s, cumulative)

For example, given X ~ B(n = 7, π = 0.35):

Find f(x = 5)
   =BINOM.DIST(5,7,0.35,0) = 0.0466

Note that "0" is entered for "cumulative" in the formula. This means that we want the probability of x = 5, not x ≤ 5. For cumulative probability, "1" replaces "0" in the formula.

Find f(x ≤ 5)
   =BINOM.DIST(5,7,0.35,1) = 0.9910

Example

Toss a coin 10 times. Let X define the number of tails. Find the following probabilities using Excel.

1. f(x = 4)
   =BINOM.DIST(4,10,0.5,0) = 0.2051

2. f(x ≤ 4)
   =BINOM.DIST(4,10,0.5,1) = 0.3770

3. f(x ≥ 4) = 1 − f(x ≤ 3)
   =1-BINOM.DIST(3,10,0.5,1) = 0.8281

4. f(4 ≤ x ≤ 7) = f(x ≤ 7) − f(x ≤ 3)
   =BINOM.DIST(7,10,0.5,1)-BINOM.DIST(3,10,0.5,1) = 0.7734

5. f(4 < x < 7) = f(x ≤ 6) − f(x ≤ 4)
   =BINOM.DIST(6,10,0.5,1)-BINOM.DIST(4,10,0.5,1) = 0.4512

5.3. Expected Value and Variance of the Binomial Distribution

To show how to find the expected value and variance of a binomial distribution, use the following familiar example.

Example

Suppose you find a parking spot at IUPUI within ten minutes 65 percent of the time. You always arrive at the campus ten minutes before your class starts. Let X define the number of times you will be late to your class in a seven-day period. What is the average number of times you will be late to class in a seven-day period? What are the variance and standard deviation of x?

Here you are looking for µ = E(x) and var(x). Use the probability distribution determined above.

| x | f(x) | x f(x) | (x − µ)² f(x) |
|---|---|---|---|
| 0 | 0.0490 | 0.0000 | 0.2943 |
| 1 | 0.1848 | 0.1848 | 0.3885 |
| 2 | 0.2985 | 0.5970 | 0.0604 |
| 3 | 0.2679 | 0.8036 | 0.0810 |
| 4 | 0.1442 | 0.5770 | 0.3465 |
| 5 | 0.0466 | 0.2330 | 0.3030 |
| 6 | 0.0084 | 0.0502 | 0.1054 |
| 7 | 0.0006 | 0.0045 | 0.0133 |
| | | µ = 2.4500 | var(x) = 1.5925 |

Since this is a binomial distribution, there are specific formulas that allow you to find the expected value and the variance of the distribution without using a worksheet. These formulas are:

The mean or expected value is:

$$\mu = E(x) = n\pi$$

The variance is:

$$\mathrm{var}(x) = n\pi(1 - \pi)$$

The standard deviation is:

$$\mathrm{sd}(x) = \sqrt{n\pi(1 - \pi)}$$

Thus

$$E(x) = 7(0.35) = 2.45$$

and

$$\mathrm{var}(x) = 7(0.35)(0.65) = 1.5925$$

so that

$$\mathrm{sd}(x) = \sqrt{1.5925} = 1.2619$$
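The agreement between the worksheet method and the shortcut formulas can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` and the variable names are illustrative choices, not part of the original notes.

```python
from math import comb, sqrt

def binomial_pmf(x, n, pi):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

# Parking example: seven days, probability of being late 0.35 each day.
n, pi = 7, 0.35
pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]

# Worksheet method: weighted mean, then weighted mean of squared deviations.
mu_worksheet = sum(x * p for x, p in enumerate(pmf))
var_worksheet = sum((x - mu_worksheet) ** 2 * p for x, p in enumerate(pmf))

# Shortcut formulas for the binomial distribution.
mu_formula = n * pi
var_formula = n * pi * (1 - pi)

print(f"mean:     worksheet {mu_worksheet:.4f}, formula {mu_formula:.4f}")
print(f"variance: worksheet {var_worksheet:.4f}, formula {var_formula:.4f}")
print(f"sd:       {sqrt(var_formula):.4f}")
```

Because the worksheet here uses unrounded probabilities, its results match the formulas to within floating-point error, whereas a hand worksheet built from four-decimal probabilities can differ slightly in the last digit.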