STA 2023 Elementary Statistics
Lecture Notes
Chapter 4 – Discrete Probability Distributions
Professor Achenbach

Random Variables

A random variable is a variable whose value depends on the outcome of a probability experiment. As in algebra, random variables are represented by letters.

Examples:
T = the number of tails when a coin is flipped 3 times.
s = the sum of the values showing when two dice are rolled.
h = the height of a woman chosen at random from a group.
V = the liquid volume of soda in a can marked 12 oz.

There are two basic types of random variables:
Discrete Random Variables – have a finite or countable number of possible values.
Continuous Random Variables – can take on any value in some interval.

Examples:
The variables T and s from above are discrete random variables.
The variables h and V from above are continuous random variables.

Probability Distributions of Discrete Random Variables

A probability distribution for a discrete random variable x is a list of each possible value for x together with the probability that, when the experiment is run, x will have that value. This probability is denoted by P(x).

Examples:
As above, let T be the random variable that represents the number of tails obtained when a coin is flipped three times. Then T has 4 possible values: 0, 1, 2, and 3. The probability distribution for T is given in the following table:

T       0      1      2      3
P(T)    1/8    3/8    3/8    1/8

A statistics class of 25 students is given a 5-point quiz. Three students scored 0, 1 scored 1, 4 scored 2, 8 scored 3, 6 scored 4, and 3 students scored 5.
If a student is chosen at random, and the random variable s is the student's quiz score, then the probability distribution of s is:

s       0       1       2       3       4       5
P(s)    0.12    0.04    0.16    0.32    0.24    0.12

Note: For any discrete random variable x:
\[ 0 \le P(x) \le 1 \quad \text{and} \quad \sum P(x) = 1 \]

Finding Probabilities from a Probability Distribution

Since a random variable can only take on one value at a time, the events of a variable assuming two different values are always mutually exclusive. The probability of the variable taking on any number of different values can thus be found by simply adding the appropriate probabilities.

Exercises:
Find the probability that a student scored 3 or more on the quiz from the previous example.
Find the probability that a student did not get a perfect score.
Find the probability of getting 2 or more tails when a coin is flipped 3 times.
Find the probability of getting at least one tail.
Find the missing probability in the following distribution:

X       -3      0       5       13
P(X)    0.21    0.15    ?       0.33

Mean or Expected Value

The mean or expected value of a random variable x is the average value that we should expect for x over many trials of the experiment.

Notation: The mean or expected value of a random variable x will be represented by \(\mu(x)\) or \(E(x)\).

We can calculate the mean theoretically by using the formula:
\[ E(x) = \mu(x) = \sum x \cdot P(x) \]

Examples:
The expected value of the random variable T from above is:
\[ E(T) = \sum T \cdot P(T) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 2 \cdot \tfrac{3}{8} + 3 \cdot \tfrac{1}{8} = 0 + \tfrac{3}{8} + \tfrac{6}{8} + \tfrac{3}{8} = \tfrac{12}{8} = \tfrac{3}{2} \]
Thus if 3 coins are flipped a large number of times, we should expect the average number of tails (per 3 flips) to be about 1.5.

The mean of the random variable s from above is:
\[ E(s) = \sum s \cdot P(s) = (0)(.12) + (1)(.04) + (2)(.16) + (3)(.32) + (4)(.24) + (5)(.12) = 0 + .04 + .32 + .96 + .96 + .60 = 2.88 \]
Note that this is actually the class average on the quiz as well.

Exercise: Suppose an instant lottery ticket is purchased for $2. The possible prizes are $0, $2, $20, $200, and $1000.
Let Z be the random variable representing the amount won on the ticket, and suppose Z has the following distribution:

Z       0       2       20      200     1000
P(Z)    ?       .2      .05     .001    .0001

Determine P(0).
Determine E(Z) and interpret its meaning.
How much should you expect to gain or lose on average per ticket?

Variance and Standard Deviation

Often, we are also interested in how much the values of a random variable differ from trial to trial. To measure this, we can define the variance and standard deviation for a random variable.

For a random variable x, the variance of x, denoted by \(\sigma^2(x)\), can be calculated by the formula:
\[ \sigma^2(x) = \sum (x - \mu)^2 \, P(x) \]

The standard deviation of x, denoted by \(\sigma(x)\), is just the square root of \(\sigma^2(x)\):
\[ \sigma(x) = \sqrt{\sum (x - \mu)^2 \, P(x)} \]

As before, the standard deviation measures roughly how far a value of x typically falls from the mean.

Calculating Variance and Standard Deviation

The calculation of standard deviation for a random variable is similar to the calculation of weighted standard deviation we made for data in frequency tables. (In fact, P(x) can just be thought of as the relative frequency of x.) The calculation can be made using the following steps:

1. Calculate \(\mu(x)\).
2. Subtract the mean from each of the possible values of x. Recall that these are called the deviations of the x values.
3. Square each of the deviations calculated in Step 2.
4. Multiply each squared deviation calculated in Step 3 by the corresponding probability P(x).
5. Sum the results of Step 4. This is \(\sigma^2(x)\).
6. Take the square root of the result of Step 5 to obtain \(\sigma(x)\).

Exercises:
Calculate the standard deviation of the random variable T from above.
Calculate the standard deviation of the random variable Z from above.

Binomial Random Variables

A discrete random variable x is said to have a binomial distribution if x satisfies the following conditions:
An experiment is repeated for a fixed number of trials n.
All trials of the experiment are independent from one another.
All possible outcomes for each trial of the experiment can be divided into two complementary events, one S called "success" and one F called "failure".
The probability of success P(S) has a constant value of p for every trial and the probability of failure P(F) has a constant value of q for every trial. Note: \(q = 1 - p\).
The random variable x counts the number of trials on which S (success) occurred.

Examples:
Consider the experiment of flipping a coin 5 times. If we let the event of getting tails on a flip be considered "success", and if the random variable T represents the number of tails obtained, then T will be binomially distributed with \(n = 5\), \(p = \tfrac{1}{2}\), and \(q = \tfrac{1}{2}\).

A student takes a 10-question multiple-choice quiz and guesses each answer. For each question, there are 4 possible answers, only one of which is correct. If we consider "success" to be getting a question right and consider the 10 questions as 10 independent trials, then the random variable X representing the number of correct answers will be binomially distributed with \(n = 10\), \(p = \tfrac{1}{4}\), and \(q = \tfrac{3}{4}\).

Fourteen percent of flights from Tampa International Airport are delayed. If 20 flights are chosen at random, then we can consider each flight to be an independent trial. If we define a successful trial to be that a flight takes off on time, then the random variable z representing the number of on-time flights will be binomially distributed with \(n = 20\), \(p = .86\), and \(q = .14\).

Calculating Probabilities for a Binomial Random Variable

If X is a binomial random variable with n trials, probability of success p, and probability of failure q, then, because the trials are independent, the multiplication rule gives the probability of any one particular outcome in which there are x successes (and therefore \(n - x\) failures):
\[ \underbrace{(p \cdot p \cdots p)}_{x \text{ successes}} \cdot \underbrace{(q \cdot q \cdots q)}_{n - x \text{ failures}} = p^x q^{n-x} \]

To count the number of outcomes with x successes and \(n - x\) failures, we observe that the x successes could occur on any x of the n trials.
The number of ways of choosing x trials out of n is \({}_nC_x\), so the probability of x successes becomes:
\[ P(x) = {}_nC_x \, p^x q^{n-x} \]

Examples:
As in the previous examples, let T be the random variable representing the number of tails when a coin is flipped 3 times. Using the formula above, we can calculate the probability of exactly 2 tails as:
\[ P(2) = {}_3C_2 \left(\tfrac{1}{2}\right)^2 \left(\tfrac{1}{2}\right)^1 = 3 \left(\tfrac{1}{2}\right)^3 = \tfrac{3}{8} = .375 \]

Let the random variable X represent the number of correct answers on the multiple-choice quiz described above. Then the probability of a student guessing 3 answers correctly is:
\[ P(3) = {}_{10}C_3 \left(\tfrac{1}{4}\right)^3 \left(\tfrac{3}{4}\right)^7 = 120 \cdot \tfrac{1}{64} \cdot \tfrac{2187}{16384} \approx .25 \]
while the probability of guessing seven answers correctly is:
\[ P(7) = {}_{10}C_7 \left(\tfrac{1}{4}\right)^7 \left(\tfrac{3}{4}\right)^3 = 120 \cdot \tfrac{1}{16384} \cdot \tfrac{27}{64} \approx .003 \]

Exercises:
Let z be the random variable defined above as the number of on-time flights out of 20 at Tampa International Airport.
Find the probability that 15 out of the 20 flights depart on time.
Find the probability that only 6 out of the 20 flights depart on time.
Find the probability that all 20 flights are on time.

Properties of Binomial Distributions

In many cases, we are interested in the mean and standard deviation of a binomial random variable. If x is a binomial random variable with n trials, probability of success p, and probability of failure q, then the mean and standard deviation of x can be calculated by the following:
\[ E(x) = \mu(x) = np \quad \text{and} \quad \sigma(x) = \sqrt{npq} \]

Example:
For T, the number of tails when a coin is flipped 3 times:
\[ E(T) = 3 \cdot \tfrac{1}{2} = \tfrac{3}{2} \quad \text{and} \quad \sigma(T) = \sqrt{3 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2}} = \sqrt{\tfrac{3}{4}} \approx .866 \]

Exercises:
Compute \(\mu(X)\) and \(\sigma(X)\) for the random variable X from the multiple-choice test example.
Compute \(\mu(z)\) and \(\sigma(z)\) for the random variable z from the example using flights from Tampa International.

Note: A binomial distribution is symmetric if \(p = q\), left skewed if \(p > q\), and right skewed if \(p < q\).

Using the TI-83 to Calculate Mean and Standard Deviation

The mean and standard deviation for a discrete random variable can be calculated in one step using the TI-83.
Step 1: Enter the Probability Distribution
Enter the list of possible x values in L1 and the corresponding P(x) values in L2, as was done for frequency tables in Chapter 2.

Step 2: Calculate
Choose 1-Var Stats from the CALC menu, as when finding the mean of a data set, and when it appears on the screen enter L1, a comma, and L2:

    1-Var Stats L1, L2

Press [ENTER] to calculate.

Step 3: Read the Calculated Values
\(\bar{x}\) is the mean of the random variable.
\(\sigma x\) is the standard deviation of the random variable.

Exercise: Use your calculator to recalculate the mean and standard deviation of the random variable Z from the lottery example above.

Using the TI-83 to Calculate Binomial Probabilities

If x is a binomial random variable with number of trials n and probability of success p, then P(x) can be calculated on the TI-83 as follows:

Press [2nd][VARS] to choose DISTR.
From the menu, use the down arrow to select 0: binompdf( and press [ENTER].
binompdf( should appear on the screen. Enter the number of trials n, probability of success p, and desired number of successes x, separated by commas, so that the screen reads:

    binompdf(n,p,x)

To calculate, press [ENTER] again.

Examples:
binompdf(10, .25, 3) yields .2502822876
binompdf(10, .25, 7) yields .0030899048

Creating Binomial Probability Distributions with the TI-83

Omitting the last entry x in the parentheses when using the binompdf function will yield a list of all binomial probabilities, starting with \(x = 0\) and ending with \(x = n\).

Example: Entering binompdf(3, .5) yields:

    {.125, .375, .375, .125}

the probability distribution of the binomial experiment of flipping three coins.

Exercise: Use the TI-83 to construct a probability distribution for the random variable X defined above as the number of correct answers on a multiple-choice quiz.

Cumulative Binomial Probabilities with the TI-83

The binomcdf function can also be accessed from the DISTR menu, and binomcdf(n, p, x) gives the probability of at most x successes.
Example: The probability of at most 12 on-time flights (out of 20) from Tampa can be found on the TI-83 as follows:

    binomcdf(20, 0.86, 12) = .0038413869

Exercise: Use your calculator to calculate the probability that at least 15 flights (out of 20) from Tampa are on time.

Note: As with binompdf, omitting the last entry x when using the binomcdf function will produce the list of cumulative probabilities, starting with \(x = 0\) and ending with \(x = n\).
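
For readers without a TI-83 at hand, the calculator results in these notes can be cross-checked with a short Python sketch. This is not part of the original notes. The function names one_var_stats, binompdf, and binomcdf are hypothetical helpers chosen only to mirror the calculator commands, and nothing beyond the Python standard library (math.comb and math.sqrt) is assumed.

```python
from math import comb, sqrt


def one_var_stats(values, probs):
    """Mean and standard deviation of a discrete random variable.

    Plays the role of '1-Var Stats L1, L2': `values` corresponds to L1
    and `probs` to L2. (Hypothetical helper, not a calculator API.)
    """
    mean = sum(x * p for x, p in zip(values, probs))
    # Weighted sum of squared deviations, as in the six-step procedure.
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, sqrt(variance)


def binompdf(n, p, x=None):
    """P(exactly x successes) = nCx * p^x * q^(n-x).

    If x is omitted, return the list of probabilities for x = 0..n.
    """
    q = 1 - p
    if x is None:
        return [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]
    return comb(n, x) * p**x * q ** (n - x)


def binomcdf(n, p, x):
    """P(at most x successes): the sum of binompdf over 0..x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))


# Three coin flips: distribution, then mean and standard deviation of T.
tails = binompdf(3, 0.5)
print(tails)
print(one_var_stats([0, 1, 2, 3], tails))

# Quiz-guessing and Tampa flight examples from the notes.
print(binompdf(10, 0.25, 3))
print(binompdf(10, 0.25, 7))
print(binomcdf(20, 0.86, 12))
```

Under these assumptions, the printed values should agree with the calculator results quoted above up to rounding. A probability of "at least x" successes can be obtained from the complement, 1 - binomcdf(n, p, x - 1).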