Chapter 5
Discrete Random Variables
Probability Distributions

Overview
Random Variables
Mean and Standard Deviation for Random Variables
Binomial Probability Distributions
Mean, Standard Deviation for the Binomial Distribution

Overview
This chapter deals with the construction of discrete probability distributions by combining the methods of descriptive statistics presented in Chapters 2 and 3 with those of probability presented in Chapter 4.

Combining Descriptive Methods and Probabilities
In this chapter we will construct probability distributions by presenting possible outcomes along with the relative frequencies we expect.

Random Variables
A random variable is a variable whose value is a numerical outcome of a probability experiment.
We usually denote random variables by capital letters near the end of the alphabet, such as X or Y.
Mathematically speaking, a random variable X is a function that assigns a number to every outcome of the sample space S.

Random Variables
There are two kinds of random variables:
Discrete Random Variables
Continuous Random Variables

A random variable is discrete if it has a finite or countable number of possible outcomes that can be listed.
[Figure: number line, x from 0 to 10]
A random variable is continuous if it has an uncountable number of possible outcomes, represented by an interval on the number line.
[Figure: number line, x from 0 to 10]

Example: Decide if the random variable X is discrete or continuous.
a.) The distance your car travels on a tank of gas
The distance your car travels is a continuous random variable because it is a measurement that cannot be counted. (Measurements are modeled as continuous random variables.)
b.) The number of students in a statistics class
The number of students is a discrete random variable because it can be counted.
Discrete Random Variables
More Discrete Random Variable Examples

  Experiment                                   Random Variable    Possible Values
  Make 100 sales calls                         # Sales            0, 1, 2, ..., 100
  Inspect 70 radios                            # Defective        0, 1, 2, ..., 70
  Answer 33 questions                          # Correct          0, 1, 2, ..., 33
  Count cars at toll between 11:00 and 1:00    # Cars arriving    0, 1, 2, ...

Notation
If X is a random variable, then its numerical values are denoted by the corresponding lower case letter x.
The notation $P(X = x)$ denotes the "probability of the event that makes X = x".
This probability is denoted by $p(x)$; that is, $p(x) = P(X = x)$ is the probability distribution function (pdf in the TI-83 notation).

Discrete Probability Distributions
A discrete probability distribution for the discrete random variable X lists each possible value the variable X can assume, together with its probability.
This probability distribution is often expressed in the format of a graph, table, or formula.
In this course a discrete random variable X will usually have a finite number of possible values.

Discrete Probability Distributions
The probability distribution of X lists the values and their probabilities as shown in the table:

  Value of X     $x_1$   $x_2$   ...   $x_k$
  Probability    $p_1$   $p_2$   ...   $p_k$

The probabilities $p_i$ must satisfy two requirements:
1. Every probability $p_i$ is a number between 0 and 1.
2. $p_1 + p_2 + \cdots + p_k = 1$

Probability Distributions and Histograms
Example
The spinner below is divided into two sections. The probability of landing on the 1 is 0.25. The probability of landing on the 2 is 0.75. Let X be the number the spinner lands on. Construct a probability distribution for the random variable X.
[Figure: spinner with sections 1 and 2]

  x    P(X = x)
  1    0.25
  2    0.75

Each probability is between 0 and 1. The sum of the probabilities is 1.

Example Continued
The spinner below is spun two times. The probability of landing on the 1 is 0.25. The probability of landing on the 2 is 0.75. Let X be the sum of the two spins. Construct a probability distribution for the random variable X.
The possible sums are 2, 3, and 4.
$P(\text{sum of 2}) = 0.25 \times 0.25 = 0.0625$
Spin a 1 on the first spin “and” spin a 1 on the second spin.

Example Continued
$P(\text{sum of 3}) = 0.25 \times 0.75 = 0.1875$
Spin a 1 on the first spin “and” spin a 2 on the second spin.
“or”
$P(\text{sum of 3}) = 0.75 \times 0.25 = 0.1875$
Spin a 2 on the first spin “and” spin a 1 on the second spin.
Adding the two cases: $P(\text{sum of 3}) = 0.1875 + 0.1875 = 0.375$

Example Continued
$P(\text{sum of 4}) = 0.75 \times 0.75 = 0.5625$
Spin a 2 on the first spin “and” spin a 2 on the second spin.

  X = Sum of spins    P(X = x)
  2                   0.0625
  3                   0.375
  4                   0.5625

Each probability is between 0 and 1, and the sum of the probabilities is 1.

Example Continued
Graph the probability distribution using a histogram.
[Figure: histogram "Sum of Two Spins"; x-axis: sum (2, 3, 4), y-axis: probability p(x), with bar heights 0.0625, 0.375, and 0.5625]

Another Quick Example
Experiment: Toss 2 coins. X = count of the number of tails.

Probability Distribution

  Values of X    Probabilities, p(x)
  0              p(0) = P(X = 0) = 1/4 = .25
  1              p(1) = P(X = 1) = 2/4 = .50
  2              p(2) = P(X = 2) = 1/4 = .25

Visualizing The Discrete Probability Distribution
Table:

  X = # Tails        0      1      2
  p(x) = P(X = x)    .25    .50    .25

Graph:
[Figure: bar graph of p(x) against x = 0, 1, 2, with heights .25, .50, .25]

Function:
$P(X = x) = \frac{n!}{(n-x)!\,x!}\, p^x (1-p)^{n-x}$

More Coins
What is the probability distribution of the discrete random variable X that counts the number of heads in four tosses of a coin?
We can derive this distribution if we make two reasonable assumptions.
The coin is balanced, so each toss is equally likely to give H or T.
The coin has no memory, so tosses are independent. That is, the outcome of a toss does not depend on the outcomes of the previous ones.

The outcome of four tosses is a sequence of heads and tails such as HTTH. There are 16 possible outcomes.
The picture below gives the sample space along with the value of X for each outcome.
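The sample space and the value of X for each outcome can also be enumerated programmatically. The following is a minimal Python sketch under the two assumptions stated above (balanced coin, independent tosses). The variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 16 equally likely outcomes of four tosses, e.g. "HTTH".
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

# X = number of heads; count how many outcomes give each value of X.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Each outcome has probability 1/16, so P(X = x) = (# outcomes with x heads) / 16.
distribution = {
    x: Fraction(count, len(outcomes))
    for x, count in sorted(heads_counts.items())
}

for x, prob in distribution.items():
    print(f"P(X = {x}) = {prob} = {float(prob):.4f}")
```

Using exact fractions keeps the probabilities free of rounding, so their sum can be checked to equal 1 exactly.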
The Table for this Probability Distribution

  x (number of heads)    0         1       2        3       4
  P(X = x)               0.0625    0.25    0.375    0.25    0.0625

Probability Histogram for this Distribution
[Figure: probability histogram for the number of heads in four tosses]

Some Probability Calculations
The probability of tossing at least two heads is
$P(X \ge 2) = 0.375 + 0.25 + 0.0625 = 0.6875$
The probability of tossing at least one head is found by use of the complement rule:
$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.0625 = 0.9375$

Mean and Standard Deviation of a Discrete Random Variable

Introducing the Mean of a Discrete Random Variable - Example
[Table: ages of eight students]
[Table: probability distribution of X, the age of a randomly selected student]
Express the mean age of the eight students in terms of the probability distribution of the random variable X.

Mean of a Discrete Random Variable
The mean of a discrete random variable is given by
$\mu = \sum x\,P(X = x)$
Each value of x is multiplied by its corresponding probability and the products are added.

Example: Find the mean of the probability distribution for the sum of the two spins.

  x    P(x)      xP(x)
  2    0.0625    2(0.0625) = 0.125
  3    0.375     3(0.375) = 1.125
  4    0.5625    4(0.5625) = 2.25

$\sum x\,P(X = x) = 3.5$
The mean for the two spins is 3.5.

Interpretation
The following interpretation is commonly known as the law of averages and in mathematical circles as the law of large numbers.
In a large number of independent observations of a random variable X, the average value of those observations will be approximately equal to the mean $\mu$ of X.
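The law of large numbers can be illustrated with a short simulation. The Python sketch below computes the mean of the two-spin distribution with the formula $\mu = \sum x\,P(X = x)$ and then averages simulated observations. The function name, the seed, and the sample sizes are arbitrary choices made for this illustration, not part of the original slides.

```python
import random

# Distribution of the sum of two spins, taken from the example above.
distribution = {2: 0.0625, 3: 0.375, 4: 0.5625}

# Mean: multiply each value by its probability and add the products.
mu = sum(x * p for x, p in distribution.items())

def average_of_observations(n_observations, seed=0):
    """Average of n simulated observations of X (the seed is an arbitrary choice)."""
    rng = random.Random(seed)
    values = list(distribution)
    weights = list(distribution.values())
    sample = rng.choices(values, weights=weights, k=n_observations)
    return sum(sample) / n_observations

print(f"mu = {mu}")
for n in (10, 1_000, 100_000):
    print(n, average_of_observations(n))
```

As the number of observations grows, the printed averages should settle near the mean, though any single run with few observations can differ noticeably.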
Variance of a Discrete Random Variable
The variance of a discrete random variable is given by
$\sigma^2 = \sum (x - \mu)^2\,P(X = x)$

Example: Find the variance of the probability distribution for the sum of the two spins. The mean is 3.5.

  x    p(x)      x - μ    (x - μ)^2    p(x)(x - μ)^2
  2    0.0625    -1.5     2.25         0.141
  3    0.375     -0.5     0.25         0.094
  4    0.5625     0.5     0.25         0.141

$\sigma^2 = \sum P(X = x)(x - \mu)^2 = 0.375$ (the rounded table entries add to 0.376)

Standard Deviation of a Discrete Random Variable
The standard deviation of a discrete random variable is
$\sigma = \sqrt{\sigma^2} = \sqrt{\sum (x - \mu)^2\,P(X = x)}$

Example: Find the standard deviation of the probability distribution for the sum of the two spins. The mean is 3.5.
Using the variance from the table above, $\sigma = \sqrt{\sigma^2} = \sqrt{0.375} \approx 0.612$

Expected Value
The expected value of a discrete random variable is equal to the mean of the random variable.
Expected Value $= E(X) = \mu = \sum x\,P(X = x)$

Example: At a raffle, 500 tickets are sold for $1 each for a prize of $100. What is the expected value of your gain?
Your gain for the $100 prize is $100 - $1 = $99.
Write a probability distribution for the possible gains.

  Gain, x    P(x)
  $99        1/500
  -$1        499/500    (winning no prize)

$E(X) = \sum x\,P(x) = 99 \cdot \frac{1}{500} + (-1) \cdot \frac{499}{500} = -0.80$ (dollars)
Because the expected value is negative, you can expect to lose an average of $0.80 for each ticket you buy.

Binomial Distributions

Factorial Notation
$n! = n(n-1)(n-2)\cdots 2 \cdot 1$, with $0! = 1$.
The binomial coefficient is $\binom{n}{x} = \frac{n!}{(n-x)!\,x!}$.
The interpretation for this number is the following: the binomial coefficient represents the number of different ways in which x objects can be selected from a list of n distinct objects when the order of the selection is not important.

Binomial Experiments
A binomial experiment is a probability experiment that satisfies the following conditions.
1. The experiment is repeated for a fixed number of trials, where each trial is independent of other trials.
2. There are only two possible outcomes of interest for each trial.
The outcomes can be classified as a success s or as a failure f.
3. The probability of a success, p = P(s), is the same for each trial. p is called the success probability.
4. The random variable X counts the number of successful trials.

Notation for Binomial Experiments

  Symbol      Description
  n           The number of times a trial is repeated.
  p = P(s)    The probability of success in a single trial.
  q = P(f)    The probability of failure in a single trial; q = 1 - p.
  X           The random variable that counts the number of successes in n trials.
              The possible values of X are x = 0, 1, 2, 3, ..., n.

Example: Decide whether the experiment is a binomial experiment. If it is, specify the values of n, p, and q, and list the possible values of the random variable X. If it is not a binomial experiment, explain why.
You randomly select a card from a deck of cards, and note if the card is an Ace. You then put the card back and repeat this process 8 times.
This is a binomial experiment. Each of the 8 selections represents an independent trial because the card is replaced before the next one is drawn. There are only two possible outcomes: either the card is an Ace or not.
$n = 8$, $p = \frac{4}{52} = \frac{1}{13}$, $q = 1 - \frac{1}{13} = \frac{12}{13}$, $x = 0, 1, 2, 3, 4, 5, 6, 7, 8$

Example: Decide whether the experiment is a binomial experiment. If it is, specify the values of n, p, and q, and list the possible values of the random variable X. If it is not a binomial experiment, explain why.
You roll a die 10 times and note the number the die lands on.
This is not a binomial experiment. While each trial (roll) is independent, there are more than two possible outcomes: 1, 2, 3, 4, 5, and 6.

More Binomial Experiments
# of reds in 15 spins of a roulette wheel
# of defective items in a batch of 25 items
# of correct answers on a 33 question exam
# of customers who purchase out of 100 customers who enter a store

Binomial Probability Formula
In a binomial experiment, the probability of exactly x successes in n trials is
$P(X = x) = \binom{n}{x} p^x q^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^x q^{n-x}$
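The binomial probability formula translates directly into code. The sketch below is one possible Python version. The function name `binomial_pmf` is a hypothetical choice for this illustration, and the values n = 8 and p = 1/13 come from the card example above.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial experiment with n trials and success probability p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Card example above: n = 8 draws with replacement, p = 4/52 = 1/13.
n, p = 8, 1 / 13
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in distribution.items():
    print(f"P(X = {x}) = {prob:.4f}")
```

Here `math.comb(n, x)` computes the binomial coefficient, so the factorials do not have to be evaluated by hand.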
Example: A bag contains 10 chips; 3 of the chips are red, 5 are white, and 2 are blue. Three chips are selected, with replacement. Find the probability that you select exactly one red chip.
p = the probability of selecting a red chip $= \frac{3}{10} = 0.3$
q = 1 - p = 0.7
n = 3, x = 1
$P(X = 1) = \binom{3}{1}(0.3)^1(0.7)^2 = 0.441$

Example: A bag contains 10 chips; 3 of the chips are red, 5 are white, and 2 are blue. Four chips are selected, with replacement. Create a probability distribution for the number of red chips selected.
p = the probability of selecting a red chip $= \frac{3}{10} = 0.3$
q = 1 - p = 0.7
n = 4, x = 0, 1, 2, 3, 4

  x           0        1        2        3        4
  P(X = x)    0.240    0.412    0.265    0.076    0.008

The binomial probability formula is used to find each probability.

Finding Probabilities
Example: The following probability distribution represents the probability of selecting 0, 1, 2, 3, or 4 red chips when 4 chips are selected.

  x           0       1        2        3        4
  P(X = x)    0.24    0.412    0.265    0.076    0.008

a.) Find the probability of selecting no more than 3 red chips.
b.) Find the probability of selecting at least 1 red chip.

a.) $P(\text{no more than 3}) = P(X \le 3) = P(0) + P(1) + P(2) + P(3) = 0.24 + 0.412 + 0.265 + 0.076 = 0.993$
(With unrounded probabilities the result is $1 - P(4) = 1 - 0.0081 = 0.9919$; the difference comes from rounding in the table.)
b.) $P(\text{at least 1}) = P(X \ge 1) = 1 - P(0) = 1 - 0.24 = 0.76$

Graphing the distribution
Example: The following probability distribution represents the probability of selecting 0, 1, 2, 3, or 4 red chips when 4 chips are selected. Graph the distribution using a histogram.

  x           0       1        2        3        4
  P(X = x)    0.24    0.412    0.265    0.076    0.008

[Figure: histogram "Selecting Red Chips"; x-axis: number of red chips (0 to 4), y-axis: probability P(x)]

Mean and Standard Deviation of a Binomial Distribution
Population Parameters of a Binomial Distribution
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard deviation: $\sigma = \sqrt{npq}$

Example: One out of 5 students at a local college say that they skip breakfast in the morning. Find the mean, variance, and standard deviation if 10 students are randomly selected.
$n = 10$, $p = \frac{1}{5} = 0.2$, $q = 0.8$
$\mu = np = 10(0.2) = 2$
$\sigma^2 = npq = (10)(0.2)(0.8) = 1.6$
$\sigma = \sqrt{npq} = \sqrt{1.6} \approx 1.3$

Thinking Challenge
You’re taking a 33 question multiple choice test. Each question has 4 choices. Clueless on 1 question, you decide to guess. What’s the chance you’ll get it right?
If you guessed on all 33 questions, what would your grade most likely be? Would you pass?

Answer
Guessing 33 Answers on a Test
The chance of guessing a single question correctly is 1/4 = 0.25. With X = the number of correct answers:

  Data
  Sample size               33
  Probability of success    0.25

  Statistics
  Mean                      8.25
  Variance                  6.1875
  Standard deviation        2.487469

  X        P(X)
  0        0.000075339
  1        0.000828733
  2        0.004419907
  3        0.015224123
  4        0.038060308
  5        0.073583262
  6        0.114462852
  7        0.147166524
  8        0.159430401
  9        0.147620742
  10       0.118096593
  11       0.082309747
  12       0.050300401
  13       0.027084831
  14       0.012897539
  15       0.005445627
  16       0.002042110
  17       0.000680703
  18       0.000201690
  19       0.000053076
  20       0.000012384
  21       0.000002556
  22       0.000000465
  23       0.000000074
  24       0.000000010
  25       0.000000001
  26-33    0.000000000 (each)

The most likely number of correct answers is 8, with P(X = 8) ≈ 0.159. That is about 24% of the questions, so passing by guessing alone is very unlikely.

[Figure: "Probability Distribution of Total Number of Right Answers"; x-axis: number of successes (0 to 33), y-axis: P(X)]

Another Thinking Challenge
You’re a telemarketer selling service contracts for Macy’s. You have sold 20 in your last 100 calls (p = .20). If you call 12 people tonight, what’s the probability of
A. No sales?
B. Exactly 2 sales?
C. At most 2 sales?
D. At least 2 sales?

Solution
Using the binomial probability formula with n = 12 and p = .20:
A. p(0) = .0687
B. p(2) = .2835
C. p(at most 2) = p(0) + p(1) + p(2) = .0687 + .2062 + .2835 = .5584
D.
p(at least 2) = p(2) + p(3) + ... + p(12) = 1 - [p(0) + p(1)] = 1 - .0687 - .2062 = .7251

More Discrete Probability Distributions

Geometric Distribution
A geometric distribution is a discrete probability distribution of a random variable X that satisfies the following conditions.
1. A trial is repeated until a success occurs.
2. The repeated trials are independent of each other.
3. The probability of a success p is constant for each trial.
The probability that the first success will occur on trial x is
$p(x) = P(X = x) = p\,q^{x-1}$, where $q = 1 - p$

Geometric Distribution
Example: A fast food chain puts a winning game piece on every fifth package of French fries. Find the probability that you will win a prize
A. with your third purchase of French fries,
B. with your third or fourth purchase of French fries.
p = 0.20, q = 0.80
A. x = 3: $P(3) = (0.2)(0.8)^{3-1} = 0.128$
B. x = 3, 4: $P(3 \text{ or } 4) = P(3) + P(4) = 0.128 + 0.1024 \approx 0.230$

Poisson Distribution
The Poisson distribution is a discrete probability distribution of a random variable X that satisfies the following conditions.
1. The experiment consists of counting the number of times, x, an event occurs in a given interval. The interval can be an interval of time, area, or volume.
2. The probability of the event occurring is the same for each interval.
3. The number of occurrences in one interval is independent of the number of occurrences in other intervals.
The probability of exactly x occurrences in an interval is
$P(x) = \frac{\mu^x e^{-\mu}}{x!}$
where $e \approx 2.71828$ and $\mu$ is the mean number of occurrences.

Poisson Distribution
Example: The mean number of power outages in the city of Brunswick is 4 per year. Find the probability that in a given year,
A. there are exactly 3 outages,
B. there are more than 3 outages.
A. $\mu = 4$, $x = 3$: $P(3) = \frac{4^3 (2.71828)^{-4}}{3!} \approx 0.195$
B. $P(\text{more than 3}) = 1 - P(x \le 3) = 1 - [P(3) + P(2) + P(1) + P(0)] = 1 - (0.195 + 0.147 + 0.073 + 0.018) \approx 0.567$
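The geometric and Poisson examples above can be checked numerically. The Python sketch below implements the two formulas as stated. The function names `geometric_pmf` and `poisson_pmf` are hypothetical names chosen for this illustration.

```python
from math import exp, factorial

def geometric_pmf(x, p):
    """P(first success occurs on trial x), for x = 1, 2, 3, ..."""
    return p * (1 - p) ** (x - 1)

def poisson_pmf(x, mu):
    """P(exactly x occurrences) when the mean number of occurrences is mu."""
    return mu**x * exp(-mu) / factorial(x)

# French fries example: p = 0.20.
p_third = geometric_pmf(3, 0.20)
p_third_or_fourth = geometric_pmf(3, 0.20) + geometric_pmf(4, 0.20)

# Power outage example: mu = 4 outages per year.
p_exactly_3 = poisson_pmf(3, 4)
p_more_than_3 = 1 - sum(poisson_pmf(x, 4) for x in range(4))

print(round(p_third, 3), round(p_third_or_fourth, 3))
print(round(p_exactly_3, 3), round(p_more_than_3, 3))
```

The "more than 3" probability uses the complement rule: it subtracts P(0) through P(3) from 1 instead of summing the infinitely many terms for x = 4, 5, 6, and so on.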