Chapter 4 Discrete Random Variables (离散型随机变量) Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 1 Discrete Random Variables 4.1 4.2 4.3 4.4 Two Types of Random Variables Discrete Probability Distributions The Binomial Distribution The Poisson Distribution Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 2 Random Variables (随机变量) A random variable is a variable that assumes numerical values that are determined by the outcome of an experiment. Example: Consider a random experiment in which a coin is tossed three times. Let X be the number of heads. Let H represent the outcome of a head and T the outcome of a tail. The possible outcomes for such an experiment: TTT, TTH, THT, THH, HTT, HTH, HHT, HHH Thus the possible values of X (number of heads) are 0,1,2,3. From the definition of a random variable, X as defined in this experiment, is a random variable. Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 3 Two Types of Random Variables Discrete random variable(离散型随机变量): Possible values can be counted or listed For example, the number of TV sets sold at the store in one day. Here x could be 0, 1, 2, 3, 4 and so forth. Continuous random variable (连续型随机变量): May assume any numerical value in one or more intervals For example, the waiting time for a credit card authorization, the interest rate charged on a business loan Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 4 Example: Two Types of Random Variables Question Random Variable x Type Family size x = Number of people in family reported on tax return Discrete(离散) Distance from home to store x = Distance in miles from home to a store Continuous(连续) Own dog or cat x = 1 if own no pet; Discrete = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 5 Discrete Probability Distributions (离散概率分布) The probability distribution of a discrete random variable is a table, graph, or formula that gives the probability associated with each possible value that the variable can assume Notation: Denote the values of the random variable by x and the value’s associated probability by p(x) Properties 1. For any value x of the random variable, p(x) 0 2. The probabilities of all the events in the sample space must sum to 1, that is, px 1 all x Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 6 Example 4.3: Number of Radios (Sold at Sound City in a Week) Let x be the random variable of the number of radios sold per week x has values x = 0, 1, 2, 3, 4, 5 Given sales history over past 100 weeks Let f be the number of weeks (of the past 100) during which x number of radios were sold Records tell us that f(0)=3 No radios have been sold in 3 of the weeks f(1)=20 One radios has been sold in 20 of the weeks f(2)=50 Two radios have been sold in 50 of the weeks f(3)=20 Three radios have been sold in 20 of the weeks f(4)=5 Four radios have been sold in 4 of the weeks f(5)=2 Five radios have been sold in 2 of the weeks No more than five radios were sold in any of the past 100 weeks Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 7 Frequency distribution of sales history over past 100 weeks # Radios, x Frequency Relative Frequency Probability, p(x) 0 f(0) =3 3/100 = 0.03 p(0) = 0.03 1 f(1) =20 20/100 = 0.20 p(1) = 0.20 2 f(2) =50 0.50 p(2) = 0.50 3 f(3) =20 0.20 p(3) = 0.20 4 f(4) = 5 0.05 p(4) = 0.05 5 f(5) = 2 0.02 P(5) = 0.02 100 1.00 1.00 Interpret the relative frequencies as probabilities So for any value x, f(x)/n = p(x) Assuming that sales remain stable over time Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 8 Example 4.3 Continued What is the chance that two radios will be sold in a week? P(x = 2) = 0.50 What is the chance that fewer than 2 radios will be sold in a week? Using the addition p(x < 2) = p(x = 0 or x = 1) rule for the mutually exclusive values of = p(x = 0) + p(x = 1) the random variable. = 0.03 + 0.20 = 0.23 What is the chance that three or more radios will be sold in a week? p(x ≥ 3) = p(x = 3, 4, or 5) = p(x = 3) + p(x = 4) + p(x = 5) = 0.20 + 0.05 + 0.02 = 0.27 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 9 Expected Value of a Discrete Random Variable The mean or expected value of a discrete random variable X is: X x p x All x is the value expected to occur in the long run and on average Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 10 Example 4.3: Number of Radios Sold at Sound City in a Week How many radios should be expected to be sold in a week? Calculate the expected value of the number of radios sold, X Radios, x Probability, p(x) x p(x) 0 1 2 3 4 5 p(0) = 0.03 p(1) = 0.20 p(2) = 0.50 p(3) = 0.20 p(4) = 0.05 p(5) = 0.02 1.00 0 0.03 = 0.00 1 0.20 = 0.20 2 0.50 = 1.00 3 0.20 = 0.60 4 0.05 = 0.20 5 0.02 = 0.10 2.10 • On average, expect to sell 2.1 radios per week Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 11 Variance and Standard Deviation The variance of a discrete random variable is: 2X x X 2 px All x • The variance is the average of the squared deviations of the different values of the random variable from the expected value The standard deviation is the square root of the variance X 2 X • The variance and standard deviation measure the spread of the values of the random variable from their expected value Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 12 Example 4.7: Number of Radios Sold at Sound City in a Week Calculate the variance and standard deviation of the number of radios sold at Sound City in a week Radios, x 0 1 2 3 4 5 Probability, p(x) p(0) = 0.03 p(1) = 0.20 p(2) = 0.50 p(3) = 0.20 p(4) = 0.05 p(5) = 0.02 1.00 Standard deviation X 0.89 0.9434 (x - X)2 p(x) (0 – 2.1)2 (0.03) = 0.1323 (1 – 2.1)2 (0.20) = 0.2420 (2 – 2.1)2 (0.50) = 0.0050 (3 – 2.1)2 (0.20) = 0.1620 (4 – 2.1)2 (0.05) = 0.1805 (5 – 2.1)2 (0.02) = 0.1682 0.8900 Variance X2 0.89 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 13 3. Let X denote the number of service visits required to fix a problem with a product purchased from a company. X 1 2 3 4 P(x) 0.6 0.2 0.15 0.05 1. What is the expected number of visits required to fix the product? 2. The variance of the number of visits required to fix the product? 3. The probability that it will take more than 2 visits? 4. Given that it takes more than one visit, what is the probability that it will take 4 visits? Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 14 Ans: a. E( X ) XP( X ) 1* 0.6 2 * 0.2 3 * 0.15 4 * 0.05 1.65 b. 2 Var(X) ( X ) 2 P( X ) (1 1.65) 2 * 0.6 (2 1.65) 2 * 0.2 (3 1.65) 2 * 0.15 (4 1.65) 2 * 0.05 0.8275 c. d. P( X 2) P( X 3) P( X 4) 0.15 0.05 0.2 P( X 4 | X 1) P( X 4, X 1) 0.05 0.125 P( X 1) 0.2 0.15 0.05 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 15 The Binomial Distribution (二项分布 ) The Binomial Experiment: 1. Experiment consists of n identical trials 2. Each trial results in either “success” or “failure” 3. Probability of success, p, is constant from trial to trial 4. Trials are independent Note: The probability of failure, q, is 1 – p and is constant from trial to trial If x is the total number of successes in n trials of a binomial experiment, then x is a binomial random variable Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 16 The Binomial Distribution #2 For a binomial random variable x, the probability of x successes in n trials is given by the binomial distribution: n! px = p x q n- x x!n - x ! • Note: n! is read as “n factorial” and n! = n × (n-1) × (n-2) × ... × 1 – For example, 5! = 5 4 3 2 1 = 120 • Also, 0! =1 • Factorials are not defined for negative numbers or fractions Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 17 The Binomial Distribution #3 • What does the equation mean? – The equation for the binomial distribution consists of the product of two factors n! px = x!n - x ! Number of ways to get x successes and (n–x) failures in n trials p x q n- x The chance of getting x successes and (n–x) failures in a particular arrangement Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 18 Example 4.10: Incidence of Nausea The company claims that, at most, 10 percentage of all patients treated with Phe-Mycin would experience nausea as a side effect of taking the drug. x = number of patients who will experience nausea following treatment with Phe-Mycin out of the 4 patients tested Find the probability that 2 of the 4 patients treated will experience nausea Given: n = 4, p = 0.1, with x = 2 Then: q = 1 – p = 1 – 0.1 = 0.9 and 4! 0.1 20.9 42 px 2 2!4 2! 60.1 2 0.9 2 0.0486 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 19 Example: Binomial Distribution n = 4, p = 0.1 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 20 Binomial Probability Table (Appendix Table A.1, P817) Table 4.7(a) for n = 4, with x = 2 and p = 0.1 p = 0.1 values of p (.05 to .50) x 0 1 2 3 4 0.05 0.8145 0.1715 0.0135 0.0005 0.0000 0.95 0.1 0.6561 0.2916 0.0486 0.0036 0.0001 0.9 0.15 0.5220 0.3685 0.0975 0.0115 0.0005 0.85 … … … … … … … 0.50 0.0625 0.2500 0.3750 0.2500 0.0625 0.50 4 3 2 1 0 x values of p (.05 to .95) Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 P(x = 2) = 0.0486 21 Example 4.11: Incidence of Nausea after Treatment x = number of patients who will experience nausea following treatment with Phe-Mycin out of the 4 patients tested Find the probability that at least 3 of the 4 patients treated will experience nausea Set x = 3, n = 4, p = 0.1, so q = 1 – p = 1 – 0.1 = 0.9 Then: p x 3 p x 3 or 4 p x 3 p x 4 0.0036 .0001 0.0037 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 Using the addition rule for the mutually exclusive values of the binomial random variable 22 Example 4.11: Rare Events Suppose at least three of four sampled patients actually did experience nausea following treatment If p = 0.1 is believed, then there is a chance of only 37 in 10,000 of observing this result So this is very unlikely! But it actually occurred So, this is very strong evidence that p does not equal 0.1 There is very strong evidence that p is actually greater than 0.1 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 23 Several Binomial Distributions Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 24 Mean and Variance of a Binomial Random Variable If x is a binomial random variable with parameters ( 参数) n and p (so q = 1 – p), then mean X np variance X2 npq standard deviation X npq Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 25 Back to Example 4.11 Of 4 randomly selected patients, how many should be expected to experience nausea after treatment? Given: n = 4, p = 0.1 Then X = np = 4 0.1 = 0.4 So expect 0.4 of the 4 patients to experience nausea If at least three of four patients experienced nausea, this would be many more than the 0.4 that are expected Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 26 Binomial Distribution EXAMPLE: Pat Statsdud is registered in a statistics course and intends to rely on luck to pass the next quiz. The quiz consists on 10 multiple choice questions with 5 possible choices for each question, only one of which is the correct answer. Pat will guess the answer to each question Find the following probabilities Pat gets no answer correct Pat gets two answer correct? Pat fails the quiz If all the students in Pat’s class intend to guess the answers to the quiz, what is the mean and the standard deviation of the quiz mark? Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 27 Solution Checking the conditions An answer can be either correct or incorrect. There is a fixed finite number of trials (n=10) Each answer is independent of the others. The probability p of a correct answer (.20) does not change from question to question. Determining the binomial probabilities: Let X = the number of correct answers 10! P( X 0) (.20 ) 0 (.80 )100 .1074 0! (10 0)! 10! P( X 2) (.20 ) 2 (.80 )10 2 .3020 2! (10 2)! Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 28 Determining the binomial probabilities: Pat fails the test if the number of correct answers is less than 5, which means less than or equal to 4. P(X4 = p(0) + p(1) + p(2) + p(3) + p(4) = .1074 + .2684 + .3020 + .2013 + .0881 =.9672 The mean and the standard deviation of the quiz mark? μ= np = 10(.2) = 2. σ= [np(1-p)]1/2 = [10(.2)(.8)]1/2 = 1.26 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 29 The Poisson Distribution Consider the number of times an event occurs over an interval of time or space, and assume that 1. The probability of occurrence is the same for any intervals of equal length 2. The occurrence in any interval is independent of an occurrence in any non-overlapping interval If x = the number of occurrences in a specified interval, then x is a Poisson random variable Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 30 The Poisson Distribution Example The number of failures in a large computer system during a given day. The number of customers who arrive at the checkout counter of a store in one hour. Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 31 The Poisson Distribution Continued Suppose μ is the mean or expected number of occurrences during a specified interval The probability of x occurrences in the interval when m are expected is described by the Poisson distribution: px e x! x where x can take any of the values x = 0, 1, 2, 3, … and e = 2.71828… (e is the base of the natural logs) Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 32 Example 4.13: ATC Center Errors Suppose that an air traffic control (ATC) center has been averaging 20.8 errors per year and lately the center experiences 3 errors in a week. Let x be the number of errors made by the ATC center during one week Given: μ = 20.8 errors per year Then: = 0.4 errors per week • Because there are 52 weeks per year, μfor a week is: μ = (20.8 errors/year) / (52 weeks/year) = 0.4 errors/week Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 33 Example 4.13: ATC Center Errors Continued Find the probability that 3 errors (x =3) will occur in a week – Want p(x = 3) when μ= 0.4 px 3 e 0.4 0.4 3 3! 0.0072 Find the probability that no errors (x = 0) will occur in a week – Want p(x = 0) when = 0.4 e 0.4 0.40 px 0 0.6703 0! Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 34 Poisson Probability Table (Appendix Table A.2, P821) =0.4 Table 4.9 x 0 1 2 3 4 5 , Mean number of Occurrences 0.1 0.9048 0.0905 0.0045 0.0002 0.0000 0.0000 0.2 0.8187 0.1637 0.0164 0.0011 0.0001 0.0000 … … … … … … … 0.4 0.6703 0.2681 0.0536 0.0072 0.0007 0.0001 e 0.4 0.43 px 3 0.0072 3! Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 … … … … … … … 1.00 0.3679 0.3679 0.1839 0.0613 0.0153 0.0031 35 Example: Poisson Distribution = 0.4 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 36 Mean and Variance of a Poisson Random Variable If x is a Poisson random variable with parameter , then mean X variance X2 standard deviation X Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 37 Several Poisson Distributions Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 38 Back to Example 4.13 In the ATC center situation, 28.0 errors occurred on average per year Assume that the number x of errors during any span of time follows a Poisson distribution for that time span Per week, the parameters of the Poisson distribution are: • mean = 0.4 errors/week • Because there are 52 weeks per year, for a week is = (20.8 errors/year) / (52 weeks/year) = 0.4 errors/week • standard deviation = 0.6325 errors/week. • Because = √0.4 = 0.6325 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 39 Poisson Distribution Example Customers arrive at a rate of 72 per hour. What is the probability of 4 customers arriving in 3 minutes? Solution: 72 per hr. = 1.2 per min. = 3.6 per 3 min. px e x! x e 3.6 3.6 px 4 0.1912 4! 4 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 40 Discrete probability distribution Discrete Binomial P( x) n C x (1 ) x Poisson n x xe P( x) x! Probability distribution P (x ) Mean of Probability distribution [ xP( x)] n 2 [( x )2 P( x)] 2 n (1 ) 2 Variance of Probability distribution Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 41 Binomial distribution P( x) n C x (1 ) x n x n n (1 ) 2 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 42 Poisson Distribution 2 Business Statistics in Practice, 2nd Ed., Irwin/McGraw-Hill, 2001 43