STAT 211 Handout 3 (Chapter 3): Discrete Random Variables

Random variable (r.v.): A random variable is a function from the sample space S to the real line R. That is, a random variable assigns a real number to each element of S.

Discrete random variable: The possible values are isolated points along the number line. Random variables have their own sample space, or set of possible values. If this set is finite or countably infinite, the r.v. is said to be discrete.

Example 1: Determine the values each random variable can take and their probabilities. This example is used to demonstrate most of the properties and tools in this handout.
(i) Consider an experiment in which each of three cars either comes to a complete stop (C) at the intersection or does not (N). Let the random variable X be the number of cars that come to a complete stop.
(ii) Consider an experiment in which four home mortgages are classified as fixed rate (F) or variable rate (V). Let the random variable Y be the number of homes with a fixed mortgage rate.

Probabilistic properties of a discrete random variable X, where a and b are constant integers:
P(X ≤ a) + P(X > a) = 1, so P(X > a) = 1 − P(X ≤ a)
P(X ≥ a) = 1 − P(X < a) = 1 − P(X ≤ a−1)
P(a < X < b) = P(X < b) − P(X ≤ a) = P(X ≤ b−1) − P(X ≤ a)
P(a ≤ X ≤ b) = P(X ≤ b) − P(X < a) = P(X ≤ b) − P(X ≤ a−1)

Discrete probability distribution: A probability distribution describes the possible values and their probabilities of occurring. A discrete probability distribution is called a probability mass function (pmf), p(·), and must satisfy the following conditions:
(1) 0 ≤ p(x) = P(X = x) ≤ 1 for all x, where X is a discrete r.v.
(2) Σ p(x) = 1, where the sum is over all possible x.

Examples: Discrete Uniform, Bernoulli, Binomial, Hypergeometric, Negative Binomial, Geometric distributions.

Example 1(ii) (continued):

y     0     1     2     3     4
p(y)  1/16  4/16  6/16  4/16  1/16

All probabilities p(y) are between 0 and 1, and when you sum the probabilities over all possible y values, they add up to 1. Therefore p(y) is a legitimate probability mass function (pmf) of Y.
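The two pmf conditions above can be checked mechanically. A minimal sketch in Python, using the mortgage-example probabilities from the table:

```python
from fractions import Fraction

# pmf of Y = number of fixed-rate mortgages among 4 homes (Example 1(ii))
pmf = {y: Fraction(c, 16) for y, c in {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}.items()}

# Condition (1): every probability lies in [0, 1]
assert all(0 <= p <= 1 for p in pmf.values())

# Condition (2): the probabilities sum to 1 over all possible values
assert sum(pmf.values()) == 1
```

Using exact fractions avoids floating-point rounding when checking that the probabilities sum to exactly 1.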
P(Y > 2) = p(3) + p(4) = 5/16, or 1 − P(Y ≤ 2) = 1 − p(0) − p(1) − p(2) = 5/16
P(Y ≥ 2) = p(2) + p(3) + p(4) = 11/16, or 1 − P(Y < 2) = 1 − p(0) − p(1) = 11/16
P(1 < Y < 3) = p(2) = 6/16, or P(Y < 3) − P(Y ≤ 1) = (p(0)+p(1)+p(2)) − (p(0)+p(1)) = 6/16
P(1 ≤ Y ≤ 3) = p(1) + p(2) + p(3) = 14/16, or P(Y ≤ 3) − P(Y < 1) = (p(0)+p(1)+p(2)+p(3)) − p(0) = 14/16

Example 2: A pizza shop sells pizzas in four different sizes. The 1000 most recent orders for a single pizza gave the following proportions for the various sizes.

Size        12"    14"    16"    18"
Proportion  0.20   0.25   0.50   0.05

With X denoting the size of a pizza in a single-pizza order, is the table above a valid pmf of X?

Example 3: Could p(x) = x²/50 for x = 1, 2, 3, 4, 5 be the pmf of X? If it is not, is it possible to find a pmf of X? (Hint: the five values sum to 55/50 ≠ 1.)

Cumulative Distribution Function (CDF): F(x) = P(X ≤ x) = Σ_{y ≤ x} p(y)

Example 1(ii) (continued): For the home mortgages example,
F(0⁻) = P(Y < 0) = 0
F(0) = P(Y ≤ 0) = p(0) = 1/16
F(1) = P(Y ≤ 1) = p(0) + p(1) = F(0) + p(1) = 5/16
F(2) = P(Y ≤ 2) = p(0) + p(1) + p(2) = F(1) + p(2) = 11/16
F(3) = P(Y ≤ 3) = p(0) + p(1) + p(2) + p(3) = F(2) + p(3) = 15/16
F(4) = P(Y ≤ 4) = p(0) + p(1) + p(2) + p(3) + p(4) = F(3) + p(4) = 16/16 = 1

F(y) = 0       for y < 0
     = 1/16    for 0 ≤ y < 1
     = 5/16    for 1 ≤ y < 2
     = 11/16   for 2 ≤ y < 3
     = 15/16   for 3 ≤ y < 4
     = 1       for y ≥ 4

Example 2 (continued): For the pizza example,
F(12⁻) = P(X < 12) = 0
F(12) = P(X ≤ 12) = 0.20
F(14) = P(X ≤ 14) = 0.45
F(16) = P(X ≤ 16) = 0.95
F(18) = P(X ≤ 18) = 1

F(x) = 0       for x < 12
     = 0.20    for 12 ≤ x < 14
     = 0.45    for 14 ≤ x < 16
     = 0.95    for 16 ≤ x < 18
     = 1       for x ≥ 18

For integer-valued a and b:
P(a ≤ X ≤ b) = F(b) − F(a−1) and P(X = a) = F(a) − F(a−1)

P(14 ≤ X ≤ 16) = F(16) − F(12) = 0.95 − 0.20 = 0.75, or P(X=14) + P(X=16) = 0.25 + 0.50 = 0.75
P(X = 14) = F(14) − F(12) = 0.45 − 0.20 = 0.25

The expected value of a random variable X: μ = E(X), the value at which the population distribution of X is centered. The expected value of a discrete random variable X is a weighted average of its possible values.
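The CDF for the mortgage example is just the running sum of the pmf, and every interval probability above can be read off it. A short Python sketch:

```python
from fractions import Fraction
from itertools import accumulate

# pmf of Y from the home mortgages example
ys = [0, 1, 2, 3, 4]
probs = [Fraction(c, 16) for c in (1, 4, 6, 4, 1)]

# CDF values F(y) = P(Y <= y), built as running sums of the pmf
F = dict(zip(ys, accumulate(probs)))

# Interval probabilities from the handout, recomputed via the CDF
assert F[4] - F[2] == Fraction(5, 16)    # P(Y > 2)
assert F[4] - F[1] == Fraction(11, 16)   # P(Y >= 2)
assert F[2] - F[1] == Fraction(6, 16)    # P(1 < Y < 3) = p(2)
assert F[3] - F[0] == Fraction(14, 16)   # P(1 <= Y <= 3)
```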
Expected value of the random variable X: E(X) = Σ x·p(x), summing over all x.

Rules of expected value:
(i) For any constant a and the random variable X, E(aX) = aE(X)
(ii) For any constant b, E(b) = b
(iii) For any constants a and b, E(aX + b) = aE(X) + b
(iv) For any constants a, b, and c and random variables X and Y, E(aX ± bY ± c) = aE(X) ± bE(Y) ± c

Example 1(ii) (continued): If we use the home mortgages example, determine the expected value of Y.
μ_Y = E(Y) = Σ y·p(y) = 0(1/16) + 1(4/16) + 2(6/16) + 3(4/16) + 4(1/16) = 32/16 = 2
On average, 2 of the 4 homes are expected to have a fixed mortgage rate.

Example 2 (continued): If we use the pizza example, show that the expected value of X is approximately 14.8".
μ_X = E(X) = Σ x·p(x) = 12(0.20) + 14(0.25) + 16(0.50) + 18(0.05) = 14.8
On average, a 14.8" pizza is expected to be ordered.

If we define a new variable Y = 2X, the pmf of Y is

y     24    28    32    36
p(y)  0.20  0.25  0.50  0.05

μ_Y = E(Y) = Σ y·p(y) = 24(0.20) + 28(0.25) + 32(0.50) + 36(0.05) = 29.6 = 2μ_X

What is the approximate probability that X is within 2" of its mean value?
P(12.8 ≤ X ≤ 16.8) = P(X=14) + P(X=16) = 0.25 + 0.50 = 0.75

The variance of a random variable X: a measure of dispersion.
σ² = Var(X) = E[(X − μ)²] = E(X²) − μ² (the variability in the population distribution of X)
The standard deviation of X: σ = √σ²

Variance for the discrete random variable X:
σ² = Var(X) = Σ (x − μ)² p(x), summing over all x,
or, using the suggested shortcut, σ² = Var(X) = E(X²) − μ², where E(X²) = Σ x² p(x).

If h(X) is a function of the random variable X,
E(h(X)) = Σ h(x) p(x) and Var(h(X)) = Σ (h(x) − E(h(X)))² p(x), summing over all x.
If h(X) is a linear function of X, the rules for the mean and the variance can be used directly instead of working through these sums.
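The weighted-average formula and the linearity rule E(2X) = 2E(X) can be checked directly on the pizza example:

```python
# Expected value E(X) = sum of x * p(x), using the pizza example
sizes = [12, 14, 16, 18]
probs = [0.20, 0.25, 0.50, 0.05]

mean_x = sum(x * p for x, p in zip(sizes, probs))
assert abs(mean_x - 14.8) < 1e-9

# Linearity: E(2X) = 2 E(X), computed from the pmf of Y = 2X directly
mean_y = sum(2 * x * p for x, p in zip(sizes, probs))
assert abs(mean_y - 2 * mean_x) < 1e-9   # 29.6
```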
Rules of variance:
(i) For any constant a and the random variable X, Var(aX) = a²Var(X)
(ii) For any constant b, Var(b) = 0
(iii) For any constants a and b, Var(aX ± b) = a²Var(X)
(iv) For any constants a, b, and c and random variables X1 and X2,
Var(aX1 ± bX2 ± c) = a²Var(X1) + b²Var(X2) ± 2ab·Cov(X1, X2) (this case will be used in Chapter 5)
(v) For constants a1 to an and random variables X1 to Xn,
Var(Σ_{i=1}^{n} ai·Xi) = Σ_{i=1}^{n} ai²·Var(Xi) + 2 Σ_{i<j} ai·aj·Cov(Xi, Xj) (this case will be used in Chapter 5)

Example 1(ii) (continued): If we use the home mortgages example, what is the variance of Y?
σ_Y² = Var(Y) = E(Y²) − (E(Y))² = (Σ y² p(y)) − 2²
= (0²(1/16) + 1²(4/16) + 2²(6/16) + 3²(4/16) + 4²(1/16)) − 2² = 5 − 4 = 1

Example 2 (continued): If we use the pizza example, what is the variance of X?
σ_X² = Var(X) = E(X²) − (E(X))² = (Σ x² p(x)) − 14.8²
= (12²(0.20) + 14²(0.25) + 16²(0.50) + 18²(0.05)) − 14.8² = 222 − 219.04 = 2.96
and the standard deviation of X is σ_X = √2.96 = 1.72

If we define a new variable Y = 2X, the pmf of Y is

y     24    28    32    36
p(y)  0.20  0.25  0.50  0.05

σ_Y² = Var(Y) = E(Y²) − (E(Y))² = (Σ y² p(y)) − 29.6²
= (24²(0.20) + 28²(0.25) + 32²(0.50) + 36²(0.05)) − 29.6² = 888 − 876.16 = 11.84 = 4σ_X²

Parameter: If P(X=x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution, that quantity is called a parameter of the distribution.

Bernoulli Distribution: It is based on a Bernoulli trial (an experiment with two, and only two, possible outcomes). A r.v. X has a Bernoulli(p) distribution, where p is the parameter with 0 ≤ p ≤ 1, if
X = 1 with probability p
  = 0 with probability 1 − p
P(X=x) = p^x (1−p)^(1−x), x = 0, 1

Examples 4:
(i) Flip a coin once. Let X be the number of tails observed. If P(heads) = 0.55, then P(tails) = 0.45; in this example p = P(tails) = 0.45.
(ii) A single battery is tested for the viability of its charge.
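The shortcut Var(X) = E(X²) − μ² and rule (i), Var(2X) = 4·Var(X), can both be verified on the pizza example:

```python
# Variance via the shortcut Var(X) = E(X^2) - (E(X))^2, pizza example
sizes = [12, 14, 16, 18]
probs = [0.20, 0.25, 0.50, 0.05]

mean = sum(x * p for x, p in zip(sizes, probs))         # E(X) = 14.8
mean_sq = sum(x**2 * p for x, p in zip(sizes, probs))   # E(X^2) = 222
var_x = mean_sq - mean**2
assert abs(var_x - 2.96) < 1e-9

# Rule (i): Var(2X) = 4 Var(X), checked against the pmf of Y = 2X
var_y = sum((2 * x)**2 * p for x, p in zip(sizes, probs)) - (2 * mean)**2
assert abs(var_y - 4 * var_x) < 1e-9   # 11.84
```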
Let X be 1 if the battery is OK and 0 otherwise. If P(battery is OK) = 0.90, then P(battery is not OK) = 0.10; in this example p = P(battery is OK) = 0.9.

Binomial Distribution: X ~ Binomial(n, p). The exact model for the number of successes in n independent trials, and an approximate probability model for sampling without replacement from a finite dichotomous population.
- n fixed trials
- each trial is identical and results in a success or a failure
- the trials are independent
- the probability of success, p, is constant from trial to trial
- X is the number of successes among the n trials

P(X=x) = C(n, x) p^x (1−p)^(n−x), x = 0, 1, 2, ..., n,
where C(n, x) = n!/(x!(n−x)!) is the binomial coefficient.
E(X) = np and Var(X) = np(1−p)

Binomial Theorem: For any real numbers x and y and integer n ≥ 0,
(x + y)^n = Σ_{i=0}^{n} C(n, i) x^i y^(n−i)

Cumulative distribution function: F(x) = P(X ≤ x) = Σ_{k=0}^{x} C(n, k) p^k (1−p)^(n−k)
Table A.1 gives cumulative distribution function values for n = 5, 10, 15, 20, 25 with different values of p.

Example 5: A lopsided coin has a 70% chance of landing "heads". It is tossed 20 times. Let
X: number of heads observed in 20 tosses ~ Binomial(n=20, p=0.70)
Y: number of tails observed in 20 tosses ~ Binomial(n=20, p=0.30)
Since Y = 20 − X, each probability can be computed either way. Determine the following probabilities:
a. at least 10 heads: P(X≥10) = 1 − P(X≤9) = 1 − 0.017 = 0.983, or P(Y≤10) = 0.983
b. at most 13 heads: P(X≤13) = 0.392, or P(Y≥7) = 1 − P(Y<7) = 1 − P(Y≤6) = 1 − 0.608 = 0.392
c. exactly 12 heads: P(X=12) = P(X≤12) − P(X≤11) = 0.228 − 0.113 = 0.115, or P(Y=8) = P(Y≤8) − P(Y≤7) = 0.887 − 0.772 = 0.115
d. between 8 and 14 heads (inclusive): P(8≤X≤14) = P(X≤14) − P(X≤7) = 0.584 − 0.001 = 0.583, or P(6≤Y≤12) = P(Y≤12) − P(Y≤5) = 0.999 − 0.416 = 0.583
e. fewer than 9 heads: P(X<9) = P(X≤8) = 0.005, or P(Y>11) = 1 − P(Y≤11) = 1 − 0.995 = 0.005

Hypergeometric Distribution: Exact probability model for the number of successes in a sample drawn without replacement. X ~ Hyper(M, N, n). Let X be the number of successes in a random sample of size n drawn from a population of size N consisting of M successes and N−M failures.

P(X=x) = C(M, x) C(N−M, n−x) / C(N, n), for max(0, n−N+M) ≤ x ≤ min(n, M)

E(X) = n(M/N), where M/N is the proportion of successes in the population.
Var(X) = ((N−n)/(N−1)) · n · (M/N) · (1 − M/N), where (N−n)/(N−1) is the finite population correction factor.
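The binomial answers in Example 5 above can be reproduced from the pmf and CDF formulas. A sketch in Python; note that Table A.1 entries are rounded to 3 decimals, so the table-based answers can differ from the exact values by a small rounding amount:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x), the sum of pmf values from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Example 5: X ~ Binomial(20, 0.70)
assert abs(binom_cdf(9, 20, 0.70) - 0.017) < 0.001    # a. so P(X >= 10) = 0.983
assert abs(binom_cdf(13, 20, 0.70) - 0.392) < 0.001   # b. P(X <= 13)
# c. exact P(X = 12) is about 0.114; the table-based 0.115 reflects rounding
assert abs(binom_pmf(12, 20, 0.70) - 0.115) < 0.002
```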
Example 6: An urn is filled with N balls that are identical in every way except that M are red and N−M are green. We reach in and select n balls at random (the n balls are taken all at once, a case of sampling without replacement). What is the probability that exactly x of the balls are red?

C(N, n): total number of samples of size n that can be drawn from the N balls.
C(M, x): number of ways that x of the balls drawn will be red, out of the M red balls.
C(N−M, n−x): number of ways that the remaining n−x balls drawn will be green.

X, the number of red balls drawn in a sample of n balls, has a hypergeometric distribution, and the answer is P(X=x) = C(M, x) C(N−M, n−x) / C(N, n) for x = 0, 1, 2, 3, ..., n.

Example 7: A quality-control inspector accepts shipments whenever a sample of size 5 contains no defectives, and rejects them otherwise.
a. Determine the probability that she will accept a poor shipment of 50 items in which 20% are defective.
Let X be the number of defective items in the sample, where a sample of n=5 is selected from a poor shipment of N=50 items with M = 50(0.20) = 10 defective items.
P(accept shipment) = P(X=0) = C(10, 0) C(40, 5) / C(50, 5) = 0.3106
b. Determine the probability that she will reject a good shipment of 100 items in which 2% are defective.
Let X be the number of defective items in the sample, where a sample of n=5 is selected from a good shipment of N=100 items with M = 100(0.02) = 2 defective items.
P(reject shipment) = P(X≥1) = P(X=1) + P(X=2) = C(2, 1) C(98, 4) / C(100, 5) + C(2, 2) C(98, 3) / C(100, 5) = 0.098

Negative Binomial Distribution: The Binomial distribution counts the number of successes in a fixed number of Bernoulli trials. The Negative Binomial distribution counts the number of Bernoulli trials required to get a fixed number of successes. X ~ NegativeBinomial(r, p)
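Both parts of Example 7 above can be checked numerically from the hypergeometric pmf:

```python
from math import comb

def hyper_pmf(x, M, N, n):
    """P(X = x) for X ~ Hypergeometric: M successes in a population of N, sample size n."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Example 7a: accept a poor shipment (no defectives among the 5 sampled)
p_accept = hyper_pmf(0, M=10, N=50, n=5)
assert abs(p_accept - 0.3106) < 0.0001

# Example 7b: reject a good shipment (at least one of the 2 defectives sampled)
p_reject = hyper_pmf(1, M=2, N=100, n=5) + hyper_pmf(2, M=2, N=100, n=5)
assert abs(p_reject - 0.098) < 0.001
```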
P(X=x) = C(x+r−1, r−1) p^r (1−p)^x, x = 0, 1, 2, 3, ...
X: number of failures before the rth success
p: probability of success
r: number of successes
E(X) = r(1−p)/p and Var(X) = r(1−p)/p²

Example 8 (Exercise 3-71, 6th edition, which is Exercise 3-69, 5th edition): P(male birth) = 0.5. A couple wishes to have exactly 2 female children in their family. They will have children until this condition is fulfilled.
(a) What is the probability that the family has x male children?
X: number of male children until they have 2 girls; p = P(female) = 0.5; r = number of girls = 2
X ~ Negative Binomial(r=2, p=0.5)
P(X=x) = C(x+1, 1)(0.5)²(1−0.5)^x = (x+1)(0.5)^(x+2), x = 0, 1, 2, 3, ...
(b) What is the probability that the family has four children? (Answer: 0.1875)
(c) What is the probability that the family has at most 4 children? (Answer: 0.6875)
(d) How many male children would you expect this family to have? (Answer: 2) How many children would you expect this family to have? (Answer: 4)

The Geometric Distribution is the simplest of the waiting-time distributions and is a special case of the negative binomial distribution (r = 1, counting the trial of the first success rather than the failures before it).
P(X=x) = p(1−p)^(x−1), x = 1, 2, 3, ...
p: probability of success
X: the trial at which the first success occurs (waiting time for a success)
E(X) = 1/p and Var(X) = (1−p)/p²
Also, P(X > x) = (1−p)^x.
Useful geometric series facts: Σ_{x=1}^{∞} a^(x−1) = 1/(1−a) for |a| < 1, and Σ_{x=1}^{n} a^(x−1) = (1−a^n)/(1−a).

Example 9: A series of experiments was conducted in order to reduce the proportion of cells being scrapped by a battery plant because of internal shorts. The experiments succeeded in reducing the percentage of manufactured cells with internal shorts to around 1%. Suppose we are interested in the number of the test at which the first short is discovered. Find the probability that at least 50 cells are tested without finding a short.
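The negative binomial answers in Example 8, and the geometric tail identity P(X > x) = (1−p)^x, can be verified directly (p = 0.25 and x = 6 in the tail check are illustrative values, not from the handout):

```python
from math import comb

def nbinom_pmf(x, r, p):
    """P(X = x) failures before the r-th success, X ~ NegativeBinomial(r, p)."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Example 8: p = P(girl) = 0.5, stop after r = 2 girls
assert abs(nbinom_pmf(2, 2, 0.5) - 0.1875) < 1e-12                          # (b): 4 children = 2 boys
assert abs(sum(nbinom_pmf(x, 2, 0.5) for x in range(3)) - 0.6875) < 1e-12   # (c): at most 2 boys

# Geometric (trial of the first success): P(X > x) = (1-p)^x,
# checked against a direct sum of pmf terms p(1-p)^(x-1)
p, x0 = 0.25, 6
tail = 1 - sum(p * (1 - p)**(x - 1) for x in range(1, x0 + 1))
assert abs(tail - (1 - p)**x0) < 1e-12
```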
X: the number of tests until the first short ~ Geometric(p)
p: probability of an internal short = 0.01
P(X > 50) = (1−p)^50 = (1−0.01)^50 = 0.605

Poisson Distribution: a model for the number of occurrences in a fixed period of time or region of space (arrivals of buses, arrivals of customers at a bank, etc.). The probability of an arrival in a short interval is proportional to the length of the interval.

P(X=x) = e^(−λ) λ^x / x!, x = 0, 1, 2, 3, ..., λ > 0
λ: rate per unit time or per unit area
X: number of occurrences in a given time period or region (examples: # of parts produced per hour, # of fractures per blade, and so on)

Note that e^λ = Σ_{i=0}^{∞} λ^i / i!, so Σ_{x=0}^{∞} e^(−λ) λ^x / x! = 1, and E(X) = Var(X) = λ.

Cumulative distribution function: F(x) = P(X ≤ x) = Σ_{k=0}^{x} e^(−λ) λ^k / k!
Table A.2 gives cumulative distribution function values for different values of λ.

Example 10: Transmission-line interruptions in a telecommunications network occur at an average rate of 1 per day. Let X be the number of line interruptions in t days, so E(X) = λ = 1(t) = t interruptions in t days. Find the probability that the line experiences
a. no interruptions in 5 days: P(X=0) = e^(−5) 5^0 / 0! = 0.0067
b. exactly 2 interruptions in 3 days: P(X=2) = e^(−3) 3² / 2! = 0.224
c. at least 1 interruption in 4 days: P(X≥1) = 1 − P(X=0) = 1 − e^(−4) 4^0 / 0! = 0.9817
d. at least 2 interruptions in 5 days: P(X≥2) = 1 − P(X=0) − P(X=1) = 1 − e^(−5) 5^0 / 0! − e^(−5) 5^1 / 1! = 0.9596

Example 11 (Exercise 3-78, 6th edition, which is Exercise 3-76, 5th edition): X: the number of missing pulses when writing onto a computer disk and then sending it through a certifier ~ Poisson(λ=0.2), so E(X) = λ = 0.2
(a) What is the probability that a disk has exactly one missing pulse?
P(X=1) = e^(−0.2) 0.2^1 / 1! = 0.1637
(b) What is the probability that a disk has at least 2 missing pulses?
P(X≥2) = 1 − P(X=0) − P(X=1) = 1 − e^(−0.2) 0.2^0 / 0! − e^(−0.2) 0.2^1 / 1! = 0.0175
(c) If two disks are independently selected, what is the probability that neither contains a missing pulse?
P(X=0) = e^(−0.2) 0.2^0 / 0! = 0.8187 (the probability that one disk contains no missing pulse)
P(neither contains a missing pulse) = [P(X=0)]² = 0.8187² = 0.6703

Proposition: Suppose that in the binomial probability mass function we let n → ∞ and p → 0 in such a way that np remains fixed at a value λ > 0. Then the binomial probabilities can be approximated by Poisson probabilities with parameter λ = np.
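The Poisson answers in Examples 10 and 11, and the limiting proposition, can be sketched numerically (the n = 1000, p = 0.002 values in the approximation check are illustrative, not from the handout):

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

# Example 10 (rate 1/day, so lambda = t for t days) and Example 11 (lambda = 0.2)
assert abs(poisson_pmf(0, 5) - 0.0067) < 0.0001                           # 10a
assert abs(poisson_pmf(2, 3) - 0.224) < 0.001                             # 10b
assert abs(1 - poisson_pmf(0, 4) - 0.9817) < 0.0001                       # 10c
assert abs(1 - poisson_pmf(0, 5) - poisson_pmf(1, 5) - 0.9596) < 0.0001   # 10d
assert abs(poisson_pmf(0, 0.2)**2 - 0.6703) < 0.0001                      # 11c

# Proposition: Binomial(n, p) is close to Poisson(n*p) for large n, small p.
# Illustrative values: n = 1000, p = 0.002, so lambda = 2.
n, p = 1000, 0.002
for x in range(6):
    b = comb(n, x) * p**x * (1 - p)**(n - x)        # exact binomial probability
    assert abs(b - poisson_pmf(x, n * p)) < 0.001   # Poisson approximation is close
```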