Some standard, off-the-shelf probability distributions

Recall that every random variable X has a probability distribution (probability function). The probability distribution, expected value, and variance of a random variable describe its "theoretical behavior". There are a number of important discrete probability distributions which allow us to describe many common experiments.

The discrete uniform distribution

Example: Let X be an integer in the range 1..n chosen at random.

Probability function: $P(x) = 1/n$ for $x = 1, 2, 3, \dots, n$

$\mu = E(X) = (1/n)(1) + (1/n)(2) + \dots + (1/n)(n) = (1/n)[1 + 2 + 3 + \dots + n] = (1/n) \cdot n(n+1)/2 = (n+1)/2$

$E(X^2) = (1/n)(1^2) + (1/n)(2^2) + \dots + (1/n)(n^2) = (1/n) \cdot n(n+1)(2n+1)/6 = (n+1)(2n+1)/6$

$\mathrm{Var}(X) = \sigma^2 = E(X^2) - [E(X)]^2 = (n+1)(2n+1)/6 - (n+1)^2/4 = (n^2 - 1)/12$

Example: Roll a die. X is the number of spots.
X has a discrete uniform distribution.
Range(X) = {1,2,3,4,5,6}
$P(x) = 1/6$ for x = 1..6
$\mu = E(X) = (6+1)/2 = 3.5$
$\sigma^2 = \mathrm{Var}(X) = (6^2 - 1)/12 \approx 2.92$, so $\sigma \approx 1.71$

Example: Pick a whole number at random from 1 to 100. X is the number.
Range(X) = {1,2,...,100}
$P(x) = 1/100$ for x = 1..100
$\mu = E(X) = (100+1)/2 = 50.5$
$\sigma^2 = \mathrm{Var}(X) = (100^2 - 1)/12 = 833.25$, so $\sigma \approx 28.87$

Example: Let X be an integer chosen at random in the range 10..20.
$P(x) = 1/11$ for x = 10, 11, 12, ..., 20
To find E(X): Let Y be a number chosen at random in the range 1..11. Thus $E(Y) = (1+11)/2 = 6$.
Now, X = Y + 9, so
$E(X) = E(Y+9) = E(Y) + 9 = 6 + 9 = 15$
$\mathrm{Var}(X) = \mathrm{Var}(Y+9) = \mathrm{Var}(Y) = (11^2 - 1)/12 = 10$

Note: the discrete uniform distribution is really a family of distributions. If X is an integer chosen at random in the range 1..6, then X has a discrete uniform distribution with parameter 6. If Y is a random integer in the range 1..564, then Y has a discrete uniform distribution with parameter 564. Thus there is not one discrete uniform distribution but infinitely many. However, all are similar in the sense that the pmf is $p(x) = 1/n$ for x = 1..n.

The Binomial Distribution

A Binomial Experiment: Consider an experiment with the following properties:
1.
The experiment consists of n independent trials. For example: flip a coin 45 times, or roll a pair of dice 37 times.
2. Each trial has two possible outcomes: Success (S) or Failure (F).
   Example: Flip a coin 45 times. Each of the 45 trials has two possible outcomes: Heads (S) or Tails (F).
   Roll dice 37 times. Each of the 37 trials has two possible outcomes: Doubles (S) or not Doubles (F).
3. The trials are independent. The outcome of one trial does not affect the outcome of another trial.
4. The probability of success on each trial is the same (p). [The probability of failure is q = 1 - p.]
   For example: Flip a coin 45 times. If Heads is success, then the probability of success on each trial is 1/2. This does not change from trial to trial.
   Roll dice 37 times. If Doubles denotes success, then the probability of success on each trial is 1/6 and the probability of failure on each trial is 5/6. There is no change in probability from trial to trial.

An experiment with properties (1-4) is called a binomial experiment.

Consider a binomial experiment with n trials and probability of success p on each trial. Let X be the number of successes in n trials. X is a random variable. Range(X) = {0,1,2,3,4,...,n}.
What is the probability function of X? What is E(X)? What is Var(X)?

Example: Roll dice 4 times. Let X be the number of times doubles appears. X is a random variable. Range(X) = {0,1,2,3,4}
This is a binomial experiment:
1. There are 4 trials.
2. On each trial, Success: Rolling Doubles; Failure: Not Rolling Doubles.
3. The trials are independent.
4. On each trial, P(Success) = p = 1/6; P(Failure) = q = 5/6.

Our first problem is to find the probability function for X:

x      0   1   2   3   4
P(x)   ?   ?   ?   ?   ?

Here is a picture of the whole experiment:
[Figure: tree diagram of the four rolls]

Now let's find P(X = 2) (the probability that the number of successes/doubles is 2).
This happens for the following sequences (D = doubles, D' = not doubles):

Sequence   Probability          Rolls where success occurs
DDD'D'     $(1/6)^2(5/6)^2$     {1,2}
DD'DD'     $(1/6)^2(5/6)^2$     {1,3}
DD'D'D     $(1/6)^2(5/6)^2$     {1,4}
D'DDD'     $(1/6)^2(5/6)^2$     {2,3}
D'DD'D     $(1/6)^2(5/6)^2$     {2,4}
D'D'DD     $(1/6)^2(5/6)^2$     {3,4}

Notice there are 6 branches, one for every group of 2 successes, i.e. C(4,2) = 6.
Thus $p(2) = 6 \cdot (1/6)^2(5/6)^2$, the sum of the six equal branch probabilities.

Similarly, to find p(3):

Sequence   Probability        Rolls where success occurs
DDDD'      $(1/6)^3(5/6)$     {1,2,3}
DDD'D      $(1/6)^3(5/6)$     {1,2,4}
DD'DD      $(1/6)^3(5/6)$     {1,3,4}
D'DDD      $(1/6)^3(5/6)$     {2,3,4}

Notice that there are four branches, one for every group of 3 successes: C(4,3) = 4.
$p(3) = 4 \cdot (1/6)^3(5/6)$

We can use the tree diagram to see that
$p(0) = (5/6)^4$, the same as $(1/6)^0(5/6)^4$
$p(1) = 4 \cdot (1/6)(5/6)^3$
$p(2) = 6 \cdot (1/6)^2(5/6)^2$
$p(3) = 4 \cdot (1/6)^3(5/6)$
$p(4) = (1/6)^4$, the same as $(1/6)^4(5/6)^0$

i.e.

x       P(x)                            Value
0       $1 \cdot (1/6)^0(5/6)^4$        .482
1       $4 \cdot (1/6)(5/6)^3$          .386
2       $6 \cdot (1/6)^2(5/6)^2$        .116
3       $4 \cdot (1/6)^3(5/6)$          .015
4       $1 \cdot (1/6)^4(5/6)^0$        .001
Total                                   1

Or, writing each coefficient as a number of combinations:

x       P(x)                               Value
0       $C(4,0) \cdot (1/6)^0(5/6)^4$      .482
1       $C(4,1) \cdot (1/6)(5/6)^3$        .386
2       $C(4,2) \cdot (1/6)^2(5/6)^2$      .116
3       $C(4,3) \cdot (1/6)^3(5/6)$        .015
4       $C(4,4) \cdot (1/6)^4(5/6)^0$      .001
Total                                      1

In general, given a binomial experiment with n trials where
p = probability of success,
q = 1 - p = probability of failure,
and X is the number of successes (in n trials), the probability function is

$p(r) = P(X = r) = C(n,r)\, p^r q^{n-r}$

Also, $\mu = E(X) = np$, $\sigma^2 = npq$, and $\sigma = \sqrt{npq}$.

Like the discrete uniform distribution, the binomial distribution is really a family of distributions: one member for each pair (n, p). The random variable X in the above example had a binom(4, 1/6) distribution.

[Figure: four binomial distributions. binom(4, 1/6): μ = .667, σ = .745; binom(20, 1/6): μ = 3.33, σ = 1.67; binom(4, .5): μ = 2, σ = 1; binom(20, .5): μ = 10, σ = 2.24]

Example: Suppose that 10% of all widgets produced by a machine are defective. If 5 widgets are chosen at random:
a.
What is the probability that exactly 2 are defective?
Let X = the number of defective widgets, so "defective" is success.
The distribution of X is binom(5, .1): n = 5, p = .1, q = 1 - p = .9
$P(X = 2) = C(5,2)(.1)^2(.9)^3 = \frac{5 \cdot 4}{2 \cdot 1}(.1)^2(.9)^3 = 10(.1)^2(.9)^3 = .0729$

b. Find the probability that at least 4 are defective:
$P(X \ge 4) = P(X = 4) + P(X = 5) = C(5,4)(.1)^4(.9)^1 + C(5,5)(.1)^5(.9)^0 = 5(.00009) + 1(.00001) = .00046$

c. Find the probability that there is at least one defective.
Remember: the opposite of "at least one" is "none".
$P(X \ge 1) = 1 - P(X = 0) = 1 - C(5,0)(.1)^0(.9)^5 = 1 - .59049 = .40951$

d. Find the expected number of defective widgets, the variance, and the standard deviation.
$\mu = E(X) = np = 5(.1) = .5$; $\sigma^2 = npq = 5(.1)(.9) = .45$; $\sigma = \sqrt{.45} \approx .671$

Example: In a large population, 20 people are selected at random and asked if they watch Spongebob Squarepants. In fact, 30% of the population watches Spongebob. Suppose that X is the number who watch Spongebob.
The distribution of X is binom(20, .3): n = 20, p = .3, q = .7
Note: $\mu = np = 20(.3) = 6$; $\sigma^2 = npq = 20(.3)(.7) = 4.2$
Find
a. the probability that X is at most 10.
Using the cumulative table: $P(X \le 10) = .983$

b. the probability that X is exactly 10:
$P(X = 10) = P(X \le 10) - P(X \le 9) = .983 - .952 = .031$

c. the probability that at least one person watches Spongebob:
$P(\text{at least one}) = 1 - P(X = 0) = 1 - .001 = .999$

d. the probability that between 5 and 12 (inclusive) watch Spongebob:
$P(5 \le X \le 12) = P(X \le 12) - P(X \le 4) = .999 - .238 = .761$

Poisson Distribution and Random Events

Any process with the following characteristics is called a Poisson process.
1. The occurrences of some event in time or space are independent.
2. The probability of a single occurrence of an event in an interval of time or space is proportional to the length of the interval of time or space.
3. In an "infinitely small" interval of time or space, the probability of more than one occurrence of an event is approximately 0.
As an interval of time or space approaches 0, the probability of an occurrence of an event approaches 0.

Consider a Poisson process where the average number of occurrences of some event during an interval is λ. If X is the number of occurrences of the event in some interval of time or space, then X is said to have a Poisson distribution with parameter λ. X is a random variable.

Examples:
1. Suppose on the average a Keebler Chocolate Chip Cookie has 16 chips per cookie. If X is the number of chips in a cookie selected at random, then X has a Poisson distribution with parameter λ = 16. Range(X) = {0,1,2,3,4,...}
2. Suppose, on the average, 6 cars cross an intersection per minute. Let X be the number of cars that cross in any minute. Then X has a Poisson distribution with λ = 6.
3. Suppose that the average number of calls coming to a switchboard is 16/hour. Let X be the number that come in any half-hour period. Range(X) = {0,1,2,3,4,5,6,...}. X has a Poisson distribution with λ = 8.

Suppose that X is a random variable and the distribution of X is Poisson with parameter λ. Then
1. Range(X) = {0,1,2,3,4,5,...}
2. the probability function for X is $p(r) = P(X = r) = \lambda^r e^{-\lambda}/r!$
3. $\mu = E(X) = \lambda$
4. $\sigma^2 = \lambda$; $\sigma = \sqrt{\lambda}$

As before, the Poisson distribution is really a family of distributions. There is a different distribution for each value of λ.

[Figure: Poisson distributions for λ = 3, λ = 6, λ = 13, and λ = 25]

Example: On the average a Keebler Chocolate Chip Cookie contains 16 chips. Find the probability that a cookie selected at random has exactly 18 chips.
Let X be the number of chips in a cookie selected at random. Assume that the distribution of X is Poisson with λ = 16. Find P(X = 18).
$P(X = 18) = \lambda^{18} e^{-\lambda}/18! = 16^{18} e^{-16}/18! \approx .083$

Example: Using the cumulative table
Suppose that on the average there are 6 traffic accidents at a certain intersection per year. In a given year,
a. What is the probability that 3 or fewer accidents occur?
Let X be the number of accidents that occur in a year.
Assume that X has a Poisson distribution with λ = 6.
$P(X \le 3) = .151$

b. What is the probability that at least one accident occurs in a year?
$P(X \ge 1) = 1 - P(X = 0) = 1 - .002 = .998$

c. What is the probability that 4 or more accidents occur?
$P(X \ge 4) = 1 - P(X \le 3) = 1 - .151 = .849$

Relationship between binomial and Poisson distributions:
In a binomial experiment, if n is large and p is small, then binom(n, p) ≈ Poisson(λ) where λ = np.
In other words, the Poisson distribution can be used to approximate the binomial distribution.

Example: Suppose X is binom(400, .005): n = 400, p = .005, q = .995
Use the binomial probability mass function to find P(X = 1):
$P(X = 1) = C(400,1)(.005)^1(.995)^{399} = 0.27066943301711285515159151027602$

Now suppose Y is Poisson with λ = np = 400(.005) = 2.
$P(Y = 1) = \lambda^1 e^{-\lambda}/1! = 2^1 e^{-2}/1! = 2e^{-2} = 0.27067056647322538378799898994497$

The approximation is generally good if:
n > 100
p < .01
np < 20
which was the case in the example above.
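
The comparison above can be checked numerically. What follows is a minimal Python sketch, assuming only the standard library; the function names binom_pmf and poisson_pmf are illustrative choices and are not part of these notes. It evaluates both probability functions for binom(400, .005) and Poisson(2) at r = 0..5 and prints the gap between them.

```python
from math import comb, exp, factorial


def binom_pmf(r, n, p):
    """P(X = r) for X ~ binom(n, p): C(n, r) * p^r * q^(n-r), with q = 1 - p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(r, lam):
    """P(Y = r) for Y ~ Poisson(lam): lam^r * e^(-lam) / r!."""
    return lam**r * exp(-lam) / factorial(r)


# Parameters from the example: n large, p small, lambda = np.
n, p = 400, 0.005
lam = n * p

for r in range(6):
    b = binom_pmf(r, n, p)        # exact binomial probability
    a = poisson_pmf(r, lam)       # Poisson approximation
    print(f"r={r}  binom={b:.6f}  poisson={a:.6f}  diff={abs(b - a):.2e}")
```

The same two helpers can be used to re-check the other worked examples, for instance binom_pmf(2, 5, 0.1) for the widget problem or poisson_pmf(18, 16) for the cookie problem.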