Probability and Statistics
Lecture 6
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
President University Erwin Sitompul PBST 6/1

Chapter 5 Some Discrete Probability Distributions

Chapter 5.1 Introduction

Often, the observations generated by different statistical experiments have the same general type of behavior. The discrete random variables associated with these experiments can be described by essentially the same probability distribution in a single formula. In fact, one needs only a handful of important probability distributions to describe many of the discrete random variables encountered in practice. In this chapter, we present these commonly used distributions with various examples.

Chapter 5.2 Discrete Uniform Distribution

If the random variable X assumes the values $x_1, x_2, \ldots, x_k$ with equal probabilities, then the discrete uniform distribution is given by

$$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k$$

When a light bulb is selected at random from a box that contains a 40-watt bulb, a 60-watt bulb, a 75-watt bulb, and a 100-watt bulb, each element of the sample space S = {40, 60, 75, 100} occurs with probability 1/4. Therefore, we have a uniform distribution with

$$f(x; 4) = \frac{1}{4}, \quad x = 40, 60, 75, 100$$

When a die is tossed, each element of the sample space S = {1, 2, 3, 4, 5, 6} occurs with probability 1/6. Therefore, we have a uniform distribution with

$$f(x; 6) = \frac{1}{6}, \quad x = 1, 2, 3, 4, 5, 6$$

Chapter 5.3 Binomial and Multinomial Distributions

Bernoulli Process

An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure.
We may choose to define either outcome as a success. The process is referred to as a Bernoulli process. Each trial is called a Bernoulli trial. Strictly speaking, the Bernoulli process must possess the following properties:
1. The experiment consists of n repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Consider the set of Bernoulli trials where three items are selected at random from a manufacturing process, inspected, and classified as defective or non-defective. A defective item is designated a success. The number of successes is a random variable X assuming integer values from 0 to 3. The items are selected independently from a process that we shall assume produces 25% defectives. The probability of the outcome NDN can be calculated as

$$P(NDN) = P(N)\,P(D)\,P(N) = \frac{3}{4} \cdot \frac{1}{4} \cdot \frac{3}{4} = \frac{9}{64}$$

The probabilities for the other possible outcomes can be calculated in the same way to give the probability distribution of X:

x      0      1      2      3
f(x)   27/64  27/64  9/64   1/64

The number X of successes in n Bernoulli trials is called a binomial random variable. The probability distribution of this discrete random variable is called the binomial distribution and is denoted by b(x; n, p). For example,

$$P(X = 2) = f(2) = b\left(2; 3, \tfrac{1}{4}\right) = \frac{9}{64}$$

Binomial Distribution

|Binomial Distribution| A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 – p.
Then the probability distribution of the binomial random variable X, the number of successes in n independent trials, is

$$b(x; n, p) = {}_{n}C_{x}\, p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

The mean and variance of the binomial distribution b(x; n, p) are

$$\mu = np, \qquad \sigma^2 = npq$$

The probability that a certain kind of component will survive a given shock test is 3/4. Find the probability that exactly 2 of the next 4 components tested will survive.

With p = 3/4,

$$b\left(2; 4, \tfrac{3}{4}\right) = {}_{4}C_{2} \left(\frac{3}{4}\right)^{2} \left(\frac{1}{4}\right)^{2} = \frac{54}{256} \approx 0.2109$$

The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that (a) at least 10 survive, (b) from 3 to 8 survive, and (c) exactly 5 survive?

Let X be the number of people that survive. Table A.1 can help with the sums.

(a)
$$P(X \ge 10) = 1 - P(X < 10) = 1 - \sum_{x=0}^{9} b(x; 15, 0.4) = 1 - 0.9662 = 0.0338$$

Question: Can you calculate $\sum_{x=10}^{15} b(x; 15, 0.4)$ manually?

(b)
$$P(3 \le X \le 8) = \sum_{x=3}^{8} b(x; 15, 0.4) = \sum_{x=0}^{8} b(x; 15, 0.4) - \sum_{x=0}^{2} b(x; 15, 0.4) = 0.9050 - 0.0271 = 0.8779$$

(c)
$$P(X = 5) = b(5; 15, 0.4) = {}_{15}C_{5}\, (0.4)^{5} (0.6)^{10} = 0.1859$$

[Table A.1 Binomial Probability Sums]

A large chain retailer purchases a certain kind of electronic device from a manufacturer. The manufacturer indicates that the defective rate of the device is 3%. (a) The inspector of the retailer randomly picks 20 items from a shipment. What is the probability that there will be at least one defective item among these 20? (b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices per shipment. What is the probability that there will be 3 shipments containing at least one defective device?
Let X be the number of defective devices among the 20 items.

(a)
$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - {}_{20}C_{0}\, (0.03)^{0} (1 - 0.03)^{20} = 0.4562$$

(b) Let Y be the number of shipments, out of 10, that contain at least one defective device among the 20 tested. Each shipment is then a Bernoulli trial with p = 0.4562, so

$$P(Y = 3) = b(3; 10, 0.4562) = {}_{10}C_{3}\, (0.4562)^{3} (1 - 0.4562)^{10-3} = 0.1602$$

It is conjectured that an impurity exists in 30% of all drinking wells in a certain rural community. To gain some insight into this problem, it is determined that some tests should be made. It is too expensive to test all of the many wells in the area, so 10 were randomly selected for testing. (a) Using the binomial distribution, what is the probability that exactly three wells have the impurity, assuming that the conjecture is correct? (b) What is the probability that more than three wells are impure?

(a)
$$P(X = 3) = {}_{10}C_{3}\, (0.3)^{3} (1 - 0.3)^{10-3} = 0.2668$$

Question: Try also to use Table A.1 to find this value.

(b)
$$P(X > 3) = 1 - P(X \le 3) = 1 - \sum_{x=0}^{3} b(x; 10, 0.3) = 1 - (0.0282 + 0.1211 + 0.2335 + 0.2668) = 0.3504$$

Consider the previous “drinking wells” example. The “30% are impure” figure is merely a conjecture put forth by the area water board. Suppose 10 wells are randomly selected and 6 are found to contain the impurity. What does this imply about the conjecture? Use a probability statement.

$$P(X = 6) = {}_{10}C_{6}\, (0.3)^{6} (1 - 0.3)^{10-6} = 0.0368$$

If the 30% impurity conjecture were true, there would be only a 3.68% chance of finding exactly 6 contaminated wells among the 10 tested. The investigation therefore suggests that the impurity problem is much more severe than 30%.

The binomial experiment becomes a multinomial experiment if we let each trial have more than 2 possible outcomes.
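
Before turning to the multinomial case, the binomial results in the examples above can be reproduced numerically. The following is a minimal Python sketch and not part of the original lecture; the helper names `binom_pmf` and `binom_cdf` are illustrative assumptions, and only the standard library is used.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(r: int, n: int, p: float) -> float:
    """Sum of b(x; n, p) for x = 0..r (the kind of sum listed in Table A.1)."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))

# Shock-test example: exactly 2 of 4 components survive, p = 3/4
shock = binom_pmf(2, 4, 0.75)
# Blood-disease example (a): at least 10 of 15 survive, p = 0.4
at_least_10 = 1 - binom_cdf(9, 15, 0.4)
# Wells example (b): more than 3 of 10 wells impure, p = 0.3
more_than_3 = 1 - binom_cdf(3, 10, 0.3)

print(round(shock, 4), round(at_least_10, 4), round(more_than_3, 4))
```

Computing the sums directly avoids the table lookup and works for values of n and p that Table A.1 does not list.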
|Multinomial Distribution| If a given trial can result in the k outcomes $E_1, E_2, \ldots, E_k$ with probabilities $p_1, p_2, \ldots, p_k$, then the probability distribution of the random variables $X_1, X_2, \ldots, X_k$, representing the number of occurrences of $E_1, E_2, \ldots, E_k$ in n independent trials, is

$$f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

with

$$\sum_{i=1}^{k} x_i = n \quad \text{and} \quad \sum_{i=1}^{k} p_i = 1$$

The complexity of arrivals and departures at an airport is such that computer simulation is often used to model the “ideal” conditions. For a certain airport containing three runways, it is known that in the ideal setting the probabilities that the individual runways are accessed by a randomly arriving commercial jet are 2/9, 1/6, and 11/18 for runway 1, runway 2, and runway 3, respectively. If there are 6 randomly arriving airplanes, what is the probability that 2 airplanes will land on runway 1, 1 on runway 2, and 3 on runway 3?

$$f\left(2, 1, 3; \tfrac{2}{9}, \tfrac{1}{6}, \tfrac{11}{18}, 6\right) = \frac{6!}{2!\,1!\,3!} \left(\frac{2}{9}\right)^{2} \left(\frac{1}{6}\right)^{1} \left(\frac{11}{18}\right)^{3} = 0.1127$$

Question: What is the probability that 2 airplanes will land on runway 1?

Chapter 5.4 Hypergeometric Distribution

As opposed to the binomial distribution, the hypergeometric distribution is based on sampling done without replacement. Independence among trials is not required. Applications for the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic sampling, and quality assurance.

The experiment where the hypergeometric distribution applies must possess the following two properties:
1. A random sample of size n is selected without replacement from N items.
2. k of the N items may be classified as successes and N–k are classified as failures.
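
The two properties above lead to a counting argument: all ${}_{N}C_{n}$ samples are equally likely, and ${}_{k}C_{x} \cdot {}_{N-k}C_{n-x}$ of them contain exactly x successes. A minimal Python sketch of that argument follows; the function name `hypergeom_pmf` and the lot sizes used in the demonstration are assumptions chosen purely for illustration, not values from the lecture.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """Probability of x successes in a sample of n items drawn without
    replacement from N items, of which k are successes."""
    if x < 0 or x > n or x > k or n - x > N - k:
        return 0.0  # this sample composition is impossible
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Hypothetical lot: N = 20 items, k = 4 defectives, sample of n = 5
probs = [hypergeom_pmf(x, 20, 5, 4) for x in range(6)]
for x, q in enumerate(probs):
    print(x, round(q, 4))
print("total:", round(sum(probs), 10))  # probabilities over all x sum to 1
```

The guard clause returns zero for impossible counts, for example more successes in the sample than exist in the lot.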
The number X of successes of a hypergeometric experiment is called a hypergeometric random variable. The hypergeometric distribution of such a variable is denoted by h(x; N, n, k).

A particular part that is used as an injection device is sold in lots of 10. The producer feels that a lot is acceptable if it contains no more than one defective. Some lots are sampled, and the sampling plan involves randomly selecting and testing 3 of the 10 parts. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.

If there are 2 defectives in the lot,

$$P(X = 0) = \frac{{}_{2}C_{0} \cdot {}_{8}C_{3}}{{}_{10}C_{3}} = 0.467$$

so there is still a 46.7% chance that the lot is accepted.

If there are 3 defectives in the lot,

$$P(X = 0) = \frac{{}_{3}C_{0} \cdot {}_{7}C_{3}}{{}_{10}C_{3}} \approx 0.292$$

so there is still a 29.2% chance that the lot is accepted.

In conclusion, this quality control plan is faulty: an unacceptable lot can still be accepted with high probability. A sample of 3 is not enough, and the sample size must be increased.

The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N–k labeled failure, is

$$h(x; N, n, k) = \frac{{}_{k}C_{x} \cdot {}_{N-k}C_{n-x}}{{}_{N}C_{n}}, \quad x = 0, 1, 2, \ldots, n$$

The mean and variance of the hypergeometric distribution h(x; N, n, k) are

$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$$

Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling the lot is to select 5 components at random and to reject the lot if a defective is found.
(a) What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot? (b) Find the mean and variance of the random variable and use Chebyshev’s theorem to interpret the interval μ ± 2σ.

(a)
$$h(1; 40, 5, 3) = \frac{{}_{3}C_{1} \cdot {}_{37}C_{4}}{{}_{40}C_{5}} = 0.3011$$

Again, this method of testing is not acceptable, since it detects a bad lot (with 3 defectives) only about 30% of the time.

(b)
$$\mu = \frac{nk}{N} = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$$

$$\sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right) = \frac{40-5}{40-1} \cdot (5) \cdot \frac{3}{40} \left(1 - \frac{3}{40}\right) = 0.3113, \qquad \sigma = 0.558$$

By Chebyshev’s theorem, at least 3/4 of the time the number of defectives in the sample lies between μ – 2σ = –0.741 and μ + 2σ = 1.491, that is, the sample contains fewer than 2 defectives.

If the sample size n is small compared to the lot size N, the nature of the N items changes very little with each draw, even though the sampling is done without replacement. In this case, where n/N ≤ 0.05, the binomial distribution can be used to approximate the hypergeometric distribution.

A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local distributor, 1000 are slightly blemished. If one purchases 10 of these tires at random from the distributor, what is the probability that exactly 3 are blemished?

Exact hypergeometric probability:

$$h(3; 5000, 10, 1000) = \frac{{}_{1000}C_{3} \cdot {}_{4000}C_{7}}{{}_{5000}C_{10}} = 0.2015$$

Approximation using the binomial distribution, with p = 1000/5000 = 1/5:

$$b\left(3; 10, \tfrac{1}{5}\right) = {}_{10}C_{3} \left(\frac{1}{5}\right)^{3} \left(\frac{4}{5}\right)^{10-3} = 0.2013$$

Chapter 5.5 Negative Binomial and Geometric Distributions

Negative Binomial Distribution

Consider an experiment where the properties are the same as those listed for a binomial experiment, with the exception that the trials will be repeated until a fixed number of successes occur. We are interested in the probability that the kth success occurs on the xth trial. This kind of experiment is called a negative binomial experiment.
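
Such an experiment can be simulated directly: repeat Bernoulli trials and stop as soon as the kth success appears. The sketch below is an illustration only and not part of the original lecture; the function name `trials_until_kth_success`, the seed, and the values k = 4 and p = 0.55 are assumptions chosen for the demonstration.

```python
import random

def trials_until_kth_success(k: int, p: float, rng: random.Random) -> int:
    """Run Bernoulli trials with success probability p (assumes 0 < p <= 1)
    until the k-th success occurs; return the number of that trial."""
    trials = 0
    successes = 0
    while successes < k:
        trials += 1
        if rng.random() < p:  # this trial is a success
            successes += 1
    return trials

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
samples = [trials_until_kth_success(4, 0.55, rng) for _ in range(10_000)]

# Relative frequency with which the 4th success falls on the 6th trial
print(samples.count(6) / len(samples))
```

Every simulated value is at least k, because k successes need at least k trials. The relative frequencies estimate the probabilities of the distribution of the number of trials.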
The number X of trials to produce k successes in a negative binomial experiment is called a negative binomial random variable, and its probability distribution is called the negative binomial distribution.

|Negative Binomial Distribution| If repeated independent trials can result in a success with probability p and a failure with probability q = 1–p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is

$$b^{*}(x; k, p) = {}_{x-1}C_{k-1}\, p^{k} q^{x-k}, \quad x = k, k+1, k+2, \ldots$$

In an NBA (National Basketball Association) championship series, the team that wins four games out of seven is the winner. Suppose that team A has probability 0.55 of winning over team B, and that teams A and B face each other in the championship games. (a) What is the probability that team A will win the series in six games? (b) What is the probability that team A will win the series?

(a)
$$b^{*}(6; 4, 0.55) = {}_{6-1}C_{4-1}\, (0.55)^{4} (0.45)^{6-4} = 0.1853$$

(b)
$$P(\text{team A wins the championship series}) = b^{*}(4; 4, 0.55) + b^{*}(5; 4, 0.55) + b^{*}(6; 4, 0.55) + b^{*}(7; 4, 0.55)$$
$$= 0.0915 + 0.1647 + 0.1853 + 0.1668 = 0.6083$$

(c) If both teams face each other in a regional playoff series and the winner is decided by winning three out of five games, what is the probability that team A will win a playoff?
(c)
$$P(\text{team A wins the regional series}) = b^{*}(3; 3, 0.55) + b^{*}(4; 3, 0.55) + b^{*}(5; 3, 0.55)$$
$$= 0.1664 + 0.2246 + 0.2021 = 0.5931$$

Geometric Distribution

If we consider the special case of the negative binomial distribution where k = 1, we have a probability distribution for the number of trials required for a single success.

If repeated independent trials can result in a success with probability p and a failure with probability q = 1–p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is

$$g(x; p) = p\, q^{x-1}, \quad x = 1, 2, 3, \ldots$$

The mean and variance of a random variable following the geometric distribution are

$$\mu = \frac{1}{p} \quad \text{and} \quad \sigma^2 = \frac{1-p}{p^2}$$

In a certain manufacturing process it is known that, on average, 1 in every 100 items is defective. What is the probability that the fifth item inspected is the first defective item found?

$$g(5; 0.01) = (0.01)(0.99)^{4} = 0.0096$$

At “busy time” a telephone exchange is very near capacity, so callers have difficulty placing their calls. It may be of interest to know the number of attempts necessary to gain a connection. Suppose that we let p = 0.05 be the probability of a connection during busy time. We are interested in the probability that 5 attempts are necessary for a successful call.

$$P(X = 5) = g(5; 0.05) = (0.05)(0.95)^{4} = 0.041$$

Chapter 5.6 Poisson Distribution and Poisson Process

Experiments yielding numerical values of a random variable X, the number of outcomes occurring during a given time interval or in a specified region, are called Poisson experiments. The time interval may be of any length, such as a minute, a day, a week, or a month.
The specified region may be a line segment, an area, a volume, or a piece of material.

Properties of Poisson Process

A Poisson experiment is derived from the Poisson process and possesses the following properties:
1. The number of outcomes occurring in one time interval or specified region is independent of the number that occurs in any other disjoint time interval or region of space.
2. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region and does not depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.

|Poisson Distribution| The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is

$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$

where λ is the average number of outcomes per unit time or region, and e = 2.71828... is the base of the natural logarithm. The mean and variance of the Poisson distribution p(x; λt) both have the value λt.

During a laboratory experiment the average number of radioactive particles passing through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in a given millisecond?

With x = 6 and λt = 4,

$$p(6; 4) = \frac{e^{-4} (4)^{6}}{6!} = 0.1042$$

Ten is the average number of oil tankers arriving each day at a certain port city. The facilities at the port can handle at most 15 tankers per day.
What is the probability that on a given day tankers have to be turned away?

$$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10)$$
$$= 1 - \left( \frac{e^{-10} (10)^{0}}{0!} + \frac{e^{-10} (10)^{1}}{1!} + \frac{e^{-10} (10)^{2}}{2!} + \cdots + \frac{e^{-10} (10)^{15}}{15!} \right) = 1 - 0.9513 = 0.0487$$

Table A.2 can help with the sum.

[Table A.2 Poisson Probability Sums]

Poisson Distribution As a Limit of Binomial

It should be clear from the three properties of the Poisson process that the Poisson distribution relates to the binomial distribution. In the case of the binomial, if n is quite large and p is small, the conditions begin to simulate the continuous space or time region implications of the Poisson process. The Poisson distribution can be taken as a limiting form of the binomial distribution when n → ∞ and p → 0 while np remains constant. If these conditions are fulfilled, the Poisson distribution with μ = np can be used to approximate the binomial distribution.

Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞ and p → 0, and μ = np remains constant,

$$b(x; n, p) \to p(x; \mu)$$

In a certain industrial facility accidents occur infrequently. It is known that the probability of an accident on any given day is 0.005 and accidents are independent of each other. (a) What is the probability that in any given period of 400 days there will be an accident on one day? (b) What is the probability that there are at most three days with an accident?

(a) Considered as a Poisson process, with x = 1 and λt = (0.005)(400) = 2:

$$p(1; 2) = \frac{e^{-2} (2)^{1}}{1!} = 0.2707$$

Considered as a Bernoulli process:

$$b(1; 400, 0.005) = {}_{400}C_{1}\, (0.005)^{1} (0.995)^{399} = 0.2707$$

(b) Considered as a Poisson process:

$$P(X \le 3) = \sum_{x=0}^{3} p(x; 2) = \sum_{x=0}^{3} \frac{e^{-2} (2)^{x}}{x!} = 0.8571$$

Considered as a Bernoulli process:

$$P(X \le 3) = \sum_{x=0}^{3} b(x; 400, 0.005) \approx 0.8576$$

which agrees with the Poisson value to three decimal places.

In a manufacturing process where glass products are produced, defects or bubbles occur, occasionally rendering the piece undesirable for marketing. It is known that, on average, 1 in every 1000 of these items produced has one or more bubbles. What is the probability that a random sample of 8000 will yield fewer than 7 items possessing bubbles?

This is actually a binomial problem, solved here by approximation using the Poisson distribution with μ = (8000)(0.001) = 8:

$$P(X < 7) = \sum_{x=0}^{6} b(x; 8000, 0.001) \approx \sum_{x=0}^{6} p(x; 8) = 0.3134$$

Probability and Statistics
Homework 6

1. A communications system consists of n components, each of which will, independently, function with probability p. The total system will be able to operate effectively if at least one-half of its components function. For what values of p is a 5-component system more likely to operate effectively than a 3-component system? (Ro.E5.1c s144)
2. It has been established that the number of defective stereos produced daily at a certain plant is Poisson distributed with mean 4. Over a 2-day span, what is the probability that the number of defective stereos does not exceed 3? (Ro.E5.2f s+10)
3. The probability of hitting a target is 1/5 and ten shots are fired independently. (a) What is the probability of the target being hit at least twice? (b) Find the conditional probability that the target is hit at least twice, assuming that at least one hit is scored. (Fe.VI.10.5-6 s16.9)
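
As a closing numerical check, the Poisson examples of this lecture and the quality of the Poisson approximation to the binomial can be evaluated directly. This is a minimal Python sketch that is not part of the original slides; the helper names `poisson_pmf`, `poisson_cdf`, and `binom_cdf` are illustrative assumptions, and only the standard library is used.

```python
from math import comb, exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(r: int, mu: float) -> float:
    """Sum of p(x; mu) for x = 0..r (the kind of sum listed in Table A.2)."""
    return sum(poisson_pmf(x, mu) for x in range(r + 1))

def binom_cdf(r: int, n: int, p: float) -> float:
    """Sum of b(x; n, p) for x = 0..r."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(r + 1))

# Oil-tanker example: P(X > 15) with mu = 10
turned_away = 1 - poisson_cdf(15, 10)

# Accident example (b): exact binomial versus Poisson with mu = np = 2
accident_exact = binom_cdf(3, 400, 0.005)
accident_approx = poisson_cdf(3, 2)

# Glass example: P(X < 7) with n = 8000, p = 0.001, mu = np = 8
glass_exact = binom_cdf(6, 8000, 0.001)
glass_approx = poisson_cdf(6, 8)

print(round(turned_away, 4))
print(round(accident_exact, 4), round(accident_approx, 4))
print(round(glass_exact, 4), round(glass_approx, 4))
```

Printing the exact and approximate values side by side shows how close the two are when n is large and p is small.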