Probability and Statistics
Lecture 6
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
2013

President University Erwin Sitompul PBST 6/1

Chapter 5 Some Discrete Probability Distributions

Chapter 5.1 Introduction

Often, the observations generated by different statistical experiments have the same general type of behavior. The discrete random variables associated with these experiments can be described by essentially the same probability distribution, and therefore by a single formula.

In fact, one needs only a handful of important probability distributions to describe many of the discrete random variables encountered in practice. In this chapter, we present these commonly used distributions with various examples.

Chapter 5.2 Discrete Uniform Distribution

If the random variable X assumes the values x_1, x_2, ..., x_k with equal probabilities, then the discrete uniform distribution is given by

$$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k$$

When a light bulb is selected at random from a box that contains a 40-watt bulb, a 60-watt bulb, a 75-watt bulb, and a 100-watt bulb, each element of the sample space S = {40, 60, 75, 100} occurs with probability 1/4. Therefore, we have a uniform distribution with

$$f(x; 4) = \frac{1}{4}, \quad x = 40, 60, 75, 100$$

When a die is tossed, each element of the sample space S = {1, 2, 3, 4, 5, 6} occurs with probability 1/6. Therefore, we have a uniform distribution with

$$f(x; 6) = \frac{1}{6}, \quad x = 1, 2, 3, 4, 5, 6$$

Chapter 5.3 Binomial and Multinomial Distributions

Bernoulli Process

An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure.
We may choose to define either outcome as a success. The process is referred to as a Bernoulli process. Each trial is called a Bernoulli trial.

Strictly speaking, the Bernoulli process must possess the following properties:
1. The experiment consists of n repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Consider the set of Bernoulli trials where three items are selected at random from a manufacturing process, inspected, and classified as defective (D) or non-defective (N). A defective item is designated a success. The number of successes is a random variable X assuming integer values from 0 to 3.

The items are selected independently from the process, and we shall assume that it produces 25% defectives. The probability of the outcome NDN can be calculated as

$$P(NDN) = P(N)\,P(D)\,P(N) = \frac{3}{4} \cdot \frac{1}{4} \cdot \frac{3}{4} = \frac{9}{64}$$

The probabilities of the other possible outcomes can be calculated in the same way, giving the probability distribution of X:

x       0       1       2       3
f(x)    27/64   27/64   9/64    1/64

The number X of successes in n Bernoulli trials is called a binomial random variable. The probability distribution of this discrete random variable is called the binomial distribution and is denoted by b(x; n, p). For the example above,

$$P(X = 2) = f(2) = b\left(2; 3, \tfrac{1}{4}\right) = \frac{9}{64}$$

Binomial Distribution

A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 - p.
Then the probability distribution of the binomial random variable X, the number of successes in n independent trials, is

$$b(x; n, p) = {}_nC_x \, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

The mean and variance of the binomial distribution b(x; n, p) are

$$\mu = np, \qquad \sigma^2 = npq$$

The probability that a certain kind of component will survive a given shock test is 3/4. Find the probability that exactly 2 of the next 4 components tested will survive.

With n = 4 and p = 3/4,

$$b\left(2; 4, \tfrac{3}{4}\right) = {}_4C_2 \left(\tfrac{3}{4}\right)^2 \left(\tfrac{1}{4}\right)^2 = \frac{54}{256}$$

The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that (a) at least 10 survive, (b) from 3 to 8 survive, and (c) exactly 5 survive?

Let X be the number of people that survive. Table A.1 can be used to look up the cumulative sums.

(a)
$$P(X \ge 10) = 1 - P(X < 10) = 1 - \sum_{x=0}^{9} b(x; 15, 0.4) = 1 - 0.9662 = 0.0338$$

Can you calculate this manually, as $\sum_{x=10}^{15} b(x; 15, 0.4)$?

(b)
$$P(3 \le X \le 8) = \sum_{x=3}^{8} b(x; 15, 0.4) = \sum_{x=0}^{8} b(x; 15, 0.4) - \sum_{x=0}^{2} b(x; 15, 0.4) = 0.9050 - 0.0271 = 0.8779$$

(c)
$$P(X = 5) = b(5; 15, 0.4) = {}_{15}C_5 \, (0.4)^5 (0.6)^{10} = 0.1859$$

Table A.1 Binomial Probability Sums (table not reproduced here)

A large chain retailer purchases a certain kind of electronic device from a manufacturer. The manufacturer indicates that the defective rate of the device is 3%.
(a) The inspector of the retailer randomly picks 20 items from a shipment. What is the probability that there will be at least one defective item among these 20?
(b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices per shipment. What is the probability that there will be 3 shipments containing at least one defective device?
Let X be the number of defective devices among the 20 items.

(a)
$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - {}_{20}C_0 \, (0.03)^0 (1 - 0.03)^{20} = 0.4562$$

(b) Let Y be the number of shipments, out of 10, that contain at least one defective device among the 20 tested. Each shipment is then a Bernoulli trial with success probability p = 0.4562, so

$$P(Y = 3) = b(3; 10, 0.4562) = {}_{10}C_3 \, (0.4562)^3 (1 - 0.4562)^{10-3} = 0.1602$$

It is conjectured that an impurity exists in 30% of all drinking wells in a certain rural community. In order to gain some insight into this problem, it is determined that some tests should be made. It is too expensive to test all of the many wells in the area, so 10 were randomly selected for testing.
(a) Using the binomial distribution, what is the probability that exactly three wells have the impurity, assuming that the conjecture is correct?
(b) What is the probability that more than three wells are impure?

(a)
$$P(X = 3) = {}_{10}C_3 \, (0.3)^3 (1 - 0.3)^{10-3} = 0.2668$$

(b)
$$P(X > 3) = 1 - P(X \le 3) = 1 - \sum_{x=0}^{3} b(x; 10, 0.3) = 1 - (0.0282 + 0.1211 + 0.2335 + 0.2668) = 0.3504$$

Try also to use Table A.1 to find this value.

Consider the previous "drinking wells" example. The "30% are impure" figure is merely a conjecture put forth by the area water board. Suppose 10 wells are randomly selected and 6 are found to contain the impurity. What does this imply about the conjecture? Use a probability statement.

$$P(X = 6) = {}_{10}C_6 \, (0.3)^6 (1 - 0.3)^{10-6} = 0.0368$$

If the 30% impurity conjecture were true, there would be only a 3.68% chance of finding exactly 6 contaminated wells among the 10 tested. The investigation therefore suggests that the impurity problem is much more severe than 30%.

The binomial experiment becomes a multinomial experiment if we let each trial have more than 2 possible outcomes.
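The binomial results in the retailer and drinking-wells examples above can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper names `binom_pmf` and `binom_cdf` are illustrative assumptions, and only the standard-library function `math.comb` is used.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(r: int, n: int, p: float) -> float:
    """Sum of b(x; n, p) for x = 0..r (the kind of sum tabulated in Table A.1)."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))


# Retailer example: at least one defective among 20 items, defective rate 3%.
p_shipment = 1 - binom_pmf(0, 20, 0.03)
# Exactly 3 of 10 shipments contain at least one defective device.
p_three_shipments = binom_pmf(3, 10, p_shipment)

# Drinking wells example: n = 10 wells, conjectured p = 0.3.
p_exactly_three = binom_pmf(3, 10, 0.3)
p_more_than_three = 1 - binom_cdf(3, 10, 0.3)
p_exactly_six = binom_pmf(6, 10, 0.3)

print(f"retailer: {p_shipment:.4f}, {p_three_shipments:.4f}")
print(f"wells:    {p_exactly_three:.4f}, {p_more_than_three:.4f}, {p_exactly_six:.4f}")
```

The printed values should agree with the slide results to about four decimal places. Summing the probability function directly, as `binom_cdf` does, is adequate for small n like these; for large n a library routine would be the more usual choice.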
Multinomial Distribution

If a given trial can result in the k outcomes E_1, E_2, ..., E_k with probabilities p_1, p_2, ..., p_k, then the probability distribution of the random variables X_1, X_2, ..., X_k, representing the number of occurrences of E_1, E_2, ..., E_k in n independent trials, is

$$f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

with

$$\sum_{i=1}^{k} x_i = n \quad \text{and} \quad \sum_{i=1}^{k} p_i = 1$$

The complexity of arrivals and departures at an airport is such that computer simulation is often used to model the "ideal" conditions. For a certain airport containing three runways, it is known that in the ideal setting the probabilities that the individual runways are accessed by a randomly arriving commercial jet are 2/9, 1/6, and 11/18 for runway 1, runway 2, and runway 3, respectively. If there are 6 randomly arriving airplanes, what is the probability that 2 airplanes land on runway 1, 1 on runway 2, and 3 on runway 3?

$$f\left(2, 1, 3; \tfrac{2}{9}, \tfrac{1}{6}, \tfrac{11}{18}, 6\right) = \frac{6!}{2! \, 1! \, 3!} \left(\tfrac{2}{9}\right)^2 \left(\tfrac{1}{6}\right)^1 \left(\tfrac{11}{18}\right)^3 = 0.1127$$

Question: What is the probability that 2 airplanes land on runway 1?

Chapter 5.4 Hypergeometric Distribution

As opposed to the binomial distribution, the hypergeometric distribution is based on sampling done without replacement. Independence among trials is not required.

Applications for the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic sampling, and quality assurance.

The experiment where the hypergeometric distribution applies must possess the following two properties:
1. A random sample of size n is selected without replacement from N items.
2. k of the N items may be classified as successes and N - k are classified as failures.
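To see why independence is lost when sampling without replacement, consider a small hypothetical lot that is not taken from the slides: N = 10 items of which k = 2 are successes. The sketch below uses exact fractions to compare the success probability of the second draw under different outcomes of the first draw; all variable names are illustrative.

```python
from fractions import Fraction

N, k = 10, 2  # hypothetical lot: 10 items, 2 of them classified as successes

# Probability that the first draw is a success.
p_first = Fraction(k, N)

# Second draw, given the first was a success:
# one success and one item have left the lot.
p_second_given_success = Fraction(k - 1, N - 1)

# Second draw, given the first was a failure:
# all k successes remain among N - 1 items.
p_second_given_failure = Fraction(k, N - 1)

# Unconditional probability of success on the second draw (total probability).
p_second = (p_first * p_second_given_success
            + (1 - p_first) * p_second_given_failure)

print(p_first, p_second_given_success, p_second_given_failure, p_second)
# prints: 1/5 1/9 2/9 1/5
```

The conditional probabilities 1/9 and 2/9 differ from each other, so the draws are dependent and property 4 of the Bernoulli process fails, even though the unconditional success probability on the second draw is still 1/5.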
The number X of successes of a hypergeometric experiment is called a hypergeometric random variable. The hypergeometric distribution of such a variable is denoted by h(x; N, n, k).

A particular part that is used as an injection device is sold in lots of 10. The producer considers a lot acceptable if it contains no more than one defective. Some lots are sampled, and the sampling plan involves randomly selecting and testing 3 of the 10 parts. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.

If the lot contains 2 defectives,

$$P(X = 0) = \frac{{}_2C_0 \; {}_8C_3}{{}_{10}C_3} = 0.467$$

so there is still a 46.7% chance that the lot is accepted.

If the lot contains 3 defectives,

$$P(X = 0) = \frac{{}_3C_0 \; {}_7C_3}{{}_{10}C_3} = 0.292$$

so there is still a 29.2% chance that the lot is accepted.

In conclusion, this quality control plan is faulty. An unacceptable lot can still be accepted with high probability. A sample of 3 is not enough; the sample size must be increased.

The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N - k are labeled failure, is

$$h(x; N, n, k) = \frac{{}_kC_x \; {}_{N-k}C_{n-x}}{{}_NC_n}, \quad x = 0, 1, 2, \ldots, n$$

The mean and variance of the hypergeometric distribution h(x; N, n, k) are

$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$$

Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling the lot is to select 5 components at random and to reject the lot if a defective is found.
(a) What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot?
(b) Find the mean and variance of the random variable and use Chebyshev's theorem to interpret the interval μ ± 2σ.

(a)
$$h(1; 40, 5, 3) = \frac{{}_3C_1 \; {}_{37}C_4}{{}_{40}C_5} = 0.3011$$

Again, this method of testing is not acceptable, since it detects a bad lot (with 3 defectives) only about 30% of the time.

(b)
$$\mu = \frac{nk}{N} = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$$

$$\sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right) = \frac{40-5}{40-1} \, (5) \, \frac{3}{40} \left(1 - \frac{3}{40}\right) = 0.3113, \qquad \sigma = 0.558$$

Chebyshev's theorem: the interval μ ± 2σ is 0.375 ± 2(0.558), so at least 3/4 of the time the number of defectives will be between -0.741 and 1.491 components, that is, the sample will contain fewer than 2 defectives.

If the sample size n is small compared to the lot size N, the nature of the N items changes very little with each draw, even though sampling is done without replacement. In this case, where n/N ≤ 0.05, the binomial distribution can be used to approximate the hypergeometric distribution.

A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local distributor, 1000 are slightly blemished. If one purchases 10 of these tires at random from the distributor, what is the probability that exactly 3 are blemished?

Exact hypergeometric probability:

$$h(3; 5000, 10, 1000) = \frac{{}_{1000}C_3 \; {}_{4000}C_7}{{}_{5000}C_{10}} = 0.2015$$

Approximation using the binomial distribution, with p = 1000/5000 = 1/5:

$$b\left(3; 10, \tfrac{1}{5}\right) = {}_{10}C_3 \left(\tfrac{1}{5}\right)^3 \left(\tfrac{4}{5}\right)^{10-3} = 0.2013$$

No homework. Prepare well for your Mid Examination. Take time to read the slides. Try to redo homework problems.
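As a numerical check of the hypergeometric examples in this chapter, including the tire example and its binomial approximation, the following minimal sketch evaluates the formulas directly. It is not part of the original slides; the function names `hypergeom_pmf` and `binom_pmf` are illustrative assumptions, and only `math.comb` from the standard library is used.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """h(x; N, n, k) = kCx * (N-k)C(n-x) / NCn, for 0 <= x <= n."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Injection-device lots of 10, sample of 3: probability of accepting
# a lot that actually contains 2 or 3 defectives.
accept_with_2 = hypergeom_pmf(0, 10, 3, 2)
accept_with_3 = hypergeom_pmf(0, 10, 3, 3)

# Lots of 40 with 3 defectives, sample of 5.
N, n, k = 40, 5, 3
p_one_defective = hypergeom_pmf(1, N, n, k)
mean = n * k / N
variance = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)

# Tire example: exact hypergeometric value versus binomial approximation.
exact = hypergeom_pmf(3, 5000, 10, 1000)
approx = binom_pmf(3, 10, 1000 / 5000)

print(f"accept: {accept_with_2:.4f}, {accept_with_3:.4f}")
print(f"lot of 40: {p_one_defective:.4f}, mean {mean:.3f}, variance {variance:.4f}")
print(f"tires: exact {exact:.4f}, binomial {approx:.4f}")
```

Python integers have arbitrary precision, so the large binomial coefficients in the tire example are computed exactly before the final division. In that example n/N = 10/5000 = 0.002, well under the 0.05 guideline, which is why the two values differ only in the fourth decimal place.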