Discrete Probability Distributions

The Bernoulli Distribution

A Bernoulli random variable is any random variable denoting the presence or absence of a certain condition in an observed phenomenon. One of the outcomes is termed a “success” (X = 1, with probability p) and the other a “failure” (X = 0, with probability 1 - p).

The Bernoulli distribution equation is

$p(x) = p^{x}(1-p)^{1-x}$, x = 0, 1, for $0 \le p \le 1$

E(X) = p and V(X) = p(1-p)

The Binomial Distribution

A random variable Y possesses a binomial distribution if
1. The experiment consists of a fixed number n of trials.
2. Each trial can result in one of only two possible outcomes, called “success” and “failure”.
3. The probability of “success”, p, is constant from trial to trial.
4. The trials are independent.
5. Y is defined as the number of successes among the n trials.

The binomial distribution equation is

$p(y) = \binom{n}{y} p^{y}(1-p)^{n-y}$, y = 0, 1, 2, …, n, for $0 \le p \le 1$

E(Y) = np and V(Y) = np(1-p)

Example: Suppose a large lot contains 10% defective fuses. Four fuses are randomly sampled from the lot.
a. Find the probability that exactly one fuse in the sample of four is defective.
b. Find the probability that at least one fuse in the sample of four is defective.

Solution: Let Y = the number of defective fuses in the sample, so that n = 4 and p = 0.1.
a. $P(Y = 1) = \binom{4}{1}(0.1)^{1}(0.9)^{3} = 0.2916$
b. $P(Y \ge 1) = 1 - P(Y = 0) = 1 - (0.9)^{4} = 0.3439$

Example: In a study of lifetimes for a certain type of battery, it was found that the probability of a lifetime X exceeding 4 hours is 0.135. If three such batteries are in use in independently operating systems, find the probability that
a. Only one of the batteries lasts 4 hours or more.
b. At least one battery lasts 4 hours or more.

Solution: Let Y = the number of batteries (out of three) lasting 4 hours or more. We can reasonably assume that Y has a binomial distribution with n = 3 and p = 0.135. Hence,
a. $P(Y = 1) = \binom{3}{1}(0.135)^{1}(0.865)^{2} = 0.303$
b. $P(Y \ge 1) = 1 - P(Y = 0) = 1 - (0.865)^{3} = 1 - 0.647 = 0.353$

Example: An industrial firm supplies 10 manufacturing plants with a certain chemical. The probability that any one plant calls in an order on a given day is 0.2, and this probability is the same for all 10 plants. Find the probability that, on a given day, the number of plants calling in orders is
a. Exactly 3
b. At most 3
c. At least 3

Solution: Let Y = the number of plants (out of 10) calling in orders on the day in question. If the plants order independently, then Y can be modelled as having a binomial distribution with n = 10 and p = 0.2.
a. The probability of exactly 3 out of 10 plants calling in orders is
$P(Y = 3) = \binom{10}{3}(0.2)^{3}(0.8)^{7} = 0.201$
b. The probability of at most 3 out of 10 plants calling in orders is
$P(Y \le 3) = p(0) + p(1) + p(2) + p(3)$ = 0.107 + 0.269 + 0.302 + 0.201 = 0.879
c. The probability of at least 3 out of 10 plants calling in orders is
$P(Y \ge 3) = 1 - P(Y \le 2)$ = 1 - (0.107 + 0.269 + 0.302) = 0.322

The Geometric and Negative Binomial Distributions

The Geometric Distribution

The geometric distribution is a special case of the negative binomial distribution. It concerns the random variable Y, the number of the trial on which the first success occurs.

The results of y successive trials are then as follows, with F = failure and S = success:

F F F … F (first y-1 trials)   S (last, yth trial)

Therefore, the probability of observing failures on the first (y-1) trials and a success on the last (yth) trial (i.e. the probability of requiring y trials until the first success is observed) is

$p(y) = (1-p)^{y-1} p$, y = 1, 2, 3, …, for $0 < p \le 1$

E(Y) = 1/p and V(Y) = (1-p)/p^2

The Geometric Distribution

Success on trial number    Sequence    Probability
1                          S           p
2                          FS          (1-p)p
3                          FFS         (1-p)^2 p
4                          FFFS        (1-p)^3 p
.                          .           .
y                          FF…FS       (1-p)^(y-1) p

Example: A recruiting firm finds that 30% of the applicants for a certain industrial job have advanced training in computer programming. Applicants are selected at random from the pool and are interviewed sequentially.
a. Find the probability that the first applicant having advanced training is found on the fifth interview.
b. Suppose the first applicant with the advanced training is offered the position, and the applicant accepts. If each interview costs Php300, find the expected value and variance of the total cost of interviewing incurred before the job is filled.

Solution: Let Y = the number of the interview on which the first applicant with advanced training is found, so that p = 0.3.
a. $P(Y = 5) = (0.7)^{4}(0.3) = 0.072$
b. The total cost of interviewing is C = 300Y. Hence
$E(C) = 300\,E(Y) = 300\left(\frac{1}{0.3}\right) = 1000$, that is, Php1,000, and
$V(C) = (300)^{2}\,V(Y) = 90{,}000\left(\frac{0.7}{(0.3)^{2}}\right) = 700{,}000$
The standard deviation of the total cost is $\sqrt{700{,}000}$ = Php836.66.
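The binomial and geometric calculations above can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and only the standard library. The function names binomial_pmf and geometric_pmf and the variable names are illustrative choices, not part of the original notes.

```python
from math import comb, sqrt


def binomial_pmf(y, n, p):
    """P(Y = y) when Y counts successes in n independent trials."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def geometric_pmf(y, p):
    """P(first success occurs on trial y), for y = 1, 2, 3, ..."""
    return (1 - p) ** (y - 1) * p


# Fuse example: n = 4 sampled fuses, p = 0.1 defective
fuse_exactly_one = binomial_pmf(1, 4, 0.1)
fuse_at_least_one = 1 - binomial_pmf(0, 4, 0.1)

# Plant example: n = 10 plants, p = 0.2
plants_exactly_3 = binomial_pmf(3, 10, 0.2)
plants_at_most_3 = sum(binomial_pmf(y, 10, 0.2) for y in range(4))
plants_at_least_3 = 1 - sum(binomial_pmf(y, 10, 0.2) for y in range(3))

# Interview example: p = 0.3, total cost C = 300Y
first_on_fifth = geometric_pmf(5, 0.3)
mean_cost = 300 * (1 / 0.3)                 # 300 * E(Y)
var_cost = 300**2 * (1 - 0.3) / 0.3**2      # 300^2 * V(Y)
sd_cost = sqrt(var_cost)

print(f"fuses:  {fuse_exactly_one:.4f}, {fuse_at_least_one:.4f}")
print(f"plants: {plants_exactly_3:.3f}, {plants_at_most_3:.3f}, {plants_at_least_3:.3f}")
print(f"interviews: {first_on_fifth:.3f}, mean {mean_cost:.2f}, sd {sd_cost:.2f}")
```

Because the code sums unrounded probabilities, its results may differ from hand calculations with rounded table values in the last decimal place.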
The Negative Binomial Distribution

Let Y denote the number of the trial on which the rth success occurs in a sequence of independent Bernoulli trials, with p denoting the common probability of “success”. The negative binomial distribution is defined by two parameters, r and p.

The negative binomial distribution equation is

$p(y) = \binom{y-1}{r-1} p^{r}(1-p)^{y-r}$, y = r, r+1, r+2, …, for $0 < p \le 1$

Example: In the previous example, 30% of the applicants for a certain position have advanced training in computer programming. Suppose three jobs requiring advanced programming training are open. Find the probability that the third qualified applicant is found on the fifth interview, if the applicants are interviewed sequentially and at random.

Solution: Let Y denote the number of the trial on which the third qualified candidate is found. Then Y can reasonably be assumed to have a negative binomial distribution with r = 3 and p = 0.3. Hence

$P(Y = 5) = \binom{4}{2}(0.3)^{3}(0.7)^{2} = 0.079$

The Poisson Distribution

The Poisson distribution arises when we count the number of occurrences of an event over a given time period, length, or volume. For example:
The number of flaws in a square yard of fabric
The number of bacterial colonies in a cubic centimetre of water
The number of times a machine fails in the course of a workday

The Poisson distribution equation is

$p(y) = \frac{\lambda^{y} e^{-\lambda}}{y!}$, y = 0, 1, 2, …, for $\lambda > 0$

E(Y) = $\lambda$ and V(Y) = $\lambda$

Example: For a certain manufacturing industry, the number of industrial accidents averages three per week.
a. Find the probability that no accident will occur in a given week.
b. Find the probability that two accidents will occur in a given week.
c. Find the probability that at most four accidents will occur in a given week.
d. Find the probability that two accidents will occur in a given day.

Solution: With $\lambda$ = mean number of accidents per week = 3, we have
a. P(No accident in a given week) = p(0) = $\frac{3^{0} e^{-3}}{0!} = e^{-3} = 0.050$
b. P(Two accidents in a given week) = p(2) = $\frac{3^{2} e^{-3}}{2!} = 0.224$
c. P(At most 4 accidents in a given week) = p(0) + p(1) + p(2) + p(3) + p(4) = 0.815
d. Since we are interested in the number of accidents on a given day, we need a new $\lambda$ on a per-day basis, which is $\lambda$ = 3/7 = 0.4286, where the 7 represents the number of days in a week. Hence, P(Two accidents on a given day) = p(2) = $\frac{(3/7)^{2} e^{-3/7}}{2!} = 0.060$

The Hypergeometric Distribution

In general, suppose a lot consists of N items, of which k are of one type (called successes) and N-k are of another type (called failures). Suppose n items are sampled randomly and sequentially from the lot, with none of the sampled items being replaced, that is, sampling without replacement. Let Y denote the total number of successes among the n sampled items. Then the probability distribution of Y is described as the hypergeometric distribution. The hypergeometric distribution is defined by three parameters, n, k, and N.

The hypergeometric distribution equation is

$p(y) = \frac{\binom{k}{y}\binom{N-k}{n-y}}{\binom{N}{n}}$, y = 0, 1, …, n, with $\binom{a}{b} = 0$ when b > a

Example: A personnel director selects two employees for a certain job from a group of six employees, of which one is female and five are male. Find the probability that the female is selected for one of the jobs.

Solution: If the selections are made at random and if Y denotes the number of females selected, the hypergeometric distribution would describe the behaviour of Y. Hence N = 6, k = 1, n = 2, and y = 1, so that

$P(Y = 1) = \frac{\binom{1}{1}\binom{5}{1}}{\binom{6}{2}} = \frac{5}{15} = \frac{1}{3} \approx 0.333$
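The negative binomial, Poisson, and hypergeometric examples can be evaluated the same way. This is a minimal sketch, assuming Python 3.8 or later and only the standard library. The function and variable names are illustrative, not from the original notes. For the per-day question, the code assumes the weekly rate is spread evenly over 7 days, giving a daily rate of 3/7 (about 0.4286).

```python
from math import comb, exp, factorial


def negative_binomial_pmf(y, r, p):
    """P(the r-th success occurs on trial y), for y = r, r+1, ..."""
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)


def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson count with mean lam."""
    return lam**y * exp(-lam) / factorial(y)


def hypergeometric_pmf(y, N, k, n):
    """P(y successes in n draws without replacement from N items, k of them successes)."""
    return comb(k, y) * comb(N - k, n - y) / comb(N, n)


# Third qualified applicant found on the fifth interview: r = 3, p = 0.3
third_on_fifth = negative_binomial_pmf(5, 3, 0.3)

# Accident example: lam = 3 per week
no_accident = poisson_pmf(0, 3)
two_in_week = poisson_pmf(2, 3)
at_most_four = sum(poisson_pmf(y, 3) for y in range(5))
two_in_day = poisson_pmf(2, 3 / 7)   # assumed daily rate: 3 accidents / 7 days

# Personnel example: N = 6 employees, k = 1 female, n = 2 selected
female_selected = hypergeometric_pmf(1, 6, 1, 2)

print(f"negative binomial: {third_on_fifth:.3f}")
print(f"Poisson: {no_accident:.3f}, {two_in_week:.3f}, {at_most_four:.3f}, {two_in_day:.3f}")
print(f"hypergeometric: {female_selected:.3f}")
```

If the accidents should instead be spread over a shorter working week, only the divisor in the daily rate needs to change.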