PROBABILITY DISTRIBUTIONS

LECTURE ONE

INTRODUCTION
This lecture involves the use of the laws of probability and how those laws relate to experiments. It will cover the analysis of the following probability distributions: binomial, hypergeometric, Poisson, exponential and normal.

Probability distributions are used to show how the entire set of events or outcomes of an experiment can be represented and expressed as a random variable. A random variable is a variable whose value is the result of a random or chance event. For example, suppose a coin is tossed. The specific outcome of a single toss (either Head or Tail) is an observation. Such an observation is the random variable, and the value of the random variable is the result of chance.

LECTURE OBJECTIVES
By the end of the lesson the learner should be able to:
- Define probability distributions.
- Distinguish the various forms of probability distributions.
- Use and apply discrete probability distributions.

A probability distribution is a display or list of all possible outcomes of an experiment, along with the probability associated with each outcome, presented in the form of a table, a graph or a formula. For example, suppose we toss (flip) an unbiased coin three times. The sample space is:

E = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

From the above sample space, the probability of getting:
1. Zero heads, i.e. P(TTT) = 1/8
2. One head, i.e. P(HTT or THT or TTH) = 3/8
3. Two heads, i.e. P(HHT or HTH or THH) = 3/8
4. All heads, i.e. P(HHH) = 1/8

Listing all the possible outcomes, along with the probability associated with each outcome, we obtain a probability distribution:

Outcome (heads, x)   Probability P(x)
0                    1/8
1                    3/8
2                    3/8
3                    1/8

The probabilities of all possible outcomes sum to one.
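The coin-toss distribution above can be built by direct enumeration. The following is a minimal Python sketch (variable names are ours, not from the lecture): it lists the eight equally likely outcomes, counts heads in each, and tallies the probabilities as exact fractions.

```python
from itertools import product
from fractions import Fraction

# Enumerate the sample space for three tosses of a fair coin: 8 equally likely outcomes
sample_space = list(product("HT", repeat=3))

# Build the probability distribution of X = number of heads
dist = {}
for outcome in sample_space:
    heads = outcome.count("H")
    dist[heads] = dist.get(heads, Fraction(0)) + Fraction(1, len(sample_space))

for x in sorted(dist):
    print(x, dist[x])          # 0 1/8, 1 3/8, 2 3/8, 3 1/8
print(sum(dist.values()))      # 1 -- the probabilities sum to one
```

Note that the distribution is recovered exactly as in the table above, and the probabilities sum to one as required.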
ΣP(xi) = 1, summed over all i = 1, …, n

The probability that the random variable X takes on some specific value xi is written P(X = xi). Thus, the probability that two heads are obtained above is P(X = 2) = 3/8.

Note that 0 ≤ P(X = xi) ≤ 1 and ΣP(X = xi) = 1.

NB: A distribution in which the probabilities of all outcomes are the same is referred to as a uniform probability distribution. For example, suppose you toss a die. The outcome is a random variable and can take any value from one to six, that is 1, 2, 3, 4, 5, 6. The probability of each outcome is 1/6. Hence:

OUTCOME   PROBABILITY
1         1/6
2         1/6
3         1/6
4         1/6
5         1/6
6         1/6

Discrete vs continuous probability distributions
There are two types of probability distributions: discrete and continuous.

A discrete probability distribution is one whose random variable can take only a distinct number of values, usually whole numbers. It is most often the result of counting or enumeration, for example the number of customers, the number of units sold, or the number of heads after tossing a coin. In such cases, the random variable cannot be a fraction.

A continuous probability distribution is one whose random variable can take an infinite number of values within a given range. There are no gaps in the observations, since no matter how close two observations might be, a third can be found which will fall between the first two. A continuous probability distribution is usually the result of measurement.

DISCRETE PROBABILITY DISTRIBUTIONS

The Mean and the Variance

Mean
The mean of a probability distribution is referred to as the expected value of the random variable. The expected value of a random variable X is written E(X). It is the sum of the products of each possible outcome and its probability. That is,

μ = E(X) = Σ[x · P(x)]

The expected value of a discrete random variable is the weighted mean of all possible outcomes, in which the weights are the respective probabilities of those outcomes. For example:
The expected value of the experiment of tossing a coin three times is

μ = E(X) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5

Similarly, the expected value of tossing a die is

μ = E(X) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5

Variance
Variance is the mean of the squared deviations from the mean (expected value):

σ² = Σ[(xi − μ)² · P(xi)]

Equivalently,

σ² = Σ[xi² · P(xi)] − μ²

For example, the variance of tossing a die is

σ² = (1 − 3.5)²(1/6) + (2 − 3.5)²(1/6) + (3 − 3.5)²(1/6) + (4 − 3.5)²(1/6) + (5 − 3.5)²(1/6) + (6 − 3.5)²(1/6) = 2.92

EXAMPLES OF DISCRETE PROBABILITY DISTRIBUTIONS

1. BINOMIAL DISTRIBUTION
This is a process in which each trial or observation can assume only one of two states; that is, each trial in a binomial distribution results in one of only two mutually exclusive outcomes, one of which is identified as a success and the other as a failure. The probability of each remains constant from one trial to the next. Experiments of this type follow a binomial distribution, or Bernoulli process, named after the Swiss mathematician Jacob Bernoulli.

A binomial distribution or Bernoulli process must fulfil the following conditions:
1. There must be only two possible outcomes. One is identified as a success, the other as a failure. (Do not attach any connotation of "good" or "bad" to these terms. They are quite objective, and a success does not necessarily imply a desirable outcome.)
2. The probability of a success, π, remains constant from one trial to the next, as does the probability of a failure, (1 − π).
3. The probability of a success in one trial is independent of any other trial.
4. The experiment can be repeated many times.

NB: Condition 3 states that the trials or experiments must be independent in order for a random variable to be considered a binomial random variable. This can be assured in a finite population if the sampling is performed with replacement.
However, sampling with replacement is the exception rather than the rule in business applications. Most often, sampling is performed without replacement. Strictly speaking, when sampling is performed without replacement, the conditions for the binomial distribution cannot be satisfied. However, the conditions are approximately satisfied if the sample is quite small relative to the size of the population from which it is selected. A commonly used rule of thumb is that the binomial distribution can be applied if n/N < 1/20. Thus, if the sample is less than 5% of the size of the population, the conditions for the binomial will be approximately satisfied.

The probability that out of n trials there will be x successes is given by

P(x) = nCx · π^x · (1 − π)^(n−x)

where
n = sample size, i.e. the number of trials
π = probability of a success
x = number of successes
1 − π = probability of a failure

For example, tossing a coin is an example of a binomial distribution since:
1. There are only two possible outcomes, head or tail.
2. The probability of a success (head or tail) remains constant at π = 0.5 for all tosses.
3. The probability of a success on any toss is not affected by the results of any other toss.
4. A coin may be tossed many times.

Examples
a) Suppose a coin is tossed 10 times. What is the probability that there will be 3 tails?
A tail is a success. Then x = 3, n = 10 and π = 0.5:

P(x = 3) = 10C3 (0.5)^3 (0.5)^7 = 0.1172

b) A credit manager for American Express has found that 10% of their card users do not pay the full amount of indebtedness during any given month. She wants to determine the probability that if 20 accounts are randomly selected, 5 of them are not paid.

P(x = 5 given n = 20), π = 0.1

P(x = 5) = 20C5 (0.1)^5 (0.9)^15 = 0.0319

When the probability of some event is measured over a range of values, it is known as a cumulative binomial probability.
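The binomial formula P(x) = nCx · π^x · (1 − π)^(n−x) translates directly into code. Below is a small Python sketch (the function name is ours) checked against examples a) and b) above.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(X = x) = nCx * pi^x * (1 - pi)^(n - x)"""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

# Example a): 3 tails in 10 tosses of a fair coin
print(round(binomial_pmf(3, 10, 0.5), 4))    # 0.1172

# Example b): 5 unpaid accounts out of 20 sampled, with pi = 0.10
print(round(binomial_pmf(5, 20, 0.1), 4))    # 0.0319
```

Both results agree with the worked examples above.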
P(X ≤ x) = P(X = 0) + P(X = 1) + P(X = 2) + … + P(X = x)

For example,

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)

while

P(X > x) = 1 − P(X ≤ x)

For example,

P(X > 2) = 1 − P(X ≤ 2)

Worked example
Salespersons for widgets make a sale to 15% of the customers on whom they call. If a member of the sales force calls on 15 customers today, what is the probability that he/she will sell:
i. exactly two widgets
ii. at most two widgets
iii. at least three widgets

n = 15 and π = 0.15

1. P(X = 2) = 15C2 (0.15)^2 (0.85)^13 = 0.2856
2. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
   = 15C0 (0.15)^0 (0.85)^15 + 15C1 (0.15)^1 (0.85)^14 + 15C2 (0.15)^2 (0.85)^13
   = 0.0874 + 0.2312 + 0.2856 = 0.6042
3. P(X > 2) = 1 − P(X ≤ 2) = 1 − 0.6042 = 0.3958

Mean & Variance of the Binomial Distribution
The mean and variance of a binomial distribution are calculated as

μ = nπ
σ² = nπ(1 − π)

where n is the number of trials (sample size) and π is the probability of a success on any given trial; i.e. we would expect, on average, nπ successes out of n trials.

From the previous example of sales of widgets, we had π = 0.15 and n = 15. The salespersons should average

μ = 15(0.15) = 2.25

sales for every 15 sales calls. That is, if a salesperson makes 15 calls a day, for many days, he/she will average over the long run 2.25 sales per day. The variance in the number of daily sales is

σ² = 15(0.15)(0.85) = 1.91, so σ = 1.38

SHAPE OF THE BINOMIAL DISTRIBUTION
i. If π = 0.5, the binomial distribution is perfectly symmetrical.
ii. If π < 0.5, the curve is skewed to the right.
iii. If π > 0.5, the curve is skewed to the left.
iv. Holding π constant, as n increases the binomial distribution approaches the normal distribution in shape.

THE POISSON DISTRIBUTION
It was developed by the French mathematician Simeon Poisson. The Poisson distribution measures the probability of a random event over some interval of time or space; that is, it measures the relative frequency of an event per unit of time or space. For example:
- The number of arrivals of customers per hour.
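The cumulative probabilities in the widget example are just a sum of binomial terms from 0 up to x. A Python sketch of the calculation (function names are ours):

```python
from math import comb

def binomial_pmf(x, n, pi):
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def binomial_cdf(x, n, pi):
    """Cumulative probability P(X <= x) = sum of P(X = k) for k = 0..x."""
    return sum(binomial_pmf(k, n, pi) for k in range(x + 1))

n, pi = 15, 0.15                               # widget example: 15 calls, 15% success rate
print(round(binomial_pmf(2, n, pi), 4))        # exactly two sales: 0.2856
print(round(binomial_cdf(2, n, pi), 4))        # at most two sales: 0.6042
print(round(1 - binomial_cdf(2, n, pi), 4))    # at least three sales: 0.3958
```

Note how "at least three" is obtained as the complement 1 − P(X ≤ 2) rather than by summing the thirteen remaining terms.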
- The number of industrial accidents per month.
- The number of defective electrical connections per mile of wiring, etc.

In each of these cases, the random variable (customers, accidents, defects, etc.) is measured per unit of time or space (distance).

Two assumptions are necessary for the application of the Poisson distribution:
i) The probability of the occurrence of the event is constant for any two intervals of time or space.
ii) The occurrence of the event in any interval is independent of its occurrence in any other interval.

Given these assumptions, the Poisson distribution can be expressed as

P(x) = (μ^x · e^(−μ)) / x!

where
x = the number of times the event occurs
μ = the mean number of occurrences per unit of time or space
e = 2.71828, the base of the natural logarithm system

Worked examples
- A simple observation over the last 80 hours has shown that 800 customers entered a certain bank. What is the probability that 5 customers will arrive at the bank during the next one hour?

μ = 800/80 = 10 customers per hour, x = 5

P(x = 5) = (10^5 · e^(−10)) / 5! = 0.0378

- A local paving company obtained a contract with the city council to maintain roads serving a large urban centre. The roads recently paved by this company revealed an average of two defects per mile after being used for one year. If the council retains this company, what is the probability of one defect in any given mile of road after carrying traffic for one year?

P(x = 1) = (2^1 · e^(−2)) / 1! = 0.2707

POISSON VS BINOMIAL DISTRIBUTIONS
The Poisson distribution is useful as an approximation for binomial probabilities. This approximation is often necessary when the number of trials is large, since binomial tables for large values of n are often not available. The approximation is permitted only if n is large and π is small. As a rule of thumb, the approximation is reasonably accurate if n ≥ 20 and π ≤ 0.10.
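The Poisson formula P(x) = μ^x · e^(−μ) / x! can be sketched in Python as follows (function name is ours), checked against the two worked examples above.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

# Bank example: 800 customers in 80 hours -> mu = 10 per hour, x = 5
print(round(poisson_pmf(5, 10), 4))   # 0.0378

# Paving example: mean of 2 defects per mile, x = 1
print(round(poisson_pmf(1, 2), 4))    # 0.2707
```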
If π is large, simply reverse the definition of a success and a failure so that π becomes small.

For example, assume that industry records show that 10% of employees steal from their employers. The personnel manager of a firm in that industry wishes to determine the probability that, from a sample of 20 employees, three have illegally taken company property.

Using the binomial distribution:

P(X = 3) = 20C3 (0.10)^3 (0.9)^17 = 0.1901 ≈ 0.2

Using the Poisson distribution, with μ = nπ = 20(0.1) = 2:

P(X = 3) = (2^3 · e^(−2)) / 3! = 0.1804 ≈ 0.2

For larger values of n and/or smaller values of π, the approximation may be even closer. Assume n = 100, so that μ = nπ = 100(0.1) = 10:

P(X = 3) = (10^3 · e^(−10)) / 3! = 0.0076

THE HYPERGEOMETRIC DISTRIBUTION
If a sample is selected without replacement from a known finite population and contains a relatively large proportion of the population, such that the probability of a success is measurably altered from one selection to the next, the hypergeometric distribution should be used.

The binomial distribution is appropriate only if the probability of a success remains constant for each trial. This will occur if the sampling is done from an infinite (or very large) population. However, if the population is rather small, or the sample contains a large portion of the population, the probability of a success will vary between trials.

For example, suppose we have a group of 30 people, of whom 10 are female. The probability of choosing a female (a success) on the first trial is 10/30. The probability of choosing a female on the second trial, without replacement, will be either 9/29 if the first trial was a success or 10/29 if it was a failure. Hence the probability does not remain constant. That is, for N = 30, 10/30 and 9/29 (or 10/29) can be significantly different. As N increases, the difference becomes negligible.
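The quality of the Poisson approximation in the employee-theft example can be verified side by side. This short Python sketch (function names are ours) prints the exact binomial probability and its Poisson approximation with μ = nπ:

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, pi):
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def poisson_pmf(x, mu):
    return mu**x * exp(-mu) / factorial(x)

# Employee-theft example: n = 20 employees sampled, pi = 0.10, x = 3 thieves
print(round(binomial_pmf(3, 20, 0.10), 4))   # exact binomial: 0.1901
print(round(poisson_pmf(3, 20 * 0.10), 4))   # Poisson approximation: 0.1804
```

Both round to roughly 0.2, as noted above; with n = 20 the rule of thumb (n ≥ 20, π ≤ 0.10) is only just satisfied, which is why the two values still differ in the second decimal place.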
Therefore, if the probability of a success is not constant, the hypergeometric distribution is given by

P(x) = [rCx · (N−r)C(n−x)] / NCn

where
N = population size
n = sample size
r = number of successes in the population
x = number of successes in the sample

Example
In a population of 10 racing horses, 4 are known to have a contagious disease. What is the probability that out of 3 horses selected, 2 have the disease?

N = 10, n = 3, r = 4, x = 2

P(X = 2) = (4C2 · 6C1) / 10C3 = 0.3

Variance of the hypergeometric distribution
The variance is equal to the number of trials multiplied by the proportion of successes in the population, multiplied again by the proportion of failures in the population, multiplied by the finite population correction factor:

σ² = n · (r/N) · ((N − r)/N) · ((N − n)/(N − 1))

This is similar to the variance of the binomial, since r/N = π and (N − r)/N = 1 − π. The only difference is the last factor, (N − n)/(N − 1), known as the finite population correction factor.

HYPERGEOMETRIC VS BINOMIAL DISTRIBUTION
We have noted that the means and variances of the binomial and hypergeometric distributions are identical except for the finite population correction factor (N − n)/(N − 1). If, however, the sample size n is small relative to the population size N, the finite population correction factor approaches 1 and can hence be ignored. As a rule of thumb, whenever n ≤ 5% of a finite population, or when the population size N approaches infinity, the finite population correction factor can be ignored and the binomial distribution can be used to approximate the hypergeometric distribution.

For example, suppose N = 40, n = 2 and r = 4. Since n/N = 2/40 = 5%, we can use the binomial distribution with n = 2 and π = r/N = 0.1.

THE EXPONENTIAL DISTRIBUTION
The Poisson distribution is a discrete probability distribution which measures the number of occurrences of some event over time or space, for example the number of customers who might arrive during some given period. The exponential distribution measures the passage of time between those occurrences.
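The hypergeometric formula is a ratio of three binomial coefficients, so it is short to code. A Python sketch (function name is ours), checked against the racing-horses example:

```python
from math import comb

def hypergeometric_pmf(x, N, n, r):
    """P(x) = rCx * (N-r)C(n-x) / NCn"""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Racing-horses example: N = 10 horses, r = 4 diseased, sample of n = 3, x = 2 diseased
print(round(hypergeometric_pmf(2, N=10, n=3, r=4), 4))   # 0.3
```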
Therefore, while the Poisson distribution describes the arrival rates of units (people, trucks, telephone calls, etc.) within some time period, the exponential distribution estimates the lapse of time between arrivals. It can measure the time lapse as:
i) the time that passes/elapses between two successive arrivals, or
ii) the amount of time it takes to complete one action, for example serving one customer, loading one truck, or handling one call.

Exponential probability distributions are depicted by a steadily decreasing (decaying) curve, which shows that the larger the value of the random variable, as measured in units of elapsed time, the less likely it is to occur.

If the arrival process is Poisson distributed, then the lapse of time between arrivals is exponentially distributed. Let μ be the mean number of arrivals in a given time period; then 1/μ is the mean lapse of time between arrivals. For example, if an average of μ = 4 customers arrive at a bank in 1 hour, then on average one customer arrives every 1/μ = 1/4 = 0.25 hr.

Based on the relationship between the Poisson and exponential distributions, it is possible to determine the probability that a specified time period will elapse, given knowledge of the average arrival rate. The probability that no more than t units of time elapse between successive occurrences is

P(T ≤ t) = 1 − e^(−μt)

where μ is the mean rate of occurrence.

Worked examples
- Trucks arrive at a loading dock at a mean rate of 1.5 per hour. What is the probability that no more than 2 hours will elapse between the arrivals of successive trucks?

P(T ≤ 2) = 1 − e^(−1.5 × 2) = 1 − e^(−3) = 0.9502

Hence, the probability that a second truck will arrive within 2 hours of the first one is 95.02%.

- Taxis arrive at the airport at the rate of 12 per hour. You have just landed at the airport and must get to town as fast as possible for a business deal. What is the probability that you will get a taxi in the next 5 minutes? Similarly, with 5 minutes = 1/12 hr,

P(T ≤ 1/12) = 1 − e^(−12 × 1/12) = 1 − e^(−1) = 0.6321

NB: The mean of the hypergeometric distribution is μ = n(r/N), and its standard deviation is the square root of its variance, σ = √[n(r/N)((N − r)/N)((N − n)/(N − 1))].

Note:
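The exponential formula P(T ≤ t) = 1 − e^(−μt) covers both worked examples. A Python sketch (function name is ours):

```python
from math import exp

def exponential_cdf(t, mu):
    """P(T <= t) = 1 - e^(-mu * t), where mu is the mean arrival rate."""
    return 1 - exp(-mu * t)

# Trucks: mean of 1.5 arrivals per hour, at most 2 hours between arrivals
print(round(exponential_cdf(2, 1.5), 4))       # 0.9502

# Taxis: 12 per hour, a taxi within 5 minutes (1/12 hour)
print(round(exponential_cdf(1/12, 12), 4))     # 0.6321
```

Note that t and μ must be expressed in the same time units; converting 5 minutes to 1/12 hour before the call is what makes the taxi calculation consistent.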
- The expected value of a discrete random variable is the weighted mean of all possible outcomes, in which the weights are the respective probabilities of those outcomes. It is the sum of the products of each possible outcome and its probability.
- Variance is the mean of the squared deviations from the mean (expected value).

SELF-TEST QUESTIONS
i. An investment firm employs 20 investment analysts. Every morning each analyst is assigned up to five stocks to evaluate. The assignments made on a certain day were:

Outcome xi (No. of stocks)   Frequency of xi (No. of analysts)
1                            4
2                            2
3                            3
4                            5
5                            6

Determine:
- the probability distribution
- the mean
- the variance

ii. Over the last 100 business days, Harry has had 20 customers on 20 of those days, 25 customers on 20 days, 35 customers on 30 days, 40 customers on 10 days and 45 customers on 10 days. How many customers might be expected today, and what is the variance in the number of customers?

FURTHER READINGS
1. Allan Webster (1992), Applied Statistics for Business and Economics, Irwin, Boston.
2. David Groebner and Patrick Shannon, Business Statistics: A Decision-Making Approach, 3rd edition, Merrill Publishing Company, Ohio.
3. Wonnacott, T.H. and Wonnacott, R.J., Introductory Statistics for Business and Economics, John Wiley and Sons, New York.
4. Other textbooks on Statistics for Business and Economics.