Chapter 5, Probability Distributions

5.1 Introduction
- In this chapter, we discuss various probability distributions, including discrete probability distributions and continuous probability distributions.
- A discrete probability distribution is used when the sample space is discrete, that is, finite or countably infinite. The following is a list of discrete probability distributions:
  * discrete uniform
  * binomial and multinomial
  * hypergeometric
  * negative binomial
  * geometric
  * Poisson
- A continuous probability distribution is used when the sample space is continuous. The following is a list of continuous probability distributions:
  * uniform
  * normal (or Gaussian)
  * Gamma
  * Beta
  * t distribution
  * F distribution
  * $\chi^2$ distribution

5.2 Discrete uniform distribution
- The definition: if a random variable (r.v.) X assumes the values $x_1, x_2, \ldots, x_k$ with equal probabilities, then X follows a discrete uniform distribution, and its probability function is:
  $$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k$$
- The mean and variance:
  $$\mu = \frac{1}{k} \sum_{i=1}^{k} x_i, \qquad \sigma^2 = \frac{1}{k} \sum_{i=1}^{k} (x_i - \mu)^2$$

5.3 Binomial and multinomial distributions
- First, let us introduce the Bernoulli process. If:
  * the outcome of the process is either success (X = 1) or failure (X = 0), and
  * the probability of success is P(X = 1) = p and the probability of failure is P(X = 0) = 1 - p = q,
  then the process is a Bernoulli process.
- The probability distribution of the Bernoulli process:
  $$p(x) = p^x (1 - p)^{1 - x}, \quad x = 0, 1, \quad 0 < p < 1$$
- The mean and the variance: E(X) = p, V(X) = p(1 - p).
- An example: what is the probability of picking a male student from a class of 12 students, 8 of whom are male?
  X = 1: male student, with probability p = 8/12 = 2/3
  X = 0: female student, with probability 1 - p = 1/3
  Thus, the probability distribution is:
  $$p(x) = (2/3)^x (1/3)^{1 - x}, \quad x = 0, 1$$
  In addition, the mean is p = 2/3 and the variance is V = (2/3)(1/3) = 2/9.
- Binomial distribution: the binomial distribution is defined in terms of the Bernoulli process. It is made up of n independent Bernoulli trials.
  Suppose that $X_1, X_2, \ldots, X_n$ are independent Bernoulli random variables with the same p. Then $Y = \sum_{i=1}^{n} X_i$ follows a binomial distribution. (Note that Y is the number of successes among the n trials.)
- The probability distribution of the binomial distribution is:
  $$P(Y = y) = \binom{n}{y} p^y (1 - p)^{n - y}, \quad y = 0, 1, \ldots, n$$
- The student example: pick three students from the 12 students. (Note that we must sample with replacement so that the probability stays the same from pick to pick and the picks are independent.)
  * none of the 3 is male: the possibility is FFF; the probability is $\binom{3}{0} (1 - p)^3 = 0.037$
  * one of the 3 is male: the possibilities are MFF, FMF, FFM; the probability is $\binom{3}{1} p (1 - p)^2 = 3p(1 - p)^2 = 0.222$
  * two of the 3 are male: the possibilities are MMF, MFM, FMM; the probability is $\binom{3}{2} p^2 (1 - p) = 3p^2(1 - p) = 0.444$
  * all 3 are male: the possibility is MMM; the probability is $\binom{3}{3} p^3 = 0.296$
  In general, the formula is:
  $$P(Y = y) = \binom{3}{y} p^y (1 - p)^{3 - y}, \quad y = 0, 1, 2, 3$$
  We can derive the general formula in the same manner.
- Mean and variance of the binomial distribution:
  $$E(Y) = \sum_{i=1}^{n} E(X_i) = np, \qquad V(Y) = \sum_{i=1}^{n} V(X_i) = np(1 - p)$$
- The example: find the mean and variance of the number of male students picked, and then use Chebyshev's theorem to interpret the interval $\mu \pm 2\sigma$.
  $\mu = (3)(2/3) = 2$
  $\sigma^2 = (3)(2/3)(1/3) = 2/3$, $\sigma = 0.816$
  At k = 2: $\mu + 2\sigma = 2 + (2)(0.816) = 3.63$ and $\mu - 2\sigma = 2 - (2)(0.816) = 0.37$, so the interval contains the values 1, 2 and 3, and $1 - 1/k^2 = 3/4$.
  Therefore, the probability that the number of male students picked is between 1 and 3 should be at least 3/4. Indeed, the actual probability is p(1) + p(2) + p(3) = 0.963.
- Using the binomial distribution table: the tabulated probabilities are a function of n and p.
- Multinomial distribution: this is an extension of the binomial distribution. Suppose each of n independent trials results in one of k possible outcomes, with probabilities $p_1, p_2, \ldots, p_k$, and let $x_1, x_2, \ldots, x_k$ be the numbers of trials resulting in each outcome, where
  $$\sum_{i=1}^{k} x_i = n \quad \text{and} \quad \sum_{i=1}^{k} p_i = 1.$$
  Then the counts follow a multinomial distribution with the probability distribution:
  $$f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$
  where $\binom{n}{x_1, x_2, \ldots, x_k} = \frac{n!}{x_1! \, x_2! \cdots x_k!}$.

5.4 Hypergeometric Distribution
- The example: what is the probability of picking three male students in a row? Note that this time the picks are not independent, because we sample without replacement. As a result, we need to use the hypergeometric distribution. The following shows how the distribution is formed. In every case there are $\binom{12}{3}$ ways in total to pick 3 students from the 12.
  * no male student among the 3: male $\binom{8}{0}$, female $\binom{4}{3}$; probability = $\binom{8}{0}\binom{4}{3} / \binom{12}{3} = 4/220 = 0.018$
  * one male student among the 3: male $\binom{8}{1}$, female $\binom{4}{2}$; probability = $\binom{8}{1}\binom{4}{2} / \binom{12}{3} = 48/220 = 0.218$
  * two male students among the 3: male $\binom{8}{2}$, female $\binom{4}{1}$; probability = $\binom{8}{2}\binom{4}{1} / \binom{12}{3} = 112/220 = 0.509$
  * three male students among the 3: male $\binom{8}{3}$, female $\binom{4}{0}$; probability = $\binom{8}{3}\binom{4}{0} / \binom{12}{3} = 56/220 = 0.255$
  In general, the probability distribution is as follows:
  $$P(Y = y) = \frac{\binom{8}{y} \binom{4}{3 - y}}{\binom{12}{3}}, \quad y = 0, 1, 2, 3$$
- The general formula of the hypergeometric distribution:
  $$P(Y = y) = \frac{\binom{k}{y} \binom{N - k}{n - y}}{\binom{N}{n}}, \quad y = 0, 1, 2, \ldots, n$$
- The mean and the variance of the hypergeometric distribution:
  $$\mu = \frac{nk}{N}, \qquad \sigma^2 = \frac{N - n}{N - 1} \cdot \frac{nk}{N} \left(1 - \frac{k}{N}\right)$$
  As a special case, let N go to infinity with k/N = p held fixed; then (N - n)/(N - 1) tends to 1. Hence:
  $$\mu = np, \qquad \sigma^2 = np(1 - p)$$
  That is, the hypergeometric distribution becomes the binomial distribution.
- We can also define the multivariate hypergeometric distribution.

5.5 Negative Binomial and Geometric Distributions
- An example: picking three students, what is the probability that the third student is the second male?
  * one possibility is FMM, and its probability is $(1 - p)p^2$
  * the other possibility is MFM, and its probability is $(1 - p)p^2$
  Note that there are $\binom{3 - 1}{2 - 1}$ combinations, and hence the probability is:
  $$f(X = 3; k = 2) = \binom{3 - 1}{2 - 1} (1 - p) p^2$$
- The general formula for the negative binomial distribution is as follows:
  $$f(X = x) = \binom{x - 1}{k - 1} p^k (1 - p)^{x - k}, \quad x = k, k + 1, k + 2, \ldots$$
  where x is the number of the trial on which the kth success occurs.
- The mean and variance of the negative binomial distribution (with X counted in trials, as above):
  $$E(X) = \frac{k}{p}, \qquad V(X) = \frac{k(1 - p)}{p^2}$$
- Another example: picking until we get a male student:
  * the first male is on the first pick: p
  * the first male is on the second pick: (1 - p)p
  * the first male is on the third pick: $(1 - p)^2 p$
- The general formula is:
  $$f(X = x) = (1 - p)^{x - 1} p, \quad x = 1, 2, 3, \ldots$$
  This is the geometric distribution, the special case of the negative binomial distribution with k = 1.
- The mean and variance of the geometric distribution:
  $$E(X) = \frac{1}{p}, \qquad V(X) = \frac{1 - p}{p^2}$$

5.6 Poisson Distribution
- A Poisson process is a random process in which discrete events take place over a continuous interval of time or region. Examples of Poisson processes include:
  * the arrival of telephone calls at a switchboard,
  * cars passing an electronic checking device.
  Note that all these examples involve a discrete random event. In any given small period of time (or region), the probability that the event occurs is small; however, over a long time (or large region), the number of occurrences is large.
- The Poisson distribution plays an extremely important role in science and engineering, since it is an appropriate probabilistic model for a large number of observed phenomena.
- The Poisson distribution can be described by the following formula:
  $$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots$$
  where $\lambda$ is the average number of outcomes per unit time or region. Hence, $\lambda t$ is the average number of outcomes in an interval of length t. Proof: refer to the textbook.
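As a rough numerical check, the Poisson formula above can be evaluated directly. The sketch below is only an illustration: the function names `poisson_pmf` and `binomial_pmf`, the mean $\lambda t = 3$ and the choice n = 10,000 are assumptions made here, not part of the notes. It also compares the Poisson probabilities with a binomial distribution that has many trials and a small success probability chosen so that $np = \lambda t$, the situation the notes turn to next.

```python
import math


def poisson_pmf(x: int, mean: float) -> float:
    """P(X = x) for a Poisson distribution whose mean is lambda * t."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for n independent Bernoulli trials with success probability p."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)


mean = 3.0  # illustrative value of lambda * t (an assumption for this sketch)
poisson_probs = [poisson_pmf(x, mean) for x in range(5)]

# Binomial with many trials and a small success probability, with n * p = mean.
n = 10_000
binomial_probs = [binomial_pmf(x, n, mean / n) for x in range(5)]

for x, (po, bi) in enumerate(zip(poisson_probs, binomial_probs)):
    print(f"x={x}: Poisson {po:.4f}, binomial {bi:.4f}")

# Cumulative probability P(X <= 4) from the Poisson formula.
print(f"P(X <= 4) = {sum(poisson_probs):.3f}")
```

The two columns should agree to about three decimal places, and the agreement improves as n grows with np held fixed.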
- The Poisson distribution can be considered as an approximation to the binomial distribution when n is large and p is small.
- From a physical point of view, take a time interval of length T and divide it into n equal sub-intervals of length $\Delta t$ ($\Delta t \to 0$; note that $T = n \Delta t$), and assume:
  * The probability of a success in any sub-interval $\Delta t$ is $\lambda \Delta t$.
  * The probability of more than one success in any sub-interval $\Delta t$ is negligible.
  * The probability of a success in any sub-interval does not depend on what happened prior to that time.
  Then we have the Poisson distribution.
- Mean and variance of the Poisson distribution:
  $$\mu = \lambda t, \qquad \sigma^2 = \lambda t$$
- An example: in a large company, industrial accidents occur at a mean rate of three per week ($\lambda t = 3$). (Note that accidents occur independently.)
  * the probability distribution: $p(y) = 3^y e^{-3} / y!$, y = 0, 1, 2, ...
  * the probabilities can be found by simple calculation or by looking them up in the Poisson distribution table.
  * the probability of at most four accidents in a week: p(0) + p(1) + p(2) + p(3) + p(4) = 0.815
  * the probability of at least four: $P(Y \ge 4) = 1 - P(Y \le 3) = 0.353$
  * the probability of exactly four: $P(Y = 4) = P(Y \le 4) - P(Y \le 3) = 0.168$; note that this is the same as p(4) = 0.168

5.7 Uniform Distribution
- The uniform distribution is a continuous probability distribution.
  * the assumption: the random event is equally likely anywhere in an interval
  * an example: receiving an express mail between 1 and 5 pm
- The probability density function (pdf):
  $$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{elsewhere} \end{cases}$$
  By integration, we obtain the (cumulative) probability function (pf):
  $$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$
- A comparison between discrete and continuous distributions:
  * for a discrete r.v., we have the probability function: P(X = x) = p(x)
  * for a continuous r.v.: P(X = x) = 0, and
    $$F(x) = \int_{-\infty}^{x} f(u) \, du, \qquad f(x) = \frac{dF(x)}{dx}$$
- An example: receiving an express mail equally likely between 1 and 5 pm.
  $$f(x) = \begin{cases} 1/4, & 1 \le x \le 5 \\ 0, & \text{elsewhere} \end{cases}$$
  Hence, the probability of receiving the express mail between 2 and 5 pm is
  $$P(2 \le X \le 5) = F(5) - F(2) = \frac{5 - 1}{5 - 1} - \frac{2 - 1}{5 - 1} = \frac{3}{4}.$$
- The mean and the variance:
  $$E(X) = \frac{a + b}{2}, \qquad V(X) = \frac{(b - a)^2}{12}$$

5.8 Normal Distribution
- In the natural world, there are more cases where the possibilities are not equally likely. Instead, there is a most likely value, and the likelihood decreases symmetrically on either side of it. This leads to the normal distribution.
- The normal distribution is by far the most widely used probability distribution. Why is the normal distribution so popular?
  * the central limit theorem
  * a linear combination of independent normal random variables is still normal
- The probability density function:
  $$f(x) = \frac{1}{\sqrt{2\pi} \, \sigma} e^{-(x - \mu)^2 / (2\sigma^2)}$$
  Note that the (cumulative) probability function does not have an analytical form; hence, we rely on numerical calculation (Table A.3).
- The mean, variance and standard deviation of a normal distribution:
  $$E(X) = \mu, \qquad V(X) = \sigma^2$$
  These two parameters uniquely determine the normal distribution. Hence, a normal distribution is often denoted as $N(\mu, \sigma)$.
- Illustration of the normal distribution:
  * the bell shape
  * the mean $\mu$
  * the standard deviation: $\mu \pm \sigma$ (68% area), $\mu \pm 2\sigma$ (95.4% area), and $\mu \pm 3\sigma$ (99.7% area)
  In particular, with $E(X) = \mu = 0$ and $V(X) = \sigma^2 = 1$, we have the standard normal distribution N(0, 1).
- Calculating a probability through the standard normal distribution:
  * transform the normal distribution to the standard normal distribution by:
    $$Z = \frac{X - \mu}{\sigma}$$
  * use the normal distribution table (Table A.3)
- An example: given N(16, 1), P(X > 17) = ?
  * Z = (X - 16)/1
  * P[Z > (17 - 16)/1] = P(Z > 1) = 1 - P(Z < 1) = 1 - 0.8413 (from Table A.3) = 0.1587
- Questions:
  * given $\mu$ and $\sigma$, how do we calculate $P(c_1 \le X \le c_2)$?
  * given p, $\mu$ and $\sigma$, how do we calculate x so that P(X > x) = p?
- Given a set of data, it is often necessary to check whether the data set follows a normal distribution.
- The student example: the number of hours of study per week of the 12 students:
  * sorting the data: 10, 12, 12, 14, 14, 14, 15, 15, 15, 20, 20, 25
  * note that there are just 6 different values, so each distinct value corresponds to a step of 100/6 ≈ 16.7 percentile points (taken as 16 below)
  * finding the percentiles of the data: 16, 32, 32, 48, 48, 48, 64, 64, 64, 80, 80, 96
  * finding the z-values of the percentiles: -1.00, -0.47, -0.47, -0.05, -0.05, -0.05, 0.36, 0.36, 0.36, 0.84, 0.84, 1.75
  * plotting the data against the z-values:
  [Figure: normal probability plot of hours of study (vertical axis, 10 to 25) against z-value (horizontal axis, -1.5 to 2)]
  Because the horizontal axis comes from a normal distribution, a linear relationship indicates that the distribution of the data can be approximated by a normal distribution.
  If a data set follows a normal distribution, the related probabilities can be calculated easily. Following the 12 students example:
  $$\mu = 15.5, \qquad \sigma^2 = 16$$
  Question: what is the probability of picking a student who studies at least 15 hours per week?
  Answer: we first calculate the z value: z = (15 - 15.5)/4 = -0.125. Hence, the probability is:
  P(Z > -0.125) = 1 - P(Z < -0.125) = 1 - 0.45 = 0.55
- As another example, assume that an exam is coming and everybody is putting in an extra 3 hours of study per week. What is the probability of picking a student who studies at least 20 hours per week? The mean shifts to 15.5 + 3 = 18.5, so we first calculate the z value: z = (20 - 18.5)/4 = 0.375. Hence,
  P(X > 20) = P(Z > 0.375) = 1 - P(Z < 0.375) = 1 - 0.646 = 0.354.
- As an exercise, you may want to find the range of hours of study per week that contains a picked student with a probability of 95%.
- Normal approximation to the binomial. Assuming n is large and p is not too close to 0 or 1, then
  $$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
  is approximately standard normal. This can be demonstrated by an example. In the students example, the probability of picking a student who studies more than 15 hours per week is p = 3/12 = 1/4.
  Consider the case of sampling with replacement: the probability that exactly 3 of 12 picked students study more than 15 hours per week is:
  b(X = 3; n = 12, p = 1/4) = 0.258
  Using the normal distribution to approximate:
  $\mu = np = (12)(1/4) = 3$
  $\sigma^2 = np(1 - p) = (12)(1/4)(3/4) = 9/4 = 2.25$ ($\sigma = 1.5$)
  hence,
  P(2.5 < X < 3.5) = P[(2.5 - 3)/1.5 < Z < (3.5 - 3)/1.5] = P(-0.333 < Z < 0.333) = 2(0.6306) - 1 ≈ 0.261
  It is seen that the results are rather similar. The approximation error is caused by the small n (n = 12).
- The normal approximation to the binomial distribution is very useful when n is large, because the binomial distribution then requires tedious calculation.

5.9 Exponential distribution, Gamma distribution and Chi-Square ($\chi^2$) distribution
- There are cases, for example the time to failure, in which the probability density decreases exponentially. This leads to the exponential distribution.
- The probability density function of the exponential distribution:
  $$f(x) = \begin{cases} \dfrac{1}{\beta} \exp\left(-\dfrac{x}{\beta}\right), & x \ge 0, \ \beta > 0 \\ 0, & \text{elsewhere} \end{cases}$$
- The (cumulative) probability function:
  $$F(x) = 1 - \exp(-x/\beta), \quad x > 0, \ \beta > 0$$
  To calculate the mean and variance, we need the Gamma ($\Gamma$) function:
  $$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x} \, dx$$
  Using integration by parts: since $(uv)' = u'v + uv'$, we have $\int u v' = uv - \int u' v$. Let $u = x^{\alpha - 1}$ and $dv = e^{-x} dx$; it follows that, for $\alpha > 1$:
  $$\Gamma(\alpha) = \left[-e^{-x} x^{\alpha - 1}\right]_0^\infty + \int_0^\infty e^{-x} (\alpha - 1) x^{\alpha - 2} \, dx = (\alpha - 1) \Gamma(\alpha - 1)$$
  In particular:
  $$\Gamma(\alpha + 1) = \alpha \Gamma(\alpha), \qquad \Gamma(n) = (n - 1)!, \qquad \Gamma(1/2) = \sqrt{\pi}$$
  In general:
  $$\int_0^\infty x^{\alpha - 1} e^{-x/\beta} \, dx = \beta^\alpha \Gamma(\alpha)$$
  For the exponential distribution, since $\alpha = 1$ and $\beta = \beta$:
  $$E(X) = \beta, \qquad V(X) = \beta^2$$
- The exponential distribution is related to the Poisson distribution: given a Poisson process with mean $\lambda t$, the waiting time until the first occurrence is exponentially distributed (with $\beta = 1/\lambda$).
- Another common case is that the likelihood is low close to zero; this leads to the Gamma distribution. The probability density function of the Gamma distribution:
  $$f(x) = \frac{1}{\beta^\alpha \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta}, \quad x > 0, \ \alpha > 0, \ \beta > 0$$
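The Gamma-function identities and the Gamma density above can be spot-checked numerically. The following is a minimal sketch using Python's standard `math.gamma`. The helper names `gamma_pdf` and `exponential_pdf` and the test values $\alpha = 2.7$, $\beta = 2$ are arbitrary choices made for illustration, not values from the notes.

```python
import math


def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma density x^(alpha-1) e^(-x/beta) / (beta^alpha Gamma(alpha)), x > 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


def exponential_pdf(x: float, beta: float) -> float:
    """Exponential density (1/beta) e^(-x/beta), x >= 0."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0


# Gamma(n) = (n - 1)! for positive integers n.
factorial_check = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 10)
)

# Gamma(1/2) = sqrt(pi).
half_check = math.isclose(math.gamma(0.5), math.sqrt(math.pi))

# Recurrence Gamma(alpha + 1) = alpha * Gamma(alpha) at a non-integer alpha.
alpha, beta = 2.7, 2.0  # arbitrary illustrative values
recurrence_check = math.isclose(math.gamma(alpha + 1), alpha * math.gamma(alpha))

# Setting alpha = 1 in the Gamma density gives the exponential density.
special_case_check = all(
    math.isclose(gamma_pdf(x, 1.0, beta), exponential_pdf(x, beta))
    for x in (0.1, 1.0, 3.5)
)

# The normalizing constant beta^alpha * Gamma(alpha) makes the density
# integrate to 1; midpoint rule on (0, 60), beyond which the tail is negligible.
steps = 60_000
width = 60.0 / steps
area = sum(gamma_pdf((i + 0.5) * width, alpha, beta) for i in range(steps)) * width

print(factorial_check, half_check, recurrence_check, special_case_check)
print(f"area under the Gamma density: {area:.6f}")
```

The midpoint rule is used only because it needs no extra libraries; any numerical integrator would serve the same purpose.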
- The mean and variance:
  $$E(X) = \alpha\beta, \qquad V(X) = \alpha\beta^2$$
- Note that the exponential distribution is a special case of the Gamma distribution with $\alpha = 1$.
- Another special case of the Gamma distribution is the $\chi^2$ distribution. Letting $\alpha = \nu/2$ and $\beta = 2$ results in the $\chi^2$ distribution:
  $$f(x) = \frac{1}{2^{\nu/2} \, \Gamma(\nu/2)} x^{\nu/2 - 1} e^{-x/2}, \quad x > 0$$
  Its mean and variance are as follows:
  $$\mu = \nu, \qquad \sigma^2 = 2\nu$$
- Illustration.
  [Figure: density curves of the Gamma (or $\chi^2$) distribution and the exponential distribution]

5.10 Weibull distribution
- The assumption: similar to Gamma.
- The probability density function:
  $$f(x) = \begin{cases} \dfrac{\beta}{\alpha} x^{\beta - 1} e^{-x^\beta / \alpha}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
- The (cumulative) probability function:
  $$F(x) = 1 - \exp(-x^\beta / \alpha), \quad x > 0$$
- The mean and variance:
  $$E(X) = \alpha^{1/\beta} \, \Gamma\left(1 + \frac{1}{\beta}\right)$$
  $$V(X) = \alpha^{2/\beta} \left\{ \Gamma\left(1 + \frac{2}{\beta}\right) - \left[\Gamma\left(1 + \frac{1}{\beta}\right)\right]^2 \right\}$$
- Application in reliability. Define:
  f(t): the pdf of failure
  F(t): the pf of failure
  R(t) = 1 - F(t): the probability of no failure (reliability function)
  r(t) = f(t) / R(t): the failure rate function
  If the failure rate is constant,
  $$r(t) = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)} = \frac{1}{\beta},$$
  then f(t) will be exponential.
- Proof: since dF(t)/dt = f(t), we have
  $$\beta F'(t) = 1 - F(t), \quad \text{that is,} \quad \beta F'(t) + F(t) = 1.$$
  Solving the above with F(0) = 0 gives:
  $$F(t) = 1 - \exp(-t/\beta), \quad t \ge 0$$
  or
  $$f(t) = \frac{1}{\beta} \exp(-t/\beta), \quad t \ge 0$$

5.11 Summary
- Discrete distributions
  * discrete uniform: equally likely
  * binomial and multinomial: number of successes in n independent Bernoulli trials
  * hypergeometric: sampling is dependent (finite sample space, no replacement)
  * negative binomial: number of trials until the kth success
  * geometric: number of trials until the first success
  * Poisson: discrete events in continuous intervals
- Continuous distributions
  * uniform: equally likely
  * normal: has a most likely value and decreases symmetrically
  * exponential: gradually decreasing
  * Gamma: small when close to zero (generalized exponential)
  * Beta: contained in a finite interval
  * Weibull: similar to Gamma; reduces to the exponential when $\beta = 1$
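The student example that runs through the chapter (8 male students among 12, picks of 3) can be used to cross-check several of the summarized distributions. The sketch below is one possible way to do this with exact fractions. The function names are illustrative, the binomial and negative binomial lines assume sampling with replacement, and the normal line assumes the fitted values $\mu = 15.5$, $\sigma = 4$ from Section 5.8.

```python
from fractions import Fraction
from math import comb, erf, sqrt


def binomial_pmf(y, n, p):
    """P(Y = y): y successes in n independent trials."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)


def hypergeometric_pmf(y, N, k, n):
    """P(Y = y): y successes in n draws without replacement, k successes among N."""
    return Fraction(comb(k, y) * comb(N - k, n - y), comb(N, n))


def negative_binomial_pmf(x, k, p):
    """P(X = x): the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)


def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal distribution, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


p = Fraction(8, 12)  # probability of picking a male student

# Number of males among 3 picks, with and without replacement.
binomial = [binomial_pmf(y, 3, p) for y in range(4)]
hypergeometric = [hypergeometric_pmf(y, 12, 8, 3) for y in range(4)]

# Third pick is the second male (negative binomial, k = 2).
second_male_on_third_pick = negative_binomial_pmf(3, 2, p)

# Third pick is the first male (geometric, i.e. k = 1).
first_male_on_third_pick = negative_binomial_pmf(3, 1, p)

# Probability of studying at least 15 hours, using mu = 15.5 and sigma = 4.
at_least_15_hours = 1 - normal_cdf(15, 15.5, 4)

print("binomial:      ", [str(q) for q in binomial])
print("hypergeometric:", [str(q) for q in hypergeometric])
print("second male on third pick:", second_male_on_third_pick)
print("first male on third pick: ", first_male_on_third_pick)
print(f"P(X >= 15) = {at_least_15_hours:.3f}")
```

Exact fractions keep rounding out of the comparison, which makes it easier to see how far the with-replacement (binomial) and without-replacement (hypergeometric) probabilities differ for a population as small as 12.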
