Lecture 7. Distributions. Probability density and cumulative distribution functions. Poisson and Normal distributions.

We recall here some facts about the distributions of discrete and continuous random variables, and also introduce some new concepts.

A discrete distribution can be characterized by a probability function $p(x_i)$ that assigns probabilities to all possible values $x_i$ of a random variable $X$. The probability function must satisfy the following conditions:

$$p(x_i) \ge 0, \qquad \sum_i p(x_i) = 1, \qquad P(E) = \sum_{x_i \in E} p(x_i). \tag{7.1}$$

For those familiar with the concept of the "δ-function", the last relation can be presented as

$$P(E) = \int_E \sum_i p(x_i)\,\delta(x - x_i)\,dx.$$

Example: Suppose that a coin is tossed twice, so that the sample space is $\Omega = \{HH, HT, TH, TT\}$. Let $X$ represent the number of heads that can come up. Find the probability function $p(x)$.

Assuming that the coin is fair, we have $P(HH) = 1/4$, $P(HT) = P(TH) = 1/4$, $P(TT) = 1/4$. Then $P(X=0) = P(TT) = 1/4$; $P(X=1) = P(HT \cup TH) = 1/4 + 1/4 = 1/2$; $P(X=2) = P(HH) = 1/4$. The probability function is thus given by the table:

    x      0     1     2
    p(x)   1/4   1/2   1/4

The graphical example presented below is a little bit of a stretch and should be used with some care. Why? (Hint: see (7.4).)

[Figure: $p(x_i)$, with a tick at 0.3 on the vertical axis, plotted against the possible values of the random variable, $x_i = 1, \dots, 11$.]

7.1 Poisson distribution

a. Definition

The Poisson distribution is one of the most important discrete distributions. Its probability function is

$$P(x) = e^{-\lambda}\,\frac{\lambda^{x}}{x!}. \tag{7.3}$$

The Poisson distribution is a limiting case of the Binomial distribution $P(p_n, n)$, with parameters $p_n$ and $n$ such that

$$p_n \to 0, \qquad n \to \infty, \qquad p_n n \to \lambda.$$

In other words, if we have a large number of independent trials, each with a small probability of the event, then the number of occurrences has approximately a Poisson distribution.

Examples with Poisson distribution

1. Suppose that the probability of a defect in a foot of magnetic tape is 0.002. Use the Poisson distribution to compute the probability that a 1500-foot roll will have no defects.

    P[x_, λ_] := Exp[-λ] λ^x/x!;
    p = 0.002; n = 1500;    (* λ = p n = 0.002 × 1500 = 3 *)
    P[0, 3] // N            (* 0.0498 *)

This example helps to describe the Poisson distribution (PD) in a new way by noticing that L (I use it here instead of λ) is the expected (average) number of defects in 1500 feet of the tape. In other words, the PD gives the probability of $x$ events happening in some particular setting if the average number of events, L, is known.

Example 2

An airline company sells 200 tickets for a plane with 198 seats, knowing that the probability that a passenger will not show up for the flight is 0.01. Use the Poisson approximation to compute the probability that they will have enough seats for all the passengers who show up.

Solution. p = 0.01, L = 0.01*200 = 2 is the average number (out of 200 passengers) who won't show up for the flight. There are enough seats when at least 2 passengers do not show up.

    p[x] = Exp[-2] 2^x/x!;
    P[more than 1 person won't show up] = 1 - p[0] - p[1] = 1 - 3 Exp[-2] = 0.594.

Example 3 (working in groups)

10% of the tools produced in a certain manufacturing process turn out to be defective. Find the probability that in a sample of ten tools selected at random exactly 2 will be defective, by using (a) the binomial and (b) the Poisson distribution. Open a Mathematica file, and find the probability using (a) and (b).

Continuous distribution. Probability density function (PDF).

Remember: for a continuous variable we must assign to each individual outcome a probability $p(x) = 0$. Otherwise, we would not be able to fulfill the normalization requirement in (7.1).

A random variable $X$ is said to have a continuous distribution with density function $f(x)$ if for all $a \le b$ we have

$$P(a \le X \le b) = \int_a^b f(x)\,dx. \tag{7.4}$$

The analogs of the last two conditions in (7.1) for continuous distributions are

$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \tag{7.5}$$

$$P(E) = \int_E f(x)\,dx. \tag{7.6}$$

$P(E)$ is the probability that $X$ belongs to $E$.

[Figure: density curve $f(x)$ with the region between $a$ and $b$ labeled $P(a<X<b)$.]

Geometrically, $P(a<X<b)$ is the area under the curve $f(x)$ between $a$ and $b$.

Examples:

1. The uniform distribution on $(a,b)$: we are picking a value at random from $(a,b)$.
$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases} \tag{7.7}$$

By direct integration you can verify that (7.7) satisfies the condition (7.5). We can now find the PDF that describes the experiment with a spinner (Lecture 1). In this case $b - a = 2\pi$, and

$$f(\theta) = \begin{cases} \dfrac{1}{2\pi}, & 0 \le \theta \le 2\pi \\ 0, & \text{otherwise} \end{cases}$$

The probability that the arrow will stop in the range between $\theta$ and $\theta + \Delta\theta$ equals $\Delta\theta / 2\pi$.

2. The exponential distribution

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{7.8}$$

Those who know how to integrate can verify that (7.8) satisfies (7.5) (the total area under the curve $f(x)$ equals 1).

Note: In Mathematica, the integral of a function f[x] (notice that square brackets [...] rather than parentheses (...) are used) can be found as

    Integrate[f[x], {x, x1, x2}]

followed by Shift+Enter. Here x1 and x2 are the limits of integration.

A typical example of the exponential distribution results from the discussion of the waste products of a nuclear power plant. If at time $t = 0$ there are $N(0)$ identical unstable particles, and the number of particles $dN(t)$ decaying in time $dt$ is proportional to $dt$ and to the number of particles, then we have

$$dN(t) = -\lambda N(t)\,dt.$$

This is a so-called differential equation. Here is how it is solved with Mathematica (n stands for $N$ and G for $\lambda$):

    DSolve[{n'[t] + G n[t] == 0, n[0] == n0}, n[t], t]
    (* result: n[t] -> n0 Exp[-G t] *)

As a result we arrive at the exponential distribution.

Let's introduce the "half-time" $T$, such that $N(T) = N_0/2$. Then $e^{-\lambda T} = 1/2$, and we find $T = \ln 2/\lambda \approx 0.693/\lambda$.

3. The standard normal distribution

$$f(x) = (2\pi)^{-1/2} \exp(-x^{2}/2) \tag{7.12}$$

A. Using Mathematica, check that this PDF satisfies the normalization condition (7.5). Make a plot of (7.12). If a random variable $y$ is related to $x$ as $y = ax + b$, what does the distribution function $f(y)$ look like? (We assume that $x$ is distributed according to (7.12).)

More generally, $X$ is said to have a normal $(\mu, \sigma^{2})$ distribution if it has the density function

$$f(x) = (2\pi\sigma^{2})^{-1/2} \exp\!\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right] \tag{7.12'}$$

B. Try to analyze, by assigning different numeric values to $\mu$ and $\sigma^{2}$, how they affect the shape of $f(x)$. For instance, how are the parameters for the green and red curves related? Green and blue?
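A quick consistency check on (7.12'), sketched under two assumptions: that $\sigma > 0$, and that the standard normal density (7.12) already integrates to 1. The substitution variable $z$ is introduced here only for illustration and is not part of the lecture's notation. The change of variable $z = (x-\mu)/\sigma$ reduces the normalization integral (7.5) for (7.12') to the one for (7.12):

```latex
\begin{aligned}
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}\,
  \exp\!\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right] dx
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\,
  e^{-z^{2}/2}\,\sigma\,dz
  && \text{with } z = \frac{x-\mu}{\sigma},\quad dx = \sigma\,dz \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}\,dz \;=\; 1 .
\end{aligned}
```

The total area therefore stays equal to 1 for every choice of $\mu$ and $\sigma$.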
Probability distribution function (also called "cumulative distribution function", CDF)

1. Continuous random variable

From the "outside", random distributions are well described by the probability distribution function (we will use CDF for short) $F(x)$, defined as

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy. \tag{7.13}$$

This formula can also be rewritten in the following very useful form:

$$P(a \le X \le b) = F(b) - F(a). \tag{7.14}$$

To see what the distribution functions look like, we return to our examples.

1. The uniform distribution (7.7):

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

Using the definition (7.13) and Mathematica, try to find $F(x)$ for the uniform distribution. Prove that $F(x) = 0$ for $x \le a$; $(x-a)/(b-a)$ for $a \le x \le b$; $1$ for $x > b$. Draw the CDF for several $a$ and $b$. Consider the important special case $a = 0$, $b = 1$. How is it related to the spinner problem? To the balanced die?

2. The exponential distribution (7.8):

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

Use Mathematica to prove that

$$F(x) = \begin{cases} 0, & x \le 0 \\ 1 - \exp(-\lambda x), & x > 0 \end{cases} \tag{7.15}$$

"Lack of memory" for the exponential distribution

Suppose that $X$ has an exponential distribution (7.8). The probability that the event (such as a radioactive decay) did not happen in $t$ units of time is $P(X > t) = 1 - F(t)$. According to (7.15), this gives $P(X > t) = \exp(-\lambda t)$.

Let's now find the probability that we will have to wait some additional time $s$, given that we have already been waiting $t$ units of time. Since the event $X > t+s$ implies $X > t$,

$$P(X > t+s \mid X > t) = \frac{P(X > t+s)}{P(X > t)} = \frac{\exp[-\lambda(t+s)]}{\exp[-\lambda t]} = \exp[-\lambda s].$$

As we see, the result depends only on $s$ and does not depend on the previous waiting time. The probability that you must wait an additional $s$ units of time is the same as if you had not been waiting at all.

3. The standard normal distribution

Using Mathematica and Eq. (7.12), find F[x] for the standard normal distribution. Use the NIntegrate[f[t], {t, -Infinity, x}] and Plot[...] functions.

2. CDF for discrete random variables

For discrete variables the integration is replaced by summation:

$$F(x) = P(X \le x) = \sum_{x_i \le x} p(x_i). \tag{7.16}$$

It is clear from this formula that if $X$ takes only a finite number of values, the distribution function looks like a stairway.

[Figure: staircase plot of $F(x)$ against $x$, with jumps of height $p(x_1)$, $p(x_2)$, $p(x_3)$, $p(x_4)$ at $x_1$, $x_2$, $x_3$, $x_4$, reaching 1.]

Draw $F(x)$ for the example on page 7.

C.
• Suppose that a pair of fair dice is tossed, and let the random variable denote the sum of the points. Obtain $F(x)$ for this variable.
• For the standard normal distribution, find the interval $(-a, a)$ such that $P(-a < X < a) = 0.95$. Use Mathematica.

The home assignment is in the four blue areas marked A, B, C and D.

D.
• (1) (a) Find the constant $c$ such that the function $f(x) = c\,x^{2}$ for $0 < x < 3$, 0 otherwise, is a density function, and (b) compute $P(1 < X < 2)$ (use (7.4), (7.5) and Mathematica).
• (2) Suppose $X$ has the density function $f(x) = x/2$ for $0 < x < 2$, 0 otherwise. Find (a) the distribution function, (b) $P(X < 1)$, (c) $P(X > 3/2)$.
• (3) Let $X$ have an exponential distribution with parameter $\lambda$. Using Mathematica, find $P(X > \lambda/2)$.
• (4) Read problem 4.22 in Schaum's P&S and find the error in our solution of this problem (the previous class). It looks like the results with BD and PD are much closer to each other.
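Problem (4) turns on how close the binomial (BD) and Poisson (PD) results are. As a reference point, here is a worked comparison for Example 1 of Section 7.1 (p = 0.002, n = 1500), where the exact binomial probability of zero defects is $(1-p)^{n}$. The binomial value is an added check computed for this note, not part of the original example, and both numbers are rounded.

```latex
\begin{aligned}
\text{BD:}\quad P(X=0) &= (1-p)^{n} = 0.998^{1500} \approx 0.0496, \\
\text{PD:}\quad P(X=0) &= e^{-\lambda} = e^{-3} \approx 0.0498,
  \qquad \lambda = pn = 3 .
\end{aligned}
```

With n large and p small, the two values differ by less than 0.0002, which is the regime described by the Poisson limit in Section 7.1.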