PROBABILITY AND STATISTICS FOR ENGINEERING
Hossein Sameti
Department of Computer Engineering
Sharif University of Technology

Let $X$ represent a binomial r.v. Then from

$$P(\text{``the number of occurrences of } A \text{ is between } k_1 \text{ and } k_2\text{''}) = P(X = k_1 \cup X = k_1 + 1 \cup \cdots \cup X = k_2) = \sum_{k=k_1}^{k_2} P(X = k) = \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k}$$

we have

$$P(k_1 \le X \le k_2) = \sum_{k=k_1}^{k_2} P_n(k) = \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k}.$$

This sum is difficult to evaluate directly for large $n$. In this context, two approximations are extremely useful.

The Normal Approximation (DeMoivre-Laplace Theorem)

If $npq \gg 1$ (for example, when $n \to \infty$ with $p$ held fixed), then for $k$ in the $\sqrt{npq}$ neighborhood of $np$, we can approximate

$$\binom{n}{k} p^k q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}}\, e^{-(k-np)^2 / 2npq}.$$

If $k_1$ and $k_2$ are within $\left(np - \sqrt{npq},\; np + \sqrt{npq}\right)$, we therefore have

$$P(k_1 \le X \le k_2) \approx \int_{k_1}^{k_2} \frac{1}{\sqrt{2\pi npq}}\, e^{-(x-np)^2 / 2npq}\, dx = \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy,$$

where

$$x_1 = \frac{k_1 - np}{\sqrt{npq}}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}}.$$

We can express this formula in terms of the normalized integral

$$\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-y^2/2}\, dy = -\operatorname{erf}(-x),$$

which has been tabulated extensively. Note that this definition differs from the standard error function: here $\operatorname{erf}(x) = \Phi(x) - 1/2$, where $\Phi$ is the standard normal distribution function.

x      erf(x)     x      erf(x)     x      erf(x)     x      erf(x)
0.05   0.01994    0.80   0.28814    1.55   0.43943    2.30   0.48928
0.10   0.03983    0.85   0.30234    1.60   0.44520    2.35   0.49061
0.15   0.05962    0.90   0.31594    1.65   0.45053    2.40   0.49180
0.20   0.07926    0.95   0.32894    1.70   0.45543    2.45   0.49286
0.25   0.09871    1.00   0.34134    1.75   0.45994    2.50   0.49379
0.30   0.11791    1.05   0.35314    1.80   0.46407    2.55   0.49461
0.35   0.13683    1.10   0.36433    1.85   0.46784    2.60   0.49534
0.40   0.15542    1.15   0.37493    1.90   0.47128    2.65   0.49597
0.45   0.17364    1.20   0.38493    1.95   0.47441    2.70   0.49653
0.50   0.19146    1.25   0.39435    2.00   0.47726    2.75   0.49702
0.55   0.20884    1.30   0.40320    2.05   0.47982    2.80   0.49744
0.60   0.22575    1.35   0.41149    2.10   0.48214    2.85   0.49781
0.65   0.24215    1.40   0.41924    2.15   0.48422    2.90   0.49813
0.70   0.25804    1.45   0.42647    2.20   0.48610    2.95   0.49841
0.75   0.27337    1.50   0.43319    2.25   0.48778    3.00   0.49865

Example
A fair coin is tossed 5,000 times.
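Find the probability that the number of heads is between 2,475 and 2,525.

Solution
We need $P(2{,}475 \le X \le 2{,}525)$. Since $n$ is large we can use the normal approximation. Here $p = \tfrac{1}{2}$, so that $np = 2{,}500$ and $\sqrt{npq} \approx 35$. Then $np - \sqrt{npq} \approx 2{,}465$ and $np + \sqrt{npq} \approx 2{,}535$, so the approximation is valid for $k_1 = 2{,}475$ and $k_2 = 2{,}525$:

$$P(k_1 \le X \le k_2) \approx \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy.$$

Here,

$$x_1 = \frac{k_1 - np}{\sqrt{npq}} = -\frac{5}{7}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}} = \frac{5}{7}.$$

Using the table, $\operatorname{erf}(0.7) \approx 0.258$, so

$$P(2{,}475 \le X \le 2{,}525) \approx \operatorname{erf}(x_2) - \operatorname{erf}(x_1) = \operatorname{erf}(x_2) + \operatorname{erf}(|x_1|) = 2\operatorname{erf}\!\left(\tfrac{5}{7}\right) \approx 0.516.$$

The Poisson Approximation

For large $n$, the Gaussian approximation of a binomial r.v. is valid only if $p$ is fixed, i.e., only if $np \gg 1$ and $npq \gg 1$. What if $np$ is small, or if it does not increase with $n$? This happens, for example, when $p \to 0$ as $n \to \infty$ such that $np = \lambda$ is a fixed number.

The Poisson Theorem

If $p \to 0$, $n \to \infty$ and $np \to \lambda$, then

$$P_n(k) = \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k} \;\xrightarrow[n \to \infty]{}\; e^{-\lambda}\, \frac{\lambda^k}{k!}.$$

Consider random arrivals such as telephone calls over a line.

$n$: total number of calls in the interval $(0, T)$. As $T \to \infty$ we have $n \to \infty$. Suppose $n = \mu T$.
$\Delta$: a small interval of duration $\Delta$.

Let $P_n(k) = P(k \text{ in } \Delta)$ be the probability of $k$ calls in the interval $\Delta$. We have shown that

$$P_n(k) = P(k \text{ in } \Delta) = \binom{n}{k} p^k q^{n-k}, \qquad \text{where } p = \frac{\Delta}{T}.$$

[Figure: $n$ calls on the interval $(0, T)$, with a subinterval of duration $\Delta$ marked]

$p$: probability of a particular call occurring in $\Delta$, $p = \Delta / T$, so $p \to 0$ as $T \to \infty$, while

$$np = \mu T \cdot \frac{\Delta}{T} = \mu \Delta = \lambda.$$

The normal approximation is invalid here because the condition $np \gg 1$ does not hold.

$P_n(k)$: probability of obtaining $k$ calls (in any order) in an interval of duration $\Delta$,

$$P_n(k) = \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k}.$$

Writing $np = \lambda$,

$$P_n(k) = \frac{n(n-1)\cdots(n-k+1)}{n^k}\, \frac{(np)^k}{k!} \left(1 - \frac{np}{n}\right)^{n-k} = \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right) \frac{\lambda^k}{k!}\, \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^k}.$$

For fixed $k$, each factor $(1 - j/n)$ and the denominator $(1 - \lambda/n)^k$ tend to 1, while $(1 - \lambda/n)^n \to e^{-\lambda}$. Thus,

$$\lim_{n \to \infty,\; p \to 0,\; np = \lambda} P_n(k) = \frac{\lambda^k}{k!}\, e^{-\lambda},$$

the Poisson p.m.f.

Example: Winning a Lottery

Suppose two million lottery tickets are issued, with 100 winning tickets among them.

a) If a person purchases 100 tickets, what is the probability of winning?

Solution
The probability of buying a winning ticket is

$$p = \frac{\text{Total no. of winning tickets}}{\text{Total no. of tickets}} = \frac{100}{2 \times 10^6} = 5 \times 10^{-5}.$$

Let $X$ be the number of winning tickets and $n$ the number of purchased tickets. Then $X$ has an approximate Poisson distribution with parameter

$$\lambda = np = 100 \times 5 \times 10^{-5} = 0.005, \qquad P(X = k) = e^{-\lambda}\, \frac{\lambda^k}{k!}.$$

So the probability of winning is

$$P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-\lambda} \approx 0.005.$$

b) How many tickets should one buy to be 95% confident of having a winning ticket?

Solution
We need $P(X \ge 1) \ge 0.95$. Now $P(X \ge 1) = 1 - e^{-\lambda} \ge 0.95$ implies $\lambda \ge \ln 20 \approx 3$. But $\lambda = np = n \times 5 \times 10^{-5} \ge 3$, or $n \ge 60{,}000$. Thus one needs to buy about 60,000 tickets to be 95% confident of having a winning ticket!

Example: Danger in Space Mission

A spacecraft has 100,000 components ($n \to \infty$). The probability of any one component being defective is $2 \times 10^{-5}$ ($p \to 0$). The mission will be in danger if five or more components become defective. Find the probability of such an event.

Solution
Since $n$ is large and $p$ is small, we use the Poisson approximation with parameter $\lambda = np = 100{,}000 \times 2 \times 10^{-5} = 2$:

$$P(X \ge 5) = 1 - P(X \le 4) = 1 - \sum_{k=0}^{4} e^{-\lambda}\, \frac{\lambda^k}{k!} = 1 - e^{-2}\left(1 + 2 + 2 + \frac{4}{3} + \frac{2}{3}\right) = 1 - 7e^{-2} \approx 0.0527.$$

Conditional Probability Density Function

Recall that

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \ne 0.$$

With $F_X(x) = P\{X(\xi) \le x\}$, we define the conditional distribution

$$F_X(x \mid B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap B\}}{P(B)}.$$

It satisfies

$$F_X(+\infty \mid B) = \frac{P\{(X(\xi) \le +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1, \qquad F_X(-\infty \mid B) = \frac{P\{(X(\xi) \le -\infty) \cap B\}}{P(B)} = \frac{P(\emptyset)}{P(B)} = 0.$$

Further,

$$P(x_1 < X(\xi) \le x_2 \mid B) = \frac{P\{(x_1 < X(\xi) \le x_2) \cap B\}}{P(B)} = F_X(x_2 \mid B) - F_X(x_1 \mid B),$$

since for $x_2 \ge x_1$,

$$\{X(\xi) \le x_2\} = \{X(\xi) \le x_1\} \cup \{x_1 < X(\xi) \le x_2\}.$$

The conditional density is

$$f_X(x \mid B) = \frac{dF_X(x \mid B)}{dx}, \qquad F_X(x \mid B) = \int_{-\infty}^{x} f_X(u \mid B)\, du,$$

so that

$$P(x_1 < X(\xi) \le x_2 \mid B) = \int_{x_1}^{x_2} f_X(x \mid B)\, dx.$$

Example
Toss a coin and let $X(T) = 0$, $X(H) = 1$. Suppose $B = \{H\}$. Determine $F_X(x \mid B)$.

Solution
$F_X(x)$ has the form shown in panel (a) of the figure. We need $F_X(x \mid B)$ for all $x$.

[Figure: (a) $F_X(x)$; (b) $F_X(x \mid B)$]

For $x < 0$, $\{X(\xi) \le x\} = \emptyset$, so that $\{(X(\xi) \le x) \cap B\} = \emptyset$ and $F_X(x \mid B) = 0$.

For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $\{(X(\xi) \le x) \cap B\} = \{T\} \cap \{H\} = \emptyset$ and $F_X(x \mid B) = 0$.

For $x \ge 1$, $\{X(\xi) \le x\} = \Omega$, and $\{(X(\xi) \le x) \cap B\} = \Omega \cap B = B$, so that

$$F_X(x \mid B) = \frac{P(B)}{P(B)} = 1.$$

Example
Given $F_X(x)$, suppose $B = \{X(\xi) \le a\}$. Find $f_X(x \mid B)$.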
Solution
We will first determine $F_X(x \mid B)$:

$$F_X(x \mid B) = \frac{P\{(X \le x) \cap (X \le a)\}}{P(X \le a)}.$$

For $x < a$, $(X \le x) \cap (X \le a) = (X \le x)$, so that

$$F_X(x \mid B) = \frac{P(X \le x)}{P(X \le a)} = \frac{F_X(x)}{F_X(a)}.$$

For $x \ge a$, $(X \le x) \cap (X \le a) = (X \le a)$, so that $F_X(x \mid B) = 1$.

Thus

$$F_X(x \mid B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)}, & x < a, \\[2mm] 1, & x \ge a, \end{cases}$$

and hence

$$f_X(x \mid B) = \frac{d}{dx} F_X(x \mid B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)}, & x < a, \\[2mm] 0, & \text{otherwise.} \end{cases}$$

[Figure: (a) $F_X(x \mid B)$ and $F_X(x)$; (b) $f_X(x \mid B)$ and $f_X(x)$; the point $a$ is marked on the $x$ axis]

Example
Let $B$ represent the event $\{a < X(\xi) \le b\}$ with $b > a$. For a given $F_X(x)$, determine $F_X(x \mid B)$ and $f_X(x \mid B)$.

Solution

$$F_X(x \mid B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{P(a < X(\xi) \le b)} = \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{F_X(b) - F_X(a)}.$$

For $x < a$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \emptyset$, and hence $F_X(x \mid B) = 0$.

For $a \le x < b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le x\}$, and hence

$$F_X(x \mid B) = \frac{P(a < X(\xi) \le x)}{F_X(b) - F_X(a)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)}.$$

For $x \ge b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le b\}$, so that $F_X(x \mid B) = 1$.

Thus,

$$f_X(x \mid B) = \begin{cases} \dfrac{f_X(x)}{F_X(b) - F_X(a)}, & a < x \le b, \\[2mm] 0, & \text{otherwise.} \end{cases}$$

[Figure: $f_X(x \mid B)$ and $f_X(x)$, with $a$ and $b$ marked on the $x$ axis]

Conditional p.d.f & Bayes' Theorem

First, we extend the conditional probability results to random variables. We know that if $U = [A_1, A_2, \ldots, A_n]$ is a partition of $S$ and $B$ is an arbitrary event, then

$$P(B) = P(B \mid A_1) P(A_1) + \cdots + P(B \mid A_n) P(A_n).$$

By setting $B = \{X \le x\}$ we obtain

$$P(X \le x) = P(X \le x \mid A_1) P(A_1) + \cdots + P(X \le x \mid A_n) P(A_n).$$

Hence

$$F(x) = F(x \mid A_1) P(A_1) + \cdots + F(x \mid A_n) P(A_n),$$
$$f(x) = f(x \mid A_1) P(A_1) + \cdots + f(x \mid A_n) P(A_n).$$

Using

$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)},$$

we obtain

$$P(A \mid X \le x) = \frac{P(X \le x \mid A)}{P(X \le x)}\, P(A) = \frac{F_X(x \mid A)}{F_X(x)}\, P(A).$$

For $B = \{x_1 < X(\xi) \le x_2\}$,

$$P(A \mid x_1 < X \le x_2) = \frac{P(x_1 < X \le x_2 \mid A)}{P(x_1 < X \le x_2)}\, P(A) = \frac{F_X(x_2 \mid A) - F_X(x_1 \mid A)}{F_X(x_2) - F_X(x_1)}\, P(A) = \frac{\int_{x_1}^{x_2} f_X(x \mid A)\, dx}{\int_{x_1}^{x_2} f_X(x)\, dx}\, P(A).$$

Let $x_1 = x$, $x_2 = x + \varepsilon$, $\varepsilon > 0$, so that in the limit as $\varepsilon \to 0$,

$$\lim_{\varepsilon \to 0} P(A \mid x < X \le x + \varepsilon) = P(A \mid X = x) = \frac{f_X(x \mid A)}{f_X(x)}\, P(A),$$

or

$$f_{X|A}(x \mid A) = \frac{P(A \mid X = x)\, f_X(x)}{P(A)}.$$

Integrating both sides over $x$, and using $\int_{-\infty}^{\infty} f_X(x \mid A)\, dx = 1$, we also get

$$P(A) \int_{-\infty}^{\infty} f_X(x \mid A)\, dx = \int_{-\infty}^{\infty} P(A \mid X = x)\, f_X(x)\, dx,$$

or

$$P(A) = \int_{-\infty}^{\infty} P(A \mid X = x)\, f_X(x)\, dx \qquad \text{(Total Probability Theorem).}$$

Bayes' Theorem (continuous version)

Using the total probability theorem in

$$f_{X|A}(x \mid A) = \frac{P(A \mid X = x)\, f_X(x)}{P(A)},$$

we get the desired result

$$f_{X|A}(x \mid A) = \frac{P(A \mid X = x)\, f_X(x)}{\int_{-\infty}^{\infty} P(A \mid X = x)\, f_X(x)\, dx}.$$

Example: Coin Tossing Problem Revisited

Let $p = P(H)$ be the probability of obtaining a head in a toss. For a given coin, a priori $p$ can possess any value in $(0, 1)$, so in the absence of any additional information we take $f_P(p)$ to be uniform on $(0, 1)$. After tossing the coin $n$ times, $k$ heads are observed. How can we update $f_P(p)$ given this new information?

[Figure: uniform $f_P(p)$ on $(0, 1)$]

Solution
Let $A$ = "$k$ heads in $n$ specific tosses". Since these tosses result in a specific sequence,

$$P(A \mid P = p) = p^k q^{n-k},$$

and using the Total Probability Theorem we get

$$P(A) = \int_0^1 P(A \mid P = p)\, f_P(p)\, dp = \int_0^1 p^k (1-p)^{n-k}\, dp = \frac{(n-k)!\,k!}{(n+1)!}.$$

The a-posteriori p.d.f. $f_{P|A}(p \mid A)$ represents the updated information given the event $A$. Using the continuous version of Bayes' theorem,

$$f_{P|A}(p \mid A) = \frac{P(A \mid P = p)\, f_P(p)}{P(A)} = \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}, \qquad 0 < p < 1, \quad \sim \beta(n, k).$$

This is a beta distribution; in the standard parameterization it is $\mathrm{Beta}(k+1,\, n-k+1)$.

[Figure: the a-posteriori p.d.f. $f_{P|A}(p \mid A)$ on $(0, 1)$]

We can use this a-posteriori p.d.f. to make further predictions. For example, in the light of the above experiment, what can we say about the probability of a head occurring in the next, $(n+1)$th, toss?

Let $B$ = "head occurring in the $(n+1)$th toss, given that $k$ heads have occurred in $n$ previous tosses". Clearly $P(B \mid P = p) = p$. From the Total Probability Theorem,

$$P(B) = \int_0^1 P(B \mid P = p)\, f_P(p \mid A)\, dp.$$

Substituting the a-posteriori p.d.f. into this integral, we get

$$P(B) = \int_0^1 p \cdot \frac{(n+1)!}{(n-k)!\,k!}\, p^k q^{n-k}\, dp = \frac{k+1}{n+2}.$$

Thus, if $n = 10$ and $k = 6$, then

$$P(B) = \frac{7}{12} \approx 0.58,$$

which is more realistic compared to $p = 0.5$.
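
As a rough numerical cross-check of the worked examples in this lecture, the following Python sketch (standard library only) recomputes the coin-toss normal approximation, the lottery and space-mission Poisson results, and the prediction $(k+1)/(n+2)$ for the next toss. All function names are illustrative and not part of the lecture. `lecture_erf` implements the tabulated integral used here, which is assumed to equal $\tfrac{1}{2}\operatorname{erf}(x/\sqrt{2})$ in terms of Python's standard `math.erf`. The integral for the next toss is evaluated with a simple midpoint rule, which is a choice made for this sketch only.

```python
import math


def lecture_erf(x):
    """The lecture's tabulated 'erf': (1/sqrt(2*pi)) * integral_0^x exp(-y^2/2) dy."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def normal_approx(n, p, k1, k2):
    """DeMoivre-Laplace approximation of P(k1 <= X <= k2), X ~ Binomial(n, p)."""
    mean = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return lecture_erf((k2 - mean) / sigma) - lecture_erf((k1 - mean) / sigma)


def fair_coin_exact(n, k1, k2):
    """Exact P(k1 <= X <= k2) for p = 1/2, using integers to avoid float overflow."""
    favourable = sum(math.comb(n, k) for k in range(k1, k2 + 1))
    return favourable / 2**n


def poisson_pmf(lam, k):
    """Poisson p.m.f. exp(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_tail(lam, k_min):
    """P(X >= k_min) for X ~ Poisson(lam)."""
    return 1.0 - sum(poisson_pmf(lam, k) for k in range(k_min))


def tickets_for_confidence(p_ticket, confidence):
    """Smallest n with 1 - exp(-n * p_ticket) >= confidence (Poisson approximation)."""
    return math.ceil(-math.log(1.0 - confidence) / p_ticket)


def next_head_probability(n, k, steps=10_000):
    """Midpoint-rule value of the integral of p * f_{P|A}(p | A) over (0, 1)."""
    const = math.factorial(n + 1) / (math.factorial(n - k) * math.factorial(k))
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += p * const * p**k * (1.0 - p) ** (n - k)
    return total * h


if __name__ == "__main__":
    print("coin, normal approx :", normal_approx(5000, 0.5, 2475, 2525))
    print("coin, exact binomial:", fair_coin_exact(5000, 2475, 2525))
    print("lottery, P(win)     :", poisson_tail(100 * 5e-5, 1))
    print("lottery, tickets    :", tickets_for_confidence(5e-5, 0.95))
    print("space mission       :", poisson_tail(2.0, 5))
    print("next head, n=10 k=6 :", next_head_probability(10, 6), "vs", 7 / 12)
```

The coin-toss value from `normal_approx` differs slightly from the 0.516 quoted in the example because the lecture rounds $\sqrt{npq}$ to 35 and reads the table at 0.7; the exact binomial sum is somewhat larger than either, since the approximation omits a continuity correction. `tickets_for_confidence(5e-5, 0.95)` gives 59,915, consistent with the rounded figure of about 60,000.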