STAT 430/510 Lecture 12

STAT 430/510 Probability
Lecture 12: Central Limit Theorem and Exponential Distribution
Pengyuan (Penelope) Wang
June 15, 2011

Review
Discussed the Uniform Distribution and the Normal Distribution.
Normal Approximation to the Binomial Distribution.

A little thing about the Normal Distribution
For $N(\mu, \mathrm{SD} = \sigma)$:
About 68% of the probability is within $\pm\sigma$ of $\mu$.
About 95% of the probability is within $\pm 2\sigma$ of $\mu$.
About 99.7% of the probability is within $\pm 3\sigma$ of $\mu$.
So a value more than $3\sigma$ away from $\mu$ is very rare (probability about 0.3%).

Central Limit Theorem: a Generalization of the Normal Approximation to the Binomial Distribution
Let $X_i$, $i = 1, \ldots, n$, be independent and identically distributed with expected value $\mu$ and standard deviation $\sigma$.
If $n$ is large, then $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$ approximately follows a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Example
There are 100 workers in a factory. Assume that their ages are independent and identically distributed with expected value 40 years and standard deviation 10 years. What is the probability that their mean age is higher than 45 years?
Let $\bar{X}$ represent their mean age. According to the Central Limit Theorem, $\bar{X}$ approximately follows a normal distribution with mean 40 and standard deviation $10/\sqrt{100} = 1$.
$P(\bar{X} > 45) \approx P\left(Z > \frac{45 - 40}{1}\right) = P(Z > 5) \approx 0$ (about $2.9 \times 10^{-7}$).
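
The Central Limit Theorem calculation for the 100-worker example (mean 40, SD 10, n = 100) can be checked numerically. The sketch below is illustrative only: the function names are made up for this note, and because the lecture does not say which distribution the ages follow, the simulation assumes a Gamma distribution with shape 16 and scale 2.5, chosen only because it has mean 40 and SD 10.

```python
import math
import random


def normal_sf(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def clt_mean_tail(mu: float, sigma: float, n: int, threshold: float) -> float:
    """CLT approximation to P(sample mean > threshold)."""
    se = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return normal_sf((threshold - mu) / se)


def simulated_mean_tail(n: int, threshold: float, reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(sample mean > threshold).

    Assumption for illustration only: ages are Gamma(shape=16, scale=2.5),
    which has mean 40 and SD 10. The lecture does not specify a distribution.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        mean = sum(rng.gammavariate(16.0, 2.5) for _ in range(n)) / n
        if mean > threshold:
            hits += 1
    return hits / reps


# CLT answer for the factory example: P(Z > 5), a tiny number.
print(clt_mean_tail(40, 10, 100, 45))
# A less extreme threshold, P(Z > 2), which a simulation can resolve.
print(clt_mean_tail(40, 10, 100, 42))
print(simulated_mean_tail(100, 42, reps=5000))
```

The threshold 45 is too far in the tail for a small simulation to hit, which is why the comparison with simulated data uses 42 instead. The two values for threshold 42 should be close but not identical, since the normal distribution is only an approximation for finite n.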

Example
There are 100 male workers and 100 female workers in a factory. Assume that the ages of the males are independent and identically distributed with expected value 40 years and standard deviation 10 years, and that the ages of the females are independent and identically distributed with expected value 39 years and standard deviation 20 years. What is the probability that the males' mean age is at least 2 years higher than the females' mean age?
The males' mean age $\bar{X}$ approximately follows a normal distribution with mean 40 and SD $10/\sqrt{100} = 1$, and the females' mean age $\bar{Y}$ approximately follows a normal distribution with mean 39 and SD $20/\sqrt{100} = 2$.
Assuming the two groups are independent of each other, $\bar{X} - \bar{Y}$ approximately follows a normal distribution with mean $40 - 39 = 1$ and SD $\sqrt{1^2 + 2^2} = \sqrt{5}$.
So $P(\bar{X} - \bar{Y} \ge 2) \approx P\left(Z > \frac{2 - 1}{\sqrt{5}}\right) = 1 - \Phi(1/\sqrt{5}) \approx 1 - \Phi(0.45) \approx 0.33$.

Example
The normal approximation to the Binomial distribution is just one application of the Central Limit Theorem.
An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. The company's statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident prone. We assume that 30 percent of the population is accident prone. If we draw 100 policyholders, what is the probability that more than 30 of them have accidents within a year of purchasing a policy?

Example
By the law of total probability, the probability that a randomly chosen person has an accident is $0.4 \times 0.3 + 0.2 \times 0.7 = 0.26$.
Let $X$ be the proportion of the 100 policyholders who have accidents. Then $\frac{X - 0.26}{\sqrt{0.26 \times 0.74/100}}$ approximately follows the standard normal distribution.
$$P(X > 0.3) = P\left(\frac{X - 0.26}{\sqrt{0.26 \times 0.74/100}} > \frac{0.3 - 0.26}{\sqrt{0.26 \times 0.74/100}}\right) \approx P(Z > 0.91) = 1 - \Phi(0.91) \approx 0.18$$

Exponential Random Variable
A continuous random variable $X$ is said to have an exponential distribution with parameter $\lambda > 0$ if the pdf of $X$ is
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

cdf of Exponential r.v.
For an exponential r.v. $X$ with parameter $\lambda$, the cdf is
$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Expected Value and Variance
$X$ is an exponential random variable with parameter $\lambda$.
$E[X] = \frac{1}{\lambda}$
$\mathrm{Var}(X) = \frac{1}{\lambda^2}$

Example
Suppose that the length of a phone call in minutes is an exponential random variable with parameter $\lambda = 0.1$. If someone arrives immediately ahead of you at a public telephone booth, find the probability that you will have to wait
(a) More than 10 minutes?
(b) Between 10 and 20 minutes?

Example: Solution
Let $X$ denote the length of the call made by the person in the booth.
(a) $P(X > 10) = 1 - F(10) = e^{-1} \approx 0.368$
(b) $P(10 < X < 20) = F(20) - F(10) = e^{-1} - e^{-2} \approx 0.233$

Memoryless Property
The exponential distribution is memoryless! This means that if $T$ is an exponential random variable, then $P(T \ge s + t \mid T \ge t) = P(T \ge s)$ for all $s, t \ge 0$.
In fact, the exponential distribution is the only memoryless continuous distribution.

Example
Suppose that the length of a phone call in minutes is an exponential random variable with parameter $\lambda = 0.1$. Someone arrives immediately ahead of you at a public telephone booth. You have already waited for 10 minutes. What is the probability that you will need to wait at least another 10 minutes?
From the memoryless property of the exponential distribution,
$P(X > 10 + 10 \mid X > 10) = P(X > 10) = e^{-0.1 \times 10} = e^{-1} \approx 0.368$

Example
Suppose that the number of miles that a car can run before its battery wears out is exponentially distributed with an average value of 10,000 miles. If a person has used the battery for some time and now wants to take another 5000-mile trip, what is the probability that he or she will be able to complete the trip without having to replace the car battery? What can be said when the distribution is not exponential?

Example: Solution
Let $X$ be the lifetime of the battery in miles, and let $t$ be the number of miles that the battery had been in use prior to the start of the trip. An average of 10,000 miles means $\lambda = 1/10000$.
From the memoryless property of the exponential distribution,
$P(X > t + 5000 \mid X > t) = P(X > 5000) = e^{-5000/10000} = e^{-0.5} \approx 0.607$
If the distribution $F$ of $X$ is not exponential, then
$$P(X > t + 5000 \mid X > t) = \frac{P(X > t + 5000)}{P(X > t)} = \frac{1 - F(t + 5000)}{1 - F(t)}$$
which in general depends on $t$, so we would need to know how long the battery has already been in use.
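
The exponential calculations in this lecture (the phone-call probabilities and the battery example) can be reproduced with a short sketch. The helper names `exp_sf` and `conditional_sf` are made up for this note. To contrast with the non-exponential case, the code also uses a Weibull lifetime with shape 2 and scale 10,000; that distribution is an assumption chosen purely for illustration and does not come from the lecture.

```python
import math


def exp_sf(x: float, lam: float) -> float:
    """Survival function P(X > x) = 1 - F(x) for X ~ Exponential(rate lam)."""
    return math.exp(-lam * x) if x >= 0 else 1.0


def conditional_sf(sf, t: float, s: float) -> float:
    """P(X > t + s | X > t) = sf(t + s) / sf(t) for any survival function sf."""
    return sf(t + s) / sf(t)


# Phone-call example: lam = 0.1 per minute.
p_more_than_10 = exp_sf(10, 0.1)                      # e^{-1}
p_between_10_20 = exp_sf(10, 0.1) - exp_sf(20, 0.1)   # e^{-1} - e^{-2}


def battery_sf(x: float) -> float:
    """Battery lifetime: exponential with mean 10,000 miles (lam = 1/10000)."""
    return exp_sf(x, 1 / 10000)


def weibull_sf(x: float) -> float:
    """Hypothetical non-exponential lifetime: Weibull(shape=2, scale=10000)."""
    return math.exp(-((x / 10000) ** 2))


# Probability of surviving another 5000 miles after t miles of use.
miles_used = (0, 2000, 8000)
exp_results = [conditional_sf(battery_sf, t, 5000) for t in miles_used]
weibull_results = [conditional_sf(weibull_sf, t, 5000) for t in miles_used]

print(p_more_than_10, p_between_10_20)
print(exp_results)      # the same value for every t (memoryless)
print(weibull_results)  # changes with t (not memoryless)
```

For the exponential lifetime the conditional probability is the same no matter how many miles the battery has already run, while for the assumed Weibull lifetime it falls as `t` grows, which is why the non-exponential case requires knowing `t`.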