Math 141
Lecture 12: The LLN, CLT, and the Normal Distribution

Albyn Jones
Library 304
jones@reed.edu
www.people.reed.edu/~jones/courses/141

Properties of $\bar{X}_n$

Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that

$$E(\bar{X}_n) = \mu \quad \text{and} \quad SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$

In other words, typical values for $\bar{X}_n$ are around $\mu \pm \sigma/\sqrt{n}$, or more formally:

$$\bar{X}_n \to_P \mu$$

The Law of Large Numbers

The Law of Large Numbers, $\bar{X}_n \to_P \mu$, tells us that $\bar{X}_n$ gets as close to $\mu$ as you like as $n \to \infty$, with probability approaching 1.

It does not tell you how close you are at any point, or how large $n$ must be to guarantee you are as close as you would like to be!

The Central Limit Theorem

Another famous theorem, the Central Limit Theorem, answers that question!

First: The Normal Distribution

The so-called Normal distribution (aka Gaussian, or the bell-shaped curve) has its origin in approximations to Binomial probabilities for large $n$. Before discussing that approximation, we study the properties of the Normal distribution.
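Before turning to the Normal distribution, the Law of Large Numbers stated above can be illustrated with a short simulation. This is a minimal R sketch, not part of the original slides: it assumes Exponential(1) draws (so $\mu = 1$ and $\sigma = 1$), and the seed, sample sizes, and variable names are arbitrary choices made for illustration.

```r
# Illustrative sketch: running means of IID draws settle near mu (the LLN).
# Assumptions (not from the lecture): Exponential(rate = 1) draws, so mu = 1
# and sigma = 1; the seed and the sample sizes are arbitrary.
set.seed(141)
mu    <- 1
sigma <- 1
x     <- rexp(100000, rate = 1)

# Running mean: the k-th entry is the average of the first k draws.
running_mean <- cumsum(x) / seq_along(x)

n_values  <- c(10, 100, 1000, 10000, 100000)
lln_table <- data.frame(
  n         = n_values,
  xbar      = running_mean[n_values],
  abs_error = abs(running_mean[n_values] - mu),
  typical   = sigma / sqrt(n_values)   # SD of the mean, sigma / sqrt(n)
)
print(lln_table)
```

The absolute error should tend to shrink along with $\sigma/\sqrt{n}$, but the LLN by itself only promises that the error eventually becomes small, not how quickly.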
The Normal Distribution

[Figure: The Standard Normal Density. Density (0 to 0.4) plotted against Z from −3 to 3.]

The Normal Distribution, Part 2

The Normal Distribution has several important features:
- it is symmetric and unimodal,
- the mean, median, and mode coincide,
- it is completely characterized by the values of the mean $\mu$ and the standard deviation $\sigma$ (or variance $\sigma^2$),
- every Normal distribution has the same shape.

Notation

The standard notation for a random variable $X$ which has a Normal Distribution with mean $\mu$ and standard deviation $\sigma$ is

$$X \sim N(\mu, \sigma^2)$$

In other words, list the mean and variance.

Warning! R functions are parametrized by the mean $\mu$ and standard deviation $\sigma$!

The Normal Distribution, Part 3

Roughly 95% of any Normal population lies within 2 SD's of the mean, and about 99.7% lies within 3 SD's of the mean.

[Figure: Areas under the Standard Normal Curve. Density against Z from −3 to 3, with areas 0.023 (below −2), 0.136 (−2 to −1), 0.341 (−1 to 0), 0.341 (0 to 1), 0.136 (1 to 2), and 0.023 (above 2).]

The Normal(50, $10^2$) Curve

The corresponding regions for ANY Normal distribution contain the same proportions of the population!

[Figure: Areas under the Normal(50,100) Curve. Density against values from 20 to 80, with the same areas 0.023, 0.136, 0.341, 0.341, 0.136, 0.023 for the regions bounded by 30, 40, 50, 60, and 70.]

Linear functions of Normal RV's are Normal!

Let $Z \sim N(0, 1)$, and let $Y = \sigma Z + \mu$. Then

$$E(Y) = E(\sigma Z + \mu) = \sigma E(Z) + \mu = \mu$$

and

$$SD(Y) = SD(\sigma Z + \mu) = \sigma\, SD(Z) = \sigma$$

Thus:

$$Y \sim N(\mu, \sigma^2)$$

Standardization

It works the other way too: let $Y \sim N(\mu, \sigma^2)$, then $Z = (Y - \mu)/\sigma$ is a Standard Normal RV:

$$E(Z) = E\left(\frac{Y - \mu}{\sigma}\right) = \frac{1}{\sigma} E(Y - \mu) = \frac{1}{\sigma}\left(E(Y) - \mu\right) = 0$$

and likewise $SD(Z) = SD(Y)/\sigma = 1$.

A standardized RV is often called a Z-score, and represents the number of standard deviations the value is away from the mean.

Historical Footnote!
Standardizing data was a common practice back before computers; then you only needed a table of probabilities for the Standard Normal distribution!

Tables are unnecessary now, but it is still very useful to remember that areas under the Normal density curve depend only on the mean and SD, and that Z-scores measure in units of SD's.

R Functions for Normal Probabilities

- pnorm(a, µ, σ) gives $P(Y \le a)$,
- dnorm(a, µ, σ) gives the height of the curve at a.

[Figure: dnorm() and pnorm(). The Normal(50, $10^2$) density for values from 20 to 80; dnorm(55,50,10) is the height of the curve at 55, and pnorm(55,50,10) = 0.691 is the area to the left of 55.]

The Normal Density and CDF

The density function for $X \sim N(\mu, \sigma^2)$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The CDF is the area under the curve to the left of the point $x$:

$$P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt$$

Cumulative Normal Probabilities: pnorm() and qnorm()

The CDF is the area under the density curve up to a point, given by pnorm(); qnorm() is the inverse function of pnorm().

[Figure: The Cumulative Distribution Function, plotted for values from 20 to 80, marking pnorm(55,50,10) = 0.691 and qnorm(.691,50,10) = 55.]

Another pnorm example

[Figure: the Standard Normal density for x from −3 to 3, showing pnorm(1) − pnorm(−1) = .68...]

pnorm() and qnorm()

QUIZ!
What is qnorm(pnorm(0))?
What is pnorm(qnorm(.5))?

Sample Means

Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that

$$E(\bar{X}_n) = \mu \quad \text{and} \quad SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$

The Central Limit Theorem

Let $X_1, X_2, \ldots, X_n$ be IID random variables, with mean $\mu$ and standard deviation $\sigma$. Then as $n$ increases, the distribution of $\bar{X}_n$ approaches that of a Normal with mean $\mu$ and SD $\sigma/\sqrt{n}$:

$$P\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) \to \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\, dt$$

The Central Limit Theorem, Three versions

We actually have three ways of describing the normal approximation:

1. $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
2. $\bar{X} \sim N(\mu, \sigma^2/n)$
3. $\sum X_i \sim N(n\mu, n\sigma^2)$

Interpretation

It is the CLT that allows us to say that

$$\bar{X} \approx \mu \pm \sigma/\sqrt{n}$$

For the Normal distribution, the SD really is a typical deviation!

Finally: this also explains why the SD is often more useful than other measures of spread.

Example: Binomial

Let $X_i$ be $n$ IID Bernoulli($p$) RV's. Then $\mu = p$ and $\sigma = \sqrt{p(1-p)}$, while $\bar{X} = \hat{p}$.
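The Bernoulli setup just described can be checked directly in R. The following is a minimal sketch rather than anything from the slides; the values $p = 0.3$ and $n = 1000$, the seed, and the variable names are assumptions chosen only for illustration.

```r
# Sketch: for 0/1 (Bernoulli) data the sample mean is the sample proportion.
# Assumed for illustration only: p = 0.3, n = 1000, and the seed.
set.seed(141)
p <- 0.3
n <- 1000
x <- rbinom(n, size = 1, prob = p)   # n IID Bernoulli(p) draws

p_hat    <- sum(x == 1) / n          # proportion of successes
x_bar    <- mean(x)                  # sample mean of the 0/1 values
sigma    <- sqrt(p * (1 - p))        # SD of a single Bernoulli(p) draw
se_p_hat <- sigma / sqrt(n)          # SD of the average, sigma / sqrt(n)

print(c(p_hat = p_hat, x_bar = x_bar, sigma = sigma, se_p_hat = se_p_hat))
```

Here `p_hat` and `x_bar` are the same number computed two ways, which is why results about $\bar{X}$ apply to sample proportions.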
Standardized Averages

$$\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0, 1)$$

Averages

$$\hat{p} \sim N\left(p,\ p(1-p)/n\right)$$

Sums

$\sum X_i \sim \text{Binomial}(n, p)$ exactly, which is approximately $N(np,\ np(1-p))$.

Example: Binomial(20, 1/2)

[Figure: Binomial(20,.5) and N(10, 20*.5*.5). Binomial probabilities for counts 0 to 20 with the Normal curve overlaid.]

Example: Binomial(50, 1/2)

[Figure: Binomial(50,.5) and N(25, 50*.5*.5). Binomial probabilities for counts 0 to 50 with the Normal curve overlaid.]

Example: Binomial(100, 1/2)

If $X \sim \text{Binomial}(100, 1/2)$, then

$$E(X) = np = 100/2 = 50$$

$$SD(X) = \sqrt{np(1-p)} = \sqrt{100/4} = 5$$

So typical values for $X$ lie between about 45 and 55, and roughly 95% of the time, $40 \le X \le 60$.

The exact probability, sum(dbinom(40:60,100,.5)), is 0.9648.

Example: Binomial(100, 1/100)

If $X \sim \text{Binomial}(100, 1/100)$, then

$$E(X) = np = 100/100 = 1$$

$$SD(X) = \sqrt{np(1-p)} = \sqrt{100 \cdot .01 \cdot .99} \approx 1$$

So typical values for $X$ are roughly 0 to 2, and according to the CLT, roughly 95% of the time, $-1 \le X \le 3$.

In fact sum(dbinom(0:3,100,.01)) gives 0.9816, while sum(dbinom(0:2,100,.01)) gives 0.92. The Poisson approximation works better here!

[Figure: Binomial(100,.01) and N(1,.995). Binomial probabilities for counts 0 to 16 with the Normal curve overlaid.]

Summary

The Normal distribution originated as a means of approximating probabilities for sums and averages. For IID RV's $X_i$ with mean $\mu$ and variance $\sigma^2$, approximately for large $n$,

$$\bar{X} \sim N(\mu, \sigma^2/n)$$
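The two Binomial(100, ·) examples above can be re-checked numerically. The following R sketch is an illustration, not code from the lecture: the helper name `within_2sd` is made up here, and it uses the unrounded SD and no continuity correction, which is one of several reasonable ways to set up the comparison.

```r
# Sketch: exact Binomial probability of landing within 2 SDs of the mean,
# next to the Normal (CLT) and Poisson approximations for the same counts.
# The helper name `within_2sd` is hypothetical, made up for this illustration.
within_2sd <- function(n, p) {
  mu    <- n * p
  sigma <- sqrt(n * p * (1 - p))
  lo    <- max(0, ceiling(mu - 2 * sigma))   # smallest count in the interval
  hi    <- min(n, floor(mu + 2 * sigma))     # largest count in the interval
  c(
    exact   = sum(dbinom(lo:hi, n, p)),
    # Normal area over mu +/- 2 SD (no continuity correction)
    normal  = pnorm(mu + 2 * sigma, mu, sigma) - pnorm(mu - 2 * sigma, mu, sigma),
    # Poisson with the same mean, summed over the same counts
    poisson = sum(dpois(lo:hi, mu))
  )
}

fair <- within_2sd(100, 0.5)    # the Binomial(100, 1/2) example
rare <- within_2sd(100, 0.01)   # the Binomial(100, 1/100) example
print(round(rbind(fair, rare), 4))
```

For $p = 1/2$ the exact value should match the 0.9648 quoted above. For $p = 1/100$ the unrounded SD is about 0.995, so the 2-SD interval covers only the counts 0, 1, and 2, and the exact value matches the 0.92 figure rather than 0.9816. Comparing the `normal` and `poisson` columns against `exact` in each row shows which approximation is closer in each case.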