MATH 10: Elementary Statistics and Probability
Chapter 7: The Central Limit Theorem
Tony Pourmohamad
Department of Mathematics
De Anza College
Spring 2015

The Central Limit Theorem    Using the Central Limit Theorem

Objectives
By the end of this set of slides, you should be able to:
1. Understand what the central limit theorem is
2. Recognize central limit theorem problems
3. Apply and interpret the central limit theorem for means

The Central Limit Theorem
• The Central Limit Theorem (CLT) is one of the most powerful and useful ideas in all of statistics
• For this class, we will consider two applications of the CLT:
  1. CLT for means (or averages) of random variables
  2. CLT for sums of random variables
• Let’s start with an example, courtesy of Professor Mo Geraghty
  http://nebula2.deanza.edu:16080/~mo/holistic/clt.swf
• Try exploring the following website to better understand the CLT
  http://spark.rstudio.com/minebocek/CLT_mean/

The Central Limit Theorem
• So what is happening in the CLT video?
[Figure: four histogram panels titled "10 Samples", "100 Samples", "1,000 Samples", and "10,000 Samples"; vertical axis: Frequency]

The Central Limit Theorem -- Basic Idea
• Imagine there is some population with a mean µ and standard deviation σ
• We can collect samples of size n where the value of n is "large enough"
• We can then calculate the mean of each sample
• If we create a histogram of those means, then the resulting histogram will look close to being a normal distribution
• It does not matter what the distribution of the original population is, or whether you even know it.
  The important fact is that the distribution of the sample means tends to follow the normal distribution

The Central Limit Theorem -- More Formally
• Suppose that we have a large population with mean µ and standard deviation σ
• Suppose that we select random samples of size n from this population
• Each sample taken from the population has its own average X̄
• The sample average for any specific sample may not equal the population average exactly

The Central Limit Theorem -- More Formally Continued
• The sample averages X̄ follow a probability distribution of their own
• The average of the sample averages is the population average: $\mu_{\bar{x}} = \mu$
• The standard deviation of the sample averages equals the population standard deviation divided by the square root of the sample size: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
• The distribution of the sample averages X̄ is approximately normal if the sample size is large enough
• The larger the sample size, the closer the shape of the distribution of sample averages becomes to the normal distribution
• This is the Central Limit Theorem!
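
The results above can be checked with a short simulation. The sketch below is an illustration only and is not part of the original slides: it assumes an exponential population with µ = 1 and σ = 1 (a skewed, non-normal choice), a sample size of n = 40, and 5,000 repeated samples. The function name simulate_sample_means is made up for this example.

```python
import math
import random
import statistics

# Assumed population for illustration: exponential with mean 1,
# so the population mean and standard deviation are both 1.
MU = 1.0
SIGMA = 1.0


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0 / MU) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 40
sample_means = simulate_sample_means(n, num_samples=5000)

# Compare the simulated values with mu and sigma / sqrt(n).
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3), "vs mu =", MU)
print("sd of sample means:  ", round(sd_of_means, 3),
      "vs sigma/sqrt(n) =", round(SIGMA / math.sqrt(n), 3))
```

A histogram of sample_means should look roughly bell-shaped even though the assumed exponential population is strongly skewed.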

The Central Limit Theorem -- Case 1
• IF a random sample of any size n is taken from a population with a normal distribution with mean µ and standard deviation σ
• THEN the sample mean has a normal distribution with:
  $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$

The Central Limit Theorem -- Case 1
[Figure: two panels, each marked at µ, labeled X ~ N(10, 2) and X̄ ~ N(10, 2/√50)]

The Central Limit Theorem -- Case 2
• IF a random sample of sufficiently large size n is taken from a population with ANY distribution with mean µ and standard deviation σ
• THEN the sample mean has approximately a normal distribution with:
  $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$

The Central Limit Theorem -- Case 2
[Figure: two panels, each marked at µ, labeled X ~ N(10, 2) and X̄ ~ N(µ, σ/√n)]

The Central Limit Theorem -- Recap
• 3 important results for the distribution of X̄
  1. The mean stays the same: $\mu_{\bar{x}} = \mu$
  2. The standard deviation gets smaller: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
  3. If n is sufficiently large, X̄ has approximately a normal distribution, where $\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}})$

What is Large n?
• How large does the sample size n need to be in order to use the Central Limit Theorem?
• The value of n needed to be a "large enough" sample size depends on the shape of the original distribution of the individuals in the population
• If the individuals in the original population follow a normal distribution, then the sample averages will have a normal distribution, no matter how small or large the sample size is
• If the individuals in the original population do not follow a normal distribution, then the sample averages X̄ become more normally distributed as the sample size grows larger.
  In this case the sample averages X̄ do not follow the same distribution as the original population

What is Large n? Continued
• The more skewed the original distribution of individual values, the larger the sample size needed
• If the original distribution is symmetric, the sample size needed can be smaller
• Many statistics textbooks use the rule of thumb n ≥ 30, considering 30 as the minimum sample size to use the Central Limit Theorem. But in reality there is not a universal minimum sample size that works for all distributions; the sample size needed depends on the shape of the original distribution
• In this class, we will assume the sample size is large enough for the Central Limit Theorem to be used to find probabilities for X̄

Calculating Probabilities from a Normal Distribution
• Here is the general procedure to calculate probabilities from the distribution of the sample mean X̄
  1. You are given an interval in terms of x̄, e.g. P(X̄ < x̄)
  2. Convert to a z score by using $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
  3. Look up the probability in the z table that corresponds to the z score, i.e. P(Z < z)
• This is just the same idea we used in Chapter 6!
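
The three-step procedure above can be sketched in Python, with the standard library's NormalDist standing in for the z table. This is a minimal illustration, not part of the original slides: the helper name prob_sample_mean_below and the input values (x̄ = 9.5, µ = 10, σ = 2, n = 16) are hypothetical and are not taken from Handout #5.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """Return P(X-bar < x_bar), assuming the CLT normal approximation applies."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    z = (x_bar - mu) / standard_error      # step 2: convert to a z score
    return NormalDist().cdf(z)             # step 3: P(Z < z), in place of the z table


# Hypothetical inputs chosen so that the z score is exactly -1.
p = prob_sample_mean_below(x_bar=9.5, mu=10, sigma=2, n=16)
print(round(p, 4))  # about 0.1587
```

Here σ/√n = 2/4 = 0.5, so z = (9.5 − 10)/0.5 = −1, and the result matches the z table entry for z = −1.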

Examples
• Look at Handout #5 on the website

Percentile Calculations Based on the Normal Distribution
• Here is the general procedure to calculate the value x̄ that corresponds to the Pth percentile
  1. You are given a probability or percentile desired
  2. Look up the z score in the z table that corresponds to the probability
  3. Convert to x̄ by the following formula: $\bar{x} = \mu + z \frac{\sigma}{\sqrt{n}}$
• Examples: Look at Handout #5 on the website

Using Your Calculator
• If you have a graphing calculator, your calculator can calculate all of these probabilities without using a z table
• If you want to calculate P(a < X̄ < b) follow these steps:
  1. Push 2nd, then DISTR
  2. Select normalcdf() and then push ENTER
  3. Then enter the following: normalcdf(a, b, µ, σ/√n)
• Question: If X̄ ∼ N(0, 1), what is the probability P(−1 < X̄ < 1)?
• Solution: normalcdf(−1, 1, 0, 1) = 0.6827 ≈ 68%
• Question: If X̄ ∼ N(10, 2), what is the probability P(7 < X̄ < 9)?
• Solution: normalcdf(7, 9, 10, 2) = 0.2417

Using Your Calculator
• If you want to calculate P(X̄ < a) follow these steps:
  1. Push 2nd, then DISTR
  2. Select normalcdf() and then push ENTER
  3. Then enter the following: normalcdf(−10^99, a, µ, σ/√n)
• Question: If X̄ ∼ N(10, 2), what is the probability P(X̄ < 8)?
• Solution: normalcdf(−10^99, 8, 10, 2) = 0.158656
• If you want to calculate P(X̄ > a) follow these steps:
  1. Push 2nd, then DISTR
  2. Select normalcdf() and then push ENTER
  3. Then enter the following: normalcdf(a, 10^99, µ, σ/√n)
• Question: If X̄ ∼ N(10, 2), what is the probability P(X̄ > 9)?
• Solution: normalcdf(9, 10^99, 10, 2) = 0.691462

Using Your Calculator
• If you want to calculate the value of X̄ that gives you the Pth percentile then follow these steps:
  1. Push 2nd, then DISTR
  2. Select invNorm() and then push ENTER
  3. Then enter the following: invNorm(percentile, µ, σ/√n)
• Question: If X̄ ∼ N(10, 2), what value of X̄ gives us the 25th percentile?
• Solution: invNorm(.25, 10, 2) = 8.65102
• Recall: We used the formula $\bar{x} = \mu + z \frac{\sigma}{\sqrt{n}}$, so x̄ = 10 + (−0.67)(2) = 8.66
• We got −0.67 from the z table
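
For readers without a graphing calculator, the calculator examples above can be reproduced in Python. This is a sketch under stated assumptions, not part of the original slides: the helpers normalcdf and inv_norm are hypothetical names that mirror the calculator functions and are built on the standard library's NormalDist; the constant BIG plays the role of 10^99.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, sd):
    """P(lower < X-bar < upper) for X-bar ~ N(mean, sd)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(percentile, mean, sd):
    """Value of X-bar at the given percentile (as a proportion between 0 and 1)."""
    return NormalDist(mean, sd).inv_cdf(percentile)


BIG = 1e99  # stands in for 10^99, as in the calculator steps

print(round(normalcdf(-1, 1, 0, 1), 4))     # 0.6827
print(round(normalcdf(7, 9, 10, 2), 4))     # 0.2417
print(round(normalcdf(-BIG, 8, 10, 2), 4))  # 0.1587
print(round(normalcdf(9, BIG, 10, 2), 4))   # 0.6915
print(round(inv_norm(0.25, 10, 2), 4))      # 8.651
```

As on the calculator, the last argument is the standard deviation of X̄, that is σ/√n, not the population σ.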