Paul Cornwell
March 31, 2011

1
Let X1, …, Xn be independent, identically distributed random variables with positive variance. Averages of these variables will be approximately normally distributed with mean μ and standard deviation σ/√n when n is large.

2
How large a sample size is required for the Central Limit Theorem (CLT) approximation to be good? What is a "good" approximation?

3
The CLT permits analysis of random variables even when the underlying distribution is unknown:
◦ Estimating parameters
◦ Hypothesis testing
◦ Polling

4
Perform a hypothesis test to determine whether a set of data came from a normal distribution.
Considerations
◦ Power: the probability that a test rejects the null hypothesis when it is false
◦ Ease of use

5
Problems
◦ No test is desirable in every situation (there is no uniformly most powerful test)
◦ Some tests cannot handle the composite hypothesis of normality (i.e., a nonstandard normal)
◦ The reliability of these tests is sensitive to sample size; with enough data, the null hypothesis will be rejected

6
The normal distribution is symmetric, unimodal, bell-shaped, and continuous.

7
Skewness: measures the asymmetry of a distribution.
◦ Defined as the third standardized moment
◦ The skewness of the normal distribution is 0

\mathrm{skewness} = E\!\left[\left(\frac{X-\mu}{\sigma}\right)^{3}\right] \approx \frac{\sum_{i=1}^{n}(X_i - \bar{X})^{3}}{(n-1)\,s^{3}}

8
Kurtosis: measures peakedness and the heaviness of the tails.
◦ Defined as the fourth standardized moment
◦ The kurtosis of the normal distribution is 3

\mathrm{kurtosis} = E\!\left[\left(\frac{X-\mu}{\sigma}\right)^{4}\right] \approx \frac{\sum_{i=1}^{n}(X_i - \bar{X})^{4}}{(n-1)\,s^{4}}

9
Binomial distribution. Cumulative distribution function:

F(x; n, p) = \sum_{i=0}^{\lfloor x \rfloor} \binom{n}{i} p^{i}(1-p)^{n-i}, \qquad E[X] = np, \qquad \mathrm{Var}[X] = np(1-p)

10
parameters         Kurtosis       Skewness      % outside 1.96*sd   K-S distance   Mean      Std Dev
n = 20,  p = .2    -.0014 (.25)   .3325 (1.5)   .0434               .128           3.9999    1.786
n = 25,  p = .2    .002           .3013         .0743               .116           5.0007    2.002
n = 30,  p = .2    .0235          .2786         .0363               .106           5.997     2.188
n = 50,  p = .2    .0106          .209          .0496               .083           10.001    2.832
n = 100, p = .2    .005           .149          .05988              .0574          19.997    4.0055
*from R

11
Uniform distribution. Cumulative distribution function:

F(x; a, b) = \frac{x - a}{b - a}, \qquad E[X] = \frac{a + b}{2}, \qquad \mathrm{Var}[X] = \frac{(b - a)^{2}}{12}

12
parameters               Kurtosis       Skewness    % outside 1.96*sd   K-S distance   Mean    Std Dev
n = 5, (a,b) = (0, 1)    -.236 (-1.2)   .004 (0)    .0477               .0061          .4998   .1289 (.129)
n = 5, (a,b) = (0, 50)   -.234          0           .04785              .0058          24.99   6.468 (6.455)
n = 5, (a,b) = (0, .1)   -.238          -.0008      .048                .0060          .0500   .0129 (.0129)
n = 3, (a,b) = (0, 50)   -.397          -.001       .0468               .01            24.99   8.326 (8.333)
*from R

13
Exponential distribution. Cumulative distribution function:

F(x; \lambda) = 1 - e^{-\lambda x}, \qquad E[X] = \frac{1}{\lambda}, \qquad \mathrm{Var}[X] = \frac{1}{\lambda^{2}}

14
parameters      Kurtosis    Skewness   % outside 1.96*sd   K-S distance   Mean     Std Dev
n = 5,  λ = 1   1.239 (6)   .904 (2)   .0434               .0598          .9995    .4473 (.4472)
n = 10          .597        .630       .045                .042           1.0005   .316 (.316)
n = 15          .396        .515       .0464               .034           .9997    .258 (.2581)
*from R (a simulation sketch follows slide 18)

15
Find n values for more distributions.
Refine the criteria for quality of approximation.
Explore distributions without a mean.
Classify distributions in order to have more general guidelines for minimum sample size.

16
Paul Cornwell
May 2, 2011

17
Central Limit Theorem: averages of i.i.d. variables become normally distributed as the sample size increases.
The rate of convergence depends on the underlying distribution.
What sample size is needed to produce a good approximation from the CLT?

18
◦ Real-life applications of the Central Limit Theorem
◦ What does kurtosis tell us about a distribution?
◦ What is the rationale for requiring np ≥ 5?
◦ What about distributions with no mean?
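The tables numbered 10, 12, and 14 above report summary statistics of simulated sampling distributions computed in R. Below is a minimal sketch of how such summaries could be reproduced; the replication count, the function name clt_summary, and the use of ks.test against a normal with fitted mean and standard deviation are illustrative assumptions, not details taken from the slides.

```r
# Sketch: simulate the sampling distribution of the mean of n draws from a
# chosen parent distribution and compute the statistics reported in the tables.
# reps (number of simulated means) is an assumed value, not from the slides.

clt_summary <- function(n, rdist, reps = 100000) {
  xbar <- replicate(reps, mean(rdist(n)))   # simulated sample means
  m <- mean(xbar)
  s <- sd(xbar)
  z <- (xbar - m) / s                       # standardized means
  skewness <- mean(z^3)                     # third standardized moment
  exc.kurtosis <- mean(z^4) - 3             # fourth standardized moment minus 3
  tail.prob <- mean(abs(z) > 1.96)          # proportion outside +/- 1.96 sd
  ks <- ks.test(xbar, "pnorm", mean = m, sd = s)$statistic  # K-S distance to fitted normal
  c(kurtosis = exc.kurtosis, skewness = skewness, tail.prob = tail.prob,
    ks.distance = unname(ks), mean = m, sd = s)
}

# Example: averages of n = 5 exponential(rate = 1) variables
# (compare with the first row of the table on slide 14)
round(clt_summary(5, function(n) rexp(n, rate = 1)), 4)
```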
19
Applications:
◦ The probability distribution of the total distance covered in a random walk tends towards a normal
◦ Hypothesis testing
◦ Confidence intervals (polling)
◦ Signal processing, noise cancellation

20
Kurtosis measures the "peakedness" of a distribution; higher peaks tend to come with fatter tails.

\mathrm{excess\ kurtosis} = E\!\left[\left(\frac{X-\mu}{\sigma}\right)^{4}\right] - 3

21
The traditional assumption for normality with the binomial is np ≥ 5 (or 10).
The skewness of the binomial distribution increases as p moves away from .5.
Larger n is required for convergence for skewed distributions.

22
Cauchy distribution:
◦ Has no moments (including mean and variance)
◦ The distribution of averages looks like the original distribution
◦ The CLT does not apply

f(x) = \frac{1}{\pi(1 + x^{2})}

23
Beta distribution with α = β = 1/3:
◦ The distribution is symmetric and bimodal
◦ Convergence of averages to the normal is fast

24
Student's t distribution:
◦ Heavier-tailed, bell-shaped curve
◦ Approaches the normal distribution as the degrees of freedom increase

25
Four statistics: K-S distance, tail probabilities, skewness, and kurtosis.
Different thresholds define "adequate" and "superior" approximations; both sets of thresholds are fairly conservative.

26
Sample size required for an "adequate" approximation:

Distribution            |Kurtosis| < .5   |Skewness| < .25   Tail Prob. .04 < x < .06   K-S Distance < .05   max
Uniform                 3                 1                  2                          2                    3
Beta (α = β = 1/3)      4                 1                  3                          3                    4
Exponential             12                64                 5                          8                    64
Binomial (p = .1)       11                114                14                         332                  332
Binomial (p = .5)       4                 1                  12                         68                   68
Student's t, 2.5 df     NA                NA                 13                         20                   20
Student's t, 4.1 df     120               1                  1                          2                    120

27
Sample size required for a "superior" approximation:

Distribution            |Kurtosis| < .3   |Skewness| < .15   Tail Prob. .04 < x < .06   K-S Distance < .02   max
Uniform                 4                 1                  2                          2                    4
Beta (α = β = 1/3)      6                 1                  3                          4                    6
Exponential             20                178                5                          45                   178
Binomial (p = .1)       18                317                14                         1850                 1850
Binomial (p = .5)       7                 1                  12                         390                  390
Student's t, 2.5 df     NA                NA                 13                         320                  320
Student's t, 4.1 df     200               1                  1                          5                    200

28
◦ Skewness is difficult to shake
◦ Tail probabilities are fairly accurate for small sample sizes
◦ The traditional recommendations are too small for many common distributions
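The minimum sample sizes in the tables on slides 26 and 27 can be found by increasing n until a statistic of the simulated sampling distribution of the mean falls inside its threshold. The sketch below illustrates the idea for the uniform parent distribution and the kurtosis criterion; the grid of n values, the replication count, and the function names are assumptions rather than the author's actual code.

```r
# Sketch: find the smallest n at which a summary statistic of the simulated
# sampling distribution of the mean satisfies its threshold.
# n_grid and reps are assumed values, not taken from the slides.

min_n_for <- function(rdist, stat_fn, threshold_ok, n_grid = 1:500, reps = 50000) {
  for (n in n_grid) {
    xbar <- replicate(reps, mean(rdist(n)))  # simulated sample means for this n
    if (threshold_ok(stat_fn(xbar))) return(n)
  }
  NA  # not reached within the grid (e.g. kurtosis/skewness for t with 2.5 df)
}

# "Adequate" kurtosis criterion from the table on slide 26: |excess kurtosis| < .5
excess_kurtosis <- function(x) mean(((x - mean(x)) / sd(x))^4) - 3

# Example: uniform(0, 1) parent distribution (the table entry is 3)
min_n_for(function(n) runif(n),
          excess_kurtosis,
          function(k) abs(k) < 0.5)
```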