Statistics 512 Notes 4: Confidence Intervals Continued

Role of Asymptotic (Large Sample) Approximations in Statistics: It is often difficult to find the finite sample sampling distribution of an estimator or statistic.

Review of Limiting Distributions from Probability

Types of Convergence: Let $X_1, \ldots, X_n$ be a sequence of random variables and let $X$ be another random variable. Let $F_n$ denote the CDF of $X_n$ and let $F$ denote the CDF of $X$.

1. $X_n$ converges to $X$ in probability, denoted $X_n \stackrel{P}{\rightarrow} X$, if for every $\epsilon > 0$, $P(|X_n - X| \geq \epsilon) \rightarrow 0$ as $n \rightarrow \infty$.

2. $X_n$ converges to $X$ in distribution, denoted $X_n \stackrel{D}{\rightarrow} X$, if $F_n(t) \rightarrow F(t)$ as $n \rightarrow \infty$ at all $t$ for which $F$ is continuous.

Weak Law of Large Numbers
Let $X_1, \ldots, X_n$ be a sequence of iid random variables having mean $\mu$ and variance $\sigma^2$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. Then $\bar{X}_n \stackrel{P}{\rightarrow} \mu$.

Interpretation: The distribution of $\bar{X}_n$ becomes more and more concentrated around $\mu$ as $n$ gets large.

Proof: Using Chebyshev's inequality, for every $\epsilon > 0$,
$$P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{Var(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2},$$
which tends to 0 as $n \rightarrow \infty$.

Central Limit Theorem
Let $X_1, \ldots, X_n$ be a sequence of iid random variables having mean $\mu$ and variance $\sigma^2$. Then
$$Z_n = \frac{\bar{X}_n - \mu}{\sqrt{Var(\bar{X}_n)}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \stackrel{D}{\rightarrow} Z,$$
where $Z \sim N(0,1)$. In other words,
$$\lim_{n \rightarrow \infty} P(Z_n \leq z) = \Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx.$$

Interpretation: Probability statements about $\bar{X}_n$ can be approximated using a normal distribution. It's the probability statements that we are approximating, not the random variable itself.

Some useful further convergence properties:

Slutsky's Theorem (Theorem 4.3.5): If $X_n \stackrel{D}{\rightarrow} X$, $A_n \stackrel{P}{\rightarrow} a$, and $B_n \stackrel{P}{\rightarrow} b$, then $A_n + B_n X_n \stackrel{D}{\rightarrow} a + bX$.

Continuous Mapping Theorem (Theorem 4.3.4): Suppose $X_n$ converges to $X$ in distribution and $g$ is a continuous function on the support of $X$. Then $g(X_n)$ converges to $g(X)$ in distribution.

Application of these convergence properties: Let $X_1, \ldots, X_n$ be a sequence of iid random variables having mean $\mu$, variance $\sigma^2$, and $E(X_i^4) < \infty$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $S_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)^2$. Then
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \stackrel{D}{\rightarrow} Z,$$
where $Z \sim N(0,1)$.
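The CLT's claim that probability statements about $\bar{X}_n$ are well approximated by the normal distribution can be checked by simulation. The sketch below is a minimal illustration (not part of the original notes; the function name `clt_demo` and the choice of Exponential(1) draws, which have $\mu = \sigma = 1$, are my own). It estimates $P(Z_n \leq 1)$ by Monte Carlo and compares it to $\Phi(1) \approx 0.8413$:

```python
import math
import random

random.seed(0)

def clt_demo(n=100, reps=5000, z=1.0):
    """Estimate P(Z_n <= z), where Z_n = sqrt(n)*(Xbar_n - mu)/sigma,
    for iid Exponential(1) draws (mu = sigma = 1)."""
    count = 0
    for _ in range(reps):
        xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
        z_n = math.sqrt(n) * (xbar - 1.0) / 1.0
        if z_n <= z:
            count += 1
    return count / reps

# Phi(1) via the error function: Phi(z) = (1 + erf(z/sqrt(2)))/2
phi = 0.5 * (1 + math.erf(1.0 / math.sqrt(2)))
est = clt_demo()
print(round(est, 3), round(phi, 4))
```

Even though the Exponential(1) distribution is far from normal, the Monte Carlo estimate lands close to $\Phi(1)$ at $n = 100$, which is the point of the theorem.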
Proof (only for those interested): We can write
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \cdot \frac{\sigma}{S_n}.$$
Using the Central Limit Theorem, which says $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \stackrel{D}{\rightarrow} Z$, and Slutsky's Theorem, to prove that $T_n \stackrel{D}{\rightarrow} Z$ it is sufficient to prove that $\frac{\sigma}{S_n} \stackrel{P}{\rightarrow} 1$.

We can write
$$S_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - \bar{X}_n^2.$$
By the weak law of large numbers, $S_n^2 \stackrel{P}{\rightarrow} E(X_i^2) - [E(X_i)]^2 = \sigma^2$, or equivalently $\frac{S_n^2}{\sigma^2} \stackrel{P}{\rightarrow} 1$; applying the continuous function $g(x) = 1/\sqrt{x}$ then gives $\frac{\sigma}{S_n} \stackrel{P}{\rightarrow} 1$. (The moment condition $E(X_i^4) < \infty$ is what lets the weak law apply to $\frac{1}{n}\sum X_i^2$.)

Back to Confidence Intervals

CI for the mean of an iid sample $X_1, \ldots, X_n$ from an unknown distribution with finite variance and $E(X_i^4) < \infty$: By the application of the central limit theorem above,
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \stackrel{D}{\rightarrow} Z.$$
Thus, for large $n$,
$$P\left(-z_{1-\alpha/2} \leq \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \leq z_{1-\alpha/2}\right) \approx 1 - \alpha$$
$$P\left(-z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} \leq \bar{X}_n - \mu \leq z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}\right) \approx 1 - \alpha$$
$$P\left(\bar{X}_n - z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} \leq \mu \leq \bar{X}_n + z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}\right) \approx 1 - \alpha.$$
Thus, $\bar{X}_n \pm z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}$ is an approximate $(1-\alpha)$ confidence interval.

How large does $n$ need to be for this to be a good approximation? Traditionally textbooks say $n > 30$. We'll look at some simulation results later in the course.

Application: A food-processing company is considering marketing a new spice mix for Creole and Cajun cooking. They took a simple random sample of 200 consumers and found that 37 would purchase such a product. Find an approximate 95% confidence interval for $p$, the true proportion of buyers.

Let $X_i = 1$ if the $i$th consumer would buy the product and $X_i = 0$ if not. If the population is large (say 50 times larger than the sample size), a simple random sample can be regarded as a sample with replacement. Then a reasonable model is that $X_1, \ldots, X_{200}$ are iid Bernoulli($p$). We have
$$\bar{X}_n = \frac{37}{200} = 0.185,$$
$$S_n^2 = \frac{\sum_{i=1}^{200}(X_i - \bar{X}_n)^2}{200} = \frac{\sum_{i=1}^{200} X_i^2}{200} - \bar{X}_n^2 = 0.185 - (0.185)^2 \approx 0.151.$$
Thus, an approximate 95% confidence interval for $p$ is
$$\bar{X}_n \pm z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} = 0.185 \pm 1.96\sqrt{\frac{0.151}{200}} = (0.131, 0.239).$$

Note that for an iid Bernoulli($p$) sample, we can write $S_n^2$ in a simple way. In general,
$$S_n^2 = \frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{n} = \frac{\sum_{i=1}^n (X_i^2 - 2X_i\bar{X}_n + \bar{X}_n^2)}{n} = \frac{\sum_{i=1}^n X_i^2 - 2n\bar{X}_n^2 + n\bar{X}_n^2}{n} = \frac{\sum_{i=1}^n X_i^2}{n} - \bar{X}_n^2.$$
For an iid Bernoulli sample, let $\hat{p}_n = \bar{X}_n$.
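As a quick check of the arithmetic in the spice-mix example, the fragment below (a sketch added for these notes; the variable names are my own) reproduces the interval $(0.131, 0.239)$ using the Bernoulli shortcut $S_n^2 = \hat{p}_n - \hat{p}_n^2$:

```python
import math

# n = 200 consumers surveyed, 37 would buy the product.
n = 200
phat = 37 / n                 # sample proportion, 0.185
s2 = phat - phat**2           # S_n^2 = phat - phat^2 for Bernoulli data
half = 1.96 * math.sqrt(s2 / n)   # half-width of the approximate 95% CI
lo, hi = phat - half, phat + half
print(round(lo, 3), round(hi, 3))  # -> 0.131 0.239
```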
$\hat{p}_n$ is a natural point estimator of $p$ for the Bernoulli. Note that for a Bernoulli sample, $X_i^2 = X_i$. Thus, for a Bernoulli sample, $S_n^2 = \hat{p}_n - \hat{p}_n^2$, and an approximate 95% confidence interval for $p$ is
$$\hat{p}_n \pm 1.96\sqrt{\frac{\hat{p}_n - \hat{p}_n^2}{n}}.$$

Choosing Between Confidence Intervals

Let $X_1, \ldots, X_n$ be iid $N(\mu, \sigma^2)$ where $\sigma^2$ is known. Suppose we want a 95% confidence interval for $\mu$. Then for any $a$ and $b$ that satisfy $P(a \leq Z \leq b) = 0.95$,
$$\left(\bar{X} - b\frac{\sigma}{\sqrt{n}}, \; \bar{X} - a\frac{\sigma}{\sqrt{n}}\right)$$
is a 95% confidence interval because:
$$0.95 = P\left(a \leq \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \leq b\right) = P\left(a\frac{\sigma}{\sqrt{n}} \leq \bar{X} - \mu \leq b\frac{\sigma}{\sqrt{n}}\right) = P\left(\bar{X} - b\frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X} - a\frac{\sigma}{\sqrt{n}}\right).$$

For example, we could choose:
(1) $a = -1.96$, $b = 1.96$ [$P(Z < a) = 0.025$, $P(Z > b) = 0.025$; the choice we have used before];
(2) $a = -2.05$, $b = 1.88$ [$P(Z < a) = 0.02$, $P(Z > b) = 0.03$];
(3) $a = -1.75$, $b = 2.33$ [$P(Z < a) = 0.04$, $P(Z > b) = 0.01$].

Which is the best 95% confidence interval? A reasonable criterion is the expected length of the confidence interval: among all 95% confidence intervals, choose the one that is expected to have the smallest length, since it will be the most informative.

Length of the confidence interval
$$= \left(\bar{X} - a\frac{\sigma}{\sqrt{n}}\right) - \left(\bar{X} - b\frac{\sigma}{\sqrt{n}}\right) = (b - a)\frac{\sigma}{\sqrt{n}},$$
so we want to choose the confidence interval with the smallest value of $b - a$. The values of $b - a$ for the three confidence intervals above are: (1) $a = -1.96$, $b = 1.96$, $b - a = 3.92$; (2) $a = -2.05$, $b = 1.88$, $b - a = 3.93$; (3) $a = -1.75$, $b = 2.33$, $b - a = 4.08$. The best of these 95% confidence intervals is (1), with $a = -1.96$, $b = 1.96$. In fact, it can be shown that for this problem the best choice of $a$ and $b$ is $a = -1.96$, $b = 1.96$.
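The three $(a, b)$ choices and their lengths can be reproduced with Python's standard library (a sketch added for these notes; `statistics.NormalDist.inv_cdf` is the standard-normal quantile function, and the tail-probability splits below are the three from the example):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal N(0, 1)

# Three (lower-tail, upper-tail) probability splits, each with total
# tail probability 0.05, i.e. 95% central coverage.
splits = [(0.025, 0.025), (0.02, 0.03), (0.04, 0.01)]
lengths = []
for lower, upper in splits:
    a = nd.inv_cdf(lower)      # P(Z < a) = lower
    b = nd.inv_cdf(1 - upper)  # P(Z > b) = upper
    lengths.append(round(b - a, 2))
    print(round(a, 2), round(b, 2), round(b - a, 2))
print(lengths)  # -> [3.92, 3.93, 4.08]
```

The equal-tails split (1) gives the smallest $b - a$, matching the conclusion above that it yields the shortest interval.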