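12.1 Estimation of the Difference between the Means

Motivating example:

Objective: we want to determine the difference between the mean ages of the customers shopping at the inner-city store and at the suburban store.

$\mu_1$: the mean age of all customers who shop at the inner-city store
$\mu_2$: the mean age of all customers who shop at the suburban store
$\sigma_1^2$: the variance of the ages of all customers who shop at the inner-city store
$\sigma_2^2$: the variance of the ages of all customers who shop at the suburban store

We want to estimate the difference $\mu_1 - \mu_2$. Let

$x_{1,1}, x_{1,2}, \ldots, x_{1,36}$: the ages of 36 customers shopping at the inner-city store.
$x_{2,1}, x_{2,2}, \ldots, x_{2,49}$: the ages of 49 customers shopping at the suburban store.

Then,

\[ \bar{x}_1 = \frac{\sum_{i=1}^{36} x_{1,i}}{36} = 40, \qquad \bar{x}_2 = \frac{\sum_{i=1}^{49} x_{2,i}}{49} = 35 \]

and

\[ s_1 = \sqrt{\frac{\sum_{i=1}^{36} (x_{1,i} - \bar{x}_1)^2}{36 - 1}} = 9, \qquad s_2 = \sqrt{\frac{\sum_{i=1}^{49} (x_{2,i} - \bar{x}_2)^2}{49 - 1}} = 10. \]

The point estimate of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2 = 40 - 35 = 5$.
The point estimate of $\sigma_1^2$ is $s_1^2 = 9^2 = 81$.
The point estimate of $\sigma_2^2$ is $s_2^2 = 10^2 = 100$.

General Case:

$\mu_1$: the mean of population 1
$\mu_2$: the mean of population 2
$\sigma_1^2$: the variance of population 1
$\sigma_2^2$: the variance of population 2

Let

$x_{1,1}, x_{1,2}, \ldots, x_{1,n_1}$: the random sample from population 1
$x_{2,1}, x_{2,2}, \ldots, x_{2,n_2}$: the random sample from population 2.

Then,

\[ \bar{x}_1 = \frac{\sum_{i=1}^{n_1} x_{1,i}}{n_1} \]: the sample mean of the sample from population 1,

\[ \bar{x}_2 = \frac{\sum_{i=1}^{n_2} x_{2,i}}{n_2} \]: the sample mean of the sample from population 2,

and

\[ s_1 = \sqrt{\frac{\sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)^2}{n_1 - 1}} \]: the sample standard deviation of the sample from population 1,

\[ s_2 = \sqrt{\frac{\sum_{i=1}^{n_2} (x_{2,i} - \bar{x}_2)^2}{n_2 - 1}} \]: the sample standard deviation of the sample from population 2.

The point estimate of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$.

Important Properties of $\bar{X}_1 - \bar{X}_2$:

$\bar{X}_1$: the sample statistic with possible value $\bar{x}_1$
$\bar{X}_2$: the sample statistic with possible value $\bar{x}_2$

Then,

\[ \mu_{\bar{X}_1 - \bar{X}_2} = E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2 \]

and, the two samples being independent,

\[ \sigma^2_{\bar{X}_1 - \bar{X}_2} = \mathrm{Var}(\bar{X}_1 - \bar{X}_2) = E\left[ (\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2) \right]^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}. \]

I. Large Sample Case ($n_1 \ge 30$, $n_2 \ge 30$):

Sampling Distribution of $\bar{X}_1 - \bar{X}_2$ ($n_1 \ge 30$, $n_2 \ge 30$):

\[ \bar{X}_1 - \bar{X}_2 \sim N\left( \mu_1 - \mu_2,\ \sigma^2_{\bar{X}_1 - \bar{X}_2} \right) = N\left( \mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \right) \]

(approximately, for large samples), so that

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1). \]

Derivation of the $(1-\alpha)100\%$ confidence interval: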
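\[
\begin{aligned}
1 - \alpha &= P\left( |Z| \le z_{\alpha/2} \right) \\
&= P\left( \left| \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}} \right| \le z_{\alpha/2} \right) \\
&= P\left( -z_{\alpha/2} \le \frac{(\mu_1 - \mu_2) - (\bar{X}_1 - \bar{X}_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}} \le z_{\alpha/2} \right) \\
&= P\left( -z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2} \le (\mu_1 - \mu_2) - (\bar{X}_1 - \bar{X}_2) \le z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2} \right) \\
&= P\left( (\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2} \right).
\end{aligned}
\]

Thus, a $(1-\alpha)100\%$ confidence interval is

\[
\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2}
= \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
= \left[ \bar{x}_1 - \bar{x}_2 - z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},\ \ \bar{x}_1 - \bar{x}_2 + z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right].
\]

$(1-\alpha)100\%$ confidence interval ($n_1 \ge 30$, $n_2 \ge 30$):

When $\sigma_1, \sigma_2$ are known,

\[
\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\, \sigma_{\bar{X}_1 - \bar{X}_2}
= \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\]

is a $(1-\alpha)100\%$ confidence interval estimate of the population difference $\mu_1 - \mu_2$.

When $\sigma_1, \sigma_2$ are unknown, they are replaced by the sample standard deviations $s_1, s_2$, and

\[
\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
= \left[ \bar{x}_1 - \bar{x}_2 - z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},\ \ \bar{x}_1 - \bar{x}_2 + z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \right]
\]

is a $(1-\alpha)100\%$ confidence interval estimate of the population difference $\mu_1 - \mu_2$.

Example (continued):

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{9^2}{36} + \frac{10^2}{49}} \approx 2.07. \]

A 95% confidence interval for $\mu_1 - \mu_2$ is

\[
\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= (40 - 35) \pm z_{0.025} \cdot 2.07
= 5 \pm 1.96 \cdot 2.07
= 5 \pm 4.06
= [0.94,\ 9.06].
\]

II. Small Sample Case ($n_1 < 30$, $n_2 < 30$):

Two assumptions are made:
1. Both populations have normal distributions.
2. The variances of the two populations are equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$).

Pooled estimate of $\sigma^2$ and $\sigma_{\bar{X}_1 - \bar{X}_2}$:

The pooled estimate of $\sigma^2$, denoted by $s_p^2$,

\[
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}
= \frac{\sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2} (x_{2,i} - \bar{x}_2)^2}{n_1 + n_2 - 2},
\]

is a weighted average of the two sample variances $s_1^2$ and $s_2^2$.

The estimate of

\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\sigma^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

is

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}. \]

$(1-\alpha)100\%$ confidence interval ($n_1 < 30$, $n_2 < 30$):

\[
\bar{x}_1 - \bar{x}_2 \pm t_{n_1 + n_2 - 2,\, \alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= \bar{x}_1 - \bar{x}_2 \pm t_{n_1 + n_2 - 2,\, \alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
= \left[ \bar{x}_1 - \bar{x}_2 - t_{n_1 + n_2 - 2,\, \alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},\ \ \bar{x}_1 - \bar{x}_2 + t_{n_1 + n_2 - 2,\, \alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \right]
\]

is a $(1-\alpha)100\%$ confidence interval estimate of the population difference $\mu_1 - \mu_2$.

Note:

\[
T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}
= \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}
\sim t(n_1 + n_2 - 2).
\]

Then, the $(1-\alpha)100\%$ confidence interval in the small sample case can be derived analogously to the one in the large sample case, with $t_{n_1 + n_2 - 2,\, \alpha/2}$ in place of $z_{\alpha/2}$.

Example:

Let
$\mu_1$: the mean balance of checking accounts at the Cherry Grove bank
$\mu_2$: the mean balance of checking accounts at the Beechmont bank.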
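\[ \bar{x}_1 = 1000, \quad n_1 = 12, \quad s_1 = 150 \]
\[ \bar{x}_2 = 920, \quad n_2 = 10, \quad s_2 = 120 \]

Please find a 90% confidence interval for $\mu_1 - \mu_2$.

[solution:]

\[
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}
= \frac{11 \cdot 150^2 + 9 \cdot 120^2}{12 + 10 - 2}
= 18855.
\]

Thus,

\[
s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
= \sqrt{18855 \left( \frac{1}{12} + \frac{1}{10} \right)}
\approx 58.79.
\]

Then, a 90% confidence interval for $\mu_1 - \mu_2$ is

\[
\bar{x}_1 - \bar{x}_2 \pm t_{n_1 + n_2 - 2,\, \alpha/2}\, s_{\bar{X}_1 - \bar{X}_2}
= (1000 - 920) \pm t_{20,\, 0.05} \cdot 58.79
= 80 \pm 1.725 \cdot 58.79
= 80 \pm 101.41
= [-21.41,\ 181.41].
\]

Online Exercise:
Exercise 12.1.1
Exercise 12.1.2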
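The two interval formulas of this section (large sample with $z_{\alpha/2}$, small sample with the pooled variance and $t_{n_1+n_2-2,\,\alpha/2}$) can be checked numerically. The following is only a minimal sketch using the Python standard library; the function names `large_sample_ci` and `pooled_ci` are hypothetical and chosen for illustration. Because the standard library has no $t$ quantile function, the sketch assumes the $t$ critical value is read from a table and passed in as `t_crit`.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar1, xbar2, s1, s2, n1, n2, conf=0.95):
    """Large-sample interval for mu1 - mu2, with s1, s2 used in place of
    the unknown sigma1, sigma2 (illustrative helper, not from the notes)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sqrt(s1**2 / n1 + s2**2 / n2)        # estimate of sigma_{X1bar - X2bar}
    diff = xbar1 - xbar2                      # point estimate of mu1 - mu2
    return diff - z * se, diff + z * se


def pooled_ci(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Small-sample interval under the two assumptions of the notes
    (normal populations, equal variances). t_crit is t_{n1+n2-2, alpha/2},
    assumed here to be supplied from a t table."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se, sp2


if __name__ == "__main__":
    # Store example: n1 = 36, n2 = 49, 95% confidence.
    lo, hi = large_sample_ci(40, 35, 9, 10, 36, 49, conf=0.95)
    print(f"large-sample 95% CI: [{lo:.2f}, {hi:.2f}]")

    # Bank example: n1 = 12, n2 = 10, 90% confidence, t_{20, 0.05} = 1.725.
    lo, hi, sp2 = pooled_ci(1000, 920, 150, 120, 12, 10, t_crit=1.725)
    print(f"pooled variance: {sp2:.0f}")
    print(f"small-sample 90% CI: [{lo:.2f}, {hi:.2f}]")
```

Up to rounding of the intermediate standard errors (2.07 and 58.79 in the hand calculations), the results should agree with the two worked examples of this section.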