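3. Basic Statistical Inference

3.1. The C.I. and Hypothesis Testing: Normal and T

(I) $(1-\alpha)100\%$ confidence intervals:

$$(\text{point estimate}) \pm \left[\left(z_{\alpha/2},\ t_{n-1,\alpha/2},\ t_{n_1+n_2-2,\alpha/2}\right) \times (\text{standard deviation (error) of point estimate})\right]$$

One sample:

(A) Population mean $\mu$:

(i) C.I.

Large sample ($n \ge 30$):

$$\bar{x} \pm z_{\alpha/2}\, s_{\bar{X}} = \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} = \left[\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right]$$

Small sample ($n < 30$), normal population:

$$\bar{x} \pm t_{n-1,\alpha/2}\, s_{\bar{X}} = \bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} = \left[\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right]$$

(ii) Sample Size and Margin of Error:

$$E = z_{\alpha/2}\frac{s}{\sqrt{n}} \implies n = \frac{z_{\alpha/2}^2\, s^2}{E^2}$$

(B) Population proportion $p$:

(i) C.I.:

$$\hat{p} \pm z_{\alpha/2}\, s_{\hat{P}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$$

(ii) Sample Size and Margin of Error:

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \implies n = \frac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{E^2}$$

Two samples:

(A) Population Mean Difference $\mu_1 - \mu_2$:

Large sample ($n_1 \ge 30$, $n_2 \ge 30$):

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Small sample ($n_1 < 30$, $n_2 < 30$), normal populations:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\alpha/2}\, s_{\bar{X}_1 - \bar{X}_2} = (\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)},$$

where

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{\sum_{i=1}^{n_1}(x_{1,i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2}(x_{2,i} - \bar{x}_2)^2}{n_1+n_2-2}.$$

(B) Population Proportion Difference $p_1 - p_2$:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{P}_1 - \hat{P}_2} = (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

(II) Hypothesis testing:

$$\text{test statistic} = \frac{\text{point estimate} - \text{mean of point estimate under } H_0}{\text{standard deviation (error) of point estimate under } H_0}$$

One sample:

(A) Population mean $\mu$:

Large sample ($n \ge 30$):

$$z = \frac{\bar{x} - \mu_0}{s_{\bar{X}}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

Small sample ($n < 30$), normal population:

$$t = \frac{\bar{x} - \mu_0}{s_{\bar{X}}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

(B) Population proportion $p$:

$$z = \frac{\hat{p} - p_0}{\sigma_{\hat{P}}} = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

Two samples:

(A) Population Mean Difference $\mu_1 - \mu_2$ (here $\delta_0$ denotes the value of $\mu_1 - \mu_2$ under $H_0$):

1. Independent samples:

(a) Large sample ($n_1 \ge 30$, $n_2 \ge 30$):

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

(b) Small sample ($n_1 < 30$, $n_2 < 30$), normal populations:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

2. Matched samples for testing $\mu_d$:

$$z = \frac{\bar{d} - \delta_0}{s_d/\sqrt{n}} \quad \text{or} \quad t = \frac{\bar{d} - \delta_0}{s_d/\sqrt{n}}.$$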
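(B) Population Proportion Difference $p_1 - p_2$:

$H_0: p_1 - p_2 = 0$:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{s_{\hat{P}_1 - \hat{P}_2}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1-\bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1+n_2}$$

$H_0: p_1 - p_2 = c \ne 0$:

$$z = \frac{\hat{p}_1 - \hat{p}_2 - c}{s_{\hat{P}_1 - \hat{P}_2}} = \frac{\hat{p}_1 - \hat{p}_2 - c}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$

Summary table:

| | Point estimate | Classical approach (critical values) | Null distribution (p-value) |
|---|---|---|---|
| One sample | $\bar{x}$, $\hat{p}$ | $z_{\alpha}$, $-z_{\alpha}$, $z_{\alpha/2}$; $t_{n-1,\alpha}$, $-t_{n-1,\alpha}$, $t_{n-1,\alpha/2}$ | $Z$, $T(n-1)$ |
| Two samples | $\bar{x}_1 - \bar{x}_2$, $\bar{d}$, $\hat{p}_1 - \hat{p}_2$ | $z_{\alpha}$, $-z_{\alpha}$, $z_{\alpha/2}$; $t_{n_1+n_2-2,\alpha}$, $-t_{n_1+n_2-2,\alpha}$, $t_{n_1+n_2-2,\alpha/2}$ | $Z$, $T(n_1+n_2-2)$ |

Example 1:

A sample of size 40 provides a sample mean of 16.5 and a sample standard deviation of 7.

(a) Find the 94% confidence interval for the population mean.

(b) For an interval of length 2 at 90% confidence, what sample size would be required to estimate the population mean?

[solution:]

(a) A 94% confidence interval is

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} = 16.5 \pm z_{0.03}\frac{7}{\sqrt{40}} = 16.5 \pm 1.88 \times 1.107 = [14.42,\ 18.58].$$

(b) The margin of error is half the interval length, $E = \frac{2}{2} = 1$, and $\alpha = 0.1$, so

$$n = \frac{z_{\alpha/2}^2\, s^2}{E^2} = \frac{z_{0.05}^2 \times 7^2}{1^2} = \frac{1.645^2 \times 49}{1} = 132.59 \implies n = 133.$$

Example 2:

It is believed that the running time of movies is normally distributed with mean equal to 140 minutes (i.e., $H_0: \mu = 140$ vs. $H_a: \mu \ne 140$). A sample of 4 movies was taken and the following running times were obtained: 150, 150, 180, 170. At the 5% level of significance,

(a) test the hypothesis based on the classical hypothesis test procedure.

(b) using a p-value, test the hypothesis.

(c) using a confidence interval, test the hypothesis.

(d) With a margin of error of 5 or less at 95% confidence, what sample size is required?

[solution:]

$$n = 4, \quad \bar{x} = \frac{150 + 150 + 180 + 170}{4} = 162.5, \quad s = \sqrt{\frac{\sum_{i=1}^{4}(x_i - \bar{x})^2}{4-1}} = 15.$$

(a)

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{162.5 - 140}{15/\sqrt{4}} = \frac{22.5}{7.5} = 3, \qquad |t| = 3 < 3.182 = t_{3,0.025} = t_{n-1,\alpha/2},$$

so we do not reject $H_0$.

(b)

$$p\text{-value} = P(|T(n-1)| > |t|) = P(|T(3)| > 3) > P(|T(3)| > 3.182) = 0.05,$$

so we do not reject $H_0$.

(c) A 95% confidence interval for $\mu$ is

$$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} = 162.5 \pm t_{3,0.025}\frac{15}{\sqrt{4}} = 162.5 \pm 3.182 \times 7.5 = [138.63,\ 186.36].$$

Since $140 \in [138.63,\ 186.36]$, we do not reject $H_0$.

(d)

$$n = \frac{z_{\alpha/2}^2\, s^2}{E^2} = \frac{z_{0.025}^2 \times 15^2}{5^2} = \frac{1.96^2 \times 225}{25} = 34.57 \implies n = 35.$$

Example 3:

A random sample of 400 people was taken. 228 of the people in the sample favored candidate A.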
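We are interested in determining whether or not the proportion in favor of candidate A is significantly greater than 50%, $H_0: p \le 0.5$ vs. $H_a: p > 0.5$.

(a) With $\alpha = 0.1$, test the hypothesis based on the classical hypothesis test procedure.

(b) With $\alpha = 0.03$, test the hypothesis based on the p-value.

(c) Develop a 90% confidence interval estimate for the proportion in favor of candidate A.

(d) With a margin of error of 0.01 or less at 95% confidence, what sample size would be required to estimate the proportion in favor of candidate A?

[solution:]

$$n = 400, \quad p_0 = 0.5, \quad \hat{p} = \frac{228}{400} = 0.57, \quad z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.57 - 0.5}{\sqrt{0.5(1-0.5)/400}} = 2.8$$

(a) $\alpha = 0.1$: $z = 2.8 > 1.28 = z_{0.1} = z_{\alpha}$, so we reject $H_0$.

(b) $\alpha = 0.03$:

$$p\text{-value} = P(Z > z) = P(Z > 2.8) = 0.5 - 0.4974 = 0.0026 < 0.03,$$

so we reject $H_0$.

(c) A 90% confidence interval for $p$ is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.57 \pm z_{0.05}\sqrt{\frac{0.57(1-0.57)}{400}} = 0.57 \pm 1.645 \times 0.02475 = [0.5293,\ 0.6107].$$

(d) $E = 0.01$, $\alpha = 0.05$. Then,

$$n = \frac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{E^2} = \frac{z_{0.025}^2 \times 0.57(1-0.57)}{0.01^2} = \frac{1.96^2 \times 0.57(1-0.57)}{0.01^2} = 9415.76 \implies n = 9416.$$

Example 4:

Consider the following hypothesis test: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \ne 0$. The following data are for two independent samples taken from two normal populations with equal variances.

Sample 1: 6, 10, 9, 8, 7
Sample 2: 9, 12, 10, 11, 9

(a) With $\alpha = 0.05$, test the hypothesis based on the classical hypothesis test procedure.

(b) With $\alpha = 0.01$, test the hypothesis based on the p-value.

(c) With $\alpha = 0.05$, use the confidence interval method to test the hypothesis.

[solution:]

$$n_1 = 5, \quad n_2 = 5, \quad \bar{x}_1 = 8, \quad \bar{x}_2 = 10.2, \quad s_1^2 = 2.5, \quad s_2^2 = 1.7, \quad \delta_0 = 0,$$

where $\delta_0$ is the value of $\mu_1 - \mu_2$ under $H_0$, and

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{4 \times 2.5 + 4 \times 1.7}{5+5-2} = 2.1.$$

(a)

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(8 - 10.2) - 0}{\sqrt{2.1\left(\dfrac{1}{5} + \dfrac{1}{5}\right)}} = -2.4, \qquad |t| = 2.4 > 2.306 = t_{8,0.025} = t_{n_1+n_2-2,\alpha/2}.$$

Therefore, we reject $H_0$.

(b)

$$p\text{-value} = P(|T(n_1+n_2-2)| > |t|) = P(|T(8)| > 2.4) > P(|T(8)| > 3.355) = 0.01.$$

Therefore, we do not reject $H_0$.

(c) A 95% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (8 - 10.2) \pm t_{8,0.025}\sqrt{2.1\left(\frac{1}{5} + \frac{1}{5}\right)} = -2.2 \pm 2.306 \times 0.9165 = [-4.313,\ -0.086].$$

Since $0 \notin [-4.313,\ -0.086]$, we reject $H_0$.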
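The arithmetic in Example 4 can be checked with a short script. The following is a minimal sketch using only the Python standard library; the function name `pooled_t_statistic` and the constant `T_CRIT` are illustrative choices, not part of the notes, and the critical value 2.306 is simply copied from the example rather than computed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample1, sample2, delta0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = delta0.

    Assumes two independent samples from normal populations with
    equal variances, as stated in Example 4.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # sample variance, divisor n1 - 1
    s2_sq = variance(sample2)  # sample variance, divisor n2 - 1
    # Pooled variance s_p^2
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    # Standard error of the difference of sample means
    se = sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2) - delta0) / se
    return t, sp_sq, se

sample1 = [6, 10, 9, 8, 7]
sample2 = [9, 12, 10, 11, 9]

t, sp_sq, se = pooled_t_statistic(sample1, sample2)

# Critical value t_{8, 0.025} taken from the example (t table), not computed here.
T_CRIT = 2.306

# (a) classical approach: reject H0 when |t| exceeds the critical value
reject = abs(t) > T_CRIT

# (c) 95% confidence interval for mu1 - mu2
diff = mean(sample1) - mean(sample2)
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(f"s_p^2 = {sp_sq:.4f}, t = {t:.4f}, reject H0: {reject}")
print(f"95% CI: [{ci[0]:.4f}, {ci[1]:.4f}]")
```

The script stops at the test statistic and the interval because the standard library has no t distribution; an exact p-value for part (b) would need a t table or a statistics package.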
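Example 5:

To determine the effectiveness of a new weight control diet, 8 randomly selected students followed the diet for 4 weeks, with the results shown below.

| Dieter | Weight (before) | Weight (after) |
|---|---|---|
| A | 138 | 135 |
| B | 151 | 147 |
| C | 129 | 132 |
| D | 125 | 127 |
| E | 168 | 155 |
| F | 139 | 131 |
| G | 152 | 144 |
| H | 140 | 142 |

We would like to test the hypothesis $H_0: \mu_1 - \mu_2 = 2$, where $\mu_1$ and $\mu_2$ are the mean weights of the students before and after taking the weight control diet, respectively. The above data can be considered matched-sample data.

(a) For $\alpha = 0.1$, test the above hypothesis using the classical hypothesis test.

(b) For $\alpha = 0.05$, use the confidence interval method to test the above hypothesis.

[solution:]

(a) The differences (before minus after) are

$$d_1 = 3,\ d_2 = 4,\ d_3 = -3,\ d_4 = -2,\ d_5 = 13,\ d_6 = 8,\ d_7 = 8,\ d_8 = -2.$$

Therefore, $\bar{d} = 3.625$ and $s_d = 5.7802$. Thus,

$$t = \frac{\bar{d} - 2}{s_d/\sqrt{n}} = \frac{3.625 - 2}{5.7802/\sqrt{8}} = 0.795, \qquad |t| = 0.795 < 1.895 = t_{7,0.05} = t_{n-1,\alpha/2}.$$

We do not reject $H_0$.

(b) A 95% confidence interval for $\mu_d = \mu_1 - \mu_2$ is

$$\bar{d} \pm t_{n-1,\alpha/2}\frac{s_d}{\sqrt{n}} = 3.625 \pm t_{7,0.025}\frac{5.7802}{\sqrt{8}} = 3.625 \pm 2.365 \times \frac{5.7802}{\sqrt{8}} = [-1.2081,\ 8.4581].$$

Since $2 \in [-1.2081,\ 8.4581]$, we do not reject $H_0$.

Example 6:

The results of a recent poll on the preference of shoppers regarding two products are shown below.

| Product | Shoppers Surveyed | Shoppers Favoring This Product |
|---|---|---|
| A | 800 | 560 |
| B | 900 | 612 |

Let $p_1$ be the proportion favoring product A and $p_2$ be the proportion favoring product B.

(a) Develop a 90% confidence interval estimate for the difference $p_1 - p_2$ between the proportions favoring each product.

(b) Test $H_0: p_1 = p_2$ at $\alpha = 0.05$ based on the classical approach.

(c) Test $H_0: p_1 = p_2$ at $\alpha = 0.05$ based on the p-value method.

[solution:]

$$n_1 = 800, \quad n_2 = 900, \quad \hat{p}_1 = \frac{560}{800} = 0.7, \quad \hat{p}_2 = \frac{612}{900} = 0.68.$$

(a)

$$s_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.7(1-0.7)}{800} + \frac{0.68(1-0.68)}{900}} = 0.02246.$$

Thus, a 90% confidence interval is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{P}_1 - \hat{P}_2} = (0.7 - 0.68) \pm z_{0.05} \times 0.02246 = 0.02 \pm 0.03695 = [-0.01695,\ 0.05695].$$

(b) Under $H_0$, the pooled proportion is

$$\bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1+n_2} = \frac{560 + 612}{800 + 900} = \frac{1172}{1700} = 0.689$$

and

$$s_{\hat{P}_1 - \hat{P}_2} = \sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.689(1-0.689)\left(\frac{1}{800} + \frac{1}{900}\right)} = 0.02249.$$

Therefore,

$$z = \frac{\hat{p}_1 - \hat{p}_2}{s_{\hat{P}_1 - \hat{P}_2}} = \frac{0.7 - 0.68}{0.02249} = 0.89, \qquad |z| = 0.89 < 1.96 = z_{0.025} = z_{\alpha/2},$$

and we do not reject $H_0$.

(c)

$$p\text{-value} = P(|Z| > 0.89) = 0.3734 > 0.05,$$

so we do not reject $H_0$.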
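The calculations in Example 6 can be reproduced numerically. Below is a minimal sketch using only the Python standard library (`statistics.NormalDist`, available from Python 3.8); the function names `two_proportion_z_test` and `diff_proportion_ci` are hypothetical helpers introduced for illustration. Because the script does not round $\bar{p}$ and $z$ at intermediate steps, its p-value differs from the tabled 0.3734 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution Z

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate under H0
    se_pooled = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    # Two-sided p-value: P(|Z| > |z|)
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value

def diff_proportion_ci(x1, n1, x2, n2, conf=0.90):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se

# Data from Example 6: product A (560 of 800), product B (612 of 900)
ci_low, ci_high = diff_proportion_ci(560, 800, 612, 900, conf=0.90)
z, p_value = two_proportion_z_test(560, 800, 612, 900)

print(f"90% CI for p1 - p2: [{ci_low:.5f}, {ci_high:.5f}]")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```

Note that the test uses the pooled standard error (valid under $H_0: p_1 = p_2$) while the interval uses the unpooled one, mirroring parts (a) and (b) of the example.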