Worksheet for recitation 14 STT 200, Lecture 5, Sec 23 and 24, 2013-4-9
http://stt.msu.edu/users/zhangz19/stt200sp13.html

Confidence interval for means: the \(100(1-\alpha)\%\) confidence interval for the population mean \(\mu\) is

\[ \bar{y} \pm t^*_{n-1}\,\frac{s}{\sqrt{n}} \]

One-sample t-test for the mean: tests the hypothesis \(H_0: \mu = \mu_0\) using

\[ t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \]

Check conditions:
1) Independence and Randomization Assumption
2) 10% condition: the sample size is less than 10% of the population size.
3) Normal or Nearly Normal Condition (unimodal, symmetric); slight skewness is okay if the sample size is large, as supported by the Central Limit Theorem (CLT).

• Will your flight get you to your destination on time? Here is a histogram and summary statistics of late arrivals. There is no evidence of a trend over time. (The correlation of On Time Departure% and time is r = -0.07.) We have \(n = 144\), \(\bar{y} = 20.0757\), and \(s = 4.08837\).

a) Check the assumptions and conditions for the inference.
Given no time trend, the monthly late-arrival rates should be independent. Though not a random sample, these months should be representative, and they’re fewer than 10% of all months. The histogram looks unimodal and symmetric.

b) Find a 99% confidence interval for the true percentage of flights that arrive late.
Note: on a calculator, the critical value is \(t^*_{143} = 2.611\), from invT(0.995, 143) (or qt(0.995, 143) in R).

\[ \bar{y} \pm t^*_{n-1}\,\frac{s}{\sqrt{n}} = 20.0757 \pm 2.611 \cdot \frac{4.08837}{\sqrt{144}} = (19.186,\ 20.965) \]

c) Interpret this interval for a traveler planning to fly.
We can be 99% confident that the interval from 19.19% to 20.97% holds the true mean monthly percentage of late flight arrivals.

• Consider the following statements about t distributions.
I. Like the standard normal distribution, t distributions are bell-shaped and symmetric about 0.
II. The standard normal distribution's standard deviation is larger than the standard deviation of any t distribution.
III. As the degrees of freedom increase, the standard deviation of the t distribution decreases.
Which statements are correct?

• The 77 cows studied gained an average of 56 pounds, and the 95% confidence interval for the mean weight gain this supplement produces has a margin of error of ±11 pounds. Why are these statements incorrect?
a) 95% of the cows studied gained between 45 and 67 pounds.
The CI is for the population mean, not for the individual cows in this study.
b) We’re 95% sure that a cow fed this supplement will gain between 45 and 67 pounds.
The CI is for the population mean, not for individual cows.
c) We’re 95% sure that the average weight gain among the cows in this study was between 45 and 67 pounds.
We know the average of this study was 56 pounds!
d) The average weight gain of cows fed this supplement will be between 45 and 67 pounds 95% of the time.
The average weight gain does not vary. It’s fixed but unknown, and we’re trying to estimate it.
e) If this supplement is tested on another sample of cows, there is a 95% chance that their average weight gain will be between 45 and 67 pounds.
No. There is a 95% chance that another sample's average weight gain will fall within about 2 standard errors of the true mean, not necessarily between 45 and 67 pounds.

• A random sample of 288 Nevada teachers produces the t-interval for mean salary: 38944 < mean salary < 42893 with 90% confidence. Which statement is correct?
a) If we took many random samples of Nevada teachers, about 9 out of 10 of them would produce this confidence interval.
b) If we took many random samples of Nevada teachers, about 9 out of 10 of them would produce a confidence interval that contained the mean salary of all Nevada teachers.
c) About 9 out of 10 Nevada teachers earn between $38,944 and $42,893.
d) About 9 out of 10 Nevada teachers surveyed earn between $38,944 and $42,893.
e) We are 90% confident that the average teacher salary in the U.S. is between $38,944 and $42,893.

Review questions

• A 98% confidence interval for the mean number of hours spent studying by MSU students is based on a random sample of n = 15 MSU students. The confidence interval will be computed via the formula

\[ \bar{x} \pm t^*\,\frac{s}{\sqrt{n}} \]

Which of the following gives the correct value for \(t^*\) in R?
a) qt(0.98, 15)     [1] 2.249
b) qt(0.98, 14)     [1] 2.264
c) qt(0.99, 15)     [1] 2.602
d) qt(0.99, 14)     [1] 2.624
e) qt(0.995, 15)    [1] 2.947
f) qt(0.995, 14)    [1] 2.977

• A random sample of 600 college freshmen was selected and followed for a year to determine whether they would return to college the next year. Of the 600, 468 (78%) returned to college the next year. We would like both a 95% confidence interval for the proportion p of all college freshmen who will return to college the next year, as well as to conduct a test of \(H_0: p = 0.75\) versus \(H_a: p \neq 0.75\). Which of the following is the correct formula for the confidence interval?
a) \(0.78 \pm 2.33\sqrt{(0.75)(0.25)/600}\)
b) \(0.78 \pm 2.33\sqrt{(0.78)(0.22)/600}\)
c) \(0.78 \pm 2.33\sqrt{(0.75)(0.25)/599}\)
d) \(0.78 \pm 2.33\sqrt{(0.78)(0.22)/599}\)
e) \(0.78 \pm 2.33\,(s/\sqrt{n})\)

• Which of the following is the correct formula for the test statistic z?
a) \(z = (0.78 - 0.75)/\sqrt{(0.78)(0.22)/600}\)
b) \(z = (0.78 - 0.75)/\sqrt{(0.75)(0.25)/600}\)
c) \(z = (0.78 - 0.75)/\sqrt{(0.78)(0.22)/599}\)
d) \(z = (0.78 - 0.75)/\sqrt{(0.75)(0.25)/599}\)
e) \(z = (0.78 - 0.75)/(s/\sqrt{n})\)

• A 99% confidence interval for the proportion of adults in Michigan who support an increase in the gas tax to pay for improved road maintenance is desired. If the margin of error is required to be 0.01 or less, how large must the sample size n be? Use the conservative approach by assuming that \(\hat{p} = 0.5\) for planning purposes. Note that for a 99% confidence interval for a proportion, the multiplier is \(z^* = 2.58\).
a) n = 16641
b) n = 10000
c) n = 129
d) n = 101
e) None of the above.

• To test the hypotheses \(H_0: \mu = 65\) versus \(H_a: \mu \neq 65\), where \(\mu\) represents the mean height of adult females in Michigan, a random sample of 27 females in Michigan was selected and their heights were recorded. The test statistic

\[ t = \frac{\bar{y} - 65}{s/\sqrt{27}} \]

was computed to be \(t = -2.11\). Which of the following gives the p-value for this test?
a) pt(-2.11, 26)
b) 1-pt(2.11, 26)
c) 2*pt(-2.11, 26)
d) 2*pt(2.11, 26)
e) 1-pt(-2.11, 26)

• A month before an election, 360 of a random sample of n1 = 600 likely voters favor a candidate. Later, after some negative revelations about the candidate, 200 of a random sample of n2 = 400 likely voters favor the candidate. Let p1 and p2 denote the respective population proportions. We are interested in the change p1 - p2. The estimate for the difference p1 - p2 is 360/600 - 200/400 = 0.60 - 0.50 = 0.10. The 95% CI estimate is 0.10 ±
a) 0.014
b) 0.058
c) 0.041
d) 0.063
e) none of these

• Suppose that 10% of students at a large university wear contact lenses. A simple random sample of n = 900 students is selected, and \(\hat{p}\) denotes the proportion of the 900 who wear contact lenses. Find \(E(\hat{p})\) and \(SD(\hat{p})\). How will they change when we reduce the sample size n?
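
For the last question, a minimal R sketch can tabulate the mean and standard deviation of the sample proportion at several sample sizes. It assumes the usual sampling model for a sample proportion, E(p-hat) = p and SD(p-hat) = sqrt(p(1 - p)/n). The helper name phat_sd and the extra sample sizes 400 and 100 are illustrative assumptions, not part of the worksheet.

```r
# Hypothetical helper (not from the worksheet): standard deviation of the
# sample proportion under the usual model SD(p-hat) = sqrt(p * (1 - p) / n).
phat_sd <- function(p, n) {
  sqrt(p * (1 - p) / n)
}

p <- 0.10                     # population proportion given in the question
n_values <- c(900, 400, 100)  # 900 is from the question; 400 and 100 are illustrative

# E(p-hat) equals p for every sample size, so only the SD depends on n.
results <- data.frame(
  n       = n_values,
  mean_ph = rep(p, length(n_values)),
  sd_ph   = phat_sd(p, n_values)
)
print(results)
```

Under these assumptions the mean stays at 0.10 for every n, while the standard deviation is 0.01 at n = 900, 0.015 at n = 400, and 0.03 at n = 100: reducing the sample size leaves the mean unchanged and increases the standard deviation.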