STATISTICS B REVIEW

S4.1  I can compute the mean of the sum given the mean and standard deviation of each random variable in a set of random variables.  p. 321: 1–9
S4.1  I can compute the variance and standard deviation of the sum given the mean and standard deviation of each random variable in a set of random variables.  p. 321: 1–9
S4.2  I can define and use the Central Limit Theorem with sampling distributions.  p. 328–331: 1–13
S4.3  I can use the normal model to compute probabilities for the sample mean.  p. 328–331: 1–13
S4.3  I can use the CLT to compute probabilities for the sample mean.  p. 328–331: 1–13
S4.4  I can define and use the large sample distribution of sample proportion to compare probabilities.  p. 349–353: 1–12
S4.4  I can define and use rules of thumb for the applicability of the large sample distribution.  p. 349–353: 1–12
S4.5  I can use the normal model to derive the percent confidence interval for the mean.  p. 349–353: 1–12; p. 360–364: 1–13
S4.5  I can use the CLT to derive the percent confidence interval for the mean.  p. 349–353: 1–12; p. 360–364: 1–13
S4.6  I can compute control limits for commonly used control charts.  p. 268–271: 12–15
S4.6  I can compute control limits to assess whether a process is out of control.  p. 268–271: 12–15
S5.1  I can compute bias, variance and mean squared error of estimators of the mean and proportion.  p. 349–353: 1–12; p. 360–364: 1–13
S5.2  I can understand the logic of confidence intervals.  p. 349–353: 1–12
S5.2  I can define the meaning of a confidence interval.  p. 349–353: 1–12
S5.3  I can compute and interpret confidence intervals for the difference between two means (in both the paired and unpaired setting) when the standard deviation is unknown, using the t distribution.  p. 460–464: 1–13; p. 476–478: 1–13; p. 488–492: 1–10
S5.4  I can compute and interpret (large samples) confidence intervals for the difference between two proportions using the normal distribution.  p. 501–505: 1–10
S5.5  I can compute the sample size required for a fixed confidence level and interval width for confidence intervals for means and proportions.  p. 378–381: 1–13
S6.1  I can explain what null hypothesis, alternative hypothesis, p-value, Type I error, Type II error and power mean.  p. 399–401: 1–9
S6.1  I can explain the logic of significance testing.  p. 399–401: 1–9
S6.2  I can, assuming a normal model and known standard deviation, carry out a significance test for a single mean, with emphasis on understanding the computation and interpretation of the p-value.  p. 412–415: 1–15; p. 423–424: 1–9
S6.3  I can carry out (large samples) significance tests for one proportion, with emphasis on proper interpretation of results.  p. 371–373: 1–14; p. 440–443: 1–15
S6.3  I can carry out (large samples) significance tests for the difference of two proportions, with emphasis on proper interpretation of results.  p. 501–505: 1–10
S6.4  I can carry out significance tests for the difference of two means (paired and unpaired) using the t distribution, with emphasis on proper interpretation of results.  p. 460–464: 1–13; p. 476–478: 1–13; p. 488–492: 1–10
S6.4  I can carry out significance tests for one mean using the t distribution, with emphasis on proper interpretation of results.  p. 433–436: 1–13
S6.5  I can carry out chi-squared significance tests of independence with emphasis on proper interpretation of results.  p. 524–525: 1–8
S6.5  I can carry out chi-squared significance tests of goodness of fit with emphasis on proper interpretation of results.  p. 532–534: 1–10
S6.5  I can carry out chi-squared significance tests of homogeneity with emphasis on proper interpretation of results.  w.s. 12.1
S6.7  I can demonstrate, in the context of specific studies, the understanding that a result can be statistically significant while of insignificant practical importance.  p. 399–401: 1–9
S7.1  I can know the statistical model for regression, including linearity.  p. 549–551: 1–9
S7.1  I can know the statistical model for regression, including normality of errors.  p. 549–551: 1–9
S7.1  I can know the statistical model for regression, including constancy of error variance.  p. 549–551: 1–9
S7.2  I can compute a confidence interval for the slope of a regression line using the t distribution.
S7.2  I can interpret a confidence interval for the slope of a regression line using the t distribution.
S7.3  I can test hypotheses about the slope of a regression line, with emphasis on interpretation of results.
S8.1  I can demonstrate knowledge of the assumptions required for all of the inferential procedures (confidence intervals and significance tests).
S8.2  I can, in the context of specific studies, recognize aspects of study design that either support or offer evidence against required assumptions.
S8.3  I can demonstrate knowledge of the possible effects of incorrect assumptions (i.e., improperly specified models) on the inferential procedures.
S8.3  I can demonstrate knowledge of the possible effects of incorrect assumptions (i.e., improperly specified models) on the robustness of inferential procedures to departures from specified assumptions.
S8.4  I can show in context an understanding that statistical models are approximations to reality.
S8.4  I can show in context an understanding that care should be exercised in assigning too much precision to measures such as confidence levels or p-values.
Practice pages for S7.2 through S8.4, in source order: p. 549–551: 1–9; p. 549–551: 1–9; p. 560–562: 1–6; p. 399–401: 1–9; p. 399–401: 1–9; p. 399–401: 1–9; p. 399–401: 1–9; p. 433–436: 1–13; p. 399–401: 1–9; p. 399–401: 1–9

1. (S4.1) Give two examples of sample statistics: ______________________

2. (S4.1) State the Central Limit Theorem (Section 8.2).

3. (S4.1) Given σ = 2.35 and n = 13, find the standard error of the mean.

4. (S4.2) Jimmy is the manager of a bottling plant in Denver. He has observed that the amount of orange juice in each 24 oz bottle is normally distributed, with a mean of 24.3 ounces and a standard deviation of 0.7 ounces. Find the probability that, if a customer buys four bottles of juice, the mean of the four will be greater than 24 ounces. Round your answer to four decimal places.

5. (S4.3) The number of boxes of Girl Scout cookies sold by each of the Girl Scouts in a Midwestern city has a distribution which is approximately normal with mean μ = 75 boxes and standard deviation σ = 30 boxes. Find the probability that the sample mean (x̄) number of boxes of cookies sold by a random sample of 36 Girl Scouts is between 60 and 90 boxes.

6. (S4.4) The Smith twins Sally and Billy are in different math classes at Freedom High. On their final exams, Sally scored 70 on a test with a mean of 80 and a standard deviation of 10, and Billy scored 62 on a test with a mean of 71 and a standard deviation of 11. Who scored better?

7. (S4.4) What size does the sample have to be in order to use the Student's t distribution?

8. (S4.5) Given the confidence interval 6.71 < µ < 7.23, and that s = 1.15 and n = 75, find the confidence level, c.

9. (S5.2) What does it mean when a confidence interval such as 3.2 < µ < 3.8 is computed at a 90% level?

10. (S5.3) When computing a 95% confidence interval for the given information, what item would you find first?

11. (S5.4) A random sample of 100 felony trials in a large city in the Midwest shows the mean waiting time between arrest and trial is 173 days with standard deviation 28 days.
Find a 99% confidence interval for the mean time interval between arrest and trial.

12. (S5.5) The Director of a Museum would like to know what fraction of the museum associates make purchases through the gift shop catalogue.
(a) If no preliminary study is done, how large a sample must be taken if the director is to say with 90% confidence that the sample estimate is within 2% of the population proportion?
(b) A preliminary study showed that out of 60 associates, 12 have used the gift shop catalogue. What size sample does the director need in order to say with 90% confidence that the sample estimate is within 2% of the population proportion?

13. (S5.5) A random sample of 40 Salt Lake City teachers showed the standard deviation of teaching experience to be 5.3 years. How many more teachers should be included in the sample to have 95% confidence that the sample mean number of years of teaching experience is within six months of the population mean?

14. (S6.1) Based on the null hypothesis below, identify whether the result is a Type I error, a Type II error, or not an error.
H0: There is a smoke alarm in the kitchen.
Result: The smoke alarm doesn't go off when there is a fire in the kitchen.

15. (S6.1) What are your two choices in terms of H0 during hypothesis testing?

16. (S6.2) Killer bees have migrated into this country. There is fear that they will spread across the nation. However, they cannot survive in cold climates. It is thought that they cannot tolerate temperatures below 36 °F. To test this claim, a random sample of 9 killer bee hives were subjected to colder and colder temperatures until they died. The temperatures at which the hives died were recorded. The mean temperature was 37 °F with standard deviation 4 °F. Assuming that the killing temperature level is normally distributed, test the claim that the mean killing temperature is different from 36 °F. Use a 1% level of significance.

17.
(S6.3) Of a random sample of 150 American adults, 69 adults claim that they prefer drinking regular soft drinks to drinking diet soft drinks, and of a random sample of 130 American adults, 61 adults claim that they prefer drinking diet soft drinks to drinking regular soft drinks. Would this mean that the proportion of American adults that drink regular soft drinks is less than that of those who drink diet soft drinks? Use a 5% level of significance.

18. (S6.4) In order to test how the environment affects the ability of a child to understand concepts better, an experiment was conducted on six pairs of identical twins. In the given table, row B represents the writing age in months of the randomly selected member of a twin pair in an experimental group, whereas row A represents the writing age in months of the other member of the twin pair, who was not part of the experimental group. Using a 1% level of significance, test if there is a difference in the writing ages of the two groups. Then, find the P-value.

         Twin 1   Twin 2   Twin 3   Twin 4   Twin 5   Twin 6
Row A    59       61       65       60       66       58
Row B    70       72       68       61       65       69

19. (S6.5) According to recent marketing research, the distribution of the ages of people that buy compact discs in the U.S. is listed in the table below. A random sample of 750 people is also presented in the table. At a 5% significance level, does it appear that the distribution of the ages of the customers in the report fits that from customers in the U.S. as a whole? ONLY calculate χ².

Age in years    Percent of Customers in the Report    Number of Customers in the Report
Less than 20    40.1                                  97
21–30           21.3                                  374
31–40           19.0                                  65
More than 40    19.6                                  214

20. (S6.7) The Magic Dragon Cigarette Company claims that their cigarettes contain an average of only 10 mg of tar. A random sample of 100 Magic Dragon cigarettes shows the average tar content to be 11.5 mg with standard deviation 4.5 mg. Use a 1% level of significance.
Which statement correctly identifies this situation?
(a) Too much emphasis on precision
(b) Too little emphasis on precision
(c) Incorrect assumption that the sample is large enough
(d) Incorrect assumption that the sample is small enough

21. (S7.1 & S7.2) In the following table, x represents the number of questions David has answered more than Sally in the school exam and y represents the percent of times David gets a higher grade than Sally.

x    13   16   14   19   15   12   22
y    13   14   20   12   22   19   23

(a) Find the linear regression equation for the given data. You may use a calculator. Round to the nearest thousandth.
(b) Find the standard error of estimate for the data set. Round to the nearest thousandth.
(c) Using the data and linear regression line from part (a), find an 85% confidence interval for y when x = 17. Round to the nearest hundredth.

22. (S7.3) In the following table, x represents the average percentage of change in prices of consumer goods and y represents the average percentage change in expenditure of a family. Use a 2.5% level of significance to test the claim that 0.

x    3.2    3.9    2.6    3.7    3.5    3.3    3.1
y    12.4   17.6   13.4   17.6   14.4   9.2    5

23. (S7.3) In the following table, x represents the per capita sales of a wholesale outlet in thousands and y represents the per capita income in thousands of dollars of the outlet. Use a 5% level of significance to estimate the P-value for the claim that 0.

x    14.6   15.1   15.3   15.8   16.4   16.7   16.8
y    19.1   24.2   18.5   23.8   19.6   25.2   24.9

24. (S8.1) Which confidence interval is longer, 80% or 95%? Why? If the confidence interval is longer, then what can be said about the x values (sample statistics) in terms of the mean ( x )?

25. (S8.2) In a high school study of test preparation methods it was shown that method 2 produced 1% better results than method 1. At α = .01, test results indicate it is statistically significant. Method 2 is expensive to purchase. Should the district buy the new test preparation kit?

26.
(S8.3) Incorrect assumptions can affect the robustness of inferential procedures. Which measure of central tendency is most affected by outliers?

27. (S8.4) What is the test statistic (tailed test) needed to determine if you are testing an item being different than the stated statistic?

Key to Benchmark 3 review

1. x̄, s, s²
2. As n grows the distribution approaches a normal distribution
3. 1.277
4. .8051
5. 99.74%
6. Sally = -1; Billy = -.8181, so Billy is better
7. less than 30
8. 95%
9. 90% of the time the true mean lies between 3.2 and 3.8
10. C = 95%
11. 90% confident
12. a) 1692; b) 1083
13. 165.78 < µ < 180.22
14. Type I
15. Reject null or accept null
16. T = .75; do not reject; p =
17. Z = -0.151; do not reject
18. T = -2.615; .02 < p < .05
19. Chi square = 498.06
20. A
21. Y = 14.484 + 0.1947x; E = 8.924; 8.87 < y < 26.72
22. T-value does not lie within the critical region, so do not reject
23. .02 < p < .05
24. 95%; To be more sure you include a wider range; you can be 95% sure the sample statistic will be in the range
25. No
26. Mean
27. Not equal to
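
As a rough cross-check of a few key entries, the Python sketch below recomputes the answers to questions 4, 12, 18 and 19 from the data given in the questions. It uses only the standard library. The variable and function names are illustrative and not part of the review. Two things are assumptions: the critical value 1.645 for 90% confidence is the usual table value, and the z-score in question 4 is rounded to two places on the guess that the key was produced from a printed z-table.

```python
import math
from statistics import NormalDist, mean, stdev

# Question 4: P(sample mean > 24) with mu = 24.3, sigma = 0.7, n = 4.
se_q4 = 0.7 / math.sqrt(4)                 # standard error of the mean
z_q4 = round((24 - 24.3) / se_q4, 2)       # rounded as for a table lookup (assumption)
p_q4 = 1 - NormalDist().cdf(z_q4)

# Question 12: sample size for a proportion, n = p(1 - p)(z / E)^2, rounded up.
def sample_size_proportion(z_crit, margin, p_hat=0.5):
    """Hypothetical helper; p_hat = 0.5 is the no-preliminary-study case."""
    return math.ceil(p_hat * (1 - p_hat) * (z_crit / margin) ** 2)

n_a = sample_size_proportion(1.645, 0.02)            # part (a)
n_b = sample_size_proportion(1.645, 0.02, 12 / 60)   # part (b), p_hat = 0.2

# Question 18: paired t statistic on the differences A - B.
row_a = [59, 61, 65, 60, 66, 58]
row_b = [70, 72, 68, 61, 65, 69]
diffs = [a - b for a, b in zip(row_a, row_b)]
t_q18 = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Question 19: chi-square goodness-of-fit statistic.
percents = [40.1, 21.3, 19.0, 19.6]        # stated population percentages
observed = [97, 374, 65, 214]              # sample counts
n = sum(observed)
expected = [n * p / 100 for p in percents]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"Q4  P(mean > 24) = {p_q4:.4f}")
print(f"Q12 n = {n_a} (a), {n_b} (b)")
print(f"Q18 t = {t_q18:.3f} with {len(diffs) - 1} degrees of freedom")
print(f"Q19 chi-square = {chi2:.2f}")
```

Under these assumptions the results for questions 4, 12 and 19 agree with the key. The paired t statistic comes out near -2.614, which matches the key's -2.615 up to rounding of intermediate values.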