Midterm Practice Questions Answer Key 2014

1. A student in Mrs. Stamp’s Statistics class was complaining about how hard a teacher she was. “She failed 6 students and Mr. Nidsgorski only failed 1 student. Mrs. Stamp is a harder grader than Mr. Nidsgorski.” Explain why this conclusion may not be true. What additional information would you need to compare the courses?
You need to know how many students there are in each class. Mrs. Stamp could have more students than Mr. Nidsgorski and therefore could fail a smaller percentage of her students. Result: comparing percents is better than comparing counts.

2. Many people eat fast food as a regular part of their diet. Is fast food unhealthy? The best answer may be, “It depends on what you eat.” In order to answer this question you gather some nutritional data about some popular fast food burgers: Quarter Pounder with cheese, Classic Single with everything, Whopper with cheese, and Cheeseburger. You record the restaurant where each is sold and the calories, fat, cholesterol, and sodium content of each burger.
a. What are the individuals in this data set?
Individuals: fast-food hamburgers.
b. What are the variables recorded in the data set?
Variables: restaurant; calories; fat; cholesterol; sodium
c. Which are the numerical variables and which are the categorical variables measured?
Numerical variables: calories; fat; cholesterol; sodium
Categorical variables: restaurant

3. The student government plans to ask a random sample of students about their priorities for improving campus life. The college registrar provides a list of the 3500 enrolled students to serve as a sampling frame.
a. Describe how you would choose an SRS of 250 students.
b. Describe how you would choose a systematic sample of 250 students.
c. The list shows whether students live on campus (2400 students) or off campus (1100 students). Describe how you would choose a stratified sample of 200 on-campus students and 50 off-campus students.
d.
Which method would you choose and why?
a. Assign each student on the list a four-digit number in the range 0001-3500. Use the random digit table, reading four digits at a time. Ignore numbers outside the range and ignore duplicates. Continue this process until you have identified 250 students.
b. For a systematic sample you would need every 14th name on an ordered list (3500/250 = 14). Randomly select one of the first 14 students on the list and then take every 14th name on the list from then on.
c. Stratified random sample: choose an SRS of 200 from the 2400 on-campus students and a separate SRS of 50 from the 1100 off-campus students.
d. Answers will vary. For example, choose a stratified sample in order to be sure that opinions of on- and off-campus students are fairly represented.

4. According to Snapple.com, 13% of adults are left-handed. At a mathematics conference, 16% of those attending were left-handed. State whether each boldface number is a parameter or a statistic.
13% is a parameter; 16% is a statistic.

5. The noted scientist Dr. Iconu wanted to investigate attitudes toward television advertising among American college students. He decided to use a sample of 100 students. Students in freshman psychology are required to serve as subjects for experimental work. Dr. Iconu obtained a class list and chose an SRS of 100 of the 340 students on the list. He asked each of the 100 students in the sample the following question: Do you agree or disagree that having commercials on TV is a fair price to pay for being able to watch it? Of the 100 students in the sample, 82 marked “Agree.” Dr. Iconu announced the result of his investigation by saying “82% of American college students are in favor of TV commercials.”
a. What is the population?
American college students.
b. What is the sampling frame? Is it suitable?
Students in freshman psychology at one university. No. The sampling frame does not come close to representing the population to which Dr. Iconu wants to generalize his findings: undercoverage!
c. What is the sample?
100 students from the psychology class.
d.
What is the population parameter?
Unknown (in words: the % of all American college students who are in favor of TV commercials).
e. What is the sample statistic?
82% (in words: the % of the 100 sampled college students who favor TV commercials).
f. Why is Dr. Iconu’s result misleading?
The sampling frame is not representative of the population of interest: undercoverage bias.
g. Dr. Iconu defended himself against criticism by pointing out that he had carefully selected a simple random sample from his sampling frame. Is this defense relevant? Why?
The defense is not relevant. It doesn’t matter that he selected an SRS from his sampling frame, because his sampling frame is not representative of the population: undercoverage.

6. The Gallup Poll asked a random sample of 1493 adults, “Are you afraid to go out at night within a mile of your home because of crime?” Of the sample, 672 said, “Yes.” Make a confidence statement about the percent of all adults who fear to go out at night because of crime.
p hat (statistic): $\hat{p} = 672/1493 \approx 0.45 = 45\%$; margin of error $\approx 1/\sqrt{1493} \approx 0.026 \approx 0.03 = 3\%$.
We are 95% confident that between 42% and 48% of all adults are afraid to go out at night within a mile of their own home because of crime. (45% plus or minus 3%)

7. What is the advantage of bigger random samples in a sample survey?
Bigger samples give less variability, which results in smaller margins of error.

8. Each of the following is a source of error in a sample survey. Label each as sampling error or nonsampling error, and explain your answers.
a. The telephone directory is used as a sampling frame.
Sampling error: undercoverage.
b. The subject cannot be contacted in five calls.
Nonsampling error: nonresponse.
c. Interviewers choose people on the street to interview.
Sampling error: bad sample (the interviewers, not chance, decide who is included).

9. Doctors identify “chronic tension-type headaches” as headaches that occur almost daily for at least six months.
Can antidepressant medications or stress management training reduce the number and severity of these headaches? Are both together more effective than either alone? Investigators compared 4 treatments: antidepressant alone, placebo alone, antidepressant plus stress management, and placebo plus stress management. Assume you have 30 volunteers available for the study. Identify the subjects, response variable, and explanatory variable.
Subjects: 30 volunteers who have headaches that occur almost daily for at least 6 months.
Response variable: change in number and severity of the headaches.
Explanatory variable: treatment (antidepressant alone, placebo alone, antidepressant plus stress management, and placebo plus stress management).

10. You want to conduct an experiment to investigate whether single-sex classrooms will help girls improve their score on a chemistry midterm exam. You can assume you have volunteers from Ms. Manning’s chemistry classes. Draw an experimental design for the experiment.
Subjects: girls in Ms. Manning’s chemistry classes
Response variable: scores on midterm exam
Explanatory variable: single-sex vs co-ed classroom

11. Use the following set of data for the questions below:
3, 12, 17, 20, 21, 23, 30, 34, 36, 36, 38, 38, 40, 40, 41, 42, 44, 45, 45, 45, 47
The numbers above represent the amount of exercise (in minutes) that 21 WA seniors got during one weekend day.
a. Find the 5-number summary and draw a neatly labeled horizontal boxplot for the data above.
5-number summary: 3, 22, 38, 43, 47
b. Describe the shape of the boxplot.
Skewed to the left.
c. Mean is (<, =, >) Median (circle one).
Mean < Median (the mean is about 33.2; the median is 38).
d. Which is better to use for center: median or mean? Why?
Median, because of the skewness. The mean is thrown off by extreme values and can be pulled toward the tail.
e. Calculate the interquartile range (IQR).
Q3 - Q1 = 43 - 22 = 21
f. Does this data include any outliers? Find the low and high boundaries.
IQR = 21
1.5 * IQR = 31.5
Q3 + 31.5 = 74.5
Q1 - 31.5 = -9.5
Boundaries are -9.5 and 74.5, so there are no outliers.

12.
Calculate the standard deviation for the following set of data: 0, 1, 1, 4, 6, 6
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{36}{5}} \approx 2.68$, where $\bar{X} = 3$.
X:                  0   1   1   4   6   6
X - mean:          -3  -2  -2   1   3   3
(X - mean)^2:       9   4   4   1   9   9   (sum = 36)
The lower the SD, the closer the points are to the mean.

13. A study found that high school GPAs were positively associated with first-year GPAs for college students. We can conclude from this that
(a) students who scored high in high school tended to get lower GPAs than those who scored lower in high school.
(b) students who scored high in high school tended to get higher GPAs than those who scored lower in high school.
(c) grade point averages are higher for older students.
(d) the correlation between high school GPAs and first-year college GPAs is 1.0.
ANSWER: B

14. The correlation between two variables is -0.8. We can conclude
(a) there is a strong positive association between the two variables.
(b) there is a strong negative association between the two variables.
(c) all of the relationship between the two variables can be explained by a straight line.
(d) there are no outliers.
ANSWER: B

15. You would draw a scatterplot
(a) to show the five-number summary for the heights of female students.
(b) to show how a child's height increases over time.
(c) to show the relationship between the heights of female students and the heights of their mothers.
(d) to show the distribution of heights of students in this course.
ANSWER: C

16. Which correlation indicates a strong positive straight-line relationship?
(a) 0.4
(b) -0.75
(c) 1.5
(d) 0.0
(e) 0.99
ANSWER: E

17. If the explanatory variable and response variable are switched, the correlation would be affected
(a) by changing the correlation from negative to positive.
(b) by changing the correlation from positive to negative.
(c) correlation would not change.
(d) by changing the correlation from a linear to a curved relationship.
ANSWER: C
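
Checking the arithmetic. The numeric answers to questions 6, 11, 12, and 17 can be double-checked with a short script. What follows is a minimal Python sketch, not part of the original key. The function names (five_number_summary, pearson_r) and the small paired data set used for question 17 are made up for illustration, and the quartiles are assumed to follow the "median of each half" rule, which is the rule that reproduces the key's answers.

```python
import math
import statistics

# Question 6: sample proportion and the quick 95% margin of error, 1/sqrt(n).
n, yes = 1493, 672
p_hat = yes / n
margin = 1 / math.sqrt(n)
print(f"p-hat = {p_hat:.2f}, margin of error = {margin:.3f}")

# Question 11: five-number summary, IQR, and the 1.5 * IQR outlier boundaries.
minutes = [3, 12, 17, 20, 21, 23, 30, 34, 36, 36, 38,
           38, 40, 40, 41, 42, 44, 45, 45, 45, 47]

def five_number_summary(values):
    """Min, Q1, median, Q3, max; quartiles are the medians of each half.

    Assumes at least two values. With an odd count, the middle value is
    left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

low, q1, med, q3, high = five_number_summary(minutes)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [m for m in minutes if m < low_fence or m > high_fence]
print("five-number summary:", (low, q1, med, q3, high))
print("mean:", round(statistics.mean(minutes), 1), "median:", med)
print("boundaries:", low_fence, high_fence, "outliers:", outliers)

# Question 12: sample standard deviation (divides by n - 1).
scores = [0, 1, 1, 4, 6, 6]
print("s =", round(statistics.stdev(scores), 2))

# Question 17: swapping the two variables does not change the correlation.
def pearson_r(xs, ys):
    """Pearson correlation computed directly from the definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data, used only to illustrate the symmetry.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print("r(x, y) =", pearson_r(xs, ys), " r(y, x) =", pearson_r(ys, xs))
```

Other quartile conventions (for example, the interpolation used by many software packages) can give slightly different values for Q1 and Q3, so a mismatch with a calculator or spreadsheet does not necessarily mean the key is wrong.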