Name: __________________________ Date: _____________ 1. The Columbus Zoo conducts a study to determine whether a household's income can be used to predict the amount of money the household will give to the zoo's annual fund drive. The response variable in this study is A) the Columbus Zoo. B) a household's income. C) the amount of money a household gives to the zoo's annual fund drive. D) all households in Columbus. Use the following to answer questions 2-3: The height (in feet) and volume of usable lumber (in cubic feet) of 32 cherry trees are measured by a researcher. The goal is to determine if volume of usable lumber can be estimated from the height of a tree. The results are plotted below. 2. In this study, the response variable is A) height. B) volume. C) height or volume. It doesn't matter which is considered the response. D) neither height nor volume. The measuring instrument used to measure height is the response variable. 3. The scatterplot suggests A) there is a positive association between height and volume. B) there is an outlier in the plot. C) both a and b. D) neither a nor b. Page 1 Use the following to answer questions 4-5: I wish to determine the correlation between the height (in inches) and scoring average (points per game) of women on a college basketball team. To do this I record the height and scoring average of two players on the team. The values are Player #1 Player #2 Height 70 75 Scoring Average 11.0 20.0 4. The correlation r computed from the measurements on these players is A) 1.0. B) positive and between 0.25 and 0.75. C) near 0, but could be either positive or negative. D) exactly 0. 5. The correlation r would have units A) inches. B) points. C) inches-points. D) no units. Correlation is a unitless quantity. 6. Which of the following statements is true? A) the correlation coefficient equals the proportion of times two variables lie on a straight line. B) the correlation coefficient will be +1.0 only if all the data lie on a perfectly horizontal straight line. C) the correlation coefficient measures the fraction of outliers that appear in a scatterplot. D) the correlation coefficient is a unitless number and must always lie between 1.0 and +1.0, inclusive. Page 2 7. Consider the following scatterplot. The correlation between X and Y A) is approximately 0.999. B) is approximately 0.8. C) is approximately 0.0. D) cannot be computed because there is an outlier in the plot. 8. Consider the following scatterplot of two variables X and Y. We may conclude A) the correlation between X and Y must be close to 1 because there is nearly a perfect relation between them. B) the correlation between X and Y must be close to 1 because there is nearly a perfect relation between them but it is not a straight-line relation. C) the correlation between X and Y is close to 0. D) the correlation between X and Y could be any number between 1 and +1. Without knowing the actual values we can say nothing more. Page 3 9. Which of the following is true of the correlation coefficient r? A) It is a resistant measure of association. B) 1 r 1. C) If r is the correlation between X and Y, then r is the correlation between Y and X. D) All of the above. 10. Consider the scatterplot below. The form of the relationship represented in the plot is best described as A) multi-linear. B) negative association. C) clusters. D) all of the above. Page 4 Use the following to answer questions 11-13: Consider the following scatterplot of amounts of CO (carbon monoxide) and NOX (nitrogen oxide) in grams per mile driven in the exhausts of cars. The least-squares regression line has been drawn in the plot. 11. The intercept of the least-squares regression line is approximately A) 0.7. B) 1.7. C) 2.0. D) cannot be determined from the graph. 12. The least-squares line would predict that a car that emits 10 grams of CO per mile driven would emit how many grams of NOX per mile driven? A) 10.0. B) 1.7. C) 2.2. D) 1.1. 13. The point indicated by the x has A) a negative value for the residual. B) a positive value for the residual. C) a zero value for the residual. D) a zero value for the correlation. Page 5 Use the following to answer questions 14-16: Using data on the appraised value of homes, a real estate agent computes the least-squares regression line for predicting a home's value in 2002 from its value in 1992. The equation of the least-squares regression line is y = $22,000 + 1.6x where y represents a home's value in 2002 and x is the value in 1992. 14. A home's value in 1992 is A) the intercept. B) the slope. C) the explanatory variable. D) the response variable. 15. Suppose Joe owns a home that was worth $100,000 in 1992. What would be the predicted value of his home in 2002? A) $182,000. B) $160,000. C) $122,000. D) cannot be determined from the information given. We also need to know the correlation. 16. Which of the following is true of the least-squares regression line? A) The slope is the change in a home's value in 2002 that would be predicted by a $1 increase in a home's value in 1992. B) It always passes through the point ( x , y ) the means of the values of home's (that were used to compute the least-squares regression line) in 1992 and 2002, respectively. C) It will only pass through all the data points if r = ± 1. D) All of the above. Page 6 17. Babies typically learn to crawl approximately six months after birth. However, it may take longer for babies to learn to crawl in the winter when they are often bundled in clothes that restrict their movement. Thus there may be an association between a baby’s crawling age and the average temperature during the month they first try to crawl. Below are the average ages (in weeks) at which babies began to crawl for a sample of babies born in each of the 12 months of the year. In addition, the average temperature (in F) for the month that is six months after the birth month is also listed. Birth month January February March April May June July August September October November December Average crawling age 29.84 30.52 29.70 31.84 28.58 31.44 33.64 32.82 33.83 33.35 33.38 32.32 Average temperature 66 73 72 63 52 39 33 30 33 37 48 57 We want to investigate if the average age at which infants begin to crawl (Y) can be predicted from the average outdoor temperature (X) six months after birth when they are likely to begin crawling. We decide to fit a least-squares regression line to the data with X as the explanatory variable and Y as the response variable. We compute the following quantities. r = correlation between X and Y = –0.7 x = mean of the values of X = 50.25 y = mean of the values of Y = 31.77 s x = standard deviation of the values of X = 15.85 s y = standard deviation of the values of Y = 1.76 The slope of the least-squares line is A) –0.08. B) 0.49. C) –1.58. D) 14.24. Page 7 18. The correlation between the age and height of children is found to be about r = 0.7. Suppose we use the age x of a child to predict the height y of the child. We conclude A) the least-squares regression line of y on x would have a slope of 0.7. B) the fraction of the variation in heights explained by the least-squares regression line of y on x is 0.49. C) about 70% of the time, age will accurately predict height. D) height is generally 70% of a child's age. 19. Which of the following is correct? A) The correlation r is the slope of the least-squares regression line. B) The square of the correlation is the slope of the least-squares regression line. C) The square of the correlation is the proportion of the data lying on the least-squares regression line. D) The mean of the residuals from least-squares regression is 0. 20. Suppose we fit the least-squares regression line to a set of data. Points with unusually large values of the residuals are called A) response variables. B) the slope. C) outliers. D) correlated. Use the following to answer question 21: A medical researcher collects data relating drug dosage, in milligrams (x), and heart rate (y). The regression equation for the data obtained is yˆ 100 37.5 x 21. Which statement correctly interprets the slope of this equation? A) For every unit increase in drug dosage, heart rate increases by 37.5, on average. B) For every unit increase in drug dosage, heart rate increases by 100, on average. C) For every unit increase in drug dosage, heart rate decreases by 37.5, on average. D) For every unit increase in drug dosage, heart rate decreases by 62.5, on average. 22. Which of the following would be necessary to establish a cause-and-effect relation between two variables? A) Strong association between the variables. B) An association between the variables is observed in many different settings. C) The alleged cause is plausible. D) All of the above. Page 8 Use the following to answer question 23: A study was conducted to investigate the relationship between the number of hours a student spends studying for an exam, x, and his score on the exam, y. Residuals Versus hours (response is Score) 20 15 10 Residual 5 0 -5 -10 -15 -20 -25 0 5 10 15 20 25 hours 23. A student who spends 22 hours studying has a negative residual. This means A) the student's actual exam score is higher than his predicted exam score. B) the student's actual exam score is lower than his predicted exam score. C) there must be a mistake; residuals cannot be negative. 24. A group of college students believes that herbal tea has remarkable restorative powers. To test their theory they make weekly visits to a local nursing home, visiting with residents, talking with them, and serving them herbal tea. The residents are evaluated by a trained therapist at the beginning of the study for general health and mental state. After several months of the visits, the residents are reevaluated by the therapist and it is found that many of the residents are more cheerful and healthy. This is A) an observational study. B) an experiment, but not a double blind experiment. C) a double blind experiment. D) a block design. Page 9 25. In order to investigate whether women are more likely than men to prefer Democratic candidates, a political scientist selects a large sample of registered voters, both men and women. She asks every voter whether they voted for the Republican or the Democratic candidate in the last election. This is A) an observational study. B) a multistage sample. C) a double blind experiment. D) a block design. Use the following to answer questions 26-28: In order to assess the opinion of students at the Ohio State University on campus safety, a reporter for the student newspaper interviews 15 students he meets walking on the campus late at night who are willing to give their opinion. 26. The sample is A) all those students walking on campus late at night. B) all students at universities with safety issues. C) the 15 students interviewed. D) all students approached by the reporter. 27. The method of sampling used is A) simple random sampling. B) the Gallup Poll. C) voluntary response. D) a census. 28. The sample obtained is A) a simple random sample of students feeling safe. B) a stratified random sample of students feeling safe. C) a probability sample of students with night classes. D) probably biased. Page 10 Use the following to answer questions 29-30: You need to select a simple random sample of size three from the following employees of a small company 1. Berliner 2. Blumenthal 3. MacEachern 4. Wolfe 5. Stasny 6. Santner 7. Verducci 8. Lin 9. Critchlow using the numerical labels attached to the names above and the following list of random digits. Read the list of random digits from left to right, starting at the beginning of the list. 11793 20495 05907 11384 44982 20751 27498 12009 45287 71753 98236 66419 84533 29. The simple random sample is A) 117. B) Berliner and Verducci. C) Berliner, Verducci, and Critchlow. D) Berliner, Verducci, and MacEachern. 30. Which of the following statements is true? A) If we used another list of random digits to select the sample, we would get the same result as obtained with the list actually used. B) If we used another list of random digits to select the sample, we would get a completely different sample than that obtained with the list actually used. C) If we used another list of random digits to select the sample, we would get at most one name in common with that obtained with the list actually used. D) If we used another list of random digits to select the sample, the result obtained with the list actually used would be just as likely to be selected as any other set of three names. 31. A stratified random sample corresponds to which of the following experimental designs? A) a block design B) a double blind experiment C) an experiment with a placebo D) a confounded, nonrandomized study Page 11 32. A 1992 Roper poll found that 22% of Americans say that the Holocaust may not have happened. The actual question asked in the poll was “Does it seem possible or impossible to you that the Nazi extermination of the Jews never happened?" and 22% responded possible. The results of this poll cannot be trusted because A) undercoverage is present. Obviously those people who did not survive the Holocaust could not be in the poll. B) the question is worded in a confusing manner. C) we do not know who conducted the poll or who paid for the results. D) nonresponse is present. Many people will refuse to participate, and those that do will be biased in their opinions. 33. In order to take a sample of 90 members of a local gym, I first divide the members into men and women, and then take a simple random sample of 45 men and a separate simple random sample of 45 women. This is an example of a A) block design. B) stratified random sample. C) double blind simple random sample. D) randomized comparative experiment. 34. In order to select a sample of undergraduate students in the United States, I select a simple random sample of four states. From each of these states, I select a simple random sample of two colleges or universities. Finally, from each of these eight colleges or universities, I select a simple random sample of 20 undergraduates. My final sample consists of 160 undergraduates. This is an example of A) simple random sampling. B) stratified random sampling. C) multistage sampling. D) convenience sampling. Use the following to answer questions 35-37: A Senator wants to know what the voters of his state think of proposed legislation on gun control. He mails a questionnaire on the subject to an SRS of 2500 voters in his state. His staff reports that 448 questionnaires have been returned, 343 of which support the legislation. Page 12 35. The population is A) the voters in his state. B) the 448 letters received. C) the 343 letters supporting the legislation. D) the 2500 voters receiving the questionnaire. 36. The sample is A) the voters in his state. B) the 448 letters received. C) the 343 letters supporting the legislation. D) the 2500 voters receiving the questionnaire. 37. This is an example of A) a survey containing nonresponse. B) a survey with little bias because a large SRS was used. C) a survey with little bias because it was the voters who elected the senator. D) all of the above. Use the following to answer questions 38-39: A state group is interested in whether voters would support lowering the legal drinking age to 18. A simple random sample of 25 voters is selected from each county in the state and asked whether they approve of lowering the drinking age. 38. This is an example of A) systematic sampling. B) multistage sampling. C) a simple random sample. D) stratified sampling. 39. The population of interest is A) the 25 voters selected in each county. B) voters in the state. C) all adults in the state. D) 18-year-olds in the state. Page 13 Use the following to answer questions 40-43: Sickle cell disease is a painful disorder of the red blood cells that affects mostly blacks in the United States. To investigate whether the drug hydroxyurea can reduce the pain associated with sickle cell disease, a study by the National Institute of Health has eight subjects available with sickle cell disease. The researchers are going to give the drug to four of the subjects and a placebo to the other four. They will then count the number of episodes of pain reported by each subject. 40. The names of the eight subjects are given below. 1) Berliner 2) Costello 3) Duvall 4) Fan 5) House 6) Long 7) Pavlicova 8) Tang Using the list of random digits, 81507 27102 56027 55892 33063 41842 81868 71035 09001 43367 49497 starting at the beginning of this list and using single digit labels, you assign the first four subjects selected to receive the drug hydroxyurea, while the remainder receive the placebo. The subjects assigned to the placebo are A) Berliner, Costello, Duvall, and Fan. B) Berliner, House, Pavlicova, and Tang. C) House, Long, Pavlicova, and Tang. D) Costello, Duvall, Fan, and Long. 41. The response is A) the drug hydroxyurea. B) the number of episodes of pain. C) the presence of sickle cell disease. D) the number of red blood cells. 42. The experimental units are A) the subjects who experienced the most episodes of pain. B) the four subjects who received the drug hyroxyurea. C) the four subjects who received the placebo. D) the eight subjects. Page 14 43. This is an example of A) a matched pairs experiment. B) a two-factor design. C) a randomized comparative experiment. D) a randomized observational study. 44. Which of the following is not a major principle of experimental design? A) comparative experimentation B) replication C) randomization D) segmentation Use the following to answer questions 45-46: In a recent article it was reported that in a random sample of children in grades 2–4, a significant negative relationship was found between the amount of homework assigned and student attitudes. 45. This is an example of A) an experiment. B) an observational study. C) the establishing of a causal relationship through correlation. D) a block design, with grades as blocks. 46. The amount of homework assigned is A) an explanatory variable. B) a response variable. C) a confounding variable. D) none of the above. Use the following to answer questions 47-48: Researchers wish to determine if a new experimental medication will reduce the symptoms of allergy sufferers without the side-effect of drowsiness. To investigate this question, the researchers give the new medication to 50 adult volunteers who suffer from allergies. Forty-four of these volunteers report a significant reduction in their allergy symptoms without any drowsiness. Page 15 47. This study could be improved by A) including people who do not suffer from allergies in the study in order to represent a more diverse population. B) repeating the study with only the 44 volunteers who reported a significant reduction in their allergy symptoms without any drowsiness, and giving them a higher dosage this time. C) using a control group. D) all of the above. 48. The experimental units are A) the researchers. B) the 50 adult volunteers. C) the 44 volunteers who reported a significant reduction in their allergy symptoms without any drowsiness. D) the six volunteers who did not report a significant reduction in their allergy symptoms without any drowsiness. Use the following to answer questions 49-51: A marketing experiment compares four different types of packaging for blank computer CDs. Each type of packaging can be presented in three different colors. Each combination of package type with a particular color is shown to 40 potential customers, who rate the overall attractiveness on a scale of 1 to 7. 49. The experimental units in this experiment are A) the potential customers. B) the measure of attractiveness. C) type of packaging and color. D) the three different colors. 50. The factors are A) the rating scale and package combination. B) the three different colors. C) the potential customers. D) type of packaging and color. Page 16 51. The number of treatments is A) 3. B) 4. C) 7 D) 12. Use the following to answer questions 52-56: A researcher would like to see if there is a difference in the durability of two types of material used for making soles of shoes. Thirty-six males volunteered to participate in the study. Eighteen males were randomly assigned to wear shoes made with material A for three months, and the remaining eighteen males wore shoes made with material B for three months. At the end of the three months, the researchers measured the amount of wear on the two groups' shoes. 52. This is an example of: A) an observational study. B) a randomized comparative experiment. C) a double-blind block design. D) a matched-pairs design. 53. The factor in the study was: A) the 36 males. B) the amount of wear after three months. C) the materials used to make the soles of the shoes. D) the three months time that passed. 54. The units were: A) the 36 males. B) the materials used to make the soles of the shoes. C) the 36 pairs of shoes involved. D) the three months time that passed. 55. Which of the following would most improve the study? A) Have each male wear soles made from material A on one foot and soles made from material B on the other foot. B) Make the study double-blind. No one should know what type of sole material they wore. C) Have a control group that wears neither material A nor material B. D) Extend the study to six months so that there is more wear on the shoes. Page 17 56. Suppose each male was required to wear soles made with material A on one foot and soles made with material B on the other foot. Then, at the end of the three months, the researchers compared the wear on the soles for each of the 36 males. This would be an example of: A) a randomized block design. B) an observational study. C) a double-blind experiment with no potential for a placebo effect. D) a matched-pairs design. Page 18 Answer Key 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. C B C A D D B C B C B D A C A D A B D C C D B A A C C D C D A B B C A B A D B D B D C D Page 19 45. 46. 47. 48. 49. 50. 51. 52. 53. 54. 55. 56. B A C B A D D B C A A D Page 20