Chapter 4-9 Review Packet

advertisement
Name: __________________________ Date: _____________
1. The Columbus Zoo conducts a study to determine whether a household's income can be
used to predict the amount of money the household will give to the zoo's annual fund
drive. The response variable in this study is
A) the Columbus Zoo.
B) a household's income.
C) the amount of money a household gives to the zoo's annual fund drive.
D) all households in Columbus.
Use the following to answer questions 2-3:
The height (in feet) and volume of usable lumber (in cubic feet) of 32 cherry trees are measured
by a researcher. The goal is to determine if volume of usable lumber can be estimated from the
height of a tree. The results are plotted below.
2. In this study, the response variable is
A) height.
B) volume.
C) height or volume. It doesn't matter which is considered the response.
D) neither height nor volume. The measuring instrument used to measure height is the
response variable.
3. The scatterplot suggests
A) there is a positive association between height and volume.
B) there is an outlier in the plot.
C) both a and b.
D) neither a nor b.
Page 1
Use the following to answer questions 4-5:
I wish to determine the correlation between the height (in inches) and scoring average (points
per game) of women on a college basketball team. To do this I record the height and scoring
average of two players on the team. The values are
Player #1
Player #2
Height
70
75
Scoring Average
11.0
20.0
4. The correlation r computed from the measurements on these players is
A) 1.0.
B) positive and between 0.25 and 0.75.
C) near 0, but could be either positive or negative.
D) exactly 0.
5. The correlation r would have units
A) inches.
B) points.
C) inches-points.
D) no units. Correlation is a unitless quantity.
6. Which of the following statements is true?
A) the correlation coefficient equals the proportion of times two variables lie on a
straight line.
B) the correlation coefficient will be +1.0 only if all the data lie on a perfectly
horizontal straight line.
C) the correlation coefficient measures the fraction of outliers that appear in a
scatterplot.
D) the correlation coefficient is a unitless number and must always lie between 1.0
and +1.0, inclusive.
Page 2
7. Consider the following scatterplot.
The correlation between X and Y
A) is approximately 0.999.
B) is approximately 0.8.
C) is approximately 0.0.
D) cannot be computed because there is an outlier in the plot.
8. Consider the following scatterplot of two variables X and Y.
We may conclude
A) the correlation between X and Y must be close to 1 because there is nearly a perfect
relation between them.
B) the correlation between X and Y must be close to 1 because there is nearly a
perfect relation between them but it is not a straight-line relation.
C) the correlation between X and Y is close to 0.
D) the correlation between X and Y could be any number between 1 and +1. Without
knowing the actual values we can say nothing more.
Page 3
9. Which of the following is true of the correlation coefficient r?
A) It is a resistant measure of association.
B) 1  r  1.
C) If r is the correlation between X and Y, then r is the correlation between Y and X.
D) All of the above.
10. Consider the scatterplot below.
The form of the relationship represented in the plot is best described as
A) multi-linear.
B) negative association.
C) clusters.
D) all of the above.
Page 4
Use the following to answer questions 11-13:
Consider the following scatterplot of amounts of CO (carbon monoxide) and NOX (nitrogen
oxide) in grams per mile driven in the exhausts of cars. The least-squares regression line has
been drawn in the plot.
11. The intercept of the least-squares regression line is approximately
A) 0.7.
B) 1.7.
C) 2.0.
D) cannot be determined from the graph.
12. The least-squares line would predict that a car that emits 10 grams of CO per mile
driven would emit how many grams of NOX per mile driven?
A) 10.0.
B) 1.7.
C) 2.2.
D) 1.1.
13. The point indicated by the x has
A) a negative value for the residual.
B) a positive value for the residual.
C) a zero value for the residual.
D) a zero value for the correlation.
Page 5
Use the following to answer questions 14-16:
Using data on the appraised value of homes, a real estate agent computes the least-squares
regression line for predicting a home's value in 2002 from its value in 1992. The equation of the
least-squares regression line is
y = $22,000 + 1.6x
where y represents a home's value in 2002 and x is the value in 1992.
14. A home's value in 1992 is
A) the intercept.
B) the slope.
C) the explanatory variable.
D) the response variable.
15. Suppose Joe owns a home that was worth $100,000 in 1992. What would be the
predicted value of his home in 2002?
A) $182,000.
B) $160,000.
C) $122,000.
D) cannot be determined from the information given. We also need to know the
correlation.
16. Which of the following is true of the least-squares regression line?
A) The slope is the change in a home's value in 2002 that would be predicted by a $1
increase in a home's value in 1992.
B) It always passes through the point ( x , y ) the means of the values of home's (that
were used to compute the least-squares regression line) in 1992 and 2002,
respectively.
C) It will only pass through all the data points if r = ± 1.
D) All of the above.
Page 6
17. Babies typically learn to crawl approximately six months after birth. However, it may
take longer for babies to learn to crawl in the winter when they are often bundled in
clothes that restrict their movement. Thus there may be an association between a baby’s
crawling age and the average temperature during the month they first try to crawl.
Below are the average ages (in weeks) at which babies began to crawl for a sample of
babies born in each of the 12 months of the year. In addition, the average temperature
(in F) for the month that is six months after the birth month is also listed.
Birth month
January
February
March
April
May
June
July
August
September
October
November
December
Average crawling age
29.84
30.52
29.70
31.84
28.58
31.44
33.64
32.82
33.83
33.35
33.38
32.32
Average temperature
66
73
72
63
52
39
33
30
33
37
48
57
We want to investigate if the average age at which infants begin to crawl (Y) can be
predicted from the average outdoor temperature (X) six months after birth when they are
likely to begin crawling. We decide to fit a least-squares regression line to the data with
X as the explanatory variable and Y as the response variable. We compute the following
quantities.
r = correlation between X and Y = –0.7
x = mean of the values of X = 50.25
y = mean of the values of Y = 31.77
s x = standard deviation of the values of X = 15.85
s y = standard deviation of the values of Y = 1.76
The slope of the least-squares line is
A) –0.08.
B) 0.49.
C) –1.58.
D) 14.24.
Page 7
18. The correlation between the age and height of children is found to be about r = 0.7.
Suppose we use the age x of a child to predict the height y of the child. We conclude
A) the least-squares regression line of y on x would have a slope of 0.7.
B) the fraction of the variation in heights explained by the least-squares regression line
of y on x is 0.49.
C) about 70% of the time, age will accurately predict height.
D) height is generally 70% of a child's age.
19. Which of the following is correct?
A) The correlation r is the slope of the least-squares regression line.
B) The square of the correlation is the slope of the least-squares regression line.
C) The square of the correlation is the proportion of the data lying on the least-squares
regression line.
D) The mean of the residuals from least-squares regression is 0.
20. Suppose we fit the least-squares regression line to a set of data. Points with unusually
large values of the residuals are called
A) response variables.
B) the slope.
C) outliers.
D) correlated.
Use the following to answer question 21:
A medical researcher collects data relating drug dosage, in milligrams (x), and heart rate (y). The
regression equation for the data obtained is
yˆ  100  37.5 x
21. Which statement correctly interprets the slope of this equation?
A) For every unit increase in drug dosage, heart rate increases by 37.5, on average.
B) For every unit increase in drug dosage, heart rate increases by 100, on average.
C) For every unit increase in drug dosage, heart rate decreases by 37.5, on average.
D) For every unit increase in drug dosage, heart rate decreases by 62.5, on average.
22. Which of the following would be necessary to establish a cause-and-effect relation
between two variables?
A) Strong association between the variables.
B) An association between the variables is observed in many different settings.
C) The alleged cause is plausible.
D) All of the above.
Page 8
Use the following to answer question 23:
A study was conducted to investigate the relationship between the number of hours a student
spends studying for an exam, x, and his score on the exam, y.
Residuals Versus hours
(response is Score)
20
15
10
Residual
5
0
-5
-10
-15
-20
-25
0
5
10
15
20
25
hours
23. A student who spends 22 hours studying has a negative residual. This means
A) the student's actual exam score is higher than his predicted exam score.
B) the student's actual exam score is lower than his predicted exam score.
C) there must be a mistake; residuals cannot be negative.
24. A group of college students believes that herbal tea has remarkable restorative powers.
To test their theory they make weekly visits to a local nursing home, visiting with
residents, talking with them, and serving them herbal tea. The residents are evaluated
by a trained therapist at the beginning of the study for general health and mental state.
After several months of the visits, the residents are reevaluated by the therapist and it is
found that many of the residents are more cheerful and healthy. This is
A) an observational study.
B) an experiment, but not a double blind experiment.
C) a double blind experiment.
D) a block design.
Page 9
25. In order to investigate whether women are more likely than men to prefer Democratic
candidates, a political scientist selects a large sample of registered voters, both men and
women. She asks every voter whether they voted for the Republican or the Democratic
candidate in the last election. This is
A) an observational study.
B) a multistage sample.
C) a double blind experiment.
D) a block design.
Use the following to answer questions 26-28:
In order to assess the opinion of students at the Ohio State University on campus safety, a
reporter for the student newspaper interviews 15 students he meets walking on the campus late at
night who are willing to give their opinion.
26. The sample is
A) all those students walking on campus late at night.
B) all students at universities with safety issues.
C) the 15 students interviewed.
D) all students approached by the reporter.
27. The method of sampling used is
A) simple random sampling.
B) the Gallup Poll.
C) voluntary response.
D) a census.
28. The sample obtained is
A) a simple random sample of students feeling safe.
B) a stratified random sample of students feeling safe.
C) a probability sample of students with night classes.
D) probably biased.
Page 10
Use the following to answer questions 29-30:
You need to select a simple random sample of size three from the following employees of a
small company
1. Berliner
2. Blumenthal
3. MacEachern
4. Wolfe
5. Stasny
6. Santner
7. Verducci
8. Lin
9. Critchlow
using the numerical labels attached to the names above and the following list of random digits.
Read the list of random digits from left to right, starting at the beginning of the list.
11793 20495 05907 11384 44982 20751 27498 12009 45287 71753 98236 66419 84533
29. The simple random sample is
A) 117.
B) Berliner and Verducci.
C) Berliner, Verducci, and Critchlow.
D) Berliner, Verducci, and MacEachern.
30. Which of the following statements is true?
A) If we used another list of random digits to select the sample, we would get the same
result as obtained with the list actually used.
B) If we used another list of random digits to select the sample, we would get a
completely different sample than that obtained with the list actually used.
C) If we used another list of random digits to select the sample, we would get at most
one name in common with that obtained with the list actually used.
D) If we used another list of random digits to select the sample, the result obtained
with the list actually used would be just as likely to be selected as any other set of
three names.
31. A stratified random sample corresponds to which of the following experimental
designs?
A) a block design
B) a double blind experiment
C) an experiment with a placebo
D) a confounded, nonrandomized study
Page 11
32. A 1992 Roper poll found that 22% of Americans say that the Holocaust may not have
happened. The actual question asked in the poll was
“Does it seem possible or impossible to you that the Nazi extermination of the Jews
never happened?"
and 22% responded possible. The results of this poll cannot be trusted because
A) undercoverage is present. Obviously those people who did not survive the
Holocaust could not be in the poll.
B) the question is worded in a confusing manner.
C) we do not know who conducted the poll or who paid for the results.
D) nonresponse is present. Many people will refuse to participate, and those that do
will be biased in their opinions.
33. In order to take a sample of 90 members of a local gym, I first divide the members into
men and women, and then take a simple random sample of 45 men and a separate
simple random sample of 45 women. This is an example of a
A) block design.
B) stratified random sample.
C) double blind simple random sample.
D) randomized comparative experiment.
34. In order to select a sample of undergraduate students in the United States, I select a
simple random sample of four states. From each of these states, I select a simple
random sample of two colleges or universities. Finally, from each of these eight
colleges or universities, I select a simple random sample of 20 undergraduates. My final
sample consists of 160 undergraduates. This is an example of
A) simple random sampling.
B) stratified random sampling.
C) multistage sampling.
D) convenience sampling.
Use the following to answer questions 35-37:
A Senator wants to know what the voters of his state think of proposed legislation on gun
control. He mails a questionnaire on the subject to an SRS of 2500 voters in his state. His staff
reports that 448 questionnaires have been returned, 343 of which support the legislation.
Page 12
35. The population is
A) the voters in his state.
B) the 448 letters received.
C) the 343 letters supporting the legislation.
D) the 2500 voters receiving the questionnaire.
36. The sample is
A) the voters in his state.
B) the 448 letters received.
C) the 343 letters supporting the legislation.
D) the 2500 voters receiving the questionnaire.
37. This is an example of
A) a survey containing nonresponse.
B) a survey with little bias because a large SRS was used.
C) a survey with little bias because it was the voters who elected the senator.
D) all of the above.
Use the following to answer questions 38-39:
A state group is interested in whether voters would support lowering the legal drinking age to 18.
A simple random sample of 25 voters is selected from each county in the state and asked whether
they approve of lowering the drinking age.
38. This is an example of
A) systematic sampling.
B) multistage sampling.
C) a simple random sample.
D) stratified sampling.
39. The population of interest is
A) the 25 voters selected in each county.
B) voters in the state.
C) all adults in the state.
D) 18-year-olds in the state.
Page 13
Use the following to answer questions 40-43:
Sickle cell disease is a painful disorder of the red blood cells that affects mostly blacks in the
United States. To investigate whether the drug hydroxyurea can reduce the pain associated with
sickle cell disease, a study by the National Institute of Health has eight subjects available with
sickle cell disease. The researchers are going to give the drug to four of the subjects and a
placebo to the other four. They will then count the number of episodes of pain reported by each
subject.
40. The names of the eight subjects are given below.
1) Berliner
2) Costello
3) Duvall
4) Fan
5) House
6) Long
7) Pavlicova
8) Tang
Using the list of random digits,
81507 27102 56027 55892 33063 41842 81868 71035 09001 43367 49497
starting at the beginning of this list and using single digit labels, you assign the first four
subjects selected to receive the drug hydroxyurea, while the remainder receive the
placebo. The subjects assigned to the placebo are
A) Berliner, Costello, Duvall, and Fan.
B) Berliner, House, Pavlicova, and Tang.
C) House, Long, Pavlicova, and Tang.
D) Costello, Duvall, Fan, and Long.
41. The response is
A) the drug hydroxyurea.
B) the number of episodes of pain.
C) the presence of sickle cell disease.
D) the number of red blood cells.
42. The experimental units are
A) the subjects who experienced the most episodes of pain.
B) the four subjects who received the drug hyroxyurea.
C) the four subjects who received the placebo.
D) the eight subjects.
Page 14
43. This is an example of
A) a matched pairs experiment.
B) a two-factor design.
C) a randomized comparative experiment.
D) a randomized observational study.
44. Which of the following is not a major principle of experimental design?
A) comparative experimentation
B) replication
C) randomization
D) segmentation
Use the following to answer questions 45-46:
In a recent article it was reported that in a random sample of children in grades 2–4, a significant
negative relationship was found between the amount of homework assigned and student
attitudes.
45. This is an example of
A) an experiment.
B) an observational study.
C) the establishing of a causal relationship through correlation.
D) a block design, with grades as blocks.
46. The amount of homework assigned is
A) an explanatory variable.
B) a response variable.
C) a confounding variable.
D) none of the above.
Use the following to answer questions 47-48:
Researchers wish to determine if a new experimental medication will reduce the symptoms of
allergy sufferers without the side-effect of drowsiness. To investigate this question, the
researchers give the new medication to 50 adult volunteers who suffer from allergies. Forty-four
of these volunteers report a significant reduction in their allergy symptoms without any
drowsiness.
Page 15
47. This study could be improved by
A) including people who do not suffer from allergies in the study in order to represent
a more diverse population.
B) repeating the study with only the 44 volunteers who reported a significant reduction
in their allergy symptoms without any drowsiness, and giving them a higher dosage
this time.
C) using a control group.
D) all of the above.
48. The experimental units are
A) the researchers.
B) the 50 adult volunteers.
C) the 44 volunteers who reported a significant reduction in their allergy symptoms
without any drowsiness.
D) the six volunteers who did not report a significant reduction in their allergy
symptoms without any drowsiness.
Use the following to answer questions 49-51:
A marketing experiment compares four different types of packaging for blank computer CDs.
Each type of packaging can be presented in three different colors. Each combination of package
type with a particular color is shown to 40 potential customers, who rate the overall
attractiveness on a scale of 1 to 7.
49. The experimental units in this experiment are
A) the potential customers.
B) the measure of attractiveness.
C) type of packaging and color.
D) the three different colors.
50. The factors are
A) the rating scale and package combination.
B) the three different colors.
C) the potential customers.
D) type of packaging and color.
Page 16
51. The number of treatments is
A) 3.
B) 4.
C) 7
D) 12.
Use the following to answer questions 52-56:
A researcher would like to see if there is a difference in the durability of two types of material
used for making soles of shoes. Thirty-six males volunteered to participate in the study.
Eighteen males were randomly assigned to wear shoes made with material A for three months,
and the remaining eighteen males wore shoes made with material B for three months. At the end
of the three months, the researchers measured the amount of wear on the two groups' shoes.
52. This is an example of:
A) an observational study.
B) a randomized comparative experiment.
C) a double-blind block design.
D) a matched-pairs design.
53. The factor in the study was:
A) the 36 males.
B) the amount of wear after three months.
C) the materials used to make the soles of the shoes.
D) the three months time that passed.
54. The units were:
A) the 36 males.
B) the materials used to make the soles of the shoes.
C) the 36 pairs of shoes involved.
D) the three months time that passed.
55. Which of the following would most improve the study?
A) Have each male wear soles made from material A on one foot and soles made from
material B on the other foot.
B) Make the study double-blind. No one should know what type of sole material they
wore.
C) Have a control group that wears neither material A nor material B.
D) Extend the study to six months so that there is more wear on the shoes.
Page 17
56. Suppose each male was required to wear soles made with material A on one foot and
soles made with material B on the other foot. Then, at the end of the three months, the
researchers compared the wear on the soles for each of the 36 males. This would be an
example of:
A) a randomized block design.
B) an observational study.
C) a double-blind experiment with no potential for a placebo effect.
D) a matched-pairs design.
Page 18
Answer Key
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
C
B
C
A
D
D
B
C
B
C
B
D
A
C
A
D
A
B
D
C
C
D
B
A
A
C
C
D
C
D
A
B
B
C
A
B
A
D
B
D
B
D
C
D
Page 19
45.
46.
47.
48.
49.
50.
51.
52.
53.
54.
55.
56.
B
A
C
B
A
D
D
B
C
A
A
D
Page 20
Download