Chapter 4

Section 4.1
Check Your Understanding, page 211:
1. The company inspector is using a convenience sample. This could lead him to think that the oranges
are of better quality than they really are, if the farmer puts the best oranges on top.
2. Nightline was using a voluntary response sample. Only those who feel particularly strongly about the
issue are likely to respond. In this case, those who are happy that the United Nations has its headquarters
in the US already have what they want and so are less likely to worry about responding to the question.
Check Your Understanding, page 219:
1. It might be difficult to give a survey to an SRS of 200 fans because you would have to identify 200
different seats, go to those seats in the arena and find the people who are sitting there. This means going
to 200 different locations throughout the arena, which would take time. There is also the problem that
people are not always in their seats throughout the game and not all seats will necessarily be occupied in
any given game.
2. For a stratified sample, it would be better to take the lettered rows as the strata because each lettered
row is the same distance from the court and so would contain only seats with the same (or nearly the
same) ticket price. This means that all people in any given stratum would have paid roughly the same
amount for their tickets.
3. For a cluster sample, it would be better to take the numbered sections as the clusters because they
include all different seat prices. Each section contains seats with many different ticket prices so the
people in a section would mirror the characteristics of the population as a whole.
Check Your Understanding, page 224:
1. (a) Using the telephone directory as a sampling frame is an example of a sampling error. This will
result in undercoverage because those who are not listed in the phone book (those who do not have a
phone or have only a cell phone) do not have the opportunity to be chosen. (b) If the person cannot be
contacted, this is an example of a nonsampling error. This did not occur because of the way the sample
was chosen, but rather was an effect of the way the survey was administered. (c) If you choose to
interview people walking past you on a sidewalk, this is a sampling error. Who you find will depend on
where (in what neighborhood, etc.) you are standing. This has to do with how you choose your sample.
2. This question will result in fewer people suggesting that we should ban disposable diapers by making
it sound like they are not a problem in the landfill. The author of the question highlights several other
items that take up more space in the landfill, which makes it look like disposable diapers are really not a
problem.
Exercises, page 226:
4.1 The population is (all) local businesses. The sample is the 73 businesses that return the
questionnaire.
4.2 The population is all the artifacts discovered at the dig. The sample is those artifacts (2% of the
population) that are chosen for inspection.
4.3 The population is the 1000 envelopes stuffed during a given hour. The sample is the 40 envelopes
selected.
Chapter 4: Designing Studies
4.4 The population is all 45,000 people who made credit card purchases. The sample is the 137 people
who returned the survey form.
4.5 Only persons with a strong opinion on the subject, strong enough that they are willing to spend the
time and money, will respond to this advertisement.
4.6 Letters to legislators are an example of a voluntary response sample—the proportion of letters
opposed to the insurance should not be assumed to be a fair representation of the attitudes of the
congresswoman’s constituents. Only those who have very strong opinions will write in.
4.7 This is a voluntary response sample. It is biased in favor of those who feel most strongly about the
issue being surveyed.
4.8 (a) A voluntary response sample. (b) It is biased toward readers who feel most strongly about the
issue. 85% is probably higher than the true percent because it is likely that readers who feel most strongly
about this issue have in some way been involved in an accident caused by cell phone use while driving.
4.9 (a) A convenience sample. (b) It is unlikely that the first 100 students to arrive at school are
representative of the student population in general. 7.2 hours is probably higher than the true mean since you might expect
that the students who arrive first are those who got a good night’s sleep the night before. Students who
received less sleep the night before are probably more likely to run late the next morning.
4.10 This is a convenience sample. It is easy to find lots of people in a mall. However, it is likely to give
a higher percentage for the unemployment figure because the unemployed have more time to be at the
mall than those who are employed.
4.11 (a) Number the 40 students from 01 to 40 alphabetically. Go to the random number table and pick
a starting point. Record two-digit numbers, skipping any that aren’t between 01 and 40 or are repeats,
until you have 5 unique numbers between 01 and 40. (b) Starting at line 107 we read off the following
numbers: 82 (ignore) 73 (ignore) 95 (ignore) 78 (ignore) 90 (ignore) 20 80 (ignore) 74 (ignore) 75
(ignore) 11 81 (ignore) 67 (ignore) 65 (ignore) 53 (ignore) 00 (ignore) 94 (ignore) 38 31 48 (ignore) 93
(ignore) 60 (ignore) 94 (ignore) 07. So we have picked: Johnson (20), Drasin (11), Washburn (38), Rider
(31), and Calloway (07).
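The table-reading procedure in 4.11 can be sketched in Python. This is only an illustration: the function name `pick_labels` and its parameters are hypothetical, and the digit pairs are the ones listed in part (b) rather than a full copy of Table D.

```python
def pick_labels(digits, width, low, high, n):
    """Read fixed-width groups from a digit string and keep the first n
    distinct labels that fall between low and high (inclusive)."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip labels outside the range and labels already chosen.
        if low <= label <= high and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen


# Two-digit groups read off in 4.11(b), starting at line 107.
pairs = "82 73 95 78 90 20 80 74 75 11 81 67 65 53 00 94 38 31 48 93 60 94 07"
digits = pairs.replace(" ", "")

picked = pick_labels(digits, width=2, low=1, high=40, n=5)
print(picked)  # [20, 11, 38, 31, 7]
```

The same function would handle four- or five-digit labels by changing `width`, `low`, and `high`.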
4.12 (a) Number the 33 complexes from 01 to 33 alphabetically. Go to the random number table and
pick a starting point. Record two-digit numbers, skipping any that aren’t between 01 and 33 or are
repeats, until you have 3 unique numbers between 01 and 33. (b) Starting at line 117 we read off the
following numbers: 38 (ignore) 16 79 (ignore) 85 (ignore) 32 62 (ignore) 18. So we have picked:
Fairington (16), Waterford Court (32) and Fowler (18).
4.13 (a) Number the plots from 0001 to 1410. Go to the random number table and pick a starting point.
Record four-digit numbers, skipping any that aren’t between 0001 and 1410 or are repeats, until you have
141 unique numbers between 0001 and 1410. (b) Starting at line 131 we read off the following numbers:
0500 7166 3281 1941 4873 0419 7855 7645 1959 6565 6873 2552 5984 2920 8796 4316 5937 3931 6859
7150 4574 0418. Only 0500, 0419 and 0418 fall between 0001 and 1410; ignore the rest. So the first three plots in our sample are plots 0500,
0419 and 0418.
4.14 (a) Number the gravestones from 00001 to 55914. Go to the random number table and pick a
starting point. Record 5-digit numbers, skipping any that aren’t between 00001 and 55914 or are repeats,
until you have 395 unique numbers between 00001 and 55914. (b) Starting at line 127 we read off the
The Practice of Statistics for AP*, 4/e
following numbers: 43909 99477 25330 64359 40085 (ignore 99477 and 64359, which are greater than 55914). So the first three
gravestones picked are those numbered 43909, 25330 and 40085.
4.15 If you always begin at the same place, then the results would not be random. You would end up
using the same sample in every case.
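The point of 4.15 has a direct software analogue. In the hedged sketch below, a fixed seed plays the role of "always beginning at the same place" in the table; the seed value 107 and the 40 labels are arbitrary choices for illustration.

```python
import random

labels = list(range(1, 41))  # 40 labelled individuals (illustrative size)

# Two generators started from the same seed behave like two readers who
# both begin at the same line of the random digit table.
first = random.Random(107).sample(labels, 5)
second = random.Random(107).sample(labels, 5)

print(first == second)  # True: the two samples are identical
```

Varying the starting point (or the seed) is what makes repeated samples differ.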
4.16 (a) False—if it were true, then after looking at 39 digits, we would know whether or not the 40th
digit was a 0. (b) True—there are 100 pairs of digits 00 through 99, and all are equally likely. (c) False—
0000 is just as likely as any other string of four digits.
4.17 (a) Assuming none of the phones can be shipped until after the inspection, inspecting a random
sample of 20 phones could hold up the shipping process. Additionally, in order to obtain a random
sample, the phones must be numbered in some way. Keeping track of the ordering of 1000 phones may
be difficult. (b) It is possible that the quality of the phones produced changes over the course of the day
so that the last phones manufactured are not representative of the day’s production. (c) This is not an
SRS because each sample of 20 phones does not have the same probability of being selected. In fact, the
20 phones that are sampled will be the 50th, 100th, …, 1000th; the others have no chance of being sampled.
4.18 (a) To obtain an SRS, every tree would need to have an equal chance of being included in the
sample. It is not practical to even identify every tree in the park. (b) This sampling method is biased
because these trees are unlikely to be representative of the population. Trees along the main road are
more likely to be damaged by cars and people and may be more susceptible to infestation. (c) The
scientists can be confident that the actual percentage of pine trees in the area that are infected by the pine
beetle is near 35% although there is always some error associated with using sampling to estimate
population parameters.
4.19 Assign 01 to 30 to the students (in alphabetical order). Starting on line 123 gives 08-Ghosh, 15-Jones, 07-Fisher, and 27-Shaw. Assigning 0–9 to the faculty members gives 1-Besicovitch and 0-Andrews.
4.20 Label the 500 midsize accounts from 001 to 500, and the 4400 small accounts from 0001 to 4400.
Starting at line 115, the first five accounts in each stratum are 417, 494, 322, 247, and 097 for the midsize
group, then 3698, 1452, 2605, 2480, and 3716 for the small group.
4.21 (a) Use the three types of seats (sideline, corner and end zone) as the three strata since ticket prices
will be similar within each stratum but different between the three strata. (b) It might be easier to obtain
a cluster sample because a stratified random sample will still likely choose seats all over the stadium,
which would make it very time consuming to get to everyone. A cluster sample would be easier to
obtain, because there would be many people sitting all together who would be part of the sample. In this
case one would use the section numbers for the clusters.
4.22 (a) Using a stratified random sample would assure the manager that he got opinions from each type
of room. Use each type of view as a stratum and randomly pick 60 guests from each stratum. (b) A
cluster sample would be a simpler option because someone could just slip the forms under a specific
pattern of doors. All rooms ending in a specific number would be the clusters. For example, all rooms
ending in 16. Presumably these are all stacked above each other on the 30 floors. The manager should
just pick two random numbers that represent rooms on the water side and two random numbers that
represent rooms on the golf course side and then survey all 30 rooms (one per floor) that end in that
number.
4.23 It is not an SRS, because some samples of size 250 have no chance of being selected (e.g., a sample
containing 250 women).
4.24 The chance of being interviewed is 3/30 for students over age 21 and 2/20 for students under age 21.
This is 1/10 in both cases. It is not an SRS because not all combinations of students have an equal chance
of being interviewed. For instance, groups of 5 students all over age 21 have no chance of being
interviewed.
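The argument in 4.24 can be checked by counting. The sketch below assumes the stratum sizes implied by the fractions in the answer (30 students over 21, 20 under 21); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

over_21, under_21 = 30, 20      # stratum sizes implied by 3/30 and 2/20
take_over, take_under = 3, 2    # interviews drawn from each stratum

# Every student has the same chance of being interviewed ...
p_over = Fraction(take_over, over_21)
p_under = Fraction(take_under, under_21)

# ... but the stratified design can only produce samples with exactly
# 3 older and 2 younger students, a subset of all samples of size 5.
stratified_samples = comb(over_21, take_over) * comb(under_21, take_under)
srs_samples = comb(over_21 + under_21, take_over + take_under)

print(p_over, p_under)                  # 1/10 1/10
print(stratified_samples, srs_samples)  # 771400 2118760
```

Since fewer than half of the possible samples of size 5 can occur under this design, it is not an SRS even though individual chances are equal.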
4.25 (a) This is cluster sampling. (b) Answers will vary. Label each block from 01 through 65.
Beginning at line 142, record two-digit numbers, skipping any that aren’t between 01 and 65 or are
repeats. The 5 identified blocks are 02, 32, 26, 34, and 08. The statistical applet selected blocks 10, 20,
45, 36, and 60.
4.26 (a) Split the 200 addresses into 5 groups of 40 each. Looking for 2-digit numbers from 01 to 40, the
table gives 35 so the systematic random sample consists of 35, 75, 115, 155, and 195. (b) Every address
has a 1-in-40 chance of being selected, but not every subset has an equal chance of being picked—for
example, 01, 02, 03, 04, and 05 cannot be selected by this method.
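Both parts of 4.26 can be verified with a short sketch. The helper name `systematic_sample` is hypothetical; the starting value 35 and the step of 40 come from the answer above.

```python
def systematic_sample(start, step, population_size):
    """Labels start, start + step, ... up to population_size."""
    return list(range(start, population_size + 1, step))


sample = systematic_sample(start=35, step=40, population_size=200)
print(sample)  # [35, 75, 115, 155, 195]

# Enumerate every sample this method can produce, one per starting value.
all_samples = [systematic_sample(s, 40, 200) for s in range(1, 41)]

# Count how many of those samples each address appears in.
appearances = {address: 0 for address in range(1, 201)}
for one_sample in all_samples:
    for address in one_sample:
        appearances[address] += 1

print(set(appearances.values()))     # {1}: each address is in exactly 1 of 40 samples
print([1, 2, 3, 4, 5] in all_samples)  # False
```

Only 40 distinct samples are possible, which is why the method gives equal individual chances without being an SRS.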
4.27 Households without telephones or with unlisted numbers are omitted from this frame. Such
households would likely be made up of poor individuals (who cannot afford a phone), those who choose
not to have phones, and those who do not wish to have their phone number published. If the variable
being measured tends to have different values for those excluded by this sampling method, then our
sample result would be off in a particular direction from the truth about the population of households.
4.28 This will miss those who do not have telephones. This means that we will likely
underrepresent the poor in our sample.
4.29 (a) You are sampling only from the lower priced ticket holders. (b) This is a sampling error. The
sampling frame differs from the population of interest (undercoverage).
4.30 (a) Nonsampling error. People may lie in response to questions about past drug use. It is not an
error due to the act of taking a sample, rather it is a response error. (b) Nonsampling error. This is an
example of a processing error. (c) Sampling error. This will suffer from the same forms of bias as any
voluntary response survey.
4.31 (a) The response rate was 5,029/45,956 = 0.1094, so the nonresponse rate is 1 − 0.1094 = 0.8906 or
89.1%. (b) It is likely that the high amount of nonresponse gave the researchers a lower mean number of
miles driven because those who drive more are at home less to answer the phone.
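The arithmetic in 4.31(a) can be checked as follows. The variable names reflect an assumed reading of the two counts (5,029 respondents out of 45,956 people sampled); the answer itself gives only the numbers.

```python
responded = 5029    # assumed meaning: people who responded
sampled = 45956     # assumed meaning: people in the sample

response_rate = responded / sampled
nonresponse_rate = 1 - response_rate

print(round(response_rate, 4))           # 0.1094
print(round(nonresponse_rate, 4))        # 0.8906
print(round(100 * nonresponse_rate, 1))  # 89.1
```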
4.32 The higher no-answer rate was probably the second period—when families are likely to be
vacationing or spending time outdoors. A high rate of nonresponse makes sample results less reliable
because you don’t know how these individuals would have responded. It is very risky to assume that they
would have responded exactly the same way as those individuals who did respond.
4.33 More than 171 respondents have run red lights. We would not expect very many people to claim
they have run red lights when they have not, but some people will deny running red lights when they
have.
4.34 People likely claim to wear their seat belts because they know they should; they are embarrassed or
ashamed to say that they do not always wear seat belts. Such bias is likely in most surveys about seat belt
use (and similar topics).
4.35 (a) The wording is clear. The question is slanted in favor of warning labels. (b) The question is
clear, but it is clearly slanted in favor of national health insurance by asserting it would reduce
administrative costs. (c) The wording is too technical for many people to understand—and for those who
do understand the question, it is slanted because it suggests reasons why one should support recycling. It
could be rewritten to something like: “Do you support economic incentives to promote recycling?”
4.36 (a) The question is clear, but the two options presented are too extreme; no middle position on gun
control is allowed. Many students may suggest that this question is likely to elicit more responses against
gun control (that is, more people will choose 2). (b) The question is so complicated that it isn’t clear. It is
also slanted; the phrasing of this question will tend to make people respond in favor of a nuclear freeze.
Only one side of the issue is presented.
4.37 c
4.38 d
4.39 d
4.40 c
4.41 e
4.42 c
4.43 The predicted sleep debt for a 5-day school week, based on the least-squares regression equation, is
2.23 + 3.17(5) = 18.08 hours, a little more than 3 hours greater than what was found in the research study.
Based on their collected data, the students have reason to be skeptical of the research study’s reported
results.
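The prediction in 4.43 is a single evaluation of the regression line. A minimal sketch, with the intercept and slope taken from the answer and a hypothetical function name:

```python
def predicted_sleep_debt(days, intercept=2.23, slope=3.17):
    """Predicted sleep debt (hours) from the least-squares line in 4.43."""
    return intercept + slope * days


prediction = predicted_sleep_debt(5)
print(round(prediction, 2))  # 18.08
```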
4.44 (a) The 95th percentile is the amount of bandwidth below which 95 percent of all 5-minute
measurements fall. (b) The method using the 98th percentile would cost the company more because it
would suggest a higher usage of bandwidth by the company.
Section 4.2
Check Your Understanding, page 233:
1. This was an experiment because a treatment (brightness of screen) was imposed on the laptops.
2. This was an observational study. Students were not assigned to a particular number of meals to eat
with their family per week.
3. The explanatory variable is the number of meals per week eaten with their family and the response
variable is probably their GPA (or some other measure of their grades).
4. This is an observational study and there may well be lurking variables that are actually influencing the
response variable. For instance, families that eat more meals together may also be families where the
parents show more interest in their children’s education and therefore help them to do better in school.
Check Your Understanding, page 240:
1.
2. Using an alphabetical list of the students, assign each student a number between 01 and 29. Pick a line
of the random number table and read off two-digit numbers, skipping repeats and any number outside 01 to
29, until you have 15 distinct numbers. These students belong in the treatment group where students will
meet in small groups. The other students will view the videos alone.
3. The purpose of the control group is to have a group to compare to. Presumably the students have been
evaluating their own performances by themselves before. If you incorporate such a group into your
experiment, you can evaluate if the group work is actually better.
Check Your Understanding, page 244:
1. No, this experiment did not take the placebo effect into account. It is possible that women who
“thought” they were getting an ultrasound would have different reactions to pregnancy than those who
knew that they hadn’t received an ultrasound.
2. This experiment was not double-blind. While the people weighing the babies at birth may not have
known whether that particular mother had an ultrasound or not, the mothers did know whether they had
had an ultrasound or not. This means that the mothers may have affected the outcome since they knew
whether they had received the treatment or not.
3. An improved design would have been one in which all mothers were treated as if they had an
ultrasound, but for some mothers the ultrasound machine just wasn’t turned on (but this fact would not be
obvious to the woman). This means that the ultrasound would have to have been done in such a way so
that the woman could not see the screen.
Exercises, page 253:
4.45 (a) This was an observational study because no treatment was imposed on the mothers. The
researchers simply asked them to report both their chocolate consumption and their babies’ temperament.
(b) The explanatory variable is the mother’s chocolate consumption and the response variable is the
baby’s temperament. (c) No, this study is an observational study so we cannot make a conclusion of
cause and effect. There could be a lurking variable that is actually causing the difference in temperament.
4.46 (a) This was an observational study because no treatment was imposed on the children. The
researchers simply followed them through their 6th year in school, asking adults to rate their behavior at
several times along the way. (b) The explanatory variable was the amount of time in child care from
birth to age four-and-a-half. The response variable was the adult ratings of their behavior. (c) No, this
study is an observational study so we cannot make a conclusion of cause and effect. There could be a
lurking variable that is actually causing the difference in adult ratings of their behavior.
4.47 (a) This was an experiment because students were randomly assigned to the different teaching
methods. (b) Since this was an experiment with proper randomization, the teacher can conclude that
using the computer animation appears to result in higher increases in test scores.
4.48 (a) This is an example of an observational study. The researchers did not assign people to either use
or not use cell phones. (b) No, this study is an observational study so we cannot make a conclusion of
cause and effect.
4.49 One possible lurking variable would be private versus public schools. Private schools tend to have
smaller classes, and private school students might tend to earn higher scores. There might be something
else about the private schools, however, that leads to that success other than the small class sizes. So final
success could be dependent on either of these two variables.
4.50 One possible lurking variable is level of academic motivation. Those who drink may have less
academic motivation leading to lower grades. So if students do not do well, we are not sure if it is
because of the alcohol itself or if it is simply a matter of lower level of academic motivation.
4.51 Experimental units: pine seedlings. Explanatory variable: Light intensity. Treatments: full light,
25% light and 5% light. Response variable: dry weight at the end of the study.
4.52 Subjects: The students living in the selected dormitory. Explanatory variable: The rate structure.
Treatments: Paying one flat rate, or paying peak/off-peak rates. Response variables: The
amount and time of use and total network use.
4.53 Subjects: the individuals who were called. Explanatory variables: (1) information provided by
interviewer; (2) whether caller offered survey results. Treatments: (1) giving name/no survey results; (2)
identifying university/no survey results; (3) giving name and university/no survey results; (4) giving
name/offer to send survey results; (5) identifying university/offer to send survey results; (6) giving name
and university/offer to send survey results. Response variable: whether or not the interview was
completed.
4.54 Experimental units: middle schools. Explanatory variables: whether physical activity program was
offered and whether nutrition program was offered. Treatments: (1) activity intervention; (2) nutrition
intervention; (3) both interventions; (4) neither intervention. Response variables: physical activity and
lunchtime consumption of fat.
4.55 Experimental units: fabric specimens. Explanatory variables: (1) roller type; (2) dyeing cycle time;
(3) temperature. Treatments: (1) metal, 30 minutes, 150 degrees; (2) natural, 30 minutes, 150 degrees;
(3) metal, 40 minutes, 150 degrees; (4) natural, 40 minutes, 150 degrees; (5) metal, 30 minutes, 175
degrees; (6) natural, 30 minutes, 175 degrees; (7) metal, 40 minutes, 175 degrees; (8) natural, 40 minutes,
175 degrees. Response variable: a quality measurement.
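When an experiment has several explanatory variables, the treatments are all combinations of their levels. The sketch below rebuilds the eight treatments of 4.55 with `itertools.product`; the variable names are illustrative, and the loop order is chosen only to match the numbering used in the answer.

```python
from itertools import product

rollers = ["metal", "natural"]
cycle_minutes = [30, 40]
temperatures = [150, 175]

# product() varies its last argument fastest, so listing temperature first
# and roller last reproduces the order (1) through (8) given in 4.55.
treatments = [
    (roller, minutes, degrees)
    for degrees, minutes, roller in product(temperatures, cycle_minutes, rollers)
]

for number, (roller, minutes, degrees) in enumerate(treatments, start=1):
    print(f"({number}) {roller}, {minutes} minutes, {degrees} degrees")
```

With 2 levels for each of 3 factors there are 2 × 2 × 2 = 8 treatments; the 2 × 3 design in 4.56 gives 6 in the same way.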
4.56 Subjects: students. Explanatory variables: (1) Step height; (2) metronome pace. Treatments: (1)
5.75 inches, 14 steps/minute; (2) 5.75 inches, 21 steps/minute; (3) 5.75 inches, 28 steps/minute; (4) 11.5
inches, 14 steps/minute; (5) 11.5 inches, 21 steps/minute; (6) 11.5 inches, 28 steps/minute. Response
variable: increase in heart rate.
4.57 There was no control group for comparison purposes. We don’t know if this was a placebo effect or
if the flavonol actually affected the blood flow.
4.58 There was no control group for comparison purposes this year. Over a year, many things can
change: the state of the economy, hiring costs (due to an increasing minimum wage or the cost of
employee benefits), etc. In order to draw conclusions, we would need to make the $500 bonus offer to
some people and not to others during the same time frame, and compare the two groups.
4.59 (a) Write all names on slips of paper, put them in a container and mix thoroughly. Pull one slip out
and note the name on it. That person gets assigned treatment 1. Pull another name out and assign that
person to treatment 2. The third person gets assigned treatment 3. Keep rotating through the treatments
until all have been assigned. (b) Assign the students numbers between 001 and 120. Pick a spot on
Table D and read off the first 40 numbers between 001 and 120, skipping any that aren’t between 001 and
120 or are repeats. These are assigned to treatment 1. The next 40 numbers read are assigned to
treatment 2. The remaining are assigned to treatment 3. (c) Assign the students numbers between 001
and 120. Using the RandInt function on the calculator, and ignoring all repeats, assign the first 40
numbers chosen to treatment 1, the next 40 to treatment 2, and so on.
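The random assignment described in 4.59 can also be carried out in software. The sketch below is one possible approach, not the method the exercise requires: it shuffles the 120 labels and cuts the shuffled list into three groups of 40. The function name `assign_groups` and the seed are hypothetical.

```python
import random


def assign_groups(labels, group_size, seed=None):
    """Shuffle the labels and cut the shuffled list into consecutive groups."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return [
        shuffled[i:i + group_size]
        for i in range(0, len(shuffled), group_size)
    ]


students = range(1, 121)  # labels 001 to 120
groups = assign_groups(students, group_size=40, seed=2024)

for number, group in enumerate(groups, start=1):
    print(f"Treatment {number}: {len(group)} students, first few {sorted(group)[:5]}")
```

Shuffling handles repeats automatically, since each label appears exactly once in the shuffled list. Calling `assign_groups(range(1, 151), group_size=25)` would give six groups of 25 in the same way.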
4.60 (a) Write all names on slips of paper, put them in a container and mix thoroughly. Pull one slip out
and note the name on it. That person gets assigned treatment 1. Pull another name out and assign that
person to treatment 2. The third person gets assigned treatment 3. Keep rotating through the treatments
until all have been assigned. (b) Assign the students numbers between 001 and 150. Pick a spot on Table
D and read off the first 25 numbers between 001 and 150, skipping any that aren’t between 001 and 150
or are repeats. These are assigned to treatment 1. The next 25 numbers read are assigned to treatment 2.
Keep doing this until all people have been assigned to one of the 6 treatments. (c) Assign the students
numbers between 001 and 150. Using the RandInt function on the calculator, and ignoring all repeats,
assign the first 25 numbers chosen to treatment 1, the next 25 to treatment 2, and so on.
4.61 (a) This type of design is called a completely randomized design. The outline is given below:
(b) Write the names of the patients on 36 identical slips of paper, put them in a hat, and mix them well. Draw out 9
slips. The corresponding patients will be in Group 1. Draw out 9 more slips. Those patients will be in Group 2.
The next 9 slips drawn will be the patients in Group 3, and the remaining 9 patients will be assigned to Group 4.
4.62 (a) This is a completely randomized design. The outline is given below:
(b) Assign the plots the labels 01 through 18. Write the labels on 18 identical slips of paper, put them in
a hat, and mix them well. Draw out 6 slips. The corresponding plots will be in Group 1. Draw out 6
more slips. These plots will be in Group 2, and the remaining 6 plots will be in Group 3.
4.63 (a) Expense, condition of the patient, etc. In a serious case, when the patient has little chance of
surviving, a doctor might choose not to recommend surgery; it might be seen as an unnecessary measure,
bringing expense and a hospital stay with little benefit to the patient. (b) Randomly assign the patients to
two groups of 150 each. One group will receive the traditional surgery and the other group will receive
the new method of treatment. At the end of the study, measure how many patients survived.
4.64 (a) Comparing this year to last year would not be a good idea because there may be lurking
variables that have changed over time. (b) Randomly divide the 120 rural schools into two groups. In
one group, offer the teacher better pay for good attendance. In the other group, do nothing. At the end of
the study period, compare the attendance of the teachers.
4.65 (a) The principle of experimental design that is violated here is random assignment. If players are
allowed to choose which treatment they get, those who choose one particular treatment over the other
may be different in a fundamental way. For example, maybe the weaker players will be more likely to
choose the new method and the stronger players would stick with weight lifting. (b) The response
variable of the number of push-ups that a player can do could be part of what the coach should measure,
but this only measures one kind of upper-body strength and he should probably combine this with other
measures as well.
4.66 In a controlled scientific study, the effects of factors other than the nonphysical treatment (e.g., the
placebo effect, differences in the prior health of the subjects) can be eliminated or accounted for, so that
the differences in improvement observed between the subjects can be attributed to the differences in
treatments.
4.67 (a) First we need to control for the effects of lurking variables and to use at least two groups for
comparison purposes. Next we need random assignment to help create roughly equivalent groups before
the treatments are administered. Finally, we need replication to ensure that a difference in response
between the two groups is due to the treatments and not chance variation. (b) There were two groups in the
study: one in which the children were assigned to an intensive preschool program and one in which they
weren’t. All of the children were given nutritional supplements and help from social workers. The
children were assigned at random to the two groups, and there were a total of 111 children in the
experiment.
4.68 The researcher should use Plan B. If he uses Plan A and discovers that the plants which had the
weed killer X applied did better, he will not know if this is because of weed killer X or because these
were the healthier plants to begin with.
4.69 (a) The placebo was the harmless leaf. (b) The results support the idea of a placebo effect because
the subjects developed rashes on the arm exposed to the placebo (a harmless leaf) simply because they
thought they were being exposed to the active treatment (a poison ivy leaf).
4.70 (a) If only the new drug is administered, and the subjects are then interviewed, their responses will
not be useful, because there will be nothing to compare them to: How much “pain relief” does one expect
to experience? Also, the placebo effect may lead some subjects to report a decrease in pain even if the
new drug is ineffective. (b) The subjects should certainly not know what drug they are getting—a patient
told that she is receiving a placebo, for example, will probably not experience any pain relief.
4.71 Because the experimenter knew which subjects had learned the meditation techniques, he
(or she) may have had some expectations about the outcome of the experiment: if the experimenter
believed that meditation was beneficial, he may subconsciously rate that group as being less anxious.
4.72 “Double-blind” means that the treatment (testosterone or placebo) assigned to a subject was
unknown to both the subject and those responsible for assessing the effectiveness of that treatment.
“Randomized” means that patients were randomly assigned to receive either the testosterone supplement
or a placebo. “Placebo-controlled” means that some of the subjects were given placebos. Even though
these possess no medical properties, some subjects may show improvement or benefits just as a result of
participating in the experiment; the placebos allow those doing the study to observe this effect.
4.73 (a) Control: The effects of lurking variables on the response, whether the woman became pregnant,
were controlled by controlling the manner in which the treatments were applied: half of the women
received acupuncture treatment 25 minutes before embryo transfer and again 25 minutes after the transfer,
the other half lay still for 25 minutes after the transfer. Random assignment: Randomly assigning the
women to the two treatments should eliminate any systematic bias in assigning the subjects and should
also balance out the effects of any lurking variables across the two treatment groups. Replication: Eighty
women were assigned to each treatment. These groups are large enough to ensure that differences in the
pregnancy rates of the two groups are due to the treatments themselves and not to chance variation in the
random assignment. (b) The difference in the percent of women who received acupuncture and became
pregnant and those who lay still and became pregnant was large enough to conclude that the difference
was most likely due to the treatments rather than to chance. (c) We should be cautious about drawing
conclusions based on the results of one study. The way this study was designed, it’s possible that the
observed difference is due in part to the placebo effect since the women were aware of which treatment
they received. If possible, another study should be done in which the control group received a fake
acupuncture treatment.
4.74 (a) Researchers randomly assigned participants to diets to make sure that the two groups are as
similar as possible before the treatments are administered. (b) The difference in weight loss seen was
large enough to conclude that the difference was most likely due to the treatments rather than to chance.
(c) Even though the low-carb dieters lost 2 kg more over the year than the low-fat group, this difference
was small enough that it could be due just to chance variation in the random assignment, and not to the
treatments themselves.
4.75 (a) Write “yawn” on 14 slips of paper and “no yawn” on 36 slips of paper. Mix the slips and draw
out 16 of them. These will be the people subjected to the treatment with no yawn seed. The remainder
will be in the yawn-seed treatment group. (b) We would conclude that yawning is not contagious. In our
50 random re-assignments, 10 yawns out of 34 people was not at all unusual.
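The re-randomization in part (b) can also be carried out with software instead of slips of paper. The Python sketch below is an illustration only, not the textbook's procedure. It assumes the counts used above (14 yawners among 50 subjects, 34 subjects in the yawn-seed group, 10 observed yawners in that group); the function name, the 10,000 repetitions and the random seed are choices made for this example.

```python
import random

def simulate_yawn_counts(n_yawn=14, n_no_yawn=36, seed_group_size=34,
                         reps=10_000, rng_seed=1):
    """Shuffle the outcome slips and count yawners landing in the yawn-seed group."""
    rng = random.Random(rng_seed)  # fixed seed so the run is repeatable (assumption)
    slips = ["yawn"] * n_yawn + ["no yawn"] * n_no_yawn
    counts = []
    for _ in range(reps):
        rng.shuffle(slips)
        # The first 34 slips play the role of the yawn-seed group.
        counts.append(slips[:seed_group_size].count("yawn"))
    return counts

counts = simulate_yawn_counts()
observed = 10  # yawners actually seen in the yawn-seed group
share = sum(c >= observed for c in counts) / len(counts)
print(f"Share of shuffles with {observed} or more yawners: {share:.3f}")
```

A large share means that a result like 10 yawns out of 34 happens often by chance alone, which is the reasoning behind the conclusion in part (b).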
4.76 (a) Dotplots for both groups are shown below. The differences for the patients in the active group
follow a distribution with a gap between 1 and 4. We might even stretch a bit and say that the distribution
is roughly symmetric with a mean of about 5. The differences ranged from 0 to 10. The distribution of
the differences for the patients in the inactive group is skewed to the right with a center slightly above 1.
Many patients reported no change in their pain ratings (a difference of 0) and the largest difference was 5.
This means that those in the active group had a higher mean difference (5) than those in the inactive
group (1) and had a distribution with much more variability (range = 10 for the active group and range =
5 for the inactive group).
The Practice of Statistics for AP*, 4/e
(b) The average difference for the active group is 5.241 and the average difference for the inactive group
is 1.095. The difference in these two means (active – inactive) is 4.146.
(c) Write each patient’s name and difference in pain score on an index card, shuffle them up and deal
them into two piles with 29 in one pile and the rest in the other. The 29 in the first pile will be considered
the active group. (d) The Fathom dotplot shows that none of the 50 simulated differences was larger than
4.146. Thus, a difference of 4.146 would be extremely unlikely if both types of magnets provided the
same level of relief. We would conclude that the active magnets probably do provide relief for polio
patients.
4.77 (a) The blocks are the different diagnoses (e.g. asthma). Within any given diagnosis, we are
looking for differences in patients’ health and satisfaction with medical care between doctors and nurse
practitioners. (b) A completely randomized design would assign patients to two treatment groups
(doctors and nurses) without regard to their diagnosis. This ignores differences between patients with
asthma, diabetes, and high blood pressure, which would probably result in a great deal of variability in
measures of health and satisfaction in both groups. That would make it harder to compare the
effectiveness of nurses and doctors. Blocking will control for the variability in subjects’ responses due to
their diagnosis. This will allow researchers to look separately at health and satisfaction for patients with
each of the three diagnoses, as well as to better assess the relative effectiveness of nurses and doctors.
4.78 (a) The blocks are the sexes. The cancer reacts differently to treatments in men and women so we
want to eliminate sex as a lurking variable. We want to test all three types of treatments in both men and
women. (b) If we used a completely randomized design, we could end up with a treatment that is given
much more frequently to one of the two sexes. Then we will not know if any differences we see between
that treatment and the others are due to the treatment itself, or the fact that cancer reacts differently in
women than men. (c) If the researchers had only 800 men and no women, we would not have to have a
block design. We could just randomly assign the treatments to the subjects. Unfortunately, we would
only be able to make conclusions about how the treatments work in men.
4.79 (a) The difference in soil fertility among the plots is a potential lurking variable. A completely
randomized design could assign one of the varieties of corn to more fertile plots just by chance. If those
plots produced extremely high yields, we wouldn’t know if it was due to the corn variety or to soil
fertility. Blocking will allow researchers to control for the variability in yield due to soil fertility. (b)
The researchers should use the rows as the blocks. All plots in the same row have the same level of
fertility and so are as similar as possible. (c) Let the digits 1-5 correspond to the five corn varieties A-E.
Begin with, say, line 110 of Table D, and assign the letters to the plots in each row from west to east (left to
right). Use a new line (111, 112, 113, 114, and 115) for each subsequent row. For example, for Block 1, the
usable digits are 3, 4, 4, 4, 1, 3, 3, 2. Ignoring repeated digits (the non-bolded numbers) leaves 3, 4, 1, 2, so
varieties C, D, A and B are planted in the first row from west to east, with E planted last.
The remaining rows are assigned using this same process. The results of
this assignment are shown in the table below.
Chapter 4: Designing Studies
Block 1    C   D   A   B   E
Block 2    A   D   E   C   B
Block 3    E   C   D   A   B
Block 4    B   E   D   C   A
Block 5    D   E   A   C   B
Block 6    A   D   C   B   E
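The same kind of randomized block assignment can be generated with software rather than Table D. The following Python sketch is only an illustration: it shuffles the five varieties independently within each of the six rows. The seed value and the variable names are assumptions made for this example, so the layout it prints will not match the table above.

```python
import random

varieties = ["A", "B", "C", "D", "E"]
rng = random.Random(110)  # arbitrary seed for a repeatable run (assumption)

layout = {}
for block in range(1, 7):
    order = varieties[:]   # copy so each row starts from the full list
    rng.shuffle(order)     # independent random order within each row (block)
    layout[f"Block {block}"] = order

for name, order in layout.items():
    print(name, "  ".join(order))
```

Because every row receives each variety exactly once, differences in fertility between rows cannot favor one variety over another.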
4.80 (a) A randomized block design would be better in this case to control for the lurking variable of
initial weight. A completely randomized design could assign several of the more overweight subjects to
the same weight-loss treatment. If these subjects lost much more weight than subjects receiving the other
treatments, researchers wouldn’t know if it was due to the treatment or to subjects’ initial weights. (b)
The blocks should be based on how overweight the subjects are so that the subjects within a block are as
similar as possible. If we block on last name, there will be potentially many differences between the
people in a block. (c) Ordered by increasing weight, the five blocks are (1) Williams-22, Deng-24,
Hernandez-25, and Moses-25; (2) Santiago-27, Kendall-28, Mann-28, and Smith-29; (3) Brunk-30,
Obrach-30, Rodriguez-30, and Loren-32; (4) Jackson-33, Stall-33, Brown-34, and Cruz-34; (5)
Birnbaum-35, Tran-35, Nevesky-39, and Wilansky-42. The exact randomization will vary with the
starting line in Table D. Different methods are possible; perhaps the simplest is to number the subjects
from 1 to 4 within each block, then assign the members of block 1 to a weight-loss treatment, then assign
block 2, etc. For example, starting on line 133, we assign 4-Moses to treatment A, 1-Williams to B, and 3-Hernandez to C (so that 2-Deng gets treatment D), then carry on for block 2, etc.
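The blocking and randomization in part (c) can be sketched in Python as follows. This is a hedged illustration rather than the method used above: the subject values are the ones listed in part (c), the treatment labels A to D follow the example, and the seed and variable names are assumptions, so the resulting assignment will differ from one based on Table D.

```python
import random

# Values as listed in part (c): how overweight each subject is.
subjects = {
    "Williams": 22, "Deng": 24, "Hernandez": 25, "Moses": 25,
    "Santiago": 27, "Kendall": 28, "Mann": 28, "Smith": 29,
    "Brunk": 30, "Obrach": 30, "Rodriguez": 30, "Loren": 32,
    "Jackson": 33, "Stall": 33, "Brown": 34, "Cruz": 34,
    "Birnbaum": 35, "Tran": 35, "Nevesky": 39, "Wilansky": 42,
}

# Order the subjects by the blocking variable, then cut into blocks of four.
ordered = sorted(subjects, key=subjects.get)
blocks = [ordered[i:i + 4] for i in range(0, len(ordered), 4)]

rng = random.Random(133)  # arbitrary seed (assumption)
treatments = ["A", "B", "C", "D"]
assignment = {}
for block in blocks:
    shuffled = treatments[:]
    rng.shuffle(shuffled)  # each treatment is used once per block
    for name, treatment in zip(block, shuffled):
        assignment[name] = treatment

for number, block in enumerate(blocks, start=1):
    print(number, [(name, assignment[name]) for name in block])
```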
4.81 (a) If all rats from litter 1 were fed diet A and we found diet A to be better, we would not know if
this was because of the diet itself, or because that rat strain was different from the other rat strain. This is
what it means for the strain and the diet to be confounded. We cannot separate out the effects of one
variable from the effects of the other. (b) A better design would be a randomized block design with the
strains as the blocks. In this case, each diet would be given to some rats of each strain.
4.82 (a) Every instructor has their own teaching style. If we assign two instructors to teach using
standard technology and two to use multimedia technology, and we find a difference between the two sets
of sections, we will not know if the difference is due to the technology or to the instructor. (b) A better
design would be to use the instructors as blocks since each instructor teaches two sections. In one
randomly chosen section they would use standard technology and in the other they would use multimedia.
4.83 (a) This is a completely randomized design. (b) Have the students be the blocks. Have each
student drive the simulator twice, once with a hands-free cell phone and once without, in random order.
(c) Randomizing the order ensures that we are not measuring an order effect: students may be better at the
driving simulator the second time no matter what the treatment is.
4.84 (a) This was a matched pairs design because each volunteer got each treatment in a random order.
(b) The investigators chose this type of design over a completely randomized design because they
recognized that each individual would have different characteristics to their blood vessels. This way they
can directly compare the blood flow with and without the bittersweet chocolate under the same conditions
(in the same body). (c) It is important to randomize the order of the treatments for each subject so that
we are not measuring a time effect. We want to make sure that any effect we see is due to the chocolate
and not to the time at which we measured it.
4.85 (a)
(b) All subjects will perform the task twice, once in each temperature condition. Randomly choose which
temperature each subject works in first by flipping a coin.
4.86 (a) A figure with 6 circular areas is shown below. Randomly assign three of the circles to be treated
with additional CO2 and the other three circles to be left untreated. At the end of the study, compare tree
growth in the treated and untreated areas. Table D was used to select 3 areas for the treatment, starting at
line 104. The first 4 digits are: 5 2 7 1. We cannot use the 7 because it is more than 6. Therefore, we
would treat areas 5, 2 and 1.
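The digit-reading rule used in part (a) can be written as a short Python sketch. The function name and parameters are hypothetical, introduced only for this illustration; the digits 5 2 7 1 are the ones quoted above.

```python
def pick_labels(digits, how_many, largest_label):
    """Read digits left to right, skipping out-of-range labels and repeats."""
    chosen = []
    for ch in digits:
        label = int(ch)
        if 1 <= label <= largest_label and label not in chosen:
            chosen.append(label)
        if len(chosen) == how_many:
            break
    return chosen

print(pick_labels("5271", how_many=3, largest_label=6))  # [5, 2, 1]
```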
(b) A figure with 3 pairs of circular areas is shown below. For each pair, we randomly assign one of the
two areas to receive additional CO2 and the other to be left untreated. Compare tree growth for the treated
and untreated area in each pair. A coin was flipped for each pair. If the coin landed heads, the top
area was treated and the bottom area was left untreated. If the coin landed tails, the top area was left
untreated and the bottom area was treated.
4.87 (a) This experiment confounds gender with deodorant. If the students find a difference between the
two groups, they will not know if it is a gender difference or due to the deodorant. (b) A better design
would be a matched pairs design. In this case each student would have one armpit randomly assigned to
receive deodorant A and the other deodorant B. Have each student rate the difference between their own
armpits at the end of the day.
4.88 (a) This experiment confounds the alarm setting (either set or not) with the type of day (weekend or
weekday). It is likely that Justin goes to bed at different times on weekends than on weekdays,
and this may have an effect on his average wake-up time. (b) A better design would be a randomized
block design with the weekend days being one block and the weekdays being the other block. Justin
would randomly assign one weekend day to set his alarm and the other day for having no alarm, and do
likewise for the weekdays. This allows us to make sure the days on which the alarm is set are similar to
the days on which it is not set.
4.89 Take the 50 volunteers and randomly assign them to one of two equal groups (25 volunteers each).
Now randomly select one of the two groups. In this group, give men razor A to use. Give razor B to the
men in the other group. Measure how close the razor shaves. On the next morning, give the men the
other razor and measure how close the razor shaves. Analyze the difference in closeness.
4.90 Take the 30 students and divide randomly into two equal groups (15 students each). Now randomly
select one of the two groups. In this group, have music playing while a story is read. In the other group,
read a story without music in the background. Test the students for recall. Now reverse the treatments.
For those students who had music with the first story, read to them with no background noise and use
background music with the others. Test the students for recall. Analyze the difference in recall for the
individual students.
4.91 c
4.92 a
4.93 b
4.94 d
4.95 c
4.96 d
4.97 c
4.98 b
4.99 (a) Since we know the weights of seeds of a variety of winged bean are approximately Normal, we
can use the Normal model to find the percent of seeds that weigh more than 500 mg. First, we standardize
500 mg:
$$z = \frac{x - \mu}{\sigma} = \frac{500 - 525}{110} = \frac{-25}{110} = -0.23$$
Using Table A, we find the proportion of the standard Normal curve that lies to the left of z = −0.23 to be
0.4090, which means that 1 – 0.4090 = 0.5910 lies to the right of z = −0.23. Thus, 59.1% of seeds weigh
more than 500 mg. (b) We need to find the z-score with 10% (or 0.10) to its left. The value z = −1.28 has
proportion 0.1003 to its left, which is the closest proportion to 0.10. Now, we need to find the value of x
for the seed weights that gives us z = −1.28:
$$\begin{aligned}
-1.28 &= \frac{x - 525}{110} \\
-1.28(110) &= x - 525 \\
525 - 1.28(110) &= x \\
384.2 &= x
\end{aligned}$$
If we discard the lightest 10% of these seeds, the smallest weight among the remaining seeds is 384.2 mg.
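Both parts can be checked with software. The Python sketch below uses `statistics.NormalDist` from the standard library (Python 3.8 or later) with the mean of 525 mg and standard deviation of 110 mg used above; the variable names are chosen for this illustration. Because the software does not round z to two decimal places, its answers differ slightly from the Table A values.

```python
from statistics import NormalDist

# Normal model for seed weights, in mg, with the parameters used above.
seed_weights = NormalDist(mu=525, sigma=110)

# (a) Proportion of seeds weighing more than 500 mg.
share_above_500 = 1 - seed_weights.cdf(500)

# (b) Weight with 10% of seeds below it (the 10th percentile).
cutoff_lightest_10pct = seed_weights.inv_cdf(0.10)

print(f"Proportion above 500 mg: {share_above_500:.4f}")
print(f"10th percentile weight: {cutoff_lightest_10pct:.1f} mg")
```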
4.100 The scatterplot of IQs for Twin A and Twin B is given below. We see that there is a reasonably
strong, positive linear relationship (r = 0.91) between the IQs of the twins.
4.101 If we subtract Twin B – Twin A, and look at a dotplot of the differences, we see the graph given
below.
Since all but one of the differences are positive, in most cases Twin B (the one living in
the higher-income home) has the higher IQ. The mean and standard deviation of the differences
support this: the mean difference is 5.83 and the standard deviation is 3.93. That is, the
twin living in the higher-income home typically has an IQ about 5.83 points higher than the
other twin, and the differences for individual sets of twins typically vary by about 3.93 points from that
5.83.
Section 4.3
Exercises, page 269:
4.102 If the study involves random sampling, then we can make inferences about the population from
which we sampled. If the study involves random assignment we can make inferences about cause and
effect.
4.103 Since this study involved random assignment to the treatments (foster care or institutional care),
we can infer cause and effect. Therefore we can conclude that living in foster care in Romania is better
than living in an institution.
4.104 Since this study involved random assignment to the treatments (freezer or room temperature), we
can infer cause and effect. Therefore we can conclude that storing batteries in the freezer leads to a
higher average charge for batteries produced by this company. Also, since the batteries were randomly
chosen, we can generalize to the whole population of batteries.
4.105 Since this study did not involve random assignment to a treatment we cannot infer cause and
effect. Also, since the individuals were not randomly chosen, we cannot generalize to a larger population.
4.106 Since this study involved a random sample, we can make an inference about the population. It
appears that those who attend religious services regularly have a lower risk of dying younger. But we
cannot infer cause and effect. We do not know that attending religious services is the reason for this
lower risk.
4.107 Daytime running lights may be effective because they catch the attention of other drivers. As they
become more common, they may be less effective at catching the attention of other drivers because
people may simply get used to them. We also need to pay attention to how much reduction there is from
using daytime running lights. If it’s only a very small amount, the cost of installing them may not be
justified.
4.108 The psychologist should not generalize to a team of employees that spend months developing a
new product that never works right and is abandoned. She has “put together” a team of students. This
suggests that there was no randomization involved. But regardless of that, students are likely to be in a
different place in their lives than employees who are on the job for at least several months and likely
much longer. Also, the disappointment associated with losing games during an evening is not likely to be
equivalent to the disappointment felt after months of hard work.
4.109 Answers will vary. Possible answers include: (a) Many people would consider pricking a finger
to be of minimal risk. (b) Fewer people would consider drawing blood from an arm to be of minimal
risk. (c) It is unlikely that very many people would consider inserting a tube into the arm that remains
there to be of minimal risk.
4.110 Answers will vary. Possible answers include: (a) A non-scientist will be more likely to consider
the subjects as people and not be blinded by the scientific results that might be discovered. (b) You
might consider at least two outside members. A member of the clergy might be chosen because we would
expect them to help lead the committee in ethical and moral discussions. You might also choose a patient
advocate to speak for the subjects involved.
4.111 Answers will vary. Possible answers include: (a) Many would consider this to be an appropriate
way to collect data without participants’ knowledge because the data are, in effect, anonymous. (b)
Many would consider this to be appropriate because the meetings are public and the psychologist is not
misleading the participants. (c) Most would consider this to be inappropriate because the psychologist is
misleading the other participants and attending private meetings.
4.112 Answers will vary. One possible answer is: Any collection of data on minors should be made with
parental consent only. This allows the parents to be aware of what is being asked of their children and
they can decide if the subject matter is appropriate for their children.
4.113 The responses to the GSS are confidential. The person taking the survey knows who is answering
the questions (they were chosen in some random fashion), but will not share the results of individuals
with anyone else.
4.114 This describes the anonymous screening. The patient never gives their name, but rather is just
assigned a number. No one at the clinic can put the results together with a name because the name was
never given.
4.115 In this case the subjects were not able to give informed consent. They did not know what was
happening to them and they were not old enough to understand the ramifications in any event.
4.116 Answers will vary. One possible answer is: Yes, providing these potentially life-changing
services to some but not all seniors in the study is unethical. We can’t withhold important services from
some people.
4.117 From the given two-way table of response by gender, find and compare the conditional
distributions of response for men alone and women alone. These values are in the table below. To find
the conditional distributions, divide each entry in the table by its column total.
Response             Male     Female
Strongly Agree       14.7%     9.3%
Agree                52.3%    38.8%
Neither              16.9%    21.9%
Disagree             11.8%    19.3%
Strongly Disagree     4.3%    10.7%
We also present the same conditional distributions in the bar chart below.
From the table and the bar chart we see that men are more likely to view animal testing as justified if it
might save human lives: over two-thirds of men agree or strongly agree with this statement, compared to
slightly less than half of the women. The percentages who disagree or strongly disagree tell a similar
story: 16% of men versus 30% of women.
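The combined percentages quoted in this comparison come from adding rows of each conditional distribution. A small Python sketch of that arithmetic, using the percentages from the table for this exercise (the dictionary and function names are chosen for this illustration):

```python
male = {"Strongly Agree": 14.7, "Agree": 52.3, "Neither": 16.9,
        "Disagree": 11.8, "Strongly Disagree": 4.3}
female = {"Strongly Agree": 9.3, "Agree": 38.8, "Neither": 21.9,
          "Disagree": 19.3, "Strongly Disagree": 10.7}

def combined(distribution, categories):
    """Add the percentages for the listed response categories."""
    return round(sum(distribution[c] for c in categories), 1)

agree = ["Strongly Agree", "Agree"]
disagree = ["Disagree", "Strongly Disagree"]

print("Agree or strongly agree:", combined(male, agree), combined(female, agree))
print("Disagree or strongly disagree:", combined(male, disagree), combined(female, disagree))
```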
4.118 The mean is not resistant to outliers. We are told that Cisco Systems stock went up 60,600%. This
is clearly an outlier and will greatly influence the mean. Since the outlier is very positive, this will make
the mean much higher than the median. In fact, the outlier has such a big influence that it even changes
the sign from negative to positive.
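To see how a single extreme value can flip the sign of the mean while leaving the median negative, consider the Python sketch below. Apart from the 60,600% figure quoted above, every return in the list is hypothetical and made up purely for illustration; these are not the data from the exercise.

```python
from statistics import mean, median

# Hypothetical percent returns (invented for illustration only).
other_returns = [-60, -45, -30, -25, -10, -5, 5, 15]
with_outlier = other_returns + [60_600]  # the one figure taken from the exercise

print("Mean without outlier:", mean(other_returns))
print("Mean with outlier:", mean(with_outlier))
print("Median with outlier:", median(with_outlier))
```

The mean moves from negative to strongly positive once the outlier is included, while the median stays negative.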
Chapter Review Exercises (page 271)
R4.1 (a) The population is Ontario residents; the sample is the 61,239 people interviewed. (b) The sample
size is very large, so if there were large numbers of both sexes in the sample—this is a safe assumption
since we are told this is a “random sample”—these two numbers should be fairly accurate reflections of
the values for the whole population.
R4.2 Answers will vary. One possible answer is: (a) Announce in the daily bulletin that a survey
concerning student parking is available in the main office for students who want to respond. Since
voluntary response surveys are generally answered only by those who feel strongly about the issue, these results
will likely be biased. (b) Personally interview a group of students having lunch in the center quad.
Convenience samples are generally not representative of the population, which leads to biased results.
R4.3 (a) Alphabetically associate each name with a two-digit number: Agarwal = 01, Andrews = 02, …,
Wilson = 25. Move from left to right, reading pairs of digits until you find three different pairs between
01 and 25. (b) Using the numbers given, we read 17 52 17 80 09 46 23. Skipping 52, 80 and 46 (out of
range) and the repeated 17 leaves 17, 09 and 23. These numbers correspond to
Musselman, Fuhrmann and Smith.
R4.4 A stratified random sample would probably be best here; one could select 50 faculty members from
each type of institution. If a large proportion of faculty in your state work at a particular class of
institution, it may be useful to stratify unevenly. If, for example, about 50% teach at Class I institutions,
you may want half your sample to come from Class I institutions. A simple random sample might miss
faculty from one particular type of institution, especially if there are not many faculty at that type of
institution. A cluster sample might introduce bias. If the clusters are taken to be different schools, faculty
at one school may have a different opinion than faculty at other schools depending on their particular
student body.
R4.5 (a) A potential source of bias related to the question wording is that people may not remember how
many movies they watched in a movie theater in the past year. It might help the polling organization to
shorten the amount of time that they ask about, perhaps 3 or 6 months.
(b) A potential source of bias not related to the question wording is that the poll contacted people through
“residential phone numbers.” Since more and more people (especially younger adults) are using only a
cellular phone (and do not have a residential phone), the poll omitted these people from the sampling
frame. These same people might be more likely to watch movies in a movie theater. The polling
organization should include cell phone numbers in their list of possible numbers to call.
R4.6 (a) The data were collected after the anesthesia was administered. Hospital records were used to
“observe” the death rates, rather than imposing different anesthetics. (b) One possible confounding
variable could be type of surgery. If one anesthesia is used more often with a type of surgery that has a
higher death rate anyway, we wouldn’t know if the death rate was higher because of the anesthesia type
or the surgery type.
R4.7 The experimental units are the potatoes used in the experiment. The explanatory variables are
storage time and time from slicing until cooking. There are six treatments: (1) fresh picked and cooked
immediately, (2) fresh picked and cooked after an hour, (3) stored at room temperature and cooked
immediately, (4) stored at room temperature and cooked after an hour, (5) stored in refrigerator and
cooked immediately, (6) stored in refrigerator and cooked after an hour. The response variables are
ratings of color and flavor.
R4.8 Assign each person a number and using a random number table or technology, choose half to be in
one group and the other half to be in the second group. When a person from the first group is called, use
the first description. When a person in the second group is called, use the second description. Record
their responses. See the figure below.
R4.9 (a) The design accounted for the placebo effect by giving some patients a treatment that should
have no effect at all, but looks, tastes and feels like the St.-John’s-wort. These people think that they are
being treated but in fact they are not. So if they get better, this would be a case of the placebo effect. (b)
The study should be double-blind. The subjects should not know which treatment they are getting so that
the researchers can measure how much placebo effect there is. But the researchers should also be blinded
so that they cannot influence how they measure the results.
R4.10 (a) This is a randomized block design. Blocking helps control for the variability in responses due
to people’s running habits. (b) We use randomization to make sure that the two groups of people in each
block are as similar as possible before the treatments are administered. (c) A difference in rate of
infection may have been due to the effects of the treatments, or it may simply have been due to random
chance. Saying that the placebo rate of 68% is “significantly more” than the Vitamin C rate of 33% means
that the observed difference is too large to be plausibly explained by chance alone. In other words, Vitamin C
appears to have played a role in lowering the infection rate of runners.
R4.11 (a) Randomly assign 15 students to Group 1 (easy mazes) and the other 15 to Group 2 (hard
mazes). Compare the time estimates of Group 1 with those of Group 2. (b) Each student does the activity
twice, once with the easy mazes, and once with the hard mazes. Randomly decide (for each student)
which set of mazes is used first. Compare each student’s “easy” and “hard” time estimate (for example,
by looking at each “hard” minus “easy” difference). (c) The matched pairs design would be more likely
to detect a difference because it controls for the variability between subjects.
R4.12 (a) This does not meet the requirements of informed consent because the subjects did not know
the nature of the experiment before they agreed to participate. (b) All individual data should be kept
confidential and the experiment should go before an institutional review board before being implemented.
(c) This would allow for inference about cause-and-effect if the students are randomly assigned to the
two treatments.
AP Statistics Practice Test (page 274)
T4.1 c. A census is defined to be measuring all individuals in the population.
T4.2 e. Ignore numbers that are larger than 816 or are duplicate numbers.
T4.3 d. In order to infer cause and effect, we must run a well-designed experiment. This was an
observational study.
T4.4 c. This is the definition of a Simple Random Sample.
T4.5 b. By randomly assigning treatments we are attempting to make the different groups look as similar
as possible so that we can reduce the likelihood of a confounding variable.
T4.6 b. It is very difficult to show cause and effect using observational studies. It is much easier in an
experiment where the researcher has control over how the treatments are applied.
T4.7 d. By stratifying we can control how many people we survey in each of the different kinds of areas.
T4.8 d. Bias in the responses means that you are getting responses that are systematically different from
the truth.
T4.9 d. This is a completely randomized design because you randomly assign subjects to one of the four
groups. There are two factors: Length of ad (30 seconds or 60 seconds) and Repeat (1 time or 3 times).
T4.10 b. In a matched pairs design, the two observations in the pair should be as similar as possible. So
use a subjective method for pairing the plots. Once the pairs are chosen, then randomly assign the two
treatments to the two plots in the pair.
T4.11 d. The teachers who responded likely feel more strongly about the issue and shouldn’t be
considered to be representative of the entire population of teachers under consideration.
T4.12 (a) The experimental units are the acacia trees. The treatments are placing either active beehives,
empty beehives or nothing in the trees. The response variable is the damage caused by elephants to the
trees. (b) Randomly assign 24 of the acacia trees to have active beehives placed in them, 24 to
have empty beehives placed in them and the remaining 24 to have no beehive. To do this, assign the trees
numbers from 01 to 72 and use a random number table to pick 24 different 2-digit numbers in this range. Those
trees will get the active beehives. The trees associated with the next 24 different 2-digit numbers will get the
empty beehives and the remaining 24 trees will get no beehive. Compare the damage caused by elephants
to the trees with active beehives, those with empty beehives and those with no beehives.
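The random assignment in part (b) could also be done with technology instead of a random number table. The Python sketch below is one possible approach under stated assumptions: trees are labeled 1 to 72, and the seed, group labels and variable names are choices made for this example.

```python
import random

rng = random.Random(2024)    # arbitrary seed for a repeatable run (assumption)
trees = list(range(1, 73))   # tree labels 1 through 72
rng.shuffle(trees)           # random order of all 72 trees

# The first 24 shuffled trees get active hives, the next 24 empty hives,
# and the last 24 get nothing.
groups = {
    "active beehive": sorted(trees[:24]),
    "empty beehive": sorted(trees[24:48]),
    "no beehive": sorted(trees[48:]),
}

for treatment, members in groups.items():
    print(treatment, members)
```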
T4.13 (a) It is not a simple random sample because not all samples were possible. For instance, given
their method, they could not have had all respondents from the east coast. (b) One adult was chosen at
random to control for lurking variables. Perhaps household members who generally answer the phone
have a different opinion than those who don’t generally answer the phone. (c) There was undercoverage
in this survey. Those who do not have telephones, or those who have only cell phones were not part of
the sampling frame. So their opinions would not have been measured. Since cell-phone-only users tend
to be younger, the results of the survey may not accurately reflect the entire population’s opinion.
T4.14 (a) Each of the 11 individuals will be a block in a matched pairs design. Each participant will take
the caffeine tablets on one of the two-day sessions and the placebo on the other. The order in which they
take the caffeine or the placebo is decided randomly. The tapping test is administered at the end of each
two-day trial. The results to be compared are the differences between the caffeine and placebo scores on
the tapping test. The blocking was done to control for individual differences in dexterity. (b) The order
was randomized to control for any possible influence of the order in which the treatments were
administered on the subject’s tapping speed. (c) It is possible to carry out this experiment in a double-blind manner. This means that neither the subjects nor the people who come in contact with them during
the experiment (including those who record the number of taps) know the order in which the
caffeine and placebo are administered.