Topic 3
Drawing Conclusions from Studies
In-Class Activities
Activity 3-1: Elvis Presley and Alf Landon
3-1, 3-6, 16-5
a.
population = all adult Americans (record company interest); sample = those listening to
the radio who called in
b.
No – 56% is probably not an accurate reflection of the opinions of all adult Americans on
this issue. People who choose to call in (who take the time and are willing to spend the money)
probably feel differently and more strongly about the issue than other Americans. The timing
(on the anniversary of Elvis’ death) is also likely to influence the opinions of those who called in.
We also have no indication of how widely distributed across the country the radio stations were
(perhaps there could be bias if the stations tended to be mostly from the south).
c.
population = all Americans eligible to vote in 1936; sample = the 2.4 million who
returned the questionnaires
d.
Their prediction was in error because their sampling technique was biased. By sampling
people who owned vehicles and telephones in 1936, they were sampling from a subset of the
population that tended to be wealthy. Historically, the wealthy have tended to support the
Republican candidate (conservative), while those without money have tended to vote Democratic
(for social change). Thus the pollsters contacted primarily Republican voters, but there was a
heavy Democratic turnout on election day. Furthermore, those who chose to respond were
Rossman/Chance, Workshop Statistics, 3/e
Solutions, Unit 1, Topic 3
1
probably more dissatisfied with the incumbent (Roosevelt) than those who did not choose to
respond.
e.
56% of callers who believe Elvis was alive = statistic
57% of voters who indicated they would vote for Alf Landon = statistic
63% of voters who actually voted for Franklin Roosevelt = parameter
f.
proportion of students in your class who use instant messaging = statistic
proportion of students at your school who use instant messaging = parameter
average number of hours students at your school spent watching TV = parameter
average number of hours students in your class slept last night = statistic
g.
proportion of voters who voted for Bush in 2004 = parameter
proportion of voters surveyed by CNN who voted for John Kerry = statistic
proportion of voters among faculty members who voted for Nader = parameter (assume
population = all of your school’s voting faculty members)
average number of points scored in a Super Bowl game = parameter (population = all
Super Bowl games)
h.
A categorical variable leads to a parameter or statistic that is a proportion; a quantitative
variable leads to a parameter or statistic that is an average.
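The distinction in parts e through h can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the school of 1,000 students, the 60% instant-messaging rate, and the sleep-hour distribution are all made up, and the names (population, sample, and so on) are hypothetical. A number computed from the whole population is a parameter; the same calculation on a sample is a statistic. The categorical variable yields a proportion and the quantitative variable yields an average.

```python
import random

rng = random.Random(42)

# Hypothetical population: every student at a school of 1,000 (made-up values).
population = [
    {"uses_im": rng.random() < 0.6, "sleep_hours": rng.gauss(7, 1)}
    for _ in range(1000)
]

# Hypothetical sample: one class of 30 students drawn from the school.
sample = rng.sample(population, 30)


def proportion_using_im(students):
    """Categorical variable -> summarize with a proportion."""
    return sum(s["uses_im"] for s in students) / len(students)


def average_sleep(students):
    """Quantitative variable -> summarize with an average."""
    return sum(s["sleep_hours"] for s in students) / len(students)


# Computed from the whole population: parameters.
parameter_prop = proportion_using_im(population)
parameter_mean = average_sleep(population)

# Computed from the sample only: statistics.
statistic_prop = proportion_using_im(sample)
statistic_mean = average_sleep(sample)

print(f"parameter (school): proportion = {parameter_prop:.3f}, mean sleep = {parameter_mean:.2f}")
print(f"statistic (class):  proportion = {statistic_prop:.3f}, mean sleep = {statistic_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it.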
Activity 3-2: Self-Injuries
a.
observational units = students; variable = whether or not they had injured themselves;
type = binary categorical
b.
population = all American college students
sample = the 2875 students from Cornell and Princeton who responded to the survey
c.
the sample size is 2875
d.
17% is a statistic – it is a proportion derived from the sample of students.
e.
This percentage is unlikely to be representative of all college students in the world since
the sample was taken from two U.S. colleges and the college experience in the U.S. is very
different from the rest of the world. It is not even clear that it would be representative of all U.S.
colleges since both schools used in the survey were Ivy League schools, so their students would
hardly be ‘typical’ U.S. college students and may also have distinct types of stress and social
reactions to stress.
Activity 3-3: Candy and Longevity
3-3, 21-27
a.
No – this is unlikely to be a representative sample of the health habits of all adult
Americans. Everyone in the sample was male and college educated (to some extent) at an Ivy
League school at least 30 years before the study. This is not the picture of the “average”
American. In addition to the gender differences, the health knowledge and access to medical
care by the men in this sample could differ from the rest of the population.
b.
proportion who consumed candy = 4529/7841 = .5776; This is a statistic.
c.
proportion of nonconsumers that had died = 247/3312 = .075; proportion of consumers
that had died = 267/4529 = .059
d.
observational units = 7841 men who entered Harvard between 1916 and 1950
explanatory variable = whether or not they consumed candy
type = binary categorical
response variable = whether or not they had died by the end of 1993
type = binary categorical
e.
Perhaps men who like candy also like to exercise regularly, while those who do not tend
to eat candy do not like to exercise. In this case, it might be the exercise that increases
lifespan, rather than the candy. (Other possibilities include differences in diet, family size,
and happiness levels.)
f.
proportion of nonconsumers who had never smoked = 1201/3312 = .363; proportion of
consumers who had never smoked = 1852/4529 = .409
g.
A greater percentage of the candy consumers had never smoked. The higher death rate
among those who did not tend to consume candy might have been due to smoking rather than not
eating candy.
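The arithmetic in parts b, c, and f of Activity 3-3 can be checked with a short script. The counts are the ones given in the solutions above; the variable and function names are illustrative only.

```python
# Counts reported in Activity 3-3.
group_sizes = {"consumers": 4529, "nonconsumers": 3312}
deaths = {"consumers": 267, "nonconsumers": 247}
never_smoked = {"consumers": 1852, "nonconsumers": 1201}

total_men = sum(group_sizes.values())  # 7841 men in the sample


def rate(counts, group):
    """Proportion of the given group that falls in the counted category."""
    return counts[group] / group_sizes[group]


# Part b: proportion of the sample who consumed candy.
prop_consumed = group_sizes["consumers"] / total_men

# Part c: death rates by group (response variable).
death_rates = {g: rate(deaths, g) for g in group_sizes}

# Part f: never-smoked rates by group (potential confounding variable).
never_smoked_rates = {g: rate(never_smoked, g) for g in group_sizes}

print(f"proportion who consumed candy: {prop_consumed:.4f}")
for g in group_sizes:
    print(f"{g}: died = {death_rates[g]:.3f}, never smoked = {never_smoked_rates[g]:.3f}")
```

The two groups differ on the response (death) and also on smoking history, which is why smoking cannot be ruled out as an explanation in part g.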
Activity 3-4: Sporting Examples
2-6, 3-4, 8-14, 10-11, 22-26
a.
observational units = statistics students
explanatory variable = section (exclusively sports examples or not)
type = binary categorical
response variable = performance (points earned)
type = quantitative
b.
We know this is an observational study because the students self-selected into the two
sections. The researcher (professor) merely passively observed the students’ selections and
subsequent performances.
c.
No – it is not legitimate to conclude that the sports examples caused the lower academic
performance. One obvious confounding variable would be the time of the class. The section
with exclusively sports examples was offered at an earlier hour of the morning than the other
section. Perhaps students were not as awake for this section, or perhaps attendance was worse
for this section because of the earlier hour. Either of these could be part of the reason for the
lower academic performance in this section.
Activity 3-5: Childhood Obesity and Sleep
a.
The explanatory variable is the amount of sleep that a child gets per night. This is a
quantitative variable, although it would be categorical if the sleep information was reported only
in intervals (more sleep vs. less sleep). The response variable is whether the child is obese, which
is a binary categorical variable.
b.
This is an observational study because the researchers passively recorded information
about the children’s sleeping habits. They did not impose a certain amount of sleep on children.
Therefore, it is not appropriate to draw a cause-and-effect conclusion that less sleep causes a
higher rate of obesity. Children who get less sleep might differ in some other way that could
account for the increased rate of obesity. For example, amount of exercise could be a
confounding variable. Perhaps children who exercise less have more trouble sleeping, in which
case exercise would be confounded with sleep. You have no way of knowing whether the higher
rate of obesity is due to less sleep or less exercise, or both, or some other variable that is also
related to both sleep and obesity.
c.
The population from which these children were selected is apparently all children aged
5–10 in primary schools in the city of Trois-Rivieres. These Quebec children might not be
representative of all children in this age group worldwide, so you should be cautious about
generalizing that a relationship between sleep and obesity exists for children around the world.
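The confounding argument in part b can be illustrated with a small simulation. This is a sketch under invented assumptions, not a model of the actual study: every probability below is hypothetical. Exercise is made to influence both sleep and obesity, while sleep is given no effect on obesity at all. The group that sleeps less still shows a higher obesity rate.

```python
import random

rng = random.Random(1)
N_CHILDREN = 50_000  # hypothetical number of simulated children

# counts[group] = [number obese, number of children in the group]
counts = {"less sleep": [0, 0], "more sleep": [0, 0]}

for _ in range(N_CHILDREN):
    exercises = rng.random() < 0.5
    # Hypothetical assumption: children who exercise are likelier to sleep more.
    more_sleep = rng.random() < (0.7 if exercises else 0.3)
    # Hypothetical assumption: obesity depends on exercise only, not on sleep.
    obese = rng.random() < (0.10 if exercises else 0.30)

    group = "more sleep" if more_sleep else "less sleep"
    counts[group][0] += obese
    counts[group][1] += 1

obesity_rates = {g: n_obese / n_total for g, (n_obese, n_total) in counts.items()}

for g, r in obesity_rates.items():
    print(f"{g}: obesity rate = {r:.3f}")
```

An observer who recorded only sleep and obesity would see the association but could not tell that exercise was driving it.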
Homework Activities
Activity 3-6: Elvis Presley and Alf Landon
3-1, 3-6, 16-5
a.
This is a very biased sampling method. We would expect this method to overestimate the
proportion of adults who believe that Elvis faked his death, since people who feel strongly about
this are likely to be the ones responding to such an internet poll.
b.
This is a statistic.
c.
While this number may feel large to you, we really have no way of knowing, based on
the statistic alone, whether a sampling method is biased. It is better to consider the sampling
method when assessing whether you believe bias is present.
d.
The sample size is 2032. Taking a larger sample would not reduce the bias – if the
method is bad, increasing the sample size will not correct the problem.
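The claim in part d can be sketched with a simulation of a voluntary-response poll. All numbers here are hypothetical assumptions chosen for illustration: a true proportion of 10% believers, and believers being ten times as likely to respond as non-believers. The function name voluntary_poll is likewise made up.

```python
import random

TRUE_PROPORTION = 0.10      # hypothetical share of believers in the population
RESPOND_IF_BELIEVER = 0.50  # hypothetical chance that a believer responds
RESPOND_IF_NOT = 0.05       # hypothetical chance that a non-believer responds


def voluntary_poll(n_responses, rng):
    """Gather n_responses self-selected responses; return the sample proportion of believers."""
    believers = 0
    collected = 0
    while collected < n_responses:
        is_believer = rng.random() < TRUE_PROPORTION
        respond_chance = RESPOND_IF_BELIEVER if is_believer else RESPOND_IF_NOT
        if rng.random() < respond_chance:  # only people who choose to respond are counted
            collected += 1
            believers += is_believer
    return believers / n_responses


rng = random.Random(0)
small_estimate = voluntary_poll(200, rng)
large_estimate = voluntary_poll(20_000, rng)

print(f"true proportion:       {TRUE_PROPORTION:.3f}")
print(f"estimate, n = 200:     {small_estimate:.3f}")
print(f"estimate, n = 20,000:  {large_estimate:.3f}")
```

Under these assumptions both estimates land far above the true proportion. The larger sample only makes the estimate more stable around the wrong value.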
Activity 3-7: Student Data
1-1, 1-5, 2-7, 2-8, 3-7, 7-8
a.
Answers will vary by school and class.
b.
Answers will vary by school and class.
c.
Answers will vary by school and class.
Activity 3-8: Generation M
3-8, 4-14, 13-6, 16-1, 16-3, 16-7, 18-1, 21-11, 21-12
a.
parameter
b.
statistic
c.
statistic
d.
statistic
e.
parameter
f.
statistic
g.
parameter
h.
statistic
Activity 3-9: Community Ages
a.
parameter – You would be viewing your community as the population.
b.
Yes – this would be biased. It would probably overestimate the average age of residents,
as younger residents do not attend church as frequently as older residents.
c.
Yes – this would be a biased sampling method. This would underestimate the average
age of residents as most drivers at the daycare facility tend to be young adults, not middle-aged
or elderly. This method would also exclude all residents who are not yet old enough to drive.
Activity 3-10: Penny Thoughts
2-1, 3-10, 16-23
a.
2136 is the sample size, not the population. The population is all American adults.
b.
The sample is the 2136 people contacted by the Harris Poll; 59% is a statistic.
c.
The variable is an unknown proportion of the population who favor abolishing the penny.
d.
The observational units are people, not pennies.
e.
The parameter is a number (of unknown value). The population is all American adults.
f.
The statistic is a percentage of the sample of 2136 people who favor abolishing the
penny, 59% – not an average.
Activity 3-11: Class Engagement
a.
No – this is an observational study and there are at least two potential confounding
variables that could explain the higher level of engagement in the statistics class. So you cannot
attribute the difference to the subject matter.
b.
1) the time of the class (8:00 AM or 11:00 AM)
2) the instructor (Newton or Fisher)
Activity 3-12: Web Addiction
a.
population = all visitors to the abcnews.com website (or internet users)
sample = 17,251 users of abcnews.com who responded to the survey
b.
The proportion of the population who have some sort of addiction to the internet.
c.
The 6% is probably not a reasonable estimate of the parameter since the survey was
voluntary. Those who use the internet more and are more addicted to it are more likely to
respond to an online survey. This would make the 6% higher than the percentage for all visitors
to the site or for all internet users in general. Alternatively, you could argue that many addicts
might not be willing to admit to a “problem” and the 6% is lower than the true proportion in the
population, though this is more of a nonsampling error (people lying) than a sample
selection issue. See Topic 4 (Activity 4-20 in particular) for more discussion of nonsampling
errors.
Activity 3-13: Alternative Medicine
This sample result is probably not representative of the truth concerning the population of all
adult Americans because the sampling method is biased. Only readers of Self magazine were
part of the poll, and the readers of this health magazine were probably the type of people who try
alternative medicines more than non-readers (bad sampling frame). Furthermore, strong
advocates of alternative medicines would probably be more likely to reply to a mail-in poll
(voluntary response bias). Therefore, this result is very likely to be an overestimate of the
proportion of all adult Americans who have used alternative medicines.
Activity 3-14: Courtroom Cameras
a.
800/812 = .985; statistic
b.
This sample probably does not represent well the population of all adult Americans.
Only those people familiar with the trial and with the fact that they could write letters to the
Judge about their opinions and who felt very strongly about the issue would take the time to
write. Those who didn’t mind the use of cameras probably wouldn’t feel the need to write in.
This was a voluntary sample and not random at all.
Activity 3-15: Junior Golfer Survey
a.
No – this is not a representative sample of all American teenagers because most teenagers
do not play golf.
b.
Yes – this sampling procedure is likely to be biased with respect to voting preference.
Golfing is an expensive sport, and the wealthy tend to vote Republican, so these teenagers have
probably grown up in Republican households.
c.
[Graph: Proportion (vertical axis, 0 to 1) by Voting Preference (Democrat, Republican, Neither, Don't Know)]
d.
Yes – this sampling procedure is likely to be biased with regard to both of these
variables. If junior golfers tend to come from more affluent families, they almost certainly have
a cell phone and computer in their home, making online access readily available and probably
giving them more free time to spend on the computer. Of course, if they are more physically active
and training for tournaments, they may tend to spend less time online than a typical teen.
Activity 3-16: Accumulating Frequent Flyer Miles
a.
observational unit = visitors of msnbc.com; variable = whether or not they use a credit
card to accumulate airline miles (binary categorical)
b.
statistic – because it is a number computed from a sample (from 1935 online responses)
c.
This sampling method is most likely biased (because it is voluntary) and will provide an
overestimate of the proportion of all American adults who use a credit card to accumulate airline
miles. People who are willing to respond to an online survey are more likely to be comfortable
using their credit cards over the internet and to take advantage of internet offers.
d.
The sample size is 1935. No – it does not affect the answer to part c. This is a large
sample size, and even if it weren’t, a large sample size will not compensate for bias caused by a
poor sampling method.
Activity 3-17: Foreign Language Study
3-17, 5-11
a.
Yes – these are observational studies. Researchers could only have passively observed
the association between foreign language study and verbal SAT scores rather than determining
for students whether or not they took a foreign language in high school.
b.
No – it is not legitimate to conclude that foreign language study causes an improvement
in students’ verbal abilities. We can never draw cause-and-effect conclusions between variables
from an observational study. One possible confounding variable is verbal aptitude. Perhaps
students with strong verbal aptitudes choose to enroll in foreign language courses and also
perform well on the verbal portion of the SAT exam. Students with weaker verbal skills may
avoid foreign language courses and may also perform less well on the verbal portion of the SAT.
Activity 3-18: Smoking and Lung Cancer
3-18, 3-19
The student needs to explain how diet could be connected to both the explanatory (smoking) and
response (lung cancer) variables. How could diet explain the apparent strong connection
between smoking and lung cancer? For example, smokers may also tend to have poorer overall
diets and it is the poor diet that leads to higher rates of cancer.
Activity 3-19: Smoking and Lung Cancer
3-18, 3-19
a.
explanatory = smoking habits; response = whether or not they died of lung cancer
b.
Yes – this is an observational study. The researchers passively observed the smoking
habits and life spans of their subjects rather than actively imposing the smoking habits of the
individuals.
c.
Yes – we should have qualms about generalizing these results to a larger population. The
subjects were all males and were haphazardly selected by volunteers, so the results definitely
should not be extended to women and may also be unrepresentative of the general population,
depending on how the volunteers selected the individuals.
Activity 3-20: A Nurse Accused
1-6, 3-20, 6-10, 25-23
a.
observational units = eight-hour shifts; explanatory variable = whether or not Gilbert
worked on the shift; response variable = whether or not a patient died on the shift
b.
Yes – this is an observational study since the researchers did not randomly determine
which shifts Gilbert would work.
c.
No – because this is an observational study we cannot draw any cause-and-effect
conclusions between the variables.
d.
Perhaps Gilbert is a senior-level intensive care nurse whose patients are generally in more
critical condition than those seen by nurses on other shifts. If she works primarily with patients who
are less likely to survive, then it would not be surprising that the death rate on her shifts is higher
than the hospital average. Or perhaps Gilbert works night or weekend shifts, which tend
to have higher death rates than daytime or weekday shifts.
Activity 3-21: Buckle Up!
2-4, 3-21, 8-5
a.
Yes – this is an observational study. We know because we collected existing data about
the states.
b.
No – we cannot conclude that the tougher seatbelt laws cause a higher proportion of
residents to comply because this is an observational study.
c.
Yes – the data suggest that tougher seatbelt laws may result in lower death rates because
the tougher seatbelt laws are associated with higher seatbelt compliance.
Activity 3-22: Yoga and Middle-Aged Weight Gain
a.
explanatory = whether or not middle-aged adults practiced yoga (binary categorical)
response = amount of weight gained/lost between the ages of 45 and 55 (quantitative)
b.
Yes – this is an observational study because the researchers passively collected the data
through surveys rather than randomly determining who would practice yoga.
c.
No, this study does not allow us to draw a cause-and-effect conclusion between
practicing yoga and gaining less weight because it is an observational study and we can never
draw such conclusions based on observational studies.
d.
A potential confounding variable is the amount of weekly exercise obtained by each
adult. Perhaps adults who practice yoga also tend to engage in other forms of exercise on a
regular basis, and this is what limited their weight gain. Adults who showed more weight gain
may have participated in less overall exercise between the ages of 45 and 55.
Activity 3-23: Pet Therapy
3-23, 5-13
a.
Yes – this is an observational study because you are passively observing and recording
information about the patients instead of randomly determining which individuals would own a
pet.
b.
explanatory = whether or not a recovering heart attack patient has a pet (binary
categorical)
response = whether or not the patient survives for 5 years (binary categorical)
c.
No – you cannot conclude that pet ownership leads to therapeutic benefits for heart attack
patients based on this study, because it is an observational study and we can never conclude
cause-and-effect from an observational study. There are many potential confounding variables
that could explain the association.
Activity 3-24: Winter Heart Attacks
a.
A possible confounding variable could be weather. An alternative explanation could be
that during the months of December and January, the weather is colder, the days are shorter, and
people tend to get less exercise (or more straining exercise such as shoveling snow), and these
factors in turn increase the number of heart attacks.
b.
This reduces the viability of the change in weather explanation.
c.
A remaining confounding variable might be the length of the days. As the days shorten
in the winter (and less sunlight is available), people become depressed and this may increase the
number of heart attacks that occur.
Activity 3-25: Pursuit of Happiness
2-16, 3-25, 13-17, 25-1, 25-2, 25-4
No – these study results do not establish a causal connection between income and happiness
because this is an observational study and we can never conclude cause-and-effect from an
observational study. There are many potential confounding variables that could explain the
association.
Activity 3-26: Televisions, Computers, and Achievement
a.
explanatory = whether or not there was a television in the bedroom (binary categorical)
whether or not there was a computer in the home (binary categorical)
response = score on mathematics portion of the achievement test (quantitative)
score on language arts portion of the achievement test (quantitative)
b.
Yes – this is an observational study. The researchers passively observed/collected the
achievement scores and television/computer information about these children and did not impose
any treatments.
c.
No – you cannot make either conclusion because this is an observational study.
d.
There are many possible answers here. One confounding variable might be the financial
status of the family. Families that are better-off financially are more likely to have computers,
but are also more likely to expose their children to various forms of literature and language arts,
such as books, magazines and theatre, etc. This exposure, rather than the home computer, could
be responsible for the higher scores on the language arts portion of the test.
e.
The sample is the 348 Chicago 3rd graders.
f.
If we assume the sample was randomly selected then we could generalize to all 3rd
graders in the Chicago area. As they may not be typical of all 3rd graders anywhere else, we
probably would not want to generalize beyond this population.
Activity 3-27: Parking Meter Reliability
If the meters were randomly selected from Berkeley, we might be willing to generalize to
Berkeley. However, since they were not randomly selected from all California parking meters,
we wouldn’t be willing to generalize the results to this population.
Activity 3-28: Night Lights and Nearsightedness
a.
No – assuming that these are observational studies, there are potential confounding
variables that prevent us from legitimately concluding that sleeping with a night light causes a
higher rate of nearsightedness.
b.
This argument is incomplete because the student has not explained how “genetics” is
connected to sleeping with a nightlight (the explanatory variable) as well as to the rate of
nearsightedness (the response variable). The student should have said: Genetics, because
nearsighted parents tend to have nearsighted children and it could be that parents who are
themselves nearsighted (genetics) are more likely to provide a nightlight for their children.