Chapter 7 Population and Sample How Data are Obtained

Basic Practice of Statistics - 3rd Edition
Chapter 7
Producing Data: Sampling
Population and Sample
• Researchers often want to answer questions about some large group of individuals (this group is called the population)
• Often the researchers cannot measure (or survey) all individuals in the population, so they measure a subset of individuals that is chosen to represent the entire population (this subset is called a sample)
• The researchers then use statistical techniques to draw conclusions about the population based on the sample
How Data are Obtained
• Observational Study
– Observes individuals and measures variables of interest but does not attempt to influence the responses
– Describes some group or situation
– Sample surveys are observational studies
• Experiment
– Deliberately imposes some treatment on individuals in order to observe their responses
– Studies whether the treatment causes change in the response
Experiment versus Observational Study
Both typically have the goal of detecting a relationship between the explanatory and response variables.
• Experiment
– create differences in the explanatory variable and examine any resulting changes in the response variable (cause-and-effect conclusion)
• Observational Study
– observe differences in the explanatory variable and notice any related differences in the response variable (association between variables)
Why Not Always Use an Experiment?
• Sometimes it is unethical or impossible to assign people to receive a specific treatment.
• Certain explanatory variables, such as handedness or gender, are inherent traits and cannot be randomly assigned.
Confounding
• The problem:
– in addition to the explanatory variable of interest, there may be other variables (explanatory or lurking) that make the groups being studied different from each other
– the impact of these variables cannot be separated from the impact of the explanatory variable on the response
Confounding
• The solution:
– Experiment: randomize experimental units to receive different treatments (possible confounding variables should “even out” across groups)
– Observational Study: measure potential confounding variables and determine if they have an impact on the response (may then adjust for these variables in the statistical analysis)
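The idea that randomization lets possible confounding variables “even out” can be checked with a small simulation. The sketch below is illustrative only: the lurking variable (age), the 200 experimental units, and the two-group split are assumptions made for the example, not details of any study in these slides.

```python
import random
import statistics

def randomize(units, rng):
    """Shuffle the units and split them into two equal-sized groups."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: age in years for 200 experimental units.
ages = [rng.randint(18, 70) for _ in range(200)]

treatment, control = randomize(ages, rng)
gap = abs(statistics.mean(treatment) - statistics.mean(control))

print(f"mean age, treatment group: {statistics.mean(treatment):.1f}")
print(f"mean age, control group:   {statistics.mean(control):.1f}")
print(f"difference in means:       {gap:.1f}")
```

With random assignment the two group means should usually land close together, even though age was never used to form the groups. The same reasoning applies to lurking variables that nobody thought to measure.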
Question
A recent newspaper article concluded
that smoking marijuana at least three
times a week resulted in lower
grades in college. How do you think
the researchers came to this
conclusion? Do you believe it? Is
there a more reasonable conclusion?
Case Study
The Effect of Hypnosis
on the
Immune System
reported in Science News, Sept. 4, 1993, p. 153
Case Study
The Effect of Hypnosis
on the
Immune System
Objective:
To determine if hypnosis strengthens the
disease-fighting capacity of immune cells.
Case Study
• 65 college students
– 33 easily hypnotized
– 32 not easily hypnotized
• white blood cell counts measured
• all students viewed a brief video about the immune system
Case Study
• Students randomly assigned to one of three conditions
– subjects hypnotized, given mental exercise
– subjects relaxed in sensory deprivation tank
– control group (no treatment)
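Random assignment to three conditions can be sketched in a few lines. The slides do not report the group sizes or the exact randomization procedure, so the near-equal split, the student ID numbers, and the condition names below are assumptions for illustration.

```python
import random

def assign_groups(subjects, conditions, rng):
    """Shuffle the subjects, then deal them out to the conditions in turn."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups

students = list(range(1, 66))  # hypothetical ID numbers for the 65 students
conditions = ["hypnosis", "relaxation", "control"]  # labels chosen for this sketch

groups = assign_groups(students, conditions, random.Random(1993))
for name, members in groups.items():
    print(name, len(members))  # sizes are 22, 22, 21
```

Because chance alone decides which student lands in which group, traits such as hypnotizability should be spread roughly evenly across the three conditions.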
Case Study
• white blood cell counts re-measured after one week
• the two white blood cell counts are compared for each group
• results
– hypnotized group showed larger jump in white blood cells
– “easily hypnotized” group showed largest immune enhancement
Case Study
The Effect of Hypnosis
on the
Immune System
What is the population?
What is the sample?
Case Study
The Effect of Hypnosis
on the
Immune System
Is this an experiment
or
an observational study?
Case Study
The Effect of Hypnosis
on the
Immune System
Do hypnosis and mental exercise affect the immune system?
Case Study
Weight Gain Spells
Heart Risk for Women
“Weight, weight change, and coronary heart disease in women.” W.C. Willett, et al., Journal of the American Medical Association, vol. 273(6), Feb. 8, 1995.
(Reported in Science News, Feb. 4, 1995, p. 108)
Case Study
Weight Gain Spells
Heart Risk for Women
Objective:
To recommend a range of body mass index
(a function of weight and height) in terms of
coronary heart disease (CHD) risk in women.
Case Study
• Study started in 1976 with 115,818 women aged 30 to 55 years and without a history of previous CHD.
• Each woman’s weight (body mass) was determined.
• Each woman was asked her weight at age 18.
Case Study
• The cohort of women was followed for 14 years.
• The number of CHD cases (fatal and nonfatal) was counted (1292 cases).
• Results were adjusted for other variables (smoking, family history, menopausal status, post-menopausal hormone use).
Case Study
• Results: compare those who gained less than 11 pounds (from age 18 to current age) to the others.
– 11 to 17 lbs: 25% more likely to develop heart disease
– 17 to 24 lbs: 64% more likely
– 24 to 44 lbs: 92% more likely
– more than 44 lbs: 165% more likely
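A statement such as “165% more likely” is easy to misread, so it can help to convert each figure into a risk multiplier relative to the reference group (women who gained less than 11 pounds). The sketch below assumes that “X% more likely” means the risk is (100 + X)/100 times the reference risk. The variable names are invented for this example.

```python
# Percent increases in CHD risk, as listed on the slide, relative to
# women who gained less than 11 pounds since age 18.
percent_more_likely = {
    "11 to 17 lbs": 25,
    "17 to 24 lbs": 64,
    "24 to 44 lbs": 92,
    "more than 44 lbs": 165,
}

# Convert "X% more likely" into a multiplier of the reference group's risk.
relative_risk = {
    gain: (100 + pct) / 100 for gain, pct in percent_more_likely.items()
}

for gain, rr in relative_risk.items():
    print(f"{gain}: {rr:.2f} times the reference risk")
```

Read this way, the largest weight-gain group has about 2.65 times the risk of the reference group, not 1.65 times.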
Case Study
Weight Gain Spells
Heart Risk for Women
What is the population?
What is the sample?
Case Study
Weight Gain Spells
Heart Risk for Women
Is this an experiment
or
an observational study?
Case Study
Weight Gain Spells
Heart Risk for Women
Does weight gain in
women increase their risk
for CHD?
Bad Sampling Designs
• Voluntary response sampling
– allowing individuals to choose to be in the sample
• Convenience sampling
– selecting individuals that are easiest to reach
• Both of these techniques are biased
– systematically favor certain outcomes
Voluntary Response
• To prepare for her book Women and Love, Shere Hite sent questionnaires to 100,000 women asking about love, sex, and relationships.
– 4.5% responded
– Hite used those responses to write her book
• Moore (Statistics: Concepts and Controversies, 1997) noted:
– respondents “were fed up with men and eager to fight them…”
– “the anger became the theme of the book…”
– “but angry women are more likely” to respond
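A short simulation shows how voluntary response can distort a result when people with strong feelings are more likely to reply. Every number below (the 20% share of strongly dissatisfied people and the two response probabilities) is invented for illustration and is not taken from Hite's survey.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

POPULATION_SIZE = 100_000
SHARE_DISSATISFIED = 0.20       # hypothetical share of the population
RESPONSE_IF_DISSATISFIED = 0.15  # hypothetical chance of mailing a reply
RESPONSE_OTHERWISE = 0.02        # hypothetical chance of mailing a reply

# True means "strongly dissatisfied".
population = [rng.random() < SHARE_DISSATISFIED for _ in range(POPULATION_SIZE)]

# Each person decides for herself whether to respond (voluntary response).
respondents = [
    dissatisfied
    for dissatisfied in population
    if rng.random() < (RESPONSE_IF_DISSATISFIED if dissatisfied else RESPONSE_OTHERWISE)
]

true_share = sum(population) / len(population)
sample_share = sum(respondents) / len(respondents)

print(f"responses received:          {len(respondents)}")
print(f"dissatisfied in population:  {true_share:.1%}")
print(f"dissatisfied in respondents: {sample_share:.1%}")
```

Under these assumed rates the respondents overstate the dissatisfied share by a wide margin, which is the kind of systematic error that the word “biased” refers to.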
Convenience Sampling
• Sampling mice from a large cage to study how a drug affects physical activity
– lab assistant reaches into the cage to select the mice one at a time until 10 are chosen
• Which mice will likely be chosen?
– could this sample yield biased results?
Simple Random Sampling
• Each individual in the population has the same chance of being chosen for the sample
• Each group of individuals (in the population) of the required size (n) has the same chance of being the sample actually selected
• Random selection:
– “drawing names out of a hat”
– table of random digits
– computer software
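The “computer software” option can be sketched with Python's standard library, whose `random.sample` draws without replacement. The population labels and sizes below are hypothetical. The second part uses a tiny population of four to check informally that each possible group of size 2 turns up about equally often.

```python
import random
from collections import Counter

def simple_random_sample(population, n, rng):
    """Draw n distinct individuals; every group of size n is equally likely."""
    return rng.sample(population, n)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 labeled individuals.
population = [f"person_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(population, 10, rng)
print(sample)

# Informal check of the second defining property on a tiny population:
# with 4 individuals and n = 2 there are 6 possible samples.
tiny = ["A", "B", "C", "D"]
counts = Counter(
    frozenset(simple_random_sample(tiny, 2, rng)) for _ in range(6000)
)
for group, count in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(group), count)
```

Each of the six pairs should appear roughly 1,000 times out of 6,000 draws. The counts vary a little from run to run when the seed changes.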
Table of Random Digits
• Table B on pg. 654 of text
– each entry is equally likely to be any of the 10 digits 0 through 9
– entries are independent of each other (knowledge of one entry gives no information about any other entries)
– each pair of entries is equally likely to be any of the 100 pairs 00, 01, …, 99
– each triple of entries is equally likely to be any of the 1000 values 000, 001, …, 999
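A table with these properties can be imitated with a pseudo-random number generator. The sketch below does not reproduce Table B. It generates its own digits, printed in groups of five as an assumed layout, and then tallies single digits and pairs to see whether they are about equally common.

```python
import random
from collections import Counter

def random_digit_line(n_digits, rng, group=5):
    """Return n_digits random digits as a string, spaced in groups."""
    digits = "".join(str(rng.randrange(10)) for _ in range(n_digits))
    return " ".join(digits[i:i + group] for i in range(0, n_digits, group))

rng = random.Random(654)  # fixed seed so the sketch is repeatable

line = random_digit_line(40, rng)
print(line)

# Tally a long run of digits: each digit should appear about 10% of the
# time, and each non-overlapping pair about 1% of the time.
digits = [rng.randrange(10) for _ in range(100_000)]
single_counts = Counter(digits)
pair_counts = Counter(zip(digits[0::2], digits[1::2]))

for digit in range(10):
    print(digit, single_counts[digit] / len(digits))
print("distinct pairs seen:", len(pair_counts))
```

The tallies are a rough check on the “equally likely” property only. They do not prove that the entries are independent.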
Choosing a Simple Random Sample (SRS)
STEP 1: Label each individual in the population
STEP 2: Use Table B to select labels at random
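The two steps can be sketched as a small routine that reads a string of random digits in fixed-width groups, skipping any group that is not a valid label or that has already been chosen. The digit string below is made up for illustration and is not copied from Table B. The population size of 30 and the sample size of 4 are also assumptions.

```python
def srs_from_digits(digits, population_size, n):
    """Select n distinct labels (1..population_size) from a digit string."""
    width = len(str(population_size))  # digits per label, e.g. 2 for 30
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip labels outside the population and labels already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            return chosen
    raise ValueError("ran out of digits before the sample was complete")

# STEP 1: individuals are labeled 01, 02, ..., 30 (hypothetical population).
# STEP 2: read two-digit groups from an illustrative line of random digits.
illustrative_digits = "40731529880622175931"
sample_labels = srs_from_digits(illustrative_digits, population_size=30, n=4)
print(sample_labels)  # [15, 29, 6, 22]
```

The groups 40, 73, and 88 are skipped because no individual carries those labels, so the first four usable groups are 15, 29, 06, and 22.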
Probability Sample
• a sample chosen by chance
• must know what samples are possible and what chance, or probability, each possible sample has of being selected
• an SRS gives each member of the population an equal chance to be selected
Stratified Random Sample
• first, divide the population into groups of similar individuals, called strata
• second, choose a separate SRS in each stratum
• third, combine these SRSs to form the full sample
Stratified Random Sample
Example
Suppose a university has the following student demographics:
Undergraduate        55%
Graduate             20%
First Professional    5%
Special              20%
A stratified random sample of 100 students could be chosen as follows: select an SRS of 55 undergraduates, an SRS of 20 graduates, an SRS of 5 first professional students, and an SRS of 20 special students; combine these 100 students.
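The example above can be sketched in code. The percentages come from the slide, but the total enrollment of 10,000 and the student identifiers are assumptions made so that the strata have concrete members to sample from.

```python
import random

def stratified_sample(strata, sizes, rng):
    """Draw a separate SRS from each stratum and combine the results."""
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, sizes[name]))
    return sample

# Percent of students in each stratum, as given in the example.
percent = {
    "Undergraduate": 55,
    "Graduate": 20,
    "First Professional": 5,
    "Special": 20,
}

# Hypothetical enrollment of 10,000: each student is a (stratum, id) pair.
strata = {
    name: [(name, i) for i in range(pct * 100)] for name, pct in percent.items()
}

# For a sample of 100, the stratum sample sizes equal the percentages.
sample = stratified_sample(strata, percent, random.Random(3))

for name in percent:
    print(name, sum(1 for stratum, _ in sample if stratum == name))
```

Allocating the sample in proportion to stratum size is the choice used in the example. Other allocations are possible but would need weighting in the analysis.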
Multistage Sample
• several stages of sampling are carried out
• useful for large-scale sample surveys
• samples at each stage may be SRSs, but are often stratified
• stages may involve other random sampling techniques as well (cluster, systematic, random digit dialing, …)
Cautions about Sample Surveys
• Undercoverage
– some individuals or groups in the population are left out of the process of choosing the sample
• Nonresponse
– individuals chosen for the sample cannot be contacted or refuse to cooperate/respond
• Response bias
– behavior of respondent or interviewer may lead to inaccurate answers or measurements
• Wording of questions
– confusing or leading (biased) questions; words with different meanings
Nonresponse
• To prepare for her book Women and Love, Shere Hite sent questionnaires to 100,000 women asking about love, sex, and relationships.
– 4.5% responded
– Hite used those responses to write her book
– angry women are more likely to respond
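The scale of the nonresponse is clearer in counts than in percentages. The arithmetic below uses only the two figures on the slide. The variable names are invented for this sketch.

```python
questionnaires_sent = 100_000
response_rate_percent = 4.5

# Convert the response rate into counts of respondents and nonrespondents.
respondents = round(questionnaires_sent * response_rate_percent / 100)
nonrespondents = questionnaires_sent - respondents
nonresponse_rate_percent = 100 - response_rate_percent

print(f"respondents:      {respondents}")       # 4500
print(f"nonrespondents:   {nonrespondents}")    # 95500
print(f"nonresponse rate: {nonresponse_rate_percent}%")  # 95.5%
```

About 4,500 replies is a large number, but they say nothing reliable about the 95,500 women who did not answer.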
Response Bias
• A door-to-door survey is being conducted to determine drug use (past or present) of members of the community. Respondents may give socially acceptable answers (maybe not the truth!)
• For this survey on drug use, would it matter if a police officer is conducting the interview? (bias from interviewer)
Response Bias
Asking the Uninformed
Washington Post National Weekly Edition (April 10-16, 1995, p. 36)
• A 1978 poll done in Cincinnati asked people whether they “favored or opposed repealing the 1975 Public Affairs Act.”
– There was no such act!
– About one third of those asked expressed an opinion about it.
Wording of Questions
A newsletter distributed by a politician to his
constituents gave the results of a “nationwide survey
on Americans’ attitudes about a variety of
educational issues.” One of the questions asked
was, “Should your legislature adopt a policy to assist
children in failing schools to opt out of that school
and attend an alternative school--public, private, or
parochial--of the parents’ choosing?” From the
wording of this question, can you speculate on what
answer was desired? Explain.
Wording: Deliberate Bias
• “If you found a wallet with $20 in it, would you return the money?”
• “If you found a wallet with $20 in it, would you do the right thing and return the money?”
Wording: Unintentional Bias
• “I have taught several students over the past few years.”
– How many students do you think I have taught?
– How many years am I referring to?
• “Over the past few days, how many servings of fruit have you eaten?”
– How many days are you considering?
– What constitutes a serving?
Wording: Unnecessary Complexity
• “Do you sometimes find that you have arguments with your family members and co-workers?”
– Arguments with family members
– Arguments with co-workers
Wording: Ordering of Questions
• “How often do you normally go out on a date? about ___ times a month.”
• “How happy are you with life in general?”
– Strong association between these questions.
– If the ordering is reversed, then there would be no strong association between these questions.
Inferences about the Population
• Values calculated from samples are used to make conclusions (inferences) about unknown values in the population
• Variability
– different samples from the same population may yield different results for a particular value of interest
– estimates from random samples will be closer to the true values in the population if the samples are larger
– how close the estimates will likely be to the true values can be calculated; this is called the margin of error
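The points about variability can be seen by drawing many random samples of different sizes from the same population. In the sketch below the true proportion (60%), the sample sizes, and the number of repeated samples are all assumptions. Twice the standard deviation of the estimates is used as a rough stand-in for a margin of error, which is a common convention rather than the formula from the text.

```python
import random
import statistics

def sample_proportions(p_true, n, repeats, rng):
    """Simulate `repeats` random samples of size n; return each sample's proportion."""
    return [
        sum(rng.random() < p_true for _ in range(n)) / n
        for _ in range(repeats)
    ]

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
P_TRUE = 0.60              # hypothetical true proportion in the population

spread = {}
for n in (100, 400, 1600):
    estimates = sample_proportions(P_TRUE, n, repeats=500, rng=rng)
    spread[n] = statistics.stdev(estimates)
    print(
        f"n = {n:4d}: estimates ranged from {min(estimates):.3f} to "
        f"{max(estimates):.3f}; rough margin of error = {2 * spread[n]:.3f}"
    )
```

Different samples give different estimates, and the spread shrinks as the sample size grows. Quadrupling the sample size cuts the spread roughly in half.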