Sample

TYPES OF BIAS
Section 4.1A
Remember….
• Population
• Consists of all objects that I wish to describe
• Census – survey of the population
• Sample
• Subset of the population
• Used to predict the population
Sample Survey
We often draw conclusions about a whole population on
the basis of a sample.
Choosing a sample from a large, varied population is not
that easy.
Choosing a Sample Survey
1. Define the population we want to describe.
2. Say exactly what we want to measure.
3. Decide how to choose a sample from the
population.
*A “sample survey” is a study that uses an organized
plan to choose a sample that represents some specific
population.
Current Population Survey
• Contacts 60,000 households each month
• Produces monthly unemployment rate and a lot of other
economic and social information
• Population is defined as all U.S. residents (legal or not) 16 years of
age and over who are civilians and are not in an institution such as
prison.
• Unemployed – if you are available for work and if you actually
looked for work in the last 4 weeks.
Bad Sampling….
• Sample that does not represent the
population.
Convenience Sampling
• Choosing individuals who are easiest to
reach.
• Example: Questioning the 1st 100 people to
come to the store
• Example: Mall interview
• What problems can you see?
Studies that use convenience sampling generally have results that
are suspect because the selection of individuals is not random.
The results should be looked on with extreme skepticism.
Voluntary Response Sample
• Consists of people who choose themselves by responding
to a general appeal.
• Those most likely to respond are the people with strong
opinions, especially negative opinions.
• Television call in polls
• FACT: Only about 15% of the public has ever
responded to a call-in poll. That is not a
representative sample of the population as a whole!
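A minimal Python sketch of why voluntary response overstates strong opinions. The 30% population share and both response rates are made-up numbers for illustration, not data from any real poll, and `respondent_share` is a hypothetical helper name.

```python
def respondent_share(pop_share, p_respond_group, p_respond_others):
    """Share of respondents who belong to a group, given how likely
    group members and everyone else are to respond (all inputs assumed)."""
    responding_group = pop_share * p_respond_group
    responding_others = (1 - pop_share) * p_respond_others
    return responding_group / (responding_group + responding_others)

# Assumption: 30% of the population holds a strong negative opinion.
# Assumption: 50% of them call in, but only 10% of everyone else does.
share_in_poll = respondent_share(0.30, 0.50, 0.10)
print(f"Share of the population: 30%; share of callers: {share_in_poll:.0%}")
```

Under these assumed rates, a group that makes up 30% of the population accounts for about 68% of the callers, so the poll overestimates how common the strong opinion is.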
Example
• The American Family Association (AFA) is a conservative
group that stands for “traditional family values.” In 2004 it
ran a poll on its Web site about same-sex marriage.
About 60% of the 850,000 people who responded said they
favored same-sex marriage. This did not support AFA’s
position. What do you think happened?
What type of Sampling?
• A farmer brings a juice company several crates of
oranges each week. A company inspector looks at 10
oranges from the top of each crate before deciding
whether to buy all the oranges.
Convenience Sampling:
This could lead him to think that the oranges are of
better quality than they really are, if the farmer puts the
best oranges on the top.
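The orange inspection can be imitated with a short Python simulation. This is only a sketch: the crate size, the 0 to 100 quality scores, and the assumption that the farmer sorts the best oranges to the top are all invented for illustration.

```python
import random

def make_crate(n_oranges=50, seed=0):
    """Hypothetical crate: quality scores from 0 to 100, best on top."""
    rng = random.Random(seed)
    scores = [rng.uniform(0, 100) for _ in range(n_oranges)]
    return sorted(scores, reverse=True)  # index 0 is the top of the crate

def mean(values):
    return sum(values) / len(values)

crate = make_crate()
true_mean = mean(crate)                    # quality of the whole crate
top_ten_mean = mean(crate[:10])            # what the inspector looks at
random_ten = random.Random(1).sample(crate, 10)
random_ten_mean = mean(random_ten)         # 10 oranges chosen at random

print(f"whole crate: {true_mean:.1f}")
print(f"top 10 (convenience): {top_ten_mean:.1f}")
print(f"random 10: {random_ten_mean:.1f}")
```

Because the top 10 are the 10 highest scores in this made-up crate, their average is always above the crate average, which is the direction of the bias described above.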
What type of Sampling?
• The ABC program Nightline once asked whether the
United Nations should continue to have its headquarters
in the United States. Viewers were invited to call one
telephone number to respond “Yes” and another for “No”.
There was a charge for calling either number. More than
186,000 callers responded and 67% said “No”.
Voluntary Response Sample:
In this case, those who are happy that the United Nations has
its headquarters in the U.S. already have what they want and
so are less likely to worry about responding to the question.
Activity #1
• Guess the length of my string.
Activity #2 - M&M Rectangles
• I want to know the average size of the rectangles on this
page.
• Pick 10 rectangles that you believe are representative of
all of the rectangles on the page.
• Find the area of each of the 10 you chose.
• Find the average area of all 10.
Bias
The sampling method is biased if it
systematically favors certain
outcomes.
Exam Tip: Always tell which way it is biased.
Ex: “Explain how using a convenience sample of students in your stats class to
estimate the proportion of all high school students who own a GDC could result
in bias.”
You might respond: “This sample would probably include a much higher
proportion of students with a GDC than in the population at large because a
GDC is required for the stats class.” In other words, this method would
probably lead to an overestimate of the actual population proportion.
3 Sources of Bias
• Selection Bias
• Nonresponse Bias
• Response Bias
Selection Bias
• The method of selecting the sample systematically
excludes some part of the population of interest.
• Examples – the M&Ms lab, voluntary response
Measurement / Response
• Method of observation tends to produce values that
systematically differ from the true value in some way.
• Measurement – improperly calibrated scale, the string
activity
• Response – Improperly worded questions
Example: “Should illegal immigrants be prosecuted and deported for
being in the U.S. illegally, or shouldn’t they?” 69% favored deportation.
Vs.
“Should illegal immigrants who have worked in the U.S. for two years be
given a chance to keep their jobs and eventually apply for legal status?”
62% favored allowing them to stay.
Response Bias - More
• Response bias occurs when the answers on a survey do not
reflect the true feelings of the respondent. It can occur in
many different ways.
• Interviewer Error
• Misrepresented Answers
• Wording of Questions
• Ordering of Questions or Words
• Type of Question
• Data-Entry Error
Interviewer Error
An interviewer should be trained to be able to get truthful
responses from people. If people don’t feel they can trust the
interviewer they will give questionable answers.
Also, be aware of interviewers who have a vested interest in
the results of the survey.
Ex: Would you trust a survey conducted by a car dealer that
reports 90% of customers say they would buy another car from
the dealer?
Misrepresented Answers
Some survey questions result in responses that
misrepresent facts or are flat-out lies.
Ex: A survey of recent college graduates may find their
self-reported salaries are inflated.
Also, people may overestimate their abilities.
Ex: Ask people how many pushups they can do in 1
minute and then ask them to do the pushups. How
accurate were they?
Wording of Questions
Balanced
Questions should be asked in a balanced form to prevent bias.
Ex: The yes/no question “Do you oppose the reduction of estate
taxes?” should be changed to “Do you favor or oppose the
reduction of estate taxes?”
Not Too Vague
Another consideration in wording a question is not to be vague.
Ex: “How much do you study?” is too vague. It should be
changed to “How many hours do you study statistics each
week?”
Ordering of Questions or Words
Many surveys will rearrange the order of the questions
or words within a questionnaire so that responses are
not affected by prior questions.
Ex: The Gallup organization routinely asks the
following question of 1017 adults aged 18 years or
older:
Do you (rotated) approve or disapprove of the job
Barack Obama is doing as president?
The words approve and disapprove are rotated to
remove the effect that may occur by writing the word
approve first in the question.
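Rotating the two words can be sketched in a few lines of Python. This is an illustration only, not a description of how any polling organization actually implements rotation; `rotated_question` is a hypothetical helper and the question wording is a generic paraphrase.

```python
import random

def rotated_question(rng):
    """Return the approval question with the two options in random order."""
    options = ["approve", "disapprove"]
    rng.shuffle(options)  # each order is equally likely
    first, second = options
    return f"Do you {first} or {second} of the job the president is doing?"

rng = random.Random(7)  # fixed seed so the sketch is repeatable
for respondent in range(1, 5):
    print(respondent, rotated_question(rng))
```

Over many respondents, about half hear "approve" first and half hear "disapprove" first, so any advantage from being mentioned first should roughly cancel out.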
Nonresponse Bias
• Responses are not actually obtained from all individuals
selected for inclusion in the sample.
• Example – Failure to return polls
Nonresponse occurs when an individual
chosen for the sample can’t be
contacted or refuses to participate.
All surveys suffer from nonresponse bias.
Nonresponse bias can be controlled by
1) using callbacks
2) using rewards
a) Cash for completing survey
b) Incentives that state responses
have an impact on future policy.
What type of Bias?
• Bill is assigned by his editor to determine what most
Americans think about a new law that will place a federal
tax on all modems and computers purchased. The
revenues from the tax will be used to enforce new online
decency laws. Bill, being technically inclined, decides to
use an email poll. In his poll, 95% of those surveyed
opposed the tax. Bill was quite surprised when 65% of all
Americans voted for the taxes.
SELECTION BIAS:
Excluded those people NOT technically inclined!
What type of Bias?
• The United Pacifists of America decide to run a
poll to determine what Americans think about
guns and gun control. Jane is assigned the task
of setting up the study. To save mailing costs, she
includes the survey form in the group's newsletter
mailing. She is very pleased to find out that 95%
of those surveyed favor gun control laws and she
tells her friends that the vast majority of
Americans favor gun control laws.
SELECTION BIAS:
Pacifists will most likely favor gun control laws.
A large part of the population was left out!
What type of Bias?
• Large scale polls were taken in Florida, California, and
Maine and it was found that an average of 55% of those
polled spent at least fourteen days a year near the ocean.
So, it can be safely concluded that 55% of all Americans
spend at least fourteen days near the ocean each year.
Selection Bias:
The states chosen for the polls had easier
access to the ocean. A large part of the
population was left out.
Simple Random Sample.
• Good sampling technique.
• A simple random sample (SRS) of size n consists of n
individuals from the population chosen in such a way that
every set of n individuals has an equal chance to be
the sample actually selected.
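The phrase "every set of n individuals has an equal chance" can be made concrete by listing every possible sample from a tiny population. The five-letter population below is invented for illustration.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical population of 5
n = 2                                   # sample size

# Every possible set of n individuals that an SRS could produce.
possible_samples = list(combinations(population, n))
chance_each = 1 / len(possible_samples)

print(possible_samples)
print(f"{len(possible_samples)} possible samples, each with chance {chance_each}")
```

With 5 individuals and n = 2 there are 10 possible samples, and an SRS gives each of them the same 1-in-10 chance of being the sample selected.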
Let’s look at those M&M rectangles
again!
• Let’s randomly pick 10 rectangles.
• On the calculator, press MATH, choose PRB, then enter
randInt(1,100,10)
• Find the area of the ten rectangles that match the
numbers you generated.
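For anyone without a calculator, the same draw can be sketched in Python. One caveat worth knowing: a command like randInt(1,100,10) draws each number independently, so it can return the same number twice. The sketch below shows both that behavior and a draw without repeats; the seed value is arbitrary.

```python
import random

rng = random.Random(2024)  # arbitrary seed so the sketch is repeatable

# Like randInt(1,100,10): ten independent draws, repeats are possible.
with_possible_repeats = [rng.randint(1, 100) for _ in range(10)]

# Ten distinct rectangle numbers: sampling without replacement.
distinct_rectangles = rng.sample(range(1, 101), 10)

print(with_possible_repeats)
print(sorted(distinct_rectangles))
```

If the goal is 10 different rectangles, the second form is the safer choice, since it never needs a redraw.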
Choosing an SRS…using a Random Number Chart
• Be specific about how you select.
Ex: I’m going to start with line 100 and pick
two digits going across the row. The number will
represent the group number I will sample.
• Indicate the stopping rule.
Ex: I will stop this process when I have selected 10
groups.
• Tell whether you sample with or without replacement.
Ex: I do not want to repeat numbers because I need 10
distinct groups; therefore, I will sample without
replacement.
• Use labels to identify subjects selected to be in the sample.
Ex:
How to Choose an SRS
Sampling and Surveys
Definition:
A table of random digits is a long string of the digits 0, 1, 2, 3,
4, 5, 6, 7, 8, 9 with these properties:
• Each entry in the table is equally likely to be any of the 10
digits 0 - 9.
• The entries are independent of each other. That is,
knowledge of one part of the table gives no information about
any other part.
How to Choose an SRS Using Table D
Step 1: Label. Give each member of the population a
numerical label of the same length.
Step 2: Table. Read consecutive groups of digits of the
appropriate length from Table D.
Your sample contains the individuals whose labels you
find.
We are planning an article on family-friendly places to stay
over spring break at a nearby beach town. The editors
intend to call 4 randomly chosen hotels to ask about their
amenities for families with children. They have an
alphabetized list of all 28 hotels in the town.
01 Aloha Kai      08 Captiva        15 Palm Tree    22 Sea Shell
02 Anchor Down    09 Casa del Mar   16 Radisson     23 Silver Beach
03 Banana Bay     10 Coconuts       17 Ramada       24 Sunset Beach
04 Banyan Tree    11 Diplomat       18 Sandpiper    25 Tradewinds
05 Beach Castle   12 Holiday Inn    19 Sea Castle   26 Tropical Breeze
06 Best Western   13 Lime Tree      20 Sea Club     27 Tropical Shores
07 Cabana         14 Outrigger      21 Sea Grape    28 Veranda
69051 64817 87174 09517 84534 06489
87201 97245 88221 22356 77183 88725
69051 64817 87174 09517 84534 06489
87201 97245
Read two digits at a time, skipping any label outside 01 to 28
and any label already chosen:
69 05 16 48 17 87 17 40 95 17 84 53 40 64 89 87 20
Our SRS of 4 hotels for the editors to contact is: 05 Beach
Castle, 16 Radisson, 17 Ramada, and 20 Sea Club.
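The label-and-read procedure can be sketched in Python to check the hotel selection. The function name `srs_from_digit_table` and its parameters are hypothetical; the digits are the ones read in the example, and the rule applied is the usual one of skipping labels outside the population and labels already chosen.

```python
def srs_from_digit_table(digits, population_size, sample_size, label_width=2):
    """Read consecutive groups of `label_width` digits and keep the first
    `sample_size` distinct labels between 1 and `population_size`."""
    stream = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for start in range(0, len(stream) - label_width + 1, label_width):
        label = int(stream[start:start + label_width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:  # stopping rule
                break
    return chosen  # may be shorter than sample_size if the digits run out

table_line = "69051 64817 87174 09517 84534 06489 87201 97245"
hotels = srs_from_digit_table(table_line, population_size=28, sample_size=4)
print(hotels)  # [5, 16, 17, 20]
```

The labels 05, 16, 17, and 20 match Beach Castle, Radisson, Ramada, and Sea Club. Label 17 shows up three times in the digits but is counted only once, which is sampling without replacement.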
Activity – Table T5
• Which rectangle would you choose if you used the
random number table? (Table D – back of the book)
• How would you do it?
Homework
• Page 226 (1-11) odd
• Worksheet – Bias