
STAT 110 - Section 5
Lecture 8
Professor Hao Wang
University of South Carolina
Spring 2012
Last time: The Beauty of Sampling
With proper sampling methods, based on a
sample of about 1000 adults we can almost
certainly estimate, to within 3% (i.e.,
MOE=3%), the percentage of the entire
population who have a certain trait or opinion.
This result does not depend on how large the
population is.
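The 3% figure is consistent with a common quick approximation for a 95% margin of error, MOE ≈ 1/√n. The slide does not state a formula, so treat the function below as an assumed rule of thumb rather than the exact method behind the claim. Note that the population size does not appear anywhere in the calculation.

```python
import math

def quick_moe(n):
    """Rough 95% margin of error for a sample proportion.

    Assumption: the quick method 1/sqrt(n); the slide gives no formula.
    """
    return 1 / math.sqrt(n)

# Only the sample size n matters; the population size is never used.
for n in (100, 400, 1000, 2500):
    print(n, f"{quick_moe(n):.1%}")
```

For n = 1000 this gives about 3.2%, in line with the "within 3%" statement above.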
Chapter 4 – Sample Surveys
in the Real World
Types of errors:
1. Sampling Errors
a. Random Sampling Error
b. Bad Sampling Methods
2. Non-sampling Errors
a. Processing errors
b. Poorly worded questions
c. Response error
d. Non-Response
sampling errors – errors caused by the act of
taking a sample
They cause sample results to be different from
the results of a census.
sampling frame – a list of individuals from which we
will draw our sample
• should list every individual in
the population
Errors in Sampling
random sampling error – results from chance
selection in the simple
random sample
• MOE lets us calculate how serious the error is.
• The error is due to chance – always present. A
large sample helps control this.
• MOE includes only random sampling error.
• Most sample surveys are afflicted with errors other
than random sampling errors.
Errors in Sampling
Bad sampling method – a convenience sample or a
voluntary response sample
is also a form of sampling error.
Voluntary sample
Convenience sample
undercoverage – occurs when some groups in the
population are left out of the
process of choosing the sample
• Using a telephone directory to survey the general population
• Problem: excludes those who move often,
those with unlisted home numbers, those
without a phone.
• Solution: use random digit dialing.
nonsampling errors – errors not related to the act
of selecting a sample from
the population
• can even be present in a census
• nonresponse (missing data)
• response errors
• processing errors
The subject lies about past drug use.
Sampling Error: Bad Sampling Method
Nonsampling Error: Response Error
Nonsampling Error: Nonresponse
Nonsampling Error: Processing Error
The subject cannot be contacted after five calls.
Sampling Error: Bad Sampling Method
Nonsampling Error: Response Error
Nonsampling Error: Nonresponse
Nonsampling Error: Processing Error
Interviewers choose people on the street
to interview.
Sampling Error: Bad Sampling Method
Nonsampling Error: Response Error
Nonsampling Error: Nonresponse
Nonsampling Error: Processing Error
Consider Wording
Be aware that the wording of a question
influences the answers.
Is our government providing too much money for
welfare programs?
44% said “yes”
Is our government providing too much money
for assistance to the poor?
13% said “yes”
More Complex Sample Designs
• Sometimes a strict simple random sample is
difficult to obtain.
- Multistage Sampling Design
- Cluster Sampling
- Systematic Sampling
- Stratified Random Sampling
• Stratified Random Sample
• Step 1: Divide the sampling frame into distinct
groups of individuals, called strata.
  – Choose strata because you have an interest in
    the groups or because the individuals within each
    group are similar.
  – Example: graduate/undergraduate students
• Step 2: Take a separate SRS in each stratum and
combine these to make up the complete sample.
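The two steps above can be sketched in a few lines. The rosters, stratum sizes, and seed below are hypothetical placeholders chosen only for illustration; the point is that each stratum gets its own SRS and the results are then combined.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS from each stratum and combine them.

    strata: dict mapping stratum name -> list of individuals (Step 1)
    sizes:  dict mapping stratum name -> number to draw from that stratum
    """
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # Step 2: an SRS within this stratum (sampling without replacement)
        sample.extend(rng.sample(members, sizes[name]))
    return sample

# Hypothetical sampling frame, already divided into two strata.
strata = {
    "undergraduate": [f"U{i:03d}" for i in range(1, 201)],
    "graduate": [f"G{i:03d}" for i in range(1, 75)],
}
sizes = {"undergraduate": 20, "graduate": 7}  # hypothetical sample sizes

sample = stratified_sample(strata, sizes, seed=8)
print(len(sample), "individuals selected")
```

Because the draw happens inside each stratum, every group is guaranteed to appear in the sample in the numbers requested, which a single SRS of the whole frame cannot promise.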
Stratified Random Sample. A club has 25 student
members and 10 faculty members. The club can send
4 students and 2 faculty members to a convention.
Students:
01 Barrett   06 Frazier     11 Hu            16 Liu        21 Ren
02 Brady     07 Gibellato   12 Jimenez       17 Marin      22 Santos
03 Chen      08 Gulati      13 Katsaounis    18 Nemeth     23 Sroka
04 Draper    09 Han         14 Kim           19 O’Rourke   24 Tordoff
05 Duncan    10 Hostetler   15 Kohlschmidt   20 Paul       25 Wang

Faculty:
0 Berliner    2 Dean      4 Goel   6 Moore   8 Stasney
1 Craigmile   3 Fligner   5 Lee    7 Pearl   9 Wolfe
Line 116: 14459 26056 31424 80371 65103 62253 50490 61181
Choose a stratified random sample of 4 students, then of 2 faculty.
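One way to work this exercise is sketched below. It reads two-digit groups from line 116 for the students (labels 01 to 25), skipping groups outside that range and any repeats, then reads single digits for the faculty (labels 0 to 9). Continuing from the point where the student selection stopped is an assumption about the intended convention; restarting elsewhere in the table would give a different but equally valid sample. The helper name `pick_labels` is made up for this illustration.

```python
TABLE_LINE_116 = "14459 26056 31424 80371 65103 62253 50490 61181"

def pick_labels(digits, start, width, valid, count):
    """Read fixed-width groups of digits from position `start`.

    Keeps a group only if it is a valid label not already chosen.
    Returns the chosen labels and the position where reading stopped.
    """
    chosen = []
    pos = start
    while len(chosen) < count and pos + width <= len(digits):
        label = int(digits[pos:pos + width])
        pos += width
        if label in valid and label not in chosen:
            chosen.append(label)
    return chosen, pos

digits = TABLE_LINE_116.replace(" ", "")

# Students carry two-digit labels 01-25; faculty carry one-digit labels 0-9.
students, next_pos = pick_labels(digits, 0, 2, range(1, 26), 4)
# Assumption: keep reading where the student selection left off.
faculty, _ = pick_labels(digits, next_pos, 1, range(0, 10), 2)

print("students:", students)
print("faculty:", faculty)
```

Under these assumptions the sketch selects students 14, 03, 10, and 22 (Kim, Chen, Hostetler, Santos) and faculty 5 and 3 (Lee, Fligner).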
Cluster Sampling
• To reduce sampling costs, researchers
focus on efficiency by sampling from clusters.
• Clusters are often formed by geographic location,
resulting in decreased travel costs for the researchers.
• Randomly sample clusters, then survey everyone in
each cluster.
Cluster Sample - Divide population into clusters.
Select one or more clusters and include
everyone in those clusters in the sample.
• Example: SC has 46 counties. Select 5 counties
at random and use all households in each
selected county as the sample.
• Example: USC has 30 dorms, and each dorm has 6
floors; the 180 floors form the clusters. Take a random
sample of floors and measure everyone on those floors.
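The dorm-floor example can be sketched as follows. The number of floors drawn (10) and the seed are hypothetical choices, since the slide does not say how many clusters to sample.

```python
import random

# 30 dorms x 6 floors = 180 clusters, each labeled (dorm, floor).
floors = [(dorm, floor) for dorm in range(1, 31) for floor in range(1, 7)]

def sample_clusters(clusters, k, seed=None):
    """Choose k whole clusters at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(clusters, k)

# Hypothetical: draw 10 floors; everyone living on a chosen floor is surveyed.
chosen_floors = sample_clusters(floors, k=10, seed=2012)
print(len(floors), "clusters in total;", len(chosen_floors), "selected")
```

The randomness applies to the clusters, not to individuals: once a floor is selected, every resident of that floor is in the sample.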
Want to find the opinions of US adults, but want
to save on time and money by randomly
selecting residences. All adults residing in a
sampled residence will be interviewed.
A. Stratified B. Cluster C. Both
• Want to find the opinions of US adults and
need to make sure that 3 specific religious
groups are represented. You sample 100
Christians, 100 Jews, and 100 Muslims.
A. Stratified B. Cluster C. Both
• Want to find the opinions of city-dwelling US
adults and need to make sure that the east
and west coasts are represented. You send
5 interviewers to the east coast and 5 to the
west coast. On the west coast, 5 city blocks are
chosen at random, and everyone living in a chosen
city block is interviewed (similarly for the east
coast).
A. Stratified B. Cluster C. Both
Questions to Ask Before You Believe a Poll
• Who carried out the survey?
• What was the population?
• How was the sample selected?
• How large was the sample?
• What was the margin of error?
• What was the response rate?
• How were the subjects contacted?
• When was the survey conducted?
• What questions were asked?
USC has 20,065 undergraduates and 7,423 graduate
students. In an effort to gauge the opinions of all
students on campus parking issues, a simple
random sample consisting of 201 undergraduates
and a simple random sample of 74 graduate
students are taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage
USC has 20,065 undergraduates and 7,423 graduate
students. In an effort to gauge the opinions of all
students on campus parking issues, a simple
random sample consisting of 201 undergraduates
is taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage