day7 - University of South Carolina

STAT 110 - Section 5
Lecture 7
Professor Hao Wang
University of South Carolina
Spring 2012
Last time: Picturing Bias and Variability
Last time: Margin of Error
The CNN Poll interviewed 1000 people. The
approval rating was 57%. What is the margin of
error for 95% confidence (using the quick
formula)?
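Answer: Recall that for 95% confidence the quick formula is
MOE = 1/square root of n, so here
MOE = 1/square root of 1000 = 0.0316 = 3.16%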
Margin of Error (continued)
Confidence Interval
Use MOE to calculate an interval that we think includes
the parameter
Form for most confidence intervals:
estimate ± margin of error
Approximate (because we’re using the quick MOE)
95% confidence interval for p:
p-hat ± 1/square root of n, where p-hat is the sample proportion
Confidence Statements
A confidence statement interprets a confidence
interval and has two parts: a margin of error and
a level of confidence.
Margin of error says how close the statistic lies to
the parameter.
Level of confidence says what percentage of all
possible samples result in a confidence interval
which contains the true parameter
Example: President Bush
Pre 9/11: 57% with MOE 3%
Post 9/11: 90% with MOE 3%
Interpretations
– We are 95% confident that the percent of all
Americans who approved of the job President Bush
was doing was between 54% and 60% before
9/11.
– We are 95% confident that the percent of all
Americans who approved of the job President Bush
was doing was between 87% and 93% after 9/11.
Example: College Education
A May 2011 survey found that 57% of the 2142
adult Americans polled think that “the higher
education system in the United States fails to
provide students good value for the money
they and their families spend.” Using the quick
formula for MOE, compute a 95% confidence
interval for p.
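The calculation asked for here can be sketched in a few lines of Python. This is only an illustration of the quick formula (p-hat ± 1/square root of n); the function name quick_ci is made up for this example and is not from the lecture.

```python
import math

def quick_ci(p_hat, n):
    """Approximate 95% confidence interval: p_hat +/- 1/sqrt(n)."""
    moe = 1 / math.sqrt(n)  # quick margin of error for 95% confidence
    return p_hat - moe, p_hat + moe

# College education survey: 57% of 2142 adults
low, high = quick_ci(0.57, 2142)
print(f"{low:.1%} to {high:.1%}")  # roughly 54.8% to 59.2%
```

In words: we are 95% confident that between about 54.8% and 59.2% of all adult Americans hold this view.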
Example: Coke or Pepsi
Suppose you take a sample of 1231 people and ask
them if they prefer Coke over Pepsi. You find that 696
say they do.
What is p-hat, the observed percent in the sample?
A. .725 = 72.5%
B. .565 = 56.5%
C. .029 = 2.9%
D. .038 = 3.8%
Example Coke Or Pepsi continued
Suppose you take a sample of 1231 people and ask
them if they prefer Coke over Pepsi. You find that 696
say they do.
What is the margin of error for 95% confidence?
A. square root of 1231 = 35.06 = 35.06%
B. square root of 696 = 26.38 = 26.38%
C. 1/square root of 1231 = 0.0285 = 2.85%
D. 1/square root of 696 = 0.0379 = 3.79%
Hints for Interpretation
The conclusion of a confidence statement
always applies to the population, not to the
sample.
Our conclusion about the population is never
completely certain.
If you want a smaller margin of error with the
same confidence, take a larger sample.
Hints for Interpretation
It is very common to report the margin of error for
95% confidence.
– If the level of confidence is not mentioned, assume
95% confidence.
Can choose to use a confidence level other than
95%.
– Other popular levels: 80%, 90%, 99%
– For a fixed sample size, if you increase the level of
confidence, your interval will become wider.
– For a fixed confidence level, if you increase
sample size, your interval will become narrower.
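The effect of sample size can be checked numerically with the quick formula. The sketch below covers only the 95% case, since the quick formula does not handle other confidence levels; the sample sizes are arbitrary values picked for illustration.

```python
import math

def quick_moe(n):
    """Quick margin of error for 95% confidence: 1/sqrt(n)."""
    return 1 / math.sqrt(n)

# Quadrupling the sample size cuts the margin of error in half.
for n in (100, 400, 1600):
    moe = quick_moe(n)
    print(n, f"MOE = {moe:.3f}", f"interval width = {2 * moe:.3f}")
```

Note that halving the margin of error takes four times as many people, not twice as many.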
Population Size Doesn’t Matter
The variability of a statistic from a SRS does not
depend on the size of the population as long
as the population is at least 100 times larger
than the sample.
Example: Population Size Doesn’t Matter
Suppose we take a sample of size 1000 from a
population of 4,000,000 (e.g., South Carolina).
Then we take a sample of 1000 from a population of
300,000,000 (e.g., the whole US). Which sample
statistic would have more variability (i.e., MOE) ?
A. The one from 4,000,000
B. The one from 300,000,000
C. They are the same.
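One way to see why is to compute the standard error of a sample proportion with the finite population correction. That formula is a standard result but is not given in these slides, and p = 0.5 is an assumed value used only for illustration.

```python
import math

def se_proportion(p, n, N):
    """Standard error of a sample proportion from an SRS of size n
    drawn without replacement from a population of size N."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return math.sqrt(p * (1 - p) / n) * fpc

se_sc = se_proportion(0.5, 1000, 4_000_000)    # South Carolina
se_us = se_proportion(0.5, 1000, 300_000_000)  # whole US
print(round(se_sc, 4), round(se_us, 4))
```

Both values round to the same number because the correction factor is almost exactly 1 whenever the population is much larger than the sample.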
Chapter 4 – Sample Surveys in the Real World
Types of errors:
1. Sampling Errors
a. Random Sampling Error
b. Bad Sampling Methods
2. Non-sampling Errors
a. Processing errors
b. Poorly worded questions
c. Response error
d. Non-Response
Chapter 4 – Sample Surveys in the Real World
sampling errors – errors caused by the act of
taking a sample
They cause sample results to be different from
the results of a census.
sampling frame – a list of individuals from which we
will draw our sample
• should list every individual in the population
Errors in Sampling
random sampling error – results from chance
selection in the simple
random sample
• MOE lets us calculate how serious the error is.
• The error is due to chance – always present. A
large sample helps control this.
• MOE includes only random sampling error.
• Most sample surveys are afflicted with errors other
than random sampling errors.
Errors in Sampling
Bad sampling method – using a convenience sample or a
voluntary response sample
is also a form of sampling error.
Voluntary sample
Convenience sample
undercoverage – occurs when some groups in the
population are left out of the
process of choosing the sample
nonsampling errors – errors not related to the act
of selecting a sample from
the population
• can even be present in a census
• nonresponse (missing data)
• response errors
• processing errors
• effects of data collection procedure
Example
The subject lies about past drug use.
A. Sampling Error: Bad Sampling Method
B. Non Sampling Error: Response Error
C. Non Sampling Error: Non Response Error
D. Non Sampling Error: Processing Error
Example
The subject cannot be contacted after five calls.
A. Sampling Error: Bad Sampling Method
B. Non Sampling Error: Response Error
C. Non Sampling Error: Non Response Error
D. Non Sampling Error: Processing Error
Example
Interviewers choose people on the street
to interview.
A. Sampling Error: Bad Sampling Method
B. Non Sampling Error: Response Error
C. Non Sampling Error: Non Response Error
D. Non Sampling Error: Processing Error
Consider Wording
Be aware that the wording of a question
influences the answers.
Examples:
Is our government providing too much money for
welfare programs?
– 44% said “yes”
Is our government providing too much money
for assistance to the poor?
– 13% said “yes”
More Complex Sample Designs
• Sometimes a strict simple random sample is
difficult to obtain.
- Multistage Sampling Design
- Cluster Sampling
- Systematic Sampling
- Stratified Random Sampling
Stratified Random Sample
• Step 1: Divide the sampling frame into distinct
groups of individuals, called strata.
– Choose strata because you have an interest in
the groups or because the individuals within each
group are similar.
– Example: graduate/undergraduate students
• Step 2: Take a separate SRS in each stratum and
combine these to make up the complete sample.
Stratified Random Sample. A club has 25 student
members and 10 faculty members. The club can send
4 students and 2 faculty members to a convention.
Students
01 Barrett   06 Frazier    11 Hu           16 Liu       21 Ren
02 Brady     07 Gibellato  12 Jimenez      17 Marin     22 Santos
03 Chen      08 Gulati     13 Katsaounis   18 Nemeth    23 Sroka
04 Draper    09 Han        14 Kim          19 O’Rourke  24 Tordoff
05 Duncan    10 Hostetler  15 Kohlschmidt  20 Paul      25 Wang
Faculty
0 Berliner   2 Dean     4 Goel  6 Moore  8 Stasney
1 Craigmile  3 Fligner  5 Lee   7 Pearl  9 Wolfe
Line 116: 14459 26056 31424 80371 65103 62253 50490 61181
Choose a Stratified RS of 4 Students, then of 2 Faculty
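A short Python sketch of this selection, using the digits of line 116 as given. It assumes one common convention that the slide does not spell out: read two-digit groups for the students (labels 01 to 25), skip groups that are out of range or repeated, then continue from where we stopped and read single digits for the faculty (labels 0 to 9). The helper pick_labels is made up for this example.

```python
def pick_labels(digits, width, valid, k):
    """Read `digits` in groups of `width` and keep the first k distinct
    labels that fall in `valid`. Returns (labels, digits consumed)."""
    chosen = []
    pos = 0
    while len(chosen) < k and pos + width <= len(digits):
        label = int(digits[pos:pos + width])
        pos += width
        if label in valid and label not in chosen:
            chosen.append(label)
    return chosen, pos

line_116 = "14459 26056 31424 80371 65103 62253 50490 61181".replace(" ", "")

# Stratum 1: students, two-digit labels 01-25
students, used = pick_labels(line_116, 2, range(1, 26), 4)
# Stratum 2: faculty, one-digit labels 0-9, continuing after the digits already used
faculty, _ = pick_labels(line_116[used:], 1, range(0, 10), 2)

print(students)  # [14, 3, 10, 22]
print(faculty)   # [5, 3]
```

Under that convention the sample is students Kim (14), Chen (03), Hostetler (10), and Santos (22), and faculty Lee (5) and Fligner (3). A different reading convention, such as starting the faculty on a new line of the table, would give different faculty members.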
Cluster Sampling
• In order to reduce costs in sampling, researchers
focus on efficiency by sampling from clusters
• Clusters are often formed by geographic location,
resulting in decreased travel costs for the research
company.
• Randomly sample clusters then survey everyone in
each cluster.
Cluster Sample - Divide population into clusters.
Select one or more clusters and include
everyone in those clusters in the sample.
• Example: SC has 46 counties. Select 5 counties
at random, use all households in each
selected county as sample.
• Example: USC has 5000 dorms. Select 100 dorms
at random, use all students in each selected
dorm as sample.
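The two steps of cluster sampling can be sketched in Python as follows. The county names and household IDs are made-up placeholder data, and cluster_sample is a hypothetical helper written for this illustration.

```python
import random

def cluster_sample(clusters, k, rng):
    """Pick k clusters at random, then keep every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), k)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical data: 46 counties, each with 3 household IDs
counties = {
    f"county_{i:02d}": [f"c{i:02d}_h{j}" for j in range(3)]
    for i in range(1, 47)
}

rng = random.Random(0)  # fixed seed so the run is repeatable
chosen, households = cluster_sample(counties, 5, rng)
print(chosen)
print(len(households))  # 5 counties x 3 households each = 15
```

The randomness is in which clusters are chosen; within a chosen cluster nobody is left out.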
• Want to find the opinions of US adults, but want
to save on time and money by randomly
selecting residences. All adults residing in a
sampled residence will be interviewed.
A. Stratified B. Cluster C. Both
• Want to find the opinions of US adults and
need to make sure that 3 specific religious
groups are represented. You sample 100
Christians, 100 Jews, and 100 Muslims.
A. Stratified B. Cluster C. Both
• Want to find the opinions of city dwelling US
adults and need to make sure that the east
and west coasts are represented. You send
5 interviewers to the east coast and 5 to the
west coast. On each coast, 5 city blocks are
chosen at random, and everyone living in a
chosen city block is interviewed.
A. Stratified B. Cluster C. Both
Questions to Ask Before You Believe a Poll
• Who carried out the survey?
• What was the population?
• How was the sample selected?
• How large was the sample?
• What was the margin of error?
• What was the response rate?
• How were the subjects contacted?
• When was the survey conducted?
• What questions were asked?
USC has 20,065 undergraduates and 7,423 graduate
students. In an effort to gauge the opinions of all
students on campus parking issues, a simple
random sample consisting of 201 undergraduates
and a simple random sample of 74 graduate
students are taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage
USC has 20,065 undergraduates and 7,423 graduate
students. In an effort to gauge the opinions of all
students on campus parking issues, a simple
random sample consisting of 201 undergraduates
is taken. This is an example of:
A – a cluster sample
B – a systematic sample
C – a stratified random sample
D – undercoverage