Sections 7.1 and 7.2
This chapter presents the beginning
of inferential statistics.
The two major applications of inferential statistics:
- Estimate a population parameter: proportion, mean.
- Test some claim (or hypothesis) about a population.
Point estimate: a single number
Interval estimate: interval of numbers.
Confidence Interval
Why? A point estimate changes from sample to sample and, by itself, says nothing about how close it is to the true value.
- A confidence interval (CI): an interval of values used to estimate the true population parameter.
Point Estimate
p = population proportion
$\hat{p} = x/n$ (pronounced 'p-hat') = sample proportion of x successes in a sample of size n.
$\hat{p}$ is an unbiased estimate (the best point estimate) of p.
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size n.
Example: Photo-Cop Survey Responses
829 adult Minnesotans were surveyed, and 51% of them are opposed to the use of the photo-cop for issuing traffic tickets. Using these survey results, find the best estimate of the proportion of all adult Minnesotans opposed to photo-cop use.
Best point estimate = sample proportion = 51%.
Confidence Level
α: a number between 0 and 1.
- A confidence level: 1 − α, or 100(1 − α)%. E.g. 95%. This is the proportion of times that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times.
- Other names: degree of confidence or the confidence coefficient.
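The "repeated a large number of times" interpretation can be illustrated with a small simulation. This is only a sketch: the true proportion, sample size, number of repetitions, and seed are arbitrary illustrative choices, and the interval used is the large-sample interval for a proportion that these notes develop in the following slides.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(p, n, conf_level=0.95, repetitions=2000, seed=0):
    """Fraction of simulated confidence intervals that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # critical value
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        # Does this interval capture the true proportion?
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / repetitions


# Illustrative values only (assumptions, not from the slides).
rate = coverage_rate(p=0.5, n=400)
print(f"Observed coverage: {rate:.3f}")
```

The observed coverage should land close to the nominal 95%, though not exactly on it, because the simulation itself is random and the interval is an approximation.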
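The Critical Value (z-score)
Given α, the critical value $z_{\alpha/2}$ is the z-score that leaves an area of α/2 in the upper tail of the standard normal distribution.
Finding $z_{\alpha/2}$ for a 100(1 − α)% confidence level: for 95% confidence, α = 5%, so α/2 = 2.5% = 0.025 and $z_{0.025} = 1.96$.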
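As a rough check on table lookups, the critical value can be computed from the inverse CDF of the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is made up for illustration.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Return z_{alpha/2}: the z-score with area alpha/2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"{level:.0%} confidence: z = {critical_value(alpha):.3f}")
```

For 90%, 95%, and 99% confidence this gives approximately 1.645, 1.960, and 2.576.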
Sampling Distribution of $\hat{p}$
The sampling distribution of the sample proportion can be approximated by a normal distribution if np ≥ 15 and nq ≥ 15: $\hat{p}$ is approximately $N(p, pq/n)$, where q = 1 − p.
$z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}\hat{q}/n}}$
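The rule of thumb above is easy to check before relying on the normal approximation. A minimal sketch follows; the function name is hypothetical, and because the true p is unknown in practice, plugging in the sample proportion (as done here with the photo-cop figures) is an assumption.

```python
def normal_approx_ok(n, p, threshold=15):
    """Check the rule of thumb n*p >= threshold and n*q >= threshold."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


# Photo-cop survey: n = 829, sample proportion 0.51 used in place of p.
print(normal_approx_ok(n=829, p=0.51))
# A small sample with a rare outcome (illustrative numbers) fails the check.
print(normal_approx_ok(n=50, p=0.10))
```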
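Margin of Error of $\hat{p}$
The margin of error E is the maximum likely (with probability 1 − α) difference between the observed proportion $\hat{p}$ and the true population proportion p.
$E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
Standard error of $\hat{p}$: $se = \sqrt{\hat{p}\hat{q}/n}$, so $E = z_{\alpha/2}\,(se)$.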
Finding the 95% Confidence Interval for a Population Proportion
A 95% confidence interval for a population proportion p is:
$\hat{p} \pm 1.96\,(se)$, with $se = \sqrt{\hat{p}(1 - \hat{p})/n}$
A 100(1 − α)% confidence interval for p is:
$\hat{p} \pm z_{\alpha/2}\,(se)$, with $se = \sqrt{\hat{p}(1 - \hat{p})/n}$
Example: Would You Pay Higher
Prices to Protect the Environment?
In 2000, the GSS asked: “Are you willing to pay much higher prices in order to protect the environment?”
Of n = 1154 respondents, 518 were willing to do so.
Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey.
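Solution:
$\hat{p} = 518/1154 \approx 0.45$
$se = \sqrt{(0.45)(0.55)/1154} \approx 0.015$
$E = 1.96\,(se) = 1.96(0.015) \approx 0.03$
$\hat{p} \pm E = 0.45 \pm 0.03 = (0.42, 0.48)$
Interpretation: we can be 95% confident that the proportion of adult Americans willing to pay much higher prices to protect the environment was between 0.42 and 0.48 at the time of the survey.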
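The calculation above can be reproduced in a few lines. This is a sketch of the large-sample interval $\hat{p} \pm z_{\alpha/2}(se)$ only; the function name `proportion_ci` is made up for illustration, and the code carries full precision rather than the rounded intermediate values shown on the slide.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, conf_level=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n                                # point estimate
    se = sqrt(p_hat * (1 - p_hat) / n)           # standard error
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # critical value
    margin = z * se                              # margin of error E
    return p_hat - margin, p_hat + margin


# GSS example: 518 of 1154 respondents were willing to pay higher prices.
lower, upper = proportion_ci(x=518, n=1154)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the limits agree with the interval (0.42, 0.48) obtained by hand.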
What is the Error Probability for the
Confidence Interval Method?
Summary: Effects of Confidence Level
and Sample Size on Margin of Error
The margin of error for a confidence interval:
- Increases as the confidence level increases
- Decreases as the sample size increases
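Both effects can be seen numerically by holding one quantity fixed and varying the other. The values below ($\hat{p}$ = 0.45, and the particular sample sizes and confidence levels) are illustrative assumptions, and `margin_of_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, conf_level):
    """E = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.45  # illustrative sample proportion

# Fixed sample size, increasing confidence level: E grows.
for conf_level in (0.90, 0.95, 0.99):
    print(f"n=400, level={conf_level:.0%}: E = {margin_of_error(p_hat, 400, conf_level):.4f}")

# Fixed confidence level, increasing sample size: E shrinks.
for n in (100, 400, 1600):
    print(f"n={n}, level=95%: E = {margin_of_error(p_hat, n, 0.95):.4f}")
```

Because n appears under a square root, quadrupling the sample size only halves the margin of error.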
Determining Sample Size
Recall: $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
(solve for n by algebra)
$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$
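The algebra referred to above can be written out step by step. This is one possible way to arrange it, using the same symbols as the margin of error formula:

```latex
\begin{align*}
E &= z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \\
E^2 &= (z_{\alpha/2})^2\,\frac{\hat{p}\hat{q}}{n} && \text{(square both sides)} \\
n\,E^2 &= (z_{\alpha/2})^2\,\hat{p}\hat{q} && \text{(multiply both sides by } n\text{)} \\
n &= \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2} && \text{(divide both sides by } E^2\text{)}
\end{align*}
```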
Sample Size for Estimating
Proportion p
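When an estimate $\hat{p}$ of p is known:
$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$
When no estimate of p is known:
$n = \dfrac{(z_{\alpha/2})^2\,(0.25)}{E^2}$
Here 0.25 is the largest possible value of $\hat{p}\hat{q}$, reached when $\hat{p} = 0.5$. In both cases, round n up to the next whole number.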
Example: Suppose a sociologist wants to determine the current
percentage of U.S. households using e-mail. How many
households must be surveyed in order to be 95% confident that the
sample percentage is in error by no more than four percentage
points?
a) Use this result from an earlier study: In 1997, 16.9% of U.S.
households used e-mail (based on data from The World Almanac
and Book of Facts).
b) Assume that we have no prior information suggesting a possible
value of p.
a) Use this result from an earlier study: In 1997, 16.9% of U.S.
households used e-mail (based on data from The World Almanac
and Book of Facts).
$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2} = \dfrac{(1.96)^2 (0.169)(0.831)}{(0.04)^2} = 337.194$
Rounded up: n = 338 households.
To be 95% confident that our
sample percentage is within
four percentage points of the
true percentage for all
households, we should
randomly select and survey
338 households.
b) Assume that we have no prior information suggesting a possible
value of p.
$n = \dfrac{(z_{\alpha/2})^2\,(0.25)}{E^2} = \dfrac{(1.96)^2 (0.25)}{(0.04)^2} = 600.25$
Rounded up: n = 601 households.
With no prior information,
we need a larger sample to
achieve the same 95% confidence
and a margin of error of no more
than four percentage points.
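Both parts of the example can be checked with a short function. This is a sketch under the formulas above; `required_sample_size` is a hypothetical name, and the critical value is computed at full precision rather than rounded to 1.96, which does not change either answer here.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, conf_level=0.95, p_hat=None):
    """Sample size needed to estimate a proportion within +/- margin.

    If no prior estimate p_hat is given, use the worst case p*q = 0.25.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Always round up: rounding down would miss the target margin.
    return ceil(z**2 * pq / margin**2)


# a) Prior estimate from the 1997 figure: 16.9% of households.
print(required_sample_size(0.04, p_hat=0.169))
# b) No prior estimate of p.
print(required_sample_size(0.04))
```

The two calls give 338 and 601 households, matching parts a) and b).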
Finding the Point Estimate and E from a Confidence Interval
Point estimate of p:
$\hat{p} = \dfrac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
Margin of error:
$E = \dfrac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
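As a quick illustration, the two formulas above recover the point estimate and margin of error from the interval (0.42, 0.48) found in the environment example. The function name is made up for this sketch.

```python
def estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from CI limits."""
    p_hat = (upper + lower) / 2   # midpoint of the interval
    margin = (upper - lower) / 2  # half the interval width
    return p_hat, margin


p_hat, margin = estimate_and_margin(0.42, 0.48)
print(f"p_hat = {p_hat:.2f}, E = {margin:.2f}")
```

This returns the values 0.45 and 0.03 used to build that interval.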