Chapter 21 notes - BetsyMcCall.net

Chapter 21
What Is a Confidence Interval?
Recall from previous chapters:
Parameter - fixed, unknown number that describes the population
Statistic - known value calculated from a sample
- a statistic is used to estimate a parameter
Sampling Variability - different samples from the same population may yield different values of the
sample statistic, and estimates from samples will be closer to the true values in the population if the
samples are larger
** The amount by which the proportion obtained from the sample (p̂) will differ from the true
population proportion (p) rarely exceeds the margin of error.
Sampling Distribution - tells what values a statistic takes and how often it takes those values in
repeated sampling. Sample proportions (p̂) from repeated sampling would have an approximately normal
distribution with a certain mean and standard deviation.
The Rule for Sample Proportions
If numerous simple random samples of size n are taken from the same population, the sample
proportions (p̂) from the various samples will have an approximately normal distribution. The mean
of the sample proportions will be p (the true population proportion).
The standard deviation will be:
$$\sqrt{\frac{p(1 - p)}{n}}$$
For the rule to be valid, two conditions must hold:
- Random sample
- ‘Large’ sample size
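As a small illustration of the rule, the sketch below evaluates the standard deviation formula for a few sample sizes. The value p = 0.5 and the sample sizes are hypothetical, chosen only to show how the spread of sample proportions shrinks as n grows; the function name is likewise an assumption.

```python
import math


def sd_of_sample_proportions(p, n):
    """Standard deviation of sample proportions: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical population proportion, chosen only for illustration.
p = 0.5
for n in (100, 400, 1600):
    print(n, round(sd_of_sample_proportions(p, n), 4))
```

Because n sits under a square root, quadrupling the sample size halves the standard deviation, which is one way to see why larger samples give estimates closer to the true proportion.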
Confidence Interval for a Population Proportion
An interval of values, computed from sample data, that is almost sure to cover the true
population proportion.
“We are ‘highly confident’ that the true population proportion is contained in the calculated
interval.”
Statistically (for a 95% C.I.): in repeated samples, 95% of the calculated confidence intervals
should contain the true proportion.
Formula for a 95% Confidence Interval for the Population Proportion (Empirical Rule)
$$\hat{p} \pm 2\sqrt{\frac{p(1 - p)}{n}}$$
But since we do not know the population proportion p, we will use the sample proportion p̂
in its place:
$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
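The repeated-sampling meaning of "95% confident" can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the true proportion (0.3), the sample size (200), the number of repetitions and the fixed seed are all hypothetical choices, and the function names are made up for this example. It draws many random samples, builds the interval p̂ ± 2·sqrt(p̂(1 − p̂)/n) from each, and reports the share of intervals that contain the true proportion.

```python
import math
import random


def ci95_proportion(successes, n):
    """Approximate 95% interval: p_hat +/- 2 * sqrt(p_hat(1 - p_hat) / n)."""
    p_hat = successes / n
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(p, n, repetitions, seed=0):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        # One simple random sample of size n from a population with proportion p.
        successes = sum(rng.random() < p for _ in range(n))
        low, high = ci95_proportion(successes, n)
        if low <= p <= high:
            hits += 1
    return hits / repetitions


# Hypothetical settings, for illustration only.
print(coverage(p=0.3, n=200, repetitions=2000))
```

The printed fraction should land close to 0.95, though not exactly on it, since both the normal approximation and the simulation itself carry some error.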
Formula for a C-level (%) Confidence Interval for the Population Proportion
$$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where z* is the critical value of the standard normal distribution for confidence level C
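A minimal sketch of this formula in code follows. The function name and the survey counts (520 "yes" answers out of 1000) are hypothetical; z_star is passed in by the caller, for example 1.96 for a 95% interval.

```python
import math


def proportion_ci(successes, n, z_star):
    """C-level interval for a proportion: p_hat +/- z* sqrt(p_hat(1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 520 of 1000 respondents answer "yes".
low, high = proportion_ci(successes=520, n=1000, z_star=1.96)
print(round(low, 3), round(high, 3))
```

A larger z* (higher confidence level) widens the interval, while a larger n narrows it.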
Common Values of z*
Confidence Level C    Critical Value z*
50%                   0.67
60%                   0.84
68%                   1
70%                   1.04
80%                   1.28
90%                   1.64
95%                   1.96 (or 2)
99%                   2.58
99.7%                 3
99.9%                 3.29
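Critical values like these can be computed rather than looked up. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the helper name critical_value is made up for this example. For confidence level C, z* is the point on the standard normal curve with area (1 + C)/2 to its left, so that the middle area between −z* and z* is C.

```python
from statistics import NormalDist


def critical_value(level):
    """z* such that the central area under the standard normal curve is `level`."""
    return NormalDist().inv_cdf((1 + level) / 2)


for level in (0.50, 0.60, 0.68, 0.70, 0.80, 0.90, 0.95, 0.99, 0.997, 0.999):
    print(f"{level:.1%}  {critical_value(level):.2f}")
```

The computed values for 68% and 99.7% come out slightly below 1 and 3 (about 0.99 and 2.97); the round numbers 1 and 3 are the usual Empirical Rule approximations.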
Inference for Population Means
Sampling Distribution, Confidence Intervals
The remainder of this chapter discusses the situation when interest is in making conclusions
about population means rather than population proportions. It
– includes the rule for the sampling distribution of sample means (x̄)
– includes confidence intervals for one mean or a difference in two means
The Rule for Sample Means
If numerous simple random samples of size n are taken from the same population, the sample
means (x̄) from the various samples will have an approximately normal distribution. The mean
of the sample means will be µ (the population mean). The standard deviation will be:
$$\frac{\sigma}{\sqrt{n}}$$
(σ is the population standard deviation.)
Conditions for the Rule for Sample Means:
- Random sample
- Population of measurements…
  – follows a bell-shaped curve
  - or -
  – is not bell-shaped, but the sample is ‘large’
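The second case (a population that is not bell-shaped, with a 'large' sample) can be explored by simulation. In the sketch below, the population is assumed to be uniform on the interval from 0 to 1, which is flat rather than bell-shaped and has µ = 0.5 and σ = sqrt(1/12). The sample size of 30, the number of repetitions and the seed are hypothetical choices, and the function name is made up for this example.

```python
import math
import random
import statistics


def simulate_sample_means(n, repetitions, seed=0):
    """Means of repeated simple random samples of size n from a uniform(0, 1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n)) / n for _ in range(repetitions)]


n = 30
means = simulate_sample_means(n, repetitions=5000)

sigma = math.sqrt(1 / 12)  # population s.d. of uniform(0, 1)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("s.d. of sample means:", round(statistics.stdev(means), 4))
print("sigma / sqrt(n):     ", round(sigma / math.sqrt(n), 4))
```

The mean of the simulated sample means should sit near 0.5, and their standard deviation should be close to σ/√n, as the rule predicts.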
But… we do not know the value of σ! So we use the Standard Error of the (Sample) Mean in its place:
$$\text{SEM} = \frac{s}{\sqrt{n}}$$
where s is the standard deviation from the sample and n is the sample size.
Formula for a C-level (%) Confidence Interval for the Population Mean
$$\bar{x} \pm z^{*}\frac{s}{\sqrt{n}}$$
where z* is the critical value of the standard normal distribution for confidence level C
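A minimal sketch of this interval computed from raw data follows. The 36 pulse readings are made up for illustration, and the function name mean_ci is an assumption; the sample standard deviation s comes from statistics.stdev, which divides by n − 1.

```python
import math
import statistics


def mean_ci(data, z_star):
    """C-level interval for a mean: x_bar +/- z* * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    sem = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    margin = z_star * sem
    return x_bar - margin, x_bar + margin


# Hypothetical resting pulse rates (bpm) for 36 people; not real data.
pulses = [60] * 9 + [64] * 9 + [68] * 9 + [72] * 9

low, high = mean_ci(pulses, z_star=1.96)
print(round(low, 2), round(high, 2))
```

As with proportions, the interval narrows as n grows because the standard error shrinks in proportion to 1/√n.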
Careful Interpretation of a Confidence Interval
“We are 95% confident that the mean resting pulse rate for the population of all exercisers is
between 62.8 and 69.2 bpm.” (That is, plausible values for the mean resting pulse rate of the
population of exercisers are between 62.8 and 69.2 bpm.)
** This does not mean that 95% of all people who exercise regularly will have resting pulse rates
between 62.8 and 69.2 bpm. **
Statistically: 95% of all samples of size 29 from the population of exercisers should yield a
sample mean within two standard errors of the population mean; i.e., in repeated samples, 95%
of the C.I.s should contain the true population mean.
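The warning about individuals can be made concrete by working backward from the reported interval. This is only a sketch under two assumptions that the notes do not state outright: that the interval was built with a multiplier of 2 (suggested by "within two standard errors"), and, for the last step only, that individual pulse rates are roughly bell-shaped around the interval's midpoint.

```python
import math
from statistics import NormalDist

low, high, n = 62.8, 69.2, 29  # interval and sample size from the notes
multiplier = 2                 # assumption: "two standard errors"

center = (low + high) / 2          # implied sample mean
margin = (high - low) / 2          # implied margin of error
sem = margin / multiplier          # implied standard error of the mean
implied_s = sem * math.sqrt(n)     # implied sample standard deviation

# Assumption: individual pulse rates are roughly normal with this mean and s.d.
individuals = NormalDist(mu=center, sigma=implied_s)
share_inside = individuals.cdf(high) - individuals.cdf(low)

print("implied sample mean:", round(center, 1))
print("implied s:", round(implied_s, 1))
print("share of individuals inside the interval:", round(share_inside, 2))
```

Under these assumptions the implied spread of individual pulse rates is much wider than the interval, so well under half of individual exercisers would fall inside it. The interval describes where the population mean plausibly lies, not where individual values lie.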