Confidence Intervals

Point Estimation and Confidence Intervals
A point estimate is the value of a single statistic (e.g. the mean), while a confidence
interval is a range of numbers constructed around the point estimate. As we’ve
previously noted, population parameters, such as the mean µ or variance σ², are
estimated by sample statistics, such as x̄ or s², respectively. A point estimate on its
own says nothing about how far it may lie from the population parameter, so it is
less informative than an estimated range of values. For example, a point estimate of
the mean does not provide as much information as a confidence interval around
the mean.
Confidence interval estimation provides a range of values based on the results from
one sample. Essentially, confidence intervals provide information about how close
the estimate is to the unknown population parameter and are stated in terms of
probability. The two components of a confidence interval are 1) the interval
estimate itself, and 2) the level of confidence that the parameter falls somewhere
within the estimated interval. Confidence interval estimation results in statements
such as “Based on the results from our sample, I am 95% confident that µ is between
58 and 63”. Factors that affect the width of the confidence interval are 1) the variance
in the sample, 2) the size of the sample, and 3) the degree of confidence required for
the estimate. The interval widens as the variance or the required confidence
increases, and narrows as the sample size increases. Typical levels of confidence are
90%, 95%, and 99%, denoted as (1-α) = level of confidence.
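The effect of the three factors on the width can be illustrated with a minimal Python sketch. It assumes a z-based interval for the mean, and the function names and input values (standard deviation 10, sample size 100) are made up for illustration; they do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_(alpha/2) for a confidence level (1 - alpha)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def interval_width(std_dev, n, confidence):
    """Full width of a z-based interval for the mean: 2 * z * std_dev / sqrt(n)."""
    return 2.0 * z_critical(confidence) * std_dev / sqrt(n)


# Illustrative (hypothetical) inputs, changed one factor at a time.
base = interval_width(10.0, 100, 0.95)
more_spread = interval_width(20.0, 100, 0.95)     # larger variance -> wider
bigger_sample = interval_width(10.0, 400, 0.95)   # larger sample -> narrower
more_confident = interval_width(10.0, 100, 0.99)  # higher confidence -> wider

for label, width in [("base", base), ("more spread", more_spread),
                     ("bigger sample", bigger_sample),
                     ("more confident", more_confident)]:
    print(f"{label}: {width:.2f}")
```

Doubling the standard deviation doubles the width, while quadrupling the sample size only halves it, because the sample size enters through its square root.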
Calculating the Confidence Interval

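When the population standard deviation is known, which is rarely the case,
the formula is:
\[ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]
where x̄ is the sample mean, µ is the population mean, zα/2 is the value of z
for the level of confidence desired, α is one minus the confidence level
(so a 95% confidence level means α = .05), n is the sample size, and σ/√n
is the standard error of the mean.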
Suppose that we wanted to calculate the 95% confidence interval for the
mean approval rating of President Obama in the population where
x̄ = 56%, σ = 12.1, and the sample size was 500. The value for z is
obtained from the Z Table: for a confidence level of .95, the area between
the mean and z is .95/2 = .475, which gives z = 1.96.
\[ 56 - \left(1.96 \cdot \frac{12.1}{\sqrt{500}}\right) \le \mu \le 56 + \left(1.96 \cdot \frac{12.1}{\sqrt{500}}\right) \]
\[ \mu = 56 \pm 1.06 \]
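The arithmetic of this example can be checked with a short Python sketch. The function name is hypothetical, and the critical value is computed with the standard library's `statistics.NormalDist` instead of being read from the Z Table, so it is slightly more precise than 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval for the mean when the population standard deviation is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)                 # z times the standard error
    return sample_mean - margin, sample_mean + margin


# Values from the approval-rating example: mean 56, sigma 12.1, n = 500.
lower, upper = z_confidence_interval(56.0, 12.1, 500)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

The margin works out to roughly 1.06, in line with the hand calculation.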

When the population standard deviation is unknown, the formula is below.
Note that instead of the Z Table the t Table (partially reproduced below) is
used in the calculations.
\[ \bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \]
where x̄ is the sample mean, µ is the population mean, tα/2, n-1 is the value
of t for the confidence level desired with n-1 degrees of freedom, α is one
minus the confidence level, s is the sample standard deviation, and n is the
sample size.
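Suppose that we wanted to calculate the 95% confidence interval for the
mean approval rating of President Obama in the population where
x̄ = 56%, s = 65.3, and the sample size was 1,025. With n-1 = 1,024 degrees
of freedom, the t value is essentially the same as the z value of 1.96 (the
∞ row of the t Table).
\[ 56 - \left(1.96 \cdot \frac{65.3}{\sqrt{1025}}\right) \le \mu \le 56 + \left(1.96 \cdot \frac{65.3}{\sqrt{1025}}\right) \]
\[ \mu = 56 \pm 4.0 \]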

Partial t Table: 95% confidence level (right-tail probability .025).
Other columns and rows are omitted (…).

df      t.025
1       12.706
2       4.303
3       3.182
…       …
10      2.228
…       …
20      2.086
…       …
30      2.042
…       …
60      2.000
80      1.990
100     1.984
∞       1.960

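The table lookup and the t-based interval can be sketched in Python as follows. The helper names are hypothetical, and the lookup rule (fall back to the largest tabulated df that does not exceed the actual df) is an assumption chosen because it errs on the wide side; it is not a rule stated in the text. Only the t.025 values from the partial table are used.

```python
from math import inf, sqrt

# Two-sided 95% critical values (t.025) copied from the partial table.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 10: 2.228, 20: 2.086,
         30: 2.042, 60: 2.000, 80: 1.990, 100: 1.984, inf: 1.960}


def t_critical_95(df):
    """Hypothetical helper: t.025 for the largest tabulated df not exceeding df.

    Rounding df down to a tabulated row gives a slightly larger t, so the
    resulting interval errs on the wide side.
    """
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    return T_025[max(d for d in T_025 if d <= df)]


def t_confidence_interval(sample_mean, s, n, t_value=None):
    """95% interval for the mean when the population standard deviation is unknown."""
    if t_value is None:
        t_value = t_critical_95(n - 1)  # n - 1 degrees of freedom
    margin = t_value * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# The worked example uses 1.96 (the infinity row) because df = 1,024 is large.
lower, upper = t_confidence_interval(56.0, 65.3, 1025, t_value=T_025[inf])
print(f"{lower:.1f} <= mu <= {upper:.1f}")

# Table lookup instead: df = 1,024 falls back to the df = 100 row (1.984).
lower_tab, upper_tab = t_confidence_interval(56.0, 65.3, 1025)
print(f"{lower_tab:.2f} <= mu <= {upper_tab:.2f}")
```

The second interval is slightly wider than the first because the fallback row has a larger critical value; a full t table or statistical software would give a value between the two.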
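When calculating the confidence interval for proportions, where there’s a
dichotomous categorical outcome, the equations are somewhat different. It’s
assumed that the population follows the binomial distribution, which the normal
distribution approximates when the number of trials is large. The formula for this
situation is:
\[ p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]
where p is the sample proportion of one group, (1-p) is the proportion of the other
group, n is the sample size, and π is the population proportion being estimated.
For example, if in a random sample of 300 voters 120 preferred candidate X, what
is the 95% confidence interval for the proportion of voters who prefer candidate X?
Here p = 120/300 = .4.
\[ .4 - 1.96\sqrt{\frac{(.4)(.6)}{300}} \le \pi \le .4 + 1.96\sqrt{\frac{(.4)(.6)}{300}} \]
\[ \pi = .4 \pm .055 \]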
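The voter example can be reproduced with a minimal Python sketch. The function name is hypothetical, and the sketch assumes the normal approximation described in this section, with the critical value taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p = successes / n                            # sample proportion
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z * sqrt(p * (1.0 - p) / n)         # z times the standard error
    return p - margin, p + margin


# 120 of 300 sampled voters preferred candidate X.
lower, upper = proportion_confidence_interval(120, 300)
print(f"{lower:.3f} <= proportion <= {upper:.3f}")
```

The half-width comes out at about .055, matching the hand calculation.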
Other formulas exist, depending upon the situation for which you are computing
confidence intervals (finite populations, etc.), but the basics illustrated above are
the same.