Lecture Notes Standard Error, Margin of Error, and Confidence

Confidence Intervals
On January 26, 2009, based on a nightly survey of 1,500 likely voters, Rasmussen Reports
estimated that President Obama’s approval rating among adult Americans who are
likely to vote was 47%. Reading this or hearing it on the nightly news, we might ask how
certain Rasmussen is that Obama’s approval rating has fallen below 50%. We might also
ask how certain Rasmussen is that Obama’s approval rating has changed in the last
month, in the last three months, or in the last six months. Finally, we might read another
poll that puts Obama’s approval rating at 53%, still above 50%. Someone who is
not knowledgeable about inferential statistics in general and sampling error in particular
might just conclude that “Polls can’t be trusted.”
Alternatively, a better-informed person might first investigate
sources of non-sampling error. First, that person would want to make sure that the two
polls asked substantially the same question (that is, question wording), a source of
non-sampling error. Second, the person would want to make sure that the two samples were
not drawn from different populations or using different definitions of the population,
another source of non-sampling error. Rasmussen samples only “likely voters,” not all
adults. As Rasmussen notes, in a self-congratulatory tone, “Some other firms base their
approval ratings on samples of all adults. President Obama's numbers are always several
points higher in a poll of adults rather than likely voters. That's because some of the
President's most enthusiastic supporters, such as young adults, are less likely to turn out
to vote.” Finally, after addressing issues of non-sampling error, we might ask if the
difference between the two polls is just the predictable difference that arises from drawing
two different random samples from the same population, that is, sampling error.
There are two inferential statistical approaches to answering these questions: parameter
estimation and hypothesis testing. In contingency table analysis, the chi-squared
statistic approach is an example of hypothesis testing. For the proportion result of the
presidential approval rating, a difference of means test would be an alternative
technique in the hypothesis testing approach. The specific technique in the parameter
estimation approach is the confidence interval. We might, unknowingly, encounter
confidence intervals in reporting of the Rasmussen poll, even on the nightly television news:
the news anchor might inform us that, “…with 95% confidence, these
results are plus or minus one half of one percent.” That “plus or minus” is called the
margin of error and is one of the preliminary statistics used in the confidence interval
approach. It is also probably the most commonly encountered inferential statistic, due to
its ubiquity on television newscasts. To calculate the standard deviation of a
proportion, all that is needed is the proportion itself, and the only additional piece of
information needed to calculate the standard error is the sample size, so it is
possible (and easy) to assess the above questions.
Standard Error
$$se = \frac{sd}{\sqrt{n}}$$
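As a minimal sketch of the two quantities just described, the following Python computes the standard deviation of a proportion and then the standard error. The function names are illustrative choices, not part of the notes; the inputs are the Rasmussen figures quoted earlier (p = 0.47, n = 1,500).

```python
import math


def sd_of_proportion(p: float) -> float:
    """Standard deviation of a proportion: sqrt(p * q), where q = 1 - p."""
    return math.sqrt(p * (1.0 - p))


def standard_error(sd: float, n: int) -> float:
    """Standard error: the standard deviation divided by the square root of n."""
    return sd / math.sqrt(n)


p, n = 0.47, 1500
sd = sd_of_proportion(p)
se = standard_error(sd, n)
print(f"sd = {sd:.4f}, se = {se:.5f}")  # sd = 0.4991, se = 0.01289
```

Only the proportion is needed for the standard deviation, and only the sample size is added to get the standard error.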
Confidence Interval – a) a range estimate of a population parameter. b) an estimate of
the range of values around the sample estimate (for example, a mean or proportion) that is
likely to include the population parameter, qualified by a given
confidence level. c) an indicator of the reliability of a sample statistic. d) an interval
estimate, as opposed to a single point estimate.
Confidence Level – a) the likelihood, usually expressed as a percentage, that a
confidence interval contains the population parameter. b) increasing the confidence level
widens the confidence interval and lowering it narrows the confidence interval. c) 95% is
the most common confidence level and corresponds to a situation in which the
confidence interval does not include the population parameter one time in 20.
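The "one time in 20" reading of a 95% confidence level can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true proportion (0.47), the sample size (500), the number of repeated samples (1,000), and the random seed are all arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math
import random


def proportion_interval(successes: int, n: int, z: float = 1.96):
    """z-based 95% interval for a proportion: p_hat plus or minus z * se."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(true_p: float, n: int, trials: int, seed: int = 0) -> float:
    """Share of simulated samples whose interval contains the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one random sample of n yes/no answers.
        successes = sum(rng.random() < true_p for _ in range(n))
        lo, hi = proportion_interval(successes, n)
        if lo <= true_p <= hi:
            hits += 1
    return hits / trials


rate = coverage(true_p=0.47, n=500, trials=1000)
print(f"share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should land close to 0.95, meaning roughly one interval in 20 misses the population parameter.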
Margin of Error – a) the amount of sampling error in an estimate at a certain level of
confidence. b) the product of the standard error and the critical value of the confidence
level. c) one half the width of the confidence interval. d) perhaps the most commonly encountered
inferential statistic because it is the basis of the “plus or minus” attached to the reporting
of public opinion surveys, like presidential approval ratings or voting intentions, on
television newscasts.
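The definitions above say that the margin of error is the critical value times the standard error, and that raising the confidence level widens the interval. A short sketch, assuming the Rasmussen figures (p = 0.47, n = 1,500) and using `statistics.NormalDist` from the Python standard library to obtain the critical z-value; the helper names are illustrative.

```python
import math
from statistics import NormalDist


def critical_z(confidence: float) -> float:
    """Two-sided critical z-value, e.g. about 1.96 for 95% confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def margin_of_error(p: float, n: int, confidence: float = 0.95) -> float:
    """Margin of error for a proportion: critical value times standard error."""
    se = math.sqrt(p * (1.0 - p) / n)
    return critical_z(confidence) * se


for level in (0.90, 0.95, 0.99):
    me = margin_of_error(0.47, 1500, level)
    print(f"{level:.0%} confidence: z = {critical_z(level):.3f}, me = {me:.4f}")
```

The margin of error grows as the confidence level moves from 90% to 99%, which is the widening described in the Confidence Level definition.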
For known population variance/standard deviation and a sample size greater than 50, a
z-based confidence interval can be calculated as follows:
$$\bar{X} - z_{cl}\,(se) \quad \text{(lower bound)}$$
$$\bar{X} + z_{cl}\,(se) \quad \text{(upper bound)}$$
For unknown population standard deviation and/or a sample size less than or
equal to 50, which is almost always the case, a t-based confidence interval can be calculated in
exactly the same way as the z-based interval, but with degrees of freedom used to look up the
t-value:
$$\bar{X} - t_{cl,\,df}\,(se) \quad \text{(lower bound)}$$
$$\bar{X} + t_{cl,\,df}\,(se) \quad \text{(upper bound)}$$
$$df = n - 1$$
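A t-based interval for a sample mean can be sketched as follows. The ten data values are hypothetical, invented only for illustration, and the t-value (2.262 for 95% confidence with df = 9) is entered by hand as if looked up in a t table rather than computed.

```python
import math
import statistics


def t_interval(sample, t_value):
    """Return (lower, upper) bounds: sample mean plus or minus t * se."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)
    me = t_value * se
    return mean - me, mean + me


# Hypothetical small sample (n = 10), not taken from the notes.
scores = [52, 47, 55, 49, 51, 46, 53, 50, 48, 49]
df = len(scores) - 1   # degrees of freedom: n - 1 = 9
t_95_df9 = 2.262       # table value for 95% confidence, df = 9

lower, upper = t_interval(scores, t_95_df9)
print(f"df = {df}, 95% interval: {lower:.3f} to {upper:.3f}")
```

Replacing `t_95_df9` with 1.96 would give the z-based interval, which is narrower for a sample this small.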
In both the z-based and t-based formulas, the population mean (μ) is estimated by the sample
mean (X-bar). The z-values and t-values correspond to a given level of confidence and
can be looked up in a table of z-score or t-score values. The z-value for a 95%
confidence interval is 1.96, which can be used as an approximation of the t-value for
larger sample sizes (for example, with a sample size of 50 the t-value is about 2.01, roughly
2.5% larger than 1.96). The degrees of freedom is one less than the sample size.
Calculating the 95% Confidence Interval for the Rasmussen Poll
1. n = 1,500, p = 0.47, q = 1 − p = 0.53
2. $$se = \sqrt{\frac{p(q)}{n}} = \sqrt{\frac{0.47(0.53)}{1{,}500}} = \sqrt{0.000166} = 0.01289$$
3. α = .05
4. $z_{\alpha/2} = z_{.05/2} = z_{.025} = 1.96$
5. $me = z_{.025}(se) = 1.96(0.01289) = 0.025258$
6. $lb = p - me = 0.47 - 0.025258 = 0.444742$
   $ub = p + me = 0.47 + 0.025258 = 0.495258$
So, based on the Rasmussen poll and our own calculations, we can say with 95% confidence that the
percentage of likely voters approving of President Obama’s performance is between
44.5% and 49.5%.
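The numbered steps can be reproduced in a few lines of Python as a check on the arithmetic. This is a sketch using the same inputs as the notes (n = 1,500, p = 0.47, z = 1.96); the variable names mirror the abbreviations used in the steps.

```python
import math

n = 1500          # sample size (step 1)
p = 0.47          # sample proportion approving (step 1)
q = 1.0 - p       # proportion not approving
z = 1.96          # z-value for 95% confidence (step 4)

se = math.sqrt(p * q / n)   # standard error (step 2)
me = z * se                 # margin of error (step 5)
lb = p - me                 # lower bound (step 6)
ub = p + me                 # upper bound (step 6)

print(f"se = {se:.5f}")
print(f"me = {me:.6f}")
print(f"95% CI: {lb:.6f} to {ub:.6f}")
```

Because the upper bound stays below 0.50, the interval is consistent with an approval rating under 50% at this sample size.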
The same calculation for other sample sizes:

n = 1,000
p = 0.47
Standard Error = 0.015783
z(alpha=0.05) = 1.96
me = 0.030934
lb = 0.439066
ub = 0.500934

n = 50
p = 0.47
Standard Error = 0.070583
z(alpha=0.05) = 1.96
me = 0.138343
lb = 0.331657
ub = 0.608343

n = 50 (t-based, df = 49)
p = 0.47
Standard Error = 0.070583
t(alpha=0.05, df=49) = 2.01
me = 0.141872
lb = 0.328128
ub = 0.611872
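To see how sample size and the choice of critical value change the interval, the cases above can be recomputed in a loop. This is a sketch: the function name is illustrative, and the critical values (1.96 and 2.01) are entered by hand as in the notes rather than computed.

```python
import math


def proportion_interval(p: float, n: int, critical: float):
    """Return (se, me, lb, ub) for a proportion at the given critical value."""
    se = math.sqrt(p * (1.0 - p) / n)
    me = critical * se
    return se, me, p - me, p + me


# (sample size, critical value) pairs matching the worked examples.
cases = [(1500, 1.96), (1000, 1.96), (50, 1.96), (50, 2.01)]

for n, critical in cases:
    se, me, lb, ub = proportion_interval(0.47, n, critical)
    includes_half = lb <= 0.50 <= ub  # can 50% approval be ruled out?
    print(f"n={n:>5} crit={critical:.2f} se={se:.6f} me={me:.6f} "
          f"lb={lb:.6f} ub={ub:.6f} includes 0.50: {includes_half}")
```

With n = 1,500 the interval lies entirely below 0.50, but with n = 1,000 or n = 50 it reaches past 0.50, so the same 47% result would no longer support the claim that approval has fallen below 50%.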