Review of Confidence Interval Concepts

 A confidence interval is an interval of values that is likely to "capture" the
unknown value of a population parameter of interest, such as the true population
mean, μ, or the true mean difference for paired data, μd. Confidence intervals can
also estimate the difference between two independent samples. However, we will
save that discussion for a future lesson.
 The confidence level is the probability (fraction of times) that the procedure used
to determine an interval gives an interval that actually captures the true population
value. For example, say we repeatedly drew samples of the same size from a
population and constructed a 95% confidence interval from each sample, and we
repeated this process 1000 times. Then we would expect about 95%, or 950, of these
confidence intervals to contain the true population parameter. In practice, though,
we typically construct only one such interval, and we say that we are 95% confident
(more generally, X% confident) that this interval has captured the true parameter.
This particular interval might or might not contain the true value. The confidence
level therefore describes how reliable the procedure is, not the chance that one
computed interval is correct. A 95% confidence interval should not be interpreted
to say that there is a 95% probability that the true value is in this interval. Once
the interval is computed, the true value is either in the interval (i.e. probability
of 1) or not in the interval (probability of 0).
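The repeated-sampling idea above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with a made-up mean of 100 and a known standard deviation of 15, the sample size of 50 is arbitrary, and the function name `coverage_simulation` is ours, not part of the course materials.

```python
import math
import random
from statistics import NormalDist, mean


def coverage_simulation(n_intervals=1000, sample_size=50, mu=100.0, sigma=15.0,
                        confidence=0.95, seed=1):
    """Return the fraction of simulated intervals that capture the true mean mu."""
    rng = random.Random(seed)
    # z multiplier for the requested confidence level (about 1.96 for 95%)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    # sigma is treated as known here (an assumption), so the standard error is fixed
    std_error = sigma / math.sqrt(sample_size)
    captured = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        x_bar = mean(sample)
        lower = x_bar - z_star * std_error
        upper = x_bar + z_star * std_error
        if lower <= mu <= upper:
            captured += 1
    return captured / n_intervals


print(coverage_simulation())  # fraction of the 1000 intervals that contain mu
```

The printed fraction should land near 0.95 but will rarely equal it exactly, which is the point: the 95% describes the long-run behavior of the procedure, while any single interval either contains μ or does not.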
 In most situations considered in our text, the general format for determining a
confidence interval is
Sample statistic ± Multiplier × Standard error

In other words, we form a confidence interval by adding and subtracting an
appropriate number of standard errors to (and from) the sample estimate.
Last week we considered confidence intervals for one proportion, where the multiplier in
the interval was a z-value. But what if our variable of interest is a quantitative variable (e.g.
GPA, Age, Height) and we want to estimate the population mean? In such a situation
proportion confidence intervals are not appropriate, since our interest is in a mean amount
and not a proportion.
Therefore we apply similar techniques, but now we estimate the population mean, μ,
using the sample mean as the sample statistic, and the multiplier is a t-value. Until
now we assumed that our random variable came from a normal distribution with a known
population standard deviation, σ. Typically, however, we do not know this parameter and
must estimate it. This is done by using the standard deviation of the sample,
written "s". Because s varies from sample to sample, the standardized sample mean no
longer follows the standard normal distribution. Instead it follows a t-distribution, and
the t-values come from that distribution. The t-distribution is similar to the standard
normal distribution from which the z-values came: both are symmetrical and centered on 0.
The difference is that when using a t-table we need to consider a new feature: degrees of
freedom (df). The degrees of freedom are based on the sample size, n. For a confidence
interval for one mean, df = n − 1.
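To see why the multiplier has to change when σ is estimated, the following sketch compares how often intervals of the form "sample mean ± multiplier × s/√n" capture the true mean for small samples. The setup is an assumption made for illustration: a standard normal population, n = 5, and 10,000 simulated samples. The value 2.776 is the usual t-table entry for 95% confidence with df = 4, and the function name `coverage` is hypothetical.

```python
import math
import random
from statistics import mean, stdev


def coverage(multiplier, n=5, trials=10000, mu=0.0, sigma=1.0, seed=2):
    """Fraction of intervals x_bar +/- multiplier * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        std_error = stdev(sample) / math.sqrt(n)  # s is estimated from the sample
        if abs(x_bar - mu) <= multiplier * std_error:
            hits += 1
    return hits / trials


z_coverage = coverage(1.960)  # standard normal multiplier
t_coverage = coverage(2.776)  # t multiplier for df = n - 1 = 4 (table value)
print(f"z multiplier: {z_coverage:.3f}, t multiplier: {t_coverage:.3f}")
```

With the z multiplier the intervals are too narrow and capture the mean noticeably less often than 95% of the time, while the t multiplier restores the advertised level. For large n the two multipliers are nearly equal, so the difference matters mostly for small samples.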
Examples of one-proportion and mean confidence intervals
In the spring of 2006, all students registered for STAT200 were asked to complete a
survey. A total of 1004 students responded. If we assume that this sample represents the
PSU-UP undergraduate population, then we can make inferences regarding this
population based on the survey results. What follows are three examples.
1. Have you ever made out with someone of the same sex?
2. How much are you willing to spend on a date (or want your date to spend on you)?
3. What is your weight (pounds)? If you could choose your "ideal" weight (in pounds), what
would it be?
Solutions:
1. $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 0.215 \pm 1.96\sqrt{\dfrac{0.215(1-0.215)}{1004}}$, which gives $0.189 \le p \le 0.241$
Variable     X    N     Sample p   95% CI                 Z-Value  P-Value
SameSex_1    216  1004  0.215139   (0.189722, 0.240557)   -18.05   0.000
Interpretation: We are 95% confident that the true proportion of PSU-UP undergraduate
students who have made out with someone of the same sex is between 18.9% and 24.1%.
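The one-proportion calculation can be reproduced from the counts in the output (X = 216, N = 1004). This is a minimal sketch of the "statistic ± multiplier × standard error" recipe. The function name `proportion_ci` is our own, and the z multiplier is taken from Python's standard library rather than from a table.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z* x sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    # z multiplier for the requested confidence level (about 1.96 for 95%)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * std_error, p_hat + z_star * std_error


lower, upper = proportion_ci(216, 1004)
print(f"({lower:.6f}, {upper:.6f})")
```

Because the code uses the unrounded sample proportion 216/1004 rather than 0.215, its endpoints should agree with the software interval to more decimal places than the hand calculation does.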
2. $\bar{x} \pm t^{*}\dfrac{s}{\sqrt{n}} = 51.94 \pm 1.98 \cdot \dfrac{41.544}{\sqrt{988}}$, which gives $\$49.35 \le \mu \le \$54.54$
Variable   N    Mean     StDev    SE Mean  95% CI
DateSpnd   988  51.9423  41.5440  1.3217   (49.3487, 54.5360)
Interpretation: We are 95% confident that the true mean amount of money that PSU-UP
undergraduate students think should be spent on the first date is between $49.35 and $54.54.
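The interval for a mean can be rebuilt from the summary statistics alone. In this sketch the multiplier is passed in explicitly, because Python's standard library has no t-distribution. The value 1.98 is the table value used in the hand calculation, and `mean_ci` is a hypothetical helper name.

```python
import math


def mean_ci(x_bar, s, n, multiplier):
    """Return (standard error, lower, upper) for x_bar +/- multiplier * s / sqrt(n)."""
    std_error = s / math.sqrt(n)
    margin = multiplier * std_error
    return std_error, x_bar - margin, x_bar + margin


# Summary statistics for DateSpnd from the output above
se, lower, upper = mean_ci(51.9423, 41.5440, 988, multiplier=1.98)
print(f"SE = {se:.4f}, CI = ({lower:.2f}, {upper:.2f})")
```

The standard error should match the "SE Mean" column. The endpoints may differ from the software interval by a cent or two, presumably because the software uses the exact t-value for df = 987, which is slightly smaller than the table value 1.98.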
3. $\bar{x}_d \pm t^{*}\dfrac{s_d}{\sqrt{n}} = 6.789 \pm 1.98 \cdot \dfrac{18.171}{\sqrt{995}}$, which gives $5.66 \text{ lbs} \le \mu_d \le 7.92 \text{ lbs}$
Paired T for Weight - IdlWt

             N    Mean     StDev     SE Mean
Weight       995  152.067  32.212    1.021
IdlWt        995  145.277  32.963    1.045
Difference   995  6.78935  18.17087  0.57606

95% CI for mean difference: (5.65892, 7.91977)
Interpretation: We are 95% confident that the true mean difference between actual weight and
ideal weight of PSU-UP undergraduate students is between 5.66 lbs and 7.92 lbs.
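A paired interval is just a one-mean interval computed on the differences. The sketch below shows the steps on a tiny data set. The five weight pairs are invented for illustration and are not from the survey, the helper name `paired_ci` is hypothetical, and 2.776 is the usual t-table value for 95% confidence with df = 5 − 1 = 4.

```python
import math
from statistics import mean, stdev


def paired_ci(first, second, multiplier):
    """Return (mean difference, lower, upper) for paired measurements."""
    # Step 1: reduce each pair to a single difference
    diffs = [a - b for a, b in zip(first, second)]
    # Step 2: treat the differences as one sample and build a one-mean interval
    d_bar = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(len(diffs))
    return d_bar, d_bar - multiplier * std_error, d_bar + multiplier * std_error


# Hypothetical actual and ideal weights (pounds) for five students
actual = [150, 162, 138, 171, 145]
ideal = [145, 155, 140, 160, 142]

d_bar, lower, upper = paired_ci(actual, ideal, multiplier=2.776)
print(f"mean difference = {d_bar:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

Note that the standard deviations of the two original columns are never used. Only the standard deviation of the differences enters the standard error, which is why the output reports it on the "Difference" row.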
Question: Does this mean that the actual weight of every PSU-UP undergraduate student is more
than their ideal weight?
Answer: No. Since this interval estimates a mean difference, we cannot say that all students feel
that their actual weight is more than their ideal weight. Instead we say that, on average, PSU-UP
undergraduate students feel that their actual weight is more than their ideal weight.
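A rough calculation shows how a positive mean difference can coexist with many individual students whose actual weight is below their ideal weight. The sketch assumes that individual differences are approximately normal with the sample mean and standard deviation from the "Difference" row. That normality is our assumption for illustration only and is not something the survey output establishes.

```python
from statistics import NormalDist

# Summary statistics for the differences (Weight - IdlWt) from the output
mean_diff = 6.78935
sd_diff = 18.17087

# ASSUMPTION: individual differences are roughly normal with these parameters
differences = NormalDist(mu=mean_diff, sigma=sd_diff)

# Estimated share of students whose actual weight is below their ideal weight
share_below_ideal = differences.cdf(0)
print(f"{share_below_ideal:.2f}")
```

Under this assumption roughly a third of students would have a negative difference, even though the confidence interval for the mean difference lies entirely above zero. The interval speaks about the average, not about each individual.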