- Oklahoma City Community College

Chapter 8 Interval Estimation / Confidence Intervals
-interval estimation is the technique of using a point estimator (a sample estimate) to
construct an interval in which we expect to find the population parameter with a stated
probability. To do this we need a sample estimate, and we need to specify how certain
we want to be that the actual population parameter lies in the interval.
1. Statistical Inference – using sample data and estimates to draw a conclusion about
the population.
2. Interval Estimate – this is when we construct an interval by adding or subtracting a
margin of error to a point estimate.
Mathematically: Point Estimate ± margin of error
Note: $\bar{x}$ and $\bar{p}$ are examples of commonly used point estimates.
3. Interval Estimation about a Population Mean (Large Sample; n ≥ 30)
a. sampling error – the absolute difference between the sample estimate and the
population parameter; it tells us how close the sample estimate is to the parameter.
Mathematically: $|\bar{x} - \mu|$
-note that because the population parameter is most likely unknown, we can't say for sure
how far we are from the actual mean. We can, however, construct a sampling distribution
and make probability statements about the sampling error, assuming that the sampling
distribution is normal.
b. precision statement – a probability statement about the size of the sampling error:
with probability 1 − α, the sampling error $|\bar{x} - \mu|$ is no larger than the margin of
error (it could be less).
Mathematically: $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$
Note: our margin of error here is $z_{\alpha/2}\,\sigma_{\bar{x}}$, so we are within $z_{\alpha/2}$ standard
deviations of the sampling distribution away from the mean.
c. Confidence Coefficient & Precision Terms
i. confidence level – the percentage of intervals constructed this way that would contain
the population parameter, e.g. 95% or some other percentage
ii. confidence coefficient – the confidence level written as a proportion, 1 − α, e.g. 0.95
iii. level of significance – α – tells us the chance that the population parameter is not in
the confidence interval. We can take 1 − confidence coefficient = α.
ex: if we use the 95% level we get α = 1 − 0.95 = 0.05.
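The relationship between the confidence level, α, and the z-score can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `critical_z` is an illustrative choice, not something from these notes.

```python
from statistics import NormalDist


def critical_z(confidence_level: float) -> tuple[float, float]:
    """Return (alpha, z_{alpha/2}) for a two-sided interval.

    confidence_level is the confidence coefficient as a proportion, e.g. 0.95.
    """
    alpha = 1.0 - confidence_level
    # z_{alpha/2} leaves alpha/2 of the probability in the upper tail,
    # so it is the (1 - alpha/2) quantile of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return alpha, z


for level in (0.90, 0.95, 0.99):
    alpha, z = critical_z(level)
    print(f"{level:.0%} level: alpha = {alpha:.2f}, z = {z:.3f}")
```

For the 95% level this reproduces α = 0.05 and a z-score of about 1.96, the value read from the normal table.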
Rule: If σ is known, we can use the population standard deviation to compute the standard
deviation of the sampling distribution as $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, but many times we don't know
the population standard deviation. In this case we can estimate it with s, the sample
standard deviation. This is calculated in the same manner as we discussed earlier in
chapter 2. The interval is exactly the same as before, but instead of σ we use s:
$\bar{x} \pm z_{\alpha/2}\,s_{\bar{x}}$, where $s_{\bar{x}} = s/\sqrt{n}$
d. Example: Suppose we have the following information. Construct a 95% confidence
interval.
$\bar{x} = 80$
σ=3
n = 10
α = 0.05
Step 1: Find the appropriate z-score. So we want an interval that looks like the following.
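Graph 1: standard normal curve (horizontal axis z, centered at 0). We want the two tails,
beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$, to have exactly 2.5% of the probability each and the middle
portion to have 95% of the data.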
So we go to the table and find the z that gives us this: z = 1.96.
Step 2: Find the standard deviation of the sampling distribution:
$\sigma_{\bar{x}} = \sigma/\sqrt{n} = 3/\sqrt{10} = 0.949$
Step 3: Construct the confidence interval:
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 80 \pm 1.96(0.949) = 80 \pm 1.86$, or 78.14 to 81.86
Graph 2: sampling distribution of $\bar{x}$ (horizontal axis x, centered at 80). The x-values that
will give us 95% certainty of finding the population mean are 78.14 and 81.86.
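The three steps of the example can be sketched in a few lines of Python. This is an illustration under the example's assumptions (σ known, normal sampling distribution); the function name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def z_interval(xbar: float, sigma: float, n: int, confidence: float = 0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Step 1: z_{alpha/2}
    std_error = sigma / math.sqrt(n)              # Step 2: sigma / sqrt(n)
    margin = z * std_error                        # margin of error
    return xbar - margin, xbar + margin           # Step 3: xbar +/- margin


# Values from the example: xbar = 80, sigma = 3, n = 10, 95% confidence.
low, high = z_interval(80, 3, 10, 0.95)
print(f"{low:.2f} to {high:.2f}")  # 78.14 to 81.86
```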
4. Interval Estimation about a Population Mean (small sample; n < 30)
-in the small-sample case we cannot rely on the central limit theorem. Recall that with
larger samples we can be assured that the sampling distribution of the mean is
approximately normal. In small samples we cannot be assured of this. So we must use
another distribution (the t-distribution) in the case where the population variance is
unknown.
a. Case 1: σ is known
-in this case we still use the normal distribution, assuming that the population is
normal, because then the sampling distribution is also normal. This is not any different
from the calculations that we did above.
Mathematically: $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$
Where 1 − α = confidence coefficient and $z_{\alpha/2}$ is the z-score that leaves α/2 of the
probability in each of the upper and lower tails.
b. Case 2: σ is unknown
-in this case we cannot assume that our sampling distribution follows a completely
normal distribution. Now we must use the t-distribution (sometimes called Student's
t-distribution) to make probability statements. The technique is still the same as before,
but now we go to another chart to get the critical values to construct our confidence
interval.
i. t-distribution – a family of distributions indexed by degrees of freedom. Each
distribution with its degrees of freedom has its own features. As the degrees of freedom
(d.o.f.) go up, the distributions get closer and closer to the standard normal. This is
because as the d.o.f. go up the variability is reduced. Go to Table A.4 (642-3).
-we read the chart the same way as the standard normal table, the only difference being
that we now have d.o.f., which is (n − 1)
ii. Confidence Interval = $\bar{x} \pm t_{\alpha/2}\,s_{\bar{x}} = \bar{x} \pm t_{\alpha/2}\,s/\sqrt{n}$
recall that $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$; this is just the sample variance. To get s we
simply take the square root of $s^2$.
***so the only time you use the t-distribution is in the case of a small sample when
the population variance is unknown. The CI is constructed in the same fashion.
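As a rough illustration of how s and the estimated standard deviation of the sampling distribution are obtained from raw data, here is a short Python sketch. The data values are made up purely for demonstration and do not come from these notes.

```python
import math
import statistics

# Hypothetical sample, invented only to illustrate the calculation.
data = [9.1, 10.4, 8.7, 11.2, 10.6]

n = len(data)
xbar = sum(data) / n

# Sample variance: sum of squared deviations divided by (n - 1).
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
s = math.sqrt(s2)

# The standard library computes the same quantities directly.
assert math.isclose(s2, statistics.variance(data))
assert math.isclose(s, statistics.stdev(data))

# Estimated standard deviation of the sampling distribution of the mean.
std_error = s / math.sqrt(n)
print(f"xbar = {xbar:.3f}, s = {s:.3f}, s / sqrt(n) = {std_error:.3f}")
```

Dividing by n − 1 rather than n matches the d.o.f. used when reading the t table.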
Example: Consider a case where you want a 95% CI for $\bar{x} = 10$, n = 25, and s = 3.
We use a t-distribution in this case since the population standard deviation is unknown
and we have a small sample. With n − 1 = 24 d.o.f., the table gives $t_{\alpha/2} = 2.064$.
So $\bar{x} \pm t_{\alpha/2}\,s_{\bar{x}} = \bar{x} \pm t_{\alpha/2}\,s/\sqrt{n} = 10 \pm 2.064(3/\sqrt{25}) = 10 \pm 1.24$, or 8.76 to 11.24
This implies we are 95% confident the true population mean lies somewhere between
8.76 and 11.24
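The same calculation can be sketched in Python. Because the standard library has no t-distribution, this sketch takes the critical value as an input read from the t table (2.064 for 24 d.o.f. at 95%, as in the example); the function name `t_interval` is an illustrative assumption.

```python
import math


def t_interval(xbar: float, s: float, n: int, t_crit: float):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, looked up in a t table.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Values from the example: xbar = 10, s = 3, n = 25, t = 2.064 (24 d.o.f.).
low, high = t_interval(10, 3, 25, 2.064)
print(f"{low:.2f} to {high:.2f}")  # 8.76 to 11.24
```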
5. Determining Sample Size – we can use this technique to find the sample size required
for a confidence interval to have a desired margin of error at a given confidence
level.
If we let E = margin of error (could also be M) = $z_{\alpha/2}\,\sigma_{\bar{x}}$; this is how far
we allow the sample statistic to vary (see sampling error above).
So we have $E = z_{\alpha/2}\,\sigma/\sqrt{n}$, which gives
$\sqrt{n} = z_{\alpha/2}\,\sigma/E$ and therefore $n = [(z_{\alpha/2}\,\sigma)/E]^2$
Example: Suppose we are given the following information and want to know how large a
sample we need in order to be 90% confident that the sample mean is within E of the
population mean.
E= 2 units
σ=3
α = 0.10
Step 1: Find the appropriate z: $z_{\alpha/2} = 1.645$; this is the z value that puts 90% of the
probability between −1.645 and 1.645.
Step 2: Plug in values and find n: $n = [(z_{\alpha/2}\,\sigma)/E]^2 = [(1.645 \times 3)/2]^2 = 6.09$, so n = 7
****Note: generally we round up so that the margin of error does not exceed E.
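A minimal Python sketch of the sample-size calculation follows, assuming σ is known. The function name `sample_size_for_mean` is hypothetical, and `math.ceil` implements the round-up rule.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, margin: float, confidence: float) -> int:
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most `margin`."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * sigma / margin) ** 2
    return math.ceil(n_exact)  # always round up, never down


# Values from the example: sigma = 3, E = 2, 90% confidence.
print(sample_size_for_mean(3, 2, 0.90))  # 7
```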
6. Constructing a CI for p:
a. σ known
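To construct a CI for a proportion we use exactly the same method as before, but now we
use our sample proportion $\bar{p}$ to construct a confidence interval for the true population
proportion:
$\bar{p} \pm z_{\alpha/2}\,\sigma_{\bar{p}}$
The standard deviation of the sampling distribution is equal to $\sigma_{\bar{p}} = \sqrt{\bar{p}(1-\bar{p})/n}$
-so we can reduce our formula above to:
$\bar{p} \pm z_{\alpha/2}\sqrt{\bar{p}(1-\bar{p})/n}$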
Example: Suppose we have a sample of 49 children from a school and 9 of them have
a cold. Find a 99% confidence interval for the true proportion of children at the school
who have a cold.
$\bar{p} = 9/49 \approx 0.183$
$z_{\alpha/2} = 2.576$
$\sigma_{\bar{p}} = \sqrt{0.183(1-0.183)/49} = 0.0552$
So the 99% CI is 0.183 ± (2.576)(0.0552) or 0.0407 to 0.325
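The proportion interval can be sketched in Python as follows. The function name `proportion_interval` is an illustrative assumption. Because the code carries full precision instead of rounding the sample proportion to 0.183, its result (roughly 0.041 to 0.326) differs slightly in the third decimal place from the hand calculation.

```python
import math
from statistics import NormalDist


def proportion_interval(successes: int, n: int, confidence: float):
    """Two-sided confidence interval for a population proportion."""
    p_bar = successes / n
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Standard deviation of the sampling distribution, estimated with p_bar.
    std_error = math.sqrt(p_bar * (1.0 - p_bar) / n)
    margin = z * std_error
    return p_bar - margin, p_bar + margin


# Values from the example: 9 of 49 children have a cold, 99% confidence.
low, high = proportion_interval(9, 49, 0.99)
print(f"{low:.3f} to {high:.3f}")
```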
7. Other Notes for P
a. Finding Sample Size: Once again we can find the appropriate sample size for a desired
margin of error as follows:
$ME = z_{\alpha/2}\sqrt{p(1-p)/n}$
So if we solve for n we get $n = (z_{\alpha/2}/ME)^2\,p(1-p)$
So if we knew that p = 0.6 and wanted a margin of error of, say, 4% with 95%
confidence, then the appropriate sample size would be:
$n = (1.96/0.04)^2 \times 0.6(1-0.6) = 576.24$, or about 577 observations.
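For completeness, here is a hedged Python sketch of the sample-size formula for a proportion, assuming a planning value for p is available. The function name `sample_size_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float, margin: float, confidence: float) -> int:
    """Smallest n whose margin of error z * sqrt(p(1-p)/n) is at most `margin`."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z / margin) ** 2 * p * (1.0 - p)
    return math.ceil(n_exact)  # round up to the next whole observation


# Values from the example: p = 0.6, margin of error 4%, 95% confidence.
print(sample_size_for_proportion(0.6, 0.04, 0.95))  # 577
```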