Interval Estimation for Population Mean

ST 361 Estimation --- Interval Estimation for $\mu$ (§7.2, 7.4)
I. Interval estimation: confidence interval
II. (Two-sided) Confidence interval for estimating population mean $\mu$
(a) When the population SD $\sigma$ is known: use Z distribution (§7.2)
(b) When the population SD $\sigma$ is NOT known: use t distribution (§7.4)
III. (Two-sided) confidence interval for estimating population proportion (§7.3)
IV. Two-sided confidence interval for estimating population mean difference $\mu_1 - \mu_2$ (§7.5)
(a) when the population SD's $\sigma_1$, $\sigma_2$ are known
(b) when the population SD's $\sigma_1$, $\sigma_2$ are NOT known
---------------------------------------------------------------------------------------------------------------------
I. Interval Estimate: Confidence Interval (CI)
 What is it?
A confidence interval is an interval calculated from a sample such that it will contain the
true value of a population parameter (such as the population mean $\mu$) with a certain
probability (called the confidence level).
 Why?
Because of sampling variability, the point estimate is almost never exactly equal to the
correct value for the parameter.
Point estimates don't tell us how close they are to the actual parameter.
 So we use an interval called a confidence interval to report the likely range for the
parameter of interest.
 Confidence interval for the population mean $\mu$
Consider a sample $X_1, X_2, \ldots, X_n$ ($n \ge 30$) that is randomly selected from a population
with mean $\mu$ and SD $\sigma$. To estimate $\mu$ we use the sample mean $\bar{X} = \frac{1}{n}\sum X_i$. Our goal
here is to find the likely range of $\mu$.
Recall that $\bar{X} \sim N(\mu,\ \sigma_{\bar{X}} = \sigma/\sqrt{n})$, at least approximately, no matter what distribution X has (by the CLT, since n is large).
Based on this normal distribution of $\bar{X}$, we can show that the middle 95% of the $\bar{X}$ values
fall within $[\mu - 1.96\,\sigma/\sqrt{n},\ \mu + 1.96\,\sigma/\sqrt{n}]$, or equivalently, $\mu \pm 1.96\,\sigma/\sqrt{n}$. Notice that
this is an interval centered at the parameter $\mu$.
However, in reality we don't know $\mu$, and instead we only observe $\bar{X}$ from the
sample collected. So what we really want is an interval centered at $\bar{X}$ with the
same length: $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$, or equivalently $[\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n}]$.
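The move from an interval centered at $\mu$ to one centered at $\bar{X}$ can be spelled out by rearranging the inequalities inside the probability statement. The following is a short sketch of that algebra, using only the symbols already defined in these notes:

```latex
\begin{align*}
0.95 &= P\left(\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\left(-1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
\end{align*}
```

The three events are the same event written in different ways, so the probability stays at 0.95 throughout. The random quantity in the last line is the interval, not $\mu$.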
Meaning of the interval "$\bar{X} \pm 1.96\,\sigma/\sqrt{n}$": It contains $\mu$ with 95% probability.
Thus we have 95% confidence that $\mu$ is covered by the range
$[\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n}]$.
This is the concept of a Confidence Interval (CI). We call such an interval "the
95% confidence interval for $\mu$".
 Note that the fundamental assumption for constructing the CI for $\mu$ is that
$\bar{X}$ has a normal distribution (automatically true if X has a normal distribution; otherwise
the sample size n has to be large).
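One way to see what "95% confidence" means is to simulate many samples and count how often the interval covers the true mean. The sketch below uses only the Python standard library. The population values (mean 10, SD 2), the sample size 30, the number of trials, and the function names are all hypothetical choices made for illustration, not values from these notes.

```python
import math
import random

def z_interval(xbar, sigma, n, z=1.96):
    """Return (lower, upper) for xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        lower, upper = z_interval(xbar, sigma, n)
        if lower <= mu <= upper:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 10, sigma = 2, samples of size n = 30.
rate = coverage(mu=10.0, sigma=2.0, n=30, trials=2000)
print(f"share of intervals covering mu: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either covers $\mu$ or it does not; the 95% describes the long-run behavior of the procedure.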
II. Confidence interval for $\mu$; assume $\bar{X} \sim N(\mu,\ \sigma/\sqrt{n})$
 If $\sigma$ known
If $\sigma$ is known, the CI for $\mu$ at a given confidence level $(1 - \alpha)$ is
$\bar{X} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$
4 components:
(a) The point estimator $\bar{X}$
(b) The confidence level, which determines the critical value z*
(c) The SE of the point estimator $\bar{X}$, which is $\sigma/\sqrt{n}$
(d) The requirement that $\bar{X}$ follows a normal distribution
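Ex1. X ~ N($\mu$, $\sigma$). The 90% CI for $\mu$ with sample size n is
$\bar{X} \pm z_{0.1/2} \cdot \sigma/\sqrt{n} = \bar{X} \pm 1.645\,\sigma/\sqrt{n}$
Ex2. X ~ N($\mu$, $\sigma$). The 95% CI for $\mu$ with sample size n is
$\bar{X} \pm z_{0.05/2} \cdot \sigma/\sqrt{n} = \bar{X} \pm 1.96\,\sigma/\sqrt{n}$
Ex3. X ~ N($\mu$, $\sigma$). The 99% CI for $\mu$ with sample size n is
$\bar{X} \pm z_{0.01/2} \cdot \sigma/\sqrt{n} = \bar{X} \pm 2.58\,\sigma/\sqrt{n}$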
► Comment: the higher the confidence level is, the wider or narrower (choose one) the CI is
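The critical values in Ex1 to Ex3 can be reproduced without a Z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - level
    # z_{alpha/2} cuts off an upper-tail area of alpha/2 under the Z curve.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The value for 99% comes out near 2.576, which the examples round to 2.58. Comparing the three values side by side also bears on the comment about how the confidence level affects the width of the interval.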
Ex4. X has mean $\mu$ and SD $\sigma$ (known). A sample of size n = 100 is collected. What is the
95% CI for $\mu$?
$\bar{X} \pm 1.96\,\sigma/\sqrt{100} = [\bar{X} - 0.196\,\sigma,\ \bar{X} + 0.196\,\sigma]$
If from a sample we got $\bar{X}$ = 3.4, and $\sigma$ is assumed to be 2.5, then a 95% CI for $\mu$ is
$[\bar{X} - 0.196\,\sigma,\ \bar{X} + 0.196\,\sigma] = [3.4 - 0.196 \times 2.5,\ 3.4 + 0.196 \times 2.5] = [2.91, 3.89]$
Ex5. What is the confidence level for the interval $[\bar{X} - 1.75\,\sigma/\sqrt{n},\ \bar{X} + 1.75\,\sigma/\sqrt{n}]$?
Since $\alpha/2 = P[Z > 1.75] = P[Z < -1.75] = 0.04 \Rightarrow \alpha = 0.08$, $1 - \alpha = 0.92$ is the confidence level.
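Ex4 and Ex5 are inverse calculations: Ex4 goes from a confidence level to an interval, and Ex5 goes from a multiplier back to a confidence level. The following standard-library sketch checks both; the function names are hypothetical and the inputs are the numbers given in the two examples.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # the Z distribution: mean 0, SD 1

def z_confidence_interval(xbar, sigma, n, level=0.95):
    """CI for the mean when the population SD sigma is known."""
    z = STD_NORMAL.inv_cdf(1.0 - (1.0 - level) / 2.0)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

def confidence_level(multiplier):
    """Confidence level of the interval xbar +/- multiplier * sigma / sqrt(n)."""
    alpha = 2.0 * (1.0 - STD_NORMAL.cdf(multiplier))  # both tails together
    return 1.0 - alpha

# Ex4: xbar = 3.4, sigma = 2.5, n = 100, 95% confidence.
lower, upper = z_confidence_interval(3.4, 2.5, 100)
print(f"Ex4 interval: [{lower:.2f}, {upper:.2f}]")

# Ex5: multiplier 1.75.
level = confidence_level(1.75)
print(f"Ex5 confidence level: {level:.2f}")
```

The code uses the unrounded critical value and tail area, so intermediate digits differ slightly from the table-based hand calculation, but the results agree after rounding to two decimals.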
 If $\sigma$ is unknown
In practice, most of the time $\sigma$ is not known. To calculate the CI for $\mu$, we have to use
$s/\sqrt{n}$ instead of $\sigma/\sqrt{n}$.
When $\sigma$ is known, $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z$ (when X has a normal distribution or the sample size n
is large), and hence we use a z critical value.
When s is used, the distribution of $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ depends on the sample size:
(a) If n is large ($n \ge 30$), $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ is approximately distributed as N(0,1). So we can still use
the result before by replacing $\sigma$ with the sample SD s.
(b) If n is small ($n < 30$), we have to assume X has a normal distribution with mean $\mu$ and
SD $\sigma$ (even though its value is unknown). Then $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has a t-distribution with (n-1)
degrees of freedom.
When $\sigma$ is unknown, the CI for $\mu$ at a given confidence level is
$\bar{X} \pm t_{n-1,\,\alpha/2} \cdot s/\sqrt{n}$
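A small simulation can illustrate why the t critical value is needed when n is small and s replaces $\sigma$. The sketch below is an illustration under stated assumptions: the population is standard normal, n = 5, and the number of trials and the function name are hypothetical choices. The value 2.776 is the standard 95% t-table entry for 4 degrees of freedom, hard-coded because the Python standard library has no t distribution.

```python
import math
import random
import statistics

def coverage_with_s(critical_value, mu, sigma, n, trials, seed=1):
    """Share of intervals xbar +/- critical_value * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)  # same seed gives the same samples for both methods
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample SD, divisor n - 1
        margin = critical_value * s / math.sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# Hypothetical setting: normal population, small sample of n = 5 (df = 4).
Z_STAR = 1.96    # z critical value for 95%
T_STAR = 2.776   # assumption: 95% t critical value for df = 4, from a t table

z_rate = coverage_with_s(Z_STAR, mu=0.0, sigma=1.0, n=5, trials=4000)
t_rate = coverage_with_s(T_STAR, mu=0.0, sigma=1.0, n=5, trials=4000)
print(f"coverage using z*: {z_rate:.3f}")
print(f"coverage using t*: {t_rate:.3f}")
```

With the z critical value the intervals are too narrow and cover $\mu$ noticeably less often than 95% of the time, while the t critical value brings the coverage back close to 95%.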
The t distribution with (n-1) degree of freedom (graph on the last page of the
 t distribution is similar to the standard normal distribution (the Z distribution) in
many aspects: (1) all values are possible
(2) symmetric around zero
(3) bell-shaped
 However, it has heavier tails than the Z distribution. Different sample sizes
result in different tail thickness in a t distribution: the smaller the sample
size (and hence the degrees of freedom), the thicker the tails.
 Each t distribution is defined through its degrees of freedom (df), and the
corresponding t distribution is denoted by $t_{df}$.
When the sample size is very large (i.e., n > 120), $t_{n-1} \approx Z$ !!
Use t- table to find the critical value
Page 566 Table IV
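As a cross-check on table lookups, t critical values can also be computed numerically. The sketch below is one possible approach using only the Python standard library, not the method used to build the table: it integrates the t density with Simpson's rule and then finds the critical value by bisection. All function names, the step count, and the chosen df values are hypothetical choices for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus 0.5 from symmetry."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, level):
    """Two-sided critical value t* with P(-t* <= T <= t*) = level, by bisection."""
    target = 0.5 + level / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical df values chosen to show the trend toward the z value 1.96.
for df in (4, 24, 120):
    print(f"df = {df:3d}: 95% t* = {t_critical(df, 0.95):.3f}, "
          f"99% t* = {t_critical(df, 0.99):.3f}")
```

The 95% values shrink toward 1.96 as df grows, which matches the remark that the t distribution is close to Z once the sample size is very large.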
Ex6. Use the t table to find the 95% and 99% t-critical values for each of the following
sample sizes:
Sample size | Degrees of freedom (df) = n-1 | t* (i.e., t-critical value)
Ex7. X ~ Normal distribution. n = 25, $\bar{X}$ = 8 and s = 2. What is the 95% CI for the population
mean $\mu$?
$\bar{X} \pm 2.064 \cdot s/\sqrt{n} = 8 \pm 2.064 \times 2.0/\sqrt{25} = [7.17, 8.83]$
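The arithmetic in Ex7 can be broken into the standard error, the margin, and the interval. The steps below are a worked sketch using the numbers given in the example, with df = n - 1 = 24 and the table value $t_{24,\,0.025} = 2.064$:

```latex
\begin{align*}
\text{SE} &= \frac{s}{\sqrt{n}} = \frac{2}{\sqrt{25}} = 0.4 \\
\text{margin} &= t_{24,\,0.025} \times \text{SE} = 2.064 \times 0.4 \approx 0.826 \\
\text{CI} &= 8 \pm 0.826 \approx [7.17,\ 8.83]
\end{align*}
```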
Ex8. X = # of claims received (per week) by an insurance company. Based on 41 weeks of
samples, $\bar{X}$ = 18.5 and s = 20.0. What is the 95% CI for $\mu$?
Here n = 41 is large enough for us to use the formula
$\bar{X} \pm z_{\alpha/2} \cdot s/\sqrt{n} = 18.5 \pm 1.96 \times 20/\sqrt{41} = [12.38, 24.62]$
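To get a feel for how much the large-sample shortcut matters in Ex8, the z-based interval can be compared with a t-based one. This is a rough sketch: the helper name `interval` is hypothetical, and 2.021 is assumed here as the standard 95% t-table value for df = 40, which these notes do not quote.

```python
import math

def interval(xbar, s, n, critical_value):
    """Return (lower, upper) for xbar +/- critical_value * s / sqrt(n)."""
    margin = critical_value * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Summary statistics from Ex8.
XBAR, S, N = 18.5, 20.0, 41

z_low, z_high = interval(XBAR, S, N, 1.96)    # large-sample z interval, as in Ex8
t_low, t_high = interval(XBAR, S, N, 2.021)   # assumption: t-table value, 95%, df = 40

print(f"z interval: [{z_low:.2f}, {z_high:.2f}]")
print(f"t interval: [{t_low:.2f}, {t_high:.2f}]")
```

The t-based interval is only slightly wider, which is consistent with treating n = 41 as large enough for the z formula.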