Interval Estimation for Population Mean

ST 361 Estimation --- Interval Estimation for $\mu$ (§7.2, 7.4)
Topics:
I. Interval estimation: confidence interval
II. (Two-sided) Confidence interval for estimating population mean $\mu$
(a) When the population SD $\sigma$ is known: use Z distribution (§7.2)
(b) When the population SD $\sigma$ is NOT known: use t distribution (§7.4)
III. (Two-sided) confidence interval for estimating population proportion (§7.3)
IV. Two-sided confidence interval for estimating population mean difference $\mu_1 - \mu_2$ (§7.5)
(a) when the population SD’s $\sigma_1, \sigma_2$ are known
(b) when the population SD’s $\sigma_1, \sigma_2$ are NOT known
---------------------------------------------------------------------------------------------------------------------
I. Interval Estimate----Confidence Interval (CI)
➢ What is it?
A confidence interval is an interval calculated from a sample such that it will contain the
true value of a population parameter (such as the population mean $\mu$) with a certain
probability (called the confidence level).
➢ Why?

Because of sampling variability, the point estimate is almost never exactly equal to the
correct value for the parameter.

Point estimates don’t tell us how close they are to the actual parameter.
➢ So we use an interval called a confidence interval to report the likely range for the
parameter of interest.
➢ Confidence interval for the population mean $\mu$
Consider a sample $X_1, X_2, \ldots, X_n$ ($n \ge 30$) that is randomly selected from a population
with mean $\mu$ and SD $\sigma$. To estimate $\mu$ we use the sample mean $\bar{X} = \frac{1}{n}\sum X_i$. Our goal
here is to find the likely range of $\mu$.

Recall that $\bar{X} \sim N(\mu,\ \sigma_{\bar{X}} = \sigma/\sqrt{n})$, approximately, no matter what distribution $X$ has (by the CLT).

Based on this normal distribution of $\bar{X}$, we can show that the middle 95% of the $\bar{X}$ values
fall within $(\mu - 1.96\,\sigma/\sqrt{n},\ \mu + 1.96\,\sigma/\sqrt{n})$, or equivalently, $\mu \pm 1.96\,\sigma/\sqrt{n}$. Notice that
this is an interval centered at the parameter $\mu$.

However, in reality we don’t know $\mu$; instead we only observe $\bar{X}$ from the
sample collected. So what we really want is an interval centered at $\bar{X}$ with the
same length: $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$, or equivalently $(\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n})$.
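
The step from an interval centered at $\mu$ to one centered at $\bar{X}$ is a rearrangement of the same inequality. A short sketch for the 95% case, using only the symbols already defined in these notes:

```latex
\begin{aligned}
0.95 &= P\!\left(\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\!\left(-1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\!\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
\end{aligned}
```

Both statements say that $\bar{X}$ and $\mu$ are within $1.96\,\sigma/\sqrt{n}$ of each other, so they have the same probability.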

Meaning of the interval “$\bar{X} \pm 1.96\,\sigma/\sqrt{n}$”: it contains $\mu$ with 95% probability, where the
probability refers to repeated sampling (the interval is random; $\mu$ is fixed).
Thus we have 95% confidence that $\mu$ is covered by the range
$(\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n})$.

This is the concept of a Confidence Interval (CI)----we call such an interval “the
95% confidence interval for $\mu$”.
➢ Note that the fundamental assumption for constructing the CI for $\mu$ is that
$\bar{X}$ has a normal distribution (automatically true if $X$ has a normal distribution; otherwise
the sample size $n$ has to be large).
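
The "95% probability" statement can be checked by simulation. The sketch below is only an illustration: the population values (`mu=10.0`, `sigma=2.0`), the sample size and the function name `coverage_rate` are made up for this example and do not come from the notes. It draws many samples from a normal population, builds the known-$\sigma$ interval each time, and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, trials, seed=0):
    """Fraction of simulated 95% z-intervals that contain the true mean mu."""
    rng = random.Random(seed)                # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.975)          # about 1.96
    half_width = z * sigma / n ** 0.5        # margin of error with known sigma
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 10, SD 2 (illustrative values only)
rate = coverage_rate(mu=10.0, sigma=2.0, n=30, trials=2000)
print(f"share of intervals covering mu: {rate:.3f}")  # should land near 0.95
```

Each individual interval either covers $\mu$ or it does not; the 95% describes the long-run share of intervals that do.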
II. Confidence interval for $\mu$; assume $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$
▪ If $\sigma$ is known
If $\sigma$ is known, the CI for $\mu$ at a given confidence level $(1 - \alpha)$ is
$\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$

4 components:
(a) The point estimator $\bar{X}$
(b) The confidence level, which determines the critical value $z^* = z_{\alpha/2}$
(c) The SE of the point estimator $\bar{X}$, i.e. $\sigma/\sqrt{n}$
(d) $\bar{X}$ needs to follow a normal distribution
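Ex1. $X \sim N(\mu, \sigma)$. The 90% CI for $\mu$ with sample size $n$ is
$\bar{X} \pm z_{0.1/2}\,\sigma/\sqrt{n} = \bar{X} \pm 1.645\,\sigma/\sqrt{n}$
Ex2. $X \sim N(\mu, \sigma)$. The 95% CI for $\mu$ with sample size $n$ is
$\bar{X} \pm z_{0.05/2}\,\sigma/\sqrt{n} = \bar{X} \pm 1.96\,\sigma/\sqrt{n}$
Ex3. $X \sim N(\mu, \sigma)$. The 99% CI for $\mu$ with sample size $n$ is
$\bar{X} \pm z_{0.01/2}\,\sigma/\sqrt{n} = \bar{X} \pm 2.58\,\sigma/\sqrt{n}$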
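
The critical values in Ex1 to Ex3 can be reproduced without a Z table. A minimal sketch using Python's standard library; the helper name `z_critical` is just a label chosen for this example:

```python
from statistics import NormalDist

def z_critical(level):
    """Critical value z_{alpha/2} for a two-sided CI at the given confidence level."""
    alpha = 1.0 - level
    # z_{alpha/2} leaves area alpha/2 in the upper tail of N(0, 1)
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
```

The 99% value prints as 2.576, which the examples round to 2.58.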
► Comment: the higher the confidence level is, the wider or narrower (choose one) a CI
becomes.
Ex4. $X$ has mean $\mu$ and SD $\sigma$ (known). A sample of size $n = 100$ is collected. What is the
95% CI for $\mu$?
$\bar{X} \pm 1.96\,\sigma/\sqrt{100} = [\bar{X} - 0.196\,\sigma,\ \bar{X} + 0.196\,\sigma]$
Suppose that from a sample we got $\bar{X} = 3.4$, and $\sigma$ is assumed to be 2.5. Then a 95% CI for $\mu$ is
$[\bar{X} - 0.196\,\sigma,\ \bar{X} + 0.196\,\sigma] = [3.4 - 0.196 \times 2.5,\ 3.4 + 0.196 \times 2.5] = [2.91, 3.89]$
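
The arithmetic in Ex4 can be checked with a few lines of code. This is a sketch, not part of the course material; the function name `z_interval` and its argument names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Two-sided CI for mu when the population SD sigma is known."""
    alpha = 1.0 - level
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value z_{alpha/2}
    se = sigma / sqrt(n)                               # SE of the sample mean
    return xbar - z_crit * se, xbar + z_crit * se

# Numbers from Ex4: xbar = 3.4, sigma = 2.5, n = 100
low, high = z_interval(3.4, 2.5, 100, level=0.95)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # [2.91, 3.89]
```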


Ex5. What is the confidence level for the interval $(\bar{X} - 1.75\,\sigma/\sqrt{n},\ \bar{X} + 1.75\,\sigma/\sqrt{n})$?

Since $\alpha/2 = P[Z > 1.75] = P[Z < -1.75] = 0.04$, we get $\alpha = 0.08$, so $1 - \alpha = 0.92$ is the confidence
level.
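
Ex5 runs the usual calculation backwards: from a critical value to a confidence level. A small sketch of that step, with `confidence_level` as an illustrative helper name:

```python
from statistics import NormalDist

def confidence_level(z_crit):
    """Confidence level 1 - alpha of the interval xbar +/- z_crit * sigma / sqrt(n)."""
    upper_tail = 1.0 - NormalDist().cdf(z_crit)  # alpha / 2 = P[Z > z_crit]
    return 1.0 - 2.0 * upper_tail

level = confidence_level(1.75)
print(f"confidence level: {level:.2f}")  # 0.92
```

Passing 1.96 instead of 1.75 gives approximately 0.95, which matches Ex2.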
▪ If $\sigma$ is unknown

In practice, most of the time $\sigma$ is not known. To calculate the CI for $\mu$, we have to use
$s/\sqrt{n}$ instead of $\sigma/\sqrt{n}$.

When $\sigma$ is known, $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z$ (when $X$ has a normal distribution or the sample size $n$
is large), and hence we use a z critical value.

When $s$ is used, the distribution of $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ depends on the sample size:
(a) If $n$ is large ($n \ge 30$), $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ is approximately distributed as $N(0,1)$. So we can still use
the result before by replacing $\sigma$ with the sample SD $s$.
(b) If $n$ is small ($n < 30$), we have to assume $X$ has a normal distribution with mean $\mu$ and
SD $\sigma$ (even though its value is unknown). Then $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ has a t-distribution with $(n-1)$
degrees of freedom.
When $\sigma$ is unknown, the CI for $\mu$ at a given confidence level $(1 - \alpha)$ is
$\bar{X} \pm t_{n-1,\,\alpha/2}\,s/\sqrt{n}$
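
A small simulation shows why case (b) needs the t critical value. The sketch below uses a hypothetical standard normal population and $n = 6$, values picked only for illustration. It compares intervals built with $s$ and the z value 1.96 against intervals built with $s$ and 2.571, the standard t-table value for df = 5 at 95%.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def coverage(crit, n, trials, mu=0.0, sigma=1.0, seed=1):
    """Share of intervals xbar +/- crit * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)                   # same seed -> same samples for both runs
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = fmean(sample), stdev(sample)  # stdev uses the n-1 divisor
        margin = crit * s / sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

n = 6                                     # small sample, df = 5
z_cov = coverage(1.96, n, trials=4000)    # z critical value with s plugged in
t_cov = coverage(2.571, n, trials=4000)   # t critical value for df = 5
print(f"z-based coverage: {z_cov:.3f}, t-based coverage: {t_cov:.3f}")
```

With a small sample, the z-based intervals should cover $\mu$ noticeably less often than 95%, while the t-based intervals should stay close to 95%.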

The t distribution with $(n-1)$ degrees of freedom (graph on the last page of the
textbook)
▪ The t distribution is similar to the standard normal distribution (the Z distribution) in
many aspects: (1) all values are possible
(2) symmetric around zero
(3) bell-shaped
▪ However, it has heavier tails than the Z distribution. Different sample sizes
result in different tail thickness in a t distribution: the smaller the sample
size (the degrees of freedom), the thicker the tails.
▪ Each t distribution is defined through its degrees of freedom (df), and the
corresponding t distribution is denoted by $t_{df}$
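
The "heavier tails" claim can be made concrete by comparing densities far from zero. The sketch below writes out the standard t density formula by hand (the function name `t_pdf` is chosen for this example) and evaluates it at $x = 3$ for a few degrees of freedom, next to the Z density.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    # Normalizing constant: Gamma((df+1)/2) / (sqrt(df*pi) * Gamma(df/2))
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

x = 3.0
for df in (2, 5, 29):
    print(f"t, df = {df:>2}: density at {x} = {t_pdf(x, df):.4f}")
print(f"Z         : density at {x} = {NormalDist().pdf(x):.4f}")
```

The density in the tail shrinks as df grows and approaches the Z value, which is the sense in which small samples give thicker tails.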
When the sample size is very
large (i.e., n > 120), $t_{n-1} \approx Z$ !!
Use the t-table to find the critical value:
Page 566, Table IV
Ex6. Use the t table to find the 95% and 99% t-critical values for each of the following
sample sizes:

Sample size n    Degrees of freedom (df) = n-1    t* at 95%    t* at 99%
3                2                                4.303        9.925
6                5                                2.571        4.032
12               11                               2.201        3.106
30               29                               2.045        2.756
∞                ∞                                1.96         2.576
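
The table entries in Ex6 can also be recovered numerically, which is a useful check when no t table is at hand. The sketch below is one possible approach, not the textbook's method: it integrates the t density with Simpson's rule and finds the critical value by bisection. All function names are illustrative.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def central_area(t, df, steps=1000):
    """P(-t < T < t): Simpson's rule on [0, t], doubled by symmetry (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_critical(level, df, lo=0.0, hi=100.0):
    """Bisection for the t* whose central area equals `level`."""
    for _ in range(40):
        mid = (lo + hi) / 2
        if central_area(mid, df) < level:
            lo = mid          # area too small, so t* lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

for n in (3, 6, 12, 30):
    df = n - 1
    print(f"n = {n:>2}, df = {df:>2}: "
          f"95% t* = {t_critical(0.95, df):.3f}, 99% t* = {t_critical(0.99, df):.3f}")
```

The printed values should agree with the t table to about three decimals. In practice a statistics package would be used instead of hand-rolled integration.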

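Ex7. $X \sim$ Normal distribution. $n = 25$, $\bar{X} = 8$ and $s = 2$. What is the 95% CI for the population
mean $\mu$?
With df $= 25 - 1 = 24$, the 95% t-critical value is 2.064, so
$\bar{X} \pm 2.064\,s/\sqrt{n} = 8 \pm 2.064 \times 2/\sqrt{25} = 8 \pm 0.83 = [7.17, 8.83]$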
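
Ex7 as a short calculation. The sketch takes the critical value as an input read from the t table (2.064 for df = 24, as used in the example), since Python's standard library has no t quantile function; `t_interval` is an illustrative name.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """CI for mu with sigma unknown; t_crit comes from a t table with df = n - 1."""
    margin = t_crit * s / sqrt(n)   # critical value times the estimated SE
    return xbar - margin, xbar + margin

# Numbers from Ex7: xbar = 8, s = 2, n = 25, t* = 2.064 (df = 24)
low, high = t_interval(xbar=8.0, s=2.0, n=25, t_crit=2.064)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # [7.17, 8.83]
```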
Ex8. $X$ = # of claims received (per week) by an insurance company. Based on a sample of 41 weeks,
$\bar{X} = 18.5$ and $s = 20.0$. What is the 95% CI for $\mu$?
Here $n = 41$ is large enough for us to use the formula
$\bar{X} \pm z_{\alpha/2}\,s/\sqrt{n} = 18.5 \pm 1.96 \times 20/\sqrt{41} = [12.38, 24.62]$
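
The large-sample calculation in Ex8 can be checked the same way. In this sketch the sample SD $s$ stands in for $\sigma$ and the z critical value is computed rather than looked up; the function name `large_sample_interval` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_interval(xbar, s, n, level=0.95):
    """Approximate CI for mu when n is large and s replaces the unknown sigma."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Numbers from Ex8: xbar = 18.5, s = 20.0, n = 41
low, high = large_sample_interval(18.5, 20.0, 41)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # [12.38, 24.62]
```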