Chapter 7.3: Estimation - Interval Estimation for $\pi$

ST 361 Estimation --- Interval Estimation for $\pi$ (§7.3)
---------------------------------------------------------------------------------------------------------------------
Topics:
I. Interval estimation: confidence interval
II. (Two-sided) confidence interval for estimating population mean $\mu$ (§7.2, 7.4)
(a) When the population SD $\sigma$ is known: use Z distribution (§7.2)
(b) When the population SD $\sigma$ is NOT known: use t distribution (§7.4)
III. (Two-sided) confidence interval for estimating population proportion $\pi$ (§7.3)
IV. (Two-sided) confidence interval for estimating population mean difference $\mu_1 - \mu_2$ (§7.5)
(a) When the population SDs $\sigma_1$, $\sigma_2$ are known
(b) When the population SDs $\sigma_1$, $\sigma_2$ are NOT known
---------------------------------------------------------------------------------------------------------------------
III. Confidence Interval for $\pi$, the population proportion (or the success probability)

The natural statistic for estimating $\pi$ is the sample proportion of successes $p$.

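Recall that $p = X/n$, where $X$ is the total number of successes out of $n$ trials.

Also recall that if $n$ is large (i.e., $n \ge 30$, $n\pi \ge 5$ and $n(1-\pi) \ge 5$), the sampling distribution of $p$ can be approximated by a normal distribution with mean $\pi$ and standard deviation $\sqrt{\pi(1-\pi)/n}$:

$$p \approx N\left(\pi,\ \sqrt{\frac{\pi(1-\pi)}{n}}\right)$$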
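The normal approximation above can be checked with a small simulation. The following is only a sketch: the true proportion 0.2, the sample size 100, the number of repetitions, and the seed are all illustrative assumptions, not values from these notes.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(361)  # arbitrary seed so the run is repeatable

pi_true = 0.2   # assumed true proportion (illustrative)
n = 100         # trials per sample, so n*pi = 20 and n*(1 - pi) = 80
reps = 5000     # number of simulated samples (illustrative)

# Each simulated sample yields one sample proportion p = X / n.
p_values = [
    sum(random.random() < pi_true for _ in range(n)) / n
    for _ in range(reps)
]

simulated_mean = mean(p_values)
simulated_sd = pstdev(p_values)
theoretical_sd = sqrt(pi_true * (1 - pi_true) / n)

print(f"mean of p: {simulated_mean:.4f} (theory {pi_true})")
print(f"SD of p:   {simulated_sd:.4f} (theory {theoretical_sd:.4f})")
```

The simulated mean and SD of $p$ should land close to $\pi$ and $\sqrt{\pi(1-\pi)/n}$, though the exact digits depend on the seed.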
The CI for $\pi$ with confidence level $(1-\alpha)$ takes the following form:

$$\left[\, p - z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}},\ \ p + z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}} \,\right]$$
Since $\pi$ in the above interval is unknown, replace it by the sample proportion $p$ and actually use the following CI:

$$\left[\, p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ \ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \,\right]$$
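As a minimal sketch, the plug-in interval can be computed in Python as follows. The function name `proportion_ci` and the counts 45 out of 150 are hypothetical, chosen only for illustration; the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample CI for a proportion, using p in place of the unknown pi."""
    p = successes / n                          # sample proportion
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # critical value z_{alpha/2}
    se = sqrt(p * (1 - p) / n)                 # estimated standard error of p
    return p - z * se, p + z * se


# Hypothetical data: 45 successes in 150 trials.
low, high = proportion_ci(45, 150)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

The function does not check the large-sample conditions, so it should only be trusted when those conditions hold.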
Note that here we still use a z value from the standard normal distribution instead of a t distribution, but this time it requires a more stringent criterion for “large n”.

In conclusion,
• Still use the Z distribution to find the critical value z*
• Need np > 10 and n(1-p) > 10 for the distribution of p to be approximately normal.

Confidence interval for $\pi$ (assume n > 30, np > 10 and n(1-p) > 10):
The confidence interval for $\pi$ is

$$\left[\, p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ \ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \,\right]$$
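The three conditions can be checked directly from the observed counts, since np is the number of successes and n(1-p) is the number of failures. A small sketch follows; the helper name `large_sample_ok` and its default thresholds are illustrative and simply mirror the conditions stated above.

```python
def large_sample_ok(successes, n, min_n=30, min_count=10):
    """Return True when n > 30, np > 10 and n(1-p) > 10 all hold."""
    failures = n - successes   # n(1 - p) equals the number of failures
    return n > min_n and successes > min_count and failures > min_count


print(large_sample_ok(20, 100))  # 20 successes, 80 failures
print(large_sample_ok(3, 100))   # too few successes
```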
Ex1. For a certain disease, there is a generic drug and a brand-name drug. We want to estimate $\pi$ = P(a patient prefers the brand-name drug). So 100 patients were asked, and 20 of them preferred the brand-name drug.
(a) What is your estimate of $\pi$? What is the estimated SE of your estimator?
Answer: A natural estimator of $\pi$ is the sample proportion p = 20/100 = 0.2. The estimated standard error (SE) of p is

$$\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.2 \times (1-0.2)}{100}} = 0.04$$
(b) Find the 95% CI for $\pi$.
Answer: Here n = 100 > 30, np = 100 × 0.2 = 20 > 10, and n(1-p) = 100 × 0.8 = 80 > 10, so we can use the CI formula for $\pi$ given above.
$1-\alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025 \Rightarrow z_{0.025} = 1.96$. So a 95% CI for $\pi$ is

$$\left[\, p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ \ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \,\right] = [0.2 - 1.96 \times 0.04,\ 0.2 + 1.96 \times 0.04] = [0.122,\ 0.278]$$
(c) What is the width of your CI?
Answer: The width of the CI is 0.278 – 0.122=0.156.
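The arithmetic in parts (a) to (c) can be reproduced with a few lines of Python. This is just a check of the numbers above, using the rounded critical value 1.96 as in the text; the variable names are arbitrary.

```python
from math import sqrt

x, n = 20, 100   # patients preferring the brand name, sample size
z = 1.96         # z_{0.025}, rounded as in the text

p = x / n                      # point estimate of pi
se = sqrt(p * (1 - p) / n)     # estimated standard error of p
low, high = p - z * se, p + z * se
width = high - low

print(f"p = {p}, SE = {se:.2f}")
print(f"95% CI = [{low:.3f}, {high:.3f}], width = {width:.4f}")
```

Subtracting the unrounded endpoints gives a width of about 0.157; the value 0.156 in part (c) comes from subtracting the endpoints after rounding them to three decimals.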
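(d) If we want the width of our 95% CI for $\pi$ to be $\le 0.1$, how many patients should be sampled?
Answer: The width of our 95% CI is

$$2\, z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} = 2 \times 1.96 \sqrt{\frac{p(1-p)}{n}} \le 0.1$$

If we use p = 0.2, then

$$2 \times 1.96 \sqrt{\frac{0.2 \times 0.8}{n}} \le 0.1 \ \Rightarrow\ n \ge \left(\frac{2 \times 1.96}{0.1}\right)^2 \times 0.2 \times 0.8 \approx 245.9, \ \text{so } n = 246.$$

If we treat p as unknown (it is unknown before the sampling), we can use p = 0.5 in the formula, which is the most conservative choice because p(1-p) is largest at p = 0.5, and solve

$$2 \times 1.96 \sqrt{\frac{0.5 \times 0.5}{n}} \le 0.1 \ \Rightarrow\ n \ge \left(\frac{2 \times 1.96}{0.1}\right)^2 \times 0.5 \times 0.5 \approx 384.2, \ \text{so round up to } n = 385.$$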
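The sample-size calculation can be sketched as a small function. The name `sample_size_for_width` is hypothetical, and the critical value is computed with `statistics.NormalDist` (about 1.96 for 95%) rather than taken from a table. The result is rounded up, since the unrounded solutions (about 245.9 and 384.2) are lower bounds on n.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_width(width, confidence=0.95, p=0.5):
    """Smallest n for which 2 * z * sqrt(p(1-p)/n) <= width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((2 * z / width) ** 2 * p * (1 - p))


print(sample_size_for_width(0.1, p=0.2))  # planning value p = 0.2
print(sample_size_for_width(0.1))         # conservative choice p = 0.5
```

Defaulting to p = 0.5 reflects the conservative case, because p(1-p) is largest there.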
Ex2. Suppose that the proportion of left-handed students at a certain university is $\pi$. A random sample of 200 students was collected, and 40 of the 200 students were found to be left-handed.
(a) Use an unbiased estimator to compute a point estimate of $\pi$.
Answer: An unbiased estimator of $\pi$ is the sample proportion p = 40/200 = 0.2.
(b) What is the approximate distribution of your estimator in (a)? Why?
Answer: It can be well approximated by a normal distribution since n=200>30,
np=40>10 and n(1-p)=160>10.
(c) What is the estimated standard error of your estimator?
(Hint: if the population proportion $\pi$ is known, then $\sigma_p = \sqrt{\pi(1-\pi)/n}$.)
Answer: The estimated standard error of p is

$$\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.2 \times 0.8}{200}} \approx 0.028$$
(d) Your estimator in (a) is unbiased because (circle one)
i. Its distribution is normal
ii. Its mean is equal to $\pi$
iii. Its SE is equal to $\sqrt{\pi(1-\pi)/n}$
iv. It is based on a sample with size greater than 30
(e) What is a 95% confidence interval for $\pi$?
Answer: Since n = 200 > 30, np = 40 > 10 and n(1-p) = 160 > 10, we can use the formula given before:

$$\left[\, p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ \ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \,\right] = [0.2 - 1.96 \times 0.028,\ 0.2 + 1.96 \times 0.028] = [0.145,\ 0.255]$$
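A quick numerical check of part (e), sketched in Python with arbitrary variable names. It also compares the interval obtained from the SE rounded to 0.028 (as in the text) with the one from the unrounded SE, to see whether the intermediate rounding matters at three decimals.

```python
from math import sqrt

x, n = 40, 200   # left-handed students, sample size
z = 1.96         # z_{0.025}, rounded as in the text

p = x / n
se = sqrt(p * (1 - p) / n)   # unrounded estimated SE, about 0.0283

# Compare the interval from the unrounded SE with the one from SE = 0.028.
for label, s in (("unrounded SE", se), ("rounded SE", round(se, 3))):
    low, high = p - z * s, p + z * s
    print(f"{label}: [{low:.3f}, {high:.3f}]")
```

Both versions give the same interval to three decimal places here, though in general it is safer to round only at the final step.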