PPT Slides for Confidence Intervals and Hypothesis Tests for p1 – p2

Chapter 22
Comparing 2 Proportions
© 2006 W.H. Freeman and Company
Objectives (Chapter 22)
Comparing two proportions

Comparing two independent samples

Large-sample CI for two proportions

Test of statistical significance
Comparing two independent samples
We often need to estimate the difference p1 – p2 between two unknown
population proportions based on independent samples. We can compute the
difference between the two sample proportions and compare it to the
corresponding, approximately normal sampling distribution for $\hat{p}_1 - \hat{p}_2$.
Point Estimator of p1 – p2: $\hat{p}_1 - \hat{p}_2$



Two random samples are drawn from two
populations.
The number of successes in each sample is
recorded.
The sample proportions are computed.
Sample 1
Sample size n1
Number of successes x1
Sample proportion $\hat{p}_1 = x_1 / n_1$
Sample 2
Sample size n2
Number of successes x2
Sample proportion $\hat{p}_2 = x_2 / n_2$
Large-sample CI for two proportions
For two independent SRSs of sizes n1 and n2 with sample proportions
of successes $\hat{p}_1$ and $\hat{p}_2$ respectively, an approximate level C
confidence interval for p1 – p2 is
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
where z* is the appropriate value from the z-table that depends on the
confidence level C.
C is the area under the standard normal curve between −z* and z*.
Use this method when
$$n_1\hat{p}_1 \ge 10, \quad n_1(1-\hat{p}_1) \ge 10, \quad n_2\hat{p}_2 \ge 10, \quad n_2(1-\hat{p}_2) \ge 10$$
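As a rough illustration of how z* follows from the confidence level C, and of the large-sample check, here is a minimal Python sketch. The helper names `z_star` and `large_sample_ok` are hypothetical, introduced only for this example, and the sketch uses the standard library's `statistics.NormalDist` in place of a printed z-table. Because a sample size times its sample proportion is just the success count, the check is written in terms of counts of successes and failures.

```python
from statistics import NormalDist


def z_star(conf_level: float) -> float:
    """Critical value z* such that the area between -z* and z* equals conf_level."""
    if not 0 < conf_level < 1:
        raise ValueError("conf_level must be strictly between 0 and 1")
    # The central area C leaves (1 - C) / 2 in each tail,
    # so z* is the (1 + C) / 2 quantile of the standard normal.
    return NormalDist().inv_cdf((1 + conf_level) / 2)


def large_sample_ok(x1: int, n1: int, x2: int, n2: int, minimum: int = 10) -> bool:
    """True when both samples have at least `minimum` successes and failures."""
    return min(x1, n1 - x1, x2, n2 - x2) >= minimum


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}  z* = {z_star(c):.3f}")
```

The `minimum` parameter defaults to the threshold of 10 used on this slide; other texts use different thresholds, so it is left adjustable.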
Cholesterol and heart attacks
How much does the cholesterol-lowering drug Gemfibrozil help reduce the risk
of heart attack? We compare the incidence of heart attack over a 5-year period
for two random samples of middle-aged men taking either the drug or a placebo.
              H. attack    n       $\hat{p}$
Drug (p2)     56           2051    2.73%
Placebo (p1)  84           2030    4.14%

Standard error of the difference $\hat{p}_1 - \hat{p}_2$:
$$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.0414(0.9586)}{2030} + \frac{0.0273(0.9727)}{2051}} = 0.00570$$
The confidence interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \, SE$.
So the 90% CI is (0.0414 − 0.0273) ± 1.645 × 0.00570 = 0.0141 ± 0.0094 = (0.0047, 0.0235).
We are 90% confident that the interval 0.47% to 2.35% captures the true difference, in percentage
points, between the heart attack rates of middle-aged men taking a placebo and those taking the
cholesterol-lowering drug.
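The interval calculation for this example can be sketched in Python as follows. This is a minimal sketch under the large-sample assumptions stated earlier, not a reference implementation: the function name `two_proportion_ci` is hypothetical, and the counts (84 of 2030 on placebo, 56 of 2051 on the drug) are taken from the table above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, conf_level=0.95):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error of the difference between two independent sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves (1 - C) / 2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    diff = p1_hat - p2_hat
    return diff, se, (diff - z * se, diff + z * se)


# Sample 1 = placebo, sample 2 = drug, as labelled on the slide.
diff, se, (low, high) = two_proportion_ci(84, 2030, 56, 2051, conf_level=0.90)
print(f"difference = {diff:.4f}, SE = {se:.5f}")
print(f"90% CI = ({low:.4f}, {high:.4f})")
```

Working from the raw counts rather than the rounded percentages avoids carrying rounding error into the standard error.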
Example: 95% confidence interval for p1 – p2
The age at which a woman gives birth to her first child may be an important factor in the
risk of later developing breast cancer. An international study conducted by WHO selected
women with at least one birth and recorded if they had breast cancer or not and whether
they had their first child before their 30th birthday or after.
                            Cancer    Sample size    $\hat{p}$
Age at first birth > 30     683       3220           21.2% ($\hat{p}_1$)
Age at first birth <= 30    1498      10,245         14.6% ($\hat{p}_2$)

The parameter to be estimated is p1 – p2.
p1 = cancer rate when age at 1st birth > 30
p2 = cancer rate when age at 1st birth <= 30
$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
$$(.212 - .146) \pm 1.96 \sqrt{\frac{.212(.788)}{3220} + \frac{.146(.854)}{10{,}245}}$$
.066 ± 1.96(.008) or .066 ± .016
(.05, .082)
We estimate that the cancer rate when age at first birth > 30 is between .05
and .082 higher than when age <= 30.
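Hypothesis Tests for p1 – p2
If the null hypothesis is true, then we can rely on the properties of the
sampling distribution of $\hat{p}_1 - \hat{p}_2$ to estimate the probability of selecting 2
samples with proportions $\hat{p}_1$ and $\hat{p}_2$.
$$H_0: p_1 - p_2 = 0 \quad \text{(that is, } p_1 = p_2 = p\text{)}$$
$$H_a: p_1 - p_2 > 0, \quad p_1 - p_2 < 0, \quad \text{or} \quad p_1 - p_2 \ne 0$$
Our best estimate of p is $\hat{p}$, the pooled sample proportion:
$$\hat{p} = \frac{\text{total successes}}{\text{total observations}} = \frac{\text{count}_1 + \text{count}_2}{n_1 + n_2}$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
This test is appropriate when
$$n_1\hat{p}_1 \ge 10, \quad n_1(1-\hat{p}_1) \ge 10, \quad n_2\hat{p}_2 \ge 10, \quad n_2(1-\hat{p}_2) \ge 10$$
Sampling distribution of $\hat{p}_1 - \hat{p}_2$ when $H_0: p_1 - p_2 = 0$ is true: approximately normal, centered at 0, with standard deviation $\sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$.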
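The claim that, under the null hypothesis, the difference in sample proportions is centered at 0 with standard deviation $\sqrt{p(1-p)(1/n_1 + 1/n_2)}$ can be checked informally by simulation. The sketch below is only an illustration: the function name `simulate_differences` is hypothetical, and the values p = 0.3, n1 = 50, n2 = 60 are made-up inputs chosen for the demonstration, not data from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_differences(p, n1, n2, reps=2000, seed=0):
    """Draw `reps` pairs of samples with a common success probability p
    and return the simulated differences p1_hat - p2_hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Each count is a sum of independent success/failure trials.
        x1 = sum(rng.random() < p for _ in range(n1))
        x2 = sum(rng.random() < p for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Illustrative values only (assumed for this sketch, not from the slides).
p, n1, n2 = 0.3, 50, 60
diffs = simulate_differences(p, n1, n2)
theoretical_sd = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

print(f"simulated mean of differences: {mean(diffs):.4f}")
print(f"simulated SD of differences:   {stdev(diffs):.4f}")
print(f"theoretical SD under H0:       {theoretical_sd:.4f}")
```

The simulated mean should land near 0 and the simulated standard deviation near the theoretical value, with agreement improving as `reps` grows.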
Gastric Freezing
Gastric freezing was once a treatment for ulcers. Patients would
swallow a deflated balloon with tubes, and a cold liquid would be
pumped for an hour to cool the stomach and reduce acid production,
thus relieving ulcer pain. The treatment was shown to be safe,
significantly reducing ulcer pain, and was widely used for years.
A randomized comparative experiment later compared the outcome of gastric
freezing with that of a placebo: 28 of the 82 patients subjected to gastric
freezing improved, while 30 of the 78 in the control group improved.
H0: pgf − pplacebo = 0
Ha: pgf − pplacebo > 0
pgf = proportion that receive relief from gastric freezing
pplacebo = proportion that receive relief using a placebo
$$\hat{p}_{gf} = \frac{28}{82} = .341 \qquad \hat{p}_{placebo} = \frac{30}{78} = .385 \qquad \hat{p}_{pooled} = \frac{28 + 30}{82 + 78} = 0.3625$$
$$z = \frac{\hat{p}_{gf} - \hat{p}_{placebo}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.3415 - 0.3846}{\sqrt{0.3625 \times 0.6375 \left(\frac{1}{82} + \frac{1}{78}\right)}} = \frac{-0.0431}{0.0760} \approx -0.57$$
$$P\text{-value} = P(Z > -0.57) \approx 0.715$$
Conclusion: The gastric freezing was no better than a placebo (P-value ≈ 0.715),
and this treatment was abandoned. ALWAYS USE A CONTROL!
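The pooled two-proportion z test used in this example can be sketched in Python as below. This is a minimal sketch rather than a definitive implementation: the function name `two_proportion_z_test` and its `alternative` argument are hypothetical, and the counts (28 of 82 for gastric freezing, 30 of 78 for placebo) come from the experiment described above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="greater"):
    """Pooled z test of H0: p1 - p2 = 0. Returns (z, p_value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples share one p, estimated by pooling the counts.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # Ha: p1 - p2 > 0
        p_value = 1 - cdf
    elif alternative == "less":       # Ha: p1 - p2 < 0
        p_value = cdf
    elif alternative == "two-sided":  # Ha: p1 - p2 != 0
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value


# Sample 1 = gastric freezing, sample 2 = placebo; Ha: pgf - pplacebo > 0.
z, p_value = two_proportion_z_test(28, 82, 30, 78, alternative="greater")
print(f"z = {z:.3f}, P-value = {p_value:.3f}")
```

Because the alternative is one-sided in the "greater" direction and the observed difference is negative, the P-value comes out well above one half.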