Chapter 20: Comparing Two Proportions (BPS - 5th Ed.)

Two-Sample Problems
• The goal of inference is to compare the responses to two treatments
  or to compare the characteristics of two populations.
• We have a separate sample from each treatment or each population.
  The units are not matched, and the samples can be of differing sizes.
Case Study
Machine Reliability
A study is performed to test the reliability of
products produced by two machines. Machine A
produced 8 defective parts in a run of 140, while
machine B produced 11 defective parts in a run
of 200.

             n     Defects
Machine A    140   8
Machine B    200   11

This is an example of when to use the two-proportion z procedures.
Inference about the Difference p1 – p2
Simple Conditions
• The difference in the population proportions is estimated by the
  difference in the sample proportions: $\hat{p}_1 - \hat{p}_2$
• When both of the samples are large, the sampling distribution of this
  difference is approximately Normal with mean p1 – p2 and standard deviation
$$\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$
Inference about the Difference p1 – p2
Sampling Distribution
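
As a numerical illustration of this sampling distribution, the sketch below simulates many pairs of samples and compares the simulated mean and standard deviation of the difference in sample proportions with the formula for large samples. It is not part of the original slides: the population proportions 0.06 and 0.05, the number of repetitions, and the seed are illustrative assumptions, and the sample sizes 140 and 200 are borrowed from the machine reliability case study.

```python
import math
import random
import statistics


def simulate_differences(p1, n1, p2, n2, reps, seed=20):
    """Draw `reps` pairs of independent samples and return each p1_hat - p2_hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Count successes in each sample (one Bernoulli trial per unit).
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical population proportions (assumed for illustration only).
P1, P2 = 0.06, 0.05
N1, N2 = 140, 200

diffs = simulate_differences(P1, N1, P2, N2, reps=4000)

# Standard deviation of the difference from the large-sample formula.
theory_sd = math.sqrt(P1 * (1 - P1) / N1 + P2 * (1 - P2) / N2)

print(f"simulated mean: {statistics.mean(diffs):.4f}  (theory: {P1 - P2:.4f})")
print(f"simulated sd:   {statistics.stdev(diffs):.4f}  (theory: {theory_sd:.4f})")
```

With these settings the simulated mean and standard deviation should land close to p1 – p2 and to the formula value, respectively; a histogram of `diffs` would show the roughly Normal shape.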
Standard Error
Since the population proportions p1 and p2 are unknown, the standard
deviation of the difference in sample proportions will need to be
estimated by substituting $\hat{p}_1$ and $\hat{p}_2$ for p1 and p2:
$$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
Case Study: Reliability
Confidence Interval
Compute a 90% confidence interval for the difference in reliabilities
(as measured by proportion of defectives) for the two machines.
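$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
$$= \left(\frac{8}{140} - \frac{11}{200}\right) \pm 1.645 \sqrt{\frac{\frac{8}{140}\left(1-\frac{8}{140}\right)}{140} + \frac{\frac{11}{200}\left(1-\frac{11}{200}\right)}{200}}$$
$$= 0.0021 \pm 0.0418$$
$$= -0.0397 \text{ to } 0.0439$$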
We are 90% confident that the difference in proportion of defectives
for the two machines is between -3.97 and 4.39 percentage points. Since 0
is in this interval, the data give no convincing evidence that the two
machines differ in reliability.
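
The interval arithmetic can be checked with a few lines of Python. This is a minimal sketch rather than part of the slides: the function name `two_proportion_ci` is hypothetical, and it takes z* from the standard library's `NormalDist` instead of a table.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.90):
    """Large-sample z confidence interval for p1 - p2.

    x1, x2 are the success counts (here: defective parts); n1, n2 the sample sizes.
    Returns (difference, margin of error, lower bound, upper bound).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error with the sample proportions substituted for p1 and p2.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value z*: about 1.645 for 90% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    margin = z_star * se
    return diff, margin, diff - margin, diff + margin


# Machine A: 8 defects in 140 parts; Machine B: 11 defects in 200 parts.
diff, margin, lower, upper = two_proportion_ci(8, 140, 11, 200, confidence=0.90)
print(f"{diff:.4f} +/- {margin:.4f}  ->  {lower:.4f} to {upper:.4f}")
```

The endpoints can differ from the slide in the last decimal place because the slide combines values that were already rounded to four places.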
Adjustment to Confidence Interval
“Plus Four” Confidence Interval for p1 – p2
• The standard confidence interval can yield unstable or erratic
  inferences, especially when the samples are small.
• By adding four imaginary observations (one success and one failure
  to each sample), the inferences can be stabilized.
• This generally leads to more accurate inference about the difference
  of the population proportions.
Adjustment to Confidence Interval
“Plus Four” Confidence Interval for p1 – p2
• Add 4 imaginary observations, one success and one failure to each sample.
• Compute the “plus four” proportions.
$$\tilde{p}_1 = \frac{\text{number of successes in sample 1} + 1}{n_1 + 2}, \qquad \tilde{p}_2 = \frac{\text{number of successes in sample 2} + 1}{n_2 + 2}$$
• Use the “plus four” proportions in the formula.
$$(\tilde{p}_1 - \tilde{p}_2) \pm z^* \sqrt{\frac{\tilde{p}_1(1-\tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1-\tilde{p}_2)}{n_2 + 2}}$$
Case Study: Reliability
“Plus Four” 90% Confidence Interval
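$$\tilde{p}_1 = \frac{8 + 1}{140 + 2} = \frac{9}{142}, \qquad \tilde{p}_2 = \frac{11 + 1}{200 + 2} = \frac{12}{202}$$
$$(\tilde{p}_1 - \tilde{p}_2) \pm z^* \sqrt{\frac{\tilde{p}_1(1-\tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1-\tilde{p}_2)}{n_2 + 2}}$$
$$= \left(\frac{9}{142} - \frac{12}{202}\right) \pm 1.645 \sqrt{\frac{\frac{9}{142}\left(1-\frac{9}{142}\right)}{142} + \frac{\frac{12}{202}\left(1-\frac{12}{202}\right)}{202}}$$
$$= 0.0040 \pm 0.0434$$
$$= -0.0394 \text{ to } 0.0474$$
(This is more accurate.)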
We are 90% confident that the difference in proportion of defectives
for the two machines is between -3.94 and 4.74 percentage points. Since 0
is in this interval, the data give no convincing evidence that the two
machines differ in reliability.
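
The “plus four” adjustment changes only the proportions and the denominators, which a short sketch makes explicit. The function name `plus_four_ci` is hypothetical and the code is an illustration of the formula above, not material from the slides.

```python
import math
from statistics import NormalDist


def plus_four_ci(x1, n1, x2, n2, confidence=0.90):
    """'Plus four' confidence interval for p1 - p2.

    Adds one imaginary success and one imaginary failure to each sample,
    then applies the usual z interval with n1 + 2 and n2 + 2.
    """
    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    se = math.sqrt(
        p1_tilde * (1 - p1_tilde) / (n1 + 2)
        + p2_tilde * (1 - p2_tilde) / (n2 + 2)
    )
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_tilde - p2_tilde
    margin = z_star * se
    return diff, margin, diff - margin, diff + margin


# Same machine data as before: 8 of 140 and 11 of 200 defective.
diff, margin, lower, upper = plus_four_ci(8, 140, 11, 200, confidence=0.90)
print(f"{diff:.4f} +/- {margin:.4f}  ->  {lower:.4f} to {upper:.4f}")
```

With samples this large the adjusted interval stays close to the standard one; the adjustment matters more when the samples are small.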
The Hypotheses for Testing
Two Proportions
• Null: H0: p1 = p2
• One-sided alternatives
  Ha: p1 > p2
  Ha: p1 < p2
• Two-sided alternative
  Ha: p1 ≠ p2
Pooled Sample Proportion
• If H0 is true (p1 = p2), then the two proportions are equal to some
  common value p.
• Instead of estimating p1 and p2 separately, we will combine or pool
  the sample information to estimate p.
• This combined or pooled estimate is called the pooled sample
  proportion, and we will use it in place of each of the sample
  proportions in the expression for the standard error SE.
$$\hat{p} = \frac{\text{total number of successes in both samples}}{\text{total number of observations in both samples}} \qquad \text{(pooled sample proportion)}$$
Test Statistic for Two Proportions
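• Use the pooled sample proportion in place of each of the individual
  sample proportions in the expression for the standard error SE in the
  test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$
$$\Rightarrow\; z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}}$$
$$\Rightarrow\; z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$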
P-value for Testing Two Proportions
• Ha: p1 > p2
  P-value is the probability of getting a value as large or
  larger than the observed test statistic (z) value.
• Ha: p1 < p2
  P-value is the probability of getting a value as small or
  smaller than the observed test statistic (z) value.
• Ha: p1 ≠ p2
  P-value is two times the probability of getting a value as
  large or larger than the absolute value of the observed test
  statistic (z) value.
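
The three P-value rules map directly onto tail areas of the standard Normal distribution. The sketch below is one way to compute them; the function name `p_value` and the labels "greater", "less", and "two-sided" are hypothetical choices, and the tail areas come from `math.erfc` rather than a table.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard Normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def p_value(z, alternative):
    """P-value for an observed z statistic under the given alternative."""
    if alternative == "greater":      # Ha: p1 > p2
        return upper_tail(z)
    if alternative == "less":         # Ha: p1 < p2
        return upper_tail(-z)         # P(Z <= z) equals P(Z >= -z) by symmetry
    if alternative == "two-sided":    # Ha: p1 != p2
        return 2 * upper_tail(abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Illustrative z value (assumed, not from the slides).
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.96, alt), 4))
```

Working with the upper tail directly avoids computing 1 minus a number very close to 1, which loses precision when z is large.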
Case Study
Summer Jobs
• A university financial aid office polled a simple random sample of
  undergraduate students to study their summer employment.
• Not all students were employed the previous summer. Here are the results:

Summer Status    Men    Women
Employed         718    593
Not Employed     79     139
Total            797    732

• Is there evidence that the proportion of male students who had summer
  jobs differs from the proportion of female students who had summer jobs?
Case Study: Summer Jobs
The Hypotheses
• Null: The proportion of male students who had summer jobs is the same
  as the proportion of female students who had summer jobs.
  [H0: p1 = p2]
• Alt: The proportion of male students who had summer jobs differs from
  the proportion of female students who had summer jobs.
  [Ha: p1 ≠ p2]
Case Study: Summer Jobs
Test Statistic
• n1 = 797 and n2 = 732
  (both large, so test statistic follows a Normal distribution)
• Pooled sample proportion:
$$\hat{p} = \frac{718 + 593}{797 + 732} = \frac{1311}{1529}$$
• Standardized score (test statistic):
$$z = \frac{\frac{718}{797} - \frac{593}{732}}{\sqrt{\frac{1311}{1529}\left(1 - \frac{1311}{1529}\right)\left(\frac{1}{797} + \frac{1}{732}\right)}} = 5.07$$
Case Study: Summer Jobs
1. Hypotheses:
   H0: p1 = p2
   Ha: p1 ≠ p2
2. Test Statistic:
$$z = \frac{\frac{718}{797} - \frac{593}{732}}{\sqrt{\frac{1311}{1529}\left(1 - \frac{1311}{1529}\right)\left(\frac{1}{797} + \frac{1}{732}\right)}} = 5.07$$
3. P-value:
   P-value = 2P(Z > 5.07) = 0.000000396 (using a computer)
   P-value = 2P(Z > 5.07) < 2(1 – 0.9998) = 0.0004 (Table A)
   [since 5.07 > 3.49 (the largest z-value in the table)]
4. Conclusion:
   Since the P-value is smaller than α = 0.001, there is very
   strong evidence that the proportion of male students who had
   summer jobs differs from that of female students.
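
The four steps can be reproduced end to end in Python. This is a sketch under the slide's own setup (pooled standard error, two-sided alternative); the function name `two_proportion_z_test` is hypothetical, and the two-sided P-value is computed with `math.erfc` instead of Table A.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z test of H0: p1 = p2 against Ha: p1 != p2.

    Returns (pooled proportion, z statistic, two-sided P-value).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled sample proportion: all successes over all observations.
    pooled = (x1 + x2) / (n1 + n2)
    # Standard error using the pooled proportion for both samples.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # 2 * P(Z > |z|) = erfc(|z| / sqrt(2)) for a standard Normal Z.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return pooled, z, p_two_sided


# Summer jobs data: 718 of 797 men and 593 of 732 women were employed.
pooled, z, p = two_proportion_z_test(718, 797, 593, 732)
print(f"pooled proportion = {pooled:.4f}")
print(f"z = {z:.2f}")
print(f"two-sided P-value = {p:.3g}")
```

The z statistic should come out at about 5.07 and the P-value on the order of 4 × 10^-7, in line with the computer value quoted in step 3; small differences in the last digits depend on how much z is rounded before the tail area is taken.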