Proportions

Dr. Neal, WKU
MATH 183
Population Proportion
A special case of a population mean is the proportion p of those having a certain
designation. For example, we may ask what proportion of the population approves of
the President’s performance. Because p is a proportion, it is always the case that
0 ≤ p ≤ 1; however, p is often stated as a percentage. If p = 0.53, then we usually say
that p is 53%. However, we always should work with p in decimal form.
When determining a proportion, questions may be asked in “Yes/No” form so that
the responses are not numerical but instead are categorical. Responses may also be of
the form
1. Strongly Approve
2. Approve
3. Indifferent
4. Disapprove
5. Strongly Disapprove
But in order to analyze the approval rate mathematically, we must assign a
numerical value to the categories that we are trying to measure. In this case the
responses “Strongly Approve” and “Approve” could be counted as one category “Yes”
and the other responses are “Not Yes.” We then assign the values 1 for “Yes” and 0 for
“Not Yes” so that the responses actually become numerical.
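The coding step can be sketched in Python as follows. The list of responses is hypothetical and the names `YES_CATEGORIES` and `code_response` are chosen only for illustration; the sketch simply maps the five categories above to 1 or 0.

```python
# Hypothetical survey responses on the five-point scale above.
responses = [
    "Approve", "Strongly Approve", "Indifferent",
    "Disapprove", "Approve", "Strongly Disapprove",
]

# The two categories counted as "Yes"; everything else is "Not Yes".
YES_CATEGORIES = {"Strongly Approve", "Approve"}

def code_response(response):
    """Return 1 for a "Yes" response and 0 for a "Not Yes" response."""
    return 1 if response in YES_CATEGORIES else 0

coded = [code_response(r) for r in responses]
print(coded)
```

Once the responses are coded this way, the proportion of "Yes" responses is just the average of the list.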
The True Mean and True Standard Deviation
There are only two possible measurements:

\[ x_i = \begin{cases} 1 & \text{if Yes} \\ 0 & \text{if not} \end{cases} \]

If we average all possible measurements over the population of size N, then we obtain

\[ \mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{\#\,\text{Yes}}{N} = p, \]

which is the true proportion that we are trying to measure. Thus,

\[ \mu = p. \]
The variance is the average of the squares minus the square of the average. But because $x_i^2 = x_i$ (still 1 or 0), we have

\[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2 = \frac{\#\,\text{Yes}}{N} - \mu^2 = p - p^2 = p(1-p). \]

By taking the square root, we obtain the true standard deviation. Thus,

\[ \sigma = \sqrt{p(1-p)}. \]
A random measurement X from the population is then either 1 or 0. Such a
measurement used to describe a population proportion is called a Bernoulli trial,
denoted by X ~ b(1, p) .
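As a quick numerical check of these two formulas, the sketch below builds a small hypothetical population of 0/1 measurements (N = 20 with 13 "Yes", numbers chosen only for illustration) and compares the mean and population standard deviation from Python's `statistics` module with p and the square root of p(1 − p).

```python
import math
import statistics

# Hypothetical population: N = 20 coded responses, 13 of them "Yes".
population = [1] * 13 + [0] * 7

p = sum(population) / len(population)      # true proportion of "Yes"
mu = statistics.fmean(population)          # population mean
sigma = statistics.pstdev(population)      # population SD (divides by N)

print(mu, p)                               # the mean equals p
print(sigma, math.sqrt(p * (1 - p)))       # the SD equals sqrt(p(1 - p))
```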
[Figure: graph of σ against p for 0 ≤ p ≤ 1, with σ = 0 at p = 0 and p = 1 and the maximum σ = 0.5 at p = 0.5.]

We note that the function $\sigma = \sqrt{p - p^2}$ is circular for 0 ≤ p ≤ 1. When p = 1 (where
100% of the population is “Yes”), then every measurement is 1 and therefore σ = 0
because there is no deviation. When p = 0 (where none of the population is “Yes”), then
every measurement is 0 and again there is no deviation; so σ = 0.
The maximum standard deviation occurs when p = 1/2, which gives a value of σ =
0.5. So for proportions, it is always the case that σ ≤ U = 0.5.
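The circular shape and the maximum can be checked by completing the square. The following short derivation is a supplementary sketch, not part of the original notes:

```latex
\begin{align*}
\sigma^2 &= p - p^2 = \tfrac{1}{4} - \left(p - \tfrac{1}{2}\right)^2 \\
\left(p - \tfrac{1}{2}\right)^2 + \sigma^2 &= \left(\tfrac{1}{2}\right)^2, \qquad \sigma \ge 0
\end{align*}
```

This is the upper half of the circle with center (1/2, 0) and radius 1/2, so σ is largest, with value 1/2, exactly when p = 1/2.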
Confidence Interval for Proportions
We estimate the value of p by the sample proportion

\[ \hat{p} = \frac{\#\,\text{“Yes”}}{\#\,\text{Responses}} = \frac{m}{n}, \]

where n is the sample size and m is the number of favorable responses.
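Because p is unknown, $\sigma = \sqrt{p(1-p)}$ is also unknown. But we can estimate
σ by $S = \sqrt{\hat{p}(1-\hat{p})}$ or by its upper bound of U = 0.5. With confidence intervals for the
mean µ, we have $\mu \approx \bar{x} \pm \dfrac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$; but now we use $p \approx \hat{p} \pm \dfrac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$. Replacing σ by
$S = \sqrt{\hat{p}(1-\hat{p})}$ or by U = 0.5, we have

\[
p \approx
\begin{cases}
\hat{p} \pm \dfrac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
\quad\text{or}\quad
\hat{p} \pm \dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}}
& \text{for large populations} \\[2ex]
\hat{p} \pm \dfrac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}
\quad\text{or}\quad
\hat{p} \pm \dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}
& \text{for a population of size } N
\end{cases}
\]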
The confidence interval for a large population proportion can
be found with the built-in 1–PropZInt feature (item A) from the
STAT TESTS menu. It always uses $\sigma \approx \sqrt{\hat{p}(1-\hat{p})}$.
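The intervals above depend on the z-score $z_{\alpha/2}$ for the chosen level of confidence. As a minimal sketch, assuming Python 3.8 or later, the value can be computed with `statistics.NormalDist`; the helper name `z_score` is an illustrative choice, and α is taken to be 1 minus the confidence level.

```python
from statistics import NormalDist

def z_score(r):
    """Return z_{alpha/2} for confidence level r, where alpha = 1 - r."""
    alpha = 1 - r
    # The z-score leaves area alpha/2 in the upper tail of the standard normal.
    return NormalDist().inv_cdf(1 - alpha / 2)

for r in (0.95, 0.98, 0.99):
    print(r, round(z_score(r), 3))
```

Rounded to three decimal places, these are the values 1.96, 2.326, and 2.576 used in the examples of this handout.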
Example 1. Investigators asked 250 undergraduate students at a large university about
prayer and found that 195 prayed at least a few times a year. Give a 95% confidence
interval for the proportion p of all undergraduates who pray at this school. Can we be
reasonably sure that the proportion is at least two-thirds?
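Solution. First, $\hat{p} = \dfrac{195}{250} = 0.78$. Assuming a “large” population of students, we have

\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
= 0.78 \pm \frac{1.96\sqrt{0.78 \times 0.22}}{\sqrt{250}}
\approx 0.78 \pm 0.05135
\]

Thus, 0.72865 ≤ p ≤ 0.83135.

or

\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\times 0.5}{\sqrt{n}}
= 0.78 \pm \frac{1.96 \times 0.5}{\sqrt{250}}
\approx 0.78 \pm 0.062
\]

So, 0.718 ≤ p ≤ 0.842.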
With the largest margin of error, p is at least 0.718; thus, we can clearly say
that at this school more than 2/3 of the students pray.
Using the built-in 1–PropZInt feature from the STAT TESTS menu gives the interval
0.72865 ≤ p ≤ 0.83135, because it estimates σ by $\sqrt{\hat{p}(1-\hat{p})}$.
Note: In the second solution above, we used the “worst case scenario” of σ = 0.5 to
obtain 0.78 ± 0.062 . Reporters often give this information as
“78% of students pray.” *
* Based on a survey of 250 students that has a maximum margin of error of 6.2 percentage points.
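The arithmetic of Example 1 can be reproduced with a short Python sketch. The function name `margin_of_error` is an illustrative choice; the data (195 of 250, z = 1.96) come from the example.

```python
import math

def margin_of_error(z, sigma, n):
    """Margin of error z * sigma / sqrt(n) for a large population."""
    return z * sigma / math.sqrt(n)

m, n, z = 195, 250, 1.96      # Example 1 data; z = 1.96 for 95% confidence
p_hat = m / n                 # sample proportion, 0.78

# First solution: sigma estimated by S = sqrt(p_hat * (1 - p_hat)).
e_est = margin_of_error(z, math.sqrt(p_hat * (1 - p_hat)), n)
# Second solution: sigma replaced by its upper bound U = 0.5.
e_max = margin_of_error(z, 0.5, n)

print(f"using S:    {p_hat - e_est:.5f} <= p <= {p_hat + e_est:.5f}")
print(f"worst case: {p_hat - e_max:.3f} <= p <= {p_hat + e_max:.3f}")
print("lower bound above 2/3:", p_hat - e_max > 2 / 3)
```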
Example 2. A random survey of a school’s 3600 Greek students found that 205 out of
500 had gone to worship (church, temple, etc.) in the last month. Find a 98% confidence
interval for the true proportion of the school’s Greeks who had been to worship in the
last month. Is there conclusive evidence that the proportion is less than 45% ?
Solution. First, $\hat{p} = \dfrac{205}{500} = 0.41$. Then using the finite population size of N = 3600, we
have

\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}
= 0.41 \pm \frac{2.326\sqrt{0.41 \times 0.59}}{\sqrt{500}}\sqrt{\frac{3100}{3599}}
= 0.41 \pm 0.04748.
\]

That is, 0.36252 ≤ p ≤ 0.45748, which means that p still might exceed 0.45.
Note: In this case, the maximum margin of error is

\[
\frac{2.326 \times 0.5}{\sqrt{500}}\sqrt{\frac{3100}{3599}} \approx 0.04827.
\]
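The effect of the finite population factor in Example 2 can be checked with the following sketch. The helper name `fpc` is an illustrative choice; the data (205 of 500 from N = 3600, z = 2.326) come from the example.

```python
import math

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

N, n, m, z = 3600, 500, 205, 2.326   # Example 2 data; z = 2.326 for 98% confidence
p_hat = m / n                        # sample proportion, 0.41

# Margin of error with sigma estimated by S, then with the bound U = 0.5.
e_est = z * math.sqrt(p_hat * (1 - p_hat)) / math.sqrt(n) * fpc(N, n)
e_max = z * 0.5 / math.sqrt(n) * fpc(N, n)

print(round(e_est, 5), round(e_max, 5))
print("upper bound below 0.45:", p_hat + e_est < 0.45)
```

Because the factor is less than 1 whenever n > 1, the finite population interval is always narrower than the large population interval for the same sample.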
Choosing the Sample Size
As with confidence intervals for the mean, we may like to know in advance what
sample size would provide a certain maximum margin of error e with a certain level of
confidence r. By using U = 0.5 as a bound for σ, the required sample size n satisfies

\[
n \ge
\begin{cases}
\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}
& \text{for large populations} \\[3ex]
\dfrac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1)+\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
& \text{for a population of size } N
\end{cases}
\]

where $z_{\alpha/2}$ is the appropriate z-score depending on the level of confidence. Note: We
always round up to the nearest integer.
Example 3. What sample size will guarantee a maximum margin of error of 0.035 for
any 99% confidence interval of a proportion? What sample size would guarantee the
result from a population of size 1200?
Solution. For a large population, the required sample size must satisfy

\[
n \ge \left(\frac{z_{\alpha/2}\times 0.5}{e}\right)^{2}
= \left(\frac{2.576 \times 0.5}{0.035}\right)^{2} = 1354.24;
\]

thus, n must be at least 1355. From a population of size 1200, the sample size must
satisfy

\[
n \ge \frac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1)+\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
= \frac{1200 \times 1354.24}{1199 + 1354.24} = 636.48;
\]

thus, n must be at least 637.
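The two sample size formulas, including the rounding up, can be sketched as one Python function. The name `sample_size` and the convention that `N=None` means a large population are assumptions made for this illustration.

```python
import math

def sample_size(z, e, N=None):
    """Smallest n giving maximum margin of error e, using the bound sigma <= 0.5.

    N=None is treated as a large population; otherwise N is the population size.
    """
    k = (z * 0.5 / e) ** 2              # large population requirement
    if N is not None:
        k = N * k / ((N - 1) + k)       # finite population requirement
    return math.ceil(k)                 # always round up

print(sample_size(2.576, 0.035))           # Example 3, large population
print(sample_size(2.576, 0.035, N=1200))   # Example 3, population of 1200
```

With the Example 3 values, the function returns 1355 and 637, matching the solution above.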
Practice Exercises
1. In a nationwide survey, only 378 of 900 adults surveyed approved of the President’s
performance.
(a) Find a 95% confidence interval for the true proportion of adults who approve of the
President’s performance.
(b) What is the maximum margin of error for a 95% confidence interval based on a
survey of 900 people?
(c) Based on the results in Part (b), can you be relatively sure that less than a majority
approve?
(d) If you wanted to estimate the true proportion within 0.025 with 95% confidence,
then how many people must be surveyed?
2. A pollster wants to survey the 535 members of Congress on a completely nonpartisan issue. The question will be “Did you watch the Super Bowl last February?”
The true proportion p will be estimated from a random sample of 60 members of Congress.
(a) Suppose 42 out of 60 said that they did watch the Super Bowl. Find a 98%
confidence interval (with the largest margin of error) for the true proportion of
members of Congress who watched the Super Bowl. Can you say with certainty
whether or not at least two-thirds of Congress watched the Super Bowl?
(b) If you wanted to estimate the true proportion p within 0.03 with 98% confidence,
then how many members of Congress would you need to survey?
Answers
1. (a) Note that $\hat{p} = \dfrac{378}{900} = 0.42$. Then

\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{\sqrt{n}}
= 0.42 \pm \frac{1.96\sqrt{0.42 \times 0.58}}{\sqrt{900}}
= 0.42 \pm 0.032246.
\]

That is, 0.387754 ≤ p ≤ 0.452246, which also can be found easily
with the 1–PropZInt screen.
(b) $\dfrac{z_{\alpha/2}\times 0.5}{\sqrt{n}} = \dfrac{1.96 \times 0.5}{\sqrt{900}} \approx 0.0327$

(c) Using the maximum possible margin of error, we have p ≤ 0.42 + 0.0327 < 0.50; so
we can be relatively sure (with 95% confidence) that less than a majority approve of the President’s performance.
(d) $n \ge \left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2} = \left(\dfrac{1.96 \times 0.5}{0.025}\right)^{2} = 1536.64$; so sample 1537.
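2. (a) $\hat{p} = \dfrac{42}{60} = 0.7$, and then using N = 535 and n = 60, we have

\[
p \approx \hat{p} \pm \frac{z_{\alpha/2}\times 0.5}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}
= 0.7 \pm \frac{2.326 \times 0.5}{\sqrt{60}}\sqrt{\frac{475}{534}}
= 0.7 \pm 0.1416.
\]

That is, 0.5584 ≤ p ≤ 0.8416, which means that p might be below 2/3.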
(b)

\[
n \ge \frac{N\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}{(N-1)+\left(\dfrac{z_{\alpha/2}\times 0.5}{e}\right)^{2}}
= \frac{535\left(\dfrac{2.326 \times 0.5}{0.03}\right)^{2}}{534+\left(\dfrac{2.326 \times 0.5}{0.03}\right)^{2}}
= \frac{535 \times 1502.8544}{534 + 1502.8544} = 394.74;
\]

so sample 395 members of Congress.