9.2 SAMPLE PROPORTIONS

OVERVIEW: The normal distribution curve is often
extremely useful in analyzing sample proportions. This
section provides insights into the circumstances that allow
for use of normal distribution properties.
Consider a simple random sample (SRS) of 1,000 people from a large
population. If X represents the number in this sample who are
Republicans, then there are 1,001 possible values of X, namely 0,1,2,3,
..., 998, 999, 1000. If p(hat) represents the possible sample proportions
of Republicans in the sample, then there are 1,001 possible values of
p(hat), namely 0/1000, 1/1000, 2/1000, ..., 998/1000, 999/1000,
1000/1000. For a given sample, we might find p(hat) = .56. For another
sample, we might find p(hat) = .52. We could choose many SRSs and
calculate p(hat) for each one. With samples this large, we would expect the
distribution of these p(hat) values to be approximately normal.
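The idea of choosing many SRSs and calculating p(hat) for each can be tried out with a small simulation. The sketch below is illustrative only: the population proportion of 0.55 and the count of 500 simulated samples are assumptions, not values from the text, and each person is drawn independently, which is a reasonable stand-in for an SRS only when the population is much larger than the sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

P = 0.55           # hypothetical population proportion (an assumption)
N = 1000           # sample size, as in the text
NUM_SAMPLES = 500  # number of simulated samples (arbitrary choice)


def sample_proportion(p, n):
    """Return p-hat for one simulated sample of size n."""
    count = sum(1 for _ in range(n) if random.random() < p)
    return count / n


# One p-hat value per simulated sample.
p_hats = [sample_proportion(P, N) for _ in range(NUM_SAMPLES)]

print("mean of p-hat values:   ", round(statistics.mean(p_hats), 4))
print("std dev of p-hat values:", round(statistics.stdev(p_hats), 4))
```

A histogram of `p_hats` should look roughly bell-shaped and centered near the assumed proportion.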
If we choose an SRS of size n from a large population with population
proportion p having some characteristic of interest, and if p(hat) is the
proportion of the sample having that characteristic, then:
-the sampling distribution of p(hat) is approximately normal.
-the mean of the sampling distribution is p (the population parameter).
-the standard deviation of the sampling distribution is sqrt[p(1-p)/n].
It is reasonable to use the above statements when
-the population is at least 10 times as large as the sample (Rule of Thumb 1).
-np is at least 10 and n(1-p) is at least 10 (Rule of Thumb 2).
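The two rules of thumb and the mean and standard deviation formulas can be collected into one small helper. This is a minimal sketch: the function name `sampling_distribution` and the inputs used in the call (p = 0.6, n = 1000, a population of 20,000) are illustrative choices.

```python
import math


def sampling_distribution(p, n, population_size):
    """Return (mean, std_dev, conditions_ok) for the sample proportion."""
    # Rule of Thumb 1: population at least 10 times the sample size.
    rule_1 = population_size >= 10 * n
    # Rule of Thumb 2: np and n(1-p) are both at least 10.
    rule_2 = n * p >= 10 and n * (1 - p) >= 10
    mean = p
    std_dev = math.sqrt(p * (1 - p) / n)
    return mean, std_dev, rule_1 and rule_2


mean, std_dev, ok = sampling_distribution(p=0.6, n=1000, population_size=20000)
print("mean:", mean)
print("standard deviation:", round(std_dev, 4))
print("both rules of thumb satisfied:", ok)
```

If `ok` comes back `False`, the normal approximation should not be relied on for those inputs.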
Example:
Suppose it is known that 60% of the registered voters in a district of
over 20,000 people are Republicans. If you choose an SRS of 1000
registered voters,
(a) what is the probability that the proportion of Republicans
in the sample is between 58% and 62%?
(b) what is the probability that the sample will contain no
more than 550 Republicans?
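First, note that both rules of thumb are satisfied: the population
of over 20,000 is more than 10 times the sample size of 1000, and
np = 600 and n(1-p) = 400 are both at least 10. The sample
proportion p(hat) has mean = .6 and standard deviation =
sqrt[(.6)(.4)/1000] = .0155.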
Response to (a): Using the TI-83,
normalcdf(.58,.62,.60,.0155) = .8031, or 80.31%
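For readers without a TI-83, the calculation in (a) can be sketched in Python with the standard library's `statistics.NormalDist`. The variable names are illustrative, and the standard deviation is the rounded value .0155 used in the text.

```python
from statistics import NormalDist

# Approximate sampling distribution of p-hat: N(.60, .0155).
p_hat_dist = NormalDist(mu=0.60, sigma=0.0155)

# P(.58 <= p-hat <= .62) is the area under the curve between the bounds.
prob_between = p_hat_dist.cdf(0.62) - p_hat_dist.cdf(0.58)
print(round(prob_between, 4))
```

Subtracting the two cumulative probabilities plays the same role as giving normalcdf a lower and an upper bound.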
Response to (b): 550/1000 = 0.55. The probability of a
sample containing at most 55% Republicans is
normalcdf(-1E99,.55,.60,.0155) = 0.000628, or about 0.0628%.
Things we might note: the z-score for .55 is
z = (.55 - .60)/.0155 = -3.226. That is, a
proportion of .55 is more than 3 standard deviations below the mean.
This represents a rather rare value in a normal distribution N(.60,
.0155). Also, 0.000628 is approximately 1/1592. In other words, if we
had around 1,600 random samples of size 1000, we would "expect" only
one of them to have 550 or fewer Republicans.
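The figures in (b) can be checked the same way. The sketch below again assumes Python's `statistics.NormalDist` and the rounded standard deviation .0155. It computes the left-tail probability, the z-score, and the "one in how many samples" figure.

```python
from statistics import NormalDist

MU = 0.60      # mean of the sampling distribution
SIGMA = 0.0155 # standard deviation, rounded as in the text

p_hat_dist = NormalDist(mu=MU, sigma=SIGMA)

# Left-tail probability: P(p-hat <= .55).
tail_prob = p_hat_dist.cdf(0.55)

# Number of standard deviations that .55 lies from the mean.
z = (0.55 - MU) / SIGMA

# Roughly how many samples we would need to expect one this extreme.
one_in = 1 / tail_prob

print("P(p-hat <= .55):", round(tail_prob, 6))
print("z-score:", round(z, 3))
print("about 1 in", round(one_in))
```

Here `cdf(0.55)` gives the entire area to the left of .55, so no stand-in lower bound such as -1E99 is needed.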