Sampling Distribution

Section 9.1-9.2
Sampling Distributions
and Sample Proportions
Sampling Distribution






Sampling Distribution ~ distribution of values taken by
the statistic in all possible samples of the same size
from the same population
A sample distribution (the values in one sample) is DIFFERENT from the
sampling distribution (the values of a statistic over all possible samples).
Describing the sampling distribution: shape, center,
spread, outliers
Parameter ~ describes population
Statistic ~ describes sample
Unbiased Statistic ~ mean of sampling distribution is
equal to the true value of the parameter being
estimated
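The definitions above can be illustrated with a small simulation. This is only a sketch: it assumes a very large population in which a proportion p = 0.15 has the characteristic, so each draw is treated as independent, and the names `simulate_p_hat`, `num_samples`, and `seed` are made up for this illustration. It takes many samples of the same size, records the sample proportion from each, and looks at the center and spread of those values.

```python
import random
import statistics

def simulate_p_hat(p, n, num_samples, seed=0):
    """Return the sample proportion from each of num_samples samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Count "successes" in one sample of n independent draws
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats

# Assumed values for illustration only
p_hats = simulate_p_hat(p=0.15, n=100, num_samples=2000)
print("center (mean of the p-hat values):", statistics.mean(p_hats))
print("spread (std dev of the p-hat values):", statistics.pstdev(p_hats))
```

If the statistic is unbiased, the printed center should land very close to the parameter 0.15, even though individual samples miss it.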
WHAT MAKES A STATISTIC A POOR
ESTIMATOR OF A PARAMETER?

HIGH VARIABILITY
– Small samples lead to a larger spread in the sampling
distribution of the statistic, giving less certainty about the
true value of the parameter.
HIGH BIAS
– Poor sampling methods create unrepresentative samples, so
that the center of the sampling distribution is not equal to
the true value of the parameter.
 WHY?

And what does that look like?
HOW DO WE AVOID HIGH BIAS???

USE APPROPRIATE SAMPLING
PROCEDURES THAT WE LEARNED IN
PREVIOUS CHAPTERS!!!
HOW DO WE AVOID HIGH VARIABILITY???


First, understand that sampling variability means that
the value of a statistic varies in repeated
random sampling.
So, to avoid high variability of a statistic, which is
described by the spread of its sampling distribution,
use larger samples for smaller spread. As long as
the population is at least 10 times larger than the
sample, the spread of the sampling distribution is
approximately the same for any population size.
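A quick way to see "larger samples, smaller spread" is to simulate it. The sketch below assumes a very large population with p = 0.5 (an arbitrary choice) and compares the spread of the sample proportion for three sample sizes. The function name `spread_of_p_hat` and its parameters are hypothetical.

```python
import random
import statistics

def spread_of_p_hat(p, n, num_samples=1000, seed=1):
    """Estimate the spread (std dev) of the sample proportion for samples of size n."""
    rng = random.Random(seed)
    p_hats = [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]
    return statistics.pstdev(p_hats)

# Assumed sample sizes, each 4 times the previous one
spreads = {n: spread_of_p_hat(0.5, n) for n in (25, 100, 400)}
for n, s in spreads.items():
    print(f"n = {n:3d}  spread of p-hat = {s:.4f}")
```

Each time the sample size is multiplied by 4, the simulated spread should drop to roughly half of its previous value.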
WHY DOES THE POPULATION SIZE
NOT REALLY MATTER MUCH???




Even more, why does a sample of size 260 serve a
population of 2600 just as well as a population of 26,000?
If the population is small, then outliers are going to have a
greater impact on the sampling process by creating greater
variability in the sampling distribution.
The size of the sample is what impacts the sampling
variability so a statistic from a sample of 260 Walton students
is just as precise as a statistic from a sample of 260 from all
East Cobb high school students. Of course, this is assuming
one important fact. Which is?
THE SAMPLES MUST BE RANDOM!
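The 260-out-of-2600 versus 260-out-of-26,000 comparison can be checked by simulation. This is a sketch under assumptions: half of each population is given the characteristic (p = 0.5, chosen arbitrarily), samples are SRSs drawn without replacement, and `spread_for_population` is a made-up helper name.

```python
import random
import statistics

def spread_for_population(pop_size, p, n, num_samples=1000, seed=2):
    """Spread of the sample proportion for SRSs of size n from a finite population."""
    rng = random.Random(seed)
    num_success = round(pop_size * p)
    # 1 = has the characteristic, 0 = does not
    population = [1] * num_success + [0] * (pop_size - num_success)
    p_hats = [
        sum(rng.sample(population, n)) / n  # rng.sample draws without replacement
        for _ in range(num_samples)
    ]
    return statistics.pstdev(p_hats)

small = spread_for_population(2_600, 0.5, 260)
large = spread_for_population(26_000, 0.5, 260)
print(f"population  2,600: spread = {small:.4f}")
print(f"population 26,000: spread = {large:.4f}")
```

The two spreads should come out nearly the same, because the sample size is 260 in both cases and both populations are at least 10 times as large as the sample.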
SECTION 9.2
Sample Proportions

The sample proportion pˆ is a statistic:
pˆ = (# of successes) / (total sample size)

Sampling distribution of a sample proportion:
choose an SRS of size n from a large population
with population proportion p having some
characteristic of interest. Let pˆ be the proportion of
the sample having that characteristic. Then:
1) sampling distribution of pˆ is approximately normal
and is closer to a normal distribution when the
sample size n is large
2) the mean of the sampling distribution is exactly p
3) the standard deviation of the sampling distribution
is √(p(1 − p)/n)
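Facts 2) and 3) translate directly into two small functions. This is a minimal sketch; the function names are made up, and the values p = 0.5 and n = 100 are chosen only because they give round numbers.

```python
import math

def mean_of_p_hat(p):
    """Mean of the sampling distribution of the sample proportion."""
    return p

def sd_of_p_hat(p, n):
    """Standard deviation of the sampling distribution: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Assumed illustrative values
print(mean_of_p_hat(0.5))
print(sd_of_p_hat(0.5, 100))
```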
Rules of Thumb
1) Use the recipe for standard deviation of pˆ
only when the population is at least 10
times as large as the sample
2) We will use the normal approximation to the
sampling distribution of pˆ for values of n
and p that satisfy: np ≥ 10 and n(1 − p) ≥ 10
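Both rules of thumb are simple checks that can be written as functions. The sketch below uses hypothetical function names, and the population size of 200,000,000 is an assumed stand-in for "a very large population", not a figure from the text.

```python
def can_use_sd_formula(population_size, n):
    """Rule 1: the population is at least 10 times as large as the sample."""
    return population_size >= 10 * n

def can_use_normal_approx(n, p):
    """Rule 2: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

# Assumed illustrative inputs
print(can_use_sd_formula(population_size=200_000_000, n=1540))
print(can_use_normal_approx(n=1540, p=0.15))
print(can_use_normal_approx(n=20, p=0.10))  # np = 2, too small
```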
Standard Deviation Behavior

What will make the size of the standard
deviation, √(p(1 − p)/n), change?



If the sample size goes up the standard
deviation goes down. If the sample size
goes down, standard deviation goes up.
How would we cut the standard deviation in
half?
Increase the sample size by multiplying by 4.
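The reason the factor is 4 and not 2 is that n sits under a square root: the standard deviation is proportional to 1/√n. A short sketch of that arithmetic follows, with made-up function names and arbitrary illustrative values n = 400 and p = 0.3.

```python
import math

def sd_of_p_hat(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def sample_size_for_scaled_sd(n, factor):
    """Sample size needed to multiply the standard deviation by `factor`."""
    # SD is proportional to 1/sqrt(n), so n must be divided by factor squared
    return n / factor ** 2

n = 400                                   # assumed starting sample size
new_n = sample_size_for_scaled_sd(n, 0.5)  # halve the standard deviation
ratio = sd_of_p_hat(0.3, new_n) / sd_of_p_hat(0.3, n)
print(new_n, ratio)
```

The same reasoning shows that cutting the standard deviation to one-third would take 9 times the sample size.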
EXAMPLE (pg. 477 # 9.15)

The Gallup Poll once asked a random sample of 1540
adults, “Do you happen to jog?” Suppose that in fact
15% of all adults jog.

a) Find the mean and standard deviation of the
proportion pˆ of the sample who jog. (Assume the
sample is an SRS.)
μ = p = 0.15 ; σ = √((0.15)(0.85)/1540) ≈ 0.0091
b) Explain why you can use the formula for the
standard deviation of pˆ in this setting.
The population (assumed to be all US adults) is certainly
at least 10 times as large as the sample (10 × 1540 = 15,400).
c) Check that you can use normal approximation for the
distribution of pˆ .




EXAMPLE (pg. 477 # 9.15) (cont’)

c) (answer)
np = 231, n(1-p) = 1309 ; these are both ≥ 10

d) Find the probability that between 13% and
17% of the sample jog.
normalcdf(0.13,0.17,0.15,√(0.15*0.85/1540))
≈0.9721
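For readers without a calculator that has `normalcdf`, the same probability can be sketched in Python using the standard library's `math.erf`. The helper names `normal_cdf` and `normal_between` are made up for this illustration. The inputs are the ones from the example.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def normal_between(lower, upper, mean, sd):
    """P(lower <= X <= upper), the same quantity normalcdf reports."""
    return normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)

p, n = 0.15, 1540
sd = math.sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
prob = normal_between(0.13, 0.17, p, sd)
print(f"sd = {sd:.4f}, P(0.13 <= p-hat <= 0.17) = {prob:.4f}")
```

The result should agree with the calculator answer above to about four decimal places.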
e) What sample size would be required to
reduce the standard deviation of the sample
proportion to one-half the value found in a)?
1540 × 4 = 6160 (quadrupling n halves the standard deviation)


