Sampling Distributions

Chapter 7: Sampling Distributions
7.1 Generating Sampling Distributions
Objectives
• To understand what a sampling distribution is.
• To generate exact and simulated sampling distributions.
• To know the shape, centers, and spreads of typical sampling distributions.
• To understand the properties of point estimators.
Understanding the concept of a sampling distribution is very important as it underpins the entire study of
inference in the remaining chapters.
Reminder:
A parameter is a number that describes the population. In statistical practice, the value of a parameter is
not known because we cannot normally examine the entire population.
e.g. μ, σ, ρ, etc. (Greek letters)
A statistic is a number that can be computed from sample data, without using any of the unknown
parameters.
e.g. x̄, s, r, etc. (English letters)
The sampling distribution of a statistic is the distribution of the values that statistic takes across
all possible random samples of the same size from the population.
It is also important to understand the difference between the exact sampling distribution of a
summary statistic and a simulated sampling distribution.
For discrete populations the sampling distribution is the ideal pattern that would emerge if you looked at
all possible samples of size n from a population. You can sometimes create an exact sampling distribution
by listing all possible samples. However, because the number of possible samples can be extremely large,
we are rarely able to list them to construct the exact sampling distribution. Instead we rely on theory or
simulation to generate an approximate sampling distribution. You can generate a simulated sampling
distribution of any statistic by following these steps:
1. Take a random sample of size n from a population.
2. Compute a summary statistic.
3. Repeat steps 1 and 2 many times.
4. Display the distribution of the summary statistics.
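The four steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the population values, the function name simulate_sampling_distribution, the number of repetitions, and the choice of the sample mean as the summary statistic are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter


def simulate_sampling_distribution(population, n, statistic,
                                   repetitions=10_000, seed=0):
    """Return one value of `statistic` for each of `repetitions` random samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = []
    for _ in range(repetitions):            # step 3: repeat many times
        sample = rng.sample(population, n)  # step 1: random sample of size n
        values.append(statistic(sample))    # step 2: compute the statistic
    return values


# Hypothetical population, chosen only for illustration.
population = [2, 4, 6, 8, 10, 12]
means = simulate_sampling_distribution(population, n=2,
                                       statistic=statistics.mean)

# Step 4: display the distribution as a rough text histogram
# (one '#' per 100 simulated samples).
counts = Counter(means)
for value in sorted(counts):
    print(f"{value:5.1f} {'#' * (counts[value] // 100)}")
```

Because the samples are random, the histogram only approximates the exact sampling distribution; more repetitions generally bring the two closer together.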
Note: Both exact and simulated sampling distributions have appeared on the AP Exam, so you should
have experience constructing both.
Properties of Point Estimators
Sampling distributions are the connecting link between the collection of data (through sampling or
experiments) and statistical inference (the process of drawing conclusions from the data).
A point estimate is a single value (summary statistic) calculated from sample data. It serves as the “best
guess” for the unknown population parameter.
There are two properties that you would like the summary statistic to have:
• It should be unbiased. This means:
mean of the sampling distribution = parameter being estimated
Or in other words, a summary statistic is a biased estimator of a population parameter if it gives
results that are too large or too small on average. This idea is exactly parallel to the idea of bias in
a sample survey. The method of sampling is biased if, on average, the method produces a
summary statistic that is too small or too large.
Note: bias is a property of the method, not of an individual sample.
• It should have as little variability as possible and should have a standard error that decreases as
the sample size increases.
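Both properties can be checked on a small case by listing every possible sample. The sketch below assumes a hypothetical six-value population and uses the sample mean as the estimator, with samples drawn without replacement; the names exact_sample_means and summarize are illustrative only.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical population, chosen only for illustration.
population = [2, 4, 6, 8, 10, 12]


def exact_sample_means(population, n):
    """Sample mean of every possible sample of size n (without replacement)."""
    return [mean(sample) for sample in combinations(population, n)]


def summarize(population, n):
    """Center and spread of the exact sampling distribution of the mean."""
    means = exact_sample_means(population, n)
    return mean(means), pstdev(means)


results = {n: summarize(population, n) for n in (2, 3, 4)}

print("population mean:", mean(population))
for n, (center, spread) in results.items():
    print(f"n={n}: mean of sample means={center:.3f}, "
          f"SD of sample means={spread:.3f}")
```

For each sample size, the mean of the sample means matches the population mean (the estimator is unbiased), and the standard deviation of the sample means shrinks as n grows.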
Example: P4 page 419
1. Every year, Forbes magazine releases a list of the top-earning dead celebrities. In 2005, the top six
and their yearly earnings were
a. Elvis Presley, $45 million
b. Charles M. Schulz, $35 million
c. John Lennon, $22 million
d. Andy Warhol, $16 million
e. Theodor “Dr. Seuss” Geisel, $10 million
f. Marlon Brando, $9 million
Your talent agency gets an opportunity to represent two of these dead celebrities, to be selected at
random. You will be paid 10% of their earnings.
(a) What is the most you could be paid? The least?
(b) Construct the sampling distribution of your total possible earnings.
(c) What is the probability that you will be paid $3 million or more?
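One way to check your answers is to enumerate every possible pair of celebrities. The sketch below assumes that the two celebrities are chosen without replacement, that every pair is equally likely, and that the payment is 10% of their combined earnings. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Yearly earnings in millions of dollars, taken from the list in the example.
earnings = {
    "Elvis Presley": 45,
    "Charles M. Schulz": 35,
    "John Lennon": 22,
    "Andy Warhol": 16,
    "Theodor Geisel": 10,
    "Marlon Brando": 9,
}

# Every unordered pair of celebrities (assumed equally likely).
pairs = list(combinations(earnings, 2))

# Payment for each pair: 10% of combined earnings, kept as exact fractions.
payments = [Fraction(earnings[a] + earnings[b], 10) for a, b in pairs]

# Exact sampling distribution: each distinct payment with its probability.
distribution = Counter(payments)
for payment in sorted(distribution):
    print(f"${float(payment):.1f} million: "
          f"probability {distribution[payment]}/{len(pairs)}")

most, least = max(payments), min(payments)
prob_at_least_3 = Fraction(sum(p >= 3 for p in payments), len(payments))
print("most:", float(most), "least:", float(least),
      "P(payment >= 3):", prob_at_least_3)
```

Under these assumptions there are 15 equally likely pairs, the largest payment is $8.0 million, the smallest is $1.9 million, and 12 of the 15 pairs pay $3 million or more, giving a probability of 4/5.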