p.p chapter 7.1

Sampling Distributions
What is a sampling distribution?
Section 7.1
Reference Text:
The Practice of Statistics, Fourth Edition.
Starnes, Yates, Moore
Objectives
1. Parameter and Statistic
   1. Let's define some variables
2. Sampling variability
   1. Three types of distributions to tell apart
   2. Wording! Be careful: common AP errors
3. Describing sampling distributions
   1. Unbiased estimators
4. Why the size of our sample matters
5. Variability and Bias
What Is a Sampling Distribution?
• As we begin to use sample data to draw
conclusions about a wider population, we
must be clear about whether a number
describes a sample or a population:
Statistic…..Sample
Parameter…Population
Parameter & Statistic
• A Parameter is a number that describes some
characteristic of the population. In statistical
practice, the value of a parameter is usually not
known because we cannot examine the entire
population.
• Makes sense! That’s why we take samples!
• A Statistic is a number that describes some
characteristic of a sample. The value of a statistic
can be computed directly from the sample data. We
often use a statistic to estimate an unknown
parameter.
Statistic…..Sample
Parameter…Population
Variables To Distinguish From!
Example
Check Your Understanding
• Each boldface number in Questions 1 and 2 is the value of either a
parameter or a statistic. State which is which, using correct
vocabulary.
1. On Tuesday, the bottles of Arizona Iced Tea filled in a plant
were supposed to contain an average of 20 ounces of
iced tea. Quality control inspectors sampled 50 bottles at
random from the day’s production. These bottles
contained an average of 19.6 ounces of iced tea.
2. On a New York-to-Denver flight, 8% of the 125
passengers were selected for random security screening
before boarding. According to the Transportation Security
Administration, 10% of passengers at this airport are
chosen for random screening.
Sampling Variability
Sampling Distribution
• The sampling distribution of a statistic is the distribution
of values taken by the statistic in all possible
samples of the same size from the same population.
Q: “What would happen if we took many samples?”
A:
- Take a large number of samples from the same population
- Calculate the statistic for each sample
- Make a graph of the values of the statistic
- Examine the distribution displayed in the graph for shape,
center, and spread, as well as outliers (Hey look! C.U.S.S.)
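The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the population of 10,000 bottle fill volumes is invented (mean near 20 ounces, standard deviation 0.5 ounces), the helper names `sampling_distribution` and `text_histogram` are hypothetical, and a crude text histogram stands in for a real graph. Note that 1000 samples only approximate the sampling distribution, which is defined over all possible samples.

```python
import random
import statistics

def sampling_distribution(population, n, num_samples, statistic, seed=0):
    """Compute `statistic` on each of `num_samples` SRSs of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(num_samples):
        sample = rng.sample(population, n)  # one SRS, drawn without replacement
        values.append(statistic(sample))
    return values

def text_histogram(values, bin_width):
    """Print a crude histogram: one row per bin, one '*' per 10 values."""
    counts = {}
    for v in values:
        bin_start = round(v // bin_width * bin_width, 10)
        counts[bin_start] = counts.get(bin_start, 0) + 1
    for bin_start in sorted(counts):
        print(f"{bin_start:6.2f} | " + "*" * (counts[bin_start] // 10))

# Assumption: a made-up population of 10,000 bottle fill volumes (ounces).
pop_rng = random.Random(1)
population = [pop_rng.gauss(20, 0.5) for _ in range(10_000)]

# Steps 1 and 2: take many samples of size 50 and compute the mean of each.
sample_means = sampling_distribution(population, n=50, num_samples=1000,
                                     statistic=statistics.mean)

# Steps 3 and 4: graph the values, then look at center and spread.
text_histogram(sample_means, bin_width=0.05)
print("population mean:       ", round(statistics.mean(population), 3))
print("center of sample means:", round(statistics.mean(sample_means), 3))
print("spread of sample means:", round(statistics.stdev(sample_means), 3))
```

The histogram here is the distribution of sample means, not the distribution of a single sample or of the population.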
C.U.S.S
• Note: small change when talking about sampling
distributions
• C: Center. This is your mean.
• U: Unusual points. Any notable outliers?
• S: Spread. This is your standard deviation.
• S: Shape. Is it symmetrical? Single peak?
Activity!
Sampling Distribution
3 Types of Distribution
Warning! Errors in AP
• AP Exam Tip: Terminology matters! Don’t say
“sample distribution” when you mean sampling
distribution. You will lose credit on free response
questions for misusing statistical terms.
• Yep, just the “ing” really can be the difference between full
credit and docked points…
• On the AP exam, a common error is to write an
ambiguous statement such as “the variability
decreases when the sample size increases”.
Whenever students are describing a distribution,
they should always say, “the distribution of ____” to
avoid ambiguity.
Check Your Understanding
• Mars, Inc., says that the mix of colors in its M&M’s Milk Chocolate Candies is
24% blue, 20% orange, 16% green, 14% yellow, 13% red, and 13% brown.
Assume that the company’s claim is true. We want to examine the proportion
of orange M&M’s in repeated random samples of 50 candies.
1) Identify the individuals, the variable, and the parameter of
interest.
2) Which of the graphs that follow could be the approximate
sampling distribution of the statistic? Explain your choice.
Unbiased Estimator
Note:
• “Unbiased” doesn’t mean perfect. An unbiased
estimator will almost always provide an estimate that
is not equal to the value of the population
parameter. It is called “unbiased” because in
repeated samples, the estimates won’t consistently
be too high or consistently too low.
• We also assume that the sampling process we are
using has no bias: no sampling or nonsampling
errors are present, just sampling variability. If such
errors did exist, the estimates would tend to be
consistently too low or too high.
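A short simulation can make the note above concrete. It is a sketch, not a proof: the population of 5,000 values is invented, and the sample size (25), number of samples (2000), and seed are arbitrary assumptions. It uses the sample mean as the estimator and counts how often an estimate misses the population mean on each side.

```python
import random
import statistics

rng = random.Random(42)

# Assumption: a made-up population of 5,000 values, for illustration only.
population = [rng.gauss(170, 10) for _ in range(5000)]
# The parameter is known here only because we built the population ourselves.
true_mean = statistics.mean(population)

num_samples = 2000
# Each estimate is the mean of one SRS of size 25.
estimates = [statistics.mean(rng.sample(population, 25))
             for _ in range(num_samples)]

exact_hits = sum(est == true_mean for est in estimates)
too_high = sum(est > true_mean for est in estimates)
too_low = sum(est < true_mean for est in estimates)

print("population mean:  ", round(true_mean, 2))
print("mean of estimates:", round(statistics.mean(estimates), 2))
print("exact hits:", exact_hits, " too high:", too_high, " too low:", too_low)
```

Individual estimates miss the parameter, but the misses fall on both sides in roughly equal numbers, so the estimates as a group are centered near the parameter.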
Why Size Matters
• Television executives and companies who
advertise on TV are interested in how many
viewers watch particular shows. According to
Nielsen ratings, Survivor was one of the most-watched
television shows in the United States
during every week that it aired. Suppose that
the true proportion of U.S. adults who have
watched Survivor is p = 0.37.
Why Size Matters
• Figure 7.7(a) shows the results of drawing 1000 SRSs of size
n = 100 from a population with p = 0.37.
• Figure 7.7(b) shows the results of drawing 1000 SRSs of size
n = 1000 from a population with p = 0.37.
• Both graphs are drawn on the same horizontal scale to make comparison
easier:
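A comparison of this kind can be approximated with a simulation. This is a rough stand-in, not a reproduction of Figure 7.7: it assumes the population is large enough that each sampled adult can be treated as an independent draw with probability p = 0.37, and the helper name `simulate_p_hats` and the seed are arbitrary.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, rng):
    """Sample proportion from each of num_samples random samples of size n."""
    p_hats = []
    for _ in range(num_samples):
        # Assumption: independent draws, i.e. a very large population.
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats

rng = random.Random(7)
p = 0.37  # true proportion, from the Survivor example

results = {}
for n in (100, 1000):
    p_hats = simulate_p_hats(p, n, 1000, rng)
    results[n] = (statistics.mean(p_hats), statistics.stdev(p_hats))
    print(f"n = {n:4d}: center = {results[n][0]:.3f}, spread = {results[n][1]:.4f}")
```

Both sets of sample proportions are centered near 0.37, but the values from samples of size 1000 are packed much more tightly around it.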
What Does Size Do in Statistics?
• There is a clear advantage to larger samples.
They are much more likely to produce an
estimate close to the true value of the
parameter.
• Said another way, larger random samples
give us more precise estimates than smaller
random samples.
• This leads us to the Variability of a Statistic
Variability Of a Statistic
• The variability of a statistic is described by
the spread of its sampling distribution. This
spread is determined primarily by the size of
the random sample. Larger samples give
smaller spread. The spread of the sampling
distribution does not depend on the size of the
population, as long as the population is at least
10 times as large as the sample.
• Taking a larger sample doesn’t fix bias. Remember that
even a very large voluntary response sample or
convenience sample is worthless because of bias.
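The claim about population size can be checked with a simulation. The sketch below rests on assumptions: the three population sizes, the seed, and the helper name `spread_of_p_hat` are all made up for illustration, and individuals are represented only by index numbers. Each SRS of size 100 is drawn without replacement, and the smallest population is exactly 10 times the sample size.

```python
import random
import statistics

def spread_of_p_hat(pop_size, p, n, num_samples, rng):
    """SD of the sample proportion over repeated SRSs from a finite population."""
    # Individuals are numbered 0..pop_size-1; the first num_successes are "successes".
    num_successes = round(p * pop_size)
    p_hats = []
    for _ in range(num_samples):
        srs = rng.sample(range(pop_size), n)  # SRS without replacement
        p_hats.append(sum(i < num_successes for i in srs) / n)
    return statistics.stdev(p_hats)

rng = random.Random(3)
# Assumption: hypothetical population sizes, all with p = 0.37 and n = 100.
spreads = {N: spread_of_p_hat(N, 0.37, 100, 1000, rng)
           for N in (1_000, 100_000, 10_000_000)}

for N, spread in spreads.items():
    print(f"population size {N:>10,}: spread of p-hat = {spread:.4f}")
```

The spreads come out close to one another even though the population sizes differ by a factor of 10,000. The sample size is what drives the spread.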
Check for Understanding
The histogram above left shows the intervals (in minutes) between
eruptions of the Old Faithful geyser for all 222 recorded eruptions
during a particular month. For this population, the median is 75
minutes. We used Fathom software to take 500 SRSs of size 10
from the population. The 500 values of the sample median are
displayed in the histogram above right. The mean of the 500
sample median values is 73.5.
Check Your Understanding
1) Is the sample median an unbiased estimator of the population
median? Justify your answer.
2) Suppose we had taken samples of size 20 instead of size 10.
Would the spread of the sampling distribution be larger, smaller,
or about the same?
3) Describe the shape of the sampling distribution. Explain what it
means in terms of overestimating or underestimating the
population median.
Bias, Variability, and Shape
• We can think of the true value of the population parameter as the
bull’s-eye on the target and of the sample statistic as an arrow
fired at the target.
Bias, Variability, and Shape
• Ideally, we’d like our estimates to be
accurate (unbiased) and precise (have low
variability).
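The target picture can be put into numbers by measuring bias (how far the center of the estimates sits from the true value) and variability (their spread) for four sampling setups. This is a sketch under loud assumptions: the "voluntary response" method is hypothetical and simply supposes that viewers are twice as likely to respond as non-viewers. The sample sizes, seed, and helper name `simulate_bias_and_spread` are also invented.

```python
import random
import statistics

TRUE_P = 0.37  # true proportion, from the Survivor example

def simulate_bias_and_spread(p_in_sample, n, num_samples, rng):
    """Return (bias, spread) of the sample proportion over repeated samples."""
    p_hats = []
    for _ in range(num_samples):
        successes = sum(rng.random() < p_in_sample for _ in range(n))
        p_hats.append(successes / n)
    bias = statistics.mean(p_hats) - TRUE_P  # center of estimates minus truth
    spread = statistics.stdev(p_hats)        # variability of the estimates
    return bias, spread

# Assumption: a hypothetical biased method in which viewers are twice as
# likely to respond, so viewers are overrepresented among respondents.
p_voluntary = 2 * TRUE_P / (2 * TRUE_P + (1 - TRUE_P))

rng = random.Random(11)
summary = {
    "SRS, n=1000": simulate_bias_and_spread(TRUE_P, 1000, 500, rng),
    "SRS, n=50": simulate_bias_and_spread(TRUE_P, 50, 500, rng),
    "voluntary, n=1000": simulate_bias_and_spread(p_voluntary, 1000, 500, rng),
    "voluntary, n=50": simulate_bias_and_spread(p_voluntary, 50, 500, rng),
}

for label, (bias, spread) in summary.items():
    print(f"{label:18s} bias = {bias:+.3f}   spread = {spread:.3f}")
```

In this setup the large voluntary response sample is precise but lands in the wrong place. Increasing n shrinks the spread and leaves the bias untouched.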
Homework
Worksheet