Sampling Distributions

October 1st
___________________________________________
1) Revisit the difference between a statistic and a
parameter.
2) Discuss factors that determine whether an estimate
of a parameter is ‘good’ or ‘bad’.
3) Define a ‘sampling distribution’ and discuss its
properties.
4) Answer the following burning question: Why do
we take relatively large samples of data?
How can I estimate the number of siblings that
people in this class have?
___________________________________________
Take a sample and calculate:
a) mean
b) median
c) mode
d) (High Score + Low Score) / 2
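A minimal sketch of the four candidate statistics, using Python's standard library. The sample values are made up for illustration and are not class data; the name `midrange` for option (d) is a label added here.

```python
from statistics import mean, median, multimode

# Hypothetical sample of sibling counts (made-up numbers, not class data).
sample = [0, 1, 1, 2, 2, 2, 3, 5]

estimates = {
    "mean": mean(sample),
    "median": median(sample),
    "mode": multimode(sample),  # a list, because a sample can have tied modes
    "midrange": (max(sample) + min(sample)) / 2,  # (High Score + Low Score) / 2
}

for name, value in estimates.items():
    print(f"{name}: {value}")
```

The four statistics give different answers for the same sample, which is why we need a way to decide which one is best.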
How do I know which of these options is the best?
1) Working with a known population
Take a sample from a population with
known parameters and
2) Repeated Samples method
Take population with known parameters and
see how the distributions of the different
What's a Sampling Distribution?
___________________________________________
Sampling Distribution - probability distribution
calculated from
We are going to model sampling distributions as normal distributions.
Why is this appropriate?
B/C mean does not have to be
What does this buy us?
We know how to calculate the area under the
normal curve for continuous RVs.
How are we going to do this?
Patience, my child. All will be revealed.
Constructing a sampling distribution
___________________________________________


Population scores: 3, 5, 7, 9, 11
μ = 7
σ = 2.8
How many unique samples could we draw from this
population (without replacement) if n = 2?
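The question above can be checked by brute force. This is a sketch that assumes the population on this slide is the five scores 3, 5, 7, 9, 11 (a reading consistent with a mean of 7 and a standard deviation of about 2.8) and that order within a sample does not matter.

```python
from itertools import combinations
from math import comb
from statistics import mean, pstdev

population = [3, 5, 7, 9, 11]  # assumed reading of the population on this slide
n = 2

# Every unordered sample of size n, drawn without replacement.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

print(len(samples), comb(len(population), n))    # number of unique samples, two ways
print(mean(population), pstdev(population))      # population mean and SD
print(mean(sample_means), pstdev(sample_means))  # mean and SD of the sampling distribution
```

Under these assumptions there are 10 unique samples. The sample means average to 7, the same as the population mean, and their standard deviation (about 1.73) is smaller than the population's (about 2.83).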
Important things about this example
___________________________________________
1) A sampling distribution can be constructed by
2) This information can be used to determine how
3) Note in this case that the mean of the sampling
distribution was equal to the mean of the
population, and that the standard deviation of the
sampling distribution was smaller than the standard
deviation of the population.
4) Still haven’t told you what makes for a good
statistic.
Properties of a good estimator
________________________________________
Point Estimator - rule or formula that tells us how to
use the sample data to calculate a single
A good point estimator (statistic) is:
(a) unbiased
(b) minimum variability
Can we control bias?
What if the mean of the sampling distribution is
too high/low?
Can we control variability?
a) Choose
b) Choose
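Bias and variability can be seen with the repeated-samples idea: draw many samples from a population with known parameters and watch how each statistic behaves. The sketch below is a simulation under stated assumptions, not the method used in class. The population, the helper names `midrange` and `sampling_distribution`, the sample sizes, and the number of repetitions are all hypothetical, and samples are drawn with replacement for simplicity.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of sibling counts with known parameters.
population = [0] * 20 + [1] * 35 + [2] * 25 + [3] * 12 + [4] * 5 + [6] * 3
mu = mean(population)

def midrange(xs):
    """(High Score + Low Score) / 2."""
    return (max(xs) + min(xs)) / 2

def sampling_distribution(statistic, n, reps=5000):
    """Approximate a sampling distribution from `reps` random samples of size n."""
    return [statistic(rng.choices(population, k=n)) for _ in range(reps)]

for stat in (mean, midrange):
    for n in (10, 40):
        values = sampling_distribution(stat, n)
        bias = mean(values) - mu   # how far the statistic is off on average
        spread = pstdev(values)    # how much it varies from sample to sample
        print(f"{stat.__name__:9s} n={n:2d} bias={bias:+.2f} spread={spread:.2f}")
```

In a run like this the sample mean should center on μ with a spread that shrinks as n grows, while the midrange sits consistently above μ for this skewed population.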
So, you want to construct a sampling distribution…
________________________________________
Not so fast, Skippy. Can you envision a problem that
might prevent you from constructing a sampling
distribution?
Let’s construct a sampling distribution for n=5 for
this class:
a) How many observations would be in the
sampling distribution?
b) What about samples of 20 at AC?
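A quick count shows the problem. The class size and college enrollment below are hypothetical placeholders, since neither number appears on the slide; the count assumes sampling without replacement where order does not matter.

```python
from math import comb

# Hypothetical sizes: neither enrollment figure is given on the slide.
class_size = 40
college_size = 1600

# Distinct samples of 5 students from the class.
print(comb(class_size, 5))     # 658008

# Distinct samples of 20 students from the college: a number with dozens of digits.
print(comb(college_size, 20))
```

Even for a small class, listing every possible sample by hand is out of the question.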
Can computer technology save us?
Restricted samples:
Unrestricted samples:
________________________________________
Is this the end?
Is class dismissed until the final?
Is there no way to save the semester?
Our hero
________________________________________
Central Limit Theorem –
Further, the larger n gets, the more closely the
sampling distribution will approximate a normal distribution.
Finally,
a) μ_M = μ
and
b) σ_M = σ / √n
and
c) z = (M - μ_M) / σ_M
= (M - μ) / (σ / √n)
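The formulas above translate directly into two small helpers. The function names and the illustrative values (σ = 10, μ = 50) are assumptions made for this sketch.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def z_for_mean(m, mu, sigma, n):
    """z-score of a sample mean m: (m - mu) / (sigma / sqrt(n))."""
    return (m - mu) / standard_error(sigma, n)

# Illustrative values only: the standard error shrinks as n grows.
for n in (4, 25, 100):
    print(n, standard_error(10, n))   # 5.0, then 2.0, then 1.0

print(z_for_mean(52, mu=50, sigma=10, n=25))   # 1.0
```

Quadrupling the sample size only halves the standard error, because n sits under a square root.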
Coin Flipping Example
_______________________________________
The outcome of a coin flip is distributed
uniformly: 50% heads, 50% tails.
Let’s see the CLT in action:
Flip a coin once and tell me the # of heads.
Flip a coin twice and tell me the # of heads.
Flip a coin 10 times and tell me the # of heads.
Flip a coin 30 times and tell me the # of heads.
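If no coins are handy, the exact distribution of the number of heads can be computed instead. This sketch uses the binomial formula (a swap for physically flipping coins) and prints a rough text histogram; the helper name `heads_distribution` is hypothetical.

```python
from math import comb

def heads_distribution(n, p=0.5):
    """Exact probability of each possible number of heads in n fair flips."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

for n in (1, 2, 10, 30):
    probs = heads_distribution(n)
    print(f"n = {n}")
    for k, pr in enumerate(probs):
        if pr >= 0.01:  # skip outcomes too rare to draw
            print(f"  {k:2d} heads {'#' * round(pr * 100)}")
```

With one flip the picture is flat; by 30 flips the bars trace out the familiar bell shape centered on 15 heads.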
Using the CLT: Rush Example
________________________________________
You are deciding whether or not to rush  (it’s a
special Stats Honor Fraternity) and, because you are
the type of person who would rush a Stats Honor
Fraternity, you want to know what the average
intelligence level of the frat is. You ask Eric
Stratton, the Rush Chairman (he seemed real glad to
meet you) what the average GPA in the house is. He
says, “μ = 3.5 and σ = .6”. You randomly poll 36
fraternity members and find that the mean of the
sample is 3.4. What do you conclude?
P(z ≤ [M - μ] / [σ/√n])
P(z ≤ [3.4 - 3.5] / [.6/√36])
P(z ≤ [-.1 / .1])
P(z ≤ -1) = Area(Tail -1.0) = .1587
Would you alter your conclusion if the mean of the
sample was 3.2? How?
P(z ≤ [M - μ] / [σ/√n])
P(z ≤ [3.2 - 3.5] / [.6/√36])
P(z ≤ [-.3 / .1])
P(z ≤ -3) = Area(Tail -3.0) = .0013
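Both tail areas can be checked without a z table. This sketch uses `statistics.NormalDist` from the standard library; the helper name `lower_tail` is hypothetical, and the numbers are the ones from the Rush example (a claimed mean of 3.5, a standard deviation of .6, and 36 members polled).

```python
from math import sqrt
from statistics import NormalDist

def lower_tail(m, mu, sigma, n):
    """Return z and P(sample mean <= m) under the CLT's normal model."""
    z = (m - mu) / (sigma / sqrt(n))
    return z, NormalDist().cdf(z)

for m in (3.4, 3.2):
    z, p = lower_tail(m, mu=3.5, sigma=0.6, n=36)
    print(f"M = {m}: z = {z:.1f}, P = {p:.4f}")
```

A sample mean of 3.4 would turn up about 16% of the time if the Rush Chairman is telling the truth, so it is no cause for suspicion. A sample mean of 3.2 would turn up about 0.1% of the time, which makes his claim hard to believe.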
More Chips Ahoy
___________________________________________
Remember a few weeks ago, you and Biff were
trying to figure out the probability that ONE Chips
Ahoy cookie, which is supposed to have 23 chips,
could have as few as 17 chips. Let's say you rerun
the experiment, but you're smarter now, so
rather than examine 1 cookie, you collect a sample of
49 cookies (I imagine you got sick after eating the
stimuli). The mean number of chips in your sample
was 20, and the standard deviation was 17.5 chips.
Do you have just cause for a legal action against
Chips Ahoy? In other words, what is the probability
that your sample of cookies was drawn from a
population with μ = 23?
Central Limit Theorem with Proportions
________________________________________
(p̂ = sample proportion, p = population proportion)
μ_p̂ = p
σ_p̂ = √(p(1-p) / n)
z = (p̂ - μ_p̂) / σ_p̂
= (p̂ - p) / √(p(1-p) / n)
Applying the CLT with proportions: Blood Example
________________________________________
Nine percent of the U.S. population has Type B
blood. What is the probability that at least 12.5% of a
random sample of 400 people will have Type B
blood?
P(p̂ ≥ .125) = P(z ≥ [.125 – .09] / σ_p̂)
σ_p̂ = √(p(1-p) / n)
= √((.09)(.91) / 400)
= .014
P(p̂ ≥ .125) = P(z ≥ [.125 – .09] / .014)
= P(z ≥ 2.5)
= Area (Tail: 2.5) = .0062
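The same calculation can be run without rounding the intermediate steps. This is a sketch using `statistics.NormalDist`; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

p = 0.09        # population proportion with Type B blood
n = 400         # sample size
p_hat = 0.125   # sample proportion of interest

se = sqrt(p * (1 - p) / n)            # standard error of the sample proportion
z = (p_hat - p) / se
upper_tail = 1 - NormalDist().cdf(z)  # P(z >= observed z)

print(f"se = {se:.4f}, z = {z:.2f}, P = {upper_tail:.4f}")
```

Carrying full precision gives a z just under 2.5 and a tail area near .007, slightly larger than the .0062 obtained by rounding the standard error to .014 first. Either way, a sample this far above 9% is rare.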
CLT with proportions: Christmas Example
________________________________________
Sixty percent of the U.S. Population believes that
Christmas presents should be opened on Christmas
morning, as opposed to Christmas Eve. What is the
probability that 65 people out of a random sample of
125 will agree that Christmas morning is the
appropriate time to open presents?
Why do we sample?
________________________________________
1) To ensure an unbiased estimator (i.e., random
sample).
2) To decrease the variability of our estimator (i.e.,
increase its reliability).
3) To enable us to use the Central Limit Theorem as a
way of modeling chance variation in our sample.