Powerpoint Notes - Campbell County Schools

Ch 7 - Sampling Distribution
Vocabulary
• Population (review)
• Sample (review)
• Parameter
• Statistic
*Hint: the p’s and the s’s go together!
Notation
VERY IMPORTANT! You will lose
points on the exam if you use the wrong
notation (as well as if you use the wrong
word).
• Population mean: μ
• Sample mean: x̄
• Population proportion: p
• Sample proportion: p̂
Identify the population, the parameter, the
sample, and the statistic in each of the
following settings.
a) A pediatrician wants to know the 75th
percentile for the distribution of heights of 10-year-old boys, so she takes a sample of 50
patients and calculates Q3 = 56 inches.
b) A Pew Research Center Poll asked 1102 12- to 17-year-olds in the United States if they have
a cell phone. Of the respondents, 71% said
Yes.
Randomly select 5 cards from a shuffled deck (containing only cards 2 - 10) and note
the median value. Then replace your cards. We will repeat this “many,
many” times.
Create a table of sample
medians
Create a dotplot of sample medians
Describe what you see.
Sampling Distribution
The sampling distribution of a statistic is the distribution of values taken
by the statistic in all possible samples of the same size from a population.
For example, in our card activity if we were able to take every possible
sample of size 5 of cards 2 through 10 and record the outcomes, that
would be the sampling distribution of the median of a deck of cards when
selecting 5 cards.
That would be 36 nCr 5 = 376,992 different samples!
It’s very time consuming to take all possible samples. Instead, simulations
are done to imitate the process we just did. (FATHOM)
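The simulation idea can also be sketched in Python instead of Fathom. This is only an illustration: it assumes the deck holds cards 2 through 10 in four suits (36 cards), and the function name `simulate_sample_medians` and the fixed seed are made up for this example.

```python
import math
import random
import statistics

# Assumed deck: cards 2 through 10 in four suits (36 cards).
deck = [value for value in range(2, 11) for _ in range(4)]

# Number of distinct 5-card samples, the "36 nCr 5" from the notes.
num_samples = math.comb(len(deck), 5)

def simulate_sample_medians(deck, sample_size, repetitions, seed=0):
    """Draw `repetitions` SRSs (without replacement) and return each sample median."""
    rng = random.Random(seed)  # the seed is arbitrary; it only makes reruns repeatable
    return [statistics.median(rng.sample(deck, sample_size))
            for _ in range(repetitions)]

# 500 samples of size 5, mirroring the Fathom simulation described in the notes.
medians = simulate_sample_medians(deck, sample_size=5, repetitions=500)
population_median = statistics.median(deck)

print("possible samples:", num_samples)
print("population median:", population_median)
print("mean of simulated sample medians:", statistics.mean(medians))
```

A dotplot of `medians` approximates the sampling distribution; it is not the full sampling distribution, because it uses 500 samples rather than every possible sample.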
Using Fathom to simulate choosing
500 SRSs of size 5 from the deck of
cards 2 through 10 and finding the
sample medians produced the
following dotplot.
Describe it!
What is the population median?
Is there any connection between the
population
and the sampling distribution?
Distribution, Distribution, Distribution...
There are 3 different types of distributions:
1. Population Distribution
2. Distributions of Sample Data
3. Sampling Distributions
What’s the difference between the 3 distributions?
A population distribution is one graph of
everything possible as a whole.
What would it have been for our card
activity?
A distribution of sample data is an
individual graph depicting each outcome
from the sample you drew from the
population.
What would it have been for our card
activity?
A sampling distribution describes how a
statistic varies in many, many samples from the
population. You are no longer looking at the
individual elements in the
sample/population.
What would it have been for our card
activity?
Helpful Hint
Population & Sample Distributions describe
individuals
Sampling Distribution describes a statistic.
Let’s look at page 420 for a diagram to help
More questions about our
card simulation:
Was that a sampling distribution?
If someone claims to set up the same
activity and they select a sample of size 5
and get a median of 4, is that convincing
evidence that they set their deck up wrong?
According to the National Center for Health Statistics, the distribution
of heights for 16-year-old females is modeled well by a Normal density
curve with mean of 64 inches and standard deviation of 2.5 inches.
A. Make a graph of the population distribution.
B. Sketch a graph of the distribution of sample data for
an SRS of size 20.
Exit Slip: you must complete
p420 CYU #1 - 3 ON
YOUR OWN
BEFORE YOU LEAVE!
Biased or Unbiased?
So which sample statistics are biased and which are unbiased? To find out, let’s
collect some quantitative data:
1. On the piece of paper given to you write how many hours of sleep you got last
night.
2. Each of you will randomly select a sample of 4 cards.
3. You will need to record the following information: the four numbers, the sample
mean, and the sample range.
4. Replace the cards.
5. Pass the bag to the next person then record your sample mean and sample
range on the corresponding dotplots on the board.
6. Once everyone plots their data, we will analyze it and compare it to the
population mean and population range.
Definitions
• Biased Estimator - the mean of a
sampling distribution is not equal to the
value of the parameter being estimated
• Unbiased Estimator - the mean of a
sampling distribution is equal to the
value of the parameter being estimated
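The two definitions can be checked by brute force on a small population. The sketch below is an assumption-laden illustration, not the class activity itself: it uses a hypothetical deck of cards 2 through 10 in four suits as the population, lists every possible sample of size 4, and compares the mean of each statistic's sampling distribution with the matching parameter.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population: cards 2 through 10 in four suits (36 cards).
population = [value for value in range(2, 11) for _ in range(4)]
pop_mean = mean(population)                     # parameter for the sample mean
pop_range = max(population) - min(population)   # parameter for the sample range

# Every possible sample of size 4 (58,905 of them), so these are the
# actual sampling distributions rather than a simulation.
samples = list(combinations(population, 4))

mean_of_sample_means = mean(sum(s) / 4 for s in samples)
mean_of_sample_ranges = mean(max(s) - min(s) for s in samples)

print("population mean:", pop_mean, "| mean of sample means:", mean_of_sample_means)
print("population range:", pop_range, "| mean of sample ranges:", mean_of_sample_ranges)
```

Under these assumptions the sample means center on the population mean (unbiased), while the sample ranges center below the population range (biased), since a sample can never have a larger range than the population.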
Biased and Unbiased
Estimators
• This is different than the sampling
process being biased. When using an
estimator (i.e. a measure of center or
spread) we are assuming the sampling
process is not biased.
• The actual statistic we are finding can
be biased or unbiased as well.
Variability of a
Statistic
“the spread of the sampling distribution”
• What will help decrease the variability?
• Will it decrease bias?
To answer these questions,
let’s read p424 - 426
7.2 - Sample
Proportions
Remember your notation:
population parameter is p
sample proportion is p̂
More Notations
We have a population, we take a sample,
and find some proportion.
If we want to investigate those sample
proportions we can find the mean and
standard deviation of the sampling
distribution of the sample proportions.
mean of sample proportions: μ_p̂
standard deviation of sample
proportions: σ_p̂
Describe the Sampling
Distribution of p̂
SHAPE: sometimes it can be approximated by the
Normal curve. It depends on the sample size n and
the population proportion p.
CENTER: μ_p̂ = p because p̂ is an unbiased
estimator of p.
SPREAD: σ_p̂ gets smaller as n gets larger. The
value of σ_p̂ depends on both n and p.
p. 436 shows a good, small proof of why these are true
Sampling Distribution of a
Sample Proportion
Choose an SRS of size n from a population of size N with proportion p of
successes. Let p̂ be the sample proportion of successes. Then:
➡ the mean of the sampling distribution of p̂ is μ_p̂ = p
➡ the standard deviation of the sampling distribution of p̂ is
σ_p̂ = √(p(1 − p)/n)
AS LONG AS THE 10% CONDITION IS SATISFIED: n ≤ (1/10)N
➡ if np ≥ 10 and n(1 − p) ≥ 10, the Normal conditions are satisfied and
the sampling distribution of p̂ is approximately Normal.
Formulas are on your
formula sheet
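As a rough sketch, the standard facts about p̂ (mean p, standard deviation √(p(1 − p)/n), the 10% condition n ≤ N/10, and the Normal condition np ≥ 10 and n(1 − p) ≥ 10) can be bundled into one helper. The function name `describe_p_hat` and the example numbers are hypothetical and not from the slides.

```python
import math

def describe_p_hat(n, p, population_size=None):
    """Summarize the sampling distribution of p-hat for an SRS of size n."""
    mean = p                                  # p-hat is an unbiased estimator of p
    sd = math.sqrt(p * (1 - p) / n)           # valid when the 10% condition holds
    if population_size is None:
        ten_percent_ok = None                 # cannot be checked without N
    else:
        ten_percent_ok = n <= 0.10 * population_size
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return {"mean": mean, "sd": sd,
            "ten_percent_ok": ten_percent_ok, "normal_ok": normal_ok}

# Hypothetical example: n = 50, p = 0.3, population of 2000.
summary = describe_p_hat(n=50, p=0.3, population_size=2000)
print(summary)
```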
In a Gallup Poll of 1785 random adults, 44% said that they
attended a religious service last week. Suppose the actual
proportion of the adult population that attended a religious service is 40%.
a) What is the mean of the sampling distribution of p-hat?
b) Find the standard deviation and check the 10% condition.
c) Is the sampling distribution of p-hat approximately
Normal?
d) Find the probability of obtaining a sample of 1785 adults
in which 44% or more attended a service. Do you have
doubts about this poll?
e) What would the sample size need to be to reduce the
standard deviation of the sampling distribution by 1/3?
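One way to check your work on parts a) through d) is a short Python calculation. This is a sketch rather than an answer key: it uses `statistics.NormalDist` in place of the calculator, and it leaves out part e) because the answer depends on how you read "reduce by 1/3".

```python
import math
from statistics import NormalDist

n = 1785                 # sample size
p = 0.40                 # assumed true population proportion
p_hat_observed = 0.44    # proportion reported by the poll

# a) mean of the sampling distribution of p-hat
mean_p_hat = p

# b) standard deviation, plus the population size the 10% condition needs
sd_p_hat = math.sqrt(p * (1 - p) / n)
min_population = 10 * n          # the population must be at least this large

# c) Normal condition: np >= 10 and n(1 - p) >= 10
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# d) P(p-hat >= 0.44) under the Normal approximation
prob = 1 - NormalDist(mean_p_hat, sd_p_hat).cdf(p_hat_observed)

print(mean_p_hat, round(sd_p_hat, 4), min_population, normal_ok, prob)
```

A very small probability in part d) means a result of 44% would be surprising if the true proportion really were 40%.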
The superintendent of a large school district wants
to know what proportion of middle school students
in her district are planning on attending a four-year
college or university. Suppose that 80% of all
middle school students in her district are planning
on attending a four-year college or university. What
is the probability that an SRS of size 125 will give a
result within 7 percentage points of the true value?
We will use the 4-step method to solve this
problem.
State
We want to find the probability that the
percentage of middle school students in the sample that
plan to attend a 4-year college or university
falls between 73% and 87%
or in symbols: P(0.73 < p̂ < 0.87)
Plan
μ_p̂ = p = 0.80.
Since the school district is large, we’ll
assume the 10% condition is satisfied
and there are more than 1250 students.
(10*125 = 1250).
So, σ_p̂ = √(0.8(0.2)/125) = 0.036
We can consider the distribution of p̂ to
be approximately Normal since the
following are true:
np = 125(.8) = 100 > 10
n(1-p)= 125(.2) = 25 > 10
Do
P(0.73 < p̂ < 0.87) = normalcdf(0.73, 0.87,
0.80, 0.036) = 0.948
If you want full credit on the exam, you
must have clearly stated everything in the
“Plan” step. Then these calculations will
receive full credit.
Sketching a Normal curve will help.
You can also use Table A to find the answer. Remember to standardize (z-score) first!
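The calculator step can be reproduced in Python as a check. In this sketch, `normalcdf` is a hypothetical helper written to mimic the calculator command's argument order (lower, upper, mean, standard deviation); it is not a built-in Python function.

```python
import math
from statistics import NormalDist

def normalcdf(lower, upper, mean, sd):
    """Area under the Normal(mean, sd) curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

p, n = 0.80, 125
sd_rounded = 0.036                        # value used on the slide
sd_unrounded = math.sqrt(p * (1 - p) / n)  # about 0.0358

prob_slide = normalcdf(0.73, 0.87, p, sd_rounded)
prob_unrounded = normalcdf(0.73, 0.87, p, sd_unrounded)

# Table A route: standardize first, then look up the z-scores.
z_low = (0.73 - p) / sd_rounded
z_high = (0.87 - p) / sd_rounded

print(round(prob_slide, 3), round(prob_unrounded, 3))
print(round(z_low, 2), round(z_high, 2))
```

Using the unrounded standard deviation shifts the answer slightly, which is why it helps to show the standard deviation you used in the Plan step.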
Conclude
About 95% of all SRSs of size 125 will give
a sample proportion within 7 percentage
points of the true proportion of middle
school students who want to attend a four-year college or university.
7.3 - Sample Means
More Notation!
Mean of the sampling distribution: μ_x̄
Standard deviation of the sampling distribution of the sample means: σ_x̄
All of the notations in this chapter are very important and very similar. You
will lose credit for using the wrong notations on the exam. So if you can’t
remember, it’s always best to write out what you are finding rather than try to
use a notation.
Suppose that x̄ is the mean of an SRS of size
n drawn from a large population with mean μ
and standard deviation σ. It does not matter what
shape the population has.
The mean of the sampling
distribution of x̄ is μ_x̄ = μ
The standard deviation of the
sampling distribution of x̄ is σ_x̄ = σ/√n
as long as the 10% condition is satisfied!
These formulas are on
your formula sheet
If you are asked to find the sampling distribution of
x̄, this means to state whether it is Normal and find its
mean and standard deviation.
Hint: if the population itself is approximately
Normal, then so is the sampling distribution
of x̄.
Hint: Please read carefully! Make sure you know if
you are using the population standard deviation or
the standard deviation of the sample means before you
standardize or use normalcdf on your
calculator.
A grinding machine in an auto parts plant prepares axles with a
target diameter μ = 40.125 mm. The machine has some
variability, so the standard deviation of the diameters is σ =
0.002 mm. The machine operator inspects a random sample of 4
axles each hour for quality control purposes and records the
sample mean diameter x̄.
a) Assuming the process is working properly,
what are the mean and standard deviation of
the sampling distribution of x̄?
b) Can you find the probability that x̄ is within
±0.05 mm if you are choosing an SRS of 100
axles? Explain.
c) In order for you to pass this inspection the
standard deviation of the sampling
distribution of x̄ needs to be 0.0005 mm.
How many axles would you have to sample?
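Parts a) and c) come down to the formula σ_x̄ = σ/√n, used forwards and then solved for n. The sketch below is one way to check the arithmetic; it uses `Fraction` only to avoid floating-point rounding when solving for n, and the variable names are illustrative.

```python
import math
from fractions import Fraction

mu = 40.125                    # target diameter in mm
sigma = Fraction("0.002")      # population standard deviation in mm
n = 4                          # axles inspected each hour

# a) mean and standard deviation of the sampling distribution of x-bar
mean_x_bar = mu
sd_x_bar = float(sigma) / math.sqrt(n)

# c) solve sigma / sqrt(n) = target for n, so n = (sigma / target)^2,
#    rounded up to a whole number of axles
target_sd = Fraction("0.0005")
required_n = math.ceil((sigma / target_sd) ** 2)

print(mean_x_bar, sd_x_bar, required_n)
```

Notice that cutting the standard deviation in half (from 0.001 to 0.0005) takes four times as many axles, because n sits under a square root.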
The composite scores of individual students on the
ACT in 2009 followed a Normal distribution with
mean 21.1 and standard deviation 5.1.
a) What is the probability that a single student
randomly chosen from all those taking the test
scores 23 or higher? Show your work.
b) Now take an SRS of 50 students who took the
test. What is the probability that the mean score x̄
of these students is 23 or higher?
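The difference between parts a) and b) is which standard deviation you use: σ for one student, σ/√n for the mean of 50 students. A short check in Python, offered as a sketch with `statistics.NormalDist` standing in for the calculator:

```python
import math
from statistics import NormalDist

mu, sigma = 21.1, 5.1      # population mean and standard deviation of ACT scores
n = 50

# a) one randomly chosen student: use the population standard deviation
p_single = 1 - NormalDist(mu, sigma).cdf(23)

# b) mean of an SRS of 50 students: use sigma / sqrt(n)
sd_x_bar = sigma / math.sqrt(n)
p_mean = 1 - NormalDist(mu, sd_x_bar).cdf(23)

print(round(p_single, 4), round(sd_x_bar, 3), round(p_mean, 4))
```

A single score of 23 or higher is fairly common, but a sample mean of 23 or higher is rare, because averages vary much less than individuals.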
What if the population shape is not Normal?
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
Central Limit
Theorem
Draw an SRS of size n from any population
with mean μ and finite standard deviation σ.
CLT - when n is large, the sampling
distribution of the sample means x̄ is
approximately Normal.
NOTE: this is of the sample means, not
just any sample!!!
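A quick simulation can make the theorem concrete. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen only for illustration) and uses independent random draws to stand in for SRSs from a very large population. The function name and seed are made up for this example.

```python
import math
import random
import statistics

def simulate_sample_means(n, repetitions, seed=1):
    """Return the means of `repetitions` samples of size n from a skewed population."""
    rng = random.Random(seed)  # arbitrary seed so the run is repeatable
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(repetitions)]

n = 40
sample_means = simulate_sample_means(n=n, repetitions=5000)

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)
predicted_sd = 1 / math.sqrt(n)      # sigma / sqrt(n) with sigma = 1

print(round(observed_mean, 3), round(observed_sd, 3), round(predicted_sd, 3))
```

A dotplot or histogram of `sample_means` should look roughly symmetric and bell-shaped even though the population itself is heavily skewed.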
How large is large?
When the population is not Normal, the
Normal conditions for the sample means
will be met through the CLT in most
cases if n ≥ 30.
The number of flaws per square yard in a type of
carpet material varies with the mean 1.6 flaws per
square yard and standard deviation 1.2 flaws per
square yard. The population distribution cannot be
Normal, because a count takes only whole-number
values. An inspector studies 200 square yards of
material, records the number of flaws found in
each square yard, and calculates x̄, the mean
number of flaws per square yard inspected. Find
the probability that the mean number of flaws
exceeds 2 per square yard.
State
What’s the probability that the mean
number of flaws per square yard of carpet
is more than 2?
Plan
• The mean of the sampling distribution of
the sample means is μ_x̄ = 1.6
• 10% condition is met since there’s more
than 2000 square yards of carpet, so
σ_x̄ = 1.2/√200 = 0.085
• Since the sample size is large, 200 > 30,
we can safely use the Normal distribution
as an approximation for the
sampling distribution of x̄
Do
Draw the curve,
find P(x̄ > 2) = normalcdf(2, 100, 1.6, 0.085)
≈ 0
**identify what these numbers are, or standardize:
z = (2 - 1.6)/0.085 = 4.71
P(z > 4.71) ≈ 0
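The same calculation can be checked in Python. This sketch uses the unrounded standard deviation, so the z-score comes out slightly different from the one found with 0.085; the conclusion does not change.

```python
import math
from statistics import NormalDist

mu, sigma = 1.6, 1.2     # flaws per square yard: population mean and sd
n = 200                  # square yards inspected

sd_x_bar = sigma / math.sqrt(n)              # about 0.085
z = (2 - mu) / sd_x_bar                      # standardized value of x-bar = 2
prob = 1 - NormalDist(mu, sd_x_bar).cdf(2)   # P(x-bar > 2), Normal approximation

print(round(sd_x_bar, 4), round(z, 2), prob)
```

The probability is not exactly 0, just far too small to show up at the calculator's usual rounding.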
Conclude
There is virtually no chance that the
average number of flaws per square yard in the
sample will be greater than 2.