RITAM SAMPLING

1
Estimating the “correct” sample size
Confidence intervals narrow in proportion to the square
root of the sample size, so there are decreasing returns
to scaling up your sample size.
Sample size must be planned for the best return
on investment, because surveying people is a
costly and time consuming exercise.
However, you also have to have workable margins
of error.
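As a rough sketch of why the returns decrease, assume the usual large-sample margin of error. C, σ and n stand for the critical value, the population standard deviation and the sample size; the symbols are chosen here for illustration.

```latex
\text{margin of error} = C \cdot \frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
\frac{\text{margin at } 4n}{\text{margin at } n} = \frac{\sqrt{n}}{\sqrt{4n}} = \frac{1}{2}
```

Under that assumption, quadrupling the sample only halves the margin of error.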
2
General Considerations in Estimating Sample Size
There are two questions in determining sample
size:
How confident do you want to be (your confidence
level)?
How small do you want your margins of error to
be?
The formula for calculating sample size depends on
having an idea of these parameters.
You would not normally want less than a 95% confidence
level, and you normally want the margins to be as
low as possible.
3
Estimating sample size for proportions
The first step to determining sample size is to ask:
How small do you want your confidence interval to
be?
Say you want to be sure that the population
proportion will fall within 2% of your sample
proportion 95 times out of 100.
What you are asking for is that 1.96 * SEp = 2% and
we can use the standard error statistic to estimate
the appropriate sample size to return a 2% margin.
4
The SE statistic in estimating sample size for proportions
$$SE_p = \sqrt{\frac{\pi(100 - \pi)}{n}}$$
Same SE statistic we used before except p has been replaced by
the population proportion given by π
Again, there are two problems to be solved:
First, we don’t know what π is
Second, we don’t know what n is (that’s what we
need to know)
Use theory for one and then solve for the other
5
Using theory to solve for π
$$1.96 \times SE_p = 1.96 \times \sqrt{\frac{50(100 - 50)}{n}} = 2$$
The π term has been replaced by 50% because
π(100 − π) is at its largest when π = 50, so assuming
50% gives the most cautious (largest) sample size
whatever the true proportion turns out to be.
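A minimal sketch of what happens to the π(100 − π) term in the SE formula as π varies, in steps of 10 percentage points. The function name `variance_term` is made up for this illustration.

```python
def variance_term(pi_percent):
    """Return pi * (100 - pi), the numerator under the square root in SE_p."""
    return pi_percent * (100 - pi_percent)

# Candidate population percentages from 0% to 100% in steps of 10.
candidates = range(0, 101, 10)

for pi_percent in candidates:
    print(pi_percent, variance_term(pi_percent))

# The candidate with the largest term needs the largest sample.
worst_case = max(candidates, key=variance_term)
print("largest term at pi =", worst_case)
```

The term peaks at π = 50, so plugging in 50% can only overstate the sample size needed, never understate it.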
The 1.96 * SEp = 2% formula is now restated as
above, leaving one unknown value, n, to be solved by
algebra
6
Using algebra to solve for n
To solve for n:
$$\sqrt{n} = \frac{1.96 \times 50}{2}$$
$$\sqrt{n} = 49$$
$$n = 49^2$$
$$n = 2{,}401$$
The sample size should be 2,401 to give a 2%
margin of error at 95% confidence
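The algebra on this slide can be sketched in Python as follows. The function name and its default arguments are illustrative assumptions, with π defaulting to 50 and the critical value to 1.96 as in the slides.

```python
import math

def sample_size_for_proportion(margin_percent, pi_percent=50.0, z=1.96):
    """Solve z * sqrt(pi * (100 - pi) / n) = margin for n, rounding up."""
    # Same steps as the slide: first find sqrt(n), then square it.
    root_n = z * math.sqrt(pi_percent * (100.0 - pi_percent)) / margin_percent
    n = root_n ** 2
    # round() guards against floating-point noise before rounding up.
    return math.ceil(round(n, 9))

print(sample_size_for_proportion(2.0))  # 2401, as on the slide
print(sample_size_for_proportion(4.0))  # 601
```

Doubling the acceptable margin from 2% to 4% cuts the required sample to roughly a quarter.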
7
The SE statistic in estimating sample size for means
You want to be 95% sure the population mean is
within 1.5 grade points?
$$n = \left(\frac{C \cdot \sigma}{\delta}\right)^2$$
Where:
C is the critical value for the confidence level you want (1.96 for 95%).
σ (small letter sigma) is the population standard deviation.
δ (small letter delta) is the largest amount by which you are willing to let
the sample mean differ from the population mean (the margin of error).
So where do you find all the values?
8
The SE statistic in estimating sample size for means
$$n = \left(\frac{C \cdot \sigma}{\delta}\right)^2$$
C is 1.96 (the critical value for your chosen 95% confidence level).
δ is 1.5 grade points (your chosen margin of error).
σ is something you don’t know: the population
standard deviation.
But you can substitute the sample standard deviation
‘s’ for this:
$$n = \left(\frac{C \cdot s}{\delta}\right)^2$$
9
The SE statistic in estimating sample size for means
But where do you get it, since you don’t have your
survey done yet and are still trying to figure out the
sample size? To find a stand-in for σ you can:
1. Use a previous survey.
2. Do a pilot study.
3. Use secondary data.
4. Use your own or another researcher’s experience.
Methods 1 or 2 are the best and a pilot study of 30
cases would provide a theoretically sound sample
standard deviation you could use as a substitute.
We have the small n=63 survey with a standard
deviation of 12 so we can use this:
10
THIS SLIDE IS FROM EARLIER IN THE LECTURE
IT SHOWS HOW THE SUBSTITUTION OF THE SAMPLE SD FOR THE
POPULATION STANDARD DEVIATION WORKS
$$SE = \sqrt{\frac{\sigma^2}{n}}$$
We still have the σ term though, but that can be exchanged for the sample
standard deviation since we know from theory that a sample’s mean,
standard deviation, and proportion are close enough to the population’s
mean, standard deviation, and proportion to be able to substitute. Thus the
formula becomes:
$$SE = \sqrt{\frac{s^2}{n}}$$
And this is adequate for calculations. More to the point,
there are no unknowns and so it can be solved.
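A minimal sketch of the substituted formula, using the n=63 survey with a standard deviation of 12 mentioned in these slides. The function name `standard_error` is an illustrative choice.

```python
import math

def standard_error(sample_sd, n):
    """SE = sqrt(s^2 / n), with the sample SD standing in for sigma."""
    return math.sqrt(sample_sd ** 2 / n)

se = standard_error(12.0, 63)
print(round(se, 2))         # standard error in grade points
print(round(1.96 * se, 2))  # 95% margin of error for that survey
```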
11
The SE statistic in estimating sample size for means
This:
$$n = \left(\frac{C \cdot \sigma}{\delta}\right)^2$$
Becomes this:
$$n = \left(\frac{C \cdot s}{\delta}\right)^2$$
With the substitution method the sample size formula now
has all of the parameters it needs: C=1.96, s=12, δ=1.5
And worked through,
$$n = \left(\frac{1.96 \times 12}{1.5}\right)^2 = 15.68^2 \approx 246$$
(Check: 1.96 × 12/√246 ≈ 1.5.)
You would need a sample size of 246 students
to be 95% sure that your population mean
would be within 1.5 grade points of the
sample mean.
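The calculation can be sketched in Python, assuming the formula is read with the whole ratio C·s/δ squared, which is what you get by solving C·s/√n = δ for n. The function name and defaults are illustrative.

```python
import math

def sample_size_for_mean(sample_sd, margin, z=1.96):
    """Solve z * s / sqrt(n) = margin for n, rounding up to a whole case."""
    n = (z * sample_sd / margin) ** 2
    # round() guards against floating-point noise before rounding up.
    return math.ceil(round(n, 9))

n_needed = sample_size_for_mean(sample_sd=12.0, margin=1.5)
print(n_needed)  # 246

# Sanity check: the margin of error actually achieved at that n.
print(round(1.96 * 12.0 / math.sqrt(n_needed), 3))
```

The sanity check line recomputes 1.96 × SE at the suggested n, which should land at or just under the 1.5 grade point target.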
12
Diminishing Returns on Sample Size
First thing to remember is the sampling fraction (n/N):
it doesn’t matter how big the population is from which
you draw your sample because it is the sample size that
matters not the fraction that the sample is of the
population.
If ‘n’ keeps increasing, your confidence interval keeps
decreasing until when n=N the confidence interval
would be zero – your sample is the population.
But by how much would you have to increase the
sample size and is it worth it?
13
Effect of diminishing returns on sample size using sd=12.0
Change in ‘n’ produces rapidly decreasing change in ‘CI’.

n (doubled)   SE in %   CI @ 95%   Change in n   Change in CI   Change in CI for every n
30            2.19      4.29       30            -1.2577        -0.04192
60            1.55      3.04       60            -0.8893        -0.01482
120           1.10      2.15       120           -0.6289        -0.00524
240           0.77      1.52       240           -0.4447        -0.00185
480           0.55      1.07       480           -0.3144        -0.00066
960           0.39      0.76       960           -0.2223        -0.00023
1920          0.27      0.54       1920          -0.1572        -0.00008
3840          0.19      0.38       3840          -0.1112        -0.00003
7680          0.14      0.27       7680          -0.0786        -0.00001
15360         0.10      0.19

Each change is measured from that row to the next doubling of ‘n’.
To a point, then the advantage of a larger ‘n’ diminishes rapidly.
14
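The figures on this slide can be reproduced with a short sketch, assuming SE = sd/√n and CI = 1.96 × SE with sd = 12.0, and measuring each change from one sample size to its double. The function names are illustrative.

```python
import math

SD = 12.0  # standard deviation used on the slide
Z = 1.96   # critical value for 95% confidence

def ci_half_width(n, sd=SD, z=Z):
    """95% margin of error for a mean at sample size n."""
    return z * sd / math.sqrt(n)

def doubling_rows(start=30, doublings=9):
    """Rows of (n, CI, change in CI when n doubles, change per added case)."""
    rows = []
    n = start
    for _ in range(doublings):
        ci_now = ci_half_width(n)
        change = ci_half_width(2 * n) - ci_now
        # Doubling n adds n extra cases, so divide the change by n.
        rows.append((n, ci_now, change, change / n))
        n *= 2
    return rows

for n, ci, change, per_case in doubling_rows():
    print(f"{n:>6} {ci:6.2f} {change:8.4f} {per_case:9.5f}")
```

Going from 30 to 60 cases buys about 1.26 points of precision; going from 7,680 to 15,360 buys less than 0.08.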
Diminishing Returns on Sample Size (cont…)
[Chart: Change in CI for every change in 'n'. Vertical axis: change in confidence interval (CI %), from 0 down to -0.2. Horizontal axis: doublings of ‘n’ starting at 30.]
15