Sample size for proportion in single cross

Sample size calculation for a single cross-sectional survey
To estimate a sample size for a proportion in a single cross-sectional survey, three numbers
are needed:
1. Estimate of the expected proportion (p)
2. Desired level of absolute precision (d)
3. Estimated design effect (DEFF)
The sample size formula is:
$$n = \frac{1.96^2 \, p(1 - p)(\mathrm{DEFF})}{d^2}$$
(Gorstein J, Sullivan KM, Parvanta I, Begin F. Indicators and methods for cross-sectional
surveys of vitamin and mineral status of populations. Micronutrient Initiative (Ottawa) and
Centers for Disease Control and Prevention (Atlanta), May 2007, pg 29).
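The formula translates directly into a few lines of code. The following is a minimal sketch and
is not part of the cited reference; the function name `sample_size` and its parameter names are
illustrative assumptions, and the default z = 1.96 corresponds to a 95% confidence level. The
input values in the usage lines are chosen for illustration only.

```python
import math


def sample_size(p, d, deff=1.0, z=1.96):
    """Return the unrounded sample size n = z^2 * p * (1 - p) * DEFF / d^2.

    p    -- expected proportion, as a fraction between 0 and 1
    d    -- desired absolute precision, as a fraction (0.05 means +/-5%)
    deff -- estimated design effect (1.0 corresponds to simple random sampling)
    z    -- standard normal quantile; 1.96 gives 95% confidence
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if d <= 0 or deff <= 0:
        raise ValueError("d and deff must be positive")
    return z ** 2 * p * (1 - p) * deff / d ** 2


# Illustrative inputs only: p = 0.30, d = 0.05, DEFF = 2
n = sample_size(0.30, 0.05, deff=2.0)
print(round(n, 2), math.ceil(n))
```

The function returns the unrounded value so that the caller can decide how to round it.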
If the expected proportion p for an indicator is not known, the value of 0.5 (or 50%) is usually
used because it produces the largest sample size (for given values of d and DEFF). If the
proportion is expected to lie between two values, select the value closest to 0.5. For example,
if the proportion is thought to be between 0.15 and 0.30, use 0.30 for the sample size
calculation.
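The rule for picking p from a range can be sketched as a small helper. The name `conservative_p`
is hypothetical, and returning 0.5 when the range contains 0.5 is an assumption that follows
from the same reasoning: p(1 - p) is largest at p = 0.5.

```python
def conservative_p(low, high):
    """Pick the value in the range [low, high] that is closest to 0.5.

    p * (1 - p) peaks at p = 0.5, so this choice gives the largest
    (most conservative) sample size for fixed d and DEFF.
    """
    if not 0 <= low <= high <= 1:
        raise ValueError("expected 0 <= low <= high <= 1")
    if low <= 0.5 <= high:
        return 0.5  # assumption: the range contains 0.5, so use 0.5 itself
    # The whole range lies on one side of 0.5; take the nearer endpoint.
    return high if high < 0.5 else low


print(conservative_p(0.15, 0.30))  # range from the example in the text
print(conservative_p(0.80, 0.90))  # a range above 0.5
```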
The level of absolute precision d specifies the half-width of the confidence interval, e.g.,
±0.03 (i.e., ±3%), ±0.05 (i.e., ±5%) or ±0.10 (i.e., ±10%). For example, if the proportion
estimated were 40%, would a precision of ±10% (i.e., 95% confidence limits of 30% and 50%) be
acceptable? If not, would a narrower confidence interval (35%, 45%), i.e., a precision of ±5%,
be acceptable? The selection of a value for d (the desired absolute precision) may depend
on the expected proportion and the purpose of the survey. Common values for d are
around ±5% for estimated proportions in the range of 20%-80%, and around ±3% for
less common or very common events (<20% or >80%).
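Because d enters the formula squared, tightening the precision is costly: halving d roughly
quadruples the sample size. The sketch below illustrates this for the 40% example, assuming a
DEFF of 1 purely for illustration (that value, and the helper name `n_for_precision`, are not
from the source).

```python
import math

Z = 1.96     # 95% confidence
P = 0.40     # expected proportion from the example in the text
DEFF = 1.0   # assumption for illustration: no design effect


def n_for_precision(d):
    """Rounded-up sample size for absolute precision d (as a fraction)."""
    return math.ceil(Z ** 2 * P * (1 - P) * DEFF / d ** 2)


for d in (0.10, 0.05, 0.03):
    lower, upper = P - d, P + d
    print(f"d = +/-{d:.2f}: limits {lower:.2f} to {upper:.2f}, "
          f"n = {n_for_precision(d)}")
```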
The sample size required for a cluster survey is almost always larger than that required for
surveys using simple random sampling because of the design effect (DEFF). If the
prevalence of a particular indicator is similar in each cluster, the DEFF will be around one,
which means the variability is the same as it would have been with simple random sampling.
The more the clusters differ from one another, the larger the DEFF. As the DEFF increases,
the sample size must be increased to maintain a desired level of precision.
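In the formula, the DEFF acts as a simple multiplier on the sample size that simple random
sampling would need. A minimal sketch, assuming p = 0.5 and d = 0.05 for illustration (the
helper name `srs_sample_size` and the list of DEFF values are hypothetical):

```python
import math


def srs_sample_size(p, d, z=1.96):
    """Unrounded sample size under simple random sampling (DEFF = 1)."""
    return z ** 2 * p * (1 - p) / d ** 2


n_srs = srs_sample_size(0.5, 0.05)

# Multiply by the design effect, then round up to whole respondents.
for deff in (1.0, 1.5, 2.0, 3.0):
    print(f"DEFF = {deff}: n = {math.ceil(n_srs * deff)}")
```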
After a survey has been completed and the data analyzed, any calculated proportion is an
estimate of the proportion in the whole population. Generally, a confidence interval is
calculated to present a range of values within which the true proportion is likely to be
captured. For example, if the proportion is 40% and the lower and upper 95% confidence
limits are 30% and 50%, respectively, the interpretation would be that the true proportion in
the population most likely lies somewhere between 30% and 50%. This means that it would
be very unlikely for the true population proportion to be below 30% and very unlikely for it to
be greater than 50%.
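Rearranging the sample size formula gives one simple way to compute such an interval after
the survey. The sketch below assumes a normal-approximation (Wald-type) interval with the
variance inflated by the DEFF; survey analysis software may use other methods, and the function
name and the input values are hypothetical.

```python
import math


def confidence_interval(p_hat, n, deff=1.0, z=1.96):
    """Normal-approximation interval for a proportion.

    The variance p_hat * (1 - p_hat) / n is multiplied by DEFF to
    account for the survey design; limits are clipped to [0, 1].
    """
    se = math.sqrt(deff * p_hat * (1 - p_hat) / n)
    half_width = z * se
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical survey result: 40% observed among 369 respondents, DEFF = 1
lower, upper = confidence_interval(0.40, 369)
print(f"{lower:.3f} to {upper:.3f}")
```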
Experience from surveys of anemia, vitamin A deficiency, and iodine deficiency with around
30 individuals sampled in each of 30 clusters suggests DEFFs in the range of 1.5 to 3. If more
than 30 individuals are sampled per cluster, the DEFF is usually larger; if fewer than 30
individuals are sampled per cluster, the DEFF is usually smaller. Sample sizes for different
key indicators, based on estimated prevalence levels, design effects, confidence levels, and
precision, are presented in Table 1. Please note that these are only estimates and the actual
sample size required for an individual country survey will vary.
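One common way to reason about why the DEFF grows with the number of individuals per cluster,
which is not taken from the cited reference, is the approximation DEFF ≈ 1 + (m - 1)ρ, where m
is the number sampled per cluster and ρ is the intracluster correlation. The sketch below uses
a hypothetical ρ of 0.05; real values depend on the indicator and the population.

```python
def approx_deff(m, rho):
    """Approximate design effect for equal-sized clusters.

    m   -- individuals sampled per cluster
    rho -- intracluster correlation (assumed known; hypothetical here)
    """
    return 1 + (m - 1) * rho


RHO = 0.05  # hypothetical intracluster correlation, for illustration only

for m in (15, 30, 45):
    print(f"{m} per cluster: DEFF is about {approx_deff(m, RHO):.2f}")
```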
As an example, the sample size calculation to assess vitamin A capsule coverage,
assuming p = 0.5, d = 0.05, and DEFF = 2, is:

$$n = \frac{1.96^2 \times 0.5 \times 0.5 \times 2}{0.05^2} = 768.32$$
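The arithmetic of this example can be checked step by step. This is only a sketch; the variable
names are illustrative.

```python
import math

z, p, d, deff = 1.96, 0.5, 0.05, 2

z_squared = z ** 2                           # 1.96 squared
variance_term = p * (1 - p)                  # 0.5 * 0.5
n_srs = z_squared * variance_term / d ** 2   # size under simple random sampling
n = n_srs * deff                             # inflate by the design effect
n_rounded = math.ceil(n)                     # round up to whole respondents

print(round(n_srs, 2), round(n, 2), n_rounded)
```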
Table 1. Examples of sample size calculations for key micronutrient indicators

                                                  Expected
                                                  prevalence/   Absolute    Design
                                                  coverage      precision   effect   Sample
Micronutrient/Indicators/Group                    (p) (%)       (d) (%)     (DEFF)   size (a)
--------------------------------------------------------------------------------------------
Indicators based on individuals
  VAD and capsule coverage
    Low serum retinol in preschool children       10-15         ±3.0        2.0      1088
    Vitamin A capsule coverage                    80-90         ±3.0        2.0      1365
  IDA and supplementation coverage
    Anemia                                        40-60         ±5.0        2.0       769
    Iron deficiency anemia                        15-30         ±5.0        2.0       769
    Iron tablet coverage in appropriate group(s)  20-40         ±5.0        3.0      1106
  IDD
    Low urinary iodine                            10-30         ±5.0        2.0       646
Indicators based on households
    Households using vitamin A fortified product  25-75         ±5.0        2.0       769
    Households using iron fortified product(s)    20-40         ±5.0        3.0      1106
    Households using iodized salt                 50-75         ±5.0        3.0      1152
--------------------------------------------------------------------------------------------
(a) Sample sizes are calculated for each stratum based on 95% confidence intervals (α =
0.05), expected prevalence/coverage levels closest to 50%, desired absolute
precision, and estimated design effect.
Sample sizes are always rounded up, so the sample size from the vitamin A capsule coverage
example (768.32) would be 769. In some settings, a different precision value and/or a different
expected DEFF value may be appropriate. For example, if the prevalence of anemia is thought
to be 60%, this would indicate that anemia is a severe problem in the population; it could be
decided that a precision of ±10% would be adequate because any anemia prevalence estimate in a
population >40% is considered to indicate a severe anemia problem, so a very precise
estimate may not be necessary. On the other hand, if the proportion is very low or very
high, for example, if the proportion of households using iodized salt is thought to be 92%, in
some situations it may be desired to have greater precision than ±5%, perhaps ±2.5%. In
general, when performing sample size calculations or using sample size tables, careful
consideration should be given to the values used in the calculation.
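To see how much these choices matter, the two scenarios can be run through the sample size
formula. The sketch below is illustrative only: it assumes the DEFF values listed for anemia
(2.0) and iodized salt (3.0) in Table 1, and the helper name `rounded_sample_size` is
hypothetical.

```python
import math


def rounded_sample_size(p, d, deff, z=1.96):
    """Sample size from the formula in the text, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) * deff / d ** 2)


# Anemia thought to be 60%; DEFF = 2.0 assumed, as in the Table 1 anemia row
print("anemia, +/-5%:  ", rounded_sample_size(0.60, 0.05, 2.0))
print("anemia, +/-10%: ", rounded_sample_size(0.60, 0.10, 2.0))

# Iodized salt use thought to be 92%; DEFF = 3.0 assumed, as in Table 1
print("salt, +/-5%:    ", rounded_sample_size(0.92, 0.05, 3.0))
print("salt, +/-2.5%:  ", rounded_sample_size(0.92, 0.025, 3.0))
```

Relaxing the precision from ±5% to ±10% cuts the required sample to about a quarter, while
tightening it from ±5% to ±2.5% multiplies the sample by about four.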