Sampling

Sampling for EHES
EHES Training Material
Ideal target population
• The core target population for EHES is all adults
aged 25 to 64 who reside in the country
• The age range can be extended by the individual
countries
• Institutionalized people should be included
• Temporary visitors are not included
Main sampling frame
• The main sampling frame is the list of
people/addresses to take a sample from.
• An ideal list:
• Is updated regularly
• Includes everyone in the target population
• Contains contact information
• In reality, add-on lists may be necessary
(especially for those in institutions)
Sampling designs
• A sample is taken to represent the population as
a whole, because we do not have the resources to
survey everybody
• We recommend (for most countries) a multi-stage
design, which reduces costs and resources by
clustering participants into manageable areas
known as Primary Sampling Units (PSUs)
An example country with
random sampling
Clustering participants reduces costs
What is a multi-stage sample?
Stage 1
• The country is divided into Primary Sampling Units
(PSUs)
• A number of these are selected randomly
An example country
Stage 2 (when sampling from a population register)
• Within each selected PSU, people from the
population register are selected randomly
An example country
Stage 2 (when sampling from a household list)
• Within each selected PSU, households from a
household list are selected randomly
An example country
Stage 3 (household samples: all household members)
• Within each selected household we select all
household members
An example country
Stage 3 (household samples: one person per household)
• Within each selected household we select 1 person
An example country
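The stages above can be sketched in a few lines of Python. This is a minimal illustration only, assuming equal-probability random selection at every stage and the household-list variant with one person per household. The function name, the toy country and the numbers of PSUs and households are all hypothetical and are not taken from the EHES tools.

```python
import random

def multistage_sample(psus, n_psus, n_households, seed=None):
    """Three-stage sample from a dict: PSU name -> list of households.

    Each household is a list of person identifiers (hypothetical data layout).
    """
    rng = random.Random(seed)
    # Stage 1: select PSUs at random
    selected_psus = rng.sample(sorted(psus), n_psus)
    sample = []
    for psu in selected_psus:
        households = psus[psu]
        # Stage 2: select households at random within the selected PSU
        chosen = rng.sample(households, min(n_households, len(households)))
        for household in chosen:
            # Stage 3: select one person at random within the household
            sample.append((psu, rng.choice(household)))
    return sample

# Toy country: 10 PSUs, 20 households each, 1 to 3 people per household
toy_country = {
    f"PSU-{p}": [[f"P{p}-H{h}-{i}" for i in range(1 + (h % 3))] for h in range(20)]
    for p in range(10)
}

sample = multistage_sample(toy_country, n_psus=3, n_households=5, seed=1)
for psu, person in sample:
    print(psu, person)
```

With 3 PSUs and 5 households per PSU, the sketch returns 15 people, all living in the 3 selected areas, which is what keeps fieldwork manageable.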
What is random selection?
• Selecting a person randomly means that they are
selected entirely by chance
• We can calculate how likely someone is to be
selected, but we cannot know in advance whether
they actually will be selected: this is the random part
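As a small worked example of "how likely someone is to be selected", the sketch below multiplies the selection chances of the two stages. It assumes a two-stage design with simple random selection at each stage; the function name and all numbers are hypothetical.

```python
def inclusion_probability(n_psus_selected, n_psus_total,
                          n_people_selected, n_people_in_psu):
    """Chance that one person ends up in a two-stage sample.

    Assumes equal-probability random selection at both stages.
    """
    p_psu = n_psus_selected / n_psus_total            # chance the person's PSU is selected
    p_person = n_people_selected / n_people_in_psu    # chance of selection within that PSU
    return p_psu * p_person

# Hypothetical numbers: 20 of 100 PSUs selected, 50 of 5000 people per selected PSU
p = inclusion_probability(20, 100, 50, 5000)
print(p)  # about 0.002, i.e. roughly 1 person in 500
```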
Why random selection?
• To estimate the health of the population we need
to know everyone’s chances of being
selected/invited
• This is only possible with random selection
(believe it or not)
• Replacing someone who does not want to or cannot
participate with somebody else means we no
longer have a random sample, so we cannot
estimate health figures accurately from the data
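To show why the selection chances must be known, the sketch below weights each participant by the inverse of their selection probability. This is the standard inverse-probability weighting idea, given here as an illustration and not as the EHES estimation procedure; the function name and the data are hypothetical, and non-response adjustments are ignored.

```python
def weighted_prevalence(has_condition, selection_probs):
    """Estimate a population prevalence with inverse-probability weights.

    A person selected with probability p stands for 1/p people in the population.
    """
    weights = [1.0 / p for p in selection_probs]
    weighted_cases = sum(w for w, y in zip(weights, has_condition) if y)
    return weighted_cases / sum(weights)

# Hypothetical participants: the first two were much more likely to be selected
has_condition = [True, True, False, False]
selection_probs = [0.01, 0.01, 0.002, 0.002]

unweighted = sum(has_condition) / len(has_condition)
weighted = weighted_prevalence(has_condition, selection_probs)
print(unweighted, weighted)
```

In this toy data the unweighted share is one half, while the weighted estimate is about one sixth, because the two people with the condition each represent far fewer people in the population. Without known selection probabilities this correction cannot be made.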
Stratification
• Grouping similar PSUs or individuals during the
sampling stage is called stratification
• Stratification generally improves the accuracy of
the estimates
An example country with stratification of PSUs (strata shown by separate colours)
2 PSUs selected in each stratum (shown in white)
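A minimal sketch of the stratified selection in the example, assuming PSUs have already been grouped into strata and that 2 PSUs are drawn at random from each. The stratum and PSU names are hypothetical.

```python
import random

def select_psus_by_stratum(strata, n_per_stratum=2, seed=None):
    """Select the same number of PSUs at random from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(psus, n_per_stratum) for name, psus in strata.items()}

# Hypothetical strata of similar PSUs
strata = {
    "north": ["N1", "N2", "N3", "N4"],
    "south": ["S1", "S2", "S3", "S4", "S5"],
    "urban": ["U1", "U2", "U3"],
}

selected = select_psus_by_stratum(strata, n_per_stratum=2, seed=42)
print(selected)
```

Because every stratum contributes PSUs, no type of area can be missed by chance, which is one reason stratification tends to improve accuracy.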
Biased samples
• A sample is biased if it does not reflect the
population; it will then tend to give wrong results
• Biased samples can result from:
• Samples that are not randomly taken from the
population
• Low response rates among certain groups of the sample
(e.g. people who are not well)
Figure: a population compared with a representative sample and a biased sample
Sample size
• A minimum sample size of 4000 is required in
countries implementing a multi-stage design for
EHES
• This is based on the accuracy required with response
rates of 70%
• Based on a minimum of 500 in each of the 8 sex/age
groups (men and women aged 25-34, 35-44, 45-54 and 55-64 years)
• A one-stage design allows a reduction in sample size
• Sub-national estimates will most probably require a
larger sample size
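The arithmetic behind the minimum of 4000 can be checked directly, and a rough idea of the accuracy in a single group can be computed. The confidence-interval part is a sketch under strong assumptions that are not stated in the slides: simple random sampling, 500 measured people in the group, and a true prevalence of 50%. Clustering would usually make the interval wider.

```python
import math

sexes = ["men", "women"]
age_groups = ["25-34", "35-44", "45-54", "55-64"]
min_per_group = 500

n_groups = len(sexes) * len(age_groups)   # 8 sex/age groups
min_total = n_groups * min_per_group      # 8 * 500 = 4000

def ci_half_width(p, n, z=1.96):
    """Approximate 95% confidence-interval half-width for a proportion.

    Assumes simple random sampling (no clustering, no weighting).
    """
    return z * math.sqrt(p * (1 - p) / n)

half_width = ci_half_width(0.5, min_per_group)
print(n_groups, min_total, round(half_width, 3))
```

Under these assumptions, a prevalence of 50% in one group would be estimated to within roughly 4 to 5 percentage points.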
Sample allocation
• How to allocate the sample among the Primary
Sampling Units is a balance between resources
and accuracy
• We recommend using the EHES program in R
and/or consulting a specialist survey statistician
Design                 Accuracy of estimates   Cost
No clusters            Very good               High
Many small clusters    Medium                  Medium
Few large clusters     Low                     Low
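The trade-off between cluster size and accuracy can be illustrated with Kish's design-effect approximation, deff = 1 + (m - 1) * rho, where m is the number of people per cluster and rho is the intra-cluster correlation. This is a textbook approximation for equal-sized clusters, not the calculation used by the EHES R program, and the value of rho below is hypothetical.

```python
def design_effect(cluster_size, icc):
    """Kish's approximate design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n, cluster_size, icc):
    """Size of a simple random sample that would give similar accuracy."""
    return n / design_effect(cluster_size, icc)

n = 4000
icc = 0.02  # hypothetical intra-cluster correlation

no_clusters = effective_sample_size(n, 1, icc)     # 4000 people sampled individually
many_small = effective_sample_size(n, 20, icc)     # 200 clusters of 20 people
few_large = effective_sample_size(n, 200, icc)     # 20 clusters of 200 people

print(round(no_clusters), round(many_small), round(few_large))
```

The same 4000 people are worth fewer and fewer "independent" observations as they are packed into larger clusters, which is the accuracy side of the balance; the cost side depends on local fieldwork costs and is not modelled here.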
General sampling tips
• Sampling using multi-stage designs can be
complicated; however, it can reduce overall costs
while maintaining control over the accuracy of
estimates
• An add-on package for the statistical software "R"
has been developed as a tool for sampling in
EHES and is freely available
Acknowledgements
• Slides: Susie Jentoft and Johan Heldal