Sampling for EHES
EHES Training Material

Ideal target population
• The core target population for EHES is all adults aged 25 to 64 who reside in the country
• The age range can be extended by individual countries
• Institutionalized people should be included
• Temporary visitors are not included

Main sampling frame
• The main sampling frame is the list of people or addresses from which the sample is taken
• An ideal list:
  • Is updated regularly
  • Includes everyone in the target population
  • Contains contact information
• In reality, add-on lists may be necessary (especially for those in institutions)

Sampling designs
• A sample is taken to represent the population as a whole, as we do not have the resources to survey everybody
• We recommend (for most countries) a multi-stage design, which reduces costs and resources by clustering participants into manageable areas known as Primary Sampling Units (PSUs)

[Figure: An example country with random sampling]
[Figure: Clustering participants reduces costs]

What is a multi-stage sample? Stage 1
• The country is divided into Primary Sampling Units (PSUs)
• A number of these are selected randomly
[Figure: An example country]

What is a multi-stage sample? Stage 2 (sampling from a population register)
• Within each selected PSU, people from the population register are selected randomly
[Figure: An example country]

What is a multi-stage sample? Stage 2 (sampling from a household list)
• Within each selected PSU, households from a household list are selected randomly
[Figure: An example country]

What is a multi-stage sample? Stage 3 (all household members)
• Within each selected household, we select all household members
[Figure: An example country]

What is a multi-stage sample? Stage 3 (one person per household)
• Within each selected household, we select 1 person
[Figure: An example country]

What is random selection?
• Selecting a person randomly means that they are selected entirely by chance
• We can calculate how likely someone is to be selected, but we cannot calculate whether they actually will be selected; this is the random part

Why random selection?
• To estimate the health of the population, we need to know everyone’s chances of being selected/invited
• This is only possible with random selection (believe it or not)
• Replacing someone who does not want to or cannot participate with somebody else means we no longer have a random sample and cannot estimate health figures accurately from the data

Stratification
• Grouping similar PSUs or individuals during the sampling stage is called stratification
• Stratification generally improves the accuracy of the estimates
[Figure: An example country with stratification of PSUs (strata shown by separate colours); 2 PSUs selected in each stratum (shown in white)]

Biased samples
• A sample is biased if it does not reflect the population; it will tend to give wrong results
• Biased samples can result from:
  • Samples that are not randomly taken from the population
  • Low response rates among certain groups of the sample (e.g. people who are not well)
[Figure: Population and sample, contrasting a representative sample with a biased sample]

Sample size
• A minimum sample size of 4000 is required in countries implementing a multi-stage design for EHES
• This is based on the accuracy required, assuming a response rate of 70%
• It is based on a minimum of 500 in each of the 8 sex/age groups (men and women aged 25-34, 35-44, 45-54 and 55-64 years)
• A one-stage design allows a reduction in sample size
• Sub-national estimates will most probably require a larger sample size

Sample allocation
• How to allocate the sample among the Primary Sampling Units is a balance between resources and accuracy
• We recommend using the EHES program in R and/or a specialist survey statistician

Design                 Accuracy of estimates   Cost
No clusters            Very good               High
Many small clusters    Medium                  Medium
Few large clusters     Low                     Low

General sampling tips
• Sampling with multi-stage designs can be complicated, but it can reduce overall costs while maintaining control over the accuracy of estimates
• An add-on package for the statistical software "R" has been developed as a tool for sampling in EHES and is freely available

Acknowledgements
• Slides: Susie Jentoft and Johan Heldal
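
Worked example (illustrative sketch only)
The multi-stage, stratified design described in these slides can be sketched in a few lines of Python. This is not the EHES R package and does not reproduce its interface. The function names, the toy population and the use of simple random sampling at both stages are assumptions made purely for illustration; real surveys often select PSUs with probability proportional to size. The sketch shows why random selection matters: each person's chance of selection can be calculated exactly, and its inverse gives a design weight.

```python
import random
from fractions import Fraction


def toy_population(n_strata=2, psus_per_stratum=4, people_per_psu=50):
    """Build a hypothetical country: strata -> PSUs -> lists of person ids."""
    return {
        f"stratum{s}": {
            f"s{s}-psu{p}": [f"s{s}-psu{p}-person{i}" for i in range(people_per_psu)]
            for p in range(psus_per_stratum)
        }
        for s in range(n_strata)
    }


def two_stage_sample(strata, n_psus, n_persons, seed=None):
    """Stratified two-stage sample using simple random sampling at each stage.

    Assumption for illustration: equal-probability selection at both stages.
    Returns one record per selected person, including the inclusion
    probability and the design weight (1 / probability).
    """
    rng = random.Random(seed)
    sample = []
    for stratum, psus in strata.items():
        # Stage 1: randomly select PSUs within this stratum.
        chosen_psus = rng.sample(sorted(psus), n_psus)
        p_psu = Fraction(n_psus, len(psus))
        for psu in chosen_psus:
            people = psus[psu]
            m = min(n_persons, len(people))
            # Stage 2: randomly select people within the selected PSU.
            selected = rng.sample(people, m)
            p_person = Fraction(m, len(people))
            prob = p_psu * p_person  # overall chance of being selected
            for person in selected:
                sample.append({
                    "stratum": stratum,
                    "psu": psu,
                    "person": person,
                    "prob": prob,
                    "weight": 1 / prob,
                })
    return sample


if __name__ == "__main__":
    population = toy_population()
    sample = two_stage_sample(population, n_psus=2, n_persons=10, seed=1)
    print("sample size:", len(sample))
    print("inclusion probability:", sample[0]["prob"])
    print("sum of weights:", sum(r["weight"] for r in sample))
```

In this toy setting every person has the same selection probability (2/4 × 10/50), so the design weights add up to the population size. Replacing a non-respondent with a hand-picked substitute would break this calculation, because the substitute's chance of selection is unknown.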