Methodology Glossary Tier 1

Sampling

Obtaining the best estimate

Sampling is the process by which a feature of interest (or parameter) relating to a group of interest (or population) is estimated by measuring its value in a smaller but representative subgroup (or sample). The aim of sampling is to produce estimates (or statistics) that are as close as possible to the real value in the population. Here are some hypothetical examples:

Example 1
SAMPLE: 1,000 people selected randomly from the electoral roll
STATISTIC: Average peppermint consumption = 3.44 per week
POPULATION: The electoral roll in Scotland
PARAMETER: True average = 3.39 peppermints per week

Example 2
SAMPLE: 6,000 participants in the Labour Force Survey in Scotland
STATISTIC: Unemployment rate = 3.6% in Spring 2011
POPULATION: All working age people in Scotland (April 2011 Census)
PARAMETER: Unemployment rate = 3.57% on Census day (April 2011)

Example 3
SAMPLE: 350 new admissions to Glasgow hospitals in January 2010
STATISTIC: Average age = 37 years
POPULATION: All inpatients in Glasgow hospitals
PARAMETER: Average age = 39. years

Sampling is necessary when it is impossible or impractical to look at every individual member of the population, which in reality is most of the time. However, estimates obtained from samples can never perfectly match the true population parameters, because information is missing for the population members who were not sampled. The mismatch between the sample statistic and the population parameter is called the sampling error. The aim of a successful sampling exercise is to minimise this sampling error, so that we can be confident the statistic is a good enough estimate to be useful. If the sampling error is too great, a statistic may be unreliable or misleading.

Sampling error depends essentially on three things:

(i) Sample size. Increasing the sample size makes samples more representative of the population and reduces sampling error.

(ii) Variability in the population. High variability in the population will lead to high variation from sample to sample, i.e. high sampling error; sample statistics may then not be a very reliable guide to the feature of interest (e.g. peppermint consumption) in the population.

(iii) Sampling method. This is the method by which the members of the sample are chosen. A good sampling method will provide a sample that is representative of the population of interest.

Sampling methods

Simple random sampling - The ideal method, where each individual in the population has the same chance of being selected. While simple random sampling represents the best theoretical approach, it is not the best method in every situation. If a population is strongly grouped in some way and the size of the sample is very limited (perhaps for reasons of cost or accessibility), some groups may not be adequately represented in the sample (e.g. old people, reindeer farms, isolated communities) and stratified random sampling may be better.

Stratified random sampling - The representation of clearly defined groups in a relatively small population can be improved by stratified random sampling, whereby random sampling is undertaken separately within each group (or stratum) of interest within the population.

The main advantage of the simple random and stratified random methods is that they minimise sampling error and usually produce the most representative samples of the population of interest. Furthermore, the calculation of sampling error (used to judge the reliability of a statistic as an estimate of the corresponding feature in the population) is relatively easy for simple random and stratified random samples.

Cluster sampling - This method of sampling is useful when the population is divided into clusters (such as towns, industries or postcode sectors) and when the amount of sampling that can be undertaken is limited. First, a representative sample of clusters is selected; then the sample units (such as households) are randomly chosen within each selected cluster.
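The three random selection methods described so far can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names, the made-up staff list and the sample sizes are all assumptions chosen for the example, not part of any survey's actual procedure.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, rng):
    """Every member of the population has the same chance of selection."""
    return rng.sample(population, n)


def stratified_random_sample(population, stratum_of, n_per_stratum, rng):
    """Random sampling carried out separately within each stratum."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Take at most n_per_stratum members from this stratum.
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


def cluster_sample(population, cluster_of, n_clusters, n_per_cluster, rng):
    """Pick clusters at random, then pick units at random within each one."""
    clusters = defaultdict(list)
    for member in population:
        clusters[cluster_of(member)].append(member)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sample


# Hypothetical population: 20 branches with 30 staff each (3 managers, 27 clerks).
rng = random.Random(42)  # fixed seed so the example is repeatable
staff = [
    {"id": b * 100 + i,
     "branch": f"branch-{b:02d}",
     "grade": "manager" if i < 3 else "clerk"}
    for b in range(20)
    for i in range(30)
]

srs = simple_random_sample(staff, 60, rng)
strat = stratified_random_sample(staff, lambda s: s["grade"], 30, rng)
clus = cluster_sample(staff, lambda s: s["branch"], 4, 15, rng)

# All three samples contain 60 people, but they are spread very differently.
print(len(srs), len(strat), len(clus))
print("branches visited under cluster sampling:", len({s["branch"] for s in clus}))
```

In this sketch the stratified sample is guaranteed to contain 30 managers even though managers make up only a tenth of the staff, whereas the simple random sample may contain very few of them. The cluster sample needs visits to only 4 of the 20 branches.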
The advantages of cluster sampling might also include reduced travelling costs for the survey data collectors when the clusters are widely scattered geographically.

Example. The selection of bank employees to take part in a staff training survey. Under cluster sampling, branches of the bank would be chosen at random and then individual staff members would be chosen at random from within each of those branches. This method would remove the need to conduct interviews at, or gather survey questionnaires from, all branches, with the financial and time costs that this would entail.

Quota sampling - Quota sampling is most often used when lower cost or faster results are required and it is less important to have a sample that represents the whole population. In quota sampling the selection is not random but is based on predetermined allocations from specified subgroups of the population. For example, a researcher may be told to interview 100 males and 100 females, with 50% of each group aged below 21 years. There is no obligation to select randomly within each of these groups, only to meet the quota, and so the sample may not be as representative of the population as one obtained by the other sampling methods. The researcher may decide to interview each person of each age/sex as they are encountered, or perhaps each second person encountered, or to use some other convenient criterion (such as how helpful they look!). The non-random nature of this method makes it impossible to estimate the sampling error and calculate confidence intervals, but this may be less important than other priorities such as speed or targeting a specific group.

Example. A short pilot study to compare the effectiveness of several advertising campaigns for stair-lifts would not be concerned about the impact of each campaign on people of all ages but might focus on people aged over 70 years seen using walking sticks.
An interviewer may be given a quota of speaking to the first 100 people in this category emerging from each of a city's shopping centres over a one-week period. A random selection process would be too time-consuming and unnecessary given the very specific target group.

The goal is to obtain a representative sample of the population of interest within the practical constraints (usually economic) on the amount of sampling that can be undertaken. Care must be taken that the results of the analysis are not extended to any part of a population that is not represented in the survey.

Further Information
Tier 2: Stratified Random Sampling | Cluster Sampling
Link: Office for National Statistics methodology pages