MICS WS1

Multiple Indicator Cluster Surveys
Survey Design Workshop
Sampling: Advanced Sampling

Major steps in designing MICS sample
• Define objectives
  – Key indicators
  – Desired level of precision
  – Sub-national domains of estimation
• Identify most appropriate sampling frame
  – Most recent census of population and housing
  – Master sample or sample for another survey conducted recently

Major steps in designing MICS sample (continued)
• Determine sample size and allocation
  – Determine availability of previous MICS or DHS results to provide measures of sampling parameters

Sampling Frame
• Sampling frame:
  – Nationally representative
  – Complete coverage
  – Measures of size (households or population) for small area units
• Generally the most recent census is the most effective sampling frame

Sampling Frame (continued)
• In some cases a more recent pre-census listing may be available
• When no census is available, identify the most complete geographic frame available (e.g. list of villages/localities with estimated population)

Sampling Frame (continued)
• Common problems with area frames:
  – Coverage issues
  – Census maps of poor quality
  – Errors and changes in area boundaries
  – Inappropriate type and size of area units
  – Lack of auxiliary information

SAMPLE SIZE DETERMINATION
n = [4 · r · (1 − r) · deff] / [e² · pb · AveSize · RR]
• n is the required sample size (number of households)
• 4 is a factor to achieve the 95 percent level of confidence
• r is the predicted or estimated value of the indicator in the target population
• deff is the design effect
• RR is the response rate
• pb is the proportion of the target subpopulation in the total population (upon which the indicator, r, is based)
• AveSize is the average household size (that is, average number of persons per household)
• e is the margin of error to be tolerated at the 95% level of confidence
• Currently, e = 0.12r [defined as 12% of r; in this case the relative standard error of r is 6%, because e = 2 × standard error of r]

Previously in MICS2
• 2 different values for margin of error
  – Margin of error was 5 percentage points for high values of r (over 25%)
  – Margin of error was 3 percentage points for low values of r (25% or less)
• Difficulty for users in deciding on the sample size for their surveys

MICS template for sample size calculation - EXCEL FILE

Selection of key indicators
• Choose an important indicator that will yield the largest sample size
• Step 1: Select 2 or 3 target populations, each representing a small percentage of the total population (pb); typically
  – Children 12-23 months: 2-4%, or
  – Children under 5 years: 7%-20%

Selection of key indicators (continued)
• Step 2: Review important indicators for these target groups, but ignore indicators with very low or very high prevalence (less than 10% or over 40%, respectively)
• Among indicators for which low coverage is desirable, do not choose one that is already acceptably low
• Do not choose childhood and maternal mortality ratios

Explicit Stratification
• Explicit stratification: dividing the sampling frame into sub-groups (called strata) of homogeneous (similar) PSUs
• Advantages:
  – Better precision, because variance within each stratum is reduced when units are similar
  – Flexible design, sub-national estimates for smaller domains (differential sampling rates)
• Example of stratification: region, urban/rural

Implicit Stratification
• Sort the sampling frame according to certain characteristics such as regions, urban-rural residence, sub-regions, districts, etc., then select a systematic PPS sample
• Ensures a representative sample for each subgroup
• Automatically provides proportional allocation by size of subgroup

Allocation of sample to strata/domains
• Proportional allocation
  – Effective for precision of estimates at the national level
• Equal allocation to each domain
  – Used when each domain requires the same level of precision
• Optimum allocation
  – Takes into account differential variance and costs by stratum
  – For example, variability may be higher in urban areas and enumeration costs may be higher in rural areas: use a higher sampling rate for urban areas

Subnational estimates
• The number of separate areas (domains) for which separate, equally reliable estimates are wanted affects sample size
• For example, if 10 regional estimates are wanted, theoretically the sample should be increased by a factor of 10
• As a compromise, larger sampling errors are accepted for subnational estimates
  – One proposal (by Dr. Vijay Verma): increase national sample size by a factor of D^0.65, where D is the number of domains (about 4.5 for D = 10)
  – Results in an average increase in the sampling errors for domain estimates by a factor of about 1.5

Sampling Stages
• Ideal to have a two-stage sample design, with EAs defined as PSUs
• In some countries only a frame of larger administrative units is available
  – Three-stage sample design: larger area units selected as PSUs
  – Necessary to delineate smaller segments in each sample PSU

Number of PSUs and Cluster Size
• Survey costs depend not only on the number of households but also on their distribution among primary sampling units (PSUs)
• Important to determine an effective balance between the number of sample PSUs and the number of sample households per cluster
• In general, the more PSUs the better for reliability, but the greater the cost (mostly costs of travel and listing)

Number of PSUs and Cluster Size (continued)
• Example: 8000 households selected in 400 PSUs of 20 sample households each is a much more reliable sample than 200 PSUs of 40 households each, but more expensive
• Number of sample households per cluster should be as small as practical for reliability
• A range of 15-25 households for MICS appears to be effective

Design Effect (DEFF)
• Deff: ratio of the variance of an estimate under the stratified multi-stage sample design to the corresponding variance from a simple random sample of the same size
• Measure of the relative efficiency of the sample design
• Effective stratification reduces the deff
• Cluster sampling increases the deff

Design Effect (DEFF) (continued)
• In the case of cluster sampling, deff generally measures the effect of clustering:
  deff = 1 + δ(m̄ − 1)
• δ = intraclass correlation coefficient, or measure of homogeneity within cluster
• m̄ = average number of households per cluster
• Design effect increases with intraclass correlation and cluster size

First Stage Selection of PSUs
• Standard methodology for MICS and other household surveys: select EAs or clusters systematically with PPS
• Important to sort the frame before selection, in order to ensure effective implicit stratification
• Traditional procedure: cumulate measures of size, determine sampling interval and random start, generate selection numbers

Large sample PSUs in PPS sampling
• Sometimes a PSU may have a measure of size larger than the sampling interval
• Such a PSU may be selected more than once in the systematic PPS selection
• Option 1: if the PSU is selected two or more times, multiply the number of households to be selected by the number of “hits”
• Option 2: separate the large PSUs and include them in the sample with a probability of 1

MICS Sampling Option 1 – new sample with household listing
• Design new MICS sample
• Two stages with census as frame
• Use of implicit stratification, systematic selection of census EAs at first stage with PPS
• List households in selected EAs/segments
• Select households systematically from listing
• Interview selected households; no replacement will be allowed

Sampling Option 1 - continued
• Advantages of option 1
  – Simple design
  – Probability-based
  – If possible, self-weighting (national level)
• Limitations of option 1
  – Expense of listing households
  – Time necessary to list households
  [Example: a sample size of 5000 households may require 25000 to 50000 households to be listed]

MICS Sampling Option 2 – use an existing sample
• Design MICS as a rider to another survey if timely and feasible
• Use sample from a previous survey and re-interview households for MICS
• Or, use old survey sample EAs and construct a new listing of households to select for MICS
• Old sample must be probability-based and national in scope
• Possibilities: DHS, other national health survey, recent labour force survey
• Important: design parameters must be known (such as selection probability, stratification, etc.)

Sampling option 2 - continued
• Use of existing master sampling frame
• Some countries use a master sample design for intercensal national household surveys
• Master samples are generally sufficiently large for MICS; a subsample of PSUs can be selected
• Advantage: updated maps may be available for the master sample of PSUs, and perhaps an updated listing

Sampling option 2 - continued
• Advantages of using previous sample
  – Cost savings
  – Maps available for interviewers
  – Appropriate sampling plan available
  – Simplicity
• Limitations of using old sample
  – Burden on respondents
  – Sample design may need modification
    * sample size
    * sub-national coverage
    * number of PSUs or clusters
• Balance between loss and gain

Listing and Selection of Households
• Household listing manual is available
• Importance of new listing to represent current population
• Problems with using a previous listing (older than 1 year)
  – Does not represent newer households
  – Distribution of sample population by age group is distorted, generally with higher median age
  – Difficulty of finding households in old list

Listing and Selection of Households (continued)
• MICS recommends a separate household listing operation
  – More reliable, as listing staff are less likely than interviewers to bias the sample by excluding households that are difficult to reach
  – Allows household selection to be done in a single central location using reliable and uniform procedures

Listing and Selection of Households (continued)
• Household selection in the office:
  – Advantages: conducted by specialized staff, possible to avoid selection bias in the field, possible to control overall sample size
  – Disadvantage: increased costs from having two field visits
• Selection in the field: use household selection table
  – Advantage: cost savings of having one integrated field operation
  – Disadvantage: correct sampling may be difficult for field staff, selection may be biased

Listing and Selection of Households (continued)
• Excel template for automatically generating the sample of households based on the number of households listed (see spreadsheet)
• Common problems found in listing operations
  – Poor quality of sketch maps, making it difficult to determine segment boundaries
  – Sometimes large differences are found between the number of households in the frame (census) and the number listed
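
The systematic selection of households from a cluster listing can be sketched in Python. This is a minimal illustration, not the MICS Excel template: the function name `select_households`, the fractional sampling interval and the way the random start is drawn are all assumptions made for the example.

```python
import math
import random


def select_households(n_listed, n_sample, rng=None):
    """Systematically select households from a cluster listing.

    Hypothetical helper (not part of any MICS tool). Returns the 1-based
    serial numbers of the selected households in listing order.
    """
    if n_listed <= 0 or n_sample <= 0:
        return []
    # If the listing is no larger than the take, every household is selected.
    if n_sample >= n_listed:
        return list(range(1, n_listed + 1))

    rng = rng or random.Random()
    interval = n_listed / n_sample      # fractional sampling interval (> 1)
    start = rng.random() * interval     # random start in [0, interval)

    selected = []
    for k in range(n_sample):
        serial = math.floor(start + k * interval) + 1
        # Clamp guards against floating-point rounding at the upper end.
        selected.append(min(serial, n_listed))
    return selected


if __name__ == "__main__":
    # Example: 120 households listed in the cluster, 20 to be interviewed.
    print(select_households(120, 20, random.Random(2024)))
```

With 120 listed households and a take of 20 the interval is 6, so the selected serial numbers are spaced 6 apart from a random starting point. Running the same routine centrally for every cluster is one way to keep the selection uniform and the overall sample size under control.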
Sampling strategy for low fertility countries
• In MICS 4 and 5, some low fertility countries are using second-stage stratification of the listing by households with and without children under 5
• Higher sampling rate used for households with children
• Increases the number of households with children in the MICS sample, and therefore the number of sample children

Sampling strategy for low fertility countries (continued)
• Improves the reliability of the child indicators without increasing the sample size to a very high level
• This procedure also increases the variability in the weights and the design effects for the overall sample
• Important to avoid very large variability in the weights for households with and without children
  – Differential weights between households with and without children generally should not exceed a factor of about 4

Implications of sampling strategy on sample size calculations
• One parameter in the sample size calculation template is the proportion of the indicator subpopulation
• Using a higher sampling rate for households with children increases the proportion of children under 5 in the sample
• The proportion of children under 5 (or smaller age groups) should be multiplied by a factor that reflects the increase in sample households with children

Implications of sampling strategy on weighting procedures
• Under the normal MICS sample design, weights vary by sample cluster
• With second-stage stratification by households with and without children, two weights need to be calculated for each cluster: one for households with children and one for households without children

Survey weighting procedures
• Survey data are collected using a complex design featuring clustering, unequal probabilities of selection and stratification:
  – All analyses must apply survey weights in order to prevent biased results
• Formulas for calculating weights depend on the exact sample design used in each country
• MICS has 4 sets of weights: households, women, men and children

Survey weighting procedures (continued)
• Components of MICS survey weights:
  – Design weight: inverse of the final probability of selection for households
  – Adjustment factors for nonresponse (cluster, household, woman, child level)
• Weights are normalized so that the total weighted number of observations is equal to the total unweighted number (sample size)

Sampling Error Estimation
• Necessary to evaluate reliability of survey estimates
• Possible only when probability sampling is used
• Should be done for 30-50 important indicators
• Methodology is complex and design-specific
• Several software packages:
  – SPSS Complex Samples module, used in MICS
  – SAS, Stata, SUDAAN, Clusters, WesVar, CENVAR, PCCarp, etc.
• Standard error, confidence intervals and DEFF

Sampling Error Estimation: SPSS Complex Samples module
• Advantages:
  – Simple to use
  – Template syntax available for standard indicators
  – Supported by MICS Global and Regional staff
• Steps:
  – Set up sampling parameter specifications file (csplan)
  – Define variables for stratum, PSU and weight

Sampling Error Estimation: SPSS Complex Samples module (continued)
• Stratum should be the lowest level of explicit stratification (for example, province, urban/rural)
• Necessary to have a minimum of two sample PSUs per stratum

Reducing bias
• Accuracy of survey results depends on both variance and bias (mostly from nonsampling errors)
• Bias should be minimized with quality control for all survey operations
• Basic data quality is determined during enumeration
  – Important to have good training and supervision in the field
• Data capture should include 100% or sample verification
• Important to have quality control for editing and coding procedures
• Computer consistency and range checks
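
Returning to the weight components listed under Survey weighting procedures (design weight, nonresponse adjustment, normalization), the arithmetic for household weights can be sketched as follows. This is an illustration under stated assumptions, not the MICS weighting template: the function name, the dictionary keys, the two-stage selection probabilities `p1` and `p2`, and the choice to adjust for nonresponse within each cluster are all hypothetical. The actual formulas depend on the sample design used in each country.

```python
def household_weights(clusters):
    """Compute normalized household weights, one per cluster.

    Hypothetical sketch. Each cluster is a dict with:
      p1        - first-stage selection probability of the cluster (assumed known)
      p2        - second-stage selection probability of a household in it
      sampled   - number of households selected in the cluster
      completed - number of households with a completed interview (must be > 0)
    """
    raw = []
    for c in clusters:
        # Design weight: inverse of the overall probability of selection.
        design = 1.0 / (c["p1"] * c["p2"])
        # Nonresponse adjustment, assumed here to be made within the cluster.
        adjustment = c["sampled"] / c["completed"]
        raw.append(design * adjustment)

    # Normalize so that the weighted number of completed households
    # equals the unweighted number (the achieved sample size).
    total_completed = sum(c["completed"] for c in clusters)
    weighted_total = sum(w * c["completed"] for w, c in zip(raw, clusters))
    scale = total_completed / weighted_total
    return [w * scale for w in raw]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any survey.
    example = [
        {"p1": 0.02, "p2": 0.10, "sampled": 20, "completed": 18},
        {"p1": 0.05, "p2": 0.20, "sampled": 20, "completed": 20},
    ]
    print(household_weights(example))
```

Under second-stage stratification by households with and without children, the same calculation would be run twice per cluster, once for each second-stage stratum with its own `p2`.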
