SAMPLING DESIGN

CENSUS VERSUS SURVEY
Census
  Collects information about every member of the population.
  Is more detailed and accurate.
  Takes a long time to complete.
  Is generally conducted by the government.
  Is not conducted frequently.
Survey
  Collects information from a sample of the population.
  Is not as accurate or reliable as a census.
  Can be done in a shorter period of time than a census.
  Can be conducted by anyone.
  Can be conducted more frequently.

SAMPLE
A sample is “a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field, 2005).

SAMPLING
[Figure: target population, study population, sample]

WHY SAMPLING?
Get information about large populations with:
  Lower cost.
  Less field time and a smaller budget (maximize information with minimum resources).
  More accuracy, i.e. a better job of data collection can be done.
Sampling is also needed when it is impossible to study the whole population, for example because of topographic limitations, or when the sampling process is destructive.

POPULATION FRAME
A list, map, directory, or other source used to represent the population.
Overregistration: the frame contains all members of the target population and some additional elements. Example: using the chamber of commerce membership directory as the frame for a target population of member businesses owned by women.
Underregistration: the frame does not contain all members of the target population. Example: using the chamber of commerce membership directory as the frame for a target population of all businesses.

THE SAMPLING DESIGN PROCESS
  1) Define the population.
  2) Determine the sampling frame.
  3) Select the sampling technique(s).
  4) Determine the sample size.
  5) Execute the sampling process.

FACTORS TO CONSIDER IN SAMPLING DESIGN
  Work objectives
  Degree of accuracy
  Resources
  Time frame
  Knowledge of the population
  Scope
  Statistical analysis needs

Classification of Sampling Techniques
Nonprobability sampling techniques: the chances (probability) of selecting members from the population are unknown.
Probability sampling techniques: members of the population have a known chance (probability) of being selected.

Nonprobability Sampling Techniques
  Convenience sampling
  Judgmental sampling
  Quota sampling
  Snowball sampling

NON-PROBABILITY SAMPLING
Convenience sampling: selection is based on ease of access (being in the right place at the right time).
Judgmental (purposive) sampling: selection is based on the judgment of the researcher.
Snowball sampling: an initial respondent is selected, and subsequent respondents are selected based on referrals.
Quota sampling (a nonprobability analogue of stratified sampling): a two-stage restricted judgmental sampling (first develop quotas of population elements, then select elements to fill each quota).

SNOWBALL SAMPLING
[Figure: snowball sampling. The researcher has 3 contacts; each of the 3 contacts in turn contacts 3 of their own friends/contacts.]

Strengths and Weaknesses of Basic Sampling Techniques (Nonprobability Sampling)
Convenience sampling
  Strengths: least expensive; least time consuming; most convenient.
  Weaknesses: selection bias; sample not representative; not recommended for descriptive or causal research.
Judgmental sampling
  Strengths: low cost; convenient; not time consuming.
  Weaknesses: does not allow generalization; subjective.
Quota sampling
  Strengths: sample can be controlled for certain characteristics.
  Weaknesses: selection bias; no assurance of representativeness.
Snowball sampling
  Strengths: can estimate rare characteristics.
  Weaknesses: time consuming.

Probability Sampling Techniques
  Simple random sampling
  Systematic sampling
  Stratified sampling
  Cluster sampling

PROBABILITY SAMPLING
Simple random sampling: each element has a known and equal probability of selection.
Systematic sampling: a random starting point is chosen, and then every i-th element is picked.
Stratified sampling: a two-step process. The population is partitioned into subpopulations (strata), and then elements are selected from each stratum by a random procedure.
Cluster sampling: the population is first divided into clusters, and then a random sample of clusters is selected. For each selected cluster, either all the elements are included in the sample (one-stage) or a sample of elements is drawn probabilistically (two-stage).
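
The first two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the function names, the `seed` parameter, and the numbered population of 100 units are assumptions made for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def systematic_sample(population, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th unit."""
    units = list(population)
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position 0 .. interval - 1
    return units[start::interval]


# Hypothetical population: units numbered 1 to 100.
population = range(1, 101)
print(simple_random_sample(population, 10, seed=7))
print(systematic_sample(population, 5, seed=7))
```

Passing a seed only makes the draw repeatable for demonstration; in practice the draw would be left unseeded or the seed documented.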

Simple Random Sampling
Every unit has an equal, non-zero chance of being selected. This method is the purest form of probability sampling.
Advantages:
  Known and equal chance of selection (most representative).
  Easy to apply when there is an electronic database.
Disadvantages:
  Complete accounting of the population is needed (it is difficult to identify every member of a population).

Systematic Sampling
The defined target population is ordered and the sample is selected according to position using a skip interval (e.g., every 5th item in an alphabetized list, every 10th name in a phone book). Systematic sampling is frequently used to select a specified number of records from a computer file.
Advantages:
  Known and equal chance of selection for the units within each skip interval.
  Less expensive and faster than random methods.
Disadvantages:
  Loss in sampling precision: once the starting point is fixed, the remaining selections are determined, so not every combination of items has an equal chance of being selected; the system for selecting subjects may introduce systematic error; findings cannot be generalized beyond the population actually sampled.

SYSTEMATIC SAMPLING
Select every nth person (e.g. every 4th person). To find the frequency, use the formula:
  f = N / sn
where f = frequency interval; N = the total number of the wider population; sn = the required number in the sample.
Example: in a company of 1,500 employees a sample size of 306 is required (from tables of sample size for random samples). The formula gives:
  f = 1,500 / 306 ≈ 4.9
This rounds to 5, i.e. every 5th person.

Stratified Random Sampling
The population is separated into homogeneous strata and a sample is taken from each.
Stage 1: Divide the wider population into mutually exclusive homogeneous groups.
Stage 2: Randomly sample within these groups, the size of each group being determined by judgement or tables of sample size.
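
The two stages of stratified random sampling can be sketched as follows. Note the assumptions: the slide leaves the size of each group to judgement or tables of sample size, whereas this sketch simply takes the same fraction from every stratum (proportional allocation) as a stand-in. The function name, the `fraction` parameter, and the client list are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Stage 1: group units into strata. Stage 2: sample randomly within each."""
    rng = random.Random(seed)

    # Stage 1: partition the population into mutually exclusive groups.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    # Stage 2: random sample within each stratum. Proportional allocation
    # (with at least one unit per stratum) is an assumption of this sketch.
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical client list: (client id, region) pairs.
clients = (
    [(i, "South") for i in range(50)]
    + [(i, "North") for i in range(50, 80)]
    + [(i, "East") for i in range(80, 100)]
)
picked = stratified_sample(clients, stratum_of=lambda c: c[1], fraction=0.1, seed=42)
for region, members in picked.items():
    print(region, len(members))
```

Replacing the single `fraction` with a per-stratum size would cover the case where group sizes are set by judgement or taken from tables.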

STRATIFIED RANDOM SAMPLING
[Figure: a list of clients is divided into strata (South, North, East), and a random subsample of n/N is drawn from each stratum.]
Advantages:
  More accurate overall sample of a skewed population.
  Better at achieving representativeness on the control variable.
Disadvantages:
  More complex sampling plan requiring different sample sizes for each stratum.
  Difficult to pick appropriate strata.
  Difficult to identify every member of the population.

Cluster Sampling
The population is divided into groups (clusters), any of which can be considered a representative sample. Useful when it is difficult or costly to develop a complete list of the population members or when the population elements are widely dispersed geographically.

Types of Cluster Sampling
  1) Divide the population into clusters.
  2) Randomly sample clusters.
  3) One-stage: include all elements from each selected cluster. Two-stage: randomly sample elements from each selected cluster.
Advantages:
  Economic efficiency: faster and less expensive.
  Does not require a list of all members of the population.
Disadvantages:
  Cluster specification error (the more homogeneous the cluster chosen, the more imprecise the sample results).

PLANNING A SAMPLING STRATEGY
Stage One: Decide whether you need a sample, or whether it is possible to have the whole population.
Stage Two: Identify the population, its important features (the sampling frame) and its size.
Stage Three: Identify the kind of sampling strategy you require (e.g. which variant of probability, non-probability, or mixed methods sample you require).
Stage Four: Ensure that access to the sample is guaranteed. If not, be prepared to modify the sampling strategy.
Stage Five: For probability sampling, identify the confidence level and confidence intervals that you require. For non-probability sampling, identify the people whom you require in the sample.
Stage Six: Calculate the numbers required in the sample, allowing for non-response, incomplete or spoiled responses, attrition and sample mortality.
Stage Seven: Decide how to gain and manage access and contact.
Stage Eight: Be prepared to weight (adjust) the data, once collected.

HOW TO DETERMINE SAMPLE SIZE?

HOW LARGE MUST MY SAMPLE BE?
The answer depends on:
  The number of strata required;
  The number of variables included in the study;
  The variability of the factor under study;
  The kind(s) of sample;
  The representativeness of the sample;
  The allowances to be made for attrition and non-response;
  The need to keep proportionality in a proportionate sample;
  The kind of research that is being undertaken (qualitative/quantitative/mixed methods).

PROPORTION OF SAMPLE SIZE TO POPULATION
[Figure: chart comparing sample and population sizes; axis scale 0 to 6,000.]
Note: As the population increases, the proportion of the population in the sample decreases.

SAMPLE SIZE
Ensure a sufficiently large sample for each variable.
Samples in qualitative research must be large enough to generate ‘thick descriptions’.
A large sample does not guarantee representativeness; representativeness depends on the sampling strategy.
Sample size also depends on the heterogeneity or homogeneity of the population: if it is highly homogeneous then a smaller sample may be possible.
Large samples are preferable when:
  there are many variables;
  only small differences or small relationships are expected or predicted;
  the sample will be broken down into subgroups;
  the sample is heterogeneous in terms of the variables under study;
  reliable measures of the dependent variable are unavailable.

HOW TO DETERMINE SAMPLE SIZE? FACTORS TO CONSIDER
  The variability of elements in the target population
  The type of sample required
  Time available
  Budget
  Required estimation precision
  Whether the findings are to be generalized and, if so, with what degree of confidence

HOW TO DETERMINE SAMPLE SIZE?
STATISTICAL METHODS
When statistical methods are used to determine the sample size, three decisions must be made:
  1) the degree of confidence (the level of risk involved, often 95%);
  2) the specified level of precision (the amount of acceptable error, i.e. the maximum acceptable difference between the estimated sample value and the true population value);
  3) the amount of variability (population homogeneity, measured by the standard deviation).
The true SD is usually unknown and is mostly estimated from previous similar studies or a pilot study.

SAMPLE SIZE FOR LARGE POPULATION
  Sample size (SS) = (DC × TV / DP)^2
where
  DC is the number of standard errors for the degree of confidence specified for the research result;
  TV is true variability, the standard deviation of the population;
  DP is desired precision, the acceptable difference between the sample estimate and the population value.

Example: we wish to estimate the average monthly expenditure on eating out. Although the true standard deviation is unknown, a pilot study of 30 customers provides an estimate of the unknown standard deviation of $14. We desire to be 95 percent confident that our estimate of the mean monthly expenditure on eating out is within $2 of the true population mean. Assuming that the distribution of expenditure follows a normal distribution, the sample size is
  Sample size (SS) = (DC × TV / DP)^2 = (1.96 × 14 / 2)^2 ≈ 188.2
so about 189 respondents are needed (rounding DC up to 2 gives the round figure of 196).

IN SMALL POPULATION
When working with a small population, use of the above formula may lead to an unnecessarily large sample size. If the sample size is larger than 5 percent of the population, then the calculated sample size should be multiplied by the following correction factor:
  N / (N + n - 1)
where N = the population size and n = the calculated sample size determined by the original formula.

FOR SMALL SIZE
Suppose a bank has 5,000 ATMs installed in AA. The bank wishes to establish the customers' view of this service. A commissioned researcher estimates that, given the agreed criteria, the required sample size is 750.
This sample size is 15 percent of the population (750 / 5,000), which is larger than necessary for an efficient sample. In this case the sample size correction factor needs to be applied, as illustrated below.
  Adjusted sample size = 750 × (5,000 / (5,000 + 750 - 1)) ≈ 652.3, which rounds up to 653
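
The two calculations in this section, the large-population formula SS = (DC × TV / DP)^2 and the small-population correction factor N / (N + n - 1), can be checked with a short script. This is a minimal sketch: the function names are assumptions made for the example, and rounding up to the next whole respondent is a common convention rather than something the formulas themselves require.

```python
import math


def sample_size(dc, tv, dp):
    """SS = (DC * TV / DP)^2 for a large population."""
    return (dc * tv / dp) ** 2


def needs_correction(n, population_size):
    """The correction applies when the sample exceeds 5 percent of the population."""
    return n / population_size > 0.05


def corrected_sample_size(n, population_size):
    """Multiply n by the correction factor N / (N + n - 1)."""
    return n * population_size / (population_size + n - 1)


# Eating-out example: DC = 1.96 (95% confidence), TV = $14, DP = $2.
ss = sample_size(1.96, 14, 2)
print(round(ss, 2), math.ceil(ss))        # about 188.24, rounded up to 189

# With DC rounded up to 2 the result is the round figure of 196.
print(sample_size(2, 14, 2))

# Bank example: n = 750 from a population of 5,000.
print(needs_correction(750, 5000))        # 750 / 5000 = 0.15, above 0.05
adjusted = corrected_sample_size(750, 5000)
print(round(adjusted, 2), math.ceil(adjusted))  # about 652.29, rounded up to 653
```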