Introduction to Sampling for
the Implementation of PATs
Materials Developed by
The IRIS Center at the University of Maryland
Advantages of Sampling
In most cases, we do not want to survey EVERYONE
Why?
– Too costly
– Too time consuming
– Too many resources needed
Advantages of Sampling
To make our work more cost-effective:
-Interview the minimum number needed
-Reduce:
• time
• cost
• human error
Survey Sampling
According to sampling theory we can get valid
results from studying only a fraction (a sample) of
our clients, provided:
• the sample is REPRESENTATIVE of the
qualities of our client POPULATION, and
• of sufficient SIZE to satisfy the assumptions of
the statistical techniques used in our analysis
Simple Random Sampling
For the sample to be representative, it must be
obtained randomly.
It is a simple random sample if each item in
the population has an equal chance of
being selected.
Types of Bias in Survey Process
Poor randomization is not the only cause of
biased samples. Bias and error are more
often introduced by:
– poor group definition
– interviewer error
– inadequate records (incomplete or
outdated client lists).
Longitudinal Design
Longitudinal studies follow the same group of clients at
multiple points in time (at least two).
Often there is a baseline (when the client began
the program) and an endline (two years later,
for example).
Cross Sectional Design
Cross sectional studies compare multiple clients
in the program at one point in time.
Example: On October 1, 2005, the program looks at:
• Incoming clients
• 2-year clients
• 4-year clients
Calculating Sample Size:
How Big is Big Enough?
• Sample results are almost never identical to those of the
entire population
• The larger the sample of clients, the greater the
likelihood that the statistical analysis will yield
“significant” results that closely resemble the
entire client population.
Calculating Sample Size
Different Views:
• Statistician – maximalist – at least 500
• Field researcher – minimalist – at least 35 to 50 for
each subgroup we want to analyze and compare
• USAID PAT – at least 300
Trade-off: A larger sample is more accurate, but
costs more in time and money
To make generalizations about the entire
population, you need a total sample size of 200-400
(depending on the total population and the confidence
level desired)
Sample Size Calculator
• Creative Research Systems:
www.surveysystem.com/sscalc.htm
Population Size   Confidence Interval   Confidence Level   Sample Size
1,000             5                     95%                278
5,000             5                     95%                357
10,000            5                     95%                370
50,000            5                     95%                381
100,000           5                     95%                383
1,000,000         5                     95%                384
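The sample sizes above can be approximated with a standard formula for estimating a proportion, combined with a finite population correction. The sketch below is only an assumption about how such a calculator might work, not a description of the Creative Research Systems tool. The function name `sample_size`, the worst-case proportion p = 0.5, and z = 1.96 for a 95% confidence level are illustrative choices.

```python
def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Approximate sample size for estimating a proportion.

    population: total number of clients
    margin:     confidence interval as a fraction (0.05 means +/- 5 points)
    z:          z-score for the confidence level (1.96 is roughly 95%)
    p:          assumed proportion; 0.5 is the most conservative choice
    """
    # Sample size for a very large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # The finite population correction shrinks n0 for smaller populations.
    n = n0 / (1 + (n0 - 1) / population)
    return round(n)


if __name__ == "__main__":
    for size in (1_000, 5_000, 10_000, 50_000, 100_000, 1_000_000):
        print(f"{size:>9,} clients -> sample of {sample_size(size)}")
```

Under these assumptions the function gives 278 for 1,000 clients and 384 for 1,000,000 clients, in line with the calculator results listed above. Note how little the required sample grows once the population passes about 10,000.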
How to Sample Randomly?
RANDOM = giving each client an equal chance to be selected
This is done by:
• drawing numbers, as in a lottery
• numbering all clients and selecting numbers from a
random number table
• systematically, by selecting every ‘nth’ case from a
complete list of clients, starting from a randomly chosen case
DANGER!!!
The list may be biased by:
• who is left out: is the list up-to-date?
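As an illustration of the systematic option, here is a minimal sketch that takes every nth client from a complete list, beginning at a random position within the first interval. The function name `systematic_sample` and the client labels are hypothetical. The sketch assumes the list is complete and not ordered in a way that repeats at the same interval.

```python
import random


def systematic_sample(clients, sample_size, rng=None):
    """Select every nth client from a complete list, starting at a random point."""
    if not 0 < sample_size <= len(clients):
        raise ValueError("sample_size must be between 1 and the list length")
    rng = rng or random.Random()
    interval = len(clients) // sample_size   # the 'n' in 'every nth case'
    start = rng.randrange(interval)          # random start within the first interval
    return [clients[start + i * interval] for i in range(sample_size)]


if __name__ == "__main__":
    clients = [f"client-{i:03d}" for i in range(1, 101)]
    print(systematic_sample(clients, 10, random.Random(1)))
```

If the list length is not an exact multiple of the sample size, this simple version never reaches the clients at the very end of the list, so a lottery or random number table may be preferable in that case.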
Steps in Taking a Simple Random Sample
• Number a copy of the complete client list, and note
the total number of clients (the last number)
• Decide on your sample size
• Create a list of random numbers
• Use Excel or a random number table to select the
sample, matching the numbers from the table with
those on your numbered client list.
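The steps above can be sketched in Python instead of Excel or a printed table. This is one possible approach, not the procedure prescribed by the materials. The function name `simple_random_sample` and the `seed` parameter are assumptions added so that a draw can be repeated and documented.

```python
import random


def simple_random_sample(client_list, sample_size, seed=None):
    """Return (number, client) pairs drawn without replacement."""
    # Step 1: number the complete client list, starting at 1.
    numbered = list(enumerate(client_list, start=1))
    total = len(numbered)  # the last number on the list
    if sample_size > total:
        raise ValueError("sample size cannot exceed the number of clients")
    # Steps 2-3: draw distinct random numbers between 1 and total.
    rng = random.Random(seed)
    chosen_numbers = sorted(rng.sample(range(1, total + 1), sample_size))
    # Step 4: match the random numbers to the numbered client list.
    return [numbered[n - 1] for n in chosen_numbers]


if __name__ == "__main__":
    clients = [f"client-{i:03d}" for i in range(1, 501)]
    for number, client in simple_random_sample(clients, 5, seed=2005):
        print(number, client)
```

Recording the seed alongside the sample lets someone else reproduce exactly the same selection later.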
Stratified Sampling
To focus on specific subgroups, first classify the
population into several subpopulations, called
“strata,” then randomly sample from each
stratum (subgroup).
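A minimal sketch of this idea: classify the clients into strata, then draw a simple random sample inside each stratum. The record layout (dictionaries with an `area` field), the function name `stratified_sample`, and the choice of an equal number per stratum are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(clients, stratum_of, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` clients from each stratum.

    clients:     list of client records
    stratum_of:  function returning the stratum label for a client
    per_stratum: number of clients to draw from every stratum
    """
    rng = random.Random(seed)
    # Classify the population into strata.
    strata = defaultdict(list)
    for client in clients:
        strata[stratum_of(client)].append(client)
    # Sample randomly within each stratum.
    sample = {}
    for label, members in strata.items():
        if len(members) < per_stratum:
            raise ValueError(f"stratum {label!r} has only {len(members)} clients")
        sample[label] = rng.sample(members, per_stratum)
    return sample


if __name__ == "__main__":
    clients = [{"id": i, "area": "rural" if i % 3 == 0 else "urban"}
               for i in range(300)]
    result = stratified_sample(clients, lambda c: c["area"], 35, seed=1)
    print({label: len(members) for label, members in result.items()})
```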
Cluster Sampling
A way of selecting randomly when you have
a geographically dispersed population and
time is limited.
This method can help reduce the time and cost
of data collection.
Group the clients into clusters (could be
branches or loan groups). Randomly choose
some of the clusters. Then randomly sample
individuals from only the chosen clusters.
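The two stages described above can be sketched as follows. The function name `cluster_sample`, the branch names, and the decision to interview every client in a cluster smaller than the requested number are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster, seed=None):
    """Two-stage sample: pick clusters at random, then clients within them.

    clusters:    dict mapping a cluster name (e.g. a branch) to its client list
    n_clusters:  number of clusters to visit
    per_cluster: number of clients to interview in each chosen cluster
    """
    rng = random.Random(seed)
    # Stage 1: randomly choose which clusters to visit.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly choose clients inside each chosen cluster only.
    sample = {}
    for name in chosen:
        members = clusters[name]
        take = min(per_cluster, len(members))  # small clusters are taken whole
        sample[name] = rng.sample(members, take)
    return sample


if __name__ == "__main__":
    branches = {f"branch-{b}": [f"b{b}-client-{i}" for i in range(20)]
                for b in range(8)}
    for branch, picked in cluster_sample(branches, 3, 5, seed=2).items():
        print(branch, picked)
```

Interviewers then travel only to the chosen clusters, which is where the savings in time and cost come from.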
Stratified Sampling
• Stratified survey sampling enables you to
focus on specific groups (for example,
women or rural people), ensuring that they
will be represented in the sample. Although
random survey sampling, done correctly, will
give the researcher roughly proportional
samples of all groups, disproportional
stratified sampling will guarantee that a
certain group is adequately represented.
Parametric Statistics
• Assumes that the distributions of values for
your variables are normal (bell curve) and
relatively similar to each other.
• In parametric statistics, thirty is a “magic
minimum number”--meaning that it is
generally accepted as the minimum cell size
for each stratum or subgroup of a simple
sample.
Minimum for Each Subgroup
• 30 = ‘minimum magic number’ for each
subgroup
• To do any statistical analysis between
subgroups, need a minimum of 30 in each
subgroup in order to have any chance at all of
finding ‘significant’ differences.
BUT, 30 is NOT enough for your total sample.
If you want to compare between
subgroups, you need 35 in each cell
• Since the magic minimum number is 30, and
you may have some missing values in some
of your interview forms, for practical
purposes, you need to always have a
minimum number of 35 completed surveys for
each cell of the sampling frame.
Handling Sampling Problems in the Field
• If you cannot interview the client who is sampled
(not available, refusal, etc.), use an alternate
• Sample ‘at least’ an extra 40% so that alternates are
available to be interviewed in each area (subgroup)
• This helps ensure that you complete 35
questionnaires for each subgroup (if you plan to
do additional analysis and compare subgroups)
• It also makes better use of the interviewers’ time
Example of a Sampling Frame
Survey Sample                    Region 1   Region 2   Region 3   Total
Clients interviewed              112        100        88         300
Substitute sample (approx 40%)   45         40         35         120
Total                            157        140        123        420
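The arithmetic behind this sampling frame can be sketched as below: each region's planned interviews are increased by a substitute sample of about 40%, rounded to the nearest whole client. The function name `sampling_frame` and the dictionary layout are hypothetical. Only the regional figures come from the example.

```python
def sampling_frame(interviews_by_region, substitute_rate=0.40):
    """Add a substitute sample to the planned interviews for each region."""
    frame = {}
    for region, planned in interviews_by_region.items():
        substitutes = round(planned * substitute_rate)
        frame[region] = {
            "interviewed": planned,
            "substitutes": substitutes,
            "total": planned + substitutes,
        }
    return frame


planned = {"Region 1": 112, "Region 2": 100, "Region 3": 88}
frame = sampling_frame(planned)
for region, row in frame.items():
    print(region, row)
print("Total:", sum(row["total"] for row in frame.values()))
```

With the figures from the example this gives 45, 40, and 35 substitutes, for an overall total of 420 names.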
What if there are not enough with the 40% extra?
A. Ask the sample tracking coordinator to give you new
names
B. If there is no time, the field supervisor must adjust in the
field:
1) Use a random number table to select clients from the master
list who have not already been selected
2) If you do not have a random number table, ask
someone to pick a number between # and ## at random
3) Write down the changes that you made and how you did it
Do NOT introduce bias
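Where a laptop is available in the field, the same adjustment could be sketched in Python: draw replacements at random from the clients on the master list who have not already been selected, and keep a record of what was done. The function name `draw_replacements` and the structure of the log are assumptions, not part of the PAT procedure.

```python
import random


def draw_replacements(master_list, already_selected, how_many, seed=None):
    """Randomly pick new clients who have not been selected before."""
    selected = set(already_selected)
    # Only clients not yet in the sample or substitute list are eligible.
    eligible = [client for client in master_list if client not in selected]
    if how_many > len(eligible):
        raise ValueError("not enough unselected clients on the master list")
    rng = random.Random(seed)
    replacements = rng.sample(eligible, how_many)
    # Keep a record of the change and how it was made.
    log = {"requested": how_many, "seed": seed, "replacements": replacements}
    return replacements, log


if __name__ == "__main__":
    master = [f"client-{i:03d}" for i in range(1, 201)]
    used = master[:140]  # hypothetical: names already drawn for this region
    new_names, record = draw_replacements(master, used, 5, seed=7)
    print(new_names)
    print(record)
```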
An excerpt from a Random Number Table
32 81 45 13 64 38 95 97 50 34 11 56
61 09 76 30 92 70 49 08 65 93 09 78
46 46 20 38 94 01 00 89 24 99 50 43
30 49 24 23 69 27 86 12 17 43 54 44
48 95 16 11 51 06 15 66 93 04 75 01
54 09 04 18 77 69 80 21 45 24 34 71
87 59 55 41 85 42 41 83 47 71 33 13
41 23 58 08 17 30 98 87 22 23 61 21
29 74 93 08 96 21 05 74 36 42 66 47
26 65 09 18 55 36 76 98 64 14 82 91
You can also use: www.random.org/nform.html
IF YOU DON’T HAVE A CLIENT LIST
Random walk sampling -- less expensive but
more prone to bias
• Watch out for “tarmac bias”, selecting only
houses that are easily accessible from the
road
Example of BDS Sampling
• Investigative emphasis:
final beneficiaries.
– Will use three subsectors
(irrigation, cashews,
potable water).
– Will focus only on end
users of the technologies.
– Will focus on region
surrounding Ziguinchor.
Example of Business Development Services
Sampling
• Sample size = 200
• Casamance region is the focus
• Program has three sub-sectors
• Sample in each sub-sector stratified according to major
differences between types of clients
– Irrigation: individual owners and group owners
– Cashew processing: shellers and peelers
– Potable water: tubewells and rope pumps (rural and
peri-urban)
Example of BDS Sampling
• Generate a list of the direct clients and divide by
subgroup
• The number of clients per stratum or subgroup depends on
the percentage the stratum constitutes in the sector
• Select clients using a random number list
• Each direct client will provide information to the
interviewers so that they can create a list of end
users from which some will be chosen according to
a predetermined random number list.
BDS Sampling Framework Example
Subsector                                    Irrigation   Cashew   Potable Water   Total
Total Number of Beneficiaries                5,500        800      3,500           9,800
Percentage of Total Beneficiary Population   56%          9%       35%             100%
Total Number of Beneficiaries to be
interviewed for PAT implementation
(based on 300 + 40% extra, or 420)           235          38       147             420
Subsector       Type of Client   Percentage of Total   Number to be interviewed
Irrigation      Individual       65%                   153
Irrigation      Group            35%                   82
Cashew          Shellers         45%                   17
Cashew          Peelers          55%                   21
Potable Water   Tubewell         10%                   15
Potable Water   Rope Pump        90%                   132
Total                                                  420
Rope Pump       Rural            30%                   40
Rope Pump       Urban            70%                   92
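The allocation in this framework can be reproduced with a short sketch: each group's share of interviews is its percentage of the total, rounded to the nearest whole client. The helper name `allocate` is hypothetical. The percentages and the total of 420 are taken from the example.

```python
def allocate(total, shares):
    """Split `total` interviews across groups according to percentage shares."""
    return {group: round(total * pct / 100) for group, pct in shares.items()}


# Percentages as given in the framework example.
subsectors = allocate(420, {"Irrigation": 56, "Cashew": 9, "Potable Water": 35})
irrigation = allocate(subsectors["Irrigation"], {"Individual": 65, "Group": 35})
cashew = allocate(subsectors["Cashew"], {"Shellers": 45, "Peelers": 55})
water = allocate(subsectors["Potable Water"], {"Tubewell": 10, "Rope Pump": 90})
rope_pump = allocate(water["Rope Pump"], {"Rural": 30, "Urban": 70})

for name, result in [("Subsectors", subsectors), ("Irrigation", irrigation),
                     ("Cashew", cashew), ("Potable Water", water),
                     ("Rope Pump", rope_pump)]:
    print(name, result)
```

Rounding each group separately happens to add up to the intended totals here, but with other percentages the pieces can be off by one or two, so it is worth checking the sums. Several cells in this example also fall below the 35 completed surveys recommended for comparing subgroups.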