
CHAPTER THREE
Sampling Design
CENSUS
• All items in any field of inquiry constitute a ‘Universe’ or ‘Population.’
• A complete enumeration of all items in the ‘population’ is known as a
census inquiry.
• In principle, a census leaves no element of chance, so the highest accuracy is obtained.
• In practice this may not hold: even a slight bias in such an inquiry grows larger and larger as the number of observations increases.
• Moreover, this type of inquiry involves a great deal of time, money and energy.
• Therefore, when the field of inquiry is large, this method becomes difficult to
adopt because of the resources involved
• Further, it is often not possible to examine every item in the population.
• In such cases it is better to obtain sufficiently accurate results by studying only a part of the total population.
• However, it needs to be emphasized that when the universe is a small one, it is no use resorting to a sample survey.
• When field studies are undertaken in practical life, considerations of time and
cost almost invariably lead to a selection of only a few items.
• The respondents selected should be as representative of the total population as
possible
• The selected respondents constitute what is technically called a ‘sample’ and the
selection process is called ‘sampling technique.’
• The survey so conducted is known as ‘sample survey’
• Algebraically, if the population size is N and a part of size n (where n < N) is selected from it, the group of n units is known as a ‘sample’.
• Thus, the researcher must prepare a sample design for his/her study i.e.,
he/she must plan how a sample should be selected and of what size such a
sample would be.
SAMPLE DESIGN
• A sample design is a definite plan for obtaining a sample from a given population.
• It refers to the technique or the procedure the researcher would adopt in
selecting items for the sample.
• Sample design may as well lay down the number of items to be included in the
sample
• Sample design is determined before data are collected.
• There are many sample designs from which a researcher can choose.
• The researcher must select or prepare a sample design that is reliable and appropriate for his/her research study.
STEPS IN SAMPLE DESIGN
• While developing a sampling design, the researcher must pay attention to
the following points:
• Type of universe: The first step in developing any sample design is to clearly define
the set of objects, technically called the Universe, to be studied.
• The universe can be finite or infinite
• Sampling unit: A decision has to be taken concerning a sampling unit before selecting the sample.
• Sampling unit may be a geographical one such as state, district, village, etc., or a
construction unit such as house, etc., or it may be a social unit such as family, club, school,
etc., or it may be an individual.
• The researcher will have to decide one or more of such units that he/she has to select for
his/her study.
• Source list: Also known as the ‘sampling frame’, this is the list from which the sample is to be drawn.
• It contains the names of all items of a universe (in the case of a finite universe only).
• Size of sample: This refers to the number of items to be selected from the
universe to constitute a sample.
• The size of sample should neither be excessively large, nor too small
• An optimum sample is one which fulfills the requirements of efficiency,
representativeness, reliability and flexibility.
• While deciding the size of sample, researcher must determine the desired
precision, acceptable confidence level, size of population variance, size of
population, and costs
• Parameters of interest: In determining the sample design, one must consider the
question of the specific population parameters which are of interest.
• Budgetary constraint: Cost considerations, from practical point of view, have a
major impact upon decisions relating to not only the size of the sample but also
to the type of sample.
• Sampling procedure: Finally, the researcher must decide the type of sample he/she will use, i.e., the technique to be used in selecting the items for the sample.
CRITERIA OF SELECTING A SAMPLING PROCEDURE
• In a sample survey, one must remember that two costs are involved in a sampling analysis, viz., the cost of collecting the data and the cost of an incorrect inference resulting from the data.
• The researcher must keep in view the two causes of incorrect inferences, viz., systematic bias and sampling error.
(i) Systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size.
• Usually a systematic bias is the result of one or more of the following factors:
• Inappropriate sampling frame: If the sampling frame is inappropriate i.e., a
biased representation of the universe, it will result in a systematic bias.
• Defective measuring device: If the measuring device is constantly in error, it will result in systematic bias.
• Non-respondents: If we are unable to sample all the individuals initially included in the sample, a systematic bias may arise.
• Indeterminacy principle: Individuals sometimes act differently when kept under observation than they do otherwise.
• Natural bias in the reporting of data: The natural bias of respondents in reporting data is often a cause of systematic bias.
(ii) Sampling errors are the random variations in the sample estimates around the true population parameters.
• Sampling error decreases as the size of the sample increases, and it is of a smaller magnitude in the case of a homogeneous population.
• Sampling error can be measured for a given sample design and size, and the measurement of sampling error is usually called the ‘precision of the sampling plan’.
• But increasing the size of the sample has its own limitations viz., a large
sized sample increases the cost of collecting data and also enhances the
systematic bias.
• Thus the effective way to increase precision is usually to select a better
sampling design which has a smaller sampling error for a given sample size
at a given cost.
• Generally, while selecting a sampling procedure, researcher must ensure
that the procedure causes a relatively small sampling error and helps to
control the systematic bias in a better way.
TYPES OF SAMPLE DESIGNS
1. On element selection basis
• unrestricted or
• restricted
Unrestricted
• When each sample element is drawn individually from the population at
large, then the sample so drawn is known as ‘unrestricted sample’
Restricted
• All other forms of sampling are covered under the term ‘restricted sampling’.
2. On the representation basis
• probability sampling
• non-probability sampling.
Non-probability sampling
• Non-probability sampling is ‘non-random’ sampling.
Probability sampling
• Probability sampling is based on the concept of random selection.
Non-probability sampling:
• Non-probability sampling is that sampling procedure which does not afford any
basis for estimating the probability that each item in the population has of being
included in the sample.
• Non-probability sampling is also known as deliberate sampling, purposive
sampling and judgement sampling.
• In this type of sampling, items for the sample are selected deliberately by the
researcher
• The investigator may select a sample which yields results favorable to his/her point of view, and if that happens, the entire inquiry may get vitiated.
• Thus, there is always danger of bias entering into this type of sampling
technique.
• But if the investigators are impartial, work without bias and have the experience needed to exercise sound judgement, the results obtained from an analysis of a deliberately selected sample may be tolerably reliable.
• Sampling error in this type of sampling cannot be estimated and the
element of bias, great or small, is always there.
Probability sampling:
• Probability sampling is also known as ‘random sampling’ or ‘chance
sampling’.
• Under this sampling design, every item of the universe has a known, non-zero chance of inclusion in the sample (an equal chance, in the case of simple random sampling).
• Here it is blind chance alone that determines whether one item or the other is selected.
• The results obtained from probability or random sampling can be stated in terms of probability, i.e., the errors of estimation can be measured.
• Random sampling ensures the law of Statistical Regularity, which states that if the sample chosen is a random one, it will on average have the same composition and characteristics as the universe.
• This is why random sampling is considered the best technique for selecting a representative sample.
Probability Sampling
• There are four main types of probability sample:
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling
1. Simple Random Sampling
• In a simple random sample, every member of the population has an
equal chance of being selected.
• Your sampling frame should include the whole population.
• To conduct this type of sampling, you can use tools like random
number generators or other techniques that are based entirely on
chance.
HOW TO SELECT A RANDOM SAMPLE?
Lottery
• Each item of the population is written on a slip of paper, the slips are mixed thoroughly, and the required number of slips is drawn blindly.
• Such a procedure is impractical, if not altogether impossible, in complex problems of sampling, so its practical utility is very limited.
Random number table
2952 6641 3992 9792 7979 5911
3170 5624 4167 9525 1545 1396
7203 5356 1300 2693 2370 7483
3408 2769 3563 6107 6913 7691
0560 5246 1112 9025 6008 8126
Example
You want to select a simple random sample of 100 employees of Company X. You assign a number to every employee in the company database from 1 to 1000, and use a random number generator to select 100 numbers.
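The example above can be sketched in Python. This is a minimal illustration, assuming employees are identified by the numbers 1 to 1000; the function name simple_random_sample and the fixed seed are assumptions made here for repeatability and are not part of the original example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of ID numbers without replacement."""
    rng = random.Random(seed)             # local generator; seed only for repeatability
    ids = range(1, population_size + 1)   # employee numbers 1..N (assumed numbering)
    return sorted(rng.sample(ids, sample_size))

sample_ids = simple_random_sample(1000, 100, seed=42)
print(len(sample_ids))   # number of employees selected
```

Because the draw is without replacement, no employee can be selected twice, and every subset of 100 employees is equally likely.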
COMPLEX RANDOM SAMPLING DESIGNS
• Probability sampling under restricted sampling techniques may result
in complex random sampling designs.
• Such designs may as well be called ‘mixed sampling designs’
• Such designs may represent a combination of probability and non-probability sampling procedures in selecting a sample.
• Some of the popular complex random sampling designs are as
follows:
2. Systematic Sampling
• Systematic sampling is similar to simple random sampling, but it is
usually slightly easier to conduct.
• Every member of the population is listed with a number, but instead
of randomly generating numbers, individuals are chosen at regular
intervals.
• In some instances, the most practical way of sampling is to select every i-th item on a list.
Example
All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.
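A minimal Python sketch of this procedure follows. It assumes the list positions run from 1 to 1000 and that the population size is a multiple of the sample size; the function name systematic_sample is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Select every k-th unit after a random start within the first interval."""
    interval = population_size // sample_size   # sampling interval k
    rng = random.Random(seed)
    start = rng.randint(1, interval)            # random start in 1..k (inclusive)
    return [start + i * interval for i in range(sample_size)]

# With a random start of 6 the positions would be 6, 16, 26, ... as in the example.
positions = systematic_sample(1000, 100, seed=1)
print(positions[:4])
```

Only the starting point is random; every later position is fixed by the interval, which is why the ordering of the list matters.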
• If you use this technique, it is important to make sure that there is no
hidden pattern in the list that might skew the sample.
• For example, if the HR database groups employees by team, and team
members are listed in order of seniority, there is a risk that your
interval might skip over people in junior roles, resulting in a sample
that is skewed towards senior employees.
3. Stratified Sampling
• This sampling method is appropriate when the population has mixed
characteristics, and you want to ensure that every characteristic is
proportionally represented in the sample.
• That is, if the population from which a sample is to be drawn does not constitute a homogeneous group, the stratified sampling technique is generally applied.
• Under stratified sampling, the population is divided into several sub-populations that are individually more homogeneous than the total population.
• You divide the population into subgroups (called strata) based on the
relevant characteristic (e.g. gender, age range, income bracket, job role).
• From the overall proportions of the population, you calculate how many
people should be sampled from each subgroup.
• Then you use random or systematic sampling to select a sample from each
subgroup.
• The following three questions are highly relevant in the context of stratified
sampling:
(a) How to form strata?
(b) How should items be selected from each stratum?
(c) How many items should be selected from each stratum, i.e., how should the sample size be allocated to each stratum?
Example
The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.
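The example can be sketched in Python as follows. This is an illustration only: the employee identifiers (F1, M1, and so on) and the function name stratified_sample are made up here, and the rounding step assumes the proportional shares come out as whole numbers, as they do in this example.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Allocate sample_size proportionally across strata, then sample each at random.

    strata maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional share of the sample for this stratum.
        n_h = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical employee IDs: 800 women and 200 men, as in the example.
strata = {
    "female": [f"F{i}" for i in range(1, 801)],
    "male": [f"M{i}" for i in range(1, 201)],
}
picked = stratified_sample(strata, 100, seed=7)
print({name: len(members) for name, members in picked.items()})
```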
Illustration #1:
• Suppose we want a sample of size n = 30 to be drawn from a population of size N = 8000, which is divided into three strata of sizes N1 = 4000, N2 = 2400 and N3 = 1600.
• Adopting proportional allocation, calculate the sample sizes for the different strata.
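The slide does not show the calculation. Under proportional allocation each stratum receives a share of the sample equal to its share of the population, so a worked solution would be:

```latex
n_i = n \cdot \frac{N_i}{N}, \qquad
n_1 = 30 \cdot \frac{4000}{8000} = 15, \quad
n_2 = 30 \cdot \frac{2400}{8000} = 9, \quad
n_3 = 30 \cdot \frac{1600}{8000} = 6 .
```

The three stratum samples add up to the required n = 30.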
• Proportional allocation is considered most efficient and an optimal design
when:
• the cost of selecting an item is equal for each stratum,
• there is no difference in within-stratum variances, and
• the purpose of sampling happens to be to estimate the population value of some
characteristic.
• But in case the purpose happens to be to compare the
differences among the strata, then equal sample selection from
each stratum would be more efficient even if the strata differ in
sizes.
• Where strata differ not only in size but also in variability, it is reasonable to take larger samples from the more variable strata and smaller samples from the less variable strata.
• We can account for both (differences in stratum size and differences in stratum variability) by using a disproportionate sampling design that requires:
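The condition itself is missing from the slide. Assuming the standard optimum (Neyman) allocation, which matches the symbols defined next, it would read:

```latex
\frac{n_1}{N_1 \sigma_1} = \frac{n_2}{N_2 \sigma_2} = \cdots = \frac{n_k}{N_k \sigma_k}
```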
• where σ1, σ2, ..., σk denote the standard deviations of the k strata, N1, N2, ..., Nk denote the sizes of the k strata, and n1, n2, ..., nk denote the sample sizes of the k strata.
• This is called ‘optimum allocation’ in the context of disproportionate
sampling.
• The allocation in such a situation results in the following formula for
determining the sample sizes of different strata:
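The formula is not reproduced on the slide. Assuming the standard optimum allocation rule, with n the total sample size, it is commonly written as:

```latex
n_i = n \cdot \frac{N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + \cdots + N_k \sigma_k},
\qquad i = 1, 2, \ldots, k
```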
Illustration #2:
A population is divided into three strata
• N1=5000,
• N2=2000 and
• N3=3000
Standard deviations are:
• How should a sample of size n = 84 be allocated to the three strata, if
we want optimum allocation using disproportionate sampling design?
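An allocation of this kind can be computed with a short function. This is a sketch, not the slide's own solution: the standard deviations are missing from the slide, so the values 15, 18 and 5 used below are assumed purely for illustration, and the function name optimum_allocation is hypothetical.

```python
def optimum_allocation(n, sizes, std_devs):
    """Allocate n across strata in proportion to N_i * sigma_i."""
    weights = [N_i * s_i for N_i, s_i in zip(sizes, std_devs)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

sizes = [5000, 2000, 3000]
std_devs = [15, 18, 5]      # assumed values; not given on the slide
print(optimum_allocation(84, sizes, std_devs))
```

With these assumed values the allocation is 50, 24 and 10, which sums to 84. In general the rounded shares may not add up exactly to n and may need a small manual adjustment.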
• If, in addition to differences in stratum size and stratum variability, there are differences in stratum sampling cost, we can use a cost-optimal disproportionate sampling design, given by the following formula:
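The formula is not reproduced on the slide. Assuming the usual cost-optimal allocation, where C_i (a symbol introduced here) denotes the cost of sampling one item in stratum i, it would be:

```latex
n_i = n \cdot \frac{N_i \sigma_i / \sqrt{C_i}}
{N_1 \sigma_1 / \sqrt{C_1} + N_2 \sigma_2 / \sqrt{C_2} + \cdots + N_k \sigma_k / \sqrt{C_k}},
\qquad i = 1, 2, \ldots, k
```

When all C_i are equal, this reduces to the optimum allocation based on stratum size and variability alone.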
4. Cluster Sampling
• Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population.
• Instead of sampling individuals from each subgroup, you randomly select
entire subgroups.
• If it is practically possible, you might include every individual from each
sampled cluster.
• If the clusters themselves are large, you can also sample individuals from
within each cluster using one of the techniques above.
• This method is good for dealing with large and dispersed populations,
but there is more risk of error in the sample, as there could be
substantial differences between clusters.
• It’s difficult to guarantee that the sampled clusters are really
representative of the whole population.
Example
The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don’t have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices; these are your clusters.
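A minimal Python sketch of the example follows. The office names, the figure of 100 employees per office and the function name cluster_sample are all hypothetical; the sketch includes every member of each selected cluster.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and return every member of the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names at random
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical data: 10 offices with 100 employees each.
offices = {
    f"office_{k}": [f"office_{k}_emp_{i}" for i in range(1, 101)]
    for k in range(1, 11)
}
chosen, members = cluster_sample(offices, 3, seed=3)
print(chosen, len(members))
```

The random step applies to offices, not to individuals, which is what distinguishes this design from stratified sampling.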
(iv) Area sampling:
• If clusters happen to be some geographic subdivisions, in that case
cluster sampling is better known as area sampling.
(v) Multi-stage sampling:
• Multi-stage sampling is a further development of the principle of
cluster sampling.
Non-probability Sampling methods
• There are four main types of non-probability sample:
1. Convenience sampling
2. Voluntary response sampling
3. Purposive sampling
4. Snowball sampling
1. Convenience Sampling
• A convenience sample simply includes the individuals who happen to
be most accessible to the researcher.
• This is an easy and inexpensive way to gather initial data, but there is
no way to tell if the sample is representative of the population, so it
can’t produce generalizable results.
Example
You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is a convenient way to gather data, but as you only surveyed students taking the same classes as you at the same level, the sample is not representative of all the students at your university.
2. Voluntary Response Sampling
• Similar to a convenience sample, a voluntary response sample is
mainly based on ease of access.
• Instead of the researcher choosing participants and directly
contacting them, people volunteer themselves (e.g. by responding to
a public online survey).
• Voluntary response samples are always at least somewhat biased, as
some people will inherently be more likely to volunteer than others.
Example
You send out the survey to all students at your university and a lot of students decide to complete it. This can certainly give you some insight into the topic, but the people who responded are more likely to be those who have strong opinions about the student support services, so you can’t be sure that their opinions are representative of all students.
3. Purposive Sampling
• This type of sampling involves the researcher using their judgment to
select a sample that is most useful to the purposes of the research.
• It is often used in qualitative research, where the researcher wants to
gain detailed knowledge about a specific phenomenon rather than
make statistical inferences.
• An effective purposive sample must have clear criteria and rationale
for inclusion.
Example
You want to know more about the opinions and experiences of disabled students at your university, so you purposefully select a number of students with different support needs in order to gather a varied range of data on their experiences with student services.
4. Snowball Sampling
• If the population is hard to access, snowball sampling can be used to
recruit participants via other participants.
• The number of people you have access to “snowballs” as you get in
contact with more people.
Example
You are researching experiences of homelessness in your city. Since there is no list of all homeless people in the city, probability sampling isn’t possible. You meet one person who agrees to participate in the research, and she puts you in contact with other homeless people that she knows in the area.
Determining Sample Size
• Perhaps the most frequently asked question concerning sampling is: what sample size do I need?
• The answer is influenced by a number of factors, such as:
• Purpose of the study
• Population size
• The risk of selecting a bad sample
• The allowable sampling error
Sample Size Criteria
• In addition to purpose of the study and size of the population, three
criteria usually need to be specified to determine the appropriate
sample size:
• Level of precision
• Level of confidence or risk
• The degree of variability in the attributes being measured
Level of precision
• Sometimes called sampling error
• It is the range in which the true value of the population is estimated to be
• This range is often expressed in percentage points (e.g. ±5 percent)
Level of confidence or risk
• The confidence level is based on ideas encompassed under the Central Limit Theorem
• It means that, if a 95% confidence level is selected, 95 out of 100 samples will have the true population value within the range of precision specified
• There is always a chance that the sample you obtain does not represent the true population value
Degree Of Variability
• The third criterion, the degree of variability in the attributes being measured, refers to the distribution of attributes in the population
• The more heterogeneous a population, the larger the sample size required to obtain a given level of precision, and vice versa
Strategies For Determining Sample Size
• There are several approaches to determining the sample size. These include:
• using a census for small populations,
• imitating a sample size of similar studies,
• using published tables, and
• applying formulas to calculate a sample size.
Using published tables
• A third way to determine sample size is to rely on published tables which provide
the sample size for a given set of criteria.
• The following tables present sample sizes that would be necessary for given
combinations of precision, confidence levels, and variability.
• Please note two things.
• First, these sample sizes reflect the number of obtained responses, and not necessarily the
number of surveys mailed or interviews planned (this number is often increased to
compensate for nonresponse).
• Second, the sample sizes in Table 2 presume that the attributes being measured are
distributed normally or nearly so.
• If this assumption cannot be met, then the entire population may need to be surveyed.
Table 1. Sample size for ±3%, ±5%, ±7% and ±10% Precision Levels Where Confidence Level is 95% and P=.5.

Size of        Sample Size (n) for Precision (e) of:
Population     ±3%       ±5%       ±7%       ±10%
500            a         222       145       83
600            a         240       152       86
700            a         255       158       88
800            a         267       163       89
900            a         277       166       90
1,000          a         286       169       91
2,000          714       333       185       95
3,000          811       353       191       97
4,000          870       364       194       98
5,000          909       370       196       98
6,000          938       375       197       98
7,000          959       378       198       99
8,000          976       381       199       99
9,000          989       383       200       99
10,000         1,000     385       200       99
15,000         1,034     390       201       99
20,000         1,053     392       204       100
25,000         1,064     394       204       100
50,000         1,087     397       204       100
100,000        1,099     398       204       100
>100,000       1,111     400       204       100

a = Assumption of normal population is poor (Yamane, 1967). The entire population should be sampled.

Table 2. Sample size for ±5%, ±7% and ±10% Precision Levels Where Confidence Level is 95% and P=.5.

Size of        Sample Size (n) for Precision (e) of:
Population     ±5%       ±7%       ±10%
100            81        67        51
125            96        78        56
150            110       86        61
175            122       94        64
200            134       101       67
225            144       107       70
250            154       112       72
275            163       117       74
300            172       121       76
325            180       125       77
350            187       129       78
375            194       132       80
400            201       135       81
425            207       138       82
450            212       140       82
Using Formulas To Calculate A Sample Size
• Although tables can provide a useful guide for determining the sample
size, you may need to calculate the necessary sample size for a different
combination of levels of precision, confidence, and variability.
• The fourth approach to determining sample size is the application of one
of several formulas
• For populations that are large, Cochran (1977) developed an equation to yield a representative sample for proportions.
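The equation does not appear on the slide. Cochran's formula for proportions is commonly written as:

```latex
n_0 = \frac{Z^2 \, p \, q}{e^2}
```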
• Where:
n0 is the sample size,
Z is the abscissa of the normal curve that cuts off an area α at the tails (1 − α equals the desired confidence level, e.g., 95%),
e is the desired level of precision,
p is the estimated proportion of an attribute that is present in the population, and q is 1 − p.
Note: the value for Z is found in statistical tables which contain the area under the normal curve.
Illustration 1: Suppose we wish to evaluate a state-wide Extension program in which farmers were encouraged to adopt a new practice.
• Assume there is a large population but that we do not know the variability in the proportion that will adopt the practice.
• Furthermore, suppose we desire a 95% confidence level and ±5% precision.
Required: calculate the sample size.
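A short Python sketch of the calculation, assuming Cochran's formula for proportions. Two inputs are assumptions made here rather than given in the illustration: p = 0.5, the conventional choice for maximum variability when the true proportion is unknown, and Z = 1.96 for a 95% confidence level. The function name cochran_sample_size is hypothetical.

```python
import math

def cochran_sample_size(z, p, e):
    """Cochran's sample size for a proportion in a large population."""
    q = 1 - p
    return (z ** 2) * p * q / (e ** 2)

# Assumptions: p = 0.5 (maximum variability) and z = 1.96 (95% confidence).
n0 = cochran_sample_size(z=1.96, p=0.5, e=0.05)
print(round(n0, 2), math.ceil(n0))
```

Under these assumptions the formula gives about 384.16, which is rounded up to 385 farmers.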
Finite Population Correction For Proportions
• If the population is small then the sample size can be reduced slightly.
• This is because a given sample size provides proportionately more
information for a small population than for a large population.
• Thus the sample size (n0) can be adjusted using the following Equation
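The equation is missing from the slide. The finite population correction for proportions is commonly written as follows, with n0 the sample size obtained for a large population:

```latex
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
```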
• Where n is the sample size and N is the population size.
Illustration 2: Suppose our evaluation of farmers’ adoption of the new practice only affected 2,000 farmers.
Required: What is the sample size that would now be necessary?
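A minimal sketch of the adjustment in Python. It assumes the finite population correction n = n0 / (1 + (n0 − 1) / N) and takes n0 = 385 as the large-population sample size for a 95% confidence level, ±5% precision and p = 0.5; the function name is hypothetical.

```python
def finite_population_correction(n0, population_size):
    """Adjust a large-population sample size n0 for a finite population."""
    return n0 / (1 + (n0 - 1) / population_size)

# n0 = 385 is assumed from the large-population calculation
# (95% confidence, ±5% precision, p = 0.5).
n = finite_population_correction(385, 2000)
print(round(n))
```

Under these assumptions the required sample falls from 385 to about 323 farmers.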
A Simplified Formula For Proportions
• Yamane (1967:886) provides a simplified formula to calculate sample sizes.
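The formula is not reproduced on the slide. Yamane's simplified formula, which assumes a 95% confidence level and P = .5, is commonly written as:

```latex
n = \frac{N}{1 + N e^2}
```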
• Where n is the sample size, N is the population size, and e is the level of
precision.
Illustration 3:
Given: N = 2000, e = ±5%
Required: calculate sample size for the given population and sampling error
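The calculation can be sketched in Python, assuming Yamane's simplified formula n = N / (1 + N e²); the function name yamane_sample_size is hypothetical.

```python
def yamane_sample_size(population_size, e):
    """Yamane's simplified sample size (assumes 95% confidence and P = .5)."""
    return population_size / (1 + population_size * e ** 2)

n = yamane_sample_size(2000, 0.05)
print(round(n))
```

This gives 2000 / 6, or about 333, which agrees with the Table 1 entry for a population of 2,000 at ±5% precision.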
Formula For Sample Size For The Mean
• The use of tables and formulas to determine sample size in the above discussion
employed proportions that assume a dichotomous response for the attributes
being measured.
• There are two methods to determine sample size for variables that are
polytomous or continuous.
• One method is to combine responses into two categories and then use a sample
size based on proportion (Smith, 1983).
• The second method is to use the formula for the sample size for the mean.
• The formula of the sample size for the mean is similar to that of the
proportion, except for the measure of variability.
• The formula for the mean employs σ² instead of (p × q), as shown in the following:
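The formula is missing from the slide. Replacing p × q with σ² in the formula for proportions gives the usual form:

```latex
n_0 = \frac{Z^2 \sigma^2}{e^2}
```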
• Where n0 is the sample size, Z is the abscissa of the normal curve that cuts off an area α at the tails, e is the desired level of precision (in the same unit of measure as the variance), and σ² is the variance of an attribute in the population.
• The disadvantage of the sample size based on the mean is that a "good"
estimate of the population variance is necessary.
• Often, an estimate is not available.
• Furthermore, the sample size can vary widely from one attribute to another
because each is likely to have a different variance.
• Because of these problems, the sample size for the proportion is
frequently preferred.
OTHER CONSIDERATIONS
• In determining sample size, there are three additional issues:
• First, the above approaches to determining sample size have assumed that a simple
random sample is the sampling design.
• More complex designs, e.g., stratified random samples, must take into account the
variances of subpopulations, strata, or clusters before an estimate of the variability
in the population as a whole can be made.
• A second consideration with sample size is the number needed for the data analysis. If descriptive statistics are to be used, e.g., mean, frequencies, then nearly any sample size will suffice.
• On the other hand, a good size sample, e.g., 200-500, is needed for multiple
regression, analysis of covariance, or log linear analysis, which might be performed
for more rigorous state impact evaluations.
• The sample size should be appropriate for the analysis that is planned.
• In addition, an adjustment in the sample size may be needed to accommodate a
comparative analysis of subgroups (e.g., such as an evaluation of program
participants with nonparticipants).
• Third, the sample size formulas provide the number of responses that need to be
obtained. Many researchers commonly add 10% to the sample size to compensate
for persons that the researcher is unable to contact.
• The sample size also is often increased by 30% to compensate for nonresponse.
• Thus, the number of mailed surveys or planned interviews can be substantially
larger than the number required for a desired level of confidence and precision