Sampling
Sampling is the process of selecting units from a population of interest so that, based upon
the sample, we are able to reasonably generalize the results of the analysis back to the
population from which the sample was chosen. In other words, sampling is a
procedure by which characteristics about a large body of people or units (a population)
can be inferred by getting data from only a few (the sample). Ideally we would be able
to accurately generalize the results back to the population from which we drew the
sample; however, a number of potential problems can arise. Among them are:

Sampling error - the difference between the sample estimate and what we would have
found by measuring the entire population

Sampling biases - nonrandom errors caused by the way the sample is selected
(for example, an interviewer always interviews the person who answers the door,
typically the youngest family member)

Non-sampling biases – for example the person recording the responses to
telephone interviews writes 2s that look like 7s - in other words the kinds of
errors that might occur even if we sampled the entire population

There are two general types of samples, probabilistic and non-probabilistic.
Probabilistic samples, by definition, meet the requirements of a good sample. A good
sample is one in which every member of the population has a known, nonzero probability
of being selected for the sample; in the simplest case that probability is equal for
every member. Types of probabilistic samples are:

Simple random sample – In a simple random sample each unit in the population
has a known and equal chance of being selected - like drawing names out of a
hat without replacement. For example, in a company with 1,000 employees, a
randomly picked sample of 50 employees means that every employee had a
one in 20 chance of being included in the sample.
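
The employee example above can be sketched in a few lines of Python. This is only an
illustration: the employee IDs 1 to 1,000 and the fixed random seed are assumptions made
for the sketch, not part of the original example.

```python
import random

# Hypothetical population: employee IDs 1..1000 (illustrative only).
population = list(range(1, 1001))
sample_size = 50

rng = random.Random(42)  # fixed seed (an assumption) so the draw is repeatable

# random.sample draws without replacement, like names out of a hat.
sample = rng.sample(population, sample_size)

# Each employee's chance of being included is n / N.
inclusion_probability = sample_size / len(population)

print(len(sample), len(set(sample)), inclusion_probability)  # 50 50 0.05
```

The probability of 0.05 is the "one in 20" chance described above; the second number
confirms that no employee was drawn twice.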

Systematic random sample – In systematic random sampling the target
population is ordered in some manner which would not be considered
systematically biased. In other words there must not be anything of importance
in the ordering of the population. For example, a population of employees might
be ordered by last name, alphabetically. Then, after the target population is
arranged according to the ordering scheme, elements at regular intervals
through that ordered list are selected. If from a population of 6,000 we wanted a
sample of 150, our sampling interval would be 40 (6,000/150 = 40). Systematic
random sampling involves selecting a random start within the first sampling
interval and then selecting every kth element from then onwards. In this case,
k = (population size/sample size). It is important that the
starting point is not automatically the first in the list, but is instead randomly
chosen from within the first to the kth element in the list. Thus, the sampling
interval is determined by the sample size desired, then a random
starting number is selected, then every kth person on our list. In our example
suppose out of the first 40 on our list we randomly selected number 27, then our
next one would be number 67, then number 107, etc. until we obtained our
sample of 150. Systematic random sampling may save time and costs with
large populations.
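
The selection steps above (compute k, pick a random start within the first interval,
then take every kth element) might be sketched as follows. The function name
systematic_sample and its start parameter are hypothetical, introduced only so the
worked example with start number 27 can be reproduced.

```python
import random

def systematic_sample(population, sample_size, start=None, rng=random):
    """Select every k-th element after a random start in the first interval."""
    k = len(population) // sample_size   # sampling interval, e.g. 6000 // 150 = 40
    if start is None:
        start = rng.randint(1, k)        # random 1-based start between 1 and k
    # Convert the 1-based start to a 0-based index, then step through the list by k.
    return population[start - 1::k][:sample_size]

# Hypothetical ordered list: positions 1..6000 (illustrative only).
population = list(range(1, 6001))

# Reproduce the worked example by fixing the start at 27.
sample = systematic_sample(population, 150, start=27)
print(sample[:3], len(sample))  # [27, 67, 107] 150

# In practice the start would be left to chance.
random_sample = systematic_sample(population, 150, rng=random.Random(0))
```

The sketch assumes the list is already in an order that is not systematically biased;
the code itself cannot check that.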

Stratified samples – Although perhaps not likely, it is possible to draw a sample
of 100 of the same gender from a population of 1,000 that included the same
number of men and women (500 each). While this sample may not be
representative of the population, it would still meet our definition above of a good
sample. If we know that certain population characteristics are important and we
want to make sure they are adequately included in the sample, we might use
stratified sampling. In stratified sampling the population is divided into
subgroups (strata) or layers and the sample is drawn from each stratum. In our
example above, if we were interested in being certain that men and women were
included in our sample of 100, we could divide our population into the 500
females (the female stratum) and the 500 males (the male stratum), then
randomly select 50 from each stratum for a total sample of 100. The benefits of
stratified sampling include allowing for the analysis of sub-groups when desired
and possibly more accuracy in statistical estimation.
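
As a rough sketch of the 50-from-each-stratum example, assuming the population has
already been split into two lists of 500 (the ID labels, the function name
stratified_sample, and the seed are all made up for illustration):

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of `per_stratum` units from every stratum."""
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}

# Hypothetical population of 1,000, already divided into two strata of 500.
strata = {
    "female": [f"F{i}" for i in range(1, 501)],
    "male": [f"M{i}" for i in range(1, 501)],
}

sample = stratified_sample(strata, 50, random.Random(1))
total = sum(len(units) for units in sample.values())
print({name: len(units) for name, units in sample.items()}, total)
# {'female': 50, 'male': 50} 100
```

Because the draw happens separately inside each stratum, a sample made up of only one
gender cannot occur here, unlike with a simple random sample of 100.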

Cluster samples – In cluster sampling the population is divided geographically into
clusters such as census tracts, voting precincts, counties, etc. Then clusters are
selected randomly, usually in multiple stages on down to the individual or household
level. Many national studies are done in this manner.
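
A two-stage version of cluster sampling might look like the sketch below. The 20
tracts of 30 households each, the function name two_stage_cluster_sample, and the
seed are hypothetical; real studies often use more stages and unequal cluster sizes.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random. Stage 2: pick units within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical frame: 20 census tracts, each listing 30 households.
clusters = {
    f"tract_{t:02d}": [f"t{t:02d}_hh{h:02d}" for h in range(30)]
    for t in range(20)
}

sample = two_stage_cluster_sample(clusters, 4, 5, random.Random(7))
for tract, households in sample.items():
    print(tract, households)
```

Only the four selected tracts need to be visited, which is the practical appeal of the
method for geographically spread populations.
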

Although many of the statistical tests we will cover in future modules rely upon the
assumption that the data is from probabilistic samples, much business research is in
actuality based upon non-probabilistic methods. Types of non-probabilistic samples are:

Purposive - In purposive sampling respondents are deliberately sampled for a
particular reason; for example we may sample particular individuals because of
their special expertise about a certain topic. One drawback of purposive
sampling is that we can’t generalize from our findings to a population.

Quota samples – Quota samples may appear similar to stratified samples in that
the population is divided into sub-groups, except in quota sampling individuals
are not selected randomly as they are in stratified sampling. This technique may
be useful when time is limited, but because the sample is not randomly selected,
unknown biases are not accounted for. An example of a quota sample would be
a sample requiring five men and five women under the age of 40 and five men
and five women 40 or older.
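
To make the contrast with stratified sampling concrete, the quota example above could
be sketched like this. The list of arrivals is fabricated purely for illustration, and
quota_sample is a hypothetical helper; the point is that each cell is filled with
whoever turns up first, with no random selection at all.

```python
def quota_sample(arrivals, quotas):
    """Fill each quota cell with the first people encountered (non-random)."""
    filled = {cell: [] for cell in quotas}
    for person in arrivals:
        age_group = "under 40" if person["age"] < 40 else "40 or older"
        cell = (person["gender"], age_group)
        if cell in filled and len(filled[cell]) < quotas[cell]:
            filled[cell].append(person)
        if all(len(filled[c]) == quotas[c] for c in quotas):
            break  # every quota is met, so stop recruiting
    return filled

# Quotas from the example: five per gender and age-group combination.
quotas = {
    ("man", "under 40"): 5,
    ("woman", "under 40"): 5,
    ("man", "40 or older"): 5,
    ("woman", "40 or older"): 5,
}

# Made-up stream of people in the order they happen to be encountered.
arrivals = [
    {"id": i, "gender": "man" if i % 2 else "woman", "age": 20 + (i * 7) % 50}
    for i in range(200)
]

sample = quota_sample(arrivals, quotas)
print({cell: len(people) for cell, people in sample.items()})
```

Whoever happens to arrive early ends up in the sample, which is where the unknown
biases mentioned above can enter.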

Chunk samples – Chunk samples typically refer to simply including a group that
happens to be available when needed. For example, interviewing five people on
the street about some topic doesn’t represent the whole population and probably
doesn’t even represent the people on the street.

Volunteer samples (also known as convenience samples) – Volunteer sampling
consists of participants becoming part of a study because they volunteer when
asked. This technique is typically quick and easy and much research is done
with volunteers. However, the type of participants who volunteer may not be
representative of the target population for a number of reasons. It’s also very
difficult to determine how volunteers differ from those who did not volunteer and
how those differences, whatever they might be, systematically affect the
results.

Snowball – The snowball sample is a technique in which study participants
recruit others from among friends and acquaintances. An obvious problem with
this technique is that participants tend to recruit people similar to themselves,
so a number of biases may influence the results.