Unit 1.2- Sampling Design v2

advertisement
Unit 1.2
Sampling Design
Corresponds to Chapter 2 in Triola
Overview of Section 1.2
• Observational Studies vs. Experiments
• Issues with Designing an Observational Study
– Representativeness
– Timing the Study
• Issues with Designing Good Experiments
– Randomization
– Control
– Replication
– Blinding
The basics of creating a study
• Observational Study vs. an Experiment
– An observational study is designed to observe and
measure specific characteristics- no treatment
– Treatment- An activity meant to make some
effective change in the attribute being measured
– Experiment- A treatment is applied and then the
experimenter proceeds to observe its effects
Issues in Designing Observational
Studies
1.2a
Sampling Error
• A non- sampling error- the difference between a
sample result and the true population result
• Sampling Error- This occurs when data are
incorrectly collected. For example, if we collected
a sample
• Non-sampling error is inevitable and the law of
large numbers allows us to minimize problems
from non-sampling error. Sampling Error is bad
research design and we want to eliminate
sampling error as much as possible.
Sampling Bias
• Sampling Bias is a kind of non-sampling error
that can invalidate the results of your study.
• A bias is skewing of the data towards one
conclusion or another that is not a result of
the actual views or tendencies of a
population.
• We want to avoid sampling bias as much as
possible so that our results can be robust.
Representativeness
• When conducting an observational study, the single
most important factor in creating the study is the
representativeness of the sample.
• By representativeness I mean, is your sample selected
in such a way as to insure that all of the people that
you want to know something about have a good
chance of being selected for the study.
• We can either make sure that you get a representative
sample by how you select your sample, or you can use
randomness to select your sample.
Random Samples
• Random Sample- Members from the population are
selected in such a way that each individual member in
the population has an equal chance of being selected.
• Simple Random Sample- N subjects is selected in such
a way that every possible sample of the same size n has
the same chance of being chosen.
– We are going to be discussing samples a lot and the simple
random sample is one of the primary requirements for us
to be able to run our statistical tests.
– We want to make sure that we can select any sample of a
particular size for our study because the mathematics that
we are going to use allows us to draw conclusions about
any sample of a particular size.
Voluntary Sample
• Voluntary Sample- Individuals select
themselves for participation in the study or
treatment
– In general, a voluntary sample has all sorts of
problems because the persons in the sample selfselect. For example, if you conducted a study on
an individual’s political opinions through a
voluntary sample, you would probably only get
individuals who had strong political beliefs. The
sample would most likely not be representative.
Convenience Sampling
• A convenience sample is one where the researcher
measures particular attributes from a sample that is
readily at hand.
• By readily at hand that means the people that are most
easily available for testing.
• A convenience sample is hardly ever done for scientific
purposes because it is easily subjected to sampling
bias. For example, if you were to stand outside of the
student store and ask students their opinions about
increasing to help with schools, your sample might be
skewed towards being in favor because the students
have a vested interest in more money for schools.
Probability Sample
• Sometimes it is necessary to stack the deck in order to
make sure that you have certain members of the
population represented in your study.
• A probability sample involves selecting members from
a population in such a way that each member of the
population has a known(but not necessarily the same)
chance of being selected.
– For example, weighting women in a study so that there is a
greater likelihood that they will be chosen in order to
insure their representativeness
– What I mean by representation is that we are actually
getting information from that segment of the population.
Systematic Sampling
• Systematic Sampling- Selecting every kth (such as every 50th)
member of the population
– Systematic Sampling is a way of getting randomization if the
individuals in whatever group that you are studying have sufficiently
similar characteristics and are somewhat evenly distributed in your
population.
– Systematic sampling is not an effective way of getting randomization if
you have individuals with similar characteristics who are bunched
together.
– For example, suppose that you wanted to know the buying habits of
people in Orange County. if you selected every 50th person in the
Santa Ana phonebook this would not be a good way of sampling.
Since Santa Ana is 95% Latino and of generally lower socioeconomic
status, you would most likely only be getting the views of Latinos or
people with lower incomes. Hence your sample would not be
representative.
Stratified Sampling
• Stratified Sampling- Subdivide the population into
at least two different subgroups (or strata) so that
subjects in the same subgroup have similar
characteristics and then the sample is drawn
from the subgroup.
– A common way of doing stratified sampling would be
to divide a group of individuals by ethnicity. Then we
would select an equal amount of people from each
ethnicity to sample. This would insure that we have
representativeness of all of the ethnicities in our
study. Because I have tested all ethnicities, I can factor
out ethnicity as a possible lurking variable.
Cluster Sampling
• Cluster sampling- divide the population area into
sections ( or clusters), then randomly select some
of those clusters and then choose all the
members from those clusters.
– For example, suppose that I wanted to study the
effects of statistics workshops on student grades. I
could separate all of the individuals who are taking
Math 219 by the class that they are in. These are the
clusters. Then I could select 5 classes to do my
measurements in. This could insure that I get a good
cross-section of instructors and hence remove that
issue as a lurking variable.
Timing of a Study
• One way to eliminate sampling error in an
observational study is to choose a timing that
fits with the particulars of your study.
Timing of a study
• Cross-Sectional Study- Data are observed,
measured, and collected at one point in time
• Retrospective- Collected from the past by
going back in time (through examination of
previously collected data)
• Prospective Study (also called a longitudinal or
cohort study)- Collected throughout a set
period of time- generally demonstrates
change or persistence
Designing experiments
1.2b
Basic Vocabulary
• Treatment or Experimental group- Those individuals selected for
your study who will have the particular variable that you are
concerned with changed. For example, they are given the drug that
you are looking to test.
• Control group- These are the individuals who will not be given the
treatment that you are looking to test.
• Placebo- A pill or intervention that is meant to have the members
of the control group think that they are given the treatment.
Literally, a placebo is a sugar pill that is used to make a sick person
think that they are getting medicine.
• The Placebo Effect- When a person who is given a placebo shows
improvement in the variables under observation because they think
that they have been given a treatment.
Bias
• Scientists are human beings and would like for
their treatments to show effects. However, the
community of science needs to have robust
outcomes in order to build knowledge. Hence,
processes are put into place by the scientific
community in order to make sure that there is no
systematic bias. A bias is a skewing of the data
towards a particular outcome due to variables
that are not the subject of the study. A
systematic bias is a bias in which the scientist,
either knowingly or unknowingly, skews the data
by how they construct the experiment.
Ways of Preventing Bias in
Experimental Studies
Randomization
• Subjects are assigned to different groups
through a process of random selection.
– Suppose that we wanted to conduct an
experiment where we tested the effect of a new
drug on treating the common cold. We could
randomly assign members to the treatment group
and the experimental group and that would help
us to insure that there was no systematic bias in
our samples.
Randomized Block Design
• Selecting subjects that have a similar
characteristic and then sampling from each of
those blocks.
– For example, a study on political opinions is
conducted in which the study’s participants are
grouped according to their political parties. Then,
a random sample of equal size is selected from
each of the groups and then tested.
Completely Randomized design
• Selecting members of the population through
a process of random selection.
– For example, a random number generator picks
student ID numbers from SAC students and
assigns those persons to either treatment or
placebo group.
– If the sample is of sufficient size then this is the
most common way of designing an experiment.
Rigorously Controlled Design
• Making sure each of the members of a sample
have a similar member of the control for all of
the important characteristics involved in the
study.
– For example, if a 40-year old overweight man who
smokes is selected for the treatment group of a
health study, then a second overweight man who
smokes will be chosen for the placebo group.
Matched Pair Design
• Compare two treatment groups where the
subjects are matched in such a way that sets of
two subjects are identical in some way.
– For example, a set of twins are chosen for a
toothpaste study. One twin is placed into the
treatment group and the second twin is placed into
the placebo group.
– A more common version of this is a study in which a
participant is tested at the beginning of the study and
then that same characteristic is tested at the end of
treatment time or at some point during the treatment
time.
Replication
• The repetition of an experiment on more than
one subject.
• The law of small numbers- Samples that are
too small will have too much variation to
accurately draw inferences about the
population.
Blinding
• The subject does not know that they are a
part of the treatment or the control group.
– The Placebo Effect- The fact that individuals who
know that they are taking a medicine may
improve because of their belief that the medicine
is effective.
Problems in Experimental and
Observational Design
• Confounding occurs in an experiment when
you are not able to distinguish amongst the
effects of different factors.
– A factor is a possible variable that could have an
effect upon an outcome but is not accounted for
in the model or in our conclusions.
– When we draw assumptions about a result and do
not take into account possible confounders, we
are committing an error. We may apply cause to a
situation where cause is not warranted
Confounding Variables
• Example: In the early 50’s, health researchers
concluded that excessive amounts of saturated
fat was responsible for heart disease. This
conclusion was based upon the fact that as
Americans ate more saturated fat, there was also
a commiserate increase in heart disease.
However, the researchers did not account for the
fact that processed sugar consumption also rose
with saturated fat consumption. The amount of
sugar consumption is called a confounder.
Review of Section 1.2
• Observational Studies vs. Experiments
• Issues with Designing an Observational Study
– Representativeness
– Timing the Study
• Issues with Designing Good Experiments
– Randomization
– Control
– Replication
– Blinding
Download