Randomization

Chapters 1 & 2
Page 1 of 8
Statistics 103
Part I. Design of Experiments – Chapters 1 & 2
Chapters 1 & 2. Controlled Experiments & Observational
Studies
The important points of these chapters:
1. The method of comparison is the only way of determining if a treatment is
effective.
2. In order to do that we have 2 groups which must be “statistically”
indistinguishable.
3. A probability method must be used to assign people to the two groups since that
allows us to use probability theory to analyze the results and reach a conclusion.
If we randomly assign people to the groups we can determine how likely it is that
they are not alike.
4. The best type of experimental design is a controlled, randomized, double-blind
experiment.
Key Terms
Treatment group – The group of subjects given some treatment in an experiment.
Control group – A group of subjects who did not receive treatment.
Placebo – Pills (or a substance) identical in appearance to active substance but containing
no active drug at all.
Controlled experiment – A study where the investigators decide who will be in the
treatment group and who will be in the control group.
Randomized controlled experiment – A controlled experiment in which an impartial chance
procedure is used to assign the subjects to the treatment or control group.
Double-blind – A procedure used in an experiment whereby the subject does not know
whether he or she is receiving a treatment or placebo, and the person administering the
treatment also does not know what each subject is given.
Observational study – The subjects assign themselves to the different groups and the
investigators watch what happens.
Confounding – A difference between the treatment and control groups, other than the
treatment, which affects the responses being studied.
Association – When one thing is linked to another.
Self-selected survey – A survey in which the respondents themselves decide whether to be
included.
CONTROLLED EXPERIMENTS
If a new drug is introduced its effectiveness needs to be tested. How does one do
this? In the first section of our text the author describes the Salk vaccine field trial in
which a vaccine for polio was tested. It is a good example of the importance of design in
a statistical experiment. To know the effects of the vaccine, statisticians compare the
responses of a treatment group with a control group. If the treatment group is
comparable to the control group then a difference in responses of the two groups is likely
to be due to the effect of the treatment. If the treatment group differs from the control
group, then the effects of the factors that differ are likely to confound the results of the
study. Confounded results are not reliable and thus, one wants to minimize confounding
variables when determining a statistical design. One way to minimize confounding is to
randomly assign subjects to the treatment and control groups: this is then a
randomized controlled experiment. A statistician also wants to minimize bias in the
design of an experiment. For this reason, the control group should be given a placebo to
ensure that a response is to the treatment rather than to the IDEA of being treated. In
addition, to minimize bias, an experiment’s design should be double-blind. As defined above,
the subjects in a double-blind experiment do not know whether they are in treatment or in
control; neither do those who evaluate the responses. In summary, a statistician wants a
randomized controlled experiment that is double-blind and in which, if possible, the
control group is given a placebo.
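The random assignment described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the Salk trial; the helper name `assign_groups`, the subject labels, and the fixed seed are all assumptions made for the example.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group.

    Hypothetical helper for illustration: it shuffles a copy of the list
    and cuts it in half, so chance alone decides who gets the treatment.
    """
    rng = random.Random(seed)      # the seed only makes the example repeatable
    shuffled = list(subjects)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 11)]   # made-up labels
treatment, control = assign_groups(subjects, seed=103)
print("treatment:", treatment)
print("control:  ", control)
```

Because the investigator's judgment plays no part in the split, any difference between the two groups before treatment is due to chance, which is what allows probability theory to be applied to the results.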
OBSERVATIONAL STUDIES
In an observational study, the investigators do not assign the subjects to the
treatment or control. Some of the subjects have the condition whose effects are being
studied; this is a treatment group. The other subjects are the controls. For example, in a
study on smoking, the smokers form the treatment group and the non-smokers are the
controls. Observational studies can establish association: one thing is linked to another.
Association may point to causation, but association does not prove causation. There is
an association between exposure and disease, but this does not directly imply that
exposure causes the disease. For example, it may be that there is some genetic factor
which is linked to causing lung cancer AS WELL AS giving one the propensity to smoke.
This is a confounding factor. A confounder is a third variable that is associated with
exposure and with disease.
Observational studies must always be viewed with suspicion because the
assignment of people to treatment and control is not randomized. The assignment is
always self-selected.
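A small simulation can show how a confounder produces association without causation. The sketch below is a toy model, not a claim about real smoking and health: every probability is invented, and the model deliberately makes smoking have no effect on disease at all. The function name `simulate` and the hidden "factor" are assumptions for illustration.

```python
import random

def simulate(n=100_000, seed=103):
    """Return (disease rate among smokers, disease rate among non-smokers).

    Toy model with invented numbers: a hidden factor raises both the chance
    of smoking and the chance of disease. Smoking itself does nothing here.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}   # smoker? -> [people, diseased]
    for _ in range(n):
        factor = rng.random() < 0.30                          # hidden confounder
        smoker = rng.random() < (0.60 if factor else 0.20)    # depends on factor
        diseased = rng.random() < (0.30 if factor else 0.05)  # depends on factor only
        counts[smoker][0] += 1
        counts[smoker][1] += diseased
    smokers_rate = counts[True][1] / counts[True][0]
    nonsmokers_rate = counts[False][1] / counts[False][0]
    return smokers_rate, nonsmokers_rate

smokers_rate, nonsmokers_rate = simulate()
print(f"disease rate, smokers:     {smokers_rate:.3f}")
print(f"disease rate, non-smokers: {nonsmokers_rate:.3f}")
```

In this model the smokers show a clearly higher disease rate even though smoking was given no effect, because the hidden factor is associated with both exposure and disease. An observational study of these simulated people would see the association but could not tell it apart from a causal effect.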
OVERVIEW
With both observational studies and with nonrandomized controlled experiments,
try to find out how the subjects came to be in treatment or in control. When looking at a
study, ask the following questions:

Are the groups comparable or are they different?

Was there any control group at all?

What factors are confounded with treatment?

What adjustments were made to take care of confounding? Were they
sensible?

Were historical controls used, or contemporaneous controls?

How were subjects assigned to treatment – through a process under the
control of the investigator (a controlled experiment), or a
process outside the control of the investigator (an observational study)?

If a controlled experiment, was assignment made using a chance
mechanism (randomized controlled), or did assignment depend on the
judgment of the investigator?
[Diagram: classification of studies. Recoverable labels: Studies; Controls (contemporaneous or historical); No controls; Controlled experiment (randomized or not randomized); Observational studies.]
Design of Experiments (Summary)
Much of the material on sampling will be covered in detail in a later chapter.
In an observational study, we observe and measure specific characteristics but we do not
attempt to modify the subjects being studied.
In an experiment, we apply some treatment and then proceed to observe its effects on the
subjects.
There are a few basic steps that should be followed in designing an experiment that is
capable of yielding valid results.
1. Identify your objective. Identify the exact question to be answered and clearly
identify the relevant population.
2. Collect sample data. The way in which sample data are collected is absolutely
critical to the success of the experiment. The sample data must be representative of the
population in question. The sample must be large enough so that the effects of the
treatment can be known. The question that you are trying to answer in your objective
should be addressed without interference from extraneous factors.
3. Use a random procedure that avoids bias.
4. Analyze the data and form conclusions.
Controlling Effects of Variables
A placebo effect occurs when an untreated subject incorrectly believes that he or she is
receiving a treatment and reports an improvement in symptoms. The placebo effect can
be countered by using blinding, a technique in which the subject does not know whether
he or she is receiving a treatment or a placebo.
When designing an experiment to test the effectiveness of one or more treatments,
it is important to put the subjects (often called experimental units) in different groups (or
blocks) in such a way that those groups are very similar. A block is a group of subjects
(or experimental units) that are similar. (The subjects only need to be similar in the ways
that might affect the outcome of the experiment.)
When testing one or more different treatments, form blocks so that each one
consists of subjects that are similar.
When deciding how to assign the subjects to different blocks, you can use random
selection or you can try to carefully control the assignment so that the subjects within
each block are similar. One approach is to use a completely randomized experimental
design, in which subjects are put into different blocks through a process of random
selection. Another approach is to use a rigorously controlled design, in which
experimental units (the subjects) are carefully chosen so that the subjects in each block
are similar in the ways that are important.
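One way to picture blocks is as a grouping of subjects by a characteristic that might affect the outcome. The sketch below forms blocks by age bracket and then randomizes treatment within each block. The within-block randomization is a common practice and an assumption here, not a step these notes spell out; the names `form_blocks` and `randomize_within_blocks`, the field names, and the data are hypothetical.

```python
import random
from collections import defaultdict

def form_blocks(subjects, key):
    """Group subjects so that everyone in a block shares the same key value."""
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[key(subject)].append(subject)
    return dict(blocks)

def randomize_within_blocks(blocks, seed=None):
    """Within each block, randomly give half the subjects the treatment."""
    rng = random.Random(seed)
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject["name"]] = "treatment"
        for subject in shuffled[half:]:
            assignment[subject["name"]] = "control"
    return assignment

# Made-up subjects: six in each age bracket.
subjects = [
    {"name": f"s{i}", "age_bracket": "under 40" if i % 2 else "40 and over"}
    for i in range(1, 13)
]
blocks = form_blocks(subjects, key=lambda s: s["age_bracket"])
assignment = randomize_within_blocks(blocks, seed=103)
for label, members in blocks.items():
    print(label, [(s["name"], assignment[s["name"]]) for s in members])
```

Each age bracket ends up with the same number of treated and untreated subjects, so age cannot differ between the treatment and control groups.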
When conducting experiments, the results are sometimes ruined because of
confounding.
Confounding occurs in an experiment when the effects from two or more variables
cannot be distinguished from each other.
Sample size
Another important consideration in conducting experiments is the size of the
sample. It must be large enough so that erratic behavior of very small samples will not
produce misleading results.
Use a sample size large enough so that we can see the true nature of any
effects, and obtain the sample using an appropriate method, such as one based on
randomness.
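The erratic behavior of very small samples can be seen in a short simulation. This sketch assumes a made-up population in which half the members have some trait, draws many samples of a given size, and measures how much the sample proportion varies from sample to sample. The function name, the true rate of 0.5, and the number of repeats are assumptions for the example.

```python
import random
import statistics

def spread_of_sample_proportions(n, true_rate=0.5, repeats=500, seed=103):
    """Standard deviation of the sample proportion across repeated samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repeats):
        hits = sum(rng.random() < true_rate for _ in range(n))
        proportions.append(hits / n)
    return statistics.pstdev(proportions)

small = spread_of_sample_proportions(10)
large = spread_of_sample_proportions(1000)
print(f"spread with n = 10:   {small:.3f}")
print(f"spread with n = 1000: {large:.3f}")
```

The spread for samples of size 10 is several times the spread for samples of size 1000, so a single small sample can easily land far from the true rate.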
Randomization
One of the worst mistakes is to collect data in a way that is inappropriate. We
cannot overstress this very important point:
Here are some other terms relating to sampling. However, only random sampling
and simple random sampling will be used throughout this course.
In a random sample, members of the population are selected in such a way that each
member has an equal chance of being selected.
A simple random sample of size n subjects is selected in such a way that every possible
sample of size n has the same chance of being chosen. This amounts to drawing n subjects
at random from a population without replacement. That is, once a person is selected, he or
she is not placed back into the population and so cannot be chosen again. In a large
population the chance of being chosen does not change significantly as each person is
selected and removed. But in a small population, removing a subject noticeably changes the
chances of being selected on subsequent drawings. Small populations have to be
handled with modified formulas.
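As a rough illustration, Python's standard `random.sample` draws without replacement, which matches the description above. The wrapper `simple_random_sample`, the helper `chance_on_draw`, and the population of numbered members are assumptions made for the example.

```python
import random
from fractions import Fraction

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def chance_on_draw(population_size, draw):
    """Chance that a given not-yet-chosen member is picked on a given draw (1-based)."""
    return Fraction(1, population_size - (draw - 1))

population = list(range(1, 101))              # made-up members numbered 1..100
print(simple_random_sample(population, 5, seed=103))

# Small population: the chance shifts noticeably from one draw to the next.
print(chance_on_draw(10, 1), chance_on_draw(10, 2))            # 1/10 1/9
# Large population: the shift is negligible.
print(chance_on_draw(10_000, 1), chance_on_draw(10_000, 2))    # 1/10000 1/9999
```

The two pairs of fractions show why removing a subject matters in a small population and hardly matters in a large one.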
In systematic sampling, we select some starting point and then select every kth (such as
every 50th) element in the population.
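Systematic sampling can be sketched with list slicing. The notes only say to "select some starting point"; choosing that start at random among the first k elements is an assumption added here, as are the function name and the numbered population.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a starting point among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)       # assumption: the start is chosen at random
    return population[start::k]

population = list(range(1, 1001))  # made-up population numbered 1..1000
sample = systematic_sample(population, k=50, seed=103)
print(len(sample), sample[:4])
```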
With convenience sampling, we simply use results that are readily available. For
instance, if we do a survey of students who happen to be walking by some location we
choose, this is a sample of convenience, not a random sample.
With stratified sampling, we subdivide the population into at least two different
subgroups (or strata) whose members share the same characteristics (such as gender or age bracket),
then we draw a sample from each stratum. (We do not concern ourselves with this any
further in this course.)
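For reference only, stratified sampling might look like the sketch below. It assumes the same number of members is drawn at random from each stratum, which is one possible choice the notes do not specify; the function name, the age brackets, and the data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Split the population into strata, then draw per_stratum members from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    return {label: rng.sample(members, per_stratum)
            for label, members in strata.items()}

# Made-up population of (id, age bracket) pairs: 20 "under 40", 10 "40 and over".
population = [(i, "under 40" if i % 3 else "40 and over") for i in range(1, 31)]
sample = stratified_sample(population, stratum_of=lambda m: m[1],
                           per_stratum=4, seed=103)
for label, members in sample.items():
    print(label, members)
```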
In cluster sampling, we first divide the population area into sections (or clusters), then
randomly select some of those clusters, and then choose all the members from those
selected clusters. (We do not concern ourselves with this any further in this course.)
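Cluster sampling, again for reference only, can be sketched as choosing whole clusters at random and keeping everyone in them. The function name, the section labels, and the cluster sizes are assumptions for the example.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose some clusters, then take every member of the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    return [member for label in chosen for member in clusters[label]]

# Made-up clusters: six sections with five people each.
clusters = {
    f"section_{c}": [f"section_{c}_person_{i}" for i in range(1, 6)]
    for c in "ABCDEF"
}
sample = cluster_sample(clusters, number_of_clusters=2, seed=103)
print(sample)
```

The contrast with stratified sampling is that here only some of the groups are used, and every member of each chosen group is included.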
A sampling error is the difference between a sample result and the true population
result; such an error results from chance fluctuations.
A nonsampling error occurs when the sample data are incorrectly collected, recorded, or
analyzed (such as by selecting a biased sample, using a defective measurement
instrument, or copying the data incorrectly).
Example 2, Pg. 26, #9
Solution
(a) False, because the observational studies found that people who get lots of
vitamins by EATING VEGETABLES have lower death rates from colon
cancer and lung cancer. In contrast, the colon cancer experiment found no
difference in death rate between the control group and the treatment group. Also,
the lung cancer experiment found that the death rate increased for subjects who
took beta-carotene.
(b) True, because there may be some confounding factor other than eating fruits
and vegetables that is responsible for the lower death rate.
(c) False, because the treatment group differed from the control group only by
taking the vitamin supplements. It is not known whether they ate lots of fruit
and vegetables as part of their diet. Therefore, we cannot conclude that their
lifestyles were also different.
Example 3, Pg. 26, #10
Solution
(a) This was an observational study because the subjects (children) assigned
themselves to the groups being studied simply by their body fat. The
investigators were then able to study the relationships the children in each
group had with their mothers.
(b) Yes, the association is that young children with more body fat would have
more controlling mothers.
(c) Yes, if the mother’s controlling behavior causes the child to eat more, then
there is an association between her controlling behavior and the child’s body
fat.
(d) No. The gene is not related to the mother’s controlling behavior and therefore
is not a confounding factor.
(e) The association is that controlling mothers have children with more body fat.
An alternative way to explain the association is that the mother sees the child
overeating and tells the child not to eat.
(f) No. The Chronicle seems to have overreacted.