Experimental designs

The strongest of the
research designs
Categories of research
• Quantitative
– Involves numerical data that
result from taking
measurements on subjects
– Is objective
– Deductive reasoning
• Is used to test theories or ideas to determine
whether or not they are true
– The researcher is an objective observer
Categories of research (cont.)
• Qualitative
– Involves data derived from words
e.g., questionnaires or interviews
– Is subjective
– Inductive reasoning
• Reasoning based on observations which are used
to create an idea or theory
– The researcher is actively involved at times
Quantitative vs. qualitative
research
• Quantitative research employs the
scientific method and is usually regarded
as a higher level of evidence
– But may have limited relevance to clinical
practice because of strict methods
• Qualitative research often leads to
quantitative studies
• Both forms of research are important
Pragmatic and explanatory
research
• Pragmatic research
– Used to verify the effectiveness of treatments
• i.e., whether they work under real-life conditions
– Does not determine how or why the
treatments work
– Typically used to help make decisions about
the effectiveness of new treatments compared
with existing treatments
Pragmatic and explanatory
research (cont.)
• Explanatory research
– Used to establish the efficacy of treatments
• i.e., how they work under ideal conditions, as in a
controlled experiment
– Capable of answering questions about how
and why treatments work
– Strict methods involved are often very
different from day-to-day clinical practice
• Consequently, results may not be relevant to
practitioners
Pragmatic and explanatory
research (cont.)
– Patient selection is more strict in explanatory
studies
– Patients are excluded because of things like
co-morbid conditions, prior treatment, severity
of the condition, age, etc.
– This may be a disadvantage because it is not
known whether the treatment will work for
patients in everyday practice
• Patients commonly present with many of the
exclusion criteria
Descriptive, relational, and causal
research
• Descriptive (observational) research
– Observes and records various aspects of
participants in a study
– Descriptive statistics involved
• Relational research
– Considers relationships that may exist
between variables
– Correlation and regression
Descriptive, relational, and causal
research (cont.)
• Causal research
– Explores whether an intervention causes or
affects one or more outcome variables
– The most demanding type of research that
involves very detailed methods
– Looks for statistically significant differences
between groups
Experimental and quasi-experimental research
• Experimental research
– Random assignment to groups is involved
– Capable of determining cause-and-effect
relationships
• Quasi-experimental research
– No random assignment
– Provides much less evidence about cause-and-effect relationships
Experimental and quasi-experimental research (cont.)
• Non-experimental research
– Does not involve random assignment or even
a comparison group
– Merely involves the observation
of one group before and after
an intervention
Research design notation
• R – random assignment
• O – observation or measure
• X – treatment or intervention
• N – non-equivalent groups
• The classic experiment
– Randomization and 2 groups
R  O  X  O
R  O     O
(Each row represents a group; time runs from left to right)
Research designs
• A quasi-experiment
– 2 groups but no randomization
N  O  X  O
N  O     O
• Non-experiment
– Only 1 group
O  X  O
Population
• The units from which a sample is drawn
– May include people, but can also consist of
events or observations
• It is rarely possible to include each and
every unit of a population
– Instead, a smaller number of units (a sample)
is selected to represent the entire population
• Defined as a subset of observations from a
population
Samples
• Samples can permit inferences about what
is happening in a population based on
what is observed in a sample
• However, the sample must be
representative of the population
– Often achieved through random selection of
the sample units whereby each unit of the
population has an equal chance of being
selected
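Random selection can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 numbered patient records and the helper name simple_random_sample are assumptions made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return rng.sample(population, n)

# Hypothetical population: patient record numbers 1..1000
population = list(range(1, 1001))
sample = simple_random_sample(population, 50, seed=1)
print(len(sample), "units selected")
```

Because the draw is made without replacement, no unit can appear in the sample twice.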
Sample selection
A sample is selected
Samples (cont.)
• Population parameters that are estimated
from random samples are known as
unbiased estimates
• Random sampling is rarely employed in
clinical trials
– Patients are obtained using sequentially
presenting patients or recruiting through
advertisements
– Referred to as convenience sampling
Samples (cont.)
• Selection criteria in clinical trials
– Patients are usually included in a clinical trial
only if they meet certain criteria
– e.g., severity of the condition, no secondary
conditions, history, age, etc.
• It is important to consider features of the
population in a study when applying its
results to a specific patient
Random assignment
• Clinical trials often employ random
assignment (a.k.a., randomization)
– Refers to the way patients are assigned to
groups
• Used to make groups equivalent regarding
prognostic factors (e.g., pain levels)
– Sometimes called probabilistic equivalence
because there is still a chance the groups will
be different after randomization
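A minimal sketch of simple randomization, assuming 40 hypothetical patients with made-up baseline pain scores. The function name randomize is illustrative. The two group means usually come out close but not identical, which is the point of probabilistic equivalence.

```python
import random
from statistics import mean

def randomize(patients, seed=None):
    """Shuffle the patients and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = patients[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical baseline pain scores (0-10 scale) for 40 patients
score_rng = random.Random(0)
baseline_pain = [score_rng.randint(2, 9) for _ in range(40)]

treatment, control = randomize(baseline_pain, seed=42)
print("Mean baseline pain, treatment group:", mean(treatment))
print("Mean baseline pain, control group:  ", mean(control))
```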
Random assignment (cont.)
• Blocking
– Subjects are separated into homogeneous
subgroups based on factors such as age or
disease severity
– Enhances comparison because the
subgroups are more alike than the intact
groups
Random assignment (cont.)
• Stratified randomization
– Intact groups are separated into subgroups
based on prognostic factors
– e.g., trauma vs. non-trauma patients in a
whiplash study
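One way to picture stratified randomization is to randomize separately inside each subgroup. The sketch below assumes 30 hypothetical whiplash patients flagged as trauma or non-trauma. The record layout and the name stratified_randomize are assumptions for illustration only.

```python
import random
from collections import defaultdict

def stratified_randomize(patients, stratum_of, seed=None):
    """Randomize within each stratum so both arms get a similar mix."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[stratum_of(patient)].append(patient)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)        # random order within the stratum
        for i, patient in enumerate(members):
            # Alternate arms down the shuffled list to keep the stratum balanced
            assignment[patient["id"]] = "treatment" if i % 2 == 0 else "control"
    return assignment

# Hypothetical patients: every third one has a trauma history
patients = [{"id": i, "trauma": i % 3 == 0} for i in range(30)]
assignment = stratified_randomize(patients, lambda p: p["trauma"], seed=3)
```

With this approach the trauma and non-trauma patients are each split evenly between the arms, which simple randomization does not guarantee in small samples.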
Random assignment (cont.)
• Concealment
– Assignment is often concealed from
researchers to avoid the temptation of
allotting patients with certain traits to groups
that will receive special treatment
• When concealment is inadequate, the
apparent effects of the treatment may be
distorted as much as or more than the size
of the effect being investigated
Sample size determination
• Articles about clinical trials should discuss
why the number of subjects was chosen
• Ethically important
– Because no more subjects should be
inconvenienced or put at risk than required to
find a treatment effect
• Economically
– Extra resources required to include
unnecessary subjects
Sample size
determination (cont.)
• Too few subjects reduce the power of a
study, so a treatment effect may go
undetected when it actually is present
• Extremely large samples may show
statistically significant differences between
groups that are so small they are not
clinically important
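The trade-off above can be made concrete with a common normal-approximation formula for comparing two group means: n per group = 2 × ((z for alpha + z for power) × SD / difference)². The sketch below assumes a continuous outcome, equal group sizes, and equal standard deviations. The function name and the example numbers are illustrative, not taken from any particular trial.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate number of subjects per group for comparing two means.

    delta: smallest difference between groups considered clinically important
    sd:    expected standard deviation of the outcome (assumed equal in both groups)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

print(n_per_group(delta=1.0, sd=2.0))  # 63
print(n_per_group(delta=0.5, sd=2.0))  # a smaller difference needs more subjects
```

Halving the difference to be detected roughly quadruples the required sample, which is why very large trials can detect differences too small to matter clinically.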
The randomized
controlled trial (RCT)
• Regarded as the
ultimate research
design in health
care
• The classic
experiment
Placebo
• An inert substance or treatment
– Compared to the active substance or
treatment in RCTs
– Used in pharmaceutical trials to establish
whether an active drug is more effective than
a placebo
– The drug and placebo groups are compared
to determine if the drug resulted in a
statistically significant treatment effect
Sham
• A non-therapeutic intervention that imitates
the real treatment
– Similar to placebo, but refers to something
done rather than something taken
– Patients should have a very difficult time
telling the difference between a sham and the
real treatment
– A sham chiropractic manipulation is difficult to
produce
Treatment effect
• The result that a treatment
has on outcomes that is
attributable specifically to
the effect of the intervention
• The difference between the mean
outcomes observed in a treatment group
and a control group
Why patients improve
• Natural history
– Many acute and some chronic pain conditions
resolve on their own
• Actual effect of the treatment
• Nonspecific effects of the treatment
– Linked to the treatment, but actually due to
factors other than the active components of
the treatment
– Sometimes called placebo effects
Components of treatment
Effectiveness of a treatment
• Both the placebo and treatment groups
typically improve
• The difference between groups at the
conclusion of the study is what matters
• The treatment is considered effective if the
mean outcome of the treatment group is
significantly better than the placebo group
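A small numeric sketch of this comparison, using made-up improvement scores (reduction in pain on a 0-10 scale). Both groups improve, and the treatment effect is estimated as the difference between the group means. Whether that difference is statistically significant would still need a formal test, which is not shown here.

```python
from statistics import mean

# Hypothetical reductions in pain score for each patient
treatment_improvement = [3, 4, 2, 5, 3, 4]
placebo_improvement = [1, 2, 2, 1, 3, 2]

treatment_mean = mean(treatment_improvement)
placebo_mean = mean(placebo_improvement)

# Estimated treatment effect: improvement beyond what the placebo group showed
treatment_effect = treatment_mean - placebo_mean
print("Treatment group mean improvement:", treatment_mean)
print("Placebo group mean improvement:  ", round(placebo_mean, 2))
print("Estimated treatment effect:      ", round(treatment_effect, 2))
```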
Bias
• Systematic errors in a study that are
caused by problems with
– The selection or assignment of patients to
groups
– The measurements involved in the study
• Bias can render a study invalid, although
all studies have at least some bias
Hawthorne effect
• People tend to react differently when
participating in experiments
• Researchers found that the productivity of
workers increased when they knew they
were involved in a study
– True under a variety of conditions
– Even conditions that should have reduced
productivity
Hawthorne effect (cont.)
• Behavior was more influenced
by the attention researchers
gave to the subjects than the
effect of the interventions
• The Hawthorne effect is a factor in all
clinical studies
Types of bias
• Sampling bias (a.k.a. selection bias)
– During the selection process, each person
does not have an equal chance of being
selected from the source population
– Random selection is designed to take care of
this problem
– Results in systematic differences between
groups in experimental studies as to factors of
prognosis or response to treatment
Types of bias (cont.)
– Random assignment with concealment is the
best safeguard against selection bias in RCTs
– The effect of selection bias is reduced by
random assignment because it distributes the
bias evenly between the treatment and control
groups
Types of bias (cont.)
• Experimenter (researcher) bias
– Examining or treating doctors may influence a
study’s results because of their expectancies
or desires for a certain outcome
– Blinding (masking) of researchers and study
participants as to group assignment can
diminish the effect of this bias
– This bias can be divided into detection bias
and performance bias
Types of bias (cont.)
• Exclusion bias
– Occurs when patients who drop out of a study
are systematically different from subjects who
remain
• Perhaps dropouts were having a poor response to
treatment
• Would have changed the results of the study if
they had remained
Extraneous and confounding
variables
• In experiments, researchers are able to
manipulate the explanatory variables and
then watch what happens to the outcome
variable
• Internal validity
– The ability of an experiment to show that the
explanatory variables actually caused the
observed changes in the outcome variables
Extraneous and confounding
variables (cont.)
• Extraneous variables
– Uncontrolled factors that can influence the
relationship between variables in an
experiment
– They are not the variables that are being
studied, yet they affect the outcome of the
experiment
– They are unwanted because they create error
Extraneous and confounding
variables (cont.)
– Error variance due to extraneous variables is
distributed evenly between the groups when
random assignment is utilized
• Confounding variable
– A type of extraneous variable that affects the
groups being compared differently
• e.g., it affects the treatment group but not the
control group
– Introduces systematic error into the study
Extraneous and confounding
variables (cont.)
– The effect of a confounding variable cannot
be separated from the effect of the explanatory
variable on the outcome variable
Explanatory variable: e.g., manipulation
Confounding variable: e.g., groups receive manual vs. instrument manipulation
Outcome variable: e.g., low back pain
Extraneous and confounding
variables (cont.)
• Quasi-experimental designs are
particularly susceptible to confounding
because the individual differences of
subjects may act as confounding variables
• For example
– A quasi-experimental study that assigned
headache patients with more severe pain to
the treatment group
Threats to internal validity
• History
– Participants are unintentionally exposed to
some historical event during the research
project which affects the results
– For example
• A statewide fitness campaign that coincides with a
lower back pain study
• Some of the subjects doing the exercises would
likely affect the study’s outcome
Threats to
internal validity (cont.)
• Reliability of measures
– Unreliable measures can invalidate a study
– Possible causes
• Faulty equipment, inconsistent instructions to
study participants, unreliable training of
examiners, fatigue or boredom of examiners, or
examiners becoming more skilled at doing the
test
Threats to
internal validity (cont.)
• Mortality
– Subjects dropping out of studies
– Drop-outs may be different from those who
remain
– Occurs for a variety of reasons
• e.g., poor response to treatment, exceptional
response to treatment, adverse effects
– Groups may not be equivalent as a result
Threats to
internal validity (cont.)
• Maturation
– Changes that occur in study participants as
time passes that are not caused by the
explanatory variables
– e.g., in a study investigating strength in
children, they would most likely get stronger in
time, even without exposure to the
explanatory variables
Threats to
internal validity (cont.)
• Regression to the mean
– Extreme scores at the beginning of a study
that migrate toward the mean as time passes
– Occurs because extreme symptoms tend to
return to a more normal state on their own
• i.e., high initial patient scores are much more likely
to move toward normality than to go even higher
– Especially problematic when patients are
selected based on high test values, while
patients with low values are screened out
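Regression to the mean can be demonstrated with a short simulation. Everything here is hypothetical: each simulated patient has a stable true score plus random measurement noise, nobody receives any treatment, and only patients scoring 70 or more on the first measurement are "enrolled".

```python
import random
from statistics import mean

rng = random.Random(7)

# Stable underlying score for each simulated patient
true_scores = [rng.gauss(50, 10) for _ in range(5000)]

# Two measurements of the same patients, each with independent noise
first = [t + rng.gauss(0, 10) for t in true_scores]
second = [t + rng.gauss(0, 10) for t in true_scores]

# Enroll only patients with extreme (high) scores on the first measurement
selected = [i for i, score in enumerate(first) if score >= 70]

mean_first = mean(first[i] for i in selected)
mean_second = mean(second[i] for i in selected)
print("Enrolled patients:", len(selected))
print("Mean at enrollment:", round(mean_first, 1))
print("Mean at follow-up: ", round(mean_second, 1))
```

The follow-up mean moves back toward the population mean even though no intervention took place, which is why a comparison group is needed to separate this drift from a real treatment effect.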
Read and bring to class
• Hoiriis et al. A Randomized Clinical Trial
Comparing Chiropractic Adjustments
To Muscle Relaxants For Subacute Low
Back Pain. JMPT 2004;27:388-98
• Bakris, et al. Atlas vertebra realignment
and achievement of arterial pressure
goal in hypertensive patients: a pilot
study. J Hum Hypertens. 2007
May;21(5):347-52.
External validity
a.k.a., generalizability
• The extent to which the results of a study
are applicable to other populations, other
settings, and different circumstances
– Should be comparable regarding the
intervention, age, condition severity, etc.
• Relating to EBP – Are the results of a
study applicable to the management of a
particular patient?
External validity (cont.)
• Meade et al. study
– Office-based chiropractic care was compared
with hospital-based physical therapy for low
back pain
– Chiropractic was found to be superior, but
this may have been related to patients being
treated in private chiropractic offices versus
out-patient PT departments at hospitals
Internal validity vs.
external validity
Group Mean vs.
an Individual Patient
• An RCT only considers the average of a
group of subjects
• A given patient will NOT be average
– Each patient is unique in some way regarding
condition severity, secondary conditions,
response to care, etc.
• Each practitioner is unique with a whole
arsenal of treatment options
Research designs
• The pretest-posttest randomized
experimental design
– Is the classic experiment design mentioned
earlier
• The most commonly used design in
research
– Patients are randomized to treatment groups
which drastically reduces the chance of bias
Classic experiment
design (cont.)
– Subjects are evaluated before and after the
intervention so that pre-treatment differences
between groups can be considered
• Groups are rarely exactly equivalent
• The analysis of covariance (ANCOVA) test factors in
pretreatment differences between groups as a
covariate
– Use of a control group allows separation of
the active ingredient of the treatment effect
from non-specific components
ANCOVA test
• Factors in pretreatment differences between groups as a covariate
• Statistically removes the effect of
covariates from the analysis
• Other variables can also be “adjusted for”
using ANCOVA
– e.g., differences between groups regarding
age or condition severity
• Example report in journal article
– ... the effects of pre-treatment differences
were adjusted for during analysis
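The adjustment can be illustrated with the classical one-covariate formula: the raw difference in post-test means is corrected by the pooled within-group slope times the difference in pre-test means. This is a simplified sketch that assumes equal slopes in both groups and returns only the point estimate, not the significance test. The pain scores and the function name are made up for the example.

```python
from statistics import mean

def ancova_adjusted_difference(pre_t, post_t, pre_c, post_c):
    """Return (raw, adjusted) differences in post-test means, treatment minus control."""
    def cross_products(x, y):
        mx, my = mean(x), mean(y)
        return sum((a - mx) * (b - my) for a, b in zip(x, y))

    # Pooled within-group regression slope of post-test on pre-test
    slope = (cross_products(pre_t, post_t) + cross_products(pre_c, post_c)) / (
        cross_products(pre_t, pre_t) + cross_products(pre_c, pre_c)
    )
    raw = mean(post_t) - mean(post_c)
    adjusted = raw - slope * (mean(pre_t) - mean(pre_c))
    return raw, adjusted

# Hypothetical pain scores (lower is better); the treatment group started out worse
pre_treatment, post_treatment = [6, 7, 8, 9], [4, 5, 6, 7]
pre_control, post_control = [4, 5, 6, 7], [4, 5, 6, 7]

raw, adjusted = ancova_adjusted_difference(
    pre_treatment, post_treatment, pre_control, post_control
)
print("Raw difference:     ", raw)
print("Adjusted difference:", adjusted)
```

In this made-up data the raw post-test means are identical, yet the adjusted difference shows a 2-point benefit, because the treatment group began with more severe pain.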
Two-group
pretest-posttest design
• Comparison with an alternate form of
treatment
– e.g., a new therapy is compared to an
established therapy
– Cannot determine whether a new treatment
works better than no treatment
R  O  X1  O
R  O  X2  O
Post-test only randomized
controlled trial
• Groups cannot be compared after
randomization because no pretest is used
– It is a weaker design because of doubts about
the success of randomization
• Sometimes used when groups are large
– Large groups are much more likely to be
equivalent
R  X  O
R     O
Factorial design
• Often used when several explanatory
variables are involved in a study
• Useful to determine if any interaction
exists between the variables
• Explanatory variables are categorized as
– Factors (the major independent variables)
– Levels (subgroups)
Factorial design (cont.)
• Two factor by two level (2 X 2) factorial
design
X11  X12
X21  X22
Factorial design (cont.)
– Group 1 received Diversified technique and palpation
as the method of analysis
– Group 2 Gonstead and palpation
– Group 3 Diversified and x-ray
– Group 4 Gonstead and x-ray
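The four groups are simply every combination of the two factors, which can be enumerated in code. The interaction calculation below uses hypothetical mean improvements for each group. The numbers are invented purely to show the arithmetic.

```python
from itertools import product

techniques = ["Diversified", "Gonstead"]   # factor 1, two levels
analyses = ["palpation", "x-ray"]          # factor 2, two levels

# Number the groups in the same order as the list above
groups = {
    number: (technique, analysis)
    for number, (analysis, technique) in enumerate(product(analyses, techniques), start=1)
}
for number, (technique, analysis) in groups.items():
    print(f"Group {number}: {technique} and {analysis}")

def interaction(cell_means):
    """Difference between the technique effect under x-ray and under palpation."""
    effect_with_palpation = cell_means[2] - cell_means[1]
    effect_with_xray = cell_means[4] - cell_means[3]
    return effect_with_xray - effect_with_palpation

# Hypothetical mean improvement for groups 1-4
hypothetical_means = {1: 2.0, 2: 3.0, 3: 2.5, 4: 5.0}
print("Interaction:", interaction(hypothetical_means))
```

A value near zero would suggest the two factors act independently; a clearly non-zero value suggests the effect of one factor depends on the level of the other.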
Factorial design
notation
R  O  X11  O
R  O  X12  O
R  O  X21  O
R  O  X22  O
Crossover design
• Treatment is provided to one group, while
the other group receives a placebo or
alternate treatment
• Group assignments are switched at some
point in time without the doctors’ or
subjects’ knowledge
• Each group receives both the active
treatment and the alternate treatment
Crossover design (cont.)
• Each subject acts as their own control,
which can reduce the required sample size
considerably
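A rough illustration of why using each subject as their own control helps, with made-up pain scores for six patients measured after the active period and after the placebo period. Differences between patients cancel out when each patient's two scores are subtracted.

```python
from statistics import mean, stdev

# Hypothetical pain scores for the same six patients in each period
active = [3, 5, 2, 6, 4, 7]
placebo = [5, 7, 3, 8, 6, 8]

# Within-subject differences: each patient is compared with themselves
differences = [a - p for a, p in zip(active, placebo)]

print("Mean within-subject difference:", round(mean(differences), 2))
print("SD of within-subject differences:", round(stdev(differences), 2))
print("SD of active-period scores:      ", round(stdev(active), 2))
```

The differences vary far less than the raw scores do, so the same effect can be detected with fewer subjects than a parallel-group comparison would need.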
Crossover design (cont.)
Crossover design notation
R  O  X1  O  (optional washout period)  O  X2  O
R  O  X2  O  (optional washout period)  O  X1  O
Crossover design (cont.)
• Crossover design limitations
– Carry-over effects
• The therapeutic effects of the first intervention
continue during the second intervention
– High dropout rates
• Because there are 2 or more periods of treatment
• Dropouts harm the data analysis more than in
other designs because each patient serves as
their own control
Crossover design (cont.)
– Treatment sequencing
• Patients may respond differently when treatment 1
is given before treatment 2 than if the order is
reversed
– For example
• A chronic neck pain study where treatment 1 is
manipulation and treatment 2 is massage
• Results may be different if treatment 2 is provided
first because the massage may enable patients to
receive a better effect from the manipulation
Quasi-experimental designs
• Very similar to the randomized designs,
minus random assignment to groups
• The lack of randomization is a major factor
that makes claims about causality based on
quasi-experimental evidence doubtful
• On the other hand, a first-rate quasi-experiment
can generate stronger evidence than a poorly conducted RCT
Non-experimental designs
• Do not utilize randomization or a
comparison group
• Are not capable of determining the effect
of an intervention
• Includes
– Survey and observational research
– Case studies and case series
Non-experimental
designs (cont.)
• Non-experimental designs are low on the
evidentiary scale
– They are still quite valuable because they
describe unfamiliar occurrences and often
lead to more complex studies
• Pretreatment measures may be taken, but
usually only one measure is involved
X  O
Chiropractic interventions and
experimental methods
• Pharmaceutical experiments work well
– Because it is fairly easy to make an active pill
and an identical looking placebo pill
• Not so with chiropractic interventions
– It is difficult to deceive doctors and patients
– Sham adjustments are either so invasive they
become therapeutic or so dissimilar from
adjustments that patients know they are in the
placebo group
Chiropractic interventions and
experimental methods (cont.)
– Patients may actually receive a treatment
effect when sham adjustments are too
invasive
– Conversely, they may not receive a placebo
effect when they are aware of their inclusion
in the placebo group