
What you need to know to perform biofeedback effectively
Article 3. Why you need to know about research design and the
placebo effect
By Richard A. Sherman, Ph.D.
Biofeedback practitioners benefit by knowing how to evaluate clinical
research to assess whether a technique is efficacious. Common use of a
technique is not a reliable indication of its efficacy.
Balancing all the factors helps you, the professional, make your own
decision about whether a technique is efficacious or even worthy of a
trial.
Keeping an Open Mind
The Weight of Evidence
Many techniques currently in use will likely change or be modified in
the future as scientific discovery contributes to our knowledge base. More
studies supporting or detracting from the use of myriad techniques will be
published. New techniques for assessment and control will be promulgated.
As a biofeedback practitioner, you would benefit by having sufficient skills in
searching the literature and assessing its quality to determine whether it is
useful and clinically applicable.
Do you know how effective you are as a clinician?
A problem with the acceptance of biofeedback is concern about
unsubstantiated interventions that incorporate a variety of techniques
applied without a reasonable body of clinical evidence supporting their
efficacy.
The combination of 1) not enough properly designed studies with
adequate numbers of subjects and sufficiently long follow-ups to be
convincing and 2) the use of unsubstantiated techniques is a common
concern raised by those wary of biofeedback. Credible clinical studies are
necessary to support the use of specific biofeedback modalities and protocols
in the treatment of any disorder.
There are nine key elements of a credible clinical study/publication:
1) adequate diagnosis of the subjects.
2) adequate pretreatment baseline to establish symptom variability.
3) objective outcome measures relevant to the disorder.
4) intensity of the intervention sufficient to produce an effect.
5) a way to ascertain whether the intervention was successful (was the drug
taken properly, or the behavioral technique successfully learned and then
used?).
6) sufficient patients/subjects to make the results credible.
7) appropriate design for the question (e.g., single group, controls,
believable placebo, etc.).
8) sufficient descriptive statistics so the results are clear.
9) long enough follow-up so the duration of results can be established.
There are five criteria for accepting a new technique as efficacious.
Many organizations, such as the American Psychological Association (APA)
and AAPB, have adopted requirements such as the following for determining
that a treatment has been shown to be efficacious:
1) two studies with appropriate design and controls (group design or a
series of single case design studies).
2) studies conducted by different researchers.
3) studies demonstrate clinical efficacy. The new treatment must be shown
to be efficacious in comparison with medication, a placebo, or another
treatment, or it must be shown to be as effective as an established
treatment for the same problem.
4) waiting list controls are not sufficient for demonstrating efficacy. (No
expectation effect.)
5) the diagnostic criteria must be clearly specified.
Fatal flaws occur when the investigators are not sufficiently knowledgeable
about:
1) the disorder they are working with
2) their own recording techniques
3) selecting outcome measures that ask the right question
4) the basic elements of research design.
Look for these flaws when you read a study or hear about an idea, so you
can assess whether a technique is worth using.
One common design flaw is a failure to anchor the study outcome
measures to the population having the problem being investigated. A
good example is an early study on behavioral treatment of cancer. The
design compared a group therapy intervention to the records of similar
patients. The outcome measure was years of survival, with significance
determined by the difference in the average number of years of survival
between the groups. The investigators reported that the patients receiving group therapy
survived significantly longer than the control group and concluded group
therapy was probably the reason for the longer survival.
The investigators did not compare the survival rates of their tiny
groups to the huge database of similar cancer patients starting at the same
stages of the same type of cancer. In fact, 1) their groups were so small
that known variability in survival made it very likely that one group would
have an average survival time far longer than the other just by chance, and
2) their failure to review well-known life table data on survival times caused
them to miss the crucial point that the apparently longer survival of several
participants was not out of line with the overall patient population.
Sadly, the controls died earlier than would be expected, and numerous
studies have now shown behavioral interventions to be ineffective at
prolonging cancer survival.
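To see how easily tiny groups drawn from the same population can differ by chance alone, here is a minimal Python sketch. The group size and average survival are assumed numbers for illustration, not the original study's data; both simulated groups come from exactly the same survival distribution, so there is no treatment effect at all.

    import numpy as np

    rng = np.random.default_rng(0)
    n_per_group = 10           # assumed "tiny" group size
    n_replications = 10_000
    mean_survival_years = 3.0  # assumed average survival; purely illustrative

    diffs = []
    for _ in range(n_replications):
        # Both groups are drawn from the identical distribution: no treatment effect exists.
        group_a = rng.exponential(mean_survival_years, n_per_group)
        group_b = rng.exponential(mean_survival_years, n_per_group)
        diffs.append(abs(group_a.mean() - group_b.mean()))

    diffs = np.array(diffs)
    print(f"Median chance difference in average survival: {np.median(diffs):.2f} years")
    print(f"Replications where the averages differ by a year or more: {(diffs >= 1.0).mean():.0%}")

With groups this small, average survival routinely differs by a year or more purely by chance, so a raw difference in group means says very little by itself.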
Another commonly encountered problem is a failure to analyze the
data correctly because the investigators did not understand the outcome
measures or human variability. For example, typical ESP studies have
participants try to guess which one of five possible shapes — such as those
on the cards shown below — a “sender” is thinking about. There are usually
25 cards in a deck, so a participant has a one in five chance of guessing each
card correctly. If there is no ESP, the participant would be expected to
guess about five of the 25 cards correctly by chance, but, if he or she guesses
substantially more or fewer than five, that person may be declared to have ESP.
Basic probability tells us that, if enough people are tested, a few will guess
more or fewer cards correctly than chance predicts. Regression to the mean
tells us that a person who guesses unusually well or poorly on one test is
likely to guess at about chance level on the next test.
Unfortunately for proponents of ESP, this is just what happens nearly all the
time when such studies are correctly analyzed.
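Here is a minimal Python sketch of that point, using only the numbers given above (25 cards, a one-in-five chance per guess) and with no ESP anywhere in the simulation. The scoring threshold for "unusual" guessers is an assumption for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    n_people = 10_000

    # Each person guesses a 25-card deck twice; every guess is pure chance (p = 1/5).
    test1 = rng.binomial(25, 0.2, n_people)
    test2 = rng.binomial(25, 0.2, n_people)

    # People who scored well above the chance level of 5 on the first test.
    high_scorers = test1 >= 9
    print(f"People who looked like they had ESP on test 1: {high_scorers.sum()}")
    print(f"Their average score on test 1: {test1[high_scorers].mean():.1f}")
    print(f"Their average score on test 2: {test2[high_scorers].mean():.1f} (chance level is 5)")

The apparent "stars" of the first test score right at chance on the retest, exactly as regression to the mean predicts.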
It’s important to know the tools, equipment, and recording
methodology used in a study. An example of a flawed methodology from the
past is the use of alpha EEG biofeedback for anxiety. Alpha-frequency waves
look just like muscle tension signals from the eyes. The illustration below shows
filtered versions of a wave that could be either alpha EEG or electrooculogram
(EOG) activity. The early biofeedback devices couldn’t tell the
difference, and neither could some early clinicians, so all they did was teach
people to increase eye muscle tension by rolling up their eyes. The
methodology was flawed, but the treatments worked because people learned
to sit quietly in a dim room and were given effective relaxation exercises as
homework.
[Figure: a filtered waveform; "EOG" and alpha waves both look like this and are indistinguishable by eye.]
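As a simple illustration of why this matters, here is a minimal Python sketch using synthetic signals (it needs numpy and scipy; the sampling rate, filter, and signal models are assumptions for illustration, not recordings). A broadband EOG-like signal and a genuine 10 Hz rhythm both come out of an 8 – 13 Hz band-pass filter with their dominant frequency inside the alpha band, so a device that judges only the filtered amplitude cannot tell eye muscle activity from alpha EEG.

    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 256                       # sampling rate in Hz (assumed)
    t = np.arange(0, 2, 1 / fs)
    rng = np.random.default_rng(4)

    eog_like = rng.normal(0, 1, t.size)                                        # broadband "eye muscle" activity (synthetic)
    alpha_like = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.normal(0, 1, t.size)   # genuine 10 Hz rhythm (synthetic)

    # A simple alpha-band (8 - 13 Hz) filter of the sort such a device relies on.
    b, a = butter(4, [8, 13], btype="bandpass", fs=fs)
    filtered_eog = filtfilt(b, a, eog_like)
    filtered_alpha = filtfilt(b, a, alpha_like)

    def dominant_freq(x):
        spectrum = np.abs(np.fft.rfft(x))
        freqs = np.fft.rfftfreq(x.size, 1 / fs)
        return freqs[spectrum.argmax()]

    print(f"Dominant frequency of the filtered EOG-like signal:   {dominant_freq(filtered_eog):.1f} Hz")
    print(f"Dominant frequency of the filtered alpha-like signal: {dominant_freq(filtered_alpha):.1f} Hz")

Both filtered traces end up dominated by frequencies in the alpha band, which is exactly the ambiguity the early devices could not resolve.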
It’s also important to know the physiology and the disorder being
treated. One of my early hypertension studies showed that clinically
significant changes occurred as a result of habituation to the “biofeedback”
recording environment. The following illustration shows changes in blood
pressure over eight weeks when recordings were made for groups of
hypertensives seen either twice (blue circles) or eight times, the latter with
either no treatment (dotted line) or actual biofeedback treatment (solid line).
[Figure: blood pressure (BP) plotted against weekly sessions, weeks 1 – 8.]
Look at the chart below of finger temperature baseline instability. The
subjects were technicians who had never been exposed to biofeedback devices.
In this study they were gradually relaxing, so blood flow, and thus skin
temperature, was increasing. It looks like a learning curve, but the subjects
couldn’t see the display.
[Figure: finger temperature plotted against time in minutes.]
Here is an illustration of the difference between learning a
technique and changes in intensity of a clinical problem. The table
below shows the results of a clinical study that used biofeedback to treat a
disorder. The authors sent only the left column of the table in with their
paper for review and concluded biofeedback did not work for the disorder.
However, there was sufficient data in the paper to construct the middle and
right columns. The information in the middle column (subjects who learned
the skill) and the right column (subjects who did not learn the skill) puts the
original improvement data in an entirely different perspective. With this
additional data, we are able to conclude that while most subjects did not
learn the skill, most of those who did showed clinical improvement, and
most of those who did not showed no clinical improvement.
                                  Original Results   Subjects Who        Subjects Who Did
                                  of the Study       Learned the Skill   Not Learn the Skill
# Subjects Who Showed
  No Improvement                  30                 6                   24
# Subjects Who Improved           21                 18                  3
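Here is a minimal Python sketch that recomputes the improvement rates using only the counts from the table above; it shows that improvement tracks learning of the skill rather than simple enrollment in the study.

    # Counts taken from the table above.
    improved = {"learned the skill": 18, "did not learn the skill": 3}
    not_improved = {"learned the skill": 6, "did not learn the skill": 24}

    for group in improved:
        n = improved[group] + not_improved[group]
        print(f"{group}: {improved[group]} of {n} improved ({improved[group] / n:.0%})")

    total_improved = sum(improved.values())
    total = total_improved + sum(not_improved.values())
    print(f"all subjects: {total_improved} of {total} improved ({total_improved / total:.0%})")

Most subjects (27 of 51) never learned the skill, but 75% of those who did learn it improved, compared with about 11% of those who did not.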
Nonspecific/placebo effects are the bane of uncontrolled clinical
studies. They also help to explain why all the “new” treatments work. New
treatments are initially tested using the pretreatment baseline – intervention
– posttreatment baseline (A-B-A) design with no control group, so changes
resulting from natural fluctuations in the disorder’s intensity (as with acute
lower back pain) and from nonspecific/placebo effects are mistaken for
treatment effects. Consider the typical 30% placebo cure rate for headache.
Nonspecific effects include patient-therapist bonding, placebo effects,
changes with time, expectations, etc. Follow-ups are usually too short to
observe the placebo effects wearing off.
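Here is a minimal Python sketch of how an uncontrolled A-B-A study can make a completely inert intervention look respectable. The 30% placebo rate is the headache figure cited above; the rate of improvement from the passage of time alone is an assumption for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n_patients = 100
    placebo_response_rate = 0.30   # typical placebo cure rate cited above for headache
    time_improvement_rate = 0.15   # assumed improvement from natural fluctuation alone

    # The intervention itself does nothing in this simulation.
    placebo = rng.random(n_patients) < placebo_response_rate
    time_alone = rng.random(n_patients) < time_improvement_rate
    improved = placebo | time_alone

    print(f"Apparent response rate of an inert treatment in an A-B-A design: {improved.mean():.0%}")

On average, roughly 40% of the simulated patients improve even though the treatment does nothing; an uncontrolled design credits the treatment with all of it.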
A good placebo control includes the following:
1) treatment expectation effects.
2) placebo effect from the belief that the treatment can work or is working.
3) habituation to the treatment environment and sufficient duration to
elucidate changes with time.
4) good therapist-patient bonding with the therapist giving general support
and expectation that the treatment will work.
The following figure contrasts the results from a realistic placebo control
(blue) and an actual treatment (red). The hatch marks indicate variability
of responses.
Controls — especially realistic placebo controls — are critical
because of the placebo effect and changes in problems with time. As an
example, as long ago as 1976, Dohrman and Laskin (Journal of Dental
Research 55: 249) conducted a study of 24 patients diagnosed as having
jaw-area pain related to muscle tension. Eight patients were in a placebo
group, and 16 were given biofeedback. Three-quarters of the patients
treated with biofeedback showed “significant improvement of clinical
symptoms and required no further treatment.” This sounds good, but
unfortunately for people who believe that controls aren’t needed, half of the
controls had comparable results.
There was no long-term follow-up, so there was no opportunity to
know whether the placebo effect wore off (it usually lasts six months or so)
or whether the pain simply returned on its own. Elton offers an excellent
review of the placebo effect on pain (Elton et al: Psychological Control of
Pain. Grune & Stratton, NY, 1983).
An especially good article is by Finniss and Benedetti (Pain 114: 3 – 6,
2005), which discusses mechanisms of the placebo response and their
impact on clinical trials and clinical practice.
A good article demonstrating the power of the placebo response is one
by Price et al (Pain 127: 63 – 72, 2007), which shows that placebo analgesia
results in a large reduction in pain-related brain activity in patients with
irritable bowel syndrome. This is one of many studies beginning to come out
showing that changing how the brain processes pain changes pain perception,
regardless of the source of the pain or the method used to change the
brain’s processing.
Discover and Scientific American magazines have published several
articles on brain scans showing that placebos stop pain by changing the
processing of the signals (e.g., Epstein in Discover, page 26, Jan. 2006;
Choi in Scientific American, page 36, Nov. 2005; and Ruvin in Discover,
April 2006). For more on the brain and pain, you may want to look at Nicoll
and Alger’s article on the brain making its own pain relievers (Scientific
American, Dec. 2004).
Open studies don’t show that a technique is actually effective.
Single-subject and single-group designs are important for demonstrating that
a change in significant outcome measures, such as pain intensity or the ability
to walk farther, takes place from the beginning to the end of the study period;
when it does, it is worth progressing to a much more extensive, complex
design. Evaluation of a technique’s efficacy can’t stop at open studies,
because the change could just as easily be the result of time alone or of
placebo effects.
This is why single group studies indicating effectiveness of
interventions for low back pain such as (a) chiropractic manipulation and (b)
low back surgery have little to no credibility in the medical community. As
soon as a control for change with time is introduced, it turns out that
subjects receiving chiropractic do no better than those not receiving any
treatment. (Most of the comparison studies are “population change” based,
in which changes during a chiropractic study are compared with changes
expected in the population.) As soon as a comparison control with other
treatments is included, it turns out that surgery for low back pain is no
better than well-designed, intense behavioral and strengthening programs.
When both are compared with no intervention, their results are less than
impressive.
Tiny controlled studies of behavioral interventions frequently
have fatal flaws. First, waiting list control groups do not have
expectation/nonspecific/placebo effects. Also, the people on waiting lists
are frequently very different from those in the study: they have sometimes
turned down participation in the behavioral study, are too poor to make the
required trips, or are otherwise in different circumstances.
Another problem is that the typical 10-subject behavioral study is too
small to adequately represent true patterns in the general population of
people with the disorder. Such studies are also too small to avoid looking
good or bad simply because of unusual reactions by a few people who are
either very sensitive or very insensitive to the treatment. The final issue is
that follow-ups are generally, though not always, too short to observe placebo
effects wearing off.
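Here is a minimal Python sketch of that pitfall; every number in it is an assumption for illustration. The simulated treatment helps only a small minority of people strongly, so the true average change is modest, yet individual 10-subject studies swing widely in both directions.

    import numpy as np

    rng = np.random.default_rng(3)
    n_studies = 10_000
    n_subjects = 10

    study_means = []
    for _ in range(n_studies):
        # Assume ~10% of patients respond strongly (30-point improvement) and the rest
        # show only measurement noise, so the true population mean change is about 3 points.
        strong = rng.random(n_subjects) < 0.10
        change = np.where(strong, 30.0, 0.0) + rng.normal(0, 5, n_subjects)
        study_means.append(change.mean())

    study_means = np.array(study_means)
    print("True population mean change: about 3 points")
    print(f"10-subject studies reporting a mean change of 6+ points: {(study_means >= 6).mean():.0%}")
    print(f"10-subject studies reporting no improvement at all: {(study_means <= 0).mean():.0%}")

A substantial share of these tiny studies report at least double the true effect, and a similar share report no benefit at all, purely because a couple of unusual responders dominate the average.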
Look beyond the “weight of clinical experience” and thinking along
the lines of “if the technique is in use now, it must be efficacious.” History is
full of examples showing why that reasoning fails. When the idea of pre-surgery
handwashing was first extolled, it was considered a waste of time, certainly not
something that would make a difference in health care. It was accepted
only after a comparative controlled study of surgical survival.
Virtually all of the techniques that were in use around the time of World War
I are no longer used. At the time everybody knew they worked and laughed
at the new ideas. Nearly all of the drugs and techniques that everyone
supported during the World War II era are gone as well.
Many of today’s accepted techniques have never been subjected to
controlled study, so it’s virtually impossible to evaluate them. At least some
of the techniques that are labeled alternative and complementary today
will survive to become the standard techniques of tomorrow, and we will look
back and laugh at techniques that we once swore by. With better education
of clinicians, more audits by record-keeping agencies, and enforcement of
the law, more techniques will need to be proven effective before they reach
wide use.
Keeping credibility in mind, here are five positive signs to look for as
you listen to discussions of alternative techniques:
1) numerous articles by different authors supporting use of a technique.
2) many articles with good, realistic placebo controls.
3) double-blind studies, not single-blind, with evaluations done by a neutral
team.
4) patients who are randomly assigned to the alternative technique (not
people who show up wanting it).
5) articles published in mainstream journals with high reputations
(determined by citation scores, etc.) rather than only in a journal supported
by practitioners of that technique.
This article has offered some helpful hints for your current practice.
One note on typical clinical research courses: Most are designed for
biologists or people performing psychological studies. They don’t
concentrate on how to recognize or perform good studies in the biofeedback
clinical environment. Be sure to take a course relevant to your interests and
applicable to clinical biofeedback practice.
This article has provided a rationale in support of biofeedback practitioners
having a working understanding of research methodology, experimental
design, and the placebo effect. The author, Richard Sherman, Ph.D., is a
Past President of AAPB and teaches basic science and biofeedback training
courses. He can be reached at rsherman@nwinet.com.