Standards of Evidence for
Substance Abuse (SU)
Prevention
Brian R. Flay, D.Phil.
Distinguished Professor
Public Health and Psychology
University of Illinois at Chicago
and
Chair, Standards Committee
Society for Prevention Research
Prepared for Addiction Studies Workshop for Journalists
New York, January 11, 2005
What is SU Prevention?
• Programs or other interventions designed to prevent
youth from becoming users or abusers of alcohol,
tobacco and other drugs (ATOD) and to prevent the
long-term health and social consequences of SU.
• Types of programs:
– School-based educational programs
• E.g., Life Skills Training, DARE
– Mass media campaigns
• E.g., the Office of National Drug Control Policy (ONDCP) media campaign
– Community-based programs
• E.g., Midwest Prevention Project (Project STAR)
• Age:
– Most programs target middle school
– Some target high school
– More and more target elementary school and K-12
Approaches to SU Prevention:
Historical View
1. Information
2. Scare tactics
3. Affective education
4. Resistance Skills
5. Social/Life Skills
6. Comprehensive
Information and Fear
• Early approaches relied on information
– “If only they knew (the consequences of what they
are doing), they wouldn’t do it.”
• Some used fear tactics
– Show consequences, e.g., black lungs
– “Reefer Madness”
– Some of these had incorrect information
– All were misleading to some degree
• For example, not showing benefits of use
• Neither approach reduces SU
– Some informational approaches improve knowledge
of consequences, but they do not reduce SU
– Indeed, some caused negative effects – greater SU.
Affective (Feelings-based)
Approaches
• Values Clarification
– Techniques for thinking about values were taught.
– Tied to decision-making skills
• Decision-Making Skills
– Think about alternative behaviors
– Consider the positive and negative outcomes of each
– Some kind of weighting scheme
• Approaches were not effective alone
– Values were not taught
– DM skills improved, but SU not reduced
– DM just another way of using information
Resistance Skills
• Developed in the late 1970s through the 1980s
• Just Say “No”
– Simplistic approach was popularized by Nancy
Reagan
• Resistance Skills much more complex
– Many different ways of saying “no”
– Many other considerations
• Peer pressure – direct or indirect
• Consequences of different ways of resisting pressures
• Approach reduced SU in small-scale studies
– Then some researchers thought they had a program
– They just had one important (maybe) component for
a more comprehensive effective program
Life Skills Training
• Developed by Gilbert Botvin of Cornell, 1980
• More comprehensive than prior programs
• 30 sessions over grades 7-9
• Components:
– Information about short- and long-term
consequences, prevalence rates, social acceptability,
habit development, addiction
– Decision-making, independent thinking
– Social influences and persuasive tactics
– Skills for coping with anxiety
– Social skills – communication, assertiveness, refusal
• Effectively reduced SU over multiple studies,
including one study of long-term effects
(grade 12)
Getting More Comprehensive
• Adding small media to school-based programs
– Video, Newsletters to engage parents
• Using mass media
– Television
• Multiple tests have failed, but well-designed programs
based on theory and careful developmental research can
be effective (Vermont study: Worden, Flynn)
• Incorporating community components
– E.g., Midwest Prevention Project
• Difficult, expensive, and long-term effects not clear
• In general, adding mass media, family or
community components improves effectiveness
(Flay 2000)
• Addressing multiple behaviors also found to be
more effective (Flay, 2002)
Why a Comprehensive
Approach is Necessary
• Multiple influences on (i.e., causes of) behavior,
especially its development
– Biology
• Genetic predispositions
– Social contexts
• Parents, friends, others
– The broad socio-cultural environment
• Religion, politics, economics, justice, education
• All of these have distal (causally distant) and
proximal (causally close) qualities
• The Theory of Triadic Influence incorporates all of
these, drawing on multiple theories
The Theory of Triadic Influence
[Figure: “The Basics of the Theory of Triadic Influence.” Three streams of influence – biology and personality (DNA, sense of self, self-determination, social competence, social skills, self-efficacy), the social context (others’ behaviors and attitudes, bonding, motivation to comply, perceived norms, social normative beliefs), and the cultural environment (culture, religion, the informational environment, knowledge, expectancies, values, evaluations, attitudes) – run from distal through intermediate to proximal causes, converging on decisions/intentions and, ultimately, behavior.]
Example of the Application of TTI to Program Content
Effects of Aban Aya Program on Reducing SU Onset
[Figure: Substance use among boys, grades 5-8. Proportion reporting any substance use (y-axis, roughly 0.3 to 0.9) plotted by grade (5 through 8) for three conditions: Control (highest), School Only, and School + Community (lowest). Proportions derived from a proportional odds growth model.]
Pressure For Programs of
“Proven Effectiveness”
• The Federal Government increasingly
requires that Federal money be spent
only on programs of “proven
effectiveness”
– Center for Substance Abuse Prevention
(CSAP) in the Substance Abuse and Mental
Health Services Administration (SAMHSA)
– U.S. Department of Education (DE)
– Office of Juvenile Justice and Delinquency
Prevention (OJJDP)
What is “Proven Effectiveness”?
• Requires rigorous research methods to
determine that observed effects were
caused by the program being tested,
rather than some other cause.
• E.g., the Food and Drug Administration
(FDA) requires at least two randomized
controlled trials (RCTs) before approving
a new drug.
• RCTs are always expensive, and there
are many challenges to conducting RCTs,
especially in schools.
Standards of Evidence
• Each government agency and academic
group that has reviewed programs for
lists has its own set of standards.
• They are all similar but not identical
– E.g., CSAP’s criteria admit more studies than DE’s
• All concern the rigor of the research
• The Society for Prevention Research
recently created standards for the field
• Our innovation was to consider standards
for efficacy, effectiveness and
dissemination
Members of the SPR Standards Committee
• Brian R. Flay (Chair), D.Phil., U of Illinois at Chicago
• Anthony Biglan, Ph.D., Oregon Research Institute
• Robert F. Boruch, Ph.D., U of Pennsylvania
• Felipe G. Castro, Ph.D., MPH, Arizona State U
• Denise Gottfredson, Ph.D., U of Maryland
• Sheppard Kellam, M.D., AIR
• Eve K. Moscicki, Sc.D., MPH, NIMH
• Steven Schinke, Ph.D., Columbia U
• Jeff Valentine, Ph.D., Duke University
• With help from Peter Ji, Ph.D., U of Illinois at Chicago
Standards for 3 levels: Efficacy,
Effectiveness and Dissemination
• Efficacy
– What effects can the intervention have under ideal
conditions?
• Effectiveness
– What effects does the intervention have under
real-world conditions?
• Dissemination
– Is an effective intervention ready for broad
application or distribution?
• Desirable
– Additional criteria that provide added value to
evaluated interventions
Overlapping Standards
• Efficacy Standards are basic
– Required at all 3 levels
• Effectiveness Standards include all
Efficacy Standards plus others
• Dissemination standards include all
Efficacy and Effectiveness
Standards plus others
Four Kinds of Validity
(Cook & Campbell, 1979; Shadish, Cook & Campbell, 2002)
• Construct validity
– Program description and measures of outcomes
• Internal validity
– Was the intervention the cause of the change in
the outcomes?
• External validity (Generalizability)
– Was the intervention tested on relevant
participants and in relevant settings?
• Statistical validity
– Can accurate effect sizes be derived from the
study?
Specificity of Efficacy Statement
• “Program X is efficacious for
producing Y outcomes for Z
population.”
– The program (or policy, treatment,
strategy) is named and described
– The outcomes for which effects are claimed are
clearly stated
– The population to which the claim can
be generalized is clearly defined
Program Description
• Efficacy
– Intervention must be described at a level that
would allow others to implement or replicate it
• Effectiveness
– Manuals, training and technical support must be
available
– The intervention should be delivered under the
same kinds of conditions as one would expect in the
real world
– A clear theory of causal mechanisms should be
stated
– Clear statement of “for whom?” and “under what
conditions?” the intervention is expected to work
• Dissemination
– Provider must have the ability to “go-to-scale”
Program Outcomes
• ALL
– Claimed public health or behavioral
outcome(s) must be measured
• Attitudes or intentions cannot substitute
for actual behavior
– At least one long-term follow-up is
required
• The appropriate interval may vary by type
of intervention and state-of-the-field
Measures
• Efficacy
– Psychometrically sound
• Valid
• Reliable (internal consistency, test-retest or interrater reliability)
• Data collectors independent of the intervention
• Effectiveness
– Implementation and exposure must be measured
• Level and Integrity (quality) of implementation
• Acceptance/compliance/adherence/involvement of
target audience in the intervention
• Dissemination
– Monitoring and evaluation tools available
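
As one concrete illustration of the “internal consistency” criterion above, Cronbach’s alpha can be computed directly from item-level responses. The sketch below is generic Python; the response matrix is invented for illustration and is not data from any study discussed here.

```python
# Illustrative only: Cronbach's alpha as one index of internal consistency.
# The response matrix below is invented; real data would come from the scale under study.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Rows = respondents, columns = items of a hypothetical 4-item scale
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```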
Desirable Standards for
Measures
• For ALL Levels
– Multiple measures
– Mediating variables (or immediate
effects)
– Moderating variables
– Potential side-effects
– Potential iatrogenic (negative)
effects
Design – for Causal Clarity
• At least one comparison group
– No-treatment, usual care, placebo or wait-list
• Assignment to conditions must maximize
causal clarity
– Random assignment is “the gold standard”
– Other acceptable designs
• Repeated time-series designs
• Regression-discontinuity
• Well-done matched controls
– Demonstrated pretest equivalence on multiple measures
– Known selection mechanism
Level of Randomization
• In many drug and medical trials,
individuals are randomly assigned
• In educational trials, classrooms or
schools must be the unit of assignment
– Students within classes/schools are not
statistically independent -- they are more
alike than students in other classes/schools
• Need large studies
– 4 or more schools per condition, preferably
10 or more, in order to have adequate
statistical power
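
The clustering point above has a standard quantitative form: the “design effect,” 1 + (m − 1) × ICC, where m is the number of students per school and ICC is the intraclass correlation. The Python sketch below uses assumed values for the ICC and school size purely to show why adding schools, rather than students within schools, is what buys statistical power.

```python
# Illustrative only: the "design effect" for school-randomized trials.
# The ICC and school size are assumed values, not estimates from any study cited here.

def design_effect(icc: float, cluster_size: float) -> float:
    """Variance inflation from assigning intact schools: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_n(total_students: int, icc: float, cluster_size: float) -> float:
    """Number of independent observations the clustered sample is roughly worth."""
    return total_students / design_effect(icc, cluster_size)

icc = 0.02                 # assumed school-level intraclass correlation
students_per_school = 100  # assumed school size
for schools_per_condition in (4, 10, 20):
    n = 2 * schools_per_condition * students_per_school  # two conditions
    print(f"{schools_per_condition:2d} schools/condition: {n} students, "
          f"effective n = {effective_n(n, icc, students_per_school):.0f}")
```

Even a small ICC inflates the variance nearly threefold under these assumed values, which is why four schools per condition is a floor and ten or more is preferable.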
Generalizability of Findings
• Efficacy
– Sample is defined
• Who it is (from what “defined” population)
• How it was obtained (sampling methods)
• Effectiveness
– Description of real-world target population
and sampling methods
– Degree of generalizability should be
evaluated
• Desirable
– Subgroup analyses
– Dosage studies/analyses
– Replication with different populations
– Replication with different program providers
Precision of Outcomes:
Statistical Analysis
• Statistical analysis allows unambiguous
causal statements
– At same level as randomization and includes
all cases assigned to conditions
– Tests for pretest differences
– Adjustments for multiple comparisons
– Analyses of (and adjustments for) attrition
• Rates, patterns and types
• Desirable
– Report extent and patterns of missing data
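
For a school-randomized trial, one common way to analyze “at the same level as randomization” is a mixed-effects model with a random intercept for school (or, more crudely, a comparison of school means). The sketch below is a generic illustration using statsmodels; the column names (outcome, condition, pretest, school) are hypothetical placeholders, not variables from any particular study.

```python
# Illustrative only: analyzing a school-randomized trial at the level of randomization.
# Column names (outcome, condition, pretest, school) are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

def fit_school_adjusted_model(df: pd.DataFrame):
    """Mixed-effects model: fixed effects for condition and pretest,
    random intercept for school, so students are not treated as independent."""
    model = smf.mixedlm("outcome ~ condition + pretest", data=df, groups=df["school"])
    return model.fit()

def school_means(df: pd.DataFrame) -> pd.DataFrame:
    """Cruder alternative: collapse students to school means, the unit that was randomized."""
    return df.groupby(["school", "condition"], as_index=False)["outcome"].mean()

# Usage (with an intent-to-treat data frame containing every case assigned to a condition):
# results = fit_school_adjusted_model(df)
# print(results.summary())
```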
Precision of Outcomes:
Statistical Significance
• Statistically significant effects
– Results must be reported for all measured
outcomes
– Efficacy can be claimed only for constructs
with a consistent pattern of statistically
significant positive effects
– There must be no statistically significant
negative (iatrogenic) effects on important
outcomes
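
When many outcomes are measured, the multiple-comparisons adjustment mentioned on the previous slide is often handled with a procedure such as Holm’s step-down method. In the sketch below, the outcome names and p-values are invented solely to show the mechanics.

```python
# Illustrative only: adjusting significance tests when many outcomes are measured.
# Outcome names and p-values are invented to show the mechanics of Holm's procedure.
from statsmodels.stats.multitest import multipletests

p_values = {
    "cigarette_use": 0.004,
    "alcohol_use": 0.030,
    "marijuana_use": 0.060,
    "behavioral_intentions": 0.200,
}

reject, p_adjusted, _, _ = multipletests(list(p_values.values()), alpha=0.05, method="holm")

for (outcome, p_raw), p_adj, significant in zip(p_values.items(), p_adjusted, reject):
    print(f"{outcome:22s} raw p = {p_raw:.3f}  adjusted p = {p_adj:.3f}  significant: {significant}")
```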
Precision of Outcomes:
Practical Value
• Efficacy
– Demonstrated practical significance in terms of
public health (or other relevant) impact
– Report of effects for at least one follow-up
• Effectiveness
– Report empirical evidence of practical importance
• Dissemination
– Clear cost information available
• Desirable
– Cost-effectiveness or cost-benefit analyses
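
As a back-of-the-envelope example of what “clear cost information” makes possible, the sketch below derives a cost-per-case-prevented figure. Every number in it is hypothetical.

```python
# Hypothetical cost-effectiveness arithmetic; every number is invented for illustration.
cost_per_student = 150.00        # assumed program cost per student
control_onset_rate = 0.30        # assumed proportion initiating SU without the program
program_onset_rate = 0.24        # assumed proportion initiating SU with the program

cases_prevented_per_student = control_onset_rate - program_onset_rate   # 0.06
cost_per_case_prevented = cost_per_student / cases_prevented_per_student

print(f"Cost per case of SU onset prevented: ${cost_per_case_prevented:,.0f}")  # $2,500
```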
Precision of Outcomes:
Replication
• Consistent findings from at least two
different high-quality studies/replicates that
meet all of the other criteria for efficacy and
each of which has adequate statistical power
– Flexibility may be required in the application of this
standard in some substantive areas
• When more than 2 studies are available, the
preponderance of evidence must be consistent
with that from the 2 most rigorous studies
• Desirable
– The more replications the better
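
One common way to weigh the “preponderance of evidence” across replications is a fixed-effect, inverse-variance-weighted pooled estimate, as in a meta-analysis. The effect sizes and standard errors below are hypothetical and are shown only to illustrate the calculation.

```python
# Illustrative only: fixed-effect (inverse-variance) pooling of effects across replications.
# The effect sizes (e.g., log odds ratios) and standard errors are hypothetical.
import math

studies = [   # (effect_size, standard_error)
    (-0.30, 0.10),
    (-0.22, 0.12),
    (-0.05, 0.20),
]

weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect = {pooled:.2f} (SE = {pooled_se:.2f}), "
      f"95% CI [{pooled - 1.96 * pooled_se:.2f}, {pooled + 1.96 * pooled_se:.2f}]")
```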
Additional Desirable Criteria
for Dissemination
• Organizations that choose to adopt a
prevention program that barely or not quite
meets all criteria should seriously consider
undertaking a replication study as part of
the adoption effort so as to add to the body
of knowledge.
• A clear statement of the factors that are
expected to assure the sustainability of the
program once it is implemented.
Embedded Standards
• Efficacy: 20
• Effectiveness: 28
• Dissemination: 31
• Desirable: 43
What Programs Come Close to
Meeting These Standards?
• Life Skills Training (Botvin)
– Multiple RCTs with different populations,
implementers and types of training
– Only one long-term follow-up
– Independent replications of short-term
effects are now appearing (as well as some
failures)
– No independent replications of long-term
effects yet
Comprehensive Programs?
• No comprehensive programs yet meet the
standards
• Some have one study that meets most, but
without long-term outcomes or replications
(e.g., Project STAR)
• Most promising comprehensive program is
Positive Action
– K-12 Curriculum, Teacher training, school climate
change, family and community involvement
– Improves both behavior (SU, violence, etc.) and
academic achievement.
– Well-designed matched controlled studies
– Randomized trials currently in progress
Programs for Which the Research Meets
the Standards, but Which Do Not Work
• DARE
– Many quasi-experimental and non-experimental
studies suggested effectiveness
– Multiple RCTs found no effects (Ennett et al.,
1994 meta-analysis)
• Hutchinson Smoking Prevention Project (Peterson et al., 2000)
– Well-designed RCT
– Published results found no long-term effects
– But no published information on the program or
short-term effects
– Published findings cannot be interpreted because
of lack of information – they certainly cannot be
interpreted to suggest that social influences
approaches can never have long-term effects
How you can use the Standards
when Questioning Public Officials
• Has the program been evaluated in a
randomized controlled trial (RCT)?
• Were classrooms or schools randomized
to program and control (no program or
alternative program) conditions?
• Has the program been evaluated on
populations like yours?
• Have the findings been replicated?
• Were the evaluators independent from
the program developers?