Modeling “The Cause”:
Assessing Implementation
Fidelity and Achieved Relative
Strength in RCTs
David S. Cordray
Vanderbilt University
IES/NCER Summer Research Training
Institute, 2010
Overview
• Define implementation fidelity and achieved relative strength
• A 4-step approach to assessment and analysis of implementation
fidelity (IF) and achieved relative strength (ARS):
– Model(s)-based specification
– Quality measures of core causal components
– Creating indices
– Integrating implementation assessments with models of effects (next
week)
• Strategy:
– Describe each step
– Illustrate with an example
– Planned Q&A segments
Caveat and Precondition
• Caveat
– The black box (ITT model) is still #1 priority
• Implementation fidelity and achieved relative
strength are supplemental to ITT-based results.
• Precondition
– But, we consider implementation fidelity in
RCTs that are conducted on mature (enough)
interventions
– That is, the intervention is stable enough to
describe an underlying—
• Model/theory of change, and
• Operational (logic and context) models.
Dimensions of Intervention Fidelity
• Operative definitions:
– True Fidelity = Adherence or compliance:
• Program components are delivered/used/received, as
prescribed
• With stated criteria for success or full adherence
• The specification of these criteria is relatively rare
– Intervention Exposure:
• Amount of program content, processes, activities
delivered/received by all participants (aka: receipt,
responsiveness)
• This notion is most prevalent
– Intervention Differentiation:
• The unique features of the intervention are distinguishable
from other programs, including the control condition
• A unique application within RCTs
Linking Intervention Fidelity Assessment to
Contemporary Models of Causality
• Rubin’s Causal Model:
– True causal effect of X for unit i is (Y_i,Tx – Y_i,C)
– In RCTs, the difference between outcomes, on
average, is the causal effect
• Fidelity assessment within RCTs also entails
examining the difference between causal
components in the intervention and control
conditions.
• Differencing causal conditions can be
characterized as achieved relative strength
of the contrast.
– Achieved Relative Strength (ARS) = t_Tx – t_C
– ARS is a default index of fidelity
Treatment Strength
[Figure: Treatment strength (left axis, .00–.45) plotted against outcomes (right axis, 50–100). Infidelity degrades the planned treatment strength T_Tx = .40 to the achieved strength t_Tx, and "infidelity" in the control raises the planned T_C = .15 to the achieved t_C. Correspondingly, the outcomes under full fidelity, Y_T = 90 and Y_C = 65, become the achieved outcomes Y_t = 85 and Y_c = 70.]

Expected Relative Strength = (0.40 – 0.15) = 0.25
Achieved Relative Strength = 0.15

d with fidelity = (Y_T – Y_C) / sd_pooled = (90 – 65) / 30 = 0.83

d as achieved = (Y_t – Y_c) / sd_pooled = (85 – 70) / 30 = 0.50
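A minimal sketch of the two effect sizes above, using the slide's illustrative numbers (sd_pooled = 30 is taken from the example):

def standardized_difference(y_tx_mean, y_c_mean, sd_pooled):
    """Standardized mean difference (Cohen's d) between two conditions."""
    return (y_tx_mean - y_c_mean) / sd_pooled

# Under full fidelity (planned T_Tx vs. T_C):
d_with_fidelity = standardized_difference(90, 65, 30)  # 0.83

# As achieved (t_Tx vs. t_C, after infidelity in both arms):
d_achieved = standardized_difference(85, 70, 30)       # 0.50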
Why is this Important?
• Statistical conclusion validity
• Construct Validity:
– Which is the cause? (T_Tx – T_C) or (t_Tx – t_C)
• Poor implementation: essential elements of the treatment are
incompletely implemented.
• Contamination: The essential elements of the treatment group are
found in the control condition (to varying degrees).
• Pre-existing similarities between T and C on intervention
components.
• External validity – generalization is about (t_Tx – t_C)
– This difference needs to be known for proper
generalization and future specification of the
intervention components
So what is the cause? …The achieved relative
difference in conditions across components
[Figure: Three bar-chart panels on a 0–10 scale of component strength. Panel 1 ("Infidelity"): planned vs. observed treatment levels of PD and Asmt. Panel 2 ("Augmentation of Control"): planned vs. observed control levels of Asmt and Diff Inst. Panel 3: the component differences in theory vs. as observed for PD and Diff Inst. Legend: PD = Professional Development; Asmt = Formative Assessment; Diff Inst = Differentiated Instruction.]
Time-out for Questions
In Practice….
• Step 1: Identify core components in the
intervention group
– e.g., via a Model of Change
– Establish benchmarks (if possible) for T_Tx and T_C
• Step 2: Measure core components to derive
t_Tx and t_C
– e.g., via a “Logic model” based on the Model of
Change
• Step 3: Derive indicators
• Step 4: Incorporate the IF and ARS indices
into the analysis of effects
Focused assessment is needed
What are the options?
(1) Essential or core components
(activities, processes);
(2) Necessary, but not unique, activities,
processes and structures (supporting the
essential components of T); and
(3) Ordinary features of the setting
(shared with the control group)
• Focus on 1 and 2.
Step 1: Specifying Intervention
Models
• Simple version of the question: What was
intended?
• Interventions are generally multi-component,
sequences of actions
• Mature-enough interventions are specifiable as:
– Conceptual model of change
– Intervention-specific model
– Context-specific model
• Start with a specific example: the MAP RCT
Example – The Measures of
Academic Progress (MAP) RCT
• The Northwest Evaluation Association
(NWEA) developed the Measures of
Academic Progress (MAP) program to
enhance student achievement
• Used in 2000+ school districts, 17,500
schools
• No evidence of efficacy or effectiveness
Measures of Academic Progress
(MAP): Model of Change
MAP Intervention components:
– 4 days of training
– On-demand consultation
– Formative testing
– Student reports
– On-line resources

Causal chain: MAP Intervention → Formative Assessment → Differentiated Instruction → Achievement

Implementation issues:
– Delivery: NWEA trainers
– Receipt: teachers and school leaders
– Enactment: teachers
– Outcomes: students
Logic Model for MAP
Focus of implementation fidelity and achieved
relative strength
Resources → Activities → Outputs → Outcomes → Impacts

– Resources: testing system, NWEA trainers, NWEA consultants, on-line teaching resources
– Activities: 4 training sessions, follow-up consultation, MAP tests, accessing resources
– Outputs: multiple assessment reports
– Outcomes: use of formative assessment; differentiated instruction
– Impacts: improved student achievement (state tests)

Program-specific implementation fidelity assessments: MAP only.
Comparative implementation assessments: MAP and non-MAP classes.
Context-Specific Model: MAP
[Figure: Academic-schedule timeline of major MAP program components and activities. Fall semester (Aug–Dec): PD1–PD3, each followed by consultation. Spring semester (Jan–May): PD4 with consultation, differentiated instruction, changes in data-system use, and state testing.]

Two points:
1. This tells us when assessments should be undertaken; and
2. It provides a basis for determining the length of the
intervention study and the ultimate RCT design.
Step 2: Quality Measures of Core
Components
• Measures of resources, activities, outputs
• Range from simple counts to sophisticated
scaling of constructs
• Generally involves multiple methods
• Multiple indicators for each major
component/activity
• Reliable scales (3-4 items per sub-scale)
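As an illustration of the reliability check (not part of the original slides), a minimal Cronbach's alpha computation for a short subscale; the 3-item score matrix below is hypothetical:

import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 3-item fidelity subscale rated by 5 observers (1-5 scale)
scores = np.array([[4, 5, 4],
                   [3, 4, 4],
                   [5, 5, 5],
                   [2, 3, 3],
                   [4, 4, 5]])
print(round(cronbach_alpha(scores), 2))  # ~0.92; >= .70 is conventionally acceptable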
Measuring Program-Specific
Components
(Read as rows: MAP resource → MAP activity, with the fidelity criterion and data source for each.)

– Testing system; NWEA trainers → 4 training sessions
  Criterion: attendance | Source or method: MAP records
– NWEA consultants → Follow-up consultation
  Criterion: present or absent | Source or method: MAP records
– On-line teaching resources → Access resources
  Criterion: use | Source or method: web records
– MAP outputs: multiple assessment reports
Measuring Outputs: Both MAP and
Non-MAP Conditions
MAP outputs and measurement plan:

– Use of formative assessment data
  Method: end-of-year teacher survey
– Differentiated instruction
  Methods: end-of-year teacher survey; observations (3); teacher logs (10)
  Indices: difference in differentiated instruction (high- vs. low-readiness
  students); proportion of observation segments with any differentiated
  instruction
  Criterion: achieved relative strength
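A minimal sketch of computing the two indices above from hypothetical observation and log data (all values invented for illustration):

import numpy as np

# Hypothetical observation records: one entry per observed classroom segment,
# 1 if any differentiated instruction (DI) was seen in that segment, else 0
segments_with_di = np.array([1, 0, 0, 1, 1, 0, 0, 0, 1, 0])

# Index 1: proportion of observation segments with any DI
prop_di = segments_with_di.mean()  # 0.4

# Hypothetical teacher-log ratings (0-10) of DI delivered to high- vs.
# low-readiness students, one pair per logged lesson
di_high = np.array([6, 7, 5, 8, 6])
di_low = np.array([3, 4, 4, 5, 3])

# Index 2: mean difference in DI between high- and low-readiness students
di_difference = (di_high - di_low).mean()  # 2.6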
Fidelity and ARS Assessment Plan for
the MAP Program
Step 3: Indexing Fidelity and
Achieved Relative Strength
True fidelity – relative to a benchmark;
Intervention exposure – number of sessions, time,
frequency

Achieved Relative Strength (ARS) Index:

ARS Index = (t_Tx – t_C) / S

i.e., the standardized difference in the fidelity index across Tx and C
• Based on Hedges’ g (Hedges, 2007)
• Corrected for clustering in the classroom
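A minimal sketch of the ARS index as a standardized mean difference with the Hedges' g small-sample correction; the classroom fidelity scores are hypothetical, and the full Hedges (2007) adjustment for clustering (which uses the intraclass correlation) is omitted here:

import numpy as np

def ars_index(t_tx: np.ndarray, t_c: np.ndarray) -> float:
    """Standardized difference in a fidelity measure across Tx and C
    (Hedges' g: Cohen's d with the small-sample bias correction).
    NOTE: the Hedges (2007) clustering correction is not applied here."""
    n_tx, n_c = len(t_tx), len(t_c)
    sd_pooled = np.sqrt(((n_tx - 1) * t_tx.var(ddof=1) +
                         (n_c - 1) * t_c.var(ddof=1)) / (n_tx + n_c - 2))
    d = (t_tx.mean() - t_c.mean()) / sd_pooled
    correction = 1 - 3 / (4 * (n_tx + n_c - 2) - 1)  # small-sample factor J
    return d * correction

# Hypothetical classroom-level fidelity scores (0-10 scale)
t_tx = np.array([6.0, 7.0, 5.5, 6.5, 7.5])
t_c = np.array([3.0, 4.0, 3.5, 4.5, 3.0])
print(ars_index(t_tx, t_c))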
Calculating ARSI When There Are
Multiple Components
[Figure (same bar charts as above): planned vs. observed levels of PD, Asmt, and Diff Inst in the treatment and control conditions on a 0–10 scale, showing treatment infidelity, augmentation of the control, and the component differences in theory vs. as observed. PD = Professional Development; Asmt = Formative Assessment; Diff Inst = Differentiated Instruction.]
Weighted Achieved Relative
Strength
ARSI_PD = (3 – 2.5) / 2 = 0.25

ARSI_Assess = (6 – 3.5) / 3 = 0.83

ARSI_DI = (7 – 4) / 3.5 = 0.86

ARSI_Weighted = Σ w_j · ARSI_j = .25(.25) + .33(.83) + .42(.86) = 0.69

[Figure: Overlapping fidelity distributions for t_E and t_C; U3 = 76%, i.e., the mean of the treatment fidelity distribution exceeds 76% of the control distribution.]
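A minimal sketch of the weighted ARSI and U3 computations, using the component values and weights from the slide (the normality assumption behind U3 is made explicit):

from statistics import NormalDist

# Component ARS indices and weights from the slide
arsi = {"PD": (3 - 2.5) / 2, "Assess": (6 - 3.5) / 3, "DI": (7 - 4) / 3.5}
weights = {"PD": 0.25, "Assess": 0.33, "DI": 0.42}

arsi_weighted = sum(weights[k] * arsi[k] for k in arsi)
print(round(arsi_weighted, 2))  # ~0.70 (0.69 on the slide, with rounded components)

# U3: share of the control fidelity distribution falling below the
# treatment mean, assuming normal distributions
u3 = NormalDist().cdf(arsi_weighted)
print(round(u3 * 100))  # ~76%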
Time-out for Questions
Some Program-Specific Results
Achieved Relative Strength: Some Results
Achieved Relative Strength:
Teacher Classroom Behavior
Preliminary Conclusions for the MAP
Implementation Assessment
• The developer (NWEA)
– Complete implementation of resources, training, and
consultation
• Teachers: Program-specific implementation
outcomes
– Variable attendance at training and use of training
sessions
– Moderate use of data and differentiation
activities
– Training extended through May, 2009
• Teachers: Achieved Relative Strength
– No between-group differences in enactment of
differentiated instruction
Step 4: Indexing Cause-Effect
Linkage
• Analysis Type 1:
– Congruity of Cause-Effect in ITT analyses
• Effect = average difference on outcomes → ES
• Cause = average difference in causal components → ARS
(Achieved Relative Strength)
• Descriptive reporting of each, separately
• Analysis Type 2:
– Variation in implementation fidelity linked to variation
in outcomes
– Hierarchy of approaches (ITT → LATE/CACE →
Regression → Descriptive)
• TO BE CONTINUED ……
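To preview Analysis Type 2 with a minimal sketch (not from the slides): the simplest end of the hierarchy is a descriptive regression of outcomes on a fidelity index. All data and names here are hypothetical, and because fidelity is not randomized the slope is not a causal estimate:

import numpy as np

# Hypothetical classroom-level data: fidelity index (0-10) and mean outcome
fidelity = np.array([2.0, 3.5, 5.0, 6.0, 7.5, 8.0])
outcome = np.array([62.0, 68.0, 71.0, 74.0, 80.0, 83.0])

# Ordinary least squares: outcome = b0 + b1 * fidelity
X = np.column_stack([np.ones_like(fidelity), fidelity])
b0, b1 = np.linalg.lstsq(X, outcome, rcond=None)[0]
print(f"intercept = {b0:.1f}, slope = {b1:.1f} outcome points per fidelity point")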
Questions?