Modeling “The Cause”: Assessing Implementation Fidelity and Achieved Relative Strength in RCTs
David S. Cordray, Vanderbilt University
IES/NCER Summer Research Training Institute, 2010

Overview
• Define implementation fidelity and achieved relative strength
• A 4-step approach to the assessment and analysis of implementation fidelity (IF) and achieved relative strength (ARS):
– Model(s)-based specification of core causal components
– Quality measures of core causal components
– Creating indices
– Integrating implementation assessments with models of effects (next week)
• Strategy:
– Describe each step
– Illustrate with an example
– Planned Q&A segments

Caveat and Precondition
• Caveat
– The black box (ITT model) is still the #1 priority
– Implementation fidelity and achieved relative strength are supplemental to ITT-based results.
• Precondition
– We consider implementation fidelity in RCTs that are conducted on mature (enough) interventions
– That is, the intervention is stable enough to describe an underlying:
• Model/theory of change, and
• Operational (logic and context) models.
Dimensions of Intervention Fidelity
• Operative definitions:
– True Fidelity = Adherence or compliance:
• Program components are delivered/used/received as prescribed
• With stated criteria for success or full adherence
• The specification of these criteria is relatively rare
– Intervention Exposure:
• Amount of program content, processes, and activities delivered to/received by all participants (aka receipt, responsiveness)
• This notion is the most prevalent
– Intervention Differentiation:
• The unique features of the intervention are distinguishable from other programs, including the control condition
• A unique application within RCTs

Linking Intervention Fidelity Assessment to Contemporary Models of Causality
• Rubin’s Causal Model:
– The true causal effect of X for unit i is (YiTx – YiC)
– In RCTs, the difference between outcomes, on average, is the causal effect
• Fidelity assessment within RCTs also entails examining the difference between causal components in the intervention and control conditions.
• Differencing causal conditions can be characterized as the achieved relative strength of the contrast.
– Achieved Relative Strength (ARS) = tTx – tC
– ARS is a default index of fidelity

[Figure: Treatment strength vs. outcome. The treatment strength as intended, TTx = 0.40, degrades through infidelity to an achieved strength of tTx = 0.30; the control condition remains at tC = TC = 0.15. Expected relative strength = TTx – TC = 0.40 – 0.15 = 0.25; achieved relative strength = tTx – tC = 0.30 – 0.15 = 0.15. On the outcome side (sd pooled = 30), the effect with full fidelity is d = (YT – YC)/sd = (90 – 65)/30 = 0.83, while the observed effect is d = (Yt – Yc)/sd = (85 – 70)/30 = 0.50.]

Why is this Important?
• Statistical conclusion validity
• Construct validity:
– Which is the cause: (TTx – TC) or (tTx – tC)?
• Poor implementation: essential elements of the treatment are incompletely implemented.
• Contamination: essential elements of the treatment are found in the control condition (to varying degrees).
• Pre-existing similarities between T and C on intervention components.
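The arithmetic in the treatment-strength figure can be reproduced directly. A minimal sketch in Python using the slide's example values (the function name `cohens_d` is mine, not the slide's):

```python
# Standardized mean difference and relative strength, using the
# example values from the treatment-strength figure.

def cohens_d(mean_t, mean_c, sd_pooled):
    """Standardized mean difference between two conditions."""
    return (mean_t - mean_c) / sd_pooled

# Treatment strength: as intended (T) vs. as achieved (t)
T_tx, T_c = 0.40, 0.15   # planned strengths
t_tx, t_c = 0.30, 0.15   # achieved strengths

expected_rs = T_tx - T_c   # expected relative strength, 0.25
achieved_rs = t_tx - t_c   # achieved relative strength, 0.15

# Outcomes: effect with full fidelity vs. effect as observed
d_full = cohens_d(90, 65, 30)   # ≈ 0.83
d_obs = cohens_d(85, 70, 30)    # = 0.50

print(round(expected_rs, 2), round(achieved_rs, 2),
      round(d_full, 2), round(d_obs, 2))
```

The gap between d_full and d_obs (0.83 vs. 0.50) is the outcome-side cost of the infidelity shown on the treatment-strength side.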
• External validity
– Generalization is about (tTx – tC)
– This difference needs to be known for proper generalization and future specification of the intervention components

So what is the cause? …The achieved relative difference in conditions across components

[Figure: Three bar charts (0–10 scale) comparing planned vs. observed levels of three components — professional development (PD), formative assessment (Asmt), and differentiated instruction (Dif Inst). Panels: “Infidelity” (T planned vs. T observed), “Augmentation of Control” (C planned vs. C observed), and the resulting contrast (“Dif in Theory” vs. “Dif as Observed”). Key: Asmt = formative assessment; Dif Inst = differentiated instruction; PD = professional development.]

Time-out for Questions

In Practice…
• Step 1: Identify core components in the intervention group
– e.g., via a model of change
– Establish benchmarks (if possible) for TTx and TC
• Step 2: Measure core components to derive tTx and tC
– e.g., via a “logic model” based on the model of change
• Step 3: Derive indicators
• Step 4: Incorporate indicators of IF and ARS into the analysis of effects

Focused Assessment is Needed
• What are the options?
(1) Essential or core components (activities, processes);
(2) Necessary, but not unique, activities, processes, and structures (supporting the essential components of T); and
(3) Ordinary features of the setting (shared with the control group)
• Focus on 1 and 2.

Step 1: Specifying Intervention Models
• Simple version of the question: What was intended?
• Interventions are generally multi-component sequences of actions
• Mature-enough interventions are specifiable as:
– A conceptual model of change
– An intervention-specific model
– A context-specific model
• Start with a specific example

MAP RCT Example: The Measures of Academic Progress (MAP) RCT
• The Northwest Evaluation Association (NWEA) developed the Measures of Academic Progress (MAP) program to enhance student achievement
• Used in 2,000+ school districts and 17,500 schools
• No evidence yet of efficacy or effectiveness

Measures of Academic Progress (MAP): Model of Change
[Diagram: the MAP intervention (4 days of training, on-demand consultation, formative testing, student reports, on-line resources) → Formative Assessment → Differentiated Instruction → Achievement]
• Implementation issues:
– Delivery: NWEA trainers
– Receipt: teachers and school leaders
– Enactment: teachers
– Outcomes: students

Logic Model for MAP
• Resources: testing system, NWEA trainers, NWEA consultants
• Activities: 4 training sessions, follow-up consultation, access to on-line teaching resources
• Outputs: multiple assessment reports, use of formative assessment, differentiated instruction
• Outcomes/Impacts: improved student achievement (MAP tests, state tests)
• The focus of implementation fidelity and achieved relative strength is the Resources → Activities → Outputs portion of the model
• Program-specific implementation fidelity assessments: MAP only
• Comparative implementation assessments: MAP and non-MAP classes

Context-Specific Model: MAP Academic Schedule
[Timeline: major MAP program components and activities across the school year. Fall semester (Aug–Jan): PD1, PD2, and PD3, each followed by consultation (Con); implementation of the data system. Spring semester (Feb–May): PD4 with consultation, differentiated instruction, changes in data use, and state testing.]
Two points:
1. This tells us when assessments should be undertaken; and
2. It provides a basis for determining the length of the intervention study and the ultimate RCT design.

Step 2: Quality Measures of Core Components
• Measures of resources, activities, outputs
• Range from simple counts to sophisticated scaling of constructs
• Generally involves multiple methods
• Multiple indicators for each major component/activity
• Reliable scales (3-4 items per sub-scale)

Measuring Program-Specific Components
• MAP resources (testing system, NWEA trainers, NWEA consultants):
– Criterion: present or absent
– Source or method: MAP records
• MAP activities (4 training sessions, follow-up consultation, multiple assessment reports):
– Criterion: attendance
– Source or method: MAP records
• Access to on-line teaching resources:
– Criterion: use
– Source or method: web records

Measuring Outputs: Both MAP and Non-MAP Conditions
• Use of formative assessment data:
– Method: end-of-year teacher survey
• Differentiated instruction:
– Methods: end-of-year teacher survey, observations (3), teacher logs (10)
– Indices: difference in differentiated instruction (high- vs. low-readiness students); proportion of observation segments with any differentiated instruction
• Criterion: achieved relative strength

Fidelity and ARS Assessment Plan for the MAP Program

Step 3: Indexing Fidelity and Achieved Relative Strength
• True fidelity: indexed relative to a benchmark
• Intervention exposure: amount of sessions, time, frequency
• Achieved Relative Strength (ARS) Index:
ARS Index = (tTx – tC) / S
the standardized difference in a fidelity index across the Tx and C conditions
– Based on Hedges’ g (Hedges, 2007)
– Corrected for clustering in the classroom

Calculating ARSI When There Are Multiple Components
[Figure: the planned vs. observed bar charts shown earlier — PD, formative assessment (Asmt), and differentiated instruction (Dif Inst) under “Infidelity” (T), “Augmentation of Control” (C), and the resulting “Dif in Theory” vs. “Dif as Observed” contrast.]

Weighted Achieved Relative Strength
[Table: component-level fidelity means, SDs, and ARS indices]
Component | Mean (Tx) | Mean (C) | Sd | ARSI
PD | 3.0 | 2.5 | 2.0 | 0.25
Assessment | 6.0 | 3.5 | 3.0 | 0.83
Differentiated Instruction | 7.0 | 4.0 | 3.5 | 0.86
ARSI(weighted) = Σ wj · ARSIj = .25(.25) + .33(.83) + .42(.86) = 0.69 (U3 = 76%)

Time-out for Questions

Some Program-Specific Results

Achieved Relative Strength: Some Results

Achieved Relative Strength: Teacher Classroom Behavior

Preliminary Conclusions for the MAP Implementation Assessment
• The developer (NWEA):
– Complete implementation of resources, training, and consultation
• Teachers: program-specific implementation outcomes
– Variable attendance at training and use of training sessions
– Moderate use of data and differentiation activities/services
– Training extended through May 2009
• Teachers: achieved relative strength
– No between-group differences in the enactment of differentiated instruction

Step 4: Indexing the Cause-Effect Linkage
• Analysis Type 1:
– Congruity of cause and effect in ITT analyses
• Effect = average difference on outcomes (effect size, ES)
• Cause = average difference in causal components (achieved relative strength, ARS)
• Descriptive reporting of each, separately
• Analysis Type 2:
– Variation in implementation fidelity linked to variation in outcomes
– A hierarchy of approaches: ITT → LATE/CACE → regression → descriptive
• TO BE CONTINUED…

Questions?
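Returning to Step 3, the ARS index and its weighted combination across components can be sketched in code. This is a minimal illustration using the example values from the “Weighted Achieved Relative Strength” slide; the Hedges (2007) cluster correction is omitted, and the function names are mine:

```python
# Component-level ARS indices and their weighted combination.

def ars_index(mean_tx, mean_c, pooled_sd):
    """Standardized difference in a fidelity measure across Tx and C.
    (The slide bases this on Hedges' g with a classroom-cluster
    correction; this sketch uses the uncorrected difference.)"""
    return (mean_tx - mean_c) / pooled_sd

# (component, fidelity mean in Tx, mean in C, pooled SD, weight)
components = [
    ("PD",         3.0, 2.5, 2.0, 0.25),
    ("Assessment", 6.0, 3.5, 3.0, 0.33),
    ("Diff Instr", 7.0, 4.0, 3.5, 0.42),
]

# Per-component indices: PD = 0.25, Assessment ≈ 0.83, Diff Instr ≈ 0.86
arsi = {name: ars_index(te, tc, sd) for name, te, tc, sd, _ in components}

# Weighted ARSI = sum of w_j * ARSI_j (weights sum to 1)
weighted = sum(w * arsi[name] for name, _, _, _, w in components)
print(round(weighted, 2))   # ≈ 0.70 (the slide reports 0.69, from rounded inputs)
```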