Testing a Strategic Evaluation Framework
for Incrementally Building Evaluation
Capacity in a Federal R&D Program
27th Annual Conference of the
American Evaluation Association
Washington, DC
October 17, 2013
JOHN TUNNA
Director
Office of Research and Development
Office of Railroad Policy and Development
Federal Railroad Administration
Federal Railroad Administration (FRA)
Evaluation Implementation Plan
• Introduction
  – R&D Evaluation Mandate
  – R&D Evaluation Goals
  – R&D Evaluation Standards
• Uses of Evaluation
  – Formative
  – Summative
• Types of Evaluation (CIPP Evaluation Model)
  – Context
  – Input
  – Implementation
  – Impact
• Evaluation Framework & Key Evaluation Questions
• Start-up Pilot Evaluations
• Institutionalizing and Mainstreaming Evaluation
  – Metaevaluation
  – The Evaluation Manual
    • Evaluation templates
    • Attestation of standards
R&D Evaluation Mandate
• Congressional Mandates
  – Government Performance and Results Act (GPRA, 1993)
  – Program Assessment Rating Tool (PART, 2002)
  – GPRA Modernization Act of 2010
• OMB Memos
  – M-13-17, July 26, 2013: Next Steps in the Evidence and Innovation Agenda
  – M-13-16, July 26, 2013: Science and Technology Priorities for the FY 2015 Budget
  – M-10-32, July 29, 2010: Evaluating Programs for Efficacy and Cost-Efficiency
  – M-10-01, October 7, 2009: Increased Emphasis on Program Evaluations
  – M-09-27, August 8, 2009: Science and Technology Priorities for the FY 2011 Budget
• Federal Evaluation Working Group
  – Reconvened in 2012 to help build evaluation capacity across the federal government
  – “[We] need to use evidence and rigorous evaluation in budget, management, and policy decisions to make government work effectively.”
• GAO reports
  – Program Evaluation: Strategies to Facilitate Agencies’ Use of Evaluation in Program Management and Policy Making (June 2013)
  – Program Evaluation: A Variety of Rigorous Methods Can Help Identify Effective Interventions (GAO-10-30, November 2009)
  – Program Evaluation: Experienced Agencies Follow a Similar Model for Prioritizing Research (GAO-11-176, January 2011)
R&D Evaluation Mandate
OMB Memo M-13-16 (July 26, 2013)
Subject: Science and Technology Priorities for the FY 2015 Budget

“Agencies . . . should give priority to R&D that strengthens the scientific basis for decision-making in their mission areas, including but not limited to health, safety, and environmental impacts. This includes efforts to enhance the accessibility and usefulness of data and tools for decision support, as well as research in the social and behavioral sciences to support evidence-based policy and effective policy implementation.”

“Agencies should work with their OMB contacts to agree on a format within their 2015 Budget submissions to: (1) explain agency progress in using evidence and (2) present their plans to build new knowledge of what works and is cost-effective.”
R&D Evaluation Goals
• Meet R&D accountability requirements
• Guide and strengthen Division R&D program effectiveness and impact
• Facilitate knowledge diffusion and technology transfer
• Build R&D evaluation capacity
• Improve railroad safety
Why Evaluation in R&D?
Assessing the logic of R&D programs:
[Logic model: ACTIVITIES → OUTPUTS → OUTCOMES → IMPACTS]
• Activities: the funded activity “family” (e.g., scientific research, technology development)
• Outputs: deliverables/products (e.g., technical reports, forecasting models)
• Outcomes: application of research (data use; adoption of guidelines, standards, or regulations; changing practices)
• Impacts: reduced accidents and injuries; emergent outcomes, positive (e.g., knowledge gains) or negative (e.g., environmental effects)
The Research-Evaluation Paradigm
Research
• Primary purpose: contribute to knowledge; improve understanding
• Primary audience: scholars, researchers, academicians
• Types of questions: hypotheses; theory-driven; preordinate
• Sources of data: surveys, tests, experiments (preordinate)
• Criteria: validity, reliability, generalizability

Evaluation
• Primary purpose: program improvement; decision-making
• Primary audience: program funders, administrators, decision makers
• Types of questions: practical; applied; open-ended, flexible
• Sources of data: interviews, field observations, documents, mixed sources (open-ended, flexible)
• Criteria: utility, feasibility, propriety, accuracy, accountability
Program Evaluation Standards:
Guiding Principles for Conducting Evaluations
• Utility (useful): to ensure evaluations serve the information needs of the intended users.
• Feasibility (practical): to ensure evaluations are realistic, prudent, diplomatic, and frugal.
• Propriety (ethical): to ensure evaluations will be conducted legally, ethically, and with due regard for the welfare of those involved in the evaluation, as well as those affected by its results.
• Accuracy (valid): to ensure that an evaluation will reveal and convey valid and reliable information about all important features of the subject program.
• Accountability (professional): to ensure that those responsible for conducting the evaluation document and make available for inspection all aspects of the evaluation that are needed for independent assessments of its utility, feasibility, propriety, accuracy, and accountability.
Note: The Program Evaluation Standards were developed by the Joint Committee on Standards for Educational Evaluation and approved by the American National Standards Institute (ANSI).
CIPP Evaluation Model:
(Context, Input, Process, Product)
Types of Evaluation (as adapted for FRA R&D)
• Context
• Input
• Implementation (Process)
• Impact (Product)
Stakeholder engagement is key
Daniel L. Stufflebeam's adaptation of his CIPP Evaluation Model framework for use in guiding program evaluations of the Federal Railroad Administration's Office of Research and Development. For additional information, see Stufflebeam, D.L. (2000). The CIPP model for evaluation. In D.L. Stufflebeam, G.F. Madaus, & T. Kellaghan (Eds.), Evaluation models (2nd ed., Chapter 16). Boston: Kluwer Academic Publishers.
Evaluation Framework:
Roles and Types of Evaluation
Formative Evaluation (proactive)
• Context: Identifies needs, problems, and assets; helps set goals and priorities.
• Inputs: Assesses alternative approaches; develops program plans, designs, and budgets.
• Implementation: Monitors, documents, and guides execution.
• Impact: Assesses positive and negative outcomes; informs policy development and strategic planning.

Summative Evaluation (retroactive)
• Context: Assesses original program goals and priorities.
• Inputs: Reassesses project and program plans; procedural plans and budget.
• Implementation: Assesses execution.
• Impact: Assesses outcomes, impacts, side effects, and cost-effectiveness.
Evaluation Framework:
Key Evaluation Questions – Safety Culture
Formative Evaluation
• Context: What are the highest priority needs to improve safety culture in the U.S. rail industry?
• Inputs: What are the most promising alternatives for safety culture interventions (BBS, ISROP, Rules Revision, Close Calls, etc.)? How do they compare (potential success, costs, etc.)? How can these interventions be most effectively implemented? What are some potential barriers to implementation?
• Implementation: To what extent do safety culture interventions proceed on time, within budget, and effectively?
• Impact: How can safety culture interventions be implemented to maximize effectiveness? What indicators of impact or use, if any, have emerged to indicate that these interventions are being adopted more broadly? What are some emerging outcomes (positive or negative)? How can the implementations be modified to minimize costs and maximize effectiveness?

Summative Evaluation
• Context: To what extent did this intervention address the high-priority safety need?
• Inputs: What intervention strategy was chosen, and why was it chosen compared to other viable strategies (re: prospects for success, feasibility, costs)?
• Implementation: To what extent was the intervention carried out as planned, or modified with an improved plan? If needed, how can the intervention design be improved?
• Impact: To what extent did these interventions improve safety and safety culture? Were there any unanticipated negative or positive side effects? What conclusions and lessons learned can be reached (e.g., cost-effectiveness, stakeholder engagement, program effectiveness)?
Evaluation as a Key Strategy Tool
• Ask questions that matter.
  – About processes, products, programs, policies, and impacts
  – Then develop appropriate and rigorous methods to answer them
• Measure the extent to which, and the ways in which, program goals are being met.
  – What’s working, and why, or why not?
• Use results to refine program strategy, design, and implementation.
  – Inform others about lessons learned, progress, and program impacts
• Improve the likelihood of success with:
  – Intended users
  – Intended uses
  – Outcomes and impacts
  – Unanticipated (positive) outcomes
• Use evaluation to develop appropriate and useful performance measures for reporting R&D outcomes, and monitor those outcomes for continuous improvement.
Michael Coplen
Senior Evaluator
Office of Research & Development
Federal Railroad Administration
202-493-6346
Michael.Coplen@dot.gov
QUESTIONS?
Supplemental Information
Evaluation Framework:
Illustrative Questions – Fatigue Website
Formative Evaluation
• Context: What are the highest priority needs for sleep health and safety in the railroad industry?
• Inputs: Given the need for sleep health education and training, what are the most promising alternatives (fatigue website, regulations, etc.)? How do they compare (potential success, costs, etc.)? How can this strategy be most effectively implemented? What are some potential barriers to implementation?
• Implementation: To what extent is the website project proceeding on time, within budget, and effectively? If needed, how can the design be improved?
• Impact: To what extent are people using the website? What other indicators of use, if any, have emerged that indicate the website is being accessed and the information is being acted upon? What are some emerging outcomes (positive or negative)? How can the implementation be modified to maintain and measure success?

Summative Evaluation
• Context: To what extent did the fatigue website address this high-priority need?
• Inputs: What strategy was chosen, and why, compared to other viable strategies (re: prospects for success, feasibility, costs)?
• Implementation: To what extent was the website carried out as planned, or modified with an improved plan?
• Impact: To what extent did this project effectively address the need to educate railroad employees on sleep health and safety? Were there any unanticipated negative or positive side effects? What conclusions and lessons learned can be reached (e.g., cost-effectiveness, stakeholder engagement, program effectiveness)?
Input Evaluation: Program Design and
Partnership Commitment to Change
Clear Signal for Action (CSA) Theory of Change
[Diagram: Safety culture (management and labor values, attitudes, competencies, and patterns of behavior) drives at-risk conditions, at-risk behaviors, and incidents. The CSA intervention cycle: establish a steering committee (management); develop a checklist (steering committee); train observers (steering committee, observers); gather data and provide feedback (observers); analyze data and plan corrective actions (steering committee, CA team); carry out corrective actions, assigned to the CA team or the steering committee depending on whether workers have control over the issue.]
Implementation Evaluation
[Diagram: CSA implementation components (peer-to-peer feedback, continuous improvement (CI), and safety leadership development (SLD)) leading to safety outcomes.]
Impact Evaluation: Expected changes and
possible metrics (Union Pacific example)
[Diagram: logic model linking S.T.E.E.L. implementation activities to first-, second-, and third-order impacts. Other influences include corporate policy changes and FRA practices.]
• Implementation (S.T.E.E.L. activities): communications; steering committee training; checklist development; sampler training; sampling; coaching; investigations; feedback; data analysis; barrier identification; barrier removal; leadership training.
• First-order impacts: S.T.E.E.L.-targeted employee practices (employee involvement in S.T.E.E.L., equipment control, rule compliance, safe behaviors, reactions to problems) and management practices (safety-enabling leadership behaviors; communication quality, amount, and consistency; attitude toward safety).
• Second-order impacts: general employee practices and employee well-being (personal sense of control/responsibility, job satisfaction, health, stress, awareness, attitude toward safety).
• Third-order impacts: safety outcomes and corporate results (incidents, personal injuries, derailments, collisions, decertifications, close calls, discipline, FTX results, safety hotline use; safety culture, liability, labor-management relations, incident costs, productivity, public image).