Program Evaluation
How Do I Show This Works?
Paul F. Cook, PhD
UCDHSC School of Nursing
Why Evaluate?
• We have to (state, federal, or contract regulations)
• In order to compete (JCAHO, NCQA, URAC)
• It helps us manage staff
• It helps us manage programs
• It helps us maintain high-quality programs
• It helps us develop even better programs
Targets for Evaluation
• Services (ongoing quality of services delivered)
• Systems (service settings or workflows)
• Programs (special projects or initiatives)
– “horse races” to decide how to use limited resources
– cost/benefit analysis may be included
• People (quality of services by individuals)
– provider or site “report cards”
– clinical practice guideline audits
– supervisor evaluation of individuals
Grembowski. (2001). The Practice of Health Program Evaluation
Basic: One-Time Evaluation
• Special projects
• Grants
• Pilot programs
• May have a control group or use a pre-post design
• Opportunity for better-designed research
• Finish it and you’re done
Intermediate: Quality Improvement
[Cycle diagram: Measure → Identify Barriers → Improve → Re-Measure]
• Measure (access, best practices, patient satisfaction, provider satisfaction, clinical outcomes)
• Identify Barriers
• Make Improvements
• Re-Measure
• Etc.
Advanced: “Management by Data”
• Data “dashboards”
• Real-time monitoring of important indicators
• Requires automatic data capture (no manual entry) and reporting software – sophisticated IT
• If you want to try this at home (a minimal query sketch follows below):
– SQL database (or start small with MS Access)
– Crystal Reports report templates
– Crystal Enterprise software to automate reporting
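A minimal sketch of the dashboard idea, using Python’s built-in sqlite3 module as a small-scale stand-in for the SQL/Crystal Reports stack named above; the `encounters` table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per patient encounter, captured automatically
# by the clinical system (no manual entry).
conn = sqlite3.connect("clinic.db")
conn.execute("""CREATE TABLE IF NOT EXISTS encounters (
                    encounter_date TEXT,
                    provider TEXT,
                    wait_minutes REAL,
                    satisfaction INTEGER)""")
conn.commit()

# Dashboard indicator: last 30 days of volume, wait time, and satisfaction
# per provider, refreshed on a schedule instead of compiled by hand.
rows = conn.execute("""SELECT provider,
                              COUNT(*)          AS visits,
                              AVG(wait_minutes) AS avg_wait,
                              AVG(satisfaction) AS avg_satisfaction
                       FROM encounters
                       WHERE encounter_date >= date('now', '-30 days')
                       GROUP BY provider""").fetchall()

for provider, visits, avg_wait, avg_sat in rows:
    print(f"{provider}: {visits} visits, "
          f"avg wait {avg_wait:.1f} min, satisfaction {avg_sat:.1f}")
```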
Worksheet for a QI Project
• Name (action word: e.g., “Improving x …”)
• Needs assessment
– Target population
– Identified need
• Performance measures
– Baseline data
– Timeframe for remeasurement
• Benchmark and/or goal
• Barriers and opportunities
• Strong and targeted actions
• Remeasurement and next steps
Planning for an Evaluation
“You can accomplish anything in life,
provided that you do not mind who
gets the credit.”
—Harry S. Truman
Start with Stakeholders
• Even if you know the right problem to fix,
someone needs to buy in – who are they?
– Coworkers
– Management
– Administration
– Consumer advocates
– Community organizations (CBPR)
– The healthcare marketplace
• Strategy: sell your idea at several levels
• To Succeed: focus on each group’s needs
Fisher, Ury, & Patton. (2003). Getting to Yes
Needs Assessment
(Formative Evaluation)
• Use data, if you have them
• Describe current environment,
current needs or goals, past efforts & results
• Various methods:
– Administrative services data
– Administrative cost data
– Administrative clinical data (e.g., EMR)
– Chart review data (a small sample is OK)
– Survey data (a small sample is OK)
– Epidemiology data or published literature
Common Rationales for a QIP
• High Risk
• High Cost
• High Volume
• Need for Prevention
Program Design
• Theoretical basis for the program
• Resources needed
– Time
– People
– Money/equipment
– Space
• Concrete steps for implementing the program
– Manuals
– Software
– Tools/supplies
– Training
– Ongoing supervision
How Implementation May Fail
• Lack of fidelity to theory
• Providers not adequately trained
• Treatment not implemented as designed
• Participants didn’t participate
• Participants didn’t receive “active ingredients”
• Participants didn’t enact new skills
• Results didn’t generalize across time, situations
Bellg, et al. (2004). Health Psychology, 23(5), 443-451
Selling Innovations
It may help to emphasize:
• Relative advantage of the change
• Compatibility with the current system
• Simplicity of the change and of the transition
• Testability of the results
• Observability of the improvement
Rogers. (1995). The Diffusion of Innovations.
Performance Measures and Baseline Data
“The Commanding General is well
aware that the forecasts are no
good. However, he needs them for
planning purposes.”
— Nobel laureate economist Kenneth Arrow,
quoting from his time as an Air Force weather
forecaster
Asking the Question
• Ask the right question
– Innovation
– Process (many levels)
– Outcome
– Impact
– Capacity-Building/Sustainability
• Try to answer only one question
– Focus on the data you must have at the end
– Consider other stakeholders’ interests
– Collect data on side issues as you can
– Attend to “respondent burden” concerns
Standard Issues in Measurement
• Reliability (results are not just random error; a quick internal-consistency check is sketched below)
• Validity (measuring the right construct)
• Responsiveness (ability to detect change)
• Acceptability (usefulness in practice)
Existing measures may not work in all settings; unvalidated measures may not tell you anything.
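As one concrete illustration of the reliability bullet above, here is a minimal sketch that computes Cronbach’s alpha for a multi-item survey; the 5-respondent, 3-item score matrix is made up purely for illustration.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha: internal-consistency reliability of a set of items.
    item_scores: 2-D array, rows = respondents, columns = items."""
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical satisfaction survey: 5 respondents x 3 items (1-5 scale)
scores = [[4, 5, 4],
          [3, 3, 4],
          [5, 5, 5],
          [2, 3, 2],
          [4, 4, 5]]
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
# Values around 0.7 or higher are often considered acceptable.
```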
Data Sources
[Hub diagram: many data sources all feeding into the answers]
• Qualitative/focus group data
• Interview data
• Patient satisfaction surveys
• Provider surveys
• Safety monitoring data
• Financial data
• Chart review data
• Encounter data
• Patient outcome surveys
No Perfect Method
• Patient survey
– Recall bias
– Social desirability bias
– Response bias
• Observer ratings
– Inter-rater variability
– Availability heuristic (“clinician’s illusion”)
– Fundamental attribution error
• Administrative data
– Collected for other purposes
– Gaps in coverage, care, eligibility, etc.
Goal: measures that are “objective, quantifiable, and based on current scientific knowledge”
CNR “Instrument Lab” at www.uchsc.edu/nursing/cnr
Benchmarking
• Industry standards (e.g., JCAHO)
• Peer organizations
• “Normative data” for specific instruments
• Research literature
– Systematic reviews
– Meta-analyses
– Individual articles in peer-reviewed journals
Setting Goals
• An average or a percent is OK
• Set an absolute goal: “improve to 50%” vs. “improve by 10% over baseline”
– say “improve by x percentage points,” not
– “improve by x percent” (which depends on the base rate; worked example below)
• Set an achievable goal: if you’re at 40%, don’t make it 90%
• Set a “stretch” goal – not too easy
• “Zero performance defects” (100%) is rarely helpful – use 95% or 99% performance instead
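A small worked example of the “percentage points” vs. “percent” distinction above, using a hypothetical 40% baseline:

```python
baseline = 0.40   # hypothetical baseline: 40% of charts meet the guideline

# "Improve by 10 percentage points": the goal does not depend on the base rate
goal_points = baseline + 0.10    # 0.50 -> 50%

# "Improve by 10 percent": the goal shrinks or grows with the base rate
goal_percent = baseline * 1.10   # 0.44 -> 44%

print(f"+10 percentage points: {goal_points:.0%}")
print(f"+10 percent of baseline: {goal_percent:.0%}")
```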
Special Issues in Setting Goals
• Absolute number may work as a goal in some scenarios – e.g., # of rural health consults/yr – but percents or averages allow statistical significance testing
• Improvement from zero to something is usually not seen as improvement (“so your program is new; but what good did it do?”)
• Don’t convert a scale to a percent (i.e., don’t use a cut-off point) unless you absolutely must
Owen & Froman (2005). RINAH, 28, 496-503
Specifying the “Denominator”
• Level of analysis
– Unit
– Provider
– Patient
– For some populations, family
• A subgroup, or the entire population?
– Primary (entire population)
– Secondary (at-risk population)
– Tertiary (identified patients)
– Subgroups of patients (e.g., CHD with complications)
Baseline Data
• May be pre-existing
– Charts
– Administrative data
• May need to collect data prior to starting
– Surveys
• Whatever you do for baseline, you will need to do it exactly the same way for remeasurement
– Same type of informant (e.g., providers)
– Same instruments (e.g., chart audit tool)
– Same method of administration (e.g., by phone)
Sampling
• Representativeness
– Random sample, stratified sample, quota sample
– Characteristics of volunteers
– Underrepresented groups
– Effect of survey method (phone, Internet)
• Response Rate
– 10% is considered good for industry surveys
– If you have 100% of the data available, use 100%
• Finding the right sample size (a sketch of the calculation follows this list): http://www.surveysystem.com/sscalc.htm
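A sample-size calculation of the kind that calculator performs can be sketched as follows, using the standard formula for estimating a proportion; the population size and margin of error here are hypothetical.

```python
import math

def sample_size(population, margin_of_error=0.05, confidence_z=1.96, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin_of_error.
    p = 0.5 is the most conservative (largest-sample) assumption."""
    n0 = (confidence_z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Finite population correction for small populations
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Hypothetical: 2,000 eligible patients, +/-5% margin, 95% confidence
print(sample_size(2000))   # -> 323 charts/surveys with these assumptions
```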
Evaluation Frequency and Duration
• Seasonal trends (selection bias)
• Confounding factors
– Organizational change (history)
– Outside events (history)
– Other changes in the organization (maturation)
– Change in patient case mix (selection bias)
• For the same subjects over time (pre/post):
– Notice the shrinking denominator (attrition)
• If your subjects know they are being evaluated:
– Don’t evaluate too often (testing)
– Don’t evaluate too rarely (reactivity)
Evaluation Design
Snow White and the 7 Threats to Validity
• History – external events
• Maturation – mere passage of time
• Testing – observation changes the results
• Instrumentation – random noise on the radar
• Mortality/Attrition – data lost to follow-up
• Selection Bias – not a representative sample
• Reactivity – placebo (Hawthorne) effects
Grace. (1996). http://www.son.rochester.edu/son/research/research-fables.
Study Design
A 3 × 3 grid: type of comparison group (effect of group) by measurement schedule (effect of time). The design numbers are used on the next slide; lower numbers indicate more rigorous designs.
• No comparison group
– Posttest only: posttest project results (Design 9)
– Pretest and posttest: pretest–posttest change (Design 8)
– Pretest, posttest, and follow-up: longitudinal project results (Design 7)
• Nonrandom comparison group
– Posttest only: posttest results vs. control (Design 6)
– Pretest and posttest: pre–post change vs. control (Design 5)
– Pretest, posttest, and follow-up: longitudinal results vs. control (Design 4)
• Randomized control group
– Posttest only: pilot RCT (Design 3)
– Pretest and posttest: full RCT (Design 2)
– Pretest, posttest, and follow-up: longitudinal RCT (Design 1)
Adapted from Bamberger et al. (2006). RealWorld Evaluation
Post Hoc Evaluation
• Can you get posttest for the intervention?
– Design 9
• Can you get baseline for the intervention?
– Design 8 or 7
• Can you get posttest for a comparison group?
– Design 6
• Can you get baseline for the comparison group?
– Design 5 or 4
– (Randomization requires a prospective design)
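The ladder above can be restated as a simple rule. The sketch below is just one way to encode the slide’s logic; it assumes the difference between Designs 8 vs. 7 and 5 vs. 4 is whether follow-up data can also be collected, per the design table.

```python
def best_available_design(intervention_posttest, intervention_baseline,
                          comparison_posttest, comparison_baseline,
                          follow_up=False):
    """Pick the strongest post hoc design the available data will support
    (lower number = more rigorous). Randomized designs (1-3) require
    prospective planning, so they never appear here."""
    if not intervention_posttest:
        return None                   # nothing to evaluate
    if comparison_baseline and intervention_baseline:
        return 4 if follow_up else 5  # pre-post (or longitudinal) vs. control
    if comparison_posttest:
        return 6                      # posttest vs. control
    if intervention_baseline:
        return 7 if follow_up else 8  # project results, with or without follow-up
    return 9                          # posttest-only project results

# Example: charts exist for before and after the program, but no comparison group
print(best_available_design(intervention_posttest=True,
                            intervention_baseline=True,
                            comparison_posttest=False,
                            comparison_baseline=False))   # -> 8
```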
Cost-Effectiveness Evaluation
• From “does it work” to “can we afford it?”
• Methods for cost-effectiveness evaluation (a calculation sketch follows this list):
– Cost-offset: does it save more than it spends?
– Cost-benefit: do the benefits produced (measured in $ terms – e.g., QALYs, often valued at about $50K each) exceed the $ costs?
– Cost-effectiveness: do the health benefits produced (measured as clinical outcomes – e.g., reduced risk based on odds ratios) justify the $ costs?
– Cost-utility: do the health benefits produced (measured based on consumer preferences – e.g., willingness to pay) justify the $ costs?
Kaplan & Groessl. (2002). J Consult Clin Psych, 70(3), 482-493
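A minimal sketch of the cost-offset and cost-per-QALY arithmetic behind the list above; every dollar and QALY figure is a hypothetical placeholder, and $50,000/QALY is only the commonly cited benchmark.

```python
# Hypothetical program figures, per 100 enrolled patients per year
program_cost = 120_000   # what the program spends
avoided_cost = 150_000   # e.g., hospitalizations prevented
qalys_gained = 4.0       # quality-adjusted life-years gained

# Cost-offset: does it save more than it spends?
net_savings = avoided_cost - program_cost
print(f"Cost-offset: net savings of ${net_savings:,}")   # $30,000

# Dollars spent per QALY gained (the quantity compared to the $50K benchmark)
cost_per_qaly = program_cost / qalys_gained
print(f"Cost per QALY: ${cost_per_qaly:,.0f}")           # $30,000/QALY
print("Under the common $50,000/QALY benchmark:", cost_per_qaly < 50_000)
```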
Statistics
• Descriptive statistics: “how big?”
– Averages & Standard Deviations
– Correlations
– Odds Ratios
• Inferential statistics: “how likely?”
– Various tests (t, F, chi-square)
– The correct test to use depends on the type of data (see the sketch after this list)
– All give you a p-value (the probability of a result this extreme if chance alone were operating)
– Whether a result is “significant” is highly dependent on sample N
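For example, when baseline and remeasurement each yield counts of charts that did or did not meet a criterion, a chi-square test is a common choice; a minimal sketch with made-up counts (using scipy) follows.

```python
from scipy.stats import chi2_contingency

# Hypothetical chart-audit counts: [met criterion, did not meet criterion]
baseline      = [40, 60]   # 40% at baseline
remeasurement = [55, 45]   # 55% after the improvement actions

chi2, p_value, dof, expected = chi2_contingency([baseline, remeasurement])
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
# p < .05 is the usual cutoff, but remember that "significant" or not
# depends heavily on the sample size, as noted above.
```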
Actions for Improvement
Evaluating Actions
• Strong
– Designed to address the barriers identified
– Consistent with past experience/research literature
– Seem likely to have an impact
– Implemented effectively & consistently
• Targeted
– Right time
– Right place
– Right people
– Have an impact on the barriers identified
NCQA. (2003). Standards and Guidelines for the Accreditation of MBHOs.
Theory-Based Actions
• What is the problem? (descriptive)
• What causes the problem? (problem theory)
– People
– Processes
– Unmet needs
• How to solve the problem? (theory of change)
– Educate
– Coach or Train
– Communicate or build linkages
– Redesign existing systems or services
– Design new systems or services
– Use new technologies
Using Evaluation Results
Describe the Process
• Needs analysis and stakeholder input
• Identification of barriers
• Theory basis for the intervention
– What actions were considered?
– Why were these actions chosen?
– How did these actions address the identified barriers?
• Implementation
– What was done?
– Who did it?
– How were they monitored, supervised, etc.?
– For how long, in what amount, in what way was it done?
• Data collection
– What measures were used?
– How were the data collected, and by whom?
Describe the Results
(Summative Evaluation)
• What were the outcomes?
– Data on the primary outcome measure
– Compare to baseline (if available)
– Compare to goal
– Compare to benchmark
– Provide data on any secondary measures that also support your conclusions about program outcomes
• What else did you find out?
– Answers to any additional questions that came up
– Any other interesting findings (lessons learned)
• Show a graph of the results
“Getting information from a table is
like extracting sunlight from a
cucumber.”
—Wainer & Thissen, 1981
Conclusions
• If the goals were met:
– What key barriers were targeted?
– What was the most effective action, and why?
• If the goals were not met:
– Did you miss some key barriers to improvement?
– Was the idea good, but there were barriers to implementation that you didn’t anticipate? What were they, and how could they be overcome?
– Did you get only part way there (e.g., change in knowledge but not change in behavior)?
– Did the intervention produce results on other important outcomes instead?
Dissemination
• Back to the original stakeholder groups
• Remind them – needs, goals, and actions
• Address additional questions or concerns
– “That’s a good suggestion; we could try it going forward, and see whether it helps”
– “We did try that, and here’s what we found”
– “We didn’t have time/money/experience to do that, but we can explore it for the future”
– “We didn’t think of that question, but we do have some data that might answer it”
– “We don’t have data to answer that question, but it’s a good idea for future study”
Broader Dissemination
• Organizational newsletter
• Summaries for patients, providers, payors
• Trade association conference or publication
• Scholarly research conference presentation
– Rocky Mountain EBP Conference
– WIN Conference
• Scholarly journal article
– Where to publish depends on rigor of the design
– Look at journal “impact factor” (higher = broader reach, but also more selective)
• Popular press
Next Steps
• PDSA model: after “plan-do-study,” the final step is “act” – roll out the program as widely as possible to obtain all possible benefits
• Use lessons learned in this project as the needs analysis for your next improvement activity
• Apply what you’ve learned about success in this area to design interventions in other areas
• Set higher goals, and design additional actions to address the same problem even better