Dennis P. Scanlon, Ph.D.
Jeff Alexander, Ph.D.
Laura Bodenschatz, M.S.W.
AcademyHealth, June 29, 2010
More information regarding the AF4Q evaluation team at: http://www.hhdev.psu.edu/CHCPR/activities/project_alignforce.html

Discuss the limitations, advantages, and utility of traditional health services research approaches, and the value of mixed methods and realistic evaluation approaches
Understand the choices and challenges encountered when conducting rigorous evaluations
◦ We will only scratch the surface in 90 minutes
Provide examples from real evaluations to emphasize key points
Workshop is pitched at an intermediate level
◦ Assumes graduate-level training in research design, statistics, and multivariate analysis
◦ Training in qualitative research methods is not assumed

Different approaches to measuring the impact of a program or policy
◦ How research goals can and should influence research design
Key Discussion Areas
◦ Program Definition
◦ Research Design Options
◦ Evaluation Implementation
Questions and Discussion
◦ Workshop is interactive – we will pause for questions/discussion after each section

The Patient Protection and Affordable Care Act (PPACA) of 2010 provides for several complex systems changes that will challenge traditional approaches to research:
◦ Accountable Care Organizations (ACOs)
◦ Health Insurance Exchanges
◦ Individual and Employer Mandates
◦ Medicaid Expansions
◦ Meaningful Use of HIT and EHR
◦ National Quality Strategy
◦ Payment Reforms

“For example, the RCT is a powerful, perhaps unequaled, research design to explore the efficacy of conceptually neat components of clinical practice – tests, drugs, and procedures. For other crucially important learning purposes, however, it serves less well.” – Berwick (2008)

“Experimentalists have pursued too single-mindedly the question of whether a [social] program works at the expense of knowing why it works.” – Pawson and Tilley (1997)

“Thus, although the OXO model seeks generalizable knowledge, in that pursuit it relies on – it depends on – removing most of the local details about “how” something works and about the “what” of contexts.” – Berwick (2008)

Did ‘it’ (the program or policy) work?
◦ If so, what was the effect size?
◦ What mechanism(s) led to the effect?
◦ If not, why didn’t it work?
In which context and under what conditions did it work?
Did the program happen?
◦ What was the dose?
◦ Did it vary across sites/markets, and if so, why?
Should any changes be made to the program?
Can it be spread? Under what conditions?

What do you want to know?
◦ What are the key outcomes of interest?
◦ How interested are you in the processes and mechanisms that lead to change?
◦ How important is it to monitor the program or policy implementation, regardless of whether implementation is mandatory or voluntary?

What is the theory of change?
◦ By what process or sequence of activities does one expect the intervention to result in the outcome?
◦ What is the evidence base for the theory of change?
◦ What are the critical assumptions underlying this theory of change?
Traditional “Difference-in-Difference-in-Difference” (DDD)
◦ Gruber’s study on the labor market incidence of mandated maternity benefits
Difference-in-Difference analysis supplemented with a survey
◦ Analysis of a “Hospital Safety Incentive” in an employed population
Aligning Forces for Quality
◦ A program of the Robert Wood Johnson Foundation

The Incidence of Mandated Maternity Benefits (Gruber, 1994): DDD Design

W_{ijt} = \alpha + X_{ijt}\beta_1 + \beta_2 \tau_t + \beta_3 \delta_j + \beta_4 \mathit{TREAT}_i + \beta_5 (\delta_j \cdot \tau_t) + \beta_6 (\tau_t \cdot \mathit{TREAT}_i) + \beta_7 (\delta_j \cdot \mathit{TREAT}_i) + \beta_8 (\delta_j \cdot \tau_t \cdot \mathit{TREAT}_i) + \varepsilon_{ijt}

where:
◦ W_ijt = log real hourly wage of individual i in state j in year t
◦ X_ijt = vector of observable characteristics
◦ δ_j = state fixed effect (1 if experimental state, 0 if non-experimental)
◦ τ_t = year fixed effect (1 if after the law, 0 if before)
◦ TREAT_i = dummy for treatment group (1 if treatment, 0 if control)
◦ β_2 = time-series changes in wages
◦ β_3 = time-invariant characteristics of the experimental states
◦ β_4 = time-invariant characteristics of the treatment group
◦ β_5 = changes over time in the experimental states
◦ β_6 = changes over time for the treatment group nationwide
◦ β_7 = time-invariant characteristics of the treatment group in the experimental states
◦ β_8 = all variation in wages specific to the treatments (relative to controls) in the experimental states (relative to the non-experimentals) in the year after the law (relative to before the law)
(A sketch of estimating this specification appears at the end of this section.)

Who Chooses the Hospital? Factors Influencing Consumers’ Health Care Choices
[Diagram: Hospital choice is influenced by quality ratings, prior experience, out-of-pocket expenses, physician privileges/recommendation, amenities, reputation/recommendation, travel time/distance, health plan network inclusion, and specialty services. Physician choice is influenced by hospital choice, health plan network inclusion, range of services (lab, x-ray), reputation/recommendation, credentials/board certification, quality ratings, travel time/distance, and prior experience. Health plan choice is influenced by the hospital network, physician network, out-of-pocket expenses, prior experience, quality ratings, reputation/recommendation, and covered benefits/services. Hospital, physician, and health plan choices also influence one another.]

The “Hospital Safety Incentive” survey:
20-minute phone interviews, pre/post, with a random sample of beneficiaries (employees or spouses)
◦ 4 groups, pre and post July 1, 2004
The survey focused on the following areas:
◦ Awareness of enrollment materials and online decision support tools
◦ Opinions regarding the quality and safety of health care
◦ Factors influencing hospital choice (for respondents with a recent hospitalization)
◦ Factors important for future choice of hospital if inpatient care is needed
◦ Factors related to health plan choice
◦ Demographic characteristics

Aligning Forces for Quality
• An unprecedented commitment by the Robert Wood Johnson Foundation to implement and support resources to improve the quality of health care, reduce disparities related to race and ethnicity, and provide models for reform.
• Within the 17 different Alliances of AF4Q, local stakeholder groups are charged with making sense of the quality problem in America and meeting it with local solutions.
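Returning to the Gruber (1994) DDD specification above: the sketch below shows how such a model can be estimated with the statsmodels formula API. The data, column names, and the single control (age) are invented placeholders for illustration, not Gruber’s actual CPS extract.

```python
# Minimal sketch of a DDD estimation; all data and names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "post": rng.integers(0, 2, n),       # tau_t: 1 if after the law
    "exp_state": rng.integers(0, 2, n),  # delta_j: 1 if experimental state
    "treat": rng.integers(0, 2, n),      # TREAT_i: 1 if treatment group
    "age": rng.integers(20, 40, n),      # stand-in for the X_ijt controls
})
# Synthetic log wages with a built-in -0.05 DDD effect.
df["log_wage"] = (
    2.5
    + 0.01 * df["age"]
    - 0.05 * df["post"] * df["exp_state"] * df["treat"]
    + rng.normal(0, 0.2, n)
)

# "a * b * c" in a formula expands to all main effects and interactions,
# so the fitted coefficients correspond to beta_2 through beta_8 above.
model = smf.ols("log_wage ~ post * exp_state * treat + age", data=df).fit()

# beta_8: wage variation specific to the treatment group, in experimental
# states, after the law -- the DDD estimate.
print(model.params["post:exp_state:treat"])
```

As with any DDD design, β_8 carries a causal interpretation only if no contemporaneous shock differentially affected the treatment group in the experimental states over the same period.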
The AF4Q Theory of Change
• Increased transparency
– Inpatient & ambulatory performance measurement
– Cost & efficiency
– Patient experience
• Information is being used by:
– Consumers, to inform decision making
– Purchasers / employers / plans
– Providers, to improve

17 Communities Across America

The Who – AF4Q Alliances
Albuquerque, NM: Aligning Forces for Quality in Albuquerque
Boston, MA: Greater Boston Aligning Forces for Quality
Central Indiana: Indiana Health Information Exchange
Cincinnati, OH: Health Improvement Collaborative of Greater Cincinnati
Cleveland, OH: Better Health Greater Cleveland
Detroit, MI: Greater Detroit Area Health Council
Humboldt County, CA: Community Health Alliance of Humboldt-Del Norte
Kansas City, MO: Kansas City Quality Improvement Consortium
Maine: Maine Aligning Forces for Quality
Memphis, TN: Healthy Memphis Common Table
Minnesota: MN Community Measurement
Puget Sound: The Puget Sound Health Alliance
South Central Pennsylvania: Aligning Forces for Quality – South Central PA
West Michigan: Alliance for Health
Western New York: P2 Collaborative of Western New York
Willamette Valley, OR: Oregon Health Care Quality Corporation
Wisconsin: Wisconsin Collaborative for Healthcare Quality

The What – AF4Q Areas of Focus
◦ Performance Measurement & Public Reporting
◦ Consumer Engagement
◦ Quality Improvement
◦ Equity
◦ Targeted alignment

AF4Q National Program Office: George Washington University
• The National Program Office works in concert with the Robert Wood Johnson Foundation
• The NPO is responsible for the day-to-day management and oversight of the AF4Q project
• The NPO coordinates and deploys technical assistance to the Alliances for PM/PR, CE, QI, and Equity

For More Information
Visit www.rwjf.org/qualityequality/af4q.
The “why” of the program
◦ Program goals
◦ Theory of change
◦ Assumptions
◦ Evidence base
The “what” and “how” of the program
◦ Interventions and their degree of standardization
◦ Requirements
◦ Timing
◦ Implementation approach and context
The key actors and their roles
◦ Sponsors, implementers, intermediaries, others

[AF4Q logic model (Rev. 10/9/09): Within a historical context and external environment, AF4Q community Alliances develop among employers, labor, public purchasers, insurers, providers, hospital leadership, nurse leaders, publicly funded healthcare organizations, public health experts, and consumers representing the community population. The Alliances carry out the AF4Q interventions – consumer engagement activities, public reporting initiatives, quality improvement initiatives, and race/ethnicity/language (R/E/L) data collection – and work toward Alliance sustainability. Intermediate outcomes: patient experience, patient activation, price & quality transparency, provider quality improvement, patient safety, care coordination, care site transitions, diffusion of best practices, nurse-sensitive outcomes. Longer-term outcomes: quality improvement, cost reductions, improved health status, reductions in disparities.]

Tracking
◦ Systematic tracking of Alliance activities and relevant health information and activities in the Alliances’ communities, e.g., availability of public reports, CE activities, and state policy proposals
Key Informant Interviews
◦ In-person site visit interviews with multiple stakeholders
◦ Phone interviews
Surveys
◦ Consumer survey
◦ Physician survey
◦ Alliance survey
Secondary Data
◦ Dartmouth Atlas
◦ H-CAHPS
◦ Area Resource File

[Data collection timeline: repeated rounds of key informant interviews and Alliance surveys; consumer and physician surveys; secondary data sources; and continuous Alliance tracking throughout the evaluation period]

Questions/Discussion

Research Design in Realistic Evaluation Research
Thinking Outside the O-X-O Box
Focus on “proving” internal validity of the intervention
RCT and QED are best suited for discrete interventions (e.g., drugs, tests, procedures)
Rely on sufficient statistical power to reject the null
Do not tell us much about the mechanism or process by which change occurs
Control away (hold constant) the context of the intervention

• Things change – traditional designs are not adaptive
• Context as intervention – traditional designs ignore context
• The end of the story is not the story – traditional designs are not suited to providing timely results to intervention sites, funders, and evaluators
• Politics of evaluation – traditional designs are not suited to the needs and expectations of multiple stakeholders

Interventions as complex social systems
◦ Tracking and incorporating change (intentional and unintended) in the evaluation design
◦ Anticipating change and making appropriate adjustments
◦ Assessing change to evaluate implementation

Context as a condition for intervention success or failure
Interaction of context and intervention
Context as a component of the intervention

Separating long-term, short-term, and intermediate effects
Being realistic about the time required for social change

Not all research questions require power calculations
Alternatives when faced with limited power (see the sketch at the end of this section)
Compromising understanding of complex interventions and their effects for the sake of statistical power

Research Design as Balancing Act
Funders
Intervention sites
National Program Office
Other agencies and external groups
Evaluators themselves

Data Collection Decisions
◦ What – outcomes, process, structure, context
◦ When – frequency
◦ Level – market/community, intervention site, individual stakeholder
◦ Type – qualitative, survey, secondary
◦ Mix – in what combination

How will the data be used?
How much will the data cost to collect and analyze?
How do the data relate to other data (complementarity)?
How useful/interesting will the data be to various evaluation stakeholders?

Scaling evaluation
Allocating scarce evaluation resources
Doing realistic evaluation on a limited budget
◦ How much?
◦ How deep?
◦ How often?
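To make the limited-power point above concrete, here is a small sketch using statsmodels’ power calculations. The site counts are invented for illustration; the pattern, not the specific numbers, is the point.

```python
# Sketch: minimum detectable effect (MDE) when the unit of analysis is
# the site and only a few sites exist. Site counts are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Smallest standardized effect (Cohen's d) detectable with 80% power at
# alpha = 0.05, comparing 17 intervention sites to 17 comparison sites.
mde_small = analysis.solve_power(nobs1=17, alpha=0.05, power=0.80, ratio=1.0)
print(f"17 sites per arm:  MDE d = {mde_small:.2f}")   # roughly d ~ 1.0

# The same calculation with 200 sites per arm.
mde_large = analysis.solve_power(nobs1=200, alpha=0.05, power=0.80, ratio=1.0)
print(f"200 sites per arm: MDE d = {mde_large:.2f}")   # roughly d ~ 0.28
```

With only a handful of sites, only very large effects are statistically detectable, which is one reason evaluations of complex programs lean on qualitative and mixed methods rather than on site-level significance tests alone.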
Questions/Discussion

Implementing a Realistic Evaluation of a Complex Program
Balancing Process and Product

Methods decisions drive team composition
The double whammy:
◦ Realistic evaluation often requires specialized skills across multiple disciplines
◦ Complex programs typically involve various types of data and points of data collection
Significant time devoted to process with large and varied teams

The Balancing Act continues
Expect to devote time to developing and maintaining relationships with multiple entities and individuals
◦ The funder and their partners
◦ The intervention sites
◦ The National Program Office and their consultants

The “Are you evaluating…?” question
◦ Expect it
◦ Create a team culture of adaptability
◦ Plan to devote time to:
– Discussing it
– Reviewing research design in light of it
– Adapting to it
– Documenting it

Mixed methods require collection of both quantitative & qualitative data
Qualitative methods are resource- and time-intensive
Many team members may be new to qualitative and mixed methods
Research design provides a framework; understanding context and tracking change may require more
Many day-to-day decisions

Multiple types of data
◦ Key Informant Interviews (KII)
◦ Observations
◦ Project documentation
Challenges of qualitative data in evaluation
◦ Need to move the data quickly
◦ Many people working with the data
Look at existing tools and strategies
◦ Coding
◦ CAQDAS (computer-assisted qualitative data analysis software)
Build or adapt other tools and strategies

Start with the end in mind… what do you need to know in order to answer your research questions?
Determine whom (by role, characteristics, etc.) to interview, and when
Communicate with intervention sites early about interviews; set target dates
Identify specific topics and questions
Develop interview protocol(s) and determine the level of interview structure needed
Make data creation decisions (record and transcribe interviews, write field notes, etc.)
Create recruitment materials
Gain IRB approval of protocols and recruitment materials
Test protocols and train interviewers
Work with intervention sites to identify and schedule specific people who fit your identified need categories
Conduct interviews
Transcribe interviews and/or write field notes
Develop codebook(s) for the data
Train coders and establish inter-coder reliability processes (see the sketch at the end of this document)
Code data and enter into a software program (if using one)
Begin work on first papers/products by identifying relevant codes and pulling data

[Screenshots: KII data preparation – codebook, coded page, and software output]

Multiple Audiences
◦ The Balancing Act continues
◦ The evaluation team itself is the first audience
Multiple or “Layered” findings
◦ Overarching question
◦ Interim results
All in ‘Real Time’

A project supported by the Robert Wood Johnson Foundation
Questions/Discussion
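As an illustration of the inter-coder reliability step in the KII process above, the sketch below computes Cohen’s kappa for two coders applying the same codebook to the same excerpts. The codes and assignments are invented; scikit-learn’s cohen_kappa_score is one convenient implementation.

```python
# Sketch: checking agreement between two coders on the same excerpts.
# Codes (CE, QI, PR) and assignments are invented for illustration.
from sklearn.metrics import cohen_kappa_score

# Element k = the code each coder assigned to excerpt k.
coder_a = ["CE", "QI", "PR", "QI", "CE", "PR", "QI", "CE", "PR", "QI"]
coder_b = ["CE", "QI", "PR", "CE", "CE", "PR", "QI", "CE", "QI", "QI"]

# Kappa corrects raw agreement (8/10 here) for chance agreement.
kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # ~0.70 for these assignments
```

Teams typically agree on a threshold (often around 0.70 or higher), reconcile disagreements, refine the codebook, and re-check reliability before coders proceed independently.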