522_Mid & Final Lectures Chen & Holden

Evaluation Planning; Forms of
Evaluation; Evaluation Stages—
Chen and Holden & Zimmerman
Mario A. Rivera
PA 522
School of Public Administration
Practical Program Evaluation: Assessing and
Improving Planning, Implementation, and
Effectiveness, by Huey-Tsyh Chen


Chen’s approach (Chapter 3): Chen proposes a taxonomy
of program evaluation, one built around the program stage
that is the desired focus of the evaluation, as well as
around the desired purpose of the evaluation (for either
program improvement or program impact assessment).
Program Planning Stage. The first of the four stages is the
program planning stage. This is the very beginning of
program evaluation. Stakeholders at this stage—for
example, program designers and managers, other program
principals, and clients or beneficiaries—are developing a
plan that will serve as a foundation for specifying,
organizing, and implementing a program at a future date.
Chen—Second Stage, Implementation
Implementation Stage. Program evaluation has, for much of
its history, focused principally, and narrowly, on outcomes.
Evaluation practice, however, suggests that program failures
are often essentially implementation failures. Consequently, the
practical scope of program evaluation has gradually broadened
to include process evaluation, i.e., evaluation of implementation
process, or processes. Focus on process is necessary when
looking for explanations for shortfalls in program results.
 The current view is that much of implementation failure can
be traced back to poor program planning and to poor program
design and development. Evaluators can make important
contributions to programs by assessing these developmental
steps. Consequently, there needs to be concern with the entire
programmatic arc in evaluation: program planning, articulating
program theory (theory of change), assessing implementation,
and outcomes assessment.
Chen—Third Stage, Mature Implementation
Mature Implementation Stage. This stage follows initial
implementation at a point when the program has settled
into fairly routine activities and tried-and-true ways of doing
things. Rules and procedures for conducting program
activities are now well established. Stakeholders are likely
to be interested in the following: determination of the
sources of immediate problems, accountability (generation
of data reassuring to those to whom stakeholders are
accountable), and continuous program improvement.
 Even in maturity, a program is likely to have problems such
as client dissatisfaction with services. Identifying and
resolving these problems is key to improving a program.
And, as a program matures, stakeholders may think more
about accountability, requiring concerted effort in the direction
of performance monitoring and reporting.

Chen—Fourth Stage, Program Outcome


Outcome Stage. A fourth stage of program growth is
known as the outcome stage. Following a period of
program maturity, stakeholders inside and outside the
program want to know more or less definitively
whether the program is achieving its goals.
An evaluation at this point can serve any of several
evaluation needs, including merit assessment and
fidelity assessment (how well the program has
functioned, whether it was implemented as planned,
and how closely it has come to projected outcomes).
However, Chen reminds us in his writings that there
needs to be an adaptation of fidelity assessment to
program evolution—the fidelity-adaptation approach.
This even pertains to mature, well-settled programs.
Evaluation Phases, Purposes & Types
Program Development Phase: Design-phase or Developmental
Evaluation. Helps ensure that programs are well conceived and
well designed (e.g., in reference to well-established best practices).
Program Implementation Phase: Formative and/or Process
Evaluation. Formative evaluation helps improve program
implementation and management; process evaluation focuses on
assessment of program operational and management process(es).
Evaluations can be both at once.
Program Outcome Phase: Summative, Outcome, or Impact
Evaluation. Helps determine whether and to what extent a
program has worked, by gauging its demonstrable effects
(results, outcomes).
Further Defining Evaluation Phases and Types
 Design-phase/Developmental Evaluation: Conducted before or
early in program implementation, testing evaluation plans, change
models (rationales), action models (implementation plans), etc.
 Formative Evaluation: designed to determine (especially during
later developmental phases of an intervention): (1) the feasibility of
program implementation; (2) the aptness of change and action
models; and (3) the short-term social, behavioral, or other impacts
of an intervention. Focused on program improvement.
 Process Evaluation: Designed to ascertain the degree to which
replicable program procedures were implemented with fidelity by
trained staff according to an articulated program plan; “black box,”
systems-based evaluation. Assesses program process(es). If also
meant to inform and improve these, it may properly be called a
formative evaluation as well. However, “formative” and “process”
evaluation are too often used interchangeably, blurred.
 Outcome or Impact (Summative) Evaluation: Intended to assess
the feasibility, efficacy, and cost-effectiveness of a program
intervention in producing significant, long-term benefits for a
well-defined population. Results-oriented evaluation for accountability.
Chen’s Evaluation Strategies classification; Holden
& Zimmerman on evaluation planning
Chen proposes four evaluation strategies that correspond to
program phases as just discussed: (1) assessment strategies
(judging the results or performance of an intervention effort);
(2) developmental strategies (judging the planning and early
implementation of the intervention); (3) theory-elucidation or
“enlightenment” strategies, which aim to make explicit the
underlying assumptions and change models and action models
of an intervention (often at early program stages); and (4)
partnership strategies (ways of involving stakeholders, as well
as other organizations in strategic and operational
collaboration, and ways of evaluating such engagement).
 The distinction is based on the purpose or objectives of the
evaluation, and what aspect of a program is under scrutiny.
More than one of these efforts could be undertaken in one
evaluation, probably at different stages of the evaluation.
 Both Chen and Holden stress evaluation planning, a projective
process that occurs prior to carrying out an evaluation.

Holden & Zimmerman
Planning for evaluation involves:
1. Stating the purpose of the evaluation
2. Understanding a program’s organizational and political context
3. Determining the uses of the prospective evaluation
4. Working with stakeholders to identify primary and secondary
evaluation questions
5. Ensuring stakeholders' buy-in for the evaluation
Holden and Zimmerman developed a model called
Evaluation Planning Incorporating Context, or EPIC, which
aims to engage stakeholders, describe the program, and
focus the evaluation plan. It provides a way to address issues
in the pre-implementation phase of program evaluation. There
are five steps in the model: assessing context (understanding
the organizational and political environment, defining
relationships, and determining the level of evaluation);
gathering reconnaissance (specifying evaluation uses and
validating evaluative perspectives); engaging stakeholders;
describing the program; and focusing the evaluation.
Evaluation Planning Incorporating Context
(EPIC)—Model Overview & Review
The EPIC model provides a heuristic (a set of rules of thumb)
for evaluation planning rather than a specified set of steps
required for all evaluations.
Some parts of the model may be more or less applicable
depending on such issues as the type of evaluation, the
setting of the evaluation, the outcomes of interest, and
the sponsor's interests. Therefore, the EPIC model can
be used as a kind of instruction guide to prepare for a
program evaluation.
However, the EPIC model as such is not in particularly
wide use. Evaluation practitioners do, however, ordinarily
undertake evaluation planning (distinct from but often
connected to program planning), following similar steps.
Holden: Importance of Evaluation Planning
For Holden & Zimmerman, planning the evaluation is key to
building evaluation capacity. Planning an evaluation involves
anticipating what will be required to collect information,
organize it, analyze it, and report it; in short, what will be
involved in administering the evaluation.
 Everything from the articulation of evaluation questions to
data-collection strategies should (to the extent feasible) be
undertaken collaboratively with stakeholders and program
partners. Examples of questions: Have critical program
activities occurred on time and within budget? Why is site A
performing better than site B despite identical programs?
 A cogent evaluation plan presupposes a strong program plan.
 Things the evaluator would need to know:
  Which activities were viewed as critical?
  Program time frames, budget by activity
  When each activity began/ended
  Total cost of each critical activity

EPIC Model Sequence
[Diagram: the five EPIC steps converge in evaluation implementation]
1. Assess Context: understand the organizational and political
environment, define relationships, determine level of evaluation
2. Gather Reconnaissance: specify evaluation uses, validate perspectives
3. Engage Stakeholders: identify and invite stakeholders, define roles,
establish group process
4. Describe the Program: theory, history, evolution
5. Focus the Evaluation: specify evaluation questions, assess
feasibility, prioritize
The Holden text incorporates the CDC Program Evaluation
Framework, which stresses the continuous nature of evaluation.
[Diagram: CDC Framework for Program Evaluation in Public Health,
MMWR, 1999]
Steps (a continuous cycle): Engage stakeholders; Describe the
program; Focus the evaluation design; Gather credible evidence;
Justify conclusions; Ensure use and share lessons learned.
Standards (at the center of the cycle): Utility, Feasibility,
Propriety, Accuracy.
Holden & Zimmerman, Chen, & Role-sharing
As stated in the reading Role Sharing Between Evaluators
and Stakeholders in Practice, and as Chen stresses,
program evaluation has moved away from traditional
“objective observation” and now strives to engage stakeholders more fully in the evaluative process. The Vancouver
case suggests that sharing roles between evaluators and
project leaders, peer educators, and others was the norm
among study participants but varied by their orientation and
role. There was some tension and confusion due to this
role-sharing, the kind of cross-functionality which is likely to
obtain most markedly early on in a collaborative process
(and often in Community-based Participatory Research).
Role sharing requires strong communications skills on the
part of evaluators. When these skills are absent, role
confusion prevails, and role clarification is then needed.
 How did role-sharing characterize the education evaluation
case study in Holden & Zimmerman?

Holden & Zimmerman—education evaluation
The INEP’s initial phase (1999-2002) involved adapting and
testing the curriculum with community stakeholders. INEP
was housed in the Rocky Mountain Prevention Research
Center (RMPRC), where a resource teacher and staff were
available during parent events and evaluation activities. After
2002, funding by the USDA allowed for continued support for
INEP teachers, specifically staff assistance to shop for food,
perform food preparation, organize teaching materials, and
clean up after lessons.
 The INEP Public School Organization engaged teachers,
principals, the district superintendent, the school board, and
the State Board of Education. A district health coordinator
knowledgeable about the program provided a critical link
among agencies to provide the program with greater
visibility. All of these actors became involved in the
evaluation in varying degrees, so that role-sharing clearly
obtained in both program implementation and evaluation.

The education evaluation’s challenges
Evaluators had to determine whether desired curriculum
adaptations had taken place, whether unanticipated changes in
program and context could be expected to alter evaluation
findings, and, in general, whether the program was effective.
 It is difficult to measure changes in eating behavior. And there are
many different barriers to healthy eating, particularly socioeconomic status. You can teach children good eating habits and
good food choices but some families are unable to afford healthy
foods, or there may not be readily substitutable healthy foods for
cultural staples (such as tortillas and other starch-heavy staples),
or such changes may not be culturally sensitive or desirable. In
this respect the evaluation was not culturally responsive; it was
largely culturally unaware.
 Another problem is the burden placed on the teachers to carry out
these programs even though they are overextended already.
 The evaluators also noticed that there were many barriers to deal
with in order to obtain necessary permissions to conduct research
in public schools. It is essential to understand the political
hierarchy of the system and the legal-regulatory requirements
involved, in order to gain required approvals.

Planning a Service Program Evaluation

Mari Millery was called in by the Leukemia and Lymphoma
Society (LLS) to plan and conduct a pilot study for its
Information Resource Center (IRC). In Planning for a
Service Program Evaluation, Millery discusses the steps
she took in the planning stages of the program evaluation.
The goal of the pilot study was to enhance patient
navigation through the IRC by way of an intervention
program. Both management and the evaluation team
wanted to see the short-term results of this pilot study
before fully launching the intervention program. One short-term
goal was to ensure the feasibility of the program
before implementation. Ultimately, the evaluation team and
LLS wanted to produce positive impacts on the patients’
level of care and quality of life. Millery’s was an instance of
a developmental (or design-phase) program evaluation.
Planning a Service Program Evaluation

Chen stresses the importance of developing a program
rationale concurrently with the development of a program
plan. The program rationale corresponds closely to
the change model for the evaluation, while the program plan
is a key element of its action model. The main purposes of
the program rationale are to define a target group as well as
to specifically explain why those in the target group were
selected—for instance, in reference to a needs assessment.
Program rationales provide support for three main tactics
necessary for the proper planning of a program evaluation:
(1) establishing a foundation for planning, (2) effective
communication, and (3) adequately providing for the
evaluation of outcomes. A program rationale will serve as a
guide that evaluators can follow throughout the planning
process; it will also support effective communications
between evaluators and stakeholders.
Planning a Service Program Evaluation


Chen also discusses strategies and approaches for
articulating program rationales. He begins with a
background information provision strategy whereby
evaluators gather relevant information on things such as
the characteristics of the target group, community needs,
previous evaluative studies, and so on.
In this context, Chen points out the value of both needs
assessment and formative research. A needs
assessment serves to identify, measure, and prioritize
community needs. This in turn can aid in the process of
goal-setting and target group selection, as well as in the
subsequent steps of engaging stakeholders, specifying
the program’s change and action models (or program
theory), and focusing the evaluation. In all, needs
assessment provides the basis for program rationale.
Planning a Service Program Evaluation


Chen’s approach parallels the EPIC model for organizing
program evaluation efforts. Both entail: 1) assessing
context; 2) assessing the need for the program—its
rationale, and how the evaluation is to be used; 3)
engaging stakeholders; 4) describing the program; and 5)
focusing the evaluation.
Under either approach, it is important to begin with an
understanding of the organizational and political context of
the given program, so as to understand in turn why the
program is deemed necessary, and how it is to be
implemented. Gathering reconnaissance is a critical step in
the EPIC model which specifies how the evaluation will be
used, and which should be consistent with prior needs
assessments and with evaluation planning in general. The
community service evaluation particularly stressed the third
step in the EPIC Model, engaging stakeholders.
Chen’s approach
The “conceptualization facilitation” approach is a key topic
in Chen’s discussion of program rationales. Subtopics
include whether to use a facilitative working group and
whether to rely on an intensive interview format in
assessing stakeholders’ backgrounds and preferences.
Noting the frequent tendency of stakeholders to set high,
unachievable goals, Chen stresses the need to set realistic
short- to long-term program aims. Short-term objectives
serve as a kind of grading sheet by which program staff
and evaluators can note and measure tangible successes.
 A pitfall to be avoided, according to Chen, is confusing
objectives with action steps. Here he is distinguishing
between the program rationale (or change model) and
program implementation plan (action model). Clarification
and consistency of program goals is necessary to avoid
creating incompatibility among goals and objectives.

Chen’s approach
Chen describes program plans as “blueprints for the actions
prescribed by program rationales.” In chapter five, How
Evaluators Assist Stakeholders In Developing Program
Plans, Chen moves from the ‘why’ to the ‘how’ part of
helping clients with program planning. Concerns range from
recruiting and training implementers to formulating research
questions.
 Chen emphasizes the importance of a simple, clear, and
realistic program rationale in order to develop an effective
program plan and then an evaluation plan. Unnecessarily
complex and over-detailed program rationales expose
evaluators to complications in the evaluation planning stage.
Checking program rationales against best practices among
similar organizations and programs is one way to streamline
and validate these. One should fully explore how these
successful efforts may serve as templates for the focal
program.

Chen & the Service Program Evaluation case
An action model framework can help the evaluator in
facilitating the development of a new program plan, as seen
in Millery’s work. An action model is a means to ensure that
there are no gaps or inconsistencies in the action plan over
against implementation. In other words, it serves as a kind of
“proofreading” tool for evaluators during planning stages. It is
also a check on how various program activities work together
in actual implementation.
 Chen describes a formative research approach in the context
of a background-information provision strategy. The two main
purposes of this approach are (a) to formulate research
questions and (b) to gather data to answer these questions.
The action model can help evaluators determine which
questions should be researched, gaining insight into the
program in order to develop a cogent evaluation plan.

Chen & the Service Program Evaluation case
Chen specifies six steps for planning an evaluation, which one
can compare to Millery’s/EPIC approach to evaluation planning:
1. Assess, enhance, and ensure implementing organization’s
capacity. Parallels Preskill and Boyle's Evaluation Capacity-Building (ECB) Model as well as the EPIC model.
2. Delineate service content and delivery procedures
3. Recruit, train, and maintain the competency and commitment
of program implementers
4. Establish collaboration with community partners
5. Ecological context: seek external support
6. Identify, recruit, screen, and serve the target population
With regard to the implementing organization, Chen indicates that
technical expertise, cultural competence, and manpower need to
be considered. Technical expertise can determine the readiness
of the implementers to carry out the necessary interventions and
to help with the evaluation. Cultural competence provides them
with effective methods of communication with clients.
Chen & the Service Program Evaluation case
 The program plan should be specific about services provided
(the intervention protocol). A clear explanation of program
services and how they are to be provided is also necessary
(the service delivery protocol). According to Chen, the best
method of providing for apt intervention and service delivery
protocols is one-to-one interaction with program principals.
 Consistent with Chen’s approach, in conducting the Service
Program Evaluation Millery was able to work one-on-one with
colleagues in an agency that greatly valued evaluation and
was willing and able to provide her with essential information
about organizational and program staff capacity, recruitment
and training protocols, and incentives for program participants.
 Chen stresses the importance of collaborative networks for
both the implementers and evaluators of programs. In Millery’s
case, she was evaluating the work of a large health services
provider that was well connected with similar organizations and
had experience with them. Millery built on and amplified these
collaborative relationships in carrying out the evaluation.
Chen & the Service Program Evaluation case
If one uses Chen’s approach as a checklist, Millery was a
very thorough evaluator. To begin with, she sought to provide
a program rationale in addition to the program plan. Initially,
when she responded to the request for proposals (RFP), LLS
outlined for her the idea of patient navigation and desired
outcomes. However, according to Millery, LLS “did not
explain where the concept of patient navigation originated or
whether there was a theoretical or conceptual framework
behind it,” so Millery proceeded to fill these gaps in further
discussion with LLS principals.
 Chen stresses the importance of gathering background
information on both the agency and the program staff. Millery
did this in several ways. At the outset, she researched the
history of the agency, its code of conduct and mission
statement, and so on. She studied the organizational
structure of LLS and consulted LLS staff. Also, she was able
to become familiar with the politics surrounding the relevant
advocacy issues. She also found previous evaluative studies
on which to build her own.

Planning a Service Program Evaluation


In assessing the context of the program, Millery studied the
organizational infrastructure of LLS. Specifically, she
focused on how the IRC is structured as a department
within LLS. She also familiarized herself with LLS’s
management model, which turned out to be much like that
of a private sector organization. This information was vital
for her to be able to understand how evaluation
findings were to be developed and used.
Through background research, Millery was able to draw up
a preliminary list of questions that would guide her initial
conversations with LLS management. She categorized
these questions by their relevancy and priority. An example
of a question that would be asked early on: From where
does the funding for this evaluation study come? Another
was: Why specifically target clients in Wyoming?
Planning a Service Program Evaluation
Millery requested that LLS explain the specific services
provided by IRC (the intervention protocol) and also how they
were to be provided (the service delivery protocol). According
to Millery, “it is much easier to start planning for a study when
the evaluator has a clear picture of how the service is
provided.” Consistent with Chen, Millery wanted to be certain
that service delivery processes were laid out clearly in order to
avoid having any gaps in either program plan or action model.
 Millery had the advantage of working with well-trained
professionals who supported the evaluation. She had little to
do when it came to Chen’s third step: Recruiting implementers
and training them was not an issue. However, Millery did have
to define the relationship she had with IRC staff as well as
with LLS supervisors who were in charge of overseeing the
program evaluation, i.e., role clarification. She had a great
deal of one-on-one interaction with program staff and
management, an important part of evaluation planning.

Planning a Service Program Evaluation
Consequently, Millery and LLS supervisors agreed on a single
liaison between herself and the LLS board of directors. This
simple relationship helped Millery avoid any complications—
specifically, role ambiguity or confusion—that might arise from
multiple and complex relationships with LLS principals. Millery
also considered the importance of maintaining staff confidence.
She made it a point to individually interview each program staff
member and make it clear that the evaluation strictly concerned
the program rather than the staff. It was to be a program-performance evaluation, not an individual performance evaluation.
 Like Chen, Millery valued collaborative networking and
partnership. She sought out organizations that provided services
similar to those provided by LLS. In fact, she was able to find
information on previous survey studies performed by both LLS
and these other organizations. This information not only helped
her formulate research questions, but it also helped her specify
program determinants (mediators and moderators) that were
germane to program implementation and evaluation processes.

Planning a Service Program Evaluation
Why Wyoming residents only? LLS managers explained that
Wyoming did not have a local chapter and therefore the
residents of that state could benefit most from the enhanced
IRC (phone) services, so that these would be evaluated in
Wyoming in lieu of direct-contact chapter services.
 In focusing the evaluation, Millery’s methods mirror those of
Chen. After her initial research of background information on
LLS, formulation of research questions, examination of
collaborative networking, and clarification of roles and
relationships with program managers and staff, Millery was
able to better gauge the program’s (strategic and
operational) goals and objectives and, correspondingly,
establish the evaluation’s goals and objectives. From there,
she was able to determine the feasibility of the evaluation
and the evaluability (evaluation readiness) of the program.

Chen & the Service Program Evaluation case
Chen stresses the importance of establishing a program
rationale (program theory, or change and action models) in
delineating the rationale of an evaluation. Millery helped the
LLS evaluation team members clearly define the rationale,
or change model and intended purposes, of their partnered
evaluation.
 Chen likewise stresses the differences between a change
model and an action model. Millery was able to articulate
the why and the how of the evaluation in the planning stage
in relation to program theory, both the program’s theory of
change and its implementation model. This effort in turn
allowed her to gauge implementation fidelity and success.
 Consistent with Chen, Millery engaged stakeholders in the
program evaluation to the maximum extent possible—
stakeholder involvement was central to the evaluation. It
also helped build evaluation capacity in the organization.

Chen & the “Media Evaluation” case
 This case involves an effort to bring about behavioral
change through social marketing. The Truth® campaign
media evaluation demonstrates how fidelity evaluation can
be effective in assessing social marketing initiatives.
 Chen describes the Fidelity Evaluation Approach as a major
evaluation method well-fitted to a mature implementation
stage. Fidelity evaluation is principally a process evaluation
approach that gauges the degree of congruence between
program change and action models (program theory), on the
one hand, and the program intervention as implemented, on
the other. Target population fidelity evaluations assess
whatever elements of the change and action models are of
special interest to stakeholders. Since outcomes are of vital
interest to stakeholders funding or otherwise supporting the
program, fidelity evaluation is also concerned with program
impacts, and specifically impacts on intended populations.
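As a rough illustration of this congruence idea (not Chen's own procedure), the following Python sketch scores implementation fidelity by comparing planned action-model components against what monitoring data say was actually delivered. The component names, doses, and weights are hypothetical.

```python
# Hypothetical sketch: scoring congruence between an action model and
# observed implementation. Component names and doses are invented.

PLANNED = {                      # action-model components and planned "dose"
    "national_media_buys": 52,   # weeks of airtime per year
    "youth_focus_groups": 12,    # message-testing sessions
    "school_outreach_events": 40,
}

DELIVERED = {                    # what monitoring data say was implemented
    "national_media_buys": 48,
    "youth_focus_groups": 12,
    "school_outreach_events": 25,
}

def fidelity_score(planned, delivered):
    """Average ratio of delivered to planned dose, capped at 1.0 per component."""
    ratios = []
    for component, target in planned.items():
        actual = delivered.get(component, 0)
        ratios.append(min(actual / target, 1.0))
    return sum(ratios) / len(ratios)

if __name__ == "__main__":
    score = fidelity_score(PLANNED, DELIVERED)
    print(f"Overall implementation fidelity: {score:.0%}")
    # A low score flags components where implementation departed from the
    # change/action model and warrants process-evaluation follow-up.
```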
Planning for a Media Evaluation


Health communication may be defined as a complex of
techniques and initiatives intended to inform, influence, and
motivate individual, institutional, and public audiences about
important health issues.
Social marketing is a vehicle for health communication that
seeks to influence social behaviors, not to benefit the
marketer but to benefit the target audience and society as a
whole (Kotler et al., 1971, in Holden p. 124). Social
marketing is the systematic application of commercial
marketing concepts and techniques so as to achieve
specific behavioral goals for a social good. It attempts to
prompt healthy behaviors in a population by using some of
the proven marketing techniques used to promote
commercial products (Kotler et al., 1996, in Holden p. 124).
Media Evaluation case
Media campaigns, and media evaluations, are based on
social marketing theory and behavioral theory, including
theories of exposure, messaging, communication, and
behavior change (Hornik, 2002, in Holden p. 124).
 Media evaluation may be divided into process and outcome
evaluation methods, as follows:
1) Process evaluation helps to assess whether the target
audience has been exposed to a campaign’s messages
and whether the target audience reacts favorably to the
messages [as delivered] in real-world circumstances.
2) Outcome evaluation helps to determine the effects of
messages on health behavior and determinants of
behavior, such as health knowledge, attitudes, and beliefs.
Media evaluations often capture process and outcome
data simultaneously to offer the immediate formative
feedback that can enhance the campaign effort. (Evans et
al., in Holden p. 124)

Media Evaluation case
When (1) immediate reactions to media messages, (2)
longer-term recollections of these, and (3) associated health
outcomes are correlated, process and outcome evaluation
efforts are brought together.
 As a result of the Master Settlement Agreement between
tobacco companies and 46 states, the American Legacy
Foundation initiated the national truth® campaign in
February 2000. From 2000 to 2002, annual funding for the
campaign averaged $100 million per year. National media
purchase was employed by the campaign, as opposed to a
randomized exposure design, for two primary reasons. First,
it was considered that the campaign could not ethically
assign some media markets to low or zero exposure, given
the documented successes of the predecessor Florida “truth”
campaign. Second, a national media purchase was roughly
40% cheaper than a market-to-market purchase, which
would have been necessary to randomize exposure.

Media Evaluation case
The truth® campaign evaluation used data from the 1997–2002
Monitoring the Future annual spring surveys, which were
designed to monitor alcohol, tobacco, and illicit drug use among
youths in the United States. The survey, funded primarily by the
National Institute on Drug Abuse and conducted by the
University of Michigan’s Institute for Social Research, was a
self-administered questionnaire, involving about 18,000, 17,000,
and 16,000 8th-, 10th-, and 12th-grade students a year,
respectively.
In-school surveys such as the National Youth Tobacco Survey
(NYTS) and Monitoring the Future (MTF) are more appropriate
for measuring substance use because they are self-administered
without the presence of parents or others who
could motivate youth to provide socially desirable responses to
substance questions. With its large national sample and
coverage of major media markets where the truth® campaign
was advertised, MTF became the cornerstone of evaluation
planning efforts to assess the campaign’s impact on youth
smoking behaviors. (Evans et al., in Holden p.129)
Chen’s target population fidelity evaluation
Chen’s target population fidelity evaluation looks at
programs’ contact with their target populations. Chen
writes, “Programs must reach sufficient numbers of clients
from the specified target population in order to be effective”
(Chen p.169).
 To conduct a target population fidelity evaluation,
evaluators need to ask three main questions. First, how
many clients were served by the program during a specific
period? Second, how many of the clients served come from
the target population? And third, based on those numbers,
the evaluator makes a judgment call about the program's
performance: Does the number of clients served justify the
program's existence?
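To make the first two questions concrete, here is a minimal sketch (with invented counts, not drawn from Chen or the truth® case) that computes the coverage figures an evaluator would bring to that judgment call.

```python
# Hypothetical sketch of target population fidelity arithmetic.
# All counts are invented for illustration.

clients_served = 1_200                 # clients served during the reporting period
served_from_target = 900               # of those, clients who fit the target profile
estimated_target_population = 10_000   # eligible people in the service area

precision = served_from_target / clients_served           # share of services on-target
reach = served_from_target / estimated_target_population  # share of target reached

print(f"Clients served:            {clients_served}")
print(f"Served from target group:  {served_from_target} ({precision:.0%} of services)")
print(f"Target population reached: {reach:.0%}")
# Whether these figures justify the program's existence (Chen's third
# question) remains a stakeholder judgment, not a computation.
```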

Chen and the truth® campaign evaluation


Chen indicates that the evaluator must remain aware of the
distinction between clients recruited and clients served.
This was the case with the truth® campaign evaluators. In
its first year, the campaign reached three fourths of
American youths and was associated with campaign-related
changes in youth attitudes toward tobacco and the
tobacco industry. (Siegel, 2002, in Farrelly p.431)
All survey-based analyses have limitations. Measures of
youth smoking prevalence are self-reported and may be
subject to social desirability bias so that youths are less
likely to report smoking in areas with high exposure to the
campaign than in areas with lower exposure. This would
lead to an overstatement of the campaign’s effects.
However, some studies have found that underreporting of
smoking by youths is actually minimal.
Chen and the truth® campaign evaluation
“Results also rely on repeated cross-sectional surveys, not
repeated measures on the same students, which weaken
the strength of causal inference” (Messeri et al., 2002, in
Farrelly p.430). Evaluators included youths surveyed before
2000 as well, so that students in the 1997–1999 surveys
served as an unexposed [cross-sectional] control group.
 For the purpose of the truth® campaign, the second
component of target population fidelity evaluation that must
be addressed is: “Is it possible that the estimated truth®
campaign effects may have been due to other unmeasured
youth-focused prevention activities (e.g., in-school
substance abuse–prevention programs, the national
antidrug campaign by the Office of National Drug Control
Policy, other media-borne messages, secular trends in
social behavior) that were correlated by chance with the
truth® campaign exposure”? This is the attribution question
in causal analysis (Chen).

Chen and the truth® campaign evaluation
Following a socio-ecological model (Glanz et al., 1997, in
Farrelly p.431) that recognizes multiple levels and types of
influence on health behaviors (in particular, intrapersonal,
interpersonal, community, media, policy, economic), evaluators
controlled for a wide array of potential confounding influences.
 Considering the Media Market Level: Low-exposure markets
tended to be more rural, White, and less educated, and have
lower incomes—all factors associated with smoking—than
markets with high campaign exposure. Failing to control for
these factors (high pre-treatment smoking rates coupled with
low exposure to truth® campaign messages) could lead to a
spurious negative correlation between campaign exposure and
smoking prevalence. Evaluators statistically modeled possible
correlations between preexisting media market smoking rates
and the subsequent campaign dose. (Heckman et al., 1989, in
Farrelly p.431). This controlled for average market-level
smoking rates, effectively making each market its own control
group. Evaluators also included direct, local media market–
level measures of potential confounders.
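The sketch below illustrates the kind of dose-response model with market-level controls described above: smoking outcomes regressed on campaign exposure ("dose") with media-market and year controls, so that each market effectively serves as its own comparison. The data, variable names, and effect size are simulated, not the MTF or Farrelly et al. data or specification.

```python
# Hypothetical sketch of a dose-response model with market and year controls.
# Simulated data; not the evaluators' actual data or model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2_000
df = pd.DataFrame({
    "market": rng.integers(0, 20, n),     # media-market identifier
    "year": rng.integers(1997, 2003, n),  # survey year
    "dose": rng.uniform(0, 4, n),         # campaign exposure (e.g., scaled ad GRPs)
})
# Simulate a smoking indicator with a small negative dose effect.
base = 0.25 - 0.015 * df["dose"] + 0.005 * (df["market"] % 3)
df["smokes"] = (rng.uniform(size=n) < base).astype(int)

# Linear probability model with market and year fixed effects; a logistic
# specification (smf.logit) would be the more usual choice for a binary outcome.
model = smf.ols("smokes ~ dose + C(market) + C(year)", data=df).fit()
print(model.params["dose"])  # estimated change in smoking probability per unit of dose
```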

Chen and the truth® campaign evaluation
Chen’s third and ultimate component of target population
fidelity evaluation asks, “Does the number of clients [and
distribution of clients] served justify the program’s
existence”? Findings suggest that the truth® campaign may
have had the largest impact among 8th-grade students,
which is consistent with evidence from Florida that indicates
the Florida truth campaign led to declines in smoking rates
and that smoking rate declines were greatest among middle
school students (grades 6 through 8) from 1998 to 2002.
(Farrelly et al., p.427)
 In addition to being consistent with previous findings, this
study improves on previous research by reaching
generalized conclusions about the effects of antismoking
campaigns for youths across the U.S. and by implementing
a pre/post quasi-experimental design that controlled for
potential threats to validity, such as secular trends in
smoking prevalence, the influence of cigarette prices, state
tobacco control programs, and other factors.

Chen and the truth® campaign evaluation



This result was confirmed in multivariate analyses that
controlled for confounding influences and indicated a ‘dose-response’ relationship between truth® campaign exposure
and current youth smoking prevalence.
The evaluators found that by 2002, smoking rates overall
were 1.5 percentage points lower than they would have
been in the absence of the campaign, which translates to
roughly 300,000 fewer youth smokers based on 2002 US
census population statistics. (Farrelly et al., p.428). That
was the actual impact attributed to the campaign.
In sum, the truth® campaign was effective, demonstrably
associated with significant declines in youth smoking
prevalence.
[Chart: Truth Campaign impact. Marginal impact equals projected
smoking rates without the program versus rates with the program.]
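The translation from a 1.5 percentage-point reduction to roughly 300,000 fewer youth smokers is simple arithmetic; the sketch below reproduces it with a rounded, assumed youth-population figure (the evaluators used 2002 Census counts, not this number).

```python
# Hypothetical back-of-the-envelope version of the reported impact estimate.
# The 20-million youth-population figure is a rounded assumption, not the
# evaluators' actual 2002 Census input.

percentage_point_drop = 0.015   # smoking prevalence 1.5 points lower than projected
youth_population = 20_000_000   # assumed size of the relevant youth population

fewer_smokers = percentage_point_drop * youth_population
print(f"Approximate youth smokers averted: {fewer_smokers:,.0f}")  # roughly 300,000
```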
Chen and the truth® campaign evaluation
The evaluators found that implementers were consistent—
i.e., faithful to the socio-ecological change/action model
underlying the program—in their execution of the program,
and this model therefore became the efficacy test in the
evaluators’ assessment of the campaign. The program also
made consistent use of social marketing theory (involving
vectors of exposure, messaging, communication, and
behavior change).
 Therefore, program fidelity in the sense of close theory-implementation correspondence was high, as was target
population fidelity. Consistent with the Results-mapping
approach (Reed, et al.), the truth® campaign was impactful
in both the quantity (extent) and quality of results attained.

1. Evans, W. D., Davis, K. C., & Farrelly, M. C. (2009). Planning for a Media
Evaluation. In Holden, D. J., & Zimmerman, M. A. (Eds.), A Practical Guide to Program
Evaluation Planning, pp. 123-142.
2. Farrelly, M. C., Davis, K. C., Haviland, M. L., Messeri, P., & Healton, C. G.
(2005). Evidence of a Dose-Response Relationship Between “truth” Antismoking
Ads and Youth Smoking Prevalence. American Journal of Public Health, Vol. 95,
No. 3, pp. 425-431.
The contribution/attribution challenge
 Attribution for outcomes is always a challenge.
 A credible performance story needs to address attribution.
 Sensible accountability needs to address attribution.
 Complexity significantly complicates the issue.
 Attribution is based on the theory of change (change model)
of the program, and it is buttressed by evidence validating the
theory of change.
 Attribution is reinforced by examination of other influencing
factors.
 Contribution analysis builds a reasonably credible case
about the difference the program is making. Attribution
determinations are based on analyses of net program
impact (program contribution).
Attribution
 Outcomes are not controlled; there are always other factors
at play.
 Conclusive causal links don't exist.
 You are trying to understand better the influence you are
having on intended outcomes.
 You need to understand the theory of the program (program
theory) to establish plausible association.
 Something like contribution analysis can help:
  Measuring outcomes
  Linking outcomes to actions (activities and outputs), i.e.,
attribution: Are we making a difference with our
interventions?
Accountability for outcomes
In order to be accountable, we need to credibly demonstrate:
 The extent to which the expected results were achieved
 The contribution made by activities and outputs of the
program to the outcomes
 The learning or other behavioral/social changes that have
resulted, and, therefore
 The soundness and propriety of the intervention means used.
Contribution analysis:
 There is a postulated theory of change.
 The activities of the program were implemented.
 The theory of change is supported by evidence.
 Other influencing factors have been assessed and
accounted for.
Therefore:
 The program very likely made a net contribution, to be
gauged against the counterfactual: What would have
occurred, plausibly, in the absence of the program?
(A minimal sketch of this counterfactual logic follows below.)
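The following Python sketch (with invented rates and adjustment shares) illustrates the counterfactual comparison behind a net-contribution claim: observed outcomes are compared with a projected no-program trend, and part of the gap is discounted for other influencing factors.

```python
# Hypothetical sketch of the counterfactual logic in contribution analysis.
# Rates and the other-factors share are invented for illustration.

observed_outcome = 0.215        # e.g., observed prevalence with the program
counterfactual_outcome = 0.230  # projected prevalence had the program not run
                                # (e.g., extrapolated from pre-program trends)

gross_difference = counterfactual_outcome - observed_outcome   # 1.5 points

# Discount the share of the gap plausibly due to other influencing factors
# (price increases, other campaigns, secular trends), here assumed at 20%.
other_factors_share = 0.20
net_contribution = gross_difference * (1 - other_factors_share)

print(f"Gross difference vs. counterfactual: {gross_difference:.3f}")
print(f"Estimated net program contribution:  {net_contribution:.3f}")
```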
Theory of change: Truth Campaign Change Model
 A results chain with embedded assumptions and risks is
identified.
 An explanation of why the results chain is expected to work:
what has to happen.
 These two elements comprise the Change Model.
[Diagram: Anti-smoking campaign leads to reduction in smoking]
Assumptions: the target is reached (national media coverage),
messages are heard, messages are convincing, commercial-campaign
techniques are effective, the non-smoking-as-rebellion concept
works, and other major influences are identified and their impact
considered and measured.
Risks: target not reached, poor message in some contexts, lack of
randomization introduces validity issues, attribution difficulties.
Other influencing factors
 The literature and knowledgeable others can identify the
possible other factors (direct and indirect influences).
 Reflecting on the theory of change may provide some
insight on their plausibility.
 Prior evaluation/research may provide insight.
 Their relative size compared to the program intervention
can be examined.
 Knowledgeable others will have views on the relative
importance of other factors.
Chen: Program Theory and Fidelity
 Theory-driven program evaluation
  All programs have implicit theories.
  Program modeling (e.g., via logic models) helps make
implicit (or tacit) theory more explicit and therefore
subject to scrutiny.
 Implementation fidelity
  Preserving causal mechanisms in implementation
  Scaling up
  Staying close to projected, intended outcomes (What of
positive unintended outcomes? Or negative unintended
consequences?)
Chen: Program Implementation and Fidelity
 Intended model vs. implemented model (is the program
implemented as intended? Focused on the program action model)
 Normative theory (the induced positive behavioral/social change
that is intended—e.g., changing smoking behaviors)
 Causative theory (theory of change, change model)
 However, models too often substitute for reality (they should
not—a kind of “formalism”). Dangers of reification.
 Models can support:
  Assessment of “evaluability” (is the program ready to be
evaluated, or how to ready a program for evaluation—based
on the work of Joseph Wholey)
  Client needs and resource assessments
  Program development, refinement, capacity-building
  Monitoring and evaluation

Chen: Program Implementation and Fidelity
 Formative and process forms of evaluation are undertaken to
assess whether the program is proceeding as planned, the
fidelity of implementation to program design (Chen), and the
degree to which changes need to be made.
 Summative evaluation is conducted to assess whether planned
outcomes have been achieved (fidelity of outcomes) and what
impacts (intended and unintended) have occurred.
 Context for evaluating fidelity: it may become evident that the
program has strayed from its design but for good reasons,
making for better outcomes; if so, make all of that explicit.
 Considerations for conceptualizing fidelity:
  Multilevel nature of many interventions
  Level and intensity of measurement aligned with need
  Capacity for monitoring fidelity
  Burden of monitoring fidelity
  Alignment with desired outcomes

 Program theory can be either descriptive or prescriptive.
 Descriptive theory specifies what impacts are generated and
how this occurs. It suggests a causal mechanism, including
intervening factors, and the necessary context for program
efficacy. Descriptive theories are generally empirically-based,
relying on best practices in the practitioner and academic
literatures. Description here includes causative sequences.
 Prescriptive theory indicates what ought to be done. It
specifies program design and implementation, what outcomes
should be expected, and how performance should be judged.
 Comparison of the program’s descriptive and prescriptive
theories can help to identify, diagnose, and explain
implementation difficulties—the two should be consistent.
 Logic modeling is largely limited to normative theory–what is
expected to happen. However, we need both normative and
causative forms of theory. Both are required to explain how
project outputs are expected to lead to a chain of intermediate
outcomes and, in turn, eventual impacts, based on program
observations. Causal logic models incorporate both kinds of
theory—depicting both actual and expected program elements.
Causal logic models
[Diagram: an Intervention leads to an Outcome, with Other Factors
also influencing the Outcome. A causal logic model clarifies the
theory of how interventions produce outcomes. Multiple methods and
techniques establish the relative importance of causes of changes
in outcomes.]
[Chart: over time, the relative influence of a program decreases
over against exogenous factors and actors; plotted lines labeled
"Program" and "Endogenous Actors," vertical axis from High to Low.]
Determinants of Success (Mediating & Moderating Variables,
or Mediators and Moderators): Mediators are intervening
variables (intervening between the intervention effort and
program outcomes), while moderators are contextual factors
that constrain or enable those outcomes—Chen, page 91.
[Diagram: the Intervention leads to Mediator(s) (following from the
intervention), which lead to the Outcome; exogenous Moderators
condition these pathways.]
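As a toy illustration of Chen's distinction (the variable names and effect sizes are invented), the sketch below simulates an intervention whose effect on an outcome runs through a mediator and is conditioned by a moderator.

```python
# Hypothetical sketch of mediation vs. moderation (cf. Chen, p. 91).
# All variable names and coefficients are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000

intervention = rng.integers(0, 2, n)   # 1 = received the program
moderator = rng.integers(0, 2, n)      # contextual factor, e.g., partner support

# The mediator follows from the intervention (an intervening variable).
mediator = 0.6 * intervention + rng.normal(0, 1, n)

# The outcome depends on the mediator, and the moderator conditions (scales)
# how strongly the mediated effect carries through.
outcome = (0.8 + 0.4 * moderator) * mediator + rng.normal(0, 1, n)

effect = outcome[intervention == 1].mean() - outcome[intervention == 0].mean()
print(f"Average intervention effect on the outcome: {effect:.2f}")
```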
Chen pp. 240-241; Action Model for HIV/AIDS education
Action Model (which, along with the Change Model, constitutes Program Theory)
Implementation (intervention → determinants → program outcomes)
Mediating Variables
Instrumental variables inherent in program design. For example,
openness to learning and change regarding sexual behaviors may
well be either presumed or actively fostered by the program, since
this cognitive factor would be considered a key variable intervening
between program intervention(s) and behavioral change(s) among
program subjects.
Moderating Variables
Often less than positive: e.g., lack of partner support, social and
economic variables such as poverty, education, prejudice. However,
they may be positive: e.g., the incidence of help from supportive
networks—support groups, family and friends, reinforcing
messages, social and institutional and cultural supports.
Impacts on individual subject(s) of the intervention, with
“impacts” defined as the aggregate of comparative net outcomes.
Logic Model
A graphic representation that
clearly identifies and lays out the logical
relationships among program conditions
(needs), resources/inputs, activities,
outputs, and outcomes or impacts.
Welfare-To-Work Logic Model
[Diagram reconstructed: Inputs → Activities/Outputs → Short-term to
Intermediate Outcomes → Impacts]
Inputs (for each strategy): $ and FTE
Strategy 1: Improve Hard Skills of Clients to Fit Hiring Needs of the
Current Economy
  Activities: # of training courses held; # training methodologies
  developed; # employer surveys completed; # career counseling
  sessions provided; # employers offering continuing education
  assistance
  Outputs: # of clients trained for standard employment; # of clients
  trained or completing a degree in a high-wage employment area
  Outcomes: increase % of clients with adequate hard skills for
  standard employment; increase % of clients completing continuing
  education coursework for high-wage career advancement
Strategy 2: Improve the Soft Skills of Clients to Aid in Job Placement
and Retention
  Outcome: increase % of clients with appropriate soft skills
Strategy 3: Enhance Day Care Access
  Outcome: decrease % of clients without day care access
Strategy 4: Enhance Access to Transportation
  Outcome: decrease % of clients without transport
Strategy 5: Decrease Barriers Presented by Physical Disability
  Outcome: increase % of employers offering an "integrative"
  workplace for people with disabilities
Goal (Impact): Increase Self-Sufficiency in the Community through
Increased Employment
  Measures: decrease in welfare (ratio of TANF funds to wages paid
  to clients); decrease in unemployment (unemployment rate overall;
  unemployment rate for clients); increase in self-sufficiency (% of
  community achieving a self-sufficient wage; % of clients achieving
  a self-sufficient wage)
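To show how such a logic model can be captured for monitoring purposes, here is a minimal sketch that stores one strategy's chain as a simple data structure. The field names and the Strategy 1 details follow the slide above, but the structure itself (and the class name) is only an illustration, not a prescribed format.

```python
# Hypothetical sketch: a logic model chain held as a plain data structure,
# which an evaluation team could extend into a monitoring template.
from dataclasses import dataclass, field

@dataclass
class LogicModelStrategy:
    name: str
    inputs: list[str] = field(default_factory=list)
    activities: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)
    impact: str = ""

strategy1 = LogicModelStrategy(
    name="Improve hard skills of clients",
    inputs=["$ (program funds)", "FTE (staff time)"],
    activities=["training courses held", "employer surveys completed",
                "career counseling sessions provided"],
    outputs=["# clients trained for standard employment"],
    outcomes=["% clients with adequate hard skills",
              "% clients completing continuing education coursework"],
    impact="Increased self-sufficiency through increased employment",
)

# Walking the chain in order is what lets an evaluator check that every
# output plausibly links to an outcome, and every outcome to the impact.
for stage in ("inputs", "activities", "outputs", "outcomes"):
    print(f"{stage}: {getattr(strategy1, stage)}")
print(f"impact: {strategy1.impact}")
```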
Logic Model & implicit/explicit program theory
A good logic model clearly identifies Program Goals,
Objectives, Inputs, Activities, Outputs, Desired Outcomes,
and Eventual Impacts, in their sequential interrelation.
 Program theory specifies the relationship between program
efforts and expected results (cf. theory-driven and utilization-focused evaluation—Chen). Causal logic models specify the
connections among program elements with reference to a
specific theory or theories of change and of action; in some
instances, they may just provide if-then linkages.
 A logic model helps specify what to measure in an
evaluation, guides assessment of underlying assumptions,
and allows for stakeholder consultation and corrective action,
for telling a program’s “performance story.”
 Partnered, collaborative programs involving a number of
agencies or organizations have more complex causal chains;
it is a challenge to capture and assess these in evaluation,
as indicated in the following two slides.

Multi-agency Monitoring & Evaluation Logic Model
[Diagram: inputs from Agency 1, Agency 2, Agency 3, and other
sources each feed programs; the programs' process measures lead to
intermediate outputs, short-term outcomes, outcomes, and long-term
impacts.]
Adapted from Milstein & Kreuter, A Summary Outline of Logic Models:
What Are They and What Can They Do for Planning and Evaluation?
CDC, 2000.
Complex effects chain in partnered programs
[Diagram: Partners 1, 2, 3, etc. contribute to shared common
outcomes; this creates attribution difficulties and transparency and
accountability challenges.]
Identifying Design and Data Collection Methods
in Evaluation Planning
 Involve the client and stakeholders in deciding what information
is necessary to best answer each key evaluation question.
 Evaluation designs specify the organization, structure, and
resources needed for data collection and analysis.
 Causal designs: (quasi-)experimental designs
  Multiple regression, ANOVA, t-tests, or other statistical
methods are applied in order to answer evaluation questions
(see the sketch after this list).
 Descriptive designs: describe (case study), analyze the
program, show a trend (time series), assess public opinions
(cross-sectional), illustrate a process (“thick” description)
  Commonly used in needs assessment and process
evaluation research
 The evaluator and stakeholders examine each question
carefully to identify any important research design issues.
 Most evaluations involve multiple research designs or a
combination of methods—hybrid evaluation designs. This
is also called “mixed-method” evaluation, involving the
triangulation of both methods and data.
 It is important to discuss design early to see whether:
  Focus groups are available, random assignment is
appropriate, time is available for collecting data, there is
access to data sources such as program files, staff training
needs are indicated, costs are manageable, etc.
  Intensive interviewing, semi-structured interviews, or other
methods are feasible.
  The design is “doable.”
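As a minimal illustration of the statistical side of a causal design (the data are simulated and the scores and effect size invented, not taken from any case discussed here), the sketch below compares outcomes for program and comparison groups with a two-sample t-test.

```python
# Hypothetical sketch: comparing program vs. comparison group outcomes,
# the kind of analysis a (quasi-)experimental evaluation design calls for.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
program_group = rng.normal(loc=72, scale=10, size=150)     # e.g., post-test scores
comparison_group = rng.normal(loc=68, scale=10, size=150)

t_stat, p_value = stats.ttest_ind(program_group, comparison_group, equal_var=False)
print(f"Mean difference: {program_group.mean() - comparison_group.mean():.1f} points")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A regression with covariates (or ANOVA across multiple sites) would follow
# the same logic while adjusting for other measured influences.
```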
Identifying Appropriate Information Sources
 Once information requirements are agreed upon, the sources
of the information must be specified; the following questions
are key:
  Who will have the information or access to it?
  Who will be able to collect those data?
 Using existing data as an information source
  Does the necessary information already exist in a readily
available form? If so, it is preferable to use it.
 Commonly used information sources
  Program recipients, deliverers, persons who have knowledge
of the program recipients, public documents/databases, file
data, reports, position papers, grant proposals
 Policies that restrict information sources
  Are there policies about collecting data from clients or
program files? Confidentiality, anonymity, privacy, IRB protocols
Identifying Appropriate Information Sources
 Client and stakeholder involvement in identifying sources
  The evaluator, by training and experience, often can
identify key sources of information.
  Client groups will be able to identify sources of
information that may be missed by the evaluator.
  This is one area where evaluator-client and evaluator-stakeholder collaboration yields helpful answers and
makes for a sense of shared ownership of the evaluation
process.
Identifying Data Collection Methods, Instruments
 Data collected directly from individuals identified as sources
of information
  Self-reports: interviews, surveys, rating scales, focus groups,
logs/journals
  Personal products: tests, narratives, survey responses
 Data collected by an independent observer
  Narrative accounts
  Observation forms (rating scales, checklists)
  Unobtrusive measures; participant observation
 Data collected from existing information
  Public documents: federal, state, and local databases, Census
data, etc.
  Review of organizational documents: client files, notes of
employees/directors, audits, minutes, publications, reports,
proposals
  Program files: original grant proposal, position papers, program
planning documents, correspondence, e-mails, etc.

 After identifying methods, it is important to review the
adequacy of the techniques:
  Will the information collected provide a comprehensive
picture?
  Are the methods both legal and ethical?
  Is the cost of data collection worthwhile?
  Can data be collected without undue disruption?
  Can data be collected within time constraints?
  Will the information be reliable and valid for the
purposes of the evaluation?
Determining Appropriate Conditions for
Collecting Information
 Examples of issues around data collection:
  Will sampling be used?
  How will data actually be collected?
  When will data be collected?
 Specifying sampling procedures to be employed
  Sampling helps the researcher draw inferences about the
population in the study.
  Sampling is useful when it will not diminish confidence in
the results.
  Sample size must be appropriate; too small a sample is of
limited value, and an over-large one is infeasible (a
sample-size sketch follows this slide).

 Specifying how and when information will be collected
  Who will collect the data?
  For interviews and focus groups: might characteristics of
the evaluator or evaluators influence data collection (for
instance, cultural distance)?
  What training should be given to people collecting the
data? Strive for consistency across applications.
  In what setting should data collection take place?
  Is confidentiality protected?
  Are special equipment or materials needed?
  When will the information be needed? When will it be
available? When can it conveniently be collected?
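Where sampling is planned, a quick power calculation helps judge whether a sample is "appropriate" in the sense noted above. The sketch below uses the standard normal-approximation formula for two independent proportions; the baseline and target rates are invented placeholders.

```python
# Hypothetical sketch: per-group sample size to detect a change between two
# proportions (e.g., service-satisfaction or behavior rates).
import math
from scipy.stats import norm

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Classic normal-approximation formula for two independent proportions."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

if __name__ == "__main__":
    n = sample_size_two_proportions(p1=0.25, p2=0.20)  # detect a 5-point drop
    print(f"Approximately {n} respondents needed per group")
```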
Determining Appropriate Methods to Organize,
Analyze, and Interpret Information
 Develop a system to code, organize, store, and retrieve data.
 For each evaluation question, specify how the collected
information will be analyzed:
  Identify statistical and other analytical techniques.
  Designate some means for conducting the analysis.
 Interpreting results
  Share information with clients to gain perspective on
potential interpretations of the data, and to ensure
completeness and correctness of the data collected.
  The evaluation plan should allow for the generation
and recording of multiple or conflicting interpretations.
  Interpretations should consider multiple perspectives.
Determining Appropriate Ways to Report
Evaluation Findings
 What is the appropriate way to report findings?
  Audience, content, format, date, context of presentation
 Suggested questions (Chen):
  Are reporting audiences defined?
  Are report formats and content appropriate for audience
needs?
  Will the evaluation report balanced information?
  Will reports be timely and effective? For what purposes?
  Is the report plan responsive to the rights to information
and data ownership of the audiences?
Evaluation Plan Checklist—outline
The following is a checklist of the primary components of a typical evaluation plan; plans should be tailored to specific requirements, beyond this checklist:
Introduction and Background
 A description of the project, strategy, or activity that you are evaluating
Research Questions
 Questions that you think need answers in order to understand the impact of your work and to improve the evaluation effort
Checklist—outline
Program Outcomes and Measures
 The desired outcomes of the project or program effort about to be undertaken or already underway, and the measures that you will use to indicate that you are progressing toward those outcomes. The evaluation plan often articulates desired program outcomes (objectives) more fully and clearly than program documents. This is one way that evaluations can play a formative role.
Methodology and Approach
 Methodology or techniques (e.g., surveys, use of agency records, focus groups, key informant interviews, pre- and post-tests, etc.) that you will be using to collect the measurement data
Checklist—outline

Data Collection Management and Work-plan
 The data sources (e.g., administrative data sources, respondent groups) that will be used, how data will be managed, and who will be responsible for data collection, data "clean-up," quality control of data collection, and eventual "ownership" of the data. Questions of control and of data ownership were major concerns with the NARCH program.
Proposed Products
 An evaluation report or several reports, an executive summary, a PowerPoint presentation to program principals, grant proposals, handouts, press releases? Who will receive them (the intended audiences)? The contractor, the funding agency, and other key actors may wish to have distinct reports. How will these products be used? Are various uses to be sequenced in particular ways?
Evaluation Research Questions
Most evaluation plans are prepared annually for multi-year programs; the
following retrospective and prospective questions often arise:
1. Planning and Implementation:
Was program planning adequate? Was the implementation carried out as
planned? How well? Were there process or implementation barriers?
2. Opportunities: What anticipated and unanticipated opportunities for
the generation of information arose? Did advisory groups, IRBs, focus
groups, and other key respondents function as expected? Were
information and resources provided as planned, in terms of type, quantity,
and timing? For instance, in the Native American Research Centers for
Health Program, data collection went well, but data from six different sets
of project Principal Investigators were often delayed, not available in the
right format, or missing the information expected.
3. Participation and Utilization: How many and what key stakeholders
participated? Were there unexpected barriers to participation?
4. Developmental/consultative role for the evaluator: If there are
serious shortcomings in any of these areas, should the evaluator become
involved in redressing them? Questions about the proper evaluator role.
Evaluation Research Questions
5. Satisfaction: Are/Were participants satisfied? Why? Why not?
6. Awareness: What is the level of awareness of the subject in the
target community? Has awareness increased?
7. Attitudes, norms: What is the perception of an activity or service
(example: cancer screening)? Have perceptions changed?
8. Knowledge: What does the target population know about an issue
or service (example: substance abuse awareness)? Do they now
know more about it? Are they more engaged? For example, in the
NARCH Program, parent-facilitators were trained in two communities
to develop and implement a family-based curriculum for their
late-elementary-school children, and in-depth semi-structured interviews
indicated a very significant increase in awareness and buy-in on their part.
9. Behavior: What do people do (example: display a willingness to
undergo cancer screening)?
10. Capacity: Has community institutional capacity increased? E.g., in
the NARCH program, development of IRB capability.
Outcomes and Measures
 What are the stated goals and objectives of the program? For NARCH they were drawn from the NIH, and entailed (1) reducing historic mistrust between tribal communities and university researchers, (2) reducing health disparities between Native communities and the American population at large, and (3) reducing under-representation of AI/AN in the health professions.
 How do goals and objectives connect to one another?
 What are the specific program intervention strategies to attain these goals and objectives? You may need to have a strategic planning retreat or two with clients to define these.
 How do goals and objectives connect to strategies? Examine assumptions as you link the two.
 How will progress toward goal attainment be assessed: what indicators or measures will tell you how you are doing? Consider both short-term and long-term measures, and the time-frames for each. Define indicators and measures in dialogue with clients, beneficiaries, and stakeholders.
Methodology and Data Collection Approach
 Specify the data collection methods for each measure (which links back to indicators, objectives, and goals, and to inputs and resources, in a logic model). Specify both qualitative and quantitative measures, and to what extent you will use mixed or hybrid approaches. (A rough logic-model fragment is sketched after this list.)
 What specific types of data will you collect, from what sources? Who are the respondent groups to reach?
 What will be your timeline for collecting the data?
 What systems (computerized or paper) are in place to collect, manage, and store data? If none, what is your plan for addressing this gap?
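As a minimal illustration of that mapping, the fragment below links a single goal to an objective, strategy, indicator, measure, and data collection method. The goal is taken from the NARCH objectives cited earlier in these notes; everything else is an invented placeholder, to be defined in dialogue with clients and stakeholders.

# One row of a hypothetical logic-model mapping: the goal is from the NARCH
# objectives cited above; the objective, strategy, indicator, measure, and
# time-frame are invented placeholders to be defined with clients.
logic_model_row = {
    "goal": "Reduce under-representation of AI/AN in the health professions",
    "objective": "Increase AI/AN student entry into health-science degree programs",
    "strategy": "Mentored summer research placements (hypothetical example)",
    "indicator": "Count of program alumni enrolling in health-science programs",
    "measure": "Registrar enrollment records plus an annual alumni survey",
    "method": "Administrative data review; survey (mixed methods)",
    "timeframe": "Tracked annually; long-term target reviewed at program end",
}

for element, value in logic_model_row.items():
    print(f"{element:>10}: {value}")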

Data Collection Management and Work Plan
 What are the tasks (e.g., designing assessment tools such
as surveys, building necessary relationships to obtain
data)?
 What will the interplay be between program
implementation protocols and the sequencing and content
of the evaluation?
 Who is responsible for instrument design (usually the
evaluator) and for data collection, analysis, and
presentation (evaluator in concert with client)?
 How long will it take to collect, analyze and prepare to
present the information?
 How much will it cost?
Data Collection Management and Work-plan
Projecting the Time Involved:
 Project/Account for preparation time and
implementation time for focus groups, interviews, and
site visits.
 Logistics of arranging space, transportation, and
compensation (if any) for participants, etc.
 Participant recruitment, invitations, follow-up
 Instrument development and training/practice (for
facilitators, if other than the evaluator)
 Obtaining data from programs or administrative/public
sources can be time consuming
 Plan for research time, follow-up time in response to
data requests, clarification after receipt, etc.
Data Collection Management and Work-plan
Projecting, in the Evaluation Plan, the Time Involved in Data Collection:
 A day of data collection often requires a day of analysis; as a rule of thumb, allow at least two hours of analysis for a two-hour focus group
 Build time for review and feedback from other stakeholders into the preparation-of-products phase
 Allow for flexibility in your projection of time involved for general project management, meetings to discuss the data collection and analysis, and unintended or unexpected events
Projecting the Cost Involved (a rough combined time-and-cost sketch follows this list):
 Your own costs as a consultant, such as time invested. It is not uncommon for contracts to be priced per deliverable, in their totality, rather than per hour, while still allowing for separate billing of travel and other extraordinary costs; this has significant advantages for evaluators.
 Direct costs (mailing, copying, telephone use for data collection)
 Incentives for participation (NARCH used $20 gift cards)
 Costs for conducting focus groups (food, space, transportation)
 Presentation materials, usually the evaluator's responsibility as pertains to her or his own presentations.
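A rough back-of-the-envelope projection, applying the rules of thumb above, might look like the following Python sketch. The number of focus groups, participants, and per-session direct costs are hypothetical placeholders; only the $20 incentive figure comes from the NARCH example.

# All counts and dollar figures below are hypothetical placeholders.
focus_groups = 4
hours_per_focus_group = 2
participants_per_group = 8

collection_hours = focus_groups * hours_per_focus_group
# Rule of thumb above: at least as much analysis time as collection time.
analysis_hours = collection_hours

incentive_per_participant = 20        # gift cards, as in the NARCH example
food_space_transport_per_group = 150  # assumed direct cost per session

direct_costs = (focus_groups * participants_per_group * incentive_per_participant
                + focus_groups * food_space_transport_per_group)

print(f"Projected collection time: {collection_hours} hours")
print(f"Projected analysis time (minimum): {analysis_hours} hours")
print(f"Projected direct costs: ${direct_costs}")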
Evaluation Products and Deliverables
These are to be specified in the Evaluation Plan:
 Evaluation Reports: the principal product; annual, semi-annual, end-of-project
 Other Products: quarterly reports, reports to boards, separate reports to client and funding source, press releases, position papers, etc.
 Audience/Purpose: specify the various audiences corresponding to each of these
These presuppose an articulation early in the report of key Evaluation
Questions, which connect back to program goals and objectives,
indicators and measures, type of evaluation (design or developmental,
process or formative, impact or summative). They also connect with key
stakeholder and program principal questions. For example: What would
I want to know as a program manager, funder, board member,
community member? What would I want to read to be able to
understand the issue? What would I need to know in order to take
action? In what ways does the evaluation address program
accountability and responsibility? In order to carry all of this out well, you
need to engage principals and stakeholders, hold community forums
(also opportunities for satisfaction surveys), circulate drafts.
The Evaluation Management Plan (Work Plan)


 The final task in planning the evaluation is describing how it will be carried out
 An Evaluation Management Plan or Work Plan is often essential to help with implementation and oversight of the project
    Who will do the evaluation?
    How much will it cost? Will it be within budget?
    How does the sequencing of tasks define these issues? (A minimal work-plan sketch follows below.)
    Can evaluators count on continued support from top management, for instance in mandating timely and usable reporting of data by program principals?
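A work plan can be as simple as a sequenced task list with owners and due dates. The sketch below is a hypothetical minimal version; task names, owners, and dates are placeholders only.

from datetime import date

# A minimal, hypothetical work plan: each task records its owner, due date,
# and the tasks it depends on. All values are illustrative placeholders.
work_plan = [
    {"task": "Finalize evaluation questions with client",
     "owner": "evaluator + client", "due": date(2025, 2, 15), "depends_on": []},
    {"task": "Design instruments (survey, interview guides)",
     "owner": "evaluator", "due": date(2025, 3, 15),
     "depends_on": ["Finalize evaluation questions with client"]},
    {"task": "Collect and clean data",
     "owner": "evaluator + program staff", "due": date(2025, 6, 30),
     "depends_on": ["Design instruments (survey, interview guides)"]},
    {"task": "Analyze data and draft annual report",
     "owner": "evaluator", "due": date(2025, 8, 31),
     "depends_on": ["Collect and clean data"]},
]

# Listing dependencies makes task sequencing, and its time and cost
# implications, explicit for oversight.
for item in work_plan:
    print(f"{item['due']}  {item['task']}  ({item['owner']})")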
Agreements and contracts
 Potential problems that arise during the
evaluation can be more easily resolved if client
and evaluator share a firm understanding
 A well-documented agreement prior to
launching the evaluation study concerning
important research procedures (and caveats,
e.g., as to data availability) is essential
Contract and form samples may be found at:
http://www.wmich.edu/evalctr/checklists/contracts.pdf