Evaluating Health Promotion Programs

Questions for the Planner
• Did the program have an impact?
• How many people stopped smoking?
• Were the participants satisfied with the program?
• Should we change anything about the way the program was offered?
• What would happen if we just changed the time we offer the program?

Evaluating Health Promotion Programs
• Evaluation: An Overview
• Evaluation Approaches, Frameworks, and Designs
• Data Analysis and Reporting

Evaluation: An Overview
• There are two general categories of evaluation: informal evaluation and formal evaluation
• Informal Evaluation: an absence of breadth and depth; no procedures or formally collected evidence
  – e.g., consulting colleagues about a program concern, or making a program change based on participant feedback
  – Adequate when making minor changes in programs
• Formal Evaluation: “systematic, well-planned procedures”
  – Most valuable procedure in health program evaluation
  – Controls for a variety of extraneous variables that could produce incorrect evaluation outcomes

Characteristics of Formal and Informal Evaluation
Characteristic          Formal                                Informal
a. Degree of freedom    Planned activities                    Spontaneous activities
b. Flexibility          Prescribed procedures or protocols    Flexible procedures or protocols
c. Information          Precision of information              Depth of information
d. Objectivity          Objective scores of measure           Subjective impressions
e. Utility              Maximal comparability                 Maximal informativeness
f. Bias                 Potential narrowed scope              Subjective bias
g. Setting              Controlled settings                   Natural settings
h. Inference            Strong inferences                     Broad inferences
Source: Williams and Suen (1998).
Important Points
• The evaluation must be designed early in the process of program planning
• It begins when the program goals and objectives are being developed
• It must also answer questions about the program as it is being implemented
• It must involve a collaborative effort of program STAKEHOLDERS (i.e., those who have a vested interest in the program)
• Understand that evaluation can be a political process
  – Judgment often “carries the possibilities of criticism, rejection, dismissal, and discontinuation” (Green & Lewis, 1986, p. 16)
• Ethical considerations are needed for the individuals involved
  – Institutional Review Board or Human Subject Review Committee

Define EVALUATION
• In general terms: “Evaluation is a process of reflection whereby the value of certain actions in relation to projects, programs, or policies are assessed” (Springett, 2003, p. 264)
• Applied to health promotion: “The comparison of an object of interest against a standard of acceptability” (Green & Lewis, 1986, p. 362)

Standards of Acceptability
Defined: “the minimum levels of performance, effectiveness, or benefits used to judge the value” (Green & Lewis, 1986). They are typically expressed in the “outcome” and “criterion” components of a program’s objectives.
Standard of Acceptability                                    Example
Mandates (policies, statutes, laws) of regulating agencies   Percent of children immunized for school; percent of priority population wearing safety belts
Priority population health status                            Rates of morbidity and mortality compared to state and national norms
Values expressed in the local community                      Type of school curriculum expected
Standards advocated by professional organizations            Passing scores, certification, or registration examinations
Norms established via research                               Treadmill tests or percent body fat
Norms established by evaluation of previous programs         Smoking cessation rates or weight loss expectations
Comparison or control groups                                 Used in experimental or quasi-experimental studies

Types of Evaluation
• Process Evaluation: “Any combination of measurements obtained during the implementation of program activities to control, assure, or improve the quality of performance or delivery. Together with preprogram studies, makes up formative evaluation” (Green & Lewis, 1986, p. 364).
  – Examples include getting reactions from program participants about the times programs are offered or about program speakers. Such measurements could be collected with a short questionnaire or a focus group.
• Impact Evaluation: Focuses on “the immediate observable effects of a program, leading to the intended outcomes of a program; intermediate outcomes” (Green & Lewis, 1986, p. 363).
  – Measures of awareness, knowledge, attitudes, skills, and behaviors yield impact evaluation data.
• Outcome Evaluation: Focuses on “an ultimate goal or product of a program or treatment, generally measured in the health field by morbidity or mortality statistics in a population, vital measures, symptoms, signs, or physiological indicators on individuals” (Green & Lewis, 1986, p. 364).
  – Outcome evaluation is long-term in nature and takes more time and resources to conduct than impact evaluation.
• Formative Evaluation: “Any combination of measurements obtained and judgments made before or during the implementation of materials, methods, activities or programs to control, assure or improve the quality of performance or delivery” (Green & Lewis, 1986, p. 362).
  – Examples include, but are not limited to, a needs assessment or pilot testing a program.
• Summative Evaluation: “Any combination of measurements and judgments that permit conclusions to be drawn about impact, outcome, or benefits of a program or method” (Green & Lewis, 1986, p. 366).

Comparison of Evaluation Terms
Planning → Start of Implementation → End of Implementation
Formative, Process → Summative (Impact, Outcome)
Source: McKenzie et al. (2005, p. 295)

Purpose for Evaluation
• Basically, programs are evaluated to gain information and make decisions
• Six Specific Reasons:
  – To determine achievement of objectives related to improved health status
  – To improve program implementation
  – To provide accountability to funders, community, and other stakeholders
  – To increase community support for initiatives
  – To contribute to the scientific base for community public health interventions
  – To inform policy decisions

The Process for Evaluation Guidelines
• Planning
  – Review the program goals and objectives
  – Meet with the stakeholders to determine what general questions should be answered
  – Determine whether the necessary resources are available to conduct the evaluation; budget for additional costs
  – Hire an evaluator, if needed
  – Develop the evaluation design
  – Decide which evaluation instrument(s) will be used and, if needed, who will develop the instrument
  – Determine whether the evaluation questions reflect the goals and objectives of the program
  – Determine whether the questions of various groups are considered, such as the program administrators, facilitators, planners, participants, and funding source
  – Determine when the evaluation will be conducted; develop a time line
• Data Collection
  – Decide how the information will be collected: survey, records and documents, telephone interview, personal interview, or observation
  – Determine who will collect the data
  – Plan and administer a pilot test
  – Review the results of the pilot test to refine the data collection instrument or the collection procedures
  – Determine who will be included in the evaluation (e.g., all program participants, or a random sample of participants)
  – Conduct the data collection
• Data Analysis
  – Determine how the data will be analyzed
  – Determine who will analyze the data
  – Conduct the analysis, and allow for several interpretations of the data
• Reporting
  – Write the evaluation report
  – Determine who will receive the results
  – Choose who will report the findings
  – Determine how (in what form) the results will be disseminated
  – Discuss how the findings of the process or formative evaluation will affect the program
  – Decide when the results of the impact, outcome, or summative evaluation will be made available
  – Disseminate the findings

Practical Problems or Barriers in Evaluation
• Planners either fail to build evaluation into the program planning process or do so too late (Solomon, 1987; Valente, 2002; Timmreck, 2003).
• Resources (e.g., personnel, time, money) may not be available to conduct an appropriate evaluation (Solomon, 1987; NCI, 2002; Valente, 2002).
• Organizational restrictions on hiring consultants and contractors (NCI, 2002).
• Effects are often hard to detect because changes are sometimes small, come slowly, or do not last (Solomon, 1987; Glasgow, 2002; Valente, 2002).
• Length of time allotted for the program and its evaluation (NCI, 2002).
• Restrictions (i.e., policies, ethics, lack of trust in the evaluators) that limit the collection of data from those in the priority population (NCI, 2002).
• It is sometimes difficult to distinguish between cause and effect (Solomon, 1987).
• It is difficult to separate the effects of multi-strategy interventions (Glasgow et al., 1999), or to isolate program effects on the priority population from “real world” situations (NCI, 2002).
• Conflicts can arise between professional standards and do-it-yourself attitudes (Solomon, 1987) with regard to appropriate evaluation design.
• Sometimes people’s motives get in the way (Solomon, 1987; Valente, 2002).
• Stakeholders’ perceptions of the evaluation’s value (NCI, 2002).
• Intervention strategies are sometimes not delivered as intended (i.e., type III error) (Glasgow, 2002), or are not culturally specific (NCI, 2002; Valente, 2002).

Who Will Conduct the Evaluation
• Internal evaluation (within a department)
• External evaluation (outside a department)
• Persons with credibility and objectivity
• Persons who are clear about the evaluation goals and design, and who report the results accurately regardless of the findings

Evaluation Approaches, Frameworks, and Designs
• Seven Major Evaluation Approaches
  – 1. Systems Analysis Approaches
    • Efficiency-based: determining which programs are the most effective using inputs, processes, and outputs.
    • Economic evaluations are typical strategies used in systems analysis approaches
      – Comparison of alternative courses of action in terms of both costs and outcomes
    • Two common cost analyses in health promotion
      – Cost-benefit: how resources can be best used, yielding the dollar benefits received from the dollars invested in the program
      – Cost-effectiveness: quantifies the effects of a program in non-monetary terms. Indicates how much it costs to produce a certain effect (e.g., years of life saved, number of smokers who stop smoking).
        » Check out the handout by McKenzie et al. (2005) on “cost-benefit and cost-effectiveness as a part of the evaluation of health promotion programs.”
  – 2. Objective-Oriented Approaches
    • Specify program goals and objectives, and collect evidence to determine whether the goals and objectives have been reached
  – 3. Goal-Free Approaches
    • Focus on all outcomes, including unintended positive or negative side effects
  – 4. Management-Oriented Approaches
    • Focus on identifying and meeting the informational needs of managerial decision makers
  – 5. Consumer-Oriented Approaches
    • Focus on developing evaluative information on “products” (e.g., using checklists and criteria to allow an evaluation of the “product”)
  – 6. Expertise-Oriented Approaches
    • Rely “primarily on the direct application of professional expertise to judge the quality of whatever endeavor is evaluated”
  – 7. Participant-Oriented Approaches
    • A unique one. Focus on a process in which the involvement of participants (stakeholders in that which is evaluated) is central in determining the values, criteria, needs, data, and conclusions for evaluation.
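
The two cost analyses listed under the systems analysis approach can be illustrated with a little arithmetic. The sketch below is only an illustration: the function names and every figure in it (program cost, dollar benefits, number of quitters) are hypothetical and do not come from the slides or from McKenzie et al.

```python
def benefit_cost_ratio(dollar_benefits: float, program_cost: float) -> float:
    """Cost-benefit view: dollars of benefit received per dollar invested."""
    if program_cost <= 0:
        raise ValueError("program_cost must be positive")
    return dollar_benefits / program_cost


def cost_per_effect(program_cost: float, effects: int) -> float:
    """Cost-effectiveness view: dollars spent per unit of effect,
    for example per smoker who stops smoking."""
    if effects <= 0:
        raise ValueError("effects must be positive")
    return program_cost / effects


# Hypothetical figures for a smoking cessation program (assumptions only).
program_cost = 12_000.0      # total dollars invested in the program
dollar_benefits = 30_000.0   # estimated dollar value of benefits
quitters = 20                # smokers who stopped smoking

bcr = benefit_cost_ratio(dollar_benefits, program_cost)
cpe = cost_per_effect(program_cost, quitters)

print(f"Benefit-cost ratio: {bcr:.2f} dollars returned per dollar invested")
print(f"Cost-effectiveness: {cpe:.2f} dollars per smoker who quit")
```

The first figure expresses the result entirely in dollars, while the second keeps the effect in its natural unit (smokers who quit), which is the distinction the two bullets draw.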
Framework For Program Evaluation
• Six steps must be completed in any evaluation, regardless of the setting (CDC, 1999)
  – Step 1: Engaging stakeholders
  – Step 2: Describing the program
  – Step 3: Focusing the evaluation design
  – Step 4: Gathering credible evidence
  – Step 5: Justifying the conclusions
  – Step 6: Ensuring use and sharing lessons learned

[Figure: Framework for Program Evaluation. The six steps form a cycle around the four standards: Utility, Feasibility, Propriety, Accuracy.]

• Four Standards of Evaluation (CDC, 1999)
  – Utility standards ensure that information needs of evaluation users are satisfied
  – Feasibility standards ensure that the evaluation is viable and pragmatic
  – Propriety standards ensure that the evaluation is ethical
  – Accuracy standards ensure that the evaluation produces findings that are considered correct

Selecting an Evaluation Design
• Critical to the outcome of the program
• Few perfect evaluation designs
  – Because no situation is ideal, and there are always constraining factors, such as limited resources
  – The challenge is to devise an optimal evaluation as opposed to an ideal evaluation

Selecting an Evaluation Design: Asking These Questions
• How much time do you have to conduct the evaluation?
• What financial resources are available?
• How many participants can be included in the evaluation?
• Are you more interested in qualitative or quantitative data?
• Do you have data analysis skills or access to computers and statistical consultants?
• In what ways can validity be increased?
• Is it important to be able to generalize your findings to other populations?
• Are the stakeholders concerned with validity and reliability?
• Do you have the ability to randomize participants into experimental and control groups?
• Do you have access to a comparison group?
Selecting an Evaluation Design: A Four-Step Model (Dignan, 1995)
• Step 1: Orientation to the situation
  – Resources, constraints, and hidden agendas
• Step 2: Defining the problem
  – Dependent variables
  – Independent variables
  – Confounding variables
• Step 3: Basic design decision
  – Qualitative
  – Quantitative
  – Combination of both
• Step 4: Plans for
  – Measurement
  – Data collection
  – Data analysis
  – Reporting of results

Evaluation Design
• True Experimental Design
  – Pretest-posttest design: Exp. vs. Control
      R   O1   X   O2
      R   O1       O2
  – Posttest-only design: Exp. vs. Control
      R   X   O
      R       O
  – Time series design: Exp. vs. Control
      R   O1   O2   O3   X   O4   O5   O6
      R   O1   O2   O3       O4   O5   O6
• Quasi-Experimental Design (no random assignment)
  – Pretest-posttest design: Exp. vs. Control
      O1   X   O2
      O1       O2
  – Time series design: Exp. vs. Control
      O1   O2   O3   X   O4   O5   O6
      O1   O2   O3       O4   O5   O6
• Non-Experimental Design
  – Pretest-posttest design: Experimental only
      O1   X   O2
  – Time series design: Experimental only
      O1   O2   O3   X   O4   O5   O6

Internal Validity
• Defined: the degree to which the program caused the change that was measured.
• Factors that can threaten internal validity
  – History, maturation, testing, instrumentation, statistical regression, selection, mortality, etc.

External Validity
• Defined: the extent to which the program can be expected to produce similar effects in other populations; also known as generalizability.
• Factors that can threaten external validity
  – Social desirability, expectancy effect, Hawthorne effect, placebo effect, etc.
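
The R in the true experimental designs stands for random assignment. The following is a minimal sketch of how participants might be randomized into experimental and control groups. It assumes a simple list of participant IDs. The IDs, the function name `randomize`, and the fixed seed are illustrative choices, not part of the source material.

```python
import random


def randomize(participants, seed=None):
    """Randomly split participants into (experimental, control) groups.

    The list is copied and shuffled, then cut in half, so every
    participant has the same chance of landing in either group.
    """
    rng = random.Random(seed)     # a seed makes the split reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs (assumption for illustration only).
ids = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomize(ids, seed=42)

print("Experimental (receives X):", sorted(experimental))
print("Control (no X):           ", sorted(control))
```

Because group membership is decided by chance rather than by the planner or the participants, selection is less likely to threaten internal validity than in a design that uses intact groups.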
Data Analysis and Reporting
• Data Management
• Data Analysis
  – Univariate data analyses
    • Examine one variable at a time
  – Bivariate data analyses
    • Study two variables simultaneously
  – Multivariate data analyses
    • Study three or more variables simultaneously

Applications of Data Analyses: CASE #1
Program Goal: Reduce the prevalence of smoking
Target Population: The seventy smoking employees of Company XYZ
Independent Variable: Two different smoking cessation programs
Dependent Variable: Smoking cessation after one year
Design:
    R   A   X1   O1
    R   B   X2   O1
where: R = random assignment; A = Group A; B = Group B; X1 = Method 1; X2 = Method 2; O1 = self-reported smoking behavior
Data collected: Nominal data: quit, yes or no
    Smoking Employees    A      B
    Quit                 24%    33%
    Did not quit         76%    67%
Data analysis: A chi-square test of statistical significance for the null hypothesis that there is no difference in the success of the two groups

Applications of Data Analyses: CASE #2
Program Goal: Increase AIDS knowledge
Target Population: The 1,200 new freshmen at ABC University
Independent Variable: A 2-hour lecture-discussion program
Dependent Variable: AIDS knowledge
Design:
    O1   X   O2
where: O1 = pretest; X = 2-hour program at freshman orientation; O2 = posttest
Data collected: Ratio data; scores on a 100-point-scale test
    Test Results          Pretest    Posttest
    Number of students    1,200      1,200
    Mean score            69.0       78.5
Data analysis: A dependent t-test of statistical significance for the null hypothesis that there is no difference between the pre- and posttest means on the knowledge test

Applications of Data Analyses: CASE #3
Program Goal: Improve breast cancer self-examination skills
Target Population: All women employees in the Valley River Shopping Center
Independent Variable: A 2-hour training on self-examination
Dependent Variable: Score on breast cancer self-exam skills test
Design:
    A   O1   X   O2
    B   O1       O2
where: A = all women in VRC; B = all women in the Cascade shopping center; O1 = pretest scores; X = a 2-hour program on skill training; O2 = posttest scores
Data collected: Ratio data; scores on a 100-point skills test
    Test Results    VRC (n=142)    Cascade (n=131)
    Pre             62             63
    Post            79             65
Data analysis: An independent t-test of statistical significance for the null hypotheses of (a) no difference in the pretest scores of the 2 groups; and (b) no difference between the posttest means of the 2 groups.

Interpreting the Data (1)
• Determining whether objectives have been achieved;
• Determining whether laws, democratic ideals, regulations, or ethical principles have been violated;
• Determining whether assessed needs have been reduced;

Interpreting the Data (2)
• Determining the value of accomplishments;
• Asking critical reference groups to review the data and to provide their judgments of successes and failures, strengths and weaknesses;
• Comparing results with those reported by similar entities or endeavors;
• Comparing assessed performance levels on critical variables to expectations of performance or standards;
• Interpreting results in light of the evaluation procedures that generated them

Interpreting the Data (3)
• Distinguish between program significance (practical significance) and statistical significance
  – Program Significance: the meaningfulness of a program regardless of statistical significance
    • e.g., 70 vs. 69 (out of 100 points)
  – Statistical Significance: determined by statistical testing
    • e.g., at the 5% type I error level

Evaluation Reporting
• Designing the Written Report
  – Abstract/executive summary
  – Introduction
  – Methods/procedures
  – Results
  – Conclusions/recommendations
• Presenting Data: Guidelines
  – Use graphic methods whenever possible
  – Build the results and discussion around tables and figures
  – Provide instructions on how to read them
• How and When to Present the Report
  – Discuss this with the decision makers involved in the evaluation
  – Choose ways to report the evaluation findings so as to meet the needs of the stakeholders, and include information that is relevant to each group

Increasing Utilization of the Results: General Guidelines
• Plan the study with program stakeholders in mind and involve them in the planning process
• Focus the evaluation on conditions about the program that the decision makers can change
• Write reports in a clear, simple manner and submit them on time
• Base the decision on whether to make recommendations on how specific and clear the data are, how much is known about the program, and whether differences between programs are obvious. A joint interpretation between evaluator and stakeholders may be best.
• Disseminate the results to all stakeholders, using a variety of methods
• Integrate evaluation findings with other research and evaluation about the program area
• Provide high-quality research
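
Worked sketch for CASE #1: the chi-square analysis named there can be computed by hand for a 2 × 2 table. The slides report only percentages (24% and 33% quit), so the counts below are an assumption: 34 employees in Group A and 36 in Group B, which reproduces those percentages after rounding for 70 employees. The function name is illustrative, and 3.841 is the standard chi-square critical value for 1 degree of freedom at the 5% level.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

            quit   did not quit
        A    a         b
        B    c         d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected.
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ASSUMED counts (not given in the source): 8 of 34 quit in Group A (about 24%),
# 12 of 36 quit in Group B (about 33%).
quit_a, not_quit_a = 8, 26
quit_b, not_quit_b = 12, 24

chi2 = chi_square_2x2(quit_a, not_quit_a, quit_b, not_quit_b)

CRITICAL_VALUE_05_DF1 = 3.841   # chi-square critical value, df = 1, alpha = 0.05
reject_null = chi2 > CRITICAL_VALUE_05_DF1

print(f"chi-square = {chi2:.3f}")
print(f"Reject the null hypothesis at the 5% level: {reject_null}")
```

Under these assumed counts the statistic falls below the critical value, so the null hypothesis of no difference between the two methods would not be rejected. The actual group sizes could lead to a different result, and a difference that is not statistically significant may still have program significance.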