AUTOPSY OF A FAILED EVALUATION
Examination Against The 2011 Program Evaluation Standards
Daniel L. Stufflebeam
9/13/11

FOCUS: The 2011 Program Evaluation Standards

THE SESSION’S PARTS ARE
1. A rationale for evaluation standards
2. A case showing the utility of The Program Evaluation Standards
3. The contents of The Standards
4. Recommendations for applying The Standards

PART 1: A RATIONALE FOR EVALUATION STANDARDS

STANDARDS FOR EVALUATIONS ARE
- Widely shared principles for guiding and judging the conduct and use of an evaluation
- Developed & approved by experts in the conduct & use of evaluation

STANDARDS FOR EVALUATIONS PROVIDE
- Principled direction
- Technical advice
- A basis for professional credibility
- A basis for evaluating evaluations
- A basis for public accountability

EVALUATORS IGNORE OR FAIL TO MEET STANDARDS TO
- Their professional peril
- The detriment of their clients

As in the famous Enron debacle, failure to meet standards may contribute to
- Lack of an impartial perspective
- Erroneous conclusions
- Unwarranted decisions
- Cover-up of findings
- Misguided decisions
- Breakdown of trust
- Organizational repercussions
- Personal losses & tragedies
- Lowered credibility for evaluators, their organizations, & the evaluation profession
- Increased government controls

PART 2: A CASE
A UNIVERSITY’S REVIEW OF ITS GRADUATE PROGRAMS
(A university group should have followed evaluation standards but didn’t.)

CONTEXT WAS PROBLEMATIC
- Board had voted confidence in the president (12/06)
- Faculty gave president & provost low ratings (2/07)
- Enrollment was declining
- The university faced a fiscal crisis
- Review focused on resource allocation
- Morale was low

REVIEW’S STATED PURPOSES
- Address a fiscal crisis over the university’s inability to support all of its programs & maintain excellence
- Determine which programs are highest strategic priorities based on quality
- Identify programs for increased funds

SCOPE OF THE REVIEW
- To be completed within 1 year
- All master’s and doctoral programs
- Launched on 7/19/06
- 114 programs were reviewed

THE REVIEW’S PLAN
- Keyed to Dickeson book (chapter 5)
- Data book
- Program’s report
- Dean’s report
- Review team’s report
- Appeals of review team’s report
- Provost’s final report
- Board’s decisions
- No update of U. mission
- No appeals of provost’s conclusions
- No adoption of standards for reviews
- Minimal participation of outside evaluators
- No external metaevaluation or peer review of the review

GENERAL REVIEW CRITERIA
- External demand
- Quality of student & program outcomes
- Quality of program administration & planning
- Program size, scope, & productivity
- Program impact, justification, & essentiality
- Opportunity analysis
- Compelling program factor (features that make it unique & excellent)

DEFINITION OF SUB-CRITERIA
- Sub-criteria were many
- Evolved throughout the review
- Caused confusion & controversy

CRITERIA OMITTED FROM DICKESON’S LIST
- History, development, & expectations of the program
- Internal demand for the program
- Quality of program inputs & processes
- Revenue & other resources generated
- Program costs & associated costs

EVALUATION PROCEDURES
- Program’s self-report
- Document & data book review
- Group & individual interviews
- Variable protocols for ratings (1-5)
- Training of review team leaders
- Rating of each program by department, dean, review team, & provost
- Synthesis by provost & staff (a hypothetical sketch of such a synthesis follows this list)
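The final report never showed how the four sets of 1-5 ratings were combined into decisions. Below is a minimal sketch of what a documented, reproducible synthesis could have looked like. The criterion keys, the equal weighting of rater groups, and the simple averaging are all assumptions made for illustration; the review’s actual protocols were variable and unreported.

```python
# Hypothetical synthesis of 1-5 program ratings; illustrative only.
# The review's actual rating protocols were variable and undocumented,
# so the criteria, rater groups, and averaging rule here are assumed.
from statistics import mean

CRITERIA = [
    "external_demand", "outcome_quality", "administration_planning",
    "size_scope_productivity", "impact_justification",
    "opportunity", "compelling_factor",
]
RATER_GROUPS = ["department", "dean", "review_team", "provost"]

def synthesize(ratings):
    """ratings maps rater group -> criterion -> score on the 1-5 scale.

    Returns the mean rating per criterion and an unweighted overall
    mean: the kind of per-criterion and overall result the final
    report never disclosed.
    """
    per_criterion = {
        c: mean(ratings[g][c] for g in RATER_GROUPS) for c in CRITERIA
    }
    overall = round(mean(per_criterion.values()), 2)
    return {"per_criterion": per_criterion, "overall": overall}
```

Publishing a synthesis like this, together with the underlying ratings, would have given stakeholders each program’s rating on each criterion and overall, the very evidence the report omitted.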
REVIEW PERSONNEL
- Essentially internal
- Provost was both primary decision maker & de facto lead evaluator
- Provost’s staff assisted the process
- A program representative wrote the program’s report & sent it to department faculty, dean, & review team
- Faculty input varied across programs
- The dean rated the college’s programs & sent reports to the department chairs & review team (not in original plan)

REVIEW PERSONNEL (continued)
- Seven 7-person review teams rated designated programs & on the same day emailed all reports to the provost & to pertinent deans & department chairs
- Review team members were mostly from outside the program’s college
- Provost met with deans before finalizing decisions
- Provost met with team leaders before releasing the final report
- An internal evaluation expert assisted

FINAL REPORT
- Issued on May 11, 2007
- Gave priorities for funding in each college
- Announced plans to maintain 56 programs, increase 16, merge 6, maintain or merge 17 subject to review, transfer 8, close 26, & create 6 new degrees

FINAL REPORT (continued)
- Gave no evidentiary basis for decisions
- Referenced no technical appendix
- Referenced no accessible files of supporting data, analyses, & data collection tools
- Gave no rating of each program on each criterion & overall

OUTCOMES
- Local paper applauded the report (5/12/07)
- Review evidence & its link to conclusions were inaccessible to many interested parties
- Professors, alumni, & others protested
- President announced an appeal process (5/18/07)
- Faculty voted to call for a censure of the provost (5/18/07)
- Provost resigned (5/20/07)
- Appeals overturned 10 planned cuts (7/14/07)

OUTCOMES (continued)
- Potential savings from cuts were reduced
- Community watched a contentious process
- Board fired the president (8/15/07)
- President threatened to sue
- Board awarded the ex-president $530,000 in severance pay (10/27/07)
- Projected review of undergraduate programs was canceled, ceding that area priority by default
- Reviews were scheduled to resume in 2010

CLEARLY, THIS EVALUATION FAILED
No standards were required to reach this conclusion. However, adherence to approved standards might have prevented the review’s failure.

MY TAKE: ON THE PLUS SIDE
- The review was keyed to an important need to restructure programs.
- There was significant faculty involvement in studying programs.
- General criteria were established.

HOWEVER, THERE WERE SERIOUS DEFICIENCIES
- No independent perspectives
- Top evaluator & decision maker were the same person
- Evidence to support conclusions was not reported
- Political viability was not maintained
- Evidence disappeared
- No independent evaluation of the review

PART 3: THE PROGRAM EVALUATION STANDARDS

FOR A MORE SYSTEMATIC EXAMINATION OF THE CASE
- Let’s see if use of The Program Evaluation Standards might have helped ensure the study’s success.
- Let’s also use the case to develop a working knowledge of The Program Evaluation Standards.

FIRST, SOME BACKGROUND INFORMATION

THE JOINT COMMITTEE ON STANDARDS FOR EDUCATIONAL EVALUATION
- Developed The Program Evaluation Standards
- Includes evaluation users and experts
- Was sponsored by 17 professional societies

THE SPONSORS REPRESENTED
- Accreditation officials
- Administrators
- Curriculum specialists
- Counselors
- Evaluators
- Rural education
- Measurement specialists
- Policymakers
- Psychologists
- Researchers
- Teachers
- Higher education

THE PROGRAM EVALUATION STANDARDS
- Are accredited by the American National Standards Institute as an American National Standard
- Include 30 specific standards

NOW, LET’S LOOK AT THE CONTENTS OF THE STANDARDS & DISCUSS THEIR RELEVANCE TO THE PROGRAM REVIEW CASE

THE 30 STANDARDS ARE ORGANIZED AROUND 5 ATTRIBUTES OF A SOUND EVALUATION
- Utility
- Feasibility
- Propriety
- Accuracy
- Evaluation Accountability

EACH STANDARD INCLUDES CONSIDERABLE DETAIL
- Label
- Summary statement
- Definitions
- Rationale
- Guidelines
- Common errors to avoid
- Illustrative case
(A simple data model of these parts is sketched after this list.)
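For readers who find a code view clarifying, here is a minimal, hypothetical data model of the parts each published standard contains. The class and field names are invented for illustration; only the U1 summary statement quoted in the example is taken verbatim from the standards as presented in this session.

```python
# Illustrative data model of one published standard's parts.
# The field names are hypothetical; the parts mirror the list above.
from dataclasses import dataclass, field

@dataclass
class Standard:
    code: str                    # e.g., "U1"
    label: str                   # e.g., "Evaluator Credibility"
    statement: str               # the summary statement
    definitions: list = field(default_factory=list)
    rationale: str = ""
    guidelines: list = field(default_factory=list)
    common_errors: list = field(default_factory=list)
    illustrative_case: str = ""

u1 = Standard(
    code="U1",
    label="Evaluator Credibility",
    statement="Evaluations should be conducted by qualified people who "
              "establish and maintain credibility in the evaluation context.",
)
```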
CAVEAT
Time permits us to deal with the 30 standards only at a general level. You can benefit most by studying the full text of the standards.

THE UTILITY STANDARDS
Require evaluations to be
- Informative
- Timely
- Influential
- Grounded in explicit values
Are intended to ensure an evaluation
- Is aligned with stakeholder needs
- Enables process and findings uses and other appropriate influence

LABELS FOR THE UTILITY STANDARDS ARE
- U1 Evaluator Credibility
- U2 Attention to Stakeholders
- U3 Negotiated Purposes
- U4 Explicit Values
- U5 Relevant Information
- U6 Meaningful Processes and Products
- U7 Timely and Appropriate Communicating and Reporting
- U8 Concern for Consequences and Influence

Let’s consider some of the specific Utility standards.

THE U1 EVALUATOR CREDIBILITY STANDARD STATES:
Evaluations should be conducted by qualified people who establish and maintain credibility in the evaluation context.
How well did the program review meet this standard?

THE U2 ATTENTION TO STAKEHOLDERS STANDARD STATES:
Evaluations should devote attention to the full range of individuals and groups invested in the program and affected by its evaluation.
How well did the program review meet this standard?

THE U4 EXPLICIT VALUES STANDARD STATES:
Evaluations should clarify and specify the individual and cultural values underpinning purposes, processes, and judgments.
How well did the program review address this standard?

THE U8 CONCERN FOR CONSEQUENCES AND INFLUENCE STANDARD STATES:
Evaluations should promote responsible and adaptive use while guarding against unintended negative consequences and misuse.
Did the program review case meet this standard?

OVERALL, BASED ON THIS SAMPLING OF UTILITY STANDARDS
Did the program review pass or fail the requirement for utility? Why or why not?

DID FAILURE TO MEET ANY OF THESE UTILITY STANDARDS CONSTITUTE A FATAL FLAW?
- If yes, which failed standard(s) constituted a fatal flaw?
- What could the provost have done to ensure that the review passed the Utility requirements?

NOW, LET’S CONSIDER THE FEASIBILITY STANDARDS

THE FEASIBILITY STANDARDS
Are intended to ensure that an evaluation is
- Economically and politically viable
- Realistic
- Contextually sensitive
- Responsive
- Prudent
- Diplomatic
- Efficient
- Cost effective

LABELS FOR THE FEASIBILITY STANDARDS ARE
- F1 Project Management
- F2 Practical Procedures
- F3 Contextual Viability
- F4 Resource Use

LET’S CONSIDER THE F2 FEASIBILITY STANDARD.

THE F2 PRACTICAL PROCEDURES STANDARD STATES:
The procedures should be practical and responsive to the way the program operates.
Did the program review employ workable, responsive procedures?

OVERALL, BASED ON THE FEASIBILITY STANDARDS
Did the program review pass or fail the requirement for feasibility? Why or why not?

NOW, LET’S CONSIDER THE PROPRIETY STANDARDS

LABELS FOR THE PROPRIETY STANDARDS ARE
- P1 Responsive and Inclusive Orientation
- P2 Formal Agreements
- P3 Human Rights and Respect
- P4 Clarity and Fairness
- P5 Transparency and Disclosure
- P6 Conflicts of Interest
- P7 Fiscal Responsibility

THREE PROPRIETY STANDARDS WERE ESPECIALLY RELEVANT TO THIS CASE
- P2 Formal Agreements
- P5 Transparency and Disclosure
- P6 Conflicts of Interest

THE P2 FORMAL AGREEMENTS STANDARD STATES:
Evaluation agreements should be negotiated to make obligations explicit and take into account the needs, expectations, and cultural contexts of clients and other stakeholders.
Was this standard not met because of a flawed agreement or a failure to honor the agreement?

THE P5 TRANSPARENCY AND DISCLOSURE STANDARD STATES:
Evaluations should provide complete descriptions of findings, limitations, and conclusions to all stakeholders unless doing so would violate legal or propriety obligations.
Why do you think the provost withheld the supporting evidence?

THE P6 CONFLICTS OF INTEREST STANDARD STATES:
Evaluators should openly and honestly identify and address real or perceived conflicts of interest that may compromise the evaluation.
Was this study an evaluation by pretext?

OVERALL, BASED ON THE PROPRIETY STANDARDS
Did the program review pass or fail the requirement for propriety? Why or why not?

DID FAILURE TO MEET ANY OF THESE PROPRIETY STANDARDS CONSTITUTE A FATAL FLAW?
- If yes, which failed standard(s) constituted a fatal flaw?
- How could the provost have acted to ensure that the review passed the Propriety requirements?

NOW, LET’S CONSIDER THE ACCURACY STANDARDS

THE ACCURACY STANDARDS
Are intended to ensure that an evaluation
- Employs sound theory, designs, methods, and reasoning
- Minimizes inconsistencies, distortions, and misconceptions
- Produces and reports truthful evaluation findings and conclusions

LABELS FOR THE ACCURACY STANDARDS ARE
- A1 Justified Conclusions and Decisions
- A2 Valid Information
- A3 Reliable Information
- A4 Explicit Program and Context Descriptions
- A5 Information Management
- A6 Sound Design and Analyses
- A7 Explicit Evaluation Reasoning
- A8 Communicating and Reporting

THREE ACCURACY STANDARDS WERE ESPECIALLY RELEVANT TO THIS CASE
- A1 Justified Conclusions and Decisions
- A2 Valid Information
- A8 Communicating and Reporting

THE A1 JUSTIFIED CONCLUSIONS AND DECISIONS STANDARD STATES:
Evaluation conclusions and decisions should be explicitly justified in the cultures and contexts where they have consequences.
Was failure to meet this standard possibly the review’s most serious flaw?

THE A2 VALID INFORMATION STANDARD STATES:
Evaluation information should serve the intended purposes and support valid interpretations.
Why did the program review fail to meet this standard?

THE A8 COMMUNICATING AND REPORTING STANDARD STATES:
Evaluation communications should have adequate scope and guard against misconceptions, biases, distortions, and errors.
Did the review break down in relation to this standard?

OVERALL, BASED ON THE ACCURACY STANDARDS
Did the program review pass or fail the requirement for accuracy? Why or why not?

DID FAILURE TO MEET ANY OF THESE ACCURACY STANDARDS CONSTITUTE A FATAL FLAW?
- If yes, which failed standard(s) constituted a fatal flaw?
- How could the provost have acted to ensure that the review passed the Accuracy requirements?

NOW, LET’S CONSIDER THE EVALUATION ACCOUNTABILITY STANDARDS

LABELS FOR THE EVALUATION ACCOUNTABILITY STANDARDS ARE
- E1 Evaluation Documentation
- E2 Internal Metaevaluation
- E3 External Metaevaluation

This “new” Evaluation Accountability category responds to
- Poor documentation in many program evaluations
- Little use of standards to plan and guide evaluations
- Lack of independent, publicly reported evaluations of evaluations

THE E1 EVALUATION DOCUMENTATION STANDARD STATES:
Evaluations should fully document their negotiated purposes and implemented designs, procedures, data, and outcomes.
In what respects did the program review not meet this standard?

THE E2 INTERNAL METAEVALUATION STANDARD STATES:
Evaluations should use these and other applicable standards to examine the accountability of the evaluation design, procedures employed, information collected, and outcomes.
In what respects did the program review address or not address this standard?

THE E3 EXTERNAL METAEVALUATION STANDARD STATES:
Program evaluation sponsors, clients, evaluators, and other stakeholders should encourage the conduct of external metaevaluations using these and other applicable standards.
- Who could have and should have commissioned and funded an external metaevaluation?
- What difference could this have made?

QUESTIONS RE PART 3
- Could application of The Program Evaluation Standards have prevented failure of the review? Why or why not?
- To help ensure the review’s success, how would you have applied the standards?

What are your questions?

10-minute break

PART 4: RECOMMENDATIONS FOR APPLYING THE PROGRAM EVALUATION STANDARDS

1. REMEMBER THE BASICS OF METAEVALUATION
- The most important purpose of evaluation or metaevaluation is not only to PROVE but to IMPROVE.
- Formative metaevaluations are needed to improve ongoing evaluation work.
- Summative metaevaluations are needed to prove the evaluation’s merit.

2. TRAIN EVALUATION STAFF IN THE CONTENT & APPLICATION OF THE STANDARDS.

3. USE CHECKLISTS TO GUIDE METAEVALUATIONS
- Visit www.wmich.edu/evalctr/checklists
- Select & apply checklists for evaluation design, contracting, budgeting, & metaevaluation (a hypothetical checklist-driven summary is sketched below)
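To make the checklist idea concrete, here is a minimal sketch of a checklist-driven metaevaluation summary. The attribute groupings and standard codes are those of the 2011 Program Evaluation Standards as listed in this session; the simple met/not-met scoring rule and the example judgments are assumptions for illustration and are not taken from the Evaluation Center’s actual checklists.

```python
# Hypothetical checklist-driven metaevaluation summary.
# Attribute groupings and codes follow the 2011 standards; the
# met/not-met scoring rule below is an illustrative simplification.
ATTRIBUTES = {
    "Utility": ["U1", "U2", "U3", "U4", "U5", "U6", "U7", "U8"],
    "Feasibility": ["F1", "F2", "F3", "F4"],
    "Propriety": ["P1", "P2", "P3", "P4", "P5", "P6", "P7"],
    "Accuracy": ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"],
    "Evaluation Accountability": ["E1", "E2", "E3"],
}

def attribute_summary(judgments):
    """judgments maps a standard code to True (met) or False (not met).

    Codes absent from judgments count as not met here; a fuller
    checklist would track "not assessed" as a separate category.
    """
    summary = {}
    for attribute, codes in ATTRIBUTES.items():
        met = sum(judgments.get(code, False) for code in codes)
        summary[attribute] = f"{met}/{len(codes)} standards met"
    return summary

# Illustrative judgments only, loosely consistent with the strengths
# and deficiencies this session identifies in the case; not an
# official rating of the review.
case_judgments = {
    "U2": True,   # significant faculty involvement was a plus
    "U1": False, "P5": False, "P6": False,
    "A1": False, "E1": False, "E3": False,
}
print(attribute_summary(case_judgments))
```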
4. ADVISE YOUR SPONSORS TO FUND INDEPENDENT SUMMATIVE METAEVALUATIONS OF YOUR EVALUATIONS.

5. SUPPORT RELEASE OF SUMMATIVE METAEVALUATION FINDINGS TO ALL RIGHT-TO-KNOW AUDIENCES.

IN CONCLUSION
- Standards have a crucial role in assuring the quality, credibility, & value of evaluations.
- Unprincipled evaluations can have dire consequences.
- The Program Evaluation Standards are valuable for guiding & judging evaluations.
- Checklists are available to guide application of The Standards.

THE PROGRAM EVALUATION STANDARDS
Joint Committee on Standards for Educational Evaluation (2011). The Program Evaluation Standards. Thousand Oaks, CA: Sage. ISBN 978-1-4129-8908-4 (paperback).

Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation Theory, Models, & Applications. San Francisco: Jossey-Bass. See Chapter 3: Standards for Program Evaluations. ISBN-13: 978-0-7879-7765-8 (cloth); ISBN-10: 0-7879-7765-9 (cloth).

WESTERN MICHIGAN UNIVERSITY’S EVALUATION CENTER WEB SITE
www.wmich.edu/evalctr

DICKESON BOOK
Dickeson, Robert C. (1999). Prioritizing Academic Programs and Services: Reallocating Resources to Achieve Strategic Balance. San Francisco: Jossey-Bass.