DOES TEACHER EVALUATION IMPROVE SCHOOLS AND STUDENT LEARNING?

Research Report, February 2018
©Stronge and Associates, 2018 All Rights Reserved

What Does Research Say

Teacher effectiveness has proven time after time to be the most influential school-related factor in student achievement. If teacher quality is the pillar of the success of education, then it logically follows that a robust teacher evaluation system should be in place, since the purpose of evaluation is to recognize and develop good teaching. Stronge and Tucker (2003) stated:

    Without capable, high quality teachers in America’s classrooms, no educational reform effort can possibly succeed. Without high quality evaluation systems, we cannot know if we have high quality teachers. Thus, a well-designed and properly implemented teacher evaluation system is essential in the delivery of effective educational programs and in school improvement. (p. 3)

There are two main purposes of teacher evaluation: 1) a formative purpose to inform and stimulate teachers’ professional development; and 2) a summative purpose to hold teachers accountable for their performance. Evaluation is a tool, not the outcome; it serves as a systematic tool that enables data-driven personnel and student improvement decisions.

Most recently, there has been a national imperative to reform teacher evaluation systems, spurred by federal policy initiatives, state statutory and policy decisions, and local policy. The new teacher evaluation systems typically use multiple data sources (e.g., observation, student achievement data, student surveys, and teacher portfolios) to evaluate both the process and outcomes of teaching. By 2015, all 50 states and the District of Columbia had policies for performance-based teacher evaluation, and 43 of them mandated the incorporation of student achievement data in these evaluations (Marsh et al., 2017).

Potential Positive Impact

On the positive side, performance-based and multiple-measure teacher evaluation offers numerous benefits, such as increasing the accuracy and objectivity of the evaluation, better differentiating high- and low-performing teachers, identifying areas of strengths and weaknesses, and providing more meaningful, specific feedback about teacher practice (Delvaux et al., 2013; Hill, Kapitula, & Umland, 2011). An effective teacher evaluation system propels individuals to work more effectively, efficiently, and persistently, especially when they believe their performance is gauged against standards that are fair and objective. For instance, Taylor and Tyler (2012) found that participating in teacher evaluation can increase teacher performance by 0.11 standard deviations, which is equivalent to an improvement of 4.5 percentile points, compared with not participating in evaluation (Figure 1).

Figure 1. Impact of Evaluation (Taylor & Tyler, 2012). [Figure compares the average teacher’s student scores in the years before and the years after the teacher has undergone an evaluation.]

Extant literature has primarily focused on the validity and reliability of evaluation, but less attention has been given to the efficacy of evaluation systems in improving teacher practices and student learning. This brief report aims to summarize research evidence about whether reformed teacher evaluation has improved schools or student learning. The findings are quite mixed, revealing that the effectiveness of teacher evaluation is contingent upon a number of contextual factors.

Steinberg and Sartain (2015a) also found that reformed teacher evaluation systems characterized by quality classroom observations and conferences with teachers can increase student achievement by 5.4 percent of a standard deviation in math and 9.9 percent of a standard deviation in reading (Figure 2).

Figure 2: Student Achievement Boost by Teacher’s Participation in Evaluation (Steinberg & Sartain, 2015a). [Change in scores, as a percentage of a standard deviation: Reading 9.9; Math 5.4.]

Evaluation also can improve the composition of the teacher workforce. It is estimated that dismissing and replacing teachers who fall in the bottom 6 to 10 percent of the value-added distribution would improve student achievement by 50 percent of a standard deviation (Hanushek, 2008). Research also suggests that performance-based evaluation increases the voluntary attrition of low-performing teachers and improves the effectiveness of teachers who remain. For instance, Dee and Wyckoff (2015) examined a controversial teacher evaluation system introduced in the District of Columbia Public Schools by then-Chancellor Michelle Rhee. The study examined the retention, turnover, and performance outcomes of low-performing teachers whose ratings placed them near the threshold that implied a strong dismissal threat. The findings indicated that the dismissal threat increased the voluntary attrition of low-performing teachers by 11 percentage points and improved the performance of low-performing teachers who remained by 0.27 of a standard deviation (Figure 3).

Figure 3: Impact of Evaluation and Dismissal (Dee & Wyckoff, 2015). [Dismissal threats increased the voluntary attrition of low-performing teachers by 11 percentage points and improved the performance of low-performing teachers who remained by 0.27 of a standard deviation.]

Similarly, Adnot and colleagues (2016) evaluated the effects of teacher turnover caused by evaluation in the District of Columbia Public Schools (DCPS). Specifically, they examined teachers who received an “ineffective” rating in evaluation, turned over, and were replaced by new hires. Fifty-five percent of the replacement teachers came from outside the DCPS system and 45 percent were transferred from within the DCPS schools. The researchers compared student achievement gain differences between entering and exiting teachers. They found that when low-performing teachers were induced to leave for poor performance, student academic achievement improved by 6 percentile points (0.14 SD) in reading and 8 percentile points (0.21 SD) in math (Figure 4; Adnot, Dee, Katz, & Wyckoff, 2016). This study provides evidence that rigorous evaluation can enhance the dismissal of the least effective teachers and inform the efforts to retain effective teachers.

Figure 4: Dismissal of Ineffective Teachers and Student Achievement Gains (Adnot, Dee, Katz, & Wyckoff, 2016)

Jiang et al. (2014) investigated the impact of Chicago’s teacher evaluation reform on student learning. They found:
• At the end of the first year of implementing the reformed teacher evaluation system, schools improved student achievement in reading by 0.10 standard deviations.
• More advantaged schools (i.e., schools that were high achieving prior to implementation or schools with lower rates of student poverty) tended to benefit the most from the reform in teacher evaluation.
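
Several of the findings above are stated both in standard deviations and in percentile points (0.11 SD as 4.5 points; 0.14 SD as 6 points; 0.21 SD as 8 points). As a rough illustration of how such conversions are commonly made, the sketch below assumes normally distributed scores and a student who starts at the 50th percentile. Neither assumption is stated by the cited studies, and the function name `sd_to_percentile_gain` is hypothetical.

```python
from statistics import NormalDist


def sd_to_percentile_gain(effect_sd: float, start_percentile: float = 50.0) -> float:
    """Return the percentile-point gain for a student who moves up by
    `effect_sd` standard deviations from `start_percentile`.

    Assumes scores follow a normal distribution (an illustrative assumption,
    not a claim about the cited studies). `start_percentile` must lie
    strictly between 0 and 100.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = std_normal.inv_cdf(start_percentile / 100.0)
    end_percentile = 100.0 * std_normal.cdf(start_z + effect_sd)
    return end_percentile - start_percentile


# Effect sizes quoted in the report, in standard deviations.
reported_effects = {
    "Taylor & Tyler (2012), teacher performance": 0.11,
    "Adnot et al. (2016), reading": 0.14,
    "Adnot et al. (2016), math": 0.21,
}

for label, effect in reported_effects.items():
    gain = sd_to_percentile_gain(effect)
    print(f"{label}: {effect:.2f} SD is roughly {gain:.1f} percentile points")
```

Under these assumptions the computed gains land close to the rounded figures quoted in the report. The same effect size corresponds to a smaller percentile gain for a student who starts far from the median, so percentile equivalents are best read as describing a typical, mid-distribution student.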

Traditional teacher evaluation tended to rely on a single measure of teacher performance, which typically was a perfunctory classroom observation. Since any measure of teacher effectiveness is fallible and not definitive, classroom observation also is susceptible to unreliability, instability, and bias (e.g., Hill et al., 2012). Thus, using a balanced combination of multiple measures may create a more valid composite of teacher performance (Braun, 2015; Measures of Effective Teaching, 2013). In many aspects, the reforms have improved the practices in teacher evaluation. Research in the field has reported the following positive findings regarding performance-based teacher evaluation systems that use multiple data sources to make evaluation decisions:
• Performance-based teacher evaluation is more conducive to the improvement of teaching than traditional drive-by evaluations, and it contributes to a better professional atmosphere in schools (Toch & Rothman, 2008).
• Teachers report that the structured and standards-based classroom observations provide useful feedback. They believe their evaluator is fair and able to accurately assess their instruction, and the overall objectivity of evaluation is strengthened (Jiang, Sporte, & Luppescu, 2015; Kimball, 2002).
• Performance-based evaluation contributes to a common dialogue about quality instruction across the school during evaluation interactions (Taylor & Tyler, 2011).
• Teachers perceive that this type of teacher evaluation provides more comprehensive, specific, and clear expectations for performance compared with conventional teacher evaluation systems, and they are positive about providing input into the evaluation process (Steinberg & Sartain, 2015b).
• When teachers perceive that the evaluation standards are clear and relevant to good teaching, they also perceive that the teacher evaluation is effective and fair, and as a result, they have a higher level of organizational commitment. Teachers also tend to have decreased perceptions of “role ambiguity” (uncertainty about what the occupant of a particular position is supposed to do) and increased perceptions of “effort performance-rating linkage” (the extent to which people perceive there is a clear and direct relationship between their work effort/performance and evaluation of their performance) (Conley, Muncy, & You, 2005).
• Teachers usually accept the performance standards, procedures, and outcomes. They tend to accept the standards as consistent with their view of teaching. Additionally, teachers have confirmed that performance-based teacher evaluation serves the purposes of 1) increasing accountability of teaching and 2) helping teachers improve professionally (Kimball, 2002; Milanowski & Heneman, 2001; Taylor & Tyler, 2012).
• Teachers report that the evaluation process leads them to engage in more reflection, align their teaching more closely with the performance standards, become more organized, and improve their lesson planning and classroom management skills (Heneman & Milanowski, 2003).
• Performance-based teacher evaluation systems have been found to have a substantial degree of criterion validity. Teachers with higher scores on standards-based evaluations produce more student learning gains than teachers with lower evaluation scores (Xu, Grant, & Ward, 2016).

Potential Negative Impact

Teacher evaluation that is thoughtfully implemented can translate into improved student performance.
Nevertheless, despite the positive results that can accrue from well-designed and implemented evaluation systems, consternation over the inclusion of student performance has caused negative publicity, which has sometimes overshadowed the benefits. The inclusion of any direct measure of student performance may increase pressure on teachers to teach to the test, reduce instructional depth, and foster instruction targeted primarily toward students whose test scores are likely to improve. This, in turn, may cause teachers to avoid serving students and schools that are socio-economically disadvantaged, and probably also high-performing students whose scores approach the ceiling of achievement assessments (Menken, 2006; von der Embse et al., 2016a, 2016b). Other criticisms related to using student achievement data in teacher evaluation include:
• Students’ learning ability, home and peer influence, motivation, and other influences are powerful in affecting achievement. It is challenging to disentangle a teacher’s impact from the influence of pre-existing student differences. Value-added models typically measure correlation, not causation. Consequently, achievement data cannot answer with precision the degree to which student learning is attributable to students, teachers, or other factors (Darling-Hammond, Amrein-Beardsley, Haertel, & Rothstein, 2012; Morganstein & Wasserstein, 2014).
• The quality of student achievement data is uncertain. In order for a teacher to be accurately evaluated on the basis of his or her students’ academic achievement, it is crucial that the student performance assessment being used is of high quality. Student performance measures must be valid, reliable, useful for diagnosis, stretchy enough to allow growth for both low- and high-performing learners, equitable, and comparable (Koedel & Betts, 2009). Furthermore, standardized achievement tests are unlikely to reflect the full range of instructional goals in their subject areas. Using these types of tests to evaluate teachers may actually encourage them to teach down, focusing more on the lower level skills being measured (Toch & Rothman, 2008).
• Value-added scores may provide teachers and administrators with information on their students’ performance and identify areas where improvement is needed; however, they do not provide information on how to improve the actual teaching. In addition, teachers’ value-added scores can change drastically when a different model or test is used (Morganstein & Wasserstein, 2014; Sass, Semykina, & Harris, 2014).
• Teachers perceive that the inclusion of student achievement data lacks clarity and transparency. There is confusion and misinformation about how much student growth contributes to their overall evaluation score as well as how value-added models control for outside influences such as mobility and poverty (Jiang, Sporte, & Luppescu, 2015).
• The integration of student achievement into evaluation is associated with teacher stress and job dissatisfaction (Jiang, Sporte, & Luppescu, 2015). Also, educator collaboration is decreasing while competition is increasing, since teachers are held accountable for the learning of students and do not want to release their students to the care of other professionals (Hewitt, 2015).
• Scholars and practitioners also have commented on the challenge of reconciling the conflicting purposes of evaluation: the summative, accountability-based standpoint and the formative, development-oriented standpoint (Hallinger, Heck, & Murphy, 2014).

Ways to Improve the Positive Impact of Teacher Evaluation

Research indicates that teacher evaluation has an impact on teacher learning and professional development, but this impact is influenced by a number of factors (Delvaux et al., 2013; Tuytens & Devos, 2011).
In other words, school-level implementation and organizational context play important roles in the success of teacher evaluation. The influencers include:
• Utility of the feedback;
• A positive attitude of the principal toward the evaluation system;
• Clarity of the criteria and purposes of the evaluation system;
• Credibility of the evaluator; and
• Instructional leadership of the principal.

Steinberg and Sartain (2015a) also found that the effectiveness of teacher evaluation depends on a number of factors, such as 1) principals’ capacity to provide targeted instructional guidance, 2) teachers’ ability to respond to the instructional feedback in a manner that improves student achievement, and 3) the extent of district-level support and training for principals who are primarily responsible for implementing the new system.

Research has consistently found that teachers’ perceptions of evaluation are associated with their perceptions of leadership and professional community (Jiang, Sporte, & Luppescu, 2015; Marsh et al., 2017). Principal instructional leadership, the trust between the principal and teachers, and structures for frequent collaboration are positively correlated with teachers’ perceptions of evaluation and feedback. Also, when a strong professional community is available, teachers tend to have more positive views of evaluation.

To make any evaluation effective, it is crucial that principals have expertise in conducting quality classroom observations and conferences with teachers, and in engaging teachers in looking intensely at their weaknesses and strengths.
It is also important that a principal’s role evolves from pure evaluation to a dual role in which, by incorporating instructional coaching, the principal serves as both evaluator and formative assessor of a teacher’s instructional practice (Steinberg & Sartain, 2015a).

To summarize, teacher evaluation, designed properly and implemented with fidelity, is an important lever for teacher and school improvement. But if it is not implemented appropriately, it may fall short of its potential benefits. Reformed teacher evaluation systems have many advantages over the traditional evaluation exemplified by a one-shot superficial classroom visit and checklist marking. However, teacher evaluation is still not a silver bullet for resolving all the problems of teaching and learning in schools, and the effectiveness of evaluation itself is affected by many contextual factors. School leaders frequently are still confronted with the imbalance between the formative and summative purposes of evaluation. Researchers and practitioners need to explore innovative ways to further increase the efficacy of evaluation in improving professional development and student learning.

References

Adnot, M., Dee, T., Katz, V., & Wyckoff, J. (2016). Teacher turnover, teacher quality, and student achievement in DCPS (CEPA Working Paper No. 16-03). Available at http://cepa.stanford.edu/wp16-03
Braun, H. (2015). The value in value added depends on the ecology. Educational Researcher, 44(2), 127-131.
Conley, S., Muncy, D. E., & You, S. (2005). Standards-based evaluation and teacher career satisfaction: A structural equation modeling analysis. Journal of Personnel Evaluation in Education, 18, 39-65.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8-15.
Dee, T. S., & Wyckoff, J. (2015). Incentives, selection, and teacher performance: Evidence from IMPACT. Journal of Policy Analysis and Management, 34(2), 267-297.
Delvaux, E., Vanhoof, J., Tuytens, M., Vekeman, E., Devos, G., & Van Petegem, P. (2013). How may teacher evaluation have an impact on professional development? A multilevel analysis. Teaching and Teacher Education, 36, 1-11.
Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis of the evidence. Educational Assessment, Evaluation, and Accountability, 26, 5-28.
Hanushek, E. A. (2008, May). Teacher deselection. Available at http://www.stanfordalumni.org/leadingmatters/san_francisco/documents/Teacher_DeselectionHanushek.pdf
Heneman, H. G., III, & Milanowski, A. T. (2003). Continuing assessment of teacher reaction to a standards-based teacher evaluation system. Journal of Personnel Evaluation in Education, 17(2), 173-195.
Hewitt, K. K. (2015). Educator evaluation policy that incorporates EVAAS value-added measures: Undermined intentions and exacerbated inequities. Education Policy Analysis Archives, 23(76). Available at http://epaa.asu.edu/ojs/article/view/1968
Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794-831.
Jiang, J. Y., Sartain, L., Sporte, S. E., & Steinberg, M. P. (2014). The impact of teacher evaluation reform on student learning: Success and challenge in replicating experimental findings with non-experimental data. Society for Research on Educational Effectiveness.
Jiang, J. Y., Sporte, S. E., & Luppescu, S. (2015). Teacher perspectives on evaluation reform: Chicago’s REACH students. Educational Researcher, 44(2), 105-116.
Kimball, S. M. (2002). Analysis of feedback, enabling conditions and fairness perceptions of teachers in three school districts with new standards-based evaluation system. Journal of Personnel Evaluation in Education, 16(4), 241-268.
Koedel, C., & Betts, J. R. (2009). Does student sorting invalidate value-added models of teacher effectiveness? An extended analysis of the Rothstein critique. Nashville, TN: National Center on Performance Incentives.
Marsh, J. A., Bush-Mecenas, S., Strunk, K. O., Lincove, J. A., & Huguet, A. (2017). Evaluating teachers in the Big Easy: How organizational context shapes policy response in New Orleans. Educational Evaluation and Policy Analysis.
Measures of Effective Teaching. (2013). Ensuring fair and reliable measures of effective teaching: Culminating findings from the MET Project’s three-year study [Policy and practice brief]. Seattle, WA: Bill & Melinda Gates Foundation.
Menken, K. (2006). Teaching to the test: How No Child Left Behind impacts language policy, curriculum, and instruction for English language learners. Bilingual Research Journal, 30(2), 521-546.
Milanowski, A. T., & Heneman, H. G., III. (2001). Assessment of teacher reactions to a standards-based evaluation system: A pilot study. Journal of Personnel Evaluation in Education, 15(3), 193-212.
Morganstein, D., & Wasserstein, R. (2014). ASA statement on value-added models. Statistics and Public Policy, 1(1), 108-110.
Sass, T. R., Semykina, A., & Harris, D. N. (2014). Value-added models and the measurement of teacher productivity. Economics of Education Review, 38, 9-23.
Steinberg, M. P., & Sartain, L. (2015a). Does better observation make better teachers? New evidence from a teacher evaluation pilot in Chicago. EducationNext, 15(1). Available at http://educationnext.org/better-observation-make-better-teachers/
Steinberg, M. P., & Sartain, L. (2015b). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching Project. Education Finance and Policy, 10(4), 535-572.
Stronge, J. H., & Tucker, P. D. (2003). Handbook on teacher evaluation: Assessing and improving performance. Larchmont, NY: Eye on Education.
Taylor, E. S., & Tyler, J. H. (2011). The effect of evaluation on performance: Evidence from longitudinal student achievement data of mid-career teachers. Cambridge, MA: National Bureau of Economic Research.
Taylor, E. S., & Tyler, J. H. (2012). Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of midcareer teachers. EducationNext, 12(4). Available at http://educationnext.org/can-teacher-evaluation-improve-teaching/
Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Washington, DC: Education Sector.
Tuytens, M., & Devos, G. (2011). Stimulating professional learning through teacher evaluation: An impossible task for the school leader? Teaching and Teacher Education, 27(5), 891-899.
von der Embse, N. P., Pendergast, L. L., Segool, N., Saeki, E., & Ryan, S. (2016a). The influence of test-based accountability policies on school climate and teacher stress across four states. Teaching and Teacher Education, 59, 492-502.
von der Embse, N. P., Sandilos, L. E., Pendergast, L., & Mankin, A. (2016b). Teacher stress, teaching-efficacy, and job satisfaction in response to test-based educational accountability policies. Learning & Individual Differences, 50, 308-317.
Xu, X., Grant, L. W., & Ward, T. J. (2016). Validation of a state-wide teacher evaluation system: Relationship between scores from evaluation and student academic progress. NASSP Bulletin, 100(4), 203-222.

Contact Us: www.strongeandassociates.com 757.986.0756 info@strongeandassociates.com