Does Teacher Evaluation Improve Schools or Student Learning (12-21-17)

DOES TEACHER EVALUATION IMPROVE SCHOOLS AND STUDENT LEARNING?
©Stronge and Associates, 2018
All Rights Reserved
February, 2018
Research Report
What Does Research Say?
Teacher effectiveness has proven time after time to be the most influential school-related factor in
student achievement. If teacher quality is the pillar of the success of education, then it logically follows
that a robust teacher evaluation system should be in place, since the purpose of evaluation is to
recognize and develop good teaching. Stronge and Tucker (2003) stated:
Without capable, high quality teachers in America’s classrooms, no educational reform effort
can possibly succeed. Without high quality evaluation systems, we cannot know if we have high
quality teachers. Thus, a well-designed and properly implemented teacher evaluation system is
essential in the delivery of effective educational programs and in school improvement. (p. 3)
There are two main purposes of
teacher evaluation: 1) a
formative purpose to inform
and stimulate teachers’
professional development; and
2) a summative purpose to hold
teachers accountable for their
performance. Evaluation is a
tool, not the outcome; it serves
as a systematic tool that enables
data-driven personnel and
student improvement decisions.
Most recently, there has been a
national imperative to reform
teacher evaluation systems, spurred
by federal policy initiatives,
state statutory and policy decisions,
and local policy. The new teacher
evaluation systems typically use
multiple data sources (e.g.,
observation, student achievement
data, student surveys, and teacher
portfolios) to evaluate both the
process and outcomes of teaching.
By 2015, all 50 states and the
District of Columbia had policies
for performance-based teacher
evaluation and 43 of them
mandated the incorporation of
student achievement data in these
evaluations (Marsh et al., 2017).
Potential Positive Impact
On the positive side, there are
numerous benefits of
performance-based and
multiple-measure teacher
evaluation – benefits such as
increasing the accuracy and
objectivity of the evaluation,
better differentiating high- and
low-performing teachers,
identifying areas of strengths
and weaknesses, and
providing more meaningful,
specific feedback about teacher
practice (Delvaux et al., 2013;
Hill, Kapitula, & Umland,
2011). An effective teacher
evaluation system propels
individuals to work more
effectively, efficiently, and
persistently, especially when
they believe their performance
is gauged against standards
that are fair and objective. For
instance, Taylor and Tyler
(2012) found that participating
in teacher evaluation can
increase teacher performance
by 0.11 standard deviations,
which is equivalent to an
improvement of 4.5 percentile
points, compared with not
participating in evaluation
(Figure 1).
Figure 1. Impact of Evaluation (Taylor & Tyler, 2012)
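The report expresses several effects both in standard deviation (SD) units and in percentile points. The following is a minimal sketch of how such a conversion is commonly approximated. It assumes scores are normally distributed and that the reference student starts at the 50th percentile; the report does not state which method the cited studies used, and the function name `sd_to_percentile_gain` is hypothetical.

```python
from statistics import NormalDist


def sd_to_percentile_gain(effect_sd: float, start_percentile: float = 50.0) -> float:
    """Approximate percentile-point gain for a student who starts at
    start_percentile and improves by effect_sd standard deviations.

    Assumption: scores follow a normal distribution.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = std_normal.inv_cdf(start_percentile / 100.0)
    end_percentile = std_normal.cdf(start_z + effect_sd) * 100.0
    return end_percentile - start_percentile


# Effect sizes quoted in this report (in SD units).
reported_effects = [
    ("Taylor & Tyler (2012)", 0.11),
    ("Adnot et al. (2016), reading", 0.14),
    ("Adnot et al. (2016), math", 0.21),
]

for label, effect in reported_effects:
    gain = sd_to_percentile_gain(effect)
    print(f"{label}: {effect:.2f} SD is about {gain:.1f} percentile points")
```

Under these assumptions, 0.11 SD corresponds to roughly 4.4 percentile points, close to the 4.5 points reported above. The small gap may reflect rounding or a different starting percentile in the original study.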
Extant literature has
primarily focused on the
validity and reliability of
evaluation, but less
attention has been given to
the efficacy of evaluation
systems in improving
teacher practices and
student learning. This brief
report aims to summarize
research evidence about
whether reformed teacher
evaluation has improved
schools or student learning.
The findings are quite
mixed, revealing that the
effectiveness of teacher
evaluation is contingent
upon a number of
contextual factors.
Figure 1 compares the average teacher’s students’ scores in the years before and in the years after the teacher has undergone an evaluation.
Steinberg and Sartain (2015a) also found that reformed teacher evaluation systems characterized
by quality classroom observations and conferences with teachers can increase student achievement by
5.4 percent of a standard deviation in math and 9.9 percent of a standard deviation in reading (Figure
2).
Figure 2: Student Achievement Boost by Teacher’s Participation in Evaluation (Steinberg & Sartain,
2015a). Change in scores (% of st. dev.): Reading 9.9; Math 5.4.
Evaluation also can improve the composition of the teacher workforce. It is estimated that dismissing
and replacing teachers who fall in the bottom 6 to 10 percent of the value-added distribution would
improve student achievement by 50 percent of a standard deviation (Hanushek, 2008). Research also
suggests that performance-based evaluation increases the voluntary attrition of low-performing
teachers and improves the effectiveness of teachers who remain. For instance, Dee and Wyckoff
(2015) examined a controversial teacher evaluation system introduced in the District of Columbia
Public Schools by then-Chancellor Michelle Rhee. The study examined the retention, turnover, and
performance outcomes of low-performing teachers whose ratings placed them near the threshold that
implied a strong dismissal threat. The findings indicated that the dismissal threat increased the voluntary
attrition of low-performing teachers by 11 percentage points and improved the performance of low-performing teachers who remained by 0.27 of a standard deviation (Figure 3).
Figure 3: Impact of Evaluation and Dismissal (Dee & Wyckoff, 2015). Dismissal threats increased the
voluntary attrition of low-performing teachers by 11 percentage points and improved the performance
of low-performing teachers who remained by 0.27 of a standard deviation.
Similarly, Adnot and colleagues (2016) evaluated the effects of teacher turnover caused by evaluation
in the District of Columbia Public Schools (DCPS). Specifically, they examined teachers who received an
“ineffective” rating in evaluation, turned over, and were replaced by new hires. Fifty-five percent of the
replacement teachers came from outside the DCPS system and 45 percent were transferred from within the
DCPS schools. The researchers compared student achievement gain differences between entering and
exiting teachers. They found that when low-performing teachers were induced to leave for poor
performance, student academic achievement improved by 6 percentile points (0.14 SD) in reading and 8
percentile points (0.21 SD) in math (Figure 4) (Adnot, Dee, Katz, & Wyckoff, 2016). This study provides
evidence that rigorous evaluation can support the dismissal of the least effective teachers and inform
efforts to retain effective teachers.
Figure 4: Dismissal of Ineffective Teachers and Student Achievement Gains (Adnot, Dee, Katz, &
Wyckoff, 2016)
Jiang et al. (2014) investigated the impact of Chicago’s
teacher evaluation reform on student learning. They
found:
• At the end of the first year of implementing the
reformed teacher evaluation system, schools
improved student achievement in reading by
0.10 standard deviations.
• More advantaged schools (i.e., schools that were
high achieving prior to implementation or
schools with lower rates of student poverty)
tended to benefit the most from the reform in
teacher evaluation.
Traditional teacher evaluation
tended to rely on a single
measure of teacher
performance, which typically
was a perfunctory classroom
observation. Since any
measure of teacher
effectiveness is fallible and not
definitive, classroom
observation also is susceptible
to unreliability, instability,
and bias (e.g., Hill et al.,
2012). Thus, using a balanced
combination of multiple
measures may create a more
valid composite of teacher
performance (Braun, 2015;
Measures of Effective
Teaching, 2013). In many
respects, the reforms have
improved the practices in
teacher evaluation. Research
in the field has reported the
following positive findings
regarding performance-based
teacher evaluation systems
that use multiple data
sources to make evaluation
decisions:
• Performance-based teacher
evaluation is more
conducive to the
improvement of teaching
than traditional drive-by
evaluations, and it
contributes to a better
professional atmosphere in
schools (Toch & Rothman,
2008).
• Teachers report that the
structured and standards-based
classroom observations provide useful
feedback. They believe their
evaluator is fair and able to
accurately assess their
instruction, and the overall
objectivity of evaluation is
strengthened (Jiang, Sporte,
& Luppescu, 2015; Kimball,
2002).
• It contributes to a common
dialogue about quality
instruction across the school
during evaluation interaction
(Taylor & Tyler, 2011).
• Teachers perceive that this
type of teacher evaluation
provides more
comprehensive, specific, and
clear expectations for
performance compared with
conventional teacher
evaluation systems, and they
are positive about providing
input into the evaluation
process (Steinberg & Sartain,
2015b).
• When teachers perceive that
the evaluation standards are
clear and relevant to good
teaching, they also perceive
that the teacher evaluation is
effective and fair, and as a
result, they have a higher
level of organizational
commitment. Teachers also
tend to have decreased
perceptions of “role
ambiguity” (uncertainty
about what the occupant of a
particular position is
supposed to do) and also
increased perceptions of
“effort performance-rating
linkage” (the extent to which
people perceive there is a
clear and direct relationship
between their work
effort/performance and
evaluation of their
performance) (Conley,
Muncy, & You, 2005).
• Teachers usually accept the
performance standards,
procedures, and outcomes.
They tend to accept the
standards as consistent
with their view of teaching.
Additionally, teachers have
confirmed that
performance-based teacher
evaluation serves the
purposes of 1) increasing
accountability of teaching
and 2) helping teachers
improve professionally
(Kimball, 2002;
Milanowski & Heneman,
2001; Taylor & Tyler,
2012).
• Teachers report that the
evaluation process leads
them to engage in more
reflection, align their
teaching more closely with the
performance standards,
become more organized, and
improve their lesson
planning and classroom
management skills
(Heneman & Milanowski,
2003).
• Performance-based teacher
evaluation systems have
been found to have a
substantial degree of
criterion validity. Teachers
with higher scores on
standards-based
evaluations produce more
student learning gains than
teachers with lower
evaluation scores (Xu,
Grant, & Ward, 2016).
Potential Negative Impact
Teacher evaluation that is thoughtfully implemented
can translate into improved student performance.
Nevertheless, despite the positive results that can
accrue from well-designed and well-implemented
evaluation systems, consternation over the inclusion
of student performance has caused negative publicity,
which has sometimes overshadowed the benefits.
Including any direct measure of student
performance may increase pressure on
teachers to teach to the test, reduce instructional
depth, and target instruction primarily
toward students whose test scores are likely to
improve. As a result, teachers may avoid serving
socio-economically disadvantaged students and
schools, and probably also high-performing
students whose scores approach the ceiling
of achievement assessments (Menken, 2006;
von der Embse et al., 2016a, 2016b). Other criticisms related
to using student achievement data in teacher
evaluation include:
• Students’ learning ability, home and peer influence,
motivation and other influences are powerful in
affecting achievement. It is challenging to
disentangle a teacher’s impact from the influence of
pre-existing student differences. Value-added
models typically measure correlation, not
causation. Consequently, achievement data cannot
answer with precision the degree to which student
learning is attributed to students, teachers, or other
factors (Darling-Hammond, Amrein-Beardsley,
Haertel, & Rothstein, 2012; Morganstein &
Wasserstein, 2014).
• The quality of student achievement data is
uncertain. In order for a teacher to be accurately
evaluated on the basis of his or her students’
academic achievement, it is crucial that the student
performance assessment being used is high quality.
Student performance measures must be valid,
reliable, useful for diagnosis, stretchy enough to
allow growth for both low- and high-performing
learners, equitable, and comparable (Koedel &
Betts, 2009). Furthermore, standardized
achievement tests are unlikely to reflect the full
range of instructional goals in their subject areas.
Using these types of tests to evaluate teachers may
actually encourage them to teach down,
focusing more on the lower level skills being
measured (Toch & Rothman, 2008).
• Value-added scores may provide teachers
and administrators with information on their
students’ performance and identify areas
where improvement is needed; however,
they do not provide information on how to
improve the actual teaching. In addition,
teachers’ value-added scores can change
drastically when a different model or test is
used (Morganstein & Wasserstein, 2014;
Sass, Semykina, & Harris, 2014).
• Teachers perceive that the inclusion of
student achievement data lacks clarity and
transparency. There is confusion and
misinformation about how much student growth
contributes to their overall evaluation score
as well as how value-added models control
for outside influences such as mobility and
poverty (Jiang, Sporte, & Luppescu, 2015).
• The integration of student achievement into
evaluation is associated with teacher stress
and job dissatisfaction (Jiang, Sporte, &
Luppescu, 2015). Also, educator
collaboration is decreasing while
competition is increasing because teachers are
held accountable for the learning of students
and do not want to release their students to
the care of other professionals (Hewitt,
2015).
• Scholars and practitioners have also commented
on the challenge of reconciling the
conflicting purposes of evaluation: the
summative, accountability-based
standpoint and the formative, development-oriented
standpoint (Hallinger, Heck, &
Murphy, 2014).
Ways to Improve the Positive Impact of
Teacher Evaluation
Research indicates that teacher evaluation has
an impact on teacher learning and professional
development, but this impact is influenced by a
number of factors (Delvaux et al.,
2013; Tuytens & Devos, 2011). In other words,
school-level implementation and organizational
context play important roles in the success of teacher
evaluation. The influencing factors include:
• Utility of the feedback;
• A positive attitude of the principal
toward the evaluation system;
• Clarity of the criteria and purposes of
the evaluation system;
• Credibility of the evaluator; and
• Instructional leadership of the principal.
Steinberg and Sartain (2015a) also found that
the effectiveness of teacher evaluation depends
on a number of factors, such as 1) principals’
capacity to provide targeted instructional
guidance, 2) teachers’ ability to respond to the
instructional feedback in a manner that
improves student achievement, and 3) the
extent of district-level support and training for
principals who are primarily responsible for
implementing the new system. Research
has consistently found that teachers’ perceptions of
evaluation are associated with their perceptions
of leadership and professional community
(Jiang, Sporte, & Luppescu, 2015; Marsh et al.,
2017). Principal instructional leadership, the
trust between the principal and teachers, and
structures for frequent collaborations are
positively correlated with teachers’ perceptions
of evaluation and feedback. Also, when a
strong professional community is available,
teachers tend to have more positive views of
evaluation. To make any evaluation effective, it is
crucial that the principals have the expertise in
conducting quality classroom observations and
conferences with teachers, and engaging teachers
in looking intensely at their weaknesses and
strengths. It is also important that a principal’s
role evolves from pure evaluation to a dual role in
which, by incorporating instructional coaching,
the principal serves as both evaluator and
formative assessor of a teacher’s instructional
practice (Steinberg & Sartain, 2015a).
To summarize, teacher evaluation, designed
properly and implemented with fidelity, is an
important lever for teacher and school
improvement. But if it is not implemented
appropriately, it may not fully realize its
potential benefits. Reformed
teacher evaluation systems have many advantages
over the traditional evaluation exemplified by a
one-shot superficial classroom visit and checklist
marking. However, teacher evaluation is still not
a silver bullet for resolving all the problems of
teaching and learning in schools, and the
effectiveness of evaluation itself is affected by
many contextual factors. School leaders
frequently are still confronted with the imbalance
of the formative and summative purposes of
evaluation. Researchers and practitioners need to
explore innovative ways to further increase the
efficacy of evaluation in improving professional
development and student learning.
References
Adnot, M., Dee, T., Katz, V., & Wyckoff, J. (2016). Teacher Turnover, Teacher Quality, and Student
Achievement in DCPS (CEPA Working Paper No.16-03). Available at:
http://cepa.stanford.edu/wp16-03
Braun, H. (2015). The value in value added depends on the ecology. Educational Researcher, 44(2), 127-131.
Conley, S., Muncy, D.E., & You, S. (2005). Standards-based evaluation and teacher career satisfaction: A
structural equation modeling analysis. Journal of Personnel Evaluation in Education, 18, 39-65.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher
evaluation. Phi Delta Kappan, 93(6), 8-15.
Dee, T. S., & Wyckoff, J. (2015). Incentives, selection, and teacher performance: Evidence from IMPACT.
Journal of Policy Analysis and Management, 34(2), 267-297.
Delvaux, E., Vanhoof, J., Tuytens, M., Vekeman, E., Devos, G., & Van Petegem, P. (2013). How may
teacher evaluation have an impact on professional development? A multilevel analysis. Teaching
and Teacher Education, 36, 1-11.
Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis
of the evidence. Educational Assessment, Evaluation, and Accountability, 26, 5-28.
Hanushek, E. A. (2008, May). Teacher deselection. Available at
http://www.stanfordalumni.org/leadingmatters/san_francisco/documents/Teacher_DeselectionHanushek.pdf.
Heneman, H. G., III., & Milanowski, A. T. (2003). Continuing assessment of teacher reaction to a
standards-based teacher evaluation system. Journal of Personnel Evaluation in Education, 17(2), 173-195.
Hewitt, K. K. (2015). Educator evaluation policy that incorporates EVAAS value-added measures:
Undermined intentions and exacerbated inequities. Education Policy Analysis Archives, 23(76).
Available at http://epaa.asu.edu/ojs/article/view/1968.
Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added
scores. American Educational Research Journal, 48(3), 794-831.
Jiang, J. Y., Sartain, L., Sporte, S. E., & Steinberg, M. P. (2014). The impact of teacher evaluation reform on
student learning: Success and challenge in replicating experimental findings with non-experimental data.
Society for Research on Educational Effectiveness.
Jiang, J. Y., Sporte, S. E., & Luppescu, S. (2015). Teacher perspectives on evaluation reform: Chicago’s
REACH students. Educational Researcher, 44(2), 105-116.
Kimball, S. M. (2002). Analysis of feedback, enabling conditions and fairness perceptions of teachers in
three school districts with new standards-based evaluation system. Journal of Personnel Evaluation in
Education, 16(4), 241-268.
Koedel, C., & Betts, J. R. (2009). Does student sorting invalidate value-added models of teacher effectiveness? An
extended analysis of the Rothstein critique. Nashville, TN: National Center on Performance Incentives.
Marsh, J. A., Bush-Mecenas, S., Strunk, K. O., Lincove, J. A., & Huguet, A. (2017). Evaluating teachers
in the big easy: How organizational context shapes policy response in New Orleans. Educational
Evaluation and Policy Analysis.
Measures of Effective Teaching. (2013). Ensuring fair and reliable measures of effective teaching: Culminating
findings from the MET Project’s three-year study [Policy and practice brief]. Seattle, WA: Bill &
Melinda Gates Foundation.
Menken, K. (2006). Teaching to the test: How No Child Left Behind impacts language policy, curriculum,
and instruction for English language learners. Bilingual Research Journal, 30(2), 521-546.
Milanowski, A. T., & Heneman, H. G., III. (2001). Assessment of teacher reactions to a standards-based
evaluation system: A pilot study. Journal of Personnel Evaluation in Education, 15(3), 193-212.
Morganstein, D., & Wasserstein, R. (2014). ASA statement on value-added models. Statistics and Public
Policy, 1(1), 108-110.
Sass, T. R., Semykina, A., & Harris, D. N. (2014). Value-added models and the measurement of teacher
productivity. Economics of Education Review, 38, 9-23.
Steinberg, M. P., & Sartain, L. (2015a). Does better observation make better teachers? New evidence from a
teacher evaluation pilot in Chicago. EducationNext, 15(1). Available at
http://educationnext.org/better-observation-make-better-teachers/
Steinberg, M. P., & Sartain, L. (2015b). Does teacher evaluation improve school performance?
Experimental evidence from Chicago’s Excellence in Teaching Project. Education Finance and
Policy, 10(4), 535-572.
Stronge, J. H., & Tucker, P. D. (2003). Handbook on teacher evaluation: Assessing and improving
performance. Larchmont, NY: Eye on Education.
Taylor, E. S., & Tyler, J. H. (2011). The effect of evaluation on performance: Evidence from longitudinal student
achievement data of mid-career teachers. Cambridge, MA: National Bureau of Economic Research.
Taylor, E. S., & Tyler, J. H. (2012). Can teacher evaluation improve teaching? Evidence of systematic
growth in the effectiveness of midcareer teachers. EducationNext, 12(4). Available at
http://educationnext.org/can-teacher-evaluation-improve-teaching/
Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Washington, DC:
Education Sector.
Tuytens, M., & Devos, G. (2011). Stimulating professional learning through teacher evaluation: An
impossible task for the school leader? Teaching and Teacher Education, 27(5), 891-899.
von der Embse, N. P., Pendergast, L. L., Segool, N., Saeki, E., & Ryan, S. (2016a). The influence of test-based
accountability policies on school climate and teacher stress across four states. Teaching and
Teacher Education, 59, 492-502.
von der Embse, N. P., Sandilos, L. E., Pendergast, L., & Mankin, A. (2016b). Teacher stress, teaching-efficacy,
and job satisfaction in response to test-based educational accountability policies. Learning
& Individual Differences, 50, 308-317.
Xu, X., Grant, L. W., & Ward, T. J. (2016). Validation of a state-wide teacher evaluation system:
Relationship between scores from evaluation and student academic progress. NASSP Bulletin,
100(4), 203-222.
Contact Us:
www.strongeandassociates.com
757.986.0756
info@strongeandassociates.com