Ashley Cholewa
Dr. Liu
HON-200-01
4 March 2014
Policy Brief
Value-Added Teacher Evaluations Cannot Stand Alone
By Ashley Cholewa
WHAT IS VALUE-ADDED EVALUATION AND WHY IS IT AN ISSUE?
Value-added evaluation, a method that relies mainly on statistical analysis of students’ standardized test scores to determine a teacher’s effectiveness, is becoming increasingly popular among school districts and teachers’ unions, much to the dismay of teachers worried about what value-added is and what it means for their teaching careers.
The most basic definition of value-added evaluation, according to Daniel F.
McCaffrey of the Carnegie Foundation for
the Advancement of Teaching, is that it is a
form of evaluation that “uses statistical
methods to isolate the contributions of
teachers from other factors that influence
student achievement” (3). In other words, it
is a mathematical way to measure the
contribution a teacher has made to her
students’ academic growth during a given
year. Joel Lefkowitz, a professor emeritus of
Psychology at prestigious colleges such as
Bernard M. Baruch College and the Graduate
Center of the City University of New York,
says that value-added does this by relying on
“complex algorithms based on the
standardized test performance of [a teacher’s]
students” (47). While Lefkowitz is certainly
correct in asserting that value-added is
mainly concerned with test scores, he fails to
mention that the algorithms used to configure
teachers’ value-added scores take into
account much more than “a single score on a
single day,” a fact that is pointed out in a
document released by teacher organization
TNTP (1). TNTP, formerly known as The
New Teacher Project, says the equations used
to evaluate teachers take into consideration
“factors like a student’s poverty level or class
size,” both of which could adversely affect a
student’s academic progress (1).
The ultimate goal of using value-added
evaluation, as Stephen W. Raudenbush and
Marshall Jean of the Carnegie Foundation for
the Advancement of Teaching suggest, is to
determine the extent to which a teacher
helped her students grow academically in a
single subject, over the course of a single year
(2).
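To make the statistical idea concrete, the short Python sketch below fits a deliberately simplified value-added model to synthetic data: each student’s current score is regressed on a prior-year score and a poverty indicator, and a teacher’s value-added is the average amount by which her students beat or miss that prediction. The data, variable names, and two-step procedure are hypothetical simplifications for illustration only; the models used in real systems are far more elaborate.

# A minimal sketch of a covariate-adjusted value-added estimate.
# All data here are synthetic; real value-added models are far
# more complex than this illustration.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_teachers = 300, 10
teacher = rng.integers(0, n_teachers, n_students)  # class assignments
prior = rng.normal(50, 10, n_students)             # last year's scores
poverty = rng.integers(0, 2, n_students)           # poverty indicator
true_effect = rng.normal(0, 2, n_teachers)         # hidden teacher quality

# Current scores depend on prior achievement, poverty, and the teacher.
score = (5 + 0.9 * prior - 3 * poverty
         + true_effect[teacher] + rng.normal(0, 5, n_students))

# Step 1: regress scores on the controls (prior score, poverty).
X = np.column_stack([np.ones(n_students), prior, poverty])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

# Step 2: a teacher's value-added is her students' mean residual,
# i.e., how far they land above or below the predicted score.
residual = score - X @ beta
value_added = np.array([residual[teacher == t].mean()
                        for t in range(n_teachers)])
print(np.round(value_added, 2))

In this toy version, a teacher’s estimate is simply how far her students land, on average, above or below what the controls alone would predict.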
WHAT ARE THE BENEFITS?
One can see how this evaluation
method could be helpful in improving overall
teacher quality. The fact that it is a mathematical method helps stave off subjectivity on the evaluator’s part, and the fact that it acknowledges that socioeconomically disparate students learn differently helps teachers feel confident that they are being evaluated fairly. In addition, as
Thomas Toch, co-director of the Washington-based think tank Education
Sector, suggests in an article in Education
Week, value-added is “inexpensive and easy
to administer and seemingly measures what
matters most—student achievement” (28-29). Low cost and ease of use alone certainly
make this method an attractive option, but the
fact that value-added also yields useful results about academic growth led even Randi Weingarten, then president of the teachers union American Federation of Teachers, to adopt the method as a means of
replacing inefficient evaluation practices
across the country (Herbert A23). In a New
York Times article by Bob Herbert,
Weingarten announced her endorsement of
value-added, and even vowed to “urge her
members to accept [this] form of teacher
evaluation that takes student achievement
into account…” (A23). TNTP suggests that
teachers would be wise to heed Weingarten’s
advice and to accept value-added as it is
extremely fair to teachers, allowing them to
“get the credit they deserve for helping all
their students improve—even those who start
the year far behind grade level—and aren’t
penalized for the effects of factors beyond
their control” (par. 1). The fact that the
algorithms used in this method are so
complex and control for so many different
demographic variables, even taking into
account a student’s previous progress, means
that teachers are more likely to be judged
solely on their own performance as
educators.
WHAT ARE THE DRAWBACKS?
Unfortunately, what Ms. Weingarten and TNTP did not take into account when extolling the virtues of value-added was the fact that, as good as it sounds
in theory, in practice it is not nearly as
impressive an evaluation method. For
example, in Stephen Sawchuk’s Education
Week article reviewing a study by the Bill &
Melinda Gates Foundation, it was revealed
that results from value-added evaluations
“tended to be volatile and were also least
predictive of how students taught by those
teachers would fare on…more cognitively
demanding tasks” (16). This means that
value-added results were least reliable out of
the three different teacher evaluation
methods the study tested, and those same
results were the least indicative of how
students would do on more complex, project-based work than the simple, factual
memorization required by standardized tests.
So, even though value-added purports to
represent student achievement, it does not
represent it well enough to be depended on.
In addition to this revelation, Donald B. Gratz
brings up a pertinent point about standardized
tests in his Educational Leadership article.
He says that Americans assume that
“standardized test scores accurately measure
student academic achievement and that
academic achievement constitutes the full
range of goals we have for students,” even
though there is no concrete proof that either
one of those statements is true (78). The
Gates Foundation study presented in
Sawchuk’s article even seems to refute one of
those questionable assumptions, showing that
standardized test scores did not necessarily
predict performance on more thought-provoking tasks (16).
McCaffrey also brings up the point
that it is “very challenging to differentiate
between a teacher’s contribution to student
achievement and the influences of the school,
the classroom, and a student’s peers” (8).
This means that even when using a
mathematical approach like value-added,
there is no guarantee that a teacher’s
evaluation result is based solely on her
teaching ability. No evaluation method can
control for every single student’s personal
situation, so if a student is being consistently
discouraged by his family or by his peers and
that discouragement is negatively affecting
his standardized test scores year after year,
his teachers may be unfairly blamed for
his failure to grow academically. Lefkowitz
even goes so far as to suggest that value-added isn’t fair to any teacher—he thinks that
value-added evaluations “appear to lack
validity or job relatedness and in at least some
instances could be found illegal” (47).
Lefkowitz feels that some unfairly evaluated
and terminated teachers could potentially win
court cases against schools that used poor
value-added scores to fire them. He says that
the use of “performance outcomes to evaluate
individual employee’s work performance is
appropriate only if those indicators are under
the employee’s control,” and that federal law
would require employers to show “a ‘clear
relation’ between standardized test scores
and teacher’s performance” (Lefkowitz 48).
We know from McCaffrey’s article that a
teacher’s contribution is difficult to separate
from the contributions of peers and other
environmental factors, so the performance
indicator of standardized test scores may not
be entirely in teachers’ control, and thus it
may be impossible to show the “clear relation”
required by federal law (8).
Another issue with the value-added
method, very similar to the problem of
pinpointing a teacher’s contribution, is the
idea of confounding. McCaffrey says that
“confounding might occur because the
statistical model doesn’t measure or properly
control for all the factors that contribute to
student achievement” (3). In other words,
when value-added algorithms are not
complex enough to control for all possible
demographic factors in a classroom, a
teacher’s contribution may be over- or underestimated. McCaffrey specifies that confounding only occurs if these demographic factors are “consistently…associated with the students of a particular teacher, so that they result in a value-added score that consistently underestimates or overestimates her effectiveness,” but if there is a flaw in a school’s value-added algorithm that overlooks certain demographic factors, numerous teachers in that school may be assigned inaccurate value-added scores (3).
A pertinent example of consistent
confounding is a teacher who is assigned to
have a group of gifted children in her class
year after year. The value-added model does
not control for these children’s enhanced
academic ability, and so, as they get
consistently high marks on standardized tests
every year, they do not appear to grow
academically. This affects their teacher’s
value-added score, making her appear
consistently less effective than she truly is.
In this way, confounding can be a huge
problem with the value-added evaluation
method.
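A short simulation, again hypothetical, makes the danger concrete. Here every teacher contributes the same true gain of about five points, but one teacher always receives the gifted class, whose scores press against the test’s 100-point ceiling; because the model controls for prior scores but knows nothing of the ceiling, her estimate is consistently depressed.

# Hypothetical simulation of confounding: teacher 0 always gets the
# gifted class, and the test's ceiling hides that class's growth.
import numpy as np

rng = np.random.default_rng(1)
per_class = 200
prior_means = [97, 55, 65, 75, 85]  # teacher 0's class starts near the top

teacher = np.repeat(np.arange(len(prior_means)), per_class)
prior = np.concatenate([rng.normal(m, 3, per_class) for m in prior_means])

gain = rng.normal(5, 2, teacher.size)  # every teacher adds ~5 points
score = np.minimum(prior + gain, 100)  # scores cannot exceed 100

# The model controls for prior scores but not for the ceiling.
X = np.column_stack([np.ones(teacher.size), prior])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
residual = score - X @ beta

for t in range(len(prior_means)):
    print(f"teacher {t}: estimated value-added "
          f"{residual[teacher == t].mean():+.2f}")
# Teacher 0's estimate comes out lowest even though her true
# contribution equals everyone else's.

Because the ceiling also distorts the fitted slope, the other teachers’ estimates drift as well, echoing McCaffrey’s warning that a flawed model can misjudge many teachers at once.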
A final and serious problem with
value-added is presented when Toch says that
“only about half of public school teachers
teach the subjects, or at the grade levels
where students are tested…” (28). Not only
can a teacher’s contribution not be precisely
pinpointed, but not all teachers even have an
opportunity to have their students tested.
This, Toch says, “[eliminates] the prospect of
a system that’s applied fairly to all teachers”
(28).
As Raudenbush and Jean remind us,
“social science has yet to come up with a
perfect measure of teacher effectiveness, so
anyone who makes decisions on the basis of
value-added estimates will be doing so in the
midst of uncertainty” (3). In other words, it is
unwise to rely on just one method when
evaluating teachers, particularly when we are
unsure whether any of the methods we use
today are actually true measures of teachers’
contributions.
RECOMMENDATIONS
So, what can be done to capitalize on
the benefits of value-added while still
mitigating its many pitfalls? TNTP assures us
that though “no single measure of
performance is reliable in isolation…,”
value-added can still be helpful, as the
method “provides objective information to
support or act as a check against classroom
observations” (1). Supporting this suggestion
of combining evaluation methods is the study
presented in Sawchuk’s article, which found
that “student feedback, test-score-growth
calculations, and observations of practice
appear to pick up different but
complementary information that, combined,
can provide a balanced and accurate picture
of teacher performance…” (1). Toch
provides a further framework for effective in-class evaluation, adding that in order for an
evaluation to truly complement the results
garnered through value-added, it must be
thorough (29). He says that “to get a fairer
and fuller sense of performance, evaluations
should focus on teachers’ instruction—the
way they plan, teach, test, manage and
motivate” (Toch 29). This way, notes taken
from observations are truly notes about the
teacher’s effectiveness in the classroom.
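As a rough illustration of the blended approach these sources recommend, the sketch below folds three normalized measures into one composite rating. The weights are invented for illustration and do not come from the Gates Foundation study or any other cited source.

# Hypothetical composite rating: value-added is one input among
# several, checked against observations and student surveys.
def composite_rating(value_added, observation, student_survey):
    """Blend three measures, each normalized to a 0-100 scale."""
    # Illustrative weights only; not taken from any cited study.
    return 0.35 * value_added + 0.40 * observation + 0.25 * student_survey

# A volatile value-added score is damped by the other two measures:
print(composite_rating(value_added=42, observation=81, student_survey=77))

The point of the design is simply that no single measure, however volatile, can dominate the final rating.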
State Departments of Education and
individual school districts should seriously
consider heeding Sawchuk’s, Toch’s, and
TNTP’s advice—the best way to evaluate a
teacher is to use an objective, mathematical
approach as a first step and a more personal, subjective method to supplement the conclusions drawn from that first step. This
way, value-added remains cheap and
efficient, and the threat of serious problems
like confounding is lessened.
Works Cited
Gratz, Donald B. “The Problem with Performance Pay.” Educational Leadership 67.3 (2009):
76-79. Academic Search Premier. Web. 6 Feb. 2014.
Herbert, Bob. “A Serious Proposal.” New York Times 12 Jan. 2010: A23. Education Source.
Web. 23 Feb. 2014.
Lefkowitz, Joel. “Rating Teachers Illegally?” TIP: The Industrial-Organizational Psychologist
48.4 (2011): 47-49. Academic Search Premier. Web. 6 Feb. 2014.
McCaffrey, Daniel F. “Do Value-Added Methods Level the Playing Field for Teachers? What
We Know Series: Value-Added Methods and Applications. Knowledge Brief 2.”
Carnegie Foundation for the Advancement of Teaching, 2012. ERIC. Web. 23 Feb.
2014.
Raudenbush, Stephen W., and Marshall Jean. “How Should Educators Interpret Value-Added
Scores? What We Know Series: Value-Added Methods and Applications. Knowledge
Brief 1.” Carnegie Foundation for the Advancement of Teaching, 2012. ERIC. Web.
23 Feb. 2014.
Sawchuk, Stephen. “Multiple Gauges Best for Teachers.” Education Week 16 Jan. 2013: 1+.
ERIC. Web. 6 Feb. 2014.
TNTP. “Myths & Facts About Value-Added Analysis.” TNTP (2011): ERIC. Web. 23 Feb. 2014.
Toch, Thomas. “Test Results and Drive-By Evaluations.” Education Week 5 Mar. 2008: 28-29.
Academic Search Premier. Web. 6 Feb. 2014.