Cholewa 1

Ashley Cholewa
Dr. Liu
HON-200-01
4 March 2014

Policy Brief

Value-Added Teacher Evaluations Cannot Stand Alone
By Ashley Cholewa

WHAT IS VALUE-ADDED EVALUATION AND WHY IS IT AN ISSUE?

Value-added evaluation, a method of teacher evaluation that relies mainly on statistical analysis of students’ standardized test scores to determine a teacher’s effectiveness, is becoming increasingly popular among school districts and teachers’ unions, much to the dismay of teachers worried about what value-added is and what it means for their teaching careers. The most basic definition of value-added evaluation, according to Daniel F. McCaffrey of the Carnegie Foundation for the Advancement of Teaching, is that it is a form of evaluation that “uses statistical methods to isolate the contributions of teachers from other factors that influence student achievement” (3). In other words, it is a mathematical way to measure the contribution a teacher has made to her students’ academic growth during a given year. Joel Lefkowitz, a professor emeritus of psychology at Bernard M. Baruch College and the Graduate Center of the City University of New York, says that value-added does this by relying on “complex algorithms based on the standardized test performance of [a teacher’s] students” (47). While Lefkowitz is correct that value-added is mainly concerned with test scores, he fails to mention that the algorithms used to calculate teachers’ value-added scores take into account much more than “a single score on a single day,” a fact pointed out in a document released by the teacher organization TNTP (1). TNTP, formerly known as The New Teacher Project, says the equations used to evaluate teachers take into consideration “factors like a student’s poverty level or class size,” both of which could adversely affect a student’s academic progress (1).
The ultimate goal of using value-added evaluation, as Stephen W. Raudenbush and Marshall Jean of the Carnegie Foundation for the Advancement of Teaching suggest, is to determine the extent to which a teacher helped her students grow academically in a single subject over the course of a single year (2).

WHAT ARE THE BENEFITS?

One can see how this evaluation method could help improve overall teacher quality. Because it is a mathematical method, it helps stave off subjectivity on the evaluator’s part, and because it acknowledges that students from different socioeconomic backgrounds learn differently, it helps teachers feel confident that they are being evaluated fairly. In addition, as Thomas Toch, co-director of the Washington-based think tank Education Sector, suggests in an article in Education Week, value-added is “inexpensive and easy to administer and seemingly measures what matters most—student achievement” (28-29). Low cost and ease of use alone make this method an attractive option, but the fact that value-added also yields useful results about academic growth had even Randi Weingarten, then-president of the American Federation of Teachers, a teachers’ union, adopting the method as a means of replacing inefficient evaluation practices across the country (Herbert A23). In a New York Times article by Bob Herbert, Weingarten announced her endorsement of value-added and even vowed to “urge her members to accept [this] form of teacher evaluation that takes student achievement into account…” (A23). TNTP suggests that teachers would be wise to heed Weingarten’s advice and accept value-added because it is fair to teachers, allowing them to “get the credit they deserve for helping all their students improve—even those who start the year far behind grade level—and aren’t penalized for the effects of factors beyond their control” (par. 1).
Because the algorithms used in this method are so complex and control for so many demographic variables, even taking into account a student’s previous progress, teachers are more likely to be judged solely on their own performance as educators.

WHAT ARE THE DRAWBACKS?

Unfortunately, what Ms. Weingarten and TNTP did not take into account when extolling the virtues of value-added is that, as good as it sounds in theory, it is not nearly as impressive an evaluation method in practice. For example, Stephen Sawchuk’s Education Week article reviewing a study by the Bill & Melinda Gates Foundation revealed that results from value-added evaluations “tended to be volatile and were also least predictive of how students taught by those teachers would fare on…more cognitively demanding tasks” (16). This means that value-added results were the least reliable of the three teacher evaluation methods the study tested, and that those same results were the least indicative of how students would do on work more complex and project-based than the simple factual memorization required by standardized tests. So, even though value-added purports to represent student achievement, it does not represent it well enough to be depended on.

In addition to this revelation, Donald B. Gratz brings up a pertinent point about standardized tests in his Educational Leadership article. He says that Americans assume that “standardized test scores accurately measure student academic achievement and that academic achievement constitutes the full range of goals we have for students,” even though there is no concrete proof that either of those statements is true (78). The Gates Foundation study presented in Sawchuk’s article even seems to refute one of those questionable assumptions, showing that standardized test scores did not necessarily predict performance on more thought-provoking tasks (16).
McCaffrey also brings up the point that it is “very challenging to differentiate between a teacher’s contribution to student achievement and the influences of the school, the classroom, and a student’s peers” (8). This means that even with a mathematical approach like value-added, there is no guarantee that a teacher’s evaluation result is based solely on her teaching ability. No evaluation method can control for every student’s personal situation, so if a student is consistently discouraged by his family or his peers, and that discouragement lowers his standardized test scores year after year, his teachers may be unfairly blamed for his failure to grow academically.

Lefkowitz even goes so far as to suggest that value-added isn’t fair to any teacher: he thinks that value-added evaluations “appear to lack validity or job relatedness and in at least some instances could be found illegal” (47). Lefkowitz feels that some unfairly evaluated and terminated teachers could potentially win court cases against schools that used poor value-added scores to fire them. He says that the use of “performance outcomes to evaluate individual employee’s work performance is appropriate only if those indicators are under the employee’s control,” and that federal law would require employers to show “a ‘clear relation’ between standardized test scores and teacher’s performance” (Lefkowitz 48). We know from McCaffrey’s article that a teacher’s contribution is difficult to separate from the contributions of peers and other environmental factors, so the performance indicator of standardized test scores may not be entirely in teachers’ control, and thus it may be impossible to show the relation required by federal law (8).

Another issue with the value-added method, very similar to the problem of pinpointing a teacher’s contribution, is the idea of confounding.
McCaffrey says that “confounding might occur because the statistical model doesn’t measure or properly control for all the factors that contribute to student achievement” (3). In other words, when value-added algorithms are not complex enough to control for all possible demographic factors in a classroom, a teacher’s contribution may be over- or underestimated. McCaffrey specifies that confounding only occurs if these demographic factors are “consistently…associated with the students of a particular teacher, so that they result in a value-added score that consistently underestimates or overestimates her effectiveness,” but if a flaw in a school’s value-added algorithm overlooks certain demographic factors, numerous teachers in that school may be assigned inaccurate value-added scores (3). A pertinent example of consistent confounding is a teacher who is assigned a group of gifted children year after year. The value-added model does not control for these children’s enhanced academic ability, so because they earn consistently high marks on standardized tests every year, they do not appear to grow academically. This affects their teacher’s value-added score, making her appear consistently less effective than she truly is. In this way, confounding can be a serious problem with the value-added evaluation method.

A final and serious problem with value-added is presented when Toch says that “only about half of public school teachers teach the subjects, or at the grade levels where students are tested…” (28). Not only can a teacher’s contribution not be precisely pinpointed, but not all teachers even have an opportunity to have their students tested. This, Toch says, “[eliminates] the prospect of a system that’s applied fairly to all teachers” (28).
As Raudenbush and Jean remind us, “social science has yet to come up with a perfect measure of teacher effectiveness, so anyone who makes decisions on the basis of value-added estimates will be doing so in the midst of uncertainty” (3). In other words, it is unwise to rely on just one method when evaluating teachers, particularly when we are unsure whether any of the methods we use today are true measures of teachers’ contributions.

RECOMMENDATIONS

So, what can be done to capitalize on the benefits of value-added while mitigating its many pitfalls? TNTP assures us that though “no single measure of performance is reliable in isolation…,” value-added can still be helpful, as the method “provides objective information to support or act as a check against classroom observations” (1). Supporting this suggestion of combining evaluation methods is the study presented in Sawchuk’s article, which found that “student feedback, test-score-growth calculations, and observations of practice appear to pick up different but complementary information that, combined, can provide a balanced and accurate picture of teacher performance…” (1). Toch provides a further framework for effective in-class evaluation, adding that for an evaluation to truly complement the results garnered through value-added, it must be thorough (29). He says that “to get a fairer and fuller sense of performance, evaluations should focus on teachers’ instruction—the way they plan, teach, test, manage and motivate” (Toch 2). This way, notes taken from observations are truly notes about the teacher’s effectiveness in the classroom.

State Departments of Education and individual school districts should seriously consider heeding Sawchuk’s, Toch’s, and TNTP’s advice: the best way to evaluate a teacher is to use an objective, mathematical approach as a first step, and a more personal, subjective method to supplement the conclusions drawn from that first step.
This way, value-added remains cheap and efficient, and the threat of serious problems like confounding is lessened.

Works Cited

Gratz, Donald B. “The Problem with Performance Pay.” Educational Leadership 67.3 (2009): 76-79. Academic Search Premier. Web. 6 Feb. 2014.

Herbert, Bob. “A Serious Proposal.” New York Times 12 Jan. 2010: A23. Education Source. Web. 23 Feb. 2014.

Lefkowitz, Joel. “Rating Teachers Illegally?” TIP: The Industrial-Organizational Psychologist 48.4 (2011): 47-49. Academic Search Premier. Web. 6 Feb. 2014.

McCaffrey, Daniel F. “Do Value-Added Methods Level the Playing Field for Teachers? What We Know Series: Value-Added Methods and Applications. Knowledge Brief 2.” Carnegie Foundation for the Advancement of Teaching (2012): ERIC. Web. 23 Feb. 2014.

Raudenbush, Stephen W., and Marshall Jean. “How Should Educators Interpret Value-Added Scores? What We Know Series: Value-Added Methods and Applications. Knowledge Brief 1.” Carnegie Foundation for the Advancement of Teaching (2012): ERIC. Web. 23 Feb. 2014.

Sawchuk, Stephen. “Multiple Gauges Best for Teachers.” Education Week 16 Jan. 2013: 1+. ERIC. Web. 6 Feb. 2014.

TNTP. “Myths & Facts About Value-Added Analysis.” TNTP (2011): ERIC. Web. 23 Feb. 2014.

Toch, Thomas. “Test Results and Drive-By Evaluations.” Education Week 5 Mar. 2008: 28-29. Academic Search Premier. Web. 6 Feb. 2014.