VAM Description: “a collection of complex statistical techniques that use multiple years of students’ test score data to estimate the effects of individual schools or teachers” (McCaffrey, Lockwood, Koretz, & Hamilton, 2003, p. xi). Students’ previous test scores are used to create predicted test scores for a given year. The difference between the actual and predicted test scores is the growth score. A teacher’s contribution to students’ learning is determined by averaging all of his or her students’ growth scores. Teachers are then ranked against other teachers within a district (or other unit of interest) according to how much they contributed to students’ growth, and this ranking is their value-added “score” (Goe, 2008). A minimal code sketch of this computation appears after the pros and cons lists below.

Simple video explanation: http://www.youtube.com/watch?v=uONqxysWEk8

Pros
- Arguably the single best predictive measure of student learning
- Considered more “objective” and less subject to bias
- Relatively cheap after initial costs
- A direct measure of at least part of the outcome we ultimately care about: student learning
- Controls for prior achievement, and can control for SES/demographics of students (which status measures such as percent proficient do not)
- Different VAM models tend to be highly correlated with one another, suggesting that estimates are not highly model dependent (Lockwood et al., 2007)
- As better tests of student learning become available (Common Core), VAM estimates might be expected to better approximate deep student learning
- Easy to understand, appealing metrics
- High face validity

Cons
- Threatens to drown out attention to other measures of teaching quality or student learning because it is billed as “the best”
- Highly unstable from year to year
- Highly imprecise: for example, a teacher estimated at the 43rd percentile has a plausible range running from the 15th to the 71st percentile (Corcoran, 2010)
- Highly reliant on which test is used
- Missing data create problems for VAM and provide incentives to ignore new students
- Can only be applied to about 30% of teachers
- Cannot account for other differences such as out-of-school learning (summer, after-school, parental involvement)
- Teachers are estimated to account for about 10% of the variance in students’ scores
- Cannot account for non-random sorting of students to teachers (selection effects)
- Incentivizes increased attention to standardized test outcomes
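The growth-score computation described above can be illustrated with a short sketch. This is a minimal illustration only, assuming a tidy student-level table with made-up column names (prior_score, current_score, teacher) and a single-predictor OLS prediction; operational VAMs use mixed models with multiple prior years of scores, covariates, and shrinkage.

```python
import numpy as np
import pandas as pd

# Toy student-level data; every name and number here is invented for illustration.
df = pd.DataFrame({
    "prior_score":   [480, 510, 530, 455, 500, 525, 470, 515, 490],
    "current_score": [500, 525, 560, 450, 520, 540, 465, 535, 505],
    "teacher":       ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
})

# 1. Predict each student's current score from prior achievement
#    (a one-predictor OLS fit; real models use several prior years and covariates).
slope, intercept = np.polyfit(df["prior_score"], df["current_score"], 1)
df["predicted"] = intercept + slope * df["prior_score"]

# 2. The difference between actual and predicted scores is the student's growth score.
df["growth"] = df["current_score"] - df["predicted"]

# 3. A teacher's value-added estimate is the average growth of his or her students.
value_added = df.groupby("teacher")["growth"].mean()

# 4. Teachers are ranked against one another; the percentile rank is the "score."
percentile = value_added.rank(pct=True).mul(100).round(1)

print(pd.DataFrame({"value_added": value_added.round(2), "percentile": percentile}))
```

The sketch captures only the predicted-minus-actual logic described above; it omits the measurement-error adjustments and Empirical Bayes shrinkage that real systems apply.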
Additional Sources on VAM

Schochet, P. Z., & Chiang, H. S. (2010). Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains (NCEE 2010-4004). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Addresses likely error rates for measuring teacher and school performance using value-added models applied to student test score gain data. Error rate formulas are based on OLS and Empirical Bayes estimators. Simulation results suggest that value-added estimates are likely to be noisy: Type I and II error rates for comparing a teacher’s performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data. Corresponding error rates for overall false positive and negative errors are 10 and 20 percent, respectively. Lower error rates can be achieved if schools are the performance unit. A toy simulation at the end of this section illustrates why averaging more years of data reduces these error rates.

Corcoran, S. P. (2010). Can Teachers be Evaluated by their Students’ Test Scores? Should They Be? The Use of Value-Added Measures of Teacher Effectiveness in Policy and Practice. Annenberg Institute for School Reform at Brown University. http://www.scribd.com/doc/37648467/The-Use-of-ValueAdded-Measures-of-Teacher-Effectiveness-in-Policy-and-Practice
Using data from NYC and Houston, Corcoran brings the problems of VAM to life. We learn about “Mark Jones, ranked at the 43rd percentile among eighth grade teachers in math. Taking into account uncertainty in this estimate, however, his range of plausible rankings runs from the 15th to the 71st percentile. Based on his last four years of results, Mr. Jones ranked in the 56th percentile, though his plausible range still extends from the 32nd percentile to the 80th, which overlaps the ‘average’ and ‘above average’ performance categories. Taking his two sets of results together, we have learned that Mark Jones is most likely an average teacher, ranking somewhere around the 43rd and 56th percentile citywide, who may be below average, average, or above average. Accounting for uncertainty, we can’t rule out a twenty-point margin in either direction. It is unclear to this author what Mr. Jones or his principal can do with this information to improve instruction or raise student performance.”

Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. The Quarterly Journal of Economics, 125(1), 175-214.
Finds that a student’s fifth-grade teacher has large effects on her fourth-grade achievement, a technical impossibility given that the student has not yet advanced to the fifth grade. Rothstein suggests that this finding may be due to “dynamic tracking,” in which a student’s assignment to a fifth-grade teacher depends on her fourth-grade experience. When such assignment occurs, it biases measures of value-added.

Sanders, W. L., & Horn, S. P. (1994). The Tennessee value-added assessment system (TVAAS): Mixed-model methodology in educational assessment. Journal of Personnel Evaluation in Education, 8(3), 299-311.
Explains the rationale behind and the model used in the TVAAS, the nation’s first VAM accountability system.

Hanushek, E. A., & Rivkin, S. G. (2010). Generalizations about using value-added measures of teacher quality. The American Economic Review, 100(2), 267-271.
Hanushek and Rivkin, two major researchers who generally favor the use of value-added measures, explain the basic VA model and what it is good for.
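As noted in the Schochet and Chiang summary above, misclassification rates fall as more years of data are averaged. The toy Monte Carlo below illustrates only that pattern; the spread of true teacher effects and the single-year noise level are invented numbers, and the sketch does not reproduce the report’s OLS or Empirical Bayes error-rate formulas.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_teachers = 200_000
true_sd = 1.0    # assumed spread of true teacher effects (invented)
noise_sd = 2.0   # assumed noise in a single year's estimate (invented)

def misclassification_rate(n_years: int) -> float:
    """Share of truly above-average teachers whose averaged estimate
    nonetheless comes out at or below average."""
    true_effect = rng.normal(0.0, true_sd, n_teachers)
    # Averaging n_years of independent noisy estimates shrinks the noise by sqrt(n_years).
    estimate = true_effect + rng.normal(0.0, noise_sd / np.sqrt(n_years), n_teachers)
    above_average = true_effect > 0
    return float(np.mean(estimate[above_average] <= 0))

for years in (1, 3, 10):
    print(f"{years} year(s) of data: misclassification rate ≈ {misclassification_rate(years):.0%}")
```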