VAM Overview

Description: “a collection of complex statistical techniques that use multiple years of
students’ test score data to estimate the effects of individual schools or teachers”
(McCaffrey, Lockwood, Koretz, & Hamilton, 2003, p. xi).
Students’ previous test scores are used to create predicted test scores for a given year. The
difference between a student’s actual and predicted test score is that student’s growth score. A
teacher’s contribution to students’ learning is estimated by averaging the growth scores of all
of his or her students. Teachers are then ranked against other teachers within a
district (or other unit of interest) according to how much they contributed to students’
growth, and this ranking is their value-added “score” (Goe, 2008).
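The steps in this description can be sketched in a few lines of Python. This is a minimal illustration only, assuming a single prior-year score per student and a simple straight-line prediction; operational VAMs use multiple years of data and far more elaborate statistical models. The function names (`fit_line`, `value_added`, `rank_teachers`) and the nine student records are hypothetical and made up for the example.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Returns each teacher's average growth score, where a growth score is
    the student's actual score minus the score predicted from the prior year.
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    intercept, slope = fit_line(priors, currents)

    growth = defaultdict(list)
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        growth[teacher].append(current - predicted)
    return {teacher: mean(g) for teacher, g in growth.items()}


def rank_teachers(scores):
    """Teachers ordered from highest to lowest average growth."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: three teachers with three students each.
records = [
    ("A", 50, 58), ("A", 60, 69), ("A", 70, 77),
    ("B", 50, 52), ("B", 60, 61), ("B", 70, 73),
    ("C", 50, 55), ("C", 60, 65), ("C", 70, 75),
]
scores = value_added(records)
ranking = rank_teachers(scores)
print(scores)
print(ranking)
```

Because the ranking is relative, a teacher's position depends on who else is in the comparison group as well as on his or her own students' growth.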
Simple, video explanation:
Strengths:
• Arguably the single best predictive measure of student learning
• Considered more “objective” and less subject to bias
• After initial costs, relatively cheap
• A direct measure of at least part of the outcome we ultimately care about – student learning
• Controls for prior achievement, and can control for SES/demographics of students (which status measures such as percent proficient do not)
• Different VAM models tend to be highly correlated with one another, suggesting that estimates are not highly model dependent (Lockwood et al., 2007)
• As better tests of student learning become available (Common Core), VAM estimates might be expected to better approximate deep student learning
• Easy to understand, appealing
• High face validity
Weaknesses:
• Threaten to drown out any attention to other measures of teaching quality or student learning because this is “the best”
• Highly unstable from year to year
• Highly imprecise: the plausible range for a teacher ranked at the 43rd percentile runs from the 15th to the 71st percentile (Corcoran, 2010)
• Highly reliant on what test is used
• Missing data creates problems for VAM; provides incentives to ignore new students
• Can only be applied to about 30% of teachers
• Cannot account for other differences such as out-of-school learning (summer, after-school, parental involvement)
• Teachers estimated to account for about 10% of variance in students’
• Cannot account for non-random sorting of students to teachers (selection effects)
• Incentivize increased attention to standardized test outcomes
Additional Sources on VAM
Schochet, P. Z., & Chiang, H. S. (2010). Error rates in measuring
teacher and school performance based on student test score gains (NCEE
2010-4004). Washington, DC: National Center for Education Evaluation and
Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Addresses likely error rates for measuring teacher and school performance using
value-added models applied to student test score gain data. Error rate formulas
are based on OLS and Empirical Bayes estimators. Simulation results suggest that
value-added estimates are likely to be noisy: Type I and II error rates for comparing a
teacher’s performance to the average are likely to be about 25 percent with three
years of data and 35 percent with one year of data. Corresponding error rates for
overall false positive and negative errors are 10 and 20 percent, respectively. Lower
error rates can be achieved if schools are the performance unit.
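The intuition that more years of data reduce noise can be illustrated with a toy simulation. This sketch does not use the report's error rate formulas and is not meant to reproduce its figures: it assumes normally distributed true teacher effects and yearly noise, with arbitrary standard deviations, and simply counts how often a multi-year average lands on the wrong side of "average." The function name `wrong_side_rate` and all parameter values are hypothetical.

```python
import random


def wrong_side_rate(n_teachers, n_years, sd_true, sd_noise, seed=0):
    """Share of simulated teachers whose estimated effect has the wrong sign.

    Each teacher has a true effect (0 = average). Each year's estimate is the
    true effect plus independent noise; the final estimate is the mean of the
    yearly estimates. All distributions and parameters are assumptions.
    """
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_teachers):
        true_effect = rng.gauss(0.0, sd_true)
        yearly = [true_effect + rng.gauss(0.0, sd_noise) for _ in range(n_years)]
        estimate = sum(yearly) / n_years
        # Count teachers placed on the wrong side of average.
        if (estimate > 0) != (true_effect > 0):
            wrong += 1
    return wrong / n_teachers


# Arbitrary illustrative setting: yearly noise twice as large as the
# spread of true teacher effects.
rate_one_year = wrong_side_rate(20000, 1, sd_true=1.0, sd_noise=2.0)
rate_three_years = wrong_side_rate(20000, 3, sd_true=1.0, sd_noise=2.0)
print(rate_one_year, rate_three_years)
```

Under these assumptions the three-year rate comes out lower than the one-year rate, since averaging over years shrinks the noise while leaving the true effect unchanged. The specific values depend entirely on the assumed noise level.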
Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores?
Should they be? The use of value-added measures of teacher effectiveness in
policy and practice. Annenberg Institute for School Reform at Brown
University.
Using data from NYC and Houston, Corcoran brings the problems of VAM to life. We
learn about “Mark Jones, ranked at the 43rd percentile among eighth grade teachers
in math. Taking into account uncertainty in this estimate, however, his range of
plausible rankings range from the 15th to the 71st percentile. Based on his last four
years of results, Mr. Jones ranked in the 56th percentile, though his plausible range
still extends from the 32nd percentile to the 80th, which overlaps the “average” and
“above average” performance categories. Taking his two sets of results together, we
have learned that Mark Jones is most likely an average teacher, ranking somewhere
around the 43rd and 56th percentile citywide, who may be below average, average,
or above average. Accounting for uncertainty, we can’t rule out a twenty-point
margin in either direction. It is unclear to this author what Mr. Jones or his principal
can do with this information to improve instruction or raise student performance.”
Rothstein, J. (2010). Teacher quality in educational production: Tracking,
decay, and student achievement. The Quarterly Journal of Economics, 125(1),
Finds that a student’s fifth-grade teacher has large effects on her fourth-grade
achievement, a technical impossibility given that the student has not yet advanced
to the fifth grade. He suggests that this finding may be due to “dynamic tracking,”
where a student’s assignment to a fifth-grade teacher depends on their fourth-grade
experience. When such assignment occurs, it biases measures of value-added.
Sanders, W. L., & Horn, S. P. (1994). The Tennessee value-added assessment
system (TVAAS): Mixed-model methodology in educational
assessment. Journal of Personnel Evaluation in Education, 8(3), 299-311.
Explains the rationale behind and the model used in the TVAAS, the nation’s first
VAM accountability system.
Hanushek, E. A., & Rivkin, S. G. (2010). Generalizations about using value-added measures of teacher quality. The American Economic Review, 100(2),
Hanushek and Rivkin, two major researchers who generally favor the use of value-added measures, explain the basic VA model and what it is good for.