September 16, 2010
Sean Patrick Corcoran
New York University
Annenberg Institute for School Reform
NEW YORK -- Value-added assessments of teacher effectiveness are a “crude indicator” of the contribution teachers make to their students’ academic outcomes, asserts Sean P. Corcoran, assistant professor of educational economics at New York University’s Steinhardt School of Culture, Education and Human Development and research fellow at the Institute for Education and Social Policy. His paper was issued today as part of the “Education Policy for Action” series of research and policy analyses by scholars convened by the Annenberg Institute for School Reform at Brown University.
“The promise that value-added systems can provide a precise, meaningful and comprehensive picture is much overblown,” argues Corcoran, whose research report is entitled Can Teachers Be Evaluated by Their Students’ Test Scores? Should They Be? The Use of Value-Added Measures of Teacher Effectiveness in Policy and Practice. “Teachers, policy-makers and school leaders should not be seduced by the elegant simplicity of value-added measures. Policy-makers, in particular, should be fully aware of their limitations and consider whether their benefits outweigh their costs.”
Value-added models, the centerpiece of a national movement to evaluate, promote, compensate and dismiss teachers based in part on their students’ test scores, have proponents throughout the country, including school systems in New York City, Chicago, Houston and Washington, D.C. In theory, a teacher’s “value-added” is the unique contribution he or she makes to students’ achievement that cannot be attributed to any other current or past student, family, teacher, school, peer or community influence. In practice, states Corcoran, it is exceptionally difficult to isolate a teacher’s unique effect on academic achievement.
“The successful use of value-added requires a high level of confidence in the attribution of achievement gains to specific teachers,” he says. “Given one year of test scores, it's impossible to distinguish between the teacher’s effect and other classroom-specific factors. Over many years, the effects of other factors average out, making it easier to infer a teacher’s impact. But this is little comfort to a teacher or school leader searching for actionable information today.”
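The averaging argument can be illustrated with a toy simulation. This sketch is purely illustrative: the noise level and the ten-year window are assumptions for demonstration, not figures from Corcoran’s report.

```python
import numpy as np

rng = np.random.default_rng(0)

n_teachers = 10_000
noise_sd = 0.25  # assumed classroom-level noise, in student-score SD units

# Single-year estimates: each teacher's score carries one year's
# classroom-specific noise, indistinguishable from the teacher's effect.
one_year_noise = rng.normal(0.0, noise_sd, n_teachers)

# Ten-year averages: independent yearly noise shrinks by sqrt(10),
# so estimates drift toward each teacher's true effect.
ten_year_noise = rng.normal(0.0, noise_sd, (n_teachers, 10)).mean(axis=1)

print(f"single-year noise SD: {one_year_noise.std():.3f}")      # ~0.25
print(f"ten-year average noise SD: {ten_year_noise.std():.3f}")  # ~0.08
```

The roughly threefold reduction in noise is what makes multi-year estimates more trustworthy, and also why, as Corcoran notes, a single year of data offers little actionable information.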
In October 2009, the National Academies’ National Research Council issued a statement that applauded the Department of Education’s proposed use of assessment systems that link student achievement to teachers in Race to the Top initiatives, but cautioned against the use of value-added approaches for evaluation purposes, noting that “too little research has been done on these methods’ validity to base high-stakes decisions about teachers on them.”
Corcoran’s research examines the value-added systems used in New York City’s Teacher Data Reports and Houston’s ASPIRE program (Accelerating Student Progress, Increasing Results and Expectations). Among his concerns, he concludes that the standardized tests used to support these systems are inappropriate for value-added measurement.
“Value-added assessment works best when students are able to receive a single numeric test score every year on a continuous developmental scale,” states Corcoran, meaning that the scale does not depend on grade-specific content but rather progresses across grade levels. Neither the Texas nor the New York state test was designed on such a scale. Moreover, the set of skills and subjects that can be adequately assessed in this way is remarkably small, he argues, suggesting that value-added systems will ignore much of the work teachers do.

“Not all subjects are or can be tested, and even within tested subject areas, only certain skills readily conform to standardized testing,” he says. “Despite that, value-added measures depend exclusively on such tests. State tests are often predictable in both content and format, and value-added rankings will tend to reward those who take the time to master the predictability of the test.”
In practice, the biggest obstacle to value-added assessments is their high level of imprecision, he argues.
“A teacher ranked in the 43rd percentile on New York City’s Teacher Data Report may have a range of possible rankings from the 15th to the 71st percentile after taking statistical uncertainty into account,” says Corcoran. He finds that the majority of teachers in New York City’s Teacher Data Reports cannot be statistically distinguished from 60 percent or more of the other teachers in the district.
“With this level of uncertainty, one cannot differentiate between below average, average, and above average teachers with confidence. At the end of the day, it isn’t clear what teachers and their principals are supposed to do with this information.”
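A back-of-the-envelope sketch shows how a modest standard error on a value-added score translates into a wide percentile band. The 0.4 SD standard error below is an assumption chosen for illustration, and the scores are assumed to be normally distributed across teachers; this is not the model behind the Teacher Data Reports.

```python
from statistics import NormalDist

nd = NormalDist()  # teacher value-added scores assumed standard normal

# Hypothetical teacher: point estimate at the 43rd percentile,
# with an assumed standard error of 0.4 SD (illustrative only).
point_z = nd.inv_cdf(0.43)
se = 0.4

lo_z = point_z - 1.96 * se  # 95% confidence bounds in score units
hi_z = point_z + 1.96 * se

# Map the score-unit interval back into percentile ranks.
print(f"point estimate: 43rd percentile")
print(f"95% interval: {nd.cdf(lo_z):.0%} to {nd.cdf(hi_z):.0%} percentile")
```

Even this mild amount of noise spreads a middling point estimate across most of the distribution, which is the pattern Corcoran describes in the New York City reports.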
Corcoran grants that some uncertainty is inevitable in value-added measurement but questions whether value-added measures are precise enough to be useful in high-stakes decision-making or even for professional development. Novice teachers have the most to gain from performance feedback, he contends, yet value-added scores for these teachers are the least reliable.
The notion that a statistical model could isolate each teacher’s unique contribution to their students’ educational outcomes is a powerful one, acknowledges Corcoran. With such information, one could not only devise systems that reward teachers with demonstrated records of classroom success and remove teachers who do not, but also create a school climate in which teachers and principals work constructively with their test results to make positive instructional and organizational changes.
“Few can deny the intuitive appeal of these tools,” says Corcoran. “Teacher quality is an immensely important resource, and research has found that teachers can and do vary in their effectiveness. However, these evaluation tools have limitations and shortcomings that are not understood or apparent to interested stakeholders, or even to value-added advocates.”
Adds Corcoran: “Research on value-added remains in its infancy, and it is likely that these methods and the tests on which they are based will continue to improve over time. The simple fact that teachers and principals are receiving regular and timely feedback on their students’ achievement is an accomplishment in and of itself. It’s hard to argue that stimulating conversation around improving student achievement is not a positive thing, but teachers, policy-makers, and school leaders should not be seduced by the simplicity of value-added.”