Assessing Teacher-Led Reform:
Using Measures of Accountability Beyond Test Scores
By Jackie Bennett, Christina Collins, Maisie McAdoo and Rhonda Rosenberg
United Federation of Teachers Research Department
Introduction
The New York City school system, under a progressive new mayor and his veteran-educator chancellor, has taken several steps to enhance teachers’ roles in decision-making, in
schools as well as classrooms. The chancellor also has pressed for greater collaboration within
and between schools and emphasized the importance of measures of school quality beyond
standardized test scores in school accountability processes. This reverses the approach of former
Mayor Michael Bloomberg and his chancellor, Joel Klein, who emphasized principal leadership
and used standardized test results as the primary means to judge school and teacher performance.
These changes are new, but the stakes are high. The NYC school system, like many
others, has huge challenges, limited budgets, a high-needs student population and pockets of
evident failure. The need for reform is indisputable, but what the mayor and chancellor are
doing, in collaboration with the teachers’ and principals’ unions, runs counter to the national
education reform agenda.
The education research community will rightly ask if this approach works, and
researchers will seek to answer that question sooner rather than later. Based on early feedback
from schools and teachers involved in some of the new reforms, we suggest that educational
researchers join us in exploring a series of indicators beyond standardized test scores. These
indicators can identify quantifiable and qualitative changes that take place in school climate,
teacher effectiveness and student well-being. Drawing from initial data gathered during the
launch of these reforms, this paper concludes that in assessing these new teacher-led reforms,
measures should be used which allow researchers, policy-makers, and school stakeholders to do
the following:
1) Define and discuss student academic achievement using measures other than
standardized test scores;
2) Examine students’ non-academic outcomes, including social-emotional growth;
3) Assess levels of collaboration between teachers and levels of parent engagement;
4) Measure the quality of implementation of innovations;
5) Access data which is useable and useful during the current classroom year.
Scholarship which looks at these types of measures, we argue, is essential for the implementation
of truly effective school improvement efforts. It is urgent that the research community work
towards more consistent inclusion of such measures in scholarship and policy research.
Measurement under Teacher Leadership
The urgent need to expand measures of school and teacher success in New York City
came into focus as the new administration sought to make teachers the leaders of school change.
What data could these new leaders use to analyze and assess their changes? At the classroom
level, once-yearly standardized test results are of minimal usefulness. Even at the school level,
test-score-based snapshots do not produce granular pictures of schools or point a way to
improvement. The advent of new teacher-leader positions under the new administration brought
with it a need for more detailed and useful ways to examine student outcomes and make
decisions.
The most radical of the new teacher leader reforms has been the launch of the five-year
Progressive Redesign Opportunity Schools for Excellence (PROSE) program in 2014. There are
currently 62 PROSE schools with another 40 or so to be added in 2015-16. They are selected by
a joint panel of district and union representatives and are required to have a strong record of
collaborative practices and respect for teacher “voice” or input.
The result of negotiations between the city, the Department of Education and the teachers
union, PROSE permits schools, with a 65 percent faculty vote, to alter DOE or union regulations
in pursuit of improvement. Teachers and leaders in PROSE schools have wide latitude to change
scheduling, hire and fire, plan curriculum and conduct professional development, as long as the
school meets performance targets chosen by the school and the joint panel. An “Option PROSE”
allows some schools to substitute peer observations for part of the state-mandated teacher
evaluation process.
Though some NYC schools have been using teacher-leader practices for years, the
deliberate institutionalizing of teacher leadership is a new reform on a system level. In addition
to PROSE, the systemic teacher leadership changes include identifying and compensating
specific teacher-leader roles such as mentor, master and "ambassador" teachers, who may be
released from some teaching duties to observe and help train colleagues.
The new chancellor has also moved to replace many former Department of Education
employees, mainly lawyers and administrators, with educators or education administrators. She
has mandated that principals must have seven years of teaching experience versus the former
three-year requirement. And she has championed teacher-to-teacher professional development
over expert-led training.
Outputs, Inputs and Standardized Tests
In order to measure these reforms we have to address the inputs vs. outputs issue. Many
researchers and reformers in recent years have criticized assessments of school improvement
programs which focus primarily on “inputs” (how much money the state and city invest in
schools, how much cutting-edge technology they deliver, or even teachers’ experience and
academic qualifications), arguing that such measures don’t truly matter in judging whether a
given reform has been successful or not in raising student achievement. (Hanushek, Levin cites)
Instead, discussions of reform have been dominated by the widespread conviction that
“outputs,” as defined almost exclusively by standardized measures of student achievement, are
the only really objective, hard-nosed way to demonstrate education success. But as weighing the
pig doesn’t fatten it, neither do measures of changes in test scores necessarily lead even to
improved test scores, let alone to other increases in desired outputs. Driving reform by measuring
test score outcomes does not tell teachers or policy-makers how to get there. (More worrisome,
over the last decade there have been too many instances of unintended consequences, such as
excessive test-prep, abandonment of non-tested subjects and cheating scandals.)
Further, cause and effect, in terms of a score on a test given once a year, are almost
impossible to disentangle in schools, where so many parts are in simultaneous motion. A simple
value-added or growth measure of test-score outcomes is important but inadequate in assessing
the reforms being put into place in New York City. There is the inherent challenge of claiming
clear statistical significance and causality (based on correlations in either direction) in an
education reform context in which many policy changes are being implemented simultaneously.
How can test scores identify the impact of new teacher roles separately from other reforms
taking place, such as new state-mandated Common Core Learning Standards or redesigned
graduation requirements?
In addition, reporting and analysis of test score outcomes may take too long to be useful
to practitioners and policy-makers; in a one- or two-year data sample, changes of a point or two
in either direction are necessarily inconclusive. Strong evidence of success or failure of a given
reform requires several years of fairly conclusive movement in one direction or another on a
stable and predictable test. Teachers and policy-makers at the ground-level increasingly see this
timeline and the limited data available from state-wide standardized test results as useless, at
best, for the day-to-day work of improving the educational experience of their students – and, at
worst, as a distraction from or misrepresentation of the true teaching and learning happening in
their classrooms and schools.
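To make the point about small changes concrete, the following is a minimal sketch in Python and is not drawn from the reforms themselves. It assumes two independent cohorts of equal size with the same student-level standard deviation; the function name and every number in it are hypothetical and chosen only for illustration.

```python
import math

def change_z_score(change, student_sd, n_students):
    """Express a year-to-year change in a school's mean score in standard errors.

    Simplifying assumptions (ours, not the paper's): each year tests
    n_students independent students with the same spread of scores.
    """
    se_one_year = student_sd / math.sqrt(n_students)  # standard error of one year's mean
    se_change = math.sqrt(2) * se_one_year            # standard error of the difference of two means
    return change / se_change

# Hypothetical school: 100 tested students, student-level SD of 35 scale points.
z = change_z_score(change=2.0, student_sd=35.0, n_students=100)
print(round(z, 2))
```

Under these assumptions a two-point change amounts to well under one standard error, which is why a single year's movement of that size cannot be distinguished from ordinary cohort-to-cohort variation.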
A new paradigm for measuring success is necessary to move forward in school
improvement research.
Alternative Measures of Student Growth and Program Success
On the national level, much of the recent interest in alternatives to standardized test
scores as the default measure of student academic achievement and growth has been driven by
changes to teacher evaluation policy. In response to federal guidelines and incentives in the
Race to the Top program which required states to include student learning measures in their
teacher evaluation formulas, multiple states have sought to develop measures of growth for
teachers who are in subjects and grades outside of those for which value-added measures are
available - generally, teachers other than those in grades 4-8 and who teach subjects other than
English Language Arts or Math. One of the most commonly adopted alternatives to value-added
measures is Student Learning Objectives, which researchers have begun to focus on more
frequently over the past two years as states which were early adopters of this method have both
released initial results and begun to adjust their original policies in response to
implementation challenges (Gill et al 2014; Lacireno-Paquet et al 2014).
Other researchers have used correlations between student growth scores and non-academic
measures as evidence for increased use of those alternative measures as a necessary
complement to standardized test results in teacher evaluation and policy and program evaluation.
The best known of these in recent years has likely been the Gates-funded Measures of Effective
Teaching Study, which found that measures such as student surveys and teachers’ ratings on
observation rubrics had consistently significant (although relatively small) correlations with
student growth scores. The study recommended including such measures in teacher evaluation
formulas in order to increase the ratings’ usefulness in shaping classroom practice and
district- and school-wide decision-making.
There is also a growing body of research linking measures of increased collaboration at the
school level with increased student academic achievement. Researchers such as
Anthony Bryk and his colleagues in Chicago have used novel measures of collaboration and trust
in conjunction with more familiar measures of student learning. They argue that consistently
tracking and discussing the inputs occurring in a classroom and school is as important as
high-quality analysis of the data on student achievement outputs if teachers and
policy-makers are to have useable information for changes to practice. (Quintero 2014, Bryk;
Daly)
Current Alternative Measures in Use in NYC
In New York City, we already have a large body of data in place which offers researchers
and policy-makers alternative measures of school performance and of reform implementation,
past and present. The challenge in the past has been that these measures were largely
overshadowed by the importance of standardized test performance and growth in the previous
administration’s heavily-weighted “Progress Report” accountability system, assigning A-F
grades for each school. The new DOE administration has been gradually shifting away from this
system into a more nuanced portrayal of school performance which adapts many of the measures
below using a “Capacity Framework” based on Anthony Bryk’s research on school success in
Chicago. (Bryk cite)
One large source of data is a yearly survey of parents, teachers and (middle and high
school) students at each school that collects information on academic expectations, learning
climate, leadership and parent engagement. These surveys, which have been given to all public
school teachers, parents and upper-grades students for the last eight years, can be analyzed
quantitatively. The responses of parents, teachers and students are reported separately and
provide nuanced information that can be cross-referenced.
School Quality Reviews are conducted by education experts (though not every school is
assessed every year) and rate schools’ curriculum, instructional practices, learning environment
and professional collaborations on a four-point scale.
Peer School Comparisons—DOE groups schools into peer groups that share similar
entering test scores, percentages of special education and high-needs special education students,
and some other factors to create groups of similar schools. Schools are assessed against their
peers on test-score growth, credit accumulation, attendance, graduation, post-secondary
enrollment and other factors.
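The peer-comparison idea can be sketched in a few lines of Python. This is an illustration only, not the DOE's actual method: the field names, the distance formula, the scaling constants and all school records are hypothetical assumptions made for the example.

```python
from statistics import mean

# Hypothetical records; every value is invented for illustration.
schools = [
    {"name": "A", "entering_score": 2.0, "pct_special_ed": 20.0, "attendance": 93.0},
    {"name": "B", "entering_score": 2.1, "pct_special_ed": 22.0, "attendance": 92.0},
    {"name": "C", "entering_score": 2.2, "pct_special_ed": 18.0, "attendance": 88.0},
    {"name": "D", "entering_score": 3.5, "pct_special_ed": 5.0, "attendance": 96.0},
    {"name": "E", "entering_score": 3.4, "pct_special_ed": 6.0, "attendance": 95.0},
]

SCORE_RANGE = 3.0   # assumed width of the entering-score scale
PCT_RANGE = 100.0   # percentages run from 0 to 100

def peer_group(target, all_schools, k=2):
    """Return the k schools closest to target on intake characteristics."""
    def distance(other):
        return (abs(target["entering_score"] - other["entering_score"]) / SCORE_RANGE
                + abs(target["pct_special_ed"] - other["pct_special_ed"]) / PCT_RANGE)
    others = [s for s in all_schools if s["name"] != target["name"]]
    return sorted(others, key=distance)[:k]

def gap_vs_peers(target, all_schools, outcome, k=2):
    """Difference between a school's outcome and the mean outcome of its peers."""
    peers = peer_group(target, all_schools, k)
    return target[outcome] - mean(p[outcome] for p in peers)

school_a = schools[0]
print([p["name"] for p in peer_group(school_a, schools)])
print(gap_vs_peers(school_a, schools, "attendance"))
```

The design choice worth noting is that peers are chosen on intake characteristics only, so the outcome being compared (here attendance) plays no part in deciding who counts as a peer.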
Local Measures of Student Learning (MOSL)—As part of statewide teacher evaluations,
teachers, in discussion with their principals, currently set goals for local ‘measures of student
learning,’ which can include student engagement, quality of questioning, mastery of a skill and
the like. In schools with strong teacher leadership these aspects of the evaluation can provide
evidence of success.
Thanks to New York State’s and New York City’s many advances in education data
collection and reporting over the last decade, researchers also have access to many other
aggregate (or aggregate-able) measures; although access to this data is sometimes restricted by
student privacy laws and other regulations, many of these data points have been collected for all
schools for multiple years.
They include but are not limited to:
1) Academic measures such as grades; rates of grade promotion; credit accumulation;
college attendance and persistence; and student portfolio assessment;
2) Indicators of “conditions for student growth,” such as attendance; suspensions; school
safety incidents; survey reports on school environment; parent participation rates;
enrollment and retention of low-income or high-needs students, such as English language
learners, special education students or students in temporary housing.
3) Measures of teacher effectiveness and persistence: teacher turnover; teacher experience
and longevity; teacher leadership (grade or department chairs, mentor and master
teachers); participation on School Leadership Teams; participation in curriculum writing;
and state teacher evaluation ratings.*
*State-mandated teacher evaluations, new this year, will also include principal and teacher
observational metrics that may be accessible if privacy issues can be resolved.
New Measures
Teachers' accounts of how their daily practice changes under new policies should be at
the core of assessing instructional and curricular reforms, and yet little research has focused on
this key question of implementation at the classroom level. The United Federation of Teachers
has been active in seeking to explore and expand the use of alternative measures of school and
teacher success, with the goal of increasing the use of measures which are instructionally
relevant and timely as well as being rigorous and comparable across classrooms.
Teachers’ first-person accounts of their efforts are part of the raw data available in a
database of published interviews with teachers from the New York Teacher and other union
publications and reports. The union also has begun use of an annual survey which uses a
representative sampling technique to assess members’ experiences and opinions on a range of
subjects not covered by the district survey. The second iteration of this survey is scheduled to be
distributed and analyzed in Spring 2015.
In addition, the launch of the PROSE program has allowed the union and district to work
collaboratively with each other and with the staff and leadership of the sixty-two schools in the
first PROSE cohort to explore and identify measures of success which the schools believe will be
especially relevant in gauging their implementation of a range of innovations, including changes
to teacher evaluation, scheduling and calendar changes, and school and city policies for grading
and credit accumulation.
As part of their initial application to the program in June 2014, PROSE applicants were
asked to identify which measures their schools either currently used or were interested in
exploring as ways to measure the success of innovative practices. Several of the schools are
members of the New York Performance Standards Consortium. These schools use Performance
Based Assessment Tasks (PBATs), in which students present an individually selected project or
portfolio to a panel of teachers and other adults at the end of the year. Consortium schools and
other PROSE schools proposed dozens of different measures, summarized in the attached table
and excerpted below.
Measures identified by the PROSE schools as important to their definitions of success
included the following categories and selected examples of measures other than state tests:
Academic Measures
• Student portfolios
• Middle- and high-school admissions
• High school credit accumulation
• Student promotion and graduation
• Enrollment in advanced courses
• College admission and retention
Student Non-Academic Measures
• Behavior
• Attendance and retention
• Social-emotional well-being
• Enrollment and retention of high-need students
• Number and diversity of student applicants
Other Measures
• Teacher retention
• Teacher performance (assessed by school leaders and peer teachers)
• Administrator performance
• Support for collaboration within school
• Collaboration with other schools
• Parent participation and satisfaction
One specific example of an innovation being pursued by a number of PROSE schools is
the mastery-based model of learning and grading. Students are given the opportunity to show
learning by documenting mastery of a body of knowledge at an individualized level rather than
within a standardized curriculum pacing model. PROSE has offered a subset of these schools the
opportunity to experiment with different ways of scheduling student classes and reporting
students’ grades and credit accumulation to the district as part of a pilot program this spring.
Standardized methods of tracking student credit accumulation or test results would not
provide adequate data to assess the implementation and success of this pilot program, especially
since many of these schools are already using school-specific student learning management
programs to track students’ progress in mastery-based courses. Such practices make it
imperative for researchers and policy-makers to turn to the schools themselves for accurate
information about what is happening at the school and classroom level and how it has impacted
their students’ experiences.
Using the data—Theory of Change
Many PROSE and other schools are currently tracking or are interested in finding ways to
track data points which are not required by or available within city or state accountability
systems, but which they have identified as important to student success at their schools and
which many use for making instructional decisions on a regular basis during the school year.
The PROSE leadership team, composed of union and DOE representatives, is working
closely with local staff and with schools to determine how best to use this data to inform and
improve the implementation of the program and the school-level innovations. In addition to
continuing to identify and refine the measures above, we have begun to develop a Theory of
Change (TOC) model for use in helping stakeholders go through a process of linking
implementation of PROSE innovations to their aims for student learning and growth.
The leadership team is also developing a TOC for the program as a whole. By making the
assumed links between a given practice at the school level and teacher or student outcomes more
explicit, the TOC process has helped us understand which measures are most important for
determining the innovations that could have the greatest impact on the schools we are working
with and the students they serve. In March 2015, we will be launching a Participatory Action
Research project to work closely with seven PROSE schools to develop a Theory of Change and
help them identify which measures of success will be most relevant and useful in gauging the
impact of the innovations they are implementing.
PROSE is just a single example of an innovative and teacher-led school improvement
effort in which standardized tests cannot capture all the information necessary to gauge success
in either implementation or results. Researchers interested in fuller and more nuanced
examinations of the success of school and district reforms should take advantage of the similar
engagement with self-selected measures of success at many schools to shape their research
questions and strategies. In doing so, they will both be breaking new ground in the field of
educational research methods and providing teachers, schools, and other stakeholders in
education the information they need to truly improve student learning.