Considerations for Cost of Implementation

advertisement
DRAFT – Please do not cite or reproduce without permission of the author.
Educator Evaluation Systems: Considerations for Cost of Implementation
Elizabeth A. Barkowski
Research Associate
Value-Added Research Center
Wisconsin Center for Education Research
University of Wisconsin Madison
barkowski@wisc.edu
Association for Education Finance and Policy
38th Annual Conference
New Orleans, Louisiana
March 14, 2013
1
DRAFT – Please do not cite or reproduce without permission of the author.
Educator Evaluation Systems: Considerations for Cost of Implementation
Introduction and Policy Context
The advent of new educator evaluation systems, spurred by state-level legislative changes
as well as federal initiatives such as ESEA waivers and Race to the Top, is driving states and
school districts to seek new and innovative ways to measure educator effectiveness. To date, 36
states have passed legislation and/or engaged in efforts to revamp educator evaluation systems
(Education Week, 2013). These new evaluation systems generally use multiple measures of
educator effectiveness, such as observations of teacher and principal practice and measures of
student growth, to generate an educator’s overall evaluation rating. The ultimate goal of these
initiatives is to identify educator strengths and weaknesses to help inform strategic human capital
management decisions.
While reforming educator evaluation systems has gained traction across the nation, the
development and implementation of such systems requires time, effort, and money. Related costs
could include the development of new human resource data management systems, the cost of
developing and implementing metrics to evaluate teacher and principal practice, the calculation
of value-added, and the implementation of other student growth measures. The direct costs of
implementing educator evaluation systems may be low for some states and school districts;
however, stakeholders should consider hidden, or indirect, costs comprised of the time and effort
needed to implement such systems. This paper discusses the direct and indirect costs associated
with the implementation of new educator evaluation systems and uses the implementation of
student growth measures for traditionally non-tested teachers as a case to discuss specific costs.
2
DRAFT – Please do not cite or reproduce without permission of the author.
Educator Evaluation System Overview
Educator evaluation systems in most states and school districts typically base an
educator’s evaluation rating on measures of practice and student growth. Figure 1 displays the
typical breakdown of state educator evaluation system. The percentage, or weight, of the various
components of an educator’s evaluation varies by state; however, states typically base 50% on
measures of practice and 50% on student growth. Regardless of the weight, the cost of
implementing each component would be similar.
Figure 1. Basic Educator Evaluation System
Student Growth
(ranges from 20 – 51%)
50%
50%
Educator Practice
(observation)
Educator Evaluation System Components and Associated Tasks
Given that the various components included within new educator evaluation system are
typically unfamiliar to teachers, principals, and school district central office staff, states and
school districts must take time to develop or procure, train on, and implement each component of
the system. This section describes the steps involved in the development and implementation of
educator evaluation system components and identifies potential costs associated with each.
Developing educator evaluation systems. Some states have passed legislation and
appropriated funds to develop new educator evaluation systems. This process often involves the
creation of a group or division within the state education agency to manage the development and
3
DRAFT – Please do not cite or reproduce without permission of the author.
implementation process. This requires state agencies to hire or repurpose staff from other areas.
States often contract with outside organizations to help lead the development process and
sometimes compensate stakeholders for their work.
Employees within state agencies working on educator evaluation initiatives often work to
develop and implement the state system through a variety of activities, all of which would
require state expenditures. These activities could include:

Convene committees of stakeholders to explore options for evaluating educators.

Lead committees in the development of the state educator evaluation system.

Work with committees to create communication materials, guidance, and training
around new systems.

Present information on new systems to educators around the state.

Hold workshops and trainings on how school districts will implement new systems.

Develop and manage data systems needed to support the implementation of new
systems.

Serve as contacts for technical support around the implementation of new systems.
Teacher and principal practice standards and rubrics. All educator evaluation
systems include some measure of teacher and principal practice. States and school districts
define professional standards for teachers and principals, which include the knowledge and skills
that educators are expected to possess. States and school districts often use rubrics to measure the
degree to which teachers and principals fulfill these standards. The rubrics outline specific skills
and practices that relate to standards and typically include descriptions of multiple levels of
performance that teachers and principals achieve in relation to those skills.
4
DRAFT – Please do not cite or reproduce without permission of the author.
For purposes of teacher and principal evaluation, some states have adopted existing
educator practice rubrics. Wisconsin and Illinois, for example, use the Framework for Teaching,
developed by Charlotte Danielson, as its teacher practice rubric. Other states, such as New Jersey
and Florida, provide a list of rubric options that districts can use to evaluate educator practice.
Other states, such as Georgia and Tennessee, developed their own educator practice rubrics.
While there is some variation in the way in which states determine how they will develop
or adopt rubrics used to evaluate educator practice, each approach requires time, effort, and
funding. Imagine a state that began the process of adopting educator practice standards and
developing a process by which those standards will be evaluated. States and districts may engage
in the following activities:

Develop or adopt standards, guidelines, and rubrics to measure educator practice.

Train educators and evaluators on the system, which may include in-depth training on
rubrics, evaluator training and certification, annual calibration and re-certification for
evaluators.

Allocate time in districts to implement the system, including time to conduct
observations, collect evidence of educator practice, hold pre- and post-observation
conferences, submit and score evidence and observations.

Instill mechanisms to track, monitor, and report evaluation results.
Student growth measures. New evaluation systems typically require that a portion of an
educator’s overall evaluation be based on measures of student growth. States such as Indiana,
Rhode Island, New York, Tennessee, Wisconsin, and Georgia typically use two distinct types of
student growth measures to evaluate teachers: (1) value-added measures of student growth and
5
DRAFT – Please do not cite or reproduce without permission of the author.
(2) student learning objectives (SLOs). Both approaches require significant time and resources to
implement.
States and school districts incur specific costs when calculating value-added measures of
student growth. Since most states and districts do not have the capacity to calculate such
measures in-house, they often contract with nationally-recognized vendors to run the
calculations. This requires states or district to pay a vendor for time and effort. Value-added also
requires that states and districts collect student/teacher linkages needed to accurately identify
students that were taught by a particular teacher (and thus associate those students’ growth on
standardized tests with that teacher). The need for student/teacher links along with an intensive
verification process could require states to upgrade state data systems.
Value-added measures of student growth are limited to teachers who teach in state-tested
grades and subject areas. Currently, most states administer annual assessments in grades 4-8
math and reading. In order to measure growth in non-tested grades and subjects (which
encompasses approximately 70% of teachers), states and school districts may adopt the SLO
process to evaluate student growth for teachers that lack standardized assessments.
The SLO process requires teachers to set annual student growth targets using existing
assessment data and student work as baseline data. The degree to which students meet those
growth targets by the end of semester or school year is thought to be that teacher’s impact on
student learning. States and districts typically use scoring rubrics to assign a score or final rating
to SLOs based on the amount of growth achieved by students. These ratings are then factored
into a teacher’s overall evaluation score. Principals can also develop SLOs focused on student
growth at the school level.
6
DRAFT – Please do not cite or reproduce without permission of the author.
The implementation of SLOs requires states and school districts to directly or indirectly fund
the following:

Development of guidelines around the SLO process, including guidelines for
selecting and/or development assessments, setting growth targets, and scoring.

Training sessions for educators to learn about the SLO process.

Teacher and principal time to develop SLO targets and identify or develop
assessments they will use to measure student growth toward meeting those targets.

Teacher and principal time to approve and score SLOs. Districts may use existing
staff time or hire new staff to assist with the process.
Detailed Cost Example: Measures of Student Growth for Non-Tested Teachers
Measuring student growth in non-tested grades and subject (NTGS) has proven a
significant challenge for states and school districts. Many new educator evaluation systems
require that student growth comprise a significant portion of an educator’s overall evaluation
rating. The challenge is determining how to measure student growth in areas in which states do
not administer standardized assessments. For example, most states administer assessments in
grades 4-8 math and reading. The new PARCC and Smarter Balanced common core assessment
systems will include assessments for additional grades and subject areas, but will still leave
many grades and subject areas “untested.”1
Given the lack of assessments in many grades and subject areas, along with the
requirement that student growth must account for a significant portion of an educator’s
evaluation, states and school districts must determine ways to capture an educator’s impact on
student growth. Some options may include the following:
1
For more information on the PARCC and Smarter Balanced assessment systems, please visit
http://www.parcconline.org/parcc-assessment and http://www.smarterbalanced.org/smarter-balanced
7
DRAFT – Please do not cite or reproduce without permission of the author.
1. Procure and purchase new assessments
2. Develop new assessments
3. Use a Student Learning Objective approach
This section provides a description of potential costs associated with the implementation
of each option for measuring student growth in NTGS. Cost categories were identified based on
knowledge of the development of new state educator effectiveness initiatives and recent
proposals released by states and school districts.
Estimate Assumptions
Cost categories included in this section are based on potential first year costs of
implementing options to measure student growth in NTGS. It is assumed that costs may decrease
over time, given familiarity with various approaches to measuring student growth and the
development of assessments or data system that support such measures.
Option 1: Cost of Procuring New Assessments at the State Level
Some states, such as Illinois2, are working to expand the grades and subjects in which
students are assessed on an annual basis. Illinois, for example, plans to contract with an
organization to identify assessments in NTGS, forge contracts with assessment vendors, and
make these assessments available to districts through an electronic assessment platform
(potentially at a reduced cost). The purpose of procuring additional assessments is to provide
useful information to teachers about student levels of knowledge in a subject area and to measure
student growth over the course of a school year.
2
In August of 2012, the state of Illinois released a Request for Proposal (RFP) to identify an organization to procure
assessments for the state in traditionally non-tested areas. The RFP also required the proposing organization to
lead a process to work with groups of educators to develop assessments in non-tested areas and create an
electronic bank of assessment items.
8
DRAFT – Please do not cite or reproduce without permission of the author.
This assessment procurement process requires knowledge and expertise, as well as time
and effort, to undergo the identification of assessments and the procurement process. Finally,
once a state identifies additional assessments to administer, the state or individual school districts
must pay test vendors to administer these assessments.
This section describes potential costs associated with the procurement of assessments at
the state level and the administration of assessments at the local school district level. Table 1
identifies potential costs that states and districts might incur if they choose to procure
assessments in some NTGS.
State costs. The procurement of assessments at the state level would require state
education agencies to identify commercially available assessments in traditionally NTGS (those
other than math and reading in grades 4-8). Assume that a state agency allots staff member time
to work on the assessment procurement process. An agency would potentially repurpose a
portion of staff members’ time and preclude these staff members from engaging in their regular,
daily responsibilities. For this reason, the agency would incur an indirect cost for implementing
this process. Agencies could also contract with a vendor to provide knowledge, expertise, and
guidance to assist with the procurement process.
State agencies and partnering vendors may also create or procure a data system to host
assessments so that districts can access, pay for, and administer new assessments. Agencies
would incur the direct costs of developing and hosting such a system, as well as incur costs
required for ongoing updates and maintenance.
District costs. Once a state procures and makes available assessments in traditionally
non-tested grades and subject areas, districts would pay testing vendors (potentially at a reduced
rate) to administer these assessments. Direct costs would include the direct cost of administering
9
DRAFT – Please do not cite or reproduce without permission of the author.
each assessment multiplying by the number of students taking each assessment. Districts would
also incur indirect costs related to the time spent by teachers and principals to administer
assessments. Based on the size of the district, indirect costs may also include time spent by
central office staff to assist with the coordination of assessment administration and the collection
and maintenance of assessment data.
Finally, in order to determine levels of student growth for use within a state’s educator
evaluation system, districts would contract with a vendor to calculate value-added estimates.
Some assessments, such as the NWEA MAP, may have built-in structures for reporting student
growth. Others may not. The direct costs associated with value-added calculations may vary by
vendor.
Implications. The procurement of assessments at the state level would provide districts
with new options to assess students in currently NTGS and provide student growth data for
teachers teaching in these areas. If a state were to undergo such a procurement process, from an
economies of scale standpoint, each district in the state would not be required to spend time and
effort to identify assessments and contract with vendors. Also, states could negotiate reduced test
administration rates by making assessments available to all districts in the entire state. This could
bring down the overall combined state and district cost of procuring and administering
assessments.
Regardless of the ability of states to procure assessments, it is likely that no commercial,
nationally recognized, standardized assessments assessments exist in some NTGS. This means
that states and districts would still be required to develop alternative measures of student growth
for some teachers. This would require additional costs to implement alternative measures, such
as SLOs.
10
DRAFT – Please do not cite or reproduce without permission of the author.
Table 1. Categories of First Year Costs for Assessment Procurement
Quantity
Total Cost
State Assessment Procurement
State Agency Staff Time
Guidance and Expertise
# staff * # hours
1 contract
-
State Assessment Platform
Data System
Administration and Maintenance
1 system
# staff * # hours
-
# tests * # students
# hours * # teachers
# hours * # principals
-
# students
-
District Cost to Administer
Cost of Assessments
Assessment Administration
District Value-Added Calculations
Option 2: Cost of Developing New Assessments at the State Level
States such as Illinois and Colorado are working to convene groups of educators to
develop assessments in traditionally NTGS. This process requires expert knowledge of
assessment development, training for educators on the development of assessments, and
compensation for experts’ time and effort. After the development of such assessments, districts
can use the assessments to measure student growth in NTGS for purposes of teacher evaluation.
This section describes potential cost categories associated with the development of
assessments at the state level and the administration of assessments at the local school district
level. Table 2 identifies potential costs that states and districts might incur if they choose to
develop assessments in some NTGS.
State costs. If a state decides to develop assessments in some NTGS, state education
agencies might contract with national experts to lead the development process. This assumes that
a state agency would contract with a vendor and pay expert facilitators to lead educators through
the process of developing assessments. The state could convene expert educators and
stakeholders from around the state to partake in the process, providing these experts with a
11
DRAFT – Please do not cite or reproduce without permission of the author.
stipend to compensate for their participation. Facilitators may train expert educators on how to
develop assessments, and then facilitators and experts would engage in the actual development
process. Similar to the cost of procuring assessments, this estimate assumes that the state
education agency would develop a data platform to host assessments so that districts can access
and administer new assessments. In future years, the state might reconvene this group on an
annual basis to modify assessments and develop additional assessment items.
District costs. Once a state develops and makes available assessments in traditionally
non-tested grades and subject areas, districts would, in theory, administer assessments free of
charge. This means that districts would incur fewer direct costs associated with the
administration of new assessments. This estimate assumes that teachers and principals will still
allocate time to administer assessments and acquire indirect costs. In order to determine levels of
student growth for use within a state’s educator evaluation system, districts could contract with a
vendor to calculate value-added estimates using state-developed assessments.
Implications. The development of assessments at the state level would provide districts
with new options to assess students in currently NTGS and provide student growth data for
teachers teaching in these areas. In contrast with the procurement of assessments at the state
level, the availability of assessments without the administration cost could significantly reduce
the direct costs that districts must pay when developing student growth measures for non-tested
teachers. This could reduce the overall cost of implementing such measures when factoring in
the cost of implementation for all districts in a state.
Regardless of the ability of states and expert vendors to develop assessments, it is likely
that groups of educators and experts will not develop standardized assessments in all grades and
subject areas. This means that states and districts would still be required to develop alternative
12
DRAFT – Please do not cite or reproduce without permission of the author.
measures of student growth for some teachers, such as using SLOs. This, as with the assessment
procurement option, would require additional costs at the state and district level.
In addition to cost, the development of assessment requires high levels of knowledge and
expertise. For example, if teachers repeatedly use the same test items, it could result in a
“narrowing of the curriculum.” Creating and using several parallel test forms that are equated for
overall difficulty could address this; however, this would also require additional knowledge,
expertise, and money on part of the state. Similarly, groups of educators creating assessments at
the state level may lack knowledge of the psychometric properties of valid and reliable
assessments. For this reason, the procurement of large-scale standardized assessments with wellestablished psychometric properties may provide more valid and reliable measures of student
growth.
Table 2. Estimated First Year Cost of Assessment Development
Quantity
Total Cost
State Assessment Development
Guidance and Expertise
Facilitators
Training
Development
1 contract
# facilitators * # days
# experts * # days
# experts * # days
-
State Assessment Platform
Data System
Administration and Maintenance
1 system
# hours * # staff
-
# hours * # teachers
# hours * # principals
-
# students
-
District Cost to Administer
Assessment Administration
District Value-Added Calculations
Option 3: Cost of Implementing the Student Learning Objective Process
Many states with redesigned educator evaluation systems are turning to SLOs as a
solution to measure student growth in NTGS. Many states also use SLOs as an additional
13
DRAFT – Please do not cite or reproduce without permission of the author.
measure of student growth for tested teachers. SLOs are often thought of as a no-cost or low-cost
alterative to measure student growth in NTGS. While the direct costs associated with the
implementation of SLOs may be lower than the procurement or development of assessments,
however, hidden indirect costs exist.
This section describes potential cost categories associated with the implementation of
SLOs at the state and school district level. Table 3 identifies potential cost categories that states
and districts might incur if they implement SLOs as measures of student growth in NTGS.
State costs. In order to implement the SLO process, states must provide districts with
some level of guidance and training. In this estimate, it is assumed that a state would contract
with an organization or vendor with expertise in the area of SLOs. This vendor contract would
include the development of SLO process guidelines and the development and delivery of training
on the SLO process. Alternatively, the state could fund state agency staff to develop and train on
such processes, which may or may not prove less costly.
Similar to the other cost estimates, this estimate assumes that the state education agency
would develop a data platform to track and monitor SLOs. This estimate assumes that the state
would incur direct costs to develop or procure such a data system and fund the administration
and maintenance of the system.
District costs. Individual districts would assume the majority of costs included in this
estimate. In order to implement and manage the SLO process, a district may hire, or repurpose,
staff members to help review, approve, and score SLOs. Districts may also choose to provide
stipends to educators to serve as SLO campus facilitators to assist teachers and principals with
the SLO process.
14
DRAFT – Please do not cite or reproduce without permission of the author.
Districts might also incur direct and indirect costs by sending teachers and principals to
training on the SLO process. Districts might pay for substitutes to take over teacher classrooms
while teachers attend training, or use existing professional development days for training.
Similar, principals might attend training on SLOs, not only to learn how to develop their own
SLOs but also to approve and score teacher SLOs.
Indirect district costs associated with the SLO process entails teacher and principal time
spent to develop, approve, and score SLOs. Teachers, for example, may spend time reviewing
student baseline data and setting student growth targets. If no assessments of student growth
exist, teachers might develop assessments on their own or with teams of teachers. Teachers track
and monitor student progress over the course of the year, then collect evidence of final student
growth at the end of the year. Principals will also spend time developing their own SLO, plus
time spent reviewing and approving teacher SLOs. Principal time spent on SLOs may be reduced
if campus facilitators and central office staff assist with the SLO approval and scoring processes.
Implications. When solely examining direct district costs, the SLO process may appear
to be the least expensive option used to measure student growth in non-tested grades and
subjects; however, indirect costs that factor in teacher and principal time may be equivalent to or
higher than other options. Once states complete the development of guidance and training on
SLOs, they might incur few, in any, direct costs related to the SLO process. Also, a state data
system is not as necessary for the implementation of SLOs as it is for the implementation of new
assessments. The elimination of the state data system included in this estimate would reduce
state direct costs.
Regardless of costs, little standardization of SLO student growth measures exists. The
level of rigor may vary across teachers, schools, and school districts. The use of teacher-
15
DRAFT – Please do not cite or reproduce without permission of the author.
developed assessments may threaten the reliability and validity of SLO student growth
outcomes. Inconsistencies in the SLO scoring process may also arise. While states and school
districts could somewhat standardize the process to create quality control mechanisms, the
process is still subject to issues of validity and reliability, which may or may not be present with
the use of standardized assessments and value-added estimates of growth.
Table 3. Estimated First Year Cost of Implementing Student Learning Objectives
Quantity
Total Cost
District Direct Costs
District Staff Salaries
Campus Facilitator Stipend
Teacher Training Time
Principal Training Time
# staff
# per school * # schools
# hours * # teachers
# hours * # principals
-
District Indirect Costs
Teacher Implementation Time
Principal Implementation Time
# hours * # teachers
# hours * # principals
-
State Assessment Platform
Data System
Administration and Maintenance
1 system
# hours * # staff
-
State Guidelines and Training
1 contract
-
Discussion and Policy Considerations
This paper provides examples of the potential costs that states and school districts might
incur when developing and implementing new educator evaluation systems. In addition to cost,
stakeholders should consider the level of rigor and fidelity of implementation of evaluation
system components when selecting and implementing various models. For example, statewide
training for educators on evaluation system components may be costly; however, training can
help the state ensure common understanding of the system and increase the likelihood that school
districts will implement the system with fidelity.
Within respect to selecting methods to measure student growth for teachers in
traditionally NTGS, as well as other evaluation system components, it is important to consider
16
DRAFT – Please do not cite or reproduce without permission of the author.
both direct and indirect costs associated with implementation. The use of SLOs, for example,
may appear to be an easy solution to measure student growth; however, the time to implement
SLOs may account for large indirect costs for school districts. Reducing the amount of time
spent on the SLO process may result in compromises to consistency in rigor and the scoring of
SLOs.
Using nationally-recognized standardized assessments to measure student growth may be
the most rigorous, valid, and reliable approach to measuring student growth; however, the direct
costs of implementing this option may be high. The development of assessments at the state level
may reduce assessment administration costs, yet may require much time, effort, and expertise.
Also, states are not likely to acquire or develop assessments for all grades and subjects, which
would require the development of alternative student growth evidence sources or SLOs. The
direct costs of implementing SLOs may appear to be lower than paying for standardized
assessments; however, district may incur high indirect costs to implement the SLO process.
Additionally, the improper implementation of educator evaluation systems may result in
unintended, negative consequences. Eventually, many states intend to use educator evaluation
results to make high-stakes human capital decisions, yet inaccurate measures of educator
effectiveness could result in lawsuits or other difficult matters. The use of standardized tests to
measure student growth, for example, may be the most rigorous and reliable way to measure a
teacher’s impact on student learning; however, the expansion of standardized tests could be
costly. The use of SLOs as an alternative to standardized tests could result in gaming of the
system or inaccurate measures of teacher effectiveness. If states and districts attempt to reduce
costs by eliminating rigorous training on evaluation system components and reduce the time that
educators spend implementing the system, then the system may not provide useful information
17
DRAFT – Please do not cite or reproduce without permission of the author.
about educator effectiveness. States and districts should consider such tradeoffs and determine
how to best balance cost with rigor, reliability, validity, and fidelity of implementation.
Finally, it is important for states to consider long-term costs associated with educator
evaluation systems and the rigor of evaluation measures. More rigorous evaluation measures
may provide more accurate and useful information about teachers. Districts may use this
information to inform human capital decisions, leading to long-term cost savings.
Misinformation on educator effectiveness, however, may cost districts more money in the long
run.
References
Sawchuk, S. (2013). Teachers’ Ratings Still High Despite New Measures. Education Week.
Retrieved from http://www.edweek.org/ew/articles/2013/02/06/20evaluate_ep.h32.html
18
Download