School Improvement,
Short Cycle Assessments
and Educator Evaluation
Orlando
June 22, 2011
Allan Odden and Anthony Milanowski
Strategic Management of Human Capital (SMHC)
University of Wisconsin-Madison
Overview of Presentation
1. Prime challenge is to improve student
performance
2. Key strategy to attain that goal (on which
I focus today): talent and human capital
management
3. Support tactic for talent management –
multiple measures of effectiveness used
in new teacher evaluation systems
Improving Student Performance
• CPRE research and Lawrence O. Picus and Associates
research from school finance adequacy studies: Odden
(2009) & Odden & Archibald (2009)
• Research by others – Ed Trust, Karen Chenoweth (2007),
Supovitz, etc.
• Schools from urban, suburban and rural communities
• Many schools and districts with high concentrations of
children from low income and minority backgrounds
• Finalists for the Broad Prize in Urban Education
• Result: Ten Strategies for Improving Performance
Ten Strategies to Improve Performance
1. Initial data analysis, largely analyzing state
accountability tests
2. Set high, ambitious goals – double student
performance, 90% to advanced standards
3. Adopt new curriculum materials and over time a
systemic view of effective instructional practices
that all teachers are expected to implement
4. Implement data-based decision making with
benchmark and short cycle assessments – e.g.
Renaissance Learning STAR Enterprise
5. Invest in comprehensive, ongoing professional
development, including instructional coaches in
all schools
6. Use school time more effectively – protected
core subject times for reading and math –
collaborative time for teacher teams working in
Professional Learning Communities
7. Multiple extra-help strategies for struggling
students – tutoring, extended day, summer
8. Widespread distributed instructional leadership
9. Reflect “best practices” and incorporate
research knowledge – not doing your own thing
10. Be serious about talent – finding it, developing it,
distinguishing effective from ineffective teachers
and principals; promoting, highly compensating
and retaining teachers and principals only on the
basis of measures of effectiveness
Human Capital Management
• The Obama administration and Secretary Duncan have made improving
teacher and principal talent and effectiveness central to
education reform
• Goal: put an effective teacher into every classroom and an
effective principal into every school
• To implement these practices and manage teachers (and
principals) around them, develop multiple measures of teacher
effectiveness (long-hand for new teacher evaluation systems)
• Scores of states and districts are working on this issue
• These issues are also central to ESEA reauthorization
• The question is not whether teacher evaluation will change but
how it will be changed
Core Elements of the Strategy
• Multiple Measures of Teaching Effectiveness
1. Measures of instructional practice – several systems
2. Measures of pedagogical content knowledge
3. Student perceptions of the academic environment
4. Indicators of impact on student learning
All this is now mandated by Illinois law
• Use of those measures:
a) In new evaluation systems, for teachers and principals
b) For tenure
c) For distributing and placing effective teachers
d) For dismissing ineffective teachers
e) For compensating teachers
Teacher Evaluation
Two major pieces of the evaluation:
1. Measure of instructional practice – Danielson Framework,
INTASC, Connecticut BEST system, CLASS, PACT, National
Board, the new North Carolina system – see Milanowski,
Heneman, Kimball, Review of Teaching Performance Assessments for
Use in Human Capital Management, 2009 at www.smhc-cpre.org and go to
resources
2. Measure of impact on student learning:
a. The only model at the present time is value added using end of year
state summative tests
b. One new proposal is to use interim-short cycle (every 4-6 weeks)
assessment data, aligned to state content standards, that show
student/classroom growth relative to a normed (national or state?)
growth trajectory
Other National Efforts
• Measuring Effective Teaching project of
the Gates Foundation:
– Multiple value added measures
– Several teacher rubrics, with video tool to
replace direct observations
– Student survey – Ron Ferguson
Measuring Educator Performance
Specifically, focus on short-cycle assessments
• The indicators of impact on student learning
must derive from tests that:
1. Are valid and reliable
2. Are instructionally sensitive and instructionally useful (linked to
state content standards and provide data to teachers about how
to improve instructional practice)
3. Provide stable results, which means they should be given multiple
times a year (every 4-6 weeks)
Many state accountability tests fall short of these psychometric
standards
Measuring Educator Performance
• It would be very helpful if the data system could be used:
– By teachers to guide their instructional practice
– To roll up the individual data to the classroom to
indicate teacher impact on student learning gains
– To roll up the individual data to the grade level and/or
the school level to indicate impact of school and
school leadership on student learning gains
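The roll-up described above can be sketched in a few lines. This is only an illustration: the record layout, the school and room names, and the scores are all hypothetical, and a real data system would likely adjust for student characteristics rather than average raw gains.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student records: (school, classroom, student, fall score, spring score)
records = [
    ("School A", "Room 1", "s1", 400, 460),
    ("School A", "Room 1", "s2", 420, 470),
    ("School A", "Room 2", "s3", 380, 400),
    ("School B", "Room 3", "s4", 450, 520),
]

def roll_up(records, level):
    """Average fall-to-spring gain per group; level is 'classroom' or 'school'."""
    groups = defaultdict(list)
    for school, classroom, _student, fall, spring in records:
        key = (school, classroom) if level == "classroom" else school
        groups[key].append(spring - fall)
    return {key: mean(gains) for key, gains in groups.items()}

classroom_gains = roll_up(records, "classroom")  # indicator of teacher impact
school_gains = roll_up(records, "school")        # indicator of school impact
```

The same student-level records feed both indicators, which is the point of the slide: one data source serves instruction, classroom-level, and school-level views.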
Final Contextual Comment
• All these systems must be embedded
within a framework of ongoing educator
development
AND
• During these tight fiscal times, funds for
professional development should NOT be
cut
Multiple Measures of Teaching Performance
for Accountability & Development
• Standard Prescription:
Instructional practice measure (e.g., teacher
evaluation ratings) + Gain, growth, or value-added
based on state standards-based assessments
• But:
– Practice ratings and assessment gain, growth, or
value-added don’t measure the same thing;
measurement error sources are different and don’t
cancel
– Gain, growth, or value-added on state assessments
are of limited use for teacher development
Advantages of Adding Short-cycle
Assessments to the Mix
1. For teacher development: because such assessments
are frequent, teachers get feedback that they can use
to adjust instruction before the state test
– Teachers can see if student achievement is improving, and if
assessments are linked to state proficiency levels, whether
students are on track to proficiency
2. For teacher accountability:
– More data points allow estimation of a growth curve
– The growth curve represents learning within a single school
year; no summer to confuse attribution
– The slope of the average growth curve or average difference
between predicted end points provides another indicator of
teaching effectiveness
– Combining with growth, gain, or value-added based on state
assessments provides multiple measures of productivity
– If linked to state assessments, can predict school year
proficiency growth
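As a rough illustration of the growth-curve idea, the slope of a class's average scores over the school year can be estimated with ordinary least squares and compared with the slope of a reference trajectory. The monthly class means and the norm trajectory below are invented for the example, and a straight-line fit is an assumption; the slides do not specify the form of the curve.

```python
def growth_slope(months, scores):
    """Ordinary least-squares slope of score on month (scale points per month)."""
    n = len(months)
    mean_x = sum(months) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, scores))
    sxx = sum((x - mean_x) ** 2 for x in months)
    return sxy / sxx

# Hypothetical monthly data for one school year (month 0 = first assessment)
months = [0, 1, 2, 3, 4, 5, 6, 7, 8]
class_means = [500, 506, 511, 519, 524, 531, 535, 542, 548]  # class average score
norm_means = [500, 504, 508, 512, 516, 520, 524, 528, 532]   # reference trajectory

class_slope = growth_slope(months, class_means)
norm_slope = growth_slope(months, norm_means)
slope_advantage = class_slope - norm_slope  # positive = faster growth than the norm
```

Because every data point falls inside the school year, the slope reflects in-year learning only, with no summer interval mixed in.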
Short Cycle Assessment Growth Curve
Issues in Combining Practice &
Student Achievement Measures
• Models: Report Card, Compensatory,
Conjoint
• When Combining Need to Address:
– Different Distributions, Scales and Reference
Points
– Weighting in Compensatory Models
• Equal
• Policy
• Proportional to reliability
Report Card Model
Performance Domain | Performance Dimensions | Score Levels | Requirement for Being Considered Effective
Instructional Practice | Planning & Assessment; Classroom Climate; Instruction | 1-4 on each dimension | Rating of 3 or higher on all dimensions
Professionalism | Cooperation; Attendance; Development | 1-4 on each dimension | Rating of 3 or higher on all dimensions
Student Growth, Gain, or VA on State Assessments | Math; Reading/ELA; Other Tested Subjects | Percentiles in state/district distribution for each subject | Being in the 3rd Quintile or Higher for All Tested Subjects
Student Growth on Short Cycle Assessment | Math; Reading | Avg. Growth Curve Translated into Predicted State Test Scale Score Change | Predicted Gain Over Year Sufficient to Bring Student from Middle of “Basic” Range to “Proficient”
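A minimal sketch of how the report card model keeps each domain separate rather than combining them into one score. It assumes quintiles are numbered from the bottom, so "3rd quintile or higher" is read as the 40th percentile or above; the function name, argument layout, and all sample values are hypothetical.

```python
def report_card(practice, professionalism, state_percentiles,
                predicted_gain, gain_needed):
    """Return a pass/fail result for each performance domain separately."""
    return {
        # Rating of 3 or higher on all dimensions
        "Instructional Practice": all(r >= 3 for r in practice.values()),
        "Professionalism": all(r >= 3 for r in professionalism.values()),
        # Assumption: 3rd quintile or higher means percentile >= 40
        "State Assessment Growth": all(p >= 40 for p in state_percentiles.values()),
        # Predicted gain must reach the (hypothetical) gain needed per subject
        "Short Cycle Growth": all(
            predicted_gain[s] >= gain_needed[s] for s in predicted_gain
        ),
    }

result = report_card(
    practice={"Planning & Assessment": 3, "Classroom Climate": 4, "Instruction": 3},
    professionalism={"Cooperation": 4, "Attendance": 3, "Development": 2},
    state_percentiles={"Math": 55, "Reading/ELA": 42},
    predicted_gain={"Math": 30, "Reading": 18},
    gain_needed={"Math": 25, "Reading": 20},
)
```

No weighting is involved: a strong result in one domain does not offset a weak one in another, which is what distinguishes this model from the compensatory one.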
Scales, Distributions, & Reference
Points for Value-Added vs. Practice
Putting Practice Ratings and
Student Achievement on the Same Scale
Emerging Practice: Rescale growth, gain or
value-added measure to match the
practice rating scale
– Standardize and set cut-off points in units of
standard error, standard deviation or
percentiles
Category | In S.E. Units | Percentiles
Distinguished (4) | >1.5 S.E. Above Mean | 70th +
Proficient (3) | +/- 1.5 S.E. Around Mean | 30th to 69th
Basic (2) | 1.51 - 2 S.E. Below Mean | 15th to 29th
Unsatisfactory (1) | > 2 S.E. Below Mean | Below 15th
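The cut-offs in the table can be applied mechanically. The sketch below shows both variants, one taking a value-added estimate expressed in standard-error units relative to the mean and one taking a percentile rank. The function names are hypothetical, and the handling of values that fall exactly on a boundary is an assumption, since the table does not say.

```python
def rating_from_se_units(z):
    """Map a value-added estimate, in S.E. units from the mean, to a 1-4 rating."""
    if z > 1.5:
        return 4  # Distinguished: more than 1.5 S.E. above the mean
    if z >= -1.5:
        return 3  # Proficient: within 1.5 S.E. of the mean
    if z >= -2.0:
        return 2  # Basic: between 1.5 and 2 S.E. below the mean
    return 1      # Unsatisfactory: more than 2 S.E. below the mean

def rating_from_percentile(p):
    """Map a percentile rank (0-100) to a 1-4 rating."""
    if p >= 70:
        return 4
    if p >= 30:
        return 3
    if p >= 15:
        return 2
    return 1
```

Either function puts the student achievement measure on the same 1-4 scale as the practice rating, so the two can then be combined.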
Compensatory (Weighted Average) Model
for Combining Performance Measures
Dimension | Rating | Weight | Product
Growth, Gain, Value-Added on State Test | 2 | 25% | 0.50
Growth as Measured by Short-Cycle Assessment | 3 | 25% | 0.75
Practice Evaluation | 4 | 50% | 2.00
Total | | | 3.25
1.0-1.75 = Unsatisfactory, 1.76-2.75 = Basic,
2.76-3.75 = Proficient, 3.76+ = Distinguished
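The arithmetic of the compensatory model is a weighted average followed by a band lookup. The sketch below reproduces the slide's example; the function name and dictionary keys are hypothetical, and treating each band's upper bound as inclusive is an assumption made to close the small gaps between the printed ranges.

```python
def compensatory_rating(ratings, weights):
    """Weighted average of 1-4 ratings, plus the matching category label."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    score = sum(ratings[name] * weights[name] for name in weights)
    # Assumption: upper bound of each band is inclusive
    if score > 3.75:
        label = "Distinguished"
    elif score > 2.75:
        label = "Proficient"
    elif score > 1.75:
        label = "Basic"
    else:
        label = "Unsatisfactory"
    return score, label

ratings = {"state_test_growth": 2, "short_cycle_growth": 3, "practice": 4}
weights = {"state_test_growth": 0.25, "short_cycle_growth": 0.25, "practice": 0.50}
score, label = compensatory_rating(ratings, weights)
```

In this example a rating of 2 on state test growth is offset by a 4 on practice, which is exactly the compensatory property: strength on one measure can make up for weakness on another.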
Conjoint Model for
Combining 2 Measures
Teaching Practice Rating | Student Outcome Rating: 1 | 2 | 3 | 4
4 = Advanced | 2 | 2 | 3 | 4
3 = Proficient | 2 | 2 | 3 | 4
2 = Basic | 1 | 2 | 2 | 3
1 = Unsatisfactory | 1 | 1 | 1 | 2
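A conjoint model of this kind is simply a lookup table: the summary rating is read from the cell where the practice rating and the student outcome rating meet. The sketch below encodes the matrix from the slide; the names are hypothetical.

```python
# Rows: teaching practice rating; tuple positions: student outcome rating 1..4
CONJOINT_MATRIX = {
    4: (2, 2, 3, 4),
    3: (2, 2, 3, 4),
    2: (1, 2, 2, 3),
    1: (1, 1, 1, 2),
}

def conjoint_rating(practice, outcome):
    """Summary rating for a (practice, student outcome) pair, each rated 1-4."""
    return CONJOINT_MATRIX[practice][outcome - 1]
```

Unlike a weighted average, the table lets policy makers set each combination explicitly; for instance, here a practice rating of 4 with an outcome rating of 1 yields only a 2.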
Conjoint Model for
Combining 3 Measures
To Get a Summary Rating of | Need Scores of at Least:
4 | 4 on two measures and 3 on the other
3 | 2 on the practice measure and 4 on both the student achievement measures, or 3 on the practice measure and 3 on at least one of the student achievement measures
2 | 2 on the practice measure and 2 on either of the student achievement measures
1 | 1 on the practice measure and 1 on either student achievement measure
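One way to read these rules is as a cascade checked from the highest rating down, with each "at least" condition treated as a minimum. The sketch below follows that reading; it is an interpretation of the slide rather than a definitive specification, and the function and parameter names are hypothetical.

```python
def conjoint3_rating(practice, state_growth, short_cycle_growth):
    """Summary rating (1-4) from three 1-4 ratings, highest level checked first."""
    achievement = (state_growth, short_cycle_growth)
    all_three = (practice, state_growth, short_cycle_growth)

    # Rating 4: 4 on two measures and at least 3 on the other
    if sum(r >= 4 for r in all_three) >= 2 and min(all_three) >= 3:
        return 4
    # Rating 3: practice >= 2 with 4 on both achievement measures,
    # or practice >= 3 with at least 3 on one achievement measure
    if (practice >= 2 and min(achievement) >= 4) or (
        practice >= 3 and max(achievement) >= 3
    ):
        return 3
    # Rating 2: practice >= 2 and at least 2 on either achievement measure
    if practice >= 2 and max(achievement) >= 2:
        return 2
    # Rating 1: everything else
    return 1
```

Under this reading, a practice rating of 1 caps the summary rating at 1 no matter how strong the student achievement measures are.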
Other ideas about using short
cycle assessments in
Educator evaluation
Impact on Student Learning
• The conversation is about “multiple indicators”
for this category
BUT
• Few if any places actually have viable multiple
indicators
• The prime and in most cases only indicator here
is a “value added measure derived from state
summative, accountability tests”
Impact on Student Learning
• Most teachers do not like value-added measures
using end of year state summative tests; don’t
understand them; don’t like state tests
• So what could be actual and practical additional
indicators, indicators that could augment these
value-added statistics that derive from state
summative tests (which, whatever our viewpoint,
will probably not go away)?
Impact on Student Learning
• Interim, short cycle assessments, that are given
multiple times during the year
• Interim short-cycle assessments (STAR is one
example) are used to help teachers improve
instruction and also can be used to show student
and classroom growth
• The only new, viable, specific idea in this area
that is now on the table, and gives comparable
evidence across teachers
Several Additional Indicators
• Background points:
– STAR Reading and Math cover grades K-12, so they
cover classrooms beyond the standard tested
grades 3-8 and 11
– Administered in computer-based format so provide
immediate feedback to teachers for use in
instructional improvement and change
– Vertically aligned scales so can compare scores
across months and years
– Following charts derive from individual student data
First Set of Ideas: Chart 1
• Interim assessments given monthly
• Student data aggregated to classroom
• Red squares are progress line for similar
classes of students in a state (or nation)
• Green triangles are actual class progress
• Yellow star is state proficiency level
• Shows growth during the months of just the
academic year
Chart 1
• Modest student learning when the class had a
substitute teacher
• Growth happened when regular teacher returned
• Actual class growth (green triangles) was much
greater than the reference norm (red squares)
• In value added terms, the class would have a
high value added – performance growth was
above the average (the red square trend line) for
this typical classroom
Chart 1
How to use these data:
6. Compare growth of this class to other classes:
a. In the same school
b. In the same district
c. With similar demographics
d. In the same state
e. Across the nation
f. To classes in schools with similar demographics
Many different ways to use the data in such a chart
Chart 2
• Interim assessments given monthly
• Student data aggregated to school level
• Red squares are progress line for schools
with similar students in a state (or nation)
• Green triangles are actual school progress
• Yellow star is state proficiency level
• Shows growth during the months of just the
academic year
Chart 2
• Rolls the classroom data up to the school
level
• Could also roll up student data across
classes for each grade
• Multiple ways to create an indicator
Chart 2
How to use these data:
1. Shows school performed above the state proficiency level
2. Shows school performed above similar schools
3. So compare end of year score on interim assessments
with end of year score on state summative test – do both
show exceeded proficiency?
4. Compute “standard deviation” of change – fall to spring
and COMPARE to “standard deviation” of change on state
summative test
5. Compare end-of-year interim assessment score to end-of-year
state proficiency score, in terms of standard deviation
above proficiency level, or standard deviation of growth
over the year
Chart 2
• Compare to other schools in the district
• Compare to other schools in the state
• Compare to other schools in the nation
• Compare to schools with similar
demographics
• Compare value added or growth scores on
interim assessments to that on state
summative assessments
Chart 2
• When data are rolled up to the school level,
they provide additional indicators for:
– Those education systems, like Hillsborough
(FL), which are using school wide gains for
teachers of non-tested subjects
Chart 3
• Indicates whether a classroom (school) has a low
performance level with high or low growth
OR a high performance level with high or low
growth
• Could use simply to give points if high
growth, or negative points if both low
performance level and low growth (indicates
a real performance issue)
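The level-by-growth classification can be sketched as a simple two-way split. Everything numeric here is hypothetical: the slides do not give the cut points or the size of the points, so the function takes the proficiency cut and the norm growth as arguments and uses +1, 0, and -1 purely for illustration.

```python
def growth_status_points(level, growth, proficiency_cut, norm_growth):
    """Classify a classroom (or school) by level and growth, and assign points."""
    high_level = level >= proficiency_cut
    high_growth = growth >= norm_growth
    quadrant = "{} level, {} growth".format(
        "high" if high_level else "low",
        "high" if high_growth else "low",
    )
    if high_growth:
        points = 1    # points for high growth, whatever the level
    elif not high_level:
        points = -1   # low level and low growth: a real performance issue
    else:
        points = 0    # high level but low growth: no points either way
    return quadrant, points
```

For example, `growth_status_points(480, 30, 500, 25)` describes a classroom below the proficiency cut but growing faster than the norm, which earns points under this scheme.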
Final Comments
• Interim short cycle assessments can be used
to provide additional indicators of teacher
(school) impact on learning growth
• The data supplement what is shown by value
added with state summative tests
• Thus such data reduce the weight given to
value-added indicators from state summative tests
• And these data derive from a system designed
to help teachers be better at teaching
Allan Odden
University of Wisconsin-Madison
arodden@lpicus.com
arodden@wisc.edu