Helping Teachers Become More Effective While Measuring Teaching Effectiveness: Combining Multiple Measures
AASA Webinar, 2011
Allan Odden
Strategic Management of Human Capital (SMHC)
University of Wisconsin-Madison

Overview
1. Prime challenge is to improve student performance
2. Key strategy to attain that goal (the focus of today): talent and human capital management
3. Support tactic for talent management – multiple measures of effectiveness used in new teacher evaluation systems

Human Capital Management
• The Obama and Duncan administration has made improving teacher and principal talent and their effectiveness central to education reform
• Goal: put an effective teacher into every classroom and an effective principal into every school
• To implement these practices and manage teachers (and principals) around them, develop multiple measures of teacher effectiveness (long-hand for new teacher evaluation systems)
• New NEA and AFT policies that allow use of student data in teacher evaluation
  – Extract such measures from teacher improvement systems
• Scores of states and districts working on this issue
• These issues also central to ESEA reauthorization

Core Elements of the Strategy
• Multiple Measures to Evaluate Teachers and Assess Teaching Effectiveness
  1. Measures of instructional practice – several systems
  2. Indicators of impact on student learning
• Use of those measures:
  a) In new evaluation systems, for teachers and principals
  b) For tenure
  c) For distributing and placing effective teachers
  d) For dismissing ineffective teachers
  e) For compensating teachers

Current Teacher Evaluations Useless
• Find 99+% of teachers satisfactory, accomplished, or outstanding
• Even when student performance is dismal
• Rarely use specific teaching standards and scoring rubrics with trained assessors
• Until recently, did not include evidence of impact on student learning
• Neither valid nor reliable; cannot be used for consequential decisions for teachers
• Viewed as “waste of time” by teachers & administrators

New Directions in Teacher Evaluations
• So now there is a major nationwide push to change teacher evaluation systems
• Desire to use BOTH measures of instructional practice (qualitative) AND indicators of impact on student learning gains (quantitative)
• Widespread support for these new directions
• The question is not whether teacher evaluation will change but how it will be changed

Teacher Evaluation
Two major pieces of the evaluation:
1. Qualitative: Measures of instructional practice – Danielson Framework, INTASC, Connecticut BEST system, CLASS, PACT, National Board, the new North Carolina system – see Milanowski, Heneman, Kimball, Review of Teaching Performance Assessments for Use in Human Capital Management, 2009 at www.smhccpre.org and go to resources
2. Quantitative: Measures of impact on student learning:
   a. Primary model at the present time is value-added using end-of-year state summative tests
   b. Additional proposal is to use interim short-cycle (every 4-6 weeks) assessment data, aligned to state content standards, that show student/classroom growth relative to a normed (national or state?)
      growth trajectory

Measuring Educator Effectiveness
Specifically, focus on short-cycle assessments

Combining Multiple Measures of Teaching Performance
• Standard Prescription: Instructional practice measure (e.g., teacher evaluation ratings) + Gain, growth, or value-added based on state standards-based assessments
• But:
  – Practice ratings and assessment gain, growth, or value-added don’t measure the same thing; measurement error sources are different and don’t cancel
  – Gain, growth, or value-added on state assessments are of limited use for teacher development

Advantages of Adding Short-cycle Assessments to the Mix
1. For teacher development:
  – Because such assessments are frequent, teachers get feedback that they can use to adjust instruction before the state test
  – Teachers can see if student achievement is improving, and if assessments are linked to state proficiency levels, whether students are on track to proficiency
2. For teacher accountability:
  – More data points allow estimation of a growth curve
  – The growth curve represents learning within a single school year; no summer to confuse attribution
  – The slope of the average growth curve or average difference between predicted end points provides another indicator of teaching effectiveness
  – Combining with growth, gain, or value-added based on state assessments provides multiple measures of productivity
  – If linked to state assessments, can predict school year proficiency growth

What Short Cycle Assessments Show

Issues in Combining Practice & Student Achievement Measures
• Models: Report Card, Compensatory, Conjoint
• When Combining Need to Address:
  – Different Distributions, Scales and Reference Points
  – Weighting in Compensatory Models
    • Equal
    • Policy
    • Proportional to reliability

Report Card Model
Performance Domain | Performance Dimensions | Score Levels | Requirement for Being Considered Effective
--- | --- | --- | ---
Instructional Practice | Planning & Assessment; Classroom Climate; Instruction | 1-4 on each dimension | Rating of 3 or higher on all dimensions
Professionalism | Cooperation; Attendance; Development | 1-4 on each dimension | Rating of 3 or higher on all dimensions
Student Growth, Gain, or VA on State Assessments | Math; Reading/ELA; Other Tested Subjects | Deciles or Quintiles in state/district distribution for each subject | Being in the 4th Decile or 3rd Quintile or Higher for All Tested Subjects
Student Growth on Short Cycle Assessment | Math; Reading | Avg. Growth Curve Translated into Predicted State Test Scale Score Change | Predicted Gain Over Year Sufficient to Bring Student from Middle of “Basic” Range to “Proficient”

Scales, Distributions, & Reference Points for Value-Added vs. Practice

Putting Practice Ratings and Student Achievement on the Same Scale
Emerging Practice: Rescale growth, gain or value-added measure to match the practice rating scale
  – Standardize and set cut-off points in units of standard error, standard deviation or percentiles

Category | In S.D. Units | Percentiles
--- | --- | ---
Distinguished (4) | >1.5 S.D. Above Mean | 70th +
Proficient (3) | +/- 1.5 S.D. Around Mean | 30th to 69th
Basic (2) | 1.51 - 2 S.D. Below Mean | 15th to 29th
Unsatisfactory (1) | > 2 S.D. Below Mean | Below 15th

Compensatory (Weighted Average) Model for Combining Performance Measures
Dimension | Rating | Weight | Product
--- | --- | --- | ---
Growth, Gain, Value-Added on State Test | 2 | 25% | 0.50
Growth as Measured by Short-Cycle Assessment | 3 | 25% | 0.75
Practice Evaluation | 4 | 50% | 2.00
Total | | | 3.25

1.0-1.75 = Unsatisfactory, 1.76-2.75 = Basic, 2.76-3.75 = Proficient, 3.76+ = Distinguished

Conjoint Model for Combining 2 Measures
Teaching Practice Rating | Student Outcome Rating 1 | 2 | 3 | 4
--- | --- | --- | --- | ---
4 = Advanced | 2 | 2 | 3 | 4
3 = Proficient | 2 | 2 | 3 | 4
2 = Basic | 1 | 2 | 2 | 3
1 = Unsatisfactory | 1 | 1 | 1 | 2

Conjoint Model for Combining 3 Measures
To Get a Summary Rating of | Need Scores of at Least
--- | ---
4 | 4 on two measures and 3 on the other
3 | 2 on the practice measure and 4 on both the student achievement measures - or 3 on the practice measure and 3 on at least one of the student achievement measures
2 | 2 on the practice measure and 2 on either of the student achievement measures
1 | 1 on the practice measure and 1 on either student achievement measure

Teacher Evaluation in Tennessee
From Race to the Top to First to the Top
Educating Our Children, Engaging Our Parents, Empowering Our Schools

Evaluation
The ultimate goal of all teacher assessments and evaluations should be… TO IMPROVE TEACHING AND LEARNING

First to the Top Law on Evaluation
• Requires annual evaluation of all teachers and principals
• 50% student achievement data: 35% TVAAS where available, 15% other objective measures
• 50% other qualitative data include:
  – Review of prior evaluations
  – Personal conferences re: strengths, weaknesses and remediation
  – For teachers, classroom or position observation followed by written assessment
  – For principals, additional criteria pursuant to their employment contract

General Guidelines
• Evaluations will be used to inform human resource decisions, including but not limited to:
  – Tenure and dismissal
  – Compensation
  – Assignment and promotion
  – Hiring
  – Professional development
• LEAs may develop alternative evaluation procedures which must be approved according to policies and rules adopted by the SBE.

Categories of Educators
• Teachers with TVAAS data
• Teachers without TVAAS data
  – untested subjects
  – untested grades
• Library Information Specialists
• Special Groups
  – counselors
  – social workers
  – non-classroom educators
• Principals
  – assistant principals
• Not included in TEAC authority: central office staff

Criteria for Evaluations
Educator Evaluation:
• 35% Student Growth
• 15% Student Achievement
• 50% Other Criteria

50% Quantitative Data
Teachers
  35% Student Growth
  • TVAAS where available
  • School-wide TVAAS for all other teachers
  • Developing alternative growth measures for non-tested subjects/grades
  15% Student Achievement
  • Selected from “menu of options” adopted/approved by SBE
Principals
  35% Student Growth
  • School-wide TVAAS
  15% Student Achievement
  • Selected from “menu of options” adopted/approved by SBE

Growth Measures for Non-tested
• TDE convened educator workgroups in 12 areas of non-tested subjects and grades.
• Teams provided recommendations in February 2011.
• All recommendations are being vetted by the TDE and a technical advisory committee to determine validity, reliability and feasibility.
• Until such measures are available, educators in non-tested subjects and grades will be evaluated using a TVAAS composite score for the growth component.
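
The 35% / 15% / 50% criteria above amount to a weighted average of component ratings, which is the same arithmetic as the compensatory model presented earlier in the deck. The sketch below is only an illustration under stated assumptions: the slides do not say how each component is scaled or how a composite is turned into a final rating, so the 1-5 scale, the function name `weighted_composite`, and the sample ratings are all hypothetical.

```python
from typing import Mapping

# Weights from the Criteria for Evaluations slide.
TN_WEIGHTS = {
    "student_growth": 0.35,
    "student_achievement": 0.15,
    "other_criteria": 0.50,
}


def weighted_composite(ratings: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted average of component ratings.

    Assumes every rating is already on the same numeric scale
    (an assumption; the slides do not specify the scaling step).
    """
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same components")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical educator, each component rated on an assumed 1-5 scale.
example_ratings = {
    "student_growth": 3,
    "student_achievement": 4,
    "other_criteria": 5,
}
composite = weighted_composite(example_ratings, TN_WEIGHTS)
print(round(composite, 2))

# The compensatory-model slide's own example (ratings 2, 3, 4 with
# weights 25%, 25%, 50%) reproduces its stated total of 3.25.
slide_total = weighted_composite(
    {"state_test": 2, "short_cycle": 3, "practice": 4},
    {"state_test": 0.25, "short_cycle": 0.25, "practice": 0.50},
)
print(slide_total)
```

Because the weights are passed in rather than hard-coded, the same function covers equal, policy-based, or reliability-proportional weighting; only the weight dictionary changes.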

15% Student Achievement
• For the 15% achievement portion of the teacher evaluation, the State Board approved a menu of options from which teachers may choose, in cooperation with their administrator, by October 1.
• The chosen measures should reflect the educator’s primary responsibility as directly as possible.
• Top 3 quintiles may use TVAAS score.
• Measures are under review for appropriateness and scalability.

Qualitative Appraisals
• For teachers the qualitative appraisal instrument must address the following domains:
  – Instruction
  – Planning
  – Environment
  – Professionalism
• For principal/assistant principal the qualitative appraisal instrument will be based on Tennessee Instructional Leadership Standards (TILS).

Outlining the process
• TDE to provide user-friendly, manageable forms to document observations and personal conferences
• Future goal: all forms and data entry will be done electronically
• Minimum 4 observations for professional licensed teachers (2/semester)
• Minimum 6 observations for other licensure categories (3/semester)
• Feedback from observation visits:
  – Detailed feedback, highlighting areas of strength and refinement
  – At least ½ of all observations must be unannounced
  – Written feedback within a week
  – In-person debrief scheduled within a week

Guidelines for the Evaluations
Category | 35% Student Growth | 15% Student Achievement | 50% Other Mandatory Criteria (Minimums)
--- | --- | --- | ---
Teachers with TVAAS | Individual TVAAS score | Menu of options; top 3 quintiles may use TVAAS score | Multiple sources; 4 observations for professional licensed, 2/semester, minimum 60 minutes annually; at least half unannounced
Teachers without TVAAS | School-wide value-added; other identified or developed measures | Menu of options; top 3 quintiles may use TVAAS score or growth score | Multiple sources; 4 observations for professional licensed, 2/semester, minimum 60 minutes annually; at least half unannounced
Apprentice Licensed Teachers | Individual TVAAS scores; TVAAS composite; other identified or developed measures | Menu of options; top 3 quintiles may use TVAAS score or growth score | Multiple sources; 6 observations, 3/semester, minimum 90 minutes annually (also other nonprofessional licenses)
Principals, Assistant Principals | School-wide value-added | Menu of options; top 3 quintiles may use TVAAS score | Multiple sources; 2 onsite observations; qualitative appraisal based on TILS, review of teacher evaluation quality; surveys
Special Groups | School-wide value-added; menu of options; other identified … | Menu of options | Multiple sources; 4 observations, 2/semester, minimum 60 minutes annually; at …

Evaluations will differentiate educators into five effectiveness groups:

State Model
• The Tennessee Educator Acceleration Model (TEAM) has been adopted as the state evaluation model.
• TEAM utilizes the TAP rubric for observations.
• TEAM observers must complete a four-day training session and pass an online test to be certified as observers.

Other Evaluation Models
Alternative evaluation models developed and adopted:
• Memphis: Teacher Effectiveness Measure (Gates-supported, based on IMPACT model)
• Hamilton County: Project COACH
• Association of Independent and Municipal Schools (AIMS): Teacher Instructional Growth for Effectiveness and Results (TIGER)

Evaluation Appeals Process
Teachers may appeal:
1) Accuracy of data used in evaluation
2) Adherence to evaluation policies adopted by SBE

Evaluation Appeals Process
Three-step process:
1) 15 days to appeal to evaluator, who has 15 days to issue decision in writing
2) 15 days to appeal to director of schools or designee, who has 15 days to issue a written decision
3) 15 days to appeal to school board (final step), which has 30 days to conduct a hearing and 30 days to render a decision

Short Summary State Action
• More than half the states have enacted legislation changing how teachers are evaluated
• All require a combination of indicators including:
  – Measures of instructional practice
  – Student achievement data
    • State accountability test data
    • Other test data, which usually can include short cycle assessment data
  – Short cycle can comprise up to 35% of the data on student learning, so are important options

Advantages of Short Cycle Data
• Multiple kinds:
  – Renaissance Learning STAR assessments
    • online administration for immediate feedback, can be administered monthly, online instructional help
  – Several others – AIMSweb, NWEA MAP, etc.
• Designed in the first instance to help teachers improve their instructional practice
• Gives formative feedback during the year on how the class is doing
• So short cycle assessments, designed to help teachers be more effective, can now also be used to measure teacher effectiveness

Contact Information
Dr. Allan Odden, University of Wisconsin-Madison
arodden@lpicus.com
arodden@wisc.edu
Dr. Damian Betebenner, Center for Assessment
dbetebenner@nciea.org
Al Mance, Tennessee Education Association
amance@tea.nea.org
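
The growth-curve idea from the short-cycle slides (more data points allow estimation of a growth curve, and the slope of the average growth curve gives another indicator) can be illustrated with an ordinary least-squares slope per student, averaged over a class. This is a sketch under assumptions, not the method used in the webinar: the slides do not name an estimation technique, and the function names, assessment schedule, and scores below are hypothetical.

```python
from statistics import mean
from typing import Sequence


def ols_slope(times: Sequence[float], scores: Sequence[float]) -> float:
    """Least-squares slope of scores against time (score points per time unit)."""
    if len(times) != len(scores) or len(times) < 2:
        raise ValueError("need at least two matched (time, score) points")
    t_bar, s_bar = mean(times), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("assessment times must not all be identical")
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, scores))
    return sxy / sxx


def class_growth_slope(times: Sequence[float],
                       class_scores: Sequence[Sequence[float]]) -> float:
    """Average of the per-student slopes for one classroom."""
    return mean(ols_slope(times, student) for student in class_scores)


# Hypothetical data: five assessments, six weeks apart, within one school year.
weeks = [0, 6, 12, 18, 24]
class_scores = [
    [400, 412, 424, 436, 448],  # gains 2 scale points per week
    [380, 386, 392, 398, 404],  # gains 1 scale point per week
]
average_slope = class_growth_slope(weeks, class_scores)
print(average_slope)
```

All assessments here fall inside a single school year, which matches the point in the slides that a within-year growth curve avoids summer effects. Comparing the slope to a normed growth trajectory would need reference data that this sketch does not include.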