Understanding the Purpose of Educator and School Growth
Collaborative Conference for Student Achievement

Part I: Transition from ABCs to EVAAS
Dr. Tammy Howard, NCDPI Director of Accountability Services

ABCs of Public Education
• Implemented in 1996 for K–8 schools only
• High school inclusion
  – 5 EOCs in 1997–98
  – Additional 5 EOCs in 1998–99
  – Prediction model in 2000–01

ABCs of Public Education
• Balance of growth and proficiency
• Schools with no growth, regardless of percent proficient, designated as No Recognition
• Previous performance compared to current performance

ABCs of Public Education
• Academic change (growth) is expressed as the difference between a student’s actual c-scale score for the current year and the student’s average of two (in most cases) previous assessments (EOGs and EOCs), with a correction for regression toward the mean.

ABCs of Public Education
• AC = CS_c-scale – (0.92 × ATPA_c-scale)
  – AC = academic change
  – CS = current score
  – ATPA = average of two previous assessment scores
• A positive academic change = gain in academic achievement; a negative academic change = loss

ABCs of Public Education
• Advantages
  – Student-level analysis
  – Easily replicated
• Disadvantage
  – School-level designations considered not transparent

New Accountability Model
• Recommendation of the Blue Ribbon Testing and Accountability Commission to design a new model
  – Transparency
  – Implemented in 2012–13
• School Performance Grades
  – A–F designations

Integrating EVAAS
• Same growth model for school accountability as for educators
  – Consistency
• However, the school model includes EOGs/EOCs, and the educator model includes EOGs/EOCs/NC Final Exams/CTE post-assessments

Part II: EVAAS
Jill Leandro, SAS Education Policy Specialist

Introduction to EVAAS
• History of EVAAS
• Implementation of EVAAS in North Carolina
• Implications for districts, schools, and educators

Beginning of EVAAS
• In the early 1980s, the EVAAS approach to measuring growth was founded at the
University of Tennessee, Knoxville by Dr. William Sanders.
  – The EVAAS (or TVAAS in Tennessee) approach overcame many non-trivial statistical issues associated with measuring student growth.
• In 1993, TVAAS released district-level reports statewide.
• In 1994, TVAAS released school-level reports statewide.
• In 1996, TVAAS released teacher-level reports statewide.
• Key research from the early years revolutionized the way educators and policymakers viewed schooling effectiveness and the ability of students to make growth.

EVAAS in North Carolina
• In 2001, the EVAAS team moved to SAS.
• In 2005, EVAAS was implemented in pilot districts in the state as a school improvement resource.
• In 2006, EVAAS was implemented statewide as a school improvement resource.
• In 2012, EVAAS became a formal part of the state’s teacher evaluation and accountability after recommendation by WestEd and UNC researchers.

What is available in EVAAS?
• Dozens of reports for use in school improvement
  – Reflective analytics, such as value-added and diagnostic reports for districts, teachers, and schools
  – Proactive analytics, such as student projections
  – Comparison reports, such as value-added summary and scatterplots
• Roster verification for the student-teacher linkages in teacher value-added reports
• Help supports, such as video clips, an online ticketing system, and help pages
• Available through a secure web application with customized access

What data are used in EVAAS?
• Student assessment data
  – EOGs, EOCs, NCFE, CTE, mCLASS, and SAT/ACT
• Student-teacher linkages (for teacher reports)
• What about demographic/socioeconomic flags?

How does EVAAS measure growth?
• It depends on the test.
• Gain-based model for tests given in consecutive grades, such as EOG math and reading in grades 4–8
  – Based on students’ entering achievement, what is the change in achievement from one year to the next?
• Predictive-based model for all other tests
  – Based on students’ prior testing history, what is the difference between students’ expected score and observed score?

Advantages of both models
• Use all available testing history for each student to minimize the impact of measurement error
• Include students who have missing test scores
  – For the predictive model, students must have three prior test scores in any grade/subject.
• Incorporate team teaching or other shared instructional practices for teacher reports
• Use standard errors to address the uncertainty inherent in any growth model and protect against misclassification

What is expected growth?
• The precise definition depends on the model, but the general idea is that the actual performance of students in the current year determines the growth expectation for the current year.

Example for predictive-based model
• Student Growth = Average Observed Score – Average Expected Score
• How is each student’s “expected score” determined?
  – Start from Student A’s testing history.
  – Find students with a similar testing history to Student A.
  – Ask: on average, how did all students like Student A perform?
  – The answer is Student A’s expected score.

Example for gain-based model
• Student Growth = Change in achievement over time for a group of students
• Starting from the 50th percentile in Year 1, a group in Year 2 could be at the:
  – 55th percentile – achievement rose
  – 50th percentile – achievement stayed the same
  – 45th percentile – achievement decreased

Growth sounds simple, right?
• The concept is simple, but the implementation is more complicated.
  – Definition of growth by year
  – Intra-year versus base year
  – Gain-based model reported in Normal Curve Equivalents (NCEs)
  – Uncertainty/standard errors
• Some of these are technical issues related to working with student testing data, and some are policy decisions made by NCDPI.

Use of EVAAS in North Carolina
• The original goal of EVAAS was to be a resource for school improvement.
• This is still the intent behind the comprehensive website.
• With the use of EVAAS in more formal applications, there is more interest and there are more questions.
• Common questions include…

Is growth dependent on the students served?
• Is EVAAS fair to educators, even if they serve students who are
  – Economically disadvantaged?
  – High-achieving?
  – Low-achieving?
• The answer is YES to all.

[Scatterplots, one per slide; each dot represents a school]
• Achievement vs. % Students Testing as Econ. Disadvantaged
• Growth vs. % Students Testing as Econ. Disadvantaged
• Growth vs. % Students Testing as Minority
• Growth vs. % Students Testing as AIG (Math)
• Growth vs. % Students Testing as LEP
• Growth vs. % Students Testing with Disabilities
• Growth vs. Achievement
Source for all plots: NC EVAAS 2013–14 data by school for EOG Math across grades.

How can educators be ineffective when all their students passed the test?
[Figure: student achievement levels (Advanced, Proficient) at the start and end of the school year]

How can educators be very effective when none of their students passed the test?
[Figure: student achievement levels (Advanced, Proficient, Not Proficient) at the start and end of the school year]

Growth is not achievement.
[Figure: student achievement levels (Advanced, Proficient, Not Proficient) at the start and end of the school year]

Can all educators meet expected growth?
• No, but there is not a designated distribution of districts/schools/teachers in the three categories (did not meet, meets, and exceeds expected growth).
• Why is this the case?
• Is this fair to educators?

How can EVAAS be used with new tests?
• Neither the predictive model nor the current gain-based model requires continuity in scaling to measure growth, because of the intra-year growth expectation.

Can you tell me how much each student needs to learn in advance of the test?
• EVAAS makes student-level projections available for tests that have not yet been taken.
• The growth expectation in all models for a given year is based on actual student performance in that year, so it cannot be given in advance.

Part III: Educator Effectiveness
Dr. Tom Tomberlin, NCDPI Director of Educator Effectiveness

Student Growth Data: EVAAS Ratings 2014
• Does Not Meet Expected Growth: 6,574 (16%)
• Meets Expected Growth: 25,967 (64%)
• Exceeds Expected Growth: 8,270 (20%)

Weight of Standards
• The six standards (eight for principals) are weighted equally in the determination of teachers’ effectiveness ratings.
• In practice, however, student growth carries much more weight in differentiating teachers in terms of effectiveness.
• Nominal vs. Effective Weighting

Observation and EVAAS Ratings
Columns: Status (Observation). Rows: Status (Obs + Growth).
Status | Needs Improvement | Effective | Highly Effective | Total
Needs Improvement | 506 | 4,119 | 1,904 | 6,529 (16.1%)
Effective | 836 | 12,985 | 11,940 | 25,761 (63.6%)
Highly Effective | 93 | 3,124 | 5,014 | 8,231 (20.3%)
Total | 1,435 (3.5%) | 20,228 (49.9%) | 18,858 (46.5%) | 40,521

Weight of Standards
• Standard 6, student growth, plays a greater role in determining teacher effectiveness ratings than observational data.
• The disproportionate effect of student growth is an artifact of the lack of variation in observational data, not a value judgment.
• More accurate assessment of teacher performance can reduce this effect.
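
The arithmetic behind the "Observation and EVAAS Ratings" cross-tabulation above can be checked with a short sketch. It uses only the nine cell counts printed on the slide and recomputes the row, column, and grand totals from them. It also counts how many teachers carry the same label in both views, which is one rough way to see how much the growth standard moves ratings. The reading of rows and columns follows the slide header and is an assumption, as are the variable names.

```python
# Cell counts copied from the "Observation and EVAAS Ratings" slide.
# Assumption: columns are the observation-only status and rows are the
# status once growth is included, as the slide header suggests.
LEVELS = ["Needs Improvement", "Effective", "Highly Effective"]
CELLS = [
    [506, 4119, 1904],    # row: Needs Improvement
    [836, 12985, 11940],  # row: Effective
    [93, 3124, 5014],     # row: Highly Effective
]

# Totals are recomputed from the cells rather than copied from the slide.
row_totals = [sum(row) for row in CELLS]
col_totals = [sum(col) for col in zip(*CELLS)]
grand_total = sum(row_totals)

# Teachers on the diagonal have the same label in both views.
unchanged = sum(CELLS[i][i] for i in range(len(LEVELS)))
unchanged_share = 100 * unchanged / grand_total

for level, r, c in zip(LEVELS, row_totals, col_totals):
    print(
        f"{level:18s} row total {r:6d} ({100 * r / grand_total:4.1f}%)  "
        f"column total {c:6d} ({100 * c / grand_total:4.1f}%)"
    )
print(f"Same category in both views: {unchanged} of {grand_total} "
      f"({unchanged_share:.1f}%)")
```

Under these assumptions, fewer than half of the teachers keep the same category in both views, which is consistent with the slide's point that growth does most of the differentiating.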
EVAAS across multiple years
Rows: 2012 EVAAS rating. Columns: 2013 EVAAS rating, then 2014 EVAAS rating.
2012 EVAAS Rating | 2013: Does Not Meet | 2013: Meets | 2013: Exceeds | 2014: Does Not Meet | 2014: Meets | 2014: Exceeds | Total
Does Not Meet Expected Growth | 1,325 (43.7%) | 1,475 (48.6%) | 234 (7.7%) | 1,114 (36.7%) | 1,627 (53.6%) | 293 (9.7%) | 3,034
Meets Expected Growth | 1,686 (14.1%) | 8,159 (68.0%) | 2,152 (17.9%) | 1,405 (11.7%) | 8,315 (69.3%) | 2,277 (19.0%) | 11,997
Exceeds Expected Growth | 172 (4.8%) | 1,553 (40.4%) | 2,119 (55.1%) | 151 (3.9%) | 1,652 (43.0%) | 2,041 (53.1%) | 3,844
Total | 3,183 | 11,187 | 4,505 | 2,670 | 11,594 | 4,611 | 18,875

General Method of Estimation
1. Student raw score (26/50)
2. Conversion to scale score (250)
3. Difference between expectation and actual scale score (240 vs. 250)
4. The positive 10 scale score points are compared to the distribution at the state level for that grade and subject (e.g., 65th percentile).
5. The percentile rank is converted to an NCE (~58 NCE).
6. The deviation from the mean (50th NCE) for each student is aggregated at the teacher (or school, or district) level.
7. The mean difference is the teacher effect, which has an associated standard error.
8. The teacher effect is divided by the standard error to create the index.
   – Index >= 2: Exceeds
   – -2 <= Index < 2: Meets
   – Index < -2: Does Not Meet

Teacher- and School-Level Growth
Student | Teacher 1 | Teacher 2 | Teacher 3
Student A | -1.9 | -2.2 | -0.4
Student B | -1.2 | -1.6 | -0.5
Student C | -0.7 | 1.6 | -1.6
Student D | -1.0 | -1.2 | -0.7
Student E | 0.5 | -1.5 | -1.2
Mean | -0.9 | -1.0 | -0.9
Std Dev | 0.9 | 1.5 | 0.5
Std Error | 0.5 | 0.5 | 0.5
Index | -1.7 | -2.0 | -1.8

Teacher- and School-Level Growth
School A (the same 15 students from Teachers 1–3, pooled)
Mean | -0.9
Std Dev | 1.0
Std Error | 0.3
Index | -3

Proficiency and EVAAS
• How can a school increase proficiency rates by X percentage points but not meet growth?
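
The index arithmetic in the two "Teacher- and School-Level Growth" slides above can be sketched as follows. The student values and the standard errors (0.5 per teacher, 0.3 for the school) are taken from the slides as given, not recomputed. The cut points of +2 and -2 follow the "General Method of Estimation" slide. The function and variable names are illustrative assumptions, not part of EVAAS.

```python
from statistics import fmean

# Student-level growth values copied from the slides.
TEACHERS = {
    "Teacher 1": [-1.9, -1.2, -0.7, -1.0, 0.5],
    "Teacher 2": [-2.2, -1.6, 1.6, -1.2, -1.5],
    "Teacher 3": [-0.4, -0.5, -1.6, -0.7, -1.2],
}
# Standard errors as printed on the slides (assumed inputs, not derived here).
TEACHER_STD_ERROR = 0.5
SCHOOL_STD_ERROR = 0.3


def growth_index(values, std_error):
    """Mean growth divided by its standard error."""
    return fmean(values) / std_error


def classify(index):
    """Map an index to a growth category using cut points of +2 and -2."""
    if index >= 2:
        return "Exceeds"
    if index < -2:
        return "Does Not Meet"
    return "Meets"


teacher_index = {
    name: growth_index(values, TEACHER_STD_ERROR)
    for name, values in TEACHERS.items()
}

# The school pools all 15 students, so its standard error is smaller.
school_values = [v for values in TEACHERS.values() for v in values]
school_index = growth_index(school_values, SCHOOL_STD_ERROR)

for name, index in teacher_index.items():
    print(f"{name}: index {index:.1f} -> {classify(index)}")
print(f"School A: index {school_index:.1f} -> {classify(school_index)}")
```

The mean growth is about the same at both levels. The school's standard error is smaller because it rests on more students, so the same average shortfall yields an index below -2 for the school even though no single teacher falls below it.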
Student | Prior Year (Expected) Score | Prior Year NCE | Current Year Score | Current Year NCE | Growth
Student A | 229 | 54 | 231 | 55 | 1
Student B | 250 | 62 | 238 | 57 | -5
Student C | 255 | 64 | 236 | 56 | -8
Student D | 226 | 53 | 230 | 54 | 1
Student E | 228 | 54 | 232 | 55 | 1
Student F | 243 | 59 | 235 | 56 | -3
Student G | 225 | 52 | 230 | 54 | 2
Student H | 231 | 55 | 230 | 54 | -1
Student I | 227 | 53 | 220 | 50 | -3
Student J | 235 | 56 | 230 | 54 | -2
Proficiency = 230
Prior Year % Proficient = 50%
Current Year % Proficient = 90%
Growth: Mean -1.7 | Std Dev 3.2 | Std Error 0.8 | Index -2.1

General Method of Estimation for K-2 Assessments
1. Students are assessed in mCLASS at BOY, MOY, and EOY.
2. Students’ EOY results are compared to BOY (MOY for kindergarten).
3. Differences between EOY and BOY assessments are interpreted as gain scores.
4. For each district/school/teacher, average entering achievement is compared to ending achievement to generate a growth measure.
5. Each growth measure has its own standard error.
6. Each growth measure is divided by its standard error to create an index or apply color coding.
   – Index >= 2: Exceeds. Significant evidence that students made more growth than their peers statewide.
   – -2 <= Index < 2: Meets. Evidence does not strongly suggest that students exceeded or fell short of the growth standard.
   – Index < -2: Does Not Meet. Significant evidence that students made less growth than their peers statewide.

Why don’t we use BOY – EOY for kindergarten?

APPENDIX
TEACHER MRM (ORIGINAL SCALE)
Benchmark Periods | Grade | Does Not Meet Expected Growth | Meets Expected Growth | Exceeds Expected Growth | Total
BOY to EOY | K | 59 (21.15%) | 174 (62.37%) | 46 (16.49%) | 279
BOY to EOY | 1 | 59 (21.61%) | 159 (58.24%) | 55 (20.15%) | 273
BOY to EOY | 2 | 51 (19.10%) | 168 (62.92%) | 48 (17.98%) | 267
BOY to EOY | Across | 0 (0.00%) | 5 (83.33%) | 1 (16.67%) | 6
BOY to MOY | K | 54 (19.35%) | 177 (63.44%) | 48 (17.20%) | 279
BOY to MOY | 1 | 51 (18.75%) | 165 (60.66%) | 56 (20.59%) | 272
BOY to MOY | 2 | 49 (18.42%) | 170 (63.91%) | 47 (17.67%) | 266
BOY to MOY | Across | 0 (0.00%) | 5 (83.33%) | 1 (16.67%) | 6
MOY to EOY | K | 61 (21.79%) | 166 (59.29%) | 53 (18.93%) | 280
MOY to EOY | 1 | 53 (19.41%) | 171 (62.64%) | 49 (17.95%) | 273
MOY to EOY | 2 | 48 (17.98%) | 177 (66.29%) | 42 (15.73%) | 267
MOY to EOY | Across | 0 (0.00%) | 4 (66.67%) | 2 (33.33%) | 6
Copyright © 2014, SAS Institute Inc. All rights reserved.

Proficiency vs. Growth
mCLASS: Reading 3D Revised Harcourt Rigby Text Reading and Comprehension (TRC) Cut Points
Level | K BOY | K MOY | K EOY | Grade 1 BOY | Grade 1 MOY | Grade 1 EOY | Grade 2 BOY | Grade 2 MOY | Grade 2 EOY
Above Proficient | C or above | D or above | E or above | E or above | I or above | L or above | L or above | M or above | O or above
Proficient | RB to B | C | D | D | G to H | J to K | J to K | L | M to N
Below Proficient | PC | RB to B | C | C | F | H to I | H to I | J to K | L
Far Below Proficient | <PC | PC or below | B or below | B or below | E or below | G or below | G or below | I or below | K or below

Proficiency vs. Growth
• How can a teacher with no proficient students meet or exceed expected growth?
• How can a teacher who has all students meeting proficiency not show high growth?
• How can a teacher with declining proficiency show growth that meets or exceeds expectation?

Growth with no proficiency
[Chart: BOY, MOY, and EOY series plotted across TRC levels PC-, PC, PC+, RB, RB+, B, B+, C, C+, D, D+]

Proficiency with low growth
[Chart: BOY, MOY, and EOY series plotted across TRC levels PC-, PC, PC+, RB, RB+, B, B+, C, C+, D, D+]

Growth with declining proficiency
[Chart: BOY, MOY, and EOY series plotted across TRC levels PC-, PC, PC+, RB, RB+, B, B+, C, C+, D, D+; labels on the chart read 100%, 50%, and 15%]

Proficiency vs. Growth
• Proficiency and growth are two unrelated events.
• Don’t let the attainment of proficiency distract teachers from generating maximum growth with their students.
• A change management strategy may be needed to help with the shift from a “proficiency culture” to a “growth culture”.

Teacher Evaluation in NC
• The North Carolina Educator Effectiveness System (NCEES) has six standards of performance for teachers and eight standards for principals.
• NC has a conjunctive model, meaning that teachers and principals must be proficient (or better) on all standards in order to receive an overall effectiveness rating.
We do not average or index these standards.
• Unlike the observational standards, student growth (Standard 6 for teachers, Standard 8 for principals) requires three years of valid data in order to generate a rating.

Standards 6 & 8 – The Basics
Teachers
  1. Demonstrate Leadership
  2. Establish Environment
  3. Know Content
  4. Facilitate Learning
  5. Reflect on Practice
  6. Contribute to Academic Success
Principals (and other Administrators)
  1. Strategic Leadership
  2. Instructional Leadership
  3. Cultural Leadership
  4. Human Resource Leadership
  5. Managerial Leadership
  6. External Development Leadership
  7. Micropolitical Leadership
  8. Academic Achievement Leadership

3-Year Rolling Average (Teacher)
Standard 6, Contribute to Academic Success:
  – Rating from 2 years ago: 1.0 (Met Expected Growth)
  – Rating from 1 year ago: -2.5 (Did Not Meet Expected Growth)
  – Rating from this year: 1.2 (Met Expected Growth)
3-year average rating on Standard 6 for determining status: (1.0 + (-2.5) + 1.2) / 3 = -0.1, Met Expected Growth
Note: A similar methodology applies to principals as well.

Teacher Status 1: In Need of Improvement
• Standards 1–5 (Demonstrate Leadership, Establish Environment, Know Content, Facilitate Learning, Reflect on Practice), in the year: any rating lower than proficient
and/or
• Standard 6, three-year rolling average, (2 years ago + 1 year ago + this year) / 3: Does Not Meet Expected Growth

Teacher Status 2: Effective
• Standards 1–5 (Demonstrate Leadership, Establish Environment, Know Content, Facilitate Learning, Reflect on Practice), in the year: Proficient or higher on Standards 1–5
and
• Standard 6, three-year rolling average, (2 years ago + 1 year ago + this year) / 3: Meets or Exceeds Expected Growth

Teacher Status 3: Highly Effective
• Standards 1–5 (Demonstrate Leadership, Establish Environment, Know Content, Facilitate Learning, Reflect on Practice), in the year: Accomplished or higher on Standards 1–5
and
• Standard 6, three-year rolling average, (2 years ago + 1 year ago + this year) / 3: Exceeds Expected Growth

Teacher Status – First Status
• For all teachers (and principals), the first status for Standard 6 will be generated from the best two out of three valid Standard 6 ratings.
• School-level growth that has been assigned to a teacher as a result of a waiver (from NCFEs or ASW) will function as a valid Standard 6 rating.
• School-level growth that has been assigned as a result of a lack of data for a teacher (i.e., not from a waiver) will not count as a valid Standard 6 rating.

Status Scenarios
Standard 6 ratings:
  – 2012–13: 1.0 (Met Expected Growth)
  – 2013–14: -2.5 (Did Not Meet Expected Growth)
  – 2014–15: 1.2 (Met Expected Growth)
• Teacher has individual-level data for three years.
• Standard 6 from the 2013–14 school year is the lowest of the three ratings.
• Teacher’s Standard 6 status is 1.1, the average of the best two ratings (1.0 and 1.2) – “Meets Expected Growth”.

Status Scenarios
Standard 6 ratings:
  – 2012–13: 1.0 (Met Expected Growth)
  – 2013–14: -2.5 (Did Not Meet Expected Growth)
  – 2014–15: 1.2 (Met Expected Growth)
• Teacher has individual-level data for the first two years.
• The 2014–15 data is school-level growth from a waiver.
• Standard 6 from the 2013–14 school year is the lowest of the three ratings.
• Teacher’s Standard 6 status is 1.1 – “Meets Expected Growth”.

Status Scenarios
Standard 6 ratings:
  – 2012–13: 1.0 (Met Expected Growth)
  – 2013–14: -2.5 (Did Not Meet Expected Growth)
  – 2014–15: 1.2 (Met Expected Growth)
• Teacher has individual-level data for the final two years.
• The 2012–13 data is school-level growth because the teacher did not have individual-level data.
• The teacher does not receive a status in the fall of 2015 because the teacher does not have 3 years of valid data.
• First status in fall 2016 (provided the teacher has valid data in SY 2015–16).

Status Scenarios – Second Year
Standard 6 ratings:
  – 2012–13: 1.0 (Met Expected Growth)
  – 2013–14: -2.5 (Did Not Meet Expected Growth)
  – 2014–15: 1.2 (Met Expected Growth)
  – 2015–16: 4.0 (Exceeded Expected Growth)
• Teacher receives second status in fall of 2016.
• Rating from 2012–13 “rolls off”.
• Rating from 2013–14 returns to the rolling average (even though it was dropped from the prior year’s calculation).
• Teacher’s status is “Meets Expected Growth” with an average of 0.9, i.e., (-2.5 + 1.2 + 4.0) / 3.
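
The status scenarios above can be reproduced with a minimal sketch, under a few stated assumptions. The first status averages the best two of three valid ratings. Later statuses use the plain three-year rolling average. A rating assigned for lack of teacher data is flagged as not valid. The sketch also assumes the averaged value is classified with the same +2 and -2 cut points as the index, which the slides imply but do not state outright. The `Rating` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rating:
    year: str
    index: float
    # False when school-level growth was assigned for lack of teacher data;
    # school-level growth assigned through a waiver still counts as valid.
    valid: bool = True


def _last_three_valid(ratings: Sequence[Rating]) -> Optional[list]:
    """Return the three most recent ratings if all three are valid."""
    last3 = list(ratings)[-3:]
    if len(last3) < 3 or not all(r.valid for r in last3):
        return None
    return last3


def first_status(ratings: Sequence[Rating]) -> Optional[float]:
    """First status: average of the best two of three valid ratings."""
    last3 = _last_three_valid(ratings)
    if last3 is None:
        return None
    best_two = sorted(r.index for r in last3)[-2:]
    return sum(best_two) / 2


def rolling_status(ratings: Sequence[Rating]) -> Optional[float]:
    """Later statuses: plain average of the three most recent valid ratings."""
    last3 = _last_three_valid(ratings)
    if last3 is None:
        return None
    return sum(r.index for r in last3) / 3


def classify(value: float) -> str:
    """Assumed cut points of +2 and -2 applied to the averaged value."""
    if value >= 2:
        return "Exceeds Expected Growth"
    if value < -2:
        return "Does Not Meet Expected Growth"
    return "Meets Expected Growth"


history = [
    Rating("2012-13", 1.0),
    Rating("2013-14", -2.5),
    Rating("2014-15", 1.2),
    Rating("2015-16", 4.0),
]

fall_2015 = first_status(history[:3])   # best two of 1.0, -2.5, 1.2
fall_2016 = rolling_status(history)     # 2012-13 rolls off

# Third scenario: the 2012-13 rating came from a lack of teacher data.
no_data_history = [Rating("2012-13", 1.0, valid=False)] + history[1:3]
fall_2015_no_data = first_status(no_data_history)

print(round(fall_2015, 1), classify(fall_2015))
print(round(fall_2016, 1), classify(fall_2016))
print(fall_2015_no_data)
```

Keeping validity as a flag on each rating, rather than dropping invalid years, makes the third scenario explicit: the teacher has three ratings on file but no status until three valid ones exist.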