Memorandum

To: REPORTER
From: Elaine Weiss, National Coordinator, BBA
Re: Concerns regarding DCPS/OSSE claims of gains in student proficiency
Date: July 9, 2014

Within the next few weeks, the District of Columbia Public Schools (DCPS) and the Office of the State Superintendent for the District of Columbia (OSSE) will release selected data from the 2014 DC Comprehensive Assessment System (DC-CAS) that they will likely assert demonstrate an increase in student proficiency. They will likely also claim substantial gains for low-income and minority students and, possibly, progress in closing race- and income-based test score gaps, as they did last year based on 2013 DC-CAS results.i

Building on its 2013 research for the report Market-Oriented Reforms’ Rhetoric Trumps Reality, the Broader, Bolder Approach to Education (BBA) is producing a report explaining why these gains are exaggerated and, in some cases, non-existent, and how a lack of data transparency, combined with cherry-picking of specific numbers, has enabled DCPS and OSSE to paint a false picture of progress. Moreover, our report will show clearly that low-income and minority DCPS students (and other groups of disadvantaged students) have, in fact, lost ground to their more advantaged peers in the past few years under Chancellors Michelle Rhee and Kaya Henderson. The report will also explain how excessive pressure has contributed to this gaming of scores, as well as the multiple negative consequences for students, teachers, and the system as a whole.

We felt that it was critically important, however, that reporters have this basic information on hand prior to the upcoming release of 2014 scores.
As last year’s release and Council Member David Catania’s attempt to shed light on many inconsistencies and holes in the data demonstrate, trying to correct false information after the fact tends to be ineffective.ii We thus use this memorandum to provide you with some of the more critical data points, backed by charts, figures, and sources, as well as a set of questions that you should ask when the public release takes place. We will follow up this memo with a comprehensive, fully sourced report after all the 2014 data are made available.

The purpose of this report is not to criticize DCPS for intentional bad practices, but rather to illustrate, via an example in a high-profile district, the types of conflicts and problems that inevitably arise when undue pressure is put on student standardized tests. Our hope is that shedding light on the consequences of poorly conceived federal policies, misguided philanthropic contributions, and other pressure will spur a balanced and thoughtful discussion of more effective strategies that would boost all students and their communities, rather than sustaining and exacerbating existing disparities. We would welcome the chance to work in partnership with the city and school system leadership to develop such a system.

Memorandum: Questioning Asserted Gains in DC-CAS Proficiency Page 1

DCPS claims of increases in proficiency are undermined by scale score trends 2009-2013 and by disaggregation of data

Cut scores – levels of “basic,” “proficient,” and “advanced” – vary by test and year. The DC-CAS definition of proficiency, for example, is much lower than that of the National Assessment of Educational Progress (NAEP).iii As DCPS itself notes, however, cut scores for these standards are based on scale scores, which are the most valid way to assess gains in student learning over time, as they are designed to be consistent within grades and subjects across years.
As such, DCPS claims of “historic” gains in students who are “proficient” and “advanced” should be reflected in large increases in scale scores, which are the basis for the cut scores. They are not. Mean scale scores range from a low of 45.81 (2010 3rd grade reading among African American students) to highs of over 70 (among white students in math).iv As per the data that are illustrated in Figure A, 2009-2013 gains in reading scale scores are minimal (they range from -0.25 points in 6th grade to 2.77 points in 4th grade, for an average of 1.6 points). Math gains are slightly larger, with an outsized 6.25-point gain among 6th graders raising the average to 2.99.v Given the 99-point scale for tests, these are far too small to support large proficiency gains.vi

In addition to claiming substantial overall growth in proficiency, DCPS asserts that lower-income and minority students have made large gains, and that they are closing achievement gaps. For example, the report on 2013 DC-CAS results states “Our lowest-performing schools narrowed the gap with other schools.”vii It also claims that students in Wards 7 and 8 made some of the largest gains, and that black-white achievement gaps narrowed substantially 2007-2013. In particular, the report points to math gains of 16.6% for black students versus 9.2% for white students, and reading gains of 8.5% for black students versus 5.1% for white students.

It is hard to understand how such gains happened given scale scores that show the opposite. DC-CAS scale score data by grade and subject, disaggregated by race, show that black students generally made the smallest gains in math (except in 7th and 8th grade), widening already large gaps.
Reading is more mixed; with overall smaller gains, black students made slightly larger gains than their white peers in most grades, but in many cases this was due to actual losses for white students.viii

NAEP data further underscore this reality. As illustrated in Figure B, from 2009 to 2013, average 4th grade NAEP reading scores for school-lunch-eligible children – three quarters of all DCPS students, and those who should have benefitted most from reforms – have been essentially flat; they increased just two points (from 193 to 195).ix At the same time, scores for non-low-income students rose by 19 points (from 226 to 245). This rise dragged up the overall average, falsely suggesting broad gains, and increased the test-score gap by 17 points, from 33 to 50, in just four years.x

As per Appendices D and E, gains in 4th grade math are larger, but the pattern is similar; low-income students gained nine points, while their non-low-income peers gained 19, widening the gap by ten points. Gaps also grew substantially at the 8th grade level; the math gap among middle-schoolers widened by the same ten points as among their elementary school peers, though both subgroups gained more points. In reading, non-low-income students gained 15 points, nearly quadruple the four points for low-income students, growing the gap from 19 to 30 points in four years.xi

Data must take into account changing demographics and student classification

NAEP data also suggest that changes in demographics and/or classification of students are partly responsible for some of these score increases.
When gains among these same students – 4th grade readers – are assessed by racial subgroup and as a whole, we see that each group’s progress does not add up to that of all.xii As shown in Figure C, below, the scale score increase of four points for all students (202-206), while small, is still larger than gains in any of the three subgroups (three for white students, and one each for black and Hispanic students), though some average of the latter should produce the overall increase. Gains in math are larger, but they reveal the same inability to square disaggregated gains with overall average gains.xiii

In other words, a change in which students took the test between 2009 and 2013, and/or a change in the classification of some of those students as low-income, appears to account for some of the gains. Indeed, it is actually impossible for any group to have gained the equivalent of half a standard deviation in one testing cycle (as non-low-income students appear to have done as per Figure B), so some other explanation must be at play.

Demographics among DCPS students have indeed changed in recent years; black students represent a smaller proportion of those tested (73% in 2013 vs. 80% in 2009), while the share of white students has grown (from 7% to 10%). Another, larger change is likely more influential, however. In 2012, DCPS instituted a system in which all students in the lowest-income schools receive free meals. This means that these schools no longer collect or submit data on which students qualify for these meals, and that all students, including those who are not low-income, are now classified as such.
This is reflected in the growing number of low-income test-takers: in 2013, 77% of NAEP test-takers were classified as low-income, up from 72% in 2009 (the DC-CAS share rose from 71% to 80%, even as white students made up a growing share of test-takers relative to black students).xiv

In sum, these numbers suggest relatively small gains overall, not gains that would support claims of large increases in “proficiency,” and particularly small gains for disadvantaged students. All point to the critical need for more transparency in data, including the release of raw and scale scores, not just “proficiency” numbers, and for all data to be disaggregated by income and race.

DCPS claims of increases in proficiency are undermined by both DC-CAS and NAEP data

As discussed above, cut scores are inherently less valid means of assessing student progress, because they can be set differently by test and over time. However, as DCPS has used the DC-CAS measure of “proficiency” to claim success, it is important to explore this claim.

The first indication that something is amiss in the DCPS definition comes from disaggregated DC-CAS data. Figure D illustrates virtually impossible “gains” in proficiency among non-low-income students in a single year.xv These indicate that: 1) the definition of proficiency changed radically between 2012 and 2013, and/or a different set of students took the 2013 test; and 2) the DC-CAS definition of “proficient” is questionable, given that the majority (in some cases, a sizeable majority) of all DCPS students are classified as such.xvi It also shows that, contrary to DCPS claims of gains among the most disadvantaged students, low-income students have made virtually no gains at all over the past four years, substantially widening an already enormous gap.
These are questions that must be posed when 2014 “proficiency” numbers are released.xvii

Understanding changes in the proportion of DCPS students who are deemed “proficient” and “advanced” as per DC-CAS requires a metric that is stable and consistent over time.xviii NAEP data offer just that, by providing a measure of proficiency that is defined by education experts based on what students of a given age should know, and that is consistent over time in characterizing students with that same level of knowledge as “proficient.” As such, gains in the percentage of students who are characterized as “proficient” or above in DC-CAS should be comparable to those gains per NAEP. As demonstrated by the numbers in Appendix C, however, and in Figure E, below, the gains are not equivalent across the two tests, nor are trends consistent across grades and subjects. This contrast indicates that the DC-CAS definition does not capture a meaningful concept of “proficient.” It also indicates that the DC-CAS definition is not constant over time, contrary to what DCPS and OSSE claim.

As indicated by the highlighted numbers in the Appendix, many of the increases are sharply different across DC-CAS versus NAEP. NAEP often posts larger gains, due to its much more rigorous definition of proficiency, and thus to smaller proportions of “proficient” students and, in turn, more room to advance.xix In one of the most extreme examples, NAEP registered a 111 percent increase in the proportion of proficient/advanced students (black 4th graders in math), while the DC-CAS increase was just 14.5 percent. In two instances, Hispanic students registered double-digit percentage gains in DC-CAS proficiency, while posting NAEP losses.xx In all, these data challenge DCPS claims of consistency and stability in its definition of “proficient.”
The low proportions of low-income students who were proficient as per the 2013 NAEP further undermine asserted gains for these students.xxi While achievement gaps far precede recent reforms and are mirrored in districts across the country, these numbers do not suggest a system that is raising up its disadvantaged students as asserted. As illustrated in Figure F, not only are income-based gaps in proficiency very large, but only a handful of low-income students are deemed proficient in reading by NAEP standards.

As Appendix C shows, the disparities are very similar with respect to race: In 2013, nearly 80 percent of white fourth graders were proficient readers according to NAEP. Only 25 percent of Hispanic students, 12 percent of Black students, 11 percent of low-income students, and fewer than 10 percent of low-income Black students attained “proficient” status. The pattern is also very similar for eighth grade math: over 70 percent of white DCPS students were proficient, versus 20 percent of Hispanic students, nine percent of Black students, just eight percent of low-income students, and only a handful – six percent – of low-income Black students. These numbers also illustrate the stark contrast between DC-CAS and NAEP proficiency standards.

Comparing pre- and post-reform gains undermines DCPS claims of reforms’ “success”

Finally, Chancellor Henderson has claimed, with support from US Secretary of Education Arne Duncan and others, that reforms instituted under former Chancellor Michelle Rhee and herself, in particular test-based accountability measures in keeping with the district’s Race to the Top proposal, have spurred gains, and that the gains demonstrate the success of these reforms.
A comparison of pre- and post-reform eras, however, shows otherwise.xxii As Figure G illustrates, low-income fourth graders had the same seven-point gains in math in the four years preceding Michelle Rhee’s Chancellorship as they did 2009-2013. Non-low-income students, however, who were gaining at the same rate as their low-income peers from 2003-2007, have doubled those gains since, widening the gap over the past four years.xxiii

In sum, the reform era demonstrates little change in overall gains, but a striking increase in the relative advantage for higher-income and white students – the opposite of what was intended, and of what reform proponents claim. Moreover, changes in student composition and in student classification, rather than actual learning, seem to explain some test score increases; while more and better data are needed to determine their exact influence, sudden bumps in scores suggest that these may be a bigger factor than most others in driving test score changes.

In sum:
1) Gains in DC-CAS scale scores do not support the claimed large gains in proficiency;
2) Disaggregated NAEP scale score data confirm much greater gains for higher-income students than for their low-income peers and reveal substantial increases in already-large gaps;
3) The same disaggregation, compared with overall/scale score gains, indicates that change in student body composition and/or classification, rather than improvements in teaching or learning among the same students, may be responsible for a large proportion of the asserted overall gains;
4) An assessment of DC-CAS proficiency rate trends suggests definitions of “proficient” and “advanced” that are not meaningful, as well as changes in that definition over time and/or in the student body composition, and demonstrates loss of ground for low-income students relative to their higher-income peers;
5) A comparison of changes in the percent of “proficient” and
“advanced” students as per DC-CAS versus NAEP confirms the lack of meaning and consistency in the DC-CAS cut scores and/or in the raw and scale scores that are the basis for them; and
6) A comparison of pre- and post-reform eras challenges the assertion that reforms have been “successful” in driving gains and closing gaps.

Questions reporters should ask/data they should demand:
1) Why are cut scores and scale scores not released, and why are they not disaggregated by income and by race/ethnicity?
   a. Such scores were released under former Chancellor Clifford Janey, and they are released each year in neighboring Maryland and Virginia.
   b. Only by obtaining these data can total gains, relative gains, and gains in the proportion of proficient/advanced students be fully understood.
   c. Data experts must be allowed to examine the microdata and reach their own conclusions.
2) DCPS/McGraw-Hill must provide an explanation that is understandable to the public, including parents, teachers, and members of the media, detailing how changes in test difficulty translate into changes in scale and cut scores.
3) DCPS/McGraw-Hill must provide a document explaining why NAEP and DC-CAS data are not aligned, how “proficiency” on DC-CAS (which the majority of DCPS students meet) compares to that of NAEP and against what benchmark it is set, and why changes in the proportion of students proficient in DC-CAS and NAEP do not correspond with each other.
4) Given substantial increases in both race- and income-based achievement gaps under current policies and the lack of significant overall improvements for DCPS students, and thus reforms’ failure to achieve their purported goals, why is DCPS not intensively pursuing alternative strategies that help all students, rather than touting ones that exacerbate a bad status quo?
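
The gap arithmetic cited in this memo, and the composition effect discussed under changing demographics, can be checked with a few lines of Python. The sketch below is illustrative only. The 4th grade NAEP reading scores (193 to 195 for low-income students, 226 to 245 for non-low-income students) come from the text above; the function names, the two subgroup labels, and the subgroup scores and shares used in the second example are hypothetical and are not DCPS or NAEP data.

```python
# Illustrative sketch. Values marked "memo" are taken from the memorandum;
# values marked "hypothetical" are invented to show the mechanism only.

def gap_change(low_start, low_end, high_start, high_end):
    """Return (starting gap, ending gap, change in gap) in scale-score points."""
    gap_start = high_start - low_start
    gap_end = high_end - low_end
    return gap_start, gap_end, gap_end - gap_start


def weighted_mean(scores, shares):
    """Average subgroup scores, weighted by each subgroup's share of test-takers."""
    total = sum(shares.values())
    return sum(scores[group] * shares[group] for group in scores) / total


# memo: 4th grade NAEP reading, 2009 -> 2013, low-income vs. non-low-income
print(gap_change(193, 195, 226, 245))  # (33, 50, 17)

# hypothetical: neither subgroup's score changes at all between the two years,
# but the higher-scoring subgroup grows from 10% to 15% of test-takers.
scores = {"lower_scoring": 190, "higher_scoring": 250}
shares_2009 = {"lower_scoring": 90, "higher_scoring": 10}
shares_2013 = {"lower_scoring": 85, "higher_scoring": 15}

before = weighted_mean(scores, shares_2009)
after = weighted_mean(scores, shares_2013)
print(after - before)  # 3.0: the overall average rises although no group gained
```

The second example shows why an overall gain can exceed every subgroup's gain: a shift in who takes the test moves the average even when no student group improves. Measuring how much of the DCPS increase is due to this effect would require the disaggregated scale scores and test-taker shares that this memo asks to be released.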
References

Catania, David. 2013. The Truth About the 2013 D.C.-CAS. D.C. Council Committee on Education.

CTB/McGraw-Hill. 2010, 2011, 2012, 2013. Technical Report, Spring Test Administration. Washington, D.C. Comprehensive Assessment System (DC CAS). Appendix D: Internal Consistency Reliability Coefficients for Examinee Subgroups.

District of Columbia Public Schools (DCPS). 2013. DC CAS 2013 Results. DCPS Office of Data and Accountability. July.

Levy, Mary. 2014a. “DCPS NAEP, 2003-2013” [unpublished Excel files provided by D.C. budget consultant Mary Levy, based on data extracted from U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), NAEP Data Explorer, Reading and Mathematics Assessments, 2003-2013].

Levy, Mary. 2014b. Proportion of Low-Income Test-Takers, DC-CAS and NAEP, 2008-2013.

National Center for Education Statistics (NCES). 2009. Mapping State Proficiency Standards onto the NAEP Scales: Variation and Change in State Standards for Reading and Mathematics, 2005-2009. http://nces.ed.gov/nationsreportcard/pdf/studies/2011458.pdf

Office of State Superintendent of Education for the District of Columbia (OSSE). Data for 2008-2012: 2012 DC CAS School-by-School Longitudinal Data. http://osse.dc.gov/release/mayorvincent-c-gray-announces-2012-dc-cas-results

Office of State Superintendent of Education for the District of Columbia (OSSE).
Data for 2013: School-by-School DC CAS Results (Alpha). http://osse.dc.gov/publication/dc-cas-resultssy-2012-2013

i See DCPS 2013.

ii See Catania 2013 for concerns regarding how cut scores are translated to scale scores, and “inflated proficiency growth” on the 2013 DC-CAS.

iii According to an NCES report, the DCPS “proficiency” standard across reading and math in both grades is roughly equivalent to the NAEP definition of “basic.”

iv As Council Member Catania highlighted at his hearing, raw scores have gone down in all grades and both subjects for the past two years. Former Interim Superintendent Emily Durso asserted in her September 26, 2013 testimony that DCPS has maintained the same cut scores since DC CAS was implemented in 2006, in order to enable comparability. One question to address in our full report is thus how these scores are translated and how purported increases in the difficulty of questions due to the implementation of the Common Core State Standards are taken into account in that process. We will also discuss how test preparation practices, and loss of time for recess and other parts of the school day, especially in the district’s lowest-income schools, may have influenced test scores.

v See Appendices A and B, Reading and Math DC-CAS scale score gains by grade, all students and disaggregated, 2009-2013.

vi CTB/McGraw-Hill technical reports, 2010-2013, re lowest obtainable scale score (LOSS) and highest obtainable scale score (HOSS).

vii DCPS 2013.

viii See Appendices A and B for scale score gains by grade and subject, for all students and disaggregated, 2009-2013.

ix See Appendix C for 4th and 8th grade NAEP reading and math scores, disaggregated by income, and gaps, 2009-2013. NAEP data are for all DCPS schools, both regular public schools and charter schools, using a database constructed by Mary Levy, because TUDA scores do not include charter school students’ scores in all years studied.
x See Appendices D and E for 4th and 8th grade NAEP reading and math scores, disaggregated by race and ethnicity, and gaps, 2009-2013.

xi All of these NAEP data are from Mary Levy, who is able to provide data that include scores from both DCPS neighborhood public schools and charter schools. We are not able to use TUDA data because they do not include both sets of schools in all years being studied, and thus data are not comparable over time.

xii If existing students had gained ground over time, some (appropriately weighted) average of scores from all three racial subgroups should correspond to the average/scale score. As these numbers show, however, this does not happen. Rather, the scale score increase is larger than any of the subgroup increases. This could then be explained in part by the increase in wealthier students who have entered DCPS in the past four years.

xiii Eighth grade scores are not available across the three years, so we cannot conduct parallel comparisons for those students.

xiv Proportion of high- and low-income test-takers, DC-CAS and NAEP, Mary Levy 2014b.

xv See Appendix C for data on gains in 2009-2013 proficiency in math and reading as per DC-CAS.

xvi See Appendix C.

xvii The database from which these numbers were derived does not include schools in which there were too few low-income students to report, the cut-off allegedly being fewer than 10 in earlier years and fewer than 25 in 2013. When Levy counted the number of schools each year for which there was no report for low income, other than those that serve adults, incarcerated youth, or others where FARMS is not used, she found: 10 in 2008, 5 in 2009, 23 in 2010, 6 in 2011, none in 2012, and 13 in 2013.

xviii Ideally, such a comparison would be across measures of proficiency that are comparable, if not equivalent. Unfortunately, across virtually all states, “proficient” and “advanced” are very different under state assessment metrics versus NAEP metrics.
DCPS is no exception; indeed, as noted above, a 2009 NCES report comparing cut score levels for NAEP versus state assessments finds DCPS standards for “proficiency” to be equivalent to “basic” or just below basic as per NAEP. The full report, out later this summer, will discuss this issue in more detail.

xix As noted above, the same classification of many non-low-income students as low-income in 2013 may account for some of the increases seen among low-income students between 2011 and 2013.

xx See Appendix B, Changes in percent of students “proficient” or “advanced,” 2009-2013, DC-CAS vs. NAEP.

xxi See Appendix C for NAEP proficiency levels by race and income.

xxii See, too, retired DCPS math teacher Guy Brandenburg’s analysis of 25 years of NAEP trends, and the lack of any discernible change under Rhee-Henderson reforms (in contrast to wild ups and downs in DC-CAS scores). http://gfbrandenburg.wordpress.com/2014/07/03/my-predictions-for-the-2014-dc-cas-scores/.

xxiii Gains were larger pre-reform in fourth grade reading, larger post-reform in eighth grade reading and math, and roughly equivalent in fourth grade math, as per data in Appendices D and E. The income-based gap widened the most in fourth grade reading and least in eighth grade math, but all gaps did grow post-reform, and gaps narrowed more pre-reform than post-reform for every year and subject for which numbers are available, as per Appendices C and D.