ACHIEVEMENT GAINS IN ELEMENTARY AND HIGH SCHOOL

by Laura LoGerfo*, Austin Nichols*, and Sean F. Reardon**

Abstract

We estimate how much students are learning at different points in school across the United States, as measured by reading and mathematics tests, and how these rates of learning differ for students of different social backgrounds. We find our results depend on which of several plausible estimates of achievement is used, but results differ less dramatically when achievement gains are measured in standard deviation units. We find that students in kindergarten or first grade make much larger gains, on average, than students in later grades, and students make larger gains early in high school than late in high school. The extent to which this finding is an artifact of the tests used is unclear. Achievement gaps across race, income, and home language groups exist at the start of kindergarten, and typically increase in the first two years of school, but seem somewhat more stable afterwards. Given these two findings, we suspect that interventions targeted on earlier grades may produce "more bang for the buck," though other interpretations are possible. Finally, we argue that our estimates of average gains measured in standard deviation units can be used as benchmarks that policymakers or researchers can use to estimate average gains in reading or math in a baseline scenario in future experimental interventions, to judge the relative importance of measured effect sizes, or to conduct power analyses.

Acknowledgements

The authors gratefully acknowledge research support from Sarah Cohodes and Joe Gasper. Many thanks to Duncan Chaplin, Larry Hedges, Kim Reuben and Jesse Rothstein for extremely helpful comments and suggestions. This report was written under the supervision of Jane Hannaway, Director of the Education Policy Center of the Urban Institute.
* Urban Institute, Washington DC
** Stanford University, Stanford CA

TABLE OF CONTENTS

Chapter I: Introduction
    Research Questions
    Methods
    Data and Samples
Chapter II: Average Learning Rates
    Elementary School
    Secondary School
    Summary of Elementary and High School Estimates
Chapter III: Differences in Learning Rates
    Elementary School
    Secondary School
    Summary of Elementary and High School Estimates
Chapter IV: Comparison with Theta Scores
    Elementary School
    High School
    Changes in Learning Rates Over Time
    Summary
Chapter V: Locally Standardized Differences in Learning Rates
    Brief Description of Method
    Advantage of Method Over Linear Models
    Limitations of the LSD Method
    Locally Standardized Difference Estimates
    Summary
Chapter VI: Discussion and Implications
    Achievement Gains Over Time
    Gaps in Achievement Gains, and Gaps in Achievement
    Methodological Caveats
    Policy Recommendations
    Future Work
    Conclusion
References
Appendix A: Details of Data and Methods
    Variables Used
    Analytic Methods
Appendix B: Rescaling Issues
Appendix C: Standard Deviations
    Effect Size
    IRT and Theta Score Distributions
Appendix D: Comparison of IRT and Theta Scores
Appendix E: Locally Standardized Growth Rate Differences
CHAPTER I: INTRODUCTION

Educational research often attempts to explain student achievement by estimating the effects of individual ability, home environment, and teacher and school quality (Burkam et al., 2004; Cooper et al., 1996; Entwisle & Alexander, 1992; Ferguson, 1998; Fryer & Levitt, 2002; Goldhaber & Brewer, 2000; Lee & Burkam, 2002; Nye et al., 2004; Reardon, 2003). Rather than isolate what factors account for learning, this report steps back to ask two basic and crucial questions. First, how much are students learning per grade in reading and mathematics? Second, how do these rates of learning differ for students of different social backgrounds? We address these questions for students in elementary school and high school, taking advantage of two nationally representative, longitudinal datasets sponsored by the National Center for Education Statistics (NCES): the Early Childhood Longitudinal Study (ECLS-K) and the National Education Longitudinal Study (NELS:88).

Lessons learned can set benchmarks for researchers interested in experimental and quasi-experimental designs, inform policymakers about the likely impacts of potential policy reforms, and help educators and the general public understand what students and schools can be expected to accomplish in an academic year. In sum, our results will serve as reference points for future research on achievement gains.

The report is organized into six chapters. This first chapter explains the importance of the study and outlines what results are presented. The second chapter presents baseline estimates for how much students learn per grade on average in both reading and math. The third chapter examines the relative learning rates for specific student subgroups (gender, race/ethnicity, language background, and economic status).
The fourth chapter explores learning per grade and by subgroup using a different metric than the previous two chapters. The fifth chapter presents results from specialized analyses that re-examine differences in learning rates across time and across subgroups. The sixth and final chapter summarizes the findings and draws implications for research and policy. Appendices document the data and analytic methods used in the report.

Research Questions

Question 1: How much do students learn?

To address this question we estimate the average gains in reading and math for students in elementary school and students in high school. To produce such estimates, we use two nationally representative and longitudinal datasets, the Early Childhood Longitudinal Study, Kindergarten Cohort (ECLS-K) for elementary school and the National Education Longitudinal Study of 1988 (NELS:88) for high school. In ECLS-K the typical student is a child who was a kindergartner in 1998, and in NELS:88 the typical student was an eighth-grader in 1988. Participants in these studies took a battery of tests in reading and mathematics at the start of their relevant school transition. The elementary school children were first tested in the first term of kindergarten and last tested three years later, for a total of five testing times. The high school students were first tested during eighth grade, typically the last grade before high school, and last tested four years later, for a total of three times. Results from this report indicate how much students learned during the intervals.

To study learning, we want to measure achievement, but what we have are several versions of these reading and math test scores. As definitions of achievement, each version has strengths and weaknesses, which are discussed in the next subsection. In general, achievement is defined for our purposes as probable performance on a reliable test, which represents knowledge or skills at a point in time.
Gains in achievement are thus increases in probable performance over time and represent learning.

Question 2: Do learning rates differ for different types of students?

We examine if, when, and for whom differences in learning rates are evident. This report sets benchmarks for the yearly gains in achievement made by: 1) male students and female students; 2) students of different racial/ethnic groups and language backgrounds; and 3) higher-income students and low-income students. Early evidence from ECLS-K suggests that economically disadvantaged and minority children enter kindergarten with lower average achievement than socio-economically advantaged and white children (Burkam et al., 2004; Downey et al., 2004; Fryer & Levitt, 2004; Lee & Burkam, 2002). Studies with the NELS:88 data suggest the same pattern of academic disadvantage for low-income and minority students at the beginning of high school (Phillips, Crane, and Rouse, 1998). Phillips and colleagues attribute achievement gaps between black and white students at the end of high school to differences in initial skills. Over the school years, the initial gap widens. Our analyses estimate the learning rates for these groups to explore what group differences exist and if the differences change over time.

Methods

We address each of these two research questions with several different analytic approaches. If the different techniques produce similar results, we are more confident that our findings are robust and accurate. We apply two different methods (growth curve analysis and locally standardized differences) and use three different metrics (IRT scale score points, effect sizes, and theta scores). The different methods applied to these different metrics all measure growth in achievement test scores. One approach is not more accurate than the other. This section explains the analytic methods.
Growth Analysis

In Chapters 2 and 3, we report on regression analyses that use piecewise linear growth models [1] to estimate learning in elementary school and in high school. [2] In elementary school, gains in reading and math achievement are estimated during: 1) kindergarten; 2) the summer between kindergarten and first grade; 3) first grade; and 4) the time between the end of first grade and the end of third grade. High school analyses estimate gains in reading and math achievement between eighth and tenth grades and between tenth and twelfth grades. In Chapter 2, we look at results by grade level, and in Chapter 3, we focus on the learning of subgroups defined by gender, race/ethnicity, and economic status.

[1] The analytic methods used in this report are explained in more detail in Appendix A.

Findings are reported in three different metrics to facilitate interpretation of the results. Each is in common use by researchers and policy analysts, and each has both drawbacks and advantages. Using one metric or the other simply reflects preferences in interpretability and not differences in accuracy. The metrics we use are: (1) estimated changes in points on the test; (2) effect sizes; and (3) estimated changes in theta scores.

IRT scores

Gain is measured in points on the tests (for the ECLS-K analyses, the scale score points are rescaled as of the third grade assessment; see Appendix B for details on rescaling). The points are not the actual number right on the assessment as administered, but rather the number that item response theory (IRT) predicts the student would have answered correctly if s/he had been administered all the questions in the ECLS-K kindergarten through third grade item pools. At any given test administration, a student was administered only a subset of these items, a subset that corresponded to their grade level and skill level as estimated by an initial set of routing items.
The IRT model does not increase students' scores for correct guesses (NCES, 2005), so the score is more accurate than a pure sum of correct responses. This IRT process allows each student's performance to be put on a common scale at each point in time, and over time.

The IRT scores represent the best available measure of knowledge at a specific time point, and therefore offer some hope of quantifying learning rates. However, there are three major challenges to face before using the IRT scale scores.

First, the range of possible IRT scores is still a somewhat arbitrary component of the test design. In ECLS-K, for example, the math test had fewer questions than the reading test, so a mid-range score on reading may appear quite high for math. Thus, IRT scale scores across these different subjects are not comparable. This is easy to rectify by converting all point scores into percent right, that is, by dividing by the maximum score on the test. However, this solution would mask differences in the test introduced by the subsequent addition of more difficult questions in later rounds, so we choose not to change the point scores in this way.

Second, this scale is assumed to be an interval scale, meaning that point differences are consistent in numeric and substantive value throughout the distribution. But this assumption is tenuous at best. The IRT scale score is an interval-scaled measure of the number of items right on a specific test, which is necessarily one of many possible interval-scaled measures of achievement, one for each possible test. Different tests, each of which might have some desirable properties, could produce different results. On any one test, there are subtle distinctions in the difficulty of questions asked that complicate comparing IRT scores from different points in the distribution of scores within the same subject. [3] This complicates comparisons of learning by the same students across time (e.g. is learning multiplication over 9 months in third grade faster or slower than learning addition in six months during first grade?) and comparisons of two students at different levels at the same point in time (e.g. is a gain of 16 points for a higher income student whose initial score was 20 points a greater gain than a gain of 14 points for a lower income student whose initial score was 14 points?). This issue is further explicated in Appendix B.

Third, the IRT scale scores tend to be positively skewed in the earliest grades (indicating that the lowest performing students may have higher scores than they would have had on a longer test) and negatively skewed in third grade (indicating that the highest performing students may have lower scores than they would have had on a longer test). On the ECLS-K, this reflects the fact that the tests included relatively few questions [4] that were very easy for most kindergartners, the type of questions which would better capture differences in achievement among low-achieving kindergartners, so these low achievers tend to be "clumped" together at a higher achievement level than accurately reflects the achievement of the lowest-achieving among them. The tests included relatively few questions that were extremely difficult for most third graders, so these high achievers tend to be "clumped" together at a lower achievement level than accurately reflects the achievement of the highest-achieving among them.

[2] The program used to test the growth models, Hierarchical Linear Modeling (HLM) 6.0, clusters multiple test scores within individual students. The program also can group students within schools so as to test for school effects on learning trajectories. Future research will explore differences in school characteristics that may explain differences in student achievement gains.

[3] See Appendix B for details.
To address these and related issues, we standardize the change in points over time to construct effect sizes, and compare across two metrics (test performance measured by scale scores and theta scores), as discussed in the following sections.

Effect sizes

Effect sizes measure the magnitude of a relationship by calculating the point gains made relative to the baseline variation. Effect sizes can be compared across tests that have not been designed to be compared and that differ in the difficulty range of questions asked. For these reasons, we report estimates from models using IRT scores in points per month, divided by the standard deviation of the initial (baseline) score at the start of the time period. This translates point gains into standard deviation gains per month. [5] In other words, effect sizes measure how far children's scores progress along the time 1 test distribution by time 2. The gain is measured relative to the distribution of test scores at time 1 so the rescaling issues discussed in Chapter 5 and Appendix B are less problematic. For students with test scores at the median at time 1, a one standard deviation gain means that their time 2 test scores would put them roughly at the 84th percentile, instead of the 50th percentile, in the time 1 distribution. Details on how we estimate standard deviations are available in Appendix C.

These effect sizes facilitate comparisons of learning rates across different tests and different populations (e.g. compare average learning by the population of US kindergartners in 1998 to average learning by the population of US kindergartners in some other year or in some other geographic unit). We offer some comparisons at the end of Chapters 2 and 3 between results from the ECLS-K analyses and results from the NELS:88 analyses.

Neither points nor effect sizes directly describe the specific skills students are actually learning. However, IRT points can be linked to those skills.
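The effect-size arithmetic described in this subsection can be sketched in a few lines of Python. This is a minimal illustration rather than the estimation code behind the report: the function names and the example scores are hypothetical, the conversion of 30.4375 days per month is the one stated in the report's footnote on months, and the percentile step assumes that time 1 scores are approximately normally distributed.

```python
from statistics import NormalDist

# Conversion factor stated in the report: average length of one month in days.
DAYS_PER_MONTH = 30.4375


def monthly_effect_size(score_t1, score_t2, days_between, baseline_sd):
    """Gain per month, expressed in time 1 (baseline) standard deviations."""
    months = days_between / DAYS_PER_MONTH
    points_per_month = (score_t2 - score_t1) / months
    return points_per_month / baseline_sd


def percentile_in_baseline(sd_gain, start_percentile=50.0):
    """Percentile of the time 1 distribution reached after gaining sd_gain SDs.

    Assumption (not from the report): time 1 scores are normally distributed.
    """
    standard_normal = NormalDist()
    z_start = standard_normal.inv_cdf(start_percentile / 100.0)
    return 100.0 * standard_normal.cdf(z_start + sd_gain)


# Hypothetical example: a 10-point gain over six months (182.625 days)
# on a test whose baseline standard deviation is 10 points.
example_rate = monthly_effect_size(20.0, 30.0, 182.625, 10.0)

# A one standard deviation gain starting from the median.
example_percentile = percentile_in_baseline(1.0)
```

Under the normality assumption, `percentile_in_baseline(1.0)` reproduces the move from the 50th to roughly the 84th percentile described above, and `example_rate` works out to one sixth of a standard deviation per month for the hypothetical student.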
Each graph in Chapters 2 and 3 maps IRT scores onto the proficiency level (which corresponds to a set of skills) that students are learning most rapidly at that score [6] (for example, a score of 50 in the ECLS-K IRT score metric corresponds to a skill level at which students are primarily learning to add and subtract). Using this method adds substantive meaning to the IRT scores reported as levels, and also clarifies what gains students make at different points in time. Stating a gain of 5 points on a math test is not as meaningful or informative as stating a gain in skill from identifying numbers to solving word problems.

[4] More details on the implications of test design appear in Chapter 5 and Appendix C.

[5] One month is the largest unit of calendar time smaller in size than every interval, and we use the conversion factor one month equal to 30.4375 days (the average length of one month in days).

[6] Specifically, each proficiency level is mapped onto the IRT scale score at which the probability of proficiency in that level is one half; in general, this corresponds to the score at which students' proficiency in that skill grows most rapidly with gains in scale scores.

Theta scores

The IRT process combines an individual's pattern of responses (right, wrong, omitted) with characteristics of the items on a test to estimate individual ability, known as theta. First, the IRT model estimates the theta for each student and the item parameters (difficulty, discrimination, and guess-ability). Then these theta and IRT parameters are transformed by a nonlinear, monotonic function to construct the IRT scale scores (in the "estimated number right" metric).

Using the theta scores offers a distinct advantage. In contrast to the more skewed scale scores, the theta scores are more symmetrically distributed, because theta is on an absolute scale, independent from the particular set of questions that are asked (see Appendix B for more details).
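To illustrate how a scale score in the "estimated number right" metric can be a nonlinear, monotonic function of theta, the sketch below assumes a three-parameter logistic (3PL) item response function, a common IRT form that uses difficulty, discrimination, and guessing parameters. The report does not spell out the exact model or its item parameters here, so the functional form, the function names, and the five items are assumptions made purely for illustration.

```python
import math


def p_correct(theta, discrimination, difficulty, guessing):
    """3PL probability that a student with ability theta answers an item correctly."""
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing) * logistic


def expected_number_right(theta, items):
    """Sum of per-item probabilities: the expected number right on the item pool."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items)


# Hypothetical item pool: (discrimination, difficulty, guessing) for each item.
ITEM_POOL = [
    (1.0, -1.5, 0.20),
    (1.2, -0.5, 0.20),
    (0.8, 0.0, 0.25),
    (1.5, 0.5, 0.20),
    (1.0, 1.5, 0.20),
]

# Equal steps in theta do not produce equal steps in the number-right score.
thetas = [-1.0, 1.0, 4.0, 6.0]
scores = [expected_number_right(t, ITEM_POOL) for t in thetas]
mid_range_step = scores[1] - scores[0]   # theta from -1 to 1
upper_range_step = scores[3] - scores[2]  # theta from 4 to 6
```

In this toy pool the same two-unit change in theta moves the number-right score by much less near the top of the item pool than in the middle, which is one way to picture why scale scores can compress differences among students whom the available items cannot separate.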
However, like the IRT scale associated with these theta scores, the theta scale is still mathematically arbitrary. We construct models that are identical to the IRT scale score models, but using theta scores as the outcome variable.

The theta score is often referred to as a measure of ability in the subject area. However, this does not imply that ability is a fixed characteristic of the test-taker (Hambleton, Swaminathan, & Rogers, 1991). The notion of ability measured by theta is the capacity to answer questions on a kind of test (e.g. reading or math) at a point in time, and does not correspond to any notion of unformed or genetic potential. The ability represented by theta scores can change over time, and the change in theta scores is simply another measure of achievement or skill level to contrast with changes in IRT scale scores. In Chapter 4, we discuss how the results from the IRT scale score models compare with results from the theta models.

Locally Standardized Difference

In Chapter 5, we use a method pioneered by Sean Reardon, an associate professor at Stanford University and a co-author of this report, called locally standardized difference (LSD), to analyze differences in gains over time between subgroups. The method generates estimates of differences in learning rates that change less in response to certain changes in test design than do findings from our analyses in Chapters 2 through 4, where IRT and theta scores are the dependent variables. The LSD technique compares the learning rates of students in different categories in the vicinity of some initial score (either IRT score or theta score), normalized by the difference in the pooled standard deviation (roughly, the average within-group standard deviation among population subgroups).
Each of these standardized estimates of gaps in learning rates is local to a particular baseline score, and averaging across the entire distribution of baseline IRT scores produces an estimate of the average locally standardized difference in growth rates.

This LSD approach offers a distinct advantage in that it was developed to address issues of rescaling in the ECLS-K data. The mean of the local estimates measures the differences in gains across subgroups in a way that is more robust to subsequent rescaling. This rescaling affects ECLS-K most, so our LSD analyses focus only on young children. The first set of results from the LSD models is produced in the IRT-estimated number-right metric, and the second set of results is in the theta metric.

Data and Samples

ECLS-K Data

The ECLS-K study followed students from kindergarten through third grade. In addition to test score data, information on children's gender, race and ethnicity, language status, and family background was also gathered.

Data were collected at five points in time: fall and spring of kindergarten, fall and spring of first grade, and spring of third grade. Not all students were assessed at every time point for several reasons: 1) sample attrition; 2) random subsampling in the fall of first grade; and 3) insufficient English fluency. The third reason has the most important implications for our analyses. In the rounds of data collection during kindergarten and first grade, children from a language minority background first took a screening test for English fluency called the Oral Language Development Scale (OLDS). If children did not pass this test, they could not take the reading assessment in that round. Spanish speakers who failed this test, however, could take a Spanish translation of the math assessment.
Students classified as fluent at one wave were deemed fluent at all subsequent waves; students not fluent at one wave were re-administered the English OLDS at each subsequent wave until a passing score was obtained. When children demonstrated sufficient proficiency in English on the OLDS at any point, they then took both the reading and math assessments.

Due to this language assessment process, the sample of Hispanic students with valid scores on the reading assessment increases over time. At wave 1, in the fall of kindergarten, 30 percent of Hispanic students were not assessed in reading (Table 1.1). By the spring of first grade, 10 percent of Hispanic students were missing reading scores, and by the spring of third grade almost all Hispanic students were able to take the reading assessment in English. As a result of the changing sample of Hispanic students with reading scores, comparisons of average reading scores by subgroup over time must be done with caution. If, for example, the reading score gap (the difference in average reading scores) between non-Hispanic white students and Hispanic students grows over time, this may not indicate slower average rates of reading skill gain among Hispanic students, but rather reflect the addition of students with lower-than-average English reading skills to the sample over time.

TABLE 1.1: PERCENTAGES OF STUDENTS MISSING READING SCORES, BY RACE/ETHNICITY AND WAVE

Race/Ethnicity            Fall K    Spring K    Fall 1st  Spring 1st  Spring 3rd
White                        0.9         0.5         0.4         0.3         1.0
Black                        0.5         0.4         0.0         0.1         3.3
Asian                       22.8        12.9         7.4         3.2         0.6
Hispanic (Total)            30.0        20.9        20.7        10.4         1.4
Total                        7.5         5.0         4.9         2.5         1.5

The same problem is not manifest in the math assessment, since virtually all Hispanic students took the math assessment at each wave, some in Spanish and some in English. Nearly a quarter of Asian students did not take the math assessment at wave 1, since the math assessment was administered only in English and Spanish.
Thus, Asian math achievement gap patterns must be interpreted with the same caution outlined above.

TABLE 1.2: PERCENTAGES OF STUDENTS MISSING MATH SCORES, BY RACE/ETHNICITY AND WAVE

Race/Ethnicity            Fall K    Spring K    Fall 1st  Spring 1st  Spring 3rd
White                        0.9         0.5         0.4         0.4         0.6
Black                        0.6         0.7         0.3         0.1         1.4
Asian                       22.6        12.9         7.8         3.2         0.6
Hispanic (Total)             0.9         0.4         0.6         0.6         0.7
Total                        1.7         1.0         1.0         0.6         0.7

ECLS-K sample

Our analyses are based on the 21,059 children in the ECLS-K restricted sample who have at least one reading or math test score. [7] Slightly fewer than 5 percent of the children in the sample were repeating kindergarten in 1998 [8] and 54.86 percent attended full-day kindergarten. Complete details on which students are enrolled in half-day versus full-day kindergarten are available in Appendix A. Table 1.3 presents descriptive statistics for the sample.

[7] Excluding children with only one test score does not change the results significantly.

[8] Children who were repeating kindergarten in the fall of 1998 are included in this sample to represent who enrolls in kindergarten. Children who repeated kindergarten in the fall of 1999, when the majority of children in the sample progressed to first grade, are not included. We dropped the students who were retained in kindergarten in the spring of 1999, because we wanted to ensure that we were capturing the achievement gain made in a given grade. If some students were in kindergarten at the same time as the majority had moved on to first grade, then the gains would not be defined consistently.
7 8 7 TABLE 1.3: SAMPLE SIZES—ELEMENTARY SCHOOL Percent of Sample Male Female Missing Gender White Black Hispanic Asian Other Missing Race English Speaking Home (EH) Non-English Speaking Home (NEH) Hispanic Non-EH Hispanic-EH Asian Non-EH Asian-EH Missing Race*EH Low Income Higher Income Missing Income Status Sample Size (N=21,059) 10,760 10,275 24 11,643 3,192 3,744 1,303 1,106 71 17,905 3,132 1,957 1,787 812 491 71 8,417 11,840 802 White-Low Income White-High Income Black-Low Income Black-High Income Hispanic-Low Income Hispanic-High Income Asian-Low Income Asian-High Income Other-Low Income Other-High Income Missing Race*Economic Status 3,008 8,635 2,046 964 2,286 1,316 497 693 574 513 822 14.28 41.00 10.11 4.76 11.30 6.50 2.46 3.42 2.73 2.44 3.90 50.99 49.01 0.11 55.29 15.16 17.78 6.19 5.25 0.34 85.11 14.89 9.32 8.51 3.87 2.34 0.34 41.55 58.45 3.81 White students make up 55.3 percent of this analytic sample (these descriptive frequencies are unweighted and are not intended to be representative of the population), followed by Hispanic children (17.8 percent) and black children (15.2 percent). Asian students (6.2 percent) and children classified as other (5.3 percent)—mixed race, American Indian, Native Hawaiian—complete the sample. Approximately 15 percent of the children in the analytic sample speak a language other than English at home. The Asian students and Hispanic students are almost evenly divided between those who come from homes where English is spoken and from homes where another language is spoken. In this sample, 41.6 percent of the children are eligible for the federal free and reduced-price lunch program. 9,10 Results from analyses that compare the analytic sample to the survey sample are discussed in Appendix A. Eligibility for the federal free and reduced price lunch program is determined by calculating an income-to-needs ratio, a family’s income as a proportion of the official federal poverty line for a family of that size. 
A family with income at the poverty line has a ratio of 1.00. Free and reduced-price lunch eligibility extends to those with a ratio of up to 1.85. These data come from the fall kindergarten round of data collection. We use information from the first data collection, because we expect that the socioeconomic level at which students start school plays an important role in predicting subsequent learning rate. Changes in socioeconomic status may affect achievement status and learning rate, but that compelling and critical question is not the focus of this report.

NELS:88 Data

NELS:88 collected data from a nationally representative sample of 24,599 eighth graders and followed them through high school. Data were collected at three time points: spring of eighth grade, spring of tenth grade, and spring of twelfth grade. Descriptive information about students was also recorded, including their gender, race/ethnicity, language status, and family background.

NELS:88 sample

Analyses include students who participated in the base year, first follow-up, and second follow-up of NELS:88. We exclude students who dropped out or were retained in high school.[11, 12, 13] Of the full sample of NELS:88 participants, 4.42 percent dropped out between grades 8 and 10 and 11.19 percent between grades 10 and 12. Thus, our final analytic sample consists of 14,078 respondents (again, these descriptive frequencies are unweighted and thus not representative of the population).[14] Table 1.4 presents the demographic composition of the analytic sample.

[11] We use two sources to determine whether a student dropped out. First, we use the created variable F2EVDOST, which indicates whether the student dropped out at least once between the base year and the second follow-up, regardless of whether the student ever returned. This variable is constructed from non-transcript sources. Second, we use the variable F2TROUT, which indicates whether a student dropped out based on information collected from transcripts.
If either of these variables indicates dropout, then the student was excluded from the analyses. To test how much our analytic results changed by including and excluding dropouts and retained students, we conducted analyses with these students. The results changed only slightly. A more thorough discussion of this process and results is in Appendix A.

[12] The variable G12COHRT determines whether a student is on time in the twelfth grade.

[13] We selected this sample to retain a true longitudinal sample. The weights we use to ensure generalizability over time were constructed for the sample selected.

[14] A small number of respondents (n=104) were missing data on race. These respondents are included in the analytic sample when possible because most of them had valid cognitive test scores at all three time points as well as data on other key demographic characteristics.

TABLE 1.4: SAMPLE SIZES, SECONDARY SCHOOL

                                      Sample Size    Percent of
                                      (N=14,078)     Total Sample
Male                                        6,882         48.88
Female                                      7,196         51.12

White                                      10,186         72.35
Black                                       1,247          8.86
Hispanic                                    1,514         10.75
Asian                                         911          6.47
Native American                               116          0.82

English Speaking Home (EH)                 12,444         89.11
Non-English Speaking Home (NEH)             1,521         10.89
Missing EH Status                             113          0.80

Hispanic Non-English                          735          5.22
Hispanic-English                              768          5.46
Asian Non-English                             421          2.99
Asian-English                                 439          3.12
Missing Race*EH                               166          1.18

Low Income                                  1,944         13.81
Higher Income                              10,740         76.29
Missing Income Status                       1,394          9.90

White-Low Income                              852          6.76
White-High Income                           8,424         66.87
Black-Low Income                              435          3.45
Black-High Income                             673          5.34
Hispanic-Low Income                           460          3.65
Hispanic-High Income                          851          6.76
Asian-Low Income                              145          1.15
Asian-High Income                             668          5.30
Native American-Low Income                     32          0.25
Native American-High Income                    58          0.46
Missing Race*Income Status                  1,480         10.5

Black students represent 8.9 percent of the sample, and 10.8 percent of the sample is Hispanic. Asian students account for 6.5 percent of the sample, Native Americans for just 0.8 percent, and the remainder are white students.
Of the sample, 10.9 percent speak a language other than English in their homes. As in ECLS-K, the Asian and Hispanic students are divided about evenly between those from non-English speaking homes and those from English-speaking homes. Based on our definition of low-income (eligibility for the free and reduced-price lunch program), 13.8 percent of the analytic sample qualifies as low-income.[15]

Analyses are weighted so that the samples become nationally representative, in that results from the analytic models can be generalized to two groups of students. First, the ECLS-K findings can be generalized to children across America who entered kindergarten in the fall of 1998. Second, the NELS:88 findings can be generalized to adolescents across America who were enrolled in eighth grade in the spring of 1988.

The findings presented in this report provide a broad overview of the patterns in achievement for young children and high school students. Though 1988 seems long ago and high school reforms have come and gone in the meantime, this report offers a baseline of what learning can be achieved. We think the findings are still relevant today and will serve as useful reference points for future research.

[15] The proportion of high school students eligible for the free and reduced-price lunch program (less than 15 percent) differs dramatically from the proportion of eligible elementary school students in the ECLS-K sample for a number of reasons. One of the most important is that eligibility is determined through completing an application sent home with or to students. A risk of humiliation or a lack of interest in signing up for the program may prevent more eligible students from applying.

CHAPTER II: AVERAGE LEARNING RATES

This chapter presents answers to the first research question: how much are students learning over particular time intervals?
Analyses estimate gains in reading and mathematics for a typical elementary school student who was in kindergarten in 1998 and for a typical high school student who was in the eighth grade in 1988. We present reading and math results, first for elementary school students[16] and then for secondary school students.

Elementary School

This section presents findings from the elementary school analyses. Results for reading and math are presented for each time interval in terms of gain per month, effect size per month, gain per time period, and accumulated gain. Reading results are shown in Table 2.1 and math results are shown in Table 2.2. All differences discussed in the text are statistically significant unless otherwise specified.

We also present findings that convert achievement scores to skill proficiencies.[17] This provides a more substantive interpretation of how much children gain, not in points but in specific skills and knowledge. Figure 2.1 shows this conversion for reading. Figure 2.2 shows this conversion for math.

Reading

Starting point: Entering kindergarten

The estimated score on the reading assessment at the start of kindergarten is 22.75 points. As illustrated in Figure 2.1, this means that the average child is learning to recognize letters at the start of formal schooling.

Kindergarten

During kindergarten, children, on average, gain 1.81 points per month on the reading assessment (moving 0.196 standard deviations each month up the distribution of scores at the beginning of kindergarten). On average, kindergartners primarily learn beginning sounds, and by the end of the kindergarten year, they are learning, on average, how to identify ending sounds.

Summer

No gains occur, on average, during the summer between kindergarten and first grade. Indeed, there is a 0.171 point loss per month, suggesting a slight summer “slide” effect. Summer slide refers to a dip in children’s cognitive development in the summer months when children are not exposed to stimulation at school.
However, this loss is very small.[18]

First grade

During first grade, children are gaining 3.28 points per month (moving 0.210 standard deviations each month up the distribution of scores at the beginning of first grade) in reading skills, on average. The average first grader moves from learning ending sounds to learning how to read words in context. The slightly larger gain in first grade compared to kindergarten may reflect the traditional emphasis on teaching basic reading skills such as phonics in the first grade curriculum. Or, the larger increase may derive from issues with the test design, discussed in Chapter 1 and further discussed in Appendix B (see Figure B.1).

[16] All results are based on the twice-rescaled test scores. We expect that the advent of the thrice-rescaled scores from the fifth grade data will slightly alter the findings. More discussion about the impact of rescaling in subsequent rounds of ECLS-K is provided in Appendix B.

[17] Appendix A provides a table for converting points to skill proficiencies.

[18] We do not discuss issues of summer learning extensively for several reasons. First, the summer analysis cannot be replicated with the subsequent years of data. Second, others have conducted research focusing on the summer months with these ECLS-K data (Burkam et al., 2004; Downey et al., 2004). Third, this report focuses on what students learn in a given year, not how the summer learning rate compares with the school year learning rate.

Second and third grades[19]

Over the next two grades, children are gaining 1.59 points per month, on average. This represents a gain of just 0.075 SD per month on the reading assessment. During second and third grade, children are learning to identify words on sight, understand words in context, and draw literal inferences. By the end of third grade, the average student is learning how to extrapolate information from text.
TABLE 2.1: READING GAINS FOR STUDENTS IN KINDERGARTEN IN 1998

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                  (Standard Error)   Per Month     Period     Period
Before Kindergarten                                                               22.75
During Kindergarten                1.81  (0.0106)      0.196         17.06        39.80
Summer K-1st                      -0.171 (0.0462)     -0.0126        -0.44        39.36
During 1st Grade                   3.28  (0.0187)      0.210         30.89        70.26
After 1st Grade, into 3rd Grade    1.59  (0.0065)      0.0749        38.17       108.43

Note: Standard errors for the estimated coefficients are presented in the first column of the table in parentheses after the corresponding coefficient. All coefficients are significantly different from zero. To calculate effect sizes, we divide the gain per month by the estimated standard deviation of the base-period test at the start of each time period.

[19] We cannot determine whether the learning rate is the same in second and third grades, because assessment data are collected only at the end of third grade and not during second grade. So we can discuss and compare the kindergarten and first grade rates explicitly, but we cannot distinguish the second and third grade learning rates. The second-grade learning rate could plausibly be similar to either the learning rate in first grade or the learning rate in third grade.

FIGURE 2.1: READING GAINS FOR STUDENTS IN KINDERGARTEN IN 1998
[Figure: ECLS Reading Scores, All Students. Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition, 2-Beginning Sounds, 3-Ending Sounds, 4-Sight Words, 5-Word in Context, 6-Literal Inference, 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades.]

Mathematics

Starting point: Entering kindergarten

Findings for mathematics learning rates are presented in Table 2.2. At the start of kindergarten, children score, on average, 17.52 points on the mathematics assessment. This mean score implies that many students already know how to count and to identify shapes (see Figure 2.2). These are the math skills that many children typically learn before formal schooling begins.
Kindergarten

In kindergarten, the average gain is 1.62 points per month (moving 0.196 standard deviations each month up the distribution of scores at the start of kindergarten). Children, on average, are learning relative size near the beginning of kindergarten and learning ordinality and sequences near the end of kindergarten.

Summer

Children exhibit slow growth in scores over the summer months, but still make modest gains. Instead of losing their math skills, on average, they gain roughly a half point per month (0.491 points, moving 0.042 standard deviations each month up the distribution of scores at the end of kindergarten).

First grade

In first grade, children gain 2.37 points per month, on average, moving 0.191 standard deviations (each month) up the distribution of scores at the beginning of first grade. In this year, children, on average, learn how to add and subtract.

Second and third grades

During second and third grades, children advance their math performance at an average rate of 1.20 points per month. This rate is equivalent to moving 0.077 standard deviations (each month) up the distribution of scores at the end of first grade. Over these years, children are learning multiplication and division, on average, and may begin to learn more advanced skills like place value by the end of the period. Though the test includes questions on using the concepts of rate and measurement, children are not yet learning those skills (on average, though the highest-performing students have mastered those skills by the middle of third grade).
TABLE 2.2: MATH GAINS FOR STUDENTS IN KINDERGARTEN IN 1998

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                  (Standard Error)   Per Month     Period     Period
Before Kindergarten                                                               17.52
During Kindergarten                1.62  (0.0090)      0.196         15.21        32.73
Summer K-1st                       0.491 (0.0446)      0.0422         1.26        34.00
During 1st Grade                   2.37  (0.0158)      0.191         22.28        56.28
After 1st Grade, into 3rd Grade    1.20  (0.0048)      0.0767        28.87        85.15

Note: Standard errors for the estimated coefficients are presented in the first column of the table in parentheses after the corresponding coefficient. All coefficients are significantly different from zero. To calculate effect sizes, we divide the gain per month by the estimated standard deviation of the base-period test at the start of each time period.

FIGURE 2.2: MATH GAINS FOR STUDENTS IN KINDERGARTEN IN 1998
[Figure: ECLS Math Scores, All Students. Vertical axis: score (10 to 100), marked with proficiency levels 1-Count, Number, Shape, 2-Relative Size, 3-Ordinality, Sequence, 4-Add/Subtract, 5-Multiply/Divide, 6-Place Value, 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades.]

Summary

Children make bigger gains in reading and mathematics in kindergarten and first grade than in second and third grades.[20] In reading, children are gaining about 0.20 standard deviations (each month) on the baseline distribution of scores (at the beginning of the period) over the first two years of school. By the end of third grade, their gain has dropped to a third of the earlier pace, to 0.07 standard deviations per month. In math, a similar pattern emerges.

Examining the estimated learning rates measured in points, the rate in first grade is much faster than the rate in kindergarten. However, the distribution of the first grade scores is more widely spread than that of the kindergarten scores. Thus, when these gains are converted into effect sizes, the learning rates across kindergarten and first grade are nearly the same.
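The contrast between points and effect sizes can be checked directly against the reading estimates in Table 2.1. The Python sketch below back-calculates the standard deviation implied by each reported pair of gain per month and effect size per month, assuming the effect size is exactly the monthly gain divided by the start-of-period standard deviation, as the table note describes. The implied standard deviations are derived here for illustration and are not values reported in this chapter; the function and variable names are hypothetical.

```python
# Reading values from Table 2.1: (gain per month in points, effect size per month).
reading = {
    "kindergarten": (1.81, 0.196),
    "first_grade": (3.28, 0.210),
}


def implied_sd(gain_per_month, effect_size_per_month):
    """Back out the start-of-period SD, assuming effect size = gain / SD."""
    return gain_per_month / effect_size_per_month


sds = {period: implied_sd(gain, es) for period, (gain, es) in reading.items()}

# Ratio of the first-grade rate to the kindergarten rate, in each metric.
points_ratio = reading["first_grade"][0] / reading["kindergarten"][0]
effect_ratio = reading["first_grade"][1] / reading["kindergarten"][1]

for period, sd in sds.items():
    print(f"{period}: implied start-of-period SD = {sd:.1f} points")
print(f"first grade / kindergarten, points per month: {points_ratio:.2f}")
print(f"first grade / kindergarten, SD per month:     {effect_ratio:.2f}")
```

Under these assumptions the first-grade reading rate is roughly 1.8 times the kindergarten rate in points but only about 1.07 times the kindergarten rate in standard deviation units, because the implied spread of scores is wider at the start of first grade (about 15.6 points) than at the start of kindergarten (about 9.2 points).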
In previous work with the ECLS-K data conducted by NCES, the average gain children made from the beginning of kindergarten through the end of third grade equaled 85.55 points in reading and 64.53 points in math (NCES, 2004, Appendix A: Table A-6). Our analyses find a gain of 85.68 points in reading and 67.63 points in math over the same time period.[21] Our results are thus very similar to results produced by NCES, when looking at the average gains made by all students.

[20] These differences in growth may derive from a number of sources, including different test scaling as well as different rates of gain. So this interpretation of these differences should be taken as the result of several factors, not only that children gain more ground earlier in elementary school than later.

[21] Differences between our results and those in NCES (2004) most likely derive from NCES’ use of regression analysis, our use of growth curve modeling and precision weights, and a more inclusive sample of heretofore unreleased restricted data.

Secondary School

This section presents findings from the secondary school analyses with the NELS:88 data, following the same format as the previous section. Tables 2.3 and 2.4 present the findings in the same four metrics as the ECLS-K findings, and graphs depict the relationship between the numerical findings and the corresponding proficiencies.

Reading

On average, students make slightly larger gains on the reading test earlier in high school than they do later in high school. In the spring of eighth grade, the average reading achievement is 28.25 points. Students gain an average of 3.66 points between the spring of eighth grade and tenth grade and 2.17 points between tenth grade and twelfth grade. The effect sizes are small, suggesting a slow rate of learning.
Between eighth and tenth grades, students move 0.02 standard deviations (each month) up the distribution of scores at the end of eighth grade on the reading assessment; between tenth and twelfth grades, this drops to 0.01 standard deviations per month, reflecting both slower mean gain in points on the test and a greater standard deviation on the test at the end of tenth grade than at the end of eighth.

TABLE 2.3: READING GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988

Time Period                  Gain Per Month     Effect Size   Gain Per   At End of
                             (Standard Error)   Per Month     Period     Period
Before High School                                                           28.25
8th Grade to 10th Grade      0.152  (0.0033)      0.0199         3.66        31.91
10th Grade to 12th Grade     0.0903 (0.0042)      0.00978        2.17        34.07

Note: Standard errors for the estimated coefficients are presented in the first column of the table in parentheses after the corresponding coefficient. All coefficients are significantly different from zero. To calculate effect sizes, we divide the gain per month by the estimated standard deviation of the base-period test at the start of each time period.

FIGURE 2.3: READING GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988
[Figure: NELS Reading Scores, All Students. Vertical axis: score (20 to 45), marked with proficiency levels 2-Simple Inferences and 3-Complex Inferences. Horizontal axis: grades 8-10 and 10-12.]

Mathematics

On average, students make slightly larger gains in math achievement earlier in high school than they do later in high school. The average eighth grader scores 38.16 points on the math assessment. Students gain an average of 7.78 points (moving 0.029 standard deviations, each month, up the distribution of scores at the end of eighth grade) between eighth and tenth grades and 4.28 points (moving 0.014 standard deviations, each month, up the distribution of scores at the end of tenth grade) between tenth and twelfth grades.
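The per-month effect sizes for high school can be scaled up to effect sizes for each two-year period. The Python sketch below does this for the reading values in Table 2.3. It assumes the length of each period in months can be recovered as the gain per period divided by the gain per month; that is an inference from the table rather than a figure stated in the text, and the function and variable names are hypothetical.

```python
# Reading values from Table 2.3, as reported:
# (gain per month, effect size per month, gain per period).
reading_hs = {
    "8th_to_10th": (0.152, 0.0199, 3.66),
    "10th_to_12th": (0.0903, 0.00978, 2.17),
}


def period_effect_size(gain_per_month, es_per_month, gain_per_period):
    """Scale a monthly effect size to the whole period.

    Assumes months in the period = gain per period / gain per month.
    Returns (months, effect size over the period).
    """
    months = gain_per_period / gain_per_month
    return months, es_per_month * months


results = {
    period: period_effect_size(*values) for period, values in reading_hs.items()
}

for period, (months, es) in results.items():
    print(f"{period}: about {months:.0f} months, {es:.3f} SD over the period")
```

With these inputs each period works out to about 24 months, and the reading gains come to roughly 0.48 SD between eighth and tenth grades and roughly 0.235 SD between tenth and twelfth grades. Small differences from published per-period figures are to be expected because the table entries are rounded.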
TABLE 2.4: MATH GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988

Time Period                  Gain Per Month     Effect Size   Gain Per   At End of
                             (Standard Error)   Per Month     Period     Period
Before High School                                                           38.16
8th Grade to 10th Grade      0.324 (0.0038)       0.0291         7.78        45.94
10th Grade to 12th Grade     0.178 (0.0038)       0.0136         4.28        50.21

Note: Standard errors for the estimated coefficients are presented in the first column of the table in parentheses after the corresponding coefficient. All coefficients are significantly different from zero. To calculate effect sizes, we divide the gain per month by the estimated standard deviation of the base-period test at the start of each time period.

FIGURE 2.4: MATH GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988
[Figure: NELS Math Scores, All Students. Vertical axis: score (20 to 60), marked with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, 4-Intermediate Level Math. Horizontal axis: grades 8-10 and 10-12.]

Summary

Gains made during high school appear slow relative to elementary school. In the two years between eighth and tenth grade, high school students, on average, gain just 0.02 to 0.03 SD per month in reading and math respectively. This is equivalent to about 0.48 SD in reading and about 0.70 SD in math over these two years. The gains between tenth and twelfth grades are even smaller. From tenth grade to twelfth grade, students gain less than a quarter of a standard deviation (0.23 SD) in reading and about a third of a standard deviation in math. The gain per period during the first half of high school is nearly double the gain made during the second half of high school.

There are at least four plausible explanations for the apparent slowdown of learning, only one of which actually implies slower learning in high school. First, the reading and math assessments include basic questions about concepts and skills that students may no longer encounter in their classes, so students are not improving their scores.
High school may be where students gain knowledge about social studies and chemistry, not about reading or basic math. Second, these are not the same tests, nor the same children: the ECLS tests children who were in kindergarten in 1998 and the NELS tests children who were in eighth grade in 1988. These are two different cohorts receiving two different tests, so the results may not be comparable for a host of reasons. Third, the underlying variation in math and reading skills may be much greater in high school than in elementary school, so that gains expressed in standard deviation units appear smaller relative to the variation in the population. Finally, it may be that there are decreasing returns to instruction, and more students learn at a lower rate once they have learned most of the material taught prior to high school (so they are on the “flatter” part of their individual learning curves). This last explanation is the only one of these four explanations that implies slower learning (though both the first and last imply a slower mean rate of growth in reading and math) in high school.

Summary of Elementary and High School Estimates

In elementary school, children are gaining more on math and reading tests in kindergarten and first grade than in second and third grade. By the time students enter high school, their achievement gains in these subjects decrease substantially. In elementary school, children make quadruple the achievement gains high school students make. In kindergarten, children gain about 1.85 SD in reading and in math, and first graders gain nearly 2.0 SD in reading and about 1.8 SD in math. The gain over second and third grades is of a similar magnitude to the kindergarten gain (about 1.8 SD in both reading and math), but this gain is made over two years, not just one year. In high school, the gain per period drops drastically to just 0.48 SD over the first two years in reading and to 0.70 SD in math.
The second half of high school witnesses further decreases in achievement gain; students gain about a quarter of a standard deviation in reading and a third of a standard deviation in math over this time period.

In comparing across subjects, the gain in reading and math is about the same during kindergarten, as measured in effect sizes. During first through third grades, children gain more on the reading test than on the math test. Then in high school, the gain in math exceeds the gain in reading. This makes sense if we consider that the primary grades typically emphasize literacy, so children may pick up more reading skills than math skills. In subsequent years, math learning may depend more on classroom instruction, especially in later grades when advanced math is more likely taught.

Table 2.5 compares two measures of effect size gains, one that divides point gains by the estimated standard deviation on the test at the start of the period, and one that divides point gains by the estimated standard deviation on the test at the end of the period. The first measures progress along the distribution of scores at the initial time period, for example, how far the typical student would move during kindergarten up the distribution of scores from the beginning of kindergarten. The second measures progress along the distribution of scores at the end of the period, for example, how far the typical student would move during kindergarten up the distribution of scores taken from the end of kindergarten. Both have a similar interpretation, and both are in some sense scale-free, but give slightly different impressions of relative rates of gain.

The gain per period in elementary school seems quite large, measured in standard deviation units (regardless of whether we use the standard deviation from the first-period test, which we refer to as effect sizes, or the alternative measure).
Students gain more than one standard deviation per grade in kindergarten and first grade and just under one standard deviation per grade in second and third grades. In comparison, Kane (2003) finds effect sizes of only 0.25 and 0.5 during elementary school. However, Kane was looking at math and reading gains during fifth grade. This would be consistent with a pattern of decreasing effect sizes across grade-levels after first grade. Indeed, our findings suggest an average gain in second and third grades slower than the first grade gain, but faster than Kane’s fifth grade gains. In addition, our estimates suggest that by high school students are making gains of about two to three tenths of a standard deviation per school year between eighth and tenth grade and more than one tenth but less than two tenths of a standard deviation per school year between tenth and twelfth grades.

TABLE 2.5: COMPARING READING AND MATH GAINS ACROSS TIME (IN TWO DIFFERENT EFFECT SIZE PER PERIOD MEASURES)

                              Using SD at            Using SD at
                              Start of Period        End of Period
Time Period                   Reading     Math       Reading     Math
Kindergarten                    1.84      1.85         1.25      1.31
First Grade                     1.98      1.80         1.46      1.42
2nd and 3rd Grades              1.80      1.84         1.86      1.60
8th Grade to 10th Grade         0.478     0.698        0.396     0.592
10th Grade to 12th Grade        0.235     0.325        0.218     0.306

There are several possible explanations for the apparent decline in learning rates. First, there is likely a shift in instruction, away from basic skills such as reading and math, toward more specialized topics such as social studies or physical sciences. The reading and math assessments used in these analyses do not focus on such topics. Second, some students may in high school reach the level of proficiency in reading that is their lifetime maximum, and additional instruction has no effect on these students. In general, we expect there to be decreasing returns to instruction in any topic, and many students are likely on the “flat” part of their learning curve in reading by tenth grade.
A third possible explanation for the decreasing effect size between elementary school and high school is the changing population. The demographic composition of the samples may shift in ways that increase the standard deviation in the distribution of test scores between elementary and high school. For example, if the fraction of black and Hispanic students increased substantially between 1988 (when the NELS study started) and 1998 (when the ECLS-K study started), this could cause the standard deviation in test scores overall to go up.[22] Indeed, the ECLS-K data contain a much higher fraction of black and Hispanic students than the NELS:88 data. Thus, based on differences in demographics alone, we expect larger standard deviations in the ECLS-K data and, consequently, smaller effect sizes.

Another possible explanation for the decreasing effect size between elementary school and high school is simply that the standard deviation of test scores increases as children move to higher grade levels. Indeed, the standard deviations in ECLS-K increase consistently as children move from kindergarten to third grade and a similar pattern is seen in the NELS:88 data as students move from eighth to twelfth grade.[23] However, the ECLS-K scale scores are not comparable to the NELS:88 scale scores.

In sum, the rate of gain in elementary school appears to differ substantially from the rate of gain in high school, as measured by gains in points relative to the distributions on tests administered in ECLS and NELS. A quick rate of gain in reading and math emerges in kindergarten and first grade, which then drops off during second and third grades. By high school, the rate of gain slows even more dramatically. This pattern is manifest through our analyses using the IRT scale score metric as well as effect sizes.
However, we cannot conclude that different rates of gains on these tests, even measured in effect sizes so they are more comparable across tests, correspond to different rates of learning at different points in time.

[22] This assumes that the (within-group) standard deviation in scores for Black students and Hispanic students is similar to (or larger than) the standard deviation for other students but that the Black students and Hispanic students have much lower scores on average.

[23] See Appendix C for a discussion on the standard deviations and a corresponding table.

CHAPTER III: DIFFERENCES IN LEARNING RATES

The previous chapter established the average baseline learning rates for elementary school students and for secondary school students. Focusing on the average, however, can mask vast differences in learning rates across different student subgroups. Several studies have found substantial gaps in achievement by race/ethnicity and by socioeconomic status, starting from before kindergarten through the end of high school (Fryer & Levitt, 2002; Hedges & Nowell, 1998; Lee & Burkam, 2002; Phillips et al., 1998). This chapter presents differences in learning rates by four dimensions of student background: gender, race/ethnicity, language background, and economic status. For each of these dimensions, reading and mathematics results are separately presented. Relevant tables and graphs follow the text.

Elementary School

Gender

Reading

Female students begin kindergarten with higher scores than male students and maintain their slight advantage through kindergarten, as shown in Table 3.1 and Figure 3.1. At the start of kindergarten, girls are predicted to score nearly a point higher on the reading assessment. During kindergarten and first grade, girls gain very slightly more than boys (0.136 points per month or 0.0146 SD in kindergarten; 0.00925 SD in first grade), but the difference is statistically significant.
After first grade, the gain per year on reading tests is essentially identical across genders, but due to the initial differences, the slight advantage for females (in terms of overall points earned on the assessment) remains. Girls finish third grade with an average reading score nearly 4 points higher than boys.

This advantage is seen in Figure 3.1. In kindergarten, the lines representing gains are quite close, and they separate by first grade with the line representing females’ learning very slightly steeper. During the second and third grades, the lines that identify male and female learning rates are parallel, with the scores for females remaining slightly higher than the scores for males. But, in terms of substance, by the end of third grade, both boys and girls are learning literal inference and not yet learning extrapolation.

TABLE 3.1: DIFFERENCES IN READING LEARNING RATES BY GENDER, ELEMENTARY SCHOOL

Time Period                       Gain Per Month       Effect Size   Gain Per   At End of
                                  (Standard Error)     Per Month     Period     Period

MALE STUDENTS
Before Kindergarten                                                                 22.29
During Kindergarten                1.75    (0.0147)      0.189         16.44        38.73
Summer K-1st                      -0.230   (0.0651)     -0.0168        -0.59        38.14
During 1st Grade                   3.21    (0.0271)      0.206         30.24        68.38
After 1st Grade, into 3rd Grade    1.59    (0.0094)      0.102         38.25       106.63

FEMALE STUDENTS (DIFFERENCE FROM MALE STUDENTS)
Before Kindergarten                                                                  0.94
During Kindergarten                0.136   (0.0211)      0.0146         1.28         2.22
Summer K-1st                       0.120   (0.0923)      0.00880        0.31         2.53
During 1st Grade                   0.144   (0.0374)      0.00925        1.36         3.88
After 1st Grade, into 3rd Grade   -0.00694 (0.0129)     -0.000445      -0.17         3.72

These analyses are based on students who have at least one reading or math test score in five rounds of ECLS-K data. Each estimate in bold is significantly different from the corresponding estimate for male students at the 5 percent level. Descriptions of models are provided in Appendix A.
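The "At End of Period" column for female students in Table 3.1 is a running total of the initial difference and the per-period differences in gain. The Python sketch below reproduces that accumulation from the rounded table entries; the period labels are shorthand introduced here for illustration. Because the inputs are rounded to two decimals, a running total can differ from the published column in the last digit.

```python
from itertools import accumulate

# Female-minus-male differences in reading points, from Table 3.1:
# the initial gap followed by the difference in gain for each period.
periods = ["start of K", "end of K", "end of summer", "end of 1st", "end of 3rd"]
differences = [0.94, 1.28, 0.31, 1.36, -0.17]

# Running total of the gap at the end of each period.
gap = [round(total, 2) for total in accumulate(differences)]

for label, value in zip(periods, gap):
    print(f"{label}: girls ahead by {value:.2f} points")
```

The running totals track the published column closely and end at a gap of about 3.7 points, consistent with girls finishing third grade nearly 4 points ahead in reading.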
FIGURE 3.1: DIFFERENCES IN READING LEARNING RATES BY GENDER, ELEMENTARY SCHOOL
[Figure: "ECLS Reading Scores by Gender." Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition, 2-Beginning Sounds, 3-Ending Sounds, 4-Sight Words, 5-Word in Context, 6-Literal Inference, 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Female, Male.]

Math

Male and female students start kindergarten with very similar math scores, but in sharp contrast to the reading results, boys begin to edge out girls in first grade (see Table 3.2). Also in contrast to the reading results, the gap continues to widen over time. In kindergarten, boys and girls start with similar math scores and make similar gains on the math assessment. In first grade, girls begin to gain less in math than boys (0.0078 SD less per month). By the third grade assessment, girls score 2.79 points lower on the math assessment than boys. But this does not translate into a great difference in skill attainment. Both male and female students are learning place value by the end of third grade, as presented in Figure 3.2.

TABLE 3.2: DIFFERENCES IN MATH LEARNING RATES BY GENDER, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| MALE STUDENTS | | | | |
| Before Kindergarten | | | | 17.53 |
| During Kindergarten | 1.63 (0.0132) | 0.198 | 15.37 | 32.91 |
| Summer K-1st | 0.469 (0.0649) | 0.0403 | 1.21 | 34.11 |
| During 1st Grade | 2.41 (0.0229) | 0.195 | 22.72 | 56.83 |
| After 1st Grade, into 3rd Grade | 1.23 (0.0069) | 0.0998 | 29.68 | 86.51 |
| FEMALE STUDENTS (DIFFERENCE FROM MALE STUDENTS) | | | | |
| Before Kindergarten | | | | -0.03 |
| During Kindergarten | -0.0350 (0.0180) | -0.00425 | -0.33 | -0.36 |
| Summer K-1st | 0.0464 (0.0890) | 0.00399 | 0.12 | -0.24 |
| During 1st Grade | -0.0959 (0.0316) | -0.00776 | -0.90 | -1.15 |
| After 1st Grade, into 3rd Grade | -0.0684 (0.0097) | -0.00553 | -1.64 | -2.79 |

These analyses are based on students who have at least one reading or math test score in five rounds of ECLS-K data.
Each estimate in bold is significantly different from the corresponding estimate for male students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.2: DIFFERENCES IN MATH LEARNING RATES BY GENDER, ELEMENTARY SCHOOL
[Figure: "ECLS Math Scores by Gender." Vertical axis: score (10 to 100), marked with proficiency levels 1-Count, Number, Shape; 2-Relative Size; 3-Ordinality, Sequence; 4-Add/Subtract; 5-Multiply/Divide; 6-Place Value; 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Female, Male.]

Race/Ethnicity

Reading

Even before school begins, learning differences by race/ethnicity emerge, and these differences persist during the early school years. Table 3.3 presents differences in learning rates by race/ethnicity, and Figure 3.3 maps these learning rates onto skill proficiency levels. Black children begin kindergarten more than 3 points behind white children and trail in their reading learning rates during the first three years of school. Black children gain 0.031 SD less per month during kindergarten and 0.037 SD less during first grade. Thus the initial difference is compounded by a slower learning rate. Figure 3.3 illustrates the widening of the difference in points gained per time period.

Hispanic children start kindergarten with the lowest average score on the reading assessment. This deficit increases during school, because Hispanic students make smaller gains in reading than white students. Compared to white children, Hispanic children gain 0.0201 SD less per month in kindergarten and 0.044 SD less per month in first grade. In second and third grades, Hispanic children are still gaining significantly less per month than white children. But Hispanic children are not as disadvantaged in their reading gains as black children.
The deficit that Hispanic children face is significantly smaller than what black children face in kindergarten and in second and third grades.24

Footnote 24. We tested selected subgroup comparisons to determine if differences in subgroup differences from White students (slopes in HLM parlance) were significant.

Asian children start kindergarten with 1.51 more points than white children and learn significantly more quickly than white children in kindergarten, gaining an average 0.240 more points per month or 0.0259 SD more. Surprisingly, unlike other subgroups, Asian children on average make gains in reading during the summer (0.611 points per month or 0.0448 SD). However, during first through third grades, the pattern reverses; Asian students learn significantly less in reading than white students. Figure 3.3 shows the narrowing of these differences by the end of first grade and the crossover in the second and third grade time period as the cumulative average score for white students begins to exceed the cumulative score for Asian students.

Figure 3.3 depicts the racial/ethnic learning differences. The learning rates for black and Hispanic children are below those for white and Asian children in kindergarten and remain so through the spring of third grade. Black and Hispanic children end third grade on average learning literal inference, a skill that Asian and white children have already learned.
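The text above quotes several differences twice, once in points per month and once in SD units. Assuming the SD figure is simply the point figure divided by a period-specific standard deviation (an assumption; the exact standardization is described in the report's appendices, not here), the implied standard deviation can be backed out. The helper name `implied_sd` is hypothetical.

```python
def implied_sd(points_per_month, sd_units_per_month):
    """Score SD implied by one gain expressed in both points and SD units."""
    return points_per_month / sd_units_per_month

# Asian-minus-white reading differences quoted in the text
kindergarten_sd = implied_sd(0.240, 0.0259)  # during kindergarten
summer_sd = implied_sd(0.611, 0.0448)        # summer between K and 1st grade

print(round(kindergarten_sd, 2), round(summer_sd, 2))
```

Under this assumption the implied SD is larger for the summer period than for kindergarten, so the same point difference corresponds to fewer SD units in the later period.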
TABLE 3.3: DIFFERENCES IN READING LEARNING RATES BY RACE/ETHNICITY, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| WHITE STUDENTS | | | | |
| Before Kindergarten | | | | 24.22 |
| During Kindergarten | 1.90 (0.0137) | 0.205 | 17.86 | 42.08 |
| Summer K-1st | -0.231 (0.0602) | -0.0169 | -0.60 | 41.49 |
| During 1st Grade | 3.53 (0.0251) | 0.227 | 33.27 | 74.75 |
| After 1st Grade, into 3rd Grade | 1.64 (0.0083) | 0.0773 | 39.41 | 114.16 |
| BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.17 |
| During Kindergarten | -0.283 (0.0281) | -0.0306 | -2.66 | -5.83 |
| Summer K-1st | -0.0288 (0.1240) | -0.00211 | -0.07 | -5.91 |
| During 1st Grade | -0.576 (0.0507) | -0.0370 | -5.43 | -11.33 |
| After 1st Grade, into 3rd Grade | -0.185 (0.0195) | -0.00871 | -4.44 | -15.77 |
| HISPANIC STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -4.92 |
| During Kindergarten | -0.186 (0.0304) | -0.0201 | -1.75 | -6.67 |
| Summer K-1st | 0.268 (0.1350) | 0.0196 | 0.69 | -5.98 |
| During 1st Grade | -0.687 (0.0499) | -0.0440 | -6.47 | -12.45 |
| After 1st Grade, into 3rd Grade | -0.0578 (0.0176) | -0.00273 | -1.39 | -13.84 |
| ASIAN STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | 1.51 |
| During Kindergarten | 0.240 (0.0595) | 0.0259 | 2.26 | 3.77 |
| Summer K-1st | 0.611 (0.3020) | 0.0448 | 1.57 | 5.34 |
| During 1st Grade | -0.289 (0.1020) | -0.0185 | -2.72 | 2.63 |
| After 1st Grade, into 3rd Grade | -0.250 (0.0274) | -0.0118 | -6.02 | -3.39 |

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.3: DIFFERENCES IN READING LEARNING RATES BY RACE/ETHNICITY, ELEMENTARY SCHOOL
[Figure: "ECLS Reading Scores by Race." Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition through 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian, Hispanic, Black, White.]

Math

Table 3.4 presents the learning rates in mathematics for children who enrolled in kindergarten in 1998.
As in reading, Asian and white children score higher than black and Hispanic children in mathematics. At the start of kindergarten, black children score about 4.5 points lower than white children on the math assessment. By the end of third grade, black children trail white students by a total of about 15.5 points.

Hispanic children start kindergarten with the lowest score in math and, compared to white children, gain less in math over the first few years of elementary school. The difference in monthly math gains decreases over time, but Hispanic students continue to learn at lower rates during the later periods. The initial gap and the slower learning pace result in Hispanic children scoring more than 10 points lower than white students (on average) at the end of third grade.

Asian children and white children start kindergarten with nearly equivalent scores and gain a similar number of points on the math assessment during kindergarten. White children, however, experience slightly larger gains during first grade, and Asian children make larger gains in second and third grades. By the end of third grade, the difference between the two subgroups' accumulated math gains is less than a quarter of a point.

Figure 3.4 depicts the differences in learning rates by race/ethnicity. The lines representing the learning rates of white and Asian children overlap and are substantially above those representing the other subgroups. White and Asian children begin kindergarten learning relative size and end third grade learning place value, but black and Hispanic children are nearly one grade level behind throughout these grades (for example, they are learning to multiply and divide in third grade, but white and Asian students are learning this skill in second grade). The differences magnify slightly for black children over these primary school years, while Hispanic students seem to maintain a constant disadvantage relative to white and Asian students.
TABLE 3.4: DIFFERENCES IN MATHEMATICS LEARNING RATES BY RACE/ETHNICITY, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| WHITE STUDENTS | | | | |
| Before Kindergarten | | | | 19.53 |
| During Kindergarten | 1.73 (0.0122) | 0.210 | 16.26 | 35.79 |
| Summer K-1st | 0.473 (0.0624) | 0.0407 | 1.22 | 37.01 |
| During 1st Grade | 2.49 (0.0225) | 0.202 | 23.48 | 60.49 |
| After 1st Grade, into 3rd Grade | 1.23 (0.0062) | 0.0784 | 29.51 | 90.00 |
| BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -4.64 |
| During Kindergarten | -0.361 (0.0237) | -0.0439 | -3.40 | -8.04 |
| Summer K-1st | -0.0965 (0.1200) | -0.00829 | -0.25 | -8.29 |
| During 1st Grade | -0.379 (0.0424) | -0.0306 | -3.57 | -11.85 |
| After 1st Grade, into 3rd Grade | -0.147 (0.0144) | -0.00938 | -3.53 | -15.38 |
| HISPANIC STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -5.63 |
| During Kindergarten | -0.253 (0.0230) | -0.0307 | -2.38 | -8.01 |
| Summer K-1st | 0.125 (0.1160) | 0.0108 | 0.32 | -7.68 |
| During 1st Grade | -0.222 (0.0399) | -0.0180 | -2.09 | -9.78 |
| After 1st Grade, into 3rd Grade | -0.0275 (0.0130) | -0.00176 | -0.66 | -10.44 |
| ASIAN STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | 0.56 |
| During Kindergarten | -0.0341 (0.0467) | -0.00414 | -0.32 | 0.23 |
| Summer K-1st | 0.443 (0.2770) | 0.0381 | 1.14 | 1.38 |
| During 1st Grade | -0.362 (0.0818) | -0.0293 | -3.41 | -2.03 |
| After 1st Grade, into 3rd Grade | 0.0941 (0.0208) | 0.00601 | 2.26 | 0.23 |

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.
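The bold marking that flags significant estimates in Table 3.4 does not survive in plain text, but a rough check is possible from the printed numbers. The sketch below assumes that the values in parentheses are standard errors and that a two-sided normal test at the 5 percent level is a reasonable stand-in for the tests run in the HLM models; both are assumptions, and the function names are hypothetical.

```python
import math

def z_ratio(estimate, std_error):
    """Estimate divided by its (assumed) standard error."""
    return estimate / std_error

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Asian-minus-white differences in monthly math gain, Table 3.4,
# ASSUMING the parenthesized values are standard errors.
asian_rows = {
    "During Kindergarten": (-0.0341, 0.0467),
    "Summer K-1st": (0.443, 0.2770),
    "During 1st Grade": (-0.362, 0.0818),
    "After 1st Grade, into 3rd Grade": (0.0941, 0.0208),
}

significant = {}
for period, (estimate, std_error) in asian_rows.items():
    z = z_ratio(estimate, std_error)
    significant[period] = two_sided_p(z) < 0.05
    print(f"{period}: z = {z:.2f}, significant at 5%: {significant[period]}")
```

Under these assumptions the kindergarten and summer differences are not distinguishable from zero, while the first grade and later differences are, which matches the description of Asian and white students' math gains in the text.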
FIGURE 3.4: DIFFERENCES IN MATH LEARNING RATES BY RACE/ETHNICITY, ELEMENTARY SCHOOL
[Figure: "ECLS Math Scores by Race." Vertical axis: score (10 to 100), marked with proficiency levels 1-Count, Number, Shape through 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian, Hispanic, Black, White.]

Language Status

Reading

We define language status as a dichotomous variable indicating whether children come from homes in which the primary language spoken is English (EH) or not English (NEH).25 Table 3.5 presents the results for reading learning by language status, and Figure 3.5 aligns these learning rates with gains in skill proficiency levels. Students included in the language assessments speak English with sufficient fluency to qualify to take the reading assessment. Please refer to Appendix A for a discussion comparing achievement between those students who qualified to take the reading assessment and those who did not.26

Footnote 25. Children who were administered the oral language screening test are classified as NEH, since children whose home language was not English were administered the oral language screening test to determine if they could take the assessments. By the third grade assessment, no child was excluded from taking the assessment for not speaking English with sufficient fluency, so this is more a measure of potential limited English proficiency (LEP) at some point in time than of actual contemporaneous LEP.

Footnote 26. Hispanic students who failed the OLDS screening test start kindergarten with the lowest average reading score and continue to earn less than White students on the reading assessment throughout early elementary school. Asian students who failed the OLDS start off at the same reading score as White students on average, but gain less than White students in almost every time period. Asian students who never failed the OLDS fare better than White students until second and third grades. Hispanic never-failed students essentially keep pace with White students. Students who never failed the OLDS are more likely to be included in early estimates of reading learning. Children who failed the OLDS at least once enter the models after the first time period and therefore may bias the later estimates. Their initial exclusion biases the initial level upward (students weaker in English reading are excluded initially), and their entrance biases the level at the subsequent time point downward, especially for the subgroups with language minority members, such as Asian students and Hispanic students (students who are weaker are now included in average estimates). This leads to a lower average estimated growth rate. See Appendix A for more details.

Children from homes where English is not the primary language start kindergarten with reading scores 4.12 points below those of their peers whose home language is English. During kindergarten, NEH students gain slightly less per month than EH students, and this learning rate gap grows even larger in first grade. By the end of first grade, the cumulative point difference in reading is already more than 10 points. In second and third grades, the difference in reading gains is significant, but small. Thus the 10-point difference widens only slightly. This is illustrated in Figure 3.5, where the lines representing learning rates visibly diverge over the first grade time period yet stay parallel throughout the second and third grades. On average, NEH children are learning literal inference by the end of third grade, while EH children have generally advanced past this skill.
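The compositional effect described in footnote 26 (students who fail the OLDS are missing from the first measurement and enter later) can be illustrated with a toy calculation. All numbers below are hypothetical and chosen only for illustration; none come from ECLS-K. Every student gains the same amount, yet the observed average gain is smaller because the later average includes lower-scoring students who were not assessed earlier.

```python
from statistics import mean

TRUE_GAIN = 10.0  # hypothetical gain, identical for every student

# Hypothetical fall scores. Group A passes the language screen in the fall;
# group B fails it and is first assessed in the spring.
fall_a = [30.0, 32.0, 28.0]
fall_b = [20.0, 22.0, 18.0]

spring_a = [score + TRUE_GAIN for score in fall_a]
spring_b = [score + TRUE_GAIN for score in fall_b]

observed_fall = mean(fall_a)                 # group B is not yet assessed
observed_spring = mean(spring_a + spring_b)  # both groups are assessed
observed_gain = observed_spring - observed_fall

print(f"true gain: {TRUE_GAIN}, observed gain: {observed_gain}")
```

In this toy case the observed gain is half the true gain, purely because the sample changes between the two time points.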
TABLE 3.5: DIFFERENCES IN READING LEARNING RATES BY LANGUAGE STATUS, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| ENGLISH SPEAKING HOME STUDENTS | | | | |
| Before Kindergarten | | | | 23.16 |
| During Kindergarten | 1.84 (0.0110) | 0.199 | 17.30 | 40.46 |
| Summer K-1st | -0.205 (0.0482) | -0.0150 | -0.53 | 39.94 |
| During 1st Grade | 3.36 (0.0199) | 0.215 | 31.59 | 71.53 |
| After 1st Grade, into 3rd Grade | 1.59 (0.0070) | 0.0752 | 38.33 | 109.86 |
| NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM ENGLISH SPEAKING HOME STUDENTS) | | | | |
| Before Kindergarten | | | | -4.17 |
| During Kindergarten | -0.154 (0.0398) | -0.0166 | -1.45 | -5.62 |
| Summer K-1st | 0.154 (0.1680) | 0.0113 | 0.40 | -5.22 |
| During 1st Grade | -0.543 (0.0571) | -0.0348 | -5.11 | -10.34 |
| After 1st Grade, into 3rd Grade | -0.0217 (0.0181) | -0.00102 | -0.52 | -10.86 |

Each estimate in bold is significantly different from the corresponding estimate for EH students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.5: DIFFERENCES IN READING LEARNING RATES BY LANGUAGE STATUS, ELEMENTARY SCHOOL
[Figure: "ECLS Reading Scores by Language Status." Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition through 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: EH, Non-EH.]

Math

The math assessment includes more children who may not speak English at home, because the math test was translated into Spanish for those participants who did not pass the language screening test. Results for math by language status are presented in Table 3.6 and represented in Figure 3.6 as changes in skill proficiency levels.

Children from different language backgrounds learn math at different rates in kindergarten and in first grade. NEH children begin kindergarten more than five points behind their peers on the math assessment. The difference in learning rates is slight; NEH students gain just 0.200 points less per month (-0.02 SD) in kindergarten.
The deficit declines in first grade to 0.0773 points less per month (-0.00627 SD). By second and third grades, there is a slight, but significant, difference in math learning rates between the two groups, with the difference favoring the NEH students. The reversal in second and third grades cannot eliminate the gap, however, and NEH students' scores remain behind EH students' scores.

TABLE 3.6: DIFFERENCES IN MATH LEARNING RATES BY LANGUAGE STATUS, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| ENGLISH SPEAKING HOME STUDENTS | | | | |
| Before Kindergarten | | | | 23.16 |
| During Kindergarten | 1.84 (0.0110) | 0.199 | 17.30 | 40.46 |
| Summer K-1st | -0.205 (0.0482) | -0.0150 | -0.53 | 39.94 |
| During 1st Grade | 3.36 (0.0199) | 0.215 | 31.59 | 71.53 |
| After 1st Grade, into 3rd Grade | 1.59 (0.0070) | 0.0752 | 38.33 | 109.86 |
| NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM ENGLISH SPEAKING HOME STUDENTS) | | | | |
| Before Kindergarten | | | | -4.17 |
| During Kindergarten | -0.154 (0.0398) | -0.0166 | -1.45 | -5.62 |
| Summer K-1st | 0.154 (0.1680) | 0.0113 | 0.40 | -5.22 |
| During 1st Grade | -0.543 (0.0571) | -0.0348 | -5.11 | -10.34 |
| After 1st Grade, into 3rd Grade | -0.0217 (0.0181) | -0.00102 | -0.52 | -10.86 |

Each estimate in bold is significantly different from the corresponding estimate for EH students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.6: DIFFERENCES IN MATH LEARNING RATES BY LANGUAGE STATUS, ELEMENTARY SCHOOL
[Figure: "ECLS Math Scores by Language Status." Vertical axis: score (10 to 100), marked with proficiency levels 1-Count, Number, Shape through 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: EH, Non-EH.]

Race and Language

Reading

Table 3.7 presents reading skills by racial/ethnic categories and by whether English is the primary language spoken at home.27,28 As seen in an earlier table (Table 3.3), Hispanic students experience lower growth rates in reading than white students.
In Table 3.7, we see that this pattern holds for both Hispanic EH and NEH students. However, the differences are much larger for NEH children. In first grade, the deficit for Hispanic NEH children balloons. Most likely this is due to the entrance of students previously excluded from the assessment because they failed to pass the OLDS screening test. Far more Hispanic NEH students qualified to take the reading assessment in first grade than in kindergarten. These newly included students may demonstrate weaker-than-average English reading skills and thus depress the reading scores in the first grade spring data. This helps to explain the lower scores and gains in first grade for Hispanic NEH children. Hispanic EH children move from being about 3 points behind white children at the beginning of kindergarten to more than 8 points behind by the end of third grade. In contrast, Hispanic NEH children shift from a gap that starts at 8.15 points to a much wider gap of over 19 points.

Footnote 27. We remind readers that the language minority sample with reading scores is somewhat more advantaged than the sample excluded from the assessment process for language reasons. This caveat does not apply in the same way to mathematics tests. Hispanic LEP students who do not pass the screening test take a Spanish form of the math assessment. There were no translations of the math test into Asian languages, so Asian LEP students were excluded.

Footnote 28. Children from homes where a language other than English is predominantly spoken are labeled NEH, for non-English speaking home. Children from homes where English is the primary language are labeled EH, for English-speaking home.

Asian EH children start kindergarten with higher average reading scores than white children but learn at similar rates to white children during kindergarten and first grade. In second and third grades, Asian EH children's learning rate is slightly but significantly slower than white children's learning rate.
Thus, Asian EH students end third grade with about the same accumulated gain as white students. Asian NEH children start kindergarten with an average reading score less than a point behind white children. These Asian children then learn more than white children in kindergarten, but significantly less in the later grades. Thus Asian NEH students end third grade with cumulatively fewer points on the reading assessment than white students. The more these students are exposed to school, the slower their learning rate seems to become.

On average, all groups except Hispanic NEH begin kindergarten with skills in letter recognition. By the end of kindergarten, white and Asian children (regardless of language at home) have learned beginning and ending sounds. By the end of third grade, all groups are learning literal inference, though white and Asian EH children are beginning to learn how to extrapolate (see Figure 3.7).

TABLE 3.7: DIFFERENCES IN READING LEARNING RATES BY RACE AND LANGUAGE, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| WHITE STUDENTS | | | | |
| Before Kindergarten | | | | 24.22 |
| During Kindergarten | 1.90 (0.0137) | 0.205 | 17.86 | 42.08 |
| Summer K-1st | -0.231 (0.0602) | -0.0169 | -0.60 | 41.49 |
| During 1st Grade | 3.53 (0.0251) | 0.227 | 33.27 | 74.76 |
| After 1st Grade, into 3rd Grade | 1.64 (0.0083) | 0.0773 | 39.41 | 114.17 |
| BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.17 |
| During Kindergarten | -0.283 (0.0281) | -0.0306 | -2.66 | -5.83 |
| Summer K-1st | -0.0290 (0.1240) | -0.00212 | -0.07 | -5.91 |
| During 1st Grade | -0.576 (0.0507) | -0.0370 | -5.43 | -11.33 |
| After 1st Grade, into 3rd Grade | -0.185 (0.0195) | -0.00870 | -4.44 | -15.77 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.36 |
| During Kindergarten | -0.0567 (0.0380) | -0.00613 | -0.53 | -3.89 |
| Summer K-1st | 0.366 (0.1770) | 0.0268 | 0.94 | -2.95 |
| During 1st Grade | -0.437 (0.0676) | -0.0280 | -4.12 | -7.07 |
| After 1st Grade, into 3rd Grade | -0.0648 (0.0254) | -0.00305 | -1.56 | -8.62 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -6.98 |
| During Kindergarten | -0.293 (0.0450) | -0.0317 | -2.76 | -9.74 |
| Summer K-1st | 0.0404 (0.1790) | 0.00296 | 0.10 | -9.64 |
| During 1st Grade | -0.880 (0.0629) | -0.0564 | -8.28 | -17.92 |
| After 1st Grade, into 3rd Grade | -0.0309 (0.0213) | -0.00146 | -0.74 | -18.67 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | 3.85 |
| During Kindergarten | 0.312 (0.0817) | 0.0337 | 2.94 | 6.79 |
| Summer K-1st | 0.758 (0.3720) | 0.0556 | 1.95 | 8.75 |
| During 1st Grade | -0.213 (0.1290) | -0.0136 | -2.00 | 6.74 |
| After 1st Grade, into 3rd Grade | -0.273 (0.0408) | -0.0129 | -6.56 | 0.18 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -0.56 |
| During Kindergarten | 0.221 (0.0837) | 0.0239 | 2.08 | 1.51 |
| Summer K-1st | 0.501 (0.4730) | 0.0367 | 1.29 | 2.81 |
| During 1st Grade | -0.327 (0.1560) | -0.0210 | -3.08 | -0.27 |
| After 1st Grade, into 3rd Grade | -0.235 (0.0350) | -0.0111 | -5.64 | -5.92 |

Estimates in bold are significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.7: DIFFERENCES IN READING LEARNING RATES BY RACE AND LANGUAGE, ELEMENTARY SCHOOL
[Figure: "ECLS Reading Scores by Race and Language." Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition through 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian EH, Black, Hispanic Non-EH, Asian Non-EH, Hispanic EH, White.]

Math

Table 3.8 breaks out race and ethnic groups by their language proficiency. An earlier table (Table 3.4) shows Hispanic students experiencing slower math growth between the beginning of kindergarten and the end of third grade than white students.
In Table 3.8, this pattern generally holds true regardless of the language spoken at home. Hispanic NEH students, however, make slightly smaller gains in math than Hispanic EH students.

As presented earlier in Table 3.4, Asian and white students learn math at about the same rate between the beginning of kindergarten and the end of third grade, with some vacillation. In Table 3.8, the same pattern emerges for Asian NEH children. Asian NEH children start kindergarten with an average test score that does not differ significantly from that of white children but gain less than white children during kindergarten. Nearly 25 percent of Asian students did not take the math assessment in kindergarten but did qualify in first grade. The introduction of these students to the sample may lower the average math score for this subgroup in first grade, thus explaining the larger deficit. In second and third grades, Asian NEH students are making greater gains and shrinking the gap that widened during first grade.

Asian EH students start kindergarten with an average math score almost two points higher than white students. In kindergarten, Asian EH children and white children gain a similar number of points on the math assessment. Unlike Asian NEH children, Asian EH children gain significantly more than white children in the summer between kindergarten and first grade. Like Asian NEH students, Asian EH students fall behind in math during first grade, but regain the advantage in second and third grades.

By the end of third grade, white students and both groups of Asian students are, on average, learning place value, as illustrated in Figure 3.8. In sum, by the end of third grade, Asian EH students have scores similar to white students in both reading and math, and Asian NEH students are behind white students only in reading. Black and Hispanic students fall behind in gaining math skills, and so end third grade on average not yet learning place value.
TABLE 3.8: DIFFERENCES IN MATH LEARNING RATES BY RACE AND LANGUAGE, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| WHITE STUDENTS | | | | |
| Before Kindergarten | | | | 24.22 |
| During Kindergarten | 1.90 (0.0137) | 0.205 | 17.86 | 42.08 |
| Summer K-1st | -0.231 (0.0602) | -0.0169 | -0.60 | 41.49 |
| During 1st Grade | 3.53 (0.0251) | 0.227 | 33.27 | 74.76 |
| After 1st Grade, into 3rd Grade | 1.64 (0.0083) | 0.0773 | 39.41 | 114.17 |
| BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.17 |
| During Kindergarten | -0.283 (0.0281) | -0.0306 | -2.66 | -5.83 |
| Summer K-1st | -0.0290 (0.1240) | -0.00212 | -0.07 | -5.91 |
| During 1st Grade | -0.576 (0.0507) | -0.0370 | -5.43 | -11.33 |
| After 1st Grade, into 3rd Grade | -0.185 (0.0195) | -0.00870 | -4.44 | -15.77 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.36 |
| During Kindergarten | -0.0567 (0.0380) | -0.00613 | -0.53 | -3.89 |
| Summer K-1st | 0.366 (0.1770) | 0.0268 | 0.94 | -2.95 |
| During 1st Grade | -0.437 (0.0676) | -0.0280 | -4.12 | -7.07 |
| After 1st Grade, into 3rd Grade | -0.0648 (0.0254) | -0.00305 | -1.56 | -8.62 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -6.98 |
| During Kindergarten | -0.293 (0.0450) | -0.0317 | -2.76 | -9.74 |
| Summer K-1st | 0.0404 (0.1790) | 0.00296 | 0.10 | -9.64 |
| During 1st Grade | -0.880 (0.0629) | -0.0564 | -8.28 | -17.92 |
| After 1st Grade, into 3rd Grade | -0.0309 (0.0213) | -0.00146 | -0.74 | -18.67 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | 3.85 |
| During Kindergarten | 0.312 (0.0817) | 0.0337 | 2.94 | 6.79 |
| Summer K-1st | 0.758 (0.3720) | 0.0556 | 1.95 | 8.75 |
| During 1st Grade | -0.213 (0.1290) | -0.0136 | -2.00 | 6.74 |
| After 1st Grade, into 3rd Grade | -0.273 (0.0408) | -0.0129 | -6.56 | 0.18 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS) | | | | |
| Before Kindergarten | | | | -3.36 |
| During Kindergarten | -0.0567 (0.0380) | -0.00613 | -0.53 | -3.89 |
| Summer K-1st | 0.366 (0.1770) | 0.0268 | 0.94 | -2.95 |
| During 1st Grade | -0.437 (0.0676) | -0.0280 | -4.12 | -7.07 |
| After 1st Grade, into 3rd Grade | -0.0648 (0.0254) | -0.00305 | -1.56 | -8.62 |

Estimates in bold are significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A. Some of the numbers in the end of period column are off by 0.01 points due to rounding.

FIGURE 3.8: DIFFERENCES IN MATH LEARNING RATES BY RACE AND LANGUAGE, ELEMENTARY SCHOOL
[Figure: "ECLS Math Scores by Race and Language." Vertical axis: score (10 to 100), marked with proficiency levels 1-Count, Number, Shape through 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian EH, Black, Hispanic Non-EH, Asian Non-EH, Hispanic EH, White.]

Economic Status29

Reading

The findings presented in Table 3.9 show that low-income students begin kindergarten more than 5 points behind higher-income students on the reading assessment. Low-income children are learning to recognize letters at the beginning of kindergarten (see Figure 3.9). Higher-income students are learning the beginning sounds of words in the first half of kindergarten, on average, but low-income children are learning beginning sounds at the end of kindergarten. The learning rates differ slightly in kindergarten and continue to diverge significantly in the summer and in first grade. By the end of first grade, low-income children are just learning how to infer meaning from text, whereas higher-income children are beginning to learn the next more advanced skill, extrapolation (see Figure 3.9).

The gap between low-income and higher-income children in reading widens from kindergarten through third grade. Because of the initial advantage and faster gain for higher-income children, low-income children continue to lag behind.
By the end of third grade, higher-income children are starting to learn extrapolation skills, whereas their low-income peers are on average still learning the less advanced skill of literal inference.

Footnote 29. Low-income status is defined as having family income less than 1.85 times the poverty line, representing the cutoff for eligibility in the federal free and reduced price lunch program. Our measure of low-income status therefore includes many students whose families live above the poverty line, but these families are considered low-income by the federal government.

TABLE 3.9: DIFFERENCES IN READING LEARNING RATES BY ECONOMIC STATUS, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| HIGHER-INCOME STUDENTS | | | | |
| Before Kindergarten | | | | 25.03 |
| During Kindergarten | 1.93 (0.0140) | 0.209 | 18.21 | 43.25 |
| Summer K-1st | -0.0636 (0.0623) | -0.00466 | -0.16 | 43.08 |
| During 1st Grade | 3.52 (0.0247) | 0.225 | 33.11 | 76.20 |
| After 1st Grade, into 3rd Grade | 1.61 (0.0083) | 0.103 | 38.79 | 114.98 |
| LOW-INCOME STUDENTS (DIFFERENCE FROM HIGHER-INCOME STUDENTS) | | | | |
| Before Kindergarten | | | | -5.37 |
| During Kindergarten | -0.277 (0.0209) | -0.0299 | -2.61 | -7.98 |
| Summer K-1st | -0.273 (0.0917) | -0.0200 | -0.70 | -8.68 |
| During 1st Grade | -0.541 (0.0375) | -0.0347 | -5.09 | -13.77 |
| After 1st Grade, into 3rd Grade | -0.0665 (0.0132) | -0.00427 | -1.60 | -15.37 |

Each estimate in bold is significantly different from the corresponding estimate for higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.9: DIFFERENCES IN READING LEARNING RATES BY ECONOMIC STATUS, ELEMENTARY SCHOOL
[Figure: "ECLS Reading Scores by Economic Status." Vertical axis: score (10 to 110), marked with proficiency levels 1-Letter Recognition through 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Higher-Income, Low-Income.]

Math

As with reading, low-income students begin kindergarten more than 5 points behind their peers in mathematics.
Figure 3.10 shows that the average higher-income student is learning to compare sizes at kindergarten entry. In contrast, low-income students are learning to compare sizes halfway through kindergarten, on average. The difference between the learning rates of low-income and higher-income students in kindergarten and first grade is about a quarter of a point per month. Although this difference may seem small, it accumulates over nine months of school, and compounds the initial deficit with which low-income students enter kindergarten.

The math learning gap between the economically advantaged and disadvantaged exists on the first day of school and persists through third grade, so that low-income students are nearly a grade behind throughout these grades. For example, higher-income students are on average learning the concept of place value in second grade, whereas low-income students are learning about place value in third grade.

TABLE 3.10: DIFFERENCES IN MATH LEARNING RATES BY ECONOMIC STATUS, ELEMENTARY SCHOOL

| Time Period | Gain Per Month | Effect Size Per Month | Gain Per Period | At End of Period |
| HIGHER-INCOME STUDENTS | | | | |
| Before Kindergarten | | | | 19.91 |
| During Kindergarten | 1.72 (0.0120) | 0.209 | 16.24 | 36.15 |
| Summer K-1st | 0.514 (0.0626) | 0.0441 | 1.32 | 37.48 |
| During 1st Grade | 2.48 (0.0221) | 0.201 | 23.33 | 60.81 |
| After 1st Grade, into 3rd Grade | 1.23 (0.0061) | 0.0999 | 29.68 | 90.49 |
| LOW-INCOME STUDENTS (DIFFERENCE FROM HIGHER-INCOME STUDENTS) | | | | |
| Before Kindergarten | | | | -5.50 |
| During Kindergarten | -0.248 (0.0180) | -0.0301 | -2.34 | -7.83 |
| Summer K-1st | -0.0677 (0.0881) | -0.00582 | -0.17 | -8.01 |
| During 1st Grade | -0.252 (0.0313) | -0.0204 | -2.37 | -10.38 |
| After 1st Grade, into 3rd Grade | -0.0805 (0.0100) | -0.00651 | -1.94 | -12.32 |

Each estimate in bold is significantly different from the corresponding estimate for higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.
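The point about a quarter-point monthly difference accumulating over the school year can be checked against Table 3.10. The sketch below assumes that "Gain Per Period" is "Gain Per Month" multiplied by the length of the period in months, so the period length can be backed out from the higher-income rows; that relationship and the function name `implied_months` are assumptions made for illustration.

```python
def implied_months(gain_per_period, gain_per_month):
    """Length of a period in months implied by the two gain columns."""
    return gain_per_period / gain_per_month

# Higher-income rows of Table 3.10
kindergarten_months = implied_months(16.24, 1.72)
first_grade_months = implied_months(23.33, 2.48)

# Low-income monthly differences accumulated over the implied period lengths
kindergarten_gap = round(-0.248 * kindergarten_months, 2)
first_grade_gap = round(-0.252 * first_grade_months, 2)

print(round(kindergarten_months, 1), round(first_grade_months, 1))
print(kindergarten_gap, first_grade_gap)
```

Under this assumption each school-year period is a little over nine months long, and the accumulated differences reproduce the "Gain Per Period" entries for low-income students in kindergarten and first grade (-2.34 and -2.37).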
FIGURE 3.10: DIFFERENCES IN MATH LEARNING RATES BY ECONOMIC STATUS—ELEMENTARY SCHOOL
[Figure: ECLS Math Scores by Economic Status. Vertical axis: Score, marked with proficiency levels 1-Count, Number, Shape; 2-Relative Size; 3-Ordinality, Sequence; 4-Add/Subtract; 5-Multiply/Divide; 6-Place Value; 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Higher-Income, Low-Income.]

Race and Income

Reading

Previous analyses with just income and just race indicated large disparities in average reading scores and growth rates. Combining the two highlights which subgroups experience the greatest challenges in learning. The analyses of learning differences by race/ethnicity and income status are presented in Table 3.11 and illustrated in Figure 3.11.

The reading gains made by white low-income children differ from those made by higher-income white children. White low-income children begin kindergarten with a 4.57-point disadvantage on the reading assessment. The difference in gains widens slightly in kindergarten and is widest in first grade. Thus the accumulated disadvantage grows to more than 10 points by the end of third grade.

Comparing the white low-income disadvantage and the black low-income disadvantage indicates the importance of race. Despite similar economic backgrounds, white low-income children are significantly better off than black low-income children, when both are compared to white higher-income children. In first through third grades, white low-income children have deficits in reading gain half the size of black low-income children’s deficits. White low-income children do not fall as far behind white higher-income children in learning reading as do black low-income children.30

Early in school, low-income children of black, Hispanic, and Asian ethnicities gain substantially less not only than white higher-income children but also than higher-income children of the same ethnicity.
Black low-income children start kindergarten with lower reading scores and gain significantly less in kindergarten than black higher-income children.31 In first through third grades, however, black children at both income strata learn at very similar rates.

30 This comparison was tested explicitly in the HLM program.

Hispanic higher-income students start kindergarten with a deficit on the reading assessment compared to white higher-income students. During elementary school, this deficit doubles in size. Although Hispanic higher-income children gain reading skills at a pace similar to that of white higher-income children in kindergarten and in second and third grades, they gain significantly less in first grade.32 This then leads to the large 8-point accumulated deficit by the end of third grade.

Hispanic low-income students start kindergarten with the most substantial deficit on the reading assessment, compared to white higher-income students. In each grade, the gap (the difference in achievement levels) between Hispanic low-income students and white higher-income students increases (reflecting lower average growth among Hispanic students). The difference in gains is largest in first grade, the year that many Hispanic students excluded for a lack of English proficiency take the reading assessment for the first time. In second and third grades, the difference in gains shrinks dramatically. However, the overall gap remains large. By the end of third grade, Hispanic low-income students’ original deficit almost triples in size and matches the magnitude of the accumulated deficit faced by black low-income students.

The gap between higher- and low-income Asian children grows by the most and ends up being the largest of the income gaps by race. Asian low-income children start kindergarten with a 4-point disadvantage compared to white higher-income children on the reading assessment and a 7-point gap compared to their higher-income Asian counterparts.
Asian higher-income children score the highest on the reading assessment in the fall of kindergarten and make larger reading gains during kindergarten. In first grade, however, Asian higher-income children gain less in reading than white higher-income children.32 In second and third grades, Asian higher-income students manifest the slowest reading gains of all the subgroups. Nevertheless, higher-income Asian students end up almost 10 points ahead of their low-income Asian counterparts but still behind white higher-income students.

Black higher-income children start kindergarten fewer than 2 points behind white higher-income children on the reading assessment. However, the deficit increases over the first four years of elementary school. By the end of third grade, black higher-income children trail white higher-income children by about 12 points (Table 3.11). Interestingly, black higher-income children score higher than white low-income children on the initial assessment, but by the end of third grade have fallen behind, due in part to a dramatically lower rate of learning in first grade and a slightly lower rate in second and third grades.

Figure 3.11a focuses on the higher-income children, with separate lines to represent the learning rate of each race/ethnicity. The reading gains made by Asian higher-income children outpace the rest of the higher-income children. However, in second and third grades, white students catch up to Asian students. Black higher-income and Hispanic higher-income children make slower progress than white students and Asian students of the same socioeconomic stratum. All the higher-income children are advancing towards learning extrapolation, with white students and Asian students making the quickest gain.

Figure 3.11b focuses on low-income children. Again, black and Hispanic students do not gain as quickly as white and Asian students.
But unlike in the higher-income cluster, among the low-income students, Asian students keep pace with white students. The difference in gain between Asian and white students is smaller among the low-income students than among the higher-income students. Among low-income students, white students not only catch up to Asian students in second and third grades, but also begin to make very slightly stronger gains. By the end of third grade, Asian and white low-income children are advancing towards learning extrapolation (like most of the higher-income students).

31 This comparison was tested explicitly in the HLM program.
32 Again, this is partially attributable to the introduction of students who passed the OLDS English-language screening test in first grade and may exhibit weaker than average reading skills.

TABLE 3.11: DIFFERENCES IN READING LEARNING RATES BY RACE AND INCOME—ELEMENTARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

WHITE HIGHER-INCOME STUDENTS
Before Kindergarten                                                           25.51
During Kindergarten               1.95 (0.0163)      0.210         18.32      43.83
Summer K-1st                      -0.159 (0.0723)    -0.0116       -0.41      43.42
During 1st Grade                  3.64 (0.0290)      0.233         34.25      77.67
After 1st Grade, into 3rd Grade   1.64 (0.0095)      0.0774        39.47      117.15

WHITE LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -4.57
During Kindergarten               -0.169 (0.0296)    -0.0183       -1.59      -6.17
Summer K-1st                      -0.242 (0.1300)    -0.0177       -0.62      -6.79
During 1st Grade                  -0.384 (0.0575)    -0.0246       -3.61      -10.41
After 1st Grade, into 3rd Grade   -0.0202 (0.0193)   -0.000952     -0.49      -10.89

BLACK HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -1.62
During Kindergarten               -0.164 (0.0515)    -0.0177       -1.54      -3.17
Summer K-1st                      0.337 (0.2310)     0.0247        0.87       -2.30
During 1st Grade                  -0.626 (0.0877)    -0.0401       -5.89      -8.19
After 1st Grade, into 3rd Grade   -0.171 (0.0377)    -0.00806      -4.11      -12.30

BLACK LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -5.69
During Kindergarten               -0.405 (0.0321)    -0.0437       -3.81      -9.50
Summer K-1st                      -0.311 (0.1390)    -0.0228       -0.80      -10.30
During 1st Grade                  -0.699 (0.0591)    -0.0448       -6.58      -16.88
After 1st Grade, into 3rd Grade   -0.193 (0.0219)    -0.00910      -4.64      -21.52

HISPANIC HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -3.38
During Kindergarten               -0.0322 (0.0460)   -0.00348      -0.30      -3.69
Summer K-1st                      0.280 (0.2080)     0.0205        0.72       -2.96
During 1st Grade                  -0.394 (0.0782)    -0.0252       -3.71      -6.67
After 1st Grade, into 3rd Grade   -0.0518 (0.0267)   -0.00244      -1.25      -7.91

HISPANIC LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -7.87
During Kindergarten               -0.337 (0.0381)    -0.0364       -3.18      -11.05
Summer K-1st                      0.133 (0.1660)     0.00977       0.34       -10.70
During 1st Grade                  -1.00 (0.0587)     -0.0643       -9.44      -20.14
After 1st Grade, into 3rd Grade   -0.0621 (0.0220)   -0.00293      -1.49      -21.64

ASIAN HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           3.16
During Kindergarten               0.342 (0.0801)     0.0369        3.22       6.38
Summer K-1st                      0.531 (0.3880)     0.0389        1.37       7.74
During 1st Grade                  -0.211 (0.1290)    -0.0135       -1.99      5.76
After 1st Grade, into 3rd Grade   -0.281 (0.0366)    -0.0133       -6.77      -1.01

ASIAN LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -4.72
During Kindergarten               -0.00597 (0.0799)  -0.000644     -0.06      -4.77
Summer K-1st                      0.628 (0.4420)     0.0460        1.62       -3.15
During 1st Grade                  -0.657 (0.1430)    -0.0422       -6.19      -9.34
After 1st Grade, into 3rd Grade   -0.208 (0.0397)    -0.00981      -5.00      -14.34

Each estimate in bold is significantly different from the corresponding estimate for white higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.
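Because every subgroup block in Table 3.11 is expressed as a difference from white higher-income students, it can help to see how the columns relate to one another. The sketch below is a hedged illustration, not the authors' procedure: it treats the At End of Period column as the running total of the starting value and the Gain Per Period entries, and it recovers an absolute score level for a subgroup by adding its difference to the reference level. The names `running_totals` and `subgroup_levels` are hypothetical, and the absolute levels are derived here by arithmetic rather than reported in the table.

```python
# Values transcribed from Table 3.11 (reading, by race and income).
WHITE_HIGHER = {"initial": 25.51, "gains": [18.32, -0.41, 34.25, 39.47]}
# The two blocks below are differences from white higher-income students.
BLACK_LOW_DIFF = {"initial": -5.69, "gains": [-3.81, -0.80, -6.58, -4.64]}
HISPANIC_LOW_DIFF = {"initial": -7.87, "gains": [-3.18, 0.34, -9.44, -1.49]}


def running_totals(initial, gains):
    """Cumulative value at the end of each period, rounded to 2 decimals."""
    totals = []
    level = initial
    for gain in gains:
        level += gain
        totals.append(round(level, 2))
    return totals


def subgroup_levels(reference, difference):
    """Derived absolute levels: reference level plus subgroup difference."""
    ref = running_totals(**reference)
    diff = running_totals(**difference)
    return [round(r + d, 2) for r, d in zip(ref, diff)]


print(running_totals(**WHITE_HIGHER))
print(running_totals(**BLACK_LOW_DIFF))
print(subgroup_levels(WHITE_HIGHER, BLACK_LOW_DIFF))
```

The running totals agree with the published At End of Period column to within about 0.01 point, a discrepancy that is consistent with the table entries having been rounded independently.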
FIGURE 3.11A: READING LEARNING RATES BY RACE (HIGHER INCOME)—ELEMENTARY SCHOOL
[Figure: ECLS Reading Scores by Race, Higher Income. Vertical axis: Score, marked with proficiency levels 1-Letter Recognition, 2-Beginning Sounds, 3-Ending Sounds, 4-Sight Words, 5-Word in Context, 6-Literal Inference, 7-Extrapolation. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian, Hispanic, Black, White.]

FIGURE 3.11B: READING LEARNING RATES BY RACE (LOW-INCOME)—ELEMENTARY SCHOOL
[Figure: ECLS Reading Scores by Race, Low Income. Same axes, proficiency levels, and series as Figure 3.11A.]

Math

Results from math models that combine race/ethnicity and income show how being low-income and belonging to a minority group is associated with compounded disadvantages in learning. Earlier, we showed that low-income students gain about two or three hundredths of a standard deviation less per month in math than higher-income students. Table 3.4 showed that black students gain less than white students throughout these early grades, though the difference shrinks with time (-0.04 SD in kindergarten, -0.03 SD in first grade, and -0.01 SD in second and third grades). Hispanic students show even less of a deficit relative to white students (-0.03 SD in kindergarten, -0.02 SD in first grade, and -0.001 SD in second and third grades).

Breaking these findings out by race and income shows that black and Hispanic low-income students are at an even greater disadvantage. Black low-income and Hispanic low-income children begin kindergarten with the lowest scores on the math assessment and make the slowest gain in kindergarten. Each of the learning rate differences discussed in the previous paragraph increases for low-income children. Black low-income children gain 0.05 SD less per month than white higher-income children during kindergarten, 0.04 SD less in first grade, and 0.01 SD less in second and third grades.
Hispanic low-income children experience a similar compounding effect in their gap relative to white higher-income children. Hispanic low-income children’s learning rate is lower than that of white higher-income students by 0.05 SD per month in kindergarten, 0.03 SD per month in first grade, and 0.004 SD per month in second and third grades.

In the fall of kindergarten, black higher-income children’s math scores look like those of white low-income children. But their achievement diverges over time. Black higher-income students make less gain in math than white low-income students during every time period. By the end of third grade, black higher-income children are behind white children of both income strata.

At the start of kindergarten, Hispanic higher-income students earn lower math scores than white and black higher-income students, but face less of a deficit than black students throughout the primary school years. Hispanic children from higher-income families finish third grade with cumulative average scores higher than black higher-income children and their Hispanic low-income peers. However, these scores are nearly 7 points behind those of white higher-income children.

Asian higher-income children start school with an average math score more than a point higher than white higher-income children. The two groups learn at nearly the same rate during kindergarten. But Asian students gain points on the math assessment at a slower pace in first grade and a faster pace in second and third grades, relative to white students. By the end of third grade, Asian higher-income children are 1.67 points ahead of white higher-income children.

Asian low-income children start with a lower average math score than white higher-income children. During kindergarten, they make less gain in math than white higher-income students. In first grade, they fall further behind, but in second and third grades they gain faster than white higher-income students.
In second and third grades, Asian children, of both income categories, are the only students making greater math gains on average than white higher-income children. Thus Asian higher-income students retain their original advantage on the math assessment by the end of third grade. The finding that Asian students learn less in math during first grade than white students but later learn more than white students is not perplexing when taken in context of the change in sample. Many Asian students who were excluded from the first rounds of data collection for lack of English proficiency were included in the first grade data collection. These students may demonstrate below-average math skills and push down the gain estimated during that period (see Appendix A for more on sample selection). Figures 3.12a and 3.12b illustrate the learning rates in math for higher-income and low-income children respectively. Figure 3.12a shows that black students, despite their higher-income background, have a distinctly slower learning rate than the Asian, white, and Hispanic higher-income groups. Figure 3.12b suggests a similarity between the learning rates of black and Hispanic low-income children; both are behind Asian and white low-income children’s learning rates. Figures 3.12a and 3.12b illustrate that Asian and white students learn at nearly the same rate, regardless of income status. Higher-income students are already learning place value (see Figure 3.12a), whereas the low-income black and Hispanic students are not yet there. 
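The sample-change explanation offered above for the first-grade Asian math estimate is a composition effect, and a toy calculation can make the mechanism concrete. Everything in the sketch below is hypothetical: the scores are invented for illustration and are not taken from the report, and the comparison of simple group means is a deliberate simplification rather than the hierarchical model the report estimates. The point is only that adding lower-scoring students to the later round can lower the apparent average gain even when every student who was tested twice gained the same amount.

```python
from statistics import mean

# HYPOTHETICAL scores, invented purely for illustration (not report data).
fall_original = [30.0, 32.0, 34.0]   # students assessed in both rounds
TRUE_GAIN = 20.0                     # assumed identical gain for each of them
spring_original = [score + TRUE_GAIN for score in fall_original]
spring_newcomers = [40.0, 42.0]      # students first assessed in the later round


def apparent_gain(before, after):
    """Difference between the mean of the later and the earlier round."""
    return mean(after) - mean(before)


gain_same_sample = apparent_gain(fall_original, spring_original)
gain_changed_sample = apparent_gain(
    fall_original, spring_original + spring_newcomers
)

print(gain_same_sample, gain_changed_sample)
```

With these invented numbers, the apparent gain falls below the 20 points that each continuing student actually gained once the newcomers are included, which is the direction of bias the text describes.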
TABLE 3.12: DIFFERENCES IN MATH LEARNING RATES BY RACE AND INCOME—ELEMENTARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

WHITE HIGHER-INCOME STUDENTS
Before Kindergarten                                                           20.76
During Kindergarten               1.77 (0.0142)      0.215         16.64      37.39
Summer K-1st                      0.524 (0.0766)     0.0450        1.35       38.74
During 1st Grade                  2.54 (0.0273)      0.206         23.93      62.68
After 1st Grade, into 3rd Grade   1.24 (0.0071)      0.0791        29.77      92.45

WHITE LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -4.35
During Kindergarten               -0.138 (0.0272)    -0.0167       -1.30      -5.64
Summer K-1st                      -0.177 (0.1290)    -0.0152       -0.46      -6.10
During 1st Grade                  -0.176 (0.0477)    -0.0143       -1.66      -7.76
After 1st Grade, into 3rd Grade   -0.0412 (0.0147)   -0.00263      -0.99      -8.75

BLACK HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -3.67
During Kindergarten               -0.303 (0.0409)    -0.0368       -2.86      -6.53
Summer K-1st                      -0.0632 (0.2010)   -0.00543      -0.16      -6.69
During 1st Grade                  -0.384 (0.0692)    -0.0310       -3.61      -10.30
After 1st Grade, into 3rd Grade   -0.0942 (0.0260)   -0.00602      -2.27      -12.57

BLACK LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -6.82
During Kindergarten               -0.443 (0.0278)    -0.0538       -4.17      -10.99
Summer K-1st                      -0.187 (0.1440)    -0.0160       -0.48      -11.47
During 1st Grade                  -0.445 (0.0514)    -0.0360       -4.19      -15.66
After 1st Grade, into 3rd Grade   -0.185 (0.0167)    -0.0118       -4.45      -20.11

HISPANIC HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -4.18
During Kindergarten               -0.108 (0.0360)    -0.0131       -1.02      -5.20
Summer K-1st                      -0.198 (0.1770)    -0.0170       -0.51      -5.71
During 1st Grade                  -0.129 (0.0628)    -0.0104       -1.21      -6.92
After 1st Grade, into 3rd Grade   0.00648 (0.0187)   0.000414      0.16       -6.77

HISPANIC LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -8.25
During Kindergarten               -0.388 (0.0278)    -0.0472       -3.66      -11.91
Summer K-1st                      0.226 (0.1460)     0.0194        0.58       -11.33
During 1st Grade                  -0.343 (0.0487)    -0.0277       -3.23      -14.55
After 1st Grade, into 3rd Grade   -0.0628 (0.0165)   -0.00401      -1.51      -16.06

ASIAN HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           1.64
During Kindergarten               -0.00469 (0.0611)  -0.000569     -0.04      1.59
Summer K-1st                      0.212 (0.3500)     0.0182        0.55       2.14
During 1st Grade                  -0.281 (0.1010)    -0.0227       -2.64      -0.51
After 1st Grade, into 3rd Grade   0.0906 (0.0276)    0.00579       2.18       1.67

ASIAN LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before Kindergarten                                                           -4.55
During Kindergarten               -0.144 (0.0687)    -0.0175       -1.36      -5.91
Summer K-1st                      0.754 (0.3970)     0.0648        1.94       -3.97
During 1st Grade                  -0.617 (0.1230)    -0.0499       -5.81      -9.78
After 1st Grade, into 3rd Grade   0.0745 (0.0302)    0.00476       1.79       -7.99

Each estimate in bold is significantly different from the corresponding estimate for white higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.12A: MATH LEARNING RATES BY RACE (HIGHER INCOME)—ELEMENTARY SCHOOL
[Figure: ECLS Math Scores by Race, Higher Income. Vertical axis: Score, marked with proficiency levels 1-Count, Number, Shape; 2-Relative Size; 3-Ordinality, Sequence; 4-Add/Subtract; 5-Multiply/Divide; 6-Place Value; 7-Rate & Measurement. Horizontal axis: Kindergarten, 1st Grade, 2nd and 3rd Grades. Series: Asian, Hispanic, Black, White.]

FIGURE 3.12B: MATH LEARNING RATES BY RACE (LOW-INCOME)—ELEMENTARY SCHOOL
[Figure: ECLS Math Scores by Race, Low Income. Same axes, proficiency levels, and series as Figure 3.12A.]

Summary

Differences in learning by gender, race/ethnicity, and economic status have already begun when children enter formal schooling in kindergarten and tend to continue or even grow during elementary school.
Race/ethnicity is a substantial factor in these learning rate differences. Black and Hispanic students are mostly lagging behind white and Asian students, with black students consistently worst off. In reading and math, low-income children start school with substantially lower test scores than their more advantaged peers, and learn at a slightly lower rate in each grade. The learning deficit decreases in later grades (that is, low-income students learn at a lower rate throughout, but their learning rate is closer to that of their higher-income peers in second and third grades), but the achievement gap grows throughout the early school years.

The analyses interacting race/ethnicity with income suggest the relative importance of race in reading and math gaps in achievement and learning. Black and Hispanic children, whether in higher-income or low-income families, gain consistently less in reading and math than Asian and white peers of similar economic background. The analyses that explore the interaction of race/ethnicity and language status suggest a similar conclusion. Black children start kindergarten with lower scores and improve at a slower rate, on average, than either Asian or Hispanic students in non-English homes.

Secondary School

Gender

Reading

In high school, females hold a significant initial advantage in reading achievement. Findings presented in Table 3.13 show that female students score nearly 2 points higher in eighth grade than their male peers. Females make reading gains similar to those of males early in high school and slightly higher between tenth and twelfth grades, leaving them with about the same point advantage at the end of high school as they had at the beginning. Both groups are, on average, moving from learning simple to more complex inferences.
TABLE 3.13: DIFFERENCES IN READING LEARNING RATES BY GENDER—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

MALE STUDENTS
Before High School                                                            27.30
8th Grade to 10th Grade           0.156 (0.0049)     0.0204        3.74       31.04
10th Grade to 12th Grade          0.0809 (0.0066)    0.00876       1.94       32.98

FEMALE STUDENTS (DIFFERENCE FROM MALE STUDENTS)
Before High School                                                            1.87
8th Grade to 10th Grade           -0.00670 (0.0066)  -0.000875     -0.16      1.71
10th Grade to 12th Grade          0.0190 (0.0083)    0.00206       0.46       2.17

Each estimate in bold is significantly different from the corresponding estimate for male students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.13: DIFFERENCES IN READING LEARNING RATES BY GENDER—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Gender. Vertical axis: Score, marked with proficiency levels 2-Simple Inferences and 3-Complex Inferences. Horizontal axis: 8-10, 10-12. Series: Female, Male.]

Math

Perhaps not surprisingly, given the elementary school student results presented earlier in Table 3.2, females start high school with slightly lower math achievement than their male peers (a difference of about a half point). These differences, presented in Table 3.14, do not increase significantly between eighth and tenth grades but do increase between tenth and twelfth grades, when female students gain about 1 point less over the period on the math test than male students (1.06 points in Table 3.14). This gap in gain represents about 0.01 of a standard deviation on the tenth-grade test. By the end of high school, the initial male advantage on the math assessment increases to an advantage of almost 2 points.
TABLE 3.14: DIFFERENCES IN MATH LEARNING RATES BY GENDER—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

MALE STUDENTS
Before High School                                                            38.44
8th Grade to 10th Grade           0.330 (0.0058)     0.0296        7.92       46.36
10th Grade to 12th Grade          0.200 (0.0057)     0.0152        4.81       51.17

FEMALE STUDENTS (DIFFERENCE FROM MALE STUDENTS)
Before High School                                                            -0.56
8th Grade to 10th Grade           -0.0117 (0.0076)   -0.00105      -0.28      -0.84
10th Grade to 12th Grade          -0.0442 (0.0076)   -0.00336      -1.06      -1.90

Each estimate in bold is significantly different from the corresponding estimate for male students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.14: DIFFERENCES IN MATH LEARNING RATES BY GENDER—SECONDARY SCHOOL
[Figure: NELS Math Scores by Gender. Vertical axis: Score, marked with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, 4-Intermediate Level Math. Horizontal axis: 8-10, 10-12. Series: Female, Male.]

Race/Ethnicity

Reading

White students hold an initial advantage in reading achievement over black and Hispanic students (see Table 3.15) but not Asian students. Black students score 5.49 points lower than white students and Hispanic students score 4.83 points lower than white students. These differences in initial status are compounded by differences in reading gains made during high school.

There are statistically significant race/ethnicity differences in reading gains during high school. Between eighth and tenth grades, white students gain very slightly more than black students and Hispanic students but less than Asian students. Between tenth and twelfth grades, white students gain at a slightly faster rate than black students but at a slower rate than Hispanic students and Asian students. By the end of high school, black students and Hispanic students are learning simple inference and abstract points, which white students and Asian students, on average, learned in eighth grade.
TABLE 3.15: DIFFERENCES IN READING LEARNING RATES BY RACE/ETHNICITY—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

WHITE STUDENTS
Before High School                                                            29.35
8th Grade to 10th Grade           0.157 (0.0038)     0.0205        3.76       33.10
10th Grade to 12th Grade          0.0885 (0.0049)    0.00959       2.12       35.23

BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            -5.49
8th Grade to 10th Grade           -0.0360 (0.0114)   -0.00471      -0.86      -6.36
10th Grade to 12th Grade          -0.0129 (0.0140)   -0.00140      -0.31      -6.67

HISPANIC STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            -4.83
8th Grade to 10th Grade           -0.0122 (0.0114)   -0.00160      -0.29      -5.12
10th Grade to 12th Grade          0.0191 (0.0133)    0.00207       0.46       -4.66

ASIAN STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            -0.26
8th Grade to 10th Grade           0.0231 (0.0127)    0.00302       0.55       0.29
10th Grade to 12th Grade          0.0477 (0.0190)    0.00516       1.14       1.43

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.15: DIFFERENCES IN READING LEARNING RATES BY RACE/ETHNICITY—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Race. Vertical axis: Score, marked with proficiency levels 2-Simple Inferences and 3-Complex Inferences. Horizontal axis: 8-10, 10-12. Series: Asian, Hispanic, Black, White.]

Math

Table 3.16 shows the significant race/ethnic differences in math achievement at the start of high school. In eighth grade, white students have an initial advantage over black and Hispanic students. However, Asian students have an initial 2.71-point advantage over white students and keep pace with white students throughout high school.

The deficits increase early in high school. Between eighth and tenth grades, black students and Hispanic students make slower gains in math than white students, with black students falling farthest behind. Asian students gain 0.78 more points than white students during this time period (Table 3.16).
Some of these differences in gains persist later in high school. Between tenth and twelfth grades, white students gain more than black students, and Asian students gain more than white students. There are no significant differences in math gains between white students and Hispanic students.

By the end of high school, gaps between groups increase slightly. For example, the initial 9-point advantage of white students over black students increases by about a point, and the initial advantage of Asian students over white students also increases by about a point. These changes translate into wide gaps in skills. By the end of high school, Asian students are beginning to learn intermediate-level math concepts, whereas black and Hispanic students are far behind, learning fractions and decimals, math concepts that the white and Asian students learned by the start of eighth grade.

These gaps can also be compared to the gender gaps. Black and Hispanic students end twelfth grade with scores 11 and 7 points behind those of white students, whereas the male-female difference in math scores is only around 2 points.
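The comparison between race/ethnicity gaps and the gender gap can be made concrete with the table values. The sketch below is an illustration only: it uses the Before High School and end-of-twelfth-grade differences reported in Tables 3.14 and 3.16, and the helper names `gap_change` and `ratio_to_gender_gap` are hypothetical.

```python
# Math score differences in points, transcribed from Tables 3.14 and 3.16.
# Each pair: (difference before high school, difference at end of 12th grade),
# measured relative to white students.
RACE_GAPS = {
    "Black": (-9.13, -10.74),
    "Hispanic": (-7.00, -7.39),
    "Asian": (2.71, 4.20),
}
GENDER_GAP_END = -1.90  # female minus male at end of 12th grade (Table 3.14)


def gap_change(start, end):
    """Change in the gap between eighth and twelfth grade, in points."""
    return round(end - start, 2)


def ratio_to_gender_gap(end, gender_gap=GENDER_GAP_END):
    """Size of a gap relative to the gender gap, ignoring sign."""
    return round(abs(end) / abs(gender_gap), 1)


for group, (start, end) in RACE_GAPS.items():
    print(group, gap_change(start, end), ratio_to_gender_gap(end))
```

On these figures the black-white and Hispanic-white gaps at the end of twelfth grade are several times the size of the male-female gap, which is the contrast the paragraph above draws.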
TABLE 3.16: DIFFERENCES IN MATH LEARNING RATES BY RACE/ETHNICITY—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

WHITE STUDENTS
Before High School                                                            39.75
8th Grade to 10th Grade           0.333 (0.0041)     0.0299        7.99       47.75
10th Grade to 12th Grade          0.176 (0.0045)     0.0134        4.23       51.97

BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            -9.13
8th Grade to 10th Grade           -0.0625 (0.0168)   -0.00560      -1.50      -10.63
10th Grade to 12th Grade          -0.00472 (0.0124)  -0.000359     -0.11      -10.74

HISPANIC STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            -7.00
8th Grade to 10th Grade           -0.0329 (0.0121)   -0.00295      -0.79      -7.79
10th Grade to 12th Grade          0.0168 (0.0128)    0.00128       0.40       -7.39

ASIAN STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                            2.71
8th Grade to 10th Grade           0.0326 (0.0174)    0.00292       0.78       3.49
10th Grade to 12th Grade          0.0299 (0.0190)    0.00228       0.72       4.20

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.16: DIFFERENCES IN MATH LEARNING RATES BY RACE/ETHNICITY—SECONDARY SCHOOL
[Figure: NELS Math Scores by Race. Vertical axis: Score, marked with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, 4-Intermediate Level Math. Horizontal axis: 8-10, 10-12. Series: Asian, Hispanic, Black, White.]

Language Status

Reading

Table 3.17 shows that students from homes where English is not the primary language (NEH) hold an initial disadvantage in reading achievement compared to students from homes where English is the primary language (EH). NEH students start high school with an average eighth grade reading score 3.96 points lower than EH students. There is no significant difference between NEH and EH students’ learning between eighth and tenth grades. However, between tenth and twelfth grades, NEH students actually gain about a point more than EH students (0.004 SD per month).
Figure 3.17 illustrates the slight narrowing of the learning rate difference over tenth to twelfth grades. Similarly, in elementary school, the NEH disadvantage in learning gains diminishes after first grade (Table 3.5), though NEH students still have slower growth rates in second and third grades. Interestingly, the effect size per month gain for NEH students between tenth and twelfth grades is equal in magnitude to the effect size per month loss for NEH students in second and third grades (see Table 3.5).

TABLE 3.17: DIFFERENCES IN READING LEARNING RATES BY LANGUAGE STATUS—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

ENGLISH SPEAKING HOME STUDENTS
Before High School                                                            28.60
8th Grade to 10th Grade           0.152 (0.0035)     0.0199        3.65       32.25
10th Grade to 12th Grade          0.0873 (0.0044)    0.00945       2.09       34.34

NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM ENGLISH SPEAKING HOME STUDENTS)
Before High School                                                            -3.96
8th Grade to 10th Grade           0.00368 (0.0106)   0.000481      0.09       -3.87
10th Grade to 12th Grade          0.0406 (0.0132)    0.00440       0.97       -2.90

Each estimate in bold is significantly different from the corresponding estimate for EH students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.17: DIFFERENCES IN READING LEARNING RATES BY LANGUAGE STATUS—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Language Status. Vertical axis: Score, marked with proficiency levels 2-Simple Inferences and 3-Complex Inferences. Horizontal axis: 8-10, 10-12. Series: EH, Non-EH.]

Math

Table 3.18 shows that NEH students start high school with a significant disadvantage in math achievement compared to EH students. NEH students score 4.01 points lower than EH students at the end of eighth grade. This difference does not grow early in high school, when NEH students make the same gain as EH students. But as in reading, this pattern reverses between tenth and twelfth grades, when NEH students gain significantly more than EH students (0.75 points).
By the end of high school, both groups are moving past simple problem solving and beginning to learn more advanced math concepts. In elementary school, we also see that the initial advantage in math learning gains of students from EH households observed in kindergarten and first grade is not observed during second and third grades. This suggests that the longer the NEH students remain in school, the more their learning rates approach those of EH students.

TABLE 3.18: DIFFERENCES IN MATH LEARNING RATES BY LANGUAGE STATUS—SECONDARY SCHOOL

Time Period                       Gain Per Month     Effect Size   Gain Per   At End of
                                                     Per Month     Period     Period

ENGLISH SPEAKING HOME STUDENTS
Before High School                                                            38.52
8th Grade to 10th Grade           0.325 (0.0040)     0.0292        7.80       46.32
10th Grade to 12th Grade          0.176 (0.0040)     0.0133        4.21       50.53

NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM ENGLISH SPEAKING HOME STUDENTS)
Before High School                                                            -4.01
8th Grade to 10th Grade           -0.0103 (0.0131)   -0.000926     -0.25      -4.26
10th Grade to 12th Grade          0.0314 (0.0129)    0.00238       0.75       -3.51

Each estimate in bold is significantly different from the corresponding estimate for EH students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.18: DIFFERENCES IN MATH LEARNING RATES BY LANGUAGE STATUS—SECONDARY SCHOOL
[Figure: NELS Math Scores by Language Status. Vertical axis: Score, marked with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, 4-Intermediate Level Math. Horizontal axis: 8-10, 10-12. Series: EH, Non-EH.]

Race and Language

Reading

As seen in Table 3.15, there are significant differences in average reading achievement between white students and racial/ethnic minority students in the spring of eighth grade. In Table 3.19, we split these findings by language status. Black students start high school with less of a deficit than Hispanic NEH students. But because black students make less gain than Hispanic students, regardless of language status, black students end high school with the deepest deficit on the reading assessment.
Hispanic students, regardless of EH status, start high school with a lower average reading score than white students, but the deficit is twice as great for Hispanic NEH students as for Hispanic EH students. Hispanic students of both NEH and EH subgroups gain points at a rate similar to that of white students throughout high school. As seen in an earlier table (Table 3.15), Asian students do not differ from white students in average reading achievement in the spring of eighth grade but gain more than white students during high school. Breaking out the Asian subgroup by language background shows some surprising findings. Asian EH students keep pace with white students throughout high school. From tenth to twelfth grade, Asian NEH students make greater gains in reading than white students (2.18 points more). This runs counter to the elementary findings, where the later elementary grades showed slower reading gains for Asian students. Here the later secondary grades prove more successful for Asian students.
TABLE 3.19: DIFFERENCES IN READING LEARNING RATES BY RACE AND LANGUAGE—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
WHITE STUDENTS
Before High School                                                                    29.35
8th Grade to 10th Grade   0.157 (0.0038)      0.0205                 3.76             33.11
10th Grade to 12th Grade  0.0884 (0.0049)     0.00957                2.12             35.23
BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -5.47
8th Grade to 10th Grade   -0.0356 (0.0115)    -0.00465               -0.85            -6.32
10th Grade to 12th Grade  -0.0129 (0.0140)    -0.00139               -0.31            -6.63
HISPANIC ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -3.17
8th Grade to 10th Grade   -0.0184 (0.0150)    -0.00240               -0.44            -2.81
10th Grade to 12th Grade  0.00231 (0.0174)    0.000250               0.06             -2.39
HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -6.21
8th Grade to 10th Grade   -0.00517 (0.0161)   -0.000676              -0.12            -5.83
10th Grade to 12th Grade  0.0324 (0.0185)     0.00350                0.78             -5.38
ASIAN ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    0.36
8th Grade to 10th Grade   0.0221 (0.0188)     0.00288                0.53             0.81
10th Grade to 12th Grade  0.0372 (0.0233)     0.00402                0.89             1.37
ASIAN NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -1.20
8th Grade to 10th Grade   0.0288 (0.0168)     0.00376                0.69             -0.79
10th Grade to 12th Grade  0.0911 (0.0183)     0.00986                2.18             -0.35

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.
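The relationship between the "Gain Per Month" and "Effect Size Per Month" columns can be checked with a short calculation. The sketch below assumes, following the report's description of standardized learning rates, that each effect size is the monthly gain divided by a base-period standard deviation, and backs out the implied standard deviation from the white and black rows of Table 3.19. The names `TABLE_3_19`, `implied_sd`, and `IMPLIED` are illustrative and do not come from the report.

```python
# Back out the standard deviation implied by Table 3.19, under the
# ASSUMPTION that effect size per month = gain per month / base-period SD.

# (gain per month, effect size per month), copied from Table 3.19.
TABLE_3_19 = {
    ("white", "8-10"): (0.157, 0.0205),
    ("white", "10-12"): (0.0884, 0.00957),
    ("black - white", "8-10"): (-0.0356, -0.00465),
    ("black - white", "10-12"): (-0.0129, -0.00139),
}


def implied_sd(gain_per_month: float, effect_size_per_month: float) -> float:
    """Return the SD that converts the monthly gain into the reported effect size."""
    return gain_per_month / effect_size_per_month


IMPLIED = {key: implied_sd(*values) for key, values in TABLE_3_19.items()}

for (group, period), sd in IMPLIED.items():
    print(f"{group:>14} {period:>6}: implied SD = {sd:.2f}")
```

Under this assumption, the implied standard deviation is about 7.7 points for the eighth-to-tenth-grade period and about 9.2 to 9.3 points for the tenth-to-twelfth-grade period, and the black-white difference rows appear to be scaled by the same standard deviation as the white reference rows.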
FIGURE 3.19: DIFFERENCES IN READING LEARNING RATES BY RACE AND LANGUAGE—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Race and Language. Vertical axis: score (20 to 45), with proficiency levels 2-Simple Inferences and 3-Complex Inferences marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian EH, Asian Non-EH, Black, Hispanic EH, Hispanic Non-EH, White.]

Math

Results presented earlier in Table 3.16 show that black and Hispanic students have lower initial math achievement at the end of eighth grade and slower math gains throughout high school than white students, though Hispanic students learn at a similar rate between tenth and twelfth grades. Table 3.20 teases apart these race/ethnicity findings by language status. Compared to white students, Hispanic NEH students start high school with a greater math deficit than Hispanic EH students. Throughout high school, Hispanic students, both NEH and EH, make math gains similar to those of white students. This differs from the findings presented in Table 3.16, in which Hispanic students overall learn very slightly but significantly less than white students between eighth and tenth grades. The difference likely arises because the estimates for Hispanic students by language status rest on smaller samples and so are less precise than the overall results for Hispanic students.

In general, Asian students have higher initial math achievement than white students and, during high school, make gains similar to those of white students (see Table 3.16). These patterns hold for Asian students regardless of language status, with one exception. Asian NEH students gain math points slightly but significantly faster than white students between tenth and twelfth grades (0.072 points per month, or 0.005 SD). This provides further support for the general theme that once NEH students have been in school for a few years, their learning rates increase.
TABLE 3.20: DIFFERENCES IN MATH LEARNING RATES BY RACE AND LANGUAGE—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
WHITE STUDENTS
Before High School                                                                    39.76
8th Grade to 10th Grade   0.333 (0.0041)      0.0299                 8.00             47.75
10th Grade to 12th Grade  0.176 (0.0045)      0.0134                 4.22             51.98
BLACK STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -9.11
8th Grade to 10th Grade   -0.0623 (0.0168)    -0.00559               -1.50            -10.61
10th Grade to 12th Grade  -0.00469 (0.0125)   -0.000356              -0.11            -10.72
HISPANIC ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -5.17
8th Grade to 10th Grade   -0.0332 (0.0151)    -0.00298               -0.80            -5.97
10th Grade to 12th Grade  0.00158 (0.0154)    0.000120               0.04             -5.93
HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    -8.50
8th Grade to 10th Grade   -0.0317 (0.0177)    -0.00284               -0.76            -9.26
10th Grade to 12th Grade  0.0282 (0.0184)     0.00214                0.68             -8.58
ASIAN ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    2.82
8th Grade to 10th Grade   0.0245 (0.0175)     0.00220                0.59             3.41
10th Grade to 12th Grade  0.00612 (0.0228)    0.000465               0.15             3.56
ASIAN NON-ENGLISH SPEAKING HOME STUDENTS (DIFFERENCE FROM WHITE STUDENTS)
Before High School                                                                    2.97
8th Grade to 10th Grade   0.0309 (0.0300)     0.00277                0.74             3.71
10th Grade to 12th Grade  0.0719 (0.0216)     0.00547                1.72             5.44

Each estimate in bold is significantly different from the corresponding estimate for white students at the 5 percent level. Descriptions of models are provided in Appendix A.
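The "At End of Period" column in Table 3.20 can be read as a running total: the gap before high school plus the gain differences accumulated in each period. The following sketch reproduces that arithmetic for the group-difference rows; the names `GAPS_TABLE_3_20`, `gap_trajectory`, and `TRAJECTORIES` are hypothetical, and small discrepancies from the printed table presumably reflect rounding in the published values.

```python
from itertools import accumulate

# (gap before high school, gain difference 8th-10th, gain difference 10th-12th),
# all relative to white students, copied from Table 3.20 (math).
GAPS_TABLE_3_20 = {
    "Black": (-9.11, -1.50, -0.11),
    "Hispanic EH": (-5.17, -0.80, 0.04),
    "Hispanic NEH": (-8.50, -0.76, 0.68),
    "Asian EH": (2.82, 0.59, 0.15),
    "Asian NEH": (2.97, 0.74, 1.72),
}


def gap_trajectory(initial: float, *period_gains: float) -> list[float]:
    """Running gap: the initial gap, then the gap after each successive period."""
    return [round(total, 2) for total in accumulate((initial, *period_gains))]


TRAJECTORIES = {
    group: gap_trajectory(*values) for group, values in GAPS_TABLE_3_20.items()
}

for group, path in TRAJECTORIES.items():
    print(f"{group:>12}: {path}")
```

For black students, for example, the running totals are -9.11, -10.61, and -10.72, matching the table. For Asian NEH students the final total computes to 5.43 against a printed 5.44, a difference of the size one would expect from rounding.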
FIGURE 3.20: DIFFERENCES IN MATH LEARNING RATES BY RACE AND LANGUAGE—SECONDARY SCHOOL
[Figure: NELS Math Scores by Race and Language. Vertical axis: score (20 to 60), with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, and 4-Intermediate Level Math marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian EH, Asian Non-EH, Black, Hispanic EH, Hispanic Non-EH, White.]

Economic Status

Reading

In the spring of eighth grade, the average reading achievement for students from low-income families (below 185 percent of the poverty line) is 3.62 points below that of higher-income students (see Table 3.21). This gap continues to widen early in high school, when low-income students gain 0.71 points less than higher-income students between the spring of eighth grade and the spring of tenth grade. Later in high school, low-income students make slightly greater gains in reading than higher-income students. But the initial gap and the difference in learning rates mean that low-income students end high school with nearly a 4-point disadvantage in cumulative reading score. Both groups are learning simple inference skills; however, higher-income students are making strides towards learning complex inference skills (Figure 3.21).

TABLE 3.21: DIFFERENCES IN READING LEARNING RATES BY ECONOMIC STATUS—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
HIGHER-INCOME STUDENTS
Before High School                                                                    29.52
8th Grade to 10th Grade   0.163 (0.0041)      0.0213                 3.90             33.42
10th Grade to 12th Grade  0.0825 (0.0054)     0.00893                1.98             35.40
LOW-INCOME STUDENTS (DIFFERENCE FROM HIGHER-INCOME STUDENTS)
Before High School                                                                    -3.62
8th Grade to 10th Grade   -0.0296 (0.0069)    -0.00387               -0.71            -4.33
10th Grade to 12th Grade  0.0225 (0.0083)     0.00244                0.54             -3.79

Each estimate in bold is significantly different from the corresponding estimate for higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.
FIGURE 3.21: DIFFERENCES IN READING LEARNING RATES BY ECONOMIC STATUS—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Economic Status. Vertical axis: score (20 to 45), with proficiency levels 2-Simple Inferences and 3-Complex Inferences marked. Horizontal axis: grades 8-10 and 10-12. Series: Higher-Income, Low-Income.]

Math

Findings presented in Table 3.22 show that low-income students start high school at a significant disadvantage compared to higher-income students and continue to lose ground during high school. Low-income students have average math scores about 5 points lower than higher-income students in the spring of eighth grade. Low-income students gain about 1 point less than more advantaged peers in the first half of high school and about half a point less in the second half of high school. Low-income students end high school not yet learning simple problem solving skills, which higher-income students have learned by tenth grade (Figure 3.22).

TABLE 3.22: DIFFERENCES IN MATH LEARNING RATES BY ECONOMIC STATUS—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
HIGHER-INCOME STUDENTS
Before High School                                                                    40.05
8th Grade to 10th Grade   0.338 (0.0049)      0.0303                 8.10             48.15
10th Grade to 12th Grade  0.185 (0.0050)      0.0141                 4.44             52.59
LOW-INCOME STUDENTS (DIFFERENCE FROM HIGHER-INCOME STUDENTS)
Before High School                                                                    -5.40
8th Grade to 10th Grade   -0.0383 (0.0076)    -0.00344               -0.92            -6.32
10th Grade to 12th Grade  -0.0192 (0.0075)    -0.00146               -0.46            -6.79

Each estimate in bold is significantly different from the corresponding estimate for higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.
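The "Gain Per Period" column appears to be the monthly gain scaled up to the length of the period. As a minimal sketch, assuming roughly 24 months separate the spring assessments in eighth, tenth, and twelfth grades (an assumption that is consistent with the printed values but is not stated in this section), the conversion for Table 3.22 looks like this. `MONTHS_PER_PERIOD`, `gain_per_period`, and `COMPUTED` are illustrative names.

```python
# ASSUMPTION (not stated in this section): about 24 months separate the
# spring assessments in 8th, 10th, and 12th grades.
MONTHS_PER_PERIOD = 24

# (gain per month, reported gain per period), copied from Table 3.22 (math).
TABLE_3_22 = {
    "higher-income, 8-10": (0.338, 8.10),
    "higher-income, 10-12": (0.185, 4.44),
    "low-income difference, 8-10": (-0.0383, -0.92),
    "low-income difference, 10-12": (-0.0192, -0.46),
}


def gain_per_period(gain_per_month: float, months: int = MONTHS_PER_PERIOD) -> float:
    """Scale a monthly gain to a whole assessment period."""
    return gain_per_month * months


COMPUTED = {label: gain_per_period(monthly) for label, (monthly, _) in TABLE_3_22.items()}

for label, (monthly, reported) in TABLE_3_22.items():
    print(f"{label:>30}: computed {COMPUTED[label]:+.2f}, reported {reported:+.2f}")
```

With a 24-month period, each computed gain falls within about 0.01 points of the printed value, which suggests the two columns carry the same information on different time scales.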
FIGURE 3.22: DIFFERENCES IN MATH LEARNING RATES BY ECONOMIC STATUS—SECONDARY SCHOOL
[Figure: NELS Math Scores by Economic Status. Vertical axis: score (20 to 60), with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, and 4-Intermediate Level Math marked. Horizontal axis: grades 8-10 and 10-12. Series: Higher-Income, Low-Income.]

Race and Income

Reading

Tables 3.15 and 3.21, presented earlier, compared the initial status and learning rates across race/ethnicity and across income status. There were stark differences across subgroups. In Table 3.23, the race and income groups are combined to explore how both minority status and income status matter to learning.

White children of low-income backgrounds start high school with a lower average reading score (2.77 points less) than their more advantaged white peers. The white low-income students learn significantly less than the white higher-income students between eighth and tenth grades but learn significantly more between tenth and twelfth grades, ending with about the same deficit that they started high school with.

Black students and Hispanic students, regardless of income status, have significantly lower reading scores at the end of eighth grade than higher-income white students. Throughout high school, black students from both income strata make less gain on the reading assessment than white students, regardless of their economic background. The learning deficit is greatest for low-income black students. They gain significantly fewer points between eighth and tenth grades than any other subgroup.

Hispanic students from low-income backgrounds start high school with the largest deficit on the reading test. However, these students gain significantly more between tenth and twelfth grades than white higher-income students and so reduce their accumulated point deficit. Hispanic students from higher-income backgrounds start high school behind white students in reading but make progress similar to that of white students throughout high school.
Low-income Asian students score lower than higher-income white students on the initial reading assessment at the end of eighth grade, but make gains similar to those of higher-income white students early in high school. Later in high school, these less economically advantaged Asian students gain 2 points more than white higher-income students on the reading assessment. Higher-income Asian students start high school with a similar average eighth grade reading score and learn at the same rate throughout high school as white higher-income students.

By the end of high school, Asian and white students from higher-income backgrounds are advancing towards learning complex inference skills, while black and Hispanic higher-income students are learning simple inferences (see Figure 3.23a). The lines representing learning rates for the low-income students (shown in Figure 3.23b) indicate closer clustering among these subgroups. They all are learning simple inferences and abstract points.

TABLE 3.23: DIFFERENCES IN READING LEARNING RATES BY RACE AND INCOME—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
WHITE HIGHER-INCOME STUDENTS
Before High School                                                                    30.16
8th Grade to 10th Grade   0.166 (0.0046)      0.0217                 3.98             34.14
10th Grade to 12th Grade  0.0812 (0.0061)     0.00880                1.95             36.09
WHITE LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -2.77
8th Grade to 10th Grade   -0.0311 (0.0083)    -0.00407               -0.75            -3.52
10th Grade to 12th Grade  0.0249 (0.0100)     0.00270                0.60             -2.92
BLACK HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -4.60
8th Grade to 10th Grade   -0.0353 (0.0184)    -0.00461               -0.85            -5.45
10th Grade to 12th Grade  -0.00530 (0.0219)   -0.000573              -0.13            -5.57
BLACK LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -7.52
8th Grade to 10th Grade   -0.0523 (0.0143)    -0.00683               -1.25            -8.77
10th Grade to 12th Grade  -0.00490 (0.0178)   -0.000531              -0.12            -8.89
HISPANIC HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -4.55
8th Grade to 10th Grade   -0.0283 (0.0158)    -0.00370               -0.68            -5.23
10th Grade to 12th Grade  0.00186 (0.0194)    0.000201               0.04             -5.19
HISPANIC LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -6.50
8th Grade to 10th Grade   -0.0162 (0.0158)    -0.00212               -0.39            -6.89
10th Grade to 12th Grade  0.0461 (0.0176)     0.00499                1.10             -5.78
ASIAN HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    0.45
8th Grade to 10th Grade   0.0327 (0.0158)     0.00427                0.78             1.23
10th Grade to 12th Grade  0.0419 (0.0244)     0.00454                1.01             2.24
ASIAN LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -4.53
8th Grade to 10th Grade   -0.0307 (0.0195)    -0.00401               -0.74            -5.27
10th Grade to 12th Grade  0.0862 (0.0261)     0.00933                2.07             -3.20

Estimates in bold are significantly different from the corresponding estimate for white higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.

FIGURE 3.23A: DIFFERENCES IN READING LEARNING RATES BY RACE (HIGHER INCOME)—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Race, Higher Income. Vertical axis: score (20 to 45), with proficiency levels 2-Simple Inferences and 3-Complex Inferences marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian, Hispanic, Black, White.]

FIGURE 3.23B: DIFFERENCES IN READING LEARNING RATES BY RACE (LOW INCOME)—SECONDARY SCHOOL
[Figure: NELS Reading Scores by Race, Low Income. Vertical axis: score (20 to 45), with proficiency levels 2-Simple Inferences and 3-Complex Inferences marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian, Hispanic, Black, White.]

Math

Previous tables (Tables 3.16 and 3.22) indicate differences in initial status and learning by race/ethnicity and by income status. In this section, we look at differences by race and income simultaneously.
There are substantial differences in initial math achievement between low-income and higher-income white students. White low-income students start high school with an average math score 4.1 points lower than that of white higher-income students. During high school, white low-income students fall farther behind, because they gain slightly less than white higher-income students.

Black low-income students start high school at the greatest disadvantage in math. On average, black low-income students score 12 points lower on the math assessment at the end of eighth grade than white higher-income students. This deficit increases in early high school, during which black low-income students gain 2.04 points less than these white students. The math learning rates of black low-income and white higher-income students do not differ significantly later in high school. Black higher-income students start high school with a 7.74-point deficit on the math assessment compared to white higher-income students, but both groups learn at similar rates during high school. This deficit is larger than that faced by white low-income students, suggesting the importance of race to achievement gains.

Relative to the eighth grade math performance of white higher-income students, Hispanic low-income students are 9.32 points behind and Hispanic higher-income students are 6.81 points behind. The Hispanic low-income students gain 1.52 points less than white higher-income students between eighth and tenth grades, thus expanding the initial gap. Hispanic higher-income students gain math points at about the same pace as white higher-income students throughout all four years of high school.

Asian higher-income students have a higher average math score at the end of eighth grade than white higher-income students (by 3.11 points). Throughout high school, these Asian students make math gains that do not significantly differ from the math gains made by white higher-income students.
Asian low-income students earn a slightly lower math score at the end of eighth grade than white higher-income students, but gain slightly more during each time period (though this advantage is not significant).

By the end of high school, Asian higher-income students have outpaced all other subgroups and are learning intermediate-level math concepts (see Figure 3.24a). White higher-income students are moving towards learning these concepts but are not there yet. Black and Hispanic higher-income students are beginning to pick up skills in the less advanced proficiency level of simple problem solving. Figure 3.24b illustrates the learning rates of the low-income students by race/ethnicity. In this graph, black and Hispanic students are clustered together, relatively far below the white and Asian low-income students, who are already learning intermediate-level math concepts. In sum, black and Hispanic students, regardless of income status, are far behind their peers in learning math.

TABLE 3.24: DIFFERENCES IN MATH LEARNING RATES BY RACE AND INCOME—SECONDARY SCHOOL

Time Period               Gain Per Month      Effect Size Per Month  Gain Per Period  At End of Period
WHITE HIGHER-INCOME STUDENTS ONLY
Before High School                                                                    40.95
8th Grade to 10th Grade   0.341 (0.0051)      0.0306                 8.19             49.14
10th Grade to 12th Grade  0.183 (0.0057)      0.0140                 4.40             53.54
WHITE LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -4.10
8th Grade to 10th Grade   -0.0274 (0.0086)    -0.00246               -0.66            -4.76
10th Grade to 12th Grade  -0.0240 (0.0087)    -0.00183               -0.58            -5.33
BLACK HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -7.74
8th Grade to 10th Grade   -0.0501 (0.0309)    -0.00450               -1.20            -8.94
10th Grade to 12th Grade  -0.000609 (0.0189)  -0.0000463             -0.01            -8.95
BLACK LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -12.19
8th Grade to 10th Grade   -0.0851 (0.0181)    -0.00764               -2.04            -14.23
10th Grade to 12th Grade  -0.0187 (0.0160)    -0.00142               -0.45            -14.68
HISPANIC HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -6.81
8th Grade to 10th Grade   -0.0123 (0.0187)    -0.00110               -0.29            -7.11
10th Grade to 12th Grade  0.0179 (0.0187)     0.00136                0.43             -6.68
HISPANIC LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -9.32
8th Grade to 10th Grade   -0.0634 (0.0150)    -0.00569               -1.52            -10.84
10th Grade to 12th Grade  0.00314 (0.0171)    0.000238               0.08             -10.76
ASIAN HIGHER-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    3.11
8th Grade to 10th Grade   0.0361 (0.0226)     0.00324                0.87             3.97
10th Grade to 12th Grade  0.0167 (0.0244)     0.00127                0.40             4.37
ASIAN LOW-INCOME STUDENTS (DIFFERENCE FROM WHITE HIGHER-INCOME STUDENTS)
Before High School                                                                    -2.12
8th Grade to 10th Grade   -0.00283 (0.0255)   -0.000254              -0.07            -2.19
10th Grade to 12th Grade  0.0368 (0.0270)     0.00280                0.88             -1.30

Each estimate in bold is significantly different from the corresponding estimate for white higher-income students at the 5 percent level. Descriptions of models are provided in Appendix A.
FIGURE 3.24A: DIFFERENCES IN MATH LEARNING RATES BY RACE (HIGHER INCOME)—SECONDARY SCHOOL
[Figure: NELS Math Scores by Race, Higher Income. Vertical axis: score (20 to 60), with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, and 4-Intermediate Level Math marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian, Hispanic, Black, White.]

FIGURE 3.24B: DIFFERENCES IN MATH LEARNING RATES BY RACE (LOW INCOME)—SECONDARY SCHOOL
[Figure: NELS Math Scores by Race, Low Income. Vertical axis: score (20 to 60), with proficiency levels 1-Single Operations, 2-Fractions and Exponents, 3-Simple Problem Solving, and 4-Intermediate Level Math marked. Horizontal axis: grades 8-10 and 10-12. Series: Asian, Hispanic, Black, White.]

Summary

These secondary school analyses reveal some interesting learning differences by gender, race/ethnicity, language status, and income background. Females start high school with a slightly higher average reading score than males and gain faster than males later in high school. Males start high school with a very slightly higher average math score, and gain faster than females later in high school.

Black students face substantial deficits in reading and math achievement compared to white students and to students of other ethnicities. Overall, Hispanic students perform better than black students in reading and in math (both initial status and learning rates).

In general, the initial status and the gains for students who do not speak English at home lag behind those of students who do speak English at home in both reading and math. But when the groups are combined to create subgroups by race and language status, a more complicated story emerges. Hispanic and Asian NEH students make greater gains in reading and math during high school than EH students of the same ethnicities.

Income status plays a major role in learning rate differences. Black and Hispanic higher-income students finish eighth grade with average reading and math scores lower than those of white higher-income students. However, all three subgroups make similar reading and math gains during high school.
Low-income black and Hispanic students have the lowest average reading and math achievement at the end of eighth grade. However, Hispanic low-income students either keep pace with or outpace white higher-income students between tenth and twelfth grades. Black low-income students consistently fall farther behind white higher-income students by learning at a slower rate during high school.

Summary of Elementary and High School Estimates

The analyses in this chapter have examined differences in reading and math learning across gender, race/ethnicity, and economic status during elementary and high school. There are larger differences in initial achievement and gains between males and females in elementary school than in high school. This holds true for both reading and math results.

These analyses confirm how important race is to academic achievement in America. In both the elementary school and secondary school analyses, regardless of subject matter, black students on average start behind and finish even farther behind white students due to slower learning rates. Hispanic students are also at a disadvantage compared to white students, both in their starting point at the beginning of kindergarten and in their average learning rate during elementary school. Over time, this difference in learning rate fades away, but the differences in achievement levels remain. By high school, Hispanic students are still behind in achievement levels, but are keeping pace with white students' learning rate.

Economic background appears important for students' reading and math gains. In elementary school, the reading and math learning differences between low-income and higher-income students are large at first and then become smaller over time. In high school, the gap is significant only in initial status and then later, from tenth to twelfth grade. Students from families with more middle-class incomes start higher and make slightly faster gains than students from lower-income families.
A great deal of research has focused on achievement differences by race/ethnicity. Clearly these differences are closely related to income status. In elementary school, black and Hispanic children begin kindergarten with significantly lower reading and math scores than white children. Minority children from low-income backgrounds start even farther behind and gain even less. Minority children from higher-income backgrounds start school with less of a deficit compared to the average white student. However, the higher-income black children start higher but gain more slowly than white low-income children, falling behind white low-income children by the end of third grade. The advantage that higher income confers on children's fall kindergarten reading and math test scores may derive from better resources available at home. Not surprisingly, black higher-income students gain at faster rates than black low-income students in both math and reading. Hispanic higher-income students also fare better than Hispanic low-income students in terms of growth rates during elementary school, but only as well as white low-income students in math and reading.

In high school, black and Hispanic higher-income students end eighth grade with large deficits in reading and math test scores compared to white higher-income students. But during high school, these students keep pace with the white higher-income students. Black and Hispanic low-income students continue to fall behind and compound their initial deficit, again highlighting the joint importance of income and race.

Family income is an important indicator of better performance, predicting both a higher initial level and a faster rate of growth on reading and math tests. However, race seems to be a more important indicator, even conditional on family income. The apparent influence of race may reflect many factors not explored here, such as differences in neighborhoods [33] or schools, which we hope to explore further in future research.
Nevertheless, the simple fact that higher-income black children start higher but gain more slowly than white low-income children, and have lower mean achievement than low-income white students after the third grade, confirms the importance of race as an indicator of where our greatest challenges lie in the education system.

[33] Though recent research by Sanbonmatsu et al. (2006) suggests little effect of neighborhood characteristics.

CHAPTER IV: COMPARISON WITH THETA SCORES

All of the analytic models presented in Chapters 2 and 3 were also estimated using theta scores as the outcome. In this section, we compare findings using the IRT scale scores (used as the outcome variable in the Chapter 2 and 3 analyses) to those using the theta scores. [34] The theta scores have at least two advantages over the scale scores. First, the theta scores are potentially less determined by choices made in test item selection. Second, the distribution of theta scores more closely resembles the normal distribution; the distribution is more symmetric and is less truncated or compressed at the tails, which better matches the assumptions that underlie our statistical modeling. The main drawbacks of theta scores are that they may not be available for all tests (so results using theta scores are potentially less comparable across tests), and they are measured in arbitrary units (in ECLS, from -5 to 5, and in NELS, from 0 to 100). In general, the theta scores do not represent a more or less accurate metric than the IRT scale scores from the prior chapter, but are simply different.

Because the IRT scale and theta results use different metrics, there is no guarantee that the results will look similar, and in fact they differ in several important ways, as can be seen in the tables that follow in the next section.
However, this potential and actual difference offers an important test of our use of “standardized” rates of learning or “effect sizes,” in which we divide mean growth rates by the standard deviation at the base period to construct a measure of growth in standard deviation units. If the “effect sizes” do actually offer the potential of comparing across different tests using different metrics, the IRT scale and theta score results should look more similar when presented as effect sizes, and we find they do, especially when using end-of-period standard deviations to estimate standardized gaps, as in the final section of this chapter. As such, the results in this chapter support the findings from Chapters 2 and 3 and offer a compelling reason to focus on effect size results in those chapters.

Tables in this chapter present the IRT results from four models next to the theta results from the same four models. The four selected models in this comparative analysis are: 1) a base model with only variables measuring time spent in each specified time period used as predictors; 2) the base model augmented with race/ethnicity indicators; 3) the base model augmented with race/ethnicity indicators fully interacted with language status; and 4) the base model augmented with race/ethnicity indicators fully interacted with income status.

Elementary School

Base model

Results using the ECLS data, presented in Table 4.1, indicate that the effect sizes look reasonably similar across the two metrics we examine, IRT scale scores and theta scores. A priori, we need not expect that these two metrics would produce similar results. The IRT scale score distribution is multi-modal and skewed; the theta score distribution is more normal (see Appendix C for more detail). But comparing the effect sizes for the two metrics, in the third and sixth columns of estimates in Table 4.1, we can conclude that they are similar and suggest a broad agreement of results.
The only glaring discrepancy is for reading gains (or losses) over the summer between kindergarten and first grade, but the gains over this period are very imprecisely estimated, due in part to the smaller sample size at the beginning of first grade, so neither estimate differs significantly from zero (i.e., the null hypothesis of no change over summer cannot be rejected). The estimates for kindergarten and second and third grades are more robust, and seem reasonably close in magnitude.

[34] The IRT scores are nonlinear transformations of the theta scores, including information on the set of test item characteristics. The IRT model uses an individual's pattern of item responses (right, wrong, omitted) and dimensions of the items to estimate each individual's probability of answering each item correctly, and the individual's “ability” parameter theta (similar to a student fixed effect in a logistic regression).

Second, a comparison of gain as measured in IRT scale score points suggests that first grade gains exceed kindergarten gains. However, the theta scores and the effect sizes using IRT scale scores indicate that the first grade rate is similar to, or perhaps smaller than, the kindergarten rate. In Table 4.2, the ratio of first grade to kindergarten estimated growth rates, and the ratio of second and third grade to first grade estimated growth rates, are presented for both metrics. The ratios for the unstandardized growth estimates indicate that IRT scale reading scores increase nearly twice as fast in first grade as in kindergarten (1.812 times as fast) but theta reading scores increase about nine tenths as fast in first grade as in kindergarten (0.936 times as fast). Looking at the same two metrics using effect sizes, however, the rate of increase appears just as fast in first grade as in kindergarten (both 1.071 and 0.966 are close to one, indicating parity or near-parity of learning rates).
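The four first-grade-to-kindergarten reading ratios quoted above can be recomputed directly from the monthly growth rates and effect sizes reported in Table 4.1. The sketch below does so; the dictionary layout and the names `READING`, `first_grade_to_kindergarten`, and `RATIOS` are illustrative choices, not part of the report.

```python
# (gain per month, effect size per month) for reading, copied from Table 4.1.
READING = {
    "irt": {"kindergarten": (1.81, 0.196), "first_grade": (3.28, 0.210)},
    "theta": {"kindergarten": (0.103, 0.175), "first_grade": (0.0964, 0.169)},
}


def first_grade_to_kindergarten(metric: str) -> tuple[float, float]:
    """Return (ratio of raw monthly gains, ratio of monthly effect sizes)."""
    k_gain, k_effect = READING[metric]["kindergarten"]
    g1_gain, g1_effect = READING[metric]["first_grade"]
    return g1_gain / k_gain, g1_effect / k_effect


RATIOS = {metric: first_grade_to_kindergarten(metric) for metric in READING}

for metric, (raw, standardized) in RATIOS.items():
    print(f"{metric:>5}: raw ratio = {raw:.3f}, effect size ratio = {standardized:.3f}")
```

The raw ratios (about 1.812 for IRT and 0.936 for theta) differ by nearly a factor of two, while the effect size ratios (about 1.071 and 0.966) are both close to one, which is the pattern the text describes.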
TABLE 4.1: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, ALL STUDENTS

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| READING | | | | | | |
| During Kindergarten | 22.8 | 1.81 | 0.196 | -1.34 | 0.103 | 0.175 |
| Summer Kindergarten-1st Grade | 39.8 | -0.171 | -0.0126 | -0.372 | 0.00889 | 0.0160 |
| During 1st Grade | 39.4 | 3.28 | 0.210 | -0.349 | 0.0964 | 0.169 |
| After 1st Grade, into 3rd Grade | 70.3 | 1.59 | 0.0749 | 0.559 | 0.0309 | 0.0614 |
| MATHEMATICS | | | | | | |
| During Kindergarten | 17.5 | 1.62 | 0.196 | -1.24 | 0.0955 | 0.163 |
| Summer Kindergarten-1st Grade | 32.7 | 0.491 | 0.0422 | -0.342 | 0.0301 | 0.0545 |
| During 1st Grade | 34.0 | 2.37 | 0.191 | -0.265 | 0.0867 | 0.154 |
| After 1st Grade, into 3rd Grade | 56.3 | 1.20 | 0.0767 | 0.552 | 0.0317 | 0.0638 |

Third, the gains per month and the effect sizes based on the theta scores do not diverge as dramatically across time periods as the corresponding results based on the IRT scale scores. This is primarily because the size of the standard deviation in the theta metric is fairly stable while the standard deviations for the IRT scores vary more widely (see Appendix C). Thus, the ratios in Table 4.2 range between roughly four tenths and one for effect sizes (and for raw theta scores), but range between one half and 1.8 for IRT scale scores.

TABLE 4.2: RATIOS OF ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, ALL STUDENTS

| Ratio | IRT Gain Per Month | IRT Effect Size Per Month | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- |
| READING | | | | |
| 1st Grade/Kindergarten | 1.812 | 1.071 | 0.936 | 0.966 |
| 2nd and 3rd Grades/1st Grade | 0.485 | 0.357 | 0.321 | 0.363 |
| MATHEMATICS | | | | |
| 1st Grade/Kindergarten | 1.463 | 0.974 | 0.908 | 0.945 |
| 2nd and 3rd Grades/1st Grade | 0.506 | 0.402 | 0.366 | 0.414 |

Race/ethnicity model

Even disaggregated by race/ethnicity, the effect sizes look reasonably similar across the two metrics we examine, IRT scale scores and theta scores.
While individual time periods for some groups seem to have different estimated growth rates, an overall comparison of the effect sizes for the two metrics, in the third and sixth columns of estimates in Table 4.3, suggests the results are broadly similar. The major difference seems to be that black and Hispanic students are predicted to experience faster growth in scores than white students when using the theta metric, whereas results using the IRT scale metric predict faster growth rates for white students.

TABLE 4.3: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY RACE

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | 24.2 | 1.90 | 0.205 | -1.23 | 0.103 | 0.174 |
| Summer Kindergarten-1st Grade | 42.1 | -0.231 | -0.0169 | -0.263 | 0.00609 | 0.0109 |
| During 1st Grade | 41.5 | 3.53 | 0.227 | -0.247 | 0.0972 | 0.170 |
| After 1st Grade, into 3rd Grade | 74.8 | 1.64 | 0.0773 | 0.668 | 0.0310 | 0.0616 |
| WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 19.5 | 1.73 | 0.210 | -1.08 | 0.0945 | 0.161 |
| Summer Kindergarten-1st Grade | 35.8 | 0.473 | 0.0407 | -0.189 | 0.0267 | 0.0482 |
| During 1st Grade | 37.0 | 2.49 | 0.202 | -0.121 | 0.0853 | 0.151 |
| After 1st Grade, into 3rd Grade | 60.5 | 1.23 | 0.0784 | 0.683 | 0.0315 | 0.0635 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -3.17 | -0.283 | -0.0306 | -0.244 | -0.00332 | -0.00562 |
| Summer Kindergarten-1st Grade | -5.83 | -0.0288 | -0.00211 | -0.276 | -0.000396 | -0.000711 |
| During 1st Grade | -5.91 | -0.576 | -0.0370 | -0.277 | 0.000639 | 0.00112 |
| After 1st Grade, into 3rd Grade | -11.3 | -0.185 | -0.00871 | -0.271 | -0.00131 | -0.00260 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.64 | -0.361 | -0.0439 | -0.363 | -0.00374 | -0.00639 |
| Summer Kindergarten-1st Grade | -8.04 | -0.0965 | -0.00829 | -0.398 | 0.000578 | 0.00105 |
| During 1st Grade | -8.29 | -0.379 | -0.0306 | -0.397 | 0.00314 | 0.00555 |
| After 1st Grade, into 3rd Grade | -11.9 | -0.147 | -0.00938 | -0.367 | -0.00113 | -0.00227 |
| HISPANIC STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -4.92 | -0.186 | -0.0201 | -0.373 | 0.00660 | 0.0112 |
| Summer Kindergarten-1st Grade | -6.67 | 0.268 | 0.0196 | -0.311 | 0.0179 | 0.0321 |
| During 1st Grade | -5.98 | -0.687 | -0.0440 | -0.265 | -0.00384 | -0.00671 |
| After 1st Grade, into 3rd Grade | -12.4 | -0.0578 | -0.00273 | -0.301 | 0.00122 | 0.00242 |
| HISPANIC STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -5.63 | -0.253 | -0.0307 | -0.461 | 0.00606 | 0.0103 |
| Summer Kindergarten-1st Grade | -8.01 | 0.125 | 0.0108 | -0.404 | 0.0189 | 0.0342 |
| During 1st Grade | -7.68 | -0.222 | -0.0180 | -0.355 | 0.00554 | 0.00981 |
| After 1st Grade, into 3rd Grade | -9.78 | -0.0275 | -0.00176 | -0.303 | 0.00100 | 0.00202 |
| ASIAN STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | 1.51 | 0.240 | 0.0259 | 0.0789 | 0.00295 | 0.00500 |
| Summer Kindergarten-1st Grade | 3.77 | 0.611 | 0.0448 | 0.107 | 0.0113 | 0.0204 |
| During 1st Grade | 5.34 | -0.289 | -0.0185 | 0.136 | -0.00959 | -0.0168 |
| After 1st Grade, into 3rd Grade | 2.63 | -0.250 | -0.0118 | 0.0456 | -0.00463 | -0.00920 |
| ASIAN STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 0.556 | -0.0341 | -0.00414 | 0.0233 | -0.00347 | -0.00592 |
| Summer Kindergarten-1st Grade | 0.235 | 0.443 | 0.0381 | -0.00936 | 0.0145 | 0.0263 |
| During 1st Grade | 1.38 | -0.362 | -0.0293 | 0.0198 | -0.00986 | -0.0175 |
| After 1st Grade, into 3rd Grade | -2.03 | 0.0941 | 0.00601 | 0.0645 | 0.00307 | 0.00619 |

Black students’ learning rates are consistently higher relative to white students when looking at theta scores instead of scale scores. In particular, under the theta metric black students actually make larger gains than white students on the reading test in first grade, and on the math test in both first grade and the preceding summer. While black students start out farther behind when using the theta metric (about four tenths of a standard deviation behind white students at the beginning of kindergarten), they wind up slightly less far behind by the end of first grade. However, when the gaps are measured in standard deviation units (see the final section of this chapter), the differences between the theta and scale score results at a point in time are quite small indeed.
This is partly due to differences in how the standard deviation on the test changes over time for the two different metrics, and partly to the order of magnitude of the gaps in learning rates, which are quite small relative to the gap in levels. That is, the rate of change in the gap over relatively short periods of time like first grade or the summer between kindergarten and first is less relevant to the size of the gap than its starting size and its rate of change over longer periods of time, such as the period encompassing second and third grades. A much larger difference appears for Hispanic students, who gain relative to non-Hispanic white students in every period (i.e. the coefficients measuring the rates of gain on the mathematics test, in the third column of estimates, are all positive) when looking at the theta score results. Hispanic students seem to lose ground relative to non-Hispanic white students in every period except the summer between kindergarten and first grade when looking at the scale score results (i.e. three of the four coefficients measuring the rates of gain, in the first column of estimates, are negative). Thus a researcher using theta scores would find that Hispanic students seem to start at a large disadvantage and gradually narrow that gap, while a researcher using scale scores would find that Hispanic students seem to start at a smaller disadvantage that gradually increased in size. 
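The contrast between the two metrics for Hispanic students can be seen directly in the starting-level gaps for mathematics reported in Table 4.3. The sketch below is only an illustration: the variable and function names are hypothetical, and the cross-check at the end assumes that the gap changes at a constant monthly rate within a period, which is how the per-month estimates are framed but is not restated in this section.

```python
# Hispanic-white gaps in mathematics at the start of each period, from Table 4.3
# (kindergarten, summer, first grade, after first grade into third grade).
irt_gap = [-5.63, -8.01, -7.68, -9.78]        # IRT scale score points
theta_gap = [-0.461, -0.404, -0.355, -0.303]  # theta units


def change_in_gap(gaps):
    """Change in the gap between the first and last starting levels."""
    return gaps[-1] - gaps[0]


irt_change = change_in_gap(irt_gap)      # negative: the gap widens
theta_change = change_in_gap(theta_gap)  # positive: the gap narrows

# Cross-check under the constant-rate assumption: the change in the gap over
# kindergarten, divided by the monthly gap rate from Table 4.3, gives the
# number of months implied by each metric.
irt_months = (irt_gap[1] - irt_gap[0]) / -0.253
theta_months = (theta_gap[1] - theta_gap[0]) / 0.00606

print(irt_change, theta_change)
print(irt_months, theta_months)
```

Both metrics imply nearly the same elapsed time over kindergarten, so the opposite conclusions come from the sign of the estimated gap rates rather than from any inconsistency in how the levels and rates are reported.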
Language model

TABLE 4.4: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY LANGUAGE

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| ENGLISH AT HOME STUDENTS: READING | | | | | | |
| During Kindergarten | 23.1 | 1.89 | 0.206 | -1.32 | 0.103 | 0.174 |
| Summer Kindergarten-1st Grade | 40.3 | -0.236 | -0.0174 | -0.345 | 0.00612 | 0.0110 |
| During 1st Grade | 39.8 | 3.53 | 0.228 | -0.328 | 0.0973 | 0.170 |
| After 1st Grade, into 3rd Grade | 71.3 | 1.64 | 0.0775 | 0.585 | 0.0310 | 0.0616 |
| ENGLISH AT HOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 18.0 | 1.72 | 0.210 | -1.20 | 0.0945 | 0.162 |
| Summer Kindergarten-1st Grade | 33.4 | 0.471 | 0.0406 | -0.306 | 0.0266 | 0.0481 |
| During 1st Grade | 34.7 | 2.49 | 0.202 | -0.231 | 0.0855 | 0.151 |
| After 1st Grade, into 3rd Grade | 57.0 | 1.23 | 0.0785 | 0.576 | 0.0316 | 0.0634 |
| NON-ENGLISH AT HOME STUDENTS: READING | | | | | | |
| During Kindergarten | -4.12 | -0.282 | -0.0306 | -0.309 | -0.00332 | -0.00563 |
| Summer Kindergarten-1st Grade | -5.53 | -0.0129 | -0.000952 | -0.276 | 0.000183 | 0.000328 |
| During 1st Grade | -5.18 | -0.588 | -0.0379 | -0.235 | 0.000274 | 0.000480 |
| After 1st Grade, into 3rd Grade | -10.1 | -0.186 | -0.00880 | -0.245 | -0.00128 | -0.00254 |
| NON-ENGLISH AT HOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.51 | -0.367 | -0.0448 | -0.384 | -0.00414 | -0.00707 |
| Summer Kindergarten-1st Grade | -6.40 | -0.0594 | -0.00512 | -0.339 | 0.00216 | 0.00390 |
| During 1st Grade | -6.40 | -0.392 | -0.0318 | -0.298 | 0.00259 | 0.00458 |
| After 1st Grade, into 3rd Grade | -7.13 | -0.146 | -0.00932 | -0.226 | -0.00107 | -0.00214 |

Race/ethnicity by language model

Disaggregating further, by race/ethnicity and by whether students have ever failed the OLDS assessment (indicating poor English skills), the effect sizes still look broadly similar across IRT scale scores and theta scores. While estimated growth rates differ in individual time periods for some groups, the effect sizes for the two metrics reported in the third and sixth columns of estimates in Table 4.5 seem broadly similar.
Again, black students seem to gain a bit more on white students when looking at the theta results, but the difference in learning rates is negligible relative to the size of the initial gap. The major difference is again for Hispanic students, who are predicted to experience faster growth in scores than white students using the theta metric, whereas results using the IRT scale metric predict faster growth rates for white students. When we disaggregate by language status, this difference is very similar for both groups of Hispanic students. The difference in results for Hispanic students is thus robust to the particular type of disaggregation reported in Table 4.5, and there is no obvious explanation for why the rates of gain would differ so much between scale scores and theta scores. Because many Hispanic students who took the OLDS test at some point in time (whom we classify as belonging to non-English homes, or NEH) have missing test scores on the reading test in early periods, we might expect the reading test results to look somewhat different, especially for the subgroup of students that we classify as NEH. This is not the case, however. In fact, the larger difference is observed in the math test, where Hispanic students are not missing scores at an exceptionally high rate.
TABLE 4.5: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY RACE AND LANGUAGE

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | 24.2 | 1.90 | 0.205 | -1.23 | 0.103 | 0.174 |
| Summer Kindergarten-1st Grade | 42.1 | -0.231 | -0.0169 | -1.263 | 0.00612 | 0.0110 |
| During 1st Grade | 41.5 | 3.53 | 0.227 | -0.247 | 0.0972 | 0.170 |
| After 1st Grade, into 3rd Grade | 74.8 | 1.64 | 0.0773 | 0.668 | 0.0310 | 0.0616 |
| WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 19.5 | 1.73 | 0.210 | -1.08 | 0.0945 | 0.161 |
| Summer Kindergarten-1st Grade | 35.8 | 0.473 | 0.0407 | -0.189 | 0.0266 | 0.0481 |
| During 1st Grade | 37.0 | 2.49 | 0.202 | -0.121 | 0.0853 | 0.151 |
| After 1st Grade, into 3rd Grade | 60.5 | 1.23 | 0.0784 | 0.682 | 0.0315 | 0.0635 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -3.17 | -0.283 | -0.0306 | -0.244 | -0.00332 | -0.00562 |
| Summer Kindergarten-1st Grade | -5.83 | -0.0290 | -0.00212 | -0.276 | -0.000396 | -0.000711 |
| During 1st Grade | -5.91 | -0.576 | -0.0370 | -0.277 | 0.000639 | 0.00112 |
| After 1st Grade, into 3rd Grade | -11.3 | -0.185 | -0.00870 | -0.271 | -0.00131 | -0.00260 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.64 | -0.367 | -0.0448 | -0.363 | -0.00371 | -0.00633 |
| Summer Kindergarten-1st Grade | -8.04 | -0.0594 | -0.00512 | -0.398 | 0.000578 | 0.00105 |
| During 1st Grade | -8.29 | -0.392 | -0.0318 | -0.396 | 0.00314 | 0.00555 |
| After 1st Grade, into 3rd Grade | -11.9 | -0.146 | -0.00932 | -0.36 | -0.00113 | -0.00227 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -3.36 | -0.0567 | -0.00613 | -0.257 | 0.00883 | 0.0150 |
| Summer Kindergarten-1st Grade | -3.89 | 0.366 | 0.0268 | -0.174 | 0.0148 | 0.0265 |
| During 1st Grade | -2.95 | -0.437 | -0.0280 | -0.136 | -0.00402 | -0.00703 |
| After 1st Grade, into 3rd Grade | -7.07 | -0.0648 | -0.00305 | -0.174 | -0.0000913 | -0.000182 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -3.79 | -0.160 | -0.0196 | -0.295 | 0.00490 | 0.00836 |
| Summer Kindergarten-1st Grade | -5.27 | 0.191 | 0.0165 | -0.249 | 0.0146 | 0.0265 |
| During 1st Grade | -4.76 | -0.228 | -0.0185 | -0.211 | 0.000152 | 0.000270 |
| After 1st Grade, into 3rd Grade | -6.95 | -0.0144 | -0.000919 | -0.210 | 0.000517 | 0.00104 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -6.98 | -0.293 | -0.0317 | -0.526 | 0.00606 | 0.0103 |
| Summer Kindergarten-1st Grade | -9.74 | 0.0404 | 0.00296 | -0.469 | 0.0194 | 0.0348 |
| During 1st Grade | -9.64 | -0.880 | -0.0564 | -0.419 | -0.00134 | -0.00234 |
| After 1st Grade, into 3rd Grade | -17.9 | -0.0309 | -0.00146 | -0.432 | 0.00286 | 0.00569 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -7.39 | -0.340 | -0.0413 | -0.621 | 0.0251 | 0.0453 |
| Summer Kindergarten-1st Grade | -10.6 | 0.0829 | 0.00712 | -0.551 | 0.0251 | 0.0453 |
| During 1st Grade | -10.4 | -0.221 | -0.0178 | -0.486 | 0.00998 | 0.0177 |
| After 1st Grade, into 3rd Grade | -12.5 | -0.0376 | -0.00240 | -0.392 | 0.00146 | 0.00294 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | 3.85 | 0.312 | 0.0337 | 0.196 | 0.00170 | 0.00289 |
| Summer Kindergarten-1st Grade | 6.79 | 0.758 | 0.0556 | 0.212 | 0.0136 | 0.0245 |
| During 1st Grade | 8.75 | -0.213 | -0.0136 | 0.247 | -0.0128 | -0.0224 |
| After 1st Grade, into 3rd Grade | 6.74 | -0.273 | -0.0129 | 0.127 | -0.00548 | -0.0109 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 2.09 | 0.0574 | 0.00697 | 0.120 | -0.00265 | -0.00452 |
| Summer Kindergarten-1st Grade | 2.63 | 0.731 | 0.0628 | 0.0947 | 0.0220 | 0.0397 |
| During 1st Grade | 4.51 | -0.319 | -0.0258 | 0.151 | -0.0122 | -0.0216 |
| After 1st Grade, into 3rd Grade | 1.51 | 0.0657 | 0.00420 | 0.0364 | 0.00177 | 0.00355 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| During Kindergarten | -0.564 | 0.221 | 0.0239 | -0.0207 | 0.00517 | 0.00876 |
| Summer Kindergarten-1st Grade | 1.51 | 0.501 | 0.0367 | 0.0280 | 0.00928 | 0.0167 |
| During 1st Grade | 2.81 | -0.327 | -0.0210 | 0.0519 | -0.00673 | -0.0118 |
| After 1st Grade, into 3rd Grade | 0.274 | -0.235 | -0.0111 | -0.0114 | -0.00402 | -0.00799 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -0.638 | -0.0937 | -0.0114 | -0.0492 | -0.00387 | -0.00659 |
| Summer Kindergarten-1st Grade | -1.52 | 0.153 | 0.0131 | -0.0856 | 0.00673 | 0.0122 |
| During 1st Grade | -1.13 | -0.363 | -0.0294 | -0.0682 | -0.00724 | -0.0128 |
| After 1st Grade, into 3rd Grade | -4.55 | 0.114 | 0.00731 | -0.136 | 0.00402 | 0.00809 |

Income model

TABLE 4.6: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY INCOME

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| HIGHER-INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | 25.0 | 1.89 | 0.206 | -1.17 | 0.103 | 0.174 |
| Summer Kindergarten-1st Grade | 43.3 | -0.236 | -0.0174 | -0.215 | 0.00612 | 0.0110 |
| During 1st Grade | 43.1 | 3.53 | 0.228 | -0.189 | 0.0973 | 0.170 |
| After 1st Grade, into 3rd Grade | 76.2 | 1.64 | 0.0775 | 0.701 | 0.0310 | 0.0616 |
| HIGHER-INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 19.9 | 1.72 | 0.210 | -1.05 | 0.0945 | 0.162 |
| Summer Kindergarten-1st Grade | 36.2 | 0.471 | 0.0406 | -0.173 | 0.0266 | 0.0481 |
| During 1st Grade | 37.5 | 2.49 | 0.202 | -0.99 | 0.0855 | 0.151 |
| After 1st Grade, into 3rd Grade | 60.8 | 1.23 | 0.0785 | 0.693 | 0.0316 | 0.0634 |
| LOW-INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -5.37 | -0.282 | -0.0306 | -0.401 | -0.00332 | -0.00563 |
| Summer Kindergarten-1st Grade | -7.98 | -0.0129 | -0.000952 | -0.362 | 0.000183 | 0.000328 |
| During 1st Grade | -8.68 | -0.588 | -0.0379 | -0.370 | 0.000274 | 0.000480 |
| After 1st Grade, into 3rd Grade | -13.77 | -0.186 | -0.00880 | -0.328 | -0.00128 | -0.00254 |
| LOW-INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -5.50 | -0.367 | -0.0448 | -0.437 | -0.00414 | -0.00707 |
| Summer Kindergarten-1st Grade | -7.83 | -0.0594 | -0.00512 | -0.388 | 0.00216 | 0.00390 |
| During 1st Grade | -8.01 | -0.392 | -0.0318 | -0.382 | 0.00259 | 0.00458 |
| After 1st Grade, into 3rd Grade | -10.4 | -0.146 | -0.00932 | -0.323 | -0.00107 | -0.00214 |

Race/ethnicity by income model

Disaggregating in a different way, by race/ethnicity and by whether students have family income low enough to qualify for free or reduced-price lunch (1.85 times the poverty line), the overall pattern of effect sizes still looks reasonably similar across IRT scale scores and theta scores.
While estimated growth rates differ in individual time periods for some groups, the effect sizes for the two metrics reported in the third and sixth columns of estimates in Table 4.7 seem broadly similar. Again, the biggest differences between scale score and theta score results tend to be observed among the estimates of gains for Hispanic children, though the results are much closer after we break the Hispanic sample into low-income and higher-income groups. The one subgroup with relatively large differences between scale score and theta score results is low-income Hispanic students, and this difference is mainly observed in mathematics growth rates. It is possible that the group of students who cannot be assigned to the low-income or the higher-income group, due to missing income or household composition data, is driving the large observed differences between scale score and theta score results for Hispanic students in Tables 4.3 and 4.5. Further research is warranted to check whether this inconsistency is primarily a data quality issue or in fact reflects undesirable features of the test design.
TABLE 4.7: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY RACE AND INCOME

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT SD Per Month | Theta Starting Level | Theta Gain Per Month | Theta SD Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | 25.5 | 1.95 | 0.210 | -1.14 | 0.101 | 0.171 |
| Summer Kindergarten-1st Grade | 43.8 | -0.159 | -0.0116 | -0.186 | 0.00767 | 0.0138 |
| During 1st Grade | 43.4 | 3.64 | 0.233 | -0.166 | 0.0958 | 0.168 |
| After 1st Grade, into 3rd Grade | 77.7 | 1.64 | 0.0774 | 0.736 | 0.0305 | 0.0606 |
| WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 20.8 | 1.77 | 0.215 | -0.987 | 0.0928 | 0.158 |
| Summer Kindergarten-1st Grade | 37.4 | 0.524 | 0.0450 | -0.114 | 0.0285 | 0.0515 |
| During 1st Grade | 38.7 | 2.54 | 0.206 | -0.040 | 0.0839 | 0.149 |
| After 1st Grade, into 3rd Grade | 62.9 | 1.24 | 0.0791 | 0.749 | 0.0314 | 0.0632 |
| WHITE LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -4.57 | -0.169 | -0.0183 | -0.331 | 0.00645 | 0.0109 |
| Summer Kindergarten-1st Grade | -6.17 | -0.242 | -0.0177 | -0.271 | -0.00502 | -0.00903 |
| During 1st Grade | -6.79 | -0.384 | -0.0246 | -0.283 | 0.00423 | 0.00740 |
| After 1st Grade, into 3rd Grade | -10.41 | -0.0202 | -0.000952 | -0.244 | 0.00161 | 0.00321 |
| WHITE LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.35 | -0.138 | -0.0167 | -0.325 | 0.00609 | 0.0104 |
| Summer Kindergarten-1st Grade | -5.64 | -0.177 | -0.0152 | -0.267 | -0.00645 | -0.0117 |
| During 1st Grade | -6.10 | -0.176 | -0.0143 | -0.284 | 0.00511 | 0.00906 |
| After 1st Grade, into 3rd Grade | -7.76 | -0.0412 | -0.00263 | -0.236 | 0.000365 | 0.000735 |
| BLACK HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -1.62 | -0.164 | -0.0177 | -0.122 | -0.00253 | -0.00428 |
| Summer Kindergarten-1st Grade | -3.17 | 0.337 | 0.0247 | -0.146 | 0.00767 | 0.0138 |
| During 1st Grade | -2.30 | -0.626 | -0.0401 | -0.126 | -0.00676 | -0.0118 |
| After 1st Grade, into 3rd Grade | -8.19 | -0.171 | -0.00806 | -0.190 | -0.00219 | -0.00436 |
| BLACK HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -3.67 | -0.303 | -0.0368 | -0.271 | -0.00447 | -0.00763 |
| Summer Kindergarten-1st Grade | -6.53 | -0.0632 | -0.00543 | -0.313 | -0.00167 | -0.00303 |
| During 1st Grade | -6.69 | -0.384 | -0.0310 | -0.317 | 0.00122 | 0.00216 |
| After 1st Grade, into 3rd Grade | -10.3 | -0.0942 | -0.00602 | -0.306 | -0.000852 | -0.00172 |
| BLACK LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -5.69 | -0.405 | -0.0437 | -0.431 | -0.00110 | -0.00186 |
| Summer Kindergarten-1st Grade | -9.50 | -0.311 | -0.0228 | -0.442 | -0.00679 | -0.0122 |
| During 1st Grade | -10.3 | -0.699 | -0.0448 | -0.459 | 0.00590 | 0.0103 |
| After 1st Grade, into 3rd Grade | -16.9 | -0.193 | -0.00910 | -0.403 | -0.000183 | -0.000363 |
| BLACK LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -6.82 | -0.443 | -0.0538 | -0.534 | -0.000974 | -0.00166 |
| Summer Kindergarten-1st Grade | -11.0 | -0.187 | -0.0160 | -0.544 | -0.00116 | -0.00209 |
| During 1st Grade | -11.5 | -0.445 | -0.0360 | -0.547 | 0.00609 | 0.0108 |
| After 1st Grade, into 3rd Grade | -15.7 | -0.185 | -0.0118 | -0.489 | -0.00107 | -0.00214 |
| HISPANIC HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -3.38 | -0.0322 | -0.00348 | -0.244 | 0.00883 | 0.0150 |
| Summer Kindergarten-1st Grade | -3.69 | 0.280 | 0.0205 | -0.161 | 0.00910 | 0.0164 |
| During 1st Grade | -2.96 | -0.394 | -0.0252 | -0.137 | -0.00192 | -0.00336 |
| After 1st Grade, into 3rd Grade | -6.67 | -0.0518 | -0.00244 | -0.155 | -0.000335 | -0.000666 |
| HISPANIC HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.18 | -0.108 | -0.0131 | -0.312 | 0.00721 | 0.0123 |
| Summer Kindergarten-1st Grade | -5.20 | -0.198 | -0.0170 | -0.244 | -0.00143 | -0.00259 |
| During 1st Grade | -5.71 | -0.129 | -0.0104 | -0.247 | 0.00414 | 0.00733 |
| After 1st Grade, into 3rd Grade | -6.92 | 0.00648 | 0.000414 | -0.208 | 0.000974 | 0.00196 |
| HISPANIC LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -7.87 | -0.337 | -0.0364 | -0.601 | 0.00901 | 0.0153 |
| Summer Kindergarten-1st Grade | -11.1 | 0.133 | 0.00977 | -0.516 | 0.0212 | 0.0381 |
| During 1st Grade | -10.7 | -1.00 | -0.0643 | -0.461 | -0.00277 | -0.00485 |
| After 1st Grade, into 3rd Grade | -20.1 | -0.0621 | -0.00293 | -0.487 | 0.00295 | 0.00587 |
| HISPANIC LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -8.25 | -0.388 | -0.0472 | -0.679 | 0.00807 | 0.0138 |
| Summer Kindergarten-1st Grade | -11.9 | 0.226 | 0.0194 | -0.604 | 0.0278 | 0.0502 |
| During 1st Grade | -11.3 | -0.343 | -0.0277 | -0.532 | 0.00852 | 0.0151 |
| After 1st Grade, into 3rd Grade | -14.6 | -0.0628 | -0.00401 | -0.452 | 0.00119 | 0.00239 |
| ASIAN HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | 3.16 | 0.342 | 0.0369 | 0.179 | 0.000761 | 0.00129 |
| Summer Kindergarten-1st Grade | 6.38 | 0.531 | 0.0389 | 0.186 | 0.00143 | 0.00257 |
| During 1st Grade | 7.74 | -0.211 | -0.0135 | 0.190 | -0.00871 | -0.0152 |
| After 1st Grade, into 3rd Grade | 5.76 | -0.281 | -0.0133 | 0.108 | -0.00542 | -0.0108 |
| ASIAN HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | 1.64 | -0.00469 | -0.000569 | 0.087 | -0.00444 | -0.00758 |
| Summer Kindergarten-1st Grade | 1.59 | 0.212 | 0.0182 | 0.045 | 0.00143 | 0.00259 |
| During 1st Grade | 2.14 | -0.281 | -0.0227 | 0.048 | -0.00764 | -0.0135 |
| After 1st Grade, into 3rd Grade | -0.051 | 0.0906 | 0.00579 | -0.024 | 0.00295 | 0.00594 |
| ASIAN LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: READING | | | | | | |
| During Kindergarten | -4.72 | -0.00597 | -0.000644 | -0.344 | 0.0143 | 0.0243 |
| Summer Kindergarten-1st Grade | -4.77 | 0.628 | 0.0460 | -0.209 | 0.0263 | 0.0473 |
| During 1st Grade | -3.15 | -0.657 | -0.0422 | -0.141 | -0.00834 | -0.0146 |
| After 1st Grade, into 3rd Grade | -9.34 | -0.208 | -0.00981 | -0.220 | -0.00207 | -0.00412 |
| ASIAN LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS: MATHEMATICS | | | | | | |
| During Kindergarten | -4.55 | -0.144 | -0.0175 | -0.325 | 0.00444 | 0.00758 |
| Summer Kindergarten-1st Grade | -5.91 | 0.754 | 0.0648 | -0.283 | 0.0328 | 0.0593 |
| During 1st Grade | -3.97 | -0.617 | -0.0499 | -0.198 | -0.0101 | -0.0178 |
| After 1st Grade, into 3rd Grade | -9.78 | 0.0745 | 0.00476 | -0.293 | 0.00362 | 0.00729 |

High School

Base model

First, the effect sizes are essentially the same for IRT scale scores and theta scores. Again, there is no reason to think that these results should match so closely. Yet the gains per month reported in standard deviation units (effect size columns, the third and sixth columns in Table 4.8) are very similar.
Even the IRT scale and theta score gains per month (reported in the second and fifth columns of estimates in Table 4.8) are very similar. However, Table 4.9 shows that the ratio of the mean growth rate in the second half of high school to the mean growth rate in the first half of high school is much more similar across metrics when measured in standard deviation units, lending support to our focus on effect sizes in prior chapters.

TABLE 4.8: ESTIMATED GROWTH RATES IN NELS:88: COMPARISON OF IRT AND THETA, ALL STUDENTS

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| READING | | | | | | |
| 8th Grade to 10th Grade | 28.3 | 0.152 | 0.0199 | 47.6 | 0.156 | 0.0205 |
| 10th Grade to 12th Grade | 31.9 | 0.0903 | 0.00978 | 51.4 | 0.0971 | 0.0102 |
| MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 38.2 | 0.324 | 0.0291 | 46.6 | 0.227 | 0.0300 |
| 10th Grade to 12th Grade | 45.9 | 0.178 | 0.0136 | 52.04 | 0.133 | 0.0145 |

TABLE 4.9: RATIOS OF ESTIMATED GROWTH RATES IN NELS:88: COMPARISON OF IRT AND THETA, ALL STUDENTS

| Ratio | IRT Gain Per Month | IRT Effect Size Per Month | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- |
| READING | | | | |
| 10th to 12th Grade/8th to 10th Grade | 0.594 | 0.498 | 0.622 | 0.491 |
| MATHEMATICS | | | | |
| 10th to 12th Grade/8th to 10th Grade | 0.549 | 0.467 | 0.586 | 0.483 |

Race/ethnicity model

The comparison of results using IRT scale scores and theta scores in Table 4.10 is less straightforward. Looking only at effect size estimates, the estimated mean growth rates for white students (the base group, chosen because they represent the largest fraction of students) and the estimated differences of each other group from that base have similar magnitudes under the two metrics, and every difference from the base group has the same estimated sign using either IRT scale scores or theta scores.
However, it seems that black and Hispanic students compare slightly less favorably to white and Asian students in reading when using theta scores (positive coefficients measuring the difference from white students tend to be smaller, and negative point estimates tend to be larger in magnitude), and slightly more favorably in math. White students compare less favorably to Asian students when using theta scores (every positive point estimate for the difference of Asian students from white students is larger). Further research would be required to pinpoint the source of these minor differences between results using IRT scale scores and theta scores.

TABLE 4.10: ESTIMATED GROWTH RATES IN NELS:88: COMPARISON OF IRT AND THETA, BY RACE

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| WHITE STUDENTS ONLY: READING | | | | | | |
| 8th Grade to 10th Grade | 29.4 | 0.157 | 0.0205 | 48.7 | 0.161 | 0.0212 |
| 10th Grade to 12th Grade | 33.1 | 0.0885 | 0.00959 | 52.6 | 0.0959 | 0.0101 |
| WHITE STUDENTS ONLY: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 39.8 | 0.333 | 0.0299 | 47.8 | 0.231 | 0.0305 |
| 10th Grade to 12th Grade | 47.8 | 0.176 | 0.0134 | 53.3 | 0.132 | 0.0145 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -5.49 | -0.0360 | -0.00471 | -5.44 | -0.0440 | -0.00580 |
| 10th Grade to 12th Grade | -6.36 | -0.0129 | -0.00140 | -6.49 | -0.0212 | -0.00223 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | -9.13 | -0.0625 | -0.00560 | -6.61 | -0.0331 | -0.00438 |
| 10th Grade to 12th Grade | -10.6 | -0.00472 | -0.000359 | -7.41 | -0.0106 | -0.00115 |
| HISPANIC STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -4.83 | -0.0122 | -0.00160 | -4.75 | -0.0197 | -0.00259 |
| 10th Grade to 12th Grade | -5.12 | 0.0191 | 0.00207 | -5.22 | 0.0168 | 0.00178 |
| HISPANIC STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | -7.00 | -0.0329 | -0.00295 | -4.99 | -0.0131 | -0.00174 |
| 10th Grade to 12th Grade | -7.79 | 0.0168 | 0.00128 | -5.30 | 0.00594 | 0.000649 |
| ASIAN STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -0.265 | 0.0231 | 0.00302 | 0.557 | 0.0232 | 0.00306 |
| 10th Grade to 12th Grade | 0.289 | 0.0477 | 0.00516 | 1.44 | 0.0601 | 0.00634 |
| ASIAN STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 2.71 | 0.0326 | 0.00292 | 1.86 | 0.0277 | 0.00366 |
| 10th Grade to 12th Grade | 3.49 | 0.0299 | 0.00228 | 2.52 | 0.0301 | 0.00329 |

Language model

TABLE 4.11: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY LANGUAGE

Time Period Starting Level Theta Gain Per Month Effect Size Per Month 0.0199 0.00945 48.9 52.9 0.0205 0.00988 3.734 2.245 0.325 0.0292 0.176 0.0133 NON-ENGLISH AT HOME STUDENTS 46.9 52.3 0.227 0.131 0.0300 0.0143 Starting Level IRT Gain Per Month Effect Size Per Month ENGLISH AT HOME STUDENTS READING 8th Grade to 10th Grade 10th Grade to 12th Grade 28.6 32.2 MATHEMATICS 8th Grade to 10th Grade 10th Grade to 12th Grade 38.5 46.3 READING 8th Grade to 10th Grade 10th Grade to 12th Grade -3.96 -3.87 0.00368 0.0406 0.000481 0.00440 -3.54 -4.40 -0.000533 0.00482 -0.097 1.096 MATHEMATICS 8th Grade to 10th Grade 10th Grade to 12th Grade -4.01 -4.26 -0.0103 0.0314 -0.000926 0.00238 -5.86 -2.89 -0.000944 0.0224 -0.000125 0.00245 0.152 0.0873

Race/ethnicity by language model

Again, the gains per month reported in standard deviation units (effect size columns, the third and sixth columns of estimates in Table 4.12) are very similar. Even the IRT scale and theta score gains per month (reported in the second and fifth columns of estimates in Table 4.12) are very similar. Again, white students compare less favorably to Asian students when using theta scores than when using scale scores, but the clear pattern is observed only for NEH Asian students. Most differences between the estimates of IRT scale and theta score gains per month in Table 4.12 are quite small.
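One simple way to read the statement about Asian students is to count, for each language group, how many of the four effect size differences from white students are larger under theta than under IRT scale scores. The sketch below does this with the effect size entries from Table 4.12. The counting rule and the helper name `count_larger_under_theta` are assumptions made for illustration, not a procedure described in this report.

```python
# Effect sizes per month for Asian students relative to white students,
# from Table 4.12, as (IRT, theta) pairs in the order:
# reading 8th-10th, reading 10th-12th, math 8th-10th, math 10th-12th.
effect_sizes = {
    "English-speaking home": [
        (0.00288, 0.00245), (0.00402, 0.00542),
        (0.00220, 0.00313), (0.000465, 0.00151),
    ],
    "Non-English-speaking home": [
        (0.00376, 0.00385), (0.00986, 0.0113),
        (0.00277, 0.00365), (0.00547, 0.00659),
    ],
}


def count_larger_under_theta(pairs):
    """Number of periods in which the theta estimate exceeds the IRT estimate."""
    return sum(1 for irt, theta in pairs if theta > irt)


counts = {
    group: count_larger_under_theta(pairs)
    for group, pairs in effect_sizes.items()
}

for group, count in counts.items():
    print(f"{group}: {count} of 4 larger under theta")
```

Under this counting rule the pattern holds in every period for the NEH group and in three of four periods for the English-speaking home group, which is consistent with describing the pattern as clear only for NEH Asian students.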
TABLE 4.12: ESTIMATED GROWTH RATES IN NELS:88: COMPARISON OF IRT AND THETA, BY RACE AND LANGUAGE

| Time Period | IRT Starting Level | IRT Gain Per Month | IRT Effect Size Per Month | Theta Starting Level | Theta Gain Per Month | Theta Effect Size Per Month |
| --- | --- | --- | --- | --- | --- | --- |
| WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | 29.4 | 0.157 | 0.0205 | 48.73 | 0.161 | 0.0212 |
| 10th Grade to 12th Grade | 33.1 | 0.0884 | 0.00957 | 52.60 | 0.0957 | 0.0101 |
| WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 39.8 | 0.333 | 0.0299 | 47.8 | 0.231 | 0.0306 |
| 10th Grade to 12th Grade | 47.5 | 0.176 | 0.0134 | 53.3 | 0.132 | 0.0145 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -5.47 | -0.0356 | -0.00465 | -5.41 | -0.0436 | -0.00574 |
| 10th Grade to 12th Grade | -6.32 | -0.0129 | -0.00139 | -6.45 | -0.0212 | -0.00224 |
| BLACK STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | -9.11 | -0.0623 | -0.00559 | -6.60 | -0.0331 | -0.00437 |
| 10th Grade to 12th Grade | -10.6 | -0.00469 | -0.000356 | -7.39 | -0.0107 | -0.00116 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -3.17 | -0.0184 | -0.00240 | -3.13 | -0.0222 | -0.00293 |
| 10th Grade to 12th Grade | -3.61 | 0.00231 | 0.000250 | -3.66 | -0.00317 | -0.000334 |
| HISPANIC ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | -5.17 | -0.0332 | -0.00298 | -3.70 | -0.0146 | -0.00193 |
| 10th Grade to 12th Grade | -5.97 | 0.00158 | 0.000120 | -4.05 | -0.00569 | -0.000622 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -6.21 | -0.00517 | -0.000676 | -6.10 | -0.0158 | -0.00208 |
| 10th Grade to 12th Grade | -6.34 | 0.0324 | 0.00350 | -6.48 | 0.0320 | 0.00338 |
| HISPANIC NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | -8.50 | -0.0317 | -0.00284 | -6.03 | -0.0113 | -0.00150 |
| 10th Grade to 12th Grade | -9.26 | 0.0282 | 0.00214 | -6.30 | 0.0145 | 0.00158 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | 0.361 | 0.0221 | 0.00288 | 0.366 | 0.0186 | 0.00245 |
| 10th Grade to 12th Grade | 0.890 | 0.0372 | 0.00402 | 0.812 | 0.0513 | 0.00542 |
| ASIAN ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 2.82 | 0.0245 | 0.00220 | 1.88 | 0.0236 | 0.00313 |
| 10th Grade to 12th Grade | 3.41 | 0.00612 | 0.000465 | 2.45 | 0.0138 | 0.00151 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: READING | | | | | | |
| 8th Grade to 10th Grade | -1.20 | 0.0288 | 0.00376 | -1.13 | 0.0293 | 0.00385 |
| 10th Grade to 12th Grade | -0.506 | 0.0911 | 0.00986 | -0.428 | 0.107 | 0.0113 |
| ASIAN NON-ENGLISH SPEAKING HOME STUDENTS, DIFFERENCE RELATIVE TO WHITE STUDENTS: MATHEMATICS | | | | | | |
| 8th Grade to 10th Grade | 2.97 | 0.0309 | 0.00277 | 2.07 | 0.0276 | 0.00365 |
| 10th Grade to 12th Grade | 3.71 | 0.0719 | 0.00547 | 2.74 | 0.0603 | 0.00659 |

Income model

TABLE 4.13: ESTIMATED GROWTH RATES IN ECLS-K: COMPARISON OF IRT AND THETA, BY LANGUAGE

Time Period Starting Level Theta Gain Per Month Effect Size Per Month 0.0199 0.00945 48.9 52.9 0.0205 0.00988 3.734 2.245 0.325 0.0292 0.176 0.0133 NON-ENGLISH AT HOME STUDENTS 46.9 52.3 0.227 0.131 0.0300 0.0143 Starting Level IRT Gain Per Month Effect Size Per Month ENGLISH AT HOME STUDENTS READING 8th Grade to 10th Grade 10th Grade to 12th Grade 28.6 32.2 MATHEMATICS 8th Grade to 10th Grade 10th Grade to 12th Grade 38.5 46.3 READING 8th Grade to 10th Grade 10th Grade to 12th Grade -3.96 -3.87 0.00368 0.0406 0.000481 0.00440 -3.54 -4.40 -0.000533 0.00482 -0.097 1.096 MATHEMATICS 8th Grade to 10th Grade 10th Grade to 12th Grade -4.01 -4.26 -0.0103 0.0314 -0.000926 0.00238 -5.86 -2.89 -0.000944 0.0224 -0.000125 0.00245 0.152 0.0873

Race/ethnicity by income model

The gains per month reported in standard deviation units (effect size columns, the third and sixth columns of estimates) in Table 4.14 are very similar, and the IRT scale and theta score gains per month are also close. Again, white students compare less favorably to Asian students when using theta scores than when using scale scores, but the pattern is observed only for higher-income Asian students.
TABLE 4.14: LEARNING RATE DIFFERENCES ACROSS GRADE IN NELS:88: COMPARISON OF IRT AND THETA, RACE AND INCOME

Time Period                 IRT                                 Theta
                            Starting    Gain Per    Effect Size Starting    Gain Per    Effect Size
                            Level       Month       Per Month   Level       Month       Per Month

WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   30.2        0.166       0.0217      49.5        0.172       0.0226
  10th Grade to 12th Grade  34.1        0.0812      0.00880     53.6        0.0903      0.00953
MATHEMATICS
  8th Grade to 10th Grade   41.0        0.341       0.0306      48.6        0.236       0.0312
  10th Grade to 12th Grade  49.1        0.183       0.0140      54.3        0.140       0.0153

WHITE LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -2.77       -0.0311     -0.00407    -2.72       -0.0360     -0.00474
  10th Grade to 12th Grade  -3.52       0.0249      0.00270     -3.58       0.0191      0.00202
MATHEMATICS
  8th Grade to 10th Grade   -4.10       -0.0274     -0.00246    -2.84       -0.0185     -0.00244
  10th Grade to 12th Grade  -4.76       -0.0240     -0.00183    -3.28       -0.0239     -0.00261

BLACK HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -4.60       -0.0353     -0.00461    -4.58       -0.0421     -0.00555
  10th Grade to 12th Grade  -5.45       -0.00530    -0.000573   -5.59       -0.0135     -0.00142
MATHEMATICS
  8th Grade to 10th Grade   -7.74       -0.0501     -0.00450    -5.53       -0.0300     -0.00397
  10th Grade to 12th Grade  -8.94       -0.000609   -0.0000463  -6.25       -0.00600    -0.000655

BLACK LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -7.52       -0.0523     -0.00683    -7.40       -0.0421     -0.00555
  10th Grade to 12th Grade  -8.78       -0.00490    -0.000531   -8.93       -0.0135     -0.00142
MATHEMATICS
  8th Grade to 10th Grade   -12.2       -0.0851     -0.00764    -8.82       -0.0453     -0.00599
  10th Grade to 12th Grade  -14.2       -0.0187     -0.00142    -9.90       -0.0246     -0.00269

HISPANIC HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -4.56       -0.0283     -0.00370    -4.59       -0.0330     -0.00434
  10th Grade to 12th Grade  -5.23       0.00186     0.000201    -5.38       0.0000913   0.00000964
MATHEMATICS
  8th Grade to 10th Grade   -6.81       -0.0123     -0.00110    -4.87       0.000426    0.0000564
  10th Grade to 12th Grade  -7.11       0.0179      0.00136     -4.86       0.00642     0.000702

HISPANIC LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -6.50       -0.0162     -0.00212    -6.30       -0.0283     -0.00373
  10th Grade to 12th Grade  -6.89       0.0461      0.00499     -6.98       0.0402      0.00425
MATHEMATICS
  8th Grade to 10th Grade   -9.32       -0.0634     -0.00569    -6.58       -0.0333     -0.00441
  10th Grade to 12th Grade  -10.8       0.00314     0.000238    -7.38       -0.00797    -0.000872

ASIAN HIGHER INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   0.448       0.0327      0.00427     0.489       0.0342      0.00450
  10th Grade to 12th Grade  1.23        0.0419      0.00454     1.31        0.0590      0.00622
MATHEMATICS
  8th Grade to 10th Grade   3.11        0.0361      0.00324     2.16        0.0342      0.00453
  10th Grade to 12th Grade  3.97        0.0167      0.00127     2.98        0.0180      0.00197

ASIAN LOW INCOME STUDENTS, DIFFERENCE RELATIVE TO WHITE HIGHER INCOME STUDENTS
READING
  8th Grade to 10th Grade   -4.53       -0.0307     -0.00401    -4.47       -0.0382     -0.00503
  10th Grade to 12th Grade  -5.27       0.0862      0.00933     -5.39       0.0824      0.00869
MATHEMATICS
  8th Grade to 10th Grade   -2.12       -0.00283    -0.000254   -1.54       -0.00466    -0.000616
  10th Grade to 12th Grade  -2.19       0.0368      0.00280     -1.65       0.0299      0.00327

Changes in Learning Rates Over Time

The rates presented in Tables 4.1 and 4.6 suggest that learning rates decline over time, as do the ratios in Tables 4.2 and 4.9.
For example, Table 4.6 shows a much larger drop in learning rates after first grade than between kindergarten and first grade. Even in the more disaggregated results, the growth rate in second and third grades, as measured in the effect size metric, is often less than half the first grade growth rate. In comparison, the growth rates in first grade are generally much closer to or even larger than the kindergarten rates. For example, across analyses, the ratio of the first grade gain to the kindergarten gain is typically around 1.0 (in IRT scale effect sizes or theta scores), ranging from 0.812 (IRT, Asian EH) to 1.143 (theta, black low-income). This suggests that the kindergarten and first grade gains are similar in magnitude, whereas the second and third grade gains are consistently and substantially smaller than the first grade gains (thus the ratios hovering around 0.30 to 0.50). Gains in high school are substantially smaller, in standard deviation units. However, there are a number of possible explanations for this pattern, and only a small subset of them implies that students are actually learning less. See the conclusion of Chapter 3, and Chapters 1 and 6, for more on this subject.

Changes in Learning Rates and Gaps Over Time

While we cannot directly compare learning rates within a type of student across time, we can more reasonably compare the estimated gap across different types of students, and track changes in the size of this gap over time. In this section, we use estimates from Chapter 3 and the tables earlier in this chapter to construct estimates of the gap in scores at the end of each time period, scaled by the standard deviation of scores at the end of each time period (from Appendix C), to measure the gap in contemporaneous standard deviation units.
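As a rough illustration of the gap calculation described above, the following Python sketch projects a between-group difference to the end of a period and rescales it by a standard deviation. The starting difference and the difference in gain per month are the black-white reading values on the IRT scale from Table 4.12. The 24-month period length is an assumption, the function name is hypothetical, and the standard deviation is a placeholder only, since the Appendix C values are not reproduced in this section.

```python
def gap_in_sd_units(start_gap, gain_diff_per_month, months, sd_at_end):
    """Return (raw gap at end of period, gap in contemporaneous SD units)."""
    if sd_at_end <= 0:
        raise ValueError("sd_at_end must be positive")
    # Project the starting gap forward by the monthly difference in gains.
    end_gap = start_gap + gain_diff_per_month * months
    return end_gap, end_gap / sd_at_end


# Black-white reading differences, IRT scale, 8th to 10th grade (Table 4.12).
START_GAP = -5.47        # difference in starting level at the end of 8th grade
GAIN_DIFF = -0.0356      # difference in gain per month
MONTHS = 24              # assumed length of the period in months
HYPOTHETICAL_SD = 9.0    # placeholder only; the report draws this from Appendix C

end_gap, std_gap = gap_in_sd_units(START_GAP, GAIN_DIFF, MONTHS, HYPOTHETICAL_SD)
print(f"raw gap at end of period: {end_gap:.2f}")
print(f"gap in SD units (placeholder SD): {std_gap:.2f}")
```

Under the 24-month assumption, the projected raw gap is about -6.32, which agrees with the starting-level difference reported for tenth grade in Table 4.12. The standardized value depends entirely on the placeholder standard deviation and should not be read as an estimate.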
By this measure, girls are about a tenth of a standard deviation ahead of boys on Reading tests at the start of kindergarten, and about two tenths of a standard deviation ahead of boys on Reading tests at the end of third grade. We compute these gaps for the Male-Female comparison (female students measured relative to male students), the black-white comparison (black students measured relative to white students), the Low-High Income comparison (low-income students measured relative to higher-income students), and the NonEnglish-English comparison (non-English-speaking students measured relative to English-speaking students). We then graph these estimated gaps, in standard deviation units, over the ages at which individuals are expected to reach each of these time periods, in Figures 4.1 through 4.10. We can compare the size of gaps at each point in time by comparing the level of each line to the light-grey zero line (marked 0 on the ordinate, representing no difference from the reference group), or differences in learning rates by comparing the slopes of the lines. We can also compare the gaps over time, though by doing this latter comparison, we are conflating results from two different surveys, representing different populations of students. In addition, the estimates for the end of third grade are out-of-sample predictions (which can be improved on using fifth grade data available in 2006). Given all these caveats, the gaps are remarkably stable over time, and seem to indicate broad trends of interest. We focus here on a few representative findings on gender, race, income, and language.

Girls seem to take an early lead over boys on reading tests (see Figure 4.1), and to maintain that lead from the end of first grade through the end of high school. In contrast, boys gain a small advantage over girls on Math tests (see Figure 4.2) in early grades.
Boys begin high school with a lesser advantage, but make faster gains, so they finish high school more than a tenth of a standard deviation ahead. These results are quite robust whether using IRT scale scores or theta scores.

Black students start school at a large disadvantage relative to white students (Figures 4.3 and 4.4), in both Reading and Math, and fall farther behind in elementary school, and then rebound somewhat by the end of high school. Hispanic students (Figures 4.5 and 4.6) start out behind black students, more than half a standard deviation behind white students in reading, and about three quarters of a standard deviation behind white students in math, but they do not lose ground as do black students. In high school, Hispanic students gain on white students, and end high school about half a standard deviation behind on both tests. Black students do not make comparable gains on their white peers in high school, ending high school about three quarters of a standard deviation behind on both tests.

The time path of the gap between low-income students and higher-income students (Figures 4.7 and 4.8) is very similar to the path of the gap between black and white students, though low-income students start with a bigger gap and have a smaller gap by the end of high school than do black students. Both gaps are similar across Reading and Math tests, and across models using IRT scale scores or theta scores as the outcome measure.

FIGURE 4.1 READING ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, FEMALE COMPARED TO MALE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Reading Gaps, Female vs. Male Students. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Reading Theta, Reading Scale.]

FIGURE 4.2 MATH ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, FEMALE COMPARED TO MALE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Math Gaps, Female vs. Male Students. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Math Theta, Math Scale.]

FIGURE 4.3 READING ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, BLACK COMPARED TO WHITE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Reading Gaps, Black vs. White. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Reading Theta, Reading Scale.]

FIGURE 4.4 MATH ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, BLACK COMPARED TO WHITE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Math Gaps, Black vs. White. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Math Theta, Math Scale.]

FIGURE 4.5 READING ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, HISPANIC COMPARED TO WHITE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Reading Gaps, Hispanic vs. White. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Reading Theta, Reading Scale.]

FIGURE 4.6 MATH ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, HISPANIC COMPARED TO WHITE STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Math Gaps, Hispanic vs. White. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Math Theta, Math Scale.]

FIGURE 4.7 READING ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, LOW-INCOME COMPARED TO HIGHER INCOME STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Reading Gaps, Low-Income vs. Higher Income. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Reading Theta, Reading Scale.]

FIGURE 4.8 MATH ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, LOW-INCOME COMPARED TO HIGHER INCOME STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Math Gaps, Low-Income vs. Higher Income. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Math Theta, Math Scale.]

FIGURE 4.9 READING ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, NEH COMPARED TO EH STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Reading Gaps, NEH vs. EH. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Reading Theta, Reading Scale.]

FIGURE 4.10 MATH ACHIEVEMENT GAPS IN STANDARD DEVIATION UNITS, NEH COMPARED TO EH STUDENTS (BOTH SCALE AND THETA METRICS, USING ECLS AND NELS DATA)
[Figure: Math Gaps, NEH vs. EH. Vertical axis: SD units. Horizontal axis: Grade Level (Beg K, End 1, End 3, End 8, End 10, End 12). Series: Math Theta, Math Scale.]

In both data sets, the “Non-English at Home” (or NEH) variable measures whether a language other than English is spoken in the home. Among kindergartners in 1998, this is a strong predictor of limited English proficiency (LEP), but among eighth-grade students in 1988, a language other than English spoken in the home may represent a different bundle of individual characteristics. The two data sources represent distinct populations in at least two ways: first, the population of immigrants has changed over time in the US, and second, the effect of a language other than English spoken at home almost certainly changes with age, and probably has changed over the last decades. Nevertheless, the gap seems to exhibit a smooth trend over time (Figures 4.9 and 4.10), with NEH students apparently starting school half a standard deviation or more behind their peers in both Math and Reading tests, and gradually gaining until they are only about three tenths of a standard deviation behind. The population of students is not exactly comparable across the two surveys, since one survey represents students who were in kindergarten in late 1998 and one represents students at the end of eighth grade in early 1988 (implying that the two cohorts are nearly twenty years apart in age).
Given that the preceding graphs conflate estimates from many models and two different surveys representing different populations, too much credence should not be placed in any specific conclusions drawn from them. They are intended more to be an illustration of the types of results that may easily be constructed from the estimates contained in this report. Future waves of the ECLS may offer more reliable estimates in later school years, and could plausibly confirm or contradict these suggestive patterns.

Summary

Theta analyses in this chapter produced results that are generally more similar to the results in Chapters 2 and 3 that were based on effect sizes than to those based on the IRT scale scores before standardization. They offer some support for the use of effect sizes in comparing outcomes across types of students, across tests, and to a lesser extent, across time. Gaps across types of students measured in standard deviation units seem to be exacerbated in elementary school, but to diminish somewhat by the end of high school, though further research is needed to confirm this finding.

CHAPTER V: LOCALLY STANDARDIZED DIFFERENCES IN LEARNING RATES

In this chapter, we report estimates of the average locally standardized difference (LSD) in growth rates between student subgroups. This approach measures differences in learning rates in a way that is less sensitive to the details of test design than our regression analyses from Chapters 2 and 3, where IRT scores are the dependent variable.

Brief Description of Method

The locally standardized difference technique compares the average learning rates during each time period of students in different subgroups who start the time period with similar initial test scores (hence the ‘local’ feature of the estimates).
This difference in learning rates at each initial test score level is then standardized by dividing the difference by the pooled within-group standard deviation of growth rates (this pooled standard deviation, too, is estimated locally). Details of the estimation are in Appendix E. Each of these standardized estimates of learning rate differences is local to a particular baseline test score; averaging across the entire distribution of baseline scores produces an average locally standardized difference in growth rates. This estimate can be interpreted as the expected difference in test score growth rates between two students of different subgroups who have the same initial test scores, expressed as a fraction of the standard deviation of test score growth rates among students starting with this same initial score.

Advantage of Method Over Linear Models

The advantage of this method of describing differences in growth rates is that it is robust to differences across tests: regardless of what test metric the scores are reported in, we obtain virtually the same estimates of locally standardized growth rate differences. The same cannot be said of the regression growth model estimates reported in Chapter 3, as these estimates could be quite different if we used a different version of the test metric. Prior to reporting the results from this approach, we briefly describe its rationale.

Any comparison of growth rates in math and reading between different population subgroups relies on the assumption that test scores are interval-scaled, meaning that a one-point difference in scores at the high end of the test scale, for example, represents a difference in skills equivalent to that indicated by a one-point difference in scores at the low end of the scale.
Without this assumption, it is difficult to determine whether a one-point gain in average scores for a group that begins with low average scores represents a gain in skills greater than, equal to, or less than that indicated by a one-point gain in average scores for a group that starts with high average skills.

To make this more concrete, consider the following stylized example. Consider two students: student A, who can reliably add one-digit numbers but cannot subtract, multiply, or divide; and student B, who can reliably perform simple addition, subtraction, and multiplication, but not division. Suppose we administer two arithmetic tests to each student at the beginning of the school year. The first test, test I, has a lot of simple addition and subtraction items on it, and few multiplication and division items; the other, test II, has few addition and subtraction items and many simple multiplication and division items on it. Clearly student B will do better, on average, on both tests than will student A. This is illustrated in Figure 5.1: at the start of the year, students A and B are at the skill levels labeled A1 and B1, respectively. Regardless of which test we use, student B scores higher than student A.

FIGURE 5.1: COMPARING STUDENT PERFORMANCE ON TWO HYPOTHETICAL MATH TESTS
[Figure: Comparison of Two Hypothetical Arithmetic Tests. Axes: test I and test II, each marked with the skills addition, subtraction, multiplication, and division. Plotted points: A1, A2, B1, B2.]

Now suppose that student A becomes proficient in subtraction during the school year, but still cannot multiply and divide well; and suppose that student B learns to perform simple division during the school year. At the end of the school year, we administer the same two tests to each student, both of whom have been in math classes during the interval.
Student A, having learned to perform simple subtraction during the interval, performs far better on test I than at time 1, but little better on test II (since he has yet to learn to multiply and divide). In contrast, student B, because she already could add and subtract with proficiency, performs little better on test I than at time 1, but performs much better on test II (since she has learned division during the school year). In Figure 5.1, the end of the year scores of students A and B will correspond to the points A2 and B2, respectively.

So which student has learned more during the interval between tests? As a practical matter, the answer will depend on which test we give. According to test I, which weights subtraction skills more heavily than division skills, student A ‘learned more’ (i.e., showed a greater gain in test score points) than student B during the school year. According to test II, however, which weights division skills more than subtraction, student B ‘learned more.’ And which test we decide to report depends on whether we think learning to subtract represents a greater increase in math skills than does learning to divide. While there may be arguments that resolve this dilemma on content grounds, test scores are often reported without specific detail about the content and relative weight of the test items. Instead, the test is simply assumed to be an interval-scaled measure of students’ skills in a broad domain, and the uncertainty about whether the comparison of gain scores is meaningful is not noted because only a single test metric is generally available.

This stylized example suggests that any comparison of achievement growth rates is intricately tied up with the metric of the test used to measure achievement. This issue has significant consequences for the interpretation of growth rates in achievement, since the dependence of the inferences on the test metric calls into question any conclusions one might draw from a single test.
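The stylized comparison of students A and B can be made concrete with a small Python sketch. The item counts for tests I and II are hypothetical, chosen only so that test I emphasizes addition and subtraction and test II emphasizes multiplication and division. The scoring rule (every item on a mastered skill is answered correctly, and no others) is likewise a simplifying assumption, not something the report specifies.

```python
# Hypothetical item counts per skill on each test (illustrative, not from the report).
TEST_ITEMS = {
    "I": {"addition": 40, "subtraction": 40, "multiplication": 10, "division": 10},
    "II": {"addition": 10, "subtraction": 10, "multiplication": 40, "division": 40},
}

# Skills each student has mastered at the start and at the end of the school year.
SKILLS = {
    "A": ({"addition"}, {"addition", "subtraction"}),
    "B": (
        {"addition", "subtraction", "multiplication"},
        {"addition", "subtraction", "multiplication", "division"},
    ),
}


def score(test, mastered):
    """Score = number of items on the test that cover a mastered skill."""
    return sum(n for skill, n in TEST_ITEMS[test].items() if skill in mastered)


def gains():
    """Return {test: {student: end-of-year score minus start-of-year score}}."""
    return {
        test: {
            student: score(test, end) - score(test, start)
            for student, (start, end) in SKILLS.items()
        }
        for test in TEST_ITEMS
    }


if __name__ == "__main__":
    for test, by_student in gains().items():
        leader = max(by_student, key=by_student.get)
        print(f"test {test}: gains {by_student}, larger gain: student {leader}")
```

With these illustrative weights, student A shows the larger gain on test I and student B shows the larger gain on test II, even though student B outscores student A on both tests at both time points.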
This leaves us with a bit of a problem: we want to know whether different subgroups have different learning rates, but our answer depends on the generally unjustifiable assumption that the test metric is interval-scaled. Moreover, in a study like ECLS-K, where test scores are available in several metrics, we will get different answers (as was evident in Chapter 4) depending on which metric we use, a highly unsatisfactory result. So how are we to address this issue?

Returning to the example above, consider as well a third student C, who begins the school year knowing how to add, but not to subtract, just as does student A (at the skill level indicated by A1). Student C, however, learns not only subtraction, but also multiplication and division over the course of the school year, and so at the end of the school year has arithmetic skills equivalent to student B (at B2). Now, regardless of which test we use, it will be clear that student C has learned more than student A. Moreover, this will be evident on any test of arithmetic skills which has a monotonic relationship to tests I and II. The important point here is that, because students C and A start with the same skills, any test that ranks multiplication and division skills higher than subtraction skills will lead us to conclude that student C learned more than student A during the school year. This insight suggests that we could make unambiguous comparisons of growth rates if we restrict our comparisons to students with the same initial scores. We use such an approach in this chapter.

Limitations of the LSD Method

While the locally standardized difference method has the advantage of producing results that are much less dependent on the assumption that test scores are interval-scaled measures of achievement, it is not without its own drawbacks. First, the method assumes that initial scores are measured without error; this is, of course, unlikely to be true.
The consequence of measurement error in the test scores is twofold: 1) it will bias estimates of the difference in growth rates upward (away from zero); and 2) it will bias estimates of the local standard deviation upward. Since the locally standardized difference is the ratio of these two, the two biases will tend to at least partially cancel one another out, though not necessarily completely. So the resulting estimates may be biased, though the direction of bias is not clear. Second, the method is less familiar, and hence its results are perhaps less readily interpretable. In brief, the average locally standardized difference is the estimated average difference in growth rates between two students from different subgroups who start with the same initial test score, expressed in terms of the standard deviation of growth rates among individuals starting with the same initial scores. Third, the method does not make full use of the longitudinal nature of the data.

Locally Standardized Difference Estimates

Gender

Reading

Growth rate differences in reading by gender are presented in Table 5.1. Over the course of kindergarten through third grade, female students develop reading skills, on average, at a rate one-tenth of a standard deviation faster than male students who start kindergarten with similar initial reading skills. The reading growth differences between males and females are most pronounced in kindergarten, and are small or nonexistent in the later grades.

Math

In contrast to reading patterns, over the course of kindergarten through third grade, female students develop math skills, on average, at a rate roughly one-quarter of a standard deviation slower than male students who start kindergarten with similar initial math skills.
There is no significant math learning rate difference between males and females in kindergarten, but growth rates begin to diverge slightly in first grade and then diverge substantially in second and third grade, so that during the period from the end of first grade to the start of third grade, female students’ math learning rates are, on average, one-fifth of a standard deviation slower than those of male students who have similar math skills at the end of first grade.

TABLE 5.1: DIFFERENCES IN READING AND MATH LEARNING RATES BY GENDER, ELEMENTARY SCHOOL

Time Period             Reading (R2 Metric)     Reading (Theta Metric)  Math (R2 Metric)        Math (Theta Metric)

FEMALE-MALE LOCALLY STANDARDIZED LEARNING RATE DIFFERENCE
  Overall K-3rd         0.104                   0.105                   -0.269                  -0.270
  During K              0.113                   0.124                   -0.043                  -0.037
  Summer K-1st          0.052                   0.054                   -0.040                  -0.029
  During 1st            0.017                   0.034                   -0.092                  -0.087
  After 1st, into 3rd   0.065                   0.061                   -0.211                  -0.209

These analyses are based on all students in the ECLS-K who had valid scores at the two waves corresponding to the endpoints of each period. Estimates in bold are significantly different from zero at the 5 percent level. Descriptions of models are provided in Appendix E.

Race/Ethnicity

Reading

Table 5.2 indicates substantial differences in learning rates in reading skills among race/ethnic groups. Over the course of kindergarten through third grade, black students develop reading skills, on average, at a rate almost two-thirds of a standard deviation slower than white, non-Hispanic students who start kindergarten with similar initial reading skills. The reading growth differences between black and white students are most pronounced in second and third grade, but are substantial through kindergarten and first grade as well.
Over the course of kindergarten through third grade, Hispanic students develop reading skills, on average, at a rate about one-seventh of a standard deviation slower than white, non-Hispanic students who start kindergarten with similar initial reading skills. Note, however, that these estimates are based only on the 70 percent of Hispanic kindergarten students sufficiently proficient in English to be assessed in reading in kindergarten. The estimated reading growth differences between Hispanic and white students are not significantly different from zero in the kindergarten year (but again, these are based only on English-proficient Hispanic kindergartners), but are moderately large in first, second, and third grades. These differences across grades may be attributable, in part, to the fact that the later estimates are based on a larger population of Hispanic students, since more were proficient enough by first grade to take the reading assessments.

In the primary grades, Asian students develop reading skills, on average, at about the same rate as those of white, non-Hispanic students who start kindergarten with similar initial reading skills. As above, these estimates are based only on the 77 percent of Asian kindergarten students sufficiently proficient in English to be assessed in reading in kindergarten. The estimated reading growth differences between Asian and white students are large and positive in kindergarten, but large and negative in second and third grade. As above, these differences across grades may be attributable, in part, to the fact that the later estimates are based on a larger population of Asian students, since more were proficient enough by first grade to take the reading assessments.

Math

Over the course of kindergarten through third grade, black students develop math skills, on average, at a rate roughly one-half of a standard deviation slower than white students who start kindergarten with similar initial math skills.
In kindergarten, first grade, and second and third grades, the difference in growth rates between black and white students is roughly one-quarter to one-third of a standard deviation, while during the summer between kindergarten and first grade the difference in growth rates is much smaller.

Estimates of Hispanic-white differences in math learning rates show a somewhat puzzling pattern. Over the full course of kindergarten through third grade, Hispanic and white students who start kindergarten with similar initial math skills show no significant difference in learning rates. However, in each of the school years from kindergarten through third grade, Hispanic students learn math skills at a rate roughly one-seventh of a standard deviation slower than white students who start with the same level of math skills at the start of the grade. This discrepancy arises in part because the standard deviation of growth rates varies across the range of initial scores, the difference in growth rates also varies across that range, and the resulting shifts in the distribution of Hispanic students’ scores relative to white students’ scores mean that the average growth rates are computed over different parts of the score distribution at each wave.

Over the course of kindergarten through third grade, Asian students develop math skills, on average, at a rate slightly greater than white, non-Hispanic students who start kindergarten with similar initial math skills. As above, however, these estimates are based only on the 77 percent of Asian kindergarten students sufficiently proficient in English to be assessed in math in kindergarten. The estimated math growth differences between Asian and white students during kindergarten and the K-1 summer are similar (about a tenth of a standard deviation higher for Asian students).
In first grade, however, Asian students learn math skills at a rate a fifth to a quarter of a standard deviation slower than white students who start first grade with comparable skill levels, while in second and third grade they learn faster than comparable white students. As above, these differences across grades may be attributable, in part, to the fact that the later estimates are based on a larger population of Asian students, since more were proficient enough in English to take the math assessments in the later grades.

TABLE 5.2: DIFFERENCES IN READING AND MATH LEARNING RATES BY RACE/ETHNICITY, ELEMENTARY SCHOOL

Time Period             Reading (R2 Metric)     Reading (Theta Metric)  Math (R2 Metric)        Math (Theta Metric)

BLACK-WHITE LOCALLY STANDARDIZED LEARNING RATE DIFFERENCE
  Overall K-3rd         -0.629                  -0.637                  -0.485                  -0.503
  During K              -0.21                   -0.237                  -0.33                   -0.322
  Summer K-1st          -0.111                  -0.113                  -0.136                  -0.114
  During 1st            -0.272                  -0.244                  -0.252                  -0.237
  After 1st, into 3rd   -0.587                  -0.586                  -0.327                  -0.332

HISPANIC-WHITE LOCALLY STANDARDIZED LEARNING RATE DIFFERENCE
  Overall K-3rd         -0.136                  -0.145                  -0.042                  -0.052
  During K              -0.002                  -0.001                  -0.134                  -0.139
  Summer K-1st          -0.049                  -0.027                  -0.06                   -0.054
  During 1st            -0.183                  -0.166                  -0.162                  -0.145
  After 1st, into 3rd   -0.34                   -0.341                  -0.134                  -0.149

ASIAN-WHITE LOCALLY STANDARDIZED LEARNING RATE DIFFERENCE
  Overall K-3rd         -0.084                  -0.064                  0.104                   0.055
  During K              0.288                   0.305                   0.06                    0.12
  Summer K-1st          0.138                   -0.122                  0.114                   0.172
  During 1st            -0.058                  0                       -0.264                  -0.194
  After 1st, into 3rd   -0.298                  -0.284                  0.152                   0.113

These analyses are based on all students in the ECLS-K who had valid scores at the two waves corresponding to the endpoints of each period. Estimates in bold are significantly different from zero at the 5 percent level. Descriptions of models are provided in Appendix E.
Note that substantial numbers of Hispanic and Asian students did not take the reading test in the early waves (and substantial numbers of Asian students did not take the math test in early waves) because of limited English proficiency. Thus the differences in growth rates for these groups estimated for the early periods are based on a smaller (and more skilled) population than those estimated for the later periods.

Economic Status

Reading

Table 5.3 presents the growth rate differences by economic status. Over the course of kindergarten through third grade, higher-income students develop reading skills, on average, at a rate roughly two-fifths of a standard deviation faster than low-income students who start kindergarten with similar initial reading skills. The reading growth differences between higher- and low-income students are most pronounced in second and third grade, but are substantial through kindergarten and first grade as well.

Math

Similarly, over the course of kindergarten through third grade, higher-income students develop math skills, on average, at a rate roughly one-quarter of a standard deviation faster than low-income students who start kindergarten with similar initial math skills. As with reading development, the growth differences between higher- and low-income students are most pronounced in second and third grade, but are substantial through kindergarten and first grade as well.
TABLE 5.3: DIFFERENCES IN READING AND MATH LEARNING RATES BY ECONOMIC STATUS (ELEMENTARY SCHOOL)

Time Period            Reading       Reading          Math          Math
                       (R2 Metric)   (Theta Metric)   (R2 Metric)   (Theta Metric)

LOWER-INCOME - HIGHER-INCOME LOCALLY STANDARDIZED LEARNING RATE DIFFERENCE
Overall K-3rd          -0.39         -0.385           -0.234        -0.240
During K               -0.145        -0.150           -0.136        -0.148
Summer K-1st           -0.223        -0.231           -0.171        -0.170
During 1st             -0.17         -0.174           -0.159        -0.160
After 1st, into 3rd    -0.42         -0.415           -0.316        -0.316

These analyses are based on all students in the ECLS-K who had valid scores at the two waves corresponding to the endpoints of each period. Estimates in bold are significantly different from zero at the 5 percent level. Descriptions of models are provided in Appendix E.

Summary

This chapter uses a different analytic approach from the previous three chapters, yet draws similar conclusions. For example, as presented in Chapter 3, the gender gap favors girls in reading and boys in math. Black students’ reading and math test scores start behind and lag behind those of white students on average. The black-white achievement gap is widest when measured during the school year and narrowest when measured over the summer. Asian students exceed white students’ learning in math and keep pace in reading. As also seen in Chapter 3’s results, low-income students score lower than higher-income students in both math and reading, and this gap widens until it is most pronounced in second and third grades.

Where the results diverge, and where this chapter’s analyses offer an advantage, is in potentially better estimates of differences in growth rates in second and third grade between certain subgroups. The rescaling of the ECLS-K assessments may affect the subgroups differently, and results for the most recent round are the most likely to be affected by rescaling issues (see Appendix B for details).
The figures in Chapter 4 illustrate that the estimated differences in growth rates in second and third grade and the levels at the end of third grade seem to diverge somewhat from our other estimates, and they will likely be much changed by the addition of fifth grade data. The LSD method offers the potential to produce estimates more robust to rescaling, in the sense that they may change less when fifth grade ECLS data are added to the analysis, and newly rescaled scores are used.

Using the growth models, differences in achievement gains are estimated to consistently shrink during the second and third grade time period. However, using the LSD method, the differences in growth rates are most pronounced during second and third grades, as would be expected in light of the figures from Chapter 4. The race, language status, and income status gaps all become much larger between the end of first grade and the end of third grade. Yet the estimated gaps in high school are more comparable with the estimated gaps at the end of first grade. In the locally standardized analyses, the gaps during the second and third grade time periods are greater than in previous grades, which suggests a more linear expansion in the differences. This expansion seems even greater than what the high school data imply exists in the later grades. Either the gaps legitimately widen in late elementary school and then return to earlier levels in the middle school years, or these larger estimated gaps at the end of third grade represent bias in the estimated gap trajectory due to rescaling and other data-specific issues. Once subsequent rounds of ECLS data are made available, the resolution to this question may become clear.

In these locally standardized analyses, Hispanic students learn reading at the same rate as white students in kindergarten. Differences in rates of achievement gain in reading emerge in first grade, and the difference nearly doubles during second and third grades.
However, in the growth curve analyses, Hispanic students, from either low-income or higher-income backgrounds or from English-speaking or non-English speaking households, start behind and their gains continue to lag behind those of white students (see Table 3.3). The difference is most pronounced during first grade and decreases dramatically during second and third grades.

In addition, the findings in this chapter confirm the convergence of the IRT and theta results. Using either the IRT or the theta scores in these locally standardized difference analyses yields highly similar findings. In the analysis by income, the estimates would be the same if rounded to the second significant figure. Larger discrepancies between the theta and the IRT score results emerge in the analysis by race, where the difference ranges from 0.01 to 0.03 in magnitude.

CHAPTER VI: DISCUSSION AND IMPLICATIONS

This report examines achievement gains by U.S. students in the early elementary school grades and in the high school grades. Our findings quantify the gains students make, on average, across one and two-year time spans in elementary and high school. We also quantify the differences in achievement, and differences in achievement gains, across subgroups defined by gender, race, family income, and English-speaking homes.

These estimated achievement gains serve as important benchmarks for researchers and policymakers. They set a context of how much ground the average student makes in reading or in math in a given year. Researchers planning interventions need to know what magnitude of effect is realistic to expect within this context. Policymakers who interpret findings from such interventions to determine funding require such a context to deem a program effective or not.
Estimates broken down by subgroups allow researchers and policymakers to more accurately predict the performance of students in settings where the distribution of students’ demographic characteristics does not conform to national averages. Our measures of how much students gain over different time periods, and how the differences across different types of student are magnified or minimized over time, will also inform the debate over where educational resources should be targeted. Parents, educators, administrators, policymakers, and researchers may find suggestive evidence that directly bears on questions about what kids should be expected to learn over time, and how and when some students fall behind others. The report may raise many questions, as well, and in this chapter, we both summarize the findings and discuss some possible implications and directions for future research.

Achievement Gains Over Time

In general, our findings indicate that students appear to make greater gains on reading and math tests in elementary school than in secondary school. On reading and math assessments, kindergartners and first-graders gain one and a half to two standard deviations per year. Over the two-year time span between the end of first grade and the end of third grade, children move nearly two standard deviations up the distribution of first-grade scores, for both reading and math scores, indicating a rate of gain just over half that observed in the first two years. Over a two-year time span in high school between the end of eighth grade and the end of tenth, students move about six tenths of a standard deviation up the distribution of eighth-grade scores in math and about four tenths of a standard deviation up the distribution of eighth-grade scores in reading.
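The per-year rates implied by these figures can be worked out with simple arithmetic. The short Python sketch below is an illustration only: it rounds "nearly two" standard deviations to 2.0, treats each two-year span as exactly two years, and uses variable names that are ours rather than the report's.

```python
# Illustrative arithmetic only; all inputs are the rounded figures quoted in the text.
# Assumption: "nearly two" SD over grades 2-3 is rounded to 2.0 for simplicity.

def per_year(total_gain_sd: float, years: float) -> float:
    """Convert a total gain in standard deviation (SD) units to a per-year rate."""
    return total_gain_sd / years

early_low, early_high = 1.5, 2.0      # K and 1st grade: SD gained per year
grades_2_3 = per_year(2.0, 2)         # end of 1st to end of 3rd grade
hs_math = per_year(0.6, 2)            # end of 8th to end of 10th grade, math
hs_reading = per_year(0.4, 2)         # end of 8th to end of 10th grade, reading

# High school rates relative to the second/third grade rate
ratio_math = hs_math / grades_2_3
ratio_reading = hs_reading / grades_2_3

print(f"Grades 2-3: {grades_2_3:.2f} SD per year")
print(f"Grades 9-10 math: {hs_math:.2f} SD per year ({ratio_math:.2f} of the grades 2-3 rate)")
print(f"Grades 9-10 reading: {hs_reading:.2f} SD per year ({ratio_reading:.2f} of the grades 2-3 rate)")
```

Under these rounding assumptions, the high school rates come out at roughly a fifth to a third of the second and third grade rate; using a value slightly below 2.0 for the second and third grade gain pushes both ratios up a little.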
The implied learning rate is thus one quarter or one third as fast during ninth and tenth grades as in second and third, and the implied learning rate in eleventh and twelfth grades is half that of ninth and tenth grades.

There are a number of plausible explanations for the apparent slowdown of learning. First, the reading and math assessments include questions about concepts and skills that may be taught less in later grades, so students may be learning more social studies and science, rather than reading or basic math. Second, the underlying variation in math and reading skills may increase over time, so that gains expressed in standard deviation units appear smaller relative to the variation in the population, but we argue throughout the report that some type of standardization is necessary to compare gains made during different time periods, or from different initial levels. Finally, it may be that there are decreasing returns to instruction, and more students learn at a lower rate once they have learned most of the material taught prior to high school (so they are on the “flatter” part of their individual learning curves). Notice that the last explanation is the only one we have considered that implies slower learning in later grades.

If we accept a standardized measure of progress in reading and math as a reasonable proxy for overall learning, we find a dramatic slowdown in learning that begins earlier than previously thought. A recent report from the Fordham Foundation (Yecke and Finn 2005) generalized that children “do reasonably well” in elementary school but falter in middle school and by high school, their academic performance is weak. Our analyses here support the observations about high school, since we find a relatively quick pace in reading and math gains early in schooling and a slower pace later in schooling. However, our findings indicate that students may begin to falter earlier than middle school.
From some of our estimates, it seems that children gain less in second and third grades than in kindergarten and first grade, but forthcoming ECLS-K data from the fifth grade assessment should help clarify the pattern of math and reading learning rates over time.

Gaps in Achievement Gains, and Gaps in Achievement

The performance of the average student masks large differences between students who differ by gender, race and ethnicity, native language, or family income. We focus on these categories because they have been explored in prior research, and because the size and mean performance of these groups are usually easily identifiable in school or regional data, which gives our results maximal utility for policymakers, researchers, and educational administrators. The gap between groups is of interest in its own right, as well, and better measures of gaps in achievement or learning at different points in time may help better direct educational resources.

The gender gap in reading seems to be in place before school begins. Girls start kindergarten about one to two tenths of a standard deviation ahead of boys on reading assessments and the gap favoring girls does not widen or close substantially by the end of third grade. Girls seem to be about two tenths of a standard deviation ahead of boys on reading tests throughout high school. In math, the gender gap takes quite a different form. Boys and girls enter kindergarten with roughly the same score on a math assessment. But boys gain faster than girls on math tests throughout the early years of school, so that boys are about a tenth of a standard deviation ahead in third grade (boys are about a tenth of a standard deviation ahead throughout high school as well). Past work with small datasets that are not nationally representative suggests that the middle school years are when gender gaps in math achievement emerge (Eccles; Midgley).
However, recently released data from the TIMSS study suggest the gender gap in math achievement starts early in America, and our results confirm this finding.

Achievement gaps by ethnicity, by language status, and by income have made national headlines in light of the No Child Left Behind Act of 2001. The black-white achievement gap has attracted attention for years, but No Child Left Behind also focuses attention on Hispanic children, children who speak English as a second language, and economically disadvantaged children. Their achievement differences compared to white students, English speakers, and economically more advantaged students are now estimated and publicized. This report presents achievement gain differences among these student subgroups.

Black students start kindergarten about six tenths of a standard deviation behind white students on math tests, and about four tenths of a standard deviation behind white students on reading tests. These same students are about three quarters of a standard deviation behind by third grade on both tests, so they have lost ground (learned at a slower rate) relative to white students. Hispanic students, on the other hand, start farther behind than black students on average, but gain faster. On math tests, Hispanic students start kindergarten about three quarters of a standard deviation behind white students, but are only six tenths of a standard deviation behind white students by third grade. On reading tests, Hispanic students start kindergarten about six tenths of a standard deviation behind white students, and are about three quarters of a standard deviation behind white students by third grade. Black students are about eight tenths of a standard deviation behind white students in high school math tests, whereas Hispanic students are about six tenths of a standard deviation behind white students. Both black and Hispanic students are about half a standard deviation behind white students in high school reading tests.
In short, black students start school slightly behind, but have slower growth in scores than other groups, so wind up severely disadvantaged by high school, even compared to other disadvantaged groups.

Asian students are a minority group for which the achievement gap with white students is reversed. Asian students begin kindergarten with a significant advantage in reading, which is no longer evident by third grade. Asian students begin high school at similar achievement levels to white students on reading tests, but gain slightly faster, so they are more than a tenth of a standard deviation ahead by the end of high school. On the math assessment, white and Asian students start kindergarten essentially even, and remain at similar levels through the end of third grade. Asian students begin high school two or three tenths of a standard deviation ahead of white students on math tests, on average, and gain faster than white students throughout high school.

Differences by language status matter to achievement as well. Students from households where English is not the first language start kindergarten about half a standard deviation behind other students in reading and in math. In math, the gap shrinks significantly in both our elementary and high school samples, and the gap is only a quarter of a standard deviation at the end of high school. In reading, students from households where English is not the first language make no substantial gains on other students during elementary school, but the gap is smaller at the start of high school, and shrinks to about three tenths of a standard deviation by the end of high school. As students from non-English households spend more time in school and with peers and teachers who speak English, their academic disadvantage diminishes.
Non-native speakers may not only improve their language skills from immersion in a predominantly English-speaking environment but also feel more comfortable asking for clarification in an ever more familiar environment. The narrowing of the gap during elementary school is more evident in math than in reading. It is unclear why this would be the case, and it is possible that some portion of this result is due to sample selection that arises from the exclusion from the reading test of Spanish speakers who fail a placement exam. As previously excluded sample members pass the placement exam and take the reading test, these students may reduce the mean performance of students from non-English households in the latter grades of elementary school.

Differences by subgroup membership are often attributed to differences in home resources. And differences by economic status are consistently wide. In both reading and math, students from economically disadvantaged backgrounds start kindergarten with a significant, substantial disadvantage compared to their more advantaged peers. This gap is more than half a standard deviation through elementary school, and about half a standard deviation throughout high school. It is not clear whether low-income students are gaining between third grade and eighth grade, since the comparability of the data between the elementary and high school samples is subpar for income and poverty measures.

Compared to other large gaps in achievement, such as the black-white gap, the language status gap, and the gender gap, the economic status gap is quite large at the start of kindergarten, and is of comparable size to the gap between Hispanic and white students. Only the Hispanic students who speak Spanish at home are at a greater disadvantage.
But over time, Hispanic students who speak Spanish at home make substantial gains relative to white students, and low-income students seem to keep pace with higher income students, while black students fall farther and farther behind. By the end of high school, the black-white gap is much larger than all other achievement gaps.

These analyses by individual subgroup raise the question of what happens when students not only are economically disadvantaged but also belong to a minority group that does not make as much reading and math gain as white students. Indeed, black low-income students comprise the only subgroup for whom reading and math achievement gaps consistently widen, from elementary school through high school.

Most fascinating and perplexing, these analyses show the limits of home financial resources to academic achievement. Black higher-income students begin kindergarten with average test scores nearly the same as white higher-income students, but fall behind in subsequent years of schooling. Hispanic higher-income students start kindergarten behind white and black higher-income students, but their gap compared to white higher-income students does not widen as much as the black-white gap. White, black, and Hispanic low-income students all start with deep deficits in achievement compared to white higher-income students and continue to lose ground by gaining less in eighth through tenth grades. However, only white low-income students continue to face widening gaps later in high school and make significantly slower average gain in math. There are several possible explanations for this finding, including selection as white students exhibit lower dropout rates, or the higher proportion of black and Hispanic students who fall into the low-income category, which may imply that the low-income category includes students of higher mean ability among black and Hispanic students than white students.
Methodological Caveats

There are several caveats to interpreting this work that must be noted. First, the use of achievement tests to capture what students know is not straightforward. Tests are more sensitive to design issues than intuition would suggest, even after sophisticated manipulation based on Item Response Theory. Virtually any desired change in estimates of learning rates could be generated by an appropriate reweighting of the test toward items of certain difficulty levels. We have focused throughout the report on measuring gains or differences in achievement across students in terms of standard deviations, and used several methods to make these comparisons, because in this way we can get a sense of the range of possible estimates, and reach more robust conclusions. However, the conclusions are dependent on measures of variability as well as growth rates and gap estimates, and are still only as good as the tests on which they are based.

Second, our results, which generalize across all students in the nation, may not be robust to inclusion of other explanatory factors, such as school characteristics (e.g., racial/ethnic composition of schools). Future analyses, discussed in a subsequent section, will include more factors that might influence student learning.

Despite these limitations, the results may be interpreted as the estimated achievement gains made by all students and as the estimated learning differences among specific subgroups defined by such characteristics as ethnicity and income status. None of the caveats mentioned in the report overwhelm the usefulness of our findings as benchmarks for how much elementary school and high school students learn.

Policy Recommendations

It is clear from the analytic results that the achievement gaps more frequently discussed in relation to later grades are already present at the start of kindergarten and in first grade.
Most achievement gaps observed at school entry seem to remain at similar magnitudes over time, except the black-white gap, which widens over time. Achievement gaps that exist in third grade among students who were kindergartners in 1998 often seem similar to those observed at the start of high school in 1988. Addressing these gaps early may be the key to closing them. Results in this report suggest that efforts should be targeted towards younger children, when gaps tend to be smaller, and when growth rates seem highest.

Indeed, policymakers have endeavored for decades to provide preschool interventions to give economically disadvantaged children a boost before they start school. Widespread federal interventions such as Early Head Start and Head Start begin before school begins so that achievement gaps may be narrowed before students reach kindergarten. Our analyses indicate that achievement gaps already exist by kindergarten, and often widen most for the most disadvantaged subgroups. Critics of federal preschool programs, e.g., Head Start, cite relatively weak, ephemeral impacts on academic performance; however, Head Start attempts to compensate for more than a shortage of cognitive or academic resources among disadvantaged children. The program’s emphasis on social development and health practices may be valuable, but may have less impact on the academic disadvantage with which low-income or minority students begin school.

Our results suggest that more cognitively- and academically-focused interventions should begin earlier when learning appears to be faster and the subsequent payoff potentially richer. Getting disadvantaged students to the point of recognizing beginning and ending sounds to words by the fall of kindergarten, for example, could help reduce the gap in initial status on reading assessments.
Continuing intensive cognitive stimulation during school, whether through tutoring services or through after-school programs, might help economically disadvantaged students further. Such interventions should be targeted to specific populations and at specific timepoints. Targeted students should be those who most clearly need a boost before they begin school and those who fall behind during school—these are not necessarily the same students. The results of this report play their most significant and crucial role for researchers and policymakers who wish to understand the effectiveness of interventions. Our results set benchmarks against which to evaluate the estimated impacts of educational interventions. Young students can make on average gains of about 2 standard deviations in reading during kindergarten and during first grade and in second and third grades combined. An intervention that improves achievement from kindergarten to first grade by less than a tenth of a standard deviation may not provide a sufficiently effective improvement to students’ learning. And since the black-white achievement gap in math increases by about 0.40 standard deviations per academic year, an intervention that improves black students’ scores by 0.05 standard deviations per year will do little to narrow that achievement gap substantially. These analyses also suggest that testing children in academic subjects is possible before the third grade, when the federal education programs require initial assessment. In light of our findings, unearthing achievement gaps in third grade may be too late to change course and close gaps. Assessing achievement in kindergarten and first grade is possible, and there is interesting, crucial variation in both achievement and learning rates to study. Closing achievement gaps between subgroups is not the only policy focus suggested by these results. During school, the early years should be acknowledged as a time of fast cognitive growth. 
This time demands sufficient parental and instructional support to make sure children can make the great leaps in reading and math skills that our results indicate are possible. Well-prepared teachers, intensive interventions, and programs to increase parental participation in cognitive activities at home all should be considered as factors that may bolster the achievement gains made in elementary school.

Future Work

In the 2006 calendar year, the fifth-grade data from the ECLS-K study should be made available. From these data, we can learn what happens to children’s learning and to the gaps in learning rates as students transition out of elementary school and into middle school. The current work shows achievement gains and differential growth rates in elementary school and in high school. The future fifth-grade data can provide a bridge between the two analyses.

Future work will consider how differences in learning trajectories may derive from differences in school characteristics. The importance of school context cannot be ignored, and we plan to construct three-level hierarchical linear models to determine the extent to which achievement gains vary by school.

One of the significant contributions of this work is aligning achievement gains to actual skills that students gain. This helps to make test scores tangible and comprehensible to a lay audience. Gains in points mean less in interpretation than gains in the actual reading and math skills the assessments are designed to test. Our next analyses will examine the proportions of children that achieve at or above certain benchmarks, such as mastering problem-solving in math or comprehension of complex reading passages.

Future analyses will also examine the evolution of variation in test scores, both for all students, and any differences among the relevant subgroups.
Especially given the use of standard deviation estimates to standardize gaps and growth rates, changes in the variance in test scores across individuals could produce equally large observed changes in estimated learning rates as any plausible intervention in schooling. Thus, we plan to characterize the evolution of variance, and the influence of a variety of predictors of changes to the variance in test scores across individuals over time.

Conclusion

This report presented the average achievement gains in reading and math made by nationally representative samples of elementary school students and of high school students. Achievement gains are estimated for all students in the sample and then are disaggregated by subgroup membership, including by gender, ethnicity, language status, income status, and the intersections of ethnicity and language status and, very importantly, of ethnicity and income status. These achievement gains provide benchmarks for researchers, program designers, and policymakers to understand the amount of learning to be expected in these grades. Estimates should provide a useful context to compare impact estimates of extant programs, to interpret children’s progress in skill attainment, to design future experimental studies, and to develop and target future interventions.

These are good estimates based on the data available to us now. Once the fifth grade ECLS data, collected in the spring of 2004, are analyzed, estimates using newly rescaled test scores from that wave may differ substantially from these estimates, particularly for the period following first grade and extending into third grade, but potentially for all elementary school grades.

REFERENCES

Burkam, D.T., Ready, D.D., Lee, V.E., & LoGerfo, L.F. (2004). Social-class differences in summer learning between kindergarten and first grade: Model specification and estimation. Sociology of Education, 77(1), 1-31.

Downey, D.B., Broh, B.A., & von Hippel, P.T. (2004). Are schools the great equalizer? Cognitive inequality during the summer months and the school year. American Sociological Review, 69(3), 613-635.

Fryer, R.G., & Levitt, S.D. (2004). Understanding the Black-White test score gap in the first two years of school. Review of Economics and Statistics, 86(2), 447-464.

Goldhaber, D. & Brewer, D. (2000). Does teacher certification matter? High school teacher certification status and student achievement. Educational Evaluation and Policy Analysis, 22, 129-145.

Jencks, C. & Phillips, M. (1998). The Black-White Test Score Gap. Washington, DC: Brookings.

Lee, V.E., & Burkam, D.T. (2002). Inequality at the Starting Gate: Social Background Differences in Achievement as Children Begin School. Washington, DC: Economic Policy Institute.

Nye, B., Konstantopoulos, S., & Hedges, L.V. (2004). “How large are teacher effects?” Educational Evaluation and Policy Analysis, 26(3), 237-257.

Reardon, Sean. (2003). “Sources of educational inequality: The growth of racial/ethnic and socioeconomic test score gaps in kindergarten and first grade.” Population Research Institute Working Paper 03-05R.

Sanbonmatsu, Lisa, Jeffrey R. Kling, Greg J. Duncan, and Jeanne Brooks-Gunn. “Neighborhoods and Academic Achievement: Results from the Moving to Opportunity Experiment.” NBER Working Paper No. 11909 (available at nber.org).

Yecke, Cheri Pierson & Chester E. Finn, Jr. (2005). Mayhem in the middle: How middle schools have failed America. Washington, DC: Thomas B. Fordham Foundation.

APPENDIX A: DETAILS OF DATA AND METHODS

This section briefly describes the methodology for the report. It opens by discussing how the analytic sample differs from the survey sample for the ECLS-K and NELS:88 data. The variables included in the models are described next, followed by a concise description of the analytic method.

ECLS-K

The ECLS-K study was constructed with a multi-stage, probability sample design (NCES, 2000).
The primary sampling units were geographic areas, from which 1,280 public and private schools offering kindergarten in the base year were sampled. NCES sampled schools to ensure a nationally representative sample of public, Catholic, other religious, and private schools. In each sampled public school, between twenty and thirty kindergartners were selected to participate, and in private schools at least twelve kindergartners were selected. Private schools were oversampled, as were Asian and Pacific Islander children, to facilitate comparative analyses (NCES, 2000).35

Missing Data

Of the ECLS-K sample with 21,399 children, 6,152 do not have any test score data and are excluded from our analyses. These students differ in several ways from the students in the analytic sample. The analytic sample has a larger proportion of white children and a smaller proportion of black children than expected from the survey sample. There are more non-English-speaking children missing from the analytic sample than would be expected under an assumption of randomly missing data. Significantly fewer children from the lowest SES quintile and more children from the highest SES quintile are in the analytic sample than expected.

Effect of Missing Scores for Non-English Speakers

Hispanic and Asian children are much more likely than white, black, or other children to fail the oral language proficiency screening exam (oral language development scale, or OLDS), and therefore are more likely to be missing scores in earlier rounds of testing. The missing scores may lead us to overstate the true average achievement in early rounds (particularly at the beginning of kindergarten) for Hispanic and Asian children and to underestimate average learning rates, but the estimated gap between Hispanic or Asian children and white children at the end of grade 3 is much less likely to be biased, since all children passed the OLDS placement exam in the fifth round of testing.
Children who ever failed the OLDS placement exam perform worse on average by every measure (they have lower starting points and learn at a slower rate) than those who never failed. Looking only at children who never failed the OLDS placement exam, the estimated gap between Hispanic children and white children at the end of grade 3 is only 59 percent as big (8.36 points instead of 14.14 points). The average achievement of Asian students who never failed the OLDS placement exam is indistinguishable from that of white children at the end of grade 3 (compared to a nearly 5 point gap when including children who failed the OLDS test at some point in time). Excluding children with missing test scores in early periods could therefore lead to much smaller estimates of gaps in achievement.

35 In the third round of data collection, at the fall of first grade, only a subsample of about 3,400 children were surveyed and tested. To create this subsample, NCES drew a 30 percent equal probability subsample of ECLS-K schools and included all the base-year responding children in those schools.

Hispanic Students (difference in IRT reading scores from White Students)

Time Point                                   All students    Never failed OLDS
Beginning of Kindergarten                    -5.22           -3.87
End of Kindergarten                          -6.97           -4.39
End of Summer after K-Beginning of 1st       -6.33           -3.56
End of 1st Grade                             -12.6           -7.54
End of 3rd Grade                             -14.14          -8.36

Asian Students (difference in IRT reading scores from White Students)

Time Point                                   All students    Never failed OLDS
Beginning of Kindergarten                    1.09            2.54
End of Kindergarten                          2.59            5.57
End of Summer after K-Beginning of 1st       3.83            7.46
End of 1st Grade                             0.81            5.65
End of 3rd Grade                             -4.88           -0.520

NELS:88

Like ECLS-K, NELS:88 used a two-stage sampling procedure. In the first stage, 815 public schools and 237 private schools were selected with probabilities proportional to their eighth grade enrollment. Twenty-six students were then randomly sampled from each school in the second stage.
However, the study oversampled Hispanic students and Asian students to allow for valid comparisons across racial groups. In the first follow-up (1990) and second follow-up (1992), students who were on time in the correct grade were in tenth and twelfth grades, respectively.

Missing Data

There are small to moderate amounts of missing data on students’ test scores. These data are not missing at random (MAR), and our weights do not adjust for the pattern (Allison 2002; Little and Rubin 2002). Approximately 3 percent of students are missing test scores for the base year (grade 8), 3 percent for the first follow-up (grade 10), and 18 percent for the second follow-up (grade 12). On average, students with low achievement are more likely to be missing test scores than students with high achievement.36 In addition, low-income students and black students are slightly more likely to have missing data on test scores than higher-income and white students, respectively. The non-random nature of the missing data results in a loss of statistical power and possible bias in parameter estimates. Growth curve models, which use all available test scores in estimating parameters, address this problem.

The analyses discussed in the text exclude students who dropped out of high school and were retained at some point during their schooling. A cross-tabulation indicates that the overwhelming majority of students who were retained are the same students who leave high school without completing a degree. Of those who drop out, 96.1 percent are retained; of those who are retained, 78.1 percent drop out. By twelfth grade, 807 students dropped out of high school at some point37 and have a reading test score, and 1,127 students were retained at some point and have a reading test score. Of the dropouts, 801 have a math score, and of the students ever retained, 1,123 have a math score.

36 Inspection of plots of kernel density functions reveals this pattern.
Results are available from the authors upon request.
37 Some students who dropped out may have returned to school, but we do not know this information.

We ran analyses that include the dropouts to see how much their inclusion changes our results. The average reading score at the end of eighth grade is about a third of a point less with the dropouts included than with the more select group of students. A tenth of a point separates the learning rates between the two samples, with the inclusive sample making less gain in tenth grade. In twelfth grade, the learning rates are very similar; the difference in gain (measured in effect size per month) is 0.00022 SD and favors the sample that includes the dropouts. These differences do not strike us as problematic.

TABLE A.3: READING GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988, EXCLUDING DROPOUTS

Time Period                 Gain Per Month    Effect Size Per Month    Gain Per Period    At End of Period
Before High School                                                                        28.25
8th Grade to 10th Grade     0.152             0.0199                   3.66               31.91
                            (0.0033)
10th Grade to 12th Grade    0.0903            0.00978                  2.17               34.07
                            (0.0042)

READING GAINS FOR STUDENTS IN EIGHTH GRADE IN 1988, INCLUDING DROPOUTS

Time Period                 Gain Per Month    Effect Size Per Month    Gain Per Period    At End of Period
Before High School                                                                        27.94
8th Grade to 10th Grade     0.146             0.0191                   3.50               31.44
                            (0.0035)
10th Grade to 12th Grade    0.0925            0.0100                   2.22               33.66
                            (0.0042)

Note: Standard errors for the estimated coefficients are presented in the first column of the table in parentheses below the corresponding coefficient. All coefficients are significantly different from zero. To calculate effect sizes, we divide the gain per month by the estimated standard deviation of the base-period test at the start of each time period.

Variables Used

This section describes the key variables that are included in the analytic models. We discuss first the outcome measures and second, the demographic background variables and adjustments for differences in school exposure.
Dependent Variables

In ECLS-K, cognitive assessments were administered to children one-on-one by trained NCES staff. In each round, children from a language minority background first took a screening test called the Oral Language Development Scale (OLDS). If children did not pass this test but spoke Spanish, they could not take the reading test but could take a Spanish translation of the math assessment. If children demonstrated sufficient proficiency in English on the OLDS at any point, they then took both the reading and math assessments.

Each assessment included two stages. First, children answered the same 12 to 20 reading and mathematics questions whose difficulty levels varied widely. This first stage provided a rough estimate of children’s achievement level. Children’s performance on these test questions in each domain determined the next stage of the assessment. In the second stage, the tests included items of low, medium, and high difficulty. The routing process tried to ensure that children were given questions in this second stage that were appropriate to their level of cognitive development and more precisely measured their achievement (NCES, 2000).

The cognitive assessments in NELS:88 followed a similar approach. However, rather than use a routing test at the same assessment time, NCES used each student’s eighth-grade scores to tailor the appropriate level of difficulty for the follow-up test in tenth grade (NCES, 1995). The tenth-grade performance then influenced the form of the test for the twelfth-grade follow-up. There were two forms of the NELS:88 reading tests (low and high difficulty) and three forms of the math tests (low, average, and high difficulty). Each successive grade level form included additional, more difficult items to address students’ presumably improving skills. The adaptive approach to testing in both ECLS-K and NELS:88 reduces the chance of ceiling and floor effects.
Reading

Reading questions tapped basic skills, from letter recognition and the link between letters and sounds to vocabulary and reading comprehension. Because more children than expected performed close to the ceiling on the spring kindergarten reading assessment, NCES increased the number and difficulty of questions in first grade (NCES, 2002). Changes between the first and third grade rounds included adding more advanced questions about literal inference, extrapolation, and evaluation.

The NELS:88 reading tests posed questions about reading passages that varied in length from a single paragraph to a half-page. The tests measured skills in reading comprehension, literal inference, and critical evaluation, which represent an extension of the skills tapped by the ECLS-K tests. The high difficulty form was differentiated from the low difficulty form by including more complex texts taken from social studies and science.

Math

Math questions tapped skills in conceptual knowledge, procedural knowledge, and problem solving. Items range from asking children to identify numbers to solving simple multiplication and division problems. The assessment included the same number and difficulty of questions in kindergarten and first grade; more difficult items were added in third grade. These additional items measure skills in geometry and spatial sense, data analysis, probability and statistics, and basic algebraic functions (NCES, 2004).

At the secondary level, the NELS:88 10th and 12th grade math tests had three forms of difficulty. All levels of the test tapped skills in arithmetic, similar to the skills found on the ECLS-K tests. The average and high difficulty level tests tapped skills in algebra and geometry. The high difficulty level tests included precalculus questions and/or analytic geometry questions.

Weights

Precision weights

The precision weight accounts for differences in the quality of information across observations at each wave.
Imprecise estimates are weighted down, while more precise estimates are weighted more heavily. This precision weight is constructed by multiplying the variance in the test scores (IRT scale score or theta, depending on the analysis) at each wave by the inverse of the reliability for the theta scores, which is the best available approximation of the test reliability (NCES, 2005, p. 85).

Individual weights

We created analytic weights for each child included in the level-2 part of the hierarchical linear models. In ECLS-K, the analytic weights are derived from the school weight in the base year (SCHBSW0), a flag indicating whether students participated in rounds 1 and 2 (R12SC_F0), and the within-school child weight (WS_CWGT), all pulled from the restricted data files. These variables are then multiplied together. To check this measure, we compared the resulting weight with the child weight at round 2. As expected, they are highly correlated.

In NELS:88, we use the panel weight for students included in all three rounds of data collection. We weight our sample to ensure that it is representative of students across the country who were eighth graders in 1988. The weight F2PNLWT is applied, which weights the base year through second follow-up panel. The weights that we use in our analyses reflect not only the unequal probabilities of sampling but also non-response adjustments.

Student Characteristics

All analyses use demographic data from the base year data collection.

Gender. Males and females may learn at different rates in elementary and secondary school. In both datasets, gender is represented by a dummy variable.38

Ethnicity. This report focuses on different learning rates across four ethnic subgroups: White, Black, Hispanic, and Asian.39

Language status. The models also account for subgroups with substantial fractions of non-English households (Hispanic and Asian).
In ECLS-K, families reported whether English is the primary language spoken at home. If not, their children took the oral language screening test to determine English proficiency. In NELS:88, students reported whether English is the predominant language spoken in the household. Estimates by language status are analyzed separately from analyses that focus on the following subgroups: Hispanic in English-Speaking Homes, Hispanic in Non-English-Speaking Homes, Asian in English-Speaking Homes, and Asian in Non-English-Speaking Homes.40

Income status. We adjust for differences in children’s family resources by focusing on the gap in learning between low-income and higher-income children. We characterize as low-income those children whose family’s income-to-needs ratio41 is less than 1.85 (the same figure often used to determine eligibility for the reduced price lunch program [NAS, 2000]).

Differences in school exposure: testing time gaps. In large-scale studies, students cannot all be administered cognitive assessments at the same time. Assessments typically occur over a span of at least two and sometimes four months. This means that students have different levels of school exposure before they take their assessments. Because school exposure is positively correlated with test performance, models must adjust for these differences. We model test scores as a function of the time before a given assessment. In other words, each time parameter measures from the beginning of the school year to the date of the test,42 both of which are individually variable.43 Thus by including these time measures, the models account for the variable amount of time spent in each grade, or school exposure. The coefficients for these parameters represent the amount of learning during that grade or time period.

38 For ECLS-K and NELS:88 analyses, we use a dummy variable coded 1 for female and 0 for male derived from the GENDER variable.
39 From ECLS-K, we use the measure WKRACETH. Children in the American Indian, Native Hawaiian, and Mixed Race subgroups are combined under one category of “Other” to estimate mean gains. Estimates of expected gains in this “Other” group will have much higher errors, because the children do not perform in consistently similar ways on the achievement tests. These findings are not the focus of our report. The same applies to the NELS:88 findings for Native Americans (from RACE). These students are not included in sufficient numbers to present reliable results, so we do not focus on them.
40 In ECLS-K, we used the variable WKLANGST, and in NELS:88, we used the variable BYS22.
41 The income-to-needs ratio is defined as pre-tax income divided by the poverty threshold for the particular family size, based on census data. For ECLS-K, we use the variables WKINCOME and P1HTOTAL and the poverty thresholds for 1998. For NELS:88, we use the variables BYFAMINC, BYFAMSIZ, and the poverty thresholds for 1988.
42 In NELS:88, there are no dates for the start and end of school years. We assumed June 1, 1988 for the end of eighth grade and June 1, 1990 for the end of tenth grade.

Differences in school exposure: kindergarten program. This report explores how much children learn over a given year. In kindergarten, children enrolled in full-day kindergarten are exposed to more schooling than children enrolled in half-day kindergarten. Thus if we assume that school contributes significantly to student learning, then these full-day kindergartners may make greater gains in reading and math. These greater gains may emerge not only in kindergarten when the difference in school exposure is most immediate but also in subsequent grades if full-day kindergartners are better prepared for later academic success.
To test this hypothesis, we re-ran two of the models with a sample of only half-day kindergartners and a sample of only full-day kindergartners, then compared the results. Setting aside the comparability of gains in IRT scores across grade levels (discussed in detail in Chapter 4 and Appendix B), some of the higher estimated gains in first grade relative to kindergarten may arise from the different types of instruction provided in kindergarten and from differing lengths of exposure. Forty-four percent of the estimation sample was enrolled in half-day kindergarten, and we might expect these students to learn less in kindergarten and more in first grade. Table A.1 describes the two groups of kindergartners, and Table A.2 presents results for the two groups. Indeed, students enrolled in full-day kindergarten are more than two IRT scale points ahead of their counterparts in half-day kindergarten, on average, at the end of kindergarten, but the two groups are indistinguishable by the end of first grade. We do not distinguish these groups for analysis, but acknowledge the slight differences in learning.

43 For example, for assessments in kindergarten, the amount of time in the first and third grades equals zero. This changes as the assessment time changes, so that by third grade the times in kindergarten and first grade are set (those grades are already completed and spanned a fixed amount of time, approximately 286 days, including weekends). However, the time in third grade before the assessment does not equal a full year, because at the third grade assessment, third grade is not yet completed.
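The exposure variables described in footnote 43 can be sketched in a few lines. This is only an illustration under stated assumptions: the calendar dates in `PERIODS`, the period names, and the function `exposure_days` are hypothetical and are not taken from the ECLS-K files, where school start and end dates vary by child.

```python
from datetime import date

# Hypothetical calendar for one child (assumed dates, for illustration only).
PERIODS = [
    ("kindergarten", date(1998, 9, 1), date(1999, 6, 14)),
    ("summer", date(1999, 6, 14), date(1999, 9, 1)),
    ("first_grade", date(1999, 9, 1), date(2000, 6, 14)),
    ("second_third_grade", date(2000, 6, 14), date(2002, 6, 14)),
]


def exposure_days(test_date):
    """Days spent in each period as of test_date.

    A period that has not started yet contributes 0 days; a completed
    period contributes its full length; the current period contributes
    only the days elapsed before the test.
    """
    days = {}
    for name, start, end in PERIODS:
        elapsed = (min(test_date, end) - start).days
        days[name] = max(0, elapsed)
    return days


# A fall-kindergarten test has zero exposure to every later period.
fall_k = exposure_days(date(1998, 11, 1))
# A spring third-grade test has full exposure to the completed periods
# and partial exposure to the period still under way.
spring_3 = exposure_days(date(2002, 4, 1))
print(fall_k)
print(spring_3)
```

With these assumed dates, the completed kindergarten year spans 286 days, in line with the approximate figure given in the footnote, while the last period is shorter than its full length because the test precedes the end of third grade.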
TABLE A.1: SAMPLE SIZES, BY KINDERGARTEN PROGRAM 44

                                    Full-day Kindergarten    Half-day Kindergarten
                                    n=11,012                 n=8,784
Male                                5,537                    4,527
Female                              5,432                    4,220
Missing Gender                      43                       37

White                               5,560                    5,204
Black                               2,364                    618
Hispanic                            1,769                    1,782
Asian                               558                      733
Other                               677                      396
Missing Race                        84                       51

English Speaking Home (EH)          9,190                    6,924
Non-English Speaking Home (NEH)     1,219                    1,393
Missing EH Status                   603                      467
Hispanic Non-EH                     777                      858
Hispanic-EH                         899                      836
Asian Non-EH                        280                      429
Asian-EH                            212                      183
Missing Race*EH                     625                      470

Low Income                          4,254                    2,954
Higher Income                       6,758                    5,830
White-Low Income                    2,116                    1,390
White-High Income                   3,444                    3,814
Black-Low Income                    1,355                    340
Black-High Income                   1,009                    278
Hispanic-Low Income                 946                      1,039
Hispanic-High Income                823                      743
Asian-Low Income                    158                      241
Asian-High Income                   400                      492
Other-Low Income                    357                      149
Other-High Income                   320                      247
Missing Race*Economic Status        84                       51

44 There are 1,603 students (7.49 percent of the sample) missing data on the variable that indicates full-day or half-day kindergarten.

TABLE A.2: IRT READING SCORES AT END OF PERIOD, BY LENGTH OF KINDERGARTEN SCHOOL DAY

                                          Full-day Kindergarten    Half-day Kindergarten
Beginning of Kindergarten                 22.98749                 23.00523
End of Kindergarten                       41.35577                 38.93086
End of Summer after K-Beginning of 1st    40.83949                 38.73071
End of 1st Grade                          69.90634                 70.25758
End of 3rd Grade                          110.2182                 109.0318

Results presented in these tables suggest that full-day kindergartners start on par with their peers in half-day kindergarten programs, but learn slightly more during kindergarten. In subsequent years, children who had full-day kindergarten gain less in reading than children who attended half-day kindergarten. In models that interact full-day kindergarten with income status, a full-day kindergarten program seems to help alleviate academic challenges associated with low-income status during kindergarten but shows no advantage in first through third grades.
Low-income students in full-day programs and in half-day programs start kindergarten with similar deficits on the reading assessment compared to high-income students. In kindergarten, the low-income full-day kindergartners do not lag behind high-income students as much as the low-income half-day kindergartners. Perhaps the extra time in school compensates slightly for the lack of resources at home. But in first grade, low-income children regardless of kindergarten program face nearly the same deficit in reading gains compared to their high-income peers. In second and third grades, the low-income children who were in half-day programs actually catch up in reading gains to their high-income peers, but the children from full-day kindergarten programs make slightly less gain.

These differences might derive from other characteristics associated with the selection of full-day versus half-day kindergarten. Black children are much more likely to participate in full-day kindergarten programs than in half-day kindergarten programs. Black students in full-day programs do not start kindergarten as far behind white full-day kindergartners as do black and Hispanic students in half-day programs. In kindergarten, black and Hispanic students in half-day kindergarten gain slightly but significantly less in reading than white and black students in full-day programs. Hispanic students in full-day kindergartens make similar gains in reading to their white peers who were enrolled in full-day programs.

In sum, children in full-day kindergartens gain more during kindergarten than those in half-day kindergartens, but this advantage is offset later in elementary school by faster learning among children who participated in half-day kindergartens.

Analytic Methods

This section outlines the analytic approach and explains the multiple metrics in which results are reported to facilitate interpretation.
To calculate average gain in reading and math across grades, we estimate the relationship between test scores and time spent in each grade. One set of models estimates students’ test scores at the beginning of elementary school and their learning rates from kindergarten until third grade. Another set of models estimates students’ test scores at the beginning of high school and their learning rates from that time point forward.45

45 The unique challenge in working with the ECLS-K data is the rescaling of earlier test performance based on subsequent test scores, an issue that is fully explained in Appendix B.

Elementary school

The elementary school analyses model test scores as a function of five time parameters: before kindergarten entry, kindergarten, summer between kindergarten and first grade, first grade, and third grade. By the end of third grade, time spent in kindergarten and in first grade equals, on average, 286.64 days or 9.5 months (each grade). The shortest time parameter represents the summer between kindergarten and first grade: 78.36 days, or about 2.5 months. Time spent between the end of first grade and the third grade assessment date averages 691.28 days or 23.04 months.

Secondary school

The secondary school analyses follow the same procedure but use three time components: before high school or eighth grade, tenth grade, and twelfth grade. In NELS:88 most test dates occurred within a small time frame of about 2 to 4 months (standard deviations equal to 1 and 2) and in regular intervals of about 24 months.46

46 Exact test dates are not available for the base year of NELS:88 but are available for the first and second follow-ups. In order to calculate the elapsed time between tests, we impute the base year test date with the median test date for the base year of April 1, 1988. If the test dates were missing for the first or second follow-ups, we imputed them with the median test dates of March 20, 1990 and February 27, 1992, respectively.

Model Specification

Seven models are constructed for estimating learning in reading and mathematics:
1. A model with only the time parameters
2. A model with a dummy variable for female on the initial status and each of the learning slopes
3. A model with dummy variables for race/ethnicity subgroup membership on the intercept and each of the learning slopes
4. A model with a dummy variable for language status on the initial status and each of the learning slopes
5. A model with dummy variables for low-income/higher-income status on the intercept and each of the learning slopes
6. A model with dummy variables for ethnicity on the initial status and each of the learning slopes, with ethnicity broken out by language minority status
7. A model with dummy variables that interact race/ethnicity with income status on the initial status and each of the learning slopes

The first model estimates average reading and math gains for the typical student. Models 2 through 7 illustrate differences in these gains by subgroup membership. In all models, initial status and learning in each time period are allowed to vary randomly across students.

Metrics for results

Findings are reported in four metrics, each of which should facilitate interpretation and is explained in the following section.

Points

Learning or growth rates are reported in points per day, per month, and per time period. These metrics represent the most easily understood and familiar approach. But what do points mean exactly? To understand what a 1.5-point gain per month actually means, we relate the scores to specific skills in the accompanying graphs.

Effect sizes

We also present findings for differences in learning rates in units of standard deviations, or effect sizes. Effect sizes measure the magnitude of a relationship and can be compared across tests with different point ranges. For example, the magnitude of the achievement gap found between white and black students in elementary school can be compared to the same gap in secondary school when results are presented in effect sizes. Effect sizes may be a more familiar outcome metric for researchers prepared to create experimental designs.

We divide average growth rates in each period by the standard deviation of scores on the assessment at the start of that period. This makes expected gains comparable across different test designs. In calculating effect sizes, we could have used the standard deviation for a specific group or the standard deviation for all students. We chose the latter, so that the standard deviation used is comparable and consistent across analyses. We can provide ratios to convert the standard deviation across all students to the standard deviation for a specific group.

Proficiency index

The point where a student’s score corresponds to a 50 percent probability of mastery of a topic or skill set is the ability level at which children are learning the topic at the fastest rate. We refer to this type of proficiency as the current level of achievement for a student with this score. For example, a child in the ECLS-K study with a math score of 43.84 (in the rescaled version of the test as of third grade) has a 50 percent chance of being proficient on the topic labeled “ADD/SUBTRACT.” From this, we can plausibly say that students with scores in the vicinity of 44 are learning to add and subtract. The next such level, “MULTIPLY/DIVIDE,” occurs at an IRT score of 67.32, so students in the vicinity of 67 are learning to multiply and divide.
Gradations of ability between these two milestones (in the range from 44 to 67 points) cannot be tied to specific named skills, but the milestones offer a means to measure increases in an essentially arbitrary test score metric using familiar concepts. The following table provides the key to converting test scores to proficiency scores.

TABLE A1. SKILLS BEING LEARNED AT SPECIFIED IRT SCORE LEVELS: ECLS-K
(Assumes 50% Proficiency Level Corresponds to Point of Maximal Learning Speed)

Math Skills
IRT Score    Proficiency Type
10.05        1-COUNT, NUMBER, SHAPE
18.71        2-RELATIVE SIZE
28.46        3-ORDINALITY, SEQUENCE
43.84        4-ADD/SUBTRACT
67.32        5-MULTIPLY/DIVIDE
91.29        6-PLACE VALUE
104.41       7-RATE & MEASUREMENT

Reading Skills
IRT Score    Proficiency Type
21.41        1-LETTER RECOGNITION
30.73        2-BEGINNING SOUNDS
36.08        3-ENDING SOUNDS
51.03        4-SIGHT WORDS
68.87        5-WORD IN CONTEXT
91.63        6-LITERAL INFERENCE
112.98       7-EXTRAPOLATION
124.59       8-EVALUATION

TABLE A2. SKILLS BEING LEARNED AT SPECIFIED IRT SCORE LEVELS: NELS:88
(Assumes 50% Proficiency Level Corresponds to Point of Maximal Learning Speed)

Reading Skills
IRT Score    Proficiency Type
15.59        1-COMPREHENSION, INCLUDING LEVEL OF DETAIL
30.65        2-SIMPLE INFERENCES AND UNDERSTAND ABSTRACT CONCEPTS
43.30        3-COMPLEX INFERENCE AND EVALUATE JUDGMENTS

Math Skills
IRT Score    Proficiency Type
22.82        1-SINGLE OPERATIONS WITH WHOLE NUMBERS
37.24        2-FRACTIONS, DECIMALS, POWERS, AND ROOTS
46.21        3-SIMPLE PROBLEM SOLVING
57.73        4-INTERMEDIATE LEVEL MATH CONCEPTS
73.55        5-MULTI-STEP PROBLEM SOLVING AND ADVANCED MATH

Linear models

To account for different rates of learning across students, we construct growth curve models, with growth varying by time period and done in piecewise fashion (e.g., in ECLS-K, Fall-K to Spring-K; Fall-1 to Spring-1, etc.). Models are two-level hierarchical models, with testing times nested within students. Level-1 represents testing times, with analyses weighted by precision weights to account for measurement error.
Level-2 represents individual students, weighted to ensure generalizability of the sample (the inverse of the probability of being selected for the sample). We report findings from these models with robust standard errors. The model’s equations are:

Level 1

\[ Y_{ti} = \pi_{0i} + \pi_{1i} a_{ti} + e_{ti} \]

$Y_{ti}$ = observed status at time $t$ for individual $i$
$\pi_{0i}$ = growth trajectory parameter for subject $i$ at time 0
$a_{ti}$ = amount of time passed at time $t$ for person $i$
$e_{ti}$ = error term

Level 2

\[ \pi_{0i} = \beta_{00} + \sum_{q} \beta_{0q} X_{qi} + r_{0i} \]
\[ \pi_{1i} = \beta_{10} + \sum_{q} \beta_{1q} X_{qi} + r_{1i} \]

$\pi_{0i}$ = initial status at time 0, constant term
$\pi_{1i}$ = growth rate for person $i$ over the time period; the expected change during this time
$\beta_{0q}, \beta_{1q}$ = the effect of $X_q$ on the corresponding growth parameter
$X_{qi}$ = an individual background measure (e.g., gender, race/ethnicity)
$r_{0i}, r_{1i}$ = random effects with mean of 0, assumed to be normally distributed

These hierarchical models allow the initial level and learning rate of each student to have a common component and an individual random component. Ignoring this feature, the models are essentially a linear regression of scores on the lengths of time spent in various parts of the educational system at the point the score is measured. The estimated constant in such a model is the initial score when entering kindergarten or at the end of eighth grade, and the coefficients on time variables are growth rates of scores in points per day during the relevant span of time.

We compared estimates constructed in these hierarchical linear models using the twice-rescaled versions of the reading and math tests from the fifth round of data (but excluding the fifth round) to those constructed using the once-rescaled versions of the reading and math tests from the fourth round of data, and found no substantial differences.
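The simplified view described above, a linear regression of scores on time spent in each period, can be sketched as follows. This is a hedged illustration, not the hierarchical model used in the report: it omits the random effects, the precision weights, and the sampling weights. The data are invented, the function names (`solve`, `fit_piecewise_growth`, `effect_size_per_month`) are hypothetical, and the 30-day month is an assumption chosen to match the day-to-month conversions used elsewhere in this appendix.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_piecewise_growth(exposures, scores):
    """Ordinary least squares of scores on an intercept plus days per period.

    Returns (initial_status, slopes), with slopes in points per day.
    """
    rows = [[1.0] + list(e) for e in exposures]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    coef = solve(xtx, xty)
    return coef[0], coef[1:]


def effect_size_per_month(slope_per_day, sd_at_start, days_per_month=30.0):
    """Monthly gain divided by the score SD at the start of the period."""
    return slope_per_day * days_per_month / sd_at_start


# Invented, noise-free data: (days in period 1, days in period 2) per test.
EXPOSURES = [(0, 0), (20, 0), (45, 0), (690, 0), (710, 0), (730, 0),
             (730, 680), (730, 700), (730, 725)]
SCORES = [28.0 + 0.005 * d1 + 0.003 * d2 for d1, d2 in EXPOSURES]

initial, slopes = fit_piecewise_growth(EXPOSURES, SCORES)
print(initial, slopes)
print(effect_size_per_month(slopes[0], sd_at_start=7.5))
```

Because the invented scores contain no noise, the fit recovers the intercept and the two daily growth rates used to generate them. The standard deviation of 7.5 passed to `effect_size_per_month` is likewise an invented value.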
Locally standardized regressions

We estimate gains for different subgroups using only individuals with base year scores close to a particular point, then normalize the gains by the pooled standard deviation of gains (again using only individuals with base year scores close to that point). These estimates are constructed at many points across the entire distribution of base year scores, always dividing by the standard deviation of gains. This method is described more extensively in Appendix D.

APPENDIX B: RESCALING ISSUES

Rescaling conducted in later rounds of data collection substantially changes any tabulations of mean point gains for many students. This issue applies uniquely to our ECLS-K estimations. Since rescaling is conducted in each new round, our tables measuring gains made in kindergarten in 1998 could change in 2006 when data on fifth grade are released, and again each time a new round of survey data is made available. Thus, the gains at a far-prior point in time could become a moving target, subject to data collection conducted in future years. NELS:88 data were not rescaled in each round, so these concerns apply only to the ECLS-K data.

Each assessment in ECLS-K and NELS:88 includes more items than participants actually answered. NCES then scales the results using item response theory (IRT). IRT uses information about student ability from the questions students did answer, together with item characteristics (difficulty level, discrimination power, and likelihood of guessing the correct answer), to estimate how many questions students would have answered correctly if they had responded to all possible questions. Some overlapping questions among test versions allow estimators based on IRT to place all student scores on the same scale (horizontal equating).
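The three item characteristics named above (difficulty, discrimination, and guessing) are the parameters of a three-parameter logistic item response function. The text does not state the exact functional form NCES used, so the following is an assumed, textbook form, with notation introduced here for illustration: $\theta_i$ is the ability of student $i$, and $a_j$, $b_j$, $c_j$ are the discrimination, difficulty, and guessing parameters of item $j$.

```latex
% Probability that student i answers item j correctly (assumed 3PL form)
P_j(\theta_i) = c_j + \frac{1 - c_j}{1 + \exp\left[-a_j(\theta_i - b_j)\right]}

% Estimated number correct had the student been given all J items
\hat{S}_i = \sum_{j=1}^{J} P_j(\hat{\theta}_i)
```

Under this form, the scale score is a sum over the items in a given pool, so adding or removing items changes the score a student with a given $\theta$ receives. Some presentations include a scaling constant inside the exponent; it is omitted here.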
When test scores measure the estimated number of questions a student would have answered correctly if administered all items on the test, the assumption that test scores are interval-scaled is difficult to justify, since the responsiveness of the test score to changes in skill levels depends heavily on the density of test items at different difficulty levels. For example, in ECLS-K, children’s performances on the math tests in the fall and spring of first grade are scored two different ways: the estimated number of items a student would have gotten correct had she or he been given all 64 items on the “first grade version” of the test (this is called the R1 test score); and the estimated number of items a student would have gotten correct had she or he been given all 124 items (the 64 first grade items and 60 additional items) on the “third grade version” of the test (this is called the R2 test score). Because the additional 60 items on the R2 test measure higher-level math skills than most of those on the R1 test, the R2 version of the test score is based on a test with many more “difficult” items than the R1 version of the test (Figures B1 and B2 below). As a result, if we examine the difference in the fall-spring gain in test scores between two students (one who starts with low-level math skills, and one who starts with high-level skills), the magnitude and direction of this difference may not be the same if we use the R1 or R2 version of the test.
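The dependence on item density can be illustrated with a small Python sketch. It assumes a three-parameter logistic response model and entirely hypothetical item parameters and student abilities (none are taken from ECLS-K); the names `expected_number_correct`, `short_test`, and `long_test` are introduced here for illustration only. Two students each improve by the same amount on the ability scale, and the sketch compares their number-correct gains on a shorter test and on a longer test that adds harder items.

```python
import numpy as np

def expected_number_correct(theta, a, b, c):
    """Expected number-correct score under an assumed 3PL model."""
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    return float(np.sum(p))

# Hypothetical item parameters: common discrimination and guessing values.
a, c = 1.5, 0.2
easy_items = np.linspace(-3.0, 0.0, 20)    # difficulties on the shorter test
hard_items = np.linspace(0.25, 3.0, 30)    # harder items added to the longer test
short_test = easy_items
long_test = np.concatenate([easy_items, hard_items])

def gain(theta_fall, theta_spring, difficulties):
    """Fall-to-spring gain in expected number-correct score."""
    return (expected_number_correct(theta_spring, a, difficulties, c)
            - expected_number_correct(theta_fall, a, difficulties, c))

# Hypothetical students: both gain one unit of ability between fall and spring.
low = (-2.0, -1.0)
high = (0.5, 1.5)

# Difference in point gains (low starter minus high starter) on each test.
gap_short = gain(*low, short_test) - gain(*high, short_test)
gap_long = gain(*low, long_test) - gain(*high, long_test)
print(f"low-minus-high gain gap, shorter test: {gap_short:.2f}")
print(f"low-minus-high gain gap, longer test:  {gap_long:.2f}")
```

With these made-up parameters the low-starting student gains more points on the shorter test, while the high-starting student gains more points on the longer test, so the sign of the difference in gains reverses even though the underlying ability gains are identical.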
FIGURE B1: DISTRIBUTION OF READING ASSESSMENT ITEM DIFFICULTY

[Figure: ECLS-K Reading Assessment Item Difficulty Distributions, Benchmarked Against Proficiency Levels. Horizontal axis: Item Difficulty (Theta metric), from -3 to 3. Series: K-1 test items (R1 scale); K-3 test items (R2 scale). Proficiency levels marked: extrapolation, evaluation, literal inference, words in context, sight words, ending sounds, letter recognition, beginning sounds.]

*Note: the height of the distribution at each point indicates the relative number of items at a given difficulty level on each test. Since the K-3 test includes all 93 items on the K-1 test, plus an additional 61 items, the relative item volume on the K-3 test is everywhere greater than (or equal to) the item volume on the K-1 test. For example, there are the same number of items measuring ending sounds on both versions of the test, but more than twice as many literal inference questions on the K-3 test as the K-1 test.

FIGURE B2: DISTRIBUTION OF MATH ASSESSMENT ITEM DIFFICULTY

[Figure: ECLS-K Math Assessment Item Difficulty Distributions, Benchmarked Against Proficiency Levels. Horizontal axis: Item Difficulty (Theta metric), from -3 to 3. Series: K-1 test items (R1 scale); K-3 test items (R2 scale). Proficiency levels marked: rate & measurement, place value, multiply-divide, add-subtract, ordinality-sequence, relative size, count-number-shape.]

*Note: the height of the distribution at each point indicates the relative number of items at a given difficulty level on each test. Since the K-3 test includes all 64 items on the K-1 test, plus an additional 60 items, the relative item volume on the K-3 test is everywhere greater than (or equal to) the item volume on the K-1 test. For example, there are the same number of items measuring relative size on both versions of the test, but more than twice as many multiplication/division questions on the K-3 test as the K-1 test.
Comparing R0 scores (IRT scores without any subsequent rescaling, available only for kindergarten) and R1 scores (IRT scores rescaled based on K and first grade tests) suggests little difference in the distributions. But the difference between R0 or R1 and R3 scores is quite sizeable, at least for the reading test. NCES added more questions with a relatively high guessability factor (many students could guess the correct answer) to the reading test than to the math test. In addition, few students were scoring near the highest level on the math test, unlike on the reading test. So adding more difficult items to the math test did not elicit more correct responses or consequently more accurate estimates of student ability. Thus rescaling shifts the distribution of math scores only slightly. However, the rescaling in third grade shifts the previous reading test scores substantially, especially among high achievers, and shifts up all scores for the reading test. This is visible in Figure B3 as a shift to the right in the distribution of baseline scores (the two smoother, single-peaked dashed and dotted lines are the distributions of scores in Spring K, in the R1 and R3 metrics) of about 7 points.

FIGURE B3: RESCALING EFFECTS: DISTRIBUTION OF READING SCORES SHIFTS RIGHT, GAINS HIGHER

[Figure: ECLS-K Reading Scores and Gains, Spring K to Spring 1. Horizontal axis: Baseline (Spring K) Score. Vertical axes: Gain in Points/Year; Density of Spring K Scores. Series: Annual Gains in R3 Metric; Annual Gains in R1 Metric; Baseline Score Distribution, R3 Metric; Baseline Score Distribution, R1 Metric.]

More importantly, average gains computed using the twice-rescaled scores, shown in Figure B3 as a solid line graphed against prior period scores, differ substantially from those calculated using the once-rescaled scores (shown as a dashed line).
Math does not show as much of an effect of rescaling in the distribution of scores, but has an even larger problem in mean gains (Figure B4). The distribution of raw scores moves slightly to the right, and gains are everywhere substantially higher, when using twice-rescaled scores.

FIGURE B4: RESCALING EFFECTS: DISTRIBUTION OF MATH SCORES SHIFTS SLIGHTLY, GAINS MUCH HIGHER

[Figure: ECLS-K Math Scores and Gains, Spring K to Spring 1. Horizontal axis: Baseline (Spring K) Score. Vertical axes: Gain in Points/Year; Density of Spring K Scores. Series: Annual Gains in R3 Metric; Annual Gains in R1 Metric; Baseline Score Distribution, R3 Metric; Baseline Score Distribution, R1 Metric.]

The analytic models we discuss in this report attempt to deal with these issues and maximize the precision and interpretability of the estimates. There are two main complementary goals: 1) to make estimated gains independent of the specific design of the tests, and 2) to make estimated mean gains comparable across different tests (e.g., ECLS-K tests and tests for NCLB requirements).

However, due to the intrinsic nature of the rescaling process, all of the results in this report, including not only growth rates, but also projected levels, could be different when the scores are rescaled again using fifth grade (round 6) estimates. We checked the extent of this possible difference by constructing models pretending that we had no information from the third grade round of data. Our growth-curve estimates (not reported here) using once-rescaled scores (based on first grade assessments collected in round 4) look similar to estimates using twice-rescaled scores but dropping all data from round 5. This indicates that our results may be less sensitive to future rescaling than might have been feared from an examination of mean gains alone.

However, some effects of the rescaling process are clear. The rescaling process affects whether evidence of a summer slide emerges.
Analyses in this report and analyses with early rounds of ECLS-K data show no significant dip in children’s reading test scores during the summer. Despite the lack of statistical significance, analyses with early data produce a small positive coefficient and analyses with the latest round of data produce a small negative coefficient. This difference may not seem particularly important but does highlight the issues with rescaling.

APPENDIX C: STANDARD DEVIATIONS

We present our results in a number of metrics, including effect sizes. These are calculated by dividing estimated effects by the standard deviation of the outcome. Not surprisingly, these effect size estimates depend on how this standard deviation is calculated. In this appendix we discuss various alternatives and the one we used.

Effect Size

The familiar notion of normalizing a variable divides the variable by its own standard deviation, to obtain a variable measured in standard deviation units (analogous to a standard normal distribution, hence the verb normalize), which is a scale-free metric.47 The concept of effect size captures the notion of a “normalized” growth rate, where the estimate of gain in levels (points in some metric) is converted into a number representing the number of standard deviations gained over time, constructed as a ratio dividing the gain estimate by the standard deviation of scores. There are as many notions of effect size as there are plausible values for the standard deviation used in the denominator of that ratio.

There are at least four obvious options for constructing effect sizes from estimates of gain in levels. One option would use the standard deviation of all test scores over the entire time period, pooling across time. The second would use the standard deviation of test scores in the first period of measurement, or fall of kindergarten (fall K) in the ECLS-K data.
The third would use the standard deviation of test scores in the first period of measurement to scale gains between the first and second period, expressing gains in points as gains in standard deviations on the baseline test. The fourth would use the standard deviation of test scores in the second period of measurement to scale gains between the first and second period for each pair of assessments, expressing gains in points as gains in standard deviations on the test taken after learning new material. We have adopted a revised version of the third method, which has the most intuitive interpretation, and could potentially be useful to anyone having results from a test in one grade and wanting to predict likely gains on a hypothetical test in a later period (for example, for the purposes of power analysis in the process of designing an experiment). Our method does not divide point gains by the raw standard deviation of test scores pooled within a round, since tests are administered at different points in time. Instead, we estimate the standard deviation at the beginning of the period of interest (e.g. at the beginning of first grade to compute the effect size for gains over the course of first grade). These are obtained by shifting the intercept of the regression model to that point in time, which does not change the slope estimates, but does change the estimated constant term. The estimated standard deviation of scores around the intercept term is our estimate of the standard deviation on the test at that point in time. This is a particularly useful technique when the standard deviation of test administration dates changes appreciably over time, though this is not the case with our data. The deviation from mean assessment dates by round in the ECLS-K data is shown in Figure C1, and it is clear that the random variation in assessment dates does not differ much across the rounds. 
However, it is still possible that different subgroups are sampled at different times, or even that the lag between test dates differs systematically across subgroups (particularly for the ECLS-K data, since the middle third round used a small nonrepresentative sample). It is also clear that point gains are strongly related to the number of days elapsed between tests, as shown in Figure C2 (all of the positive slopes, while small in magnitude, are significant at the 1 × 10^-12 level). These estimates pool all types of individuals (i.e. are not estimated separately by race or income or language status) and are constructed separately for each model we run, to ensure the sample is identical.

47 The normalized variable has no units, and measurements in different scales produce the same normalized values; e.g., a variable measuring length coded in inches and one containing the same measurements in feet will be identical after normalizing the two variables. In this sense the scale of measurement is irrelevant.

TABLE C1. ESTIMATED STANDARD DEVIATIONS: ELEMENTARY SCHOOL

Time Period          Reading IRT   Math IRT   Reading Theta   Math Theta
FULL SAMPLE
Beginning of K           9.25822    8.23542         0.59042      0.58623
End of K                13.64673   11.63820         0.55636      0.55331
Beginning of 1st        15.59768   12.36079         0.57151      0.56463
End of 1st              21.20280   15.65170         0.50295      0.49667
End of 3rd              19.45075   15.62940         0.37636      0.45640
SAMPLES RESTRICTED TO NON-MISSING LANGUAGE STATUS
Beginning of K           9.19347    8.19316         0.58919      0.5852
End of K                13.59159   11.60565         0.55684      0.55355
Beginning of 1st        15.50918   12.326           0.57103      0.56438
End of 1st              21.16788   15.63889         0.50374      0.49768
End of 3rd              19.23780   16.13096         0.36730      0.45352

TABLE C2. ESTIMATED STANDARD DEVIATIONS: SECONDARY SCHOOL

Time Period          Reading IRT   Math IRT   Reading Theta   Math Theta
FULL SAMPLE
End of 8th Grade           8.532     11.707           8.451        8.312
End of 10th Grade          9.836     13.471          10.050        9.409
End of 12th Grade

Tables C1 and C2 present our estimates of the standard deviation on the test at each relevant point in time (the beginning of each period over which we estimate gains). Tables C3 and C4 present the corresponding estimates we would have used if we had not modified the third option, that is, if we had used the raw standard deviation of test scores pooled within a round. The difference is quite small in most cases.

TABLE C3. ACTUAL STANDARD DEVIATIONS: ELEMENTARY SCHOOL

Time Period          Reading IRT   Math IRT   Reading Theta   Math Theta
FULL SAMPLE
Beginning of K          9.810306   8.830328       0.5667186    0.5798147
End of K                13.09301   11.43061       0.5538426    0.5636462
Beginning of 1st        16.56015   13.45482       0.573677     0.5838926
End of 1st              20.47198   15.94837       0.5118973    0.5218103
SAMPLES RESTRICTED TO NON-MISSING LANGUAGE STATUS
Beginning of K          9.841982   8.851763       0.5672197    0.5803936
End of K                13.10695   11.44324       0.5523441    0.5624954
Beginning of 1st        16.53039   13.41443       0.5687264    0.5780821
End of 1st              20.43452   15.93634       0.5084035    0.5188048

TABLE C4. ACTUAL STANDARD DEVIATIONS: SECONDARY SCHOOL

Time Period          Reading IRT   Math IRT   Reading Theta   Math Theta
FULL SAMPLE
End of 8th Grade        8.532705   11.70741        8.450959     8.312182
End of 10th Grade       9.836779   13.4713         10.05063     9.4096

In any case, using the tables, one can convert any of the effect size estimates in this report into a different type of effect size estimate.
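The conversion can be written compactly. In the sketch below the symbols are introduced only for illustration and do not appear elsewhere in the report: $ES_A$ is an effect size computed with standard deviation $SD_A$ in the denominator, $G$ is the underlying gain in points, and $SD_B$ is the alternative standard deviation.

```latex
G = ES_A \times SD_A
\qquad\Longrightarrow\qquad
ES_B = \frac{G}{SD_B} = ES_A \times \frac{SD_A}{SD_B}
```

That is, multiplying by the original denominator recovers the gain in points, and dividing by the new denominator re-expresses it in the new standard deviation units.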
For example, to convert the math gain made by male students in first grade (Table 3.1, third row) into effect size units that use the raw standard deviation on the Fall Kindergarten assessment in the denominator, multiply the estimate from the table (0.208) by the estimated standard deviation (15.382) at the start of first grade from Table C1, then divide by the actual standard deviation of scores in round 1 (9.810306) from Table C3.

IRT and Theta Score Distributions

One of the main differences between IRT scale and theta scores, from a practical standpoint, is that the standard deviation of the IRT scale scores can change dramatically over time. In contrast, the standard deviation of the theta scores remains fairly stable. This has important implications for our comparisons of growth rates based on scale scores and effect sizes. In the ECLS-K data, the IRT scores exhibit substantial growth of standard deviations over time. This phenomenon is evident in the tables above, and in Figures C3 and C4. This leads to greater differences in IRT and effect size estimates when comparing growth rates across time, and makes comparisons across time more dependent on relatively subjective choices about the scale of measurement.

In each round, the distribution of IRT scores moves to the right, and the distance between the centroids of these distributions represents mean learning across rounds, expressed in IRT point scores as observed in ECLS-K. Thus, the first and second curves are distant (overlap less than the second and third, counting left to right), reflecting the high rate of learning (growth in IRT scores) between rounds 1 and 2 (roughly, in Kindergarten). The second and third curves are quite close (overlap more than the others) reflecting the minimal learning (growth in IRT scores) observed between rounds 2 and 3 (roughly, during the summer between Kindergarten and first grade).
The third and fourth distributions are slightly farther apart than the first and second, indicating apparently faster growth in first grade, but the dispersion also increases, so larger point gains do not translate into faster gains as measured in standard deviations.

There is also evidence in Figures C3 and C4 of truncation of the distribution in the most recent round of testing. If not enough difficult questions are added to the test bank, and if many new questions have high guessability, the highest performing students will be nearly indistinguishable from moderately high performers (for example, the 99th percentile will be much closer to the 75th than the first is to the 25th). This will reduce the mean gain estimated across all students, and could lead us to conclude that learning is slower in the time between the penultimate and the last assessment, when there is no real slowing of the rate of learning. It will be instructive to examine the most recent assessment and newly rescaled scores in early 2006, to investigate this concern in the light of new evidence.

One of the reasons that effect size and theta score gains are similar (see the last pages of Chapter 3 for discussion of this point) is that theta scores are already approximately normal, so renormalizing by dividing by the standard deviation has little impact. This is plainly visible in Figures C5 and C6, where the distribution of theta in each round is shown as the estimated density across the range of theta scores.

Each round has a distribution of theta that is roughly bell-shaped and the distance between the centroids of these approximate bells represents mean learning across rounds, expressed in theta scores as observed in ECLS-K. Thus, the first and second curves are quite distant (overlap less than the others) reflecting the high rate of learning (growth in theta scores) between rounds 1 and 2 (roughly, in Kindergarten).
The second and third curves are quite close (overlap more than the others) reflecting the minimal learning (growth in theta scores) observed between rounds 2 and 3 (roughly, during the summer between Kindergarten and first grade). Note that the thetas when pooled across all rounds need not look so bell-shaped, since the pooled theta scores are theoretically drawn from a mixture of normals distribution, and will typically look “lumpier” than each round’s theta distribution. Figures C7 and C8 demonstrate this point, and this exercise serves to underline the interpretation of theta not as a measure of inborn or genetic capacity, but as a learned ability to score well on a particular family of tests.

Figures C9 to C14 repeat the graphs C3 to C8 using the NELS data, which has somewhat different properties. In particular, the IRT scores do not exhibit the dispersion of ECLS-K IRT scores, which is evidence that the NELS tests likely created an artificial ceiling or floor on performance of extremely low and high performers, truncating the distribution of scores. Also, the distributions of thetas in individual rounds look less normal than the distribution of theta pooled across rounds, for both reading and math tests. This may indicate that the IRT estimation model was imperfectly applied, or the tests were not well designed to capture changes in the learned ability measured by theta over the high school years.

FIGURE C1. DISTRIBUTION OF ASSESSMENT DATES FOR EACH OF ROUNDS 1-5 IN ECLS-K
[Figure: horizontal axis: Distance from mean assessment date in days, from -60 to 60.]

FIGURE C2. LINEAR FIT OF IRT POINT GAINS REGRESSED ON DAYS ELAPSED BETWEEN ASSESSMENTS
[Figure: panels by rounds of assessments: Fall K to Spr K; Spr K to Fall 1; Fall 1 to Spr 1; Spr 1 to Spr 3. Horizontal axis: Days Elapsed Between Assessments. Vertical axis: Fitted values. Series: Math Gain; Reading Gain.]

FIGURE C3. DISTRIBUTION OF READING IRT SCORES FOR EACH OF ROUNDS 1-5 IN ECLS-K
[Figure: horizontal axis: Reading IRT score.]

FIGURE C4. DISTRIBUTION OF MATH IRT SCORES FOR EACH OF ROUNDS 1-5 IN ECLS-K
[Figure: horizontal axis: Math IRT score.]

FIGURE C5. DISTRIBUTION OF READING THETA SCORES FOR EACH OF ROUNDS 1-5 IN ECLS-K
[Figure: horizontal axis: Reading theta, from -3 to 3.]

FIGURE C6. DISTRIBUTION OF MATH THETA SCORES FOR EACH OF ROUNDS 1-5 IN ECLS-K
[Figure: horizontal axis: Math theta, from -3 to 3.]

FIGURE C7. DENSITY OF READING THETA SCORES POOLING ALL ROUNDS (1-5) IN ECLS-K
[Figure: horizontal axis: Reading theta, from -3 to 3. Vertical axis: Density.]

FIGURE C8. DENSITY OF MATH THETA SCORES POOLING ALL ROUNDS (1-5) IN ECLS-K
[Figure: horizontal axis: Math theta, from -3 to 3. Vertical axis: Density.]

FIGURE C9. DENSITY OF READING IRT SCORES FOR EACH OF ROUNDS 1-3 IN NELS
[Figure: horizontal axis: Reading IRT score.]

FIGURE C10. DENSITY OF MATH IRT SCORES FOR EACH OF ROUNDS 1-3 IN NELS
[Figure: horizontal axis: Math IRT score.]

FIGURE C11. DENSITY OF READING THETA SCORES FOR EACH OF ROUNDS 1-3 IN NELS
[Figure: horizontal axis: Reading Theta score.]

FIGURE C12. DENSITY OF MATH THETA SCORES FOR EACH OF ROUNDS 1-3 IN NELS
[Figure: horizontal axis: Math Theta score.]

FIGURE C13. DENSITY OF READING THETA SCORES POOLING ALL ROUNDS (1-3) IN NELS
[Figure: horizontal axis: Reading Theta Score.]

FIGURE C14. DENSITY OF MATH THETA SCORES POOLING ALL ROUNDS (1-3) IN NELS
[Figure: horizontal axis: Math Theta Score.]

APPENDIX D: COMPARISON OF IRT AND THETA SCORES

At the end of Chapter 3, we compare findings in the IRT scale score metric with the findings in the theta score metric. This comparison analyzes ratios of the learning rate in a subsequent grade with the learning rate in the previous grade (e.g., first grade gain to kindergarten gain).
This Appendix presents the tables on which we based our comparisons of the learning rate ratios.

TABLE D1: RATIO OF LEARNING RATE DIFFERENCES IN ECLS-K: COMPARISON OF IRT AND THETA BY RACE

                                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period                      Per Month   Per Month         Per Month    Per Month
WHITE STUDENTS / READING
1st Grade/Kindergarten           1.836       1.090             0.940        0.971
2nd and 3rd Grades/1st Grade     0.474       0.348             0.320        0.363
WHITE STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.451       0.968             0.905        0.941
2nd and 3rd Grades/1st Grade     0.500       0.395             0.736        0.827
BLACK STUDENTS / READING
1st Grade/Kindergarten           1.771       1.046             0.974        1.004
2nd and 3rd Grades/1st Grade     0.520       0.383             0.305        0.346
BLACK STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.581       1.055             0.979        1.019
2nd and 3rd Grades/1st Grade     0.529       0.420             0.340        0.386
HISPANIC STUDENTS / READING
1st Grade/Kindergarten           1.597       0.944             0.845        0.872
2nd and 3rd Grades/1st Grade     0.591       0.435             0.347        0.393
HISPANIC STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.571       1.047             0.908        0.945
2nd and 3rd Grades/1st Grade     0.546       0.433             0.355        0.402
ASIAN STUDENTS / READING
1st Grade/Kindergarten           1.459       0.863             0.819        0.846
2nd and 3rd Grades/1st Grade     0.448       0.330             0.303        0.343
ASIAN STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.266       0.843             0.835        0.869
2nd and 3rd Grades/1st Grade     0.644       0.512             0.452        0.513

TABLE D2: RATIO OF LEARNING RATE DIFFERENCES IN ECLS-K: COMPARISON OF IRT AND THETA BY RACE AND LANGUAGE

                                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period                      Per Month   Per Month         Per Month    Per Month
WHITE STUDENTS / READING
1st Grade/Kindergarten           1.836       1.090             0.940        0.971
2nd and 3rd Grades/1st Grade     0.474       0.348             0.320        0.363
WHITE STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.451       0.968             0.905        0.941
2nd and 3rd Grades/1st Grade     0.500       0.395             0.367        0.417
BLACK STUDENTS / READING
1st Grade/Kindergarten           1.771       1.046             0.974        1.004
2nd and 3rd Grades/1st Grade     0.520       0.383             0.305        0.346
BLACK STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.581       1.055             0.979        1.018
2nd and 3rd Grades/1st Grade     0.529       0.420             0.340        0.386
HISPANIC (ENGLISH NOT SPOKEN AT HOME) / READING
1st Grade/Kindergarten           1.582       0.935             0.872        0.899
2nd and 3rd Grades/1st Grade     0.650       0.478             0.355        0.403
HISPANIC (ENGLISH NOT SPOKEN AT HOME) / MATHEMATICS
1st Grade/Kindergarten           1.679       1.120             0.802        0.824
2nd and 3rd Grades/1st Grade     0.541       0.429             0.343        0.389
HISPANIC (ENGLISH SPOKEN AT HOME) / READING
1st Grade/Kindergarten           1.622       0.959             0.826        0.852
2nd and 3rd Grades/1st Grade     0.536       0.395             0.333        0.378
HISPANIC (ENGLISH SPOKEN AT HOME) / MATHEMATICS
1st Grade/Kindergarten           1.461       0.973             0.865        0.900
2nd and 3rd Grades/1st Grade     0.554       0.440             0.371        0.420
ASIAN (ENGLISH NOT SPOKEN AT HOME) / READING
1st Grade/Kindergarten           1.454       0.859             0.829        0.855
2nd and 3rd Grades/1st Grade     0.459       0.338             0.300        0.340
ASIAN (ENGLISH NOT SPOKEN AT HOME) / MATHEMATICS
1st Grade/Kindergarten           1.315       0.875             0.867        0.903
2nd and 3rd Grades/1st Grade     0.655       0.520             0.450        0.509
ASIAN (ENGLISH SPOKEN AT HOME) / READING
1st Grade/Kindergarten           1.445       0.855             0.798        0.824
2nd and 3rd Grades/1st Grade     0.429       0.316             0.304        0.344
ASIAN (ENGLISH SPOKEN AT HOME) / MATHEMATICS
1st Grade/Kindergarten           1.223       0.814             0.802        0.835
2nd and 3rd Grades/1st Grade     0.617       0.490             0.449        0.509

TABLE D3: RATIO OF LEARNING RATE DIFFERENCES IN ECLS-K FOR IRT AND THETA BY RACE AND INCOME

                                 IRT Gain    IRT Effect   Theta Gain   Theta Effect
Time Period                      Per Month   Size         Per Month    Size
WHITE LOW-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.765       1.043        0.919        0.949
2nd and 3rd Grades/1st Grade     0.542       0.399        0.323        0.366
WHITE LOW-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.480       0.985        0.904        0.940
2nd and 3rd Grades/1st Grade     0.528       0.419        0.349        0.396
WHITE HIGHER-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.840       1.091        0.942        0.974
2nd and 3rd Grades/1st Grade     0.467       0.344        0.319        0.362
WHITE HIGHER-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.448       0.966        0.906        0.944
2nd and 3rd Grades/1st Grade     0.497       0.392        0.370        0.419
BLACK LOW-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.837       1.085        1.004        1.036
2nd and 3rd Grades/1st Grade     0.541       0.398        0.300        0.340
BLACK LOW-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.636       1.090        0.982        1.021
2nd and 3rd Grades/1st Grade     0.527       0.419        0.330        0.374
BLACK HIGHER-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.612       0.953        0.892        0.921
2nd and 3rd Grades/1st Grade     0.535       0.393        0.320        0.363
BLACK HIGHER-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.508       1.005        0.966        1.005
2nd and 3rd Grades/1st Grade     0.557       0.442        0.351        0.398
HISPANIC LOW-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.548       0.913        0.836        0.863
2nd and 3rd Grades/1st Grade     0.670       0.494        0.362        0.410
HISPANIC LOW-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.645       1.097        0.919        0.956
2nd and 3rd Grades/1st Grade     0.561       0.445        0.345        0.391
HISPANIC HIGHER-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.623       0.960        0.845        0.872
2nd and 3rd Grades/1st Grade     0.533       0.392        0.324        0.367
HISPANIC HIGHER-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.482       0.987        0.884        0.920
2nd and 3rd Grades/1st Grade     0.538       0.427        0.360        0.408
ASIAN LOW-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.454       0.859        0.751        0.775
2nd and 3rd Grades/1st Grade     0.527       0.388        0.327        0.371
ASIAN LOW-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.188       0.790        0.766        0.798
2nd and 3rd Grades/1st Grade     0.727       0.577        0.461        0.522
ASIAN HIGHER-INCOME STUDENTS / READING
1st Grade/Kindergarten           1.426       0.844        0.845        0.872
2nd and 3rd Grades/1st Grade     0.427       0.313        0.291        0.329
ASIAN HIGHER-INCOME STUDENTS / MATHEMATICS
1st Grade/Kindergarten           1.293       0.861        1.293        0.861
2nd and 3rd Grades/1st Grade     0.618       0.490        0.618        0.490

TABLE D4: RATIO OF LEARNING RATE DIFFERENCES ACROSS GRADE (10TH TO 12TH GRADE RATE DIVIDED BY 8TH TO 10TH GRADE RATE) IN NELS:88: COMPARISON OF IRT AND THETA

                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period      Per Month   Per Month         Per Month    Per Month
ALL STUDENTS
Reading          0.594       0.498             0.622        0.491
Mathematics      0.549       0.467             0.586        0.483

TABLE D5: RATIO OF LEARNING RATE DIFFERENCES ACROSS GRADE (10TH TO 12TH GRADE RATE DIVIDED BY 8TH TO 10TH GRADE RATE) IN NELS:88: COMPARISON OF IRT AND THETA BY RACE

                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period      Per Month   Per Month         Per Month    Per Month
WHITE STUDENTS
Reading          0.579       0.479             0.609        0.487
Mathematics      0.539       0.458             0.579        0.479
BLACK STUDENTS
Reading          0.667       0.552             0.678        0.542
Mathematics      0.663       0.563             0.631        0.521
HISPANIC STUDENTS
Reading          0.783       0.648             0.836        0.669
Mathematics      0.669       0.569             0.650        0.536
ASIAN STUDENTS
Reading          0.788       0.652             0.877        0.702
Mathematics      0.583       0.496             0.640        0.529

TABLE D6: RATIO OF LEARNING RATE DIFFERENCES ACROSS GRADE (10TH TO 12TH GRADE RATE DIVIDED BY 8TH TO 10TH GRADE RATE) IN NELS:88: COMPARISON OF IRT AND THETA BY RACE AND LANGUAGE

                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period      Per Month   Per Month         Per Month    Per Month
WHITE STUDENTS
Reading          0.578       0.479             0.608        0.487
Mathematics      0.539       0.458             0.579        0.479
BLACK STUDENTS
Reading          0.665       0.550             0.675        0.539
Mathematics      0.662       0.563             0.662        0.563
HISPANIC (ENGLISH NOT SPOKEN AT HOME)
Reading          0.836       0.691             0.921        0.737
Mathematics      0.705       0.599             0.684        0.564
HISPANIC (ENGLISH SPOKEN AT HOME)
Reading          0.693       0.573             0.702        0.562
Mathematics      0.618       0.525             0.599        0.494
ASIAN (ENGLISH NOT SPOKEN AT HOME)
Reading          1.003       0.830             1.101        0.883
Mathematics      0.704       0.598             0.759        0.627
ASIAN (ENGLISH SPOKEN AT HOME)
Reading          0.732       0.606             0.850        0.681
Mathematics      0.528       0.449             0.586        0.483

TABLE D7: RATIO OF LEARNING RATE DIFFERENCES ACROSS GRADE (10TH TO 12TH GRADE RATE DIVIDED BY 8TH TO 10TH GRADE RATE) IN NELS:88: COMPARISON OF IRT AND THETA BY RACE AND INCOME

                 IRT Gain    IRT Effect Size   Theta Gain   Theta Effect Size
Time Period      Per Month   Per Month         Per Month    Per Month
WHITE LOW-INCOME STUDENTS
Reading          0.953       0.788             0.968        0.775
Mathematics      0.519       0.442             0.523        0.431
WHITE HIGHER-INCOME STUDENTS
Reading          0.539       0.447             0.571        0.458
Mathematics      0.543       0.462             0.590        0.487
BLACK LOW-INCOME STUDENTS
Reading          0.857       0.708             0.874        0.699
Mathematics      0.667       0.568             0.597        0.492
BLACK HIGHER-INCOME STUDENTS
Reading          0.728       0.602             0.734        0.587
Mathematics      0.648       0.551             0.645        0.532
HISPANIC LOW-INCOME STUDENTS
Reading          1.004       0.831             1.075        0.862
Mathematics      0.695       0.591             0.645        0.533
HISPANIC HIGHER-INCOME STUDENTS
Reading          0.745       0.616             0.790        0.632
Mathematics      0.628       0.534             0.613        0.506
ASIAN LOW-INCOME STUDENTS
Reading          1.455       1.203             1.524        1.221
Mathematics      0.669       0.569             0.733        0.605
ASIAN HIGHER-INCOME STUDENTS
Reading          0.716       0.592             0.821        0.657
Mathematics      0.541       0.460             0.578        0.477

APPENDIX E: LOCALLY STANDARDIZED GROWTH RATE DIFFERENCES

The locally standardized difference estimates reported in Chapter 5 are estimated as follows. We first compute each student’s growth rate between round t and round t+1 by computing the change in test score between rounds and dividing this by the number of calendar days elapsed between the two assessments.48 Next we divide the distribution of test scores at round t at a relatively large number of evenly-spaced points.49 At each of these points, we compute a bias-adjusted kernel-weighted50 estimate of the local within-group difference in growth rates and a kernel-weighted estimate of the local pooled standard deviation51, and we then divide the estimated difference in growth rates by the estimated pooled standard deviation, yielding an estimate of the locally standardized difference in growth rates at each point. We then compute a standard error for the estimated standardized difference at each point.52 Finally, we average the estimated locally standardized differences in growth rates over the range of the initial test scores, weighting the average by the inverse of the variance of the estimate at each point.53 This process yields an estimate of the average locally standardized difference in growth rates between two groups.
This estimate can be interpreted as the expected difference in growth rates between round t and t+1 between two students with the same score at round t, expressed in terms of the standard deviation of growth rates among all students starting with that same score.

48 Since growth rates are later scaled by their standard deviations, it makes no difference whether we use these daily growth rates, or rescale them to some other unit of time.

49 In the analyses here, we selected points in the distribution of test scores that were 0.02 standard deviations apart, though the results were insensitive to other choices.

50 We report estimates based on a biweight kernel with halfwidth of 0.1 standard deviations of the round t score distribution, but our results are virtually identical if we use a rectangular kernel or a different halfwidth. The biweight kernel has the form

$$w_{ij} = \left[1 - \left(\frac{d_{ij}}{h}\right)^2\right]^2,$$

where $d_{ij}$ is the distance from point j, and $h$ is the kernel halfwidth; where $d_{ij} > h$, $w_{ij} = 0$. The bias adjustment is done by fitting a kernel-weighted regression model at each point j (i.e., fitting a model of the form $Y_i = b_{0j} + b_{1j}(\text{group}_i) + b_{2j}(d_{ij}) + e_i$).

51 In computing female-male differences, we use the pooled within-gender standard deviation; in computing poor-non-poor differences, we use the pooled within-economic group standard deviation; and in computing race/ethnic differences, we use the pooled within race/ethnic group standard deviation.

52 The standard error of $\hat\delta_j$, the estimated locally standardized difference at point j, is computed as

$$\text{s.e.}(\hat\delta_j) = \hat\delta_j \sqrt{\frac{[\text{s.e.}(\hat\gamma_j)]^2}{\hat\gamma_j^2} + \frac{[\text{s.e.}(\hat\sigma_j)]^2}{\hat\sigma_j^2}},$$

where $\hat\delta_j = \hat\gamma_j / \hat\sigma_j$, and where $\hat\gamma_j$ and $\hat\sigma_j$ are the estimated difference in growth rates and the estimated pooled standard deviation of growth rates at point j, respectively. Bootstrap standard errors matched these computed standard errors very closely.

53 The standard error of this average is computed as

$$\text{s.e.}(\hat\delta) = \left[\sum_j \left(\text{s.e.}(\hat\delta_j)\right)^{-2}\right]^{-1/2}.$$
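The building blocks described in footnotes 50, 52, and 53 can be sketched in Python. This is a minimal illustration, not the code used for the report: the function names are hypothetical, the standard error of the ratio assumes the usual first-order approximation that ignores any covariance between numerator and denominator, and the absolute value is an added safeguard so the standard error stays non-negative when the estimated difference is negative.

```python
import numpy as np

def biweight(d, h):
    """Biweight kernel weight (1 - (d/h)^2)^2 within the halfwidth h, else 0."""
    u = np.abs(np.asarray(d, dtype=float)) / h
    return np.where(u < 1.0, (1.0 - u ** 2) ** 2, 0.0)

def ratio_se(gamma, se_gamma, sigma, se_sigma):
    """Approximate s.e. of delta = gamma / sigma (covariance ignored)."""
    delta = gamma / sigma
    return abs(delta) * np.sqrt((se_gamma / gamma) ** 2 + (se_sigma / sigma) ** 2)

def precision_weighted_mean(deltas, ses):
    """Inverse-variance weighted average of local estimates and its s.e."""
    deltas = np.asarray(deltas, dtype=float)
    w = 1.0 / np.asarray(ses, dtype=float) ** 2
    mean = float(np.sum(w * deltas) / np.sum(w))
    se = float(np.sum(w) ** -0.5)
    return mean, se

# Hypothetical inputs, for illustration only.
weights = biweight([0.0, 0.05, 0.1, 0.2], h=0.1)
local_se = ratio_se(gamma=0.5, se_gamma=0.05, sigma=2.0, se_sigma=0.1)
avg, avg_se = precision_weighted_mean([0.2, 0.4], [0.1, 0.1])
print(weights, local_se, avg, avg_se)
```

Points with smaller standard errors receive more weight in the average, so regions of the baseline score distribution with many students dominate the overall estimate.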