Attrition in the National Longitudinal Survey of Youth 1997

Alison Aughinbaugh*
Bureau of Labor Statistics

Rosella M. Gardecki
Center for Human Resource Research, The Ohio State University

Current Draft: May 2008

*The views expressed are those of the authors and do not reflect the policies of the Bureau of Labor Statistics or the views of other BLS staff members. We thank Ian Rucker for research assistance and Chuck Pierret, Donna Rothstein, and Michael Pergamit for helpful comments. All errors are our own. Corresponding author: Alison Aughinbaugh, 2 Massachusetts Ave NE, Room 4945, Washington, DC 20212, aughinbaugh.alison@bls.gov, 202-691-7520.

Keywords: attrition, panel data

The National Longitudinal Survey of Youth 1997 (NLSY97) is a national sample of about 9,000 youth who were ages twelve to sixteen on December 31, 1996, and living in the US at that time. Starting in 1997, interviews have been conducted annually. Currently, data through Round 9 are available (respondents are ages 21 to 26) and data through Round 11 have been collected (respondents are ages 23 to 28). Respondents are currently completing their schooling and entering both their careers and a period of their lives in which many marriages and births occur. Consequently, the NLSY97 is becoming a valuable source for examining how early events and decisions (such as teenage pregnancy, employment during high school, and dropping out of school) are related to later outcomes and decisions (such as educational attainment, earnings, and career choice). Though the focus of the NLSY97 is employment, the data set covers a broad array of topics including schooling, training, marriage, fertility, and income, thus permitting one to examine how different areas of life are related to labor market outcomes. The survey takes approximately one hour to complete.
Incentive payments for respondents ranged from $10 to $80 over the period considered in this paper.1 Although unit non-response has always been a concern for longitudinal surveys, over the past fifteen years the levels of attrition have increased. Atrostic et al. (2001) study attrition in six U.S. government household surveys and find that over the 1990s the rate of unit non-response increased in all six.

1 In the first three rounds, respondents were paid $10 for participating. Over Rounds 4 and 5, the incentive amount was raised to $20, which remained the payment level through Round 8. As an experiment, the incentive was raised to $15 in Round 4 and then to $20 in Round 5 for half the sample, whereas for the other half the full increase to $20 occurred in Round 4. A second incentive experiment was conducted in Rounds 7 and 8, in which respondents who had missed the previous interview were offered an additional $5 per round missed since the last interview, for up to 3 missed rounds, to compensate for the longer interview. The amount for missed interviews was raised to $10 in Rounds 9 and 10. In addition, an incentive experiment was conducted in the latter half of the Round 8 field period, in which some respondents were offered an additional $20 gift certificate to participate. In Round 9, the base incentive remained at $20, but was raised to $30 in Round 10. A fourth incentive experiment was conducted as part of Round 10 fielding. Sample members who had not participated in the first 3 months of Round 10 fielding were divided into 3 groups. The first served as a control group. For the second group, the incentive was increased to $50 in cash. The third group was offered a $20 gift certificate in addition to the $30 cash payment to participate.
A similar pattern emerges in the National Longitudinal Surveys: the response rate in the NLSY97 was 89.9 in Round 4, but in its predecessor survey, the NLSY79, the response rate did not fall below 90 percent until Round 16. (NLSY79 User’s Guide) While larger sample loss reduces the precision of estimation, it does not necessarily result in attrition bias. Bias results when attrition is non-random. As with any panel data set, the ability of the NLSY97 to provide estimates of the effect of early events on later outcomes depends on whether the NLSY97 remains representative of the population of interest. Recent examinations of two longstanding U.S. panel data sets show that attritors differ significantly from non-attritors, but the effects of this non-random attrition do not influence the pictures of labor market outcomes presented by the data. (Fitzgerald, Gottschalk, and Moffitt 1998, MaCurdy, Mroz, and Gritz 1998, Zabel 1998) In the current study, we examine the effects of attrition in the NLSY97—a newer and younger data set—using methods employed in previous studies of attrition in the NLSY79 and the Panel Study of Income Dynamics (PSID). This paper measures the level, the patterns, and the implications of attrition in the NLSY97. Much of the survey methodology literature considers participation in surveys as a multi-step process, where step 1 is establishing contact and step 2 involves gaining cooperation (Watson and Woods 2006). Because few NLSY97 sample members are unlocatable, however, we study attrition as a simple one-step process.2 The first section of this paper describes the patterns of wave non-response, first attrition, and return in the NLSY97 and attempts to gauge whether attritors and returnees differ from the full sample using data from the first ten rounds. The second section examines how attrition affects estimates. 
Using Rounds 1 through 9 of the NLSY97, we estimate three outcomes (having earned a high school diploma by age 20, having had a child by age 20, and weeks worked in the year in which one turns 20) for three different samples: (1) all youth who participated in Round 9, (2) youth who have participated in all rounds of the NLSY97, and (3) all youth who participated in at least one round after turning 20 years old. For all three outcomes, the inference drawn from the estimates is unaffected by the sample used. In the last section, we attempt to compare outcomes from Rounds 1 through 9 in the NLSY97 with those for similarly aged individuals from other nationally-representative surveys, namely the Current Population Survey (CPS) and the National Survey of Family Growth (NSFG).

2 Of any round, the greatest number of sample members, 278 or 3.1 percent, were unlocatable in Round 5.

I. Attrition Patterns in the NLSY97

Using unweighted data, Table 1 summarizes the patterns of attrition in each of the first ten rounds of the NLSY97. The first column presents attrition rates for the full sample of 8984. The second and third columns present the statistics on attrition for the 2236 individuals who make up the oversample of Black and Hispanic youth3 and the 6748 sample members who comprise the cross-sectional sample. For each round of data, the table presents statistics on the rates of wave non-response, first attrition, never returning to the NLSY97 after missing a round, and return after having missed one or more interviews.

Because the denominators vary across these statistics, we provide definitions for the statistics that we present in Table 1. Wave non-response is simply missing an interview, and the rate of wave non-response is the number who miss an interview out of all sample members. The rate of first attrition is the number of sample members who miss an interview for the first time out of those who have responded in all previous waves.
“Never return” is equal to 1 for those sample members who attrite for the first time in a given round and who do not participate in any subsequent round. The rate of “never return” is defined as the number of respondents who “never return” out of those who attrite for the first time in a given round. The fraction of returnees is the portion that has missed at least one previous interview out of those who participate in the current survey round.

Until Round 10, the rate of wave non-response increases with round. The rate of wave non-response was lowest in Round 2, at nearly 7 percent for the entire sample, and then gradually rose through Round 9 to 18 percent, the highest level of wave non-response to date. In Round 10, the rate of wave non-response fell back to 16 percent. Conversely, the rate of first attrition never exceeds the 7 percent that occurs in Round 2. In all other rounds the rate of first attrition lies between three and five percent, falling to just below 3 percent in Round 10.

Most of the NLSY97 sample members who miss an interview return to the survey.4 Of those who do not participate in the Round 2 interview, only 21 percent have never returned to the NLSY97 by Round 10. Obviously, those who initially leave the survey in an earlier round have had more chances to return to the survey, which helps explain why the rate of never returning generally rises across rounds. The rate of returnees among those interviewed (respondents who have missed at least one previous round) climbs across rounds and ranges from 3 percent in Round 3 to 23 percent in Round 10. With few exceptions, these patterns are comparable for the whole sample, the oversamples, and the cross-sectional sample.

3 The oversample provides sufficient sample size to permit analysts to estimate results separately for these minority groups.
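Because the four rates use different denominators, a small computational sketch may help keep them apart. The Python below is illustrative only: the function names, the 0/1 participation matrix layout, and the toy data are assumptions made for this example and are not the authors' code or NLSY97 variable names. It also assumes that every sample member was interviewed in Round 1.

```python
from typing import Dict, List


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def attrition_rates(participation: List[List[int]]) -> Dict[int, Dict[str, float]]:
    """Compute the four rates defined in the text for Round 2 onward.

    participation[i][r] is 1 if member i was interviewed in Round r + 1,
    else 0 (hypothetical layout). Round 1 is assumed to be 1 for everyone.
    """
    n_members = len(participation)
    n_rounds = len(participation[0])
    rates: Dict[int, Dict[str, float]] = {}
    for r in range(1, n_rounds):
        missed = [row for row in participation if row[r] == 0]
        interviewed = [row for row in participation if row[r] == 1]
        # At risk of first attrition: responded in every previous round.
        at_risk = [row for row in participation if all(row[:r])]
        first_attritors = [row for row in at_risk if row[r] == 0]
        # "Never return": no interview in any later round observed so far.
        never_return = [row for row in first_attritors if not any(row[r + 1:])]
        # Returnees: interviewed now, but missed at least one earlier round.
        returnees = [row for row in interviewed if not all(row[:r])]
        rates[r + 1] = {
            "wave_nonresponse": _ratio(len(missed), n_members),
            "first_attrition": _ratio(len(first_attritors), len(at_risk)),
            "never_return": _ratio(len(never_return), len(first_attritors)),
            "returnees": _ratio(len(returnees), len(interviewed)),
        }
    return rates


# Toy data: four sample members observed over four rounds.
demo = [
    [1, 1, 1, 1],  # continuous respondent
    [1, 0, 1, 1],  # misses Round 2, then returns
    [1, 0, 0, 0],  # misses Round 2 and never returns
    [1, 1, 0, 1],  # misses Round 3, then returns
]
for round_number, stats in attrition_rates(demo).items():
    print(round_number, stats)
```

Note that in this sketch "never return" for a given round can only be judged against the rounds observed so far, which mirrors the point in the text that earlier attritors have had more chances to come back.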
Table 2 examines the same patterns of attrition and return, but for subsamples that are defined on demographic characteristics of the sample members. When comparing the subsamples, three patterns emerge. First, as has been found in other longitudinal surveys, wave non-response is higher for men than for women (Burkam and Lee 1998, Fitzgerald, Gottschalk, and Moffitt 1998, MaCurdy, Mroz, and Gritz 1998). Second, across the various racial/ethnic subsamples, differences in survey participation begin to appear in recent rounds. Non-response rates are comparable in Rounds 2 through 5 for Blacks, Hispanics, and whites, but are higher for whites than for Blacks or Hispanics in Round 6 and later.5 In addition, whites have higher rates of never returning to the NLSY97. Because of the racial and ethnic composition of the oversample versus the cross-section, a similar pattern is seen for the oversample versus the cross-section. Third, the NLSY97 wave non-response rates are decreasing in birth year. That is, sample members born earlier are less likely to participate in each round.

Compared to its predecessor survey, the NLSY79, the levels of attrition are higher in the NLSY97.6 In Round 8 of the NLSY79, only 10.7 percent of male sample members and 9.0 percent of female sample members did not participate (MaCurdy, Mroz, and Gritz 1998). The comparable numbers are substantially larger in the NLSY97, 17.3 and 14.3 percent. Because the levels of unit non-response are so much lower, it is not surprising that the percentage of participants who have missed at least one previous interview is also much lower in the NLSY79.

4 Because of the event history format of much of the NLSY97, missing an interview does not mean that data are never collected on a given year’s activities (Olsen 2006). However, a longer recall period that comes with skipping an interview and returning may result in lower data quality (Pierret 1998).
In Round 10 of the NLSY79, 15.9 percent of male respondents and 11.8 percent of female respondents were returnees versus 26.8 and 19.4 percent in Round 10 of the NLSY97. In the time between these two surveys, attrition levels have risen substantially on a national level.7

The statistics in the previous two tables show the extent to which sample members leave and return to the NLSY97 and how the patterns vary by basic demographic characteristics. However, these tables do not provide information about whether attrition causes the sample to be nonrepresentative. Table 3 provides some information on whether the mean characteristics measured at Round 1 are related to attrition behavior by comparing these characteristics across three samples: (1) those who never missed an interview (continuous sample), (2) those who missed at least one interview and returned to the NLSY97 (intermittent sample), and (3) those who miss an interview and never return.

The variation in demographic characteristics across the three samples provides some corroboration for the patterns seen in the previous table. First, women are over-represented among those who participate in all rounds, while men are under-represented. Second, as compared to all NLSY97 sample members, the continuous sample is made up of a greater percentage of those born in later years.

5 The group referred to as “whites” in this paper is actually composed of non-Black, non-Hispanics. We refer to them as white for simplicity.

6 The NLSY79 sample interviewed slightly older respondents at the first interview as that cohort was ages 14 to 21.

7 Conversely, about 90 percent of the youth eligible to be sample members agreed to participate in Round 1 of the NLSY79, while a higher percentage, 91.6 percent, agreed to participate in the first round of the NLSY97 (Moore et al. 2000).
Among those who participate in all rounds, 16.5 percent were born in 1980 and 21.9 percent were born in 1984, while the whole sample is closer to having roughly 20 percent come from each birth year. Third, the racial composition of the continuous sample closely matches that for the whole sample, although the sample of those who never return to the NLSY97 contains a lower percentage of Blacks and a higher percentage of whites than the whole sample. In contrast, the intermittent sample is composed of a higher percentage of Blacks and a lower percentage of whites than the whole sample.

The characteristics discussed in this paragraph (sex, birth year, and race) are accounted for in the sampling weights of the NLSY97, which are adjusted each survey round to take account of selective attrition based on the exogenous demographic characteristics of age, race/ethnicity, and sex. Consequently, using the weights in analysis of the NLSY97 should correct for non-random attrition along these lines.

The three groups differ with respect to other characteristics, as well. Those sample members for whom a parent interview was conducted are over-represented in the continuous sample, which may imply that sample members who come from more cooperative families are themselves more cooperative (Laurie, Smith, and Scott 1999). Sample members from more advantaged families (measured by parental educational attainment, living with their biological parents at the Round 1 interview, and family income in Round 1) are less apt to participate intermittently. Among those who participate in all rounds of the survey, the mean for father’s highest grade completed is 12.71 and for mother’s highest grade completed is 12.58. Mean parental education is 12.74 and 12.45 for fathers and mothers in the sample who left and never returned.
Compared with the intermittent sample, a greater proportion of the continuous sample and of the sample that does not return lived with their biological mother and biological father at the time of the Round 1 interview. On average, family income for those who attrite and then return to the NLSY97 is lower than family income in the other two samples. In addition, the variation in the proxies for socio-economic status for a given group is comparable across the three groups.

Two youth outcome measures that are reported in Round 1 (having ever repeated a grade and having ever smoked) are presented in the bottom rows of Table 3. For both of these measures, intermittent participation in the survey is correlated with worse outcomes. Those in the intermittent sample have the highest incidence of both measures, with 21 percent having repeated a grade and 44 percent having smoked. Those who attrite and do not return to the NLSY97 are least likely to have repeated a grade (14 percent) and those in the continuous sample are the least likely to have smoked (39 percent). Of course, the pattern that sample members from earlier birth years are more apt to leave the survey is confounded with the pattern on cigarette smoking. Those who are older in Round 1 are more likely to attrite and older youth are more likely to have smoked.8

Using a different tack, Table 4 also considers the question of whether characteristics of attritors differ from the characteristics of sample members who are interviewed. Table 4 attempts to address this question by examining each birth cohort that makes up the NLSY97 in the round that corresponds to the birth cohort being age 16 to 17. Thus, the characteristics available in Round 1 for sample members born in 1980 are compared to the characteristics reported in Round 5 for sample members born in 1984. The bottom row shows that 92 to 93 percent of the 1981 through 1984 birth cohorts are interviewed in the round in which they are 16 to 17 years old.
Because the 1980 birth cohort is age 16 to 17 in Round 1, by definition, 100 percent of the 1980 birth cohort is interviewed. Presumably, in Round 1, prior to any attrition, the 1980 birth cohort is representative of the population. One caveat to this analysis is that any trend in teenage behavior may be attributed erroneously as an effect of attrition. However, given the short timeframe over which we are looking, we ignore this possibility.

Two sets of descriptive statistics for each birth year are presented: the first unweighted and the second weighted by the sampling weight in the round from which the data are drawn. In the unweighted statistics, sex composition does not vary across the birth cohorts when observed in the various rounds. The percentage of Black respondents is greater among those born in 1980 than among the subsequent birth years. Grades in 8th grade increase slightly across the birth years; the sum of the percentage of sample members in the “half A’s and half B’s” and “mostly A” categories rises with each birth year from 34 percent among the 1980 birth cohort to 40 percent among the 1984 birth cohort, while the percentage of sample members with 8th grade grades of “half B’s and half C’s” or lower generally declines. In addition, the proportion of the 1980 birth year sample that report having had sex by Round 1 is lower than that reported by the later birth cohorts, while the percentage who report having repeated a grade is higher.

Any difference by birth year in racial/ethnic composition is virtually eliminated in the weighted statistics. The other differences, however, remain after weighting the data.

8 This may also be true for grade retention, though to a lesser extent. Initiation into cigarette smoking most often occurs in the teenage years, while the bulk of grade retention occurs in kindergarten and 1st grade.
The percentage who earn “half A’s and half B’s” or “mostly A’s” rises by 6.2 percentage points from the 1980 birth cohort to the 1984 birth cohort in the weighted calculations versus a 6.3 percentage point increase in the unweighted numbers, though the level of those with high 8th grade grades is about 3 percentage points higher when the data are weighted. Similarly, for the teen outcomes of having had sex and having repeated a grade, though the levels of incidence again differ in the unweighted versus weighted data, the differences across the birth years are similar in the weighted and unweighted statistics. In the unweighted data the percentage of respondents who report having had sex rises by 9.5 percentage points from birth year 1980 to birth year 1984, and by 10.2 percentage points in the weighted data. For having repeated a grade, the levels drop across birth years, by 8.5 percentage points when the data are unweighted versus by 8.9 percentage points when the data are weighted.

Taken together, Tables 3 and 4 may imply that attrition in the NLSY97 is non-random with respect to teen outcomes and decisions as well as the SES of the youths’ parents. In addition, the comparison of weighted versus unweighted statistics in Table 4 suggests that the weights provided by the NLSY97 account for non-random attrition by sex and race, as they are designed to, but that they do not remedy the impact of attrition that is non-random along other dimensions.9

9 Though not shown here, the weighted statistics for Table 3 were also produced. Using the weighted data, the patterns across columns are generally consistent with those presented in Table 3. Again, income levels are higher using weighted data. In addition, parents’ educational attainment and youth’s cigarette smoking are higher, and the incidence of grade repetition is lower.

IA. Probability of Initial Attrition

The previous tables examined attrition patterns using bivariate statistics. Tables 5 and 6 present results that further examine how attritors are different from nonattritors by estimating initial attrition and return as functions of the respondent’s life activities at interview t and how those activities are related to attrition at time t+1 or return after t+1. Equation (1) is a logit that estimates the probability that an individual attrites at the next round of data collection, conditional on the respondent having participated in all previous rounds.

(1)  $P(A_{t+1} = 1) = X_i \beta_1 + X_{it} \beta_2 + \varepsilon_{it} \mid A_1, \ldots, A_t = 0$

where $A_{t+1}$ indicates whether the individual attrites at Round t+1, X is the general notation for the covariates included in this analysis, some of which are permanent characteristics and others are measured in the previous round (Round t), and β are the parameters to be estimated. The estimates of β gauge the extent to which attritors come from certain segments of the population. The life activities for which the equations control are whether the sample member had become a parent to a biological child, was married or cohabiting, whether she was in school, and whether she was employed, all measured in the six-month period prior to the Round t interview. Equation (1) is estimated for the full sample as well as estimated separately for the cross-section and oversample of the NLSY97, by sex, and by race/ethnicity. All estimates control for the birth year of the sample member, round of data collection, and the interaction between birth year and round of data collection. All estimates are unweighted.

Table 5 presents the estimated marginal effects for Equation (1). For the NLSY97 as a whole, initial attrition appears to be non-random with respect to the decisions that the NLSY97 sample members are making about school, work, fertility, and union formation. The results from the full sample indicate that those who have given birth are 8 percentage points more likely (or over 165 percent more likely than for the average observation where the chance of not participating next round is approximately 5 percent) to be non-respondents in the next round. Those sample members who are in school and those who are employed are less likely by 2.0 and 2.9 percentage points to attrite from the survey in the next round. In addition, sample members who were married at the Round t interview are 1 percentage point less likely to attrite at the Round t+1 interview.10

The estimates from the cross-section closely match those from the whole sample. Likewise, for the oversamples, the estimated marginal effects of life events are quite similar to those estimated in the full sample and in the cross-section. Because of the smaller sample size, however, the standard errors in the oversample estimates are up to two times larger, causing the married indicator to lose statistical significance.

Separate estimates by sex show that the effects of being in school and of being employed are comparable for men and women. These activities are associated with a 2 percentage point and a 3 percentage point decline in the probability of attrition in the next round, as was the case in the full sample. The process of attrition for women and men differs in that for women the probability of leaving the survey varies less by birth year. Compared to women, men born in later birth years are less likely to leave the survey relative to their counterparts born in 1980. However, having a child increases the probability of attrition among men by 9.5 percentage points, compared to 6.4 percentage points among women. In addition, being married is associated with a lower probability of attrition among women, but is unrelated to attrition behavior among men.
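The marginal effects reported for Equation (1) can be read against the standard logit formulas. The paper does not state how the marginal effects were computed (for example, at sample means or averaged over observations), so the block below is only a sketch under the usual logit assumptions; the symbols $\Lambda$, $z_{it}$, $x_k$, and $\bar{p}$ are notation introduced here for illustration and are not the authors'.

```latex
% Standard logit link for the probability of first attrition (assumed form)
P(A_{t+1} = 1 \mid A_1, \ldots, A_t = 0)
  = \Lambda(z_{it}) = \frac{e^{z_{it}}}{1 + e^{z_{it}}},
\qquad z_{it} = X_i \beta_1 + X_{it} \beta_2

% Marginal effect of a continuous covariate x_k with coefficient \beta_k
\frac{\partial P}{\partial x_k}
  = \beta_k \, \Lambda(z_{it}) \, \bigl[ 1 - \Lambda(z_{it}) \bigr]

% Discrete change for a binary covariate such as "has given birth"
\Delta P = \Lambda(z_{it} \mid x_k = 1) - \Lambda(z_{it} \mid x_k = 0)

% Relative effect against the average attrition probability \bar{p}
\text{relative effect} = \frac{\Delta P}{\bar{p}}
```

With the rounded figures quoted in the text (an 8 percentage point effect against a baseline of roughly 5 percent), the ratio is 0.08 / 0.05 = 1.6; the reported figure of over 165 percent presumably rests on unrounded values.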
10 Among the sample used to estimate first attrition, 12 percent give birth, 12 percent are married, 17 percent are cohabiting, 24 percent are in school, and 76 percent are employed over the six months prior to the Round t interview.

When attrition is estimated separately by race/ethnicity, parenthood, school enrollment, and employment have effects similar to those discussed above for the whole sample. A few differences do emerge in the estimates by race/ethnic group. For instance, being married is associated with a lower chance of attrition by about one percentage point in the Hispanic and white samples. In contrast, marriage is unrelated to attrition for the Blacks in the NLSY97.

In sum, prospective attritors are more likely to come from those sample members who become a parent relatively early in their lives and less likely to come from those employed or in school. Through Round 10, attrition appears independent of cohabitation status, but related to marital status. Marriage appears to be associated with a lower chance of attrition for women, Hispanics, and whites, though not for males or Blacks.

MaCurdy, Mroz, and Gritz (1998) estimate a similar equation in their examination of attrition in the NLSY79. They also find that men are less likely to attrite if employed. However, among women they find that attritors are more likely to come from the non-employed. Their specification differs from our Equation (1) in a number of key ways. First, they permit the impact of life events to vary by age group and find that, for men, the impact of employment increases with age. In their analysis of the NLSY79, the sample members range in age from 14 to 34 whereas we have fewer years of data and a smaller age range of 12 to 24. Second, in MaCurdy, Mroz, and Gritz, schooling and employment are defined as mutually exclusive activities at ages 20 and younger where schooling takes precedence over employment.
Third, we control for demographic events (births, marriages, and cohabitation), while they focus on the labor market and control only for schooling and work.

IB. Probability of Return Following Initial Attrition

Equation (2) parallels Equation (1), but estimates the probability of returning to the survey after having missed an interview for the first time. The annual observations included are those that occur after first attrition and up to the round in which the respondent first returns to the NLSY97.

(2)  $P(R_{t+1} = 1) = X_i \alpha_1 + X_{i,LI} \alpha_2 + X_{i,t+1} \alpha_3 + \varepsilon_{i,t+1} \mid A_1, \ldots, A_{LI} = 0 \text{ and } A_{LI+1}, \ldots, A_t = 1$

where $R_{t+1}$ indicates whether the individual returns at the next wave, X is the general notation for the covariates included in this analysis, some of which are permanent characteristics and others are measured in the last round in which the respondent was interviewed (Round LI), round is included as a control that is measured at t+1, and α are the parameters to be estimated.

Looking at the marginal effects of various life events on the probability of return among those in their first spell of attrition, Table 6 shows that having had a birth has a consistently significant and negative impact on the probability of returning to the survey after missing at least one interview. For all samples considered, becoming a parent decreases the likelihood of return by almost 40 percentage points or by 133 percent (the average chance of return is in the neighborhood of 32 percent). In most of the samples (all except the subsample of whites), being married increases the chance that the individual returns to the survey. Other activities such as cohabiting, being enrolled in school, and being employed at the date of the last interview are unrelated to the probability of returning to the NLSY97 after initially attriting.
The sign, magnitude, and significance of most of the estimates of Equation (2) are consistent across the subsamples examined, providing little indication that the process of return varies across subsamples.11

11 For the sample used to estimate the equation explaining return after initial attrition, 70 percent had given birth, 4 percent are married, 8 percent are cohabiting, 7 percent are in school, and 24 percent are employed over the six months prior to the last interview in which they participated.

II. Impact of Attrition on Estimates

In this section, we attempt to gauge the impact of attrition on estimates by estimating outcomes using three different estimation samples that are defined based on survey participation. The three outcomes examined are (a) whether the youth had earned a high school diploma by age 20, (b) whether the youth had had a child by age 20, and (c) the number of weeks worked in the year that the youth turned age 20. The three samples consist of (1) the youth who participated in the Round 9 survey, (2) the youth who participated in all rounds through Round 9, and (3) sample members who participated in at least one round after their 20th birthdays. The outcomes are estimated separately by sex, using both unweighted and weighted data. Logit equations are estimated for the outcomes earning a high school diploma and having had a child by age 20, and an OLS equation for number of weeks worked in the year that the sample member turned 20.

The size of the three samples varies considerably. Among men, roughly 91 percent (about 4200) of the 4599 males in the NLSY97 sample were interviewed at least once after turning 20 years old. About 80 percent of the males in the NLSY97 were interviewed in Round 9, and about 62 percent in every round up through Round 9. The rates are higher for women, with 92 percent, 83 percent, and 70 percent of the 4385 young women in the NLSY97 sample meeting the three sample criteria used here.
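The three participation-based samples can be expressed as simple membership tests. The sketch below is a hypothetical illustration rather than the authors' procedure: the data layout (a mapping from round number to age at interview, with missed rounds absent), the function names, and the toy records are all assumptions, and "after turning 20" is approximated as an interview age of at least 20.

```python
from typing import Dict, Tuple

# Hypothetical layout: {round number: age in years at that interview}.
# A round is absent from the dict when the member was not interviewed.
Interviews = Dict[int, float]


def in_round(interviews: Interviews, round_number: int = 9) -> bool:
    """Sample (1): interviewed in the given round."""
    return round_number in interviews


def in_all_rounds(interviews: Interviews, last_round: int = 9) -> bool:
    """Sample (2): interviewed in every round from 1 through last_round."""
    return all(r in interviews for r in range(1, last_round + 1))


def interviewed_after_age(
    interviews: Interviews, age: float = 20.0, last_round: int = 9
) -> bool:
    """Sample (3): at least one interview at or after the given age."""
    return any(a >= age for r, a in interviews.items() if r <= last_round)


def sample_flags(
    members: Dict[int, Interviews], last_round: int = 9
) -> Dict[int, Tuple[bool, bool, bool]]:
    """Return (sample 1, sample 2, sample 3) membership for each member."""
    return {
        member_id: (
            in_round(iv, last_round),
            in_all_rounds(iv, last_round),
            interviewed_after_age(iv, 20.0, last_round),
        )
        for member_id, iv in members.items()
    }


# Toy records (made up for illustration).
demo = {
    1: {r: 11.0 + r for r in range(1, 10)},  # every round, age 20 at Round 9
    2: {1: 14.0, 2: 15.0, 9: 22.0},          # intermittent, back in Round 9
    3: {r: 15.0 + r for r in range(1, 9)},   # Rounds 1-8 only, age 23 at Round 8
    4: {1: 12.0, 2: 13.0},                   # left after Round 2
}
for member_id, flags in sample_flags(demo).items():
    print(member_id, flags)
```

By construction, anyone in sample (2) is also in sample (1); sample (3) can include members who missed Round 9, which is why it is the largest of the three in the text.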
Descriptive statistics on the three samples are presented in Table 7. In general, the picture shown does not differ across the three estimation samples. Among males, the unweighted data show that the rate of earning a high school diploma ranges from 72 percent in the most inclusive sample to over 76 percent in the most restrictive sample. For women, the percentage who have earned a high school diploma by age 20 ranges from 78 to 80 percent across the three samples, with the highest rate again in the most restrictive sample. The percentage who have had a child by age 20 varies less across the samples; 9 to 11 percent of men have had a child by age 20 and 22 to 23 percent of women have had a child by age 20. In the unweighted data, the number of weeks worked does not differ across the three samples considered. Among males, the median number of weeks worked ranges from 46 to 48 with about 35 weeks as the mean number of weeks worked. For females, the median number of weeks worked is 46 and the mean is about 35 in all three of the samples.12

Table 8 presents the coefficient estimates for the three outcomes across each of the three samples and confirms the picture from Table 7. That is, the samples vary little with respect to the dependent variables and the controls, and hence produce similar estimates. Estimates for males are presented in Panel A and those for females in Panel B. Within both the male and the female groups, the estimates of the probability of earning a high school diploma by age 20 are consistent across the three samples with respect to sign and significance. For males, one difference emerges between the three sets of estimates. In only the least restrictive sample, being Hispanic is associated with an increased probability of receiving a high school diploma by age 20.

For both men and women, the patterns in the estimates for having a child by age 20 and for the number of weeks worked are mostly consistent across the estimation samples.
Differences do emerge in the estimates of having a child by age 20 for the sample interviewed in all rounds. The patterns of significance differ for that sample as compared to the other two. Most notably, in the most restrictive sample, race and ethnicity are unrelated to having a child by age 20. Among men, being Black or Hispanic increases the chances of becoming a father by age 20 by about 3 percentage points in the sample interviewed in Round 9 and in the sample interviewed after age 20. Among women, being Black is associated with an increase of about 3 percentage points in the chance of becoming a mother by age 20 in the less restrictive samples. With few exceptions, the patterns of sign and significance are the same across the estimates of weeks worked from the three samples.

12 The patterns between the samples are similar in the weighted data, though the youth appear to have better outcomes across the board. For men, the percentage that has a high school diploma is about 4 percentage points higher, the percentage that has had a child born is about 2 percentage points lower, and the mean number of weeks worked rises by about one week in the weighted data compared to the unweighted data. Among women, the percentage that has a high school diploma is about 2 percentage points higher, the percentage that has had a birth is about 4 percentage points lower, and, as is the case for men, the mean number of weeks worked rises by about one week when the data are weighted.

III. Comparison of NLSY97 to Cross-Sectional Data Sets

In this section, we compare results from the NLSY97 with two other data sets, the Current Population Survey (CPS) and the National Survey of Family Growth (NSFG). We compare estimates of the same three outcomes that we examined in the previous section: obtaining a high school diploma by age 20, having a child by age 20, and weeks worked in the calendar year that the youth turned age 20.
Educational attainment and weeks worked are available in the CPS. Educational attainment and age at first birth are available in the NSFG. Not surprisingly, the concepts are not measured identically across the data sets. While in the NLSY97 information on schooling, employment, and fertility is collected in an event history format, information is collected retrospectively in the CPS and NSFG. Additionally, in the CPS information about an individual is not necessarily reported by that individual, but may be reported by someone else in his or her family, whereas in the NLSY97 and the NSFG the information is reported by the youth himself or herself. Not only do the interview protocols differ, but the birth dates covered by the estimation samples also vary across the surveys.

In the CPS, information about earning a high school diploma is collected in the October supplement. The survey asks in what year the individual completed high school and whether he or she completed it by taking an equivalency test. We restrict the sample to those individuals who are 21 at the October interview in the years 2000 through 2004. Consequently, the NLSY97 and CPS samples do not correspond perfectly with respect to birth dates. The NLSY97 sample members were born in 1980 to 1984, whereas the CPS sample is composed of those born from October 1979 to October 1984.

In the CPS, the information on weeks worked comes from the March interview. The question asked is “During (last calendar year), in how many weeks did (name) work, even for a few hours? Include paid vacation and sick leave as work.” The sample for comparison with the NLSY97 is limited to individuals who are age 20 at the March 2000-2004 interviews. Again, this sample roughly corresponds to birth years 1980 to 1984, including those born from March 1980 to March 1985.

Cycle 6 of the NSFG was collected in 2002, which means that the later birth cohorts that make up the NLSY97 will not have reached age 20 by the interview date.
Consequently, we examine two samples from the NSFG. First, we compare the 2002 outcomes of birth years 1977 to 1981 in the NSFG to the Round 9 (2005) outcomes of the NLSY97 sample members. Second, we compare the 2002 outcomes (Round 6 in the NLSY97) of birth years 1980 and 1981 from the NSFG and the NLSY97.

Table 9 presents descriptive statistics from all three surveys, for both unweighted and weighted data. Statistics are presented separately by sex. Information from the NLSY97 is presented in the first four columns. The figures from the CPS and NSFG are presented in columns five through eight and columns nine through twelve, respectively. For the most part, we compare the weighted statistics, as the weights take account of any oversamples included in the data sets. Note that the samples that make up these data sets were drawn in different years. As a consequence, the three data sets are nationally representative at different points in time. In particular, because the NLSY97 was drawn to be nationally representative of birth years 1980 to 1984 living in the US on December 31, 1996, any immigration that took place after 1996 is not accounted for in the NLSY97 sample.

The rates of earning a high school diploma by age 20 are in the same neighborhood regardless of which data set is used, though the rates are slightly higher in the CPS. Among males, the weighted data show that 80 percent of those in the CPS sample, 77 percent of those in the NLSY97 sample, and 76 percent of those in the NSFG sample have earned a high school diploma by age 20. The same pattern emerges among the females, with the percentage that reports earning a high school diploma being highest in the CPS, closely followed by that in the NLSY97 and NSFG. In addition, the rates of having a child by age 20 are about equal in the NLSY97 and the NSFG.
The NLSY97 indicates that 9 percent of men and 20 percent of women have a child by age 20, while the NSFG indicates that 8 percent of men and 21 percent of women have a child by age 20.

In contrast, the average number of weeks worked in the calendar year that the sample member turns age 20 differs between the NLSY97 and the CPS. Weeks worked are lower in the CPS. The CPS shows that men work 31 weeks and women work 28.5 weeks during the calendar year in which they turn 20. The comparable numbers are roughly 15 percent and 25 percent higher in the NLSY97, with men working an average of 36 weeks and women working an average of 37 weeks.

The samples from the three data sets vary along other dimensions as well. The NLSY97 sample contains more white (non-Hispanic, non-Black) respondents than the samples from the other two surveys. Compared to the NLSY97, the NSFG shows a larger percentage living in the central city, a greater percentage whose mothers are in the highest educational group, and a greater percentage whose mothers were teenagers when they first gave birth.

Table 10 presents estimates of graduating from high school by age 20 and of weeks worked in the year that the youth turned 20 from the NLSY97 and from the CPS. Table 10 is composed of four panels. Panels A and B present the results for males, while Panels C and D present the results for females. Each panel is composed of four columns, where columns 1 and 2 present the results from the NLSY97 using the unweighted and weighted data, respectively. Columns 3 and 4 present the results from the CPS using the unweighted and weighted data.

In comparing the estimates from the two data sets, two key differences are apparent. First, race and ethnicity have larger effects in the estimates from the CPS than in the estimates from the NLSY97. In the estimates from the CPS explaining whether the youth had graduated from high school by age 20, being Hispanic decreases that probability by over 25 percentage points among males.
The corresponding estimates are about eight to twelve percentage points for both males and females when estimated using the NLSY97. Also, in the estimates of weeks worked, the negative impact associated with being Black is greater in the CPS than in the NLSY97. Second, in the estimates of weeks worked, the trend across birth years is more pronounced in the CPS sample than in the NLSY97 sample. Among both males and females, in each subsequent year, the 20-year-olds in the CPS work increasingly fewer weeks compared to the number of weeks worked by those who were age 20 in 2000.

Table 11 compares estimates from the NLSY97 with those from the NSFG and is organized in the same fashion as Table 10. Each panel of Table 11 has eight columns. For each data set, four sets of estimates are presented. Differences between the NLSY97 results and the NSFG results emerge in each of the panels.

Panel A shows the estimated marginal effects for having a child by age 20 among males. First, mother’s educational attainment is related to having a child by age 20 in the NLSY97 samples, but not in the NSFG samples. The estimates from the NLSY97 show that having a mother with more education decreases the chance of having an early birth. Though the sample sizes in the NSFG are smaller than those in the NLSY97, the lack of a relationship between mother’s educational attainment and an early birth appears to be driven by small estimates and not by large standard errors. Second, in the NSFG sample of those born in 1980 and 1981, the race/ethnicity indicators are, for males, unrelated to having an early birth, but the lack of significance may be driven by large standard errors. For females, having a mother who has education beyond a high school degree is associated with a lower probability of having an early birth in both data sets.
With respect to the effects of mother’s age at first birth on the probability of having an early birth, the patterns of sign and significance are consistent across the NLSY97 and the NSFG. The magnitudes of the effects are larger in the NSFG. For instance, the weighted data show that having a mother who had her first child at age 30 or older is associated with a 16 percentage point lower probability of having an early birth in the NLSY97 and a 22 percentage point lower probability in the NSFG. In the restricted sample, the estimated marginal effects from the weighted data are a 14 percentage point decrease and a 36 percentage point decrease, respectively.

For men, in the estimates of the probability of graduating from high school by age 20, race and ethnicity, as well as mother’s age at first birth, are unrelated to the outcome only in the more restrictive sample from the NSFG. For both sets of explanatory variables, the estimated marginal effects are consistent with those from the other three samples, but have standard errors about twice as large.

The patterns for women are very different. Only in the NSFG estimates does being Black decrease the likelihood of graduating from high school by age 20. In both the NLSY97 and NSFG samples, the chance of earning a high school diploma by age 20 increases with mother’s education. As is the case for males, in the restricted NSFG sample mother’s age at first birth is not related to the probability of graduating from high school.

IV. Concluding Comments

In the most recent round of the NLSY97, the level of wave nonresponse was in the neighborhood of 16 percent. Moreover, wave nonresponse is, in every wave, higher in the NLSY97 than it was in the NLSY79. The evidence presented here implies that attrition is nonrandom with respect to the youths’ outcomes and the socioeconomic status of the youths’ families. Youth who attrite and return had parents with less education and lower incomes in Round 1.
Youth who are nonresponsive at some point are more likely to have repeated a grade, to have had lower grades in 8th grade, and to have had a child early. Despite these differences, including or excluding attritors in the estimation samples does not affect the estimates for three different outcomes: earning a high school diploma by age 20, becoming a parent by age 20, and weeks worked in the year that the sample member turned 20. The one exception is that race is not significantly related to the outcome of having a child by age 20 in the sample composed only of the sample members who have not missed an interview, while race is significant when attritors are included in the estimation sample.

When estimates of the same three outcomes from the NLSY97 are compared with estimates based on similarly aged samples drawn from the CPS and NSFG, few differences emerge. A comparison of the results from the three data sets shows that the estimated effect of race is larger using either the CPS or the NSFG than the NLSY97. In contrast, mother’s characteristics have a greater impact on the outcomes in the estimates from the NLSY97 than in those from the NSFG. Presumably, measurement error in mother’s characteristics is smaller in the NLSY97, where for the most part mothers reported their own characteristics, than in the NSFG, where mother’s characteristics are reported by the respondents.

Over the first nine waves, attrition from the NLSY97 does not appear to affect inference when estimating the three outcomes at age 20. However, the effects of attrition can only be examined on a case-by-case basis, and attrition may affect other outcomes. In addition, though attrition may not affect the inference from these estimations, it will affect the precision of the estimates.

Works Cited

Burkam, David T., and Valerie E. Lee. 1998. "Effects of Monotone and Nonmonotone Attrition on Parameter Estimates in Regression Models With Educational Data: Demographic Effects on Achievement, Aspirations and Attitudes." Journal of Human Resources 33(2):555-574.

Fitzgerald, John, Peter Gottschalk, and Robert Moffitt. 1998. "An Analysis of Sample Attrition in Panel Data: The Michigan Panel Study of Income Dynamics." Journal of Human Resources 33(2):251-299.

Laurie, Heather, Rachel Smith, and Lynne Scott. 1999. "Strategies for Reducing Nonresponse in a Longitudinal Panel Survey." Journal of Official Statistics 15(2):269-282.

MaCurdy, Thomas, Thomas Mroz, and R. Mark Gritz. 1998. "An Evaluation of the National Longitudinal Survey of Youth." Journal of Human Resources 33(2):345-436.

Moore, Whitney, Steven Pedlow, Parvati Krishnamurty, and Kirk Wolter. 2000. National Longitudinal Survey of Youth 1997: Technical Sampling Report. National Opinion Research Center: Chicago, IL.

NLSY79 Users Guide. 2006. Center for Human Resource Research, The Ohio State University: Columbus, OH.

Olsen, Randall J. 2005. "The Problem of Respondent Attrition: Survey Methodology is Key." Monthly Labor Review 128(2):63-70.

Pierret, Charles R. 2001. "Event History Data Survey Recall: An Analysis of the National Longitudinal Survey of Youth 1979 Recall Experiment." Journal of Human Resources 36(3):439-466.

Watson, Nicole, and Mark Wooden. 2006. "Modelling Longitudinal Survey Response: The Experience of the HILDA Survey." HILDA Project Discussion Paper Series, No. 2/06.

Zabel, Jeffrey E. 1998. "An Analysis of Attrition in the Panel Study of Income Dynamics and the Survey of Income and Program Participation with an Application to a Model of Labor Market Behavior." Journal of Human Resources 33(2):479-506.
Table 1: Attrition Rates and the Fraction of Participants who are Returnees, by Sample

                        All   Oversample   Cross-section
Sample Size            8984         2236            6748

Wave Non-response
  Round 2             0.067        0.058           0.070
  Round 3             0.086        0.089           0.085
  Round 4             0.101        0.094           0.103
  Round 5             0.123        0.122           0.123
  Round 6             0.121        0.106           0.126
  Round 7             0.137        0.118           0.143
  Round 8             0.165        0.149           0.170
  Round 9             0.183        0.150           0.194
  Round 10            0.159        0.135           0.167

First Attrition
  Round 2             0.067        0.058           0.070
  Round 3             0.051        0.058           0.049
  Round 4             0.046        0.042           0.048
  Round 5             0.051        0.064           0.046
  Round 6             0.036        0.035           0.036
  Round 7             0.039        0.033           0.041
  Round 8             0.052        0.056           0.051
  Round 9             0.054        0.050           0.056
  Round 10            0.029        0.029           0.029

Never Return
  Round 2             0.212        0.171           0.224
  Round 3             0.161        0.098           0.187
  Round 4             0.178        0.096           0.202
  Round 5             0.184        0.149           0.201
  Round 6             0.262        0.254           0.264
  Round 7             0.331        0.246           0.354
  Round 8             0.283        0.172           0.323
  Round 9             0.419        0.372           0.432

Return
  Round 3             0.031        0.026           0.032
  Round 4             0.061        0.062           0.061
  Round 5             0.086        0.094           0.084
  Round 6             0.120        0.141           0.113
  Round 7             0.139        0.158           0.132
  Round 8             0.157        0.176           0.150
  Round 9             0.185        0.217           0.173
  Round 10            0.231        0.253           0.224

Note: The first attrition proportions are defined out of those who have never missed an interview. The "never return" proportions are defined out of those who attrite for the first time in that round. The return proportions are defined out of those who are interviewed that round.
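The four proportions defined in the note to Table 1 can be computed from a respondent-by-round participation history. The following is a minimal sketch under stated assumptions, not the authors' code: the function name, the data layout, and the toy panel are hypothetical; wave non-response is assumed to be taken over the full sample; and a "returnee" is assumed to be a round participant who missed at least one earlier round.

```python
from typing import Dict, List, Optional


def attrition_rates(history: List[List[bool]], t: int) -> Dict[str, Optional[float]]:
    """Rates for round t (1-based, t >= 2).

    history[k][r] is True if respondent k was interviewed in Round r + 1.
    """
    i = t - 1  # zero-based index of round t

    def share(part, whole):
        return len(part) / len(whole) if whole else None

    # Wave non-response: missed round t, out of the full sample (assumption).
    missed_now = [h for h in history if not h[i]]

    # First attrition: miss round t, out of those who never missed before.
    never_missed = [h for h in history if all(h[:i])]
    first_attritors = [h for h in never_missed if not h[i]]

    # Never return: first attritors in round t with no later interview observed.
    never_back = [h for h in first_attritors if not any(h[i + 1:])]

    # Return: round-t participants who missed any earlier round (assumption),
    # out of all round-t participants.
    interviewed_now = [h for h in history if h[i]]
    returnees = [h for h in interviewed_now if not all(h[:i])]

    return {
        "wave_nonresponse": share(missed_now, history),
        "first_attrition": share(first_attritors, never_missed),
        "never_return": share(never_back, first_attritors),
        "return": share(returnees, interviewed_now),
    }


# Toy four-round panel (made-up data).
T, F = True, False
panel = [
    [T, T, T, T],  # never misses
    [T, F, T, T],  # misses Round 2, returns in Round 3
    [T, F, F, F],  # misses Round 2, never returns
    [T, T, F, T],  # misses Round 3, returns in Round 4
    [T, T, T, F],  # misses Round 4
]
for t in (2, 3):
    print(t, attrition_rates(panel, t))
```

In the last observed round no later interview exists, so the "never return" share is not informative there; this is consistent with Table 1 reporting "never return" only through Round 9.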
Table 2: Attrition Rates and the Fraction of Participants who are Returnees, by Selected Characteristics

Sub-sample        By Sex           By Race/Ethnicity                    By Year of Birth
                 Male   Female    Black Hispanic    White     1980     1981     1982     1983     1984

Wave Non-response
  Round 2       0.069    0.064    0.056    0.068    0.071    0.088    0.077    0.061    0.063    0.044
  Round 3       0.093    0.079    0.087    0.090    0.085    0.122    0.110    0.080    0.069    0.051
  Round 4       0.105    0.096    0.090    0.106    0.104    0.148    0.129    0.095    0.070    0.062
  Round 5       0.133    0.112    0.130    0.119    0.121    0.167    0.156    0.122    0.093    0.076
  Round 6       0.131    0.111    0.106    0.116    0.131    0.163    0.152    0.120    0.097    0.075
  Round 7       0.146    0.127    0.118    0.136    0.147    0.177    0.162    0.132    0.108    0.106
  Round 8       0.188    0.140    0.146    0.156    0.179    0.199    0.196    0.148    0.147    0.134
  Round 9       0.203    0.163    0.152    0.180    0.200    0.213    0.225    0.164    0.160    0.154
  Round 10      0.173    0.143    0.128    0.160    0.174    0.193    0.179    0.149    0.138    0.135

First Attrition
  Round 2       0.069    0.064    0.056    0.068    0.071    0.088    0.077    0.061    0.063    0.044
  Round 3       0.056    0.046    0.055    0.056    0.047    0.079    0.073    0.043    0.037    0.026
  Round 4       0.051    0.042    0.046    0.049    0.046    0.079    0.062    0.039    0.031    0.027
  Round 5       0.059    0.043    0.064    0.050    0.044    0.070    0.063    0.053    0.039    0.034
  Round 6       0.042    0.030    0.034    0.039    0.035    0.041    0.042    0.038    0.035    0.024
  Round 7       0.042    0.035    0.031    0.048    0.039    0.057    0.028    0.037    0.035    0.040
  Round 8       0.068    0.037    0.055    0.048    0.053    0.060    0.069    0.042    0.046    0.049
  Round 9       0.063    0.046    0.047    0.065    0.054    0.050    0.062    0.056    0.054    0.049
  Round 10      0.039    0.020    0.023    0.039    0.028    0.028    0.024    0.030    0.031    0.031

Never Return
  Round 2       0.209    0.216    0.168    0.147    0.259    0.162    0.262    0.248    0.202    0.179
  Round 3       0.159    0.164    0.131    0.121    0.201    0.180    0.183    0.108    0.161    0.136
  Round 4       0.160    0.201    0.104    0.122    0.238    0.188    0.222    0.172    0.100    0.156
  Round 5       0.173    0.200    0.134    0.150    0.234    0.187    0.191    0.129    0.213    0.222
  Round 6       0.247    0.283    0.188    0.271    0.285    0.260    0.237    0.228    0.302    0.297
  Round 7       0.299    0.369    0.196    0.271    0.415    0.333    0.368    0.315    0.216    0.417
  Round 8       0.259    0.325    0.146    0.273    0.355    0.258    0.311    0.259    0.354    0.225
  Round 9       0.367    0.486    0.364    0.430    0.435    0.404    0.434    0.413    0.342    0.500

Return
  Round 3       0.030    0.031    0.024    0.032    0.033    0.043    0.038    0.024    0.031    0.018
  Round 4       0.068    0.054    0.065    0.064    0.057    0.091    0.079    0.046    0.059    0.035
  Round 5       0.094    0.078    0.085    0.098    0.082    0.136    0.109    0.069    0.073    0.053
  Round 6       0.134    0.106    0.140    0.136    0.104    0.175    0.150    0.107    0.101    0.076
  Round 7       0.156    0.122    0.156    0.158    0.122    0.208    0.164    0.128    0.121    0.083
  Round 8       0.172    0.142    0.176    0.180    0.137    0.235    0.188    0.149    0.123    0.098
  Round 9       0.210    0.159    0.209    0.211    0.161    0.261    0.210    0.181    0.158    0.123
  Round 10      0.268    0.194    0.249    0.259    0.211    0.300    0.272    0.219    0.205    0.169

Note: Sample sizes are 8984 for the entire sample, 4599 for the male sample, 4385 for the sample of females, 2335 for the Black sample, 1901 for the Hispanic sample, 83 for the mixed sample, and 4665 for the non-Black, non-Hispanic (white) sample; the sample sizes are 1691, 1874, 1841, 1807, and 1771 for birth years 1980, 1981, 1982, 1983, and 1984. The first attrition proportions are defined out of those who have never missed an interview. The "never return" proportions are defined out of those who attrite for the first time in that round. The return proportions are defined out of those interviewed that round.
Table 3: Characteristics by Attrition Status

Variable                                         Whole        In All   Attrite and   Attrite and
                                                Sample        Rounds        Return  Never Return

Demographic Characteristics
  Male                                           0.512         0.479         0.580         0.552
  Female                                         0.488         0.521         0.420         0.448
  Black                                          0.260         0.263         0.282         0.183
  Hispanic                                       0.212         0.203         0.233         0.211
  White                                          0.519         0.523         0.478         0.597
  Birth Year 1980                                0.188         0.165         0.243         0.204
  Birth Year 1981                                0.209         0.193         0.237         0.241
  Birth Year 1982                                0.205         0.210         0.197         0.190
  Birth Year 1983                                0.201         0.213         0.178         0.182
  Birth Year 1984                                0.197         0.219         0.146         0.183
  Parent interview conducted                     0.884         0.909         0.834         0.852

Family Background at Round 1
  Highest Grade Completed--Bio Father           12.564        12.705        12.107        12.735
                                               (3.212)       (3.203)       (3.258)       (3.064)
  Highest Grade Completed--Bio Mother           12.438        12.577        12.068        12.451
                                               (2.913)       (2.972)       (2.828)       (2.650)
  HGC Bio Father--Missing                        0.207         0.197         0.244         0.182
  HGC Bio Mother--Missing                        0.077         0.070         0.095         0.083
  Lives with Bio Mother                          0.880         0.891         0.852         0.880
  Lives with Bio Father                          0.569         0.582         0.516         0.615
  Household Income                            46361.70      47280.13      42575.35      49328.17
                                            (42143.50)    (42450.76)    (41301.34)    (41514.66)

Region at Round 1
  Northeast                                      0.176         0.171         0.175         0.213
  North Central                                  0.228         0.233         0.219         0.217
  South                                          0.374         0.377         0.379         0.341
  West                                           0.222         0.218         0.227         0.228

Youth Outcomes at Round 1
  Ever Repeat a Grade                            0.172         0.162         0.211         0.142
  Ever Smoke a Cigarette                         0.393         0.374         0.440         0.396

Sample Size                                       8984          5810          2268           910

Note: Data are unweighted. Standard errors are in parentheses for continuous variables.
Table 4: Comparison of Selected Characteristics by Birth Year and Round in which Sample Members are Age 16 to 17

Year of Birth and Round Observed
                                 1980 in Round 1     1981 in Round 2     1982 in Round 3     1983 in Round 4     1984 in Round 5
                                Unwtd.      Wtd.    Unwtd.      Wtd.    Unwtd.      Wtd.    Unwtd.      Wtd.    Unwtd.      Wtd.

Basic Demographics
  Male                           0.505     0.518     0.505     0.510     0.516     0.512     0.513     0.504     0.517     0.518
  Female                         0.495     0.482     0.495     0.490     0.484     0.488     0.487     0.496     0.483     0.482
  Black                          0.281     0.151     0.256     0.156     0.268     0.155     0.260     0.160     0.247     0.151
  Hispanic                       0.205     0.128     0.212     0.129     0.213     0.130     0.202     0.129     0.218     0.124
  Non-Black, Non-Hispanic        0.504     0.709     0.523     0.704     0.509     0.701     0.530     0.700     0.524     0.710

Grades in 8th Grade
  Less than mostly C's           0.105     0.101     0.118     0.108     0.114     0.109     0.125     0.115     0.120     0.113
  Mostly C's                     0.135     0.126     0.118     0.116     0.142     0.134     0.128     0.116     0.122     0.108
  Half B's and Half C's          0.262     0.240     0.243     0.233     0.222     0.200     0.230     0.216     0.203     0.186
  Mostly B's                     0.147     0.145     0.138     0.139     0.147     0.140     0.127     0.129     0.129     0.129
  Half A's and Half B's          0.202     0.211     0.216     0.218     0.197     0.210     0.233     0.244     0.239     0.249
  Mostly A's                     0.135     0.164     0.152     0.169     0.158     0.187     0.140     0.164     0.161     0.188
  Other                          0.014     0.013     0.016     0.016     0.019     0.019     0.017     0.016     0.028     0.027

  Have a parent interview        0.876     0.882     0.898     0.908     0.878     0.888     0.901     0.904     0.900     0.908
  Household Income, Round 1   47704.14  54414.00  47331.24  52534.46  45862.91  52134.88  46647.59  52429.25  44694.12  50433.14
  Ever had sex                   0.458     0.434     0.531     0.509     0.566     0.528     0.528     0.507     0.553     0.536
  Ever smoked a Cigarette        0.548     0.582     0.565     0.601     0.595     0.622     0.546     0.569     0.546     0.566
  Ever Repeated a Grade          0.244     0.220     0.203     0.167     0.187     0.163     0.165     0.145     0.158     0.131

Sample Size                                 1691                1729                1694                1680                1636
% of cohort interviewed                     100%              92.30%              92.00%              93.00%              92.40%

Note: Unwtd. = unweighted; Wtd. = weighted. When weighted, the data are weighted by sampling weight from the round corresponding to the respondents being ages 16 to 17.
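The gap between the unweighted and weighted columns of Table 4 (for example, the Black share falls from about 0.28 to about 0.15 for the 1980 cohort) is what one would expect when an oversampled group receives smaller sampling weights. The sketch below shows the weighted-share arithmetic only; the function name, the weights, and the toy sample are made up for illustration and are not NLSY97 sampling weights.

```python
from typing import Sequence


def weighted_share(indicator: Sequence[bool], weights: Sequence[float]) -> float:
    """Weighted proportion: sum of weights where indicator is True / sum of all weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w for x, w in zip(indicator, weights) if x) / total


# Toy sample (hypothetical): 4 respondents from an oversampled group with a
# small weight, and 6 other respondents with a larger weight.
in_group = [True] * 4 + [False] * 6
weights = [1.0] * 4 + [2.5] * 6

unweighted = sum(in_group) / len(in_group)   # 4 / 10
weighted = weighted_share(in_group, weights)  # 4 / (4 + 15)

print(f"unweighted share: {unweighted:.3f}")
print(f"weighted share:   {weighted:.3f}")
```

With equal weights the two shares coincide, so any difference between the columns reflects the sample design rather than a change in the underlying respondents.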
Table 5: Estimated Marginal Effects of Probability of Initial Attrition

Sample                   All       Cross Oversamples       Males     Females       Black    Hispanic       White
                                 Section
Event/Status
  Birth of Child    0.079***    0.078***    0.074***    0.095***    0.064***    0.079***    0.064***    0.079***
                     (0.003)     (0.004)     (0.006)     (0.006)     (0.004)     (0.006)     (0.007)     (0.005)
  Married          -0.010***   -0.010***      -0.008      -0.002   -0.011***      -0.008     -0.010*   -0.010***
                     (0.002)     (0.002)     (0.004)     (0.003)     (0.002)     (0.005)     (0.004)     (0.002)
  Cohabiting          -0.002      -0.002      -0.001       0.002      -0.003      -0.001      -0.004      -0.003
                     (0.002)     (0.002)     (0.003)     (0.003)     (0.002)     (0.003)     (0.004)     (0.002)
  In School        -0.020***   -0.018***   -0.023***   -0.022***   -0.016***   -0.015***   -0.031***   -0.017***
                     (0.001)     (0.002)     (0.003)     (0.002)     (0.002)     (0.003)     (0.003)     (0.002)
  Employed         -0.029***   -0.032***   -0.022***   -0.032***   -0.026***   -0.024***   -0.029***   -0.032***
                     (0.002)     (0.002)     (0.003)     (0.003)     (0.002)     (0.003)     (0.004)     (0.003)
Birth Year
  1981                -0.004      -0.001      -0.013      -0.010       0.003      -0.012      -0.005       0.001
                     (0.004)     (0.004)     (0.007)     (0.005)     (0.005)     (0.006)     (0.009)     (0.005)
  1982                -0.006      -0.006      -0.007      -0.011      -0.001   -0.028***      -0.001       0.002
                     (0.004)     (0.004)     (0.008)     (0.005)     (0.005)     (0.006)     (0.010)     (0.006)
  1983                -0.006      -0.005      -0.006     -0.013*       0.002      -0.010      -0.011      -0.002
                     (0.004)     (0.004)     (0.008)     (0.005)     (0.006)     (0.007)     (0.009)     (0.005)
  1984             -0.014***   -0.016***      -0.009   -0.024***      -0.004     -0.019*      -0.014     -0.011*
                     (0.004)     (0.004)     (0.008)     (0.005)     (0.005)     (0.006)     (0.009)     (0.005)
Round
  Round 2             -0.001      -0.001      -0.002      -0.004       0.001      -0.004       0.003      -0.001
                     (0.004)     (0.005)     (0.009)     (0.006)     (0.006)     (0.007)     (0.011)     (0.005)
  Round 3              0.002       0.002       0.003       0.002       0.002       0.000       0.002       0.002
                     (0.005)     (0.005)     (0.010)     (0.007)     (0.006)     (0.008)     (0.011)     (0.006)
  Round 4             -0.001      -0.004       0.011       0.004      -0.004       0.005       0.006      -0.006
                     (0.005)     (0.005)     (0.011)     (0.007)     (0.005)     (0.009)     (0.012)     (0.005)
  Round 5           -0.014**    -0.015**      -0.007    -0.020**      -0.007      -0.008     -0.023*     -0.012*
                     (0.004)     (0.004)     (0.008)     (0.005)     (0.005)     (0.007)     (0.008)     (0.005)
  Round 6             -0.003       0.001      -0.012      -0.002      -0.003      -0.014      -0.011       0.008
                     (0.005)     (0.005)     (0.008)     (0.007)     (0.006)     (0.007)     (0.009)     (0.007)
  Round 7              0.004       0.005       0.001       0.002       0.005      -0.008       0.000      0.014*
                     (0.005)     (0.005)     (0.010)     (0.007)     (0.006)     (0.007)     (0.011)     (0.007)
  Round 8              0.008       0.007       0.013       0.012       0.006      -0.001       0.023       0.009
                     (0.006)     (0.007)     (0.014)     (0.010)     (0.008)     (0.010)     (0.018)     (0.009)
  Round 9             -0.009      -0.007      -0.012       0.001      -0.016      -0.017       0.001      -0.006
                     (0.005)     (0.006)     (0.010)     (0.009)     (0.005)     (0.008)     (0.014)     (0.008)

Pseudo-R2              0.139       0.149       0.120       0.143       0.139       0.124       0.120       0.166
Sample Size            66056       49615       16441       33144       32912       17209       13828       34397

Note: Robust standard errors are in parentheses. * indicates significance at the 0.10-level, ** at the 0.05-level, and *** at the 0.01-level. Event/Status variables are measured over the 6 months prior to the Round t interview. Annual observations come from 8984 individuals for the entire sample, 2236 for the over-sample, and 6748 for the cross-sectional sample, 4599 for the male sample, 4385 for the sample of females, 2335 for the Black sample, 1901 for the Hispanic sample, 4665 for the non-Black, non-Hispanic (white) sample. Data are unweighted.
Table 6: Estimated Marginal Effects of Probability of Initial Return Among Attritors

Sample                   All       Cross Oversamples       Males     Females       Black    Hispanic       White
                                 Section
Event/Status
  Birth of Child   -0.424***   -0.427***   -0.421***   -0.432***   -0.419***   -0.422***   -0.342***   -0.468***
                     (0.024)     (0.029)     (0.044)     (0.033)     (0.038)     (0.041)     (0.050)     (0.039)
  Married            0.091**      0.077*      0.165*       0.049     0.139**      0.263*      0.117*       0.044
                     (0.034)     (0.038)     (0.075)     (0.045)     (0.053)     (0.114)     (0.060)     (0.042)
  Cohabiting           0.022       0.018       0.052       0.057       0.006       0.106       0.033      -0.016
                     (0.024)     (0.027)     (0.050)     (0.038)     (0.032)     (0.057)     (0.049)     (0.030)
  In School           -0.020      -0.008      -0.063      -0.038       0.005      -0.056      -0.021      -0.024
                     (0.022)     (0.025)     (0.049)     (0.033)     (0.032)     (0.049)     (0.056)     (0.028)
  Employed             0.032       0.030       0.028       0.019       0.033       0.017       0.078       0.029
                     (0.023)     (0.027)     (0.048)     (0.033)     (0.034)     (0.046)     (0.049)     (0.034)
Birth Year
  1981                 0.034       0.075      -0.139      0.144*      -0.081       0.015      -0.041       0.051
                     (0.049)     (0.056)     (0.104)     (0.071)     (0.062)     (0.118)     (0.111)     (0.060)
  1982                 0.013       0.033      -0.061       0.101      -0.093       0.221      -0.076      -0.006
                     (0.051)     (0.058)     (0.107)     (0.072)     (0.065)     (0.131)     (0.108)     (0.060)
  1983                 0.062       0.075       0.014     0.214**      -0.100       0.234       0.015       0.023
                     (0.053)     (0.061)     (0.113)     (0.078)     (0.058)     (0.131)     (0.104)     (0.064)
  1984                 0.057       0.100      -0.109      0.182*      -0.080       0.068       0.074       0.031
                     (0.057)     (0.067)     (0.104)     (0.082)     (0.066)     (0.133)     (0.116)     (0.073)
Round
  Round 3              0.017       0.083      -0.195       0.001       0.038      -0.092      -0.106       0.117
                     (0.057)     (0.069)     (0.090)     (0.077)     (0.089)     (0.107)     (0.109)     (0.085)
  Round 4              0.022       0.056      -0.089      0.163*     -0.131*       0.167      -0.116       0.007
                     (0.053)     (0.062)     (0.107)     (0.078)     (0.053)     (0.124)     (0.096)     (0.065)
  Round 5             -0.021      -0.025      -0.015       0.073      -0.114      -0.125       0.021       0.020
                     (0.048)     (0.053)     (0.109)     (0.074)     (0.053)     (0.100)     (0.111)     (0.065)
  Round 6             -0.023      -0.024      -0.035      -0.003      -0.051       0.071      -0.038      -0.076
                     (0.045)     (0.052)     (0.097)     (0.064)     (0.062)     (0.111)     (0.097)     (0.052)
  Round 7             -0.054      -0.045      -0.084      -0.013      -0.101       0.110      -0.185      -0.089
                     (0.046)     (0.051)     (0.099)     (0.067)     (0.057)     (0.121)     (0.089)     (0.048)
  Round 8             -0.047      -0.022      -0.131      -0.019      -0.078       0.004      -0.055      -0.060
                     (0.039)     (0.045)     (0.086)     (0.058)     (0.050)     (0.106)     (0.091)     (0.044)
  Round 9              0.128       0.120       0.155       0.147       0.100       0.304       0.158       0.052
                     (0.054)     (0.061)     (0.111)     (0.074)     (0.075)     (0.111)     (0.119)     (0.064)

Pseudo-R2              0.155       0.164       0.159       0.152       0.178       0.159       0.139       0.182
Sample Size             7110        5515        1595        3920        3190        1630        1503        3924

Note: Standard errors are in parentheses. * indicates significance at the 0.10-level, ** at the 0.05-level, and *** at the 0.01-level. Event/Status variables are measured during the 6-month period prior to the last interview in which the respondent participated. The samples consist of those respondents who have missed at least one interview in the survey rounds subsequent to missing their first interview. The samples are composed of 2306 respondents for the entire sample, 1730 for the cross-sectional sample, 576 for the oversample, 1283 for males, 1023 for females, 596 for Blacks, 591 for Hispanics, and 1172 for whites (non-Black, non-Hispanic respondents). Data are unweighted.
29 Table 7: Outcomes at Age 20, by Sample Definition and by Sex Males In Round 9 Outcomes at Age 20 Earned High School Diploma Birth of Child Weeks Worked In All Rounds Females In After Age 20 In Round 9 In All Rounds In After Age 20 0.730 0.109 35.177 0.766 0.094 36.488 0.720 0.108 34.878 0.784 0.234 35.469 0.803 0.218 35.984 0.787 0.230 35.455 Race/Ethnicity Black Hispanic Mixed Non-Black, Non-Hispanic 0.258 0.212 0.009 0.521 0.247 0.200 0.010 0.542 0.258 0.213 0.009 0.520 0.282 0.212 0.010 0.497 0.277 0.210 0.009 0.505 0.273 0.212 0.010 0.505 Year of Birth 1980 1981 1982 1983 1984 0.177 0.196 0.211 0.210 0.206 0.160 0.186 0.212 0.218 0.224 0.190 0.206 0.211 0.200 0.193 0.186 0.199 0.209 0.203 0.203 0.169 0.196 0.210 0.209 0.216 0.197 0.209 0.206 0.193 0.194 Geographic Characteristics In MSA Out of MSA Not Known—MSA Urban Rural Non Known—Urbanicity Northeast Midwest South West Not Known—Region 0.941 0.048 0.011 0.776 0.210 0.014 0.166 0.223 0.384 0.221 0.006 0.941 0.050 0.010 0.773 0.214 0.012 0.165 0.228 0.380 0.221 0.006 0.936 0.054 0.011 0.772 0.209 0.019 0.166 0.218 0.388 0.223 0.006 0.953 0.042 0.005 0.801 0.189 0.010 0.156 0.213 0.401 0.226 0.004 0.950 0.045 0.003 0.797 0.194 0.009 0.156 0.214 0.403 0.224 0.003 0.945 0.048 0.007 0.798 0.186 0.016 0.161 0.211 0.396 0.226 0.005 Family Background Highest grade completed by mother* 12.525 12.660 12.470 12.406 12.485 12.403 Highest grade completed by father* 12.574 12.694 12.522 12.577 12.703 12.585 Mother’s Age at 1st Birth* 22.791 22.988 22.729 22.640 22.794 22.694 Score on ASVAB* 45.243 47.175 44.433 46.261 47.280 46.278 Number of Observations 3666 2896 4233 3672 3088 4077 Note: Sample sizes based on sample for which birth at age 20 is available. Data are unweighted. * indicates that some observations have missing values. 30 Table 8: Estimates for Three Different Samples A. 
Males

                                 High School Diploma              Birth of Child             Weeks Worked in Year
                                      by Age 20                     by Age 20                   turned Age 20
                                    In    In All  In After        In    In All  In After        In    In All  In After
                               Round 9    Rounds    Age 20   Round 9    Rounds    Age 20   Round 9    Rounds    Age 20
Race/Ethnicity
  Black                          0.020     0.026     0.003   0.038**     0.019   0.035**  -6.824**  -6.674**  -7.269**
                               (0.018)   (0.018)   (0.017)   (0.011)   (0.010)   (0.010)   (0.908)   (0.993)   (0.847)
  Hispanic                       0.027    0.042*     0.033    0.030*     0.015    0.027*    -0.345    -0.127    -0.404
                               (0.020)   (0.021)   (0.019)   (0.012)   (0.012)   (0.011)   (0.980)   (1.079)   (0.912)
Year of Birth
  1981                          -0.008     0.004    -0.003    -0.004    -0.015     0.001   -2.726*  -4.228**  -2.581**
                               (0.022)   (0.023)   (0.020)   (0.013)   (0.012)   (0.012)   (1.068)   (1.198)   (0.969)
  1982                          -0.002    -0.020     0.003     0.001    -0.012    -0.003   -2.426*  -3.679**   -2.460*
                               (0.022)   (0.022)   (0.020)   (0.012)   (0.011)   (0.012)   (1.052)   (1.164)   (0.971)
  1983                          -0.008     0.001    -0.001    -0.025  -0.035**    -0.024  -3.873**  -4.646**  -4.158**
                               (0.022)   (0.022)   (0.021)   (0.013)   (0.012)   (0.012)   (1.054)   (1.159)   (0.988)
  1984                           0.003    -0.000     0.015    -0.011    -0.021    -0.016  -7.054**  -8.847**  -7.010**
                               (0.022)   (0.022)   (0.021)   (0.013)   (0.012)   (0.012)   (1.059)   (1.155)   (0.997)
Geographic Characteristics
  Rural                         -0.015     0.010    -0.008    -0.013   -0.025*    -0.015     0.480     0.934     0.825
                               (0.018)   (0.018)   (0.017)   (0.011)   (0.011)   (0.011)   (0.863)   (0.929)   (0.806)
  Not in an MSA                 -0.017    -0.022    -0.003   0.052**   0.045**   0.053**     0.778     0.050     1.474
                               (0.031)   (0.030)   (0.029)   (0.016)   (0.015)   (0.015)   (1.607)   (1.717)   (1.426)
  Midwest                        0.011     0.009     0.008     0.021     0.018    0.027*     1.209     0.537     1.537
                               (0.022)   (0.023)   (0.022)   (0.014)   (0.013)   (0.013)   (1.062)   (1.144)   (0.991)
  South                          0.001    -0.002    -0.004     0.013     0.014    0.026*    -0.692    -1.351    -0.920
                               (0.020)   (0.020)   (0.019)   (0.013)   (0.012)   (0.012)   (0.975)   (1.057)   (0.904)
  West                          0.058*     0.045     0.039    -0.016    -0.015    -0.001     0.990     0.814     1.152
                               (0.023)   (0.023)   (0.022)   (0.015)   (0.014)   (0.014)   (1.087)   (1.181)   (1.007)
Family Background
  Highest grade, Mother        0.015**   0.013**   0.015**   -0.004*   -0.004*    -0.003    -0.063    -0.096    -0.140
                               (0.003)   (0.003)   (0.003)   (0.002)   (0.002)   (0.002)   (0.151)   (0.165)   (0.141)
  Highest grade, Father        0.011**   0.011**   0.012**  -0.005**   -0.004*   -0.003*    -0.194    -0.295    -0.207
                               (0.003)   (0.003)   (0.003)   (0.002)   (0.002)   (0.002)   (0.142)   (0.155)   (0.133)
  Mother’s age at 1st birth    0.009**   0.009**   0.010**  -0.004**  -0.004**  -0.004**     0.054     0.066     0.087
                               (0.002)   (0.002)   (0.002)   (0.001)   (0.001)   (0.001)   (0.076)   (0.082)   (0.071)
  Math-Verbal ASVAB            0.005**   0.004**   0.005**  -0.001**  -0.001**  -0.001**     0.013     0.011     0.007
                               (0.000)   (0.000)   (0.000)   (0.000)   (0.000)   (0.000)   (0.014)   (0.015)   (0.013)
# of Observations                 3630      2870      4190      3666      2896      4233      3606      2855      4160

Notes: Standard errors in parentheses. * significant at 5%; ** significant at 1%. Data are unweighted. Regressions also include dummy variables indicating that the value of region, ASVAB score, mother’s age at 1st birth, and parent’s educational attainment are missing.

B. Females

                                 High School Diploma              Birth of Child             Weeks Worked in Year
                                      by Age 20                     by Age 20                   turned Age 20
                                    In    In All  In After        In    In All  In After        In    In All  In After
                               Round 9    Rounds    Age 20   Round 9    Rounds    Age 20   Round 9    Rounds    Age 20
Race/Ethnicity
  Black                        0.075**   0.068**   0.072**    0.037*     0.032    0.034*  -2.520**   -2.232*  -2.614**
                               (0.014)   (0.014)   (0.013)   (0.016)   (0.017)   (0.016)   (0.855)   (0.927)   (0.820)
  Hispanic                     0.064**   0.058**   0.057**     0.015     0.005     0.025    -0.762    -0.121    -1.282
                               (0.016)   (0.016)   (0.015)   (0.019)   (0.020)   (0.018)   (0.956)   (1.037)   (0.908)
Year of Birth
  1981                          -0.019    -0.022    -0.016     0.000    -0.014     0.006    -0.823    -0.569    -1.119
                               (0.017)   (0.017)   (0.016)   (0.020)   (0.021)   (0.018)   (1.018)   (1.127)   (0.944)
  1982                          -0.029    -0.020    -0.020     0.010    -0.005     0.007    -1.956    -1.739   -2.220*
                               (0.017)   (0.017)   (0.016)   (0.020)   (0.021)   (0.018)   (1.008)   (1.111)   (0.952)
  1983                          -0.015    -0.020    -0.017    -0.011    -0.008    -0.006  -3.707**  -3.563**  -4.065**
                               (0.017)   (0.017)   (0.016)   (0.020)   (0.021)   (0.019)   (1.014)   (1.112)   (0.970)
  1984                          -0.004    -0.002    -0.002    -0.034    -0.038    -0.037  -5.282**  -5.132**  -5.912**
                               (0.017)   (0.017)   (0.016)   (0.020)   (0.021)   (0.019)   (1.015)   (1.104)   (0.971)
Geographic Characteristics
  Rural                          0.026     0.024    0.030*    -0.021    -0.032    -0.018     0.837     0.627     0.935
                               (0.015)   (0.014)   (0.014)   (0.018)   (0.018)   (0.017)   (0.866)   (0.921)   (0.831)
  Not in an MSA                  0.001     0.012    -0.003   0.085**   0.087**   0.073**  -6.380**  -6.542**  -5.155**
                               (0.027)   (0.025)   (0.024)   (0.030)   (0.030)   (0.027)   (1.650)   (1.716)   (1.496)
  Midwest                        0.015     0.024     0.013   0.063**    0.051*   0.062**    -1.833    -0.897    -1.272
                               (0.018)   (0.018)   (0.017)   (0.022)   (0.023)   (0.020)   (1.059)   (1.139)   (0.998)
  South                         -0.000     0.010    -0.004    0.044*     0.036    0.044*   -2.105*    -1.542   -1.963*
                               (0.016)   (0.015)   (0.015)   (0.020)   (0.020)   (0.018)   (0.958)   (1.031)   (0.903)
  West                           0.028    0.043*     0.030    -0.024    -0.025    -0.020    -1.858    -1.831    -1.421
                               (0.018)   (0.018)   (0.017)   (0.023)   (0.024)   (0.021)   (1.062)   (1.144)   (1.000)
Family Background
  Highest grade, Mother        0.008**    0.006*   0.008**  -0.012**  -0.013**  -0.011**    0.342*    0.350*    0.325*
                               (0.002)   (0.002)   (0.002)   (0.003)   (0.003)   (0.003)   (0.145)   (0.155)   (0.139)
  Highest grade, Father        0.009**   0.008**   0.009**   -0.007*    -0.005   -0.007*    -0.164    -0.123    -0.222
                               (0.002)   (0.002)   (0.002)   (0.003)   (0.003)   (0.003)   (0.142)   (0.152)   (0.135)
  Mother’s age at 1st birth    0.006**   0.005**   0.006**  -0.007**  -0.007**  -0.007**    -0.025    -0.114    -0.042
                               (0.001)   (0.001)   (0.001)   (0.002)   (0.002)   (0.001)   (0.072)   (0.077)   (0.069)
  Math-Verbal ASVAB            0.005**   0.005**   0.005**  -0.004**  -0.003**  -0.003**   0.112**   0.103**   0.111**
                               (0.000)   (0.000)   (0.000)   (0.000)   (0.000)   (0.000)   (0.014)   (0.015)   (0.014)
# of Observations                 3650      3072      4051      3672      3088      4076      3636      3060      4031

Notes: Standard errors in parentheses. * significant at 5%; ** significant at 1%. Data are unweighted. Regressions also include dummy variables indicating that the value of region, ASVAB score, mother’s age at 1st birth, and parent’s educational attainment are missing.
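
The notes to Table 8 say that the regressions include dummy variables flagging missing values of region, ASVAB score, mother’s age at 1st birth, and parents’ education. A minimal sketch of that kind of missing-indicator coding is given below in Python. The function name, the record layout, and the choice to fill missing values with 0.0 are assumptions made for illustration; the paper does not say how missing values were filled.

```python
def add_missing_indicator(records, var, fill_value=0.0):
    """Return copies of `records` with `var` filled and a `<var>_missing` dummy added.

    Hypothetical helper: the fill value is an assumption, not the paper's choice.
    """
    prepared = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are left unchanged
        is_missing = new_rec.get(var) is None
        new_rec[var + "_missing"] = 1 if is_missing else 0
        if is_missing:
            new_rec[var] = fill_value
        prepared.append(new_rec)
    return prepared


# Illustrative records only; these are not NLSY97 data.
sample = [
    {"id": 1, "asvab": 45.2},
    {"id": 2, "asvab": None},
]
prepared = add_missing_indicator(sample, "asvab")
```

With this coding, an observation that lacks a covariate stays in the estimation sample, and the coefficient on the `_missing` dummy absorbs the mean difference for that group.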
Table 9: Descriptive Statistics from the NLSY97, CPS, and NSFG

                                         NLSY97                           CPS                             NSFG
                                 Males          Females          Males          Females          Males          Females
                              Unwtd.    Wtd.  Unwtd.    Wtd.  Unwtd.    Wtd.  Unwtd.    Wtd.  Unwtd.    Wtd.  Unwtd.    Wtd.
Outcomes by Age 20
  High School Diploma          0.730   0.767   0.784   0.808   0.798   0.795   0.840   0.841   0.743   0.758   0.754   0.785
  Birth                        0.109   0.087   0.234   0.196                                   0.075   0.082   0.253   0.210
  Weeks Worked                35.177  36.126  35.469  36.678  30.094  30.817  28.349  28.534
Race/Ethnicity
  Non-Black, non-Hisp          0.521   0.700   0.496   0.709   0.684   0.640   0.667   0.622   0.570   0.668   0.637   0.683
  Black                        0.258   0.153   0.282   0.155   0.093   0.127   0.121   0.155   0.244   0.186   0.212   0.172
  Hispanic                     0.212   0.134   0.213   0.122   0.147   0.162   0.136   0.154   0.187   0.146   0.151   0.145
Age                           23.072  23.138  23.081  23.118  21.000  21.000  21.000  21.000  23.536  23.617  23.651  23.682
Region
  Northeast                    0.167   0.176   0.157   0.161   0.210   0.188   0.206   0.188
  Midwest                      0.225   0.257   0.214   0.250   0.256   0.232   0.259   0.229
  South                        0.387   0.354   0.403   0.372   0.286   0.354   0.277   0.344
  West                         0.222   0.213   0.226   0.217   0.248   0.226   0.258   0.238
Metropolitan
  In MSA, in Central City      0.420   0.390   0.414   0.387                                   0.461   0.444   0.422   0.423
  In MSA, other                0.521   0.549   0.527   0.551                                   0.413   0.400   0.438   0.427
  Not in MSA                   0.048   0.051   0.050   0.052                                   0.126   0.156   0.140   0.150
Mother’s Education
  Less than High School        0.294   0.241   0.298   0.234                                   0.204   0.173   0.220   0.155
  High School                  0.333   0.339   0.330   0.329                                   0.336   0.318   0.304   0.310
  More than High School        0.373   0.420   0.372   0.437                                   0.451   0.506   0.467   0.512
Mother’s Age at 1st birth     22.791  23.236  22.640  23.202
  <18                          0.109   0.085   0.121   0.089                                   0.143   0.128   0.169   0.157
  18-19                        0.161   0.145   0.166   0.153                                   0.189   0.189   0.208   0.194
  20-24                        0.405   0.409   0.399   0.399                                   0.391   0.425   0.386   0.410
  25-29                        0.228   0.256   0.219   0.252                                   0.184   0.175   0.173   0.174
  30 and older                 0.097   0.104   0.096   0.108                                   0.088   0.079   0.054   0.056
# of Observations               3666    3666    3650    3650    4180    4180    4268    4268    1382    1382    1361    1361

Note: Unwtd. = unweighted; Wtd. = weighted. Empty cells indicate that the variable is not available in a given data set. Sample sizes and statistics are reported for the sample for high school diploma by age 20.

Table 10: Comparison of NLSY97 Estimates and CPS Estimates of Outcomes at Age 20

A. Estimates of Marginal Effects for High School Graduation by Age 20, Males

                        NLSY97                 CPS
                 Unweighted   Weighted Unweighted   Weighted
Region
  Northeast           0.039    0.055**      0.023     0.033*
                    (0.021)    (0.020)    (0.017)    (0.016)
  Midwest            0.039*    0.046**      0.023      0.027
                    (0.020)    (0.018)    (0.016)    (0.016)
  West              0.060**    0.052**      0.026    0.051**
                    (0.020)    (0.019)    (0.016)    (0.015)
Race/Ethnicity
  Black            -0.152**   -0.127**   -0.120**     -0.126
                    (0.017)    (0.017)    (0.026)    (0.023)
  Hispanic         -0.126**   -0.085**   -0.269**   -0.258**
                    (0.019)    (0.019)    (0.023)    (0.022)
Year
  2001               -0.010      0.009     -0.011     -0.025
                    (0.023)    (0.021)    (0.020)    (0.020)
  2002               -0.009     -0.001     -0.005      0.000
                    (0.022)    (0.021)    (0.019)    (0.020)
  2003               -0.008     -0.004     -0.014     -0.020
                    (0.022)    (0.021)    (0.020)    (0.020)
  2004                0.016      0.020      0.017      0.015
                    (0.023)    (0.022)    (0.019)    (0.019)
Observations           3828       3828       4123       4123

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

B. Coefficient Estimates for Weeks Worked in Calendar Year that Respondent Turned Age 20, Males

                        NLSY97                 CPS
                 Unweighted   Weighted Unweighted   Weighted
Region
  Northeast           1.463     2.000*     -0.526     -0.327
                    (0.946)    (0.927)    (0.886)    (0.889)
  Midwest           3.268**    3.129**      1.793    2.218**
                    (0.872)    (0.834)    (0.831)    (0.832)
  West                1.140     -0.343      0.278      0.153
                    (0.909)    (0.904)    (0.825)    (0.822)
Race/Ethnicity
  Black            -7.539**   -7.187**  -10.224**  -10.526**
                    (0.803)    (0.913)    (1.085)    (1.018)
  Hispanic            0.104      0.404      1.885      1.635
                    (0.859)    (0.962)    (0.821)    (0.838)
Year
  2001              3.845**    3.364**    -3.788*    -4.036*
                    (1.041)    (0.987)    (1.244)    (1.888)
  2002                1.465      1.772   -4.476**   -4.549**
                    (1.002)    (0.990)    (1.719)    (1.741)
  2003                0.566      0.598   -4.535**    -4.245*
                    (0.996)    (0.986)    (1.744)    (1.763)
  2004                1.554      1.378     -6.551   -6.773**
                    (1.010)    (1.003)    (1.729)    (1.749)
Observations           3765       3765       4794       4794

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

C.
Estimates of Marginal Effects for High School Graduation by Age 20, Females

                        NLSY97                 CPS
                 Unweighted   Weighted Unweighted   Weighted
Region
  Northeast           0.016     0.037*    0.038**    0.042**
                    (0.019)    (0.018)    (0.014)    (0.014)
  Midwest            0.039*    0.059**      0.019      0.018
                    (0.018)    (0.016)    (0.014)    (0.014)
  West              0.050**    0.049**     0.029*    0.052**
                    (0.018)    (0.017)    (0.014)    (0.013)
Race/Ethnicity
  Black            -0.077**   -0.063**   -0.055**   -0.048**
                    (0.016)    (0.016)    (0.020)    (0.018)
  Hispanic         -0.111**   -0.081**   -0.187**   -0.182**
                    (0.017)    (0.018)    (0.022)    (0.021)
Year
  2001                0.028      0.026     -0.020     -0.015
                    (0.021)    (0.020)    (0.018)    (0.018)
  2002               -0.008     -0.011      0.001      0.007
                    (0.020)    (0.019)    (0.017)    (0.017)
  2003                0.002      0.002      0.019      0.023
                    (0.020)    (0.019)    (0.017)    (0.016)
  2004                0.008     -0.004     -0.003     -0.010
                    (0.021)    (0.019)    (0.017)    (0.017)
Observations           3771       3771       4256       4256

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

D. Coefficient Estimates for Weeks Worked in Calendar Year that Respondent Turned Age 20, Females

                        NLSY97                 CPS
                 Unweighted   Weighted Unweighted   Weighted
Region
  Northeast         3.357**    3.772**     -1.490     -0.939
                    (0.935)    (0.908)    (0.820)    (0.891)
  Midwest           2.358**    2.750**      0.435      1.536
                    (0.858)    (0.803)    (0.790)    (0.846)
  West                0.743      0.818     -0.714     -0.340
                    (0.889)    (0.869)    (0.771)    (0.844)
Race/Ethnicity
  Black            -6.507**   -6.407**   -9.775**   -8.351**
                    (0.771)    (0.879)    (0.985)    (1.007)
  Hispanic         -2.859**    -2.401*   -2.585**     -0.572
                    (0.851)    (0.972)    (0.827)    (0.852)
Year
  2001              2.921**     2.172*    -3.869*     -2.969
                    (1.008)    (0.966)    (1.678)    (1.737)
  2002                0.728      0.238   -4.461**   -3.798**
                    (0.983)    (0.962)    (1.586)    (1.606)
  2003               2.050*     1.982*   -6.483**   -4.864**
                    (0.982)    (0.956)    (1.589)    (1.610)
  2004                0.204     -0.301   -7.141**   -5.031**
                    (0.989)    (0.963)    (1.581)    (1.610)
Observations           3718       3718       4937       4937

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

Table 11: Comparison of NLSY97 Estimates and NSFG Estimates for Outcomes at Age 20

A.
Estimates of Marginal Effects for Birth before Age 20, Males

                                                   NLSY97                                       NSFG
                                   Birth Years 1980 to  Birth Years 1980 and   Birth Years 1977 to   Birth Years 1977 to
                                       1984 in Round 9       1981 in Round 6          1981 in 2002          1981 in 2002
                                 Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted
Race/Ethnicity
  Black                             0.066**    0.046**    0.080**    0.051**     0.048*    0.060**      0.046      0.056
                                    (0.010)    (0.009)    (0.017)    (0.016)    (0.019)    (0.020)    (0.034)    (0.031)
  Hispanic                          0.047**     0.024*    0.054**      0.030      0.036    0.049**      0.012     0.058*
                                    (0.012)    (0.010)    (0.019)    (0.018)    (0.020)    (0.018)    (0.038)    (0.027)
MSA Status
  Not in MSA                          0.029      0.052      0.039      0.061     -0.022     -0.032     -0.026     -0.026
                                    (0.042)    (0.045)    (0.066)    (0.071)    (0.016)    (0.017)    (0.028)    (0.025)
  In MSA, Not in Central City        -0.022      0.014      0.013      0.041      0.029     0.041*      0.003      0.055
                                    (0.040)    (0.043)    (0.065)    (0.070)    (0.022)    (0.020)    (0.043)    (0.029)
Mother’s Characteristics
  Less than High School degree       0.026*    0.030**      0.030     0.035*      0.017      0.006     -0.007     -0.006
                                    (0.011)    (0.009)    (0.017)    (0.016)    (0.019)    (0.019)    (0.035)    (0.032)
  More than High Sch. degree       -0.052**   -0.042**    -0.050*    -0.034*      0.003      0.013     -0.020     -0.012
                                    (0.012)    (0.010)    (0.019)    (0.017)    (0.019)    (0.018)    (0.032)    (0.026)
Mother’s age at 1st birth
  18 to 19                          -0.028*     -0.019     -0.009     -0.010      0.023     -0.006      0.036      0.028
                                    (0.014)    (0.012)    (0.024)    (0.022)    (0.021)    (0.020)    (0.035)    (0.030)
  20 to 24                         -0.064**   -0.054**    -0.048*    -0.046*     -0.023    0.055**     -0.067    -0.062*
                                    (0.013)    (0.012)    (0.022)    (0.020)    (0.021)    (0.020)    (0.038)    (0.031)
  25 to 29                         -0.089**   -0.081**    -0.059*    -0.063*     -0.056    0.079**     -0.088     -0.057
                                    (0.017)    (0.015)    (0.027)    (0.025)    (0.031)    (0.029)    (0.049)    (0.035)
  30 and older                     -0.070**   -0.065**     -0.004     -0.032     -0.089    -0.109*
                                    (0.021)    (0.019)    (0.031)    (0.029)    (0.051)    (0.051)
Observations                           3666       3666       1499       1499        755        755        324        324

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

B.
Estimates of Marginal Effects for High School Graduation before Age 20, Males

                                                   NLSY97                                       NSFG
                                   Birth Years 1980 to  Birth Years 1980 and   Birth Years 1977 to   Birth Years 1977 to
                                       1984 in Round 9       1981 in Round 6          1981 in 2002          1981 in 2002
                                 Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted
Race/Ethnicity
  Black                            -0.093**   -0.075**   -0.131**   -0.111**   -0.110**   -0.125**     -0.017      0.002
                                    (0.018)    (0.018)    (0.027)    (0.027)    (0.039)    (0.040)    (0.058)    (0.069)
  Hispanic                           -0.032     -0.010     -0.050     -0.036     -0.018    -0.080*      0.079     -0.025
                                    (0.020)    (0.020)    (0.030)    (0.029)    (0.042)    (0.040)    (0.068)    (0.063)
MSA Status
  Not in MSA                          0.082      0.047     -0.020     -0.065     -0.006     -0.008      0.029     -0.023
                                    (0.074)    (0.068)    (0.102)    (0.098)    (0.033)    (0.033)    (0.048)    (0.048)
  In MSA, Not in Central City        0.142*      0.102      0.021     -0.017    -0.110*   -0.110**     -0.065     -0.088
                                    (0.068)    (0.063)    (0.100)    (0.097)    (0.046)    (0.042)    (0.065)    (0.067)
Mother’s Characteristics
  Less than HS degree              -0.139**   -0.132**   -0.104**   -0.104**   -0.109**   -0.166**    -0.145*   -0.215**
                                    (0.019)    (0.018)    (0.029)    (0.028)    (0.040)    (0.040)    (0.062)    (0.069)
  More than HS degree               0.132**    0.115**    0.144**    0.118**     0.081*     0.080*     0.110*     0.127*
                                    (0.019)    (0.017)    (0.030)    (0.026)    (0.037)    (0.036)    (0.052)    (0.052)
Mother’s age at 1st birth
  18 to 19                          0.081**    0.072**     -0.001     -0.013      0.063      0.038      0.030      0.018
                                    (0.027)    (0.026)    (0.041)    (0.040)    (0.047)    (0.047)    (0.074)    (0.079)
  20 to 24                          0.124**    0.107**     0.084*      0.056    0.157**    0.122**      0.089      0.040
                                    (0.024)    (0.024)    (0.037)    (0.036)    (0.044)    (0.043)    (0.066)    (0.064)
  25 to 29                          0.214**    0.195**    0.148**    0.114**    0.152**      0.098      0.105      0.014
                                    (0.029)    (0.028)    (0.045)    (0.042)    (0.054)    (0.053)    (0.073)    (0.070)
  30 and older                      0.159**    0.150**      0.099      0.090    0.302**    0.246**     0.258*      0.137
                                    (0.035)    (0.034)    (0.056)    (0.053)    (0.079)    (0.083)    (0.114)    (0.118)
Observations                           3630       3630       1495       1495        829        829        361        361

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

C.
Estimates of Marginal Effects of Having a Birth before Age 20, Females

                                                   NLSY97                                       NSFG
                                   Birth Years 1980 to  Birth Years 1980 and   Birth Years 1977 to   Birth Years 1977 to
                                       1984 in Round 9       1981 in Round 6          1981 in 2002          1981 in 2002
                                 Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted
Race/Ethnicity
  Black                             0.115**    0.095**    0.124**    0.099**    0.170**    0.121**    0.134**    0.102**
                                    (0.016)    (0.016)    (0.021)    (0.020)    (0.029)    (0.027)    (0.040)    (0.034)
  Hispanic                          0.059**    0.048**    0.092**    0.069**    0.173**    0.125**    0.114**      0.058
                                    (0.019)    (0.018)    (0.023)    (0.023)    (0.029)    (0.026)    (0.042)    (0.035)
MSA Status
  Not in MSA                          0.188      0.224      0.153      0.036      0.037      0.018      0.040     -0.006
                                    (0.111)    (0.125)    (0.108)    (0.078)    (0.025)    (0.022)    (0.035)    (0.029)
  In MSA, Not in Central City         0.104      0.170      0.096      0.003      0.068     0.058*      0.056      0.032
                                    (0.107)    (0.123)    (0.107)    (0.077)    (0.035)    (0.029)    (0.047)    (0.037)
Mother’s Characteristics
  Less than HS degree               0.109**    0.118**    0.086**    0.109**     -0.044     -0.036     -0.015      0.003
                                    (0.018)    (0.017)    (0.022)    (0.021)    (0.030)    (0.027)    (0.044)    (0.036)
  More than HS degree              -0.080**   -0.059**   -0.091**   -0.065**   -0.135**   -0.120**    -0.077*     -0.058
                                    (0.018)    (0.016)    (0.023)    (0.020)    (0.027)    (0.024)    (0.037)    (0.030)
Mother’s age at 1st birth
  18 to 19                           -0.016     -0.012      0.023      0.034    -0.065*     -0.014     -0.001      0.006
                                    (0.023)    (0.022)    (0.030)    (0.029)    (0.032)    (0.028)    (0.046)    (0.038)
  20 to 24                         -0.091**   -0.091**    -0.070*    -0.061*   -0.140**   -0.116**    -0.088*     -0.052
                                    (0.021)    (0.021)    (0.028)    (0.028)    (0.031)    (0.027)    (0.044)    (0.037)
  25 to 29                         -0.126**   -0.127**   -0.108**   -0.086**   -0.245**   -0.221**   -0.205**    -0.136*
                                    (0.026)    (0.024)    (0.034)    (0.032)    (0.044)    (0.041)    (0.064)    (0.055)
  30 and older                     -0.158**   -0.164**   -0.125**   -0.137**   -0.228**   -0.216**    -0.347*    -0.358*
                                    (0.033)    (0.031)    (0.043)    (0.041)    (0.069)    (0.067)    (0.143)    (0.173)
Observations                           3672       3672       2306       2306       1393       1393        567        567

Standard errors in parentheses. * significant at 5%; ** significant at 1%.

D.
Estimates of Marginal Effects of High School Graduation by Age 20, Females

                                                   NLSY97                                       NSFG
                                   Birth Years 1980 to  Birth Years 1980 and   Birth Years 1977 to   Birth Years 1977 to
                                       1984 in Round 9       1981 in Round 6          1981 in 2002          1981 in 2002
                                 Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted Unweighted   Weighted
Race/Ethnicity
  Black                              -0.012     -0.014     -0.015     -0.013   -0.186**   -0.161**   -0.134**   -0.107**
                                    (0.016)    (0.016)    (0.019)    (0.019)    (0.028)    (0.026)    (0.043)    (0.039)
  Hispanic                            0.002      0.005     -0.013     -0.010    -0.067*     -0.048     -0.082     -0.038
                                    (0.018)    (0.018)    (0.021)    (0.021)    (0.031)    (0.029)    (0.047)    (0.043)
MSA Status
  Not in MSA                         -0.012     -0.037     -0.079     -0.115     -0.009      0.003      0.013      0.031
                                    (0.095)    (0.094)    (0.091)    (0.088)    (0.025)    (0.023)    (0.038)    (0.033)
  In MSA, Not in Central City        -0.005     -0.029     -0.047     -0.088    -0.083*     -0.049     -0.062     -0.046
                                    (0.090)    (0.091)    (0.089)    (0.087)    (0.034)    (0.029)    (0.049)    (0.041)
Mother’s Characteristics
  Less than HS degree              -0.125**   -0.129**   -0.142**   -0.136**   -0.081**   -0.089**     -0.088    -0.093*
                                    (0.017)    (0.016)    (0.020)    (0.018)    (0.029)    (0.027)    (0.046)    (0.041)
  More than HS degree               0.113**    0.094**    0.105**    0.090**    0.132**    0.129**     0.101*     0.086*
                                    (0.018)    (0.016)    (0.022)    (0.019)    (0.028)    (0.025)    (0.041)    (0.035)
Mother’s age at 1st birth
  18 to 19                            0.019      0.019      0.009      0.012     0.074*     0.069*      0.040      0.047
                                    (0.022)    (0.022)    (0.026)    (0.025)    (0.033)    (0.029)    (0.052)    (0.046)
  20 to 24                          0.089**    0.090**     0.054*     0.062*    0.115**    0.109**      0.087      0.050
                                    (0.020)    (0.021)    (0.024)    (0.024)    (0.031)    (0.028)    (0.048)    (0.043)
  25 to 29                          0.140**    0.143**    0.112**    0.104**    0.158**    0.152**      0.060      0.042
                                    (0.025)    (0.024)    (0.031)    (0.029)    (0.042)    (0.039)    (0.061)    (0.055)
  30 and older                      0.174**    0.165**    0.146**    0.141**    0.239**    0.235**      0.153      0.130
                                    (0.033)    (0.030)    (0.040)    (0.037)    (0.075)    (0.073)    (0.103)    (0.091)
Observations                           3650       3650       2305       2305       1361       1361        557        557

Standard errors in parentheses. * significant at 5%; ** significant at 1%.
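
The star notation used in these tables can be related to an estimate and its standard error as sketched below in Python. The sketch assumes a two-sided test based on the normal approximation, with critical values 1.960 and 2.576; the paper does not state which test underlies the stars, and the function name is hypothetical. Because the published estimates and standard errors are rounded to three decimals, stars recomputed this way may occasionally differ from those printed.

```python
def significance_stars(estimate, std_error):
    """Return '**', '*', or '' for an estimate and its standard error.

    Assumes a two-sided z-test (normal approximation); this is an
    illustrative convention, not a method confirmed by the paper.
    """
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = abs(estimate / std_error)
    if z >= 2.576:  # two-sided 1% critical value
        return "**"
    if z >= 1.960:  # two-sided 5% critical value
        return "*"
    return ""


# Checked against three printed entries: 0.115 (0.016), -0.067 (0.031), 0.002 (0.018).
stars_black = significance_stars(0.115, 0.016)
stars_hispanic = significance_stars(-0.067, 0.031)
stars_none = significance_stars(0.002, 0.018)
```

Entries whose standard error is printed as (0.000), such as the ASVAB rows in Table 8, cannot be checked this way because the rounded standard error is not informative.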