Borrowing Constraints, Migrant Selection, and the Dynamics of Return and Repeat Migration

Joseph-Simon Görlach∗

March 22, 2016

Work in Progress

Abstract

This paper analyzes the impact of an exogenous change in household incomes in a migrant’s sending country on migration duration, the incidence of repeat migration and migrant selection. Higher incomes raise the opportunity cost of residing abroad, but also help facilitate a costly migration if households are borrowing constrained. To disentangle these mechanisms and to evaluate the effects of income changes on return and re-migration choices, I formulate a dynamic life cycle model of consumption, employment, emigration, return and repeat migration. Households may borrow up to an endogenous limit that gives higher-income households better access to credit. Given unobserved heterogeneity in productivity and the preference for migrating, identification of this income dependence of borrowing limits requires exogenous variation in income. I thus exploit a policy experiment that randomly allocated cash transfers in Mexico. I find that an increase in income in Mexico reduces migration duration, but increases both the average number of trips per migrant and the responsiveness to economic conditions. The effect on return migration makes immigrants staying at the destination more positively selected, which has important implications for the assessment from a U.S. perspective of migrations that have been induced by higher incomes in Mexico.

JEL codes: J61, O15, F22, D31
Keywords: Migration, Borrowing, Income, Selection

∗ Department of Economics, University College London and Centre for Research and Analysis of Migration; e-mail: joseph-simon.gorlach.09@ucl.ac.uk. I am grateful to my advisors Christian Dustmann and Jérôme Adda for their feedback and support.
I also greatly benefited from discussions with James Albrecht, Joseph Altonji, Orazio Attanasio, Mariacristina De Nardi, Aureo de Paula, Jan Eeckhout, Rebecca Lessem, Suphanit Piyapromdee, Imran Rasul, Mark Rosenzweig, Uta Schönberg and Michela Tincani, as well as participants of the joint CReAM/UCL, Norwegian School of Economics and University of Warwick Workshop on Topics in Labor Economics, participants at the Oxford Development Economics Workshop, the NEUDC conference at Brown, the Econometric Society’s European Winter Meeting at Bocconi and at various seminars at UCL. I acknowledge funding from the NORFACE program on migration.

1 Introduction

The impact of immigration on a destination country, its labor market and the fiscal system depends on the size and composition of the immigrant population. These in turn are determined by the dynamics of return and repeat migration, and the length of time different immigrants choose to stay in the destination country. These choices are influenced by factors that determine the affordability of a migration, with potentially important consequences of rising wealth levels in many traditional migrant sending countries for migrant selection and, ultimately, the effects of immigration on receiving countries. This paper evaluates the effect of a rise in household income in a migrant’s country of origin on return and re-migration when households are borrowing constrained. To assess likely implications for the receiving country, it further investigates the dynamics of compositional change within an immigrant population through the selection induced by return and repeat migration.1

One focus of this paper is on the monetary cost of migration, which is a major, though usually unobserved, determinant of migrant behavior. Part of this cost may have to be paid in advance and may prevent many migrations that would be welfare improving for the respective individuals.
Prohibitive migration costs are an often-cited argument in the literature for lower than expected emigration rates of individuals at the lower end of the earnings distribution despite presumably high returns from migration for low-skilled workers (Chiquiar and Hanson, 2005; McKenzie and Rapoport, 2010; Grogger and Hanson, 2011; Belot and Hatton, 2012; Dustmann and Okatenko, 2014). The extent to which wealth constraints are binding depends on a potential migrant’s income and on the ability to borrow. Hence, policies that increase incomes in migrant sending countries or ease credit constraints will, apart from increasing the opportunity cost of emigration, also help overcome migration constraints. Therefore, the overall effect of a rise in household income on migration decisions is ambiguous.

Given exogenous variation in income, a reduced-form analysis identifies the net effect of an increase in this opportunity cost and a relaxation of financial constraints, but cannot disentangle these two mechanisms. A distinction between the two components, however, can be achieved by putting some structure on the choices and constraints faced by individuals. To this end, I formulate a dynamic life cycle model of saving decisions, and both individual and family location choices. This allows me to study not only the channels through which an income shock affects emigration, but also the effect on the intensive margin of migration duration, as well as the proportion of individuals migrating repeatedly and the selection of return and repeat migrants.

1 The analysis focuses on the empirically important case of Mexico-U.S. migration, where repeat migration is a common phenomenon. The American Community Survey estimates that 11.7 million Mexicans reside in the U.S. (U.S. Census Bureau, 2012), which makes this the largest bilateral migrant population worldwide.
Since assets and debt levels are rarely observed both before and after migration costs have been covered, identification of the monetary cost of migration relies on observed emigrations conditional on previously held assets and income. This requires the analyst to specify the extent to which households have access to credit: If households can borrow up to some unknown limit to cover part of the cost of migration, it is unclear whether an observed migration has been facilitated by low migration costs or by a generous borrowing limit. Hence, information on realized migrations alone cannot identify both the cost of migration and the unobserved limit to debt that a household can hold. Existing studies have therefore assumed that individuals have no access to credit (see Rendon and Cuecuecha, 2010; Thom, 2010; Nakajima, 2014). While this assumption may be a plausible simplification for those at the very bottom of the income distribution, it is violated for a considerable part of the Mexican population, as I will illustrate below. Such income dependence implies an endogeneity of borrowing limits with respect to income that makes the identification of credit access in a model like the one used here, where earnings and migration choices are affected by unobserved factors, challenging. Whereas in a simpler model without individual heterogeneity, observed covariation between income and borrowing could be used to pin down credit constraints that depend on income, this provides little information on the structural relation between income and credit access in a model where unobserved individual productivity may be related to the preference for migrating. 
For instance, if one motive for taking up credit is to finance a migration, a positive correlation between credit take-up and income could either be due to better access to loans by high-income households (which is the relation that needs to be identified), or due to a higher demand for credit to finance migrations, arising from a lower attachment to the country of origin and a stronger preference for moving to the U.S. Similarly, unobserved factors like English language knowledge that enhance earnings in Mexico and access to loans may at the same time be correlated with the preference for moving to the U.S. and thus the demand for credit. Hence, an observed joint distribution of income, borrowing and migration could be generated by different combinations of income dependent borrowing limits and correlations in unobserved heterogeneity.

To achieve identification of households’ access to credit, I use exogenous variation in income from the randomized introduction of the Mexican cash transfer program PROGRESA together with information on differences in loan take-up by households in treatment and control communities. While exogenous cash transfers from the program do affect both borrowing and the probability of emigrating, the identifying assumption I make is that the randomization of the program is uncorrelated with unobserved preferences for moving to the U.S. Hence, whereas standard survey covariation between income and borrowing can identify the income dependence of credit limits only under the restriction of orthogonality between individual productivity and preferences, the effect of randomized variation in income from the experiment on borrowing allows such identification without restrictions on unobserved heterogeneity. As such, the policy experiment breaks the contaminating link between income and other factors that determine the desire to emigrate and thus the demand for credit.
The treatment effect of cash transfers from the program on household borrowing is used as an additional moment in the structural estimation of the model, which together with observed migrations allows a joint identification of migration costs and debt limits. A further novelty of the model developed here is that it is the first within the structural migration literature to consider the role of aggregate seasonal fluctuations in labor demand for the timing of migration decisions. The prospect of higher earnings is one of the most important motives for individuals to migrate, and this prospect depends on employment probabilities in the destination country. A large share of Mexican immigrants works in the U.S. agricultural sector or in construction, where employment is strongly seasonal, which translates into seasonality in the desire to migrate. To account for this, the model allows for seasonal variation in employment transitions. This is important, as some migrations do not take place in winter if individuals know that the probability of receiving a job offer in the U.S. is low. A decreasing marginal utility of wealth implies that this reduction in migrations is not fully compensated by higher job offer probabilities in the U.S. during summer. A model without this seasonal fluctuation falsely attributes lower overall migration rates to high migration costs. I estimate the model using data from the Mexican Family Life Survey, the U.S. Survey of Income and Program Participation, the Mexican Migration Project and PROGRESA’s evaluation survey. I explicitly address the non-representativeness of some of the survey populations, as well as the asymptotics of a simulated minimum distance estimator under different sample sizes. 
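As a rough illustration of how an experimental treatment effect can enter a minimum distance criterion as one moment among others, the following sketch may help. It is not the paper's estimator: the function names, the toy numbers and the diagonal weighting are all assumptions made for this example, and in an actual application the simulated moments would come from solving and simulating the structural model.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def treatment_effect(outcome_treated, outcome_control):
    """Difference in mean outcomes between randomized treatment and control groups."""
    return mean(outcome_treated) - mean(outcome_control)


def minimum_distance(data_moments, simulated_moments, weights):
    """Weighted squared distance between data and simulated moments.

    Assumption for this sketch: a diagonal weighting matrix, passed as a list
    of non-negative weights (one per moment).
    """
    return sum(
        w * (d - s) ** 2
        for d, s, w in zip(data_moments, simulated_moments, weights)
    )


# Toy data (hypothetical): indicator for whether a household holds a loan.
borrow_treated = [0, 1, 1, 1]
borrow_control = [0, 0, 1, 0]

# Moment vector: [treatment effect on borrowing, emigration rate (made up)].
data_moments = [treatment_effect(borrow_treated, borrow_control), 0.10]
simulated_moments = [0.40, 0.10]  # would come from simulating the model
criterion = minimum_distance(data_moments, simulated_moments, [1.0, 1.0])
```

An estimator of this type searches over model parameters for the values that make `criterion` as small as possible; the treatment effect on borrowing then disciplines the parameters governing credit access alongside the migration moments.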
I then use the model to evaluate dynamic effects of higher Mexican incomes both on compositional changes in the immigrant population due to selective immigration and return migration, and on behavioral changes in the length of stay and the frequency of migrations by previous migrants. I find that an increase in household income in Mexico reduces the time immigrants stay in the U.S., but that, contrary to a model without financial constraints, the distribution of the number of migrations is shifted towards a larger number of trips per migrant. A 10 percent increase in Mexican incomes is predicted to raise the average number of trips undertaken by Mexican migrants to the U.S. by more than 6 percent, net of compositional changes. Important for selection, I find that return migration becomes more responsive to economic outcomes in the host country as origin incomes increase. These higher incomes reinforce the already negative selection of returning migrants, implying that over time, the U.S. is left with an increasingly positive selection of stayers among an initial immigrant cohort. This compositional change leads to 6 percent higher average earnings within the population of Mexican immigrants in the U.S. Both the responsiveness of migrations to economic outcomes and the selection of stayers are important for an assessment of immigration from a receiving country’s perspective.2 These results also suggest that while much of the literature on selective return migration has focused on effects of host country outcomes (e.g. Hu, 2000; Dustmann, 2003; Lubotsky, 2007), economic conditions in a migrant’s country of origin may have to be taken into account more strongly in future research.

My work is related to a paper by Lessem (2013), who among other things studies the effect of wages at origin on emigration rates, migration duration and the average number of trips using a life cycle model of Mexico-U.S. migration.
While her focus is on the impact of varying degrees of border enforcement, she also uses her model to evaluate effects of a change in Mexican incomes, however abstracting from monetary migration costs and thus financial constraints. In her model, a rise in the opportunity cost of migrating through an increase in Mexican wages therefore mechanically predicts a reduction in Mexican emigration as well as a decrease in the average length of stay and the number of trips per migrant. My results show that some of these conclusions reverse when financial constraints are accounted for.

The paper also is related to reduced form studies that estimate the effect of an income shock on the probability of emigration. In the context of Mexico-U.S. migration, Angelucci (2015) finds that an exogenous increase in incomes from PROGRESA transfers raises emigration rates by about 50 percent.3 While such analysis identifies the net effect of a decrease in the relative attractiveness of migrating to the U.S. and of a relaxation of financial constraints, it cannot disentangle these two effects. It also cannot identify the cost of migration, which is an inherent part of the financial constraint, and which prevents individuals at the lower end of the wealth distribution from migrating, nor the dynamic effects on return and re-migration choices that result from the income shock.

2 A wider discussion of the implications of temporary migrants’ behavior for migrant sending and receiving countries is provided in Dustmann and Görlach (2016).

3 See Stecklov et al. (2005) and Rubalcava and Teruel (2006) for similar analyses in this context. Using cross-state variation in a public works program in India, Imbert and Papp (2015) on the other hand find a negative effect of participation in the program on urban migration. For international migration from Indonesia, Bazzi (2014) uses rainfall and commodity price shocks to evaluate determinants of emigration. In line with the existence of credit constraints, he finds a positive effect of income shocks on emigration from villages with relatively more small landholders. See Clemens (2014) for a survey of this literature.

By introducing exogenous variation of the type used in the reduced form literature to the analysis of migration dynamics based on structurally estimated life cycle models, this paper also contributes to the growing literature that combines structural estimation with experimental variation. While some studies use policy reforms or a randomized treatment of sub-populations to examine the external validity of a structural model that has been estimated on the non-treated sample (e.g. Lise et al., 2005; Todd and Wolpin, 2006; Kaboski and Townsend, 2011), exogenous variation can alternatively be used for identification of model parameters that are not well identified using observational survey covariations alone. This is the approach taken here, where identification of the income dependence of debt limits requires information on borrowing in response to exogenous variation in incomes that can be credibly separated from heterogeneous preferences. The randomized experiment thus is used to allow a more flexible specification of the structural model’s assumptions on access to credit, increasing its credibility. On the other hand, the structural model increases the scope of evaluation of the policy experiment, in particular toward a more dynamic analysis of return and repeat migration, and a disentangling of different underlying mechanisms. In this respect, I build on Attanasio et al. (2012), who use the randomized introduction of PROGRESA to analyze education choices in Mexico. Another recent example from a different context is the paper by Adda et al.
(2014), who use an equilibrium model to evaluate the effect on non-drug-related crime of a counterfactual expansion of a south London experiment depenalizing cannabis possession to the entire city.

The next section presents the model on which my results are based, before describing in Section 3 the data sources used, with a number of descriptive statistics providing further detail on the context considered. In Section 4, I discuss identification of the model’s parameters and produce estimates of the treatment effect of randomized cash transfers on household borrowing, which provide the additional moments required to identify endogenous debt constraints in the structural estimation. Also in that section, I discuss issues arising in the combination of multiple data sources, as is required for the estimation of my model. In Section 5, estimation results are reported and the dynamic implications of higher sending country incomes evaluated.

2 Model

The model is chosen to reflect emigration, return migration and re-migration decisions of household heads and dependent family members under financial constraints, as well as their asset accumulation and loan take-up. The model’s main purpose is to provide a framework in which the effect of origin country incomes on migration dynamics under borrowing constraints can be evaluated. To the best of my knowledge, it is the first model to be used to identify monetary migration costs taking into account that households may have access to credit to cover part of this cost. To make the model applicable to the Mexico-U.S. context, where many Mexican migrant workers take up jobs in the U.S. agricultural sector or in construction, the model further is the first to consider the role of aggregate seasonal variation in labor demand.4 This section describes the primitives of the model, including agents’ information set and choices, state variable transitions and the timing of choices, and finally the dynamic specification of the model.
Additional specification details can be found in Appendix A.

State variables. A household head $i$ at time $t$ makes decisions based on age $a_{it}$, employment status $e_{it} \in \{\text{working } (w), \text{not working } (nw)\}$, whether there are dependent family members, $f_{it} \in \{0, 1\}$, the current locations of the household head and family, $\mathbf{l}_{it} \equiv (l_{it}, l^{f}_{it}) \in \{MX, US\}^2$, legal status in the U.S. $\delta_{it} \in \{0, 1\}$, total U.S. experience $X^{US}_{it}$, the accumulated stock of assets $A_{it}$, and the current season $s_t \in \{\text{summer}, \text{winter}\}$. Furthermore, decisions are based on information known by the agent, but unobserved by the econometrician. This includes the unobserved productivity in different locations, $\alpha_i \equiv (\alpha^{MX}_i, \alpha^{US}_i)$, preferences $\pi_i$ towards the home country, and transitory shocks to earnings and locational preference, summarized in $\Upsilon_{it} \equiv (v^{l}_{it}, \varepsilon^{l}_{it})$. The vector $\Omega_{it} = (a_{it}, e_{it}, f_{it}, \mathbf{l}_{it}, \delta_{it}, X^{US}_{it}, A_{it}, s_t, \alpha_i, \pi_i, \Upsilon_{it})$ collects the state variables observed by an agent at time $t$, though some of this information is revealed sequentially within the period, as will be detailed below. In the estimation, a period is taken to be six months.

Family, legal status and location. At the beginning of each period, it is determined whether an individual gains or loses dependent family, with state dependent transition rates $p_f(\Omega_{it})$. Also at the beginning of the period, individuals learn about their U.S. legal status. Transition rates $p_\delta(\Omega_{it})$ for an individual’s legal status vary with age and employment status. The timing in the model is such that after family size and legal status are known, household members choose a location. It is assumed that when there is dependent family, families make consumption and location decisions in agreement, so that choices maximize household welfare.5 This can result, however, in no one, all, or just the household head migrating.
In data from the Mexican Migration Project (see Section 3), the probability of a spouse migrating while the household head stays in Mexico is only about 4 percent of the probability of the reverse. I thus exclude the possibility that dependent family resides in the U.S. while the household head is in Mexico. To further save on computation time, I assume that all family members share the same legal status in the United States. For married migrants sampled by the Mexican Migration Project this is true in 93 percent of cases. Individuals choosing to migrate face a monetary cost, which varies by age and whether an immigrant holds a U.S. visa, and which may be different for the household head and other family.

4 29% of Mexicans in my Mexican Migration Project sample (more on this below) report having worked in agriculture during their last trip to the U.S., with another 15% in construction.

5 As this is a unitary household model focusing on migration behavior of male Mexican household heads, I will refer to household heads and individuals interchangeably.

Employment and earnings. I focus on the extensive margin of employment and assume that individuals either work full time or do not work, with job offers arriving at a rate $\lambda_w(\Omega_{it})$, and jobs being lost at a rate $\lambda_{nw}(\Omega_{it})$, each depending on individual characteristics such as age and time spent in either of the two countries, but also on seasonally varying aggregate demand for workers. If working, log monthly earnings in location $l \in \{MX, US\}$ are given by

$$\log y^{l}_{it} = \alpha^{l}_i + f^{l}(a_{it}, X^{US}_{it}) + v^{l}_{it},$$

where $f^{MX}(\cdot)$ is a function of age, while $f^{US}(\cdot)$ also is a function of the U.S. experience an individual has accumulated up to time $t$.6 Idiosyncratic shocks to wages, $v^{l}_{it}$, are assumed to be independent and identically distributed across time and individuals, with mean zero and location specific variance $\sigma^2_{v^l}$. Individuals retire at age $a_{ret}$ and from then until the (known) end of life receive retirement benefits $y^{ret}(\Omega_{it})$.
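The earnings process just described can be illustrated with a small simulation sketch. Everything specific in it is an assumption made for illustration: the quadratic age profile, the linear return to U.S. experience, the coefficient values and the normal distribution of the shock are placeholders, since the paper's actual functional forms are given in its Appendix A.

```python
import random

# Hypothetical coefficients (linear and quadratic age terms) per location.
AGE_COEF = {"MX": (0.04, -0.0005), "US": (0.03, -0.0004)}
# Hypothetical return to a year of U.S. experience; enters U.S. earnings only.
US_EXPERIENCE_COEF = 0.02


def age_experience_profile(location, age, us_experience):
    """Stand-in for f^l(a_it, X^US_it): age profile plus, in the U.S., experience."""
    b1, b2 = AGE_COEF[location]
    profile = b1 * age + b2 * age ** 2
    if location == "US":
        profile += US_EXPERIENCE_COEF * us_experience
    return profile


def draw_log_earnings(alpha, location, age, us_experience, sigma_v, rng):
    """log y^l_it = alpha^l_i + f^l(a_it, X^US_it) + v^l_it.

    alpha maps location to the individual's productivity there; the shock v is
    drawn as mean-zero normal with standard deviation sigma_v (an assumption,
    the text only fixes mean and variance).
    """
    shock = rng.gauss(0.0, sigma_v)
    return alpha[location] + age_experience_profile(location, age, us_experience) + shock


rng = random.Random(0)
alpha_i = {"MX": 5.0, "US": 6.5}  # hypothetical productivity draws
log_y_mx = draw_log_earnings(alpha_i, "MX", age=30, us_experience=0, sigma_v=0.3, rng=rng)
log_y_us = draw_log_earnings(alpha_i, "US", age=30, us_experience=4, sigma_v=0.3, rng=rng)
```

Keeping productivity as a location-specific pair mirrors the model's $\alpha_i$, so that an individual can be more productive in one country than in the other.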
See Appendix A for details on the exact specification of $\lambda_w(\Omega_{it})$, $\lambda_{nw}(\Omega_{it})$, $f^{l}(a_{it}, X^{US}_{it})$ and $y^{ret}(\Omega_{it})$.

6 Since location is chosen endogenously, returns to U.S. experience after a return to Mexico cannot be identified in my data separately from a correlation between productivity $\alpha^{MX}_i$ and the preference for being in the U.S.; see also the discussion on identification in Section 4.1.

Budget constraint. The driving motives for temporary migration in the model are financial wealth accumulation for an increase in future consumption, the buffering of employment and wage fluctuations, and the response to preference shocks. For a given location $l$, I assume a standard intertemporal budget constraint augmented by migration cost $C(\Omega_{it})$ for household heads and $C^{f}(\Omega_{it})$ for the remaining family to relate future assets $A_{it+1}$ to income $y^{l}_{it}$, the current stock of assets $A_{it}$, and consumption $c_{it}$,

$$A_{it+1} = (1 + r_A)A_{it} + y(\Omega_{it}) - c_{it} - \mathbb{1}[l_{it-1} = MX \cap l_{it} = US]\, C(\Omega_{it}) - \mathbb{1}[l^{f}_{it-1} = MX \cap l^{f}_{it} = US]\, C^{f}(\Omega_{it}), \qquad (1)$$

where the real interest rate $r_A$ depends on whether a household holds debt, i.e. on whether $A_{it} < 0$. At the beginning of working life, the stock of assets of household $i$ is assumed to equal $A_{i0} = \tilde{A}_0 \exp(\alpha^{MX}_i)$. To abstract from the decision of where to hold accumulated savings, I assume that interest rates are identical in Mexico and the United States. In order to take into account differences in currency purchasing power, however, the stock of assets is adjusted by the real exchange rate $x$ if individuals (re-)migrate.

When a household head is in Mexico, the household may choose to borrow up to some limit $B(E[y^{MX}_{it}], \Omega_{it})$ in order to smooth consumption or to finance a migration.7 Motivated by evidence presented in Section 3, I let this limit vary by (expected) income. I further assume that borrowing constraints become tighter towards the end of life, enforcing a repayment of debt during retirement (see Appendix A for details).
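The asset transition in equation (1), together with the implication that a migration can only be financed if cash on hand plus available credit covers its cost, can be sketched as follows. This is only an illustration under stated assumptions: the interest rates, cost figures and borrowing limits are made-up numbers, the names are hypothetical, and the real exchange rate adjustment of assets upon migration is omitted.

```python
from dataclasses import dataclass


@dataclass
class HouseholdPeriod:
    assets: float        # A_it (negative values are debt)
    income: float        # y(Omega_it)
    consumption: float   # c_it
    head_moves: bool     # head moves from Mexico to the U.S. this period
    family_moves: bool   # dependent family moves from Mexico to the U.S.


def interest_rate(assets, r_saving=0.01, r_debt=0.05):
    """r_A depends on whether the household holds debt (rates are hypothetical)."""
    return r_debt if assets < 0 else r_saving


def next_assets(period, cost_head, cost_family):
    """Equation (1): next-period assets net of consumption and migration costs."""
    r = interest_rate(period.assets)
    return (
        (1 + r) * period.assets
        + period.income
        - period.consumption
        - cost_head * period.head_moves      # indicator times C
        - cost_family * period.family_moves  # indicator times C^f
    )


def can_finance_migration(assets, income, borrowing_limit, cost_head):
    """True if cash on hand plus the maximum obtainable credit covers the cost."""
    cash_on_hand = (1 + interest_rate(assets)) * assets + income
    return cash_on_hand + borrowing_limit >= cost_head


period = HouseholdPeriod(assets=1000.0, income=500.0, consumption=300.0,
                         head_moves=True, family_moves=False)
assets_after_move = next_assets(period, cost_head=800.0, cost_family=400.0)

# The same poor household cannot migrate without credit but can with a limit of 600.
without_credit = can_finance_migration(100.0, 200.0, borrowing_limit=0.0, cost_head=800.0)
with_credit = can_finance_migration(100.0, 200.0, borrowing_limit=600.0, cost_head=800.0)
```

The last two lines hint at why realized migrations alone are uninformative about credit access: an observed move is consistent either with a low cost or with a higher cost combined with a positive borrowing limit.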
Hence, assets are constrained by

$$A_{it} \geq \min\{-B(E[y^{MX}_{it}], \Omega_{it}),\ (1 + r_A)A_{it-1}\} \quad \text{at all times}, \qquad (2)$$

i.e. apart from temporal variation in income that drives $B(E[y^{MX}_{it}], \Omega_{it})$ below debt levels inherited from the previous period, $B(E[y^{MX}_{it}], \Omega_{it})$ is the credit constraint. Equations (1) and (2) summarize the identification problem when both borrowing constraint $B$ and migration cost $C$ are unknown, and the level of assets just after a migration has been paid for, $A_{it+1}$, is unobserved. This typically is the case and leaves it unclear whether an observed migration has been facilitated by a $C$ that is low enough to be covered by cash on hand, $(1 + r_A)A_{it} + y(\Omega_{it})$, or whether migration costs are in fact higher, while $B > 0$ and households can borrow to partly pay for the migration.

7 Evidence by Banerjee and Munshi (2004) and Laszlo and Santor (2009) suggests that weaker network structures at the destination are likely to limit credit access relative to the origin.

Preferences. Individuals derive utility from consumption, family and locational amenities. For individuals residing in the U.S. who have dependent family, utility flows are further adjusted by where this family resides. With these features in mind, per period utility is specified as

$$u_{it} = \phi_{l^f}^{f_{it}}\, c_{it}^{\phi_c}\, \pi^{l}_i + \varepsilon^{l}_{it},$$

where $\phi_{l^f} = \phi_{l^f \neq l}$ if families are spatially separated, and $\phi_{l^f} = \phi_{l^f = l}$ if not. In addition to heterogeneous time-constant location preferences $\pi^{l}_i$, households face transitory preference shocks $\varepsilon^{l}_{it}$ that are assumed to be additive and independently extreme value distributed.8 As only relative utility flows in the two locations are identified, $\pi^{MX}_i$ is normalized to one, so that $\pi^{US}_i$ becomes the marginal utility from consumption in the U.S. relative to marginal utility from consumption in Mexico.9 I further assume that $u_{it}$ goes to minus infinity for $c_{it} < 0$, which prevents individuals for whom $(1 + r_A)A_{it} + y(\Omega_{it}) + B(E[y^{MX}_{it}], \Omega_{it}) < C(\Omega_{it})$, i.e. for whom cash on hand plus the maximum obtainable credit does not cover the migration cost, from migrating.

8 Shocks $\varepsilon$ have cdf $P(\varepsilon \leq x) = \exp(-\exp(-x/s_\varepsilon(a_{it})))$, where $s_\varepsilon(a_{it})$ is an estimated spread parameter, specified as a linear function of age. Additivity of $\varepsilon$ in the utility function, independence and extreme value distribution imply that the location choice probabilities take a logistic form, with value functions in the home and host country as arguments (Rust, 1987).

9 An alternative explanation for migration rates that fall short of fully eliminating persistent wage differences across locations is the existence of insurance networks in an individual’s home community, which are weakened in case of a migration (see e.g. the recent paper by Munshi and Rosenzweig, 2016). In contrast to location specific utility flows from consumption, the existence of insurance networks alone, however, cannot rationalize temporary emigration.

Welfare. After family and legal status are revealed, the location of both the household head and of dependent family members has been chosen, and individuals know the job offers and income available to them, consumption is chosen to maximize household welfare. The dynamic problem for these choices is given by the Bellman equation

$$V(\Omega_{it}) = \max_{c_{it},\, l_{it}} \left\{ u_{it}(c_{it}, \Omega_{it}) + \beta E_t[V(\Omega_{it+1})] \right\},$$

where $\beta$ discounts future utility streams and $E_t[\cdot]$ is the expectations operator given the information available at time $t$. Individuals live until age $a_{end}$, with $V(\Omega_{it} \mid a_{it} = a_{end}) = 0$.

3 Data and Descriptives

To estimate the model, data from different sources are needed, as identification of the full set of parameters requires information on individuals’ choices and outcomes for nonmigrants and return migrants in Mexico, for both temporary and permanent Mexican immigrants in the U.S., and of course on migration behavior itself. In addition, given a potential correlation between unobserved heterogeneity in location preferences and productivity, credible identification of the income dependence of access to credit requires information on borrowing and variation in income that can be separated from the preference for a costly migration. Hence, I further use information from the randomized introduction of PROGRESA cash transfers (more on this below) as an exogenous source of variation in income, and data on its effect on borrowing. This section introduces the data sources used and provides some descriptive statistics. Estimates of the aforementioned treatment effect of a cash transfer program on borrowing are shown in Section 4.1, where identification of the various groups of model parameters is discussed.

Mexican Migration Project (MMP). A major challenge to empirical migration research in a context where migrants may choose to move back and forth is the lack of datasets that track individuals across international borders. Existing analyses of repeat migration thus rely almost exclusively on retrospective information about migration histories from the Mexican Migration Project.10 Between 1982 and 2013, the MMP surveyed 22,894 households in 143 communities in Mexico. The main advantage for the purpose of this paper is its complete retrospective life histories, which contain detailed information on employment, family status and migrations for each household head and spouse. MMP data are most suited to assess the duration of completed migrations, the documentation used, the incidence of repeat migrations and their correlation with individual characteristics. Besides annual information on whether an individual has been to the U.S., the MMP data also report the length of stay, so that separate migrations in consecutive years can be identified. This allows an analysis of seasonal migration, which so far has been notably absent from the structural econometric migration literature. I explore this further below.
10 mmp.opr.princeton.edu. Deléchat (2001), Colussi (2003), Thom (2010), Lessem (2013) and Nakajima (2014, 2015) all use the MMP as their main data source.

Figure 1a displays the distribution of the number of completed migrations made by Mexican men who have reached age 65 or older, and are thus likely to have completed their total lifetime number of trips. In this sample of former U.S. migrants, 60 percent report having been to the U.S. more than once, among whom about two thirds have migrated more than twice. Figure 1b shows the distribution of the duration of the most recent migration. Although there is much variation in migration duration, a considerable proportion, 38 percent, returned within one year after emigration.

Figure 1: Number of migrations and migration durations. Source: MMP 143. Figure 1a shows the distribution of the number of migrations made per returned migrant by the end of working life. The distribution is based on the MMP cross-sectional files, restricting the sample to Mexican-born non-tertiary educated males aged 65 or older at the time of the survey. Figure 1b, showing the distribution of migration duration, refers to the last trip to the U.S. by Mexican-born non-tertiary educated males aged 16-64 at the time of the survey.

For the main analysis, I restrict the sample to the years 1996-2007. The reason for this is twofold: first, a series of policy changes since the Immigration Reform and Control Act (IRCA) of 1986 gradually tightened control of the U.S. southern border, culminating in the Illegal Immigration Reform and Immigrant Responsibility Act of 1996. Each of these reforms, which include some more local measures, such as Operation Hold-the-Line in 1993 and Operation Gatekeeper in 1994, expanded border control (see e.g. Gathmann, 2008, for details). I do not intend to model these shifts in the environment faced by Mexican migrants.
Second, the focus on post-1996 data avoids contamination of my results by the peso crisis of December 1994 and the most abrupt economic woes that followed (see e.g. Sachs et al., 1996). I further exclude individuals who were born in the U.S. and may thus have a stronger attachment to the United States. Lastly, I restrict attention to male household heads aged 16-64 without tertiary education.11 I do, however, use information on migrations of household heads’ spouses to identify model parameters relating to dependent family members’ residence location. The same restrictions apply to moments generated from the other datasets discussed below. This leaves a relatively homogeneous sample to which the model of Section 2 is applied.

While the MMP is a valuable source on migration experiences, it has a number of shortcomings. First, it is relatively weak on asset information. Data on assets, however, are essential for the estimation of a model of international migration in which emigration is restricted by financial constraints. Moreover, the MMP is a representative data source only within the communities surveyed, while the communities themselves are a nonrandom selection of all Mexican localities. Although the population in these communities presumably is representative of a large share of the Mexican population, a number of authors criticize its bias toward communities with a strong history of sending migrants to the United States (Orrenius and Zavodny, 2005; Hanson, 2006; McKenzie and Rapoport, 2007; Fernández-Huertas Moraga, 2011). I explicitly address this non-randomness in the estimation, as explained in Section 4.2.

Mexican Family Life Survey (MxFLS). Due to the above-mentioned shortcomings, moments from the MMP are supplemented with information from the Mexican Family Life Survey,12 which is a representative household survey conducted in 2002, with follow-ups in 2005-06 and 2009-12, and of which I use the first two waves.
In addition to detailed household and individual characteristics, including income, debt and savings, the MxFLS reports whether and where individuals are thinking of migrating in the future. Based on this information, the MxFLS data support the earlier assertion that emigration is particularly desirable for Mexicans at the lower end of the income distribution. Figure 2 shows that Mexicans who express the wish to move to the U.S. are disproportionately drawn from the lower tail of the income distribution, with a significant difference of about 14.97 percent in mean incomes, conditional on age, gender, years of schooling, occupation and year observed.

11 In the MMP data, and given the other restrictions, less than 10 percent of individuals are tertiary educated.
12 http://www.ennvih-mxfls.org.

Figure 2: Log annual income in Mexico and the wish to move to the United States. Source: Mexican Family Life Survey, 2002, 2005. Log annual income is denoted in purchasing power adjusted USD, deflated to 2005. The sample includes individuals aged 16-64. The density is computed using an Epanechnikov kernel with 3/4 of the optimal bandwidth to prevent oversmoothing.

Important for the purpose of this paper, the MxFLS inquires about asset and debt holdings. In the presence of positive migration costs, wealth constraints have been highlighted as an important factor in migration decisions and migrant selection (Chiquiar and Hanson, 2005; Belot and Hatton, 2012; McKenzie and Rapoport, 2010; Grogger and Hanson, 2011; Dustmann and Okatenko, 2014). Such constraints may serve as an explanation of lower observed emigration rates from the bottom of the income distribution than would be expected given the often lower spread and higher level of this distribution in destination countries. An underlying assumption is that individuals face borrowing constraints that are too tight to finance a migration.
Indeed, existing dynamic structural migration models which allow for asset accumulation (Rendon and Cuecuecha, 2010; Thom, 2010; Nakajima, 2014) assume that individuals cannot borrow at all. In contrast, almost one fifth of the working age population surveyed by the MxFLS reports holding some, if only modest, amounts of debt. This of course does not imply that there are no binding borrowing constraints. Given the evidence, however, these are likely to be at non-zero levels of debt. Angelucci (2015) argues that access to credit and borrowing limits are likely related to income, including any anticipated governmental transfers that can be used as collateral. A positive relation between debt and income is supported by Figure 3. The left panel shows the cumulative distribution of debt held separately for above and below median income individuals. At the mean, the difference in debt levels is strongly significant, with high income individuals holding more than twice as much debt as individuals with incomes below the median. The same strongly positive relation is depicted in the right panel of the same figure, which plots (log) debt against (log) incomes. This pattern motivates the more flexible specification of debt limits in the model of Section 2.

Figure 3: Distribution of debt levels by income. Source: Mexican Family Life Survey, 2002, 2005. Debt is calculated as negative net assets (in purchasing power adjusted USD). Figure 3a includes zeros to illustrate the fraction holding debt. Means and 95% confidence intervals are computed for positive debt levels (negative net assets) only. Figure 3b plots log debt levels against log annual earnings, and a fitted local polynomial regression line with 95% confidence interval, excluding the top and bottom 0.5% of observations. Both samples include non-tertiary educated male household heads aged 16-64.

Survey of Income and Program Participation (SIPP).
A second shortcoming of the MMP is that, as it surveys households in Mexico, long-term Mexican emigrants are not covered. To assess the evolution of employment and wages for these individuals, I use data from the Survey of Income and Program Participation.13 The SIPP is a short panel survey, the advantage of which over other and possibly longer U.S. panel surveys is its large sample size that allows a separate analysis of Mexican immigrants. In addition, the SIPP provides monthly information suitable to assess the relevance of seasonal variation. As Figure 4a suggests, seasonality in employment rates is indeed prevalent. In fact, fluctuations in U.S. labor demand can be expected to be stronger, as seasonal variation in labor supply due to migration is likely to be pro-cyclical. This is supported by data on monthly apprehensions during 1999-2007 provided by the U.S. Border Patrol, which suggest roughly twice as many apprehensions at the U.S. southern border during the summer months as in winter (Figure 4b). To the extent that this reflects a larger immigrant stock in summer, such increased labor supply indeed confirms that seasonal variation in labor demand is actually stronger than apparent from the seasonal employment profile depicted in Figure 4a. To take this important feature of Mexico-U.S. migration into account, the model in Section 2 allows job offer and job loss probabilities to vary by season. In line with my restriction of the MMP sample, I use the three slightly overlapping but unconnected SIPP panels 1996-2001, 2001-2004 and 2004-2007.

13 http://www.census.gov/sipp

Figure 4: Seasonality. Sources: (a) SIPP, 1996-2007; (b) U.S. Border Patrol, 1999-2007. The graphs show seasonality in (a) the share among non-tertiary educated Mexican-born male household heads aged 16-64 residing in the U.S. who worked for at least one week during the respective month, and (b) average monthly apprehensions at the U.S. southern border.
Vertical lines show the 95% confidence intervals.

The main variables used from each of these three data sources are listed in Table 1. Panel (a), which summarizes the MMP variables, separately displays means and standard deviations in different reference populations: for the MMP’s retrospective life history files and for an individual’s most recent migration to the U.S. Further, the topmost panel distinguishes between moments of the entire sample population, and of (retrospective) observation points in the U.S. The first entry shows the strong migration history of communities sampled by the MMP, with 8.6 percent of the individuals sampled spending at least part of a given year in the U.S. This high propensity to migrate contributes toward many individuals from these communities departing for the U.S. without legal documentation. The fraction of 37.7 percent of relatively low-income MMP immigrants residing in the U.S. legally is likely lower than the share among all Mexican immigrants.14

14 Passel (2005) estimates that 47% of Mexicans in the U.S. have legal permits; see Hanson (2006) for a detailed account of unauthorized Mexican immigration to the United States.

While most moments describing employment transitions in the U.S. are computed from SIPP data, I use covariation of employment transitions and legal status from the MMP to pin down the effect of having legal documentation on job finding and job loss rates in the U.S. A comparison with employment rates from the SIPP data at the end of the table shows virtually no difference across the two samples in this respect. Migration duration and the amount saved refer to an individual’s most recent migration spell. Savings in this case include remittances made for the purpose of saving throughout the migration spell and the total amount of savings accumulated and repatriated at return. As shown earlier in Figure 3a, close to one fifth of the Mexican population report having negative net assets, with (positive) debt levels averaging around 500 USD. Immigrants surveyed by the SIPP have been in the U.S. for close to 17 years on average. This is considerably more than the average migration duration of just five years for the exclusively temporary migrants covered by the MMP, and highlights the importance of using a data source that includes permanent migrants as well. Finally, average earnings in the U.S. of Mexicans who select into migration are about 2 log points higher than average earnings in Mexico, suggesting an on average strong economic incentive for U.S. migrations.15

15 Hanson (2006) reports differences in average wages of Mexicans in the U.S. and in Mexico based on the respective Censuses in 2000. Depending on age and education, this difference amounts to 97%-490% of wages in Mexico.

Table 1: Summary statistics for the three main data sources used.

(a) Mexican Migration Project (MMP)

Life history files, full sample population
Variable                                  Mean        Std. dev.
Is in the U.S.                            0.086       0.281
Number of trips∗                          2.177       2.495
$X^{US}$ (in years)∗                      5.621       6.348
Individuals                               10,202

Life history files, if in the U.S.
Variable                                  Mean        Std. dev.
Legal status                              0.377       0.485
Working                                   0.900       0.300
Family in the U.S. (given head is)        0.266       0.442
Individuals                               1,366

Cross-sectional files, last U.S. migration
Variable                                  Mean        Std. dev.
Migration duration (in years)             4.878       6.598
Total amount saved (in USD)               5393.124    4636.865
Individuals                               1,835

(b) Mexican Family Life Survey (MxFLS)
Variable                                  Mean        Std. dev.
Age                                       42.276      11.474
Has dependent family                      0.941       0.235
Working, Oct-Mar                          0.907       0.290
Working, Apr-Sept                         0.885       0.319
Log annual earnings (in USD, PPP adj.)    8.253       0.992
Net assets (in USD, PPP adj.)             1067.702    14209.370
Has debt                                  0.187       0.390
Amount of debt (in USD, PPP adj.)         502.828     1773.736
Individuals                               5,810

(c) Survey of Income and Program Participation (SIPP)
Variable                                  Mean        Std. dev.
Age                                       38.732      10.046
Years since immigration                   16.903      9.617
Working, Oct-Mar                          0.887       0.294
Working, Apr-Sept                         0.901       0.272
Log annual earnings (in USD)              10.228      0.773
Individuals                               1,447

MMP, 1996-2007; MxFLS, 2002, 2005; SIPP, 1996-2007. In each case, the sample includes non-tertiary educated male household heads aged 16-64. SIPP statistics on age, years since immigration and earnings are based on the March survey, so these observations are annual rather than monthly. Individuals are considered working in a given half-year if this is the case in at least 4 months. Monetary values are deflated to 2005, and adjusted by purchasing power parities if referring to Mexico. ∗ Conditional on ever having been to the U.S.

PROGRESA evaluation data. In order to identify how access to credit varies with income when migration costs are unobserved and preferences for the U.S. may be correlated with unobserved earnings factors, so that the demand for credit varies across income groups as well, I use evaluation data for the Mexican Programa de Educación, Salud, y Alimentación (PROGRESA, later called Oportunidades, now Prospera).16 In May 1998, PROGRESA started handing out conditional cash transfers in a randomized group of 320 “treated” communities. To evaluate the program, household data were collected both in this treatment group and in a control group of 186 communities, where the program was introduced in November 1999. Eligibility of families was determined by a pre-program survey in 1997, based on a multi-dimensional marginalization measure.17 The pre-program sample allows for a comparison of prior outcomes of households in program and control localities. Information on loan take-up is not available for 1997. Instead, Table 2 lists differences between a number of wealth proxies and other household characteristics. Overall, this comparison suggests small and statistically insignificant differences in these dimensions.18

16 Programa de Educación (2012)
17 See Skoufias et al. (1999) for details. Similar to the MMP sample, households eligible for PROGRESA are not representative of the Mexican population, which will be taken into account in the estimation, as explained in Section 4.2.
18 See also Behrman and Todd (1999) for an extensive evaluation of the PROGRESA randomization.

Table 2: Comparison of pre-treatment household wealth proxies in program and control communities.

Variable                      Control mean    Difference between treatment and control
age of HH head                40.320          −0.164 (0.220)
literate HH head              0.744           −0.007 (0.011)
HH head works                 0.947           −0.010 (0.006)
hours worked                  42.134          0.125 (0.395)
hourly wage (in pesos)        3.450           −0.081 (0.069)
no. of rooms                  1.640           −0.003 (0.024)
land owned (in hectares)      1.902           −0.054 (0.099)
Observations                  2450            6596

PROGRESA evaluation data, 1997. Column 1 lists mean outcomes for the control sample, while column 2 shows the difference between program and control observations before introduction of the program, with standard errors in parentheses.

Starting in 1998, eligible families in program communities received scholarships for each child aged 8 to 21 who attended school in one of the last four grades of primary, or the first three grades of secondary school. Judging from school attendance rates in control communities, the transfer was in fact unconditional for low-income families with children up to age 14, of whom over 97 percent attended school in the absence of the program. For the estimation, I thus restrict attention to these families. Amounts paid vary by grade attended and gender of the child as displayed in the left graph of Figure 5. This randomized introduction of the program induces exogenous variation in incomes and allows the construction of the additional moments needed to identify income dependent borrowing limits and thus migration costs. The data show that loans taken out by eligible families in program localities during the 6 months leading up to November 1998, when evaluation data were collected, are on average higher than in control localities.
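As a quick check on the balance comparison in Table 2, the reported differences and standard errors can be converted into t-ratios. The sketch below uses only the numbers printed in Table 2; the 1.96 cutoff is an assumption (a two-sided 5 percent test under a large-sample normal approximation), not something stated in the text.

```python
# Differences (treatment minus control) and standard errors as reported in Table 2.
table2 = {
    "age of HH head": (-0.164, 0.220),
    "literate HH head": (-0.007, 0.011),
    "HH head works": (-0.010, 0.006),
    "hours worked": (0.125, 0.395),
    "hourly wage (pesos)": (-0.081, 0.069),
    "no. of rooms": (-0.003, 0.024),
    "land owned (hectares)": (-0.054, 0.099),
}

# Assumed cutoff: two-sided 5% critical value of the standard normal.
CRITICAL_5PCT = 1.96


def t_statistics(rows):
    """Return the ratio of each difference to its standard error."""
    return {name: diff / se for name, (diff, se) in rows.items()}


t_stats = t_statistics(table2)
significant = [name for name, t in t_stats.items() if abs(t) > CRITICAL_5PCT]

for name, t in t_stats.items():
    print(f"{name:24s} t = {t:6.2f}")
print("significant at 5%:", significant)
```

Under this cutoff none of the seven differences is flagged, in line with the statement that pre-program differences are statistically insignificant.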
Figure 5: PROGRESA’s monthly cash transfers (1998, 2nd term) and loans taken during the last 6 months. Sources: Schultz (2004) and PROGRESA, November 1998. Figure 5a depicts monthly cash transfers by PROGRESA by gender of the child and grade attended. Figure 5b shows the distribution of log amounts of loans taken within the previous 6 months by treatment status. The sample includes male heads aged 16-64 of eligible households with children aged 8-14 attending school. The density is computed using an Epanechnikov kernel with 3/4 of the optimal bandwidth to prevent oversmoothing.

Figure 5b depicts the conditional density of the log amount of these recent loans in the two groups of locations, and when discussing model identification in Section 4, I provide point estimates of the treatment effect of randomized transfers from the program on loan take-up, which serves as an additional moment in the structural estimation.

4 Estimation

The model in Section 2 can be solved by backward induction, and the choice functions obtained are used to simulate migration and consumption behavior for a sample of individuals, whose moments can be compared to those observed in the Mexican and U.S. data. I estimate the 101 parameters of the model by minimizing the distance of 210 moments computed from model simulations to their empirical counterparts in the four datasets used. As I combine different data sources, which all have different sample sizes and partly represent different populations, a number of important issues arise, which are discussed in Section 4.2. First, however, I discuss identification of the model parameters, with further details provided in Appendix B.

4.1 Identification

Most parameters are identified from static or dynamic conditional data moments that can be obtained through auxiliary regressions, and which, apart from selection that is modeled explicitly, are relatively direct counterparts to the respective parameters.
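The logic of matching auxiliary regression coefficients can be illustrated with a minimal sketch. Everything in it is hypothetical: the data-generating process in `simulate_sample`, the variable names and the parameter values are invented for illustration and are not the paper's model. The point is only that the same OLS regression is run on observed and on simulated data, and the coefficient vectors are compared.

```python
import numpy as np


def ols_coefficients(y, X):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def auxiliary_moments(found_job, age, legal):
    """Linear probability model of a job-finding indicator on observed states."""
    X = np.column_stack([np.ones_like(age, dtype=float), age, legal])
    return ols_coefficients(found_job.astype(float), X)


def simulate_sample(n, base_rate, legal_effect, rng):
    """Hypothetical data-generating process, for illustration only."""
    age = rng.integers(16, 65, size=n).astype(float)
    legal = rng.integers(0, 2, size=n).astype(float)
    found_job = rng.random(n) < base_rate + legal_effect * legal
    return found_job, age, legal


rng = np.random.default_rng(0)
# Stand-in for the observed data (generated with a legal-status effect of 0.20).
m_data = auxiliary_moments(*simulate_sample(50_000, 0.30, 0.20, rng))
# Simulations under a candidate parameter close to, and far from, the truth.
m_sim_good = auxiliary_moments(*simulate_sample(50_000, 0.30, 0.20, rng))
m_sim_bad = auxiliary_moments(*simulate_sample(50_000, 0.30, 0.00, rng))

dist_good = float(np.sum((m_data - m_sim_good) ** 2))
dist_bad = float(np.sum((m_data - m_sim_bad) ** 2))
print(dist_good, dist_bad)
```

A candidate parameter that reproduces the observed regression coefficients yields a smaller distance than one that does not, which is what the estimation exploits.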
For instance, to identify parameters governing transitions in family status ($p^f(\Omega)$), legal status ($p^\delta(\Omega)$) and employment ($\lambda^e(\Omega)$), I match coefficients from OLS regressions of observed transitions in these outcomes on observed state variables that are arguments of these functions. Note, however, that due to the endogenous selection of individuals into the Mexican and U.S. samples, these parameters cannot simply be pre-estimated outside the model. Similarly, to identify parameters from the earnings function (the mean level of productivity $\alpha_i^l$, parameters in $f^l(a_{it}, X_{it}^{US})$ and the variance $\sigma^2_{v^l}$ of transitory shocks), I regress log earnings of workers in Mexico on age indicators, and log earnings of Mexican workers in the U.S. on indicators of age and U.S. experience.19 Unobserved heterogeneity in productivity $\alpha_i^l$ around its mean is identified by quantiles of within-individual mean earnings residuals from these regressions.

19 The joint distribution of earnings (over two waves) and past migration experience in the MxFLS does not allow a separate identification of returns in Mexican earnings to having been to the U.S. from selection in migration choices that is due to a correlation between productivity and the relative preference for the U.S. The literature so far has been ambiguous about whether there are returns to a temporary U.S. migration for Mexican workers: while Reinhold and Thom (2013) do find small positive returns, Lacuesta (2010) argues that observed earnings differences between Mexican non-migrants and returnees are likely the result of selective emigration.

The distribution of relative preferences for being in the U.S., $\pi_i^{US}$, is pinned down by the distribution of U.S. experience, as well as by coefficients from an auxiliary regression of past U.S. migration on quantiles of within-individual mean earnings residuals for individuals working in Mexico from the above regressions, and coefficients from a regression of staying in the U.S.
for another period on mean earnings residual quantiles in the United States. These latter two sets of moments also link the dimensions of unobserved heterogeneity in the model. The average number of trips per migrant by age, in turn, is informative for the spread parameter of transitory shocks to locational preferences, $s_\varepsilon(a_{it})$. The adjustment of utility flows for having family and for where this family resides, $\phi^f_l$, is identified from covariation of assets or debt with having family, and with where dependent family members reside. The parameter relating productivity to the initial stock of assets, $\tilde{A}_0$, and risk aversion, $1 - \phi_c$, are identified by the level and the evolution of savings over the life cycle. Given asset accumulation and access to credit, migration cost parameters are identified from observed migrations conditional on observed state variables. More specifically, the coefficients from regressions of indicators for household heads and spouses making a new trip to the U.S. on age and legal status are informative for migration costs $C(\Omega_{it})$ and $C^f(\Omega_{it})$ for documented and undocumented individuals at different ages. Identification of these cost parameters here crucially depends on migrants’ access to credit. Under unobserved borrowing constraints, observed migrations only identify a combination of migration costs and borrowing limits, so that identification also requires information on borrowing. In addition, the model allows for access to credit to depend on income. Given that high income individuals also may have a high preference for migrating to the U.S. and, hence, a potentially higher demand for credit to finance migrations, this debt limit cannot be identified from survey covariation of debt and income alone. Such covariation would be sufficient to identify the income dependence of borrowing limits in a simpler model that abstracts from correlated unobserved heterogeneity in dimensions that determine both income and the demand for credit.
The more realistic model used here, however, does not impose such a restriction. Hence, correlations observed between income and the propensity to migrate, and between income and borrowing, in principle can be generated by different combinations of the structural correlation in the unobserved heterogeneity dimensions and of migrants’ access to credit. In addition to the above moments, I thus use the effect on borrowing of an exogenous variation in incomes induced by the randomized introduction of PROGRESA, a cash transfer program detailed in Section 3, under the assumption that this randomized policy is uncorrelated with individual preferences. The average treatment effect of being covered by the program,

$$ATE = E[loan_i \mid \mathbf{1}_i^{treated} = 1] - E[loan_i \mid \mathbf{1}_i^{treated} = 0],$$

is identified by $\alpha_1$ in an OLS regression of the form

$$loan_i = \alpha_0 + \alpha_1 \mathbf{1}_i^{treated} + \gamma' x_i + u_i, \qquad (3)$$

where the sample is restricted to eligible households with children in the relevant age range attending school, and $x_i$ includes a full set of age indicators for the male household head, as well as his employment and marital status. To proxy for pre-program wealth, I further include indicators for the number of rooms and the amount of land owned by the household. Table 2 above lists the pre-program differences in these variables. While the randomized introduction of the program guarantees consistency, inclusion of these controls reduces the residual variance and increases precision of the point estimates. Table 3 shows the results of this estimation. In this sample of fairly poor households, the mean monthly transfer amount of 260.32 pesos (51.14 PPP adjusted USD) corresponds to 27.8 percent of household heads’ average earnings in control villages. This sizeable exogenous variation in income helps to pin down the income dependence of borrowing limits.
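A regression of the form (3) can be sketched as follows. The data are simulated and entirely hypothetical (the single control `rooms`, the effect size and the error variances are chosen for illustration only), and the standard errors use the basic HC0 heteroskedasticity-robust formula, which is an assumption about the exact variant used in the paper.

```python
import numpy as np


def ols_with_robust_se(y, X):
    """OLS estimates with heteroskedasticity-robust (HC0) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich "meat": sum_i u_i^2 x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


rng = np.random.default_rng(1)
n = 20_000
treated = rng.integers(0, 2, size=n).astype(float)  # randomized assignment
rooms = rng.integers(1, 4, size=n).astype(float)     # hypothetical wealth proxy
true_effect = 0.38                                    # illustrative value only
# Error variance differs by treatment status, so robust errors matter.
noise = rng.normal(0.0, 1.0 + 0.5 * treated, n)
loan = 5.0 + true_effect * treated + 0.1 * rooms + noise

X = np.column_stack([np.ones(n), treated, rooms])
beta, se = ols_with_robust_se(loan, X)
alpha1_hat, alpha1_se = beta[1], se[1]
print(f"alpha_1 = {alpha1_hat:.3f} (robust s.e. {alpha1_se:.3f})")
```

Because assignment is randomized, the coefficient on the treatment indicator recovers the effect with or without the control; the control mainly absorbs residual variance.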
While I do not find evidence for an increase in the extensive margin of credit take-up (column 1), the average level of recent (positive) loans increases by 0.38 log points or 63.72 PPP adjusted USD. After the introduction of a state variable indicating whether an individual is from a treated community of origin or a control community, these are moments the model can generate and that will be used in the structural estimation below. The model’s estimation by matching simulated to observed moments does not actually require a consistent estimate $\hat{\alpha}_1$, as (3) only serves as an auxiliary regression. The importance rather lies with the income variation being unrelated to location preferences. The model then implies, for a given set of parameters, a distribution of incomes and a corresponding demand for credit. To the extent that for some individuals this demand exceeds the (unobserved) borrowing constraint, i.e. that the constraint is binding, the covariation of income and observed borrowing captured by $\hat{\alpha}_1$ identifies the borrowing limit.20 The full set of moments and their role in identification are listed in Appendix B.

20 Adda and Eaton (1998) use a similar strategy to identify constraints to sovereign debt.

Table 3: Average treatment effect of the program on loans taken during the previous 6 months.

                          loan > 0       log(loan amount in USD)
$\mathbf{1}^{treated}$    0.00170        0.380
                          (0.00436)      (0.192)∗∗
Observations              6490           186

PROGRESA evaluation data, November 1998. The sample includes eligible male household heads aged 16-64. Dependent variable: loans taken within past 6 months. ATE identified by OLS, controlling for age (full set of indicators), employment status, marital status, number of rooms (indicators for 1, ..., 9, 10+ rooms), land owned (indicators for 1, ..., 9, 10+ hectares) in both regressions. Heteroskedasticity robust standard errors in parentheses; ∗∗ p < 0.05.

4.2 Data Combination

Two important issues arise when combining different data sources for the structural estimation of the model parameters: first, the datasets used here have different target populations, and two (the MMP data and the PROGRESA evaluation sample) are not representative of my population of interest, i.e. the population of non-tertiary educated male Mexican household heads. Second, all four datasets have different sample sizes and thus provide moments of different precision. I address these concerns in turn.

4.2.1 Representativeness

The communities sampled by both the Mexican Migration Project and PROGRESA are predominantly rural, low-income villages. The model presented in Section 2, however, was chosen as a description of the entire population of Mexican-born male household heads without tertiary education, as are, conditional on the respective location of residence, the moments generated from the two other data sources used. The dimensions along which the samples differ are not clear a priori. The higher poverty level among households covered by PROGRESA, however, is the most obvious deviation from representativeness, while the main recurrent critique against the MMP in the literature is its bias toward communities with a strong history of sending migrants to the United States (Orrenius and Zavodny, 2005; Hanson, 2006; McKenzie and Rapoport, 2007; Fernández-Huertas Moraga, 2011). Some of these authors maintain that while the MMP might not be a good representation of the Mexican population as a whole, it probably is a good approximation of the population of Mexican migrant sending households. Given that interviews take place in Mexico, however, the latter is a relief only to the extent that the emigration and return migration process is modeled. To further address the oversampling of migrant sending households by the MMP and different income levels across the sampled populations, I allow for different weights of the unobserved heterogeneity types in the model.
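The role of sample-specific type weights can be illustrated with a small sketch. The three weight vectors are the estimated weights reported in the paper's footnote 23 for four types; the type-specific moment values in `type_moment` are purely hypothetical, as is the assumption that the moment in question is a simple mean (so that the population moment is a weighted average across types).

```python
import numpy as np

# Estimated type weights as reported in footnote 23 of the paper (four types).
weights = {
    "representative": np.array([0.167, 0.101, 0.487, 0.245]),  # MxFLS and SIPP
    "MMP": np.array([0.000, 0.236, 0.279, 0.485]),
    "PROGRESA": np.array([0.082, 0.714, 0.205, 0.000]),
}

# Hypothetical simulated moment for each type (e.g. a share), illustration only.
type_moment = np.array([0.05, 0.10, 0.20, 0.40])


def weighted_moment(type_moments, w):
    """Average a type-specific moment using sample-specific type weights."""
    w = np.asarray(w, dtype=float)
    # Renormalize: the reported weights are rounded and may not sum exactly to 1.
    return float(type_moments @ (w / w.sum()))


sample_moments = {s: weighted_moment(type_moment, w) for s, w in weights.items()}
for sample, value in sample_moments.items():
    print(f"{sample:15s} {value:.4f}")
```

The same simulated type-level behavior thus produces different moments for each sample, which is how one model can be matched to data from populations with different compositions.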
The estimation approximates heterogeneity in the population by allowing for $T$ discrete types $\tau$ of simulated individuals, each of which is associated with a 3-tuple of preference for the U.S. ($\pi^{US}$), productivity in Mexico ($\alpha^{MX}$) and productivity in the U.S. ($\alpha^{US}$). The value of each of these components and the weights $\omega_\tau$ assigned to each type with unobserved characteristics $\{\pi_\tau^{US}, \alpha_\tau^{MX}, \alpha_\tau^{US}\}$, $\tau \in \{1, \ldots, T\}$, are estimated based on the combination of residual quantiles from auxiliary regressions of earnings and location on observed state variables as explained above in Section 4.1. To allow both for a higher propensity to migrate conditional on observables and for lower productivity levels in rural communities, I estimate separate sets of weights $\{\omega_1^{MMP}, \ldots, \omega_T^{MMP}\}$ and $\{\omega_1^{PROGRESA}, \ldots, \omega_T^{PROGRESA}\}$ used in the construction of simulated moments that have their empirical counterparts in the MMP and PROGRESA samples, respectively. Suppose, for instance, that stronger migrant networks from MMP communities reduce the utility cost of residing in the U.S., and that log earnings conditional on observables are lower in these locations. Then types with a higher preference for the U.S. $\pi_\tau^{US}$ and lower productivity in Mexico $\alpha_\tau^{MX}$ would receive a higher weight in simulated moments that are to be matched with empirical moments from the MMP data than in moments from the nationally representative Mexican Family Life Survey. Note that this not only allows for different productivity or preference levels across samples, but leaves the entire joint distribution of the three dimensions of unobserved heterogeneity unrestricted, allowing, for example, also for different levels of inequality across samples. As explained above, differences in the propensity to migrate and earnings levels are likely the most important concerns and are addressed by these multi-dimensional weights. Exogenous differences (i.e.
differences unexplained by the model) along other dimensions are possible of course, though a simple comparison of sample moments from the four sources is barely informative about this. As an example, lower unemployment rates in the MxFLS sample, where earnings are higher than in PROGRESA villages, may (rather than reflecting a fundamental difference in local labor demand) result from better affordability of migration costs, so that after a job loss emigration is a viable option, while unemployed individuals in PROGRESA villages who cannot afford emigration stay and are counted as unemployed. While this is a pure selection issue, other mechanisms are also captured by the model. Employment rates may vary across samples, for instance, also due to differences in past U.S. experience, which will affect employment transition probabilities after a return to Mexico. Such effects are modeled explicitly and hence do not require any re-weighting.

4.2.2 Different Sample Sizes

A further concern arises irrespective of the representativeness of the samples used. All four datasets in my analysis have different sample sizes and thus tend to yield moments of different precision. If all data moments needed for identification were observed from the same source with sample size $N$, Gourieroux et al.’s (1993) indirect inference estimator would converge at rate $\sqrt{N}$. My estimation matches simulated moments to empirical counterparts from four samples $\varsigma \in \{MMP, MxFLS, SIPP, PROGRESA\}$ of different sizes $N_\varsigma$. While consistency of the simulated minimum distance estimator is unaffected by the use of multiple samples, the derivation of the asymptotic distribution requires an assumption on the rate at which these samples increase.
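To make the combination of moments from several samples concrete, the following sketch stacks moments from two hypothetical samples into one vector and evaluates a quadratic minimum distance criterion with a block-diagonal weighting matrix, one block per sample. The moment variances, the linear stand-in for the model in `simulate_moments` and the choice of inverse-variance weights are all assumptions made for illustration; the paper's model is nonlinear and solved by simulation.

```python
import numpy as np


def block_diag(blocks):
    """Assemble a block-diagonal matrix from a list of square blocks."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out


def smd_criterion(m_data, m_sim, W):
    """Quadratic distance D' W D with D = m_data - m_sim."""
    D = m_data - m_sim
    return float(D @ W @ D)


# Hypothetical data moments and sampling variances from two samples.
m_a, var_a = np.array([0.086, 2.177]), np.diag([0.0001, 0.0040])
m_b, var_b = np.array([0.187]), np.diag([0.0009])

m_data = np.concatenate([m_a, m_b])
# One weighting block per sample; here the inverse of each moment variance.
W = block_diag([np.linalg.inv(var_a), np.linalg.inv(var_b)])

# Stand-in for model simulation: a hypothetical linear map from parameters
# to moments, used only so that the minimizer has a closed form.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.05]])


def simulate_moments(theta):
    return A @ theta


# Weighted least squares solution minimizes the criterion in this linear case.
theta_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ m_data)
print(theta_hat, smd_criterion(m_data, simulate_moments(theta_hat), W))
```

More precisely measured moments receive larger weights, so samples of different sizes contribute to the criterion in proportion to the precision of their moments.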
In line with Angrist and Krueger (1992) and Arellano and Meghir (1992), who derive the asymptotics of a two sample instrumental variables estimator in a situation where identification requires moment conditions from two independent data sources,21 I assume that sample sizes increase at proportional rates, and let simulated sample sizes $N_\varsigma^s$ increase at a rate such that

$$\lim_{N_\varsigma \to \infty,\; N_\varsigma^s \to \infty} \left( N_\varsigma / N_\varsigma^s \right) = n_\varsigma,$$

with $0 < n_\varsigma < \infty$. The simulated minimum distance estimator $\hat{\theta}$ for a vector of parameters $\theta$ minimizes the criterion

$$\Gamma(\vartheta) = D(\vartheta)' W D(\vartheta) = \left( m^d - m^s(\vartheta) \right)' W \left( m^d - m^s(\vartheta) \right),$$

where $D(\vartheta) = m^d - m^s(\vartheta)$ is the difference between a vector of observed data moments $m^d$ and the corresponding moments $m^s(\vartheta)$ simulated from the model with structural parameters $\vartheta$, and $W$ is a weighting matrix. Importantly, the observed moment vector may be a collection of moments calculated from different data samples, such that $m^d = (m^d_{MMP} \; m^d_{MxFLS} \; m^d_{SIPP} \; m^d_{PROGRESA})'$. The moments used in this paper are OLS estimates of coefficients in linear auxiliary models, and are thus asymptotically normally distributed. Given this, and under the additional assumptions listed in Appendix C,

$$\sqrt{N} (\hat{\theta} - \theta) \xrightarrow{d} \mathcal{N}\left( 0,\; \left[ \frac{\partial D'}{\partial \vartheta}(\hat{\theta}) \, W \, \frac{\partial D}{\partial \vartheta'}(\hat{\theta}) \right]^{-1} \left[ \sum_{\varsigma} N (1 + n_\varsigma) \frac{\partial D_\varsigma'}{\partial \vartheta}(\hat{\theta}) \, W_\varsigma \, \mathrm{var}(m^d_\varsigma) \, W_\varsigma \, \frac{\partial D_\varsigma}{\partial \vartheta'}(\hat{\theta}) \right] \left[ \frac{\partial D'}{\partial \vartheta}(\hat{\theta}) \, W \, \frac{\partial D}{\partial \vartheta'}(\hat{\theta}) \right]^{-1} \right),$$

with $D_\varsigma(\hat{\theta}) = m^d_\varsigma - m^s_\varsigma(\hat{\theta})$, and where $W_\varsigma$ are blocks on the diagonal of the weighting matrix $W$, with one block of weights for the moments from each sample $\varsigma$. I provide a derivation of this result as well as the assumptions required in Appendix C.

21 See also Ridder and Moffitt (2007) for an extensive survey on data combination.

5 Results

5.1 Model Fit

The model was chosen to mirror migration choices in a context where many migrants choose to return and possibly re-migrate at a later stage.
The left panel of Figure 6a displays the distribution of the number of migrations undertaken up until the time individuals were surveyed by the Mexican Migration Project.22 For comparison, the right panel of Figure 6a shows the same distribution in the population simulated by the model. Similarly, Figure 6b shows the empirical and simulated distributions of cumulative U.S. experience, that is the time in the U.S. that may have been accumulated over several trips, while Figure 7 shows the model fit for the conditional distributions of the time continuously spent in the U.S. by legal status. Note that in the structural estimation only means of these outcomes are targeted rather than the full distributions. Overall, I conclude that although the distribution of cumulative U.S. experience predicted by the model is somewhat too compressed, with too many individuals predicted to have been to the U.S. for between 5 and 10 years, the model replicates well the prevalence of migration temporariness and repeat migrations. Appendix B lists the full set of observed and simulated moments used in the structural estimation. Figure 6: Model fit: Number of migrations and cumulative U.S. experience. Distributions of (a) the number of migrations in the MMP data at the time of the survey and the corresponding distribution in the population simulated by the model; and (b) cumulative U.S. experience in the MMP data and the simulated sample. Model predictions are based on 20,000 simulated individuals, drawing from the MMP’s age distribution at the time of the survey and using estimated weights used in the construction of moments with empirical counterparts in the MMP. 22 Note that this is weakly less than the total number of migrations during an individual’s life cycle, which Figure 1 attempted to capture by restricting the sample to individuals aged 65 or older. 25 Figure 7: Model fit: Migration duration by legal status. 
Distributions of the number of migrations in the MMP at the time of the survey and the corresponding distribution in the population simulated by the model for (a) immigrants with a U.S. visa; and (b) for undocumented immigrants. Model predictions are based on 20,000 simulated individuals, drawing from the MMP's age distribution at the time of the survey and using estimated weights used in the construction of moments with empirical counterparts in the MMP.

5.2 Estimates

The model has 101 estimated parameters. I focus here on a subset, in particular on estimates describing preferences, migration costs and access to credit. A full list of the structural parameter estimates is provided in Tables 17-23 in Appendix D.

Preferences. Utility in the model is derived from consumption, family status and family location, as well as from locational amenities that may be valued differently across individuals. Everything else equal, individuals tend to derive slightly higher utility from being in Mexico, with an average utility loss of 6 percent from migrating ($\pi_i^{US} = 0.97$), though with considerable heterogeneity across individual types (see Table 4).23 The estimate of $\phi^c$ indicates a decreasing marginal utility of consumption. Per period utility flows are adjusted by whether an individual has dependent family members, and by whether they reside in the same location. An estimate of $\phi^f_{l=l^f}$ ($\phi^f_{l \neq l^f}$) larger (smaller) than one means that individuals derive positive utility (suffer a utility loss) from having family if this family resides in the same (a different) location. Estimates suggest that individuals who have family that does not migrate are, at equal consumption flows, more than three times better off if staying with their family in Mexico than if migrating.

23 The estimated weights for the four types are (0.167, 0.101, 0.487, 0.245) in the representative data sources (MxFLS and SIPP), (0.000, 0.236, 0.279, 0.485) for the MMP sample, and (0.082, 0.714, 0.205, 0.000) for the PROGRESA sample. This yields a higher mean relative preference for being in the U.S. of 1.07 among household heads in MMP communities, and a lower one of 0.76 in the PROGRESA sample. The correlation between $\pi_i^{US}$ and log earnings heterogeneity in Mexico, $\alpha_i^{MX}$, is 0.77, that with log earnings heterogeneity in the U.S., $\alpha_i^{US}$, is -0.78.

Table 4: Structural estimates of preference, borrowing constraint and migration cost parameters.

Parameter    Point estimate    (Standard error)

Preferences: $u_{it} = \phi^f_{l}\, \pi_i^{l}\, c_{it}^{\phi^c} + \varepsilon^{l}_{it}$
preference of type 1 for the U.S. ($\pi_1^{US}$)    0.640    (0.050)
preference of type 2 for the U.S. ($\pi_2^{US}$)    0.721    (0.010)
preference of type 3 for the U.S. ($\pi_3^{US}$)    0.956    (0.014)
preference of type 4 for the U.S. ($\pi_4^{US}$)    1.315    (0.024)
returns to consumption ($\phi^c$)    0.627    (0.002)
effect of spatial separation from family ($\phi^f_{l \neq l^f}$)    0.161    (0.008)
effect of family in same location ($\phi^f_{l=l^f}$)    3.295    (0.037)

Borrowing limit: $B(E[y_{it}], \Omega_{it}) = \min\{b_0 + b_y E[y_{it}], \cdot\}$
intercept ($b_0$)    8.632    (7.256)
effect of biannual income ($b_y$)    0.702    (0.004)

Migration cost: $C(\Omega_{it}) = mc_0 + g_C(age_{it}) + coyote$
intercept ($mc_0$)    4952.948    (84.604)
effect of age at age ≤ 30    12.906    (2.251)
effect of age at 30 < age ≤ 50    57.665    (1.965)
effect of age at age > 50    110.577    (0.013)
coyote    775.357    (27.993)

Model parameters characterising preferences, access to credit and migration costs estimated by simulated minimum distance estimation based on 20,000 simulated individuals × 50 years × 2 seasons. See Section 4.2.2 for details on the computation of standard errors.

Borrowing limit. Apart from a debt limit that becomes tighter towards the end of life and ensures repayment, households face a constraint to the maximum amount of debt they can hold that is related to their current predicted income (see Appendix A for more detail).
This part of the constraint, which is specified as a linear function of predicted income, is estimated to be the binding constraint in most cases. Given the low estimate of the intercept $b_0$, the estimate of $b_y$ implies that households have credit access to about 70 percent of their half-yearly or 35 percent of their annual income.

Migration costs. The monetary cost of migration depends on age and may be different for household heads and other family members, with an extra cost for border crossings without a U.S. permit. The cost of migration is estimated, for instance, to equal 5,340 USD for 30-year-old household heads with a U.S. visa, and an additional 775 USD for undocumented immigrants. This estimate of the cost for a "coyote" turns out to be very close to the mean of 774 USD reported by former undocumented migrants in my MMP sample. This is particularly reassuring given that I do not use this information in the structural estimation. My estimates compare to 8,900 USD for undocumented migrants estimated by Thom (2010) for the year 2000 using a model where other family members' migration is not explicitly considered; the estimate of 8,900 USD potentially includes migration costs for other family members. For a sample of mostly, but not exclusively, undocumented migrants, Nakajima (2015) estimates the cost of migration per person to be 10,900 USD, which is assumed to be equal for wives joining their husbands in the U.S. Note that the models used vary across several margins, some of which suggest an overestimation and others an underestimation of the cost of migration. The assumption of zero access to credit tends to reduce migration cost estimates, while for instance abstracting from seasonality in labor demand increases this estimate as long as this seasonality is stronger in the sectors employing Mexicans in the U.S. than employment seasonality in Mexico.
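Several figures quoted in this subsection can be traced back to the point estimates in Table 4. The Python sketch below recomputes three of them: the mean relative preference for the U.S. in each sample (using the type weights from footnote 23), the debt limit implied by the linear part of the borrowing constraint, and the migration cost of a 30-year-old. It rests on two assumptions that the text does not state explicitly: that for ages up to 30 the age effect enters as a slope multiplied by age (which reproduces the cost quoted above for a 30-year-old visa holder), and that only the linear part of the borrowing limit binds. All function and variable names are illustrative.

```python
# Point estimates from Table 4.
PI_US = [0.640, 0.721, 0.956, 1.315]   # preference for the U.S., types 1 to 4
B0, BY = 8.632, 0.702                  # borrowing limit: intercept, income slope
MC0, AGE_SLOPE_LE30, COYOTE = 4952.948, 12.906, 775.357

# Type weights reported in footnote 23.
WEIGHTS = {
    "representative (MxFLS, SIPP)": [0.167, 0.101, 0.487, 0.245],
    "MMP": [0.000, 0.236, 0.279, 0.485],
    "PROGRESA": [0.082, 0.714, 0.205, 0.000],
}


def mean_preference(weights):
    """Weighted mean of the type-specific preference for the U.S."""
    return sum(w * p for w, p in zip(weights, PI_US))


def linear_debt_limit(expected_biannual_income):
    """Linear part of the borrowing limit, b0 + by * E[y] (assumed to bind)."""
    return B0 + BY * expected_biannual_income


def migration_cost(age, undocumented):
    """Cost for ages up to 30, assuming the age effect is slope times age."""
    if age > 30:
        raise ValueError("this sketch only covers ages up to 30")
    cost = MC0 + AGE_SLOPE_LE30 * age
    return cost + COYOTE if undocumented else cost


for sample, w in WEIGHTS.items():
    print(sample, round(mean_preference(w), 2))
print(round(migration_cost(30, undocumented=False)))
print(round(migration_cost(30, undocumented=True)))
```

The three means come out at 0.97, 1.07 and 0.76, matching the values reported in the text and in footnote 23.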
Similarly, additional costs for family migration are, if not modeled explicitly, picked up by this parameter. 5.3 Policy Simulation Higher incomes at origin. I use the estimated model to analyze the effect of higher origin country incomes on migration dynamics. While identification of short-run net effects of income on emigration can be achieved by reduced form estimations if an exogenous variation in income can be exploited, an investigation of subsequent choices, for example whether and when to return, requires a more structural approach. Given that many Mexican migrants stay in the U.S. only temporarily, it is a priori unclear whether a positive effect on the propensity to emigrate implies a one-to-one increase in the stock of migrants residing in the U.S. Furthermore, the selection of both immigrants and return migrants is likely to respond to changes in economic conditions at the origin. I thus use the model to investigate how flexibly migrants respond to economic outcomes, and whether policies that relax financial constraints in the country of origin induce changes in migration durations, repeated migrations undertaken and the selection in terms of labor market outcomes. As discussed earlier, a rise in household income has at least two counteracting effects on the propensity of individuals to emigrate: While staying becomes relatively more attractive as incomes and thus the opportunity cost of migration rise, a higher income may help to overcome binding liquidity constraints and facilitate emigration to a still higher paying destination, or one where an individual desires to move to for non-economic reasons. Existing evidence points toward a positive net effect in many contexts (e.g. Bazzi, 2014; Angelucci, 2015). Beyond this extensive margin, however, an increase in income also affects migration on the intensive margin of migration duration, as well as the propensity of individuals to move back and forth repeatedly, and the selection of those who return. 
Figure 1 illustrated that repeat migration is a common phenomenon between Mexico and the U.S. In the model used here, repeated migrations are driven by changes in any of the time varying state variables. For instance, an immigrant in the U.S. who 28 has accumulated sufficient savings may find it worthwhile—given expectations about employment and other outcomes—to return and enjoy a higher utility from consumption in Mexico where other family members live. If later on that returnee loses a job and reemployment probabilities in Mexico are relatively low, a re-migration may be the optimal choice. Similarly, shocks to preferences, wages, family or legal status may trigger repeated migrations. As with the first migration undertaken, however, individuals are prevented from re-migrating if their cash on hand plus the debt limit does not cover the monetary cost of a migration. An increase in sending country incomes makes migrations more affordable. Hence, apart from increasing the appeal of spending time in the origin country, such a change enhances the capacity of individuals to adjust to changing personal and economic conditions, including employment opportunities. For instance, immigrants in the U.S. losing a job will be less reluctant to return to Mexico, knowing that a re-migration the following spring, when more jobs will be on offer, is affordable. Figure 8, which shows the distribution of the number of migrations, illustrates this for an increase in Mexican incomes by 10 and 20 percent, and separately for (a) all migrations under each of the respective scenarios, and (b) for only those individuals who migrate at least once under either regime, hence eliminating compositional changes by looking at the same group of individuals throughout. 
At baseline, without any transfers, the average number of migrations throughout an individual’s working life and conditional on having ever migrated is about 1.5.24 Figure 8: Effect of higher origin country incomes on repeated migrations. Number of migrations under different income levels in Mexico, considering (a) all individuals with at least one migration under the respective scenario, and (b) individuals with at least one migration in either of the cases considered. Model prediction based on 20,000 simulated individuals. An increase in incomes by 10 percent shifts the distribution of the number of trips outward, increasing the average number of migrations by about 3.9 percent. Part of the behavioral change among individuals who would have migrated under both scenarios 24 Note that this does not correspond to the distribution displayed in Figure 1a, which is drawn from the non-representative MMP data, and is conditional on migrants having returned. 29 is obscured by compositional changes as additional individuals can afford to migrate and others may prefer to stay in Mexico if incomes are higher. To abstract from these selection issues, the right-hand graph shows the change in the distribution of the number of migrations only for those who are predicted to migrate at least once in either case. The purely behavioral change in response to a 10 percent increase in incomes in Mexico is predicted to lead to an average increase in the number of trips per migrant of 6.1 percent. This shows that rather than by compositional changes, the effect is driven by the response of individuals who would have migrated even in the absence of the program (Figure 8b), while compositional changes partly offset this shift in the distribution (Figure 8a). A closer look by income groups reveals that this increase in repeat migration is driven by migrants in the middle of the income distribution, of whom many are on the margin of being able to finance their desired migrations. 
Immigrants in the second tercile of the income distribution raise their number of trips by 12.4 percent. Lower-income migrants, who often are still constrained even under 10 percent higher incomes, and higher-income migrants, who are less restricted by financial constraints to start with, each only migrate 1.7 percent more frequently. A rise in expected incomes at origin not only raises the opportunity cost of staying abroad in terms of origin income forgone, but the value of being in the country of origin is boosted further because individuals know that the future option of emigrating will be more easily affordable if desired. Both these channels are likely to shorten migration durations among immigrants in the U.S. From a host country’s perspective this is important, given that a stronger concern to policy makers than the number of migrations undertaken usually is the stock of immigrants residing in the country. The left-hand graph of Figure 9 shows the survival rates of immigrants in the U.S., that is, the fraction of initial arrivals left in the country by years since immigration. Whereas the solid curve represents the survival rate at baseline, the dashed profile shows the reduction in migration durations when incomes in Mexico increase by 10 percent, which shortens the average length of completed migrations by one year, or 19 percent.25 Again, the figure shows the composite effect of a behavioral change among individuals who are predicted to migrate in either case and of compositional changes. Figure 9b shows the change in survival rates only for those who are predicted to migrate under either scenario, with the effect on migration durations becoming still stronger. 
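The survival rates plotted in Figure 9 are straightforward to compute from completed migration durations. The following minimal sketch uses made-up durations rather than model output, and the function name is hypothetical; it only illustrates the definition of the survival rate as the fraction of an arrival cohort still abroad after a given number of years.

```python
def survival_rates(durations, horizon):
    """Fraction of an arrival cohort still abroad after 0, 1, ..., horizon years.

    A migrant with completed duration d (in years) counts as present at year t
    if d > t.
    """
    n = len(durations)
    return [sum(1 for d in durations if d > t) / n for t in range(horizon + 1)]


# Invented completed durations in years for two scenarios (illustration only).
baseline = [0.5, 1.0, 2.0, 3.5, 6.0, 8.0, 12.0, 20.0]
higher_income = [0.5, 0.5, 1.5, 2.5, 4.0, 6.0, 9.0, 20.0]

s_base = survival_rates(baseline, 10)
s_high = survival_rates(higher_income, 10)
for t in (1, 5, 10):
    print(t, s_base[t], s_high[t])
```

Restricting both lists to the same set of individuals before calling the function corresponds to the comparison in panel (b), which removes compositional changes.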
The reason for the lower survival rates (and the seemingly smaller effect of higher incomes in Mexico) for the full migrant population at baseline (Figure 9a) is that it includes some migrants with a relatively strong preference for Mexico who under low Mexican incomes tend to stay only for a short period in the U.S., while preferring not to migrate at all when incomes in Mexico are higher.

25 The overall effect on the Mexican population residing in the U.S. at a given point in time is negative (-4.8 percent), though I show at the end of this section that in line with other studies the effect on the flow of immigrants is positive for low-income individuals.

Figure 9: Fraction left in the U.S. Survival rates in the U.S. under different income levels in Mexico of (a) all Mexican immigrants, (b) Mexicans who would have migrated under both income levels. Model prediction based on 20,000 simulated individuals.

With many migrants returning, a relevant question from a host country's perspective is how responsive out-migration decisions by immigrants are to economic outcomes, and hence how those who stay at the destination are selected. While policymakers will be less interested in location choices that are driven merely by individual preferences, there may be an interest in how immigrants react to labor market shocks, and in particular whether immigrants stay in the country during unemployment spells. The responsiveness of migration decisions to employment at destination is of particular relevance in the Mexico-U.S. context, where a large proportion of Mexicans work in the U.S. agricultural and construction industries, in which labor demand varies strongly across season and job offers are distributed very unevenly over the year. Figure 10 shows the annual fraction of immigrants that the model predicts to return the following period by employment status. The figure suggests that at baseline the probability to return is higher for unemployed immigrants.
The model further predicts that an increase in incomes not only shortens migration duration and raises return migration rates, but it does so much more among immigrants who have lost their job in the host country. This is because the higher wealth level helps immigrants to better respond to economic shocks, and prevents unemployed immigrants from staying in a host country merely on the basis that they will be less likely to re-migrate if desired in the future. The other important dimension of migrant selection is individual productivity. A large part of the early microeconometric migration literature has been concerned with the estimation of immigrants’ earning profiles (Chiswick, 1978; Long, 1980; Borjas, 1985), and a number of later studies explicitly consider contexts where some immigrants stay only temporarily (Hu, 2000; Lubotsky, 2007; see Dustmann and Görlach, 2015, for a critical 31 Figure 10: Effect of higher origin country incomes on return decisions of unemployed immigrants. Annual return migration rates of Mexican immigrants in the U.S. by employment status under different income levels in Mexico. Model prediction based on 20,000 simulated individuals. review of this literature). The long-standing interest in this topic can be explained by the various concerns about immigration, such as the fiscal contribution of immigrants or increased labor market competition with certain groups of native-born workers, an analysis of which requires an understanding of immigrant career paths. The above model allows an evaluation of both Mexican immigrant and out-migrant selection to and from the U.S. in terms of individual productivity when migration from Mexico is constrained by monetary costs. Similar to the increased responsiveness of migration to employment outcomes in the U.S., higher incomes in Mexico make migrants react more strongly to the wage payoff from being in the U.S., but also affects various types of immigrants in different ways. 
Under higher origin incomes, immigration increases most strongly among the least productive group of workers, depressing the average of immigrant wages at arrival. However, as many of these migrants cannot afford to bring their families with them, they suffer a disproportional utility loss from residing in the U.S., so return migration rates are high for this group of workers. In addition, the length of stay of these low-skilled migrants is most strongly affected by a change in origin incomes, so return migration responds strongly to a rise in the opportunity cost of staying abroad. Immigration also increases among workers at the middle and the upper part of the skill distribution. These immigrants are more likely to bring their family with them and become longer-term immigrants. Hence, the U.S. tends to retain immigrants with above average productivity. This implies that those of an initial arrival cohort staying at the destination become increasingly positively selected on productivity as time passes. Figure 11 displays immigrant earnings profiles in the U.S. by time since the last immigration for different origin 32 country incomes. The graph suggests that 10 percent higher incomes in Mexico lead to about 6 percent lower average earnings at arrival, while negatively selected outmigration reverses this gap after about two years.26 The net effect of lower initial average earnings and higher average earnings by long-term immigrants is ambiguous. The last row in Table 5 shows mean log annual earnings at baseline as well as under 10 percent higher Mexican incomes, indicating that the overall effect on the selection of the immigrant population residing in the U.S. is positive. The model predicts that 10 percent higher incomes in Mexico lead to an about 6 percent higher average in immigrant earnings . Figure 11: Effect of higher origin country incomes on migrant selection in terms of earnings. Mean annual earnings of Mexican immigrants in the U.S. 
by time since last immigration under different income levels in Mexico. Model prediction based on 20,000 simulated individuals.

The effect of a 10 percent rise in household incomes in Mexico is summarized in Table 5, which lists outcomes at the baseline, as well as under the higher income level. The first two rows show the change in repeat migration and migration duration, corresponding to the patterns illustrated in Figures 8 and 9. In order to disentangle the negative effect on emigration due to a rise in the opportunity cost of migration when incomes at home are higher, and the positive effect due to a relaxation of financial constraints, rows 3 and 4 of Table 5 contrast the fall in the fraction of Mexicans who would prefer to emigrate to the U.S. in any given year, and the reduction in the share among them who are prevented from actually migrating because their assets, income and maximum debt level are too low to cover the cost of migration. The resulting annual emigration rates are shown in row 5. It is only this net effect of higher income at the origin on emigration that is identified in reduced form estimations. The model thus suggests an income elasticity of emigration of approximately 0.6 (≈ (1.53 − 1.44)/1.44 · 10). I relate this to comparable reduced form estimates in more detail at the end of this section.

26 Note that neither the "baseline" nor the counterfactual profiles show the returns to time spent in the U.S., but that the comparably steep slope is partially driven by negatively selected out-migrants under all three scenarios.

Table 5: Effect of higher incomes in the origin country.

Outcome    baseline    10% higher origin incomes
(1) Mean number of trips per migrant    1.504    1.562
(2) Mean migration duration∗ (in years)    6.390    5.249
(3) Fraction that would like to emigrate∗∗    13.40%    11.92%
    in summer    10.33%    9.03%
    in winter    9.81%    8.57%
(4) Fraction constrained∗∗    89.28%    87.18%
(5) Fraction actually emigrating∗∗    1.44%    1.53%
(6) Mean log ann. earnings in the U.S. (in log USD)    9.626    9.685

Effects predicted by the model, based on 20,000 simulated individuals. ∗ of completed migrations. ∗∗ annual fractions: shares of those who want to emigrate/are constrained (among those who want to emigrate)/actually emigrate at least once per year.

Motivated by the observation that more than 40 percent of former migrants surveyed by the Mexican Migration Project report having worked in the U.S. agricultural or construction sectors, the model takes into account seasonality in aggregate labor demand. Since the values of being in Mexico or in the U.S. are non-linear functions of income, seasonality in choices does not average out to choices that would be predicted by a model which abstracts from seasonality. Hence, this is an important aspect in modeling migration decisions which so far has been absent from the structural migration literature. The seasonality in (expected) employment in the U.S. translates into a seasonality in migrations, in line with the indirect evidence on apprehensions along the U.S. southern border in Figure 4b. This is because prospective migrants anticipate employment probabilities in the U.S., so that the desire to emigrate is higher during summer months.

The previous analysis has focused on migrant selection and the responsiveness to employment shocks which, given the sectors many Mexican immigrants work in, is likely of concern to U.S. policymakers. In light of the temporary nature of many migrations and the better capacity to accumulate savings in the U.S., an interesting question from a Mexican perspective is whether a cash transfer program that raises incomes in Mexico can raise domestic demand over and above the aggregate transfer amount paid by facilitating migrations that would not have taken place otherwise.
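Returning briefly to Table 5, the percentage changes and the elasticity quoted in the text follow directly from its two columns. The sketch below recomputes three of them from the reported values; the dictionary keys and function name are illustrative labels, not notation from the model.

```python
import math

# (baseline, 10% higher origin incomes), as reported in Table 5.
TABLE5 = {
    "trips_per_migrant": (1.504, 1.562),
    "emigration_rate_pct": (1.44, 1.53),
    "mean_log_us_earnings": (9.626, 9.685),
}
INCOME_CHANGE_PCT = 10.0


def pct_change(base, new):
    """Percentage change from base to new."""
    return 100.0 * (new - base) / base


# Change in the mean number of trips per migrant, in percent.
trips_change = pct_change(*TABLE5["trips_per_migrant"])

# Income elasticity of emigration: percent change in the emigration rate
# divided by the percent change in origin incomes.
elasticity = pct_change(*TABLE5["emigration_rate_pct"]) / INCOME_CHANGE_PCT

# A difference in mean log earnings converted to a percent change.
log_base, log_new = TABLE5["mean_log_us_earnings"]
earnings_change = 100.0 * (math.exp(log_new - log_base) - 1.0)

print(round(trips_change, 1), round(elasticity, 3), round(earnings_change, 1))
```

The results are close to the 3.9 percent increase in trips, the elasticity of about 0.6 and the roughly 6 percent rise in average immigrant earnings discussed in this section.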
For simplicity I simulate a scheme that pays out an annual lump sum to all households whose head resides in Mexico. While some migrants induced by such a program may stay for very long or even permanently and consume most of their wealth in the U.S.—including assets that have been accumulated in Mexico prior to migration—others may return with a stock of assets larger than what they owned before emigrating. To evaluate the effect on aggregate expenditure in Mexico, 34 Figure 12 compares average discounted cumulative transfer amounts paid under different schemes to the rise in average discounted cumulative consumption by individuals residing in Mexico, where each period’s consumption is calculated as cash on hand minus next period’s (discounted) assets and migration costs in case an individual emigrates. The difference between bars for each policy shows that repatriated savings, indeed, make up easily for forgone domestic consumption by long-term emigrants. Figure 12: Effect of income subsidies on expenditure in Mexico. Discounted cumulative cost of a cash transfer program and discounted cumulative consumption in Mexico. Model prediction based on 20,000 simulated individuals. Access to credit. A relaxation of debt constraints has similar effects to an increase in Mexican incomes: prospective migrants are less likely to be wealth constrained, and— anticipating this—immigrants in the U.S. are also more flexible in their location choice. On the other hand, staying in Mexico becomes more attractive as better access to credit makes negative labor market shocks less painful. The model suggests that in 89 percent of cases, Mexicans who would choose to move to the U.S. are prevented from doing so because they cannot cover migration costs. 
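The affordability condition behind this share is that cash on hand plus the available credit must cover the monetary cost of migration. As a rough sketch, with invented cash holdings and a single common migration cost (both hypothetical, not model output), the share of constrained would-be migrants under a uniform debt limit can be computed as follows.

```python
def share_constrained(cash_on_hand, migration_cost, debt_limit):
    """Share of would-be migrants whose cash plus credit falls short of the cost."""
    constrained = [c for c in cash_on_hand if c + debt_limit < migration_cost]
    return len(constrained) / len(cash_on_hand)


# Invented cash on hand (USD) of individuals who would like to migrate.
cash = [200, 800, 1500, 3000, 4200, 4800, 5200, 9000]
cost = 5340  # illustrative common migration cost in USD

shares = {limit: share_constrained(cash, cost, limit)
          for limit in (0, 500, 1000, 1500)}
for limit, share in shares.items():
    print(limit, share)
```

Raising the uniform limit step by step traces out a declining curve of the kind a counterfactual credit experiment would produce; in the estimated model the limit instead varies with age and predicted income.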
To illustrate the effect of access to credit on the prevalence of binding wealth constraints, Figure 13 displays the share of would-be-migrants in a given period who are constrained under different (counterfactual) limits to the maximum amount of loans they can access to finance a migration.27 The mean debt limit predicted by the model is 631 USD, and the figure shows that starting from current levels, better access to credit has a modest effect on the fraction of potential migrants who can actually afford to migrate. The simulation suggests a potentially strong effect of a development of capital markets in Mexico on emigration. For instance, an increase in the debt limit from 1,000 to 1,500 USD (PPP adj.) is predicted to reduce the fraction of would-be-migrants who are prevented from migrating due to borrowing constraints by about 7 percent.

27 In contrast to the estimated model, where debt limits vary by age and potential income, this counterfactual simulation for simplicity assumes equal access to credit for all.

Figure 13: Access to credit for financing migrations. Fraction of constrained Mexicans among those who desire to migrate to the U.S. by the amount of migration costs that can be covered by credit. In contrast to the estimated model, this counterfactual simulation assumes equal access to credit irrespective of individuals' earnings potential. Model prediction based on 20,000 simulated individuals.

Emigration. The focus of this paper is on choices taken after an initial emigration has taken place, the analysis of which requires a dynamic model of the kind I use. Nevertheless, this estimated model also makes predictions about the rate of emigration from Mexico to the U.S., which other studies have looked at. The most directly comparable estimates are those by Angelucci (2015), who evaluates the average treatment effect of PROGRESA on the propensity to emigrate in a linear probability model.
As the treatment effect on emigration is not used in the estimation of the structural model in Section 2, a comparison to her estimate may serve as an additional credibility check of the model. To match Angelucci’s estimation as closely as possible, I draw from the simulated sample individuals with dependent family from the age distribution of eligible households’ heads in the PROGRESA evaluation data and use the estimated unobserved heterogeneity weights for this sample (see Section 4.2). I further restrict attention to under 40-year-olds in line with her sample restriction. Figure 14 shows the percentage point increase in emigration by age group that is induced by an increase in incomes corresponding to the PROGRESA transfers. For comparison, the figure also shows the 36 corresponding estimate and 95 percent confidence interval reported by Angelucci. On average, the effect predicted by my model is well within her confidence bounds. Figure 14: Comparison to effect of PROGRESA on emigration to the U.S. estimated by Angelucci (2015). Percentage point change in emigration rates due to a 600 USD (PPP adjusted) increase in incomes in Mexico, corresponding to the average amount received by households eligible for PROGRESA transfer, by heads of eligible households by age group. Model prediction based on 20,000 simulated individuals. The differentiation by age group reveals both a strong variation and an interesting non-monotonicity over the life cycle. Emigration is particularly attractive for younger workers: the difference in employment probabilities between the U.S. and Mexico is largest among young workers and among workers close to retirement. At the same time, migration costs are estimated to be lowest for the youngest. Taken together, the desire to emigrate is especially strong among younger Mexicans, who however are also likely to be wealth constrained, as they have not yet accumulated a large stock of assets and often lack access to credit. 
In particular among the over-proportionally poor households eligible for PROGRESA, the transfers are often still not sufficient for young household heads to overcome these constraints. Thus, while the desire to emigrate may be large for young Mexicans, the effect of the transfers on emigration is in fact strongest for individuals in their early 30s, when a larger mass of prospective migrants is on the margin of being able to afford a migration. For older Mexicans, on the other hand, attachment to the origin and higher migration costs are more likely to discourage a migration irrespective of transfer receipts. 37 6 Conclusion Understanding return and repeat migration, and the selection they involve is essential for an assessment of immigration. One important determinant of migration is the income level in a migrant’s country of origin, which affects both the desire and the ability of individuals to migrate. Importantly, a change in sending country incomes not only has a short-term effect on the probability to emigrate, but also affects the more dynamic aspects of migration, like the duration of stay at a destination, the propensity to migrate back and forth repeatedly, and the selection of both return and re-migrants. Any of these effects depends on the prevalence of financial constraints and whether individuals can borrow in order to finance a migration. This is the first paper to evaluate the effect of a change in incomes in a migrant sending country on return and repeat migration under financial constraints. I further relate the selection in return migration to economic condition in the migrant’s country of origin taking into account migration dynamics that may lead to multiple migration spells. The explicit consideration of a migrant’s ability to cover part of the monetary cost of migration by loans is an important contribution to the empirical migration literature, which so far has estimated migration costs under the assumption of zero borrowing. 
To identify the model’s parameters, I use a combination of panel and cross-sectional datasets that provide individual and household level information in Mexico and in the U.S., including randomized variation in income induced by a policy experiment. This exogenous variation allows the identification of income dependent borrowing limits in a context where the preference for migration, and thus the demand for credit, may be related to an individual’s earnings capacity. My results suggest that access to credit, and hence the ability to migrate, is indeed strongly related to household income. One prediction of the model is that the number of trips per migrant increases as financial constraints at the origin are relaxed and migrants are better able to respond to economic shocks. This contrasts with the prediction of models that abstract from asset accumulation and financial constraints. Alternative explanations for a rise in repeat migration include initial information constraints about earning opportunities at the destination that are reduced after a first migration has been undertaken. This is the mechanism suggested in the recent paper by Bryan et al. (2014), who find a higher probability of consecutive rural-urban migrations in Bangladesh after the cost of an initial trip has been covered. While I consider a relaxation of financial constraints a plausible explanation in the Mexico-U.S. context, where income differentials are known to be large, a definitive separation of the two mechanisms is an interesting avenue for future research. Also in contrast to models abstracting from financial constraints, the prediction by the model estimated in this paper that the overall emigration rate rises with origin incomes is well in line with previous reduced-form estimations. The economic literature on temporary migration has largely focused on the effects of economic outcomes in the host country on the decision to return.
This paper suggests that, besides host country outcomes, economic conditions in a migrant’s country of origin may have to be taken more strongly into account in future analyses of migrant behavior. Furthermore, my finding that the selection of immigrants staying in the U.S. for longer becomes more positive as origin country incomes increase implies that events in migrant sending countries also play an important role for the impact of immigration on the host country population. In particular, policies that raise incomes in the origin country are likely to translate into a higher average fiscal contribution to the host economy by an increasingly positively selected population of long-term immigrants. This is in stark contrast with simpler static models, where, in the presence of financial constraints, a rise in origin incomes unambiguously leads to a more negatively selected immigrant population. From a sending country’s perspective, my results predict that such higher incomes may well lead to a more than proportional increase in domestic consumption expenditure due to repatriated savings of new emigrants who are likely to leave only temporarily.

References

Adda, J. and Eaton, J. (1998). Borrowing with Unobserved Liquidity Constraints: Structural Estimation with an Application to Sovereign Debt. Unpublished Manuscript.
Adda, J., McConnell, B., and Rasul, I. (2014). Crime and the Depenalization of Cannabis Possession: Evidence from a Policing Experiment. Journal of Political Economy, 122(6):1379–1381.
Angelucci, M. (2015). Migration and Financial Constraints: Evidence from Mexico. Review of Economics and Statistics, 97(1):224–228.
Angrist, J. D. and Krueger, A. B. (1992). The Effect of Age at School Entry on Educational Attainment: An Application of Instrumental Variables with Moments from Two Samples. Journal of the American Statistical Association, 87(418):328–336.
Arellano, M. and Meghir, C. (1992).
Female Labour Supply and On-the-Job Search: An Empirical Model Estimated Using Complementary Data Sets. Review of Economic Studies, 59(3):537–559.
Attanasio, O. P., Meghir, C., and Santiago, A. (2012). Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA. Review of Economic Studies, 79(1):37–66.
Banerjee, A. and Munshi, K. (2004). How Efficiently is Capital Allocated? Evidence from the Knitted Garment Industry in Tirupur. The Review of Economic Studies, 71(1):19–42.
Bazzi, S. (2014). Wealth Heterogeneity and the Income Elasticity of Migration. Unpublished manuscript.
Behrman, J. R. and Todd, P. E. (1999). Randomness in the Experimental Samples of PROGRESA (Education, Health, and Nutrition Program). International Food Policy Research Institute, Washington, DC.
Belot, M. V. K. and Hatton, T. J. (2012). Immigrant Selection in the OECD. Scandinavian Journal of Economics, 114(4):1105–1128.
Borjas, G. J. (1985). Assimilation, Changes in Cohort Quality, and the Earnings of Immigrants. Journal of Labor Economics, 3(4):463–489.
Bryan, G., Chowdhury, S., and Mobarak, A. M. (2014). Underinvestment in a Profitable Technology: The Case of Seasonal Migration in Bangladesh. Econometrica, 82(5):1671–1748.
Chiquiar, D. and Hanson, G. H. (2005). International Migration, Self-Selection, and the Distribution of Wages: Evidence from Mexico and the United States. Journal of Political Economy, 113(2):239–281.
Chiswick, B. R. (1978). The Effect of Americanization on the Earnings of Foreign-born Men. Journal of Political Economy, 86(5):897–921.
Clemens, M. A. (2014). Does Development Reduce Migration? In Lucas, R. E., editor, International Handbook on Migration and Economic Development, chapter 6, pages 152–185. Edward Elgar Publishing, London.
Colussi, A. (2003). Migrants’ Networks: An Estimable Model of Illegal Mexican Immigration. Job Market Paper, University of Pennsylvania.
Deléchat, C. (2001).
International Migration Dynamics: The Role of Experience and Social Networks. Labour, 15(3):457–486.
Dustmann, C. (2003). Return migration, wage differentials, and the optimal migration duration. European Economic Review, 47(2):353–369.
Dustmann, C. and Görlach, J.-S. (2015). Selective Out-migration and the Estimation of Immigrant Earnings Profiles. In Chiswick, B. R. and Miller, P. W., editors, Handbook of the Economics of International Migration 1A, chapter 10, pages 489–533. North Holland.
Dustmann, C. and Görlach, J.-S. (2016). The Economics of Temporary Migrations. Journal of Economic Literature, forthcoming.
Dustmann, C. and Okatenko, A. (2014). Out-migration, wealth constraints, and the quality of local amenities. Journal of Development Economics, 110:52–63.
Fernández-Huertas Moraga, J. (2011). New Evidence on Emigrant Selection. Review of Economics and Statistics, 93(1):72–96.
Gathmann, C. (2008). Effects of enforcement on illegal markets: Evidence from migrant smuggling along the southwestern border. Journal of Public Economics, 92(10):1926–1941.
Gourieroux, C., Monfort, A., and Renault, E. (1993). Indirect Inference. Journal of Applied Econometrics, 8(Supplement: Special Issue on Econometric Inference Using Simulation Techniques):S85–S118.
Grogger, J. and Hanson, G. H. (2011). Income Maximization and the Selection and Sorting of International Migrants. Journal of Development Economics, 95(1):42–57.
Hanson, G. H. (2006). Illegal Migration from Mexico to the United States. Journal of Economic Literature, 44(4):869–924.
Hu, W.-Y. (2000). Immigrant Earnings Assimilation: Estimates from Longitudinal Data. American Economic Review, 90(2):368–372.
Imbert, C. and Papp, J. (2015). Labor Market Effects of Social Programs: Evidence from India’s Employment Guarantee. American Economic Journal: Applied Economics, 7(2):233–263.
Kaboski, J. P. and Townsend, R. M. (2011). A Structural Evaluation of a Large-Scale Quasi-Experimental Microfinance Initiative.
Econometrica, 79(5):1357–1406.
Lacuesta, A. (2010). A Revision of the Self-selection of Migrants Using Returning Migrant’s Earnings. Annals of Economics and Statistics, (97/98):235–259.
Laszlo, S. and Santor, E. (2009). Migration, Social Networks, and Credit: Empirical Evidence from Peru. The Developing Economies, 47(4):383–409.
Lessem, R. (2013). Mexico-U.S. Immigration: Effects of Wages and Border Enforcement. Unpublished Manuscript.
Lise, J., Seitz, S., and Smith, J. (2005). Equilibrium Policy Experiments and the Evaluation of Social Programs. Unpublished Manuscript.
Long, J. E. (1980). The Effect of Americanization on Earnings: Some Evidence for Women. Journal of Political Economy, 88(3):620–629.
Lubotsky, D. (2007). Chutes or Ladders? A Longitudinal Analysis of Immigrant Earnings. Journal of Political Economy, 115(5):820–867.
McKenzie, D. and Rapoport, H. (2007). Network effects and the dynamics of migration and inequality: Theory and evidence from Mexico. Journal of Development Economics, 84(1):1–24.
McKenzie, D. and Rapoport, H. (2010). Self-Selection Patterns in Mexico-US Migration: The Role of Migration Networks. Review of Economics and Statistics, 92(4):811–821.
Munshi, K. and Rosenzweig, M. (2016). Networks and Misallocation: Insurance, Migration, and the Rural-Urban Wage Gap. American Economic Review, 106(1):46–98.
Nakajima, K. (2014). The Repeated Migration Puzzle: A New Explanation. Unpublished Manuscript.
Nakajima, K. (2015). The Fiscal Impact of Border Tightening. Job Market Paper, University of Wisconsin-Madison.
OECD (2007). Pensions at a Glance 2007: Public Policies across OECD Countries. OECD Publishing, Paris.
Orrenius, P. M. and Zavodny, M. (2005). Self-selection among undocumented immigrants from Mexico. Journal of Development Economics, 78(1):215–240.
Passel, J. S. (2005). Estimates of the Size and Characteristics of the Undocumented Population. Report, Pew Hispanic Center, Washington, DC, March 21.
Programa de Educación, Salud y Alimentación (2012).
Mexico, Evaluation of PROGRESA. http://hdl.handle.net/1902.1/18235. Harvard Dataverse, V1. Accessed: 31.03.2015.
Reinhold, S. and Thom, K. (2013). Migration Experience and Earnings in the Mexican Labor Market. Journal of Human Resources, 48(3):768–820.
Rendon, S. and Cuecuecha, A. (2010). International Job Search: Mexicans in and out of the US. Review of Economics of the Household, 8(1):53–82.
Ridder, G. and Moffitt, R. (2007). The Econometrics of Data Combination. In Heckman, J. J. and Leamer, E. E., editors, Handbook of Econometrics, volume 6B, chapter 75, pages 5469–5547. North Holland, Amsterdam.
Rubalcava, L. and Teruel, G. (2006). Conditional Public Transfers and Living Arrangements in Rural Mexico. California Center for Population Research, 006-06.
Rust, J. (1987). Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher. Econometrica, 55(5):999–1033.
Sachs, J., Tornell, A., and Velasco, A. (1996). The Collapse of the Mexican Peso: What Have We Learned? Economic Policy, 22 (April):13–64.
Schultz, T. P. (2004). School subsidies for the poor: evaluating the Mexican Progresa poverty program. Journal of Development Economics, 74(1):199–250.
Singleton, K. J. (2006). Empirical Dynamic Asset Pricing: Model Specification and Econometric Assessment. Princeton University Press.
Skoufias, E., Davis, B., and Behrman, J. (1999). An Evaluation of the Selection of Beneficiary Households in the Education, Health, and Nutrition Program (PROGRESA) of Mexico. International Food Policy Research Institute, Washington, DC.
Stecklov, G., Winters, P., Stampini, M., and Davis, B. (2005). Do Conditional Cash Transfers Influence Migration? A Study Using Experimental Data from the Mexican PROGRESA Program. Demography, 42(4):769–790.
Thom, K. (2010). Repeated Circular Migration: Theory and Evidence from Undocumented Migrants. Mimeo, New York University.
Todd, P. E. and Wolpin, K. I. (2006).
Assessing the Impact of a School Subsidy Program in Mexico: Using a Social Experiment to Validate a Dynamic Behavioral Model of Child Schooling and Fertility. American Economic Review, 96(5):1384–1417.
U.S. Census Bureau (2012). The Foreign-Born Population in the United States: 2010. American Community Survey Reports.
World Bank (2015). World Development Indicators. Washington, D.C.

APPENDIX

A Model Specification

This appendix details the specifications of model components from Section 2, including the values of parameters that are calibrated using outside information.

Functional form specifications. The probability of gaining dependent family is assumed to be given by

$$p_{f=1}(\Omega_{it} \mid f_{it-1} = 0) = \Phi\left(\psi_0^{f+} + g_{f=1}(a_{it})\right),$$

where $\Phi(\cdot)$ denotes the standard normal cumulative distribution function, and $g_{f=1}(a_{it})$ is a piecewise linear function of age with nodes at 30 and 50 years, and slopes $\psi_{a \le 30}^{f+}$, $\psi_{30 < a \le 50}^{f+}$ and $\psi_{a > 50}^{f+}$. Similarly, the probability of losing dependent family is given by another transformed piecewise linear function of age,

$$p_{f=0}(\Omega_{it} \mid f_{it-1} = 1) = \Phi\left(\psi_0^{f-} + g_{f=0}(a_{it})\right),$$

where $g_{f=0}(a_{it})$ again has nodes at 30 and 50 years, and slopes $\psi_{a \le 30}^{f-}$, $\psi_{30 < a \le 50}^{f-}$ and $\psi_{a > 50}^{f-}$. The probabilities of obtaining or losing a U.S. legal permit are assumed to be

$$p_{\delta=1}(\Omega_{it} \mid \delta_{it-1} = 0) = \Phi\left(\psi_0^{\delta+} + g_{\delta=1}(a_{it}) + \psi_e^{\delta 1} e_{it}\right)$$

and

$$p_{\delta=0}(\Omega_{it} \mid \delta_{it-1} = 1) = \Phi\left(\psi_0^{\delta-} + g_{\delta=0}(a_{it}) + \psi_e^{\delta 0} e_{it}\right),$$

respectively, where again $g_{\delta=1}(a_{it})$ and $g_{\delta=0}(a_{it})$ are piecewise linear functions with nodes at 30 and 50 years of age, and correspondingly denoted slope parameters. Finally, when an individual is in Mexico, jobs are assumed to be found and lost with probabilities

$$\lambda_{w}(\Omega_{it} \mid e_{it-1} = nw,\, l_{it} = MX) = \Phi\left(\psi_0^{w,MX} + g_{w,MX}(a_{it}) + \psi_X^{w,MX}\, 1[X_{it}^{US} > 0] + \psi_s^{w,MX}\, 1[s_{it} = \text{summer}]\right)$$

and

$$\lambda_{nw}(\Omega_{it} \mid e_{it-1} = w,\, l_{it} = MX) = \Phi\left(\psi_0^{nw,MX} + g_{nw,MX}(a_{it}) + \psi_X^{nw,MX}\, 1[X_{it}^{US} > 0] + \psi_s^{nw,MX}\, 1[s_{it} = \text{summer}]\right),$$

and when having migrated to the U.S.
with probabilities

$$\lambda_{w}(\Omega_{it} \mid e_{it-1} = nw,\, l_{it} = US) = \Phi\left(\psi_0^{w,US} + g_{w,US}(a_{it}) + \psi_X^{w,US} X_{it} + \psi_s^{w,US}\, 1[s_{it} = \text{summer}] + \psi_\delta^{w,US}\, 1[\delta_{it} = 1]\right)$$

and

$$\lambda_{nw}(\Omega_{it} \mid e_{it-1} = w,\, l_{it} = US) = \Phi\left(\psi_0^{nw,US} + g_{nw,US}(a_{it}) + \psi_X^{nw,US} X_{it} + \psi_s^{nw,US}\, 1[s_{it} = \text{summer}] + \psi_\delta^{nw,US}\, 1[\delta_{it} = 1]\right),$$

with linear splines $g_{w,MX}(a_{it})$, $g_{nw,MX}(a_{it})$, $g_{w,US}(a_{it})$ and $g_{nw,US}(a_{it})$ that all have nodes at 30 and 50 years of age, and correspondingly denoted slope parameters.

The location specific function relating age and U.S. experience to earnings is given by

$$f^{MX}(a_{it}, X_{it}^{US}) = g_a^{y,MX}(a_{it}) + \psi_X^{y,MX}\, 1[X_{it}^{US} > 0]$$

when an individual works in Mexico, and by

$$f^{US}(a_{it}, X_{it}^{US}) = g_a^{y,US}(a_{it}) + g_X^{y,US}(X_{it}^{US})$$

when working in the United States. The piecewise linear functions $g_a^{y,MX}(a_{it})$ and $g_a^{y,US}(a_{it})$ have nodes at 20, 25, 35 and 50 years of age, and $g_X^{y,US}(X_{it}^{US})$ has nodes at 5 and 10 years of U.S. experience.

Migration costs are a function of age and may be different for household heads and for the remaining family. They further include an extra cost $C_{coyote}$ if a household has no legal permit to enter the U.S. The age specific part is specified as a piecewise linear function of age, with nodes at 30 and 50 years of age, separately for heads and dependent family.

Retirement benefits. Individuals are assumed to live until age $a_{end} = 75$, which corresponds to life expectancy in Mexico at the middle of my sample period in 2002 (World Bank, 2015). Retirement schemes in Mexico and in the U.S. are approximated based on OECD (2007) data as follows: individuals retire at age $a_{ret} = 65$, with benefits $y^{ret}(\Omega_{it})$ corresponding to a net replacement rate in Mexico of 37.9 percent (55.3 percent in the U.S.) of potential earnings at age 64. If a migrant retires in the U.S., the retirement benefits are a weighted average between Mexican entitlements and benefits from the U.S., with the weight toward U.S.
benefits being the fraction of working life spent in the U.S., $X^{US}/(65 - 16)$.

Interest rates and time preference. The biannual real interest on positive asset holdings is set to $r_{A \ge 0} = 0.02$, while the real lending interest rate is set to $r_{A<0} = r_{A \ge 0} + 0.034$, based on information from the World Bank’s (2015) World Development Indicators. The biannual discount factor $\beta$ is assumed to equal $1/(1 + r_{A \ge 0})$.

Borrowing limit. Households with a working age head can take up credit. They are assumed to face two constraints on the maximum amount of debt, $B(E[y_{it}^{MX}], \Omega_{it})$, they can hold: a debt limit that depends on retirement benefits (used as collateral) and which becomes tighter towards the end of life to ensure full debt repayment, and one that captures better access to credit by high-income households and is a linear function of expected earnings, $E[y_{it}]$, so that for $a_{it} < a_{ret}$,

$$B(E[y_{it}], \Omega_{it}) = \min\left\{ b_0 + b_y E[y_{it}],\; y^{ret}(\Omega_{it})\, \frac{1 - (1 + r_{A<0})^{a_{end} - a_{ret} + 1}}{-r_{A<0}} \right\}.$$

B Moments Used for Identification

Identification and model fit are discussed in Sections 4.1 and 5.1, respectively. This appendix provides further details. Table 6 lists the model parameters to be estimated and the identifying moments more systematically. Figure 15 illustrates the fit to the marginal distributions of the three dimensions of unobserved heterogeneity. It should be noted that the model features four ex-ante heterogeneous types of individuals, and hence has difficulty matching all ten targeted deciles of these distributions. Tables 7-14 list the full set of empirical moments used in the estimation together with their simulated counterparts and standard errors.

| Parameters | Identifying moments | Dataset |
|---|---|---|
| $\lambda_e(\Omega)$ | fraction working, season last worked, and transitions into and out of employment by age, legal status, location, having been to the U.S. and season | MMP, MxFLS, SIPP |
| $f^l(a, X^{US})$ | log earnings by age, location and U.S. experience | MxFLS, SIPP |
| $\sigma_u^l$ | standard deviation of log earnings residuals | MxFLS, SIPP |
| $p_f(\Omega)$ | transitions to and from having dependent family if being in Mexico by age | MxFLS |
| $p_\delta(\Omega)$ | transitions to and from having U.S. visa by age, employment status and having previously been to the U.S. with legal documentation | MMP |
| $\phi_f^l$ | stock of assets/debt if in Mexico by family status and family location by age | MxFLS, MMP |
| $\tilde{A}_0$, $\phi_c$ | stock of assets/debt if in Mexico by age, and annual savings and remittances if in the U.S. by age, U.S. experience and employment status | MxFLS, MMP |
| $s_\varepsilon(a)$ | number of U.S. migrations by age | MMP |
| $C(\Omega)$ | annual fraction of household heads migrating to the U.S. by age and legal status | MMP |
| $C^f(\Omega)$ | annual fraction of spouses migrating to the U.S. by age and legal status | MMP |
| $B(a, E[y^{MX}])$ | loan amount taken within the past six months by age and randomized income shock treatment | PROGRESA |
| pdf of $\alpha_i^l$, $\pi_i$ | mean log earnings by location and quantiles of within-individual mean log earnings residual by location, having been to the U.S. by quantiles of mean log earnings residual in Mexico, leaving the U.S. by quantiles of mean log earnings residual in the U.S. | MxFLS, SIPP |
| MMP weights | fraction residing in the U.S. and quantiles of within-individual mean residual from regression of location on age | MMP |
| PROGRESA weights | quantiles of within-individual mean log earnings in Mexico | PROGRESA |

Table 6: Identification of model parameters

Figure 15: Model fit: Unobserved heterogeneity. Model fit for the three dimensions of unobserved heterogeneity: The upper two panels compare deciles of within-individual mean residuals from the auxiliary earnings regressions that identify $f^l(a, X^{US})$ in Mexico and in the U.S., i.e. a regression of log annual earnings in Mexico on age indicators and an indicator of having been in the U.S. from the MxFLS, and a regression of log annual earnings in the U.S. on indicators for age and years since immigration.
The bottom panel shows the fit for deciles of within-individual mean residuals from a regression of an indicator for being in the U.S. in a given year on age indicators from the MMP data.

Table 7: Migration outcomes by age.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| share in U.S. at 16 ≤ a < 25 | 0.091 | (0.002) | 0.088 |
| share in U.S. at 25 ≤ a < 35 | 0.075 | (0.002) | 0.064 |
| share in U.S. at 35 ≤ a < 45 | 0.047 | (0.002) | 0.030 |
| share in U.S. at 45 ≤ a < 55 | 0.030 | (0.002) | 0.018 |
| share in U.S. at 55 ≤ a < 65 | 0.014 | (0.002) | 0.020 |
| share of year in U.S. at 16 ≤ a < 25 | 0.906 | (0.008) | 0.885 |
| share of year in U.S. at 25 ≤ a < 35 | 0.855 | (0.006) | 0.856 |
| share of year in U.S. at 35 ≤ a < 45 | 0.847 | (0.007) | 0.818 |
| share of year in U.S. at 45 ≤ a < 55 | 0.850 | (0.011) | 0.864 |
| share of year in U.S. at 55 ≤ a < 65 | 0.846 | (0.020) | 0.956 |
| number of trips by 16 ≤ a < 25 | 0.220 | (0.016) | 0.117 |
| number of trips by 25 ≤ a < 35 | 0.513 | (0.011) | 0.293 |
| number of trips by 35 ≤ a < 45 | 0.536 | (0.011) | 0.421 |
| number of trips by 45 ≤ a < 55 | 0.482 | (0.012) | 0.485 |
| number of trips by 55 ≤ a < 65 | 0.587 | (0.015) | 0.532 |
| share with family in U.S. at 16 ≤ a < 25 | 0.184 | (0.016) | 0.323 |
| share with family in U.S. at 25 ≤ a < 35 | 0.099 | (0.009) | 0.439 |
| share with family in U.S. at 35 ≤ a < 45 | 0.077 | (0.010) | 0.283 |
| share with family in U.S. at 45 ≤ a < 55 | 0.135 | (0.015) | 0.330 |
| share with family in U.S. at 55 ≤ a < 65 | 0.162 | (0.027) | 0.523 |
| Regression of head migrating to the U.S. on: | | | |
| 1[16 ≤ age < 25] | 0.042 | (0.002) | 0.023 |
| 1[25 ≤ age < 35] | 0.036 | (0.001) | 0.013 |
| 1[35 ≤ age < 45] | 0.019 | (0.001) | 0.011 |
| 1[45 ≤ age < 55] | 0.008 | (0.001) | 0.004 |
| 1[55 ≤ age < 65] | −0.010 | (0.002) | 0.001 |
| 1[legal] | 0.178 | (0.003) | −0.001 |
| Regression of spouse migrating to the U.S. on: | | | |
| 1[16 ≤ age < 25] | 0.044 | (0.010) | 0.011 |
| 1[25 ≤ age < 35] | 0.028 | (0.005) | 0.010 |
| 1[35 ≤ age < 45] | 0.018 | (0.006) | 0.003 |
| 1[45 ≤ age < 55] | 0.033 | (0.009) | 0.002 |
| 1[55 ≤ age < 65] | 0.074 | (0.015) | 0.003 |

Data moments obtained from the MMP. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 8: Family and legal status transitions.
| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Transition to having family (MxFLS): | | | |
| 1[16 ≤ age < 25] | 0.429 | (0.179) | 0.719 |
| 1[25 ≤ age < 35] | 0.553 | (0.077) | 0.736 |
| 1[35 ≤ age < 45] | 0.372 | (0.072) | 0.228 |
| 1[45 ≤ age < 55] | 0.269 | (0.066) | 0.043 |
| 1[55 ≤ age < 65] | 0.283 | (0.061) | 0.042 |
| Transition to not having family (MxFLS): | | | |
| 1[16 ≤ age < 25] | 0.053 | (0.024) | 0.005 |
| 1[25 ≤ age < 35] | 0.007 | (0.005) | 0.001 |
| 1[35 ≤ age < 45] | 0.013 | (0.004) | 0.001 |
| 1[45 ≤ age < 55] | 0.026 | (0.004) | 0.004 |
| 1[55 ≤ age < 65] | 0.043 | (0.005) | 0.029 |
| Regression of transition to having a U.S. visa (MMP) on: | | | |
| 1[16 ≤ age < 25] | 0.012 | (0.024) | 0.032 |
| 1[25 ≤ age < 35] | 0.284 | (0.022) | 0.087 |
| 1[35 ≤ age < 45] | 0.433 | (0.023) | 0.249 |
| 1[45 ≤ age < 55] | 0.526 | (0.023) | 0.600 |
| 1[55 ≤ age < 65] | 0.514 | (0.024) | 0.501 |
| 1[working] | 0.369 | (0.022) | 0.030 |
| Regression of transition to not having a U.S. visa (MMP) on: | | | |
| 1[16 ≤ age < 25] | 0.020 | (0.012) | 0.014 |
| 1[25 ≤ age < 35] | 0.015 | (0.010) | 0.011 |
| 1[35 ≤ age < 45] | 0.009 | (0.010) | 0.010 |
| 1[45 ≤ age < 55] | 0.002 | (0.010) | 0.008 |
| 1[55 ≤ age < 65] | 0.002 | (0.014) | 0.009 |
| 1[working] | −0.002 | (0.009) | −0.008 |

Data moments obtained from the MMP and the MxFLS as indicated. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 9: Employment in Mexico.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Regression of working in Mexico on: | | | |
| 1[summer] | −0.018 | (0.014) | −0.004 |
| 1[16 ≤ age < 25] | 0.946 | (0.020) | 0.863 |
| 1[25 ≤ age < 35] | 0.956 | (0.015) | 0.866 |
| 1[35 ≤ age < 45] | 0.948 | (0.015) | 0.826 |
| 1[45 ≤ age < 55] | 0.898 | (0.015) | 0.751 |
| 1[55 ≤ age < 65] | 0.770 | (0.016) | 0.593 |
| Regression of season last worked* in Mexico on: | | | |
| 1[summer] | −0.637 | (0.185) | −0.649 |
| 1[16 ≤ age < 25] | 0.689 | (0.208) | 0.813 |
| 1[25 ≤ age < 35] | 0.660 | (0.195) | 0.827 |
| 1[35 ≤ age < 45] | 0.770 | (0.207) | 0.822 |
| 1[45 ≤ age < 55] | 0.882 | (0.190) | 0.823 |
| 1[55 ≤ age < 65] | 0.769 | (0.190) | 0.832 |
| Regression of transition into work in Mexico on: | | | |
| 1[16 ≤ age < 25] | 1.000 | (0.335) | 1.013 |
| 1[25 ≤ age < 35] | 0.795 | (0.087) | 1.092 |
| 1[35 ≤ age < 45] | 0.768 | (0.066) | 1.044 |
| 1[45 ≤ age < 55] | 0.611 | (0.057) | 0.969 |
| 1[55 ≤ age < 65] | 0.487 | (0.046) | 0.791 |
| 1[been in U.S.] | 0.079 | (0.108) | −0.162 |
| Regression of transition out of work in Mexico on: | | | |
| 1[16 ≤ age < 25] | 0.034 | (0.055) | 0.003 |
| 1[25 ≤ age < 35] | 0.053 | (0.013) | −0.006 |
| 1[35 ≤ age < 45] | 0.057 | (0.010) | 0.007 |
| 1[45 ≤ age < 55] | 0.080 | (0.010) | 0.056 |
| 1[55 ≤ age < 65] | 0.160 | (0.012) | 0.183 |
| 1[been in U.S.] | 0.036 | (0.023) | 0.087 |

* Dependent variable is an indicator taking value 1 if the last season an individual has worked (excluding the current one). Data moments obtained from the MxFLS. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 10: Employment in the U.S.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Regression of working in the U.S. (MMP) on: | | | |
| 1[legal] | −0.003 | (0.013) | −0.021 |
| U.S. experience | 0.001 | (0.001) | 0.005 |
| constant | 0.886 | (0.007) | 0.936 |
| Regression of fraction of year worked in the U.S. (MMP) on: | | | |
| 1[legal] | −0.186 | (0.010) | −0.037 |
| U.S. experience | 0.011 | (0.001) | 0.012 |
| constant | 0.855 | (0.006) | 0.829 |
| Regression of transition into work in the U.S. (SIPP) on: | | | |
| 1[16 ≤ age < 25 ∩ winter] | 0.500 | (0.115) | 0.257 |
| 1[25 ≤ age < 35 ∩ winter] | 0.500 | (0.048) | 0.246 |
| 1[35 ≤ age < 45 ∩ winter] | 0.308 | (0.052) | 0.225 |
| 1[45 ≤ age < 55 ∩ winter] | 0.145 | (0.044) | 0.294 |
| 1[55 ≤ age < 65 ∩ winter] | 0.034 | (0.035) | 0.120 |
| 1[16 ≤ age < 25 ∩ summer] | 0.000 | (0.145) | 0.621 |
| 1[25 ≤ age < 35 ∩ summer] | 0.157 | (0.046) | 0.566 |
| 1[35 ≤ age < 45 ∩ summer] | 0.152 | (0.048) | 0.559 |
| 1[45 ≤ age < 55 ∩ summer] | 0.075 | (0.036) | 0.633 |
| 1[55 ≤ age < 65 ∩ summer] | 0.052 | (0.030) | 0.259 |
| Regression of transition out of work in the U.S. (SIPP) on: | | | |
| 1[16 ≤ age < 25 ∩ winter] | 0.023 | (0.013) | 0.090 |
| 1[25 ≤ age < 35 ∩ winter] | 0.023 | (0.004) | 0.058 |
| 1[35 ≤ age < 45 ∩ winter] | 0.023 | (0.004) | 0.011 |
| 1[45 ≤ age < 55 ∩ winter] | 0.022 | (0.005) | 0.005 |
| 1[55 ≤ age < 65 ∩ winter] | 0.085 | (0.008) | 0.002 |
| 1[16 ≤ age < 25 ∩ summer] | 0.005 | (0.009) | 0.054 |
| 1[25 ≤ age < 35 ∩ summer] | 0.004 | (0.003) | 0.037 |
| 1[35 ≤ age < 45 ∩ summer] | 0.004 | (0.003) | 0.007 |
| 1[45 ≤ age < 55 ∩ summer] | 0.008 | (0.004) | 0.003 |
| 1[55 ≤ age < 65 ∩ summer] | 0.009 | (0.008) | 0.002 |

Data moments obtained from the MMP and the SIPP as indicated.
Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 11: Earnings.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Regression of log annual earnings in Mexico (MxFLS) on: | | | |
| 1[16 ≤ age ≤ 20] | 7.885 | (0.093) | 7.001 |
| 1[20 < age ≤ 25] | 8.307 | (0.045) | 7.831 |
| 1[25 < age ≤ 30] | 8.333 | (0.032) | 8.198 |
| 1[30 < age ≤ 35] | 8.347 | (0.029) | 8.500 |
| 1[35 < age ≤ 40] | 8.340 | (0.028) | 8.528 |
| 1[40 < age ≤ 45] | 8.317 | (0.029) | 8.370 |
| 1[45 < age ≤ 50] | 8.236 | (0.031) | 8.208 |
| 1[50 < age ≤ 55] | 8.160 | (0.035) | 8.010 |
| 1[55 < age ≤ 60] | 8.026 | (0.039) | 7.768 |
| 1[60 < age < 65] | 7.933 | (0.053) | 7.536 |
| standard deviation of residual | 0.984 | (0.023) | 0.976 |
| Regression of log annual earnings in the U.S. (SIPP) on: | | | |
| 1[16 ≤ age ≤ 20] | 10.047 | (0.135) | 8.645 |
| 1[20 < age ≤ 25] | 9.969 | (0.058) | 9.092 |
| 1[25 < age ≤ 30] | 10.004 | (0.052) | 8.986 |
| 1[30 < age ≤ 35] | 10.022 | (0.052) | 9.650 |
| 1[35 < age ≤ 40] | 9.934 | (0.056) | 10.042 |
| 1[40 < age ≤ 45] | 10.141 | (0.057) | 10.135 |
| 1[45 < age ≤ 50] | 10.254 | (0.062) | 10.355 |
| 1[50 < age ≤ 55] | 10.076 | (0.066) | 10.425 |
| 1[55 < age ≤ 60] | 9.954 | (0.077) | 10.462 |
| 1[60 < age < 65] | 9.520 | (0.114) | 10.515 |
| 1[5 ≤ U.S. experience < 10] | 0.060 | (0.054) | 0.341 |
| 1[10 ≤ U.S. experience < 15] | 0.149 | (0.054) | 0.032 |
| 1[15 ≤ U.S. experience] | 0.285 | (0.050) | −0.543 |
| standard deviation of residual | 0.747 | (0.015) | 0.796 |

Data moments obtained from the MxFLS and the SIPP as indicated. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 12: Assets and debt.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Regression of log assets (MxFLS) on: | | | |
| 1[16 ≤ age < 25] | 5.214 | (0.190) | 7.966 |
| 1[25 ≤ age < 35] | 6.324 | (0.174) | 8.659 |
| 1[35 ≤ age < 45] | 6.838 | (0.173) | 8.800 |
| 1[45 ≤ age < 55] | 6.986 | (0.172) | 8.576 |
| 1[55 ≤ age < 65] | 7.057 | (0.173) | 7.926 |
| 1[family] | 0.000 | (0.168) | −0.409 |
| Regression of log debt (MxFLS) on: | | | |
| 1[16 ≤ age < 25] | 4.202 | (0.220) | 5.720 |
| 1[25 ≤ age < 35] | 4.437 | (0.186) | 5.957 |
| 1[35 ≤ age < 45] | 4.656 | (0.181) | 5.926 |
| 1[45 ≤ age < 55] | 4.626 | (0.186) | 5.892 |
| 1[55 ≤ age < 65] | 4.612 | (0.193) | 5.464 |
| 1[family] | 0.026 | (0.175) | 0.181 |
| Regression of log loan take-up during last 6 months (PROGRESA) on: | | | |
| 1[age > 40] | 0.593 | (0.288) | −0.524 |
| 1[PROGRESA treated] | 0.349 | (0.222) | 0.342 |
| 1[age > 40 ∩ PROGRESA treated] | −0.471 | (0.354) | −0.219 |
| constant | 4.384 | (0.168) | 5.948 |

Data moments obtained from the MxFLS and PROGRESA as indicated. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 13: Unobserved heterogeneity.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| Within-individual mean earnings residual in Mexico (MxFLS): | | | |
| 1. dec of mean earn res in MX | −1.768 | (0.012) | −1.934 |
| 2. dec of mean earn res in MX | −0.774 | (0.012) | −0.558 |
| 3. dec of mean earn res in MX | −0.424 | (0.012) | −0.154 |
| 4. dec of mean earn res in MX | −0.192 | (0.012) | −0.107 |
| 5. dec of mean earn res in MX | −0.007 | (0.012) | −0.059 |
| 6. dec of mean earn res in MX | 0.168 | (0.012) | 0.114 |
| 7. dec of mean earn res in MX | 0.329 | (0.012) | 0.585 |
| 8. dec of mean earn res in MX | 0.505 | (0.012) | 0.652 |
| 9. dec of mean earn res in MX | 0.719 | (0.012) | 0.699 |
| 10. dec of mean earn res in MX | 1.341 | (0.012) | 0.764 |
| Within-individual mean earnings residual in the U.S. (SIPP): | | | |
| 1. dec of mean earn res in US | −1.336 | (0.012) | −1.224 |
| 2. dec of mean earn res in US | −0.638 | (0.012) | −0.500 |
| 3. dec of mean earn res in US | −0.369 | (0.012) | −0.332 |
| 4. dec of mean earn res in US | −0.188 | (0.012) | −0.219 |
| 5. dec of mean earn res in US | −0.025 | (0.012) | −0.142 |
| 6. dec of mean earn res in US | 0.114 | (0.012) | 0.051 |
| 7. dec of mean earn res in US | 0.241 | (0.012) | 0.445 |
| 8. dec of mean earn res in US | 0.408 | (0.012) | 0.549 |
| 9. dec of mean earn res in US | 0.629 | (0.012) | 0.621 |
| 10. dec of mean earn res in US | 1.046 | (0.012) | 0.719 |
| Fraction having been to the U.S. (MxFLS): | | | |
| 1. dec of mean earn res in MX | 0.068 | (0.007) | 0.021 |
| 2. dec of mean earn res in MX | 0.063 | (0.007) | 0.076 |
| 3. dec of mean earn res in MX | 0.045 | (0.007) | 0.047 |
| 4. dec of mean earn res in MX | 0.059 | (0.007) | 0.061 |
| 5. dec of mean earn res in MX | 0.042 | (0.007) | 0.054 |
| 6. dec of mean earn res in MX | 0.045 | (0.007) | 0.212 |
| 7. dec of mean earn res in MX | 0.044 | (0.007) | 0.322 |
| 8. dec of mean earn res in MX | 0.067 | (0.008) | 0.265 |
| 9. dec of mean earn res in MX | 0.053 | (0.007) | 0.270 |
| 10. dec of mean earn res in MX | 0.045 | (0.008) | 0.312 |
| Fraction staying in the U.S. for at least 1 more year (SIPP): | | | |
| 1. dec of mean earn res in US | 0.555 | (0.026) | 0.884 |
| 2. dec of mean earn res in US | 0.622 | (0.026) | 0.918 |
| 3. dec of mean earn res in US | 0.599 | (0.026) | 0.954 |
| 4. dec of mean earn res in US | 0.630 | (0.026) | 0.976 |
| 5. dec of mean earn res in US | 0.598 | (0.026) | 0.981 |
| 6. dec of mean earn res in US | 0.639 | (0.026) | 0.979 |
| 7. dec of mean earn res in US | 0.637 | (0.026) | 0.996 |
| 8. dec of mean earn res in US | 0.634 | (0.026) | 0.998 |
| 9. dec of mean earn res in US | 0.640 | (0.026) | 0.999 |
| 10. dec of mean earn res in US | 0.643 | (0.026) | 0.999 |

Data moments obtained from the MxFLS and the SIPP as indicated. Simulation based on 20,000 individuals × 50 years × 2 seasons.

Table 14: Non-representative weights.

| Moment | Data | Std. err. | Simulation |
|---|---|---|---|
| U.S. migration in the MMP (residual decile of being in the U.S. given age): | | | |
| 1. decile | −0.087 | (0.001) | −0.081 |
| 2. decile | −0.077 | (0.001) | −0.073 |
| 3. decile | −0.066 | (0.001) | −0.064 |
| 4. decile | −0.052 | (0.001) | −0.057 |
| 5. decile | −0.043 | (0.001) | −0.052 |
| 6. decile | −0.036 | (0.001) | −0.047 |
| 7. decile | −0.028 | (0.001) | −0.044 |
| 8. decile | −0.017 | (0.001) | −0.018 |
| 9. decile | 0.063 | (0.001) | 0.070 |
| 10. decile | 0.395 | (0.001) | 0.381 |
| Earnings in PROGRESA (earnings decile): | | | |
| 1. decile | 6.374 | (0.010) | 6.374 |
| 2. decile | 7.186 | (0.010) | 7.265 |
| 3. decile | 7.578 | (0.009) | 7.473 |
| 4. decile | 7.813 | (0.011) | 7.785 |
| 5. decile | 7.975 | (0.010) | 7.978 |
| 6. decile | 8.171 | (0.009) | 8.140 |
| 7. decile | 8.360 | (0.009) | 8.205 |
| 8. decile | 8.412 | (0.056) | 8.473 |
| 9. decile | 8.568 | (0.010) | 8.681 |
| 10. decile | 9.176 | (0.011) | 8.972 |

Data moments obtained from the MMP and PROGRESA as indicated. Simulation based on 20,000 individuals × 50 years × 2 seasons.

C Asymptotic Distribution of the Simulated Minimum Distance Estimator with Multiple Samples

The following derivation of the asymptotic distribution of the estimator used in this paper extends results by Gourieroux et al. (1993) for the simulated minimum distance estimator with one data source to the case where identification requires moments from multiple datasets.28 The following assumptions need to be made:

Assumption 1. The different samples used are drawn independently. This implies that any cross-sample moments are zero and most plausible weighting matrices $W$, including the efficient ones, will be block diagonal, with a block $W_\varsigma$ for each set of moments derived from the same dataset $\varsigma$.

Assumption 2. The criterion function to be minimized,

$$\Gamma(\vartheta) = D(\vartheta)' W D(\vartheta) = \left(m^d - m^s(\vartheta)\right)' W \left(m^d - m^s(\vartheta)\right),$$

is differentiable and attains its global minimum at the true parameter vector $\theta$.

Assumption 3. $\frac{\partial D(\vartheta)}{\partial \vartheta'}(\theta)$ has full rank, which ensures identification of parameters $\theta$ through the moments in $D(\vartheta)$.

Assumption 4. The moments targeted, $m^d$, are asymptotically normally distributed.

Assumption 5. Sample sizes $N_\varsigma$ of the datasets used increase at a proportional rate

$$\lim_{N_\varsigma \to \infty,\, N \to \infty} (N_\varsigma / N) = \lambda_\varsigma,$$

with $N = \sum_\varsigma N_\varsigma$ and $0 < \lambda_\varsigma < \infty$. This ensures that none of the samples is irrelevant relative to the others.

Assumption 6. Simulated sample sizes $N_\varsigma^s$ increase at a rate such that

$$\lim_{N_\varsigma \to \infty,\, N_\varsigma^s \to \infty} (N_\varsigma / N_\varsigma^s) = n_\varsigma,$$

with $0 < n_\varsigma < \infty$.
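As an illustration of Assumptions 1 and 2, the following minimal Python sketch evaluates the criterion $\Gamma(\vartheta)$ once with the full block-diagonal weighting matrix and once as a sum of per-sample contributions. It is not the estimation code used in this paper: the two "samples", their moment values, the weights and all function names are hypothetical and chosen only to show that the two expressions coincide when $W$ is block diagonal.

```python
import numpy as np

def block_diag(blocks):
    """Place square matrices along the diagonal of a larger zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

def criterion_full(m_data, m_sim, weights):
    """Gamma = D' W D with D the stacked moment differences and W block diagonal."""
    d = np.concatenate(m_data) - np.concatenate(m_sim)
    return float(d @ block_diag(weights) @ d)

def criterion_by_sample(m_data, m_sim, weights):
    """Gamma as the sum over samples of D_s' W_s D_s."""
    return float(sum((md - ms) @ w @ (md - ms)
                     for md, ms, w in zip(m_data, m_sim, weights)))

# Hypothetical observed and simulated moments from two independent samples.
m_data = [np.array([0.090, 0.070]), np.array([8.3, 8.2, 0.980])]
m_sim = [np.array([0.088, 0.064]), np.array([8.2, 8.4, 0.976])]
# Hypothetical weighting blocks, one per sample (e.g. inverse variances).
weights = [np.eye(2) * 100.0, np.diag([4.0, 4.0, 25.0])]

gamma_full = criterion_full(m_data, m_sim, weights)
gamma_by_sample = criterion_by_sample(m_data, m_sim, weights)
```

Because the off-diagonal blocks of $W$ are zero, no cross-sample products of moment differences enter the criterion, which is what allows the derivation to treat each sample's contribution separately.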
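28 The derivation is similar to that of the properties of the two-sample IV estimator by Angrist and Krueger (1992) and Arellano and Meghir (1992). Also related is the discussion by Singleton (2006) of GMM estimation with time series data of unequal length.

Then, by the first order conditions for a minimum of the criterion function at the parameter estimate $\hat\theta$,
\[
\frac{\partial \Gamma}{\partial \vartheta}(\hat\theta) = -2\, \frac{\partial m^{s\prime}}{\partial \vartheta}(\hat\theta)\, W \big(m^d - m^s(\hat\theta)\big) = 0,
\]
or
\[
\frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W D(\hat\theta) = 0.
\]
By the mean value theorem, for some $\bar\theta$ between $\hat\theta$ and $\theta$,
\[
D(\hat\theta) = D(\theta) + \frac{\partial D}{\partial \vartheta'}(\bar\theta) \cdot (\hat\theta - \theta).
\]
Substituting into the first order condition yields
\[
\hat\theta - \theta = -\left[\frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W\, \frac{\partial D}{\partial \vartheta'}(\bar\theta)\right]^{-1} \frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W D(\theta).
\]
If the observed moment vector $m^d$ consists of moments from several independently drawn samples $\varsigma$, and $W$ is block diagonal as described above, $\Gamma(\vartheta)$ can be written as a sum of the contributions to the criterion by the moments of each sample. Hence, by the first order conditions,
\[
0 = E\left[\frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W \big(m^d - m^s(\theta)\big)\right] = \sum_\varsigma E\left[\frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma \big(m_\varsigma^d - m_\varsigma^s(\theta)\big)\right],
\]
where $m_\varsigma^d$ and $m_\varsigma^s(\theta)$ are vectors of observed and simulated moments from sample $\varsigma$. So one can write
\[
\sqrt{N}\, \frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W D(\theta) = \sum_\varsigma \sqrt{N} \left( \frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma m_\varsigma^d - E\left[\frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma m_\varsigma^d\right] - \left( \frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma m_\varsigma^s(\theta) - E\left[\frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma m_\varsigma^s(\theta)\right] \right) \right).
\]
Then, under Assumptions 4-6, this yields the asymptotic distribution for $\hat\theta$,
\[
\sqrt{N}(\hat\theta - \theta) \xrightarrow{d} \mathcal{N}\left(0,\;
\left[\frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W\, \frac{\partial D}{\partial \vartheta'}(\hat\theta)\right]^{-1}
\cdot \left[\sum_\varsigma \frac{\partial D_\varsigma'}{\partial \vartheta}(\hat\theta)\, W_\varsigma\, \mathrm{var}(m_\varsigma^d)\, W_\varsigma\, \frac{\partial D_\varsigma}{\partial \vartheta'}(\hat\theta)\, N(1 + n_\varsigma)\right]
\cdot \left[\frac{\partial D'}{\partial \vartheta}(\hat\theta)\, W\, \frac{\partial D}{\partial \vartheta'}(\hat\theta)\right]^{-1}\right).
\]

D Structural Estimates

This appendix lists the full set of structural parameters estimated. I group these into parameters governing family status transitions, legal status transitions, employment transitions in Mexico, employment transitions in the U.S., earnings in Mexico, earnings in the U.S., preferences, migration costs, and the initial stock of assets and debt limits.

Table 15: Structural estimates of family status transition parameters.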
Parameter                     Point estimate   Standard error
$\psi_{0}^{f+}$               −0.840           (0.023)
$\psi_{0}^{f-}$               −1.570           (0.088)
$\psi_{a\le30}^{f+}$          −0.001           (0.001)
$\psi_{30<a\le50}^{f+}$       −0.084           (0.090)
$\psi_{a>50}^{f+}$             0.014           (0.415)
$\psi_{a\le30}^{f-}$          −0.067           (0.003)
$\psi_{30<a\le50}^{f-}$        0.019           (0.008)
$\psi_{a>50}^{f-}$             0.077           (0.033)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 16: Structural estimates of legal status transition parameters.

Parameter                     Point estimate   Standard error
$\psi_{0}^{\delta+}$          −3.063           (0.100)
$\psi_{0}^{\delta-}$          −1.404           (0.468)
$\psi_{e}^{\delta+}$           0.575           (0.085)
$\psi_{e}^{\delta-}$          −2.712           (0.000)
$\psi_{a\le30}^{\delta+}$      0.033           (0.003)
$\psi_{30<a\le50}^{\delta+}$   0.076           (0.011)
$\psi_{a>50}^{\delta+}$       −0.012           (0.146)
$\psi_{a\le30}^{\delta-}$     −0.024           (0.014)
$\psi_{30<a\le50}^{\delta-}$  −0.029           (0.138)
$\psi_{a>50}^{\delta-}$        0.041           (0.273)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 17: Structural estimates of employment transition parameters for Mexico.

Parameter                     Point estimate   Standard error
$\psi_{0}^{w,MX}$             −1.537           (0.030)
$\psi_{0}^{nw,MX}$            −0.835           (0.016)
$\psi_{s}^{w,MX}$              0.002           (0.016)
$\psi_{s}^{nw,MX}$             0.003           (0.003)
$\psi_{X}^{w,MX}$             −0.725           (0.026)
$\psi_{X}^{nw,MX}$            −0.025           (0.037)
$\psi_{a\le25}^{w,MX}$         0.167           (0.002)
$\psi_{a\le25}^{nw,MX}$       −0.009           (0.001)
$\psi_{25<a\le40}^{w,MX}$     −0.094           (0.002)
$\psi_{25<a\le40}^{nw,MX}$     0.011           (0.001)
$\psi_{40<a\le55}^{w,MX}$     −0.051           (0.002)
$\psi_{40<a\le55}^{nw,MX}$     0.019           (0.003)
$\psi_{a>55}^{w,MX}$          −0.069           (0.017)
$\psi_{a>55}^{nw,MX}$          0.041           (0.009)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 18: Structural estimates of employment transition parameters in the U.S.

Parameter                     Point estimate   Standard error
$\psi_{0}^{w,US}$             −1.395           (0.024)
$\psi_{0}^{nw,US}$            −3.978           (1.906)
$\psi_{s}^{w,US}$              0.896           (0.015)
$\psi_{s}^{nw,US}$            −0.093           (0.872)
$\psi_{X}^{w,US}$              0.004           (0.001)
$\psi_{X}^{nw,US}$             0.028           (0.013)
$\psi_{\delta}^{w,US}$        −0.022           (0.035)
$\psi_{\delta}^{nw,US}$        0.021           (5.583)
$\psi_{a\le25}^{w,US}$        −0.005           (0.000)
$\psi_{a\le25}^{nw,US}$       −0.019           (0.119)
$\psi_{25<a\le40}^{w,US}$     −0.009           (0.004)
$\psi_{25<a\le40}^{nw,US}$     0.011           (0.123)
$\psi_{40<a\le55}^{w,US}$      0.020           (0.002)
$\psi_{40<a\le55}^{nw,US}$     0.011           (0.127)
$\psi_{a>55}^{w,US}$          −0.256           (0.140)
$\psi_{a>55}^{nw,US}$          0.033           (0.040)

Estimation by simulated minimum distance estimation, standard errors in parentheses.
Table 19: Structural estimates of earnings function parameters in Mexico.

Parameter                     Point estimate   Standard error
$\alpha_{i}^{MX}$              3.733           (0.021)
$\psi_{a\le20}^{y,MX}$         0.193           (0.001)
$\psi_{20<a\le25}^{y,MX}$      0.162           (0.002)
$\psi_{25<a\le35}^{y,MX}$      0.004           (0.001)
$\psi_{35<a\le50}^{y,MX}$     −0.028           (0.001)
$\psi_{50<a}^{y,MX}$          −0.038           (0.002)
$\sigma_{u}^{MX}$              0.361           (0.011)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 20: Structural estimates of earnings function parameters in the U.S.

Parameter                     Point estimate   Standard error
$\alpha_{i}^{US}$              7.263           (0.073)
$\psi_{x\le5}^{y,US}$          0.143           (0.004)
$\psi_{5<x\le10}^{y,US}$       0.044           (0.002)
$\psi_{x>10}^{y,US}$           0.035           (0.001)
$\psi_{a\le20}^{y,US}$         0.090           (0.001)
$\psi_{20<a\le25}^{y,US}$     −0.112           (0.003)
$\psi_{25<a\le35}^{y,US}$     −0.067           (0.002)
$\psi_{35<a\le50}^{y,US}$     −0.051           (0.002)
$\psi_{50<a}^{y,US}$          −0.029           (0.005)
$\sigma_{u}^{US}$              0.344           (0.010)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 21: Structural estimates of preference parameters.

Parameter                     Point estimate   Standard error
$\pi_{i}^{US}$                 0.968           (0.008)
$\phi_{c}$                     0.627           (0.002)
$\phi^{f}_{l \neq l^{f}}$      0.161           (0.008)
$\phi^{f}_{l = l^{f}}$         3.295           (0.037)
$s^{\varepsilon}_{0}$        378.090           (5.839)
$s^{\varepsilon}_{a}$         −1.434           (0.424)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 22: Structural estimates of migration cost parameters.

Parameter                     Point estimate   Standard error
$mc_{0}$                       4952.948        (84.604)
$mc_{a\le30}$                    12.906        (2.251)
$mc_{30<a\le50}$                 57.665        (1.965)
$mc_{a>50}$                     110.577        (0.013)
$mc_{0}^{f}$                  13090.514        (1.420)
$mc_{a\le30}^{f}$                15.673        (0.000)
$mc_{30<a\le50}^{f}$            161.568        (52.790)
$mc_{a>50}^{f}$                 130.750        (0.000)
$C_{coyote}$                    775.357        (27.993)

Estimation by simulated minimum distance estimation, standard errors in parentheses.

Table 23: Structural estimates of borrowing constraint and initial stock of assets parameters.

Parameter                     Point estimate   Standard error
$B_{0}$                        8.632           (7.256)
$B_{y}$                        0.702           (0.004)
$A_{0}$                       26.530           (0.724)

Estimation by simulated minimum distance estimation, standard errors in parentheses.