Immigrant Wage Assimilation in the United States: The Role of Occupational Upgrading

Rebecca Lessem∗ and Carl Sanders†

November 23, 2014

Abstract

We quantify the importance of a prominent explanation for the wage assimilation of immigrants to the US, which we term occupational upgrading: due to a lack of opportunities, new immigrants take jobs below their true skill levels, but with more time in the US they find better matches for their skills. To do this, we create a simple model that links immigrant skills to wages, job offers, and occupational upgrading. We estimate the model on representative labor market and migration data from US immigrants. Our results show that only high-skilled immigrants would be better matched by policies to reduce frictions.

JEL Codes: J31, J15, J62

∗ Tepper School of Business, Carnegie Mellon University. rlessem@andrew.cmu.edu
† Department of Economics, Washington University in St. Louis. carlsanders@wustl.edu

Thanks to seminar participants at CMU, Kansas State University, Montana State University, NYU, Washington University in St. Louis, LSE, UCL, and the Society for Economic Dynamics, the Society of Labor Economics, and the North American Summer Meeting of the Econometric Society for helpful comments. Financial support from the National Center for Border Security and Immigration at the University of Arizona is gratefully acknowledged.

1 Introduction

Immigrants to the United States earn lower wages than natives who have the same levels of education and work experience. But with time in the US, the gap between immigrants’ wages and those of comparable natives decreases, from 35% below natives at entry to 10% below after 15 years in the labor force.¹ This empirical phenomenon is called “wage assimilation” and is one component of an active debate among politicians and pundits discussing proposed changes to US immigration policy.
Policymakers would like to improve the process of assimilating immigrants into the existing social and economic structure of the US.² While there is a good deal of evidence documenting empirical wage assimilation patterns, the primary problem that policymakers face is that neither they nor economists know the relative importance of the potential reasons why immigrants experience wage assimilation.

In this paper, we quantify the role of a primary explanation given in the literature for immigrant wage assimilation in the US, which we term “occupational upgrading.” Occupational upgrading is the idea that immigrants may be restricted from finding the best job for their skills when they enter the US. Consider an accountant immigrating from India: she may have the same education and skills as native accountants, but lack state-level licenses, personal connections, or knowledge of US accounting firms. Given these frictions, she may begin her time in the US as a secretary. With time in the US, she can get the licenses she needs, meet people in her field, and learn about the best way to find a job as an accountant. In the long run, she can overcome the frictions she faces and can find the best job for her skills. Her wages will catch up to those of her “comparable” natives; that is, natives who had her level of education but also had opportunities to get licenses, connections, and field-specific knowledge before she even entered the US.

We determine the importance of occupational upgrading for wage assimilation by combining data from the New Immigrant Survey (NIS) with a novel econometric model.

¹ Numbers taken from our data.
² E.g. the debate between Ross Douthat (http://douthat.blogs.nytimes.com/2013/06/12/thegreat-assimilation-debate), David Frum (http://www.thedailybeast.com/articles/2013/04/24/the-immigrants-haven-t-changed-thedestination-has.html), and David Brooks (http://www.nytimes.com/2013/05/07/opinion/brooksbeyond-the-fence.html).
The NIS is a representative³ sample of recently-legalized immigrants to the US which includes detailed migration and employment histories as well as a number of demographics not typically available in Census-based data sets. Our econometric model jointly estimates the wages and occupational choices of immigrants over their careers in the US. By combining model estimates with data on native US workers, we then quantify the role of occupational upgrading in closing the native-immigrant wage gap as immigrants gain experience in the US.

If occupational upgrading were the primary driver of wage assimilation, policies that helped immigrants find better jobs (without associated skills training) could reduce misallocation of immigrants to jobs and help close the native-immigrant wage gap. There is reason to believe these policies might help: occupational upgrading has been found to be a significant component of wage assimilation in non-US contexts. Most notably, Eckstein and Weiss (2004) and Weiss et al. (2003) studied the role of occupational upgrading in immigrant wage assimilation in a sample of Russian migrants into Israel after the collapse of the Soviet Union. Our results show that their numbers should not be directly applied in the US: the migrants in their sample were highly-educated engineers in Russia, so the sample had almost no variation in educational levels, home country occupations, and legal/illegal immigration status. We find that the importance of occupational upgrading for an immigrant depends dramatically on their skill levels, so the experiences of the engineers in their sample are not generalizable.

The data available in the NIS are well suited for quantifying occupational upgrading and wage assimilation in the US. The NIS sample reports retrospective labor market outcomes of immigrants in the US and information on both legal and illegal experience in the US.
Additionally, it has extensive details on immigrants’ pre-immigration characteristics, in particular education, English skills, visa status, and home country occupation. For outcomes in the US, the NIS contains information on immigrants’ first occupation in the US and their occupation in the US at the time of the survey in 2003. We first document that the cognitive tasks of occupations are an important driver of immigrant wages even conditional on immigrant characteristics. Next, we document the existence of occupational upgrading: simple regressions show that immigrants move to higher cognitive task occupations with time in the US.

³ The NIS is representative of immigrants who received a Green Card between May-November 2003, not the overall immigrant population. See Section 2 for information on the sampling scheme.

The basic patterns in the data show that immigrants upgrade their occupations over time in the US, but are only suggestive that occupational upgrading may be a significant component of immigrant wage assimilation. The regressions cannot be used directly to quantify its importance because of endogenous missing data in our sample. The NIS asks about the first and current jobs in the US, but collects no information on any jobs in between those two. This means the missing data on jobs is non-random: immigrants who upgrade occupations more often have more missing job observations.

To deal with this missing data issue, we create a novel latent variable model of occupational upgrading and immigrant skills and estimate it on the NIS. In the model, workers’ wages are a function of their latent skills, their labor market experience, and their current occupation. Their occupational choices then come from a latent variable structure: every period, they may remain at their previous job, receive a shock into the unemployment pool, or get an outside offer from another job, which they only accept if the new job is better than their current one.
Instead of simply estimating latent skill levels as residuals in the wage and occupation equations, we estimate a mapping from the detailed individual demographics available in the NIS to immigrant skills. We show that the model is non-parametrically identified even with incomplete worker histories in our data, estimate it by Simulated Maximum Likelihood, and show that it provides an excellent fit on observed immigrant occupational transitions.

With the parameter estimates, we perform decompositions to determine the importance of occupational upgrading for immigrant wage assimilation. Our primary decomposition estimates how the wage assimilation process would change if immigrants were immediately placed in the estimated job they would end up with after 25 years in the US. That is, if the model predicts a worker starting as a secretary has the skills to end up as an accountant, how would starting her as an accountant at entry to the US affect her wage gap with comparable natives? While this is obviously impossible as a policy solution, we interpret the results from the thought experiment as an upper bound for the effectiveness of programs that aim to reduce job frictions.

The results show that while occupational upgrading plays a significant role in wage assimilation, its importance varies dramatically depending on the immigrants’ skill levels. If immigrants were moved immediately to the occupation they would end up with in the long run in the US, the average native-immigrant wage gap at entry would decrease by 20%, a non-trivial amount which still leaves room for other factors. The effects of this decomposition depend strongly on the pre-immigration characteristics of the immigrants: higher-skilled immigrants also tend to be those who get the most benefit from immediately moving to their long-run job.
For example, for immigrants with high English skills, moving them to their optimal occupation decreases the wage gap at entry by 54% and reduces the average wage gap over the first 15 years to -1% (they earn more than natives). On the other hand, moving immigrants with low English skills to their long-run occupation at entry only reduces the initial wage gap by 10% and the gap over the first 15 years to 26% (from 24% before). We show more generally that the estimated reduction in the wage gap from moving an immigrant to her long-term job is increasing in the immigrant’s entry wage.

Our results have implications for both US immigration policy and future research into wage assimilation. We find that eliminating the need for occupational upgrading would increase the speed of wage assimilation, but it would have a significant impact only for high-skilled immigrants who already have the best time in the US labor market. Rather than policies that look to help immigrants find the right jobs, a policy that specifically focused on increasing the skills of low-skilled immigrants may have better distributional consequences. For future research, our results show that the higher-skilled the immigrant, the larger the estimated role of occupational upgrading in assimilation. Given that many data sets used in the immigration literature (including ours) have some selectivity of the sample, this result emphasizes that effects of potential policies cannot be uncritically applied to immigrants of different skill levels.

The previous literature on wage assimilation in the US has almost universally been focused on documenting its existence and extent. Chiswick (1978), Borjas (1985), and LaLonde and Topel (1992) document assimilation using cross-sectional data from the US Census, and Duleep and Dowhan (2002) and Lubotsky (2007) use longitudinal data from Social Security Administration records.
The results of these studies differ in the specifics depending on the data set and timeframe, but they all document the general phenomenon of wage assimilation.⁴ They do not, however, go beyond documenting the existence and extent of wage assimilation. While we also document the existence of wage assimilation in our sample, our primary contribution is to be the first to analyze the role of occupational upgrading in immigrant assimilation into the US labor market.

There is a small group of papers attempting to quantify the importance of occupational upgrading for wage assimilation, but they typically focus on non-US labor markets. As mentioned above, Eckstein and Weiss (2004) and Weiss et al. (2003) look at the role of firm and occupational transitions for wage growth of a non-representative sample of highly-skilled Russian immigrants to Israel, while de Matos (2011) shows reduced form evidence on immigrants moving to more productive firms over time in linked employer-employee data from Portugal. Imai et al. (2011) use Canadian data to show that home country occupation predicts immigrant wage growth, but do not explicitly consider occupational upgrading within Canada or quantify the effects of home country occupation on the observed wage gap between immigrants and natives.

⁴ There is some argument over whether recent cohorts are still seeing wage growth in the US; see Borjas and Friedberg (2009).

2 Data: Immigrant Histories and Occupational Characteristics

The New Immigrant Survey (NIS) has an unusual sampling scheme, and the structure of the sample matters for our estimation strategy and interpretation of the results. In this section, we discuss the sampling scheme and construction of our data; see Section 3 for summary statistics and a discussion of sample selection issues.
The NIS drew a random sample from a group of individuals who had applied for permanent residency in the US and were granted Legal Permanent Resident (LPR) status between May and November 2003, becoming what is colloquially known as “Green Card” holders. The Green Card recipients were interviewed in person over that 6-month period at the location where the LPR documentation was sent. The immigrants were asked a wide variety of demographic and labor market questions, as well as a detailed retrospective migration history. The demographics we use are year of birth, year of entry in the US, education, home country (which we combine with a measure of home country per capita GDP), US entry visa status, gender, home country occupation, and English skills. The data also include years of experience both as a legal and illegal immigrant in the US. All of this information is self-reported, so the measures are necessarily crude for some demographics: for example, English skills are self-reported on a “low,” “medium,” or “high” scale. We restrict our sample to LPR recipients who were living in the US at the time of the interview.

The labor market questions include occupation, industry, wage, firm characteristics, and firm tenure. These questions were asked about the current job, but also retrospectively about 1) the immigrant’s first job after age 16 in their home country; 2) their final job in their home country; and 3) their first job in the US. Of these three additional questions we do not use responses to #1 unless it was also their final job in the home country, since the job at age 16 is not typically informative of later outcomes.

Using these responses, we construct a panel of wages and occupations for immigrants in the US. The structure of this panel is that for both the first job in the US and the immigrant’s current job, we have the wage at the job, the main occupation of the job, and job tenure in years.
This data structure has quite a bit of missing labor market information in the US: any information on jobs between the first and current is missing. Additionally, this missing data is non-random, since a worker who moves jobs often will have more missing job characteristics than an immigrant who never moves.

To quantify the occupational upgrading of immigrants, we use the “task-based” approach of classifying occupations; for a summary of this topic see Sanders and Taber (2012). The NIS data includes the 3-digit 2000 Census Occupational Codes for each job. Without aggregation, there are far too many occupational cells relative to observations to perform inference. We could classify jobs into different bins, such as “skilled” and “unskilled”, but this results in a loss of variation between jobs that are put into the same grouping. To avoid these problems, we characterize occupations by a continuous measure of the cognitive, manual, and interpersonal tasks performed. We follow the literature and use the O*NET database of occupational tasks to score each occupation. O*NET was created under the sponsorship of the US Department of Labor and is a representative survey that asks workers about the tasks they perform in their occupation. Using these responses we create a low-dimensional index of different tasks performed in each occupation using the procedure discussed in Appendix A.

Although we form measures of the cognitive, manual, and interpersonal task scores for each occupation from the O*NET data, in the remainder of the paper we focus on cognitive tasks. In preliminary wage regressions, we found strong returns to cognitive tasks and only small effects of the other tasks, which we take as evidence that cognitive tasks best represent the occupational ladder that workers climb to increase their wages.
This result holds true even though immigrants are significantly more likely to be working in more manually-intensive occupations than equivalent natives and even in more manually-intensive occupations than their home country occupations. Despite this, we see almost no wage returns to moving up the manual task job ladder, suggesting that these jobs are more stopgaps than parts of the immigrant’s occupational upgrading path.

3 Descriptive Statistics

3.1 Summary Statistics

Table I shows general summary statistics for the sample. The average age in the sample is close to 40, and the sample is about 55% male. The average immigrant has about 4.5 years of work experience in the US as a legal immigrant. Even though the survey is a sample of legalized immigrants, many (19%) had worked as illegal immigrants for some period of time. Conditional on having any illegal experience, the average amount of illegal experience in the data is around 13 years.⁵ About one-quarter of the sample moved to the US on a visa sponsored by an employer. This is an important control as people in this group likely had a job offer before moving to the US, so we expect them to be higher skill workers and to suffer less of a drop in the skill level of their job after moving to the US. Most of the remainder of the sample moved on family reunification visas. Over 60% of the sample has had some schooling beyond high school, and around a third of the sample reports high English skills.

While this sample is representative of LPR recipients, clearly it is not a representative sample of all US immigrants: it does not contain information on immigrants who never apply for LPR status or those who apply and are not granted a Green Card. We expect the sample selection issue to bias our results towards measuring more wage assimilation than a truly representative sample of all immigrants for two reasons. First, LPR recipients are likely to simply be higher skilled relative to non-LPR immigrants.
Second, even if they had the same skills as non-LPR recipients, immigrants who are unsuccessful in the US are presumably under-represented in the pool of LPR applicants and recipients. Lubotsky (2007) emphasizes that return migration can bias wage assimilation estimates upwards. On the other hand, given an observed level of wage assimilation of this selected group, it is not obvious which direction selection bias would work in terms of quantifying the role of occupational upgrading in that assimilation.

To gain some information on the extent to which our sample differs from the overall population of immigrants in the US, we calculated basic summary statistics on the sample of immigrants in the 2003 Current Population Survey. Individuals who were born abroad have an average age of 39, 36% have attended college, and 49% are male. The average age and gender composition of the NIS sample are similar to the overall immigrant population, but the NIS sample has a significantly higher percentage with some college education (60% vs. 36%).

⁵ Illegal experience is self-reported, so there is no way to know to what extent it is being understated.

3.2 Occupational Upgrading

In this section we describe how immigrants in our sample moved up the occupational ladder with time in the US and how this relates to their skills as proxied by pre-immigration characteristics. The cognitive task measures of jobs are standardized to be between 0 and 1, with 0 being the lowest-cognitive task occupation in the US and 1 being the highest. Figure I shows the distribution of cognitive tasks for the home job, initial job in the US, and current job in the US. While there is only the slightest amount of occupational upgrading within the US in this picture, the three distributions are not independent and there is a significant variance of pre-immigration characteristics, including mixing across different arrival cohorts.
The conditional means of occupational choices are much more informative about occupational transitions over time.

The first column of Table II shows the results from a regression of the immigrant’s initial job at entry to the US onto their pre-immigration characteristics. This column can be interpreted as showing the importance of the pre-immigration characteristics for the skills needed to get high-level cognitive task jobs at US entry. There is a great deal of variation in the predicted initial cognitive tasks by demographics. Taking a (non-existent) “worst” immigrant who was in the lowest cognitive task occupation in his home country with no English skills, no education post-high school, etc., the regression predicts he would begin in the occupation located in the 23rd percentile of cognitive task occupations in the US. Repeating the same exercise for the “best” possible immigrant, he would end up in the 75th percentile of the US cognitive task distribution. Of particular note is the interaction between home occupation and home country GDP: as might be expected, coming from countries with higher per capita GDP means that cognitive tasks in that country are a stronger proxy for skills.

The results of the regression of the cognitive task level of the current job onto demographics and the initial US job (shown in the second column of Table II) are unsurprising. Cognitive tasks increase with both legal and illegal work experience, but as expected the effects of legal work experience are larger. This regression also suggests that the cognitive task growth rates of higher skilled immigrants are faster than those of lower skilled immigrants, since even conditioning on initial job and time in the US many of the demographics still have significant effects. To look more carefully at the relationship between skills and occupational upgrading, Table III shows the determinants of task growth between the first and current job in the US.
At first glance, conditional on everything else, time in the US leads to higher task growth. However, the coefficient on initial US cognitive tasks is large and negative. The interpretation of this in terms of an occupational upgrading framework is actually straightforward: if a worker gets a job with cognitive tasks above his true skill level, we should expect to see low or even negative growth in cognitive tasks over time. A simple example using these numbers can illustrate this: consider the “best” immigrant discussed above, who has the highest possible skills as proxied by the demographics. If this immigrant enters the US at the cognitive task job predicted by the initial job regression, the 75th percentile cognitive task job, the task growth regression predicts upgrading from the 75th to the 83rd percentile cognitive task job over 10 years of legal experience. But say this worker got very unlucky and instead could only find a job at the 25th percentile of the cognitive task distribution. This regression now predicts this worker would move from the 25th percentile cognitive job to the 52nd percentile job, a significantly higher growth rate.

The descriptive statistics are informative about the overall degree of occupational upgrading in the sample, but there are issues with attempting to directly use the regressions to measure the rates of occupational upgrading. For one, we actually have more data on the jobs than used here: we have not used the durations of the jobs to provide any information. Additionally, the endogenous missing data problem of not observing intermediate jobs is not fixed by simply conditioning on the last observed job. To use the duration information and deal with the missing data problem, in the next section we develop a simple econometric model which uses flexible functional forms derived from the intuition behind occupational upgrading.
4 Model

Our econometric model is a multiple-equation latent variable model relating wages and occupations with immigrant skills. First, in the wage equation, we assume that log wages for worker $i$ in occupation $j$ in time $t$ are given by

\[ w_{ijt} = w\left(h_{it}, \pi_{jt}\right) + \varepsilon_{ijt}, \tag{1} \]

where $w$ is a function to be estimated, $h_{it}$ is their current stock of productive human capital, $\pi_{jt}$ is the productivity of occupation $j$ that the worker chose at time $t$, and $\varepsilon_{ijt}$ is white noise independent of everything else. Second, in the occupation equation we model the productivity of the immigrant’s occupation of choice at time $t$ as

\[ \pi_{jt} = \pi\left(s_i, \pi_{j(t-1)}, \nu_{ijt}\right) \tag{2} \]

where $\pi$ is a function we specify later, $s_i$ is the worker’s skill at finding jobs (potentially distinct from her human capital $h$), $\pi_{j(t-1)}$ is the previous period’s occupational productivity, and $\nu_{ijt}$ is an i.i.d. shock the worker observes but we do not.

There are some decompositions of wage assimilation that do not require estimation of the function $\pi$ in equation (2). For example, how would the wage gap between natives and immigrants change if immigrants never changed occupations after US labor market entry? Clearly, simply estimating equation (1) (as long as sufficiently good measures of human capital were used) and plugging in $\pi_{jt} = \pi_{j0}$ for all $t$ would work. This is just assuming $\pi(\cdot) = \pi_{j0}$: immigrants are always at their initial job. On the other hand, a decomposition of wage growth that depends on keeping the $\pi$ process fixed but changing where the worker begins requires estimation of the occupational upgrading function $\pi$. Our primary interest is in one such decomposition: we will assume that workers begin their US careers in the “optimal” occupation they would have reached in the long run. In this case, we need to know the occupational upgrading process.
For the thought experiment, we need to know each person’s long run occupation, but we do not have this in the data for all individuals. To get an estimate of a worker’s long run job as a function of their characteristics, we need to estimate the occupational upgrading function to determine what happens to that worker after a long time in the US.

Our estimation strategy will estimate the wage equation, equation (1), using standard regression techniques, but we cannot do that for the occupational upgrading function $\pi$ in equation (2). Our panel data structure for occupations has a typical case that looks like

Year 1: $\tau_c = 0.33$, Year 2: $\tau_c = 0.33$, Year 3: Missing, Year 4: Missing, Year 5: $\tau_c = 0.5$

where $\tau_c$ gives the worker’s cognitive task level at their job. If we estimated the occupational upgrading process as a linear model, we would never have workers remaining in the same job for more than one period. Additionally, as in the example we will be missing the previous productivity $\pi_{j(t-1)}$ for some year (in the example, Year 5). The number of missing years is higher for immigrants who move more since we do not observe even the second job.

A simple example will illustrate why this is a problem. Consider two workers, A and B, who both begin in the job $\tau_c = 0$. In period 2, we see worker A still in job $\tau_c = 0$ but are missing data for worker B. In period 3, we see both workers at $\tau_c = 1$. If the missing data were random, using worker B for inference would be no problem. But since missing data on B means we know she moved jobs between periods 1 and 2, the correct inference about job offer rates for the two workers would be different.

This missing data problem motivates us to choose a functional form for $\pi\left(s_i, \pi_{j(t-1)}, \nu_{ijt}\right)$ that uses a latent variable structure to deal with both these censored observations and the missing data for middle jobs.
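The observation scheme described above can be made concrete with a short sketch. It assumes, as a simplification, that the reported tenure of the first and current jobs reveals every year spent in those two spells and nothing else; the function name `observed_history` and the example paths are hypothetical and are not taken from the NIS.

```python
from typing import List, Optional


def observed_history(path: List[float]) -> List[Optional[float]]:
    """Mask a full yearly task-level path down to the first and current job spells.

    Years belonging to any job in between are returned as None (missing).
    Assumption: job tenure identifies exactly the years in the first spell
    and in the final spell.
    """
    n = len(path)
    if n == 0:
        return []

    # Length of the first job spell (consecutive years equal to the first value).
    first_len = 1
    while first_len < n and path[first_len] == path[0]:
        first_len += 1

    # Length of the current job spell (consecutive years equal to the last value).
    last_len = 1
    while last_len < n and path[n - 1 - last_len] == path[-1]:
        last_len += 1

    observed: List[Optional[float]] = [None] * n
    for t in range(first_len):
        observed[t] = path[t]
    for t in range(n - last_len, n):
        observed[t] = path[t]
    return observed


if __name__ == "__main__":
    # A worker who changes jobs in years 3, 4, and 5 (illustrative numbers).
    print(observed_history([0.33, 0.33, 0.40, 0.45, 0.50]))
    # Workers A and B from the text: only the mover has a missing period.
    print(observed_history([0.0, 0.0, 1.0]))  # worker A
    print(observed_history([0.0, 0.5, 1.0]))  # worker B
```

In this sketch the number of missing years grows with the number of intermediate jobs, which is the sense in which the missing data is non-random.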
We model $\pi$ as

\[ \text{Fired} \sim \text{Bernoulli}\left(s_i^F\right) \tag{3} \]

\[ \text{Offer} \sim \text{Bernoulli}\left(s_i^O\right) \tag{4} \]

\[ \pi_{jt}^{\text{Offer}} \sim K\left(s_i^\pi\right), \quad \operatorname{supp}(K) = [0, 1] \tag{5} \]

\[ \pi_{jt} = \begin{cases} \pi_{j(t-1)} & \text{if Fired} = 0,\ \text{Offer} = 0 \\ -1 & \text{if Fired} = 1,\ \text{Offer} = 0 \\ \pi_{jt}^{\text{Offer}} & \text{if Fired} = 1,\ \text{Offer} = 1 \\ \max\left\{\pi_{jt}^{\text{Offer}}, \pi_{j(t-1)}\right\} & \text{if Fired} = 0,\ \text{Offer} = 1. \end{cases} \tag{6} \]

This can be motivated by a simple search model: at the beginning of a period, a worker may get fired with probability $s_i^F$ and then may receive a job offer with probability $s_i^O$, as seen in equations (3) and (4). If she receives an offer (Offer = 1), she draws the productivity of the offer between 0 and 1 from some distribution $K$ conditional on her skills $s_i^\pi$ (equation 5). Lastly, there are four separate cases seen in equation (6), the occupational choice equation. In the first case, the worker neither got fired nor received a new job offer, so she remains at her previous job. In the second case, she got fired and did not receive a new offer, so she must be unemployed, which is indicated by the (arbitrary) notation $\pi = -1$. In the third case, she was both fired and received a new job offer right away and so takes that offer, and her new firm is $\pi^{\text{Offer}}$. In the fourth and final case, she did not get fired but did receive a new offer, so she chooses the more productive of the two jobs.

This model can be derived from an optimizing model of worker behavior where their reservation value is 0 and there are no dynamic effects of the current productivity level of the job. However, it is not essential for our results that these be considered deep structural parameters. Most importantly for our purposes, this model does an excellent job fitting the immigrant occupational career paths when we use cognitive tasks as proxies for the occupational productivities and pre-immigration demographics as proxies for immigrant skills. An example path generated by the model is shown in Figure II (a).
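The transition rule in equations (3) through (6) can be sketched as a small simulation. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the probabilities in the usage example are made up, and a uniform offer distribution stands in for $K$ purely as a placeholder.

```python
import random
from typing import Callable, List

UNEMPLOYED = -1.0  # the (arbitrary) code for unemployment in equation (6)


def next_productivity(prev: float, s_fire: float, s_offer: float,
                      draw_offer: Callable[[random.Random], float],
                      rng: random.Random) -> float:
    """One period of the occupational process, following equations (3) to (6)."""
    fired = rng.random() < s_fire    # equation (3)
    offer = rng.random() < s_offer   # equation (4)
    if not offer:
        # Cases 1 and 2: stay at the previous job, or become unemployed if fired.
        return UNEMPLOYED if fired else prev
    offered = draw_offer(rng)        # equation (5): draw from K on [0, 1]
    # Case 3: fired, so the offer is taken. Case 4: keep the better of the two.
    return offered if fired else max(offered, prev)


def simulate_path(initial: float, periods: int, s_fire: float, s_offer: float,
                  draw_offer: Callable[[random.Random], float],
                  rng: random.Random) -> List[float]:
    """Simulate `periods` years of occupational productivity starting at `initial`."""
    path = [initial]
    for _ in range(periods - 1):
        path.append(next_productivity(path[-1], s_fire, s_offer, draw_offer, rng))
    return path


def uniform_offer(rng: random.Random) -> float:
    """Placeholder offer distribution (an assumption, not the paper's K)."""
    return rng.random()


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative values only: 5% firing probability, 30% offer probability.
    print(simulate_path(0.33, 10, 0.05, 0.30, uniform_offer, rng))
```

Because any offer lies in [0, 1] and unemployment is coded as -1, the `max` in the fourth case also covers an unemployed worker who receives an offer: she always accepts it.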
Model-generated paths can match observed sample paths in terms of workers spending multiple periods in the same job, as well as both upwards and downwards occupational transitions and movements into and out of unemployment. Of course, given the data structure we would not be able to observe this example immigrant’s full career: the observed data we would see given this path is shown in Figure II (b).

While each individual will have these “jumpy” occupational paths, the average changes in occupation over time in the US for any given skill level and initial occupation are monotonic. Figure III shows smoothed versions of sample paths averaged over many workers and many simulations of each worker. Different lines correspond to different initial draws of job in the US (with a distribution given by the pdf on the left side of the figure). In the example, the “long run” occupation for the immigrant’s skills is 0.52. Immigrants who receive low initial job offers start off significantly lower on the occupational ladder than those who receive high offers, but over time in the US there is convergence in occupations.

5 Estimation

Given the model setup, estimation can be done in two steps. First, we estimate the wage parameters with a semi-parametric regression of wages onto skills and occupational productivity. Since we do not directly observe either of those variables, we assume pre-immigration characteristics and cognitive tasks are good proxies for these factors. In the second step we derive the likelihood function for the observed occupation choices. The likelihood is complicated by the fact that we do not observe entire worker histories but just the first and current jobs in the US and their durations. Simulated Maximum Likelihood allows us to get an estimator even with this missing data in a computationally straightforward way.
In this section we discuss the parametrization of the wage equation and the occupational transition process, while in the Appendices we derive the likelihood and show that our estimator is identified.

5.1 Parametrization

From the model section, the log wage equation is

$$w_{ijt} = w\left(h_{it}, \pi_{jt}\right) + \varepsilon_{ijt}. \tag{7}$$

Since we see neither human capital levels nor occupational productivity, we assume that $w$ can be written as a polynomial expansion of pre-immigration characteristics, US labor market experience, and occupational cognitive tasks,

$$w\left(h_{it}, \pi_{jt}\right) \equiv \text{PolynomialExpansion}\left(X_{it}, \tau_{ct}\right) \tag{8}$$

where $X_{it}$ are individual characteristics and $\tau_{ct}$ are the cognitive tasks performed in occupation $j$ chosen in time $t$. The vector $X_{it}$, which includes home occupation, legal and illegal experience, English skills, etc., is shown in Table IV.

For the occupational upgrading process, we assume that occupational productivity offers $\pi^{\text{Offer}}$ are drawn from the Kumaraswamy distribution, a computationally simpler variant of the Beta distribution. This allows for a variety of shapes for the offer distribution while still being bounded between 0 and 1. The Kumaraswamy distribution has two parameters, $a$ and $b$, but we restrict $a = 2$ for simplicity, which gives the one-parameter pdf

$$k(\pi) = 2 \cdot b\left(s_i^{\pi}\right) \cdot \pi \left(1 - \pi^2\right)^{b\left(s_i^{\pi}\right) - 1} \tag{9}$$

where we allow $b$ to depend on the immigrant's unobserved skill. Given this parametrization, as $b$ increases, the average occupational productivity offer the worker receives falls. We then make additional assumptions about the existence of a mapping from demographics to skills.
Specifically, we assume that both the skill that determines the probability of getting a job offer and the skill that determines the occupational offer distribution have single-index form in demographics:

$$s_i^O = \Phi\left(\gamma_0 + X_{it}^O \gamma\right) \tag{10}$$

$$s_i^{\pi} = \Phi\left(\psi_0 + X_{it}^{\pi} \psi\right) \tag{11}$$

for some $\gamma$ and $\psi$, where $X^O$ and $X^{\pi}$ can be different sets of covariates and $\Phi$ is the standard normal cdf, which ensures that the $s$ are between 0 and 1. To reduce the dimensionality, we also assume that the firing probability $s_i^F = \kappa$, a constant shared by all workers. In earlier versions of the estimation we allowed for more generality, such as allowing $s_i^F$ to vary across individuals, but none of the estimates on these parameters were statistically significant and the qualitative results were identical.

Lastly, we allow for a few more generalizations of the offer process. All the parameters may be different for the initial job offer in the US in order to allow for the possibility that skills which help an immigrant's initial placement in the US may not be determinative of her success after she arrives. Additionally, we let the entire offer distribution be different if the worker is working illegally in the US versus legally in a given period, since there is reason to believe the illegal labor market rewards significantly different skills than the legal market, or perhaps does not reward skills at all.

Given our parametrization, writing the likelihood of any given offer is quite simple: it is just $k(\pi)$. However, even with a full panel, we only observe accepted offers. Moreover, given our sample we do not even see accepted offers except in the first period and in the first period of the final job. Writing the full likelihood even with this simple individual period likelihood requires additional work, but is straightforward.
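The pieces of this parametrization can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' estimation code: the function names are hypothetical, the mapping from the skill index to b is not spelled out in this section, so b is passed in directly, and the inverse-cdf sampler and closed-form mean are standard properties of the Kumaraswamy distribution rather than results from the paper.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal cdf, the Phi of equations (10) and (11)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def single_index_skill(intercept, coefs, covariates):
    """Skill in (0, 1) from a linear index in demographics, as in (10)-(11)."""
    index = intercept + sum(c * x for c, x in zip(coefs, covariates))
    return std_normal_cdf(index)


def kumaraswamy_pdf(pi, b, a=2.0):
    """Offer density; with a = 2 this is equation (9). Evaluate for 0 < pi < 1."""
    return a * b * pi ** (a - 1.0) * (1.0 - pi ** a) ** (b - 1.0)


def kumaraswamy_draw(rng, b, a=2.0):
    """Inverse-cdf draw, using F(pi) = 1 - (1 - pi^a)^b."""
    u = rng.random()
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kumaraswamy_mean(b, a=2.0):
    """Closed-form mean b * B(1 + 1/a, b), written with gamma functions."""
    return b * math.gamma(1.0 + 1.0 / a) * math.gamma(b) / math.gamma(1.0 + 1.0 / a + b)
```

With a = 2 and b = 1 the density is 2π and the mean is 2/3; evaluating `kumaraswamy_mean` at larger b gives smaller values, consistent with the statement that a higher b lowers the average offer.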
In Appendix B, we derive the full likelihood function with the missing data and show how we constructed a Simulated Maximum Likelihood Estimator of the model parameters, and then in Appendix C we show that the offer distribution $k$ is actually non-parametrically identified (although we still use a parametric form in estimation for precision).

6 Estimation Results

6.1 Wage Equation

The log wage equation for immigrant $i$ in occupation $j$ in time $t$ is given by

$$w_{ijt} = \text{PolynomialExpansion}\left(X_{it}, \tau_{ct}\right) + \varepsilon_{ijt} \tag{12}$$

We have two wage observations for each person: the wages in their initial and current job in the US. For precision purposes we use a second degree polynomial expansion, so all cross-terms and squares were included along with direct effects. This approach necessarily makes for difficult-to-interpret estimates, since the marginal effect of any variable is a function of all other variables. Instead, results from a low-dimensional version of the wage equation including mainly direct effects can be seen in Table V, but we use the estimates from the full polynomial expansion in everything that follows.

This specification generates a distribution of marginal effects for demographics. For example, we allow for different effects of legal experience in the US and illegal experience to account for both potential differences in the skill backgrounds of workers who arrive and live in the US illegally as well as the possibility that they may receive less training or be more poorly matched with firms than legal immigrants. We find that the estimated average effect of a year of legal US labor market experience is about 50% higher than that of an illegal year (6% vs. 4%), which reflects both pre-existing skill differences as well as different experiences in the US. Figure IV (a) shows the estimated distribution of the returns to one year of legal experience and one year of illegal experience.
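To make concrete why a second-degree expansion produces a distribution of marginal effects rather than a single coefficient, here is a small sketch. The helper names and the toy coefficients are hypothetical, and the term ordering (constant, direct effects, then squares and cross-terms) is simply one convenient convention, not necessarily the one used in estimation.

```python
from itertools import combinations_with_replacement


def poly2_terms(x):
    """Constant, direct effects, then all squares and cross-terms of x."""
    terms = [1.0] + list(x)
    pairs = combinations_with_replacement(range(len(x)), 2)
    terms += [x[i] * x[j] for i, j in pairs]
    return terms


def predict(beta, x):
    """Fitted value: inner product of coefficients with the expanded terms."""
    return sum(b * t for b, t in zip(beta, poly2_terms(x)))


def marginal_effect(beta, x, j, h=1e-6):
    """Central-difference derivative of the fitted value with respect to x[j]."""
    up, down = list(x), list(x)
    up[j] += h
    down[j] -= h
    return (predict(beta, up) - predict(beta, down)) / (2.0 * h)


# Toy coefficients for w = x0 + 0.5 * x0 * x1, in the order
# [const, x0, x1, x0^2, x0*x1, x1^2]; purely illustrative values.
toy_beta = [0.0, 1.0, 0.0, 0.0, 0.5, 0.0]
```

With `toy_beta`, the marginal effect of the first covariate is 1 + 0.5·x1, so two workers with different values of the second covariate have different returns to the first. Evaluating `marginal_effect` at each sample member's covariates is one way a distribution such as the one in Figure IV could be traced out.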
We also find that visa status at entry plays an important role as a proxy for worker human capital. The dummy variable that equals 1 if the worker arrived on a work visa (e.g. the H1B visa category for skilled workers) is one of the best proxies for individual human capital, even conditional on education, and is significantly more important than the cognitive tasks of the home job or English skills. Figure IV (b) shows the estimated wage returns to switching from a non-employer-sponsored visa to employer-sponsored status. The graph shows both the gain to getting a sponsor for those without one, and (for ease of comparison) the absolute value of the loss from not having a sponsor for those who do have one. Presumably this reflects firm-specific human capital that the worker may already have for the firm that sponsored the visa, since workers with employer-sponsored visas are often already working in their home country for the particular firm that sponsors them.

6.2 Occupational Transition Parameters

We estimate multiple sets of the parameters relating immigrant demographics to the occupational transition process, given by equations (3), (4), (5), and (6). The parameters of the job offer rates and the rate of job loss, our measures of $s^O$ and $s^F$ from above, are shown in Table VI. Table VII reports the estimated parameters governing the shape of the job offer distribution as a function of characteristics. Almost none of the parameters have a direct economic interpretation, and, while many are statistically insignificant, even those that are large and statistically significant are typically interacted with many other independent variables. The primary reason we include Tables VI and VII is to emphasize that we estimate two sets of job offer rate parameters and four sets of job offer distribution parameters, depending on whether the immigrant is legal or illegal and whether they are in their first job in the US.
This necessarily hurts the statistical precision of some of our estimates, but since we depend heavily on observable demographics for identification, a flexible functional form allows for a good deal of heterogeneity in responses.

It is difficult to interpret the economic significance of the parameter estimates for the job offer process simply from the numbers themselves. Instead, we put them in some understandable units. To do this, we vary one characteristic at a time while holding all others constant and see how a change in that factor affects occupational outcomes over time. In Figure VI we show how four of the most important observable demographics serve as proxies for immigrant skills. In Figure VI, panels (a) and (b), education and English skills both proxy for skills relevant to entering the US (as seen by the higher intercept in the first period) and skills for finding better jobs over time (as shown by the steeper slope than the baseline occupations). In Figure VI (c) and (d), on the other hand, having an employer sponsored visa or a better home occupation is primarily a proxy for the quality of the job at entry and is not estimated to increase the occupational transition rate in the US over time.

Of particular interest here are the parameters governing the different job offer distributions of legal and illegal immigrants. In Figure VII, we show what the predicted occupation paths of immigrants who arrived and stayed in the US illegally would be if all their US experience were legal, keeping all other characteristics fixed. There are at least two potential interpretations of these results. First, even conditional on all other demographics, those who arrive illegally may have lower levels of skills. Second, some characteristics of the illegal labor market in the US may prevent them from moving up the occupational ladder as quickly as their legal immigrant peers.
We cannot distinguish between these two explanations, or any combination of them, in our setup, but to our knowledge this is some of the first evidence on the career paths of otherwise comparable legal and illegal immigrants to the US.

6.3 Model Fit

A comparison of the predicted occupational cognitive tasks between the model and the data is shown in Figure V (a). For each immigrant we have up to two occupation observations over potentially many years. To get their predicted outcomes, we simulated their whole model-predicted career path 100 times given their pre-immigration characteristics and compared the predicted values to the observed values for each year we have observations. Note we do not use any within-US data on the immigrants for the model predictions: instead the model generates both their initial job and career path based on pre-immigration characteristics.

The model fit is quite good, particularly because the average occupational cognitive tasks in the data is non-monotonic over time. Looking at the picture of the raw data may actually lead to skepticism that there is occupational upgrading for immigrant workers in the US: the average cognitive tasks path does not look anything like the example average careers simulated from the model in Figure III. In fact, it turns out that the observed double-humped cognitive tasks path over time is strongly consistent with a model of monotonic occupational upgrading. The model can fit the rapid growth of occupational tasks over the first 5 years in the US and the rapid decrease between 5 and 15 years, and gets mixed results in matching the slight (although not statistically significant) uptick in occupations from 15-25 years.

The observed dip in occupations after 5 years in the US is simply because of demographics: the same workers are not being compared across different time periods. For the current job in the US, workers have varied years of experience, depending on when they immigrated.
They also vary in the number of years of illegal labor market experience, given that people who have been in the US for a long period of time and just received a green card are more likely to have worked in the US as an illegal immigrant. We split the sample based on the number of years in the US, looking at immigrants with 10 or fewer years in the US versus those with more than 10. These two groups are quite different in terms of education, with 65% of the lower-experience group having attended college versus 42% for the high-experience immigrants. In addition, the high-experience immigrants have a greater share of years of experience as an illegal immigrant (50% of their US experience versus 11%).

We also allow for unobservable average differences in the skills of different birth cohorts in estimation, and it may be a concern that this is a cheap way of forcing Figure V (a) to match the dip in the data while the other observable demographics would not have been sufficient to match the shape of occupation levels over time. To test this, we simply show the model fit from the estimated parameters with the cohort effects set to 0 throughout the model; see Figure V (b). Re-estimating the model without cohort effects would deliver an even better fit, but this suffices to show that observable demographics that vary within-cohort are sufficient to deliver the non-monotonic average cognitive tasks seen in the data.

6.4 The Wage Gap

Wage assimilation is typically defined as the catch-up of immigrants to “comparable” natives; that is, natives with the same levels of education and labor force experience. We create wages for comparable natives for each immigrant in our sample using the Current Population Survey. We compute the average wages of native workers conditional on age, years of work experience, and education. We then can impute the “native” wage for each immigrant with the same levels of those variables using the conditional averages from the natives.
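The imputation step just described can be sketched as a cell-average lookup. This is only an illustration: the function names are hypothetical, the three toy native records are made-up numbers rather than CPS data, and how cells are defined and how empty cells are handled are assumptions not specified in the text.

```python
from collections import defaultdict


def conditional_means(natives):
    """Average native wage within each (age, experience, education) cell.

    `natives` is an iterable of (age, experience, education, wage) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for age, exper, educ, wage in natives:
        cell = sums[(age, exper, educ)]
        cell[0] += wage
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def impute_native_wage(means, age, exper, educ):
    """Comparable-native wage for one immigrant, or None if the cell is empty."""
    return means.get((age, exper, educ))


# Toy native records (hypothetical values, for illustration only).
toy_natives = [(30, 8, 1, 20.0), (30, 8, 1, 24.0), (30, 8, 0, 15.0)]
toy_means = conditional_means(toy_natives)
```

An immigrant aged 30 with 8 years of experience and education level 1 would be assigned `toy_means[(30, 8, 1)]`, the average of the two matching toy natives; the difference between that imputed wage and the immigrant's own wage is the individual-level wage gap.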
We purposefully do not control for occupation since it is not a part of the typical wage gap calculation. Table VIII shows the wage regression coefficients used to impute the comparable native wages for each immigrant. Once we have conditional native wages imputed for immigrants, we can calculate the difference between immigrant wages and comparable native wages.

Figure VIII shows the evolution of the wage gap with experience in the US. The top solid black line gives the average gap across the sample: natives earn about 35% more than immigrants on entry, and the gap falls to about 10% after 15 years in the US labor force. For the other lines, we calculate the wage gap for different subgroups, using the estimates from the model to predict their occupation choices and wages. While we use the model to “fill in” missing observations (we do not observe the full careers of most workers in these subgroups), this can just be considered data smoothing since we do not change any of the baseline parameters to form these wage gap measures.

The results for the subgroup wage assimilation simulation show that immigrant skills as proxied by demographic characteristics can dramatically change the measured wage gap. First, restricting attention to only workers who have the higher level of education effectively does not change the wage gap. This may seem surprising since education (of course) has a significant relationship with immigrant skills. But this is because education is one of the conditioning variables used to create comparable native wages. Looking at only educated immigrants means they are more highly skilled than the average; however, their comparable natives are also more highly skilled than average, leading to a small net effect. On the other hand, both employer sponsored visas and English skills play a significant role in immigrant wages, even conditioning on education levels.
The bottom two lines in Figure VIII show that by restricting attention to only workers with good English skills, the wage gap is still 20% at entry but falls to 0 after 7 years. Employer-sponsored visa immigrants start very close to natives (only a 10% wage gap), but the wage gap falls to 0 by 4 years of labor market experience and is negative after that. However, the speed at which the wage gap declines is similar for all the different skill subgroups. The interpretation of these results is that there is a significant amount of skill heterogeneity within immigrant education groups, but that heterogeneity seems to largely be in terms of wage levels rather than growth rates. From a policy perspective, this suggests that further restrictions on immigration based on finer measures of immigrant skills would be useful to reduce the measured average wage gap in levels but cannot make immigrants “like natives”: they will either have a wage gap at entry or their wages will overshoot natives later.

7 Quantifying the Role of Occupational Upgrading

The perfect counterfactual to quantify the importance of occupational upgrading for wage assimilation would be to determine how much occupational upgrading early in the career could be improved through labor market policies. However, this is an impossible object to estimate given our data; we do not have any idea what the effects of a program would be that helped immigrants find better jobs since there is no such variation in our data. Instead, our approach is to give an approximate upper bound for policy using the following thought experiment: what would happen if immigrants were placed right away into the occupation they would have after 25 years in the US labor market? We expect this to be higher than the true potential gains from policy, since immigrants may be gaining occupational-specific skills or knowledge that cannot be given instantaneously.
This counterfactual, placing immigrants in their long-term occupation immediately, is graphically shown by the top line in Figure IX (a). For comparison we also run a non-model based counterfactual in which workers never upgrade their occupation in the US in order to show how important observed occupational upgrading is for immigrant wage growth; this is the bottom line in Figure IX (a). Table IX gives the wages and wage gaps associated with the counterfactual occupational paths, and Figure IX (b) shows the wages graphically. The dashed line in the figure gives the average wages of comparable natives, while the other lines show immigrant baseline wages as well as counterfactual wages.

The results show that initially, immigrants earn around $11 per hour, while comparable natives earn around $18. Placed in their long-run occupation, immigrants would earn about $12.20, a reduction of the initial wage gap of around 20%. Of course, since the counterfactual immigrants will (on average) remain in that occupation throughout their career, their increase in wages relative to the baseline will decrease over time. After around 15 years the counterfactual and baseline wages effectively converge since immigrants will have found jobs very close to their long-run job even in the baseline. The overall earnings gap averaging over the 15 years after entry between immigrants and natives is 7% smaller in the counterfactual than the baseline.

A closer analysis of occupational upgrading reveals why the model does not consider it the primary driver of wage assimilation. Figure X, panels (a) and (b) show the counterfactual restricted to immigrants with high English skills and low English skills, respectively. The initial wage gap is closed by 54% in the counterfactual for those with good English skills, much larger than the average effect for the whole sample. Those with high English skills earn about 3% less than natives over the first 15 years of their career.
In the counterfactual when they are placed immediately at their long-run job, that earnings gap reverses and the immigrants earn about 1% more than the natives. On the other hand, for those with low English skills, the counterfactual shown in Figure X (b) reduces the quite large wage gap between those less skilled immigrants and natives by only 10% at entry, and the earnings gap over the first 15 years falls from 28% to 26%. This is because low-skilled immigrants see the least occupational growth in their careers, so putting them in their long-run career is barely better than doing nothing.

The same analysis holds to a lesser extent for home country occupation. Figure XI compares the effect of the counterfactual at the top 25% versus the bottom 25% of our sample in terms of home occupations. As in the English language case, higher skilled immigrants see a larger effect from the counterfactual: 20% of the initial wage gap is closed for the high home occupation immigrants but only 10% for the low home country occupation immigrants.

These results have important distributional consequences. Our model naturally generates a different prediction for the importance of occupational upgrading for each immigrant as a function of their characteristics. The distribution of these estimated effects is shown in Figure XII. To generate this graph, for each individual in the sample we simulated their career many times under the baseline and then under the counterfactual of being placed immediately in their long-run occupation. We then averaged within each worker to get an estimate of how much that particular worker's wage gap with their comparable natives is affected by occupational upgrading. For example, an individual with a result of $0 had the same average wage in the simulations whether or not she started in her long-run job.
The results from this exercise show that the distribution of effects is skewed: for most immigrants, the effects are fairly modest, but there is a tail of immigrants who would receive a significant wage boost through this counterfactual. However, as shown above in particular cases, this counterfactual primarily helps workers who are already high-skilled. Figure XII also shows the average entry wages from the NIS data of workers at different responsiveness levels to faster occupational upgrading. The higher the wage at entry, the more potentially helpful removing barriers to occupational upgrading would be. The average immigrant who is helped $0 by the counterfactual makes around $7 at entry, while the average immigrant who is helped $7 makes around $15 at entry.

While our model says nothing about welfare, knowledge of these heterogeneous effects of occupational upgrading could help policymakers make decisions about the types of policies they are interested in. If the policymaker wants to encourage high-skilled immigrants to get back to their full earning potential, focusing on barriers to occupational upgrading may make a significant difference. But if the goal is to help the worst-off immigrants, there does not seem to be much room in our counterfactual for significant effects of an occupational upgrading policy.

While we interpret these results as an upper bound for policy in our framework, policies could potentially move individuals above their baseline long-run job; for example, occupational-specific training for a high cognitive task occupation. While we cannot evaluate that counterfactual without additional information, the model setup can potentially be used to evaluate the effects of any policy that acts to change the occupational transitions of immigrants, as long as the distributions of potential changes are known.
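The "share of the wage gap closed" figures quoted in this section can be illustrated with a short calculation. The definition below, the counterfactual wage gain divided by the baseline gap with natives, is an assumption about how the percentage is computed; it agrees with the rounded entry-year and year-3 values reported in Table IX, but it is not taken from the authors' code and need not reproduce every reported figure from rounded inputs.

```python
def share_of_gap_closed(baseline_wage, counterfactual_wage, native_wage):
    """Fraction of the baseline immigrant-native wage gap closed by the counterfactual."""
    gap = native_wage - baseline_wage
    if gap == 0:
        raise ValueError("baseline wage gap is zero; the share is undefined")
    return (counterfactual_wage - baseline_wage) / gap


# Hourly wages from Table IX: (baseline, counterfactual, native).
entry_share = share_of_gap_closed(10.7, 12.2, 17.9)
year3_share = share_of_gap_closed(13.6, 14.4, 18.6)

print(round(100 * entry_share), round(100 * year3_share))  # prints: 21 16
```

The same function applied to each simulated worker's average baseline, counterfactual, and comparable-native wages would give a worker-level version of the effect summarized in Figure XII, under the same assumed definition.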
8 Conclusion

In this paper, we quantified the role of occupational upgrading in the wage assimilation process of immigrants to the US. To do this, we used panel data on the migration histories, labor market histories, and demographics of immigrants to the US from the New Immigrant Survey, which allowed us to see partial immigrant career paths. To deal with endogenous partial censoring of occupations (since only accepted job offers are observed) and missing data (since we only see the first and most recent job in the US), we created an econometric model of skills, job offers and occupational transitions, and established non-parametric identification of the functions linking worker demographics to the model's offer distributions. We then estimated the model on the NIS sample of immigrants and compared the model-predicted wages of immigrants under different occupational upgrading paths with wage data on comparable natives from the Current Population Survey.

The results indicate that the effects of occupational upgrading depend heavily on pre-immigration characteristics. If workers were moved into their long-term job immediately, the initial wage gap between immigrants and natives would fall by 20% and their earnings gap with natives over their first 15 years in the US would fall by 7%. We interpret this as the upper bound that policy interventions could achieve. The immigrants who are helped the most by faster occupational mobility are those who are higher earners in the baseline; e.g. those with high English skills or who started in better home country occupations see the highest gains from moving immediately to their long-term job.

Our results have implications for both US immigration policy and future research into wage assimilation.
For policy, interventions that specifically focus on increasing the skills of low-skilled immigrants may have better distributional consequences than interventions that help immigrants find the right jobs through job-to-job mobility. For future research, our results show that the higher-skilled the immigrant, the higher the estimated role of occupational upgrading in assimilation. Given that many data sets in the immigration literature (including ours) have some selectivity of the sample, this result emphasizes that the estimated effects of potential policies cannot simply be applied to immigrants of different skill levels.

References

Borjas, G. J. (1985). Assimilation, changes in cohort quality, and the earnings of immigrants. Journal of Labor Economics, 3(4):463–489.
Borjas, G. J. and Friedberg, R. M. (2009). Recent trends in the earnings of new immigrants to the United States. Working Paper.
Chiswick, B. (1978). The effect of Americanization on the earnings of foreign-born men. Journal of Political Economy, 86:897–921.
de Matos, A. D. (2011). The careers of immigrants. Working Paper.
Duleep, H. O. and Dowhan, D. J. (2002). Insights from longitudinal data on the earnings growth of U.S. foreign-born men. Demography, 39(3):485–506.
Eckstein, Z. and Weiss, Y. (2004). On the wage growth of immigrants: Israel, 1990–2000. Journal of the European Economic Association, 2(4):665–695.
Imai, S., Stacey, D., and Warman, C. (2011). From engineer to taxi driver? Occupational skills and the economic outcomes of immigrants. Working Paper.
Jasso, G., Massey, D. S., Rosenzweig, M. R., and Smith, J. P. (2006). The New Immigrant Survey 2003 Round 1 (NIS-2003-1) Public Release Data.
King, M., Ruggles, S., Alexander, J. T., Flood, S., Genadek, K., Schroeder, M. B., and Vick, B. T. R. (2010). Integrated Public Use Microdata Series - Current Population Survey: Version 3.0.
LaLonde, R. J. and Topel, R. H. (1992). Assimilation of immigrants in the U.S. labor market. In Borjas, G. J.
and Freeman, R. B., editors, Immigration and the Work Force. The University of Chicago Press.
Lubotsky, D. (2007). Chutes or ladders? A longitudinal analysis of immigrant earnings. Journal of Political Economy, 115(5):820–867.
Sanders, C. (2014). Skill accumulation, skill uncertainty, and occupational choice. Working Paper.
Sanders, C. and Taber, C. (2012). Life-cycle wage growth and heterogeneous human capital. Annual Review of Economics, 4:399–425.
Weiss, Y., Sauer, R. M., and Gotlibovski, M. (2003). Immigration, search, and the loss of skill. Journal of Labor Economics, 21(3):557–591.

Tables and Figures

Table I: Summary Statistics

Variable                                    Mean
Age                                         38.21
Percent male                                55.53%
Years living legally in the US              4.39
Years living illegally in the US            2.01
Percent with non-zero illegal experience    18.74%
Fraction that have an employer sponsor      28.37%
More than high school                       60.53%
High English skills                         33.28%
Sample Size                                 4,018

Table II: Determinants of Tasks of Jobs in the US

                                     (1) Initial job            (2) Current job
Cognitive tasks of home job          0.211∗∗∗ (0.0181)          0.0400∗∗ (0.0161)
Home cognitive skills * home GDP     0.00231∗∗∗ (0.000578)      0.000859∗ (0.000503)
Cognitive tasks of initial US job                               0.613∗∗∗ (0.0156)
Years of legal work experience                                  0.00653∗∗∗ (0.00191)
Legal years US squared                                          -0.000197∗ (0.000119)
Years of illegal work experience                                0.00317∗∗ (0.00150)
Illegal years squared                                           -0.000160∗∗ (0.0000768)
Employer sponsored visa              0.0969∗∗∗ (0.00587)        0.0336∗∗∗ (0.00537)
English skills                       0.0408∗∗∗ (0.00594)        0.0266∗∗∗ (0.00526)
More than 12 years education         0.0409∗∗∗ (0.00615)        0.0229∗∗∗ (0.00565)
Schooling in US                      0.0281∗∗∗ (0.00692)        0.0345∗∗∗ (0.00620)
Male                                 0.00105 (0.00505)          -0.00542 (0.00444)
Years experience at home             -0.000353 (0.000755)       0.000236 (0.000702)
Home experience squared              0.0000182 (0.0000224)      -0.0000152 (0.0000224)
Constant                             0.239∗∗∗ (0.0100)          0.108∗∗∗ (0.0101)
Observations                         3144                       2877
Adjusted R²                          0.265                      0.531

Notes: Standard errors in parentheses.
∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01

Table III: Task growth

                                     (1) Cognitive task growth
Cognitive tasks of home job          0.0317∗ (0.0162)
Home cognitive skills * home GDP     0.00102∗∗ (0.000504)
Cognitive tasks of initial US job    -0.375∗∗∗ (0.0158)
Years of legal work experience       0.00685∗∗∗ (0.00194)
Legal years US squared               -0.000223∗ (0.000121)
Years of illegal work experience     0.00316∗∗ (0.00155)
Illegal years squared                -0.000150∗ (0.0000774)
Employer sponsored visa              0.0334∗∗∗ (0.00541)
English skills                       0.0270∗∗∗ (0.00529)
More than 12 years education         0.0203∗∗∗ (0.00574)
Schooling in US                      0.0324∗∗∗ (0.00632)
Male                                 -0.00509 (0.00448)
Years experience at home             0.0000629 (0.000715)
Home experience squared              -0.00000996 (0.0000226)
Constant                             0.109∗∗∗ (0.0103)
Observations                         2769
Adjusted R²                          0.183

Notes: Standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01

Table IV: Characteristics in Wage Equation

Years legal experience
Years illegal experience
College education
Current legal/illegal status
Home occupation's cognitive tasks
Labor market experience in home country
Visa status
English skills
Home country GDP
Gender
Birth year

Table V: Low-Dimensional Wage Regression

                                     (1)
Cognitive tasks of job               13.50∗∗∗ (0.591)
Cognitive tasks of home job          4.109∗∗∗ (0.915)
Home cognitive skills * home GDP     -0.147∗∗ (0.0689)
Years of illegal work experience     0.416∗∗∗ (0.0741)
Illegal years squared                -0.00861∗∗ (0.00356)
Years of legal work experience       0.787∗∗∗ (0.0801)
Legal years US squared               -0.0295∗∗∗ (0.00439)
Legal US experience * home GDP       -0.00855∗ (0.00439)
Home GDP                             0.0878∗∗ (0.0384)
Employer sponsored visa              5.016∗∗∗ (0.211)
English skills                       2.707∗∗∗ (0.205)
More than 12 years education         1.234∗∗∗ (0.210)
Legal immigrant                      0.886∗∗∗ (0.316)
Years experience at home             -0.0396 (0.0257)
Home experience squared              0.000169 (0.000779)
Constant                             -0.0654 (0.586)
Observations                         5433

Notes: Standard errors in parentheses.
∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01

Table VI: Job Offer Rates and Loss Parameter Estimates

                                        Legal Immigrants    Illegal Immigrants
Constant Term                           -0.96 (0.24)        -0.61 (0.64)
College                                 0.14 (0.18)         0.26 (0.58)
Home Occupation                         -0.09 (0.78)        0.13 (2.23)
Home Occupation²                        1.40 (0.72)         0.54 (2.12)
College × Home Occupation               -0.11 (0.33)        -1.13 (0.95)
Sponsor                                 -0.06 (0.05)
English                                 0.10 (0.05)         0.26 (0.15)
Male                                    -0.06 (0.05)        -0.72 (0.11)
Cohort Effects                          -0.07 (0.02)        0.12 (0.06)
Home Country GDP                        0.38 (0.43)         -0.11 (1.51)
Home country occupation × Home GDP      -1.66 (0.78)        -0.10 (3.28)
Education × Home GDP                    0.17 (0.28)         0.005 (1.09)
Probability of job loss                 0.17

Notes: Standard errors in parentheses.

Table VII: Job offer distribution parameter estimates

Rows, in order: Constant term; Education; Home occupation; Home occupation squared; Education * home occupation; Sponsor; English; Cohort effects; Male; Home country GDP; Home country GDP * home occupation; Home country GDP * education.

Initial job, Legal: 2.44 (0.18); -0.48 (0.20); 0.07 (0.70); -1.56 (0.69); 0.18 (0.39); -0.39 (0.05); -0.19 (0.05); -0.10 (0.02); -0.06 (0.04); 0.07 (0.38); -0.06 (0.71); 0.08 (0.28)
Initial job, Illegal: 2.27 (0.81); -0.81 (0.62); -1.01 (2.56); 0.70 (2.18); 1.02 (0.94); 0.14 (0.17); -0.06 (0.07); -0.18 (0.13); -0.015 (1.66); -0.084 (3.13); 0.17 (1.18)
Current job, Legal: 2.97 (0.18); -0.36 (0.18); -1.42 (0.62); 0.40 (0.58); 0.46 (0.32); -0.35 (0.04); -0.40 (0.03); -0.07 (0.02); -0.02 (0.04); -0.24 (0.24); -0.19 (0.46); -0.26 (0.19)
Current job, Illegal: 2.17 (0.43); 0.11 (0.40); 0.58 (1.40); 0.20 (1.20); -1.03 (0.72); -0.13 (0.14); 0.15 (0.04); -0.43 (0.09); 0.005 (1.01); -0.010 (2.45); 0.005 (0.61)

Notes: Standard errors in parentheses.

Table VIII: CPS Wage Regressions

Dependent variable = wage
Education                   7.14 (0.08)
Years experience            0.70 (0.01)
Experience squared          -0.012 (0.0003)
Constant                    6.13 (0.12)
Number of observations      70,572
R-squared                   0.14

Notes: Standard errors in parentheses.
Table IX: Counterfactual: Originally Placed in Long Run Occupation

Years                          0        3        6        9
Average Wages                  10.7     13.6     15.6     16.8
Counterfactual Wages           12.2     14.4     16.1     17.1
Native Wages                   17.9     18.6     19.1     19.4
% Decrease in Wage Gap         21%      16%      11%      9%

Figure I: Home Country Occupation and US Occupations
Figure II: Model Example. (a) Example Career Path; (b) Observed Data from Example Career
Figure III: Effect of initial job
Figure IV: Estimated Population Distributions of Wage Returns. (a) Returns to Experience; (b) Returns to Employer Sponsor
Figure V: Model fit: occupations. (a) Model Fit; (b) Model Fit, Cohort Effects Set to 0
Figure VI: Effects of demographic characteristics on occupational outcomes. (a) Education; (b) English; (c) Employer sponsored visa; (d) Home occupation
Figure VII: Effects of illegal labor market experience on occupational outcomes
Figure VIII: The Wage Gap
Figure IX: Effects of occupational mobility. (a) Occupations; (b) Wages
Figure X: Effects of occupational mobility: English Skills. (a) High English Skills; (b) Low English Skills
Figure XI: Effects of occupational mobility: Home Country Occupation. (a) High Home Country Occupation; (b) Low Home Country Occupation
Figure XII: Heterogeneous Returns to Moving to Long-Run Job Immediately

For Online Publication:

A Forming the O*NET Task Measures

This process is identical to that used in Sanders (2014), but is included here for completeness. The following questions were taken from the “Work Activities” section of the O*NET survey.
Workers were asked to rate the importance of a broad set of activities in their job, from which we choose the activities most related to cognitive skills:

• Getting Information
• Processing Information
• Analyzing Data or Information
• Making Decisions and Solving Problems
• Thinking Creatively
• Updating and Using Relevant Knowledge
• Developing Objectives and Strategies
• Organizing, Planning, and Prioritizing Work

We then took workers and occupational choices from the National Longitudinal Survey of Youth 1979. We linearly projected wages onto worker characteristics and survey responses for their chosen occupation in each year using the regression

$$
\begin{aligned}
E\left[\text{Real Wage}_{it} \mid \text{Covariates}_{it}\right] = {} & \beta_{0i} + \beta_1 \text{Urban/Rural Dummy}_{it} + \beta_2 t + \beta_3 t^2 \\
& + \beta_4 \text{Labor Market Experience}_{it} + \beta_5 \text{Labor Market Experience}^2_{it} \\
& + \beta_6 \text{Num. Children}_{it} + \beta_7 \text{Marital Status}_{it} + \beta_C \cdot \text{Task Responses}_{it}
\end{aligned}
$$

The regression is estimated using fixed effects, which allows for individual-specific intercept terms (the $\beta_{0i}$). We then formed the scores for each occupation using the estimated coefficients from the wage equation,

$$\hat{\tau}_{Cj} = \hat{\beta}_C \cdot \text{Cognitive Task Responses in Occupation } j$$

and normalized each score between 0 and 1 to get the final $\tau_C$ for each occupation. We constructed manual and interpersonal task scores similarly using responses to other task questions. All these regressions and scores are available upon request.

B The Occupation Transition Process: SMLE Estimation

To derive the likelihood, it is easiest to begin by writing the likelihood of the occupational history assuming we observed all jobs. We then deal with the missing data by integrating out the data we do not observe using Simulated Maximum Likelihood. Assume for one individual (suppressing $i$ and $s$ notation) we see their occupation in each period.
Denoting the whole path of occupations for a worker from time 0 (labor market entry in the US) to $T$ (time of the survey), and using the Markov structure of the model, we have the likelihood of the path $\pi_0 \to \pi_1 \to \pi_2 \ldots$

$$L(\pi_0, \pi_1, \ldots, \pi_T) = l_0(\pi_0)\, l(\pi_1 \mid \pi_0)\, l(\pi_2 \mid \pi_1) \cdots l(\pi_T \mid \pi_{T-1}) \tag{13}$$

where $l_0$ is the likelihood of the initial observation and $l$ is the conditional density of observing a worker in occupation $\pi_t$ as a function of the previous occupation. We treat unemployment as its own job with some arbitrary value of $\pi$ below the lower bound of the job offer distribution. The support of the job offer distribution is $[0, 1]$, and we let $\pi_t = -1$ if the worker is unemployed. This simply allows us to unify the notation of employed and unemployed states, since unemployed workers will always accept an offer as all offers have higher $\pi$ than unemployment.

The first component of the likelihood is the initial job offer. We assume each worker gets a job offer in the first period, so the likelihood is just the density of the time 0 offer pdf at the observed occupation:

$$l_0(\pi_0) = k_0(\pi_0).$$

The later conditional likelihoods can be calculated from the occupational transition equation, given in equation (6). The model breaks down the likelihood of transitioning between occupations $\pi_{t-1}$ and $\pi_t$, $l(\pi_t \mid \pi_{t-1})$, into five different cases depending on whether the worker moves to a higher productivity firm, a lower productivity firm, unemployment, etc.:

1. A worker is employed at time $t - 1$ but unemployed at $t$. In this case, the worker must have gotten fired, which happens with probability $s^F$. We also know that they did not get a new job offer in this period, since all job offers are accepted when unemployed. The probability that a worker does not get a job offer is $(1 - s^O)$. The likelihood in this case is
$$l(\pi_t = -1 \mid \pi_{t-1} \neq -1) = s^F \cdot (1 - s^O).$$

2. A worker is unemployed both at the end of the last period and at the end of the current one.
This worker must not have received an offer in period $t$, so the likelihood is
$$l(\pi_t = -1 \mid \pi_{t-1} = -1) = 1 - s^O.$$

3. A worker moves to a higher productivity firm. This case includes when a worker moves to a firm from unemployment. If we see a worker move to a higher productivity firm, we know that he got a job offer and we know exactly what the offer was. In this case, it does not matter whether or not the worker was fired, since all that is relevant is that he received a higher job offer, which he would accept regardless of whether or not he was fired. The likelihood in this case is the probability of receiving a job offer multiplied by the likelihood of receiving the specific offer that is observed:
$$l(\pi_t > \pi_{t-1} \mid \pi_{t-1}) = s^O \cdot k(\pi_t).$$

4. A worker moves to a lower productivity firm but is not unemployed. This worker must have been fired, otherwise he would not have left his previous higher-productivity job. We also know that he received a job offer at productivity level $\pi_t$, so the likelihood is
$$l(-1 < \pi_t < \pi_{t-1} \mid \pi_{t-1}) = s^F \cdot s^O \cdot k(\pi_t).$$

5. A worker stays at the same job as in the previous period. We know that he was not fired, since the probability of getting a new offer at the same job is 0. The likelihood of not getting fired is $(1 - s^F)$. He then either did not get a new offer, which happens with probability $1 - s^O$, or he got an offer for a lower productivity job. The probability of getting an offer is $s^O$ and the probability of it being lower than his current job is the cdf of the offer distribution $K(\pi_{t-1})$. The likelihood is then
$$l(\pi_t = \pi_{t-1} \mid \pi_{t-1} \neq -1) = (1 - s^F) \cdot \left(\left[1 - s^O\right] + s^O \cdot K(\pi_{t-1})\right).$$

Unfortunately, the data does not contain a full record of occupations each year. Instead, we have the initial and current jobs $\pi_0$ and $\pi_T$, and the durations (in years) of both jobs, $d_0$ and $d_T$ respectively. Denote the first period of the final job by $K = T - d_T$.
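The five transition cases can be written down compactly in code. The following is a minimal sketch rather than the estimation code used in the paper: the function names are hypothetical, the Kumaraswamy shape values `a` and `b` are illustrative and not estimates, and the initial-offer density $k_0$ is set equal to $k$ purely for brevity.

```python
UNEMPLOYED = -1.0  # the paper's convention: pi = -1 marks unemployment


def offer_pdf(x, a=2.0, b=2.0):
    """Kumaraswamy density k(x) on [0, 1]; a and b are illustrative values."""
    return a * b * x ** (a - 1) * (1 - x ** a) ** (b - 1)


def offer_cdf(x, a=2.0, b=2.0):
    """Kumaraswamy cdf K(x) on [0, 1]."""
    return 1 - (1 - x ** a) ** b


def transition_likelihood(pi_now, pi_prev, s_o, s_f):
    """l(pi_t | pi_{t-1}) for offer rate s_o and job-loss rate s_f."""
    if pi_now == UNEMPLOYED:
        if pi_prev == UNEMPLOYED:
            return 1 - s_o                      # case 2: no offer arrived
        return s_f * (1 - s_o)                  # case 1: fired, no offer
    if pi_now > pi_prev:
        return s_o * offer_pdf(pi_now)          # case 3: better offer (or exit from unemployment)
    if pi_now < pi_prev:
        return s_f * s_o * offer_pdf(pi_now)    # case 4: fired, took a lower offer
    # case 5: not fired, and either no offer or an offer below the current job
    return (1 - s_f) * ((1 - s_o) + s_o * offer_cdf(pi_prev))


def path_likelihood(path, s_o, s_f):
    """Equation (13) for a fully observed path pi_0, ..., pi_T."""
    value = offer_pdf(path[0])  # l_0; assumes k_0 = k, which the paper does not impose
    for prev, now in zip(path, path[1:]):
        value *= transition_likelihood(now, prev, s_o, s_f)
    return value
```

For an employed worker, the probabilities of cases 1 and 5 plus the integrals of cases 3 and 4 over the relevant offer ranges sum to one, which is a quick way to check that the five cases are exhaustive.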
The likelihood of the observed data can be written as

$$L(\pi_0, \pi_T, d_0, d_T) = l_0(\pi_0) \cdot l(\pi_t = \pi_0 \mid \pi_{t-1} = \pi_0)^{d_0 - 1} \times \Pr\left(\pi_{d_0} \neq \pi_{d_0 + 1}\right) \times l(\pi_K \mid \pi_0, d_0) \cdot l(\pi_t = \pi_T \mid \pi_{t-1} = \pi_T)^{d_T - 1}. \tag{14}$$

This reads as: in the first period the worker received the offer $\pi_0$ and kept it for $d_0 - 1$ periods. In the period after that, we know they moved jobs. The individual then receives their offer in period $K$ for their final job and stays there for $d_T - 1$ periods without leaving. Without the term $l(\pi_K \mid \pi_0, d_0)$, the first period of the final job, this would be straightforward to calculate using the conditional likelihoods from above. However, $l(\pi_K \mid \pi_0, d_0)$ requires calculating the conditional distribution of observing some job $\pi$ in period $K$ as a function of the last observed job $\pi_0$ and duration at the job $d_0$, while missing data on job transitions from periods $d_0$ to $K$. Direct calculation of this requires evaluating a $K - d_0$ dimensional integral for each individual for each likelihood evaluation.

Instead of direct computation, we use a simulation-based estimation method. The Simulated Maximum Likelihood (SMLE) estimator begins by writing down the likelihood as if we observed the entire occupational history and then integrating out the missing data directly. We can transform the full likelihood into the observed likelihood by integrating out over all the missing middle jobs. The SMLE method notes that this integral can be written as $E_{\pi_{K-1}, \ldots, \pi_{d_0+1} \mid \pi_{d_0}}\left[l(\pi_K \mid \pi_{K-1})\right]$, the expected value of the conditional likelihood for period $K$ with the expectation taken over the possible paths that led to $\pi_K$. Since our model is cheap to simulate, it is easy to start the model at $d_0$ with current job $\pi_0$ and simulate the job path forward until period $K$. Doing this $S$ times for each individual’s data, we can then calculate the value $l\left(\pi_K \mid \pi^S_{K-1}\right)$ for each data point $\pi_K$ combined with the simulated job in period $K - 1$, $\pi^S_{K-1}$.
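The simulation step described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the function names are hypothetical, the Kumaraswamy shape values `A` and `B` and the rates used in any call are made-up, and the sketch ignores the additional conditioning on the worker leaving $\pi_0$ right after period $d_0$. The argument `n_missing` stands for the number of unobserved transitions between the last period at the first job and period $K - 1$.

```python
import random

UNEMPLOYED = -1.0
A, B = 2.0, 2.0  # illustrative Kumaraswamy shape values, not estimates


def offer_pdf(x):
    return A * B * x ** (A - 1) * (1 - x ** A) ** (B - 1)


def offer_cdf(x):
    return 1 - (1 - x ** A) ** B


def draw_offer(rng):
    """Inverse-cdf draw from the Kumaraswamy offer distribution."""
    u = rng.random()
    return (1 - (1 - u) ** (1 / B)) ** (1 / A)


def simulate_step(pi_prev, s_o, s_f, rng):
    """One period of the job ladder: possible job loss, then a possible offer."""
    fired = pi_prev != UNEMPLOYED and rng.random() < s_f
    offer = draw_offer(rng) if rng.random() < s_o else None
    if offer is not None and (fired or offer > pi_prev):
        return offer          # accept: better offer, or any offer after job loss
    if fired:
        return UNEMPLOYED     # lost the job and no offer arrived
    return pi_prev            # stay (also covers remaining unemployed)


def transition_likelihood(pi_now, pi_prev, s_o, s_f):
    """The five cases of l(pi_t | pi_{t-1})."""
    if pi_now == UNEMPLOYED:
        return (1 - s_o) if pi_prev == UNEMPLOYED else s_f * (1 - s_o)
    if pi_now > pi_prev:
        return s_o * offer_pdf(pi_now)
    if pi_now < pi_prev:
        return s_f * s_o * offer_pdf(pi_now)
    return (1 - s_f) * ((1 - s_o) + s_o * offer_cdf(pi_prev))


def simulated_likelihood(pi_k, pi_0, n_missing, s_o, s_f, n_sims=2000, seed=0):
    """Average l(pi_K | simulated pi_{K-1}) over n_sims simulated paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        pi = pi_0
        for _ in range(n_missing):          # fill in the unobserved periods
            pi = simulate_step(pi, s_o, s_f, rng)
        total += transition_likelihood(pi_k, pi, s_o, s_f)
    return total / n_sims
```

With `n_missing = 0` the average collapses to the exact one-step likelihood, which gives a simple check on the simulator. Fixing the seed keeps the simulated likelihood a deterministic function of the parameters, which is the usual reason for holding draws fixed inside an optimizer.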
We know that, as long as $\pi^S_{K-1}$ is drawn from the correct conditional distribution, as $S \to \infty$

$$\frac{1}{S} \sum_{i=1}^{S} l\left(\pi_K \mid \pi^S_{K-1}\right) \to_p E_{\pi_{K-1}, \ldots, \pi_{d_0+1} \mid \pi_{d_0}}\left[l(\pi_K \mid \pi_{K-1})\right]. \tag{15}$$

Using this, the SMLE estimator maximizes the calculated likelihood for the observed data points combined with the simulations used to eliminate the missing data problem.

C Identification

Our econometric model of occupational transitions relates individual demographic characteristics $X_i$ to occupational productivity outcomes $\pi_{jt}$ through equations (3), (4), (5) and (6), where we assumed that the offer distribution $K(\cdot)$ is a Kumaraswamy distribution with a parameter $s^\pi$ which has a single-index form in demographics. We have not yet shown that we can actually recover these parameters from our data. In this section, we do more than that: we show that we can non-parametrically identify the offer distribution $K(\cdot)$. Our sample is too small to use non-parametric estimators, but in estimation we used functional forms that were as flexible as possible, and in principle we could use increasingly flexible functional forms as the amount of data increased.

First we show that the job offer and firing rates are identified. For this section we suppress the observable demographics $X_i$; we can repeat the argument for any given $X_i$. Consider a worker who is at job $\pi_0$ in the initial period. We will observe them at the same job next period only if they did not lose their job and, if they received an offer, it was lower than $\pi_0$. As in case 5 of the transition likelihood, the probability of this event is

$$\Pr(\pi_1 = \pi_0) = \left(1 - s^F\right)\left(\left(1 - s^O\right) + s^O K(\pi_0)\right). \tag{16}$$

Now consider workers who have $\pi_0 = 1$, that is, the workers with the best jobs. The probability of them getting an offer lower than 1 is 1, so $K(1) = 1$ and this reduces to

$$\Pr(\pi_1 = \pi_0 \mid \pi_0 = 1) = 1 - s^F.$$

This directly identifies $s^F$, the probability of job loss. Intuitively, we have data about how long it takes a worker to switch jobs, as well as a ranking of jobs. If we look at individuals only in the highest type of jobs, the only model mechanism for leaving this job is job loss, since they will rarely get a better offer to make a job-to-job move.

Once we have identified the probability of job loss $s^F$, we can use a similar argument to recover the probability of a job offer $s^O$. Consider workers at $\pi_0 = 0$, that is, the workers with the worst jobs. Since we know the probability of an offer above 0 is 1, $K(0) = 0$ and the probability of not moving jobs is

$$\Pr(\pi_1 = \pi_0 \mid \pi_0 = 0) = \left(1 - s^F\right) \cdot \left(1 - s^O\right).$$

Since we already know $s^F$, this probability gives us $s^O$. As above, if we look at individuals only in the worst type of jobs who did not lose their jobs, the only model mechanism for moving up is receiving an outside offer, so we know all upward moves come with an offer and every time they stay in their job there was not an offer. We are able to identify the relative frequency of job offers versus the offer distribution, unlike in many versions of search models, because we assume we have data on rankings of jobs, so we can ex ante identify workers who are either unlikely or very likely to receive better offers.

Lastly, once we know $s^O$ and $s^F$, solving for $K(\pi_0)$ in equation (16) gives

$$K(\pi_0) = \frac{\Pr(\pi_1 = \pi_0)}{s^O \left(1 - s^F\right)} - \frac{1 - s^O}{s^O}. \tag{17}$$

The right hand side is simply data ($\Pr(\pi_1 = \pi_0)$) and known parameters. Given an original job, we can now remove the correct proportion of workers who had either been fired or not received an offer and then use the proportion of remaining workers who did not move to identify the probability of getting an offer below that job. As long as workers are observed at every possible job in some period (which will be true given the model setup), the full distribution of $K$ can be traced by varying $\pi_0$ in equation (17).
For this identification argument we only required a limited part of the data: the type of the first job and one observation in period 1 of whether the individual remained in that job or not. The duration of the first job, the type of the final job, and the duration of the final job are all not strictly required for identification but increase the power of our estimators. Since the actual cross-section of workers is relatively small, the additional power of knowing the first and final job durations helps significantly for getting a reasonable amount of precision.
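The first step of the identification argument, recovering the job loss probability from workers who start in the best job, can be illustrated with a small Monte Carlo exercise. This is a sketch under stated assumptions, not a replication: the function names are hypothetical, the offer rate 0.3 and the Kumaraswamy shape values are made-up, and the job loss rate 0.17 is borrowed from Table VI only as a convenient true value to recover.

```python
import random

UNEMPLOYED = -1.0
A, B = 2.0, 2.0  # illustrative Kumaraswamy shape values, not estimates


def draw_offer(rng):
    """Inverse-cdf draw from the Kumaraswamy offer distribution on [0, 1)."""
    u = rng.random()
    return (1 - (1 - u) ** (1 / B)) ** (1 / A)


def next_job(pi_prev, s_o, s_f, rng):
    """One period of the model: possible job loss, then a possible offer."""
    fired = pi_prev != UNEMPLOYED and rng.random() < s_f
    offer = draw_offer(rng) if rng.random() < s_o else None
    if offer is not None and (fired or offer > pi_prev):
        return offer
    if fired:
        return UNEMPLOYED
    return pi_prev


def estimate_job_loss(n_workers=100_000, s_o=0.3, s_f=0.17, seed=0):
    """Share of top-job workers (pi_0 = 1) who are not in the same job next period."""
    rng = random.Random(seed)
    stayed = sum(next_job(1.0, s_o, s_f, rng) == 1.0 for _ in range(n_workers))
    return 1 - stayed / n_workers


if __name__ == "__main__":
    print(round(estimate_job_loss(), 3))
```

Because no offer can exceed $\pi_0 = 1$, every departure from the top job in the simulation is a job loss, so the leaving rate estimates $s^F$ regardless of the offer rate or the shape of the offer distribution.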