Who Quits Next? Firm Growth in Growing Economies

Julieta Caunedo, Cornell University
Emircan Yurdagul∗, Universidad Carlos III

June 2015

Abstract

This paper provides a theory linking characteristics of firm dynamics to the nature of aggregate growth in an economy. We analyze firms' life-cycle productivity, age-employment profiles, and firm selection across countries. Using a large cross-country dataset, we document (i) more frequent labor productivity growth for firms operating in fast growing economies, (ii) younger firms in faster growing economies, (iii) lack of a systematic relationship between the tail of the employment size distribution and growth, and (iv) steeper age-employment profiles in slow growing economies. We build a tractable general equilibrium model that displays endogenous long run growth compatible with a stationary size distribution and the documented empirical facts. We explore the role of the (i) intensive and (ii) extensive margins of firms' investment for economic growth. We show that disparities in the extensive margin of firm growth, the number of firms successfully innovating in a period of time, account for about a third of cross-country growth disparity. The extensive margin is tightly related to the age distribution of firms observed in an economy. The intensive margin of firm growth is closely related to rank reversals in firm productivity growth but not necessarily aggregate economic growth.

Keywords: Productivity, Firm Selection, Economic Growth. JEL codes: E23, O4.

∗ Contact: julieta.caunedo@cornell.edu or yurdagul@wustl.edu

1 Introduction

Entrepreneurs, in braving uncertainty, face the risk of being replaced if their ventures are not sufficiently successful given the performance of other firms in the market. Selection and return uncertainty are key in determining the characteristics of firms operating in an economy and aggregate economic growth.
Economies where (i) firms invest little or (ii) few firms invest in productivity may grow more slowly not only because the levels of investment are low, but also because the process of firm selection in the market slows down.

This paper explores the role of the (i) intensive and (ii) extensive margins of firms' investment for economic growth. It provides a theory that links aggregate growth to the characteristics of firm dynamics. In particular, we argue that patterns of firm dynamics, such as characteristics of the age distribution of firms and cross sectional employment age profiles, hold information about the nature of economic growth. Using a cross country dataset we document the relationship between growth and features of industry dynamics. We build a stylized general equilibrium model of industry dynamics with endogenous firm investment and selection (entry and exit) consistent with those patterns. We show that disparities in the extensive margin of firm growth, the number of firms successfully innovating in a period of time, account for about a third of cross-country growth disparity. The extensive margin is tightly related to the age distribution of firms observed in an economy. The intensive margin of firm growth is closely related to rank reversals in firm productivity growth but not necessarily aggregate economic growth. The intensive margin of firm growth relates to employment-age profiles across firms, i.e. average employment per firm age.

Firm investment behavior is key in understanding the nature of economic growth. When investing, firms face two different layers of uncertainty: (a) whether a given outcome is realized (e.g. the introduction of a new product to the market); and, conditional on success, (b) the return to any given investment (e.g. the profits associated with the new product in the market). Oftentimes, ventures that entail high uncertainty in both dimensions, such as those associated with disruptive technologies, are the ones pushing growth the most.¹
We argue that disparities in the probability with which firms turn their investments into actual labor productivity growth (probability of success hereafter), and uncertainty on returns if successful, can explain observed differences in firm employment growth, age distribution, selection and, ultimately, aggregate growth.

We start by describing cross-country patterns of aggregate growth and industry dynamics²: (i) establishments' labor productivity growth (productivity growth hereafter), (ii) firm size and employment distribution, and (iii) firm size and age in the cross-section. First, the frequency of firm level productivity growth is higher in countries with a higher growth rate of GDP per capita. Second, firms are younger in faster growing economies. Third, the share of firms that are large, employment-wise, does not vary systematically with aggregate economic growth.³ Finally, there is a non-monotonic relationship between the steepness of the age-employment profiles and aggregate growth rates. In particular, we show that the cross sectional employment age profile in slow growing economies can, on average, be steeper than that in faster growing economies.

The first fact is to be expected if aggregate growth is connected to the activity of the firms operating in the market. However, it is not tautological. It is possible for a larger share of firms to be growing at relatively low rates, which would induce slow rather than fast aggregate growth. The second empirical regularity is explained by more severe selection in faster growing economies (higher firm turnover via entry and exit). The last two are less understood and important in describing the nature of aggregate growth.

The presence of large firms (employment-wise) in a frictionless economy is related to their productivity relative to the average in the economy.⁴ If there are economies of scale, and large firms are relatively more productive, they could function as an engine for growth. Our empirical evidence and previous literature suggest otherwise. The literature finds that small young firms account for a large portion of productivity gains in an economy (Haltiwanger et al., 2010; Eslava & Haltiwanger, 2012). Such a finding is consistent with the lack of correlation that we find in our data. In terms of mechanisms that could generate the lack of correlation, in this paper we point out that it is necessary to distinguish between those that generate rank reversals in the firm productivity space (and hence employment) and those that do not, but may improve average productivity.

¹ See Christensen (1997) for a detailed description of disruptive technologies.
² We use the 2006-2011 Enterprise Survey data by the World Bank.
³ Formally, the estimated tail of the employment distribution (linear in log-log space) does not vary systematically with growth.
⁴ In an economy with frictions such as Restuccia and Rogerson (2008), the size of the firm may or may not be correlated with its productivity. Employment would be determined by a combination of productivity and distortions.

Steep age-employment profiles in slow growth economies are at odds with a pure selection theory. Such a theory predicts a positive correlation between selection and the speed of growth both at the firm level and in the aggregate. Two issues arise. First, the data analysis is cross sectional, so these profiles cannot necessarily be interpreted as firm employment growth paths, which would require longitudinal data. Instead, they reflect the fact that in slow growing economies, relatively large older firms coexist with small young ones. Second, aggregate growth in the economy is related to the pace of average productivity improvements, whereas firm employment growth reflects firm productivity improvements relative to the mean. There are two types of slow growing economies: very unproductive poor ones, and very productive rich ones. In the former, steep employment age profiles are consistent with weak selection, which allows small unproductive firms to survive and coexist with large slow growth firms. In the latter, this fact is related to strong selection: young small firms enter and exit the market but sometimes experience high productivity growth relative to the mean. If those successful firms are relatively few, average output growth is low.

In order to understand the relationship between the probability of success, return uncertainty and the highlighted empirical facts, we build a tractable model with endogenous firm selection. In the model, firms are allowed to invest to grow. In our benchmark, whether investment returns are realized or not is uncertain. Towards the end of the paper, we allow further uncertainty in returns, consistent with productivity rank reversals.

In the model, firms can be innovative and attempt to improve productivity, or alternatively, they can operate with a constant productivity technology. At the beginning of each period, innovative firms decide whether or not to invest in technology improvements and how much to produce in the current period. If they invest, they pay a cost that is a non-linear function of their investment and get returns one period ahead with some probability. The probability of success is the same for all the firms in the market, but the realizations of course need not be. Innovative firms can also decide to liquidate, at which point their technology is taken over by a non-innovative firm. Non-innovative firms decide whether to stay in the market, produce and pay operating costs, or exit.

The study of a general equilibrium economy is critical for our purposes for two reasons. First, in an economy with a higher probability of positive returns to investment, firms' productivity growth would be faster, which implies that relatively unsuccessful firms exit more quickly (strong selection).
Hence, growth in this economy will be higher than the level predicted by a simple partial equilibrium model where selection is not accounted for. Second, suppose we compare two economies, one with a high probability of success and one with a low probability of success. As long as returns to investment are common across firms operating in an economy (possibly proportional to productivity), the allocation of employment and sales among firms may not differ between the low and high probability of success economies. This helps our model replicate the absence of a systematic relationship between growth and the allocation of employment across firms.

In our paper, the key fundamental difference across economies is the probability of realizing returns to investment. Uncertainty about such realization relates to political instability, changes in tax regimes, and changes in the terms of trade. Such uncertainty has a direct impact on the life-cycle of firms through the frequency of growth episodes. The probability of success also alters the expected return per dollar invested, and hence the size of those investments in equilibrium. Together, these two channels generate firm productivity growth profiles that differ across economies.

We identify the probability of success in the data via the law of large numbers. We argue that at the top of the employment distribution, the probability of success is proxied by the number of firms that successfully increase their productivity in a year. Preliminary computations show that variations in the probability of success have a quantitative impact on both the rate of investment at the firm level and the aggregate growth of the economy. Moreover, they determine the share of firms investing in productivity growth, as well as the equilibrium measure of firms operating in the market.
The existence of a balanced growth path poses tight restrictions on the equilibrium productivity growth of firms operating in the market. Without return uncertainty conditional on investment success, there is no rank reversal in equilibrium, which in turn implies no employment growth. Intuitively, firms' productivity grows at a constant rate along the balanced growth path. If the firm is successful, it grows at everybody's rate. Hence, its employment (which depends on its productivity relative to the average) does not change. If instead the firm is unsuccessful, employment shrinks, as it is now relatively less productive than before.

When we augment our benchmark economy to allow for return uncertainty (and hence for rank reversals), we show that the probability of success and firm uncertainty jointly determine the relationship of age-employment profiles and firms' age distribution with aggregate growth in the economy. Furthermore, return uncertainty and success probability jointly determine the pace of selection in the market and the decoupling between average firm growth and aggregate growth. However, once the model is able to replicate disparities in employment-age profiles observed in the economy, its predictive power for differences in aggregate growth is very similar to the one obtained by varying the probability of success alone.

Related literature. This paper provides a theory that links characteristics of industry dynamics to aggregate growth, consistent with the patterns documented in the data. The allocation of factors across firms has been shown to be key in understanding aggregate productivity differences across countries, and through them, income per capita (Hsieh and Klenow (2009), Restuccia and Rogerson (2008)). Until recently, most of the literature that studies the link between the micro structure of the economy and productivity focused on static allocations.
However, static allocations can reflect firm level distortions that can have lasting effects on incentives to innovate and firm growth (as in Hsieh and Klenow (2012), Da-Rocha, Tavares, and Restuccia (2014) and Cole, Greenwood, and Sanchez (2012)) and, ultimately, aggregate growth (as in Akcigit, Alp, and Peters (2014), Peters (2011)). Peters (2011) links static misallocation (through markup variation) with innovation incentives and growth. Akcigit, Alp, and Peters (2014) provide a theory of firm dynamics in developing countries based on contracting frictions that prevent managerial delegation in weak institutional environments, i.e. poor economies. Both Akcigit, Alp, and Peters (2014) and Peters (2011) follow the Klette and Kortum (2004) tradition in that firms innovate by deciding the frequency of upgrades. Distinctively, we fix the frequency of innovation episodes, but we allow firms to choose the intensity of the improvement. This is important because in an environment where firms only decide the frequency of upgrades, steep employment age profiles are always coupled with faster growing economies. As mentioned before, this fact is at odds with the data. We are able to accommodate slow growing economies and steep employment age profiles by decoupling success rates from return uncertainty.

Hsieh and Klenow (2012) analyze employment age profiles in India, Mexico and the US to argue that about 25% of the differences in aggregate TFP can be accounted for by disparities in firm employment growth. When analyzing a general equilibrium model with endogenous productivity growth (as in Atkeson and Burstein (2010)), they adjust returns to innovation in each country by the revenue distortions uncovered from the Mexican and Indian data. Because employment profiles are flatter in these countries, implied tax distortions are higher for firms with higher productivity. Higher tax distortions for more productive firms induce the model to predict flatter employment profiles.
In this paper we take an alternative path. We argue that it is not the actual return that is lower for more productive larger firms, but the expected return, once adjusted by uncertainty. Whereas the same technologies might be available in developed and developing countries, their expected returns are different. Furthermore, our result holds even assuming that uncertainty on returns is the same irrespective of firm size. Slower growth as the firm ages is not generated by productivity dependent penalties on innovation, but through a general equilibrium effect via endogenous selection, i.e. entry and exit.

Another piece of research arguing that, while the same technologies might be available across countries, features of the environment in which firms operate can generate disparities in the technological ladders that firms adopt is Cole, Greenwood, and Sanchez (2012). The authors assert that poorer financial markets can explain the disparities in total factor productivity and the employment size distribution.

Contracting problems are undoubtedly a relevant mechanism that explains flatter productivity profiles. Akcigit, Alp, and Peters (2014) analyze contracting frictions that affect the ability of entrepreneurs to delegate managerial activities. Such a friction can explain why it is optimal in certain economies for firms to remain small through time. Improvements in the contractual environment raise the net benefit of delegation and induce firm growth. In our paper, we vary the probability of obtaining returns to any given investment, conditional on the investment taking place. The presence of contracting frictions is definitely an important mechanism that would affect what we call the probability of success. There are other sources that our framework can also accommodate.
For example, the existence of "social capital", as documented in Knack and Keefer (1997), and trust relationships, documented in Bloom and Sadun (2012); or the degree of technology experimentation, which has been shown to be key in predicting innovation (Thomke (2003)). By keeping this reduced form approach, we gain model tractability and intuition about the link between these alternative sources of uncertainty, industry dynamics and growth. Certainly, the mechanisms underlying firm uncertainty need to be further understood.

There is also a growing literature that studies firm growth in general equilibrium and its impact on aggregate allocation. Atkeson and Burstein (2010) study an open economy with endogenous growth. They focus on the impact of trade barriers on the equilibrium entry and exit rates and incentives to innovate. In their economy, firms choose the probability of realizing an improvement in productivity of a given size. Distinctively, our model takes the probability of success as a fundamental of the economy, and lets firms choose their path of productivity improvement. This is important because it allows us to fully characterize the equilibrium firm size distribution, which is key in validating our results.

Our paper is also related to the literature studying growth as the outcome of technological investment of heterogeneous firms operating in the market. Luttmer (2010) provides conditions such that the thick tail in the firm size distribution widely documented in the data can arise as the outcome of firms' growth under uncertainty. In our paper, we assume an initial distribution of productivity that has such a tail and hence we are silent about the process that originates it. This is also a feature of Perla and Tonetti (2014), who study diffusion of technology and growth in an economy with heterogeneous firms. Two features differentiate our model from theirs.
First, in their model, when firms decide to invest in searching for new technologies, the technology that is assigned to them is purely exogenous. In our model, firms decide on the optimal level of technology that they would like to run. Second, in Perla and Tonetti (2014) growth is led by the firms at the bottom of the size distribution, who find it profitable to search for better technologies. Bigger and more productive firms do not innovate. In our model, growth is led by successful and fast growing firms. Small firms in the market invest in technology and, only if successful, turn large. If unsuccessful, they may opt out of innovation. The smallest firms in the market survive for a finite number of periods, and then endogenously exit.

The rest of the paper is organized as follows. Section 2 gathers empirical evidence across countries. Section 3 presents a stylized model that is consistent with those facts. Section 4 gives the analytical results from the model. Section 5 gives a quantitative assessment of the model. Section 6 introduces an extension of our benchmark model by allowing for additional idiosyncratic shocks affecting the size of firm growth. Section 7 concludes.

2 Evidence

We use the standardized dataset for 2006-2011 of the World Bank Enterprise Surveys (ES). We use data for 92 countries and for each country we pick the most recent survey available. For aggregate statistics such as income we use the Penn World Table 8.0. When we split countries into growth quartiles, we consider the average growth rate of GDP per capita (cgdpo/pop) since 2000. We omit countries with a negative average growth rate in that interval.

Since we argue that firms' probability of success has implications for income levels, we first analyze sales and employment information in the ES dataset. In particular, our objective is to show how the frequency of firms with positive growth in labor productivity relates to output growth across countries.
The fraction of firms that have positive productivity growth does not correspond to the probability of success in attempts to increase productivity, a variable that we do not observe in the data. However, we expect such a fraction to be increasing in the probability of success. A high probability of success would increase the frequency of episodes with productivity growth for a given set of firms operating in the market. Selection may induce firms to exit more often, but if anything those would be relatively unsuccessful firms. Hence, we expect the fraction of firms with productivity growth to be increasing in the probability of success.

To compute growth in labor productivity, we compute the ratio of sales per worker in the last year and three years before the survey. Figure 1 shows this relationship between the fraction of firms with positive sales-per-worker growth within the two years and average annual GDP growth as described before. The correlation coefficient is 0.32 in spite of the large variation across countries. This initial set of evidence suggests that there is a link between probability of success and income growth.

[Figure 1: Fraction of firms with productivity growth and GDP per capita growth. Country scatter plot; horizontal axis: % with productivity growth; vertical axis: GDP per capita growth; Correlation=0.3175.]

Second, we focus on firm age across countries. Figure 2 plots the average firm age against the growth rate of GDP per capita in our sample. It shows that faster growing countries tend to have younger firms (with correlation -0.11).
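The sales-per-worker measure behind Figure 1 can be illustrated with a minimal sketch. It is not the authors' code: the record keys (`sales_then`, `workers_then`, `sales_now`, `workers_now`), the helper names, and the three toy countries are all hypothetical, and the numbers are made up purely to show the mechanics of computing the share of firms with positive labor productivity growth and correlating it with aggregate growth.

```python
import math


def share_with_productivity_growth(firms):
    """Percent of firms whose sales per worker rose between two dates.

    Each firm is a dict with hypothetical keys: sales_then, workers_then,
    sales_now, workers_now.
    """
    growing, counted = 0, 0
    for f in firms:
        # Skip records for which the sales-per-worker ratio is undefined.
        if f["workers_then"] <= 0 or f["workers_now"] <= 0 or f["sales_then"] <= 0:
            continue
        then = f["sales_then"] / f["workers_then"]
        now = f["sales_now"] / f["workers_now"]
        counted += 1
        if now / then > 1.0:
            growing += 1
    if counted == 0:
        raise ValueError("no usable firm records")
    return 100.0 * growing / counted


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def firm(sales_then, workers_then, sales_now, workers_now):
    """Build one hypothetical firm record."""
    return {"sales_then": sales_then, "workers_then": workers_then,
            "sales_now": sales_now, "workers_now": workers_now}


# Made-up toy data: three countries with their GDP per capita growth rates.
toy = {
    "A": {"gdp_growth": 0.01, "firms": [firm(100, 10, 90, 10), firm(100, 10, 150, 10)]},
    "B": {"gdp_growth": 0.03, "firms": [firm(100, 10, 120, 10), firm(200, 20, 300, 20),
                                        firm(50, 5, 40, 5)]},
    "C": {"gdp_growth": 0.06, "firms": [firm(100, 10, 130, 10), firm(80, 8, 100, 8)]},
}
shares = [share_with_productivity_growth(c["firms"]) for c in toy.values()]
growth = [c["gdp_growth"] for c in toy.values()]
rho = pearson(shares, growth)
```

With real survey data the same two steps would be applied country by country; the sketch says nothing about survey weights or sample restrictions, which the text does not describe.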
[Figure 2: Average firm age and GDP per capita growth. Country scatter plot; horizontal axis: GDP per capita growth; vertical axis: Mean age; Correlation=-0.1257.]

Next, we turn to the relationship between the distribution of employment among firms and income growth. For this, we construct the Pareto tails for each country following Axtell (2001). We are interested in the tail of the distribution rather than other summary statistics for two reasons. First, the largest firms in each country contribute disproportionately more to value added, and explain most of the changes in aggregate output (as in Carvalho and Gabaix (2013)). Second, the survey considers formal firms for most countries. In the presence of informal firms, the nature of this data could bias the analysis of the relationship between the employment size distribution and GDP growth. Our results are robust to analyzing firms that are larger than 5 employees.⁵

Our estimates for the tails of the distribution go from 1 to 4. A higher parameter indicates a thinner tail, i.e. there are relatively few large establishments employment-wise. Figure 3 shows that the thickness of the tail in the employment size distribution lacks a systematic relationship with GDP growth (with correlation -0.06). In addition, as Figure 4 shows, it also lacks a systematic relationship with the fraction of firms with positive productivity growth (with correlation -0.08).

⁵ In order to get the Pareto tail indices, we first form 15 employment categories, where category $i$ corresponds to $3^{i-1}$ integers. (The first category includes firms with 1 employee, the second category includes firms with 2 to 4 employees, and so on.) For each country, we drop the categories that have a lower frequency than the next higher-order category. Then we regress the logarithm of the number of firms on the logarithm of the median point of each category. The coefficient of the latter is the tail index for that economy. The Pareto distribution captures the allocation of employment in these tails very well. Specifically, the average R-squared statistic in the regressions that we run to get the tail indices is 0.977.

[Figure 3: Tail indices and GDP per capita growth. Country scatter plot; horizontal axis: Tail index; vertical axis: GDP per capita growth; Correlation=-0.0763.]

Finally, we analyze firm employment growth and its relationship to aggregate output growth. We construct age-employment profiles by using the cross sectional dimension of our dataset. In particular, we compute the predicted employment by fitting a quadratic polynomial of age on employment for each particular growth quartile. In order to make profiles comparable across growth groups we normalize to 1 the predicted employment at age 5. Figure 5 shows that faster growing economies do not necessarily have steeper employment profiles. In fact, the figure illustrates that the slowest quartile has the steepest employment profile with age, and the fastest quartile has the flattest.

We close this section by highlighting the four empirical facts we document here.
First, there is a positive correlation between the frequency of productivity growth at the firm level and the growth rate of income per capita. Second, the latter growth rate is not correlated with the tail of the employment size distribution. Third, there is a negative correlation between average firm age and the growth rate of income per capita. Finally, employment age profiles are not necessarily steeper in faster growing economies. If anything, faster growing economies tend to have flatter employment profiles.

[Figure 4: Tail indices and fraction of firms with productivity growth. Country scatter plot; horizontal axis: Tail index; vertical axis: % with productivity growth; Correlation=-0.0904.]

[Figure 5: Age-employment profiles by growth quartile. Horizontal axis: Age (0 to 70); vertical axis: Average employment (normalized, age 5=1); one profile per growth quartile: 0%-1.7%, 1.7%-2.8%, 2.9%-4.8%, 5%+.]

3 Model

In the economy, there is an infinitely-lived representative consumer that derives utility from consumption of a final good $y$. There is a continuum of heterogeneous firms. Each firm operates a technology that uses labor $l$ to produce a homogeneous good $y$. The technology displays decreasing returns in labor, and has a Hicks neutral productivity shifter $z^{\zeta}$. Hence, after paying workers their wage rate $w$, the firm earns positive profits that are paid as dividends to the representative consumer in the economy.

There are two alternative types of firms: the non-innovative ones, whose productivity level $z$ is constant in time; and the innovative ones, whose productivity can change via investment.
The technology for improvements in productivity is such that whenever a firm with productivity $z$ undertakes an investment $\phi$ in period $t$, the productivity in period $t+1$ is $z' = \phi z$ with probability $q$ and $z$ otherwise. The probability of success in investment returns is the same for all firms operating in the economy. Investment is costly, as characterized by

$$C(\phi, z, w) = c\,\phi^{\tau}\,\frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}}.$$

There is an inelastic supply of labor equal to 1. A firm has an overhead cost of labor of $f_j$, $j \in \{N, I\}$, that depends on whether or not it belongs to the innovative group.

At any point in time a non-innovative firm can decide to exit the market at no cost. An innovative firm also has an option of liquidation. If the firm is liquidated and turned into a non-innovative project, it receives a scrap value equal to the expected value of the non-innovative firm with the same productivity $z$. If the innovative firm exits it gets a scrap value of zero, so liquidation and transformation into a non-innovative project is preferable as long as the non-innovative projects are valuable. Finally, firms are liquidated exogenously at rate $\delta$ after production and investment take place.

Notice that the mirror image of the liquidation of innovative projects is entry into the non-innovative sector. For a new firm to be created in the non-innovative sector a price $P(z, w)$ is paid. If there is free entry in the market, this price equals the expected value of the firm, which is the scrap value received by the liquidated innovative firm. Fundamentally, the model dictates that less productive firms will tend to imitate incumbents in the innovative sector (as they inherit productivity $z$).

Finally, we allow endogenous entry to the innovative sector. Any entrant starting a new firm pays the entry cost $x_s w$, and draws a productivity from a Pareto distribution with threshold $z_s > 0$.
We assume that $z_s$ is smaller than the productivity level of the least productive firm in the economy, and grows at the same rate as the average productivity in the economy.

We now characterize the optimization of each of the agents in the economy. We can solve the problem of the firms in two parts. First we solve for the allocation of labor given the distribution of productivities in the market. Then we solve for the dynamic decisions of the firm, which include technology investment.

3.1 Firm's problem (Static)

The static problem of a firm is:

$$\Pi(z, w) = \max_{l}\; z^{\zeta} l^{\theta} - w l$$

where $\theta$ is the share of labor in production. Then optimality gives

$$l = \left(\frac{\theta z^{\zeta}}{w}\right)^{\frac{1}{1-\theta}}.$$

For notational convenience, define $\eta \equiv \frac{\zeta}{1-\theta}$.

The total labor supply that can be used for productive purposes is equal to $1 - (f_N M_N + f_I M_I)$, if there are $M$ firms operating in the market ($M = M_I + M_N$, where $M_I$ is the measure of innovative firms and $M_N$ that of non-innovative firms). We can solve for the cost of labor

$$w = \frac{\theta}{\left(1 - (f_N M_N + f_I M_I)\right)^{1-\theta}} \left( \int z^{\eta}\, dv^{I}(z) + \int z^{\eta}\, dv^{N}(z) \right)^{1-\theta}$$

where $v^{I}$ ($v^{N}$) is the equilibrium distribution of productivities for innovative (non-innovative) projects. By definition, $M_j = \int dv^{j}(z)$ for $j = I, N$.

Using the equilibrium cost of labor we can characterize profits, labor demand and output for an arbitrary firm with productivity $z$:

$$\Pi(z, w) = (1-\theta)\,\theta^{\frac{\theta}{1-\theta}}\, \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}}$$

$$l(z, w) = \theta^{\frac{1}{1-\theta}}\, \frac{z^{\eta}}{w^{\frac{1}{1-\theta}}}$$

$$y(z, w) = \theta^{\frac{\theta}{1-\theta}}\, \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}}$$

3.2 Firm's Problem (Dynamic)

3.2.1 Non-Innovative Firm

Let $V_N(z, w)$ be the value of a firm operating in the non-innovative sector, when the cost of labor is $w$ and its productivity is $z$. The value of a firm operating in sector $N$ satisfies:

$$V_N(z, w) = (1-\theta)\,\theta^{\frac{\theta}{1-\theta}}\, \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_N w + \frac{1-\delta}{R} \max\{0, V_N(z, \gamma w)\}$$

Hence, the value of the firm is non-positive if and only if the flow value in that period is non-positive, i.e.

$$V_N(z, w) \le 0 \;\Leftrightarrow\; (1-\theta)\,\theta^{\frac{\theta}{1-\theta}}\, \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_N w \le 0.$$

Notice that the only decision for this firm is when to exit the market, conditional on survival.
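The static block can be checked numerically. The sketch below is only an illustration under stated assumptions: the continuum of firms is replaced by a few productivity points that each carry the same mass, and every parameter value (`THETA`, `ZETA`, `F_I`, `F_N`, `MASS`, the productivity lists) is hypothetical rather than taken from the paper. It evaluates labor demand from the first-order condition, computes the wage from the clearing expression in Section 3.1, and lets one verify that labor demand adds up to the labor left after overhead and that profits equal a share $1-\theta$ of output.

```python
def labor_demand(z, w, theta, zeta):
    """Static optimum l = (theta * z**zeta / w) ** (1 / (1 - theta))."""
    return (theta * z**zeta / w) ** (1.0 / (1.0 - theta))


def output(z, w, theta, zeta):
    """Output z**zeta * l**theta evaluated at the optimal labor choice."""
    return z**zeta * labor_demand(z, w, theta, zeta) ** theta


def profit(z, w, theta, zeta):
    """Revenue minus the wage bill at the optimal labor choice."""
    l = labor_demand(z, w, theta, zeta)
    return z**zeta * l**theta - w * l


def clearing_wage(z_innov, z_non, mass, f_i, f_n, theta, zeta):
    """Wage from the clearing condition, with the distributions v^I and v^N
    approximated by points that each carry `mass` (an assumption)."""
    eta = zeta / (1.0 - theta)
    overhead = mass * (f_i * len(z_innov) + f_n * len(z_non))
    if overhead >= 1.0:
        raise ValueError("overhead labor exhausts the unit labor supply")
    total = mass * sum(z**eta for z in list(z_innov) + list(z_non))
    return theta * (total / (1.0 - overhead)) ** (1.0 - theta)


# Hypothetical parameter values, chosen only for illustration.
THETA, ZETA = 0.6, 0.5
F_I, F_N, MASS = 0.1, 0.05, 0.2
Z_INNOV = [1.5, 2.0, 3.0]
Z_NON = [1.0, 1.2]

w_star = clearing_wage(Z_INNOV, Z_NON, MASS, F_I, F_N, THETA, ZETA)
production_labor = MASS * sum(labor_demand(z, w_star, THETA, ZETA)
                              for z in Z_INNOV + Z_NON)
overhead_labor = MASS * (F_I * len(Z_INNOV) + F_N * len(Z_NON))
```

In this sketch `production_labor + overhead_labor` should equal the unit labor endowment up to floating-point error, which is the content of the wage formula.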
3.2.2 Innovative firms

An innovative firm has the same static revenues as a non-innovative firm, but differs in two important respects. First, it can engage in risky investment in productivity, which results in successful innovation only with probability $q$. Second, in any period it can sell its technology to an entrant into the non-innovative sector. We can write the value of an innovative firm as:

$$V_I(z, w) = \max_{\phi \ge 1} \; (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - c\,\phi^{\tau} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_I w + \frac{1-\delta}{R}\Big[q \max\{V(\phi z, w'), P(z, w')\} + (1-q)\max\{V(z, w'), P(z, w')\}\Big]$$

Here we assume that if a firm sells its technology in a period, this nullifies any improvement in productivity made in the last period. This is analogous to assuming that the liquidation of the firm takes place before the firm knows whether its technology investment was successful or not.

If the firm would like to liquidate, at the beginning of every period it meets one potential entrant that will use its technology $z$ to operate a non-innovative firm. We assume that the latter makes a take-it-or-leave-it offer $P(z, w)$ to the former to buy the entire technology. As long as the technology is worth more as an $N$-firm than as an $I$-firm, it is optimal for the entrant to offer the minimum price that leaves the $I$-firm indifferent between selling or not. If the technology is more valuable as an $I$-firm, the entrant is indifferent between offering any amount less than $V_I(z, w)$, because any price in this range implies that the transaction will not go through, and any price above that gives a negative payoff to the $N$-firm. Without loss of generality, we assume that in this case the offer equals the value of the technology as an $N$-firm:

$$P(z, w) = \min\{V_I(z, w), V_N(z, w)\}$$

and the innovative firm accepts if and only if

$$P(z, w) \ge V_I(z, w).$$

In other words, the transaction occurs when its surplus is positive.
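The take-it-or-leave-it pricing rule can be written in two lines. The sketch below is only illustrative: the function names are hypothetical, and the two values passed in stand for $V_I(z, w)$ and $V_N(z, w)$ evaluated at some arbitrary state.

```python
def offer_price(v_innovative, v_non_innovative):
    """Entrant's offer: P = min{V_I, V_N}."""
    return min(v_innovative, v_non_innovative)


def sale_occurs(v_innovative, v_non_innovative):
    """The innovative firm accepts if and only if P >= V_I."""
    return offer_price(v_innovative, v_non_innovative) >= v_innovative


# Hypothetical values: the technology is worth more as an N-firm, so it is sold.
print(sale_occurs(1.0, 2.0))
# Hypothetical values: the technology is worth more as an I-firm, so no sale.
print(sale_occurs(2.0, 1.0))
```

Under this rule a sale takes place exactly when $V_N \ge V_I$, which is the surplus condition stated above.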
With this pricing scheme, we have $\max\{V(\phi z, w'), P(z, w')\} = V(\phi z, w')$ and $\max\{V(z, w'), P(z, w')\} = V(z, w')$. We assume that there is an infinite mass of potential entrant non-innovative firms. Hence, an innovative firm makes its technology investment decisions as if it were operating an infinitely lived firm:

$$V_I(z, w) = \max_{\phi \ge 1} \; (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - c\,\phi^{\tau} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_I w + \frac{1-\delta}{R}\big[q V(\phi z, w') + (1-q) V(z, w')\big]$$

When we describe the solution of the model, we will show that any innovative firm is bound to sell its technology to an entrant non-innovative firm in a finite number of periods.

4 Solution of the model

For tractability, we focus our attention on the balanced growth path.

Definition 1 A balanced growth path (BGP) in this economy is a sequence of aggregate output, a measure of aggregate productivity and wages that grow at a constant rate. Additionally, we require the existence of an invariant distribution of firm productivities, except possibly for a time trend.

To characterize the BGP, we guess that one exists. Under this assumption we solve for the optimal policies of the firms and show the existence of an invariant distribution. Once we solve for the equilibrium distribution and allocation of firms across innovative and non-innovative projects, we compute the equilibrium growth rate for aggregate output and wages. Both are constant, which confirms the existence of the BGP.

Aggregate output in the economy can be characterized by

$$Y(w) = \frac{w}{\theta},$$

hence along a BGP output and wages have to grow at the same rate, i.e. $[Y_t, w_t] = [Y, w]\gamma^{t}$ for some $\gamma \ge 1$. Using the growth rate for wages we can go back to the problem of the firms.

4.1 Solution of firms' problems

4.1.1 Non-Innovative Firms

Define the function for the number of periods left in the market as:

$$T(z, w) \equiv \max\left\{t : (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} z^{\eta} \ge f_N\, w^{\frac{1}{1-\theta}}\, \gamma^{\frac{t}{1-\theta}}\right\}.$$
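Here notice that we need to subtract 1 period since the first term gives the first period that the firm will not operate in the market. We will simply denote this as $T$.

Proposition 1 Along the BGP, the value of a firm at the beginning of a period can be characterized by

$$V_N(z, w) = B_{NT} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - D_{NT}\, w,$$

where for any $T \ge 1$:

$$B_{NT} = (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \sum_{t=0}^{T-1} \left(\frac{1-\delta}{R}\, \gamma^{-\frac{\theta}{1-\theta}}\right)^{t} = (1-\theta)\,\theta^{\frac{\theta}{1-\theta}}\, \frac{1 - \left(\frac{1-\delta}{R \gamma^{\frac{\theta}{1-\theta}}}\right)^{T}}{1 - \frac{1-\delta}{R \gamma^{\frac{\theta}{1-\theta}}}}$$

$$D_{NT} = f_N \sum_{t=0}^{T-1} \left(\frac{1-\delta}{R}\, \gamma\right)^{t} = f_N\, \frac{1 - \left(\frac{1-\delta}{R}\gamma\right)^{T}}{1 - \frac{1-\delta}{R}\gamma}$$

and for $T = 0$ both coefficients are 0.

If the functions $B_{NT}$ and $D_{NT}$ are as characterized above, the non-innovative firm exits the market in $T$ periods, and $V_N(z, w)$ satisfies the Bellman equation, which assures optimality.

4.1.2 Innovative firms

We guess that the value of an innovative firm is

$$V_I(z, w) = B_I \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - D_I w.$$

Replacing it in the Bellman equation yields

$$B_I \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - D_I w = \max_{\phi \ge 1}\; (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - c\,\phi^{\tau} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_I w + \frac{1-\delta}{R}\left[q\left(B_I \frac{\phi^{\eta} z^{\eta}}{w'^{\frac{\theta}{1-\theta}}} - D_I w'\right) + (1-q)\left(B_I \frac{z^{\eta}}{w'^{\frac{\theta}{1-\theta}}} - D_I w'\right)\right]$$

Then, solving the first-order condition with respect to $\phi$ yields

$$z' = \phi z = \left[\frac{\frac{1-\delta}{R}\, q B_I \eta}{c\,\tau\,\gamma^{\frac{\theta}{1-\theta}}}\right]^{\frac{1}{\tau-\eta}} z.$$

Substituting back,

$$B_I \eta = \eta (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} - c(\eta - \tau)\left[\frac{\frac{1-\delta}{R}\, q B_I \eta}{c\,\tau\,\gamma^{\frac{\theta}{1-\theta}}}\right]^{\frac{\tau}{\tau-\eta}} + \frac{\frac{1-\delta}{R}(1-q) B_I \eta}{\gamma^{\frac{\theta}{1-\theta}}}$$

gives the first equation relating $\phi$ and $\gamma$ through $B_I$. Meanwhile, the other component of the value function, $D_I$, reads:

$$D_I = \frac{f_I}{1 - \frac{1-\delta}{R}\gamma}.$$

Here, an important reminder is that we assumed interiority in this solution. In particular, we assumed that a firm following the optimal investment rule defined above, and innovating successfully, remains in the market next period. We will come back to this later and show that a firm that innovates successfully remains in the market.

4.1.3 Sale of the innovative firm to an entrant non-innovative firm

Proposition 2 For every innovative firm, there exists a finite time at which the technology will be sold to an entrant non-innovative firm, if the firm is not hit with an exogenous exit shock until then.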
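The closed forms for $B_{NT}$ and $D_{NT}$ in Proposition 1 are finite geometric sums, so they can be checked against a direct discounted sum of flow profits over the $T$ periods in which the firm operates. The sketch below does this under assumed values: `R`, `delta`, `f_n` and `theta` are taken from the calibration table later in the paper, `gamma` is set to one plus the sample average growth rate, and `z_eta` (standing for $z^{\eta}$), `w` and `T` are hypothetical.

```python
def non_innovative_coefficients(T, gamma, R, delta, f_n, theta=0.561):
    """Closed-form B_NT and D_NT from Proposition 1 (both zero when T == 0)."""
    if T == 0:
        return 0.0, 0.0
    a = theta / (1.0 - theta)
    flow = (1.0 - theta) * theta ** a
    rho_b = (1.0 - delta) / R * gamma ** (-a)  # discounting of the profit term
    rho_d = (1.0 - delta) / R * gamma          # discounting of the overhead term
    b = flow * (1.0 - rho_b ** T) / (1.0 - rho_b)
    d = f_n * (1.0 - rho_d ** T) / (1.0 - rho_d)
    return b, d


def value_by_direct_sum(z_eta, w, T, gamma, R, delta, f_n, theta=0.561):
    """Discounted sum of flow profits for t = 0, ..., T-1 with wages growing at gamma."""
    a = theta / (1.0 - theta)
    flow = (1.0 - theta) * theta ** a
    total = 0.0
    for t in range(T):
        w_t = w * gamma ** t
        total += ((1.0 - delta) / R) ** t * (flow * z_eta / w_t ** a - f_n * w_t)
    return total


# z_eta, w and T are hypothetical; the remaining values follow the calibration.
z_eta, w, T = 2.0, 1.0, 12
gamma, R, delta, f_n, theta = 1.037, 1.07, 0.015, 0.25, 0.561
b, d = non_innovative_coefficients(T, gamma, R, delta, f_n, theta)
closed_form = b * z_eta / w ** (theta / (1.0 - theta)) - d * w
direct = value_by_direct_sum(z_eta, w, T, gamma, R, delta, f_n, theta)
print(closed_form, direct)
```

The two numbers coincide up to floating-point error, which is all the proposition's closed forms assert once $T$ is taken as given.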
As shown in the proof in the Appendix, for this mechanism to be optimal for the firms we need (i) the surplus in the transaction to increase with unsuccessful innovations, so that unlucky firms eventually would like to liquidate. We also need that (ii) for a sufficiently large $z$ relative to $w$ the surplus is negative (a very productive firm does not find it profitable to liquidate), and (iii) for a sufficiently small $z$ relative to $w$ the surplus is positive. If these conditions hold, there is a finite point in time at which an innovative firm is traded. Hence, the population of both types of firms is non-degenerate.

4.2 Distribution dynamics

Define the threshold productivity such that a non-innovative firm exits in exactly $t$ periods, $Z(t, w)$, as

$$Z(t, w) = \left(\frac{f_N\, w^{\frac{1}{1-\theta}}\, \gamma^{\frac{t}{1-\theta}}}{(1-\theta)\,\theta^{\frac{\theta}{1-\theta}}}\right)^{\frac{1}{\eta}}$$

The lower bound in productivity for firms operating in the market is $\tilde{z}$, which solves:

$$(1-\theta)\,\theta^{\frac{\theta}{1-\theta}}\, \tilde{z}^{\eta} = f_N\, w^{\frac{1}{1-\theta}}$$

In fact, $\tilde{z} = Z(0, w)$. Under the BGP with growth rate $\gamma$, this implies that

$$\tilde{z}' = \tilde{z}\mu, \qquad \text{where } \mu \equiv \gamma^{\frac{1}{\eta(1-\theta)}}.$$

Also, the productivity threshold for selling for an innovative firm, $\hat{z}$, equals $\hat{z} = Z(\hat{t}, w)$, which also grows at rate $\mu$.

Proposition 3 If the initial distribution of productivities in the market is Pareto with shape parameter $\lambda$, and entrants in the innovative sector draw their productivity from the incumbent distribution, then
1. the growth rate of the threshold levels, $\mu$, is the same as the investment rate, $\phi$;
2. the equilibrium distribution of productivities across innovative firms is also Pareto with shape parameter $\lambda$.

This result implies that

$$\phi = \gamma^{\frac{1}{\eta(1-\theta)}},$$

which is the second equation relating $\gamma$ and $\phi$. It also shows that there is an invariant distribution of firms in the market. It remains to be shown that the relative population of innovative and non-innovative projects in the market is constant along the BGP. Let $\alpha$ be the proportion of firms in the market in the innovative sector, i.e. $\alpha \equiv \frac{M_I}{M}$.

Proposition 4 The share of innovative firms in the market is constant along the BGP and solves

$$(1-\alpha) = \alpha (1-q)\left[1 - \mu^{-\lambda}\right] \frac{1-\delta}{\delta}\left[1 - (1-\delta)^{\hat{t}}\right]$$

Finally, we give the implications of the free-entry condition in our model. Recall that the assumptions regarding the entry of innovative firms are: (i) new entrants draw their productivity from a Pareto distribution with threshold $z_s < \hat{z}$, (ii) they pay the entry cost before their productivity draw is realized, and (iii) the threshold $z_s$ grows at the same rate as average productivity. Hence, the free-entry condition reads:

$$(1 - F_s(\hat{z}))\, E\big(V_I(z, w) \mid z > \hat{z}\big) + F_s(\hat{z}) \cdot 0 = (1 - F_s(\hat{z}))\left[B_I \frac{E(z^{\eta} \mid z > \hat{z})}{w^{\frac{\theta}{1-\theta}}} - D_I w\right] = x_s w$$

where $F_s$ is the cumulative distribution function of new entrants. Using the properties of the Pareto distribution, we have:

$$\left(\frac{z_s}{\hat{z}}\right)^{\lambda}\left[B_I \frac{\lambda}{\lambda - \eta} \frac{\hat{z}^{\eta}}{w^{\frac{1}{1-\theta}}} - D_I\right] = x_s \qquad (1)$$

Notice that, since $\hat{z} = Z(\hat{t}, w)$ and $\tilde{z} = Z(0, w)$:

$$\frac{\hat{z}^{\eta}}{w^{\frac{1}{1-\theta}}} = \frac{\tilde{z}^{\eta}}{w^{\frac{1}{1-\theta}}}\, \mu^{\eta \hat{t}}.$$

Since we obtain $\frac{\tilde{z}^{\eta}}{w^{\frac{1}{1-\theta}}}$ from the exit condition of the marginal non-innovative firm, and $\hat{t}$ from the exit condition of the marginal innovative firm, equation (1) gives us the threshold productivity level $\hat{z}$ given the distribution parameter of the new entrants, $z_s$.

5 Quantitative exploration

This section is split in two parts. First, we show the quantitative properties of our model, particularly how variation in the probability of success shapes aggregate implications. Then, we present our calibration exercise and document how this variation alone can account for cross-country differences in growth rates and other relevant macroeconomic outcomes.

5.1 Role of the probability of success in aggregate outcomes

In showing the quantitative properties of the model, we set the parameters other than $q$ to their benchmark calibrated values, which we explain in the calibration.
Then we vary q to show the sensitivity of various aggregate moments to the probability of success. First of all, Figure 6 shows that there is a strong relationship between the probability of success and the implied aggregate growth rate of the economy.

[Figure 6: Probability of success and aggregate growth. Horizontal axis: q; vertical axis: GDP growth.]

Higher probability of success also makes survival more difficult, especially for the non-innovative firms that are simply stuck with their current productivity level and observing the wage rate in the economy increase as the income level grows. This can be seen in Figure 7, which illustrates shortening life-time of non-innovative firms in economies with higher levels of probability of success.

[Figure 7: Probability of success and average survival. Horizontal axis: q; vertical axis: life-time of N-firms.]

Higher probability of success affects the allocation of firms into (non-)innovative activity not only through a shorter life-span for non-innovative firms, but also because it changes the likelihood that the innovative firms lag behind their same-sector counterparts. The former channel increases the relative measure of innovative firms in the economy. However, a higher growth rate might make being an innovative firm a less appealing option, relative to selling the technology to a non-innovative firm, because of the additional costs of investment in innovation. Hence it is possible to have a U-shaped relationship between the probability of success and the measure of innovative firms, as Figure 8 shows.

5.2 Calibration

Our calibration strategy is to set as many parameters as possible following the conventional values in the literature, while we replicate the average features of our sample for key moments. We assume that the cost of innovative activity is a quadratic function of the attempted jump in productivity.
In setting the weights of productivity and labor, we assume that the production function exhibits limited span-of-control in a specific manner:

$$y = z^{\zeta} l^{\theta} = \left(z\, l^{\frac{\theta}{\zeta}}\right)^{\zeta}.$$

[Figure 8: Probability of success and fraction of innovative firms. Horizontal axis: q; vertical axis: fraction of I-firms.]

We set the span-of-control parameter equal to 0.85, a value that is standard in the literature.6 With our approach to the production function, this gives $\zeta = 0.85$. Using the standard labor share of 0.66, we set $\theta$ equal to $0.66 \times 0.85 = 0.561$. We set the risk-free interest rate, $r$, equal to 0.07, which is the average for our sample countries in the World Development Indicators dataset provided by the World Bank. For the tail parameter of the Pareto distribution, we use the average of our estimates for each country that we use in Section 2.

6 See, for instance, Midrigan and Xu (2010). Other studies using similar values include Buera, Kaboski, and Shin (2011) with a value of 0.79 and Cagetti and De Nardi (2006) with 0.88.

The main novelty of our calibration strategy is in determining the probability of success for each country in the sample. In our benchmark model, all the innovative firms try to innovate and they succeed with probability $q$. In case of not succeeding, small innovative firms that are already at the edge of exiting might endogenously leave the market; however, larger firms remain in the market regardless of their success in innovating. Hence, according to our model, at the top of the size distribution the fraction of firms that grow in productivity should be equal to $q$.7 Accordingly, we look at the top 10 percent of the size distribution for each country, and take the frequency of cases with productivity growth to be equal to $1 - (1-q)^{1/2}$. This gives us a $q$ for each country in the sample. Our remaining parameters are the level parameter for the investment cost, $c$, and the exogenous exit rate, $\delta$.
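The mapping from the observed frequency of productivity growth to $q$ is a one-line inversion of the expression $1 - (1-q)^{1/2}$ given above. The sketch below illustrates it together with the arithmetic behind $\theta$; the function names and the sample frequency are hypothetical, and the code is not the authors' calibration routine.

```python
def frequency_from_q(q):
    """Frequency of productivity growth implied by q: 1 - (1 - q)^(1/2)."""
    return 1.0 - (1.0 - q) ** 0.5


def q_from_frequency(frequency):
    """Invert the expression above: q = 1 - (1 - frequency)^2."""
    return 1.0 - (1.0 - frequency) ** 2


# Weight of labor: standard labor share times the span-of-control parameter.
span_of_control = 0.85
labor_share = 0.66
theta = labor_share * span_of_control

# Hypothetical observed frequency among the top 10 percent of firms by size.
observed_frequency = 0.25
q_implied = q_from_frequency(observed_frequency)
print(theta, q_implied)
```

Because the mapping is monotone on $[0, 1]$, each observed frequency pins down a unique $q$.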
Since our calibration exercise is aimed at matching the average profile among our sample economies, we calibrate these two parameters to match the average growth and average firm age in our sample. Calibrated model parameters are summarized in the table below.

Parameter                          Value        Basis
Investment cost steepness, τ       2            Quadratic
Weight of productivity, ζ          0.85         Span of control
Weight of labor, θ                 0.56         0.66 × Span of control
Interest rate, R − 1               0.07         WDI
Overhead costs, (fI, fN)           (0, 0.25)
Pareto tail, λ                     1.96         Average in the sample
Investment cost (level), c         0.17         Average growth, 3.7%
Exogenous exit rate, δ             0.015        Average firm age, 15.3

7 Notice that exogenous exit shocks might also hit firms. However, since these shocks are independent of the size and the innovation status of the firm, they do not matter for this equality.

Once we calibrate our model parameters to match the average growth and firm age in our sample, our objective is to show how the model accounts for the cross-country variation in these moments.8 We start by showing the model implications of the differences in the probability of success for aggregate growth. In line with the earlier discussion of how the variation in q relates to aggregate growth, Figure 9 shows that the model implies a monotone relationship between the probability of success and the growth rate.

8 Here, we focus attention on the countries with estimated q larger than 0.2, so that we avoid negative implied growth rates. This costs us 10 countries in the sample.

[Figure 9: Probability of success and aggregate growth, model. Horizontal axis: q; vertical axis: GDP growth (model).]

The main question, however, is how well these model-implied growth rates square with the ones observed in the data. This is depicted in Figure 10. The model explains 32 percent of the variation in the growth rates through the variation in q.9 Next, we turn our attention to the variation in the firm age across countries.
First, Figure 11 shows that the model implies that firms generally get younger as countries' growth rates are higher. This is qualitatively in line with the empirical evidence we document regarding the same relationship in our sample. To see how the model fits the observed variation in average firm age, we plot the model-implied average age against the data counterparts in Figure 12. Specifically, the model explains about 10 percent of the variation in average firm age across countries.

9 Specifically, 0.32 is the slope of the linear fit in Figure 10.

[Figure 10: Aggregate growth, model fit. Scatter of countries with 45-degree line and linear fit. Horizontal axis: GDP growth (model); vertical axis: GDP growth (data).]

Even though the benchmark model does a decent job in explaining the cross-country variation in growth rates of output per capita, a major shortcoming of this simple framework is that firms can never increase their ranking in the productivity ladder, and hence are never able to grow in size, employment-wise. We document this feature of the model in Figure 13, which shows the benchmark model's quadratic fit of employment as a function of age. The lack of employment growth at the firm level is an undesirable property, which is a simple artifact of the increase in productivity of successful firms being equal to the growth rate of average productivity. Absent idiosyncratic movements among successful firms, none of them can climb up the ladder. In the next section, we extend our benchmark economy by introducing idiosyncratic productivity shocks, and show how it can help the model generate the age-employment profiles observed in the data.
[Figure 11: Average firm age and aggregate growth, model. Horizontal axis: GDP growth (model); vertical axis: average age (model).]

[Figure 12: Average firm age, model fit. Scatter of countries with 45-degree line and linear fit. Horizontal axis: mean age (model); vertical axis: mean age (data).]

[Figure 13: Age-employment patterns, model. Labor-age profile without shocks, normalized quadratic fit.]

6 Extension: Idiosyncratic investment returns

In this section, we add further uncertainty in productivity increments. In particular, we assume that if the firm successfully innovates, the actual jump in its productivity depends on an i.i.d. shock in addition to the investment that it makes:

$$z' = \begin{cases} \phi\,\varepsilon\, z, & \text{with probability } q; \\ z, & \text{with probability } 1-q, \end{cases}$$

where $\varepsilon \sim U(\underline{\varepsilon}, \bar{\varepsilon})$. Our strategy for calibrating $(\underline{\varepsilon}, \bar{\varepsilon})$ is to minimize the distance between the average age-employment profile in the data and in the model. Figure 14 shows this fit with the calibrated $(\underline{\varepsilon}, \bar{\varepsilon})$, which are equal to (0.955, 1.100). We do not change the parameter values used in the benchmark, including the calibrated parameters $c$ and $\delta$.

With this modification of the benchmark model, the model is ready for the comparison of the variation in the age-employment patterns with the data.10 For illustration purposes, we group countries into four quartiles of income growth; the practice that we followed for the data counterpart in Section 2. Figure 15 shows the age-employment profiles in these four groups. In particular, it documents that the model also generates non-monotonicity of the steepness of these profiles as in the data.
In particular, the steepest profile is exhibited by the slowest group, and the flattest one is that of the fastest, which is also the case in the data.

10 Most of the analytical results in the benchmark follow with this extension, including the linear policy rule for the investment, φ. We will include the modifications in the results in the appendix of later drafts.

[Figure 14: Age-employment patterns, model with idiosyncratic shocks. Employment-age profiles, normalized quadratic fit.]

[Figure 15: Age-employment profiles by growth quartile, model with idiosyncratic shocks. Employment-age profiles, normalized quadratic fit; series labeled 1.1%, 3.1%, 4.2%, 6.7%.]

7 Final remarks

We build a stylized model of firm dynamics with growth where the aggregate growth of the economy is linked to the probability of success that firms face in their returns to investment in firm growth. In particular, our model highlights two channels through which such probability affects firm-level as well as aggregate growth. First, lower probability of success has an impact on firm growth directly by decreasing the frequency of successful growth episodes in a firm's life-cycle, and indirectly by lowering incentives to invest in growth. This partial equilibrium mechanism increases firm-level and aggregate growth in economies with higher probability of success. What is also relevant, and helpful to fit important features in the data, is the general equilibrium channel. Economies with higher probability of success in turning investments into actual growth will have higher growth, making survival harder. Average productivity will grow faster, and the distribution of productivity will shift upwards, possibly preserving its shape.
In addition to accounting for an important part of the cross-country variation in growth rates, these features enable our model to replicate two important features of our cross-country data: younger firms in faster growing economies, and the lack of a systematic relationship between the tail of the size distribution and growth.

Firm-level uncertainty is built into our benchmark model only through the probability of realizing positive returns to investment. In order to see the link between such probability of success and the firm-level patterns of employment growth, we extend our benchmark economy allowing for return uncertainty. We are able to generate age-employment patterns comparable to those in the data.

References

Akcigit, U., H. Alp, and M. Peters (2014): “Lack of Selection and Imperfect Managerial Contracts: Firm Dynamics in Developing Countries,” Manuscript.
Atkeson, A., and A. T. Burstein (2010): “Innovation, Firm Dynamics, and International Trade,” Journal of Political Economy, 118(3), 433–484.
Axtell, R. (2001): “Zipf Distribution of U.S. Firm Sizes,” Science.
Bloom, N., and R. Sadun (2012): “The Organization of Firms Across Countries,” The Quarterly Journal of Economics, 127(4), 1663–1705.
Buera, F. J., J. P. Kaboski, and Y. Shin (2011): “Finance and Development: A Tale of Two Sectors,” American Economic Review, 101(5), 1964–2002.
Cagetti, M., and M. De Nardi (2006): “Entrepreneurship, Frictions, and Wealth,” Journal of Political Economy, 114(5), 835–870.
Carvalho, V., and X. Gabaix (2013): “The Great Diversification and Its Undoing,” American Economic Review, 103(5), 1697–1727.
Christensen, C. (ed.) (1997): The Innovator’s Dilemma. Harvard Business Review Press.
Cole, H. L., J. Greenwood, and J. M. Sanchez (2012): “Why Doesn’t Technology Flow from Rich to Poor Countries?,” Working Papers 2012-040, Federal Reserve Bank of St. Louis.
Da-Rocha, J. M., M. M. Tavares, and D.
Restuccia (2014): “Policy Distortions and Aggregate Productivity with Endogenous Establishment-Level Productivity,” Working Papers tecipa-523, University of Toronto, Department of Economics.
Hsieh, C.-T., and P. J. Klenow (2009): “Misallocation and Manufacturing TFP in China and India,” The Quarterly Journal of Economics, 124(4), 1403–1448.
Hsieh, C.-T., and P. J. Klenow (2012): “The Life Cycle of Plants in India and Mexico,” NBER Working Papers 18133, National Bureau of Economic Research, Inc.
Klette, T. J., and S. Kortum (2004): “Innovating Firms and Aggregate Innovation,” Journal of Political Economy, 112(5), 986–1018.
Knack, S., and P. Keefer (1997): “Does Social Capital Have an Economic Payoff? A Cross-Country Investigation,” The Quarterly Journal of Economics, 112(4), 1251–88.
Luttmer, E. G. (2010): “On the mechanics of firm growth,” Discussion paper.
Midrigan, V., and D. Y. Xu (2010): “Finance and Misallocation: Evidence from Plant-level Data,” NBER Working Papers 15647, National Bureau of Economic Research, Inc.
Perla, J., and C. Tonetti (2014): “Equilibrium Imitation and Growth,” Journal of Political Economy, 122(1), 52–76.
Peters, M. (2011): “Heterogeneous Mark-Ups and Endogenous Misallocation,” Discussion paper.
Restuccia, D., and R. Rogerson (2008): “Policy Distortions and Aggregate Productivity with Heterogeneous Plants,” Review of Economic Dynamics, 11(4), 707–720.
Thomke, S. (ed.) (2003): Experimentation Matters: Unlocking the Potential of New Technologies for Innovation. Harvard Business School Press.

8 Appendix A

Proposition 2 (Proof). We first argue that $B_I > B_{N\hat{t}}$.

$$B_I = \max_{\phi \ge 1}\; (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} - c\,\phi^{\tau} + \frac{1-\delta}{R}\left[q B_I \phi^{\eta} \frac{1}{\gamma^{\frac{\theta}{1-\theta}}} + (1-q) B_I \frac{1}{\gamma^{\frac{\theta}{1-\theta}}}\right] \ge (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} - c + \frac{1-\delta}{R} \frac{1}{\gamma^{\frac{\theta}{1-\theta}}} B_I$$

Meanwhile,

$$B_{N\hat{t}} < (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} + \frac{1-\delta}{R} \frac{1}{\gamma^{\frac{\theta}{1-\theta}}} B_{N\hat{t}}$$

Hence for $c$ small enough,

$$B_I \ge \frac{(1-\theta)\,\theta^{\frac{\theta}{1-\theta}} - c}{1 - \frac{1-\delta}{R} \frac{1}{\gamma^{\frac{\theta}{1-\theta}}}} > B_{N\hat{t}}.$$

Under the condition that $f_N - f_I$ is not too large, we also know that $D_I > D_{NT}$ since

$$D_I - D_{NT} = f_I \sum_{t=0}^{\infty} \left(\frac{1-\delta}{R}\gamma\right)^{t} - f_N \sum_{t=0}^{T-1} \left(\frac{1-\delta}{R}\gamma\right)^{t}.$$
Next, we use the last findings to show that the surplus is increasing in unsuccessful innovations; in other words, it decreases with $T(z, w)$. Define the surplus function:

$$s(z, w) \equiv -V_I(z, w) + V_N(z, w) = -(B_I - B_{NT}) \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} + (D_I - D_{NT})\, w$$

Then after one unsuccessful innovation the surplus becomes:

$$s(z, w\gamma) = -(B_I - B_{N,T-1}) \frac{z^{\eta}}{(w\gamma)^{\frac{\theta}{1-\theta}}} + (D_I - D_{N,T-1})\, w\gamma$$

$$= -\left(B_I - \left(B_{NT} - (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \left(\frac{1-\delta}{R}\gamma^{-\frac{\theta}{1-\theta}}\right)^{T-1}\right)\right) \frac{z^{\eta}}{(w\gamma)^{\frac{\theta}{1-\theta}}} + \left(D_I - \left(D_{NT} - f_N \left(\frac{1-\delta}{R}\gamma\right)^{T-1}\right)\right) w\gamma$$

so that

$$s(z, w\gamma) - s(z, w) = -(B_I - B_{NT}) \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} \left(\gamma^{-\frac{\theta}{1-\theta}} - 1\right) + (D_I - D_{NT})\, w (\gamma - 1)$$
$$\qquad - (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \left(\frac{1-\delta}{R}\gamma^{-\frac{\theta}{1-\theta}}\right)^{T-1} \frac{z^{\eta}}{(w\gamma)^{\frac{\theta}{1-\theta}}} + f_N \left(\frac{1-\delta}{R}\gamma\right)^{T-1} w\gamma$$

or, collecting the last two terms,

$$s(z, w\gamma) - s(z, w) = -(B_I - B_{NT}) \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} \left(\gamma^{-\frac{\theta}{1-\theta}} - 1\right) + (D_I - D_{NT})\, w (\gamma - 1)$$
$$\qquad - \left(\frac{1-\delta}{R}\right)^{T-1} \left[(1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}} \gamma^{\frac{\theta}{1-\theta}T}} - f_N w \gamma^{T}\right]$$

We know $\gamma > 1$, $B_I > B_{NT}$ and $D_I > D_{NT}$. Hence, the first line in the equation above is positive. Moreover, by definition of $T$,

$$(1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}} \gamma^{\frac{\theta}{1-\theta}T}} - f_N w \gamma^{T} \le 0.$$

Hence $s(z, w\gamma) - s(z, w)$ is positive. This shows that the surplus increases with every unsuccessful innovation. We know that for $z$ and $w$ such that $T(z, w) = 0$, we have

$$(1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_N w \le 0, \qquad B_{N0} = 0, \quad D_{N0} = 0,$$

therefore

$$V_I(z, w) = (1-\theta)\,\theta^{\frac{\theta}{1-\theta}} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - c\,\phi^{\tau} \frac{z^{\eta}}{w^{\frac{\theta}{1-\theta}}} - f_N w < 0.$$

Therefore, for $T = 0$ the surplus is positive. Moreover, $\lim_{T \to \infty} D_{NT} = D_I$ and $B_{NT} < B_I$, so that there is a large enough $T$ such that the surplus is negative. This shows that there exists $\hat{t} < \infty$ such that an $I$-firm sells the technology to an entrant $N$-firm.

Proposition 3 (Proof). Define:

$$X_I \equiv E_I(z^{\eta}), \qquad X_N \equiv E_N(z^{\eta})$$

Also define the cumulative distribution function for group $I$ as:

$$F_I(z) = \begin{cases} 1 - \left(\frac{\hat{z}}{z}\right)^{\lambda}, & \text{if } z \ge \hat{z}; \\ 0, & \text{otherwise.} \end{cases}$$

Moreover, let $M_I$ and $M_N$ be the number of firms in each group, which will be constant in the BGP, and let $\alpha \equiv \frac{M_I}{M_I + M_N}$. To show that $\phi = \mu$, we first show that $\phi \le \mu$ along the BGP. Define $\bar{z} \equiv \max\{\hat{z}, \frac{\mu}{\phi}\hat{z}\}$ and $a \equiv \frac{\hat{z}}{\bar{z}}$. Then

$$X_I' = (1-\delta)\, q \int_{\bar{z}}^{\infty} (\phi z)^{\eta} \frac{\lambda \hat{z}^{\lambda}}{z^{\lambda+1}}\, dz + (1-\delta)(1-q) \int_{\hat{z}'}^{\infty} z^{\eta} \frac{\lambda \hat{z}^{\lambda}}{z^{\lambda+1}}\, dz + \big[\delta + (1-\delta)\, q F_I(\bar{z}) + (1-\delta)(1-q) F_I(\hat{z}')\big] X_I'$$

$$= \frac{q \int_{\bar{z}}^{\infty} (\phi z)^{\eta} \frac{\lambda \hat{z}^{\lambda}}{z^{\lambda+1}}\, dz + (1-q) \int_{\hat{z}'}^{\infty} z^{\eta} \frac{\lambda \hat{z}^{\lambda}}{z^{\lambda+1}}\, dz}{1 - q F_I(\bar{z}) - (1-q) F_I(\hat{z}')}$$

$$= \frac{q \left(\frac{\phi}{\mu}\right)^{\eta} a^{\lambda-\eta} + (1-q)\mu^{-\lambda}}{q a^{\lambda} + (1-q)\mu^{-\lambda}}\; \mu^{\eta}\, \frac{\lambda}{\lambda-\eta}\, \hat{z}^{\eta}$$

For $N$-firms we need to keep track of all the endogenous exits from group $I$ over the last $\hat{t}$ periods. Define:

$$k(\check{z}) \equiv q \int_{\check{z}}^{\check{z}/a} (\phi z)^{\eta} \frac{\lambda \check{z}^{\lambda}}{z^{\lambda+1}}\, dz + (1-q) \int_{\check{z}}^{\mu\check{z}} z^{\eta} \frac{\lambda \check{z}^{\lambda}}{z^{\lambda+1}}\, dz = q \frac{\lambda}{\lambda-\eta} \phi^{\eta} \left[1 - a^{\lambda-\eta}\right] \check{z}^{\eta} + (1-q) \frac{\lambda}{\lambda-\eta} \left[1 - \mu^{\eta-\lambda}\right] \check{z}^{\eta}$$

Then:

$$X_N = (1-\delta)^{\hat{t}}\, k(\hat{z}\mu^{-\hat{t}}) + (1-\delta)^{\hat{t}-1}\, k(\hat{z}\mu^{-\hat{t}+1}) + \dots + (1-\delta)\, k\!\left(\frac{\hat{z}}{\mu}\right)$$

$$= \left(q \frac{\lambda}{\lambda-\eta} \phi^{\eta} \left[1 - a^{\lambda-\eta}\right] + (1-q) \frac{\lambda}{\lambda-\eta} \left[1 - \mu^{\eta-\lambda}\right]\right) \sum_{t=0}^{\hat{t}-1} (1-\delta)^{t+1} \mu^{-\eta(t+1)}\, \hat{z}^{\eta}$$

$$= \left(q \frac{\lambda}{\lambda-\eta} \phi^{\eta} \left[1 - a^{\lambda-\eta}\right] + (1-q) \frac{\lambda}{\lambda-\eta} \left[1 - \mu^{\eta-\lambda}\right]\right) (1-\delta)\mu^{-\eta}\, \frac{1 - (1-\delta)^{\hat{t}} \mu^{-\eta\hat{t}}}{1 - (1-\delta)\mu^{-\eta}}\, \hat{z}^{\eta}$$

Moreover:

$$\gamma = \frac{w'}{w} = \left(\frac{\alpha X_I' + (1-\alpha) X_N'}{\alpha X_I + (1-\alpha) X_N}\right)^{1-\theta}$$

$$= \left(\mu^{\eta}\; \frac{\alpha\, \frac{q \left(\frac{\phi}{\mu}\right)^{\eta} a^{\lambda-\eta} + (1-q)\mu^{-\lambda}}{q a^{\lambda} + (1-q)\mu^{-\lambda}}\, \mu^{\eta\hat{t}} + (1-\alpha) \left(q \phi^{\eta} [1 - a^{\lambda-\eta}] + (1-q)[1 - \mu^{\eta-\lambda}]\right) \sum_{t=0}^{\hat{t}-1} (1-\delta)^{t+1} \mu^{-\eta t}}{\alpha \mu^{\eta\hat{t}} + (1-\alpha) \left(q \phi^{\eta} [1 - a^{\lambda-\eta}] + (1-q)[1 - \mu^{\eta-\lambda}]\right) \sum_{t=0}^{\hat{t}-1} (1-\delta)^{t+1} \mu^{-\eta t}}\right)^{1-\theta}$$

Since $\gamma = \mu^{\eta(1-\theta)}$:

$$q \left(\frac{\phi}{\mu}\right)^{\eta} a^{\lambda-\eta} + (1-q)\mu^{-\lambda} = q a^{\lambda} + (1-q)\mu^{-\lambda}$$

which gives

$$\frac{\phi}{\mu} = a \le 1 \;\Rightarrow\; \phi \le \mu.$$

We prove $\mu \le \phi$ by contradiction. In particular, suppose that $\mu > \phi$, i.e., threshold productivity grows at a faster rate than the productivity of the successful innovative firms. Then a measure $F_I(\hat{z}\frac{\mu}{\phi}; t) - F_I(\hat{z}; t)$ of innovative firms will not be able to stay in the market even if they successfully increase their productivity. This implies that these firms would not invest in productivity growth, since the increments for the exiting firms are nullified by assumption.
Notice that then the value of a firm with productivity z ∈ [ẑ, ẑ µφ ] the value is: zη θ V I (z, w) = (1 − θ)θ 1−θ w − fw − c θ 1−θ zη w θ 1−θ + 1−δ I V (z, wµ) R Since the firm endogenously exits, we have: V I (z, wµ) ≤ V N (z, wµ) Then: θ V N (z, w) = (1 − θ)θ 1−θ > (1 − θ)θ θ 1−θ zη θ 1−θ w zη θ w 1−θ 1−δ N V (z, wµ) R zη 1−δ I − fw − c θ + V (z, wµ) R w 1−θ − fw + = V I (z, w) This contradicts ẑ < z being the threshold productivity level for innovating firms; hence, shows that φ = µ. Proposition 3 (Proof ). Define: XI ≡ EI (z η ) XN ≡ EN (z η ) 45 Also define the cumulative distribution function for the group I as: 1− FI (z) = 0, ẑ λ z , if z ≥ ẑ; o/w. Moreover, let MI and MN be the number of firms in each group, which will be MI . To show that φ = MI +MN max{ẑ, µφ ẑ}, and a ≡ z̄ẑ . Then constant in the BGP, let α ≡ along the BGP. Define z̄ ≡ XI0 Z ∞ = (1 − δ)q z̄ µ we first show that µ ≤ φ λẑ λ (φz) λ+1 dz + (1 − δ)(1 − q) z η Z ∞ ẑ 0 λẑ λ z λ+1 dz z η +[δ + (1 − δ)qFI (z̄) + (1 − δ)(1 − q)FI (ẑ 0 )]XI0 R ∞ η λẑλ R∞ λ q z̄ (φz)η zλẑ λ+1 dz + (1 − q) ẑ 0 z z λ+1 dz = 1 − qFI (z̄) − (1 − q)FI (ẑ 0 ) η q µφ aλ−η + (1 − q)µ−λ λ η = µη ẑ λ −λ qa + (1 − q)µ λ−η For N-firms we need to keep track of all the endogenous exits from group-I for the last t̂ periods. Define: Z k(ž) ≡ q ž ž/a λž λ (φz) λ+1 dz + (1 − q) z η Z ž µž zη λž λ dz z λ+1 λ λ φη [1 − aλ−η ]ž η + (1 − q) [1 − µη−λ ]ž η = q λ−η λ−η 46 Then: ẑ XN = (1 − δ)t̂ k(ẑµ−t̂ ) + (1 − δ)t̂−1 k(ẑµ−t̂+1 ) + .. 
+ (1 − δ)k( ) µ X t̂−1 λ λ φη [1 − aλ−η ] + (1 − q) [1 − µη−λ ] (1 − δ)t+1 µ−η(t+1) ẑ η = q λ−η λ−η t=0 λ λ 1 − (1 − δ)t̂ µ−ηt̂ η = q φη [1 − aλ−η ] + (1 − q) [1 − µη−λ ] (1 − δ)µ−η ẑ λ−η λ−η 1 − (1 − δ)µ−η Moreover: α = w0 = γ= w αXI0 + (1 − α)XN0 αXI + (1 − α)XN 1−θ = η φ q( µ ) aλ−η +(1−q)µ−λ η t̂ η λ−η η−λ Pt̂−1 ] + (1 − q)[1 − µ ] t=0 (1 − δ) P t̂−1 η t̂ η λ−η η−λ t+1 −ηt αµ + (1 − α) (qφ [1 − a ] + (1 − q)[1 − µ ]) µ t=0 (1 − δ) qaλ +(1−q)µ−λ µ + (1 − α) qφ [1 − a t+1 −ηt Since γ = µη(1−θ) : η φ q aλ−η + (1 − q)µ−λ = qaλ + (1 − q)µ−λ µ Which gives φ =a≤1⇒φ≤µ µ We prove µ ≤ φ by contradiction. In particular, suppose that µ > φ, i.e., threshold productivity grows at a faster rate than the productivity growth of the successful innovative firms. Then, a measure FI (ẑ µφ ; t) − FI (ẑ; t) innovative firms will not be able to stay in the market even if they successfully increase their productivity. This implies that these firms would not invest in productivity growth, since the increments for the exiting firms are nullified by assumption. Notice that then the value of a firm with productivity z ∈ [ẑ, ẑ µφ ] the value is: 47 µ 1−θ µη zη θ V I (z, w) = (1 − θ)θ 1−θ w − fw − c θ 1−θ zη w θ 1−θ + 1−δ I V (z, wµ) R Since the firm endogenously exits, we have: V I (z, wµ) ≤ V N (z, wµ) Then: θ V N (z, w) = (1 − θ)θ 1−θ > (1 − θ)θ θ 1−θ zη θ 1−θ w zη θ w 1−θ 1−δ N V (z, wµ) R zη 1−δ I V (z, wµ) − fw − c θ + R w 1−θ − fw + = V I (z, w) This contradicts ẑ < z being the threshold productivity level for innovating firms; hence, shows that φ = µ. 
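The step in the proof of Proposition 3 from the recursive expression for $X_I'$ to its solved form, and the conclusion $\phi \le \mu$, can be checked numerically. The sketch below is only illustrative: the function names and all parameter values are hypothetical choices, not taken from the paper, and it assumes $\lambda > \eta$ and $\mu \ge 1$ so that the Pareto integrals converge and the threshold weakly rises.

```python
import math


def xi_next_from_recursion(z_hat, phi, mu, q, delta, lam, eta):
    """Solve the linear recursion for X_I' using closed-form Pareto integrals."""
    z_bar = max(z_hat, mu / phi * z_hat)   # lowest z whose successful innovation survives
    z_hat_next = mu * z_hat                # next period's threshold

    def tail_moment(lower):
        # integral of z^eta * lam * z_hat^lam / z^(lam+1) from `lower` to infinity
        return lam / (lam - eta) * z_hat**lam * lower**(eta - lam)

    def cdf(z):
        # Pareto CDF of group I with lower bound z_hat
        return 1.0 - (z_hat / z)**lam if z >= z_hat else 0.0

    inflow = (1 - delta) * (q * phi**eta * tail_moment(z_bar)
                            + (1 - q) * tail_moment(z_hat_next))
    replaced = delta + (1 - delta) * (q * cdf(z_bar) + (1 - q) * cdf(z_hat_next))
    return inflow / (1.0 - replaced)


def xi_next_closed_form(z_hat, phi, mu, q, lam, eta):
    """Solved form of X_I' in terms of a = z_hat / z_bar."""
    a = z_hat / max(z_hat, mu / phi * z_hat)
    ratio = ((q * (phi / mu)**eta * a**(lam - eta) + (1 - q) * mu**(-lam))
             / (q * a**lam + (1 - q) * mu**(-lam)))
    return ratio * mu**eta * lam / (lam - eta) * z_hat**eta


def growth_gap(z_hat, phi, mu, q, lam, eta):
    """X_I' / (mu^eta * X_I); equals 1 exactly when X_I grows at the BGP rate."""
    x_i = lam / (lam - eta) * z_hat**eta
    return xi_next_closed_form(z_hat, phi, mu, q, lam, eta) / (mu**eta * x_i)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for phi in (1.02, 1.05, 1.10):
        print(phi, growth_gap(1.0, phi, 1.05, 0.3, 3.0, 1.0))
```

Under these assumed values the gap is one whenever $\phi \le \mu$ and strictly above one when $\phi > \mu$, which is the content of $\phi/\mu = a \le 1$.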
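Corollary. From the previous result it follows that:

$$
X_I = \frac{\lambda}{\lambda-\eta}\,\hat z^{\eta} \tag{2}
$$

$$
X_N = (1-q)\,\frac{\lambda}{\lambda-\eta}\left[1-\mu^{\eta-\lambda}\right](1-\delta)\mu^{-\eta}\,\frac{1-(1-\delta)^{\hat t}\mu^{-\eta \hat t}}{1-(1-\delta)\mu^{-\eta}}\,\hat z^{\eta} \tag{3}
$$

From the exit condition of group N, we know that:

$$
(1-\theta)\theta^{\frac{\theta}{1-\theta}}\,\tilde z^{\eta} = f\,w^{\frac{1}{1-\theta}}
$$

where

$$
\begin{aligned}
w &= \theta\,\frac{\left(\alpha M X_I + (1-\alpha) M X_N\right)^{1-\theta}}{(1-fM)^{1-\theta}} \\
  &= \frac{\theta M^{1-\theta}}{(1-fM)^{1-\theta}}\left(\frac{\lambda}{\lambda-\eta}\right)^{1-\theta}
     \left[\alpha + (1-\alpha)(1-q)\left[1-\mu^{\eta-\lambda}\right](1-\delta)\mu^{-\eta}\,\frac{1-(1-\delta)^{\hat t}\mu^{-\eta \hat t}}{1-(1-\delta)\mu^{-\eta}}\right]^{1-\theta}\hat z^{\eta(1-\theta)} \\
  &= \frac{\theta M^{1-\theta}}{(1-fM)^{1-\theta}}\left(\frac{\lambda}{\lambda-\eta}\right)^{1-\theta}
     \left[\alpha + (1-\alpha)(1-q)\left[1-\mu^{\eta-\lambda}\right](1-\delta)\mu^{-\eta}\,\frac{1-(1-\delta)^{\hat t}\mu^{-\eta \hat t}}{1-(1-\delta)\mu^{-\eta}}\right]^{1-\theta}\tilde z^{\eta(1-\theta)}\mu^{\eta(1-\theta)\hat t}
\end{aligned}
$$

Hence we get

$$
\frac{(1-\theta)\theta^{\frac{\theta}{1-\theta}}\,\tilde z^{\eta}}{f}
 = \theta^{\frac{1}{1-\theta}}\,\frac{M}{1-fM}\,\frac{\lambda}{\lambda-\eta}
   \left[\alpha + (1-\alpha)(1-q)\left[1-\mu^{\eta-\lambda}\right](1-\delta)\mu^{-\eta}\,\frac{1-(1-\delta)^{\hat t}\mu^{-\eta \hat t}}{1-(1-\delta)\mu^{-\eta}}\right]\hat z^{\eta}
$$

$$
(1-\theta)\theta^{\frac{\theta}{1-\theta}}
 = \theta^{\frac{1}{1-\theta}}\,\frac{fM}{1-fM}\,\frac{\lambda}{\lambda-\eta}
   \left[\alpha + (1-\alpha)(1-q)\left[1-\mu^{\eta-\lambda}\right](1-\delta)\mu^{-\eta}\,\frac{1-(1-\delta)^{\hat t}\mu^{-\eta \hat t}}{1-(1-\delta)\mu^{-\eta}}\right]\mu^{\eta \hat t} \tag{4}
$$

This gives us $M$.¹¹ In what follows, we show how to get $M_N$ (or equivalently, how to find $\alpha$).

¹¹ Notice that this is in line with the strong and positive correlation between population size and firm population illustrated in Bollard, Klenow and Li (2013).

Proposition 4 (Proof). Define

$$
m(\check z) \equiv (1-q)\int_{\check z}^{\mu \check z}\frac{\lambda \check z^{\lambda}}{z^{\lambda+1}}\,dz\;M_I = (1-q)\left[1-\mu^{-\lambda}\right]M_I
$$

then

$$
\begin{aligned}
M_N &= (1-\delta)^{\hat t}\,m(\hat z\mu^{-\hat t}) + (1-\delta)^{\hat t-1}\,m(\hat z\mu^{-\hat t+1}) + \dots + (1-\delta)\,m\!\left(\frac{\hat z}{\mu}\right) \\
    &= (1-q)\left[1-\mu^{-\lambda}\right]\sum_{t=0}^{\hat t-1}(1-\delta)^{t+1}M_I \\
    &= (1-q)\left[1-\mu^{-\lambda}\right]\frac{1-\delta}{\delta}\left[1-(1-\delta)^{\hat t}\right]M_I
\end{aligned}
$$

$$
(1-\alpha) = \alpha\,(1-q)\left[1-\mu^{-\lambda}\right]\frac{1-\delta}{\delta}\left[1-(1-\delta)^{\hat t}\right] \tag{5}
$$

which proves the claim.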
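The counting argument in the proof of Proposition 4 is a finite geometric sum, so it is easy to verify numerically. The following is a minimal sketch under assumed parameter values; the function names and the numbers are hypothetical and serve only to illustrate how the share $\alpha$ follows from $M_N/M_I$.

```python
import math


def n_to_i_ratio_by_cohort(q, mu, delta, lam, t_hat):
    """M_N / M_I obtained by summing the surviving cohorts of the last t_hat periods."""
    per_cohort = (1 - q) * (1 - mu**(-lam))   # share of group I falling below the new threshold
    return sum(per_cohort * (1 - delta)**(t + 1) for t in range(t_hat))


def n_to_i_ratio_closed_form(q, mu, delta, lam, t_hat):
    """M_N / M_I from the closed-form geometric sum."""
    return ((1 - q) * (1 - mu**(-lam)) * (1 - delta) / delta
            * (1 - (1 - delta)**t_hat))


def innovative_share(q, mu, delta, lam, t_hat):
    """alpha = M_I / (M_I + M_N), implied by (1 - alpha) = alpha * (M_N / M_I)."""
    return 1.0 / (1.0 + n_to_i_ratio_closed_form(q, mu, delta, lam, t_hat))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(q=0.3, mu=1.05, delta=0.1, lam=3.0, t_hat=5)
    print(n_to_i_ratio_by_cohort(**params), n_to_i_ratio_closed_form(**params))
    print(innovative_share(**params))
```

Given $q$, $\mu$, $\delta$, $\lambda$ and $\hat t$, the share $\alpha$ is therefore pinned down without reference to $M$.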