Convergence: Variation in Concept and Empirical Results

Nazrul Islam
Department of Economics
Emory University

First Draft: July 1996
Current Draft: June 1998

I would like to thank all those who commented on earlier versions of this paper. Their suggestions have led to significant improvements. All remaining errors and shortcomings are mine.

1. Introduction

A central issue around which the recent growth literature has evolved is that of convergence. Whether the poorer countries of the world are converging to the income levels of the richer countries is, by itself, a question of paramount importance for human welfare. However, interest in the issue has been fueled further by the fact that convergence became linked with the question of which kind of growth theory better describes the world. It has been generally thought that convergence is an implication of the neoclassical growth theory (NCGT), while the new growth theories (NGT) do not have this implication. Hence, it was supposed that by testing for convergence, one could test which of these two classes of growth theory was more valid.

Given this link, it is hardly surprising that the convergence issue has engaged so many outstanding minds of the economics profession. This has led to a wide variety of concepts and results, so much so that the overall situation has become somewhat confusing. This has not been helped by the fact that there have been very few attempts to review this literature. This paper is an attempt to fill that gap. It takes stock of this vast literature and tries to synthesize the results. In doing so, it uses the logical-historical method. The literature on convergence has unfolded over time in response to perceived logical inadequacies of the works of the previous period.
Hence, understanding the logical-analytical points requires an examination of how this literature has evolved over time.

As we shall see, there is a wide variety of ways in which convergence has been defined and understood. This is evident from the following, often encountered, dichotomies:
(a) Convergence within an economy vs. convergence across economies;
(b) Convergence in terms of growth rate vs. convergence in terms of income level;
(c) β-convergence vs. σ-convergence;
(d) Unconditional (absolute) convergence vs. conditional convergence;
(e) Unconditional convergence vs. club-convergence;
(f) Income-convergence vs. TFP (total factor productivity)-convergence; and
(g) Deterministic convergence vs. stochastic convergence.

It is not that all these different interpretations of convergence were apparent from the beginning. Research on convergence has proceeded through several stages, and it is only with time that these different interpretations of convergence emerged and gained currency.

From a methodological point of view, we can identify the following five approaches to convergence study:
(a) Informal cross-section approach,
(b) Formal cross-section approach,
(c) Panel approach,
(d) Time-series approach, and
(e) Distribution approach.

There is some correspondence between the convergence concepts, on the one hand, and the methodologies used, on the other. However, this correspondence is not unique. Thus, for example, the informal and formal cross-section approaches, the panel approach, and (in part) the time-series approach have all focused on β-convergence, either conditional or unconditional. These approaches have generally dealt with convergence across economies and in terms of per capita income level. The formal cross-section and panel approaches have also been employed to examine club-convergence and TFP-convergence. The cross-section approach has even been used to study σ-convergence.
On the other hand, the time series approach has been used to investigate convergence both within an economy and across economies, and this approach has often proceeded from the notion of stochastic convergence. Finally, the distribution approach has gone beyond investigating just σ-convergence and has focused on the entire shape of the distribution and intra-distribution dynamics.

This variety of concepts and methodological approaches has led to a plethora of empirical results. However, at a broad level, there is considerable agreement in the results. For example, despite differences in approach and methodology, the finding of conditional β-convergence has remained relatively robust. This has been true for both small samples of developed economies and large, global samples. In fact, for the developed economies, researchers have often reported unconditional convergence. Similarly, once it is remembered that σ-convergence research generally focuses on unconditional convergence, it becomes clear that results regarding σ-convergence largely agree with those regarding β-convergence. Evidence of σ-convergence is found precisely in those small samples of developed economies for which there is also evidence of unconditional β-convergence. On the other hand, in large global samples, neither unconditional β-convergence nor σ-convergence is found to hold. Finally, time series analysis has also produced evidence of conditional convergence, even though the steady state variation considered in this approach has generally been limited to time-invariant differences and trend breaks only.

Despite these agreements, at a more concrete level, convergence research has not produced consensus estimates of the structural parameters of the growth model. Two particular parameters that have figured prominently in this regard are the rate of convergence and the elasticity of output with respect to capital.
Also, not all approaches to convergence research have been equally concerned with values of structural parameters. From this point of view, convergence studies can be classified into two broad groups. One group, comprising the cross-section and panel approaches, imposes structure on the data and produces estimates of the structural parameters of the growth model. The other group, comprising the time series and distribution approaches, tends to avoid structure and thus resembles reduced-form analysis of output data. These latter approaches, therefore, do not produce structural parameter estimates and do not answer questions concerning precise values of these parameters.

Given the differences in approach, sample, data, model, estimation technique, etc., consensus regarding the parameter values was, perhaps, not to be expected. Some generalities have, nevertheless, emerged. It has been generally observed that the more differences in the steady state of economies are controlled for (either by sample selection or by inclusion of relevant variables in the regression), the higher is the resulting rate of convergence. An important manifestation of this can be seen in the context of technological differences. It has been found that much higher convergence rates result when the technology determinant of the steady state is controlled for than when it is not. This also indicates that capital deepening and technological diffusion, the two processes that jointly determine income-convergence, may not always play a symmetric role. Evidence indicates that while TFP-convergence has aided income-convergence in small samples of developed economies, in the large, global sample this might not have been the case. In the latter sample, there is strong evidence of capital deepening, but very large TFP differences remain. However, TFP dynamics in the global sample are yet to be fully studied.

What are the implications of these convergence results?
It is clear that the evidence of conditional β-convergence does not necessarily mean that per capita income levels of all countries are converging to the income level of the richer countries. Hence, from a welfare point of view, the implication of conditional convergence is rather hollow. In fact, conditional convergence does not imply that the income levels of the countries are converging to any one particular income level. A parallel of this can be seen in the fact that a conditional negative β does not necessarily imply a decreasing σ. But it is also true that an increasing σ does not preclude the possibility of a negative β. This indicates the necessity of going beyond evidence of conditional β-convergence and examining the evolution of the distribution as a whole and intra-distribution dynamics. The distribution approach to convergence emphasizes these latter issues and produces some important results. However, a fuller understanding of these results requires a change in the focus of growth and convergence research.

What are the implications of the convergence results for the growth theory debate? The most important development in this regard has been the introduction of the concept of conditional convergence and the establishment of the fact that the convergence implication of NCGT is, at best, conditional on differences in steady state and not absolute. This has helped reconcile cross-country growth data with NCGT, and we have the overall result that NCGT cannot be rejected on the basis of evidence from convergence research.[1] This does not mean that the conflict between NCGT and NGT has been fully resolved. What this means is that the test between NCGT and NGT has to move on to other grounds and/or be formulated differently than just in terms of convergence.[2] One difficulty in this regard is that, unlike NCGT, there is no ‘consensus’ NGT. The commonality of the different variants of NGT lies mainly in their common rejection of the exogenous steady state growth of NCGT.
However, they vary widely with regard to the source and workings of alternative, endogenous growth. It is therefore not surprising that convergence discussion has revolved more around the question of whether or not NCGT is rejected than the question of whether or not NGT is vindicated. Positive tests of NGT have been few and far between.

[1] This has led to Barro’s recent remark: “It is surely an irony that one of the lasting contributions of endogenous growth theory is that it stimulated empirical work that demonstrated the explanatory power of the neoclassical growth model.” (Barro 1997, p. x)
[2] For examples of such tests, see Jones (1995a, 1995b).

However, the concept of conditional convergence has had two important consequences. First, it has worked toward making NCGT and NGT observationally equivalent. This is particularly true when the concept of conditional convergence is pushed to the extent that countries can have not only different steady state levels but also different steady state growth rates. Convergence then becomes virtually an empty construct, and equilibrium data cannot effectively discriminate between NCGT and NGT. This problem of observational equivalence has been aggravated from the other end by the fact that, partly in response to the empirical evidence supporting convergence, variants of NGT have now been put forward that themselves have a convergence implication.[3]

The second consequence of the concept of conditional convergence has been that, to a certain extent, it has diverted attention away from important issues of growth. This is because it focuses on growth after controlling for the determinants of steady state and, thus, abstracts from the important issue of how these determinants are themselves determined. Yet, even from the NCGT perspective, the long run income level of an economy depends on these determinants, and improvement in them not only implies a higher steady state level but also induces transitional growth.
If, in addition, the steady state growth rates are also allowed to be country-specific, then little of use remains in the concept of conditional convergence. In view of this emptiness, it is perhaps somewhat unfortunate that more research is devoted to correctly estimating the rate of conditional convergence than to analyzing the steady state determinants.

[3] For example, see Jones and Manuelli (1990).

This relative lack of attention to the determinants of steady state is also hindering fuller understanding of the results produced by the distribution approach to convergence. Two main results of this line of research are, first, that the cross-section distribution in the large sample of countries is not collapsing, and, second, that this distribution is becoming increasingly bi-modal. These results are not incompatible with conditional β-convergence, provided it is kept in mind that steady state determinants can not only vary cross-sectionally but also change over time. More research directed to the dynamics of these determinants (among which are investment, fertility, technology, and institutions) can enable us to understand the underlying causes of increasing bi-modality. This may also help in making current growth and convergence research more practically relevant.

In understanding the dynamics of the steady state determinants, it is important to pay due attention to the interdependence of the economies of the world. One nagging problem with convergence study is that the models used to study it are geared to describe dynamics within an economy. Yet the convergence that we are more interested in is essentially an across-economy process. The theoretical transition from this within-economy model to one that describes growth in an interconnected world is yet to be fully made. Such a construct has to address cross-economy processes concerning both factor accumulation and technological diffusion.
While international trade theory provides some perspectives on the former, theory is lacking regarding the latter. Some important beginnings have been made;[4] however, much more needs to be done. Also, theoretical advances that have been made in this regard are yet to be distilled into testable empirical hypotheses. Thus, both theoretical and empirical work focused directly on the cross-country processes has become another important priority.

The discussion of this paper is organized as follows. In section-2, we discuss the link between growth theory and the issue of convergence. Section-3 provides an introductory description of different concepts of convergence. Section-4 reviews the initial evidence on convergence based on informal cross-section regressions. The formal model-based equation that has become the mainstay of convergence research is presented in section-5. Section-6 reviews the cross-section results based on formal specifications. Section-7 discusses the panel approach to convergence study. The time series approach to convergence analysis is reviewed in section-8. Section-9 discusses the distribution approach to convergence. Conclusions are drawn in section-10. Most sections are provided with a summary at the end to help the reader follow the discussion.

The literature on convergence is too vast to make an all-inclusive survey possible. Accordingly, many works remain outside of the review here, which does not mean that these are not important. Also, this paper is not meant to be a review of the entire new literature on growth. It considers only that part of this literature which focuses on convergence and its relationship with growth theory.

2. Growth Theory and the Issue of Convergence

For a long time, the neoclassical growth theory (NCGT) has been the main paradigm for discussion of economic growth.
However, in the mid-eighties, two interrelated dissatisfactions arose, both of which can be traced back to the NCGT-assumption of diminishing returns to capital in the particular form of the Inada conditions. The first of these concerns the source of steady state growth. Because of Inada-type diminishing returns, steady state growth in NCGT has to come from ‘outside,’ in the form of exogenous technological progress, for which there is no within-model explanation. This specification of steady state growth, which Solow had conceived as a short cut to capturing a more complicated process,[5] no longer proved satisfactory.[6]

[4] An important recent contribution in this regard is by Ventura (1997). Other contributions in this direction include Barro, Mankiw, and Sala-i-Martin (1995) and Barro and Sala-i-Martin (1997).

The second dissatisfaction concerns NCGT’s ability to explain cross-country regularities of growth. The specification of technological progress in NCGT is based on the following assumptions: (a) no resources are needed to generate technological innovation, (b) everybody equally benefits from it, and (c) nobody pays any compensation for benefiting from it. When extended to a global setting, these assumptions lead to a convergence implication. If all economies can share in technological progress equally, then they all, sooner or later, should grow at a common rate given by the rate of exogenous technological progress. This gives a hypothesis of convergence in terms of growth rate. However, a hypothesis of convergence in terms of per capita income level is also ascribed to NCGT, based on the following reasoning. Diminishing returns imply higher marginal productivity of capital in a capital-poor country. Hence, with similar savings rates, capital-poor economies will grow faster and eventually catch up with the richer economies in terms of per capita income.
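The catch-up reasoning above can be illustrated with a minimal numerical sketch. It assumes a discrete-time Solow model with Cobb-Douglas technology; the function names and all parameter values (saving rate, growth rates, depreciation, capital exponent, initial capital stocks) are hypothetical choices made here for illustration and are not taken from the studies reviewed in this paper.

```python
import math

def solow_path(k0, s=0.25, n=0.01, g=0.02, delta=0.05, alpha=1/3, periods=50):
    """Path of capital per effective worker in a discrete-time Solow model.

    Assumed textbook law of motion:
        k' = (s * k**alpha + (1 - delta) * k) / ((1 + n) * (1 + g))
    """
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append((s * k**alpha + (1 - delta) * k) / ((1 + n) * (1 + g)))
    return path

def log_output(k, alpha=1/3):
    """Log output per effective worker, with y = k**alpha."""
    return alpha * math.log(k)

# Two economies with identical parameters but different initial capital.
poor = solow_path(k0=1.0)
rich = solow_path(k0=4.0)

# First-period growth rates of output per effective worker (log differences).
growth_poor = log_output(poor[1]) - log_output(poor[0])
growth_rich = log_output(rich[1]) - log_output(rich[0])

# Log income gap at the start and at the end of the simulated horizon.
gap_start = log_output(rich[0]) - log_output(poor[0])
gap_end = log_output(rich[-1]) - log_output(poor[-1])

print(f"initial growth: poor={growth_poor:.4f}, rich={growth_rich:.4f}")
print(f"log income gap: start={gap_start:.4f}, end={gap_end:.4f}")
```

Under these assumed parameters both economies start below the common steady state, the capital-poor economy grows faster, and the log income gap narrows over the horizon. The sketch holds every steady state determinant equal across the two economies, which is the unconditional case.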
In cross-country data, therefore, there should be a negative correlation between the initial level of income and the subsequent growth rate.[7] However, an initial look at the Summers and Heston (1988, 1991) data set led to the claim that, for large samples, the convergence hypothesis did not hold.[8] This alleged non-conformity became the second dissatisfaction with NCGT.

[5] See Solow (1994, 1997).
[6] Actually, this dissatisfaction is not new. Development economists were long unhappy about the property of NCGT whereby the long run growth rate was exogenous and could not be influenced by policies. This found expression in the charge of policy irrelevance against NCGT.
[7] Note that this negative correlation is necessary for convergence in terms of both growth rate and per capita income.
[8] See, for example, Baumol (1986) and Romer (1989a, 1989b). We shall consider this evidence in more detail shortly.

As Romer (1994) explained, these two dissatisfactions were also the two origins of NGT, and they also influenced the initial course that NGT took. To the extent that the convergence implication of NCGT is rooted in the assumption of diminishing returns to capital, NGT tried to avoid the convergence implication by moving away from this assumption. This is most evident in the AK-version of NGT, which opts for a straightforward replacement of diminishing returns by constant returns. In other initial versions of NGT, e.g., Romer (1986) and Lucas (1988), a similar goal is accomplished in more roundabout ways. The generic implication of these versions of NGT is that economies starting out with lower levels of per capita capital stock have no inherent reason to experience a higher growth rate, hence no reason for convergence.

Convergence thus became a proving ground for testing NCGT versus NGT. It became a test for diminishing returns to capital. In the Cobb-Douglas case, returns to capital are determined by capital’s exponent.
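As a short worked step, take the Cobb-Douglas form Y = K^α (AL)^(1−α) that is used in section-3 of this paper, with time subscripts dropped. The marginal product of capital is then:

```latex
\frac{\partial Y}{\partial K}
  = \alpha K^{\alpha - 1} (AL)^{1-\alpha}
  = \alpha \left( \frac{AL}{K} \right)^{1-\alpha}
```

For 0 < α < 1 this expression falls as K rises relative to AL, which is the diminishing-returns case. In the limiting case α = 1 it is constant, as in the AK-version. The size of capital’s exponent is therefore what separates the two cases in this functional form.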
Convergence, therefore, became an issue of correct estimation of this parameter. An empirical estimate of this elasticity parameter indicates whether the private return to capital differs from its social or aggregate return. To the extent that many of the initial variants of NGT rely on externality, while NCGT does not, evidence of a wedge between capital’s private and social return has direct bearing on the growth theory debate. All these questions coalesced in the debate over convergence, and it is because of these various ramifications that the convergence debate has been raging so forcefully for such a long time.

3. Different Concepts of Convergence

A. Convergence Within vs. Convergence Across

Robert Solow (1970), in his exposition of growth theory, starts out by relating to the six stylized facts about growth that were put forward by Kaldor (1971). Coming to the fifth and sixth of these,[9] he pauses and makes the following comment: “The remaining ‘stylized facts’ are of a different kind, and will concern me less, because they relate more to comparisons between different economies than to the course of events within any one economy.” (p. 3; my italics) It is somewhat ironic that one of the recent dissatisfactions with the Solow model has been its alleged failure to explain across- or between-country variation in growth rate and income level.

Historically, the main objective of the Solow model was to show that once factor substitution was allowed, the economy could achieve stable dynamic equilibrium instead of suffering from the inherent instability that characterized the previous growth models of Harrod and Domar. In NCGT, no matter whether the economy starts off from a per capita capital stock that is lower or higher than the equilibrium, the forces of the economy lead it to the equilibrium. Hence, this is indeed a proposition of convergence, albeit within an economy.
Paradoxically, the concept of convergence that arose and became associated with NCGT was in the across-economy sense.

B. Convergence in Terms of Growth Rate vs. Convergence in Terms of Income Level

These two variants of the convergence hypothesis have already been discussed in section-2. The hypothesis of convergence in terms of growth rate follows directly from an extension of the NCGT-assumptions regarding technological progress to a global setting, and no additional assumption is required. In contrast, the hypothesis of convergence in terms of per capita income level requires additional assumptions, which are not innocuous. The distinction between unconditional and conditional convergence is centered on these assumptions.

C. β-Convergence vs. σ-Convergence

The common methodology of investigating convergence in its across-economy sense has been to run a cross-country regression with subsequent growth as the dependent variable and initial level as the explanatory variable.[10] This setup is known as the growth-initial level regression.[11] The hypothesized negative correlation between initial level and subsequent growth is supposed to be picked up by the coefficient (say, β) of the initial income variable. Convergence judged by the sign of β came to be known as β-convergence.[12] However, researchers like Quah (1993a), Friedman (1994), and others have noted that convergence is a proposition regarding the dispersion of the cross-sectional distribution of income, and a negative β from the growth-initial level regression does not imply a reduction in this dispersion. They point out that a negative β can be another example of the more general phenomenon of reversion to the mean, and that, by reading convergence into it, growth researchers are falling into Galton’s fallacy.
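The point that a negative β need not imply falling dispersion can be illustrated with a small simulation. This is only a sketch under stated assumptions: log income is taken to follow a stationary first-order autoregression with a persistence parameter and shock variance chosen so that the cross-section variance is constant by construction. The names `ols_slope`, `rho`, `sigma`, and the sample size are hypothetical and are not drawn from the studies cited above.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(0)            # fixed seed so the illustration is reproducible
n_economies = 5000        # hypothetical number of economies
rho = 0.9                 # assumed persistence of log income
sigma = 1.0               # assumed stationary cross-section std. deviation

# Initial log incomes drawn from the stationary distribution.
y0 = [random.gauss(0.0, sigma) for _ in range(n_economies)]

# Shock variance chosen so that the cross-section variance stays at sigma**2.
shock_sd = sigma * (1 - rho**2) ** 0.5
y1 = [rho * y + random.gauss(0.0, shock_sd) for y in y0]

# Subsequent growth is the change in log income.
growth = [b - a for a, b in zip(y0, y1)]

beta = ols_slope(y0, growth)        # growth-initial level coefficient
sd_start = statistics.pstdev(y0)    # cross-section dispersion at the start
sd_end = statistics.pstdev(y1)      # cross-section dispersion at the end

print(f"beta = {beta:.3f}")
print(f"sigma: start = {sd_start:.3f}, end = {sd_end:.3f}")
```

In this sketch the estimated β is close to rho − 1, which is negative, even though the dispersion is unchanged by construction up to sampling error. That is the sense in which reversion to the mean can produce a negative β without any narrowing of the distribution.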
In the view of these critics, instead of judging indirectly, and perhaps erroneously, through the sign of β, convergence should be judged directly by looking at the dynamics of dispersion of income level and/or growth rate across countries. This gave rise to the concept of σ-convergence, σ being the standard deviation of the corresponding distribution.[13] However, despite the limitations above, researchers have continued to be interested in β-convergence, in part because it is a necessary, though not sufficient, condition for σ-convergence. The other reason for continued interest in β-convergence is that it provides answers regarding structural parameters of growth models. In contrast, convergence research along the distribution approach generally avoids structure and tends towards reduced form analysis.

[9] The fifth of these stylized facts was that the growth rate of per capita output varied widely across countries, and the sixth, that economies with a high share of profits in income had higher investment to output ratios.
[10] Either alone or along with other right hand side variables.
[11] Sometimes these are also called Barro-regressions, referring to Barro (1991).
[12] In this paper we shall use β as the generic notation for the coefficient on the initial level variable in the growth-initial level regressions. Note that a negative β can be interpreted as evidence of convergence in terms of both income level and growth rate.
[13] The expressions β-convergence and σ-convergence were first coined by Sala-i-Martin.

D. Unconditional Convergence vs. Conditional Convergence

Proceeding from the Solow model and assuming a Cobb-Douglas production function of the type $Y_t = K_t^{\alpha} (A_t L_t)^{1-\alpha}$, the steady state level of per capita income, $y^{*}$, is given by

(1)  $y^{*} = A_0 e^{gt} \left[ s/(n + g + \delta) \right]^{\alpha/(1-\alpha)}$,

where s is the saving rate, and g and n are the assumed exponential growth rates of $A_t$ and $L_t$, respectively.[14] This shows clearly that the steady state income level of a country depends on the vector θ, which has the following six elements: $(A_0, g, s, n, \delta, \alpha)$.[15] Unconditional convergence, which is based on the assumption of a common steady state, implies that all elements of θ are the same across the economies considered. In terms of the growth-initial level regression, it therefore implies that the sign of β should be negative even if no other variable is included on the right hand side. In contrast, the concept of conditional convergence emphasizes the possible differences in steady state and hence requires that appropriate variables be included on the right hand side of the growth-initial level regression to proxy for these differences. However, which of the different elements of θ should be allowed to vary, and which not, is an important issue in convergence research, as we shall see.

[14] Other notations are standard: $Y_t$ is the income or output, $K_t$ and $L_t$ are capital and labor inputs respectively, and $A_t$ is the shift parameter.
[15] In the case of the Cass-Koopmans model, θ also has a similar set of elements, with s replaced by parameters for the rate of time preference and the inter-temporal elasticity of substitution.

E. Conditional Convergence vs. Club Convergence

The concept of club convergence can be traced back to Durlauf and Johnson (1995). Recently, Galor (1996) has provided a more explicit formulation of the concept.
One property of the standard NCGT is that the equilibrium is unique, and the usual notion of convergence assumes this uniqueness. In the case of unconditional convergence, there is only one equilibrium level, which all economies approach. In the case of conditional convergence, equilibrium differs by economy, and each particular economy approaches its own, but unique, equilibrium. In contrast, the idea of club-convergence is based on models that are characterized by the possibility of multiple equilibria.[16] Which of these different equilibria an economy will reach depends on its initial position.[17] This produces club-convergence:[18] different groups/clubs approach different equilibria depending on the common initial location (or some other attribute) they share.

Observationally, therefore, the notion of club-convergence can be thought of as an intermediate concept between conditional and unconditional convergence. This may be illustrated using the vector θ above. Absolute convergence assumes that the values of all six of its elements are the same across economies. Conditional convergence, on the other hand, can allow the values of all these elements to differ. Club convergence emphasizes the difference in initial condition, say $A_0$.[19] However, club convergence does not imply that each different value of the initial condition yields a different equilibrium. Rather, the idea is that there are ranges of values of the initial condition that correspond to different equilibria.

[16] For models with multiple equilibria, see, for example, Azariadis and Drazen (1990).
[17] And, perhaps, some other attributes, though it is the difference in initial conditions that is emphasized.
[18] Note that this use of the term ‘club’ is different from the way it was used by Baumol (1986) when he coined the expression ‘convergence club.’ In Baumol’s usage, economies had a unique equilibrium, and ‘convergence club’ denoted groups of similar economies, and hence the same equilibrium.
[19] However, the definition of initial conditions may not be limited to $A_0$.

F. Income-convergence vs. TFP-convergence

Convergence research has generally dealt with convergence in terms of per capita income, i.e., income convergence. However, income convergence can be the joint outcome of the twin processes of capital deepening and technological catch-up. While most researchers have focused on the parameters of the capital deepening process, other researchers, like Dowrick and Nguyen (1989), Dougherty and Jorgenson (1996, 1997), Wolff (1991), and Dollar and Wolff (1994), have directed their attention to the process of technological catch-up. Since total factor productivity (TFP) is the closest measure of technology, these researchers have investigated whether, over time, countries have come closer in terms of TFP levels. This has given rise to the concept of TFP-convergence. Clearly, income convergence can get either accelerated or thwarted depending on whether the initial TFP-differences narrow or widen over time. Note that TFP-convergence can either be an independent query or a subsequent part of the general query into income convergence.

G. Deterministic Convergence vs. Stochastic Convergence

The concept of stochastic convergence has been put forward by Bernard and Durlauf (1996), Carlino and Mills (1993), Evans (1996), Evans and Karras (1996a), and others. The idea may be expressed as follows. Two economies, i and j, are said to converge if the (per capita) outputs for them, $y_{i,t}$ and $y_{j,t}$, satisfy the following condition:

(2)  $\lim_{k \to \infty} E(y_{i,t+k} - a \cdot y_{j,t+k} \mid I_t) = 0$.

Note that with a = 1, it is a variant of the notion of unconditional convergence. This definition of stochastic convergence is relatively unambiguous for a two-economy situation. This is not so when convergence is considered in a sample consisting of more than two economies. Researchers have differed in this regard.
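For the two-economy case, condition (2) with a = 1 can be illustrated with a small sketch before turning to the multi-economy variants. The first-order autoregressive process for the output gap, the function name `forecast_gap`, and the numerical values are assumptions made here for illustration; they are not the specification used by the authors cited above.

```python
def forecast_gap(gap_now, rho, horizon):
    """k-step-ahead conditional expectation of the output gap.

    Assumes (hypothetically) that the gap d_t = y_i,t - y_j,t follows a
    zero-mean AR(1): d_{t+1} = rho * d_t + e_{t+1}, with E(e | I_t) = 0,
    so that E(d_{t+k} | I_t) = rho**k * d_t.
    """
    return (rho ** horizon) * gap_now

gap_now = 0.5  # hypothetical current log-output gap between i and j

for k in (1, 10, 50, 200):
    stationary = forecast_gap(gap_now, rho=0.9, horizon=k)
    unit_root = forecast_gap(gap_now, rho=1.0, horizon=k)
    print(f"k={k:3d}  stationary gap forecast={stationary:.6f}  "
          f"unit-root gap forecast={unit_root:.6f}")
```

With persistence below one, the expected gap shrinks toward zero as the horizon grows, so the condition is met in the limit. With a unit root, the expected gap stays at its current value at every horizon, so the condition fails. This is why, under such an assumed process, the question turns on whether the output gap is stationary.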
Some have taken deviations from a reference economy as the measure of convergence in a multi-country situation. In this treatment, $y_{it}$ in equation (2) is replaced by $y_{1t}$, where 1 is the index for the reference country. Others have based their analysis of convergence on deviations from the sample average. In this treatment, $y_{it}$ is replaced by $\bar{y}_t$, the average for time t.

The notion of stochastic convergence is related to β-convergence. From the point of view of the above formulation, β-convergence tests whether, proceeding from $y_{it} > y_{jt}$, we have

(3)  $\lim_{k \to \infty} E(y_{i,t+k} - y_{j,t+k} \mid I_t) < y_{it} - y_{jt}$.

Countries i and j converge between dates t and (t + T) if there is a tendency for output differences to narrow over time, as formalized above. It is clear that β-convergence, in its across-economy sense, is a precondition for stochastic convergence as defined above. The relationship between the time series approach and β-convergence will be formulated more directly in section-7. However, the formulations above already show that time series analysis can be used to examine conditional convergence as well. For example, if a is different from 1 in equation (2), we effectively have a situation of differing steady states. Thus, the presence of an intercept term in the time series analysis of deviations, as above, is equivalent to allowing a time-invariant difference in the steady states of the economies.

The above gives a brief introduction to the different concepts of convergence. We now move on to the discussion of empirical evidence. Along the way, we will consider the econometric formulations that have been used to investigate these different concepts of convergence. We begin by considering evidence on β-convergence.

4. Initial Stage of Convergence Study

In the initial studies of convergence, the regression specifications were not formally derived from theoretical models of growth.
This does not mean that these studies had no connection with growth models. In fact, to the extent that this connection was less formal, some of these works derived inspiration from several theoretical paradigms and, therefore, had multiple focus. Although the concepts of unconditional and conditional convergence were not yet rigorously formulated and distinguished, works of the initial stage produced evidence that, with hindsight, can be attributed to both these notions of convergence.

A. Initial Evidence of Unconditional Convergence

Unconditional convergence was investigated in studies like Baumol (1986). The main part of his analysis was based on a sample of 16 OECD countries for which long term data were available from Maddison (1982). Baumol obtained a significant negative coefficient[20] on the initial income variable in a growth-initial level regression for these countries, which he took as strong evidence of (unconditional) convergence. However, prodded by Romer, Baumol also considered the relationship in an extended sample of 72 countries. In this larger sample, he could not find evidence of convergence.[21] Thus, Baumol’s study produced evidence of both presence and absence of unconditional convergence, depending on the sample. He also coined the expression ‘convergence club.’ Reasoning from the growth-initial level scatter, he suggested that, while there was no convergence in the larger sample as a whole, there existed ‘convergence clubs,’ within which evidence of convergence could be seen.[22] DeLong (1988) later showed that Baumol’s finding of unconditional convergence in the 16-country OECD sample suffered from selection bias.[23] Nevertheless, Baumol’s finding of absence of unconditional convergence in the larger sample of countries became an important point of departure for further discussion of convergence.

B. Initial Evidence of Conditional Convergence

While Baumol’s study focused on unconditional convergence, other studies, like Kormendi and Meguire (1985) and Grier and Tullock (1989), provided evidence that can be interpreted as evidence of conditional convergence. Reflecting research interests of an earlier period, these studies were also motivated by other issues, like the inflation-output trade off, the Phillips curve relationship, etc. Accordingly, the regressions in these works included variables representing these other relationships. However, the basic neoclassical paradigm was preserved through the inclusion of labor and (at least in some of the regressions) capital variables and also of initial real per capita GDP. In a sample of about fifty countries, Kormendi and Meguire’s regressions yielded negative β,[24] which could be taken as evidence of conditional convergence.

[20] Note that the numerical magnitudes of β from different studies are not directly comparable because of the differences in regression specification.
[21] Numerical results of this regression were not presented, but Baumol reported that it yielded a ‘slightly positive slope,’ indicating, if anything, a process of divergence.
[22] One example was the OECD group, already considered. Another example, according to him, was the group of formerly centrally planned countries. According to Baumol, such clubs consisted of countries which had a certain degree of homogeneity in ‘product mix and education’ enabling them to share in the ‘public good properties of the innovations and investments of other nations.’ (p. 1080)
[23] The proper criterion for sample selection for convergence study, DeLong pointed out, is ex-ante income level, and not ex-post. In particular, he showed that if, guided by the ex-ante criterion, Baumol’s OECD sample was modified slightly, the result of unconditional convergence no longer held. Baumol largely accepted this criticism. See Baumol and Wolff (1988).
Grier and Tullock (1989) extended Kormendi and Meguire’s study, with a larger sample size and a longer sample period. This allowed them to consider the issue of parameter stability across sub-samples and sub-periods.[25] For the OECD sub-sample, they found the coefficient on the initial income variable to be negative. For the larger ROW sub-sample, this coefficient turned out to be positive. However, upon splitting ROW into (three) smaller samples, they found the sign of β to vary.

C. Initial Evidence on Convergence and the Growth Theory Controversy

How did the initial studies of convergence relate their results to the growth theory controversy? Some of these, for example Baumol (1986) and Kormendi and Meguire (1985), either pre-dated or were contemporaneous with the publication of the pioneering NGT papers, and hence the growth theory controversy was not very prominent for them. Those studies that were subsequent to the advent of NGT did try to relate their results to the growth theory dispute. However, to the extent that their regression specifications were not formally linked with the growth models, such efforts had to be limited.[26] Thus, Grier and Tullock (1989), for example, could make only the broad observation that their results were generally supportive of NCGT, and that these results might show directions for further development of NGT.

However, from another point of view, the absence of a formal link between regression specification and growth model saved these initial studies from the within-across tension of the convergence concept. The interpretation of a negative initial income coefficient in these studies could be entirely cross-sectional. It was taken as evidence showing that, other things held constant, countries starting with lower levels of income grew faster.[27] When specifications are formally linked to growth models, it is not possible to limit the interpretation to such a purely cross-sectional and reduced form.

[24] Its value ranged from -0.0055 to -0.0091 in different specifications.
[25] Their sample consisted of 113 countries, and they split it into an OECD sample (24 countries) and a Rest of the World sample (ROW, comprising the remaining 89 countries). The time period considered was 1950-81.
[26] Grier and Tullock (1989) were themselves keenly aware of this limitation. For their own comment on this point, see (p. 260).
[27] For an example of such an interpretation, see Kormendi and Meguire (1985, p. 147).

D. Convergence and Human Capital

In the studies of conditional convergence discussed above, variables are considered in addition to those suggested by NCGT, and human capital does not figure among them. Yet we know that, in order to relax the constraint of diminishing returns, NGT primarily relies on human capital. It was, therefore, natural that human capital would soon appear in the empirical work on growth and convergence. This starts with Barro’s (1991) seminal study, which is directly inspired by the NCGT-NGT controversy and tries to look at the convergence issue from the perspective of NGT.[28] Accordingly, Barro starts by abandoning the standard neoclassical format and, instead, emphasizes the simultaneity among growth, investment, and fertility.[29] In view of this simultaneity, he runs separate sets of regressions with growth, investment, and fertility as dependent variables.[30] Barro’s perspective is reflected in the fact that his basic regressions of growth do not include physical capital and labor as explanatory variables. Instead, the variable of focus is human capital, which now appears in all the regressions.[31] These regressions show human capital to be generally important, and Barro interprets this as vindication of the NGT emphasis on human capital. To study convergence, Barro includes the initial income variable in his regressions.
He starts by reporting absence of unconditional convergence in a broad sample of 98 countries,[32] and interprets this as supportive of the NGT.[33] However, he finds that when the initial measures of human capital are included, β turns negative and significant.[34] This negative β leads Barro to conclude that the data support the convergence hypothesis in a “modified sense.”[35] Inclusion of human capital causes β to be negative in the regressions with investment as dependent variable as well, and Barro interprets this too as ‘consistent with the convergence implication of the neoclassical growth model.’[36] To the extent that Barro’s growth regressions include control variables other than initial income, his convergence in a ‘modified sense’ can be interpreted as conditional convergence. However, since investment and labor force growth rates, which are the main conditioning variables from the NCGT point of view, do not appear as control variables,[37] there may be some ambiguity regarding this interpretation.

[28] Also, Barro was not an outsider to this debate. He embarked on this empirical work after already making his own contribution to the development of NGT. In Barro (1990) he examined the role of government spending in the setting of an AK-style model of endogenous growth. Becker and Barro (1988) and Barro and Becker (1989) addressed the issue of fertility choice. These works, and his close association with the genesis of NGT, gave Barro quite a few propositions to test out, and he was quite explicit that he was doing so ‘using recent theories of economic growth as a guide.’ (p. 437)
[29] Barro argues for this simultaneity on the basis of new theories of growth. In particular, he cites extensively the conclusions of Rebelo (1991), Barro (1990), Romer (1990), Barro and Becker (1989) and Becker, Murphy and Tamura (1990).
[30] Barro’s work is, thus, not limited to the convergence issue; instead it addresses a host of other issues related to growth. It, therefore, to some extent shares the characteristic of multiple focus that we mentioned earlier.
[31] Other right hand side variables of Barro’s regressions include government consumption (reflecting his earlier interest in it), an index of relative inflation, indicators of political stability, and some regional and continent dummies. Although the inflation variable is interpreted as a measure of ‘market distortion,’ its inclusion can as well be linked to the output-inflation literature, in which Barro himself played no small role. See, for example, Barro (1976), (1977) and (1978). In fact, Barro has returned to this issue, as can be seen in Barro (1997).
[32] The correlation between the average growth rate of per capita real gross domestic product between 1960 and 1985 (GR6085) and the 1960 value of real per capita GDP (GDP60) is reported to be positive, 0.09.
[33] “This finding accords with recent models, such as Lucas (1988) and Rebelo (1991), that assume constant returns to a broad concept of reproducible capital, which includes human capital. In these models the growth rate of per capita product is independent of the starting level of per capita product.” (p. 408)
Towards the end of the paper, however, Barro presents a growth-initial level regression of more conventional (meaning neoclassical) format, i.e., inclusive of investment and population growth rates as controls.[38] He finds that the β̂ obtained from this regression is also negative and little different in magnitude[39] from that obtained from the earlier set of growth regressions, which do not include investment and fertility rates as controls.[40]

The evidence produced by the convergence studies with informal specifications can, therefore, be summarized as follows: (a) absence of unconditional convergence in larger samples of countries, (b) presence of absolute convergence in selected samples of countries,[41] and (c) presence of conditional convergence even in larger samples of countries.[42] In addition, Barro’s work established the importance of the role of human capital as a conditioning variable.[43] However, the formal concept of conditional convergence had not yet arisen, and, hence, there was no attempt in these studies to portray the controls as determinants of the steady state.[44] The interpretation of the initial income coefficient was limited to what is standard for a reduced form regression coefficient, and the within-across tension did not arise. Since the specifications were not formally linked with growth models, there was no attempt to recover structural parameters from the estimated coefficients. For these developments, we had to wait till the next stage of convergence research.

[34] The numerical value of β in Barro’s growth regressions ranges from -0.0062 to -0.0111, with that for the base regression (no. 1) being -0.0075.
[35] (Barro 1991, p. 409, our italics). We can see the concept of conditional convergence emerging here.
[36] (Barro 1991, p. 427). With the ratio of private investment to GDP as the dependent variable, the coefficient on the initial income is significant and ranges between -0.0093 and -0.0098.
[37] These, however, appear as the main controls in other pre-model studies of conditional convergence, as we saw to be the case with Kormendi and Meguire (1985) and Grier and Tullock (1989).
[38] The argumentation for the specification is, however, not neoclassical. Instead, Barro observes that NGT relationships among growth, investment, and fertility imply that residuals from the growth regression will be positively related with those from the investment regression, and negatively related with residuals from the fertility regression. He justifies the conventional format of the growth regression as an alternative way of checking whether the stipulated relationships among the residuals are true.
[39] It is again significant and ranges from -0.0068 to -0.0077, with the base case being -0.0077.
[40] Barro interprets this as showing that the negative effect of the initial income on growth does not work through its effects on investment or fertility. Instead, it works mainly through a lower rate of return on investment. (p. 430)
[41] The selection may be flawed, however.
[42] Evidence on this point was still ambiguous.

5. Neoclassical Equation for Convergence Study

The evolution of the concept of conditional convergence from its inchoate to its precise stage, and the associated transition of the convergence regression from informal to formal model-based specification, are accomplished in Barro and Sala-i-Martin (1992) (henceforth BS) and Mankiw, Romer, and Weil (1992) (henceforth MRW). In both these works, the regression specification is derived formally from the neoclassical growth model. MRW work with the original Solow model, while BS use the Cass-Koopmans optimal savings version of the NCGT. Since the appearance of these studies, the neoclassical convergence equation has occupied the center stage of convergence research, and it is virtually impossible to review the literature without bringing this equation into the picture.
The equation is also needed to introduce the necessary notation.[45] The exercise involves first deriving the law of motion around the steady state and then translating this motion into an estimable regression equation.

[43] This enthusiasm regarding human capital was, however, not shared by all. Just about the same time as Barro’s work appeared, Summers and DeLong (1991) produced work emphasizing the role of equipment investment in explaining growth. In addition to showing a strong positive effect on growth, they also found that this effect does not depend on ‘education infrastructure.’ Inclusion of human capital variables, which appeared in Barro’s regression, did not affect the coefficient of the equipment investment variable. Finally, they also thought that the regression results showed the presence of positive externalities arising out of equipment investment. In Summers and DeLong (1993), they showed that the above results held for the developing countries as well.
[44] This was also evident from the fact that, in Barro’s regressions, human capital was included in the form of an indicator of initial condition.
[45] The texts by Barro and Sala-i-Martin (1995) and Mankiw (1995) also provide the derivation.

A. Deriving the Rate of Convergence

The dynamics of capital in the Solow model are given by

\[ \dot{\hat{k}} = s f(\hat{k}) - (n + g + \delta)\hat{k}, \tag{4} \]

where $\hat{k} = K/(AL)$ is capital per effective labor, $\dot{\hat{k}}$ is the time derivative of $\hat{k}$, and $f(\hat{k})$ is the production function normalized in terms of effective labor. Also, (s, n, g, δ) are the rates of saving, population growth, technological progress, and depreciation, respectively. A first order Taylor expansion of the right hand side around the steady state gives

\[ \dot{\hat{k}} = \left[ s f'(\hat{k}^*) - (n + g + \delta) \right] (\hat{k} - \hat{k}^*). \tag{5} \]

Substituting for s using the steady state relationship, $s f(\hat{k}^*) = (n + g + \delta)\hat{k}^*$, gives

\[ \dot{\hat{k}} = \left( \left[ f'(\hat{k}^*)\hat{k}^* / f(\hat{k}^*) \right] - 1 \right)(n + g + \delta)(\hat{k} - \hat{k}^*). \tag{6} \]

Under the assumption that capital earns its marginal product, $f'(\hat{k}^*)\hat{k}^* / f(\hat{k}^*)$ equals the steady state share of capital in income, α. In the Cobb-Douglas case, this will also be the exponent of capital in the production function. Using this relationship we then get

\[ \dot{\hat{k}} = \lambda(\hat{k}^* - \hat{k}), \tag{7} \]

where

\[ \lambda = (1 - \alpha)(n + g + \delta). \tag{8} \]

Evidently, λ gives the speed at which the gap between the steady state level of capital and its current level is closed, and it has come to be known in the literature as the rate of convergence. The same rate holds for convergence in terms of income per effective labor. This is because $\hat{y} = f(\hat{k})$, which upon expansion at $\hat{k}^*$ and differentiation with respect to time gives

\[ \dot{\hat{y}} = f'(\hat{k}^*)\,\dot{\hat{k}}. \tag{9} \]

As a first order approximation, we should therefore have

\[ \hat{y}^* - \hat{y} = f'(\hat{k}^*)(\hat{k}^* - \hat{k}). \tag{10} \]

By substitution we then get

\[ \dot{\hat{y}} = \lambda(\hat{y}^* - \hat{y}), \tag{11} \]

where λ is again the rate of convergence given by equation (8).

B. Deriving the Equation for Testing Convergence

Switching to logarithms, solving this first order non-homogeneous differential equation, and rearranging, we get from (11),[46]

\[ \ln \hat{y}(t_2) - \ln \hat{y}(t_1) = (1 - e^{-\lambda\tau})\left( \ln \hat{y}^*(t_1) - \ln \hat{y}(t_1) \right), \tag{12} \]

where $t_1$ denotes the initial period, $t_2$ the subsequent period, and $\tau = (t_2 - t_1)$.[47] If we now substitute for $\hat{y}^*$ from equation (1) above, we get

\[ \ln \hat{y}(t_2) - \ln \hat{y}(t_1) = (1 - e^{-\lambda\tau})\frac{\alpha}{1-\alpha}\ln(s_{t_1}) - (1 - e^{-\lambda\tau})\frac{\alpha}{1-\alpha}\ln(n_{t_1} + g + \delta) - (1 - e^{-\lambda\tau})\ln \hat{y}(t_1). \tag{13} \]

Clearly, this is again the growth-initial level equation, but now the coefficients are formally linked with the structural parameters of the NCGT. For example, $\beta = -(1 - e^{-\lambda\tau})$, and hence it is now possible to recover the value of λ from the estimate of β. The value of λ, in conjunction with the other estimated coefficients of the equation, yields the values of other structural parameters of the model, like α. The question that now arises is how to estimate this equation.
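The mapping from an estimated initial-income coefficient to the rate of convergence can be sketched in a few lines. This is only an illustration, not taken from any of the studies discussed: the function names `lambda_from_beta` and `half_life` are hypothetical, and the half-life formula ln 2 / λ follows from equation (12), under which the gap to the steady state shrinks by the factor e^{-λτ} over an interval of length τ.

```python
import math


def lambda_from_beta(beta, tau):
    """Invert beta = -(1 - exp(-lambda * tau)) for lambda.

    beta is the coefficient on initial (log) income and tau is the
    length of the period over which growth is measured, in years.
    """
    if not -1.0 < beta < 0.0:
        # Outside (-1, 0) the implied lambda is non-positive or undefined.
        raise ValueError("beta must lie strictly between -1 and 0")
    return -math.log(1.0 + beta) / tau


def half_life(lam):
    """Years needed to close half of the gap to the steady state."""
    if lam <= 0.0:
        raise ValueError("lambda must be positive")
    return math.log(2.0) / lam


# Round trip with an assumed rate of 2 percent per year over 25 years.
beta_25 = -(1.0 - math.exp(-0.02 * 25))
print(lambda_from_beta(beta_25, 25))

# Half-lives for two illustrative rates of convergence.
print(half_life(0.00606), half_life(0.0142))
```

Because the inversion is exact, any imprecision in the recovered λ comes from the estimate of β, not from the mapping itself.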
[46] This derivation is most detailed in Barro and Sala-i-Martin (1995, pp. 87-88).
[47] A finer point here concerns whether to make $\ln \hat{y}^*$ contingent on $t_1$ or not. If it is assumed that the determinants of the steady state income remain constant between $t_1$ and $t_2$, it does not matter whether $\ln \hat{y}^*$ is made contingent on $t_1$ or not. Generally, in implementation, it is assumed that the determinants of steady state income, such as s, n, δ, and α, remain the same between $t_1$ and $t_2$. This suggests that it is not necessary to make steady state income contingent on the initial period in considering transitional dynamics. This is certainly the case in the continuous time setting, when, theoretically, the difference between $t_1$ and $t_2$ is instantaneous. Under the assumption that s, n, δ, and α remain constant over the transition period considered, this is also true if we are dealing with steady state income per effective labor, as in equation (13) above. However, when dealing with steady state income per capita (as is usually the case), this would not be true. As we can see from equation (1), the formula for steady state income per capita has the term $A_0 e^{gt}$, which will differ if evaluated at $t_2$ instead of at $t_1$. In the logarithmic version of the equation, this will generally imply differences in terms involving g and will not affect the basic conclusions. However, it is worth being aware of.

C. Tension between Within and Across Dimensions of Convergence

Note that the above derivation of λ and equation (13) is entirely on the basis of the accumulation process within an economy, and there is no reference to what is happening across economies. This shows that λ essentially refers to a within-economy process and is determined by the values of α, n, g, and δ of the economy concerned. It would, therefore, seem natural and proper to estimate equation (13) and λ on the basis of time series data for a particular country.
However, as we saw, researchers have instead been estimating equation (13) using cross-section data. The main reason for this is that, from the beginning, convergence arose as a concept pertaining to the across-economy growth process. The question that resides in the mind of convergence researchers is not so much whether an individual country is closing the gap between its own current and steady state levels of income as whether poorer countries are narrowing their gap with the richer countries. From this conceptual point of view, cross-section data is the natural place to look for the presence or absence of convergence. But this introduces a tension in the convergence parameter λ. While according to equation (13), λ is the measure of the speed at which an economy proceeds towards its own steady state level, the λ estimated from a cross-section equation is generally interpreted as the speed at which poorer economies are closing the income gap with the richer countries.

This tension was not apparent so long as the cross-country regression specifications were informal. In those regressions, it was possible to limit oneself to the reduced form interpretation of β, though this also meant that no information regarding the structural parameters could be obtained. Thus, while the formal derivation of the neoclassical convergence equation has been a significant step forward, it has at the same time brought to the fore the within-across tension in the concept of convergence and, correspondingly, in the parameter λ. We shall see some concrete manifestations of this tension in the estimation results below.

6. Cross-section Approach to Convergence

A. Cross-section Estimation of the Neoclassical Convergence Equation

One of the most successful implementations of the formal cross-section approach has been MRW itself. In this seminal study, of the six parameters of the model, two, namely s and n, are assumed to differ across countries.
The values of three of the rest, namely α, g, and δ, are supposed to be the same for all countries. Differences in A0 are assumed to be part of the error term, and this allows estimation by OLS. MRW find that with only s and n as explanatory variables, the regression runs well, but the implied values of α are too high.[48] The accompanying result is that the implied values of the rate of convergence, λ, prove to be too low. For the NONOIL sample,[49] for example, λ equals 0.00606, implying a half-life of 114 years, which is indeed very long. To overcome this problem, MRW suggest augmentation of the Solow model by inclusion of human capital, in exactly the same way as physical capital, only with a different exponent, say φ. Regression on the basis of this augmented model produces more desirable results. The value of α decreases to an empirically plausible level (0.48 for the NONOIL sample), and the value of λ increases to around 0.02 (0.0142 for NONOIL, implying a half-life of 49 years).

Similar results on conditional convergence across countries are also presented in BS. These are mainly drawn from Barro (1991), with two differences. First, the regressors are now clearly interpreted as determinants of the steady state.[50] Second, the structural parameter λ is now recovered from the estimated regression coefficient. For a sample of 98 countries similar to MRW’s NONOIL, BS report λ̂ to be 0.0184. However, although BS use the neoclassical convergence equation to recover λ, they do not quite apply it to determine the right hand side variables of the regression.

[48] MRW do not report the results of the restricted version of this regression, hence we do not have a unique estimate of α. Based on the coefficient of the s variable its value would be 0.82, while that based on (n + g + δ) would be 0.68. Both of these are for the NONOIL sample and are far greater than the share of capital in the national income in these countries, computed on the basis of the national accounts data. Note that these values agree with the estimates produced earlier by Romer (1989a) from a growth accounting exercise under similar assumptions.
[49] This is a sample of 98 countries which included almost all the sizable countries of the Summers-Heston data set except those for which extraction of oil was the dominating source of income.
[50] “We interpret these variables as proxies for the steady-state value of output per effective worker and the rate of technological progress.” (p. 246). Also, note that BS use the identical term, ‘conditional convergence,’ to interpret their equation, which is analogous to (13) above. To quote: “The theoretical relation in equation (15) predicts conditional convergence, (our italics) that is, negative relation between $\log y_{i,t_0}$ and the subsequent growth rate if we hold constant the steady state position, $\log \hat{y}_i^*$, and the steady state growth rate.” (p. 243)

B. β-Convergence across US States

Convergence study is not limited to samples of independent countries. Researchers have also addressed the issue in the context of regions of the same country. In particular, whether or not convergence holds for the states of the US has been an attractive research topic. The initial results in this regard are again presented by BS. Since the derivation of the convergence equation in BS proceeds from the Cass-Koopmans version of NCGT, in implementation they ideally need to control for the ‘deep’ behavioral parameters. However, BS make the assumption that the steady states of the US states are the same, and this relieves them from the difficult task of getting data on the deep parameters.[51] This also means that BS consider convergence across US states to be a situation of unconditional convergence.
However, in some of their regressions BS include regional dummies and a variable proxying for output composition. Inclusion of these variables makes it a little ambiguous whether it is unconditional or conditional convergence that is investigated. In any case, they find significant evidence of convergence across the US states,[52] and the estimated rate of convergence turns out to be in the neighborhood of 2 percent per year.

In contrast with BS, Holtz-Eakin (1993) emphasizes the possible differences in steady state among the US states and, thereby, considers the situation to be one of conditional convergence. He basically uses a human capital augmented version of the neoclassical convergence equation and implements a variant of pooled regression. Upon inclusion of variables that either represent or proxy for the determinants of steady state, Holtz-Eakin obtains higher estimates of the rate of convergence.

[51] “... we assume that the steady state value, $\hat{y}_i^*$, and the rate of technological progress do not differ across states.” (p. 227)
[52] This was true in terms of both per capita income and product and for the different time periods considered. BS left it as an unresolved puzzle why this rate proved to be similar in terms of income and product. Also, in some of the regressions, BS included regional dummies and/or a variable proxying for output composition. Inclusion of these variables changes the nature of convergence, to some extent, from unconditional to conditional.

C. Manifestation of Within vs. Across Tension in Convergence Results

The within-across tension of the convergence concept and the parameter λ can be numerically illustrated on the basis of the results of the cross-section studies discussed above. Equation (8) clearly shows λ to be a function of α, g, δ, and n. Thus, even if α, g, and δ are assumed to be the same, a different n implies a different value of λ. In most convergence studies, as we saw above, n is allowed to vary across countries. Hence, we have a situation where the assumption of variation runs into conflict with the assumption of commonness within the same parameter. One way of seeing this conflict is to use the estimated common values of α and φ and the actual values of n (g and δ are given) to produce country-specific values of λ. The results from such computation show that these values vary widely across the countries included in the sample. Another way of seeing this tension is to take the converse route and compute the implied value of a common population growth rate from the derived value of λ using equation (8). For the augmented model, the counterpart of equation (8) is λ = (1 − α − φ)(n + g + δ), and MRW set g + δ = 0.05. For example, if we do this on the basis of the estimated values of λ, α, and φ in the restricted version of MRW’s augmented model,[53] we get n equal to -0.001, 0.0064, and 0.0028 for the NONOIL, INTER, and OECD samples respectively.[54] (For NONOIL, n = 0.0142/(1 − 0.48 − 0.23) − 0.05 ≈ -0.001.) These are far from the representative values of n for any of these samples.[55] This situation is not unique to MRW. The problem is rather generic to studies that use the cross-section dimension of data (therefore including panel data sets) to study convergence.

D. Other Model Based Cross Section Studies of β-Convergence

Since the formal derivation of the convergence equation by MRW and BS, it has become quite standard for empirical work on convergence to be based on this equation. In particular, the MRW specification, based on the original Solow model, became the point of departure for many studies. One reason for this is that it is easier to implement, because it does not involve deep behavioral parameters. A variety of issues have been explored using the MRW specification, even within the cross-section set-up. For example, Chua (1992) uses it to study external economies arising from regional spillovers. His exercise shows the presence of regional spillover, mainly from human capital. However, the spillover is not strong enough to support NGT.
Sala-i-Martin (1996b) uses the neoclassical specification to study convergence across the European economies. Shioji (1995) conducts a similar analysis of convergence across the Japanese prefectures. Other researchers have used the formal cross-section approach to explore issues of measurement error, outliers, etc. These studies produce a wide variety of numerical results; however, the basic conclusion regarding conditional convergence is not refuted.

[53] These values are 0.0142, 0.48 and 0.23 respectively.
[54] These are two other samples considered in MRW. INTER is the sample of 75 countries obtained by dropping 23 countries, for which data are less reliable, from the NONOIL sample. OECD is the sample of 22 OECD member countries.
[55] The negative n for the NONOIL sample is problematic. The other issue here concerns the correspondence of the theoretical and empirical relationships between n and λ. Theoretically, higher n would yield higher λ, which is what we got when we imposed this theoretical relationship in our reverse calculation above. But in actuality, n is highest on average for the NONOIL sample and lowest for OECD. Yet the estimated implied value of λ turned out to be the smallest for the former and greatest for the latter. Other researchers have also drawn attention to these problems. See, for example, Caselli et al. (1995) and Lee et al. (1995).

E. Research on Club-Convergence

Durlauf and Johnson (1995) use the neoclassical convergence equation to investigate ‘club-convergence.’ Alluding to theoretical models of multiple equilibria, they observe that convergence in large samples of countries (global convergence) does not hold or proves weak because, in these samples, countries belonging to different equilibria (or ‘regimes’) are lumped together. The proper thing, according to them, is to identify groups of countries, the members of which share the same equilibrium, and to check whether convergence holds within the groups (local convergence).
With this motivation, they conduct two sets of exercises. In the first, the countries are classified on the basis of arbitrarily chosen cut-off levels of initial income and literacy. To the extent that such exogenous splitting may create selection bias, Durlauf and Johnson present a second exercise in which the splitting is endogenized using the regression-tree method. In either case, the results prove to be qualitatively similar. Estimated parameter values differ significantly across the groups, particularly in the case of endogenous splitting. Also, the rates of convergence within the groups, in general, prove to be higher than in the whole sample. Durlauf and Johnson interpret the observed parameter instability as indicative of countries belonging to different regimes.56
The difficulty here lies in distinguishing evidence of club-convergence from that of conditional convergence. This, in turn, is related to the criteria used to group the countries in order to demonstrate club-convergence. Clearly, steady state determinants cannot be used for this purpose, because differences in them cause equilibria to differ even under conditional convergence. Use of time-varying characteristics, like the (initial) level of income or literacy, also involves problems.57 Thus, Durlauf and Johnson’s finding of faster convergence within groups than in the broader sample is also compatible with conditional convergence.
56 Since this instability pertained to groups classified according to both initial income and human capital levels, the authors concluded that both of these variables were important in identifying the ‘regimes.’
57 All the countries, at one point of their history or another, have to pass through all cut-off points (of income or literacy). Thus, if equilibria are contingent on the cut-off levels, then all the countries should end up having the same equilibrium. It may be said that the (initial) level in combination with some time-invariant characteristic does the job. Or, perhaps, it is the ratio of levels that matters. This is because although the countries cross all the level values sooner or later, they may not do so observing the same proportion of the levels. However, Durlauf and Johnson found that output dominated literacy as criterion for group/regime identification.
In sum, we see that the switch from informal to formal specifications elevated the convergence discussion from one about the broad presence or absence of convergence to one dealing with precise values of the structural parameters of the growth model. The cross-section studies agree regarding the broad result of conditional convergence, but no consensus value of λ or α is obtained. However, we find that, in general, the value of λ increases as more differences in the steady state are controlled for, either by inclusion of relevant variables in the regression or by selecting a more ‘homogeneous’ sample.
7. Panel Approach to Convergence Study
In studying convergence, the cross-section studies focused on the steady state differences in preference variables, like investment rate and labor force growth rate. However, as equation (1) shows, the steady state is also characterized by technology. It is in accounting for differences in technology that the cross-section approach encountered an important limitation. This gave rise to the panel approach to convergence study.
A. Omitted Variable Bias Problem of the Cross-section Regression
The limitation of the cross-section approach in controlling for the technology differences creates an omitted variable bias problem. This can be illustrated using equation (13). Although this equation is in terms of income per effective labor, in actual implementation, researchers invariably work with income per capita. Expressed in terms of per capita income and rearranging, we get
$$ \ln y_{t_2} = (1 - e^{-\lambda\tau}) \frac{\alpha}{1-\alpha} \ln s_{t_1} - (1 - e^{-\lambda\tau}) \frac{\alpha}{1-\alpha} \ln(n_{t_1} + g + \delta) + e^{-\lambda\tau} \ln y_{t_1} + (1 - e^{-\lambda\tau}) \ln A_0 + g\,(t_2 - e^{-\lambda\tau} t_1). \tag{14} $$
The A0 term on the right hand side is the productivity shift term. MRW, for example, recognize the importance of this term and observe that, “the A0 term reflects not just technology but resource endowments, climate, institutions, and so on; it may therefore differ across countries.” (p. 410-1) However, in actual estimation, they regard A0 as part of the error term and assume it to be uncorrelated with the included variables, s and n. But this assumption contradicts the expansive definition of A0 that MRW themselves provide. Given this definition, it is difficult to argue that A0 is uncorrelated with s or n. Actually, in cross-section regression, this assumption becomes an econometric necessity for identification. This is because there are no good measures of A0, and, even if some proxy variables are included, there still remains a part that is unobservable or unmeasurable and yet correlated with s and n. On the other hand, claiming A0 to be the same for all countries is not appealing and contradicts its definition. Such a claim implies that the aggregate production function is parametrically identical across countries. This kind of strict homogeneity of the production function is not realistic, and researchers have been objecting to it for quite some time.58 However, ignoring A0, by treating it as part of the uncorrelated error term, creates a problem of omitted variable bias.59
B. Panel Estimation of the Convergence Equation
Switching to the panel framework can solve this problem by making it possible to control for the unobservable and unmeasurable part of A0 in the form of individual (country) effects.
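The omitted variable problem and the role of country effects can be illustrated with a small simulation. What follows is only a minimal sketch under simplifying assumptions that are not part of the studies reviewed here: the model is static rather than dynamic, the data are artificial, and the function names and parameter values (simulate_panel, b_true, the 0.8 loading of the regressor on the country effect) are hypothetical. It compares pooled OLS, which leaves the country effect in the error term, with the within (LSDV) estimator, which removes the effect by demeaning each country's data.

```python
import numpy as np


def simulate_panel(n_countries=200, n_periods=6, b_true=0.5, seed=0):
    """Artificial static panel y_it = b_true * x_it + mu_i + e_it.

    The regressor x_it is deliberately correlated with the country
    effect mu_i (hypothetical loading of 0.8), mimicking the
    correlation of A0 with s and n discussed in the text.
    """
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=n_countries)
    x = 0.8 * mu[:, None] + rng.normal(size=(n_countries, n_periods))
    e = 0.5 * rng.normal(size=(n_countries, n_periods))
    y = b_true * x + mu[:, None] + e
    return x, y


def pooled_ols_slope(x, y):
    """Slope from OLS on the pooled data; mu_i stays in the error term."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def within_slope(x, y):
    """Slope from the within (LSDV) estimator; country means are removed."""
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    return float((xw * yw).sum() / (xw ** 2).sum())


if __name__ == "__main__":
    x, y = simulate_panel()
    print("pooled OLS slope:", pooled_ols_slope(x, y))
    print("within slope:    ", within_slope(x, y))
```

In this artificial setting, pooled OLS overstates the slope because the regressor is positively correlated with the omitted country effect, while the within estimator stays close to the true value. In a dynamic equation with a lagged dependent variable, the within estimator has small sample problems of its own, so the sketch should not be read as a statement about any particular estimator's performance on actual growth data.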
58 Durlauf and Johnson (1995) also noted the issue of potential differences across countries in the aggregate production function. Observing wide variation in the estimated parameter values across the groups, they observed that, “aggregate production function differs substantially across countries.” In particular, large differences in the intercept term, which is related to A0 of equation (14), led them to conclude that, “different economies have access to different aggregate technologies.” (p. 375) Hence, they expressed the view that “... the Solow growth model should be supplemented with a theory of aggregate production function differences in order to fully explain international growth patterns.”
59 There are other possible sources of bias for the cross-section regression as well. See Lee et al. (1995) and Evans and Karras (1996a) on this point. Lee et al. (1995), for example, draw attention to another possible bias of the cross-section growth regression. They start with an equation similar to (13) below and assume t and t−1 to be one year apart. Since the cross-section regression generally considers growth over a long (say, 25 year) period, they try to have the equation correspond to that by assuming initial t=0, and iterating it forward to T. This gives rise to, among others, a composite error term $\xi_{iT} = \sum_{j=0}^{T-1} \beta^{j} \varepsilon_{i,T-j}$. Thus, possible serial correlation in ε now acts as a source of bias. So does possible across-country variation in A0 and g. See Lee et al. (1995, Appendix A) for details.
Using the notation of the panel data literature, equation (14) can be written as
$$ y_{it} = (1 + \beta)\, y_{i,t-1} + \beta \psi\, x_{i,t-1} + \eta_t + \mu_i + \varepsilon_{it}, \tag{15} $$
where y_{it} = ln y_{t_2}, y_{i,t−1} = ln y_{t_1}, (1 + β) = e^{−λτ}, ψ = −α/(1 − α), x_{i,t−1} = ln s_{i,t−1} − ln(n_{i,t−1} + g + δ), µ_i = (1 − e^{−λτ}) ln A(0), and η_t = g(t_2 − e^{−λτ} t_1).60 There are many different ways to model and deal with the country effect term µ_i.
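Given estimates of the coefficients of equation (15), the structural parameters can be recovered by inverting the definitions above: λ = −ln(1 + β)/τ and, since the coefficient on x equals (1 − e^{−λτ})α/(1 − α), α = c/(c + 1 − e^{−λτ}) where c denotes that coefficient. The following Python sketch carries out this inversion. The function names, the illustrative parameter values, and the five-year value of τ are assumptions made for the example only.

```python
import math


def implied_coefficients(lam, alpha, tau):
    """Map structural parameters to the coefficients of equation (15).

    Returns (coef_lagged_y, coef_x), where
    coef_lagged_y = 1 + beta = exp(-lam * tau) and
    coef_x = beta * psi = (1 - exp(-lam * tau)) * alpha / (1 - alpha).
    """
    coef_lagged_y = math.exp(-lam * tau)
    coef_x = (1.0 - coef_lagged_y) * alpha / (1.0 - alpha)
    return coef_lagged_y, coef_x


def implied_structural_params(coef_lagged_y, coef_x, tau):
    """Invert the mapping: recover (lam, alpha) from estimated coefficients."""
    if not 0.0 < coef_lagged_y < 1.0:
        raise ValueError("convergence requires 0 < (1 + beta) < 1")
    lam = -math.log(coef_lagged_y) / tau
    alpha = coef_x / (coef_x + 1.0 - coef_lagged_y)
    return lam, alpha


def log_initial_technology(mu_i, coef_lagged_y):
    """Recover ln A(0) from a country effect: mu_i = (1 - exp(-lam*tau)) ln A(0)."""
    return mu_i / (1.0 - coef_lagged_y)


if __name__ == "__main__":
    # Illustrative values only (hypothetical): lam = 0.05, alpha = 0.4, tau = 5.
    g, c = implied_coefficients(0.05, 0.4, 5.0)
    print(implied_structural_params(g, c, 5.0))
```

The last helper uses the definition of µ_i to back out ln A(0) from an estimated country effect; differences of these values across countries would then correspond to log ratios of A(0).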
However, it is clear that, in view of the correlation of A0 with s and n, a random-effects specification of µ_i is not appropriate. The most appropriate choice, it seems, is Chamberlain’s (1982, 1983) model of correlated effects and the accompanying method of Minimum Distance (MD) estimation.61 This was implemented in Knight et al. (1993) and Islam (1995). The estimated values of λ for the NONOIL, INTER and OECD samples in Islam (1995) were 0.0434, 0.0417 and 0.0670 respectively. The analogous value reported in Knight et al. (1993) for the NONOIL sample was even higher, 0.0652. Also, the implied values of α (output elasticity with respect to physical capital) were now much lower and more in conformity with its commonly accepted empirical values. The estimated values of α in Islam (1995) for the three samples above were 0.4397, 0.4245 and 0.2972 respectively. The value of α reported for the NONOIL sample in Knight et al. (1993) was even lower, 0.335. The results also indicate that the way human capital influences output is, perhaps, different from the way physical capital does.62 In particular, it seems that human capital impacts output largely through its influence on the overall technological level. One advantage of the panel approach is that it produces estimates of µ_i, which are indirect estimates of the aggregate technological level. These provide the point of departure for a second level of analysis geared to ascertaining the determinants of technological differences.
60 Recall our earlier discussion about whether the variables representing determinants of steady state should be made contingent on the initial period or not. The formulation in equation (14) is on the basis of such contingent ŷ*. However, under the assumption that s and n remain constant between t and t−1, which is the natural (even necessary) assumption to make here, we can replace x_{i,t−1} in the equation above by x_{it}.
61 Instead of hiding or sidetracking, the correlated effects model allows the correlation between µ_i and the x_{it}’s to come to the fore and play out its role in the estimation process.
62 Incorporation of human capital in the panel analysis led to ‘anomalous’ results, with the coefficient of the human capital variable turning out to be negative and generally insignificant. This agreed with earlier results regarding human capital obtained from pooled regressions (see for example De Gregorio (1992)) and also the results obtained by Benhabib and Spiegel (1994). To the extent that the human capital data are weak, there may be data issues related to this result too. The schooling data used for construction of the human capital variable are yet to be adjusted for quality differences. Also, many processes of human capital formation that occur outside of formal schooling are not included in this variable. Nevertheless, these results indicate that the channel of influence of human capital on output may be more complicated than suggested by MRW through their proposal of multiplicative inclusion of human capital in the aggregate production function, alongside physical capital. Benhabib and Spiegel (1994) also found support for multiple and more complex channels of influence of human capital on output. Panel results in Islam (1995) show very strong positive correlation between measures of human capital and estimated values of A0. This provides a basis for the suggestion that the route along which human capital influences output may run through A0.
Researchers have since used the panel approach to investigate many other issues of convergence. These are of two types. The first refers to those issues that could not be adequately examined in the cross-section set-up. The second comprises those new issues that are specific to panel estimation. In the following, we discuss some issues of both these types.63
63 It is necessary to note that panel estimation is not the same as pooled estimation. Sometimes, researchers have labeled their regressions as panel when, in fact, these are pooled regressions. Panel estimation presupposes explicit modeling of the individual effect term, while in a pooled regression, this term is relegated to a single-component error. A pooled regression may also be conducted in a SURE multi-equation framework with the equations distinguished only by the time period covered. This involves some specification of the error covariance matrix and therefore has some resemblance to (GLS) estimation under the random-effects assumption. However, the error covariance structures in these two cases are not the same. Moreover, as we have noted before, the random-effects specification is not appropriate for estimation of the convergence equation.
C. Issue of Endogeneity Bias
One issue of the first type is that of endogeneity bias. By itself, equation (15) does not pose a problem of endogeneity. The variables on the right hand side are pre-determined. However, Chamberlain’s MD estimation procedure uses both past and future values of x to substitute out µ_i (and y_{i0})64 and hence requires strict exogeneity of the x_{it}’s for the validity of estimation. Caselli et al. (1996) raise this issue and use Arellano’s GMM procedure to avoid this problem. This procedure eliminates µ_i by first differencing and uses lagged values of the y_{it}’s and x_{it}’s as instruments. Using this estimator, Caselli et al. find the estimated values of λ and α to be 0.128 and 0.104, respectively, when the regression is based on the original Solow model. These results, however, do not support the equality of magnitudes of the coefficients for s and n, and this leads Caselli et al. to reject the Solow model. On the basis of the augmented (as per MRW) Solow model, the estimated values of λ, α, and φ turn out to be 0.0679, 0.491, and −0.259 respectively. The negative value of the human capital coefficient φ now leads the authors to reject the augmented Solow model. The authors then switch to informal, extended specifications of the growth-initial level regression and, based on their results, suggest an estimated value of λ of around ten percent. They view such a value of λ to be compatible with the open economy version of the Cass-Koopmans variant of neoclassical growth theory.
Caselli et al.’s attempt to check and correct for endogeneity bias has been a worthy effort. There are a few issues, however. First, the authors do not work out the value of α that corresponds to their suggested value of λ. Based on their results for the original Solow model, a λ equal to ten percent would imply an α equal to 0.1258, which is too low an estimate of capital’s share in output, even if capital is defined as physical capital only. The second issue concerns their switch from the model-based formal specifications to the extended informal specifications. We will come to this issue shortly.
D. Heterogeneity in Steady State Growth Rate
Another issue that has been explored using the panel approach is the issue of further parametric heterogeneity of equation (15). In particular, researchers have been interested in heterogeneity of the steady state growth rate, g. The question whether or not countries have a common g is of considerable theoretical and empirical interest. Theoretically, it remains controversial. (See Romer’s comment on Mankiw’s paper in Mankiw (1995).) Empirically, testing of the hypothesis of common g is made difficult by the fact that data only give the actual growth rates, which are generally a combination of steady state and transitional growth rates. Lee et al. make a commendable effort to find out the implication of heterogeneous g for the parameter estimates.
Based on a panel for 1960-1989, they find that, with heterogeneity in g, the rates of convergence for the NONOIL, INTER, and OECD samples increase to 0.1845, 0.1521, and 0.1495, respectively.65 One consequence of extending heterogeneity to g (in addition to A0) is that this leads to a virtual collapse of the concept of convergence, so far as its across-dimension is concerned. This is because convergence under heterogeneity of both A0 and g implies that the economies are converging not only to different levels of per capita income but also to different growth rates. This interpretation of conditional convergence also makes the NCGT equilibrium situation observationally indistinguishable from a situation characterized by NGT. One problem with Lee et al.’s results is that the convergence rates obtained under heterogeneous g are very high. The authors do not report the corresponding values of α, the capital share. But, when worked out, the values of α are likely to be very low. This makes it somewhat unclear whether the results can be interpreted to be supportive of heterogeneous g.
64 The procedure begins with recursive substitution for y_{i,t−1}, so that at the end we just have y_{i0} on the right hand side.
65 The correspondence between Lee et al.’s samples and MRW’s NONOIL, INTER, and OECD samples is approximate.
E. TFP-Convergence and the Panel Approach
One useful feature of the panel approach is that it provides a link with studies of TFP-convergence. Originally, TFP studies were based on time series data of individual countries and were focused on computation of TFP growth rates. These studies did not consider the issue of convergence in TFP levels. International comparison of relative TFP levels was initiated by Jorgenson and Nishimizu (1978) and was carried forward by Christensen, Cummings, and Jorgenson (1981). Dougherty and Jorgenson (1996, 1997) have recently resumed this line of work.
Wolff (1991) and Dollar and Wolff (1994) have also considered the issue of TFP-level convergence along similar methodological lines. This methodology consists of two steps. The first step consists of a growth-accounting exercise using time series data in order to get the TFP level indices. In the second step, these indices are analyzed to check for TFP-level convergence. Dougherty and Jorgenson’s second step analysis is limited to graphical treatment. Wolff, on the other hand, runs a regression of subsequent TFP growth on initial TFP level. In both cases, the sample consists of the G-7 countries, and the evidence generally supports TFP-convergence.
Not all researchers have adopted this two-step procedure to investigate TFP-convergence. In an important work on this topic, Dowrick and Nguyen (1989) try to do both growth accounting and the TFP-convergence test in a cross-section regression. The specification is similar to (13). However, they proceed from the assumption of a common capital-output ratio for all the countries of the sample. This allows them to interpret the coefficient on the initial income variable of the cross-section growth equation to be indicative of TFP-convergence. On the whole, their results also support TFP-convergence in their sample of fifteen OECD countries.66 The problem that arises here is that the capital-output ratio may not be the same, and labor productivity differentials may arise from differences in both technological level and capital intensity. This is particularly true for larger samples of countries.67
66 However, in their formulation, the initial income variable was relative to that of the USA, the most advanced country of the sample. The time period covered was between 1950 and 1985.
67 Dowrick and Nguyen try to distinguish between these two sources by including an interaction term in the regression. The interaction is between the initial income variable and the average investment rate over the period. They conclude in favor of technological diffusion. The procedure, however, involves several simplifying assumptions.
One difficulty in extending the time series growth accounting approach to larger samples of countries lies in the short length of time series for most of the developing countries. It is in this respect that the panel approach to convergence study can be of help. The estimates of µ_i produced by panel estimation can be used to recover A0. Under the assumption that g is common, ratios of A0 provide relative TFP levels. These TFP level indices can be used to investigate the issue of TFP-convergence in a large sample of countries. Analysis of TFP-convergence in large samples of countries can also benefit from the cross-section growth accounting procedure suggested recently by Hall and Jones (1996, 1997). This procedure also yields TFP indices for large samples of countries. Results of both Hall and Jones (1996) and Islam (1995) indicate the presence of huge TFP differences across countries in global samples. Whether these differences are narrowing or widening over time is an issue that needs to be further investigated. Such an investigation will, however, require TFP-level indices for several time periods. This is yet to be done for large samples by either the panel regression procedure or the cross-section growth accounting procedure. Convergence discussion has renewed researchers’ attention to TFP dynamics across and within countries. Other researchers who have recently discussed this issue include Bernard and Jones (1996), Parente and Prescott (1993, 1994), and Young (1995).68
F. Possibility of Small Sample Bias
The advantages of the panel approach do not come without some problems. One of these is the possibility of small sample bias. Note that equation (15) is a dynamic panel data model. Theoretical properties of the estimators of such models are asymptotic, and their small sample properties are unknown.
The panel estimators that have so far been used in the convergence literature are Arellano’s GMM, Chamberlain’s MD, LSDV, and some variants of maximum likelihood (as in Lee et al. (1995)). All these estimators are consistent,69 and the only way to know their small sample performance is through conducting a Monte Carlo study. Results of Monte Carlo studies are more useful when these studies are tailored to the equations actually estimated and the data sets actually used. A Monte Carlo study using the Summers-Heston data set and focusing on the convergence equation (15)70 indicates that, in general, estimators that do not use further lagged values of the dependent variable as instruments perform better than those which do.71 However, to the extent that Monte Carlo results tend to be data and model specific, it is necessary that researchers using panel estimators check the small sample properties through Monte Carlo studies designed for their respective data set and the equation estimated.72
68 Maddison (1987), in his splendid growth accounting study for the developed economies, also deals with TFP catch-up, though instead of estimating the extent of ‘catch-up,’ he imputes a particular value to it.
Barro (1997) has recently drawn attention to the fact that LSDV uses only within-variation for estimation and throws away the cross-section or between-variation. This brings us back to the within-across tension of the convergence parameter. To the extent that the convergence equation is based on a within-economy growth model, estimation of its parameters on the basis of within-variation may be more desirable.73 More importantly, whether or not an estimator uses both within and between variation, perhaps, cannot be the main criterion of an estimator’s suitability in this case. For example, the random-effects GLS estimator uses both within and between variation.
But, as we saw, this estimator is not suitable for estimating the convergence equation because it contradicts the correlation of the individual effect with the included regressors.
69 Note that the asymptotic properties of panel estimators can be considered in the direction of N → ∞ or T → ∞ or both N and T going to infinity. Further, N and T can go to infinity either at the same rate or at different rates. Amemiya (1967, 1973) has shown that, although LSDV is biased in the direction of N → ∞, it is consistent in the direction of T → ∞.
70 See Islam (1992). The dynamic panel data estimators considered in this study include: OLS, LSDV, two instrumental variable estimators by Anderson and Hsiao (1981, 1982), two GMM estimators by Arellano and Bond (1991), 2SLS, 3SLS, generalized 3SLS and Minimum Distance estimators by Chamberlain (1982, 1983).
71 The Arellano GMM estimator, which depends heavily on lagged y’s as instruments, displayed large bias. This may, to a certain extent, explain the very high values of λ that Caselli et al. (1996) obtain using this estimator. However, it needs to be mentioned that the instrument set used for the Arellano estimator in the Monte Carlo study is not exactly the same as the one that Caselli et al. seem to have used.
72 Nerlove (1996) has argued that the change in results brought about by LSDV estimation under the fixed-effects assumption in Islam (1995) may be the result of small sample bias of this panel estimator. He conducts OLS on pooled data and GLS under the random-effects assumption, and finds that results from these procedures prove to be similar to those obtained earlier from cross-section estimation. However, as mentioned earlier, the random-effects assumption is not appropriate for µ_i in the convergence equation. It is not surprising that GLS under the random-effects specification and OLS on either cross-section or pooled data yield similar estimates, because they all share the same assumption. They all ignore the correlation of A0 with s and n. Also, the Monte Carlo study in Islam (1992) shows that, for the data and model in question, LSDV performs much better than many other estimators. Chamberlain’s MD estimator also displayed similar superior performance.
73 Note in this context that Chamberlain’s MD estimator uses both cross-section and time-series variation in data. Also, Chamberlain’s MD estimator is robust to the presence of heteroskedasticity and autocorrelation. Note that the presence of autocorrelation induced by measurement errors has been another source of Barro’s concern regarding the use of panel estimators.
G. Panel Analysis and Extended Cross-section Regression
The growth-initial level regression, which has been the mainstay of convergence research, seems to be now entering a new phase. In the first phase, represented by works reviewed in section-4, the specifications were informal. These regressions were basically geared to check which variables influence growth positively and which, negatively. This led to the emergence of a bewildering variety of explanatory variables, so much so that growth regressions fell into some disrepute. Responding to the situation, Levine and Renelt (1992) used Leamer’s extreme bound analysis to find out which of the suggested explanatory variables were statistically robust and which were not. The second phase of the growth regressions started with the switch to the formal specification of the convergence equation of the type presented in section-5. This made it possible to recover the theoretical parameters, and it also rigorously specified the variables to be included in the regression. It seems that growth regressions are now entering a third phase. This is manifested in the inclusion of a large number of additional regressors in equations of the type (13).
Thus, for example, Barro (1997) presents this kind of extended cross-section regression including a host of variables in addition to the ones that are specified by theoretical models. Caselli et al.’s last set of regressions also falls into this category. Other examples of this type of work include Sachs and Warner (1997), etc. Some researchers do not try to provide theoretical justification for inclusion of the additional variables. Others, who want to keep the link with theoretical growth models, try to do this by referring to the A0 term of equation (14). It is argued that the variables of the expanded list are required to control for A0 in the regression. This distinguishes these specifications from the informal specifications of the first phase, because this provides the theoretical locus standi of the variables of the expanded list. Sala-i-Martin (1997) reports running several million regressions of this kind to find out the statistical properties of the estimated coefficients of different explanatory variables. This is somewhat in the spirit of what Levine and Renelt (1992) did earlier.
It, therefore, seems that two approaches are possible in dealing with the A0 term of the growth-convergence equation. The first approach is to continue with the cross-section regression and extend it to include variables that proxy for A0. The second approach is to opt for panel estimation and control for A0 as an unobservable individual effect. The estimates of A0 are analyzed in a second step. The two approaches have their respective advantages and disadvantages. The advantage of the extended cross-section regression is that it is a one-step procedure that addresses the issue of determinants of A0 in conjunction with estimation of other parameters. The disadvantage is that it cannot avoid the potential problem of omitted variable bias.
The panel approach has the advantage that, in its first step, the parameters of the growth model can be estimated corrected for omitted variable bias. The disadvantage is that the first step of the exercise cannot reveal anything about the determinants of TFP level. In this context it should be noted that, in equation (15), the term A0 appears with the coefficient (1 − e^{−λτ}). This implies quite a few restrictions on the coefficients of the extended cross-section regression. First, this implies that all the variables that appear as proxies or components of A0 should have the same coefficient. Second, this common coefficient of A0 components has to be equal to the coefficient of the investment variable and numerically equal but opposite in sign to the coefficient of the labor force growth variable. Researchers using extended cross-section regressions do not always recognize these restrictions. This often leads to an eclectic approach. On the one hand, the specification is claimed to have a formal link with the growth model, and that link is used to recover the rate of convergence parameter from the coefficient of the initial income variable. On the other hand, this link is not followed through for the coefficients of the rest of the variables of the regression.74
In sum, the panel results show that in studying convergence and estimating the parameters pertaining to the capital deepening process, it is important to take account of technological differences. In general, it is found that, when technology differences are allowed, NCGT fares better in cross-country growth data. The results also indicate that capital deepening and technological diffusion, the two processes that are supposed to yield income convergence, may not always play symmetric roles. It appears that in small samples of developed countries, the process of TFP convergence has aided income-convergence. But this may not have been the case in large, global samples where very large TFP differentials remain.
The panel approach also helped investigate other issues like endogeneity and heterogeneity of the steady state growth rate. Results from these latter investigations show the difficulty in having consensus estimates of the structural parameters. However, the hypothesis of conditional convergence is generally upheld.
74 Sometimes the link is not recognized to trace out the value of the capital-elasticity parameter.
8. Time Series Approach to Convergence
The progression of convergence study from cross-section to panel and then on to the time series approach seems to be quite natural. However, in many cases time series analysts of convergence have actually taken off directly from the cross-section tradition.
A. Time Series Equation for β-Convergence
The commonly used equation for time series analysis of convergence can be derived directly from the equation for β-convergence given by (15). This generally involves the assumption that x_{i,t−1} remains unchanged over the sample period considered. Then βψ x_{i,t−1} becomes just another time-invariant term, and it can be subsumed under the term µ_i. Also, note that if we substitute t_2 = t and t_1 = t − 1 in the expression for η_t, we get
$$ \eta_t = g\,(t_2 - (1 + \beta)\, t_1) = g\,[t - (1 + \beta)(t - 1)] = (1 + \beta)\, g - \beta g\, t. \tag{16} $$
For an individual economy, (1 + β)g is a constant, and hence can also be subsumed under µ_i, so that η_t effectively reduces to −βgt. Introducing these changes and upon rearrangement, we get from (15), suppressing the country subscript i, the following:
$$ y_t = \mu - \beta g\, t + (1 + \beta)\, y_{t-1} + \varepsilon_t. \tag{17} $$
Note that this is the same as the Dickey-Fuller equation with a drift and linear trend. Recall that for convergence in the usual sense, β should be negative. In other words, we should have (1 + β) less than one. The question then becomes whether the coefficient on y_{t−1} is less than unity or not.
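The regression in equation (17) can be sketched as follows. This is a minimal illustration on artificial data using ordinary least squares; the function names and simulation parameters are hypothetical and are not taken from any of the studies cited. The statistic returned for the hypothesis (1 + β) = 1 does not follow the usual t distribution under the null, so it has to be compared with Dickey-Fuller critical values (about −3.4 at the five percent level in large samples when a constant and a trend are included).

```python
import numpy as np


def simulate_series(rho, n, drift=0.02, seed=0):
    """Artificial series y_t = drift + rho * y_{t-1} + e_t, starting at 0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = drift + rho * y[t - 1] + eps[t]
    return y


def df_trend_regression(y):
    """OLS fit of y_t = mu + d * t + rho * y_{t-1} + e_t, as in (17).

    Returns (rho_hat, stat), where rho_hat estimates (1 + beta) and
    stat = (rho_hat - 1) / se(rho_hat) is the Dickey-Fuller type statistic.
    """
    y = np.asarray(y, dtype=float)
    y_cur, y_lag = y[1:], y[:-1]
    trend = np.arange(1, len(y), dtype=float)
    X = np.column_stack([np.ones_like(trend), trend, y_lag])
    coef, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
    resid = y_cur - X @ coef
    dof = len(y_cur) - X.shape[1]
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    rho_hat = float(coef[2])
    stat = (rho_hat - 1.0) / float(np.sqrt(cov[2, 2]))
    return rho_hat, stat


if __name__ == "__main__":
    for rho in (0.5, 1.0):
        print(rho, df_trend_regression(simulate_series(rho, 500, seed=1)))
```

With a stationary artificial series the estimated coefficient on the lagged level falls well below one and the statistic is strongly negative; with a random walk the estimate stays close to one. In applied work a tested library routine for the (augmented) Dickey-Fuller test would normally be preferred to this hand-rolled version.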
If we cannot reject H0: (1 + β) = 1, then by implication we cannot reject the null β = 0, i.e., we cannot reject the hypothesis that there is no convergence. Hence, looked at from the point of view of an individual economy’s time series, the question of convergence reduces to the standard question of whether or not the output series is integrated.
Broadly, time series analyses of convergence may be categorized into two types. The first focuses on analysis of the time series of individual economies, without reference to any other economy. This comes closest to the study of within-convergence. The second analyzes the output series of one economy in reference to those of others. Within the second, again, several different strands may be distinguished. We begin by looking at evidence of convergence produced by within analysis of individual economies.
B. Time Series Analysis of Within-Convergence
As noted, convergence analysis focusing on an individual economy’s time series and using equation (17) is virtually indistinguishable from standard unit root analysis of macroeconomic series. The difference is that while the traditional unit root analysis has been limited mainly to output series of the developed countries, under the convergence paradigm, the analysis is extended to a larger sample of countries, including the developing ones. Thus, for example, Lee et al. (1995) conduct an exercise along this line and find that, out of 102 countries for which the equation is fitted, the null of unit root can be rejected for only three (at the five percent level of significance).75 To account for potential serial correlation, they also adopt the augmented Dickey-Fuller specification with the number of lags determined by use of the SBC information criterion. This, however, does not change the results that much.76 One well-known weakness of the Dickey-Fuller unit root tests is the formulation of the null.
The test shows when the null of unit root cannot be rejected, leaving, however, a wide range of other non-unit root alternatives still compatible with the evidence. The test proposed by Kwiatkowski et al. (1992) is an advance in this regard, because it takes stationarity as the null. However, this test also has its problems; in particular, its outcome depends on the degree of truncation. Using this test, Lee et al. find that the number of countries for which stationarity can be rejected “fell steadily with the length of the truncation parameter.” (p. 22) When this parameter is set at 8, the number of rejections falls to only 9.

75 Noting that standard Dickey-Fuller tests have low power, Lee et al. also use a test proposed by Im, Pesaran and Shin (1995), referred to as the t-bar test, which is based on the average value of the DF statistics obtained across countries. The results remain basically unchanged.
76 Based on the ADF t-statistics, the number of rejections of the unit root null ranges between 3 and 14, depending on the number of lags and whether data were demeaned or not. Lee et al. refer to allowing for ‘common time specific effects.’ (p. 21) It is unclear what is meant because apparently the procedure works on individual time series separately. The use of an analogous t-bar statistic for the ADF set-up, as proposed in Im et al., does not affect these results by much.

It is important to note here that the above tests are based on the assumption of $x_{i,t-1}$ being constant. The presence of $x_{i,t-1}$ in equation (15) is linked to the notion of conditional convergence. The assumption that $x_{i,t-1}$ is time-invariant contradicts this notion and reduces the analysis, in large measure, to that of unconditional convergence. Yet, elements of $x_{i,t-1}$ may change over time even within an economy. Lee et al.
themselves recognize the possibility of ‘once for all changes’ taking the form of ‘shifts or take-offs,’ and observe that their result of nonstationarity may be the result of not taking account of these changes.⁷⁷ This has been confirmed by research done by Ben-David, Papell, and others. They show that introduction of simple trend breaks (either exogenous or endogenous) leads to a large increase in the number of rejections of the unit root null.⁷⁸

77 Lee et al. observe that the Solow model does not have an internal explanation for such changes. However, that does not mean that the model does not allow for the possibility of such changes.
78 For details, see Ben-David and Papell (1995, 1997), Ben-David, Lumsdaine, and Papell (1997), Lumsdaine and Papell (1997), and Zivot and Andrews (1992).

C. Time Series Analysis of Convergence across US States

The second type of time series analysis of convergence, which analyzes output series of one economy with reference to those of others, may be termed time series analysis of across-convergence.

Initial Time Series Analysis of Convergence across US States

One particular application of time series analysis of across-convergence has been to the data of the US states. In this analysis, the output series of the individual states or regions are studied with reference to the average for the US as a whole. For example, Carlino and Mills (1993) analyze per capita income of eight geographic regions of the US.⁷⁹ They proceed from a definition of stochastic convergence similar to that given by equation (2) and interpret $y_{it}$ as $y_t$, the average for the sample (i.e., for the USA as a whole). Thus, they study the time series properties of $Dy_{jt} = (y_t - y_{jt})$, where $y_{jt}$ is the log per capita output of region j.

79 The regions are: New England, Mideast, Great Lakes, Plains, Southeast, Southwest, Rocky Mountains, and Far West.
The authors use an augmented version (in the Dickey-Fuller sense) of equation (17) with $y_t$ replaced by $Dy_t$.⁸⁰ Note that, in this deviation setup, the intercept term of the equation stands for time-invariant differences in the determinants of the steady state across regions. This is equivalent to allowing some components of $x_{i,t-1}$ to vary, albeit only in the direction of i. However, Carlino and Mills also feel the necessity of allowing a trend break, which can be interpreted as allowing $x_{i,t-1}$ to vary in the direction of t, albeit in a very restricted way. These changes in the setup and specification bring the analysis closer to that of conditional convergence, and Carlino and Mills find that they can reject the unit root hypothesis for the majority of the regions. This clearly favors conditional convergence.

Loewy and Papell (1996) extend this line of analysis. While in Carlino and Mills the trend break is exogenously set for the year 1946, Loewy and Papell endogenize the timing of the break. Also, they conduct the analysis at a further disaggregated level by dividing the US into 22 regions instead of eight. Again, the null of unit root can be rejected for the majority of the regions.

Unit Root Analysis of Pooled Data for the US States

Although in works like Carlino and Mills and Loewy and Papell the analysis is conducted on the basis of data in deviation or relative form,⁸¹ the unit root tests are conducted region by region. In contrast, in Evans and Karras’ (1996b) work on convergence across US states, a similar test is conducted by pooling the deviation data. Also, Evans and Karras consider convergence at the level of states, instead of broader regions.⁸² In view of the weakness of the standard Dickey-Fuller test, Evans and Karras use a modified version of the unit root test proposed by Levin and Lin (1993), designed specifically for pooled data. The results show rejection of the unit root hypothesis even when trend breaks are not included.
Note that in this setup too, the state-specific intercept term of the equation stands for (time-invariant) difference in steady state among the individual states. Hence, this may also be interpreted as a finding of conditional convergence. Evans and Karras contrast this aspect of their result with Barro and Sala-i-Martin (1992)’s result of unconditional convergence across US states. However, as noted earlier, Barro and Sala-i-Martin, in part of their analysis, also include regional dummies and an index of composition of output. Hence, it is a moot point whether their result is entirely of unconditional convergence.

80 They also impose a time series structure on the error term $\varepsilon_t$.
81 This distinguishes methodologically their work from that of, say, Lee et al.
82 This makes their results directly comparable with Barro and Sala-i-Martin (1992)’s analogous results obtained from the cross-section approach.

D. Time Series Analysis of Convergence across Countries

Unit Root Analysis of Pooled Data for Countries

Evans and Karras (1996a) also conduct unit root analysis of pooled deviation data for a sample of 56 countries. The result is similar: evidence favors rejection of unit root, i.e., favors the hypothesis of conditional convergence. Analogous results are also obtained in Evans (1996) from analysis of long historical data (1870-1989) for a sample of thirteen developed countries.

Co-integration Approach to Across-Convergence Study

An alternative way of analyzing output data in relative form is to use the framework of co-integration analysis. Bernard and Durlauf conduct such an analysis on the basis of a sample of 15 developed countries.⁸³ The goal is to check whether the per capita output series of these economies are co-integrated or not, and if they are, whether the co-integration vector is of the form (1, -1) or (1, -a), where a is a constant.
The indirect⁸⁴ way of doing this is to formulate these hypotheses in terms of conditions on the rank of the spectral density matrix at frequency zero of $\Delta DY_t$ and $\Delta Y_t$. Here, $\Delta Y_t$ is the first difference of $Y_t$, which, in turn, is the vector of individual output series, $y_{it}$. Similarly, $DY_t$ is the vector of $Dy_{it}$, the deviation of output of country i from that of the reference country having index 1.⁸⁵ The basic idea, from Engle and Granger (1987), is that if the number of distinct stochastic trends in $Y_t$ is less than n (which would imply co-integration), then the spectral density matrix at frequency zero of $\Delta Y_t$, i.e., $f_{\Delta Y}(0)$, is not of full rank. If all n countries are converging in per capita output, then $f_{\Delta DY}(0)_i = 0,\ \forall i$, or equivalently, the rank of $f_{\Delta DY}(0)$ is zero. So, in operational terms, the task is to look at the spectral density matrix at frequency zero of $\Delta DY$ and $\Delta Y$ (the first for convergence, the second for co-integration) and check for the rank conditions (more concretely, the number of co-integrating vectors). Bernard and Durlauf use two sets of procedures to carry out the tests: one based on Phillips and Ouliaris (1988), and the other based on Johansen (1988). In either case, the conclusion is broadly similar: there is evidence of co-integration of the form (1, -a) but not of the form (1, -1).

83 They also consider two sub-samples: the first consisting of 11 European countries, and the second consisting of 6 European countries that showed a high degree of ‘pairwise cointegration.’
84 The direct way of doing this is to do pairwise analysis. However, apart from posing a very large number of possible pairings in any sample of respectable size, this route also faces the problem of arriving at an overall result from the pair-specific results.
85 For the 15 country sample, the US is the reference country. For the European sub-samples, the reference country is France.
Bernard and Durlauf interpret this result as showing that the countries ‘shared common trends’ but did not converge. However, as we have noted before, co-integration of the form (1, -a) can also be a manifestation of conditional convergence.⁸⁶ It may be noted that, despite the apparent methodological differences, Bernard and Durlauf’s co-integration analysis is similar to Evans and Karras’ unit root analysis. In both cases, the steady state levels of different economies are allowed to differ by only a constant proportionality factor, though Evans and Karras, in addition, incorporate trend breaks. The results are qualitatively similar although they pertain to different samples.

In sum, time series analysis has actually been a different way of investigating β-convergence. Both unconditional and conditional convergence have found their place in this analysis. However, in time series analysis of conditional convergence, steady state variation was generally limited to time-invariant differences and trend breaks. The broad evidence, nevertheless, favors conditional convergence. To the extent that the time series approach avoids imposition of structure, it does not produce estimates of the structural parameters and hence does not answer questions concerning specific values of these parameters.

86 Bernard and Durlauf were inclined to interpret the shared common trends as indicative of club convergence. However, we have noted earlier the difficulty in distinguishing evidence of club convergence from that of conditional convergence in general.

9. Distribution Approach and σ-Convergence

While all the approaches discussed so far (cross-section, panel, and time-series) have concentrated on β-convergence, the distribution approach has the distinction of focusing on σ-convergence. However, the correspondence is not so simple. Distribution approach has actually proceeded along two lines.
The first maintains relationship with β-convergence and tries to work out the precise relationship between β and σ. In contrast, the second line of the distribution approach emphasizes the limitations of β-convergence and focuses on the shape of the entire distribution and intra-distribution dynamics.

C. Relationship between β-convergence and σ-convergence

In order to see the relationship between β and σ, we can start from the decomposition of the cross-section variance into its constituent elements. This decomposition is already offered in BS. They note that if all the terms other than $y_{i,t-1}$ and $\varepsilon_{it}$ in equation (15) are ignored, then the evolution of $\sigma_t^2$, the variance of $y_{it}$, under suitable assumptions on $\varepsilon_{it}$, can be described by

(18)  $\sigma_t^2 = (1+\beta)^2\,\sigma_{t-1}^2 + \sigma_\varepsilon^2 = \tilde{\beta}^2\,\sigma_{t-1}^2 + \sigma_\varepsilon^2$,

where $\sigma_\varepsilon^2$ is the variance of ε, and $\tilde{\beta} = (1+\beta)$. Iterating backwards, this yields

(19)  $\sigma_t^2 = \dfrac{\sigma_\varepsilon^2}{1-\tilde{\beta}^2} + \left(\sigma_0^2 - \dfrac{\sigma_\varepsilon^2}{1-\tilde{\beta}^2}\right)\tilde{\beta}^{2t}$.

As $t \to \infty$, the above approaches the steady state value, $\sigma_\infty^2 = \sigma_\varepsilon^2/(1-\tilde{\beta}^2)$. It is clear that $\sigma_\infty^2$ increases with $\sigma_\varepsilon^2$ and decreases as β becomes more negative. What is more important is that $\sigma_t^2$ can monotonically either increase or decrease to $\sigma_\infty^2$ depending on whether the initial variance $\sigma_0^2$ is smaller or greater than the steady state variance $\sigma_\infty^2$. This algebraic result again shows that a negative β cannot guarantee falling variance. However, it also shows that an empirical finding of increasing cross-sectional variance is not incompatible with β-convergence. Similar relationships have been presented in Lee et al. Unlike BS, Lee et al.
do not ignore the terms representing variation in the steady state across countries in equation (15), and hence they get the following more involved expression for cross-section variance:

(20)  $\sigma_t^2 = \tilde{\beta}^{2t}\,\sigma_0^2 + [1-\tilde{\beta}^{2t}]\,\sigma_{*0}^2 + \dfrac{1-\tilde{\beta}^{2t}}{1-\tilde{\beta}^2}\,\sigma_\varepsilon^2 + [\,\ldots\,]\,\sigma_g^2$,

where $\sigma_{*0}^2$ is the cross-country variance in steady state per capita output in time 0, and $\sigma_g^2$ is the variance of the steady state growth rate, g. Under the assumption of common g, the last term drops out. Then the above expression is basically the same as that of BS, except that it now has the additional term involving $\sigma_{*0}^2$. As expected, the latter term now also appears in the expression for $\sigma_\infty^2$:

(21)  $\sigma_\infty^2 = \sigma_{*0}^2 + \dfrac{\sigma_\varepsilon^2}{1-\tilde{\beta}^2}$.

Substituting for $\sigma_{*0}^2$ in equation (20) using equation (21) yields

(22)  $\sigma_t^2 = \sigma_0^2 + [1-\tilde{\beta}^{2t}]\,(\sigma_\infty^2 - \sigma_0^2)$.

Note that this is the same equation as can be obtained from BS equation (19) above, because $\sigma_{*0}^2$ gets subsumed in $\sigma_\infty^2$. It also helps to see again that dispersion may either increase or decrease towards $\sigma_\infty^2$, depending on whether initial dispersion is less or greater than it.⁸⁷

D. Relationship between Tests for β- and σ-convergence

The above shows how β and $\sigma^2$ are algebraically related, and the value of one can be obtained from that of the other, provided other conditions are satisfied.⁸⁸ This also implies that tests of these two concepts of convergence can be related. This is of particular importance for σ-convergence because, unlike β-convergence, no statistical test for σ-convergence was available and/or used. Taking up the task, Lichtenberg (1994) observes that, ignoring other terms, from (15) we can also have

(22)  $\dfrac{\sigma_t^2}{\sigma_{t-1}^2} = \tilde{\beta}^2 + \dfrac{\sigma_\varepsilon^2}{\sigma_{t-1}^2}$.

87 Drawing attention to this result, Lee et al. note that σ-convergence is not an implication of the Solow model.
88 As we shall see, some researchers, indeed, produced alternative estimates of β from analysis of $\sigma^2$.
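The variance dynamics in equations (18) and (19) can be illustrated with a short numerical sketch. The parameter values and function names below are hypothetical and serve only to show that, with the same negative β, the cross-section variance rises towards its steady state value when it starts below it and falls when it starts above it.

```python
def steady_state_variance(beta, sigma_eps_sq):
    """Steady state variance sigma_eps^2 / (1 - beta_tilde^2), beta_tilde = 1 + beta."""
    beta_tilde_sq = (1.0 + beta) ** 2
    return sigma_eps_sq / (1.0 - beta_tilde_sq)

def variance_path(sigma0_sq, beta, sigma_eps_sq, periods):
    """Iterate the recursion of equation (18) forward from the initial variance."""
    beta_tilde_sq = (1.0 + beta) ** 2
    path = [sigma0_sq]
    for _ in range(periods):
        path.append(beta_tilde_sq * path[-1] + sigma_eps_sq)
    return path

def variance_closed_form(sigma0_sq, beta, sigma_eps_sq, t):
    """Closed-form solution of equation (19)."""
    beta_tilde_sq = (1.0 + beta) ** 2
    s_inf = steady_state_variance(beta, sigma_eps_sq)
    return s_inf + (sigma0_sq - s_inf) * beta_tilde_sq ** t

# Hypothetical values: beta = -0.05 and shock variance 0.01.
beta, sigma_eps_sq = -0.05, 0.01
s_inf = steady_state_variance(beta, sigma_eps_sq)
rising = variance_path(0.05, beta, sigma_eps_sq, 50)   # starts below s_inf
falling = variance_path(0.30, beta, sigma_eps_sq, 50)  # starts above s_inf

print(f"steady state variance: {s_inf:.4f}")
print(f"from below: {rising[0]:.4f} -> {rising[-1]:.4f}")
print(f"from above: {falling[0]:.4f} -> {falling[-1]:.4f}")
```

Both paths use the same negative β, so the direction in which dispersion moves is determined by the initial variance relative to the steady state variance, not by the sign of β alone.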
The basic information for σ-convergence is the ratio $\sigma_t^2/\sigma_{t-1}^2$, and one can work with this ratio directly, as has been done by Miller (1995) and Lee et al. However, Lichtenberg shows that this ratio can be estimated indirectly from $\tilde{\beta}$ and the $R^2$ of the cross-section regression estimating $\tilde{\beta}$. Since $1 - R^2 = \sigma_\varepsilon^2/\sigma_t^2$, it follows from equation (22) that

(23)  $\sigma_{t-1}^2/\sigma_t^2 = R^2/\tilde{\beta}^2$, or equivalently $\sigma_t^2/\sigma_{t-1}^2 = \tilde{\beta}^2/R^2$.

This shows that in order to draw inference about σ-convergence, $\hat{\tilde{\beta}}$ needs to be adjusted using $R^2$ to account for the distribution of the shock term. Short of this adjustment, the hypothesis of absence of σ-convergence will be rejected too often.

Lichtenberg suggests that the test statistic obtained from equation (23) has an F-distribution with [n-2, n-2] degrees of freedom.⁸⁹ However, Carree and Klomp (1995) point out that Lichtenberg’s conclusion regarding the appropriate distribution for the test statistic implied by (23) is not entirely correct. They draw attention to the fact that the F-distribution is valid if $\sigma_t^2$ and $\sigma_{t-1}^2$ are independent of each other, which will not be true provided $\tilde{\beta} \neq 0$.⁹⁰ However, they try to salvage Lichtenberg’s idea of using the F-distribution to test for σ-convergence by showing that if the sample is large, so that $\hat{\tilde{\beta}}$ can be thought ‘close’ to $\tilde{\beta}$, then $T_2 = (\hat{\sigma}_t^2/\hat{\sigma}_{t-1}^2 - \hat{\tilde{\beta}}^2)/(1 - \hat{\tilde{\beta}}^2)$ will be distributed approximately as F(n-2, n-1).⁹¹ They also suggest another variant of the statistic that takes into account variability in the estimate $\hat{\tilde{\beta}}$. This is given by

$T_3 = \dfrac{\hat{\sigma}_t^2/\hat{\sigma}_{t-1}^2 - (\hat{\tilde{\beta}}^2 - z_\alpha \hat{\sigma}_{\tilde{\beta}})^2}{1 - (\hat{\tilde{\beta}}^2 - z_\alpha \hat{\sigma}_{\tilde{\beta}})^2}$,

where $\hat{\sigma}_{\tilde{\beta}}$ is the standard error of the estimated $\tilde{\beta}$ and $z_\alpha$ is the adopted critical value from the standard normal distribution. The adjustment will cause a type-I error that is lower than the significance level.⁹²

89 Note that the above relationships are based on a host of simplifying assumptions, ignoring, in particular, differences in the steady state. Lichtenberg thinks that a similar relationship holds even when differences in steady state are allowed. There are a few differences here. First, the relationship involves, apart from conditioning on the left hand side, both $\sigma_u^2$ and $\sigma_\varepsilon^2$ on the right hand side. Second, it is derived on the assumption that the steady state is not time contingent, so that $\hat{y}^*_{it}$ is the same as $\hat{y}^*_{i,t-1}$. Third, the relationship above is in terms of income per effective worker; it will be more complicated when transformed in terms of income per capita. However, the basic ideas are evident from this algebra.
90 They also point out that the correct degrees of freedom are [(n-1), (n-1)] instead of [(n-2), (n-2)].
91 They denote the original Lichtenberg statistic by $T_1$.

E. Evidence regarding σ-convergence

Evidence regarding σ-convergence differs across samples. For the OECD countries, data have generally favored σ-convergence. Lee et al., for example, compute the variance of the cross-section distribution of log per capita income for different samples of countries for 1961 to 1989 and plot them against time. Their results show that the variance for the OECD sample has decreased over time. Miller (1995) and other researchers have produced similar results regarding the OECD sample. Note that this kind of evidence does not involve any formal statistical test.

Lichtenberg, in contrast, uses his procedure to formally test the hypothesis of σ-convergence. He runs a simple regression of ln GDP85 on ln GDP60 for the OECD countries and uses the result to compute the test statistic as per equation (23) above.⁹³ Application of the critical values from the F(20, 20) distribution results in a non-rejection of the null of non-convergence.⁹⁴ But Carree and Klomp redo the exercise and find that use of their statistics reverses the conclusion. For the period 1960-85, all of Carree and Klomp’s three statistics report convergence.
However, when a later sample period of 1972-94 is considered, they find that convergence does not hold any more. This, in their view, shows that while during initial years there has been significant reduction in variance, in the more recent years $\sigma_t^2$ has become close to $\sigma_\varepsilon^2$ and hence no further pronounced tendency for it to decrease is found. In other words, as the countries get closer to the steady states, the transition component of the dynamics recedes, and idiosyncratic shocks take over, which then display no systematic tendency to decrease.

92 One feature of these tests is that the comparison of variance is limited to that of the first and last period only. In order to make use of the information in between, Carree and Klomp, proceeding from a simplified version of equation (15), iterating backwards, and making use of the independence assumption, derive the following expression for the first difference of cross-section variance:
(24)  $\Delta\sigma_t^2 = \tilde{\beta}^{2(t-1)}\,[(\tilde{\beta}^2 - 1)\,\sigma_0^2 + \sigma_\varepsilon^2]$.
Under the null of no convergence, $(\tilde{\beta}^2 - 1) = 0$. Hence the equation above can be used to test this null by regressing $\Delta\sigma_t^2$ on $\tilde{\beta}^{2(t-1)}$ and using the t(T-2) distribution, if $\tilde{\beta}$ is known. In practice, one has only an estimate of $\tilde{\beta}$ and the distribution will hold only approximately. This is their statistic, $T_4$.
93 He obtains a slope coefficient of 0.715 and $R^2$ = 0.802. This gives $R^2/\hat{\tilde{\beta}}^2 = 1.57$, which is equal to var(ln GDP60)/var(ln GDP85), the test statistic.
94 The probability value is 0.31.

Basic evidence of σ-convergence has been provided for other smaller samples of countries as well. Similar evidence of σ-convergence has been reported for the US states.⁹⁵ For large, global samples of countries, however, evidence generally indicates a rise in variance. For example, according to Lee et al.’s computation, output variance in the sample of 102 countries increased from 0.77 to 1.24 between 1961 and 1989. Other researchers have also produced similar evidence.
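The arithmetic behind Lichtenberg's statistic can be checked with a small sketch. The first part uses the slope and $R^2$ reported for the OECD regression; the second part uses simulated data (hypothetical, not actual GDP figures) to show the OLS identity that makes $R^2$ divided by the squared slope equal to the variance of the initial-period variable over the variance of the final-period variable. The function name and simulation settings are assumptions made for illustration.

```python
import numpy as np

def lichtenberg_statistic(slope, r_squared):
    """R^2 divided by the squared slope of final-period on initial-period log income."""
    return r_squared / slope ** 2

# Values reported for the regression of ln GDP85 on ln GDP60 (OECD sample).
stat = lichtenberg_statistic(slope=0.715, r_squared=0.802)
print(f"test statistic: {stat:.2f}")

# Simulated illustration of the underlying identity (hypothetical data).
rng = np.random.default_rng(1)
n = 22
y_initial = rng.normal(loc=8.0, scale=0.6, size=n)
y_final = 2.5 + 0.7 * y_initial + rng.normal(scale=0.25, size=n)

slope_hat = np.cov(y_initial, y_final)[0, 1] / np.var(y_initial, ddof=1)
r2_hat = np.corrcoef(y_initial, y_final)[0, 1] ** 2
variance_ratio = np.var(y_initial, ddof=1) / np.var(y_final, ddof=1)

print(f"R^2 / slope^2:             {lichtenberg_statistic(slope_hat, r2_hat):.6f}")
print(f"var(initial) / var(final): {variance_ratio:.6f}")
```

The two printed ratios in the simulated part agree up to floating-point error, because in a simple regression $R^2$ equals the squared slope times the ratio of the regressor's variance to the dependent variable's variance.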
There are different ways in which these results can be interpreted. For example, rising σ in the global sample may indicate that the steady state dispersion, $\sigma_\infty^2$, itself has increased, which, in turn, may be the result of increased dispersion of the determinants of steady state. Alternatively, it is possible that $\sigma_\infty^2$ has remained unchanged, but the initial variance, $\sigma_0^2$, was less than $\sigma_\infty^2$, so that the variance increased from below towards the steady state variance.⁹⁶ In either case, the outcome is not incompatible with conditional β-convergence. Similarly, we do not know how much of the decrease in variance in the small sample of developed economies is due to negative β and how much is due to reduction in the other items of (20), including reduction in the dispersion of the steady state determinants. All this indicates that more knowledge is needed about the dynamics of the steady state determinants in order to understand the changes in the cross-section variance of income.

F. Study of Evolution of Cross-sectional Distribution

The second line of the distribution approach to convergence study does not limit itself to examination of variance. Instead, it studies the evolution of the entire shape of the distribution and intra-distribution dynamics. Also, this line of research strives to go beyond the anonymity of distribution, to identify the position of individual countries or groups of countries within the distribution, and to see how these positions change over time. This line of research has been most vigorously, and almost single-handedly, pursued by Quah.⁹⁷

95 See, for example, Sala-i-Martin (1996), Miller (1996).
96 Lee et al. point out that increased variance may also be the result of increasing dispersion in g.
97 Quah has produced a series of articles based on this line of research. These include Quah (1993b, 1996a, 1996b).

In order to follow the evolution of the entire distribution, Quah focuses on the probability mass at different quantiles.
Already, from simple plotting of the cross-section distribution of the global sample for successive years, two features emerge: first, the cross-section distribution is not collapsing, and, second, this distribution is becoming increasingly bi-modal. However, because it is not known whether the plotted distributions are of steady state or not, and because the plots of distribution cannot tell the position of individual countries, Quah performs a more formal analysis of the distribution dynamics. He proposes to put it in the following framework:

(26)  $F_{t+1} = M F_t$,

where $F_t$ is the cross-section distribution at time t, $F_{t+1}$ is the same at time t+1, and M is the operator that maps $F_t$ onto $F_{t+1}$. The goal is to know M, which determines the evolution of the distribution. If M is assumed to be unchanged over time, then we have

(27)  $F_{t+s} = M^s F_t$,

where s is any particular length of time (number of years, say). This framework, Quah points out, also gives us a way of getting at the desired steady state distribution. This is obtained by taking the limit of $F_{t+s}$ with $s \to \infty$.

Quah models M as a Markov transition matrix and calibrates it using actual data.⁹⁸ Both the calibrated transition matrices and ergodic distributions obtained on their basis lead to similar conclusions. First is ‘persistence.’ The values of the diagonal elements of the one-year M are in the neighborhood of 0.9, implying that most of the countries continue to remain in the same position (or range) of the distribution. Second, whatever mobility (within the distribution) exists, it works to ‘thin out the middle,’ and ‘pile up of probability mass at the two tails.’ This is Quah’s result of growing ‘twin-peakedness’ or bi-modality of the distribution. The results do not change if higher order specifications are used. In fact, these make the bi-modal property and ‘poverty piling up’ even more pronounced.⁹⁹ This exercise is extended in Quah (1993a) where he lets the quantiles evolve (instead of being fixed).¹⁰⁰ However, the results remain more or less intact. In fact, now the dynamics of Q(t) further confirm these results. Thus, altogether, Quah’s formal analysis confirms what informal plotting of the distribution of successive years already suggested.

98 The variable modeled is per capita income normalized by world average. In general, Quah treats M as time-invariant, although he allows it to have a higher (than first) order of specification. In Quah (1993b), he takes the quantiles as fixed (at .25, .5, 1 and 2) and calibrates the transition probabilities for countries to move from one quantile to another (i.e., elements of M) from actual data. First he obtains the ‘one-step annual’ transition matrix by averaging the observed one-year transitions over the period between 1962-63 and 1984-85. He then lets it run to get the ergodic distribution on the basis of the one-year transition matrix. He does it in two variants: one with first order and another with higher order specification. He also obtains the 23-year transition matrix spanning the period 1962 to 1985, calibrated in analogous fashion. He then derives the ergodic distribution on the basis of the latter matrix as well. Again, he does this in the above-mentioned two versions.

Quah notes certain technical shortcomings of this analysis.¹⁰¹ However, perhaps, it is more important to note here that conditioning variables play no role in this analysis. In fact, Quah makes it explicit that it is his intention not to be restricted by assumptions of long term growth. M is memory-less, and no growth theory is required for its estimation; no structure is imposed on data. Thus, Markov analysis of the evolution of cross-section distribution is a type of reduced form analysis of the cross-sectional output.
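The mechanics of the Markov framework in equations (26) and (27) can be sketched as follows. The three-state transition matrix below is purely hypothetical (it is not one of Quah's calibrated matrices), and the sketch assumes the common row-stochastic convention in which the distribution is a row vector, so that the update reads $F_{t+1} = F_t M$; this is the transpose of the notation used in the text. The function names are likewise illustrative.

```python
import numpy as np

# Hypothetical transition matrix over three income ranges (low, middle, high).
# Row i gives the probabilities of moving from range i to each range next year.
M = np.array([
    [0.95, 0.05, 0.00],
    [0.10, 0.80, 0.10],
    [0.00, 0.05, 0.95],
])

def evolve(dist, M, s):
    """Distribution after s steps: the row vector times the s-th power of M."""
    return dist @ np.linalg.matrix_power(M, s)

def ergodic_distribution(M):
    """Left eigenvector of M for the eigenvalue closest to one, scaled to sum to one."""
    eigvals, eigvecs = np.linalg.eig(M.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

F0 = np.array([0.2, 0.6, 0.2])   # hypothetical initial distribution
F23 = evolve(F0, M, 23)          # distribution after 23 one-year steps
pi = ergodic_distribution(M)     # limit of F_{t+s} as s grows

print("after 23 steps:", np.round(F23, 3))
print("ergodic:       ", np.round(pi, 3))
```

In this illustrative matrix the middle range is left more often than it is entered, so the ergodic distribution places mass 0.4 on each tail and 0.2 in the middle, a stylized version of the ‘thinning out of the middle’ described above.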
This contrasts with research on conditional β-convergence, which imposes structure and uses growth theory to decide on specification, choice of right hand side variables, etc. The connection between reduced form Markov analysis and growth theory comes afterwards, when the results are confronted with the predictions of growth theory. For example, growth theory faces the task of explaining the phenomena of persistence, bi-modality, etc., which this analysis has uncovered.

In sum, σ-convergence and the distribution approach are not so far apart from β-convergence and the approaches that focus on β-convergence. Once it is remembered that σ-convergence research generally focuses on unconditional convergence, it becomes clear that results regarding σ-convergence largely agree with those regarding β-convergence. For example, σ-convergence has generally been reported for the small sample of developed economies for which there is also evidence of unconditional convergence. On the other hand, σ-convergence does not hold for the large, global sample of countries, for which unconditional convergence also does not hold. Similarly, the findings of non-collapsing distribution and increasing bi-modality are not incompatible with conditional β-convergence, once it is noted that the determinants of steady state can not only vary cross-sectionally but also change over time. However, fuller understanding of these results requires that growth research focus on the dynamics of steady state determinants.

99 Quah also allows the one-year transition matrix to be iterated 23 times and compares the resulting matrix (which he calls ‘stationary estimate’) with the 23-year transition matrix that is obtained from actual calibration of the data. He finds that the long run matrix shows stronger persistence than found in the ‘stationary estimate.’
100 He fits a VAR model to forecast the quantiles, Q(t), and then takes the convolution with M raised to the appropriate power to get the dynamic evolution of the sequence of distributions.
101 For example, the results are contingent on the arbitrary grid that is used to discretize the empirical distributions. Quah notes that inappropriate discretization may destroy the Markov property of an otherwise well-behaved first order Markov process. Also, conclusions regarding piling up of probability mass are contingent on the choice of discretizing grid and may not be robust. (Quah 1993a, p. 437) Similarly, Quah observes that the VAR models are estimated on the basis of only twenty data points and hence are not that precise.

10. Conclusions

Discussion of this paper shows that convergence has, indeed, been understood and investigated in many different ways. It began with simple bivariate regressions of growth on initial level of income, with specifications not formally linked with growth models. The implicit assumption of these regressions was that all countries in the sample had the same steady state level of income. This later came to be known as unconditional convergence. Soon, it became clear that NCGT’s convergence implication is, at best, conditional on differences in steady state.
The conceptual transition from unconditional to conditional convergence was also associated with a switch from informal to formal specification of the growth-initial level regression. The formal specification allowed recovery of the parameters of the growth model from the regression coefficients. This raised the focus of discussion from broad presence or absence of convergence to precise values of the growth model’s parameters.

Despite the differences in approach and method of investigation, some general results have surfaced. The bulk of the evidence has favored conditional convergence. This suggests that NCGT cannot be rejected on the basis of evidence from convergence research. This does not mean that the dispute between NCGT and NGT has been resolved. It means that either the test has to move to other grounds, and/or it has to be specified in other ways.

It is clear that the concept of conditional convergence has played a crucial role in determining the outcome of the convergence debate. This concept has two other consequences. First, it makes convergence very hollow, and it works toward making NCGT and NGT observationally equivalent. This is particularly true when the concept of conditional convergence is pushed so far as to allow countries to have their specific steady state growth rates. The second consequence of this concept has been that it has, to some extent, diverted attention away from important issues of growth. This is because the concept of conditional convergence abstracts from the determinants of steady state. Yet it is these determinants on which the long run income level of a country depends.

The current stage of research on growth started with quite a bit of concern for practical relevance. Lucas expressed this best in the following well-known observation: “Is there some action a government of India could take that would lead the Indian economy to grow like Indonesia’s or Egypt’s? If so, what, exactly?
If not, what is it about the “nature of India” that makes it so? The consequences for human welfare involved in questions like these are simply staggering: Once one starts to think about them, it is hard to think about anything else.” (Lucas 1988, p. 5)

In this light, it may be somewhat disturbing that more research is directed to the estimation of the conditional convergence rate than to the investigation of key determinants of growth, like investment, fertility, and technology. This relative lack of focus on the steady state determinants is also obstructing a fuller understanding of such important results obtained from the distribution approach as ‘persistence’ and ‘increasing bi-modality.’ It is by focusing on the dynamics of the steady state determinants that we can grasp the underlying causes of these phenomena. Such a change in focus may also help current growth and convergence research to have the kind of practical relevance that Lucas expressed so nicely above.

Observing the status of growth research in the sixties and early seventies, Sen made the following remark: “... much of modern growth theory is concerned with rather esoteric issues. Its link with public policy is often very remote. It is as if a poor man collected his money for his food and blew it on alcohol.” (Sen 1970, p. 9, our italics) Romer (1989b, p. 52) noted that considerable intellectual capital accumulation was necessary before growth research could reach this new stage. It is, therefore, important that the current stage of growth and convergence research does not head to a state similar to the one to which Sen referred.

References

Amemiya, T. (1967), “A Note on the Estimation of Balestra-Nerlove Models,” Technical Report No. 4, Institute for Mathematical Studies in Social Sciences, Stanford University.
Amemiya, T. (1971), “The Estimation of the Variance in a Variance-Component Model,” International Economic Review, 12:1-13.
Anderson, T. W. and C.
Hsiao (1981), “Estimation of Dynamic Models with Error Components,” Journal of the American Statistical Association, 76:598-606.
Anderson, T. W. and C. Hsiao (1982), “Formulation and Estimation of Dynamic Models Using Panel Data,” Journal of Econometrics, 18:47-82.
Arellano, Manuel and Stephen Bond (1991), “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations,” Review of Economic Studies, 58:277-297.
Azariadis, Costas and Allan Drazen (1990), “Threshold Externalities in Economic Development,” Quarterly Journal of Economics, (No. 2, May) 501-526.
Barro, Robert J. (1976), “Rational Expectations and the Role of Monetary Policy,” Journal of Monetary Economics, Vol. 2, 1-32.
Barro, Robert J. (1978), “Unanticipated Money, Output, and the Price Level in the United States,” Journal of Political Economy, 86 (4): 549-580.
Barro, Robert J. (1990), “Government Spending in a Simple Model of Endogenous Growth,” Journal of Political Economy, 98, 5 (October), part II, S103-S125.
Barro, Robert J. (1991), “Economic Growth in a Cross Section of Countries,” Quarterly Journal of Economics, 106, 2 (May), 407-443.
Barro, Robert J. (1997), Determinants of Economic Growth, Cambridge, MIT Press.
Barro, Robert J. and Gary Becker (1989), “Fertility Choice in a Model of Endogenous Growth,” Econometrica, 57, 2 (March), 481-501.
Barro, Robert J. and Jong-Wha Lee, “Winners and Losers in Economic Growth.” World Bank.
Barro, Robert J. and Jong-Wha Lee (1993), “International Comparisons of Educational Attainment,” Journal of Monetary Economics, 1993, XXXII, 363-94.
Barro, Robert J., N. Gregory Mankiw and Xavier Sala-i-Martin (1995), “Capital Mobility in Neoclassical Models of Growth,” American Economic Review, 85 (1, March): 103-115.
Barro, Robert J. and Xavier Sala-i-Martin (1992), “Convergence,” Journal of Political Economy, L, 223-51.
Barro, Robert J. and Xavier Sala-i-Martin (1995), Economic Growth, McGraw Hill, New York, 1995.
Barro, Robert J. and Xavier Sala-i-Martin (1997), “Technological Diffusion, Convergence, and Growth,” Journal of Economic Growth, 2 (1, March): 1-27.
Baumol, William J. (1986), “Productivity Growth, Convergence and Welfare: What the Long Run Data Show?” American Economic Review, LXXVI, 1072-85.
Baumol, William J. and Edward N. Wolff (1988), “Productivity Growth, Convergence and Welfare: Reply,” American Economic Review, 78 (5, December), 1155-1159.
Becker, Gary and Robert J. Barro (1988), “A Reformulation of the Economic Theory of Fertility,” Quarterly Journal of Economics, 103, 1 (February), 1-25.
Ben-David, Dan (1993), “Equalizing Exchange: Trade Liberalization and Income Convergence,” Quarterly Journal of Economics, CVIII: 653-679.
Ben-David, Dan and David H. Papell (1995), “The Great Wars, the Great Crash, and Steady State Growth: Some New Evidence about an Old Stylized Fact,” Journal of Monetary Economics, 36:453-475.
Ben-David, Dan and David H. Papell (1997), “Slowdowns and Meltdowns: Post-War Growth Evidence from 74 Countries,” Review of Economics and Statistics.
Ben-David, Dan, Robin Lumsdaine, and David Papell (1997), “Unit Roots, Postwar Slowdowns, and Long-Run Growth: Evidence from Two Structural Breaks,” Department of Economics, University of Houston, mimeo.
Benhabib, Jess and Mark M. Spiegel (1994), “The Role of Human Capital in Economic Development: Evidence from Aggregate Cross-Country Data,” Journal of Monetary Economics, XXXIV, 143-173.
Becker, Gary, Kevin M. Murphy and Robert Tamura (1990), “Human Capital, Fertility and Economic Growth,” Journal of Political Economy, 98 (1990), No. 2 (part 2), S12-S37.
Bernard, Andrew and Steven N. Durlauf (1996), “Interpreting Tests of the Convergence Hypothesis,” Journal of Econometrics, 71:161-173.
Bernard, Andrew and Steven N. Durlauf (1995), “Convergence in International Output,” Journal of Applied Econometrics, 10:97-108.
Bernard, Andrew and Charles I. Jones (1996), “Productivity Levels across Countries,” Economic Journal, Vol. 106.
Carlino, G. A. and L. O. Mills (1993), “Are the US Regional Incomes Converging? A Time Series Analysis,” Journal of Monetary Economics, 32:335-346.
Carree, Martin and Luuk Klomp (1995), “Testing the Convergence Hypothesis: A Comment,” Faculty of Economics, Erasmus University, Rotterdam, 1995.
Caselli, Francesco, Gerardo Esquivel and Fernando Lefort (1995), “Reopening the Convergence Debate: A New Look at Cross Country Growth Empirics,” Dept. of Economics, Harvard University, (typescript).
Cass, David (1965), “Optimum Growth in an Aggregative Model of Capital Accumulation,” Review of Economic Studies, 1965, XXXII, 233-40.
Chamberlain, Gary (1982), “Multivariate Regression Models for Panel Data,” Journal of Econometrics, XVIII, 5-46.
Chamberlain, Gary (1983), “Panel Data,” in Handbook of Econometrics, Zvi Griliches and Michael Intriligator, eds. (Amsterdam: North Holland), pp. 1247-1318.
Christensen, Lauritis R., Dianne Cummings, and Dale W. Jorgenson (1981), “Relative Productivity Levels, 1947-1973: An International Comparison,” European Economic Review, 76 (No. 1, May): 61-74.
Chua, Hak B. (1992), “Regional Spillovers and Economic Growth,” Department of Economics, Harvard University, 1992.
Coe, David T. and Elhanan Helpman (1995), “International R&D Spillovers,” European Economic Review, 39:859-887.
De Gregorio, Jose (1992), “Economic Growth in Latin America,” Journal of Development Economics, XXXIX, 59-84.
De Long, Bradford J. (1988), “Productivity Growth, Convergence, and Welfare: A Comment,” American Economic Review, LXXVIII, 1138-54.
De Long, Bradford J. and Lawrence Summers (1991), “Equipment Investment and Economic Growth,” Quarterly Journal of Economics, 106, 2 (May), 445-502.
De Long, Bradford J. and Lawrence Summers (1993), “How Strongly Do Developing Economies Benefit from Equipment Investment?” Journal of Monetary Economics, 1993, 32, 395-415.
Den Haan, Wouter J. (1995), “Convergence in Stochastic Growth Models: The Importance of Understanding Why Income Levels Differ,” Journal of Monetary Economics, 35:65-82.
Dickey, D. and W. Fuller (1979), “Distribution of the Estimators for Autoregressive Series with a Unit Root,” Journal of the American Statistical Association, 74 (1979): 427-431.
Dickey, D. and W. Fuller (1981), “Likelihood Ratio Tests for Autoregressive Time Series with a Unit Root,” Econometrica, 49 (1981): 1057-1072.
Dollar, David and Edward Wolff (1994), “Capital Intensity and TFP Convergence in Manufacturing, 1963-1985,” in William J. Baumol, Richard R. Nelson, and Edward N. Wolff, eds., Convergence of Productivity: Cross National Studies and Historical Evidence, New York, Oxford University Press.
Domar, Evsey (1946), “Capital Expansion, Rate of Growth and Employment,” Econometrica, 14 (April, 1946): 137-147.
Dougherty, Chrys and Dale W. Jorgenson (1996), “International Comparison of Sources of Growth,” American Economic Review, 86 (2, May), 25-29.
Dougherty, Chrys and Dale W. Jorgenson (1997), “There is No Silver Bullet: Investment and Growth in the G7,” Department of Economics, Harvard University.
Dowrick, Steve and Duc-Tho Nguyen (1989), “OECD Comparative Economic Growth 1950-85: Catch-Up and Convergence,” American Economic Review, LXXIX, 1010-30.
Durlauf, Steven (1993), “Nonergodic Economic Growth,” Review of Economic Studies, 60 (1993): 349-366.
Durlauf, Steven and Paul A. Johnson (1995), “Multiple Regimes and Cross-Country Growth Behavior,” Journal of Applied Econometrics, 1995, 10, 365-384.
Engle, Robert and Clive Granger (1987), “Cointegration and Error Correction Model,” Econometrica.
Evans, Paul (1996), “Using Cross-country Variances to Evaluate Growth Theories,” Journal of Economic Dynamics and Control, 20, 1027-1049.
Evans, Paul and Georgios Karras (1996a), “Convergence Revisited,” Journal of Monetary Economics, 37:249-265.
Evans, Paul and Georgios Karras (1996b), “Do Economies Converge? Evidence from a Panel of US States,” Review of Economics and Statistics, 384-388.
Fagerberg, Jan (1994), “Technology and International Differences in Growth Rates,” Journal of Economic Literature, 32 (No. 3) 1147-1175.
Friedman, Milton (1994), “Do Old Fallacies Ever Die?” Journal of Economic Literature, 30 (December, No. 4) 2129-2132.
Galor, Oded (1996), “Convergence? Inference from Theoretical Models,” Economic Journal, 106:1056-1069.
Gregorio, Jose De (1992), “Economic Growth in Latin America,” Journal of Development Economics, 39:59-84.
Grier, Kevin B. and Gordon Tullock (1989), “An Empirical Analysis of Cross-National Economic Growth, 1951-1980,” Journal of Monetary Economics, 24, 259-276.
Grossman, Gene M. and Elhanan Helpman (1995), Innovation and Growth in the Global Economy, MIT Press, Cambridge, MA.
Hall, Robert E. and Charles I. Jones (1996), “The Productivity of Nations,” Economics Dept., Stanford University.
Hall, Robert E. and Charles I. Jones (1997), “Levels of Economic Activity across Countries,” American Economic Review, 87:173-177.
Harrod, Roy (1939), “An Essay in Dynamic Theory,” The Economic Journal, XLIX (1939, March) 14-33.
Holtz-Eakin, Douglas (1993), “Solow and the States: Capital Accumulation, Productivity, and Economic Growth,” National Tax Journal, 46:425-439.
Islam, Nazrul (1992), “Small Sample Performance of Dynamic Panel Data Estimators: A Monte Carlo Study,” Department of Economics, Harvard University, 1992.
Islam, Nazrul (1995), “Growth Empirics: A Panel Data Approach,” Quarterly Journal of Economics, CX (November, No. 4): 1127-1170.
Islam, Nazrul (1998), “Growth Empirics: A Panel Data Approach - A Reply,” Quarterly Journal of Economics, CXIII (February, No. 1): 325-9.
Jones, Larry E. and Rodolfo Manuelli (1990), “A Convex Model of Equilibrium Growth: Theory and Policy Implications,” Journal of Political Economy, 98 (No. 5, part 1) 1008-1038.
Johansen, S. (1988), “Statistical Analysis of Co-integration Vectors,” Journal of Economic Dynamics and Control, 12:231-54.
Jones, Charles I. (1995a), “Time Series Tests of Endogenous Growth Models,” Quarterly Journal of Economics, 110:495-525.
Jones, Charles I. (1995b), “R & D-Based Models of Economic Growth,” Journal of Political Economy, 103 (No. 4) 759-784.
Jorgenson, Dale W. and M. Nishimizu (1978), “US and Japanese Economic Growth, 1952-1974,” Economic Journal, 88:707-726.
Jorgenson, Dale W. and Chrys Dougherty (1996), “International Comparisons of the Sources of Economic Growth,” American Economic Review, 86:25-29.
Jorgenson, Dale W., Frank Gollop, and Barbara Fraumeni (1987), Productivity in the US Economic Growth, Harvard University Press, Cambridge, MA.
Jorgenson, Dale W. (1995), Productivity, Vols. 1-2, MIT Press, Cambridge, MA.
Kaldor, Nicholas (1971), “Capital Accumulation and Economic Growth,” in F. A. Lutz and D. C. Hague (eds.) Theory of Capital, St. Martin’s Press, New York.
King, Robert G., and Sergio T. Rebelo (1993), “Transitional Dynamics and Economic Growth in the Neoclassical Model,” American Economic Review, 83, 4 (September), 908-931.
Knight, Malcolm, Norman Loyaza and Delano Villanueva (1993), “Testing for Neoclassical Theory of Economic Growth,” IMF Staff Papers, 40 (September, No. 3) 512-541.
Kwiatkowski, Denis, Peter P. C. Phillips, Peter Schmidt, Yongcheol Shin (1992), “Testing the Null Hypothesis of Stationarity against the Alternative of Unit Root,” Journal of Econometrics, 54:159-178.
Koopmans, T. C. (1965), “On the Concept of Optimal Economic Growth,” in The Economic Approach to Development Planning, Pontifical Academy of Sciences, Amsterdam, North-Holland.
Kormendi, Roger C. and Philip G. Meguire (1985), “Macroeconomic Determinants of Growth: Cross-country Evidence,” Journal of Monetary Economics, 16, 141-163.
Lee, Kevin, M. Hashem Pesaran, and Ron Smith (1995), “Growth and Convergence: A Multicountry Empirical Analysis of the Solow Growth Model,” DAE Working Papers Amalgamated Series No. 9531, Department of Applied Economics, University of Cambridge.
Levine and Lin (1993), “Unit Root Tests in Panel Data,” Dept. of Economics, University of California-San Diego.
Levine, Ross and David Renelt (1990), “Cross Country Analysis of Growth and Policy: Methodological, Conceptual, and Statistical Problems,” World Bank, Washington D. C.
Levine, Ross and David Renelt (1992), “A Sensitivity Analysis of Cross Country Growth Regressions,” American Economic Review, 82, 4 (September), 942-963.
Litchenberg, Frank R. (1994), “Testing the Convergence Hypothesis,” Review of Economics and Statistics, 1994, 76, 576-579.
Lowey, Michael B. and David H. Papell (1996), “Are US Regional Incomes Converging? Some Further Evidence,” Journal of Monetary Economics, 38:587-598.
Lucas, Robert E., Jr. (1988), “On the Mechanics of Economic Development,” Journal of Monetary Economics, 22:3-42.
Lucas, Robert E., Jr. (1990), “Why Doesn’t Capital Flow from Rich to Poor Countries?” American Economic Review, Papers and Proceedings, 80 (May, No. 2).
Lucas, Robert E., Jr. (1993), “Making a Miracle,” Econometrica, 61 (March, No. 3) 251-272.
Lumsdaine, Robin and David Papell (1997), “Multiple Trend Breaks and the Unit Root Hypothesis,” Review of Economics and Statistics, LXXIX, 212-218.
Maddison, Angus (1982), Phases of Capitalist Development, Oxford University Press, Oxford.
Maddison, Angus (1987), “Growth and Slowdown in Advanced Capitalist Economies: Techniques of Quantitative Assessment,” Journal of Economic Literature, 25, 649-698.
Miller, Ronald I. (1995), “Time Series Estimation of Convergence Rates,” Department of Economics, University of Columbia, 1995 (typescript).
Mankiw, N. Gregory, David Romer, and David Weil (1992), “A Contribution to the Empirics of Economic Growth,” Quarterly Journal of Economics, CVII: 407-37.
Mankiw, N. Gregory (1995), “The Growth of Nations,” Brookings Papers on Economic Activity, No. 1, 275-325.
Nerlove, Marc (1996), “Growth Rate Convergence, Fact or Artifact?” Department of Agriculture and Resource Economics, University of Maryland, College Park, (typescript).
Parente, Stephen L. and Edward C. Prescott (1994), “Barriers to Technology Adoption and Development,” Journal of Political Economy, 102:298-321.
Parente, Stephen L. and Edward C. Prescott (1994), “Changes in Wealth of Nations,” Quarterly Review, Federal Reserve Bank of Minneapolis, Spring, 3-16.
Phillips, P. C. B. and S. Ouliaris (1988), “Testing for Co-integration Using Principal Components Method,” Journal of Economic Dynamics and Control, 12:205-230.
Quah, Danny (1993a), “Galton’s Fallacy and Tests of the Convergence Hypothesis,” Scandinavian Journal of Economics, 95, 4, 427-443.
Quah, Danny (1993b), “Empirical Cross-Section Dynamics in Economic Growth,” European Economic Review, 37, 426-434.
Quah, Danny (1996a), “Empirics for Economic Growth and Convergence,” European Economic Review, 40 (forthcoming).
Quah, Danny (1996b), “Twin Peaks: Growth and Convergence in Models of Distribution Dynamics,” Discussion Paper No. 1355, Center for Economic Policy Research, London.
Rebelo, Sergio (1991), “Long Run Policy Analysis and Long Run Growth,” Journal of Political Economy, XCIX, 500-21.
Romer, Paul (1986), “Increasing Returns and Long Run Growth,” Journal of Political Economy, 94 (1986, No. 5) 1002-1036.
Romer, Paul (1989a), “Crazy Explanations for Productivity Slowdown,” Brookings Papers on Economic Activity, Washington D. C., 163-202.
Romer, Paul (1989b), “Capital Accumulation in the Theory of Long Run Growth,” in Modern Business Cycle Theory, Robert J. Barro, ed., Harvard University Press, Cambridge MA.
Romer, Paul (1989c), “Human Capital and Growth: Theory and Evidence,” NBER Working Paper 3173, Cambridge.
Romer, Paul (1990), “Endogenous Technological Change,” Journal of Political Economy, 1990, 98, 5(2), S71-S102.
Romer, Paul (1994), “Origins of Endogenous Growth,” Journal of Economic Perspectives, 1994, 8(1), 3-22.
Sala-i-Martin, Xavier (1996a), “Classical Approach to Convergence Testing,” Economic Journal, Vol. 106.
Sala-i-Martin, Xavier (1996b), “Regional Cohesion: Evidence and Theories of Regional Growth and Convergence,” European Economic Review, 40:1325-1352.
Sala-i-Martin, Xavier (1997), “I Just Ran Two Million Regressions,” American Economic Review, 87:178-183.
Sachs, Jeffrey and Andrew Warner (1997), “Fundamental Sources of Economic Growth,” American Economic Review, 87:184-188.
Sen, Amartya K. (1970), Growth Economics: Selected Readings, Penguin Books Limited, Harmondsworth, 1970.
Solow, Robert M. (1956), “A Contribution to the Theory of Economic Growth,” Quarterly Journal of Economics, LXX: 65-94.
Solow, Robert M. (1970), Growth Theory: An Exposition, Cambridge University Press, London.
Solow, Robert M. (1994), “Perspectives on Growth Theory,” Journal of Economic Perspectives, 8 (Winter, No. 1) 45-54.
Solow, Robert M. (1997), Learning from ‘Learning by Doing’: Lessons for Economic Growth, Stanford University Press, Stanford, CA.
Summers, Robert, and Alan Heston (1988), “A New Set of International Comparisons of Real Product and Price Levels Estimates for 130 Countries, 1950-85,” Review of Income and Wealth, XXXIV, 1-26.
Summers, Robert, and Alan Heston (1991), “The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950-1988,” Quarterly Journal of Economics, 106, 2 (May), 327-368.
Tamura, Robert (1991), “Income Convergence in an Endogenous Growth Model,” Journal of Political Economy, 99:522-540.
Ventura, Jaume (1997), “Growth and Interdependence,” Quarterly Journal of Economics, CXII (No. 1, February): 57-84.
Wolff, E. N. (1991), “Capital Formation and Productivity Convergence,” American Economic Review, 81:565-579.
Young, Alwyn (1995), “The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience,” Quarterly Journal of Economics, CX: 641-680.
Young, Alwyn (1992), “A Tale of Two Cities: Factor Accumulation and Technical Change in Hong Kong and Singapore,” NBER Macroeconomics Annual 1992, Olivier Blanchard and Stanley Fischer (eds.), Cambridge, MIT Press, pp. 13-54.
Zivot, Eric and Donald Andrews (1992), “Further Evidence on the Great Crash, the Oil Price Shock, and the Unit Root Hypothesis,” Journal of Business and Economic Statistics, 10:251-270.