Internal and External Distance: Gravity Depends on It!

Werner Antweiler*
University of British Columbia
October 30, 2007

Abstract

Gravity models of international trade rely crucially on measures of distance, both internal within a country and external between pairs of countries. Yet empirical work on the gravity model makes use of rather imperfect approximations, such as the distance between countries’ capital cities and ad-hoc assumptions about the shape of countries. The resulting distance measures use different methodologies for internal and external distance and do not allow for shifting patterns of economic activity within a country. Using the Gridded Population of the World (GPWv3) database, this paper introduces a distance measure that overcomes these limitations. It is based on a weighted harmonic mean of distances between small latitude-longitude squares within each country. This paper shows that internal and external distance vary over time as populations move within countries, and that these distance measures affect the results of different types of gravity equation estimations in a significant manner. It also becomes evident that country-internal migration affects bilateral trade friction. A further purpose of this paper is to document the new time-varying distance measures and make them freely available to other researchers.

VERY PRELIMINARY. PLEASE DO NOT CITE.

* Sauder School of Business, University of British Columbia, 2053 Main Mall, Vancouver, BC, V6T 1Z2, Canada. Phone: 604-822-8484. E-mail: werner.antweiler@ubc.ca. This paper owes a debt of gratitude to Keith Head, Thierry Mayer, and John Ries. Some of the critical ideas in this paper were inspired by their recent work involving gravity equations and models of economic geography. This paper was written while I was on sabbatical visiting the University of Kiel in Germany. I would like to express sincere thanks to my host Horst Raff and his staff for their hospitality.

1 Introduction

The gravity equation of international trade remains one of the workhorses of empirical international trade research. In recent years it has found a solid theoretical underpinning as the gravity equation can be derived from a model of differentiated goods with increasing returns to scale in production and iceberg transportation costs (Anderson and van Wincoop, 2003, 2004). Gravity equations can also be derived from modern versions of Ricardian (technology- or productivity-driven) trade; see Eaton and Kortum (2002), Melitz and Ottaviano (2005), and Dekle, Eaton, and Kortum (2007).

All of the empirical work related to the gravity model depends crucially on a measure of distance. Moreover, modifications of the gravity model that estimate the border effect also distinguish between internal distance $D_{ii}$ of a country $i$ and external distance $D_{ij}$ between countries $i$ and $j$. Most work simply uses the great-circle distance between capital cities of countries as a ready approximation for the external distance between countries. As the location of capital cities in large countries may not be central, country distances may be mismeasured substantially. Some capital cities are not the main agglomerations for historical or political reasons. Even taking the largest urban agglomeration instead of the capital city does not overcome the spatial mismeasurement problem. Many large countries have multiple large agglomerations.

The internal distance of a country is approximated with a variety of measures, in particular the area of the country or its square root, multiplied by a suitably defined proportionality factor. As is the case for external distance, these approximations of internal distance may be quite inaccurate. Furthermore, internal and external distances are measured using different methods, therefore introducing further data problems.

The questions thus are: How large are these inaccuracies? Do these inaccuracies matter? If yes, how much do they matter?
Can we do better and develop more accurate measures of internal and external distance? This paper aims to answer these questions by introducing a new set of internal and external distance measures that are computed consistently. Using the Gridded Population of the World version 3 (GPWv3) database, it is feasible to calculate average internal and external distances that weight distances by population in origin and destination regions. The relatively fine grid of the GPWv3 database makes these calculations computationally expensive, but they generate a much more reliable set of distance measures against which conventional methods of distance calculation can be judged.

This paper is not the first to address the problem of internal and external distances. Head and Mayer (2000, 2002) provide a survey of some of the problems and attempts at addressing them. Referring to the mismeasurement of internal distances, the Nitsch (2001) paper was entitled “It’s Not Right but It’s Okay.” Using a much superior method of calculating internal and external distances consistently, this paper finds that mismeasurement problems are serious, and that internal and external distances that take account of shifts of population within countries vary significantly over time. This result is consistent with Head and Disdier (2007), who report time variation in estimates of the distance effect in gravity equations.

A further intriguing question lurks behind measuring internal and external distance. Does economic distance, as opposed to geographic distance, change over time? Economic distance is shaped by where economic activity takes place and thus where in a country people choose to settle. Internal migration within countries has never been linked to international trade. The effect of immigration on trade has already been studied elsewhere; see for example Wagner, Head, and Ries (2002) and Mundra (2005).
However, this is the first paper to demonstrate that not just international migration but also country-internal population migration affects trade friction between countries and thus trade flows.

The new measures of internal and external distance are made available freely to other researchers on the author’s web site [URL to be announced]. This web page also contains ancillary information and further documentation. Estimation of the gravity equation remains a workhorse of empirical international trade research, including policy analysis. Improving the quality of this work is therefore important. Whereas Silva and Tenreyro (2006) take a critical look at estimation strategies, this paper takes a critical look at part of the underlying data.

2 Strategies for Estimating Gravity Equations

In its simplest form, the gravity model predicts that exports $X_{ij}$ from region $i$ to region $j$ are proportional to the output $Y_i$ of the exporter region and the expenditures $Y_j$ of the importer region, and inversely proportional to the distance $D_{ij}$ between the two regions. The classic gravity equation can be written as

$$X_{ij} = A\,\frac{Y_i Y_j}{D_{ij}} \tag{1}$$

Estimating a log-linear fully-parameterized version of this equation

$$\ln X_{ijt} = \beta_0 + \beta_1 \ln Y_{it} + \beta_2 \ln Y_{jt} + \beta_3 \ln D_{ij[t]} + \mu_i + \mu_j + \mu_t + \epsilon_{ijt} \tag{2}$$

has a number of pitfalls, however. As modern derivations of the gravity equation show, the proportionality factor $A$ in equation (1) is not constant at all, and the GDP-trade elasticities ought to be constrained to unity. Furthermore, in a multiyear panel of equation (1) many authors choose inconsistent deflators for the volume of trade $X_{ij}$. As Baldwin and Taglioni (2006) show, this is potentially troublesome. We will return to some of these estimation problems later and address how the inclusion of various types of dummy variables ($\mu_i$, $\mu_j$, and $\mu_t$) can mitigate some of these problems.

There are several approaches to derive modern versions of the gravity equation.
Fortunately, they all generate expressions, or estimating equations, that are quite similar. An elegant way of deriving the gravity equation has been popularized by Anderson and van Wincoop (2003), following in the footsteps of Krugman (1980), Deardorff (1998), and others. Their derivation of the gravity model assumes that consumers have CES love-of-variety preferences for differentiated goods and that trade exhibits iceberg transportation costs $\phi_{ij} = 1 + \tau_{ij}$ with $\tau_{ij} > 0$. With product differentiation captured through the substitution elasticity $\sigma$, and introducing world output $Y_w$, the modern gravity equation can be written as

$$X_{ij} = \frac{Y_i Y_j}{Y_w} \left[ \frac{\phi_{ij}}{R_j P_i} \right]^{1-\sigma} \tag{3}$$

Here, $R_j$ and $P_i$ are the inward and outward multilateral resistance terms that capture the notion that trade barriers are relative to those of alternative export or import destinations. The presence of these two multilateral resistance terms makes it difficult to estimate (3) directly, and thus Anderson and van Wincoop (2003) relied on an intricate iterative estimation procedure.

The estimation problems can be overcome, however. One empirical shortcut is to estimate the gravity equation with country fixed effects, country-pair fixed effects, or country-pair-time fixed effects. The fixed effects are meant to capture the (time-varying) multilateral resistance terms. The fixed effects specification provides unbiased albeit somewhat less efficient estimates of the parameters of interest than the original Anderson and van Wincoop (2003) specification.

Alternative procedures transform the gravity equation (3) in a variety of useful ways in order to eliminate the multilateral resistance terms. There are several advantages of such a transformation in addition to eliminating the multilateral resistance terms.
Consider the trade ratio approach developed in Head and Mayer (2002) and Head and Ries (2001):

$$\ln \left[ \frac{X_{ii} X_{jj}}{X_{ij} X_{ji}} \right] = (\sigma - 1) \ln \left[ \frac{\phi_{ij} \phi_{ji}}{\phi_{ii} \phi_{jj}} \right] \tag{4}$$

This approach focuses on the product of two ratios, namely the odds $X_{ii}/X_{ji}$ of buying domestic goods relative to foreign goods in country $i$ with the odds $X_{jj}/X_{ij}$ of buying domestic relative to foreign goods in country $j$. Head and Mayer (2002) refer to the square root (the geometric mean) of the two odds ratios as bilateral “trade friction”:

$$\Xi_{ij} \equiv \sqrt{\frac{X_{ii} X_{jj}}{X_{ij} X_{ji}}} > 0 \tag{5}$$

If one assumes symmetry in transportation costs ($\phi_{ij} = \phi_{ji}$) and zero country-internal transportation costs ($\phi_{ii} = \phi_{jj} = 1$), the expression on the right-hand side of equation (4) simplifies to $2(\sigma - 1) \ln(\phi_{ij})$. This makes it possible to estimate the border effect by regressing the left-hand side of equation (4) on a log-linear function of trade impediments such as distance, language, common borders, currency unions, FTA memberships, and so on. In addition to being able to estimate the border effect with ordinary least squares, a further advantage of this specification is the elimination of price effects on the left-hand side of the equation, and thus of the need to find suitable price deflators for the value of exports and imports.

Assuming zero country-internal trade costs is certainly a rather crude approximation, the more so for large countries such as the United States. If one assumes that $\phi_{ij} = D_{ij}^{\kappa}$ is merely a function of distance, then (4) can be estimated as

$$\ln \sqrt{\frac{X_{ii} X_{jj}}{X_{ij} X_{ji}}} = \ln \Xi_{ij} = \beta \ln \frac{D_{ij}}{\sqrt{D_{ii} D_{jj}}} \tag{6}$$

where $\beta \equiv (\sigma - 1)\kappa > 0$. Note the appearance of internal distances $D_{ii}$ and $D_{jj}$ in the equation.
Whereas the left-hand side of the equation is the “measured trade friction” ($\ln \Xi_{ij}$), the right-hand side of the equation is the “predicted trade friction,” which in turn is driven by the substitution elasticity factor $(\sigma - 1)$ and the relative distance ratio

$$\Psi_{ij} \equiv \frac{D_{ij}}{\sqrt{D_{ii} D_{jj}}} > 0 \tag{7}$$

If distance varies over time, so does the relative distance ratio.

As in Wei (1996), internal trade $X_{ii}$ is calculated as output minus all exports, i.e.,

$$X_{ii} = Y_i - \sum_{j \neq i} X_{ij} \tag{8}$$

A simple approximation of this is to subtract total exports from GDP. At the industry level, this is more suitably replaced by industry output minus industry exports.

There is yet another useful transformation which has been suggested by Head, Mayer, and Ries (2007). Instead of using country dyads (pairs), one can also use country tetrads involving two exporters and two importers. Then

$$\ln \left[ \frac{X_{ij} X_{lk}}{X_{ik} X_{lj}} \right] = (\sigma - 1) \ln \left[ \frac{\phi_{ik} \phi_{lj}}{\phi_{ij} \phi_{lk}} \right] \tag{9}$$

This expression involves only external distances, but for a country pair of interest this involves choosing two reference countries. Unless the reference country is held fixed, the above expression involves estimating the border effect from a data set of dimension $O(n^4)$ instead of $O(n^2)$ with respect to the number of countries $n$. A variation of this approach chooses the same reference country $l = k$, in which case one needs one internal distance and three external distances rather than four external distances.

Lastly, it is also possible to estimate the gravity equation in a time-differenced form. If the trade resistance term $\phi_{ij}$ is modeled in log-linear form such that

$$\ln \phi_{ijt} = \alpha + \beta U_{ij} + \gamma \ln V_{ijt} + \delta \ln D_{ijt} + \epsilon_{ijt} \tag{10}$$

then the time-differenced version of (6) can be written as

$$\Delta \ln \Xi_{ijt} = (\sigma - 1)\delta\, \Delta \ln \Psi_{ijt} + \frac{(\sigma - 1)\gamma}{2}\, \Delta \ln \left( \frac{V_{ijt} V_{jit}}{V_{iit} V_{jjt}} \right) + \epsilon_{ijt} \tag{11}$$

If distance were time invariant, $\Delta \ln \Psi_{ijt}$ would be zero. Time differencing equation (10) cancels out the intercept and time-invariant terms, leaving merely the time-variant explanatory variables.
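
The step from equations (4), (5), and (10) to the time-differenced form (11) can be spelled out as follows. This is a sketch under two assumptions that the text uses only implicitly: distances are symmetric, $D_{ijt} = D_{jit}$, and the four error terms from (10) are collected in a composite term written here as $e_{ijt}$ (a label introduced only for this illustration).

```latex
\begin{align*}
\ln \Xi_{ijt}
  &= \tfrac{1}{2} \ln \frac{X_{iit} X_{jjt}}{X_{ijt} X_{jit}}
   = \frac{\sigma-1}{2}
     \left[ \ln\phi_{ijt} + \ln\phi_{jit} - \ln\phi_{iit} - \ln\phi_{jjt} \right]
   && \text{by (5) and (4)} \\
  &= \frac{\sigma-1}{2}
     \left[ \beta \left( U_{ij} + U_{ji} - U_{ii} - U_{jj} \right)
          + \gamma \ln \frac{V_{ijt} V_{jit}}{V_{iit} V_{jjt}}
          + \delta \ln \frac{D_{ijt} D_{jit}}{D_{iit} D_{jjt}} \right] + e_{ijt}
   && \text{by (10); } \alpha \text{ cancels} \\
  &= (\sigma-1)\,\delta \ln \Psi_{ijt}
     + \frac{(\sigma-1)\gamma}{2} \ln \frac{V_{ijt} V_{jit}}{V_{iit} V_{jjt}}
     + \frac{(\sigma-1)\beta}{2} \left( U_{ij} + U_{ji} - U_{ii} - U_{jj} \right)
     + e_{ijt}
   && \text{by (7), } D_{ijt} = D_{jit}
\end{align*}
```

Taking first differences over time removes the term in $U$, which carries no time subscript, and leaves the two terms in $\Delta \ln \Psi_{ijt}$ and $\Delta \ln (V_{ijt} V_{jit} / (V_{iit} V_{jjt}))$ that appear in (11).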
For example, if $V_{ijt}$ captures the tariff mark-up $1 + \tau_{ijt}$ that country $j$ imposes on imports from country $i$, then (11) would appear as

$$\Delta \ln \Xi_{ijt} = (\sigma - 1)\delta\, \Delta \ln \Psi_{ijt} + (\sigma - 1)\gamma\, \Delta \ln \sqrt{(1 + \tau_{ijt})(1 + \tau_{jit})} + \mu_{ijt} \tag{12}$$

The square root expression is thus the geometric average of the bilateral tariffs.

The time-differenced gravity equation (12) is able to answer one further important question. If internal and external distances change over time, then changes in the relative distance ratio $\Psi$ should have a positive and significant effect on the trade friction $\Xi$. This amounts to a test of the hypothesis that country-internal migration affects trade flows by increasing or decreasing trade friction. Testing this hypothesis is a central theme of this paper.

The discussion above has focused primarily on the Anderson and van Wincoop (2003) approach for deriving the modern gravity equation. As already observed in Feenstra, Markusen, and Rose (2001), the gravity equation is consistent with a number of microeconomic foundations. Given the specific assumptions underlying the model, the Anderson and van Wincoop approach is by no means entirely satisfactory. Their model is not a full-fledged model of international trade because it provides no source of comparative advantage (beyond proximity or remoteness), thus ignoring technology and productivity differences (Ricardo) and factor endowments (Heckscher-Ohlin). Unrealistically, the Anderson and van Wincoop model predicts that all countries trade with each other, even if only tiny amounts. This ignores the fact that there is a very large number of zero-trade country pairs. From this perspective, the Anderson and van Wincoop model might be viewed as more fitting to differentiated goods trade among OECD countries than interindustry trade among Northern and Southern countries.

Significant progress has been made to remedy some of the problems.
Eaton and Kortum (2002) derive a gravity equation from a Ricardian trade model with a continuum of goods and probabilistic technology. Melitz and Ottaviano (2005) introduce quadratic utility with variety preferences for consumers, thus explicitly modeling when bilateral trade may become zero. Firms compete monopolistically and exhibit dispersion in technology and productivity. Helpman, Melitz, and Rubinstein (2007) also account for zero trade, but in a CES utility model. They also allow for firm heterogeneity so that only the highest productivity (lowest cost) firms export. This brief selection of recent research papers demonstrates the substantial progress that has been made in finding a sound theoretical underpinning of the gravity equation.

3 Measuring Distance

How can one construct a measure of distance that appropriately reflects the spatial structure of a country? A useful measure is obviously a weighted average of the distance between cities: either all the cities in a particular country to get a measure of internal distance, or all the cities in a pair of countries to get a measure of external distance. But which weights are appropriate? It may appear that trade flows provide good weights. However, actual trade between cities is usually unobservable. Furthermore, using a measure of predicted rather than actual trade prevents contaminating the distance measure with trade barriers that are unrelated to distance. The gravity model provides a useful starting point to obtain a measure of predicted trade opportunities in the sense that it can be viewed as a search-and-match model.

Let $\Omega_i$ denote the set of locations in country $i$, and let $\Omega_j$ denote the set of locations in country $j$. Then define the distance weight

$$W_{ij} \equiv \frac{P_i P_j}{D_{ij}} \tag{13}$$

as the gravity-style approximation of interaction opportunities between regions $i$ and $j$, where $P_i$ and $P_j$ are the populations of the two regions.
Hence,

$$D_{ij} = \frac{\sum_{k \in \Omega_i} \sum_{l \in \Omega_j} D_{kl} W_{kl}}{\sum_{k \in \Omega_i} \sum_{l \in \Omega_j} W_{kl}} = \frac{\sum_{k \in \Omega_i} \sum_{l \in \Omega_j} P_k P_l}{\sum_{k \in \Omega_i} \sum_{l \in \Omega_j} P_k P_l / D_{kl}} \tag{14}$$

This means that distance $D_{ij}$ is the weighted harmonic mean of the pairwise distances of regions within countries $i$ and $j$. Other authors have constructed distance measures using arithmetic means, that is, using weights $W^a_{ij} \equiv P_i P_j$ instead of $W_{ij}$. The main difference between the two distance weights is that the harmonic mean gives greater weight to small distances, whereas the arithmetic mean gives greater weight to large distances. Which one should one favour? Helliwell and Verdier (2001) make the case for arithmetic means. However, since distance is meant to capture “economic distance” in the context of international trade, the harmonic mean is consistent with the ‘gravity’ potential of trade links.¹

4 Empirical Implementation

The CIESIN (2005) GPWv3 database provides a novel opportunity to calculate internal and external distances consistently. The database projects the entire human population onto squares of 2.5 arc-minutes of longitude and latitude, that is, 24 times 24 squares for each degree of longitude and latitude. Currently, these data are available for three years: 1990, 1995, and 2000.² This allows not only for the calculation of distances, but also for the determination of how much these distances vary over the course of the last decade. Adding the time dimension to distance improves on earlier work by Mayer and Zignago (2005) and explicitly allows for the effects of country-internal population migration.

The distance between each pair of populated squares is calculated using great circle distances between midpoints. Let $\phi_i$ and $\lambda_i$ denote the latitude and longitude of location $i$.
Then the great circle distance between the two locations $i$ and $j$ is given by³

$$6372.795\,\text{km} \cdot \arccos \left( \sin \phi_i \sin \phi_j + \cos \phi_i \cos \phi_j \cos(\lambda_i - \lambda_j) \right) \tag{15}$$

¹ Nevertheless, the new distance database made available in conjunction with this paper makes both types of distance measures (harmonic and arithmetic means) available.
² Population projections are used to generate data sets for quinquennial periods past 2000.
³ Equation (15) shows the approximation for a perfect sphere. For large distances a precise ellipsoidal calculation would be superior, although computationally a lot more expensive. As the evidence shows, external distance calculations are the least sensitive for large distances, and therefore the spherical approximation suffices.

Table 1: Gridded Population Data by Population Thresholds

Population    Latitude-Longitude Squares                            # of Countries
Threshold        number    share   cumul. number   cumul. share   included   excluded
100,000           1,942    0.02%           1,942          0.02%         71        160
50,000            4,420    0.05%           6,362          0.08%        111        120
20,000           26,859    0.32%          33,221          0.39%        149         82
10,000           76,632    0.91%         109,853          1.30%        168         63
5,000           148,284    1.76%         258,137          3.06%        196         35
2,000           354,688    4.20%         612,825          7.26%        216         15
1,000           361,061    4.28%         973,886         11.54%        219         12
500             494,329    5.86%       1,468,215         17.39%        226          5
200             822,993    9.75%       2,291,208         27.14%        227          4
100             719,466    8.52%       3,010,674         35.66%        228          3
50              638,969    7.57%       3,649,643         43.23%        228          3
20              793,233    9.40%       4,442,876         52.63%        229          2
10              632,681    7.49%       5,075,557         60.12%        229          2
5               558,982    6.62%       5,634,539         66.74%        229          2
2               612,694    7.26%       6,247,233         74.00%        230          1
1               401,205    4.75%       6,648,438         78.75%        230          1
> 0           1,793,670   21.25%       8,442,108        100.00%        231          0

Of course, great circle distances remain a less than ideal approximation of shipping distances for manufactured goods. For now, the computational complexity of calculating the distances between all the squares in the GPWv3 database limits the use of superior methods based on geo-computation (those provided by GPS navigation systems, for example, or the popular Google-Maps interface on the web).
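
The two building blocks described so far, the spherical great circle distance of equation (15) and the population-weighted harmonic mean of equation (14), can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the paper: the function names, the representation of a grid cell as a `(latitude, longitude, population)` tuple, and the decision to skip pairs at zero distance are all assumptions made for the example. The helper `pair_count` simply evaluates n(n-1)/2 for the cumulative square counts in Table 1.

```python
import math

EARTH_RADIUS_KM = 6372.795  # radius used in equation (15)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Spherical great circle distance of equation (15); inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Guard against rounding slightly outside [-1, 1] before arccos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def harmonic_mean_distance(cells_i, cells_j):
    """Weighted harmonic mean of equation (14).

    Each cell is a hypothetical (lat, lon, population) tuple.
    Assumption: pairs at zero distance (a cell paired with itself)
    are skipped, because their weight P_k * P_l / D_kl is undefined.
    """
    numerator = 0.0    # sum of P_k * P_l
    denominator = 0.0  # sum of P_k * P_l / D_kl
    for lat_k, lon_k, pop_k in cells_i:
        for lat_l, lon_l, pop_l in cells_j:
            d = great_circle_km(lat_k, lon_k, lat_l, lon_l)
            if d == 0.0:
                continue
            numerator += pop_k * pop_l
            denominator += pop_k * pop_l / d
    if denominator == 0.0:
        raise ValueError("no pair of cells with positive distance")
    return numerator / denominator


def pair_count(n_squares):
    """Number of unordered pairs among n squares: n (n - 1) / 2."""
    return n_squares * (n_squares - 1) // 2
```

With one origin cell on the equator and two equally populated destination cells a quarter and a half of the way around the equator, the harmonic mean lies below the arithmetic mean of the two distances, which is the property that gives nearby locations greater weight. For the 109,853 squares with at least 10,000 inhabitants in Table 1, `pair_count` returns roughly six billion pairs.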

Table 1 illustrates the computational complexity of calculating distances between pairs of latitude-longitude squares. If only squares are included that have a population of 10,000 people or more, we require 109,853 × 109,852 / 2, or about 6 billion, calculations. However, at this resolution 63 countries would not be covered because they have low population densities. When including all 8,442,108 latitude-longitude squares, we require 8,442,108 × 8,442,107 / 2, or 35.6 trillion, calculations. Ultimately, I calculate all external distances at a resolution that includes all squares with 50 people or more, at a computational expense of 6.7 trillion calculations. Even on a very fast Apple G5 (64-bit) computer, acquired in 2006, this task took three weeks to complete. When considering country-internal distances alone, I include all longitude-latitude squares as the computational complexity of this problem is considerably smaller.

Figure 1 shows the frequency distribution of external distances for all country pairs. The median distance is 7,808 km. The distribution of distances trails off for long inter-continental distances. The smallest distance is 46 km (between the U.S. and British Virgin Islands) and the largest is 19,744 km (between Ghana and Tuvalu).

[Figure 1: Distribution of External Distances. Horizontal axis: country-pair distances (1000 km); vertical axis: relative frequency.]

[Figure 2: Population Thresholds and Internal Distance Calculation for the United States. Horizontal axis: population threshold (logarithmic scale, 1 to 100k); vertical axis: distance [km].]

Choosing the population threshold for excluding latitude-longitude squares is not unproblematic. Figure 2 illustrates how the choice of threshold affects the internal distance measure for the United States.
The distance measure decreases monotonically as the threshold increases (shown on a logarithmic scale). The slope of the curve is very flat up to a population threshold of about 50 and then becomes increasingly steep. This result is a good indication that the population threshold of 50 for the external distance calculation does not compromise accuracy significantly.

5 How Much Mismeasurement?

5.1 Internal Distance

Versions of the gravity equation that make use of the dyadic transformation (6) require measures that relate country-internal distances to external distances of countries. As was shown in Wei (1996), the magnitude of the border effect depends crucially on the way internal distances are calculated. As Head and Mayer (2000) point out:

    If this internal distance is overestimated, then holding international distance constant, the negative effect of distance will be underestimated as the cost of shipping a good internally becomes closer to the cost of shipping it to another country. As a result, the border effect, which accounts for any excessive amount of trade within a country, will be given more weight in the regression, leading ceteris paribus to an overestimated border effect.

Several approximations have been suggested to address this problem.

(a) Wei (1996) suggests using one-quarter of the distance to the nearest foreign economic center.
(b) Wolf (1997, 2000) calculates internal distance as the distance between the two largest cities in each country.
(c) Nitsch (2000) and Leamer (1997) assume that internal distance is proportional to the square root of the area of the country, using a proportionality factor of $0.56 = \pi^{-1/2}$. Nitsch has previously worked with ratios of 0.2 and 0.6 as well, derived from Canadian provincial data. Using a more rigorous derivation, Head and Mayer (2000) calculate a proportionality factor of $0.376 = (2/3)\pi^{-1/2}$.
(d) Head and Mayer (2000) use employment-weighted means to calculate distances between European Union regions.
(e) Helliwell and Verdier (2001) derive and calculate arithmetic mean distances using population weights for Canadian intra-city, city-rural, and inter-provincial distances.
(f) Head and Mayer (2002) develop the gravity-style harmonic mean distance and use it for European data. However, they calculate averages based on a large set of cities at a more aggregate scale.
(g) Chen (2004) also calculates region-weighted internal distances for European Union countries, yielding somewhat different results than Head and Mayer (2000).

Table 2: Regression of Circular Area Internal Distance Approximation on Gravity Distances

                  Test      (A)         (B)
Intercept         ≠ 0     -0.828^c    -0.823^c
                          (7.17)      (7.22)
Log Distance      ≠ 1      1.291^c     1.121^a
                          (10.4)      (2.19)
Log Population    ≠ 0                  0.086^c
                                      (3.49)
Observations               223         223
R²                         0.906       0.909

Note: Dependent variable is the circular area approximation from Head and Mayer (2000), $D_i^c = (2/3)\sqrt{A_i/\pi}$, where $A_i$ is the country area in square kilometers. Ordinary least squares regressions are performed on all countries except those that are profound outliers. The outlier truncation rule is $|\ln(D_i^c / D_i^g)| > 2$, where $D_i^c$ and $D_i^g$ are internal distances calculated through the circular area approximation and through population-weighted averages, respectively. Absolute t-ratios for the tests shown in the table (either ≠ 0 or ≠ 1) are given in parentheses. Statistical significance at the 95%, 99%, and 99.9% confidence levels is indicated by the superscripts a, b, and c, respectively.

How well do these approximations capture internal distance? Internal distance that explicitly allows for the spatial distribution of population (and thus economic activity) within countries provides a useful benchmark against which one can compare these approximations. Figure 3 illustrates the degree of mismeasurement⁴ when comparing the circular area approximation (using the 2/3 proportionality factor) with the population-weighted country-internal distances for the year 2000.
Obvious outliers in this diagram are several island nations where populations are scattered over several (sometimes distant) islands, for example Tuvalu, the Maldives, or the Marshall Islands. The other extreme is Suriname. In this mid-sized South American country (and former Dutch colony) the population is highly concentrated around the capital Paramaribo, whereas the vast interior is largely unpopulated.

⁴ Referring to these discrepancies as “mismeasurement” implies that the new distance measures are superior; it is not meant to imply that they are perfect, though. Their limitations have been acknowledged above.

[Figure 3: Relative Accuracy of Circular Area Approximation of Internal Distance. Horizontal axis: population-weighted distance [km]; vertical axis: circular distance (2/3)sqrt(Area[sq.km.]/Pi); both on logarithmic scales. Labelled points: SUR, MHL, MDV, GTM, LBR, TUV, TKL.]

What is immediately apparent in figure 3 is that the data points scatter near the 45-degree line but with a downward bias for small countries and an upward bias for large countries. This visually compelling result is confirmed by econometric analysis. Table 2 shows the results from estimating a log-linear model with the circular area approximation of internal distance, $(2/3)(A_i/\pi)^{1/2}$, as the dependent variable and the population-weighted internal distance as the independent variable. Results are shown in column (A). In column (B) population has been added as a second regressor. Outliers were removed from the regression as indicated in the table notes. The hypothesis of a 45-degree line (unity of the log distance coefficient) is soundly rejected by the t-test. The circular area approximation biases internal distance upward for large countries and downward for small countries. Notwithstanding the significant heterogeneity in population densities across countries, the aforementioned bias is also linked to population size. The log distance coefficient in column (A) can be disaggregated almost completely into a size effect and a population effect in column (B).
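
The test of the 45-degree line amounts to a t-test of the hypothesis that the slope of a log-log regression equals one rather than zero. The sketch below shows the mechanics for a single regressor, as in column (A) of Table 2. It is only an illustration: the function name is hypothetical, and any data passed to it here are made up, not the paper's.

```python
import math


def ols_slope_test(x, y, slope_null=1.0):
    """Simple OLS of y on x with a t-ratio for H0: slope == slope_null.

    Returns (intercept, slope, standard error of slope, absolute t-ratio).
    In the setting of Table 2, x would hold the log population-weighted
    distances and y the log circular area approximations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_ratio = abs(slope - slope_null) / se_slope
    return intercept, slope, se_slope, t_ratio
```

A slope above one with a large t-ratio against the null of one corresponds to the pattern described above: the approximation understates distance for small countries and overstates it for large ones.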

Figure 3 and the results in table 2 show that internal distance approximations suffer from a potentially troubling bias.

5.2 External Distance

By far the most popular way to approximate external distances for gravity equations has been the great circle distance between the capital cities of countries. Alternatively, some authors have used the distance between the main agglomerations, which often are not the capital cities.⁵ This method is marginally better, but still ignores that many large countries have multiple important agglomerations, often hundreds or even thousands of kilometers apart.

Prior to the work presented here, Mayer and Zignago (2005) provide the most serious attempt to put distance measures on a better footing. In addition to providing distance measures for 225 countries, Mayer and Zignago (2005) also provide an additional 13 distance measures for countries where the capital city is not the main city. Furthermore, they provide two weighted distance measures (harmonic and arithmetic means) that are based on the same idea as the measures proposed in this paper. The key difference is that the measures proposed in this paper take into account all of the world’s population rather than just a list of urban agglomerations.

The two weighted distance measures used by Mayer and Zignago (2005) calculate distance between two countries based on bilateral distances between the largest cities of those two countries. The authors stress that this procedure can be used in a consistent way for both internal and international distances. This point is echoed loudly in this paper: to use distance measures appropriately and effectively, internal and external distances must be calculated in a consistent manner. Mayer and Zignago (2005) obtain geographic locations and population data of main agglomerations from the database available on the www.worldgazetteer.com web site.
In contrast, the new data set provided here improves upon these measures and introduces one new distinct and important feature, namely, that internal and external distances vary over time due to the effects of country-internal migration.

⁵ For example: Toronto vs. Ottawa in Canada, New York vs. Washington in the United States, Rio de Janeiro vs. Brasilia in Brazil.

Capital city distances have been used widely as approximations of external distance. Compared to the new distance measure, how far off the mark are these conventional approximations? Table 3 shows the mismeasurement of external distance by comparing capital-city distances with population-weighted distances. To present results compactly, distances are averaged for each country, using country population as weights. Only those countries are shown where the average distance difference exceeds 200 km, and countries with fewer than 5 million inhabitants (mostly island states) are suppressed.

Table 3: Mismeasurement of Country Distances by Capital City Distances

                       Capital    GPWv3    Diff.     Min.     Max.    Rel.
Country                   [km]     [km]     [km]     [km]     [km]     [%]
HKG  Hong Kong           3,051    2,238      813     -379    1,258    36.3
AUS  Australia          10,465    9,754      711     -545    1,513     7.3
MOZ  Mozambique          7,031    6,336      695   -1,643    1,375    11.0
LKA  Sri Lanka           4,232    3,545      687     -445    1,141    19.4
MMR  Myanmar             3,389    2,802      586     -447    1,144    20.9
THA  Thailand            3,722    3,216      506     -454    1,135    15.7
KHM  Cambodia            3,669    3,172      496     -418    1,037    15.6
IDN  Indonesia           6,066    5,591      475     -508    1,098     8.5
BGD  Bangladesh          2,722    2,247      474     -311      855    21.1
LAO  Laos                3,239    2,786      453     -425    1,101    16.3
COD  Congo/Zaire         6,189    5,764      425   -1,677    1,494     7.4
MYS  Malaysia            4,396    3,979      417     -535    1,083    10.5
TWN  Taiwan              3,329    2,945      384     -364    1,372    13.1
CHN  China               4,835    4,496      339     -939    1,846     7.6
VNM  Viet Nam            3,421    3,136      285     -753    1,451     9.1
PHL  Philippines         4,622    4,384      238     -559    1,486     5.4
ZMB  Zambia              6,295    6,088      207   -1,376      717     3.4
ARG  Argentina          10,578   10,377      201   -1,664      744     1.9
DNK  Denmark             3,787    3,998     -211     -762    1,465    -5.3
SWE  Sweden              4,168    4,385     -217     -819    1,403    -5.0
GBR  Great Britain       4,371    4,607     -236     -741    1,686    -5.1
VEN  Venezuela           8,307    8,564     -258     -948      489    -3.0
ECU  Ecuador             8,951    9,215     -263     -933      511    -2.9
FIN  Finland             4,358    4,644     -286     -744    1,597    -6.2
CAN  Canada              6,720    7,063     -344   -1,109    1,760    -4.9
USA  United States       8,782    9,257     -475   -1,677    2,121    -5.1

Note: The table is sorted in descending order of average distance mismeasurement. Differences are weighted by population of partner country divided by gridded (GPWv3) distance in order to capture the ‘trade potential’ of partner countries. Only differences in excess of 200 km are shown for countries that have at least 5 million inhabitants (thus excluding most island states).

The percentage differences can be quite large. Capital city distances err in particular on the side of placing countries too far apart. The most noticeable case is Hong Kong: it lies next to a densely populated area in mainland China, the Pearl River Delta, yet the capital-city measure places it at the distance of China’s capital, Beijing, which is much too far.
On the other end of the scale, Canada and the United States are more remote from the rest of the world than the location of their capital cities suggests. A closer look at the United States illustrates the problem, decomposing the distance measurement by partner country. Table 4 shows the distance difference in descending order of percentage difference where this difference exceeds 10 percent. There are two large outliers. The capital city measure puts Mexico too far away from the United States (by 40%), and Canada too close (by 49%)! Intra-NAFTA trade flows are quite large, and gravity equation estimates that get the distances between these countries ‘wrong’ by that order of magnitude will lose some of their credibility. Table 4 shows that the location of Washington, DC on the east coast of the United States makes European countries appear roughly 10-14% closer than the population-weighted average distances suggest. Likewise, this effect is also noticeable for a number of African countries.

To treat these findings more rigorously, table 5 estimates a log-linear model with the capital-city distance as the dependent variable and the population-weighted distance (linear and squared) as the independent variables, progressively limiting the sample from column (A) through column (E) by lowering the threshold for excluding country pairs from 50,000 km to 500 km. Columns (A) through (C) suggest that the relationship between the two measures is both nonlinear (due to the significance of the quadratic term) and subject to a noticeable positive intercept. This positive intercept captures the story of Hong Kong and China’s Pearl River Delta: major agglomerations are often closer to the border between two countries than the distance between the two countries’ capital cities suggests. Lowering the distance threshold for including country pairs has the effect of lowering the correlation between the two series.
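The threshold regressions just described can be sketched in a few lines of Python. This is only an illustration on synthetic data, not the estimation behind table 5: the function name `fit_log_quadratic`, the simulated distances, and the assumption that the capital-city measure differs from the gravity measure by an error of a few hundred kilometres regardless of scale are all hypothetical.

```python
import numpy as np

def fit_log_quadratic(capital_km, gravity_km, threshold_km):
    """OLS of ln(capital distance) on ln(gravity distance) and its square,
    using only pairs whose gravity distance is at most threshold_km."""
    keep = gravity_km <= threshold_km
    x = np.log(gravity_km[keep])
    y = np.log(capital_km[keep])
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid.var() / y.var()
    return beta, r2, int(keep.sum())

# Synthetic data (hypothetical): the capital-city distance equals the
# gravity distance plus an error of a few hundred km at every scale.
rng = np.random.default_rng(0)
gravity = rng.uniform(200.0, 20000.0, size=5000)
capital = np.abs(gravity + rng.normal(0.0, 300.0, size=gravity.size)) + 1.0

results = {t: fit_log_quadratic(capital, gravity, t) for t in (50000, 5000, 1000)}
for t, (beta, r2, n) in results.items():
    print(f"threshold {t:>6} km: n={n}, R^2={r2:.3f}")
```

Because an error of fixed absolute size is small relative to long distances and large relative to short ones, the fit should deteriorate as the threshold is lowered, which is the pattern the text attributes to table 5.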
For long distances, the mismeasurement problem is negligible, and the R² is near perfect. However, when looking only at near countries, the mismeasurement problem is quite pronounced and the R² drops noticeably. Mismeasurement is much less of a problem for external distances than for internal distances, simply because large distances dominate the picture. The question then becomes: how much does it matter? The section after next will return to this problem. Before doing so, it is necessary and useful to explore the overlooked dimension of economic distance: time.

Table 4: Mismeasurement of Country Distances by Capital City Distances: A Closer Look at the United States of America and her Trading Partners

| Country (Capital) | Capital [km] | GPWv3 [km] | Diff. [km] | Rel. [%] |
|---|---|---|---|---|
| Mexico (Mexico) | 3,123 | 2,230 | 893 | 40.0 |
| Guatemala (Guatemala) | 3,114 | 2,750 | 364 | 13.2 |
| El Salvador (San Salvador) | 3,176 | 2,874 | 302 | 10.5 |
| Honduras (Tegucigalpa) | 3,040 | 2,757 | 283 | 10.3 |
| Venezuela (Caracas) | 3,451 | 3,844 | -393 | -10.2 |
| Austria (Vienna) | 7,043 | 7,846 | -804 | -10.2 |
| Italy (Rome) | 7,159 | 7,998 | -840 | -10.5 |
| Dominican Republic (Santo Domingo) | 2,507 | 2,808 | -301 | -10.7 |
| Hungary (Budapest) | 7,271 | 8,162 | -890 | -10.9 |
| Brazil (Brasilia) | 6,872 | 7,722 | -849 | -11.0 |
| Nigeria (Abuja) | 8,883 | 9,982 | -1,099 | -11.0 |
| Great Britain (London) | 5,835 | 6,560 | -725 | -11.1 |
| Ukraine (Kiev) | 7,744 | 8,708 | -965 | -11.1 |
| Bulgaria (Sofia) | 7,847 | 8,829 | -982 | -11.1 |
| Burkina Faso (Ouagadougou) | 7,932 | 8,939 | -1,007 | -11.3 |
| Serbia and Montenegro (Belgrade) | 7,529 | 8,495 | -966 | -11.4 |
| Ivory Coast (Yamoussoukro) | 8,000 | 9,032 | -1,032 | -11.4 |
| Slovakia (Bratislava) | 7,098 | 8,054 | -956 | -11.9 |
| Angola (Luanda) | 10,627 | 12,061 | -1,434 | -11.9 |
| Tunisia (Tunis) | 7,299 | 8,322 | -1,023 | -12.3 |
| Czech Republic (Prague) | 6,809 | 7,770 | -960 | -12.4 |
| Switzerland (Bern) | 6,536 | 7,474 | -938 | -12.5 |
| Belgium (Brussels) | 6,147 | 7,036 | -888 | -12.6 |
| Mali (Bamako) | 7,386 | 8,473 | -1,086 | -12.8 |
| Chad (N’Djamena) | 9,283 | 10,656 | -1,373 | -12.9 |
| Russia (Moskva) | 7,737 | 8,886 | -1,149 | -12.9 |
| Netherlands (Amsterdam) | 6,096 | 7,023 | -927 | -13.2 |
| Libya (Tripoli) | 7,773 | 8,978 | -1,205 | -13.4 |
| Congo/Zaire (Kinshasa) | 10,492 | 12,169 | -1,677 | -13.8 |
| Algeria (Algiers) | 6,767 | 7,850 | -1,084 | -13.8 |
| Portugal (Lisbon) | 5,710 | 6,627 | -918 | -13.8 |
| Guinea (Conakry) | 7,098 | 8,243 | -1,145 | -13.9 |
| France (Paris) | 6,100 | 7,112 | -1,012 | -14.2 |
| Niger (Niamey) | 8,142 | 9,500 | -1,358 | -14.3 |
| Spain (Madrid) | 6,065 | 7,077 | -1,012 | -14.3 |
| Morocco (Rabat) | 6,138 | 7,200 | -1,063 | -14.8 |
| Senegal (Dakar) | 6,422 | 7,537 | -1,115 | -14.8 |
| Canada (Ottawa) | 610 | 1,205 | -595 | -49.4 |

Note: The table is sorted in descending order of distance mismeasurement. Only differences in excess of 200 km are shown for countries that have at least 5 million inhabitants (thus excluding most island states).

Table 5: Regression of Capital-City Distances on Gravity Distances by Distance Thresholds

| | (A) | (B) | (C) | (D) | (E) |
|---|---|---|---|---|---|
| Threshold [km] | 50,000 | 10,000 | 5,000 | 1,000 | 500 |
| Intercept | 0.456^c (13.3) | 0.454^c (8.14) | 0.569^c (4.56) | -0.803 (.788) | -1.711 (.595) |
| Log Distance | 0.899^c (108) | 0.900^c (63.0) | 0.867^c (25.2) | 1.317^c (3.84) | 1.672 (1.56) |
| Log Distance Squared | 0.006^c (11.0) | 0.006^c (6.07) | 0.008^c (3.31) | -0.029 (.996) | -0.063 (.633) |
| Observations | 22,155 | 14,668 | 6,315 | 716 | 231 |
| R² | 0.991 | 0.985 | 0.964 | 0.742 | 0.535 |

Note: Ordinary least squares regressions are performed on all observations where the gravity distance is smaller than or equal to the threshold indicated in the first data row of the table. Dependent variable is the capital-city distance. Absolute t-ratios are given in parentheses. Statistical significance at the 95%, 99%, and 99.9% confidence levels is indicated by the superscripts a, b, and c, respectively.

6 Does Distance Vary Over Time?

One of the key contributions of this paper is to add the time dimension to measures of (economic) distance. With the distance measures introduced in this paper, distance variation over time is caused by internal population migration within countries. There are several well-known economic processes that can shape internal migration:

1. Urbanization (primarily related to the transition from agricultural production to manufacturing)
2. Regional economic ascent and decline, often due to natural resources availability (minerals, oil & gas, fisheries, forestry, etc.)
3. Agglomeration (external economies of scale in production)
4. Demographics (e.g., retirees prefer warmer climates)
5. Policy-induced migration (subsidies, taxes)
6. Trade liberalization with neighbouring countries

As people move within countries, they can move closer together or further apart. This affects average internal distance. Similarly, they can move closer to one border or another. This affects external distances between countries. The GPWv3 database provides the tool to investigate the magnitude of internal migration and how it affects distances.

6.1 Internal Distance

Table 6 shows internal country distances, sorted in descending order of the percentage increase in the population-weighted harmonic (‘gravity’) mean between 1990 and 2000. Countries with less than 3% change are suppressed, as are a number of very small island and city-state countries. The percentage changes can be remarkably large.
Several countries have changes in excess of ten percent such as Guinea, Argentina, Venezuela, Russia, Gambia, Albania, and Greece. Some of the distance changes can also be sizeable in absolute terms. For example, Russia’s internal distance shrank by 54 km (or about 11%) in the last decade. Russia’s economic turmoil following the end of communism contributed to significant population movements and accelerated agglomeration.

Table 6 also shows the difference between harmonic and arithmetic means of internal distance. Arithmetic averages tend to be significantly larger because they give greater prominence to long distances. By comparison, harmonic means give greater weight to short distances. Most changes agree in direction, although not always. For example, Canada has shrunk by 3.1% in terms of its harmonic average distance, but has expanded by 1.2% in terms of its arithmetic average distance. By contrast, the United States has expanded under both types of average, by 3.7% (harmonic) and 1.4% (arithmetic).

Changes in internal distance over a single decade are sufficiently large in magnitude to suggest that the time dimension matters. Without having data for a longer time period, one can only speculate how much larger the effect of country-internal migration may have been. Nevertheless, the changes between 1990 and 2000 show how sensitive empirical work will be that makes use of internal country distances, and how this work will lead to questionable results if the time dimension (and thus internal population migration) is ignored.

Table 6: Internal Country Distance

| Country | Harmonic avg. in 2000 [km] | Harmonic Δ 1990-2000 [km] | Harmonic Δ 1990-2000 [%] | Arithmetic avg. in 2000 [km] | Arithmetic Δ 1990-2000 [km] | Arithmetic Δ 1990-2000 [%] |
|---|---|---|---|---|---|---|
| Guinea | 147.2 | 28.9 | 24.4 | 306.0 | 11.7 | 4.0 |
| Argentina | 107.6 | 11.1 | 11.5 | 632.9 | 20.3 | 3.3 |
| Venezuela | 130.6 | 12.1 | 10.2 | 384.2 | 7.3 | 1.9 |
| Armenia | 30.3 | 2.0 | 6.9 | 75.1 | 2.9 | 4.0 |
| Cuba | 98.4 | 5.0 | 5.4 | 364.6 | -1.2 | -0.3 |
| Côte d’Ivoire | 118.0 | 4.9 | 4.4 | 268.8 | 4.6 | 1.7 |
| Mexico | 196.7 | 8.2 | 4.4 | 705.6 | 22.9 | 3.4 |
| Uruguay | 41.3 | 1.6 | 3.9 | 174.2 | 0.3 | 0.2 |
| Latvia | 38.7 | 1.4 | 3.8 | 126.5 | 0.6 | 0.4 |
| United States | 607.9 | 21.5 | 3.7 | 1778.7 | 24.5 | 1.4 |
| Israel | 34.5 | 1.2 | 3.5 | 81.3 | 1.0 | 1.2 |
| Peru | 106.4 | 3.3 | 3.2 | 573.9 | 8.0 | 1.4 |
| Ireland | 49.6 | 1.6 | 3.2 | 135.6 | -1.0 | -0.7 |
| Lesotho | 54.5 | -1.7 | -3.0 | 85.0 | -2.5 | -2.9 |
| Mauritania | 111.7 | -3.5 | -3.0 | 390.7 | 1.2 | 0.3 |
| Canada | 179.4 | -5.8 | -3.1 | 1552.2 | 18.5 | 1.2 |
| Georgia | 55.1 | -1.8 | -3.2 | 148.9 | -7.4 | -4.8 |
| Norway | 90.9 | -3.1 | -3.3 | 330.4 | -4.4 | -1.3 |
| Kyrgyzstan | 82.7 | -3.0 | -3.5 | 232.4 | -2.3 | -1.0 |
| Equatorial Guinea | 62.4 | -2.3 | -3.5 | 170.4 | 6.7 | 4.1 |
| El Salvador | 33.0 | -1.3 | -3.9 | 70.9 | -1.6 | -2.2 |
| Sudan | 307.9 | -12.7 | -4.0 | 666.5 | -7.0 | -1.0 |
| South Africa | 194.7 | -8.2 | -4.0 | 562.0 | -17.6 | -3.0 |
| Malaysia | 128.1 | -5.5 | -4.1 | 636.6 | 27.0 | 4.4 |
| Turkey | 262.5 | -12.4 | -4.5 | 543.3 | 0.9 | 0.2 |
| United Arab Emirates | 66.7 | -3.3 | -4.7 | 163.6 | -4.3 | -2.6 |
| Nicaragua | 61.1 | -3.1 | -4.8 | 134.4 | -0.9 | -0.7 |
| Panama | 55.8 | -3.0 | -5.0 | 160.8 | -4.7 | -2.9 |
| Mongolia | 31.8 | -1.8 | -5.5 | 89.6 | -7.4 | -7.7 |
| Dominican Republic | 53.7 | -3.2 | -5.7 | 103.2 | -1.5 | -1.4 |
| Sweden | 110.2 | -6.9 | -5.9 | 315.1 | -2.4 | -0.8 |
| Greenland | 31.3 | -2.1 | -6.2 | 515.2 | 2.0 | 0.4 |
| Botswana | 124.7 | -8.4 | -6.3 | 262.8 | -7.1 | -2.6 |
| Zimbabwe | 117.9 | -8.2 | -6.5 | 284.3 | 0.7 | 0.3 |
| Finland | 84.9 | -6.6 | -7.2 | 230.1 | -2.2 | -0.9 |
| Paraguay | 55.4 | -4.5 | -7.5 | 175.8 | -3.0 | -1.7 |
| New Zealand | 71.2 | -6.2 | -8.0 | 401.0 | -10.0 | -2.4 |
| Iceland | 17.2 | -1.5 | -8.1 | 74.2 | -10.6 | -12.5 |
| Haiti | 47.3 | -4.2 | -8.2 | 105.7 | -4.4 | -4.0 |
| Senegal | 65.3 | -6.0 | -8.4 | 191.5 | -2.9 | -1.5 |
| Bahamas | 15.1 | -1.4 | -8.7 | 94.3 | -9.7 | -9.3 |
| Somalia | 177.9 | -19.1 | -9.7 | 538.2 | -13.0 | -2.4 |
| Russian Federation | 444.4 | -54.2 | -10.9 | 1705.5 | -92.9 | -5.2 |
| Gambia | 30.0 | -3.9 | -11.4 | 113.6 | -1.3 | -1.1 |
| Albania | 50.9 | -7.5 | -12.8 | 86.5 | -11.1 | -11.3 |
| Greece | 52.7 | -8.9 | -14.5 | 243.2 | -7.5 | -3.0 |

Note: Only changes in excess of plus or minus three percent in harmonic distance are shown. Islands, city states, as well as small territories and countries (including Bahrain, Brunei, Djibouti, Hong-Kong, Macao, Qatar, and the Vatican) were excluded from this analysis.

6.2 External Distance

As the results in the previous section show, population migration within countries is sufficient to change internal distances measurably even over a single decade.
If the population ‘centre of gravity’ of countries changes, so does the distance relative to other countries. Table 7 reveals the changes for the United States between 1990 and 2000, sorted in descending order of the distance change. Only changes in excess of 25 km are shown. Also indicated are the percentage changes. Most noticeably, on average the populations of Canada and the United States have moved 28 km further away from each other. At 2.4%, this is also the largest percentage change for any country pair. By comparison, the populations of Mexico and the United States have moved 46 km closer to each other, a change of 2% and the second largest change for any country pair. From table 6 we know that the internal distances of the United States, Mexico, and Canada have all changed over the last decade, indicating considerable population movements. Apparently, the ‘centre of gravity’ of the United States has moved south, closer towards Mexico and further away from Canada. The magnitude of these changes is not negligible. A 2.4% change in economic distance is not that far off in magnitude from average tariff reductions during the first decade of NAFTA. Consequently, gravity equations would tend to underestimate the effect of trade liberalization between Canada and the United States and overestimate the effect of trade liberalization between Mexico and the United States.
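The mechanism behind these changes can be illustrated with a toy calculation. The sketch below is a hedged illustration rather than the procedure used to build the data set: it assumes a population-product-weighted power mean of great-circle distances between grid cells (exponent -1 for the harmonic mean, +1 for the arithmetic mean), and the cell coordinates and populations are invented. The names `great_circle_km` and `weighted_mean_distance` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(p, q):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def weighted_mean_distance(cells_i, cells_j, theta):
    """Population-weighted power mean of cell-to-cell distances.

    theta = -1 gives the harmonic mean, theta = 1 the arithmetic mean.
    Each cell is a tuple (lat, lon, population). The weighting by the
    product of cell populations is an assumption made for this sketch.
    """
    total, weight_sum = 0.0, 0.0
    for lat_k, lon_k, pop_k in cells_i:
        for lat_l, lon_l, pop_l in cells_j:
            w = pop_k * pop_l
            d = great_circle_km((lat_k, lon_k), (lat_l, lon_l))
            total += w * d ** theta
            weight_sum += w
    return (total / weight_sum) ** (1.0 / theta)

# Invented example: a home country with a northern and a southern cell whose
# population shifts south between 1990 and 2000, plus two partner countries.
home = {1990: [(45.0, -90.0, 60.0), (30.0, -90.0, 40.0)],
        2000: [(45.0, -90.0, 50.0), (30.0, -90.0, 50.0)]}
partners = {"north": [(50.0, -80.0, 30.0)], "south": [(20.0, -100.0, 100.0)]}

harmonic = {(name, year): weighted_mean_distance(cells, partner, theta=-1.0)
            for name, partner in partners.items()
            for year, cells in home.items()}
for name in partners:
    change = harmonic[name, 2000] - harmonic[name, 1990]
    print(f"{name} partner: change in harmonic distance = {change:+.1f} km")
```

In this toy setting a southward shift of population lengthens the distance to the northern partner and shortens the distance to the southern partner, even though no border and no capital city has moved.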
Table 8 shows the largest distance changes for any country pair, sorted in descending order of percentage change, and suppressing changes smaller than two-thirds of a percent. All the large changes involve the United States. This appears to be a reflection of the high level of internal mobility in the United States. The U.S. Census Bureau reports continuing population losses for the Northeast and Midwest, with significant population gains for the South and West. This general south-west movement of the ‘centre of gravity’ for the United States is reflected in changes in external distances. Distances to European countries, Canada, and Russia have increased, and distances to numerous Latin-American countries have decreased.

Table 7: External Country Distances vis-à-vis United States

| Partner Country | Harmonic avg. in 2000 [km] | Harmonic Δ 1990-2000 [km] | Harmonic Δ 1990-2000 [%] | Arithmetic avg. in 2000 [km] | Arithmetic Δ 1990-2000 [km] | Arithmetic Δ 1990-2000 [%] |
|---|---|---|---|---|---|---|
| Somalia | 13,328 | 82.6 | 0.6 | 13,419 | 83.4 | 0.6 |
| Guinea | 8,243 | 66.5 | 0.8 | 8,453 | 69.3 | 0.8 |
| Russian Federation | 8,886 | 60.5 | 0.7 | 8,987 | 56.4 | 0.6 |
| Niger | 9,500 | 56.0 | 0.6 | 9,676 | 56.9 | 0.6 |
| Spain | 7,077 | 51.2 | 0.7 | 7,275 | 51.9 | 0.7 |
| Mauritania | 7,628 | 49.7 | 0.7 | 7,856 | 52.3 | 0.7 |
| Greece | 9,026 | 49.5 | 0.6 | 9,146 | 49.3 | 0.5 |
| Mali | 8,473 | 49.4 | 0.6 | 8,675 | 51.3 | 0.6 |
| Algeria | 7,850 | 48.9 | 0.6 | 8,027 | 49.5 | 0.6 |
| Iran | 10,946 | 47.8 | 0.4 | 11,019 | 48.0 | 0.4 |
| Botswana | 13,801 | 46.5 | 0.3 | 13,927 | 48.2 | 0.3 |
| Serbia and Montenegro | 8,495 | 46.4 | 0.5 | 8,615 | 46.4 | 0.5 |
| Tunisia | 8,322 | 46.3 | 0.6 | 8,477 | 46.7 | 0.6 |
| France | 7,112 | 46.0 | 0.7 | 7,280 | 46.4 | 0.6 |
| United Kingdom | 6,560 | 45.9 | 0.7 | 6,724 | 46.0 | 0.7 |
| Zambia | 13,387 | 45.2 | 0.3 | 13,515 | 46.6 | 0.3 |
| Greenland | 3,971 | 39.8 | 1.0 | 4,157 | 39.1 | 1.0 |
| Canada | 1,205 | 28.0 | 2.4 | 2,049 | 30.0 | 1.5 |
| Timor-Leste | 14,844 | -26.7 | -0.2 | 14,965 | -24.5 | -0.2 |
| Nauru | 10,818 | -40.0 | -0.4 | 11,037 | -39.0 | -0.4 |
| Tonga | 10,686 | -42.1 | -0.4 | 10,875 | -42.6 | -0.4 |
| Samoa | 9,944 | -44.6 | -0.4 | 10,156 | -44.9 | -0.4 |
| Fiji | 11,033 | -44.8 | -0.4 | 11,230 | -44.9 | -0.4 |
| Mexico | 2,230 | -46.1 | -2.0 | 2,564 | -25.7 | -1.0 |
| New Zealand | 12,744 | -57.0 | -0.4 | 12,882 | -57.3 | -0.4 |
| Australia | 14,703 | -58.4 | -0.4 | 14,898 | -55.7 | -0.4 |

Note: Only changes in excess of plus 45 km or minus 25 km in harmonic distance are shown. Islands, city states, as well as small territories and countries (including Bahrain, Brunei, Djibouti, Hong-Kong, Macao, Qatar, and the Vatican) were excluded from this analysis.

Table 8: Largest External Country Distance Changes

| Country Pair | Harmonic avg. in 2000 [km] | Harmonic Δ 1990-2000 [km] | Harmonic Δ 1990-2000 [%] | Arithmetic avg. in 2000 [km] | Arithmetic Δ 1990-2000 [km] | Arithmetic Δ 1990-2000 [%] |
|---|---|---|---|---|---|---|
| Canada, United States | 1,205 | 28.0 | 2.4 | 2,049 | 30.0 | 1.5 |
| Greenland, United States | 3,971 | 39.8 | 1.0 | 4,157 | 39.1 | 1.0 |
| United States, Guinea | 8,243 | 66.5 | 0.8 | 8,453 | 69.3 | 0.8 |
| United States, Spain | 7,077 | 51.2 | 0.7 | 7,275 | 51.9 | 0.7 |
| United States, Ireland | 6,182 | 44.3 | 0.7 | 6,359 | 44.5 | 0.7 |
| United States, United Kingdom | 6,560 | 45.9 | 0.7 | 6,724 | 46.0 | 0.7 |
| United States, Russian Federation | 8,886 | 60.5 | 0.7 | 8,987 | 56.4 | 0.6 |
| United States, Iceland | 5,191 | 35.1 | 0.7 | 5,350 | 35.2 | 0.7 |
| Honduras, United States | 2,757 | -19.5 | -0.7 | 2,978 | -11.2 | -0.4 |
| Cuba, United States | 1,990 | -14.7 | -0.7 | 2,471 | 8.0 | 0.3 |
| Guatemala, United States | 2,750 | -20.8 | -0.8 | 2,949 | -14.6 | -0.5 |
| Mexico, United States | 2,230 | -46.1 | -2.0 | 2,564 | -25.7 | -1.0 |

Note: Only changes in excess of plus/minus two-thirds of a percent in harmonic distance are shown. Islands, city states, as well as small territories and countries (including Bahrain, Brunei, Djibouti, Hong-Kong, Macao, Qatar, and the Vatican) were excluded from this analysis.

7 Re-Estimating Gravity: Does it really matter?

The previous sections have shown that internal and external distances vary over time, and that approximations of these distances are subject to a degree of mismeasurement.
These problems are particularly pronounced for internal distances and short external distances. But how much do these mismeasurements really matter? Do they affect estimation results in a significant manner? To explore this question it is useful to re-estimate both the modern and conventional types of gravity equations. The empirical work in this section makes use of additional data sources. Trade and production data were obtained from Nicita and Olarreaga (2007).

7.1 Modern Gravity Equation

Modern forms of the gravity equation can be explored usefully by regressing bilateral (geometric mean) trade friction (Ξ) on relative distance (Ψ) and other determinants of the border effect. As discussed earlier, the key benefit of this transformation is that it avoids both estimating nuisance parameters (inward and outward multilateral resistance) and finding a suitable price deflator for trade flows. Even the GDP measures, which are constrained to unity elasticities in modern versions of the gravity equation, cancel out conveniently.

It is worthwhile exploring the raw data of measured trade friction and relative distance ratios. Figure 4 plots the data points for the year 2000 in four panels. Panels A through D present charts for all country pairs, pairs of OECD countries, pairs of near countries (whose external distance is less than 3000 km), and for country pairs involving the United States, respectively. Measured trade friction (Ξ) is shown on the vertical axis and the relative distance ratio (Ψ) is shown on the horizontal axis. The correlation between the data points is most striking when considering only OECD country pairs. In the other panels a positive correlation is visible, but there are clearly other determinants at work that scatter the data points more widely. For the trading partners of the United States (panel D), NAFTA partners Mexico and Canada unsurprisingly exhibit the lowest relative distance and lowest trade friction.
Estimating gravity equation (6) amounts to identifying the effect of distance and other explanatory variables on bilateral (geometric mean) trade friction Ξ. The simplest form of this model is

$$\ln \Xi_{ijt} = \delta \ln \Psi_{ij[t]} + \mu_t + \epsilon_{ijt} \qquad (16)$$

where Ψij[t] is the relative distance ratio from (7), either captured by the harmonic mean distance or by approximation through circular area distance (for internal distance) and capital city distance (for external distance). More sophisticated versions of this model allow for further determinants of the border effect.

Table 9 shows the results of the regressions for a panel consisting of the years 1990-2000, with columns (A), (B) and (C) selecting all country pairs, country pairs where both countries are OECD members, and country pairs whose external distance is less than 3000 km, respectively. Year fixed effects (µt) were included in all regressions, but their estimates are not shown.

How do the different relative distance measures perform in comparison? Quite clearly, the population-weighted distance measures outperform the conventional approximation distance measures. The new distance measures are estimated more precisely (the t-ratios are higher) and the regression R² values are all significantly higher. Unsurprisingly, the dyadic gravity equation performs best for OECD countries. This may be attributed to the fact that the modern gravity equation is based on differentiated goods trade rather than inter-industry commodity trade. The magnitude of the estimates for Ψ is also quite different across types of distance measures.
When looking at the results in columns (A1) and (B1) the effect of distance is quite consistent and around unity for the regressions involving all countries or only OECD pairs. By comparison, in columns (A2) and (B2) there is significant instability in the results, and the effect of distance varies between 0.86 and 1.13. Lastly, the results in columns (C1) and (C2) show that when including only near country pairs in the analysis, the gap in estimates of the distance effect widens (1.19 versus 0.90), while the R² favors the new distance measures strongly.

Figure 4: Measured Trade Friction and Relative Distance Ratios
[Four scatter plots. Panel A: All Country Pairs. Panel B: OECD Country Pairs. Panel C: Near Country Pairs. Panel D: United States of America. Vertical axis: Trade Friction. Horizontal axis: Population-Weighted Distance Ratio.]

Note: Measured trade friction Ξ is shown on the vertical axis, and relative distance ratio Ψ is shown on the horizontal axis, for the country pairs indicated. All data points correspond to the year 2000. In Panel B only country pairs where both countries were OECD members in 2000 are included. In Panel C only country pairs whose external distance is less than 3000 km are included. In Panel D the partner countries of the United States are labeled.

Table 9: Dyadic Gravity Equation Regression, 1990-2000

| | (A1) | (A2) | (B1) | (B2) | (C1) | (C2) |
|---|---|---|---|---|---|---|
| Country Pairs | All | All | Both OECD | Both OECD | ≤ 3000 km | ≤ 3000 km |
| Intercept (1995) | 11.693^c (208) | 12.816^c (249) | 9.493^c (107) | 10.102^c (105) | 10.750^c (115) | 12.056^c (140) |
| Ψ Harmonic Mean | 1.014^c (94.8) | | 1.080^c (62.8) | | 1.190^c (51.0) | |
| Ψ Approximated | | 0.860^c (83.0) | | 1.134^c (50.4) | | 0.896^c (39.7) |
| Observations | 37,518 | 37,518 | 3,212 | 3,212 | 9,872 | 9,872 |
| R² | 0.196 | 0.158 | 0.553 | 0.444 | 0.212 | 0.141 |

Note: Dependent variable is the Head and Ries (2001) measured trade friction (ln Ξijt). Key regressors are the distance friction ratios (ln Φijt) based on the population-weighted harmonic mean distances (columns A1, B1, C1) or the capital-city/circular-area approximate distances (columns A2, B2, C2). Estimation is by ordinary least squares. Year fixed effects are included in all regressions. Annual data for 1990-2000 are used, with distances interpolated exponentially between the 1990, 1995, and 2000 GPWv3-derived data. Absolute t-ratios are given in parentheses. Statistical significance at the 95%, 99%, and 99.9% confidence levels is indicated by the superscripts a, b, and c, respectively.

7.2 Time-Differenced Modern Gravity Equation

The main theme of this paper has been that economic distance varies over time due to the effect of population migration within countries. The time-differenced version of the gravity equation as expressed in equation (12) provides a suitable platform for testing the hypothesis that country-internal population migration changes trade frictions, and thus the volume of trade.
Using a panel of only two time-differenced periods (1990-1995 and 1995-2000), because the GPWv3 population data are only available for 1990, 1995, and 2000, table 10 shows estimation results for a simplified version of equation (12):

$$\Delta_5 \ln \Xi_{ijt} = \mu_t + \delta \, \Delta_5 \ln \Psi_{ijt} + \epsilon_{ijt} \qquad (17)$$

with two fixed effects (µt) capturing overall changes in trade friction. In table 10 these fixed effects are negative and significant, demonstrating that trade frictions have decreased in general. Much of this will be due to reductions in tariffs following the conclusion of the Uruguay round of the GATT, and some will also be due to the information technology revolution, which has facilitated outsourcing and the integration of production across borders.

Table 10: Dyadic Time-Differenced Gravity Equation Regressions

| | (A) | (B) | (C) |
|---|---|---|---|
| Country Group | All | both OECD | ≤ 3000 km |
| 1990/1995 Dummy | -0.339^c (12.7) | -0.203^c (6.63) | -0.365^c (7.23) |
| 1995/2000 Dummy | -0.272^c (13.2) | -0.414^c (16.4) | -0.311^c (8.56) |
| Distance Ratio | 0.870^c (12.1) | 5.700^c (4.97) | 1.011^c (12.0) |
| Observations | 4,920 | 630 | 1,290 |
| R² | 0.092 | 0.322 | 0.196 |

Note: Dependent variable is the time-differenced Head and Ries (2001) measured trade friction (∆ ln Ξijt). Key regressor is the time-differenced distance friction ratio (∆ ln Φijt) based on population-weighted harmonic mean distances. The data set is constructed by calculating the differences between the years 1990, 1995, and 2000, resulting in two time-differenced data points for each country pair. Estimation is by ordinary least squares. Absolute t-ratios are given in parentheses. Statistical significance at the 95%, 99%, and 99.9% confidence levels is indicated by the superscripts a, b, and c, respectively.

Columns (A) through (C) show results for all country pairs, OECD country pairs, and near country pairs (less than 3000 km apart), respectively. Results for OECD country pairs are again the most compelling based on the regression R².
The effect of changes in the relative distance on trade frictions is positive and significant in all cases, with the magnitude ranging from 0.87 for all countries, through 1.01 for near countries, to 5.70 for OECD countries. The magnitude for the OECD countries seems to suggest that economic activity and trade are particularly sensitive to the effects of internal migration for these countries. As internal migration is of course endogenous and depends on a variety of driving forces, discussed earlier in this paper, the question arises what to make of this high level of sensitivity. Without any “natural experiments” in population dynamics in OECD countries during the last decade, it is difficult to instrument the relative distance ratio changes with any exogenous variables.

While the above results suggest that internal migration affects trade, there is also the potential for reverse causality. Trade liberalization with neighbouring countries may induce country-internal population migration if industries locate close to the border. Examples of such cross-border agglomerations include the Ontario/Michigan auto industry, the Mexican maquiladoras, and the Pearl River Delta next to Hong Kong. Arguably, trade liberalization (e.g., the US-Canada auto pact, NAFTA) may play an important role here. How can one account for this potential endogeneity? One way is to explicitly account for tariff reductions in (17), as suggested in equation (12). Another way is to control for industry location. While these important questions are subject matter for further research, this paper stops at presenting prima facie evidence for internal migration’s effect on trade.

7.3 Conventional Gravity Equation

Estimation of the conventional gravity equation in log-linear form remains fraught with problems. Baldwin and Taglioni (2006) point out a number of common mistakes in estimating the gravity equation in this form.
Typical fixes for estimating a log-linear form of the gravity equation involve time dummies, country dummies (one for each exporter and each importer country, i.e. 2n), and country-pair dummies (i.e. n(n − 1)/2). However, using country-pair fixed effects eliminates time-invariant regressors such as capital-city distance.

Table 11 presents results that show how introducing a refined measure of external distance affects gravity equation estimates (columns A through D), and how these results stack up against the conventional distance approximation via capital cities (columns E through H). The panel consists of all countries for the period 1990-2000. Columns (A) and (E) show results for a simple pooled estimation. Columns (B) and (F) show results with time dummies included (except for 1995, the base year). Columns (C) and (G) show results with both time dummies and country dummies included. Country dummies are provided for each exporter and importer country except the United States, which serves as the reference country. Columns (D) and (H) show results with both sets of dummies, but now using weighted least squares instead of ordinary least squares. Weights are the product of exporter and importer GDP.

Using the new distance measures improves the R² of each regression marginally, and the distance measures tend to be slightly more significant. The magnitude of the estimates varies only slightly. This will come as good news to the authors of the huge number of papers that have used conventional gravity equations in the past. They will not have to re-estimate their models. However, the results in table 11 show how much specification matters. Including dummy variables changes the magnitude of the distance effect; it also hugely boosts the R². Nevertheless, looking at the R² alone provides a rather incomplete picture of performance.
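The specification choices described above (time dummies with a base year, exporter and importer dummies with a reference country, and GDP-product weights) can be sketched as follows. This is a minimal illustration on simulated data, not the estimation behind table 11: the data-generating values, the panel size, the helper names `dummies` and `least_squares`, and the normalization of the weights are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, years, base_year = 8, np.array([1990, 1995, 2000]), 1995

# Hypothetical inputs: log GDP by country and year, symmetric log distances.
log_gdp = rng.normal(10.0, 0.5, size=(n, 1)) + np.cumsum(
    rng.normal(0.1, 0.05, size=(n, years.size)), axis=1)
upper = np.triu(rng.normal(8.0, 0.8, size=(n, n)), k=1)
log_dist = upper + upper.T

# One observation per year and ordered exporter-importer pair (i != j).
t_idx, exp_idx, imp_idx = np.array(
    [(t, i, j) for t in range(years.size)
     for i in range(n) for j in range(n) if i != j]).T

ly_exp = log_gdp[exp_idx, t_idx]
ly_imp = log_gdp[imp_idx, t_idx]
ld = log_dist[exp_idx, imp_idx]
TRUE_DISTANCE_ELASTICITY = -1.2  # assumed value used only to simulate data
log_exports = (2.0 + 1.0 * ly_exp + 0.8 * ly_imp
               + TRUE_DISTANCE_ELASTICITY * ld
               + rng.normal(0.0, 0.3, size=ld.size))

def dummies(index, levels, reference):
    """One-hot columns for `index`, omitting the reference level."""
    return np.delete(np.eye(levels)[index], reference, axis=1)

X = np.column_stack([
    np.ones(ld.size), ly_exp, ly_imp, ld,
    dummies(t_idx, years.size, int(np.flatnonzero(years == base_year)[0])),
    dummies(exp_idx, n, 0),  # exporter dummies, country 0 as reference
    dummies(imp_idx, n, 0),  # importer dummies, country 0 as reference
])

def least_squares(X, y, weights=None):
    """OLS, or WLS when weights are given (rows scaled by sqrt(weight))."""
    if weights is not None:
        s = np.sqrt(weights)
        X, y = X * s[:, None], y * s
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

gdp_product = np.exp(ly_exp + ly_imp)
beta_ols = least_squares(X, log_exports)
beta_wls = least_squares(X, log_exports, weights=gdp_product / gdp_product.mean())
print("distance elasticity, OLS:", round(float(beta_ols[3]), 3))
print("distance elasticity, WLS:", round(float(beta_wls[3]), 3))
```

Dropping one level from each set of dummies avoids perfect collinearity with the intercept, which mirrors the use of a base year and a reference country in the text. The weights are rescaled by their mean purely for numerical convenience; this does not change the weighted least squares estimates.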
If one is interested in predicting trade flows rather than testing trade models, a natural question to ask is how large the prediction error is. A simple but useful measure is the absolute error

$$A = 100\% \cdot \frac{\sum_{ijt} \left| X_{ijt} - \hat{X}_{ijt} \right|}{\sum_{ijt} X_{ijt}} \qquad (18)$$

that shows how much trade is mispredicted as a percentage relative to total world trade. The corresponding numbers are shown in row “Absolute Error” in table 11. Surprisingly, using time and country dummies increases the absolute error even though the R² improves significantly. This is because estimation of the log-linear gravity equation minimizes relative deviations, not absolute deviations. A central flaw in using the gravity equation to predict trade flows is that it treats relative errors alike, no matter how large the actual trade volume. Thus a relative error in the large trade volume between Canada and the United States counts the same as the same relative error in the small trade volume between Ghana and Burkina Faso. This problem can be addressed by using weighted least squares when estimating the gravity equation, giving greater weight to pairs of large countries. Columns (D) and (H) show estimates when GDP product weights are used. The R² goes up remarkably while the absolute error shrinks enormously.

Table 11: Conventional Gravity Equation Estimation

Panel A: Time-Varying Distance Measure

| | (A) | (B) | (C) | (D) |
|---|---|---|---|---|
| Time Fixed Effects | no | yes | yes | yes |
| Country Fixed Effects | no | no | yes | yes |
| Estimation Method | OLS | OLS | OLS | WLS |
| Intercept (1995) | -27.70^c (242) | -27.67^c (238) | 10.761^c (7.23) | -8.873^c (12.4) |
| Log Exporter GDP | 1.205^c (387) | 1.203^c (387) | 0.476^c (11.9) | 0.535^c (32.0) |
| Log Importer GDP | 0.772^c (299) | 0.774^c (300) | 0.368^c (11.9) | 0.723^c (44.9) |
| Log Distance | -1.458^c (200) | -1.465^c (201) | -1.867^c (262) | -1.053^c (374) |
| R² | 0.632 | 0.633 | 0.745 | 0.926 |
| Absolute Error | 84.0% | 85.9% | 331.8% | 49.9% |

Panel B: Capital City Distance Measure

| | (E) | (F) | (G) | (H) |
|---|---|---|---|---|
| Time Fixed Effects | no | yes | yes | yes |
| Country Fixed Effects | no | no | yes | yes |
| Estimation Method | OLS | OLS | OLS | WLS |
| Intercept (1995) | -27.62^c (240) | -27.59^c (237) | 10.742^c (7.20) | -10.11^c (13.7) |
| Log Exporter GDP | 1.202^c (386) | 1.201^c (386) | 0.477^c (11.9) | 0.553^c (32.2) |
| Log Importer GDP | 0.770^c (298) | 0.772^c (299) | 0.362^c (11.7) | 0.741^c (44.7) |
| Log Distance | -1.456^c (198) | -1.463^c (199) | -1.880^c (260) | -1.056^c (355) |
| R² | 0.631 | 0.632 | 0.744 | 0.922 |
| Absolute Error | 82.1% | 83.9% | 348.8% | 51.6% |

Note: Dependent variable is the log of exports. Estimation is by ordinary least squares (OLS) or weighted least squares (WLS). When weighted least squares is used, the weights are the product of the exporter and importer GDP. Time fixed effects mean that a dummy has been included for each year except 1995. Country fixed effects mean that a separate dummy has been included for each exporter country and each importer country except for the United States. Absolute t-ratios are given in parentheses. Statistical significance at the 95%, 99%, and 99.9% confidence levels is indicated by the superscripts a, b, and c, respectively.

The bottom line of the results in table 11 is that using better distance measures improves the estimates only marginally, whereas the real challenge lies in finding the right model specification. This result is consistent with Feenstra, Markusen, and Rose (2001), who also document that the estimates of the effect of distance depend on the specification, the country sample, and the type of goods (differentiated or homogeneous) that are being considered. Moreover, the results also show that weighting large and small trading partners equally may not be the best estimation method if what one really cares about is predicting total trade flows.

8 Conclusions

This paper has set out to introduce a new time-varying measure of internal (intra-country) distance and external (inter-country) distance. Using the Gridded Population of the World database, which provides population figures for all 2.5-by-2.5 arcminute latitude-longitude squares, it is possible to calculate (harmonic) mean distances within and between countries consistently.
These new distance measures constitute a significant improvement over previous ad-hoc approximations of internal and external distance.

In addition to providing new time-varying internal and external distance measures, this paper is also able to answer a number of related research questions.

First, do distance measures vary significantly over time? The answer is yes. In particular, internal distances show significant time variation. Over the last decade, numerous countries exhibited “expansion” or “shrinkage” of more than ten percent. For example, internal migration in Russia led to an eleven percent decrease in average internal distance. External distance also varies over time, although on a relatively smaller scale. The United States has experienced one of the largest relative changes in external distance. Populations in Canada and the United States have moved 28 km (or 2.4%) away from each other over the last decade. During the same period, populations in Mexico and the United States have moved 64 km (or 2.0%) closer. These changes are primarily the result of large internal migration in the United States, with the Northeast losing population and the Southwest gaining population.

Second, how do conventional ad-hoc approximate measures of distance stack up against the new measures of distance? Internal distance approximations using circular areas exhibit a noticeable bias compared to the new measures. These approximations produce internal distances that are too high for large (and populous) countries and too low for small countries. External distance approximations through capital city distances are likewise very problematic. Not only is the capital city not always the main urban agglomeration in a country, but where countries have numerous urban agglomerations, it also matters whether these agglomerations are closer to the border than the capital city.
For example, Stockholm, the capital of Sweden, lies almost on the periphery of the densely populated area of the country. Populous cities such as Malmö and Göteborg are much closer to neighbouring countries than Stockholm. Mismeasurements of external distance can be quite large. Averaged across trading partners, differences range from −5% (United States) to +36% (Hong Kong). The capital of the United States (Washington, DC) puts the country too close to Europe (by roughly 10%), too far away from Mexico (by 40%), and too close to Canada (by 50%).

Third, does the use of conventional ad-hoc distance measures bias estimates of the effect of distance when estimating both modern and conventional gravity equations? The evidence in this paper shows that conventional gravity equation estimates are quite robust to mismeasurements of external distance, in part because of the relative prominence of long-distance country pairs. However, estimating modern versions of the gravity equation requires both internal and external distances. This paper shows that the method of distance calculation matters strongly for country-internal distances and shorter distances between countries. In comparison to conventional ad-hoc measures, the new distance measures perform vastly better.

Fourth, does time-varying distance affect trade? The answer is yes. Using a time-differenced version of the modern gravity equation, there is solid evidence that reductions in the relative distance ratio over time reduce trade frictions between countries. This result is tantamount to proving that country-internal population migration has the ability to affect trade. This effect appears particularly pronounced for OECD countries. However, internal migration may also be a result of trade liberalization with neighbouring countries. Controlling for this potential endogeneity is a task for further research.

This paper has shown how to use better distance measures, and that using them is relevant.
Calculating these improved distance measures is computationally expensive. However, improvements in computer technology hold the prospect of even better measures in the future. New geo-computational techniques, currently popular in car navigation systems and web sites such as Google Maps, can eventually be utilized to compute actual trucking, shipping, and air-cargo distances. The problem will be to apply these techniques to trillions of potential location pairs on the planet.

With respect to estimating gravity equations, this paper makes a strong case for using time-varying measures of distance that allow for country-internal migration of populations. Even though this study focused only on the last decade, the magnitude of the distance changes is large enough to make them economically meaningful and relevant. Ignoring the time-varying nature of economic distance may bias the results of empirical work that attempts to identify the effect of policies such as tariff reductions. Putting the new distance data set into the public domain may help improve the quality and reliability of empirical work with the gravity equation in the future.

References

Anderson, James E. and van Wincoop, Eric (2003). Gravity with Gravitas: A Solution to the Border Puzzle. American Economic Review, 93(1), 170–192.
Anderson, James E. and van Wincoop, Eric (2004). Trade Costs. Journal of Economic Literature, 42(3), 691–751.
Baldwin, Richard E. and Taglioni, Daria (2006). Gravity for Dummies and Dummies for Gravity Equations. National Bureau of Economic Research Working Paper 12516.
Chen, Natalie (2004). Intra-national versus international trade in the European Union: why do national borders matter? Journal of International Economics, 63, 93–118.
CIESIN (2005). Gridded Population of the World, Version 3. At www.ciesin.org.
Deardorff, Alan (1998). Determinants of Bilateral Trade: Does Gravity Work in a Neoclassical World? In J. A. Frankel (Ed.), The Regionalization of the World Economy. Ann Arbor: The University of Michigan Press.
Dekle, Robert, Eaton, Jonathan, and Kortum, Samuel (2007). Unbalanced Trade. American Economic Review, 97, 351–355.
Eaton, Jonathan and Kortum, Samuel (2002). Technology, Geography, and Trade. Econometrica, 70, 1741–1779.
Feenstra, Robert, Markusen, James, and Rose, Andrew K. (2001). Using the Gravity Equation to Differentiate Among Alternative Theories of Trade. Canadian Journal of Economics, 34, 430–477.
Head, Keith and Disdier, Anne-Celia (2007). The Puzzling Persistence of the Distance Effect on Bilateral Trade. Review of Economics and Statistics, forthcoming.
Head, Keith and Mayer, Thierry (2000). Non-Europe: The Magnitude and Causes of Market Fragmentation in the EU. Weltwirtschaftliches Archiv/Review of World Economics, 136(2), 284–314.
Head, Keith and Mayer, Thierry (2002). Illusory Border Effects: Distance Mismeasurement Inflates Estimates of Home Bias in Trade. CEPII research center, Paris.
Head, Keith, Mayer, Thierry, and Ries, John (2007). The Erosion of Colonial Trade Linkages After Independence. Working Paper Presentation at the CEPR ERWITT Workshop in Kiel.
Head, Keith and Ries, John (2001). Increasing Returns versus National Product Differentiation as an Explanation for the Pattern of U.S.-Canada Trade. American Economic Review, 91(4), 858–876.
Helliwell, John F. and Verdier, Geneviève (2001). Measuring internal trade distances: a new method applied to estimate provincial border effects in Canada. Canadian Journal of Economics, 34, 1024–1041.
Helpman, Elhanan, Melitz, Marc, and Rubinstein, Yona (2007). Estimating Trade Flows: Trading Partners and Trading Volume. National Bureau of Economic Research Working Paper 12927.
Krugman, Paul R. (1980). Scale Economies, Product Differentiation, and the Pattern of Trade. American Economic Review, 70, 950–959.
Leamer, Edward E. (1997). Access to Western Markets and Eastern Effort. In S. Zecchini (Ed.), Lessons from the Economic Transition, Central and Eastern Europe in the 1990s, pp. 503–526. Dordrecht: Kluwer Academic Publishers.
Mayer, Thierry and Zignago, Soledad (2005). Market Access in Global and Regional Trade. Centre d’études prospectives et d’informations internationales (CEPII) Working Paper 2005-02.
Melitz, Marc J. and Ottaviano, Gianmarco I.P. (2005). Market Size, Trade, and Productivity. National Bureau of Economic Research Working Paper 11393.
Mundra, Kusum (2005). Immigration and International Trade: A Semiparametric Empirical Investigation. Journal of International Trade and Economic Development, 14, 65–91.
Nicita, Alessandro and Olarreaga, Marcelo (2007). Trade, Production and Protection, 1976-2004. World Bank Economic Review, 21, 165–171.
Nitsch, Volker (2000). National borders and international trade: evidence from the European Union. Canadian Journal of Economics, 33, 1091–1105.
Nitsch, Volker (2001). It’s Not Right but It’s Okay: On the Measurement of Intra- and International Trade Distances. Working Paper.
Silva, J. M. C. Santos and Tenreyro, Silvana (2006). The Log of Gravity. The Review of Economics and Statistics, 88, 641–658.
Wagner, Don, Head, Keith, and Ries, John (2002). Immigration and the Trade of Provinces. Scottish Journal of Political Economy, 49, 507–525.
Wei, Shang-Jin (1996). Intra-National versus International Trade: How Stubborn are Nations in Global Integration? National Bureau of Economic Research Working Paper 5531.
Wolf, Holger C. (1997). Patterns of Intra- and Inter-State Trade. National Bureau of Economic Research Working Paper 5939.
Wolf, Holger C. (2000). Intranational home bias in trade. Review of Economics and Statistics, 82, 555–563.